Make relfile tombstone files conditional on WAL level
Hi,
I'm starting a new thread for this patch that originated as a
side-discussion in [1], to give it its own CF entry in the next cycle.
This is a WIP with an open question to research: what could actually
break if we did this?
[1]: /messages/by-id/CA+hUKGLdemy2gBm80kz20GTe6hNVwoErE8KwcJk6-U56oStjtg@mail.gmail.com
Attachments:
0001-Make-relfile-tombstone-files-conditional-on-WAL-leve.patch
From 61a15ed286a1fd824b4e2b4b689cbe6688930e6e Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Tue, 2 Mar 2021 16:09:51 +1300
Subject: [PATCH] Make relfile tombstone files conditional on WAL level.
Traditionally, when dropping a relation, we have left behind an empty
file to be unlinked after the next checkpoint. That prevents
GetNewRelFileNode() from recycling the relfilenode, avoiding a sequence
of events that could corrupt data in crash recovery with wal_level=minimal.
Since the default wal_level changed to replica in release 10, and since
this mechanism introduces costs elsewhere, let's only do it if we have
to. In particular, this avoids the need for DROP TABLESPACE to force
an expensive checkpoint just to clear out tombstone files.
XXX What would break if we did this?
Discussion: https://postgr.es/m/CA%2BhUKGLT3zibuLkn_j9xiPWn6hxH9Br-TsJoSaFgQOpxpEUnPQ%40mail.gmail.com
---
src/backend/commands/tablespace.c | 14 +++++++++++---
src/backend/storage/smgr/md.c | 10 ++++++----
src/include/access/xlog.h | 6 ++++++
3 files changed, 23 insertions(+), 7 deletions(-)
diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index 69ea155d50..c1d12f3d19 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -512,8 +512,10 @@ DropTableSpace(DropTableSpaceStmt *stmt)
*/
if (!destroy_tablespace_directories(tablespaceoid, false))
{
+ bool try_again = false;
+
/*
- * Not all files deleted? However, there can be lingering empty files
+ * Not all files deleted? However, there can be lingering tombstones
* in the directories, left behind by for example DROP TABLE, that
* have been scheduled for deletion at next checkpoint (see comments
* in mdunlink() for details). We could just delete them immediately,
@@ -528,8 +530,14 @@ DropTableSpace(DropTableSpaceStmt *stmt)
* TABLESPACE should not give up on the tablespace becoming empty
* until all relevant invalidation processing is complete.
*/
- RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_FORCE | CHECKPOINT_WAIT);
- if (!destroy_tablespace_directories(tablespaceoid, false))
+
+ if (XLogNeedRelFileTombstones())
+ {
+ RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_FORCE | CHECKPOINT_WAIT);
+ try_again = true;
+ }
+
+ if (!try_again || !destroy_tablespace_directories(tablespaceoid, false))
{
/* Still not empty, the files must be important then */
ereport(ERROR,
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 1e12cfad8e..447122519e 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -236,8 +236,9 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* forkNum can be a fork number to delete a specific fork, or InvalidForkNumber
* to delete all forks.
*
- * For regular relations, we don't unlink the first segment file of the rel,
- * but just truncate it to zero length, and record a request to unlink it after
+ * For regular relations, we don't always unlink the first segment file,
+ * depending on the WAL level. If XLogNeedRelFileTombstones() is true, we
+ * just truncate it to zero length, and record a request to unlink it after
* the next checkpoint. Additional segments can be unlinked immediately,
* however. Leaving the empty file in place prevents that relfilenode
* number from being reused. The scenario this protects us from is:
@@ -321,7 +322,8 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
/*
* Delete or truncate the first segment.
*/
- if (isRedo || forkNum != MAIN_FORKNUM || RelFileNodeBackendIsTemp(rnode))
+ if (isRedo || forkNum != MAIN_FORKNUM || RelFileNodeBackendIsTemp(rnode) ||
+ !XLogNeedRelFileTombstones())
{
if (!RelFileNodeBackendIsTemp(rnode))
{
@@ -349,7 +351,7 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
/* Prevent other backends' fds from holding on to the disk space */
ret = do_truncate(path);
- /* Register request to unlink first segment later */
+ /* Leave the file as a tombstone, to be unlinked at checkpoint time. */
register_unlink_segment(rnode, forkNum, 0 /* first seg */ );
}
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 75ec1073bd..cca04a6aa8 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -207,6 +207,12 @@ extern PGDLLIMPORT int wal_level;
/* Do we need to WAL-log information required only for logical replication? */
#define XLogLogicalInfoActive() (wal_level >= WAL_LEVEL_LOGICAL)
+/*
+ * Is the WAL-level so low that it is unsafe to recycle relfilenodes between
+ * checkpoints? See mdunlinkfork().
+ */
+#define XLogNeedRelFileTombstones() (wal_level < WAL_LEVEL_REPLICA)
+
#ifdef WAL_DEBUG
extern bool XLOG_DEBUG;
#endif
--
2.30.0
On 05/03/2021 00:02, Thomas Munro wrote:
Hi,
I'm starting a new thread for this patch that originated as a
side-discussion in [1], to give it its own CF entry in the next cycle.
This is a WIP with an open question to research: what could actually
break if we did this?
I don't see a problem.
It would indeed be nice to have some other mechanism to prevent the
issue with wal_level=minimal, the tombstone files feel hacky and
complicated. Maybe a new shared memory hash table to track the
relfilenodes of dropped tables.
- Heikki
On Thu, Jun 10, 2021 at 6:47 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
It would indeed be nice to have some other mechanism to prevent the
issue with wal_level=minimal, the tombstone files feel hacky and
complicated. Maybe a new shared memory hash table to track the
relfilenodes of dropped tables.
Just to summarize the issue here as I understand it, if a relfilenode
is used for two unrelated relations during the same checkpoint cycle
with wal_level=minimal, and if the WAL-skipping optimization is
applied to the second of those but not to the first, then crash
recovery will lose our only copy of the new relation's data, because
we'll replay the removal of the old relfilenode but will not have
logged the new data. Furthermore, we've wondered about writing an
end-of-recovery record in all cases rather than sometimes writing an
end-of-recovery record and sometimes a checkpoint record. That would
allow another version of this same problem, since a single checkpoint
cycle could now span multiple server lifetimes. At present, we dodge
all this by keeping the first segment of the main fork around as a
zero-length file for the rest of the checkpoint cycle, which I think
prevents the problem in both cases. Now, apparently that caused some
problem with the AIO patch set so Thomas is curious about getting rid
of it, and Heikki concurs that it's a hack.
I guess my concern about this patch is that it just seems to be
reducing the number of cases where that hack is used without actually
getting rid of it. Rarely-taken code paths are more likely to have
undiscovered bugs, and that seems particularly likely in this case,
because this is a low-probability scenario to begin with. A lot of
clusters probably never have an OID counter wraparound ever, and even
in those that do, getting an OID collision with just the right timing
followed by a crash before a checkpoint can intervene has got to be
super-unlikely. Even as things are today, if this mechanism has subtle
bugs, it seems entirely possible that they could have escaped notice
up until now.
So I spent some time thinking about the question of getting rid of
tombstone files altogether. I don't think that Heikki's idea of a
shared memory hash table to track dropped relfilenodes can work. The
hash table will have to be of some fixed size N, and whatever the
value of N, the approach will break down if N+1 relfilenodes are
dropped in the same checkpoint cycle.
The two most principled solutions to this problem that I can see are
(1) remove wal_level=minimal and (2) use 64-bit relfilenodes. I have
been reluctant to support #1 because it's hard for me to believe that
there aren't cases where being able to skip a whole lot of WAL-logging
doesn't work out to a nice performance win, but I realize opinions on
that topic vary. And I'm pretty sure that Andres, at least, will hate
#2 because he's unhappy with the width of buffer tags already. So I
don't really have a good idea. I agree this tombstone system is a bit
of a wart, but I'm not sure that this patch really makes anything any
better, and I'm not really seeing another idea that seems better
either.
Maybe I am missing something...
--
Robert Haas
EDB: http://www.enterprisedb.com
Hi,
On 2021-08-02 16:03:31 -0400, Robert Haas wrote:
The two most principled solutions to this problem that I can see are
(1) remove wal_level=minimal and
I'm personally not opposed to this. It's not practically relevant and makes a
lot of stuff more complicated. We imo should rather focus on optimizing the
things wal_level=minimal accelerates a lot than adding complications for
wal_level=minimal. Such optimizations would have practical relevance, and
there's plenty of low-hanging fruit.
(2) use 64-bit relfilenodes. I have
been reluctant to support #1 because it's hard for me to believe that
there aren't cases where being able to skip a whole lot of WAL-logging
doesn't work out to a nice performance win, but I realize opinions on
that topic vary. And I'm pretty sure that Andres, at least, will hate
#2 because he's unhappy with the width of buffer tags already.
Yep :/
I guess there's a somewhat hacky way to get somewhere without actually
increasing the size. We could take 3 bytes from the fork number and use that
to get to a 7 byte relfilenode portion. 7 bytes are probably enough for
everyone.
It's not like we can use those bytes in a useful way, due to alignment
requirements. Declaring that the high 7 bytes are for the relNode portion and
the low byte for the fork would still allow efficient comparisons and doesn't
seem too ugly.
So I don't really have a good idea. I agree this tombstone system is a
bit of a wart, but I'm not sure that this patch really makes anything
any better, and I'm not really seeing another idea that seems better
either.
Maybe I am missing something...
What I proposed in the past was to have a new shared table that tracks
relfilenodes. I still think that's a decent solution for just the problem at
hand. But it'd also potentially be the way to redesign relation forks and even
slim down buffer tags:
Right now a buffer tag is:
- 4 byte tablespace oid
- 4 byte database oid
- 4 byte "relfilenode oid" (don't think we have a good name for this)
- 4 byte fork number
- 4 byte block number
If we had such a shared table we could put at least tablespace, fork number
into that table mapping them to an 8 byte "new relfilenode". That'd only make
the "new relfilenode" unique within a database, but that'd be sufficient for
our purposes. It'd give us a buffertag consisting of the following:
- 4 byte database oid
- 8 byte "relfilenode"
- 4 byte block number
Of course, it'd add some complexity too, because a buffertag alone wouldn't be
sufficient to read data (as you'd need the tablespace oid from elsewhere). But
that's probably ok, I think all relevant places would have that information.
It's probably possible to remove the database oid from the tag as well, but
it'd make CREATE DATABASE trickier - we'd need to change the filenames of
tables as we copy, to adjust them to the differing oid.
Greetings,
Andres Freund
On Mon, Aug 2, 2021 at 6:38 PM Andres Freund <andres@anarazel.de> wrote:
What I proposed in the past was to have a new shared table that tracks
relfilenodes. I still think that's a decent solution for just the problem at
hand.
It's not really clear to me what problem is at hand. The problems that
the tombstone system created for the async I/O stuff weren't really
explained properly, IMHO. And I don't think the current system is all
that ugly. It's not the most beautiful thing in the world but we have
lots of way worse hacks. And, it's easy to understand, requires very
little code, and has few moving parts that can fail. As hacks go it's
a quality hack, I would say.
But it'd also potentially be the way to redesign relation forks and even
slim down buffer tags.
Right now a buffer tag is:
- 4 byte tablespace oid
- 4 byte database oid
- 4 byte "relfilenode oid" (don't think we have a good name for this)
- 4 byte fork number
- 4 byte block number
If we had such a shared table we could put at least tablespace, fork number
into that table mapping them to an 8 byte "new relfilenode". That'd only make
the "new relfilenode" unique within a database, but that'd be sufficient for
our purposes. It'd give us a buffertag consisting of the following:
- 4 byte database oid
- 8 byte "relfilenode"
- 4 byte block number
Yep. I think this is a good direction.
Of course, it'd add some complexity too, because a buffertag alone wouldn't be
sufficient to read data (as you'd need the tablespace oid from elsewhere). But
that's probably ok, I think all relevant places would have that information.
I think the thing to look at would be the places that call
relpathperm() or relpathbackend(). I imagine this can be worked out,
but it might require some adjustment.
It's probably possible to remove the database oid from the tag as well, but
it'd make CREATE DATABASE trickier - we'd need to change the filenames of
tables as we copy, to adjust them to the differing oid.
Yeah, I'm not really sure that works out to a win. I tend to think
that we should be trying to make databases within the same cluster
more rather than less independent of each other. If we switch to using
a radix tree for the buffer mapping table as you have previously
proposed, then presumably each backend can cache a pointer to the
second level, after the database OID has been resolved. Then you have
no need to compare database OIDs for every lookup. That might turn out
to be better for performance than shoving everything into the buffer
tag anyway, because then backends in different databases would be
accessing distinct parts of the buffer mapping data structure instead
of contending with one another.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Fri, Mar 5, 2021 at 11:02 AM Thomas Munro <thomas.munro@gmail.com> wrote:
This is a WIP with an open question to research: what could actually
break if we did this?
I thought this part of bgwriter.c might be a candidate:
if (FirstCallSinceLastCheckpoint())
{
/*
* After any checkpoint, close all smgr files. This is so we
* won't hang onto smgr references to deleted files indefinitely.
*/
smgrcloseall();
}
Hmm, on closer inspection, isn't the lack of real interlocking with
checkpoints a bit suspect already? What stops bgwriter from writing
to the previous relfilenode generation's fd, if a relfilenode is
recycled while BgBufferSync() is running? Not sinval, and not the
above code that only runs between BgBufferSync() invocations.
On Wed, Aug 4, 2021 at 3:22 AM Robert Haas <robertmhaas@gmail.com> wrote:
It's not really clear to me what problem is at hand. The problems that
the tombstone system created for the async I/O stuff weren't really
explained properly, IMHO. And I don't think the current system is all
that ugly. It's not the most beautiful thing in the world but we have
lots of way worse hacks. And, it's easy to understand, requires very
little code, and has few moving parts that can fail. As hacks go it's
a quality hack, I would say.
It's not really an AIO problem. It's just that while testing the AIO
stuff across a lot of operating systems, we had tests failing on
Windows because the extra worker processes you get if you use
io_method=worker were holding cached descriptors and causing stuff
like DROP TABLESPACE to fail. AFAIK every problem we discovered in
that vein is a current live bug in all versions of PostgreSQL for
Windows (it just takes other backends or the bgwriter to hold an fd at
the wrong moment). The solution I'm proposing to that general class
of problem is https://commitfest.postgresql.org/34/2962/ .
In the course of thinking about that, it seemed natural to look into
the possibility of getting rid of the tombstones, so that at least
Unix systems don't find themselves having to suffer through a
CHECKPOINT just to drop a tablespace that happens to contain a
tombstone.
On Wed, Sep 29, 2021 at 4:07 PM Thomas Munro <thomas.munro@gmail.com> wrote:
Hmm, on closer inspection, isn't the lack of real interlocking with
checkpoints a bit suspect already? What stops bgwriter from writing
to the previous relfilenode generation's fd, if a relfilenode is
recycled while BgBufferSync() is running? Not sinval, and not the
above code that only runs between BgBufferSync() invocations.
I managed to produce a case where live data is written to an unlinked
file and lost, with a couple of tweaks to get the right timing and
simulate OID wraparound. See attached. If you run the following
commands repeatedly with shared_buffers=256kB and
bgwriter_lru_multiplier=10, you should see a number lower than 10,000
from the last query in some runs, depending on timing.
create extension if not exists chaos;
create extension if not exists pg_prewarm;
drop table if exists t1, t2;
checkpoint;
vacuum pg_class;
select clobber_next_oid(200000);
create table t1 as select 42 i from generate_series(1, 10000);
select pg_prewarm('t1'); -- fill buffer pool with t1
update t1 set i = i; -- dirty t1 buffers so bgwriter writes some
select pg_sleep(2); -- give bgwriter some time
drop table t1;
checkpoint;
vacuum pg_class;
select clobber_next_oid(200000);
create table t2 as select 0 i from generate_series(1, 10000);
select pg_prewarm('t2'); -- fill buffer pool with t2
update t2 set i = 1 where i = 0; -- dirty t2 buffers so bgwriter writes some
select pg_sleep(2); -- give bgwriter some time
select pg_prewarm('pg_attribute'); -- evict all clean t2 buffers
select sum(i) as t2_sum_should_be_10000 from t2; -- have any updates been lost?
Attachments:
0001-HACK-A-function-to-control-the-OID-allocator.patch
From b116b80b2775b004e35a9e7be0a057ee2724041b Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Thu, 30 Sep 2021 15:47:23 +1300
Subject: [PATCH 1/2] HACK: A function to control the OID allocator.
---
src/test/modules/chaos/Makefile | 21 +++++++++++++++++++++
src/test/modules/chaos/chaos--1.0.sql | 8 ++++++++
src/test/modules/chaos/chaos.c | 26 ++++++++++++++++++++++++++
src/test/modules/chaos/chaos.control | 4 ++++
4 files changed, 59 insertions(+)
create mode 100644 src/test/modules/chaos/Makefile
create mode 100644 src/test/modules/chaos/chaos--1.0.sql
create mode 100644 src/test/modules/chaos/chaos.c
create mode 100644 src/test/modules/chaos/chaos.control
diff --git a/src/test/modules/chaos/Makefile b/src/test/modules/chaos/Makefile
new file mode 100644
index 0000000000..ac69721af6
--- /dev/null
+++ b/src/test/modules/chaos/Makefile
@@ -0,0 +1,21 @@
+# src/test/modules/chaos/Makefile
+
+MODULE_big = chaos
+OBJS = \
+ $(WIN32RES) \
+ chaos.o
+PGFILEDESC = "chaos - module in which to write throwaway fault-injection hacks"
+
+EXTENSION = chaos
+DATA = chaos--1.0.sql
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/chaos
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/chaos/chaos--1.0.sql b/src/test/modules/chaos/chaos--1.0.sql
new file mode 100644
index 0000000000..5016f7e586
--- /dev/null
+++ b/src/test/modules/chaos/chaos--1.0.sql
@@ -0,0 +1,8 @@
+/* src/test/modules/chaos/chaos--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION chaos" to load this file. \quit
+
+CREATE FUNCTION clobber_next_oid(size BIGINT)
+ RETURNS pg_catalog.void STRICT
+ AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/chaos/chaos.c b/src/test/modules/chaos/chaos.c
new file mode 100644
index 0000000000..f1052f865e
--- /dev/null
+++ b/src/test/modules/chaos/chaos.c
@@ -0,0 +1,26 @@
+#include "postgres.h"
+
+#include "access/transam.h"
+#include "fmgr.h"
+#include "storage/lwlock.h"
+
+#include <limits.h>
+
+PG_MODULE_MAGIC;
+
+PG_FUNCTION_INFO_V1(clobber_next_oid);
+
+Datum
+clobber_next_oid(PG_FUNCTION_ARGS)
+{
+ int64 oid = PG_GETARG_INT64(0);
+
+ if (oid < FirstNormalObjectId || oid > UINT_MAX)
+ elog(ERROR, "invalid oid");
+
+ LWLockAcquire(OidGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextOid = oid;
+ LWLockRelease(OidGenLock);
+
+ PG_RETURN_VOID();
+}
diff --git a/src/test/modules/chaos/chaos.control b/src/test/modules/chaos/chaos.control
new file mode 100644
index 0000000000..137ab8a58d
--- /dev/null
+++ b/src/test/modules/chaos/chaos.control
@@ -0,0 +1,4 @@
+comment = 'Test module for throwaway fault-injection hacks...'
+default_version = '1.0'
+module_pathname = '$libdir/chaos'
+relocatable = true
--
2.30.2
0002-HACK-Slow-the-bgwriter-down-a-bit.patch
From 2acc2ad31c1db268de0e8927d5c10ba2bb06e33c Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Thu, 30 Sep 2021 17:16:01 +1300
Subject: [PATCH 2/2] HACK: Slow the bgwriter down a bit.
---
src/backend/postmaster/bgwriter.c | 2 ++
src/backend/storage/buffer/bufmgr.c | 2 ++
2 files changed, 4 insertions(+)
diff --git a/src/backend/postmaster/bgwriter.c b/src/backend/postmaster/bgwriter.c
index 5584f4bc24..b65284b1f6 100644
--- a/src/backend/postmaster/bgwriter.c
+++ b/src/backend/postmaster/bgwriter.c
@@ -238,7 +238,9 @@ BackgroundWriterMain(void)
/*
* Do one cycle of dirty-buffer writing.
*/
+ elog(LOG, "=== begin BgBufferSync ===");
can_hibernate = BgBufferSync(&wb_context);
+ elog(LOG, "=== end BgBufferSync ===");
/*
* Send off activity statistics to the stats collector
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index e88e4e918b..989125e37f 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -2452,6 +2452,8 @@ BgBufferSync(WritebackContext *wb_context)
}
else if (sync_state & BUF_REUSABLE)
reusable_buffers++;
+
+ pg_usleep(1000000);
}
PendingBgWriterStats.m_buf_written_clean += num_written;
--
2.30.2
On Thu, Sep 30, 2021 at 11:32 PM Thomas Munro <thomas.munro@gmail.com> wrote:
I managed to produce a case where live data is written to an unlinked
file and lost
I guess this must have been broken since release 9.2 moved checkpoints
out of here [1]. The connection between checkpoints, tombstone files
and file descriptor cache invalidation in auxiliary (non-sinval)
backends was not documented as far as I can see (or at least not
anywhere near the load-bearing parts).
How could it be fixed, simply and backpatchably? If BgBufferSync()
did if-FirstCallSinceLastCheckpoint()-then-smgrcloseall() after
locking each individual buffer and before flushing, then I think it
might logically have the correct interlocking against relfilenode
wraparound, but that sounds a tad expensive :-( I guess it could be
made cheaper by using atomics for the checkpoint counter instead of
spinlocks. Better ideas?
[1]: /messages/by-id/CA+U5nMLv2ah-HNHaQ=2rxhp_hDJ9jcf-LL2kW3sE4msfnUw9gA@mail.gmail.com
On Tue, Oct 5, 2021 at 4:21 PM Thomas Munro <thomas.munro@gmail.com> wrote:
On Thu, Sep 30, 2021 at 11:32 PM Thomas Munro <thomas.munro@gmail.com> wrote:
I managed to produce a case where live data is written to an unlinked
file and lost
In conclusion, there *is* something else that would break, so I'm
withdrawing this CF entry (#3030) for now. Also, that something else
is already subtly broken, so I'll try to come up with a fix for that
separately.
On Mon, Aug 2, 2021 at 6:38 PM Andres Freund <andres@anarazel.de> wrote:
I guess there's a somewhat hacky way to get somewhere without actually
increasing the size. We could take 3 bytes from the fork number and use that
to get to a 7 byte relfilenode portion. 7 bytes are probably enough for
everyone.
It's not like we can use those bytes in a useful way, due to alignment
requirements. Declaring that the high 7 bytes are for the relNode portion and
the low byte for the fork would still allow efficient comparisons and doesn't
seem too ugly.
I think this idea is worth more consideration. It seems like 2^56
relfilenodes ought to be enough for anyone, recalling that you can
only ever have 2^64 bytes of WAL. So if we do this, we can eliminate a
bunch of code that is there to guard against relfilenodes being
reused. In particular, we can remove the code that leaves a 0-length
tombstone file around until the next checkpoint to guard against
relfilenode reuse. On Windows, we still need
https://commitfest.postgresql.org/36/2962/ because of the problem that
Windows won't remove files from the directory listing until they are
both unlinked and closed. But in general this seems like it would lead
to cleaner code. For example, GetNewRelFileNode() needn't loop. If it
allocates the smallest unsigned integer that the cluster (or database)
has never previously assigned, the file should definitely not exist on
disk, and if it does, an ERROR is appropriate, as the database is
corrupted. This does assume that allocations from this new 56-bit
relfilenode counter are properly WAL-logged.
I think this would also solve a problem Dilip mentioned to me today:
suppose you make ALTER DATABASE SET TABLESPACE WAL-logged, as he's
been trying to do. Then suppose you do "ALTER DATABASE foo SET
TABLESPACE used_recently_but_not_any_more". You might get an error
complaining that "some relations of database \"%s\" are already in
tablespace \"%s\"" because there could be tombstone files in that
database. With this combination of changes, you could just use the
barrier mechanism from https://commitfest.postgresql.org/36/2962/ to
wait for those files to disappear, because they've got to be
previously-unlinked files that Windows is still returning because
they're still open -- or else they could be a sign of a corrupted
database, but there are no other possibilities.
I think, anyway.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Thu, Jan 6, 2022 at 3:07 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Aug 2, 2021 at 6:38 PM Andres Freund <andres@anarazel.de> wrote:
I guess there's a somewhat hacky way to get somewhere without actually
increasing the size. We could take 3 bytes from the fork number and use that
to get to a 7 byte relfilenode portion. 7 bytes are probably enough for
everyone.
It's not like we can use those bytes in a useful way, due to alignment
requirements. Declaring that the high 7 bytes are for the relNode portion and
the low byte for the fork would still allow efficient comparisons and doesn't
seem too ugly.
I think this idea is worth more consideration. It seems like 2^56
relfilenodes ought to be enough for anyone, recalling that you can
only ever have 2^64 bytes of WAL. So if we do this, we can eliminate a
bunch of code that is there to guard against relfilenodes being
reused. In particular, we can remove the code that leaves a 0-length
tombstone file around until the next checkpoint to guard against
relfilenode reuse.
+1
I think this would also solve a problem Dilip mentioned to me today:
suppose you make ALTER DATABASE SET TABLESPACE WAL-logged, as he's
been trying to do. Then suppose you do "ALTER DATABASE foo SET
TABLESPACE used_recently_but_not_any_more". You might get an error
complaining that "some relations of database \"%s\" are already in
tablespace \"%s\"" because there could be tombstone files in that
database. With this combination of changes, you could just use the
barrier mechanism from https://commitfest.postgresql.org/36/2962/ to
wait for those files to disappear, because they've got to be
previously-unlinked files that Windows is still returning because
they're still open -- or else they could be a sign of a corrupted
database, but there are no other possibilities.
Yes, this approach will solve the problem for the WAL-logged ALTER
DATABASE we are facing.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Thu, Jan 6, 2022 at 1:12 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
I think this idea is worth more consideration. It seems like 2^56
relfilenodes ought to be enough for anyone, recalling that you can
only ever have 2^64 bytes of WAL. So if we do this, we can eliminate a
bunch of code that is there to guard against relfilenodes being
reused. In particular, we can remove the code that leaves a 0-length
tombstone file around until the next checkpoint to guard against
relfilenode reuse.
+1
IMHO, a few top-level points for implementing this idea would be as listed here:
1) the "relfilenode" member inside the RelFileNode will now be 64
bits, and the "forkNum" is removed altogether from the BufferTag. So
now whenever we want the relfilenode or fork number we can apply the
respective mask and fetch its value.
2) GetNewRelFileNode() will not loop for checking the file existence
and retry with other relfilenode.
3) Modify mdunlinkfork() so that we immediately perform the unlink
request, make sure to register_forget_request() before unlink.
4) In checkpointer, now we don't need any handling for pendingUnlinks.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Thu, Jan 6, 2022 at 9:13 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Thu, Jan 6, 2022 at 1:12 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
I think this idea is worth more consideration. It seems like 2^56
relfilenodes ought to be enough for anyone, recalling that you can
only ever have 2^64 bytes of WAL. So if we do this, we can eliminate a
bunch of code that is there to guard against relfilenodes being
reused. In particular, we can remove the code that leaves a 0-length
tombstone file around until the next checkpoint to guard against
relfilenode reuse.
+1
+1
IMHO, a few top-level points for implementing this idea would be as listed here:
1) the "relfilenode" member inside the RelFileNode will now be 64
bits, and the "forkNum" is removed altogether from the BufferTag. So
now whenever we want the relfilenode or fork number we can apply the
respective mask and fetch its value.
2) GetNewRelFileNode() will not loop for checking the file existence
and retry with other relfilenode.
3) Modify mdunlinkfork() so that we immediately perform the unlink
request, make sure to register_forget_request() before unlink.
4) In checkpointer, now we don't need any handling for pendingUnlinks.
Another problem is that relfilenodes are normally allocated with
GetNewOidWithIndex(), and initially match a relation's OID. We'd need
a new allocator, and they won't be able to match the OID in general
(while we have 32 bit OIDs at least).
On Thu, Jan 6, 2022 at 3:47 AM Thomas Munro <thomas.munro@gmail.com> wrote:
Another problem is that relfilenodes are normally allocated with
GetNewOidWithIndex(), and initially match a relation's OID. We'd need
a new allocator, and they won't be able to match the OID in general
(while we have 32 bit OIDs at least).
Personally I'm not sad about that. Values that are the same in simple
cases but diverge in more complex cases are kind of a trap for the
unwary. There's no real reason to have them ever match. Yeah, in
theory, it makes it easier to tell which file matches which relation,
but in practice, you always have to double-check in case the table has
ever been rewritten. It doesn't seem worth continuing to contort the
code for a property we can't guarantee anyway.
--
Robert Haas
EDB: http://www.enterprisedb.com
On 2022-01-06 08:52:01 -0500, Robert Haas wrote:
On Thu, Jan 6, 2022 at 3:47 AM Thomas Munro <thomas.munro@gmail.com> wrote:
Another problem is that relfilenodes are normally allocated with
GetNewOidWithIndex(), and initially match a relation's OID. We'd need
a new allocator, and they won't be able to match the OID in general
(while we have 32 bit OIDs at least).
Personally I'm not sad about that. Values that are the same in simple
cases but diverge in more complex cases are kind of a trap for the
unwary.
+1
On Thu, Jan 6, 2022 at 7:22 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Jan 6, 2022 at 3:47 AM Thomas Munro <thomas.munro@gmail.com>
wrote:
Another problem is that relfilenodes are normally allocated with
GetNewOidWithIndex(), and initially match a relation's OID. We'd need
a new allocator, and they won't be able to match the OID in general
(while we have 32 bit OIDs at least).
Personally I'm not sad about that. Values that are the same in simple
cases but diverge in more complex cases are kind of a trap for the
unwary. There's no real reason to have them ever match. Yeah, in
theory, it makes it easier to tell which file matches which relation,
but in practice, you always have to double-check in case the table has
ever been rewritten. It doesn't seem worth continuing to contort the
code for a property we can't guarantee anyway.
Makes sense. I have started working on this idea; I will try to post
the first version by early next week.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Wed, Jan 19, 2022 at 10:37 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Thu, Jan 6, 2022 at 7:22 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Jan 6, 2022 at 3:47 AM Thomas Munro <thomas.munro@gmail.com> wrote:
Another problem is that relfilenodes are normally allocated with
GetNewOidWithIndex(), and initially match a relation's OID. We'd need
a new allocator, and they won't be able to match the OID in general
(while we have 32 bit OIDs at least).
Personally I'm not sad about that. Values that are the same in simple
cases but diverge in more complex cases are kind of a trap for the
unwary. There's no real reason to have them ever match. Yeah, in
theory, it makes it easier to tell which file matches which relation,
but in practice, you always have to double-check in case the table has
ever been rewritten. It doesn't seem worth continuing to contort the
code for a property we can't guarantee anyway.
Makes sense. I have started working on this idea; I will try to post the first version by early next week.
Here is the first working patch; with it, we no longer need to
maintain the tombstone file until the next checkpoint. This is still
a WIP patch, but with it I can see that the ALTER DATABASE SET
TABLESPACE WAL-logging problem, which Robert reported a couple of
mails above in the same thread, is solved.
General idea of the patch:
- Change RelFileNode.relNode to be 64 bits wide, of which 8 bits are
for the fork number and 56 bits for the relNode, as shown below. [1]
- GetNewRelFileNode() will just generate a new unique relfilenode,
check for file existence, and throw an error if the file already
exists, so there is no loop. We also need logic for preserving
nextRelNode across restarts and for WAL-logging it, but that is
similar to how nextOid is preserved.
- mdunlinkfork() will directly forget and unlink the relfilenode, so
we get rid of all the deferred unlinking code.
- Now we don't need any post-checkpoint unlinking activity.
[1]:
/*
* RelNodeId:
*
* this is a storage type for RelNode. The reasoning behind using this is same
* as using the BlockId so refer comment atop BlockId.
*/
typedef struct RelNodeId
{
uint32 rn_hi;
uint32 rn_lo;
} RelNodeId;
typedef struct RelFileNode
{
Oid spcNode; /* tablespace */
Oid dbNode; /* database */
RelNodeId relNode; /* relation */
} RelFileNode;
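For illustration, the two-halves RelNodeId above can be converted to and from a plain 64-bit value following the same pattern BlockId uses. The helper names here are mine, not the patch's:

```c
#include <stdint.h>

/* Two 32-bit halves of a 64-bit relfilenode, mirroring BlockId. */
typedef struct RelNodeId
{
    uint32_t rn_hi;
    uint32_t rn_lo;
} RelNodeId;

static inline void
relnodeid_set(RelNodeId *id, uint64_t relnode)
{
    /* Split the 64-bit value into high and low 32-bit words. */
    id->rn_hi = (uint32_t) (relnode >> 32);
    id->rn_lo = (uint32_t) relnode;
}

static inline uint64_t
relnodeid_get(const RelNodeId *id)
{
    /* Recombine the two words into the original 64-bit value. */
    return ((uint64_t) id->rn_hi << 32) | id->rn_lo;
}
```

Storing two uint32 fields rather than one uint64 avoids imposing 8-byte alignment (and hence padding) on structs that embed RelFileNode, which is the same reasoning given in the comment atop BlockId.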
TODO:
There are a couple of TODOs and FIXMEs which I am planning to address
by next week. I am also planning to test the case where the
relfilenode consumes more than 32 bits; for that, we could set
FirstNormalRelfileNode to a higher value for testing purposes. I also
plan to improve the comments.
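The nextOid-style allocation mentioned earlier (prefetch a batch, WAL-log the upper bound, hand out values from the batch) could look roughly like this single-threaded sketch. Names, the starting value, and the stand-in for XLogPutNextRelFileNode() are all illustrative; the real code would hold RelNodeGenLock while doing this:

```c
#include <stdint.h>

#define VAR_RFN_PREFETCH 8192   /* values logged ahead per WAL record */

static uint64_t next_relnode = 10000;  /* hypothetical FirstNormalRelfileNode */
static uint64_t relnode_count = 0;     /* logged-ahead values still unused */
static uint64_t logged_next = 10000;   /* what a NEXT_RELFILENODE record carries */

static uint64_t
get_new_relnode(void)
{
    uint64_t result;

    /* Used up all logged-ahead values?  "Log" another batch first. */
    if (relnode_count == 0)
    {
        /* Stands in for XLogPutNextRelFileNode(next_relnode + VAR_RFN_PREFETCH). */
        logged_next = next_relnode + VAR_RFN_PREFETCH;
        relnode_count = VAR_RFN_PREFETCH;
    }
    result = next_relnode++;
    relnode_count--;
    return result;
}
```

After a crash, replaying the NEXT_RELFILENODE record restores a value at or beyond anything ever handed out, so at most one batch of values is skipped, never reused.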
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachments:
v1-WIP-0001-Don-t-wait-for-next-checkpoint-to-remove-unwanted.patchtext/x-patch; charset=US-ASCII; name=v1-WIP-0001-Don-t-wait-for-next-checkpoint-to-remove-unwanted.patchDownload
From 4a6502c7950969262c6982388865bbc23e531cde Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilipkumar@localhost.localdomain>
Date: Fri, 28 Jan 2022 18:32:35 +0530
Subject: [PATCH v1] Don't wait for next checkpoint to remove unwanted
relfilenode
Currently, relfilenode is 32 bits wide, so if we remove a relfilenode
immediately after it is no longer needed, there is a risk of reusing
the same relfilenode within the same checkpoint cycle. To avoid that,
we delay cleaning up the relfilenode until the next checkpoint. With
this patch we use 56 bits for the relfilenode. Ideally we could make
it a full 64 bits, but that would increase the size of the BufferTag;
to keep that size the same, we make RelFileNode.relNode 64 bits wide,
of which 8 bits store the fork number and the remaining 56 bits the
relfilenode.
---
contrib/pg_buffercache/pg_buffercache_pages.c | 4 +-
contrib/pg_prewarm/autoprewarm.c | 4 +-
src/backend/access/common/syncscan.c | 3 +-
src/backend/access/gin/ginxlog.c | 5 +-
src/backend/access/rmgrdesc/gistdesc.c | 4 +-
src/backend/access/rmgrdesc/heapdesc.c | 4 +-
src/backend/access/rmgrdesc/nbtdesc.c | 4 +-
src/backend/access/rmgrdesc/seqdesc.c | 4 +-
src/backend/access/rmgrdesc/xlogdesc.c | 15 +++-
src/backend/access/transam/varsup.c | 42 +++++++++-
src/backend/access/transam/xlog.c | 57 ++++++++++---
src/backend/access/transam/xloginsert.c | 12 +++
src/backend/access/transam/xlogutils.c | 9 ++-
src/backend/catalog/catalog.c | 61 +++-----------
src/backend/catalog/heap.c | 6 +-
src/backend/catalog/index.c | 4 +-
src/backend/catalog/storage.c | 3 +-
src/backend/commands/tablecmds.c | 18 +++--
src/backend/replication/logical/decode.c | 1 +
src/backend/replication/logical/reorderbuffer.c | 2 +-
src/backend/storage/buffer/bufmgr.c | 61 +++++++-------
src/backend/storage/buffer/localbuf.c | 8 +-
src/backend/storage/freespace/fsmpage.c | 4 +-
src/backend/storage/lmgr/lwlocknames.txt | 1 +
src/backend/storage/smgr/md.c | 68 +++++-----------
src/backend/storage/sync/sync.c | 101 ------------------------
src/backend/utils/adt/dbsize.c | 10 +--
src/backend/utils/cache/relcache.c | 30 ++++---
src/backend/utils/cache/relmapper.c | 39 ++++-----
src/backend/utils/misc/pg_controldata.c | 9 ++-
src/bin/pg_controldata/pg_controldata.c | 2 +
src/bin/pg_rewind/filemap.c | 16 ++--
src/bin/pg_waldump/pg_waldump.c | 14 ++--
src/common/relpath.c | 22 +++---
src/include/access/transam.h | 6 ++
src/include/access/xlog.h | 1 +
src/include/catalog/catalog.h | 4 +-
src/include/catalog/pg_class.h | 10 +--
src/include/catalog/pg_control.h | 2 +
src/include/commands/tablecmds.h | 2 +-
src/include/common/relpath.h | 6 +-
src/include/storage/buf_internals.h | 12 +--
src/include/storage/relfilenode.h | 66 +++++++++++++++-
src/include/storage/sync.h | 1 -
src/include/utils/relmapper.h | 6 +-
src/test/regress/expected/alter_table.out | 16 ++--
46 files changed, 405 insertions(+), 374 deletions(-)
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 1bd579f..ddf33ac 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -153,10 +153,10 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenode = bufHdr->tag.rnode.relNode;
+ fctx->record[i].relfilenode = RELFILENODE_GETRELNODE(bufHdr->tag.rnode);
fctx->record[i].reltablespace = bufHdr->tag.rnode.spcNode;
fctx->record[i].reldatabase = bufHdr->tag.rnode.dbNode;
- fctx->record[i].forknum = bufHdr->tag.forkNum;
+ fctx->record[i].forknum = RELFILENODE_GETFORKNUM(bufHdr->tag.rnode);
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
fctx->record[i].pinning_backends = BUF_STATE_GET_REFCOUNT(buf_state);
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index 5d40fb5..a03fd03 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -617,8 +617,8 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
{
block_info_array[num_blocks].database = bufHdr->tag.rnode.dbNode;
block_info_array[num_blocks].tablespace = bufHdr->tag.rnode.spcNode;
- block_info_array[num_blocks].filenode = bufHdr->tag.rnode.relNode;
- block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
+ block_info_array[num_blocks].filenode = RELFILENODE_GETRELNODE(bufHdr->tag.rnode);
+ block_info_array[num_blocks].forknum = RELFILENODE_GETFORKNUM(bufHdr->tag.rnode);
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
}
diff --git a/src/backend/access/common/syncscan.c b/src/backend/access/common/syncscan.c
index d5b16c5..386de77 100644
--- a/src/backend/access/common/syncscan.c
+++ b/src/backend/access/common/syncscan.c
@@ -161,7 +161,8 @@ SyncScanShmemInit(void)
*/
item->location.relfilenode.spcNode = InvalidOid;
item->location.relfilenode.dbNode = InvalidOid;
- item->location.relfilenode.relNode = InvalidOid;
+ RELFILENODE_SETRELNODE(item->location.relfilenode,
+ InvalidRelfileNode);
item->location.location = InvalidBlockNumber;
item->prev = (i > 0) ?
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 87e8366..b73a430 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -100,8 +100,9 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
BlockNumber blknum;
BufferGetTag(buffer, &node, &forknum, &blknum);
- elog(ERROR, "failed to add item to index page in %u/%u/%u",
- node.spcNode, node.dbNode, node.relNode);
+ elog(ERROR, "failed to add item to index page in %u/%u/" UINT64_FORMAT,
+ node.spcNode, node.dbNode,
+ RELFILENODE_GETRELNODE(node));
}
}
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index 9cab4fa..4ebe661 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -26,9 +26,9 @@ out_gistxlogPageUpdate(StringInfo buf, gistxlogPageUpdate *xlrec)
static void
out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec)
{
- appendStringInfo(buf, "rel %u/%u/%u; blk %u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" UINT64_FORMAT "; blk %u; latestRemovedXid %u:%u",
xlrec->node.spcNode, xlrec->node.dbNode,
- xlrec->node.relNode, xlrec->block,
+ RELFILENODE_GETRELNODE(xlrec->node), xlrec->block,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
XidFromFullTransactionId(xlrec->latestRemovedFullXid));
}
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 6238085..0e024a9 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -169,10 +169,10 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
{
xl_heap_new_cid *xlrec = (xl_heap_new_cid *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; tid %u/%u",
+ appendStringInfo(buf, "rel %u/%u/" UINT64_FORMAT "; tid %u/%u",
xlrec->target_node.spcNode,
xlrec->target_node.dbNode,
- xlrec->target_node.relNode,
+ RELFILENODE_GETRELNODE(xlrec->target_node),
ItemPointerGetBlockNumber(&(xlrec->target_tid)),
ItemPointerGetOffsetNumber(&(xlrec->target_tid)));
appendStringInfo(buf, "; cmin: %u, cmax: %u, combo: %u",
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index dfbbf4e..78c5eb4 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -100,9 +100,9 @@ btree_desc(StringInfo buf, XLogReaderState *record)
{
xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" UINT64_FORMAT "; latestRemovedXid %u:%u",
xlrec->node.spcNode, xlrec->node.dbNode,
- xlrec->node.relNode,
+ RELFILENODE_GETRELNODE(xlrec->node),
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
XidFromFullTransactionId(xlrec->latestRemovedFullXid));
break;
diff --git a/src/backend/access/rmgrdesc/seqdesc.c b/src/backend/access/rmgrdesc/seqdesc.c
index d9b1e60..56a9e26 100644
--- a/src/backend/access/rmgrdesc/seqdesc.c
+++ b/src/backend/access/rmgrdesc/seqdesc.c
@@ -25,9 +25,9 @@ seq_desc(StringInfo buf, XLogReaderState *record)
xl_seq_rec *xlrec = (xl_seq_rec *) rec;
if (info == XLOG_SEQ_LOG)
- appendStringInfo(buf, "rel %u/%u/%u",
+ appendStringInfo(buf, "rel %u/%u/" UINT64_FORMAT,
xlrec->node.spcNode, xlrec->node.dbNode,
- xlrec->node.relNode);
+ RELFILENODE_GETRELNODE(xlrec->node));
}
const char *
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index e7452af..1c5b561 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -45,8 +45,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
CheckPoint *checkpoint = (CheckPoint *) rec;
appendStringInfo(buf, "redo %X/%X; "
- "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; "
- "oldest xid %u in DB %u; oldest multi %u in DB %u; "
+ "tli %u; prev tli %u; fpw %s; xid %u:%u; relfilenode " UINT64_FORMAT "; oid %u; "
+ "multi %u; offset %u; oldest xid %u in DB %u; oldest multi %u in DB %u; "
"oldest/newest commit timestamp xid: %u/%u; "
"oldest running xid %u; %s",
LSN_FORMAT_ARGS(checkpoint->redo),
@@ -55,6 +55,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->fullPageWrites ? "true" : "false",
EpochFromFullTransactionId(checkpoint->nextXid),
XidFromFullTransactionId(checkpoint->nextXid),
+ checkpoint->nextRelNode,
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
@@ -74,6 +75,13 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
memcpy(&nextOid, rec, sizeof(Oid));
appendStringInfo(buf, "%u", nextOid);
}
+ else if (info == XLOG_NEXT_RELFILENODE)
+ {
+ RelNode nextRelFilenode;
+
+ memcpy(&nextRelFilenode, rec, sizeof(RelNode));
+ appendStringInfo(buf, UINT64_FORMAT, nextRelFilenode);
+ }
else if (info == XLOG_RESTORE_POINT)
{
xl_restore_point *xlrec = (xl_restore_point *) rec;
@@ -169,6 +177,9 @@ xlog_identify(uint8 info)
case XLOG_NEXTOID:
id = "NEXTOID";
break;
+ case XLOG_NEXT_RELFILENODE:
+ id = "NEXT_RELFILENODE";
+ break;
case XLOG_SWITCH:
id = "SWITCH";
break;
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 748120a..22396a5 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -30,6 +30,9 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+/* Number of RelFileNode to prefetch (preallocate) per XLOG write */
+#define VAR_RFN_PREFETCH 8192
+
/* pointer to "variable cache" in shared memory (set up by shmem.c) */
VariableCache ShmemVariableCache = NULL;
@@ -521,8 +524,7 @@ ForceTransactionIdLimitUpdate(void)
* wide, counter wraparound will occur eventually, and therefore it is unwise
* to assume they are unique unless precautions are taken to make them so.
* Hence, this routine should generally not be used directly. The only direct
- * callers should be GetNewOidWithIndex() and GetNewRelFileNode() in
- * catalog/catalog.c.
+ * callers should be GetNewOidWithIndex() in catalog/catalog.c.
*/
Oid
GetNewObjectId(void)
@@ -613,6 +615,42 @@ SetNextObjectId(Oid nextOid)
}
/*
+ * GetNewRelNode
+ *
+ * Similar to GetNewObjectId() but instead of a new Oid it generates a new
+ * relfilenode. The relfilenode is 56 bits wide, so we don't need to
+ * worry about the wraparound case.
+ */
+RelNode
+GetNewRelNode(void)
+{
+ RelNode result;
+
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot assign RelFileNode during recovery");
+
+ LWLockAcquire(RelNodeGenLock, LW_EXCLUSIVE);
+
+ /* If we have run out of logged-for-use RelNodes then we must log more */
+ if (ShmemVariableCache->relnodecount == 0)
+ {
+ XLogPutNextRelFileNode(ShmemVariableCache->nextRelNode +
+ VAR_RFN_PREFETCH);
+
+ ShmemVariableCache->relnodecount = VAR_RFN_PREFETCH;
+ }
+
+ result = ShmemVariableCache->nextRelNode;
+ (ShmemVariableCache->nextRelNode)++;
+ (ShmemVariableCache->relnodecount)--;
+
+ LWLockRelease(RelNodeGenLock);
+
+ return result;
+}
+
+/*
* StopGeneratingPinnedObjectIds
*
* This is called once during initdb to force the OID counter up to
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index dfe2a0b..be633dc 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1541,8 +1541,9 @@ checkXLogConsistency(XLogReaderState *record)
if (memcmp(replay_image_masked, primary_image_masked, BLCKSZ) != 0)
{
elog(FATAL,
- "inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ "inconsistent page found, rel %u/%u/" UINT64_FORMAT ", forknum %u, blkno %u",
+ rnode.spcNode, rnode.dbNode,
+ RELFILENODE_GETRELNODE(rnode),
forknum, blkno);
}
}
@@ -5396,6 +5397,7 @@ BootStrapXLOG(void)
checkPoint.nextXid =
FullTransactionIdFromEpochAndXid(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstGenbkiObjectId;
+ checkPoint.nextRelNode = FirstNormalRelfileNode;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
checkPoint.oldestXid = FirstNormalTransactionId;
@@ -5409,7 +5411,9 @@ BootStrapXLOG(void)
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelNode = checkPoint.nextRelNode;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->relnodecount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -7147,7 +7151,9 @@ StartupXLOG(void)
/* initialize shared memory variables from the checkpoint record */
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelNode = checkPoint.nextRelNode;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->relnodecount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -9259,6 +9265,12 @@ CreateCheckPoint(int flags)
checkPoint.nextOid += ShmemVariableCache->oidCount;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelNodeGenLock, LW_SHARED);
+ checkPoint.nextRelNode = ShmemVariableCache->nextRelNode;
+ if (!shutdown)
+ checkPoint.nextRelNode += ShmemVariableCache->relnodecount;
+ LWLockRelease(RelNodeGenLock);
+
MultiXactGetCheckptMulti(shutdown,
&checkPoint.nextMulti,
&checkPoint.nextMultiOffset,
@@ -9405,11 +9417,6 @@ CreateCheckPoint(int flags)
END_CRIT_SECTION();
/*
- * Let smgr do post-checkpoint cleanup (eg, deleting old files).
- */
- SyncPostCheckpoint();
-
- /*
* Update the average distance between checkpoints if the prior checkpoint
* exists.
*/
@@ -10070,6 +10077,18 @@ XLogPutNextOid(Oid nextOid)
}
/*
+ * Similar to XLogPutNextOid() but instead of writing a NEXTOID log record it
+ * writes an XLOG_NEXT_RELFILENODE log record.
+ */
+void
+XLogPutNextRelFileNode(RelNode nextrelnode)
+{
+ XLogBeginInsert();
+ XLogRegisterData((char *) (&nextrelnode), sizeof(RelNode));
+ (void) XLogInsert(RM_XLOG_ID, XLOG_NEXT_RELFILENODE);
+}
+
+/*
* Write an XLOG SWITCH record.
*
* Here we just blindly issue an XLogInsert request for the record.
@@ -10331,6 +10350,16 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
}
+ if (info == XLOG_NEXT_RELFILENODE)
+ {
+ RelNode nextRelNode;
+
+ memcpy(&nextRelNode, XLogRecGetData(record), sizeof(RelNode));
+ LWLockAcquire(RelNodeGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelNode = nextRelNode;
+ ShmemVariableCache->relnodecount = 0;
+ LWLockRelease(RelNodeGenLock);
+ }
else if (info == XLOG_CHECKPOINT_SHUTDOWN)
{
CheckPoint checkPoint;
@@ -10344,6 +10373,10 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelNodeGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelNode = checkPoint.nextRelNode;
+ ShmemVariableCache->relnodecount = 0;
+ LWLockRelease(RelNodeGenLock);
MultiXactSetNextMXact(checkPoint.nextMulti,
checkPoint.nextMultiOffset);
@@ -10713,15 +10746,17 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blk);
if (forknum != MAIN_FORKNUM)
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, fork %u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" UINT64_FORMAT ", fork %u, blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rnode.spcNode, rnode.dbNode,
+ RELFILENODE_GETRELNODE(rnode),
forknum,
blk);
else
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" UINT64_FORMAT ", blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rnode.spcNode, rnode.dbNode,
+ RELFILENODE_GETRELNODE(rnode),
blk);
if (XLogRecHasBlockImage(record, block_id))
appendStringInfoString(buf, " FPW");
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index c260310..dc5e101 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -244,6 +244,18 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
regbuf = &registered_buffers[block_id];
BufferGetTag(buffer, &regbuf->rnode, &regbuf->forkno, &regbuf->block);
+
+ /*
+ * In the registered buffer we write the fork number separately, so
+ * clear it from the rnode. The reason we need to clear it is that
+ * when we register multiple blocks sharing the same RelFileNode, we
+ * write the RelFileNode only once. So if those blocks were for
+ * different fork numbers and we kept the fork number as part of
+ * RelFileNode.relNode, then we could not reuse the same
+ * RelFileNode.
+ */
+ RELFILENODE_CLEARFORKNUM(regbuf->rnode);
+
regbuf->page = BufferGetPage(buffer);
regbuf->flags = flags;
regbuf->rdata_tail = (XLogRecData *) &regbuf->rdata_head;
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 90e1c483..d09ead1 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -593,17 +593,18 @@ CreateFakeRelcacheEntry(RelFileNode rnode)
rel->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
/* We don't know the name of the relation; use relfilenode instead */
- sprintf(RelationGetRelationName(rel), "%u", rnode.relNode);
+ sprintf(RelationGetRelationName(rel), UINT64_FORMAT,
+ RELFILENODE_GETRELNODE(rnode));
/*
* We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNode may be
- * different from the relation's OID. It shouldn't really matter though.
+ * relation. Note we are setting relId to just FirstNormalObjectId which
+ * is completely bogus. It shouldn't really matter though.
* In recovery, we are running by ourselves and can't have any lock
* conflicts. While syncing, we already hold AccessExclusiveLock.
*/
rel->rd_lockInfo.lockRelId.dbId = rnode.dbNode;
- rel->rd_lockInfo.lockRelId.relId = rnode.relNode;
+ rel->rd_lockInfo.lockRelId.relId = FirstNormalObjectId;
rel->rd_smgr = NULL;
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index dfd5fb6..5afbd07 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -472,27 +472,18 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
/*
* GetNewRelFileNode
- * Generate a new relfilenode number that is unique within the
- * database of the given tablespace.
+ * Generate a new relfilenode number.
*
- * If the relfilenode will also be used as the relation's OID, pass the
- * opened pg_class catalog, and this routine will guarantee that the result
- * is also an unused OID within pg_class. If the result is to be used only
- * as a relfilenode for an existing relation, pass NULL for pg_class.
- *
- * As with GetNewOidWithIndex(), there is some theoretical risk of a race
- * condition, but it doesn't seem worth worrying about.
- *
- * Note: we don't support using this in bootstrap mode. All relations
- * created by bootstrap have preassigned OIDs, so there's no need.
+ * We are using 56 bits for the relfilenode, so we expect it to be unique for
+ * the cluster; if it already exists then report an error.
*/
-Oid
-GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
+RelNode
+GetNewRelFileNode(Oid reltablespace, char relpersistence)
{
RelFileNodeBackend rnode;
char *rpath;
- bool collides;
BackendId backend;
+ RelNode relNode;
/*
* If we ever get here during pg_upgrade, there's something wrong; all
@@ -525,42 +516,16 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
* are properly detected.
*/
rnode.backend = backend;
+ relNode = GetNewRelNode();
+ RELFILENODE_SETRELNODE(rnode.node, relNode);
- do
- {
- CHECK_FOR_INTERRUPTS();
-
- /* Generate the OID */
- if (pg_class)
- rnode.node.relNode = GetNewOidWithIndex(pg_class, ClassOidIndexId,
- Anum_pg_class_oid);
- else
- rnode.node.relNode = GetNewObjectId();
-
- /* Check for existing file of same name */
- rpath = relpath(rnode, MAIN_FORKNUM);
+ /* Check for existing file of same name */
+ rpath = relpath(rnode, MAIN_FORKNUM);
- if (access(rpath, F_OK) == 0)
- {
- /* definite collision */
- collides = true;
- }
- else
- {
- /*
- * Here we have a little bit of a dilemma: if errno is something
- * other than ENOENT, should we declare a collision and loop? In
- * practice it seems best to go ahead regardless of the errno. If
- * there is a colliding file we will get an smgr failure when we
- * attempt to create the new relation file.
- */
- collides = false;
- }
-
- pfree(rpath);
- } while (collides);
+ if (access(rpath, F_OK) == 0)
+ elog(ERROR, "new relfilenode file already exists: \"%s\"\n", rpath);
- return rnode.node.relNode;
+ return relNode;
}
/*
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 7e99de8..d00575a 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -359,7 +359,7 @@ heap_create(const char *relname,
* with oid same as relid.
*/
if (!OidIsValid(relfilenode))
- relfilenode = relid;
+ relfilenode = GetNewRelFileNode(reltablespace, relpersistence);
}
/*
@@ -1243,8 +1243,8 @@ heap_create_with_catalog(const char *relname,
}
if (!OidIsValid(relid))
- relid = GetNewRelFileNode(reltablespace, pg_class_desc,
- relpersistence);
+ relid = GetNewOidWithIndex(pg_class_desc, ClassOidIndexId,
+ Anum_pg_class_oid);
}
/*
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 2308d40..76e3702 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -935,8 +935,8 @@ index_create(Relation heapRelation,
}
else
{
- indexRelationId =
- GetNewRelFileNode(tableSpaceId, pg_class, relpersistence);
+ indexRelationId = GetNewOidWithIndex(pg_class, ClassOidIndexId,
+ Anum_pg_class_oid);
}
}
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index 9b80755..712e995 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -593,7 +593,8 @@ RestorePendingSyncs(char *startAddress)
RelFileNode *rnode;
Assert(pendingSyncHash == NULL);
- for (rnode = (RelFileNode *) startAddress; rnode->relNode != 0; rnode++)
+ for (rnode = (RelFileNode *) startAddress;
+ RELFILENODE_GETRELNODE(*rnode) != 0; rnode++)
AddPendingSync(rnode);
}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 1f0654c..7a048b3 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -3304,7 +3304,7 @@ CheckRelationTableSpaceMove(Relation rel, Oid newTableSpaceId)
void
SetRelationTableSpace(Relation rel,
Oid newTableSpaceId,
- Oid newRelFileNode)
+ RelNode newRelFileNode)
{
Relation pg_class;
HeapTuple tuple;
@@ -3324,7 +3324,7 @@ SetRelationTableSpace(Relation rel,
/* Update the pg_class row. */
rd_rel->reltablespace = (newTableSpaceId == MyDatabaseTableSpace) ?
InvalidOid : newTableSpaceId;
- if (OidIsValid(newRelFileNode))
+ if (newRelFileNode != InvalidRelfileNode)
rd_rel->relfilenode = newRelFileNode;
CatalogTupleUpdate(pg_class, &tuple->t_self, tuple);
@@ -13441,7 +13441,7 @@ TryReuseIndex(Oid oldId, IndexStmt *stmt)
/* If it's a partitioned index, there is no storage to share. */
if (irel->rd_rel->relkind != RELKIND_PARTITIONED_INDEX)
{
- stmt->oldNode = irel->rd_node.relNode;
+ stmt->oldNode = RELFILENODE_GETRELNODE(irel->rd_node);
stmt->oldCreateSubid = irel->rd_createSubid;
stmt->oldFirstRelfilenodeSubid = irel->rd_firstRelfilenodeSubid;
}
@@ -14290,7 +14290,7 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
{
Relation rel;
Oid reltoastrelid;
- Oid newrelfilenode;
+ RelNode newrelfilenode;
RelFileNode newrnode;
List *reltoastidxids = NIL;
ListCell *lc;
@@ -14320,15 +14320,17 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
}
/*
- * Relfilenodes are not unique in databases across tablespaces, so we need
- * to allocate a new one in the new tablespace.
+ * Generate a new relfilenode for the table in new tablespace.
+ *
+ * XXX Relfilenodes are unique in the cluster, so can we use the same
+ * relfilenode in the new tablespace?
*/
- newrelfilenode = GetNewRelFileNode(newTableSpace, NULL,
+ newrelfilenode = GetNewRelFileNode(newTableSpace,
rel->rd_rel->relpersistence);
/* Open old and new relation */
newrnode = rel->rd_node;
- newrnode.relNode = newrelfilenode;
+ RELFILENODE_SETRELNODE(newrnode, newrelfilenode);
newrnode.spcNode = newTableSpace;
/* hand off to AM to actually create the new filenode and copy the data */
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 3fb5a92..23822c1 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -154,6 +154,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
break;
case XLOG_NOOP:
case XLOG_NEXTOID:
+ case XLOG_NEXT_RELFILENODE:
case XLOG_SWITCH:
case XLOG_BACKUP_END:
case XLOG_PARAMETER_CHANGE:
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 19b2ba2..143d403 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -2134,7 +2134,7 @@ ReorderBufferProcessTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
Assert(snapshot_now);
reloid = RelidByRelfilenode(change->data.tp.relnode.spcNode,
- change->data.tp.relnode.relNode);
+ RELFILENODE_GETRELNODE(change->data.tp.relnode));
/*
* Mapped catalog tuple without data, emitted while
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index a2512e7..88d276c 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -818,7 +818,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
TRACE_POSTGRESQL_BUFFER_READ_START(forkNum, blockNum,
smgr->smgr_rnode.node.spcNode,
smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode,
+ RELFILENODE_GETRELNODE(smgr->smgr_rnode.node),
smgr->smgr_rnode.backend,
isExtend);
@@ -880,7 +880,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
smgr->smgr_rnode.node.spcNode,
smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode,
+ RELFILENODE_GETRELNODE(smgr->smgr_rnode.node),
smgr->smgr_rnode.backend,
isExtend,
found);
@@ -1070,7 +1070,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
smgr->smgr_rnode.node.spcNode,
smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode,
+ RELFILENODE_GETRELNODE(smgr->smgr_rnode.node),
smgr->smgr_rnode.backend,
isExtend,
found);
@@ -1249,7 +1249,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
TRACE_POSTGRESQL_BUFFER_WRITE_DIRTY_START(forkNum, blockNum,
smgr->smgr_rnode.node.spcNode,
smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode);
+ RELFILENODE_GETRELNODE(smgr->smgr_rnode.node));
FlushBuffer(buf, NULL);
LWLockRelease(BufferDescriptorGetContentLock(buf));
@@ -1260,7 +1260,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
TRACE_POSTGRESQL_BUFFER_WRITE_DIRTY_DONE(forkNum, blockNum,
smgr->smgr_rnode.node.spcNode,
smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode);
+ RELFILENODE_GETRELNODE(smgr->smgr_rnode.node));
}
else
{
@@ -1640,7 +1640,7 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
- bufHdr->tag.forkNum == forkNum)
+ RELFILENODE_GETFORKNUM(bufHdr->tag.rnode) == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
LocalRefCount[-buffer - 1]--;
@@ -1651,7 +1651,7 @@ ReleaseAndReadBuffer(Buffer buffer,
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
- bufHdr->tag.forkNum == forkNum)
+ RELFILENODE_GETFORKNUM(bufHdr->tag.rnode) == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
}
@@ -1993,8 +1993,8 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
item->tsId = bufHdr->tag.rnode.spcNode;
- item->relNode = bufHdr->tag.rnode.relNode;
- item->forkNum = bufHdr->tag.forkNum;
+ item->relNode = RELFILENODE_GETRELNODE(bufHdr->tag.rnode);
+ item->forkNum = RELFILENODE_GETFORKNUM(bufHdr->tag.rnode);
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2701,7 +2701,8 @@ PrintBufferLeakWarning(Buffer buffer)
}
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rnode, backend, buf->tag.forkNum);
+ path = relpathbackend(buf->tag.rnode, backend,
+ RELFILENODE_GETFORKNUM(buf->tag.rnode));
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2781,7 +2782,7 @@ BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
/* pinned, so OK to read tag without spinlock */
*rnode = bufHdr->tag.rnode;
- *forknum = bufHdr->tag.forkNum;
+ *forknum = RELFILENODE_GETFORKNUM(bufHdr->tag.rnode);
*blknum = bufHdr->tag.blockNum;
}
@@ -2833,11 +2834,11 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
if (reln == NULL)
reln = smgropen(buf->tag.rnode, InvalidBackendId);
- TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
+ TRACE_POSTGRESQL_BUFFER_FLUSH_START(RELFILENODE_GETFORKNUM(buf->tag.rnode),
buf->tag.blockNum,
reln->smgr_rnode.node.spcNode,
reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode);
+ RELFILENODE_GETRELNODE(reln->smgr_rnode.node));
buf_state = LockBufHdr(buf);
@@ -2892,7 +2893,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
* bufToWrite is either the shared buffer or a copy, as appropriate.
*/
smgrwrite(reln,
- buf->tag.forkNum,
+ RELFILENODE_GETFORKNUM(buf->tag.rnode),
buf->tag.blockNum,
bufToWrite,
false);
@@ -2913,11 +2914,11 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
*/
TerminateBufferIO(buf, true, 0);
- TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(buf->tag.forkNum,
+ TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(RELFILENODE_GETFORKNUM(buf->tag.rnode),
buf->tag.blockNum,
reln->smgr_rnode.node.spcNode,
reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode);
+ RELFILENODE_GETRELNODE(reln->smgr_rnode.node));
/* Pop the error context stack */
error_context_stack = errcallback.previous;
@@ -3142,7 +3143,7 @@ DropRelFileNodeBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
for (j = 0; j < nforks; j++)
{
if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) &&
- bufHdr->tag.forkNum == forkNum[j] &&
+ RELFILENODE_GETFORKNUM(bufHdr->tag.rnode) == forkNum[j] &&
bufHdr->tag.blockNum >= firstDelBlock[j])
{
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -3374,7 +3375,7 @@ FindAndDropRelFileNodeBuffers(RelFileNode rnode, ForkNumber forkNum,
buf_state = LockBufHdr(bufHdr);
if (RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
- bufHdr->tag.forkNum == forkNum &&
+ RELFILENODE_GETFORKNUM(bufHdr->tag.rnode) == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
@@ -3528,7 +3529,7 @@ FlushRelationBuffers(Relation rel)
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
smgrwrite(RelationGetSmgr(rel),
- bufHdr->tag.forkNum,
+ RELFILENODE_GETFORKNUM(bufHdr->tag.rnode),
bufHdr->tag.blockNum,
localpage,
false);
@@ -4491,7 +4492,8 @@ AbortBufferIO(void)
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
- path = relpathperm(buf->tag.rnode, buf->tag.forkNum);
+ path = relpathperm(buf->tag.rnode,
+ RELFILENODE_GETFORKNUM(buf->tag.rnode));
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4515,7 +4517,8 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rnode, bufHdr->tag.forkNum);
+ char *path = relpathperm(bufHdr->tag.rnode,
+ RELFILENODE_GETFORKNUM(bufHdr->tag.rnode));
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4534,7 +4537,7 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
char *path = relpathbackend(bufHdr->tag.rnode, MyBackendId,
- bufHdr->tag.forkNum);
+ RELFILENODE_GETFORKNUM(bufHdr->tag.rnode));
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4551,9 +4554,9 @@ rnode_comparator(const void *p1, const void *p2)
RelFileNode n1 = *(const RelFileNode *) p1;
RelFileNode n2 = *(const RelFileNode *) p2;
- if (n1.relNode < n2.relNode)
+ if (RELFILENODE_GETRELNODE(n1) < RELFILENODE_GETRELNODE(n2))
return -1;
- else if (n1.relNode > n2.relNode)
+ else if (RELFILENODE_GETRELNODE(n1) > RELFILENODE_GETRELNODE(n2))
return 1;
if (n1.dbNode < n2.dbNode)
@@ -4634,9 +4637,9 @@ buffertag_comparator(const BufferTag *ba, const BufferTag *bb)
if (ret != 0)
return ret;
- if (ba->forkNum < bb->forkNum)
+ if (RELFILENODE_GETFORKNUM(ba->rnode) < RELFILENODE_GETFORKNUM(bb->rnode))
return -1;
- if (ba->forkNum > bb->forkNum)
+ if (RELFILENODE_GETFORKNUM(ba->rnode) > RELFILENODE_GETFORKNUM(bb->rnode))
return 1;
if (ba->blockNum < bb->blockNum)
@@ -4801,7 +4804,8 @@ IssuePendingWritebacks(WritebackContext *context)
/* different file, stop */
if (!RelFileNodeEquals(cur->tag.rnode, next->tag.rnode) ||
- cur->tag.forkNum != next->tag.forkNum)
+ RELFILENODE_GETFORKNUM(cur->tag.rnode) !=
+ RELFILENODE_GETFORKNUM(next->tag.rnode))
break;
/* ok, block queued twice, skip */
@@ -4820,7 +4824,8 @@ IssuePendingWritebacks(WritebackContext *context)
/* and finally tell the kernel to write the data to storage */
reln = smgropen(tag.rnode, InvalidBackendId);
- smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
+ smgrwriteback(reln, RELFILENODE_GETFORKNUM(tag.rnode), tag.blockNum,
+ nblocks);
}
context->nr_pending = 0;
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index e71f95a..2892733 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -221,7 +221,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
/* And write... */
smgrwrite(oreln,
- bufHdr->tag.forkNum,
+ RELFILENODE_GETFORKNUM(bufHdr->tag.rnode),
bufHdr->tag.blockNum,
localpage,
false);
@@ -338,14 +338,14 @@ DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
if ((buf_state & BM_TAG_VALID) &&
RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
- bufHdr->tag.forkNum == forkNum &&
+ RELFILENODE_GETFORKNUM(bufHdr->tag.rnode) == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
relpathbackend(bufHdr->tag.rnode, MyBackendId,
- bufHdr->tag.forkNum),
+ RELFILENODE_GETFORKNUM(bufHdr->tag.rnode)),
LocalRefCount[i]);
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
@@ -389,7 +389,7 @@ DropRelFileNodeAllLocalBuffers(RelFileNode rnode)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
relpathbackend(bufHdr->tag.rnode, MyBackendId,
- bufHdr->tag.forkNum),
+ RELFILENODE_GETFORKNUM(bufHdr->tag.rnode)),
LocalRefCount[i]);
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index d165b35..cbb667f 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -273,8 +273,8 @@ restart:
BlockNumber blknum;
BufferGetTag(buf, &rnode, &forknum, &blknum);
- elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
- blknum, rnode.spcNode, rnode.dbNode, rnode.relNode);
+ elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/" UINT64_FORMAT,
+ blknum, rnode.spcNode, rnode.dbNode, RELFILENODE_GETRELNODE(rnode));
/* make sure we hold an exclusive lock */
if (!exclusive_lock_held)
diff --git a/src/backend/storage/lmgr/lwlocknames.txt b/src/backend/storage/lmgr/lwlocknames.txt
index 6c7cf6c..1eb6d78 100644
--- a/src/backend/storage/lmgr/lwlocknames.txt
+++ b/src/backend/storage/lmgr/lwlocknames.txt
@@ -53,3 +53,4 @@ XactTruncationLock 44
# 45 was XactTruncationLock until removal of BackendRandomLock
WrapLimitsVacuumLock 46
NotifyQueueTailLock 47
+RelNodeGenLock 48
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index d26c915..8e2c60f 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -124,8 +124,6 @@ static void mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum,
static MdfdVec *mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior);
static void register_dirty_segment(SMgrRelation reln, ForkNumber forknum,
MdfdVec *seg);
-static void register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
- BlockNumber segno);
static void register_forget_request(RelFileNodeBackend rnode, ForkNumber forknum,
BlockNumber segno);
static void _fdvec_resize(SMgrRelation reln,
@@ -321,36 +319,25 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
/*
* Delete or truncate the first segment.
*/
- if (isRedo || forkNum != MAIN_FORKNUM || RelFileNodeBackendIsTemp(rnode))
+ if (!RelFileNodeBackendIsTemp(rnode))
{
- if (!RelFileNodeBackendIsTemp(rnode))
- {
- /* Prevent other backends' fds from holding on to the disk space */
- ret = do_truncate(path);
-
- /* Forget any pending sync requests for the first segment */
- register_forget_request(rnode, forkNum, 0 /* first seg */ );
- }
- else
- ret = 0;
+ /* Prevent other backends' fds from holding on to the disk space */
+ ret = do_truncate(path);
- /* Next unlink the file, unless it was already found to be missing */
- if (ret == 0 || errno != ENOENT)
- {
- ret = unlink(path);
- if (ret < 0 && errno != ENOENT)
- ereport(WARNING,
- (errcode_for_file_access(),
- errmsg("could not remove file \"%s\": %m", path)));
- }
+ /* Forget any pending sync requests for the first segment */
+ register_forget_request(rnode, forkNum, 0 /* first seg */ );
}
else
- {
- /* Prevent other backends' fds from holding on to the disk space */
- ret = do_truncate(path);
+ ret = 0;
- /* Register request to unlink first segment later */
- register_unlink_segment(rnode, forkNum, 0 /* first seg */ );
+ /* Next unlink the file, unless it was already found to be missing */
+ if (ret == 0 || errno != ENOENT)
+ {
+ ret = unlink(path);
+ if (ret < 0 && errno != ENOENT)
+ ereport(WARNING,
+ (errcode_for_file_access(),
+ errmsg("could not remove file \"%s\": %m", path)));
}
/*
@@ -640,7 +627,7 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
TRACE_POSTGRESQL_SMGR_MD_READ_START(forknum, blocknum,
reln->smgr_rnode.node.spcNode,
reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ RELFILENODE_GETRELNODE(reln->smgr_rnode.node),
reln->smgr_rnode.backend);
v = _mdfd_getseg(reln, forknum, blocknum, false,
@@ -655,7 +642,7 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
TRACE_POSTGRESQL_SMGR_MD_READ_DONE(forknum, blocknum,
reln->smgr_rnode.node.spcNode,
reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ RELFILENODE_GETRELNODE(reln->smgr_rnode.node),
reln->smgr_rnode.backend,
nbytes,
BLCKSZ);
@@ -710,7 +697,7 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
TRACE_POSTGRESQL_SMGR_MD_WRITE_START(forknum, blocknum,
reln->smgr_rnode.node.spcNode,
reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ RELFILENODE_GETRELNODE(reln->smgr_rnode.node),
reln->smgr_rnode.backend);
v = _mdfd_getseg(reln, forknum, blocknum, skipFsync,
@@ -725,7 +712,7 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
TRACE_POSTGRESQL_SMGR_MD_WRITE_DONE(forknum, blocknum,
reln->smgr_rnode.node.spcNode,
reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ RELFILENODE_GETRELNODE(reln->smgr_rnode.node),
reln->smgr_rnode.backend,
nbytes,
BLCKSZ);
@@ -995,23 +982,6 @@ register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
}
/*
- * register_unlink_segment() -- Schedule a file to be deleted after next checkpoint
- */
-static void
-register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
- BlockNumber segno)
-{
- FileTag tag;
-
- INIT_MD_FILETAG(tag, rnode.node, forknum, segno);
-
- /* Should never be used with temp relations */
- Assert(!RelFileNodeBackendIsTemp(rnode));
-
- RegisterSyncRequest(&tag, SYNC_UNLINK_REQUEST, true /* retryOnError */ );
-}
-
-/*
* register_forget_request() -- forget any fsyncs for a relation fork's segment
*/
static void
@@ -1036,7 +1006,7 @@ ForgetDatabaseSyncRequests(Oid dbid)
rnode.dbNode = dbid;
rnode.spcNode = 0;
- rnode.relNode = 0;
+ RELFILENODE_SETRELNODE(rnode, 0);
INIT_MD_FILETAG(tag, rnode, InvalidForkNumber, InvalidBlockNumber);
diff --git a/src/backend/storage/sync/sync.c b/src/backend/storage/sync/sync.c
index 543f691..46a1242 100644
--- a/src/backend/storage/sync/sync.c
+++ b/src/backend/storage/sync/sync.c
@@ -188,92 +188,6 @@ SyncPreCheckpoint(void)
}
/*
- * SyncPostCheckpoint() -- Do post-checkpoint work
- *
- * Remove any lingering files that can now be safely removed.
- */
-void
-SyncPostCheckpoint(void)
-{
- int absorb_counter;
- ListCell *lc;
-
- absorb_counter = UNLINKS_PER_ABSORB;
- foreach(lc, pendingUnlinks)
- {
- PendingUnlinkEntry *entry = (PendingUnlinkEntry *) lfirst(lc);
- char path[MAXPGPATH];
-
- /* Skip over any canceled entries */
- if (entry->canceled)
- continue;
-
- /*
- * New entries are appended to the end, so if the entry is new we've
- * reached the end of old entries.
- *
- * Note: if just the right number of consecutive checkpoints fail, we
- * could be fooled here by cycle_ctr wraparound. However, the only
- * consequence is that we'd delay unlinking for one more checkpoint,
- * which is perfectly tolerable.
- */
- if (entry->cycle_ctr == checkpoint_cycle_ctr)
- break;
-
- /* Unlink the file */
- if (syncsw[entry->tag.handler].sync_unlinkfiletag(&entry->tag,
- path) < 0)
- {
- /*
- * There's a race condition, when the database is dropped at the
- * same time that we process the pending unlink requests. If the
- * DROP DATABASE deletes the file before we do, we will get ENOENT
- * here. rmtree() also has to ignore ENOENT errors, to deal with
- * the possibility that we delete the file first.
- */
- if (errno != ENOENT)
- ereport(WARNING,
- (errcode_for_file_access(),
- errmsg("could not remove file \"%s\": %m", path)));
- }
-
- /* Mark the list entry as canceled, just in case */
- entry->canceled = true;
-
- /*
- * As in ProcessSyncRequests, we don't want to stop absorbing fsync
- * requests for a long time when there are many deletions to be done.
- * We can safely call AbsorbSyncRequests() at this point in the loop.
- */
- if (--absorb_counter <= 0)
- {
- AbsorbSyncRequests();
- absorb_counter = UNLINKS_PER_ABSORB;
- }
- }
-
- /*
- * If we reached the end of the list, we can just remove the whole list
- * (remembering to pfree all the PendingUnlinkEntry objects). Otherwise,
- * we must keep the entries at or after "lc".
- */
- if (lc == NULL)
- {
- list_free_deep(pendingUnlinks);
- pendingUnlinks = NIL;
- }
- else
- {
- int ntodelete = list_cell_number(pendingUnlinks, lc);
-
- for (int i = 0; i < ntodelete; i++)
- pfree(list_nth(pendingUnlinks, i));
-
- pendingUnlinks = list_delete_first_n(pendingUnlinks, ntodelete);
- }
-}
-
-/*
* ProcessSyncRequests() -- Process queued fsync requests.
*/
void
@@ -519,21 +433,6 @@ RememberSyncRequest(const FileTag *ftag, SyncRequestType type)
entry->canceled = true;
}
}
- else if (type == SYNC_UNLINK_REQUEST)
- {
- /* Unlink request: put it in the linked list */
- MemoryContext oldcxt = MemoryContextSwitchTo(pendingOpsCxt);
- PendingUnlinkEntry *entry;
-
- entry = palloc(sizeof(PendingUnlinkEntry));
- entry->tag = *ftag;
- entry->cycle_ctr = checkpoint_cycle_ctr;
- entry->canceled = false;
-
- pendingUnlinks = lappend(pendingUnlinks, entry);
-
- MemoryContextSwitchTo(oldcxt);
- }
else
{
/* Normal case: enter a request to fsync this segment */
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 3a2f2e1..dc8bc52 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -945,21 +945,21 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
else
rnode.dbNode = MyDatabaseId;
if (relform->relfilenode)
- rnode.relNode = relform->relfilenode;
+ RELFILENODE_SETRELNODE(rnode, relform->relfilenode);
else /* Consult the relation mapper */
- rnode.relNode = RelationMapOidToFilenode(relid,
- relform->relisshared);
+ RELFILENODE_SETRELNODE(rnode, RelationMapOidToFilenode(relid,
+ relform->relisshared));
}
else
{
/* no storage, return NULL */
- rnode.relNode = InvalidOid;
+ RELFILENODE_SETRELNODE(rnode, InvalidRelfileNode);
/* some compilers generate warnings without these next two lines */
rnode.dbNode = InvalidOid;
rnode.spcNode = InvalidOid;
}
- if (!OidIsValid(rnode.relNode))
+ if (RELFILENODE_GETRELNODE(rnode) == InvalidRelfileNode)
{
ReleaseSysCache(tuple);
PG_RETURN_NULL();
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 2e760e8..f0196c8 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1288,7 +1288,7 @@ retry:
static void
RelationInitPhysicalAddr(Relation relation)
{
- Oid oldnode = relation->rd_node.relNode;
+ RelNode oldnode = RELFILENODE_GETRELNODE(relation->rd_node);
/* these relations kinds never have storage */
if (!RELKIND_HAS_STORAGE(relation->rd_rel->relkind))
@@ -1335,15 +1335,16 @@ RelationInitPhysicalAddr(Relation relation)
heap_freetuple(phys_tuple);
}
- relation->rd_node.relNode = relation->rd_rel->relfilenode;
+ RELFILENODE_SETRELNODE(relation->rd_node,
+ relation->rd_rel->relfilenode);
}
else
{
/* Consult the relation mapper */
- relation->rd_node.relNode =
- RelationMapOidToFilenode(relation->rd_id,
- relation->rd_rel->relisshared);
- if (!OidIsValid(relation->rd_node.relNode))
+ RELFILENODE_SETRELNODE(relation->rd_node,
+ RelationMapOidToFilenode(relation->rd_id,
+ relation->rd_rel->relisshared));
+ if (RELFILENODE_GETRELNODE(relation->rd_node) == InvalidRelfileNode)
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
RelationGetRelationName(relation), relation->rd_id);
}
@@ -1353,7 +1354,8 @@ RelationInitPhysicalAddr(Relation relation)
* rd_firstRelfilenodeSubid. No subtransactions start or end while in
* parallel mode, so the specific SubTransactionId does not matter.
*/
- if (IsParallelWorker() && oldnode != relation->rd_node.relNode)
+ if (IsParallelWorker() && oldnode !=
+ RELFILENODE_GETRELNODE(relation->rd_node))
{
if (RelFileNodeSkippingWAL(relation->rd_node))
relation->rd_firstRelfilenodeSubid = TopSubTransactionId;
@@ -1958,13 +1960,14 @@ formrdesc(const char *relationName, Oid relationReltype,
/*
* All relations made with formrdesc are mapped. This is necessarily so
* because there is no other way to know what filenode they currently
- * have. In bootstrap mode, add them to the initial relation mapper data,
- * specifying that the initial filenode is the same as the OID.
+ * have. In bootstrap mode, add them to the initial relation mapper data.
+ *
+ * TODO: Is it right to allocate a new relnode here?
*/
relation->rd_rel->relfilenode = InvalidOid;
if (IsBootstrapProcessingMode())
RelationMapUpdateMap(RelationGetRelid(relation),
- RelationGetRelid(relation),
+ GetNewRelNode(),
isshared, true);
/*
@@ -3673,7 +3676,7 @@ RelationBuildLocalRelation(const char *relname,
void
RelationSetNewRelfilenode(Relation relation, char persistence)
{
- Oid newrelfilenode;
+ RelNode newrelfilenode;
Relation pg_class;
HeapTuple tuple;
Form_pg_class classform;
@@ -3682,7 +3685,7 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
RelFileNode newrnode;
/* Allocate a new relfilenode */
- newrelfilenode = GetNewRelFileNode(relation->rd_rel->reltablespace, NULL,
+ newrelfilenode = GetNewRelFileNode(relation->rd_rel->reltablespace,
persistence);
/*
@@ -3711,7 +3714,8 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
* caught here, if GetNewRelFileNode messes up for any reason.
*/
newrnode = relation->rd_node;
- newrnode.relNode = newrelfilenode;
+ RELFILENODE_SETRELNODE(newrnode, newrelfilenode);
+
if (RELKIND_HAS_TABLE_AM(relation->rd_rel->relkind))
{
diff --git a/src/backend/utils/cache/relmapper.c b/src/backend/utils/cache/relmapper.c
index 4f6811f..740753d 100644
--- a/src/backend/utils/cache/relmapper.c
+++ b/src/backend/utils/cache/relmapper.c
@@ -79,7 +79,7 @@
typedef struct RelMapping
{
Oid mapoid; /* OID of a catalog */
- Oid mapfilenode; /* its filenode number */
+ RelNodeId mapfilenode; /* its filenode number */
} RelMapping;
typedef struct RelMapFile
@@ -132,7 +132,7 @@ static RelMapFile pending_local_updates;
/* non-export function prototypes */
-static void apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode,
+static void apply_map_update(RelMapFile *map, Oid relationId, RelNode fileNode,
bool add_okay);
static void merge_map_updates(RelMapFile *map, const RelMapFile *updates,
bool add_okay);
@@ -155,7 +155,7 @@ static void perform_relmap_update(bool shared, const RelMapFile *updates);
* Returns InvalidOid if the OID is not known (which should never happen,
* but the caller is in a better position to report a meaningful error).
*/
-Oid
+RelNode
RelationMapOidToFilenode(Oid relationId, bool shared)
{
const RelMapFile *map;
@@ -168,13 +168,13 @@ RelationMapOidToFilenode(Oid relationId, bool shared)
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return RELNODEID_GET_RELNODE(map->mappings[i].mapfilenode);
}
map = &shared_map;
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return RELNODEID_GET_RELNODE(map->mappings[i].mapfilenode);
}
}
else
@@ -183,17 +183,17 @@ RelationMapOidToFilenode(Oid relationId, bool shared)
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return RELNODEID_GET_RELNODE(map->mappings[i].mapfilenode);
}
map = &local_map;
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return RELNODEID_GET_RELNODE(map->mappings[i].mapfilenode);
}
}
- return InvalidOid;
+ return InvalidRelfileNode;
}
/*
@@ -209,7 +209,7 @@ RelationMapOidToFilenode(Oid relationId, bool shared)
* relfilenode doesn't pertain to a mapped relation.
*/
Oid
-RelationMapFilenodeToOid(Oid filenode, bool shared)
+RelationMapFilenodeToOid(RelNode filenode, bool shared)
{
const RelMapFile *map;
int32 i;
@@ -220,13 +220,13 @@ RelationMapFilenodeToOid(Oid filenode, bool shared)
map = &active_shared_updates;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenode == RELNODEID_GET_RELNODE(map->mappings[i].mapfilenode))
return map->mappings[i].mapoid;
}
map = &shared_map;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenode == RELNODEID_GET_RELNODE(map->mappings[i].mapfilenode))
return map->mappings[i].mapoid;
}
}
@@ -235,13 +235,13 @@ RelationMapFilenodeToOid(Oid filenode, bool shared)
map = &active_local_updates;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenode == RELNODEID_GET_RELNODE(map->mappings[i].mapfilenode))
return map->mappings[i].mapoid;
}
map = &local_map;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenode == RELNODEID_GET_RELNODE(map->mappings[i].mapfilenode))
return map->mappings[i].mapoid;
}
}
@@ -258,7 +258,7 @@ RelationMapFilenodeToOid(Oid filenode, bool shared)
* immediately. Otherwise it is made pending until CommandCounterIncrement.
*/
void
-RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
+RelationMapUpdateMap(Oid relationId, RelNode fileNode, bool shared,
bool immediate)
{
RelMapFile *map;
@@ -316,7 +316,8 @@ RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
* add_okay = false to draw an error if not.
*/
static void
-apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode, bool add_okay)
+apply_map_update(RelMapFile *map, Oid relationId, RelNode fileNode,
+ bool add_okay)
{
int32 i;
@@ -325,7 +326,7 @@ apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode, bool add_okay)
{
if (relationId == map->mappings[i].mapoid)
{
- map->mappings[i].mapfilenode = fileNode;
+ RELNODEID_SET_RELNODE(map->mappings[i].mapfilenode, fileNode);
return;
}
}
@@ -337,7 +338,8 @@ apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode, bool add_okay)
if (map->num_mappings >= MAX_MAPPINGS)
elog(ERROR, "ran out of space in relation map");
map->mappings[map->num_mappings].mapoid = relationId;
- map->mappings[map->num_mappings].mapfilenode = fileNode;
+ RELNODEID_SET_RELNODE(map->mappings[map->num_mappings].mapfilenode,
+ fileNode);
map->num_mappings++;
}
@@ -356,7 +358,8 @@ merge_map_updates(RelMapFile *map, const RelMapFile *updates, bool add_okay)
{
apply_map_update(map,
updates->mappings[i].mapoid,
- updates->mappings[i].mapfilenode,
+ RELNODEID_GET_RELNODE(
+ updates->mappings[i].mapfilenode),
add_okay);
}
}
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index 781f8b8..85ed88c 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -79,8 +79,8 @@ pg_control_system(PG_FUNCTION_ARGS)
Datum
pg_control_checkpoint(PG_FUNCTION_ARGS)
{
- Datum values[18];
- bool nulls[18];
+ Datum values[19];
+ bool nulls[19];
TupleDesc tupdesc;
HeapTuple htup;
ControlFileData *ControlFile;
@@ -129,6 +129,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
XIDOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 18, "checkpoint_time",
TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 19, "next_relfilenode",
+ INT8OID, -1, 0);
tupdesc = BlessTupleDesc(tupdesc);
/* Read the control file. */
@@ -202,6 +204,9 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
values[17] = TimestampTzGetDatum(time_t_to_timestamptz(ControlFile->checkPointCopy.time));
nulls[17] = false;
+ values[18] = Int64GetDatum(ControlFile->checkPointCopy.nextRelNode);
+ nulls[18] = false;
+
htup = heap_form_tuple(tupdesc, values, nulls);
PG_RETURN_DATUM(HeapTupleGetDatum(htup));
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index f911f98..4f14a1b 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -250,6 +250,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
EpochFromFullTransactionId(ControlFile->checkPointCopy.nextXid),
XidFromFullTransactionId(ControlFile->checkPointCopy.nextXid));
+ printf(_("Latest checkpoint's NextRelFileNode: " UINT64_FORMAT "\n"),
+ ControlFile->checkPointCopy.nextRelNode);
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 7211090..7d626b7 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -512,6 +512,7 @@ isRelDataFile(const char *path)
RelFileNode rnode;
unsigned int segNo;
int nmatch;
+ uint64 relNode;
bool matched;
/*----
@@ -535,11 +536,12 @@ isRelDataFile(const char *path)
*/
rnode.spcNode = InvalidOid;
rnode.dbNode = InvalidOid;
- rnode.relNode = InvalidOid;
+ RELFILENODE_SETRELNODE(rnode, InvalidRelfileNode);
segNo = 0;
matched = false;
- nmatch = sscanf(path, "global/%u.%u", &rnode.relNode, &segNo);
+ nmatch = sscanf(path, "global/" UINT64_FORMAT ".%u", &relNode, &segNo);
+ RELFILENODE_SETRELNODE(rnode, relNode);
if (nmatch == 1 || nmatch == 2)
{
rnode.spcNode = GLOBALTABLESPACE_OID;
@@ -548,8 +550,9 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "base/%u/%u.%u",
- &rnode.dbNode, &rnode.relNode, &segNo);
+ nmatch = sscanf(path, "base/%u/" UINT64_FORMAT ".%u",
+ &rnode.dbNode, &relNode, &segNo);
+ RELFILENODE_SETRELNODE(rnode, relNode);
if (nmatch == 2 || nmatch == 3)
{
rnode.spcNode = DEFAULTTABLESPACE_OID;
@@ -557,9 +560,10 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/%u.%u",
- &rnode.spcNode, &rnode.dbNode, &rnode.relNode,
+ nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/" UINT64_FORMAT ".%u",
+ &rnode.spcNode, &rnode.dbNode, &relNode,
&segNo);
+ RELFILENODE_SETRELNODE(rnode, relNode);
if (nmatch == 3 || nmatch == 4)
matched = true;
}
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index a6251e1..e88cfdf 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -518,15 +518,17 @@ XLogDumpDisplayRecord(XLogDumpConfig *config, XLogReaderState *record)
XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blk);
if (forknum != MAIN_FORKNUM)
- printf(", blkref #%d: rel %u/%u/%u fork %s blk %u",
+ printf(", blkref #%d: rel %u/%u/" UINT64_FORMAT " fork %s blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rnode.spcNode, rnode.dbNode,
+ RELFILENODE_GETRELNODE(rnode),
forkNames[forknum],
blk);
else
- printf(", blkref #%d: rel %u/%u/%u blk %u",
+ printf(", blkref #%d: rel %u/%u/" UINT64_FORMAT " blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rnode.spcNode, rnode.dbNode,
+ RELFILENODE_GETRELNODE(rnode),
blk);
if (XLogRecHasBlockImage(record, block_id))
{
@@ -548,9 +550,9 @@ XLogDumpDisplayRecord(XLogDumpConfig *config, XLogReaderState *record)
continue;
XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blk);
- printf("\tblkref #%d: rel %u/%u/%u fork %s blk %u",
+ printf("\tblkref #%d: rel %u/%u/" UINT64_FORMAT " fork %s blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rnode.spcNode, rnode.dbNode, RELFILENODE_GETRELNODE(rnode),
forkNames[forknum],
blk);
if (XLogRecHasBlockImage(record, block_id))
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 636c96e..0a458d8 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -138,7 +138,7 @@ GetDatabasePath(Oid dbNode, Oid spcNode)
* the trouble considering BackendId is just int anyway.
*/
char *
-GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
+GetRelationPath(Oid dbNode, Oid spcNode, uint64 relNode,
int backendId, ForkNumber forkNumber)
{
char *path;
@@ -149,10 +149,10 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
Assert(dbNode == 0);
Assert(backendId == InvalidBackendId);
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("global/%u_%s",
+ path = psprintf("global/" UINT64_FORMAT "_%s",
relNode, forkNames[forkNumber]);
else
- path = psprintf("global/%u", relNode);
+ path = psprintf("global/" UINT64_FORMAT, relNode);
}
else if (spcNode == DEFAULTTABLESPACE_OID)
{
@@ -160,21 +160,21 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/%u_%s",
+ path = psprintf("base/%u/" UINT64_FORMAT "_%s",
dbNode, relNode,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/%u",
+ path = psprintf("base/%u/" UINT64_FORMAT,
dbNode, relNode);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/t%d_%u_%s",
+ path = psprintf("base/%u/t%d_" UINT64_FORMAT "_%s",
dbNode, backendId, relNode,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/t%d_%u",
+ path = psprintf("base/%u/t%d_" UINT64_FORMAT,
dbNode, backendId, relNode);
}
}
@@ -184,24 +184,24 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/" UINT64_FORMAT "_%s",
spcNode, TABLESPACE_VERSION_DIRECTORY,
dbNode, relNode,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/" UINT64_FORMAT,
spcNode, TABLESPACE_VERSION_DIRECTORY,
dbNode, relNode);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" UINT64_FORMAT "_%s",
spcNode, TABLESPACE_VERSION_DIRECTORY,
dbNode, backendId, relNode,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" UINT64_FORMAT,
spcNode, TABLESPACE_VERSION_DIRECTORY,
dbNode, backendId, relNode);
}
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 9a2816d..2e68920 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -15,6 +15,7 @@
#define TRANSAM_H
#include "access/xlogdefs.h"
+#include "storage/relfilenode.h"
/* ----------------
@@ -195,6 +196,7 @@ FullTransactionIdAdvance(FullTransactionId *dest)
#define FirstGenbkiObjectId 10000
#define FirstUnpinnedObjectId 12000
#define FirstNormalObjectId 16384
+#define FirstNormalRelfileNode 1
/* OIDs of Template0 and Postgres database are fixed */
#define Template0ObjectId 4
@@ -217,6 +219,9 @@ typedef struct VariableCacheData
*/
Oid nextOid; /* next OID to assign */
uint32 oidCount; /* OIDs available before must do XLOG work */
+ RelNode nextRelNode; /* next relfilenode to assign */
+ uint32 relnodecount; /* relfilenodes available before must do XLOG
+ work */
/*
* These fields are protected by XidGenLock.
@@ -298,6 +303,7 @@ extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
extern bool ForceTransactionIdLimitUpdate(void);
extern Oid GetNewObjectId(void);
+extern RelNode GetNewRelNode(void);
extern void StopGeneratingPinnedObjectIds(void);
#ifdef USE_ASSERT_CHECKING
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index bb0c526..04f0cd6 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -304,6 +304,7 @@ extern bool CreateRestartPoint(int flags);
extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN);
extern XLogRecPtr CalculateMaxmumSafeLSN(void);
extern void XLogPutNextOid(Oid nextOid);
+extern void XLogPutNextRelFileNode(RelNode nextrelnode);
extern XLogRecPtr XLogRestorePoint(const char *rpName);
extern void UpdateFullPageWrites(void);
extern void GetFullPageWriteInfo(XLogRecPtr *RedoRecPtr_p, bool *doPageWrites_p);
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index 60c1215..1b83c79 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -15,6 +15,7 @@
#define CATALOG_H
#include "catalog/pg_class.h"
+#include "storage/relfilenode.h"
#include "utils/relcache.h"
@@ -38,7 +39,6 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
-extern Oid GetNewRelFileNode(Oid reltablespace, Relation pg_class,
- char relpersistence);
+extern RelNode GetNewRelFileNode(Oid reltablespace, char relpersistence);
#endif /* CATALOG_H */
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index 304e8c1..4659ed3 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -52,13 +52,13 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* access method; 0 if not a table / index */
Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
- /* identifier of physical storage file */
- /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
- Oid relfilenode BKI_DEFAULT(0);
-
/* identifier of table space for relation (0 means default for database) */
Oid reltablespace BKI_DEFAULT(0) BKI_LOOKUP_OPT(pg_tablespace);
+ /* identifier of physical storage file */
+ /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
+ int64 relfilenode BKI_DEFAULT(0);
+
/* # of blocks (not always up-to-date) */
int32 relpages BKI_DEFAULT(0);
@@ -154,7 +154,7 @@ typedef FormData_pg_class *Form_pg_class;
DECLARE_UNIQUE_INDEX_PKEY(pg_class_oid_index, 2662, ClassOidIndexId, on pg_class using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_class_relname_nsp_index, 2663, ClassNameNspIndexId, on pg_class using btree(relname name_ops, relnamespace oid_ops));
-DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode oid_ops));
+DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode int8_ops));
#ifdef EXPOSE_TO_CLIENT_CODE
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 1f3dc24..27d584d 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -41,6 +41,7 @@ typedef struct CheckPoint
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
FullTransactionId nextXid; /* next free transaction ID */
+ RelNode nextRelNode; /* next relfile node */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
@@ -78,6 +79,7 @@ typedef struct CheckPoint
#define XLOG_FPI 0xB0
/* 0xC0 is used in Postgres 9.5-11 */
#define XLOG_OVERWRITE_CONTRECORD 0xD0
+#define XLOG_NEXT_RELFILENODE 0xE0
/*
diff --git a/src/include/commands/tablecmds.h b/src/include/commands/tablecmds.h
index 5d4037f..167655e 100644
--- a/src/include/commands/tablecmds.h
+++ b/src/include/commands/tablecmds.h
@@ -66,7 +66,7 @@ extern void SetRelationHasSubclass(Oid relationId, bool relhassubclass);
extern bool CheckRelationTableSpaceMove(Relation rel, Oid newTableSpaceId);
extern void SetRelationTableSpace(Relation rel, Oid newTableSpaceId,
- Oid newRelFileNode);
+ uint64 newRelFileNode);
extern ObjectAddress renameatt(RenameStmt *stmt);
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index a4b5dc8..3756364 100644
--- a/src/include/common/relpath.h
+++ b/src/include/common/relpath.h
@@ -66,7 +66,7 @@ extern int forkname_chars(const char *str, ForkNumber *fork);
*/
extern char *GetDatabasePath(Oid dbNode, Oid spcNode);
-extern char *GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
+extern char *GetRelationPath(Oid dbNode, Oid spcNode, uint64 relNode,
int backendId, ForkNumber forkNumber);
/*
@@ -76,8 +76,8 @@ extern char *GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
/* First argument is a RelFileNode */
#define relpathbackend(rnode, backend, forknum) \
- GetRelationPath((rnode).dbNode, (rnode).spcNode, (rnode).relNode, \
- backend, forknum)
+ GetRelationPath((rnode).dbNode, (rnode).spcNode, \
+ RELFILENODE_GETRELNODE((rnode)), backend, forknum)
/* First argument is a RelFileNode */
#define relpathperm(rnode, forknum) \
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index b903d2b..293dc90 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -21,6 +21,7 @@
#include "storage/condition_variable.h"
#include "storage/latch.h"
#include "storage/lwlock.h"
+#include "storage/relfilenode.h"
#include "storage/shmem.h"
#include "storage/smgr.h"
#include "storage/spin.h"
@@ -91,7 +92,6 @@
typedef struct buftag
{
RelFileNode rnode; /* physical relation identifier */
- ForkNumber forkNum;
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
@@ -99,23 +99,23 @@ typedef struct buftag
( \
(a).rnode.spcNode = InvalidOid, \
(a).rnode.dbNode = InvalidOid, \
- (a).rnode.relNode = InvalidOid, \
- (a).forkNum = InvalidForkNumber, \
+ RELFILENODE_SETRELNODE((a).rnode, 0), \
+ RELFILENODE_SETFORKNUM((a).rnode, InvalidForkNumber), \
(a).blockNum = InvalidBlockNumber \
)
#define INIT_BUFFERTAG(a,xx_rnode,xx_forkNum,xx_blockNum) \
( \
(a).rnode = (xx_rnode), \
- (a).forkNum = (xx_forkNum), \
- (a).blockNum = (xx_blockNum) \
+ (a).blockNum = (xx_blockNum), \
+ RELFILENODE_SETFORKNUM((a).rnode, (xx_forkNum)) \
)
#define BUFFERTAGS_EQUAL(a,b) \
( \
RelFileNodeEquals((a).rnode, (b).rnode) && \
(a).blockNum == (b).blockNum && \
- (a).forkNum == (b).forkNum \
+ RELFILENODE_GETFORKNUM((a).rnode) == RELFILENODE_GETFORKNUM((b).rnode) \
)
/*
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
index 4fdc606..57b1c2c 100644
--- a/src/include/storage/relfilenode.h
+++ b/src/include/storage/relfilenode.h
@@ -17,6 +17,27 @@
#include "common/relpath.h"
#include "storage/backendid.h"
+/* FIXME: where to keep this typedef. */
+typedef uint64 RelNode;
+
+#ifdef __cplusplus
+#define InvalidRelfileNode (RelNode(0))
+#else
+#define InvalidRelfileNode ((RelNode) 0)
+#endif
+
+/*
+ * RelNodeId:
+ *
+ * This is a storage type for RelNode.  The reasoning behind using it is the
+ * same as for BlockId, so refer to the comment atop BlockId.
+ */
+typedef struct RelNodeId
+{
+ uint32 rn_hi;
+ uint32 rn_lo;
+} RelNodeId;
+
/*
* RelFileNode must provide all that we need to know to physically access
* a relation, with the exception of the backend ID, which can be provided
@@ -58,7 +79,7 @@ typedef struct RelFileNode
{
Oid spcNode; /* tablespace */
Oid dbNode; /* database */
- Oid relNode; /* relation */
+ RelNodeId relNode; /* relation */
} RelFileNode;
/*
@@ -86,14 +107,53 @@ typedef struct RelFileNodeBackend
* RelFileNodeBackendEquals.
*/
#define RelFileNodeEquals(node1, node2) \
- ((node1).relNode == (node2).relNode && \
+ ((RELFILENODE_GETRELNODE((node1)) == RELFILENODE_GETRELNODE((node2))) && \
(node1).dbNode == (node2).dbNode && \
(node1).spcNode == (node2).spcNode)
#define RelFileNodeBackendEquals(node1, node2) \
- ((node1).node.relNode == (node2).node.relNode && \
+ (RELFILENODE_GETRELNODE((node1)) == RELFILENODE_GETRELNODE((node2)) && \
(node1).node.dbNode == (node2).node.dbNode && \
(node1).backend == (node2).backend && \
(node1).node.spcNode == (node2).node.spcNode)
+/*
+ * These macros define the relNode field of the RelFileNode: the 8 high-order
+ * bits hold the fork number and the remaining 56 bits hold the relfilenode.
+ */
+#define RELFILENODE_RELNODE_BITS 56
+#define RELFILENODE_RELNODE_MASK ((((uint64) 1) << RELFILENODE_RELNODE_BITS) - 1)
+#define RELFILENODE_RELNODE_MASK1 ((((uint32) 1) << 24) - 1)
+
+/* Getting and setting a RelNode in a RelNodeId. */
+#define RELNODEID_GET_RELNODE(rnode) \
+ (uint64) (((uint64) (rnode).rn_hi << 32) | ((uint32) (rnode).rn_lo))
+
+#define RELNODEID_SET_RELNODE(rnode, val) \
+( \
+ (rnode).rn_hi = (val) >> 32, \
+ (rnode).rn_lo = (val) & 0xffffffff \
+)
+
+/*
+ * Macros to get and set the relNode and forkNum inside RelFileNode.relNode.
+ */
+#define RELFILENODE_GETRELNODE(rnode) \
+ (RELNODEID_GET_RELNODE((rnode).relNode) & RELFILENODE_RELNODE_MASK)
+
+#define RELFILENODE_GETFORKNUM(rnode) \
+ (RELNODEID_GET_RELNODE((rnode).relNode) >> RELFILENODE_RELNODE_BITS)
+
+#define RELFILENODE_SETRELNODE(rnode, val) \
+ RELNODEID_SET_RELNODE((rnode).relNode, (val) & RELFILENODE_RELNODE_MASK)
+
+#define RELFILENODE_SETFORKNUM(rnode, forkNum) \
+ RELNODEID_SET_RELNODE((rnode).relNode, \
+ (RELNODEID_GET_RELNODE((rnode).relNode)) | \
+ ((uint64) (forkNum) << RELFILENODE_RELNODE_BITS))
+
+/* Clear fork number from RelFileNode.relNode. */
+#define RELFILENODE_CLEARFORKNUM(rnode) \
+ RELFILENODE_SETRELNODE(rnode, RELFILENODE_GETRELNODE(rnode))
+
#endif /* RELFILENODE_H */
diff --git a/src/include/storage/sync.h b/src/include/storage/sync.h
index 9737e1e..4d67850 100644
--- a/src/include/storage/sync.h
+++ b/src/include/storage/sync.h
@@ -57,7 +57,6 @@ typedef struct FileTag
extern void InitSync(void);
extern void SyncPreCheckpoint(void);
-extern void SyncPostCheckpoint(void);
extern void ProcessSyncRequests(void);
extern void RememberSyncRequest(const FileTag *ftag, SyncRequestType type);
extern bool RegisterSyncRequest(const FileTag *ftag, SyncRequestType type,
diff --git a/src/include/utils/relmapper.h b/src/include/utils/relmapper.h
index 9fbb5a7..58234a8 100644
--- a/src/include/utils/relmapper.h
+++ b/src/include/utils/relmapper.h
@@ -35,11 +35,11 @@ typedef struct xl_relmap_update
#define MinSizeOfRelmapUpdate offsetof(xl_relmap_update, data)
-extern Oid RelationMapOidToFilenode(Oid relationId, bool shared);
+extern RelNode RelationMapOidToFilenode(Oid relationId, bool shared);
-extern Oid RelationMapFilenodeToOid(Oid relationId, bool shared);
+extern Oid RelationMapFilenodeToOid(RelNode relationId, bool shared);
-extern void RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
+extern void RelationMapUpdateMap(Oid relationId, RelNode fileNode, bool shared,
bool immediate);
extern void RelationMapRemoveMapping(Oid relationId);
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index 16e0475..3de3b1c 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -2175,10 +2175,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+---------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | t | own | child 0 index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | t | own | child 1 index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | t | orig | child 0 index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | t | orig | child 1 index
at_partitioned_id_name_key | t | none | parent index
(6 rows)
@@ -2209,10 +2209,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+--------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | f | own | parent index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | f | own | parent index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | f | OTHER | parent index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | f | OTHER | parent index
at_partitioned_id_name_key | f | none | parent index
(6 rows)
--
1.8.3.1
On Fri, Jan 28, 2022 at 8:10 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Wed, Jan 19, 2022 at 10:37 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
TODO:
There are a couple of TODOs and FIXMEs which I am planning to address
by next week. I am also planning to test the case where the relfilenode
consumes more than 32 bits; for that we could set FirstNormalRelfileNode
to a higher value for testing purposes. I also plan to improve the
comments.
I have fixed most of the TODOs and FIXMEs, but there are still a few I
could not decide on. The main one: we currently have no uint8 data type,
only int8, so I have used int8 for storing the relfilenode + fork number.
That is sufficient, because I don't think we will ever have more than 128
fork numbers. But my question is: should we consider adding uint8 as a
new data type, or in fact make RelNode itself a new data type, like Oid?
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachments:
v2-WIP-0001-Don-t-delay-removing-Tombstone-file-until-next-ch.patch (text/x-patch)
From 044fa268cf3a8e85a8bc3c2f05f4d296a80a6ab1 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilipkumar@localhost.localdomain>
Date: Sun, 30 Jan 2022 20:06:59 +0530
Subject: [PATCH v2] Don't delay removing Tombstone file until next checkpoint
Currently, we cannot remove an unused relfilenode until the
next checkpoint: if we removed it immediately, Oid wraparound
could cause the same relfilenode to be reused for two
different relations within a single checkpoint cycle.
This patch removes the need to keep tombstone files around
until the next checkpoint by making relfilenodes unique
within a cluster. To do that, we widen RelFileNode.relNode
to 64 bits so that a relfilenode is never reused. That alone
would make the buffer tag 32 bits wider, so to avoid that we
remove the ForkNumber from the buffer tag and store the fork
number in the 8 high-order bits of relNode, leaving the
remaining 56 bits for the relfilenode.
---
.../pg_buffercache/pg_buffercache--1.0--1.1.sql | 2 +-
contrib/pg_buffercache/pg_buffercache--1.2.sql | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 10 +-
contrib/pg_prewarm/autoprewarm.c | 4 +-
doc/src/sgml/catalogs.sgml | 2 +-
doc/src/sgml/pgbuffercache.sgml | 2 +-
src/backend/access/common/syncscan.c | 3 +-
src/backend/access/gin/ginxlog.c | 5 +-
src/backend/access/rmgrdesc/gistdesc.c | 4 +-
src/backend/access/rmgrdesc/heapdesc.c | 4 +-
src/backend/access/rmgrdesc/nbtdesc.c | 4 +-
src/backend/access/rmgrdesc/seqdesc.c | 4 +-
src/backend/access/rmgrdesc/xlogdesc.c | 15 ++-
src/backend/access/transam/varsup.c | 52 ++++++++++-
src/backend/access/transam/xlog.c | 57 +++++++++---
src/backend/access/transam/xlogutils.c | 9 +-
src/backend/catalog/catalog.c | 61 +++----------
src/backend/catalog/heap.c | 23 +++--
src/backend/catalog/index.c | 15 ++-
src/backend/catalog/storage.c | 3 +-
src/backend/commands/cluster.c | 4 +-
src/backend/commands/indexcmds.c | 6 +-
src/backend/commands/sequence.c | 2 +-
src/backend/commands/tablecmds.c | 23 +++--
src/backend/nodes/outfuncs.c | 2 +-
src/backend/parser/parse_utilcmd.c | 4 +-
src/backend/replication/logical/decode.c | 1 +
src/backend/replication/logical/reorderbuffer.c | 2 +-
src/backend/storage/buffer/bufmgr.c | 68 ++++++++------
src/backend/storage/buffer/localbuf.c | 8 +-
src/backend/storage/freespace/fsmpage.c | 4 +-
src/backend/storage/lmgr/lwlocknames.txt | 1 +
src/backend/storage/smgr/md.c | 68 ++++----------
src/backend/storage/sync/sync.c | 101 ---------------------
src/backend/utils/adt/dbsize.c | 22 ++---
src/backend/utils/adt/pg_upgrade_support.c | 4 +-
src/backend/utils/cache/relcache.c | 35 +++----
src/backend/utils/cache/relfilenodemap.c | 10 +-
src/backend/utils/cache/relmapper.c | 39 ++++----
src/backend/utils/misc/pg_controldata.c | 9 +-
src/bin/pg_checksums/pg_checksums.c | 6 +-
src/bin/pg_controldata/pg_controldata.c | 2 +
src/bin/pg_dump/pg_dump.c | 20 ++--
src/bin/pg_rewind/filemap.c | 16 ++--
src/bin/pg_upgrade/info.c | 4 +-
src/bin/pg_upgrade/pg_upgrade.h | 4 +-
src/bin/pg_upgrade/relfilenode.c | 4 +-
src/bin/pg_waldump/pg_waldump.c | 14 +--
src/common/relpath.c | 22 ++---
src/fe_utils/option_utils.c | 42 +++++++++
src/include/access/transam.h | 4 +
src/include/access/xlog.h | 1 +
src/include/catalog/binary_upgrade.h | 2 +-
src/include/catalog/catalog.h | 4 +-
src/include/catalog/heap.h | 2 +-
src/include/catalog/index.h | 2 +-
src/include/catalog/pg_class.h | 10 +-
src/include/catalog/pg_control.h | 2 +
src/include/catalog/pg_proc.dat | 6 +-
src/include/commands/tablecmds.h | 2 +-
src/include/common/relpath.h | 6 +-
src/include/fe_utils/option_utils.h | 3 +
src/include/nodes/parsenodes.h | 2 +-
src/include/postgres_ext.h | 15 +++
src/include/storage/buf_internals.h | 12 +--
src/include/storage/relfilenode.h | 70 ++++++++++++--
src/include/storage/sync.h | 1 -
src/include/utils/rel.h | 2 +-
src/include/utils/relcache.h | 2 +-
src/include/utils/relfilenodemap.h | 2 +-
src/include/utils/relmapper.h | 6 +-
src/test/regress/expected/alter_table.out | 20 ++--
src/test/regress/sql/alter_table.sql | 4 +-
73 files changed, 545 insertions(+), 463 deletions(-)
diff --git a/contrib/pg_buffercache/pg_buffercache--1.0--1.1.sql b/contrib/pg_buffercache/pg_buffercache--1.0--1.1.sql
index 54d02f5..5e93238 100644
--- a/contrib/pg_buffercache/pg_buffercache--1.0--1.1.sql
+++ b/contrib/pg_buffercache/pg_buffercache--1.0--1.1.sql
@@ -6,6 +6,6 @@
-- Upgrade view to 1.1. format
CREATE OR REPLACE VIEW pg_buffercache AS
SELECT P.* FROM pg_buffercache_pages() AS P
- (bufferid integer, relfilenode oid, reltablespace oid, reldatabase oid,
+ (bufferid integer, relfilenode int8, reltablespace oid, reldatabase oid,
relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
pinning_backends int4);
diff --git a/contrib/pg_buffercache/pg_buffercache--1.2.sql b/contrib/pg_buffercache/pg_buffercache--1.2.sql
index 6ee5d84..f52ddcd 100644
--- a/contrib/pg_buffercache/pg_buffercache--1.2.sql
+++ b/contrib/pg_buffercache/pg_buffercache--1.2.sql
@@ -12,7 +12,7 @@ LANGUAGE C PARALLEL SAFE;
-- Create a view for convenient access.
CREATE VIEW pg_buffercache AS
SELECT P.* FROM pg_buffercache_pages() AS P
- (bufferid integer, relfilenode oid, reltablespace oid, reldatabase oid,
+ (bufferid integer, relfilenode int8, reltablespace oid, reldatabase oid,
relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
pinning_backends int4);
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 1bd579f..ab1f959 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -26,7 +26,7 @@ PG_MODULE_MAGIC;
typedef struct
{
uint32 bufferid;
- Oid relfilenode;
+ RelNode relfilenode;
Oid reltablespace;
Oid reldatabase;
ForkNumber forknum;
@@ -103,7 +103,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
TupleDescInitEntry(tupledesc, (AttrNumber) 1, "bufferid",
INT4OID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 2, "relfilenode",
- OIDOID, -1, 0);
+ INT8OID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 3, "reltablespace",
OIDOID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 4, "reldatabase",
@@ -153,10 +153,10 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenode = bufHdr->tag.rnode.relNode;
+ fctx->record[i].relfilenode = RELFILENODE_GETRELNODE(bufHdr->tag.rnode);
fctx->record[i].reltablespace = bufHdr->tag.rnode.spcNode;
fctx->record[i].reldatabase = bufHdr->tag.rnode.dbNode;
- fctx->record[i].forknum = bufHdr->tag.forkNum;
+ fctx->record[i].forknum = RELFILENODE_GETFORKNUM(bufHdr->tag.rnode);
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
fctx->record[i].pinning_backends = BUF_STATE_GET_REFCOUNT(buf_state);
@@ -209,7 +209,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
}
else
{
- values[1] = ObjectIdGetDatum(fctx->record[i].relfilenode);
+ values[1] = Int8GetDatum(fctx->record[i].relfilenode);
nulls[1] = false;
values[2] = ObjectIdGetDatum(fctx->record[i].reltablespace);
nulls[2] = false;
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index 5d40fb5..a03fd03 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -617,8 +617,8 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
{
block_info_array[num_blocks].database = bufHdr->tag.rnode.dbNode;
block_info_array[num_blocks].tablespace = bufHdr->tag.rnode.spcNode;
- block_info_array[num_blocks].filenode = bufHdr->tag.rnode.relNode;
- block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
+ block_info_array[num_blocks].filenode = RELFILENODE_GETRELNODE(bufHdr->tag.rnode);
+ block_info_array[num_blocks].forknum = RELFILENODE_GETFORKNUM(bufHdr->tag.rnode);
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
}
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 1e65c42..6eddce1 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1960,7 +1960,7 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
</para>
<para>
Name of the on-disk file of this relation; zero means this
diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
index e68d159..631cd2f 100644
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@@ -62,7 +62,7 @@
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
(references <link linkend="catalog-pg-class"><structname>pg_class</structname></link>.<structfield>relfilenode</structfield>)
</para>
<para>
diff --git a/src/backend/access/common/syncscan.c b/src/backend/access/common/syncscan.c
index d5b16c5..386de77 100644
--- a/src/backend/access/common/syncscan.c
+++ b/src/backend/access/common/syncscan.c
@@ -161,7 +161,8 @@ SyncScanShmemInit(void)
*/
item->location.relfilenode.spcNode = InvalidOid;
item->location.relfilenode.dbNode = InvalidOid;
- item->location.relfilenode.relNode = InvalidOid;
+ RELFILENODE_SETRELNODE(item->location.relfilenode,
+ InvalidRelfileNode);
item->location.location = InvalidBlockNumber;
item->prev = (i > 0) ?
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 87e8366..131e6f1 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -100,8 +100,9 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
BlockNumber blknum;
BufferGetTag(buffer, &node, &forknum, &blknum);
- elog(ERROR, "failed to add item to index page in %u/%u/%u",
- node.spcNode, node.dbNode, node.relNode);
+ elog(ERROR, "failed to add item to index page in %u/%u/" INT64_FORMAT,
+ node.spcNode, node.dbNode,
+ RELFILENODE_GETRELNODE(node));
}
}
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index 9cab4fa..7ba70c0 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -26,9 +26,9 @@ out_gistxlogPageUpdate(StringInfo buf, gistxlogPageUpdate *xlrec)
static void
out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec)
{
- appendStringInfo(buf, "rel %u/%u/%u; blk %u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; blk %u; latestRemovedXid %u:%u",
xlrec->node.spcNode, xlrec->node.dbNode,
- xlrec->node.relNode, xlrec->block,
+ RELFILENODE_GETRELNODE(xlrec->node), xlrec->block,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
XidFromFullTransactionId(xlrec->latestRemovedFullXid));
}
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 6238085..3c28c09 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -169,10 +169,10 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
{
xl_heap_new_cid *xlrec = (xl_heap_new_cid *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; tid %u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; tid %u/%u",
xlrec->target_node.spcNode,
xlrec->target_node.dbNode,
- xlrec->target_node.relNode,
+ RELFILENODE_GETRELNODE(xlrec->target_node),
ItemPointerGetBlockNumber(&(xlrec->target_tid)),
ItemPointerGetOffsetNumber(&(xlrec->target_tid)));
appendStringInfo(buf, "; cmin: %u, cmax: %u, combo: %u",
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index dfbbf4e..cfcc3a1 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -100,9 +100,9 @@ btree_desc(StringInfo buf, XLogReaderState *record)
{
xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; latestRemovedXid %u:%u",
xlrec->node.spcNode, xlrec->node.dbNode,
- xlrec->node.relNode,
+ RELFILENODE_GETRELNODE(xlrec->node),
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
XidFromFullTransactionId(xlrec->latestRemovedFullXid));
break;
diff --git a/src/backend/access/rmgrdesc/seqdesc.c b/src/backend/access/rmgrdesc/seqdesc.c
index d9b1e60..ffa6a86 100644
--- a/src/backend/access/rmgrdesc/seqdesc.c
+++ b/src/backend/access/rmgrdesc/seqdesc.c
@@ -25,9 +25,9 @@ seq_desc(StringInfo buf, XLogReaderState *record)
xl_seq_rec *xlrec = (xl_seq_rec *) rec;
if (info == XLOG_SEQ_LOG)
- appendStringInfo(buf, "rel %u/%u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT,
xlrec->node.spcNode, xlrec->node.dbNode,
- xlrec->node.relNode);
+ RELFILENODE_GETRELNODE(xlrec->node));
}
const char *
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index e7452af..9066566 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -45,8 +45,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
CheckPoint *checkpoint = (CheckPoint *) rec;
appendStringInfo(buf, "redo %X/%X; "
- "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; "
- "oldest xid %u in DB %u; oldest multi %u in DB %u; "
+ "tli %u; prev tli %u; fpw %s; xid %u:%u; relfilenode " INT64_FORMAT "; oid %u; "
+ "multi %u; offset %u; oldest xid %u in DB %u; oldest multi %u in DB %u; "
"oldest/newest commit timestamp xid: %u/%u; "
"oldest running xid %u; %s",
LSN_FORMAT_ARGS(checkpoint->redo),
@@ -55,6 +55,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->fullPageWrites ? "true" : "false",
EpochFromFullTransactionId(checkpoint->nextXid),
XidFromFullTransactionId(checkpoint->nextXid),
+ checkpoint->nextRelNode,
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
@@ -74,6 +75,13 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
memcpy(&nextOid, rec, sizeof(Oid));
appendStringInfo(buf, "%u", nextOid);
}
+ else if (info == XLOG_NEXT_RELFILENODE)
+ {
+ RelNode nextRelFilenode;
+
+ memcpy(&nextRelFilenode, rec, sizeof(RelNode));
+ appendStringInfo(buf, INT64_FORMAT, nextRelFilenode);
+ }
else if (info == XLOG_RESTORE_POINT)
{
xl_restore_point *xlrec = (xl_restore_point *) rec;
@@ -169,6 +177,9 @@ xlog_identify(uint8 info)
case XLOG_NEXTOID:
id = "NEXTOID";
break;
+ case XLOG_NEXT_RELFILENODE:
+ id = "NEXT_RELFILENODE";
+ break;
case XLOG_SWITCH:
id = "SWITCH";
break;
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 748120a..7edb9fa 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -30,6 +30,9 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+/* Number of RelFileNodes to prefetch (preallocate) per XLOG write */
+#define VAR_RFN_PREFETCH 8192
+
/* pointer to "variable cache" in shared memory (set up by shmem.c) */
VariableCache ShmemVariableCache = NULL;
@@ -521,8 +524,7 @@ ForceTransactionIdLimitUpdate(void)
* wide, counter wraparound will occur eventually, and therefore it is unwise
* to assume they are unique unless precautions are taken to make them so.
* Hence, this routine should generally not be used directly. The only direct
- * callers should be GetNewOidWithIndex() and GetNewRelFileNode() in
- * catalog/catalog.c.
+ * callers should be GetNewOidWithIndex() in catalog/catalog.c.
*/
Oid
GetNewObjectId(void)
@@ -613,6 +615,52 @@ SetNextObjectId(Oid nextOid)
}
/*
+ * GetNewRelNode
+ *
+ * Similar to GetNewObjectId(), but generates a new relfilenode number instead
+ * of a new Oid.
+ */
+RelNode
+GetNewRelNode(void)
+{
+ RelNode result;
+
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot assign RelFileNode during recovery");
+
+ LWLockAcquire(RelNodeGenLock, LW_EXCLUSIVE);
+
+ /*
+ * Check for wraparound of the relnode counter.
+ *
+ * XXX In practice the relnode is 56 bits wide, so wraparound should never
+ * actually occur.
+ */
+ if (ShmemVariableCache->nextRelNode > MAX_RELFILENODE)
+ {
+ ShmemVariableCache->nextRelNode = FirstNormalRelfileNode;
+ ShmemVariableCache->relnodecount = 0;
+ }
+
+ /* If we have run out of logged-for-use RelNodes, we must log more */
+ if (ShmemVariableCache->relnodecount == 0)
+ {
+ XLogPutNextRelFileNode(ShmemVariableCache->nextRelNode +
+ VAR_RFN_PREFETCH);
+
+ ShmemVariableCache->relnodecount = VAR_RFN_PREFETCH;
+ }
+
+ result = ShmemVariableCache->nextRelNode;
+ (ShmemVariableCache->nextRelNode)++;
+ (ShmemVariableCache->relnodecount)--;
+
+ LWLockRelease(RelNodeGenLock);
+
+ return result;
+}
+
+/*
* StopGeneratingPinnedObjectIds
*
* This is called once during initdb to force the OID counter up to
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index dfe2a0b..290e4fc 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1541,8 +1541,9 @@ checkXLogConsistency(XLogReaderState *record)
if (memcmp(replay_image_masked, primary_image_masked, BLCKSZ) != 0)
{
elog(FATAL,
- "inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ "inconsistent page found, rel %u/%u/" INT64_FORMAT ", forknum %u, blkno %u",
+ rnode.spcNode, rnode.dbNode,
+ RELFILENODE_GETRELNODE(rnode),
forknum, blkno);
}
}
@@ -5396,6 +5397,7 @@ BootStrapXLOG(void)
checkPoint.nextXid =
FullTransactionIdFromEpochAndXid(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstGenbkiObjectId;
+ checkPoint.nextRelNode = FirstNormalRelfileNode;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
checkPoint.oldestXid = FirstNormalTransactionId;
@@ -5409,7 +5411,9 @@ BootStrapXLOG(void)
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelNode = checkPoint.nextRelNode;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->relnodecount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -7147,7 +7151,9 @@ StartupXLOG(void)
/* initialize shared memory variables from the checkpoint record */
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelNode = checkPoint.nextRelNode;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->relnodecount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -9259,6 +9265,12 @@ CreateCheckPoint(int flags)
checkPoint.nextOid += ShmemVariableCache->oidCount;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelNodeGenLock, LW_SHARED);
+ checkPoint.nextRelNode = ShmemVariableCache->nextRelNode;
+ if (!shutdown)
+ checkPoint.nextRelNode += ShmemVariableCache->relnodecount;
+ LWLockRelease(RelNodeGenLock);
+
MultiXactGetCheckptMulti(shutdown,
&checkPoint.nextMulti,
&checkPoint.nextMultiOffset,
@@ -9405,11 +9417,6 @@ CreateCheckPoint(int flags)
END_CRIT_SECTION();
/*
- * Let smgr do post-checkpoint cleanup (eg, deleting old files).
- */
- SyncPostCheckpoint();
-
- /*
* Update the average distance between checkpoints if the prior checkpoint
* exists.
*/
@@ -10070,6 +10077,18 @@ XLogPutNextOid(Oid nextOid)
}
/*
+ * Similar to XLogPutNextOid(), but writes a NEXT_RELFILENODE log record
+ * instead of a NEXTOID log record.
+ */
+void
+XLogPutNextRelFileNode(RelNode nextrelnode)
+{
+ XLogBeginInsert();
+ XLogRegisterData((char *) (&nextrelnode), sizeof(RelNode));
+ (void) XLogInsert(RM_XLOG_ID, XLOG_NEXT_RELFILENODE);
+}
+
+/*
* Write an XLOG SWITCH record.
*
* Here we just blindly issue an XLogInsert request for the record.
@@ -10331,6 +10350,16 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
}
+ else if (info == XLOG_NEXT_RELFILENODE)
+ {
+ RelNode nextRelNode;
+
+ memcpy(&nextRelNode, XLogRecGetData(record), sizeof(RelNode));
+ LWLockAcquire(RelNodeGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelNode = nextRelNode;
+ ShmemVariableCache->relnodecount = 0;
+ LWLockRelease(RelNodeGenLock);
+ }
else if (info == XLOG_CHECKPOINT_SHUTDOWN)
{
CheckPoint checkPoint;
@@ -10344,6 +10373,10 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelNodeGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelNode = checkPoint.nextRelNode;
+ ShmemVariableCache->relnodecount = 0;
+ LWLockRelease(RelNodeGenLock);
MultiXactSetNextMXact(checkPoint.nextMulti,
checkPoint.nextMultiOffset);
@@ -10713,15 +10746,17 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blk);
if (forknum != MAIN_FORKNUM)
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, fork %u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", fork %u, blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rnode.spcNode, rnode.dbNode,
+ RELFILENODE_GETRELNODE(rnode),
forknum,
blk);
else
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rnode.spcNode, rnode.dbNode,
+ RELFILENODE_GETRELNODE(rnode),
blk);
if (XLogRecHasBlockImage(record, block_id))
appendStringInfoString(buf, " FPW");
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 90e1c483..0c4d8e2 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -593,17 +593,18 @@ CreateFakeRelcacheEntry(RelFileNode rnode)
rel->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
/* We don't know the name of the relation; use relfilenode instead */
- sprintf(RelationGetRelationName(rel), "%u", rnode.relNode);
+ sprintf(RelationGetRelationName(rel), INT64_FORMAT,
+ RELFILENODE_GETRELNODE(rnode));
/*
* We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNode may be
- * different from the relation's OID. It shouldn't really matter though.
+ * relation.  Note that we set relId to FirstNormalObjectId, which is
+ * completely bogus.  It shouldn't really matter though.
* In recovery, we are running by ourselves and can't have any lock
* conflicts. While syncing, we already hold AccessExclusiveLock.
*/
rel->rd_lockInfo.lockRelId.dbId = rnode.dbNode;
- rel->rd_lockInfo.lockRelId.relId = rnode.relNode;
+ rel->rd_lockInfo.lockRelId.relId = FirstNormalObjectId;
rel->rd_smgr = NULL;
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index dfd5fb6..5afbd07 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -472,27 +472,18 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
/*
* GetNewRelFileNode
- * Generate a new relfilenode number that is unique within the
- * database of the given tablespace.
+ * Generate a new relfilenode number.
*
- * If the relfilenode will also be used as the relation's OID, pass the
- * opened pg_class catalog, and this routine will guarantee that the result
- * is also an unused OID within pg_class. If the result is to be used only
- * as a relfilenode for an existing relation, pass NULL for pg_class.
- *
- * As with GetNewOidWithIndex(), there is some theoretical risk of a race
- * condition, but it doesn't seem worth worrying about.
- *
- * Note: we don't support using this in bootstrap mode. All relations
- * created by bootstrap have preassigned OIDs, so there's no need.
+ * We use 56 bits for the relfilenode, so we expect it to be unique across
+ * the cluster; if a file with that name already exists, report an error.
*/
-Oid
-GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
+RelNode
+GetNewRelFileNode(Oid reltablespace, char relpersistence)
{
RelFileNodeBackend rnode;
char *rpath;
- bool collides;
BackendId backend;
+ RelNode relNode;
/*
* If we ever get here during pg_upgrade, there's something wrong; all
@@ -525,42 +516,16 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
* are properly detected.
*/
rnode.backend = backend;
+ relNode = GetNewRelNode();
+ RELFILENODE_SETRELNODE(rnode.node, relNode);
- do
- {
- CHECK_FOR_INTERRUPTS();
-
- /* Generate the OID */
- if (pg_class)
- rnode.node.relNode = GetNewOidWithIndex(pg_class, ClassOidIndexId,
- Anum_pg_class_oid);
- else
- rnode.node.relNode = GetNewObjectId();
-
- /* Check for existing file of same name */
- rpath = relpath(rnode, MAIN_FORKNUM);
+ /* Check for existing file of same name */
+ rpath = relpath(rnode, MAIN_FORKNUM);
- if (access(rpath, F_OK) == 0)
- {
- /* definite collision */
- collides = true;
- }
- else
- {
- /*
- * Here we have a little bit of a dilemma: if errno is something
- * other than ENOENT, should we declare a collision and loop? In
- * practice it seems best to go ahead regardless of the errno. If
- * there is a colliding file we will get an smgr failure when we
- * attempt to create the new relation file.
- */
- collides = false;
- }
-
- pfree(rpath);
- } while (collides);
+ if (access(rpath, F_OK) == 0)
+ elog(ERROR, "new relfilenode file already exists: \"%s\"", rpath);
- return rnode.node.relNode;
+ return relNode;
}
/*
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 7e99de8..4976df0 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -93,7 +93,7 @@
Oid binary_upgrade_next_heap_pg_class_oid = InvalidOid;
Oid binary_upgrade_next_heap_pg_class_relfilenode = InvalidOid;
Oid binary_upgrade_next_toast_pg_class_oid = InvalidOid;
-Oid binary_upgrade_next_toast_pg_class_relfilenode = InvalidOid;
+RelNode binary_upgrade_next_toast_pg_class_relfilenode = InvalidRelfileNode;
static void AddNewRelationTuple(Relation pg_class_desc,
Relation new_rel_desc,
@@ -303,7 +303,7 @@ heap_create(const char *relname,
Oid relnamespace,
Oid reltablespace,
Oid relid,
- Oid relfilenode,
+ RelNode relfilenode,
Oid accessmtd,
TupleDesc tupDesc,
char relkind,
@@ -358,8 +358,8 @@ heap_create(const char *relname,
- * If relfilenode is unspecified by the caller then create storage
- * with oid same as relid.
+ * If relfilenode is unspecified by the caller then allocate a new
+ * one here.
*/
- if (!OidIsValid(relfilenode))
- relfilenode = relid;
+ if (!RelfileNodeIsValid(relfilenode))
+ relfilenode = GetNewRelFileNode(reltablespace, relpersistence);
}
/*
@@ -912,7 +912,7 @@ InsertPgClassTuple(Relation pg_class_desc,
values[Anum_pg_class_reloftype - 1] = ObjectIdGetDatum(rd_rel->reloftype);
values[Anum_pg_class_relowner - 1] = ObjectIdGetDatum(rd_rel->relowner);
values[Anum_pg_class_relam - 1] = ObjectIdGetDatum(rd_rel->relam);
- values[Anum_pg_class_relfilenode - 1] = ObjectIdGetDatum(rd_rel->relfilenode);
+ values[Anum_pg_class_relfilenode - 1] = Int64GetDatum(rd_rel->relfilenode);
values[Anum_pg_class_reltablespace - 1] = ObjectIdGetDatum(rd_rel->reltablespace);
values[Anum_pg_class_relpages - 1] = Int32GetDatum(rd_rel->relpages);
values[Anum_pg_class_reltuples - 1] = Float4GetDatum(rd_rel->reltuples);
@@ -1129,7 +1129,7 @@ heap_create_with_catalog(const char *relname,
Oid new_type_oid;
/* By default set to InvalidOid unless overridden by binary-upgrade */
- Oid relfilenode = InvalidOid;
+ RelNode relfilenode = InvalidRelfileNode;
TransactionId relfrozenxid;
MultiXactId relminmxid;
@@ -1187,8 +1187,7 @@ heap_create_with_catalog(const char *relname,
/*
* Allocate an OID for the relation, unless we were told what to use.
*
- * The OID will be the relfilenode as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
+ * Make sure that the Oid doesn't collide with existing pg_class OIDs.
*/
if (!OidIsValid(relid))
{
@@ -1210,13 +1209,13 @@ heap_create_with_catalog(const char *relname,
relid = binary_upgrade_next_toast_pg_class_oid;
binary_upgrade_next_toast_pg_class_oid = InvalidOid;
- if (!OidIsValid(binary_upgrade_next_toast_pg_class_relfilenode))
+ if (!RelfileNodeIsValid(binary_upgrade_next_toast_pg_class_relfilenode))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("toast relfilenode value not set when in binary upgrade mode")));
relfilenode = binary_upgrade_next_toast_pg_class_relfilenode;
- binary_upgrade_next_toast_pg_class_relfilenode = InvalidOid;
+ binary_upgrade_next_toast_pg_class_relfilenode = InvalidRelfileNode;
}
}
else
@@ -1243,8 +1242,8 @@ heap_create_with_catalog(const char *relname,
}
if (!OidIsValid(relid))
- relid = GetNewRelFileNode(reltablespace, pg_class_desc,
- relpersistence);
+ relid = GetNewOidWithIndex(pg_class_desc, ClassOidIndexId,
+ Anum_pg_class_oid);
}
/*
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 2308d40..6e43237 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -661,7 +661,7 @@ UpdateIndexRelation(Oid indexoid,
* parent index; otherwise InvalidOid.
* parentConstraintId: if creating a constraint on a partition, the OID
* of the constraint in the parent; otherwise InvalidOid.
- * relFileNode: normally, pass InvalidOid to get new storage. May be
+ * relFileNode: normally, pass InvalidRelfileNode to get new storage. May be
* nonzero to attach an existing valid build.
* indexInfo: same info executor uses to insert into the index
* indexColNames: column names to use for index (List of char *)
@@ -702,7 +702,7 @@ index_create(Relation heapRelation,
Oid indexRelationId,
Oid parentIndexRelid,
Oid parentConstraintId,
- Oid relFileNode,
+ RelNode relFileNode,
IndexInfo *indexInfo,
List *indexColNames,
Oid accessMethodObjectId,
@@ -734,7 +734,7 @@ index_create(Relation heapRelation,
char relkind;
TransactionId relfrozenxid;
MultiXactId relminmxid;
- bool create_storage = !OidIsValid(relFileNode);
+ bool create_storage = !RelfileNodeIsValid(relFileNode);
/* constraint flags can only be set when a constraint is requested */
Assert((constr_flags == 0) ||
@@ -901,8 +901,7 @@ index_create(Relation heapRelation,
/*
* Allocate an OID for the index, unless we were told what to use.
*
- * The OID will be the relfilenode as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
+ * Make sure it doesn't collide with existing pg_class OIDs.
*/
if (!OidIsValid(indexRelationId))
{
@@ -935,8 +934,8 @@ index_create(Relation heapRelation,
}
else
{
- indexRelationId =
- GetNewRelFileNode(tableSpaceId, pg_class, relpersistence);
+ indexRelationId = GetNewOidWithIndex(pg_class, ClassOidIndexId,
+ Anum_pg_class_oid);
}
}
@@ -1406,7 +1405,7 @@ index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId,
InvalidOid, /* indexRelationId */
InvalidOid, /* parentIndexRelid */
InvalidOid, /* parentConstraintId */
- InvalidOid, /* relFileNode */
+ InvalidRelfileNode, /* relFileNode */
newInfo,
indexColNames,
indexRelation->rd_rel->relam,
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index 9b80755..712e995 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -593,7 +593,8 @@ RestorePendingSyncs(char *startAddress)
RelFileNode *rnode;
Assert(pendingSyncHash == NULL);
- for (rnode = (RelFileNode *) startAddress; rnode->relNode != 0; rnode++)
+ for (rnode = (RelFileNode *) startAddress;
+ RELFILENODE_GETRELNODE(*rnode) != 0; rnode++)
AddPendingSync(rnode);
}
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 2e8efe4..2423003 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -1006,9 +1006,9 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
reltup2;
Form_pg_class relform1,
relform2;
- Oid relfilenode1,
+ RelNode relfilenode1,
relfilenode2;
- Oid swaptemp;
+ RelNode swaptemp;
char swptmpchr;
/* We need writable copies of both pg_class tuples. */
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 42aacc8..a1c9c24 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -1085,7 +1085,7 @@ DefineIndex(Oid relationId,
* A valid stmt->oldNode implies that we already have a built form of the
* index. The caller should also decline any index build.
*/
- Assert(!OidIsValid(stmt->oldNode) || (skip_build && !concurrent));
+ Assert(!RelfileNodeIsValid(stmt->oldNode) || (skip_build && !concurrent));
/*
* Make the catalog entries for the index, including constraints. This
@@ -1315,7 +1315,7 @@ DefineIndex(Oid relationId,
childStmt->idxname = NULL;
childStmt->relation = NULL;
childStmt->indexOid = InvalidOid;
- childStmt->oldNode = InvalidOid;
+ childStmt->oldNode = InvalidRelfileNode;
childStmt->oldCreateSubid = InvalidSubTransactionId;
childStmt->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
@@ -2896,7 +2896,7 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
* particular this eliminates all shared catalogs.).
*/
if (RELKIND_HAS_STORAGE(classtuple->relkind) &&
- !OidIsValid(classtuple->relfilenode))
+ !RelfileNodeIsValid(classtuple->relfilenode))
skip_rel = true;
/*
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index 27cb630..72137f6 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -74,7 +74,7 @@ typedef struct sequence_magic
typedef struct SeqTableData
{
Oid relid; /* pg_class OID of this sequence (hash key) */
- Oid filenode; /* last seen relfilenode of this sequence */
+ RelNode filenode; /* last seen relfilenode of this sequence */
LocalTransactionId lxid; /* xact in which we last did a seq op */
bool last_valid; /* do we have a valid "last" value? */
int64 last; /* value last returned by nextval */
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 1f0654c..36987f3 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -3304,7 +3304,7 @@ CheckRelationTableSpaceMove(Relation rel, Oid newTableSpaceId)
void
SetRelationTableSpace(Relation rel,
Oid newTableSpaceId,
- Oid newRelFileNode)
+ RelNode newRelFileNode)
{
Relation pg_class;
HeapTuple tuple;
@@ -3324,7 +3324,7 @@ SetRelationTableSpace(Relation rel,
/* Update the pg_class row. */
rd_rel->reltablespace = (newTableSpaceId == MyDatabaseTableSpace) ?
InvalidOid : newTableSpaceId;
- if (OidIsValid(newRelFileNode))
+ if (newRelFileNode != InvalidRelfileNode)
rd_rel->relfilenode = newRelFileNode;
CatalogTupleUpdate(pg_class, &tuple->t_self, tuple);
@@ -8572,7 +8572,7 @@ ATExecAddIndex(AlteredTableInfo *tab, Relation rel,
/* suppress schema rights check when rebuilding existing index */
check_rights = !is_rebuild;
/* skip index build if phase 3 will do it or we're reusing an old one */
- skip_build = tab->rewrite > 0 || OidIsValid(stmt->oldNode);
+ skip_build = tab->rewrite > 0 || RelfileNodeIsValid(stmt->oldNode);
/* suppress notices when rebuilding existing index */
quiet = is_rebuild;
@@ -8596,7 +8596,7 @@ ATExecAddIndex(AlteredTableInfo *tab, Relation rel,
* DROP of the old edition of this index will have scheduled the storage
* for deletion at commit, so cancel that pending deletion.
*/
- if (OidIsValid(stmt->oldNode))
+ if (RelfileNodeIsValid(stmt->oldNode))
{
Relation irel = index_open(address.objectId, NoLock);
@@ -13441,7 +13441,7 @@ TryReuseIndex(Oid oldId, IndexStmt *stmt)
/* If it's a partitioned index, there is no storage to share. */
if (irel->rd_rel->relkind != RELKIND_PARTITIONED_INDEX)
{
- stmt->oldNode = irel->rd_node.relNode;
+ stmt->oldNode = RELFILENODE_GETRELNODE(irel->rd_node);
stmt->oldCreateSubid = irel->rd_createSubid;
stmt->oldFirstRelfilenodeSubid = irel->rd_firstRelfilenodeSubid;
}
@@ -14290,7 +14290,7 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
{
Relation rel;
Oid reltoastrelid;
- Oid newrelfilenode;
+ RelNode newrelfilenode;
RelFileNode newrnode;
List *reltoastidxids = NIL;
ListCell *lc;
@@ -14320,15 +14320,18 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
}
/*
- * Relfilenodes are not unique in databases across tablespaces, so we need
- * to allocate a new one in the new tablespace.
+ * Generate a new relfilenode.  Although relfilenodes are unique within a
+ * cluster, we cannot reuse the old one here, because dropped relation
+ * files are not unlinked until commit.  If we moved the relation back to
+ * its old tablespace within the same transaction, we would get a
+ * conflicting relfilenode file.
*/
- newrelfilenode = GetNewRelFileNode(newTableSpace, NULL,
+ newrelfilenode = GetNewRelFileNode(newTableSpace,
rel->rd_rel->relpersistence);
/* Open old and new relation */
newrnode = rel->rd_node;
- newrnode.relNode = newrelfilenode;
+ RELFILENODE_SETRELNODE(newrnode, newrelfilenode);
newrnode.spcNode = newTableSpace;
/* hand off to AM to actually create the new filenode and copy the data */
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 2b02369..9b64842 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2771,7 +2771,7 @@ _outIndexStmt(StringInfo str, const IndexStmt *node)
WRITE_NODE_FIELD(excludeOpNames);
WRITE_STRING_FIELD(idxcomment);
WRITE_OID_FIELD(indexOid);
- WRITE_OID_FIELD(oldNode);
+ WRITE_UINT64_FIELD(oldNode);
WRITE_UINT_FIELD(oldCreateSubid);
WRITE_UINT_FIELD(oldFirstRelfilenodeSubid);
WRITE_BOOL_FIELD(unique);
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index 0eea214..209eabf 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -1577,7 +1577,7 @@ generateClonedIndexStmt(RangeVar *heapRel, Relation source_idx,
index->excludeOpNames = NIL;
index->idxcomment = NULL;
index->indexOid = InvalidOid;
- index->oldNode = InvalidOid;
+ index->oldNode = InvalidRelfileNode;
index->oldCreateSubid = InvalidSubTransactionId;
index->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
index->unique = idxrec->indisunique;
@@ -2197,7 +2197,7 @@ transformIndexConstraint(Constraint *constraint, CreateStmtContext *cxt)
index->excludeOpNames = NIL;
index->idxcomment = NULL;
index->indexOid = InvalidOid;
- index->oldNode = InvalidOid;
+ index->oldNode = InvalidRelfileNode;
index->oldCreateSubid = InvalidSubTransactionId;
index->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
index->transformed = false;
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 3fb5a92..23822c1 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -154,6 +154,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
break;
case XLOG_NOOP:
case XLOG_NEXTOID:
+ case XLOG_NEXT_RELFILENODE:
case XLOG_SWITCH:
case XLOG_BACKUP_END:
case XLOG_PARAMETER_CHANGE:
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 19b2ba2..143d403 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -2134,7 +2134,7 @@ ReorderBufferProcessTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
Assert(snapshot_now);
reloid = RelidByRelfilenode(change->data.tp.relnode.spcNode,
- change->data.tp.relnode.relNode);
+ RELFILENODE_GETRELNODE(change->data.tp.relnode));
/*
* Mapped catalog tuple without data, emitted while
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index a2512e7..ada326b 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -818,7 +818,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
TRACE_POSTGRESQL_BUFFER_READ_START(forkNum, blockNum,
smgr->smgr_rnode.node.spcNode,
smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode,
+ RELFILENODE_GETRELNODE(smgr->smgr_rnode.node),
smgr->smgr_rnode.backend,
isExtend);
@@ -880,7 +880,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
smgr->smgr_rnode.node.spcNode,
smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode,
+ RELFILENODE_GETRELNODE(smgr->smgr_rnode.node),
smgr->smgr_rnode.backend,
isExtend,
found);
@@ -1070,7 +1070,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
smgr->smgr_rnode.node.spcNode,
smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode,
+ RELFILENODE_GETRELNODE(smgr->smgr_rnode.node),
smgr->smgr_rnode.backend,
isExtend,
found);
@@ -1249,7 +1249,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
TRACE_POSTGRESQL_BUFFER_WRITE_DIRTY_START(forkNum, blockNum,
smgr->smgr_rnode.node.spcNode,
smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode);
+ RELFILENODE_GETRELNODE(smgr->smgr_rnode.node));
FlushBuffer(buf, NULL);
LWLockRelease(BufferDescriptorGetContentLock(buf));
@@ -1260,7 +1260,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
TRACE_POSTGRESQL_BUFFER_WRITE_DIRTY_DONE(forkNum, blockNum,
smgr->smgr_rnode.node.spcNode,
smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode);
+ RELFILENODE_GETRELNODE(smgr->smgr_rnode.node));
}
else
{
@@ -1640,7 +1640,7 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
- bufHdr->tag.forkNum == forkNum)
+ RELFILENODE_GETFORKNUM(bufHdr->tag.rnode) == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
LocalRefCount[-buffer - 1]--;
@@ -1651,7 +1651,7 @@ ReleaseAndReadBuffer(Buffer buffer,
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
- bufHdr->tag.forkNum == forkNum)
+ RELFILENODE_GETFORKNUM(bufHdr->tag.rnode) == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
}
@@ -1993,8 +1993,8 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
item->tsId = bufHdr->tag.rnode.spcNode;
- item->relNode = bufHdr->tag.rnode.relNode;
- item->forkNum = bufHdr->tag.forkNum;
+ item->relNode = RELFILENODE_GETRELNODE(bufHdr->tag.rnode);
+ item->forkNum = RELFILENODE_GETFORKNUM(bufHdr->tag.rnode);
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2701,7 +2701,8 @@ PrintBufferLeakWarning(Buffer buffer)
}
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rnode, backend, buf->tag.forkNum);
+ path = relpathbackend(buf->tag.rnode, backend,
+ RELFILENODE_GETFORKNUM(buf->tag.rnode));
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2781,7 +2782,14 @@ BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
/* pinned, so OK to read tag without spinlock */
*rnode = bufHdr->tag.rnode;
- *forknum = bufHdr->tag.forkNum;
+
+ /*
+ * Clear the fork number from the output rnode->relNode.  For more details,
+ * see the comments atop RelFileNode.
+ */
+ RELFILENODE_SETRELNODE(*rnode, RELFILENODE_GETRELNODE(*rnode));
+
+ *forknum = RELFILENODE_GETFORKNUM(bufHdr->tag.rnode);
*blknum = bufHdr->tag.blockNum;
}
@@ -2833,11 +2841,11 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
if (reln == NULL)
reln = smgropen(buf->tag.rnode, InvalidBackendId);
- TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
+ TRACE_POSTGRESQL_BUFFER_FLUSH_START(RELFILENODE_GETFORKNUM(buf->tag.rnode),
buf->tag.blockNum,
reln->smgr_rnode.node.spcNode,
reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode);
+ RELFILENODE_GETRELNODE(reln->smgr_rnode.node));
buf_state = LockBufHdr(buf);
@@ -2892,7 +2900,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
* bufToWrite is either the shared buffer or a copy, as appropriate.
*/
smgrwrite(reln,
- buf->tag.forkNum,
+ RELFILENODE_GETFORKNUM(buf->tag.rnode),
buf->tag.blockNum,
bufToWrite,
false);
@@ -2913,11 +2921,11 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
*/
TerminateBufferIO(buf, true, 0);
- TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(buf->tag.forkNum,
+ TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(RELFILENODE_GETFORKNUM(buf->tag.rnode),
buf->tag.blockNum,
reln->smgr_rnode.node.spcNode,
reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode);
+ RELFILENODE_GETRELNODE(reln->smgr_rnode.node));
/* Pop the error context stack */
error_context_stack = errcallback.previous;
@@ -3142,7 +3150,7 @@ DropRelFileNodeBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
for (j = 0; j < nforks; j++)
{
if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) &&
- bufHdr->tag.forkNum == forkNum[j] &&
+ RELFILENODE_GETFORKNUM(bufHdr->tag.rnode) == forkNum[j] &&
bufHdr->tag.blockNum >= firstDelBlock[j])
{
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -3374,7 +3382,7 @@ FindAndDropRelFileNodeBuffers(RelFileNode rnode, ForkNumber forkNum,
buf_state = LockBufHdr(bufHdr);
if (RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
- bufHdr->tag.forkNum == forkNum &&
+ RELFILENODE_GETFORKNUM(bufHdr->tag.rnode) == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
@@ -3528,7 +3536,7 @@ FlushRelationBuffers(Relation rel)
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
smgrwrite(RelationGetSmgr(rel),
- bufHdr->tag.forkNum,
+ RELFILENODE_GETFORKNUM(bufHdr->tag.rnode),
bufHdr->tag.blockNum,
localpage,
false);
@@ -4491,7 +4499,8 @@ AbortBufferIO(void)
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
- path = relpathperm(buf->tag.rnode, buf->tag.forkNum);
+ path = relpathperm(buf->tag.rnode,
+ RELFILENODE_GETFORKNUM(buf->tag.rnode));
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4515,7 +4524,8 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rnode, bufHdr->tag.forkNum);
+ char *path = relpathperm(bufHdr->tag.rnode,
+ RELFILENODE_GETFORKNUM(bufHdr->tag.rnode));
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4534,7 +4544,7 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
char *path = relpathbackend(bufHdr->tag.rnode, MyBackendId,
- bufHdr->tag.forkNum);
+ RELFILENODE_GETFORKNUM(bufHdr->tag.rnode));
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4551,9 +4561,9 @@ rnode_comparator(const void *p1, const void *p2)
RelFileNode n1 = *(const RelFileNode *) p1;
RelFileNode n2 = *(const RelFileNode *) p2;
- if (n1.relNode < n2.relNode)
+ if (RELFILENODE_GETRELNODE(n1) < RELFILENODE_GETRELNODE(n2))
return -1;
- else if (n1.relNode > n2.relNode)
+ else if (RELFILENODE_GETRELNODE(n1) > RELFILENODE_GETRELNODE(n2))
return 1;
if (n1.dbNode < n2.dbNode)
@@ -4634,9 +4644,9 @@ buffertag_comparator(const BufferTag *ba, const BufferTag *bb)
if (ret != 0)
return ret;
- if (ba->forkNum < bb->forkNum)
+ if (RELFILENODE_GETFORKNUM(ba->rnode) < RELFILENODE_GETFORKNUM(bb->rnode))
return -1;
- if (ba->forkNum > bb->forkNum)
+ if (RELFILENODE_GETFORKNUM(ba->rnode) > RELFILENODE_GETFORKNUM(bb->rnode))
return 1;
if (ba->blockNum < bb->blockNum)
@@ -4801,7 +4811,8 @@ IssuePendingWritebacks(WritebackContext *context)
/* different file, stop */
if (!RelFileNodeEquals(cur->tag.rnode, next->tag.rnode) ||
- cur->tag.forkNum != next->tag.forkNum)
+ RELFILENODE_GETFORKNUM(cur->tag.rnode) !=
+ RELFILENODE_GETFORKNUM(next->tag.rnode))
break;
/* ok, block queued twice, skip */
@@ -4820,7 +4831,8 @@ IssuePendingWritebacks(WritebackContext *context)
/* and finally tell the kernel to write the data to storage */
reln = smgropen(tag.rnode, InvalidBackendId);
- smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
+ smgrwriteback(reln, RELFILENODE_GETFORKNUM(tag.rnode), tag.blockNum,
+ nblocks);
}
context->nr_pending = 0;
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index e71f95a..2892733 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -221,7 +221,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
/* And write... */
smgrwrite(oreln,
- bufHdr->tag.forkNum,
+ RELFILENODE_GETFORKNUM(bufHdr->tag.rnode),
bufHdr->tag.blockNum,
localpage,
false);
@@ -338,14 +338,14 @@ DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
if ((buf_state & BM_TAG_VALID) &&
RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
- bufHdr->tag.forkNum == forkNum &&
+ RELFILENODE_GETFORKNUM(bufHdr->tag.rnode) == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
relpathbackend(bufHdr->tag.rnode, MyBackendId,
- bufHdr->tag.forkNum),
+ RELFILENODE_GETFORKNUM(bufHdr->tag.rnode)),
LocalRefCount[i]);
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
@@ -389,7 +389,7 @@ DropRelFileNodeAllLocalBuffers(RelFileNode rnode)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
relpathbackend(bufHdr->tag.rnode, MyBackendId,
- bufHdr->tag.forkNum),
+ RELFILENODE_GETFORKNUM(bufHdr->tag.rnode)),
LocalRefCount[i]);
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index d165b35..41942f5 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -273,8 +273,8 @@ restart:
BlockNumber blknum;
BufferGetTag(buf, &rnode, &forknum, &blknum);
- elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
- blknum, rnode.spcNode, rnode.dbNode, rnode.relNode);
+ elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/" INT64_FORMAT,
+ blknum, rnode.spcNode, rnode.dbNode, RELFILENODE_GETRELNODE(rnode));
/* make sure we hold an exclusive lock */
if (!exclusive_lock_held)
diff --git a/src/backend/storage/lmgr/lwlocknames.txt b/src/backend/storage/lmgr/lwlocknames.txt
index 6c7cf6c..1eb6d78 100644
--- a/src/backend/storage/lmgr/lwlocknames.txt
+++ b/src/backend/storage/lmgr/lwlocknames.txt
@@ -53,3 +53,4 @@ XactTruncationLock 44
# 45 was XactTruncationLock until removal of BackendRandomLock
WrapLimitsVacuumLock 46
NotifyQueueTailLock 47
+RelNodeGenLock 48
\ No newline at end of file
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index d26c915..8e2c60f 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -124,8 +124,6 @@ static void mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum,
static MdfdVec *mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior);
static void register_dirty_segment(SMgrRelation reln, ForkNumber forknum,
MdfdVec *seg);
-static void register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
- BlockNumber segno);
static void register_forget_request(RelFileNodeBackend rnode, ForkNumber forknum,
BlockNumber segno);
static void _fdvec_resize(SMgrRelation reln,
@@ -321,36 +319,25 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
/*
* Delete or truncate the first segment.
*/
- if (isRedo || forkNum != MAIN_FORKNUM || RelFileNodeBackendIsTemp(rnode))
+ if (!RelFileNodeBackendIsTemp(rnode))
{
- if (!RelFileNodeBackendIsTemp(rnode))
- {
- /* Prevent other backends' fds from holding on to the disk space */
- ret = do_truncate(path);
-
- /* Forget any pending sync requests for the first segment */
- register_forget_request(rnode, forkNum, 0 /* first seg */ );
- }
- else
- ret = 0;
+ /* Prevent other backends' fds from holding on to the disk space */
+ ret = do_truncate(path);
- /* Next unlink the file, unless it was already found to be missing */
- if (ret == 0 || errno != ENOENT)
- {
- ret = unlink(path);
- if (ret < 0 && errno != ENOENT)
- ereport(WARNING,
- (errcode_for_file_access(),
- errmsg("could not remove file \"%s\": %m", path)));
- }
+ /* Forget any pending sync requests for the first segment */
+ register_forget_request(rnode, forkNum, 0 /* first seg */ );
}
else
- {
- /* Prevent other backends' fds from holding on to the disk space */
- ret = do_truncate(path);
+ ret = 0;
- /* Register request to unlink first segment later */
- register_unlink_segment(rnode, forkNum, 0 /* first seg */ );
+ /* Next unlink the file, unless it was already found to be missing */
+ if (ret == 0 || errno != ENOENT)
+ {
+ ret = unlink(path);
+ if (ret < 0 && errno != ENOENT)
+ ereport(WARNING,
+ (errcode_for_file_access(),
+ errmsg("could not remove file \"%s\": %m", path)));
}
/*
@@ -640,7 +627,7 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
TRACE_POSTGRESQL_SMGR_MD_READ_START(forknum, blocknum,
reln->smgr_rnode.node.spcNode,
reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ RELFILENODE_GETRELNODE(reln->smgr_rnode.node),
reln->smgr_rnode.backend);
v = _mdfd_getseg(reln, forknum, blocknum, false,
@@ -655,7 +642,7 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
TRACE_POSTGRESQL_SMGR_MD_READ_DONE(forknum, blocknum,
reln->smgr_rnode.node.spcNode,
reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ RELFILENODE_GETRELNODE(reln->smgr_rnode.node),
reln->smgr_rnode.backend,
nbytes,
BLCKSZ);
@@ -710,7 +697,7 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
TRACE_POSTGRESQL_SMGR_MD_WRITE_START(forknum, blocknum,
reln->smgr_rnode.node.spcNode,
reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ RELFILENODE_GETRELNODE(reln->smgr_rnode.node),
reln->smgr_rnode.backend);
v = _mdfd_getseg(reln, forknum, blocknum, skipFsync,
@@ -725,7 +712,7 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
TRACE_POSTGRESQL_SMGR_MD_WRITE_DONE(forknum, blocknum,
reln->smgr_rnode.node.spcNode,
reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ RELFILENODE_GETRELNODE(reln->smgr_rnode.node),
reln->smgr_rnode.backend,
nbytes,
BLCKSZ);
@@ -995,23 +982,6 @@ register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
}
/*
- * register_unlink_segment() -- Schedule a file to be deleted after next checkpoint
- */
-static void
-register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
- BlockNumber segno)
-{
- FileTag tag;
-
- INIT_MD_FILETAG(tag, rnode.node, forknum, segno);
-
- /* Should never be used with temp relations */
- Assert(!RelFileNodeBackendIsTemp(rnode));
-
- RegisterSyncRequest(&tag, SYNC_UNLINK_REQUEST, true /* retryOnError */ );
-}
-
-/*
* register_forget_request() -- forget any fsyncs for a relation fork's segment
*/
static void
@@ -1036,7 +1006,7 @@ ForgetDatabaseSyncRequests(Oid dbid)
rnode.dbNode = dbid;
rnode.spcNode = 0;
- rnode.relNode = 0;
+ RELFILENODE_SETRELNODE(rnode, 0);
INIT_MD_FILETAG(tag, rnode, InvalidForkNumber, InvalidBlockNumber);
diff --git a/src/backend/storage/sync/sync.c b/src/backend/storage/sync/sync.c
index 543f691..46a1242 100644
--- a/src/backend/storage/sync/sync.c
+++ b/src/backend/storage/sync/sync.c
@@ -188,92 +188,6 @@ SyncPreCheckpoint(void)
}
/*
- * SyncPostCheckpoint() -- Do post-checkpoint work
- *
- * Remove any lingering files that can now be safely removed.
- */
-void
-SyncPostCheckpoint(void)
-{
- int absorb_counter;
- ListCell *lc;
-
- absorb_counter = UNLINKS_PER_ABSORB;
- foreach(lc, pendingUnlinks)
- {
- PendingUnlinkEntry *entry = (PendingUnlinkEntry *) lfirst(lc);
- char path[MAXPGPATH];
-
- /* Skip over any canceled entries */
- if (entry->canceled)
- continue;
-
- /*
- * New entries are appended to the end, so if the entry is new we've
- * reached the end of old entries.
- *
- * Note: if just the right number of consecutive checkpoints fail, we
- * could be fooled here by cycle_ctr wraparound. However, the only
- * consequence is that we'd delay unlinking for one more checkpoint,
- * which is perfectly tolerable.
- */
- if (entry->cycle_ctr == checkpoint_cycle_ctr)
- break;
-
- /* Unlink the file */
- if (syncsw[entry->tag.handler].sync_unlinkfiletag(&entry->tag,
- path) < 0)
- {
- /*
- * There's a race condition, when the database is dropped at the
- * same time that we process the pending unlink requests. If the
- * DROP DATABASE deletes the file before we do, we will get ENOENT
- * here. rmtree() also has to ignore ENOENT errors, to deal with
- * the possibility that we delete the file first.
- */
- if (errno != ENOENT)
- ereport(WARNING,
- (errcode_for_file_access(),
- errmsg("could not remove file \"%s\": %m", path)));
- }
-
- /* Mark the list entry as canceled, just in case */
- entry->canceled = true;
-
- /*
- * As in ProcessSyncRequests, we don't want to stop absorbing fsync
- * requests for a long time when there are many deletions to be done.
- * We can safely call AbsorbSyncRequests() at this point in the loop.
- */
- if (--absorb_counter <= 0)
- {
- AbsorbSyncRequests();
- absorb_counter = UNLINKS_PER_ABSORB;
- }
- }
-
- /*
- * If we reached the end of the list, we can just remove the whole list
- * (remembering to pfree all the PendingUnlinkEntry objects). Otherwise,
- * we must keep the entries at or after "lc".
- */
- if (lc == NULL)
- {
- list_free_deep(pendingUnlinks);
- pendingUnlinks = NIL;
- }
- else
- {
- int ntodelete = list_cell_number(pendingUnlinks, lc);
-
- for (int i = 0; i < ntodelete; i++)
- pfree(list_nth(pendingUnlinks, i));
-
- pendingUnlinks = list_delete_first_n(pendingUnlinks, ntodelete);
- }
-}
-
-/*
* ProcessSyncRequests() -- Process queued fsync requests.
*/
void
@@ -519,21 +433,6 @@ RememberSyncRequest(const FileTag *ftag, SyncRequestType type)
entry->canceled = true;
}
}
- else if (type == SYNC_UNLINK_REQUEST)
- {
- /* Unlink request: put it in the linked list */
- MemoryContext oldcxt = MemoryContextSwitchTo(pendingOpsCxt);
- PendingUnlinkEntry *entry;
-
- entry = palloc(sizeof(PendingUnlinkEntry));
- entry->tag = *ftag;
- entry->cycle_ctr = checkpoint_cycle_ctr;
- entry->canceled = false;
-
- pendingUnlinks = lappend(pendingUnlinks, entry);
-
- MemoryContextSwitchTo(oldcxt);
- }
else
{
/* Normal case: enter a request to fsync this segment */
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 3a2f2e1..03fcac7 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -850,7 +850,7 @@ Datum
pg_relation_filenode(PG_FUNCTION_ARGS)
{
Oid relid = PG_GETARG_OID(0);
- Oid result;
+ RelNode result;
HeapTuple tuple;
Form_pg_class relform;
@@ -870,15 +870,15 @@ pg_relation_filenode(PG_FUNCTION_ARGS)
else
{
/* no storage, return NULL */
- result = InvalidOid;
+ result = InvalidRelfileNode;
}
ReleaseSysCache(tuple);
- if (!OidIsValid(result))
+ if (!RelfileNodeIsValid(result))
PG_RETURN_NULL();
- PG_RETURN_OID(result);
+ PG_RETURN_INT64(result);
}
/*
@@ -898,11 +898,11 @@ Datum
pg_filenode_relation(PG_FUNCTION_ARGS)
{
Oid reltablespace = PG_GETARG_OID(0);
- Oid relfilenode = PG_GETARG_OID(1);
+ RelNode relfilenode = PG_GETARG_INT64(1);
Oid heaprel;
/* test needed so RelidByRelfilenode doesn't misbehave */
- if (!OidIsValid(relfilenode))
+ if (!RelfileNodeIsValid(relfilenode))
PG_RETURN_NULL();
heaprel = RelidByRelfilenode(reltablespace, relfilenode);
@@ -945,21 +945,21 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
else
rnode.dbNode = MyDatabaseId;
if (relform->relfilenode)
- rnode.relNode = relform->relfilenode;
+ RELFILENODE_SETRELNODE(rnode, relform->relfilenode);
else /* Consult the relation mapper */
- rnode.relNode = RelationMapOidToFilenode(relid,
- relform->relisshared);
+ RELFILENODE_SETRELNODE(rnode, RelationMapOidToFilenode(relid,
+ relform->relisshared));
}
else
{
/* no storage, return NULL */
- rnode.relNode = InvalidOid;
+ RELFILENODE_SETRELNODE(rnode, InvalidRelfileNode);
/* some compilers generate warnings without these next two lines */
rnode.dbNode = InvalidOid;
rnode.spcNode = InvalidOid;
}
- if (!OidIsValid(rnode.relNode))
+ if (RELFILENODE_GETRELNODE(rnode) == InvalidRelfileNode)
{
ReleaseSysCache(tuple);
PG_RETURN_NULL();
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 67b9675e..ab8d148 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -142,10 +142,10 @@ binary_upgrade_set_next_toast_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_toast_relfilenode(PG_FUNCTION_ARGS)
{
- Oid nodeoid = PG_GETARG_OID(0);
+ RelNode relnode = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
- binary_upgrade_next_toast_pg_class_relfilenode = nodeoid;
+ binary_upgrade_next_toast_pg_class_relfilenode = relnode;
PG_RETURN_VOID();
}
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 2e760e8..9f86ea7 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1288,7 +1288,7 @@ retry:
static void
RelationInitPhysicalAddr(Relation relation)
{
- Oid oldnode = relation->rd_node.relNode;
+ RelNode oldnode = RELFILENODE_GETRELNODE(relation->rd_node);
/* these relations kinds never have storage */
if (!RELKIND_HAS_STORAGE(relation->rd_rel->relkind))
@@ -1335,15 +1335,16 @@ RelationInitPhysicalAddr(Relation relation)
heap_freetuple(phys_tuple);
}
- relation->rd_node.relNode = relation->rd_rel->relfilenode;
+ RELFILENODE_SETRELNODE(relation->rd_node,
+ relation->rd_rel->relfilenode);
}
else
{
/* Consult the relation mapper */
- relation->rd_node.relNode =
- RelationMapOidToFilenode(relation->rd_id,
- relation->rd_rel->relisshared);
- if (!OidIsValid(relation->rd_node.relNode))
+ RELFILENODE_SETRELNODE(relation->rd_node,
+ RelationMapOidToFilenode(relation->rd_id,
+ relation->rd_rel->relisshared));
+ if (RELFILENODE_GETRELNODE(relation->rd_node) == InvalidRelfileNode)
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
RelationGetRelationName(relation), relation->rd_id);
}
@@ -1353,7 +1354,8 @@ RelationInitPhysicalAddr(Relation relation)
* rd_firstRelfilenodeSubid. No subtransactions start or end while in
* parallel mode, so the specific SubTransactionId does not matter.
*/
- if (IsParallelWorker() && oldnode != relation->rd_node.relNode)
+ if (IsParallelWorker() && oldnode !=
+ RELFILENODE_GETRELNODE(relation->rd_node))
{
if (RelFileNodeSkippingWAL(relation->rd_node))
relation->rd_firstRelfilenodeSubid = TopSubTransactionId;
@@ -1958,13 +1960,13 @@ formrdesc(const char *relationName, Oid relationReltype,
/*
* All relations made with formrdesc are mapped. This is necessarily so
* because there is no other way to know what filenode they currently
- * have. In bootstrap mode, add them to the initial relation mapper data,
- * specifying that the initial filenode is the same as the OID.
+ * have. In bootstrap mode, generate a new relfilenode and add them to the
+ * initial relation mapper data.
*/
- relation->rd_rel->relfilenode = InvalidOid;
+ relation->rd_rel->relfilenode = InvalidRelfileNode;
if (IsBootstrapProcessingMode())
RelationMapUpdateMap(RelationGetRelid(relation),
- RelationGetRelid(relation),
+ GetNewRelNode(),
isshared, true);
/*
@@ -3433,7 +3435,7 @@ RelationBuildLocalRelation(const char *relname,
TupleDesc tupDesc,
Oid relid,
Oid accessmtd,
- Oid relfilenode,
+ RelNode relfilenode,
Oid reltablespace,
bool shared_relation,
bool mapped_relation,
@@ -3604,7 +3606,7 @@ RelationBuildLocalRelation(const char *relname,
if (mapped_relation)
{
- rel->rd_rel->relfilenode = InvalidOid;
+ rel->rd_rel->relfilenode = InvalidRelfileNode;
/* Add it to the active mapping information */
RelationMapUpdateMap(relid, relfilenode, shared_relation, true);
}
@@ -3673,7 +3675,7 @@ RelationBuildLocalRelation(const char *relname,
void
RelationSetNewRelfilenode(Relation relation, char persistence)
{
- Oid newrelfilenode;
+ RelNode newrelfilenode;
Relation pg_class;
HeapTuple tuple;
Form_pg_class classform;
@@ -3682,7 +3684,7 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
RelFileNode newrnode;
/* Allocate a new relfilenode */
- newrelfilenode = GetNewRelFileNode(relation->rd_rel->reltablespace, NULL,
+ newrelfilenode = GetNewRelFileNode(relation->rd_rel->reltablespace,
persistence);
/*
@@ -3711,7 +3713,8 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
* caught here, if GetNewRelFileNode messes up for any reason.
*/
newrnode = relation->rd_node;
- newrnode.relNode = newrelfilenode;
+ RELFILENODE_SETRELNODE(newrnode, newrelfilenode);
+
if (RELKIND_HAS_TABLE_AM(relation->rd_rel->relkind))
{
diff --git a/src/backend/utils/cache/relfilenodemap.c b/src/backend/utils/cache/relfilenodemap.c
index 70c323c..8c4e924 100644
--- a/src/backend/utils/cache/relfilenodemap.c
+++ b/src/backend/utils/cache/relfilenodemap.c
@@ -37,7 +37,7 @@ static ScanKeyData relfilenode_skey[2];
typedef struct
{
Oid reltablespace;
- Oid relfilenode;
+ RelNodeId relfilenode;
} RelfilenodeMapKey;
typedef struct
@@ -135,7 +135,7 @@ InitializeRelfilenodeMap(void)
* Returns InvalidOid if no relation matching the criteria could be found.
*/
Oid
-RelidByRelfilenode(Oid reltablespace, Oid relfilenode)
+RelidByRelfilenode(Oid reltablespace, RelNode relfilenode)
{
RelfilenodeMapKey key;
RelfilenodeMapEntry *entry;
@@ -155,7 +155,7 @@ RelidByRelfilenode(Oid reltablespace, Oid relfilenode)
MemSet(&key, 0, sizeof(key));
key.reltablespace = reltablespace;
- key.relfilenode = relfilenode;
+ RELNODEID_SET_RELNODE(key.relfilenode, relfilenode);
/*
* Check cache and return entry if one is found. Even if no target
@@ -196,7 +196,7 @@ RelidByRelfilenode(Oid reltablespace, Oid relfilenode)
/* set scan arguments */
skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
- skey[1].sk_argument = ObjectIdGetDatum(relfilenode);
+ skey[1].sk_argument = Int64GetDatum(relfilenode);
scandesc = systable_beginscan(relation,
ClassTblspcRelfilenodeIndexId,
@@ -213,7 +213,7 @@ RelidByRelfilenode(Oid reltablespace, Oid relfilenode)
if (found)
elog(ERROR,
- "unexpected duplicate for tablespace %u, relfilenode %u",
+ "unexpected duplicate for tablespace %u, relfilenode " INT64_FORMAT,
reltablespace, relfilenode);
found = true;
diff --git a/src/backend/utils/cache/relmapper.c b/src/backend/utils/cache/relmapper.c
index 4f6811f..503dd19 100644
--- a/src/backend/utils/cache/relmapper.c
+++ b/src/backend/utils/cache/relmapper.c
@@ -79,7 +79,7 @@
typedef struct RelMapping
{
Oid mapoid; /* OID of a catalog */
- Oid mapfilenode; /* its filenode number */
+ RelNodeId mapfilenode; /* its filenode number */
} RelMapping;
typedef struct RelMapFile
@@ -132,7 +132,7 @@ static RelMapFile pending_local_updates;
/* non-export function prototypes */
-static void apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode,
+static void apply_map_update(RelMapFile *map, Oid relationId, RelNode fileNode,
bool add_okay);
static void merge_map_updates(RelMapFile *map, const RelMapFile *updates,
bool add_okay);
@@ -155,7 +155,7 @@ static void perform_relmap_update(bool shared, const RelMapFile *updates);
* Returns InvalidOid if the OID is not known (which should never happen,
* but the caller is in a better position to report a meaningful error).
*/
-Oid
+RelNode
RelationMapOidToFilenode(Oid relationId, bool shared)
{
const RelMapFile *map;
@@ -168,13 +168,13 @@ RelationMapOidToFilenode(Oid relationId, bool shared)
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return RELNODEID_GET_RELNODE(map->mappings[i].mapfilenode);
}
map = &shared_map;
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return RELNODEID_GET_RELNODE(map->mappings[i].mapfilenode);
}
}
else
@@ -183,17 +183,17 @@ RelationMapOidToFilenode(Oid relationId, bool shared)
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return RELNODEID_GET_RELNODE(map->mappings[i].mapfilenode);
}
map = &local_map;
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return RELNODEID_GET_RELNODE(map->mappings[i].mapfilenode);
}
}
- return InvalidOid;
+ return InvalidRelfileNode;
}
/*
@@ -209,7 +209,7 @@ RelationMapOidToFilenode(Oid relationId, bool shared)
* relfilenode doesn't pertain to a mapped relation.
*/
Oid
-RelationMapFilenodeToOid(Oid filenode, bool shared)
+RelationMapFilenodeToOid(RelNode filenode, bool shared)
{
const RelMapFile *map;
int32 i;
@@ -220,13 +220,13 @@ RelationMapFilenodeToOid(Oid filenode, bool shared)
map = &active_shared_updates;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenode == RELNODEID_GET_RELNODE(map->mappings[i].mapfilenode))
return map->mappings[i].mapoid;
}
map = &shared_map;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenode == RELNODEID_GET_RELNODE(map->mappings[i].mapfilenode))
return map->mappings[i].mapoid;
}
}
@@ -235,13 +235,13 @@ RelationMapFilenodeToOid(Oid filenode, bool shared)
map = &active_local_updates;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenode == RELNODEID_GET_RELNODE(map->mappings[i].mapfilenode))
return map->mappings[i].mapoid;
}
map = &local_map;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenode == RELNODEID_GET_RELNODE(map->mappings[i].mapfilenode))
return map->mappings[i].mapoid;
}
}
@@ -258,7 +258,7 @@ RelationMapFilenodeToOid(Oid filenode, bool shared)
* immediately. Otherwise it is made pending until CommandCounterIncrement.
*/
void
-RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
+RelationMapUpdateMap(Oid relationId, RelNode fileNode, bool shared,
bool immediate)
{
RelMapFile *map;
@@ -316,7 +316,8 @@ RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
* add_okay = false to draw an error if not.
*/
static void
-apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode, bool add_okay)
+apply_map_update(RelMapFile *map, Oid relationId, RelNode fileNode,
+ bool add_okay)
{
int32 i;
@@ -325,7 +326,7 @@ apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode, bool add_okay)
{
if (relationId == map->mappings[i].mapoid)
{
- map->mappings[i].mapfilenode = fileNode;
+ RELNODEID_SET_RELNODE(map->mappings[i].mapfilenode, fileNode);
return;
}
}
@@ -337,7 +338,8 @@ apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode, bool add_okay)
if (map->num_mappings >= MAX_MAPPINGS)
elog(ERROR, "ran out of space in relation map");
map->mappings[map->num_mappings].mapoid = relationId;
- map->mappings[map->num_mappings].mapfilenode = fileNode;
+ RELNODEID_SET_RELNODE(map->mappings[map->num_mappings].mapfilenode,
+ fileNode);
map->num_mappings++;
}
@@ -356,7 +358,8 @@ merge_map_updates(RelMapFile *map, const RelMapFile *updates, bool add_okay)
{
apply_map_update(map,
updates->mappings[i].mapoid,
- updates->mappings[i].mapfilenode,
+ RELNODEID_GET_RELNODE(
+ updates->mappings[i].mapfilenode),
add_okay);
}
}
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index 781f8b8..85ed88c 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -79,8 +79,8 @@ pg_control_system(PG_FUNCTION_ARGS)
Datum
pg_control_checkpoint(PG_FUNCTION_ARGS)
{
- Datum values[18];
- bool nulls[18];
+ Datum values[19];
+ bool nulls[19];
TupleDesc tupdesc;
HeapTuple htup;
ControlFileData *ControlFile;
@@ -129,6 +129,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
XIDOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 18, "checkpoint_time",
TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 19, "next_relfilenode",
+ INT8OID, -1, 0);
tupdesc = BlessTupleDesc(tupdesc);
/* Read the control file. */
@@ -202,6 +204,9 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
values[17] = TimestampTzGetDatum(time_t_to_timestamptz(ControlFile->checkPointCopy.time));
nulls[17] = false;
+ values[18] = Int64GetDatum(ControlFile->checkPointCopy.nextRelNode);
+ nulls[18] = false;
+
htup = heap_form_tuple(tupdesc, values, nulls);
PG_RETURN_DATUM(HeapTupleGetDatum(htup));
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 7e69475..94ec594 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -520,9 +520,9 @@ main(int argc, char *argv[])
mode = PG_MODE_ENABLE;
break;
case 'f':
- if (!option_parse_int(optarg, "-f/--filenode", 0,
- INT_MAX,
- NULL))
+ if (!option_parse_int64(optarg, "-f/--filenode", 0,
+ LLONG_MAX,
+ NULL))
exit(1);
only_filenode = pstrdup(optarg);
break;
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index f911f98..2513fc3 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -250,6 +250,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
EpochFromFullTransactionId(ControlFile->checkPointCopy.nextXid),
XidFromFullTransactionId(ControlFile->checkPointCopy.nextXid));
+ printf(_("Latest checkpoint's NextRelFileNode: " INT64_FORMAT "\n"),
+ ControlFile->checkPointCopy.nextRelNode);
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index e3ddf19..2f3968d 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4637,12 +4637,12 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
{
PQExpBuffer upgrade_query = createPQExpBuffer();
PGresult *upgrade_res;
- Oid relfilenode;
+ RelNode relfilenode;
Oid toast_oid;
- Oid toast_relfilenode;
+ RelNode toast_relfilenode;
char relkind;
Oid toast_index_oid;
- Oid toast_index_relfilenode;
+ RelNode toast_index_relfilenode;
/*
* Preserve the OID and relfilenode of the table, table's index, table's
@@ -4668,11 +4668,11 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
relkind = *PQgetvalue(upgrade_res, 0, PQfnumber(upgrade_res, "relkind"));
- relfilenode = atooid(PQgetvalue(upgrade_res, 0,
+ relfilenode = atorelnode(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "relfilenode")));
toast_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "reltoastrelid")));
- toast_relfilenode = atooid(PQgetvalue(upgrade_res, 0,
+ toast_relfilenode = atorelnode(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "toast_relfilenode")));
toast_index_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "indexrelid")));
@@ -4693,9 +4693,9 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
* partitioned tables have a relfilenode, which should not be preserved
* when upgrading.
*/
- if (OidIsValid(relfilenode) && relkind != RELKIND_PARTITIONED_TABLE)
+ if (RelfileNodeIsValid(relfilenode) && relkind != RELKIND_PARTITIONED_TABLE)
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenode);
/*
@@ -4709,7 +4709,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_toast_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_relfilenode);
/* every toast table has an index */
@@ -4717,7 +4717,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_index_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_index_relfilenode);
}
@@ -4730,7 +4730,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
pg_class_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenode);
}
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 7211090..06d3445 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -512,6 +512,7 @@ isRelDataFile(const char *path)
RelFileNode rnode;
unsigned int segNo;
int nmatch;
+ RelNode relNode;
bool matched;
/*----
@@ -535,11 +536,12 @@ isRelDataFile(const char *path)
*/
rnode.spcNode = InvalidOid;
rnode.dbNode = InvalidOid;
- rnode.relNode = InvalidOid;
+ RELFILENODE_SETRELNODE(rnode, 0); /* FIXME-1 */
segNo = 0;
matched = false;
- nmatch = sscanf(path, "global/%u.%u", &rnode.relNode, &segNo);
+ nmatch = sscanf(path, "global/" INT64_FORMAT ".%u", &relNode, &segNo);
+ RELFILENODE_SETRELNODE(rnode, relNode);
if (nmatch == 1 || nmatch == 2)
{
rnode.spcNode = GLOBALTABLESPACE_OID;
@@ -548,8 +550,9 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "base/%u/%u.%u",
- &rnode.dbNode, &rnode.relNode, &segNo);
+ nmatch = sscanf(path, "base/%u/" INT64_FORMAT ".%u",
+ &rnode.dbNode, &relNode, &segNo);
+ RELFILENODE_SETRELNODE(rnode, relNode);
if (nmatch == 2 || nmatch == 3)
{
rnode.spcNode = DEFAULTTABLESPACE_OID;
@@ -557,9 +560,10 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/%u.%u",
- &rnode.spcNode, &rnode.dbNode, &rnode.relNode,
+ nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/" INT64_FORMAT ".%u",
+ &rnode.spcNode, &rnode.dbNode, &relNode,
&segNo);
+ RELFILENODE_SETRELNODE(rnode, relNode);
if (nmatch == 3 || nmatch == 4)
matched = true;
}
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 69ef231..d3c5d53 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -383,8 +383,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
i_reloid,
i_indtable,
i_toastheap,
- i_relfilenode,
i_reltablespace;
+ RelNode i_relfilenode;
char query[QUERY_ALLOC];
char *last_namespace = NULL,
*last_tablespace = NULL;
@@ -511,7 +511,7 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
relname = PQgetvalue(res, relnum, i_relname);
curr->relname = pg_strdup(relname);
- curr->relfilenode = atooid(PQgetvalue(res, relnum, i_relfilenode));
+ curr->relfilenode = atorelnode(PQgetvalue(res, relnum, i_relfilenode));
curr->tblsp_alloc = false;
/* Is the tablespace oid non-default? */
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 1db8e3f..a3503aa 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -122,7 +122,7 @@ typedef struct
char *nspname; /* namespace name */
char *relname; /* relation name */
Oid reloid; /* relation OID */
- Oid relfilenode; /* relation file node */
+ RelNode relfilenode; /* relation file node */
Oid indtable; /* if index, OID of its table, else 0 */
Oid toastheap; /* if toast table, OID of base table, else 0 */
char *tablespace; /* tablespace path; "" for cluster default */
@@ -146,7 +146,7 @@ typedef struct
const char *old_tablespace_suffix;
const char *new_tablespace_suffix;
Oid db_oid;
- Oid relfilenode;
+ RelNode relfilenode;
/* the rest are used only for logging and error reporting */
char *nspname; /* namespaces */
char *relname;
diff --git a/src/bin/pg_upgrade/relfilenode.c b/src/bin/pg_upgrade/relfilenode.c
index 2f4deb3..10e6a6c 100644
--- a/src/bin/pg_upgrade/relfilenode.c
+++ b/src/bin/pg_upgrade/relfilenode.c
@@ -190,14 +190,14 @@ transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_fro
else
snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
- snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
+ snprintf(old_file, sizeof(old_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->old_tablespace,
map->old_tablespace_suffix,
map->db_oid,
map->relfilenode,
type_suffix,
extent_suffix);
- snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
+ snprintf(new_file, sizeof(new_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->new_tablespace,
map->new_tablespace_suffix,
map->db_oid,
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index a6251e1..ae4a2d8 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -518,15 +518,17 @@ XLogDumpDisplayRecord(XLogDumpConfig *config, XLogReaderState *record)
XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blk);
if (forknum != MAIN_FORKNUM)
- printf(", blkref #%d: rel %u/%u/%u fork %s blk %u",
+ printf(", blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rnode.spcNode, rnode.dbNode,
+ RELFILENODE_GETRELNODE(rnode),
forkNames[forknum],
blk);
else
- printf(", blkref #%d: rel %u/%u/%u blk %u",
+ printf(", blkref #%d: rel %u/%u/" INT64_FORMAT " blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rnode.spcNode, rnode.dbNode,
+ RELFILENODE_GETRELNODE(rnode),
blk);
if (XLogRecHasBlockImage(record, block_id))
{
@@ -548,9 +550,9 @@ XLogDumpDisplayRecord(XLogDumpConfig *config, XLogReaderState *record)
continue;
XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blk);
- printf("\tblkref #%d: rel %u/%u/%u fork %s blk %u",
+ printf("\tblkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rnode.spcNode, rnode.dbNode, RELFILENODE_GETRELNODE(rnode),
forkNames[forknum],
blk);
if (XLogRecHasBlockImage(record, block_id))
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 636c96e..27b8547 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -138,7 +138,7 @@ GetDatabasePath(Oid dbNode, Oid spcNode)
* the trouble considering BackendId is just int anyway.
*/
char *
-GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
+GetRelationPath(Oid dbNode, Oid spcNode, RelNode relNode,
int backendId, ForkNumber forkNumber)
{
char *path;
@@ -149,10 +149,10 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
Assert(dbNode == 0);
Assert(backendId == InvalidBackendId);
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("global/%u_%s",
+ path = psprintf("global/" INT64_FORMAT "_%s",
relNode, forkNames[forkNumber]);
else
- path = psprintf("global/%u", relNode);
+ path = psprintf("global/" INT64_FORMAT, relNode);
}
else if (spcNode == DEFAULTTABLESPACE_OID)
{
@@ -160,21 +160,21 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/%u_%s",
+ path = psprintf("base/%u/" INT64_FORMAT "_%s",
dbNode, relNode,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/%u",
+ path = psprintf("base/%u/" INT64_FORMAT,
dbNode, relNode);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/t%d_%u_%s",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT "_%s",
dbNode, backendId, relNode,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/t%d_%u",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT,
dbNode, backendId, relNode);
}
}
@@ -184,24 +184,24 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT "_%s",
spcNode, TABLESPACE_VERSION_DIRECTORY,
dbNode, relNode,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT,
spcNode, TABLESPACE_VERSION_DIRECTORY,
dbNode, relNode);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT "_%s",
spcNode, TABLESPACE_VERSION_DIRECTORY,
dbNode, backendId, relNode,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT,
spcNode, TABLESPACE_VERSION_DIRECTORY,
dbNode, backendId, relNode);
}
diff --git a/src/fe_utils/option_utils.c b/src/fe_utils/option_utils.c
index abea881..2cb3370 100644
--- a/src/fe_utils/option_utils.c
+++ b/src/fe_utils/option_utils.c
@@ -82,3 +82,45 @@ option_parse_int(const char *optarg, const char *optname,
*result = val;
return true;
}
+
+/*
+ * option_parse_int64
+ *
+ * Same as option_parse_int but parse int64.
+ */
+bool
+option_parse_int64(const char *optarg, const char *optname,
+ int64 min_range, int64 max_range,
+ int64 *result)
+{
+ char *endptr;
+ int64 val;
+
+ errno = 0;
+ val = strtoi64(optarg, &endptr, 10);
+
+ /*
+ * Skip any trailing whitespace; if anything but whitespace remains before
+ * the terminating character, fail.
+ */
+ while (*endptr != '\0' && isspace((unsigned char) *endptr))
+ endptr++;
+
+ if (*endptr != '\0')
+ {
+ pg_log_error("invalid value \"%s\" for option %s",
+ optarg, optname);
+ return false;
+ }
+
+ if (errno == ERANGE || val < min_range || val > max_range)
+ {
+ pg_log_error("%s must be in range " INT64_FORMAT ".." INT64_FORMAT,
+ optname, min_range, max_range);
+ return false;
+ }
+
+ if (result)
+ *result = val;
+ return true;
+}
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 9a2816d..8113335 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -217,6 +217,9 @@ typedef struct VariableCacheData
*/
Oid nextOid; /* next OID to assign */
uint32 oidCount; /* OIDs available before must do XLOG work */
+ RelNode nextRelNode; /* next relfilenode to assign */
+ uint32 relnodecount; /* relfilenodes available before we must do
+ XLOG work */
/*
* These fields are protected by XidGenLock.
@@ -298,6 +301,7 @@ extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
extern bool ForceTransactionIdLimitUpdate(void);
extern Oid GetNewObjectId(void);
+extern RelNode GetNewRelNode(void);
extern void StopGeneratingPinnedObjectIds(void);
#ifdef USE_ASSERT_CHECKING
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index bb0c526..04f0cd6 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -304,6 +304,7 @@ extern bool CreateRestartPoint(int flags);
extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN);
extern XLogRecPtr CalculateMaxmumSafeLSN(void);
extern void XLogPutNextOid(Oid nextOid);
+extern void XLogPutNextRelFileNode(RelNode nextrelnode);
extern XLogRecPtr XLogRestorePoint(const char *rpName);
extern void UpdateFullPageWrites(void);
extern void GetFullPageWriteInfo(XLogRecPtr *RedoRecPtr_p, bool *doPageWrites_p);
diff --git a/src/include/catalog/binary_upgrade.h b/src/include/catalog/binary_upgrade.h
index 0b6944b..d2b45ba 100644
--- a/src/include/catalog/binary_upgrade.h
+++ b/src/include/catalog/binary_upgrade.h
@@ -26,7 +26,7 @@ extern PGDLLIMPORT Oid binary_upgrade_next_heap_pg_class_relfilenode;
extern PGDLLIMPORT Oid binary_upgrade_next_index_pg_class_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_index_pg_class_relfilenode;
extern PGDLLIMPORT Oid binary_upgrade_next_toast_pg_class_oid;
-extern PGDLLIMPORT Oid binary_upgrade_next_toast_pg_class_relfilenode;
+extern PGDLLIMPORT RelNode binary_upgrade_next_toast_pg_class_relfilenode;
extern PGDLLIMPORT Oid binary_upgrade_next_pg_enum_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_pg_authid_oid;
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index 60c1215..1b83c79 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -15,6 +15,7 @@
#define CATALOG_H
#include "catalog/pg_class.h"
+#include "storage/relfilenode.h"
#include "utils/relcache.h"
@@ -38,7 +39,6 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
-extern Oid GetNewRelFileNode(Oid reltablespace, Relation pg_class,
- char relpersistence);
+extern RelNode GetNewRelFileNode(Oid reltablespace, char relpersistence);
#endif /* CATALOG_H */
diff --git a/src/include/catalog/heap.h b/src/include/catalog/heap.h
index c4757bd..66d41af 100644
--- a/src/include/catalog/heap.h
+++ b/src/include/catalog/heap.h
@@ -50,7 +50,7 @@ extern Relation heap_create(const char *relname,
Oid relnamespace,
Oid reltablespace,
Oid relid,
- Oid relfilenode,
+ RelNode relfilenode,
Oid accessmtd,
TupleDesc tupDesc,
char relkind,
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index a1d6e3b..1e79ec9 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -71,7 +71,7 @@ extern Oid index_create(Relation heapRelation,
Oid indexRelationId,
Oid parentIndexRelid,
Oid parentConstraintId,
- Oid relFileNode,
+ RelNode relFileNode,
IndexInfo *indexInfo,
List *indexColNames,
Oid accessMethodObjectId,
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index 304e8c1..4659ed3 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -52,13 +52,13 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* access method; 0 if not a table / index */
Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
- /* identifier of physical storage file */
- /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
- Oid relfilenode BKI_DEFAULT(0);
-
/* identifier of table space for relation (0 means default for database) */
Oid reltablespace BKI_DEFAULT(0) BKI_LOOKUP_OPT(pg_tablespace);
+ /* identifier of physical storage file */
+ /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
+ int64 relfilenode BKI_DEFAULT(0);
+
/* # of blocks (not always up-to-date) */
int32 relpages BKI_DEFAULT(0);
@@ -154,7 +154,7 @@ typedef FormData_pg_class *Form_pg_class;
DECLARE_UNIQUE_INDEX_PKEY(pg_class_oid_index, 2662, ClassOidIndexId, on pg_class using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_class_relname_nsp_index, 2663, ClassNameNspIndexId, on pg_class using btree(relname name_ops, relnamespace oid_ops));
-DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode oid_ops));
+DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode int8_ops));
#ifdef EXPOSE_TO_CLIENT_CODE
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 1f3dc24..27d584d 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -41,6 +41,7 @@ typedef struct CheckPoint
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
FullTransactionId nextXid; /* next free transaction ID */
+ RelNode nextRelNode; /* next relfile node */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
@@ -78,6 +79,7 @@ typedef struct CheckPoint
#define XLOG_FPI 0xB0
/* 0xC0 is used in Postgres 9.5-11 */
#define XLOG_OVERWRITE_CONTRECORD 0xD0
+#define XLOG_NEXT_RELFILENODE 0xE0
/*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 0859dc8..bc21fdc 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7270,11 +7270,11 @@
proname => 'pg_indexes_size', provolatile => 'v', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_indexes_size' },
{ oid => '2999', descr => 'filenode identifier of relation',
- proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'oid',
+ proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_relation_filenode' },
{ oid => '3454', descr => 'relation OID for filenode and tablespace',
proname => 'pg_filenode_relation', provolatile => 's',
- prorettype => 'regclass', proargtypes => 'oid oid',
+ prorettype => 'regclass', proargtypes => 'oid int8',
prosrc => 'pg_filenode_relation' },
{ oid => '3034', descr => 'file path of relation',
proname => 'pg_relation_filepath', provolatile => 's', prorettype => 'text',
@@ -11050,7 +11050,7 @@
prosrc => 'binary_upgrade_set_next_index_relfilenode' },
{ oid => '4547', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_toast_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_toast_relfilenode' },
{ oid => '4548', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_pg_tablespace_oid', provolatile => 'v',
diff --git a/src/include/commands/tablecmds.h b/src/include/commands/tablecmds.h
index 5d4037f..297c20b 100644
--- a/src/include/commands/tablecmds.h
+++ b/src/include/commands/tablecmds.h
@@ -66,7 +66,7 @@ extern void SetRelationHasSubclass(Oid relationId, bool relhassubclass);
extern bool CheckRelationTableSpaceMove(Relation rel, Oid newTableSpaceId);
extern void SetRelationTableSpace(Relation rel, Oid newTableSpaceId,
- Oid newRelFileNode);
+ RelNode newRelFileNode);
extern ObjectAddress renameatt(RenameStmt *stmt);
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index a4b5dc8..52f06a5 100644
--- a/src/include/common/relpath.h
+++ b/src/include/common/relpath.h
@@ -66,7 +66,7 @@ extern int forkname_chars(const char *str, ForkNumber *fork);
*/
extern char *GetDatabasePath(Oid dbNode, Oid spcNode);
-extern char *GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
+extern char *GetRelationPath(Oid dbNode, Oid spcNode, RelNode relNode,
int backendId, ForkNumber forkNumber);
/*
@@ -76,8 +76,8 @@ extern char *GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
/* First argument is a RelFileNode */
#define relpathbackend(rnode, backend, forknum) \
- GetRelationPath((rnode).dbNode, (rnode).spcNode, (rnode).relNode, \
- backend, forknum)
+ GetRelationPath((rnode).dbNode, (rnode).spcNode, \
+ RELFILENODE_GETRELNODE((rnode)), backend, forknum)
/* First argument is a RelFileNode */
#define relpathperm(rnode, forknum) \
diff --git a/src/include/fe_utils/option_utils.h b/src/include/fe_utils/option_utils.h
index 03c09fd..8c0e818 100644
--- a/src/include/fe_utils/option_utils.h
+++ b/src/include/fe_utils/option_utils.h
@@ -22,5 +22,8 @@ extern void handle_help_version_opts(int argc, char *argv[],
extern bool option_parse_int(const char *optarg, const char *optname,
int min_range, int max_range,
int *result);
+extern bool option_parse_int64(const char *optarg, const char *optname,
+ int64 min_range, int64 max_range,
+ int64 *result);
#endif /* OPTION_UTILS_H */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 3e9bdc7..ab0648a 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -2900,7 +2900,7 @@ typedef struct IndexStmt
List *excludeOpNames; /* exclusion operator names, or NIL if none */
char *idxcomment; /* comment to apply to index, or NULL */
Oid indexOid; /* OID of an existing index, if any */
- Oid oldNode; /* relfilenode of existing storage, if any */
+ RelNode oldNode; /* relfilenode of existing storage, if any */
SubTransactionId oldCreateSubid; /* rd_createSubid of oldNode */
SubTransactionId oldFirstRelfilenodeSubid; /* rd_firstRelfilenodeSubid of
* oldNode */
diff --git a/src/include/postgres_ext.h b/src/include/postgres_ext.h
index fdb61b7..bd907f7 100644
--- a/src/include/postgres_ext.h
+++ b/src/include/postgres_ext.h
@@ -46,6 +46,21 @@ typedef unsigned int Oid;
/* Define a signed 64-bit integer type for use in client API declarations. */
typedef PG_INT64_TYPE pg_int64;
+/*
+ * The RelNode data type identifies a specific relation file. A RelNode is
+ * unique within a cluster.
+ *
+ * XXX ideally this would be uint64, but currently int8 is the only exposed
+ * 8-byte datatype, so perhaps we should add a new "relnode" datatype that
+ * is an 8-byte unsigned integer.
+ */
+typedef pg_int64 RelNode;
+
+#define atorelnode(x) ((RelNode) strtoull((x), NULL, 10))
+
+#define InvalidRelfileNode ((RelNode) 0)
+#define FirstNormalRelfileNode ((RelNode) 1)
+#define RelfileNodeIsValid(relNode) ((bool) ((relNode) != InvalidRelfileNode))
/*
* Identifiers of error message fields. Kept here to keep common
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index b903d2b..293dc90 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -21,6 +21,7 @@
#include "storage/condition_variable.h"
#include "storage/latch.h"
#include "storage/lwlock.h"
+#include "storage/relfilenode.h"
#include "storage/shmem.h"
#include "storage/smgr.h"
#include "storage/spin.h"
@@ -91,7 +92,6 @@
typedef struct buftag
{
RelFileNode rnode; /* physical relation identifier */
- ForkNumber forkNum;
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
@@ -99,23 +99,23 @@ typedef struct buftag
( \
(a).rnode.spcNode = InvalidOid, \
(a).rnode.dbNode = InvalidOid, \
- (a).rnode.relNode = InvalidOid, \
- (a).forkNum = InvalidForkNumber, \
+ RELFILENODE_SETRELNODE((a).rnode, 0), \
+ RELFILENODE_SETFORKNUM((a).rnode, InvalidForkNumber), \
(a).blockNum = InvalidBlockNumber \
)
#define INIT_BUFFERTAG(a,xx_rnode,xx_forkNum,xx_blockNum) \
( \
(a).rnode = (xx_rnode), \
- (a).forkNum = (xx_forkNum), \
- (a).blockNum = (xx_blockNum) \
+ (a).blockNum = (xx_blockNum), \
+ RELFILENODE_SETFORKNUM((a).rnode, (xx_forkNum)) \
)
#define BUFFERTAGS_EQUAL(a,b) \
( \
RelFileNodeEquals((a).rnode, (b).rnode) && \
(a).blockNum == (b).blockNum && \
- (a).forkNum == (b).forkNum \
+ RELFILENODE_GETFORKNUM((a).rnode) == RELFILENODE_GETFORKNUM((b).rnode) \
)
/*
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
index 4fdc606..5474311 100644
--- a/src/include/storage/relfilenode.h
+++ b/src/include/storage/relfilenode.h
@@ -18,6 +18,18 @@
#include "storage/backendid.h"
/*
+ * RelNodeId:
+ *
+ * This is a storage type for RelNode. The reasoning for using it is the
+ * same as for BlockId, so see the comment atop BlockId.
+ */
+typedef struct RelNodeId
+{
+ uint32 rn_hi;
+ uint32 rn_lo;
+} RelNodeId;
+
+/*
* RelFileNode must provide all that we need to know to physically access
* a relation, with the exception of the backend ID, which can be provided
* separately. Note, however, that a "physical" relation is comprised of
@@ -31,11 +43,14 @@
* "shared" relations (those common to all databases of a cluster).
* Nonzero dbNode values correspond to pg_database.oid.
*
- * relNode identifies the specific relation. relNode corresponds to
- * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
- * to assign new physical files to relations in some situations).
- * Notice that relNode is only unique within a database in a particular
- * tablespace.
+ * relNode identifies the specific relation and its fork number. High 8 bits
+ * represent the fork number and the remaining 56 bits represent the
+ * relation. relNode corresponds to pg_class.relfilenode (NOT pg_class.oid).
+ * Notice that relNode is unique within a cluster.
+ *
+ * Note: only when the RelFileNode is part of a BufferTag do the high 8
+ * bits of relNode carry the fork number; otherwise those bits are
+ * cleared.
*
* Note: spcNode must be GLOBALTABLESPACE_OID if and only if dbNode is
* zero. We support shared relations only in the "global" tablespace.
@@ -53,12 +68,14 @@
* Note: various places use RelFileNode in hashtable keys. Therefore,
* there *must not* be any unused padding bytes in this struct. That
* should be safe as long as all the fields are of type Oid.
+ *
+ * We use RelNodeId in order to avoid the alignment padding.
*/
typedef struct RelFileNode
{
Oid spcNode; /* tablespace */
Oid dbNode; /* database */
- Oid relNode; /* relation */
+ RelNodeId relNode; /* relation */
} RelFileNode;
/*
@@ -86,14 +103,51 @@ typedef struct RelFileNodeBackend
* RelFileNodeBackendEquals.
*/
#define RelFileNodeEquals(node1, node2) \
- ((node1).relNode == (node2).relNode && \
+ ((RELFILENODE_GETRELNODE((node1)) == RELFILENODE_GETRELNODE((node2))) && \
(node1).dbNode == (node2).dbNode && \
(node1).spcNode == (node2).spcNode)
#define RelFileNodeBackendEquals(node1, node2) \
- ((node1).node.relNode == (node2).node.relNode && \
+ (RELFILENODE_GETRELNODE((node1)) == RELFILENODE_GETRELNODE((node2)) && \
(node1).node.dbNode == (node2).node.dbNode && \
(node1).backend == (node2).backend && \
(node1).node.spcNode == (node2).node.spcNode)
+/*
+ * These macros access the "relation" stored in the low 56 bits of
+ * RelFileNode.relNode; its 8 high-order bits hold the fork number.
+ */
+#define RELFILENODE_RELNODE_BITS 56
+#define RELFILENODE_RELNODE_MASK ((((RelNode) 1) << RELFILENODE_RELNODE_BITS) - 1)
+#define MAX_RELFILENODE RELFILENODE_RELNODE_MASK
+
+/* Retrieve the RelNode from a RelNodeId. */
+#define RELNODEID_GET_RELNODE(rnodeid) \
+ (RelNode) (((RelNode) (rnodeid).rn_hi << 32) | ((uint32) (rnodeid).rn_lo))
+
+/* Store the given value in RelNodeId. */
+#define RELNODEID_SET_RELNODE(rnodeid, val) \
+( \
+ (rnodeid).rn_hi = (val) >> 32, \
+ (rnodeid).rn_lo = (val) & 0xffffffff \
+)
+
+/* Gets the relfilenode stored in rnode.relNode. */
+#define RELFILENODE_GETRELNODE(rnode) \
+ (RELNODEID_GET_RELNODE((rnode).relNode) & RELFILENODE_RELNODE_MASK)
+
+/* Gets the fork number stored in rnode.relNode. */
+#define RELFILENODE_GETFORKNUM(rnode) \
+ (RELNODEID_GET_RELNODE((rnode).relNode) >> RELFILENODE_RELNODE_BITS)
+
+/* Sets input val in the relfilenode part of the rnode.relNode. */
+#define RELFILENODE_SETRELNODE(rnode, val) \
+ RELNODEID_SET_RELNODE((rnode).relNode, (val) & RELFILENODE_RELNODE_MASK)
+
+/* Sets input val in the fork number part of the rnode.relNode. */
+#define RELFILENODE_SETFORKNUM(rnode, val) \
+ RELNODEID_SET_RELNODE((rnode).relNode, \
+ (RELNODEID_GET_RELNODE((rnode).relNode)) | \
+ ((RelNode) (val) << RELFILENODE_RELNODE_BITS))
+
#endif /* RELFILENODE_H */
diff --git a/src/include/storage/sync.h b/src/include/storage/sync.h
index 9737e1e..4d67850 100644
--- a/src/include/storage/sync.h
+++ b/src/include/storage/sync.h
@@ -57,7 +57,6 @@ typedef struct FileTag
extern void InitSync(void);
extern void SyncPreCheckpoint(void);
-extern void SyncPostCheckpoint(void);
extern void ProcessSyncRequests(void);
extern void RememberSyncRequest(const FileTag *ftag, SyncRequestType type);
extern bool RegisterSyncRequest(const FileTag *ftag, SyncRequestType type,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 6da1b22..d799b71 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -526,7 +526,7 @@ typedef struct ViewOptions
*/
#define RelationIsMapped(relation) \
(RELKIND_HAS_STORAGE((relation)->rd_rel->relkind) && \
- ((relation)->rd_rel->relfilenode == InvalidOid))
+ ((relation)->rd_rel->relfilenode == InvalidRelfileNode))
/*
* RelationGetSmgr
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 84d6afe..5d13660 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -102,7 +102,7 @@ extern Relation RelationBuildLocalRelation(const char *relname,
TupleDesc tupDesc,
Oid relid,
Oid accessmtd,
- Oid relfilenode,
+ RelNode relfilenode,
Oid reltablespace,
bool shared_relation,
bool mapped_relation,
diff --git a/src/include/utils/relfilenodemap.h b/src/include/utils/relfilenodemap.h
index 77d8046..d324981 100644
--- a/src/include/utils/relfilenodemap.h
+++ b/src/include/utils/relfilenodemap.h
@@ -13,6 +13,6 @@
#ifndef RELFILENODEMAP_H
#define RELFILENODEMAP_H
-extern Oid RelidByRelfilenode(Oid reltablespace, Oid relfilenode);
+extern Oid RelidByRelfilenode(Oid reltablespace, RelNode relfilenode);
#endif /* RELFILENODEMAP_H */
diff --git a/src/include/utils/relmapper.h b/src/include/utils/relmapper.h
index 9fbb5a7..58234a8 100644
--- a/src/include/utils/relmapper.h
+++ b/src/include/utils/relmapper.h
@@ -35,11 +35,11 @@ typedef struct xl_relmap_update
#define MinSizeOfRelmapUpdate offsetof(xl_relmap_update, data)
-extern Oid RelationMapOidToFilenode(Oid relationId, bool shared);
+extern RelNode RelationMapOidToFilenode(Oid relationId, bool shared);
-extern Oid RelationMapFilenodeToOid(Oid relationId, bool shared);
+extern Oid RelationMapFilenodeToOid(RelNode relationId, bool shared);
-extern void RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
+extern void RelationMapUpdateMap(Oid relationId, RelNode fileNode, bool shared,
bool immediate);
extern void RelationMapRemoveMapping(Oid relationId);
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index 16e0475..58aeddb 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -2164,7 +2164,6 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
else 'OTHER'
end as storage,
@@ -2175,10 +2174,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+---------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | t | own | child 0 index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | t | own | child 1 index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | t | orig | child 0 index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | t | orig | child 1 index
at_partitioned_id_name_key | t | none | parent index
(6 rows)
@@ -2198,7 +2197,6 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
else 'OTHER'
end as storage,
@@ -2209,10 +2207,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+--------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | f | own | parent index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | f | own | parent index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | f | OTHER | parent index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | f | OTHER | parent index
at_partitioned_id_name_key | f | none | parent index
(6 rows)
@@ -2556,7 +2554,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql
index ac894c0..250e6cd 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -1478,7 +1478,6 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
else 'OTHER'
end as storage,
@@ -1499,7 +1498,6 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
else 'OTHER'
end as storage,
@@ -1638,7 +1636,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
--
1.8.3.1
On Mon, Jan 31, 2022 at 12:29 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
the main one is that currently we do not have a uint8 data type; only
int8 is there, so I have used int8 for storing relfilenode + forknumber.
I'm confused. We use int8 in tons of places, so I feel like it must exist.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Mon, Jan 31, 2022 at 9:04 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Jan 31, 2022 at 12:29 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
the main one is that currently we do not have a uint8 data type; only
int8 is there, so I have used int8 for storing relfilenode + forknumber.
I'm confused. We use int8 in tons of places, so I feel like it must exist.
Rather, we use uint8 in tons of places, so I feel like it must exist.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Mon, Jan 31, 2022 at 7:36 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Jan 31, 2022 at 9:04 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Jan 31, 2022 at 12:29 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
the main one is that currently we do not have a uint8 data type; only
int8 is there, so I have used int8 for storing relfilenode + forknumber.
I'm confused. We use int8 in tons of places, so I feel like it must exist.
Rather, we use uint8 in tons of places, so I feel like it must exist.
Hmm, at least pg_type doesn't have anything with a name like uint8.
postgres[101702]=# select oid, typname from pg_type where typname like '%int8';
oid | typname
------+---------
20 | int8
1016 | _int8
(2 rows)
postgres[101702]=# select oid, typname from pg_type where typname like '%uint%';
oid | typname
-----+---------
(0 rows)
I agree that we use an 8-byte unsigned int (uint64) in multiple places in
the code. But it is not an exposed data type and is not used in any
exposed function. However, we will have to use the relfilenode in exposed
C functions, e.g. binary_upgrade_set_next_heap_relfilenode().
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Mon, Jan 31, 2022 at 9:37 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
I agree that we use an 8-byte unsigned integer in many places in the code
as uint64, but it is not an exposed data type and is not used as
part of any exposed function. However, we will have to use the relfilenode
in exposed C functions, e.g.
binary_upgrade_set_next_heap_relfilenode().
Oh, I thought we were talking about the C data type uint8 i.e. an
8-bit unsigned integer. Which in retrospect was a dumb thought because
you said you wanted to store the relfilenode AND the fork number
there, which only makes sense if you were talking about SQL data types
rather than C data types. It is confusing that we have an SQL data
type called int8 and a C data type called int8 and they're not the
same.
But if you're talking about SQL data types, why? pg_class only stores
the relfilenode and not the fork number currently, and I don't see why
that would change. I think that the data type for the relfilenode
column would change to a 64-bit signed integer (i.e. bigint or int8)
that only ever uses the low-order 56 bits, and then when you need to
store a relfilenode and a fork number in the same 8-byte quantity
you'd do that using either a struct with bit fields or by something
like combined = ((uint64) signed_representation_of_relfilenode) |
(((uint64) forknumber) << 56);
--
Robert Haas
EDB: http://www.enterprisedb.com
On Wed, Feb 2, 2022 at 6:57 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Jan 31, 2022 at 9:37 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
I agree that we use an 8-byte unsigned integer in many places in the code
as uint64, but it is not an exposed data type and is not used as
part of any exposed function. However, we will have to use the relfilenode
in exposed C functions, e.g.
binary_upgrade_set_next_heap_relfilenode().

Oh, I thought we were talking about the C data type uint8, i.e. an
8-bit unsigned integer. Which in retrospect was a dumb thought because
you said you wanted to store the relfilenode AND the fork number
there, which only makes sense if you were talking about SQL data types
rather than C data types. It is confusing that we have an SQL data
type called int8 and a C data type called int8 and they're not the
same.

But if you're talking about SQL data types, why? pg_class only stores
the relfilenode and not the fork number currently, and I don't see why
that would change. I think that the data type for the relfilenode
column would change to a 64-bit signed integer (i.e. bigint or int8)
that only ever uses the low-order 56 bits, and then when you need to
store a relfilenode and a fork number in the same 8-byte quantity
you'd do that using either a struct with bit fields or by something
like combined = ((uint64) signed_representation_of_relfilenode) |
(((uint64) forknumber) << 56);
Yeah, you're right. Whenever we use the combined value we can
use the uint64 C type, and in pg_class we can keep it as int64 because that
only represents the relfilenode part. I think I was just
confused and tried to use the same data type everywhere, whether it is
combined with the fork number or not. Thanks for your input; I will
change this.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Wed, Feb 2, 2022 at 7:09 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Wed, Feb 2, 2022 at 6:57 PM Robert Haas <robertmhaas@gmail.com> wrote:
I have split the patch into multiple patches that are
independently committable and easier to review. I have explained the
purpose and scope of each patch in the respective commit messages.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachments:
v3-0004-Don-t-delay-removing-Tombstone-file-until-next-ch.patch (text/x-patch; charset=US-ASCII)
From cf846cb824a8f69fc12e470242a8fd871f7f3969 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilipkumar@localhost.localdomain>
Date: Sun, 6 Feb 2022 17:31:32 +0530
Subject: [PATCH v3 4/4] Don't delay removing Tombstone file until next
checkpoint
Currently, we cannot remove an unused relfilenode until the
next checkpoint, because removing it immediately risks
reusing the same relfilenode for two different relations
within a single checkpoint cycle due to OID wraparound.

The previous patches in this set widened relfilenode to
56 bits and removed the risk of wraparound, so we no longer
need to wait for the next checkpoint to remove the unused
relation file; we can clean it up on commit.
---
src/backend/access/transam/xlog.c | 5 --
src/backend/storage/smgr/md.c | 58 ++++++----------------
src/backend/storage/sync/sync.c | 101 --------------------------------------
src/include/storage/sync.h | 1 -
4 files changed, 14 insertions(+), 151 deletions(-)
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 267f747..6c1635b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -9416,11 +9416,6 @@ CreateCheckPoint(int flags)
END_CRIT_SECTION();
/*
- * Let smgr do post-checkpoint cleanup (eg, deleting old files).
- */
- SyncPostCheckpoint();
-
- /*
* Update the average distance between checkpoints if the prior checkpoint
* exists.
*/
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 0ed59f3..e0503a2 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -124,8 +124,6 @@ static void mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum,
static MdfdVec *mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior);
static void register_dirty_segment(SMgrRelation reln, ForkNumber forknum,
MdfdVec *seg);
-static void register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
- BlockNumber segno);
static void register_forget_request(RelFileNodeBackend rnode, ForkNumber forknum,
BlockNumber segno);
static void _fdvec_resize(SMgrRelation reln,
@@ -321,36 +319,25 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
/*
* Delete or truncate the first segment.
*/
- if (isRedo || forkNum != MAIN_FORKNUM || RelFileNodeBackendIsTemp(rnode))
+ if (!RelFileNodeBackendIsTemp(rnode))
{
- if (!RelFileNodeBackendIsTemp(rnode))
- {
- /* Prevent other backends' fds from holding on to the disk space */
- ret = do_truncate(path);
-
- /* Forget any pending sync requests for the first segment */
- register_forget_request(rnode, forkNum, 0 /* first seg */ );
- }
- else
- ret = 0;
+ /* Prevent other backends' fds from holding on to the disk space */
+ ret = do_truncate(path);
- /* Next unlink the file, unless it was already found to be missing */
- if (ret == 0 || errno != ENOENT)
- {
- ret = unlink(path);
- if (ret < 0 && errno != ENOENT)
- ereport(WARNING,
- (errcode_for_file_access(),
- errmsg("could not remove file \"%s\": %m", path)));
- }
+ /* Forget any pending sync requests for the first segment */
+ register_forget_request(rnode, forkNum, 0 /* first seg */ );
}
else
- {
- /* Prevent other backends' fds from holding on to the disk space */
- ret = do_truncate(path);
+ ret = 0;
- /* Register request to unlink first segment later */
- register_unlink_segment(rnode, forkNum, 0 /* first seg */ );
+ /* Next unlink the file, unless it was already found to be missing */
+ if (ret == 0 || errno != ENOENT)
+ {
+ ret = unlink(path);
+ if (ret < 0 && errno != ENOENT)
+ ereport(WARNING,
+ (errcode_for_file_access(),
+ errmsg("could not remove file \"%s\": %m", path)));
}
/*
@@ -995,23 +982,6 @@ register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
}
/*
- * register_unlink_segment() -- Schedule a file to be deleted after next checkpoint
- */
-static void
-register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
- BlockNumber segno)
-{
- FileTag tag;
-
- INIT_MD_FILETAG(tag, rnode.node, forknum, segno);
-
- /* Should never be used with temp relations */
- Assert(!RelFileNodeBackendIsTemp(rnode));
-
- RegisterSyncRequest(&tag, SYNC_UNLINK_REQUEST, true /* retryOnError */ );
-}
-
-/*
* register_forget_request() -- forget any fsyncs for a relation fork's segment
*/
static void
diff --git a/src/backend/storage/sync/sync.c b/src/backend/storage/sync/sync.c
index 543f691..46a1242 100644
--- a/src/backend/storage/sync/sync.c
+++ b/src/backend/storage/sync/sync.c
@@ -188,92 +188,6 @@ SyncPreCheckpoint(void)
}
/*
- * SyncPostCheckpoint() -- Do post-checkpoint work
- *
- * Remove any lingering files that can now be safely removed.
- */
-void
-SyncPostCheckpoint(void)
-{
- int absorb_counter;
- ListCell *lc;
-
- absorb_counter = UNLINKS_PER_ABSORB;
- foreach(lc, pendingUnlinks)
- {
- PendingUnlinkEntry *entry = (PendingUnlinkEntry *) lfirst(lc);
- char path[MAXPGPATH];
-
- /* Skip over any canceled entries */
- if (entry->canceled)
- continue;
-
- /*
- * New entries are appended to the end, so if the entry is new we've
- * reached the end of old entries.
- *
- * Note: if just the right number of consecutive checkpoints fail, we
- * could be fooled here by cycle_ctr wraparound. However, the only
- * consequence is that we'd delay unlinking for one more checkpoint,
- * which is perfectly tolerable.
- */
- if (entry->cycle_ctr == checkpoint_cycle_ctr)
- break;
-
- /* Unlink the file */
- if (syncsw[entry->tag.handler].sync_unlinkfiletag(&entry->tag,
- path) < 0)
- {
- /*
- * There's a race condition, when the database is dropped at the
- * same time that we process the pending unlink requests. If the
- * DROP DATABASE deletes the file before we do, we will get ENOENT
- * here. rmtree() also has to ignore ENOENT errors, to deal with
- * the possibility that we delete the file first.
- */
- if (errno != ENOENT)
- ereport(WARNING,
- (errcode_for_file_access(),
- errmsg("could not remove file \"%s\": %m", path)));
- }
-
- /* Mark the list entry as canceled, just in case */
- entry->canceled = true;
-
- /*
- * As in ProcessSyncRequests, we don't want to stop absorbing fsync
- * requests for a long time when there are many deletions to be done.
- * We can safely call AbsorbSyncRequests() at this point in the loop.
- */
- if (--absorb_counter <= 0)
- {
- AbsorbSyncRequests();
- absorb_counter = UNLINKS_PER_ABSORB;
- }
- }
-
- /*
- * If we reached the end of the list, we can just remove the whole list
- * (remembering to pfree all the PendingUnlinkEntry objects). Otherwise,
- * we must keep the entries at or after "lc".
- */
- if (lc == NULL)
- {
- list_free_deep(pendingUnlinks);
- pendingUnlinks = NIL;
- }
- else
- {
- int ntodelete = list_cell_number(pendingUnlinks, lc);
-
- for (int i = 0; i < ntodelete; i++)
- pfree(list_nth(pendingUnlinks, i));
-
- pendingUnlinks = list_delete_first_n(pendingUnlinks, ntodelete);
- }
-}
-
-/*
* ProcessSyncRequests() -- Process queued fsync requests.
*/
void
@@ -519,21 +433,6 @@ RememberSyncRequest(const FileTag *ftag, SyncRequestType type)
entry->canceled = true;
}
}
- else if (type == SYNC_UNLINK_REQUEST)
- {
- /* Unlink request: put it in the linked list */
- MemoryContext oldcxt = MemoryContextSwitchTo(pendingOpsCxt);
- PendingUnlinkEntry *entry;
-
- entry = palloc(sizeof(PendingUnlinkEntry));
- entry->tag = *ftag;
- entry->cycle_ctr = checkpoint_cycle_ctr;
- entry->canceled = false;
-
- pendingUnlinks = lappend(pendingUnlinks, entry);
-
- MemoryContextSwitchTo(oldcxt);
- }
else
{
/* Normal case: enter a request to fsync this segment */
diff --git a/src/include/storage/sync.h b/src/include/storage/sync.h
index 9737e1e..4d67850 100644
--- a/src/include/storage/sync.h
+++ b/src/include/storage/sync.h
@@ -57,7 +57,6 @@ typedef struct FileTag
extern void InitSync(void);
extern void SyncPreCheckpoint(void);
-extern void SyncPostCheckpoint(void);
extern void ProcessSyncRequests(void);
extern void RememberSyncRequest(const FileTag *ftag, SyncRequestType type);
extern bool RegisterSyncRequest(const FileTag *ftag, SyncRequestType type,
--
1.8.3.1
v3-0001-Preliminary-refactoring-for-supporting-larger-rel.patch (text/x-patch; charset=US-ASCII)
From 3699cc598222916da27dfb83c6279f9b76cdf666 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilipkumar@localhost.localdomain>
Date: Thu, 3 Feb 2022 21:25:27 +0530
Subject: [PATCH v3 1/4] Preliminary refactoring for supporting larger
relfilenode
Currently, relNode in RelFileNode is of type Oid, which can
wrap around. As part of the larger patch set we are making
it 64-bit to avoid wraparound, which will also make a
couple of other things simpler, as explained in the later
patches.

This is a preliminary refactoring patch that replaces
direct access to RelFileNode.relNode with macros to get
and set the relNode, without changing anything else.
The later patches will change the data type of relNode
and hence the contents of those macros.
---
contrib/pg_buffercache/pg_buffercache_pages.c | 2 +-
contrib/pg_prewarm/autoprewarm.c | 3 ++-
src/backend/access/common/syncscan.c | 2 +-
src/backend/access/gin/ginxlog.c | 2 +-
src/backend/access/rmgrdesc/gistdesc.c | 2 +-
src/backend/access/rmgrdesc/heapdesc.c | 2 +-
src/backend/access/rmgrdesc/nbtdesc.c | 2 +-
src/backend/access/rmgrdesc/seqdesc.c | 2 +-
src/backend/access/transam/xlog.c | 8 +++++---
src/backend/access/transam/xlogutils.c | 4 ++--
src/backend/catalog/catalog.c | 12 ++++++++----
src/backend/catalog/storage.c | 3 ++-
src/backend/commands/tablecmds.c | 4 ++--
src/backend/replication/logical/reorderbuffer.c | 4 ++--
src/backend/storage/buffer/bufmgr.c | 20 ++++++++++----------
src/backend/storage/buffer/localbuf.c | 5 +++--
src/backend/storage/freespace/fsmpage.c | 3 ++-
src/backend/storage/smgr/md.c | 10 +++++-----
src/backend/utils/adt/dbsize.c | 10 +++++-----
src/backend/utils/cache/relcache.c | 19 +++++++++++--------
src/backend/utils/cache/relmapper.c | 2 +-
src/bin/pg_rewind/filemap.c | 14 +++++++++-----
src/bin/pg_waldump/pg_waldump.c | 8 +++++---
src/include/common/relpath.h | 3 ++-
src/include/storage/buf_internals.h | 2 +-
src/include/storage/relfilenode.h | 11 +++++++++--
26 files changed, 93 insertions(+), 66 deletions(-)
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 1bd579f..737e9b4 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -153,7 +153,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenode = bufHdr->tag.rnode.relNode;
+ fctx->record[i].relfilenode = RelFileNodeGetRel(bufHdr->tag.rnode);
fctx->record[i].reltablespace = bufHdr->tag.rnode.spcNode;
fctx->record[i].reldatabase = bufHdr->tag.rnode.dbNode;
fctx->record[i].forknum = bufHdr->tag.forkNum;
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index 5d40fb5..91dc4d4 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -617,7 +617,8 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
{
block_info_array[num_blocks].database = bufHdr->tag.rnode.dbNode;
block_info_array[num_blocks].tablespace = bufHdr->tag.rnode.spcNode;
- block_info_array[num_blocks].filenode = bufHdr->tag.rnode.relNode;
+ block_info_array[num_blocks].filenode =
+ RelFileNodeGetRel(bufHdr->tag.rnode);
block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
diff --git a/src/backend/access/common/syncscan.c b/src/backend/access/common/syncscan.c
index d5b16c5..cc20c98 100644
--- a/src/backend/access/common/syncscan.c
+++ b/src/backend/access/common/syncscan.c
@@ -161,7 +161,7 @@ SyncScanShmemInit(void)
*/
item->location.relfilenode.spcNode = InvalidOid;
item->location.relfilenode.dbNode = InvalidOid;
- item->location.relfilenode.relNode = InvalidOid;
+ RelFileNodeSetRel(item->location.relfilenode, InvalidOid);
item->location.location = InvalidBlockNumber;
item->prev = (i > 0) ?
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 87e8366..19b8e3c 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -101,7 +101,7 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
BufferGetTag(buffer, &node, &forknum, &blknum);
elog(ERROR, "failed to add item to index page in %u/%u/%u",
- node.spcNode, node.dbNode, node.relNode);
+ node.spcNode, node.dbNode, RelFileNodeGetRel(node));
}
}
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index 9cab4fa..52ab941 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -28,7 +28,7 @@ out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec)
{
appendStringInfo(buf, "rel %u/%u/%u; blk %u; latestRemovedXid %u:%u",
xlrec->node.spcNode, xlrec->node.dbNode,
- xlrec->node.relNode, xlrec->block,
+ RelFileNodeGetRel(xlrec->node), xlrec->block,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
XidFromFullTransactionId(xlrec->latestRemovedFullXid));
}
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 6238085..dbfc788 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -172,7 +172,7 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
appendStringInfo(buf, "rel %u/%u/%u; tid %u/%u",
xlrec->target_node.spcNode,
xlrec->target_node.dbNode,
- xlrec->target_node.relNode,
+ RelFileNodeGetRel(xlrec->target_node),
ItemPointerGetBlockNumber(&(xlrec->target_tid)),
ItemPointerGetOffsetNumber(&(xlrec->target_tid)));
appendStringInfo(buf, "; cmin: %u, cmax: %u, combo: %u",
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index dfbbf4e..7812b4b 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -102,7 +102,7 @@ btree_desc(StringInfo buf, XLogReaderState *record)
appendStringInfo(buf, "rel %u/%u/%u; latestRemovedXid %u:%u",
xlrec->node.spcNode, xlrec->node.dbNode,
- xlrec->node.relNode,
+ RelFileNodeGetRel(xlrec->node),
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
XidFromFullTransactionId(xlrec->latestRemovedFullXid));
break;
diff --git a/src/backend/access/rmgrdesc/seqdesc.c b/src/backend/access/rmgrdesc/seqdesc.c
index d9b1e60..edb4470 100644
--- a/src/backend/access/rmgrdesc/seqdesc.c
+++ b/src/backend/access/rmgrdesc/seqdesc.c
@@ -27,7 +27,7 @@ seq_desc(StringInfo buf, XLogReaderState *record)
if (info == XLOG_SEQ_LOG)
appendStringInfo(buf, "rel %u/%u/%u",
xlrec->node.spcNode, xlrec->node.dbNode,
- xlrec->node.relNode);
+ RelFileNodeGetRel(xlrec->node));
}
const char *
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index dfe2a0b..4c561a7 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1542,7 +1542,7 @@ checkXLogConsistency(XLogReaderState *record)
{
elog(FATAL,
"inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rnode.spcNode, rnode.dbNode, RelFileNodeGetRel(rnode),
forknum, blkno);
}
}
@@ -10715,13 +10715,15 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
if (forknum != MAIN_FORKNUM)
appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, fork %u, blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rnode.spcNode, rnode.dbNode,
+ RelFileNodeGetRel(rnode),
forknum,
blk);
else
appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rnode.spcNode, rnode.dbNode,
+ RelFileNodeGetRel(rnode),
blk);
if (XLogRecHasBlockImage(record, block_id))
appendStringInfoString(buf, " FPW");
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 90e1c483..8b4c06c 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -593,7 +593,7 @@ CreateFakeRelcacheEntry(RelFileNode rnode)
rel->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
/* We don't know the name of the relation; use relfilenode instead */
- sprintf(RelationGetRelationName(rel), "%u", rnode.relNode);
+ sprintf(RelationGetRelationName(rel), "%u", RelFileNodeGetRel(rnode));
/*
* We set up the lockRelId in case anything tries to lock the dummy
@@ -603,7 +603,7 @@ CreateFakeRelcacheEntry(RelFileNode rnode)
* conflicts. While syncing, we already hold AccessExclusiveLock.
*/
rel->rd_lockInfo.lockRelId.dbId = rnode.dbNode;
- rel->rd_lockInfo.lockRelId.relId = rnode.relNode;
+ rel->rd_lockInfo.lockRelId.relId = RelFileNodeGetRel(rnode);
rel->rd_smgr = NULL;
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index dfd5fb6..a0787b2 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -528,14 +528,18 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
do
{
+ Oid relnode;
+
CHECK_FOR_INTERRUPTS();
/* Generate the OID */
if (pg_class)
- rnode.node.relNode = GetNewOidWithIndex(pg_class, ClassOidIndexId,
- Anum_pg_class_oid);
+ relnode = GetNewOidWithIndex(pg_class, ClassOidIndexId,
+ Anum_pg_class_oid);
else
- rnode.node.relNode = GetNewObjectId();
+ relnode = GetNewObjectId();
+
+ RelFileNodeSetRel(rnode.node, relnode);
/* Check for existing file of same name */
rpath = relpath(rnode, MAIN_FORKNUM);
@@ -560,7 +564,7 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
pfree(rpath);
} while (collides);
- return rnode.node.relNode;
+ return RelFileNodeGetRel(rnode.node);
}
/*
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index 9b80755..254ad60 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -593,7 +593,8 @@ RestorePendingSyncs(char *startAddress)
RelFileNode *rnode;
Assert(pendingSyncHash == NULL);
- for (rnode = (RelFileNode *) startAddress; rnode->relNode != 0; rnode++)
+ for (rnode = (RelFileNode *) startAddress; RelFileNodeGetRel(*rnode) != 0;
+ rnode++)
AddPendingSync(rnode);
}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 3e83f37..339d6eb 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -13442,7 +13442,7 @@ TryReuseIndex(Oid oldId, IndexStmt *stmt)
/* If it's a partitioned index, there is no storage to share. */
if (irel->rd_rel->relkind != RELKIND_PARTITIONED_INDEX)
{
- stmt->oldNode = irel->rd_node.relNode;
+ stmt->oldNode = RelFileNodeGetRel(irel->rd_node);
stmt->oldCreateSubid = irel->rd_createSubid;
stmt->oldFirstRelfilenodeSubid = irel->rd_firstRelfilenodeSubid;
}
@@ -14329,7 +14329,7 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
/* Open old and new relation */
newrnode = rel->rd_node;
- newrnode.relNode = newrelfilenode;
+ RelFileNodeSetRel(newrnode, newrelfilenode);
newrnode.spcNode = newTableSpace;
/* hand off to AM to actually create the new filenode and copy the data */
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 19b2ba2..82e59b0 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -2134,7 +2134,7 @@ ReorderBufferProcessTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
Assert(snapshot_now);
reloid = RelidByRelfilenode(change->data.tp.relnode.spcNode,
- change->data.tp.relnode.relNode);
+ RelFileNodeGetRel(change->data.tp.relnode));
/*
* Mapped catalog tuple without data, emitted while
@@ -4866,7 +4866,7 @@ DisplayMapping(HTAB *tuplecid_data)
elog(DEBUG3, "mapping: node: %u/%u/%u tid: %u/%u cmin: %u, cmax: %u",
ent->key.relnode.dbNode,
ent->key.relnode.spcNode,
- ent->key.relnode.relNode,
+ RelFileNodeGetRel(ent->key.relnode),
ItemPointerGetBlockNumber(&ent->key.tid),
ItemPointerGetOffsetNumber(&ent->key.tid),
ent->cmin,
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index f5459c6..68e16c7 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -819,7 +819,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
TRACE_POSTGRESQL_BUFFER_READ_START(forkNum, blockNum,
smgr->smgr_rnode.node.spcNode,
smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode,
+ RelFileNodeGetRel(smgr->smgr_rnode.node),
smgr->smgr_rnode.backend,
isExtend);
@@ -881,7 +881,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
smgr->smgr_rnode.node.spcNode,
smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode,
+ RelFileNodeGetRel(smgr->smgr_rnode.node),
smgr->smgr_rnode.backend,
isExtend,
found);
@@ -1071,7 +1071,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
smgr->smgr_rnode.node.spcNode,
smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode,
+ RelFileNodeGetRel(smgr->smgr_rnode.node),
smgr->smgr_rnode.backend,
isExtend,
found);
@@ -1250,7 +1250,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
TRACE_POSTGRESQL_BUFFER_WRITE_DIRTY_START(forkNum, blockNum,
smgr->smgr_rnode.node.spcNode,
smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode);
+ RelFileNodeGetRel(smgr->smgr_rnode.node));
FlushBuffer(buf, NULL);
LWLockRelease(BufferDescriptorGetContentLock(buf));
@@ -1261,7 +1261,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
TRACE_POSTGRESQL_BUFFER_WRITE_DIRTY_DONE(forkNum, blockNum,
smgr->smgr_rnode.node.spcNode,
smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode);
+ RelFileNodeGetRel(smgr->smgr_rnode.node));
}
else
{
@@ -1994,7 +1994,7 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
item->tsId = bufHdr->tag.rnode.spcNode;
- item->relNode = bufHdr->tag.rnode.relNode;
+ item->relNode = RelFileNodeGetRel(bufHdr->tag.rnode);
item->forkNum = bufHdr->tag.forkNum;
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2838,7 +2838,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
buf->tag.blockNum,
reln->smgr_rnode.node.spcNode,
reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode);
+ RelFileNodeGetRel(reln->smgr_rnode.node));
buf_state = LockBufHdr(buf);
@@ -2918,7 +2918,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
buf->tag.blockNum,
reln->smgr_rnode.node.spcNode,
reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode);
+ RelFileNodeGetRel(reln->smgr_rnode.node));
/* Pop the error context stack */
error_context_stack = errcallback.previous;
@@ -4552,9 +4552,9 @@ rnode_comparator(const void *p1, const void *p2)
RelFileNode n1 = *(const RelFileNode *) p1;
RelFileNode n2 = *(const RelFileNode *) p2;
- if (n1.relNode < n2.relNode)
+ if (RelFileNodeGetRel(n1) < RelFileNodeGetRel(n2))
return -1;
- else if (n1.relNode > n2.relNode)
+ else if (RelFileNodeGetRel(n1) > RelFileNodeGetRel(n2))
return 1;
if (n1.dbNode < n2.dbNode)
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index e71f95a..dbb52a9 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -134,7 +134,8 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
Assert(BUFFERTAGS_EQUAL(bufHdr->tag, newTag));
#ifdef LBDEBUG
fprintf(stderr, "LB ALLOC (%u,%d,%d) %d\n",
- smgr->smgr_rnode.node.relNode, forkNum, blockNum, -b - 1);
+ RelFileNodeGetRel(smgr->smgr_rnode.node), forkNum, blockNum,
+ -b - 1);
#endif
buf_state = pg_atomic_read_u32(&bufHdr->state);
@@ -162,7 +163,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
#ifdef LBDEBUG
fprintf(stderr, "LB ALLOC (%u,%d,%d) %d\n",
- smgr->smgr_rnode.node.relNode, forkNum, blockNum,
+ RelFileNodeGetRel(smgr->smgr_rnode.node), forkNum, blockNum,
-nextFreeLocalBuf - 1);
#endif
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index d165b35..31d891e 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -274,7 +274,8 @@ restart:
BufferGetTag(buf, &rnode, &forknum, &blknum);
elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
- blknum, rnode.spcNode, rnode.dbNode, rnode.relNode);
+ blknum, rnode.spcNode, rnode.dbNode,
+ RelFileNodeGetRel(rnode));
/* make sure we hold an exclusive lock */
if (!exclusive_lock_held)
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index d26c915..0ed59f3 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -640,7 +640,7 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
TRACE_POSTGRESQL_SMGR_MD_READ_START(forknum, blocknum,
reln->smgr_rnode.node.spcNode,
reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ RelFileNodeGetRel(reln->smgr_rnode.node),
reln->smgr_rnode.backend);
v = _mdfd_getseg(reln, forknum, blocknum, false,
@@ -655,7 +655,7 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
TRACE_POSTGRESQL_SMGR_MD_READ_DONE(forknum, blocknum,
reln->smgr_rnode.node.spcNode,
reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ RelFileNodeGetRel(reln->smgr_rnode.node),
reln->smgr_rnode.backend,
nbytes,
BLCKSZ);
@@ -710,7 +710,7 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
TRACE_POSTGRESQL_SMGR_MD_WRITE_START(forknum, blocknum,
reln->smgr_rnode.node.spcNode,
reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ RelFileNodeGetRel(reln->smgr_rnode.node),
reln->smgr_rnode.backend);
v = _mdfd_getseg(reln, forknum, blocknum, skipFsync,
@@ -725,7 +725,7 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
TRACE_POSTGRESQL_SMGR_MD_WRITE_DONE(forknum, blocknum,
reln->smgr_rnode.node.spcNode,
reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ RelFileNodeGetRel(reln->smgr_rnode.node),
reln->smgr_rnode.backend,
nbytes,
BLCKSZ);
@@ -1036,7 +1036,7 @@ ForgetDatabaseSyncRequests(Oid dbid)
rnode.dbNode = dbid;
rnode.spcNode = 0;
- rnode.relNode = 0;
+ RelFileNodeSetRel(rnode, 0);
INIT_MD_FILETAG(tag, rnode, InvalidForkNumber, InvalidBlockNumber);
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 3a2f2e1..58181bf 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -945,21 +945,21 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
else
rnode.dbNode = MyDatabaseId;
if (relform->relfilenode)
- rnode.relNode = relform->relfilenode;
+ RelFileNodeSetRel(rnode, relform->relfilenode);
else /* Consult the relation mapper */
- rnode.relNode = RelationMapOidToFilenode(relid,
- relform->relisshared);
+ RelFileNodeSetRel(rnode,
+ RelationMapOidToFilenode(relid, relform->relisshared));
}
else
{
/* no storage, return NULL */
- rnode.relNode = InvalidOid;
+ RelFileNodeSetRel(rnode, InvalidOid);
/* some compilers generate warnings without these next two lines */
rnode.dbNode = InvalidOid;
rnode.spcNode = InvalidOid;
}
- if (!OidIsValid(rnode.relNode))
+ if (!OidIsValid(RelFileNodeGetRel(rnode)))
{
ReleaseSysCache(tuple);
PG_RETURN_NULL();
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 2707fed..0a949af 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1288,7 +1288,7 @@ retry:
static void
RelationInitPhysicalAddr(Relation relation)
{
- Oid oldnode = relation->rd_node.relNode;
+ Oid oldnode = RelFileNodeGetRel(relation->rd_node);
/* these relations kinds never have storage */
if (!RELKIND_HAS_STORAGE(relation->rd_rel->relkind))
@@ -1335,15 +1335,18 @@ RelationInitPhysicalAddr(Relation relation)
heap_freetuple(phys_tuple);
}
- relation->rd_node.relNode = relation->rd_rel->relfilenode;
+ RelFileNodeSetRel(relation->rd_node, relation->rd_rel->relfilenode);
}
else
{
+ Oid relnode;
+
/* Consult the relation mapper */
- relation->rd_node.relNode =
- RelationMapOidToFilenode(relation->rd_id,
- relation->rd_rel->relisshared);
- if (!OidIsValid(relation->rd_node.relNode))
+ relnode = RelationMapOidToFilenode(relation->rd_id,
+ relation->rd_rel->relisshared);
+ RelFileNodeSetRel(relation->rd_node, relnode);
+
+ if (!OidIsValid(relnode))
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
RelationGetRelationName(relation), relation->rd_id);
}
@@ -1353,7 +1356,7 @@ RelationInitPhysicalAddr(Relation relation)
* rd_firstRelfilenodeSubid. No subtransactions start or end while in
* parallel mode, so the specific SubTransactionId does not matter.
*/
- if (IsParallelWorker() && oldnode != relation->rd_node.relNode)
+ if (IsParallelWorker() && oldnode != RelFileNodeGetRel(relation->rd_node))
{
if (RelFileNodeSkippingWAL(relation->rd_node))
relation->rd_firstRelfilenodeSubid = TopSubTransactionId;
@@ -3712,7 +3715,7 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
* caught here, if GetNewRelFileNode messes up for any reason.
*/
newrnode = relation->rd_node;
- newrnode.relNode = newrelfilenode;
+ RelFileNodeSetRel(newrnode, newrelfilenode);
if (RELKIND_HAS_TABLE_AM(relation->rd_rel->relkind))
{
diff --git a/src/backend/utils/cache/relmapper.c b/src/backend/utils/cache/relmapper.c
index 4f6811f..38a097b 100644
--- a/src/backend/utils/cache/relmapper.c
+++ b/src/backend/utils/cache/relmapper.c
@@ -929,7 +929,7 @@ write_relmap_file(bool shared, RelMapFile *newmap,
rnode.spcNode = tsid;
rnode.dbNode = dbid;
- rnode.relNode = newmap->mappings[i].mapfilenode;
+ RelFileNodeSetRel(rnode, newmap->mappings[i].mapfilenode);
RelationPreserveStorage(rnode, false);
}
}
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 7211090..daebb11 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -513,6 +513,7 @@ isRelDataFile(const char *path)
unsigned int segNo;
int nmatch;
bool matched;
+ Oid relnode;
/*----
* Relation data files can be in one of the following directories:
@@ -535,11 +536,11 @@ isRelDataFile(const char *path)
*/
rnode.spcNode = InvalidOid;
rnode.dbNode = InvalidOid;
- rnode.relNode = InvalidOid;
+ relnode = InvalidOid;
segNo = 0;
matched = false;
- nmatch = sscanf(path, "global/%u.%u", &rnode.relNode, &segNo);
+ nmatch = sscanf(path, "global/%u.%u", &relnode, &segNo);
if (nmatch == 1 || nmatch == 2)
{
rnode.spcNode = GLOBALTABLESPACE_OID;
@@ -549,7 +550,7 @@ isRelDataFile(const char *path)
else
{
nmatch = sscanf(path, "base/%u/%u.%u",
- &rnode.dbNode, &rnode.relNode, &segNo);
+ &rnode.dbNode, &relnode, &segNo);
if (nmatch == 2 || nmatch == 3)
{
rnode.spcNode = DEFAULTTABLESPACE_OID;
@@ -558,7 +559,7 @@ isRelDataFile(const char *path)
else
{
nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/%u.%u",
- &rnode.spcNode, &rnode.dbNode, &rnode.relNode,
+ &rnode.spcNode, &rnode.dbNode, &relnode,
&segNo);
if (nmatch == 3 || nmatch == 4)
matched = true;
@@ -573,7 +574,10 @@ isRelDataFile(const char *path)
*/
if (matched)
{
- char *check_path = datasegpath(rnode, MAIN_FORKNUM, segNo);
+ char *check_path;
+
+ RelFileNodeSetRel(rnode, relnode);
+ check_path = datasegpath(rnode, MAIN_FORKNUM, segNo);
if (strcmp(check_path, path) != 0)
matched = false;
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index a6251e1..42edba0 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -520,13 +520,15 @@ XLogDumpDisplayRecord(XLogDumpConfig *config, XLogReaderState *record)
if (forknum != MAIN_FORKNUM)
printf(", blkref #%d: rel %u/%u/%u fork %s blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rnode.spcNode, rnode.dbNode,
+ RelFileNodeGetRel(rnode),
forkNames[forknum],
blk);
else
printf(", blkref #%d: rel %u/%u/%u blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rnode.spcNode, rnode.dbNode,
+ RelFileNodeGetRel(rnode),
blk);
if (XLogRecHasBlockImage(record, block_id))
{
@@ -550,7 +552,7 @@ XLogDumpDisplayRecord(XLogDumpConfig *config, XLogReaderState *record)
XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blk);
printf("\tblkref #%d: rel %u/%u/%u fork %s blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rnode.spcNode, rnode.dbNode, RelFileNodeGetRel(rnode),
forkNames[forknum],
blk);
if (XLogRecHasBlockImage(record, block_id))
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index a4b5dc8..2b09ad1 100644
--- a/src/include/common/relpath.h
+++ b/src/include/common/relpath.h
@@ -76,7 +76,8 @@ extern char *GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
/* First argument is a RelFileNode */
#define relpathbackend(rnode, backend, forknum) \
- GetRelationPath((rnode).dbNode, (rnode).spcNode, (rnode).relNode, \
+ GetRelationPath((rnode).dbNode, (rnode).spcNode, \
+ RelFileNodeGetRel((rnode)), \
backend, forknum)
/* First argument is a RelFileNode */
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index b903d2b..8397a1d 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -99,7 +99,7 @@ typedef struct buftag
( \
(a).rnode.spcNode = InvalidOid, \
(a).rnode.dbNode = InvalidOid, \
- (a).rnode.relNode = InvalidOid, \
+ RelFileNodeSetRel((a).rnode, InvalidOid), \
(a).forkNum = InvalidForkNumber, \
(a).blockNum = InvalidBlockNumber \
)
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
index 4fdc606..2ff20c2 100644
--- a/src/include/storage/relfilenode.h
+++ b/src/include/storage/relfilenode.h
@@ -78,6 +78,13 @@ typedef struct RelFileNodeBackend
#define RelFileNodeBackendIsTemp(rnode) \
((rnode).backend != InvalidBackendId)
+/* Macros to get and set the relNode member of the RelFileNode structure. */
+#define RelFileNodeGetRel(node) \
+ ((node).relNode)
+
+#define RelFileNodeSetRel(node, relnode) \
+ ((node).relNode = (relnode))
+
/*
* Note: RelFileNodeEquals and RelFileNodeBackendEquals compare relNode first
* since that is most likely to be different in two unequal RelFileNodes. It
@@ -86,12 +93,12 @@ typedef struct RelFileNodeBackend
* RelFileNodeBackendEquals.
*/
#define RelFileNodeEquals(node1, node2) \
- ((node1).relNode == (node2).relNode && \
+ ((RelFileNodeGetRel((node1)) == RelFileNodeGetRel((node2))) && \
(node1).dbNode == (node2).dbNode && \
(node1).spcNode == (node2).spcNode)
#define RelFileNodeBackendEquals(node1, node2) \
- ((node1).node.relNode == (node2).node.relNode && \
+ (RelFileNodeGetRel((node1)) == RelFileNodeGetRel((node2)) && \
(node1).node.dbNode == (node2).node.dbNode && \
(node1).backend == (node2).backend && \
(node1).node.spcNode == (node2).node.spcNode)
--
1.8.3.1
Attachment: v3-0002-Preliminary-refactoring-for-storing-fork-number-i.patch (text/x-patch)
From cace365aa078cbf68514fa7ca36eac7c0ff58cb9 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilipkumar@localhost.localdomain>
Date: Sat, 5 Feb 2022 15:54:36 +0530
Subject: [PATCH v3 2/4] Preliminary refactoring for storing fork number in
RelFileNode
As described in the previous patch, we plan to make relNode 64 bits
wide, but that would increase the size of the buffer tag. To avoid
that, we plan to store the fork number in the high 8 bits of the
relNode, leaving the remaining 56 bits for the actual relNode. This
patch is preliminary refactoring for that change: instead of accessing
the fork number directly from the buffer tag, all call sites now go
through a macro, and a later patch will change the macro definition so
that it reads the fork number from the high 8 bits of the relNode.
---
contrib/pg_buffercache/pg_buffercache_pages.c | 2 +-
contrib/pg_prewarm/autoprewarm.c | 3 ++-
src/backend/storage/buffer/bufmgr.c | 37 ++++++++++++++-------------
src/backend/storage/buffer/localbuf.c | 8 +++---
src/include/storage/buf_internals.h | 4 +++
5 files changed, 30 insertions(+), 24 deletions(-)
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 737e9b4..e181d66 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -156,7 +156,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
fctx->record[i].relfilenode = RelFileNodeGetRel(bufHdr->tag.rnode);
fctx->record[i].reltablespace = bufHdr->tag.rnode.spcNode;
fctx->record[i].reldatabase = bufHdr->tag.rnode.dbNode;
- fctx->record[i].forknum = bufHdr->tag.forkNum;
+ fctx->record[i].forknum = BUFFERTAG_GETFORK(bufHdr->tag);
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
fctx->record[i].pinning_backends = BUF_STATE_GET_REFCOUNT(buf_state);
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index 91dc4d4..f62aea4 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -619,7 +619,8 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
block_info_array[num_blocks].tablespace = bufHdr->tag.rnode.spcNode;
block_info_array[num_blocks].filenode =
RelFileNodeGetRel(bufHdr->tag.rnode);
- block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
+ block_info_array[num_blocks].forknum =
+ BUFFERTAG_GETFORK(bufHdr->tag);
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
}
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 68e16c7..4cf74aa 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1641,7 +1641,7 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
- bufHdr->tag.forkNum == forkNum)
+ BUFFERTAG_GETFORK(bufHdr->tag) == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
LocalRefCount[-buffer - 1]--;
@@ -1652,7 +1652,7 @@ ReleaseAndReadBuffer(Buffer buffer,
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
- bufHdr->tag.forkNum == forkNum)
+ BUFFERTAG_GETFORK(bufHdr->tag) == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
}
@@ -1995,7 +1995,7 @@ BufferSync(int flags)
item->buf_id = buf_id;
item->tsId = bufHdr->tag.rnode.spcNode;
item->relNode = RelFileNodeGetRel(bufHdr->tag.rnode);
- item->forkNum = bufHdr->tag.forkNum;
+ item->forkNum = BUFFERTAG_GETFORK(bufHdr->tag);
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2702,7 +2702,7 @@ PrintBufferLeakWarning(Buffer buffer)
}
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rnode, backend, buf->tag.forkNum);
+ path = relpathbackend(buf->tag.rnode, backend, BUFFERTAG_GETFORK(buf->tag));
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2782,7 +2782,7 @@ BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
/* pinned, so OK to read tag without spinlock */
*rnode = bufHdr->tag.rnode;
- *forknum = bufHdr->tag.forkNum;
+ *forknum = BUFFERTAG_GETFORK(bufHdr->tag);
*blknum = bufHdr->tag.blockNum;
}
@@ -2834,7 +2834,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
if (reln == NULL)
reln = smgropen(buf->tag.rnode, InvalidBackendId);
- TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
+ TRACE_POSTGRESQL_BUFFER_FLUSH_START(BUFFERTAG_GETFORK(buf->tag),
buf->tag.blockNum,
reln->smgr_rnode.node.spcNode,
reln->smgr_rnode.node.dbNode,
@@ -2893,7 +2893,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
* bufToWrite is either the shared buffer or a copy, as appropriate.
*/
smgrwrite(reln,
- buf->tag.forkNum,
+ BUFFERTAG_GETFORK(buf->tag),
buf->tag.blockNum,
bufToWrite,
false);
@@ -2914,7 +2914,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
*/
TerminateBufferIO(buf, true, 0);
- TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(buf->tag.forkNum,
+ TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(BUFFERTAG_GETFORK(buf->tag),
buf->tag.blockNum,
reln->smgr_rnode.node.spcNode,
reln->smgr_rnode.node.dbNode,
@@ -3143,7 +3143,7 @@ DropRelFileNodeBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
for (j = 0; j < nforks; j++)
{
if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) &&
- bufHdr->tag.forkNum == forkNum[j] &&
+ BUFFERTAG_GETFORK(bufHdr->tag) == forkNum[j] &&
bufHdr->tag.blockNum >= firstDelBlock[j])
{
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -3375,7 +3375,7 @@ FindAndDropRelFileNodeBuffers(RelFileNode rnode, ForkNumber forkNum,
buf_state = LockBufHdr(bufHdr);
if (RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
- bufHdr->tag.forkNum == forkNum &&
+ BUFFERTAG_GETFORK(bufHdr->tag) == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
@@ -3529,7 +3529,7 @@ FlushRelationBuffers(Relation rel)
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
smgrwrite(RelationGetSmgr(rel),
- bufHdr->tag.forkNum,
+ BUFFERTAG_GETFORK(bufHdr->tag),
bufHdr->tag.blockNum,
localpage,
false);
@@ -4492,7 +4492,7 @@ AbortBufferIO(void)
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
- path = relpathperm(buf->tag.rnode, buf->tag.forkNum);
+ path = relpathperm(buf->tag.rnode, BUFFERTAG_GETFORK(buf->tag));
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4516,7 +4516,8 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rnode, bufHdr->tag.forkNum);
+ char *path = relpathperm(bufHdr->tag.rnode,
+ BUFFERTAG_GETFORK(bufHdr->tag));
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4535,7 +4536,7 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
char *path = relpathbackend(bufHdr->tag.rnode, MyBackendId,
- bufHdr->tag.forkNum);
+ BUFFERTAG_GETFORK(bufHdr->tag));
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4635,9 +4636,9 @@ buffertag_comparator(const BufferTag *ba, const BufferTag *bb)
if (ret != 0)
return ret;
- if (ba->forkNum < bb->forkNum)
+ if (BUFFERTAG_GETFORK(*ba) < BUFFERTAG_GETFORK(*bb))
return -1;
- if (ba->forkNum > bb->forkNum)
+ if (BUFFERTAG_GETFORK(*ba) > BUFFERTAG_GETFORK(*bb))
return 1;
if (ba->blockNum < bb->blockNum)
@@ -4802,7 +4803,7 @@ IssuePendingWritebacks(WritebackContext *context)
/* different file, stop */
if (!RelFileNodeEquals(cur->tag.rnode, next->tag.rnode) ||
- cur->tag.forkNum != next->tag.forkNum)
+ BUFFERTAG_GETFORK(cur->tag) != BUFFERTAG_GETFORK(next->tag))
break;
/* ok, block queued twice, skip */
@@ -4821,7 +4822,7 @@ IssuePendingWritebacks(WritebackContext *context)
/* and finally tell the kernel to write the data to storage */
reln = smgropen(tag.rnode, InvalidBackendId);
- smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
+ smgrwriteback(reln, BUFFERTAG_GETFORK(tag), tag.blockNum, nblocks);
}
context->nr_pending = 0;
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index dbb52a9..2dda7ba 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -222,7 +222,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
/* And write... */
smgrwrite(oreln,
- bufHdr->tag.forkNum,
+ BUFFERTAG_GETFORK(bufHdr->tag),
bufHdr->tag.blockNum,
localpage,
false);
@@ -339,14 +339,14 @@ DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
if ((buf_state & BM_TAG_VALID) &&
RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
- bufHdr->tag.forkNum == forkNum &&
+ BUFFERTAG_GETFORK(bufHdr->tag) == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
relpathbackend(bufHdr->tag.rnode, MyBackendId,
- bufHdr->tag.forkNum),
+ BUFFERTAG_GETFORK(bufHdr->tag)),
LocalRefCount[i]);
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
@@ -390,7 +390,7 @@ DropRelFileNodeAllLocalBuffers(RelFileNode rnode)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
relpathbackend(bufHdr->tag.rnode, MyBackendId,
- bufHdr->tag.forkNum),
+ BUFFERTAG_GETFORK(bufHdr->tag)),
LocalRefCount[i]);
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 8397a1d..38ac9ea 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -118,6 +118,10 @@ typedef struct buftag
(a).forkNum == (b).forkNum \
)
+/* Macro to get the fork number from the buffer tag. */
+#define BUFFERTAG_GETFORK(a) \
+ ((a).forkNum)
+
/*
* The shared buffer mapping table is partitioned to reduce contention.
* To determine which partition lock a given tag requires, compute the tag's
--
1.8.3.1
Attachment: v3-0003-Change-relfilenode-from-Oid-to-56-bit-wide-int.patch (text/x-patch)
From e87596b42a15f8d0f536914409e0b8cf4080003c Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilipkumar@localhost.localdomain>
Date: Sun, 6 Feb 2022 17:30:47 +0530
Subject: [PATCH v3 3/4] Change relfilenode from Oid to 56 bit wide int
This patch makes RelFileNode.relNode 64 bits wide. The high 8 bits
are reserved for the fork number and the remaining 56 bits store the
actual relation file number. The idea is that when a RelFileNode is
used on its own, relNode holds only the 56-bit relation file number
and the high 8 bits are kept clear; but as soon as the RelFileNode is
stored as part of a BufferTag, those high 8 bits carry the fork
number. That way the relfilenode becomes wider without actually
increasing the size of the buffer tag.
---
.../pg_buffercache/pg_buffercache--1.0--1.1.sql | 2 +-
contrib/pg_buffercache/pg_buffercache--1.2.sql | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 6 +-
doc/src/sgml/catalogs.sgml | 2 +-
doc/src/sgml/pgbuffercache.sgml | 2 +-
src/backend/access/common/syncscan.c | 2 +-
src/backend/access/gin/ginxlog.c | 2 +-
src/backend/access/rmgrdesc/gistdesc.c | 2 +-
src/backend/access/rmgrdesc/heapdesc.c | 2 +-
src/backend/access/rmgrdesc/nbtdesc.c | 2 +-
src/backend/access/rmgrdesc/seqdesc.c | 2 +-
src/backend/access/rmgrdesc/xlogdesc.c | 15 ++++-
src/backend/access/transam/varsup.c | 52 +++++++++++++++-
src/backend/access/transam/xlog.c | 43 +++++++++++++-
src/backend/access/transam/xlogutils.c | 9 +--
src/backend/catalog/catalog.c | 65 ++++----------------
src/backend/catalog/heap.c | 23 ++++----
src/backend/catalog/index.c | 15 +++--
src/backend/commands/cluster.c | 4 +-
src/backend/commands/indexcmds.c | 6 +-
src/backend/commands/sequence.c | 2 +-
src/backend/commands/tablecmds.c | 19 +++---
src/backend/nodes/outfuncs.c | 2 +-
src/backend/parser/parse_utilcmd.c | 4 +-
src/backend/replication/logical/decode.c | 1 +
src/backend/replication/logical/reorderbuffer.c | 2 +-
src/backend/storage/buffer/bufmgr.c | 9 +++
src/backend/storage/freespace/fsmpage.c | 2 +-
src/backend/storage/lmgr/lwlocknames.txt | 1 +
src/backend/utils/adt/dbsize.c | 16 ++---
src/backend/utils/adt/pg_upgrade_support.c | 4 +-
src/backend/utils/cache/relcache.c | 20 +++----
src/backend/utils/cache/relfilenodemap.c | 10 ++--
src/backend/utils/cache/relmapper.c | 41 +++++++------
src/backend/utils/misc/pg_controldata.c | 9 ++-
src/bin/pg_checksums/pg_checksums.c | 6 +-
src/bin/pg_controldata/pg_controldata.c | 2 +
src/bin/pg_dump/pg_dump.c | 20 +++----
src/bin/pg_rewind/filemap.c | 10 ++--
src/bin/pg_upgrade/info.c | 4 +-
src/bin/pg_upgrade/pg_upgrade.h | 4 +-
src/bin/pg_upgrade/relfilenode.c | 4 +-
src/bin/pg_waldump/pg_waldump.c | 6 +-
src/common/relpath.c | 22 +++----
src/fe_utils/option_utils.c | 42 +++++++++++++
src/include/access/transam.h | 4 ++
src/include/access/xlog.h | 1 +
src/include/catalog/binary_upgrade.h | 2 +-
src/include/catalog/catalog.h | 4 +-
src/include/catalog/heap.h | 2 +-
src/include/catalog/index.h | 2 +-
src/include/catalog/pg_class.h | 10 ++--
src/include/catalog/pg_control.h | 2 +
src/include/catalog/pg_proc.dat | 6 +-
src/include/commands/tablecmds.h | 2 +-
src/include/common/relpath.h | 2 +-
src/include/fe_utils/option_utils.h | 3 +
src/include/nodes/parsenodes.h | 2 +-
src/include/postgres_ext.h | 15 +++++
src/include/storage/buf_internals.h | 12 ++--
src/include/storage/relfilenode.h | 69 ++++++++++++++++++----
src/include/utils/rel.h | 2 +-
src/include/utils/relcache.h | 2 +-
src/include/utils/relfilenodemap.h | 2 +-
src/include/utils/relmapper.h | 6 +-
src/test/regress/expected/alter_table.out | 20 +++----
src/test/regress/sql/alter_table.sql | 4 +-
67 files changed, 442 insertions(+), 252 deletions(-)
diff --git a/contrib/pg_buffercache/pg_buffercache--1.0--1.1.sql b/contrib/pg_buffercache/pg_buffercache--1.0--1.1.sql
index 54d02f5..5e93238 100644
--- a/contrib/pg_buffercache/pg_buffercache--1.0--1.1.sql
+++ b/contrib/pg_buffercache/pg_buffercache--1.0--1.1.sql
@@ -6,6 +6,6 @@
-- Upgrade view to 1.1. format
CREATE OR REPLACE VIEW pg_buffercache AS
SELECT P.* FROM pg_buffercache_pages() AS P
- (bufferid integer, relfilenode oid, reltablespace oid, reldatabase oid,
+ (bufferid integer, relfilenode int8, reltablespace oid, reldatabase oid,
relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
pinning_backends int4);
diff --git a/contrib/pg_buffercache/pg_buffercache--1.2.sql b/contrib/pg_buffercache/pg_buffercache--1.2.sql
index 6ee5d84..f52ddcd 100644
--- a/contrib/pg_buffercache/pg_buffercache--1.2.sql
+++ b/contrib/pg_buffercache/pg_buffercache--1.2.sql
@@ -12,7 +12,7 @@ LANGUAGE C PARALLEL SAFE;
-- Create a view for convenient access.
CREATE VIEW pg_buffercache AS
SELECT P.* FROM pg_buffercache_pages() AS P
- (bufferid integer, relfilenode oid, reltablespace oid, reldatabase oid,
+ (bufferid integer, relfilenode int8, reltablespace oid, reldatabase oid,
relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
pinning_backends int4);
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index e181d66..6eef33a 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -26,7 +26,7 @@ PG_MODULE_MAGIC;
typedef struct
{
uint32 bufferid;
- Oid relfilenode;
+ RelNode relfilenode;
Oid reltablespace;
Oid reldatabase;
ForkNumber forknum;
@@ -103,7 +103,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
TupleDescInitEntry(tupledesc, (AttrNumber) 1, "bufferid",
INT4OID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 2, "relfilenode",
- OIDOID, -1, 0);
+ INT8OID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 3, "reltablespace",
OIDOID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 4, "reldatabase",
@@ -209,7 +209,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
}
else
{
- values[1] = ObjectIdGetDatum(fctx->record[i].relfilenode);
+ values[1] = Int8GetDatum(fctx->record[i].relfilenode);
nulls[1] = false;
values[2] = ObjectIdGetDatum(fctx->record[i].reltablespace);
nulls[2] = false;
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 879d2db..f72cd34 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1960,7 +1960,7 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
</para>
<para>
Name of the on-disk file of this relation; zero means this
diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
index e68d159..631cd2f 100644
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@@ -62,7 +62,7 @@
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
(references <link linkend="catalog-pg-class"><structname>pg_class</structname></link>.<structfield>relfilenode</structfield>)
</para>
<para>
diff --git a/src/backend/access/common/syncscan.c b/src/backend/access/common/syncscan.c
index cc20c98..4165b18 100644
--- a/src/backend/access/common/syncscan.c
+++ b/src/backend/access/common/syncscan.c
@@ -161,7 +161,7 @@ SyncScanShmemInit(void)
*/
item->location.relfilenode.spcNode = InvalidOid;
item->location.relfilenode.dbNode = InvalidOid;
- RelFileNodeSetRel(item->location.relfilenode, InvalidOid);
+ RelFileNodeSetRel(item->location.relfilenode, InvalidRelNode);
item->location.location = InvalidBlockNumber;
item->prev = (i > 0) ?
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 19b8e3c..0b523ec 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -100,7 +100,7 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
BlockNumber blknum;
BufferGetTag(buffer, &node, &forknum, &blknum);
- elog(ERROR, "failed to add item to index page in %u/%u/%u",
+ elog(ERROR, "failed to add item to index page in %u/%u/" INT64_FORMAT,
node.spcNode, node.dbNode, RelFileNodeGetRel(node));
}
}
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index 52ab941..7c601ef9 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -26,7 +26,7 @@ out_gistxlogPageUpdate(StringInfo buf, gistxlogPageUpdate *xlrec)
static void
out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec)
{
- appendStringInfo(buf, "rel %u/%u/%u; blk %u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; blk %u; latestRemovedXid %u:%u",
xlrec->node.spcNode, xlrec->node.dbNode,
RelFileNodeGetRel(xlrec->node), xlrec->block,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index dbfc788..86fc4dd 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -169,7 +169,7 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
{
xl_heap_new_cid *xlrec = (xl_heap_new_cid *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; tid %u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; tid %u/%u",
xlrec->target_node.spcNode,
xlrec->target_node.dbNode,
RelFileNodeGetRel(xlrec->target_node),
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index 7812b4b..1dfc3d57 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -100,7 +100,7 @@ btree_desc(StringInfo buf, XLogReaderState *record)
{
xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; latestRemovedXid %u:%u",
xlrec->node.spcNode, xlrec->node.dbNode,
RelFileNodeGetRel(xlrec->node),
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/seqdesc.c b/src/backend/access/rmgrdesc/seqdesc.c
index edb4470..e49e791 100644
--- a/src/backend/access/rmgrdesc/seqdesc.c
+++ b/src/backend/access/rmgrdesc/seqdesc.c
@@ -25,7 +25,7 @@ seq_desc(StringInfo buf, XLogReaderState *record)
xl_seq_rec *xlrec = (xl_seq_rec *) rec;
if (info == XLOG_SEQ_LOG)
- appendStringInfo(buf, "rel %u/%u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT,
xlrec->node.spcNode, xlrec->node.dbNode,
RelFileNodeGetRel(xlrec->node));
}
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index e7452af..9066566 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -45,8 +45,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
CheckPoint *checkpoint = (CheckPoint *) rec;
appendStringInfo(buf, "redo %X/%X; "
- "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; "
- "oldest xid %u in DB %u; oldest multi %u in DB %u; "
+ "tli %u; prev tli %u; fpw %s; xid %u:%u; relfilenode " INT64_FORMAT ";oid %u; "
+ "multi %u; offset %u; oldest xid %u in DB %u; oldest multi %u in DB %u; "
"oldest/newest commit timestamp xid: %u/%u; "
"oldest running xid %u; %s",
LSN_FORMAT_ARGS(checkpoint->redo),
@@ -55,6 +55,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->fullPageWrites ? "true" : "false",
EpochFromFullTransactionId(checkpoint->nextXid),
XidFromFullTransactionId(checkpoint->nextXid),
+ checkpoint->nextRelNode,
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
@@ -74,6 +75,13 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
memcpy(&nextOid, rec, sizeof(Oid));
appendStringInfo(buf, "%u", nextOid);
}
+ else if (info == XLOG_NEXT_RELFILENODE)
+ {
+ RelNode nextRelFilenode;
+
+ memcpy(&nextRelFilenode, rec, sizeof(RelNode));
+ appendStringInfo(buf, INT64_FORMAT, nextRelFilenode);
+ }
else if (info == XLOG_RESTORE_POINT)
{
xl_restore_point *xlrec = (xl_restore_point *) rec;
@@ -169,6 +177,9 @@ xlog_identify(uint8 info)
case XLOG_NEXTOID:
id = "NEXTOID";
break;
+ case XLOG_NEXT_RELFILENODE:
+ id = "NEXT_RELFILENODE";
+ break;
case XLOG_SWITCH:
id = "SWITCH";
break;
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 748120a..1361393 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -30,6 +30,9 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+/* Number of RelFileNodes to prefetch (preallocate) per XLOG write */
+#define VAR_RFN_PREFETCH 8192
+
/* pointer to "variable cache" in shared memory (set up by shmem.c) */
VariableCache ShmemVariableCache = NULL;
@@ -521,8 +524,7 @@ ForceTransactionIdLimitUpdate(void)
* wide, counter wraparound will occur eventually, and therefore it is unwise
* to assume they are unique unless precautions are taken to make them so.
* Hence, this routine should generally not be used directly. The only direct
- * callers should be GetNewOidWithIndex() and GetNewRelFileNode() in
- * catalog/catalog.c.
+ * callers should be GetNewOidWithIndex() in catalog/catalog.c.
*/
Oid
GetNewObjectId(void)
@@ -613,6 +615,52 @@ SetNextObjectId(Oid nextOid)
}
/*
+ * GetNewRelNode
+ *
+ * Similar to GetNewObjectId, but generates a new relnode instead of a new Oid.
+ */
+RelNode
+GetNewRelNode(void)
+{
+ RelNode result;
+
+ /* Safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot assign RelNode during recovery");
+
+ LWLockAcquire(RelNodeGenLock, LW_EXCLUSIVE);
+
+ /*
+ * Check for wraparound of the relnode counter.
+ *
+ * XXX Actually the relnode is 56 bits wide so we don't need to worry about
+ * the wraparound case.
+ */
+ if (ShmemVariableCache->nextRelNode > MAX_RELFILENODE)
+ {
+ ShmemVariableCache->nextRelNode = FirstNormalRelNode;
+ ShmemVariableCache->relnodecount = 0;
+ }
+
+ /* If we have run out of WAL-logged RelNodes, we must log more */
+ if (ShmemVariableCache->relnodecount == 0)
+ {
+ XLogPutNextRelFileNode(ShmemVariableCache->nextRelNode +
+ VAR_RFN_PREFETCH);
+
+ ShmemVariableCache->relnodecount = VAR_RFN_PREFETCH;
+ }
+
+ result = ShmemVariableCache->nextRelNode;
+ (ShmemVariableCache->nextRelNode)++;
+ (ShmemVariableCache->relnodecount)--;
+
+ LWLockRelease(RelNodeGenLock);
+
+ return result;
+}
+
+/*
* StopGeneratingPinnedObjectIds
*
* This is called once during initdb to force the OID counter up to
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 4c561a7..267f747 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1541,7 +1541,7 @@ checkXLogConsistency(XLogReaderState *record)
if (memcmp(replay_image_masked, primary_image_masked, BLCKSZ) != 0)
{
elog(FATAL,
- "inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
+ "inconsistent page found, rel %u/%u/" INT64_FORMAT ", forknum %u, blkno %u",
rnode.spcNode, rnode.dbNode, RelFileNodeGetRel(rnode),
forknum, blkno);
}
@@ -5396,6 +5396,7 @@ BootStrapXLOG(void)
checkPoint.nextXid =
FullTransactionIdFromEpochAndXid(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstGenbkiObjectId;
+ checkPoint.nextRelNode = FirstNormalRelNode;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
checkPoint.oldestXid = FirstNormalTransactionId;
@@ -5409,7 +5410,9 @@ BootStrapXLOG(void)
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelNode = checkPoint.nextRelNode;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->relnodecount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -7147,7 +7150,9 @@ StartupXLOG(void)
/* initialize shared memory variables from the checkpoint record */
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelNode = checkPoint.nextRelNode;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->relnodecount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -9259,6 +9264,12 @@ CreateCheckPoint(int flags)
checkPoint.nextOid += ShmemVariableCache->oidCount;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelNodeGenLock, LW_SHARED);
+ checkPoint.nextRelNode = ShmemVariableCache->nextRelNode;
+ if (!shutdown)
+ checkPoint.nextRelNode += ShmemVariableCache->relnodecount;
+ LWLockRelease(RelNodeGenLock);
+
MultiXactGetCheckptMulti(shutdown,
&checkPoint.nextMulti,
&checkPoint.nextMultiOffset,
@@ -10070,6 +10081,18 @@ XLogPutNextOid(Oid nextOid)
}
/*
+ * Similar to XLogPutNextOid(), but writes a NEXT_RELFILENODE log record
+ * instead of a NEXTOID one.
+ */
+void
+XLogPutNextRelFileNode(RelNode nextrelnode)
+{
+ XLogBeginInsert();
+ XLogRegisterData((char *) (&nextrelnode), sizeof(RelNode));
+ (void) XLogInsert(RM_XLOG_ID, XLOG_NEXT_RELFILENODE);
+}
+
+/*
* Write an XLOG SWITCH record.
*
* Here we just blindly issue an XLogInsert request for the record.
@@ -10331,6 +10354,16 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
}
+ if (info == XLOG_NEXT_RELFILENODE)
+ {
+ RelNode nextRelNode;
+
+ memcpy(&nextRelNode, XLogRecGetData(record), sizeof(RelNode));
+ LWLockAcquire(RelNodeGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelNode = nextRelNode;
+ ShmemVariableCache->relnodecount = 0;
+ LWLockRelease(RelNodeGenLock);
+ }
else if (info == XLOG_CHECKPOINT_SHUTDOWN)
{
CheckPoint checkPoint;
@@ -10344,6 +10377,10 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelNodeGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelNode = checkPoint.nextRelNode;
+ ShmemVariableCache->relnodecount = 0;
+ LWLockRelease(RelNodeGenLock);
MultiXactSetNextMXact(checkPoint.nextMulti,
checkPoint.nextMultiOffset);
@@ -10713,14 +10750,14 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blk);
if (forknum != MAIN_FORKNUM)
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, fork %u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", fork %u, blk %u",
block_id,
rnode.spcNode, rnode.dbNode,
RelFileNodeGetRel(rnode),
forknum,
blk);
else
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", blk %u",
block_id,
rnode.spcNode, rnode.dbNode,
RelFileNodeGetRel(rnode),
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 8b4c06c..2c64c68 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -593,17 +593,18 @@ CreateFakeRelcacheEntry(RelFileNode rnode)
rel->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
/* We don't know the name of the relation; use relfilenode instead */
- sprintf(RelationGetRelationName(rel), "%u", RelFileNodeGetRel(rnode));
+ sprintf(RelationGetRelationName(rel), INT64_FORMAT,
+ RelFileNodeGetRel(rnode));
/*
* We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNode may be
- * different from the relation's OID. It shouldn't really matter though.
+ * relation. Note that we set relId to FirstNormalObjectId, which is
+ * completely bogus. It shouldn't really matter, though.
* In recovery, we are running by ourselves and can't have any lock
* conflicts. While syncing, we already hold AccessExclusiveLock.
*/
rel->rd_lockInfo.lockRelId.dbId = rnode.dbNode;
- rel->rd_lockInfo.lockRelId.relId = RelFileNodeGetRel(rnode);
+ rel->rd_lockInfo.lockRelId.relId = FirstNormalObjectId;
rel->rd_smgr = NULL;
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index a0787b2..db8e3ff 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -472,27 +472,18 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
/*
* GetNewRelFileNode
- * Generate a new relfilenode number that is unique within the
- * database of the given tablespace.
+ * Generate a new relfilenode number.
*
- * If the relfilenode will also be used as the relation's OID, pass the
- * opened pg_class catalog, and this routine will guarantee that the result
- * is also an unused OID within pg_class. If the result is to be used only
- * as a relfilenode for an existing relation, pass NULL for pg_class.
- *
- * As with GetNewOidWithIndex(), there is some theoretical risk of a race
- * condition, but it doesn't seem worth worrying about.
- *
- * Note: we don't support using this in bootstrap mode. All relations
- * created by bootstrap have preassigned OIDs, so there's no need.
+ * The relfilenode is 56 bits wide and expected to be unique across the
+ * cluster, so if a file with the chosen name already exists, report an
+ * error rather than retrying.
*/
-Oid
-GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
+RelNode
+GetNewRelFileNode(Oid reltablespace, char relpersistence)
{
RelFileNodeBackend rnode;
char *rpath;
- bool collides;
BackendId backend;
+ RelNode relNode;
/*
* If we ever get here during pg_upgrade, there's something wrong; all
@@ -525,46 +516,16 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
* are properly detected.
*/
rnode.backend = backend;
+ relNode = GetNewRelNode();
+ RelFileNodeSetRel(rnode.node, relNode);
- do
- {
- Oid relnode;
-
- CHECK_FOR_INTERRUPTS();
-
- /* Generate the OID */
- if (pg_class)
- relnode = GetNewOidWithIndex(pg_class, ClassOidIndexId,
- Anum_pg_class_oid);
- else
- relnode = GetNewObjectId();
-
- RelFileNodeSetRel(rnode.node, relnode);
+ /* Check for existing file of same name */
+ rpath = relpath(rnode, MAIN_FORKNUM);
- /* Check for existing file of same name */
- rpath = relpath(rnode, MAIN_FORKNUM);
-
- if (access(rpath, F_OK) == 0)
- {
- /* definite collision */
- collides = true;
- }
- else
- {
- /*
- * Here we have a little bit of a dilemma: if errno is something
- * other than ENOENT, should we declare a collision and loop? In
- * practice it seems best to go ahead regardless of the errno. If
- * there is a colliding file we will get an smgr failure when we
- * attempt to create the new relation file.
- */
- collides = false;
- }
-
- pfree(rpath);
- } while (collides);
+ if (access(rpath, F_OK) == 0)
+ elog(ERROR, "new relfilenode file already exists: \"%s\"", rpath);
- return RelFileNodeGetRel(rnode.node);
+ return relNode;
}
/*
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 7e99de8..32ae8b4 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -93,7 +93,7 @@
Oid binary_upgrade_next_heap_pg_class_oid = InvalidOid;
Oid binary_upgrade_next_heap_pg_class_relfilenode = InvalidOid;
Oid binary_upgrade_next_toast_pg_class_oid = InvalidOid;
-Oid binary_upgrade_next_toast_pg_class_relfilenode = InvalidOid;
+RelNode binary_upgrade_next_toast_pg_class_relfilenode = InvalidRelNode;
static void AddNewRelationTuple(Relation pg_class_desc,
Relation new_rel_desc,
@@ -303,7 +303,7 @@ heap_create(const char *relname,
Oid relnamespace,
Oid reltablespace,
Oid relid,
- Oid relfilenode,
+ RelNode relfilenode,
Oid accessmtd,
TupleDesc tupDesc,
char relkind,
@@ -358,8 +358,8 @@ heap_create(const char *relname,
* If relfilenode is unspecified by the caller then create storage
* with oid same as relid.
*/
- if (!OidIsValid(relfilenode))
- relfilenode = relid;
+ if (!RelNodeIsValid(relfilenode))
+ relfilenode = GetNewRelFileNode(reltablespace, relpersistence);
}
/*
@@ -912,7 +912,7 @@ InsertPgClassTuple(Relation pg_class_desc,
values[Anum_pg_class_reloftype - 1] = ObjectIdGetDatum(rd_rel->reloftype);
values[Anum_pg_class_relowner - 1] = ObjectIdGetDatum(rd_rel->relowner);
values[Anum_pg_class_relam - 1] = ObjectIdGetDatum(rd_rel->relam);
- values[Anum_pg_class_relfilenode - 1] = ObjectIdGetDatum(rd_rel->relfilenode);
+ values[Anum_pg_class_relfilenode - 1] = Int64GetDatum(rd_rel->relfilenode);
values[Anum_pg_class_reltablespace - 1] = ObjectIdGetDatum(rd_rel->reltablespace);
values[Anum_pg_class_relpages - 1] = Int32GetDatum(rd_rel->relpages);
values[Anum_pg_class_reltuples - 1] = Float4GetDatum(rd_rel->reltuples);
@@ -1129,7 +1129,7 @@ heap_create_with_catalog(const char *relname,
Oid new_type_oid;
/* By default set to InvalidOid unless overridden by binary-upgrade */
- Oid relfilenode = InvalidOid;
+ RelNode relfilenode = InvalidRelNode;
TransactionId relfrozenxid;
MultiXactId relminmxid;
@@ -1187,8 +1187,7 @@ heap_create_with_catalog(const char *relname,
/*
* Allocate an OID for the relation, unless we were told what to use.
*
- * The OID will be the relfilenode as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
+ * Make sure that the OID doesn't collide with existing pg_class OIDs.
*/
if (!OidIsValid(relid))
{
@@ -1210,13 +1209,13 @@ heap_create_with_catalog(const char *relname,
relid = binary_upgrade_next_toast_pg_class_oid;
binary_upgrade_next_toast_pg_class_oid = InvalidOid;
- if (!OidIsValid(binary_upgrade_next_toast_pg_class_relfilenode))
+ if (!RelNodeIsValid(binary_upgrade_next_toast_pg_class_relfilenode))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("toast relfilenode value not set when in binary upgrade mode")));
relfilenode = binary_upgrade_next_toast_pg_class_relfilenode;
- binary_upgrade_next_toast_pg_class_relfilenode = InvalidOid;
+ binary_upgrade_next_toast_pg_class_relfilenode = InvalidRelNode;
}
}
else
@@ -1243,8 +1242,8 @@ heap_create_with_catalog(const char *relname,
}
if (!OidIsValid(relid))
- relid = GetNewRelFileNode(reltablespace, pg_class_desc,
- relpersistence);
+ relid = GetNewOidWithIndex(pg_class_desc, ClassOidIndexId,
+ Anum_pg_class_oid);
}
/*
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 5e3fc2b..ed48101 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -662,7 +662,7 @@ UpdateIndexRelation(Oid indexoid,
* parent index; otherwise InvalidOid.
* parentConstraintId: if creating a constraint on a partition, the OID
* of the constraint in the parent; otherwise InvalidOid.
- * relFileNode: normally, pass InvalidOid to get new storage. May be
+ * relFileNode: normally, pass InvalidRelNode to get new storage. May be
* nonzero to attach an existing valid build.
* indexInfo: same info executor uses to insert into the index
* indexColNames: column names to use for index (List of char *)
@@ -703,7 +703,7 @@ index_create(Relation heapRelation,
Oid indexRelationId,
Oid parentIndexRelid,
Oid parentConstraintId,
- Oid relFileNode,
+ RelNode relFileNode,
IndexInfo *indexInfo,
List *indexColNames,
Oid accessMethodObjectId,
@@ -735,7 +735,7 @@ index_create(Relation heapRelation,
char relkind;
TransactionId relfrozenxid;
MultiXactId relminmxid;
- bool create_storage = !OidIsValid(relFileNode);
+ bool create_storage = !RelNodeIsValid(relFileNode);
/* constraint flags can only be set when a constraint is requested */
Assert((constr_flags == 0) ||
@@ -902,8 +902,7 @@ index_create(Relation heapRelation,
/*
* Allocate an OID for the index, unless we were told what to use.
*
- * The OID will be the relfilenode as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
+ * Make sure it doesn't collide with existing pg_class OIDs.
*/
if (!OidIsValid(indexRelationId))
{
@@ -936,8 +935,8 @@ index_create(Relation heapRelation,
}
else
{
- indexRelationId =
- GetNewRelFileNode(tableSpaceId, pg_class, relpersistence);
+ indexRelationId = GetNewOidWithIndex(pg_class, ClassOidIndexId,
+ Anum_pg_class_oid);
}
}
@@ -1408,7 +1407,7 @@ index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId,
InvalidOid, /* indexRelationId */
InvalidOid, /* parentIndexRelid */
InvalidOid, /* parentConstraintId */
- InvalidOid, /* relFileNode */
+ InvalidRelNode, /* relFileNode */
newInfo,
indexColNames,
indexRelation->rd_rel->relam,
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 2e8efe4..2423003 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -1006,9 +1006,9 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
reltup2;
Form_pg_class relform1,
relform2;
- Oid relfilenode1,
+ RelNode relfilenode1,
relfilenode2;
- Oid swaptemp;
+ RelNode swaptemp;
char swptmpchr;
/* We need writable copies of both pg_class tuples. */
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 560dcc8..9ac827c 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -1086,7 +1086,7 @@ DefineIndex(Oid relationId,
* A valid stmt->oldNode implies that we already have a built form of the
* index. The caller should also decline any index build.
*/
- Assert(!OidIsValid(stmt->oldNode) || (skip_build && !concurrent));
+ Assert(!RelNodeIsValid(stmt->oldNode) || (skip_build && !concurrent));
/*
* Make the catalog entries for the index, including constraints. This
@@ -1316,7 +1316,7 @@ DefineIndex(Oid relationId,
childStmt->idxname = NULL;
childStmt->relation = NULL;
childStmt->indexOid = InvalidOid;
- childStmt->oldNode = InvalidOid;
+ childStmt->oldNode = InvalidRelNode;
childStmt->oldCreateSubid = InvalidSubTransactionId;
childStmt->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
@@ -2897,7 +2897,7 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
* particular this eliminates all shared catalogs.).
*/
if (RELKIND_HAS_STORAGE(classtuple->relkind) &&
- !OidIsValid(classtuple->relfilenode))
+ !RelNodeIsValid(classtuple->relfilenode))
skip_rel = true;
/*
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index 27cb630..72137f6 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -74,7 +74,7 @@ typedef struct sequence_magic
typedef struct SeqTableData
{
Oid relid; /* pg_class OID of this sequence (hash key) */
- Oid filenode; /* last seen relfilenode of this sequence */
+ RelNode filenode; /* last seen relfilenode of this sequence */
LocalTransactionId lxid; /* xact in which we last did a seq op */
bool last_valid; /* do we have a valid "last" value? */
int64 last; /* value last returned by nextval */
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 339d6eb..5b03ba7 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -3305,7 +3305,7 @@ CheckRelationTableSpaceMove(Relation rel, Oid newTableSpaceId)
void
SetRelationTableSpace(Relation rel,
Oid newTableSpaceId,
- Oid newRelFileNode)
+ RelNode newRelFileNode)
{
Relation pg_class;
HeapTuple tuple;
@@ -3325,7 +3325,7 @@ SetRelationTableSpace(Relation rel,
/* Update the pg_class row. */
rd_rel->reltablespace = (newTableSpaceId == MyDatabaseTableSpace) ?
InvalidOid : newTableSpaceId;
- if (OidIsValid(newRelFileNode))
+ if (RelNodeIsValid(newRelFileNode))
rd_rel->relfilenode = newRelFileNode;
CatalogTupleUpdate(pg_class, &tuple->t_self, tuple);
@@ -8573,7 +8573,7 @@ ATExecAddIndex(AlteredTableInfo *tab, Relation rel,
/* suppress schema rights check when rebuilding existing index */
check_rights = !is_rebuild;
/* skip index build if phase 3 will do it or we're reusing an old one */
- skip_build = tab->rewrite > 0 || OidIsValid(stmt->oldNode);
+ skip_build = tab->rewrite > 0 || RelNodeIsValid(stmt->oldNode);
/* suppress notices when rebuilding existing index */
quiet = is_rebuild;
@@ -8597,7 +8597,7 @@ ATExecAddIndex(AlteredTableInfo *tab, Relation rel,
* DROP of the old edition of this index will have scheduled the storage
* for deletion at commit, so cancel that pending deletion.
*/
- if (OidIsValid(stmt->oldNode))
+ if (RelNodeIsValid(stmt->oldNode))
{
Relation irel = index_open(address.objectId, NoLock);
@@ -14291,7 +14291,7 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
{
Relation rel;
Oid reltoastrelid;
- Oid newrelfilenode;
+ RelNode newrelfilenode;
RelFileNode newrnode;
List *reltoastidxids = NIL;
ListCell *lc;
@@ -14321,10 +14321,13 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
}
/*
- * Relfilenodes are not unique in databases across tablespaces, so we need
- * to allocate a new one in the new tablespace.
+ * Generate a new relfilenode. Although relfilenodes are unique within a
+ * cluster, we cannot reuse the old one: dropped relfilenode files are not
+ * unlinked until commit, so moving the relation back to its old
+ * tablespace within the same transaction would collide with the
+ * still-existing file.
*/
- newrelfilenode = GetNewRelFileNode(newTableSpace, NULL,
+ newrelfilenode = GetNewRelFileNode(newTableSpace,
rel->rd_rel->relpersistence);
/* Open old and new relation */
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 6bdad46..7372fc0 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2771,7 +2771,7 @@ _outIndexStmt(StringInfo str, const IndexStmt *node)
WRITE_NODE_FIELD(excludeOpNames);
WRITE_STRING_FIELD(idxcomment);
WRITE_OID_FIELD(indexOid);
- WRITE_OID_FIELD(oldNode);
+ WRITE_UINT64_FIELD(oldNode);
WRITE_UINT_FIELD(oldCreateSubid);
WRITE_UINT_FIELD(oldFirstRelfilenodeSubid);
WRITE_BOOL_FIELD(unique);
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index 99efa26..4b6b2ca 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -1577,7 +1577,7 @@ generateClonedIndexStmt(RangeVar *heapRel, Relation source_idx,
index->excludeOpNames = NIL;
index->idxcomment = NULL;
index->indexOid = InvalidOid;
- index->oldNode = InvalidOid;
+ index->oldNode = InvalidRelNode;
index->oldCreateSubid = InvalidSubTransactionId;
index->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
index->unique = idxrec->indisunique;
@@ -2200,7 +2200,7 @@ transformIndexConstraint(Constraint *constraint, CreateStmtContext *cxt)
index->excludeOpNames = NIL;
index->idxcomment = NULL;
index->indexOid = InvalidOid;
- index->oldNode = InvalidOid;
+ index->oldNode = InvalidRelNode;
index->oldCreateSubid = InvalidSubTransactionId;
index->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
index->transformed = false;
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 3fb5a92..23822c1 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -154,6 +154,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
break;
case XLOG_NOOP:
case XLOG_NEXTOID:
+ case XLOG_NEXT_RELFILENODE:
case XLOG_SWITCH:
case XLOG_BACKUP_END:
case XLOG_PARAMETER_CHANGE:
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 82e59b0..80a4693 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -4863,7 +4863,7 @@ DisplayMapping(HTAB *tuplecid_data)
hash_seq_init(&hstat, tuplecid_data);
while ((ent = (ReorderBufferTupleCidEnt *) hash_seq_search(&hstat)) != NULL)
{
- elog(DEBUG3, "mapping: node: %u/%u/%u tid: %u/%u cmin: %u, cmax: %u",
+ elog(DEBUG3, "mapping: node: %u/%u/" INT64_FORMAT " tid: %u/%u cmin: %u, cmax: %u",
ent->key.relnode.dbNode,
ent->key.relnode.spcNode,
RelFileNodeGetRel(ent->key.relnode),
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 4cf74aa..bd92630 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -2782,6 +2782,15 @@ BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
/* pinned, so OK to read tag without spinlock */
*rnode = bufHdr->tag.rnode;
+
+ /*
+ * Inside the BufferTag, rnode.relNode stores the fork number together
+ * with the relNode. Since we return the RelFileNode and ForkNumber
+ * separately here, extract the relNode part without the fork number and
+ * store that in the destination RelFileNode.
+ */
+ RelFileNodeSetRel(*rnode, RelFileNodeGetRel(*rnode));
+
*forknum = BUFFERTAG_GETFORK(bufHdr->tag);
*blknum = bufHdr->tag.blockNum;
}
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index 31d891e..0585584 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -273,7 +273,7 @@ restart:
BlockNumber blknum;
BufferGetTag(buf, &rnode, &forknum, &blknum);
- elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
+ elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/" INT64_FORMAT,
blknum, rnode.spcNode, rnode.dbNode,
RelFileNodeGetRel(rnode));
diff --git a/src/backend/storage/lmgr/lwlocknames.txt b/src/backend/storage/lmgr/lwlocknames.txt
index 6c7cf6c..1eb6d78 100644
--- a/src/backend/storage/lmgr/lwlocknames.txt
+++ b/src/backend/storage/lmgr/lwlocknames.txt
@@ -53,3 +53,4 @@ XactTruncationLock 44
# 45 was XactTruncationLock until removal of BackendRandomLock
WrapLimitsVacuumLock 46
NotifyQueueTailLock 47
+RelNodeGenLock 48
\ No newline at end of file
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 58181bf..357d4b0 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -850,7 +850,7 @@ Datum
pg_relation_filenode(PG_FUNCTION_ARGS)
{
Oid relid = PG_GETARG_OID(0);
- Oid result;
+ RelNode result;
HeapTuple tuple;
Form_pg_class relform;
@@ -870,15 +870,15 @@ pg_relation_filenode(PG_FUNCTION_ARGS)
else
{
/* no storage, return NULL */
- result = InvalidOid;
+ result = InvalidRelNode;
}
ReleaseSysCache(tuple);
- if (!OidIsValid(result))
+ if (!RelNodeIsValid(result))
PG_RETURN_NULL();
- PG_RETURN_OID(result);
+ PG_RETURN_INT64(result);
}
/*
@@ -898,11 +898,11 @@ Datum
pg_filenode_relation(PG_FUNCTION_ARGS)
{
Oid reltablespace = PG_GETARG_OID(0);
- Oid relfilenode = PG_GETARG_OID(1);
+ RelNode relfilenode = PG_GETARG_INT64(1);
Oid heaprel;
/* test needed so RelidByRelfilenode doesn't misbehave */
- if (!OidIsValid(relfilenode))
+ if (!RelNodeIsValid(relfilenode))
PG_RETURN_NULL();
heaprel = RelidByRelfilenode(reltablespace, relfilenode);
@@ -953,13 +953,13 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
else
{
/* no storage, return NULL */
- RelFileNodeSetRel(rnode, InvalidOid);
+ RelFileNodeSetRel(rnode, InvalidRelNode);
/* some compilers generate warnings without these next two lines */
rnode.dbNode = InvalidOid;
rnode.spcNode = InvalidOid;
}
- if (!OidIsValid(RelFileNodeGetRel(rnode)))
+ if (!RelNodeIsValid(RelFileNodeGetRel(rnode)))
{
ReleaseSysCache(tuple);
PG_RETURN_NULL();
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 67b9675e..ab8d148 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -142,10 +142,10 @@ binary_upgrade_set_next_toast_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_toast_relfilenode(PG_FUNCTION_ARGS)
{
- Oid nodeoid = PG_GETARG_OID(0);
+ RelNode relnode = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
- binary_upgrade_next_toast_pg_class_relfilenode = nodeoid;
+ binary_upgrade_next_toast_pg_class_relfilenode = relnode;
PG_RETURN_VOID();
}
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 0a949af..e6ae8e0 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1339,14 +1339,14 @@ RelationInitPhysicalAddr(Relation relation)
}
else
{
- Oid relnode;
+ RelNode relnode;
/* Consult the relation mapper */
relnode = RelationMapOidToFilenode(relation->rd_id,
relation->rd_rel->relisshared);
RelFileNodeSetRel(relation->rd_node, relnode);
- if (!OidIsValid(relnode))
+ if (!RelNodeIsValid(relnode))
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
RelationGetRelationName(relation), relation->rd_id);
}
@@ -1961,13 +1961,13 @@ formrdesc(const char *relationName, Oid relationReltype,
/*
* All relations made with formrdesc are mapped. This is necessarily so
* because there is no other way to know what filenode they currently
- * have. In bootstrap mode, add them to the initial relation mapper data,
- * specifying that the initial filenode is the same as the OID.
+ * have. In bootstrap mode, generate a new relfilenode and add them to the
+ * initial relation mapper data.
*/
- relation->rd_rel->relfilenode = InvalidOid;
+ relation->rd_rel->relfilenode = InvalidRelNode;
if (IsBootstrapProcessingMode())
RelationMapUpdateMap(RelationGetRelid(relation),
- RelationGetRelid(relation),
+ GetNewRelNode(),
isshared, true);
/*
@@ -3437,7 +3437,7 @@ RelationBuildLocalRelation(const char *relname,
TupleDesc tupDesc,
Oid relid,
Oid accessmtd,
- Oid relfilenode,
+ RelNode relfilenode,
Oid reltablespace,
bool shared_relation,
bool mapped_relation,
@@ -3608,7 +3608,7 @@ RelationBuildLocalRelation(const char *relname,
if (mapped_relation)
{
- rel->rd_rel->relfilenode = InvalidOid;
+ rel->rd_rel->relfilenode = InvalidRelNode;
/* Add it to the active mapping information */
RelationMapUpdateMap(relid, relfilenode, shared_relation, true);
}
@@ -3677,7 +3677,7 @@ RelationBuildLocalRelation(const char *relname,
void
RelationSetNewRelfilenode(Relation relation, char persistence)
{
- Oid newrelfilenode;
+ RelNode newrelfilenode;
Relation pg_class;
HeapTuple tuple;
Form_pg_class classform;
@@ -3686,7 +3686,7 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
RelFileNode newrnode;
/* Allocate a new relfilenode */
- newrelfilenode = GetNewRelFileNode(relation->rd_rel->reltablespace, NULL,
+ newrelfilenode = GetNewRelFileNode(relation->rd_rel->reltablespace,
persistence);
/*
diff --git a/src/backend/utils/cache/relfilenodemap.c b/src/backend/utils/cache/relfilenodemap.c
index 70c323c..34f7c0b 100644
--- a/src/backend/utils/cache/relfilenodemap.c
+++ b/src/backend/utils/cache/relfilenodemap.c
@@ -37,7 +37,7 @@ static ScanKeyData relfilenode_skey[2];
typedef struct
{
Oid reltablespace;
- Oid relfilenode;
+ RelNodeId relfilenode;
} RelfilenodeMapKey;
typedef struct
@@ -135,7 +135,7 @@ InitializeRelfilenodeMap(void)
* Returns InvalidOid if no relation matching the criteria could be found.
*/
Oid
-RelidByRelfilenode(Oid reltablespace, Oid relfilenode)
+RelidByRelfilenode(Oid reltablespace, RelNode relfilenode)
{
RelfilenodeMapKey key;
RelfilenodeMapEntry *entry;
@@ -155,7 +155,7 @@ RelidByRelfilenode(Oid reltablespace, Oid relfilenode)
MemSet(&key, 0, sizeof(key));
key.reltablespace = reltablespace;
- key.relfilenode = relfilenode;
+ RelNodeIDSetRelNode(key.relfilenode, relfilenode);
/*
* Check cache and return entry if one is found. Even if no target
@@ -196,7 +196,7 @@ RelidByRelfilenode(Oid reltablespace, Oid relfilenode)
/* set scan arguments */
skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
- skey[1].sk_argument = ObjectIdGetDatum(relfilenode);
+ skey[1].sk_argument = Int64GetDatum(relfilenode);
scandesc = systable_beginscan(relation,
ClassTblspcRelfilenodeIndexId,
@@ -213,7 +213,7 @@ RelidByRelfilenode(Oid reltablespace, Oid relfilenode)
if (found)
elog(ERROR,
- "unexpected duplicate for tablespace %u, relfilenode %u",
+ "unexpected duplicate for tablespace %u, relfilenode " INT64_FORMAT,
reltablespace, relfilenode);
found = true;
diff --git a/src/backend/utils/cache/relmapper.c b/src/backend/utils/cache/relmapper.c
index 38a097b..f6b7299 100644
--- a/src/backend/utils/cache/relmapper.c
+++ b/src/backend/utils/cache/relmapper.c
@@ -79,7 +79,7 @@
typedef struct RelMapping
{
Oid mapoid; /* OID of a catalog */
- Oid mapfilenode; /* its filenode number */
+ RelNodeId mapfilenode; /* its filenode number */
} RelMapping;
typedef struct RelMapFile
@@ -132,7 +132,7 @@ static RelMapFile pending_local_updates;
/* non-export function prototypes */
-static void apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode,
+static void apply_map_update(RelMapFile *map, Oid relationId, RelNode fileNode,
bool add_okay);
static void merge_map_updates(RelMapFile *map, const RelMapFile *updates,
bool add_okay);
@@ -155,7 +155,7 @@ static void perform_relmap_update(bool shared, const RelMapFile *updates);
* Returns InvalidOid if the OID is not known (which should never happen,
* but the caller is in a better position to report a meaningful error).
*/
-Oid
+RelNode
RelationMapOidToFilenode(Oid relationId, bool shared)
{
const RelMapFile *map;
@@ -168,13 +168,13 @@ RelationMapOidToFilenode(Oid relationId, bool shared)
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return RelNodeIDGetRelNode(map->mappings[i].mapfilenode);
}
map = &shared_map;
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return RelNodeIDGetRelNode(map->mappings[i].mapfilenode);
}
}
else
@@ -183,17 +183,17 @@ RelationMapOidToFilenode(Oid relationId, bool shared)
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return RelNodeIDGetRelNode(map->mappings[i].mapfilenode);
}
map = &local_map;
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return RelNodeIDGetRelNode(map->mappings[i].mapfilenode);
}
}
- return InvalidOid;
+ return InvalidRelNode;
}
/*
@@ -209,7 +209,7 @@ RelationMapOidToFilenode(Oid relationId, bool shared)
* relfilenode doesn't pertain to a mapped relation.
*/
Oid
-RelationMapFilenodeToOid(Oid filenode, bool shared)
+RelationMapFilenodeToOid(RelNode filenode, bool shared)
{
const RelMapFile *map;
int32 i;
@@ -220,13 +220,13 @@ RelationMapFilenodeToOid(Oid filenode, bool shared)
map = &active_shared_updates;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenode == RelNodeIDGetRelNode(map->mappings[i].mapfilenode))
return map->mappings[i].mapoid;
}
map = &shared_map;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenode == RelNodeIDGetRelNode(map->mappings[i].mapfilenode))
return map->mappings[i].mapoid;
}
}
@@ -235,13 +235,13 @@ RelationMapFilenodeToOid(Oid filenode, bool shared)
map = &active_local_updates;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenode == RelNodeIDGetRelNode(map->mappings[i].mapfilenode))
return map->mappings[i].mapoid;
}
map = &local_map;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenode == RelNodeIDGetRelNode(map->mappings[i].mapfilenode))
return map->mappings[i].mapoid;
}
}
@@ -258,7 +258,7 @@ RelationMapFilenodeToOid(Oid filenode, bool shared)
* immediately. Otherwise it is made pending until CommandCounterIncrement.
*/
void
-RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
+RelationMapUpdateMap(Oid relationId, RelNode fileNode, bool shared,
bool immediate)
{
RelMapFile *map;
@@ -316,7 +316,8 @@ RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
* add_okay = false to draw an error if not.
*/
static void
-apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode, bool add_okay)
+apply_map_update(RelMapFile *map, Oid relationId, RelNode fileNode,
+ bool add_okay)
{
int32 i;
@@ -325,7 +326,7 @@ apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode, bool add_okay)
{
if (relationId == map->mappings[i].mapoid)
{
- map->mappings[i].mapfilenode = fileNode;
+ RelNodeIDSetRelNode(map->mappings[i].mapfilenode, fileNode);
return;
}
}
@@ -337,7 +338,8 @@ apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode, bool add_okay)
if (map->num_mappings >= MAX_MAPPINGS)
elog(ERROR, "ran out of space in relation map");
map->mappings[map->num_mappings].mapoid = relationId;
- map->mappings[map->num_mappings].mapfilenode = fileNode;
+ RelNodeIDSetRelNode(map->mappings[map->num_mappings].mapfilenode,
+ fileNode);
map->num_mappings++;
}
@@ -356,7 +358,8 @@ merge_map_updates(RelMapFile *map, const RelMapFile *updates, bool add_okay)
{
apply_map_update(map,
updates->mappings[i].mapoid,
- updates->mappings[i].mapfilenode,
+ RelNodeIDGetRelNode(
+ updates->mappings[i].mapfilenode),
add_okay);
}
}
@@ -929,7 +932,7 @@ write_relmap_file(bool shared, RelMapFile *newmap,
rnode.spcNode = tsid;
rnode.dbNode = dbid;
- RelFileNodeSetRel(rnode, newmap->mappings[i].mapfilenode);
+ rnode.relNode = newmap->mappings[i].mapfilenode;
RelationPreserveStorage(rnode, false);
}
}
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index 781f8b8..85ed88c 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -79,8 +79,8 @@ pg_control_system(PG_FUNCTION_ARGS)
Datum
pg_control_checkpoint(PG_FUNCTION_ARGS)
{
- Datum values[18];
- bool nulls[18];
+ Datum values[19];
+ bool nulls[19];
TupleDesc tupdesc;
HeapTuple htup;
ControlFileData *ControlFile;
@@ -129,6 +129,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
XIDOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 18, "checkpoint_time",
TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 19, "next_relfilenode",
+ INT8OID, -1, 0);
tupdesc = BlessTupleDesc(tupdesc);
/* Read the control file. */
@@ -202,6 +204,9 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
values[17] = TimestampTzGetDatum(time_t_to_timestamptz(ControlFile->checkPointCopy.time));
nulls[17] = false;
+ values[18] = Int64GetDatum(ControlFile->checkPointCopy.nextRelNode);
+ nulls[18] = false;
+
htup = heap_form_tuple(tupdesc, values, nulls);
PG_RETURN_DATUM(HeapTupleGetDatum(htup));
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 7e69475..94ec594 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -520,9 +520,9 @@ main(int argc, char *argv[])
mode = PG_MODE_ENABLE;
break;
case 'f':
- if (!option_parse_int(optarg, "-f/--filenode", 0,
- INT_MAX,
- NULL))
+ if (!option_parse_int64(optarg, "-f/--filenode", 0,
+ LLONG_MAX,
+ NULL))
exit(1);
only_filenode = pstrdup(optarg);
break;
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index f911f98..2513fc3 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -250,6 +250,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
EpochFromFullTransactionId(ControlFile->checkPointCopy.nextXid),
XidFromFullTransactionId(ControlFile->checkPointCopy.nextXid));
+ printf(_("Latest checkpoint's NextRelFileNode: " INT64_FORMAT "\n"),
+ ControlFile->checkPointCopy.nextRelNode);
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 3499c0a..6672d82 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4637,12 +4637,12 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
{
PQExpBuffer upgrade_query = createPQExpBuffer();
PGresult *upgrade_res;
- Oid relfilenode;
+ RelNode relfilenode;
Oid toast_oid;
- Oid toast_relfilenode;
+ RelNode toast_relfilenode;
char relkind;
Oid toast_index_oid;
- Oid toast_index_relfilenode;
+ RelNode toast_index_relfilenode;
/*
* Preserve the OID and relfilenode of the table, table's index, table's
@@ -4668,11 +4668,11 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
relkind = *PQgetvalue(upgrade_res, 0, PQfnumber(upgrade_res, "relkind"));
- relfilenode = atooid(PQgetvalue(upgrade_res, 0,
+ relfilenode = atorelnode(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "relfilenode")));
toast_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "reltoastrelid")));
- toast_relfilenode = atooid(PQgetvalue(upgrade_res, 0,
+ toast_relfilenode = atorelnode(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "toast_relfilenode")));
toast_index_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "indexrelid")));
@@ -4693,9 +4693,9 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
* partitioned tables have a relfilenode, which should not be preserved
* when upgrading.
*/
- if (OidIsValid(relfilenode) && relkind != RELKIND_PARTITIONED_TABLE)
+ if (RelNodeIsValid(relfilenode) && relkind != RELKIND_PARTITIONED_TABLE)
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenode);
/*
@@ -4709,7 +4709,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_toast_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_relfilenode);
/* every toast table has an index */
@@ -4717,7 +4717,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_index_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_index_relfilenode);
}
@@ -4730,7 +4730,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
pg_class_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenode);
}
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index daebb11..832ba6d 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -513,7 +513,7 @@ isRelDataFile(const char *path)
unsigned int segNo;
int nmatch;
bool matched;
- Oid relnode;
+ RelNode relnode;
/*----
* Relation data files can be in one of the following directories:
@@ -536,11 +536,11 @@ isRelDataFile(const char *path)
*/
rnode.spcNode = InvalidOid;
rnode.dbNode = InvalidOid;
- relnode = InvalidOid;
+ relnode = InvalidRelNode;
segNo = 0;
matched = false;
- nmatch = sscanf(path, "global/%u.%u", &relnode, &segNo);
+ nmatch = sscanf(path, "global/" INT64_FORMAT ".%u", &relnode, &segNo);
if (nmatch == 1 || nmatch == 2)
{
rnode.spcNode = GLOBALTABLESPACE_OID;
@@ -549,7 +549,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "base/%u/%u.%u",
+ nmatch = sscanf(path, "base/%u/" INT64_FORMAT ".%u",
&rnode.dbNode, &relnode, &segNo);
if (nmatch == 2 || nmatch == 3)
{
@@ -558,7 +558,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/%u.%u",
+ nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/" INT64_FORMAT ".%u",
&rnode.spcNode, &rnode.dbNode, &relnode,
&segNo);
if (nmatch == 3 || nmatch == 4)
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 69ef231..d3c5d53 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -383,8 +383,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
i_reloid,
i_indtable,
i_toastheap,
- i_relfilenode,
i_reltablespace;
+ RelNode i_relfilenode;
char query[QUERY_ALLOC];
char *last_namespace = NULL,
*last_tablespace = NULL;
@@ -511,7 +511,7 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
relname = PQgetvalue(res, relnum, i_relname);
curr->relname = pg_strdup(relname);
- curr->relfilenode = atooid(PQgetvalue(res, relnum, i_relfilenode));
+ curr->relfilenode = atorelnode(PQgetvalue(res, relnum, i_relfilenode));
curr->tblsp_alloc = false;
/* Is the tablespace oid non-default? */
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 1db8e3f..a3503aa 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -122,7 +122,7 @@ typedef struct
char *nspname; /* namespace name */
char *relname; /* relation name */
Oid reloid; /* relation OID */
- Oid relfilenode; /* relation file node */
+ RelNode relfilenode; /* relation file node */
Oid indtable; /* if index, OID of its table, else 0 */
Oid toastheap; /* if toast table, OID of base table, else 0 */
char *tablespace; /* tablespace path; "" for cluster default */
@@ -146,7 +146,7 @@ typedef struct
const char *old_tablespace_suffix;
const char *new_tablespace_suffix;
Oid db_oid;
- Oid relfilenode;
+ RelNode relfilenode;
/* the rest are used only for logging and error reporting */
char *nspname; /* namespaces */
char *relname;
diff --git a/src/bin/pg_upgrade/relfilenode.c b/src/bin/pg_upgrade/relfilenode.c
index 2f4deb3..10e6a6c 100644
--- a/src/bin/pg_upgrade/relfilenode.c
+++ b/src/bin/pg_upgrade/relfilenode.c
@@ -190,14 +190,14 @@ transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_fro
else
snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
- snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
+ snprintf(old_file, sizeof(old_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->old_tablespace,
map->old_tablespace_suffix,
map->db_oid,
map->relfilenode,
type_suffix,
extent_suffix);
- snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
+ snprintf(new_file, sizeof(new_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->new_tablespace,
map->new_tablespace_suffix,
map->db_oid,
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 42edba0..cfc0e5a 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -518,14 +518,14 @@ XLogDumpDisplayRecord(XLogDumpConfig *config, XLogReaderState *record)
XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blk);
if (forknum != MAIN_FORKNUM)
- printf(", blkref #%d: rel %u/%u/%u fork %s blk %u",
+ printf(", blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rnode.spcNode, rnode.dbNode,
RelFileNodeGetRel(rnode),
forkNames[forknum],
blk);
else
- printf(", blkref #%d: rel %u/%u/%u blk %u",
+ printf(", blkref #%d: rel %u/%u/" INT64_FORMAT " blk %u",
block_id,
rnode.spcNode, rnode.dbNode,
RelFileNodeGetRel(rnode),
@@ -550,7 +550,7 @@ XLogDumpDisplayRecord(XLogDumpConfig *config, XLogReaderState *record)
continue;
XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blk);
- printf("\tblkref #%d: rel %u/%u/%u fork %s blk %u",
+ printf("\tblkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rnode.spcNode, rnode.dbNode, RelFileNodeGetRel(rnode),
forkNames[forknum],
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 636c96e..27b8547 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -138,7 +138,7 @@ GetDatabasePath(Oid dbNode, Oid spcNode)
* the trouble considering BackendId is just int anyway.
*/
char *
-GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
+GetRelationPath(Oid dbNode, Oid spcNode, RelNode relNode,
int backendId, ForkNumber forkNumber)
{
char *path;
@@ -149,10 +149,10 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
Assert(dbNode == 0);
Assert(backendId == InvalidBackendId);
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("global/%u_%s",
+ path = psprintf("global/" INT64_FORMAT "_%s",
relNode, forkNames[forkNumber]);
else
- path = psprintf("global/%u", relNode);
+ path = psprintf("global/" INT64_FORMAT, relNode);
}
else if (spcNode == DEFAULTTABLESPACE_OID)
{
@@ -160,21 +160,21 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/%u_%s",
+ path = psprintf("base/%u/" INT64_FORMAT "_%s",
dbNode, relNode,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/%u",
+ path = psprintf("base/%u/" INT64_FORMAT,
dbNode, relNode);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/t%d_%u_%s",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT "_%s",
dbNode, backendId, relNode,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/t%d_%u",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT,
dbNode, backendId, relNode);
}
}
@@ -184,24 +184,24 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT "_%s",
spcNode, TABLESPACE_VERSION_DIRECTORY,
dbNode, relNode,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT,
spcNode, TABLESPACE_VERSION_DIRECTORY,
dbNode, relNode);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT "_%s",
spcNode, TABLESPACE_VERSION_DIRECTORY,
dbNode, backendId, relNode,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT,
spcNode, TABLESPACE_VERSION_DIRECTORY,
dbNode, backendId, relNode);
}
diff --git a/src/fe_utils/option_utils.c b/src/fe_utils/option_utils.c
index abea881..2cb3370 100644
--- a/src/fe_utils/option_utils.c
+++ b/src/fe_utils/option_utils.c
@@ -82,3 +82,45 @@ option_parse_int(const char *optarg, const char *optname,
*result = val;
return true;
}
+
+/*
+ * option_parse_int64
+ *
+ * Same as option_parse_int, but parses an int64.
+ */
+bool
+option_parse_int64(const char *optarg, const char *optname,
+ int64 min_range, int64 max_range,
+ int64 *result)
+{
+ char *endptr;
+ int64 val;
+
+ errno = 0;
+ val = strtoi64(optarg, &endptr, 10);
+
+ /*
+ * Skip any trailing whitespace; if anything but whitespace remains before
+ * the terminating character, fail.
+ */
+ while (*endptr != '\0' && isspace((unsigned char) *endptr))
+ endptr++;
+
+ if (*endptr != '\0')
+ {
+ pg_log_error("invalid value \"%s\" for option %s",
+ optarg, optname);
+ return false;
+ }
+
+ if (errno == ERANGE || val < min_range || val > max_range)
+ {
+ pg_log_error("%s must be in range " INT64_FORMAT ".." INT64_FORMAT,
+ optname, min_range, max_range);
+ return false;
+ }
+
+ if (result)
+ *result = val;
+ return true;
+}
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 9a2816d..8113335 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -217,6 +217,9 @@ typedef struct VariableCacheData
*/
Oid nextOid; /* next OID to assign */
uint32 oidCount; /* OIDs available before must do XLOG work */
+ RelNode nextRelNode; /* next relfilenode to assign */
+ uint32 relnodecount; /* relfilenodes available before must do XLOG
+ work */
/*
* These fields are protected by XidGenLock.
@@ -298,6 +301,7 @@ extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
extern bool ForceTransactionIdLimitUpdate(void);
extern Oid GetNewObjectId(void);
+extern RelNode GetNewRelNode(void);
extern void StopGeneratingPinnedObjectIds(void);
#ifdef USE_ASSERT_CHECKING
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 5f934dd..35f5832 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -303,6 +303,7 @@ extern bool CreateRestartPoint(int flags);
extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN);
extern XLogRecPtr CalculateMaxmumSafeLSN(void);
extern void XLogPutNextOid(Oid nextOid);
+extern void XLogPutNextRelFileNode(RelNode nextrelnode);
extern XLogRecPtr XLogRestorePoint(const char *rpName);
extern void UpdateFullPageWrites(void);
extern void GetFullPageWriteInfo(XLogRecPtr *RedoRecPtr_p, bool *doPageWrites_p);
diff --git a/src/include/catalog/binary_upgrade.h b/src/include/catalog/binary_upgrade.h
index 0b6944b..d2b45ba 100644
--- a/src/include/catalog/binary_upgrade.h
+++ b/src/include/catalog/binary_upgrade.h
@@ -26,7 +26,7 @@ extern PGDLLIMPORT Oid binary_upgrade_next_heap_pg_class_relfilenode;
extern PGDLLIMPORT Oid binary_upgrade_next_index_pg_class_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_index_pg_class_relfilenode;
extern PGDLLIMPORT Oid binary_upgrade_next_toast_pg_class_oid;
-extern PGDLLIMPORT Oid binary_upgrade_next_toast_pg_class_relfilenode;
+extern PGDLLIMPORT RelNode binary_upgrade_next_toast_pg_class_relfilenode;
extern PGDLLIMPORT Oid binary_upgrade_next_pg_enum_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_pg_authid_oid;
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index 60c1215..1b83c79 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -15,6 +15,7 @@
#define CATALOG_H
#include "catalog/pg_class.h"
+#include "storage/relfilenode.h"
#include "utils/relcache.h"
@@ -38,7 +39,6 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
-extern Oid GetNewRelFileNode(Oid reltablespace, Relation pg_class,
- char relpersistence);
+extern RelNode GetNewRelFileNode(Oid reltablespace, char relpersistence);
#endif /* CATALOG_H */
diff --git a/src/include/catalog/heap.h b/src/include/catalog/heap.h
index c4757bd..66d41af 100644
--- a/src/include/catalog/heap.h
+++ b/src/include/catalog/heap.h
@@ -50,7 +50,7 @@ extern Relation heap_create(const char *relname,
Oid relnamespace,
Oid reltablespace,
Oid relid,
- Oid relfilenode,
+ RelNode relfilenode,
Oid accessmtd,
TupleDesc tupDesc,
char relkind,
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index a1d6e3b..1e79ec9 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -71,7 +71,7 @@ extern Oid index_create(Relation heapRelation,
Oid indexRelationId,
Oid parentIndexRelid,
Oid parentConstraintId,
- Oid relFileNode,
+ RelNode relFileNode,
IndexInfo *indexInfo,
List *indexColNames,
Oid accessMethodObjectId,
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index 304e8c1..4659ed3 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -52,13 +52,13 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* access method; 0 if not a table / index */
Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
- /* identifier of physical storage file */
- /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
- Oid relfilenode BKI_DEFAULT(0);
-
/* identifier of table space for relation (0 means default for database) */
Oid reltablespace BKI_DEFAULT(0) BKI_LOOKUP_OPT(pg_tablespace);
+ /* identifier of physical storage file */
+ /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
+ int64 relfilenode BKI_DEFAULT(0);
+
/* # of blocks (not always up-to-date) */
int32 relpages BKI_DEFAULT(0);
@@ -154,7 +154,7 @@ typedef FormData_pg_class *Form_pg_class;
DECLARE_UNIQUE_INDEX_PKEY(pg_class_oid_index, 2662, ClassOidIndexId, on pg_class using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_class_relname_nsp_index, 2663, ClassNameNspIndexId, on pg_class using btree(relname name_ops, relnamespace oid_ops));
-DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode oid_ops));
+DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode int8_ops));
#ifdef EXPOSE_TO_CLIENT_CODE
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 1f3dc24..27d584d 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -41,6 +41,7 @@ typedef struct CheckPoint
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
FullTransactionId nextXid; /* next free transaction ID */
+ RelNode nextRelNode; /* next relfile node */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
@@ -78,6 +79,7 @@ typedef struct CheckPoint
#define XLOG_FPI 0xB0
/* 0xC0 is used in Postgres 9.5-11 */
#define XLOG_OVERWRITE_CONTRECORD 0xD0
+#define XLOG_NEXT_RELFILENODE 0xE0
/*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 7024dbe..2fcca8d 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7275,11 +7275,11 @@
proname => 'pg_indexes_size', provolatile => 'v', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_indexes_size' },
{ oid => '2999', descr => 'filenode identifier of relation',
- proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'oid',
+ proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_relation_filenode' },
{ oid => '3454', descr => 'relation OID for filenode and tablespace',
proname => 'pg_filenode_relation', provolatile => 's',
- prorettype => 'regclass', proargtypes => 'oid oid',
+ prorettype => 'regclass', proargtypes => 'oid int8',
prosrc => 'pg_filenode_relation' },
{ oid => '3034', descr => 'file path of relation',
proname => 'pg_relation_filepath', provolatile => 's', prorettype => 'text',
@@ -11055,7 +11055,7 @@
prosrc => 'binary_upgrade_set_next_index_relfilenode' },
{ oid => '4547', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_toast_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_toast_relfilenode' },
{ oid => '4548', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_pg_tablespace_oid', provolatile => 'v',
diff --git a/src/include/commands/tablecmds.h b/src/include/commands/tablecmds.h
index 5d4037f..297c20b 100644
--- a/src/include/commands/tablecmds.h
+++ b/src/include/commands/tablecmds.h
@@ -66,7 +66,7 @@ extern void SetRelationHasSubclass(Oid relationId, bool relhassubclass);
extern bool CheckRelationTableSpaceMove(Relation rel, Oid newTableSpaceId);
extern void SetRelationTableSpace(Relation rel, Oid newTableSpaceId,
- Oid newRelFileNode);
+ RelNode newRelFileNode);
extern ObjectAddress renameatt(RenameStmt *stmt);
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index 2b09ad1..3e7951b 100644
--- a/src/include/common/relpath.h
+++ b/src/include/common/relpath.h
@@ -66,7 +66,7 @@ extern int forkname_chars(const char *str, ForkNumber *fork);
*/
extern char *GetDatabasePath(Oid dbNode, Oid spcNode);
-extern char *GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
+extern char *GetRelationPath(Oid dbNode, Oid spcNode, RelNode relNode,
int backendId, ForkNumber forkNumber);
/*
diff --git a/src/include/fe_utils/option_utils.h b/src/include/fe_utils/option_utils.h
index 03c09fd..8c0e818 100644
--- a/src/include/fe_utils/option_utils.h
+++ b/src/include/fe_utils/option_utils.h
@@ -22,5 +22,8 @@ extern void handle_help_version_opts(int argc, char *argv[],
extern bool option_parse_int(const char *optarg, const char *optname,
int min_range, int max_range,
int *result);
+extern bool option_parse_int64(const char *optarg, const char *optname,
+ int64 min_range, int64 max_range,
+ int64 *result);
#endif /* OPTION_UTILS_H */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 37fcc4c..40a92bc 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -2901,7 +2901,7 @@ typedef struct IndexStmt
List *excludeOpNames; /* exclusion operator names, or NIL if none */
char *idxcomment; /* comment to apply to index, or NULL */
Oid indexOid; /* OID of an existing index, if any */
- Oid oldNode; /* relfilenode of existing storage, if any */
+ RelNode oldNode; /* relfilenode of existing storage, if any */
SubTransactionId oldCreateSubid; /* rd_createSubid of oldNode */
SubTransactionId oldFirstRelfilenodeSubid; /* rd_firstRelfilenodeSubid of
* oldNode */
diff --git a/src/include/postgres_ext.h b/src/include/postgres_ext.h
index fdb61b7..7454933 100644
--- a/src/include/postgres_ext.h
+++ b/src/include/postgres_ext.h
@@ -46,6 +46,21 @@ typedef unsigned int Oid;
/* Define a signed 64-bit integer type for use in client API declarations. */
typedef PG_INT64_TYPE pg_int64;
+/*
+ * The RelNode data type identifies a specific relation file. RelNode values
+ * are unique within a cluster.
+ *
+ * XXX ideally this would be uint64, but int8 is currently the only exposed
+ * 64-bit datatype, so perhaps we should add a new relnode datatype that is
+ * an unsigned 8-byte integer.
+ */
+typedef pg_int64 RelNode;
+
+#define atorelnode(x) ((RelNode) strtoull((x), NULL, 10))
+
+#define InvalidRelNode ((RelNode) 0)
+#define FirstNormalRelNode ((RelNode) 1)
+#define RelNodeIsValid(relNode) ((bool) ((relNode) != InvalidRelNode))
/*
* Identifiers of error message fields. Kept here to keep common
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 38ac9ea..d6bf90c 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -21,6 +21,7 @@
#include "storage/condition_variable.h"
#include "storage/latch.h"
#include "storage/lwlock.h"
+#include "storage/relfilenode.h"
#include "storage/shmem.h"
#include "storage/smgr.h"
#include "storage/spin.h"
@@ -91,7 +92,6 @@
typedef struct buftag
{
RelFileNode rnode; /* physical relation identifier */
- ForkNumber forkNum;
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
@@ -99,15 +99,15 @@ typedef struct buftag
( \
(a).rnode.spcNode = InvalidOid, \
(a).rnode.dbNode = InvalidOid, \
- RelFileNodeSetRel((a).rnode, InvalidOid), \
- (a).forkNum = InvalidForkNumber, \
+ RelFileNodeSetRel((a).rnode, InvalidRelNode), \
+ RelFileNodeSetFork((a).rnode, (InvalidForkNumber)), \
(a).blockNum = InvalidBlockNumber \
)
#define INIT_BUFFERTAG(a,xx_rnode,xx_forkNum,xx_blockNum) \
( \
(a).rnode = (xx_rnode), \
- (a).forkNum = (xx_forkNum), \
+ RelFileNodeSetFork((a).rnode, (xx_forkNum)), \
(a).blockNum = (xx_blockNum) \
)
@@ -115,12 +115,12 @@ typedef struct buftag
( \
RelFileNodeEquals((a).rnode, (b).rnode) && \
(a).blockNum == (b).blockNum && \
- (a).forkNum == (b).forkNum \
+ RelFileNodeGetFork((a).rnode) == RelFileNodeGetFork((b).rnode) \
)
/* Macro to get the fork number from the buffer tag. */
#define BUFFERTAG_GETFORK(a) \
- ((a).forkNum)
+ RelFileNodeGetFork((a).rnode)
/*
* The shared buffer mapping table is partitioned to reduce contention.
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
index 2ff20c2..9359ca2 100644
--- a/src/include/storage/relfilenode.h
+++ b/src/include/storage/relfilenode.h
@@ -18,6 +18,18 @@
#include "storage/backendid.h"
/*
+ * RelNodeId:
+ *
+ * This is the storage type for RelFileNode.relNode. Rather than a single
+ * uint64, we store two uint32 fields to avoid alignment padding.
+ */
+typedef struct RelNodeId
+{
+ uint32 rn_hi;
+ uint32 rn_lo;
+} RelNodeId;
+
+/*
* RelFileNode must provide all that we need to know to physically access
* a relation, with the exception of the backend ID, which can be provided
* separately. Note, however, that a "physical" relation is comprised of
@@ -31,11 +43,14 @@
* "shared" relations (those common to all databases of a cluster).
* Nonzero dbNode values correspond to pg_database.oid.
*
- * relNode identifies the specific relation. relNode corresponds to
- * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
- * to assign new physical files to relations in some situations).
- * Notice that relNode is only unique within a database in a particular
- * tablespace.
+ * relNode identifies the specific relation and its fork number. The high
+ * 8 bits represent the fork number and the remaining 56 bits represent the
+ * relation. relNode corresponds to pg_class.relfilenode (NOT pg_class.oid).
+ * Notice that relNode is unique within a cluster.
+ *
+ * Note: only when the RelFileNode is part of a BufferTag do the high 8 bits
+ * of relNode carry the fork number; otherwise they are cleared.
*
* Note: spcNode must be GLOBALTABLESPACE_OID if and only if dbNode is
* zero. We support shared relations only in the "global" tablespace.
@@ -53,12 +68,14 @@
* Note: various places use RelFileNode in hashtable keys. Therefore,
* there *must not* be any unused padding bytes in this struct. That
* should be safe as long as all the fields are of type Oid.
+ *
+ * We use RelNodeId in order to avoid the alignment padding.
*/
typedef struct RelFileNode
{
Oid spcNode; /* tablespace */
Oid dbNode; /* database */
- Oid relNode; /* relation */
+ RelNodeId relNode; /* relation */
} RelFileNode;
/*
@@ -78,12 +95,42 @@ typedef struct RelFileNodeBackend
#define RelFileNodeBackendIsTemp(rnode) \
((rnode).backend != InvalidBackendId)
-/* Macros to get and set the relNode member of the RelFileNode structure. */
-#define RelFileNodeGetRel(node) \
- ((node).relNode)
+/*
+ * These macros access the relation number stored in the low-order 56 bits of
+ * RelFileNode.relNode; the remaining 8 high-order bits identify the
+ * relation's fork number.
+ */
+#define RELFILENODE_RELNODE_BITS 56
+#define RELFILENODE_RELNODE_MASK ((((uint64) 1) << RELFILENODE_RELNODE_BITS) - 1)
+#define MAX_RELFILENODE RELFILENODE_RELNODE_MASK
+
+/* Retrieve the RelNode from a RelNodeId. */
+#define RelNodeIDGetRelNode(rnodeid) \
+ (uint64) (((uint64) (rnodeid).rn_hi << 32) | ((uint32) (rnodeid).rn_lo))
+
+/* Store the given value in RelNodeId. */
+#define RelNodeIDSetRelNode(rnodeid, val) \
+( \
+ (rnodeid).rn_hi = (val) >> 32, \
+ (rnodeid).rn_lo = (val) & 0xffffffff \
+)
+
+/* Gets the relfilenode stored in rnode.relNode. */
+#define RelFileNodeGetRel(rnode) \
+ (RelNodeIDGetRelNode((rnode).relNode) & RELFILENODE_RELNODE_MASK)
+
+/* Gets the fork number stored in rnode.relNode. */
+#define RelFileNodeGetFork(rnode) \
+ (RelNodeIDGetRelNode((rnode).relNode) >> RELFILENODE_RELNODE_BITS)
+
+/* Sets input val in the relfilenode part of the rnode.relNode. */
+#define RelFileNodeSetRel(rnode, val) \
+ RelNodeIDSetRelNode((rnode).relNode, (val) & RELFILENODE_RELNODE_MASK)
-#define RelFileNodeSetRel(node, relnode) \
- ((node).relNode = (relnode))
+/* Sets input val in the fork number part of the rnode.relNode. */
+#define RelFileNodeSetFork(rnode, val) \
+ RelNodeIDSetRelNode((rnode).relNode, \
+ (RelNodeIDGetRelNode((rnode).relNode)) | \
+ ((uint64) (val) << RELFILENODE_RELNODE_BITS))
/*
* Note: RelFileNodeEquals and RelFileNodeBackendEquals compare relNode first
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 6da1b22..a47ede3 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -526,7 +526,7 @@ typedef struct ViewOptions
*/
#define RelationIsMapped(relation) \
(RELKIND_HAS_STORAGE((relation)->rd_rel->relkind) && \
- ((relation)->rd_rel->relfilenode == InvalidOid))
+ ((relation)->rd_rel->relfilenode == InvalidRelNode))
/*
* RelationGetSmgr
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 84d6afe..5d13660 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -102,7 +102,7 @@ extern Relation RelationBuildLocalRelation(const char *relname,
TupleDesc tupDesc,
Oid relid,
Oid accessmtd,
- Oid relfilenode,
+ RelNode relfilenode,
Oid reltablespace,
bool shared_relation,
bool mapped_relation,
diff --git a/src/include/utils/relfilenodemap.h b/src/include/utils/relfilenodemap.h
index 77d8046..d324981 100644
--- a/src/include/utils/relfilenodemap.h
+++ b/src/include/utils/relfilenodemap.h
@@ -13,6 +13,6 @@
#ifndef RELFILENODEMAP_H
#define RELFILENODEMAP_H
-extern Oid RelidByRelfilenode(Oid reltablespace, Oid relfilenode);
+extern Oid RelidByRelfilenode(Oid reltablespace, RelNode relfilenode);
#endif /* RELFILENODEMAP_H */
diff --git a/src/include/utils/relmapper.h b/src/include/utils/relmapper.h
index 9fbb5a7..58234a8 100644
--- a/src/include/utils/relmapper.h
+++ b/src/include/utils/relmapper.h
@@ -35,11 +35,11 @@ typedef struct xl_relmap_update
#define MinSizeOfRelmapUpdate offsetof(xl_relmap_update, data)
-extern Oid RelationMapOidToFilenode(Oid relationId, bool shared);
+extern RelNode RelationMapOidToFilenode(Oid relationId, bool shared);
-extern Oid RelationMapFilenodeToOid(Oid relationId, bool shared);
+extern Oid RelationMapFilenodeToOid(RelNode relationId, bool shared);
-extern void RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
+extern void RelationMapUpdateMap(Oid relationId, RelNode fileNode, bool shared,
bool immediate);
extern void RelationMapRemoveMapping(Oid relationId);
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index 16e0475..58aeddb 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -2164,7 +2164,6 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
else 'OTHER'
end as storage,
@@ -2175,10 +2174,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+---------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | t | own | child 0 index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | t | own | child 1 index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | t | orig | child 0 index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | t | orig | child 1 index
at_partitioned_id_name_key | t | none | parent index
(6 rows)
@@ -2198,7 +2197,6 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
else 'OTHER'
end as storage,
@@ -2209,10 +2207,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+--------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | f | own | parent index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | f | own | parent index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | f | OTHER | parent index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | f | OTHER | parent index
at_partitioned_id_name_key | f | none | parent index
(6 rows)
@@ -2556,7 +2554,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql
index ac894c0..250e6cd 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -1478,7 +1478,6 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
else 'OTHER'
end as storage,
@@ -1499,7 +1498,6 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
else 'OTHER'
end as storage,
@@ -1638,7 +1636,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
--
1.8.3.1
On Mon, Feb 7, 2022 at 12:26 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
I have split the patch into multiple patches which are
independently committable and easy to review. I have explained the
purpose and scope of each patch in the respective commit messages.
Hmm. The parts of this I've looked at seem reasonably clean, but I
don't think I like the design choice. You're inventing
RelFileNodeSetFork(), but at present the RelFileNode struct doesn't
include a fork number. I feel like we should leave that alone, and
only change the definition of a BufferTag. What about adding accessors
for all of the BufferTag fields in 0001, and then in 0002 change it to
look something like this:
typedef struct BufferTag
{
Oid dbOid;
Oid tablespaceOid;
uint32 fileNode_low;
uint32 fileNode_hi:24;
uint32 forkNumber:8;
BlockNumber blockNumber;
} BufferTag;
--
Robert Haas
EDB: http://www.enterprisedb.com
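To make the proposed layout concrete, here is a minimal standalone C sketch of that BufferTag shape with pack/unpack helpers for the 56-bit file node. This is illustrative only: the accessor names and the bit-shuffling are my own, not from any posted patch, and exact bitfield layout is implementation-defined (though on the usual ABIs the 24-bit and 8-bit fields share one 32-bit word).

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t Oid;
typedef uint32_t BlockNumber;

/* The shape sketched above: a 56-bit file node split across one full
 * 32-bit word plus a 24-bit bitfield that shares a word with the fork. */
typedef struct BufferTag
{
    Oid         dbOid;
    Oid         tablespaceOid;
    uint32_t    fileNode_low;       /* low 32 bits of the file node */
    uint32_t    fileNode_hi:24;     /* high 24 bits of the file node */
    uint32_t    forkNumber:8;       /* fork number packed alongside */
    BlockNumber blockNumber;
} BufferTag;

/* Hypothetical accessors, in the spirit of the proposed 0001 patch. */
static inline uint64_t
BufTagGetFileNode(const BufferTag *tag)
{
    return ((uint64_t) tag->fileNode_hi << 32) | tag->fileNode_low;
}

static inline void
BufTagSetFileNode(BufferTag *tag, uint64_t filenode)
{
    tag->fileNode_low = (uint32_t) filenode;
    tag->fileNode_hi = (uint32_t) ((filenode >> 32) & 0xFFFFFF);
}
```

With a typical ABI the whole tag stays five 32-bit words (20 bytes), the same as today's RelFileNode-based BufferTag, so widening the file node costs no extra buffer-table space.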
On Mon, Feb 7, 2022 at 9:42 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Feb 7, 2022 at 12:26 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
I have split the patch into multiple patches which are
independently committable and easy to review. I have explained the
purpose and scope of each patch in the respective commit messages.

Hmm. The parts of this I've looked at seem reasonably clean, but I
don't think I like the design choice. You're inventing
RelFileNodeSetFork(), but at present the RelFileNode struct doesn't
include a fork number. I feel like we should leave that alone, and
only change the definition of a BufferTag. What about adding accessors
for all of the BufferTag fields in 0001, and then in 0002 change it to
look something like this:

typedef struct BufferTag
{
Oid dbOid;
Oid tablespaceOid;
uint32 fileNode_low;
uint32 fileNode_hi:24;
uint32 forkNumber:8;
BlockNumber blockNumber;
} BufferTag;
Okay, we can do that. But we cannot leave RelFileNode untouched: inside
RelFileNode we will also have to change relNode into two 32-bit
integers, like below.
typedef struct RelFileNode
{
Oid spcNode;
Oid dbNode;
uint32 relNode_low;
uint32 relNode_hi;
} RelFileNode;
For RelFileNode we also need to use two 32-bit integers so that we do
not add extra alignment padding, because several other structures embed
RelFileNode, e.g. xl_xact_relfilenodes, RelFileNodeBackend, and many
others.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
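The padding concern above is easy to check with a throwaway program. Here is a minimal, self-contained sketch (the struct names are illustrative, not from the patch): on a typical LP64 ABI a bare uint64 member raises the struct's alignment to 8, so any record that embeds it after a 4-byte count field (like nrels in xl_xact_relfilenodes) picks up 4 bytes of padding that the two-halves layout avoids.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;

/* relNode as one 64-bit integer: the struct's alignment becomes 8 */
typedef struct RelFileNode64
{
    Oid      spcNode;
    Oid      dbNode;
    uint64_t relNode;
} RelFileNode64;

/* relNode as two 32-bit halves: the struct's alignment stays 4 */
typedef struct RelFileNodeSplit
{
    Oid      spcNode;
    Oid      dbNode;
    uint32_t relNode_low;
    uint32_t relNode_hi;
} RelFileNodeSplit;

/* Toy stand-ins for a WAL record: a 4-byte count followed by an array.
 * With the uint64 layout the array must start at an 8-aligned offset,
 * inserting 4 padding bytes after nrels; the split layout does not. */
typedef struct { int nrels; RelFileNode64 rnodes[1]; } Records64;
typedef struct { int nrels; RelFileNodeSplit rnodes[1]; } RecordsSplit;
```

Both RelFileNode variants are 16 bytes; only the embedded case differs, which is exactly the xl_xact_relfilenodes scenario discussed above.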
On Mon, Feb 7, 2022 at 11:31 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
For RelFileNode we also need to use two 32-bit integers so that we do
not add extra alignment padding, because several other structures embed
RelFileNode, e.g. xl_xact_relfilenodes, RelFileNodeBackend, and many
others.
Are you sure that kind of stuff is really important enough to justify
the code churn? I don't think RelFileNodeBackend is used widely enough
or in sufficiently performance-critical places that we really need to
care about a few bytes of alignment padding. xl_xact_relfilenodes is
more concerning because that goes into the WAL format, but I don't
know that we use it often enough for an extra 4 bytes per record to
really matter, especially considering that this proposal also adds 4
bytes *per relfilenode* which has to be a much bigger deal than a few
padding bytes after 'nrels'. The reason why BufferTag matters a lot is
because (1) we have an array of this struct that can easily contain a
million or eight entries, so the alignment padding adds up a lot more
and (2) access to that array is one of the most performance-critical
parts of PostgreSQL.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Mon, Feb 7, 2022 at 10:13 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Feb 7, 2022 at 11:31 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
For RelFileNode we also need to use two 32-bit integers so that we do
not add extra alignment padding, because several other structures embed
RelFileNode, e.g. xl_xact_relfilenodes, RelFileNodeBackend, and many
others.

Are you sure that kind of stuff is really important enough to justify
the code churn? I don't think RelFileNodeBackend is used widely enough
or in sufficiently performance-critical places that we really need to
care about a few bytes of alignment padding. xl_xact_relfilenodes is
more concerning because that goes into the WAL format, but I don't
know that we use it often enough for an extra 4 bytes per record to
really matter, especially considering that this proposal also adds 4
bytes *per relfilenode* which has to be a much bigger deal than a few
padding bytes after 'nrels'. The reason why BufferTag matters a lot is
because (1) we have an array of this struct that can easily contain a
million or eight entries, so the alignment padding adds up a lot more
and (2) access to that array is one of the most performance-critical
parts of PostgreSQL.
I agree with you that adding 4 extra bytes to these structures might
not really be critical. I will make the changes based on this idea and
see how they look.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Thu, Jan 6, 2022 at 1:43 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
2) GetNewRelFileNode() will not loop for checking the file existence
and retry with other relfilenode.
While working on this I realized that even if we make the relfilenode
56 bits, we cannot remove the loop inside GetNewRelFileNode() that
checks for file existence. It is always possible that the file reaches
the disk before the WAL record advancing the next relfilenode does, and
if the system crashes in between, then we might generate a duplicate
relfilenode, right?
I think the second paragraph in XLogPutNextOid() explains this issue,
and even after we get the wider relfilenode we will still have it.
Correct?
I am also attaching the latest set of patches for reference; these
patches address the review comments from Robert about moving the
dbOid, tbsOid, and RelNode directly into the buffer tag.
Open issues: there are currently two open issues in the patch. 1) The
loop question discussed above; for now the patch removes the loop.
2) During an upgrade from a previous version we need to advance
nextrelfilenode past the relfilenodes we set for existing objects, to
avoid conflicts.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
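To illustrate the crash scenario, here is a toy, self-contained sketch (nothing from the actual patch; file_exists and get_new_relfilenode are made-up stand-ins) of why the retry loop has to survive: after a crash that loses the WAL-logged counter advance, replay can hand the counter values whose files already made it to disk, and the loop must skip past them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for "does a relation file with this number exist on disk?" */
static bool
file_exists(uint64_t relnode, const uint64_t *on_disk, int n)
{
    for (int i = 0; i < n; i++)
        if (on_disk[i] == relnode)
            return true;
    return false;
}

/* Sketch of the allocation loop: keep consuming counter values until we
 * find one with no leftover file.  In the real server the counter
 * advance is WAL-logged, but a data file can reach disk before that WAL
 * record does, so a crash can replay an already-used value. */
static uint64_t
get_new_relfilenode(uint64_t *counter, const uint64_t *on_disk, int n)
{
    for (;;)
    {
        uint64_t candidate = (*counter)++;
        if (!file_exists(candidate, on_disk, n))
            return candidate;
        /* collision: the crash lost the counter advance; try the next */
    }
}
```

For example, if the counter is replayed back to 100 but files 100 and 101 already exist on disk, the loop skips both and hands out 102.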
Attachments:
v4-0001-Preliminary-refactoring-for-supporting-larger-rel.patchtext/x-patch; charset=US-ASCII; name=v4-0001-Preliminary-refactoring-for-supporting-larger-rel.patchDownload
From f9ccf63067bd6073442856f2e81721b74fa92f05 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilipkumar@localhost.localdomain>
Date: Tue, 8 Feb 2022 13:54:54 +0530
Subject: [PATCH v4 1/4] Preliminary refactoring for supporting larger
relfilenode
Currently, relfilenode is of type Oid and can wrap around, so as part
of the larger patch set we are trying to make it 64 bits to avoid
wraparound; that will also make a couple of other things simpler, as
explained in the next patches.
This is just a preliminary refactoring patch: in BufferTag, instead of
keeping a RelFileNode, we keep the tablespace Oid, database Oid, and
the relfilenode directly, so that once we change relNode in RelFileNode
to 64 bits the buffer tag's alignment padding will not change.
---
contrib/pg_buffercache/pg_buffercache_pages.c | 6 +-
contrib/pg_prewarm/autoprewarm.c | 6 +-
src/backend/storage/buffer/bufmgr.c | 113 ++++++++++++++++++--------
src/backend/storage/buffer/localbuf.c | 24 ++++--
src/include/storage/buf_internals.h | 32 ++++++--
5 files changed, 128 insertions(+), 53 deletions(-)
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 1bd579f..6af96c8 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -153,9 +153,9 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenode = bufHdr->tag.rnode.relNode;
- fctx->record[i].reltablespace = bufHdr->tag.rnode.spcNode;
- fctx->record[i].reldatabase = bufHdr->tag.rnode.dbNode;
+ fctx->record[i].relfilenode = bufHdr->tag.fileNode;
+ fctx->record[i].reltablespace = bufHdr->tag.spcOid;
+ fctx->record[i].reldatabase = bufHdr->tag.dbOid;
fctx->record[i].forknum = bufHdr->tag.forkNum;
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index 1d4d74b..fe537e9 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -616,9 +616,9 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
if (buf_state & BM_TAG_VALID &&
((buf_state & BM_PERMANENT) || dump_unlogged))
{
- block_info_array[num_blocks].database = bufHdr->tag.rnode.dbNode;
- block_info_array[num_blocks].tablespace = bufHdr->tag.rnode.spcNode;
- block_info_array[num_blocks].filenode = bufHdr->tag.rnode.relNode;
+ block_info_array[num_blocks].database = bufHdr->tag.dbOid;
+ block_info_array[num_blocks].tablespace = bufHdr->tag.spcOid;
+ block_info_array[num_blocks].filenode = bufHdr->tag.fileNode;
block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index f5459c6..5014fe6 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1640,7 +1640,7 @@ ReleaseAndReadBuffer(Buffer buffer,
{
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ BuffTagRelFileNodeEquals(bufHdr->tag, relation->rd_node) &&
bufHdr->tag.forkNum == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
@@ -1651,7 +1651,7 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetBufferDescriptor(buffer - 1);
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ BuffTagRelFileNodeEquals(bufHdr->tag, relation->rd_node) &&
bufHdr->tag.forkNum == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
@@ -1993,8 +1993,8 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
- item->tsId = bufHdr->tag.rnode.spcNode;
- item->relNode = bufHdr->tag.rnode.relNode;
+ item->tsId = bufHdr->tag.spcOid;
+ item->relNode = bufHdr->tag.fileNode;
item->forkNum = bufHdr->tag.forkNum;
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2686,6 +2686,7 @@ PrintBufferLeakWarning(Buffer buffer)
char *path;
BackendId backend;
uint32 buf_state;
+ RelFileNode rnode;
Assert(BufferIsValid(buffer));
if (BufferIsLocal(buffer))
@@ -2701,8 +2702,10 @@ PrintBufferLeakWarning(Buffer buffer)
backend = InvalidBackendId;
}
+ BuffTagGetRelFileNode(buf->tag, rnode);
+
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rnode, backend, buf->tag.forkNum);
+ path = relpathbackend(rnode, backend, buf->tag.forkNum);
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2781,7 +2784,7 @@ BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
bufHdr = GetBufferDescriptor(buffer - 1);
/* pinned, so OK to read tag without spinlock */
- *rnode = bufHdr->tag.rnode;
+ BuffTagGetRelFileNode(bufHdr->tag, *rnode);
*forknum = bufHdr->tag.forkNum;
*blknum = bufHdr->tag.blockNum;
}
@@ -2832,7 +2835,12 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rnode, InvalidBackendId);
+ {
+ RelFileNode rnode;
+
+ BuffTagGetRelFileNode(buf->tag, rnode);
+ reln = smgropen(rnode, InvalidBackendId);
+ }
TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
buf->tag.blockNum,
@@ -3135,14 +3143,14 @@ DropRelFileNodeBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
* We could check forkNum and blockNum as well as the rnode, but the
* incremental win from doing so seems small.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rnode.node))
+ if (!BuffTagRelFileNodeEquals(bufHdr->tag, rnode.node))
continue;
buf_state = LockBufHdr(bufHdr);
for (j = 0; j < nforks; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) &&
+ if (BuffTagRelFileNodeEquals(bufHdr->tag, rnode.node) &&
bufHdr->tag.forkNum == forkNum[j] &&
bufHdr->tag.blockNum >= firstDelBlock[j])
{
@@ -3295,7 +3303,7 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
for (j = 0; j < n; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, nodes[j]))
+ if (BuffTagRelFileNodeEquals(bufHdr->tag, nodes[j]))
{
rnode = &nodes[j];
break;
@@ -3304,7 +3312,10 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
}
else
{
- rnode = bsearch((const void *) &(bufHdr->tag.rnode),
+ RelFileNode node;
+
+ BuffTagGetRelFileNode(bufHdr->tag, node);
+ rnode = bsearch((const void *) &(node),
nodes, n, sizeof(RelFileNode),
rnode_comparator);
}
@@ -3314,7 +3325,7 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, (*rnode)))
+ if (BuffTagRelFileNodeEquals(bufHdr->tag, (*rnode)))
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3374,7 +3385,7 @@ FindAndDropRelFileNodeBuffers(RelFileNode rnode, ForkNumber forkNum,
*/
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
+ if (BuffTagRelFileNodeEquals(bufHdr->tag, rnode) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -3413,11 +3424,11 @@ DropDatabaseBuffers(Oid dbid)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid)
+ if (bufHdr->tag.dbOid == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3441,13 +3452,16 @@ PrintBufferDescs(void)
{
BufferDesc *buf = GetBufferDescriptor(i);
Buffer b = BufferDescriptorGetBuffer(buf);
+ RelFileNode rnode;
+
+ BuffTagGetRelFileNode(buf->tag, rnode);
/* theoretically we should lock the bufhdr here */
elog(LOG,
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathbackend(buf->tag.rnode, InvalidBackendId, buf->tag.forkNum),
+ relpathbackend(rnode, InvalidBackendId, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3467,12 +3481,16 @@ PrintPinnedBufs(void)
if (GetPrivateRefCount(b) > 0)
{
+ RelFileNode rnode;
+
+ BuffTagGetRelFileNode(buf->tag, rnode);
+
/* theoretically we should lock the bufhdr here */
elog(LOG,
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathperm(buf->tag.rnode, buf->tag.forkNum),
+ relpathperm(rnode, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3511,7 +3529,7 @@ FlushRelationBuffers(Relation rel)
uint32 buf_state;
bufHdr = GetLocalBufferDescriptor(i);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (BuffTagRelFileNodeEquals(bufHdr->tag, rel->rd_node) &&
((buf_state = pg_atomic_read_u32(&bufHdr->state)) &
(BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
@@ -3558,13 +3576,13 @@ FlushRelationBuffers(Relation rel)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node))
+ if (!BuffTagRelFileNodeEquals(bufHdr->tag, rel->rd_node))
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (BuffTagRelFileNodeEquals(bufHdr->tag, rel->rd_node) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3638,7 +3656,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
for (j = 0; j < nrels; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, srels[j].rnode))
+ if (BuffTagRelFileNodeEquals(bufHdr->tag, srels[j].rnode))
{
srelent = &srels[j];
break;
@@ -3648,7 +3666,10 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
}
else
{
- srelent = bsearch((const void *) &(bufHdr->tag.rnode),
+ RelFileNode rnode;
+
+ BuffTagGetRelFileNode(bufHdr->tag, rnode);
+ srelent = bsearch((const void *) &(rnode),
srels, nrels, sizeof(SMgrSortArray),
rnode_comparator);
}
@@ -3660,7 +3681,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, srelent->rnode) &&
+ if (BuffTagRelFileNodeEquals(bufHdr->tag, srelent->rnode) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3710,13 +3731,13 @@ FlushDatabaseBuffers(Oid dbid)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid &&
+ if (bufHdr->tag.dbOid == dbid &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3876,6 +3897,10 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
if (XLogHintBitIsNeeded() &&
(pg_atomic_read_u32(&bufHdr->state) & BM_PERMANENT))
{
+ RelFileNode rnode;
+
+ BuffTagGetRelFileNode(bufHdr->tag, rnode);
+
/*
* If we must not write WAL, due to a relfilenode-specific
* condition or being in recovery, don't dirty the page. We can
@@ -3884,8 +3909,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
*
* See src/backend/storage/page/README for longer discussion.
*/
- if (RecoveryInProgress() ||
- RelFileNodeSkippingWAL(bufHdr->tag.rnode))
+ if (RecoveryInProgress() || RelFileNodeSkippingWAL(rnode))
return;
/*
@@ -4491,8 +4515,10 @@ AbortBufferIO(void)
{
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
+ RelFileNode rnode;
- path = relpathperm(buf->tag.rnode, buf->tag.forkNum);
+ BuffTagGetRelFileNode(buf->tag, rnode);
+ path = relpathperm(rnode, buf->tag.forkNum);
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4516,7 +4542,11 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rnode, bufHdr->tag.forkNum);
+ char *path;
+ RelFileNode rnode;
+
+ BuffTagGetRelFileNode(bufHdr->tag, rnode);
+ path = relpathperm(rnode, bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4534,8 +4564,11 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
- char *path = relpathbackend(bufHdr->tag.rnode, MyBackendId,
- bufHdr->tag.forkNum);
+ char *path;
+ RelFileNode rnode;
+
+ BuffTagGetRelFileNode(bufHdr->tag, rnode);
+ path = relpathbackend(rnode, MyBackendId, bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4629,8 +4662,13 @@ static inline int
buffertag_comparator(const BufferTag *ba, const BufferTag *bb)
{
int ret;
+ RelFileNode rnodea;
+ RelFileNode rnodeb;
- ret = rnode_comparator(&ba->rnode, &bb->rnode);
+ BuffTagGetRelFileNode(*ba, rnodea);
+ BuffTagGetRelFileNode(*bb, rnodeb);
+
+ ret = rnode_comparator(&rnodea, &rnodeb);
if (ret != 0)
return ret;
@@ -4787,10 +4825,13 @@ IssuePendingWritebacks(WritebackContext *context)
SMgrRelation reln;
int ahead;
BufferTag tag;
+ RelFileNode currnode;
+
Size nblocks = 1;
cur = &context->pending_writebacks[i];
tag = cur->tag;
+ BuffTagGetRelFileNode(tag, currnode);
/*
* Peek ahead, into following writeback requests, to see if they can
@@ -4798,10 +4839,14 @@ IssuePendingWritebacks(WritebackContext *context)
*/
for (ahead = 0; i + ahead + 1 < context->nr_pending; ahead++)
{
+ RelFileNode nextrnode;
+
next = &context->pending_writebacks[i + ahead + 1];
+ BuffTagGetRelFileNode(next->tag, nextrnode);
+
/* different file, stop */
- if (!RelFileNodeEquals(cur->tag.rnode, next->tag.rnode) ||
+ if (!RelFileNodeEquals(currnode, nextrnode) ||
cur->tag.forkNum != next->tag.forkNum)
break;
@@ -4820,7 +4865,7 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rnode, InvalidBackendId);
+ reln = smgropen(currnode, InvalidBackendId);
smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
}
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index e71f95a..28446da 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -213,9 +213,12 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
{
SMgrRelation oreln;
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
+ RelFileNode rnode;
+
+ BuffTagGetRelFileNode(bufHdr->tag, rnode);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rnode, MyBackendId);
+ oreln = smgropen(rnode, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
@@ -337,16 +340,21 @@ DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
+ BuffTagRelFileNodeEquals(bufHdr->tag, rnode) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
+ {
+ RelFileNode rnode;
+
+ BuffTagGetRelFileNode(bufHdr->tag, rnode);
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
- bufHdr->tag.forkNum),
+ relpathbackend(rnode, MyBackendId, bufHdr->tag.forkNum),
LocalRefCount[i]);
+ }
+
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
hash_search(LocalBufHash, (void *) &bufHdr->tag,
@@ -383,13 +391,15 @@ DropRelFileNodeAllLocalBuffers(RelFileNode rnode)
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode))
+ BuffTagRelFileNodeEquals(bufHdr->tag, rnode))
{
+ RelFileNode rnode;
+
+ BuffTagGetRelFileNode(bufHdr->tag, rnode);
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
- bufHdr->tag.forkNum),
+ relpathbackend(rnode, MyBackendId, bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index b903d2b..0286d51 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,34 +90,54 @@
*/
typedef struct buftag
{
- RelFileNode rnode; /* physical relation identifier */
+ Oid spcOid; /* tablespace oid. */
+ Oid dbOid; /* database oid. */
+ Oid fileNode; /* relation file node. */
ForkNumber forkNum;
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
#define CLEAR_BUFFERTAG(a) \
( \
- (a).rnode.spcNode = InvalidOid, \
- (a).rnode.dbNode = InvalidOid, \
- (a).rnode.relNode = InvalidOid, \
+ (a).spcOid = InvalidOid, \
+ (a).dbOid = InvalidOid, \
+ (a).fileNode = InvalidOid, \
(a).forkNum = InvalidForkNumber, \
(a).blockNum = InvalidBlockNumber \
)
#define INIT_BUFFERTAG(a,xx_rnode,xx_forkNum,xx_blockNum) \
( \
- (a).rnode = (xx_rnode), \
+ (a).spcOid = (xx_rnode).spcNode, \
+ (a).dbOid = (xx_rnode).dbNode, \
+ (a).fileNode = (xx_rnode).relNode, \
(a).forkNum = (xx_forkNum), \
(a).blockNum = (xx_blockNum) \
)
#define BUFFERTAGS_EQUAL(a,b) \
( \
- RelFileNodeEquals((a).rnode, (b).rnode) && \
+ (a).spcOid == (b).spcOid && \
+ (a).dbOid == (b).dbOid && \
+ (a).fileNode == (b).fileNode && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
)
+#define BuffTagGetRelFileNode(a, node) \
+do { \
+ (node).spcNode = (a).spcOid; \
+ (node).dbNode = (a).dbOid; \
+ (node).relNode = (a).fileNode; \
+} while(0)
+
+#define BuffTagRelFileNodeEquals(a, node) \
+( \
+ (a).spcOid == (node).spcNode && \
+ (a).dbOid == (node).dbNode && \
+ (a).fileNode == (node).relNode \
+)
+
/*
* The shared buffer mapping table is partitioned to reduce contention.
* To determine which partition lock a given tag requires, compute the tag's
--
1.8.3.1
v4-0003-Don-t-delay-removing-Tombstone-file-until-next-ch.patchtext/x-patch; charset=US-ASCII; name=v4-0003-Don-t-delay-removing-Tombstone-file-until-next-ch.patchDownload
From d1909c42e3a290e7f3252399e3a93e44525f7185 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilipkumar@localhost.localdomain>
Date: Tue, 8 Feb 2022 18:38:08 +0530
Subject: [PATCH v4 3/4] Don't delay removing Tombstone file until next
checkpoint
Currently, we cannot remove an unused relfilenode until the
next checkpoint, because removing it immediately risks reusing
the same relfilenode for two different relations within a single
checkpoint cycle due to Oid wraparound.
As part of the previous patches we have widened relfilenode to
56 bits and removed the risk of wraparound, so we no longer need
to wait until the next checkpoint to remove the unused relation
files; we can clean them up at commit.
---
src/backend/access/transam/xlog.c | 5 --
src/backend/storage/smgr/md.c | 58 ++++++----------------
src/backend/storage/sync/sync.c | 101 --------------------------------------
src/include/storage/sync.h | 1 -
4 files changed, 14 insertions(+), 151 deletions(-)
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 92ac7a7..2181d6b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6611,11 +6611,6 @@ CreateCheckPoint(int flags)
END_CRIT_SECTION();
/*
- * Let smgr do post-checkpoint cleanup (eg, deleting old files).
- */
- SyncPostCheckpoint();
-
- /*
* Update the average distance between checkpoints if the prior checkpoint
* exists.
*/
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 879f647..7943b17 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -124,8 +124,6 @@ static void mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum,
static MdfdVec *mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior);
static void register_dirty_segment(SMgrRelation reln, ForkNumber forknum,
MdfdVec *seg);
-static void register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
- BlockNumber segno);
static void register_forget_request(RelFileNodeBackend rnode, ForkNumber forknum,
BlockNumber segno);
static void _fdvec_resize(SMgrRelation reln,
@@ -321,36 +319,25 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
/*
* Delete or truncate the first segment.
*/
- if (isRedo || forkNum != MAIN_FORKNUM || RelFileNodeBackendIsTemp(rnode))
+ if (!RelFileNodeBackendIsTemp(rnode))
{
- if (!RelFileNodeBackendIsTemp(rnode))
- {
- /* Prevent other backends' fds from holding on to the disk space */
- ret = do_truncate(path);
-
- /* Forget any pending sync requests for the first segment */
- register_forget_request(rnode, forkNum, 0 /* first seg */ );
- }
- else
- ret = 0;
+ /* Prevent other backends' fds from holding on to the disk space */
+ ret = do_truncate(path);
- /* Next unlink the file, unless it was already found to be missing */
- if (ret == 0 || errno != ENOENT)
- {
- ret = unlink(path);
- if (ret < 0 && errno != ENOENT)
- ereport(WARNING,
- (errcode_for_file_access(),
- errmsg("could not remove file \"%s\": %m", path)));
- }
+ /* Forget any pending sync requests for the first segment */
+ register_forget_request(rnode, forkNum, 0 /* first seg */ );
}
else
- {
- /* Prevent other backends' fds from holding on to the disk space */
- ret = do_truncate(path);
+ ret = 0;
- /* Register request to unlink first segment later */
- register_unlink_segment(rnode, forkNum, 0 /* first seg */ );
+ /* Next unlink the file, unless it was already found to be missing */
+ if (ret == 0 || errno != ENOENT)
+ {
+ ret = unlink(path);
+ if (ret < 0 && errno != ENOENT)
+ ereport(WARNING,
+ (errcode_for_file_access(),
+ errmsg("could not remove file \"%s\": %m", path)));
}
/*
@@ -1001,23 +988,6 @@ register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
}
/*
- * register_unlink_segment() -- Schedule a file to be deleted after next checkpoint
- */
-static void
-register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
- BlockNumber segno)
-{
- FileTag tag;
-
- INIT_MD_FILETAG(tag, rnode.node, forknum, segno);
-
- /* Should never be used with temp relations */
- Assert(!RelFileNodeBackendIsTemp(rnode));
-
- RegisterSyncRequest(&tag, SYNC_UNLINK_REQUEST, true /* retryOnError */ );
-}
-
-/*
* register_forget_request() -- forget any fsyncs for a relation fork's segment
*/
static void
diff --git a/src/backend/storage/sync/sync.c b/src/backend/storage/sync/sync.c
index e161d57..18cb350 100644
--- a/src/backend/storage/sync/sync.c
+++ b/src/backend/storage/sync/sync.c
@@ -189,92 +189,6 @@ SyncPreCheckpoint(void)
}
/*
- * SyncPostCheckpoint() -- Do post-checkpoint work
- *
- * Remove any lingering files that can now be safely removed.
- */
-void
-SyncPostCheckpoint(void)
-{
- int absorb_counter;
- ListCell *lc;
-
- absorb_counter = UNLINKS_PER_ABSORB;
- foreach(lc, pendingUnlinks)
- {
- PendingUnlinkEntry *entry = (PendingUnlinkEntry *) lfirst(lc);
- char path[MAXPGPATH];
-
- /* Skip over any canceled entries */
- if (entry->canceled)
- continue;
-
- /*
- * New entries are appended to the end, so if the entry is new we've
- * reached the end of old entries.
- *
- * Note: if just the right number of consecutive checkpoints fail, we
- * could be fooled here by cycle_ctr wraparound. However, the only
- * consequence is that we'd delay unlinking for one more checkpoint,
- * which is perfectly tolerable.
- */
- if (entry->cycle_ctr == checkpoint_cycle_ctr)
- break;
-
- /* Unlink the file */
- if (syncsw[entry->tag.handler].sync_unlinkfiletag(&entry->tag,
- path) < 0)
- {
- /*
- * There's a race condition, when the database is dropped at the
- * same time that we process the pending unlink requests. If the
- * DROP DATABASE deletes the file before we do, we will get ENOENT
- * here. rmtree() also has to ignore ENOENT errors, to deal with
- * the possibility that we delete the file first.
- */
- if (errno != ENOENT)
- ereport(WARNING,
- (errcode_for_file_access(),
- errmsg("could not remove file \"%s\": %m", path)));
- }
-
- /* Mark the list entry as canceled, just in case */
- entry->canceled = true;
-
- /*
- * As in ProcessSyncRequests, we don't want to stop absorbing fsync
- * requests for a long time when there are many deletions to be done.
- * We can safely call AbsorbSyncRequests() at this point in the loop.
- */
- if (--absorb_counter <= 0)
- {
- AbsorbSyncRequests();
- absorb_counter = UNLINKS_PER_ABSORB;
- }
- }
-
- /*
- * If we reached the end of the list, we can just remove the whole list
- * (remembering to pfree all the PendingUnlinkEntry objects). Otherwise,
- * we must keep the entries at or after "lc".
- */
- if (lc == NULL)
- {
- list_free_deep(pendingUnlinks);
- pendingUnlinks = NIL;
- }
- else
- {
- int ntodelete = list_cell_number(pendingUnlinks, lc);
-
- for (int i = 0; i < ntodelete; i++)
- pfree(list_nth(pendingUnlinks, i));
-
- pendingUnlinks = list_delete_first_n(pendingUnlinks, ntodelete);
- }
-}
-
-/*
* ProcessSyncRequests() -- Process queued fsync requests.
*/
void
@@ -520,21 +434,6 @@ RememberSyncRequest(const FileTag *ftag, SyncRequestType type)
entry->canceled = true;
}
}
- else if (type == SYNC_UNLINK_REQUEST)
- {
- /* Unlink request: put it in the linked list */
- MemoryContext oldcxt = MemoryContextSwitchTo(pendingOpsCxt);
- PendingUnlinkEntry *entry;
-
- entry = palloc(sizeof(PendingUnlinkEntry));
- entry->tag = *ftag;
- entry->cycle_ctr = checkpoint_cycle_ctr;
- entry->canceled = false;
-
- pendingUnlinks = lappend(pendingUnlinks, entry);
-
- MemoryContextSwitchTo(oldcxt);
- }
else
{
/* Normal case: enter a request to fsync this segment */
diff --git a/src/include/storage/sync.h b/src/include/storage/sync.h
index 9737e1e..4d67850 100644
--- a/src/include/storage/sync.h
+++ b/src/include/storage/sync.h
@@ -57,7 +57,6 @@ typedef struct FileTag
extern void InitSync(void);
extern void SyncPreCheckpoint(void);
-extern void SyncPostCheckpoint(void);
extern void ProcessSyncRequests(void);
extern void RememberSyncRequest(const FileTag *ftag, SyncRequestType type);
extern bool RegisterSyncRequest(const FileTag *ftag, SyncRequestType type,
--
1.8.3.1
Attachment: v4-0002-Use-56-bits-for-relfilenode-to-avoid-wraparound.patch (text/x-patch)
From d409230e66ac3197e5cff9346caa9a72f438373e Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilipkumar@localhost.localdomain>
Date: Wed, 16 Feb 2022 17:29:39 +0530
Subject: [PATCH v4 2/4] Use 56 bits for relfilenode to avoid wraparound
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This patch makes the relfilenode 64 bits wide. The problem is
that a full 64-bit relfilenode would enlarge the BufferTag,
which would increase memory usage and could also hurt
performance. To avoid that, inside the buffer tag we use
8 bits for the fork number and 56 bits for the relfilenode,
rather than a full 64-bit relfilenode field.
---
.../pg_buffercache/pg_buffercache--1.0--1.1.sql | 2 +-
contrib/pg_buffercache/pg_buffercache--1.2.sql | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 8 +--
contrib/pg_prewarm/autoprewarm.c | 4 +-
doc/src/sgml/catalogs.sgml | 2 +-
doc/src/sgml/pgbuffercache.sgml | 2 +-
src/backend/access/common/syncscan.c | 2 +-
src/backend/access/gin/ginxlog.c | 2 +-
src/backend/access/rmgrdesc/gistdesc.c | 2 +-
src/backend/access/rmgrdesc/heapdesc.c | 2 +-
src/backend/access/rmgrdesc/nbtdesc.c | 2 +-
src/backend/access/rmgrdesc/seqdesc.c | 2 +-
src/backend/access/rmgrdesc/xlogdesc.c | 15 +++++-
src/backend/access/transam/README | 4 +-
src/backend/access/transam/varsup.c | 52 +++++++++++++++++++-
src/backend/access/transam/xlog.c | 37 ++++++++++++++
src/backend/access/transam/xlogrecovery.c | 6 +--
src/backend/access/transam/xlogutils.c | 8 +--
src/backend/catalog/catalog.c | 57 ++++------------------
src/backend/catalog/heap.c | 29 ++++++-----
src/backend/catalog/index.c | 21 ++++----
src/backend/commands/cluster.c | 12 ++---
src/backend/commands/indexcmds.c | 6 +--
src/backend/commands/sequence.c | 2 +-
src/backend/commands/tablecmds.c | 19 +++++---
src/backend/nodes/outfuncs.c | 2 +-
src/backend/parser/parse_utilcmd.c | 4 +-
src/backend/replication/logical/decode.c | 1 +
src/backend/replication/logical/reorderbuffer.c | 2 +-
src/backend/storage/buffer/bufmgr.c | 2 +-
src/backend/storage/freespace/fsmpage.c | 2 +-
src/backend/storage/lmgr/lwlocknames.txt | 1 +
src/backend/storage/smgr/smgr.c | 2 +-
src/backend/utils/adt/dbsize.c | 16 +++---
src/backend/utils/adt/pg_upgrade_support.c | 12 ++---
src/backend/utils/cache/relcache.c | 18 +++----
src/backend/utils/cache/relfilenodemap.c | 8 +--
src/backend/utils/cache/relmapper.c | 15 +++---
src/backend/utils/misc/pg_controldata.c | 9 +++-
src/bin/pg_checksums/pg_checksums.c | 6 +--
src/bin/pg_controldata/pg_controldata.c | 2 +
src/bin/pg_dump/pg_dump.c | 28 +++++------
src/bin/pg_rewind/filemap.c | 8 +--
src/bin/pg_upgrade/info.c | 4 +-
src/bin/pg_upgrade/pg_upgrade.c | 6 +--
src/bin/pg_upgrade/pg_upgrade.h | 4 +-
src/bin/pg_upgrade/relfilenode.c | 4 +-
src/bin/pg_waldump/pg_waldump.c | 6 +--
src/common/relpath.c | 22 ++++-----
src/fe_utils/option_utils.c | 42 ++++++++++++++++
src/include/access/transam.h | 4 ++
src/include/access/xlog.h | 1 +
src/include/catalog/binary_upgrade.h | 6 +--
src/include/catalog/catalog.h | 4 +-
src/include/catalog/heap.h | 2 +-
src/include/catalog/index.h | 2 +-
src/include/catalog/pg_class.h | 10 ++--
src/include/catalog/pg_control.h | 2 +
src/include/catalog/pg_proc.dat | 10 ++--
src/include/commands/tablecmds.h | 2 +-
src/include/common/relpath.h | 2 +-
src/include/fe_utils/option_utils.h | 3 ++
src/include/nodes/parsenodes.h | 2 +-
src/include/postgres_ext.h | 15 ++++++
src/include/storage/buf_internals.h | 29 ++++++++---
src/include/storage/relfilenode.h | 14 ++++--
src/include/utils/rel.h | 2 +-
src/include/utils/relcache.h | 2 +-
src/include/utils/relfilenodemap.h | 2 +-
src/include/utils/relmapper.h | 6 +--
src/test/regress/expected/alter_table.out | 20 ++++----
src/test/regress/sql/alter_table.sql | 4 +-
72 files changed, 411 insertions(+), 259 deletions(-)
diff --git a/contrib/pg_buffercache/pg_buffercache--1.0--1.1.sql b/contrib/pg_buffercache/pg_buffercache--1.0--1.1.sql
index 54d02f5..5e93238 100644
--- a/contrib/pg_buffercache/pg_buffercache--1.0--1.1.sql
+++ b/contrib/pg_buffercache/pg_buffercache--1.0--1.1.sql
@@ -6,6 +6,6 @@
-- Upgrade view to 1.1. format
CREATE OR REPLACE VIEW pg_buffercache AS
SELECT P.* FROM pg_buffercache_pages() AS P
- (bufferid integer, relfilenode oid, reltablespace oid, reldatabase oid,
+ (bufferid integer, relfilenode int8, reltablespace oid, reldatabase oid,
relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
pinning_backends int4);
diff --git a/contrib/pg_buffercache/pg_buffercache--1.2.sql b/contrib/pg_buffercache/pg_buffercache--1.2.sql
index 6ee5d84..f52ddcd 100644
--- a/contrib/pg_buffercache/pg_buffercache--1.2.sql
+++ b/contrib/pg_buffercache/pg_buffercache--1.2.sql
@@ -12,7 +12,7 @@ LANGUAGE C PARALLEL SAFE;
-- Create a view for convenient access.
CREATE VIEW pg_buffercache AS
SELECT P.* FROM pg_buffercache_pages() AS P
- (bufferid integer, relfilenode oid, reltablespace oid, reldatabase oid,
+ (bufferid integer, relfilenode int8, reltablespace oid, reldatabase oid,
relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
pinning_backends int4);
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 6af96c8..94d2570 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -26,7 +26,7 @@ PG_MODULE_MAGIC;
typedef struct
{
uint32 bufferid;
- Oid relfilenode;
+ RelNode relfilenode;
Oid reltablespace;
Oid reldatabase;
ForkNumber forknum;
@@ -103,7 +103,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
TupleDescInitEntry(tupledesc, (AttrNumber) 1, "bufferid",
INT4OID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 2, "relfilenode",
- OIDOID, -1, 0);
+ INT8OID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 3, "reltablespace",
OIDOID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 4, "reldatabase",
@@ -153,7 +153,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenode = bufHdr->tag.fileNode;
+ fctx->record[i].relfilenode = BufTagGetFileNode(bufHdr->tag);
fctx->record[i].reltablespace = bufHdr->tag.spcOid;
fctx->record[i].reldatabase = bufHdr->tag.dbOid;
fctx->record[i].forknum = bufHdr->tag.forkNum;
@@ -209,7 +209,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
}
else
{
- values[1] = ObjectIdGetDatum(fctx->record[i].relfilenode);
+ values[1] = Int8GetDatum(fctx->record[i].relfilenode);
nulls[1] = false;
values[2] = ObjectIdGetDatum(fctx->record[i].reltablespace);
nulls[2] = false;
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index fe537e9..6899ace 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -62,7 +62,7 @@ typedef struct BlockInfoRecord
{
Oid database;
Oid tablespace;
- Oid filenode;
+ RelNode filenode;
ForkNumber forknum;
BlockNumber blocknum;
} BlockInfoRecord;
@@ -618,7 +618,7 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
{
block_info_array[num_blocks].database = bufHdr->tag.dbOid;
block_info_array[num_blocks].tablespace = bufHdr->tag.spcOid;
- block_info_array[num_blocks].filenode = bufHdr->tag.fileNode;
+ block_info_array[num_blocks].filenode = BufTagGetFileNode(bufHdr->tag);
block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 5a1627a..d6e1fad 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1960,7 +1960,7 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
</para>
<para>
Name of the on-disk file of this relation; zero means this
diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
index e68d159..631cd2f 100644
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@@ -62,7 +62,7 @@
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
(references <link linkend="catalog-pg-class"><structname>pg_class</structname></link>.<structfield>relfilenode</structfield>)
</para>
<para>
diff --git a/src/backend/access/common/syncscan.c b/src/backend/access/common/syncscan.c
index d5b16c5..aa71523 100644
--- a/src/backend/access/common/syncscan.c
+++ b/src/backend/access/common/syncscan.c
@@ -161,7 +161,7 @@ SyncScanShmemInit(void)
*/
item->location.relfilenode.spcNode = InvalidOid;
item->location.relfilenode.dbNode = InvalidOid;
- item->location.relfilenode.relNode = InvalidOid;
+ item->location.relfilenode.relNode = InvalidRelNode;
item->location.location = InvalidBlockNumber;
item->prev = (i > 0) ?
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 87e8366..17b77b9 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -100,7 +100,7 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
BlockNumber blknum;
BufferGetTag(buffer, &node, &forknum, &blknum);
- elog(ERROR, "failed to add item to index page in %u/%u/%u",
+ elog(ERROR, "failed to add item to index page in %u/%u/" INT64_FORMAT,
node.spcNode, node.dbNode, node.relNode);
}
}
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index 9cab4fa..203685a 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -26,7 +26,7 @@ out_gistxlogPageUpdate(StringInfo buf, gistxlogPageUpdate *xlrec)
static void
out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec)
{
- appendStringInfo(buf, "rel %u/%u/%u; blk %u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; blk %u; latestRemovedXid %u:%u",
xlrec->node.spcNode, xlrec->node.dbNode,
xlrec->node.relNode, xlrec->block,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 6238085..57af152 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -169,7 +169,7 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
{
xl_heap_new_cid *xlrec = (xl_heap_new_cid *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; tid %u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; tid %u/%u",
xlrec->target_node.spcNode,
xlrec->target_node.dbNode,
xlrec->target_node.relNode,
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index dfbbf4e..8c44ebd 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -100,7 +100,7 @@ btree_desc(StringInfo buf, XLogReaderState *record)
{
xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; latestRemovedXid %u:%u",
xlrec->node.spcNode, xlrec->node.dbNode,
xlrec->node.relNode,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/seqdesc.c b/src/backend/access/rmgrdesc/seqdesc.c
index d9b1e60..5385ded 100644
--- a/src/backend/access/rmgrdesc/seqdesc.c
+++ b/src/backend/access/rmgrdesc/seqdesc.c
@@ -25,7 +25,7 @@ seq_desc(StringInfo buf, XLogReaderState *record)
xl_seq_rec *xlrec = (xl_seq_rec *) rec;
if (info == XLOG_SEQ_LOG)
- appendStringInfo(buf, "rel %u/%u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT,
xlrec->node.spcNode, xlrec->node.dbNode,
xlrec->node.relNode);
}
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index e7452af..9066566 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -45,8 +45,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
CheckPoint *checkpoint = (CheckPoint *) rec;
appendStringInfo(buf, "redo %X/%X; "
- "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; "
- "oldest xid %u in DB %u; oldest multi %u in DB %u; "
+ "tli %u; prev tli %u; fpw %s; xid %u:%u; relfilenode " INT64_FORMAT "; oid %u; "
+ "multi %u; offset %u; oldest xid %u in DB %u; oldest multi %u in DB %u; "
"oldest/newest commit timestamp xid: %u/%u; "
"oldest running xid %u; %s",
LSN_FORMAT_ARGS(checkpoint->redo),
@@ -55,6 +55,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->fullPageWrites ? "true" : "false",
EpochFromFullTransactionId(checkpoint->nextXid),
XidFromFullTransactionId(checkpoint->nextXid),
+ checkpoint->nextRelNode,
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
@@ -74,6 +75,13 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
memcpy(&nextOid, rec, sizeof(Oid));
appendStringInfo(buf, "%u", nextOid);
}
+ else if (info == XLOG_NEXT_RELFILENODE)
+ {
+ RelNode nextRelFilenode;
+
+ memcpy(&nextRelFilenode, rec, sizeof(RelNode));
+ appendStringInfo(buf, INT64_FORMAT, nextRelFilenode);
+ }
else if (info == XLOG_RESTORE_POINT)
{
xl_restore_point *xlrec = (xl_restore_point *) rec;
@@ -169,6 +177,9 @@ xlog_identify(uint8 info)
case XLOG_NEXTOID:
id = "NEXTOID";
break;
+ case XLOG_NEXT_RELFILENODE:
+ id = "NEXT_RELFILENODE";
+ break;
case XLOG_SWITCH:
id = "SWITCH";
break;
diff --git a/src/backend/access/transam/README b/src/backend/access/transam/README
index 1edc818..5c81f6c 100644
--- a/src/backend/access/transam/README
+++ b/src/backend/access/transam/README
@@ -692,8 +692,8 @@ by having database restart search for files that don't have any committed
entry in pg_class, but that currently isn't done because of the possibility
of deleting data that is useful for forensic analysis of the crash.
Orphan files are harmless --- at worst they waste a bit of disk space ---
-because we check for on-disk collisions when allocating new relfilenode
-OIDs. So cleaning up isn't really necessary.
+because relfilenode is 56 bits wide, so there should not be any
+collisions. So cleaning up isn't really necessary.
3. Deleting a table, which requires an unlink() that could fail.
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 748120a..1361393 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -30,6 +30,9 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+/* Number of RelFileNode to prefetch (preallocate) per XLOG write */
+#define VAR_RFN_PREFETCH 8192
+
/* pointer to "variable cache" in shared memory (set up by shmem.c) */
VariableCache ShmemVariableCache = NULL;
@@ -521,8 +524,7 @@ ForceTransactionIdLimitUpdate(void)
* wide, counter wraparound will occur eventually, and therefore it is unwise
* to assume they are unique unless precautions are taken to make them so.
* Hence, this routine should generally not be used directly. The only direct
- * callers should be GetNewOidWithIndex() and GetNewRelFileNode() in
- * catalog/catalog.c.
+ * callers should be GetNewOidWithIndex() in catalog/catalog.c.
*/
Oid
GetNewObjectId(void)
@@ -613,6 +615,52 @@ SetNextObjectId(Oid nextOid)
}
/*
+ * GetNewRelNode
+ *
+ * Similar to GetNewObjectId, but it generates a new relnode instead of a new Oid.
+ */
+RelNode
+GetNewRelNode(void)
+{
+ RelNode result;
+
+ /* Safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot assign RelNode during recovery");
+
+ LWLockAcquire(RelNodeGenLock, LW_EXCLUSIVE);
+
+ /*
+ * Check for the wraparound for the relnode counter.
+ *
+ * XXX Actually, the relnode is 56 bits wide, so we don't need to worry
+ * about the wraparound case.
+ */
+ if (ShmemVariableCache->nextRelNode > MAX_RELFILENODE)
+ {
+ ShmemVariableCache->nextRelNode = FirstNormalRelNode;
+ ShmemVariableCache->relnodecount = 0;
+ }
+
+ /* If we have run out of logged RelNodes then we must log more */
+ if (ShmemVariableCache->relnodecount == 0)
+ {
+ XLogPutNextRelFileNode(ShmemVariableCache->nextRelNode +
+ VAR_RFN_PREFETCH);
+
+ ShmemVariableCache->relnodecount = VAR_RFN_PREFETCH;
+ }
+
+ result = ShmemVariableCache->nextRelNode;
+ (ShmemVariableCache->nextRelNode)++;
+ (ShmemVariableCache->relnodecount)--;
+
+ LWLockRelease(RelNodeGenLock);
+
+ return result;
+}
+
+/*
* StopGeneratingPinnedObjectIds
*
* This is called once during initdb to force the OID counter up to
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index ce78ac4..92ac7a7 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4547,6 +4547,7 @@ BootStrapXLOG(void)
checkPoint.nextXid =
FullTransactionIdFromEpochAndXid(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstGenbkiObjectId;
+ checkPoint.nextRelNode = FirstNormalRelNode;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
checkPoint.oldestXid = FirstNormalTransactionId;
@@ -4560,7 +4561,9 @@ BootStrapXLOG(void)
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelNode = checkPoint.nextRelNode;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->relnodecount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -5023,7 +5026,9 @@ StartupXLOG(void)
/* initialize shared memory variables from the checkpoint record */
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelNode = checkPoint.nextRelNode;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->relnodecount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -6454,6 +6459,12 @@ CreateCheckPoint(int flags)
checkPoint.nextOid += ShmemVariableCache->oidCount;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelNodeGenLock, LW_SHARED);
+ checkPoint.nextRelNode = ShmemVariableCache->nextRelNode;
+ if (!shutdown)
+ checkPoint.nextRelNode += ShmemVariableCache->relnodecount;
+ LWLockRelease(RelNodeGenLock);
+
MultiXactGetCheckptMulti(shutdown,
&checkPoint.nextMulti,
&checkPoint.nextMultiOffset,
@@ -7308,6 +7319,18 @@ XLogPutNextOid(Oid nextOid)
}
/*
+ * Similar to XLogPutNextOid, but writes a NEXT_RELFILENODE log record
+ * instead of a NEXTOID record.
+ */
+void
+XLogPutNextRelFileNode(RelNode nextrelnode)
+{
+ XLogBeginInsert();
+ XLogRegisterData((char *) (&nextrelnode), sizeof(RelNode));
+ (void) XLogInsert(RM_XLOG_ID, XLOG_NEXT_RELFILENODE);
+}
+
+/*
* Write an XLOG SWITCH record.
*
* Here we just blindly issue an XLogInsert request for the record.
@@ -7522,6 +7545,16 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
}
+ if (info == XLOG_NEXT_RELFILENODE)
+ {
+ RelNode nextRelNode;
+
+ memcpy(&nextRelNode, XLogRecGetData(record), sizeof(RelNode));
+ LWLockAcquire(RelNodeGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelNode = nextRelNode;
+ ShmemVariableCache->relnodecount = 0;
+ LWLockRelease(RelNodeGenLock);
+ }
else if (info == XLOG_CHECKPOINT_SHUTDOWN)
{
CheckPoint checkPoint;
@@ -7536,6 +7569,10 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelNodeGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelNode = checkPoint.nextRelNode;
+ ShmemVariableCache->relnodecount = 0;
+ LWLockRelease(RelNodeGenLock);
MultiXactSetNextMXact(checkPoint.nextMulti,
checkPoint.nextMultiOffset);
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index d5269ed..57c9d75 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2144,13 +2144,13 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blk);
if (forknum != MAIN_FORKNUM)
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, fork %u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", fork %u, blk %u",
block_id,
rnode.spcNode, rnode.dbNode, rnode.relNode,
forknum,
blk);
else
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", blk %u",
block_id,
rnode.spcNode, rnode.dbNode, rnode.relNode,
blk);
@@ -2343,7 +2343,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
if (memcmp(replay_image_masked, primary_image_masked, BLCKSZ) != 0)
{
elog(FATAL,
- "inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
+ "inconsistent page found, rel %u/%u/" INT64_FORMAT ", forknum %u, blkno %u",
rnode.spcNode, rnode.dbNode, rnode.relNode,
forknum, blkno);
}
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 54d5f20..f9f0aa8 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -593,17 +593,17 @@ CreateFakeRelcacheEntry(RelFileNode rnode)
rel->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
/* We don't know the name of the relation; use relfilenode instead */
- sprintf(RelationGetRelationName(rel), "%u", rnode.relNode);
+ sprintf(RelationGetRelationName(rel), INT64_FORMAT, rnode.relNode);
/*
* We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNode may be
- * different from the relation's OID. It shouldn't really matter though.
+ * relation. Note we are setting relId to just FirstNormalObjectId which
+ * is completely bogus. It shouldn't really matter though.
* In recovery, we are running by ourselves and can't have any lock
* conflicts. While syncing, we already hold AccessExclusiveLock.
*/
rel->rd_lockInfo.lockRelId.dbId = rnode.dbNode;
- rel->rd_lockInfo.lockRelId.relId = rnode.relNode;
+ rel->rd_lockInfo.lockRelId.relId = FirstNormalObjectId;
rel->rd_smgr = NULL;
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index dfd5fb6..9bc1809 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -472,26 +472,16 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
/*
* GetNewRelFileNode
- * Generate a new relfilenode number that is unique within the
- * database of the given tablespace.
+ * Generate a new relfilenode number.
*
- * If the relfilenode will also be used as the relation's OID, pass the
- * opened pg_class catalog, and this routine will guarantee that the result
- * is also an unused OID within pg_class. If the result is to be used only
- * as a relfilenode for an existing relation, pass NULL for pg_class.
- *
- * As with GetNewOidWithIndex(), there is some theoretical risk of a race
- * condition, but it doesn't seem worth worrying about.
- *
- * Note: we don't support using this in bootstrap mode. All relations
- * created by bootstrap have preassigned OIDs, so there's no need.
+ * We use 56 bits for the relfilenode, so we expect it to be unique within
+ * the cluster; if the file already exists, report an error.
*/
-Oid
-GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
+RelNode
+GetNewRelFileNode(Oid reltablespace, char relpersistence)
{
RelFileNodeBackend rnode;
char *rpath;
- bool collides;
BackendId backend;
/*
@@ -525,40 +515,13 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
* are properly detected.
*/
rnode.backend = backend;
+ rnode.node.relNode = GetNewRelNode();
- do
- {
- CHECK_FOR_INTERRUPTS();
-
- /* Generate the OID */
- if (pg_class)
- rnode.node.relNode = GetNewOidWithIndex(pg_class, ClassOidIndexId,
- Anum_pg_class_oid);
- else
- rnode.node.relNode = GetNewObjectId();
-
- /* Check for existing file of same name */
- rpath = relpath(rnode, MAIN_FORKNUM);
+ /* Check for existing file of same name */
+ rpath = relpath(rnode, MAIN_FORKNUM);
- if (access(rpath, F_OK) == 0)
- {
- /* definite collision */
- collides = true;
- }
- else
- {
- /*
- * Here we have a little bit of a dilemma: if errno is something
- * other than ENOENT, should we declare a collision and loop? In
- * practice it seems best to go ahead regardless of the errno. If
- * there is a colliding file we will get an smgr failure when we
- * attempt to create the new relation file.
- */
- collides = false;
- }
-
- pfree(rpath);
- } while (collides);
+ if (access(rpath, F_OK) == 0)
+ elog(ERROR, "new relfilenode file already exists: \"%s\"", rpath);
return rnode.node.relNode;
}
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 7e99de8..67f3225 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -91,9 +91,9 @@
/* Potentially set by pg_upgrade_support functions */
Oid binary_upgrade_next_heap_pg_class_oid = InvalidOid;
-Oid binary_upgrade_next_heap_pg_class_relfilenode = InvalidOid;
+RelNode binary_upgrade_next_heap_pg_class_relfilenode = InvalidRelNode;
Oid binary_upgrade_next_toast_pg_class_oid = InvalidOid;
-Oid binary_upgrade_next_toast_pg_class_relfilenode = InvalidOid;
+RelNode binary_upgrade_next_toast_pg_class_relfilenode = InvalidRelNode;
static void AddNewRelationTuple(Relation pg_class_desc,
Relation new_rel_desc,
@@ -303,7 +303,7 @@ heap_create(const char *relname,
Oid relnamespace,
Oid reltablespace,
Oid relid,
- Oid relfilenode,
+ RelNode relfilenode,
Oid accessmtd,
TupleDesc tupDesc,
char relkind,
@@ -358,8 +358,8 @@ heap_create(const char *relname,
* If relfilenode is unspecified by the caller then allocate a new
* one for this relation.
*/
- if (!OidIsValid(relfilenode))
- relfilenode = relid;
+ if (!RelNodeIsValid(relfilenode))
+ relfilenode = GetNewRelFileNode(reltablespace, relpersistence);
}
/*
@@ -912,7 +912,7 @@ InsertPgClassTuple(Relation pg_class_desc,
values[Anum_pg_class_reloftype - 1] = ObjectIdGetDatum(rd_rel->reloftype);
values[Anum_pg_class_relowner - 1] = ObjectIdGetDatum(rd_rel->relowner);
values[Anum_pg_class_relam - 1] = ObjectIdGetDatum(rd_rel->relam);
- values[Anum_pg_class_relfilenode - 1] = ObjectIdGetDatum(rd_rel->relfilenode);
+ values[Anum_pg_class_relfilenode - 1] = Int64GetDatum(rd_rel->relfilenode);
values[Anum_pg_class_reltablespace - 1] = ObjectIdGetDatum(rd_rel->reltablespace);
values[Anum_pg_class_relpages - 1] = Int32GetDatum(rd_rel->relpages);
values[Anum_pg_class_reltuples - 1] = Float4GetDatum(rd_rel->reltuples);
@@ -1129,7 +1129,7 @@ heap_create_with_catalog(const char *relname,
Oid new_type_oid;
/* By default set to InvalidOid unless overridden by binary-upgrade */
- Oid relfilenode = InvalidOid;
+ RelNode relfilenode = InvalidRelNode;
TransactionId relfrozenxid;
MultiXactId relminmxid;
@@ -1187,8 +1187,7 @@ heap_create_with_catalog(const char *relname,
/*
* Allocate an OID for the relation, unless we were told what to use.
*
- * The OID will be the relfilenode as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
+ * Make sure that the OID doesn't collide with other pg_class OIDs.
*/
if (!OidIsValid(relid))
{
@@ -1210,13 +1209,13 @@ heap_create_with_catalog(const char *relname,
relid = binary_upgrade_next_toast_pg_class_oid;
binary_upgrade_next_toast_pg_class_oid = InvalidOid;
- if (!OidIsValid(binary_upgrade_next_toast_pg_class_relfilenode))
+ if (!RelNodeIsValid(binary_upgrade_next_toast_pg_class_relfilenode))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("toast relfilenode value not set when in binary upgrade mode")));
relfilenode = binary_upgrade_next_toast_pg_class_relfilenode;
- binary_upgrade_next_toast_pg_class_relfilenode = InvalidOid;
+ binary_upgrade_next_toast_pg_class_relfilenode = InvalidRelNode;
}
}
else
@@ -1231,20 +1230,20 @@ heap_create_with_catalog(const char *relname,
if (RELKIND_HAS_STORAGE(relkind))
{
- if (!OidIsValid(binary_upgrade_next_heap_pg_class_relfilenode))
+ if (!RelNodeIsValid(binary_upgrade_next_heap_pg_class_relfilenode))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("relfilenode value not set when in binary upgrade mode")));
relfilenode = binary_upgrade_next_heap_pg_class_relfilenode;
- binary_upgrade_next_heap_pg_class_relfilenode = InvalidOid;
+ binary_upgrade_next_heap_pg_class_relfilenode = InvalidRelNode;
}
}
}
if (!OidIsValid(relid))
- relid = GetNewRelFileNode(reltablespace, pg_class_desc,
- relpersistence);
+ relid = GetNewOidWithIndex(pg_class_desc, ClassOidIndexId,
+ Anum_pg_class_oid);
}
/*
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 5e3fc2b..7a19f45 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -87,7 +87,7 @@
/* Potentially set by pg_upgrade_support functions */
Oid binary_upgrade_next_index_pg_class_oid = InvalidOid;
-Oid binary_upgrade_next_index_pg_class_relfilenode = InvalidOid;
+RelNode binary_upgrade_next_index_pg_class_relfilenode = InvalidRelNode;
/*
* Pointer-free representation of variables used when reindexing system
@@ -662,7 +662,7 @@ UpdateIndexRelation(Oid indexoid,
* parent index; otherwise InvalidOid.
* parentConstraintId: if creating a constraint on a partition, the OID
* of the constraint in the parent; otherwise InvalidOid.
- * relFileNode: normally, pass InvalidOid to get new storage. May be
+ * relFileNode: normally, pass InvalidRelNode to get new storage. May be
* nonzero to attach an existing valid build.
* indexInfo: same info executor uses to insert into the index
* indexColNames: column names to use for index (List of char *)
@@ -703,7 +703,7 @@ index_create(Relation heapRelation,
Oid indexRelationId,
Oid parentIndexRelid,
Oid parentConstraintId,
- Oid relFileNode,
+ RelNode relFileNode,
IndexInfo *indexInfo,
List *indexColNames,
Oid accessMethodObjectId,
@@ -735,7 +735,7 @@ index_create(Relation heapRelation,
char relkind;
TransactionId relfrozenxid;
MultiXactId relminmxid;
- bool create_storage = !OidIsValid(relFileNode);
+ bool create_storage = !RelNodeIsValid(relFileNode);
/* constraint flags can only be set when a constraint is requested */
Assert((constr_flags == 0) ||
@@ -902,8 +902,7 @@ index_create(Relation heapRelation,
/*
* Allocate an OID for the index, unless we were told what to use.
*
- * The OID will be the relfilenode as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
+ * Make sure it doesn't collide with other pg_class OIDs.
*/
if (!OidIsValid(indexRelationId))
{
@@ -920,12 +919,12 @@ index_create(Relation heapRelation,
/* Override the index relfilenode */
if ((relkind == RELKIND_INDEX) &&
- (!OidIsValid(binary_upgrade_next_index_pg_class_relfilenode)))
+ (!RelNodeIsValid(binary_upgrade_next_index_pg_class_relfilenode)))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("index relfilenode value not set when in binary upgrade mode")));
relFileNode = binary_upgrade_next_index_pg_class_relfilenode;
- binary_upgrade_next_index_pg_class_relfilenode = InvalidOid;
+ binary_upgrade_next_index_pg_class_relfilenode = InvalidRelNode;
/*
* Note that we want create_storage = true for binary upgrade.
@@ -936,8 +935,8 @@ index_create(Relation heapRelation,
}
else
{
- indexRelationId =
- GetNewRelFileNode(tableSpaceId, pg_class, relpersistence);
+ indexRelationId = GetNewOidWithIndex(pg_class, ClassOidIndexId,
+ Anum_pg_class_oid);
}
}
@@ -1408,7 +1407,7 @@ index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId,
InvalidOid, /* indexRelationId */
InvalidOid, /* parentIndexRelid */
InvalidOid, /* parentConstraintId */
- InvalidOid, /* relFileNode */
+ InvalidRelNode, /* relFileNode */
newInfo,
indexColNames,
indexRelation->rd_rel->relam,
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 02a7e94..0cee4c6 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -1005,9 +1005,9 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
reltup2;
Form_pg_class relform1,
relform2;
- Oid relfilenode1,
+ RelNode relfilenode1,
relfilenode2;
- Oid swaptemp;
+ RelNode swaptemp;
char swptmpchr;
/* We need writable copies of both pg_class tuples. */
@@ -1026,7 +1026,7 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
relfilenode1 = relform1->relfilenode;
relfilenode2 = relform2->relfilenode;
- if (OidIsValid(relfilenode1) && OidIsValid(relfilenode2))
+ if (RelNodeIsValid(relfilenode1) && RelNodeIsValid(relfilenode2))
{
/*
* Normal non-mapped relations: swap relfilenodes, reltablespaces,
@@ -1064,7 +1064,7 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
* Mapped-relation case. Here we have to swap the relation mappings
* instead of modifying the pg_class columns. Both must be mapped.
*/
- if (OidIsValid(relfilenode1) || OidIsValid(relfilenode2))
+ if (RelNodeIsValid(relfilenode1) || RelNodeIsValid(relfilenode2))
elog(ERROR, "cannot swap mapped relation \"%s\" with non-mapped relation",
NameStr(relform1->relname));
@@ -1093,11 +1093,11 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
* Fetch the mappings --- shouldn't fail, but be paranoid
*/
relfilenode1 = RelationMapOidToFilenode(r1, relform1->relisshared);
- if (!OidIsValid(relfilenode1))
+ if (!RelNodeIsValid(relfilenode1))
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
NameStr(relform1->relname), r1);
relfilenode2 = RelationMapOidToFilenode(r2, relform2->relisshared);
- if (!OidIsValid(relfilenode2))
+ if (!RelNodeIsValid(relfilenode2))
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
NameStr(relform2->relname), r2);
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 560dcc8..9ac827c 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -1086,7 +1086,7 @@ DefineIndex(Oid relationId,
* A valid stmt->oldNode implies that we already have a built form of the
* index. The caller should also decline any index build.
*/
- Assert(!OidIsValid(stmt->oldNode) || (skip_build && !concurrent));
+ Assert(!RelNodeIsValid(stmt->oldNode) || (skip_build && !concurrent));
/*
* Make the catalog entries for the index, including constraints. This
@@ -1316,7 +1316,7 @@ DefineIndex(Oid relationId,
childStmt->idxname = NULL;
childStmt->relation = NULL;
childStmt->indexOid = InvalidOid;
- childStmt->oldNode = InvalidOid;
+ childStmt->oldNode = InvalidRelNode;
childStmt->oldCreateSubid = InvalidSubTransactionId;
childStmt->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
@@ -2897,7 +2897,7 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
* particular this eliminates all shared catalogs.).
*/
if (RELKIND_HAS_STORAGE(classtuple->relkind) &&
- !OidIsValid(classtuple->relfilenode))
+ !RelNodeIsValid(classtuple->relfilenode))
skip_rel = true;
/*
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index ab592ce..aafca83 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -74,7 +74,7 @@ typedef struct sequence_magic
typedef struct SeqTableData
{
Oid relid; /* pg_class OID of this sequence (hash key) */
- Oid filenode; /* last seen relfilenode of this sequence */
+ RelNode filenode; /* last seen relfilenode of this sequence */
LocalTransactionId lxid; /* xact in which we last did a seq op */
bool last_valid; /* do we have a valid "last" value? */
int64 last; /* value last returned by nextval */
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 3e83f37..3f17f7d 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -3305,7 +3305,7 @@ CheckRelationTableSpaceMove(Relation rel, Oid newTableSpaceId)
void
SetRelationTableSpace(Relation rel,
Oid newTableSpaceId,
- Oid newRelFileNode)
+ RelNode newRelFileNode)
{
Relation pg_class;
HeapTuple tuple;
@@ -3325,7 +3325,7 @@ SetRelationTableSpace(Relation rel,
/* Update the pg_class row. */
rd_rel->reltablespace = (newTableSpaceId == MyDatabaseTableSpace) ?
InvalidOid : newTableSpaceId;
- if (OidIsValid(newRelFileNode))
+ if (RelNodeIsValid(newRelFileNode))
rd_rel->relfilenode = newRelFileNode;
CatalogTupleUpdate(pg_class, &tuple->t_self, tuple);
@@ -8573,7 +8573,7 @@ ATExecAddIndex(AlteredTableInfo *tab, Relation rel,
/* suppress schema rights check when rebuilding existing index */
check_rights = !is_rebuild;
/* skip index build if phase 3 will do it or we're reusing an old one */
- skip_build = tab->rewrite > 0 || OidIsValid(stmt->oldNode);
+ skip_build = tab->rewrite > 0 || RelNodeIsValid(stmt->oldNode);
/* suppress notices when rebuilding existing index */
quiet = is_rebuild;
@@ -8597,7 +8597,7 @@ ATExecAddIndex(AlteredTableInfo *tab, Relation rel,
* DROP of the old edition of this index will have scheduled the storage
* for deletion at commit, so cancel that pending deletion.
*/
- if (OidIsValid(stmt->oldNode))
+ if (RelNodeIsValid(stmt->oldNode))
{
Relation irel = index_open(address.objectId, NoLock);
@@ -14291,7 +14291,7 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
{
Relation rel;
Oid reltoastrelid;
- Oid newrelfilenode;
+ RelNode newrelfilenode;
RelFileNode newrnode;
List *reltoastidxids = NIL;
ListCell *lc;
@@ -14321,10 +14321,13 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
}
/*
- * Relfilenodes are not unique in databases across tablespaces, so we need
- * to allocate a new one in the new tablespace.
+ * Generate a new relfilenode.  Although relfilenodes are unique within a
+ * cluster, we cannot reuse the old one: dropped relation files are not
+ * unlinked until commit, so moving the relation back to its original
+ * tablespace within the same transaction would collide with the
+ * not-yet-unlinked file.
*/
- newrelfilenode = GetNewRelFileNode(newTableSpace, NULL,
+ newrelfilenode = GetNewRelFileNode(newTableSpace,
rel->rd_rel->relpersistence);
/* Open old and new relation */
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 6bdad46..7372fc0 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2771,7 +2771,7 @@ _outIndexStmt(StringInfo str, const IndexStmt *node)
WRITE_NODE_FIELD(excludeOpNames);
WRITE_STRING_FIELD(idxcomment);
WRITE_OID_FIELD(indexOid);
- WRITE_OID_FIELD(oldNode);
+ WRITE_UINT64_FIELD(oldNode);
WRITE_UINT_FIELD(oldCreateSubid);
WRITE_UINT_FIELD(oldFirstRelfilenodeSubid);
WRITE_BOOL_FIELD(unique);
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index 99efa26..4b6b2ca 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -1577,7 +1577,7 @@ generateClonedIndexStmt(RangeVar *heapRel, Relation source_idx,
index->excludeOpNames = NIL;
index->idxcomment = NULL;
index->indexOid = InvalidOid;
- index->oldNode = InvalidOid;
+ index->oldNode = InvalidRelNode;
index->oldCreateSubid = InvalidSubTransactionId;
index->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
index->unique = idxrec->indisunique;
@@ -2200,7 +2200,7 @@ transformIndexConstraint(Constraint *constraint, CreateStmtContext *cxt)
index->excludeOpNames = NIL;
index->idxcomment = NULL;
index->indexOid = InvalidOid;
- index->oldNode = InvalidOid;
+ index->oldNode = InvalidRelNode;
index->oldCreateSubid = InvalidSubTransactionId;
index->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
index->transformed = false;
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 18cf931..ffd89b6 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -156,6 +156,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
break;
case XLOG_NOOP:
case XLOG_NEXTOID:
+ case XLOG_NEXT_RELFILENODE:
case XLOG_SWITCH:
case XLOG_BACKUP_END:
case XLOG_PARAMETER_CHANGE:
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index c2d9be8..8b228b8 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -5268,7 +5268,7 @@ DisplayMapping(HTAB *tuplecid_data)
hash_seq_init(&hstat, tuplecid_data);
while ((ent = (ReorderBufferTupleCidEnt *) hash_seq_search(&hstat)) != NULL)
{
- elog(DEBUG3, "mapping: node: %u/%u/%u tid: %u/%u cmin: %u, cmax: %u",
+ elog(DEBUG3, "mapping: node: %u/%u/" INT64_FORMAT " tid: %u/%u cmin: %u, cmax: %u",
ent->key.relnode.dbNode,
ent->key.relnode.spcNode,
ent->key.relnode.relNode,
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 5014fe6..42f551d 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1994,7 +1994,7 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
item->tsId = bufHdr->tag.spcOid;
- item->relNode = bufHdr->tag.fileNode;
+ item->relNode = BufTagGetFileNode(bufHdr->tag);
item->forkNum = bufHdr->tag.forkNum;
item->blockNum = bufHdr->tag.blockNum;
}
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index d165b35..3c0c88d 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -273,7 +273,7 @@ restart:
BlockNumber blknum;
BufferGetTag(buf, &rnode, &forknum, &blknum);
- elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
+ elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/" INT64_FORMAT,
blknum, rnode.spcNode, rnode.dbNode, rnode.relNode);
/* make sure we hold an exclusive lock */
diff --git a/src/backend/storage/lmgr/lwlocknames.txt b/src/backend/storage/lmgr/lwlocknames.txt
index 6c7cf6c..1eb6d78 100644
--- a/src/backend/storage/lmgr/lwlocknames.txt
+++ b/src/backend/storage/lmgr/lwlocknames.txt
@@ -53,3 +53,4 @@ XactTruncationLock 44
# 45 was XactTruncationLock until removal of BackendRandomLock
WrapLimitsVacuumLock 46
NotifyQueueTailLock 47
+RelNodeGenLock 48
\ No newline at end of file
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index d71a557..a550823 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -156,7 +156,7 @@ smgropen(RelFileNode rnode, BackendId backend)
/* First time through: initialize the hash table */
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileNodeBackend);
+ ctl.keysize = SizeOfRelFileNodeBackend;
ctl.entrysize = sizeof(SMgrRelationData);
SMgrRelationHash = hash_create("smgr relation table", 400,
&ctl, HASH_ELEM | HASH_BLOBS);
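The keysize change above matters because HASH_BLOBS hashes the raw key bytes: once relNode is a 64-bit field, RelFileNodeBackend can end in compiler padding, and the key size must cover only the defined fields. A hypothetical layout mirroring (not reproducing) the patched headers:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in layout; the field names follow the patch, but this is
 * not the actual definition of RelFileNodeBackend. */
typedef uint64_t RelNode;
typedef int BackendId;

typedef struct RelFileNodeBackendSketch
{
    unsigned  spcNode;          /* tablespace OID */
    unsigned  dbNode;           /* database OID */
    RelNode   relNode;          /* 56-bit relfilenode in a 64-bit field */
    BackendId backend;          /* owning backend, or invalid */
} RelFileNodeBackendSketch;

/* Key size up to and including the last defined field, excluding any
 * trailing padding the compiler adds for 8-byte struct alignment. */
#define SizeOfRelFileNodeBackendSketch \
    (offsetof(RelFileNodeBackendSketch, backend) + sizeof(BackendId))
```

On a typical LP64 target sizeof() rounds the struct up to a multiple of 8, so hashing sizeof() bytes would mix uninitialized padding into the hash key; the offsetof-based size avoids that.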
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 3a2f2e1..9a8d6a5 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -850,7 +850,7 @@ Datum
pg_relation_filenode(PG_FUNCTION_ARGS)
{
Oid relid = PG_GETARG_OID(0);
- Oid result;
+ RelNode result;
HeapTuple tuple;
Form_pg_class relform;
@@ -870,15 +870,15 @@ pg_relation_filenode(PG_FUNCTION_ARGS)
else
{
/* no storage, return NULL */
- result = InvalidOid;
+ result = InvalidRelNode;
}
ReleaseSysCache(tuple);
- if (!OidIsValid(result))
+ if (!RelNodeIsValid(result))
PG_RETURN_NULL();
- PG_RETURN_OID(result);
+ PG_RETURN_INT64(result);
}
/*
@@ -898,11 +898,11 @@ Datum
pg_filenode_relation(PG_FUNCTION_ARGS)
{
Oid reltablespace = PG_GETARG_OID(0);
- Oid relfilenode = PG_GETARG_OID(1);
+ RelNode relfilenode = PG_GETARG_INT64(1);
Oid heaprel;
/* test needed so RelidByRelfilenode doesn't misbehave */
- if (!OidIsValid(relfilenode))
+ if (!RelNodeIsValid(relfilenode))
PG_RETURN_NULL();
heaprel = RelidByRelfilenode(reltablespace, relfilenode);
@@ -953,13 +953,13 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
else
{
/* no storage, return NULL */
- rnode.relNode = InvalidOid;
+ rnode.relNode = InvalidRelNode;
/* some compilers generate warnings without these next two lines */
rnode.dbNode = InvalidOid;
rnode.spcNode = InvalidOid;
}
- if (!OidIsValid(rnode.relNode))
+ if (!RelNodeIsValid(rnode.relNode))
{
ReleaseSysCache(tuple);
PG_RETURN_NULL();
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 67b9675e..568ff1f 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -98,10 +98,10 @@ binary_upgrade_set_next_heap_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_heap_relfilenode(PG_FUNCTION_ARGS)
{
- Oid nodeoid = PG_GETARG_OID(0);
+ RelNode relnode = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
- binary_upgrade_next_heap_pg_class_relfilenode = nodeoid;
+ binary_upgrade_next_heap_pg_class_relfilenode = relnode;
PG_RETURN_VOID();
}
@@ -120,10 +120,10 @@ binary_upgrade_set_next_index_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_index_relfilenode(PG_FUNCTION_ARGS)
{
- Oid nodeoid = PG_GETARG_OID(0);
+ RelNode relnode = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
- binary_upgrade_next_index_pg_class_relfilenode = nodeoid;
+ binary_upgrade_next_index_pg_class_relfilenode = relnode;
PG_RETURN_VOID();
}
@@ -142,10 +142,10 @@ binary_upgrade_set_next_toast_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_toast_relfilenode(PG_FUNCTION_ARGS)
{
- Oid nodeoid = PG_GETARG_OID(0);
+ RelNode relnode = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
- binary_upgrade_next_toast_pg_class_relfilenode = nodeoid;
+ binary_upgrade_next_toast_pg_class_relfilenode = relnode;
PG_RETURN_VOID();
}
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 2707fed..515bd44 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1343,7 +1343,7 @@ RelationInitPhysicalAddr(Relation relation)
relation->rd_node.relNode =
RelationMapOidToFilenode(relation->rd_id,
relation->rd_rel->relisshared);
- if (!OidIsValid(relation->rd_node.relNode))
+ if (!RelNodeIsValid(relation->rd_node.relNode))
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
RelationGetRelationName(relation), relation->rd_id);
}
@@ -1958,13 +1958,13 @@ formrdesc(const char *relationName, Oid relationReltype,
/*
* All relations made with formrdesc are mapped. This is necessarily so
* because there is no other way to know what filenode they currently
- * have. In bootstrap mode, add them to the initial relation mapper data,
- * specifying that the initial filenode is the same as the OID.
+ * have. In bootstrap mode, generate a new relfilenode and add them to the
+ * initial relation mapper data.
*/
- relation->rd_rel->relfilenode = InvalidOid;
+ relation->rd_rel->relfilenode = InvalidRelNode;
if (IsBootstrapProcessingMode())
RelationMapUpdateMap(RelationGetRelid(relation),
- RelationGetRelid(relation),
+ GetNewRelNode(),
isshared, true);
/*
@@ -3434,7 +3434,7 @@ RelationBuildLocalRelation(const char *relname,
TupleDesc tupDesc,
Oid relid,
Oid accessmtd,
- Oid relfilenode,
+ RelNode relfilenode,
Oid reltablespace,
bool shared_relation,
bool mapped_relation,
@@ -3605,7 +3605,7 @@ RelationBuildLocalRelation(const char *relname,
if (mapped_relation)
{
- rel->rd_rel->relfilenode = InvalidOid;
+ rel->rd_rel->relfilenode = InvalidRelNode;
/* Add it to the active mapping information */
RelationMapUpdateMap(relid, relfilenode, shared_relation, true);
}
@@ -3674,7 +3674,7 @@ RelationBuildLocalRelation(const char *relname,
void
RelationSetNewRelfilenode(Relation relation, char persistence)
{
- Oid newrelfilenode;
+ RelNode newrelfilenode;
Relation pg_class;
HeapTuple tuple;
Form_pg_class classform;
@@ -3683,7 +3683,7 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
RelFileNode newrnode;
/* Allocate a new relfilenode */
- newrelfilenode = GetNewRelFileNode(relation->rd_rel->reltablespace, NULL,
+ newrelfilenode = GetNewRelFileNode(relation->rd_rel->reltablespace,
persistence);
/*
diff --git a/src/backend/utils/cache/relfilenodemap.c b/src/backend/utils/cache/relfilenodemap.c
index 70c323c..4d3e068 100644
--- a/src/backend/utils/cache/relfilenodemap.c
+++ b/src/backend/utils/cache/relfilenodemap.c
@@ -37,7 +37,7 @@ static ScanKeyData relfilenode_skey[2];
typedef struct
{
Oid reltablespace;
- Oid relfilenode;
+ RelNode relfilenode;
} RelfilenodeMapKey;
typedef struct
@@ -135,7 +135,7 @@ InitializeRelfilenodeMap(void)
* Returns InvalidOid if no relation matching the criteria could be found.
*/
Oid
-RelidByRelfilenode(Oid reltablespace, Oid relfilenode)
+RelidByRelfilenode(Oid reltablespace, RelNode relfilenode)
{
RelfilenodeMapKey key;
RelfilenodeMapEntry *entry;
@@ -196,7 +196,7 @@ RelidByRelfilenode(Oid reltablespace, Oid relfilenode)
/* set scan arguments */
skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
- skey[1].sk_argument = ObjectIdGetDatum(relfilenode);
+ skey[1].sk_argument = Int64GetDatum(relfilenode);
scandesc = systable_beginscan(relation,
ClassTblspcRelfilenodeIndexId,
@@ -213,7 +213,7 @@ RelidByRelfilenode(Oid reltablespace, Oid relfilenode)
if (found)
elog(ERROR,
- "unexpected duplicate for tablespace %u, relfilenode %u",
+ "unexpected duplicate for tablespace %u, relfilenode " INT64_FORMAT,
reltablespace, relfilenode);
found = true;
diff --git a/src/backend/utils/cache/relmapper.c b/src/backend/utils/cache/relmapper.c
index 4f6811f..1a637b0 100644
--- a/src/backend/utils/cache/relmapper.c
+++ b/src/backend/utils/cache/relmapper.c
@@ -79,7 +79,7 @@
typedef struct RelMapping
{
Oid mapoid; /* OID of a catalog */
- Oid mapfilenode; /* its filenode number */
+ RelNode mapfilenode; /* its filenode number */
} RelMapping;
typedef struct RelMapFile
@@ -132,7 +132,7 @@ static RelMapFile pending_local_updates;
/* non-export function prototypes */
-static void apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode,
+static void apply_map_update(RelMapFile *map, Oid relationId, RelNode fileNode,
bool add_okay);
static void merge_map_updates(RelMapFile *map, const RelMapFile *updates,
bool add_okay);
@@ -155,7 +155,7 @@ static void perform_relmap_update(bool shared, const RelMapFile *updates);
* Returns InvalidRelNode if the OID is not known (which should never happen,
* but the caller is in a better position to report a meaningful error).
*/
-Oid
+RelNode
RelationMapOidToFilenode(Oid relationId, bool shared)
{
const RelMapFile *map;
@@ -193,7 +193,7 @@ RelationMapOidToFilenode(Oid relationId, bool shared)
}
}
- return InvalidOid;
+ return InvalidRelNode;
}
/*
@@ -209,7 +209,7 @@ RelationMapOidToFilenode(Oid relationId, bool shared)
* relfilenode doesn't pertain to a mapped relation.
*/
Oid
-RelationMapFilenodeToOid(Oid filenode, bool shared)
+RelationMapFilenodeToOid(RelNode filenode, bool shared)
{
const RelMapFile *map;
int32 i;
@@ -258,7 +258,7 @@ RelationMapFilenodeToOid(Oid filenode, bool shared)
* immediately. Otherwise it is made pending until CommandCounterIncrement.
*/
void
-RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
+RelationMapUpdateMap(Oid relationId, RelNode fileNode, bool shared,
bool immediate)
{
RelMapFile *map;
@@ -316,7 +316,8 @@ RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
* add_okay = false to draw an error if not.
*/
static void
-apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode, bool add_okay)
+apply_map_update(RelMapFile *map, Oid relationId, RelNode fileNode,
+ bool add_okay)
{
int32 i;
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index 781f8b8..85ed88c 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -79,8 +79,8 @@ pg_control_system(PG_FUNCTION_ARGS)
Datum
pg_control_checkpoint(PG_FUNCTION_ARGS)
{
- Datum values[18];
- bool nulls[18];
+ Datum values[19];
+ bool nulls[19];
TupleDesc tupdesc;
HeapTuple htup;
ControlFileData *ControlFile;
@@ -129,6 +129,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
XIDOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 18, "checkpoint_time",
TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 19, "next_relfilenode",
+ INT8OID, -1, 0);
tupdesc = BlessTupleDesc(tupdesc);
/* Read the control file. */
@@ -202,6 +204,9 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
values[17] = TimestampTzGetDatum(time_t_to_timestamptz(ControlFile->checkPointCopy.time));
nulls[17] = false;
+ values[18] = Int64GetDatum(ControlFile->checkPointCopy.nextRelNode);
+ nulls[18] = false;
+
htup = heap_form_tuple(tupdesc, values, nulls);
PG_RETURN_DATUM(HeapTupleGetDatum(htup));
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 7e69475..94ec594 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -520,9 +520,9 @@ main(int argc, char *argv[])
mode = PG_MODE_ENABLE;
break;
case 'f':
- if (!option_parse_int(optarg, "-f/--filenode", 0,
- INT_MAX,
- NULL))
+ if (!option_parse_int64(optarg, "-f/--filenode", 0,
+ LLONG_MAX,
+ NULL))
exit(1);
only_filenode = pstrdup(optarg);
break;
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index f911f98..2513fc3 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -250,6 +250,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
EpochFromFullTransactionId(ControlFile->checkPointCopy.nextXid),
XidFromFullTransactionId(ControlFile->checkPointCopy.nextXid));
+ printf(_("Latest checkpoint's NextRelFileNode: " INT64_FORMAT "\n"),
+ ControlFile->checkPointCopy.nextRelNode);
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 4485ea8..d1b0eb9 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4658,12 +4658,12 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
{
PQExpBuffer upgrade_query = createPQExpBuffer();
PGresult *upgrade_res;
- Oid relfilenode;
+ RelNode relfilenode;
Oid toast_oid;
- Oid toast_relfilenode;
+ RelNode toast_relfilenode;
char relkind;
Oid toast_index_oid;
- Oid toast_index_relfilenode;
+ RelNode toast_index_relfilenode;
/*
* Preserve the OID and relfilenode of the table, table's index, table's
@@ -4689,16 +4689,16 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
relkind = *PQgetvalue(upgrade_res, 0, PQfnumber(upgrade_res, "relkind"));
- relfilenode = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "relfilenode")));
+ relfilenode = atorelnode(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "relfilenode")));
toast_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "reltoastrelid")));
- toast_relfilenode = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_relfilenode")));
+ toast_relfilenode = atorelnode(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_relfilenode")));
toast_index_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "indexrelid")));
- toast_index_relfilenode = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_index_relfilenode")));
+ toast_index_relfilenode = atorelnode(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_index_relfilenode")));
appendPQExpBufferStr(upgrade_buffer,
"\n-- For binary upgrade, must preserve pg_class oids and relfilenodes\n");
@@ -4714,9 +4714,9 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
* partitioned tables have a relfilenode, which should not be preserved
* when upgrading.
*/
- if (OidIsValid(relfilenode) && relkind != RELKIND_PARTITIONED_TABLE)
+ if (RelNodeIsValid(relfilenode) && relkind != RELKIND_PARTITIONED_TABLE)
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenode);
/*
@@ -4730,7 +4730,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_toast_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_relfilenode);
/* every toast table has an index */
@@ -4738,7 +4738,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_index_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_index_relfilenode);
}
@@ -4751,7 +4751,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
pg_class_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenode);
}
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 7211090..2674b00 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -535,11 +535,11 @@ isRelDataFile(const char *path)
*/
rnode.spcNode = InvalidOid;
rnode.dbNode = InvalidOid;
- rnode.relNode = InvalidOid;
+ rnode.relNode = InvalidRelNode;
segNo = 0;
matched = false;
- nmatch = sscanf(path, "global/%u.%u", &rnode.relNode, &segNo);
+ nmatch = sscanf(path, "global/" INT64_FORMAT ".%u", &rnode.relNode, &segNo);
if (nmatch == 1 || nmatch == 2)
{
rnode.spcNode = GLOBALTABLESPACE_OID;
@@ -548,7 +548,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "base/%u/%u.%u",
+ nmatch = sscanf(path, "base/%u/" INT64_FORMAT ".%u",
&rnode.dbNode, &rnode.relNode, &segNo);
if (nmatch == 2 || nmatch == 3)
{
@@ -557,7 +557,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/%u.%u",
+ nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/" INT64_FORMAT ".%u",
&rnode.spcNode, &rnode.dbNode, &rnode.relNode,
&segNo);
if (nmatch == 3 || nmatch == 4)
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 69ef231..d3c5d53 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -383,8 +383,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
i_reloid,
i_indtable,
i_toastheap,
- i_relfilenode,
i_reltablespace;
+ int i_relfilenode;
char query[QUERY_ALLOC];
char *last_namespace = NULL,
*last_tablespace = NULL;
@@ -511,7 +511,7 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
relname = PQgetvalue(res, relnum, i_relname);
curr->relname = pg_strdup(relname);
- curr->relfilenode = atooid(PQgetvalue(res, relnum, i_relfilenode));
+ curr->relfilenode = atorelnode(PQgetvalue(res, relnum, i_relfilenode));
curr->tblsp_alloc = false;
/* Is the tablespace oid non-default? */
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index f66bbd5..c8bbedb 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -15,10 +15,8 @@
* oids are the same between old and new clusters. This is important
* because toast oids are stored as toast pointers in user tables.
*
- * While pg_class.oid and pg_class.relfilenode are initially the same in a
- * cluster, they can diverge due to CLUSTER, REINDEX, or VACUUM FULL. We
- * control assignments of pg_class.relfilenode because we want the filenames
- * to match between the old and new cluster.
+ * We control assignments of pg_class.relfilenode because we want the
+ * filenames to match between the old and new cluster.
*
* We control assignment of pg_tablespace.oid because we want the oid to match
* between the old and new cluster.
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 0aca0a7..13975b7 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -130,7 +130,7 @@ typedef struct
char *nspname; /* namespace name */
char *relname; /* relation name */
Oid reloid; /* relation OID */
- Oid relfilenode; /* relation file node */
+ RelNode relfilenode; /* relation file node */
Oid indtable; /* if index, OID of its table, else 0 */
Oid toastheap; /* if toast table, OID of base table, else 0 */
char *tablespace; /* tablespace path; "" for cluster default */
@@ -154,7 +154,7 @@ typedef struct
const char *old_tablespace_suffix;
const char *new_tablespace_suffix;
Oid db_oid;
- Oid relfilenode;
+ RelNode relfilenode;
/* the rest are used only for logging and error reporting */
char *nspname; /* namespaces */
char *relname;
diff --git a/src/bin/pg_upgrade/relfilenode.c b/src/bin/pg_upgrade/relfilenode.c
index 2f4deb3..10e6a6c 100644
--- a/src/bin/pg_upgrade/relfilenode.c
+++ b/src/bin/pg_upgrade/relfilenode.c
@@ -190,14 +190,14 @@ transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_fro
else
snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
- snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
+ snprintf(old_file, sizeof(old_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->old_tablespace,
map->old_tablespace_suffix,
map->db_oid,
map->relfilenode,
type_suffix,
extent_suffix);
- snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
+ snprintf(new_file, sizeof(new_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->new_tablespace,
map->new_tablespace_suffix,
map->db_oid,
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index a6251e1..54c3da7 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -518,13 +518,13 @@ XLogDumpDisplayRecord(XLogDumpConfig *config, XLogReaderState *record)
XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blk);
if (forknum != MAIN_FORKNUM)
- printf(", blkref #%d: rel %u/%u/%u fork %s blk %u",
+ printf(", blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rnode.spcNode, rnode.dbNode, rnode.relNode,
forkNames[forknum],
blk);
else
- printf(", blkref #%d: rel %u/%u/%u blk %u",
+ printf(", blkref #%d: rel %u/%u/" INT64_FORMAT " blk %u",
block_id,
rnode.spcNode, rnode.dbNode, rnode.relNode,
blk);
@@ -548,7 +548,7 @@ XLogDumpDisplayRecord(XLogDumpConfig *config, XLogReaderState *record)
continue;
XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blk);
- printf("\tblkref #%d: rel %u/%u/%u fork %s blk %u",
+ printf("\tblkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rnode.spcNode, rnode.dbNode, rnode.relNode,
forkNames[forknum],
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 636c96e..27b8547 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -138,7 +138,7 @@ GetDatabasePath(Oid dbNode, Oid spcNode)
* the trouble considering BackendId is just int anyway.
*/
char *
-GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
+GetRelationPath(Oid dbNode, Oid spcNode, RelNode relNode,
int backendId, ForkNumber forkNumber)
{
char *path;
@@ -149,10 +149,10 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
Assert(dbNode == 0);
Assert(backendId == InvalidBackendId);
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("global/%u_%s",
+ path = psprintf("global/" INT64_FORMAT "_%s",
relNode, forkNames[forkNumber]);
else
- path = psprintf("global/%u", relNode);
+ path = psprintf("global/" INT64_FORMAT, relNode);
}
else if (spcNode == DEFAULTTABLESPACE_OID)
{
@@ -160,21 +160,21 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/%u_%s",
+ path = psprintf("base/%u/" INT64_FORMAT "_%s",
dbNode, relNode,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/%u",
+ path = psprintf("base/%u/" INT64_FORMAT,
dbNode, relNode);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/t%d_%u_%s",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT "_%s",
dbNode, backendId, relNode,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/t%d_%u",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT,
dbNode, backendId, relNode);
}
}
@@ -184,24 +184,24 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT "_%s",
spcNode, TABLESPACE_VERSION_DIRECTORY,
dbNode, relNode,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT,
spcNode, TABLESPACE_VERSION_DIRECTORY,
dbNode, relNode);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT "_%s",
spcNode, TABLESPACE_VERSION_DIRECTORY,
dbNode, backendId, relNode,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT,
spcNode, TABLESPACE_VERSION_DIRECTORY,
dbNode, backendId, relNode);
}
diff --git a/src/fe_utils/option_utils.c b/src/fe_utils/option_utils.c
index abea881..2cb3370 100644
--- a/src/fe_utils/option_utils.c
+++ b/src/fe_utils/option_utils.c
@@ -82,3 +82,45 @@ option_parse_int(const char *optarg, const char *optname,
*result = val;
return true;
}
+
+/*
+ * option_parse_int64
+ *
+ * Same as option_parse_int but parse int64.
+ */
+bool
+option_parse_int64(const char *optarg, const char *optname,
+ int64 min_range, int64 max_range,
+ int64 *result)
+{
+ char *endptr;
+ int64 val;
+
+ errno = 0;
+ val = strtoi64(optarg, &endptr, 10);
+
+ /*
+ * Skip any trailing whitespace; if anything but whitespace remains before
+ * the terminating character, fail.
+ */
+ while (*endptr != '\0' && isspace((unsigned char) *endptr))
+ endptr++;
+
+ if (*endptr != '\0')
+ {
+ pg_log_error("invalid value \"%s\" for option %s",
+ optarg, optname);
+ return false;
+ }
+
+ if (errno == ERANGE || val < min_range || val > max_range)
+ {
+ pg_log_error("%s must be in range " INT64_FORMAT ".." INT64_FORMAT,
+ optname, min_range, max_range);
+ return false;
+ }
+
+ if (result)
+ *result = val;
+ return true;
+}
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 9a2816d..8113335 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -217,6 +217,9 @@ typedef struct VariableCacheData
*/
Oid nextOid; /* next OID to assign */
uint32 oidCount; /* OIDs available before must do XLOG work */
+ RelNode nextRelNode; /* next relfilenode to assign */
+ uint32 relnodecount; /* relfilenodes available before we must do
+ XLOG work */
/*
* These fields are protected by XidGenLock.
@@ -298,6 +301,7 @@ extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
extern bool ForceTransactionIdLimitUpdate(void);
extern Oid GetNewObjectId(void);
+extern RelNode GetNewRelNode(void);
extern void StopGeneratingPinnedObjectIds(void);
#ifdef USE_ASSERT_CHECKING
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 4b45ac6..cd5ab2d 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -233,6 +233,7 @@ extern bool CreateRestartPoint(int flags);
extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN);
extern XLogRecPtr CalculateMaxmumSafeLSN(void);
extern void XLogPutNextOid(Oid nextOid);
+extern void XLogPutNextRelFileNode(RelNode nextrelnode);
extern XLogRecPtr XLogRestorePoint(const char *rpName);
extern void UpdateFullPageWrites(void);
extern void GetFullPageWriteInfo(XLogRecPtr *RedoRecPtr_p, bool *doPageWrites_p);
diff --git a/src/include/catalog/binary_upgrade.h b/src/include/catalog/binary_upgrade.h
index 0b6944b..401bfa2 100644
--- a/src/include/catalog/binary_upgrade.h
+++ b/src/include/catalog/binary_upgrade.h
@@ -22,11 +22,11 @@ extern PGDLLIMPORT Oid binary_upgrade_next_mrng_pg_type_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_mrng_array_pg_type_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_heap_pg_class_oid;
-extern PGDLLIMPORT Oid binary_upgrade_next_heap_pg_class_relfilenode;
+extern PGDLLIMPORT RelNode binary_upgrade_next_heap_pg_class_relfilenode;
extern PGDLLIMPORT Oid binary_upgrade_next_index_pg_class_oid;
-extern PGDLLIMPORT Oid binary_upgrade_next_index_pg_class_relfilenode;
+extern PGDLLIMPORT RelNode binary_upgrade_next_index_pg_class_relfilenode;
extern PGDLLIMPORT Oid binary_upgrade_next_toast_pg_class_oid;
-extern PGDLLIMPORT Oid binary_upgrade_next_toast_pg_class_relfilenode;
+extern PGDLLIMPORT RelNode binary_upgrade_next_toast_pg_class_relfilenode;
extern PGDLLIMPORT Oid binary_upgrade_next_pg_enum_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_pg_authid_oid;
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index 60c1215..1b83c79 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -15,6 +15,7 @@
#define CATALOG_H
#include "catalog/pg_class.h"
+#include "storage/relfilenode.h"
#include "utils/relcache.h"
@@ -38,7 +39,6 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
-extern Oid GetNewRelFileNode(Oid reltablespace, Relation pg_class,
- char relpersistence);
+extern RelNode GetNewRelFileNode(Oid reltablespace, char relpersistence);
#endif /* CATALOG_H */
diff --git a/src/include/catalog/heap.h b/src/include/catalog/heap.h
index c4757bd..66d41af 100644
--- a/src/include/catalog/heap.h
+++ b/src/include/catalog/heap.h
@@ -50,7 +50,7 @@ extern Relation heap_create(const char *relname,
Oid relnamespace,
Oid reltablespace,
Oid relid,
- Oid relfilenode,
+ RelNode relfilenode,
Oid accessmtd,
TupleDesc tupDesc,
char relkind,
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index a1d6e3b..1e79ec9 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -71,7 +71,7 @@ extern Oid index_create(Relation heapRelation,
Oid indexRelationId,
Oid parentIndexRelid,
Oid parentConstraintId,
- Oid relFileNode,
+ RelNode relFileNode,
IndexInfo *indexInfo,
List *indexColNames,
Oid accessMethodObjectId,
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index 304e8c1..4659ed3 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -52,13 +52,13 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* access method; 0 if not a table / index */
Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
- /* identifier of physical storage file */
- /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
- Oid relfilenode BKI_DEFAULT(0);
-
/* identifier of table space for relation (0 means default for database) */
Oid reltablespace BKI_DEFAULT(0) BKI_LOOKUP_OPT(pg_tablespace);
+ /* identifier of physical storage file */
+ /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
+ int64 relfilenode BKI_DEFAULT(0);
+
/* # of blocks (not always up-to-date) */
int32 relpages BKI_DEFAULT(0);
@@ -154,7 +154,7 @@ typedef FormData_pg_class *Form_pg_class;
DECLARE_UNIQUE_INDEX_PKEY(pg_class_oid_index, 2662, ClassOidIndexId, on pg_class using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_class_relname_nsp_index, 2663, ClassNameNspIndexId, on pg_class using btree(relname name_ops, relnamespace oid_ops));
-DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode oid_ops));
+DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode int8_ops));
#ifdef EXPOSE_TO_CLIENT_CODE
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 1f3dc24..27d584d 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -41,6 +41,7 @@ typedef struct CheckPoint
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
FullTransactionId nextXid; /* next free transaction ID */
+ RelNode nextRelNode; /* next relfile node */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
@@ -78,6 +79,7 @@ typedef struct CheckPoint
#define XLOG_FPI 0xB0
/* 0xC0 is used in Postgres 9.5-11 */
#define XLOG_OVERWRITE_CONTRECORD 0xD0
+#define XLOG_NEXT_RELFILENODE 0xE0
/*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 7f1ee97..c0f0d74 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7287,11 +7287,11 @@
proname => 'pg_indexes_size', provolatile => 'v', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_indexes_size' },
{ oid => '2999', descr => 'filenode identifier of relation',
- proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'oid',
+ proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_relation_filenode' },
{ oid => '3454', descr => 'relation OID for filenode and tablespace',
proname => 'pg_filenode_relation', provolatile => 's',
- prorettype => 'regclass', proargtypes => 'oid oid',
+ prorettype => 'regclass', proargtypes => 'oid int8',
prosrc => 'pg_filenode_relation' },
{ oid => '3034', descr => 'file path of relation',
proname => 'pg_relation_filepath', provolatile => 's', prorettype => 'text',
@@ -11059,15 +11059,15 @@
prosrc => 'binary_upgrade_set_missing_value' },
{ oid => '4545', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_heap_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_heap_relfilenode' },
{ oid => '4546', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_index_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_index_relfilenode' },
{ oid => '4547', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_toast_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_toast_relfilenode' },
{ oid => '4548', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_pg_tablespace_oid', provolatile => 'v',
diff --git a/src/include/commands/tablecmds.h b/src/include/commands/tablecmds.h
index 5d4037f..297c20b 100644
--- a/src/include/commands/tablecmds.h
+++ b/src/include/commands/tablecmds.h
@@ -66,7 +66,7 @@ extern void SetRelationHasSubclass(Oid relationId, bool relhassubclass);
extern bool CheckRelationTableSpaceMove(Relation rel, Oid newTableSpaceId);
extern void SetRelationTableSpace(Relation rel, Oid newTableSpaceId,
- Oid newRelFileNode);
+ RelNode newRelFileNode);
extern ObjectAddress renameatt(RenameStmt *stmt);
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index a4b5dc8..d6d6215 100644
--- a/src/include/common/relpath.h
+++ b/src/include/common/relpath.h
@@ -66,7 +66,7 @@ extern int forkname_chars(const char *str, ForkNumber *fork);
*/
extern char *GetDatabasePath(Oid dbNode, Oid spcNode);
-extern char *GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
+extern char *GetRelationPath(Oid dbNode, Oid spcNode, RelNode relNode,
int backendId, ForkNumber forkNumber);
/*
diff --git a/src/include/fe_utils/option_utils.h b/src/include/fe_utils/option_utils.h
index 03c09fd..8c0e818 100644
--- a/src/include/fe_utils/option_utils.h
+++ b/src/include/fe_utils/option_utils.h
@@ -22,5 +22,8 @@ extern void handle_help_version_opts(int argc, char *argv[],
extern bool option_parse_int(const char *optarg, const char *optname,
int min_range, int max_range,
int *result);
+extern bool option_parse_int64(const char *optarg, const char *optname,
+ int64 min_range, int64 max_range,
+ int64 *result);
#endif /* OPTION_UTILS_H */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 34218b7..6f7bd0f 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -2901,7 +2901,7 @@ typedef struct IndexStmt
List *excludeOpNames; /* exclusion operator names, or NIL if none */
char *idxcomment; /* comment to apply to index, or NULL */
Oid indexOid; /* OID of an existing index, if any */
- Oid oldNode; /* relfilenode of existing storage, if any */
+ RelNode oldNode; /* relfilenode of existing storage, if any */
SubTransactionId oldCreateSubid; /* rd_createSubid of oldNode */
SubTransactionId oldFirstRelfilenodeSubid; /* rd_firstRelfilenodeSubid of
* oldNode */
diff --git a/src/include/postgres_ext.h b/src/include/postgres_ext.h
index fdb61b7..7454933 100644
--- a/src/include/postgres_ext.h
+++ b/src/include/postgres_ext.h
@@ -46,6 +46,21 @@ typedef unsigned int Oid;
/* Define a signed 64-bit integer type for use in client API declarations. */
typedef PG_INT64_TYPE pg_int64;
+/*
+ * The RelNode data type identifies a specific relation file. A RelNode
+ * is unique within a cluster.
+ *
+ * XXX ideally this would be uint64, but currently int8 is the only
+ * exposed datatype, so perhaps we should add a new datatype "relnode"
+ * that is an 8-byte unsigned integer.
+ */
+typedef pg_int64 RelNode;
+
+#define atorelnode(x) ((RelNode) strtoll((x), NULL, 10))
+
+#define InvalidRelNode ((RelNode) 0)
+#define FirstNormalRelNode ((RelNode) 1)
+#define RelNodeIsValid(relNode) ((bool) ((relNode) != InvalidRelNode))
/*
* Identifiers of error message fields. Kept here to keep common
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 0286d51..6e940a6 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -21,6 +21,7 @@
#include "storage/condition_variable.h"
#include "storage/latch.h"
#include "storage/lwlock.h"
+#include "storage/relfilenode.h"
#include "storage/shmem.h"
#include "storage/smgr.h"
#include "storage/spin.h"
@@ -92,8 +93,9 @@ typedef struct buftag
{
Oid spcOid; /* tablespace oid. */
Oid dbOid; /* database oid. */
- Oid fileNode; /* relation file node. */
- ForkNumber forkNum;
+ uint32 fileNode_low; /* relation file node 32 lower bits */
+ uint32 fileNode_hi:24; /* relation file node 24 high bits */
+ uint32 forkNum:8;
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
@@ -101,7 +103,8 @@ typedef struct buftag
( \
(a).spcOid = InvalidOid, \
(a).dbOid = InvalidOid, \
- (a).fileNode = InvalidOid, \
+ (a).fileNode_low = 0, \
+ (a).fileNode_hi = 0, \
(a).forkNum = InvalidForkNumber, \
(a).blockNum = InvalidBlockNumber \
)
@@ -110,7 +113,7 @@ typedef struct buftag
( \
(a).spcOid = (xx_rnode).spcNode, \
(a).dbOid = (xx_rnode).dbNode, \
- (a).fileNode = (xx_rnode).relNode, \
+ BufTagSetFileNode(a, (xx_rnode).relNode), \
(a).forkNum = (xx_forkNum), \
(a).blockNum = (xx_blockNum) \
)
@@ -119,23 +122,33 @@ typedef struct buftag
( \
(a).spcOid == (b).spcOid && \
(a).dbOid == (b).dbOid && \
- (a).fileNode == (b).fileNode && \
+ (a).fileNode_low == (b).fileNode_low && \
+ (a).fileNode_hi == (b).fileNode_hi && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
)
+#define BufTagGetFileNode(a) \
+ ((((uint64) (a).fileNode_hi << 32) | ((uint32) (a).fileNode_low)))
+
+#define BufTagSetFileNode(a, node) \
+( \
+ (a).fileNode_hi = (node) >> 32, \
+ (a).fileNode_low = (node) & 0xffffffff \
+)
+
#define BuffTagGetRelFileNode(a, node) \
do { \
(node).spcNode = (a).spcOid; \
(node).dbNode = (a).dbOid; \
- (node).relNode = (a).fileNode; \
+ (node).relNode = BufTagGetFileNode(a); \
} while(0)
#define BuffTagRelFileNodeEquals(a, node) \
( \
(a).spcOid == (node).spcNode && \
(a).dbOid == (node).dbNode && \
- (a).fileNode == (node).relNode \
+ BufTagGetFileNode(a) == (node).relNode \
)
/*
@@ -312,7 +325,7 @@ extern BufferDesc *LocalBufferDescriptors;
typedef struct CkptSortItem
{
Oid tsId;
- Oid relNode;
+ RelNode relNode;
ForkNumber forkNum;
BlockNumber blockNum;
int buf_id;
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
index 4fdc606..cd2110c 100644
--- a/src/include/storage/relfilenode.h
+++ b/src/include/storage/relfilenode.h
@@ -34,8 +34,7 @@
* relNode identifies the specific relation. relNode corresponds to
* pg_class.relfilenode (NOT pg_class.oid, because we need to be able
* to assign new physical files to relations in some situations).
- * Notice that relNode is only unique within a database in a particular
- * tablespace.
+ * Notice that relNode is unique within a cluster.
*
* Note: spcNode must be GLOBALTABLESPACE_OID if and only if dbNode is
* zero. We support shared relations only in the "global" tablespace.
@@ -58,7 +57,7 @@ typedef struct RelFileNode
{
Oid spcNode; /* tablespace */
Oid dbNode; /* database */
- Oid relNode; /* relation */
+ RelNode relNode; /* relation */
} RelFileNode;
/*
@@ -75,6 +74,15 @@ typedef struct RelFileNodeBackend
BackendId backend;
} RelFileNodeBackend;
+#define SizeOfRelFileNodeBackend \
+ (offsetof(RelFileNodeBackend, backend) + sizeof(BackendId))
+
+/*
+ * Maximum value of the relfilenode. The relfilenode is 56 bits wide;
+ * for more details, refer to the comments atop BufferTag.
+ */
+#define MAX_RELFILENODE ((((uint64) 1) << 56) - 1)
+
#define RelFileNodeBackendIsTemp(rnode) \
((rnode).backend != InvalidBackendId)
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 6da1b22..a47ede3 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -526,7 +526,7 @@ typedef struct ViewOptions
*/
#define RelationIsMapped(relation) \
(RELKIND_HAS_STORAGE((relation)->rd_rel->relkind) && \
- ((relation)->rd_rel->relfilenode == InvalidOid))
+ ((relation)->rd_rel->relfilenode == InvalidRelNode))
/*
* RelationGetSmgr
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 84d6afe..5d13660 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -102,7 +102,7 @@ extern Relation RelationBuildLocalRelation(const char *relname,
TupleDesc tupDesc,
Oid relid,
Oid accessmtd,
- Oid relfilenode,
+ RelNode relfilenode,
Oid reltablespace,
bool shared_relation,
bool mapped_relation,
diff --git a/src/include/utils/relfilenodemap.h b/src/include/utils/relfilenodemap.h
index 77d8046..d324981 100644
--- a/src/include/utils/relfilenodemap.h
+++ b/src/include/utils/relfilenodemap.h
@@ -13,6 +13,6 @@
#ifndef RELFILENODEMAP_H
#define RELFILENODEMAP_H
-extern Oid RelidByRelfilenode(Oid reltablespace, Oid relfilenode);
+extern Oid RelidByRelfilenode(Oid reltablespace, RelNode relfilenode);
#endif /* RELFILENODEMAP_H */
diff --git a/src/include/utils/relmapper.h b/src/include/utils/relmapper.h
index 9fbb5a7..58234a8 100644
--- a/src/include/utils/relmapper.h
+++ b/src/include/utils/relmapper.h
@@ -35,11 +35,11 @@ typedef struct xl_relmap_update
#define MinSizeOfRelmapUpdate offsetof(xl_relmap_update, data)
-extern Oid RelationMapOidToFilenode(Oid relationId, bool shared);
+extern RelNode RelationMapOidToFilenode(Oid relationId, bool shared);
-extern Oid RelationMapFilenodeToOid(Oid relationId, bool shared);
+extern Oid RelationMapFilenodeToOid(RelNode relationId, bool shared);
-extern void RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
+extern void RelationMapUpdateMap(Oid relationId, RelNode fileNode, bool shared,
bool immediate);
extern void RelationMapRemoveMapping(Oid relationId);
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index 16e0475..58aeddb 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -2164,7 +2164,6 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
else 'OTHER'
end as storage,
@@ -2175,10 +2174,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+---------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | t | own | child 0 index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | t | own | child 1 index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | t | orig | child 0 index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | t | orig | child 1 index
at_partitioned_id_name_key | t | none | parent index
(6 rows)
@@ -2198,7 +2197,6 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
else 'OTHER'
end as storage,
@@ -2209,10 +2207,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+--------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | f | own | parent index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | f | own | parent index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | f | OTHER | parent index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | f | OTHER | parent index
at_partitioned_id_name_key | f | none | parent index
(6 rows)
@@ -2556,7 +2554,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql
index ac894c0..250e6cd 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -1478,7 +1478,6 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
else 'OTHER'
end as storage,
@@ -1499,7 +1498,6 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
else 'OTHER'
end as storage,
@@ -1638,7 +1636,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
--
1.8.3.1
On Mon, Feb 21, 2022 at 1:21 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Thu, Jan 6, 2022 at 1:43 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
2) GetNewRelFileNode() will no longer loop to check for file existence
and retry with another relfilenode.
Open issues: there are currently two open issues in the patch. 1) The
issue discussed above about removing the loop; in this patch the loop is
simply removed. 2) When upgrading from a previous version, we need to
advance nextrelfilenode past the relfilenodes we set for each object, in
order to avoid conflicts.
In this version I have fixed both of these issues. Thanks to Robert for
suggesting solutions to both problems in our offlist discussion. For the
first problem, we can flush the XLOG immediately: since we only write a
WAL record once per 64 allocated relfilenodes, the flush should not have
much impact on performance, and I have noted this in the comments. For
the second, whenever we assign a relfilenode during pg_upgrade we set
that relfilenode + 1 as the nextRelFileNode to be assigned, so that we
never generate a conflicting relfilenode.
The only part of the patch I do not like is that, before this patch, we
could access buftag->rnode directly. Now the buffer tag no longer embeds
a RelFileNode; instead it keeps the individual fields (dbOid, spcOid and
the split relNode), so whenever we need the whole relfilenode we have to
reassemble it. However, those changes are limited to just one or two
files, so maybe that is not too bad.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachments:
v5-0001-Preliminary-refactoring-for-supporting-larger-rel.patch (text/x-patch; charset=US-ASCII)
From c831c8efc1d165cd9d743e8e5462070e9936c35c Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilipkumar@localhost.localdomain>
Date: Tue, 8 Feb 2022 13:54:54 +0530
Subject: [PATCH v5 1/3] Preliminary refactoring for supporting larger
relfilenode
Currently, relfilenode is of type Oid and can wrap around. As part of
the larger patch set we are making it 64 bits wide to avoid wraparound,
which also simplifies a couple of other things, as explained in the
later patches.
This is a preliminary refactoring patch: in BufferTag, instead of
embedding the RelFileNode, we keep the tablespace Oid, database Oid and
the relfilenode directly, so that once relNode in RelFileNode becomes
64 bits wide the buffer tag's alignment padding will not change.
---
contrib/pg_buffercache/pg_buffercache_pages.c | 6 +-
contrib/pg_prewarm/autoprewarm.c | 6 +-
src/backend/storage/buffer/bufmgr.c | 113 ++++++++++++++++++--------
src/backend/storage/buffer/localbuf.c | 24 ++++--
src/include/storage/buf_internals.h | 32 ++++++--
5 files changed, 128 insertions(+), 53 deletions(-)
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 1bd579f..6af96c8 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -153,9 +153,9 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenode = bufHdr->tag.rnode.relNode;
- fctx->record[i].reltablespace = bufHdr->tag.rnode.spcNode;
- fctx->record[i].reldatabase = bufHdr->tag.rnode.dbNode;
+ fctx->record[i].relfilenode = bufHdr->tag.fileNode;
+ fctx->record[i].reltablespace = bufHdr->tag.spcOid;
+ fctx->record[i].reldatabase = bufHdr->tag.dbOid;
fctx->record[i].forknum = bufHdr->tag.forkNum;
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index 45e012a..f982330 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -616,9 +616,9 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
if (buf_state & BM_TAG_VALID &&
((buf_state & BM_PERMANENT) || dump_unlogged))
{
- block_info_array[num_blocks].database = bufHdr->tag.rnode.dbNode;
- block_info_array[num_blocks].tablespace = bufHdr->tag.rnode.spcNode;
- block_info_array[num_blocks].filenode = bufHdr->tag.rnode.relNode;
+ block_info_array[num_blocks].database = bufHdr->tag.dbOid;
+ block_info_array[num_blocks].tablespace = bufHdr->tag.spcOid;
+ block_info_array[num_blocks].filenode = bufHdr->tag.fileNode;
block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index f5459c6..5014fe6 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1640,7 +1640,7 @@ ReleaseAndReadBuffer(Buffer buffer,
{
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ BuffTagRelFileNodeEquals(bufHdr->tag, relation->rd_node) &&
bufHdr->tag.forkNum == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
@@ -1651,7 +1651,7 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetBufferDescriptor(buffer - 1);
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ BuffTagRelFileNodeEquals(bufHdr->tag, relation->rd_node) &&
bufHdr->tag.forkNum == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
@@ -1993,8 +1993,8 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
- item->tsId = bufHdr->tag.rnode.spcNode;
- item->relNode = bufHdr->tag.rnode.relNode;
+ item->tsId = bufHdr->tag.spcOid;
+ item->relNode = bufHdr->tag.fileNode;
item->forkNum = bufHdr->tag.forkNum;
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2686,6 +2686,7 @@ PrintBufferLeakWarning(Buffer buffer)
char *path;
BackendId backend;
uint32 buf_state;
+ RelFileNode rnode;
Assert(BufferIsValid(buffer));
if (BufferIsLocal(buffer))
@@ -2701,8 +2702,10 @@ PrintBufferLeakWarning(Buffer buffer)
backend = InvalidBackendId;
}
+ BuffTagGetRelFileNode(buf->tag, rnode);
+
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rnode, backend, buf->tag.forkNum);
+ path = relpathbackend(rnode, backend, buf->tag.forkNum);
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2781,7 +2784,7 @@ BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
bufHdr = GetBufferDescriptor(buffer - 1);
/* pinned, so OK to read tag without spinlock */
- *rnode = bufHdr->tag.rnode;
+ BuffTagGetRelFileNode(bufHdr->tag, *rnode);
*forknum = bufHdr->tag.forkNum;
*blknum = bufHdr->tag.blockNum;
}
@@ -2832,7 +2835,12 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rnode, InvalidBackendId);
+ {
+ RelFileNode rnode;
+
+ BuffTagGetRelFileNode(buf->tag, rnode);
+ reln = smgropen(rnode, InvalidBackendId);
+ }
TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
buf->tag.blockNum,
@@ -3135,14 +3143,14 @@ DropRelFileNodeBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
* We could check forkNum and blockNum as well as the rnode, but the
* incremental win from doing so seems small.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rnode.node))
+ if (!BuffTagRelFileNodeEquals(bufHdr->tag, rnode.node))
continue;
buf_state = LockBufHdr(bufHdr);
for (j = 0; j < nforks; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) &&
+ if (BuffTagRelFileNodeEquals(bufHdr->tag, rnode.node) &&
bufHdr->tag.forkNum == forkNum[j] &&
bufHdr->tag.blockNum >= firstDelBlock[j])
{
@@ -3295,7 +3303,7 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
for (j = 0; j < n; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, nodes[j]))
+ if (BuffTagRelFileNodeEquals(bufHdr->tag, nodes[j]))
{
rnode = &nodes[j];
break;
@@ -3304,7 +3312,10 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
}
else
{
- rnode = bsearch((const void *) &(bufHdr->tag.rnode),
+ RelFileNode node;
+
+ BuffTagGetRelFileNode(bufHdr->tag, node);
+ rnode = bsearch((const void *) &(node),
nodes, n, sizeof(RelFileNode),
rnode_comparator);
}
@@ -3314,7 +3325,7 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, (*rnode)))
+ if (BuffTagRelFileNodeEquals(bufHdr->tag, (*rnode)))
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3374,7 +3385,7 @@ FindAndDropRelFileNodeBuffers(RelFileNode rnode, ForkNumber forkNum,
*/
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
+ if (BuffTagRelFileNodeEquals(bufHdr->tag, rnode) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -3413,11 +3424,11 @@ DropDatabaseBuffers(Oid dbid)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid)
+ if (bufHdr->tag.dbOid == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3441,13 +3452,16 @@ PrintBufferDescs(void)
{
BufferDesc *buf = GetBufferDescriptor(i);
Buffer b = BufferDescriptorGetBuffer(buf);
+ RelFileNode rnode;
+
+ BuffTagGetRelFileNode(buf->tag, rnode);
/* theoretically we should lock the bufhdr here */
elog(LOG,
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathbackend(buf->tag.rnode, InvalidBackendId, buf->tag.forkNum),
+ relpathbackend(rnode, InvalidBackendId, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3467,12 +3481,16 @@ PrintPinnedBufs(void)
if (GetPrivateRefCount(b) > 0)
{
+ RelFileNode rnode;
+
+ BuffTagGetRelFileNode(buf->tag, rnode);
+
/* theoretically we should lock the bufhdr here */
elog(LOG,
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathperm(buf->tag.rnode, buf->tag.forkNum),
+ relpathperm(rnode, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3511,7 +3529,7 @@ FlushRelationBuffers(Relation rel)
uint32 buf_state;
bufHdr = GetLocalBufferDescriptor(i);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (BuffTagRelFileNodeEquals(bufHdr->tag, rel->rd_node) &&
((buf_state = pg_atomic_read_u32(&bufHdr->state)) &
(BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
@@ -3558,13 +3576,13 @@ FlushRelationBuffers(Relation rel)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node))
+ if (!BuffTagRelFileNodeEquals(bufHdr->tag, rel->rd_node))
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (BuffTagRelFileNodeEquals(bufHdr->tag, rel->rd_node) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3638,7 +3656,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
for (j = 0; j < nrels; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, srels[j].rnode))
+ if (BuffTagRelFileNodeEquals(bufHdr->tag, srels[j].rnode))
{
srelent = &srels[j];
break;
@@ -3648,7 +3666,10 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
}
else
{
- srelent = bsearch((const void *) &(bufHdr->tag.rnode),
+ RelFileNode rnode;
+
+ BuffTagGetRelFileNode(bufHdr->tag, rnode);
+ srelent = bsearch((const void *) &(rnode),
srels, nrels, sizeof(SMgrSortArray),
rnode_comparator);
}
@@ -3660,7 +3681,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, srelent->rnode) &&
+ if (BuffTagRelFileNodeEquals(bufHdr->tag, srelent->rnode) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3710,13 +3731,13 @@ FlushDatabaseBuffers(Oid dbid)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid &&
+ if (bufHdr->tag.dbOid == dbid &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3876,6 +3897,10 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
if (XLogHintBitIsNeeded() &&
(pg_atomic_read_u32(&bufHdr->state) & BM_PERMANENT))
{
+ RelFileNode rnode;
+
+ BuffTagGetRelFileNode(bufHdr->tag, rnode);
+
/*
* If we must not write WAL, due to a relfilenode-specific
* condition or being in recovery, don't dirty the page. We can
@@ -3884,8 +3909,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
*
* See src/backend/storage/page/README for longer discussion.
*/
- if (RecoveryInProgress() ||
- RelFileNodeSkippingWAL(bufHdr->tag.rnode))
+ if (RecoveryInProgress() || RelFileNodeSkippingWAL(rnode))
return;
/*
@@ -4491,8 +4515,10 @@ AbortBufferIO(void)
{
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
+ RelFileNode rnode;
- path = relpathperm(buf->tag.rnode, buf->tag.forkNum);
+ BuffTagGetRelFileNode(buf->tag, rnode);
+ path = relpathperm(rnode, buf->tag.forkNum);
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4516,7 +4542,11 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rnode, bufHdr->tag.forkNum);
+ char *path;
+ RelFileNode rnode;
+
+ BuffTagGetRelFileNode(bufHdr->tag, rnode);
+ path = relpathperm(rnode, bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4534,8 +4564,11 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
- char *path = relpathbackend(bufHdr->tag.rnode, MyBackendId,
- bufHdr->tag.forkNum);
+ char *path;
+ RelFileNode rnode;
+
+ BuffTagGetRelFileNode(bufHdr->tag, rnode);
+ path = relpathbackend(rnode, MyBackendId, bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4629,8 +4662,13 @@ static inline int
buffertag_comparator(const BufferTag *ba, const BufferTag *bb)
{
int ret;
+ RelFileNode rnodea;
+ RelFileNode rnodeb;
- ret = rnode_comparator(&ba->rnode, &bb->rnode);
+ BuffTagGetRelFileNode(*ba, rnodea);
+ BuffTagGetRelFileNode(*bb, rnodeb);
+
+ ret = rnode_comparator(&rnodea, &rnodeb);
if (ret != 0)
return ret;
@@ -4787,10 +4825,13 @@ IssuePendingWritebacks(WritebackContext *context)
SMgrRelation reln;
int ahead;
BufferTag tag;
+ RelFileNode currnode;
+
Size nblocks = 1;
cur = &context->pending_writebacks[i];
tag = cur->tag;
+ BuffTagGetRelFileNode(tag, currnode);
/*
* Peek ahead, into following writeback requests, to see if they can
@@ -4798,10 +4839,14 @@ IssuePendingWritebacks(WritebackContext *context)
*/
for (ahead = 0; i + ahead + 1 < context->nr_pending; ahead++)
{
+ RelFileNode nextrnode;
+
next = &context->pending_writebacks[i + ahead + 1];
+ BuffTagGetRelFileNode(next->tag, nextrnode);
+
/* different file, stop */
- if (!RelFileNodeEquals(cur->tag.rnode, next->tag.rnode) ||
+ if (!RelFileNodeEquals(currnode, nextrnode) ||
cur->tag.forkNum != next->tag.forkNum)
break;
@@ -4820,7 +4865,7 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rnode, InvalidBackendId);
+ reln = smgropen(currnode, InvalidBackendId);
smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
}
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index e71f95a..28446da 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -213,9 +213,12 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
{
SMgrRelation oreln;
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
+ RelFileNode rnode;
+
+ BuffTagGetRelFileNode(bufHdr->tag, rnode);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rnode, MyBackendId);
+ oreln = smgropen(rnode, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
@@ -337,16 +340,21 @@ DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
+ BuffTagRelFileNodeEquals(bufHdr->tag, rnode) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
+ {
+ RelFileNode rnode;
+
+ BuffTagGetRelFileNode(bufHdr->tag, rnode);
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
- bufHdr->tag.forkNum),
+ relpathbackend(rnode, MyBackendId, bufHdr->tag.forkNum),
LocalRefCount[i]);
+ }
+
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
hash_search(LocalBufHash, (void *) &bufHdr->tag,
@@ -383,13 +391,15 @@ DropRelFileNodeAllLocalBuffers(RelFileNode rnode)
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode))
+ BuffTagRelFileNodeEquals(bufHdr->tag, rnode))
{
+ RelFileNode rnode;
+
+ BuffTagGetRelFileNode(bufHdr->tag, rnode);
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
- bufHdr->tag.forkNum),
+ relpathbackend(rnode, MyBackendId, bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index b903d2b..0286d51 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,34 +90,54 @@
*/
typedef struct buftag
{
- RelFileNode rnode; /* physical relation identifier */
+ Oid spcOid; /* tablespace oid. */
+ Oid dbOid; /* database oid. */
+ Oid fileNode; /* relation file node. */
ForkNumber forkNum;
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
#define CLEAR_BUFFERTAG(a) \
( \
- (a).rnode.spcNode = InvalidOid, \
- (a).rnode.dbNode = InvalidOid, \
- (a).rnode.relNode = InvalidOid, \
+ (a).spcOid = InvalidOid, \
+ (a).dbOid = InvalidOid, \
+ (a).fileNode = InvalidOid, \
(a).forkNum = InvalidForkNumber, \
(a).blockNum = InvalidBlockNumber \
)
#define INIT_BUFFERTAG(a,xx_rnode,xx_forkNum,xx_blockNum) \
( \
- (a).rnode = (xx_rnode), \
+ (a).spcOid = (xx_rnode).spcNode, \
+ (a).dbOid = (xx_rnode).dbNode, \
+ (a).fileNode = (xx_rnode).relNode, \
(a).forkNum = (xx_forkNum), \
(a).blockNum = (xx_blockNum) \
)
#define BUFFERTAGS_EQUAL(a,b) \
( \
- RelFileNodeEquals((a).rnode, (b).rnode) && \
+ (a).spcOid == (b).spcOid && \
+ (a).dbOid == (b).dbOid && \
+ (a).fileNode == (b).fileNode && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
)
+#define BuffTagGetRelFileNode(a, node) \
+do { \
+ (node).spcNode = (a).spcOid; \
+ (node).dbNode = (a).dbOid; \
+ (node).relNode = (a).fileNode; \
+} while(0)
+
+#define BuffTagRelFileNodeEquals(a, node) \
+( \
+ (a).spcOid == (node).spcNode && \
+ (a).dbOid == (node).dbNode && \
+ (a).fileNode == (node).relNode \
+)
+
/*
* The shared buffer mapping table is partitioned to reduce contention.
* To determine which partition lock a given tag requires, compute the tag's
--
1.8.3.1
v5-0003-Don-t-delay-removing-Tombstone-file-until-next-ch.patch (text/x-patch; charset=US-ASCII)
From fbfe97f80c4ca4792566734bee9ea8bb561de9c6 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilipkumar@localhost.localdomain>
Date: Tue, 8 Feb 2022 18:38:08 +0530
Subject: [PATCH v5 3/3] Don't delay removing Tombstone file until next
checkpoint
Currently, we cannot remove an unused relfilenode until the next
checkpoint, because removing it immediately would risk reusing the
same relfilenode for two different relations within a single
checkpoint cycle due to Oid wraparound.
Now that the previous patches in this set have made relfilenode 56
bits wide and removed the risk of wraparound, we no longer need to
wait for the next checkpoint to remove an unused relation file; we
can clean it up at commit.
---
src/backend/access/transam/xlog.c | 5 --
src/backend/storage/smgr/md.c | 58 ++++++----------------
src/backend/storage/sync/sync.c | 101 --------------------------------------
src/include/storage/sync.h | 1 -
4 files changed, 14 insertions(+), 151 deletions(-)
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index a79eac1..9569706 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6613,11 +6613,6 @@ CreateCheckPoint(int flags)
END_CRIT_SECTION();
/*
- * Let smgr do post-checkpoint cleanup (eg, deleting old files).
- */
- SyncPostCheckpoint();
-
- /*
* Update the average distance between checkpoints if the prior checkpoint
* exists.
*/
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 879f647..7943b17 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -124,8 +124,6 @@ static void mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum,
static MdfdVec *mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior);
static void register_dirty_segment(SMgrRelation reln, ForkNumber forknum,
MdfdVec *seg);
-static void register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
- BlockNumber segno);
static void register_forget_request(RelFileNodeBackend rnode, ForkNumber forknum,
BlockNumber segno);
static void _fdvec_resize(SMgrRelation reln,
@@ -321,36 +319,25 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
/*
* Delete or truncate the first segment.
*/
- if (isRedo || forkNum != MAIN_FORKNUM || RelFileNodeBackendIsTemp(rnode))
+ if (!RelFileNodeBackendIsTemp(rnode))
{
- if (!RelFileNodeBackendIsTemp(rnode))
- {
- /* Prevent other backends' fds from holding on to the disk space */
- ret = do_truncate(path);
-
- /* Forget any pending sync requests for the first segment */
- register_forget_request(rnode, forkNum, 0 /* first seg */ );
- }
- else
- ret = 0;
+ /* Prevent other backends' fds from holding on to the disk space */
+ ret = do_truncate(path);
- /* Next unlink the file, unless it was already found to be missing */
- if (ret == 0 || errno != ENOENT)
- {
- ret = unlink(path);
- if (ret < 0 && errno != ENOENT)
- ereport(WARNING,
- (errcode_for_file_access(),
- errmsg("could not remove file \"%s\": %m", path)));
- }
+ /* Forget any pending sync requests for the first segment */
+ register_forget_request(rnode, forkNum, 0 /* first seg */ );
}
else
- {
- /* Prevent other backends' fds from holding on to the disk space */
- ret = do_truncate(path);
+ ret = 0;
- /* Register request to unlink first segment later */
- register_unlink_segment(rnode, forkNum, 0 /* first seg */ );
+ /* Next unlink the file, unless it was already found to be missing */
+ if (ret == 0 || errno != ENOENT)
+ {
+ ret = unlink(path);
+ if (ret < 0 && errno != ENOENT)
+ ereport(WARNING,
+ (errcode_for_file_access(),
+ errmsg("could not remove file \"%s\": %m", path)));
}
/*
@@ -1001,23 +988,6 @@ register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
}
/*
- * register_unlink_segment() -- Schedule a file to be deleted after next checkpoint
- */
-static void
-register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
- BlockNumber segno)
-{
- FileTag tag;
-
- INIT_MD_FILETAG(tag, rnode.node, forknum, segno);
-
- /* Should never be used with temp relations */
- Assert(!RelFileNodeBackendIsTemp(rnode));
-
- RegisterSyncRequest(&tag, SYNC_UNLINK_REQUEST, true /* retryOnError */ );
-}
-
-/*
* register_forget_request() -- forget any fsyncs for a relation fork's segment
*/
static void
diff --git a/src/backend/storage/sync/sync.c b/src/backend/storage/sync/sync.c
index e161d57..18cb350 100644
--- a/src/backend/storage/sync/sync.c
+++ b/src/backend/storage/sync/sync.c
@@ -189,92 +189,6 @@ SyncPreCheckpoint(void)
}
/*
- * SyncPostCheckpoint() -- Do post-checkpoint work
- *
- * Remove any lingering files that can now be safely removed.
- */
-void
-SyncPostCheckpoint(void)
-{
- int absorb_counter;
- ListCell *lc;
-
- absorb_counter = UNLINKS_PER_ABSORB;
- foreach(lc, pendingUnlinks)
- {
- PendingUnlinkEntry *entry = (PendingUnlinkEntry *) lfirst(lc);
- char path[MAXPGPATH];
-
- /* Skip over any canceled entries */
- if (entry->canceled)
- continue;
-
- /*
- * New entries are appended to the end, so if the entry is new we've
- * reached the end of old entries.
- *
- * Note: if just the right number of consecutive checkpoints fail, we
- * could be fooled here by cycle_ctr wraparound. However, the only
- * consequence is that we'd delay unlinking for one more checkpoint,
- * which is perfectly tolerable.
- */
- if (entry->cycle_ctr == checkpoint_cycle_ctr)
- break;
-
- /* Unlink the file */
- if (syncsw[entry->tag.handler].sync_unlinkfiletag(&entry->tag,
- path) < 0)
- {
- /*
- * There's a race condition, when the database is dropped at the
- * same time that we process the pending unlink requests. If the
- * DROP DATABASE deletes the file before we do, we will get ENOENT
- * here. rmtree() also has to ignore ENOENT errors, to deal with
- * the possibility that we delete the file first.
- */
- if (errno != ENOENT)
- ereport(WARNING,
- (errcode_for_file_access(),
- errmsg("could not remove file \"%s\": %m", path)));
- }
-
- /* Mark the list entry as canceled, just in case */
- entry->canceled = true;
-
- /*
- * As in ProcessSyncRequests, we don't want to stop absorbing fsync
- * requests for a long time when there are many deletions to be done.
- * We can safely call AbsorbSyncRequests() at this point in the loop.
- */
- if (--absorb_counter <= 0)
- {
- AbsorbSyncRequests();
- absorb_counter = UNLINKS_PER_ABSORB;
- }
- }
-
- /*
- * If we reached the end of the list, we can just remove the whole list
- * (remembering to pfree all the PendingUnlinkEntry objects). Otherwise,
- * we must keep the entries at or after "lc".
- */
- if (lc == NULL)
- {
- list_free_deep(pendingUnlinks);
- pendingUnlinks = NIL;
- }
- else
- {
- int ntodelete = list_cell_number(pendingUnlinks, lc);
-
- for (int i = 0; i < ntodelete; i++)
- pfree(list_nth(pendingUnlinks, i));
-
- pendingUnlinks = list_delete_first_n(pendingUnlinks, ntodelete);
- }
-}
-
-/*
* ProcessSyncRequests() -- Process queued fsync requests.
*/
void
@@ -520,21 +434,6 @@ RememberSyncRequest(const FileTag *ftag, SyncRequestType type)
entry->canceled = true;
}
}
- else if (type == SYNC_UNLINK_REQUEST)
- {
- /* Unlink request: put it in the linked list */
- MemoryContext oldcxt = MemoryContextSwitchTo(pendingOpsCxt);
- PendingUnlinkEntry *entry;
-
- entry = palloc(sizeof(PendingUnlinkEntry));
- entry->tag = *ftag;
- entry->cycle_ctr = checkpoint_cycle_ctr;
- entry->canceled = false;
-
- pendingUnlinks = lappend(pendingUnlinks, entry);
-
- MemoryContextSwitchTo(oldcxt);
- }
else
{
/* Normal case: enter a request to fsync this segment */
diff --git a/src/include/storage/sync.h b/src/include/storage/sync.h
index 9737e1e..4d67850 100644
--- a/src/include/storage/sync.h
+++ b/src/include/storage/sync.h
@@ -57,7 +57,6 @@ typedef struct FileTag
extern void InitSync(void);
extern void SyncPreCheckpoint(void);
-extern void SyncPostCheckpoint(void);
extern void ProcessSyncRequests(void);
extern void RememberSyncRequest(const FileTag *ftag, SyncRequestType type);
extern bool RegisterSyncRequest(const FileTag *ftag, SyncRequestType type,
--
1.8.3.1
v5-0002-Use-56-bits-for-relfilenode-to-avoid-wraparound.patch (text/x-patch; charset=UTF-8)
From d4c2a3b01dfe261068502c8ed1a0d49a65cd82fb Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilipkumar@localhost.localdomain>
Date: Wed, 16 Feb 2022 17:29:39 +0530
Subject: [PATCH v5 2/3] Use 56 bits for relfilenode to avoid wraparound
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
As part of this patch, we make relfilenode 64 bits wide. The problem
is that a full 64-bit relfilenode would increase the size of the
BufferTag, which would increase memory usage and might also hurt
performance. To avoid that, inside the buffer tag we use 8 bits for
the fork number and 56 bits for the relfilenode, instead of 64 bits
for the relfilenode alone.
---
.../pg_buffercache/pg_buffercache--1.0--1.1.sql | 2 +-
contrib/pg_buffercache/pg_buffercache--1.2.sql | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 8 +-
contrib/pg_prewarm/autoprewarm.c | 4 +-
doc/src/sgml/catalogs.sgml | 2 +-
doc/src/sgml/pgbuffercache.sgml | 2 +-
src/backend/access/common/syncscan.c | 2 +-
src/backend/access/gin/ginxlog.c | 2 +-
src/backend/access/rmgrdesc/gistdesc.c | 2 +-
src/backend/access/rmgrdesc/heapdesc.c | 2 +-
src/backend/access/rmgrdesc/nbtdesc.c | 2 +-
src/backend/access/rmgrdesc/seqdesc.c | 2 +-
src/backend/access/rmgrdesc/xlogdesc.c | 15 +++-
src/backend/access/transam/README | 4 +-
src/backend/access/transam/varsup.c | 98 +++++++++++++++++++++-
src/backend/access/transam/xlog.c | 48 +++++++++++
src/backend/access/transam/xlogrecovery.c | 6 +-
src/backend/access/transam/xlogutils.c | 8 +-
src/backend/catalog/catalog.c | 57 +++----------
src/backend/catalog/heap.c | 29 ++++---
src/backend/catalog/index.c | 21 +++--
src/backend/commands/cluster.c | 12 +--
src/backend/commands/indexcmds.c | 6 +-
src/backend/commands/sequence.c | 2 +-
src/backend/commands/tablecmds.c | 19 +++--
src/backend/nodes/outfuncs.c | 2 +-
src/backend/parser/parse_utilcmd.c | 4 +-
src/backend/replication/logical/decode.c | 1 +
src/backend/replication/logical/reorderbuffer.c | 2 +-
src/backend/storage/buffer/bufmgr.c | 2 +-
src/backend/storage/freespace/fsmpage.c | 2 +-
src/backend/storage/lmgr/lwlocknames.txt | 1 +
src/backend/storage/smgr/smgr.c | 2 +-
src/backend/utils/adt/dbsize.c | 16 ++--
src/backend/utils/adt/pg_upgrade_support.c | 17 ++--
src/backend/utils/cache/relcache.c | 18 ++--
src/backend/utils/cache/relfilenodemap.c | 8 +-
src/backend/utils/cache/relmapper.c | 15 ++--
src/backend/utils/misc/pg_controldata.c | 9 +-
src/bin/pg_checksums/pg_checksums.c | 6 +-
src/bin/pg_controldata/pg_controldata.c | 2 +
src/bin/pg_dump/pg_dump.c | 28 +++----
src/bin/pg_rewind/filemap.c | 8 +-
src/bin/pg_upgrade/info.c | 4 +-
src/bin/pg_upgrade/pg_upgrade.c | 6 +-
src/bin/pg_upgrade/pg_upgrade.h | 4 +-
src/bin/pg_upgrade/relfilenode.c | 4 +-
src/bin/pg_waldump/pg_waldump.c | 6 +-
src/common/relpath.c | 22 ++---
src/fe_utils/option_utils.c | 42 ++++++++++
src/include/access/transam.h | 5 ++
src/include/access/xlog.h | 1 +
src/include/catalog/binary_upgrade.h | 6 +-
src/include/catalog/catalog.h | 4 +-
src/include/catalog/heap.h | 2 +-
src/include/catalog/index.h | 2 +-
src/include/catalog/pg_class.h | 10 +--
src/include/catalog/pg_control.h | 2 +
src/include/catalog/pg_proc.dat | 10 +--
src/include/commands/tablecmds.h | 2 +-
src/include/common/relpath.h | 2 +-
src/include/fe_utils/option_utils.h | 3 +
src/include/nodes/parsenodes.h | 2 +-
src/include/postgres_ext.h | 15 ++++
src/include/storage/buf_internals.h | 29 +++++--
src/include/storage/relfilenode.h | 14 +++-
src/include/utils/rel.h | 2 +-
src/include/utils/relcache.h | 2 +-
src/include/utils/relfilenodemap.h | 2 +-
src/include/utils/relmapper.h | 6 +-
src/test/regress/expected/alter_table.out | 20 ++---
src/test/regress/sql/alter_table.sql | 4 +-
72 files changed, 472 insertions(+), 261 deletions(-)
diff --git a/contrib/pg_buffercache/pg_buffercache--1.0--1.1.sql b/contrib/pg_buffercache/pg_buffercache--1.0--1.1.sql
index 54d02f5..5e93238 100644
--- a/contrib/pg_buffercache/pg_buffercache--1.0--1.1.sql
+++ b/contrib/pg_buffercache/pg_buffercache--1.0--1.1.sql
@@ -6,6 +6,6 @@
-- Upgrade view to 1.1. format
CREATE OR REPLACE VIEW pg_buffercache AS
SELECT P.* FROM pg_buffercache_pages() AS P
- (bufferid integer, relfilenode oid, reltablespace oid, reldatabase oid,
+ (bufferid integer, relfilenode int8, reltablespace oid, reldatabase oid,
relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
pinning_backends int4);
diff --git a/contrib/pg_buffercache/pg_buffercache--1.2.sql b/contrib/pg_buffercache/pg_buffercache--1.2.sql
index 6ee5d84..f52ddcd 100644
--- a/contrib/pg_buffercache/pg_buffercache--1.2.sql
+++ b/contrib/pg_buffercache/pg_buffercache--1.2.sql
@@ -12,7 +12,7 @@ LANGUAGE C PARALLEL SAFE;
-- Create a view for convenient access.
CREATE VIEW pg_buffercache AS
SELECT P.* FROM pg_buffercache_pages() AS P
- (bufferid integer, relfilenode oid, reltablespace oid, reldatabase oid,
+ (bufferid integer, relfilenode int8, reltablespace oid, reldatabase oid,
relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
pinning_backends int4);
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 6af96c8..94d2570 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -26,7 +26,7 @@ PG_MODULE_MAGIC;
typedef struct
{
uint32 bufferid;
- Oid relfilenode;
+ RelNode relfilenode;
Oid reltablespace;
Oid reldatabase;
ForkNumber forknum;
@@ -103,7 +103,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
TupleDescInitEntry(tupledesc, (AttrNumber) 1, "bufferid",
INT4OID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 2, "relfilenode",
- OIDOID, -1, 0);
+ INT8OID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 3, "reltablespace",
OIDOID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 4, "reldatabase",
@@ -153,7 +153,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenode = bufHdr->tag.fileNode;
+ fctx->record[i].relfilenode = BufTagGetFileNode(bufHdr->tag);
fctx->record[i].reltablespace = bufHdr->tag.spcOid;
fctx->record[i].reldatabase = bufHdr->tag.dbOid;
fctx->record[i].forknum = bufHdr->tag.forkNum;
@@ -209,7 +209,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
}
else
{
- values[1] = ObjectIdGetDatum(fctx->record[i].relfilenode);
+ values[1] = Int8GetDatum(fctx->record[i].relfilenode);
nulls[1] = false;
values[2] = ObjectIdGetDatum(fctx->record[i].reltablespace);
nulls[2] = false;
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index f982330..28b1757 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -62,7 +62,7 @@ typedef struct BlockInfoRecord
{
Oid database;
Oid tablespace;
- Oid filenode;
+ RelNode filenode;
ForkNumber forknum;
BlockNumber blocknum;
} BlockInfoRecord;
@@ -618,7 +618,7 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
{
block_info_array[num_blocks].database = bufHdr->tag.dbOid;
block_info_array[num_blocks].tablespace = bufHdr->tag.spcOid;
- block_info_array[num_blocks].filenode = bufHdr->tag.fileNode;
+ block_info_array[num_blocks].filenode = BufTagGetFileNode(bufHdr->tag);
block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 83987a9..c6db8ff 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1960,7 +1960,7 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
</para>
<para>
Name of the on-disk file of this relation; zero means this
diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
index e68d159..631cd2f 100644
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@@ -62,7 +62,7 @@
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
(references <link linkend="catalog-pg-class"><structname>pg_class</structname></link>.<structfield>relfilenode</structfield>)
</para>
<para>
diff --git a/src/backend/access/common/syncscan.c b/src/backend/access/common/syncscan.c
index d5b16c5..aa71523 100644
--- a/src/backend/access/common/syncscan.c
+++ b/src/backend/access/common/syncscan.c
@@ -161,7 +161,7 @@ SyncScanShmemInit(void)
*/
item->location.relfilenode.spcNode = InvalidOid;
item->location.relfilenode.dbNode = InvalidOid;
- item->location.relfilenode.relNode = InvalidOid;
+ item->location.relfilenode.relNode = InvalidRelNode;
item->location.location = InvalidBlockNumber;
item->prev = (i > 0) ?
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 87e8366..17b77b9 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -100,7 +100,7 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
BlockNumber blknum;
BufferGetTag(buffer, &node, &forknum, &blknum);
- elog(ERROR, "failed to add item to index page in %u/%u/%u",
+ elog(ERROR, "failed to add item to index page in %u/%u/" INT64_FORMAT,
node.spcNode, node.dbNode, node.relNode);
}
}
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index 9cab4fa..203685a 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -26,7 +26,7 @@ out_gistxlogPageUpdate(StringInfo buf, gistxlogPageUpdate *xlrec)
static void
out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec)
{
- appendStringInfo(buf, "rel %u/%u/%u; blk %u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; blk %u; latestRemovedXid %u:%u",
xlrec->node.spcNode, xlrec->node.dbNode,
xlrec->node.relNode, xlrec->block,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 6238085..57af152 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -169,7 +169,7 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
{
xl_heap_new_cid *xlrec = (xl_heap_new_cid *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; tid %u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; tid %u/%u",
xlrec->target_node.spcNode,
xlrec->target_node.dbNode,
xlrec->target_node.relNode,
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index dfbbf4e..8c44ebd 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -100,7 +100,7 @@ btree_desc(StringInfo buf, XLogReaderState *record)
{
xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; latestRemovedXid %u:%u",
xlrec->node.spcNode, xlrec->node.dbNode,
xlrec->node.relNode,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/seqdesc.c b/src/backend/access/rmgrdesc/seqdesc.c
index d9b1e60..5385ded 100644
--- a/src/backend/access/rmgrdesc/seqdesc.c
+++ b/src/backend/access/rmgrdesc/seqdesc.c
@@ -25,7 +25,7 @@ seq_desc(StringInfo buf, XLogReaderState *record)
xl_seq_rec *xlrec = (xl_seq_rec *) rec;
if (info == XLOG_SEQ_LOG)
- appendStringInfo(buf, "rel %u/%u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT,
xlrec->node.spcNode, xlrec->node.dbNode,
xlrec->node.relNode);
}
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index e7452af..9066566 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -45,8 +45,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
CheckPoint *checkpoint = (CheckPoint *) rec;
appendStringInfo(buf, "redo %X/%X; "
- "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; "
- "oldest xid %u in DB %u; oldest multi %u in DB %u; "
+ "tli %u; prev tli %u; fpw %s; xid %u:%u; relfilenode " INT64_FORMAT "; oid %u; "
+ "multi %u; offset %u; oldest xid %u in DB %u; oldest multi %u in DB %u; "
"oldest/newest commit timestamp xid: %u/%u; "
"oldest running xid %u; %s",
LSN_FORMAT_ARGS(checkpoint->redo),
@@ -55,6 +55,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->fullPageWrites ? "true" : "false",
EpochFromFullTransactionId(checkpoint->nextXid),
XidFromFullTransactionId(checkpoint->nextXid),
+ checkpoint->nextRelNode,
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
@@ -74,6 +75,13 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
memcpy(&nextOid, rec, sizeof(Oid));
appendStringInfo(buf, "%u", nextOid);
}
+ else if (info == XLOG_NEXT_RELFILENODE)
+ {
+ RelNode nextRelFilenode;
+
+ memcpy(&nextRelFilenode, rec, sizeof(RelNode));
+ appendStringInfo(buf, INT64_FORMAT, nextRelFilenode);
+ }
else if (info == XLOG_RESTORE_POINT)
{
xl_restore_point *xlrec = (xl_restore_point *) rec;
@@ -169,6 +177,9 @@ xlog_identify(uint8 info)
case XLOG_NEXTOID:
id = "NEXTOID";
break;
+ case XLOG_NEXT_RELFILENODE:
+ id = "NEXT_RELFILENODE";
+ break;
case XLOG_SWITCH:
id = "SWITCH";
break;
diff --git a/src/backend/access/transam/README b/src/backend/access/transam/README
index 1edc818..5c81f6c 100644
--- a/src/backend/access/transam/README
+++ b/src/backend/access/transam/README
@@ -692,8 +692,8 @@ by having database restart search for files that don't have any committed
entry in pg_class, but that currently isn't done because of the possibility
of deleting data that is useful for forensic analysis of the crash.
Orphan files are harmless --- at worst they waste a bit of disk space ---
-because we check for on-disk collisions when allocating new relfilenode
-OIDs. So cleaning up isn't really necessary.
+because relfilenodes are allocated from a 56-bit counter, so in practice no
+collisions should ever occur. So cleaning up isn't really necessary.
3. Deleting a table, which requires an unlink() that could fail.
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 748120a..77d9a5b 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -30,6 +30,9 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+/* Number of RelFileNodes to prefetch (preallocate) per XLOG write */
+#define VAR_RFN_PREFETCH 8192
+
/* pointer to "variable cache" in shared memory (set up by shmem.c) */
VariableCache ShmemVariableCache = NULL;
@@ -521,8 +524,7 @@ ForceTransactionIdLimitUpdate(void)
* wide, counter wraparound will occur eventually, and therefore it is unwise
* to assume they are unique unless precautions are taken to make them so.
* Hence, this routine should generally not be used directly. The only direct
- * callers should be GetNewOidWithIndex() and GetNewRelFileNode() in
- * catalog/catalog.c.
+ * callers should be GetNewOidWithIndex() in catalog/catalog.c.
*/
Oid
GetNewObjectId(void)
@@ -613,6 +615,98 @@ SetNextObjectId(Oid nextOid)
}
/*
+ * GetNewRelNode
+ *
+ * Similar to GetNewObjectId, but generates a new RelNode instead of a new Oid.
+ */
+RelNode
+GetNewRelNode(void)
+{
+ RelNode result;
+
+ /* Safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot assign RelNode during recovery");
+
+ LWLockAcquire(RelNodeGenLock, LW_EXCLUSIVE);
+
+ /*
+ * Handle wraparound of the relnode counter.
+ *
+ * XXX In practice the relnode counter is 56 bits wide, so wraparound should
+ * never actually occur.
+ */
+ if (ShmemVariableCache->nextRelNode > MAX_RELFILENODE)
+ {
+ ShmemVariableCache->nextRelNode = FirstNormalRelNode;
+ ShmemVariableCache->relnodecount = 0;
+ }
+
+ /* If we have exhausted the pre-logged range of RelNodes, we must log more */
+ if (ShmemVariableCache->relnodecount == 0)
+ {
+ XLogPutNextRelFileNode(ShmemVariableCache->nextRelNode +
+ VAR_RFN_PREFETCH);
+
+ ShmemVariableCache->relnodecount = VAR_RFN_PREFETCH;
+ }
+
+ result = ShmemVariableCache->nextRelNode;
+ (ShmemVariableCache->nextRelNode)++;
+ (ShmemVariableCache->relnodecount)--;
+
+ LWLockRelease(RelNodeGenLock);
+
+ return result;
+}
+
+/*
+ * SetNextRelNode
+ *
+ * This may only be called during pg_upgrade; it advances the RelNode counter
+ * to the specified value.
+ */
+void
+SetNextRelNode(RelNode relnode)
+{
+ int relnodecount;
+
+ /* Safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot forward RelNode during recovery");
+
+ LWLockAcquire(RelNodeGenLock, LW_EXCLUSIVE);
+
+ /*
+ * If the current value of nextRelNode is already higher than the requested
+ * value then there is nothing to do. This is possible because during an
+ * upgrade the objects' relfilenodes can arrive in any order.
+ */
+ if (relnode <= ShmemVariableCache->nextRelNode)
+ {
+ LWLockRelease(RelNodeGenLock);
+ return;
+ }
+
+ /*
+ * If advancing to the new relnode would exhaust the already-logged range
+ * then WAL-log a new range. Otherwise, just adjust the relnodecount
+ * counter.
+ */
+ relnodecount = relnode - ShmemVariableCache->nextRelNode;
+ if (ShmemVariableCache->relnodecount <= relnodecount)
+ {
+ XLogPutNextRelFileNode(relnode + VAR_RFN_PREFETCH);
+ ShmemVariableCache->relnodecount = VAR_RFN_PREFETCH;
+ }
+ else
+ ShmemVariableCache->relnodecount -= relnodecount;
+
+ ShmemVariableCache->nextRelNode = relnode;
+ LWLockRelease(RelNodeGenLock);
+}
+
+/*
* StopGeneratingPinnedObjectIds
*
* This is called once during initdb to force the OID counter up to
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 0d2bd7a..a79eac1 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4547,6 +4547,7 @@ BootStrapXLOG(void)
checkPoint.nextXid =
FullTransactionIdFromEpochAndXid(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstGenbkiObjectId;
+ checkPoint.nextRelNode = FirstNormalRelNode;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
checkPoint.oldestXid = FirstNormalTransactionId;
@@ -4560,7 +4561,9 @@ BootStrapXLOG(void)
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelNode = checkPoint.nextRelNode;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->relnodecount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -5023,7 +5026,9 @@ StartupXLOG(void)
/* initialize shared memory variables from the checkpoint record */
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelNode = checkPoint.nextRelNode;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->relnodecount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -6456,6 +6461,12 @@ CreateCheckPoint(int flags)
checkPoint.nextOid += ShmemVariableCache->oidCount;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelNodeGenLock, LW_SHARED);
+ checkPoint.nextRelNode = ShmemVariableCache->nextRelNode;
+ if (!shutdown)
+ checkPoint.nextRelNode += ShmemVariableCache->relnodecount;
+ LWLockRelease(RelNodeGenLock);
+
MultiXactGetCheckptMulti(shutdown,
&checkPoint.nextMulti,
&checkPoint.nextMultiOffset,
@@ -7310,6 +7321,29 @@ XLogPutNextOid(Oid nextOid)
}
/*
+ * Similar to XLogPutNextOid, but writes a NEXT_RELFILENODE log record
+ * instead of a NEXTOID record.
+ */
+void
+XLogPutNextRelFileNode(RelNode nextrelnode)
+{
+ XLogRecPtr recptr;
+
+ XLogBeginInsert();
+ XLogRegisterData((char *) (&nextrelnode), sizeof(RelNode));
+ recptr = XLogInsert(RM_XLOG_ID, XLOG_NEXT_RELFILENODE);
+
+ /*
+ * Flush the xlog record to disk before returning, to protect against file
+ * system changes reaching the disk before the XLOG_NEXT_RELFILENODE
+ * record.
+ *
+ * This should not hurt performance, because we WAL-log a new range only
+ * once per VAR_RFN_PREFETCH RelNode assignments.
+ */
+ XLogFlush(recptr);
+}
+
+/*
* Write an XLOG SWITCH record.
*
* Here we just blindly issue an XLogInsert request for the record.
@@ -7524,6 +7558,16 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
}
+ else if (info == XLOG_NEXT_RELFILENODE)
+ {
+ RelNode nextRelNode;
+
+ memcpy(&nextRelNode, XLogRecGetData(record), sizeof(RelNode));
+ LWLockAcquire(RelNodeGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelNode = nextRelNode;
+ ShmemVariableCache->relnodecount = 0;
+ LWLockRelease(RelNodeGenLock);
+ }
else if (info == XLOG_CHECKPOINT_SHUTDOWN)
{
CheckPoint checkPoint;
@@ -7538,6 +7582,10 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelNodeGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelNode = checkPoint.nextRelNode;
+ ShmemVariableCache->relnodecount = 0;
+ LWLockRelease(RelNodeGenLock);
MultiXactSetNextMXact(checkPoint.nextMulti,
checkPoint.nextMultiOffset);
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index f9f2126..d9e937c 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2150,13 +2150,13 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blk);
if (forknum != MAIN_FORKNUM)
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, fork %u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", fork %u, blk %u",
block_id,
rnode.spcNode, rnode.dbNode, rnode.relNode,
forknum,
blk);
else
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", blk %u",
block_id,
rnode.spcNode, rnode.dbNode, rnode.relNode,
blk);
@@ -2349,7 +2349,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
if (memcmp(replay_image_masked, primary_image_masked, BLCKSZ) != 0)
{
elog(FATAL,
- "inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
+ "inconsistent page found, rel %u/%u/" INT64_FORMAT ", forknum %u, blkno %u",
rnode.spcNode, rnode.dbNode, rnode.relNode,
forknum, blkno);
}
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 54d5f20..f9f0aa8 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -593,17 +593,17 @@ CreateFakeRelcacheEntry(RelFileNode rnode)
rel->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
/* We don't know the name of the relation; use relfilenode instead */
- sprintf(RelationGetRelationName(rel), "%u", rnode.relNode);
+ sprintf(RelationGetRelationName(rel), INT64_FORMAT, rnode.relNode);
/*
* We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNode may be
- * different from the relation's OID. It shouldn't really matter though.
+ * relation. Note that we set relId to FirstNormalObjectId, which is
+ * completely bogus. It shouldn't really matter though.
* In recovery, we are running by ourselves and can't have any lock
* conflicts. While syncing, we already hold AccessExclusiveLock.
*/
rel->rd_lockInfo.lockRelId.dbId = rnode.dbNode;
- rel->rd_lockInfo.lockRelId.relId = rnode.relNode;
+ rel->rd_lockInfo.lockRelId.relId = FirstNormalObjectId;
rel->rd_smgr = NULL;
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index dfd5fb6..9bc1809 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -472,26 +472,16 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
/*
* GetNewRelFileNode
- * Generate a new relfilenode number that is unique within the
- * database of the given tablespace.
+ * Generate a new relfilenode number.
*
- * If the relfilenode will also be used as the relation's OID, pass the
- * opened pg_class catalog, and this routine will guarantee that the result
- * is also an unused OID within pg_class. If the result is to be used only
- * as a relfilenode for an existing relation, pass NULL for pg_class.
- *
- * As with GetNewOidWithIndex(), there is some theoretical risk of a race
- * condition, but it doesn't seem worth worrying about.
- *
- * Note: we don't support using this in bootstrap mode. All relations
- * created by bootstrap have preassigned OIDs, so there's no need.
+ * We use 56 bits for the relfilenode, so we expect it to be unique across
+ * the whole cluster; if a file with that name already exists, report an error.
*/
-Oid
-GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
+RelNode
+GetNewRelFileNode(Oid reltablespace, char relpersistence)
{
RelFileNodeBackend rnode;
char *rpath;
- bool collides;
BackendId backend;
/*
@@ -525,40 +515,13 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
* are properly detected.
*/
rnode.backend = backend;
+ rnode.node.relNode = GetNewRelNode();
- do
- {
- CHECK_FOR_INTERRUPTS();
-
- /* Generate the OID */
- if (pg_class)
- rnode.node.relNode = GetNewOidWithIndex(pg_class, ClassOidIndexId,
- Anum_pg_class_oid);
- else
- rnode.node.relNode = GetNewObjectId();
-
- /* Check for existing file of same name */
- rpath = relpath(rnode, MAIN_FORKNUM);
+ /* Check for existing file of same name */
+ rpath = relpath(rnode, MAIN_FORKNUM);
- if (access(rpath, F_OK) == 0)
- {
- /* definite collision */
- collides = true;
- }
- else
- {
- /*
- * Here we have a little bit of a dilemma: if errno is something
- * other than ENOENT, should we declare a collision and loop? In
- * practice it seems best to go ahead regardless of the errno. If
- * there is a colliding file we will get an smgr failure when we
- * attempt to create the new relation file.
- */
- collides = false;
- }
-
- pfree(rpath);
- } while (collides);
+ if (access(rpath, F_OK) == 0)
+ elog(ERROR, "new relfilenode file already exists: \"%s\"", rpath);
+
+ pfree(rpath);
return rnode.node.relNode;
}
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 7e99de8..67f3225 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -91,9 +91,9 @@
/* Potentially set by pg_upgrade_support functions */
Oid binary_upgrade_next_heap_pg_class_oid = InvalidOid;
-Oid binary_upgrade_next_heap_pg_class_relfilenode = InvalidOid;
+RelNode binary_upgrade_next_heap_pg_class_relfilenode = InvalidRelNode;
Oid binary_upgrade_next_toast_pg_class_oid = InvalidOid;
-Oid binary_upgrade_next_toast_pg_class_relfilenode = InvalidOid;
+RelNode binary_upgrade_next_toast_pg_class_relfilenode = InvalidRelNode;
static void AddNewRelationTuple(Relation pg_class_desc,
Relation new_rel_desc,
@@ -303,7 +303,7 @@ heap_create(const char *relname,
Oid relnamespace,
Oid reltablespace,
Oid relid,
- Oid relfilenode,
+ RelNode relfilenode,
Oid accessmtd,
TupleDesc tupDesc,
char relkind,
@@ -358,8 +358,8 @@ heap_create(const char *relname,
- * If relfilenode is unspecified by the caller then create storage
- * with oid same as relid.
+ * If relfilenode is unspecified by the caller then generate a new
+ * relfilenode and create storage with it.
*/
- if (!OidIsValid(relfilenode))
- relfilenode = relid;
+ if (!RelNodeIsValid(relfilenode))
+ relfilenode = GetNewRelFileNode(reltablespace, relpersistence);
}
/*
@@ -912,7 +912,7 @@ InsertPgClassTuple(Relation pg_class_desc,
values[Anum_pg_class_reloftype - 1] = ObjectIdGetDatum(rd_rel->reloftype);
values[Anum_pg_class_relowner - 1] = ObjectIdGetDatum(rd_rel->relowner);
values[Anum_pg_class_relam - 1] = ObjectIdGetDatum(rd_rel->relam);
- values[Anum_pg_class_relfilenode - 1] = ObjectIdGetDatum(rd_rel->relfilenode);
+ values[Anum_pg_class_relfilenode - 1] = Int64GetDatum(rd_rel->relfilenode);
values[Anum_pg_class_reltablespace - 1] = ObjectIdGetDatum(rd_rel->reltablespace);
values[Anum_pg_class_relpages - 1] = Int32GetDatum(rd_rel->relpages);
values[Anum_pg_class_reltuples - 1] = Float4GetDatum(rd_rel->reltuples);
@@ -1129,7 +1129,7 @@ heap_create_with_catalog(const char *relname,
Oid new_type_oid;
/* By default set to InvalidOid unless overridden by binary-upgrade */
- Oid relfilenode = InvalidOid;
+ RelNode relfilenode = InvalidRelNode;
TransactionId relfrozenxid;
MultiXactId relminmxid;
@@ -1187,8 +1187,7 @@ heap_create_with_catalog(const char *relname,
/*
* Allocate an OID for the relation, unless we were told what to use.
*
- * The OID will be the relfilenode as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
+ * Make sure that the Oid doesn't collide with other pg_class OIDs.
*/
if (!OidIsValid(relid))
{
@@ -1210,13 +1209,13 @@ heap_create_with_catalog(const char *relname,
relid = binary_upgrade_next_toast_pg_class_oid;
binary_upgrade_next_toast_pg_class_oid = InvalidOid;
- if (!OidIsValid(binary_upgrade_next_toast_pg_class_relfilenode))
+ if (!RelNodeIsValid(binary_upgrade_next_toast_pg_class_relfilenode))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("toast relfilenode value not set when in binary upgrade mode")));
relfilenode = binary_upgrade_next_toast_pg_class_relfilenode;
- binary_upgrade_next_toast_pg_class_relfilenode = InvalidOid;
+ binary_upgrade_next_toast_pg_class_relfilenode = InvalidRelNode;
}
}
else
@@ -1231,20 +1230,20 @@ heap_create_with_catalog(const char *relname,
if (RELKIND_HAS_STORAGE(relkind))
{
- if (!OidIsValid(binary_upgrade_next_heap_pg_class_relfilenode))
+ if (!RelNodeIsValid(binary_upgrade_next_heap_pg_class_relfilenode))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("relfilenode value not set when in binary upgrade mode")));
relfilenode = binary_upgrade_next_heap_pg_class_relfilenode;
- binary_upgrade_next_heap_pg_class_relfilenode = InvalidOid;
+ binary_upgrade_next_heap_pg_class_relfilenode = InvalidRelNode;
}
}
}
if (!OidIsValid(relid))
- relid = GetNewRelFileNode(reltablespace, pg_class_desc,
- relpersistence);
+ relid = GetNewOidWithIndex(pg_class_desc, ClassOidIndexId,
+ Anum_pg_class_oid);
}
/*
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 5e3fc2b..7a19f45 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -87,7 +87,7 @@
/* Potentially set by pg_upgrade_support functions */
Oid binary_upgrade_next_index_pg_class_oid = InvalidOid;
-Oid binary_upgrade_next_index_pg_class_relfilenode = InvalidOid;
+RelNode binary_upgrade_next_index_pg_class_relfilenode = InvalidRelNode;
/*
* Pointer-free representation of variables used when reindexing system
@@ -662,7 +662,7 @@ UpdateIndexRelation(Oid indexoid,
* parent index; otherwise InvalidOid.
* parentConstraintId: if creating a constraint on a partition, the OID
* of the constraint in the parent; otherwise InvalidOid.
- * relFileNode: normally, pass InvalidOid to get new storage. May be
+ * relFileNode: normally, pass InvalidRelNode to get new storage. May be
* nonzero to attach an existing valid build.
* indexInfo: same info executor uses to insert into the index
* indexColNames: column names to use for index (List of char *)
@@ -703,7 +703,7 @@ index_create(Relation heapRelation,
Oid indexRelationId,
Oid parentIndexRelid,
Oid parentConstraintId,
- Oid relFileNode,
+ RelNode relFileNode,
IndexInfo *indexInfo,
List *indexColNames,
Oid accessMethodObjectId,
@@ -735,7 +735,7 @@ index_create(Relation heapRelation,
char relkind;
TransactionId relfrozenxid;
MultiXactId relminmxid;
- bool create_storage = !OidIsValid(relFileNode);
+ bool create_storage = !RelNodeIsValid(relFileNode);
/* constraint flags can only be set when a constraint is requested */
Assert((constr_flags == 0) ||
@@ -902,8 +902,7 @@ index_create(Relation heapRelation,
/*
* Allocate an OID for the index, unless we were told what to use.
*
- * The OID will be the relfilenode as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
+ * Make sure it doesn't collide with other pg_class OIDs.
*/
if (!OidIsValid(indexRelationId))
{
@@ -920,12 +919,12 @@ index_create(Relation heapRelation,
/* Override the index relfilenode */
if ((relkind == RELKIND_INDEX) &&
- (!OidIsValid(binary_upgrade_next_index_pg_class_relfilenode)))
+ (!RelNodeIsValid(binary_upgrade_next_index_pg_class_relfilenode)))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("index relfilenode value not set when in binary upgrade mode")));
relFileNode = binary_upgrade_next_index_pg_class_relfilenode;
- binary_upgrade_next_index_pg_class_relfilenode = InvalidOid;
+ binary_upgrade_next_index_pg_class_relfilenode = InvalidRelNode;
/*
* Note that we want create_storage = true for binary upgrade.
@@ -936,8 +935,8 @@ index_create(Relation heapRelation,
}
else
{
- indexRelationId =
- GetNewRelFileNode(tableSpaceId, pg_class, relpersistence);
+ indexRelationId = GetNewOidWithIndex(pg_class, ClassOidIndexId,
+ Anum_pg_class_oid);
}
}
@@ -1408,7 +1407,7 @@ index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId,
InvalidOid, /* indexRelationId */
InvalidOid, /* parentIndexRelid */
InvalidOid, /* parentConstraintId */
- InvalidOid, /* relFileNode */
+ InvalidRelNode, /* relFileNode */
newInfo,
indexColNames,
indexRelation->rd_rel->relam,
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 02a7e94..0cee4c6 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -1005,9 +1005,9 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
reltup2;
Form_pg_class relform1,
relform2;
- Oid relfilenode1,
+ RelNode relfilenode1,
relfilenode2;
- Oid swaptemp;
+ RelNode swaptemp;
char swptmpchr;
/* We need writable copies of both pg_class tuples. */
@@ -1026,7 +1026,7 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
relfilenode1 = relform1->relfilenode;
relfilenode2 = relform2->relfilenode;
- if (OidIsValid(relfilenode1) && OidIsValid(relfilenode2))
+ if (RelNodeIsValid(relfilenode1) && RelNodeIsValid(relfilenode2))
{
/*
* Normal non-mapped relations: swap relfilenodes, reltablespaces,
@@ -1064,7 +1064,7 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
* Mapped-relation case. Here we have to swap the relation mappings
* instead of modifying the pg_class columns. Both must be mapped.
*/
- if (OidIsValid(relfilenode1) || OidIsValid(relfilenode2))
+ if (RelNodeIsValid(relfilenode1) || RelNodeIsValid(relfilenode2))
elog(ERROR, "cannot swap mapped relation \"%s\" with non-mapped relation",
NameStr(relform1->relname));
@@ -1093,11 +1093,11 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
* Fetch the mappings --- shouldn't fail, but be paranoid
*/
relfilenode1 = RelationMapOidToFilenode(r1, relform1->relisshared);
- if (!OidIsValid(relfilenode1))
+ if (!RelNodeIsValid(relfilenode1))
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
NameStr(relform1->relname), r1);
relfilenode2 = RelationMapOidToFilenode(r2, relform2->relisshared);
- if (!OidIsValid(relfilenode2))
+ if (!RelNodeIsValid(relfilenode2))
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
NameStr(relform2->relname), r2);
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index cd30f15..fe2cfd0 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -1086,7 +1086,7 @@ DefineIndex(Oid relationId,
* A valid stmt->oldNode implies that we already have a built form of the
* index. The caller should also decline any index build.
*/
- Assert(!OidIsValid(stmt->oldNode) || (skip_build && !concurrent));
+ Assert(!RelNodeIsValid(stmt->oldNode) || (skip_build && !concurrent));
/*
* Make the catalog entries for the index, including constraints. This
@@ -1316,7 +1316,7 @@ DefineIndex(Oid relationId,
childStmt->idxname = NULL;
childStmt->relation = NULL;
childStmt->indexOid = InvalidOid;
- childStmt->oldNode = InvalidOid;
+ childStmt->oldNode = InvalidRelNode;
childStmt->oldCreateSubid = InvalidSubTransactionId;
childStmt->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
@@ -2897,7 +2897,7 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
* particular this eliminates all shared catalogs.).
*/
if (RELKIND_HAS_STORAGE(classtuple->relkind) &&
- !OidIsValid(classtuple->relfilenode))
+ !RelNodeIsValid(classtuple->relfilenode))
skip_rel = true;
/*
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index ab592ce..aafca83 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -74,7 +74,7 @@ typedef struct sequence_magic
typedef struct SeqTableData
{
Oid relid; /* pg_class OID of this sequence (hash key) */
- Oid filenode; /* last seen relfilenode of this sequence */
+ RelNode filenode; /* last seen relfilenode of this sequence */
LocalTransactionId lxid; /* xact in which we last did a seq op */
bool last_valid; /* do we have a valid "last" value? */
int64 last; /* value last returned by nextval */
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 3e83f37..3f17f7d 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -3305,7 +3305,7 @@ CheckRelationTableSpaceMove(Relation rel, Oid newTableSpaceId)
void
SetRelationTableSpace(Relation rel,
Oid newTableSpaceId,
- Oid newRelFileNode)
+ RelNode newRelFileNode)
{
Relation pg_class;
HeapTuple tuple;
@@ -3325,7 +3325,7 @@ SetRelationTableSpace(Relation rel,
/* Update the pg_class row. */
rd_rel->reltablespace = (newTableSpaceId == MyDatabaseTableSpace) ?
InvalidOid : newTableSpaceId;
- if (OidIsValid(newRelFileNode))
+ if (RelNodeIsValid(newRelFileNode))
rd_rel->relfilenode = newRelFileNode;
CatalogTupleUpdate(pg_class, &tuple->t_self, tuple);
@@ -8573,7 +8573,7 @@ ATExecAddIndex(AlteredTableInfo *tab, Relation rel,
/* suppress schema rights check when rebuilding existing index */
check_rights = !is_rebuild;
/* skip index build if phase 3 will do it or we're reusing an old one */
- skip_build = tab->rewrite > 0 || OidIsValid(stmt->oldNode);
+ skip_build = tab->rewrite > 0 || RelNodeIsValid(stmt->oldNode);
/* suppress notices when rebuilding existing index */
quiet = is_rebuild;
@@ -8597,7 +8597,7 @@ ATExecAddIndex(AlteredTableInfo *tab, Relation rel,
* DROP of the old edition of this index will have scheduled the storage
* for deletion at commit, so cancel that pending deletion.
*/
- if (OidIsValid(stmt->oldNode))
+ if (RelNodeIsValid(stmt->oldNode))
{
Relation irel = index_open(address.objectId, NoLock);
@@ -14291,7 +14291,7 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
{
Relation rel;
Oid reltoastrelid;
- Oid newrelfilenode;
+ RelNode newrelfilenode;
RelFileNode newrnode;
List *reltoastidxids = NIL;
ListCell *lc;
@@ -14321,10 +14321,13 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
}
/*
- * Relfilenodes are not unique in databases across tablespaces, so we need
- * to allocate a new one in the new tablespace.
+ * Generate a new relfilenode. Although relfilenodes are unique within a
+ * cluster, we cannot reuse the old one, because unused relfilenodes are
+ * not unlinked until commit. If we moved the relation back to its old
+ * tablespace within the same transaction, we would collide with the
+ * not-yet-unlinked relfilenode file.
*/
- newrelfilenode = GetNewRelFileNode(newTableSpace, NULL,
+ newrelfilenode = GetNewRelFileNode(newTableSpace,
rel->rd_rel->relpersistence);
/* Open old and new relation */
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 6bdad46..7372fc0 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2771,7 +2771,7 @@ _outIndexStmt(StringInfo str, const IndexStmt *node)
WRITE_NODE_FIELD(excludeOpNames);
WRITE_STRING_FIELD(idxcomment);
WRITE_OID_FIELD(indexOid);
- WRITE_OID_FIELD(oldNode);
+ WRITE_UINT64_FIELD(oldNode);
WRITE_UINT_FIELD(oldCreateSubid);
WRITE_UINT_FIELD(oldFirstRelfilenodeSubid);
WRITE_BOOL_FIELD(unique);
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index 99efa26..4b6b2ca 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -1577,7 +1577,7 @@ generateClonedIndexStmt(RangeVar *heapRel, Relation source_idx,
index->excludeOpNames = NIL;
index->idxcomment = NULL;
index->indexOid = InvalidOid;
- index->oldNode = InvalidOid;
+ index->oldNode = InvalidRelNode;
index->oldCreateSubid = InvalidSubTransactionId;
index->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
index->unique = idxrec->indisunique;
@@ -2200,7 +2200,7 @@ transformIndexConstraint(Constraint *constraint, CreateStmtContext *cxt)
index->excludeOpNames = NIL;
index->idxcomment = NULL;
index->indexOid = InvalidOid;
- index->oldNode = InvalidOid;
+ index->oldNode = InvalidRelNode;
index->oldCreateSubid = InvalidSubTransactionId;
index->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
index->transformed = false;
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 18cf931..ffd89b6 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -156,6 +156,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
break;
case XLOG_NOOP:
case XLOG_NEXTOID:
+ case XLOG_NEXT_RELFILENODE:
case XLOG_SWITCH:
case XLOG_BACKUP_END:
case XLOG_PARAMETER_CHANGE:
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index c2d9be8..8b228b8 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -5268,7 +5268,7 @@ DisplayMapping(HTAB *tuplecid_data)
hash_seq_init(&hstat, tuplecid_data);
while ((ent = (ReorderBufferTupleCidEnt *) hash_seq_search(&hstat)) != NULL)
{
- elog(DEBUG3, "mapping: node: %u/%u/%u tid: %u/%u cmin: %u, cmax: %u",
+ elog(DEBUG3, "mapping: node: %u/%u/" INT64_FORMAT " tid: %u/%u cmin: %u, cmax: %u",
ent->key.relnode.dbNode,
ent->key.relnode.spcNode,
ent->key.relnode.relNode,
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 5014fe6..42f551d 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1994,7 +1994,7 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
item->tsId = bufHdr->tag.spcOid;
- item->relNode = bufHdr->tag.fileNode;
+ item->relNode = BufTagGetFileNode(bufHdr->tag);
item->forkNum = bufHdr->tag.forkNum;
item->blockNum = bufHdr->tag.blockNum;
}
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index d165b35..3c0c88d 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -273,7 +273,7 @@ restart:
BlockNumber blknum;
BufferGetTag(buf, &rnode, &forknum, &blknum);
- elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
+ elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/" INT64_FORMAT,
blknum, rnode.spcNode, rnode.dbNode, rnode.relNode);
/* make sure we hold an exclusive lock */
diff --git a/src/backend/storage/lmgr/lwlocknames.txt b/src/backend/storage/lmgr/lwlocknames.txt
index 6c7cf6c..1eb6d78 100644
--- a/src/backend/storage/lmgr/lwlocknames.txt
+++ b/src/backend/storage/lmgr/lwlocknames.txt
@@ -53,3 +53,4 @@ XactTruncationLock 44
# 45 was XactTruncationLock until removal of BackendRandomLock
WrapLimitsVacuumLock 46
NotifyQueueTailLock 47
+RelNodeGenLock 48
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index d71a557..a550823 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -156,7 +156,7 @@ smgropen(RelFileNode rnode, BackendId backend)
/* First time through: initialize the hash table */
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileNodeBackend);
+ ctl.keysize = SizeOfRelFileNodeBackend;
ctl.entrysize = sizeof(SMgrRelationData);
SMgrRelationHash = hash_create("smgr relation table", 400,
&ctl, HASH_ELEM | HASH_BLOBS);
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 3a2f2e1..9a8d6a5 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -850,7 +850,7 @@ Datum
pg_relation_filenode(PG_FUNCTION_ARGS)
{
Oid relid = PG_GETARG_OID(0);
- Oid result;
+ RelNode result;
HeapTuple tuple;
Form_pg_class relform;
@@ -870,15 +870,15 @@ pg_relation_filenode(PG_FUNCTION_ARGS)
else
{
/* no storage, return NULL */
- result = InvalidOid;
+ result = InvalidRelNode;
}
ReleaseSysCache(tuple);
- if (!OidIsValid(result))
+ if (!RelNodeIsValid(result))
PG_RETURN_NULL();
- PG_RETURN_OID(result);
+ PG_RETURN_INT64(result);
}
/*
@@ -898,11 +898,11 @@ Datum
pg_filenode_relation(PG_FUNCTION_ARGS)
{
Oid reltablespace = PG_GETARG_OID(0);
- Oid relfilenode = PG_GETARG_OID(1);
+ RelNode relfilenode = PG_GETARG_INT64(1);
Oid heaprel;
/* test needed so RelidByRelfilenode doesn't misbehave */
- if (!OidIsValid(relfilenode))
+ if (!RelNodeIsValid(relfilenode))
PG_RETURN_NULL();
heaprel = RelidByRelfilenode(reltablespace, relfilenode);
@@ -953,13 +953,13 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
else
{
/* no storage, return NULL */
- rnode.relNode = InvalidOid;
+ rnode.relNode = InvalidRelNode;
/* some compilers generate warnings without these next two lines */
rnode.dbNode = InvalidOid;
rnode.spcNode = InvalidOid;
}
- if (!OidIsValid(rnode.relNode))
+ if (!RelNodeIsValid(rnode.relNode))
{
ReleaseSysCache(tuple);
PG_RETURN_NULL();
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 67b9675e..74d830f 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -98,10 +98,11 @@ binary_upgrade_set_next_heap_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_heap_relfilenode(PG_FUNCTION_ARGS)
{
- Oid nodeoid = PG_GETARG_OID(0);
+ RelNode relnode = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
- binary_upgrade_next_heap_pg_class_relfilenode = nodeoid;
+ binary_upgrade_next_heap_pg_class_relfilenode = relnode;
+ SetNextRelNode(relnode + 1);
PG_RETURN_VOID();
}
@@ -120,11 +121,11 @@ binary_upgrade_set_next_index_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_index_relfilenode(PG_FUNCTION_ARGS)
{
- Oid nodeoid = PG_GETARG_OID(0);
+ RelNode relnode = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
- binary_upgrade_next_index_pg_class_relfilenode = nodeoid;
-
+ binary_upgrade_next_index_pg_class_relfilenode = relnode;
+ SetNextRelNode(relnode + 1);
PG_RETURN_VOID();
}
@@ -142,11 +143,11 @@ binary_upgrade_set_next_toast_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_toast_relfilenode(PG_FUNCTION_ARGS)
{
- Oid nodeoid = PG_GETARG_OID(0);
+ RelNode relnode = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
- binary_upgrade_next_toast_pg_class_relfilenode = nodeoid;
-
+ binary_upgrade_next_toast_pg_class_relfilenode = relnode;
+ SetNextRelNode(relnode + 1);
PG_RETURN_VOID();
}
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index fccffce..5e9c900 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1344,7 +1344,7 @@ RelationInitPhysicalAddr(Relation relation)
relation->rd_node.relNode =
RelationMapOidToFilenode(relation->rd_id,
relation->rd_rel->relisshared);
- if (!OidIsValid(relation->rd_node.relNode))
+ if (!RelNodeIsValid(relation->rd_node.relNode))
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
RelationGetRelationName(relation), relation->rd_id);
}
@@ -1959,13 +1959,13 @@ formrdesc(const char *relationName, Oid relationReltype,
/*
* All relations made with formrdesc are mapped. This is necessarily so
* because there is no other way to know what filenode they currently
- * have. In bootstrap mode, add them to the initial relation mapper data,
- * specifying that the initial filenode is the same as the OID.
+ * have. In bootstrap mode, generate a new relfilenode for each and add
+ * it to the initial relation mapper data.
*/
- relation->rd_rel->relfilenode = InvalidOid;
+ relation->rd_rel->relfilenode = InvalidRelNode;
if (IsBootstrapProcessingMode())
RelationMapUpdateMap(RelationGetRelid(relation),
- RelationGetRelid(relation),
+ GetNewRelNode(),
isshared, true);
/*
@@ -3435,7 +3435,7 @@ RelationBuildLocalRelation(const char *relname,
TupleDesc tupDesc,
Oid relid,
Oid accessmtd,
- Oid relfilenode,
+ RelNode relfilenode,
Oid reltablespace,
bool shared_relation,
bool mapped_relation,
@@ -3606,7 +3606,7 @@ RelationBuildLocalRelation(const char *relname,
if (mapped_relation)
{
- rel->rd_rel->relfilenode = InvalidOid;
+ rel->rd_rel->relfilenode = InvalidRelNode;
/* Add it to the active mapping information */
RelationMapUpdateMap(relid, relfilenode, shared_relation, true);
}
@@ -3675,7 +3675,7 @@ RelationBuildLocalRelation(const char *relname,
void
RelationSetNewRelfilenode(Relation relation, char persistence)
{
- Oid newrelfilenode;
+ RelNode newrelfilenode;
Relation pg_class;
HeapTuple tuple;
Form_pg_class classform;
@@ -3684,7 +3684,7 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
RelFileNode newrnode;
/* Allocate a new relfilenode */
- newrelfilenode = GetNewRelFileNode(relation->rd_rel->reltablespace, NULL,
+ newrelfilenode = GetNewRelFileNode(relation->rd_rel->reltablespace,
persistence);
/*
diff --git a/src/backend/utils/cache/relfilenodemap.c b/src/backend/utils/cache/relfilenodemap.c
index 70c323c..4d3e068 100644
--- a/src/backend/utils/cache/relfilenodemap.c
+++ b/src/backend/utils/cache/relfilenodemap.c
@@ -37,7 +37,7 @@ static ScanKeyData relfilenode_skey[2];
typedef struct
{
Oid reltablespace;
- Oid relfilenode;
+ RelNode relfilenode;
} RelfilenodeMapKey;
typedef struct
@@ -135,7 +135,7 @@ InitializeRelfilenodeMap(void)
* Returns InvalidOid if no relation matching the criteria could be found.
*/
Oid
-RelidByRelfilenode(Oid reltablespace, Oid relfilenode)
+RelidByRelfilenode(Oid reltablespace, RelNode relfilenode)
{
RelfilenodeMapKey key;
RelfilenodeMapEntry *entry;
@@ -196,7 +196,7 @@ RelidByRelfilenode(Oid reltablespace, Oid relfilenode)
/* set scan arguments */
skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
- skey[1].sk_argument = ObjectIdGetDatum(relfilenode);
+ skey[1].sk_argument = Int64GetDatum(relfilenode);
scandesc = systable_beginscan(relation,
ClassTblspcRelfilenodeIndexId,
@@ -213,7 +213,7 @@ RelidByRelfilenode(Oid reltablespace, Oid relfilenode)
if (found)
elog(ERROR,
- "unexpected duplicate for tablespace %u, relfilenode %u",
+ "unexpected duplicate for tablespace %u, relfilenode " INT64_FORMAT,
reltablespace, relfilenode);
found = true;
diff --git a/src/backend/utils/cache/relmapper.c b/src/backend/utils/cache/relmapper.c
index 4f6811f..1a637b0 100644
--- a/src/backend/utils/cache/relmapper.c
+++ b/src/backend/utils/cache/relmapper.c
@@ -79,7 +79,7 @@
typedef struct RelMapping
{
Oid mapoid; /* OID of a catalog */
- Oid mapfilenode; /* its filenode number */
+ RelNode mapfilenode; /* its filenode number */
} RelMapping;
typedef struct RelMapFile
@@ -132,7 +132,7 @@ static RelMapFile pending_local_updates;
/* non-export function prototypes */
-static void apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode,
+static void apply_map_update(RelMapFile *map, Oid relationId, RelNode fileNode,
bool add_okay);
static void merge_map_updates(RelMapFile *map, const RelMapFile *updates,
bool add_okay);
@@ -155,7 +155,7 @@ static void perform_relmap_update(bool shared, const RelMapFile *updates);
- * Returns InvalidOid if the OID is not known (which should never happen,
+ * Returns InvalidRelNode if the OID is not known (which should never happen,
* but the caller is in a better position to report a meaningful error).
*/
-Oid
+RelNode
RelationMapOidToFilenode(Oid relationId, bool shared)
{
const RelMapFile *map;
@@ -193,7 +193,7 @@ RelationMapOidToFilenode(Oid relationId, bool shared)
}
}
- return InvalidOid;
+ return InvalidRelNode;
}
/*
@@ -209,7 +209,7 @@ RelationMapOidToFilenode(Oid relationId, bool shared)
* relfilenode doesn't pertain to a mapped relation.
*/
Oid
-RelationMapFilenodeToOid(Oid filenode, bool shared)
+RelationMapFilenodeToOid(RelNode filenode, bool shared)
{
const RelMapFile *map;
int32 i;
@@ -258,7 +258,7 @@ RelationMapFilenodeToOid(Oid filenode, bool shared)
* immediately. Otherwise it is made pending until CommandCounterIncrement.
*/
void
-RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
+RelationMapUpdateMap(Oid relationId, RelNode fileNode, bool shared,
bool immediate)
{
RelMapFile *map;
@@ -316,7 +316,8 @@ RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
* add_okay = false to draw an error if not.
*/
static void
-apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode, bool add_okay)
+apply_map_update(RelMapFile *map, Oid relationId, RelNode fileNode,
+ bool add_okay)
{
int32 i;
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index 781f8b8..85ed88c 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -79,8 +79,8 @@ pg_control_system(PG_FUNCTION_ARGS)
Datum
pg_control_checkpoint(PG_FUNCTION_ARGS)
{
- Datum values[18];
- bool nulls[18];
+ Datum values[19];
+ bool nulls[19];
TupleDesc tupdesc;
HeapTuple htup;
ControlFileData *ControlFile;
@@ -129,6 +129,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
XIDOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 18, "checkpoint_time",
TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 19, "next_relfilenode",
+ INT8OID, -1, 0);
tupdesc = BlessTupleDesc(tupdesc);
/* Read the control file. */
@@ -202,6 +204,9 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
values[17] = TimestampTzGetDatum(time_t_to_timestamptz(ControlFile->checkPointCopy.time));
nulls[17] = false;
+ values[18] = Int64GetDatum(ControlFile->checkPointCopy.nextRelNode);
+ nulls[18] = false;
+
htup = heap_form_tuple(tupdesc, values, nulls);
PG_RETURN_DATUM(HeapTupleGetDatum(htup));
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 7e69475..94ec594 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -520,9 +520,9 @@ main(int argc, char *argv[])
mode = PG_MODE_ENABLE;
break;
case 'f':
- if (!option_parse_int(optarg, "-f/--filenode", 0,
- INT_MAX,
- NULL))
+ if (!option_parse_int64(optarg, "-f/--filenode", 0,
+ LLONG_MAX,
+ NULL))
exit(1);
only_filenode = pstrdup(optarg);
break;
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index f911f98..2513fc3 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -250,6 +250,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
EpochFromFullTransactionId(ControlFile->checkPointCopy.nextXid),
XidFromFullTransactionId(ControlFile->checkPointCopy.nextXid));
+ printf(_("Latest checkpoint's NextRelFileNode: " INT64_FORMAT "\n"),
+ ControlFile->checkPointCopy.nextRelNode);
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index e69dcf8..cd19b7e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4680,12 +4680,12 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
{
PQExpBuffer upgrade_query = createPQExpBuffer();
PGresult *upgrade_res;
- Oid relfilenode;
+ RelNode relfilenode;
Oid toast_oid;
- Oid toast_relfilenode;
+ RelNode toast_relfilenode;
char relkind;
Oid toast_index_oid;
- Oid toast_index_relfilenode;
+ RelNode toast_index_relfilenode;
/*
* Preserve the OID and relfilenode of the table, table's index, table's
@@ -4711,16 +4711,16 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
relkind = *PQgetvalue(upgrade_res, 0, PQfnumber(upgrade_res, "relkind"));
- relfilenode = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "relfilenode")));
+ relfilenode = atorelnode(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "relfilenode")));
toast_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "reltoastrelid")));
- toast_relfilenode = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_relfilenode")));
+ toast_relfilenode = atorelnode(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_relfilenode")));
toast_index_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "indexrelid")));
- toast_index_relfilenode = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_index_relfilenode")));
+ toast_index_relfilenode = atorelnode(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_index_relfilenode")));
appendPQExpBufferStr(upgrade_buffer,
"\n-- For binary upgrade, must preserve pg_class oids and relfilenodes\n");
@@ -4736,9 +4736,9 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
* partitioned tables have a relfilenode, which should not be preserved
* when upgrading.
*/
- if (OidIsValid(relfilenode) && relkind != RELKIND_PARTITIONED_TABLE)
+ if (RelNodeIsValid(relfilenode) && relkind != RELKIND_PARTITIONED_TABLE)
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenode);
/*
@@ -4752,7 +4752,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_toast_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_relfilenode);
/* every toast table has an index */
@@ -4760,7 +4760,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_index_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_index_relfilenode);
}
@@ -4773,7 +4773,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
pg_class_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenode);
}
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 7211090..2674b00 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -535,11 +535,11 @@ isRelDataFile(const char *path)
*/
rnode.spcNode = InvalidOid;
rnode.dbNode = InvalidOid;
- rnode.relNode = InvalidOid;
+ rnode.relNode = InvalidRelNode;
segNo = 0;
matched = false;
- nmatch = sscanf(path, "global/%u.%u", &rnode.relNode, &segNo);
+ nmatch = sscanf(path, "global/" INT64_FORMAT ".%u", &rnode.relNode, &segNo);
if (nmatch == 1 || nmatch == 2)
{
rnode.spcNode = GLOBALTABLESPACE_OID;
@@ -548,7 +548,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "base/%u/%u.%u",
+ nmatch = sscanf(path, "base/%u/" INT64_FORMAT ".%u",
&rnode.dbNode, &rnode.relNode, &segNo);
if (nmatch == 2 || nmatch == 3)
{
@@ -557,7 +557,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/%u.%u",
+ nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/" INT64_FORMAT ".%u",
&rnode.spcNode, &rnode.dbNode, &rnode.relNode,
&segNo);
if (nmatch == 3 || nmatch == 4)
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 69ef231..d3c5d53 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -511,7 +511,7 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
relname = PQgetvalue(res, relnum, i_relname);
curr->relname = pg_strdup(relname);
- curr->relfilenode = atooid(PQgetvalue(res, relnum, i_relfilenode));
+ curr->relfilenode = atorelnode(PQgetvalue(res, relnum, i_relfilenode));
curr->tblsp_alloc = false;
/* Is the tablespace oid non-default? */
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index ecb3e1f..a4c35bb 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -15,10 +15,8 @@
* oids are the same between old and new clusters. This is important
* because toast oids are stored as toast pointers in user tables.
*
- * While pg_class.oid and pg_class.relfilenode are initially the same in a
- * cluster, they can diverge due to CLUSTER, REINDEX, or VACUUM FULL. We
- * control assignments of pg_class.relfilenode because we want the filenames
- * to match between the old and new cluster.
+ * We control assignments of pg_class.relfilenode because we want the
+ * filenames to match between the old and new cluster.
*
* We control assignment of pg_tablespace.oid because we want the oid to match
* between the old and new cluster.
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index ca86c11..8b4f60c 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -130,7 +130,7 @@ typedef struct
char *nspname; /* namespace name */
char *relname; /* relation name */
Oid reloid; /* relation OID */
- Oid relfilenode; /* relation file node */
+ RelNode relfilenode; /* relation file node */
Oid indtable; /* if index, OID of its table, else 0 */
Oid toastheap; /* if toast table, OID of base table, else 0 */
char *tablespace; /* tablespace path; "" for cluster default */
@@ -154,7 +154,7 @@ typedef struct
const char *old_tablespace_suffix;
const char *new_tablespace_suffix;
Oid db_oid;
- Oid relfilenode;
+ RelNode relfilenode;
/* the rest are used only for logging and error reporting */
char *nspname; /* namespaces */
char *relname;
diff --git a/src/bin/pg_upgrade/relfilenode.c b/src/bin/pg_upgrade/relfilenode.c
index d23ac88..a2605e2 100644
--- a/src/bin/pg_upgrade/relfilenode.c
+++ b/src/bin/pg_upgrade/relfilenode.c
@@ -190,14 +190,14 @@ transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_fro
else
snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
- snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
+ snprintf(old_file, sizeof(old_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->old_tablespace,
map->old_tablespace_suffix,
map->db_oid,
map->relfilenode,
type_suffix,
extent_suffix);
- snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
+ snprintf(new_file, sizeof(new_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->new_tablespace,
map->new_tablespace_suffix,
map->db_oid,
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 2340dc2..4d2783e 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -515,13 +515,13 @@ XLogDumpDisplayRecord(XLogDumpConfig *config, XLogReaderState *record)
XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blk);
if (forknum != MAIN_FORKNUM)
- printf(", blkref #%d: rel %u/%u/%u fork %s blk %u",
+ printf(", blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rnode.spcNode, rnode.dbNode, rnode.relNode,
forkNames[forknum],
blk);
else
- printf(", blkref #%d: rel %u/%u/%u blk %u",
+ printf(", blkref #%d: rel %u/%u/" INT64_FORMAT " blk %u",
block_id,
rnode.spcNode, rnode.dbNode, rnode.relNode,
blk);
@@ -545,7 +545,7 @@ XLogDumpDisplayRecord(XLogDumpConfig *config, XLogReaderState *record)
continue;
XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blk);
- printf("\tblkref #%d: rel %u/%u/%u fork %s blk %u",
+ printf("\tblkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rnode.spcNode, rnode.dbNode, rnode.relNode,
forkNames[forknum],
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 636c96e..27b8547 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -138,7 +138,7 @@ GetDatabasePath(Oid dbNode, Oid spcNode)
* the trouble considering BackendId is just int anyway.
*/
char *
-GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
+GetRelationPath(Oid dbNode, Oid spcNode, RelNode relNode,
int backendId, ForkNumber forkNumber)
{
char *path;
@@ -149,10 +149,10 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
Assert(dbNode == 0);
Assert(backendId == InvalidBackendId);
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("global/%u_%s",
+ path = psprintf("global/" INT64_FORMAT "_%s",
relNode, forkNames[forkNumber]);
else
- path = psprintf("global/%u", relNode);
+ path = psprintf("global/" INT64_FORMAT, relNode);
}
else if (spcNode == DEFAULTTABLESPACE_OID)
{
@@ -160,21 +160,21 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/%u_%s",
+ path = psprintf("base/%u/" INT64_FORMAT "_%s",
dbNode, relNode,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/%u",
+ path = psprintf("base/%u/" INT64_FORMAT,
dbNode, relNode);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/t%d_%u_%s",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT "_%s",
dbNode, backendId, relNode,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/t%d_%u",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT,
dbNode, backendId, relNode);
}
}
@@ -184,24 +184,24 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT "_%s",
spcNode, TABLESPACE_VERSION_DIRECTORY,
dbNode, relNode,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT,
spcNode, TABLESPACE_VERSION_DIRECTORY,
dbNode, relNode);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT "_%s",
spcNode, TABLESPACE_VERSION_DIRECTORY,
dbNode, backendId, relNode,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT,
spcNode, TABLESPACE_VERSION_DIRECTORY,
dbNode, backendId, relNode);
}
diff --git a/src/fe_utils/option_utils.c b/src/fe_utils/option_utils.c
index abea881..2cb3370 100644
--- a/src/fe_utils/option_utils.c
+++ b/src/fe_utils/option_utils.c
@@ -82,3 +82,45 @@ option_parse_int(const char *optarg, const char *optname,
*result = val;
return true;
}
+
+/*
+ * option_parse_int64
+ *
+ * Same as option_parse_int, but parses an int64 value.
+ */
+bool
+option_parse_int64(const char *optarg, const char *optname,
+ int64 min_range, int64 max_range,
+ int64 *result)
+{
+ char *endptr;
+ int64 val;
+
+ errno = 0;
+ val = strtoi64(optarg, &endptr, 10);
+
+ /*
+ * Skip any trailing whitespace; if anything but whitespace remains before
+ * the terminating character, fail.
+ */
+ while (*endptr != '\0' && isspace((unsigned char) *endptr))
+ endptr++;
+
+ if (*endptr != '\0')
+ {
+ pg_log_error("invalid value \"%s\" for option %s",
+ optarg, optname);
+ return false;
+ }
+
+ if (errno == ERANGE || val < min_range || val > max_range)
+ {
+ pg_log_error("%s must be in range " INT64_FORMAT ".." INT64_FORMAT,
+ optname, min_range, max_range);
+ return false;
+ }
+
+ if (result)
+ *result = val;
+ return true;
+}
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 9a2816d..3cc577a 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -217,6 +217,9 @@ typedef struct VariableCacheData
*/
Oid nextOid; /* next OID to assign */
uint32 oidCount; /* OIDs available before must do XLOG work */
+ RelNode nextRelNode; /* next relfilenode to assign */
+ uint32 relnodecount; /* relfilenodes available before we must do
+ XLOG work */
/*
* These fields are protected by XidGenLock.
@@ -298,6 +301,8 @@ extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
extern bool ForceTransactionIdLimitUpdate(void);
extern Oid GetNewObjectId(void);
+extern RelNode GetNewRelNode(void);
+extern void SetNextRelNode(RelNode relnode);
extern void StopGeneratingPinnedObjectIds(void);
#ifdef USE_ASSERT_CHECKING
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 4b45ac6..cd5ab2d 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -233,6 +233,7 @@ extern bool CreateRestartPoint(int flags);
extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN);
extern XLogRecPtr CalculateMaxmumSafeLSN(void);
extern void XLogPutNextOid(Oid nextOid);
+extern void XLogPutNextRelFileNode(RelNode nextrelnode);
extern XLogRecPtr XLogRestorePoint(const char *rpName);
extern void UpdateFullPageWrites(void);
extern void GetFullPageWriteInfo(XLogRecPtr *RedoRecPtr_p, bool *doPageWrites_p);
diff --git a/src/include/catalog/binary_upgrade.h b/src/include/catalog/binary_upgrade.h
index 0b6944b..401bfa2 100644
--- a/src/include/catalog/binary_upgrade.h
+++ b/src/include/catalog/binary_upgrade.h
@@ -22,11 +22,11 @@ extern PGDLLIMPORT Oid binary_upgrade_next_mrng_pg_type_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_mrng_array_pg_type_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_heap_pg_class_oid;
-extern PGDLLIMPORT Oid binary_upgrade_next_heap_pg_class_relfilenode;
+extern PGDLLIMPORT RelNode binary_upgrade_next_heap_pg_class_relfilenode;
extern PGDLLIMPORT Oid binary_upgrade_next_index_pg_class_oid;
-extern PGDLLIMPORT Oid binary_upgrade_next_index_pg_class_relfilenode;
+extern PGDLLIMPORT RelNode binary_upgrade_next_index_pg_class_relfilenode;
extern PGDLLIMPORT Oid binary_upgrade_next_toast_pg_class_oid;
-extern PGDLLIMPORT Oid binary_upgrade_next_toast_pg_class_relfilenode;
+extern PGDLLIMPORT RelNode binary_upgrade_next_toast_pg_class_relfilenode;
extern PGDLLIMPORT Oid binary_upgrade_next_pg_enum_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_pg_authid_oid;
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index 60c1215..1b83c79 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -15,6 +15,7 @@
#define CATALOG_H
#include "catalog/pg_class.h"
+#include "storage/relfilenode.h"
#include "utils/relcache.h"
@@ -38,7 +39,6 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
-extern Oid GetNewRelFileNode(Oid reltablespace, Relation pg_class,
- char relpersistence);
+extern RelNode GetNewRelFileNode(Oid reltablespace, char relpersistence);
#endif /* CATALOG_H */
diff --git a/src/include/catalog/heap.h b/src/include/catalog/heap.h
index c4757bd..66d41af 100644
--- a/src/include/catalog/heap.h
+++ b/src/include/catalog/heap.h
@@ -50,7 +50,7 @@ extern Relation heap_create(const char *relname,
Oid relnamespace,
Oid reltablespace,
Oid relid,
- Oid relfilenode,
+ RelNode relfilenode,
Oid accessmtd,
TupleDesc tupDesc,
char relkind,
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index a1d6e3b..1e79ec9 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -71,7 +71,7 @@ extern Oid index_create(Relation heapRelation,
Oid indexRelationId,
Oid parentIndexRelid,
Oid parentConstraintId,
- Oid relFileNode,
+ RelNode relFileNode,
IndexInfo *indexInfo,
List *indexColNames,
Oid accessMethodObjectId,
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index 304e8c1..4659ed3 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -52,13 +52,13 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* access method; 0 if not a table / index */
Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
- /* identifier of physical storage file */
- /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
- Oid relfilenode BKI_DEFAULT(0);
-
/* identifier of table space for relation (0 means default for database) */
Oid reltablespace BKI_DEFAULT(0) BKI_LOOKUP_OPT(pg_tablespace);
+ /* identifier of physical storage file */
+ /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
+ int64 relfilenode BKI_DEFAULT(0);
+
/* # of blocks (not always up-to-date) */
int32 relpages BKI_DEFAULT(0);
@@ -154,7 +154,7 @@ typedef FormData_pg_class *Form_pg_class;
DECLARE_UNIQUE_INDEX_PKEY(pg_class_oid_index, 2662, ClassOidIndexId, on pg_class using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_class_relname_nsp_index, 2663, ClassNameNspIndexId, on pg_class using btree(relname name_ops, relnamespace oid_ops));
-DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode oid_ops));
+DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode int8_ops));
#ifdef EXPOSE_TO_CLIENT_CODE
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 1f3dc24..27d584d 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -41,6 +41,7 @@ typedef struct CheckPoint
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
FullTransactionId nextXid; /* next free transaction ID */
+ RelNode nextRelNode; /* next relfile node */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
@@ -78,6 +79,7 @@ typedef struct CheckPoint
#define XLOG_FPI 0xB0
/* 0xC0 is used in Postgres 9.5-11 */
#define XLOG_OVERWRITE_CONTRECORD 0xD0
+#define XLOG_NEXT_RELFILENODE 0xE0
/*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index d8e8715..3114ead 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7281,11 +7281,11 @@
proname => 'pg_indexes_size', provolatile => 'v', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_indexes_size' },
{ oid => '2999', descr => 'filenode identifier of relation',
- proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'oid',
+ proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_relation_filenode' },
{ oid => '3454', descr => 'relation OID for filenode and tablespace',
proname => 'pg_filenode_relation', provolatile => 's',
- prorettype => 'regclass', proargtypes => 'oid oid',
+ prorettype => 'regclass', proargtypes => 'oid int8',
prosrc => 'pg_filenode_relation' },
{ oid => '3034', descr => 'file path of relation',
proname => 'pg_relation_filepath', provolatile => 's', prorettype => 'text',
@@ -11053,15 +11053,15 @@
prosrc => 'binary_upgrade_set_missing_value' },
{ oid => '4545', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_heap_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_heap_relfilenode' },
{ oid => '4546', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_index_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_index_relfilenode' },
{ oid => '4547', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_toast_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_toast_relfilenode' },
{ oid => '4548', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_pg_tablespace_oid', provolatile => 'v',
diff --git a/src/include/commands/tablecmds.h b/src/include/commands/tablecmds.h
index 5d4037f..297c20b 100644
--- a/src/include/commands/tablecmds.h
+++ b/src/include/commands/tablecmds.h
@@ -66,7 +66,7 @@ extern void SetRelationHasSubclass(Oid relationId, bool relhassubclass);
extern bool CheckRelationTableSpaceMove(Relation rel, Oid newTableSpaceId);
extern void SetRelationTableSpace(Relation rel, Oid newTableSpaceId,
- Oid newRelFileNode);
+ RelNode newRelFileNode);
extern ObjectAddress renameatt(RenameStmt *stmt);
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index a4b5dc8..d6d6215 100644
--- a/src/include/common/relpath.h
+++ b/src/include/common/relpath.h
@@ -66,7 +66,7 @@ extern int forkname_chars(const char *str, ForkNumber *fork);
*/
extern char *GetDatabasePath(Oid dbNode, Oid spcNode);
-extern char *GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
+extern char *GetRelationPath(Oid dbNode, Oid spcNode, RelNode relNode,
int backendId, ForkNumber forkNumber);
/*
diff --git a/src/include/fe_utils/option_utils.h b/src/include/fe_utils/option_utils.h
index 03c09fd..8c0e818 100644
--- a/src/include/fe_utils/option_utils.h
+++ b/src/include/fe_utils/option_utils.h
@@ -22,5 +22,8 @@ extern void handle_help_version_opts(int argc, char *argv[],
extern bool option_parse_int(const char *optarg, const char *optname,
int min_range, int max_range,
int *result);
+extern bool option_parse_int64(const char *optarg, const char *optname,
+ int64 min_range, int64 max_range,
+ int64 *result);
#endif /* OPTION_UTILS_H */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 1617702..46f7e91 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -2901,7 +2901,7 @@ typedef struct IndexStmt
List *excludeOpNames; /* exclusion operator names, or NIL if none */
char *idxcomment; /* comment to apply to index, or NULL */
Oid indexOid; /* OID of an existing index, if any */
- Oid oldNode; /* relfilenode of existing storage, if any */
+ RelNode oldNode; /* relfilenode of existing storage, if any */
SubTransactionId oldCreateSubid; /* rd_createSubid of oldNode */
SubTransactionId oldFirstRelfilenodeSubid; /* rd_firstRelfilenodeSubid of
* oldNode */
diff --git a/src/include/postgres_ext.h b/src/include/postgres_ext.h
index fdb61b7..7454933 100644
--- a/src/include/postgres_ext.h
+++ b/src/include/postgres_ext.h
@@ -46,6 +46,21 @@ typedef unsigned int Oid;
/* Define a signed 64-bit integer type for use in client API declarations. */
typedef PG_INT64_TYPE pg_int64;
+/*
+ * RelNode data type identifies the specific relation file name. RelNode is
+ * unique within a cluster.
+ *
+ * XXX idealy we can use uint64 but current we only have int8 as an exposed
+ * datatype so maybe we should make a new datatype relnode which will be of
+ * type 8 bytes unsigned integer.
+ */
+typedef pg_int64 RelNode;
+
+#define atorelnode(x) ((RelNode) strtoul((x), NULL, 10))
+
+#define InvalidRelNode ((RelNode) 0)
+#define FirstNormalRelNode ((RelNode) 1)
+#define RelNodeIsValid(relNode) ((bool) ((relNode) != InvalidRelNode))
/*
* Identifiers of error message fields. Kept here to keep common
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 0286d51..6e940a6 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -21,6 +21,7 @@
#include "storage/condition_variable.h"
#include "storage/latch.h"
#include "storage/lwlock.h"
+#include "storage/relfilenode.h"
#include "storage/shmem.h"
#include "storage/smgr.h"
#include "storage/spin.h"
@@ -92,8 +93,9 @@ typedef struct buftag
{
Oid spcOid; /* tablespace oid. */
Oid dbOid; /* database oid. */
- Oid fileNode; /* relation file node. */
- ForkNumber forkNum;
+ uint32 fileNode_low; /* relation file node 32 lower bits */
+ uint32 fileNode_hi:24; /* relation file node 24 high bits */
+ uint32 forkNum:8;
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
@@ -101,7 +103,8 @@ typedef struct buftag
( \
(a).spcOid = InvalidOid, \
(a).dbOid = InvalidOid, \
- (a).fileNode = InvalidOid, \
+ (a).fileNode_low = 0, \
+ (a).fileNode_hi = 0, \
(a).forkNum = InvalidForkNumber, \
(a).blockNum = InvalidBlockNumber \
)
@@ -110,7 +113,7 @@ typedef struct buftag
( \
(a).spcOid = (xx_rnode).spcNode, \
(a).dbOid = (xx_rnode).dbNode, \
- (a).fileNode = (xx_rnode).relNode, \
+ BufTagSetFileNode(a, (xx_rnode).relNode), \
(a).forkNum = (xx_forkNum), \
(a).blockNum = (xx_blockNum) \
)
@@ -119,23 +122,33 @@ typedef struct buftag
( \
(a).spcOid == (b).spcOid && \
(a).dbOid == (b).dbOid && \
- (a).fileNode == (b).fileNode && \
+ (a).fileNode_low == (b).fileNode_low && \
+ (a).fileNode_hi == (b).fileNode_hi && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
)
+#define BufTagGetFileNode(a) \
+ ((((uint64) (a).fileNode_hi << 32) | ((uint32) (a).fileNode_low)))
+
+#define BufTagSetFileNode(a, node) \
+( \
+ (a).fileNode_hi = (node) >> 32, \
+ (a).fileNode_low = (node) & 0xffffffff \
+)
+
#define BuffTagGetRelFileNode(a, node) \
do { \
(node).spcNode = (a).spcOid; \
(node).dbNode = (a).dbOid; \
- (node).relNode = (a).fileNode; \
+ (node).relNode = BufTagGetFileNode(a); \
} while(0)
#define BuffTagRelFileNodeEquals(a, node) \
( \
(a).spcOid == (node).spcNode && \
(a).dbOid == (node).dbNode && \
- (a).fileNode == (node).relNode \
+ BufTagGetFileNode(a) == (node).relNode \
)
/*
@@ -312,7 +325,7 @@ extern BufferDesc *LocalBufferDescriptors;
typedef struct CkptSortItem
{
Oid tsId;
- Oid relNode;
+ RelNode relNode;
ForkNumber forkNum;
BlockNumber blockNum;
int buf_id;
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
index 4fdc606..cd2110c 100644
--- a/src/include/storage/relfilenode.h
+++ b/src/include/storage/relfilenode.h
@@ -34,8 +34,7 @@
* relNode identifies the specific relation. relNode corresponds to
* pg_class.relfilenode (NOT pg_class.oid, because we need to be able
* to assign new physical files to relations in some situations).
- * Notice that relNode is only unique within a database in a particular
- * tablespace.
+ * Notice that relNode is unique within a cluster.
*
* Note: spcNode must be GLOBALTABLESPACE_OID if and only if dbNode is
* zero. We support shared relations only in the "global" tablespace.
@@ -58,7 +57,7 @@ typedef struct RelFileNode
{
Oid spcNode; /* tablespace */
Oid dbNode; /* database */
- Oid relNode; /* relation */
+ RelNode relNode; /* relation */
} RelFileNode;
/*
@@ -75,6 +74,15 @@ typedef struct RelFileNodeBackend
BackendId backend;
} RelFileNodeBackend;
+#define SizeOfRelFileNodeBackend \
+ (offsetof(RelFileNodeBackend, backend) + sizeof(BackendId))
+
+/*
+ * Max value of the relfilnode. Relfilenode will be of 56bits wide for more
+ * details refer comments atop BufferTag.
+ */
+#define MAX_RELFILENODE ((((uint64) 1) << 56) - 1)
+
#define RelFileNodeBackendIsTemp(rnode) \
((rnode).backend != InvalidBackendId)
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 3b4ab65..ce0cffb 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -526,7 +526,7 @@ typedef struct ViewOptions
*/
#define RelationIsMapped(relation) \
(RELKIND_HAS_STORAGE((relation)->rd_rel->relkind) && \
- ((relation)->rd_rel->relfilenode == InvalidOid))
+ ((relation)->rd_rel->relfilenode == InvalidRelNode))
/*
* RelationGetSmgr
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 2281a7d..4224f42 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -103,7 +103,7 @@ extern Relation RelationBuildLocalRelation(const char *relname,
TupleDesc tupDesc,
Oid relid,
Oid accessmtd,
- Oid relfilenode,
+ RelNode relfilenode,
Oid reltablespace,
bool shared_relation,
bool mapped_relation,
diff --git a/src/include/utils/relfilenodemap.h b/src/include/utils/relfilenodemap.h
index 77d8046..d324981 100644
--- a/src/include/utils/relfilenodemap.h
+++ b/src/include/utils/relfilenodemap.h
@@ -13,6 +13,6 @@
#ifndef RELFILENODEMAP_H
#define RELFILENODEMAP_H
-extern Oid RelidByRelfilenode(Oid reltablespace, Oid relfilenode);
+extern Oid RelidByRelfilenode(Oid reltablespace, RelNode relfilenode);
#endif /* RELFILENODEMAP_H */
diff --git a/src/include/utils/relmapper.h b/src/include/utils/relmapper.h
index 9fbb5a7..58234a8 100644
--- a/src/include/utils/relmapper.h
+++ b/src/include/utils/relmapper.h
@@ -35,11 +35,11 @@ typedef struct xl_relmap_update
#define MinSizeOfRelmapUpdate offsetof(xl_relmap_update, data)
-extern Oid RelationMapOidToFilenode(Oid relationId, bool shared);
+extern RelNode RelationMapOidToFilenode(Oid relationId, bool shared);
-extern Oid RelationMapFilenodeToOid(Oid relationId, bool shared);
+extern Oid RelationMapFilenodeToOid(RelNode relationId, bool shared);
-extern void RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
+extern void RelationMapUpdateMap(Oid relationId, RelNode fileNode, bool shared,
bool immediate);
extern void RelationMapRemoveMapping(Oid relationId);
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index 16e0475..58aeddb 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -2164,7 +2164,6 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
else 'OTHER'
end as storage,
@@ -2175,10 +2174,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+---------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | t | own | child 0 index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | t | own | child 1 index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | t | orig | child 0 index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | t | orig | child 1 index
at_partitioned_id_name_key | t | none | parent index
(6 rows)
@@ -2198,7 +2197,6 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
else 'OTHER'
end as storage,
@@ -2209,10 +2207,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+--------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | f | own | parent index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | f | own | parent index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | f | OTHER | parent index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | f | OTHER | parent index
at_partitioned_id_name_key | f | none | parent index
(6 rows)
@@ -2556,7 +2554,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql
index ac894c0..250e6cd 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -1478,7 +1478,6 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
else 'OTHER'
end as storage,
@@ -1499,7 +1498,6 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
else 'OTHER'
end as storage,
@@ -1638,7 +1636,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
--
1.8.3.1
On Mon, Feb 21, 2022 at 2:51 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
While working on this I realized that even if we make the relfilenode
56 bits, we cannot remove the loop inside GetNewRelFileNode() that
checks for file existence. It is always possible that the file reaches
the disk even before the WAL for advancing the next relfilenode, and if
the system crashes in between, then we might generate a duplicate
relfilenode, right?
I agree.
I think the second paragraph in the XLogPutNextOid() function explains
this issue, and even after we get the wider relfilenode we will still
have this issue. Correct?
I think you are correct.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Fri, Mar 4, 2022 at 12:37 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
In this version I have fixed both of these issues. Thanks Robert for
suggesting the solution for both of these problems in our offlist
discussion. Basically, for the first problem we can flush the xlog
immediately because we are actually logging the WAL every time after
we allocate 64 relfilenode so this should not have much impact on the
performance and I have added the same in the comments. And during
pg_upgrade, whenever we are assigning the relfilenode as part of the
upgrade we will set that relfilenode + 1 as nextRelFileNode to be
assigned so that we never generate the conflicting relfilenode.
Anyone else have an opinion on this?
The only part I do not like in the patch is that before this patch we
could directly access the buftag->rnode. But since now we are not
having directly relfilenode as part of the buffertag and instead of
that we are keeping individual fields (i.e. dbOid, tbsOid and relNode)
in the buffer tag. So if we have to directly get the relfilenode we
need to generate it. However those changes are very limited to just 1
or 2 file so maybe not that bad.
You're talking here about just needing to introduce BufTagGetFileNode
and BufTagSetFileNode, or something else? I don't find those macros to
be problematic.
BufTagSetFileNode could maybe assert that the OID isn't too big,
though. We should ereport() before we get to this point if we somehow
run out of values, but it might be nice to have a check here as a
backup.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Tue, Mar 8, 2022 at 1:27 AM Robert Haas <robertmhaas@gmail.com> wrote:
The only part I do not like in the patch is that before this patch we
could directly access the buftag->rnode. But since now we are not
having directly relfilenode as part of the buffertag and instead of
that we are keeping individual fields (i.e. dbOid, tbsOid and relNode)
in the buffer tag. So if we have to directly get the relfilenode we
need to generate it. However those changes are very limited to just 1
or 2 file so maybe not that bad.
You're talking here about just needing to introduce BufTagGetFileNode
and BufTagSetFileNode, or something else? I don't find those macros to
be problematic.
Yeah, I was talking about the BufTagGetFileNode macro only. The reason I
did not like it is that earlier we could directly use buftag->rnode,
but now whenever we want to use the rnode we first need a separate
variable to prepare it using the BufTagGetFileNode macro. But these
changes are very localized and in very few places, so I don't have much
problem with them.
BufTagSetFileNode could maybe assert that the OID isn't too big,
though. We should ereport() before we get to this point if we somehow
run out of values, but it might be nice to have a check here as a
backup.
Yeah, we could do that, I will do that in the next version. Thanks.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Fri, Mar 4, 2022 at 12:37 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
In this version I have fixed both of these issues.
Here's a bit of review for these patches:
- The whole relnode vs. relfilenode thing is really confusing. I
realize that there is some precedent for calling the number that
pertains to the file on disk "relnode" and that value when combined
with the database and tablespace OIDs "relfilenode," but it's
definitely not the most obvious thing, especially since
pg_class.relfilenode is a prominent case where we don't even adhere to
that convention. I'm kind of tempted to think that we should go the
other way and rename the RelFileNode struct to something like
RelFileLocator, and then maybe call the new data type RelFileNumber.
And then we could work toward removing references to "filenode" and
"relfilenode" in favor of either (rel)filelocator or (rel)filenumber.
Now the question (even assuming other people like this general
direction) is how far do we go with it? Renaming pg_class.relfilenode
itself wouldn't be the worst compatibility break we've ever had, but
it would definitely cause some pain. I'd be inclined to leave the
user-visible catalog column alone and just push in this direction for
internal stuff.
- What you're doing to pg_buffercache here is completely unacceptable.
You can't change the definition of an already-released version of the
extension. Please study how such issues have been handled in the past.
- It looks to me like you need to give significantly more thought to
the proper way of adjusting the relfilenode-related test cases in
alter_table.out.
- I think BufTagGetFileNode and BufTagSetFileNode should be
introduced in 0001 and then just update the definition in 0002 as
required. Note that as things stand you end up with both
BufTagGetFileNode and BuffTagGetRelFileNode which is an artifact of
the relnode/filenode/relfilenode confusion I mention above, and just
to make matters worse, one returns a value while the other produces an
out parameter. I think the renaming I'm talking about up above might
help somewhat here, but it seems like it might also be good to change
the one that uses an out parameter by doing Get -> Copy, just to help
the reader get a clue a little more easily.
- GetNewRelNode() needs to error out if we would wrap around, not wrap
around. Probably similar to what happens if we exhaust 2^64 bytes of
WAL.
--
Robert Haas
EDB: http://www.enterprisedb.com
Hi Dilip,
On Fri, Mar 4, 2022 at 11:07 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Mon, Feb 21, 2022 at 1:21 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Thu, Jan 6, 2022 at 1:43 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
2) GetNewRelFileNode() will not loop for checking the file existence
and retry with other relfilenode.
Open Issues- there are currently 2 open issues in the patch 1) Issue
as discussed above about removing the loop, so currently in this patch
the loop is removed. 2) During upgrade from the previous version we
need to advance the nextrelfilenode to the current relfilenode we are
setting for the object in order to avoid the conflict.
In this version I have fixed both of these issues. Thanks Robert for
suggesting the solution for both of these problems in our offlist
discussion. Basically, for the first problem we can flush the xlog
immediately because we are actually logging the WAL every time after
we allocate 64 relfilenode so this should not have much impact on the
performance and I have added the same in the comments. And during
pg_upgrade, whenever we are assigning the relfilenode as part of the
upgrade we will set that relfilenode + 1 as nextRelFileNode to be
assigned so that we never generate the conflicting relfilenode.
The only part I do not like in the patch is that before this patch we
could directly access the buftag->rnode. But since now we are not
having directly relfilenode as part of the buffertag and instead of
that we are keeping individual fields (i.e. dbOid, tbsOid and relNode)
in the buffer tag. So if we have to directly get the relfilenode we
need to generate it. However those changes are very limited to just 1
or 2 file so maybe not that bad.
The v5 patch needs a rebase, and here are a few comments for 0002 that I
found while reading it; hope that helps:
+/* Number of RelFileNode to prefetch (preallocate) per XLOG write */
+#define VAR_RFN_PREFETCH 8192
+
Should it be 64, as per the comment in XLogPutNextRelFileNode for XLogFlush()?
---
+ /*
+ * Check for the wraparound for the relnode counter.
+ *
+ * XXX Actually the relnode is 56 bits wide so we don't need to worry about
+ * the wraparound case.
+ */
+ if (ShmemVariableCache->nextRelNode > MAX_RELFILENODE)
Very rare case, should use unlikely()?
---
+/*
+ * Max value of the relfilnode. Relfilenode will be of 56bits wide for more
+ * details refer comments atop BufferTag.
+ */
+#define MAX_RELFILENODE ((((uint64) 1) << 56) - 1)
Should there be 57-bit shifts here? Instead, I think we should use
INT64CONST(0xFFFFFFFFFFFFFF) to be consistent with PG_*_MAX
declarations, thoughts?
---
+ /* If we run out of logged for use RelNode then we must log more */
+ if (ShmemVariableCache->relnodecount == 0)
relnodecount might never go below zero, but just to be safer it should
check <= 0 instead.
---
Few typos:
Simmialr
Simmilar
agains
idealy
Regards,
Amul
On Thu, May 12, 2022 at 4:27 PM Amul Sul <sulamul@gmail.com> wrote:
Hi Amul,
Thanks for the review. Based on some comments from Robert, we have
planned to make some design changes, so I am planning to work on that
for the July commitfest. I will try to incorporate all your review
comments in the new version.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
I think you can get rid of SYNC_UNLINK_REQUEST, sync_unlinkfiletag,
mdunlinkfiletag as these are all now unused.
Are there any special hazards here if the plan in [1] goes ahead? If
the relfilenode allocation is logged and replayed then it should be
fine to crash and recover multiple times in a row while creating and
dropping tables, with wal_level=minimal, I think. It would be bad if
the allocator restarted from a value from the checkpoint, though.
[1]: /messages/by-id/CA+TgmoYmw==TOJ6EzYb_vcjyS09NkzrVKSyBKUUyo1zBEaJASA@mail.gmail.com
On Mon, May 16, 2022 at 3:24 PM Thomas Munro <thomas.munro@gmail.com> wrote:
I think you can get rid of SYNC_UNLINK_REQUEST, sync_unlinkfiletag,
mdunlinkfiletag as these are all now unused.
Correct.
Are there any special hazards here if the plan in [1] goes ahead?
IMHO we should not have any problem. In fact, we need this for [1]
right? Otherwise, there is a risk of reusing the same relfilenode
within the same checkpoint cycle as discussed in [2].
[1] /messages/by-id/CA+TgmoYmw==TOJ6EzYb_vcjyS09NkzrVKSyBKUUyo1zBEaJASA@mail.gmail.com
[2]: /messages/by-id/CA+TgmoZZDL_2E_zuahqpJ-WmkuxmUi8+g7=dLEny=18r-+c-iQ@mail.gmail.com
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, Mar 8, 2022 at 10:11 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Mar 4, 2022 at 12:37 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
In this version I have fixed both of these issues.
Here's a bit of review for these patches:
- The whole relnode vs. relfilenode thing is really confusing. I
realize that there is some precedent for calling the number that
pertains to the file on disk "relnode" and that value when combined
with the database and tablespace OIDs "relfilenode," but it's
definitely not the most obvious thing, especially since
pg_class.relfilenode is a prominent case where we don't even adhere to
that convention. I'm kind of tempted to think that we should go the
other way and rename the RelFileNode struct to something like
RelFileLocator, and then maybe call the new data type RelFileNumber.
And then we could work toward removing references to "filenode" and
"relfilenode" in favor of either (rel)filelocator or (rel)filenumber.
Now the question (even assuming other people like this general
direction) is how far do we go with it? Renaming pg_class.relfilenode
itself wouldn't be the worst compatibility break we've ever had, but
it would definitely cause some pain. I'd be inclined to leave the
user-visible catalog column alone and just push in this direction for
internal stuff.
I have worked on this renaming stuff first and once we agree with that
then I will rebase the other patches on top of this and will also work
on the other review comments for those patches.
So basically in this patch
- The "RelFileNode" structure is renamed to "RelFileLocator", and other
internal members are renamed as below:
typedef struct RelFileLocator
{
Oid spcOid; /* tablespace */
Oid dbOid; /* database */
Oid relNumber; /* relation */
} RelFileLocator;
- All variables and internal functions which are using name as
relfilenode/rnode and referring to this structure are renamed to
relfilelocator/rlocator.
- relNode/relfilenode, which refer to the actual file name on disk, are
renamed to relNumber/relfilenumber.
- Based on the new terminology, I have renamed the file names as well, e.g.
relfilenode.h -> relfilelocator.h
relfilenodemap.h -> relfilenumbermap.h
I haven't renamed the exposed catalog variables and exposed functions;
here is the high-level list:
- pg_class.relfilenode
- pg_catalog.pg_relation_filenode()
- All test cases variables referring to pg_class.relfilenode.
- exposed tool options that refer to pg_class.relfilenode (e.g.
-f, --filenode=FILENODE)
- exposed functions
pg_catalog.binary_upgrade_set_next_heap_relfilenode() and friends
- the pg_filenode.map file name; maybe we could rename this too, but
it is used by other tools, so I left it alone.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachments:
v1-0001-Rename-RelFileNode-to-RelFileLocator-and-relNode-.patch (text/x-patch; charset=US-ASCII)
From 69a3c0ed23dc0d870035d45d63f42f2b53c2c063 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Tue, 21 Jun 2022 14:04:01 +0530
Subject: [PATCH v1] Rename RelFileNode to RelFileLocator and relNode to
RelNumber
Currently, the way relfilenode and relnode are used is really confusing.
Although there is some precedent for calling the number that pertains to
the file on disk "relnode" and that value when combined with the database
and tablespace OIDs "relfilenode", it's definitely not the most obvious
thing, and this terminology is also not used uniformly.
So as part of this patch, these variables are renamed to better match
their usage. RelFileNode is renamed to RelFileLocator, and all related
variable declarations from relfilenode to relfilelocator. The relNode
field in RelFileLocator is renamed to relNumber, and along with it dbNode
and spcNode are renamed to dbOid and spcOid. All other references to
relnode/relfilenode with respect to the on-disk file are renamed to
relnumber/relfilenumber.
---
contrib/bloom/blinsert.c | 2 +-
contrib/oid2name/oid2name.c | 28 +--
contrib/pg_buffercache/pg_buffercache_pages.c | 12 +-
contrib/pg_prewarm/autoprewarm.c | 26 +--
contrib/pg_visibility/pg_visibility.c | 2 +-
src/backend/access/common/syncscan.c | 29 +--
src/backend/access/gin/ginbtree.c | 2 +-
src/backend/access/gin/ginfast.c | 2 +-
src/backend/access/gin/ginutil.c | 2 +-
src/backend/access/gin/ginxlog.c | 6 +-
src/backend/access/gist/gistbuild.c | 4 +-
src/backend/access/gist/gistxlog.c | 11 +-
src/backend/access/hash/hash_xlog.c | 6 +-
src/backend/access/hash/hashpage.c | 4 +-
src/backend/access/heap/heapam.c | 78 +++----
src/backend/access/heap/heapam_handler.c | 26 +--
src/backend/access/heap/rewriteheap.c | 10 +-
src/backend/access/heap/visibilitymap.c | 4 +-
src/backend/access/nbtree/nbtpage.c | 2 +-
src/backend/access/nbtree/nbtree.c | 2 +-
src/backend/access/nbtree/nbtsort.c | 2 +-
src/backend/access/nbtree/nbtxlog.c | 8 +-
src/backend/access/rmgrdesc/genericdesc.c | 2 +-
src/backend/access/rmgrdesc/gindesc.c | 2 +-
src/backend/access/rmgrdesc/gistdesc.c | 6 +-
src/backend/access/rmgrdesc/heapdesc.c | 6 +-
src/backend/access/rmgrdesc/nbtdesc.c | 4 +-
src/backend/access/rmgrdesc/seqdesc.c | 4 +-
src/backend/access/rmgrdesc/smgrdesc.c | 4 +-
src/backend/access/rmgrdesc/xactdesc.c | 44 ++--
src/backend/access/rmgrdesc/xlogdesc.c | 10 +-
src/backend/access/spgist/spginsert.c | 6 +-
src/backend/access/spgist/spgxlog.c | 6 +-
src/backend/access/table/tableamapi.c | 2 +-
src/backend/access/transam/README | 14 +-
src/backend/access/transam/README.parallel | 2 +-
src/backend/access/transam/twophase.c | 38 ++--
src/backend/access/transam/varsup.c | 2 +-
src/backend/access/transam/xact.c | 40 ++--
src/backend/access/transam/xloginsert.c | 38 ++--
src/backend/access/transam/xlogprefetcher.c | 96 ++++----
src/backend/access/transam/xlogreader.c | 25 ++-
src/backend/access/transam/xlogrecovery.c | 18 +-
src/backend/access/transam/xlogutils.c | 73 +++---
src/backend/bootstrap/bootparse.y | 8 +-
src/backend/catalog/catalog.c | 28 +--
src/backend/catalog/heap.c | 52 ++---
src/backend/catalog/index.c | 36 +--
src/backend/catalog/storage.c | 119 +++++-----
src/backend/commands/cluster.c | 42 ++--
src/backend/commands/copyfrom.c | 6 +-
src/backend/commands/dbcommands.c | 104 ++++-----
src/backend/commands/indexcmds.c | 12 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/sequence.c | 29 +--
src/backend/commands/tablecmds.c | 87 ++++----
src/backend/commands/tablespace.c | 18 +-
src/backend/nodes/copyfuncs.c | 4 +-
src/backend/nodes/equalfuncs.c | 4 +-
src/backend/nodes/outfuncs.c | 4 +-
src/backend/parser/gram.y | 8 +-
src/backend/parser/parse_utilcmd.c | 8 +-
src/backend/postmaster/checkpointer.c | 2 +-
src/backend/replication/logical/decode.c | 40 ++--
src/backend/replication/logical/reorderbuffer.c | 50 ++---
src/backend/replication/logical/snapbuild.c | 2 +-
src/backend/storage/buffer/bufmgr.c | 284 ++++++++++++------------
src/backend/storage/buffer/localbuf.c | 34 +--
src/backend/storage/freespace/freespace.c | 6 +-
src/backend/storage/freespace/fsmpage.c | 6 +-
src/backend/storage/ipc/standby.c | 8 +-
src/backend/storage/lmgr/predicate.c | 24 +-
src/backend/storage/smgr/README | 2 +-
src/backend/storage/smgr/md.c | 126 +++++------
src/backend/storage/smgr/smgr.c | 44 ++--
src/backend/utils/adt/dbsize.c | 56 ++---
src/backend/utils/adt/pg_upgrade_support.c | 14 +-
src/backend/utils/cache/Makefile | 2 +-
src/backend/utils/cache/inval.c | 16 +-
src/backend/utils/cache/relcache.c | 180 +++++++--------
src/backend/utils/cache/relfilenodemap.c | 244 --------------------
src/backend/utils/cache/relfilenumbermap.c | 244 ++++++++++++++++++++
src/backend/utils/cache/relmapper.c | 72 +++---
src/bin/pg_dump/pg_dump.c | 28 +--
src/bin/pg_rewind/datapagemap.h | 2 +-
src/bin/pg_rewind/filemap.c | 34 +--
src/bin/pg_rewind/filemap.h | 4 +-
src/bin/pg_rewind/parsexlog.c | 10 +-
src/bin/pg_rewind/pg_rewind.h | 2 +-
src/bin/pg_upgrade/Makefile | 2 +-
src/bin/pg_upgrade/info.c | 10 +-
src/bin/pg_upgrade/pg_upgrade.h | 6 +-
src/bin/pg_upgrade/relfilenode.c | 259 ---------------------
src/bin/pg_upgrade/relfilenumber.c | 259 +++++++++++++++++++++
src/bin/pg_waldump/pg_waldump.c | 26 +--
src/common/relpath.c | 48 ++--
src/include/access/brin_xlog.h | 2 +-
src/include/access/ginxlog.h | 4 +-
src/include/access/gistxlog.h | 2 +-
src/include/access/heapam_xlog.h | 8 +-
src/include/access/nbtxlog.h | 4 +-
src/include/access/rewriteheap.h | 6 +-
src/include/access/tableam.h | 59 ++---
src/include/access/xact.h | 26 +--
src/include/access/xlog_internal.h | 2 +-
src/include/access/xloginsert.h | 8 +-
src/include/access/xlogreader.h | 6 +-
src/include/access/xlogrecord.h | 8 +-
src/include/access/xlogutils.h | 8 +-
src/include/catalog/binary_upgrade.h | 6 +-
src/include/catalog/catalog.h | 2 +-
src/include/catalog/heap.h | 2 +-
src/include/catalog/index.h | 2 +-
src/include/catalog/storage.h | 10 +-
src/include/catalog/storage_xlog.h | 8 +-
src/include/commands/sequence.h | 4 +-
src/include/commands/tablecmds.h | 2 +-
src/include/commands/tablespace.h | 2 +-
src/include/common/relpath.h | 24 +-
src/include/nodes/parsenodes.h | 8 +-
src/include/postmaster/bgwriter.h | 2 +-
src/include/replication/reorderbuffer.h | 6 +-
src/include/storage/buf_internals.h | 20 +-
src/include/storage/bufmgr.h | 16 +-
src/include/storage/freespace.h | 4 +-
src/include/storage/md.h | 6 +-
src/include/storage/relfilelocator.h | 99 +++++++++
src/include/storage/relfilenode.h | 99 ---------
src/include/storage/sinval.h | 4 +-
src/include/storage/smgr.h | 12 +-
src/include/storage/standby.h | 6 +-
src/include/storage/sync.h | 4 +-
src/include/utils/inval.h | 4 +-
src/include/utils/rel.h | 44 ++--
src/include/utils/relcache.h | 8 +-
src/include/utils/relfilenodemap.h | 18 --
src/include/utils/relfilenumbermap.h | 18 ++
src/include/utils/relmapper.h | 10 +-
src/test/recovery/t/018_wal_optimize.pl | 2 +-
src/tools/pgindent/typedefs.list | 10 +-
140 files changed, 2022 insertions(+), 2010 deletions(-)
delete mode 100644 src/backend/utils/cache/relfilenodemap.c
create mode 100644 src/backend/utils/cache/relfilenumbermap.c
delete mode 100644 src/bin/pg_upgrade/relfilenode.c
create mode 100644 src/bin/pg_upgrade/relfilenumber.c
create mode 100644 src/include/storage/relfilelocator.h
delete mode 100644 src/include/storage/relfilenode.h
delete mode 100644 src/include/utils/relfilenodemap.h
create mode 100644 src/include/utils/relfilenumbermap.h
diff --git a/contrib/bloom/blinsert.c b/contrib/bloom/blinsert.c
index 82378db..e64291e 100644
--- a/contrib/bloom/blinsert.c
+++ b/contrib/bloom/blinsert.c
@@ -179,7 +179,7 @@ blbuildempty(Relation index)
PageSetChecksumInplace(metapage, BLOOM_METAPAGE_BLKNO);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, BLOOM_METAPAGE_BLKNO,
(char *) metapage, true);
- log_newpage(&(RelationGetSmgr(index))->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&(RelationGetSmgr(index))->smgr_rlocator.locator, INIT_FORKNUM,
BLOOM_METAPAGE_BLKNO, metapage, true);
/*
diff --git a/contrib/oid2name/oid2name.c b/contrib/oid2name/oid2name.c
index a62a5ee..2e08bc7 100644
--- a/contrib/oid2name/oid2name.c
+++ b/contrib/oid2name/oid2name.c
@@ -30,7 +30,7 @@ struct options
{
eary *tables;
eary *oids;
- eary *filenodes;
+ eary *filenumbers;
bool quiet;
bool systables;
@@ -125,9 +125,9 @@ get_opts(int argc, char **argv, struct options *my_opts)
my_opts->dbname = pg_strdup(optarg);
break;
- /* specify one filenode to show */
+ /* specify one filenumber to show */
case 'f':
- add_one_elt(optarg, my_opts->filenodes);
+ add_one_elt(optarg, my_opts->filenumbers);
break;
/* host to connect to */
@@ -494,7 +494,7 @@ sql_exec_dumpalltables(PGconn *conn, struct options *opts)
}
/*
- * Show oid, filenode, name, schema and tablespace for each of the
+ * Show oid, filenumber, name, schema and tablespace for each of the
* given objects in the current database.
*/
void
@@ -504,19 +504,19 @@ sql_exec_searchtables(PGconn *conn, struct options *opts)
char *qualifiers,
*ptr;
char *comma_oids,
- *comma_filenodes,
+ *comma_filenumbers,
*comma_tables;
bool written = false;
char *addfields = ",c.oid AS \"Oid\", nspname AS \"Schema\", spcname as \"Tablespace\" ";
- /* get tables qualifiers, whether names, filenodes, or OIDs */
+ /* get tables qualifiers, whether names, filenumbers, or OIDs */
comma_oids = get_comma_elts(opts->oids);
comma_tables = get_comma_elts(opts->tables);
- comma_filenodes = get_comma_elts(opts->filenodes);
+ comma_filenumbers = get_comma_elts(opts->filenumbers);
/* 80 extra chars for SQL expression */
qualifiers = (char *) pg_malloc(strlen(comma_oids) + strlen(comma_tables) +
- strlen(comma_filenodes) + 80);
+ strlen(comma_filenumbers) + 80);
ptr = qualifiers;
if (opts->oids->num > 0)
@@ -524,11 +524,11 @@ sql_exec_searchtables(PGconn *conn, struct options *opts)
ptr += sprintf(ptr, "c.oid IN (%s)", comma_oids);
written = true;
}
- if (opts->filenodes->num > 0)
+ if (opts->filenumbers->num > 0)
{
if (written)
ptr += sprintf(ptr, " OR ");
- ptr += sprintf(ptr, "pg_catalog.pg_relation_filenode(c.oid) IN (%s)", comma_filenodes);
+ ptr += sprintf(ptr, "pg_catalog.pg_relation_filenode(c.oid) IN (%s)", comma_filenumbers);
written = true;
}
if (opts->tables->num > 0)
@@ -539,7 +539,7 @@ sql_exec_searchtables(PGconn *conn, struct options *opts)
}
free(comma_oids);
free(comma_tables);
- free(comma_filenodes);
+ free(comma_filenumbers);
/* now build the query */
todo = psprintf("SELECT pg_catalog.pg_relation_filenode(c.oid) as \"Filenode\", relname as \"Table Name\" %s\n"
@@ -588,11 +588,11 @@ main(int argc, char **argv)
my_opts->oids = (eary *) pg_malloc(sizeof(eary));
my_opts->tables = (eary *) pg_malloc(sizeof(eary));
- my_opts->filenodes = (eary *) pg_malloc(sizeof(eary));
+ my_opts->filenumbers = (eary *) pg_malloc(sizeof(eary));
my_opts->oids->num = my_opts->oids->alloc = 0;
my_opts->tables->num = my_opts->tables->alloc = 0;
- my_opts->filenodes->num = my_opts->filenodes->alloc = 0;
+ my_opts->filenumbers->num = my_opts->filenumbers->alloc = 0;
/* parse the opts */
get_opts(argc, argv, my_opts);
@@ -618,7 +618,7 @@ main(int argc, char **argv)
/* display the given elements in the database */
if (my_opts->oids->num > 0 ||
my_opts->tables->num > 0 ||
- my_opts->filenodes->num > 0)
+ my_opts->filenumbers->num > 0)
{
if (!my_opts->quiet)
printf("From database \"%s\":\n", my_opts->dbname);
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 1bd579f..7899b5b 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -26,7 +26,7 @@ PG_MODULE_MAGIC;
typedef struct
{
uint32 bufferid;
- Oid relfilenode;
+ Oid relfilenumber;
Oid reltablespace;
Oid reldatabase;
ForkNumber forknum;
@@ -102,7 +102,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
tupledesc = CreateTemplateTupleDesc(expected_tupledesc->natts);
TupleDescInitEntry(tupledesc, (AttrNumber) 1, "bufferid",
INT4OID, -1, 0);
- TupleDescInitEntry(tupledesc, (AttrNumber) 2, "relfilenode",
+ TupleDescInitEntry(tupledesc, (AttrNumber) 2, "relfilenumber",
OIDOID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 3, "reltablespace",
OIDOID, -1, 0);
@@ -153,9 +153,9 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenode = bufHdr->tag.rnode.relNode;
- fctx->record[i].reltablespace = bufHdr->tag.rnode.spcNode;
- fctx->record[i].reldatabase = bufHdr->tag.rnode.dbNode;
+ fctx->record[i].relfilenumber = bufHdr->tag.rlocator.relNumber;
+ fctx->record[i].reltablespace = bufHdr->tag.rlocator.spcOid;
+ fctx->record[i].reldatabase = bufHdr->tag.rlocator.dbOid;
fctx->record[i].forknum = bufHdr->tag.forkNum;
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
@@ -209,7 +209,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
}
else
{
- values[1] = ObjectIdGetDatum(fctx->record[i].relfilenode);
+ values[1] = ObjectIdGetDatum(fctx->record[i].relfilenumber);
nulls[1] = false;
values[2] = ObjectIdGetDatum(fctx->record[i].reltablespace);
nulls[2] = false;
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index c0c4f5d..667f8d7 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -52,7 +52,7 @@
#include "utils/guc.h"
#include "utils/memutils.h"
#include "utils/rel.h"
-#include "utils/relfilenodemap.h"
+#include "utils/relfilenumbermap.h"
#include "utils/resowner.h"
#define AUTOPREWARM_FILE "autoprewarm.blocks"
@@ -62,7 +62,7 @@ typedef struct BlockInfoRecord
{
Oid database;
Oid tablespace;
- Oid filenode;
+ Oid filenumber;
ForkNumber forknum;
BlockNumber blocknum;
} BlockInfoRecord;
@@ -347,7 +347,7 @@ apw_load_buffers(void)
unsigned forknum;
if (fscanf(file, "%u,%u,%u,%u,%u\n", &blkinfo[i].database,
- &blkinfo[i].tablespace, &blkinfo[i].filenode,
+ &blkinfo[i].tablespace, &blkinfo[i].filenumber,
&forknum, &blkinfo[i].blocknum) != 5)
ereport(ERROR,
(errmsg("autoprewarm block dump file is corrupted at line %d",
@@ -494,7 +494,7 @@ autoprewarm_database_main(Datum main_arg)
* relation. Note that rel will be NULL if try_relation_open failed
* previously; in that case, there is nothing to close.
*/
- if (old_blk != NULL && old_blk->filenode != blk->filenode &&
+ if (old_blk != NULL && old_blk->filenumber != blk->filenumber &&
rel != NULL)
{
relation_close(rel, AccessShareLock);
@@ -506,13 +506,13 @@ autoprewarm_database_main(Datum main_arg)
* Try to open each new relation, but only once, when we first
* encounter it. If it's been dropped, skip the associated blocks.
*/
- if (old_blk == NULL || old_blk->filenode != blk->filenode)
+ if (old_blk == NULL || old_blk->filenumber != blk->filenumber)
{
Oid reloid;
Assert(rel == NULL);
StartTransactionCommand();
- reloid = RelidByRelfilenode(blk->tablespace, blk->filenode);
+ reloid = RelidByRelfilenumber(blk->tablespace, blk->filenumber);
if (OidIsValid(reloid))
rel = try_relation_open(reloid, AccessShareLock);
@@ -527,7 +527,7 @@ autoprewarm_database_main(Datum main_arg)
/* Once per fork, check for fork existence and size. */
if (old_blk == NULL ||
- old_blk->filenode != blk->filenode ||
+ old_blk->filenumber != blk->filenumber ||
old_blk->forknum != blk->forknum)
{
/*
@@ -631,9 +631,9 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
if (buf_state & BM_TAG_VALID &&
((buf_state & BM_PERMANENT) || dump_unlogged))
{
- block_info_array[num_blocks].database = bufHdr->tag.rnode.dbNode;
- block_info_array[num_blocks].tablespace = bufHdr->tag.rnode.spcNode;
- block_info_array[num_blocks].filenode = bufHdr->tag.rnode.relNode;
+ block_info_array[num_blocks].database = bufHdr->tag.rlocator.dbOid;
+ block_info_array[num_blocks].tablespace = bufHdr->tag.rlocator.spcOid;
+ block_info_array[num_blocks].filenumber = bufHdr->tag.rlocator.relNumber;
block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
@@ -671,7 +671,7 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
ret = fprintf(file, "%u,%u,%u,%u,%u\n",
block_info_array[i].database,
block_info_array[i].tablespace,
- block_info_array[i].filenode,
+ block_info_array[i].filenumber,
(uint32) block_info_array[i].forknum,
block_info_array[i].blocknum);
if (ret < 0)
@@ -900,7 +900,7 @@ do { \
* We depend on all records for a particular database being consecutive
* in the dump file; each per-database worker will preload blocks until
* it sees a block for some other database. Sorting by tablespace,
- * filenode, forknum, and blocknum isn't critical for correctness, but
+ * filenumber, forknum, and blocknum isn't critical for correctness, but
* helps us get a sequential I/O pattern.
*/
static int
@@ -911,7 +911,7 @@ apw_compare_blockinfo(const void *p, const void *q)
cmp_member_elem(database);
cmp_member_elem(tablespace);
- cmp_member_elem(filenode);
+ cmp_member_elem(filenumber);
cmp_member_elem(forknum);
cmp_member_elem(blocknum);
diff --git a/contrib/pg_visibility/pg_visibility.c b/contrib/pg_visibility/pg_visibility.c
index 1853c35..4e2e9ea 100644
--- a/contrib/pg_visibility/pg_visibility.c
+++ b/contrib/pg_visibility/pg_visibility.c
@@ -407,7 +407,7 @@ pg_truncate_visibility_map(PG_FUNCTION_ARGS)
xl_smgr_truncate xlrec;
xlrec.blkno = 0;
- xlrec.rnode = rel->rd_node;
+ xlrec.rlocator = rel->rd_locator;
xlrec.flags = SMGR_TRUNCATE_VM;
XLogBeginInsert();
diff --git a/src/backend/access/common/syncscan.c b/src/backend/access/common/syncscan.c
index d5b16c5..e3add81 100644
--- a/src/backend/access/common/syncscan.c
+++ b/src/backend/access/common/syncscan.c
@@ -90,7 +90,7 @@ bool trace_syncscan = false;
*/
typedef struct ss_scan_location_t
{
- RelFileNode relfilenode; /* identity of a relation */
+ RelFileLocator relfilelocator; /* identity of a relation */
BlockNumber location; /* last-reported location in the relation */
} ss_scan_location_t;
@@ -115,7 +115,7 @@ typedef struct ss_scan_locations_t
static ss_scan_locations_t *scan_locations;
/* prototypes for internal functions */
-static BlockNumber ss_search(RelFileNode relfilenode,
+static BlockNumber ss_search(RelFileLocator relfilelocator,
BlockNumber location, bool set);
@@ -159,9 +159,9 @@ SyncScanShmemInit(void)
* these invalid entries will fall off the LRU list and get
* replaced with real entries.
*/
- item->location.relfilenode.spcNode = InvalidOid;
- item->location.relfilenode.dbNode = InvalidOid;
- item->location.relfilenode.relNode = InvalidOid;
+ item->location.relfilelocator.spcOid = InvalidOid;
+ item->location.relfilelocator.dbOid = InvalidOid;
+ item->location.relfilelocator.relNumber = InvalidOid;
item->location.location = InvalidBlockNumber;
item->prev = (i > 0) ?
@@ -176,10 +176,10 @@ SyncScanShmemInit(void)
/*
* ss_search --- search the scan_locations structure for an entry with the
- * given relfilenode.
+ * given relfilelocator.
*
* If "set" is true, the location is updated to the given location. If no
- * entry for the given relfilenode is found, it will be created at the head
+ * entry for the given relfilelocator is found, it will be created at the head
* of the list with the given location, even if "set" is false.
*
* In any case, the location after possible update is returned.
@@ -188,7 +188,7 @@ SyncScanShmemInit(void)
* data structure.
*/
static BlockNumber
-ss_search(RelFileNode relfilenode, BlockNumber location, bool set)
+ss_search(RelFileLocator relfilelocator, BlockNumber location, bool set)
{
ss_lru_item_t *item;
@@ -197,7 +197,8 @@ ss_search(RelFileNode relfilenode, BlockNumber location, bool set)
{
bool match;
- match = RelFileNodeEquals(item->location.relfilenode, relfilenode);
+ match = RelFileLocatorEquals(item->location.relfilelocator,
+ relfilelocator);
if (match || item->next == NULL)
{
@@ -207,7 +208,7 @@ ss_search(RelFileNode relfilenode, BlockNumber location, bool set)
*/
if (!match)
{
- item->location.relfilenode = relfilenode;
+ item->location.relfilelocator = relfilelocator;
item->location.location = location;
}
else if (set)
@@ -255,7 +256,7 @@ ss_get_location(Relation rel, BlockNumber relnblocks)
BlockNumber startloc;
LWLockAcquire(SyncScanLock, LW_EXCLUSIVE);
- startloc = ss_search(rel->rd_node, 0, false);
+ startloc = ss_search(rel->rd_locator, 0, false);
LWLockRelease(SyncScanLock);
/*
@@ -281,8 +282,8 @@ ss_get_location(Relation rel, BlockNumber relnblocks)
* ss_report_location --- update the current scan location
*
* Writes an entry into the shared Sync Scan state of the form
- * (relfilenode, blocknumber), overwriting any existing entry for the
- * same relfilenode.
+ * (relfilelocator, blocknumber), overwriting any existing entry for the
+ * same relfilelocator.
*/
void
ss_report_location(Relation rel, BlockNumber location)
@@ -309,7 +310,7 @@ ss_report_location(Relation rel, BlockNumber location)
{
if (LWLockConditionalAcquire(SyncScanLock, LW_EXCLUSIVE))
{
- (void) ss_search(rel->rd_node, location, true);
+ (void) ss_search(rel->rd_locator, location, true);
LWLockRelease(SyncScanLock);
}
#ifdef TRACE_SYNCSCAN
diff --git a/src/backend/access/gin/ginbtree.c b/src/backend/access/gin/ginbtree.c
index cc6d4e6..c75bfc2 100644
--- a/src/backend/access/gin/ginbtree.c
+++ b/src/backend/access/gin/ginbtree.c
@@ -470,7 +470,7 @@ ginPlaceToPage(GinBtree btree, GinBtreeStack *stack,
savedRightLink = GinPageGetOpaque(page)->rightlink;
/* Begin setting up WAL record */
- data.node = btree->index->rd_node;
+ data.locator = btree->index->rd_locator;
data.flags = xlflags;
if (BufferIsValid(childbuf))
{
diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
index 7409fdc..6c67744 100644
--- a/src/backend/access/gin/ginfast.c
+++ b/src/backend/access/gin/ginfast.c
@@ -235,7 +235,7 @@ ginHeapTupleFastInsert(GinState *ginstate, GinTupleCollector *collector)
needWal = RelationNeedsWAL(index);
- data.node = index->rd_node;
+ data.locator = index->rd_locator;
data.ntuples = 0;
data.newRightlink = data.prevTail = InvalidBlockNumber;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 20f4706..6df7f2e 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -688,7 +688,7 @@ ginUpdateStats(Relation index, const GinStatsData *stats, bool is_build)
XLogRecPtr recptr;
ginxlogUpdateMeta data;
- data.node = index->rd_node;
+ data.locator = index->rd_locator;
data.ntuples = 0;
data.newRightlink = data.prevTail = InvalidBlockNumber;
memcpy(&data.metadata, metadata, sizeof(GinMetaPageData));
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 87e8366..41b9211 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -95,13 +95,13 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
if (PageAddItem(page, (Item) itup, IndexTupleSize(itup), offset, false, false) == InvalidOffsetNumber)
{
- RelFileNode node;
+ RelFileLocator locator;
ForkNumber forknum;
BlockNumber blknum;
- BufferGetTag(buffer, &node, &forknum, &blknum);
+ BufferGetTag(buffer, &locator, &forknum, &blknum);
elog(ERROR, "failed to add item to index page in %u/%u/%u",
- node.spcNode, node.dbNode, node.relNode);
+ locator.spcOid, locator.dbOid, locator.relNumber);
}
}
diff --git a/src/backend/access/gist/gistbuild.c b/src/backend/access/gist/gistbuild.c
index f5a5caf..374e64e 100644
--- a/src/backend/access/gist/gistbuild.c
+++ b/src/backend/access/gist/gistbuild.c
@@ -462,7 +462,7 @@ gist_indexsortbuild(GISTBuildState *state)
smgrwrite(RelationGetSmgr(state->indexrel), MAIN_FORKNUM, GIST_ROOT_BLKNO,
levelstate->pages[0], true);
if (RelationNeedsWAL(state->indexrel))
- log_newpage(&state->indexrel->rd_node, MAIN_FORKNUM, GIST_ROOT_BLKNO,
+ log_newpage(&state->indexrel->rd_locator, MAIN_FORKNUM, GIST_ROOT_BLKNO,
levelstate->pages[0], true);
pfree(levelstate->pages[0]);
@@ -663,7 +663,7 @@ gist_indexsortbuild_flush_ready_pages(GISTBuildState *state)
}
if (RelationNeedsWAL(state->indexrel))
- log_newpages(&state->indexrel->rd_node, MAIN_FORKNUM, state->ready_num_pages,
+ log_newpages(&state->indexrel->rd_locator, MAIN_FORKNUM, state->ready_num_pages,
state->ready_blknos, state->ready_pages, true);
for (int i = 0; i < state->ready_num_pages; i++)
diff --git a/src/backend/access/gist/gistxlog.c b/src/backend/access/gist/gistxlog.c
index df70f90..b4f629f 100644
--- a/src/backend/access/gist/gistxlog.c
+++ b/src/backend/access/gist/gistxlog.c
@@ -191,11 +191,12 @@ gistRedoDeleteRecord(XLogReaderState *record)
*/
if (InHotStandby)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, NULL);
- ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid, rnode);
+ ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid,
+ rlocator);
}
if (XLogReadBufferForRedo(record, 0, &buffer) == BLK_NEEDS_REDO)
@@ -395,7 +396,7 @@ gistRedoPageReuse(XLogReaderState *record)
*/
if (InHotStandby)
ResolveRecoveryConflictWithSnapshotFullXid(xlrec->latestRemovedFullXid,
- xlrec->node);
+ xlrec->locator);
}
void
@@ -607,7 +608,7 @@ gistXLogPageReuse(Relation rel, BlockNumber blkno, FullTransactionId latestRemov
*/
/* XLOG stuff */
- xlrec_reuse.node = rel->rd_node;
+ xlrec_reuse.locator = rel->rd_locator;
xlrec_reuse.block = blkno;
xlrec_reuse.latestRemovedFullXid = latestRemovedXid;
diff --git a/src/backend/access/hash/hash_xlog.c b/src/backend/access/hash/hash_xlog.c
index 62dbfc3..2e68303 100644
--- a/src/backend/access/hash/hash_xlog.c
+++ b/src/backend/access/hash/hash_xlog.c
@@ -999,10 +999,10 @@ hash_xlog_vacuum_one_page(XLogReaderState *record)
*/
if (InHotStandby)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
- ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid, rnode);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, NULL);
+ ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid, rlocator);
}
action = XLogReadBufferForRedoExtended(record, 0, RBM_NORMAL, true, &buffer);
diff --git a/src/backend/access/hash/hashpage.c b/src/backend/access/hash/hashpage.c
index 39206d1..d2edcd4 100644
--- a/src/backend/access/hash/hashpage.c
+++ b/src/backend/access/hash/hashpage.c
@@ -428,7 +428,7 @@ _hash_init(Relation rel, double num_tuples, ForkNumber forkNum)
MarkBufferDirty(buf);
if (use_wal)
- log_newpage(&rel->rd_node,
+ log_newpage(&rel->rd_locator,
forkNum,
blkno,
BufferGetPage(buf),
@@ -1019,7 +1019,7 @@ _hash_alloc_buckets(Relation rel, BlockNumber firstblock, uint32 nblocks)
ovflopaque->hasho_page_id = HASHO_PAGE_ID;
if (RelationNeedsWAL(rel))
- log_newpage(&rel->rd_node,
+ log_newpage(&rel->rd_locator,
MAIN_FORKNUM,
lastblock,
zerobuf.data,
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 637de11..aab8d6f 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -8189,7 +8189,7 @@ log_heap_freeze(Relation reln, Buffer buffer, TransactionId cutoff_xid,
* heap_buffer, if necessary.
*/
XLogRecPtr
-log_heap_visible(RelFileNode rnode, Buffer heap_buffer, Buffer vm_buffer,
+log_heap_visible(RelFileLocator rlocator, Buffer heap_buffer, Buffer vm_buffer,
TransactionId cutoff_xid, uint8 vmflags)
{
xl_heap_visible xlrec;
@@ -8454,7 +8454,7 @@ log_heap_new_cid(Relation relation, HeapTuple tup)
Assert(tup->t_tableOid != InvalidOid);
xlrec.top_xid = GetTopTransactionId();
- xlrec.target_node = relation->rd_node;
+ xlrec.target_locator = relation->rd_locator;
xlrec.target_tid = tup->t_self;
/*
@@ -8623,18 +8623,18 @@ heap_xlog_prune(XLogReaderState *record)
XLogRecPtr lsn = record->EndRecPtr;
xl_heap_prune *xlrec = (xl_heap_prune *) XLogRecGetData(record);
Buffer buffer;
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber blkno;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &blkno);
/*
* We're about to remove tuples. In Hot Standby mode, ensure that there's
* no queries running for which the removed tuples are still visible.
*/
if (InHotStandby)
- ResolveRecoveryConflictWithSnapshot(xlrec->latestRemovedXid, rnode);
+ ResolveRecoveryConflictWithSnapshot(xlrec->latestRemovedXid, rlocator);
/*
* If we have a full-page image, restore it (using a cleanup lock) and
@@ -8694,7 +8694,7 @@ heap_xlog_prune(XLogReaderState *record)
* Do this regardless of a full-page image being applied, since the
* FSM data is not in the page anyway.
*/
- XLogRecordPageWithFreeSpace(rnode, blkno, freespace);
+ XLogRecordPageWithFreeSpace(rlocator, blkno, freespace);
}
}
@@ -8751,9 +8751,9 @@ heap_xlog_vacuum(XLogReaderState *record)
if (BufferIsValid(buffer))
{
Size freespace = PageGetHeapFreeSpace(BufferGetPage(buffer));
- RelFileNode rnode;
+ RelFileLocator rlocator;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &blkno);
UnlockReleaseBuffer(buffer);
@@ -8766,7 +8766,7 @@ heap_xlog_vacuum(XLogReaderState *record)
* Do this regardless of a full-page image being applied, since the
* FSM data is not in the page anyway.
*/
- XLogRecordPageWithFreeSpace(rnode, blkno, freespace);
+ XLogRecordPageWithFreeSpace(rlocator, blkno, freespace);
}
}
@@ -8786,11 +8786,11 @@ heap_xlog_visible(XLogReaderState *record)
Buffer vmbuffer = InvalidBuffer;
Buffer buffer;
Page page;
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber blkno;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 1, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 1, &rlocator, NULL, &blkno);
/*
* If there are any Hot Standby transactions running that have an xmin
@@ -8802,7 +8802,7 @@ heap_xlog_visible(XLogReaderState *record)
* rather than killing the transaction outright.
*/
if (InHotStandby)
- ResolveRecoveryConflictWithSnapshot(xlrec->cutoff_xid, rnode);
+ ResolveRecoveryConflictWithSnapshot(xlrec->cutoff_xid, rlocator);
/*
* Read the heap page, if it still exists. If the heap file has dropped or
@@ -8865,7 +8865,7 @@ heap_xlog_visible(XLogReaderState *record)
* FSM data is not in the page anyway.
*/
if (xlrec->flags & VISIBILITYMAP_VALID_BITS)
- XLogRecordPageWithFreeSpace(rnode, blkno, space);
+ XLogRecordPageWithFreeSpace(rlocator, blkno, space);
}
/*
@@ -8890,7 +8890,7 @@ heap_xlog_visible(XLogReaderState *record)
*/
LockBuffer(vmbuffer, BUFFER_LOCK_UNLOCK);
- reln = CreateFakeRelcacheEntry(rnode);
+ reln = CreateFakeRelcacheEntry(rlocator);
visibilitymap_pin(reln, blkno, &vmbuffer);
/*
@@ -8933,13 +8933,13 @@ heap_xlog_freeze_page(XLogReaderState *record)
*/
if (InHotStandby)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
TransactionId latestRemovedXid = cutoff_xid;
TransactionIdRetreat(latestRemovedXid);
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
- ResolveRecoveryConflictWithSnapshot(latestRemovedXid, rnode);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, NULL);
+ ResolveRecoveryConflictWithSnapshot(latestRemovedXid, rlocator);
}
if (XLogReadBufferForRedo(record, 0, &buffer) == BLK_NEEDS_REDO)
@@ -9007,10 +9007,10 @@ heap_xlog_delete(XLogReaderState *record)
ItemId lp = NULL;
HeapTupleHeader htup;
BlockNumber blkno;
- RelFileNode target_node;
+ RelFileLocator target_locator;
ItemPointerData target_tid;
- XLogRecGetBlockTag(record, 0, &target_node, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &target_locator, NULL, &blkno);
ItemPointerSetBlockNumber(&target_tid, blkno);
ItemPointerSetOffsetNumber(&target_tid, xlrec->offnum);
@@ -9020,7 +9020,7 @@ heap_xlog_delete(XLogReaderState *record)
*/
if (xlrec->flags & XLH_DELETE_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(target_node);
+ Relation reln = CreateFakeRelcacheEntry(target_locator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, blkno, &vmbuffer);
@@ -9086,12 +9086,12 @@ heap_xlog_insert(XLogReaderState *record)
xl_heap_header xlhdr;
uint32 newlen;
Size freespace = 0;
- RelFileNode target_node;
+ RelFileLocator target_locator;
BlockNumber blkno;
ItemPointerData target_tid;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 0, &target_node, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &target_locator, NULL, &blkno);
ItemPointerSetBlockNumber(&target_tid, blkno);
ItemPointerSetOffsetNumber(&target_tid, xlrec->offnum);
@@ -9101,7 +9101,7 @@ heap_xlog_insert(XLogReaderState *record)
*/
if (xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(target_node);
+ Relation reln = CreateFakeRelcacheEntry(target_locator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, blkno, &vmbuffer);
@@ -9184,7 +9184,7 @@ heap_xlog_insert(XLogReaderState *record)
* totally accurate anyway.
*/
if (action == BLK_NEEDS_REDO && freespace < BLCKSZ / 5)
- XLogRecordPageWithFreeSpace(target_node, blkno, freespace);
+ XLogRecordPageWithFreeSpace(target_locator, blkno, freespace);
}
/*
@@ -9195,7 +9195,7 @@ heap_xlog_multi_insert(XLogReaderState *record)
{
XLogRecPtr lsn = record->EndRecPtr;
xl_heap_multi_insert *xlrec;
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber blkno;
Buffer buffer;
Page page;
@@ -9217,7 +9217,7 @@ heap_xlog_multi_insert(XLogReaderState *record)
*/
xlrec = (xl_heap_multi_insert *) XLogRecGetData(record);
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &blkno);
/* check that the mutually exclusive flags are not both set */
Assert(!((xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED) &&
@@ -9229,7 +9229,7 @@ heap_xlog_multi_insert(XLogReaderState *record)
*/
if (xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(rnode);
+ Relation reln = CreateFakeRelcacheEntry(rlocator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, blkno, &vmbuffer);
@@ -9331,7 +9331,7 @@ heap_xlog_multi_insert(XLogReaderState *record)
* totally accurate anyway.
*/
if (action == BLK_NEEDS_REDO && freespace < BLCKSZ / 5)
- XLogRecordPageWithFreeSpace(rnode, blkno, freespace);
+ XLogRecordPageWithFreeSpace(rlocator, blkno, freespace);
}
/*
@@ -9342,7 +9342,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
{
XLogRecPtr lsn = record->EndRecPtr;
xl_heap_update *xlrec = (xl_heap_update *) XLogRecGetData(record);
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber oldblk;
BlockNumber newblk;
ItemPointerData newtid;
@@ -9371,7 +9371,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
oldtup.t_data = NULL;
oldtup.t_len = 0;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &newblk);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &newblk);
if (XLogRecGetBlockTagExtended(record, 1, NULL, NULL, &oldblk, NULL))
{
/* HOT updates are never done across pages */
@@ -9388,7 +9388,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
*/
if (xlrec->flags & XLH_UPDATE_OLD_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(rnode);
+ Relation reln = CreateFakeRelcacheEntry(rlocator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, oldblk, &vmbuffer);
@@ -9472,7 +9472,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
*/
if (xlrec->flags & XLH_UPDATE_NEW_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(rnode);
+ Relation reln = CreateFakeRelcacheEntry(rlocator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, newblk, &vmbuffer);
@@ -9606,7 +9606,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
* totally accurate anyway.
*/
if (newaction == BLK_NEEDS_REDO && !hot_update && freespace < BLCKSZ / 5)
- XLogRecordPageWithFreeSpace(rnode, newblk, freespace);
+ XLogRecordPageWithFreeSpace(rlocator, newblk, freespace);
}
static void
@@ -9662,13 +9662,13 @@ heap_xlog_lock(XLogReaderState *record)
*/
if (xlrec->flags & XLH_LOCK_ALL_FROZEN_CLEARED)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
Buffer vmbuffer = InvalidBuffer;
BlockNumber block;
Relation reln;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &block);
- reln = CreateFakeRelcacheEntry(rnode);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &block);
+ reln = CreateFakeRelcacheEntry(rlocator);
visibilitymap_pin(reln, block, &vmbuffer);
visibilitymap_clear(reln, block, vmbuffer, VISIBILITYMAP_ALL_FROZEN);
@@ -9735,13 +9735,13 @@ heap_xlog_lock_updated(XLogReaderState *record)
*/
if (xlrec->flags & XLH_LOCK_ALL_FROZEN_CLEARED)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
Buffer vmbuffer = InvalidBuffer;
BlockNumber block;
Relation reln;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &block);
- reln = CreateFakeRelcacheEntry(rnode);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &block);
+ reln = CreateFakeRelcacheEntry(rlocator);
visibilitymap_pin(reln, block, &vmbuffer);
visibilitymap_clear(reln, block, vmbuffer, VISIBILITYMAP_ALL_FROZEN);
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 444f027..7f227be 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -566,11 +566,11 @@ tuple_lock_retry:
*/
static void
-heapam_relation_set_new_filenode(Relation rel,
- const RelFileNode *newrnode,
- char persistence,
- TransactionId *freezeXid,
- MultiXactId *minmulti)
+heapam_relation_set_new_filelocator(Relation rel,
+ const RelFileLocator *newrlocator,
+ char persistence,
+ TransactionId *freezeXid,
+ MultiXactId *minmulti)
{
SMgrRelation srel;
@@ -591,7 +591,7 @@ heapam_relation_set_new_filenode(Relation rel,
*/
*minmulti = GetOldestMultiXactId();
- srel = RelationCreateStorage(*newrnode, persistence, true);
+ srel = RelationCreateStorage(*newrlocator, persistence, true);
/*
* If required, set up an init fork for an unlogged table so that it can
@@ -608,7 +608,7 @@ heapam_relation_set_new_filenode(Relation rel,
rel->rd_rel->relkind == RELKIND_MATVIEW ||
rel->rd_rel->relkind == RELKIND_TOASTVALUE);
smgrcreate(srel, INIT_FORKNUM, false);
- log_smgrcreate(newrnode, INIT_FORKNUM);
+ log_smgrcreate(newrlocator, INIT_FORKNUM);
smgrimmedsync(srel, INIT_FORKNUM);
}
@@ -622,11 +622,11 @@ heapam_relation_nontransactional_truncate(Relation rel)
}
static void
-heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
+heapam_relation_copy_data(Relation rel, const RelFileLocator *newrlocator)
{
SMgrRelation dstrel;
- dstrel = smgropen(*newrnode, rel->rd_backend);
+ dstrel = smgropen(*newrlocator, rel->rd_backend);
/*
* Since we copy the file directly without looking at the shared buffers,
@@ -640,10 +640,10 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
* Create and copy all forks of the relation, and schedule unlinking of
* old physical files.
*
- * NOTE: any conflict in relfilenode value will be caught in
+ * NOTE: any conflict in relfilelocator value will be caught in
* RelationCreateStorage().
*/
- RelationCreateStorage(*newrnode, rel->rd_rel->relpersistence, true);
+ RelationCreateStorage(*newrlocator, rel->rd_rel->relpersistence, true);
/* copy main fork */
RelationCopyStorage(RelationGetSmgr(rel), dstrel, MAIN_FORKNUM,
@@ -664,7 +664,7 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
if (RelationIsPermanent(rel) ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
- log_smgrcreate(newrnode, forkNum);
+ log_smgrcreate(newrlocator, forkNum);
RelationCopyStorage(RelationGetSmgr(rel), dstrel, forkNum,
rel->rd_rel->relpersistence);
}
@@ -2569,7 +2569,7 @@ static const TableAmRoutine heapam_methods = {
.tuple_satisfies_snapshot = heapam_tuple_satisfies_snapshot,
.index_delete_tuples = heap_index_delete_tuples,
- .relation_set_new_filenode = heapam_relation_set_new_filenode,
+ .relation_set_new_filelocator = heapam_relation_set_new_filelocator,
.relation_nontransactional_truncate = heapam_relation_nontransactional_truncate,
.relation_copy_data = heapam_relation_copy_data,
.relation_copy_for_cluster = heapam_relation_copy_for_cluster,
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 2a53826..197f06b 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -318,7 +318,7 @@ end_heap_rewrite(RewriteState state)
if (state->rs_buffer_valid)
{
if (RelationNeedsWAL(state->rs_new_rel))
- log_newpage(&state->rs_new_rel->rd_node,
+ log_newpage(&state->rs_new_rel->rd_locator,
MAIN_FORKNUM,
state->rs_blockno,
state->rs_buffer,
@@ -679,7 +679,7 @@ raw_heap_insert(RewriteState state, HeapTuple tup)
/* XLOG stuff */
if (RelationNeedsWAL(state->rs_new_rel))
- log_newpage(&state->rs_new_rel->rd_node,
+ log_newpage(&state->rs_new_rel->rd_locator,
MAIN_FORKNUM,
state->rs_blockno,
page,
@@ -742,7 +742,7 @@ raw_heap_insert(RewriteState state, HeapTuple tup)
* When doing logical decoding - which relies on using cmin/cmax of catalog
* tuples, via xl_heap_new_cid records - heap rewrites have to log enough
* information to allow the decoding backend to update its internal mapping
- * of (relfilenode,ctid) => (cmin, cmax) to be correct for the rewritten heap.
+ * of (relfilelocator,ctid) => (cmin, cmax) to be correct for the rewritten heap.
*
* For that, every time we find a tuple that's been modified in a catalog
* relation within the xmin horizon of any decoding slot, we log a mapping
@@ -1080,9 +1080,9 @@ logical_rewrite_heap_tuple(RewriteState state, ItemPointerData old_tid,
return;
/* fill out mapping information */
- map.old_node = state->rs_old_rel->rd_node;
+ map.old_locator = state->rs_old_rel->rd_locator;
map.old_tid = old_tid;
- map.new_node = state->rs_new_rel->rd_node;
+ map.new_locator = state->rs_new_rel->rd_locator;
map.new_tid = new_tid;
/* ---
diff --git a/src/backend/access/heap/visibilitymap.c b/src/backend/access/heap/visibilitymap.c
index e09f25a..ed72eb7 100644
--- a/src/backend/access/heap/visibilitymap.c
+++ b/src/backend/access/heap/visibilitymap.c
@@ -283,7 +283,7 @@ visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf,
if (XLogRecPtrIsInvalid(recptr))
{
Assert(!InRecovery);
- recptr = log_heap_visible(rel->rd_node, heapBuf, vmBuf,
+ recptr = log_heap_visible(rel->rd_locator, heapBuf, vmBuf,
cutoff_xid, flags);
/*
@@ -668,7 +668,7 @@ vm_extend(Relation rel, BlockNumber vm_nblocks)
* to keep checking for creation or extension of the file, which happens
* infrequently.
*/
- CacheInvalidateSmgr(reln->smgr_rnode);
+ CacheInvalidateSmgr(reln->smgr_rlocator);
UnlockRelationForExtension(rel, ExclusiveLock);
}
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 20adb60..8b96708 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -836,7 +836,7 @@ _bt_log_reuse_page(Relation rel, BlockNumber blkno, FullTransactionId safexid)
*/
/* XLOG stuff */
- xlrec_reuse.node = rel->rd_node;
+ xlrec_reuse.locator = rel->rd_locator;
xlrec_reuse.block = blkno;
xlrec_reuse.latestRemovedFullXid = safexid;
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 9b730f3..b52eca8 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -166,7 +166,7 @@ btbuildempty(Relation index)
PageSetChecksumInplace(metapage, BTREE_METAPAGE);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, BTREE_METAPAGE,
(char *) metapage, true);
- log_newpage(&RelationGetSmgr(index)->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&RelationGetSmgr(index)->smgr_rlocator.locator, INIT_FORKNUM,
BTREE_METAPAGE, metapage, true);
/*
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index 9f60fa9..bd1685c 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -647,7 +647,7 @@ _bt_blwritepage(BTWriteState *wstate, Page page, BlockNumber blkno)
if (wstate->btws_use_wal)
{
/* We use the XLOG_FPI record type for this */
- log_newpage(&wstate->index->rd_node, MAIN_FORKNUM, blkno, page, true);
+ log_newpage(&wstate->index->rd_locator, MAIN_FORKNUM, blkno, page, true);
}
/*
diff --git a/src/backend/access/nbtree/nbtxlog.c b/src/backend/access/nbtree/nbtxlog.c
index f9186ca..ad489e3 100644
--- a/src/backend/access/nbtree/nbtxlog.c
+++ b/src/backend/access/nbtree/nbtxlog.c
@@ -664,11 +664,11 @@ btree_xlog_delete(XLogReaderState *record)
*/
if (InHotStandby)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, NULL);
- ResolveRecoveryConflictWithSnapshot(xlrec->latestRemovedXid, rnode);
+ ResolveRecoveryConflictWithSnapshot(xlrec->latestRemovedXid, rlocator);
}
/*
@@ -1006,7 +1006,7 @@ btree_xlog_reuse_page(XLogReaderState *record)
if (InHotStandby)
ResolveRecoveryConflictWithSnapshotFullXid(xlrec->latestRemovedFullXid,
- xlrec->node);
+ xlrec->locator);
}
void
diff --git a/src/backend/access/rmgrdesc/genericdesc.c b/src/backend/access/rmgrdesc/genericdesc.c
index 877beb5..d8509b8 100644
--- a/src/backend/access/rmgrdesc/genericdesc.c
+++ b/src/backend/access/rmgrdesc/genericdesc.c
@@ -15,7 +15,7 @@
#include "access/generic_xlog.h"
#include "lib/stringinfo.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* Description of generic xlog record: write page regions that this record
diff --git a/src/backend/access/rmgrdesc/gindesc.c b/src/backend/access/rmgrdesc/gindesc.c
index 57f7bce..7d147ce 100644
--- a/src/backend/access/rmgrdesc/gindesc.c
+++ b/src/backend/access/rmgrdesc/gindesc.c
@@ -17,7 +17,7 @@
#include "access/ginxlog.h"
#include "access/xlogutils.h"
#include "lib/stringinfo.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
static void
desc_recompress_leaf(StringInfo buf, ginxlogRecompressDataLeaf *insertData)
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index d0c8e24..7dd3c1d 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -16,7 +16,7 @@
#include "access/gistxlog.h"
#include "lib/stringinfo.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
static void
out_gistxlogPageUpdate(StringInfo buf, gistxlogPageUpdate *xlrec)
@@ -27,8 +27,8 @@ static void
out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec)
{
appendStringInfo(buf, "rel %u/%u/%u; blk %u; latestRemovedXid %u:%u",
- xlrec->node.spcNode, xlrec->node.dbNode,
- xlrec->node.relNode, xlrec->block,
+ xlrec->locator.spcOid, xlrec->locator.dbOid,
+ xlrec->locator.relNumber, xlrec->block,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
XidFromFullTransactionId(xlrec->latestRemovedFullXid));
}
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 6238085..923d3bc 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -170,9 +170,9 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
xl_heap_new_cid *xlrec = (xl_heap_new_cid *) rec;
appendStringInfo(buf, "rel %u/%u/%u; tid %u/%u",
- xlrec->target_node.spcNode,
- xlrec->target_node.dbNode,
- xlrec->target_node.relNode,
+ xlrec->target_locator.spcOid,
+ xlrec->target_locator.dbOid,
+ xlrec->target_locator.relNumber,
ItemPointerGetBlockNumber(&(xlrec->target_tid)),
ItemPointerGetOffsetNumber(&(xlrec->target_tid)));
appendStringInfo(buf, "; cmin: %u, cmax: %u, combo: %u",
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index dfbbf4e..4843cd5 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -101,8 +101,8 @@ btree_desc(StringInfo buf, XLogReaderState *record)
xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
appendStringInfo(buf, "rel %u/%u/%u; latestRemovedXid %u:%u",
- xlrec->node.spcNode, xlrec->node.dbNode,
- xlrec->node.relNode,
+ xlrec->locator.spcOid, xlrec->locator.dbOid,
+ xlrec->locator.relNumber,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
XidFromFullTransactionId(xlrec->latestRemovedFullXid));
break;
diff --git a/src/backend/access/rmgrdesc/seqdesc.c b/src/backend/access/rmgrdesc/seqdesc.c
index d9b1e60..b3845f9 100644
--- a/src/backend/access/rmgrdesc/seqdesc.c
+++ b/src/backend/access/rmgrdesc/seqdesc.c
@@ -26,8 +26,8 @@ seq_desc(StringInfo buf, XLogReaderState *record)
if (info == XLOG_SEQ_LOG)
appendStringInfo(buf, "rel %u/%u/%u",
- xlrec->node.spcNode, xlrec->node.dbNode,
- xlrec->node.relNode);
+ xlrec->locator.spcOid, xlrec->locator.dbOid,
+ xlrec->locator.relNumber);
}
const char *
diff --git a/src/backend/access/rmgrdesc/smgrdesc.c b/src/backend/access/rmgrdesc/smgrdesc.c
index 7547813..e0ee8a0 100644
--- a/src/backend/access/rmgrdesc/smgrdesc.c
+++ b/src/backend/access/rmgrdesc/smgrdesc.c
@@ -26,7 +26,7 @@ smgr_desc(StringInfo buf, XLogReaderState *record)
if (info == XLOG_SMGR_CREATE)
{
xl_smgr_create *xlrec = (xl_smgr_create *) rec;
- char *path = relpathperm(xlrec->rnode, xlrec->forkNum);
+ char *path = relpathperm(xlrec->rlocator, xlrec->forkNum);
appendStringInfoString(buf, path);
pfree(path);
@@ -34,7 +34,7 @@ smgr_desc(StringInfo buf, XLogReaderState *record)
else if (info == XLOG_SMGR_TRUNCATE)
{
xl_smgr_truncate *xlrec = (xl_smgr_truncate *) rec;
- char *path = relpathperm(xlrec->rnode, MAIN_FORKNUM);
+ char *path = relpathperm(xlrec->rlocator, MAIN_FORKNUM);
appendStringInfo(buf, "%s to %u blocks flags %d", path,
xlrec->blkno, xlrec->flags);
diff --git a/src/backend/access/rmgrdesc/xactdesc.c b/src/backend/access/rmgrdesc/xactdesc.c
index 90b6ac2..39752cf 100644
--- a/src/backend/access/rmgrdesc/xactdesc.c
+++ b/src/backend/access/rmgrdesc/xactdesc.c
@@ -73,15 +73,15 @@ ParseCommitRecord(uint8 info, xl_xact_commit *xlrec, xl_xact_parsed_commit *pars
data += parsed->nsubxacts * sizeof(TransactionId);
}
- if (parsed->xinfo & XACT_XINFO_HAS_RELFILENODES)
+ if (parsed->xinfo & XACT_XINFO_HAS_RELFILELOCATORS)
{
- xl_xact_relfilenodes *xl_relfilenodes = (xl_xact_relfilenodes *) data;
+ xl_xact_relfilelocators *xl_rellocators = (xl_xact_relfilelocators *) data;
- parsed->nrels = xl_relfilenodes->nrels;
- parsed->xnodes = xl_relfilenodes->xnodes;
+ parsed->nrels = xl_rellocators->nrels;
+ parsed->xlocators = xl_rellocators->xlocators;
- data += MinSizeOfXactRelfilenodes;
- data += xl_relfilenodes->nrels * sizeof(RelFileNode);
+ data += MinSizeOfXactRelfileLocators;
+ data += xl_rellocators->nrels * sizeof(RelFileLocator);
}
if (parsed->xinfo & XACT_XINFO_HAS_DROPPED_STATS)
@@ -179,15 +179,15 @@ ParseAbortRecord(uint8 info, xl_xact_abort *xlrec, xl_xact_parsed_abort *parsed)
data += parsed->nsubxacts * sizeof(TransactionId);
}
- if (parsed->xinfo & XACT_XINFO_HAS_RELFILENODES)
+ if (parsed->xinfo & XACT_XINFO_HAS_RELFILELOCATORS)
{
- xl_xact_relfilenodes *xl_relfilenodes = (xl_xact_relfilenodes *) data;
+ xl_xact_relfilelocators *xl_rellocators = (xl_xact_relfilelocators *) data;
- parsed->nrels = xl_relfilenodes->nrels;
- parsed->xnodes = xl_relfilenodes->xnodes;
+ parsed->nrels = xl_rellocators->nrels;
+ parsed->xlocators = xl_rellocators->xlocators;
- data += MinSizeOfXactRelfilenodes;
- data += xl_relfilenodes->nrels * sizeof(RelFileNode);
+ data += MinSizeOfXactRelfileLocators;
+ data += xl_rellocators->nrels * sizeof(RelFileLocator);
}
if (parsed->xinfo & XACT_XINFO_HAS_DROPPED_STATS)
@@ -260,11 +260,11 @@ ParsePrepareRecord(uint8 info, xl_xact_prepare *xlrec, xl_xact_parsed_prepare *p
parsed->subxacts = (TransactionId *) bufptr;
bufptr += MAXALIGN(xlrec->nsubxacts * sizeof(TransactionId));
- parsed->xnodes = (RelFileNode *) bufptr;
- bufptr += MAXALIGN(xlrec->ncommitrels * sizeof(RelFileNode));
+ parsed->xlocators = (RelFileLocator *) bufptr;
+ bufptr += MAXALIGN(xlrec->ncommitrels * sizeof(RelFileLocator));
- parsed->abortnodes = (RelFileNode *) bufptr;
- bufptr += MAXALIGN(xlrec->nabortrels * sizeof(RelFileNode));
+ parsed->abortlocators = (RelFileLocator *) bufptr;
+ bufptr += MAXALIGN(xlrec->nabortrels * sizeof(RelFileLocator));
parsed->stats = (xl_xact_stats_item *) bufptr;
bufptr += MAXALIGN(xlrec->ncommitstats * sizeof(xl_xact_stats_item));
@@ -278,7 +278,7 @@ ParsePrepareRecord(uint8 info, xl_xact_prepare *xlrec, xl_xact_parsed_prepare *p
static void
xact_desc_relations(StringInfo buf, char *label, int nrels,
- RelFileNode *xnodes)
+ RelFileLocator *xlocators)
{
int i;
@@ -287,7 +287,7 @@ xact_desc_relations(StringInfo buf, char *label, int nrels,
appendStringInfo(buf, "; %s:", label);
for (i = 0; i < nrels; i++)
{
- char *path = relpathperm(xnodes[i], MAIN_FORKNUM);
+ char *path = relpathperm(xlocators[i], MAIN_FORKNUM);
appendStringInfo(buf, " %s", path);
pfree(path);
@@ -340,7 +340,7 @@ xact_desc_commit(StringInfo buf, uint8 info, xl_xact_commit *xlrec, RepOriginId
appendStringInfoString(buf, timestamptz_to_str(xlrec->xact_time));
- xact_desc_relations(buf, "rels", parsed.nrels, parsed.xnodes);
+ xact_desc_relations(buf, "rels", parsed.nrels, parsed.xlocators);
xact_desc_subxacts(buf, parsed.nsubxacts, parsed.subxacts);
xact_desc_stats(buf, "", parsed.nstats, parsed.stats);
@@ -376,7 +376,7 @@ xact_desc_abort(StringInfo buf, uint8 info, xl_xact_abort *xlrec, RepOriginId or
appendStringInfoString(buf, timestamptz_to_str(xlrec->xact_time));
- xact_desc_relations(buf, "rels", parsed.nrels, parsed.xnodes);
+ xact_desc_relations(buf, "rels", parsed.nrels, parsed.xlocators);
xact_desc_subxacts(buf, parsed.nsubxacts, parsed.subxacts);
if (parsed.xinfo & XACT_XINFO_HAS_ORIGIN)
@@ -400,9 +400,9 @@ xact_desc_prepare(StringInfo buf, uint8 info, xl_xact_prepare *xlrec, RepOriginI
appendStringInfo(buf, "gid %s: ", parsed.twophase_gid);
appendStringInfoString(buf, timestamptz_to_str(parsed.xact_time));
- xact_desc_relations(buf, "rels(commit)", parsed.nrels, parsed.xnodes);
+ xact_desc_relations(buf, "rels(commit)", parsed.nrels, parsed.xlocators);
xact_desc_relations(buf, "rels(abort)", parsed.nabortrels,
- parsed.abortnodes);
+ parsed.abortlocators);
xact_desc_stats(buf, "commit ", parsed.nstats, parsed.stats);
xact_desc_stats(buf, "abort ", parsed.nabortstats, parsed.abortstats);
xact_desc_subxacts(buf, parsed.nsubxacts, parsed.subxacts);
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index fefc563..6fec485 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -219,12 +219,12 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
for (block_id = 0; block_id <= XLogRecMaxBlockId(record); block_id++)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blk;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blk, NULL))
+ &rlocator, &forknum, &blk, NULL))
continue;
if (detailed_format)
@@ -239,7 +239,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfo(buf,
"blkref #%d: rel %u/%u/%u fork %s blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
blk);
@@ -299,7 +299,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfo(buf,
", blkref #%d: rel %u/%u/%u fork %s blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
blk);
}
@@ -308,7 +308,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfo(buf,
", blkref #%d: rel %u/%u/%u blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
blk);
}
diff --git a/src/backend/access/spgist/spginsert.c b/src/backend/access/spgist/spginsert.c
index bfb7404..c6821b5 100644
--- a/src/backend/access/spgist/spginsert.c
+++ b/src/backend/access/spgist/spginsert.c
@@ -171,7 +171,7 @@ spgbuildempty(Relation index)
PageSetChecksumInplace(page, SPGIST_METAPAGE_BLKNO);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, SPGIST_METAPAGE_BLKNO,
(char *) page, true);
- log_newpage(&(RelationGetSmgr(index))->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&(RelationGetSmgr(index))->smgr_rlocator.locator, INIT_FORKNUM,
SPGIST_METAPAGE_BLKNO, page, true);
/* Likewise for the root page. */
@@ -180,7 +180,7 @@ spgbuildempty(Relation index)
PageSetChecksumInplace(page, SPGIST_ROOT_BLKNO);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, SPGIST_ROOT_BLKNO,
(char *) page, true);
- log_newpage(&(RelationGetSmgr(index))->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&(RelationGetSmgr(index))->smgr_rlocator.locator, INIT_FORKNUM,
SPGIST_ROOT_BLKNO, page, true);
/* Likewise for the null-tuples root page. */
@@ -189,7 +189,7 @@ spgbuildempty(Relation index)
PageSetChecksumInplace(page, SPGIST_NULL_BLKNO);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, SPGIST_NULL_BLKNO,
(char *) page, true);
- log_newpage(&(RelationGetSmgr(index))->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&(RelationGetSmgr(index))->smgr_rlocator.locator, INIT_FORKNUM,
SPGIST_NULL_BLKNO, page, true);
/*
diff --git a/src/backend/access/spgist/spgxlog.c b/src/backend/access/spgist/spgxlog.c
index b500b2c..4c9f402 100644
--- a/src/backend/access/spgist/spgxlog.c
+++ b/src/backend/access/spgist/spgxlog.c
@@ -877,11 +877,11 @@ spgRedoVacuumRedirect(XLogReaderState *record)
{
if (TransactionIdIsValid(xldata->newestRedirectXid))
{
- RelFileNode node;
+ RelFileLocator locator;
- XLogRecGetBlockTag(record, 0, &node, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, &locator, NULL, NULL);
ResolveRecoveryConflictWithSnapshot(xldata->newestRedirectXid,
- node);
+ locator);
}
}
diff --git a/src/backend/access/table/tableamapi.c b/src/backend/access/table/tableamapi.c
index 76df798..873d961 100644
--- a/src/backend/access/table/tableamapi.c
+++ b/src/backend/access/table/tableamapi.c
@@ -82,7 +82,7 @@ GetTableAmRoutine(Oid amhandler)
Assert(routine->tuple_update != NULL);
Assert(routine->tuple_lock != NULL);
- Assert(routine->relation_set_new_filenode != NULL);
+ Assert(routine->relation_set_new_filelocator != NULL);
Assert(routine->relation_nontransactional_truncate != NULL);
Assert(routine->relation_copy_data != NULL);
Assert(routine->relation_copy_for_cluster != NULL);
diff --git a/src/backend/access/transam/README b/src/backend/access/transam/README
index 1edc818..565f994 100644
--- a/src/backend/access/transam/README
+++ b/src/backend/access/transam/README
@@ -557,7 +557,7 @@ void XLogRegisterBuffer(uint8 block_id, Buffer buf, uint8 flags);
XLogRegisterBuffer adds information about a data block to the WAL record.
block_id is an arbitrary number used to identify this page reference in
the redo routine. The information needed to re-find the page at redo -
- relfilenode, fork, and block number - are included in the WAL record.
+ relfilenumber, fork, and block number - are included in the WAL record.
XLogInsert will automatically include a full copy of the page contents, if
this is the first modification of the buffer since the last checkpoint.
@@ -692,7 +692,7 @@ by having database restart search for files that don't have any committed
entry in pg_class, but that currently isn't done because of the possibility
of deleting data that is useful for forensic analysis of the crash.
Orphan files are harmless --- at worst they waste a bit of disk space ---
-because we check for on-disk collisions when allocating new relfilenode
+because we check for on-disk collisions when allocating new relfilenumber
OIDs. So cleaning up isn't really necessary.
3. Deleting a table, which requires an unlink() that could fail.
@@ -725,10 +725,10 @@ then restart recovery. This is part of the reason for not writing a WAL
entry until we've successfully done the original action.
-Skipping WAL for New RelFileNode
+Skipping WAL for New RelFileLocator
--------------------------------
-Under wal_level=minimal, if a change modifies a relfilenode that ROLLBACK
+Under wal_level=minimal, if a change modifies a relfilenumber that ROLLBACK
would unlink, in-tree access methods write no WAL for that change. Code that
writes WAL without calling RelationNeedsWAL() must check for this case. This
skipping is mandatory. If a WAL-writing change preceded a WAL-skipping change
@@ -748,9 +748,9 @@ unconditionally for permanent relations. Under these approaches, the access
method callbacks must not call functions that react to RelationNeedsWAL().
This applies only to WAL records whose replay would modify bytes stored in the
-new relfilenode. It does not apply to other records about the relfilenode,
+new relfilenumber. It does not apply to other records about the relfilenumber,
such as XLOG_SMGR_CREATE. Because it operates at the level of individual
-relfilenodes, RelationNeedsWAL() can differ for tightly-coupled relations.
+relfilenumbers, RelationNeedsWAL() can differ for tightly-coupled relations.
Consider "CREATE TABLE t (); BEGIN; ALTER TABLE t ADD c text; ..." in which
ALTER TABLE adds a TOAST relation. The TOAST relation will skip WAL, while
the table owning it will not. ALTER TABLE SET TABLESPACE will cause a table
@@ -860,7 +860,7 @@ Changes to a temp table are not WAL-logged, hence could reach disk in
advance of T1's commit, but we don't care since temp table contents don't
survive crashes anyway.
-Database writes that skip WAL for new relfilenodes are also safe. In these
+Database writes that skip WAL for new relfilenumbers are also safe. In these
cases it's entirely possible for the data to reach disk before T1's commit,
because T1 will fsync it down to disk without any sort of interlock. However,
all these paths are designed to write data that no other transaction can see
diff --git a/src/backend/access/transam/README.parallel b/src/backend/access/transam/README.parallel
index 99c588d..e486bff 100644
--- a/src/backend/access/transam/README.parallel
+++ b/src/backend/access/transam/README.parallel
@@ -126,7 +126,7 @@ worker. This includes:
an index that is currently being rebuilt.
- Active relmapper.c mapping state. This is needed to allow consistent
- answers when fetching the current relfilenode for relation oids of
+ answers when fetching the current relfilenumber for relation oids of
mapped relations.
To prevent unprincipled deadlocks when running in parallel mode, this code
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 75551f6..41b31c5 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -204,7 +204,7 @@ static void RecordTransactionCommitPrepared(TransactionId xid,
int nchildren,
TransactionId *children,
int nrels,
- RelFileNode *rels,
+ RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
int ninvalmsgs,
@@ -215,7 +215,7 @@ static void RecordTransactionAbortPrepared(TransactionId xid,
int nchildren,
TransactionId *children,
int nrels,
- RelFileNode *rels,
+ RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
const char *gid);
@@ -951,8 +951,8 @@ TwoPhaseGetDummyProc(TransactionId xid, bool lock_held)
*
* 1. TwoPhaseFileHeader
* 2. TransactionId[] (subtransactions)
- * 3. RelFileNode[] (files to be deleted at commit)
- * 4. RelFileNode[] (files to be deleted at abort)
+ * 3. RelFileLocator[] (files to be deleted at commit)
+ * 4. RelFileLocator[] (files to be deleted at abort)
* 5. SharedInvalidationMessage[] (inval messages to be sent at commit)
* 6. TwoPhaseRecordOnDisk
* 7. ...
@@ -1047,8 +1047,8 @@ StartPrepare(GlobalTransaction gxact)
TransactionId xid = gxact->xid;
TwoPhaseFileHeader hdr;
TransactionId *children;
- RelFileNode *commitrels;
- RelFileNode *abortrels;
+ RelFileLocator *commitrels;
+ RelFileLocator *abortrels;
xl_xact_stats_item *abortstats = NULL;
xl_xact_stats_item *commitstats = NULL;
SharedInvalidationMessage *invalmsgs;
@@ -1102,12 +1102,12 @@ StartPrepare(GlobalTransaction gxact)
}
if (hdr.ncommitrels > 0)
{
- save_state_data(commitrels, hdr.ncommitrels * sizeof(RelFileNode));
+ save_state_data(commitrels, hdr.ncommitrels * sizeof(RelFileLocator));
pfree(commitrels);
}
if (hdr.nabortrels > 0)
{
- save_state_data(abortrels, hdr.nabortrels * sizeof(RelFileNode));
+ save_state_data(abortrels, hdr.nabortrels * sizeof(RelFileLocator));
pfree(abortrels);
}
if (hdr.ncommitstats > 0)
@@ -1489,9 +1489,9 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
TwoPhaseFileHeader *hdr;
TransactionId latestXid;
TransactionId *children;
- RelFileNode *commitrels;
- RelFileNode *abortrels;
- RelFileNode *delrels;
+ RelFileLocator *commitrels;
+ RelFileLocator *abortrels;
+ RelFileLocator *delrels;
int ndelrels;
xl_xact_stats_item *commitstats;
xl_xact_stats_item *abortstats;
@@ -1525,10 +1525,10 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
bufptr += MAXALIGN(hdr->gidlen);
children = (TransactionId *) bufptr;
bufptr += MAXALIGN(hdr->nsubxacts * sizeof(TransactionId));
- commitrels = (RelFileNode *) bufptr;
- bufptr += MAXALIGN(hdr->ncommitrels * sizeof(RelFileNode));
- abortrels = (RelFileNode *) bufptr;
- bufptr += MAXALIGN(hdr->nabortrels * sizeof(RelFileNode));
+ commitrels = (RelFileLocator *) bufptr;
+ bufptr += MAXALIGN(hdr->ncommitrels * sizeof(RelFileLocator));
+ abortrels = (RelFileLocator *) bufptr;
+ bufptr += MAXALIGN(hdr->nabortrels * sizeof(RelFileLocator));
commitstats = (xl_xact_stats_item *) bufptr;
bufptr += MAXALIGN(hdr->ncommitstats * sizeof(xl_xact_stats_item));
abortstats = (xl_xact_stats_item *) bufptr;
@@ -2100,8 +2100,8 @@ RecoverPreparedTransactions(void)
bufptr += MAXALIGN(hdr->gidlen);
subxids = (TransactionId *) bufptr;
bufptr += MAXALIGN(hdr->nsubxacts * sizeof(TransactionId));
- bufptr += MAXALIGN(hdr->ncommitrels * sizeof(RelFileNode));
- bufptr += MAXALIGN(hdr->nabortrels * sizeof(RelFileNode));
+ bufptr += MAXALIGN(hdr->ncommitrels * sizeof(RelFileLocator));
+ bufptr += MAXALIGN(hdr->nabortrels * sizeof(RelFileLocator));
bufptr += MAXALIGN(hdr->ncommitstats * sizeof(xl_xact_stats_item));
bufptr += MAXALIGN(hdr->nabortstats * sizeof(xl_xact_stats_item));
bufptr += MAXALIGN(hdr->ninvalmsgs * sizeof(SharedInvalidationMessage));
@@ -2285,7 +2285,7 @@ RecordTransactionCommitPrepared(TransactionId xid,
int nchildren,
TransactionId *children,
int nrels,
- RelFileNode *rels,
+ RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
int ninvalmsgs,
@@ -2383,7 +2383,7 @@ RecordTransactionAbortPrepared(TransactionId xid,
int nchildren,
TransactionId *children,
int nrels,
- RelFileNode *rels,
+ RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
const char *gid)
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 748120a..849a7ce 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -521,7 +521,7 @@ ForceTransactionIdLimitUpdate(void)
* wide, counter wraparound will occur eventually, and therefore it is unwise
* to assume they are unique unless precautions are taken to make them so.
* Hence, this routine should generally not be used directly. The only direct
- * callers should be GetNewOidWithIndex() and GetNewRelFileNode() in
+ * callers should be GetNewOidWithIndex() and GetNewRelFileNumber() in
* catalog/catalog.c.
*/
Oid
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 47d80b0..9379723 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -1282,7 +1282,7 @@ RecordTransactionCommit(void)
bool markXidCommitted = TransactionIdIsValid(xid);
TransactionId latestXid = InvalidTransactionId;
int nrels;
- RelFileNode *rels;
+ RelFileLocator *rels;
int nchildren;
TransactionId *children;
int ndroppedstats = 0;
@@ -1705,7 +1705,7 @@ RecordTransactionAbort(bool isSubXact)
TransactionId xid = GetCurrentTransactionIdIfAny();
TransactionId latestXid;
int nrels;
- RelFileNode *rels;
+ RelFileLocator *rels;
int ndroppedstats = 0;
xl_xact_stats_item *droppedstats = NULL;
int nchildren;
@@ -5586,7 +5586,7 @@ xactGetCommittedChildren(TransactionId **ptr)
XLogRecPtr
XactLogCommitRecord(TimestampTz commit_time,
int nsubxacts, TransactionId *subxacts,
- int nrels, RelFileNode *rels,
+ int nrels, RelFileLocator *rels,
int ndroppedstats, xl_xact_stats_item *droppedstats,
int nmsgs, SharedInvalidationMessage *msgs,
bool relcacheInval,
@@ -5597,7 +5597,7 @@ XactLogCommitRecord(TimestampTz commit_time,
xl_xact_xinfo xl_xinfo;
xl_xact_dbinfo xl_dbinfo;
xl_xact_subxacts xl_subxacts;
- xl_xact_relfilenodes xl_relfilenodes;
+ xl_xact_relfilelocators xl_relfilelocators;
xl_xact_stats_items xl_dropped_stats;
xl_xact_invals xl_invals;
xl_xact_twophase xl_twophase;
@@ -5651,8 +5651,8 @@ XactLogCommitRecord(TimestampTz commit_time,
if (nrels > 0)
{
- xl_xinfo.xinfo |= XACT_XINFO_HAS_RELFILENODES;
- xl_relfilenodes.nrels = nrels;
+ xl_xinfo.xinfo |= XACT_XINFO_HAS_RELFILELOCATORS;
+ xl_relfilelocators.nrels = nrels;
info |= XLR_SPECIAL_REL_UPDATE;
}
@@ -5710,12 +5710,12 @@ XactLogCommitRecord(TimestampTz commit_time,
nsubxacts * sizeof(TransactionId));
}
- if (xl_xinfo.xinfo & XACT_XINFO_HAS_RELFILENODES)
+ if (xl_xinfo.xinfo & XACT_XINFO_HAS_RELFILELOCATORS)
{
- XLogRegisterData((char *) (&xl_relfilenodes),
- MinSizeOfXactRelfilenodes);
+ XLogRegisterData((char *) (&xl_relfilelocators),
+ MinSizeOfXactRelfileLocators);
XLogRegisterData((char *) rels,
- nrels * sizeof(RelFileNode));
+ nrels * sizeof(RelFileLocator));
}
if (xl_xinfo.xinfo & XACT_XINFO_HAS_DROPPED_STATS)
@@ -5758,7 +5758,7 @@ XactLogCommitRecord(TimestampTz commit_time,
XLogRecPtr
XactLogAbortRecord(TimestampTz abort_time,
int nsubxacts, TransactionId *subxacts,
- int nrels, RelFileNode *rels,
+ int nrels, RelFileLocator *rels,
int ndroppedstats, xl_xact_stats_item *droppedstats,
int xactflags, TransactionId twophase_xid,
const char *twophase_gid)
@@ -5766,7 +5766,7 @@ XactLogAbortRecord(TimestampTz abort_time,
xl_xact_abort xlrec;
xl_xact_xinfo xl_xinfo;
xl_xact_subxacts xl_subxacts;
- xl_xact_relfilenodes xl_relfilenodes;
+ xl_xact_relfilelocators xl_relfilelocators;
xl_xact_stats_items xl_dropped_stats;
xl_xact_twophase xl_twophase;
xl_xact_dbinfo xl_dbinfo;
@@ -5800,8 +5800,8 @@ XactLogAbortRecord(TimestampTz abort_time,
if (nrels > 0)
{
- xl_xinfo.xinfo |= XACT_XINFO_HAS_RELFILENODES;
- xl_relfilenodes.nrels = nrels;
+ xl_xinfo.xinfo |= XACT_XINFO_HAS_RELFILELOCATORS;
+ xl_relfilelocators.nrels = nrels;
info |= XLR_SPECIAL_REL_UPDATE;
}
@@ -5864,12 +5864,12 @@ XactLogAbortRecord(TimestampTz abort_time,
nsubxacts * sizeof(TransactionId));
}
- if (xl_xinfo.xinfo & XACT_XINFO_HAS_RELFILENODES)
+ if (xl_xinfo.xinfo & XACT_XINFO_HAS_RELFILELOCATORS)
{
- XLogRegisterData((char *) (&xl_relfilenodes),
- MinSizeOfXactRelfilenodes);
+ XLogRegisterData((char *) (&xl_relfilelocators),
+ MinSizeOfXactRelfileLocators);
XLogRegisterData((char *) rels,
- nrels * sizeof(RelFileNode));
+ nrels * sizeof(RelFileLocator));
}
if (xl_xinfo.xinfo & XACT_XINFO_HAS_DROPPED_STATS)
@@ -6010,7 +6010,7 @@ xact_redo_commit(xl_xact_parsed_commit *parsed,
XLogFlush(lsn);
/* Make sure files supposed to be dropped are dropped */
- DropRelationFiles(parsed->xnodes, parsed->nrels, true);
+ DropRelationFiles(parsed->xlocators, parsed->nrels, true);
}
if (parsed->nstats > 0)
@@ -6121,7 +6121,7 @@ xact_redo_abort(xl_xact_parsed_abort *parsed, TransactionId xid,
*/
XLogFlush(lsn);
- DropRelationFiles(parsed->xnodes, parsed->nrels, true);
+ DropRelationFiles(parsed->xlocators, parsed->nrels, true);
}
if (parsed->nstats > 0)
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index 2ce9be2..ec27d36 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -70,7 +70,7 @@ typedef struct
{
bool in_use; /* is this slot in use? */
uint8 flags; /* REGBUF_* flags */
- RelFileNode rnode; /* identifies the relation and block */
+ RelFileLocator rlocator; /* identifies the relation and block */
ForkNumber forkno;
BlockNumber block;
Page page; /* page content */
@@ -257,7 +257,7 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
regbuf = &registered_buffers[block_id];
- BufferGetTag(buffer, &regbuf->rnode, &regbuf->forkno, &regbuf->block);
+ BufferGetTag(buffer, &regbuf->rlocator, &regbuf->forkno, &regbuf->block);
regbuf->page = BufferGetPage(buffer);
regbuf->flags = flags;
regbuf->rdata_tail = (XLogRecData *) &regbuf->rdata_head;
@@ -278,7 +278,7 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
if (i == block_id || !regbuf_old->in_use)
continue;
- Assert(!RelFileNodeEquals(regbuf_old->rnode, regbuf->rnode) ||
+ Assert(!RelFileLocatorEquals(regbuf_old->rlocator, regbuf->rlocator) ||
regbuf_old->forkno != regbuf->forkno ||
regbuf_old->block != regbuf->block);
}
@@ -293,7 +293,7 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
* shared buffer pool (i.e. when you don't have a Buffer for it).
*/
void
-XLogRegisterBlock(uint8 block_id, RelFileNode *rnode, ForkNumber forknum,
+XLogRegisterBlock(uint8 block_id, RelFileLocator *rlocator, ForkNumber forknum,
BlockNumber blknum, Page page, uint8 flags)
{
registered_buffer *regbuf;
@@ -308,7 +308,7 @@ XLogRegisterBlock(uint8 block_id, RelFileNode *rnode, ForkNumber forknum,
regbuf = &registered_buffers[block_id];
- regbuf->rnode = *rnode;
+ regbuf->rlocator = *rlocator;
regbuf->forkno = forknum;
regbuf->block = blknum;
regbuf->page = page;
@@ -331,7 +331,7 @@ XLogRegisterBlock(uint8 block_id, RelFileNode *rnode, ForkNumber forknum,
if (i == block_id || !regbuf_old->in_use)
continue;
- Assert(!RelFileNodeEquals(regbuf_old->rnode, regbuf->rnode) ||
+ Assert(!RelFileLocatorEquals(regbuf_old->rlocator, regbuf->rlocator) ||
regbuf_old->forkno != regbuf->forkno ||
regbuf_old->block != regbuf->block);
}
@@ -768,7 +768,7 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
rdt_datas_last = regbuf->rdata_tail;
}
- if (prev_regbuf && RelFileNodeEquals(regbuf->rnode, prev_regbuf->rnode))
+ if (prev_regbuf && RelFileLocatorEquals(regbuf->rlocator, prev_regbuf->rlocator))
{
samerel = true;
bkpb.fork_flags |= BKPBLOCK_SAME_REL;
@@ -793,8 +793,8 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
}
if (!samerel)
{
- memcpy(scratch, &regbuf->rnode, sizeof(RelFileNode));
- scratch += sizeof(RelFileNode);
+ memcpy(scratch, &regbuf->rlocator, sizeof(RelFileLocator));
+ scratch += sizeof(RelFileLocator);
}
memcpy(scratch, ®buf->block, sizeof(BlockNumber));
scratch += sizeof(BlockNumber);
@@ -1031,7 +1031,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
int flags = 0;
PGAlignedBlock copied_buffer;
char *origdata = (char *) BufferGetBlock(buffer);
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forkno;
BlockNumber blkno;
@@ -1058,8 +1058,8 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
if (buffer_std)
flags |= REGBUF_STANDARD;
- BufferGetTag(buffer, &rnode, &forkno, &blkno);
- XLogRegisterBlock(0, &rnode, forkno, blkno, copied_buffer.data, flags);
+ BufferGetTag(buffer, &rlocator, &forkno, &blkno);
+ XLogRegisterBlock(0, &rlocator, forkno, blkno, copied_buffer.data, flags);
recptr = XLogInsert(RM_XLOG_ID, XLOG_FPI_FOR_HINT);
}
@@ -1080,7 +1080,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
* the unused space to be left out from the WAL record, making it smaller.
*/
XLogRecPtr
-log_newpage(RelFileNode *rnode, ForkNumber forkNum, BlockNumber blkno,
+log_newpage(RelFileLocator *rlocator, ForkNumber forkNum, BlockNumber blkno,
Page page, bool page_std)
{
int flags;
@@ -1091,7 +1091,7 @@ log_newpage(RelFileNode *rnode, ForkNumber forkNum, BlockNumber blkno,
flags |= REGBUF_STANDARD;
XLogBeginInsert();
- XLogRegisterBlock(0, rnode, forkNum, blkno, page, flags);
+ XLogRegisterBlock(0, rlocator, forkNum, blkno, page, flags);
recptr = XLogInsert(RM_XLOG_ID, XLOG_FPI);
/*
@@ -1112,7 +1112,7 @@ log_newpage(RelFileNode *rnode, ForkNumber forkNum, BlockNumber blkno,
* because we can write multiple pages in a single WAL record.
*/
void
-log_newpages(RelFileNode *rnode, ForkNumber forkNum, int num_pages,
+log_newpages(RelFileLocator *rlocator, ForkNumber forkNum, int num_pages,
BlockNumber *blknos, Page *pages, bool page_std)
{
int flags;
@@ -1142,7 +1142,7 @@ log_newpages(RelFileNode *rnode, ForkNumber forkNum, int num_pages,
nbatch = 0;
while (nbatch < XLR_MAX_BLOCK_ID && i < num_pages)
{
- XLogRegisterBlock(nbatch, rnode, forkNum, blknos[i], pages[i], flags);
+ XLogRegisterBlock(nbatch, rlocator, forkNum, blknos[i], pages[i], flags);
i++;
nbatch++;
}
@@ -1177,16 +1177,16 @@ XLogRecPtr
log_newpage_buffer(Buffer buffer, bool page_std)
{
Page page = BufferGetPage(buffer);
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forkNum;
BlockNumber blkno;
/* Shared buffers should be modified in a critical section. */
Assert(CritSectionCount > 0);
- BufferGetTag(buffer, &rnode, &forkNum, &blkno);
+ BufferGetTag(buffer, &rlocator, &forkNum, &blkno);
- return log_newpage(&rnode, forkNum, blkno, page, page_std);
+ return log_newpage(&rlocator, forkNum, blkno, page, page_std);
}
/*
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index 959e409..c2bf2c5 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -138,7 +138,7 @@ struct XLogPrefetcher
dlist_head filter_queue;
/* Book-keeping to avoid repeat prefetches. */
- RelFileNode recent_rnode[XLOGPREFETCHER_SEQ_WINDOW_SIZE];
+ RelFileLocator recent_rlocator[XLOGPREFETCHER_SEQ_WINDOW_SIZE];
BlockNumber recent_block[XLOGPREFETCHER_SEQ_WINDOW_SIZE];
int recent_idx;
@@ -161,7 +161,7 @@ struct XLogPrefetcher
*/
typedef struct XLogPrefetcherFilter
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
XLogRecPtr filter_until_replayed;
BlockNumber filter_from_block;
dlist_node link;
@@ -187,11 +187,11 @@ typedef struct XLogPrefetchStats
} XLogPrefetchStats;
static inline void XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher,
- RelFileNode rnode,
+ RelFileLocator rlocator,
BlockNumber blockno,
XLogRecPtr lsn);
static inline bool XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher,
- RelFileNode rnode,
+ RelFileLocator rlocator,
BlockNumber blockno);
static inline void XLogPrefetcherCompleteFilters(XLogPrefetcher *prefetcher,
XLogRecPtr replaying_lsn);
@@ -365,7 +365,7 @@ XLogPrefetcherAllocate(XLogReaderState *reader)
{
XLogPrefetcher *prefetcher;
static HASHCTL hash_table_ctl = {
- .keysize = sizeof(RelFileNode),
+ .keysize = sizeof(RelFileLocator),
.entrysize = sizeof(XLogPrefetcherFilter)
};
@@ -568,22 +568,22 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
xl_dbase_create_file_copy_rec *xlrec =
(xl_dbase_create_file_copy_rec *) record->main_data;
- RelFileNode rnode = {InvalidOid, xlrec->db_id, InvalidOid};
+ RelFileLocator rlocator = {InvalidOid, xlrec->db_id, InvalidOid};
/*
* Don't try to prefetch anything in this database until
* it has been created, or we might confuse the blocks of
- * different generations, if a database OID or relfilenode
- * is reused. It's also more efficient than discovering
- * that relations don't exist on disk yet with ENOENT
- * errors.
+ * different generations, if a database OID or
+ * relfilenumber is reused. It's also more efficient than
+ * discovering that relations don't exist on disk yet with
+ * ENOENT errors.
*/
- XLogPrefetcherAddFilter(prefetcher, rnode, 0, record->lsn);
+ XLogPrefetcherAddFilter(prefetcher, rlocator, 0, record->lsn);
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing prefetch in database %u until %X/%X is replayed due to raw file copy",
- rnode.dbNode,
+ rlocator.dbOid,
LSN_FORMAT_ARGS(record->lsn));
#endif
}
@@ -601,19 +601,19 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* Don't prefetch anything for this whole relation
* until it has been created. Otherwise we might
* confuse the blocks of different generations, if a
- * relfilenode is reused. This also avoids the need
+ * relfilenumber is reused. This also avoids the need
* to discover the problem via extra syscalls that
* report ENOENT.
*/
- XLogPrefetcherAddFilter(prefetcher, xlrec->rnode, 0,
+ XLogPrefetcherAddFilter(prefetcher, xlrec->rlocator, 0,
record->lsn);
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing prefetch in relation %u/%u/%u until %X/%X is replayed, which creates the relation",
- xlrec->rnode.spcNode,
- xlrec->rnode.dbNode,
- xlrec->rnode.relNode,
+ xlrec->rlocator.spcOid,
+ xlrec->rlocator.dbOid,
+ xlrec->rlocator.relNumber,
LSN_FORMAT_ARGS(record->lsn));
#endif
}
@@ -627,16 +627,16 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* Don't consider prefetching anything in the truncated
* range until the truncation has been performed.
*/
- XLogPrefetcherAddFilter(prefetcher, xlrec->rnode,
+ XLogPrefetcherAddFilter(prefetcher, xlrec->rlocator,
xlrec->blkno,
record->lsn);
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, which truncates the relation",
- xlrec->rnode.spcNode,
- xlrec->rnode.dbNode,
- xlrec->rnode.relNode,
+ xlrec->rlocator.spcOid,
+ xlrec->rlocator.dbOid,
+ xlrec->rlocator.relNumber,
xlrec->blkno,
LSN_FORMAT_ARGS(record->lsn));
#endif
@@ -688,7 +688,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
}
/* Should we skip prefetching this block due to a filter? */
- if (XLogPrefetcherIsFiltered(prefetcher, block->rnode, block->blkno))
+ if (XLogPrefetcherIsFiltered(prefetcher, block->rlocator, block->blkno))
{
XLogPrefetchIncrement(&SharedStats->skip_new);
return LRQ_NEXT_NO_IO;
@@ -698,7 +698,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
for (int i = 0; i < XLOGPREFETCHER_SEQ_WINDOW_SIZE; ++i)
{
if (block->blkno == prefetcher->recent_block[i] &&
- RelFileNodeEquals(block->rnode, prefetcher->recent_rnode[i]))
+ RelFileLocatorEquals(block->rlocator, prefetcher->recent_rlocator[i]))
{
/*
* XXX If we also remembered where it was, we could set
@@ -709,7 +709,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
return LRQ_NEXT_NO_IO;
}
}
- prefetcher->recent_rnode[prefetcher->recent_idx] = block->rnode;
+ prefetcher->recent_rlocator[prefetcher->recent_idx] = block->rlocator;
prefetcher->recent_block[prefetcher->recent_idx] = block->blkno;
prefetcher->recent_idx =
(prefetcher->recent_idx + 1) % XLOGPREFETCHER_SEQ_WINDOW_SIZE;
@@ -719,7 +719,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* same relation (with some scheme to handle invalidations
* safely), but for now we'll call smgropen() every time.
*/
- reln = smgropen(block->rnode, InvalidBackendId);
+ reln = smgropen(block->rlocator, InvalidBackendId);
/*
* If the relation file doesn't exist on disk, for example because
@@ -733,12 +733,12 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing all prefetch in relation %u/%u/%u until %X/%X is replayed, because the relation does not exist on disk",
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
LSN_FORMAT_ARGS(record->lsn));
#endif
- XLogPrefetcherAddFilter(prefetcher, block->rnode, 0,
+ XLogPrefetcherAddFilter(prefetcher, block->rlocator, 0,
record->lsn);
XLogPrefetchIncrement(&SharedStats->skip_new);
return LRQ_NEXT_NO_IO;
@@ -754,13 +754,13 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, because the relation is too small",
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
block->blkno,
LSN_FORMAT_ARGS(record->lsn));
#endif
- XLogPrefetcherAddFilter(prefetcher, block->rnode, block->blkno,
+ XLogPrefetcherAddFilter(prefetcher, block->rlocator, block->blkno,
record->lsn);
XLogPrefetchIncrement(&SharedStats->skip_new);
return LRQ_NEXT_NO_IO;
@@ -793,9 +793,9 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
*/
elog(ERROR,
"could not prefetch relation %u/%u/%u block %u",
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
block->blkno);
}
}
@@ -852,17 +852,17 @@ pg_stat_get_recovery_prefetch(PG_FUNCTION_ARGS)
}
/*
- * Don't prefetch any blocks >= 'blockno' from a given 'rnode', until 'lsn'
+ * Don't prefetch any blocks >= 'blockno' from a given 'rlocator', until 'lsn'
* has been replayed.
*/
static inline void
-XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher, RelFileNode rnode,
+XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
BlockNumber blockno, XLogRecPtr lsn)
{
XLogPrefetcherFilter *filter;
bool found;
- filter = hash_search(prefetcher->filter_table, &rnode, HASH_ENTER, &found);
+ filter = hash_search(prefetcher->filter_table, &rlocator, HASH_ENTER, &found);
if (!found)
{
/*
@@ -875,7 +875,7 @@ XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher, RelFileNode rnode,
else
{
/*
- * We were already filtering this rnode. Extend the filter's lifetime
+ * We were already filtering this rlocator. Extend the filter's lifetime
* to cover this WAL record, but leave the lower of the block numbers
* there because we don't want to have to track individual blocks.
*/
@@ -890,7 +890,7 @@ XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher, RelFileNode rnode,
* Have we replayed any records that caused us to begin filtering a block
* range? That means that relations should have been created, extended or
* dropped as required, so we can stop filtering out accesses to a given
- * relfilenode.
+ * relfilenumber.
*/
static inline void
XLogPrefetcherCompleteFilters(XLogPrefetcher *prefetcher, XLogRecPtr replaying_lsn)
@@ -913,7 +913,7 @@ XLogPrefetcherCompleteFilters(XLogPrefetcher *prefetcher, XLogRecPtr replaying_l
* Check if a given block should be skipped due to a filter.
*/
static inline bool
-XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileNode rnode,
+XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
BlockNumber blockno)
{
/*
@@ -925,13 +925,13 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileNode rnode,
XLogPrefetcherFilter *filter;
/* See if the block range is filtered. */
- filter = hash_search(prefetcher->filter_table, &rnode, HASH_FIND, NULL);
+ filter = hash_search(prefetcher->filter_table, &rlocator, HASH_FIND, NULL);
if (filter && filter->filter_from_block <= blockno)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
- rnode.spcNode, rnode.dbNode, rnode.relNode, blockno,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed),
filter->filter_from_block);
#endif
@@ -939,15 +939,15 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileNode rnode,
}
/* See if the whole database is filtered. */
- rnode.relNode = InvalidOid;
- rnode.spcNode = InvalidOid;
- filter = hash_search(prefetcher->filter_table, &rnode, HASH_FIND, NULL);
+ rlocator.relNumber = InvalidOid;
+ rlocator.spcOid = InvalidOid;
+ filter = hash_search(prefetcher->filter_table, &rlocator, HASH_FIND, NULL);
if (filter)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
- rnode.spcNode, rnode.dbNode, rnode.relNode, blockno,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed));
#endif
return true;
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index cf5db23..f3dc4b7 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -1638,7 +1638,7 @@ DecodeXLogRecord(XLogReaderState *state,
char *out;
uint32 remaining;
uint32 datatotal;
- RelFileNode *rnode = NULL;
+ RelFileLocator *rlocator = NULL;
uint8 block_id;
decoded->header = *record;
@@ -1823,12 +1823,12 @@ DecodeXLogRecord(XLogReaderState *state,
}
if (!(fork_flags & BKPBLOCK_SAME_REL))
{
- COPY_HEADER_FIELD(&blk->rnode, sizeof(RelFileNode));
- rnode = &blk->rnode;
+ COPY_HEADER_FIELD(&blk->rlocator, sizeof(RelFileLocator));
+ rlocator = &blk->rlocator;
}
else
{
- if (rnode == NULL)
+ if (rlocator == NULL)
{
report_invalid_record(state,
"BKPBLOCK_SAME_REL set but no previous rel at %X/%X",
@@ -1836,7 +1836,7 @@ DecodeXLogRecord(XLogReaderState *state,
goto err;
}
- blk->rnode = *rnode;
+ blk->rlocator = *rlocator;
}
COPY_HEADER_FIELD(&blk->blkno, sizeof(BlockNumber));
}
@@ -1926,10 +1926,11 @@ err:
*/
void
XLogRecGetBlockTag(XLogReaderState *record, uint8 block_id,
- RelFileNode *rnode, ForkNumber *forknum, BlockNumber *blknum)
+ RelFileLocator *rlocator, ForkNumber *forknum,
+ BlockNumber *blknum)
{
- if (!XLogRecGetBlockTagExtended(record, block_id, rnode, forknum, blknum,
- NULL))
+ if (!XLogRecGetBlockTagExtended(record, block_id, rlocator, forknum,
+ blknum, NULL))
{
#ifndef FRONTEND
elog(ERROR, "failed to locate backup block with ID %d in WAL record",
@@ -1945,13 +1946,13 @@ XLogRecGetBlockTag(XLogReaderState *record, uint8 block_id,
* Returns information about the block that a block reference refers to,
* optionally including the buffer that the block may already be in.
*
- * If the WAL record contains a block reference with the given ID, *rnode,
+ * If the WAL record contains a block reference with the given ID, *rlocator,
* *forknum, *blknum and *prefetch_buffer are filled in (if not NULL), and
* returns true. Otherwise returns false.
*/
bool
XLogRecGetBlockTagExtended(XLogReaderState *record, uint8 block_id,
- RelFileNode *rnode, ForkNumber *forknum,
+ RelFileLocator *rlocator, ForkNumber *forknum,
BlockNumber *blknum,
Buffer *prefetch_buffer)
{
@@ -1961,8 +1962,8 @@ XLogRecGetBlockTagExtended(XLogReaderState *record, uint8 block_id,
return false;
bkpb = &record->record->blocks[block_id];
- if (rnode)
- *rnode = bkpb->rnode;
+ if (rlocator)
+ *rlocator = bkpb->rlocator;
if (forknum)
*forknum = bkpb->forknum;
if (blknum)
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 6eba626..8306518 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2166,24 +2166,26 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
/* decode block references */
for (block_id = 0; block_id <= XLogRecMaxBlockId(record); block_id++)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blk;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blk, NULL))
+ &rlocator, &forknum, &blk, NULL))
continue;
if (forknum != MAIN_FORKNUM)
appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, fork %u, blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid,
+ rlocator.relNumber,
forknum,
blk);
else
appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid,
+ rlocator.relNumber,
blk);
if (XLogRecHasBlockImage(record, block_id))
appendStringInfoString(buf, " FPW");
@@ -2285,7 +2287,7 @@ static void
verifyBackupPageConsistency(XLogReaderState *record)
{
RmgrData rmgr = GetRmgr(XLogRecGetRmid(record));
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blkno;
int block_id;
@@ -2302,7 +2304,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
Page page;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blkno, NULL))
+ &rlocator, &forknum, &blkno, NULL))
{
/*
* WAL record doesn't contain a block reference with the given id.
@@ -2327,7 +2329,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
* Read the contents from the current buffer and store it in a
* temporary page.
*/
- buf = XLogReadBufferExtended(rnode, forknum, blkno,
+ buf = XLogReadBufferExtended(rlocator, forknum, blkno,
RBM_NORMAL_NO_LOG,
InvalidBuffer);
if (!BufferIsValid(buf))
@@ -2377,7 +2379,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
{
elog(FATAL,
"inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forknum, blkno);
}
}
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 4851669..42a0f51 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -67,7 +67,7 @@ HotStandbyState standbyState = STANDBY_DISABLED;
*/
typedef struct xl_invalid_page_key
{
- RelFileNode node; /* the relation */
+ RelFileLocator locator; /* the relation */
ForkNumber forkno; /* the fork number */
BlockNumber blkno; /* the page */
} xl_invalid_page_key;
@@ -86,10 +86,10 @@ static int read_local_xlog_page_guts(XLogReaderState *state, XLogRecPtr targetPa
/* Report a reference to an invalid page */
static void
-report_invalid_page(int elevel, RelFileNode node, ForkNumber forkno,
+report_invalid_page(int elevel, RelFileLocator locator, ForkNumber forkno,
BlockNumber blkno, bool present)
{
- char *path = relpathperm(node, forkno);
+ char *path = relpathperm(locator, forkno);
if (present)
elog(elevel, "page %u of relation %s is uninitialized",
@@ -102,7 +102,7 @@ report_invalid_page(int elevel, RelFileNode node, ForkNumber forkno,
/* Log a reference to an invalid page */
static void
-log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
+log_invalid_page(RelFileLocator locator, ForkNumber forkno, BlockNumber blkno,
bool present)
{
xl_invalid_page_key key;
@@ -119,7 +119,7 @@ log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
*/
if (reachedConsistency)
{
- report_invalid_page(WARNING, node, forkno, blkno, present);
+ report_invalid_page(WARNING, locator, forkno, blkno, present);
elog(ignore_invalid_pages ? WARNING : PANIC,
"WAL contains references to invalid pages");
}
@@ -130,7 +130,7 @@ log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
* something about the XLOG record that generated the reference).
*/
if (message_level_is_interesting(DEBUG1))
- report_invalid_page(DEBUG1, node, forkno, blkno, present);
+ report_invalid_page(DEBUG1, locator, forkno, blkno, present);
if (invalid_page_tab == NULL)
{
@@ -147,7 +147,7 @@ log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
}
/* we currently assume xl_invalid_page_key contains no padding */
- key.node = node;
+ key.locator = locator;
key.forkno = forkno;
key.blkno = blkno;
hentry = (xl_invalid_page *)
@@ -166,7 +166,8 @@ log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
/* Forget any invalid pages >= minblkno, because they've been dropped */
static void
-forget_invalid_pages(RelFileNode node, ForkNumber forkno, BlockNumber minblkno)
+forget_invalid_pages(RelFileLocator locator, ForkNumber forkno,
+ BlockNumber minblkno)
{
HASH_SEQ_STATUS status;
xl_invalid_page *hentry;
@@ -178,13 +179,13 @@ forget_invalid_pages(RelFileNode node, ForkNumber forkno, BlockNumber minblkno)
while ((hentry = (xl_invalid_page *) hash_seq_search(&status)) != NULL)
{
- if (RelFileNodeEquals(hentry->key.node, node) &&
+ if (RelFileLocatorEquals(hentry->key.locator, locator) &&
hentry->key.forkno == forkno &&
hentry->key.blkno >= minblkno)
{
if (message_level_is_interesting(DEBUG2))
{
- char *path = relpathperm(hentry->key.node, forkno);
+ char *path = relpathperm(hentry->key.locator, forkno);
elog(DEBUG2, "page %u of relation %s has been dropped",
hentry->key.blkno, path);
@@ -213,11 +214,11 @@ forget_invalid_pages_db(Oid dbid)
while ((hentry = (xl_invalid_page *) hash_seq_search(&status)) != NULL)
{
- if (hentry->key.node.dbNode == dbid)
+ if (hentry->key.locator.dbOid == dbid)
{
if (message_level_is_interesting(DEBUG2))
{
- char *path = relpathperm(hentry->key.node, hentry->key.forkno);
+ char *path = relpathperm(hentry->key.locator, hentry->key.forkno);
elog(DEBUG2, "page %u of relation %s has been dropped",
hentry->key.blkno, path);
@@ -261,7 +262,7 @@ XLogCheckInvalidPages(void)
*/
while ((hentry = (xl_invalid_page *) hash_seq_search(&status)) != NULL)
{
- report_invalid_page(WARNING, hentry->key.node, hentry->key.forkno,
+ report_invalid_page(WARNING, hentry->key.locator, hentry->key.forkno,
hentry->key.blkno, hentry->present);
foundone = true;
}
@@ -356,7 +357,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
Buffer *buf)
{
XLogRecPtr lsn = record->EndRecPtr;
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blkno;
Buffer prefetch_buffer;
@@ -364,7 +365,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
bool zeromode;
bool willinit;
- if (!XLogRecGetBlockTagExtended(record, block_id, &rnode, &forknum, &blkno,
+ if (!XLogRecGetBlockTagExtended(record, block_id, &rlocator, &forknum, &blkno,
&prefetch_buffer))
{
/* Caller specified a bogus block_id */
@@ -387,7 +388,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
if (XLogRecBlockImageApply(record, block_id))
{
Assert(XLogRecHasBlockImage(record, block_id));
- *buf = XLogReadBufferExtended(rnode, forknum, blkno,
+ *buf = XLogReadBufferExtended(rlocator, forknum, blkno,
get_cleanup_lock ? RBM_ZERO_AND_CLEANUP_LOCK : RBM_ZERO_AND_LOCK,
prefetch_buffer);
page = BufferGetPage(*buf);
@@ -418,7 +419,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
}
else
{
- *buf = XLogReadBufferExtended(rnode, forknum, blkno, mode, prefetch_buffer);
+ *buf = XLogReadBufferExtended(rlocator, forknum, blkno, mode, prefetch_buffer);
if (BufferIsValid(*buf))
{
if (mode != RBM_ZERO_AND_LOCK && mode != RBM_ZERO_AND_CLEANUP_LOCK)
@@ -468,7 +469,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
* they will be invisible to tools that need to know which pages are modified.
*/
Buffer
-XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
+XLogReadBufferExtended(RelFileLocator rlocator, ForkNumber forknum,
BlockNumber blkno, ReadBufferMode mode,
Buffer recent_buffer)
{
@@ -481,14 +482,14 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
/* Do we have a clue where the buffer might be already? */
if (BufferIsValid(recent_buffer) &&
mode == RBM_NORMAL &&
- ReadRecentBuffer(rnode, forknum, blkno, recent_buffer))
+ ReadRecentBuffer(rlocator, forknum, blkno, recent_buffer))
{
buffer = recent_buffer;
goto recent_buffer_fast_path;
}
/* Open the relation at smgr level */
- smgr = smgropen(rnode, InvalidBackendId);
+ smgr = smgropen(rlocator, InvalidBackendId);
/*
* Create the target file if it doesn't already exist. This lets us cope
@@ -505,7 +506,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
if (blkno < lastblock)
{
/* page exists in file */
- buffer = ReadBufferWithoutRelcache(rnode, forknum, blkno,
+ buffer = ReadBufferWithoutRelcache(rlocator, forknum, blkno,
mode, NULL, true);
}
else
@@ -513,7 +514,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
/* hm, page doesn't exist in file */
if (mode == RBM_NORMAL)
{
- log_invalid_page(rnode, forknum, blkno, false);
+ log_invalid_page(rlocator, forknum, blkno, false);
return InvalidBuffer;
}
if (mode == RBM_NORMAL_NO_LOG)
@@ -530,7 +531,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
ReleaseBuffer(buffer);
}
- buffer = ReadBufferWithoutRelcache(rnode, forknum,
+ buffer = ReadBufferWithoutRelcache(rlocator, forknum,
P_NEW, mode, NULL, true);
}
while (BufferGetBlockNumber(buffer) < blkno);
@@ -540,7 +541,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
if (mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK)
LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
ReleaseBuffer(buffer);
- buffer = ReadBufferWithoutRelcache(rnode, forknum, blkno,
+ buffer = ReadBufferWithoutRelcache(rlocator, forknum, blkno,
mode, NULL, true);
}
}
@@ -559,7 +560,7 @@ recent_buffer_fast_path:
if (PageIsNew(page))
{
ReleaseBuffer(buffer);
- log_invalid_page(rnode, forknum, blkno, true);
+ log_invalid_page(rlocator, forknum, blkno, true);
return InvalidBuffer;
}
}
@@ -594,7 +595,7 @@ typedef FakeRelCacheEntryData *FakeRelCacheEntry;
* Caller must free the returned entry with FreeFakeRelcacheEntry().
*/
Relation
-CreateFakeRelcacheEntry(RelFileNode rnode)
+CreateFakeRelcacheEntry(RelFileLocator rlocator)
{
FakeRelCacheEntry fakeentry;
Relation rel;
@@ -604,7 +605,7 @@ CreateFakeRelcacheEntry(RelFileNode rnode)
rel = (Relation) fakeentry;
rel->rd_rel = &fakeentry->pgc;
- rel->rd_node = rnode;
+ rel->rd_locator = rlocator;
/*
* We will never be working with temp rels during recovery or while
@@ -615,18 +616,18 @@ CreateFakeRelcacheEntry(RelFileNode rnode)
/* It must be a permanent table here */
rel->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
- /* We don't know the name of the relation; use relfilenode instead */
- sprintf(RelationGetRelationName(rel), "%u", rnode.relNode);
+ /* We don't know the name of the relation; use relfilenumber instead */
+ sprintf(RelationGetRelationName(rel), "%u", rlocator.relNumber);
/*
* We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNode may be
+ * relation. Note that this is fairly bogus since relNumber may be
* different from the relation's OID. It shouldn't really matter though.
* In recovery, we are running by ourselves and can't have any lock
* conflicts. While syncing, we already hold AccessExclusiveLock.
*/
- rel->rd_lockInfo.lockRelId.dbId = rnode.dbNode;
- rel->rd_lockInfo.lockRelId.relId = rnode.relNode;
+ rel->rd_lockInfo.lockRelId.dbId = rlocator.dbOid;
+ rel->rd_lockInfo.lockRelId.relId = rlocator.relNumber;
rel->rd_smgr = NULL;
@@ -652,9 +653,9 @@ FreeFakeRelcacheEntry(Relation fakerel)
* any open "invalid-page" records for the relation.
*/
void
-XLogDropRelation(RelFileNode rnode, ForkNumber forknum)
+XLogDropRelation(RelFileLocator rlocator, ForkNumber forknum)
{
- forget_invalid_pages(rnode, forknum, 0);
+ forget_invalid_pages(rlocator, forknum, 0);
}
/*
@@ -682,10 +683,10 @@ XLogDropDatabase(Oid dbid)
* We need to clean up any open "invalid-page" records for the dropped pages.
*/
void
-XLogTruncateRelation(RelFileNode rnode, ForkNumber forkNum,
+XLogTruncateRelation(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber nblocks)
{
- forget_invalid_pages(rnode, forkNum, nblocks);
+ forget_invalid_pages(rlocator, forkNum, nblocks);
}
/*
diff --git a/src/backend/bootstrap/bootparse.y b/src/backend/bootstrap/bootparse.y
index e5cf1b3..a872199 100644
--- a/src/backend/bootstrap/bootparse.y
+++ b/src/backend/bootstrap/bootparse.y
@@ -287,9 +287,9 @@ Boot_DeclareIndexStmt:
stmt->excludeOpNames = NIL;
stmt->idxcomment = NULL;
stmt->indexOid = InvalidOid;
- stmt->oldNode = InvalidOid;
+ stmt->oldNumber = InvalidOid;
stmt->oldCreateSubid = InvalidSubTransactionId;
- stmt->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ stmt->oldFirstRelfilenumberSubid = InvalidSubTransactionId;
stmt->unique = false;
stmt->primary = false;
stmt->isconstraint = false;
@@ -339,9 +339,9 @@ Boot_DeclareUniqueIndexStmt:
stmt->excludeOpNames = NIL;
stmt->idxcomment = NULL;
stmt->indexOid = InvalidOid;
- stmt->oldNode = InvalidOid;
+ stmt->oldNumber = InvalidOid;
stmt->oldCreateSubid = InvalidSubTransactionId;
- stmt->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ stmt->oldFirstRelfilenumberSubid = InvalidSubTransactionId;
stmt->unique = true;
stmt->primary = false;
stmt->isconstraint = false;
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index e784538..31d3d1c 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -481,14 +481,14 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
}
/*
- * GetNewRelFileNode
- * Generate a new relfilenode number that is unique within the
+ * GetNewRelFileNumber
+ * Generate a new relfilenumber that is unique within the
* database of the given tablespace.
*
- * If the relfilenode will also be used as the relation's OID, pass the
+ * If the relfilenumber will also be used as the relation's OID, pass the
* opened pg_class catalog, and this routine will guarantee that the result
* is also an unused OID within pg_class. If the result is to be used only
- * as a relfilenode for an existing relation, pass NULL for pg_class.
+ * as a relfilenumber for an existing relation, pass NULL for pg_class.
*
* As with GetNewOidWithIndex(), there is some theoretical risk of a race
* condition, but it doesn't seem worth worrying about.
@@ -497,16 +497,16 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
* created by bootstrap have preassigned OIDs, so there's no need.
*/
Oid
-GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
+GetNewRelFileNumber(Oid reltablespace, Relation pg_class, char relpersistence)
{
- RelFileNodeBackend rnode;
+ RelFileLocatorBackend rlocator;
char *rpath;
bool collides;
BackendId backend;
/*
* If we ever get here during pg_upgrade, there's something wrong; all
- * relfilenode assignments during a binary-upgrade run should be
+ * relfilenumber assignments during a binary-upgrade run should be
* determined by commands in the dump script.
*/
Assert(!IsBinaryUpgrade);
@@ -526,15 +526,15 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
}
/* This logic should match RelationInitPhysicalAddr */
- rnode.node.spcNode = reltablespace ? reltablespace : MyDatabaseTableSpace;
- rnode.node.dbNode = (rnode.node.spcNode == GLOBALTABLESPACE_OID) ? InvalidOid : MyDatabaseId;
+ rlocator.locator.spcOid = reltablespace ? reltablespace : MyDatabaseTableSpace;
+ rlocator.locator.dbOid = (rlocator.locator.spcOid == GLOBALTABLESPACE_OID) ? InvalidOid : MyDatabaseId;
/*
* The relpath will vary based on the backend ID, so we must initialize
* that properly here to make sure that any collisions based on filename
* are properly detected.
*/
- rnode.backend = backend;
+ rlocator.backend = backend;
do
{
@@ -542,13 +542,13 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
/* Generate the OID */
if (pg_class)
- rnode.node.relNode = GetNewOidWithIndex(pg_class, ClassOidIndexId,
+ rlocator.locator.relNumber = GetNewOidWithIndex(pg_class, ClassOidIndexId,
Anum_pg_class_oid);
else
- rnode.node.relNode = GetNewObjectId();
+ rlocator.locator.relNumber = GetNewObjectId();
/* Check for existing file of same name */
- rpath = relpath(rnode, MAIN_FORKNUM);
+ rpath = relpath(rlocator, MAIN_FORKNUM);
if (access(rpath, F_OK) == 0)
{
@@ -570,7 +570,7 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
pfree(rpath);
} while (collides);
- return rnode.node.relNode;
+ return rlocator.locator.relNumber;
}
/*
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 1803194..0d1af74 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -77,9 +77,9 @@
/* Potentially set by pg_upgrade_support functions */
Oid binary_upgrade_next_heap_pg_class_oid = InvalidOid;
-Oid binary_upgrade_next_heap_pg_class_relfilenode = InvalidOid;
+Oid binary_upgrade_next_heap_pg_class_relfilenumber = InvalidOid;
Oid binary_upgrade_next_toast_pg_class_oid = InvalidOid;
-Oid binary_upgrade_next_toast_pg_class_relfilenode = InvalidOid;
+Oid binary_upgrade_next_toast_pg_class_relfilenumber = InvalidOid;
static void AddNewRelationTuple(Relation pg_class_desc,
Relation new_rel_desc,
@@ -273,7 +273,7 @@ SystemAttributeByName(const char *attname)
* heap_create - Create an uncataloged heap relation
*
* Note API change: the caller must now always provide the OID
- * to use for the relation. The relfilenode may be (and in
+ * to use for the relation. The relfilenumber may be (and in
* the simplest cases is) left unspecified.
*
* create_storage indicates whether or not to create the storage.
@@ -289,7 +289,7 @@ heap_create(const char *relname,
Oid relnamespace,
Oid reltablespace,
Oid relid,
- Oid relfilenode,
+ Oid relfilenumber,
Oid accessmtd,
TupleDesc tupDesc,
char relkind,
@@ -341,11 +341,11 @@ heap_create(const char *relname,
else
{
/*
- * If relfilenode is unspecified by the caller then create storage
+ * If relfilenumber is unspecified by the caller then create storage
* with oid same as relid.
*/
- if (!OidIsValid(relfilenode))
- relfilenode = relid;
+ if (!OidIsValid(relfilenumber))
+ relfilenumber = relid;
}
/*
@@ -368,7 +368,7 @@ heap_create(const char *relname,
tupDesc,
relid,
accessmtd,
- relfilenode,
+ relfilenumber,
reltablespace,
shared_relation,
mapped_relation,
@@ -385,11 +385,11 @@ heap_create(const char *relname,
if (create_storage)
{
if (RELKIND_HAS_TABLE_AM(rel->rd_rel->relkind))
- table_relation_set_new_filenode(rel, &rel->rd_node,
- relpersistence,
- relfrozenxid, relminmxid);
+ table_relation_set_new_filelocator(rel, &rel->rd_locator,
+ relpersistence,
+ relfrozenxid, relminmxid);
else if (RELKIND_HAS_STORAGE(rel->rd_rel->relkind))
- RelationCreateStorage(rel->rd_node, relpersistence, true);
+ RelationCreateStorage(rel->rd_locator, relpersistence, true);
else
Assert(false);
}
@@ -1069,7 +1069,7 @@ AddNewRelationType(const char *typeName,
* relkind: relkind for new rel
* relpersistence: rel's persistence status (permanent, temp, or unlogged)
* shared_relation: true if it's to be a shared relation
- * mapped_relation: true if the relation will use the relfilenode map
+ * mapped_relation: true if the relation will use the relfilenumber map
* oncommit: ON COMMIT marking (only relevant if it's a temp table)
* reloptions: reloptions in Datum form, or (Datum) 0 if none
* use_user_acl: true if should look for user-defined default permissions;
@@ -1115,7 +1115,7 @@ heap_create_with_catalog(const char *relname,
Oid new_type_oid;
/* By default set to InvalidOid unless overridden by binary-upgrade */
- Oid relfilenode = InvalidOid;
+ Oid relfilenumber = InvalidOid;
TransactionId relfrozenxid;
MultiXactId relminmxid;
@@ -1173,12 +1173,12 @@ heap_create_with_catalog(const char *relname,
/*
* Allocate an OID for the relation, unless we were told what to use.
*
- * The OID will be the relfilenode as well, so make sure it doesn't
+ * The OID will be the relfilenumber as well, so make sure it doesn't
* collide with either pg_class OIDs or existing physical files.
*/
if (!OidIsValid(relid))
{
- /* Use binary-upgrade override for pg_class.oid and relfilenode */
+ /* Use binary-upgrade override for pg_class.oid and relfilenumber */
if (IsBinaryUpgrade)
{
/*
@@ -1196,13 +1196,13 @@ heap_create_with_catalog(const char *relname,
relid = binary_upgrade_next_toast_pg_class_oid;
binary_upgrade_next_toast_pg_class_oid = InvalidOid;
- if (!OidIsValid(binary_upgrade_next_toast_pg_class_relfilenode))
+ if (!OidIsValid(binary_upgrade_next_toast_pg_class_relfilenumber))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("toast relfilenode value not set when in binary upgrade mode")));
+ errmsg("toast relfilenumber value not set when in binary upgrade mode")));
- relfilenode = binary_upgrade_next_toast_pg_class_relfilenode;
- binary_upgrade_next_toast_pg_class_relfilenode = InvalidOid;
+ relfilenumber = binary_upgrade_next_toast_pg_class_relfilenumber;
+ binary_upgrade_next_toast_pg_class_relfilenumber = InvalidOid;
}
}
else
@@ -1217,19 +1217,19 @@ heap_create_with_catalog(const char *relname,
if (RELKIND_HAS_STORAGE(relkind))
{
- if (!OidIsValid(binary_upgrade_next_heap_pg_class_relfilenode))
+ if (!OidIsValid(binary_upgrade_next_heap_pg_class_relfilenumber))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("relfilenode value not set when in binary upgrade mode")));
+ errmsg("relfilenumber value not set when in binary upgrade mode")));
- relfilenode = binary_upgrade_next_heap_pg_class_relfilenode;
- binary_upgrade_next_heap_pg_class_relfilenode = InvalidOid;
+ relfilenumber = binary_upgrade_next_heap_pg_class_relfilenumber;
+ binary_upgrade_next_heap_pg_class_relfilenumber = InvalidOid;
}
}
}
if (!OidIsValid(relid))
- relid = GetNewRelFileNode(reltablespace, pg_class_desc,
+ relid = GetNewRelFileNumber(reltablespace, pg_class_desc,
relpersistence);
}
@@ -1273,7 +1273,7 @@ heap_create_with_catalog(const char *relname,
relnamespace,
reltablespace,
relid,
- relfilenode,
+ relfilenumber,
accessmtd,
tupdesc,
relkind,
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index bdd3c34..b49876a 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -87,7 +87,7 @@
/* Potentially set by pg_upgrade_support functions */
Oid binary_upgrade_next_index_pg_class_oid = InvalidOid;
-Oid binary_upgrade_next_index_pg_class_relfilenode = InvalidOid;
+Oid binary_upgrade_next_index_pg_class_relfilenumber = InvalidOid;
/*
* Pointer-free representation of variables used when reindexing system
@@ -662,7 +662,7 @@ UpdateIndexRelation(Oid indexoid,
* parent index; otherwise InvalidOid.
* parentConstraintId: if creating a constraint on a partition, the OID
* of the constraint in the parent; otherwise InvalidOid.
- * relFileNode: normally, pass InvalidOid to get new storage. May be
+ * relFileNumber: normally, pass InvalidOid to get new storage. May be
* nonzero to attach an existing valid build.
* indexInfo: same info executor uses to insert into the index
* indexColNames: column names to use for index (List of char *)
@@ -703,7 +703,7 @@ index_create(Relation heapRelation,
Oid indexRelationId,
Oid parentIndexRelid,
Oid parentConstraintId,
- Oid relFileNode,
+ Oid relFileNumber,
IndexInfo *indexInfo,
List *indexColNames,
Oid accessMethodObjectId,
@@ -735,7 +735,7 @@ index_create(Relation heapRelation,
char relkind;
TransactionId relfrozenxid;
MultiXactId relminmxid;
- bool create_storage = !OidIsValid(relFileNode);
+ bool create_storage = !OidIsValid(relFileNumber);
/* constraint flags can only be set when a constraint is requested */
Assert((constr_flags == 0) ||
@@ -751,7 +751,7 @@ index_create(Relation heapRelation,
/*
* The index will be in the same namespace as its parent table, and is
* shared across databases if and only if the parent is. Likewise, it
- * will use the relfilenode map if and only if the parent does; and it
+ * will use the relfilenumber map if and only if the parent does; and it
* inherits the parent's relpersistence.
*/
namespaceId = RelationGetNamespace(heapRelation);
@@ -902,12 +902,12 @@ index_create(Relation heapRelation,
/*
* Allocate an OID for the index, unless we were told what to use.
*
- * The OID will be the relfilenode as well, so make sure it doesn't
+ * The OID will be the relfilenumber as well, so make sure it doesn't
* collide with either pg_class OIDs or existing physical files.
*/
if (!OidIsValid(indexRelationId))
{
- /* Use binary-upgrade override for pg_class.oid and relfilenode */
+ /* Use binary-upgrade override for pg_class.oid and relfilenumber */
if (IsBinaryUpgrade)
{
if (!OidIsValid(binary_upgrade_next_index_pg_class_oid))
@@ -918,14 +918,14 @@ index_create(Relation heapRelation,
indexRelationId = binary_upgrade_next_index_pg_class_oid;
binary_upgrade_next_index_pg_class_oid = InvalidOid;
- /* Override the index relfilenode */
+ /* Override the index relfilenumber */
if ((relkind == RELKIND_INDEX) &&
- (!OidIsValid(binary_upgrade_next_index_pg_class_relfilenode)))
+ (!OidIsValid(binary_upgrade_next_index_pg_class_relfilenumber)))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("index relfilenode value not set when in binary upgrade mode")));
- relFileNode = binary_upgrade_next_index_pg_class_relfilenode;
- binary_upgrade_next_index_pg_class_relfilenode = InvalidOid;
+ errmsg("index relfilenumber value not set when in binary upgrade mode")));
+ relFileNumber = binary_upgrade_next_index_pg_class_relfilenumber;
+ binary_upgrade_next_index_pg_class_relfilenumber = InvalidOid;
/*
* Note that we want create_storage = true for binary upgrade. The
@@ -937,7 +937,7 @@ index_create(Relation heapRelation,
else
{
indexRelationId =
- GetNewRelFileNode(tableSpaceId, pg_class, relpersistence);
+ GetNewRelFileNumber(tableSpaceId, pg_class, relpersistence);
}
}
@@ -950,7 +950,7 @@ index_create(Relation heapRelation,
namespaceId,
tableSpaceId,
indexRelationId,
- relFileNode,
+ relFileNumber,
accessMethodObjectId,
indexTupDesc,
relkind,
@@ -1408,7 +1408,7 @@ index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId,
InvalidOid, /* indexRelationId */
InvalidOid, /* parentIndexRelid */
InvalidOid, /* parentConstraintId */
- InvalidOid, /* relFileNode */
+ InvalidOid, /* relFileNumber */
newInfo,
indexColNames,
indexRelation->rd_rel->relam,
@@ -3024,7 +3024,7 @@ index_build(Relation heapRelation,
* it -- but we must first check whether one already exists. If, for
* example, an unlogged relation is truncated in the transaction that
* created it, or truncated twice in a subsequent transaction, the
- * relfilenode won't change, and nothing needs to be done here.
+ * relfilenumber won't change, and nothing needs to be done here.
*/
if (indexRelation->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
!smgrexists(RelationGetSmgr(indexRelation), INIT_FORKNUM))
@@ -3681,7 +3681,7 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
* Schedule unlinking of the old index storage at transaction commit.
*/
RelationDropStorage(iRel);
- RelationAssumeNewRelfilenode(iRel);
+ RelationAssumeNewRelfilelocator(iRel);
/* Make sure the reltablespace change is visible */
CommandCounterIncrement();
@@ -3711,7 +3711,7 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
SetReindexProcessing(heapId, indexId);
/* Create a new physical relation for the index */
- RelationSetNewRelfilenode(iRel, persistence);
+ RelationSetNewRelfilenumber(iRel, persistence);
/* Initialize the index and rebuild */
/* Note: we do not need to re-establish pkey setting */
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index c06e414..37dd2b9 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -38,7 +38,7 @@
int wal_skip_threshold = 2048; /* in kilobytes */
/*
- * We keep a list of all relations (represented as RelFileNode values)
+ * We keep a list of all relations (represented as RelFileLocator values)
* that have been created or deleted in the current transaction. When
* a relation is created, we create the physical file immediately, but
* remember it so that we can delete the file again if the current
@@ -59,7 +59,7 @@ int wal_skip_threshold = 2048; /* in kilobytes */
typedef struct PendingRelDelete
{
- RelFileNode relnode; /* relation that may need to be deleted */
+ RelFileLocator rlocator; /* relation that may need to be deleted */
BackendId backend; /* InvalidBackendId if not a temp rel */
bool atCommit; /* T=delete at commit; F=delete at abort */
int nestLevel; /* xact nesting level of request */
@@ -68,7 +68,7 @@ typedef struct PendingRelDelete
typedef struct PendingRelSync
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
bool is_truncated; /* Has the file experienced truncation? */
} PendingRelSync;
@@ -81,7 +81,7 @@ static HTAB *pendingSyncHash = NULL;
* Queue an at-commit fsync.
*/
static void
-AddPendingSync(const RelFileNode *rnode)
+AddPendingSync(const RelFileLocator *rlocator)
{
PendingRelSync *pending;
bool found;
@@ -91,14 +91,14 @@ AddPendingSync(const RelFileNode *rnode)
{
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileNode);
+ ctl.keysize = sizeof(RelFileLocator);
ctl.entrysize = sizeof(PendingRelSync);
ctl.hcxt = TopTransactionContext;
pendingSyncHash = hash_create("pending sync hash", 16, &ctl,
HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
}
- pending = hash_search(pendingSyncHash, rnode, HASH_ENTER, &found);
+ pending = hash_search(pendingSyncHash, rlocator, HASH_ENTER, &found);
Assert(!found);
pending->is_truncated = false;
}
@@ -117,7 +117,7 @@ AddPendingSync(const RelFileNode *rnode)
* pass register_delete = false.
*/
SMgrRelation
-RelationCreateStorage(RelFileNode rnode, char relpersistence,
+RelationCreateStorage(RelFileLocator rlocator, char relpersistence,
bool register_delete)
{
SMgrRelation srel;
@@ -145,11 +145,11 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence,
return NULL; /* placate compiler */
}
- srel = smgropen(rnode, backend);
+ srel = smgropen(rlocator, backend);
smgrcreate(srel, MAIN_FORKNUM, false);
if (needs_wal)
- log_smgrcreate(&srel->smgr_rnode.node, MAIN_FORKNUM);
+ log_smgrcreate(&srel->smgr_rlocator.locator, MAIN_FORKNUM);
/*
* Add the relation to the list of stuff to delete at abort, if we are
@@ -161,7 +161,7 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence,
pending = (PendingRelDelete *)
MemoryContextAlloc(TopMemoryContext, sizeof(PendingRelDelete));
- pending->relnode = rnode;
+ pending->rlocator = rlocator;
pending->backend = backend;
pending->atCommit = false; /* delete if abort */
pending->nestLevel = GetCurrentTransactionNestLevel();
@@ -172,7 +172,7 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence,
if (relpersistence == RELPERSISTENCE_PERMANENT && !XLogIsNeeded())
{
Assert(backend == InvalidBackendId);
- AddPendingSync(&rnode);
+ AddPendingSync(&rlocator);
}
return srel;
@@ -182,14 +182,14 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence,
* Perform XLogInsert of an XLOG_SMGR_CREATE record to WAL.
*/
void
-log_smgrcreate(const RelFileNode *rnode, ForkNumber forkNum)
+log_smgrcreate(const RelFileLocator *rlocator, ForkNumber forkNum)
{
xl_smgr_create xlrec;
/*
* Make an XLOG entry reporting the file creation.
*/
- xlrec.rnode = *rnode;
+ xlrec.rlocator = *rlocator;
xlrec.forkNum = forkNum;
XLogBeginInsert();
@@ -209,7 +209,7 @@ RelationDropStorage(Relation rel)
/* Add the relation to the list of stuff to delete at commit */
pending = (PendingRelDelete *)
MemoryContextAlloc(TopMemoryContext, sizeof(PendingRelDelete));
- pending->relnode = rel->rd_node;
+ pending->rlocator = rel->rd_locator;
pending->backend = rel->rd_backend;
pending->atCommit = true; /* delete if commit */
pending->nestLevel = GetCurrentTransactionNestLevel();
@@ -247,7 +247,7 @@ RelationDropStorage(Relation rel)
* No-op if the relation is not among those scheduled for deletion.
*/
void
-RelationPreserveStorage(RelFileNode rnode, bool atCommit)
+RelationPreserveStorage(RelFileLocator rlocator, bool atCommit)
{
PendingRelDelete *pending;
PendingRelDelete *prev;
@@ -257,7 +257,7 @@ RelationPreserveStorage(RelFileNode rnode, bool atCommit)
for (pending = pendingDeletes; pending != NULL; pending = next)
{
next = pending->next;
- if (RelFileNodeEquals(rnode, pending->relnode)
+ if (RelFileLocatorEquals(rlocator, pending->rlocator)
&& pending->atCommit == atCommit)
{
/* unlink and delete list entry */
@@ -369,7 +369,7 @@ RelationTruncate(Relation rel, BlockNumber nblocks)
xl_smgr_truncate xlrec;
xlrec.blkno = nblocks;
- xlrec.rnode = rel->rd_node;
+ xlrec.rlocator = rel->rd_locator;
xlrec.flags = SMGR_TRUNCATE_ALL;
XLogBeginInsert();
@@ -428,7 +428,7 @@ RelationPreTruncate(Relation rel)
return;
pending = hash_search(pendingSyncHash,
- &(RelationGetSmgr(rel)->smgr_rnode.node),
+ &(RelationGetSmgr(rel)->smgr_rlocator.locator),
HASH_FIND, NULL);
if (pending)
pending->is_truncated = true;
@@ -472,7 +472,7 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
* We need to log the copied data in WAL iff WAL archiving/streaming is
* enabled AND it's a permanent relation. This gives the same answer as
* "RelationNeedsWAL(rel) || copying_initfork", because we know the
- * current operation created a new relfilenode.
+ * current operation created a new relfilelocator.
*/
use_wal = XLogIsNeeded() &&
(relpersistence == RELPERSISTENCE_PERMANENT || copying_initfork);
@@ -496,8 +496,8 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
* (errcontext callbacks shouldn't be risking any such thing, but
* people have been known to forget that rule.)
*/
- char *relpath = relpathbackend(src->smgr_rnode.node,
- src->smgr_rnode.backend,
+ char *relpath = relpathbackend(src->smgr_rlocator.locator,
+ src->smgr_rlocator.backend,
forkNum);
ereport(ERROR,
@@ -512,7 +512,7 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
* space.
*/
if (use_wal)
- log_newpage(&dst->smgr_rnode.node, forkNum, blkno, page, false);
+ log_newpage(&dst->smgr_rlocator.locator, forkNum, blkno, page, false);
PageSetChecksumInplace(page, blkno);
@@ -538,19 +538,19 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
}
/*
- * RelFileNodeSkippingWAL
- * Check if a BM_PERMANENT relfilenode is using WAL.
+ * RelFileLocatorSkippingWAL
+ * Check if a BM_PERMANENT relfilelocator is using WAL.
*
- * Changes of certain relfilenodes must not write WAL; see "Skipping WAL for
- * New RelFileNode" in src/backend/access/transam/README. Though it is known
- * from Relation efficiently, this function is intended for the code paths not
- * having access to Relation.
+ * Changes of certain relfilelocators must not write WAL; see "Skipping WAL
+ * for New RelFileLocator" in src/backend/access/transam/README. Though this
+ * is known efficiently from a Relation, this function is intended for code
+ * paths that do not have access to a Relation.
*/
bool
-RelFileNodeSkippingWAL(RelFileNode rnode)
+RelFileLocatorSkippingWAL(RelFileLocator rlocator)
{
if (!pendingSyncHash ||
- hash_search(pendingSyncHash, &rnode, HASH_FIND, NULL) == NULL)
+ hash_search(pendingSyncHash, &rlocator, HASH_FIND, NULL) == NULL)
return false;
return true;
@@ -566,7 +566,7 @@ EstimatePendingSyncsSpace(void)
long entries;
entries = pendingSyncHash ? hash_get_num_entries(pendingSyncHash) : 0;
- return mul_size(1 + entries, sizeof(RelFileNode));
+ return mul_size(1 + entries, sizeof(RelFileLocator));
}
/*
@@ -581,57 +581,58 @@ SerializePendingSyncs(Size maxSize, char *startAddress)
HASH_SEQ_STATUS scan;
PendingRelSync *sync;
PendingRelDelete *delete;
- RelFileNode *src;
- RelFileNode *dest = (RelFileNode *) startAddress;
+ RelFileLocator *src;
+ RelFileLocator *dest = (RelFileLocator *) startAddress;
if (!pendingSyncHash)
goto terminate;
- /* Create temporary hash to collect active relfilenodes */
- ctl.keysize = sizeof(RelFileNode);
- ctl.entrysize = sizeof(RelFileNode);
+ /* Create temporary hash to collect active relfilelocators */
+ ctl.keysize = sizeof(RelFileLocator);
+ ctl.entrysize = sizeof(RelFileLocator);
ctl.hcxt = CurrentMemoryContext;
- tmphash = hash_create("tmp relfilenodes",
+ tmphash = hash_create("tmp relfilelocators",
hash_get_num_entries(pendingSyncHash), &ctl,
HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
- /* collect all rnodes from pending syncs */
+ /* collect all rlocators from pending syncs */
hash_seq_init(&scan, pendingSyncHash);
while ((sync = (PendingRelSync *) hash_seq_search(&scan)))
- (void) hash_search(tmphash, &sync->rnode, HASH_ENTER, NULL);
+ (void) hash_search(tmphash, &sync->rlocator, HASH_ENTER, NULL);
/* remove deleted rnodes */
for (delete = pendingDeletes; delete != NULL; delete = delete->next)
if (delete->atCommit)
- (void) hash_search(tmphash, (void *) &delete->relnode,
+ (void) hash_search(tmphash, (void *) &delete->rlocator,
HASH_REMOVE, NULL);
hash_seq_init(&scan, tmphash);
- while ((src = (RelFileNode *) hash_seq_search(&scan)))
+ while ((src = (RelFileLocator *) hash_seq_search(&scan)))
*dest++ = *src;
hash_destroy(tmphash);
terminate:
- MemSet(dest, 0, sizeof(RelFileNode));
+ MemSet(dest, 0, sizeof(RelFileLocator));
}
/*
* RestorePendingSyncs
* Restore syncs within a parallel worker.
*
- * RelationNeedsWAL() and RelFileNodeSkippingWAL() must offer the correct
+ * RelationNeedsWAL() and RelFileLocatorSkippingWAL() must offer the correct
* answer to parallel workers. Only smgrDoPendingSyncs() reads the
* is_truncated field, at end of transaction. Hence, don't restore it.
*/
void
RestorePendingSyncs(char *startAddress)
{
- RelFileNode *rnode;
+ RelFileLocator *rlocator;
Assert(pendingSyncHash == NULL);
- for (rnode = (RelFileNode *) startAddress; rnode->relNode != 0; rnode++)
- AddPendingSync(rnode);
+ for (rlocator = (RelFileLocator *) startAddress; rlocator->relNumber != 0;
+ rlocator++)
+ AddPendingSync(rlocator);
}
/*
@@ -677,7 +678,7 @@ smgrDoPendingDeletes(bool isCommit)
{
SMgrRelation srel;
- srel = smgropen(pending->relnode, pending->backend);
+ srel = smgropen(pending->rlocator, pending->backend);
/* allocate the initial array, or extend it, if needed */
if (maxrels == 0)
@@ -747,7 +748,7 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
/* Skip syncing nodes that smgrDoPendingDeletes() will delete. */
for (pending = pendingDeletes; pending != NULL; pending = pending->next)
if (pending->atCommit)
- (void) hash_search(pendingSyncHash, (void *) &pending->relnode,
+ (void) hash_search(pendingSyncHash, (void *) &pending->rlocator,
HASH_REMOVE, NULL);
hash_seq_init(&scan, pendingSyncHash);
@@ -758,7 +759,7 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
BlockNumber total_blocks = 0;
SMgrRelation srel;
- srel = smgropen(pendingsync->rnode, InvalidBackendId);
+ srel = smgropen(pendingsync->rlocator, InvalidBackendId);
/*
* We emit newpage WAL records for smaller relations.
@@ -832,7 +833,7 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
* page including any unused space. ReadBufferExtended()
* counts some pgstat events; unfortunately, we discard them.
*/
- rel = CreateFakeRelcacheEntry(srel->smgr_rnode.node);
+ rel = CreateFakeRelcacheEntry(srel->smgr_rlocator.locator);
log_newpage_range(rel, fork, 0, n, false);
FreeFakeRelcacheEntry(rel);
}
@@ -852,7 +853,7 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
* smgrGetPendingDeletes() -- Get a list of non-temp relations to be deleted.
*
* The return value is the number of relations scheduled for termination.
- * *ptr is set to point to a freshly-palloc'd array of RelFileNodes.
+ * *ptr is set to point to a freshly-palloc'd array of RelFileLocators.
* If there are no relations to be deleted, *ptr is set to NULL.
*
* Only non-temporary relations are included in the returned list. This is OK
@@ -866,11 +867,11 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
* by upper-level transactions.
*/
int
-smgrGetPendingDeletes(bool forCommit, RelFileNode **ptr)
+smgrGetPendingDeletes(bool forCommit, RelFileLocator **ptr)
{
int nestLevel = GetCurrentTransactionNestLevel();
int nrels;
- RelFileNode *rptr;
+ RelFileLocator *rptr;
PendingRelDelete *pending;
nrels = 0;
@@ -885,14 +886,14 @@ smgrGetPendingDeletes(bool forCommit, RelFileNode **ptr)
*ptr = NULL;
return 0;
}
- rptr = (RelFileNode *) palloc(nrels * sizeof(RelFileNode));
+ rptr = (RelFileLocator *) palloc(nrels * sizeof(RelFileLocator));
*ptr = rptr;
for (pending = pendingDeletes; pending != NULL; pending = pending->next)
{
if (pending->nestLevel >= nestLevel && pending->atCommit == forCommit
&& pending->backend == InvalidBackendId)
{
- *rptr = pending->relnode;
+ *rptr = pending->rlocator;
rptr++;
}
}
@@ -967,7 +968,7 @@ smgr_redo(XLogReaderState *record)
xl_smgr_create *xlrec = (xl_smgr_create *) XLogRecGetData(record);
SMgrRelation reln;
- reln = smgropen(xlrec->rnode, InvalidBackendId);
+ reln = smgropen(xlrec->rlocator, InvalidBackendId);
smgrcreate(reln, xlrec->forkNum, true);
}
else if (info == XLOG_SMGR_TRUNCATE)
@@ -980,7 +981,7 @@ smgr_redo(XLogReaderState *record)
int nforks = 0;
bool need_fsm_vacuum = false;
- reln = smgropen(xlrec->rnode, InvalidBackendId);
+ reln = smgropen(xlrec->rlocator, InvalidBackendId);
/*
* Forcibly create relation if it doesn't exist (which suggests that
@@ -1015,11 +1016,11 @@ smgr_redo(XLogReaderState *record)
nforks++;
/* Also tell xlogutils.c about it */
- XLogTruncateRelation(xlrec->rnode, MAIN_FORKNUM, xlrec->blkno);
+ XLogTruncateRelation(xlrec->rlocator, MAIN_FORKNUM, xlrec->blkno);
}
/* Prepare for truncation of FSM and VM too */
- rel = CreateFakeRelcacheEntry(xlrec->rnode);
+ rel = CreateFakeRelcacheEntry(xlrec->rlocator);
if ((xlrec->flags & SMGR_TRUNCATE_FSM) != 0 &&
smgrexists(reln, FSM_FORKNUM))
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index cea2c8b..a8de473 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -293,7 +293,7 @@ cluster_multiple_rels(List *rtcs, ClusterParams *params)
* cluster_rel
*
* This clusters the table by creating a new, clustered table and
- * swapping the relfilenodes of the new table and the old table, so
+ * swapping the relfilenumbers of the new table and the old table, so
* the OID of the original table is preserved. Thus we do not lose
* GRANT, inheritance nor references to this table (this was a bug
* in releases through 7.3).
@@ -1025,8 +1025,8 @@ copy_table_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
/*
* Swap the physical files of two given relations.
*
- * We swap the physical identity (reltablespace, relfilenode) while keeping the
- * same logical identities of the two relations. relpersistence is also
+ * We swap the physical identity (reltablespace, relfilenumber) while keeping
+ * the same logical identities of the two relations. relpersistence is also
* swapped, which is critical since it determines where buffers live for each
* relation.
*
@@ -1061,8 +1061,8 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
reltup2;
Form_pg_class relform1,
relform2;
- Oid relfilenode1,
- relfilenode2;
+ Oid relfilenumber1,
+ relfilenumber2;
Oid swaptemp;
char swptmpchr;
@@ -1079,13 +1079,13 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
elog(ERROR, "cache lookup failed for relation %u", r2);
relform2 = (Form_pg_class) GETSTRUCT(reltup2);
- relfilenode1 = relform1->relfilenode;
- relfilenode2 = relform2->relfilenode;
+ relfilenumber1 = relform1->relfilenode;
+ relfilenumber2 = relform2->relfilenode;
- if (OidIsValid(relfilenode1) && OidIsValid(relfilenode2))
+ if (OidIsValid(relfilenumber1) && OidIsValid(relfilenumber2))
{
/*
- * Normal non-mapped relations: swap relfilenodes, reltablespaces,
+ * Normal non-mapped relations: swap relfilenumbers, reltablespaces,
* relpersistence
*/
Assert(!target_is_pg_class);
@@ -1120,7 +1120,7 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
* Mapped-relation case. Here we have to swap the relation mappings
* instead of modifying the pg_class columns. Both must be mapped.
*/
- if (OidIsValid(relfilenode1) || OidIsValid(relfilenode2))
+ if (OidIsValid(relfilenumber1) || OidIsValid(relfilenumber2))
elog(ERROR, "cannot swap mapped relation \"%s\" with non-mapped relation",
NameStr(relform1->relname));
@@ -1148,12 +1148,12 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
/*
* Fetch the mappings --- shouldn't fail, but be paranoid
*/
- relfilenode1 = RelationMapOidToFilenode(r1, relform1->relisshared);
- if (!OidIsValid(relfilenode1))
+ relfilenumber1 = RelationMapOidToFilenumber(r1, relform1->relisshared);
+ if (!OidIsValid(relfilenumber1))
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
NameStr(relform1->relname), r1);
- relfilenode2 = RelationMapOidToFilenode(r2, relform2->relisshared);
- if (!OidIsValid(relfilenode2))
+ relfilenumber2 = RelationMapOidToFilenumber(r2, relform2->relisshared);
+ if (!OidIsValid(relfilenumber2))
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
NameStr(relform2->relname), r2);
@@ -1161,15 +1161,15 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
* Send replacement mappings to relmapper. Note these won't actually
* take effect until CommandCounterIncrement.
*/
- RelationMapUpdateMap(r1, relfilenode2, relform1->relisshared, false);
- RelationMapUpdateMap(r2, relfilenode1, relform2->relisshared, false);
+ RelationMapUpdateMap(r1, relfilenumber2, relform1->relisshared, false);
+ RelationMapUpdateMap(r2, relfilenumber1, relform2->relisshared, false);
/* Pass OIDs of mapped r2 tables back to caller */
*mapped_tables++ = r2;
}
/*
- * Recognize that rel1's relfilenode (swapped from rel2) is new in this
+ * Recognize that rel1's relfilenumber (swapped from rel2) is new in this
* subtransaction. The rel2 storage (swapped from rel1) may or may not be
* new.
*/
@@ -1180,9 +1180,9 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
rel1 = relation_open(r1, NoLock);
rel2 = relation_open(r2, NoLock);
rel2->rd_createSubid = rel1->rd_createSubid;
- rel2->rd_newRelfilenodeSubid = rel1->rd_newRelfilenodeSubid;
- rel2->rd_firstRelfilenodeSubid = rel1->rd_firstRelfilenodeSubid;
- RelationAssumeNewRelfilenode(rel1);
+ rel2->rd_newRelfilelocatorSubid = rel1->rd_newRelfilelocatorSubid;
+ rel2->rd_firstRelfilelocatorSubid = rel1->rd_firstRelfilelocatorSubid;
+ RelationAssumeNewRelfilelocator(rel1);
relation_close(rel1, NoLock);
relation_close(rel2, NoLock);
}
@@ -1523,7 +1523,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
table_close(relRelation, RowExclusiveLock);
}
- /* Destroy new heap with old filenode */
+ /* Destroy new heap with old filenumber */
object.classId = RelationRelationId;
object.objectId = OIDNewHeap;
object.objectSubId = 0;
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 35a1d3a..c985fea 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -593,11 +593,11 @@ CopyFrom(CopyFromState cstate)
*/
if (RELKIND_HAS_STORAGE(cstate->rel->rd_rel->relkind) &&
(cstate->rel->rd_createSubid != InvalidSubTransactionId ||
- cstate->rel->rd_firstRelfilenodeSubid != InvalidSubTransactionId))
+ cstate->rel->rd_firstRelfilelocatorSubid != InvalidSubTransactionId))
ti_options |= TABLE_INSERT_SKIP_FSM;
/*
- * Optimize if new relfilenode was created in this subxact or one of its
+ * Optimize if new relfilenumber was created in this subxact or one of its
* committed children and we won't see those rows later as part of an
* earlier scan or command. The subxact test ensures that if this subxact
* aborts then the frozen rows won't be visible after xact cleanup. Note
@@ -640,7 +640,7 @@ CopyFrom(CopyFromState cstate)
errmsg("cannot perform COPY FREEZE because of prior transaction activity")));
if (cstate->rel->rd_createSubid != GetCurrentSubTransactionId() &&
- cstate->rel->rd_newRelfilenodeSubid != GetCurrentSubTransactionId())
+ cstate->rel->rd_newRelfilelocatorSubid != GetCurrentSubTransactionId())
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("cannot perform COPY FREEZE because the table was not created or truncated in the current subtransaction")));
diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index f269168..3982097 100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -101,7 +101,7 @@ typedef struct
*/
typedef struct CreateDBRelInfo
{
- RelFileNode rnode; /* physical relation identifier */
+ RelFileLocator rlocator; /* physical relation identifier */
Oid reloid; /* relation oid */
bool permanent; /* relation is permanent or unlogged */
} CreateDBRelInfo;
@@ -127,7 +127,7 @@ static void CreateDatabaseUsingWalLog(Oid src_dboid, Oid dboid, Oid src_tsid,
static List *ScanSourceDatabasePgClass(Oid srctbid, Oid srcdbid, char *srcpath);
static List *ScanSourceDatabasePgClassPage(Page page, Buffer buf, Oid tbid,
Oid dbid, char *srcpath,
- List *rnodelist, Snapshot snapshot);
+ List *rlocatorlist, Snapshot snapshot);
static CreateDBRelInfo *ScanSourceDatabasePgClassTuple(HeapTupleData *tuple,
Oid tbid, Oid dbid,
char *srcpath);
@@ -147,12 +147,12 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
{
char *srcpath;
char *dstpath;
- List *rnodelist = NULL;
+ List *rlocatorlist = NULL;
ListCell *cell;
LockRelId srcrelid;
LockRelId dstrelid;
- RelFileNode srcrnode;
- RelFileNode dstrnode;
+ RelFileLocator srcrlocator;
+ RelFileLocator dstrlocator;
CreateDBRelInfo *relinfo;
/* Get source and destination database paths. */
@@ -165,9 +165,9 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
/* Copy relmap file from source database to the destination database. */
RelationMapCopy(dst_dboid, dst_tsid, srcpath, dstpath);
- /* Get list of relfilenodes to copy from the source database. */
- rnodelist = ScanSourceDatabasePgClass(src_tsid, src_dboid, srcpath);
- Assert(rnodelist != NIL);
+ /* Get list of relfilelocators to copy from the source database. */
+ rlocatorlist = ScanSourceDatabasePgClass(src_tsid, src_dboid, srcpath);
+ Assert(rlocatorlist != NIL);
/*
* Database IDs will be the same for all relations so set them before
@@ -176,11 +176,11 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
srcrelid.dbId = src_dboid;
dstrelid.dbId = dst_dboid;
- /* Loop over our list of relfilenodes and copy each one. */
- foreach(cell, rnodelist)
+ /* Loop over our list of relfilelocators and copy each one. */
+ foreach(cell, rlocatorlist)
{
relinfo = lfirst(cell);
- srcrnode = relinfo->rnode;
+ srcrlocator = relinfo->rlocator;
/*
* If the relation is from the source db's default tablespace then we
@@ -188,13 +188,13 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
* Otherwise, we need to create in the same tablespace as it is in the
* source database.
*/
- if (srcrnode.spcNode == src_tsid)
- dstrnode.spcNode = dst_tsid;
+ if (srcrlocator.spcOid == src_tsid)
+ dstrlocator.spcOid = dst_tsid;
else
- dstrnode.spcNode = srcrnode.spcNode;
+ dstrlocator.spcOid = srcrlocator.spcOid;
- dstrnode.dbNode = dst_dboid;
- dstrnode.relNode = srcrnode.relNode;
+ dstrlocator.dbOid = dst_dboid;
+ dstrlocator.relNumber = srcrlocator.relNumber;
/*
* Acquire locks on source and target relations before copying.
@@ -210,7 +210,7 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
LockRelationId(&dstrelid, AccessShareLock);
/* Copy relation storage from source to the destination. */
- CreateAndCopyRelationData(srcrnode, dstrnode, relinfo->permanent);
+ CreateAndCopyRelationData(srcrlocator, dstrlocator, relinfo->permanent);
/* Release the relation locks. */
UnlockRelationId(&srcrelid, AccessShareLock);
@@ -219,7 +219,7 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
pfree(srcpath);
pfree(dstpath);
- list_free_deep(rnodelist);
+ list_free_deep(rlocatorlist);
}
/*
@@ -246,31 +246,31 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
static List *
ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber nblocks;
BlockNumber blkno;
Buffer buf;
- Oid relfilenode;
+ Oid relfilenumber;
Page page;
- List *rnodelist = NIL;
+ List *rlocatorlist = NIL;
LockRelId relid;
Relation rel;
Snapshot snapshot;
BufferAccessStrategy bstrategy;
- /* Get pg_class relfilenode. */
- relfilenode = RelationMapOidToFilenodeForDatabase(srcpath,
- RelationRelationId);
+ /* Get pg_class relfilenumber. */
+ relfilenumber = RelationMapOidToFilenumberForDatabase(srcpath,
+ RelationRelationId);
/* Don't read data into shared_buffers without holding a relation lock. */
relid.dbId = dbid;
relid.relId = RelationRelationId;
LockRelationId(&relid, AccessShareLock);
- /* Prepare a RelFileNode for the pg_class relation. */
- rnode.spcNode = tbid;
- rnode.dbNode = dbid;
- rnode.relNode = relfilenode;
+ /* Prepare a RelFileLocator for the pg_class relation. */
+ rlocator.spcOid = tbid;
+ rlocator.dbOid = dbid;
+ rlocator.relNumber = relfilenumber;
/*
* We can't use a real relcache entry for a relation in some other
@@ -279,7 +279,7 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
* used the smgr layer directly, we would have to worry about
* invalidations.
*/
- rel = CreateFakeRelcacheEntry(rnode);
+ rel = CreateFakeRelcacheEntry(rlocator);
nblocks = smgrnblocks(RelationGetSmgr(rel), MAIN_FORKNUM);
FreeFakeRelcacheEntry(rel);
@@ -299,7 +299,7 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
{
CHECK_FOR_INTERRUPTS();
- buf = ReadBufferWithoutRelcache(rnode, MAIN_FORKNUM, blkno,
+ buf = ReadBufferWithoutRelcache(rlocator, MAIN_FORKNUM, blkno,
RBM_NORMAL, bstrategy, false);
LockBuffer(buf, BUFFER_LOCK_SHARE);
@@ -310,9 +310,9 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
continue;
}
- /* Append relevant pg_class tuples for current page to rnodelist. */
- rnodelist = ScanSourceDatabasePgClassPage(page, buf, tbid, dbid,
- srcpath, rnodelist,
+ /* Append relevant pg_class tuples for current page to rlocatorlist. */
+ rlocatorlist = ScanSourceDatabasePgClassPage(page, buf, tbid, dbid,
+ srcpath, rlocatorlist,
snapshot);
UnlockReleaseBuffer(buf);
@@ -321,16 +321,16 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
/* Release relation lock. */
UnlockRelationId(&relid, AccessShareLock);
- return rnodelist;
+ return rlocatorlist;
}
/*
* Scan one page of the source database's pg_class relation and add relevant
- * entries to rnodelist. The return value is the updated list.
+ * entries to rlocatorlist. The return value is the updated list.
*/
static List *
ScanSourceDatabasePgClassPage(Page page, Buffer buf, Oid tbid, Oid dbid,
- char *srcpath, List *rnodelist,
+ char *srcpath, List *rlocatorlist,
Snapshot snapshot)
{
BlockNumber blkno = BufferGetBlockNumber(buf);
@@ -376,11 +376,11 @@ ScanSourceDatabasePgClassPage(Page page, Buffer buf, Oid tbid, Oid dbid,
relinfo = ScanSourceDatabasePgClassTuple(&tuple, tbid, dbid,
srcpath);
if (relinfo != NULL)
- rnodelist = lappend(rnodelist, relinfo);
+ rlocatorlist = lappend(rlocatorlist, relinfo);
}
}
- return rnodelist;
+ return rlocatorlist;
}
/*
@@ -397,7 +397,7 @@ ScanSourceDatabasePgClassTuple(HeapTupleData *tuple, Oid tbid, Oid dbid,
{
CreateDBRelInfo *relinfo;
Form_pg_class classForm;
- Oid relfilenode = InvalidOid;
+ Oid relfilenumber = InvalidOid;
classForm = (Form_pg_class) GETSTRUCT(tuple);
@@ -418,29 +418,29 @@ ScanSourceDatabasePgClassTuple(HeapTupleData *tuple, Oid tbid, Oid dbid,
return NULL;
/*
- * If relfilenode is valid then directly use it. Otherwise, consult the
+ * If relfilenumber is valid then directly use it. Otherwise, consult the
* relmap.
*/
if (OidIsValid(classForm->relfilenode))
- relfilenode = classForm->relfilenode;
+ relfilenumber = classForm->relfilenode;
else
- relfilenode = RelationMapOidToFilenodeForDatabase(srcpath,
- classForm->oid);
+ relfilenumber = RelationMapOidToFilenumberForDatabase(srcpath,
+ classForm->oid);
- /* We must have a valid relfilenode oid. */
- if (!OidIsValid(relfilenode))
- elog(ERROR, "relation with OID %u does not have a valid relfilenode",
+ /* We must have a valid relfilenumber oid. */
+ if (!OidIsValid(relfilenumber))
+ elog(ERROR, "relation with OID %u does not have a valid relfilenumber",
classForm->oid);
/* Prepare a rel info element and add it to the list. */
relinfo = (CreateDBRelInfo *) palloc(sizeof(CreateDBRelInfo));
if (OidIsValid(classForm->reltablespace))
- relinfo->rnode.spcNode = classForm->reltablespace;
+ relinfo->rlocator.spcOid = classForm->reltablespace;
else
- relinfo->rnode.spcNode = tbid;
+ relinfo->rlocator.spcOid = tbid;
- relinfo->rnode.dbNode = dbid;
- relinfo->rnode.relNode = relfilenode;
+ relinfo->rlocator.dbOid = dbid;
+ relinfo->rlocator.relNumber = relfilenumber;
relinfo->reloid = classForm->oid;
/* Temporary relations were rejected above. */
@@ -2867,8 +2867,8 @@ remove_dbtablespaces(Oid db_id)
* try to remove that already-existing subdirectory during the cleanup in
* remove_dbtablespaces. Nuking existing files seems like a bad idea, so
* instead we make this extra check before settling on the OID of the new
- * database. This exactly parallels what GetNewRelFileNode() does for table
- * relfilenode values.
+ * database. This exactly parallels what GetNewRelFileNumber() does for table
+ * relfilenumber values.
*/
static bool
check_db_file_conflict(Oid db_id)
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index eac13ac..9754585 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -1093,10 +1093,10 @@ DefineIndex(Oid relationId,
}
/*
- * A valid stmt->oldNode implies that we already have a built form of the
+ * A valid stmt->oldNumber implies that we already have a built form of the
* index. The caller should also decline any index build.
*/
- Assert(!OidIsValid(stmt->oldNode) || (skip_build && !concurrent));
+ Assert(!OidIsValid(stmt->oldNumber) || (skip_build && !concurrent));
/*
* Make the catalog entries for the index, including constraints. This
@@ -1138,7 +1138,7 @@ DefineIndex(Oid relationId,
indexRelationId =
index_create(rel, indexRelationName, indexRelationId, parentIndexId,
parentConstraintId,
- stmt->oldNode, indexInfo, indexColNames,
+ stmt->oldNumber, indexInfo, indexColNames,
accessMethodId, tablespaceId,
collationObjectId, classObjectId,
coloptions, reloptions,
@@ -1348,15 +1348,15 @@ DefineIndex(Oid relationId,
* We can't use the same index name for the child index,
* so clear idxname to let the recursive invocation choose
* a new name. Likewise, the existing target relation
- * field is wrong, and if indexOid or oldNode are set,
+ * field is wrong, and if indexOid or oldNumber are set,
* they mustn't be applied to the child either.
*/
childStmt->idxname = NULL;
childStmt->relation = NULL;
childStmt->indexOid = InvalidOid;
- childStmt->oldNode = InvalidOid;
+ childStmt->oldNumber = InvalidOid;
childStmt->oldCreateSubid = InvalidSubTransactionId;
- childStmt->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ childStmt->oldFirstRelfilenumberSubid = InvalidSubTransactionId;
/*
* Adjust any Vars (both in expressions and in the index's
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index d1ee106..9ac0383 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -118,7 +118,7 @@ SetMatViewPopulatedState(Relation relation, bool newstate)
* ExecRefreshMatView -- execute a REFRESH MATERIALIZED VIEW command
*
* This refreshes the materialized view by creating a new table and swapping
- * the relfilenodes of the new table and the old materialized view, so the OID
+ * the relfilenumbers of the new table and the old materialized view, so the OID
* of the original materialized view is preserved. Thus we do not lose GRANT
* nor references to this materialized view.
*
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index ddf219b..faee605 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -75,7 +75,7 @@ typedef struct sequence_magic
typedef struct SeqTableData
{
Oid relid; /* pg_class OID of this sequence (hash key) */
- Oid filenode; /* last seen relfilenode of this sequence */
+ Oid filenumber; /* last seen relfilenumber of this sequence */
LocalTransactionId lxid; /* xact in which we last did a seq op */
bool last_valid; /* do we have a valid "last" value? */
int64 last; /* value last returned by nextval */
@@ -255,7 +255,7 @@ DefineSequence(ParseState *pstate, CreateSeqStmt *seq)
*
* The change is made transactionally, so that on failure of the current
* transaction, the sequence will be restored to its previous state.
- * We do that by creating a whole new relfilenode for the sequence; so this
+ * We do that by creating a whole new relfilenumber for the sequence; so this
* works much like the rewriting forms of ALTER TABLE.
*
* Caller is assumed to have acquired AccessExclusiveLock on the sequence,
@@ -310,7 +310,7 @@ ResetSequence(Oid seq_relid)
/*
* Create a new storage file for the sequence.
*/
- RelationSetNewRelfilenode(seq_rel, seq_rel->rd_rel->relpersistence);
+ RelationSetNewRelfilenumber(seq_rel, seq_rel->rd_rel->relpersistence);
/*
* Ensure sequence's relfrozenxid is at 0, since it won't contain any
@@ -347,9 +347,9 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
{
SMgrRelation srel;
- srel = smgropen(rel->rd_node, InvalidBackendId);
+ srel = smgropen(rel->rd_locator, InvalidBackendId);
smgrcreate(srel, INIT_FORKNUM, false);
- log_smgrcreate(&rel->rd_node, INIT_FORKNUM);
+ log_smgrcreate(&rel->rd_locator, INIT_FORKNUM);
fill_seq_fork_with_data(rel, tuple, INIT_FORKNUM);
FlushRelationBuffers(rel);
smgrclose(srel);
@@ -418,7 +418,7 @@ fill_seq_fork_with_data(Relation rel, HeapTuple tuple, ForkNumber forkNum)
XLogBeginInsert();
XLogRegisterBuffer(0, buf, REGBUF_WILL_INIT);
- xlrec.node = rel->rd_node;
+ xlrec.locator = rel->rd_locator;
XLogRegisterData((char *) &xlrec, sizeof(xl_seq_rec));
XLogRegisterData((char *) tuple->t_data, tuple->t_len);
@@ -509,7 +509,7 @@ AlterSequence(ParseState *pstate, AlterSeqStmt *stmt)
* Create a new storage file for the sequence, making the state
* changes transactional.
*/
- RelationSetNewRelfilenode(seqrel, seqrel->rd_rel->relpersistence);
+ RelationSetNewRelfilenumber(seqrel, seqrel->rd_rel->relpersistence);
/*
* Ensure sequence's relfrozenxid is at 0, since it won't contain any
@@ -557,7 +557,7 @@ SequenceChangePersistence(Oid relid, char newrelpersistence)
GetTopTransactionId();
(void) read_seq_tuple(seqrel, &buf, &seqdatatuple);
- RelationSetNewRelfilenode(seqrel, newrelpersistence);
+ RelationSetNewRelfilenumber(seqrel, newrelpersistence);
fill_seq_with_data(seqrel, &seqdatatuple);
UnlockReleaseBuffer(buf);
@@ -836,7 +836,7 @@ nextval_internal(Oid relid, bool check_permissions)
seq->is_called = true;
seq->log_cnt = 0;
- xlrec.node = seqrel->rd_node;
+ xlrec.locator = seqrel->rd_locator;
XLogRegisterData((char *) &xlrec, sizeof(xl_seq_rec));
XLogRegisterData((char *) seqdatatuple.t_data, seqdatatuple.t_len);
@@ -1023,7 +1023,7 @@ do_setval(Oid relid, int64 next, bool iscalled)
XLogBeginInsert();
XLogRegisterBuffer(0, buf, REGBUF_WILL_INIT);
- xlrec.node = seqrel->rd_node;
+ xlrec.locator = seqrel->rd_locator;
XLogRegisterData((char *) &xlrec, sizeof(xl_seq_rec));
XLogRegisterData((char *) seqdatatuple.t_data, seqdatatuple.t_len);
@@ -1147,7 +1147,7 @@ init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel)
if (!found)
{
/* relid already filled in */
- elm->filenode = InvalidOid;
+ elm->filenumber = InvalidOid;
elm->lxid = InvalidLocalTransactionId;
elm->last_valid = false;
elm->last = elm->cached = 0;
@@ -1169,9 +1169,9 @@ init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel)
* discard any cached-but-unissued values. We do not touch the currval()
* state, however.
*/
- if (seqrel->rd_rel->relfilenode != elm->filenode)
+ if (seqrel->rd_rel->relfilenode != elm->filenumber)
{
- elm->filenode = seqrel->rd_rel->relfilenode;
+ elm->filenumber = seqrel->rd_rel->relfilenode;
elm->cached = elm->last;
}
@@ -1254,7 +1254,8 @@ read_seq_tuple(Relation rel, Buffer *buf, HeapTuple seqdatatuple)
* changed. This allows ALTER SEQUENCE to behave transactionally. Currently,
* the only option that doesn't cause that is OWNED BY. It's *necessary* for
* ALTER SEQUENCE OWNED BY to not rewrite the sequence, because that would
- * break pg_upgrade by causing unwanted changes in the sequence's relfilenode.
+ * break pg_upgrade by causing unwanted changes in the sequence's
+ * relfilenumber.
*/
static void
init_params(ParseState *pstate, List *options, bool for_identity,
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 2de0eba..72a5b88 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -596,7 +596,7 @@ static void ATExecForceNoForceRowSecurity(Relation rel, bool force_rls);
static ObjectAddress ATExecSetCompression(AlteredTableInfo *tab, Relation rel,
const char *column, Node *newValue, LOCKMODE lockmode);
-static void index_copy_data(Relation rel, RelFileNode newrnode);
+static void index_copy_data(Relation rel, RelFileLocator newrlocator);
static const char *storage_name(char c);
static void RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid,
@@ -1986,12 +1986,12 @@ ExecuteTruncateGuts(List *explicit_rels,
/*
* Normally, we need a transaction-safe truncation here. However, if
* the table was either created in the current (sub)transaction or has
- * a new relfilenode in the current (sub)transaction, then we can just
+ * a new relfilenumber in the current (sub)transaction, then we can just
* truncate it in-place, because a rollback would cause the whole
* table or the current physical file to be thrown away anyway.
*/
if (rel->rd_createSubid == mySubid ||
- rel->rd_newRelfilenodeSubid == mySubid)
+ rel->rd_newRelfilelocatorSubid == mySubid)
{
/* Immediate, non-rollbackable truncation is OK */
heap_truncate_one_rel(rel);
@@ -2014,10 +2014,10 @@ ExecuteTruncateGuts(List *explicit_rels,
* Need the full transaction-safe pushups.
*
* Create a new empty storage file for the relation, and assign it
- * as the relfilenode value. The old storage file is scheduled for
+ * as the relfilenumber value. The old storage file is scheduled for
* deletion at commit.
*/
- RelationSetNewRelfilenode(rel, rel->rd_rel->relpersistence);
+ RelationSetNewRelfilenumber(rel, rel->rd_rel->relpersistence);
heap_relid = RelationGetRelid(rel);
@@ -2030,7 +2030,7 @@ ExecuteTruncateGuts(List *explicit_rels,
Relation toastrel = relation_open(toast_relid,
AccessExclusiveLock);
- RelationSetNewRelfilenode(toastrel,
+ RelationSetNewRelfilenumber(toastrel,
toastrel->rd_rel->relpersistence);
table_close(toastrel, NoLock);
}
@@ -3315,10 +3315,10 @@ CheckRelationTableSpaceMove(Relation rel, Oid newTableSpaceId)
/*
* SetRelationTableSpace
- * Set new reltablespace and relfilenode in pg_class entry.
+ * Set new reltablespace and relfilenumber in pg_class entry.
*
* newTableSpaceId is the new tablespace for the relation, and
- * newRelFileNode its new filenode. If newRelFileNode is InvalidOid,
+ * newRelFilenumber its new filenumber. If newRelFilenumber is InvalidOid,
* this field is not updated.
*
* NOTE: The caller must hold AccessExclusiveLock on the relation.
@@ -3331,7 +3331,7 @@ CheckRelationTableSpaceMove(Relation rel, Oid newTableSpaceId)
void
SetRelationTableSpace(Relation rel,
Oid newTableSpaceId,
- Oid newRelFileNode)
+ Oid newRelFilenumber)
{
Relation pg_class;
HeapTuple tuple;
@@ -3351,8 +3351,8 @@ SetRelationTableSpace(Relation rel,
/* Update the pg_class row. */
rd_rel->reltablespace = (newTableSpaceId == MyDatabaseTableSpace) ?
InvalidOid : newTableSpaceId;
- if (OidIsValid(newRelFileNode))
- rd_rel->relfilenode = newRelFileNode;
+ if (OidIsValid(newRelFilenumber))
+ rd_rel->relfilenode = newRelFilenumber;
CatalogTupleUpdate(pg_class, &tuple->t_self, tuple);
/*
@@ -5420,7 +5420,7 @@ ATRewriteTables(AlterTableStmt *parsetree, List **wqueue, LOCKMODE lockmode,
* persistence: on one hand, we need to ensure that the buffers
* belonging to each of the two relations are marked with or without
* BM_PERMANENT properly. On the other hand, since rewriting creates
- * and assigns a new relfilenode, we automatically create or drop an
+ * and assigns a new relfilenumber, we automatically create or drop an
* init fork for the relation as appropriate.
*/
if (tab->rewrite > 0 && tab->relkind != RELKIND_SEQUENCE)
@@ -5506,12 +5506,13 @@ ATRewriteTables(AlterTableStmt *parsetree, List **wqueue, LOCKMODE lockmode,
* Create transient table that will receive the modified data.
*
* Ensure it is marked correctly as logged or unlogged. We have
- * to do this here so that buffers for the new relfilenode will
+ * to do this here so that buffers for the new relfilenumber will
* have the right persistence set, and at the same time ensure
- * that the original filenode's buffers will get read in with the
- * correct setting (i.e. the original one). Otherwise a rollback
- * after the rewrite would possibly result with buffers for the
- * original filenode having the wrong persistence setting.
+ * that the original filenumber's buffers will get read in with
+ * the correct setting (i.e. the original one). Otherwise a
+ * rollback after the rewrite could result in buffers for the
+ * original filenumber having the wrong persistence setting.
*
* NB: This relies on swap_relation_files() also swapping the
* persistence. That wouldn't work for pg_class, but that can't be
@@ -8597,7 +8598,7 @@ ATExecAddIndex(AlteredTableInfo *tab, Relation rel,
/* suppress schema rights check when rebuilding existing index */
check_rights = !is_rebuild;
/* skip index build if phase 3 will do it or we're reusing an old one */
- skip_build = tab->rewrite > 0 || OidIsValid(stmt->oldNode);
+ skip_build = tab->rewrite > 0 || OidIsValid(stmt->oldNumber);
/* suppress notices when rebuilding existing index */
quiet = is_rebuild;
@@ -8613,7 +8614,7 @@ ATExecAddIndex(AlteredTableInfo *tab, Relation rel,
quiet);
/*
- * If TryReuseIndex() stashed a relfilenode for us, we used it for the new
+ * If TryReuseIndex() stashed a relfilenumber for us, we used it for the new
* index instead of building from scratch. Restore associated fields.
* This may store InvalidSubTransactionId in both fields, in which case
* relcache.c will assume it can rebuild the relcache entry. Hence, do
@@ -8621,13 +8622,13 @@ ATExecAddIndex(AlteredTableInfo *tab, Relation rel,
* DROP of the old edition of this index will have scheduled the storage
* for deletion at commit, so cancel that pending deletion.
*/
- if (OidIsValid(stmt->oldNode))
+ if (OidIsValid(stmt->oldNumber))
{
Relation irel = index_open(address.objectId, NoLock);
irel->rd_createSubid = stmt->oldCreateSubid;
- irel->rd_firstRelfilenodeSubid = stmt->oldFirstRelfilenodeSubid;
- RelationPreserveStorage(irel->rd_node, true);
+ irel->rd_firstRelfilelocatorSubid = stmt->oldFirstRelfilenumberSubid;
+ RelationPreserveStorage(irel->rd_locator, true);
index_close(irel, NoLock);
}
@@ -13491,9 +13492,9 @@ TryReuseIndex(Oid oldId, IndexStmt *stmt)
/* If it's a partitioned index, there is no storage to share. */
if (irel->rd_rel->relkind != RELKIND_PARTITIONED_INDEX)
{
- stmt->oldNode = irel->rd_node.relNode;
+ stmt->oldNumber = irel->rd_locator.relNumber;
stmt->oldCreateSubid = irel->rd_createSubid;
- stmt->oldFirstRelfilenodeSubid = irel->rd_firstRelfilenodeSubid;
+ stmt->oldFirstRelfilenumberSubid = irel->rd_firstRelfilelocatorSubid;
}
index_close(irel, NoLock);
}
@@ -14340,8 +14341,8 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
{
Relation rel;
Oid reltoastrelid;
- Oid newrelfilenode;
- RelFileNode newrnode;
+ Oid newrelnumber;
+ RelFileLocator newrlocator;
List *reltoastidxids = NIL;
ListCell *lc;
@@ -14370,26 +14371,28 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
}
/*
- * Relfilenodes are not unique in databases across tablespaces, so we need
+ * Relfilenumbers are not unique in databases across tablespaces, so we need
* to allocate a new one in the new tablespace.
*/
- newrelfilenode = GetNewRelFileNode(newTableSpace, NULL,
- rel->rd_rel->relpersistence);
+ newrelnumber = GetNewRelFileNumber(newTableSpace, NULL,
+ rel->rd_rel->relpersistence);
/* Open old and new relation */
- newrnode = rel->rd_node;
- newrnode.relNode = newrelfilenode;
- newrnode.spcNode = newTableSpace;
+ newrlocator = rel->rd_locator;
+ newrlocator.relNumber = newrelnumber;
+ newrlocator.spcOid = newTableSpace;
- /* hand off to AM to actually create the new filenode and copy the data */
+ /*
+ * hand off to AM to actually create the new filelocator and copy the data
+ */
if (rel->rd_rel->relkind == RELKIND_INDEX)
{
- index_copy_data(rel, newrnode);
+ index_copy_data(rel, newrlocator);
}
else
{
Assert(RELKIND_HAS_TABLE_AM(rel->rd_rel->relkind));
- table_relation_copy_data(rel, &newrnode);
+ table_relation_copy_data(rel, &newrlocator);
}
/*
@@ -14400,11 +14403,11 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
* the updated pg_class entry), but that's forbidden with
* CheckRelationTableSpaceMove().
*/
- SetRelationTableSpace(rel, newTableSpace, newrelfilenode);
+ SetRelationTableSpace(rel, newTableSpace, newrelnumber);
InvokeObjectPostAlterHook(RelationRelationId, RelationGetRelid(rel), 0);
- RelationAssumeNewRelfilenode(rel);
+ RelationAssumeNewRelfilelocator(rel);
relation_close(rel, NoLock);
@@ -14630,11 +14633,11 @@ AlterTableMoveAll(AlterTableMoveAllStmt *stmt)
}
static void
-index_copy_data(Relation rel, RelFileNode newrnode)
+index_copy_data(Relation rel, RelFileLocator newrlocator)
{
SMgrRelation dstrel;
- dstrel = smgropen(newrnode, rel->rd_backend);
+ dstrel = smgropen(newrlocator, rel->rd_backend);
/*
* Since we copy the file directly without looking at the shared buffers,
@@ -14648,10 +14651,10 @@ index_copy_data(Relation rel, RelFileNode newrnode)
* Create and copy all forks of the relation, and schedule unlinking of
* old physical files.
*
- * NOTE: any conflict in relfilenode value will be caught in
+ * NOTE: any conflict in relfilelocator value will be caught in
* RelationCreateStorage().
*/
- RelationCreateStorage(newrnode, rel->rd_rel->relpersistence, true);
+ RelationCreateStorage(newrlocator, rel->rd_rel->relpersistence, true);
/* copy main fork */
RelationCopyStorage(RelationGetSmgr(rel), dstrel, MAIN_FORKNUM,
@@ -14672,7 +14675,7 @@ index_copy_data(Relation rel, RelFileNode newrnode)
if (RelationIsPermanent(rel) ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
- log_smgrcreate(&newrnode, forkNum);
+ log_smgrcreate(&newrlocator, forkNum);
RelationCopyStorage(RelationGetSmgr(rel), dstrel, forkNum,
rel->rd_rel->relpersistence);
}
diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index 00ca397..c8bdd99 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -12,12 +12,12 @@
* remove the possibility of having file name conflicts, we isolate
* files within a tablespace into database-specific subdirectories.
*
- * To support file access via the information given in RelFileNode, we
+ * To support file access via the information given in RelFileLocator, we
* maintain a symbolic-link map in $PGDATA/pg_tblspc. The symlinks are
* named by tablespace OIDs and point to the actual tablespace directories.
* There is also a per-cluster version directory in each tablespace.
* Thus the full path to an arbitrary file is
- * $PGDATA/pg_tblspc/spcoid/PG_MAJORVER_CATVER/dboid/relfilenode
+ * $PGDATA/pg_tblspc/spcoid/PG_MAJORVER_CATVER/dboid/relfilenumber
* e.g.
* $PGDATA/pg_tblspc/20981/PG_9.0_201002161/719849/83292814
*
@@ -25,8 +25,8 @@
* tables) and pg_default (for everything else). For backwards compatibility
* and to remain functional on platforms without symlinks, these tablespaces
* are accessed specially: they are respectively
- * $PGDATA/global/relfilenode
- * $PGDATA/base/dboid/relfilenode
+ * $PGDATA/global/relfilenumber
+ * $PGDATA/base/dboid/relfilenumber
*
* To allow CREATE DATABASE to give a new database a default tablespace
* that's different from the template database's default, we make the
@@ -115,7 +115,7 @@ static bool destroy_tablespace_directories(Oid tablespaceoid, bool redo);
* re-create a database subdirectory (of $PGDATA/base) during WAL replay.
*/
void
-TablespaceCreateDbspace(Oid spcNode, Oid dbNode, bool isRedo)
+TablespaceCreateDbspace(Oid spcOid, Oid dbOid, bool isRedo)
{
struct stat st;
char *dir;
@@ -124,13 +124,13 @@ TablespaceCreateDbspace(Oid spcNode, Oid dbNode, bool isRedo)
* The global tablespace doesn't have per-database subdirectories, so
* nothing to do for it.
*/
- if (spcNode == GLOBALTABLESPACE_OID)
+ if (spcOid == GLOBALTABLESPACE_OID)
return;
- Assert(OidIsValid(spcNode));
- Assert(OidIsValid(dbNode));
+ Assert(OidIsValid(spcOid));
+ Assert(OidIsValid(dbOid));
- dir = GetDatabasePath(dbNode, spcNode);
+ dir = GetDatabasePath(dbOid, spcOid);
if (stat(dir, &st) < 0)
{
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 51d630f..7d50b50 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4193,9 +4193,9 @@ _copyIndexStmt(const IndexStmt *from)
COPY_NODE_FIELD(excludeOpNames);
COPY_STRING_FIELD(idxcomment);
COPY_SCALAR_FIELD(indexOid);
- COPY_SCALAR_FIELD(oldNode);
+ COPY_SCALAR_FIELD(oldNumber);
COPY_SCALAR_FIELD(oldCreateSubid);
- COPY_SCALAR_FIELD(oldFirstRelfilenodeSubid);
+ COPY_SCALAR_FIELD(oldFirstRelfilenumberSubid);
COPY_SCALAR_FIELD(unique);
COPY_SCALAR_FIELD(nulls_not_distinct);
COPY_SCALAR_FIELD(primary);
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index e747e16..d63d326 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1752,9 +1752,9 @@ _equalIndexStmt(const IndexStmt *a, const IndexStmt *b)
COMPARE_NODE_FIELD(excludeOpNames);
COMPARE_STRING_FIELD(idxcomment);
COMPARE_SCALAR_FIELD(indexOid);
- COMPARE_SCALAR_FIELD(oldNode);
+ COMPARE_SCALAR_FIELD(oldNumber);
COMPARE_SCALAR_FIELD(oldCreateSubid);
- COMPARE_SCALAR_FIELD(oldFirstRelfilenodeSubid);
+ COMPARE_SCALAR_FIELD(oldFirstRelfilenumberSubid);
COMPARE_SCALAR_FIELD(unique);
COMPARE_SCALAR_FIELD(nulls_not_distinct);
COMPARE_SCALAR_FIELD(primary);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index ce12915..3724d48 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2928,9 +2928,9 @@ _outIndexStmt(StringInfo str, const IndexStmt *node)
WRITE_NODE_FIELD(excludeOpNames);
WRITE_STRING_FIELD(idxcomment);
WRITE_OID_FIELD(indexOid);
- WRITE_OID_FIELD(oldNode);
+ WRITE_OID_FIELD(oldNumber);
WRITE_UINT_FIELD(oldCreateSubid);
- WRITE_UINT_FIELD(oldFirstRelfilenodeSubid);
+ WRITE_UINT_FIELD(oldFirstRelfilenumberSubid);
WRITE_BOOL_FIELD(unique);
WRITE_BOOL_FIELD(nulls_not_distinct);
WRITE_BOOL_FIELD(primary);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 969c9c1..394404d 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -7990,9 +7990,9 @@ IndexStmt: CREATE opt_unique INDEX opt_concurrently opt_index_name
n->excludeOpNames = NIL;
n->idxcomment = NULL;
n->indexOid = InvalidOid;
- n->oldNode = InvalidOid;
+ n->oldNumber = InvalidOid;
n->oldCreateSubid = InvalidSubTransactionId;
- n->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ n->oldFirstRelfilenumberSubid = InvalidSubTransactionId;
n->primary = false;
n->isconstraint = false;
n->deferrable = false;
@@ -8022,9 +8022,9 @@ IndexStmt: CREATE opt_unique INDEX opt_concurrently opt_index_name
n->excludeOpNames = NIL;
n->idxcomment = NULL;
n->indexOid = InvalidOid;
- n->oldNode = InvalidOid;
+ n->oldNumber = InvalidOid;
n->oldCreateSubid = InvalidSubTransactionId;
- n->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ n->oldFirstRelfilenumberSubid = InvalidSubTransactionId;
n->primary = false;
n->isconstraint = false;
n->deferrable = false;
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index 1a64a52..e943365 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -1578,9 +1578,9 @@ generateClonedIndexStmt(RangeVar *heapRel, Relation source_idx,
index->excludeOpNames = NIL;
index->idxcomment = NULL;
index->indexOid = InvalidOid;
- index->oldNode = InvalidOid;
+ index->oldNumber = InvalidOid;
index->oldCreateSubid = InvalidSubTransactionId;
- index->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ index->oldFirstRelfilenumberSubid = InvalidSubTransactionId;
index->unique = idxrec->indisunique;
index->nulls_not_distinct = idxrec->indnullsnotdistinct;
index->primary = idxrec->indisprimary;
@@ -2201,9 +2201,9 @@ transformIndexConstraint(Constraint *constraint, CreateStmtContext *cxt)
index->excludeOpNames = NIL;
index->idxcomment = NULL;
index->indexOid = InvalidOid;
- index->oldNode = InvalidOid;
+ index->oldNumber = InvalidOid;
index->oldCreateSubid = InvalidSubTransactionId;
- index->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ index->oldFirstRelfilenumberSubid = InvalidSubTransactionId;
index->transformed = false;
index->concurrent = false;
index->if_not_exists = false;
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index c937c39..5fc076f 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -1207,7 +1207,7 @@ CompactCheckpointerRequestQueue(void)
* We use the request struct directly as a hashtable key. This
* assumes that any padding bytes in the structs are consistently the
* same, which should be okay because we zeroed them in
- * CheckpointerShmemInit. Note also that RelFileNode had better
+ * CheckpointerShmemInit. Note also that RelFileLocator had better
* contain no pad bytes.
*/
request = &CheckpointerShmem->requests[n];
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index aa2427b..c5c6a2b 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -845,7 +845,7 @@ DecodeInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
XLogReaderState *r = buf->record;
xl_heap_insert *xlrec;
ReorderBufferChange *change;
- RelFileNode target_node;
+ RelFileLocator target_locator;
xlrec = (xl_heap_insert *) XLogRecGetData(r);
@@ -857,8 +857,8 @@ DecodeInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
return;
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
- if (target_node.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &target_locator, NULL, NULL);
+ if (target_locator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -872,7 +872,7 @@ DecodeInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change->action = REORDER_BUFFER_CHANGE_INTERNAL_SPEC_INSERT;
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &target_node, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &target_locator, sizeof(RelFileLocator));
tupledata = XLogRecGetBlockData(r, 0, &datalen);
tuplelen = datalen - SizeOfHeapHeader;
@@ -902,13 +902,13 @@ DecodeUpdate(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
xl_heap_update *xlrec;
ReorderBufferChange *change;
char *data;
- RelFileNode target_node;
+ RelFileLocator target_locator;
xlrec = (xl_heap_update *) XLogRecGetData(r);
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
- if (target_node.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &target_locator, NULL, NULL);
+ if (target_locator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -918,7 +918,7 @@ DecodeUpdate(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change = ReorderBufferGetChange(ctx->reorder);
change->action = REORDER_BUFFER_CHANGE_UPDATE;
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &target_node, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &target_locator, sizeof(RelFileLocator));
if (xlrec->flags & XLH_UPDATE_CONTAINS_NEW_TUPLE)
{
@@ -968,13 +968,13 @@ DecodeDelete(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
XLogReaderState *r = buf->record;
xl_heap_delete *xlrec;
ReorderBufferChange *change;
- RelFileNode target_node;
+ RelFileLocator target_locator;
xlrec = (xl_heap_delete *) XLogRecGetData(r);
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
- if (target_node.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &target_locator, NULL, NULL);
+ if (target_locator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -990,7 +990,7 @@ DecodeDelete(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &target_node, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &target_locator, sizeof(RelFileLocator));
/* old primary key stored */
if (xlrec->flags & XLH_DELETE_CONTAINS_OLD)
@@ -1063,7 +1063,7 @@ DecodeMultiInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
char *data;
char *tupledata;
Size tuplelen;
- RelFileNode rnode;
+ RelFileLocator rlocator;
xlrec = (xl_heap_multi_insert *) XLogRecGetData(r);
@@ -1075,8 +1075,8 @@ DecodeMultiInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
return;
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &rnode, NULL, NULL);
- if (rnode.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &rlocator, NULL, NULL);
+ if (rlocator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -1103,7 +1103,7 @@ DecodeMultiInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change->action = REORDER_BUFFER_CHANGE_INSERT;
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &rnode, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &rlocator, sizeof(RelFileLocator));
xlhdr = (xl_multi_insert_tuple *) SHORTALIGN(data);
data = ((char *) xlhdr) + SizeOfMultiInsertTuple;
@@ -1165,11 +1165,11 @@ DecodeSpecConfirm(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
{
XLogReaderState *r = buf->record;
ReorderBufferChange *change;
- RelFileNode target_node;
+ RelFileLocator target_locator;
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
- if (target_node.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &target_locator, NULL, NULL);
+ if (target_locator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -1180,7 +1180,7 @@ DecodeSpecConfirm(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change->action = REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM;
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &target_node, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &target_locator, sizeof(RelFileLocator));
change->data.tp.clear_toast_afterwards = true;
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 8da5f90..f8fb228 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -106,7 +106,7 @@
#include "utils/memdebug.h"
#include "utils/memutils.h"
#include "utils/rel.h"
-#include "utils/relfilenodemap.h"
+#include "utils/relfilenumbermap.h"
/* entry for a hash table we use to map from xid to our transaction state */
@@ -116,10 +116,10 @@ typedef struct ReorderBufferTXNByIdEnt
ReorderBufferTXN *txn;
} ReorderBufferTXNByIdEnt;
-/* data structures for (relfilenode, ctid) => (cmin, cmax) mapping */
+/* data structures for (relfilelocator, ctid) => (cmin, cmax) mapping */
typedef struct ReorderBufferTupleCidKey
{
- RelFileNode relnode;
+ RelFileLocator rlocator;
ItemPointerData tid;
} ReorderBufferTupleCidKey;
@@ -1643,7 +1643,7 @@ ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, bool txn_prep
}
/*
- * Destroy the (relfilenode, ctid) hashtable, so that we don't leak any
+ * Destroy the (relfilelocator, ctid) hashtable, so that we don't leak any
* memory. We could also keep the hash table and update it with new ctid
* values, but this seems simpler and good enough for now.
*/
@@ -1673,7 +1673,7 @@ ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, bool txn_prep
}
/*
- * Build a hash with a (relfilenode, ctid) -> (cmin, cmax) mapping for use by
+ * Build a hash with a (relfilelocator, ctid) -> (cmin, cmax) mapping for use by
* HeapTupleSatisfiesHistoricMVCC.
*/
static void
@@ -1711,7 +1711,7 @@ ReorderBufferBuildTupleCidHash(ReorderBuffer *rb, ReorderBufferTXN *txn)
/* be careful about padding */
memset(&key, 0, sizeof(ReorderBufferTupleCidKey));
- key.relnode = change->data.tuplecid.node;
+ key.rlocator = change->data.tuplecid.locator;
ItemPointerCopy(&change->data.tuplecid.tid,
&key.tid);
@@ -2140,36 +2140,36 @@ ReorderBufferProcessTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
case REORDER_BUFFER_CHANGE_DELETE:
Assert(snapshot_now);
- reloid = RelidByRelfilenode(change->data.tp.relnode.spcNode,
- change->data.tp.relnode.relNode);
+ reloid = RelidByRelfilenumber(change->data.tp.rlocator.spcOid,
+ change->data.tp.rlocator.relNumber);
/*
* Mapped catalog tuple without data, emitted while
* catalog table was in the process of being rewritten. We
- * can fail to look up the relfilenode, because the
+ * can fail to look up the relfilenumber, because the
* relmapper has no "historic" view, in contrast to the
* normal catalog during decoding. Thus repeated rewrites
* can cause a lookup failure. That's OK because we do not
* decode catalog changes anyway. Normally such tuples
* would be skipped over below, but we can't identify
* whether the table should be logically logged without
- * mapping the relfilenode to the oid.
+ * mapping the relfilenumber to the oid.
*/
if (reloid == InvalidOid &&
change->data.tp.newtuple == NULL &&
change->data.tp.oldtuple == NULL)
goto change_done;
else if (reloid == InvalidOid)
- elog(ERROR, "could not map filenode \"%s\" to relation OID",
- relpathperm(change->data.tp.relnode,
+ elog(ERROR, "could not map filenumber \"%s\" to relation OID",
+ relpathperm(change->data.tp.rlocator,
MAIN_FORKNUM));
relation = RelationIdGetRelation(reloid);
if (!RelationIsValid(relation))
- elog(ERROR, "could not open relation with OID %u (for filenode \"%s\")",
+ elog(ERROR, "could not open relation with OID %u (for filenumber \"%s\")",
reloid,
- relpathperm(change->data.tp.relnode,
+ relpathperm(change->data.tp.rlocator,
MAIN_FORKNUM));
if (!RelationIsLogicallyLogged(relation))
@@ -3157,7 +3157,7 @@ ReorderBufferChangeMemoryUpdate(ReorderBuffer *rb,
}
/*
- * Add new (relfilenode, tid) -> (cmin, cmax) mappings.
+ * Add new (relfilelocator, tid) -> (cmin, cmax) mappings.
*
* We do not include this change type in memory accounting, because we
* keep CIDs in a separate list and do not evict them when reaching
@@ -3165,7 +3165,7 @@ ReorderBufferChangeMemoryUpdate(ReorderBuffer *rb,
*/
void
ReorderBufferAddNewTupleCids(ReorderBuffer *rb, TransactionId xid,
- XLogRecPtr lsn, RelFileNode node,
+ XLogRecPtr lsn, RelFileLocator locator,
ItemPointerData tid, CommandId cmin,
CommandId cmax, CommandId combocid)
{
@@ -3174,7 +3174,7 @@ ReorderBufferAddNewTupleCids(ReorderBuffer *rb, TransactionId xid,
txn = ReorderBufferTXNByXid(rb, xid, true, NULL, lsn, true);
- change->data.tuplecid.node = node;
+ change->data.tuplecid.locator = locator;
change->data.tuplecid.tid = tid;
change->data.tuplecid.cmin = cmin;
change->data.tuplecid.cmax = cmax;
@@ -4839,7 +4839,7 @@ ReorderBufferToastReset(ReorderBuffer *rb, ReorderBufferTXN *txn)
* need anymore.
*
* To resolve those problems we have a per-transaction hash of (cmin,
- * cmax) tuples keyed by (relfilenode, ctid) which contains the actual
+ * cmax) tuples keyed by (relfilelocator, ctid) which contains the actual
* (cmin, cmax) values. That also takes care of combo CIDs by simply
* not caring about them at all. As we have the real cmin/cmax values
* combo CIDs aren't interesting.
@@ -4870,9 +4870,9 @@ DisplayMapping(HTAB *tuplecid_data)
while ((ent = (ReorderBufferTupleCidEnt *) hash_seq_search(&hstat)) != NULL)
{
elog(DEBUG3, "mapping: node: %u/%u/%u tid: %u/%u cmin: %u, cmax: %u",
- ent->key.relnode.dbNode,
- ent->key.relnode.spcNode,
- ent->key.relnode.relNode,
+ ent->key.rlocator.dbOid,
+ ent->key.rlocator.spcOid,
+ ent->key.rlocator.relNumber,
ItemPointerGetBlockNumber(&ent->key.tid),
ItemPointerGetOffsetNumber(&ent->key.tid),
ent->cmin,
@@ -4932,7 +4932,7 @@ ApplyLogicalMappingFile(HTAB *tuplecid_data, Oid relid, const char *fname)
path, readBytes,
(int32) sizeof(LogicalRewriteMappingData))));
- key.relnode = map.old_node;
+ key.rlocator = map.old_locator;
ItemPointerCopy(&map.old_tid,
&key.tid);
@@ -4947,7 +4947,7 @@ ApplyLogicalMappingFile(HTAB *tuplecid_data, Oid relid, const char *fname)
if (!ent)
continue;
- key.relnode = map.new_node;
+ key.rlocator = map.new_locator;
ItemPointerCopy(&map.new_tid,
&key.tid);
@@ -5120,10 +5120,10 @@ ResolveCminCmaxDuringDecoding(HTAB *tuplecid_data,
Assert(!BufferIsLocal(buffer));
/*
- * get relfilenode from the buffer, no convenient way to access it other
+ * get relfilelocator from the buffer, no convenient way to access it other
* than that.
*/
- BufferGetTag(buffer, &key.relnode, &forkno, &blockno);
+ BufferGetTag(buffer, &key.rlocator, &forkno, &blockno);
/* tuples can only be in the main fork */
Assert(forkno == MAIN_FORKNUM);
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 1119a12..73c0f15 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -781,7 +781,7 @@ SnapBuildProcessNewCid(SnapBuild *builder, TransactionId xid,
ReorderBufferXidSetCatalogChanges(builder->reorder, xid, lsn);
ReorderBufferAddNewTupleCids(builder->reorder, xlrec->top_xid, lsn,
- xlrec->target_node, xlrec->target_tid,
+ xlrec->target_locator, xlrec->target_tid,
xlrec->cmin, xlrec->cmax,
xlrec->combocid);
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index ae13011..7071ff6 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -121,12 +121,12 @@ typedef struct CkptTsStatus
* Type for array used to sort SMgrRelations
*
* FlushRelationsAllBuffers shares the same comparator function with
- * DropRelFileNodesAllBuffers. Pointer to this struct and RelFileNode must be
+ * DropRelFileLocatorsAllBuffers. Pointer to this struct and RelFileLocator must be
* compatible.
*/
typedef struct SMgrSortArray
{
- RelFileNode rnode; /* This must be the first member */
+ RelFileLocator rlocator; /* This must be the first member */
SMgrRelation srel;
} SMgrSortArray;
@@ -483,7 +483,7 @@ static BufferDesc *BufferAlloc(SMgrRelation smgr,
BufferAccessStrategy strategy,
bool *foundPtr);
static void FlushBuffer(BufferDesc *buf, SMgrRelation reln);
-static void FindAndDropRelFileNodeBuffers(RelFileNode rnode,
+static void FindAndDropRelFileLocatorBuffers(RelFileLocator rlocator,
ForkNumber forkNum,
BlockNumber nForkBlock,
BlockNumber firstDelBlock);
@@ -492,7 +492,7 @@ static void RelationCopyStorageUsingBuffer(Relation src, Relation dst,
bool isunlogged);
static void AtProcExit_Buffers(int code, Datum arg);
static void CheckForBufferLeaks(void);
-static int rnode_comparator(const void *p1, const void *p2);
+static int rlocator_comparator(const void *p1, const void *p2);
static inline int buffertag_comparator(const BufferTag *a, const BufferTag *b);
static inline int ckpt_buforder_comparator(const CkptSortItem *a, const CkptSortItem *b);
static int ts_ckpt_progress_comparator(Datum a, Datum b, void *arg);
@@ -515,7 +515,7 @@ PrefetchSharedBuffer(SMgrRelation smgr_reln,
Assert(BlockNumberIsValid(blockNum));
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr_reln->smgr_rnode.node,
+ INIT_BUFFERTAG(newTag, smgr_reln->smgr_rlocator.locator,
forkNum, blockNum);
/* determine its hash code and partition lock ID */
@@ -620,7 +620,7 @@ PrefetchBuffer(Relation reln, ForkNumber forkNum, BlockNumber blockNum)
* tag. In that case, the buffer is pinned and the usage count is bumped.
*/
bool
-ReadRecentBuffer(RelFileNode rnode, ForkNumber forkNum, BlockNumber blockNum,
+ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockNum,
Buffer recent_buffer)
{
BufferDesc *bufHdr;
@@ -632,7 +632,7 @@ ReadRecentBuffer(RelFileNode rnode, ForkNumber forkNum, BlockNumber blockNum,
ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
ReservePrivateRefCountEntry();
- INIT_BUFFERTAG(tag, rnode, forkNum, blockNum);
+ INIT_BUFFERTAG(tag, rlocator, forkNum, blockNum);
if (BufferIsLocal(recent_buffer))
{
@@ -786,13 +786,13 @@ ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum,
* BackendId).
*/
Buffer
-ReadBufferWithoutRelcache(RelFileNode rnode, ForkNumber forkNum,
+ReadBufferWithoutRelcache(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber blockNum, ReadBufferMode mode,
BufferAccessStrategy strategy, bool permanent)
{
bool hit;
- SMgrRelation smgr = smgropen(rnode, InvalidBackendId);
+ SMgrRelation smgr = smgropen(rlocator, InvalidBackendId);
return ReadBuffer_common(smgr, permanent ? RELPERSISTENCE_PERMANENT :
RELPERSISTENCE_UNLOGGED, forkNum, blockNum,
@@ -824,10 +824,10 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
isExtend = (blockNum == P_NEW);
TRACE_POSTGRESQL_BUFFER_READ_START(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode,
- smgr->smgr_rnode.backend,
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber,
+ smgr->smgr_rlocator.backend,
isExtend);
/* Substitute proper block number if caller asked for P_NEW */
@@ -839,7 +839,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
ereport(ERROR,
(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
errmsg("cannot extend relation %s beyond %u blocks",
- relpath(smgr->smgr_rnode, forkNum),
+ relpath(smgr->smgr_rlocator, forkNum),
P_NEW)));
}
@@ -886,10 +886,10 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
VacuumCostBalance += VacuumCostPageHit;
TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode,
- smgr->smgr_rnode.backend,
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber,
+ smgr->smgr_rlocator.backend,
isExtend,
found);
@@ -926,7 +926,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
if (!PageIsNew((Page) bufBlock))
ereport(ERROR,
(errmsg("unexpected data beyond EOF in block %u of relation %s",
- blockNum, relpath(smgr->smgr_rnode, forkNum)),
+ blockNum, relpath(smgr->smgr_rlocator, forkNum)),
errhint("This has been seen to occur with buggy kernels; consider updating your system.")));
/*
@@ -1028,7 +1028,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
(errcode(ERRCODE_DATA_CORRUPTED),
errmsg("invalid page in block %u of relation %s; zeroing out page",
blockNum,
- relpath(smgr->smgr_rnode, forkNum))));
+ relpath(smgr->smgr_rlocator, forkNum))));
MemSet((char *) bufBlock, 0, BLCKSZ);
}
else
@@ -1036,7 +1036,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
(errcode(ERRCODE_DATA_CORRUPTED),
errmsg("invalid page in block %u of relation %s",
blockNum,
- relpath(smgr->smgr_rnode, forkNum))));
+ relpath(smgr->smgr_rlocator, forkNum))));
}
}
}
@@ -1076,10 +1076,10 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
VacuumCostBalance += VacuumCostPageMiss;
TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode,
- smgr->smgr_rnode.backend,
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber,
+ smgr->smgr_rlocator.backend,
isExtend,
found);
@@ -1124,7 +1124,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
/* determine its hash code and partition lock ID */
newHash = BufTableHashCode(&newTag);
@@ -1255,9 +1255,9 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
/* OK, do the I/O */
TRACE_POSTGRESQL_BUFFER_WRITE_DIRTY_START(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode);
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber);
FlushBuffer(buf, NULL);
LWLockRelease(BufferDescriptorGetContentLock(buf));
@@ -1266,9 +1266,9 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
&buf->tag);
TRACE_POSTGRESQL_BUFFER_WRITE_DIRTY_DONE(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode);
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber);
}
else
{
@@ -1647,7 +1647,7 @@ ReleaseAndReadBuffer(Buffer buffer,
{
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
bufHdr->tag.forkNum == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
@@ -1658,7 +1658,7 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetBufferDescriptor(buffer - 1);
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
bufHdr->tag.forkNum == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
@@ -2000,8 +2000,8 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
- item->tsId = bufHdr->tag.rnode.spcNode;
- item->relNode = bufHdr->tag.rnode.relNode;
+ item->tsId = bufHdr->tag.rlocator.spcOid;
+ item->relNumber = bufHdr->tag.rlocator.relNumber;
item->forkNum = bufHdr->tag.forkNum;
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2708,7 +2708,7 @@ PrintBufferLeakWarning(Buffer buffer)
}
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rnode, backend, buf->tag.forkNum);
+ path = relpathbackend(buf->tag.rlocator, backend, buf->tag.forkNum);
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2769,11 +2769,11 @@ BufferGetBlockNumber(Buffer buffer)
/*
* BufferGetTag
- * Returns the relfilenode, fork number and block number associated with
+ * Returns the relfilelocator, fork number and block number associated with
* a buffer.
*/
void
-BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
+BufferGetTag(Buffer buffer, RelFileLocator *rlocator, ForkNumber *forknum,
BlockNumber *blknum)
{
BufferDesc *bufHdr;
@@ -2787,7 +2787,7 @@ BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
bufHdr = GetBufferDescriptor(buffer - 1);
/* pinned, so OK to read tag without spinlock */
- *rnode = bufHdr->tag.rnode;
+ *rlocator = bufHdr->tag.rlocator;
*forknum = bufHdr->tag.forkNum;
*blknum = bufHdr->tag.blockNum;
}
@@ -2838,13 +2838,13 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rnode, InvalidBackendId);
+ reln = smgropen(buf->tag.rlocator, InvalidBackendId);
TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
buf->tag.blockNum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode);
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber);
buf_state = LockBufHdr(buf);
@@ -2922,9 +2922,9 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(buf->tag.forkNum,
buf->tag.blockNum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode);
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber);
/* Pop the error context stack */
error_context_stack = errcallback.previous;
@@ -3026,7 +3026,7 @@ BufferGetLSNAtomic(Buffer buffer)
}
/* ---------------------------------------------------------------------
- * DropRelFileNodeBuffers
+ * DropRelFileLocatorBuffers
*
* This function removes from the buffer pool all the pages of the
* specified relation forks that have block numbers >= firstDelBlock.
@@ -3047,24 +3047,24 @@ BufferGetLSNAtomic(Buffer buffer)
* --------------------------------------------------------------------
*/
void
-DropRelFileNodeBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
+DropRelFileLocatorBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
int nforks, BlockNumber *firstDelBlock)
{
int i;
int j;
- RelFileNodeBackend rnode;
+ RelFileLocatorBackend rlocator;
BlockNumber nForkBlock[MAX_FORKNUM];
uint64 nBlocksToInvalidate = 0;
- rnode = smgr_reln->smgr_rnode;
+ rlocator = smgr_reln->smgr_rlocator;
/* If it's a local relation, it's localbuf.c's problem. */
- if (RelFileNodeBackendIsTemp(rnode))
+ if (RelFileLocatorBackendIsTemp(rlocator))
{
- if (rnode.backend == MyBackendId)
+ if (rlocator.backend == MyBackendId)
{
for (j = 0; j < nforks; j++)
- DropRelFileNodeLocalBuffers(rnode.node, forkNum[j],
+ DropRelFileLocatorLocalBuffers(rlocator.locator, forkNum[j],
firstDelBlock[j]);
}
return;
@@ -3115,7 +3115,7 @@ DropRelFileNodeBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
nBlocksToInvalidate < BUF_DROP_FULL_SCAN_THRESHOLD)
{
for (j = 0; j < nforks; j++)
- FindAndDropRelFileNodeBuffers(rnode.node, forkNum[j],
+ FindAndDropRelFileLocatorBuffers(rlocator.locator, forkNum[j],
nForkBlock[j], firstDelBlock[j]);
return;
}
@@ -3138,17 +3138,17 @@ DropRelFileNodeBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
* false positives are safe because we'll recheck after getting the
* buffer lock.
*
- * We could check forkNum and blockNum as well as the rnode, but the
+ * We could check forkNum and blockNum as well as the rlocator, but the
* incremental win from doing so seems small.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rnode.node))
+ if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator))
continue;
buf_state = LockBufHdr(bufHdr);
for (j = 0; j < nforks; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator) &&
bufHdr->tag.forkNum == forkNum[j] &&
bufHdr->tag.blockNum >= firstDelBlock[j])
{
@@ -3162,16 +3162,16 @@ DropRelFileNodeBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
}
/* ---------------------------------------------------------------------
- * DropRelFileNodesAllBuffers
+ * DropRelFileLocatorsAllBuffers
*
* This function removes from the buffer pool all the pages of all
* forks of the specified relations. It's equivalent to calling
- * DropRelFileNodeBuffers once per fork per relation with
+ * DropRelFileLocatorBuffers once per fork per relation with
* firstDelBlock = 0.
* --------------------------------------------------------------------
*/
void
-DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
+DropRelFileLocatorsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
{
int i;
int j;
@@ -3179,22 +3179,22 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
SMgrRelation *rels;
BlockNumber (*block)[MAX_FORKNUM + 1];
uint64 nBlocksToInvalidate = 0;
- RelFileNode *nodes;
+ RelFileLocator *locators;
bool cached = true;
bool use_bsearch;
- if (nnodes == 0)
+ if (nlocators == 0)
return;
- rels = palloc(sizeof(SMgrRelation) * nnodes); /* non-local relations */
+ rels = palloc(sizeof(SMgrRelation) * nlocators); /* non-local relations */
/* If it's a local relation, it's localbuf.c's problem. */
- for (i = 0; i < nnodes; i++)
+ for (i = 0; i < nlocators; i++)
{
- if (RelFileNodeBackendIsTemp(smgr_reln[i]->smgr_rnode))
+ if (RelFileLocatorBackendIsTemp(smgr_reln[i]->smgr_rlocator))
{
- if (smgr_reln[i]->smgr_rnode.backend == MyBackendId)
- DropRelFileNodeAllLocalBuffers(smgr_reln[i]->smgr_rnode.node);
+ if (smgr_reln[i]->smgr_rlocator.backend == MyBackendId)
+ DropRelFileLocatorAllLocalBuffers(smgr_reln[i]->smgr_rlocator.locator);
}
else
rels[n++] = smgr_reln[i];
@@ -3219,7 +3219,7 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
/*
* We can avoid scanning the entire buffer pool if we know the exact size
- * of each of the given relation forks. See DropRelFileNodeBuffers.
+ * of each of the given relation forks. See DropRelFileLocatorBuffers.
*/
for (i = 0; i < n && cached; i++)
{
@@ -3257,7 +3257,7 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
continue;
/* drop all the buffers for a particular relation fork */
- FindAndDropRelFileNodeBuffers(rels[i]->smgr_rnode.node,
+ FindAndDropRelFileLocatorBuffers(rels[i]->smgr_rlocator.locator,
j, block[i][j], 0);
}
}
@@ -3268,9 +3268,9 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
}
pfree(block);
- nodes = palloc(sizeof(RelFileNode) * n); /* non-local relations */
+ locators = palloc(sizeof(RelFileLocator) * n); /* non-local relations */
for (i = 0; i < n; i++)
- nodes[i] = rels[i]->smgr_rnode.node;
+ locators[i] = rels[i]->smgr_rlocator.locator;
/*
* For low number of relations to drop just use a simple walk through, to
@@ -3280,18 +3280,18 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
*/
use_bsearch = n > RELS_BSEARCH_THRESHOLD;
- /* sort the list of rnodes if necessary */
+ /* sort the list of rlocators if necessary */
if (use_bsearch)
- pg_qsort(nodes, n, sizeof(RelFileNode), rnode_comparator);
+ pg_qsort(locators, n, sizeof(RelFileLocator), rlocator_comparator);
for (i = 0; i < NBuffers; i++)
{
- RelFileNode *rnode = NULL;
+ RelFileLocator *rlocator = NULL;
BufferDesc *bufHdr = GetBufferDescriptor(i);
uint32 buf_state;
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
@@ -3301,37 +3301,37 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
for (j = 0; j < n; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, nodes[j]))
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, locators[j]))
{
- rnode = &nodes[j];
+ rlocator = &locators[j];
break;
}
}
}
else
{
- rnode = bsearch((const void *) &(bufHdr->tag.rnode),
- nodes, n, sizeof(RelFileNode),
- rnode_comparator);
+ rlocator = bsearch((const void *) &(bufHdr->tag.rlocator),
+ locators, n, sizeof(RelFileLocator),
+ rlocator_comparator);
}
- /* buffer doesn't belong to any of the given relfilenodes; skip it */
- if (rnode == NULL)
+ /* buffer doesn't belong to any of the given relfilelocators; skip it */
+ if (rlocator == NULL)
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, (*rnode)))
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, (*rlocator)))
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
}
- pfree(nodes);
+ pfree(locators);
pfree(rels);
}
/* ---------------------------------------------------------------------
- * FindAndDropRelFileNodeBuffers
+ * FindAndDropRelFileLocatorBuffers
*
* This function performs look up in BufMapping table and removes from the
* buffer pool all the pages of the specified relation fork that has block
@@ -3340,9 +3340,9 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
* --------------------------------------------------------------------
*/
static void
-FindAndDropRelFileNodeBuffers(RelFileNode rnode, ForkNumber forkNum,
- BlockNumber nForkBlock,
- BlockNumber firstDelBlock)
+FindAndDropRelFileLocatorBuffers(RelFileLocator rlocator, ForkNumber forkNum,
+ BlockNumber nForkBlock,
+ BlockNumber firstDelBlock)
{
BlockNumber curBlock;
@@ -3356,7 +3356,7 @@ FindAndDropRelFileNodeBuffers(RelFileNode rnode, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(bufTag, rnode, forkNum, curBlock);
+ INIT_BUFFERTAG(bufTag, rlocator, forkNum, curBlock);
/* determine its hash code and partition lock ID */
bufHash = BufTableHashCode(&bufTag);
@@ -3380,7 +3380,7 @@ FindAndDropRelFileNodeBuffers(RelFileNode rnode, ForkNumber forkNum,
*/
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -3397,7 +3397,7 @@ FindAndDropRelFileNodeBuffers(RelFileNode rnode, ForkNumber forkNum,
* bothering to write them out first. This is used when we destroy a
* database, to avoid trying to flush data to disk when the directory
* tree no longer exists. Implementation is pretty similar to
- * DropRelFileNodeBuffers() which is for destroying just one relation.
+ * DropRelFileLocatorBuffers() which is for destroying just one relation.
* --------------------------------------------------------------------
*/
void
@@ -3416,14 +3416,14 @@ DropDatabaseBuffers(Oid dbid)
uint32 buf_state;
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.rlocator.dbOid != dbid)
continue;
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid)
+ if (bufHdr->tag.rlocator.dbOid == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3453,7 +3453,7 @@ PrintBufferDescs(void)
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathbackend(buf->tag.rnode, InvalidBackendId, buf->tag.forkNum),
+ relpathbackend(buf->tag.rlocator, InvalidBackendId, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3478,7 +3478,7 @@ PrintPinnedBufs(void)
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathperm(buf->tag.rnode, buf->tag.forkNum),
+ relpathperm(buf->tag.rlocator, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3517,7 +3517,7 @@ FlushRelationBuffers(Relation rel)
uint32 buf_state;
bufHdr = GetLocalBufferDescriptor(i);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
((buf_state = pg_atomic_read_u32(&bufHdr->state)) &
(BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
@@ -3561,16 +3561,16 @@ FlushRelationBuffers(Relation rel)
bufHdr = GetBufferDescriptor(i);
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node))
+ if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator))
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3608,21 +3608,21 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
for (i = 0; i < nrels; i++)
{
- Assert(!RelFileNodeBackendIsTemp(smgrs[i]->smgr_rnode));
+ Assert(!RelFileLocatorBackendIsTemp(smgrs[i]->smgr_rlocator));
- srels[i].rnode = smgrs[i]->smgr_rnode.node;
+ srels[i].rlocator = smgrs[i]->smgr_rlocator.locator;
srels[i].srel = smgrs[i];
}
/*
* Save the bsearch overhead for low number of relations to sync. See
- * DropRelFileNodesAllBuffers for details.
+ * DropRelFileLocatorsAllBuffers for details.
*/
use_bsearch = nrels > RELS_BSEARCH_THRESHOLD;
/* sort the list of SMgrRelations if necessary */
if (use_bsearch)
- pg_qsort(srels, nrels, sizeof(SMgrSortArray), rnode_comparator);
+ pg_qsort(srels, nrels, sizeof(SMgrSortArray), rlocator_comparator);
/* Make sure we can handle the pin inside the loop */
ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
@@ -3634,7 +3634,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
uint32 buf_state;
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
@@ -3644,7 +3644,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
for (j = 0; j < nrels; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, srels[j].rnode))
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, srels[j].rlocator))
{
srelent = &srels[j];
break;
@@ -3653,19 +3653,19 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
}
else
{
- srelent = bsearch((const void *) &(bufHdr->tag.rnode),
+ srelent = bsearch((const void *) &(bufHdr->tag.rlocator),
srels, nrels, sizeof(SMgrSortArray),
- rnode_comparator);
+ rlocator_comparator);
}
- /* buffer doesn't belong to any of the given relfilenodes; skip it */
+ /* buffer doesn't belong to any of the given relfilelocators; skip it */
if (srelent == NULL)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, srelent->rnode) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, srelent->rlocator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3729,7 +3729,7 @@ RelationCopyStorageUsingBuffer(Relation src, Relation dst, ForkNumber forkNum,
CHECK_FOR_INTERRUPTS();
/* Read block from source relation. */
- srcBuf = ReadBufferWithoutRelcache(src->rd_node, forkNum, blkno,
+ srcBuf = ReadBufferWithoutRelcache(src->rd_locator, forkNum, blkno,
RBM_NORMAL, bstrategy_src,
permanent);
srcPage = BufferGetPage(srcBuf);
@@ -3740,7 +3740,7 @@ RelationCopyStorageUsingBuffer(Relation src, Relation dst, ForkNumber forkNum,
}
/* Use P_NEW to extend the destination relation. */
- dstBuf = ReadBufferWithoutRelcache(dst->rd_node, forkNum, P_NEW,
+ dstBuf = ReadBufferWithoutRelcache(dst->rd_locator, forkNum, P_NEW,
RBM_NORMAL, bstrategy_dst,
permanent);
LockBuffer(dstBuf, BUFFER_LOCK_EXCLUSIVE);
@@ -3775,8 +3775,8 @@ RelationCopyStorageUsingBuffer(Relation src, Relation dst, ForkNumber forkNum,
* --------------------------------------------------------------------
*/
void
-CreateAndCopyRelationData(RelFileNode src_rnode, RelFileNode dst_rnode,
- bool permanent)
+CreateAndCopyRelationData(RelFileLocator src_rlocator,
+ RelFileLocator dst_rlocator, bool permanent)
{
Relation src_rel;
Relation dst_rel;
@@ -3793,8 +3793,8 @@ CreateAndCopyRelationData(RelFileNode src_rnode, RelFileNode dst_rnode,
* used the smgr layer directly, we would have to worry about
* invalidations.
*/
- src_rel = CreateFakeRelcacheEntry(src_rnode);
- dst_rel = CreateFakeRelcacheEntry(dst_rnode);
+ src_rel = CreateFakeRelcacheEntry(src_rlocator);
+ dst_rel = CreateFakeRelcacheEntry(dst_rlocator);
/*
* Create and copy all forks of the relation. During create database we
@@ -3802,7 +3802,7 @@ CreateAndCopyRelationData(RelFileNode src_rnode, RelFileNode dst_rnode,
* directory. Therefore, each individual relation doesn't need to be
* registered for cleanup.
*/
- RelationCreateStorage(dst_rnode, relpersistence, false);
+ RelationCreateStorage(dst_rlocator, relpersistence, false);
/* copy main fork. */
RelationCopyStorageUsingBuffer(src_rel, dst_rel, MAIN_FORKNUM, permanent);
@@ -3820,7 +3820,7 @@ CreateAndCopyRelationData(RelFileNode src_rnode, RelFileNode dst_rnode,
* init fork of an unlogged relation.
*/
if (permanent || forkNum == INIT_FORKNUM)
- log_smgrcreate(&dst_rnode, forkNum);
+ log_smgrcreate(&dst_rlocator, forkNum);
/* Copy a fork's data, block by block. */
RelationCopyStorageUsingBuffer(src_rel, dst_rel, forkNum,
@@ -3864,16 +3864,16 @@ FlushDatabaseBuffers(Oid dbid)
bufHdr = GetBufferDescriptor(i);
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.rlocator.dbOid != dbid)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid &&
+ if (bufHdr->tag.rlocator.dbOid == dbid &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -4034,7 +4034,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
(pg_atomic_read_u32(&bufHdr->state) & BM_PERMANENT))
{
/*
- * If we must not write WAL, due to a relfilenode-specific
+ * If we must not write WAL, due to a relfilelocator-specific
* condition or being in recovery, don't dirty the page. We can
* set the hint, just not dirty the page as a result so the hint
* is lost when we evict the page or shutdown.
@@ -4042,7 +4042,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
* See src/backend/storage/page/README for longer discussion.
*/
if (RecoveryInProgress() ||
- RelFileNodeSkippingWAL(bufHdr->tag.rnode))
+ RelFileLocatorSkippingWAL(bufHdr->tag.rlocator))
return;
/*
@@ -4651,7 +4651,7 @@ AbortBufferIO(void)
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
- path = relpathperm(buf->tag.rnode, buf->tag.forkNum);
+ path = relpathperm(buf->tag.rlocator, buf->tag.forkNum);
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4675,7 +4675,7 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rnode, bufHdr->tag.forkNum);
+ char *path = relpathperm(bufHdr->tag.rlocator, bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4693,7 +4693,7 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
- char *path = relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ char *path = relpathbackend(bufHdr->tag.rlocator, MyBackendId,
bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
@@ -4703,27 +4703,27 @@ local_buffer_write_error_callback(void *arg)
}
/*
- * RelFileNode qsort/bsearch comparator; see RelFileNodeEquals.
+ * RelFileLocator qsort/bsearch comparator; see RelFileLocatorEquals.
*/
static int
-rnode_comparator(const void *p1, const void *p2)
+rlocator_comparator(const void *p1, const void *p2)
{
- RelFileNode n1 = *(const RelFileNode *) p1;
- RelFileNode n2 = *(const RelFileNode *) p2;
+ RelFileLocator n1 = *(const RelFileLocator *) p1;
+ RelFileLocator n2 = *(const RelFileLocator *) p2;
- if (n1.relNode < n2.relNode)
+ if (n1.relNumber < n2.relNumber)
return -1;
- else if (n1.relNode > n2.relNode)
+ else if (n1.relNumber > n2.relNumber)
return 1;
- if (n1.dbNode < n2.dbNode)
+ if (n1.dbOid < n2.dbOid)
return -1;
- else if (n1.dbNode > n2.dbNode)
+ else if (n1.dbOid > n2.dbOid)
return 1;
- if (n1.spcNode < n2.spcNode)
+ if (n1.spcOid < n2.spcOid)
return -1;
- else if (n1.spcNode > n2.spcNode)
+ else if (n1.spcOid > n2.spcOid)
return 1;
else
return 0;
@@ -4789,7 +4789,7 @@ buffertag_comparator(const BufferTag *ba, const BufferTag *bb)
{
int ret;
- ret = rnode_comparator(&ba->rnode, &bb->rnode);
+ ret = rlocator_comparator(&ba->rlocator, &bb->rlocator);
if (ret != 0)
return ret;
@@ -4822,9 +4822,9 @@ ckpt_buforder_comparator(const CkptSortItem *a, const CkptSortItem *b)
else if (a->tsId > b->tsId)
return 1;
/* compare relation */
- if (a->relNode < b->relNode)
+ if (a->relNumber < b->relNumber)
return -1;
- else if (a->relNode > b->relNode)
+ else if (a->relNumber > b->relNumber)
return 1;
/* compare fork */
else if (a->forkNum < b->forkNum)
@@ -4960,7 +4960,7 @@ IssuePendingWritebacks(WritebackContext *context)
next = &context->pending_writebacks[i + ahead + 1];
/* different file, stop */
- if (!RelFileNodeEquals(cur->tag.rnode, next->tag.rnode) ||
+ if (!RelFileLocatorEquals(cur->tag.rlocator, next->tag.rlocator) ||
cur->tag.forkNum != next->tag.forkNum)
break;
@@ -4979,7 +4979,7 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rnode, InvalidBackendId);
+ reln = smgropen(tag.rlocator, InvalidBackendId);
smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
}
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index e71f95a..3dc9cc7 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -68,7 +68,7 @@ PrefetchLocalBuffer(SMgrRelation smgr, ForkNumber forkNum,
BufferTag newTag; /* identity of requested block */
LocalBufferLookupEnt *hresult;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -117,7 +117,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
bool found;
uint32 buf_state;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -134,7 +134,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
Assert(BUFFERTAGS_EQUAL(bufHdr->tag, newTag));
#ifdef LBDEBUG
fprintf(stderr, "LB ALLOC (%u,%d,%d) %d\n",
- smgr->smgr_rnode.node.relNode, forkNum, blockNum, -b - 1);
+ smgr->smgr_rlocator.locator.relNumber, forkNum, blockNum, -b - 1);
#endif
buf_state = pg_atomic_read_u32(&bufHdr->state);
@@ -162,7 +162,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
#ifdef LBDEBUG
fprintf(stderr, "LB ALLOC (%u,%d,%d) %d\n",
- smgr->smgr_rnode.node.relNode, forkNum, blockNum,
+ smgr->smgr_rlocator.locator.relNumber, forkNum, blockNum,
-nextFreeLocalBuf - 1);
#endif
@@ -215,7 +215,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rnode, MyBackendId);
+ oreln = smgropen(bufHdr->tag.rlocator, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
@@ -312,7 +312,7 @@ MarkLocalBufferDirty(Buffer buffer)
}
/*
- * DropRelFileNodeLocalBuffers
+ * DropRelFileLocatorLocalBuffers
* This function removes from the buffer pool all the pages of the
* specified relation that have block numbers >= firstDelBlock.
* (In particular, with firstDelBlock = 0, all pages are removed.)
@@ -320,11 +320,11 @@ MarkLocalBufferDirty(Buffer buffer)
* out first. Therefore, this is NOT rollback-able, and so should be
* used only with extreme caution!
*
- * See DropRelFileNodeBuffers in bufmgr.c for more notes.
+ * See DropRelFileLocatorBuffers in bufmgr.c for more notes.
*/
void
-DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
- BlockNumber firstDelBlock)
+DropRelFileLocatorLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
+ BlockNumber firstDelBlock)
{
int i;
@@ -337,14 +337,14 @@ DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
+ RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ relpathbackend(bufHdr->tag.rlocator, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
@@ -363,14 +363,14 @@ DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
}
/*
- * DropRelFileNodeAllLocalBuffers
+ * DropRelFileLocatorAllLocalBuffers
* This function removes from the buffer pool all pages of all forks
* of the specified relation.
*
- * See DropRelFileNodesAllBuffers in bufmgr.c for more notes.
+ * See DropRelFileLocatorsAllBuffers in bufmgr.c for more notes.
*/
void
-DropRelFileNodeAllLocalBuffers(RelFileNode rnode)
+DropRelFileLocatorAllLocalBuffers(RelFileLocator rlocator)
{
int i;
@@ -383,12 +383,12 @@ DropRelFileNodeAllLocalBuffers(RelFileNode rnode)
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode))
+ RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator))
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ relpathbackend(bufHdr->tag.rlocator, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
@@ -589,7 +589,7 @@ AtProcExit_LocalBuffers(void)
{
/*
* We shouldn't be holding any remaining pins; if we are, and assertions
- * aren't enabled, we'll fail later in DropRelFileNodeBuffers while trying
+ * aren't enabled, we'll fail later in DropRelFileLocatorBuffers while trying
* to drop the temp rels.
*/
CheckForLocalBufferLeaks();
diff --git a/src/backend/storage/freespace/freespace.c b/src/backend/storage/freespace/freespace.c
index d41ae37..005def5 100644
--- a/src/backend/storage/freespace/freespace.c
+++ b/src/backend/storage/freespace/freespace.c
@@ -196,7 +196,7 @@ RecordPageWithFreeSpace(Relation rel, BlockNumber heapBlk, Size spaceAvail)
* WAL replay
*/
void
-XLogRecordPageWithFreeSpace(RelFileNode rnode, BlockNumber heapBlk,
+XLogRecordPageWithFreeSpace(RelFileLocator rlocator, BlockNumber heapBlk,
Size spaceAvail)
{
int new_cat = fsm_space_avail_to_cat(spaceAvail);
@@ -211,8 +211,8 @@ XLogRecordPageWithFreeSpace(RelFileNode rnode, BlockNumber heapBlk,
blkno = fsm_logical_to_physical(addr);
/* If the page doesn't exist already, extend */
- buf = XLogReadBufferExtended(rnode, FSM_FORKNUM, blkno, RBM_ZERO_ON_ERROR,
- InvalidBuffer);
+ buf = XLogReadBufferExtended(rlocator, FSM_FORKNUM, blkno,
+ RBM_ZERO_ON_ERROR, InvalidBuffer);
LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
page = BufferGetPage(buf);
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index d165b35..af4dab7 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -268,13 +268,13 @@ restart:
*
* Fix the corruption and restart.
*/
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blknum;
- BufferGetTag(buf, &rnode, &forknum, &blknum);
+ BufferGetTag(buf, &rlocator, &forknum, &blknum);
elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
- blknum, rnode.spcNode, rnode.dbNode, rnode.relNode);
+ blknum, rlocator.spcOid, rlocator.dbOid, rlocator.relNumber);
/* make sure we hold an exclusive lock */
if (!exclusive_lock_held)
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 671b00a..9dab931 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -442,7 +442,7 @@ ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
}
void
-ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode node)
+ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileLocator locator)
{
VirtualTransactionId *backends;
@@ -461,7 +461,7 @@ ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode
return;
backends = GetConflictingVirtualXIDs(latestRemovedXid,
- node.dbNode);
+ locator.dbOid);
ResolveRecoveryConflictWithVirtualXIDs(backends,
PROCSIG_RECOVERY_CONFLICT_SNAPSHOT,
@@ -475,7 +475,7 @@ ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode
*/
void
ResolveRecoveryConflictWithSnapshotFullXid(FullTransactionId latestRemovedFullXid,
- RelFileNode node)
+ RelFileLocator locator)
{
/*
* ResolveRecoveryConflictWithSnapshot operates on 32-bit TransactionIds,
@@ -493,7 +493,7 @@ ResolveRecoveryConflictWithSnapshotFullXid(FullTransactionId latestRemovedFullXi
TransactionId latestRemovedXid;
latestRemovedXid = XidFromFullTransactionId(latestRemovedFullXid);
- ResolveRecoveryConflictWithSnapshot(latestRemovedXid, node);
+ ResolveRecoveryConflictWithSnapshot(latestRemovedXid, locator);
}
}
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 25e7e4e..5136da6 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -1997,7 +1997,7 @@ PageIsPredicateLocked(Relation relation, BlockNumber blkno)
PREDICATELOCKTARGET *target;
SET_PREDICATELOCKTARGETTAG_PAGE(targettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
blkno);
@@ -2576,7 +2576,7 @@ PredicateLockRelation(Relation relation, Snapshot snapshot)
return;
SET_PREDICATELOCKTARGETTAG_RELATION(tag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id);
PredicateLockAcquire(&tag);
}
@@ -2599,7 +2599,7 @@ PredicateLockPage(Relation relation, BlockNumber blkno, Snapshot snapshot)
return;
SET_PREDICATELOCKTARGETTAG_PAGE(tag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
blkno);
PredicateLockAcquire(&tag);
@@ -2638,13 +2638,13 @@ PredicateLockTID(Relation relation, ItemPointer tid, Snapshot snapshot,
* level lock.
*/
SET_PREDICATELOCKTARGETTAG_RELATION(tag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id);
if (PredicateLockExists(&tag))
return;
SET_PREDICATELOCKTARGETTAG_TUPLE(tag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
ItemPointerGetBlockNumber(tid),
ItemPointerGetOffsetNumber(tid));
@@ -2974,7 +2974,7 @@ DropAllPredicateLocksFromTable(Relation relation, bool transfer)
if (!PredicateLockingNeededForRelation(relation))
return;
- dbId = relation->rd_node.dbNode;
+ dbId = relation->rd_locator.dbOid;
relId = relation->rd_id;
if (relation->rd_index == NULL)
{
@@ -3194,11 +3194,11 @@ PredicateLockPageSplit(Relation relation, BlockNumber oldblkno,
Assert(BlockNumberIsValid(newblkno));
SET_PREDICATELOCKTARGETTAG_PAGE(oldtargettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
oldblkno);
SET_PREDICATELOCKTARGETTAG_PAGE(newtargettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
newblkno);
@@ -4478,7 +4478,7 @@ CheckForSerializableConflictIn(Relation relation, ItemPointer tid, BlockNumber b
if (tid != NULL)
{
SET_PREDICATELOCKTARGETTAG_TUPLE(targettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
ItemPointerGetBlockNumber(tid),
ItemPointerGetOffsetNumber(tid));
@@ -4488,14 +4488,14 @@ CheckForSerializableConflictIn(Relation relation, ItemPointer tid, BlockNumber b
if (blkno != InvalidBlockNumber)
{
SET_PREDICATELOCKTARGETTAG_PAGE(targettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
blkno);
CheckTargetForConflictsIn(&targettag);
}
SET_PREDICATELOCKTARGETTAG_RELATION(targettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id);
CheckTargetForConflictsIn(&targettag);
}
@@ -4556,7 +4556,7 @@ CheckTableForSerializableConflictIn(Relation relation)
Assert(relation->rd_index == NULL); /* not an index relation */
- dbId = relation->rd_node.dbNode;
+ dbId = relation->rd_locator.dbOid;
heapId = relation->rd_id;
LWLockAcquire(SerializablePredicateListLock, LW_EXCLUSIVE);
diff --git a/src/backend/storage/smgr/README b/src/backend/storage/smgr/README
index e1cfc6c..1dfc16f 100644
--- a/src/backend/storage/smgr/README
+++ b/src/backend/storage/smgr/README
@@ -46,7 +46,7 @@ physical relation in system catalogs.
It is assumed that the main fork, fork number 0 or MAIN_FORKNUM, always
exists. Fork numbers are assigned in src/include/common/relpath.h.
Functions in smgr.c and md.c take an extra fork number argument, in addition
-to relfilenode and block number, to identify which relation fork you want to
+to relfilenumber and block number, to identify which relation fork you want to
access. Since most code wants to access the main fork, a shortcut version of
ReadBuffer that accesses MAIN_FORKNUM is provided in the buffer manager for
convenience.
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 43edaf5..3998296 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -35,7 +35,7 @@
#include "storage/bufmgr.h"
#include "storage/fd.h"
#include "storage/md.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "storage/sync.h"
#include "utils/hsearch.h"
@@ -89,11 +89,11 @@ static MemoryContext MdCxt; /* context for all MdfdVec objects */
/* Populate a file tag describing an md.c segment file. */
-#define INIT_MD_FILETAG(a,xx_rnode,xx_forknum,xx_segno) \
+#define INIT_MD_FILETAG(a,xx_rlocator,xx_forknum,xx_segno) \
( \
memset(&(a), 0, sizeof(FileTag)), \
(a).handler = SYNC_HANDLER_MD, \
- (a).rnode = (xx_rnode), \
+ (a).rlocator = (xx_rlocator), \
(a).forknum = (xx_forknum), \
(a).segno = (xx_segno) \
)
@@ -121,14 +121,14 @@ static MemoryContext MdCxt; /* context for all MdfdVec objects */
/* local routines */
-static void mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum,
+static void mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forkNum,
bool isRedo);
static MdfdVec *mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior);
static void register_dirty_segment(SMgrRelation reln, ForkNumber forknum,
MdfdVec *seg);
-static void register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
+static void register_unlink_segment(RelFileLocatorBackend rlocator, ForkNumber forknum,
BlockNumber segno);
-static void register_forget_request(RelFileNodeBackend rnode, ForkNumber forknum,
+static void register_forget_request(RelFileLocatorBackend rlocator, ForkNumber forknum,
BlockNumber segno);
static void _fdvec_resize(SMgrRelation reln,
ForkNumber forknum,
@@ -199,11 +199,11 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* should be here and not in commands/tablespace.c? But that would imply
* importing a lot of stuff that smgr.c oughtn't know, either.
*/
- TablespaceCreateDbspace(reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
+ TablespaceCreateDbspace(reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
isRedo);
- path = relpath(reln->smgr_rnode, forkNum);
+ path = relpath(reln->smgr_rlocator, forkNum);
fd = PathNameOpenFile(path, O_RDWR | O_CREAT | O_EXCL | PG_BINARY);
@@ -234,7 +234,7 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
/*
* mdunlink() -- Unlink a relation.
*
- * Note that we're passed a RelFileNodeBackend --- by the time this is called,
+ * Note that we're passed a RelFileLocatorBackend --- by the time this is called,
* there won't be an SMgrRelation hashtable entry anymore.
*
* forkNum can be a fork number to delete a specific fork, or InvalidForkNumber
@@ -243,10 +243,10 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* For regular relations, we don't unlink the first segment file of the rel,
* but just truncate it to zero length, and record a request to unlink it after
* the next checkpoint. Additional segments can be unlinked immediately,
- * however. Leaving the empty file in place prevents that relfilenode
- * number from being reused. The scenario this protects us from is:
+ * however. Leaving the empty file in place prevents that relfilenumber
+ * from being reused. The scenario this protects us from is:
* 1. We delete a relation (and commit, and actually remove its file).
- * 2. We create a new relation, which by chance gets the same relfilenode as
+ * 2. We create a new relation, which by chance gets the same relfilenumber as
* the just-deleted one (OIDs must've wrapped around for that to happen).
* 3. We crash before another checkpoint occurs.
* During replay, we would delete the file and then recreate it, which is fine
@@ -254,18 +254,18 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* But if we didn't WAL-log insertions, but instead relied on fsyncing the
* file after populating it (as we do at wal_level=minimal), the contents of
* the file would be lost forever. By leaving the empty file until after the
- * next checkpoint, we prevent reassignment of the relfilenode number until
- * it's safe, because relfilenode assignment skips over any existing file.
+ * next checkpoint, we prevent reassignment of the relfilenumber until it's
+ * safe, because relfilenumber assignment skips over any existing file.
*
* We do not need to go through this dance for temp relations, though, because
* we never make WAL entries for temp rels, and so a temp rel poses no threat
- * to the health of a regular rel that has taken over its relfilenode number.
+ * to the health of a regular rel that has taken over its relfilenumber.
* The fact that temp rels and regular rels have different file naming
* patterns provides additional safety.
*
* All the above applies only to the relation's main fork; other forks can
* just be removed immediately, since they are not needed to prevent the
- * relfilenode number from being recycled. Also, we do not carefully
+ * relfilenumber from being recycled. Also, we do not carefully
* track whether other forks have been created or not, but just attempt to
* unlink them unconditionally; so we should never complain about ENOENT.
*
@@ -278,16 +278,16 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* we are usually not in a transaction anymore when this is called.
*/
void
-mdunlink(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
+mdunlink(RelFileLocatorBackend rlocator, ForkNumber forkNum, bool isRedo)
{
/* Now do the per-fork work */
if (forkNum == InvalidForkNumber)
{
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- mdunlinkfork(rnode, forkNum, isRedo);
+ mdunlinkfork(rlocator, forkNum, isRedo);
}
else
- mdunlinkfork(rnode, forkNum, isRedo);
+ mdunlinkfork(rlocator, forkNum, isRedo);
}
/*
@@ -315,25 +315,25 @@ do_truncate(const char *path)
}
static void
-mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
+mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forkNum, bool isRedo)
{
char *path;
int ret;
- path = relpath(rnode, forkNum);
+ path = relpath(rlocator, forkNum);
/*
* Delete or truncate the first segment.
*/
- if (isRedo || forkNum != MAIN_FORKNUM || RelFileNodeBackendIsTemp(rnode))
+ if (isRedo || forkNum != MAIN_FORKNUM || RelFileLocatorBackendIsTemp(rlocator))
{
- if (!RelFileNodeBackendIsTemp(rnode))
+ if (!RelFileLocatorBackendIsTemp(rlocator))
{
/* Prevent other backends' fds from holding on to the disk space */
ret = do_truncate(path);
/* Forget any pending sync requests for the first segment */
- register_forget_request(rnode, forkNum, 0 /* first seg */ );
+ register_forget_request(rlocator, forkNum, 0 /* first seg */ );
}
else
ret = 0;
@@ -354,7 +354,7 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
ret = do_truncate(path);
/* Register request to unlink first segment later */
- register_unlink_segment(rnode, forkNum, 0 /* first seg */ );
+ register_unlink_segment(rlocator, forkNum, 0 /* first seg */ );
}
/*
@@ -373,7 +373,7 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
{
sprintf(segpath, "%s.%u", path, segno);
- if (!RelFileNodeBackendIsTemp(rnode))
+ if (!RelFileLocatorBackendIsTemp(rlocator))
{
/*
* Prevent other backends' fds from holding on to the disk
@@ -386,7 +386,7 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
* Forget any pending sync requests for this segment before we
* try to unlink.
*/
- register_forget_request(rnode, forkNum, segno);
+ register_forget_request(rlocator, forkNum, segno);
}
if (unlink(segpath) < 0)
@@ -437,7 +437,7 @@ mdextend(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
ereport(ERROR,
(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
errmsg("cannot extend file \"%s\" beyond %u blocks",
- relpath(reln->smgr_rnode, forknum),
+ relpath(reln->smgr_rlocator, forknum),
InvalidBlockNumber)));
v = _mdfd_getseg(reln, forknum, blocknum, skipFsync, EXTENSION_CREATE);
@@ -490,7 +490,7 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
if (reln->md_num_open_segs[forknum] > 0)
return &reln->md_seg_fds[forknum][0];
- path = relpath(reln->smgr_rnode, forknum);
+ path = relpath(reln->smgr_rlocator, forknum);
fd = PathNameOpenFile(path, O_RDWR | PG_BINARY);
@@ -645,10 +645,10 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
MdfdVec *v;
TRACE_POSTGRESQL_SMGR_MD_READ_START(forknum, blocknum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
- reln->smgr_rnode.backend);
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
+ reln->smgr_rlocator.backend);
v = _mdfd_getseg(reln, forknum, blocknum, false,
EXTENSION_FAIL | EXTENSION_CREATE_RECOVERY);
@@ -660,10 +660,10 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
nbytes = FileRead(v->mdfd_vfd, buffer, BLCKSZ, seekpos, WAIT_EVENT_DATA_FILE_READ);
TRACE_POSTGRESQL_SMGR_MD_READ_DONE(forknum, blocknum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
- reln->smgr_rnode.backend,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
+ reln->smgr_rlocator.backend,
nbytes,
BLCKSZ);
@@ -715,10 +715,10 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
#endif
TRACE_POSTGRESQL_SMGR_MD_WRITE_START(forknum, blocknum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
- reln->smgr_rnode.backend);
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
+ reln->smgr_rlocator.backend);
v = _mdfd_getseg(reln, forknum, blocknum, skipFsync,
EXTENSION_FAIL | EXTENSION_CREATE_RECOVERY);
@@ -730,10 +730,10 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
nbytes = FileWrite(v->mdfd_vfd, buffer, BLCKSZ, seekpos, WAIT_EVENT_DATA_FILE_WRITE);
TRACE_POSTGRESQL_SMGR_MD_WRITE_DONE(forknum, blocknum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
- reln->smgr_rnode.backend,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
+ reln->smgr_rlocator.backend,
nbytes,
BLCKSZ);
@@ -842,7 +842,7 @@ mdtruncate(SMgrRelation reln, ForkNumber forknum, BlockNumber nblocks)
return;
ereport(ERROR,
(errmsg("could not truncate file \"%s\" to %u blocks: it's only %u blocks now",
- relpath(reln->smgr_rnode, forknum),
+ relpath(reln->smgr_rlocator, forknum),
nblocks, curnblk)));
}
if (nblocks == curnblk)
@@ -983,7 +983,7 @@ register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
{
FileTag tag;
- INIT_MD_FILETAG(tag, reln->smgr_rnode.node, forknum, seg->mdfd_segno);
+ INIT_MD_FILETAG(tag, reln->smgr_rlocator.locator, forknum, seg->mdfd_segno);
/* Temp relations should never be fsync'd */
Assert(!SmgrIsTemp(reln));
@@ -1005,15 +1005,15 @@ register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
* register_unlink_segment() -- Schedule a file to be deleted after next checkpoint
*/
static void
-register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
+register_unlink_segment(RelFileLocatorBackend rlocator, ForkNumber forknum,
BlockNumber segno)
{
FileTag tag;
- INIT_MD_FILETAG(tag, rnode.node, forknum, segno);
+ INIT_MD_FILETAG(tag, rlocator.locator, forknum, segno);
/* Should never be used with temp relations */
- Assert(!RelFileNodeBackendIsTemp(rnode));
+ Assert(!RelFileLocatorBackendIsTemp(rlocator));
RegisterSyncRequest(&tag, SYNC_UNLINK_REQUEST, true /* retryOnError */ );
}
@@ -1022,12 +1022,12 @@ register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
* register_forget_request() -- forget any fsyncs for a relation fork's segment
*/
static void
-register_forget_request(RelFileNodeBackend rnode, ForkNumber forknum,
+register_forget_request(RelFileLocatorBackend rlocator, ForkNumber forknum,
BlockNumber segno)
{
FileTag tag;
- INIT_MD_FILETAG(tag, rnode.node, forknum, segno);
+ INIT_MD_FILETAG(tag, rlocator.locator, forknum, segno);
RegisterSyncRequest(&tag, SYNC_FORGET_REQUEST, true /* retryOnError */ );
}
@@ -1039,13 +1039,13 @@ void
ForgetDatabaseSyncRequests(Oid dbid)
{
FileTag tag;
- RelFileNode rnode;
+ RelFileLocator rlocator;
- rnode.dbNode = dbid;
- rnode.spcNode = 0;
- rnode.relNode = 0;
+ rlocator.dbOid = dbid;
+ rlocator.spcOid = 0;
+ rlocator.relNumber = 0;
- INIT_MD_FILETAG(tag, rnode, InvalidForkNumber, InvalidBlockNumber);
+ INIT_MD_FILETAG(tag, rlocator, InvalidForkNumber, InvalidBlockNumber);
RegisterSyncRequest(&tag, SYNC_FILTER_REQUEST, true /* retryOnError */ );
}
@@ -1054,7 +1054,7 @@ ForgetDatabaseSyncRequests(Oid dbid)
* DropRelationFiles -- drop files of all given relations
*/
void
-DropRelationFiles(RelFileNode *delrels, int ndelrels, bool isRedo)
+DropRelationFiles(RelFileLocator *delrels, int ndelrels, bool isRedo)
{
SMgrRelation *srels;
int i;
@@ -1129,7 +1129,7 @@ _mdfd_segpath(SMgrRelation reln, ForkNumber forknum, BlockNumber segno)
char *path,
*fullpath;
- path = relpath(reln->smgr_rnode, forknum);
+ path = relpath(reln->smgr_rlocator, forknum);
if (segno > 0)
{
@@ -1345,7 +1345,7 @@ _mdnblocks(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
int
mdsyncfiletag(const FileTag *ftag, char *path)
{
- SMgrRelation reln = smgropen(ftag->rnode, InvalidBackendId);
+ SMgrRelation reln = smgropen(ftag->rlocator, InvalidBackendId);
File file;
bool need_to_close;
int result,
@@ -1395,7 +1395,7 @@ mdunlinkfiletag(const FileTag *ftag, char *path)
char *p;
/* Compute the path. */
- p = relpathperm(ftag->rnode, MAIN_FORKNUM);
+ p = relpathperm(ftag->rlocator, MAIN_FORKNUM);
strlcpy(path, p, MAXPGPATH);
pfree(p);
@@ -1417,5 +1417,5 @@ mdfiletagmatches(const FileTag *ftag, const FileTag *candidate)
* We'll return true for all candidates that have the same database OID as
* the ftag from the SYNC_FILTER_REQUEST request, so they're forgotten.
*/
- return ftag->rnode.dbNode == candidate->rnode.dbNode;
+ return ftag->rlocator.dbOid == candidate->rlocator.dbOid;
}
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index a477f70..b21d8c3 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -46,7 +46,7 @@ typedef struct f_smgr
void (*smgr_create) (SMgrRelation reln, ForkNumber forknum,
bool isRedo);
bool (*smgr_exists) (SMgrRelation reln, ForkNumber forknum);
- void (*smgr_unlink) (RelFileNodeBackend rnode, ForkNumber forknum,
+ void (*smgr_unlink) (RelFileLocatorBackend rlocator, ForkNumber forknum,
bool isRedo);
void (*smgr_extend) (SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum, char *buffer, bool skipFsync);
@@ -143,9 +143,9 @@ smgrshutdown(int code, Datum arg)
* This does not attempt to actually open the underlying file.
*/
SMgrRelation
-smgropen(RelFileNode rnode, BackendId backend)
+smgropen(RelFileLocator rlocator, BackendId backend)
{
- RelFileNodeBackend brnode;
+ RelFileLocatorBackend brlocator;
SMgrRelation reln;
bool found;
@@ -154,7 +154,7 @@ smgropen(RelFileNode rnode, BackendId backend)
/* First time through: initialize the hash table */
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileNodeBackend);
+ ctl.keysize = sizeof(RelFileLocatorBackend);
ctl.entrysize = sizeof(SMgrRelationData);
SMgrRelationHash = hash_create("smgr relation table", 400,
&ctl, HASH_ELEM | HASH_BLOBS);
@@ -162,10 +162,10 @@ smgropen(RelFileNode rnode, BackendId backend)
}
/* Look up or create an entry */
- brnode.node = rnode;
- brnode.backend = backend;
+ brlocator.locator = rlocator;
+ brlocator.backend = backend;
reln = (SMgrRelation) hash_search(SMgrRelationHash,
- (void *) &brnode,
+ (void *) &brlocator,
HASH_ENTER, &found);
/* Initialize it if not present before */
@@ -267,7 +267,7 @@ smgrclose(SMgrRelation reln)
dlist_delete(&reln->node);
if (hash_search(SMgrRelationHash,
- (void *) &(reln->smgr_rnode),
+ (void *) &(reln->smgr_rlocator),
HASH_REMOVE, NULL) == NULL)
elog(ERROR, "SMgrRelation hashtable corrupted");
@@ -335,15 +335,15 @@ smgrcloseall(void)
}
/*
- * smgrclosenode() -- Close SMgrRelation object for given RelFileNode,
+ * smgrcloserellocator() -- Close SMgrRelation object for given RelFileLocator,
* if one exists.
*
- * This has the same effects as smgrclose(smgropen(rnode)), but it avoids
+ * This has the same effects as smgrclose(smgropen(rlocator)), but it avoids
* uselessly creating a hashtable entry only to drop it again when no
* such entry exists already.
*/
void
-smgrclosenode(RelFileNodeBackend rnode)
+smgrcloserellocator(RelFileLocatorBackend rlocator)
{
SMgrRelation reln;
@@ -352,7 +352,7 @@ smgrclosenode(RelFileNodeBackend rnode)
return;
reln = (SMgrRelation) hash_search(SMgrRelationHash,
- (void *) &rnode,
+ (void *) &rlocator,
HASH_FIND, NULL);
if (reln != NULL)
smgrclose(reln);
@@ -420,7 +420,7 @@ void
smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo)
{
int i = 0;
- RelFileNodeBackend *rnodes;
+ RelFileLocatorBackend *rlocators;
ForkNumber forknum;
if (nrels == 0)
@@ -430,19 +430,19 @@ smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo)
* Get rid of any remaining buffers for the relations. bufmgr will just
* drop them without bothering to write the contents.
*/
- DropRelFileNodesAllBuffers(rels, nrels);
+ DropRelFileLocatorsAllBuffers(rels, nrels);
/*
* create an array which contains all relations to be dropped, and close
* each relation's forks at the smgr level while at it
*/
- rnodes = palloc(sizeof(RelFileNodeBackend) * nrels);
+ rlocators = palloc(sizeof(RelFileLocatorBackend) * nrels);
for (i = 0; i < nrels; i++)
{
- RelFileNodeBackend rnode = rels[i]->smgr_rnode;
+ RelFileLocatorBackend rlocator = rels[i]->smgr_rlocator;
int which = rels[i]->smgr_which;
- rnodes[i] = rnode;
+ rlocators[i] = rlocator;
/* Close the forks at smgr level */
for (forknum = 0; forknum <= MAX_FORKNUM; forknum++)
@@ -458,7 +458,7 @@ smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo)
* closed our own smgr rel.
*/
for (i = 0; i < nrels; i++)
- CacheInvalidateSmgr(rnodes[i]);
+ CacheInvalidateSmgr(rlocators[i]);
/*
* Delete the physical file(s).
@@ -473,10 +473,10 @@ smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo)
int which = rels[i]->smgr_which;
for (forknum = 0; forknum <= MAX_FORKNUM; forknum++)
- smgrsw[which].smgr_unlink(rnodes[i], forknum, isRedo);
+ smgrsw[which].smgr_unlink(rlocators[i], forknum, isRedo);
}
- pfree(rnodes);
+ pfree(rlocators);
}
@@ -631,7 +631,7 @@ smgrtruncate(SMgrRelation reln, ForkNumber *forknum, int nforks, BlockNumber *nb
* Get rid of any buffers for the about-to-be-deleted blocks. bufmgr will
* just drop them without bothering to write the contents.
*/
- DropRelFileNodeBuffers(reln, forknum, nforks, nblocks);
+ DropRelFileLocatorBuffers(reln, forknum, nforks, nblocks);
/*
* Send a shared-inval message to force other backends to close any smgr
@@ -643,7 +643,7 @@ smgrtruncate(SMgrRelation reln, ForkNumber *forknum, int nforks, BlockNumber *nb
* is a performance-critical path.) As in the unlink code, we want to be
* sure the message is sent before we start changing things on-disk.
*/
- CacheInvalidateSmgr(reln->smgr_rnode);
+ CacheInvalidateSmgr(reln->smgr_rlocator);
/* Do the truncation */
for (i = 0; i < nforks; i++)
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index b4a2c8d..1430239 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -27,7 +27,7 @@
#include "utils/builtins.h"
#include "utils/numeric.h"
#include "utils/rel.h"
-#include "utils/relfilenodemap.h"
+#include "utils/relfilenumbermap.h"
#include "utils/relmapper.h"
#include "utils/syscache.h"
@@ -292,7 +292,7 @@ pg_tablespace_size_name(PG_FUNCTION_ARGS)
* is no check here or at the call sites for that.
*/
static int64
-calculate_relation_size(RelFileNode *rfn, BackendId backend, ForkNumber forknum)
+calculate_relation_size(RelFileLocator *rfn, BackendId backend, ForkNumber forknum)
{
int64 totalsize = 0;
char *relationpath;
@@ -349,7 +349,7 @@ pg_relation_size(PG_FUNCTION_ARGS)
if (rel == NULL)
PG_RETURN_NULL();
- size = calculate_relation_size(&(rel->rd_node), rel->rd_backend,
+ size = calculate_relation_size(&(rel->rd_locator), rel->rd_backend,
forkname_to_number(text_to_cstring(forkName)));
relation_close(rel, AccessShareLock);
@@ -374,7 +374,7 @@ calculate_toast_table_size(Oid toastrelid)
/* toast heap size, including FSM and VM size */
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- size += calculate_relation_size(&(toastRel->rd_node),
+ size += calculate_relation_size(&(toastRel->rd_locator),
toastRel->rd_backend, forkNum);
/* toast index size, including FSM and VM size */
@@ -388,7 +388,7 @@ calculate_toast_table_size(Oid toastrelid)
toastIdxRel = relation_open(lfirst_oid(lc),
AccessShareLock);
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- size += calculate_relation_size(&(toastIdxRel->rd_node),
+ size += calculate_relation_size(&(toastIdxRel->rd_locator),
toastIdxRel->rd_backend, forkNum);
relation_close(toastIdxRel, AccessShareLock);
@@ -417,7 +417,7 @@ calculate_table_size(Relation rel)
* heap size, including FSM and VM
*/
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- size += calculate_relation_size(&(rel->rd_node), rel->rd_backend,
+ size += calculate_relation_size(&(rel->rd_locator), rel->rd_backend,
forkNum);
/*
@@ -456,7 +456,7 @@ calculate_indexes_size(Relation rel)
idxRel = relation_open(idxOid, AccessShareLock);
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- size += calculate_relation_size(&(idxRel->rd_node),
+ size += calculate_relation_size(&(idxRel->rd_locator),
idxRel->rd_backend,
forkNum);
@@ -864,7 +864,7 @@ pg_relation_filenode(PG_FUNCTION_ARGS)
if (relform->relfilenode)
result = relform->relfilenode;
else /* Consult the relation mapper */
- result = RelationMapOidToFilenode(relid,
+ result = RelationMapOidToFilenumber(relid,
relform->relisshared);
}
else
@@ -882,11 +882,11 @@ pg_relation_filenode(PG_FUNCTION_ARGS)
}
/*
- * Get the relation via (reltablespace, relfilenode)
+ * Get the relation via (reltablespace, relfilenumber)
*
* This is expected to be used when somebody wants to match an individual file
* on the filesystem back to its table. That's not trivially possible via
- * pg_class, because that doesn't contain the relfilenodes of shared and nailed
+ * pg_class, because that doesn't contain the relfilenumbers of shared and nailed
* tables.
*
* We don't fail but return NULL if we cannot find a mapping.
@@ -898,14 +898,14 @@ Datum
pg_filenode_relation(PG_FUNCTION_ARGS)
{
Oid reltablespace = PG_GETARG_OID(0);
- Oid relfilenode = PG_GETARG_OID(1);
+ Oid relfilenumber = PG_GETARG_OID(1);
Oid heaprel;
- /* test needed so RelidByRelfilenode doesn't misbehave */
- if (!OidIsValid(relfilenode))
+ /* test needed so RelidByRelfilenumber doesn't misbehave */
+ if (!OidIsValid(relfilenumber))
PG_RETURN_NULL();
- heaprel = RelidByRelfilenode(reltablespace, relfilenode);
+ heaprel = RelidByRelfilenumber(reltablespace, relfilenumber);
if (!OidIsValid(heaprel))
PG_RETURN_NULL();
@@ -924,7 +924,7 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
Oid relid = PG_GETARG_OID(0);
HeapTuple tuple;
Form_pg_class relform;
- RelFileNode rnode;
+ RelFileLocator rlocator;
BackendId backend;
char *path;
@@ -937,29 +937,29 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
{
/* This logic should match RelationInitPhysicalAddr */
if (relform->reltablespace)
- rnode.spcNode = relform->reltablespace;
+ rlocator.spcOid = relform->reltablespace;
else
- rnode.spcNode = MyDatabaseTableSpace;
- if (rnode.spcNode == GLOBALTABLESPACE_OID)
- rnode.dbNode = InvalidOid;
+ rlocator.spcOid = MyDatabaseTableSpace;
+ if (rlocator.spcOid == GLOBALTABLESPACE_OID)
+ rlocator.dbOid = InvalidOid;
else
- rnode.dbNode = MyDatabaseId;
+ rlocator.dbOid = MyDatabaseId;
if (relform->relfilenode)
- rnode.relNode = relform->relfilenode;
+ rlocator.relNumber = relform->relfilenode;
else /* Consult the relation mapper */
- rnode.relNode = RelationMapOidToFilenode(relid,
- relform->relisshared);
+ rlocator.relNumber = RelationMapOidToFilenumber(relid,
+ relform->relisshared);
}
else
{
/* no storage, return NULL */
- rnode.relNode = InvalidOid;
+ rlocator.relNumber = InvalidOid;
/* some compilers generate warnings without these next two lines */
- rnode.dbNode = InvalidOid;
- rnode.spcNode = InvalidOid;
+ rlocator.dbOid = InvalidOid;
+ rlocator.spcOid = InvalidOid;
}
- if (!OidIsValid(rnode.relNode))
+ if (!OidIsValid(rlocator.relNumber))
{
ReleaseSysCache(tuple);
PG_RETURN_NULL();
@@ -990,7 +990,7 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
ReleaseSysCache(tuple);
- path = relpathbackend(rnode, backend, MAIN_FORKNUM);
+ path = relpathbackend(rlocator, backend, MAIN_FORKNUM);
PG_RETURN_TEXT_P(cstring_to_text(path));
}
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 67b9675e..5ccdcf6 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -2,7 +2,7 @@
* pg_upgrade_support.c
*
* server-side functions to set backend global variables
- * to control oid and relfilenode assignment, and do other special
+ * to control oid and relfilenumber assignment, and do other special
* hacks needed for pg_upgrade.
*
* Copyright (c) 2010-2022, PostgreSQL Global Development Group
@@ -98,10 +98,10 @@ binary_upgrade_set_next_heap_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_heap_relfilenode(PG_FUNCTION_ARGS)
{
- Oid nodeoid = PG_GETARG_OID(0);
+ Oid relfilenumber = PG_GETARG_OID(0);
CHECK_IS_BINARY_UPGRADE;
- binary_upgrade_next_heap_pg_class_relfilenode = nodeoid;
+ binary_upgrade_next_heap_pg_class_relfilenumber = relfilenumber;
PG_RETURN_VOID();
}
@@ -120,10 +120,10 @@ binary_upgrade_set_next_index_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_index_relfilenode(PG_FUNCTION_ARGS)
{
- Oid nodeoid = PG_GETARG_OID(0);
+ Oid relfilenumber = PG_GETARG_OID(0);
CHECK_IS_BINARY_UPGRADE;
- binary_upgrade_next_index_pg_class_relfilenode = nodeoid;
+ binary_upgrade_next_index_pg_class_relfilenumber = relfilenumber;
PG_RETURN_VOID();
}
@@ -142,10 +142,10 @@ binary_upgrade_set_next_toast_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_toast_relfilenode(PG_FUNCTION_ARGS)
{
- Oid nodeoid = PG_GETARG_OID(0);
+ Oid relfilenumber = PG_GETARG_OID(0);
CHECK_IS_BINARY_UPGRADE;
- binary_upgrade_next_toast_pg_class_relfilenode = nodeoid;
+ binary_upgrade_next_toast_pg_class_relfilenumber = relfilenumber;
PG_RETURN_VOID();
}
diff --git a/src/backend/utils/cache/Makefile b/src/backend/utils/cache/Makefile
index 38e46d2..5105018 100644
--- a/src/backend/utils/cache/Makefile
+++ b/src/backend/utils/cache/Makefile
@@ -21,7 +21,7 @@ OBJS = \
partcache.o \
plancache.o \
relcache.o \
- relfilenodemap.o \
+ relfilenumbermap.o \
relmapper.o \
spccache.o \
syscache.o \
diff --git a/src/backend/utils/cache/inval.c b/src/backend/utils/cache/inval.c
index af000d4..eb5782f 100644
--- a/src/backend/utils/cache/inval.c
+++ b/src/backend/utils/cache/inval.c
@@ -661,11 +661,11 @@ LocalExecuteInvalidationMessage(SharedInvalidationMessage *msg)
* We could have smgr entries for relations of other databases, so no
* short-circuit test is possible here.
*/
- RelFileNodeBackend rnode;
+ RelFileLocatorBackend rlocator;
- rnode.node = msg->sm.rnode;
- rnode.backend = (msg->sm.backend_hi << 16) | (int) msg->sm.backend_lo;
- smgrclosenode(rnode);
+ rlocator.locator = msg->sm.rlocator;
+ rlocator.backend = (msg->sm.backend_hi << 16) | (int) msg->sm.backend_lo;
+ smgrcloserellocator(rlocator);
}
else if (msg->id == SHAREDINVALRELMAP_ID)
{
@@ -1459,14 +1459,14 @@ CacheInvalidateRelcacheByRelid(Oid relid)
* Thus, the maximum possible backend ID is 2^23-1.
*/
void
-CacheInvalidateSmgr(RelFileNodeBackend rnode)
+CacheInvalidateSmgr(RelFileLocatorBackend rlocator)
{
SharedInvalidationMessage msg;
msg.sm.id = SHAREDINVALSMGR_ID;
- msg.sm.backend_hi = rnode.backend >> 16;
- msg.sm.backend_lo = rnode.backend & 0xffff;
- msg.sm.rnode = rnode.node;
+ msg.sm.backend_hi = rlocator.backend >> 16;
+ msg.sm.backend_lo = rlocator.backend & 0xffff;
+ msg.sm.rlocator = rlocator.locator;
/* check AddCatcacheInvalidationMessage() for an explanation */
VALGRIND_MAKE_MEM_DEFINED(&msg, sizeof(msg));
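(Aside for readers of the hunk above: the backend_hi/backend_lo split that CacheInvalidateSmgr relies on packs a backend ID of at most 2^23-1 into one small signed field plus one 16-bit field. The Python below is only an illustration of that bit layout, not PostgreSQL code:)

```python
def pack_backend_id(backend: int) -> tuple[int, int]:
    """Split a backend ID into the "hi" and "lo" parts used by the
    shared-invalidation smgr message, mirroring
    msg.sm.backend_hi = rlocator.backend >> 16 and
    msg.sm.backend_lo = rlocator.backend & 0xffff."""
    hi = backend >> 16       # upper bits (arithmetic shift keeps the sign)
    lo = backend & 0xFFFF    # lower 16 bits
    return hi, lo

def unpack_backend_id(hi: int, lo: int) -> int:
    # Mirrors (msg->sm.backend_hi << 16) | (int) msg->sm.backend_lo
    return (hi << 16) | lo
```

This round-trips every value up to the 2^23-1 maximum mentioned in the comment, and also -1 (InvalidBackendId), because the arithmetic right shift preserves the sign the same way the C int fields do.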
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 0e8fda9..fbd1571 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -369,7 +369,7 @@ ScanPgRelation(Oid targetRelId, bool indexOK, bool force_non_historic)
/*
* The caller might need a tuple that's newer than the one the historic
* snapshot; currently the only case requiring to do so is looking up the
- * relfilenode of non mapped system relations during decoding. That
+ * relfilenumber of non mapped system relations during decoding. That
* snapshot can't change in the midst of a relcache build, so there's no
* need to register the snapshot.
*/
@@ -1133,8 +1133,8 @@ retry:
relation->rd_refcnt = 0;
relation->rd_isnailed = false;
relation->rd_createSubid = InvalidSubTransactionId;
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
relation->rd_droppedSubid = InvalidSubTransactionId;
switch (relation->rd_rel->relpersistence)
{
@@ -1300,7 +1300,7 @@ retry:
}
/*
- * Initialize the physical addressing info (RelFileNode) for a relcache entry
+ * Initialize the physical addressing info (RelFileLocator) for a relcache entry
*
* Note: at the physical level, relations in the pg_global tablespace must
* be treated as shared, even if relisshared isn't set. Hence we do not
@@ -1309,20 +1309,20 @@ retry:
static void
RelationInitPhysicalAddr(Relation relation)
{
- Oid oldnode = relation->rd_node.relNode;
+ Oid oldnumber = relation->rd_locator.relNumber;
/* these relations kinds never have storage */
if (!RELKIND_HAS_STORAGE(relation->rd_rel->relkind))
return;
if (relation->rd_rel->reltablespace)
- relation->rd_node.spcNode = relation->rd_rel->reltablespace;
+ relation->rd_locator.spcOid = relation->rd_rel->reltablespace;
else
- relation->rd_node.spcNode = MyDatabaseTableSpace;
- if (relation->rd_node.spcNode == GLOBALTABLESPACE_OID)
- relation->rd_node.dbNode = InvalidOid;
+ relation->rd_locator.spcOid = MyDatabaseTableSpace;
+ if (relation->rd_locator.spcOid == GLOBALTABLESPACE_OID)
+ relation->rd_locator.dbOid = InvalidOid;
else
- relation->rd_node.dbNode = MyDatabaseId;
+ relation->rd_locator.dbOid = MyDatabaseId;
if (relation->rd_rel->relfilenode)
{
@@ -1356,30 +1356,30 @@ RelationInitPhysicalAddr(Relation relation)
heap_freetuple(phys_tuple);
}
- relation->rd_node.relNode = relation->rd_rel->relfilenode;
+ relation->rd_locator.relNumber = relation->rd_rel->relfilenode;
}
else
{
/* Consult the relation mapper */
- relation->rd_node.relNode =
- RelationMapOidToFilenode(relation->rd_id,
- relation->rd_rel->relisshared);
- if (!OidIsValid(relation->rd_node.relNode))
+ relation->rd_locator.relNumber =
+ RelationMapOidToFilenumber(relation->rd_id,
+ relation->rd_rel->relisshared);
+ if (!OidIsValid(relation->rd_locator.relNumber))
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
RelationGetRelationName(relation), relation->rd_id);
}
/*
* For RelationNeedsWAL() to answer correctly on parallel workers, restore
- * rd_firstRelfilenodeSubid. No subtransactions start or end while in
+ * rd_firstRelfilelocatorSubid. No subtransactions start or end while in
* parallel mode, so the specific SubTransactionId does not matter.
*/
- if (IsParallelWorker() && oldnode != relation->rd_node.relNode)
+ if (IsParallelWorker() && oldnumber != relation->rd_locator.relNumber)
{
- if (RelFileNodeSkippingWAL(relation->rd_node))
- relation->rd_firstRelfilenodeSubid = TopSubTransactionId;
+ if (RelFileLocatorSkippingWAL(relation->rd_locator))
+ relation->rd_firstRelfilelocatorSubid = TopSubTransactionId;
else
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
}
}
@@ -1889,8 +1889,8 @@ formrdesc(const char *relationName, Oid relationReltype,
*/
relation->rd_isnailed = true;
relation->rd_createSubid = InvalidSubTransactionId;
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
relation->rd_droppedSubid = InvalidSubTransactionId;
relation->rd_backend = InvalidBackendId;
relation->rd_islocaltemp = false;
@@ -1978,9 +1978,9 @@ formrdesc(const char *relationName, Oid relationReltype,
/*
* All relations made with formrdesc are mapped. This is necessarily so
- * because there is no other way to know what filenode they currently
+ * because there is no other way to know what filenumber they currently
* have. In bootstrap mode, add them to the initial relation mapper data,
- * specifying that the initial filenode is the same as the OID.
+ * specifying that the initial filenumber is the same as the OID.
*/
relation->rd_rel->relfilenode = InvalidOid;
if (IsBootstrapProcessingMode())
@@ -2180,7 +2180,7 @@ RelationClose(Relation relation)
#ifdef RELCACHE_FORCE_RELEASE
if (RelationHasReferenceCountZero(relation) &&
relation->rd_createSubid == InvalidSubTransactionId &&
- relation->rd_firstRelfilenodeSubid == InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid == InvalidSubTransactionId)
RelationClearRelation(relation, false);
#endif
}
@@ -2352,7 +2352,7 @@ RelationReloadNailed(Relation relation)
{
/*
* If it's a nailed-but-not-mapped index, then we need to re-read the
- * pg_class row to see if its relfilenode changed.
+ * pg_class row to see if its relfilenumber changed.
*/
RelationReloadIndexInfo(relation);
}
@@ -2700,8 +2700,8 @@ RelationClearRelation(Relation relation, bool rebuild)
Assert(newrel->rd_isnailed == relation->rd_isnailed);
/* creation sub-XIDs must be preserved */
SWAPFIELD(SubTransactionId, rd_createSubid);
- SWAPFIELD(SubTransactionId, rd_newRelfilenodeSubid);
- SWAPFIELD(SubTransactionId, rd_firstRelfilenodeSubid);
+ SWAPFIELD(SubTransactionId, rd_newRelfilelocatorSubid);
+ SWAPFIELD(SubTransactionId, rd_firstRelfilelocatorSubid);
SWAPFIELD(SubTransactionId, rd_droppedSubid);
/* un-swap rd_rel pointers, swap contents instead */
SWAPFIELD(Form_pg_class, rd_rel);
@@ -2791,12 +2791,12 @@ static void
RelationFlushRelation(Relation relation)
{
if (relation->rd_createSubid != InvalidSubTransactionId ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId)
{
/*
* New relcache entries are always rebuilt, not flushed; else we'd
* forget the "new" status of the relation. Ditto for the
- * new-relfilenode status.
+ * new-relfilenumber status.
*
* The rel could have zero refcnt here, so temporarily increment the
* refcnt to ensure it's safe to rebuild it. We can assume that the
@@ -2835,7 +2835,7 @@ RelationForgetRelation(Oid rid)
Assert(relation->rd_droppedSubid == InvalidSubTransactionId);
if (relation->rd_createSubid != InvalidSubTransactionId ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId)
{
/*
* In the event of subtransaction rollback, we must not forget
@@ -2894,7 +2894,7 @@ RelationCacheInvalidateEntry(Oid relationId)
*
* Apart from debug_discard_caches, this is currently used only to recover
* from SI message buffer overflow, so we do not touch relations having
- * new-in-transaction relfilenodes; they cannot be targets of cross-backend
+ * new-in-transaction relfilenumbers; they cannot be targets of cross-backend
* SI updates (and our own updates now go through a separate linked list
* that isn't limited by the SI message buffer size).
*
@@ -2909,7 +2909,7 @@ RelationCacheInvalidateEntry(Oid relationId)
* so hash_seq_search will complete safely; (b) during the second pass we
* only hold onto pointers to nondeletable entries.
*
- * The two-phase approach also makes it easy to update relfilenodes for
+ * The two-phase approach also makes it easy to update relfilenumbers for
* mapped relations before we do anything else, and to ensure that the
* second pass processes nailed-in-cache items before other nondeletable
* items. This should ensure that system catalogs are up to date before
@@ -2948,12 +2948,12 @@ RelationCacheInvalidate(bool debug_discard)
/*
* Ignore new relations; no other backend will manipulate them before
- * we commit. Likewise, before replacing a relation's relfilenode, we
- * shall have acquired AccessExclusiveLock and drained any applicable
- * pending invalidations.
+ * we commit. Likewise, before replacing a relation's relfilenumber,
+ * we shall have acquired AccessExclusiveLock and drained any
+ * applicable pending invalidations.
*/
if (relation->rd_createSubid != InvalidSubTransactionId ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId)
continue;
relcacheInvalsReceived++;
@@ -2967,8 +2967,8 @@ RelationCacheInvalidate(bool debug_discard)
else
{
/*
- * If it's a mapped relation, immediately update its rd_node in
- * case its relfilenode changed. We must do this during phase 1
+ * If it's a mapped relation, immediately update its rd_locator in
+ * case its relfilenumber changed. We must do this during phase 1
* in case the relation is consulted during rebuild of other
* relcache entries in phase 2. It's safe since consulting the
* map doesn't involve any access to relcache entries.
@@ -3078,14 +3078,14 @@ AssertPendingSyncConsistency(Relation relation)
RelationIsPermanent(relation) &&
((relation->rd_createSubid != InvalidSubTransactionId &&
RELKIND_HAS_STORAGE(relation->rd_rel->relkind)) ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId);
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId);
- Assert(relcache_verdict == RelFileNodeSkippingWAL(relation->rd_node));
+ Assert(relcache_verdict == RelFileLocatorSkippingWAL(relation->rd_locator));
if (relation->rd_droppedSubid != InvalidSubTransactionId)
Assert(!relation->rd_isvalid &&
(relation->rd_createSubid != InvalidSubTransactionId ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId));
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId));
}
/*
@@ -3282,8 +3282,8 @@ AtEOXact_cleanup(Relation relation, bool isCommit)
* also lets RelationClearRelation() drop the relcache entry.
*/
relation->rd_createSubid = InvalidSubTransactionId;
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
relation->rd_droppedSubid = InvalidSubTransactionId;
if (clear_relcache)
@@ -3397,8 +3397,8 @@ AtEOSubXact_cleanup(Relation relation, bool isCommit,
{
/* allow the entry to be removed */
relation->rd_createSubid = InvalidSubTransactionId;
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
relation->rd_droppedSubid = InvalidSubTransactionId;
RelationClearRelation(relation, false);
return;
@@ -3419,23 +3419,23 @@ AtEOSubXact_cleanup(Relation relation, bool isCommit,
}
/*
- * Likewise, update or drop any new-relfilenode-in-subtransaction record
+ * Likewise, update or drop any new-relfilenumber-in-subtransaction record
* or drop record.
*/
- if (relation->rd_newRelfilenodeSubid == mySubid)
+ if (relation->rd_newRelfilelocatorSubid == mySubid)
{
if (isCommit)
- relation->rd_newRelfilenodeSubid = parentSubid;
+ relation->rd_newRelfilelocatorSubid = parentSubid;
else
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
}
- if (relation->rd_firstRelfilenodeSubid == mySubid)
+ if (relation->rd_firstRelfilelocatorSubid == mySubid)
{
if (isCommit)
- relation->rd_firstRelfilenodeSubid = parentSubid;
+ relation->rd_firstRelfilelocatorSubid = parentSubid;
else
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
}
if (relation->rd_droppedSubid == mySubid)
@@ -3459,7 +3459,7 @@ RelationBuildLocalRelation(const char *relname,
TupleDesc tupDesc,
Oid relid,
Oid accessmtd,
- Oid relfilenode,
+ Oid relfilenumber,
Oid reltablespace,
bool shared_relation,
bool mapped_relation,
@@ -3533,8 +3533,8 @@ RelationBuildLocalRelation(const char *relname,
/* it's being created in this transaction */
rel->rd_createSubid = GetCurrentSubTransactionId();
- rel->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- rel->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ rel->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ rel->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
rel->rd_droppedSubid = InvalidSubTransactionId;
/*
@@ -3616,7 +3616,7 @@ RelationBuildLocalRelation(const char *relname,
/*
* Insert relation physical and logical identifiers (OIDs) into the right
- * places. For a mapped relation, we set relfilenode to zero and rely on
+ * places. For a mapped relation, we set relfilenumber to zero and rely on
* RelationInitPhysicalAddr to consult the map.
*/
rel->rd_rel->relisshared = shared_relation;
@@ -3632,10 +3632,10 @@ RelationBuildLocalRelation(const char *relname,
{
rel->rd_rel->relfilenode = InvalidOid;
/* Add it to the active mapping information */
- RelationMapUpdateMap(relid, relfilenode, shared_relation, true);
+ RelationMapUpdateMap(relid, relfilenumber, shared_relation, true);
}
else
- rel->rd_rel->relfilenode = relfilenode;
+ rel->rd_rel->relfilenode = relfilenumber;
RelationInitLockInfo(rel); /* see lmgr.c */
@@ -3683,13 +3683,13 @@ RelationBuildLocalRelation(const char *relname,
/*
- * RelationSetNewRelfilenode
+ * RelationSetNewRelfilenumber
*
- * Assign a new relfilenode (physical file name), and possibly a new
+ * Assign a new relfilenumber (physical file name), and possibly a new
* persistence setting, to the relation.
*
* This allows a full rewrite of the relation to be done with transactional
- * safety (since the filenode assignment can be rolled back). Note however
+ * safety (since the filenumber assignment can be rolled back). Note however
* that there is no simple way to access the relation's old data for the
* remainder of the current transaction. This limits the usefulness to cases
* such as TRUNCATE or rebuilding an index from scratch.
@@ -3697,19 +3697,19 @@ RelationBuildLocalRelation(const char *relname,
* Caller must already hold exclusive lock on the relation.
*/
void
-RelationSetNewRelfilenode(Relation relation, char persistence)
+RelationSetNewRelfilenumber(Relation relation, char persistence)
{
- Oid newrelfilenode;
+ Oid newrelfilenumber;
Relation pg_class;
HeapTuple tuple;
Form_pg_class classform;
MultiXactId minmulti = InvalidMultiXactId;
TransactionId freezeXid = InvalidTransactionId;
- RelFileNode newrnode;
+ RelFileLocator newrlocator;
- /* Allocate a new relfilenode */
- newrelfilenode = GetNewRelFileNode(relation->rd_rel->reltablespace, NULL,
- persistence);
+ /* Allocate a new relfilenumber */
+ newrelfilenumber = GetNewRelFileNumber(relation->rd_rel->reltablespace,
+ NULL, persistence);
/*
* Get a writable copy of the pg_class tuple for the given relation.
@@ -3729,28 +3729,28 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
RelationDropStorage(relation);
/*
- * Create storage for the main fork of the new relfilenode. If it's a
+ * Create storage for the main fork of the new relfilenumber. If it's a
* table-like object, call into the table AM to do so, which'll also
* create the table's init fork if needed.
*
- * NOTE: If relevant for the AM, any conflict in relfilenode value will be
- * caught here, if GetNewRelFileNode messes up for any reason.
+ * NOTE: If relevant for the AM, any conflict in relfilenumber value will be
+ * caught here, if GetNewRelFileNumber messes up for any reason.
*/
- newrnode = relation->rd_node;
- newrnode.relNode = newrelfilenode;
+ newrlocator = relation->rd_locator;
+ newrlocator.relNumber = newrelfilenumber;
if (RELKIND_HAS_TABLE_AM(relation->rd_rel->relkind))
{
- table_relation_set_new_filenode(relation, &newrnode,
- persistence,
- &freezeXid, &minmulti);
+ table_relation_set_new_filelocator(relation, &newrlocator,
+ persistence,
+ &freezeXid, &minmulti);
}
else if (RELKIND_HAS_STORAGE(relation->rd_rel->relkind))
{
/* handle these directly, at least for now */
SMgrRelation srel;
- srel = RelationCreateStorage(newrnode, persistence, true);
+ srel = RelationCreateStorage(newrlocator, persistence, true);
smgrclose(srel);
}
else
@@ -3789,7 +3789,7 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
/* Do the deed */
RelationMapUpdateMap(RelationGetRelid(relation),
- newrelfilenode,
+ newrelfilenumber,
relation->rd_rel->relisshared,
false);
@@ -3799,7 +3799,7 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
else
{
/* Normal case, update the pg_class entry */
- classform->relfilenode = newrelfilenode;
+ classform->relfilenode = newrelfilenumber;
/* relpages etc. never change for sequences */
if (relation->rd_rel->relkind != RELKIND_SEQUENCE)
@@ -3825,27 +3825,27 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
*/
CommandCounterIncrement();
- RelationAssumeNewRelfilenode(relation);
+ RelationAssumeNewRelfilelocator(relation);
}
/*
- * RelationAssumeNewRelfilenode
+ * RelationAssumeNewRelfilelocator
*
* Code that modifies pg_class.reltablespace or pg_class.relfilenode must call
* this. The call shall precede any code that might insert WAL records whose
- * replay would modify bytes in the new RelFileNode, and the call shall follow
- * any WAL modifying bytes in the prior RelFileNode. See struct RelationData.
+ * replay would modify bytes in the new RelFileLocator, and the call shall follow
+ * any WAL modifying bytes in the prior RelFileLocator. See struct RelationData.
* Ideally, call this as near as possible to the CommandCounterIncrement()
* that makes the pg_class change visible (before it or after it); that
* minimizes the chance of future development adding a forbidden WAL insertion
- * between RelationAssumeNewRelfilenode() and CommandCounterIncrement().
+ * between RelationAssumeNewRelfilelocator() and CommandCounterIncrement().
*/
void
-RelationAssumeNewRelfilenode(Relation relation)
+RelationAssumeNewRelfilelocator(Relation relation)
{
- relation->rd_newRelfilenodeSubid = GetCurrentSubTransactionId();
- if (relation->rd_firstRelfilenodeSubid == InvalidSubTransactionId)
- relation->rd_firstRelfilenodeSubid = relation->rd_newRelfilenodeSubid;
+ relation->rd_newRelfilelocatorSubid = GetCurrentSubTransactionId();
+ if (relation->rd_firstRelfilelocatorSubid == InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid = relation->rd_newRelfilelocatorSubid;
/* Flag relation as needing eoxact cleanup (to clear these fields) */
EOXactListAdd(relation);
@@ -6254,8 +6254,8 @@ load_relcache_init_file(bool shared)
rel->rd_fkeyvalid = false;
rel->rd_fkeylist = NIL;
rel->rd_createSubid = InvalidSubTransactionId;
- rel->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- rel->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ rel->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ rel->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
rel->rd_droppedSubid = InvalidSubTransactionId;
rel->rd_amcache = NULL;
MemSet(&rel->pgstat_info, 0, sizeof(rel->pgstat_info));
diff --git a/src/backend/utils/cache/relfilenodemap.c b/src/backend/utils/cache/relfilenodemap.c
deleted file mode 100644
index 70c323c..0000000
--- a/src/backend/utils/cache/relfilenodemap.c
+++ /dev/null
@@ -1,244 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * relfilenodemap.c
- * relfilenode to oid mapping cache.
- *
- * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * IDENTIFICATION
- * src/backend/utils/cache/relfilenodemap.c
- *
- *-------------------------------------------------------------------------
- */
-#include "postgres.h"
-
-#include "access/genam.h"
-#include "access/htup_details.h"
-#include "access/table.h"
-#include "catalog/pg_class.h"
-#include "catalog/pg_tablespace.h"
-#include "miscadmin.h"
-#include "utils/builtins.h"
-#include "utils/catcache.h"
-#include "utils/fmgroids.h"
-#include "utils/hsearch.h"
-#include "utils/inval.h"
-#include "utils/rel.h"
-#include "utils/relfilenodemap.h"
-#include "utils/relmapper.h"
-
-/* Hash table for information about each relfilenode <-> oid pair */
-static HTAB *RelfilenodeMapHash = NULL;
-
-/* built first time through in InitializeRelfilenodeMap */
-static ScanKeyData relfilenode_skey[2];
-
-typedef struct
-{
- Oid reltablespace;
- Oid relfilenode;
-} RelfilenodeMapKey;
-
-typedef struct
-{
- RelfilenodeMapKey key; /* lookup key - must be first */
- Oid relid; /* pg_class.oid */
-} RelfilenodeMapEntry;
-
-/*
- * RelfilenodeMapInvalidateCallback
- * Flush mapping entries when pg_class is updated in a relevant fashion.
- */
-static void
-RelfilenodeMapInvalidateCallback(Datum arg, Oid relid)
-{
- HASH_SEQ_STATUS status;
- RelfilenodeMapEntry *entry;
-
- /* callback only gets registered after creating the hash */
- Assert(RelfilenodeMapHash != NULL);
-
- hash_seq_init(&status, RelfilenodeMapHash);
- while ((entry = (RelfilenodeMapEntry *) hash_seq_search(&status)) != NULL)
- {
- /*
- * If relid is InvalidOid, signaling a complete reset, we must remove
- * all entries, otherwise just remove the specific relation's entry.
- * Always remove negative cache entries.
- */
- if (relid == InvalidOid || /* complete reset */
- entry->relid == InvalidOid || /* negative cache entry */
- entry->relid == relid) /* individual flushed relation */
- {
- if (hash_search(RelfilenodeMapHash,
- (void *) &entry->key,
- HASH_REMOVE,
- NULL) == NULL)
- elog(ERROR, "hash table corrupted");
- }
- }
-}
-
-/*
- * InitializeRelfilenodeMap
- * Initialize cache, either on first use or after a reset.
- */
-static void
-InitializeRelfilenodeMap(void)
-{
- HASHCTL ctl;
- int i;
-
- /* Make sure we've initialized CacheMemoryContext. */
- if (CacheMemoryContext == NULL)
- CreateCacheMemoryContext();
-
- /* build skey */
- MemSet(&relfilenode_skey, 0, sizeof(relfilenode_skey));
-
- for (i = 0; i < 2; i++)
- {
- fmgr_info_cxt(F_OIDEQ,
- &relfilenode_skey[i].sk_func,
- CacheMemoryContext);
- relfilenode_skey[i].sk_strategy = BTEqualStrategyNumber;
- relfilenode_skey[i].sk_subtype = InvalidOid;
- relfilenode_skey[i].sk_collation = InvalidOid;
- }
-
- relfilenode_skey[0].sk_attno = Anum_pg_class_reltablespace;
- relfilenode_skey[1].sk_attno = Anum_pg_class_relfilenode;
-
- /*
- * Only create the RelfilenodeMapHash now, so we don't end up partially
- * initialized when fmgr_info_cxt() above ERRORs out with an out of memory
- * error.
- */
- ctl.keysize = sizeof(RelfilenodeMapKey);
- ctl.entrysize = sizeof(RelfilenodeMapEntry);
- ctl.hcxt = CacheMemoryContext;
-
- RelfilenodeMapHash =
- hash_create("RelfilenodeMap cache", 64, &ctl,
- HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
-
- /* Watch for invalidation events. */
- CacheRegisterRelcacheCallback(RelfilenodeMapInvalidateCallback,
- (Datum) 0);
-}
-
-/*
- * Map a relation's (tablespace, filenode) to a relation's oid and cache the
- * result.
- *
- * Returns InvalidOid if no relation matching the criteria could be found.
- */
-Oid
-RelidByRelfilenode(Oid reltablespace, Oid relfilenode)
-{
- RelfilenodeMapKey key;
- RelfilenodeMapEntry *entry;
- bool found;
- SysScanDesc scandesc;
- Relation relation;
- HeapTuple ntp;
- ScanKeyData skey[2];
- Oid relid;
-
- if (RelfilenodeMapHash == NULL)
- InitializeRelfilenodeMap();
-
- /* pg_class will show 0 when the value is actually MyDatabaseTableSpace */
- if (reltablespace == MyDatabaseTableSpace)
- reltablespace = 0;
-
- MemSet(&key, 0, sizeof(key));
- key.reltablespace = reltablespace;
- key.relfilenode = relfilenode;
-
- /*
- * Check cache and return entry if one is found. Even if no target
- * relation can be found later on we store the negative match and return a
- * InvalidOid from cache. That's not really necessary for performance
- * since querying invalid values isn't supposed to be a frequent thing,
- * but it's basically free.
- */
- entry = hash_search(RelfilenodeMapHash, (void *) &key, HASH_FIND, &found);
-
- if (found)
- return entry->relid;
-
- /* ok, no previous cache entry, do it the hard way */
-
- /* initialize empty/negative cache entry before doing the actual lookups */
- relid = InvalidOid;
-
- if (reltablespace == GLOBALTABLESPACE_OID)
- {
- /*
- * Ok, shared table, check relmapper.
- */
- relid = RelationMapFilenodeToOid(relfilenode, true);
- }
- else
- {
- /*
- * Not a shared table, could either be a plain relation or a
- * non-shared, nailed one, like e.g. pg_class.
- */
-
- /* check for plain relations by looking in pg_class */
- relation = table_open(RelationRelationId, AccessShareLock);
-
- /* copy scankey to local copy, it will be modified during the scan */
- memcpy(skey, relfilenode_skey, sizeof(skey));
-
- /* set scan arguments */
- skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
- skey[1].sk_argument = ObjectIdGetDatum(relfilenode);
-
- scandesc = systable_beginscan(relation,
- ClassTblspcRelfilenodeIndexId,
- true,
- NULL,
- 2,
- skey);
-
- found = false;
-
- while (HeapTupleIsValid(ntp = systable_getnext(scandesc)))
- {
- Form_pg_class classform = (Form_pg_class) GETSTRUCT(ntp);
-
- if (found)
- elog(ERROR,
- "unexpected duplicate for tablespace %u, relfilenode %u",
- reltablespace, relfilenode);
- found = true;
-
- Assert(classform->reltablespace == reltablespace);
- Assert(classform->relfilenode == relfilenode);
- relid = classform->oid;
- }
-
- systable_endscan(scandesc);
- table_close(relation, AccessShareLock);
-
- /* check for tables that are mapped but not shared */
- if (!found)
- relid = RelationMapFilenodeToOid(relfilenode, false);
- }
-
- /*
- * Only enter entry into cache now, our opening of pg_class could have
- * caused cache invalidations to be executed which would have deleted a
- * new entry if we had entered it above.
- */
- entry = hash_search(RelfilenodeMapHash, (void *) &key, HASH_ENTER, &found);
- if (found)
- elog(ERROR, "corrupted hashtable");
- entry->relid = relid;
-
- return relid;
-}
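(For readers skimming the rename: the lookup-with-negative-caching flow that relfilenumbermap.c carries over unchanged from the deleted file can be sketched as below. This is an illustrative Python reduction, not the real dynahash code; `catalog_lookup` stands in for the pg_class index scan plus the relmapper fallback:)

```python
INVALID_OID = 0  # stands in for InvalidOid

cache: dict[tuple[int, int], int] = {}

def relid_by_relfilenumber(reltablespace: int, relfilenumber: int,
                           catalog_lookup) -> int:
    """Cache both hits and misses: a failed lookup is stored as
    INVALID_OID, so repeated queries for unknown values are also
    answered from the cache ("that's basically free")."""
    key = (reltablespace, relfilenumber)
    if key in cache:
        return cache[key]
    relid = catalog_lookup(reltablespace, relfilenumber)
    if relid is None:
        relid = INVALID_OID   # negative cache entry
    # Enter the entry only after the lookup, as the C code does, since
    # opening pg_class can fire invalidations that would delete it.
    cache[key] = relid
    return relid
```

Note the same invalidation-callback behavior applies in both the old and new files: a relcache flush removes the specific relation's entry and always removes negative entries, since an unknown relfilenumber may have just become known.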
diff --git a/src/backend/utils/cache/relfilenumbermap.c b/src/backend/utils/cache/relfilenumbermap.c
new file mode 100644
index 0000000..c3b9a52
--- /dev/null
+++ b/src/backend/utils/cache/relfilenumbermap.c
@@ -0,0 +1,244 @@
+/*-------------------------------------------------------------------------
+ *
+ * relfilenumbermap.c
+ * relfilenumber to oid mapping cache.
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/backend/utils/cache/relfilenumbermap.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/htup_details.h"
+#include "access/table.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_tablespace.h"
+#include "miscadmin.h"
+#include "utils/builtins.h"
+#include "utils/catcache.h"
+#include "utils/fmgroids.h"
+#include "utils/hsearch.h"
+#include "utils/inval.h"
+#include "utils/rel.h"
+#include "utils/relfilenumbermap.h"
+#include "utils/relmapper.h"
+
+/* Hash table for information about each relfilenumber <-> oid pair */
+static HTAB *RelfilenumberMapHash = NULL;
+
+/* built first time through in InitializeRelfilenumberMap */
+static ScanKeyData relfilenumber_skey[2];
+
+typedef struct
+{
+ Oid reltablespace;
+ Oid relfilenumber;
+} RelfilenumberMapKey;
+
+typedef struct
+{
+ RelfilenumberMapKey key; /* lookup key - must be first */
+ Oid relid; /* pg_class.oid */
+} RelfilenumberMapEntry;
+
+/*
+ * RelfilenumberMapInvalidateCallback
+ * Flush mapping entries when pg_class is updated in a relevant fashion.
+ */
+static void
+RelfilenumberMapInvalidateCallback(Datum arg, Oid relid)
+{
+ HASH_SEQ_STATUS status;
+ RelfilenumberMapEntry *entry;
+
+ /* callback only gets registered after creating the hash */
+ Assert(RelfilenumberMapHash != NULL);
+
+ hash_seq_init(&status, RelfilenumberMapHash);
+ while ((entry = (RelfilenumberMapEntry *) hash_seq_search(&status)) != NULL)
+ {
+ /*
+ * If relid is InvalidOid, signaling a complete reset, we must remove
+ * all entries; otherwise just remove the specific relation's entry.
+ * Always remove negative cache entries.
+ */
+ if (relid == InvalidOid || /* complete reset */
+ entry->relid == InvalidOid || /* negative cache entry */
+ entry->relid == relid) /* individual flushed relation */
+ {
+ if (hash_search(RelfilenumberMapHash,
+ (void *) &entry->key,
+ HASH_REMOVE,
+ NULL) == NULL)
+ elog(ERROR, "hash table corrupted");
+ }
+ }
+}
+
+/*
+ * InitializeRelfilenumberMap
+ * Initialize cache, either on first use or after a reset.
+ */
+static void
+InitializeRelfilenumberMap(void)
+{
+ HASHCTL ctl;
+ int i;
+
+ /* Make sure we've initialized CacheMemoryContext. */
+ if (CacheMemoryContext == NULL)
+ CreateCacheMemoryContext();
+
+ /* build skey */
+ MemSet(&relfilenumber_skey, 0, sizeof(relfilenumber_skey));
+
+ for (i = 0; i < 2; i++)
+ {
+ fmgr_info_cxt(F_OIDEQ,
+ &relfilenumber_skey[i].sk_func,
+ CacheMemoryContext);
+ relfilenumber_skey[i].sk_strategy = BTEqualStrategyNumber;
+ relfilenumber_skey[i].sk_subtype = InvalidOid;
+ relfilenumber_skey[i].sk_collation = InvalidOid;
+ }
+
+ relfilenumber_skey[0].sk_attno = Anum_pg_class_reltablespace;
+ relfilenumber_skey[1].sk_attno = Anum_pg_class_relfilenode;
+
+ /*
+ * Only create the RelfilenumberMapHash now, so we don't end up partially
+ * initialized when fmgr_info_cxt() above ERRORs out with an out of memory
+ * error.
+ */
+ ctl.keysize = sizeof(RelfilenumberMapKey);
+ ctl.entrysize = sizeof(RelfilenumberMapEntry);
+ ctl.hcxt = CacheMemoryContext;
+
+ RelfilenumberMapHash =
+ hash_create("RelfilenumberMap cache", 64, &ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
+ /* Watch for invalidation events. */
+ CacheRegisterRelcacheCallback(RelfilenumberMapInvalidateCallback,
+ (Datum) 0);
+}
+
+/*
+ * Map a relation's (tablespace, relfilenumber) to a relation's oid and cache
+ * the result.
+ *
+ * Returns InvalidOid if no relation matching the criteria could be found.
+ */
+Oid
+RelidByRelfilenumber(Oid reltablespace, Oid relfilenumber)
+{
+ RelfilenumberMapKey key;
+ RelfilenumberMapEntry *entry;
+ bool found;
+ SysScanDesc scandesc;
+ Relation relation;
+ HeapTuple ntp;
+ ScanKeyData skey[2];
+ Oid relid;
+
+ if (RelfilenumberMapHash == NULL)
+ InitializeRelfilenumberMap();
+
+ /* pg_class will show 0 when the value is actually MyDatabaseTableSpace */
+ if (reltablespace == MyDatabaseTableSpace)
+ reltablespace = 0;
+
+ MemSet(&key, 0, sizeof(key));
+ key.reltablespace = reltablespace;
+ key.relfilenumber = relfilenumber;
+
+ /*
+ * Check cache and return entry if one is found. Even if no target
+ * relation can be found later on, we store the negative match and return
+ * InvalidOid from the cache. That's not really necessary for performance
+ * since querying invalid values isn't supposed to be a frequent thing,
+ * but it's basically free.
+ */
+ entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_FIND, &found);
+
+ if (found)
+ return entry->relid;
+
+ /* ok, no previous cache entry, do it the hard way */
+
+ /* initialize empty/negative cache entry before doing the actual lookups */
+ relid = InvalidOid;
+
+ if (reltablespace == GLOBALTABLESPACE_OID)
+ {
+ /*
+ * Ok, shared table, check relmapper.
+ */
+ relid = RelationMapFilenumberToOid(relfilenumber, true);
+ }
+ else
+ {
+ /*
+ * Not a shared table; could either be a plain relation or a
+ * non-shared, nailed one such as pg_class.
+ */
+
+ /* check for plain relations by looking in pg_class */
+ relation = table_open(RelationRelationId, AccessShareLock);
+
+ /* copy scankey to local copy, it will be modified during the scan */
+ memcpy(skey, relfilenumber_skey, sizeof(skey));
+
+ /* set scan arguments */
+ skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
+ skey[1].sk_argument = ObjectIdGetDatum(relfilenumber);
+
+ scandesc = systable_beginscan(relation,
+ ClassTblspcRelfilenodeIndexId,
+ true,
+ NULL,
+ 2,
+ skey);
+
+ found = false;
+
+ while (HeapTupleIsValid(ntp = systable_getnext(scandesc)))
+ {
+ Form_pg_class classform = (Form_pg_class) GETSTRUCT(ntp);
+
+ if (found)
+ elog(ERROR,
+ "unexpected duplicate for tablespace %u, relfilenumber %u",
+ reltablespace, relfilenumber);
+ found = true;
+
+ Assert(classform->reltablespace == reltablespace);
+ Assert(classform->relfilenode == relfilenumber);
+ relid = classform->oid;
+ }
+
+ systable_endscan(scandesc);
+ table_close(relation, AccessShareLock);
+
+ /* check for tables that are mapped but not shared */
+ if (!found)
+ relid = RelationMapFilenumberToOid(relfilenumber, false);
+ }
+
+ /*
+ * Only enter entry into cache now, our opening of pg_class could have
+ * caused cache invalidations to be executed which would have deleted a
+ * new entry if we had entered it above.
+ */
+ entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_ENTER, &found);
+ if (found)
+ elog(ERROR, "corrupted hashtable");
+ entry->relid = relid;
+
+ return relid;
+}
diff --git a/src/backend/utils/cache/relmapper.c b/src/backend/utils/cache/relmapper.c
index 2a330cf..905a57b 100644
--- a/src/backend/utils/cache/relmapper.c
+++ b/src/backend/utils/cache/relmapper.c
@@ -1,7 +1,7 @@
/*-------------------------------------------------------------------------
*
* relmapper.c
- * Catalog-to-filenode mapping
+ * Catalog-to-filenumber mapping
*
* For most tables, the physical file underlying the table is specified by
* pg_class.relfilenode. However, that obviously won't work for pg_class
@@ -11,7 +11,7 @@
* update other databases' pg_class entries when relocating a shared catalog.
* Therefore, for these special catalogs (henceforth referred to as "mapped
* catalogs") we rely on a separately maintained file that shows the mapping
- * from catalog OIDs to filenode numbers. Each database has a map file for
+ * from catalog OIDs to filenumbers. Each database has a map file for
* its local mapped catalogs, and there is a separate map file for shared
* catalogs. Mapped catalogs have zero in their pg_class.relfilenode entries.
*
@@ -79,7 +79,7 @@
typedef struct RelMapping
{
Oid mapoid; /* OID of a catalog */
- Oid mapfilenode; /* its filenode number */
+ Oid mapfilenumber; /* its relfilenumber */
} RelMapping;
typedef struct RelMapFile
@@ -116,7 +116,7 @@ static RelMapFile local_map;
* subtransactions, so one set of transaction-level changes is sufficient.
*
* The active_xxx variables contain updates that are valid in our transaction
- * and should be honored by RelationMapOidToFilenode. The pending_xxx
+ * and should be honored by RelationMapOidToFilenumber. The pending_xxx
* variables contain updates we have been told about that aren't active yet;
* they will become active at the next CommandCounterIncrement. This setup
* lets map updates act similarly to updates of pg_class rows, ie, they
@@ -132,7 +132,7 @@ static RelMapFile pending_local_updates;
/* non-export function prototypes */
-static void apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode,
+static void apply_map_update(RelMapFile *map, Oid relationId, Oid filenumber,
bool add_okay);
static void merge_map_updates(RelMapFile *map, const RelMapFile *updates,
bool add_okay);
@@ -146,9 +146,9 @@ static void perform_relmap_update(bool shared, const RelMapFile *updates);
/*
- * RelationMapOidToFilenode
+ * RelationMapOidToFilenumber
*
- * The raison d' etre ... given a relation OID, look up its filenode.
+ * The raison d' etre ... given a relation OID, look up its filenumber.
*
* Although shared and local relation OIDs should never overlap, the caller
* always knows which we need --- so pass that information to avoid useless
@@ -158,7 +158,7 @@ static void perform_relmap_update(bool shared, const RelMapFile *updates);
* but the caller is in a better position to report a meaningful error).
*/
Oid
-RelationMapOidToFilenode(Oid relationId, bool shared)
+RelationMapOidToFilenumber(Oid relationId, bool shared)
{
const RelMapFile *map;
int32 i;
@@ -170,13 +170,13 @@ RelationMapOidToFilenode(Oid relationId, bool shared)
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return map->mappings[i].mapfilenumber;
}
map = &shared_map;
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return map->mappings[i].mapfilenumber;
}
}
else
@@ -185,13 +185,13 @@ RelationMapOidToFilenode(Oid relationId, bool shared)
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return map->mappings[i].mapfilenumber;
}
map = &local_map;
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return map->mappings[i].mapfilenumber;
}
}
@@ -199,19 +199,19 @@ RelationMapOidToFilenode(Oid relationId, bool shared)
}
/*
- * RelationMapFilenodeToOid
+ * RelationMapFilenumberToOid
*
* Do the reverse of the normal direction of mapping done in
- * RelationMapOidToFilenode.
+ * RelationMapOidToFilenumber.
*
* This is not supposed to be used during normal running but rather for
* information purposes when looking at the filesystem or xlog.
*
* Returns InvalidOid if the OID is not known; this can easily happen if the
- * relfilenode doesn't pertain to a mapped relation.
+ * relfilenumber doesn't pertain to a mapped relation.
*/
Oid
-RelationMapFilenodeToOid(Oid filenode, bool shared)
+RelationMapFilenumberToOid(Oid filenumber, bool shared)
{
const RelMapFile *map;
int32 i;
@@ -222,13 +222,13 @@ RelationMapFilenodeToOid(Oid filenode, bool shared)
map = &active_shared_updates;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenumber == map->mappings[i].mapfilenumber)
return map->mappings[i].mapoid;
}
map = &shared_map;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenumber == map->mappings[i].mapfilenumber)
return map->mappings[i].mapoid;
}
}
@@ -237,13 +237,13 @@ RelationMapFilenodeToOid(Oid filenode, bool shared)
map = &active_local_updates;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenumber == map->mappings[i].mapfilenumber)
return map->mappings[i].mapoid;
}
map = &local_map;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenumber == map->mappings[i].mapfilenumber)
return map->mappings[i].mapoid;
}
}
@@ -252,13 +252,13 @@ RelationMapFilenodeToOid(Oid filenode, bool shared)
}
/*
- * RelationMapOidToFilenodeForDatabase
+ * RelationMapOidToFilenumberForDatabase
*
- * Like RelationMapOidToFilenode, but reads the mapping from the indicated
+ * Like RelationMapOidToFilenumber, but reads the mapping from the indicated
* path instead of using the one for the current database.
*/
Oid
-RelationMapOidToFilenodeForDatabase(char *dbpath, Oid relationId)
+RelationMapOidToFilenumberForDatabase(char *dbpath, Oid relationId)
{
RelMapFile map;
int i;
@@ -270,7 +270,7 @@ RelationMapOidToFilenodeForDatabase(char *dbpath, Oid relationId)
for (i = 0; i < map.num_mappings; i++)
{
if (relationId == map.mappings[i].mapoid)
- return map.mappings[i].mapfilenode;
+ return map.mappings[i].mapfilenumber;
}
return InvalidOid;
@@ -311,13 +311,13 @@ RelationMapCopy(Oid dbid, Oid tsid, char *srcdbpath, char *dstdbpath)
/*
* RelationMapUpdateMap
*
- * Install a new relfilenode mapping for the specified relation.
+ * Install a new relfilenumber mapping for the specified relation.
*
* If immediate is true (or we're bootstrapping), the mapping is activated
* immediately. Otherwise it is made pending until CommandCounterIncrement.
*/
void
-RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
+RelationMapUpdateMap(Oid relationId, Oid fileNumber, bool shared,
bool immediate)
{
RelMapFile *map;
@@ -362,7 +362,7 @@ RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
map = &pending_local_updates;
}
}
- apply_map_update(map, relationId, fileNode, true);
+ apply_map_update(map, relationId, fileNumber, true);
}
/*
@@ -375,7 +375,7 @@ RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
* add_okay = false to draw an error if not.
*/
static void
-apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode, bool add_okay)
+apply_map_update(RelMapFile *map, Oid relationId, Oid fileNumber, bool add_okay)
{
int32 i;
@@ -384,7 +384,7 @@ apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode, bool add_okay)
{
if (relationId == map->mappings[i].mapoid)
{
- map->mappings[i].mapfilenode = fileNode;
+ map->mappings[i].mapfilenumber = fileNumber;
return;
}
}
@@ -396,7 +396,7 @@ apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode, bool add_okay)
if (map->num_mappings >= MAX_MAPPINGS)
elog(ERROR, "ran out of space in relation map");
map->mappings[map->num_mappings].mapoid = relationId;
- map->mappings[map->num_mappings].mapfilenode = fileNode;
+ map->mappings[map->num_mappings].mapfilenumber = fileNumber;
map->num_mappings++;
}
@@ -415,7 +415,7 @@ merge_map_updates(RelMapFile *map, const RelMapFile *updates, bool add_okay)
{
apply_map_update(map,
updates->mappings[i].mapoid,
- updates->mappings[i].mapfilenode,
+ updates->mappings[i].mapfilenumber,
add_okay);
}
}
@@ -983,12 +983,12 @@ write_relmap_file(RelMapFile *newmap, bool write_wal, bool send_sinval,
for (i = 0; i < newmap->num_mappings; i++)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
- rnode.spcNode = tsid;
- rnode.dbNode = dbid;
- rnode.relNode = newmap->mappings[i].mapfilenode;
- RelationPreserveStorage(rnode, false);
+ rlocator.spcOid = tsid;
+ rlocator.dbOid = dbid;
+ rlocator.relNumber = newmap->mappings[i].mapfilenumber;
+ RelationPreserveStorage(rlocator, false);
}
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 7cc9c72..89b67a3 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4806,15 +4806,15 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
{
PQExpBuffer upgrade_query = createPQExpBuffer();
PGresult *upgrade_res;
- Oid relfilenode;
+ Oid relfilenumber;
Oid toast_oid;
- Oid toast_relfilenode;
+ Oid toast_relfilenumber;
char relkind;
Oid toast_index_oid;
- Oid toast_index_relfilenode;
+ Oid toast_index_relfilenumber;
/*
- * Preserve the OID and relfilenode of the table, table's index, table's
+ * Preserve the OID and relfilenumber of the table, table's index, table's
* toast table and toast table's index if any.
*
* One complexity is that the current table definition might not require
@@ -4837,15 +4837,15 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
relkind = *PQgetvalue(upgrade_res, 0, PQfnumber(upgrade_res, "relkind"));
- relfilenode = atooid(PQgetvalue(upgrade_res, 0,
+ relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "relfilenode")));
toast_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "reltoastrelid")));
- toast_relfilenode = atooid(PQgetvalue(upgrade_res, 0,
+ toast_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "toast_relfilenode")));
toast_index_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "indexrelid")));
- toast_index_relfilenode = atooid(PQgetvalue(upgrade_res, 0,
+ toast_index_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "toast_index_relfilenode")));
appendPQExpBufferStr(upgrade_buffer,
@@ -4859,13 +4859,13 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
/*
* Not every relation has storage. Also, in a pre-v12 database,
- * partitioned tables have a relfilenode, which should not be
+ * partitioned tables have a relfilenumber, which should not be
* preserved when upgrading.
*/
- if (OidIsValid(relfilenode) && relkind != RELKIND_PARTITIONED_TABLE)
+ if (OidIsValid(relfilenumber) && relkind != RELKIND_PARTITIONED_TABLE)
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
- relfilenode);
+ relfilenumber);
/*
* In a pre-v12 database, partitioned tables might be marked as having
@@ -4879,7 +4879,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
toast_oid);
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('%u'::pg_catalog.oid);\n",
- toast_relfilenode);
+ toast_relfilenumber);
/* every toast table has an index */
appendPQExpBuffer(upgrade_buffer,
@@ -4887,20 +4887,20 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
toast_index_oid);
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
- toast_index_relfilenode);
+ toast_index_relfilenumber);
}
PQclear(upgrade_res);
}
else
{
- /* Preserve the OID and relfilenode of the index */
+ /* Preserve the OID and relfilenumber of the index */
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
pg_class_oid);
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
- relfilenode);
+ relfilenumber);
}
appendPQExpBufferChar(upgrade_buffer, '\n');
diff --git a/src/bin/pg_rewind/datapagemap.h b/src/bin/pg_rewind/datapagemap.h
index ae4965f..235b676 100644
--- a/src/bin/pg_rewind/datapagemap.h
+++ b/src/bin/pg_rewind/datapagemap.h
@@ -10,7 +10,7 @@
#define DATAPAGEMAP_H
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
struct datapagemap
{
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 6252931..f832a94 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -56,7 +56,7 @@ static uint32 hash_string_pointer(const char *s);
static filehash_hash *filehash;
static bool isRelDataFile(const char *path);
-static char *datasegpath(RelFileNode rnode, ForkNumber forknum,
+static char *datasegpath(RelFileLocator rlocator, ForkNumber forknum,
BlockNumber segno);
static file_entry_t *insert_filehash_entry(const char *path);
@@ -288,7 +288,7 @@ process_target_file(const char *path, file_type_t type, size_t size,
* hash table!
*/
void
-process_target_wal_block_change(ForkNumber forknum, RelFileNode rnode,
+process_target_wal_block_change(ForkNumber forknum, RelFileLocator rlocator,
BlockNumber blkno)
{
char *path;
@@ -299,7 +299,7 @@ process_target_wal_block_change(ForkNumber forknum, RelFileNode rnode,
segno = blkno / RELSEG_SIZE;
blkno_inseg = blkno % RELSEG_SIZE;
- path = datasegpath(rnode, forknum, segno);
+ path = datasegpath(rlocator, forknum, segno);
entry = lookup_filehash_entry(path);
pfree(path);
@@ -508,7 +508,7 @@ print_filemap(filemap_t *filemap)
static bool
isRelDataFile(const char *path)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
unsigned int segNo;
int nmatch;
bool matched;
@@ -532,32 +532,32 @@ isRelDataFile(const char *path)
*
*----
*/
- rnode.spcNode = InvalidOid;
- rnode.dbNode = InvalidOid;
- rnode.relNode = InvalidOid;
+ rlocator.spcOid = InvalidOid;
+ rlocator.dbOid = InvalidOid;
+ rlocator.relNumber = InvalidOid;
segNo = 0;
matched = false;
- nmatch = sscanf(path, "global/%u.%u", &rnode.relNode, &segNo);
+ nmatch = sscanf(path, "global/%u.%u", &rlocator.relNumber, &segNo);
if (nmatch == 1 || nmatch == 2)
{
- rnode.spcNode = GLOBALTABLESPACE_OID;
- rnode.dbNode = 0;
+ rlocator.spcOid = GLOBALTABLESPACE_OID;
+ rlocator.dbOid = 0;
matched = true;
}
else
{
nmatch = sscanf(path, "base/%u/%u.%u",
- &rnode.dbNode, &rnode.relNode, &segNo);
+ &rlocator.dbOid, &rlocator.relNumber, &segNo);
if (nmatch == 2 || nmatch == 3)
{
- rnode.spcNode = DEFAULTTABLESPACE_OID;
+ rlocator.spcOid = DEFAULTTABLESPACE_OID;
matched = true;
}
else
{
nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/%u.%u",
- &rnode.spcNode, &rnode.dbNode, &rnode.relNode,
+ &rlocator.spcOid, &rlocator.dbOid, &rlocator.relNumber,
&segNo);
if (nmatch == 3 || nmatch == 4)
matched = true;
@@ -567,12 +567,12 @@ isRelDataFile(const char *path)
/*
* The sscanf tests above can match files that have extra characters at
* the end. To eliminate such cases, cross-check that GetRelationPath
- * creates the exact same filename, when passed the RelFileNode
+ * creates the exact same filename, when passed the RelFileLocator
* information we extracted from the filename.
*/
if (matched)
{
- char *check_path = datasegpath(rnode, MAIN_FORKNUM, segNo);
+ char *check_path = datasegpath(rlocator, MAIN_FORKNUM, segNo);
if (strcmp(check_path, path) != 0)
matched = false;
@@ -589,12 +589,12 @@ isRelDataFile(const char *path)
* The returned path is palloc'd
*/
static char *
-datasegpath(RelFileNode rnode, ForkNumber forknum, BlockNumber segno)
+datasegpath(RelFileLocator rlocator, ForkNumber forknum, BlockNumber segno)
{
char *path;
char *segpath;
- path = relpathperm(rnode, forknum);
+ path = relpathperm(rlocator, forknum);
if (segno > 0)
{
segpath = psprintf("%s.%u", path, segno);
diff --git a/src/bin/pg_rewind/filemap.h b/src/bin/pg_rewind/filemap.h
index 096f57a..0e011fb 100644
--- a/src/bin/pg_rewind/filemap.h
+++ b/src/bin/pg_rewind/filemap.h
@@ -10,7 +10,7 @@
#include "datapagemap.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/* these enum values are sorted in the order we want actions to be processed */
typedef enum
@@ -103,7 +103,7 @@ extern void process_source_file(const char *path, file_type_t type,
extern void process_target_file(const char *path, file_type_t type,
size_t size, const char *link_target);
extern void process_target_wal_block_change(ForkNumber forknum,
- RelFileNode rnode,
+ RelFileLocator rlocator,
BlockNumber blkno);
extern filemap_t *decide_file_actions(void);
diff --git a/src/bin/pg_rewind/parsexlog.c b/src/bin/pg_rewind/parsexlog.c
index c6792da..d97240e 100644
--- a/src/bin/pg_rewind/parsexlog.c
+++ b/src/bin/pg_rewind/parsexlog.c
@@ -445,18 +445,18 @@ extractPageInfo(XLogReaderState *record)
for (block_id = 0; block_id <= XLogRecMaxBlockId(record); block_id++)
{
- RelFileNode rnode;
- ForkNumber forknum;
- BlockNumber blkno;
+ RelFileLocator rlocator;
+ ForkNumber forknum;
+ BlockNumber blkno;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blkno, NULL))
+ &rlocator, &forknum, &blkno, NULL))
continue;
/* We only care about the main fork; others are copied in toto */
if (forknum != MAIN_FORKNUM)
continue;
- process_target_wal_block_change(forknum, rnode, blkno);
+ process_target_wal_block_change(forknum, rlocator, blkno);
}
}
diff --git a/src/bin/pg_rewind/pg_rewind.h b/src/bin/pg_rewind/pg_rewind.h
index 393182f..8b4b50a 100644
--- a/src/bin/pg_rewind/pg_rewind.h
+++ b/src/bin/pg_rewind/pg_rewind.h
@@ -16,7 +16,7 @@
#include "datapagemap.h"
#include "libpq-fe.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/* Configuration options */
extern char *datadir_target;
diff --git a/src/bin/pg_upgrade/Makefile b/src/bin/pg_upgrade/Makefile
index 587793e..7f8042f 100644
--- a/src/bin/pg_upgrade/Makefile
+++ b/src/bin/pg_upgrade/Makefile
@@ -19,7 +19,7 @@ OBJS = \
option.o \
parallel.o \
pg_upgrade.o \
- relfilenode.o \
+ relfilenumber.o \
server.o \
tablespace.o \
util.o \
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 5c3968e..b45a32c 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -190,9 +190,9 @@ create_rel_filename_map(const char *old_data, const char *new_data,
map->new_tablespace_suffix = new_cluster.tablespace_suffix;
}
- /* DB oid and relfilenodes are preserved between old and new cluster */
+ /* DB oid and relfilenumbers are preserved between old and new cluster */
map->db_oid = old_db->db_oid;
- map->relfilenode = old_rel->relfilenode;
+ map->relfilenumber = old_rel->relfilenumber;
/* used only for logging and error reporting, old/new are identical */
map->nspname = old_rel->nspname;
@@ -399,7 +399,7 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
i_reloid,
i_indtable,
i_toastheap,
- i_relfilenode,
+ i_relfilenumber,
i_reltablespace;
char query[QUERY_ALLOC];
char *last_namespace = NULL,
@@ -495,7 +495,7 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
i_toastheap = PQfnumber(res, "toastheap");
i_nspname = PQfnumber(res, "nspname");
i_relname = PQfnumber(res, "relname");
- i_relfilenode = PQfnumber(res, "relfilenode");
+ i_relfilenumber = PQfnumber(res, "relfilenode");
i_reltablespace = PQfnumber(res, "reltablespace");
i_spclocation = PQfnumber(res, "spclocation");
@@ -527,7 +527,7 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
relname = PQgetvalue(res, relnum, i_relname);
curr->relname = pg_strdup(relname);
- curr->relfilenode = atooid(PQgetvalue(res, relnum, i_relfilenode));
+ curr->relfilenumber = atooid(PQgetvalue(res, relnum, i_relfilenumber));
curr->tblsp_alloc = false;
/* Is the tablespace oid non-default? */
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 55de244..85a9eeb 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -135,7 +135,7 @@ typedef struct
char *nspname; /* namespace name */
char *relname; /* relation name */
Oid reloid; /* relation OID */
- Oid relfilenode; /* relation file node */
+ Oid relfilenumber; /* relation file number */
Oid indtable; /* if index, OID of its table, else 0 */
Oid toastheap; /* if toast table, OID of base table, else 0 */
char *tablespace; /* tablespace path; "" for cluster default */
@@ -159,7 +159,7 @@ typedef struct
const char *old_tablespace_suffix;
const char *new_tablespace_suffix;
Oid db_oid;
- Oid relfilenode;
+ Oid relfilenumber;
/* the rest are used only for logging and error reporting */
char *nspname; /* namespaces */
char *relname;
@@ -400,7 +400,7 @@ void parseCommandLine(int argc, char *argv[]);
void adjust_data_dir(ClusterInfo *cluster);
void get_sock_dir(ClusterInfo *cluster, bool live_check);
-/* relfilenode.c */
+/* relfilenumber.c */
void transfer_all_new_tablespaces(DbInfoArr *old_db_arr,
DbInfoArr *new_db_arr, char *old_pgdata, char *new_pgdata);
diff --git a/src/bin/pg_upgrade/relfilenode.c b/src/bin/pg_upgrade/relfilenode.c
deleted file mode 100644
index d23ac88..0000000
--- a/src/bin/pg_upgrade/relfilenode.c
+++ /dev/null
@@ -1,259 +0,0 @@
-/*
- * relfilenode.c
- *
- * relfilenode functions
- *
- * Copyright (c) 2010-2022, PostgreSQL Global Development Group
- * src/bin/pg_upgrade/relfilenode.c
- */
-
-#include "postgres_fe.h"
-
-#include <sys/stat.h>
-
-#include "access/transam.h"
-#include "catalog/pg_class_d.h"
-#include "pg_upgrade.h"
-
-static void transfer_single_new_db(FileNameMap *maps, int size, char *old_tablespace);
-static void transfer_relfile(FileNameMap *map, const char *suffix, bool vm_must_add_frozenbit);
-
-
-/*
- * transfer_all_new_tablespaces()
- *
- * Responsible for upgrading all database. invokes routines to generate mappings and then
- * physically link the databases.
- */
-void
-transfer_all_new_tablespaces(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
- char *old_pgdata, char *new_pgdata)
-{
- switch (user_opts.transfer_mode)
- {
- case TRANSFER_MODE_CLONE:
- prep_status_progress("Cloning user relation files");
- break;
- case TRANSFER_MODE_COPY:
- prep_status_progress("Copying user relation files");
- break;
- case TRANSFER_MODE_LINK:
- prep_status_progress("Linking user relation files");
- break;
- }
-
- /*
- * Transferring files by tablespace is tricky because a single database
- * can use multiple tablespaces. For non-parallel mode, we just pass a
- * NULL tablespace path, which matches all tablespaces. In parallel mode,
- * we pass the default tablespace and all user-created tablespaces and let
- * those operations happen in parallel.
- */
- if (user_opts.jobs <= 1)
- parallel_transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata,
- new_pgdata, NULL);
- else
- {
- int tblnum;
-
- /* transfer default tablespace */
- parallel_transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata,
- new_pgdata, old_pgdata);
-
- for (tblnum = 0; tblnum < os_info.num_old_tablespaces; tblnum++)
- parallel_transfer_all_new_dbs(old_db_arr,
- new_db_arr,
- old_pgdata,
- new_pgdata,
- os_info.old_tablespaces[tblnum]);
- /* reap all children */
- while (reap_child(true) == true)
- ;
- }
-
- end_progress_output();
- check_ok();
-}
-
-
-/*
- * transfer_all_new_dbs()
- *
- * Responsible for upgrading all database. invokes routines to generate mappings and then
- * physically link the databases.
- */
-void
-transfer_all_new_dbs(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
- char *old_pgdata, char *new_pgdata, char *old_tablespace)
-{
- int old_dbnum,
- new_dbnum;
-
- /* Scan the old cluster databases and transfer their files */
- for (old_dbnum = new_dbnum = 0;
- old_dbnum < old_db_arr->ndbs;
- old_dbnum++, new_dbnum++)
- {
- DbInfo *old_db = &old_db_arr->dbs[old_dbnum],
- *new_db = NULL;
- FileNameMap *mappings;
- int n_maps;
-
- /*
- * Advance past any databases that exist in the new cluster but not in
- * the old, e.g. "postgres". (The user might have removed the
- * 'postgres' database from the old cluster.)
- */
- for (; new_dbnum < new_db_arr->ndbs; new_dbnum++)
- {
- new_db = &new_db_arr->dbs[new_dbnum];
- if (strcmp(old_db->db_name, new_db->db_name) == 0)
- break;
- }
-
- if (new_dbnum >= new_db_arr->ndbs)
- pg_fatal("old database \"%s\" not found in the new cluster\n",
- old_db->db_name);
-
- mappings = gen_db_file_maps(old_db, new_db, &n_maps, old_pgdata,
- new_pgdata);
- if (n_maps)
- {
- transfer_single_new_db(mappings, n_maps, old_tablespace);
- }
- /* We allocate something even for n_maps == 0 */
- pg_free(mappings);
- }
-}
-
-/*
- * transfer_single_new_db()
- *
- * create links for mappings stored in "maps" array.
- */
-static void
-transfer_single_new_db(FileNameMap *maps, int size, char *old_tablespace)
-{
- int mapnum;
- bool vm_must_add_frozenbit = false;
-
- /*
- * Do we need to rewrite visibilitymap?
- */
- if (old_cluster.controldata.cat_ver < VISIBILITY_MAP_FROZEN_BIT_CAT_VER &&
- new_cluster.controldata.cat_ver >= VISIBILITY_MAP_FROZEN_BIT_CAT_VER)
- vm_must_add_frozenbit = true;
-
- for (mapnum = 0; mapnum < size; mapnum++)
- {
- if (old_tablespace == NULL ||
- strcmp(maps[mapnum].old_tablespace, old_tablespace) == 0)
- {
- /* transfer primary file */
- transfer_relfile(&maps[mapnum], "", vm_must_add_frozenbit);
-
- /*
- * Copy/link any fsm and vm files, if they exist
- */
- transfer_relfile(&maps[mapnum], "_fsm", vm_must_add_frozenbit);
- transfer_relfile(&maps[mapnum], "_vm", vm_must_add_frozenbit);
- }
- }
-}
-
-
-/*
- * transfer_relfile()
- *
- * Copy or link file from old cluster to new one. If vm_must_add_frozenbit
- * is true, visibility map forks are converted and rewritten, even in link
- * mode.
- */
-static void
-transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_frozenbit)
-{
- char old_file[MAXPGPATH];
- char new_file[MAXPGPATH];
- int segno;
- char extent_suffix[65];
- struct stat statbuf;
-
- /*
- * Now copy/link any related segments as well. Remember, PG breaks large
- * files into 1GB segments, the first segment has no extension, subsequent
- * segments are named relfilenode.1, relfilenode.2, relfilenode.3.
- */
- for (segno = 0;; segno++)
- {
- if (segno == 0)
- extent_suffix[0] = '\0';
- else
- snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
-
- snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
- map->old_tablespace,
- map->old_tablespace_suffix,
- map->db_oid,
- map->relfilenode,
- type_suffix,
- extent_suffix);
- snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
- map->new_tablespace,
- map->new_tablespace_suffix,
- map->db_oid,
- map->relfilenode,
- type_suffix,
- extent_suffix);
-
- /* Is it an extent, fsm, or vm file? */
- if (type_suffix[0] != '\0' || segno != 0)
- {
- /* Did file open fail? */
- if (stat(old_file, &statbuf) != 0)
- {
- /* File does not exist? That's OK, just return */
- if (errno == ENOENT)
- return;
- else
- pg_fatal("error while checking for file existence \"%s.%s\" (\"%s\" to \"%s\"): %s\n",
- map->nspname, map->relname, old_file, new_file,
- strerror(errno));
- }
-
- /* If file is empty, just return */
- if (statbuf.st_size == 0)
- return;
- }
-
- unlink(new_file);
-
- /* Copying files might take some time, so give feedback. */
- pg_log(PG_STATUS, "%s", old_file);
-
- if (vm_must_add_frozenbit && strcmp(type_suffix, "_vm") == 0)
- {
- /* Need to rewrite visibility map format */
- pg_log(PG_VERBOSE, "rewriting \"%s\" to \"%s\"\n",
- old_file, new_file);
- rewriteVisibilityMap(old_file, new_file, map->nspname, map->relname);
- }
- else
- switch (user_opts.transfer_mode)
- {
- case TRANSFER_MODE_CLONE:
- pg_log(PG_VERBOSE, "cloning \"%s\" to \"%s\"\n",
- old_file, new_file);
- cloneFile(old_file, new_file, map->nspname, map->relname);
- break;
- case TRANSFER_MODE_COPY:
- pg_log(PG_VERBOSE, "copying \"%s\" to \"%s\"\n",
- old_file, new_file);
- copyFile(old_file, new_file, map->nspname, map->relname);
- break;
- case TRANSFER_MODE_LINK:
- pg_log(PG_VERBOSE, "linking \"%s\" to \"%s\"\n",
- old_file, new_file);
- linkFile(old_file, new_file, map->nspname, map->relname);
- }
- }
-}
diff --git a/src/bin/pg_upgrade/relfilenumber.c b/src/bin/pg_upgrade/relfilenumber.c
new file mode 100644
index 0000000..b3ad820
--- /dev/null
+++ b/src/bin/pg_upgrade/relfilenumber.c
@@ -0,0 +1,259 @@
+/*
+ * relfilenumber.c
+ *
+ * relfilenumber functions
+ *
+ * Copyright (c) 2010-2022, PostgreSQL Global Development Group
+ * src/bin/pg_upgrade/relfilenumber.c
+ */
+
+#include "postgres_fe.h"
+
+#include <sys/stat.h>
+
+#include "access/transam.h"
+#include "catalog/pg_class_d.h"
+#include "pg_upgrade.h"
+
+static void transfer_single_new_db(FileNameMap *maps, int size, char *old_tablespace);
+static void transfer_relfile(FileNameMap *map, const char *suffix, bool vm_must_add_frozenbit);
+
+
+/*
+ * transfer_all_new_tablespaces()
+ *
+ * Responsible for upgrading all databases. Invokes routines to generate
+ * mappings and then physically link the databases.
+ */
+void
+transfer_all_new_tablespaces(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
+ char *old_pgdata, char *new_pgdata)
+{
+ switch (user_opts.transfer_mode)
+ {
+ case TRANSFER_MODE_CLONE:
+ prep_status_progress("Cloning user relation files");
+ break;
+ case TRANSFER_MODE_COPY:
+ prep_status_progress("Copying user relation files");
+ break;
+ case TRANSFER_MODE_LINK:
+ prep_status_progress("Linking user relation files");
+ break;
+ }
+
+ /*
+ * Transferring files by tablespace is tricky because a single database
+ * can use multiple tablespaces. For non-parallel mode, we just pass a
+ * NULL tablespace path, which matches all tablespaces. In parallel mode,
+ * we pass the default tablespace and all user-created tablespaces and let
+ * those operations happen in parallel.
+ */
+ if (user_opts.jobs <= 1)
+ parallel_transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata,
+ new_pgdata, NULL);
+ else
+ {
+ int tblnum;
+
+ /* transfer default tablespace */
+ parallel_transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata,
+ new_pgdata, old_pgdata);
+
+ for (tblnum = 0; tblnum < os_info.num_old_tablespaces; tblnum++)
+ parallel_transfer_all_new_dbs(old_db_arr,
+ new_db_arr,
+ old_pgdata,
+ new_pgdata,
+ os_info.old_tablespaces[tblnum]);
+ /* reap all children */
+ while (reap_child(true) == true)
+ ;
+ }
+
+ end_progress_output();
+ check_ok();
+}
+
+
+/*
+ * transfer_all_new_dbs()
+ *
+ * Responsible for upgrading all databases. Invokes routines to generate mappings and then
+ * physically link the databases.
+ */
+void
+transfer_all_new_dbs(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
+ char *old_pgdata, char *new_pgdata, char *old_tablespace)
+{
+ int old_dbnum,
+ new_dbnum;
+
+ /* Scan the old cluster databases and transfer their files */
+ for (old_dbnum = new_dbnum = 0;
+ old_dbnum < old_db_arr->ndbs;
+ old_dbnum++, new_dbnum++)
+ {
+ DbInfo *old_db = &old_db_arr->dbs[old_dbnum],
+ *new_db = NULL;
+ FileNameMap *mappings;
+ int n_maps;
+
+ /*
+ * Advance past any databases that exist in the new cluster but not in
+ * the old, e.g. "postgres". (The user might have removed the
+ * 'postgres' database from the old cluster.)
+ */
+ for (; new_dbnum < new_db_arr->ndbs; new_dbnum++)
+ {
+ new_db = &new_db_arr->dbs[new_dbnum];
+ if (strcmp(old_db->db_name, new_db->db_name) == 0)
+ break;
+ }
+
+ if (new_dbnum >= new_db_arr->ndbs)
+ pg_fatal("old database \"%s\" not found in the new cluster\n",
+ old_db->db_name);
+
+ mappings = gen_db_file_maps(old_db, new_db, &n_maps, old_pgdata,
+ new_pgdata);
+ if (n_maps)
+ {
+ transfer_single_new_db(mappings, n_maps, old_tablespace);
+ }
+ /* We allocate something even for n_maps == 0 */
+ pg_free(mappings);
+ }
+}
+
+/*
+ * transfer_single_new_db()
+ *
+ * create links for mappings stored in "maps" array.
+ */
+static void
+transfer_single_new_db(FileNameMap *maps, int size, char *old_tablespace)
+{
+ int mapnum;
+ bool vm_must_add_frozenbit = false;
+
+ /*
+ * Do we need to rewrite the visibility map?
+ */
+ if (old_cluster.controldata.cat_ver < VISIBILITY_MAP_FROZEN_BIT_CAT_VER &&
+ new_cluster.controldata.cat_ver >= VISIBILITY_MAP_FROZEN_BIT_CAT_VER)
+ vm_must_add_frozenbit = true;
+
+ for (mapnum = 0; mapnum < size; mapnum++)
+ {
+ if (old_tablespace == NULL ||
+ strcmp(maps[mapnum].old_tablespace, old_tablespace) == 0)
+ {
+ /* transfer primary file */
+ transfer_relfile(&maps[mapnum], "", vm_must_add_frozenbit);
+
+ /*
+ * Copy/link any fsm and vm files, if they exist
+ */
+ transfer_relfile(&maps[mapnum], "_fsm", vm_must_add_frozenbit);
+ transfer_relfile(&maps[mapnum], "_vm", vm_must_add_frozenbit);
+ }
+ }
+}
+
+
+/*
+ * transfer_relfile()
+ *
+ * Copy or link file from old cluster to new one. If vm_must_add_frozenbit
+ * is true, visibility map forks are converted and rewritten, even in link
+ * mode.
+ */
+static void
+transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_frozenbit)
+{
+ char old_file[MAXPGPATH];
+ char new_file[MAXPGPATH];
+ int segno;
+ char extent_suffix[65];
+ struct stat statbuf;
+
+ /*
+ * Now copy/link any related segments as well. Remember, PG breaks large
+ * files into 1GB segments, the first segment has no extension, subsequent
+ * segments are named relfilenumber.1, relfilenumber.2, relfilenumber.3.
+ */
+ for (segno = 0;; segno++)
+ {
+ if (segno == 0)
+ extent_suffix[0] = '\0';
+ else
+ snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
+
+ snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
+ map->old_tablespace,
+ map->old_tablespace_suffix,
+ map->db_oid,
+ map->relfilenumber,
+ type_suffix,
+ extent_suffix);
+ snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
+ map->new_tablespace,
+ map->new_tablespace_suffix,
+ map->db_oid,
+ map->relfilenumber,
+ type_suffix,
+ extent_suffix);
+
+ /* Is it an extent, fsm, or vm file? */
+ if (type_suffix[0] != '\0' || segno != 0)
+ {
+ /* Did stat() fail? */
+ if (stat(old_file, &statbuf) != 0)
+ {
+ /* File does not exist? That's OK, just return */
+ if (errno == ENOENT)
+ return;
+ else
+ pg_fatal("error while checking for file existence \"%s.%s\" (\"%s\" to \"%s\"): %s\n",
+ map->nspname, map->relname, old_file, new_file,
+ strerror(errno));
+ }
+
+ /* If file is empty, just return */
+ if (statbuf.st_size == 0)
+ return;
+ }
+
+ unlink(new_file);
+
+ /* Copying files might take some time, so give feedback. */
+ pg_log(PG_STATUS, "%s", old_file);
+
+ if (vm_must_add_frozenbit && strcmp(type_suffix, "_vm") == 0)
+ {
+ /* Need to rewrite visibility map format */
+ pg_log(PG_VERBOSE, "rewriting \"%s\" to \"%s\"\n",
+ old_file, new_file);
+ rewriteVisibilityMap(old_file, new_file, map->nspname, map->relname);
+ }
+ else
+ switch (user_opts.transfer_mode)
+ {
+ case TRANSFER_MODE_CLONE:
+ pg_log(PG_VERBOSE, "cloning \"%s\" to \"%s\"\n",
+ old_file, new_file);
+ cloneFile(old_file, new_file, map->nspname, map->relname);
+ break;
+ case TRANSFER_MODE_COPY:
+ pg_log(PG_VERBOSE, "copying \"%s\" to \"%s\"\n",
+ old_file, new_file);
+ copyFile(old_file, new_file, map->nspname, map->relname);
+ break;
+ case TRANSFER_MODE_LINK:
+ pg_log(PG_VERBOSE, "linking \"%s\" to \"%s\"\n",
+ old_file, new_file);
+ linkFile(old_file, new_file, map->nspname, map->relname);
+ }
+ }
+}
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 5dc6010..0fdde9d 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -37,7 +37,7 @@ static const char *progname;
static int WalSegSz;
static volatile sig_atomic_t time_to_stop = false;
-static const RelFileNode emptyRelFileNode = {0, 0, 0};
+static const RelFileLocator emptyRelFileLocator = {0, 0, 0};
typedef struct XLogDumpPrivate
{
@@ -63,7 +63,7 @@ typedef struct XLogDumpConfig
bool filter_by_rmgr_enabled;
TransactionId filter_by_xid;
bool filter_by_xid_enabled;
- RelFileNode filter_by_relation;
+ RelFileLocator filter_by_relation;
bool filter_by_extended;
bool filter_by_relation_enabled;
BlockNumber filter_by_relation_block;
@@ -393,7 +393,7 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
*/
static bool
XLogRecordMatchesRelationBlock(XLogReaderState *record,
- RelFileNode matchRnode,
+ RelFileLocator matchRlocator,
BlockNumber matchBlock,
ForkNumber matchFork)
{
@@ -401,17 +401,17 @@ XLogRecordMatchesRelationBlock(XLogReaderState *record,
for (block_id = 0; block_id <= XLogRecMaxBlockId(record); block_id++)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blk;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blk, NULL))
+ &rlocator, &forknum, &blk, NULL))
continue;
if ((matchFork == InvalidForkNumber || matchFork == forknum) &&
- (RelFileNodeEquals(matchRnode, emptyRelFileNode) ||
- RelFileNodeEquals(matchRnode, rnode)) &&
+ (RelFileLocatorEquals(matchRlocator, emptyRelFileLocator) ||
+ RelFileLocatorEquals(matchRlocator, rlocator)) &&
(matchBlock == InvalidBlockNumber || matchBlock == blk))
return true;
}
@@ -885,11 +885,11 @@ main(int argc, char **argv)
break;
case 'R':
if (sscanf(optarg, "%u/%u/%u",
- &config.filter_by_relation.spcNode,
- &config.filter_by_relation.dbNode,
- &config.filter_by_relation.relNode) != 3 ||
- !OidIsValid(config.filter_by_relation.spcNode) ||
- !OidIsValid(config.filter_by_relation.relNode))
+ &config.filter_by_relation.spcOid,
+ &config.filter_by_relation.dbOid,
+ &config.filter_by_relation.relNumber) != 3 ||
+ !OidIsValid(config.filter_by_relation.spcOid) ||
+ !OidIsValid(config.filter_by_relation.relNumber))
{
pg_log_error("invalid relation specification: \"%s\"", optarg);
pg_log_error_detail("Expecting \"tablespace OID/database OID/relation filenode\".");
@@ -1132,7 +1132,7 @@ main(int argc, char **argv)
!XLogRecordMatchesRelationBlock(xlogreader_state,
config.filter_by_relation_enabled ?
config.filter_by_relation :
- emptyRelFileNode,
+ emptyRelFileLocator,
config.filter_by_relation_block_enabled ?
config.filter_by_relation_block :
InvalidBlockNumber,
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 636c96e..78450e4 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -107,24 +107,24 @@ forkname_chars(const char *str, ForkNumber *fork)
* XXX this must agree with GetRelationPath()!
*/
char *
-GetDatabasePath(Oid dbNode, Oid spcNode)
+GetDatabasePath(Oid dbOid, Oid spcOid)
{
- if (spcNode == GLOBALTABLESPACE_OID)
+ if (spcOid == GLOBALTABLESPACE_OID)
{
/* Shared system relations live in {datadir}/global */
- Assert(dbNode == 0);
+ Assert(dbOid == 0);
return pstrdup("global");
}
- else if (spcNode == DEFAULTTABLESPACE_OID)
+ else if (spcOid == DEFAULTTABLESPACE_OID)
{
/* The default tablespace is {datadir}/base */
- return psprintf("base/%u", dbNode);
+ return psprintf("base/%u", dbOid);
}
else
{
/* All other tablespaces are accessed via symlinks */
return psprintf("pg_tblspc/%u/%s/%u",
- spcNode, TABLESPACE_VERSION_DIRECTORY, dbNode);
+ spcOid, TABLESPACE_VERSION_DIRECTORY, dbOid);
}
}
@@ -138,44 +138,44 @@ GetDatabasePath(Oid dbNode, Oid spcNode)
* the trouble considering BackendId is just int anyway.
*/
char *
-GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
+GetRelationPath(Oid dbOid, Oid spcOid, Oid relNumber,
int backendId, ForkNumber forkNumber)
{
char *path;
- if (spcNode == GLOBALTABLESPACE_OID)
+ if (spcOid == GLOBALTABLESPACE_OID)
{
/* Shared system relations live in {datadir}/global */
- Assert(dbNode == 0);
+ Assert(dbOid == 0);
Assert(backendId == InvalidBackendId);
if (forkNumber != MAIN_FORKNUM)
path = psprintf("global/%u_%s",
- relNode, forkNames[forkNumber]);
+ relNumber, forkNames[forkNumber]);
else
- path = psprintf("global/%u", relNode);
+ path = psprintf("global/%u", relNumber);
}
- else if (spcNode == DEFAULTTABLESPACE_OID)
+ else if (spcOid == DEFAULTTABLESPACE_OID)
{
/* The default tablespace is {datadir}/base */
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/%u_%s",
- dbNode, relNode,
+ dbOid, relNumber,
forkNames[forkNumber]);
else
path = psprintf("base/%u/%u",
- dbNode, relNode);
+ dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/t%d_%u_%s",
- dbNode, backendId, relNode,
+ dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
path = psprintf("base/%u/t%d_%u",
- dbNode, backendId, relNode);
+ dbOid, backendId, relNumber);
}
}
else
@@ -185,25 +185,25 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
{
if (forkNumber != MAIN_FORKNUM)
path = psprintf("pg_tblspc/%u/%s/%u/%u_%s",
- spcNode, TABLESPACE_VERSION_DIRECTORY,
- dbNode, relNode,
+ spcOid, TABLESPACE_VERSION_DIRECTORY,
+ dbOid, relNumber,
forkNames[forkNumber]);
else
path = psprintf("pg_tblspc/%u/%s/%u/%u",
- spcNode, TABLESPACE_VERSION_DIRECTORY,
- dbNode, relNode);
+ spcOid, TABLESPACE_VERSION_DIRECTORY,
+ dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u_%s",
- spcNode, TABLESPACE_VERSION_DIRECTORY,
- dbNode, backendId, relNode,
+ spcOid, TABLESPACE_VERSION_DIRECTORY,
+ dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u",
- spcNode, TABLESPACE_VERSION_DIRECTORY,
- dbNode, backendId, relNode);
+ spcOid, TABLESPACE_VERSION_DIRECTORY,
+ dbOid, backendId, relNumber);
}
}
return path;
diff --git a/src/include/access/brin_xlog.h b/src/include/access/brin_xlog.h
index 95bfc7e..012a9af 100644
--- a/src/include/access/brin_xlog.h
+++ b/src/include/access/brin_xlog.h
@@ -18,7 +18,7 @@
#include "lib/stringinfo.h"
#include "storage/bufpage.h"
#include "storage/itemptr.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
diff --git a/src/include/access/ginxlog.h b/src/include/access/ginxlog.h
index 21de389..7f98503 100644
--- a/src/include/access/ginxlog.h
+++ b/src/include/access/ginxlog.h
@@ -110,7 +110,7 @@ typedef struct
typedef struct ginxlogSplit
{
- RelFileNode node;
+ RelFileLocator locator;
BlockNumber rrlink; /* right link, or root's blocknumber if root
* split */
BlockNumber leftChildBlkno; /* valid on a non-leaf split */
@@ -167,7 +167,7 @@ typedef struct ginxlogDeletePage
*/
typedef struct ginxlogUpdateMeta
{
- RelFileNode node;
+ RelFileLocator locator;
GinMetaPageData metadata;
BlockNumber prevTail;
BlockNumber newRightlink;
diff --git a/src/include/access/gistxlog.h b/src/include/access/gistxlog.h
index 4537e67..9bbe4c2 100644
--- a/src/include/access/gistxlog.h
+++ b/src/include/access/gistxlog.h
@@ -97,7 +97,7 @@ typedef struct gistxlogPageDelete
*/
typedef struct gistxlogPageReuse
{
- RelFileNode node;
+ RelFileLocator locator;
BlockNumber block;
FullTransactionId latestRemovedFullXid;
} gistxlogPageReuse;
diff --git a/src/include/access/heapam_xlog.h b/src/include/access/heapam_xlog.h
index 2d8a7f6..1705e73 100644
--- a/src/include/access/heapam_xlog.h
+++ b/src/include/access/heapam_xlog.h
@@ -19,7 +19,7 @@
#include "lib/stringinfo.h"
#include "storage/buf.h"
#include "storage/bufpage.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
@@ -370,9 +370,9 @@ typedef struct xl_heap_new_cid
CommandId combocid; /* just for debugging */
/*
- * Store the relfilenode/ctid pair to facilitate lookups.
+ * Store the relfilelocator/ctid pair to facilitate lookups.
*/
- RelFileNode target_node;
+ RelFileLocator target_locator;
ItemPointerData target_tid;
} xl_heap_new_cid;
@@ -415,7 +415,7 @@ extern bool heap_prepare_freeze_tuple(HeapTupleHeader tuple,
MultiXactId *relminmxid_out);
extern void heap_execute_freeze_tuple(HeapTupleHeader tuple,
xl_heap_freeze_tuple *xlrec_tp);
-extern XLogRecPtr log_heap_visible(RelFileNode rnode, Buffer heap_buffer,
+extern XLogRecPtr log_heap_visible(RelFileLocator rlocator, Buffer heap_buffer,
Buffer vm_buffer, TransactionId cutoff_xid, uint8 flags);
#endif /* HEAPAM_XLOG_H */
diff --git a/src/include/access/nbtxlog.h b/src/include/access/nbtxlog.h
index de362d3..d79489e 100644
--- a/src/include/access/nbtxlog.h
+++ b/src/include/access/nbtxlog.h
@@ -180,12 +180,12 @@ typedef struct xl_btree_dedup
* This is what we need to know about page reuse within btree. This record
* only exists to generate a conflict point for Hot Standby.
*
- * Note that we must include a RelFileNode in the record because we don't
+ * Note that we must include a RelFileLocator in the record because we don't
* actually register the buffer with the record.
*/
typedef struct xl_btree_reuse_page
{
- RelFileNode node;
+ RelFileLocator locator;
BlockNumber block;
FullTransactionId latestRemovedFullXid;
} xl_btree_reuse_page;
diff --git a/src/include/access/rewriteheap.h b/src/include/access/rewriteheap.h
index 3e27790..353cbb2 100644
--- a/src/include/access/rewriteheap.h
+++ b/src/include/access/rewriteheap.h
@@ -15,7 +15,7 @@
#include "access/htup.h"
#include "storage/itemptr.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
/* struct definition is private to rewriteheap.c */
@@ -34,8 +34,8 @@ extern bool rewrite_heap_dead_tuple(RewriteState state, HeapTuple oldTuple);
*/
typedef struct LogicalRewriteMappingData
{
- RelFileNode old_node;
- RelFileNode new_node;
+ RelFileLocator old_locator;
+ RelFileLocator new_locator;
ItemPointerData old_tid;
ItemPointerData new_tid;
} LogicalRewriteMappingData;
diff --git a/src/include/access/tableam.h b/src/include/access/tableam.h
index fe869c6..83a8e7e 100644
--- a/src/include/access/tableam.h
+++ b/src/include/access/tableam.h
@@ -560,32 +560,32 @@ typedef struct TableAmRoutine
*/
/*
- * This callback needs to create a new relation filenode for `rel`, with
+ * This callback needs to create a new relation filelocator for `rel`, with
* appropriate durability behaviour for `persistence`.
*
* Note that only the subset of the relcache filled by
* RelationBuildLocalRelation() can be relied upon and that the relation's
* catalog entries will either not yet exist (new relation), or will still
- * reference the old relfilenode.
+ * reference the old relfilelocator.
*
* As output *freezeXid, *minmulti must be set to the values appropriate
* for pg_class.{relfrozenxid, relminmxid}. For AMs that don't need those
* fields to be filled they can be set to InvalidTransactionId and
* InvalidMultiXactId, respectively.
*
- * See also table_relation_set_new_filenode().
+ * See also table_relation_set_new_filelocator().
*/
- void (*relation_set_new_filenode) (Relation rel,
- const RelFileNode *newrnode,
- char persistence,
- TransactionId *freezeXid,
- MultiXactId *minmulti);
+ void (*relation_set_new_filelocator) (Relation rel,
+ const RelFileLocator *newrlocator,
+ char persistence,
+ TransactionId *freezeXid,
+ MultiXactId *minmulti);
/*
* This callback needs to remove all contents from `rel`'s current
- * relfilenode. No provisions for transactional behaviour need to be made.
- * Often this can be implemented by truncating the underlying storage to
- * its minimal size.
+ * relfilelocator. No provisions for transactional behaviour need to be
+ * made. Often this can be implemented by truncating the underlying
+ * storage to its minimal size.
*
* See also table_relation_nontransactional_truncate().
*/
@@ -598,7 +598,7 @@ typedef struct TableAmRoutine
* storage, unless it contains references to the tablespace internally.
*/
void (*relation_copy_data) (Relation rel,
- const RelFileNode *newrnode);
+ const RelFileLocator *newrlocator);
/* See table_relation_copy_for_cluster() */
void (*relation_copy_for_cluster) (Relation NewTable,
@@ -1348,7 +1348,7 @@ table_index_delete_tuples(Relation rel, TM_IndexDeleteOp *delstate)
* RelationGetBufferForTuple. See that method for more information.
*
* TABLE_INSERT_FROZEN should only be specified for inserts into
- * relfilenodes created during the current subtransaction and when
+ * relfilenumbers created during the current subtransaction and when
* there are no prior snapshots or pre-existing portals open.
* This causes rows to be frozen, which is an MVCC violation and
* requires explicit options chosen by user.
@@ -1577,33 +1577,34 @@ table_finish_bulk_insert(Relation rel, int options)
*/
/*
- * Create storage for `rel` in `newrnode`, with persistence set to
+ * Create storage for `rel` in `newrlocator`, with persistence set to
* `persistence`.
*
* This is used both during relation creation and various DDL operations to
- * create a new relfilenode that can be filled from scratch. When creating
- * new storage for an existing relfilenode, this should be called before the
+ * create a new relfilelocator that can be filled from scratch. When creating
+ * new storage for an existing relfilelocator, this should be called before the
* relcache entry has been updated.
*
* *freezeXid, *minmulti are set to the xid / multixact horizon for the table
* that pg_class.{relfrozenxid, relminmxid} have to be set to.
*/
static inline void
-table_relation_set_new_filenode(Relation rel,
- const RelFileNode *newrnode,
- char persistence,
- TransactionId *freezeXid,
- MultiXactId *minmulti)
+table_relation_set_new_filelocator(Relation rel,
+ const RelFileLocator *newrlocator,
+ char persistence,
+ TransactionId *freezeXid,
+ MultiXactId *minmulti)
{
- rel->rd_tableam->relation_set_new_filenode(rel, newrnode, persistence,
- freezeXid, minmulti);
+ rel->rd_tableam->relation_set_new_filelocator(rel, newrlocator,
+ persistence, freezeXid,
+ minmulti);
}
/*
* Remove all table contents from `rel`, in a non-transactional manner.
* Non-transactional meaning that there's no need to support rollbacks. This
- * commonly only is used to perform truncations for relfilenodes created in the
- * current transaction.
+ * commonly only is used to perform truncations for relfilelocators created in
+ * the current transaction.
*/
static inline void
table_relation_nontransactional_truncate(Relation rel)
@@ -1612,15 +1613,15 @@ table_relation_nontransactional_truncate(Relation rel)
}
/*
- * Copy data from `rel` into the new relfilenode `newrnode`. The new
- * relfilenode may not have storage associated before this function is
+ * Copy data from `rel` into the new relfilelocator `newrlocator`. The new
+ * relfilelocator may not have storage associated before this function is
* called. This is only supposed to be used for low level operations like
* changing a relation's tablespace.
*/
static inline void
-table_relation_copy_data(Relation rel, const RelFileNode *newrnode)
+table_relation_copy_data(Relation rel, const RelFileLocator *newrlocator)
{
- rel->rd_tableam->relation_copy_data(rel, newrnode);
+ rel->rd_tableam->relation_copy_data(rel, newrlocator);
}
/*
diff --git a/src/include/access/xact.h b/src/include/access/xact.h
index 4794941..7d2b352 100644
--- a/src/include/access/xact.h
+++ b/src/include/access/xact.h
@@ -19,7 +19,7 @@
#include "datatype/timestamp.h"
#include "lib/stringinfo.h"
#include "nodes/pg_list.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/sinval.h"
/*
@@ -174,7 +174,7 @@ typedef struct SavedTransactionCharacteristics
*/
#define XACT_XINFO_HAS_DBINFO (1U << 0)
#define XACT_XINFO_HAS_SUBXACTS (1U << 1)
-#define XACT_XINFO_HAS_RELFILENODES (1U << 2)
+#define XACT_XINFO_HAS_RELFILELOCATORS (1U << 2)
#define XACT_XINFO_HAS_INVALS (1U << 3)
#define XACT_XINFO_HAS_TWOPHASE (1U << 4)
#define XACT_XINFO_HAS_ORIGIN (1U << 5)
@@ -252,12 +252,12 @@ typedef struct xl_xact_subxacts
} xl_xact_subxacts;
#define MinSizeOfXactSubxacts offsetof(xl_xact_subxacts, subxacts)
-typedef struct xl_xact_relfilenodes
+typedef struct xl_xact_relfilelocators
{
int nrels; /* number of relations */
- RelFileNode xnodes[FLEXIBLE_ARRAY_MEMBER];
-} xl_xact_relfilenodes;
-#define MinSizeOfXactRelfilenodes offsetof(xl_xact_relfilenodes, xnodes)
+ RelFileLocator xlocators[FLEXIBLE_ARRAY_MEMBER];
+} xl_xact_relfilelocators;
+#define MinSizeOfXactRelfileLocators offsetof(xl_xact_relfilelocators, xlocators)
/*
* A transactionally dropped statistics entry.
@@ -305,7 +305,7 @@ typedef struct xl_xact_commit
/* xl_xact_xinfo follows if XLOG_XACT_HAS_INFO */
/* xl_xact_dbinfo follows if XINFO_HAS_DBINFO */
/* xl_xact_subxacts follows if XINFO_HAS_SUBXACT */
- /* xl_xact_relfilenodes follows if XINFO_HAS_RELFILENODES */
+ /* xl_xact_relfilelocators follows if XINFO_HAS_RELFILELOCATORS */
/* xl_xact_stats_items follows if XINFO_HAS_DROPPED_STATS */
/* xl_xact_invals follows if XINFO_HAS_INVALS */
/* xl_xact_twophase follows if XINFO_HAS_TWOPHASE */
@@ -321,7 +321,7 @@ typedef struct xl_xact_abort
/* xl_xact_xinfo follows if XLOG_XACT_HAS_INFO */
/* xl_xact_dbinfo follows if XINFO_HAS_DBINFO */
/* xl_xact_subxacts follows if XINFO_HAS_SUBXACT */
- /* xl_xact_relfilenodes follows if XINFO_HAS_RELFILENODES */
+ /* xl_xact_relfilelocators follows if XINFO_HAS_RELFILELOCATORS */
/* xl_xact_stats_items follows if XINFO_HAS_DROPPED_STATS */
/* No invalidation messages needed. */
/* xl_xact_twophase follows if XINFO_HAS_TWOPHASE */
@@ -367,7 +367,7 @@ typedef struct xl_xact_parsed_commit
TransactionId *subxacts;
int nrels;
- RelFileNode *xnodes;
+ RelFileLocator *xlocators;
int nstats;
xl_xact_stats_item *stats;
@@ -378,7 +378,7 @@ typedef struct xl_xact_parsed_commit
TransactionId twophase_xid; /* only for 2PC */
char twophase_gid[GIDSIZE]; /* only for 2PC */
int nabortrels; /* only for 2PC */
- RelFileNode *abortnodes; /* only for 2PC */
+ RelFileLocator *abortlocators; /* only for 2PC */
int nabortstats; /* only for 2PC */
xl_xact_stats_item *abortstats; /* only for 2PC */
@@ -400,7 +400,7 @@ typedef struct xl_xact_parsed_abort
TransactionId *subxacts;
int nrels;
- RelFileNode *xnodes;
+ RelFileLocator *xlocators;
int nstats;
xl_xact_stats_item *stats;
@@ -483,7 +483,7 @@ extern int xactGetCommittedChildren(TransactionId **ptr);
extern XLogRecPtr XactLogCommitRecord(TimestampTz commit_time,
int nsubxacts, TransactionId *subxacts,
- int nrels, RelFileNode *rels,
+ int nrels, RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
int nmsgs, SharedInvalidationMessage *msgs,
@@ -494,7 +494,7 @@ extern XLogRecPtr XactLogCommitRecord(TimestampTz commit_time,
extern XLogRecPtr XactLogAbortRecord(TimestampTz abort_time,
int nsubxacts, TransactionId *subxacts,
- int nrels, RelFileNode *rels,
+ int nrels, RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
int xactflags, TransactionId twophase_xid,
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index fae0bef..3524c39 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,7 +25,7 @@
#include "lib/stringinfo.h"
#include "pgtime.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
diff --git a/src/include/access/xloginsert.h b/src/include/access/xloginsert.h
index 5fc340c..c04f77b 100644
--- a/src/include/access/xloginsert.h
+++ b/src/include/access/xloginsert.h
@@ -15,7 +15,7 @@
#include "access/xlogdefs.h"
#include "storage/block.h"
#include "storage/buf.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
/*
@@ -45,16 +45,16 @@ extern XLogRecPtr XLogInsert(RmgrId rmid, uint8 info);
extern void XLogEnsureRecordSpace(int max_block_id, int ndatas);
extern void XLogRegisterData(char *data, int len);
extern void XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags);
-extern void XLogRegisterBlock(uint8 block_id, RelFileNode *rnode,
+extern void XLogRegisterBlock(uint8 block_id, RelFileLocator *rlocator,
ForkNumber forknum, BlockNumber blknum, char *page,
uint8 flags);
extern void XLogRegisterBufData(uint8 block_id, char *data, int len);
extern void XLogResetInsertion(void);
extern bool XLogCheckBufferNeedsBackup(Buffer buffer);
-extern XLogRecPtr log_newpage(RelFileNode *rnode, ForkNumber forkNum,
+extern XLogRecPtr log_newpage(RelFileLocator *rlocator, ForkNumber forkNum,
BlockNumber blk, char *page, bool page_std);
-extern void log_newpages(RelFileNode *rnode, ForkNumber forkNum, int num_pages,
+extern void log_newpages(RelFileLocator *rlocator, ForkNumber forkNum, int num_pages,
BlockNumber *blknos, char **pages, bool page_std);
extern XLogRecPtr log_newpage_buffer(Buffer buffer, bool page_std);
extern void log_newpage_range(Relation rel, ForkNumber forkNum,
diff --git a/src/include/access/xlogreader.h b/src/include/access/xlogreader.h
index e73ea4a..5395f15 100644
--- a/src/include/access/xlogreader.h
+++ b/src/include/access/xlogreader.h
@@ -122,7 +122,7 @@ typedef struct
bool in_use;
/* Identify the block this refers to */
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blkno;
@@ -430,10 +430,10 @@ extern FullTransactionId XLogRecGetFullXid(XLogReaderState *record);
extern bool RestoreBlockImage(XLogReaderState *record, uint8 block_id, char *page);
extern char *XLogRecGetBlockData(XLogReaderState *record, uint8 block_id, Size *len);
extern void XLogRecGetBlockTag(XLogReaderState *record, uint8 block_id,
- RelFileNode *rnode, ForkNumber *forknum,
+ RelFileLocator *rlocator, ForkNumber *forknum,
BlockNumber *blknum);
extern bool XLogRecGetBlockTagExtended(XLogReaderState *record, uint8 block_id,
- RelFileNode *rnode, ForkNumber *forknum,
+ RelFileLocator *rlocator, ForkNumber *forknum,
BlockNumber *blknum,
Buffer *prefetch_buffer);
diff --git a/src/include/access/xlogrecord.h b/src/include/access/xlogrecord.h
index 052ac68..7e467ef 100644
--- a/src/include/access/xlogrecord.h
+++ b/src/include/access/xlogrecord.h
@@ -15,7 +15,7 @@
#include "access/xlogdefs.h"
#include "port/pg_crc32c.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* The overall layout of an XLOG record is:
@@ -97,7 +97,7 @@ typedef struct XLogRecordBlockHeader
* image) */
/* If BKPBLOCK_HAS_IMAGE, an XLogRecordBlockImageHeader struct follows */
- /* If BKPBLOCK_SAME_REL is not set, a RelFileNode follows */
+ /* If BKPBLOCK_SAME_REL is not set, a RelFileLocator follows */
/* BlockNumber follows */
} XLogRecordBlockHeader;
@@ -175,7 +175,7 @@ typedef struct XLogRecordBlockCompressHeader
(SizeOfXLogRecordBlockHeader + \
SizeOfXLogRecordBlockImageHeader + \
SizeOfXLogRecordBlockCompressHeader + \
- sizeof(RelFileNode) + \
+ sizeof(RelFileLocator) + \
sizeof(BlockNumber))
/*
@@ -187,7 +187,7 @@ typedef struct XLogRecordBlockCompressHeader
#define BKPBLOCK_HAS_IMAGE 0x10 /* block data is an XLogRecordBlockImage */
#define BKPBLOCK_HAS_DATA 0x20
#define BKPBLOCK_WILL_INIT 0x40 /* redo will re-init the page */
-#define BKPBLOCK_SAME_REL 0x80 /* RelFileNode omitted, same as previous */
+#define BKPBLOCK_SAME_REL 0x80 /* RelFileLocator omitted, same as previous */
/*
* XLogRecordDataHeaderShort/Long are used for the "main data" portion of
diff --git a/src/include/access/xlogutils.h b/src/include/access/xlogutils.h
index c9d0b75..ef18297 100644
--- a/src/include/access/xlogutils.h
+++ b/src/include/access/xlogutils.h
@@ -60,9 +60,9 @@ extern PGDLLIMPORT HotStandbyState standbyState;
extern bool XLogHaveInvalidPages(void);
extern void XLogCheckInvalidPages(void);
-extern void XLogDropRelation(RelFileNode rnode, ForkNumber forknum);
+extern void XLogDropRelation(RelFileLocator rlocator, ForkNumber forknum);
extern void XLogDropDatabase(Oid dbid);
-extern void XLogTruncateRelation(RelFileNode rnode, ForkNumber forkNum,
+extern void XLogTruncateRelation(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber nblocks);
/* Result codes for XLogReadBufferForRedo[Extended] */
@@ -89,11 +89,11 @@ extern XLogRedoAction XLogReadBufferForRedoExtended(XLogReaderState *record,
ReadBufferMode mode, bool get_cleanup_lock,
Buffer *buf);
-extern Buffer XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
+extern Buffer XLogReadBufferExtended(RelFileLocator rlocator, ForkNumber forknum,
BlockNumber blkno, ReadBufferMode mode,
Buffer recent_buffer);
-extern Relation CreateFakeRelcacheEntry(RelFileNode rnode);
+extern Relation CreateFakeRelcacheEntry(RelFileLocator rlocator);
extern void FreeFakeRelcacheEntry(Relation fakerel);
extern int read_local_xlog_page(XLogReaderState *state,
diff --git a/src/include/catalog/binary_upgrade.h b/src/include/catalog/binary_upgrade.h
index 0b6944b..07bbf3a 100644
--- a/src/include/catalog/binary_upgrade.h
+++ b/src/include/catalog/binary_upgrade.h
@@ -22,11 +22,11 @@ extern PGDLLIMPORT Oid binary_upgrade_next_mrng_pg_type_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_mrng_array_pg_type_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_heap_pg_class_oid;
-extern PGDLLIMPORT Oid binary_upgrade_next_heap_pg_class_relfilenode;
+extern PGDLLIMPORT Oid binary_upgrade_next_heap_pg_class_relfilenumber;
extern PGDLLIMPORT Oid binary_upgrade_next_index_pg_class_oid;
-extern PGDLLIMPORT Oid binary_upgrade_next_index_pg_class_relfilenode;
+extern PGDLLIMPORT Oid binary_upgrade_next_index_pg_class_relfilenumber;
extern PGDLLIMPORT Oid binary_upgrade_next_toast_pg_class_oid;
-extern PGDLLIMPORT Oid binary_upgrade_next_toast_pg_class_relfilenode;
+extern PGDLLIMPORT Oid binary_upgrade_next_toast_pg_class_relfilenumber;
extern PGDLLIMPORT Oid binary_upgrade_next_pg_enum_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_pg_authid_oid;
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index 60c1215..5373cd0 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -38,7 +38,7 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
-extern Oid GetNewRelFileNode(Oid reltablespace, Relation pg_class,
+extern Oid GetNewRelFileNumber(Oid reltablespace, Relation pg_class,
char relpersistence);
#endif /* CATALOG_H */
diff --git a/src/include/catalog/heap.h b/src/include/catalog/heap.h
index 07c5b88..787baa0 100644
--- a/src/include/catalog/heap.h
+++ b/src/include/catalog/heap.h
@@ -50,7 +50,7 @@ extern Relation heap_create(const char *relname,
Oid relnamespace,
Oid reltablespace,
Oid relid,
- Oid relfilenode,
+ Oid relfilenumber,
Oid accessmtd,
TupleDesc tupDesc,
char relkind,
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index a1d6e3b..e245ce0 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -71,7 +71,7 @@ extern Oid index_create(Relation heapRelation,
Oid indexRelationId,
Oid parentIndexRelid,
Oid parentConstraintId,
- Oid relFileNode,
+ Oid relFileNumber,
IndexInfo *indexInfo,
List *indexColNames,
Oid accessMethodObjectId,
diff --git a/src/include/catalog/storage.h b/src/include/catalog/storage.h
index 59f3404..9964c31 100644
--- a/src/include/catalog/storage.h
+++ b/src/include/catalog/storage.h
@@ -15,23 +15,23 @@
#define STORAGE_H
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "utils/relcache.h"
/* GUC variables */
extern PGDLLIMPORT int wal_skip_threshold;
-extern SMgrRelation RelationCreateStorage(RelFileNode rnode,
+extern SMgrRelation RelationCreateStorage(RelFileLocator rlocator,
char relpersistence,
bool register_delete);
extern void RelationDropStorage(Relation rel);
-extern void RelationPreserveStorage(RelFileNode rnode, bool atCommit);
+extern void RelationPreserveStorage(RelFileLocator rlocator, bool atCommit);
extern void RelationPreTruncate(Relation rel);
extern void RelationTruncate(Relation rel, BlockNumber nblocks);
extern void RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
ForkNumber forkNum, char relpersistence);
-extern bool RelFileNodeSkippingWAL(RelFileNode rnode);
+extern bool RelFileLocatorSkippingWAL(RelFileLocator rlocator);
extern Size EstimatePendingSyncsSpace(void);
extern void SerializePendingSyncs(Size maxSize, char *startAddress);
extern void RestorePendingSyncs(char *startAddress);
@@ -42,7 +42,7 @@ extern void RestorePendingSyncs(char *startAddress);
*/
extern void smgrDoPendingDeletes(bool isCommit);
extern void smgrDoPendingSyncs(bool isCommit, bool isParallelWorker);
-extern int smgrGetPendingDeletes(bool forCommit, RelFileNode **ptr);
+extern int smgrGetPendingDeletes(bool forCommit, RelFileLocator **ptr);
extern void AtSubCommit_smgr(void);
extern void AtSubAbort_smgr(void);
extern void PostPrepare_smgr(void);
diff --git a/src/include/catalog/storage_xlog.h b/src/include/catalog/storage_xlog.h
index 622de22..44a5e20 100644
--- a/src/include/catalog/storage_xlog.h
+++ b/src/include/catalog/storage_xlog.h
@@ -17,7 +17,7 @@
#include "access/xlogreader.h"
#include "lib/stringinfo.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* Declarations for smgr-related XLOG records
@@ -32,7 +32,7 @@
typedef struct xl_smgr_create
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forkNum;
} xl_smgr_create;
@@ -46,11 +46,11 @@ typedef struct xl_smgr_create
typedef struct xl_smgr_truncate
{
BlockNumber blkno;
- RelFileNode rnode;
+ RelFileLocator rlocator;
int flags;
} xl_smgr_truncate;
-extern void log_smgrcreate(const RelFileNode *rnode, ForkNumber forkNum);
+extern void log_smgrcreate(const RelFileLocator *rlocator, ForkNumber forkNum);
extern void smgr_redo(XLogReaderState *record);
extern void smgr_desc(StringInfo buf, XLogReaderState *record);
diff --git a/src/include/commands/sequence.h b/src/include/commands/sequence.h
index 9da2300..d38c0e2 100644
--- a/src/include/commands/sequence.h
+++ b/src/include/commands/sequence.h
@@ -19,7 +19,7 @@
#include "lib/stringinfo.h"
#include "nodes/parsenodes.h"
#include "parser/parse_node.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
typedef struct FormData_pg_sequence_data
@@ -47,7 +47,7 @@ typedef FormData_pg_sequence_data *Form_pg_sequence_data;
typedef struct xl_seq_rec
{
- RelFileNode node;
+ RelFileLocator locator;
/* SEQUENCE TUPLE DATA FOLLOWS AT THE END */
} xl_seq_rec;
diff --git a/src/include/commands/tablecmds.h b/src/include/commands/tablecmds.h
index 5d4037f..4cc2b17 100644
--- a/src/include/commands/tablecmds.h
+++ b/src/include/commands/tablecmds.h
@@ -66,7 +66,7 @@ extern void SetRelationHasSubclass(Oid relationId, bool relhassubclass);
extern bool CheckRelationTableSpaceMove(Relation rel, Oid newTableSpaceId);
extern void SetRelationTableSpace(Relation rel, Oid newTableSpaceId,
- Oid newRelFileNode);
+ Oid newRelFileNumber);
extern ObjectAddress renameatt(RenameStmt *stmt);
diff --git a/src/include/commands/tablespace.h b/src/include/commands/tablespace.h
index 24b6473..1f80907 100644
--- a/src/include/commands/tablespace.h
+++ b/src/include/commands/tablespace.h
@@ -50,7 +50,7 @@ extern void DropTableSpace(DropTableSpaceStmt *stmt);
extern ObjectAddress RenameTableSpace(const char *oldname, const char *newname);
extern Oid AlterTableSpaceOptions(AlterTableSpaceOptionsStmt *stmt);
-extern void TablespaceCreateDbspace(Oid spcNode, Oid dbNode, bool isRedo);
+extern void TablespaceCreateDbspace(Oid spcOid, Oid dbOid, bool isRedo);
extern Oid GetDefaultTablespace(char relpersistence, bool partitioned);
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index 13849a3..aefeeff 100644
--- a/src/include/common/relpath.h
+++ b/src/include/common/relpath.h
@@ -64,27 +64,27 @@ extern int forkname_chars(const char *str, ForkNumber *fork);
/*
* Stuff for computing filesystem pathnames for relations.
*/
-extern char *GetDatabasePath(Oid dbNode, Oid spcNode);
+extern char *GetDatabasePath(Oid dbOid, Oid spcOid);
-extern char *GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
+extern char *GetRelationPath(Oid dbOid, Oid spcOid, Oid relNumber,
int backendId, ForkNumber forkNumber);
/*
* Wrapper macros for GetRelationPath. Beware of multiple
- * evaluation of the RelFileNode or RelFileNodeBackend argument!
+ * evaluation of the RelFileLocator or RelFileLocatorBackend argument!
*/
-/* First argument is a RelFileNode */
-#define relpathbackend(rnode, backend, forknum) \
- GetRelationPath((rnode).dbNode, (rnode).spcNode, (rnode).relNode, \
+/* First argument is a RelFileLocator */
+#define relpathbackend(rlocator, backend, forknum) \
+ GetRelationPath((rlocator).dbOid, (rlocator).spcOid, (rlocator).relNumber, \
backend, forknum)
-/* First argument is a RelFileNode */
-#define relpathperm(rnode, forknum) \
- relpathbackend(rnode, InvalidBackendId, forknum)
+/* First argument is a RelFileLocator */
+#define relpathperm(rlocator, forknum) \
+ relpathbackend(rlocator, InvalidBackendId, forknum)
-/* First argument is a RelFileNodeBackend */
-#define relpath(rnode, forknum) \
- relpathbackend((rnode).node, (rnode).backend, forknum)
+/* First argument is a RelFileLocatorBackend */
+#define relpath(rlocator, forknum) \
+ relpathbackend((rlocator).locator, (rlocator).backend, forknum)
#endif /* RELPATH_H */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 73f635b..f540d9b 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3247,10 +3247,10 @@ typedef struct IndexStmt
List *excludeOpNames; /* exclusion operator names, or NIL if none */
char *idxcomment; /* comment to apply to index, or NULL */
Oid indexOid; /* OID of an existing index, if any */
- Oid oldNode; /* relfilenode of existing storage, if any */
- SubTransactionId oldCreateSubid; /* rd_createSubid of oldNode */
- SubTransactionId oldFirstRelfilenodeSubid; /* rd_firstRelfilenodeSubid of
- * oldNode */
+ Oid oldNumber; /* relfilenumber of existing storage, if any */
+ SubTransactionId oldCreateSubid; /* rd_createSubid of oldNumber */
+ SubTransactionId oldFirstRelfilenumberSubid; /* rd_firstRelfilelocatorSubid
+ * of oldNumber */
bool unique; /* is index unique? */
bool nulls_not_distinct; /* null treatment for UNIQUE constraints */
bool primary; /* is index a primary key? */
diff --git a/src/include/postmaster/bgwriter.h b/src/include/postmaster/bgwriter.h
index 2511ef4..b67fb1e 100644
--- a/src/include/postmaster/bgwriter.h
+++ b/src/include/postmaster/bgwriter.h
@@ -16,7 +16,7 @@
#define _BGWRITER_H
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "storage/sync.h"
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 4a01f87..d109d0b 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -99,7 +99,7 @@ typedef struct ReorderBufferChange
struct
{
/* relation that has been changed */
- RelFileNode relnode;
+ RelFileLocator rlocator;
/* no previously reassembled toast chunks are necessary anymore */
bool clear_toast_afterwards;
@@ -145,7 +145,7 @@ typedef struct ReorderBufferChange
*/
struct
{
- RelFileNode node;
+ RelFileLocator locator;
ItemPointerData tid;
CommandId cmin;
CommandId cmax;
@@ -657,7 +657,7 @@ extern void ReorderBufferAddSnapshot(ReorderBuffer *, TransactionId, XLogRecPtr
extern void ReorderBufferAddNewCommandId(ReorderBuffer *, TransactionId, XLogRecPtr lsn,
CommandId cid);
extern void ReorderBufferAddNewTupleCids(ReorderBuffer *, TransactionId, XLogRecPtr lsn,
- RelFileNode node, ItemPointerData pt,
+ RelFileLocator locator, ItemPointerData pt,
CommandId cmin, CommandId cmax, CommandId combocid);
extern void ReorderBufferAddInvalidations(ReorderBuffer *, TransactionId, XLogRecPtr lsn,
Size nmsgs, SharedInvalidationMessage *msgs);
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index a17e7b2..c318c46 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,30 +90,30 @@
*/
typedef struct buftag
{
- RelFileNode rnode; /* physical relation identifier */
+ RelFileLocator rlocator; /* physical relation identifier */
ForkNumber forkNum;
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
#define CLEAR_BUFFERTAG(a) \
( \
- (a).rnode.spcNode = InvalidOid, \
- (a).rnode.dbNode = InvalidOid, \
- (a).rnode.relNode = InvalidOid, \
+ (a).rlocator.spcOid = InvalidOid, \
+ (a).rlocator.dbOid = InvalidOid, \
+ (a).rlocator.relNumber = InvalidOid, \
(a).forkNum = InvalidForkNumber, \
(a).blockNum = InvalidBlockNumber \
)
-#define INIT_BUFFERTAG(a,xx_rnode,xx_forkNum,xx_blockNum) \
+#define INIT_BUFFERTAG(a,xx_rlocator,xx_forkNum,xx_blockNum) \
( \
- (a).rnode = (xx_rnode), \
+ (a).rlocator = (xx_rlocator), \
(a).forkNum = (xx_forkNum), \
(a).blockNum = (xx_blockNum) \
)
#define BUFFERTAGS_EQUAL(a,b) \
( \
- RelFileNodeEquals((a).rnode, (b).rnode) && \
+ RelFileLocatorEquals((a).rlocator, (b).rlocator) && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
)
@@ -292,7 +292,7 @@ extern PGDLLIMPORT BufferDesc *LocalBufferDescriptors;
typedef struct CkptSortItem
{
Oid tsId;
- Oid relNode;
+ Oid relNumber;
ForkNumber forkNum;
BlockNumber blockNum;
int buf_id;
@@ -337,9 +337,9 @@ extern PrefetchBufferResult PrefetchLocalBuffer(SMgrRelation smgr,
extern BufferDesc *LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum,
BlockNumber blockNum, bool *foundPtr);
extern void MarkLocalBufferDirty(Buffer buffer);
-extern void DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
+extern void DropRelFileLocatorLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber firstDelBlock);
-extern void DropRelFileNodeAllLocalBuffers(RelFileNode rnode);
+extern void DropRelFileLocatorAllLocalBuffers(RelFileLocator rlocator);
extern void AtEOXact_LocalBuffers(bool isCommit);
#endif /* BUFMGR_INTERNALS_H */
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 5839140..96e473e 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -17,7 +17,7 @@
#include "storage/block.h"
#include "storage/buf.h"
#include "storage/bufpage.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
#include "utils/snapmgr.h"
@@ -176,13 +176,13 @@ extern PrefetchBufferResult PrefetchSharedBuffer(struct SMgrRelationData *smgr_r
BlockNumber blockNum);
extern PrefetchBufferResult PrefetchBuffer(Relation reln, ForkNumber forkNum,
BlockNumber blockNum);
-extern bool ReadRecentBuffer(RelFileNode rnode, ForkNumber forkNum,
+extern bool ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber blockNum, Buffer recent_buffer);
extern Buffer ReadBuffer(Relation reln, BlockNumber blockNum);
extern Buffer ReadBufferExtended(Relation reln, ForkNumber forkNum,
BlockNumber blockNum, ReadBufferMode mode,
BufferAccessStrategy strategy);
-extern Buffer ReadBufferWithoutRelcache(RelFileNode rnode,
+extern Buffer ReadBufferWithoutRelcache(RelFileLocator rlocator,
ForkNumber forkNum, BlockNumber blockNum,
ReadBufferMode mode, BufferAccessStrategy strategy,
bool permanent);
@@ -204,13 +204,13 @@ extern BlockNumber RelationGetNumberOfBlocksInFork(Relation relation,
extern void FlushOneBuffer(Buffer buffer);
extern void FlushRelationBuffers(Relation rel);
extern void FlushRelationsAllBuffers(struct SMgrRelationData **smgrs, int nrels);
-extern void CreateAndCopyRelationData(RelFileNode src_rnode,
- RelFileNode dst_rnode,
+extern void CreateAndCopyRelationData(RelFileLocator src_rlocator,
+ RelFileLocator dst_rlocator,
bool permanent);
extern void FlushDatabaseBuffers(Oid dbid);
-extern void DropRelFileNodeBuffers(struct SMgrRelationData *smgr_reln, ForkNumber *forkNum,
+extern void DropRelFileLocatorBuffers(struct SMgrRelationData *smgr_reln, ForkNumber *forkNum,
int nforks, BlockNumber *firstDelBlock);
-extern void DropRelFileNodesAllBuffers(struct SMgrRelationData **smgr_reln, int nnodes);
+extern void DropRelFileLocatorsAllBuffers(struct SMgrRelationData **smgr_reln, int nlocators);
extern void DropDatabaseBuffers(Oid dbid);
#define RelationGetNumberOfBlocks(reln) \
@@ -223,7 +223,7 @@ extern XLogRecPtr BufferGetLSNAtomic(Buffer buffer);
extern void PrintPinnedBufs(void);
#endif
extern Size BufferShmemSize(void);
-extern void BufferGetTag(Buffer buffer, RelFileNode *rnode,
+extern void BufferGetTag(Buffer buffer, RelFileLocator *rlocator,
ForkNumber *forknum, BlockNumber *blknum);
extern void MarkBufferDirtyHint(Buffer buffer, bool buffer_std);
diff --git a/src/include/storage/freespace.h b/src/include/storage/freespace.h
index dcc40eb..fcb0802 100644
--- a/src/include/storage/freespace.h
+++ b/src/include/storage/freespace.h
@@ -15,7 +15,7 @@
#define FREESPACE_H_
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
/* prototypes for public functions in freespace.c */
@@ -27,7 +27,7 @@ extern BlockNumber RecordAndGetPageWithFreeSpace(Relation rel,
Size spaceNeeded);
extern void RecordPageWithFreeSpace(Relation rel, BlockNumber heapBlk,
Size spaceAvail);
-extern void XLogRecordPageWithFreeSpace(RelFileNode rnode, BlockNumber heapBlk,
+extern void XLogRecordPageWithFreeSpace(RelFileLocator rlocator, BlockNumber heapBlk,
Size spaceAvail);
extern BlockNumber FreeSpaceMapPrepareTruncateRel(Relation rel,
diff --git a/src/include/storage/md.h b/src/include/storage/md.h
index ffffa40..10aa1b0 100644
--- a/src/include/storage/md.h
+++ b/src/include/storage/md.h
@@ -15,7 +15,7 @@
#define MD_H
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "storage/sync.h"
@@ -25,7 +25,7 @@ extern void mdopen(SMgrRelation reln);
extern void mdclose(SMgrRelation reln, ForkNumber forknum);
extern void mdcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo);
extern bool mdexists(SMgrRelation reln, ForkNumber forknum);
-extern void mdunlink(RelFileNodeBackend rnode, ForkNumber forknum, bool isRedo);
+extern void mdunlink(RelFileLocatorBackend rlocator, ForkNumber forknum, bool isRedo);
extern void mdextend(SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum, char *buffer, bool skipFsync);
extern bool mdprefetch(SMgrRelation reln, ForkNumber forknum,
@@ -42,7 +42,7 @@ extern void mdtruncate(SMgrRelation reln, ForkNumber forknum,
extern void mdimmedsync(SMgrRelation reln, ForkNumber forknum);
extern void ForgetDatabaseSyncRequests(Oid dbid);
-extern void DropRelationFiles(RelFileNode *delrels, int ndelrels, bool isRedo);
+extern void DropRelationFiles(RelFileLocator *delrels, int ndelrels, bool isRedo);
/* md sync callbacks */
extern int mdsyncfiletag(const FileTag *ftag, char *path);
diff --git a/src/include/storage/relfilelocator.h b/src/include/storage/relfilelocator.h
new file mode 100644
index 0000000..4496cfa
--- /dev/null
+++ b/src/include/storage/relfilelocator.h
@@ -0,0 +1,99 @@
+/*-------------------------------------------------------------------------
+ *
+ * relfilelocator.h
+ * Physical access information for relations.
+ *
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/storage/relfilelocator.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef RELFILELOCATOR_H
+#define RELFILELOCATOR_H
+
+#include "common/relpath.h"
+#include "storage/backendid.h"
+
+/*
+ * RelFileLocator must provide all that we need to know to physically access
+ * a relation, with the exception of the backend ID, which can be provided
+ * separately. Note, however, that a "physical" relation is comprised of
+ * multiple files on the filesystem, as each fork is stored as a separate
+ * file, and each fork can be divided into multiple segments. See md.c.
+ *
+ * spcOid identifies the tablespace of the relation. It corresponds to
+ * pg_tablespace.oid.
+ *
+ * dbOid identifies the database of the relation. It is zero for
+ * "shared" relations (those common to all databases of a cluster).
+ * Nonzero dbOid values correspond to pg_database.oid.
+ *
+ * relNumber identifies the specific relation. relNumber corresponds to
+ * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
+ * to assign new physical files to relations in some situations).
+ * Notice that relNumber is only unique within a database in a particular
+ * tablespace.
+ *
+ * Note: spcOid must be GLOBALTABLESPACE_OID if and only if dbOid is
+ * zero. We support shared relations only in the "global" tablespace.
+ *
+ * Note: in pg_class we allow reltablespace == 0 to denote that the
+ * relation is stored in its database's "default" tablespace (as
+ * identified by pg_database.dattablespace). However this shorthand
+ * is NOT allowed in RelFileLocator structs --- the real tablespace ID
+ * must be supplied when setting spcOid.
+ *
+ * Note: in pg_class, relfilenode can be zero to denote that the relation
+ * is a "mapped" relation, whose current true filenode number is available
+ * from relmapper.c. Again, this case is NOT allowed in RelFileLocators.
+ *
+ * Note: various places use RelFileLocator in hashtable keys. Therefore,
+ * there *must not* be any unused padding bytes in this struct. That
+ * should be safe as long as all the fields are of type Oid.
+ */
+typedef struct RelFileLocator
+{
+ Oid spcOid; /* tablespace */
+ Oid dbOid; /* database */
+ Oid relNumber; /* relation */
+} RelFileLocator;
+
+/*
+ * Augmenting a relfilelocator with the backend ID provides all the information
+ * we need to locate the physical storage. The backend ID is InvalidBackendId
+ * for regular relations (those accessible to more than one backend), or the
+ * owning backend's ID for backend-local relations. Backend-local relations
+ * are always transient and removed in case of a database crash; they are
+ * never WAL-logged or fsync'd.
+ */
+typedef struct RelFileLocatorBackend
+{
+ RelFileLocator locator;
+ BackendId backend;
+} RelFileLocatorBackend;
+
+#define RelFileLocatorBackendIsTemp(rlocator) \
+ ((rlocator).backend != InvalidBackendId)
+
+/*
+ * Note: RelFileLocatorEquals and RelFileLocatorBackendEquals compare relNumber first
+ * since that is most likely to be different in two unequal RelFileLocators. It
+ * is probably redundant to compare spcOid if the other fields are found equal,
+ * but do it anyway to be sure. Likewise for checking the backend ID in
+ * RelFileLocatorBackendEquals.
+ */
+#define RelFileLocatorEquals(locator1, locator2) \
+ ((locator1).relNumber == (locator2).relNumber && \
+ (locator1).dbOid == (locator2).dbOid && \
+ (locator1).spcOid == (locator2).spcOid)
+
+#define RelFileLocatorBackendEquals(locator1, locator2) \
+ ((locator1).locator.relNumber == (locator2).locator.relNumber && \
+ (locator1).locator.dbOid == (locator2).locator.dbOid && \
+ (locator1).backend == (locator2).backend && \
+ (locator1).locator.spcOid == (locator2).locator.spcOid)
+
+#endif /* RELFILELOCATOR_H */
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
deleted file mode 100644
index 4fdc606..0000000
--- a/src/include/storage/relfilenode.h
+++ /dev/null
@@ -1,99 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * relfilenode.h
- * Physical access information for relations.
- *
- *
- * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * src/include/storage/relfilenode.h
- *
- *-------------------------------------------------------------------------
- */
-#ifndef RELFILENODE_H
-#define RELFILENODE_H
-
-#include "common/relpath.h"
-#include "storage/backendid.h"
-
-/*
- * RelFileNode must provide all that we need to know to physically access
- * a relation, with the exception of the backend ID, which can be provided
- * separately. Note, however, that a "physical" relation is comprised of
- * multiple files on the filesystem, as each fork is stored as a separate
- * file, and each fork can be divided into multiple segments. See md.c.
- *
- * spcNode identifies the tablespace of the relation. It corresponds to
- * pg_tablespace.oid.
- *
- * dbNode identifies the database of the relation. It is zero for
- * "shared" relations (those common to all databases of a cluster).
- * Nonzero dbNode values correspond to pg_database.oid.
- *
- * relNode identifies the specific relation. relNode corresponds to
- * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
- * to assign new physical files to relations in some situations).
- * Notice that relNode is only unique within a database in a particular
- * tablespace.
- *
- * Note: spcNode must be GLOBALTABLESPACE_OID if and only if dbNode is
- * zero. We support shared relations only in the "global" tablespace.
- *
- * Note: in pg_class we allow reltablespace == 0 to denote that the
- * relation is stored in its database's "default" tablespace (as
- * identified by pg_database.dattablespace). However this shorthand
- * is NOT allowed in RelFileNode structs --- the real tablespace ID
- * must be supplied when setting spcNode.
- *
- * Note: in pg_class, relfilenode can be zero to denote that the relation
- * is a "mapped" relation, whose current true filenode number is available
- * from relmapper.c. Again, this case is NOT allowed in RelFileNodes.
- *
- * Note: various places use RelFileNode in hashtable keys. Therefore,
- * there *must not* be any unused padding bytes in this struct. That
- * should be safe as long as all the fields are of type Oid.
- */
-typedef struct RelFileNode
-{
- Oid spcNode; /* tablespace */
- Oid dbNode; /* database */
- Oid relNode; /* relation */
-} RelFileNode;
-
-/*
- * Augmenting a relfilenode with the backend ID provides all the information
- * we need to locate the physical storage. The backend ID is InvalidBackendId
- * for regular relations (those accessible to more than one backend), or the
- * owning backend's ID for backend-local relations. Backend-local relations
- * are always transient and removed in case of a database crash; they are
- * never WAL-logged or fsync'd.
- */
-typedef struct RelFileNodeBackend
-{
- RelFileNode node;
- BackendId backend;
-} RelFileNodeBackend;
-
-#define RelFileNodeBackendIsTemp(rnode) \
- ((rnode).backend != InvalidBackendId)
-
-/*
- * Note: RelFileNodeEquals and RelFileNodeBackendEquals compare relNode first
- * since that is most likely to be different in two unequal RelFileNodes. It
- * is probably redundant to compare spcNode if the other fields are found equal,
- * but do it anyway to be sure. Likewise for checking the backend ID in
- * RelFileNodeBackendEquals.
- */
-#define RelFileNodeEquals(node1, node2) \
- ((node1).relNode == (node2).relNode && \
- (node1).dbNode == (node2).dbNode && \
- (node1).spcNode == (node2).spcNode)
-
-#define RelFileNodeBackendEquals(node1, node2) \
- ((node1).node.relNode == (node2).node.relNode && \
- (node1).node.dbNode == (node2).node.dbNode && \
- (node1).backend == (node2).backend && \
- (node1).node.spcNode == (node2).node.spcNode)
-
-#endif /* RELFILENODE_H */
diff --git a/src/include/storage/sinval.h b/src/include/storage/sinval.h
index e7cd456..56c6fc9 100644
--- a/src/include/storage/sinval.h
+++ b/src/include/storage/sinval.h
@@ -16,7 +16,7 @@
#include <signal.h>
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* We support several types of shared-invalidation messages:
@@ -90,7 +90,7 @@ typedef struct
int8 id; /* type field --- must be first */
int8 backend_hi; /* high bits of backend ID, if temprel */
uint16 backend_lo; /* low bits of backend ID, if temprel */
- RelFileNode rnode; /* spcNode, dbNode, relNode */
+ RelFileLocator rlocator; /* spcOid, dbOid, relNumber */
} SharedInvalSmgrMsg;
#define SHAREDINVALRELMAP_ID (-4)
diff --git a/src/include/storage/smgr.h b/src/include/storage/smgr.h
index 6b63c60..a077153 100644
--- a/src/include/storage/smgr.h
+++ b/src/include/storage/smgr.h
@@ -16,7 +16,7 @@
#include "lib/ilist.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* smgr.c maintains a table of SMgrRelation objects, which are essentially
@@ -38,8 +38,8 @@
*/
typedef struct SMgrRelationData
{
- /* rnode is the hashtable lookup key, so it must be first! */
- RelFileNodeBackend smgr_rnode; /* relation physical identifier */
+ /* rlocator is the hashtable lookup key, so it must be first! */
+ RelFileLocatorBackend smgr_rlocator; /* relation physical identifier */
/* pointer to owning pointer, or NULL if none */
struct SMgrRelationData **smgr_owner;
@@ -75,16 +75,16 @@ typedef struct SMgrRelationData
typedef SMgrRelationData *SMgrRelation;
#define SmgrIsTemp(smgr) \
- RelFileNodeBackendIsTemp((smgr)->smgr_rnode)
+ RelFileLocatorBackendIsTemp((smgr)->smgr_rlocator)
extern void smgrinit(void);
-extern SMgrRelation smgropen(RelFileNode rnode, BackendId backend);
+extern SMgrRelation smgropen(RelFileLocator rlocator, BackendId backend);
extern bool smgrexists(SMgrRelation reln, ForkNumber forknum);
extern void smgrsetowner(SMgrRelation *owner, SMgrRelation reln);
extern void smgrclearowner(SMgrRelation *owner, SMgrRelation reln);
extern void smgrclose(SMgrRelation reln);
extern void smgrcloseall(void);
-extern void smgrclosenode(RelFileNodeBackend rnode);
+extern void smgrcloserellocator(RelFileLocatorBackend rlocator);
extern void smgrrelease(SMgrRelation reln);
extern void smgrreleaseall(void);
extern void smgrcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo);
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index 6a77632..dacef92 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -17,7 +17,7 @@
#include "datatype/timestamp.h"
#include "storage/lock.h"
#include "storage/procsignal.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/standbydefs.h"
/* User-settable GUC parameters */
@@ -30,9 +30,9 @@ extern void InitRecoveryTransactionEnvironment(void);
extern void ShutdownRecoveryTransactionEnvironment(void);
extern void ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid,
- RelFileNode node);
+ RelFileLocator locator);
extern void ResolveRecoveryConflictWithSnapshotFullXid(FullTransactionId latestRemovedFullXid,
- RelFileNode node);
+ RelFileLocator locator);
extern void ResolveRecoveryConflictWithTablespace(Oid tsid);
extern void ResolveRecoveryConflictWithDatabase(Oid dbid);
diff --git a/src/include/storage/sync.h b/src/include/storage/sync.h
index 9737e1e..049af87 100644
--- a/src/include/storage/sync.h
+++ b/src/include/storage/sync.h
@@ -13,7 +13,7 @@
#ifndef SYNC_H
#define SYNC_H
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* Type of sync request. These are used to manage the set of pending
@@ -51,7 +51,7 @@ typedef struct FileTag
{
int16 handler; /* SyncRequestHandler value, saving space */
int16 forknum; /* ForkNumber, saving space */
- RelFileNode rnode;
+ RelFileLocator rlocator;
uint32 segno;
} FileTag;
diff --git a/src/include/utils/inval.h b/src/include/utils/inval.h
index 0e0323b..23748b7 100644
--- a/src/include/utils/inval.h
+++ b/src/include/utils/inval.h
@@ -15,7 +15,7 @@
#define INVAL_H
#include "access/htup.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
extern PGDLLIMPORT int debug_discard_caches;
@@ -48,7 +48,7 @@ extern void CacheInvalidateRelcacheByTuple(HeapTuple classTuple);
extern void CacheInvalidateRelcacheByRelid(Oid relid);
-extern void CacheInvalidateSmgr(RelFileNodeBackend rnode);
+extern void CacheInvalidateSmgr(RelFileLocatorBackend rlocator);
extern void CacheInvalidateRelmap(Oid databaseId);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 1896a9a..f7c4ce8 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -23,7 +23,7 @@
#include "partitioning/partdefs.h"
#include "rewrite/prs2lock.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "utils/relcache.h"
#include "utils/reltrigger.h"
@@ -53,7 +53,7 @@ typedef LockInfoData *LockInfo;
typedef struct RelationData
{
- RelFileNode rd_node; /* relation physical identifier */
+ RelFileLocator rd_locator; /* relation physical identifier */
SMgrRelation rd_smgr; /* cached file handle, or NULL */
int rd_refcnt; /* reference count */
BackendId rd_backend; /* owning backend id, if temporary relation */
@@ -66,44 +66,44 @@ typedef struct RelationData
/*----------
* rd_createSubid is the ID of the highest subtransaction the rel has
- * survived into or zero if the rel or its rd_node was created before the
- * current top transaction. (IndexStmt.oldNode leads to the case of a new
- * rel with an old rd_node.) rd_firstRelfilenodeSubid is the ID of the
- * highest subtransaction an rd_node change has survived into or zero if
- * rd_node matches the value it had at the start of the current top
+ * survived into or zero if the rel or its rd_locator was created before the
+ * current top transaction. (IndexStmt.oldNumber leads to the case of a new
+ * rel with an old rd_locator.) rd_firstRelfilelocatorSubid is the ID of the
+ * highest subtransaction an rd_locator change has survived into or zero if
+ * rd_locator matches the value it had at the start of the current top
* transaction. (Rolling back the subtransaction that
- * rd_firstRelfilenodeSubid denotes would restore rd_node to the value it
+ * rd_firstRelfilelocatorSubid denotes would restore rd_locator to the value it
* had at the start of the current top transaction. Rolling back any
* lower subtransaction would not.) Their accuracy is critical to
* RelationNeedsWAL().
*
- * rd_newRelfilenodeSubid is the ID of the highest subtransaction the
- * most-recent relfilenode change has survived into or zero if not changed
+ * rd_newRelfilelocatorSubid is the ID of the highest subtransaction the
+ * most-recent relfilenumber change has survived into or zero if not changed
* in the current transaction (or we have forgotten changing it). This
* field is accurate when non-zero, but it can be zero when a relation has
- * multiple new relfilenodes within a single transaction, with one of them
+ * multiple new relfilenumbers within a single transaction, with one of them
* occurring in a subsequently aborted subtransaction, e.g.
* BEGIN;
* TRUNCATE t;
* SAVEPOINT save;
* TRUNCATE t;
* ROLLBACK TO save;
- * -- rd_newRelfilenodeSubid is now forgotten
+ * -- rd_newRelfilelocatorSubid is now forgotten
*
* If every rd_*Subid field is zero, they are read-only outside
- * relcache.c. Files that trigger rd_node changes by updating
+ * relcache.c. Files that trigger rd_locator changes by updating
* pg_class.reltablespace and/or pg_class.relfilenode call
- * RelationAssumeNewRelfilenode() to update rd_*Subid.
+ * RelationAssumeNewRelfilelocator() to update rd_*Subid.
*
* rd_droppedSubid is the ID of the highest subtransaction that a drop of
* the rel has survived into. In entries visible outside relcache.c, this
* is always zero.
*/
SubTransactionId rd_createSubid; /* rel was created in current xact */
- SubTransactionId rd_newRelfilenodeSubid; /* highest subxact changing
- * rd_node to current value */
- SubTransactionId rd_firstRelfilenodeSubid; /* highest subxact changing
- * rd_node to any value */
+ SubTransactionId rd_newRelfilelocatorSubid; /* highest subxact changing
+ * rd_locator to current value */
+ SubTransactionId rd_firstRelfilelocatorSubid; /* highest subxact changing
+ * rd_locator to any value */
SubTransactionId rd_droppedSubid; /* dropped with another Subid set */
Form_pg_class rd_rel; /* RELATION tuple */
@@ -531,7 +531,7 @@ typedef struct ViewOptions
/*
* RelationIsMapped
- * True if the relation uses the relfilenode map. Note multiple eval
+ * True if the relation uses the relfilenumber map. Note multiple eval
* of argument!
*/
#define RelationIsMapped(relation) \
@@ -555,7 +555,7 @@ static inline SMgrRelation
RelationGetSmgr(Relation rel)
{
if (unlikely(rel->rd_smgr == NULL))
- smgrsetowner(&(rel->rd_smgr), smgropen(rel->rd_node, rel->rd_backend));
+ smgrsetowner(&(rel->rd_smgr), smgropen(rel->rd_locator, rel->rd_backend));
return rel->rd_smgr;
}
@@ -607,12 +607,12 @@ RelationGetSmgr(Relation rel)
*
* Returns false if wal_level = minimal and this relation is created or
* truncated in the current transaction. See "Skipping WAL for New
- * RelFileNode" in src/backend/access/transam/README.
+ * RelFileLocator" in src/backend/access/transam/README.
*/
#define RelationNeedsWAL(relation) \
(RelationIsPermanent(relation) && (XLogIsNeeded() || \
(relation->rd_createSubid == InvalidSubTransactionId && \
- relation->rd_firstRelfilenodeSubid == InvalidSubTransactionId)))
+ relation->rd_firstRelfilelocatorSubid == InvalidSubTransactionId)))
/*
* RelationUsesLocalBuffers
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index c93d865..08b6cc2 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -103,7 +103,7 @@ extern Relation RelationBuildLocalRelation(const char *relname,
TupleDesc tupDesc,
Oid relid,
Oid accessmtd,
- Oid relfilenode,
+ Oid relfilenumber,
Oid reltablespace,
bool shared_relation,
bool mapped_relation,
@@ -111,10 +111,10 @@ extern Relation RelationBuildLocalRelation(const char *relname,
char relkind);
/*
- * Routines to manage assignment of new relfilenode to a relation
+ * Routines to manage assignment of new relfilenumber to a relation
*/
-extern void RelationSetNewRelfilenode(Relation relation, char persistence);
-extern void RelationAssumeNewRelfilenode(Relation relation);
+extern void RelationSetNewRelfilenumber(Relation relation, char persistence);
+extern void RelationAssumeNewRelfilelocator(Relation relation);
/*
* Routines for flushing/rebuilding relcache entries in various scenarios
diff --git a/src/include/utils/relfilenodemap.h b/src/include/utils/relfilenodemap.h
deleted file mode 100644
index 77d8046..0000000
--- a/src/include/utils/relfilenodemap.h
+++ /dev/null
@@ -1,18 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * relfilenodemap.h
- * relfilenode to oid mapping cache.
- *
- * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * src/include/utils/relfilenodemap.h
- *
- *-------------------------------------------------------------------------
- */
-#ifndef RELFILENODEMAP_H
-#define RELFILENODEMAP_H
-
-extern Oid RelidByRelfilenode(Oid reltablespace, Oid relfilenode);
-
-#endif /* RELFILENODEMAP_H */
diff --git a/src/include/utils/relfilenumbermap.h b/src/include/utils/relfilenumbermap.h
new file mode 100644
index 0000000..eac1db5
--- /dev/null
+++ b/src/include/utils/relfilenumbermap.h
@@ -0,0 +1,18 @@
+/*-------------------------------------------------------------------------
+ *
+ * relfilenumbermap.h
+ * relfilenumber to oid mapping cache.
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/relfilenumbermap.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef RELFILENUMBERMAP_H
+#define RELFILENUMBERMAP_H
+
+extern Oid RelidByRelfilenumber(Oid reltablespace, Oid relfilenumber);
+
+#endif /* RELFILENUMBERMAP_H */
diff --git a/src/include/utils/relmapper.h b/src/include/utils/relmapper.h
index 557f77e..7aeb031 100644
--- a/src/include/utils/relmapper.h
+++ b/src/include/utils/relmapper.h
@@ -1,7 +1,7 @@
/*-------------------------------------------------------------------------
*
* relmapper.h
- * Catalog-to-filenode mapping
+ * Catalog-to-filenumber mapping
*
*
* Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
@@ -35,13 +35,13 @@ typedef struct xl_relmap_update
#define MinSizeOfRelmapUpdate offsetof(xl_relmap_update, data)
-extern Oid RelationMapOidToFilenode(Oid relationId, bool shared);
+extern Oid RelationMapOidToFilenumber(Oid relationId, bool shared);
-extern Oid RelationMapFilenodeToOid(Oid relationId, bool shared);
-extern Oid RelationMapOidToFilenodeForDatabase(char *dbpath, Oid relationId);
+extern Oid RelationMapFilenumberToOid(Oid relationId, bool shared);
+extern Oid RelationMapOidToFilenumberForDatabase(char *dbpath, Oid relationId);
extern void RelationMapCopy(Oid dbid, Oid tsid, char *srcdbpath,
char *dstdbpath);
-extern void RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
+extern void RelationMapUpdateMap(Oid relationId, Oid fileNumber, bool shared,
bool immediate);
extern void RelationMapRemoveMapping(Oid relationId);
diff --git a/src/test/recovery/t/018_wal_optimize.pl b/src/test/recovery/t/018_wal_optimize.pl
index 4700d49..869d9d5 100644
--- a/src/test/recovery/t/018_wal_optimize.pl
+++ b/src/test/recovery/t/018_wal_optimize.pl
@@ -5,7 +5,7 @@
#
# These tests exercise code that once violated the mandate described in
# src/backend/access/transam/README section "Skipping WAL for New
-# RelFileNode". The tests work by committing some transactions, initiating an
+# RelFileLocator". The tests work by committing some transactions, initiating an
# immediate shutdown, and confirming that the expected data survives recovery.
# For many years, individual commands made the decision to skip WAL, hence the
# frequent appearance of COPY in these tests.
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 4fb7469..11b68b6 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2255,8 +2255,8 @@ ReindexObjectType
ReindexParams
ReindexStmt
ReindexType
-RelFileNode
-RelFileNodeBackend
+RelFileLocator
+RelFileLocatorBackend
RelIdCacheEnt
RelInfo
RelInfoArr
@@ -2274,8 +2274,8 @@ RelationPtr
RelationSyncEntry
RelcacheCallbackFunction
ReleaseMatchCB
-RelfilenodeMapEntry
-RelfilenodeMapKey
+RelfilenumberMapEntry
+RelfilenumberMapKey
Relids
RelocationBufferInfo
RelptrFreePageBtree
@@ -3877,7 +3877,7 @@ xl_xact_parsed_abort
xl_xact_parsed_commit
xl_xact_parsed_prepare
xl_xact_prepare
-xl_xact_relfilenodes
+xl_xact_relfilelocators
xl_xact_stats_item
xl_xact_stats_items
xl_xact_subxacts
--
1.8.3.1
[ changing subject line so nobody misses what's under discussion ]
For a quick summary of the overall idea being discussed here and some
discussion of the problems it solves, see
/messages/by-id/CA+TgmobM5FN5x0u3tSpoNvk_TZPFCdbcHxsXCoY1ytn1dXROvg@mail.gmail.com
For discussion of the proposed renaming of non-user-visible references
to relfilenode to either RelFileLocator or RelFileNumber as
preparatory refactoring work for that change, see
/messages/by-id/CA+TgmoamOtXbVAQf9hWFzonUo6bhhjS6toZQd7HZ-pmojtAmag@mail.gmail.com
On Thu, Jun 23, 2022 at 3:55 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
I have worked on this renaming stuff first, and once we agree on it
I will rebase the other patches on top and also work on the other
review comments for those patches.
So, basically, in this patch:
- The "RelFileNode" structure is renamed to "RelFileLocator", and the
other internal members are renamed as below:
typedef struct RelFileLocator
{
Oid spcOid; /* tablespace */
Oid dbOid; /* database */
Oid relNumber; /* relation */
} RelFileLocator;
I like those structure member names fine, but I'd like to see this
preliminary patch also introduce the RelFileNumber typedef as an alias
for Oid. Then the main patch can change it to be uint64.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Fri, Jun 24, 2022 at 1:36 AM Robert Haas <robertmhaas@gmail.com> wrote:
[ changing subject line so nobody misses what's under discussion ]
For a quick summary of the overall idea being discussed here and some
discussion of the problems it solves, see
/messages/by-id/CA+TgmobM5FN5x0u3tSpoNvk_TZPFCdbcHxsXCoY1ytn1dXROvg@mail.gmail.com
For discussion of the proposed renaming of non-user-visible references
to relfilenode to either RelFileLocator or RelFileNumber as
preparatory refactoring work for that change, see
/messages/by-id/CA+TgmoamOtXbVAQf9hWFzonUo6bhhjS6toZQd7HZ-pmojtAmag@mail.gmail.com
On Thu, Jun 23, 2022 at 3:55 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
I have worked on this renaming stuff first, and once we agree on it
I will rebase the other patches on top and also work on the other
review comments for those patches.
So, basically, in this patch:
- The "RelFileNode" structure is renamed to "RelFileLocator", and the
other internal members are renamed as below:
typedef struct RelFileLocator
{
Oid spcOid; /* tablespace */
Oid dbOid; /* database */
Oid relNumber; /* relation */
} RelFileLocator;
I like those structure member names fine, but I'd like to see this
preliminary patch also introduce the RelFileNumber typedef as an alias
for Oid. Then the main patch can change it to be uint64.
I have changed that. PFA, the updated patch.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachments:
v2-0001-Rename-RelFileNode-to-RelFileLocator-and-relNode-.patch (text/x-patch; charset=US-ASCII)
From 3f9b73619d8da33742e1718d047fd634f47458b8 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Tue, 21 Jun 2022 14:04:01 +0530
Subject: [PATCH v2] Rename RelFileNode to RelFileLocator and relNode to
RelNumber
Currently, the way relfilenode and relnode are used is really confusing.
There is some precedent for calling the number that pertains to the file
on disk "relnode", and that value combined with the database and
tablespace OIDs "relfilenode", but it is not the most obvious naming,
and the terminology is not used uniformly either.
So, as part of this patchset, these identifiers are renamed to better
reflect their usage: the RelFileNode struct is renamed to RelFileLocator,
and all related variable declarations from relfilenode to relfilelocator.
Within RelFileLocator, relNode is renamed to relNumber, and along with
that dbNode and spcNode are renamed to dbOid and spcOid. All other
references to relnode/relfilenode that refer to the on-disk file are
renamed to relnumber/relfilenumber.
---
contrib/bloom/blinsert.c | 2 +-
contrib/oid2name/oid2name.c | 28 +--
contrib/pg_buffercache/pg_buffercache_pages.c | 10 +-
contrib/pg_prewarm/autoprewarm.c | 26 +--
contrib/pg_visibility/pg_visibility.c | 2 +-
src/backend/access/common/syncscan.c | 29 +--
src/backend/access/gin/ginbtree.c | 2 +-
src/backend/access/gin/ginfast.c | 2 +-
src/backend/access/gin/ginutil.c | 2 +-
src/backend/access/gin/ginxlog.c | 6 +-
src/backend/access/gist/gistbuild.c | 4 +-
src/backend/access/gist/gistxlog.c | 11 +-
src/backend/access/hash/hash_xlog.c | 6 +-
src/backend/access/hash/hashpage.c | 4 +-
src/backend/access/heap/heapam.c | 78 +++----
src/backend/access/heap/heapam_handler.c | 26 +--
src/backend/access/heap/rewriteheap.c | 10 +-
src/backend/access/heap/visibilitymap.c | 4 +-
src/backend/access/nbtree/nbtpage.c | 2 +-
src/backend/access/nbtree/nbtree.c | 2 +-
src/backend/access/nbtree/nbtsort.c | 2 +-
src/backend/access/nbtree/nbtxlog.c | 8 +-
src/backend/access/rmgrdesc/genericdesc.c | 2 +-
src/backend/access/rmgrdesc/gindesc.c | 2 +-
src/backend/access/rmgrdesc/gistdesc.c | 6 +-
src/backend/access/rmgrdesc/heapdesc.c | 6 +-
src/backend/access/rmgrdesc/nbtdesc.c | 4 +-
src/backend/access/rmgrdesc/seqdesc.c | 4 +-
src/backend/access/rmgrdesc/smgrdesc.c | 4 +-
src/backend/access/rmgrdesc/xactdesc.c | 44 ++--
src/backend/access/rmgrdesc/xlogdesc.c | 10 +-
src/backend/access/spgist/spginsert.c | 6 +-
src/backend/access/spgist/spgxlog.c | 6 +-
src/backend/access/table/tableamapi.c | 2 +-
src/backend/access/transam/README | 14 +-
src/backend/access/transam/README.parallel | 2 +-
src/backend/access/transam/twophase.c | 38 ++--
src/backend/access/transam/varsup.c | 2 +-
src/backend/access/transam/xact.c | 40 ++--
src/backend/access/transam/xloginsert.c | 38 ++--
src/backend/access/transam/xlogprefetcher.c | 96 ++++----
src/backend/access/transam/xlogreader.c | 25 ++-
src/backend/access/transam/xlogrecovery.c | 18 +-
src/backend/access/transam/xlogutils.c | 73 +++---
src/backend/bootstrap/bootparse.y | 8 +-
src/backend/catalog/catalog.c | 30 +--
src/backend/catalog/heap.c | 56 ++---
src/backend/catalog/index.c | 37 +--
src/backend/catalog/storage.c | 119 +++++-----
src/backend/commands/cluster.c | 46 ++--
src/backend/commands/copyfrom.c | 6 +-
src/backend/commands/dbcommands.c | 104 ++++-----
src/backend/commands/indexcmds.c | 14 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/sequence.c | 29 +--
src/backend/commands/tablecmds.c | 87 ++++----
src/backend/commands/tablespace.c | 18 +-
src/backend/nodes/copyfuncs.c | 4 +-
src/backend/nodes/equalfuncs.c | 4 +-
src/backend/nodes/outfuncs.c | 4 +-
src/backend/parser/gram.y | 8 +-
src/backend/parser/parse_utilcmd.c | 8 +-
src/backend/postmaster/checkpointer.c | 2 +-
src/backend/replication/logical/decode.c | 40 ++--
src/backend/replication/logical/reorderbuffer.c | 50 ++---
src/backend/replication/logical/snapbuild.c | 2 +-
src/backend/storage/buffer/bufmgr.c | 284 ++++++++++++------------
src/backend/storage/buffer/localbuf.c | 34 +--
src/backend/storage/freespace/freespace.c | 6 +-
src/backend/storage/freespace/fsmpage.c | 6 +-
src/backend/storage/ipc/standby.c | 8 +-
src/backend/storage/lmgr/predicate.c | 24 +-
src/backend/storage/smgr/README | 2 +-
src/backend/storage/smgr/md.c | 126 +++++------
src/backend/storage/smgr/smgr.c | 44 ++--
src/backend/utils/adt/dbsize.c | 64 +++---
src/backend/utils/adt/pg_upgrade_support.c | 14 +-
src/backend/utils/cache/Makefile | 2 +-
src/backend/utils/cache/inval.c | 16 +-
src/backend/utils/cache/relcache.c | 180 +++++++--------
src/backend/utils/cache/relfilenodemap.c | 244 --------------------
src/backend/utils/cache/relfilenumbermap.c | 244 ++++++++++++++++++++
src/backend/utils/cache/relmapper.c | 81 +++----
src/bin/pg_dump/pg_dump.c | 36 +--
src/bin/pg_rewind/datapagemap.h | 2 +-
src/bin/pg_rewind/filemap.c | 34 +--
src/bin/pg_rewind/filemap.h | 4 +-
src/bin/pg_rewind/parsexlog.c | 10 +-
src/bin/pg_rewind/pg_rewind.h | 2 +-
src/bin/pg_upgrade/Makefile | 2 +-
src/bin/pg_upgrade/info.c | 10 +-
src/bin/pg_upgrade/pg_upgrade.h | 22 +-
src/bin/pg_upgrade/relfilenode.c | 259 ---------------------
src/bin/pg_upgrade/relfilenumber.c | 259 +++++++++++++++++++++
src/bin/pg_waldump/pg_waldump.c | 26 +--
src/common/relpath.c | 48 ++--
src/include/access/brin_xlog.h | 2 +-
src/include/access/ginxlog.h | 4 +-
src/include/access/gistxlog.h | 2 +-
src/include/access/heapam_xlog.h | 8 +-
src/include/access/nbtxlog.h | 4 +-
src/include/access/rewriteheap.h | 6 +-
src/include/access/tableam.h | 59 ++---
src/include/access/xact.h | 26 +--
src/include/access/xlog_internal.h | 2 +-
src/include/access/xloginsert.h | 8 +-
src/include/access/xlogreader.h | 6 +-
src/include/access/xlogrecord.h | 8 +-
src/include/access/xlogutils.h | 8 +-
src/include/catalog/binary_upgrade.h | 6 +-
src/include/catalog/catalog.h | 5 +-
src/include/catalog/heap.h | 2 +-
src/include/catalog/index.h | 2 +-
src/include/catalog/storage.h | 10 +-
src/include/catalog/storage_xlog.h | 8 +-
src/include/commands/sequence.h | 4 +-
src/include/commands/tablecmds.h | 2 +-
src/include/commands/tablespace.h | 2 +-
src/include/common/relpath.h | 24 +-
src/include/nodes/parsenodes.h | 8 +-
src/include/postgres_ext.h | 7 +
src/include/postmaster/bgwriter.h | 2 +-
src/include/replication/reorderbuffer.h | 6 +-
src/include/storage/buf_internals.h | 28 +--
src/include/storage/bufmgr.h | 16 +-
src/include/storage/freespace.h | 4 +-
src/include/storage/md.h | 6 +-
src/include/storage/relfilelocator.h | 99 +++++++++
src/include/storage/relfilenode.h | 99 ---------
src/include/storage/sinval.h | 4 +-
src/include/storage/smgr.h | 12 +-
src/include/storage/standby.h | 6 +-
src/include/storage/sync.h | 4 +-
src/include/utils/inval.h | 4 +-
src/include/utils/rel.h | 46 ++--
src/include/utils/relcache.h | 8 +-
src/include/utils/relfilenodemap.h | 18 --
src/include/utils/relfilenumbermap.h | 19 ++
src/include/utils/relmapper.h | 12 +-
src/test/recovery/t/018_wal_optimize.pl | 2 +-
src/tools/pgindent/typedefs.list | 10 +-
141 files changed, 2067 insertions(+), 2040 deletions(-)
delete mode 100644 src/backend/utils/cache/relfilenodemap.c
create mode 100644 src/backend/utils/cache/relfilenumbermap.c
delete mode 100644 src/bin/pg_upgrade/relfilenode.c
create mode 100644 src/bin/pg_upgrade/relfilenumber.c
create mode 100644 src/include/storage/relfilelocator.h
delete mode 100644 src/include/storage/relfilenode.h
delete mode 100644 src/include/utils/relfilenodemap.h
create mode 100644 src/include/utils/relfilenumbermap.h
diff --git a/contrib/bloom/blinsert.c b/contrib/bloom/blinsert.c
index 82378db..e64291e 100644
--- a/contrib/bloom/blinsert.c
+++ b/contrib/bloom/blinsert.c
@@ -179,7 +179,7 @@ blbuildempty(Relation index)
PageSetChecksumInplace(metapage, BLOOM_METAPAGE_BLKNO);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, BLOOM_METAPAGE_BLKNO,
(char *) metapage, true);
- log_newpage(&(RelationGetSmgr(index))->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&(RelationGetSmgr(index))->smgr_rlocator.locator, INIT_FORKNUM,
BLOOM_METAPAGE_BLKNO, metapage, true);
/*
diff --git a/contrib/oid2name/oid2name.c b/contrib/oid2name/oid2name.c
index a62a5ee..2e08bc7 100644
--- a/contrib/oid2name/oid2name.c
+++ b/contrib/oid2name/oid2name.c
@@ -30,7 +30,7 @@ struct options
{
eary *tables;
eary *oids;
- eary *filenodes;
+ eary *filenumbers;
bool quiet;
bool systables;
@@ -125,9 +125,9 @@ get_opts(int argc, char **argv, struct options *my_opts)
my_opts->dbname = pg_strdup(optarg);
break;
- /* specify one filenode to show */
+ /* specify one filenumber to show */
case 'f':
- add_one_elt(optarg, my_opts->filenodes);
+ add_one_elt(optarg, my_opts->filenumbers);
break;
/* host to connect to */
@@ -494,7 +494,7 @@ sql_exec_dumpalltables(PGconn *conn, struct options *opts)
}
/*
- * Show oid, filenode, name, schema and tablespace for each of the
+ * Show oid, filenumber, name, schema and tablespace for each of the
* given objects in the current database.
*/
void
@@ -504,19 +504,19 @@ sql_exec_searchtables(PGconn *conn, struct options *opts)
char *qualifiers,
*ptr;
char *comma_oids,
- *comma_filenodes,
+ *comma_filenumbers,
*comma_tables;
bool written = false;
char *addfields = ",c.oid AS \"Oid\", nspname AS \"Schema\", spcname as \"Tablespace\" ";
- /* get tables qualifiers, whether names, filenodes, or OIDs */
+ /* get tables qualifiers, whether names, filenumbers, or OIDs */
comma_oids = get_comma_elts(opts->oids);
comma_tables = get_comma_elts(opts->tables);
- comma_filenodes = get_comma_elts(opts->filenodes);
+ comma_filenumbers = get_comma_elts(opts->filenumbers);
/* 80 extra chars for SQL expression */
qualifiers = (char *) pg_malloc(strlen(comma_oids) + strlen(comma_tables) +
- strlen(comma_filenodes) + 80);
+ strlen(comma_filenumbers) + 80);
ptr = qualifiers;
if (opts->oids->num > 0)
@@ -524,11 +524,11 @@ sql_exec_searchtables(PGconn *conn, struct options *opts)
ptr += sprintf(ptr, "c.oid IN (%s)", comma_oids);
written = true;
}
- if (opts->filenodes->num > 0)
+ if (opts->filenumbers->num > 0)
{
if (written)
ptr += sprintf(ptr, " OR ");
- ptr += sprintf(ptr, "pg_catalog.pg_relation_filenode(c.oid) IN (%s)", comma_filenodes);
+ ptr += sprintf(ptr, "pg_catalog.pg_relation_filenode(c.oid) IN (%s)", comma_filenumbers);
written = true;
}
if (opts->tables->num > 0)
@@ -539,7 +539,7 @@ sql_exec_searchtables(PGconn *conn, struct options *opts)
}
free(comma_oids);
free(comma_tables);
- free(comma_filenodes);
+ free(comma_filenumbers);
/* now build the query */
todo = psprintf("SELECT pg_catalog.pg_relation_filenode(c.oid) as \"Filenode\", relname as \"Table Name\" %s\n"
@@ -588,11 +588,11 @@ main(int argc, char **argv)
my_opts->oids = (eary *) pg_malloc(sizeof(eary));
my_opts->tables = (eary *) pg_malloc(sizeof(eary));
- my_opts->filenodes = (eary *) pg_malloc(sizeof(eary));
+ my_opts->filenumbers = (eary *) pg_malloc(sizeof(eary));
my_opts->oids->num = my_opts->oids->alloc = 0;
my_opts->tables->num = my_opts->tables->alloc = 0;
- my_opts->filenodes->num = my_opts->filenodes->alloc = 0;
+ my_opts->filenumbers->num = my_opts->filenumbers->alloc = 0;
/* parse the opts */
get_opts(argc, argv, my_opts);
@@ -618,7 +618,7 @@ main(int argc, char **argv)
/* display the given elements in the database */
if (my_opts->oids->num > 0 ||
my_opts->tables->num > 0 ||
- my_opts->filenodes->num > 0)
+ my_opts->filenumbers->num > 0)
{
if (!my_opts->quiet)
printf("From database \"%s\":\n", my_opts->dbname);
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 1bd579f..713f52a 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -26,7 +26,7 @@ PG_MODULE_MAGIC;
typedef struct
{
uint32 bufferid;
- Oid relfilenode;
+ RelFileNumber relfilenumber;
Oid reltablespace;
Oid reldatabase;
ForkNumber forknum;
@@ -153,9 +153,9 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenode = bufHdr->tag.rnode.relNode;
- fctx->record[i].reltablespace = bufHdr->tag.rnode.spcNode;
- fctx->record[i].reldatabase = bufHdr->tag.rnode.dbNode;
+ fctx->record[i].relfilenumber = bufHdr->tag.rlocator.relNumber;
+ fctx->record[i].reltablespace = bufHdr->tag.rlocator.spcOid;
+ fctx->record[i].reldatabase = bufHdr->tag.rlocator.dbOid;
fctx->record[i].forknum = bufHdr->tag.forkNum;
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
@@ -209,7 +209,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
}
else
{
- values[1] = ObjectIdGetDatum(fctx->record[i].relfilenode);
+ values[1] = ObjectIdGetDatum(fctx->record[i].relfilenumber);
nulls[1] = false;
values[2] = ObjectIdGetDatum(fctx->record[i].reltablespace);
nulls[2] = false;
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index c0c4f5d..7f1d55c 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -52,7 +52,7 @@
#include "utils/guc.h"
#include "utils/memutils.h"
#include "utils/rel.h"
-#include "utils/relfilenodemap.h"
+#include "utils/relfilenumbermap.h"
#include "utils/resowner.h"
#define AUTOPREWARM_FILE "autoprewarm.blocks"
@@ -62,7 +62,7 @@ typedef struct BlockInfoRecord
{
Oid database;
Oid tablespace;
- Oid filenode;
+ RelFileNumber filenumber;
ForkNumber forknum;
BlockNumber blocknum;
} BlockInfoRecord;
@@ -347,7 +347,7 @@ apw_load_buffers(void)
unsigned forknum;
if (fscanf(file, "%u,%u,%u,%u,%u\n", &blkinfo[i].database,
- &blkinfo[i].tablespace, &blkinfo[i].filenode,
+ &blkinfo[i].tablespace, &blkinfo[i].filenumber,
&forknum, &blkinfo[i].blocknum) != 5)
ereport(ERROR,
(errmsg("autoprewarm block dump file is corrupted at line %d",
@@ -494,7 +494,7 @@ autoprewarm_database_main(Datum main_arg)
* relation. Note that rel will be NULL if try_relation_open failed
* previously; in that case, there is nothing to close.
*/
- if (old_blk != NULL && old_blk->filenode != blk->filenode &&
+ if (old_blk != NULL && old_blk->filenumber != blk->filenumber &&
rel != NULL)
{
relation_close(rel, AccessShareLock);
@@ -506,13 +506,13 @@ autoprewarm_database_main(Datum main_arg)
* Try to open each new relation, but only once, when we first
* encounter it. If it's been dropped, skip the associated blocks.
*/
- if (old_blk == NULL || old_blk->filenode != blk->filenode)
+ if (old_blk == NULL || old_blk->filenumber != blk->filenumber)
{
Oid reloid;
Assert(rel == NULL);
StartTransactionCommand();
- reloid = RelidByRelfilenode(blk->tablespace, blk->filenode);
+ reloid = RelidByRelfilenumber(blk->tablespace, blk->filenumber);
if (OidIsValid(reloid))
rel = try_relation_open(reloid, AccessShareLock);
@@ -527,7 +527,7 @@ autoprewarm_database_main(Datum main_arg)
/* Once per fork, check for fork existence and size. */
if (old_blk == NULL ||
- old_blk->filenode != blk->filenode ||
+ old_blk->filenumber != blk->filenumber ||
old_blk->forknum != blk->forknum)
{
/*
@@ -631,9 +631,9 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
if (buf_state & BM_TAG_VALID &&
((buf_state & BM_PERMANENT) || dump_unlogged))
{
- block_info_array[num_blocks].database = bufHdr->tag.rnode.dbNode;
- block_info_array[num_blocks].tablespace = bufHdr->tag.rnode.spcNode;
- block_info_array[num_blocks].filenode = bufHdr->tag.rnode.relNode;
+ block_info_array[num_blocks].database = bufHdr->tag.rlocator.dbOid;
+ block_info_array[num_blocks].tablespace = bufHdr->tag.rlocator.spcOid;
+ block_info_array[num_blocks].filenumber = bufHdr->tag.rlocator.relNumber;
block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
@@ -671,7 +671,7 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
ret = fprintf(file, "%u,%u,%u,%u,%u\n",
block_info_array[i].database,
block_info_array[i].tablespace,
- block_info_array[i].filenode,
+ block_info_array[i].filenumber,
(uint32) block_info_array[i].forknum,
block_info_array[i].blocknum);
if (ret < 0)
@@ -900,7 +900,7 @@ do { \
* We depend on all records for a particular database being consecutive
* in the dump file; each per-database worker will preload blocks until
* it sees a block for some other database. Sorting by tablespace,
- * filenode, forknum, and blocknum isn't critical for correctness, but
+ * filenumber, forknum, and blocknum isn't critical for correctness, but
* helps us get a sequential I/O pattern.
*/
static int
@@ -911,7 +911,7 @@ apw_compare_blockinfo(const void *p, const void *q)
cmp_member_elem(database);
cmp_member_elem(tablespace);
- cmp_member_elem(filenode);
+ cmp_member_elem(filenumber);
cmp_member_elem(forknum);
cmp_member_elem(blocknum);
diff --git a/contrib/pg_visibility/pg_visibility.c b/contrib/pg_visibility/pg_visibility.c
index 1853c35..4e2e9ea 100644
--- a/contrib/pg_visibility/pg_visibility.c
+++ b/contrib/pg_visibility/pg_visibility.c
@@ -407,7 +407,7 @@ pg_truncate_visibility_map(PG_FUNCTION_ARGS)
xl_smgr_truncate xlrec;
xlrec.blkno = 0;
- xlrec.rnode = rel->rd_node;
+ xlrec.rlocator = rel->rd_locator;
xlrec.flags = SMGR_TRUNCATE_VM;
XLogBeginInsert();
diff --git a/src/backend/access/common/syncscan.c b/src/backend/access/common/syncscan.c
index d5b16c5..e3add81 100644
--- a/src/backend/access/common/syncscan.c
+++ b/src/backend/access/common/syncscan.c
@@ -90,7 +90,7 @@ bool trace_syncscan = false;
*/
typedef struct ss_scan_location_t
{
- RelFileNode relfilenode; /* identity of a relation */
+ RelFileLocator relfilelocator; /* identity of a relation */
BlockNumber location; /* last-reported location in the relation */
} ss_scan_location_t;
@@ -115,7 +115,7 @@ typedef struct ss_scan_locations_t
static ss_scan_locations_t *scan_locations;
/* prototypes for internal functions */
-static BlockNumber ss_search(RelFileNode relfilenode,
+static BlockNumber ss_search(RelFileLocator relfilelocator,
BlockNumber location, bool set);
@@ -159,9 +159,9 @@ SyncScanShmemInit(void)
* these invalid entries will fall off the LRU list and get
* replaced with real entries.
*/
- item->location.relfilenode.spcNode = InvalidOid;
- item->location.relfilenode.dbNode = InvalidOid;
- item->location.relfilenode.relNode = InvalidOid;
+ item->location.relfilelocator.spcOid = InvalidOid;
+ item->location.relfilelocator.dbOid = InvalidOid;
+ item->location.relfilelocator.relNumber = InvalidOid;
item->location.location = InvalidBlockNumber;
item->prev = (i > 0) ?
@@ -176,10 +176,10 @@ SyncScanShmemInit(void)
/*
* ss_search --- search the scan_locations structure for an entry with the
- * given relfilenode.
+ * given relfilelocator.
*
* If "set" is true, the location is updated to the given location. If no
- * entry for the given relfilenode is found, it will be created at the head
+ * entry for the given relfilelocator is found, it will be created at the head
* of the list with the given location, even if "set" is false.
*
* In any case, the location after possible update is returned.
@@ -188,7 +188,7 @@ SyncScanShmemInit(void)
* data structure.
*/
static BlockNumber
-ss_search(RelFileNode relfilenode, BlockNumber location, bool set)
+ss_search(RelFileLocator relfilelocator, BlockNumber location, bool set)
{
ss_lru_item_t *item;
@@ -197,7 +197,8 @@ ss_search(RelFileNode relfilenode, BlockNumber location, bool set)
{
bool match;
- match = RelFileNodeEquals(item->location.relfilenode, relfilenode);
+ match = RelFileLocatorEquals(item->location.relfilelocator,
+ relfilelocator);
if (match || item->next == NULL)
{
@@ -207,7 +208,7 @@ ss_search(RelFileNode relfilenode, BlockNumber location, bool set)
*/
if (!match)
{
- item->location.relfilenode = relfilenode;
+ item->location.relfilelocator = relfilelocator;
item->location.location = location;
}
else if (set)
@@ -255,7 +256,7 @@ ss_get_location(Relation rel, BlockNumber relnblocks)
BlockNumber startloc;
LWLockAcquire(SyncScanLock, LW_EXCLUSIVE);
- startloc = ss_search(rel->rd_node, 0, false);
+ startloc = ss_search(rel->rd_locator, 0, false);
LWLockRelease(SyncScanLock);
/*
@@ -281,8 +282,8 @@ ss_get_location(Relation rel, BlockNumber relnblocks)
* ss_report_location --- update the current scan location
*
* Writes an entry into the shared Sync Scan state of the form
- * (relfilenode, blocknumber), overwriting any existing entry for the
- * same relfilenode.
+ * (relfilelocator, blocknumber), overwriting any existing entry for the
+ * same relfilelocator.
*/
void
ss_report_location(Relation rel, BlockNumber location)
@@ -309,7 +310,7 @@ ss_report_location(Relation rel, BlockNumber location)
{
if (LWLockConditionalAcquire(SyncScanLock, LW_EXCLUSIVE))
{
- (void) ss_search(rel->rd_node, location, true);
+ (void) ss_search(rel->rd_locator, location, true);
LWLockRelease(SyncScanLock);
}
#ifdef TRACE_SYNCSCAN
diff --git a/src/backend/access/gin/ginbtree.c b/src/backend/access/gin/ginbtree.c
index cc6d4e6..c75bfc2 100644
--- a/src/backend/access/gin/ginbtree.c
+++ b/src/backend/access/gin/ginbtree.c
@@ -470,7 +470,7 @@ ginPlaceToPage(GinBtree btree, GinBtreeStack *stack,
savedRightLink = GinPageGetOpaque(page)->rightlink;
/* Begin setting up WAL record */
- data.node = btree->index->rd_node;
+ data.locator = btree->index->rd_locator;
data.flags = xlflags;
if (BufferIsValid(childbuf))
{
diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
index 7409fdc..6c67744 100644
--- a/src/backend/access/gin/ginfast.c
+++ b/src/backend/access/gin/ginfast.c
@@ -235,7 +235,7 @@ ginHeapTupleFastInsert(GinState *ginstate, GinTupleCollector *collector)
needWal = RelationNeedsWAL(index);
- data.node = index->rd_node;
+ data.locator = index->rd_locator;
data.ntuples = 0;
data.newRightlink = data.prevTail = InvalidBlockNumber;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 20f4706..6df7f2e 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -688,7 +688,7 @@ ginUpdateStats(Relation index, const GinStatsData *stats, bool is_build)
XLogRecPtr recptr;
ginxlogUpdateMeta data;
- data.node = index->rd_node;
+ data.locator = index->rd_locator;
data.ntuples = 0;
data.newRightlink = data.prevTail = InvalidBlockNumber;
memcpy(&data.metadata, metadata, sizeof(GinMetaPageData));
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 87e8366..41b9211 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -95,13 +95,13 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
if (PageAddItem(page, (Item) itup, IndexTupleSize(itup), offset, false, false) == InvalidOffsetNumber)
{
- RelFileNode node;
+ RelFileLocator locator;
ForkNumber forknum;
BlockNumber blknum;
- BufferGetTag(buffer, &node, &forknum, &blknum);
+ BufferGetTag(buffer, &locator, &forknum, &blknum);
elog(ERROR, "failed to add item to index page in %u/%u/%u",
- node.spcNode, node.dbNode, node.relNode);
+ locator.spcOid, locator.dbOid, locator.relNumber);
}
}
diff --git a/src/backend/access/gist/gistbuild.c b/src/backend/access/gist/gistbuild.c
index f5a5caf..374e64e 100644
--- a/src/backend/access/gist/gistbuild.c
+++ b/src/backend/access/gist/gistbuild.c
@@ -462,7 +462,7 @@ gist_indexsortbuild(GISTBuildState *state)
smgrwrite(RelationGetSmgr(state->indexrel), MAIN_FORKNUM, GIST_ROOT_BLKNO,
levelstate->pages[0], true);
if (RelationNeedsWAL(state->indexrel))
- log_newpage(&state->indexrel->rd_node, MAIN_FORKNUM, GIST_ROOT_BLKNO,
+ log_newpage(&state->indexrel->rd_locator, MAIN_FORKNUM, GIST_ROOT_BLKNO,
levelstate->pages[0], true);
pfree(levelstate->pages[0]);
@@ -663,7 +663,7 @@ gist_indexsortbuild_flush_ready_pages(GISTBuildState *state)
}
if (RelationNeedsWAL(state->indexrel))
- log_newpages(&state->indexrel->rd_node, MAIN_FORKNUM, state->ready_num_pages,
+ log_newpages(&state->indexrel->rd_locator, MAIN_FORKNUM, state->ready_num_pages,
state->ready_blknos, state->ready_pages, true);
for (int i = 0; i < state->ready_num_pages; i++)
diff --git a/src/backend/access/gist/gistxlog.c b/src/backend/access/gist/gistxlog.c
index df70f90..b4f629f 100644
--- a/src/backend/access/gist/gistxlog.c
+++ b/src/backend/access/gist/gistxlog.c
@@ -191,11 +191,12 @@ gistRedoDeleteRecord(XLogReaderState *record)
*/
if (InHotStandby)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, NULL);
- ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid, rnode);
+ ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid,
+ rlocator);
}
if (XLogReadBufferForRedo(record, 0, &buffer) == BLK_NEEDS_REDO)
@@ -395,7 +396,7 @@ gistRedoPageReuse(XLogReaderState *record)
*/
if (InHotStandby)
ResolveRecoveryConflictWithSnapshotFullXid(xlrec->latestRemovedFullXid,
- xlrec->node);
+ xlrec->locator);
}
void
@@ -607,7 +608,7 @@ gistXLogPageReuse(Relation rel, BlockNumber blkno, FullTransactionId latestRemov
*/
/* XLOG stuff */
- xlrec_reuse.node = rel->rd_node;
+ xlrec_reuse.locator = rel->rd_locator;
xlrec_reuse.block = blkno;
xlrec_reuse.latestRemovedFullXid = latestRemovedXid;
diff --git a/src/backend/access/hash/hash_xlog.c b/src/backend/access/hash/hash_xlog.c
index 62dbfc3..2e68303 100644
--- a/src/backend/access/hash/hash_xlog.c
+++ b/src/backend/access/hash/hash_xlog.c
@@ -999,10 +999,10 @@ hash_xlog_vacuum_one_page(XLogReaderState *record)
*/
if (InHotStandby)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
- ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid, rnode);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, NULL);
+ ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid, rlocator);
}
action = XLogReadBufferForRedoExtended(record, 0, RBM_NORMAL, true, &buffer);
diff --git a/src/backend/access/hash/hashpage.c b/src/backend/access/hash/hashpage.c
index 39206d1..d2edcd4 100644
--- a/src/backend/access/hash/hashpage.c
+++ b/src/backend/access/hash/hashpage.c
@@ -428,7 +428,7 @@ _hash_init(Relation rel, double num_tuples, ForkNumber forkNum)
MarkBufferDirty(buf);
if (use_wal)
- log_newpage(&rel->rd_node,
+ log_newpage(&rel->rd_locator,
forkNum,
blkno,
BufferGetPage(buf),
@@ -1019,7 +1019,7 @@ _hash_alloc_buckets(Relation rel, BlockNumber firstblock, uint32 nblocks)
ovflopaque->hasho_page_id = HASHO_PAGE_ID;
if (RelationNeedsWAL(rel))
- log_newpage(&rel->rd_node,
+ log_newpage(&rel->rd_locator,
MAIN_FORKNUM,
lastblock,
zerobuf.data,
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 637de11..aab8d6f 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -8189,7 +8189,7 @@ log_heap_freeze(Relation reln, Buffer buffer, TransactionId cutoff_xid,
* heap_buffer, if necessary.
*/
XLogRecPtr
-log_heap_visible(RelFileNode rnode, Buffer heap_buffer, Buffer vm_buffer,
+log_heap_visible(RelFileLocator rlocator, Buffer heap_buffer, Buffer vm_buffer,
TransactionId cutoff_xid, uint8 vmflags)
{
xl_heap_visible xlrec;
@@ -8454,7 +8454,7 @@ log_heap_new_cid(Relation relation, HeapTuple tup)
Assert(tup->t_tableOid != InvalidOid);
xlrec.top_xid = GetTopTransactionId();
- xlrec.target_node = relation->rd_node;
+ xlrec.target_locator = relation->rd_locator;
xlrec.target_tid = tup->t_self;
/*
@@ -8623,18 +8623,18 @@ heap_xlog_prune(XLogReaderState *record)
XLogRecPtr lsn = record->EndRecPtr;
xl_heap_prune *xlrec = (xl_heap_prune *) XLogRecGetData(record);
Buffer buffer;
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber blkno;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &blkno);
/*
* We're about to remove tuples. In Hot Standby mode, ensure that there's
* no queries running for which the removed tuples are still visible.
*/
if (InHotStandby)
- ResolveRecoveryConflictWithSnapshot(xlrec->latestRemovedXid, rnode);
+ ResolveRecoveryConflictWithSnapshot(xlrec->latestRemovedXid, rlocator);
/*
* If we have a full-page image, restore it (using a cleanup lock) and
@@ -8694,7 +8694,7 @@ heap_xlog_prune(XLogReaderState *record)
* Do this regardless of a full-page image being applied, since the
* FSM data is not in the page anyway.
*/
- XLogRecordPageWithFreeSpace(rnode, blkno, freespace);
+ XLogRecordPageWithFreeSpace(rlocator, blkno, freespace);
}
}
@@ -8751,9 +8751,9 @@ heap_xlog_vacuum(XLogReaderState *record)
if (BufferIsValid(buffer))
{
Size freespace = PageGetHeapFreeSpace(BufferGetPage(buffer));
- RelFileNode rnode;
+ RelFileLocator rlocator;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &blkno);
UnlockReleaseBuffer(buffer);
@@ -8766,7 +8766,7 @@ heap_xlog_vacuum(XLogReaderState *record)
* Do this regardless of a full-page image being applied, since the
* FSM data is not in the page anyway.
*/
- XLogRecordPageWithFreeSpace(rnode, blkno, freespace);
+ XLogRecordPageWithFreeSpace(rlocator, blkno, freespace);
}
}
@@ -8786,11 +8786,11 @@ heap_xlog_visible(XLogReaderState *record)
Buffer vmbuffer = InvalidBuffer;
Buffer buffer;
Page page;
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber blkno;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 1, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 1, &rlocator, NULL, &blkno);
/*
* If there are any Hot Standby transactions running that have an xmin
@@ -8802,7 +8802,7 @@ heap_xlog_visible(XLogReaderState *record)
* rather than killing the transaction outright.
*/
if (InHotStandby)
- ResolveRecoveryConflictWithSnapshot(xlrec->cutoff_xid, rnode);
+ ResolveRecoveryConflictWithSnapshot(xlrec->cutoff_xid, rlocator);
/*
* Read the heap page, if it still exists. If the heap file has dropped or
@@ -8865,7 +8865,7 @@ heap_xlog_visible(XLogReaderState *record)
* FSM data is not in the page anyway.
*/
if (xlrec->flags & VISIBILITYMAP_VALID_BITS)
- XLogRecordPageWithFreeSpace(rnode, blkno, space);
+ XLogRecordPageWithFreeSpace(rlocator, blkno, space);
}
/*
@@ -8890,7 +8890,7 @@ heap_xlog_visible(XLogReaderState *record)
*/
LockBuffer(vmbuffer, BUFFER_LOCK_UNLOCK);
- reln = CreateFakeRelcacheEntry(rnode);
+ reln = CreateFakeRelcacheEntry(rlocator);
visibilitymap_pin(reln, blkno, &vmbuffer);
/*
@@ -8933,13 +8933,13 @@ heap_xlog_freeze_page(XLogReaderState *record)
*/
if (InHotStandby)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
TransactionId latestRemovedXid = cutoff_xid;
TransactionIdRetreat(latestRemovedXid);
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
- ResolveRecoveryConflictWithSnapshot(latestRemovedXid, rnode);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, NULL);
+ ResolveRecoveryConflictWithSnapshot(latestRemovedXid, rlocator);
}
if (XLogReadBufferForRedo(record, 0, &buffer) == BLK_NEEDS_REDO)
@@ -9007,10 +9007,10 @@ heap_xlog_delete(XLogReaderState *record)
ItemId lp = NULL;
HeapTupleHeader htup;
BlockNumber blkno;
- RelFileNode target_node;
+ RelFileLocator target_locator;
ItemPointerData target_tid;
- XLogRecGetBlockTag(record, 0, &target_node, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &target_locator, NULL, &blkno);
ItemPointerSetBlockNumber(&target_tid, blkno);
ItemPointerSetOffsetNumber(&target_tid, xlrec->offnum);
@@ -9020,7 +9020,7 @@ heap_xlog_delete(XLogReaderState *record)
*/
if (xlrec->flags & XLH_DELETE_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(target_node);
+ Relation reln = CreateFakeRelcacheEntry(target_locator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, blkno, &vmbuffer);
@@ -9086,12 +9086,12 @@ heap_xlog_insert(XLogReaderState *record)
xl_heap_header xlhdr;
uint32 newlen;
Size freespace = 0;
- RelFileNode target_node;
+ RelFileLocator target_locator;
BlockNumber blkno;
ItemPointerData target_tid;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 0, &target_node, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &target_locator, NULL, &blkno);
ItemPointerSetBlockNumber(&target_tid, blkno);
ItemPointerSetOffsetNumber(&target_tid, xlrec->offnum);
@@ -9101,7 +9101,7 @@ heap_xlog_insert(XLogReaderState *record)
*/
if (xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(target_node);
+ Relation reln = CreateFakeRelcacheEntry(target_locator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, blkno, &vmbuffer);
@@ -9184,7 +9184,7 @@ heap_xlog_insert(XLogReaderState *record)
* totally accurate anyway.
*/
if (action == BLK_NEEDS_REDO && freespace < BLCKSZ / 5)
- XLogRecordPageWithFreeSpace(target_node, blkno, freespace);
+ XLogRecordPageWithFreeSpace(target_locator, blkno, freespace);
}
/*
@@ -9195,7 +9195,7 @@ heap_xlog_multi_insert(XLogReaderState *record)
{
XLogRecPtr lsn = record->EndRecPtr;
xl_heap_multi_insert *xlrec;
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber blkno;
Buffer buffer;
Page page;
@@ -9217,7 +9217,7 @@ heap_xlog_multi_insert(XLogReaderState *record)
*/
xlrec = (xl_heap_multi_insert *) XLogRecGetData(record);
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &blkno);
/* check that the mutually exclusive flags are not both set */
Assert(!((xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED) &&
@@ -9229,7 +9229,7 @@ heap_xlog_multi_insert(XLogReaderState *record)
*/
if (xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(rnode);
+ Relation reln = CreateFakeRelcacheEntry(rlocator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, blkno, &vmbuffer);
@@ -9331,7 +9331,7 @@ heap_xlog_multi_insert(XLogReaderState *record)
* totally accurate anyway.
*/
if (action == BLK_NEEDS_REDO && freespace < BLCKSZ / 5)
- XLogRecordPageWithFreeSpace(rnode, blkno, freespace);
+ XLogRecordPageWithFreeSpace(rlocator, blkno, freespace);
}
/*
@@ -9342,7 +9342,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
{
XLogRecPtr lsn = record->EndRecPtr;
xl_heap_update *xlrec = (xl_heap_update *) XLogRecGetData(record);
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber oldblk;
BlockNumber newblk;
ItemPointerData newtid;
@@ -9371,7 +9371,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
oldtup.t_data = NULL;
oldtup.t_len = 0;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &newblk);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &newblk);
if (XLogRecGetBlockTagExtended(record, 1, NULL, NULL, &oldblk, NULL))
{
/* HOT updates are never done across pages */
@@ -9388,7 +9388,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
*/
if (xlrec->flags & XLH_UPDATE_OLD_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(rnode);
+ Relation reln = CreateFakeRelcacheEntry(rlocator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, oldblk, &vmbuffer);
@@ -9472,7 +9472,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
*/
if (xlrec->flags & XLH_UPDATE_NEW_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(rnode);
+ Relation reln = CreateFakeRelcacheEntry(rlocator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, newblk, &vmbuffer);
@@ -9606,7 +9606,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
* totally accurate anyway.
*/
if (newaction == BLK_NEEDS_REDO && !hot_update && freespace < BLCKSZ / 5)
- XLogRecordPageWithFreeSpace(rnode, newblk, freespace);
+ XLogRecordPageWithFreeSpace(rlocator, newblk, freespace);
}
static void
@@ -9662,13 +9662,13 @@ heap_xlog_lock(XLogReaderState *record)
*/
if (xlrec->flags & XLH_LOCK_ALL_FROZEN_CLEARED)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
Buffer vmbuffer = InvalidBuffer;
BlockNumber block;
Relation reln;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &block);
- reln = CreateFakeRelcacheEntry(rnode);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &block);
+ reln = CreateFakeRelcacheEntry(rlocator);
visibilitymap_pin(reln, block, &vmbuffer);
visibilitymap_clear(reln, block, vmbuffer, VISIBILITYMAP_ALL_FROZEN);
@@ -9735,13 +9735,13 @@ heap_xlog_lock_updated(XLogReaderState *record)
*/
if (xlrec->flags & XLH_LOCK_ALL_FROZEN_CLEARED)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
Buffer vmbuffer = InvalidBuffer;
BlockNumber block;
Relation reln;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &block);
- reln = CreateFakeRelcacheEntry(rnode);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &block);
+ reln = CreateFakeRelcacheEntry(rlocator);
visibilitymap_pin(reln, block, &vmbuffer);
visibilitymap_clear(reln, block, vmbuffer, VISIBILITYMAP_ALL_FROZEN);
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 444f027..7f227be 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -566,11 +566,11 @@ tuple_lock_retry:
*/
static void
-heapam_relation_set_new_filenode(Relation rel,
- const RelFileNode *newrnode,
- char persistence,
- TransactionId *freezeXid,
- MultiXactId *minmulti)
+heapam_relation_set_new_filelocator(Relation rel,
+ const RelFileLocator *newrlocator,
+ char persistence,
+ TransactionId *freezeXid,
+ MultiXactId *minmulti)
{
SMgrRelation srel;
@@ -591,7 +591,7 @@ heapam_relation_set_new_filenode(Relation rel,
*/
*minmulti = GetOldestMultiXactId();
- srel = RelationCreateStorage(*newrnode, persistence, true);
+ srel = RelationCreateStorage(*newrlocator, persistence, true);
/*
* If required, set up an init fork for an unlogged table so that it can
@@ -608,7 +608,7 @@ heapam_relation_set_new_filenode(Relation rel,
rel->rd_rel->relkind == RELKIND_MATVIEW ||
rel->rd_rel->relkind == RELKIND_TOASTVALUE);
smgrcreate(srel, INIT_FORKNUM, false);
- log_smgrcreate(newrnode, INIT_FORKNUM);
+ log_smgrcreate(newrlocator, INIT_FORKNUM);
smgrimmedsync(srel, INIT_FORKNUM);
}
@@ -622,11 +622,11 @@ heapam_relation_nontransactional_truncate(Relation rel)
}
static void
-heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
+heapam_relation_copy_data(Relation rel, const RelFileLocator *newrlocator)
{
SMgrRelation dstrel;
- dstrel = smgropen(*newrnode, rel->rd_backend);
+ dstrel = smgropen(*newrlocator, rel->rd_backend);
/*
* Since we copy the file directly without looking at the shared buffers,
@@ -640,10 +640,10 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
* Create and copy all forks of the relation, and schedule unlinking of
* old physical files.
*
- * NOTE: any conflict in relfilenode value will be caught in
+ * NOTE: any conflict in relfilelocator value will be caught in
* RelationCreateStorage().
*/
- RelationCreateStorage(*newrnode, rel->rd_rel->relpersistence, true);
+ RelationCreateStorage(*newrlocator, rel->rd_rel->relpersistence, true);
/* copy main fork */
RelationCopyStorage(RelationGetSmgr(rel), dstrel, MAIN_FORKNUM,
@@ -664,7 +664,7 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
if (RelationIsPermanent(rel) ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
- log_smgrcreate(newrnode, forkNum);
+ log_smgrcreate(newrlocator, forkNum);
RelationCopyStorage(RelationGetSmgr(rel), dstrel, forkNum,
rel->rd_rel->relpersistence);
}
@@ -2569,7 +2569,7 @@ static const TableAmRoutine heapam_methods = {
.tuple_satisfies_snapshot = heapam_tuple_satisfies_snapshot,
.index_delete_tuples = heap_index_delete_tuples,
- .relation_set_new_filenode = heapam_relation_set_new_filenode,
+ .relation_set_new_filelocator = heapam_relation_set_new_filelocator,
.relation_nontransactional_truncate = heapam_relation_nontransactional_truncate,
.relation_copy_data = heapam_relation_copy_data,
.relation_copy_for_cluster = heapam_relation_copy_for_cluster,
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 2a53826..197f06b 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -318,7 +318,7 @@ end_heap_rewrite(RewriteState state)
if (state->rs_buffer_valid)
{
if (RelationNeedsWAL(state->rs_new_rel))
- log_newpage(&state->rs_new_rel->rd_node,
+ log_newpage(&state->rs_new_rel->rd_locator,
MAIN_FORKNUM,
state->rs_blockno,
state->rs_buffer,
@@ -679,7 +679,7 @@ raw_heap_insert(RewriteState state, HeapTuple tup)
/* XLOG stuff */
if (RelationNeedsWAL(state->rs_new_rel))
- log_newpage(&state->rs_new_rel->rd_node,
+ log_newpage(&state->rs_new_rel->rd_locator,
MAIN_FORKNUM,
state->rs_blockno,
page,
@@ -742,7 +742,7 @@ raw_heap_insert(RewriteState state, HeapTuple tup)
* When doing logical decoding - which relies on using cmin/cmax of catalog
* tuples, via xl_heap_new_cid records - heap rewrites have to log enough
* information to allow the decoding backend to update its internal mapping
- * of (relfilenode,ctid) => (cmin, cmax) to be correct for the rewritten heap.
+ * of (relfilelocator,ctid) => (cmin, cmax) to be correct for the rewritten heap.
*
* For that, every time we find a tuple that's been modified in a catalog
* relation within the xmin horizon of any decoding slot, we log a mapping
@@ -1080,9 +1080,9 @@ logical_rewrite_heap_tuple(RewriteState state, ItemPointerData old_tid,
return;
/* fill out mapping information */
- map.old_node = state->rs_old_rel->rd_node;
+ map.old_locator = state->rs_old_rel->rd_locator;
map.old_tid = old_tid;
- map.new_node = state->rs_new_rel->rd_node;
+ map.new_locator = state->rs_new_rel->rd_locator;
map.new_tid = new_tid;
/* ---
diff --git a/src/backend/access/heap/visibilitymap.c b/src/backend/access/heap/visibilitymap.c
index e09f25a..ed72eb7 100644
--- a/src/backend/access/heap/visibilitymap.c
+++ b/src/backend/access/heap/visibilitymap.c
@@ -283,7 +283,7 @@ visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf,
if (XLogRecPtrIsInvalid(recptr))
{
Assert(!InRecovery);
- recptr = log_heap_visible(rel->rd_node, heapBuf, vmBuf,
+ recptr = log_heap_visible(rel->rd_locator, heapBuf, vmBuf,
cutoff_xid, flags);
/*
@@ -668,7 +668,7 @@ vm_extend(Relation rel, BlockNumber vm_nblocks)
* to keep checking for creation or extension of the file, which happens
* infrequently.
*/
- CacheInvalidateSmgr(reln->smgr_rnode);
+ CacheInvalidateSmgr(reln->smgr_rlocator);
UnlockRelationForExtension(rel, ExclusiveLock);
}
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 20adb60..8b96708 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -836,7 +836,7 @@ _bt_log_reuse_page(Relation rel, BlockNumber blkno, FullTransactionId safexid)
*/
/* XLOG stuff */
- xlrec_reuse.node = rel->rd_node;
+ xlrec_reuse.locator = rel->rd_locator;
xlrec_reuse.block = blkno;
xlrec_reuse.latestRemovedFullXid = safexid;
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 9b730f3..b52eca8 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -166,7 +166,7 @@ btbuildempty(Relation index)
PageSetChecksumInplace(metapage, BTREE_METAPAGE);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, BTREE_METAPAGE,
(char *) metapage, true);
- log_newpage(&RelationGetSmgr(index)->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&RelationGetSmgr(index)->smgr_rlocator.locator, INIT_FORKNUM,
BTREE_METAPAGE, metapage, true);
/*
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index 9f60fa9..bd1685c 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -647,7 +647,7 @@ _bt_blwritepage(BTWriteState *wstate, Page page, BlockNumber blkno)
if (wstate->btws_use_wal)
{
/* We use the XLOG_FPI record type for this */
- log_newpage(&wstate->index->rd_node, MAIN_FORKNUM, blkno, page, true);
+ log_newpage(&wstate->index->rd_locator, MAIN_FORKNUM, blkno, page, true);
}
/*
diff --git a/src/backend/access/nbtree/nbtxlog.c b/src/backend/access/nbtree/nbtxlog.c
index f9186ca..ad489e3 100644
--- a/src/backend/access/nbtree/nbtxlog.c
+++ b/src/backend/access/nbtree/nbtxlog.c
@@ -664,11 +664,11 @@ btree_xlog_delete(XLogReaderState *record)
*/
if (InHotStandby)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, NULL);
- ResolveRecoveryConflictWithSnapshot(xlrec->latestRemovedXid, rnode);
+ ResolveRecoveryConflictWithSnapshot(xlrec->latestRemovedXid, rlocator);
}
/*
@@ -1006,7 +1006,7 @@ btree_xlog_reuse_page(XLogReaderState *record)
if (InHotStandby)
ResolveRecoveryConflictWithSnapshotFullXid(xlrec->latestRemovedFullXid,
- xlrec->node);
+ xlrec->locator);
}
void
diff --git a/src/backend/access/rmgrdesc/genericdesc.c b/src/backend/access/rmgrdesc/genericdesc.c
index 877beb5..d8509b8 100644
--- a/src/backend/access/rmgrdesc/genericdesc.c
+++ b/src/backend/access/rmgrdesc/genericdesc.c
@@ -15,7 +15,7 @@
#include "access/generic_xlog.h"
#include "lib/stringinfo.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* Description of generic xlog record: write page regions that this record
diff --git a/src/backend/access/rmgrdesc/gindesc.c b/src/backend/access/rmgrdesc/gindesc.c
index 57f7bce..7d147ce 100644
--- a/src/backend/access/rmgrdesc/gindesc.c
+++ b/src/backend/access/rmgrdesc/gindesc.c
@@ -17,7 +17,7 @@
#include "access/ginxlog.h"
#include "access/xlogutils.h"
#include "lib/stringinfo.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
static void
desc_recompress_leaf(StringInfo buf, ginxlogRecompressDataLeaf *insertData)
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index d0c8e24..7dd3c1d 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -16,7 +16,7 @@
#include "access/gistxlog.h"
#include "lib/stringinfo.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
static void
out_gistxlogPageUpdate(StringInfo buf, gistxlogPageUpdate *xlrec)
@@ -27,8 +27,8 @@ static void
out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec)
{
appendStringInfo(buf, "rel %u/%u/%u; blk %u; latestRemovedXid %u:%u",
- xlrec->node.spcNode, xlrec->node.dbNode,
- xlrec->node.relNode, xlrec->block,
+ xlrec->locator.spcOid, xlrec->locator.dbOid,
+ xlrec->locator.relNumber, xlrec->block,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
XidFromFullTransactionId(xlrec->latestRemovedFullXid));
}
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 6238085..923d3bc 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -170,9 +170,9 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
xl_heap_new_cid *xlrec = (xl_heap_new_cid *) rec;
appendStringInfo(buf, "rel %u/%u/%u; tid %u/%u",
- xlrec->target_node.spcNode,
- xlrec->target_node.dbNode,
- xlrec->target_node.relNode,
+ xlrec->target_locator.spcOid,
+ xlrec->target_locator.dbOid,
+ xlrec->target_locator.relNumber,
ItemPointerGetBlockNumber(&(xlrec->target_tid)),
ItemPointerGetOffsetNumber(&(xlrec->target_tid)));
appendStringInfo(buf, "; cmin: %u, cmax: %u, combo: %u",
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index dfbbf4e..4843cd5 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -101,8 +101,8 @@ btree_desc(StringInfo buf, XLogReaderState *record)
xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
appendStringInfo(buf, "rel %u/%u/%u; latestRemovedXid %u:%u",
- xlrec->node.spcNode, xlrec->node.dbNode,
- xlrec->node.relNode,
+ xlrec->locator.spcOid, xlrec->locator.dbOid,
+ xlrec->locator.relNumber,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
XidFromFullTransactionId(xlrec->latestRemovedFullXid));
break;
diff --git a/src/backend/access/rmgrdesc/seqdesc.c b/src/backend/access/rmgrdesc/seqdesc.c
index d9b1e60..b3845f9 100644
--- a/src/backend/access/rmgrdesc/seqdesc.c
+++ b/src/backend/access/rmgrdesc/seqdesc.c
@@ -26,8 +26,8 @@ seq_desc(StringInfo buf, XLogReaderState *record)
if (info == XLOG_SEQ_LOG)
appendStringInfo(buf, "rel %u/%u/%u",
- xlrec->node.spcNode, xlrec->node.dbNode,
- xlrec->node.relNode);
+ xlrec->locator.spcOid, xlrec->locator.dbOid,
+ xlrec->locator.relNumber);
}
const char *
diff --git a/src/backend/access/rmgrdesc/smgrdesc.c b/src/backend/access/rmgrdesc/smgrdesc.c
index 7547813..e0ee8a0 100644
--- a/src/backend/access/rmgrdesc/smgrdesc.c
+++ b/src/backend/access/rmgrdesc/smgrdesc.c
@@ -26,7 +26,7 @@ smgr_desc(StringInfo buf, XLogReaderState *record)
if (info == XLOG_SMGR_CREATE)
{
xl_smgr_create *xlrec = (xl_smgr_create *) rec;
- char *path = relpathperm(xlrec->rnode, xlrec->forkNum);
+ char *path = relpathperm(xlrec->rlocator, xlrec->forkNum);
appendStringInfoString(buf, path);
pfree(path);
@@ -34,7 +34,7 @@ smgr_desc(StringInfo buf, XLogReaderState *record)
else if (info == XLOG_SMGR_TRUNCATE)
{
xl_smgr_truncate *xlrec = (xl_smgr_truncate *) rec;
- char *path = relpathperm(xlrec->rnode, MAIN_FORKNUM);
+ char *path = relpathperm(xlrec->rlocator, MAIN_FORKNUM);
appendStringInfo(buf, "%s to %u blocks flags %d", path,
xlrec->blkno, xlrec->flags);
diff --git a/src/backend/access/rmgrdesc/xactdesc.c b/src/backend/access/rmgrdesc/xactdesc.c
index 90b6ac2..39752cf 100644
--- a/src/backend/access/rmgrdesc/xactdesc.c
+++ b/src/backend/access/rmgrdesc/xactdesc.c
@@ -73,15 +73,15 @@ ParseCommitRecord(uint8 info, xl_xact_commit *xlrec, xl_xact_parsed_commit *pars
data += parsed->nsubxacts * sizeof(TransactionId);
}
- if (parsed->xinfo & XACT_XINFO_HAS_RELFILENODES)
+ if (parsed->xinfo & XACT_XINFO_HAS_RELFILELOCATORS)
{
- xl_xact_relfilenodes *xl_relfilenodes = (xl_xact_relfilenodes *) data;
+ xl_xact_relfilelocators *xl_rellocators = (xl_xact_relfilelocators *) data;
- parsed->nrels = xl_relfilenodes->nrels;
- parsed->xnodes = xl_relfilenodes->xnodes;
+ parsed->nrels = xl_rellocators->nrels;
+ parsed->xlocators = xl_rellocators->xlocators;
- data += MinSizeOfXactRelfilenodes;
- data += xl_relfilenodes->nrels * sizeof(RelFileNode);
+ data += MinSizeOfXactRelfileLocators;
+ data += xl_rellocators->nrels * sizeof(RelFileLocator);
}
if (parsed->xinfo & XACT_XINFO_HAS_DROPPED_STATS)
@@ -179,15 +179,15 @@ ParseAbortRecord(uint8 info, xl_xact_abort *xlrec, xl_xact_parsed_abort *parsed)
data += parsed->nsubxacts * sizeof(TransactionId);
}
- if (parsed->xinfo & XACT_XINFO_HAS_RELFILENODES)
+ if (parsed->xinfo & XACT_XINFO_HAS_RELFILELOCATORS)
{
- xl_xact_relfilenodes *xl_relfilenodes = (xl_xact_relfilenodes *) data;
+ xl_xact_relfilelocators *xl_rellocators = (xl_xact_relfilelocators *) data;
- parsed->nrels = xl_relfilenodes->nrels;
- parsed->xnodes = xl_relfilenodes->xnodes;
+ parsed->nrels = xl_rellocators->nrels;
+ parsed->xlocators = xl_rellocators->xlocators;
- data += MinSizeOfXactRelfilenodes;
- data += xl_relfilenodes->nrels * sizeof(RelFileNode);
+ data += MinSizeOfXactRelfileLocators;
+ data += xl_rellocators->nrels * sizeof(RelFileLocator);
}
if (parsed->xinfo & XACT_XINFO_HAS_DROPPED_STATS)
@@ -260,11 +260,11 @@ ParsePrepareRecord(uint8 info, xl_xact_prepare *xlrec, xl_xact_parsed_prepare *p
parsed->subxacts = (TransactionId *) bufptr;
bufptr += MAXALIGN(xlrec->nsubxacts * sizeof(TransactionId));
- parsed->xnodes = (RelFileNode *) bufptr;
- bufptr += MAXALIGN(xlrec->ncommitrels * sizeof(RelFileNode));
+ parsed->xlocators = (RelFileLocator *) bufptr;
+ bufptr += MAXALIGN(xlrec->ncommitrels * sizeof(RelFileLocator));
- parsed->abortnodes = (RelFileNode *) bufptr;
- bufptr += MAXALIGN(xlrec->nabortrels * sizeof(RelFileNode));
+ parsed->abortlocators = (RelFileLocator *) bufptr;
+ bufptr += MAXALIGN(xlrec->nabortrels * sizeof(RelFileLocator));
parsed->stats = (xl_xact_stats_item *) bufptr;
bufptr += MAXALIGN(xlrec->ncommitstats * sizeof(xl_xact_stats_item));
@@ -278,7 +278,7 @@ ParsePrepareRecord(uint8 info, xl_xact_prepare *xlrec, xl_xact_parsed_prepare *p
static void
xact_desc_relations(StringInfo buf, char *label, int nrels,
- RelFileNode *xnodes)
+ RelFileLocator *xlocators)
{
int i;
@@ -287,7 +287,7 @@ xact_desc_relations(StringInfo buf, char *label, int nrels,
appendStringInfo(buf, "; %s:", label);
for (i = 0; i < nrels; i++)
{
- char *path = relpathperm(xnodes[i], MAIN_FORKNUM);
+ char *path = relpathperm(xlocators[i], MAIN_FORKNUM);
appendStringInfo(buf, " %s", path);
pfree(path);
@@ -340,7 +340,7 @@ xact_desc_commit(StringInfo buf, uint8 info, xl_xact_commit *xlrec, RepOriginId
appendStringInfoString(buf, timestamptz_to_str(xlrec->xact_time));
- xact_desc_relations(buf, "rels", parsed.nrels, parsed.xnodes);
+ xact_desc_relations(buf, "rels", parsed.nrels, parsed.xlocators);
xact_desc_subxacts(buf, parsed.nsubxacts, parsed.subxacts);
xact_desc_stats(buf, "", parsed.nstats, parsed.stats);
@@ -376,7 +376,7 @@ xact_desc_abort(StringInfo buf, uint8 info, xl_xact_abort *xlrec, RepOriginId or
appendStringInfoString(buf, timestamptz_to_str(xlrec->xact_time));
- xact_desc_relations(buf, "rels", parsed.nrels, parsed.xnodes);
+ xact_desc_relations(buf, "rels", parsed.nrels, parsed.xlocators);
xact_desc_subxacts(buf, parsed.nsubxacts, parsed.subxacts);
if (parsed.xinfo & XACT_XINFO_HAS_ORIGIN)
@@ -400,9 +400,9 @@ xact_desc_prepare(StringInfo buf, uint8 info, xl_xact_prepare *xlrec, RepOriginI
appendStringInfo(buf, "gid %s: ", parsed.twophase_gid);
appendStringInfoString(buf, timestamptz_to_str(parsed.xact_time));
- xact_desc_relations(buf, "rels(commit)", parsed.nrels, parsed.xnodes);
+ xact_desc_relations(buf, "rels(commit)", parsed.nrels, parsed.xlocators);
xact_desc_relations(buf, "rels(abort)", parsed.nabortrels,
- parsed.abortnodes);
+ parsed.abortlocators);
xact_desc_stats(buf, "commit ", parsed.nstats, parsed.stats);
xact_desc_stats(buf, "abort ", parsed.nabortstats, parsed.abortstats);
xact_desc_subxacts(buf, parsed.nsubxacts, parsed.subxacts);
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index fefc563..6fec485 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -219,12 +219,12 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
for (block_id = 0; block_id <= XLogRecMaxBlockId(record); block_id++)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blk;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blk, NULL))
+ &rlocator, &forknum, &blk, NULL))
continue;
if (detailed_format)
@@ -239,7 +239,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfo(buf,
"blkref #%d: rel %u/%u/%u fork %s blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
blk);
@@ -299,7 +299,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfo(buf,
", blkref #%d: rel %u/%u/%u fork %s blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
blk);
}
@@ -308,7 +308,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfo(buf,
", blkref #%d: rel %u/%u/%u blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
blk);
}
diff --git a/src/backend/access/spgist/spginsert.c b/src/backend/access/spgist/spginsert.c
index bfb7404..c6821b5 100644
--- a/src/backend/access/spgist/spginsert.c
+++ b/src/backend/access/spgist/spginsert.c
@@ -171,7 +171,7 @@ spgbuildempty(Relation index)
PageSetChecksumInplace(page, SPGIST_METAPAGE_BLKNO);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, SPGIST_METAPAGE_BLKNO,
(char *) page, true);
- log_newpage(&(RelationGetSmgr(index))->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&(RelationGetSmgr(index))->smgr_rlocator.locator, INIT_FORKNUM,
SPGIST_METAPAGE_BLKNO, page, true);
/* Likewise for the root page. */
@@ -180,7 +180,7 @@ spgbuildempty(Relation index)
PageSetChecksumInplace(page, SPGIST_ROOT_BLKNO);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, SPGIST_ROOT_BLKNO,
(char *) page, true);
- log_newpage(&(RelationGetSmgr(index))->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&(RelationGetSmgr(index))->smgr_rlocator.locator, INIT_FORKNUM,
SPGIST_ROOT_BLKNO, page, true);
/* Likewise for the null-tuples root page. */
@@ -189,7 +189,7 @@ spgbuildempty(Relation index)
PageSetChecksumInplace(page, SPGIST_NULL_BLKNO);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, SPGIST_NULL_BLKNO,
(char *) page, true);
- log_newpage(&(RelationGetSmgr(index))->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&(RelationGetSmgr(index))->smgr_rlocator.locator, INIT_FORKNUM,
SPGIST_NULL_BLKNO, page, true);
/*
diff --git a/src/backend/access/spgist/spgxlog.c b/src/backend/access/spgist/spgxlog.c
index b500b2c..4c9f402 100644
--- a/src/backend/access/spgist/spgxlog.c
+++ b/src/backend/access/spgist/spgxlog.c
@@ -877,11 +877,11 @@ spgRedoVacuumRedirect(XLogReaderState *record)
{
if (TransactionIdIsValid(xldata->newestRedirectXid))
{
- RelFileNode node;
+ RelFileLocator locator;
- XLogRecGetBlockTag(record, 0, &node, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, &locator, NULL, NULL);
ResolveRecoveryConflictWithSnapshot(xldata->newestRedirectXid,
- node);
+ locator);
}
}
diff --git a/src/backend/access/table/tableamapi.c b/src/backend/access/table/tableamapi.c
index 76df798..873d961 100644
--- a/src/backend/access/table/tableamapi.c
+++ b/src/backend/access/table/tableamapi.c
@@ -82,7 +82,7 @@ GetTableAmRoutine(Oid amhandler)
Assert(routine->tuple_update != NULL);
Assert(routine->tuple_lock != NULL);
- Assert(routine->relation_set_new_filenode != NULL);
+ Assert(routine->relation_set_new_filelocator != NULL);
Assert(routine->relation_nontransactional_truncate != NULL);
Assert(routine->relation_copy_data != NULL);
Assert(routine->relation_copy_for_cluster != NULL);
diff --git a/src/backend/access/transam/README b/src/backend/access/transam/README
index 1edc818..565f994 100644
--- a/src/backend/access/transam/README
+++ b/src/backend/access/transam/README
@@ -557,7 +557,7 @@ void XLogRegisterBuffer(uint8 block_id, Buffer buf, uint8 flags);
XLogRegisterBuffer adds information about a data block to the WAL record.
block_id is an arbitrary number used to identify this page reference in
the redo routine. The information needed to re-find the page at redo -
- relfilenode, fork, and block number - are included in the WAL record.
+ relfilenumber, fork, and block number - are included in the WAL record.
XLogInsert will automatically include a full copy of the page contents, if
this is the first modification of the buffer since the last checkpoint.
@@ -692,7 +692,7 @@ by having database restart search for files that don't have any committed
entry in pg_class, but that currently isn't done because of the possibility
of deleting data that is useful for forensic analysis of the crash.
Orphan files are harmless --- at worst they waste a bit of disk space ---
-because we check for on-disk collisions when allocating new relfilenode
+because we check for on-disk collisions when allocating new relfilenumber
OIDs. So cleaning up isn't really necessary.
3. Deleting a table, which requires an unlink() that could fail.
@@ -725,10 +725,10 @@ then restart recovery. This is part of the reason for not writing a WAL
entry until we've successfully done the original action.
-Skipping WAL for New RelFileNode
+Skipping WAL for New RelFileLocator
--------------------------------
-Under wal_level=minimal, if a change modifies a relfilenode that ROLLBACK
+Under wal_level=minimal, if a change modifies a relfilenumber that ROLLBACK
would unlink, in-tree access methods write no WAL for that change. Code that
writes WAL without calling RelationNeedsWAL() must check for this case. This
skipping is mandatory. If a WAL-writing change preceded a WAL-skipping change
@@ -748,9 +748,9 @@ unconditionally for permanent relations. Under these approaches, the access
method callbacks must not call functions that react to RelationNeedsWAL().
This applies only to WAL records whose replay would modify bytes stored in the
-new relfilenode. It does not apply to other records about the relfilenode,
+new relfilenumber. It does not apply to other records about the relfilenumber,
such as XLOG_SMGR_CREATE. Because it operates at the level of individual
-relfilenodes, RelationNeedsWAL() can differ for tightly-coupled relations.
+relfilenumbers, RelationNeedsWAL() can differ for tightly-coupled relations.
Consider "CREATE TABLE t (); BEGIN; ALTER TABLE t ADD c text; ..." in which
ALTER TABLE adds a TOAST relation. The TOAST relation will skip WAL, while
the table owning it will not. ALTER TABLE SET TABLESPACE will cause a table
@@ -860,7 +860,7 @@ Changes to a temp table are not WAL-logged, hence could reach disk in
advance of T1's commit, but we don't care since temp table contents don't
survive crashes anyway.
-Database writes that skip WAL for new relfilenodes are also safe. In these
+Database writes that skip WAL for new relfilenumbers are also safe. In these
cases it's entirely possible for the data to reach disk before T1's commit,
because T1 will fsync it down to disk without any sort of interlock. However,
all these paths are designed to write data that no other transaction can see
diff --git a/src/backend/access/transam/README.parallel b/src/backend/access/transam/README.parallel
index 99c588d..e486bff 100644
--- a/src/backend/access/transam/README.parallel
+++ b/src/backend/access/transam/README.parallel
@@ -126,7 +126,7 @@ worker. This includes:
an index that is currently being rebuilt.
- Active relmapper.c mapping state. This is needed to allow consistent
- answers when fetching the current relfilenode for relation oids of
+ answers when fetching the current relfilenumber for relation oids of
mapped relations.
To prevent unprincipled deadlocks when running in parallel mode, this code
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 75551f6..41b31c5 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -204,7 +204,7 @@ static void RecordTransactionCommitPrepared(TransactionId xid,
int nchildren,
TransactionId *children,
int nrels,
- RelFileNode *rels,
+ RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
int ninvalmsgs,
@@ -215,7 +215,7 @@ static void RecordTransactionAbortPrepared(TransactionId xid,
int nchildren,
TransactionId *children,
int nrels,
- RelFileNode *rels,
+ RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
const char *gid);
@@ -951,8 +951,8 @@ TwoPhaseGetDummyProc(TransactionId xid, bool lock_held)
*
* 1. TwoPhaseFileHeader
* 2. TransactionId[] (subtransactions)
- * 3. RelFileNode[] (files to be deleted at commit)
- * 4. RelFileNode[] (files to be deleted at abort)
+ * 3. RelFileLocator[] (files to be deleted at commit)
+ * 4. RelFileLocator[] (files to be deleted at abort)
* 5. SharedInvalidationMessage[] (inval messages to be sent at commit)
* 6. TwoPhaseRecordOnDisk
* 7. ...
@@ -1047,8 +1047,8 @@ StartPrepare(GlobalTransaction gxact)
TransactionId xid = gxact->xid;
TwoPhaseFileHeader hdr;
TransactionId *children;
- RelFileNode *commitrels;
- RelFileNode *abortrels;
+ RelFileLocator *commitrels;
+ RelFileLocator *abortrels;
xl_xact_stats_item *abortstats = NULL;
xl_xact_stats_item *commitstats = NULL;
SharedInvalidationMessage *invalmsgs;
@@ -1102,12 +1102,12 @@ StartPrepare(GlobalTransaction gxact)
}
if (hdr.ncommitrels > 0)
{
- save_state_data(commitrels, hdr.ncommitrels * sizeof(RelFileNode));
+ save_state_data(commitrels, hdr.ncommitrels * sizeof(RelFileLocator));
pfree(commitrels);
}
if (hdr.nabortrels > 0)
{
- save_state_data(abortrels, hdr.nabortrels * sizeof(RelFileNode));
+ save_state_data(abortrels, hdr.nabortrels * sizeof(RelFileLocator));
pfree(abortrels);
}
if (hdr.ncommitstats > 0)
@@ -1489,9 +1489,9 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
TwoPhaseFileHeader *hdr;
TransactionId latestXid;
TransactionId *children;
- RelFileNode *commitrels;
- RelFileNode *abortrels;
- RelFileNode *delrels;
+ RelFileLocator *commitrels;
+ RelFileLocator *abortrels;
+ RelFileLocator *delrels;
int ndelrels;
xl_xact_stats_item *commitstats;
xl_xact_stats_item *abortstats;
@@ -1525,10 +1525,10 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
bufptr += MAXALIGN(hdr->gidlen);
children = (TransactionId *) bufptr;
bufptr += MAXALIGN(hdr->nsubxacts * sizeof(TransactionId));
- commitrels = (RelFileNode *) bufptr;
- bufptr += MAXALIGN(hdr->ncommitrels * sizeof(RelFileNode));
- abortrels = (RelFileNode *) bufptr;
- bufptr += MAXALIGN(hdr->nabortrels * sizeof(RelFileNode));
+ commitrels = (RelFileLocator *) bufptr;
+ bufptr += MAXALIGN(hdr->ncommitrels * sizeof(RelFileLocator));
+ abortrels = (RelFileLocator *) bufptr;
+ bufptr += MAXALIGN(hdr->nabortrels * sizeof(RelFileLocator));
commitstats = (xl_xact_stats_item *) bufptr;
bufptr += MAXALIGN(hdr->ncommitstats * sizeof(xl_xact_stats_item));
abortstats = (xl_xact_stats_item *) bufptr;
@@ -2100,8 +2100,8 @@ RecoverPreparedTransactions(void)
bufptr += MAXALIGN(hdr->gidlen);
subxids = (TransactionId *) bufptr;
bufptr += MAXALIGN(hdr->nsubxacts * sizeof(TransactionId));
- bufptr += MAXALIGN(hdr->ncommitrels * sizeof(RelFileNode));
- bufptr += MAXALIGN(hdr->nabortrels * sizeof(RelFileNode));
+ bufptr += MAXALIGN(hdr->ncommitrels * sizeof(RelFileLocator));
+ bufptr += MAXALIGN(hdr->nabortrels * sizeof(RelFileLocator));
bufptr += MAXALIGN(hdr->ncommitstats * sizeof(xl_xact_stats_item));
bufptr += MAXALIGN(hdr->nabortstats * sizeof(xl_xact_stats_item));
bufptr += MAXALIGN(hdr->ninvalmsgs * sizeof(SharedInvalidationMessage));
@@ -2285,7 +2285,7 @@ RecordTransactionCommitPrepared(TransactionId xid,
int nchildren,
TransactionId *children,
int nrels,
- RelFileNode *rels,
+ RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
int ninvalmsgs,
@@ -2383,7 +2383,7 @@ RecordTransactionAbortPrepared(TransactionId xid,
int nchildren,
TransactionId *children,
int nrels,
- RelFileNode *rels,
+ RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
const char *gid)
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 748120a..849a7ce 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -521,7 +521,7 @@ ForceTransactionIdLimitUpdate(void)
* wide, counter wraparound will occur eventually, and therefore it is unwise
* to assume they are unique unless precautions are taken to make them so.
* Hence, this routine should generally not be used directly. The only direct
- * callers should be GetNewOidWithIndex() and GetNewRelFileNode() in
+ * callers should be GetNewOidWithIndex() and GetNewRelFileNumber() in
* catalog/catalog.c.
*/
Oid
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 47d80b0..9379723 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -1282,7 +1282,7 @@ RecordTransactionCommit(void)
bool markXidCommitted = TransactionIdIsValid(xid);
TransactionId latestXid = InvalidTransactionId;
int nrels;
- RelFileNode *rels;
+ RelFileLocator *rels;
int nchildren;
TransactionId *children;
int ndroppedstats = 0;
@@ -1705,7 +1705,7 @@ RecordTransactionAbort(bool isSubXact)
TransactionId xid = GetCurrentTransactionIdIfAny();
TransactionId latestXid;
int nrels;
- RelFileNode *rels;
+ RelFileLocator *rels;
int ndroppedstats = 0;
xl_xact_stats_item *droppedstats = NULL;
int nchildren;
@@ -5586,7 +5586,7 @@ xactGetCommittedChildren(TransactionId **ptr)
XLogRecPtr
XactLogCommitRecord(TimestampTz commit_time,
int nsubxacts, TransactionId *subxacts,
- int nrels, RelFileNode *rels,
+ int nrels, RelFileLocator *rels,
int ndroppedstats, xl_xact_stats_item *droppedstats,
int nmsgs, SharedInvalidationMessage *msgs,
bool relcacheInval,
@@ -5597,7 +5597,7 @@ XactLogCommitRecord(TimestampTz commit_time,
xl_xact_xinfo xl_xinfo;
xl_xact_dbinfo xl_dbinfo;
xl_xact_subxacts xl_subxacts;
- xl_xact_relfilenodes xl_relfilenodes;
+ xl_xact_relfilelocators xl_relfilelocators;
xl_xact_stats_items xl_dropped_stats;
xl_xact_invals xl_invals;
xl_xact_twophase xl_twophase;
@@ -5651,8 +5651,8 @@ XactLogCommitRecord(TimestampTz commit_time,
if (nrels > 0)
{
- xl_xinfo.xinfo |= XACT_XINFO_HAS_RELFILENODES;
- xl_relfilenodes.nrels = nrels;
+ xl_xinfo.xinfo |= XACT_XINFO_HAS_RELFILELOCATORS;
+ xl_relfilelocators.nrels = nrels;
info |= XLR_SPECIAL_REL_UPDATE;
}
@@ -5710,12 +5710,12 @@ XactLogCommitRecord(TimestampTz commit_time,
nsubxacts * sizeof(TransactionId));
}
- if (xl_xinfo.xinfo & XACT_XINFO_HAS_RELFILENODES)
+ if (xl_xinfo.xinfo & XACT_XINFO_HAS_RELFILELOCATORS)
{
- XLogRegisterData((char *) (&xl_relfilenodes),
- MinSizeOfXactRelfilenodes);
+ XLogRegisterData((char *) (&xl_relfilelocators),
+ MinSizeOfXactRelfileLocators);
XLogRegisterData((char *) rels,
- nrels * sizeof(RelFileNode));
+ nrels * sizeof(RelFileLocator));
}
if (xl_xinfo.xinfo & XACT_XINFO_HAS_DROPPED_STATS)
@@ -5758,7 +5758,7 @@ XactLogCommitRecord(TimestampTz commit_time,
XLogRecPtr
XactLogAbortRecord(TimestampTz abort_time,
int nsubxacts, TransactionId *subxacts,
- int nrels, RelFileNode *rels,
+ int nrels, RelFileLocator *rels,
int ndroppedstats, xl_xact_stats_item *droppedstats,
int xactflags, TransactionId twophase_xid,
const char *twophase_gid)
@@ -5766,7 +5766,7 @@ XactLogAbortRecord(TimestampTz abort_time,
xl_xact_abort xlrec;
xl_xact_xinfo xl_xinfo;
xl_xact_subxacts xl_subxacts;
- xl_xact_relfilenodes xl_relfilenodes;
+ xl_xact_relfilelocators xl_relfilelocators;
xl_xact_stats_items xl_dropped_stats;
xl_xact_twophase xl_twophase;
xl_xact_dbinfo xl_dbinfo;
@@ -5800,8 +5800,8 @@ XactLogAbortRecord(TimestampTz abort_time,
if (nrels > 0)
{
- xl_xinfo.xinfo |= XACT_XINFO_HAS_RELFILENODES;
- xl_relfilenodes.nrels = nrels;
+ xl_xinfo.xinfo |= XACT_XINFO_HAS_RELFILELOCATORS;
+ xl_relfilelocators.nrels = nrels;
info |= XLR_SPECIAL_REL_UPDATE;
}
@@ -5864,12 +5864,12 @@ XactLogAbortRecord(TimestampTz abort_time,
nsubxacts * sizeof(TransactionId));
}
- if (xl_xinfo.xinfo & XACT_XINFO_HAS_RELFILENODES)
+ if (xl_xinfo.xinfo & XACT_XINFO_HAS_RELFILELOCATORS)
{
- XLogRegisterData((char *) (&xl_relfilenodes),
- MinSizeOfXactRelfilenodes);
+ XLogRegisterData((char *) (&xl_relfilelocators),
+ MinSizeOfXactRelfileLocators);
XLogRegisterData((char *) rels,
- nrels * sizeof(RelFileNode));
+ nrels * sizeof(RelFileLocator));
}
if (xl_xinfo.xinfo & XACT_XINFO_HAS_DROPPED_STATS)
@@ -6010,7 +6010,7 @@ xact_redo_commit(xl_xact_parsed_commit *parsed,
XLogFlush(lsn);
/* Make sure files supposed to be dropped are dropped */
- DropRelationFiles(parsed->xnodes, parsed->nrels, true);
+ DropRelationFiles(parsed->xlocators, parsed->nrels, true);
}
if (parsed->nstats > 0)
@@ -6121,7 +6121,7 @@ xact_redo_abort(xl_xact_parsed_abort *parsed, TransactionId xid,
*/
XLogFlush(lsn);
- DropRelationFiles(parsed->xnodes, parsed->nrels, true);
+ DropRelationFiles(parsed->xlocators, parsed->nrels, true);
}
if (parsed->nstats > 0)
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index 2ce9be2..ec27d36 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -70,7 +70,7 @@ typedef struct
{
bool in_use; /* is this slot in use? */
uint8 flags; /* REGBUF_* flags */
- RelFileNode rnode; /* identifies the relation and block */
+ RelFileLocator rlocator; /* identifies the relation and block */
ForkNumber forkno;
BlockNumber block;
Page page; /* page content */
@@ -257,7 +257,7 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
regbuf = &registered_buffers[block_id];
- BufferGetTag(buffer, &regbuf->rnode, &regbuf->forkno, &regbuf->block);
+ BufferGetTag(buffer, &regbuf->rlocator, &regbuf->forkno, &regbuf->block);
regbuf->page = BufferGetPage(buffer);
regbuf->flags = flags;
regbuf->rdata_tail = (XLogRecData *) &regbuf->rdata_head;
@@ -278,7 +278,7 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
if (i == block_id || !regbuf_old->in_use)
continue;
- Assert(!RelFileNodeEquals(regbuf_old->rnode, regbuf->rnode) ||
+ Assert(!RelFileLocatorEquals(regbuf_old->rlocator, regbuf->rlocator) ||
regbuf_old->forkno != regbuf->forkno ||
regbuf_old->block != regbuf->block);
}
@@ -293,7 +293,7 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
* shared buffer pool (i.e. when you don't have a Buffer for it).
*/
void
-XLogRegisterBlock(uint8 block_id, RelFileNode *rnode, ForkNumber forknum,
+XLogRegisterBlock(uint8 block_id, RelFileLocator *rlocator, ForkNumber forknum,
BlockNumber blknum, Page page, uint8 flags)
{
registered_buffer *regbuf;
@@ -308,7 +308,7 @@ XLogRegisterBlock(uint8 block_id, RelFileNode *rnode, ForkNumber forknum,
regbuf = &registered_buffers[block_id];
- regbuf->rnode = *rnode;
+ regbuf->rlocator = *rlocator;
regbuf->forkno = forknum;
regbuf->block = blknum;
regbuf->page = page;
@@ -331,7 +331,7 @@ XLogRegisterBlock(uint8 block_id, RelFileNode *rnode, ForkNumber forknum,
if (i == block_id || !regbuf_old->in_use)
continue;
- Assert(!RelFileNodeEquals(regbuf_old->rnode, regbuf->rnode) ||
+ Assert(!RelFileLocatorEquals(regbuf_old->rlocator, regbuf->rlocator) ||
regbuf_old->forkno != regbuf->forkno ||
regbuf_old->block != regbuf->block);
}
@@ -768,7 +768,7 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
rdt_datas_last = regbuf->rdata_tail;
}
- if (prev_regbuf && RelFileNodeEquals(regbuf->rnode, prev_regbuf->rnode))
+ if (prev_regbuf && RelFileLocatorEquals(regbuf->rlocator, prev_regbuf->rlocator))
{
samerel = true;
bkpb.fork_flags |= BKPBLOCK_SAME_REL;
@@ -793,8 +793,8 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
}
if (!samerel)
{
- memcpy(scratch, &regbuf->rnode, sizeof(RelFileNode));
- scratch += sizeof(RelFileNode);
+ memcpy(scratch, &regbuf->rlocator, sizeof(RelFileLocator));
+ scratch += sizeof(RelFileLocator);
}
memcpy(scratch, ®buf->block, sizeof(BlockNumber));
scratch += sizeof(BlockNumber);
@@ -1031,7 +1031,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
int flags = 0;
PGAlignedBlock copied_buffer;
char *origdata = (char *) BufferGetBlock(buffer);
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forkno;
BlockNumber blkno;
@@ -1058,8 +1058,8 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
if (buffer_std)
flags |= REGBUF_STANDARD;
- BufferGetTag(buffer, &rnode, &forkno, &blkno);
- XLogRegisterBlock(0, &rnode, forkno, blkno, copied_buffer.data, flags);
+ BufferGetTag(buffer, &rlocator, &forkno, &blkno);
+ XLogRegisterBlock(0, &rlocator, forkno, blkno, copied_buffer.data, flags);
recptr = XLogInsert(RM_XLOG_ID, XLOG_FPI_FOR_HINT);
}
@@ -1080,7 +1080,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
* the unused space to be left out from the WAL record, making it smaller.
*/
XLogRecPtr
-log_newpage(RelFileNode *rnode, ForkNumber forkNum, BlockNumber blkno,
+log_newpage(RelFileLocator *rlocator, ForkNumber forkNum, BlockNumber blkno,
Page page, bool page_std)
{
int flags;
@@ -1091,7 +1091,7 @@ log_newpage(RelFileNode *rnode, ForkNumber forkNum, BlockNumber blkno,
flags |= REGBUF_STANDARD;
XLogBeginInsert();
- XLogRegisterBlock(0, rnode, forkNum, blkno, page, flags);
+ XLogRegisterBlock(0, rlocator, forkNum, blkno, page, flags);
recptr = XLogInsert(RM_XLOG_ID, XLOG_FPI);
/*
@@ -1112,7 +1112,7 @@ log_newpage(RelFileNode *rnode, ForkNumber forkNum, BlockNumber blkno,
* because we can write multiple pages in a single WAL record.
*/
void
-log_newpages(RelFileNode *rnode, ForkNumber forkNum, int num_pages,
+log_newpages(RelFileLocator *rlocator, ForkNumber forkNum, int num_pages,
BlockNumber *blknos, Page *pages, bool page_std)
{
int flags;
@@ -1142,7 +1142,7 @@ log_newpages(RelFileNode *rnode, ForkNumber forkNum, int num_pages,
nbatch = 0;
while (nbatch < XLR_MAX_BLOCK_ID && i < num_pages)
{
- XLogRegisterBlock(nbatch, rnode, forkNum, blknos[i], pages[i], flags);
+ XLogRegisterBlock(nbatch, rlocator, forkNum, blknos[i], pages[i], flags);
i++;
nbatch++;
}
@@ -1177,16 +1177,16 @@ XLogRecPtr
log_newpage_buffer(Buffer buffer, bool page_std)
{
Page page = BufferGetPage(buffer);
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forkNum;
BlockNumber blkno;
/* Shared buffers should be modified in a critical section. */
Assert(CritSectionCount > 0);
- BufferGetTag(buffer, &rnode, &forkNum, &blkno);
+ BufferGetTag(buffer, &rlocator, &forkNum, &blkno);
- return log_newpage(&rnode, forkNum, blkno, page, page_std);
+ return log_newpage(&rlocator, forkNum, blkno, page, page_std);
}
/*
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index 959e409..d1662f3 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -138,7 +138,7 @@ struct XLogPrefetcher
dlist_head filter_queue;
/* Book-keeping to avoid repeat prefetches. */
- RelFileNode recent_rnode[XLOGPREFETCHER_SEQ_WINDOW_SIZE];
+ RelFileLocator recent_rlocator[XLOGPREFETCHER_SEQ_WINDOW_SIZE];
BlockNumber recent_block[XLOGPREFETCHER_SEQ_WINDOW_SIZE];
int recent_idx;
@@ -161,7 +161,7 @@ struct XLogPrefetcher
*/
typedef struct XLogPrefetcherFilter
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
XLogRecPtr filter_until_replayed;
BlockNumber filter_from_block;
dlist_node link;
@@ -187,11 +187,11 @@ typedef struct XLogPrefetchStats
} XLogPrefetchStats;
static inline void XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher,
- RelFileNode rnode,
+ RelFileLocator rlocator,
BlockNumber blockno,
XLogRecPtr lsn);
static inline bool XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher,
- RelFileNode rnode,
+ RelFileLocator rlocator,
BlockNumber blockno);
static inline void XLogPrefetcherCompleteFilters(XLogPrefetcher *prefetcher,
XLogRecPtr replaying_lsn);
@@ -365,7 +365,7 @@ XLogPrefetcherAllocate(XLogReaderState *reader)
{
XLogPrefetcher *prefetcher;
static HASHCTL hash_table_ctl = {
- .keysize = sizeof(RelFileNode),
+ .keysize = sizeof(RelFileLocator),
.entrysize = sizeof(XLogPrefetcherFilter)
};
@@ -568,22 +568,22 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
xl_dbase_create_file_copy_rec *xlrec =
(xl_dbase_create_file_copy_rec *) record->main_data;
- RelFileNode rnode = {InvalidOid, xlrec->db_id, InvalidOid};
+ RelFileLocator rlocator = {InvalidOid, xlrec->db_id, InvalidOid};
/*
* Don't try to prefetch anything in this database until
* it has been created, or we might confuse the blocks of
- * different generations, if a database OID or relfilenode
- * is reused. It's also more efficient than discovering
- * that relations don't exist on disk yet with ENOENT
- * errors.
+ * different generations, if a database OID or
+ * relfilenumber is reused. It's also more efficient than
+ * discovering that relations don't exist on disk yet with
+ * ENOENT errors.
*/
- XLogPrefetcherAddFilter(prefetcher, rnode, 0, record->lsn);
+ XLogPrefetcherAddFilter(prefetcher, rlocator, 0, record->lsn);
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing prefetch in database %u until %X/%X is replayed due to raw file copy",
- rnode.dbNode,
+ rlocator.dbOid,
LSN_FORMAT_ARGS(record->lsn));
#endif
}
@@ -601,19 +601,19 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* Don't prefetch anything for this whole relation
* until it has been created. Otherwise we might
* confuse the blocks of different generations, if a
- * relfilenode is reused. This also avoids the need
+ * relfilenumber is reused. This also avoids the need
* to discover the problem via extra syscalls that
* report ENOENT.
*/
- XLogPrefetcherAddFilter(prefetcher, xlrec->rnode, 0,
+ XLogPrefetcherAddFilter(prefetcher, xlrec->rlocator, 0,
record->lsn);
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing prefetch in relation %u/%u/%u until %X/%X is replayed, which creates the relation",
- xlrec->rnode.spcNode,
- xlrec->rnode.dbNode,
- xlrec->rnode.relNode,
+ xlrec->rlocator.spcOid,
+ xlrec->rlocator.dbOid,
+ xlrec->rlocator.relNumber,
LSN_FORMAT_ARGS(record->lsn));
#endif
}
@@ -627,16 +627,16 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* Don't consider prefetching anything in the truncated
* range until the truncation has been performed.
*/
- XLogPrefetcherAddFilter(prefetcher, xlrec->rnode,
+ XLogPrefetcherAddFilter(prefetcher, xlrec->rlocator,
xlrec->blkno,
record->lsn);
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, which truncates the relation",
- xlrec->rnode.spcNode,
- xlrec->rnode.dbNode,
- xlrec->rnode.relNode,
+ xlrec->rlocator.spcOid,
+ xlrec->rlocator.dbOid,
+ xlrec->rlocator.relNumber,
xlrec->blkno,
LSN_FORMAT_ARGS(record->lsn));
#endif
@@ -688,7 +688,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
}
/* Should we skip prefetching this block due to a filter? */
- if (XLogPrefetcherIsFiltered(prefetcher, block->rnode, block->blkno))
+ if (XLogPrefetcherIsFiltered(prefetcher, block->rlocator, block->blkno))
{
XLogPrefetchIncrement(&SharedStats->skip_new);
return LRQ_NEXT_NO_IO;
@@ -698,7 +698,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
for (int i = 0; i < XLOGPREFETCHER_SEQ_WINDOW_SIZE; ++i)
{
if (block->blkno == prefetcher->recent_block[i] &&
- RelFileNodeEquals(block->rnode, prefetcher->recent_rnode[i]))
+ RelFileLocatorEquals(block->rlocator, prefetcher->recent_rlocator[i]))
{
/*
* XXX If we also remembered where it was, we could set
@@ -709,7 +709,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
return LRQ_NEXT_NO_IO;
}
}
- prefetcher->recent_rnode[prefetcher->recent_idx] = block->rnode;
+ prefetcher->recent_rlocator[prefetcher->recent_idx] = block->rlocator;
prefetcher->recent_block[prefetcher->recent_idx] = block->blkno;
prefetcher->recent_idx =
(prefetcher->recent_idx + 1) % XLOGPREFETCHER_SEQ_WINDOW_SIZE;
@@ -719,7 +719,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* same relation (with some scheme to handle invalidations
* safely), but for now we'll call smgropen() every time.
*/
- reln = smgropen(block->rnode, InvalidBackendId);
+ reln = smgropen(block->rlocator, InvalidBackendId);
/*
* If the relation file doesn't exist on disk, for example because
@@ -733,12 +733,12 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing all prefetch in relation %u/%u/%u until %X/%X is replayed, because the relation does not exist on disk",
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
LSN_FORMAT_ARGS(record->lsn));
#endif
- XLogPrefetcherAddFilter(prefetcher, block->rnode, 0,
+ XLogPrefetcherAddFilter(prefetcher, block->rlocator, 0,
record->lsn);
XLogPrefetchIncrement(&SharedStats->skip_new);
return LRQ_NEXT_NO_IO;
@@ -754,13 +754,13 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, because the relation is too small",
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
block->blkno,
LSN_FORMAT_ARGS(record->lsn));
#endif
- XLogPrefetcherAddFilter(prefetcher, block->rnode, block->blkno,
+ XLogPrefetcherAddFilter(prefetcher, block->rlocator, block->blkno,
record->lsn);
XLogPrefetchIncrement(&SharedStats->skip_new);
return LRQ_NEXT_NO_IO;
@@ -793,9 +793,9 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
*/
elog(ERROR,
"could not prefetch relation %u/%u/%u block %u",
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
block->blkno);
}
}
@@ -852,17 +852,17 @@ pg_stat_get_recovery_prefetch(PG_FUNCTION_ARGS)
}
/*
- * Don't prefetch any blocks >= 'blockno' from a given 'rnode', until 'lsn'
+ * Don't prefetch any blocks >= 'blockno' from a given 'rlocator', until 'lsn'
* has been replayed.
*/
static inline void
-XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher, RelFileNode rnode,
+XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
BlockNumber blockno, XLogRecPtr lsn)
{
XLogPrefetcherFilter *filter;
bool found;
- filter = hash_search(prefetcher->filter_table, &rnode, HASH_ENTER, &found);
+ filter = hash_search(prefetcher->filter_table, &rlocator, HASH_ENTER, &found);
if (!found)
{
/*
@@ -875,7 +875,7 @@ XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher, RelFileNode rnode,
else
{
/*
- * We were already filtering this rnode. Extend the filter's lifetime
+ * We were already filtering this rlocator. Extend the filter's lifetime
* to cover this WAL record, but leave the lower of the block numbers
* there because we don't want to have to track individual blocks.
*/
@@ -890,7 +890,7 @@ XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher, RelFileNode rnode,
* Have we replayed any records that caused us to begin filtering a block
* range? That means that relations should have been created, extended or
* dropped as required, so we can stop filtering out accesses to a given
- * relfilenode.
+ * relfilenumber.
*/
static inline void
XLogPrefetcherCompleteFilters(XLogPrefetcher *prefetcher, XLogRecPtr replaying_lsn)
@@ -913,7 +913,7 @@ XLogPrefetcherCompleteFilters(XLogPrefetcher *prefetcher, XLogRecPtr replaying_l
* Check if a given block should be skipped due to a filter.
*/
static inline bool
-XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileNode rnode,
+XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
BlockNumber blockno)
{
/*
@@ -925,13 +925,13 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileNode rnode,
XLogPrefetcherFilter *filter;
/* See if the block range is filtered. */
- filter = hash_search(prefetcher->filter_table, &rnode, HASH_FIND, NULL);
+ filter = hash_search(prefetcher->filter_table, &rlocator, HASH_FIND, NULL);
if (filter && filter->filter_from_block <= blockno)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
- rnode.spcNode, rnode.dbNode, rnode.relNode, blockno,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed),
filter->filter_from_block);
#endif
@@ -939,15 +939,15 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileNode rnode,
}
/* See if the whole database is filtered. */
- rnode.relNode = InvalidOid;
- rnode.spcNode = InvalidOid;
- filter = hash_search(prefetcher->filter_table, &rnode, HASH_FIND, NULL);
+ rlocator.relNumber = InvalidRelFileNumber;
+ rlocator.spcOid = InvalidOid;
+ filter = hash_search(prefetcher->filter_table, &rlocator, HASH_FIND, NULL);
if (filter)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
- rnode.spcNode, rnode.dbNode, rnode.relNode, blockno,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed));
#endif
return true;
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index cf5db23..f3dc4b7 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -1638,7 +1638,7 @@ DecodeXLogRecord(XLogReaderState *state,
char *out;
uint32 remaining;
uint32 datatotal;
- RelFileNode *rnode = NULL;
+ RelFileLocator *rlocator = NULL;
uint8 block_id;
decoded->header = *record;
@@ -1823,12 +1823,12 @@ DecodeXLogRecord(XLogReaderState *state,
}
if (!(fork_flags & BKPBLOCK_SAME_REL))
{
- COPY_HEADER_FIELD(&blk->rnode, sizeof(RelFileNode));
- rnode = &blk->rnode;
+ COPY_HEADER_FIELD(&blk->rlocator, sizeof(RelFileLocator));
+ rlocator = &blk->rlocator;
}
else
{
- if (rnode == NULL)
+ if (rlocator == NULL)
{
report_invalid_record(state,
"BKPBLOCK_SAME_REL set but no previous rel at %X/%X",
@@ -1836,7 +1836,7 @@ DecodeXLogRecord(XLogReaderState *state,
goto err;
}
- blk->rnode = *rnode;
+ blk->rlocator = *rlocator;
}
COPY_HEADER_FIELD(&blk->blkno, sizeof(BlockNumber));
}
@@ -1926,10 +1926,11 @@ err:
*/
void
XLogRecGetBlockTag(XLogReaderState *record, uint8 block_id,
- RelFileNode *rnode, ForkNumber *forknum, BlockNumber *blknum)
+ RelFileLocator *rlocator, ForkNumber *forknum,
+ BlockNumber *blknum)
{
- if (!XLogRecGetBlockTagExtended(record, block_id, rnode, forknum, blknum,
- NULL))
+ if (!XLogRecGetBlockTagExtended(record, block_id, rlocator, forknum,
+ blknum, NULL))
{
#ifndef FRONTEND
elog(ERROR, "failed to locate backup block with ID %d in WAL record",
@@ -1945,13 +1946,13 @@ XLogRecGetBlockTag(XLogReaderState *record, uint8 block_id,
* Returns information about the block that a block reference refers to,
* optionally including the buffer that the block may already be in.
*
- * If the WAL record contains a block reference with the given ID, *rnode,
+ * If the WAL record contains a block reference with the given ID, *rlocator,
* *forknum, *blknum and *prefetch_buffer are filled in (if not NULL), and
* returns true. Otherwise returns false.
*/
bool
XLogRecGetBlockTagExtended(XLogReaderState *record, uint8 block_id,
- RelFileNode *rnode, ForkNumber *forknum,
+ RelFileLocator *rlocator, ForkNumber *forknum,
BlockNumber *blknum,
Buffer *prefetch_buffer)
{
@@ -1961,8 +1962,8 @@ XLogRecGetBlockTagExtended(XLogReaderState *record, uint8 block_id,
return false;
bkpb = &record->record->blocks[block_id];
- if (rnode)
- *rnode = bkpb->rnode;
+ if (rlocator)
+ *rlocator = bkpb->rlocator;
if (forknum)
*forknum = bkpb->forknum;
if (blknum)
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 6eba626..8306518 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2166,24 +2166,26 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
/* decode block references */
for (block_id = 0; block_id <= XLogRecMaxBlockId(record); block_id++)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blk;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blk, NULL))
+ &rlocator, &forknum, &blk, NULL))
continue;
if (forknum != MAIN_FORKNUM)
appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, fork %u, blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid,
+ rlocator.relNumber,
forknum,
blk);
else
appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid,
+ rlocator.relNumber,
blk);
if (XLogRecHasBlockImage(record, block_id))
appendStringInfoString(buf, " FPW");
@@ -2285,7 +2287,7 @@ static void
verifyBackupPageConsistency(XLogReaderState *record)
{
RmgrData rmgr = GetRmgr(XLogRecGetRmid(record));
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blkno;
int block_id;
@@ -2302,7 +2304,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
Page page;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blkno, NULL))
+ &rlocator, &forknum, &blkno, NULL))
{
/*
* WAL record doesn't contain a block reference with the given id.
@@ -2327,7 +2329,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
* Read the contents from the current buffer and store it in a
* temporary page.
*/
- buf = XLogReadBufferExtended(rnode, forknum, blkno,
+ buf = XLogReadBufferExtended(rlocator, forknum, blkno,
RBM_NORMAL_NO_LOG,
InvalidBuffer);
if (!BufferIsValid(buf))
@@ -2377,7 +2379,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
{
elog(FATAL,
"inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forknum, blkno);
}
}
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 4851669..42a0f51 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -67,7 +67,7 @@ HotStandbyState standbyState = STANDBY_DISABLED;
*/
typedef struct xl_invalid_page_key
{
- RelFileNode node; /* the relation */
+ RelFileLocator locator; /* the relation */
ForkNumber forkno; /* the fork number */
BlockNumber blkno; /* the page */
} xl_invalid_page_key;
@@ -86,10 +86,10 @@ static int read_local_xlog_page_guts(XLogReaderState *state, XLogRecPtr targetPa
/* Report a reference to an invalid page */
static void
-report_invalid_page(int elevel, RelFileNode node, ForkNumber forkno,
+report_invalid_page(int elevel, RelFileLocator locator, ForkNumber forkno,
BlockNumber blkno, bool present)
{
- char *path = relpathperm(node, forkno);
+ char *path = relpathperm(locator, forkno);
if (present)
elog(elevel, "page %u of relation %s is uninitialized",
@@ -102,7 +102,7 @@ report_invalid_page(int elevel, RelFileNode node, ForkNumber forkno,
/* Log a reference to an invalid page */
static void
-log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
+log_invalid_page(RelFileLocator locator, ForkNumber forkno, BlockNumber blkno,
bool present)
{
xl_invalid_page_key key;
@@ -119,7 +119,7 @@ log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
*/
if (reachedConsistency)
{
- report_invalid_page(WARNING, node, forkno, blkno, present);
+ report_invalid_page(WARNING, locator, forkno, blkno, present);
elog(ignore_invalid_pages ? WARNING : PANIC,
"WAL contains references to invalid pages");
}
@@ -130,7 +130,7 @@ log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
* something about the XLOG record that generated the reference).
*/
if (message_level_is_interesting(DEBUG1))
- report_invalid_page(DEBUG1, node, forkno, blkno, present);
+ report_invalid_page(DEBUG1, locator, forkno, blkno, present);
if (invalid_page_tab == NULL)
{
@@ -147,7 +147,7 @@ log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
}
/* we currently assume xl_invalid_page_key contains no padding */
- key.node = node;
+ key.locator = locator;
key.forkno = forkno;
key.blkno = blkno;
hentry = (xl_invalid_page *)
@@ -166,7 +166,8 @@ log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
/* Forget any invalid pages >= minblkno, because they've been dropped */
static void
-forget_invalid_pages(RelFileNode node, ForkNumber forkno, BlockNumber minblkno)
+forget_invalid_pages(RelFileLocator locator, ForkNumber forkno,
+ BlockNumber minblkno)
{
HASH_SEQ_STATUS status;
xl_invalid_page *hentry;
@@ -178,13 +179,13 @@ forget_invalid_pages(RelFileNode node, ForkNumber forkno, BlockNumber minblkno)
while ((hentry = (xl_invalid_page *) hash_seq_search(&status)) != NULL)
{
- if (RelFileNodeEquals(hentry->key.node, node) &&
+ if (RelFileLocatorEquals(hentry->key.locator, locator) &&
hentry->key.forkno == forkno &&
hentry->key.blkno >= minblkno)
{
if (message_level_is_interesting(DEBUG2))
{
- char *path = relpathperm(hentry->key.node, forkno);
+ char *path = relpathperm(hentry->key.locator, forkno);
elog(DEBUG2, "page %u of relation %s has been dropped",
hentry->key.blkno, path);
@@ -213,11 +214,11 @@ forget_invalid_pages_db(Oid dbid)
while ((hentry = (xl_invalid_page *) hash_seq_search(&status)) != NULL)
{
- if (hentry->key.node.dbNode == dbid)
+ if (hentry->key.locator.dbOid == dbid)
{
if (message_level_is_interesting(DEBUG2))
{
- char *path = relpathperm(hentry->key.node, hentry->key.forkno);
+ char *path = relpathperm(hentry->key.locator, hentry->key.forkno);
elog(DEBUG2, "page %u of relation %s has been dropped",
hentry->key.blkno, path);
@@ -261,7 +262,7 @@ XLogCheckInvalidPages(void)
*/
while ((hentry = (xl_invalid_page *) hash_seq_search(&status)) != NULL)
{
- report_invalid_page(WARNING, hentry->key.node, hentry->key.forkno,
+ report_invalid_page(WARNING, hentry->key.locator, hentry->key.forkno,
hentry->key.blkno, hentry->present);
foundone = true;
}
@@ -356,7 +357,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
Buffer *buf)
{
XLogRecPtr lsn = record->EndRecPtr;
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blkno;
Buffer prefetch_buffer;
@@ -364,7 +365,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
bool zeromode;
bool willinit;
- if (!XLogRecGetBlockTagExtended(record, block_id, &rnode, &forknum, &blkno,
+ if (!XLogRecGetBlockTagExtended(record, block_id, &rlocator, &forknum, &blkno,
&prefetch_buffer))
{
/* Caller specified a bogus block_id */
@@ -387,7 +388,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
if (XLogRecBlockImageApply(record, block_id))
{
Assert(XLogRecHasBlockImage(record, block_id));
- *buf = XLogReadBufferExtended(rnode, forknum, blkno,
+ *buf = XLogReadBufferExtended(rlocator, forknum, blkno,
get_cleanup_lock ? RBM_ZERO_AND_CLEANUP_LOCK : RBM_ZERO_AND_LOCK,
prefetch_buffer);
page = BufferGetPage(*buf);
@@ -418,7 +419,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
}
else
{
- *buf = XLogReadBufferExtended(rnode, forknum, blkno, mode, prefetch_buffer);
+ *buf = XLogReadBufferExtended(rlocator, forknum, blkno, mode, prefetch_buffer);
if (BufferIsValid(*buf))
{
if (mode != RBM_ZERO_AND_LOCK && mode != RBM_ZERO_AND_CLEANUP_LOCK)
@@ -468,7 +469,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
* they will be invisible to tools that need to know which pages are modified.
*/
Buffer
-XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
+XLogReadBufferExtended(RelFileLocator rlocator, ForkNumber forknum,
BlockNumber blkno, ReadBufferMode mode,
Buffer recent_buffer)
{
@@ -481,14 +482,14 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
/* Do we have a clue where the buffer might be already? */
if (BufferIsValid(recent_buffer) &&
mode == RBM_NORMAL &&
- ReadRecentBuffer(rnode, forknum, blkno, recent_buffer))
+ ReadRecentBuffer(rlocator, forknum, blkno, recent_buffer))
{
buffer = recent_buffer;
goto recent_buffer_fast_path;
}
/* Open the relation at smgr level */
- smgr = smgropen(rnode, InvalidBackendId);
+ smgr = smgropen(rlocator, InvalidBackendId);
/*
* Create the target file if it doesn't already exist. This lets us cope
@@ -505,7 +506,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
if (blkno < lastblock)
{
/* page exists in file */
- buffer = ReadBufferWithoutRelcache(rnode, forknum, blkno,
+ buffer = ReadBufferWithoutRelcache(rlocator, forknum, blkno,
mode, NULL, true);
}
else
@@ -513,7 +514,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
/* hm, page doesn't exist in file */
if (mode == RBM_NORMAL)
{
- log_invalid_page(rnode, forknum, blkno, false);
+ log_invalid_page(rlocator, forknum, blkno, false);
return InvalidBuffer;
}
if (mode == RBM_NORMAL_NO_LOG)
@@ -530,7 +531,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
ReleaseBuffer(buffer);
}
- buffer = ReadBufferWithoutRelcache(rnode, forknum,
+ buffer = ReadBufferWithoutRelcache(rlocator, forknum,
P_NEW, mode, NULL, true);
}
while (BufferGetBlockNumber(buffer) < blkno);
@@ -540,7 +541,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
if (mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK)
LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
ReleaseBuffer(buffer);
- buffer = ReadBufferWithoutRelcache(rnode, forknum, blkno,
+ buffer = ReadBufferWithoutRelcache(rlocator, forknum, blkno,
mode, NULL, true);
}
}
@@ -559,7 +560,7 @@ recent_buffer_fast_path:
if (PageIsNew(page))
{
ReleaseBuffer(buffer);
- log_invalid_page(rnode, forknum, blkno, true);
+ log_invalid_page(rlocator, forknum, blkno, true);
return InvalidBuffer;
}
}
@@ -594,7 +595,7 @@ typedef FakeRelCacheEntryData *FakeRelCacheEntry;
* Caller must free the returned entry with FreeFakeRelcacheEntry().
*/
Relation
-CreateFakeRelcacheEntry(RelFileNode rnode)
+CreateFakeRelcacheEntry(RelFileLocator rlocator)
{
FakeRelCacheEntry fakeentry;
Relation rel;
@@ -604,7 +605,7 @@ CreateFakeRelcacheEntry(RelFileNode rnode)
rel = (Relation) fakeentry;
rel->rd_rel = &fakeentry->pgc;
- rel->rd_node = rnode;
+ rel->rd_locator = rlocator;
/*
* We will never be working with temp rels during recovery or while
@@ -615,18 +616,18 @@ CreateFakeRelcacheEntry(RelFileNode rnode)
/* It must be a permanent table here */
rel->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
- /* We don't know the name of the relation; use relfilenode instead */
- sprintf(RelationGetRelationName(rel), "%u", rnode.relNode);
+ /* We don't know the name of the relation; use relfilenumber instead */
+ sprintf(RelationGetRelationName(rel), "%u", rlocator.relNumber);
/*
* We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNode may be
+ * relation. Note that this is fairly bogus since relNumber may be
* different from the relation's OID. It shouldn't really matter though.
* In recovery, we are running by ourselves and can't have any lock
* conflicts. While syncing, we already hold AccessExclusiveLock.
*/
- rel->rd_lockInfo.lockRelId.dbId = rnode.dbNode;
- rel->rd_lockInfo.lockRelId.relId = rnode.relNode;
+ rel->rd_lockInfo.lockRelId.dbId = rlocator.dbOid;
+ rel->rd_lockInfo.lockRelId.relId = rlocator.relNumber;
rel->rd_smgr = NULL;
@@ -652,9 +653,9 @@ FreeFakeRelcacheEntry(Relation fakerel)
* any open "invalid-page" records for the relation.
*/
void
-XLogDropRelation(RelFileNode rnode, ForkNumber forknum)
+XLogDropRelation(RelFileLocator rlocator, ForkNumber forknum)
{
- forget_invalid_pages(rnode, forknum, 0);
+ forget_invalid_pages(rlocator, forknum, 0);
}
/*
@@ -682,10 +683,10 @@ XLogDropDatabase(Oid dbid)
* We need to clean up any open "invalid-page" records for the dropped pages.
*/
void
-XLogTruncateRelation(RelFileNode rnode, ForkNumber forkNum,
+XLogTruncateRelation(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber nblocks)
{
- forget_invalid_pages(rnode, forkNum, nblocks);
+ forget_invalid_pages(rlocator, forkNum, nblocks);
}
/*
diff --git a/src/backend/bootstrap/bootparse.y b/src/backend/bootstrap/bootparse.y
index e5cf1b3..a872199 100644
--- a/src/backend/bootstrap/bootparse.y
+++ b/src/backend/bootstrap/bootparse.y
@@ -287,9 +287,9 @@ Boot_DeclareIndexStmt:
stmt->excludeOpNames = NIL;
stmt->idxcomment = NULL;
stmt->indexOid = InvalidOid;
- stmt->oldNode = InvalidOid;
+ stmt->oldNumber = InvalidRelFileNumber;
stmt->oldCreateSubid = InvalidSubTransactionId;
- stmt->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ stmt->oldFirstRelfilenumberSubid = InvalidSubTransactionId;
stmt->unique = false;
stmt->primary = false;
stmt->isconstraint = false;
@@ -339,9 +339,9 @@ Boot_DeclareUniqueIndexStmt:
stmt->excludeOpNames = NIL;
stmt->idxcomment = NULL;
stmt->indexOid = InvalidOid;
- stmt->oldNode = InvalidOid;
+ stmt->oldNumber = InvalidRelFileNumber;
stmt->oldCreateSubid = InvalidSubTransactionId;
- stmt->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ stmt->oldFirstRelfilenumberSubid = InvalidSubTransactionId;
stmt->unique = true;
stmt->primary = false;
stmt->isconstraint = false;
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index e784538..2a33273 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -481,14 +481,14 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
}
/*
- * GetNewRelFileNode
- * Generate a new relfilenode number that is unique within the
+ * GetNewRelFileNumber
+ * Generate a new relfilenumber that is unique within the
* database of the given tablespace.
*
- * If the relfilenode will also be used as the relation's OID, pass the
+ * If the relfilenumber will also be used as the relation's OID, pass the
* opened pg_class catalog, and this routine will guarantee that the result
* is also an unused OID within pg_class. If the result is to be used only
- * as a relfilenode for an existing relation, pass NULL for pg_class.
+ * as a relfilenumber for an existing relation, pass NULL for pg_class.
*
* As with GetNewOidWithIndex(), there is some theoretical risk of a race
* condition, but it doesn't seem worth worrying about.
@@ -496,17 +496,17 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
* Note: we don't support using this in bootstrap mode. All relations
* created by bootstrap have preassigned OIDs, so there's no need.
*/
-Oid
-GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
+RelFileNumber
+GetNewRelFileNumber(Oid reltablespace, Relation pg_class, char relpersistence)
{
- RelFileNodeBackend rnode;
+ RelFileLocatorBackend rlocator;
char *rpath;
bool collides;
BackendId backend;
/*
* If we ever get here during pg_upgrade, there's something wrong; all
- * relfilenode assignments during a binary-upgrade run should be
+ * relfilenumber assignments during a binary-upgrade run should be
* determined by commands in the dump script.
*/
Assert(!IsBinaryUpgrade);
@@ -526,15 +526,15 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
}
/* This logic should match RelationInitPhysicalAddr */
- rnode.node.spcNode = reltablespace ? reltablespace : MyDatabaseTableSpace;
- rnode.node.dbNode = (rnode.node.spcNode == GLOBALTABLESPACE_OID) ? InvalidOid : MyDatabaseId;
+ rlocator.locator.spcOid = reltablespace ? reltablespace : MyDatabaseTableSpace;
+ rlocator.locator.dbOid = (rlocator.locator.spcOid == GLOBALTABLESPACE_OID) ? InvalidOid : MyDatabaseId;
/*
* The relpath will vary based on the backend ID, so we must initialize
* that properly here to make sure that any collisions based on filename
* are properly detected.
*/
- rnode.backend = backend;
+ rlocator.backend = backend;
do
{
@@ -542,13 +542,13 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
/* Generate the OID */
if (pg_class)
- rnode.node.relNode = GetNewOidWithIndex(pg_class, ClassOidIndexId,
+ rlocator.locator.relNumber = GetNewOidWithIndex(pg_class, ClassOidIndexId,
Anum_pg_class_oid);
else
- rnode.node.relNode = GetNewObjectId();
+ rlocator.locator.relNumber = GetNewObjectId();
/* Check for existing file of same name */
- rpath = relpath(rnode, MAIN_FORKNUM);
+ rpath = relpath(rlocator, MAIN_FORKNUM);
if (access(rpath, F_OK) == 0)
{
@@ -570,7 +570,7 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
pfree(rpath);
} while (collides);
- return rnode.node.relNode;
+ return rlocator.locator.relNumber;
}
/*
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 1803194..c69c923 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -77,9 +77,11 @@
/* Potentially set by pg_upgrade_support functions */
Oid binary_upgrade_next_heap_pg_class_oid = InvalidOid;
-Oid binary_upgrade_next_heap_pg_class_relfilenode = InvalidOid;
Oid binary_upgrade_next_toast_pg_class_oid = InvalidOid;
-Oid binary_upgrade_next_toast_pg_class_relfilenode = InvalidOid;
+RelFileNumber binary_upgrade_next_heap_pg_class_relfilenumber =
+ InvalidRelFileNumber;
+RelFileNumber binary_upgrade_next_toast_pg_class_relfilenumber =
+ InvalidRelFileNumber;
static void AddNewRelationTuple(Relation pg_class_desc,
Relation new_rel_desc,
@@ -273,7 +275,7 @@ SystemAttributeByName(const char *attname)
* heap_create - Create an uncataloged heap relation
*
* Note API change: the caller must now always provide the OID
- * to use for the relation. The relfilenode may be (and in
+ * to use for the relation. The relfilenumber may be (and in
* the simplest cases is) left unspecified.
*
* create_storage indicates whether or not to create the storage.
@@ -289,7 +291,7 @@ heap_create(const char *relname,
Oid relnamespace,
Oid reltablespace,
Oid relid,
- Oid relfilenode,
+ RelFileNumber relfilenumber,
Oid accessmtd,
TupleDesc tupDesc,
char relkind,
@@ -341,11 +343,11 @@ heap_create(const char *relname,
else
{
/*
- * If relfilenode is unspecified by the caller then create storage
+ * If relfilenumber is unspecified by the caller then create storage
* with oid same as relid.
*/
- if (!OidIsValid(relfilenode))
- relfilenode = relid;
+ if (!RelFileNumberIsValid(relfilenumber))
+ relfilenumber = relid;
}
/*
@@ -368,7 +370,7 @@ heap_create(const char *relname,
tupDesc,
relid,
accessmtd,
- relfilenode,
+ relfilenumber,
reltablespace,
shared_relation,
mapped_relation,
@@ -385,11 +387,11 @@ heap_create(const char *relname,
if (create_storage)
{
if (RELKIND_HAS_TABLE_AM(rel->rd_rel->relkind))
- table_relation_set_new_filenode(rel, &rel->rd_node,
- relpersistence,
- relfrozenxid, relminmxid);
+ table_relation_set_new_filelocator(rel, &rel->rd_locator,
+ relpersistence,
+ relfrozenxid, relminmxid);
else if (RELKIND_HAS_STORAGE(rel->rd_rel->relkind))
- RelationCreateStorage(rel->rd_node, relpersistence, true);
+ RelationCreateStorage(rel->rd_locator, relpersistence, true);
else
Assert(false);
}
@@ -1069,7 +1071,7 @@ AddNewRelationType(const char *typeName,
* relkind: relkind for new rel
* relpersistence: rel's persistence status (permanent, temp, or unlogged)
* shared_relation: true if it's to be a shared relation
- * mapped_relation: true if the relation will use the relfilenode map
+ * mapped_relation: true if the relation will use the relfilenumber map
* oncommit: ON COMMIT marking (only relevant if it's a temp table)
* reloptions: reloptions in Datum form, or (Datum) 0 if none
* use_user_acl: true if should look for user-defined default permissions;
@@ -1115,7 +1117,7 @@ heap_create_with_catalog(const char *relname,
Oid new_type_oid;
/* By default set to InvalidOid unless overridden by binary-upgrade */
- Oid relfilenode = InvalidOid;
+ RelFileNumber relfilenumber = InvalidRelFileNumber;
TransactionId relfrozenxid;
MultiXactId relminmxid;
@@ -1173,12 +1175,12 @@ heap_create_with_catalog(const char *relname,
/*
* Allocate an OID for the relation, unless we were told what to use.
*
- * The OID will be the relfilenode as well, so make sure it doesn't
+ * The OID will be the relfilenumber as well, so make sure it doesn't
* collide with either pg_class OIDs or existing physical files.
*/
if (!OidIsValid(relid))
{
- /* Use binary-upgrade override for pg_class.oid and relfilenode */
+ /* Use binary-upgrade override for pg_class.oid and relfilenumber */
if (IsBinaryUpgrade)
{
/*
@@ -1196,13 +1198,13 @@ heap_create_with_catalog(const char *relname,
relid = binary_upgrade_next_toast_pg_class_oid;
binary_upgrade_next_toast_pg_class_oid = InvalidOid;
- if (!OidIsValid(binary_upgrade_next_toast_pg_class_relfilenode))
+ if (!RelFileNumberIsValid(binary_upgrade_next_toast_pg_class_relfilenumber))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("toast relfilenode value not set when in binary upgrade mode")));
+ errmsg("toast relfilenumber value not set when in binary upgrade mode")));
- relfilenode = binary_upgrade_next_toast_pg_class_relfilenode;
- binary_upgrade_next_toast_pg_class_relfilenode = InvalidOid;
+ relfilenumber = binary_upgrade_next_toast_pg_class_relfilenumber;
+ binary_upgrade_next_toast_pg_class_relfilenumber = InvalidRelFileNumber;
}
}
else
@@ -1217,20 +1219,20 @@ heap_create_with_catalog(const char *relname,
if (RELKIND_HAS_STORAGE(relkind))
{
- if (!OidIsValid(binary_upgrade_next_heap_pg_class_relfilenode))
+ if (!RelFileNumberIsValid(binary_upgrade_next_heap_pg_class_relfilenumber))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("relfilenode value not set when in binary upgrade mode")));
+ errmsg("relfilenumber value not set when in binary upgrade mode")));
- relfilenode = binary_upgrade_next_heap_pg_class_relfilenode;
- binary_upgrade_next_heap_pg_class_relfilenode = InvalidOid;
+ relfilenumber = binary_upgrade_next_heap_pg_class_relfilenumber;
+ binary_upgrade_next_heap_pg_class_relfilenumber = InvalidRelFileNumber;
}
}
}
if (!OidIsValid(relid))
- relid = GetNewRelFileNode(reltablespace, pg_class_desc,
- relpersistence);
+ relid = GetNewRelFileNumber(reltablespace, pg_class_desc,
+ relpersistence);
}
/*
@@ -1273,7 +1275,7 @@ heap_create_with_catalog(const char *relname,
relnamespace,
reltablespace,
relid,
- relfilenode,
+ relfilenumber,
accessmtd,
tupdesc,
relkind,
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index bdd3c34..f245df8 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -87,7 +87,8 @@
/* Potentially set by pg_upgrade_support functions */
Oid binary_upgrade_next_index_pg_class_oid = InvalidOid;
-Oid binary_upgrade_next_index_pg_class_relfilenode = InvalidOid;
+RelFileNumber binary_upgrade_next_index_pg_class_relfilenumber =
+ InvalidRelFileNumber;
/*
* Pointer-free representation of variables used when reindexing system
@@ -662,7 +663,7 @@ UpdateIndexRelation(Oid indexoid,
* parent index; otherwise InvalidOid.
* parentConstraintId: if creating a constraint on a partition, the OID
* of the constraint in the parent; otherwise InvalidOid.
- * relFileNode: normally, pass InvalidOid to get new storage. May be
+ * relFileNumber: normally, pass InvalidRelFileNumber to get new storage. May be
* nonzero to attach an existing valid build.
* indexInfo: same info executor uses to insert into the index
* indexColNames: column names to use for index (List of char *)
@@ -703,7 +704,7 @@ index_create(Relation heapRelation,
Oid indexRelationId,
Oid parentIndexRelid,
Oid parentConstraintId,
- Oid relFileNode,
+ RelFileNumber relFileNumber,
IndexInfo *indexInfo,
List *indexColNames,
Oid accessMethodObjectId,
@@ -735,7 +736,7 @@ index_create(Relation heapRelation,
char relkind;
TransactionId relfrozenxid;
MultiXactId relminmxid;
- bool create_storage = !OidIsValid(relFileNode);
+ bool create_storage = !RelFileNumberIsValid(relFileNumber);
/* constraint flags can only be set when a constraint is requested */
Assert((constr_flags == 0) ||
@@ -751,7 +752,7 @@ index_create(Relation heapRelation,
/*
* The index will be in the same namespace as its parent table, and is
* shared across databases if and only if the parent is. Likewise, it
- * will use the relfilenode map if and only if the parent does; and it
+ * will use the relfilenumber map if and only if the parent does; and it
* inherits the parent's relpersistence.
*/
namespaceId = RelationGetNamespace(heapRelation);
@@ -902,12 +903,12 @@ index_create(Relation heapRelation,
/*
* Allocate an OID for the index, unless we were told what to use.
*
- * The OID will be the relfilenode as well, so make sure it doesn't
+ * The OID will be the relfilenumber as well, so make sure it doesn't
* collide with either pg_class OIDs or existing physical files.
*/
if (!OidIsValid(indexRelationId))
{
- /* Use binary-upgrade override for pg_class.oid and relfilenode */
+ /* Use binary-upgrade override for pg_class.oid and relfilenumber */
if (IsBinaryUpgrade)
{
if (!OidIsValid(binary_upgrade_next_index_pg_class_oid))
@@ -918,14 +919,14 @@ index_create(Relation heapRelation,
indexRelationId = binary_upgrade_next_index_pg_class_oid;
binary_upgrade_next_index_pg_class_oid = InvalidOid;
- /* Override the index relfilenode */
+ /* Override the index relfilenumber */
if ((relkind == RELKIND_INDEX) &&
- (!OidIsValid(binary_upgrade_next_index_pg_class_relfilenode)))
+ (!RelFileNumberIsValid(binary_upgrade_next_index_pg_class_relfilenumber)))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("index relfilenode value not set when in binary upgrade mode")));
- relFileNode = binary_upgrade_next_index_pg_class_relfilenode;
- binary_upgrade_next_index_pg_class_relfilenode = InvalidOid;
+ errmsg("index relfilenumber value not set when in binary upgrade mode")));
+ relFileNumber = binary_upgrade_next_index_pg_class_relfilenumber;
+ binary_upgrade_next_index_pg_class_relfilenumber = InvalidRelFileNumber;
/*
* Note that we want create_storage = true for binary upgrade. The
@@ -937,7 +938,7 @@ index_create(Relation heapRelation,
else
{
indexRelationId =
- GetNewRelFileNode(tableSpaceId, pg_class, relpersistence);
+ GetNewRelFileNumber(tableSpaceId, pg_class, relpersistence);
}
}
@@ -950,7 +951,7 @@ index_create(Relation heapRelation,
namespaceId,
tableSpaceId,
indexRelationId,
- relFileNode,
+ relFileNumber,
accessMethodObjectId,
indexTupDesc,
relkind,
@@ -1408,7 +1409,7 @@ index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId,
InvalidOid, /* indexRelationId */
InvalidOid, /* parentIndexRelid */
InvalidOid, /* parentConstraintId */
- InvalidOid, /* relFileNode */
+ InvalidRelFileNumber, /* relFileNumber */
newInfo,
indexColNames,
indexRelation->rd_rel->relam,
@@ -3024,7 +3025,7 @@ index_build(Relation heapRelation,
* it -- but we must first check whether one already exists. If, for
* example, an unlogged relation is truncated in the transaction that
* created it, or truncated twice in a subsequent transaction, the
- * relfilenode won't change, and nothing needs to be done here.
+ * relfilenumber won't change, and nothing needs to be done here.
*/
if (indexRelation->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
!smgrexists(RelationGetSmgr(indexRelation), INIT_FORKNUM))
@@ -3681,7 +3682,7 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
* Schedule unlinking of the old index storage at transaction commit.
*/
RelationDropStorage(iRel);
- RelationAssumeNewRelfilenode(iRel);
+ RelationAssumeNewRelfilelocator(iRel);
/* Make sure the reltablespace change is visible */
CommandCounterIncrement();
@@ -3711,7 +3712,7 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
SetReindexProcessing(heapId, indexId);
/* Create a new physical relation for the index */
- RelationSetNewRelfilenode(iRel, persistence);
+ RelationSetNewRelfilenumber(iRel, persistence);
/* Initialize the index and rebuild */
/* Note: we do not need to re-establish pkey setting */
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index c06e414..37dd2b9 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -38,7 +38,7 @@
int wal_skip_threshold = 2048; /* in kilobytes */
/*
- * We keep a list of all relations (represented as RelFileNode values)
+ * We keep a list of all relations (represented as RelFileLocator values)
* that have been created or deleted in the current transaction. When
* a relation is created, we create the physical file immediately, but
* remember it so that we can delete the file again if the current
@@ -59,7 +59,7 @@ int wal_skip_threshold = 2048; /* in kilobytes */
typedef struct PendingRelDelete
{
- RelFileNode relnode; /* relation that may need to be deleted */
+ RelFileLocator rlocator; /* relation that may need to be deleted */
BackendId backend; /* InvalidBackendId if not a temp rel */
bool atCommit; /* T=delete at commit; F=delete at abort */
int nestLevel; /* xact nesting level of request */
@@ -68,7 +68,7 @@ typedef struct PendingRelDelete
typedef struct PendingRelSync
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
bool is_truncated; /* Has the file experienced truncation? */
} PendingRelSync;
@@ -81,7 +81,7 @@ static HTAB *pendingSyncHash = NULL;
* Queue an at-commit fsync.
*/
static void
-AddPendingSync(const RelFileNode *rnode)
+AddPendingSync(const RelFileLocator *rlocator)
{
PendingRelSync *pending;
bool found;
@@ -91,14 +91,14 @@ AddPendingSync(const RelFileNode *rnode)
{
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileNode);
+ ctl.keysize = sizeof(RelFileLocator);
ctl.entrysize = sizeof(PendingRelSync);
ctl.hcxt = TopTransactionContext;
pendingSyncHash = hash_create("pending sync hash", 16, &ctl,
HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
}
- pending = hash_search(pendingSyncHash, rnode, HASH_ENTER, &found);
+ pending = hash_search(pendingSyncHash, rlocator, HASH_ENTER, &found);
Assert(!found);
pending->is_truncated = false;
}
@@ -117,7 +117,7 @@ AddPendingSync(const RelFileNode *rnode)
* pass register_delete = false.
*/
SMgrRelation
-RelationCreateStorage(RelFileNode rnode, char relpersistence,
+RelationCreateStorage(RelFileLocator rlocator, char relpersistence,
bool register_delete)
{
SMgrRelation srel;
@@ -145,11 +145,11 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence,
return NULL; /* placate compiler */
}
- srel = smgropen(rnode, backend);
+ srel = smgropen(rlocator, backend);
smgrcreate(srel, MAIN_FORKNUM, false);
if (needs_wal)
- log_smgrcreate(&srel->smgr_rnode.node, MAIN_FORKNUM);
+ log_smgrcreate(&srel->smgr_rlocator.locator, MAIN_FORKNUM);
/*
* Add the relation to the list of stuff to delete at abort, if we are
@@ -161,7 +161,7 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence,
pending = (PendingRelDelete *)
MemoryContextAlloc(TopMemoryContext, sizeof(PendingRelDelete));
- pending->relnode = rnode;
+ pending->rlocator = rlocator;
pending->backend = backend;
pending->atCommit = false; /* delete if abort */
pending->nestLevel = GetCurrentTransactionNestLevel();
@@ -172,7 +172,7 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence,
if (relpersistence == RELPERSISTENCE_PERMANENT && !XLogIsNeeded())
{
Assert(backend == InvalidBackendId);
- AddPendingSync(&rnode);
+ AddPendingSync(&rlocator);
}
return srel;
@@ -182,14 +182,14 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence,
* Perform XLogInsert of an XLOG_SMGR_CREATE record to WAL.
*/
void
-log_smgrcreate(const RelFileNode *rnode, ForkNumber forkNum)
+log_smgrcreate(const RelFileLocator *rlocator, ForkNumber forkNum)
{
xl_smgr_create xlrec;
/*
* Make an XLOG entry reporting the file creation.
*/
- xlrec.rnode = *rnode;
+ xlrec.rlocator = *rlocator;
xlrec.forkNum = forkNum;
XLogBeginInsert();
@@ -209,7 +209,7 @@ RelationDropStorage(Relation rel)
/* Add the relation to the list of stuff to delete at commit */
pending = (PendingRelDelete *)
MemoryContextAlloc(TopMemoryContext, sizeof(PendingRelDelete));
- pending->relnode = rel->rd_node;
+ pending->rlocator = rel->rd_locator;
pending->backend = rel->rd_backend;
pending->atCommit = true; /* delete if commit */
pending->nestLevel = GetCurrentTransactionNestLevel();
@@ -247,7 +247,7 @@ RelationDropStorage(Relation rel)
* No-op if the relation is not among those scheduled for deletion.
*/
void
-RelationPreserveStorage(RelFileNode rnode, bool atCommit)
+RelationPreserveStorage(RelFileLocator rlocator, bool atCommit)
{
PendingRelDelete *pending;
PendingRelDelete *prev;
@@ -257,7 +257,7 @@ RelationPreserveStorage(RelFileNode rnode, bool atCommit)
for (pending = pendingDeletes; pending != NULL; pending = next)
{
next = pending->next;
- if (RelFileNodeEquals(rnode, pending->relnode)
+ if (RelFileLocatorEquals(rlocator, pending->rlocator)
&& pending->atCommit == atCommit)
{
/* unlink and delete list entry */
@@ -369,7 +369,7 @@ RelationTruncate(Relation rel, BlockNumber nblocks)
xl_smgr_truncate xlrec;
xlrec.blkno = nblocks;
- xlrec.rnode = rel->rd_node;
+ xlrec.rlocator = rel->rd_locator;
xlrec.flags = SMGR_TRUNCATE_ALL;
XLogBeginInsert();
@@ -428,7 +428,7 @@ RelationPreTruncate(Relation rel)
return;
pending = hash_search(pendingSyncHash,
- &(RelationGetSmgr(rel)->smgr_rnode.node),
+ &(RelationGetSmgr(rel)->smgr_rlocator.locator),
HASH_FIND, NULL);
if (pending)
pending->is_truncated = true;
@@ -472,7 +472,7 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
* We need to log the copied data in WAL iff WAL archiving/streaming is
* enabled AND it's a permanent relation. This gives the same answer as
* "RelationNeedsWAL(rel) || copying_initfork", because we know the
- * current operation created a new relfilenode.
+ * current operation created a new relfilelocator.
*/
use_wal = XLogIsNeeded() &&
(relpersistence == RELPERSISTENCE_PERMANENT || copying_initfork);
@@ -496,8 +496,8 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
* (errcontext callbacks shouldn't be risking any such thing, but
* people have been known to forget that rule.)
*/
- char *relpath = relpathbackend(src->smgr_rnode.node,
- src->smgr_rnode.backend,
+ char *relpath = relpathbackend(src->smgr_rlocator.locator,
+ src->smgr_rlocator.backend,
forkNum);
ereport(ERROR,
@@ -512,7 +512,7 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
* space.
*/
if (use_wal)
- log_newpage(&dst->smgr_rnode.node, forkNum, blkno, page, false);
+ log_newpage(&dst->smgr_rlocator.locator, forkNum, blkno, page, false);
PageSetChecksumInplace(page, blkno);
@@ -538,19 +538,19 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
}
/*
- * RelFileNodeSkippingWAL
- * Check if a BM_PERMANENT relfilenode is using WAL.
+ * RelFileLocatorSkippingWAL
+ * Check if a BM_PERMANENT relfilelocator is using WAL.
*
- * Changes of certain relfilenodes must not write WAL; see "Skipping WAL for
- * New RelFileNode" in src/backend/access/transam/README. Though it is known
- * from Relation efficiently, this function is intended for the code paths not
- * having access to Relation.
+ * Changes of certain relfilelocators must not write WAL; see "Skipping WAL for
+ * New RelFileLocator" in src/backend/access/transam/README. Though it is
+ * known from Relation efficiently, this function is intended for the code
+ * paths not having access to Relation.
*/
bool
-RelFileNodeSkippingWAL(RelFileNode rnode)
+RelFileLocatorSkippingWAL(RelFileLocator rlocator)
{
if (!pendingSyncHash ||
- hash_search(pendingSyncHash, &rnode, HASH_FIND, NULL) == NULL)
+ hash_search(pendingSyncHash, &rlocator, HASH_FIND, NULL) == NULL)
return false;
return true;
@@ -566,7 +566,7 @@ EstimatePendingSyncsSpace(void)
long entries;
entries = pendingSyncHash ? hash_get_num_entries(pendingSyncHash) : 0;
- return mul_size(1 + entries, sizeof(RelFileNode));
+ return mul_size(1 + entries, sizeof(RelFileLocator));
}
/*
@@ -581,57 +581,58 @@ SerializePendingSyncs(Size maxSize, char *startAddress)
HASH_SEQ_STATUS scan;
PendingRelSync *sync;
PendingRelDelete *delete;
- RelFileNode *src;
- RelFileNode *dest = (RelFileNode *) startAddress;
+ RelFileLocator *src;
+ RelFileLocator *dest = (RelFileLocator *) startAddress;
if (!pendingSyncHash)
goto terminate;
- /* Create temporary hash to collect active relfilenodes */
- ctl.keysize = sizeof(RelFileNode);
- ctl.entrysize = sizeof(RelFileNode);
+ /* Create temporary hash to collect active relfilelocators */
+ ctl.keysize = sizeof(RelFileLocator);
+ ctl.entrysize = sizeof(RelFileLocator);
ctl.hcxt = CurrentMemoryContext;
- tmphash = hash_create("tmp relfilenodes",
+ tmphash = hash_create("tmp relfilelocators",
hash_get_num_entries(pendingSyncHash), &ctl,
HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
- /* collect all rnodes from pending syncs */
+ /* collect all rlocators from pending syncs */
hash_seq_init(&scan, pendingSyncHash);
while ((sync = (PendingRelSync *) hash_seq_search(&scan)))
- (void) hash_search(tmphash, &sync->rnode, HASH_ENTER, NULL);
+ (void) hash_search(tmphash, &sync->rlocator, HASH_ENTER, NULL);
/* remove deleted rnodes */
for (delete = pendingDeletes; delete != NULL; delete = delete->next)
if (delete->atCommit)
- (void) hash_search(tmphash, (void *) &delete->relnode,
+ (void) hash_search(tmphash, (void *) &delete->rlocator,
HASH_REMOVE, NULL);
hash_seq_init(&scan, tmphash);
- while ((src = (RelFileNode *) hash_seq_search(&scan)))
+ while ((src = (RelFileLocator *) hash_seq_search(&scan)))
*dest++ = *src;
hash_destroy(tmphash);
terminate:
- MemSet(dest, 0, sizeof(RelFileNode));
+ MemSet(dest, 0, sizeof(RelFileLocator));
}
/*
* RestorePendingSyncs
* Restore syncs within a parallel worker.
*
- * RelationNeedsWAL() and RelFileNodeSkippingWAL() must offer the correct
+ * RelationNeedsWAL() and RelFileLocatorSkippingWAL() must offer the correct
* answer to parallel workers. Only smgrDoPendingSyncs() reads the
* is_truncated field, at end of transaction. Hence, don't restore it.
*/
void
RestorePendingSyncs(char *startAddress)
{
- RelFileNode *rnode;
+ RelFileLocator *rlocator;
Assert(pendingSyncHash == NULL);
- for (rnode = (RelFileNode *) startAddress; rnode->relNode != 0; rnode++)
- AddPendingSync(rnode);
+ for (rlocator = (RelFileLocator *) startAddress; rlocator->relNumber != 0;
+ rlocator++)
+ AddPendingSync(rlocator);
}
/*
@@ -677,7 +678,7 @@ smgrDoPendingDeletes(bool isCommit)
{
SMgrRelation srel;
- srel = smgropen(pending->relnode, pending->backend);
+ srel = smgropen(pending->rlocator, pending->backend);
/* allocate the initial array, or extend it, if needed */
if (maxrels == 0)
@@ -747,7 +748,7 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
/* Skip syncing nodes that smgrDoPendingDeletes() will delete. */
for (pending = pendingDeletes; pending != NULL; pending = pending->next)
if (pending->atCommit)
- (void) hash_search(pendingSyncHash, (void *) &pending->relnode,
+ (void) hash_search(pendingSyncHash, (void *) &pending->rlocator,
HASH_REMOVE, NULL);
hash_seq_init(&scan, pendingSyncHash);
@@ -758,7 +759,7 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
BlockNumber total_blocks = 0;
SMgrRelation srel;
- srel = smgropen(pendingsync->rnode, InvalidBackendId);
+ srel = smgropen(pendingsync->rlocator, InvalidBackendId);
/*
* We emit newpage WAL records for smaller relations.
@@ -832,7 +833,7 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
* page including any unused space. ReadBufferExtended()
* counts some pgstat events; unfortunately, we discard them.
*/
- rel = CreateFakeRelcacheEntry(srel->smgr_rnode.node);
+ rel = CreateFakeRelcacheEntry(srel->smgr_rlocator.locator);
log_newpage_range(rel, fork, 0, n, false);
FreeFakeRelcacheEntry(rel);
}
@@ -852,7 +853,7 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
* smgrGetPendingDeletes() -- Get a list of non-temp relations to be deleted.
*
* The return value is the number of relations scheduled for termination.
- * *ptr is set to point to a freshly-palloc'd array of RelFileNodes.
+ * *ptr is set to point to a freshly-palloc'd array of RelFileLocators.
* If there are no relations to be deleted, *ptr is set to NULL.
*
* Only non-temporary relations are included in the returned list. This is OK
@@ -866,11 +867,11 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
* by upper-level transactions.
*/
int
-smgrGetPendingDeletes(bool forCommit, RelFileNode **ptr)
+smgrGetPendingDeletes(bool forCommit, RelFileLocator **ptr)
{
int nestLevel = GetCurrentTransactionNestLevel();
int nrels;
- RelFileNode *rptr;
+ RelFileLocator *rptr;
PendingRelDelete *pending;
nrels = 0;
@@ -885,14 +886,14 @@ smgrGetPendingDeletes(bool forCommit, RelFileNode **ptr)
*ptr = NULL;
return 0;
}
- rptr = (RelFileNode *) palloc(nrels * sizeof(RelFileNode));
+ rptr = (RelFileLocator *) palloc(nrels * sizeof(RelFileLocator));
*ptr = rptr;
for (pending = pendingDeletes; pending != NULL; pending = pending->next)
{
if (pending->nestLevel >= nestLevel && pending->atCommit == forCommit
&& pending->backend == InvalidBackendId)
{
- *rptr = pending->relnode;
+ *rptr = pending->rlocator;
rptr++;
}
}
@@ -967,7 +968,7 @@ smgr_redo(XLogReaderState *record)
xl_smgr_create *xlrec = (xl_smgr_create *) XLogRecGetData(record);
SMgrRelation reln;
- reln = smgropen(xlrec->rnode, InvalidBackendId);
+ reln = smgropen(xlrec->rlocator, InvalidBackendId);
smgrcreate(reln, xlrec->forkNum, true);
}
else if (info == XLOG_SMGR_TRUNCATE)
@@ -980,7 +981,7 @@ smgr_redo(XLogReaderState *record)
int nforks = 0;
bool need_fsm_vacuum = false;
- reln = smgropen(xlrec->rnode, InvalidBackendId);
+ reln = smgropen(xlrec->rlocator, InvalidBackendId);
/*
* Forcibly create relation if it doesn't exist (which suggests that
@@ -1015,11 +1016,11 @@ smgr_redo(XLogReaderState *record)
nforks++;
/* Also tell xlogutils.c about it */
- XLogTruncateRelation(xlrec->rnode, MAIN_FORKNUM, xlrec->blkno);
+ XLogTruncateRelation(xlrec->rlocator, MAIN_FORKNUM, xlrec->blkno);
}
/* Prepare for truncation of FSM and VM too */
- rel = CreateFakeRelcacheEntry(xlrec->rnode);
+ rel = CreateFakeRelcacheEntry(xlrec->rlocator);
if ((xlrec->flags & SMGR_TRUNCATE_FSM) != 0 &&
smgrexists(reln, FSM_FORKNUM))
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index cea2c8b..da137eb 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -293,7 +293,7 @@ cluster_multiple_rels(List *rtcs, ClusterParams *params)
* cluster_rel
*
* This clusters the table by creating a new, clustered table and
- * swapping the relfilenodes of the new table and the old table, so
+ * swapping the relfilenumbers of the new table and the old table, so
* the OID of the original table is preserved. Thus we do not lose
* GRANT, inheritance nor references to this table (this was a bug
* in releases through 7.3).
@@ -1025,8 +1025,8 @@ copy_table_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
/*
* Swap the physical files of two given relations.
*
- * We swap the physical identity (reltablespace, relfilenode) while keeping the
- * same logical identities of the two relations. relpersistence is also
+ * We swap the physical identity (reltablespace, relfilenumber) while keeping
+ * the same logical identities of the two relations. relpersistence is also
* swapped, which is critical since it determines where buffers live for each
* relation.
*
@@ -1061,9 +1061,9 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
reltup2;
Form_pg_class relform1,
relform2;
- Oid relfilenode1,
- relfilenode2;
- Oid swaptemp;
+ RelFileNumber relfilenumber1,
+ relfilenumber2;
+ RelFileNumber swaptemp;
char swptmpchr;
/* We need writable copies of both pg_class tuples. */
@@ -1079,13 +1079,14 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
elog(ERROR, "cache lookup failed for relation %u", r2);
relform2 = (Form_pg_class) GETSTRUCT(reltup2);
- relfilenode1 = relform1->relfilenode;
- relfilenode2 = relform2->relfilenode;
+ relfilenumber1 = relform1->relfilenode;
+ relfilenumber2 = relform2->relfilenode;
- if (OidIsValid(relfilenode1) && OidIsValid(relfilenode2))
+ if (RelFileNumberIsValid(relfilenumber1) &&
+ RelFileNumberIsValid(relfilenumber2))
{
/*
- * Normal non-mapped relations: swap relfilenodes, reltablespaces,
+ * Normal non-mapped relations: swap relfilenumbers, reltablespaces,
* relpersistence
*/
Assert(!target_is_pg_class);
@@ -1120,7 +1121,8 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
* Mapped-relation case. Here we have to swap the relation mappings
* instead of modifying the pg_class columns. Both must be mapped.
*/
- if (OidIsValid(relfilenode1) || OidIsValid(relfilenode2))
+ if (RelFileNumberIsValid(relfilenumber1) ||
+ RelFileNumberIsValid(relfilenumber2))
elog(ERROR, "cannot swap mapped relation \"%s\" with non-mapped relation",
NameStr(relform1->relname));
@@ -1148,12 +1150,12 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
/*
* Fetch the mappings --- shouldn't fail, but be paranoid
*/
- relfilenode1 = RelationMapOidToFilenode(r1, relform1->relisshared);
- if (!OidIsValid(relfilenode1))
+ relfilenumber1 = RelationMapOidToFilenumber(r1, relform1->relisshared);
+ if (!RelFileNumberIsValid(relfilenumber1))
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
NameStr(relform1->relname), r1);
- relfilenode2 = RelationMapOidToFilenode(r2, relform2->relisshared);
- if (!OidIsValid(relfilenode2))
+ relfilenumber2 = RelationMapOidToFilenumber(r2, relform2->relisshared);
+ if (!RelFileNumberIsValid(relfilenumber2))
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
NameStr(relform2->relname), r2);
@@ -1161,15 +1163,15 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
* Send replacement mappings to relmapper. Note these won't actually
* take effect until CommandCounterIncrement.
*/
- RelationMapUpdateMap(r1, relfilenode2, relform1->relisshared, false);
- RelationMapUpdateMap(r2, relfilenode1, relform2->relisshared, false);
+ RelationMapUpdateMap(r1, relfilenumber2, relform1->relisshared, false);
+ RelationMapUpdateMap(r2, relfilenumber1, relform2->relisshared, false);
/* Pass OIDs of mapped r2 tables back to caller */
*mapped_tables++ = r2;
}
/*
- * Recognize that rel1's relfilenode (swapped from rel2) is new in this
+ * Recognize that rel1's relfilenumber (swapped from rel2) is new in this
* subtransaction. The rel2 storage (swapped from rel1) may or may not be
* new.
*/
@@ -1180,9 +1182,9 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
rel1 = relation_open(r1, NoLock);
rel2 = relation_open(r2, NoLock);
rel2->rd_createSubid = rel1->rd_createSubid;
- rel2->rd_newRelfilenodeSubid = rel1->rd_newRelfilenodeSubid;
- rel2->rd_firstRelfilenodeSubid = rel1->rd_firstRelfilenodeSubid;
- RelationAssumeNewRelfilenode(rel1);
+ rel2->rd_newRelfilelocatorSubid = rel1->rd_newRelfilelocatorSubid;
+ rel2->rd_firstRelfilelocatorSubid = rel1->rd_firstRelfilelocatorSubid;
+ RelationAssumeNewRelfilelocator(rel1);
relation_close(rel1, NoLock);
relation_close(rel2, NoLock);
}
@@ -1523,7 +1525,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
table_close(relRelation, RowExclusiveLock);
}
- /* Destroy new heap with old filenode */
+ /* Destroy new heap with old filenumber */
object.classId = RelationRelationId;
object.objectId = OIDNewHeap;
object.objectSubId = 0;
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 35a1d3a..c985fea 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -593,11 +593,11 @@ CopyFrom(CopyFromState cstate)
*/
if (RELKIND_HAS_STORAGE(cstate->rel->rd_rel->relkind) &&
(cstate->rel->rd_createSubid != InvalidSubTransactionId ||
- cstate->rel->rd_firstRelfilenodeSubid != InvalidSubTransactionId))
+ cstate->rel->rd_firstRelfilelocatorSubid != InvalidSubTransactionId))
ti_options |= TABLE_INSERT_SKIP_FSM;
/*
- * Optimize if new relfilenode was created in this subxact or one of its
+ * Optimize if new relfilenumber was created in this subxact or one of its
* committed children and we won't see those rows later as part of an
* earlier scan or command. The subxact test ensures that if this subxact
* aborts then the frozen rows won't be visible after xact cleanup. Note
@@ -640,7 +640,7 @@ CopyFrom(CopyFromState cstate)
errmsg("cannot perform COPY FREEZE because of prior transaction activity")));
if (cstate->rel->rd_createSubid != GetCurrentSubTransactionId() &&
- cstate->rel->rd_newRelfilenodeSubid != GetCurrentSubTransactionId())
+ cstate->rel->rd_newRelfilelocatorSubid != GetCurrentSubTransactionId())
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("cannot perform COPY FREEZE because the table was not created or truncated in the current subtransaction")));
diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index f269168..ca2f884 100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -101,7 +101,7 @@ typedef struct
*/
typedef struct CreateDBRelInfo
{
- RelFileNode rnode; /* physical relation identifier */
+ RelFileLocator rlocator; /* physical relation identifier */
Oid reloid; /* relation oid */
bool permanent; /* relation is permanent or unlogged */
} CreateDBRelInfo;
@@ -127,7 +127,7 @@ static void CreateDatabaseUsingWalLog(Oid src_dboid, Oid dboid, Oid src_tsid,
static List *ScanSourceDatabasePgClass(Oid srctbid, Oid srcdbid, char *srcpath);
static List *ScanSourceDatabasePgClassPage(Page page, Buffer buf, Oid tbid,
Oid dbid, char *srcpath,
- List *rnodelist, Snapshot snapshot);
+ List *rlocatorlist, Snapshot snapshot);
static CreateDBRelInfo *ScanSourceDatabasePgClassTuple(HeapTupleData *tuple,
Oid tbid, Oid dbid,
char *srcpath);
@@ -147,12 +147,12 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
{
char *srcpath;
char *dstpath;
- List *rnodelist = NULL;
+ List *rlocatorlist = NULL;
ListCell *cell;
LockRelId srcrelid;
LockRelId dstrelid;
- RelFileNode srcrnode;
- RelFileNode dstrnode;
+ RelFileLocator srcrlocator;
+ RelFileLocator dstrlocator;
CreateDBRelInfo *relinfo;
/* Get source and destination database paths. */
@@ -165,9 +165,9 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
/* Copy relmap file from source database to the destination database. */
RelationMapCopy(dst_dboid, dst_tsid, srcpath, dstpath);
- /* Get list of relfilenodes to copy from the source database. */
- rnodelist = ScanSourceDatabasePgClass(src_tsid, src_dboid, srcpath);
- Assert(rnodelist != NIL);
+ /* Get list of relfilelocators to copy from the source database. */
+ rlocatorlist = ScanSourceDatabasePgClass(src_tsid, src_dboid, srcpath);
+ Assert(rlocatorlist != NIL);
/*
* Database IDs will be the same for all relations so set them before
@@ -176,11 +176,11 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
srcrelid.dbId = src_dboid;
dstrelid.dbId = dst_dboid;
- /* Loop over our list of relfilenodes and copy each one. */
- foreach(cell, rnodelist)
+ /* Loop over our list of relfilelocators and copy each one. */
+ foreach(cell, rlocatorlist)
{
relinfo = lfirst(cell);
- srcrnode = relinfo->rnode;
+ srcrlocator = relinfo->rlocator;
/*
* If the relation is from the source db's default tablespace then we
@@ -188,13 +188,13 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
* Otherwise, we need to create in the same tablespace as it is in the
* source database.
*/
- if (srcrnode.spcNode == src_tsid)
- dstrnode.spcNode = dst_tsid;
+ if (srcrlocator.spcOid == src_tsid)
+ dstrlocator.spcOid = dst_tsid;
else
- dstrnode.spcNode = srcrnode.spcNode;
+ dstrlocator.spcOid = srcrlocator.spcOid;
- dstrnode.dbNode = dst_dboid;
- dstrnode.relNode = srcrnode.relNode;
+ dstrlocator.dbOid = dst_dboid;
+ dstrlocator.relNumber = srcrlocator.relNumber;
/*
* Acquire locks on source and target relations before copying.
@@ -210,7 +210,7 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
LockRelationId(&dstrelid, AccessShareLock);
/* Copy relation storage from source to the destination. */
- CreateAndCopyRelationData(srcrnode, dstrnode, relinfo->permanent);
+ CreateAndCopyRelationData(srcrlocator, dstrlocator, relinfo->permanent);
/* Release the relation locks. */
UnlockRelationId(&srcrelid, AccessShareLock);
@@ -219,7 +219,7 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
pfree(srcpath);
pfree(dstpath);
- list_free_deep(rnodelist);
+ list_free_deep(rlocatorlist);
}
/*
@@ -246,31 +246,31 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
static List *
ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber nblocks;
BlockNumber blkno;
Buffer buf;
- Oid relfilenode;
+ RelFileNumber relfilenumber;
Page page;
- List *rnodelist = NIL;
+ List *rlocatorlist = NIL;
LockRelId relid;
Relation rel;
Snapshot snapshot;
BufferAccessStrategy bstrategy;
- /* Get pg_class relfilenode. */
- relfilenode = RelationMapOidToFilenodeForDatabase(srcpath,
- RelationRelationId);
+ /* Get pg_class relfilenumber. */
+ relfilenumber = RelationMapOidToFilenumberForDatabase(srcpath,
+ RelationRelationId);
/* Don't read data into shared_buffers without holding a relation lock. */
relid.dbId = dbid;
relid.relId = RelationRelationId;
LockRelationId(&relid, AccessShareLock);
- /* Prepare a RelFileNode for the pg_class relation. */
- rnode.spcNode = tbid;
- rnode.dbNode = dbid;
- rnode.relNode = relfilenode;
+ /* Prepare a RelFileLocator for the pg_class relation. */
+ rlocator.spcOid = tbid;
+ rlocator.dbOid = dbid;
+ rlocator.relNumber = relfilenumber;
/*
* We can't use a real relcache entry for a relation in some other
@@ -279,7 +279,7 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
* used the smgr layer directly, we would have to worry about
* invalidations.
*/
- rel = CreateFakeRelcacheEntry(rnode);
+ rel = CreateFakeRelcacheEntry(rlocator);
nblocks = smgrnblocks(RelationGetSmgr(rel), MAIN_FORKNUM);
FreeFakeRelcacheEntry(rel);
@@ -299,7 +299,7 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
{
CHECK_FOR_INTERRUPTS();
- buf = ReadBufferWithoutRelcache(rnode, MAIN_FORKNUM, blkno,
+ buf = ReadBufferWithoutRelcache(rlocator, MAIN_FORKNUM, blkno,
RBM_NORMAL, bstrategy, false);
LockBuffer(buf, BUFFER_LOCK_SHARE);
@@ -310,9 +310,9 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
continue;
}
- /* Append relevant pg_class tuples for current page to rnodelist. */
- rnodelist = ScanSourceDatabasePgClassPage(page, buf, tbid, dbid,
- srcpath, rnodelist,
+ /* Append relevant pg_class tuples for current page to rlocatorlist. */
+ rlocatorlist = ScanSourceDatabasePgClassPage(page, buf, tbid, dbid,
+ srcpath, rlocatorlist,
snapshot);
UnlockReleaseBuffer(buf);
@@ -321,16 +321,16 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
/* Release relation lock. */
UnlockRelationId(&relid, AccessShareLock);
- return rnodelist;
+ return rlocatorlist;
}
/*
* Scan one page of the source database's pg_class relation and add relevant
- * entries to rnodelist. The return value is the updated list.
+ * entries to rlocatorlist. The return value is the updated list.
*/
static List *
ScanSourceDatabasePgClassPage(Page page, Buffer buf, Oid tbid, Oid dbid,
- char *srcpath, List *rnodelist,
+ char *srcpath, List *rlocatorlist,
Snapshot snapshot)
{
BlockNumber blkno = BufferGetBlockNumber(buf);
@@ -376,11 +376,11 @@ ScanSourceDatabasePgClassPage(Page page, Buffer buf, Oid tbid, Oid dbid,
relinfo = ScanSourceDatabasePgClassTuple(&tuple, tbid, dbid,
srcpath);
if (relinfo != NULL)
- rnodelist = lappend(rnodelist, relinfo);
+ rlocatorlist = lappend(rlocatorlist, relinfo);
}
}
- return rnodelist;
+ return rlocatorlist;
}
/*
@@ -397,7 +397,7 @@ ScanSourceDatabasePgClassTuple(HeapTupleData *tuple, Oid tbid, Oid dbid,
{
CreateDBRelInfo *relinfo;
Form_pg_class classForm;
- Oid relfilenode = InvalidOid;
+ RelFileNumber relfilenumber = InvalidRelFileNumber;
classForm = (Form_pg_class) GETSTRUCT(tuple);
@@ -418,29 +418,29 @@ ScanSourceDatabasePgClassTuple(HeapTupleData *tuple, Oid tbid, Oid dbid,
return NULL;
/*
- * If relfilenode is valid then directly use it. Otherwise, consult the
+ * If relfilenumber is valid then directly use it. Otherwise, consult the
* relmap.
*/
if (OidIsValid(classForm->relfilenode))
- relfilenode = classForm->relfilenode;
+ relfilenumber = classForm->relfilenode;
else
- relfilenode = RelationMapOidToFilenodeForDatabase(srcpath,
- classForm->oid);
+ relfilenumber = RelationMapOidToFilenumberForDatabase(srcpath,
+ classForm->oid);
- /* We must have a valid relfilenode oid. */
- if (!OidIsValid(relfilenode))
- elog(ERROR, "relation with OID %u does not have a valid relfilenode",
+ /* We must have a valid relfilenumber. */
+ if (!RelFileNumberIsValid(relfilenumber))
+ elog(ERROR, "relation with OID %u does not have a valid relfilenumber",
classForm->oid);
/* Prepare a rel info element and add it to the list. */
relinfo = (CreateDBRelInfo *) palloc(sizeof(CreateDBRelInfo));
if (OidIsValid(classForm->reltablespace))
- relinfo->rnode.spcNode = classForm->reltablespace;
+ relinfo->rlocator.spcOid = classForm->reltablespace;
else
- relinfo->rnode.spcNode = tbid;
+ relinfo->rlocator.spcOid = tbid;
- relinfo->rnode.dbNode = dbid;
- relinfo->rnode.relNode = relfilenode;
+ relinfo->rlocator.dbOid = dbid;
+ relinfo->rlocator.relNumber = relfilenumber;
relinfo->reloid = classForm->oid;
/* Temporary relations were rejected above. */
@@ -2867,8 +2867,8 @@ remove_dbtablespaces(Oid db_id)
* try to remove that already-existing subdirectory during the cleanup in
* remove_dbtablespaces. Nuking existing files seems like a bad idea, so
* instead we make this extra check before settling on the OID of the new
- * database. This exactly parallels what GetNewRelFileNode() does for table
- * relfilenode values.
+ * database. This exactly parallels what GetNewRelFileNumber() does for table
+ * relfilenumber values.
*/
static bool
check_db_file_conflict(Oid db_id)
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index eac13ac..10c1d5c 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -1093,10 +1093,10 @@ DefineIndex(Oid relationId,
}
/*
- * A valid stmt->oldNode implies that we already have a built form of the
+ * A valid stmt->oldNumber implies that we already have a built form of the
* index. The caller should also decline any index build.
*/
- Assert(!OidIsValid(stmt->oldNode) || (skip_build && !concurrent));
+ Assert(!RelFileNumberIsValid(stmt->oldNumber) || (skip_build && !concurrent));
/*
* Make the catalog entries for the index, including constraints. This
@@ -1138,7 +1138,7 @@ DefineIndex(Oid relationId,
indexRelationId =
index_create(rel, indexRelationName, indexRelationId, parentIndexId,
parentConstraintId,
- stmt->oldNode, indexInfo, indexColNames,
+ stmt->oldNumber, indexInfo, indexColNames,
accessMethodId, tablespaceId,
collationObjectId, classObjectId,
coloptions, reloptions,
@@ -1348,15 +1348,15 @@ DefineIndex(Oid relationId,
* We can't use the same index name for the child index,
* so clear idxname to let the recursive invocation choose
* a new name. Likewise, the existing target relation
- * field is wrong, and if indexOid or oldNode are set,
+ * field is wrong, and if indexOid or oldNumber are set,
* they mustn't be applied to the child either.
*/
childStmt->idxname = NULL;
childStmt->relation = NULL;
childStmt->indexOid = InvalidOid;
- childStmt->oldNode = InvalidOid;
+ childStmt->oldNumber = InvalidRelFileNumber;
childStmt->oldCreateSubid = InvalidSubTransactionId;
- childStmt->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ childStmt->oldFirstRelfilenumberSubid = InvalidSubTransactionId;
/*
* Adjust any Vars (both in expressions and in the index's
@@ -2949,7 +2949,7 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
* particular this eliminates all shared catalogs.).
*/
if (RELKIND_HAS_STORAGE(classtuple->relkind) &&
- !OidIsValid(classtuple->relfilenode))
+ !RelFileNumberIsValid(classtuple->relfilenode))
skip_rel = true;
/*
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index d1ee106..9ac0383 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -118,7 +118,7 @@ SetMatViewPopulatedState(Relation relation, bool newstate)
* ExecRefreshMatView -- execute a REFRESH MATERIALIZED VIEW command
*
* This refreshes the materialized view by creating a new table and swapping
- * the relfilenodes of the new table and the old materialized view, so the OID
+ * the relfilenumbers of the new table and the old materialized view, so the OID
* of the original materialized view is preserved. Thus we do not lose GRANT
* nor references to this materialized view.
*
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index ddf219b..48d9d43 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -75,7 +75,7 @@ typedef struct sequence_magic
typedef struct SeqTableData
{
Oid relid; /* pg_class OID of this sequence (hash key) */
- Oid filenode; /* last seen relfilenode of this sequence */
+ RelFileNumber filenumber; /* last seen relfilenumber of this sequence */
LocalTransactionId lxid; /* xact in which we last did a seq op */
bool last_valid; /* do we have a valid "last" value? */
int64 last; /* value last returned by nextval */
@@ -255,7 +255,7 @@ DefineSequence(ParseState *pstate, CreateSeqStmt *seq)
*
* The change is made transactionally, so that on failure of the current
* transaction, the sequence will be restored to its previous state.
- * We do that by creating a whole new relfilenode for the sequence; so this
+ * We do that by creating a whole new relfilenumber for the sequence; so this
* works much like the rewriting forms of ALTER TABLE.
*
* Caller is assumed to have acquired AccessExclusiveLock on the sequence,
@@ -310,7 +310,7 @@ ResetSequence(Oid seq_relid)
/*
* Create a new storage file for the sequence.
*/
- RelationSetNewRelfilenode(seq_rel, seq_rel->rd_rel->relpersistence);
+ RelationSetNewRelfilenumber(seq_rel, seq_rel->rd_rel->relpersistence);
/*
* Ensure sequence's relfrozenxid is at 0, since it won't contain any
@@ -347,9 +347,9 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
{
SMgrRelation srel;
- srel = smgropen(rel->rd_node, InvalidBackendId);
+ srel = smgropen(rel->rd_locator, InvalidBackendId);
smgrcreate(srel, INIT_FORKNUM, false);
- log_smgrcreate(&rel->rd_node, INIT_FORKNUM);
+ log_smgrcreate(&rel->rd_locator, INIT_FORKNUM);
fill_seq_fork_with_data(rel, tuple, INIT_FORKNUM);
FlushRelationBuffers(rel);
smgrclose(srel);
@@ -418,7 +418,7 @@ fill_seq_fork_with_data(Relation rel, HeapTuple tuple, ForkNumber forkNum)
XLogBeginInsert();
XLogRegisterBuffer(0, buf, REGBUF_WILL_INIT);
- xlrec.node = rel->rd_node;
+ xlrec.locator = rel->rd_locator;
XLogRegisterData((char *) &xlrec, sizeof(xl_seq_rec));
XLogRegisterData((char *) tuple->t_data, tuple->t_len);
@@ -509,7 +509,7 @@ AlterSequence(ParseState *pstate, AlterSeqStmt *stmt)
* Create a new storage file for the sequence, making the state
* changes transactional.
*/
- RelationSetNewRelfilenode(seqrel, seqrel->rd_rel->relpersistence);
+ RelationSetNewRelfilenumber(seqrel, seqrel->rd_rel->relpersistence);
/*
* Ensure sequence's relfrozenxid is at 0, since it won't contain any
@@ -557,7 +557,7 @@ SequenceChangePersistence(Oid relid, char newrelpersistence)
GetTopTransactionId();
(void) read_seq_tuple(seqrel, &buf, &seqdatatuple);
- RelationSetNewRelfilenode(seqrel, newrelpersistence);
+ RelationSetNewRelfilenumber(seqrel, newrelpersistence);
fill_seq_with_data(seqrel, &seqdatatuple);
UnlockReleaseBuffer(buf);
@@ -836,7 +836,7 @@ nextval_internal(Oid relid, bool check_permissions)
seq->is_called = true;
seq->log_cnt = 0;
- xlrec.node = seqrel->rd_node;
+ xlrec.locator = seqrel->rd_locator;
XLogRegisterData((char *) &xlrec, sizeof(xl_seq_rec));
XLogRegisterData((char *) seqdatatuple.t_data, seqdatatuple.t_len);
@@ -1023,7 +1023,7 @@ do_setval(Oid relid, int64 next, bool iscalled)
XLogBeginInsert();
XLogRegisterBuffer(0, buf, REGBUF_WILL_INIT);
- xlrec.node = seqrel->rd_node;
+ xlrec.locator = seqrel->rd_locator;
XLogRegisterData((char *) &xlrec, sizeof(xl_seq_rec));
XLogRegisterData((char *) seqdatatuple.t_data, seqdatatuple.t_len);
@@ -1147,7 +1147,7 @@ init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel)
if (!found)
{
/* relid already filled in */
- elm->filenode = InvalidOid;
+ elm->filenumber = InvalidRelFileNumber;
elm->lxid = InvalidLocalTransactionId;
elm->last_valid = false;
elm->last = elm->cached = 0;
@@ -1169,9 +1169,9 @@ init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel)
* discard any cached-but-unissued values. We do not touch the currval()
* state, however.
*/
- if (seqrel->rd_rel->relfilenode != elm->filenode)
+ if (seqrel->rd_rel->relfilenode != elm->filenumber)
{
- elm->filenode = seqrel->rd_rel->relfilenode;
+ elm->filenumber = seqrel->rd_rel->relfilenode;
elm->cached = elm->last;
}
@@ -1254,7 +1254,8 @@ read_seq_tuple(Relation rel, Buffer *buf, HeapTuple seqdatatuple)
* changed. This allows ALTER SEQUENCE to behave transactionally. Currently,
* the only option that doesn't cause that is OWNED BY. It's *necessary* for
* ALTER SEQUENCE OWNED BY to not rewrite the sequence, because that would
- * break pg_upgrade by causing unwanted changes in the sequence's relfilenode.
+ * break pg_upgrade by causing unwanted changes in the sequence's
+ * relfilenumber.
*/
static void
init_params(ParseState *pstate, List *options, bool for_identity,
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 2de0eba..bf645b8 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -596,7 +596,7 @@ static void ATExecForceNoForceRowSecurity(Relation rel, bool force_rls);
static ObjectAddress ATExecSetCompression(AlteredTableInfo *tab, Relation rel,
const char *column, Node *newValue, LOCKMODE lockmode);
-static void index_copy_data(Relation rel, RelFileNode newrnode);
+static void index_copy_data(Relation rel, RelFileLocator newrlocator);
static const char *storage_name(char c);
static void RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid,
@@ -1986,12 +1986,12 @@ ExecuteTruncateGuts(List *explicit_rels,
/*
* Normally, we need a transaction-safe truncation here. However, if
* the table was either created in the current (sub)transaction or has
- * a new relfilenode in the current (sub)transaction, then we can just
+ * a new relfilenumber in the current (sub)transaction, then we can just
* truncate it in-place, because a rollback would cause the whole
* table or the current physical file to be thrown away anyway.
*/
if (rel->rd_createSubid == mySubid ||
- rel->rd_newRelfilenodeSubid == mySubid)
+ rel->rd_newRelfilelocatorSubid == mySubid)
{
/* Immediate, non-rollbackable truncation is OK */
heap_truncate_one_rel(rel);
@@ -2014,10 +2014,10 @@ ExecuteTruncateGuts(List *explicit_rels,
* Need the full transaction-safe pushups.
*
* Create a new empty storage file for the relation, and assign it
- * as the relfilenode value. The old storage file is scheduled for
+ * as the relfilenumber value. The old storage file is scheduled for
* deletion at commit.
*/
- RelationSetNewRelfilenode(rel, rel->rd_rel->relpersistence);
+ RelationSetNewRelfilenumber(rel, rel->rd_rel->relpersistence);
heap_relid = RelationGetRelid(rel);
@@ -2030,7 +2030,7 @@ ExecuteTruncateGuts(List *explicit_rels,
Relation toastrel = relation_open(toast_relid,
AccessExclusiveLock);
- RelationSetNewRelfilenode(toastrel,
+ RelationSetNewRelfilenumber(toastrel,
toastrel->rd_rel->relpersistence);
table_close(toastrel, NoLock);
}
@@ -3315,10 +3315,10 @@ CheckRelationTableSpaceMove(Relation rel, Oid newTableSpaceId)
/*
* SetRelationTableSpace
- * Set new reltablespace and relfilenode in pg_class entry.
+ * Set new reltablespace and relfilenumber in pg_class entry.
*
* newTableSpaceId is the new tablespace for the relation, and
- * newRelFileNode its new filenode. If newRelFileNode is InvalidOid,
+ * newRelFilenumber its new filenumber. If newRelFilenumber is InvalidRelFileNumber,
* this field is not updated.
*
* NOTE: The caller must hold AccessExclusiveLock on the relation.
@@ -3331,7 +3331,7 @@ CheckRelationTableSpaceMove(Relation rel, Oid newTableSpaceId)
void
SetRelationTableSpace(Relation rel,
Oid newTableSpaceId,
- Oid newRelFileNode)
+ RelFileNumber newRelFilenumber)
{
Relation pg_class;
HeapTuple tuple;
@@ -3351,8 +3351,8 @@ SetRelationTableSpace(Relation rel,
/* Update the pg_class row. */
rd_rel->reltablespace = (newTableSpaceId == MyDatabaseTableSpace) ?
InvalidOid : newTableSpaceId;
- if (OidIsValid(newRelFileNode))
- rd_rel->relfilenode = newRelFileNode;
+ if (RelFileNumberIsValid(newRelFilenumber))
+ rd_rel->relfilenode = newRelFilenumber;
CatalogTupleUpdate(pg_class, &tuple->t_self, tuple);
/*
@@ -5420,7 +5420,7 @@ ATRewriteTables(AlterTableStmt *parsetree, List **wqueue, LOCKMODE lockmode,
* persistence: on one hand, we need to ensure that the buffers
* belonging to each of the two relations are marked with or without
* BM_PERMANENT properly. On the other hand, since rewriting creates
- * and assigns a new relfilenode, we automatically create or drop an
+ * and assigns a new relfilenumber, we automatically create or drop an
* init fork for the relation as appropriate.
*/
if (tab->rewrite > 0 && tab->relkind != RELKIND_SEQUENCE)
@@ -5506,12 +5506,13 @@ ATRewriteTables(AlterTableStmt *parsetree, List **wqueue, LOCKMODE lockmode,
* Create transient table that will receive the modified data.
*
* Ensure it is marked correctly as logged or unlogged. We have
- * to do this here so that buffers for the new relfilenode will
+ * to do this here so that buffers for the new relfilenumber will
* have the right persistence set, and at the same time ensure
- * that the original filenode's buffers will get read in with the
- * correct setting (i.e. the original one). Otherwise a rollback
- * after the rewrite would possibly result with buffers for the
- * original filenode having the wrong persistence setting.
+ * that the original filenumber's buffers will get read in with
+ * the correct setting (i.e. the original one). Otherwise a
+ * rollback after the rewrite would possibly result in buffers
+ * for the original filenumber having the wrong persistence
+ * setting.
*
* NB: This relies on swap_relation_files() also swapping the
* persistence. That wouldn't work for pg_class, but that can't be
@@ -8597,7 +8598,7 @@ ATExecAddIndex(AlteredTableInfo *tab, Relation rel,
/* suppress schema rights check when rebuilding existing index */
check_rights = !is_rebuild;
/* skip index build if phase 3 will do it or we're reusing an old one */
- skip_build = tab->rewrite > 0 || OidIsValid(stmt->oldNode);
+ skip_build = tab->rewrite > 0 || RelFileNumberIsValid(stmt->oldNumber);
/* suppress notices when rebuilding existing index */
quiet = is_rebuild;
@@ -8613,7 +8614,7 @@ ATExecAddIndex(AlteredTableInfo *tab, Relation rel,
quiet);
/*
- * If TryReuseIndex() stashed a relfilenode for us, we used it for the new
+ * If TryReuseIndex() stashed a relfilenumber for us, we used it for the new
* index instead of building from scratch. Restore associated fields.
* This may store InvalidSubTransactionId in both fields, in which case
* relcache.c will assume it can rebuild the relcache entry. Hence, do
@@ -8621,13 +8622,13 @@ ATExecAddIndex(AlteredTableInfo *tab, Relation rel,
* DROP of the old edition of this index will have scheduled the storage
* for deletion at commit, so cancel that pending deletion.
*/
- if (OidIsValid(stmt->oldNode))
+ if (RelFileNumberIsValid(stmt->oldNumber))
{
Relation irel = index_open(address.objectId, NoLock);
irel->rd_createSubid = stmt->oldCreateSubid;
- irel->rd_firstRelfilenodeSubid = stmt->oldFirstRelfilenodeSubid;
- RelationPreserveStorage(irel->rd_node, true);
+ irel->rd_firstRelfilelocatorSubid = stmt->oldFirstRelfilenumberSubid;
+ RelationPreserveStorage(irel->rd_locator, true);
index_close(irel, NoLock);
}
@@ -13491,9 +13492,9 @@ TryReuseIndex(Oid oldId, IndexStmt *stmt)
/* If it's a partitioned index, there is no storage to share. */
if (irel->rd_rel->relkind != RELKIND_PARTITIONED_INDEX)
{
- stmt->oldNode = irel->rd_node.relNode;
+ stmt->oldNumber = irel->rd_locator.relNumber;
stmt->oldCreateSubid = irel->rd_createSubid;
- stmt->oldFirstRelfilenodeSubid = irel->rd_firstRelfilenodeSubid;
+ stmt->oldFirstRelfilenumberSubid = irel->rd_firstRelfilelocatorSubid;
}
index_close(irel, NoLock);
}
@@ -14340,8 +14341,8 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
{
Relation rel;
Oid reltoastrelid;
- Oid newrelfilenode;
- RelFileNode newrnode;
+ RelFileNumber newrelfilenumber;
+ RelFileLocator newrlocator;
List *reltoastidxids = NIL;
ListCell *lc;
@@ -14370,26 +14371,28 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
}
/*
- * Relfilenodes are not unique in databases across tablespaces, so we need
+ * Relfilenumbers are not unique in databases across tablespaces, so we need
* to allocate a new one in the new tablespace.
*/
- newrelfilenode = GetNewRelFileNode(newTableSpace, NULL,
- rel->rd_rel->relpersistence);
+ newrelfilenumber = GetNewRelFileNumber(newTableSpace, NULL,
+ rel->rd_rel->relpersistence);
/* Open old and new relation */
- newrnode = rel->rd_node;
- newrnode.relNode = newrelfilenode;
- newrnode.spcNode = newTableSpace;
+ newrlocator = rel->rd_locator;
+ newrlocator.relNumber = newrelfilenumber;
+ newrlocator.spcOid = newTableSpace;
- /* hand off to AM to actually create the new filenode and copy the data */
+ /*
+ * hand off to AM to actually create the new filelocator and copy the data
+ */
if (rel->rd_rel->relkind == RELKIND_INDEX)
{
- index_copy_data(rel, newrnode);
+ index_copy_data(rel, newrlocator);
}
else
{
Assert(RELKIND_HAS_TABLE_AM(rel->rd_rel->relkind));
- table_relation_copy_data(rel, &newrnode);
+ table_relation_copy_data(rel, &newrlocator);
}
/*
@@ -14400,11 +14403,11 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
* the updated pg_class entry), but that's forbidden with
* CheckRelationTableSpaceMove().
*/
- SetRelationTableSpace(rel, newTableSpace, newrelfilenode);
+ SetRelationTableSpace(rel, newTableSpace, newrelfilenumber);
InvokeObjectPostAlterHook(RelationRelationId, RelationGetRelid(rel), 0);
- RelationAssumeNewRelfilenode(rel);
+ RelationAssumeNewRelfilelocator(rel);
relation_close(rel, NoLock);
@@ -14630,11 +14633,11 @@ AlterTableMoveAll(AlterTableMoveAllStmt *stmt)
}
static void
-index_copy_data(Relation rel, RelFileNode newrnode)
+index_copy_data(Relation rel, RelFileLocator newrlocator)
{
SMgrRelation dstrel;
- dstrel = smgropen(newrnode, rel->rd_backend);
+ dstrel = smgropen(newrlocator, rel->rd_backend);
/*
* Since we copy the file directly without looking at the shared buffers,
@@ -14648,10 +14651,10 @@ index_copy_data(Relation rel, RelFileNode newrnode)
* Create and copy all forks of the relation, and schedule unlinking of
* old physical files.
*
- * NOTE: any conflict in relfilenode value will be caught in
+ * NOTE: any conflict in relfilelocator value will be caught in
* RelationCreateStorage().
*/
- RelationCreateStorage(newrnode, rel->rd_rel->relpersistence, true);
+ RelationCreateStorage(newrlocator, rel->rd_rel->relpersistence, true);
/* copy main fork */
RelationCopyStorage(RelationGetSmgr(rel), dstrel, MAIN_FORKNUM,
@@ -14672,7 +14675,7 @@ index_copy_data(Relation rel, RelFileNode newrnode)
if (RelationIsPermanent(rel) ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
- log_smgrcreate(&newrnode, forkNum);
+ log_smgrcreate(&newrlocator, forkNum);
RelationCopyStorage(RelationGetSmgr(rel), dstrel, forkNum,
rel->rd_rel->relpersistence);
}
diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index 00ca397..c8bdd99 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -12,12 +12,12 @@
* remove the possibility of having file name conflicts, we isolate
* files within a tablespace into database-specific subdirectories.
*
- * To support file access via the information given in RelFileNode, we
+ * To support file access via the information given in RelFileLocator, we
* maintain a symbolic-link map in $PGDATA/pg_tblspc. The symlinks are
* named by tablespace OIDs and point to the actual tablespace directories.
* There is also a per-cluster version directory in each tablespace.
* Thus the full path to an arbitrary file is
- * $PGDATA/pg_tblspc/spcoid/PG_MAJORVER_CATVER/dboid/relfilenode
+ * $PGDATA/pg_tblspc/spcoid/PG_MAJORVER_CATVER/dboid/relfilenumber
* e.g.
* $PGDATA/pg_tblspc/20981/PG_9.0_201002161/719849/83292814
*
@@ -25,8 +25,8 @@
* tables) and pg_default (for everything else). For backwards compatibility
* and to remain functional on platforms without symlinks, these tablespaces
* are accessed specially: they are respectively
- * $PGDATA/global/relfilenode
- * $PGDATA/base/dboid/relfilenode
+ * $PGDATA/global/relfilenumber
+ * $PGDATA/base/dboid/relfilenumber
*
* To allow CREATE DATABASE to give a new database a default tablespace
* that's different from the template database's default, we make the
@@ -115,7 +115,7 @@ static bool destroy_tablespace_directories(Oid tablespaceoid, bool redo);
* re-create a database subdirectory (of $PGDATA/base) during WAL replay.
*/
void
-TablespaceCreateDbspace(Oid spcNode, Oid dbNode, bool isRedo)
+TablespaceCreateDbspace(Oid spcOid, Oid dbOid, bool isRedo)
{
struct stat st;
char *dir;
@@ -124,13 +124,13 @@ TablespaceCreateDbspace(Oid spcNode, Oid dbNode, bool isRedo)
* The global tablespace doesn't have per-database subdirectories, so
* nothing to do for it.
*/
- if (spcNode == GLOBALTABLESPACE_OID)
+ if (spcOid == GLOBALTABLESPACE_OID)
return;
- Assert(OidIsValid(spcNode));
- Assert(OidIsValid(dbNode));
+ Assert(OidIsValid(spcOid));
+ Assert(OidIsValid(dbOid));
- dir = GetDatabasePath(dbNode, spcNode);
+ dir = GetDatabasePath(dbOid, spcOid);
if (stat(dir, &st) < 0)
{
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 51d630f..7d50b50 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4193,9 +4193,9 @@ _copyIndexStmt(const IndexStmt *from)
COPY_NODE_FIELD(excludeOpNames);
COPY_STRING_FIELD(idxcomment);
COPY_SCALAR_FIELD(indexOid);
- COPY_SCALAR_FIELD(oldNode);
+ COPY_SCALAR_FIELD(oldNumber);
COPY_SCALAR_FIELD(oldCreateSubid);
- COPY_SCALAR_FIELD(oldFirstRelfilenodeSubid);
+ COPY_SCALAR_FIELD(oldFirstRelfilenumberSubid);
COPY_SCALAR_FIELD(unique);
COPY_SCALAR_FIELD(nulls_not_distinct);
COPY_SCALAR_FIELD(primary);
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index e747e16..d63d326 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1752,9 +1752,9 @@ _equalIndexStmt(const IndexStmt *a, const IndexStmt *b)
COMPARE_NODE_FIELD(excludeOpNames);
COMPARE_STRING_FIELD(idxcomment);
COMPARE_SCALAR_FIELD(indexOid);
- COMPARE_SCALAR_FIELD(oldNode);
+ COMPARE_SCALAR_FIELD(oldNumber);
COMPARE_SCALAR_FIELD(oldCreateSubid);
- COMPARE_SCALAR_FIELD(oldFirstRelfilenodeSubid);
+ COMPARE_SCALAR_FIELD(oldFirstRelfilenumberSubid);
COMPARE_SCALAR_FIELD(unique);
COMPARE_SCALAR_FIELD(nulls_not_distinct);
COMPARE_SCALAR_FIELD(primary);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index ce12915..3724d48 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2928,9 +2928,9 @@ _outIndexStmt(StringInfo str, const IndexStmt *node)
WRITE_NODE_FIELD(excludeOpNames);
WRITE_STRING_FIELD(idxcomment);
WRITE_OID_FIELD(indexOid);
- WRITE_OID_FIELD(oldNode);
+ WRITE_OID_FIELD(oldNumber);
WRITE_UINT_FIELD(oldCreateSubid);
- WRITE_UINT_FIELD(oldFirstRelfilenodeSubid);
+ WRITE_UINT_FIELD(oldFirstRelfilenumberSubid);
WRITE_BOOL_FIELD(unique);
WRITE_BOOL_FIELD(nulls_not_distinct);
WRITE_BOOL_FIELD(primary);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 969c9c1..394404d 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -7990,9 +7990,9 @@ IndexStmt: CREATE opt_unique INDEX opt_concurrently opt_index_name
n->excludeOpNames = NIL;
n->idxcomment = NULL;
n->indexOid = InvalidOid;
- n->oldNode = InvalidOid;
+ n->oldNumber = InvalidRelFileNumber;
n->oldCreateSubid = InvalidSubTransactionId;
- n->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ n->oldFirstRelfilenumberSubid = InvalidSubTransactionId;
n->primary = false;
n->isconstraint = false;
n->deferrable = false;
@@ -8022,9 +8022,9 @@ IndexStmt: CREATE opt_unique INDEX opt_concurrently opt_index_name
n->excludeOpNames = NIL;
n->idxcomment = NULL;
n->indexOid = InvalidOid;
- n->oldNode = InvalidOid;
+ n->oldNumber = InvalidRelFileNumber;
n->oldCreateSubid = InvalidSubTransactionId;
- n->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ n->oldFirstRelfilenumberSubid = InvalidSubTransactionId;
n->primary = false;
n->isconstraint = false;
n->deferrable = false;
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index 1a64a52..390b454 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -1578,9 +1578,9 @@ generateClonedIndexStmt(RangeVar *heapRel, Relation source_idx,
index->excludeOpNames = NIL;
index->idxcomment = NULL;
index->indexOid = InvalidOid;
- index->oldNode = InvalidOid;
+ index->oldNumber = InvalidRelFileNumber;
index->oldCreateSubid = InvalidSubTransactionId;
- index->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ index->oldFirstRelfilenumberSubid = InvalidSubTransactionId;
index->unique = idxrec->indisunique;
index->nulls_not_distinct = idxrec->indnullsnotdistinct;
index->primary = idxrec->indisprimary;
@@ -2201,9 +2201,9 @@ transformIndexConstraint(Constraint *constraint, CreateStmtContext *cxt)
index->excludeOpNames = NIL;
index->idxcomment = NULL;
index->indexOid = InvalidOid;
- index->oldNode = InvalidOid;
+ index->oldNumber = InvalidRelFileNumber;
index->oldCreateSubid = InvalidSubTransactionId;
- index->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ index->oldFirstRelfilenumberSubid = InvalidSubTransactionId;
index->transformed = false;
index->concurrent = false;
index->if_not_exists = false;
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index c937c39..5fc076f 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -1207,7 +1207,7 @@ CompactCheckpointerRequestQueue(void)
* We use the request struct directly as a hashtable key. This
* assumes that any padding bytes in the structs are consistently the
* same, which should be okay because we zeroed them in
- * CheckpointerShmemInit. Note also that RelFileNode had better
+ * CheckpointerShmemInit. Note also that RelFileLocator had better
* contain no pad bytes.
*/
request = &CheckpointerShmem->requests[n];
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index aa2427b..c5c6a2b 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -845,7 +845,7 @@ DecodeInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
XLogReaderState *r = buf->record;
xl_heap_insert *xlrec;
ReorderBufferChange *change;
- RelFileNode target_node;
+ RelFileLocator target_locator;
xlrec = (xl_heap_insert *) XLogRecGetData(r);
@@ -857,8 +857,8 @@ DecodeInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
return;
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
- if (target_node.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &target_locator, NULL, NULL);
+ if (target_locator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -872,7 +872,7 @@ DecodeInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change->action = REORDER_BUFFER_CHANGE_INTERNAL_SPEC_INSERT;
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &target_node, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &target_locator, sizeof(RelFileLocator));
tupledata = XLogRecGetBlockData(r, 0, &datalen);
tuplelen = datalen - SizeOfHeapHeader;
@@ -902,13 +902,13 @@ DecodeUpdate(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
xl_heap_update *xlrec;
ReorderBufferChange *change;
char *data;
- RelFileNode target_node;
+ RelFileLocator target_locator;
xlrec = (xl_heap_update *) XLogRecGetData(r);
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
- if (target_node.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &target_locator, NULL, NULL);
+ if (target_locator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -918,7 +918,7 @@ DecodeUpdate(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change = ReorderBufferGetChange(ctx->reorder);
change->action = REORDER_BUFFER_CHANGE_UPDATE;
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &target_node, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &target_locator, sizeof(RelFileLocator));
if (xlrec->flags & XLH_UPDATE_CONTAINS_NEW_TUPLE)
{
@@ -968,13 +968,13 @@ DecodeDelete(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
XLogReaderState *r = buf->record;
xl_heap_delete *xlrec;
ReorderBufferChange *change;
- RelFileNode target_node;
+ RelFileLocator target_locator;
xlrec = (xl_heap_delete *) XLogRecGetData(r);
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
- if (target_node.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &target_locator, NULL, NULL);
+ if (target_locator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -990,7 +990,7 @@ DecodeDelete(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &target_node, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &target_locator, sizeof(RelFileLocator));
/* old primary key stored */
if (xlrec->flags & XLH_DELETE_CONTAINS_OLD)
@@ -1063,7 +1063,7 @@ DecodeMultiInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
char *data;
char *tupledata;
Size tuplelen;
- RelFileNode rnode;
+ RelFileLocator rlocator;
xlrec = (xl_heap_multi_insert *) XLogRecGetData(r);
@@ -1075,8 +1075,8 @@ DecodeMultiInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
return;
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &rnode, NULL, NULL);
- if (rnode.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &rlocator, NULL, NULL);
+ if (rlocator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -1103,7 +1103,7 @@ DecodeMultiInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change->action = REORDER_BUFFER_CHANGE_INSERT;
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &rnode, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &rlocator, sizeof(RelFileLocator));
xlhdr = (xl_multi_insert_tuple *) SHORTALIGN(data);
data = ((char *) xlhdr) + SizeOfMultiInsertTuple;
@@ -1165,11 +1165,11 @@ DecodeSpecConfirm(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
{
XLogReaderState *r = buf->record;
ReorderBufferChange *change;
- RelFileNode target_node;
+ RelFileLocator target_locator;
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
- if (target_node.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &target_locator, NULL, NULL);
+ if (target_locator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -1180,7 +1180,7 @@ DecodeSpecConfirm(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change->action = REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM;
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &target_node, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &target_locator, sizeof(RelFileLocator));
change->data.tp.clear_toast_afterwards = true;
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 8da5f90..f8fb228 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -106,7 +106,7 @@
#include "utils/memdebug.h"
#include "utils/memutils.h"
#include "utils/rel.h"
-#include "utils/relfilenodemap.h"
+#include "utils/relfilenumbermap.h"
/* entry for a hash table we use to map from xid to our transaction state */
@@ -116,10 +116,10 @@ typedef struct ReorderBufferTXNByIdEnt
ReorderBufferTXN *txn;
} ReorderBufferTXNByIdEnt;
-/* data structures for (relfilenode, ctid) => (cmin, cmax) mapping */
+/* data structures for (relfilelocator, ctid) => (cmin, cmax) mapping */
typedef struct ReorderBufferTupleCidKey
{
- RelFileNode relnode;
+ RelFileLocator rlocator;
ItemPointerData tid;
} ReorderBufferTupleCidKey;
@@ -1643,7 +1643,7 @@ ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, bool txn_prep
}
/*
- * Destroy the (relfilenode, ctid) hashtable, so that we don't leak any
+ * Destroy the (relfilelocator, ctid) hashtable, so that we don't leak any
* memory. We could also keep the hash table and update it with new ctid
* values, but this seems simpler and good enough for now.
*/
@@ -1673,7 +1673,7 @@ ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, bool txn_prep
}
/*
- * Build a hash with a (relfilenode, ctid) -> (cmin, cmax) mapping for use by
+ * Build a hash with a (relfilelocator, ctid) -> (cmin, cmax) mapping for use by
* HeapTupleSatisfiesHistoricMVCC.
*/
static void
@@ -1711,7 +1711,7 @@ ReorderBufferBuildTupleCidHash(ReorderBuffer *rb, ReorderBufferTXN *txn)
/* be careful about padding */
memset(&key, 0, sizeof(ReorderBufferTupleCidKey));
- key.relnode = change->data.tuplecid.node;
+ key.rlocator = change->data.tuplecid.locator;
ItemPointerCopy(&change->data.tuplecid.tid,
&key.tid);
@@ -2140,36 +2140,36 @@ ReorderBufferProcessTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
case REORDER_BUFFER_CHANGE_DELETE:
Assert(snapshot_now);
- reloid = RelidByRelfilenode(change->data.tp.relnode.spcNode,
- change->data.tp.relnode.relNode);
+ reloid = RelidByRelfilenumber(change->data.tp.rlocator.spcOid,
+ change->data.tp.rlocator.relNumber);
/*
* Mapped catalog tuple without data, emitted while
* catalog table was in the process of being rewritten. We
- * can fail to look up the relfilenode, because the
+ * can fail to look up the relfilenumber, because the
* relmapper has no "historic" view, in contrast to the
* normal catalog during decoding. Thus repeated rewrites
* can cause a lookup failure. That's OK because we do not
* decode catalog changes anyway. Normally such tuples
* would be skipped over below, but we can't identify
* whether the table should be logically logged without
- * mapping the relfilenode to the oid.
+ * mapping the relfilenumber to the oid.
*/
if (reloid == InvalidOid &&
change->data.tp.newtuple == NULL &&
change->data.tp.oldtuple == NULL)
goto change_done;
else if (reloid == InvalidOid)
- elog(ERROR, "could not map filenode \"%s\" to relation OID",
- relpathperm(change->data.tp.relnode,
+ elog(ERROR, "could not map filenumber \"%s\" to relation OID",
+ relpathperm(change->data.tp.rlocator,
MAIN_FORKNUM));
relation = RelationIdGetRelation(reloid);
if (!RelationIsValid(relation))
- elog(ERROR, "could not open relation with OID %u (for filenode \"%s\")",
+ elog(ERROR, "could not open relation with OID %u (for filenumber \"%s\")",
reloid,
- relpathperm(change->data.tp.relnode,
+ relpathperm(change->data.tp.rlocator,
MAIN_FORKNUM));
if (!RelationIsLogicallyLogged(relation))
@@ -3157,7 +3157,7 @@ ReorderBufferChangeMemoryUpdate(ReorderBuffer *rb,
}
/*
- * Add new (relfilenode, tid) -> (cmin, cmax) mappings.
+ * Add new (relfilelocator, tid) -> (cmin, cmax) mappings.
*
* We do not include this change type in memory accounting, because we
* keep CIDs in a separate list and do not evict them when reaching
@@ -3165,7 +3165,7 @@ ReorderBufferChangeMemoryUpdate(ReorderBuffer *rb,
*/
void
ReorderBufferAddNewTupleCids(ReorderBuffer *rb, TransactionId xid,
- XLogRecPtr lsn, RelFileNode node,
+ XLogRecPtr lsn, RelFileLocator locator,
ItemPointerData tid, CommandId cmin,
CommandId cmax, CommandId combocid)
{
@@ -3174,7 +3174,7 @@ ReorderBufferAddNewTupleCids(ReorderBuffer *rb, TransactionId xid,
txn = ReorderBufferTXNByXid(rb, xid, true, NULL, lsn, true);
- change->data.tuplecid.node = node;
+ change->data.tuplecid.locator = locator;
change->data.tuplecid.tid = tid;
change->data.tuplecid.cmin = cmin;
change->data.tuplecid.cmax = cmax;
@@ -4839,7 +4839,7 @@ ReorderBufferToastReset(ReorderBuffer *rb, ReorderBufferTXN *txn)
* need anymore.
*
* To resolve those problems we have a per-transaction hash of (cmin,
- * cmax) tuples keyed by (relfilenode, ctid) which contains the actual
+ * cmax) tuples keyed by (relfilelocator, ctid) which contains the actual
* (cmin, cmax) values. That also takes care of combo CIDs by simply
* not caring about them at all. As we have the real cmin/cmax values
* combo CIDs aren't interesting.
@@ -4870,9 +4870,9 @@ DisplayMapping(HTAB *tuplecid_data)
while ((ent = (ReorderBufferTupleCidEnt *) hash_seq_search(&hstat)) != NULL)
{
elog(DEBUG3, "mapping: node: %u/%u/%u tid: %u/%u cmin: %u, cmax: %u",
- ent->key.relnode.dbNode,
- ent->key.relnode.spcNode,
- ent->key.relnode.relNode,
+ ent->key.rlocator.dbOid,
+ ent->key.rlocator.spcOid,
+ ent->key.rlocator.relNumber,
ItemPointerGetBlockNumber(&ent->key.tid),
ItemPointerGetOffsetNumber(&ent->key.tid),
ent->cmin,
@@ -4932,7 +4932,7 @@ ApplyLogicalMappingFile(HTAB *tuplecid_data, Oid relid, const char *fname)
path, readBytes,
(int32) sizeof(LogicalRewriteMappingData))));
- key.relnode = map.old_node;
+ key.rlocator = map.old_locator;
ItemPointerCopy(&map.old_tid,
&key.tid);
@@ -4947,7 +4947,7 @@ ApplyLogicalMappingFile(HTAB *tuplecid_data, Oid relid, const char *fname)
if (!ent)
continue;
- key.relnode = map.new_node;
+ key.rlocator = map.new_locator;
ItemPointerCopy(&map.new_tid,
&key.tid);
@@ -5120,10 +5120,10 @@ ResolveCminCmaxDuringDecoding(HTAB *tuplecid_data,
Assert(!BufferIsLocal(buffer));
/*
- * get relfilenode from the buffer, no convenient way to access it other
+ * get relfilelocator from the buffer, no convenient way to access it other
* than that.
*/
- BufferGetTag(buffer, &key.relnode, &forkno, &blockno);
+ BufferGetTag(buffer, &key.rlocator, &forkno, &blockno);
/* tuples can only be in the main fork */
Assert(forkno == MAIN_FORKNUM);
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 1119a12..73c0f15 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -781,7 +781,7 @@ SnapBuildProcessNewCid(SnapBuild *builder, TransactionId xid,
ReorderBufferXidSetCatalogChanges(builder->reorder, xid, lsn);
ReorderBufferAddNewTupleCids(builder->reorder, xlrec->top_xid, lsn,
- xlrec->target_node, xlrec->target_tid,
+ xlrec->target_locator, xlrec->target_tid,
xlrec->cmin, xlrec->cmax,
xlrec->combocid);
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index ae13011..7071ff6 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -121,12 +121,12 @@ typedef struct CkptTsStatus
* Type for array used to sort SMgrRelations
*
* FlushRelationsAllBuffers shares the same comparator function with
- * DropRelFileNodesAllBuffers. Pointer to this struct and RelFileNode must be
+ * DropRelFileLocatorsAllBuffers. Pointer to this struct and RelFileLocator must be
* compatible.
*/
typedef struct SMgrSortArray
{
- RelFileNode rnode; /* This must be the first member */
+ RelFileLocator rlocator; /* This must be the first member */
SMgrRelation srel;
} SMgrSortArray;
@@ -483,7 +483,7 @@ static BufferDesc *BufferAlloc(SMgrRelation smgr,
BufferAccessStrategy strategy,
bool *foundPtr);
static void FlushBuffer(BufferDesc *buf, SMgrRelation reln);
-static void FindAndDropRelFileNodeBuffers(RelFileNode rnode,
+static void FindAndDropRelFileLocatorBuffers(RelFileLocator rlocator,
ForkNumber forkNum,
BlockNumber nForkBlock,
BlockNumber firstDelBlock);
@@ -492,7 +492,7 @@ static void RelationCopyStorageUsingBuffer(Relation src, Relation dst,
bool isunlogged);
static void AtProcExit_Buffers(int code, Datum arg);
static void CheckForBufferLeaks(void);
-static int rnode_comparator(const void *p1, const void *p2);
+static int rlocator_comparator(const void *p1, const void *p2);
static inline int buffertag_comparator(const BufferTag *a, const BufferTag *b);
static inline int ckpt_buforder_comparator(const CkptSortItem *a, const CkptSortItem *b);
static int ts_ckpt_progress_comparator(Datum a, Datum b, void *arg);
@@ -515,7 +515,7 @@ PrefetchSharedBuffer(SMgrRelation smgr_reln,
Assert(BlockNumberIsValid(blockNum));
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr_reln->smgr_rnode.node,
+ INIT_BUFFERTAG(newTag, smgr_reln->smgr_rlocator.locator,
forkNum, blockNum);
/* determine its hash code and partition lock ID */
@@ -620,7 +620,7 @@ PrefetchBuffer(Relation reln, ForkNumber forkNum, BlockNumber blockNum)
* tag. In that case, the buffer is pinned and the usage count is bumped.
*/
bool
-ReadRecentBuffer(RelFileNode rnode, ForkNumber forkNum, BlockNumber blockNum,
+ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockNum,
Buffer recent_buffer)
{
BufferDesc *bufHdr;
@@ -632,7 +632,7 @@ ReadRecentBuffer(RelFileNode rnode, ForkNumber forkNum, BlockNumber blockNum,
ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
ReservePrivateRefCountEntry();
- INIT_BUFFERTAG(tag, rnode, forkNum, blockNum);
+ INIT_BUFFERTAG(tag, rlocator, forkNum, blockNum);
if (BufferIsLocal(recent_buffer))
{
@@ -786,13 +786,13 @@ ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum,
* BackendId).
*/
Buffer
-ReadBufferWithoutRelcache(RelFileNode rnode, ForkNumber forkNum,
+ReadBufferWithoutRelcache(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber blockNum, ReadBufferMode mode,
BufferAccessStrategy strategy, bool permanent)
{
bool hit;
- SMgrRelation smgr = smgropen(rnode, InvalidBackendId);
+ SMgrRelation smgr = smgropen(rlocator, InvalidBackendId);
return ReadBuffer_common(smgr, permanent ? RELPERSISTENCE_PERMANENT :
RELPERSISTENCE_UNLOGGED, forkNum, blockNum,
@@ -824,10 +824,10 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
isExtend = (blockNum == P_NEW);
TRACE_POSTGRESQL_BUFFER_READ_START(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode,
- smgr->smgr_rnode.backend,
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber,
+ smgr->smgr_rlocator.backend,
isExtend);
/* Substitute proper block number if caller asked for P_NEW */
@@ -839,7 +839,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
ereport(ERROR,
(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
errmsg("cannot extend relation %s beyond %u blocks",
- relpath(smgr->smgr_rnode, forkNum),
+ relpath(smgr->smgr_rlocator, forkNum),
P_NEW)));
}
@@ -886,10 +886,10 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
VacuumCostBalance += VacuumCostPageHit;
TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode,
- smgr->smgr_rnode.backend,
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber,
+ smgr->smgr_rlocator.backend,
isExtend,
found);
@@ -926,7 +926,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
if (!PageIsNew((Page) bufBlock))
ereport(ERROR,
(errmsg("unexpected data beyond EOF in block %u of relation %s",
- blockNum, relpath(smgr->smgr_rnode, forkNum)),
+ blockNum, relpath(smgr->smgr_rlocator, forkNum)),
errhint("This has been seen to occur with buggy kernels; consider updating your system.")));
/*
@@ -1028,7 +1028,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
(errcode(ERRCODE_DATA_CORRUPTED),
errmsg("invalid page in block %u of relation %s; zeroing out page",
blockNum,
- relpath(smgr->smgr_rnode, forkNum))));
+ relpath(smgr->smgr_rlocator, forkNum))));
MemSet((char *) bufBlock, 0, BLCKSZ);
}
else
@@ -1036,7 +1036,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
(errcode(ERRCODE_DATA_CORRUPTED),
errmsg("invalid page in block %u of relation %s",
blockNum,
- relpath(smgr->smgr_rnode, forkNum))));
+ relpath(smgr->smgr_rlocator, forkNum))));
}
}
}
@@ -1076,10 +1076,10 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
VacuumCostBalance += VacuumCostPageMiss;
TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode,
- smgr->smgr_rnode.backend,
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber,
+ smgr->smgr_rlocator.backend,
isExtend,
found);
@@ -1124,7 +1124,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
/* determine its hash code and partition lock ID */
newHash = BufTableHashCode(&newTag);
@@ -1255,9 +1255,9 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
/* OK, do the I/O */
TRACE_POSTGRESQL_BUFFER_WRITE_DIRTY_START(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode);
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber);
FlushBuffer(buf, NULL);
LWLockRelease(BufferDescriptorGetContentLock(buf));
@@ -1266,9 +1266,9 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
&buf->tag);
TRACE_POSTGRESQL_BUFFER_WRITE_DIRTY_DONE(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode);
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber);
}
else
{
@@ -1647,7 +1647,7 @@ ReleaseAndReadBuffer(Buffer buffer,
{
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
bufHdr->tag.forkNum == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
@@ -1658,7 +1658,7 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetBufferDescriptor(buffer - 1);
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
bufHdr->tag.forkNum == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
@@ -2000,8 +2000,8 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
- item->tsId = bufHdr->tag.rnode.spcNode;
- item->relNode = bufHdr->tag.rnode.relNode;
+ item->tsId = bufHdr->tag.rlocator.spcOid;
+ item->relNumber = bufHdr->tag.rlocator.relNumber;
item->forkNum = bufHdr->tag.forkNum;
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2708,7 +2708,7 @@ PrintBufferLeakWarning(Buffer buffer)
}
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rnode, backend, buf->tag.forkNum);
+ path = relpathbackend(buf->tag.rlocator, backend, buf->tag.forkNum);
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2769,11 +2769,11 @@ BufferGetBlockNumber(Buffer buffer)
/*
* BufferGetTag
- * Returns the relfilenode, fork number and block number associated with
+ * Returns the relfilelocator, fork number and block number associated with
* a buffer.
*/
void
-BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
+BufferGetTag(Buffer buffer, RelFileLocator *rlocator, ForkNumber *forknum,
BlockNumber *blknum)
{
BufferDesc *bufHdr;
@@ -2787,7 +2787,7 @@ BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
bufHdr = GetBufferDescriptor(buffer - 1);
/* pinned, so OK to read tag without spinlock */
- *rnode = bufHdr->tag.rnode;
+ *rlocator = bufHdr->tag.rlocator;
*forknum = bufHdr->tag.forkNum;
*blknum = bufHdr->tag.blockNum;
}
@@ -2838,13 +2838,13 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rnode, InvalidBackendId);
+ reln = smgropen(buf->tag.rlocator, InvalidBackendId);
TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
buf->tag.blockNum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode);
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber);
buf_state = LockBufHdr(buf);
@@ -2922,9 +2922,9 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(buf->tag.forkNum,
buf->tag.blockNum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode);
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber);
/* Pop the error context stack */
error_context_stack = errcallback.previous;
@@ -3026,7 +3026,7 @@ BufferGetLSNAtomic(Buffer buffer)
}
/* ---------------------------------------------------------------------
- * DropRelFileNodeBuffers
+ * DropRelFileLocatorBuffers
*
* This function removes from the buffer pool all the pages of the
* specified relation forks that have block numbers >= firstDelBlock.
@@ -3047,24 +3047,24 @@ BufferGetLSNAtomic(Buffer buffer)
* --------------------------------------------------------------------
*/
void
-DropRelFileNodeBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
+DropRelFileLocatorBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
int nforks, BlockNumber *firstDelBlock)
{
int i;
int j;
- RelFileNodeBackend rnode;
+ RelFileLocatorBackend rlocator;
BlockNumber nForkBlock[MAX_FORKNUM];
uint64 nBlocksToInvalidate = 0;
- rnode = smgr_reln->smgr_rnode;
+ rlocator = smgr_reln->smgr_rlocator;
/* If it's a local relation, it's localbuf.c's problem. */
- if (RelFileNodeBackendIsTemp(rnode))
+ if (RelFileLocatorBackendIsTemp(rlocator))
{
- if (rnode.backend == MyBackendId)
+ if (rlocator.backend == MyBackendId)
{
for (j = 0; j < nforks; j++)
- DropRelFileNodeLocalBuffers(rnode.node, forkNum[j],
+ DropRelFileLocatorLocalBuffers(rlocator.locator, forkNum[j],
firstDelBlock[j]);
}
return;
@@ -3115,7 +3115,7 @@ DropRelFileNodeBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
nBlocksToInvalidate < BUF_DROP_FULL_SCAN_THRESHOLD)
{
for (j = 0; j < nforks; j++)
- FindAndDropRelFileNodeBuffers(rnode.node, forkNum[j],
+ FindAndDropRelFileLocatorBuffers(rlocator.locator, forkNum[j],
nForkBlock[j], firstDelBlock[j]);
return;
}
@@ -3138,17 +3138,17 @@ DropRelFileNodeBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
* false positives are safe because we'll recheck after getting the
* buffer lock.
*
- * We could check forkNum and blockNum as well as the rnode, but the
+ * We could check forkNum and blockNum as well as the rlocator, but the
* incremental win from doing so seems small.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rnode.node))
+ if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator))
continue;
buf_state = LockBufHdr(bufHdr);
for (j = 0; j < nforks; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator) &&
bufHdr->tag.forkNum == forkNum[j] &&
bufHdr->tag.blockNum >= firstDelBlock[j])
{
@@ -3162,16 +3162,16 @@ DropRelFileNodeBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
}
/* ---------------------------------------------------------------------
- * DropRelFileNodesAllBuffers
+ * DropRelFileLocatorsAllBuffers
*
* This function removes from the buffer pool all the pages of all
* forks of the specified relations. It's equivalent to calling
- * DropRelFileNodeBuffers once per fork per relation with
+ * DropRelFileLocatorBuffers once per fork per relation with
* firstDelBlock = 0.
* --------------------------------------------------------------------
*/
void
-DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
+DropRelFileLocatorsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
{
int i;
int j;
@@ -3179,22 +3179,22 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
SMgrRelation *rels;
BlockNumber (*block)[MAX_FORKNUM + 1];
uint64 nBlocksToInvalidate = 0;
- RelFileNode *nodes;
+ RelFileLocator *locators;
bool cached = true;
bool use_bsearch;
- if (nnodes == 0)
+ if (nlocators == 0)
return;
- rels = palloc(sizeof(SMgrRelation) * nnodes); /* non-local relations */
+ rels = palloc(sizeof(SMgrRelation) * nlocators); /* non-local relations */
/* If it's a local relation, it's localbuf.c's problem. */
- for (i = 0; i < nnodes; i++)
+ for (i = 0; i < nlocators; i++)
{
- if (RelFileNodeBackendIsTemp(smgr_reln[i]->smgr_rnode))
+ if (RelFileLocatorBackendIsTemp(smgr_reln[i]->smgr_rlocator))
{
- if (smgr_reln[i]->smgr_rnode.backend == MyBackendId)
- DropRelFileNodeAllLocalBuffers(smgr_reln[i]->smgr_rnode.node);
+ if (smgr_reln[i]->smgr_rlocator.backend == MyBackendId)
+ DropRelFileLocatorAllLocalBuffers(smgr_reln[i]->smgr_rlocator.locator);
}
else
rels[n++] = smgr_reln[i];
@@ -3219,7 +3219,7 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
/*
* We can avoid scanning the entire buffer pool if we know the exact size
- * of each of the given relation forks. See DropRelFileNodeBuffers.
+ * of each of the given relation forks. See DropRelFileLocatorBuffers.
*/
for (i = 0; i < n && cached; i++)
{
@@ -3257,7 +3257,7 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
continue;
/* drop all the buffers for a particular relation fork */
- FindAndDropRelFileNodeBuffers(rels[i]->smgr_rnode.node,
+ FindAndDropRelFileLocatorBuffers(rels[i]->smgr_rlocator.locator,
j, block[i][j], 0);
}
}
@@ -3268,9 +3268,9 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
}
pfree(block);
- nodes = palloc(sizeof(RelFileNode) * n); /* non-local relations */
+ locators = palloc(sizeof(RelFileLocator) * n); /* non-local relations */
for (i = 0; i < n; i++)
- nodes[i] = rels[i]->smgr_rnode.node;
+ locators[i] = rels[i]->smgr_rlocator.locator;
/*
* For low number of relations to drop just use a simple walk through, to
@@ -3280,18 +3280,18 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
*/
use_bsearch = n > RELS_BSEARCH_THRESHOLD;
- /* sort the list of rnodes if necessary */
+ /* sort the list of rlocators if necessary */
if (use_bsearch)
- pg_qsort(nodes, n, sizeof(RelFileNode), rnode_comparator);
+ pg_qsort(locators, n, sizeof(RelFileLocator), rlocator_comparator);
for (i = 0; i < NBuffers; i++)
{
- RelFileNode *rnode = NULL;
+ RelFileLocator *rlocator = NULL;
BufferDesc *bufHdr = GetBufferDescriptor(i);
uint32 buf_state;
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
@@ -3301,37 +3301,37 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
for (j = 0; j < n; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, nodes[j]))
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, locators[j]))
{
- rnode = &nodes[j];
+ rlocator = &locators[j];
break;
}
}
}
else
{
- rnode = bsearch((const void *) &(bufHdr->tag.rnode),
- nodes, n, sizeof(RelFileNode),
- rnode_comparator);
+ rlocator = bsearch((const void *) &(bufHdr->tag.rlocator),
+ locators, n, sizeof(RelFileLocator),
+ rlocator_comparator);
}
- /* buffer doesn't belong to any of the given relfilenodes; skip it */
- if (rnode == NULL)
+ /* buffer doesn't belong to any of the given relfilelocators; skip it */
+ if (rlocator == NULL)
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, (*rnode)))
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, (*rlocator)))
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
}
- pfree(nodes);
+ pfree(locators);
pfree(rels);
}
/* ---------------------------------------------------------------------
- * FindAndDropRelFileNodeBuffers
+ * FindAndDropRelFileLocatorBuffers
*
* This function performs look up in BufMapping table and removes from the
* buffer pool all the pages of the specified relation fork that has block
@@ -3340,9 +3340,9 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
* --------------------------------------------------------------------
*/
static void
-FindAndDropRelFileNodeBuffers(RelFileNode rnode, ForkNumber forkNum,
- BlockNumber nForkBlock,
- BlockNumber firstDelBlock)
+FindAndDropRelFileLocatorBuffers(RelFileLocator rlocator, ForkNumber forkNum,
+ BlockNumber nForkBlock,
+ BlockNumber firstDelBlock)
{
BlockNumber curBlock;
@@ -3356,7 +3356,7 @@ FindAndDropRelFileNodeBuffers(RelFileNode rnode, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(bufTag, rnode, forkNum, curBlock);
+ INIT_BUFFERTAG(bufTag, rlocator, forkNum, curBlock);
/* determine its hash code and partition lock ID */
bufHash = BufTableHashCode(&bufTag);
@@ -3380,7 +3380,7 @@ FindAndDropRelFileNodeBuffers(RelFileNode rnode, ForkNumber forkNum,
*/
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -3397,7 +3397,7 @@ FindAndDropRelFileNodeBuffers(RelFileNode rnode, ForkNumber forkNum,
* bothering to write them out first. This is used when we destroy a
* database, to avoid trying to flush data to disk when the directory
* tree no longer exists. Implementation is pretty similar to
- * DropRelFileNodeBuffers() which is for destroying just one relation.
+ * DropRelFileLocatorBuffers() which is for destroying just one relation.
* --------------------------------------------------------------------
*/
void
@@ -3416,14 +3416,14 @@ DropDatabaseBuffers(Oid dbid)
uint32 buf_state;
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.rlocator.dbOid != dbid)
continue;
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid)
+ if (bufHdr->tag.rlocator.dbOid == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3453,7 +3453,7 @@ PrintBufferDescs(void)
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathbackend(buf->tag.rnode, InvalidBackendId, buf->tag.forkNum),
+ relpathbackend(buf->tag.rlocator, InvalidBackendId, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3478,7 +3478,7 @@ PrintPinnedBufs(void)
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathperm(buf->tag.rnode, buf->tag.forkNum),
+ relpathperm(buf->tag.rlocator, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3517,7 +3517,7 @@ FlushRelationBuffers(Relation rel)
uint32 buf_state;
bufHdr = GetLocalBufferDescriptor(i);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
((buf_state = pg_atomic_read_u32(&bufHdr->state)) &
(BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
@@ -3561,16 +3561,16 @@ FlushRelationBuffers(Relation rel)
bufHdr = GetBufferDescriptor(i);
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node))
+ if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator))
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3608,21 +3608,21 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
for (i = 0; i < nrels; i++)
{
- Assert(!RelFileNodeBackendIsTemp(smgrs[i]->smgr_rnode));
+ Assert(!RelFileLocatorBackendIsTemp(smgrs[i]->smgr_rlocator));
- srels[i].rnode = smgrs[i]->smgr_rnode.node;
+ srels[i].rlocator = smgrs[i]->smgr_rlocator.locator;
srels[i].srel = smgrs[i];
}
/*
* Save the bsearch overhead for low number of relations to sync. See
- * DropRelFileNodesAllBuffers for details.
+ * DropRelFileLocatorsAllBuffers for details.
*/
use_bsearch = nrels > RELS_BSEARCH_THRESHOLD;
/* sort the list of SMgrRelations if necessary */
if (use_bsearch)
- pg_qsort(srels, nrels, sizeof(SMgrSortArray), rnode_comparator);
+ pg_qsort(srels, nrels, sizeof(SMgrSortArray), rlocator_comparator);
/* Make sure we can handle the pin inside the loop */
ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
@@ -3634,7 +3634,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
uint32 buf_state;
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
@@ -3644,7 +3644,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
for (j = 0; j < nrels; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, srels[j].rnode))
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, srels[j].rlocator))
{
srelent = &srels[j];
break;
@@ -3653,19 +3653,19 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
}
else
{
- srelent = bsearch((const void *) &(bufHdr->tag.rnode),
+ srelent = bsearch((const void *) &(bufHdr->tag.rlocator),
srels, nrels, sizeof(SMgrSortArray),
- rnode_comparator);
+ rlocator_comparator);
}
- /* buffer doesn't belong to any of the given relfilenodes; skip it */
+ /* buffer doesn't belong to any of the given relfilelocators; skip it */
if (srelent == NULL)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, srelent->rnode) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, srelent->rlocator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3729,7 +3729,7 @@ RelationCopyStorageUsingBuffer(Relation src, Relation dst, ForkNumber forkNum,
CHECK_FOR_INTERRUPTS();
/* Read block from source relation. */
- srcBuf = ReadBufferWithoutRelcache(src->rd_node, forkNum, blkno,
+ srcBuf = ReadBufferWithoutRelcache(src->rd_locator, forkNum, blkno,
RBM_NORMAL, bstrategy_src,
permanent);
srcPage = BufferGetPage(srcBuf);
@@ -3740,7 +3740,7 @@ RelationCopyStorageUsingBuffer(Relation src, Relation dst, ForkNumber forkNum,
}
/* Use P_NEW to extend the destination relation. */
- dstBuf = ReadBufferWithoutRelcache(dst->rd_node, forkNum, P_NEW,
+ dstBuf = ReadBufferWithoutRelcache(dst->rd_locator, forkNum, P_NEW,
RBM_NORMAL, bstrategy_dst,
permanent);
LockBuffer(dstBuf, BUFFER_LOCK_EXCLUSIVE);
@@ -3775,8 +3775,8 @@ RelationCopyStorageUsingBuffer(Relation src, Relation dst, ForkNumber forkNum,
* --------------------------------------------------------------------
*/
void
-CreateAndCopyRelationData(RelFileNode src_rnode, RelFileNode dst_rnode,
- bool permanent)
+CreateAndCopyRelationData(RelFileLocator src_rlocator,
+ RelFileLocator dst_rlocator, bool permanent)
{
Relation src_rel;
Relation dst_rel;
@@ -3793,8 +3793,8 @@ CreateAndCopyRelationData(RelFileNode src_rnode, RelFileNode dst_rnode,
* used the smgr layer directly, we would have to worry about
* invalidations.
*/
- src_rel = CreateFakeRelcacheEntry(src_rnode);
- dst_rel = CreateFakeRelcacheEntry(dst_rnode);
+ src_rel = CreateFakeRelcacheEntry(src_rlocator);
+ dst_rel = CreateFakeRelcacheEntry(dst_rlocator);
/*
* Create and copy all forks of the relation. During create database we
@@ -3802,7 +3802,7 @@ CreateAndCopyRelationData(RelFileNode src_rnode, RelFileNode dst_rnode,
* directory. Therefore, each individual relation doesn't need to be
* registered for cleanup.
*/
- RelationCreateStorage(dst_rnode, relpersistence, false);
+ RelationCreateStorage(dst_rlocator, relpersistence, false);
/* copy main fork. */
RelationCopyStorageUsingBuffer(src_rel, dst_rel, MAIN_FORKNUM, permanent);
@@ -3820,7 +3820,7 @@ CreateAndCopyRelationData(RelFileNode src_rnode, RelFileNode dst_rnode,
* init fork of an unlogged relation.
*/
if (permanent || forkNum == INIT_FORKNUM)
- log_smgrcreate(&dst_rnode, forkNum);
+ log_smgrcreate(&dst_rlocator, forkNum);
/* Copy a fork's data, block by block. */
RelationCopyStorageUsingBuffer(src_rel, dst_rel, forkNum,
@@ -3864,16 +3864,16 @@ FlushDatabaseBuffers(Oid dbid)
bufHdr = GetBufferDescriptor(i);
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.rlocator.dbOid != dbid)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid &&
+ if (bufHdr->tag.rlocator.dbOid == dbid &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -4034,7 +4034,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
(pg_atomic_read_u32(&bufHdr->state) & BM_PERMANENT))
{
/*
- * If we must not write WAL, due to a relfilenode-specific
+ * If we must not write WAL, due to a relfilelocator-specific
* condition or being in recovery, don't dirty the page. We can
* set the hint, just not dirty the page as a result so the hint
* is lost when we evict the page or shutdown.
@@ -4042,7 +4042,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
* See src/backend/storage/page/README for longer discussion.
*/
if (RecoveryInProgress() ||
- RelFileNodeSkippingWAL(bufHdr->tag.rnode))
+ RelFileLocatorSkippingWAL(bufHdr->tag.rlocator))
return;
/*
@@ -4651,7 +4651,7 @@ AbortBufferIO(void)
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
- path = relpathperm(buf->tag.rnode, buf->tag.forkNum);
+ path = relpathperm(buf->tag.rlocator, buf->tag.forkNum);
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4675,7 +4675,7 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rnode, bufHdr->tag.forkNum);
+ char *path = relpathperm(bufHdr->tag.rlocator, bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4693,7 +4693,7 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
- char *path = relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ char *path = relpathbackend(bufHdr->tag.rlocator, MyBackendId,
bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
@@ -4703,27 +4703,27 @@ local_buffer_write_error_callback(void *arg)
}
/*
- * RelFileNode qsort/bsearch comparator; see RelFileNodeEquals.
+ * RelFileLocator qsort/bsearch comparator; see RelFileLocatorEquals.
*/
static int
-rnode_comparator(const void *p1, const void *p2)
+rlocator_comparator(const void *p1, const void *p2)
{
- RelFileNode n1 = *(const RelFileNode *) p1;
- RelFileNode n2 = *(const RelFileNode *) p2;
+ RelFileLocator n1 = *(const RelFileLocator *) p1;
+ RelFileLocator n2 = *(const RelFileLocator *) p2;
- if (n1.relNode < n2.relNode)
+ if (n1.relNumber < n2.relNumber)
return -1;
- else if (n1.relNode > n2.relNode)
+ else if (n1.relNumber > n2.relNumber)
return 1;
- if (n1.dbNode < n2.dbNode)
+ if (n1.dbOid < n2.dbOid)
return -1;
- else if (n1.dbNode > n2.dbNode)
+ else if (n1.dbOid > n2.dbOid)
return 1;
- if (n1.spcNode < n2.spcNode)
+ if (n1.spcOid < n2.spcOid)
return -1;
- else if (n1.spcNode > n2.spcNode)
+ else if (n1.spcOid > n2.spcOid)
return 1;
else
return 0;
@@ -4789,7 +4789,7 @@ buffertag_comparator(const BufferTag *ba, const BufferTag *bb)
{
int ret;
- ret = rnode_comparator(&ba->rnode, &bb->rnode);
+ ret = rlocator_comparator(&ba->rlocator, &bb->rlocator);
if (ret != 0)
return ret;
@@ -4822,9 +4822,9 @@ ckpt_buforder_comparator(const CkptSortItem *a, const CkptSortItem *b)
else if (a->tsId > b->tsId)
return 1;
/* compare relation */
- if (a->relNode < b->relNode)
+ if (a->relNumber < b->relNumber)
return -1;
- else if (a->relNode > b->relNode)
+ else if (a->relNumber > b->relNumber)
return 1;
/* compare fork */
else if (a->forkNum < b->forkNum)
@@ -4960,7 +4960,7 @@ IssuePendingWritebacks(WritebackContext *context)
next = &context->pending_writebacks[i + ahead + 1];
/* different file, stop */
- if (!RelFileNodeEquals(cur->tag.rnode, next->tag.rnode) ||
+ if (!RelFileLocatorEquals(cur->tag.rlocator, next->tag.rlocator) ||
cur->tag.forkNum != next->tag.forkNum)
break;
@@ -4979,7 +4979,7 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rnode, InvalidBackendId);
+ reln = smgropen(tag.rlocator, InvalidBackendId);
smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
}
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index e71f95a..3dc9cc7 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -68,7 +68,7 @@ PrefetchLocalBuffer(SMgrRelation smgr, ForkNumber forkNum,
BufferTag newTag; /* identity of requested block */
LocalBufferLookupEnt *hresult;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -117,7 +117,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
bool found;
uint32 buf_state;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -134,7 +134,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
Assert(BUFFERTAGS_EQUAL(bufHdr->tag, newTag));
#ifdef LBDEBUG
fprintf(stderr, "LB ALLOC (%u,%d,%d) %d\n",
- smgr->smgr_rnode.node.relNode, forkNum, blockNum, -b - 1);
+ smgr->smgr_rlocator.locator.relNumber, forkNum, blockNum, -b - 1);
#endif
buf_state = pg_atomic_read_u32(&bufHdr->state);
@@ -162,7 +162,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
#ifdef LBDEBUG
fprintf(stderr, "LB ALLOC (%u,%d,%d) %d\n",
- smgr->smgr_rnode.node.relNode, forkNum, blockNum,
+ smgr->smgr_rlocator.locator.relNumber, forkNum, blockNum,
-nextFreeLocalBuf - 1);
#endif
@@ -215,7 +215,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rnode, MyBackendId);
+ oreln = smgropen(bufHdr->tag.rlocator, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
@@ -312,7 +312,7 @@ MarkLocalBufferDirty(Buffer buffer)
}
/*
- * DropRelFileNodeLocalBuffers
+ * DropRelFileLocatorLocalBuffers
* This function removes from the buffer pool all the pages of the
* specified relation that have block numbers >= firstDelBlock.
* (In particular, with firstDelBlock = 0, all pages are removed.)
@@ -320,11 +320,11 @@ MarkLocalBufferDirty(Buffer buffer)
* out first. Therefore, this is NOT rollback-able, and so should be
* used only with extreme caution!
*
- * See DropRelFileNodeBuffers in bufmgr.c for more notes.
+ * See DropRelFileLocatorBuffers in bufmgr.c for more notes.
*/
void
-DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
- BlockNumber firstDelBlock)
+DropRelFileLocatorLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
+ BlockNumber firstDelBlock)
{
int i;
@@ -337,14 +337,14 @@ DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
+ RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ relpathbackend(bufHdr->tag.rlocator, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
@@ -363,14 +363,14 @@ DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
}
/*
- * DropRelFileNodeAllLocalBuffers
+ * DropRelFileLocatorAllLocalBuffers
* This function removes from the buffer pool all pages of all forks
* of the specified relation.
*
- * See DropRelFileNodesAllBuffers in bufmgr.c for more notes.
+ * See DropRelFileLocatorsAllBuffers in bufmgr.c for more notes.
*/
void
-DropRelFileNodeAllLocalBuffers(RelFileNode rnode)
+DropRelFileLocatorAllLocalBuffers(RelFileLocator rlocator)
{
int i;
@@ -383,12 +383,12 @@ DropRelFileNodeAllLocalBuffers(RelFileNode rnode)
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode))
+ RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator))
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ relpathbackend(bufHdr->tag.rlocator, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
@@ -589,7 +589,7 @@ AtProcExit_LocalBuffers(void)
{
/*
* We shouldn't be holding any remaining pins; if we are, and assertions
- * aren't enabled, we'll fail later in DropRelFileNodeBuffers while trying
+ * aren't enabled, we'll fail later in DropRelFileLocatorBuffers while trying
* to drop the temp rels.
*/
CheckForLocalBufferLeaks();
diff --git a/src/backend/storage/freespace/freespace.c b/src/backend/storage/freespace/freespace.c
index d41ae37..005def5 100644
--- a/src/backend/storage/freespace/freespace.c
+++ b/src/backend/storage/freespace/freespace.c
@@ -196,7 +196,7 @@ RecordPageWithFreeSpace(Relation rel, BlockNumber heapBlk, Size spaceAvail)
* WAL replay
*/
void
-XLogRecordPageWithFreeSpace(RelFileNode rnode, BlockNumber heapBlk,
+XLogRecordPageWithFreeSpace(RelFileLocator rlocator, BlockNumber heapBlk,
Size spaceAvail)
{
int new_cat = fsm_space_avail_to_cat(spaceAvail);
@@ -211,8 +211,8 @@ XLogRecordPageWithFreeSpace(RelFileNode rnode, BlockNumber heapBlk,
blkno = fsm_logical_to_physical(addr);
/* If the page doesn't exist already, extend */
- buf = XLogReadBufferExtended(rnode, FSM_FORKNUM, blkno, RBM_ZERO_ON_ERROR,
- InvalidBuffer);
+ buf = XLogReadBufferExtended(rlocator, FSM_FORKNUM, blkno,
+ RBM_ZERO_ON_ERROR, InvalidBuffer);
LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
page = BufferGetPage(buf);
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index d165b35..af4dab7 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -268,13 +268,13 @@ restart:
*
* Fix the corruption and restart.
*/
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blknum;
- BufferGetTag(buf, &rnode, &forknum, &blknum);
+ BufferGetTag(buf, &rlocator, &forknum, &blknum);
elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
- blknum, rnode.spcNode, rnode.dbNode, rnode.relNode);
+ blknum, rlocator.spcOid, rlocator.dbOid, rlocator.relNumber);
/* make sure we hold an exclusive lock */
if (!exclusive_lock_held)
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 671b00a..9dab931 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -442,7 +442,7 @@ ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
}
void
-ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode node)
+ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileLocator locator)
{
VirtualTransactionId *backends;
@@ -461,7 +461,7 @@ ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode
return;
backends = GetConflictingVirtualXIDs(latestRemovedXid,
- node.dbNode);
+ locator.dbOid);
ResolveRecoveryConflictWithVirtualXIDs(backends,
PROCSIG_RECOVERY_CONFLICT_SNAPSHOT,
@@ -475,7 +475,7 @@ ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode
*/
void
ResolveRecoveryConflictWithSnapshotFullXid(FullTransactionId latestRemovedFullXid,
- RelFileNode node)
+ RelFileLocator locator)
{
/*
* ResolveRecoveryConflictWithSnapshot operates on 32-bit TransactionIds,
@@ -493,7 +493,7 @@ ResolveRecoveryConflictWithSnapshotFullXid(FullTransactionId latestRemovedFullXi
TransactionId latestRemovedXid;
latestRemovedXid = XidFromFullTransactionId(latestRemovedFullXid);
- ResolveRecoveryConflictWithSnapshot(latestRemovedXid, node);
+ ResolveRecoveryConflictWithSnapshot(latestRemovedXid, locator);
}
}
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 25e7e4e..5136da6 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -1997,7 +1997,7 @@ PageIsPredicateLocked(Relation relation, BlockNumber blkno)
PREDICATELOCKTARGET *target;
SET_PREDICATELOCKTARGETTAG_PAGE(targettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
blkno);
@@ -2576,7 +2576,7 @@ PredicateLockRelation(Relation relation, Snapshot snapshot)
return;
SET_PREDICATELOCKTARGETTAG_RELATION(tag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id);
PredicateLockAcquire(&tag);
}
@@ -2599,7 +2599,7 @@ PredicateLockPage(Relation relation, BlockNumber blkno, Snapshot snapshot)
return;
SET_PREDICATELOCKTARGETTAG_PAGE(tag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
blkno);
PredicateLockAcquire(&tag);
@@ -2638,13 +2638,13 @@ PredicateLockTID(Relation relation, ItemPointer tid, Snapshot snapshot,
* level lock.
*/
SET_PREDICATELOCKTARGETTAG_RELATION(tag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id);
if (PredicateLockExists(&tag))
return;
SET_PREDICATELOCKTARGETTAG_TUPLE(tag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
ItemPointerGetBlockNumber(tid),
ItemPointerGetOffsetNumber(tid));
@@ -2974,7 +2974,7 @@ DropAllPredicateLocksFromTable(Relation relation, bool transfer)
if (!PredicateLockingNeededForRelation(relation))
return;
- dbId = relation->rd_node.dbNode;
+ dbId = relation->rd_locator.dbOid;
relId = relation->rd_id;
if (relation->rd_index == NULL)
{
@@ -3194,11 +3194,11 @@ PredicateLockPageSplit(Relation relation, BlockNumber oldblkno,
Assert(BlockNumberIsValid(newblkno));
SET_PREDICATELOCKTARGETTAG_PAGE(oldtargettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
oldblkno);
SET_PREDICATELOCKTARGETTAG_PAGE(newtargettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
newblkno);
@@ -4478,7 +4478,7 @@ CheckForSerializableConflictIn(Relation relation, ItemPointer tid, BlockNumber b
if (tid != NULL)
{
SET_PREDICATELOCKTARGETTAG_TUPLE(targettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
ItemPointerGetBlockNumber(tid),
ItemPointerGetOffsetNumber(tid));
@@ -4488,14 +4488,14 @@ CheckForSerializableConflictIn(Relation relation, ItemPointer tid, BlockNumber b
if (blkno != InvalidBlockNumber)
{
SET_PREDICATELOCKTARGETTAG_PAGE(targettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
blkno);
CheckTargetForConflictsIn(&targettag);
}
SET_PREDICATELOCKTARGETTAG_RELATION(targettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id);
CheckTargetForConflictsIn(&targettag);
}
@@ -4556,7 +4556,7 @@ CheckTableForSerializableConflictIn(Relation relation)
Assert(relation->rd_index == NULL); /* not an index relation */
- dbId = relation->rd_node.dbNode;
+ dbId = relation->rd_locator.dbOid;
heapId = relation->rd_id;
LWLockAcquire(SerializablePredicateListLock, LW_EXCLUSIVE);
diff --git a/src/backend/storage/smgr/README b/src/backend/storage/smgr/README
index e1cfc6c..1dfc16f 100644
--- a/src/backend/storage/smgr/README
+++ b/src/backend/storage/smgr/README
@@ -46,7 +46,7 @@ physical relation in system catalogs.
It is assumed that the main fork, fork number 0 or MAIN_FORKNUM, always
exists. Fork numbers are assigned in src/include/common/relpath.h.
Functions in smgr.c and md.c take an extra fork number argument, in addition
-to relfilenode and block number, to identify which relation fork you want to
+to relfilenumber and block number, to identify which relation fork you want to
access. Since most code wants to access the main fork, a shortcut version of
ReadBuffer that accesses MAIN_FORKNUM is provided in the buffer manager for
convenience.
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 43edaf5..3998296 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -35,7 +35,7 @@
#include "storage/bufmgr.h"
#include "storage/fd.h"
#include "storage/md.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "storage/sync.h"
#include "utils/hsearch.h"
@@ -89,11 +89,11 @@ static MemoryContext MdCxt; /* context for all MdfdVec objects */
/* Populate a file tag describing an md.c segment file. */
-#define INIT_MD_FILETAG(a,xx_rnode,xx_forknum,xx_segno) \
+#define INIT_MD_FILETAG(a,xx_rlocator,xx_forknum,xx_segno) \
( \
memset(&(a), 0, sizeof(FileTag)), \
(a).handler = SYNC_HANDLER_MD, \
- (a).rnode = (xx_rnode), \
+ (a).rlocator = (xx_rlocator), \
(a).forknum = (xx_forknum), \
(a).segno = (xx_segno) \
)
@@ -121,14 +121,14 @@ static MemoryContext MdCxt; /* context for all MdfdVec objects */
/* local routines */
-static void mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum,
+static void mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forkNum,
bool isRedo);
static MdfdVec *mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior);
static void register_dirty_segment(SMgrRelation reln, ForkNumber forknum,
MdfdVec *seg);
-static void register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
+static void register_unlink_segment(RelFileLocatorBackend rlocator, ForkNumber forknum,
BlockNumber segno);
-static void register_forget_request(RelFileNodeBackend rnode, ForkNumber forknum,
+static void register_forget_request(RelFileLocatorBackend rlocator, ForkNumber forknum,
BlockNumber segno);
static void _fdvec_resize(SMgrRelation reln,
ForkNumber forknum,
@@ -199,11 +199,11 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* should be here and not in commands/tablespace.c? But that would imply
* importing a lot of stuff that smgr.c oughtn't know, either.
*/
- TablespaceCreateDbspace(reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
+ TablespaceCreateDbspace(reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
isRedo);
- path = relpath(reln->smgr_rnode, forkNum);
+ path = relpath(reln->smgr_rlocator, forkNum);
fd = PathNameOpenFile(path, O_RDWR | O_CREAT | O_EXCL | PG_BINARY);
@@ -234,7 +234,7 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
/*
* mdunlink() -- Unlink a relation.
*
- * Note that we're passed a RelFileNodeBackend --- by the time this is called,
+ * Note that we're passed a RelFileLocatorBackend --- by the time this is called,
* there won't be an SMgrRelation hashtable entry anymore.
*
* forkNum can be a fork number to delete a specific fork, or InvalidForkNumber
@@ -243,10 +243,10 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* For regular relations, we don't unlink the first segment file of the rel,
* but just truncate it to zero length, and record a request to unlink it after
* the next checkpoint. Additional segments can be unlinked immediately,
- * however. Leaving the empty file in place prevents that relfilenode
- * number from being reused. The scenario this protects us from is:
+ * however. Leaving the empty file in place prevents that relfilenumber
+ * from being reused. The scenario this protects us from is:
* 1. We delete a relation (and commit, and actually remove its file).
- * 2. We create a new relation, which by chance gets the same relfilenode as
+ * 2. We create a new relation, which by chance gets the same relfilenumber as
* the just-deleted one (OIDs must've wrapped around for that to happen).
* 3. We crash before another checkpoint occurs.
* During replay, we would delete the file and then recreate it, which is fine
@@ -254,18 +254,18 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* But if we didn't WAL-log insertions, but instead relied on fsyncing the
* file after populating it (as we do at wal_level=minimal), the contents of
* the file would be lost forever. By leaving the empty file until after the
- * next checkpoint, we prevent reassignment of the relfilenode number until
- * it's safe, because relfilenode assignment skips over any existing file.
+ * next checkpoint, we prevent reassignment of the relfilenumber until it's
+ * safe, because relfilenumber assignment skips over any existing file.
*
* We do not need to go through this dance for temp relations, though, because
* we never make WAL entries for temp rels, and so a temp rel poses no threat
- * to the health of a regular rel that has taken over its relfilenode number.
+ * to the health of a regular rel that has taken over its relfilenumber.
* The fact that temp rels and regular rels have different file naming
* patterns provides additional safety.
*
* All the above applies only to the relation's main fork; other forks can
* just be removed immediately, since they are not needed to prevent the
- * relfilenode number from being recycled. Also, we do not carefully
+ * relfilenumber from being recycled. Also, we do not carefully
* track whether other forks have been created or not, but just attempt to
* unlink them unconditionally; so we should never complain about ENOENT.
*
@@ -278,16 +278,16 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* we are usually not in a transaction anymore when this is called.
*/
void
-mdunlink(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
+mdunlink(RelFileLocatorBackend rlocator, ForkNumber forkNum, bool isRedo)
{
/* Now do the per-fork work */
if (forkNum == InvalidForkNumber)
{
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- mdunlinkfork(rnode, forkNum, isRedo);
+ mdunlinkfork(rlocator, forkNum, isRedo);
}
else
- mdunlinkfork(rnode, forkNum, isRedo);
+ mdunlinkfork(rlocator, forkNum, isRedo);
}
/*
@@ -315,25 +315,25 @@ do_truncate(const char *path)
}
static void
-mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
+mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forkNum, bool isRedo)
{
char *path;
int ret;
- path = relpath(rnode, forkNum);
+ path = relpath(rlocator, forkNum);
/*
* Delete or truncate the first segment.
*/
- if (isRedo || forkNum != MAIN_FORKNUM || RelFileNodeBackendIsTemp(rnode))
+ if (isRedo || forkNum != MAIN_FORKNUM || RelFileLocatorBackendIsTemp(rlocator))
{
- if (!RelFileNodeBackendIsTemp(rnode))
+ if (!RelFileLocatorBackendIsTemp(rlocator))
{
/* Prevent other backends' fds from holding on to the disk space */
ret = do_truncate(path);
/* Forget any pending sync requests for the first segment */
- register_forget_request(rnode, forkNum, 0 /* first seg */ );
+ register_forget_request(rlocator, forkNum, 0 /* first seg */ );
}
else
ret = 0;
@@ -354,7 +354,7 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
ret = do_truncate(path);
/* Register request to unlink first segment later */
- register_unlink_segment(rnode, forkNum, 0 /* first seg */ );
+ register_unlink_segment(rlocator, forkNum, 0 /* first seg */ );
}
/*
@@ -373,7 +373,7 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
{
sprintf(segpath, "%s.%u", path, segno);
- if (!RelFileNodeBackendIsTemp(rnode))
+ if (!RelFileLocatorBackendIsTemp(rlocator))
{
/*
* Prevent other backends' fds from holding on to the disk
@@ -386,7 +386,7 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
* Forget any pending sync requests for this segment before we
* try to unlink.
*/
- register_forget_request(rnode, forkNum, segno);
+ register_forget_request(rlocator, forkNum, segno);
}
if (unlink(segpath) < 0)
@@ -437,7 +437,7 @@ mdextend(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
ereport(ERROR,
(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
errmsg("cannot extend file \"%s\" beyond %u blocks",
- relpath(reln->smgr_rnode, forknum),
+ relpath(reln->smgr_rlocator, forknum),
InvalidBlockNumber)));
v = _mdfd_getseg(reln, forknum, blocknum, skipFsync, EXTENSION_CREATE);
@@ -490,7 +490,7 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
if (reln->md_num_open_segs[forknum] > 0)
return &reln->md_seg_fds[forknum][0];
- path = relpath(reln->smgr_rnode, forknum);
+ path = relpath(reln->smgr_rlocator, forknum);
fd = PathNameOpenFile(path, O_RDWR | PG_BINARY);
@@ -645,10 +645,10 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
MdfdVec *v;
TRACE_POSTGRESQL_SMGR_MD_READ_START(forknum, blocknum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
- reln->smgr_rnode.backend);
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
+ reln->smgr_rlocator.backend);
v = _mdfd_getseg(reln, forknum, blocknum, false,
EXTENSION_FAIL | EXTENSION_CREATE_RECOVERY);
@@ -660,10 +660,10 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
nbytes = FileRead(v->mdfd_vfd, buffer, BLCKSZ, seekpos, WAIT_EVENT_DATA_FILE_READ);
TRACE_POSTGRESQL_SMGR_MD_READ_DONE(forknum, blocknum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
- reln->smgr_rnode.backend,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
+ reln->smgr_rlocator.backend,
nbytes,
BLCKSZ);
@@ -715,10 +715,10 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
#endif
TRACE_POSTGRESQL_SMGR_MD_WRITE_START(forknum, blocknum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
- reln->smgr_rnode.backend);
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
+ reln->smgr_rlocator.backend);
v = _mdfd_getseg(reln, forknum, blocknum, skipFsync,
EXTENSION_FAIL | EXTENSION_CREATE_RECOVERY);
@@ -730,10 +730,10 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
nbytes = FileWrite(v->mdfd_vfd, buffer, BLCKSZ, seekpos, WAIT_EVENT_DATA_FILE_WRITE);
TRACE_POSTGRESQL_SMGR_MD_WRITE_DONE(forknum, blocknum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
- reln->smgr_rnode.backend,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
+ reln->smgr_rlocator.backend,
nbytes,
BLCKSZ);
@@ -842,7 +842,7 @@ mdtruncate(SMgrRelation reln, ForkNumber forknum, BlockNumber nblocks)
return;
ereport(ERROR,
(errmsg("could not truncate file \"%s\" to %u blocks: it's only %u blocks now",
- relpath(reln->smgr_rnode, forknum),
+ relpath(reln->smgr_rlocator, forknum),
nblocks, curnblk)));
}
if (nblocks == curnblk)
@@ -983,7 +983,7 @@ register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
{
FileTag tag;
- INIT_MD_FILETAG(tag, reln->smgr_rnode.node, forknum, seg->mdfd_segno);
+ INIT_MD_FILETAG(tag, reln->smgr_rlocator.locator, forknum, seg->mdfd_segno);
/* Temp relations should never be fsync'd */
Assert(!SmgrIsTemp(reln));
@@ -1005,15 +1005,15 @@ register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
* register_unlink_segment() -- Schedule a file to be deleted after next checkpoint
*/
static void
-register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
+register_unlink_segment(RelFileLocatorBackend rlocator, ForkNumber forknum,
BlockNumber segno)
{
FileTag tag;
- INIT_MD_FILETAG(tag, rnode.node, forknum, segno);
+ INIT_MD_FILETAG(tag, rlocator.locator, forknum, segno);
/* Should never be used with temp relations */
- Assert(!RelFileNodeBackendIsTemp(rnode));
+ Assert(!RelFileLocatorBackendIsTemp(rlocator));
RegisterSyncRequest(&tag, SYNC_UNLINK_REQUEST, true /* retryOnError */ );
}
@@ -1022,12 +1022,12 @@ register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
* register_forget_request() -- forget any fsyncs for a relation fork's segment
*/
static void
-register_forget_request(RelFileNodeBackend rnode, ForkNumber forknum,
+register_forget_request(RelFileLocatorBackend rlocator, ForkNumber forknum,
BlockNumber segno)
{
FileTag tag;
- INIT_MD_FILETAG(tag, rnode.node, forknum, segno);
+ INIT_MD_FILETAG(tag, rlocator.locator, forknum, segno);
RegisterSyncRequest(&tag, SYNC_FORGET_REQUEST, true /* retryOnError */ );
}
@@ -1039,13 +1039,13 @@ void
ForgetDatabaseSyncRequests(Oid dbid)
{
FileTag tag;
- RelFileNode rnode;
+ RelFileLocator rlocator;
- rnode.dbNode = dbid;
- rnode.spcNode = 0;
- rnode.relNode = 0;
+ rlocator.dbOid = dbid;
+ rlocator.spcOid = 0;
+ rlocator.relNumber = 0;
- INIT_MD_FILETAG(tag, rnode, InvalidForkNumber, InvalidBlockNumber);
+ INIT_MD_FILETAG(tag, rlocator, InvalidForkNumber, InvalidBlockNumber);
RegisterSyncRequest(&tag, SYNC_FILTER_REQUEST, true /* retryOnError */ );
}
@@ -1054,7 +1054,7 @@ ForgetDatabaseSyncRequests(Oid dbid)
* DropRelationFiles -- drop files of all given relations
*/
void
-DropRelationFiles(RelFileNode *delrels, int ndelrels, bool isRedo)
+DropRelationFiles(RelFileLocator *delrels, int ndelrels, bool isRedo)
{
SMgrRelation *srels;
int i;
@@ -1129,7 +1129,7 @@ _mdfd_segpath(SMgrRelation reln, ForkNumber forknum, BlockNumber segno)
char *path,
*fullpath;
- path = relpath(reln->smgr_rnode, forknum);
+ path = relpath(reln->smgr_rlocator, forknum);
if (segno > 0)
{
@@ -1345,7 +1345,7 @@ _mdnblocks(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
int
mdsyncfiletag(const FileTag *ftag, char *path)
{
- SMgrRelation reln = smgropen(ftag->rnode, InvalidBackendId);
+ SMgrRelation reln = smgropen(ftag->rlocator, InvalidBackendId);
File file;
bool need_to_close;
int result,
@@ -1395,7 +1395,7 @@ mdunlinkfiletag(const FileTag *ftag, char *path)
char *p;
/* Compute the path. */
- p = relpathperm(ftag->rnode, MAIN_FORKNUM);
+ p = relpathperm(ftag->rlocator, MAIN_FORKNUM);
strlcpy(path, p, MAXPGPATH);
pfree(p);
@@ -1417,5 +1417,5 @@ mdfiletagmatches(const FileTag *ftag, const FileTag *candidate)
* We'll return true for all candidates that have the same database OID as
* the ftag from the SYNC_FILTER_REQUEST request, so they're forgotten.
*/
- return ftag->rnode.dbNode == candidate->rnode.dbNode;
+ return ftag->rlocator.dbOid == candidate->rlocator.dbOid;
}
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index a477f70..b21d8c3 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -46,7 +46,7 @@ typedef struct f_smgr
void (*smgr_create) (SMgrRelation reln, ForkNumber forknum,
bool isRedo);
bool (*smgr_exists) (SMgrRelation reln, ForkNumber forknum);
- void (*smgr_unlink) (RelFileNodeBackend rnode, ForkNumber forknum,
+ void (*smgr_unlink) (RelFileLocatorBackend rlocator, ForkNumber forknum,
bool isRedo);
void (*smgr_extend) (SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum, char *buffer, bool skipFsync);
@@ -143,9 +143,9 @@ smgrshutdown(int code, Datum arg)
* This does not attempt to actually open the underlying file.
*/
SMgrRelation
-smgropen(RelFileNode rnode, BackendId backend)
+smgropen(RelFileLocator rlocator, BackendId backend)
{
- RelFileNodeBackend brnode;
+ RelFileLocatorBackend brlocator;
SMgrRelation reln;
bool found;
@@ -154,7 +154,7 @@ smgropen(RelFileNode rnode, BackendId backend)
/* First time through: initialize the hash table */
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileNodeBackend);
+ ctl.keysize = sizeof(RelFileLocatorBackend);
ctl.entrysize = sizeof(SMgrRelationData);
SMgrRelationHash = hash_create("smgr relation table", 400,
&ctl, HASH_ELEM | HASH_BLOBS);
@@ -162,10 +162,10 @@ smgropen(RelFileNode rnode, BackendId backend)
}
/* Look up or create an entry */
- brnode.node = rnode;
- brnode.backend = backend;
+ brlocator.locator = rlocator;
+ brlocator.backend = backend;
reln = (SMgrRelation) hash_search(SMgrRelationHash,
- (void *) &brnode,
+ (void *) &brlocator,
HASH_ENTER, &found);
/* Initialize it if not present before */
@@ -267,7 +267,7 @@ smgrclose(SMgrRelation reln)
dlist_delete(&reln->node);
if (hash_search(SMgrRelationHash,
- (void *) &(reln->smgr_rnode),
+ (void *) &(reln->smgr_rlocator),
HASH_REMOVE, NULL) == NULL)
elog(ERROR, "SMgrRelation hashtable corrupted");
@@ -335,15 +335,15 @@ smgrcloseall(void)
}
/*
- * smgrclosenode() -- Close SMgrRelation object for given RelFileNode,
+ * smgrcloserellocator() -- Close SMgrRelation object for given RelFileLocator,
* if one exists.
*
- * This has the same effects as smgrclose(smgropen(rnode)), but it avoids
+ * This has the same effects as smgrclose(smgropen(rlocator)), but it avoids
* uselessly creating a hashtable entry only to drop it again when no
* such entry exists already.
*/
void
-smgrclosenode(RelFileNodeBackend rnode)
+smgrcloserellocator(RelFileLocatorBackend rlocator)
{
SMgrRelation reln;
@@ -352,7 +352,7 @@ smgrclosenode(RelFileNodeBackend rnode)
return;
reln = (SMgrRelation) hash_search(SMgrRelationHash,
- (void *) &rnode,
+ (void *) &rlocator,
HASH_FIND, NULL);
if (reln != NULL)
smgrclose(reln);
@@ -420,7 +420,7 @@ void
smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo)
{
int i = 0;
- RelFileNodeBackend *rnodes;
+ RelFileLocatorBackend *rlocators;
ForkNumber forknum;
if (nrels == 0)
@@ -430,19 +430,19 @@ smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo)
* Get rid of any remaining buffers for the relations. bufmgr will just
* drop them without bothering to write the contents.
*/
- DropRelFileNodesAllBuffers(rels, nrels);
+ DropRelFileLocatorsAllBuffers(rels, nrels);
/*
* create an array which contains all relations to be dropped, and close
* each relation's forks at the smgr level while at it
*/
- rnodes = palloc(sizeof(RelFileNodeBackend) * nrels);
+ rlocators = palloc(sizeof(RelFileLocatorBackend) * nrels);
for (i = 0; i < nrels; i++)
{
- RelFileNodeBackend rnode = rels[i]->smgr_rnode;
+ RelFileLocatorBackend rlocator = rels[i]->smgr_rlocator;
int which = rels[i]->smgr_which;
- rnodes[i] = rnode;
+ rlocators[i] = rlocator;
/* Close the forks at smgr level */
for (forknum = 0; forknum <= MAX_FORKNUM; forknum++)
@@ -458,7 +458,7 @@ smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo)
* closed our own smgr rel.
*/
for (i = 0; i < nrels; i++)
- CacheInvalidateSmgr(rnodes[i]);
+ CacheInvalidateSmgr(rlocators[i]);
/*
* Delete the physical file(s).
@@ -473,10 +473,10 @@ smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo)
int which = rels[i]->smgr_which;
for (forknum = 0; forknum <= MAX_FORKNUM; forknum++)
- smgrsw[which].smgr_unlink(rnodes[i], forknum, isRedo);
+ smgrsw[which].smgr_unlink(rlocators[i], forknum, isRedo);
}
- pfree(rnodes);
+ pfree(rlocators);
}
@@ -631,7 +631,7 @@ smgrtruncate(SMgrRelation reln, ForkNumber *forknum, int nforks, BlockNumber *nb
* Get rid of any buffers for the about-to-be-deleted blocks. bufmgr will
* just drop them without bothering to write the contents.
*/
- DropRelFileNodeBuffers(reln, forknum, nforks, nblocks);
+ DropRelFileLocatorBuffers(reln, forknum, nforks, nblocks);
/*
* Send a shared-inval message to force other backends to close any smgr
@@ -643,7 +643,7 @@ smgrtruncate(SMgrRelation reln, ForkNumber *forknum, int nforks, BlockNumber *nb
* is a performance-critical path.) As in the unlink code, we want to be
* sure the message is sent before we start changing things on-disk.
*/
- CacheInvalidateSmgr(reln->smgr_rnode);
+ CacheInvalidateSmgr(reln->smgr_rlocator);
/* Do the truncation */
for (i = 0; i < nforks; i++)
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index b4a2c8d..d8ae082 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -27,7 +27,7 @@
#include "utils/builtins.h"
#include "utils/numeric.h"
#include "utils/rel.h"
-#include "utils/relfilenodemap.h"
+#include "utils/relfilenumbermap.h"
#include "utils/relmapper.h"
#include "utils/syscache.h"
@@ -292,7 +292,7 @@ pg_tablespace_size_name(PG_FUNCTION_ARGS)
* is no check here or at the call sites for that.
*/
static int64
-calculate_relation_size(RelFileNode *rfn, BackendId backend, ForkNumber forknum)
+calculate_relation_size(RelFileLocator *rfn, BackendId backend, ForkNumber forknum)
{
int64 totalsize = 0;
char *relationpath;
@@ -349,7 +349,7 @@ pg_relation_size(PG_FUNCTION_ARGS)
if (rel == NULL)
PG_RETURN_NULL();
- size = calculate_relation_size(&(rel->rd_node), rel->rd_backend,
+ size = calculate_relation_size(&(rel->rd_locator), rel->rd_backend,
forkname_to_number(text_to_cstring(forkName)));
relation_close(rel, AccessShareLock);
@@ -374,7 +374,7 @@ calculate_toast_table_size(Oid toastrelid)
/* toast heap size, including FSM and VM size */
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- size += calculate_relation_size(&(toastRel->rd_node),
+ size += calculate_relation_size(&(toastRel->rd_locator),
toastRel->rd_backend, forkNum);
/* toast index size, including FSM and VM size */
@@ -388,7 +388,7 @@ calculate_toast_table_size(Oid toastrelid)
toastIdxRel = relation_open(lfirst_oid(lc),
AccessShareLock);
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- size += calculate_relation_size(&(toastIdxRel->rd_node),
+ size += calculate_relation_size(&(toastIdxRel->rd_locator),
toastIdxRel->rd_backend, forkNum);
relation_close(toastIdxRel, AccessShareLock);
@@ -417,7 +417,7 @@ calculate_table_size(Relation rel)
* heap size, including FSM and VM
*/
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- size += calculate_relation_size(&(rel->rd_node), rel->rd_backend,
+ size += calculate_relation_size(&(rel->rd_locator), rel->rd_backend,
forkNum);
/*
@@ -456,7 +456,7 @@ calculate_indexes_size(Relation rel)
idxRel = relation_open(idxOid, AccessShareLock);
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- size += calculate_relation_size(&(idxRel->rd_node),
+ size += calculate_relation_size(&(idxRel->rd_locator),
idxRel->rd_backend,
forkNum);
@@ -850,7 +850,7 @@ Datum
pg_relation_filenode(PG_FUNCTION_ARGS)
{
Oid relid = PG_GETARG_OID(0);
- Oid result;
+ RelFileNumber result;
HeapTuple tuple;
Form_pg_class relform;
@@ -864,29 +864,29 @@ pg_relation_filenode(PG_FUNCTION_ARGS)
if (relform->relfilenode)
result = relform->relfilenode;
else /* Consult the relation mapper */
- result = RelationMapOidToFilenode(relid,
- relform->relisshared);
+ result = RelationMapOidToFilenumber(relid,
+ relform->relisshared);
}
else
{
/* no storage, return NULL */
- result = InvalidOid;
+ result = InvalidRelFileNumber;
}
ReleaseSysCache(tuple);
- if (!OidIsValid(result))
+ if (!RelFileNumberIsValid(result))
PG_RETURN_NULL();
PG_RETURN_OID(result);
}
/*
- * Get the relation via (reltablespace, relfilenode)
+ * Get the relation via (reltablespace, relfilenumber)
*
* This is expected to be used when somebody wants to match an individual file
* on the filesystem back to its table. That's not trivially possible via
- * pg_class, because that doesn't contain the relfilenodes of shared and nailed
+ * pg_class, because that doesn't contain the relfilenumbers of shared and nailed
* tables.
*
* We don't fail but return NULL if we cannot find a mapping.
@@ -898,14 +898,14 @@ Datum
pg_filenode_relation(PG_FUNCTION_ARGS)
{
Oid reltablespace = PG_GETARG_OID(0);
- Oid relfilenode = PG_GETARG_OID(1);
+ RelFileNumber relfilenumber = PG_GETARG_OID(1);
Oid heaprel;
- /* test needed so RelidByRelfilenode doesn't misbehave */
- if (!OidIsValid(relfilenode))
+ /* test needed so RelidByRelfilenumber doesn't misbehave */
+ if (!RelFileNumberIsValid(relfilenumber))
PG_RETURN_NULL();
- heaprel = RelidByRelfilenode(reltablespace, relfilenode);
+ heaprel = RelidByRelfilenumber(reltablespace, relfilenumber);
if (!OidIsValid(heaprel))
PG_RETURN_NULL();
@@ -924,7 +924,7 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
Oid relid = PG_GETARG_OID(0);
HeapTuple tuple;
Form_pg_class relform;
- RelFileNode rnode;
+ RelFileLocator rlocator;
BackendId backend;
char *path;
@@ -937,29 +937,29 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
{
/* This logic should match RelationInitPhysicalAddr */
if (relform->reltablespace)
- rnode.spcNode = relform->reltablespace;
+ rlocator.spcOid = relform->reltablespace;
else
- rnode.spcNode = MyDatabaseTableSpace;
- if (rnode.spcNode == GLOBALTABLESPACE_OID)
- rnode.dbNode = InvalidOid;
+ rlocator.spcOid = MyDatabaseTableSpace;
+ if (rlocator.spcOid == GLOBALTABLESPACE_OID)
+ rlocator.dbOid = InvalidOid;
else
- rnode.dbNode = MyDatabaseId;
+ rlocator.dbOid = MyDatabaseId;
if (relform->relfilenode)
- rnode.relNode = relform->relfilenode;
+ rlocator.relNumber = relform->relfilenode;
else /* Consult the relation mapper */
- rnode.relNode = RelationMapOidToFilenode(relid,
- relform->relisshared);
+ rlocator.relNumber = RelationMapOidToFilenumber(relid,
+ relform->relisshared);
}
else
{
/* no storage, return NULL */
- rnode.relNode = InvalidOid;
+ rlocator.relNumber = InvalidOid;
/* some compilers generate warnings without these next two lines */
- rnode.dbNode = InvalidOid;
- rnode.spcNode = InvalidOid;
+ rlocator.dbOid = InvalidOid;
+ rlocator.spcOid = InvalidOid;
}
- if (!OidIsValid(rnode.relNode))
+ if (!OidIsValid(rlocator.relNumber))
{
ReleaseSysCache(tuple);
PG_RETURN_NULL();
@@ -990,7 +990,7 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
ReleaseSysCache(tuple);
- path = relpathbackend(rnode, backend, MAIN_FORKNUM);
+ path = relpathbackend(rlocator, backend, MAIN_FORKNUM);
PG_RETURN_TEXT_P(cstring_to_text(path));
}
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 67b9675e..4408c00 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -2,7 +2,7 @@
* pg_upgrade_support.c
*
* server-side functions to set backend global variables
- * to control oid and relfilenode assignment, and do other special
+ * to control oid and relfilenumber assignment, and do other special
* hacks needed for pg_upgrade.
*
* Copyright (c) 2010-2022, PostgreSQL Global Development Group
@@ -98,10 +98,10 @@ binary_upgrade_set_next_heap_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_heap_relfilenode(PG_FUNCTION_ARGS)
{
- Oid nodeoid = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_OID(0);
CHECK_IS_BINARY_UPGRADE;
- binary_upgrade_next_heap_pg_class_relfilenode = nodeoid;
+ binary_upgrade_next_heap_pg_class_relfilenumber = relfilenumber;
PG_RETURN_VOID();
}
@@ -120,10 +120,10 @@ binary_upgrade_set_next_index_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_index_relfilenode(PG_FUNCTION_ARGS)
{
- Oid nodeoid = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_OID(0);
CHECK_IS_BINARY_UPGRADE;
- binary_upgrade_next_index_pg_class_relfilenode = nodeoid;
+ binary_upgrade_next_index_pg_class_relfilenumber = relfilenumber;
PG_RETURN_VOID();
}
@@ -142,10 +142,10 @@ binary_upgrade_set_next_toast_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_toast_relfilenode(PG_FUNCTION_ARGS)
{
- Oid nodeoid = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_OID(0);
CHECK_IS_BINARY_UPGRADE;
- binary_upgrade_next_toast_pg_class_relfilenode = nodeoid;
+ binary_upgrade_next_toast_pg_class_relfilenumber = relfilenumber;
PG_RETURN_VOID();
}
diff --git a/src/backend/utils/cache/Makefile b/src/backend/utils/cache/Makefile
index 38e46d2..5105018 100644
--- a/src/backend/utils/cache/Makefile
+++ b/src/backend/utils/cache/Makefile
@@ -21,7 +21,7 @@ OBJS = \
partcache.o \
plancache.o \
relcache.o \
- relfilenodemap.o \
+ relfilenumbermap.o \
relmapper.o \
spccache.o \
syscache.o \
diff --git a/src/backend/utils/cache/inval.c b/src/backend/utils/cache/inval.c
index af000d4..eb5782f 100644
--- a/src/backend/utils/cache/inval.c
+++ b/src/backend/utils/cache/inval.c
@@ -661,11 +661,11 @@ LocalExecuteInvalidationMessage(SharedInvalidationMessage *msg)
* We could have smgr entries for relations of other databases, so no
* short-circuit test is possible here.
*/
- RelFileNodeBackend rnode;
+ RelFileLocatorBackend rlocator;
- rnode.node = msg->sm.rnode;
- rnode.backend = (msg->sm.backend_hi << 16) | (int) msg->sm.backend_lo;
- smgrclosenode(rnode);
+ rlocator.locator = msg->sm.rlocator;
+ rlocator.backend = (msg->sm.backend_hi << 16) | (int) msg->sm.backend_lo;
+ smgrcloserellocator(rlocator);
}
else if (msg->id == SHAREDINVALRELMAP_ID)
{
@@ -1459,14 +1459,14 @@ CacheInvalidateRelcacheByRelid(Oid relid)
* Thus, the maximum possible backend ID is 2^23-1.
*/
void
-CacheInvalidateSmgr(RelFileNodeBackend rnode)
+CacheInvalidateSmgr(RelFileLocatorBackend rlocator)
{
SharedInvalidationMessage msg;
msg.sm.id = SHAREDINVALSMGR_ID;
- msg.sm.backend_hi = rnode.backend >> 16;
- msg.sm.backend_lo = rnode.backend & 0xffff;
- msg.sm.rnode = rnode.node;
+ msg.sm.backend_hi = rlocator.backend >> 16;
+ msg.sm.backend_lo = rlocator.backend & 0xffff;
+ msg.sm.rlocator = rlocator.locator;
/* check AddCatcacheInvalidationMessage() for an explanation */
VALGRIND_MAKE_MEM_DEFINED(&msg, sizeof(msg));
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 0e8fda9..9bab6af 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -369,7 +369,7 @@ ScanPgRelation(Oid targetRelId, bool indexOK, bool force_non_historic)
/*
* The caller might need a tuple that's newer than the one the historic
* snapshot; currently the only case requiring to do so is looking up the
- * relfilenode of non mapped system relations during decoding. That
+ * relfilenumber of non mapped system relations during decoding. That
* snapshot can't change in the midst of a relcache build, so there's no
* need to register the snapshot.
*/
@@ -1133,8 +1133,8 @@ retry:
relation->rd_refcnt = 0;
relation->rd_isnailed = false;
relation->rd_createSubid = InvalidSubTransactionId;
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
relation->rd_droppedSubid = InvalidSubTransactionId;
switch (relation->rd_rel->relpersistence)
{
@@ -1300,7 +1300,7 @@ retry:
}
/*
- * Initialize the physical addressing info (RelFileNode) for a relcache entry
+ * Initialize the physical addressing info (RelFileLocator) for a relcache entry
*
* Note: at the physical level, relations in the pg_global tablespace must
* be treated as shared, even if relisshared isn't set. Hence we do not
@@ -1309,20 +1309,20 @@ retry:
static void
RelationInitPhysicalAddr(Relation relation)
{
- Oid oldnode = relation->rd_node.relNode;
+ RelFileNumber oldnumber = relation->rd_locator.relNumber;
/* these relations kinds never have storage */
if (!RELKIND_HAS_STORAGE(relation->rd_rel->relkind))
return;
if (relation->rd_rel->reltablespace)
- relation->rd_node.spcNode = relation->rd_rel->reltablespace;
+ relation->rd_locator.spcOid = relation->rd_rel->reltablespace;
else
- relation->rd_node.spcNode = MyDatabaseTableSpace;
- if (relation->rd_node.spcNode == GLOBALTABLESPACE_OID)
- relation->rd_node.dbNode = InvalidOid;
+ relation->rd_locator.spcOid = MyDatabaseTableSpace;
+ if (relation->rd_locator.spcOid == GLOBALTABLESPACE_OID)
+ relation->rd_locator.dbOid = InvalidOid;
else
- relation->rd_node.dbNode = MyDatabaseId;
+ relation->rd_locator.dbOid = MyDatabaseId;
if (relation->rd_rel->relfilenode)
{
@@ -1356,30 +1356,30 @@ RelationInitPhysicalAddr(Relation relation)
heap_freetuple(phys_tuple);
}
- relation->rd_node.relNode = relation->rd_rel->relfilenode;
+ relation->rd_locator.relNumber = relation->rd_rel->relfilenode;
}
else
{
/* Consult the relation mapper */
- relation->rd_node.relNode =
- RelationMapOidToFilenode(relation->rd_id,
- relation->rd_rel->relisshared);
- if (!OidIsValid(relation->rd_node.relNode))
+ relation->rd_locator.relNumber =
+ RelationMapOidToFilenumber(relation->rd_id,
+ relation->rd_rel->relisshared);
+ if (!RelFileNumberIsValid(relation->rd_locator.relNumber))
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
RelationGetRelationName(relation), relation->rd_id);
}
/*
* For RelationNeedsWAL() to answer correctly on parallel workers, restore
- * rd_firstRelfilenodeSubid. No subtransactions start or end while in
+ * rd_firstRelfilelocatorSubid. No subtransactions start or end while in
* parallel mode, so the specific SubTransactionId does not matter.
*/
- if (IsParallelWorker() && oldnode != relation->rd_node.relNode)
+ if (IsParallelWorker() && oldnumber != relation->rd_locator.relNumber)
{
- if (RelFileNodeSkippingWAL(relation->rd_node))
- relation->rd_firstRelfilenodeSubid = TopSubTransactionId;
+ if (RelFileLocatorSkippingWAL(relation->rd_locator))
+ relation->rd_firstRelfilelocatorSubid = TopSubTransactionId;
else
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
}
}
@@ -1889,8 +1889,8 @@ formrdesc(const char *relationName, Oid relationReltype,
*/
relation->rd_isnailed = true;
relation->rd_createSubid = InvalidSubTransactionId;
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
relation->rd_droppedSubid = InvalidSubTransactionId;
relation->rd_backend = InvalidBackendId;
relation->rd_islocaltemp = false;
@@ -1978,9 +1978,9 @@ formrdesc(const char *relationName, Oid relationReltype,
/*
* All relations made with formrdesc are mapped. This is necessarily so
- * because there is no other way to know what filenode they currently
+ * because there is no other way to know what filenumber they currently
* have. In bootstrap mode, add them to the initial relation mapper data,
- * specifying that the initial filenode is the same as the OID.
+ * specifying that the initial filenumber is the same as the OID.
*/
relation->rd_rel->relfilenode = InvalidOid;
if (IsBootstrapProcessingMode())
@@ -2180,7 +2180,7 @@ RelationClose(Relation relation)
#ifdef RELCACHE_FORCE_RELEASE
if (RelationHasReferenceCountZero(relation) &&
relation->rd_createSubid == InvalidSubTransactionId &&
- relation->rd_firstRelfilenodeSubid == InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid == InvalidSubTransactionId)
RelationClearRelation(relation, false);
#endif
}
@@ -2352,7 +2352,7 @@ RelationReloadNailed(Relation relation)
{
/*
* If it's a nailed-but-not-mapped index, then we need to re-read the
- * pg_class row to see if its relfilenode changed.
+ * pg_class row to see if its relfilenumber changed.
*/
RelationReloadIndexInfo(relation);
}
@@ -2700,8 +2700,8 @@ RelationClearRelation(Relation relation, bool rebuild)
Assert(newrel->rd_isnailed == relation->rd_isnailed);
/* creation sub-XIDs must be preserved */
SWAPFIELD(SubTransactionId, rd_createSubid);
- SWAPFIELD(SubTransactionId, rd_newRelfilenodeSubid);
- SWAPFIELD(SubTransactionId, rd_firstRelfilenodeSubid);
+ SWAPFIELD(SubTransactionId, rd_newRelfilelocatorSubid);
+ SWAPFIELD(SubTransactionId, rd_firstRelfilelocatorSubid);
SWAPFIELD(SubTransactionId, rd_droppedSubid);
/* un-swap rd_rel pointers, swap contents instead */
SWAPFIELD(Form_pg_class, rd_rel);
@@ -2791,12 +2791,12 @@ static void
RelationFlushRelation(Relation relation)
{
if (relation->rd_createSubid != InvalidSubTransactionId ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId)
{
/*
* New relcache entries are always rebuilt, not flushed; else we'd
* forget the "new" status of the relation. Ditto for the
- * new-relfilenode status.
+ * new-relfilenumber status.
*
* The rel could have zero refcnt here, so temporarily increment the
* refcnt to ensure it's safe to rebuild it. We can assume that the
@@ -2835,7 +2835,7 @@ RelationForgetRelation(Oid rid)
Assert(relation->rd_droppedSubid == InvalidSubTransactionId);
if (relation->rd_createSubid != InvalidSubTransactionId ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId)
{
/*
* In the event of subtransaction rollback, we must not forget
@@ -2894,7 +2894,7 @@ RelationCacheInvalidateEntry(Oid relationId)
*
* Apart from debug_discard_caches, this is currently used only to recover
* from SI message buffer overflow, so we do not touch relations having
- * new-in-transaction relfilenodes; they cannot be targets of cross-backend
+ * new-in-transaction relfilenumbers; they cannot be targets of cross-backend
* SI updates (and our own updates now go through a separate linked list
* that isn't limited by the SI message buffer size).
*
@@ -2909,7 +2909,7 @@ RelationCacheInvalidateEntry(Oid relationId)
* so hash_seq_search will complete safely; (b) during the second pass we
* only hold onto pointers to nondeletable entries.
*
- * The two-phase approach also makes it easy to update relfilenodes for
+ * The two-phase approach also makes it easy to update relfilenumbers for
* mapped relations before we do anything else, and to ensure that the
* second pass processes nailed-in-cache items before other nondeletable
* items. This should ensure that system catalogs are up to date before
@@ -2948,12 +2948,12 @@ RelationCacheInvalidate(bool debug_discard)
/*
* Ignore new relations; no other backend will manipulate them before
- * we commit. Likewise, before replacing a relation's relfilenode, we
- * shall have acquired AccessExclusiveLock and drained any applicable
- * pending invalidations.
+ * we commit. Likewise, before replacing a relation's relfilenumber,
+ * we shall have acquired AccessExclusiveLock and drained any
+ * applicable pending invalidations.
*/
if (relation->rd_createSubid != InvalidSubTransactionId ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId)
continue;
relcacheInvalsReceived++;
@@ -2967,8 +2967,8 @@ RelationCacheInvalidate(bool debug_discard)
else
{
/*
- * If it's a mapped relation, immediately update its rd_node in
- * case its relfilenode changed. We must do this during phase 1
+ * If it's a mapped relation, immediately update its rd_locator in
+ * case its relfilenumber changed. We must do this during phase 1
* in case the relation is consulted during rebuild of other
* relcache entries in phase 2. It's safe since consulting the
* map doesn't involve any access to relcache entries.
@@ -3078,14 +3078,14 @@ AssertPendingSyncConsistency(Relation relation)
RelationIsPermanent(relation) &&
((relation->rd_createSubid != InvalidSubTransactionId &&
RELKIND_HAS_STORAGE(relation->rd_rel->relkind)) ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId);
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId);
- Assert(relcache_verdict == RelFileNodeSkippingWAL(relation->rd_node));
+ Assert(relcache_verdict == RelFileLocatorSkippingWAL(relation->rd_locator));
if (relation->rd_droppedSubid != InvalidSubTransactionId)
Assert(!relation->rd_isvalid &&
(relation->rd_createSubid != InvalidSubTransactionId ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId));
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId));
}
/*
@@ -3282,8 +3282,8 @@ AtEOXact_cleanup(Relation relation, bool isCommit)
* also lets RelationClearRelation() drop the relcache entry.
*/
relation->rd_createSubid = InvalidSubTransactionId;
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
relation->rd_droppedSubid = InvalidSubTransactionId;
if (clear_relcache)
@@ -3397,8 +3397,8 @@ AtEOSubXact_cleanup(Relation relation, bool isCommit,
{
/* allow the entry to be removed */
relation->rd_createSubid = InvalidSubTransactionId;
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
relation->rd_droppedSubid = InvalidSubTransactionId;
RelationClearRelation(relation, false);
return;
@@ -3419,23 +3419,23 @@ AtEOSubXact_cleanup(Relation relation, bool isCommit,
}
/*
- * Likewise, update or drop any new-relfilenode-in-subtransaction record
+ * Likewise, update or drop any new-relfilenumber-in-subtransaction record
* or drop record.
*/
- if (relation->rd_newRelfilenodeSubid == mySubid)
+ if (relation->rd_newRelfilelocatorSubid == mySubid)
{
if (isCommit)
- relation->rd_newRelfilenodeSubid = parentSubid;
+ relation->rd_newRelfilelocatorSubid = parentSubid;
else
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
}
- if (relation->rd_firstRelfilenodeSubid == mySubid)
+ if (relation->rd_firstRelfilelocatorSubid == mySubid)
{
if (isCommit)
- relation->rd_firstRelfilenodeSubid = parentSubid;
+ relation->rd_firstRelfilelocatorSubid = parentSubid;
else
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
}
if (relation->rd_droppedSubid == mySubid)
@@ -3459,7 +3459,7 @@ RelationBuildLocalRelation(const char *relname,
TupleDesc tupDesc,
Oid relid,
Oid accessmtd,
- Oid relfilenode,
+ RelFileNumber relfilenumber,
Oid reltablespace,
bool shared_relation,
bool mapped_relation,
@@ -3533,8 +3533,8 @@ RelationBuildLocalRelation(const char *relname,
/* it's being created in this transaction */
rel->rd_createSubid = GetCurrentSubTransactionId();
- rel->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- rel->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ rel->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ rel->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
rel->rd_droppedSubid = InvalidSubTransactionId;
/*
@@ -3616,7 +3616,7 @@ RelationBuildLocalRelation(const char *relname,
/*
* Insert relation physical and logical identifiers (OIDs) into the right
- * places. For a mapped relation, we set relfilenode to zero and rely on
+ * places. For a mapped relation, we set relfilenumber to zero and rely on
* RelationInitPhysicalAddr to consult the map.
*/
rel->rd_rel->relisshared = shared_relation;
@@ -3632,10 +3632,10 @@ RelationBuildLocalRelation(const char *relname,
{
rel->rd_rel->relfilenode = InvalidOid;
/* Add it to the active mapping information */
- RelationMapUpdateMap(relid, relfilenode, shared_relation, true);
+ RelationMapUpdateMap(relid, relfilenumber, shared_relation, true);
}
else
- rel->rd_rel->relfilenode = relfilenode;
+ rel->rd_rel->relfilenode = relfilenumber;
RelationInitLockInfo(rel); /* see lmgr.c */
@@ -3683,13 +3683,13 @@ RelationBuildLocalRelation(const char *relname,
/*
- * RelationSetNewRelfilenode
+ * RelationSetNewRelfilenumber
*
- * Assign a new relfilenode (physical file name), and possibly a new
+ * Assign a new relfilenumber (physical file name), and possibly a new
* persistence setting, to the relation.
*
* This allows a full rewrite of the relation to be done with transactional
- * safety (since the filenode assignment can be rolled back). Note however
+ * safety (since the filenumber assignment can be rolled back). Note however
* that there is no simple way to access the relation's old data for the
* remainder of the current transaction. This limits the usefulness to cases
* such as TRUNCATE or rebuilding an index from scratch.
@@ -3697,19 +3697,19 @@ RelationBuildLocalRelation(const char *relname,
* Caller must already hold exclusive lock on the relation.
*/
void
-RelationSetNewRelfilenode(Relation relation, char persistence)
+RelationSetNewRelfilenumber(Relation relation, char persistence)
{
- Oid newrelfilenode;
+ RelFileNumber newrelfilenumber;
Relation pg_class;
HeapTuple tuple;
Form_pg_class classform;
MultiXactId minmulti = InvalidMultiXactId;
TransactionId freezeXid = InvalidTransactionId;
- RelFileNode newrnode;
+ RelFileLocator newrlocator;
- /* Allocate a new relfilenode */
- newrelfilenode = GetNewRelFileNode(relation->rd_rel->reltablespace, NULL,
- persistence);
+ /* Allocate a new relfilenumber */
+ newrelfilenumber = GetNewRelFileNumber(relation->rd_rel->reltablespace,
+ NULL, persistence);
/*
* Get a writable copy of the pg_class tuple for the given relation.
@@ -3729,28 +3729,28 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
RelationDropStorage(relation);
/*
- * Create storage for the main fork of the new relfilenode. If it's a
+ * Create storage for the main fork of the new relfilenumber. If it's a
* table-like object, call into the table AM to do so, which'll also
* create the table's init fork if needed.
*
- * NOTE: If relevant for the AM, any conflict in relfilenode value will be
- * caught here, if GetNewRelFileNode messes up for any reason.
+ * NOTE: If relevant for the AM, any conflict in relfilenumber value will be
+ * caught here, if GetNewRelFileNumber messes up for any reason.
*/
- newrnode = relation->rd_node;
- newrnode.relNode = newrelfilenode;
+ newrlocator = relation->rd_locator;
+ newrlocator.relNumber = newrelfilenumber;
if (RELKIND_HAS_TABLE_AM(relation->rd_rel->relkind))
{
- table_relation_set_new_filenode(relation, &newrnode,
- persistence,
- &freezeXid, &minmulti);
+ table_relation_set_new_filelocator(relation, &newrlocator,
+ persistence,
+ &freezeXid, &minmulti);
}
else if (RELKIND_HAS_STORAGE(relation->rd_rel->relkind))
{
/* handle these directly, at least for now */
SMgrRelation srel;
- srel = RelationCreateStorage(newrnode, persistence, true);
+ srel = RelationCreateStorage(newrlocator, persistence, true);
smgrclose(srel);
}
else
@@ -3789,7 +3789,7 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
/* Do the deed */
RelationMapUpdateMap(RelationGetRelid(relation),
- newrelfilenode,
+ newrelfilenumber,
relation->rd_rel->relisshared,
false);
@@ -3799,7 +3799,7 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
else
{
/* Normal case, update the pg_class entry */
- classform->relfilenode = newrelfilenode;
+ classform->relfilenode = newrelfilenumber;
/* relpages etc. never change for sequences */
if (relation->rd_rel->relkind != RELKIND_SEQUENCE)
@@ -3825,27 +3825,27 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
*/
CommandCounterIncrement();
- RelationAssumeNewRelfilenode(relation);
+ RelationAssumeNewRelfilelocator(relation);
}
/*
- * RelationAssumeNewRelfilenode
+ * RelationAssumeNewRelfilelocator
*
* Code that modifies pg_class.reltablespace or pg_class.relfilenode must call
* this. The call shall precede any code that might insert WAL records whose
- * replay would modify bytes in the new RelFileNode, and the call shall follow
- * any WAL modifying bytes in the prior RelFileNode. See struct RelationData.
+ * replay would modify bytes in the new RelFileLocator, and the call shall follow
+ * any WAL modifying bytes in the prior RelFileLocator. See struct RelationData.
* Ideally, call this as near as possible to the CommandCounterIncrement()
* that makes the pg_class change visible (before it or after it); that
* minimizes the chance of future development adding a forbidden WAL insertion
- * between RelationAssumeNewRelfilenode() and CommandCounterIncrement().
+ * between RelationAssumeNewRelfilelocator() and CommandCounterIncrement().
*/
void
-RelationAssumeNewRelfilenode(Relation relation)
+RelationAssumeNewRelfilelocator(Relation relation)
{
- relation->rd_newRelfilenodeSubid = GetCurrentSubTransactionId();
- if (relation->rd_firstRelfilenodeSubid == InvalidSubTransactionId)
- relation->rd_firstRelfilenodeSubid = relation->rd_newRelfilenodeSubid;
+ relation->rd_newRelfilelocatorSubid = GetCurrentSubTransactionId();
+ if (relation->rd_firstRelfilelocatorSubid == InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid = relation->rd_newRelfilelocatorSubid;
/* Flag relation as needing eoxact cleanup (to clear these fields) */
EOXactListAdd(relation);
@@ -6254,8 +6254,8 @@ load_relcache_init_file(bool shared)
rel->rd_fkeyvalid = false;
rel->rd_fkeylist = NIL;
rel->rd_createSubid = InvalidSubTransactionId;
- rel->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- rel->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ rel->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ rel->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
rel->rd_droppedSubid = InvalidSubTransactionId;
rel->rd_amcache = NULL;
MemSet(&rel->pgstat_info, 0, sizeof(rel->pgstat_info));
diff --git a/src/backend/utils/cache/relfilenodemap.c b/src/backend/utils/cache/relfilenodemap.c
deleted file mode 100644
index 70c323c..0000000
--- a/src/backend/utils/cache/relfilenodemap.c
+++ /dev/null
@@ -1,244 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * relfilenodemap.c
- * relfilenode to oid mapping cache.
- *
- * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * IDENTIFICATION
- * src/backend/utils/cache/relfilenodemap.c
- *
- *-------------------------------------------------------------------------
- */
-#include "postgres.h"
-
-#include "access/genam.h"
-#include "access/htup_details.h"
-#include "access/table.h"
-#include "catalog/pg_class.h"
-#include "catalog/pg_tablespace.h"
-#include "miscadmin.h"
-#include "utils/builtins.h"
-#include "utils/catcache.h"
-#include "utils/fmgroids.h"
-#include "utils/hsearch.h"
-#include "utils/inval.h"
-#include "utils/rel.h"
-#include "utils/relfilenodemap.h"
-#include "utils/relmapper.h"
-
-/* Hash table for information about each relfilenode <-> oid pair */
-static HTAB *RelfilenodeMapHash = NULL;
-
-/* built first time through in InitializeRelfilenodeMap */
-static ScanKeyData relfilenode_skey[2];
-
-typedef struct
-{
- Oid reltablespace;
- Oid relfilenode;
-} RelfilenodeMapKey;
-
-typedef struct
-{
- RelfilenodeMapKey key; /* lookup key - must be first */
- Oid relid; /* pg_class.oid */
-} RelfilenodeMapEntry;
-
-/*
- * RelfilenodeMapInvalidateCallback
- * Flush mapping entries when pg_class is updated in a relevant fashion.
- */
-static void
-RelfilenodeMapInvalidateCallback(Datum arg, Oid relid)
-{
- HASH_SEQ_STATUS status;
- RelfilenodeMapEntry *entry;
-
- /* callback only gets registered after creating the hash */
- Assert(RelfilenodeMapHash != NULL);
-
- hash_seq_init(&status, RelfilenodeMapHash);
- while ((entry = (RelfilenodeMapEntry *) hash_seq_search(&status)) != NULL)
- {
- /*
- * If relid is InvalidOid, signaling a complete reset, we must remove
- * all entries, otherwise just remove the specific relation's entry.
- * Always remove negative cache entries.
- */
- if (relid == InvalidOid || /* complete reset */
- entry->relid == InvalidOid || /* negative cache entry */
- entry->relid == relid) /* individual flushed relation */
- {
- if (hash_search(RelfilenodeMapHash,
- (void *) &entry->key,
- HASH_REMOVE,
- NULL) == NULL)
- elog(ERROR, "hash table corrupted");
- }
- }
-}
-
-/*
- * InitializeRelfilenodeMap
- * Initialize cache, either on first use or after a reset.
- */
-static void
-InitializeRelfilenodeMap(void)
-{
- HASHCTL ctl;
- int i;
-
- /* Make sure we've initialized CacheMemoryContext. */
- if (CacheMemoryContext == NULL)
- CreateCacheMemoryContext();
-
- /* build skey */
- MemSet(&relfilenode_skey, 0, sizeof(relfilenode_skey));
-
- for (i = 0; i < 2; i++)
- {
- fmgr_info_cxt(F_OIDEQ,
- &relfilenode_skey[i].sk_func,
- CacheMemoryContext);
- relfilenode_skey[i].sk_strategy = BTEqualStrategyNumber;
- relfilenode_skey[i].sk_subtype = InvalidOid;
- relfilenode_skey[i].sk_collation = InvalidOid;
- }
-
- relfilenode_skey[0].sk_attno = Anum_pg_class_reltablespace;
- relfilenode_skey[1].sk_attno = Anum_pg_class_relfilenode;
-
- /*
- * Only create the RelfilenodeMapHash now, so we don't end up partially
- * initialized when fmgr_info_cxt() above ERRORs out with an out of memory
- * error.
- */
- ctl.keysize = sizeof(RelfilenodeMapKey);
- ctl.entrysize = sizeof(RelfilenodeMapEntry);
- ctl.hcxt = CacheMemoryContext;
-
- RelfilenodeMapHash =
- hash_create("RelfilenodeMap cache", 64, &ctl,
- HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
-
- /* Watch for invalidation events. */
- CacheRegisterRelcacheCallback(RelfilenodeMapInvalidateCallback,
- (Datum) 0);
-}
-
-/*
- * Map a relation's (tablespace, filenode) to a relation's oid and cache the
- * result.
- *
- * Returns InvalidOid if no relation matching the criteria could be found.
- */
-Oid
-RelidByRelfilenode(Oid reltablespace, Oid relfilenode)
-{
- RelfilenodeMapKey key;
- RelfilenodeMapEntry *entry;
- bool found;
- SysScanDesc scandesc;
- Relation relation;
- HeapTuple ntp;
- ScanKeyData skey[2];
- Oid relid;
-
- if (RelfilenodeMapHash == NULL)
- InitializeRelfilenodeMap();
-
- /* pg_class will show 0 when the value is actually MyDatabaseTableSpace */
- if (reltablespace == MyDatabaseTableSpace)
- reltablespace = 0;
-
- MemSet(&key, 0, sizeof(key));
- key.reltablespace = reltablespace;
- key.relfilenode = relfilenode;
-
- /*
- * Check cache and return entry if one is found. Even if no target
- * relation can be found later on we store the negative match and return a
- * InvalidOid from cache. That's not really necessary for performance
- * since querying invalid values isn't supposed to be a frequent thing,
- * but it's basically free.
- */
- entry = hash_search(RelfilenodeMapHash, (void *) &key, HASH_FIND, &found);
-
- if (found)
- return entry->relid;
-
- /* ok, no previous cache entry, do it the hard way */
-
- /* initialize empty/negative cache entry before doing the actual lookups */
- relid = InvalidOid;
-
- if (reltablespace == GLOBALTABLESPACE_OID)
- {
- /*
- * Ok, shared table, check relmapper.
- */
- relid = RelationMapFilenodeToOid(relfilenode, true);
- }
- else
- {
- /*
- * Not a shared table, could either be a plain relation or a
- * non-shared, nailed one, like e.g. pg_class.
- */
-
- /* check for plain relations by looking in pg_class */
- relation = table_open(RelationRelationId, AccessShareLock);
-
- /* copy scankey to local copy, it will be modified during the scan */
- memcpy(skey, relfilenode_skey, sizeof(skey));
-
- /* set scan arguments */
- skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
- skey[1].sk_argument = ObjectIdGetDatum(relfilenode);
-
- scandesc = systable_beginscan(relation,
- ClassTblspcRelfilenodeIndexId,
- true,
- NULL,
- 2,
- skey);
-
- found = false;
-
- while (HeapTupleIsValid(ntp = systable_getnext(scandesc)))
- {
- Form_pg_class classform = (Form_pg_class) GETSTRUCT(ntp);
-
- if (found)
- elog(ERROR,
- "unexpected duplicate for tablespace %u, relfilenode %u",
- reltablespace, relfilenode);
- found = true;
-
- Assert(classform->reltablespace == reltablespace);
- Assert(classform->relfilenode == relfilenode);
- relid = classform->oid;
- }
-
- systable_endscan(scandesc);
- table_close(relation, AccessShareLock);
-
- /* check for tables that are mapped but not shared */
- if (!found)
- relid = RelationMapFilenodeToOid(relfilenode, false);
- }
-
- /*
- * Only enter entry into cache now, our opening of pg_class could have
- * caused cache invalidations to be executed which would have deleted a
- * new entry if we had entered it above.
- */
- entry = hash_search(RelfilenodeMapHash, (void *) &key, HASH_ENTER, &found);
- if (found)
- elog(ERROR, "corrupted hashtable");
- entry->relid = relid;
-
- return relid;
-}
diff --git a/src/backend/utils/cache/relfilenumbermap.c b/src/backend/utils/cache/relfilenumbermap.c
new file mode 100644
index 0000000..3dc45e9
--- /dev/null
+++ b/src/backend/utils/cache/relfilenumbermap.c
@@ -0,0 +1,244 @@
+/*-------------------------------------------------------------------------
+ *
+ * relfilenumbermap.c
+ * relfilenumber to oid mapping cache.
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/backend/utils/cache/relfilenumbermap.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/htup_details.h"
+#include "access/table.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_tablespace.h"
+#include "miscadmin.h"
+#include "utils/builtins.h"
+#include "utils/catcache.h"
+#include "utils/fmgroids.h"
+#include "utils/hsearch.h"
+#include "utils/inval.h"
+#include "utils/rel.h"
+#include "utils/relfilenumbermap.h"
+#include "utils/relmapper.h"
+
+/* Hash table for information about each relfilenumber <-> oid pair */
+static HTAB *RelfilenumberMapHash = NULL;
+
+/* built first time through in InitializeRelfilenumberMap */
+static ScanKeyData relfilenumber_skey[2];
+
+typedef struct
+{
+ Oid reltablespace;
+ RelFileNumber relfilenumber;
+} RelfilenumberMapKey;
+
+typedef struct
+{
+ RelfilenumberMapKey key; /* lookup key - must be first */
+ Oid relid; /* pg_class.oid */
+} RelfilenumberMapEntry;
+
+/*
+ * RelfilenumberMapInvalidateCallback
+ * Flush mapping entries when pg_class is updated in a relevant fashion.
+ */
+static void
+RelfilenumberMapInvalidateCallback(Datum arg, Oid relid)
+{
+ HASH_SEQ_STATUS status;
+ RelfilenumberMapEntry *entry;
+
+ /* callback only gets registered after creating the hash */
+ Assert(RelfilenumberMapHash != NULL);
+
+ hash_seq_init(&status, RelfilenumberMapHash);
+ while ((entry = (RelfilenumberMapEntry *) hash_seq_search(&status)) != NULL)
+ {
+ /*
+ * If relid is InvalidOid, signaling a complete reset, we must remove
+ * all entries, otherwise just remove the specific relation's entry.
+ * Always remove negative cache entries.
+ */
+ if (relid == InvalidOid || /* complete reset */
+ entry->relid == InvalidOid || /* negative cache entry */
+ entry->relid == relid) /* individual flushed relation */
+ {
+ if (hash_search(RelfilenumberMapHash,
+ (void *) &entry->key,
+ HASH_REMOVE,
+ NULL) == NULL)
+ elog(ERROR, "hash table corrupted");
+ }
+ }
+}
+
+/*
+ * InitializeRelfilenumberMap
+ * Initialize cache, either on first use or after a reset.
+ */
+static void
+InitializeRelfilenumberMap(void)
+{
+ HASHCTL ctl;
+ int i;
+
+ /* Make sure we've initialized CacheMemoryContext. */
+ if (CacheMemoryContext == NULL)
+ CreateCacheMemoryContext();
+
+ /* build skey */
+ MemSet(&relfilenumber_skey, 0, sizeof(relfilenumber_skey));
+
+ for (i = 0; i < 2; i++)
+ {
+ fmgr_info_cxt(F_OIDEQ,
+ &relfilenumber_skey[i].sk_func,
+ CacheMemoryContext);
+ relfilenumber_skey[i].sk_strategy = BTEqualStrategyNumber;
+ relfilenumber_skey[i].sk_subtype = InvalidOid;
+ relfilenumber_skey[i].sk_collation = InvalidOid;
+ }
+
+ relfilenumber_skey[0].sk_attno = Anum_pg_class_reltablespace;
+ relfilenumber_skey[1].sk_attno = Anum_pg_class_relfilenode;
+
+ /*
+ * Only create the RelfilenumberMapHash now, so we don't end up partially
+ * initialized when fmgr_info_cxt() above ERRORs out with an out of memory
+ * error.
+ */
+ ctl.keysize = sizeof(RelfilenumberMapKey);
+ ctl.entrysize = sizeof(RelfilenumberMapEntry);
+ ctl.hcxt = CacheMemoryContext;
+
+ RelfilenumberMapHash =
+ hash_create("RelfilenumberMap cache", 64, &ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
+ /* Watch for invalidation events. */
+ CacheRegisterRelcacheCallback(RelfilenumberMapInvalidateCallback,
+ (Datum) 0);
+}
+
+/*
+ * Map a relation's (tablespace, relfilenumber) to a relation's oid and cache
+ * the result.
+ *
+ * Returns InvalidOid if no relation matching the criteria could be found.
+ */
+Oid
+RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
+{
+ RelfilenumberMapKey key;
+ RelfilenumberMapEntry *entry;
+ bool found;
+ SysScanDesc scandesc;
+ Relation relation;
+ HeapTuple ntp;
+ ScanKeyData skey[2];
+ Oid relid;
+
+ if (RelfilenumberMapHash == NULL)
+ InitializeRelfilenumberMap();
+
+ /* pg_class will show 0 when the value is actually MyDatabaseTableSpace */
+ if (reltablespace == MyDatabaseTableSpace)
+ reltablespace = 0;
+
+ MemSet(&key, 0, sizeof(key));
+ key.reltablespace = reltablespace;
+ key.relfilenumber = relfilenumber;
+
+ /*
+ * Check cache and return entry if one is found. Even if no target
+ * relation can be found later on, we store the negative match and return an
+ * InvalidOid from cache. That's not really necessary for performance
+ * since querying invalid values isn't supposed to be a frequent thing,
+ * but it's basically free.
+ */
+ entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_FIND, &found);
+
+ if (found)
+ return entry->relid;
+
+ /* ok, no previous cache entry, do it the hard way */
+
+ /* initialize empty/negative cache entry before doing the actual lookups */
+ relid = InvalidOid;
+
+ if (reltablespace == GLOBALTABLESPACE_OID)
+ {
+ /*
+ * Ok, shared table, check relmapper.
+ */
+ relid = RelationMapFilenumberToOid(relfilenumber, true);
+ }
+ else
+ {
+ /*
+ * Not a shared table, could either be a plain relation or a
+ * non-shared, nailed one, like e.g. pg_class.
+ */
+
+ /* check for plain relations by looking in pg_class */
+ relation = table_open(RelationRelationId, AccessShareLock);
+
+ /* copy scankey to local copy, it will be modified during the scan */
+ memcpy(skey, relfilenumber_skey, sizeof(skey));
+
+ /* set scan arguments */
+ skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
+ skey[1].sk_argument = ObjectIdGetDatum(relfilenumber);
+
+ scandesc = systable_beginscan(relation,
+ ClassTblspcRelfilenodeIndexId,
+ true,
+ NULL,
+ 2,
+ skey);
+
+ found = false;
+
+ while (HeapTupleIsValid(ntp = systable_getnext(scandesc)))
+ {
+ Form_pg_class classform = (Form_pg_class) GETSTRUCT(ntp);
+
+ if (found)
+ elog(ERROR,
+ "unexpected duplicate for tablespace %u, relfilenumber %u",
+ reltablespace, relfilenumber);
+ found = true;
+
+ Assert(classform->reltablespace == reltablespace);
+ Assert(classform->relfilenode == relfilenumber);
+ relid = classform->oid;
+ }
+
+ systable_endscan(scandesc);
+ table_close(relation, AccessShareLock);
+
+ /* check for tables that are mapped but not shared */
+ if (!found)
+ relid = RelationMapFilenumberToOid(relfilenumber, false);
+ }
+
+ /*
+ * Only enter entry into cache now, our opening of pg_class could have
+ * caused cache invalidations to be executed which would have deleted a
+ * new entry if we had entered it above.
+ */
+ entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_ENTER, &found);
+ if (found)
+ elog(ERROR, "corrupted hashtable");
+ entry->relid = relid;
+
+ return relid;
+}
diff --git a/src/backend/utils/cache/relmapper.c b/src/backend/utils/cache/relmapper.c
index 2a330cf..8009726 100644
--- a/src/backend/utils/cache/relmapper.c
+++ b/src/backend/utils/cache/relmapper.c
@@ -1,7 +1,7 @@
/*-------------------------------------------------------------------------
*
* relmapper.c
- * Catalog-to-filenode mapping
+ * Catalog-to-filenumber mapping
*
* For most tables, the physical file underlying the table is specified by
* pg_class.relfilenode. However, that obviously won't work for pg_class
@@ -11,7 +11,7 @@
* update other databases' pg_class entries when relocating a shared catalog.
* Therefore, for these special catalogs (henceforth referred to as "mapped
* catalogs") we rely on a separately maintained file that shows the mapping
- * from catalog OIDs to filenode numbers. Each database has a map file for
+ * from catalog OIDs to filenumbers. Each database has a map file for
* its local mapped catalogs, and there is a separate map file for shared
* catalogs. Mapped catalogs have zero in their pg_class.relfilenode entries.
*
@@ -78,8 +78,8 @@
typedef struct RelMapping
{
- Oid mapoid; /* OID of a catalog */
- Oid mapfilenode; /* its filenode number */
+ Oid mapoid; /* OID of a catalog */
+ RelFileNumber mapfilenumber; /* its relfilenumber */
} RelMapping;
typedef struct RelMapFile
@@ -116,7 +116,7 @@ static RelMapFile local_map;
* subtransactions, so one set of transaction-level changes is sufficient.
*
* The active_xxx variables contain updates that are valid in our transaction
- * and should be honored by RelationMapOidToFilenode. The pending_xxx
+ * and should be honored by RelationMapOidToFilenumber. The pending_xxx
* variables contain updates we have been told about that aren't active yet;
* they will become active at the next CommandCounterIncrement. This setup
* lets map updates act similarly to updates of pg_class rows, ie, they
@@ -132,8 +132,8 @@ static RelMapFile pending_local_updates;
/* non-export function prototypes */
-static void apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode,
- bool add_okay);
+static void apply_map_update(RelMapFile *map, Oid relationId,
+ RelFileNumber filenumber, bool add_okay);
static void merge_map_updates(RelMapFile *map, const RelMapFile *updates,
bool add_okay);
static void load_relmap_file(bool shared, bool lock_held);
@@ -146,9 +146,9 @@ static void perform_relmap_update(bool shared, const RelMapFile *updates);
/*
- * RelationMapOidToFilenode
+ * RelationMapOidToFilenumber
*
- * The raison d' etre ... given a relation OID, look up its filenode.
+ * The raison d' etre ... given a relation OID, look up its filenumber.
*
* Although shared and local relation OIDs should never overlap, the caller
* always knows which we need --- so pass that information to avoid useless
@@ -157,8 +157,8 @@ static void perform_relmap_update(bool shared, const RelMapFile *updates);
* Returns InvalidOid if the OID is not known (which should never happen,
* but the caller is in a better position to report a meaningful error).
*/
-Oid
-RelationMapOidToFilenode(Oid relationId, bool shared)
+RelFileNumber
+RelationMapOidToFilenumber(Oid relationId, bool shared)
{
const RelMapFile *map;
int32 i;
@@ -170,13 +170,13 @@ RelationMapOidToFilenode(Oid relationId, bool shared)
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return map->mappings[i].mapfilenumber;
}
map = &shared_map;
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return map->mappings[i].mapfilenumber;
}
}
else
@@ -185,33 +185,33 @@ RelationMapOidToFilenode(Oid relationId, bool shared)
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return map->mappings[i].mapfilenumber;
}
map = &local_map;
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return map->mappings[i].mapfilenumber;
}
}
- return InvalidOid;
+ return InvalidRelFileNumber;
}
/*
- * RelationMapFilenodeToOid
+ * RelationMapFilenumberToOid
*
* Do the reverse of the normal direction of mapping done in
- * RelationMapOidToFilenode.
+ * RelationMapOidToFilenumber.
*
* This is not supposed to be used during normal running but rather for
* information purposes when looking at the filesystem or xlog.
*
* Returns InvalidOid if the OID is not known; this can easily happen if the
- * relfilenode doesn't pertain to a mapped relation.
+ * relfilenumber doesn't pertain to a mapped relation.
*/
Oid
-RelationMapFilenodeToOid(Oid filenode, bool shared)
+RelationMapFilenumberToOid(RelFileNumber filenumber, bool shared)
{
const RelMapFile *map;
int32 i;
@@ -222,13 +222,13 @@ RelationMapFilenodeToOid(Oid filenode, bool shared)
map = &active_shared_updates;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenumber == map->mappings[i].mapfilenumber)
return map->mappings[i].mapoid;
}
map = &shared_map;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenumber == map->mappings[i].mapfilenumber)
return map->mappings[i].mapoid;
}
}
@@ -237,13 +237,13 @@ RelationMapFilenodeToOid(Oid filenode, bool shared)
map = &active_local_updates;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenumber == map->mappings[i].mapfilenumber)
return map->mappings[i].mapoid;
}
map = &local_map;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenumber == map->mappings[i].mapfilenumber)
return map->mappings[i].mapoid;
}
}
@@ -252,13 +252,13 @@ RelationMapFilenodeToOid(Oid filenode, bool shared)
}
/*
- * RelationMapOidToFilenodeForDatabase
+ * RelationMapOidToFilenumberForDatabase
*
- * Like RelationMapOidToFilenode, but reads the mapping from the indicated
+ * Like RelationMapOidToFilenumber, but reads the mapping from the indicated
* path instead of using the one for the current database.
*/
Oid
-RelationMapOidToFilenodeForDatabase(char *dbpath, Oid relationId)
+RelationMapOidToFilenumberForDatabase(char *dbpath, Oid relationId)
{
RelMapFile map;
int i;
@@ -270,7 +270,7 @@ RelationMapOidToFilenodeForDatabase(char *dbpath, Oid relationId)
for (i = 0; i < map.num_mappings; i++)
{
if (relationId == map.mappings[i].mapoid)
- return map.mappings[i].mapfilenode;
+ return map.mappings[i].mapfilenumber;
}
return InvalidOid;
@@ -311,13 +311,13 @@ RelationMapCopy(Oid dbid, Oid tsid, char *srcdbpath, char *dstdbpath)
/*
* RelationMapUpdateMap
*
- * Install a new relfilenode mapping for the specified relation.
+ * Install a new relfilenumber mapping for the specified relation.
*
* If immediate is true (or we're bootstrapping), the mapping is activated
* immediately. Otherwise it is made pending until CommandCounterIncrement.
*/
void
-RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
+RelationMapUpdateMap(Oid relationId, RelFileNumber fileNumber, bool shared,
bool immediate)
{
RelMapFile *map;
@@ -362,7 +362,7 @@ RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
map = &pending_local_updates;
}
}
- apply_map_update(map, relationId, fileNode, true);
+ apply_map_update(map, relationId, fileNumber, true);
}
/*
@@ -375,7 +375,8 @@ RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
* add_okay = false to draw an error if not.
*/
static void
-apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode, bool add_okay)
+apply_map_update(RelMapFile *map, Oid relationId, RelFileNumber fileNumber,
+ bool add_okay)
{
int32 i;
@@ -384,7 +385,7 @@ apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode, bool add_okay)
{
if (relationId == map->mappings[i].mapoid)
{
- map->mappings[i].mapfilenode = fileNode;
+ map->mappings[i].mapfilenumber = fileNumber;
return;
}
}
@@ -396,7 +397,7 @@ apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode, bool add_okay)
if (map->num_mappings >= MAX_MAPPINGS)
elog(ERROR, "ran out of space in relation map");
map->mappings[map->num_mappings].mapoid = relationId;
- map->mappings[map->num_mappings].mapfilenode = fileNode;
+ map->mappings[map->num_mappings].mapfilenumber = fileNumber;
map->num_mappings++;
}
@@ -415,7 +416,7 @@ merge_map_updates(RelMapFile *map, const RelMapFile *updates, bool add_okay)
{
apply_map_update(map,
updates->mappings[i].mapoid,
- updates->mappings[i].mapfilenode,
+ updates->mappings[i].mapfilenumber,
add_okay);
}
}
@@ -983,12 +984,12 @@ write_relmap_file(RelMapFile *newmap, bool write_wal, bool send_sinval,
for (i = 0; i < newmap->num_mappings; i++)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
- rnode.spcNode = tsid;
- rnode.dbNode = dbid;
- rnode.relNode = newmap->mappings[i].mapfilenode;
- RelationPreserveStorage(rnode, false);
+ rlocator.spcOid = tsid;
+ rlocator.dbOid = dbid;
+ rlocator.relNumber = newmap->mappings[i].mapfilenumber;
+ RelationPreserveStorage(rlocator, false);
}
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 7cc9c72..30b2f85 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4805,16 +4805,16 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
bool is_index)
{
PQExpBuffer upgrade_query = createPQExpBuffer();
- PGresult *upgrade_res;
- Oid relfilenode;
- Oid toast_oid;
- Oid toast_relfilenode;
- char relkind;
- Oid toast_index_oid;
- Oid toast_index_relfilenode;
+ PGresult *upgrade_res;
+ RelFileNumber relfilenumber;
+ Oid toast_oid;
+ RelFileNumber toast_relfilenumber;
+ char relkind;
+ Oid toast_index_oid;
+ RelFileNumber toast_index_relfilenumber;
/*
- * Preserve the OID and relfilenode of the table, table's index, table's
+ * Preserve the OID and relfilenumber of the table, table's index, table's
* toast table and toast table's index if any.
*
* One complexity is that the current table definition might not require
@@ -4837,15 +4837,15 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
relkind = *PQgetvalue(upgrade_res, 0, PQfnumber(upgrade_res, "relkind"));
- relfilenode = atooid(PQgetvalue(upgrade_res, 0,
+ relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "relfilenode")));
toast_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "reltoastrelid")));
- toast_relfilenode = atooid(PQgetvalue(upgrade_res, 0,
+ toast_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "toast_relfilenode")));
toast_index_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "indexrelid")));
- toast_index_relfilenode = atooid(PQgetvalue(upgrade_res, 0,
+ toast_index_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "toast_index_relfilenode")));
appendPQExpBufferStr(upgrade_buffer,
@@ -4859,13 +4859,13 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
/*
* Not every relation has storage. Also, in a pre-v12 database,
- * partitioned tables have a relfilenode, which should not be
+ * partitioned tables have a relfilenumber, which should not be
* preserved when upgrading.
*/
- if (OidIsValid(relfilenode) && relkind != RELKIND_PARTITIONED_TABLE)
+ if (RelFileNumberIsValid(relfilenumber) && relkind != RELKIND_PARTITIONED_TABLE)
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
- relfilenode);
+ relfilenumber);
/*
* In a pre-v12 database, partitioned tables might be marked as having
@@ -4879,7 +4879,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
toast_oid);
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('%u'::pg_catalog.oid);\n",
- toast_relfilenode);
+ toast_relfilenumber);
/* every toast table has an index */
appendPQExpBuffer(upgrade_buffer,
@@ -4887,20 +4887,20 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
toast_index_oid);
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
- toast_index_relfilenode);
+ toast_index_relfilenumber);
}
PQclear(upgrade_res);
}
else
{
- /* Preserve the OID and relfilenode of the index */
+ /* Preserve the OID and relfilenumber of the index */
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
pg_class_oid);
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
- relfilenode);
+ relfilenumber);
}
appendPQExpBufferChar(upgrade_buffer, '\n');
diff --git a/src/bin/pg_rewind/datapagemap.h b/src/bin/pg_rewind/datapagemap.h
index ae4965f..235b676 100644
--- a/src/bin/pg_rewind/datapagemap.h
+++ b/src/bin/pg_rewind/datapagemap.h
@@ -10,7 +10,7 @@
#define DATAPAGEMAP_H
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
struct datapagemap
{
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 6252931..269ed64 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -56,7 +56,7 @@ static uint32 hash_string_pointer(const char *s);
static filehash_hash *filehash;
static bool isRelDataFile(const char *path);
-static char *datasegpath(RelFileNode rnode, ForkNumber forknum,
+static char *datasegpath(RelFileLocator rlocator, ForkNumber forknum,
BlockNumber segno);
static file_entry_t *insert_filehash_entry(const char *path);
@@ -288,7 +288,7 @@ process_target_file(const char *path, file_type_t type, size_t size,
* hash table!
*/
void
-process_target_wal_block_change(ForkNumber forknum, RelFileNode rnode,
+process_target_wal_block_change(ForkNumber forknum, RelFileLocator rlocator,
BlockNumber blkno)
{
char *path;
@@ -299,7 +299,7 @@ process_target_wal_block_change(ForkNumber forknum, RelFileNode rnode,
segno = blkno / RELSEG_SIZE;
blkno_inseg = blkno % RELSEG_SIZE;
- path = datasegpath(rnode, forknum, segno);
+ path = datasegpath(rlocator, forknum, segno);
entry = lookup_filehash_entry(path);
pfree(path);
@@ -508,7 +508,7 @@ print_filemap(filemap_t *filemap)
static bool
isRelDataFile(const char *path)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
unsigned int segNo;
int nmatch;
bool matched;
@@ -532,32 +532,32 @@ isRelDataFile(const char *path)
*
*----
*/
- rnode.spcNode = InvalidOid;
- rnode.dbNode = InvalidOid;
- rnode.relNode = InvalidOid;
+ rlocator.spcOid = InvalidOid;
+ rlocator.dbOid = InvalidOid;
+ rlocator.relNumber = InvalidRelFileNumber;
segNo = 0;
matched = false;
- nmatch = sscanf(path, "global/%u.%u", &rnode.relNode, &segNo);
+ nmatch = sscanf(path, "global/%u.%u", &rlocator.relNumber, &segNo);
if (nmatch == 1 || nmatch == 2)
{
- rnode.spcNode = GLOBALTABLESPACE_OID;
- rnode.dbNode = 0;
+ rlocator.spcOid = GLOBALTABLESPACE_OID;
+ rlocator.dbOid = 0;
matched = true;
}
else
{
nmatch = sscanf(path, "base/%u/%u.%u",
- &rnode.dbNode, &rnode.relNode, &segNo);
+ &rlocator.dbOid, &rlocator.relNumber, &segNo);
if (nmatch == 2 || nmatch == 3)
{
- rnode.spcNode = DEFAULTTABLESPACE_OID;
+ rlocator.spcOid = DEFAULTTABLESPACE_OID;
matched = true;
}
else
{
nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/%u.%u",
- &rnode.spcNode, &rnode.dbNode, &rnode.relNode,
+ &rlocator.spcOid, &rlocator.dbOid, &rlocator.relNumber,
&segNo);
if (nmatch == 3 || nmatch == 4)
matched = true;
@@ -567,12 +567,12 @@ isRelDataFile(const char *path)
/*
* The sscanf tests above can match files that have extra characters at
* the end. To eliminate such cases, cross-check that GetRelationPath
- * creates the exact same filename, when passed the RelFileNode
+ * creates the exact same filename, when passed the RelFileLocator
* information we extracted from the filename.
*/
if (matched)
{
- char *check_path = datasegpath(rnode, MAIN_FORKNUM, segNo);
+ char *check_path = datasegpath(rlocator, MAIN_FORKNUM, segNo);
if (strcmp(check_path, path) != 0)
matched = false;
@@ -589,12 +589,12 @@ isRelDataFile(const char *path)
* The returned path is palloc'd
*/
static char *
-datasegpath(RelFileNode rnode, ForkNumber forknum, BlockNumber segno)
+datasegpath(RelFileLocator rlocator, ForkNumber forknum, BlockNumber segno)
{
char *path;
char *segpath;
- path = relpathperm(rnode, forknum);
+ path = relpathperm(rlocator, forknum);
if (segno > 0)
{
segpath = psprintf("%s.%u", path, segno);
diff --git a/src/bin/pg_rewind/filemap.h b/src/bin/pg_rewind/filemap.h
index 096f57a..0e011fb 100644
--- a/src/bin/pg_rewind/filemap.h
+++ b/src/bin/pg_rewind/filemap.h
@@ -10,7 +10,7 @@
#include "datapagemap.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/* these enum values are sorted in the order we want actions to be processed */
typedef enum
@@ -103,7 +103,7 @@ extern void process_source_file(const char *path, file_type_t type,
extern void process_target_file(const char *path, file_type_t type,
size_t size, const char *link_target);
extern void process_target_wal_block_change(ForkNumber forknum,
- RelFileNode rnode,
+ RelFileLocator rlocator,
BlockNumber blkno);
extern filemap_t *decide_file_actions(void);
diff --git a/src/bin/pg_rewind/parsexlog.c b/src/bin/pg_rewind/parsexlog.c
index c6792da..d97240e 100644
--- a/src/bin/pg_rewind/parsexlog.c
+++ b/src/bin/pg_rewind/parsexlog.c
@@ -445,18 +445,18 @@ extractPageInfo(XLogReaderState *record)
for (block_id = 0; block_id <= XLogRecMaxBlockId(record); block_id++)
{
- RelFileNode rnode;
- ForkNumber forknum;
- BlockNumber blkno;
+ RelFileLocator rlocator;
+ ForkNumber forknum;
+ BlockNumber blkno;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blkno, NULL))
+ &rlocator, &forknum, &blkno, NULL))
continue;
/* We only care about the main fork; others are copied in toto */
if (forknum != MAIN_FORKNUM)
continue;
- process_target_wal_block_change(forknum, rnode, blkno);
+ process_target_wal_block_change(forknum, rlocator, blkno);
}
}
diff --git a/src/bin/pg_rewind/pg_rewind.h b/src/bin/pg_rewind/pg_rewind.h
index 393182f..8b4b50a 100644
--- a/src/bin/pg_rewind/pg_rewind.h
+++ b/src/bin/pg_rewind/pg_rewind.h
@@ -16,7 +16,7 @@
#include "datapagemap.h"
#include "libpq-fe.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/* Configuration options */
extern char *datadir_target;
diff --git a/src/bin/pg_upgrade/Makefile b/src/bin/pg_upgrade/Makefile
index 587793e..7f8042f 100644
--- a/src/bin/pg_upgrade/Makefile
+++ b/src/bin/pg_upgrade/Makefile
@@ -19,7 +19,7 @@ OBJS = \
option.o \
parallel.o \
pg_upgrade.o \
- relfilenode.o \
+ relfilenumber.o \
server.o \
tablespace.o \
util.o \
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 5c3968e..b45a32c 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -190,9 +190,9 @@ create_rel_filename_map(const char *old_data, const char *new_data,
map->new_tablespace_suffix = new_cluster.tablespace_suffix;
}
- /* DB oid and relfilenodes are preserved between old and new cluster */
+ /* DB oid and relfilenumbers are preserved between old and new cluster */
map->db_oid = old_db->db_oid;
- map->relfilenode = old_rel->relfilenode;
+ map->relfilenumber = old_rel->relfilenumber;
/* used only for logging and error reporting, old/new are identical */
map->nspname = old_rel->nspname;
@@ -399,7 +399,7 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
i_reloid,
i_indtable,
i_toastheap,
- i_relfilenode,
+ i_relfilenumber,
i_reltablespace;
char query[QUERY_ALLOC];
char *last_namespace = NULL,
@@ -495,7 +495,7 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
i_toastheap = PQfnumber(res, "toastheap");
i_nspname = PQfnumber(res, "nspname");
i_relname = PQfnumber(res, "relname");
- i_relfilenode = PQfnumber(res, "relfilenode");
+ i_relfilenumber = PQfnumber(res, "relfilenode");
i_reltablespace = PQfnumber(res, "reltablespace");
i_spclocation = PQfnumber(res, "spclocation");
@@ -527,7 +527,7 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
relname = PQgetvalue(res, relnum, i_relname);
curr->relname = pg_strdup(relname);
- curr->relfilenode = atooid(PQgetvalue(res, relnum, i_relfilenode));
+ curr->relfilenumber = atooid(PQgetvalue(res, relnum, i_relfilenumber));
curr->tblsp_alloc = false;
/* Is the tablespace oid non-default? */
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 55de244..30c3ee6 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -132,15 +132,15 @@ extern char *output_files[];
typedef struct
{
/* Can't use NAMEDATALEN; not guaranteed to be same on client */
- char *nspname; /* namespace name */
- char *relname; /* relation name */
- Oid reloid; /* relation OID */
- Oid relfilenode; /* relation file node */
- Oid indtable; /* if index, OID of its table, else 0 */
- Oid toastheap; /* if toast table, OID of base table, else 0 */
- char *tablespace; /* tablespace path; "" for cluster default */
- bool nsp_alloc; /* should nspname be freed? */
- bool tblsp_alloc; /* should tablespace be freed? */
+ char *nspname; /* namespace name */
+ char *relname; /* relation name */
+ Oid reloid; /* relation OID */
+ RelFileNumber relfilenumber; /* relation file number */
+ Oid indtable; /* if index, OID of its table, else 0 */
+ Oid toastheap; /* if toast table, OID of base table, else 0 */
+ char *tablespace; /* tablespace path; "" for cluster default */
+ bool nsp_alloc; /* should nspname be freed? */
+ bool tblsp_alloc; /* should tablespace be freed? */
} RelInfo;
typedef struct
@@ -159,7 +159,7 @@ typedef struct
const char *old_tablespace_suffix;
const char *new_tablespace_suffix;
Oid db_oid;
- Oid relfilenode;
+ RelFileNumber relfilenumber;
/* the rest are used only for logging and error reporting */
char *nspname; /* namespaces */
char *relname;
@@ -400,7 +400,7 @@ void parseCommandLine(int argc, char *argv[]);
void adjust_data_dir(ClusterInfo *cluster);
void get_sock_dir(ClusterInfo *cluster, bool live_check);
-/* relfilenode.c */
+/* relfilenumber.c */
void transfer_all_new_tablespaces(DbInfoArr *old_db_arr,
DbInfoArr *new_db_arr, char *old_pgdata, char *new_pgdata);
diff --git a/src/bin/pg_upgrade/relfilenode.c b/src/bin/pg_upgrade/relfilenode.c
deleted file mode 100644
index d23ac88..0000000
--- a/src/bin/pg_upgrade/relfilenode.c
+++ /dev/null
@@ -1,259 +0,0 @@
-/*
- * relfilenode.c
- *
- * relfilenode functions
- *
- * Copyright (c) 2010-2022, PostgreSQL Global Development Group
- * src/bin/pg_upgrade/relfilenode.c
- */
-
-#include "postgres_fe.h"
-
-#include <sys/stat.h>
-
-#include "access/transam.h"
-#include "catalog/pg_class_d.h"
-#include "pg_upgrade.h"
-
-static void transfer_single_new_db(FileNameMap *maps, int size, char *old_tablespace);
-static void transfer_relfile(FileNameMap *map, const char *suffix, bool vm_must_add_frozenbit);
-
-
-/*
- * transfer_all_new_tablespaces()
- *
- * Responsible for upgrading all database. invokes routines to generate mappings and then
- * physically link the databases.
- */
-void
-transfer_all_new_tablespaces(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
- char *old_pgdata, char *new_pgdata)
-{
- switch (user_opts.transfer_mode)
- {
- case TRANSFER_MODE_CLONE:
- prep_status_progress("Cloning user relation files");
- break;
- case TRANSFER_MODE_COPY:
- prep_status_progress("Copying user relation files");
- break;
- case TRANSFER_MODE_LINK:
- prep_status_progress("Linking user relation files");
- break;
- }
-
- /*
- * Transferring files by tablespace is tricky because a single database
- * can use multiple tablespaces. For non-parallel mode, we just pass a
- * NULL tablespace path, which matches all tablespaces. In parallel mode,
- * we pass the default tablespace and all user-created tablespaces and let
- * those operations happen in parallel.
- */
- if (user_opts.jobs <= 1)
- parallel_transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata,
- new_pgdata, NULL);
- else
- {
- int tblnum;
-
- /* transfer default tablespace */
- parallel_transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata,
- new_pgdata, old_pgdata);
-
- for (tblnum = 0; tblnum < os_info.num_old_tablespaces; tblnum++)
- parallel_transfer_all_new_dbs(old_db_arr,
- new_db_arr,
- old_pgdata,
- new_pgdata,
- os_info.old_tablespaces[tblnum]);
- /* reap all children */
- while (reap_child(true) == true)
- ;
- }
-
- end_progress_output();
- check_ok();
-}
-
-
-/*
- * transfer_all_new_dbs()
- *
- * Responsible for upgrading all database. invokes routines to generate mappings and then
- * physically link the databases.
- */
-void
-transfer_all_new_dbs(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
- char *old_pgdata, char *new_pgdata, char *old_tablespace)
-{
- int old_dbnum,
- new_dbnum;
-
- /* Scan the old cluster databases and transfer their files */
- for (old_dbnum = new_dbnum = 0;
- old_dbnum < old_db_arr->ndbs;
- old_dbnum++, new_dbnum++)
- {
- DbInfo *old_db = &old_db_arr->dbs[old_dbnum],
- *new_db = NULL;
- FileNameMap *mappings;
- int n_maps;
-
- /*
- * Advance past any databases that exist in the new cluster but not in
- * the old, e.g. "postgres". (The user might have removed the
- * 'postgres' database from the old cluster.)
- */
- for (; new_dbnum < new_db_arr->ndbs; new_dbnum++)
- {
- new_db = &new_db_arr->dbs[new_dbnum];
- if (strcmp(old_db->db_name, new_db->db_name) == 0)
- break;
- }
-
- if (new_dbnum >= new_db_arr->ndbs)
- pg_fatal("old database \"%s\" not found in the new cluster\n",
- old_db->db_name);
-
- mappings = gen_db_file_maps(old_db, new_db, &n_maps, old_pgdata,
- new_pgdata);
- if (n_maps)
- {
- transfer_single_new_db(mappings, n_maps, old_tablespace);
- }
- /* We allocate something even for n_maps == 0 */
- pg_free(mappings);
- }
-}
-
-/*
- * transfer_single_new_db()
- *
- * create links for mappings stored in "maps" array.
- */
-static void
-transfer_single_new_db(FileNameMap *maps, int size, char *old_tablespace)
-{
- int mapnum;
- bool vm_must_add_frozenbit = false;
-
- /*
- * Do we need to rewrite visibilitymap?
- */
- if (old_cluster.controldata.cat_ver < VISIBILITY_MAP_FROZEN_BIT_CAT_VER &&
- new_cluster.controldata.cat_ver >= VISIBILITY_MAP_FROZEN_BIT_CAT_VER)
- vm_must_add_frozenbit = true;
-
- for (mapnum = 0; mapnum < size; mapnum++)
- {
- if (old_tablespace == NULL ||
- strcmp(maps[mapnum].old_tablespace, old_tablespace) == 0)
- {
- /* transfer primary file */
- transfer_relfile(&maps[mapnum], "", vm_must_add_frozenbit);
-
- /*
- * Copy/link any fsm and vm files, if they exist
- */
- transfer_relfile(&maps[mapnum], "_fsm", vm_must_add_frozenbit);
- transfer_relfile(&maps[mapnum], "_vm", vm_must_add_frozenbit);
- }
- }
-}
-
-
-/*
- * transfer_relfile()
- *
- * Copy or link file from old cluster to new one. If vm_must_add_frozenbit
- * is true, visibility map forks are converted and rewritten, even in link
- * mode.
- */
-static void
-transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_frozenbit)
-{
- char old_file[MAXPGPATH];
- char new_file[MAXPGPATH];
- int segno;
- char extent_suffix[65];
- struct stat statbuf;
-
- /*
- * Now copy/link any related segments as well. Remember, PG breaks large
- * files into 1GB segments, the first segment has no extension, subsequent
- * segments are named relfilenode.1, relfilenode.2, relfilenode.3.
- */
- for (segno = 0;; segno++)
- {
- if (segno == 0)
- extent_suffix[0] = '\0';
- else
- snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
-
- snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
- map->old_tablespace,
- map->old_tablespace_suffix,
- map->db_oid,
- map->relfilenode,
- type_suffix,
- extent_suffix);
- snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
- map->new_tablespace,
- map->new_tablespace_suffix,
- map->db_oid,
- map->relfilenode,
- type_suffix,
- extent_suffix);
-
- /* Is it an extent, fsm, or vm file? */
- if (type_suffix[0] != '\0' || segno != 0)
- {
- /* Did file open fail? */
- if (stat(old_file, &statbuf) != 0)
- {
- /* File does not exist? That's OK, just return */
- if (errno == ENOENT)
- return;
- else
- pg_fatal("error while checking for file existence \"%s.%s\" (\"%s\" to \"%s\"): %s\n",
- map->nspname, map->relname, old_file, new_file,
- strerror(errno));
- }
-
- /* If file is empty, just return */
- if (statbuf.st_size == 0)
- return;
- }
-
- unlink(new_file);
-
- /* Copying files might take some time, so give feedback. */
- pg_log(PG_STATUS, "%s", old_file);
-
- if (vm_must_add_frozenbit && strcmp(type_suffix, "_vm") == 0)
- {
- /* Need to rewrite visibility map format */
- pg_log(PG_VERBOSE, "rewriting \"%s\" to \"%s\"\n",
- old_file, new_file);
- rewriteVisibilityMap(old_file, new_file, map->nspname, map->relname);
- }
- else
- switch (user_opts.transfer_mode)
- {
- case TRANSFER_MODE_CLONE:
- pg_log(PG_VERBOSE, "cloning \"%s\" to \"%s\"\n",
- old_file, new_file);
- cloneFile(old_file, new_file, map->nspname, map->relname);
- break;
- case TRANSFER_MODE_COPY:
- pg_log(PG_VERBOSE, "copying \"%s\" to \"%s\"\n",
- old_file, new_file);
- copyFile(old_file, new_file, map->nspname, map->relname);
- break;
- case TRANSFER_MODE_LINK:
- pg_log(PG_VERBOSE, "linking \"%s\" to \"%s\"\n",
- old_file, new_file);
- linkFile(old_file, new_file, map->nspname, map->relname);
- }
- }
-}
diff --git a/src/bin/pg_upgrade/relfilenumber.c b/src/bin/pg_upgrade/relfilenumber.c
new file mode 100644
index 0000000..b3ad820
--- /dev/null
+++ b/src/bin/pg_upgrade/relfilenumber.c
@@ -0,0 +1,259 @@
+/*
+ * relfilenumber.c
+ *
+ * relfilenumber functions
+ *
+ * Copyright (c) 2010-2022, PostgreSQL Global Development Group
+ * src/bin/pg_upgrade/relfilenumber.c
+ */
+
+#include "postgres_fe.h"
+
+#include <sys/stat.h>
+
+#include "access/transam.h"
+#include "catalog/pg_class_d.h"
+#include "pg_upgrade.h"
+
+static void transfer_single_new_db(FileNameMap *maps, int size, char *old_tablespace);
+static void transfer_relfile(FileNameMap *map, const char *suffix, bool vm_must_add_frozenbit);
+
+
+/*
+ * transfer_all_new_tablespaces()
+ *
+ * Responsible for upgrading all databases. Invokes routines to generate mappings and then
+ * physically link the databases.
+ */
+void
+transfer_all_new_tablespaces(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
+ char *old_pgdata, char *new_pgdata)
+{
+ switch (user_opts.transfer_mode)
+ {
+ case TRANSFER_MODE_CLONE:
+ prep_status_progress("Cloning user relation files");
+ break;
+ case TRANSFER_MODE_COPY:
+ prep_status_progress("Copying user relation files");
+ break;
+ case TRANSFER_MODE_LINK:
+ prep_status_progress("Linking user relation files");
+ break;
+ }
+
+ /*
+ * Transferring files by tablespace is tricky because a single database
+ * can use multiple tablespaces. For non-parallel mode, we just pass a
+ * NULL tablespace path, which matches all tablespaces. In parallel mode,
+ * we pass the default tablespace and all user-created tablespaces and let
+ * those operations happen in parallel.
+ */
+ if (user_opts.jobs <= 1)
+ parallel_transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata,
+ new_pgdata, NULL);
+ else
+ {
+ int tblnum;
+
+ /* transfer default tablespace */
+ parallel_transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata,
+ new_pgdata, old_pgdata);
+
+ for (tblnum = 0; tblnum < os_info.num_old_tablespaces; tblnum++)
+ parallel_transfer_all_new_dbs(old_db_arr,
+ new_db_arr,
+ old_pgdata,
+ new_pgdata,
+ os_info.old_tablespaces[tblnum]);
+ /* reap all children */
+ while (reap_child(true) == true)
+ ;
+ }
+
+ end_progress_output();
+ check_ok();
+}
+
+
+/*
+ * transfer_all_new_dbs()
+ *
+ * Responsible for upgrading all databases. Invokes routines to generate mappings and then
+ * physically link the databases.
+ */
+void
+transfer_all_new_dbs(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
+ char *old_pgdata, char *new_pgdata, char *old_tablespace)
+{
+ int old_dbnum,
+ new_dbnum;
+
+ /* Scan the old cluster databases and transfer their files */
+ for (old_dbnum = new_dbnum = 0;
+ old_dbnum < old_db_arr->ndbs;
+ old_dbnum++, new_dbnum++)
+ {
+ DbInfo *old_db = &old_db_arr->dbs[old_dbnum],
+ *new_db = NULL;
+ FileNameMap *mappings;
+ int n_maps;
+
+ /*
+ * Advance past any databases that exist in the new cluster but not in
+ * the old, e.g. "postgres". (The user might have removed the
+ * 'postgres' database from the old cluster.)
+ */
+ for (; new_dbnum < new_db_arr->ndbs; new_dbnum++)
+ {
+ new_db = &new_db_arr->dbs[new_dbnum];
+ if (strcmp(old_db->db_name, new_db->db_name) == 0)
+ break;
+ }
+
+ if (new_dbnum >= new_db_arr->ndbs)
+ pg_fatal("old database \"%s\" not found in the new cluster\n",
+ old_db->db_name);
+
+ mappings = gen_db_file_maps(old_db, new_db, &n_maps, old_pgdata,
+ new_pgdata);
+ if (n_maps)
+ {
+ transfer_single_new_db(mappings, n_maps, old_tablespace);
+ }
+ /* We allocate something even for n_maps == 0 */
+ pg_free(mappings);
+ }
+}
+
+/*
+ * transfer_single_new_db()
+ *
+ * create links for mappings stored in "maps" array.
+ */
+static void
+transfer_single_new_db(FileNameMap *maps, int size, char *old_tablespace)
+{
+ int mapnum;
+ bool vm_must_add_frozenbit = false;
+
+ /*
+ * Do we need to rewrite visibilitymap?
+ */
+ if (old_cluster.controldata.cat_ver < VISIBILITY_MAP_FROZEN_BIT_CAT_VER &&
+ new_cluster.controldata.cat_ver >= VISIBILITY_MAP_FROZEN_BIT_CAT_VER)
+ vm_must_add_frozenbit = true;
+
+ for (mapnum = 0; mapnum < size; mapnum++)
+ {
+ if (old_tablespace == NULL ||
+ strcmp(maps[mapnum].old_tablespace, old_tablespace) == 0)
+ {
+ /* transfer primary file */
+ transfer_relfile(&maps[mapnum], "", vm_must_add_frozenbit);
+
+ /*
+ * Copy/link any fsm and vm files, if they exist
+ */
+ transfer_relfile(&maps[mapnum], "_fsm", vm_must_add_frozenbit);
+ transfer_relfile(&maps[mapnum], "_vm", vm_must_add_frozenbit);
+ }
+ }
+}
+
+
+/*
+ * transfer_relfile()
+ *
+ * Copy or link file from old cluster to new one. If vm_must_add_frozenbit
+ * is true, visibility map forks are converted and rewritten, even in link
+ * mode.
+ */
+static void
+transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_frozenbit)
+{
+ char old_file[MAXPGPATH];
+ char new_file[MAXPGPATH];
+ int segno;
+ char extent_suffix[65];
+ struct stat statbuf;
+
+ /*
+ * Now copy/link any related segments as well. Remember, PG breaks large
+ * files into 1GB segments, the first segment has no extension, subsequent
+ * segments are named relfilenumber.1, relfilenumber.2, relfilenumber.3.
+ */
+ for (segno = 0;; segno++)
+ {
+ if (segno == 0)
+ extent_suffix[0] = '\0';
+ else
+ snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
+
+ snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
+ map->old_tablespace,
+ map->old_tablespace_suffix,
+ map->db_oid,
+ map->relfilenumber,
+ type_suffix,
+ extent_suffix);
+ snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
+ map->new_tablespace,
+ map->new_tablespace_suffix,
+ map->db_oid,
+ map->relfilenumber,
+ type_suffix,
+ extent_suffix);
+
+ /* Is it an extent, fsm, or vm file? */
+ if (type_suffix[0] != '\0' || segno != 0)
+ {
+ /* Did file open fail? */
+ if (stat(old_file, &statbuf) != 0)
+ {
+ /* File does not exist? That's OK, just return */
+ if (errno == ENOENT)
+ return;
+ else
+ pg_fatal("error while checking for file existence \"%s.%s\" (\"%s\" to \"%s\"): %s\n",
+ map->nspname, map->relname, old_file, new_file,
+ strerror(errno));
+ }
+
+ /* If file is empty, just return */
+ if (statbuf.st_size == 0)
+ return;
+ }
+
+ unlink(new_file);
+
+ /* Copying files might take some time, so give feedback. */
+ pg_log(PG_STATUS, "%s", old_file);
+
+ if (vm_must_add_frozenbit && strcmp(type_suffix, "_vm") == 0)
+ {
+ /* Need to rewrite visibility map format */
+ pg_log(PG_VERBOSE, "rewriting \"%s\" to \"%s\"\n",
+ old_file, new_file);
+ rewriteVisibilityMap(old_file, new_file, map->nspname, map->relname);
+ }
+ else
+ switch (user_opts.transfer_mode)
+ {
+ case TRANSFER_MODE_CLONE:
+ pg_log(PG_VERBOSE, "cloning \"%s\" to \"%s\"\n",
+ old_file, new_file);
+ cloneFile(old_file, new_file, map->nspname, map->relname);
+ break;
+ case TRANSFER_MODE_COPY:
+ pg_log(PG_VERBOSE, "copying \"%s\" to \"%s\"\n",
+ old_file, new_file);
+ copyFile(old_file, new_file, map->nspname, map->relname);
+ break;
+ case TRANSFER_MODE_LINK:
+ pg_log(PG_VERBOSE, "linking \"%s\" to \"%s\"\n",
+ old_file, new_file);
+ linkFile(old_file, new_file, map->nspname, map->relname);
+ }
+ }
+}
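As a side note on the hunk above: transfer_relfile() builds both the old and new paths with the same "%s%s/%u/%u%s%s" pattern, where segment 0 carries no extension and each subsequent 1GB segment gets ".1", ".2", and so on. A minimal standalone sketch of that naming scheme (build_relfile_path is a hypothetical helper, not part of the patch):

```c
#include <stdio.h>

/*
 * Hypothetical helper mirroring transfer_relfile()'s naming scheme:
 * segment 0 of a fork has no extension; later 1GB segments are
 * suffixed ".1", ".2", ... after any fork suffix such as "_vm".
 */
static void
build_relfile_path(char *buf, size_t buflen,
                   const char *tablespace, const char *suffix,
                   unsigned int db_oid, unsigned int relfilenumber,
                   const char *fork_suffix, int segno)
{
    char        extent_suffix[32];

    if (segno == 0)
        extent_suffix[0] = '\0';
    else
        snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);

    snprintf(buf, buflen, "%s%s/%u/%u%s%s",
             tablespace, suffix, db_oid, relfilenumber,
             fork_suffix, extent_suffix);
}
```

So a second VM segment for relfilenumber 16384 in database 5 under "base" comes out as "base/5/16384_vm.2", matching what the loop in transfer_relfile() stats and copies.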
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 5dc6010..0fdde9d 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -37,7 +37,7 @@ static const char *progname;
static int WalSegSz;
static volatile sig_atomic_t time_to_stop = false;
-static const RelFileNode emptyRelFileNode = {0, 0, 0};
+static const RelFileLocator emptyRelFileLocator = {0, 0, 0};
typedef struct XLogDumpPrivate
{
@@ -63,7 +63,7 @@ typedef struct XLogDumpConfig
bool filter_by_rmgr_enabled;
TransactionId filter_by_xid;
bool filter_by_xid_enabled;
- RelFileNode filter_by_relation;
+ RelFileLocator filter_by_relation;
bool filter_by_extended;
bool filter_by_relation_enabled;
BlockNumber filter_by_relation_block;
@@ -393,7 +393,7 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
*/
static bool
XLogRecordMatchesRelationBlock(XLogReaderState *record,
- RelFileNode matchRnode,
+ RelFileLocator matchRlocator,
BlockNumber matchBlock,
ForkNumber matchFork)
{
@@ -401,17 +401,17 @@ XLogRecordMatchesRelationBlock(XLogReaderState *record,
for (block_id = 0; block_id <= XLogRecMaxBlockId(record); block_id++)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blk;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blk, NULL))
+ &rlocator, &forknum, &blk, NULL))
continue;
if ((matchFork == InvalidForkNumber || matchFork == forknum) &&
- (RelFileNodeEquals(matchRnode, emptyRelFileNode) ||
- RelFileNodeEquals(matchRnode, rnode)) &&
+ (RelFileLocatorEquals(matchRlocator, emptyRelFileLocator) ||
+ RelFileLocatorEquals(matchRlocator, rlocator)) &&
(matchBlock == InvalidBlockNumber || matchBlock == blk))
return true;
}
@@ -885,11 +885,11 @@ main(int argc, char **argv)
break;
case 'R':
if (sscanf(optarg, "%u/%u/%u",
- &config.filter_by_relation.spcNode,
- &config.filter_by_relation.dbNode,
- &config.filter_by_relation.relNode) != 3 ||
- !OidIsValid(config.filter_by_relation.spcNode) ||
- !OidIsValid(config.filter_by_relation.relNode))
+ &config.filter_by_relation.spcOid,
+ &config.filter_by_relation.dbOid,
+ &config.filter_by_relation.relNumber) != 3 ||
+ !OidIsValid(config.filter_by_relation.spcOid) ||
+ !OidIsValid(config.filter_by_relation.relNumber))
{
pg_log_error("invalid relation specification: \"%s\"", optarg);
pg_log_error_detail("Expecting \"tablespace OID/database OID/relation filenode\".");
@@ -1132,7 +1132,7 @@ main(int argc, char **argv)
!XLogRecordMatchesRelationBlock(xlogreader_state,
config.filter_by_relation_enabled ?
config.filter_by_relation :
- emptyRelFileNode,
+ emptyRelFileLocator,
config.filter_by_relation_block_enabled ?
config.filter_by_relation_block :
InvalidBlockNumber,
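For context on the pg_waldump hunk above: the -R option still takes "tablespace OID/database OID/relation file number" (e.g. "1663/5/16384"); only the struct field names change. A standalone sketch of the parsing and validation logic, assuming an illustrative DemoRelFileLocator struct rather than the real one from storage/relfilelocator.h:

```c
#include <stdio.h>

/*
 * Illustrative stand-in for RelFileLocator; field names follow the
 * patch (spcOid, dbOid, relNumber), but this struct is a demo only.
 */
typedef struct
{
    unsigned int spcOid;
    unsigned int dbOid;
    unsigned int relNumber;
} DemoRelFileLocator;

/*
 * Parse "spcOid/dbOid/relNumber"; reject the spec unless all three
 * fields are present and the tablespace and relation numbers are
 * nonzero, mirroring the OidIsValid() checks in main().
 */
static int
parse_relation_spec(const char *arg, DemoRelFileLocator *loc)
{
    if (sscanf(arg, "%u/%u/%u",
               &loc->spcOid, &loc->dbOid, &loc->relNumber) != 3 ||
        loc->spcOid == 0 || loc->relNumber == 0)
        return 0;               /* invalid relation specification */
    return 1;
}
```

Note that dbOid may legitimately be 0 (shared relations), which is why main() only validates spcOid and relNumber.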
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 636c96e..1b6b620 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -107,24 +107,24 @@ forkname_chars(const char *str, ForkNumber *fork)
* XXX this must agree with GetRelationPath()!
*/
char *
-GetDatabasePath(Oid dbNode, Oid spcNode)
+GetDatabasePath(Oid dbOid, Oid spcOid)
{
- if (spcNode == GLOBALTABLESPACE_OID)
+ if (spcOid == GLOBALTABLESPACE_OID)
{
/* Shared system relations live in {datadir}/global */
- Assert(dbNode == 0);
+ Assert(dbOid == 0);
return pstrdup("global");
}
- else if (spcNode == DEFAULTTABLESPACE_OID)
+ else if (spcOid == DEFAULTTABLESPACE_OID)
{
/* The default tablespace is {datadir}/base */
- return psprintf("base/%u", dbNode);
+ return psprintf("base/%u", dbOid);
}
else
{
/* All other tablespaces are accessed via symlinks */
return psprintf("pg_tblspc/%u/%s/%u",
- spcNode, TABLESPACE_VERSION_DIRECTORY, dbNode);
+ spcOid, TABLESPACE_VERSION_DIRECTORY, dbOid);
}
}
@@ -138,44 +138,44 @@ GetDatabasePath(Oid dbNode, Oid spcNode)
* the trouble considering BackendId is just int anyway.
*/
char *
-GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
+GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
int backendId, ForkNumber forkNumber)
{
char *path;
- if (spcNode == GLOBALTABLESPACE_OID)
+ if (spcOid == GLOBALTABLESPACE_OID)
{
/* Shared system relations live in {datadir}/global */
- Assert(dbNode == 0);
+ Assert(dbOid == 0);
Assert(backendId == InvalidBackendId);
if (forkNumber != MAIN_FORKNUM)
path = psprintf("global/%u_%s",
- relNode, forkNames[forkNumber]);
+ relNumber, forkNames[forkNumber]);
else
- path = psprintf("global/%u", relNode);
+ path = psprintf("global/%u", relNumber);
}
- else if (spcNode == DEFAULTTABLESPACE_OID)
+ else if (spcOid == DEFAULTTABLESPACE_OID)
{
/* The default tablespace is {datadir}/base */
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/%u_%s",
- dbNode, relNode,
+ dbOid, relNumber,
forkNames[forkNumber]);
else
path = psprintf("base/%u/%u",
- dbNode, relNode);
+ dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/t%d_%u_%s",
- dbNode, backendId, relNode,
+ dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
path = psprintf("base/%u/t%d_%u",
- dbNode, backendId, relNode);
+ dbOid, backendId, relNumber);
}
}
else
@@ -185,25 +185,25 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
{
if (forkNumber != MAIN_FORKNUM)
path = psprintf("pg_tblspc/%u/%s/%u/%u_%s",
- spcNode, TABLESPACE_VERSION_DIRECTORY,
- dbNode, relNode,
+ spcOid, TABLESPACE_VERSION_DIRECTORY,
+ dbOid, relNumber,
forkNames[forkNumber]);
else
path = psprintf("pg_tblspc/%u/%s/%u/%u",
- spcNode, TABLESPACE_VERSION_DIRECTORY,
- dbNode, relNode);
+ spcOid, TABLESPACE_VERSION_DIRECTORY,
+ dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u_%s",
- spcNode, TABLESPACE_VERSION_DIRECTORY,
- dbNode, backendId, relNode,
+ spcOid, TABLESPACE_VERSION_DIRECTORY,
+ dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u",
- spcNode, TABLESPACE_VERSION_DIRECTORY,
- dbNode, backendId, relNode);
+ spcOid, TABLESPACE_VERSION_DIRECTORY,
+ dbOid, backendId, relNumber);
}
}
return path;
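To summarize the three path shapes the renamed GetRelationPath() arguments describe (permanent relations, main fork only), here is a condensed sketch; the tablespace OID constants and the version-directory placeholder are illustrative, not the real macros:

```c
#include <stdio.h>

/* Illustrative values; the real code uses GLOBALTABLESPACE_OID,
 * DEFAULTTABLESPACE_OID and TABLESPACE_VERSION_DIRECTORY. */
#define DEMO_GLOBALTABLESPACE_OID   1664
#define DEMO_DEFAULTTABLESPACE_OID  1663

/*
 * Condensed form of GetRelationPath() for permanent relations:
 * shared rels live under global/, default-tablespace rels under
 * base/<dbOid>/, and everything else under a pg_tblspc symlink.
 */
static void
demo_relation_path(char *buf, size_t len,
                   unsigned int dbOid, unsigned int spcOid,
                   unsigned int relNumber)
{
    if (spcOid == DEMO_GLOBALTABLESPACE_OID)
        snprintf(buf, len, "global/%u", relNumber);
    else if (spcOid == DEMO_DEFAULTTABLESPACE_OID)
        snprintf(buf, len, "base/%u/%u", dbOid, relNumber);
    else
        snprintf(buf, len, "pg_tblspc/%u/<version-dir>/%u/%u",
                 spcOid, dbOid, relNumber);
}
```

The rename leaves these shapes untouched; only the parameter names (dbOid/spcOid/relNumber instead of dbNode/spcNode/relNode) change.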
diff --git a/src/include/access/brin_xlog.h b/src/include/access/brin_xlog.h
index 95bfc7e..012a9af 100644
--- a/src/include/access/brin_xlog.h
+++ b/src/include/access/brin_xlog.h
@@ -18,7 +18,7 @@
#include "lib/stringinfo.h"
#include "storage/bufpage.h"
#include "storage/itemptr.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
diff --git a/src/include/access/ginxlog.h b/src/include/access/ginxlog.h
index 21de389..7f98503 100644
--- a/src/include/access/ginxlog.h
+++ b/src/include/access/ginxlog.h
@@ -110,7 +110,7 @@ typedef struct
typedef struct ginxlogSplit
{
- RelFileNode node;
+ RelFileLocator locator;
BlockNumber rrlink; /* right link, or root's blocknumber if root
* split */
BlockNumber leftChildBlkno; /* valid on a non-leaf split */
@@ -167,7 +167,7 @@ typedef struct ginxlogDeletePage
*/
typedef struct ginxlogUpdateMeta
{
- RelFileNode node;
+ RelFileLocator locator;
GinMetaPageData metadata;
BlockNumber prevTail;
BlockNumber newRightlink;
diff --git a/src/include/access/gistxlog.h b/src/include/access/gistxlog.h
index 4537e67..9bbe4c2 100644
--- a/src/include/access/gistxlog.h
+++ b/src/include/access/gistxlog.h
@@ -97,7 +97,7 @@ typedef struct gistxlogPageDelete
*/
typedef struct gistxlogPageReuse
{
- RelFileNode node;
+ RelFileLocator locator;
BlockNumber block;
FullTransactionId latestRemovedFullXid;
} gistxlogPageReuse;
diff --git a/src/include/access/heapam_xlog.h b/src/include/access/heapam_xlog.h
index 2d8a7f6..1705e73 100644
--- a/src/include/access/heapam_xlog.h
+++ b/src/include/access/heapam_xlog.h
@@ -19,7 +19,7 @@
#include "lib/stringinfo.h"
#include "storage/buf.h"
#include "storage/bufpage.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
@@ -370,9 +370,9 @@ typedef struct xl_heap_new_cid
CommandId combocid; /* just for debugging */
/*
- * Store the relfilenode/ctid pair to facilitate lookups.
+ * Store the relfilelocator/ctid pair to facilitate lookups.
*/
- RelFileNode target_node;
+ RelFileLocator target_locator;
ItemPointerData target_tid;
} xl_heap_new_cid;
@@ -415,7 +415,7 @@ extern bool heap_prepare_freeze_tuple(HeapTupleHeader tuple,
MultiXactId *relminmxid_out);
extern void heap_execute_freeze_tuple(HeapTupleHeader tuple,
xl_heap_freeze_tuple *xlrec_tp);
-extern XLogRecPtr log_heap_visible(RelFileNode rnode, Buffer heap_buffer,
+extern XLogRecPtr log_heap_visible(RelFileLocator rlocator, Buffer heap_buffer,
Buffer vm_buffer, TransactionId cutoff_xid, uint8 flags);
#endif /* HEAPAM_XLOG_H */
diff --git a/src/include/access/nbtxlog.h b/src/include/access/nbtxlog.h
index de362d3..d79489e 100644
--- a/src/include/access/nbtxlog.h
+++ b/src/include/access/nbtxlog.h
@@ -180,12 +180,12 @@ typedef struct xl_btree_dedup
* This is what we need to know about page reuse within btree. This record
* only exists to generate a conflict point for Hot Standby.
*
- * Note that we must include a RelFileNode in the record because we don't
+ * Note that we must include a RelFileLocator in the record because we don't
* actually register the buffer with the record.
*/
typedef struct xl_btree_reuse_page
{
- RelFileNode node;
+ RelFileLocator locator;
BlockNumber block;
FullTransactionId latestRemovedFullXid;
} xl_btree_reuse_page;
diff --git a/src/include/access/rewriteheap.h b/src/include/access/rewriteheap.h
index 3e27790..353cbb2 100644
--- a/src/include/access/rewriteheap.h
+++ b/src/include/access/rewriteheap.h
@@ -15,7 +15,7 @@
#include "access/htup.h"
#include "storage/itemptr.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
/* struct definition is private to rewriteheap.c */
@@ -34,8 +34,8 @@ extern bool rewrite_heap_dead_tuple(RewriteState state, HeapTuple oldTuple);
*/
typedef struct LogicalRewriteMappingData
{
- RelFileNode old_node;
- RelFileNode new_node;
+ RelFileLocator old_locator;
+ RelFileLocator new_locator;
ItemPointerData old_tid;
ItemPointerData new_tid;
} LogicalRewriteMappingData;
diff --git a/src/include/access/tableam.h b/src/include/access/tableam.h
index fe869c6..83a8e7e 100644
--- a/src/include/access/tableam.h
+++ b/src/include/access/tableam.h
@@ -560,32 +560,32 @@ typedef struct TableAmRoutine
*/
/*
- * This callback needs to create a new relation filenode for `rel`, with
+ * This callback needs to create a new relation filelocator for `rel`, with
* appropriate durability behaviour for `persistence`.
*
* Note that only the subset of the relcache filled by
* RelationBuildLocalRelation() can be relied upon and that the relation's
* catalog entries will either not yet exist (new relation), or will still
- * reference the old relfilenode.
+ * reference the old relfilelocator.
*
* As output *freezeXid, *minmulti must be set to the values appropriate
* for pg_class.{relfrozenxid, relminmxid}. For AMs that don't need those
* fields to be filled they can be set to InvalidTransactionId and
* InvalidMultiXactId, respectively.
*
- * See also table_relation_set_new_filenode().
+ * See also table_relation_set_new_filelocator().
*/
- void (*relation_set_new_filenode) (Relation rel,
- const RelFileNode *newrnode,
- char persistence,
- TransactionId *freezeXid,
- MultiXactId *minmulti);
+ void (*relation_set_new_filelocator) (Relation rel,
+ const RelFileLocator *newrlocator,
+ char persistence,
+ TransactionId *freezeXid,
+ MultiXactId *minmulti);
/*
* This callback needs to remove all contents from `rel`'s current
- * relfilenode. No provisions for transactional behaviour need to be made.
- * Often this can be implemented by truncating the underlying storage to
- * its minimal size.
+ * relfilelocator. No provisions for transactional behaviour need to be
+ * made. Often this can be implemented by truncating the underlying
+ * storage to its minimal size.
*
* See also table_relation_nontransactional_truncate().
*/
@@ -598,7 +598,7 @@ typedef struct TableAmRoutine
* storage, unless it contains references to the tablespace internally.
*/
void (*relation_copy_data) (Relation rel,
- const RelFileNode *newrnode);
+ const RelFileLocator *newrlocator);
/* See table_relation_copy_for_cluster() */
void (*relation_copy_for_cluster) (Relation NewTable,
@@ -1348,7 +1348,7 @@ table_index_delete_tuples(Relation rel, TM_IndexDeleteOp *delstate)
* RelationGetBufferForTuple. See that method for more information.
*
* TABLE_INSERT_FROZEN should only be specified for inserts into
- * relfilenodes created during the current subtransaction and when
+ * relfilenumbers created during the current subtransaction and when
* there are no prior snapshots or pre-existing portals open.
* This causes rows to be frozen, which is an MVCC violation and
* requires explicit options chosen by user.
@@ -1577,33 +1577,34 @@ table_finish_bulk_insert(Relation rel, int options)
*/
/*
- * Create storage for `rel` in `newrnode`, with persistence set to
+ * Create storage for `rel` in `newrlocator`, with persistence set to
* `persistence`.
*
* This is used both during relation creation and various DDL operations to
- * create a new relfilenode that can be filled from scratch. When creating
- * new storage for an existing relfilenode, this should be called before the
+ * create a new relfilelocator that can be filled from scratch. When creating
+ * new storage for an existing relfilelocator, this should be called before the
* relcache entry has been updated.
*
* *freezeXid, *minmulti are set to the xid / multixact horizon for the table
* that pg_class.{relfrozenxid, relminmxid} have to be set to.
*/
static inline void
-table_relation_set_new_filenode(Relation rel,
- const RelFileNode *newrnode,
- char persistence,
- TransactionId *freezeXid,
- MultiXactId *minmulti)
+table_relation_set_new_filelocator(Relation rel,
+ const RelFileLocator *newrlocator,
+ char persistence,
+ TransactionId *freezeXid,
+ MultiXactId *minmulti)
{
- rel->rd_tableam->relation_set_new_filenode(rel, newrnode, persistence,
- freezeXid, minmulti);
+ rel->rd_tableam->relation_set_new_filelocator(rel, newrlocator,
+ persistence, freezeXid,
+ minmulti);
}
/*
* Remove all table contents from `rel`, in a non-transactional manner.
* Non-transactional meaning that there's no need to support rollbacks. This
- * commonly only is used to perform truncations for relfilenodes created in the
- * current transaction.
+ * commonly only is used to perform truncations for relfilelocators created in
+ * the current transaction.
*/
static inline void
table_relation_nontransactional_truncate(Relation rel)
@@ -1612,15 +1613,15 @@ table_relation_nontransactional_truncate(Relation rel)
}
/*
- * Copy data from `rel` into the new relfilenode `newrnode`. The new
- * relfilenode may not have storage associated before this function is
+ * Copy data from `rel` into the new relfilelocator `newrlocator`. The new
+ * relfilelocator may not have storage associated before this function is
* called. This is only supposed to be used for low level operations like
* changing a relation's tablespace.
*/
static inline void
-table_relation_copy_data(Relation rel, const RelFileNode *newrnode)
+table_relation_copy_data(Relation rel, const RelFileLocator *newrlocator)
{
- rel->rd_tableam->relation_copy_data(rel, newrnode);
+ rel->rd_tableam->relation_copy_data(rel, newrlocator);
}
/*
diff --git a/src/include/access/xact.h b/src/include/access/xact.h
index 4794941..7d2b352 100644
--- a/src/include/access/xact.h
+++ b/src/include/access/xact.h
@@ -19,7 +19,7 @@
#include "datatype/timestamp.h"
#include "lib/stringinfo.h"
#include "nodes/pg_list.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/sinval.h"
/*
@@ -174,7 +174,7 @@ typedef struct SavedTransactionCharacteristics
*/
#define XACT_XINFO_HAS_DBINFO (1U << 0)
#define XACT_XINFO_HAS_SUBXACTS (1U << 1)
-#define XACT_XINFO_HAS_RELFILENODES (1U << 2)
+#define XACT_XINFO_HAS_RELFILELOCATORS (1U << 2)
#define XACT_XINFO_HAS_INVALS (1U << 3)
#define XACT_XINFO_HAS_TWOPHASE (1U << 4)
#define XACT_XINFO_HAS_ORIGIN (1U << 5)
@@ -252,12 +252,12 @@ typedef struct xl_xact_subxacts
} xl_xact_subxacts;
#define MinSizeOfXactSubxacts offsetof(xl_xact_subxacts, subxacts)
-typedef struct xl_xact_relfilenodes
+typedef struct xl_xact_relfilelocators
{
int nrels; /* number of relations */
- RelFileNode xnodes[FLEXIBLE_ARRAY_MEMBER];
-} xl_xact_relfilenodes;
-#define MinSizeOfXactRelfilenodes offsetof(xl_xact_relfilenodes, xnodes)
+ RelFileLocator xlocators[FLEXIBLE_ARRAY_MEMBER];
+} xl_xact_relfilelocators;
+#define MinSizeOfXactRelfileLocators offsetof(xl_xact_relfilelocators, xlocators)
/*
* A transactionally dropped statistics entry.
@@ -305,7 +305,7 @@ typedef struct xl_xact_commit
/* xl_xact_xinfo follows if XLOG_XACT_HAS_INFO */
/* xl_xact_dbinfo follows if XINFO_HAS_DBINFO */
/* xl_xact_subxacts follows if XINFO_HAS_SUBXACT */
- /* xl_xact_relfilenodes follows if XINFO_HAS_RELFILENODES */
+ /* xl_xact_relfilelocators follows if XINFO_HAS_RELFILELOCATORS */
/* xl_xact_stats_items follows if XINFO_HAS_DROPPED_STATS */
/* xl_xact_invals follows if XINFO_HAS_INVALS */
/* xl_xact_twophase follows if XINFO_HAS_TWOPHASE */
@@ -321,7 +321,7 @@ typedef struct xl_xact_abort
/* xl_xact_xinfo follows if XLOG_XACT_HAS_INFO */
/* xl_xact_dbinfo follows if XINFO_HAS_DBINFO */
/* xl_xact_subxacts follows if XINFO_HAS_SUBXACT */
- /* xl_xact_relfilenodes follows if XINFO_HAS_RELFILENODES */
+ /* xl_xact_relfilelocators follows if XINFO_HAS_RELFILELOCATORS */
/* xl_xact_stats_items follows if XINFO_HAS_DROPPED_STATS */
/* No invalidation messages needed. */
/* xl_xact_twophase follows if XINFO_HAS_TWOPHASE */
@@ -367,7 +367,7 @@ typedef struct xl_xact_parsed_commit
TransactionId *subxacts;
int nrels;
- RelFileNode *xnodes;
+ RelFileLocator *xlocators;
int nstats;
xl_xact_stats_item *stats;
@@ -378,7 +378,7 @@ typedef struct xl_xact_parsed_commit
TransactionId twophase_xid; /* only for 2PC */
char twophase_gid[GIDSIZE]; /* only for 2PC */
int nabortrels; /* only for 2PC */
- RelFileNode *abortnodes; /* only for 2PC */
+ RelFileLocator *abortlocators; /* only for 2PC */
int nabortstats; /* only for 2PC */
xl_xact_stats_item *abortstats; /* only for 2PC */
@@ -400,7 +400,7 @@ typedef struct xl_xact_parsed_abort
TransactionId *subxacts;
int nrels;
- RelFileNode *xnodes;
+ RelFileLocator *xlocators;
int nstats;
xl_xact_stats_item *stats;
@@ -483,7 +483,7 @@ extern int xactGetCommittedChildren(TransactionId **ptr);
extern XLogRecPtr XactLogCommitRecord(TimestampTz commit_time,
int nsubxacts, TransactionId *subxacts,
- int nrels, RelFileNode *rels,
+ int nrels, RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
int nmsgs, SharedInvalidationMessage *msgs,
@@ -494,7 +494,7 @@ extern XLogRecPtr XactLogCommitRecord(TimestampTz commit_time,
extern XLogRecPtr XactLogAbortRecord(TimestampTz abort_time,
int nsubxacts, TransactionId *subxacts,
- int nrels, RelFileNode *rels,
+ int nrels, RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
int xactflags, TransactionId twophase_xid,
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index fae0bef..3524c39 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,7 +25,7 @@
#include "lib/stringinfo.h"
#include "pgtime.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
diff --git a/src/include/access/xloginsert.h b/src/include/access/xloginsert.h
index 5fc340c..c04f77b 100644
--- a/src/include/access/xloginsert.h
+++ b/src/include/access/xloginsert.h
@@ -15,7 +15,7 @@
#include "access/xlogdefs.h"
#include "storage/block.h"
#include "storage/buf.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
/*
@@ -45,16 +45,16 @@ extern XLogRecPtr XLogInsert(RmgrId rmid, uint8 info);
extern void XLogEnsureRecordSpace(int max_block_id, int ndatas);
extern void XLogRegisterData(char *data, int len);
extern void XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags);
-extern void XLogRegisterBlock(uint8 block_id, RelFileNode *rnode,
+extern void XLogRegisterBlock(uint8 block_id, RelFileLocator *rlocator,
ForkNumber forknum, BlockNumber blknum, char *page,
uint8 flags);
extern void XLogRegisterBufData(uint8 block_id, char *data, int len);
extern void XLogResetInsertion(void);
extern bool XLogCheckBufferNeedsBackup(Buffer buffer);
-extern XLogRecPtr log_newpage(RelFileNode *rnode, ForkNumber forkNum,
+extern XLogRecPtr log_newpage(RelFileLocator *rlocator, ForkNumber forkNum,
BlockNumber blk, char *page, bool page_std);
-extern void log_newpages(RelFileNode *rnode, ForkNumber forkNum, int num_pages,
+extern void log_newpages(RelFileLocator *rlocator, ForkNumber forkNum, int num_pages,
BlockNumber *blknos, char **pages, bool page_std);
extern XLogRecPtr log_newpage_buffer(Buffer buffer, bool page_std);
extern void log_newpage_range(Relation rel, ForkNumber forkNum,
diff --git a/src/include/access/xlogreader.h b/src/include/access/xlogreader.h
index e73ea4a..5395f15 100644
--- a/src/include/access/xlogreader.h
+++ b/src/include/access/xlogreader.h
@@ -122,7 +122,7 @@ typedef struct
bool in_use;
/* Identify the block this refers to */
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blkno;
@@ -430,10 +430,10 @@ extern FullTransactionId XLogRecGetFullXid(XLogReaderState *record);
extern bool RestoreBlockImage(XLogReaderState *record, uint8 block_id, char *page);
extern char *XLogRecGetBlockData(XLogReaderState *record, uint8 block_id, Size *len);
extern void XLogRecGetBlockTag(XLogReaderState *record, uint8 block_id,
- RelFileNode *rnode, ForkNumber *forknum,
+ RelFileLocator *rlocator, ForkNumber *forknum,
BlockNumber *blknum);
extern bool XLogRecGetBlockTagExtended(XLogReaderState *record, uint8 block_id,
- RelFileNode *rnode, ForkNumber *forknum,
+ RelFileLocator *rlocator, ForkNumber *forknum,
BlockNumber *blknum,
Buffer *prefetch_buffer);
diff --git a/src/include/access/xlogrecord.h b/src/include/access/xlogrecord.h
index 052ac68..7e467ef 100644
--- a/src/include/access/xlogrecord.h
+++ b/src/include/access/xlogrecord.h
@@ -15,7 +15,7 @@
#include "access/xlogdefs.h"
#include "port/pg_crc32c.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* The overall layout of an XLOG record is:
@@ -97,7 +97,7 @@ typedef struct XLogRecordBlockHeader
* image) */
/* If BKPBLOCK_HAS_IMAGE, an XLogRecordBlockImageHeader struct follows */
- /* If BKPBLOCK_SAME_REL is not set, a RelFileNode follows */
+ /* If BKPBLOCK_SAME_REL is not set, a RelFileLocator follows */
/* BlockNumber follows */
} XLogRecordBlockHeader;
@@ -175,7 +175,7 @@ typedef struct XLogRecordBlockCompressHeader
(SizeOfXLogRecordBlockHeader + \
SizeOfXLogRecordBlockImageHeader + \
SizeOfXLogRecordBlockCompressHeader + \
- sizeof(RelFileNode) + \
+ sizeof(RelFileLocator) + \
sizeof(BlockNumber))
/*
@@ -187,7 +187,7 @@ typedef struct XLogRecordBlockCompressHeader
#define BKPBLOCK_HAS_IMAGE 0x10 /* block data is an XLogRecordBlockImage */
#define BKPBLOCK_HAS_DATA 0x20
#define BKPBLOCK_WILL_INIT 0x40 /* redo will re-init the page */
-#define BKPBLOCK_SAME_REL 0x80 /* RelFileNode omitted, same as previous */
+#define BKPBLOCK_SAME_REL 0x80 /* RelFileLocator omitted, same as previous */
/*
* XLogRecordDataHeaderShort/Long are used for the "main data" portion of
diff --git a/src/include/access/xlogutils.h b/src/include/access/xlogutils.h
index c9d0b75..ef18297 100644
--- a/src/include/access/xlogutils.h
+++ b/src/include/access/xlogutils.h
@@ -60,9 +60,9 @@ extern PGDLLIMPORT HotStandbyState standbyState;
extern bool XLogHaveInvalidPages(void);
extern void XLogCheckInvalidPages(void);
-extern void XLogDropRelation(RelFileNode rnode, ForkNumber forknum);
+extern void XLogDropRelation(RelFileLocator rlocator, ForkNumber forknum);
extern void XLogDropDatabase(Oid dbid);
-extern void XLogTruncateRelation(RelFileNode rnode, ForkNumber forkNum,
+extern void XLogTruncateRelation(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber nblocks);
/* Result codes for XLogReadBufferForRedo[Extended] */
@@ -89,11 +89,11 @@ extern XLogRedoAction XLogReadBufferForRedoExtended(XLogReaderState *record,
ReadBufferMode mode, bool get_cleanup_lock,
Buffer *buf);
-extern Buffer XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
+extern Buffer XLogReadBufferExtended(RelFileLocator rlocator, ForkNumber forknum,
BlockNumber blkno, ReadBufferMode mode,
Buffer recent_buffer);
-extern Relation CreateFakeRelcacheEntry(RelFileNode rnode);
+extern Relation CreateFakeRelcacheEntry(RelFileLocator rlocator);
extern void FreeFakeRelcacheEntry(Relation fakerel);
extern int read_local_xlog_page(XLogReaderState *state,
diff --git a/src/include/catalog/binary_upgrade.h b/src/include/catalog/binary_upgrade.h
index 0b6944b..fd93442 100644
--- a/src/include/catalog/binary_upgrade.h
+++ b/src/include/catalog/binary_upgrade.h
@@ -22,11 +22,11 @@ extern PGDLLIMPORT Oid binary_upgrade_next_mrng_pg_type_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_mrng_array_pg_type_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_heap_pg_class_oid;
-extern PGDLLIMPORT Oid binary_upgrade_next_heap_pg_class_relfilenode;
+extern PGDLLIMPORT RelFileNumber binary_upgrade_next_heap_pg_class_relfilenumber;
extern PGDLLIMPORT Oid binary_upgrade_next_index_pg_class_oid;
-extern PGDLLIMPORT Oid binary_upgrade_next_index_pg_class_relfilenode;
+extern PGDLLIMPORT RelFileNumber binary_upgrade_next_index_pg_class_relfilenumber;
extern PGDLLIMPORT Oid binary_upgrade_next_toast_pg_class_oid;
-extern PGDLLIMPORT Oid binary_upgrade_next_toast_pg_class_relfilenode;
+extern PGDLLIMPORT RelFileNumber binary_upgrade_next_toast_pg_class_relfilenumber;
extern PGDLLIMPORT Oid binary_upgrade_next_pg_enum_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_pg_authid_oid;
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index 60c1215..66900f1 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -38,7 +38,8 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
-extern Oid GetNewRelFileNode(Oid reltablespace, Relation pg_class,
- char relpersistence);
+extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
+ Relation pg_class,
+ char relpersistence);
#endif /* CATALOG_H */
diff --git a/src/include/catalog/heap.h b/src/include/catalog/heap.h
index 07c5b88..5774c46 100644
--- a/src/include/catalog/heap.h
+++ b/src/include/catalog/heap.h
@@ -50,7 +50,7 @@ extern Relation heap_create(const char *relname,
Oid relnamespace,
Oid reltablespace,
Oid relid,
- Oid relfilenode,
+ RelFileNumber relfilenumber,
Oid accessmtd,
TupleDesc tupDesc,
char relkind,
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index a1d6e3b..1bdb00a 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -71,7 +71,7 @@ extern Oid index_create(Relation heapRelation,
Oid indexRelationId,
Oid parentIndexRelid,
Oid parentConstraintId,
- Oid relFileNode,
+ RelFileNumber relFileNumber,
IndexInfo *indexInfo,
List *indexColNames,
Oid accessMethodObjectId,
diff --git a/src/include/catalog/storage.h b/src/include/catalog/storage.h
index 59f3404..9964c31 100644
--- a/src/include/catalog/storage.h
+++ b/src/include/catalog/storage.h
@@ -15,23 +15,23 @@
#define STORAGE_H
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "utils/relcache.h"
/* GUC variables */
extern PGDLLIMPORT int wal_skip_threshold;
-extern SMgrRelation RelationCreateStorage(RelFileNode rnode,
+extern SMgrRelation RelationCreateStorage(RelFileLocator rlocator,
char relpersistence,
bool register_delete);
extern void RelationDropStorage(Relation rel);
-extern void RelationPreserveStorage(RelFileNode rnode, bool atCommit);
+extern void RelationPreserveStorage(RelFileLocator rlocator, bool atCommit);
extern void RelationPreTruncate(Relation rel);
extern void RelationTruncate(Relation rel, BlockNumber nblocks);
extern void RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
ForkNumber forkNum, char relpersistence);
-extern bool RelFileNodeSkippingWAL(RelFileNode rnode);
+extern bool RelFileLocatorSkippingWAL(RelFileLocator rlocator);
extern Size EstimatePendingSyncsSpace(void);
extern void SerializePendingSyncs(Size maxSize, char *startAddress);
extern void RestorePendingSyncs(char *startAddress);
@@ -42,7 +42,7 @@ extern void RestorePendingSyncs(char *startAddress);
*/
extern void smgrDoPendingDeletes(bool isCommit);
extern void smgrDoPendingSyncs(bool isCommit, bool isParallelWorker);
-extern int smgrGetPendingDeletes(bool forCommit, RelFileNode **ptr);
+extern int smgrGetPendingDeletes(bool forCommit, RelFileLocator **ptr);
extern void AtSubCommit_smgr(void);
extern void AtSubAbort_smgr(void);
extern void PostPrepare_smgr(void);
diff --git a/src/include/catalog/storage_xlog.h b/src/include/catalog/storage_xlog.h
index 622de22..44a5e20 100644
--- a/src/include/catalog/storage_xlog.h
+++ b/src/include/catalog/storage_xlog.h
@@ -17,7 +17,7 @@
#include "access/xlogreader.h"
#include "lib/stringinfo.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* Declarations for smgr-related XLOG records
@@ -32,7 +32,7 @@
typedef struct xl_smgr_create
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forkNum;
} xl_smgr_create;
@@ -46,11 +46,11 @@ typedef struct xl_smgr_create
typedef struct xl_smgr_truncate
{
BlockNumber blkno;
- RelFileNode rnode;
+ RelFileLocator rlocator;
int flags;
} xl_smgr_truncate;
-extern void log_smgrcreate(const RelFileNode *rnode, ForkNumber forkNum);
+extern void log_smgrcreate(const RelFileLocator *rlocator, ForkNumber forkNum);
extern void smgr_redo(XLogReaderState *record);
extern void smgr_desc(StringInfo buf, XLogReaderState *record);
diff --git a/src/include/commands/sequence.h b/src/include/commands/sequence.h
index 9da2300..d38c0e2 100644
--- a/src/include/commands/sequence.h
+++ b/src/include/commands/sequence.h
@@ -19,7 +19,7 @@
#include "lib/stringinfo.h"
#include "nodes/parsenodes.h"
#include "parser/parse_node.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
typedef struct FormData_pg_sequence_data
@@ -47,7 +47,7 @@ typedef FormData_pg_sequence_data *Form_pg_sequence_data;
typedef struct xl_seq_rec
{
- RelFileNode node;
+ RelFileLocator locator;
/* SEQUENCE TUPLE DATA FOLLOWS AT THE END */
} xl_seq_rec;
diff --git a/src/include/commands/tablecmds.h b/src/include/commands/tablecmds.h
index 5d4037f..4cc2b17 100644
--- a/src/include/commands/tablecmds.h
+++ b/src/include/commands/tablecmds.h
@@ -66,7 +66,7 @@ extern void SetRelationHasSubclass(Oid relationId, bool relhassubclass);
extern bool CheckRelationTableSpaceMove(Relation rel, Oid newTableSpaceId);
extern void SetRelationTableSpace(Relation rel, Oid newTableSpaceId,
- Oid newRelFileNode);
+ Oid newRelFileNumber);
extern ObjectAddress renameatt(RenameStmt *stmt);
diff --git a/src/include/commands/tablespace.h b/src/include/commands/tablespace.h
index 24b6473..1f80907 100644
--- a/src/include/commands/tablespace.h
+++ b/src/include/commands/tablespace.h
@@ -50,7 +50,7 @@ extern void DropTableSpace(DropTableSpaceStmt *stmt);
extern ObjectAddress RenameTableSpace(const char *oldname, const char *newname);
extern Oid AlterTableSpaceOptions(AlterTableSpaceOptionsStmt *stmt);
-extern void TablespaceCreateDbspace(Oid spcNode, Oid dbNode, bool isRedo);
+extern void TablespaceCreateDbspace(Oid spcOid, Oid dbOid, bool isRedo);
extern Oid GetDefaultTablespace(char relpersistence, bool partitioned);
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index 13849a3..3ab7132 100644
--- a/src/include/common/relpath.h
+++ b/src/include/common/relpath.h
@@ -64,27 +64,27 @@ extern int forkname_chars(const char *str, ForkNumber *fork);
/*
* Stuff for computing filesystem pathnames for relations.
*/
-extern char *GetDatabasePath(Oid dbNode, Oid spcNode);
+extern char *GetDatabasePath(Oid dbOid, Oid spcOid);
-extern char *GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
+extern char *GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
int backendId, ForkNumber forkNumber);
/*
* Wrapper macros for GetRelationPath. Beware of multiple
- * evaluation of the RelFileNode or RelFileNodeBackend argument!
+ * evaluation of the RelFileLocator or RelFileLocatorBackend argument!
*/
-/* First argument is a RelFileNode */
-#define relpathbackend(rnode, backend, forknum) \
- GetRelationPath((rnode).dbNode, (rnode).spcNode, (rnode).relNode, \
+/* First argument is a RelFileLocator */
+#define relpathbackend(rlocator, backend, forknum) \
+ GetRelationPath((rlocator).dbOid, (rlocator).spcOid, (rlocator).relNumber, \
backend, forknum)
-/* First argument is a RelFileNode */
-#define relpathperm(rnode, forknum) \
- relpathbackend(rnode, InvalidBackendId, forknum)
+/* First argument is a RelFileLocator */
+#define relpathperm(rlocator, forknum) \
+ relpathbackend(rlocator, InvalidBackendId, forknum)
-/* First argument is a RelFileNodeBackend */
-#define relpath(rnode, forknum) \
- relpathbackend((rnode).node, (rnode).backend, forknum)
+/* First argument is a RelFileLocatorBackend */
+#define relpath(rlocator, forknum) \
+ relpathbackend((rlocator).locator, (rlocator).backend, forknum)
#endif /* RELPATH_H */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 73f635b..562f21c 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3247,10 +3247,10 @@ typedef struct IndexStmt
List *excludeOpNames; /* exclusion operator names, or NIL if none */
char *idxcomment; /* comment to apply to index, or NULL */
Oid indexOid; /* OID of an existing index, if any */
- Oid oldNode; /* relfilenode of existing storage, if any */
- SubTransactionId oldCreateSubid; /* rd_createSubid of oldNode */
- SubTransactionId oldFirstRelfilenodeSubid; /* rd_firstRelfilenodeSubid of
- * oldNode */
+ RelFileNumber oldNumber; /* relfilenumber of existing storage, if any */
+ SubTransactionId oldCreateSubid; /* rd_createSubid of oldNumber */
+ SubTransactionId oldFirstRelfilenumberSubid; /* rd_firstRelfilelocatorSubid
+ * of oldNumber */
bool unique; /* is index unique? */
bool nulls_not_distinct; /* null treatment for UNIQUE constraints */
bool primary; /* is index a primary key? */
diff --git a/src/include/postgres_ext.h b/src/include/postgres_ext.h
index fdb61b7..d8af68b 100644
--- a/src/include/postgres_ext.h
+++ b/src/include/postgres_ext.h
@@ -46,6 +46,13 @@ typedef unsigned int Oid;
/* Define a signed 64-bit integer type for use in client API declarations. */
typedef PG_INT64_TYPE pg_int64;
+/*
+ * RelFileNumber data type identifies the specific relation file name.
+ */
+typedef Oid RelFileNumber;
+#define InvalidRelFileNumber ((RelFileNumber) InvalidOid)
+#define RelFileNumberIsValid(relnumber) \
+ ((bool) ((relnumber) != InvalidRelFileNumber))
/*
* Identifiers of error message fields. Kept here to keep common
diff --git a/src/include/postmaster/bgwriter.h b/src/include/postmaster/bgwriter.h
index 2511ef4..b67fb1e 100644
--- a/src/include/postmaster/bgwriter.h
+++ b/src/include/postmaster/bgwriter.h
@@ -16,7 +16,7 @@
#define _BGWRITER_H
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "storage/sync.h"
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 4a01f87..d109d0b 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -99,7 +99,7 @@ typedef struct ReorderBufferChange
struct
{
/* relation that has been changed */
- RelFileNode relnode;
+ RelFileLocator rlocator;
/* no previously reassembled toast chunks are necessary anymore */
bool clear_toast_afterwards;
@@ -145,7 +145,7 @@ typedef struct ReorderBufferChange
*/
struct
{
- RelFileNode node;
+ RelFileLocator locator;
ItemPointerData tid;
CommandId cmin;
CommandId cmax;
@@ -657,7 +657,7 @@ extern void ReorderBufferAddSnapshot(ReorderBuffer *, TransactionId, XLogRecPtr
extern void ReorderBufferAddNewCommandId(ReorderBuffer *, TransactionId, XLogRecPtr lsn,
CommandId cid);
extern void ReorderBufferAddNewTupleCids(ReorderBuffer *, TransactionId, XLogRecPtr lsn,
- RelFileNode node, ItemPointerData pt,
+ RelFileLocator locator, ItemPointerData pt,
CommandId cmin, CommandId cmax, CommandId combocid);
extern void ReorderBufferAddInvalidations(ReorderBuffer *, TransactionId, XLogRecPtr lsn,
Size nmsgs, SharedInvalidationMessage *msgs);
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index a17e7b2..d54e1f6 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,30 +90,30 @@
*/
typedef struct buftag
{
- RelFileNode rnode; /* physical relation identifier */
+ RelFileLocator rlocator; /* physical relation identifier */
ForkNumber forkNum;
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
#define CLEAR_BUFFERTAG(a) \
( \
- (a).rnode.spcNode = InvalidOid, \
- (a).rnode.dbNode = InvalidOid, \
- (a).rnode.relNode = InvalidOid, \
+ (a).rlocator.spcOid = InvalidOid, \
+ (a).rlocator.dbOid = InvalidOid, \
+ (a).rlocator.relNumber = InvalidOid, \
(a).forkNum = InvalidForkNumber, \
(a).blockNum = InvalidBlockNumber \
)
-#define INIT_BUFFERTAG(a,xx_rnode,xx_forkNum,xx_blockNum) \
+#define INIT_BUFFERTAG(a,xx_rlocator,xx_forkNum,xx_blockNum) \
( \
- (a).rnode = (xx_rnode), \
+ (a).rlocator = (xx_rlocator), \
(a).forkNum = (xx_forkNum), \
(a).blockNum = (xx_blockNum) \
)
#define BUFFERTAGS_EQUAL(a,b) \
( \
- RelFileNodeEquals((a).rnode, (b).rnode) && \
+ RelFileLocatorEquals((a).rlocator, (b).rlocator) && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
)
@@ -291,11 +291,11 @@ extern PGDLLIMPORT BufferDesc *LocalBufferDescriptors;
*/
typedef struct CkptSortItem
{
- Oid tsId;
- Oid relNode;
- ForkNumber forkNum;
- BlockNumber blockNum;
- int buf_id;
+ Oid tsId;
+ RelFileNumber relNumber;
+ ForkNumber forkNum;
+ BlockNumber blockNum;
+ int buf_id;
} CkptSortItem;
extern PGDLLIMPORT CkptSortItem *CkptBufferIds;
@@ -337,9 +337,9 @@ extern PrefetchBufferResult PrefetchLocalBuffer(SMgrRelation smgr,
extern BufferDesc *LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum,
BlockNumber blockNum, bool *foundPtr);
extern void MarkLocalBufferDirty(Buffer buffer);
-extern void DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
+extern void DropRelFileLocatorLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber firstDelBlock);
-extern void DropRelFileNodeAllLocalBuffers(RelFileNode rnode);
+extern void DropRelFileLocatorAllLocalBuffers(RelFileLocator rlocator);
extern void AtEOXact_LocalBuffers(bool isCommit);
#endif /* BUFMGR_INTERNALS_H */
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 5839140..96e473e 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -17,7 +17,7 @@
#include "storage/block.h"
#include "storage/buf.h"
#include "storage/bufpage.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
#include "utils/snapmgr.h"
@@ -176,13 +176,13 @@ extern PrefetchBufferResult PrefetchSharedBuffer(struct SMgrRelationData *smgr_r
BlockNumber blockNum);
extern PrefetchBufferResult PrefetchBuffer(Relation reln, ForkNumber forkNum,
BlockNumber blockNum);
-extern bool ReadRecentBuffer(RelFileNode rnode, ForkNumber forkNum,
+extern bool ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber blockNum, Buffer recent_buffer);
extern Buffer ReadBuffer(Relation reln, BlockNumber blockNum);
extern Buffer ReadBufferExtended(Relation reln, ForkNumber forkNum,
BlockNumber blockNum, ReadBufferMode mode,
BufferAccessStrategy strategy);
-extern Buffer ReadBufferWithoutRelcache(RelFileNode rnode,
+extern Buffer ReadBufferWithoutRelcache(RelFileLocator rlocator,
ForkNumber forkNum, BlockNumber blockNum,
ReadBufferMode mode, BufferAccessStrategy strategy,
bool permanent);
@@ -204,13 +204,13 @@ extern BlockNumber RelationGetNumberOfBlocksInFork(Relation relation,
extern void FlushOneBuffer(Buffer buffer);
extern void FlushRelationBuffers(Relation rel);
extern void FlushRelationsAllBuffers(struct SMgrRelationData **smgrs, int nrels);
-extern void CreateAndCopyRelationData(RelFileNode src_rnode,
- RelFileNode dst_rnode,
+extern void CreateAndCopyRelationData(RelFileLocator src_rlocator,
+ RelFileLocator dst_rlocator,
bool permanent);
extern void FlushDatabaseBuffers(Oid dbid);
-extern void DropRelFileNodeBuffers(struct SMgrRelationData *smgr_reln, ForkNumber *forkNum,
+extern void DropRelFileLocatorBuffers(struct SMgrRelationData *smgr_reln, ForkNumber *forkNum,
int nforks, BlockNumber *firstDelBlock);
-extern void DropRelFileNodesAllBuffers(struct SMgrRelationData **smgr_reln, int nnodes);
+extern void DropRelFileLocatorsAllBuffers(struct SMgrRelationData **smgr_reln, int nlocators);
extern void DropDatabaseBuffers(Oid dbid);
#define RelationGetNumberOfBlocks(reln) \
@@ -223,7 +223,7 @@ extern XLogRecPtr BufferGetLSNAtomic(Buffer buffer);
extern void PrintPinnedBufs(void);
#endif
extern Size BufferShmemSize(void);
-extern void BufferGetTag(Buffer buffer, RelFileNode *rnode,
+extern void BufferGetTag(Buffer buffer, RelFileLocator *rlocator,
ForkNumber *forknum, BlockNumber *blknum);
extern void MarkBufferDirtyHint(Buffer buffer, bool buffer_std);
diff --git a/src/include/storage/freespace.h b/src/include/storage/freespace.h
index dcc40eb..fcb0802 100644
--- a/src/include/storage/freespace.h
+++ b/src/include/storage/freespace.h
@@ -15,7 +15,7 @@
#define FREESPACE_H_
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
/* prototypes for public functions in freespace.c */
@@ -27,7 +27,7 @@ extern BlockNumber RecordAndGetPageWithFreeSpace(Relation rel,
Size spaceNeeded);
extern void RecordPageWithFreeSpace(Relation rel, BlockNumber heapBlk,
Size spaceAvail);
-extern void XLogRecordPageWithFreeSpace(RelFileNode rnode, BlockNumber heapBlk,
+extern void XLogRecordPageWithFreeSpace(RelFileLocator rlocator, BlockNumber heapBlk,
Size spaceAvail);
extern BlockNumber FreeSpaceMapPrepareTruncateRel(Relation rel,
diff --git a/src/include/storage/md.h b/src/include/storage/md.h
index ffffa40..10aa1b0 100644
--- a/src/include/storage/md.h
+++ b/src/include/storage/md.h
@@ -15,7 +15,7 @@
#define MD_H
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "storage/sync.h"
@@ -25,7 +25,7 @@ extern void mdopen(SMgrRelation reln);
extern void mdclose(SMgrRelation reln, ForkNumber forknum);
extern void mdcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo);
extern bool mdexists(SMgrRelation reln, ForkNumber forknum);
-extern void mdunlink(RelFileNodeBackend rnode, ForkNumber forknum, bool isRedo);
+extern void mdunlink(RelFileLocatorBackend rlocator, ForkNumber forknum, bool isRedo);
extern void mdextend(SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum, char *buffer, bool skipFsync);
extern bool mdprefetch(SMgrRelation reln, ForkNumber forknum,
@@ -42,7 +42,7 @@ extern void mdtruncate(SMgrRelation reln, ForkNumber forknum,
extern void mdimmedsync(SMgrRelation reln, ForkNumber forknum);
extern void ForgetDatabaseSyncRequests(Oid dbid);
-extern void DropRelationFiles(RelFileNode *delrels, int ndelrels, bool isRedo);
+extern void DropRelationFiles(RelFileLocator *delrels, int ndelrels, bool isRedo);
/* md sync callbacks */
extern int mdsyncfiletag(const FileTag *ftag, char *path);
diff --git a/src/include/storage/relfilelocator.h b/src/include/storage/relfilelocator.h
new file mode 100644
index 0000000..7211fe7
--- /dev/null
+++ b/src/include/storage/relfilelocator.h
@@ -0,0 +1,99 @@
+/*-------------------------------------------------------------------------
+ *
+ * relfilelocator.h
+ * Physical access information for relations.
+ *
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/storage/relfilelocator.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef RELFILELOCATOR_H
+#define RELFILELOCATOR_H
+
+#include "common/relpath.h"
+#include "storage/backendid.h"
+
+/*
+ * RelFileLocator must provide all that we need to know to physically access
+ * a relation, with the exception of the backend ID, which can be provided
+ * separately. Note, however, that a "physical" relation is comprised of
+ * multiple files on the filesystem, as each fork is stored as a separate
+ * file, and each fork can be divided into multiple segments. See md.c.
+ *
+ * spcOid identifies the tablespace of the relation. It corresponds to
+ * pg_tablespace.oid.
+ *
+ * dbOid identifies the database of the relation. It is zero for
+ * "shared" relations (those common to all databases of a cluster).
+ * Nonzero dbOid values correspond to pg_database.oid.
+ *
+ * relNumber identifies the specific relation. relNumber corresponds to
+ * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
+ * to assign new physical files to relations in some situations).
+ * Notice that relNumber is only unique within a database in a particular
+ * tablespace.
+ *
+ * Note: spcOid must be GLOBALTABLESPACE_OID if and only if dbOid is
+ * zero. We support shared relations only in the "global" tablespace.
+ *
+ * Note: in pg_class we allow reltablespace == 0 to denote that the
+ * relation is stored in its database's "default" tablespace (as
+ * identified by pg_database.dattablespace). However this shorthand
+ * is NOT allowed in RelFileLocator structs --- the real tablespace ID
+ * must be supplied when setting spcOid.
+ *
+ * Note: in pg_class, relfilenode can be zero to denote that the relation
+ * is a "mapped" relation, whose current true filenode number is available
+ * from relmapper.c. Again, this case is NOT allowed in RelFileLocators.
+ *
+ * Note: various places use RelFileLocator in hashtable keys. Therefore,
+ * there *must not* be any unused padding bytes in this struct. That
+ * should be safe as long as all the fields are of type Oid.
+ */
+typedef struct RelFileLocator
+{
+ Oid spcOid; /* tablespace */
+ Oid dbOid; /* database */
+ RelFileNumber relNumber; /* relation */
+} RelFileLocator;
+
+/*
+ * Augmenting a relfilelocator with the backend ID provides all the information
+ * we need to locate the physical storage. The backend ID is InvalidBackendId
+ * for regular relations (those accessible to more than one backend), or the
+ * owning backend's ID for backend-local relations. Backend-local relations
+ * are always transient and removed in case of a database crash; they are
+ * never WAL-logged or fsync'd.
+ */
+typedef struct RelFileLocatorBackend
+{
+ RelFileLocator locator;
+ BackendId backend;
+} RelFileLocatorBackend;
+
+#define RelFileLocatorBackendIsTemp(rlocator) \
+ ((rlocator).backend != InvalidBackendId)
+
+/*
+ * Note: RelFileLocatorEquals and RelFileLocatorBackendEquals compare relNumber first
+ * since that is most likely to be different in two unequal RelFileLocators. It
+ * is probably redundant to compare spcOid if the other fields are found equal,
+ * but do it anyway to be sure. Likewise for checking the backend ID in
+ * RelFileLocatorBackendEquals.
+ */
+#define RelFileLocatorEquals(locator1, locator2) \
+ ((locator1).relNumber == (locator2).relNumber && \
+ (locator1).dbOid == (locator2).dbOid && \
+ (locator1).spcOid == (locator2).spcOid)
+
+#define RelFileLocatorBackendEquals(locator1, locator2) \
+ ((locator1).locator.relNumber == (locator2).locator.relNumber && \
+ (locator1).locator.dbOid == (locator2).locator.dbOid && \
+ (locator1).backend == (locator2).backend && \
+ (locator1).locator.spcOid == (locator2).locator.spcOid)
+
+#endif /* RELFILELOCATOR_H */
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
deleted file mode 100644
index 4fdc606..0000000
--- a/src/include/storage/relfilenode.h
+++ /dev/null
@@ -1,99 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * relfilenode.h
- * Physical access information for relations.
- *
- *
- * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * src/include/storage/relfilenode.h
- *
- *-------------------------------------------------------------------------
- */
-#ifndef RELFILENODE_H
-#define RELFILENODE_H
-
-#include "common/relpath.h"
-#include "storage/backendid.h"
-
-/*
- * RelFileNode must provide all that we need to know to physically access
- * a relation, with the exception of the backend ID, which can be provided
- * separately. Note, however, that a "physical" relation is comprised of
- * multiple files on the filesystem, as each fork is stored as a separate
- * file, and each fork can be divided into multiple segments. See md.c.
- *
- * spcNode identifies the tablespace of the relation. It corresponds to
- * pg_tablespace.oid.
- *
- * dbNode identifies the database of the relation. It is zero for
- * "shared" relations (those common to all databases of a cluster).
- * Nonzero dbNode values correspond to pg_database.oid.
- *
- * relNode identifies the specific relation. relNode corresponds to
- * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
- * to assign new physical files to relations in some situations).
- * Notice that relNode is only unique within a database in a particular
- * tablespace.
- *
- * Note: spcNode must be GLOBALTABLESPACE_OID if and only if dbNode is
- * zero. We support shared relations only in the "global" tablespace.
- *
- * Note: in pg_class we allow reltablespace == 0 to denote that the
- * relation is stored in its database's "default" tablespace (as
- * identified by pg_database.dattablespace). However this shorthand
- * is NOT allowed in RelFileNode structs --- the real tablespace ID
- * must be supplied when setting spcNode.
- *
- * Note: in pg_class, relfilenode can be zero to denote that the relation
- * is a "mapped" relation, whose current true filenode number is available
- * from relmapper.c. Again, this case is NOT allowed in RelFileNodes.
- *
- * Note: various places use RelFileNode in hashtable keys. Therefore,
- * there *must not* be any unused padding bytes in this struct. That
- * should be safe as long as all the fields are of type Oid.
- */
-typedef struct RelFileNode
-{
- Oid spcNode; /* tablespace */
- Oid dbNode; /* database */
- Oid relNode; /* relation */
-} RelFileNode;
-
-/*
- * Augmenting a relfilenode with the backend ID provides all the information
- * we need to locate the physical storage. The backend ID is InvalidBackendId
- * for regular relations (those accessible to more than one backend), or the
- * owning backend's ID for backend-local relations. Backend-local relations
- * are always transient and removed in case of a database crash; they are
- * never WAL-logged or fsync'd.
- */
-typedef struct RelFileNodeBackend
-{
- RelFileNode node;
- BackendId backend;
-} RelFileNodeBackend;
-
-#define RelFileNodeBackendIsTemp(rnode) \
- ((rnode).backend != InvalidBackendId)
-
-/*
- * Note: RelFileNodeEquals and RelFileNodeBackendEquals compare relNode first
- * since that is most likely to be different in two unequal RelFileNodes. It
- * is probably redundant to compare spcNode if the other fields are found equal,
- * but do it anyway to be sure. Likewise for checking the backend ID in
- * RelFileNodeBackendEquals.
- */
-#define RelFileNodeEquals(node1, node2) \
- ((node1).relNode == (node2).relNode && \
- (node1).dbNode == (node2).dbNode && \
- (node1).spcNode == (node2).spcNode)
-
-#define RelFileNodeBackendEquals(node1, node2) \
- ((node1).node.relNode == (node2).node.relNode && \
- (node1).node.dbNode == (node2).node.dbNode && \
- (node1).backend == (node2).backend && \
- (node1).node.spcNode == (node2).node.spcNode)
-
-#endif /* RELFILENODE_H */
diff --git a/src/include/storage/sinval.h b/src/include/storage/sinval.h
index e7cd456..56c6fc9 100644
--- a/src/include/storage/sinval.h
+++ b/src/include/storage/sinval.h
@@ -16,7 +16,7 @@
#include <signal.h>
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* We support several types of shared-invalidation messages:
@@ -90,7 +90,7 @@ typedef struct
int8 id; /* type field --- must be first */
int8 backend_hi; /* high bits of backend ID, if temprel */
uint16 backend_lo; /* low bits of backend ID, if temprel */
- RelFileNode rnode; /* spcNode, dbNode, relNode */
+ RelFileLocator rlocator; /* spcOid, dbOid, relNumber */
} SharedInvalSmgrMsg;
#define SHAREDINVALRELMAP_ID (-4)
diff --git a/src/include/storage/smgr.h b/src/include/storage/smgr.h
index 6b63c60..a077153 100644
--- a/src/include/storage/smgr.h
+++ b/src/include/storage/smgr.h
@@ -16,7 +16,7 @@
#include "lib/ilist.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* smgr.c maintains a table of SMgrRelation objects, which are essentially
@@ -38,8 +38,8 @@
*/
typedef struct SMgrRelationData
{
- /* rnode is the hashtable lookup key, so it must be first! */
- RelFileNodeBackend smgr_rnode; /* relation physical identifier */
+ /* rlocator is the hashtable lookup key, so it must be first! */
+ RelFileLocatorBackend smgr_rlocator; /* relation physical identifier */
/* pointer to owning pointer, or NULL if none */
struct SMgrRelationData **smgr_owner;
@@ -75,16 +75,16 @@ typedef struct SMgrRelationData
typedef SMgrRelationData *SMgrRelation;
#define SmgrIsTemp(smgr) \
- RelFileNodeBackendIsTemp((smgr)->smgr_rnode)
+ RelFileLocatorBackendIsTemp((smgr)->smgr_rlocator)
extern void smgrinit(void);
-extern SMgrRelation smgropen(RelFileNode rnode, BackendId backend);
+extern SMgrRelation smgropen(RelFileLocator rlocator, BackendId backend);
extern bool smgrexists(SMgrRelation reln, ForkNumber forknum);
extern void smgrsetowner(SMgrRelation *owner, SMgrRelation reln);
extern void smgrclearowner(SMgrRelation *owner, SMgrRelation reln);
extern void smgrclose(SMgrRelation reln);
extern void smgrcloseall(void);
-extern void smgrclosenode(RelFileNodeBackend rnode);
+extern void smgrcloserellocator(RelFileLocatorBackend rlocator);
extern void smgrrelease(SMgrRelation reln);
extern void smgrreleaseall(void);
extern void smgrcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo);
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index 6a77632..dacef92 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -17,7 +17,7 @@
#include "datatype/timestamp.h"
#include "storage/lock.h"
#include "storage/procsignal.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/standbydefs.h"
/* User-settable GUC parameters */
@@ -30,9 +30,9 @@ extern void InitRecoveryTransactionEnvironment(void);
extern void ShutdownRecoveryTransactionEnvironment(void);
extern void ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid,
- RelFileNode node);
+ RelFileLocator locator);
extern void ResolveRecoveryConflictWithSnapshotFullXid(FullTransactionId latestRemovedFullXid,
- RelFileNode node);
+ RelFileLocator locator);
extern void ResolveRecoveryConflictWithTablespace(Oid tsid);
extern void ResolveRecoveryConflictWithDatabase(Oid dbid);
diff --git a/src/include/storage/sync.h b/src/include/storage/sync.h
index 9737e1e..049af87 100644
--- a/src/include/storage/sync.h
+++ b/src/include/storage/sync.h
@@ -13,7 +13,7 @@
#ifndef SYNC_H
#define SYNC_H
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* Type of sync request. These are used to manage the set of pending
@@ -51,7 +51,7 @@ typedef struct FileTag
{
int16 handler; /* SyncRequestHandler value, saving space */
int16 forknum; /* ForkNumber, saving space */
- RelFileNode rnode;
+ RelFileLocator rlocator;
uint32 segno;
} FileTag;
diff --git a/src/include/utils/inval.h b/src/include/utils/inval.h
index 0e0323b..23748b7 100644
--- a/src/include/utils/inval.h
+++ b/src/include/utils/inval.h
@@ -15,7 +15,7 @@
#define INVAL_H
#include "access/htup.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
extern PGDLLIMPORT int debug_discard_caches;
@@ -48,7 +48,7 @@ extern void CacheInvalidateRelcacheByTuple(HeapTuple classTuple);
extern void CacheInvalidateRelcacheByRelid(Oid relid);
-extern void CacheInvalidateSmgr(RelFileNodeBackend rnode);
+extern void CacheInvalidateSmgr(RelFileLocatorBackend rlocator);
extern void CacheInvalidateRelmap(Oid databaseId);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 1896a9a..e5b6662 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -23,7 +23,7 @@
#include "partitioning/partdefs.h"
#include "rewrite/prs2lock.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "utils/relcache.h"
#include "utils/reltrigger.h"
@@ -53,7 +53,7 @@ typedef LockInfoData *LockInfo;
typedef struct RelationData
{
- RelFileNode rd_node; /* relation physical identifier */
+ RelFileLocator rd_locator; /* relation physical identifier */
SMgrRelation rd_smgr; /* cached file handle, or NULL */
int rd_refcnt; /* reference count */
BackendId rd_backend; /* owning backend id, if temporary relation */
@@ -66,44 +66,44 @@ typedef struct RelationData
/*----------
* rd_createSubid is the ID of the highest subtransaction the rel has
- * survived into or zero if the rel or its rd_node was created before the
- * current top transaction. (IndexStmt.oldNode leads to the case of a new
- * rel with an old rd_node.) rd_firstRelfilenodeSubid is the ID of the
- * highest subtransaction an rd_node change has survived into or zero if
- * rd_node matches the value it had at the start of the current top
+ * survived into or zero if the rel or its rd_locator was created before the
+ * current top transaction. (IndexStmt.oldNumber leads to the case of a new
+ * rel with an old rd_locator.) rd_firstRelfilelocatorSubid is the ID of the
+ * highest subtransaction an rd_locator change has survived into or zero if
+ * rd_locator matches the value it had at the start of the current top
* transaction. (Rolling back the subtransaction that
- * rd_firstRelfilenodeSubid denotes would restore rd_node to the value it
+ * rd_firstRelfilelocatorSubid denotes would restore rd_locator to the value it
* had at the start of the current top transaction. Rolling back any
* lower subtransaction would not.) Their accuracy is critical to
* RelationNeedsWAL().
*
- * rd_newRelfilenodeSubid is the ID of the highest subtransaction the
- * most-recent relfilenode change has survived into or zero if not changed
+ * rd_newRelfilelocatorSubid is the ID of the highest subtransaction the
+ * most-recent relfilenumber change has survived into or zero if not changed
* in the current transaction (or we have forgotten changing it). This
* field is accurate when non-zero, but it can be zero when a relation has
- * multiple new relfilenodes within a single transaction, with one of them
+ * multiple new relfilenumbers within a single transaction, with one of them
* occurring in a subsequently aborted subtransaction, e.g.
* BEGIN;
* TRUNCATE t;
* SAVEPOINT save;
* TRUNCATE t;
* ROLLBACK TO save;
- * -- rd_newRelfilenodeSubid is now forgotten
+ * -- rd_newRelfilelocatorSubid is now forgotten
*
* If every rd_*Subid field is zero, they are read-only outside
- * relcache.c. Files that trigger rd_node changes by updating
+ * relcache.c. Files that trigger rd_locator changes by updating
* pg_class.reltablespace and/or pg_class.relfilenode call
- * RelationAssumeNewRelfilenode() to update rd_*Subid.
+ * RelationAssumeNewRelfilelocator() to update rd_*Subid.
*
* rd_droppedSubid is the ID of the highest subtransaction that a drop of
* the rel has survived into. In entries visible outside relcache.c, this
* is always zero.
*/
SubTransactionId rd_createSubid; /* rel was created in current xact */
- SubTransactionId rd_newRelfilenodeSubid; /* highest subxact changing
- * rd_node to current value */
- SubTransactionId rd_firstRelfilenodeSubid; /* highest subxact changing
- * rd_node to any value */
+ SubTransactionId rd_newRelfilelocatorSubid; /* highest subxact changing
+ * rd_locator to current value */
+ SubTransactionId rd_firstRelfilelocatorSubid; /* highest subxact changing
+ * rd_locator to any value */
SubTransactionId rd_droppedSubid; /* dropped with another Subid set */
Form_pg_class rd_rel; /* RELATION tuple */
@@ -531,12 +531,12 @@ typedef struct ViewOptions
/*
* RelationIsMapped
- * True if the relation uses the relfilenode map. Note multiple eval
+ * True if the relation uses the relfilenumber map. Note multiple eval
* of argument!
*/
#define RelationIsMapped(relation) \
(RELKIND_HAS_STORAGE((relation)->rd_rel->relkind) && \
- ((relation)->rd_rel->relfilenode == InvalidOid))
+ ((relation)->rd_rel->relfilenode == InvalidRelFileNumber))
/*
* RelationGetSmgr
@@ -555,7 +555,7 @@ static inline SMgrRelation
RelationGetSmgr(Relation rel)
{
if (unlikely(rel->rd_smgr == NULL))
- smgrsetowner(&(rel->rd_smgr), smgropen(rel->rd_node, rel->rd_backend));
+ smgrsetowner(&(rel->rd_smgr), smgropen(rel->rd_locator, rel->rd_backend));
return rel->rd_smgr;
}
@@ -607,12 +607,12 @@ RelationGetSmgr(Relation rel)
*
* Returns false if wal_level = minimal and this relation is created or
* truncated in the current transaction. See "Skipping WAL for New
- * RelFileNode" in src/backend/access/transam/README.
+ * RelFileLocator" in src/backend/access/transam/README.
*/
#define RelationNeedsWAL(relation) \
(RelationIsPermanent(relation) && (XLogIsNeeded() || \
(relation->rd_createSubid == InvalidSubTransactionId && \
- relation->rd_firstRelfilenodeSubid == InvalidSubTransactionId)))
+ relation->rd_firstRelfilelocatorSubid == InvalidSubTransactionId)))
/*
* RelationUsesLocalBuffers
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index c93d865..ba35d6b 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -103,7 +103,7 @@ extern Relation RelationBuildLocalRelation(const char *relname,
TupleDesc tupDesc,
Oid relid,
Oid accessmtd,
- Oid relfilenode,
+ RelFileNumber relfilenumber,
Oid reltablespace,
bool shared_relation,
bool mapped_relation,
@@ -111,10 +111,10 @@ extern Relation RelationBuildLocalRelation(const char *relname,
char relkind);
/*
- * Routines to manage assignment of new relfilenode to a relation
+ * Routines to manage assignment of new relfilenumber to a relation
*/
-extern void RelationSetNewRelfilenode(Relation relation, char persistence);
-extern void RelationAssumeNewRelfilenode(Relation relation);
+extern void RelationSetNewRelfilenumber(Relation relation, char persistence);
+extern void RelationAssumeNewRelfilelocator(Relation relation);
/*
* Routines for flushing/rebuilding relcache entries in various scenarios
diff --git a/src/include/utils/relfilenodemap.h b/src/include/utils/relfilenodemap.h
deleted file mode 100644
index 77d8046..0000000
--- a/src/include/utils/relfilenodemap.h
+++ /dev/null
@@ -1,18 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * relfilenodemap.h
- * relfilenode to oid mapping cache.
- *
- * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * src/include/utils/relfilenodemap.h
- *
- *-------------------------------------------------------------------------
- */
-#ifndef RELFILENODEMAP_H
-#define RELFILENODEMAP_H
-
-extern Oid RelidByRelfilenode(Oid reltablespace, Oid relfilenode);
-
-#endif /* RELFILENODEMAP_H */
diff --git a/src/include/utils/relfilenumbermap.h b/src/include/utils/relfilenumbermap.h
new file mode 100644
index 0000000..c149a93
--- /dev/null
+++ b/src/include/utils/relfilenumbermap.h
@@ -0,0 +1,19 @@
+/*-------------------------------------------------------------------------
+ *
+ * relfilenumbermap.h
+ * relfilenumber to oid mapping cache.
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/relfilenumbermap.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef RELFILENUMBERMAP_H
+#define RELFILENUMBERMAP_H
+
+extern Oid RelidByRelfilenumber(Oid reltablespace,
+ RelFileNumber relfilenumber);
+
+#endif /* RELFILENUMBERMAP_H */
diff --git a/src/include/utils/relmapper.h b/src/include/utils/relmapper.h
index 557f77e..4c2117c 100644
--- a/src/include/utils/relmapper.h
+++ b/src/include/utils/relmapper.h
@@ -1,7 +1,7 @@
/*-------------------------------------------------------------------------
*
* relmapper.h
- * Catalog-to-filenode mapping
+ * Catalog-to-filenumber mapping
*
*
* Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
@@ -35,14 +35,14 @@ typedef struct xl_relmap_update
#define MinSizeOfRelmapUpdate offsetof(xl_relmap_update, data)
-extern Oid RelationMapOidToFilenode(Oid relationId, bool shared);
+extern Oid RelationMapOidToFilenumber(Oid relationId, bool shared);
-extern Oid RelationMapFilenodeToOid(Oid relationId, bool shared);
-extern Oid RelationMapOidToFilenodeForDatabase(char *dbpath, Oid relationId);
+extern Oid RelationMapFilenumberToOid(Oid relationId, bool shared);
+extern Oid RelationMapOidToFilenumberForDatabase(char *dbpath, Oid relationId);
extern void RelationMapCopy(Oid dbid, Oid tsid, char *srcdbpath,
char *dstdbpath);
-extern void RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
- bool immediate);
+extern void RelationMapUpdateMap(Oid relationId, RelFileNumber fileNumber,
+ bool shared, bool immediate);
extern void RelationMapRemoveMapping(Oid relationId);
diff --git a/src/test/recovery/t/018_wal_optimize.pl b/src/test/recovery/t/018_wal_optimize.pl
index 4700d49..869d9d5 100644
--- a/src/test/recovery/t/018_wal_optimize.pl
+++ b/src/test/recovery/t/018_wal_optimize.pl
@@ -5,7 +5,7 @@
#
# These tests exercise code that once violated the mandate described in
# src/backend/access/transam/README section "Skipping WAL for New
-# RelFileNode". The tests work by committing some transactions, initiating an
+# RelFileLocator". The tests work by committing some transactions, initiating an
# immediate shutdown, and confirming that the expected data survives recovery.
# For many years, individual commands made the decision to skip WAL, hence the
# frequent appearance of COPY in these tests.
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 4fb7469..11b68b6 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2255,8 +2255,8 @@ ReindexObjectType
ReindexParams
ReindexStmt
ReindexType
-RelFileNode
-RelFileNodeBackend
+RelFileLocator
+RelFileLocatorBackend
RelIdCacheEnt
RelInfo
RelInfoArr
@@ -2274,8 +2274,8 @@ RelationPtr
RelationSyncEntry
RelcacheCallbackFunction
ReleaseMatchCB
-RelfilenodeMapEntry
-RelfilenodeMapKey
+RelfilenumberMapEntry
+RelfilenumberMapKey
Relids
RelocationBufferInfo
RelptrFreePageBtree
@@ -3877,7 +3877,7 @@ xl_xact_parsed_abort
xl_xact_parsed_commit
xl_xact_parsed_prepare
xl_xact_prepare
-xl_xact_relfilenodes
+xl_xact_relfilelocators
xl_xact_stats_item
xl_xact_stats_items
xl_xact_subxacts
--
1.8.3.1
On Fri, Jun 24, 2022 at 7:08 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
I have changed that. PFA, the updated patch.
Apart from one minor nitpick (see below) I don't see a problem with
this in isolation. It seems like a pretty clean renaming. So I think
we need to move onto the question of how clean the rest of the patch
series looks with this as a base.
A preliminary refactoring that was discussed in the past and was
originally in 0001 was to move the fields included in BufferTag via
RelFileNode/Locator directly into the struct. I think maybe it doesn't
make sense to include that in 0001 as you have it here, but maybe that
could be 0002 with the main patch to follow as 0003, or something like
that. I wonder if we can get by with redefining RelFileNode like this
in 0002:
typedef struct buftag
{
Oid spcOid;
Oid dbOid;
RelFileNumber fileNumber;
ForkNumber forkNum;
} BufferTag;
And then like this in 0003:
typedef struct buftag
{
Oid spcOid;
Oid dbOid;
RelFileNumber fileNumber:56;
ForkNumber forkNum:8;
} BufferTag;
- * from catalog OIDs to filenode numbers. Each database has a map file for
+ * from catalog OIDs to filenumber. Each database has a map file for
should be filenumbers
--
Robert Haas
EDB: http://www.enterprisedb.com
Hi,
On 2022-06-24 10:59:25 -0400, Robert Haas wrote:
A preliminary refactoring that was discussed in the past and was
originally in 0001 was to move the fields included in BufferTag via
RelFileNode/Locator directly into the struct. I think maybe it doesn't
make sense to include that in 0001 as you have it here, but maybe that
could be 0002 with the main patch to follow as 0003, or something like
that. I wonder if we can get by with redefining RelFileNode like this
in 0002:
typedef struct buftag
{
Oid spcOid;
Oid dbOid;
RelFileNumber fileNumber;
ForkNumber forkNum;
} BufferTag;
If we "inline" RelFileNumber, it's probably worth reordering the members so that
the most distinguishing elements come first, to make it quicker to detect hash
collisions. It shows up in profiles today...
I guess it should be blockNum, fileNumber, forkNumber, dbOid, spcOid? I think
as long as blockNum, fileNumber are first, the rest doesn't matter much.
And then like this in 0003:
typedef struct buftag
{
Oid spcOid;
Oid dbOid;
RelFileNumber fileNumber:56;
ForkNumber forkNum:8;
} BufferTag;
Probably worth checking the generated code / the performance effects of using
bitfields (vs manual maskery). I've seen some awful cases, but here it's at a
byte boundary, so it might be ok.
Greetings,
Andres Freund
On Fri, Jun 24, 2022 at 9:30 PM Andres Freund <andres@anarazel.de> wrote:
If we "inline" RelFileNumber, it's probably worth reordering the members so that
the most distinguishing elements come first, to make it quicker to detect hash
collisions. It shows up in profiles today...
I guess it should be blockNum, fileNumber, forkNumber, dbOid, spcOid? I think
as long as blockNum, fileNumber are first, the rest doesn't matter much.
Hmm, I guess we could do that. Possibly as a separate, very small patch.
And then like this in 0003:
typedef struct buftag
{
Oid spcOid;
Oid dbOid;
RelFileNumber fileNumber:56;
ForkNumber forkNum:8;
} BufferTag;
Probably worth checking the generated code / the performance effects of using
bitfields (vs manual maskery). I've seen some awful cases, but here it's at a
byte boundary, so it might be ok.
One advantage of using bitfields is that it might mean we don't need
to introduce accessor macros. Now, if that's going to lead to terrible
performance I guess we should go ahead and add the accessor macros -
Dilip had those in an earlier patch anyway. But it'd be nice if it
weren't necessary.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Sat, 25 Jun 2022 at 02:30, Andres Freund <andres@anarazel.de> wrote:
And then like this in 0003:
typedef struct buftag
{
Oid spcOid;
Oid dbOid;
RelFileNumber fileNumber:56;
ForkNumber forkNum:8;
} BufferTag;Probably worth checking the generated code / the performance effects of using
bitfields (vs manual maskery). I've seen some awful cases, but here it's at a
byte boundary, so it might be ok.
Another approach would be to condense spcOid and dbOid into a single
4-byte Oid-like number, since in most cases they are associated with
each other, and not often many of them anyway. So this new number
would indicate both the database and the tablespace. I know that we
want to be able to make file changes without doing catalog lookups,
but since the number of combinations is usually 1 (and even then low),
it can be cached easily in a smgr array and included in the checkpoint
record (or nearby) for ease of use.
typedef struct buftag
{
Oid db_spcOid;
ForkNumber forkNum; /* uint32 */
RelFileNumber fileNumber; /* uint64 */
} BufferTag;
That way we could just have a simple 64-bit RelFileNumber, without
restriction, and probably some spare bytes on the ForkNumber, if we
needed them later.
--
Simon Riggs http://www.EnterpriseDB.com/
On Tue, Jun 28, 2022 at 7:45 AM Simon Riggs
<simon.riggs@enterprisedb.com> wrote:
Another approach would be to condense spcOid and dbOid into a single
4-byte Oid-like number, since in most cases they are associated with
each other, and not often many of them anyway. So this new number
would indicate both the database and the tablespace. I know that we
want to be able to make file changes without doing catalog lookups,
but since the number of combinations is usually 1, but even then, low,
it can be cached easily in a smgr array and included in the checkpoint
record (or nearby) for ease of use.
typedef struct buftag
{
Oid db_spcOid;
ForkNumber forkNum; /* uint32 */
RelFileNumber fileNumber; /* uint64 */
} BufferTag;
I've thought about this before too, because it does seem like the DB
OID and tablespace OID are a poor use of bit space. You might not even
need to keep the db_spcOid value in any persistent place, because it
could just be an alias for buffer mapping lookups that might change on
every restart. That does have the problem that you now need a
secondary hash table - in theory of unbounded size - to store mappings
from <dboid,tsoid> to db_spcOid, and that seems complicated and hard
to get right. It might be possible, though. Alternatively, you could
imagine a durable mapping that also affects the on-disk structure, but
I don't quite see how to make that work: for example, pg_basebackup
wants to produce a tar file for each tablespace directory, and if the
pathnames no longer contain the tablespace OID but only the db_spcOid,
then that doesn't work any more.
But the primary problem we're trying to solve here is that right now
we sometimes reuse the same filename for a whole new file, and that
results in bugs that only manifest themselves in obscure
circumstances, e.g. see 4eb2176318d0561846c1f9fb3c68bede799d640f.
There are residual failure modes even now related to the "tombstone"
files that are created when you drop a relation: remove everything but
the first file from the main fork but then keep that file (only)
around until after the next checkpoint. OID wraparound is another
annoyance that has influenced the design of quite a bit of code over
the years and where we probably still have bugs. If we don't reuse
relfilenodes, we can avoid a lot of that pain. Combining the DB OID
and TS OID fields doesn't solve that problem.
That way we could just have a simple 64-bit RelFileNumber, without
restriction, and probably some spare bytes on the ForkNumber, if we
needed them later.
In my personal opinion, the ForkNumber system is an ugly wart which
has nothing to recommend it except that the VM and FSM forks are
awesome. But if we could have those things without needing forks, I
think that would be way better. Forks add code complexity in tons of
places, and it's barely possible to scale it to the 4 forks we have
already, let alone any larger number. Furthermore, there are really
negative performance effects from creating 3 files per small relation
rather than 1, and we sure can't afford to have that number get any
bigger. I'd rather kill the ForkNumber system with fire than expand it
further, but even if we do expand it, we're not close to being able to
cope with more than 256 forks per relation.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Tue, Jun 28, 2022 at 11:25 AM Robert Haas <robertmhaas@gmail.com> wrote:
But the primary problem we're trying to solve here is that right now
we sometimes reuse the same filename for a whole new file, and that
results in bugs that only manifest themselves in obscure
circumstances, e.g. see 4eb2176318d0561846c1f9fb3c68bede799d640f.
There are residual failure modes even now related to the "tombstone"
files that are created when you drop a relation: remove everything but
the first file from the main fork but then keep that file (only)
around until after the next checkpoint. OID wraparound is another
annoyance that has influenced the design of quite a bit of code over
the years and where we probably still have bugs. If we don't reuse
relfilenodes, we can avoid a lot of that pain. Combining the DB OID
and TS OID fields doesn't solve that problem.
Oh wait, I'm being stupid. You were going to combine those fields but
then also widen the relfilenode, so that would solve this problem
after all. Oops, I'm dumb.
I still think this is a lot more complicated though, to the point
where I'm not sure we can really make it work at all.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Tue, 28 Jun 2022 at 13:45, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
but since the number of combinations is usually 1, and even then low,
it can be cached easily in an smgr array and included in the checkpoint
record (or nearby) for ease of use.
I was reading the thread to keep up with storage-related prototypes
and patches, and this specifically doesn't sound quite right to me. I
do not know what values you considered to be 'low' or what 'can be
cached easily', so here's some field data:
I have seen PostgreSQL clusters that utilized the relative isolation
of separate databases within the same cluster (instance / postmaster)
to provide higher guarantees of data access isolation while still
being able to share a resource pool, which resulted in several
clusters containing upwards of 100 databases.
I will be the first to admit that it is quite unlikely to be common
practice, but this workload increases the number of dbOid+spcOid
combinations to 100s (even while using only a single tablespace),
which in my opinion requires some more thought than just handwaving it
into an smgr array and/or checkpoint records.
Kind regards,
Matthias van de Meent
On Fri, Jun 24, 2022 at 8:29 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Jun 24, 2022 at 7:08 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
I have changed that. PFA, the updated patch.
Apart from one minor nitpick (see below) I don't see a problem with
this in isolation. It seems like a pretty clean renaming. So I think
we need to move on to the question of how clean the rest of the patch
series looks with this as a base.
PFA, the remaining set of patches. They might need some indentation
fixes, but let's first see how the overall idea looks, and then we can
work on that. I have fixed all the open review comments from the
previous thread except this comment from Robert.
- It looks to me like you need to give significantly more thought to
the proper way of adjusting the relfilenode-related test cases in
alter_table.out.
It seems to me that this test case is just testing whether the
table/child tables are rewritten or not after the ALTER TABLE. For
that it was comparing the oid with the relfilenode; now that is not
possible, so I think it's quite reasonable to just compare the current
relfilenode with the old relfilenode: if they are the same, the table
was not rewritten. So I am not sure why the original test case had two
cases, 'own' and 'orig'. With respect to this test case they both have
the same meaning; in fact, comparing the old relfilenode with the
current relfilenode is a better way of testing than comparing the oid
with the relfilenode.
diff --git a/src/test/regress/expected/alter_table.out
b/src/test/regress/expected/alter_table.out
index 5ede56d..80af97e 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -2164,7 +2164,6 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
else 'OTHER'
end as storage,
@@ -2175,10 +2174,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+---------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | t | own | child 0 index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | t | own | child 1 index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | t | orig | child 0 index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | t | orig | child 1 index
at_partitioned_id_name_key | t | none | parent index
(6 rows)
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachments:
v3-0002-Preliminary-refactoring-for-supporting-larger-rel.patchapplication/x-patch; name=v3-0002-Preliminary-refactoring-for-supporting-larger-rel.patchDownload
From ce5a6edc4a1304780cda5619b9f3b77446082973 Mon Sep 17 00:00:00 2001
From: dilip kumar <dilipbalaut@localhost.localdomain>
Date: Sat, 25 Jun 2022 10:43:12 +0530
Subject: [PATCH v3 2/4] Preliminary refactoring for supporting larger
relfilenumber
Currently, relfilenumber is of Oid type and can wrap around, so as part of
the larger patch set we are trying to make it 64 bits to avoid wraparound,
which will also make a couple of other things simpler, as explained in the
next patches.
This is just a preliminary refactoring patch toward that goal: in
BufferTag, instead of keeping a RelFileLocator, we keep the
tablespace Oid, database Oid, and relfilenumber directly, so that
once we change the relNumber in RelFileLocator to 64 bits the buffer
tag's alignment padding will not change.
---
contrib/pg_buffercache/pg_buffercache_pages.c | 6 +-
contrib/pg_prewarm/autoprewarm.c | 7 +-
src/backend/storage/buffer/bufmgr.c | 113 ++++++++++++++++++--------
src/backend/storage/buffer/localbuf.c | 22 +++--
src/include/storage/buf_internals.h | 43 ++++++++--
5 files changed, 137 insertions(+), 54 deletions(-)
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 713f52a..abc8813 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -153,9 +153,9 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenumber = bufHdr->tag.rlocator.relNumber;
- fctx->record[i].reltablespace = bufHdr->tag.rlocator.spcOid;
- fctx->record[i].reldatabase = bufHdr->tag.rlocator.dbOid;
+ fctx->record[i].relfilenumber = BufTagGetFileNumber(bufHdr->tag);
+ fctx->record[i].reltablespace = bufHdr->tag.spcOid;
+ fctx->record[i].reldatabase = bufHdr->tag.dbOid;
fctx->record[i].forknum = bufHdr->tag.forkNum;
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index 7f1d55c..ca80d5a 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -631,9 +631,10 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
if (buf_state & BM_TAG_VALID &&
((buf_state & BM_PERMANENT) || dump_unlogged))
{
- block_info_array[num_blocks].database = bufHdr->tag.rlocator.dbOid;
- block_info_array[num_blocks].tablespace = bufHdr->tag.rlocator.spcOid;
- block_info_array[num_blocks].filenumber = bufHdr->tag.rlocator.relNumber;
+ block_info_array[num_blocks].database = bufHdr->tag.dbOid;
+ block_info_array[num_blocks].tablespace = bufHdr->tag.spcOid;
+ block_info_array[num_blocks].filenumber =
+ BufTagGetFileNumber(bufHdr->tag);
block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 7071ff6..d34fff3 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1647,7 +1647,7 @@ ReleaseAndReadBuffer(Buffer buffer,
{
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
+ BuffTagRelFileLocatorEquals(bufHdr->tag, relation->rd_locator) &&
bufHdr->tag.forkNum == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
@@ -1658,7 +1658,7 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetBufferDescriptor(buffer - 1);
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
+ BuffTagRelFileLocatorEquals(bufHdr->tag, relation->rd_locator) &&
bufHdr->tag.forkNum == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
@@ -2000,8 +2000,8 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
- item->tsId = bufHdr->tag.rlocator.spcOid;
- item->relNumber = bufHdr->tag.rlocator.relNumber;
+ item->tsId = bufHdr->tag.spcOid;
+ item->relNumber = BufTagGetFileNumber(bufHdr->tag);
item->forkNum = bufHdr->tag.forkNum;
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2692,6 +2692,7 @@ PrintBufferLeakWarning(Buffer buffer)
char *path;
BackendId backend;
uint32 buf_state;
+ RelFileLocator rlocator;
Assert(BufferIsValid(buffer));
if (BufferIsLocal(buffer))
@@ -2707,8 +2708,10 @@ PrintBufferLeakWarning(Buffer buffer)
backend = InvalidBackendId;
}
+ BuffTagCopyRelFileLocator(buf->tag, rlocator);
+
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rlocator, backend, buf->tag.forkNum);
+ path = relpathbackend(rlocator, backend, buf->tag.forkNum);
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2787,7 +2790,7 @@ BufferGetTag(Buffer buffer, RelFileLocator *rlocator, ForkNumber *forknum,
bufHdr = GetBufferDescriptor(buffer - 1);
/* pinned, so OK to read tag without spinlock */
- *rlocator = bufHdr->tag.rlocator;
+ BuffTagCopyRelFileLocator(bufHdr->tag, *rlocator);
*forknum = bufHdr->tag.forkNum;
*blknum = bufHdr->tag.blockNum;
}
@@ -2838,7 +2841,12 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rlocator, InvalidBackendId);
+ {
+ RelFileLocator rlocator;
+
+ BuffTagCopyRelFileLocator(buf->tag, rlocator);
+ reln = smgropen(rlocator, InvalidBackendId);
+ }
TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
buf->tag.blockNum,
@@ -3141,14 +3149,14 @@ DropRelFileLocatorBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
* We could check forkNum and blockNum as well as the rlocator, but the
* incremental win from doing so seems small.
*/
- if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator))
+ if (!BuffTagRelFileLocatorEquals(bufHdr->tag, rlocator.locator))
continue;
buf_state = LockBufHdr(bufHdr);
for (j = 0; j < nforks; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator) &&
+ if (BuffTagRelFileLocatorEquals(bufHdr->tag, rlocator.locator) &&
bufHdr->tag.forkNum == forkNum[j] &&
bufHdr->tag.blockNum >= firstDelBlock[j])
{
@@ -3301,7 +3309,7 @@ DropRelFileLocatorsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
for (j = 0; j < n; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, locators[j]))
+ if (BuffTagRelFileLocatorEquals(bufHdr->tag, locators[j]))
{
rlocator = &locators[j];
break;
@@ -3310,7 +3318,10 @@ DropRelFileLocatorsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
}
else
{
- rlocator = bsearch((const void *) &(bufHdr->tag.rlocator),
+ RelFileLocator locator;
+
+ BuffTagCopyRelFileLocator(bufHdr->tag, locator);
+ rlocator = bsearch((const void *) &(locator),
locators, n, sizeof(RelFileLocator),
rlocator_comparator);
}
@@ -3320,7 +3331,7 @@ DropRelFileLocatorsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, (*rlocator)))
+ if (BuffTagRelFileLocatorEquals(bufHdr->tag, (*rlocator)))
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3380,7 +3391,7 @@ FindAndDropRelFileLocatorBuffers(RelFileLocator rlocator, ForkNumber forkNum,
*/
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
+ if (BuffTagRelFileLocatorEquals(bufHdr->tag, rlocator) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -3419,11 +3430,11 @@ DropDatabaseBuffers(Oid dbid)
* As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rlocator.dbOid != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rlocator.dbOid == dbid)
+ if (bufHdr->tag.dbOid == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3447,13 +3458,16 @@ PrintBufferDescs(void)
{
BufferDesc *buf = GetBufferDescriptor(i);
Buffer b = BufferDescriptorGetBuffer(buf);
+ RelFileLocator rlocator;
+
+ BuffTagCopyRelFileLocator(buf->tag, rlocator);
/* theoretically we should lock the bufhdr here */
elog(LOG,
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathbackend(buf->tag.rlocator, InvalidBackendId, buf->tag.forkNum),
+ relpathbackend(rlocator, InvalidBackendId, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3473,12 +3487,16 @@ PrintPinnedBufs(void)
if (GetPrivateRefCount(b) > 0)
{
+ RelFileLocator rlocator;
+
+ BuffTagCopyRelFileLocator(buf->tag, rlocator);
+
/* theoretically we should lock the bufhdr here */
elog(LOG,
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathperm(buf->tag.rlocator, buf->tag.forkNum),
+ relpathperm(rlocator, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3517,7 +3535,7 @@ FlushRelationBuffers(Relation rel)
uint32 buf_state;
bufHdr = GetLocalBufferDescriptor(i);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
+ if (BuffTagRelFileLocatorEquals(bufHdr->tag, rel->rd_locator) &&
((buf_state = pg_atomic_read_u32(&bufHdr->state)) &
(BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
@@ -3564,13 +3582,13 @@ FlushRelationBuffers(Relation rel)
* As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator))
+ if (!BuffTagRelFileLocatorEquals(bufHdr->tag, rel->rd_locator))
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
+ if (BuffTagRelFileLocatorEquals(bufHdr->tag, rel->rd_locator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3644,7 +3662,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
for (j = 0; j < nrels; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, srels[j].rlocator))
+ if (BuffTagRelFileLocatorEquals(bufHdr->tag, srels[j].rlocator))
{
srelent = &srels[j];
break;
@@ -3653,7 +3671,10 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
}
else
{
- srelent = bsearch((const void *) &(bufHdr->tag.rlocator),
+ RelFileLocator rlocator;
+
+ BuffTagCopyRelFileLocator(bufHdr->tag, rlocator);
+ srelent = bsearch((const void *) &(rlocator),
srels, nrels, sizeof(SMgrSortArray),
rlocator_comparator);
}
@@ -3665,7 +3686,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, srelent->rlocator) &&
+ if (BuffTagRelFileLocatorEquals(bufHdr->tag, srelent->rlocator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3867,13 +3888,13 @@ FlushDatabaseBuffers(Oid dbid)
* As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rlocator.dbOid != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rlocator.dbOid == dbid &&
+ if (bufHdr->tag.dbOid == dbid &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -4033,6 +4054,10 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
if (XLogHintBitIsNeeded() &&
(pg_atomic_read_u32(&bufHdr->state) & BM_PERMANENT))
{
+ RelFileLocator rlocator;
+
+ BuffTagCopyRelFileLocator(bufHdr->tag, rlocator);
+
/*
* If we must not write WAL, due to a relfilelocator-specific
* condition or being in recovery, don't dirty the page. We can
@@ -4041,8 +4066,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
*
* See src/backend/storage/page/README for longer discussion.
*/
- if (RecoveryInProgress() ||
- RelFileLocatorSkippingWAL(bufHdr->tag.rlocator))
+ if (RecoveryInProgress() || RelFileLocatorSkippingWAL(rlocator))
return;
/*
@@ -4650,8 +4674,10 @@ AbortBufferIO(void)
{
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
+ RelFileLocator rlocator;
- path = relpathperm(buf->tag.rlocator, buf->tag.forkNum);
+ BuffTagCopyRelFileLocator(buf->tag, rlocator);
+ path = relpathperm(rlocator, buf->tag.forkNum);
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4675,7 +4701,11 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rlocator, bufHdr->tag.forkNum);
+ char *path;
+ RelFileLocator rlocator;
+
+ BuffTagCopyRelFileLocator(bufHdr->tag, rlocator);
+ path = relpathperm(rlocator, bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4693,8 +4723,11 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
- char *path = relpathbackend(bufHdr->tag.rlocator, MyBackendId,
- bufHdr->tag.forkNum);
+ char *path;
+ RelFileLocator rlocator;
+
+ BuffTagCopyRelFileLocator(bufHdr->tag, rlocator);
+ path = relpathbackend(rlocator, MyBackendId, bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4787,9 +4820,14 @@ WaitBufHdrUnlocked(BufferDesc *buf)
static inline int
buffertag_comparator(const BufferTag *ba, const BufferTag *bb)
{
- int ret;
+ int ret;
+ RelFileLocator rlocatora;
+ RelFileLocator rlocatorb;
+
+ BuffTagCopyRelFileLocator(*ba, rlocatora);
+ BuffTagCopyRelFileLocator(*bb, rlocatorb);
- ret = rlocator_comparator(&ba->rlocator, &bb->rlocator);
+ ret = rlocator_comparator(&rlocatora, &rlocatorb);
if (ret != 0)
return ret;
@@ -4946,10 +4984,12 @@ IssuePendingWritebacks(WritebackContext *context)
SMgrRelation reln;
int ahead;
BufferTag tag;
+ RelFileLocator currlocator;
Size nblocks = 1;
cur = &context->pending_writebacks[i];
tag = cur->tag;
+ BuffTagCopyRelFileLocator(tag, currlocator);
/*
* Peek ahead, into following writeback requests, to see if they can
@@ -4957,10 +4997,13 @@ IssuePendingWritebacks(WritebackContext *context)
*/
for (ahead = 0; i + ahead + 1 < context->nr_pending; ahead++)
{
+ RelFileLocator nextrlocator;
+
next = &context->pending_writebacks[i + ahead + 1];
+ BuffTagCopyRelFileLocator(next->tag, nextrlocator);
/* different file, stop */
- if (!RelFileLocatorEquals(cur->tag.rlocator, next->tag.rlocator) ||
+ if (!RelFileLocatorEquals(currlocator, nextrlocator) ||
cur->tag.forkNum != next->tag.forkNum)
break;
@@ -4979,7 +5022,7 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rlocator, InvalidBackendId);
+ reln = smgropen(currlocator, InvalidBackendId);
smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
}
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index 3dc9cc7..1d43f22 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -213,9 +213,12 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
{
SMgrRelation oreln;
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
+ RelFileLocator rlocator;
+
+ BuffTagCopyRelFileLocator(bufHdr->tag, rlocator);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rlocator, MyBackendId);
+ oreln = smgropen(rlocator, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
@@ -337,16 +340,22 @@ DropRelFileLocatorLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
+ BuffTagRelFileLocatorEquals(bufHdr->tag, rlocator) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
+ {
+ RelFileLocator rlocator;
+
+ BuffTagCopyRelFileLocator(bufHdr->tag, rlocator);
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rlocator, MyBackendId,
+ relpathbackend(rlocator, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
+ }
+
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
hash_search(LocalBufHash, (void *) &bufHdr->tag,
@@ -383,12 +392,15 @@ DropRelFileLocatorAllLocalBuffers(RelFileLocator rlocator)
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator))
+ BuffTagRelFileLocatorEquals(bufHdr->tag, rlocator))
{
+ RelFileLocator rlocator;
+
+ BuffTagCopyRelFileLocator(bufHdr->tag, rlocator);
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rlocator, MyBackendId,
+ relpathbackend(rlocator, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index d54e1f6..b1b8061 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,34 +90,61 @@
*/
typedef struct buftag
{
- RelFileLocator rlocator; /* physical relation identifier */
- ForkNumber forkNum;
- BlockNumber blockNum; /* blknum relative to begin of reln */
+ Oid spcOid; /* tablespace oid. */
+ Oid dbOid; /* database oid. */
+ RelFileNumber relNumber; /* relation file number. */
+ ForkNumber forkNum;
+ BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
+#define BufTagGetFileNumber(a) ((a).relNumber)
+
+#define BufTagSetFileNumber(a, relnumber) \
+( \
+ (a).relNumber = (relnumber) \
+)
+
#define CLEAR_BUFFERTAG(a) \
( \
- (a).rlocator.spcOid = InvalidOid, \
- (a).rlocator.dbOid = InvalidOid, \
- (a).rlocator.relNumber = InvalidOid, \
+ (a).spcOid = InvalidOid, \
+ (a).dbOid = InvalidOid, \
+ BufTagSetFileNumber(a, InvalidRelFileNumber), \
(a).forkNum = InvalidForkNumber, \
(a).blockNum = InvalidBlockNumber \
)
#define INIT_BUFFERTAG(a,xx_rlocator,xx_forkNum,xx_blockNum) \
( \
- (a).rlocator = (xx_rlocator), \
+ (a).spcOid = (xx_rlocator).spcOid, \
+ (a).dbOid = (xx_rlocator).dbOid, \
+ BufTagSetFileNumber(a, (xx_rlocator).relNumber), \
(a).forkNum = (xx_forkNum), \
(a).blockNum = (xx_blockNum) \
)
#define BUFFERTAGS_EQUAL(a,b) \
( \
- RelFileLocatorEquals((a).rlocator, (b).rlocator) && \
+ (a).spcOid == (b).spcOid && \
+ (a).dbOid == (b).dbOid && \
+ (a).relNumber == (b).relNumber && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
)
+#define BuffTagCopyRelFileLocator(a, locator) \
+do { \
+ (locator).spcOid = (a).spcOid; \
+ (locator).dbOid = (a).dbOid; \
+ (locator).relNumber = (a).relNumber; \
+} while(0)
+
+#define BuffTagRelFileLocatorEquals(a, locator) \
+( \
+ (a).spcOid == (locator).spcOid && \
+ (a).dbOid == (locator).dbOid && \
+ (a).relNumber == (locator).relNumber \
+)
+
/*
* The shared buffer mapping table is partitioned to reduce contention.
* To determine which partition lock a given tag requires, compute the tag's
--
1.8.3.1
v3-0004-Don-t-delay-removing-Tombstone-file-until-next.patchapplication/x-patch; name=v3-0004-Don-t-delay-removing-Tombstone-file-until-next.patchDownload
From a089ac1d4cc696e7cba0ed441ba2560e0641589c Mon Sep 17 00:00:00 2001
From: dilip kumar <dilipbalaut@localhost.localdomain>
Date: Wed, 29 Jun 2022 13:24:32 +0530
Subject: [PATCH v3 4/4] Don't delay removing Tombstone file until next
checkpoint
Currently, we cannot remove an unused relfilenode until the
next checkpoint, because if we removed it immediately there
would be a risk of reusing the same relfilenode for two
different relations within a single checkpoint cycle due to Oid
wraparound.
Now that the previous patches in the set have made the relfilenumber
56 bits wide and removed the risk of wraparound, we no longer
need to wait until the next checkpoint to remove an unused
relation file; we can clean it up at commit.
---
src/backend/access/transam/xlog.c | 5 --
src/backend/storage/smgr/md.c | 58 ++++++----------------
src/backend/storage/sync/sync.c | 101 --------------------------------------
src/include/storage/sync.h | 2 -
4 files changed, 14 insertions(+), 152 deletions(-)
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 302da4a..50ac3ea 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6644,11 +6644,6 @@ CreateCheckPoint(int flags)
END_CRIT_SECTION();
/*
- * Let smgr do post-checkpoint cleanup (eg, deleting old files).
- */
- SyncPostCheckpoint();
-
- /*
* Update the average distance between checkpoints if the prior checkpoint
* exists.
*/
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 3998296..e8c1cfa 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -126,8 +126,6 @@ static void mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forkNum,
static MdfdVec *mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior);
static void register_dirty_segment(SMgrRelation reln, ForkNumber forknum,
MdfdVec *seg);
-static void register_unlink_segment(RelFileLocatorBackend rlocator, ForkNumber forknum,
- BlockNumber segno);
static void register_forget_request(RelFileLocatorBackend rlocator, ForkNumber forknum,
BlockNumber segno);
static void _fdvec_resize(SMgrRelation reln,
@@ -325,36 +323,25 @@ mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forkNum, bool isRedo)
/*
* Delete or truncate the first segment.
*/
- if (isRedo || forkNum != MAIN_FORKNUM || RelFileLocatorBackendIsTemp(rlocator))
+ if (!RelFileLocatorBackendIsTemp(rlocator))
{
- if (!RelFileLocatorBackendIsTemp(rlocator))
- {
- /* Prevent other backends' fds from holding on to the disk space */
- ret = do_truncate(path);
-
- /* Forget any pending sync requests for the first segment */
- register_forget_request(rlocator, forkNum, 0 /* first seg */ );
- }
- else
- ret = 0;
+ /* Prevent other backends' fds from holding on to the disk space */
+ ret = do_truncate(path);
- /* Next unlink the file, unless it was already found to be missing */
- if (ret == 0 || errno != ENOENT)
- {
- ret = unlink(path);
- if (ret < 0 && errno != ENOENT)
- ereport(WARNING,
- (errcode_for_file_access(),
- errmsg("could not remove file \"%s\": %m", path)));
- }
+ /* Forget any pending sync requests for the first segment */
+ register_forget_request(rlocator, forkNum, 0 /* first seg */ );
}
else
- {
- /* Prevent other backends' fds from holding on to the disk space */
- ret = do_truncate(path);
+ ret = 0;
- /* Register request to unlink first segment later */
- register_unlink_segment(rlocator, forkNum, 0 /* first seg */ );
+ /* Next unlink the file, unless it was already found to be missing */
+ if (ret == 0 || errno != ENOENT)
+ {
+ ret = unlink(path);
+ if (ret < 0 && errno != ENOENT)
+ ereport(WARNING,
+ (errcode_for_file_access(),
+ errmsg("could not remove file \"%s\": %m", path)));
}
/*
@@ -1002,23 +989,6 @@ register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
}
/*
- * register_unlink_segment() -- Schedule a file to be deleted after next checkpoint
- */
-static void
-register_unlink_segment(RelFileLocatorBackend rlocator, ForkNumber forknum,
- BlockNumber segno)
-{
- FileTag tag;
-
- INIT_MD_FILETAG(tag, rlocator.locator, forknum, segno);
-
- /* Should never be used with temp relations */
- Assert(!RelFileLocatorBackendIsTemp(rlocator));
-
- RegisterSyncRequest(&tag, SYNC_UNLINK_REQUEST, true /* retryOnError */ );
-}
-
-/*
* register_forget_request() -- forget any fsyncs for a relation fork's segment
*/
static void
diff --git a/src/backend/storage/sync/sync.c b/src/backend/storage/sync/sync.c
index e1fb631..9a4a31c 100644
--- a/src/backend/storage/sync/sync.c
+++ b/src/backend/storage/sync/sync.c
@@ -201,92 +201,6 @@ SyncPreCheckpoint(void)
}
/*
- * SyncPostCheckpoint() -- Do post-checkpoint work
- *
- * Remove any lingering files that can now be safely removed.
- */
-void
-SyncPostCheckpoint(void)
-{
- int absorb_counter;
- ListCell *lc;
-
- absorb_counter = UNLINKS_PER_ABSORB;
- foreach(lc, pendingUnlinks)
- {
- PendingUnlinkEntry *entry = (PendingUnlinkEntry *) lfirst(lc);
- char path[MAXPGPATH];
-
- /* Skip over any canceled entries */
- if (entry->canceled)
- continue;
-
- /*
- * New entries are appended to the end, so if the entry is new we've
- * reached the end of old entries.
- *
- * Note: if just the right number of consecutive checkpoints fail, we
- * could be fooled here by cycle_ctr wraparound. However, the only
- * consequence is that we'd delay unlinking for one more checkpoint,
- * which is perfectly tolerable.
- */
- if (entry->cycle_ctr == checkpoint_cycle_ctr)
- break;
-
- /* Unlink the file */
- if (syncsw[entry->tag.handler].sync_unlinkfiletag(&entry->tag,
- path) < 0)
- {
- /*
- * There's a race condition, when the database is dropped at the
- * same time that we process the pending unlink requests. If the
- * DROP DATABASE deletes the file before we do, we will get ENOENT
- * here. rmtree() also has to ignore ENOENT errors, to deal with
- * the possibility that we delete the file first.
- */
- if (errno != ENOENT)
- ereport(WARNING,
- (errcode_for_file_access(),
- errmsg("could not remove file \"%s\": %m", path)));
- }
-
- /* Mark the list entry as canceled, just in case */
- entry->canceled = true;
-
- /*
- * As in ProcessSyncRequests, we don't want to stop absorbing fsync
- * requests for a long time when there are many deletions to be done.
- * We can safely call AbsorbSyncRequests() at this point in the loop.
- */
- if (--absorb_counter <= 0)
- {
- AbsorbSyncRequests();
- absorb_counter = UNLINKS_PER_ABSORB;
- }
- }
-
- /*
- * If we reached the end of the list, we can just remove the whole list
- * (remembering to pfree all the PendingUnlinkEntry objects). Otherwise,
- * we must keep the entries at or after "lc".
- */
- if (lc == NULL)
- {
- list_free_deep(pendingUnlinks);
- pendingUnlinks = NIL;
- }
- else
- {
- int ntodelete = list_cell_number(pendingUnlinks, lc);
-
- for (int i = 0; i < ntodelete; i++)
- pfree(list_nth(pendingUnlinks, i));
-
- pendingUnlinks = list_delete_first_n(pendingUnlinks, ntodelete);
- }
-}
-
-/*
* ProcessSyncRequests() -- Process queued fsync requests.
*/
void
@@ -532,21 +446,6 @@ RememberSyncRequest(const FileTag *ftag, SyncRequestType type)
entry->canceled = true;
}
}
- else if (type == SYNC_UNLINK_REQUEST)
- {
- /* Unlink request: put it in the linked list */
- MemoryContext oldcxt = MemoryContextSwitchTo(pendingOpsCxt);
- PendingUnlinkEntry *entry;
-
- entry = palloc(sizeof(PendingUnlinkEntry));
- entry->tag = *ftag;
- entry->cycle_ctr = checkpoint_cycle_ctr;
- entry->canceled = false;
-
- pendingUnlinks = lappend(pendingUnlinks, entry);
-
- MemoryContextSwitchTo(oldcxt);
- }
else
{
/* Normal case: enter a request to fsync this segment */
diff --git a/src/include/storage/sync.h b/src/include/storage/sync.h
index 049af87..2c0b812 100644
--- a/src/include/storage/sync.h
+++ b/src/include/storage/sync.h
@@ -23,7 +23,6 @@
typedef enum SyncRequestType
{
SYNC_REQUEST, /* schedule a call of sync function */
- SYNC_UNLINK_REQUEST, /* schedule a call of unlink function */
SYNC_FORGET_REQUEST, /* forget all calls for a tag */
SYNC_FILTER_REQUEST /* forget all calls satisfying match fn */
} SyncRequestType;
@@ -57,7 +56,6 @@ typedef struct FileTag
extern void InitSync(void);
extern void SyncPreCheckpoint(void);
-extern void SyncPostCheckpoint(void);
extern void ProcessSyncRequests(void);
extern void RememberSyncRequest(const FileTag *ftag, SyncRequestType type);
extern bool RegisterSyncRequest(const FileTag *ftag, SyncRequestType type,
--
1.8.3.1
v3-0003-Use-56-bits-for-relfilenumber-to-avoid-wraparound.patchapplication/x-patch; name=v3-0003-Use-56-bits-for-relfilenumber-to-avoid-wraparound.patchDownload
From 137c99e34797c5148d2eb49d8ee90f3ee1ff5436 Mon Sep 17 00:00:00 2001
From: dilip kumar <dilipbalaut@localhost.localdomain>
Date: Sat, 25 Jun 2022 15:12:27 +0530
Subject: [PATCH v3 3/4] Use 56 bits for relfilenumber to avoid wraparound
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This patch makes the relfilenumber 56 bits wide. The problem is that
simply widening it would also enlarge the BufferTag, increasing memory
usage and possibly hurting performance. To avoid that, within the
buffer tag we use 8 bits for the fork number and 56 bits for the
relfilenumber, rather than spending a full 64 bits on the
relfilenumber alone.
---
contrib/pg_buffercache/Makefile | 3 +-
.../pg_buffercache/pg_buffercache--1.3--1.4.sql | 30 +++++++
contrib/pg_buffercache/pg_buffercache.control | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 20 ++++-
contrib/pg_walinspect/expected/pg_walinspect.out | 4 +-
contrib/pg_walinspect/sql/pg_walinspect.sql | 4 +-
doc/src/sgml/catalogs.sgml | 2 +-
doc/src/sgml/pgbuffercache.sgml | 2 +-
src/backend/access/gin/ginxlog.c | 2 +-
src/backend/access/rmgrdesc/gistdesc.c | 2 +-
src/backend/access/rmgrdesc/heapdesc.c | 2 +-
src/backend/access/rmgrdesc/nbtdesc.c | 2 +-
src/backend/access/rmgrdesc/seqdesc.c | 2 +-
src/backend/access/rmgrdesc/xlogdesc.c | 21 +++--
src/backend/access/transam/README | 4 +-
src/backend/access/transam/varsup.c | 94 +++++++++++++++++++++-
src/backend/access/transam/xlog.c | 48 +++++++++++
src/backend/access/transam/xlogprefetcher.c | 14 ++--
src/backend/access/transam/xlogrecovery.c | 6 +-
src/backend/access/transam/xlogutils.c | 8 +-
src/backend/catalog/catalog.c | 55 +++----------
src/backend/catalog/heap.c | 8 +-
src/backend/catalog/index.c | 4 +-
src/backend/commands/tablecmds.c | 9 ++-
src/backend/nodes/outfuncs.c | 2 +-
src/backend/replication/logical/decode.c | 1 +
src/backend/replication/logical/reorderbuffer.c | 2 +-
src/backend/storage/freespace/fsmpage.c | 2 +-
src/backend/storage/lmgr/lwlocknames.txt | 1 +
src/backend/storage/smgr/smgr.c | 2 +-
src/backend/utils/adt/dbsize.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 9 ++-
src/backend/utils/cache/relcache.c | 4 +-
src/backend/utils/cache/relfilenumbermap.c | 4 +-
src/backend/utils/misc/pg_controldata.c | 9 ++-
src/bin/pg_checksums/pg_checksums.c | 6 +-
src/bin/pg_controldata/pg_controldata.c | 2 +
src/bin/pg_dump/pg_dump.c | 20 ++---
src/bin/pg_rewind/filemap.c | 6 +-
src/bin/pg_upgrade/info.c | 11 +--
src/bin/pg_upgrade/pg_upgrade.c | 6 +-
src/bin/pg_upgrade/relfilenumber.c | 4 +-
src/bin/pg_waldump/pg_waldump.c | 2 +-
src/common/relpath.c | 20 ++---
src/fe_utils/option_utils.c | 42 ++++++++++
src/include/access/transam.h | 5 ++
src/include/access/xlog.h | 1 +
src/include/catalog/catalog.h | 1 -
src/include/catalog/pg_class.h | 10 +--
src/include/catalog/pg_control.h | 2 +
src/include/catalog/pg_proc.dat | 10 +--
src/include/fe_utils/option_utils.h | 3 +
src/include/postgres_ext.h | 7 +-
src/include/storage/relfilelocator.h | 12 ++-
src/test/regress/expected/alter_table.out | 20 +++--
src/test/regress/sql/alter_table.sql | 4 +-
56 files changed, 406 insertions(+), 176 deletions(-)
create mode 100644 contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile
index 2ab8c65..2fbb62f 100644
--- a/contrib/pg_buffercache/Makefile
+++ b/contrib/pg_buffercache/Makefile
@@ -7,7 +7,8 @@ OBJS = \
EXTENSION = pg_buffercache
DATA = pg_buffercache--1.2.sql pg_buffercache--1.2--1.3.sql \
- pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql
+ pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql \
+ pg_buffercache--1.3--1.4.sql
PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time"
ifdef USE_PGXS
diff --git a/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
new file mode 100644
index 0000000..ee2d9c7
--- /dev/null
+++ b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
@@ -0,0 +1,30 @@
+/* contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.4'" to load this file. \quit
+
+/* First we have to remove them from the extension */
+ALTER EXTENSION pg_buffercache DROP VIEW pg_buffercache;
+ALTER EXTENSION pg_buffercache DROP FUNCTION pg_buffercache_pages();
+
+/* Then we can drop them */
+DROP VIEW pg_buffercache;
+DROP FUNCTION pg_buffercache_pages();
+
+/* Now redefine */
+CREATE OR REPLACE FUNCTION pg_buffercache_pages()
+RETURNS SETOF RECORD
+AS 'MODULE_PATHNAME', 'pg_buffercache_pages_v1_4'
+LANGUAGE C PARALLEL SAFE;
+
+CREATE OR REPLACE VIEW pg_buffercache AS
+ SELECT P.* FROM pg_buffercache_pages() AS P
+ (bufferid integer, relfilenode int8, reltablespace oid, reldatabase oid,
+ relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
+ pinning_backends int4);
+
+-- Don't want these to be available to public.
+REVOKE ALL ON FUNCTION pg_buffercache_pages() FROM PUBLIC;
+REVOKE ALL ON pg_buffercache FROM PUBLIC;
+GRANT EXECUTE ON FUNCTION pg_buffercache_pages() TO pg_monitor;
+GRANT SELECT ON pg_buffercache TO pg_monitor;
diff --git a/contrib/pg_buffercache/pg_buffercache.control b/contrib/pg_buffercache/pg_buffercache.control
index 8c060ae..a82ae5f 100644
--- a/contrib/pg_buffercache/pg_buffercache.control
+++ b/contrib/pg_buffercache/pg_buffercache.control
@@ -1,5 +1,5 @@
# pg_buffercache extension
comment = 'examine the shared buffer cache'
-default_version = '1.3'
+default_version = '1.4'
module_pathname = '$libdir/pg_buffercache'
relocatable = true
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index abc8813..b8d9cea 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -59,9 +59,10 @@ typedef struct
* relation node/tablespace/database/blocknum and dirty indicator.
*/
PG_FUNCTION_INFO_V1(pg_buffercache_pages);
+PG_FUNCTION_INFO_V1(pg_buffercache_pages_v1_4);
-Datum
-pg_buffercache_pages(PG_FUNCTION_ARGS)
+static Datum
+pg_buffercache_pages_internal(PG_FUNCTION_ARGS, Oid rfn_typid)
{
FuncCallContext *funcctx;
Datum result;
@@ -103,7 +104,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
TupleDescInitEntry(tupledesc, (AttrNumber) 1, "bufferid",
INT4OID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 2, "relfilenode",
- OIDOID, -1, 0);
+ rfn_typid, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 3, "reltablespace",
OIDOID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 4, "reldatabase",
@@ -237,3 +238,16 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
else
SRF_RETURN_DONE(funcctx);
}
+
+Datum
+pg_buffercache_pages(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, OIDOID);
+}
+
+/* entry point for extension version 1.4 and later */
+Datum
+pg_buffercache_pages_v1_4(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, INT8OID);
+}
diff --git a/contrib/pg_walinspect/expected/pg_walinspect.out b/contrib/pg_walinspect/expected/pg_walinspect.out
index a1ee743..e9b06ed 100644
--- a/contrib/pg_walinspect/expected/pg_walinspect.out
+++ b/contrib/pg_walinspect/expected/pg_walinspect.out
@@ -54,9 +54,9 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- ===================================================================
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
ok
----
t
diff --git a/contrib/pg_walinspect/sql/pg_walinspect.sql b/contrib/pg_walinspect/sql/pg_walinspect.sql
index 1b265ea..5393834 100644
--- a/contrib/pg_walinspect/sql/pg_walinspect.sql
+++ b/contrib/pg_walinspect/sql/pg_walinspect.sql
@@ -39,10 +39,10 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
-- ===================================================================
-- Test for filtering out WAL records based on resource_manager and
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 25b02c4..076bf8f 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1965,7 +1965,7 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
</para>
<para>
Name of the on-disk file of this relation; zero means this
diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
index a06fd3e..e222265 100644
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@@ -62,7 +62,7 @@
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
(references <link linkend="catalog-pg-class"><structname>pg_class</structname></link>.<structfield>relfilenode</structfield>)
</para>
<para>
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 41b9211..b75ad79 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -100,7 +100,7 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
BlockNumber blknum;
BufferGetTag(buffer, &locator, &forknum, &blknum);
- elog(ERROR, "failed to add item to index page in %u/%u/%u",
+ elog(ERROR, "failed to add item to index page in %u/%u/" INT64_FORMAT,
locator.spcOid, locator.dbOid, locator.relNumber);
}
}
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index 7dd3c1d..c699937 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -26,7 +26,7 @@ out_gistxlogPageUpdate(StringInfo buf, gistxlogPageUpdate *xlrec)
static void
out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec)
{
- appendStringInfo(buf, "rel %u/%u/%u; blk %u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; blk %u; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber, xlrec->block,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 923d3bc..f6d278b 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -169,7 +169,7 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
{
xl_heap_new_cid *xlrec = (xl_heap_new_cid *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; tid %u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; tid %u/%u",
xlrec->target_locator.spcOid,
xlrec->target_locator.dbOid,
xlrec->target_locator.relNumber,
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index 4843cd5..70feb2d 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -100,7 +100,7 @@ btree_desc(StringInfo buf, XLogReaderState *record)
{
xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/seqdesc.c b/src/backend/access/rmgrdesc/seqdesc.c
index b3845f9..45c6ee7 100644
--- a/src/backend/access/rmgrdesc/seqdesc.c
+++ b/src/backend/access/rmgrdesc/seqdesc.c
@@ -25,7 +25,7 @@ seq_desc(StringInfo buf, XLogReaderState *record)
xl_seq_rec *xlrec = (xl_seq_rec *) rec;
if (info == XLOG_SEQ_LOG)
- appendStringInfo(buf, "rel %u/%u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT,
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber);
}
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 6fec485..e21559d 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -45,8 +45,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
CheckPoint *checkpoint = (CheckPoint *) rec;
appendStringInfo(buf, "redo %X/%X; "
- "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; "
- "oldest xid %u in DB %u; oldest multi %u in DB %u; "
+ "tli %u; prev tli %u; fpw %s; xid %u:%u; relfilenumber " INT64_FORMAT "; oid %u; "
+ "multi %u; offset %u; oldest xid %u in DB %u; oldest multi %u in DB %u; "
"oldest/newest commit timestamp xid: %u/%u; "
"oldest running xid %u; %s",
LSN_FORMAT_ARGS(checkpoint->redo),
@@ -55,6 +55,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->fullPageWrites ? "true" : "false",
EpochFromFullTransactionId(checkpoint->nextXid),
XidFromFullTransactionId(checkpoint->nextXid),
+ checkpoint->nextRelFileNumber,
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
@@ -74,6 +75,13 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
memcpy(&nextOid, rec, sizeof(Oid));
appendStringInfo(buf, "%u", nextOid);
}
+ else if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, rec, sizeof(RelFileNumber));
+ appendStringInfo(buf, INT64_FORMAT, nextRelFileNumber);
+ }
else if (info == XLOG_RESTORE_POINT)
{
xl_restore_point *xlrec = (xl_restore_point *) rec;
@@ -169,6 +177,9 @@ xlog_identify(uint8 info)
case XLOG_NEXTOID:
id = "NEXTOID";
break;
+ case XLOG_NEXT_RELFILENUMBER:
+ id = "NEXT_RELFILENUMBER";
+ break;
case XLOG_SWITCH:
id = "SWITCH";
break;
@@ -237,7 +248,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfoChar(buf, ' ');
appendStringInfo(buf,
- "blkref #%d: rel %u/%u/%u fork %s blk %u",
+ "blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -297,7 +308,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
if (forknum != MAIN_FORKNUM)
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u fork %s blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -306,7 +317,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
else
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
blk);
diff --git a/src/backend/access/transam/README b/src/backend/access/transam/README
index 565f994..c72f4fb 100644
--- a/src/backend/access/transam/README
+++ b/src/backend/access/transam/README
@@ -692,8 +692,8 @@ by having database restart search for files that don't have any committed
entry in pg_class, but that currently isn't done because of the possibility
of deleting data that is useful for forensic analysis of the crash.
Orphan files are harmless --- at worst they waste a bit of disk space ---
-because we check for on-disk collisions when allocating new relfilenumber
-OIDs. So cleaning up isn't really necessary.
+because the relfilenumber is 56 bits wide, so there should not be any
+collisions in practice. So cleaning up isn't really necessary.
3. Deleting a table, which requires an unlink() that could fail.
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 849a7ce..3b7e950 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -30,6 +30,9 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+/* Number of RelFileNumbers to prefetch (preallocate) per XLOG write */
+#define VAR_RFN_PREFETCH 8192
+
/* pointer to "variable cache" in shared memory (set up by shmem.c) */
VariableCache ShmemVariableCache = NULL;
@@ -521,8 +524,7 @@ ForceTransactionIdLimitUpdate(void)
* wide, counter wraparound will occur eventually, and therefore it is unwise
* to assume they are unique unless precautions are taken to make them so.
* Hence, this routine should generally not be used directly. The only direct
- * callers should be GetNewOidWithIndex() and GetNewRelFileNumber() in
- * catalog/catalog.c.
+ * callers should be GetNewOidWithIndex() in catalog/catalog.c.
*/
Oid
GetNewObjectId(void)
@@ -613,6 +615,94 @@ SetNextObjectId(Oid nextOid)
}
/*
+ * GenerateNewRelFileNumber
+ *
+ * Similar to GetNewObjectId, but generates a new relfilenumber instead of
+ * a new Oid.
+ */
+RelFileNumber
+GenerateNewRelFileNumber(void)
+{
+ RelFileNumber result;
+
+ /* Safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot assign RelFileNumber during recovery");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /* Check for the wraparound for the relfilenumber counter */
+	if (unlikely(ShmemVariableCache->nextRelFileNumber > MAX_RELFILENUMBER))
+		elog(ERROR, "relfilenumber is out of bounds");
+
+	/* If we have run out of WAL-logged RelFileNumbers, log some more */
+ if (ShmemVariableCache->relnumbercount == 0)
+ {
+ XLogPutNextRelFileNumber(ShmemVariableCache->nextRelFileNumber +
+ VAR_RFN_PREFETCH);
+
+ ShmemVariableCache->relnumbercount = VAR_RFN_PREFETCH;
+ }
+
+ result = ShmemVariableCache->nextRelFileNumber;
+ (ShmemVariableCache->nextRelFileNumber)++;
+ (ShmemVariableCache->relnumbercount)--;
+
+ LWLockRelease(RelFileNumberGenLock);
+
+ return result;
+}
+
+/*
+ * SetNextRelFileNumber
+ *
+ * This may only be called during pg_upgrade; it advances the RelFileNumber
+ * counter to the specified value if the current value is smaller than the
+ * input value.
+ */
+void
+SetNextRelFileNumber(RelFileNumber relnumber)
+{
+ int relnumbercount;
+
+ /* Safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot forward RelFileNumber during recovery");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /*
+	 * If the previously assigned nextRelFileNumber is already higher than
+	 * the requested value, there is nothing to do. This is possible because
+	 * during upgrade the objects' relfilenumbers can arrive in any order.
+ */
+ if (relnumber <= ShmemVariableCache->nextRelFileNumber)
+ {
+ LWLockRelease(RelFileNumberGenLock);
+ return;
+ }
+
+ /*
+	 * If setting the new relfilenumber would exhaust the range already
+	 * WAL-logged, log a new range; otherwise, just adjust relnumbercount.
+ */
+ relnumbercount = relnumber - ShmemVariableCache->nextRelFileNumber;
+ if (ShmemVariableCache->relnumbercount <= relnumbercount)
+ {
+ XLogPutNextRelFileNumber(relnumber + VAR_RFN_PREFETCH);
+ ShmemVariableCache->relnumbercount = VAR_RFN_PREFETCH;
+ }
+ else
+ ShmemVariableCache->relnumbercount -= relnumbercount;
+
+ ShmemVariableCache->nextRelFileNumber = relnumber;
+
+ LWLockRelease(RelFileNumberGenLock);
+}
+
+/*
* StopGeneratingPinnedObjectIds
*
* This is called once during initdb to force the OID counter up to
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 8764084..302da4a 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4546,6 +4546,7 @@ BootStrapXLOG(void)
checkPoint.nextXid =
FullTransactionIdFromEpochAndXid(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstGenbkiObjectId;
+ checkPoint.nextRelFileNumber = FirstNormalRelFileNumber;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
checkPoint.oldestXid = FirstNormalTransactionId;
@@ -4559,7 +4560,9 @@ BootStrapXLOG(void)
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->relnumbercount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -5026,7 +5029,9 @@ StartupXLOG(void)
/* initialize shared memory variables from the checkpoint record */
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->relnumbercount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -6475,6 +6480,12 @@ CreateCheckPoint(int flags)
checkPoint.nextOid += ShmemVariableCache->oidCount;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_SHARED);
+ checkPoint.nextRelFileNumber = ShmemVariableCache->nextRelFileNumber;
+ if (!shutdown)
+ checkPoint.nextRelFileNumber += ShmemVariableCache->relnumbercount;
+ LWLockRelease(RelFileNumberGenLock);
+
MultiXactGetCheckptMulti(shutdown,
&checkPoint.nextMulti,
&checkPoint.nextMultiOffset,
@@ -7353,6 +7364,29 @@ XLogPutNextOid(Oid nextOid)
}
/*
+ * Similar to XLogPutNextOid, but writes a NEXT_RELFILENUMBER log record
+ * instead of a NEXTOID record.
+ */
+void
+XLogPutNextRelFileNumber(RelFileNumber nextrelnumber)
+{
+ XLogRecPtr recptr;
+
+ XLogBeginInsert();
+ XLogRegisterData((char *) (&nextrelnumber), sizeof(RelFileNumber));
+ recptr = XLogInsert(RM_XLOG_ID, XLOG_NEXT_RELFILENUMBER);
+
+ /*
+	 * Flush the xlog record to disk before returning, to protect against
+	 * file system changes reaching the disk before the
+	 * XLOG_NEXT_RELFILENUMBER record.
+	 *
+	 * This should not hurt performance, since we WAL-log only once per
+	 * 8192 RelFileNumbers assigned.
+ */
+ XLogFlush(recptr);
+}
+
+/*
* Write an XLOG SWITCH record.
*
* Here we just blindly issue an XLogInsert request for the record.
@@ -7567,6 +7601,16 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
}
+ if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, XLogRecGetData(record), sizeof(RelFileNumber));
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = nextRelFileNumber;
+ ShmemVariableCache->relnumbercount = 0;
+ LWLockRelease(RelFileNumberGenLock);
+ }
else if (info == XLOG_CHECKPOINT_SHUTDOWN)
{
CheckPoint checkPoint;
@@ -7581,6 +7625,10 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->relnumbercount = 0;
+ LWLockRelease(RelFileNumberGenLock);
MultiXactSetNextMXact(checkPoint.nextMulti,
checkPoint.nextMultiOffset);
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index d1662f3..6b6fa56 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -610,7 +610,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u until %X/%X is replayed, which creates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, which creates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -633,7 +633,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, which truncates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, which truncates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -732,7 +732,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing all prefetch in relation %u/%u/%u until %X/%X is replayed, because the relation does not exist on disk",
+ "suppressing all prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, because the relation does not exist on disk",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -753,7 +753,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, because the relation is too small",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, because the relation is too small",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -792,7 +792,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* truncated beneath our feet?
*/
elog(ERROR,
- "could not prefetch relation %u/%u/%u block %u",
+ "could not prefetch relation %u/%u/" INT64_FORMAT " block %u",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -930,7 +930,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed),
filter->filter_from_block);
@@ -946,7 +946,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed));
#endif
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 8306518..b7a67aa 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2175,14 +2175,14 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
continue;
if (forknum != MAIN_FORKNUM)
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, fork %u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", fork %u, blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
forknum,
blk);
else
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
@@ -2378,7 +2378,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
if (memcmp(replay_image_masked, primary_image_masked, BLCKSZ) != 0)
{
elog(FATAL,
- "inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
+ "inconsistent page found, rel %u/%u/" INT64_FORMAT ", forknum %u, blkno %u",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forknum, blkno);
}
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 42a0f51..2f58e77 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -617,17 +617,17 @@ CreateFakeRelcacheEntry(RelFileLocator rlocator)
rel->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
/* We don't know the name of the relation; use relfilelocator instead */
- sprintf(RelationGetRelationName(rel), "%u", rlocator.relNumber);
+ sprintf(RelationGetRelationName(rel), INT64_FORMAT, rlocator.relNumber);
/*
* We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNumber may be
- * different from the relation's OID. It shouldn't really matter though.
+ * relation. Note that we set relId to FirstNormalObjectId, which is
+ * completely bogus. It shouldn't really matter though.
* In recovery, we are running by ourselves and can't have any lock
* conflicts. While syncing, we already hold AccessExclusiveLock.
*/
rel->rd_lockInfo.lockRelId.dbId = rlocator.dbOid;
- rel->rd_lockInfo.lockRelId.relId = rlocator.relNumber;
+ rel->rd_lockInfo.lockRelId.relId = FirstNormalObjectId;
rel->rd_smgr = NULL;
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 2a33273..df861c4 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -482,26 +482,16 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
/*
* GetNewRelFileNumber
- * Generate a new relfilenumber that is unique within the
- * database of the given tablespace.
+ * Generate a new relfilenumber.
*
- * If the relfilenumber will also be used as the relation's OID, pass the
- * opened pg_class catalog, and this routine will guarantee that the result
- * is also an unused OID within pg_class. If the result is to be used only
- * as a relfilenumber for an existing relation, pass NULL for pg_class.
- *
- * As with GetNewOidWithIndex(), there is some theoretical risk of a race
- * condition, but it doesn't seem worth worrying about.
- *
- * Note: we don't support using this in bootstrap mode. All relations
- * created by bootstrap have preassigned OIDs, so there's no need.
+ * The relfilenumber is 56 bits wide, so we expect it to be unique within
+ * the cluster; if a file with that number already exists, report an error.
*/
RelFileNumber
-GetNewRelFileNumber(Oid reltablespace, Relation pg_class, char relpersistence)
+GetNewRelFileNumber(Oid reltablespace, char relpersistence)
{
RelFileLocatorBackend rlocator;
char *rpath;
- bool collides;
BackendId backend;
/*
@@ -535,40 +525,13 @@ GetNewRelFileNumber(Oid reltablespace, Relation pg_class, char relpersistence)
* are properly detected.
*/
rlocator.backend = backend;
+ rlocator.locator.relNumber = GenerateNewRelFileNumber();
- do
- {
- CHECK_FOR_INTERRUPTS();
-
- /* Generate the OID */
- if (pg_class)
- rlocator.locator.relNumber = GetNewOidWithIndex(pg_class, ClassOidIndexId,
- Anum_pg_class_oid);
- else
- rlocator.locator.relNumber = GetNewObjectId();
-
- /* Check for existing file of same name */
- rpath = relpath(rlocator, MAIN_FORKNUM);
+ /* Check for existing file of same name */
+ rpath = relpath(rlocator, MAIN_FORKNUM);
- if (access(rpath, F_OK) == 0)
- {
- /* definite collision */
- collides = true;
- }
- else
- {
- /*
- * Here we have a little bit of a dilemma: if errno is something
- * other than ENOENT, should we declare a collision and loop? In
- * practice it seems best to go ahead regardless of the errno. If
- * there is a colliding file we will get an smgr failure when we
- * attempt to create the new relation file.
- */
- collides = false;
- }
-
- pfree(rpath);
- } while (collides);
+ if (access(rpath, F_OK) == 0)
+ elog(ERROR, "new relfilenumber file already exists: \"%s\"", rpath);
return rlocator.locator.relNumber;
}
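For context on why a plain error (rather than the old retry loop) is defensible here: a 56-bit counter cannot plausibly be exhausted, so a collision indicates corruption rather than wraparound. A back-of-envelope check, as a standalone sketch (the helper name and allocation rates are illustrative, not from the patch):

```c
#include <assert.h>

/* Hypothetical helper: years until a 56-bit counter runs out at a
 * given allocation rate (relfilenumbers per second). */
static double years_to_exhaust_56bit(double allocs_per_sec)
{
    const double max = 72057594037927935.0;     /* 2^56 - 1 */
    return max / allocs_per_sec / (365.25 * 24 * 3600);
}
```

Even at a million allocations per second the counter lasts over two thousand years, so no wraparound handling is needed.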
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index c69c923..f6ef93f 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -347,7 +347,7 @@ heap_create(const char *relname,
* with oid same as relid.
*/
if (!RelFileNumberIsValid(relfilenumber))
- relfilenumber = relid;
+ relfilenumber = GetNewRelFileNumber(reltablespace, relpersistence);
}
/*
@@ -900,7 +900,7 @@ InsertPgClassTuple(Relation pg_class_desc,
values[Anum_pg_class_reloftype - 1] = ObjectIdGetDatum(rd_rel->reloftype);
values[Anum_pg_class_relowner - 1] = ObjectIdGetDatum(rd_rel->relowner);
values[Anum_pg_class_relam - 1] = ObjectIdGetDatum(rd_rel->relam);
- values[Anum_pg_class_relfilenode - 1] = ObjectIdGetDatum(rd_rel->relfilenode);
+ values[Anum_pg_class_relfilenode - 1] = Int64GetDatum(rd_rel->relfilenode);
values[Anum_pg_class_reltablespace - 1] = ObjectIdGetDatum(rd_rel->reltablespace);
values[Anum_pg_class_relpages - 1] = Int32GetDatum(rd_rel->relpages);
values[Anum_pg_class_reltuples - 1] = Float4GetDatum(rd_rel->reltuples);
@@ -1231,8 +1231,8 @@ heap_create_with_catalog(const char *relname,
}
if (!OidIsValid(relid))
- relid = GetNewRelFileNumber(reltablespace, pg_class_desc,
- relpersistence);
+ relid = GetNewOidWithIndex(pg_class_desc, ClassOidIndexId,
+ Anum_pg_class_oid);
}
/*
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index f245df8..46b914b 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -937,8 +937,8 @@ index_create(Relation heapRelation,
}
else
{
- indexRelationId =
- GetNewRelFileNumber(tableSpaceId, pg_class, relpersistence);
+ indexRelationId = GetNewOidWithIndex(pg_class, ClassOidIndexId,
+ Anum_pg_class_oid);
}
}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index bf645b8..54eebdf 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14371,10 +14371,13 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
}
/*
- * Relfilenumbers are not unique in databases across tablespaces, so we need
- * to allocate a new one in the new tablespace.
+ * Generate a new relfilenumber. Although relfilenumbers are unique within
+ * a cluster, we cannot reuse the old relfilenumber here, because the old
+ * file is not unlinked until commit; if the relation were moved back to
+ * the old tablespace within the same transaction, the relfilenumber file
+ * would conflict.
*/
- newrelfilenumber = GetNewRelFileNumber(newTableSpace, NULL,
+ newrelfilenumber = GetNewRelFileNumber(newTableSpace,
rel->rd_rel->relpersistence);
/* Open old and new relation */
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 3724d48..3f2618a 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2928,7 +2928,7 @@ _outIndexStmt(StringInfo str, const IndexStmt *node)
WRITE_NODE_FIELD(excludeOpNames);
WRITE_STRING_FIELD(idxcomment);
WRITE_OID_FIELD(indexOid);
- WRITE_OID_FIELD(oldNumber);
+ WRITE_UINT64_FIELD(oldNumber);
WRITE_UINT_FIELD(oldCreateSubid);
WRITE_UINT_FIELD(oldFirstRelfilenumberSubid);
WRITE_BOOL_FIELD(unique);
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index c5c6a2b..7029604 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -154,6 +154,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
break;
case XLOG_NOOP:
case XLOG_NEXTOID:
+ case XLOG_NEXT_RELFILENUMBER:
case XLOG_SWITCH:
case XLOG_BACKUP_END:
case XLOG_PARAMETER_CHANGE:
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index f8fb228..4366ae6 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -4869,7 +4869,7 @@ DisplayMapping(HTAB *tuplecid_data)
hash_seq_init(&hstat, tuplecid_data);
while ((ent = (ReorderBufferTupleCidEnt *) hash_seq_search(&hstat)) != NULL)
{
- elog(DEBUG3, "mapping: node: %u/%u/%u tid: %u/%u cmin: %u, cmax: %u",
+ elog(DEBUG3, "mapping: node: %u/%u/" INT64_FORMAT " tid: %u/%u cmin: %u, cmax: %u",
ent->key.rlocator.dbOid,
ent->key.rlocator.spcOid,
ent->key.rlocator.relNumber,
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index af4dab7..172225b 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -273,7 +273,7 @@ restart:
BlockNumber blknum;
BufferGetTag(buf, &rlocator, &forknum, &blknum);
- elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
+ elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/" INT64_FORMAT,
blknum, rlocator.spcOid, rlocator.dbOid, rlocator.relNumber);
/* make sure we hold an exclusive lock */
diff --git a/src/backend/storage/lmgr/lwlocknames.txt b/src/backend/storage/lmgr/lwlocknames.txt
index 6c7cf6c..b64dbe7 100644
--- a/src/backend/storage/lmgr/lwlocknames.txt
+++ b/src/backend/storage/lmgr/lwlocknames.txt
@@ -53,3 +53,4 @@ XactTruncationLock 44
# 45 was XactTruncationLock until removal of BackendRandomLock
WrapLimitsVacuumLock 46
NotifyQueueTailLock 47
+RelFileNumberGenLock 48
\ No newline at end of file
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index b21d8c3..5f6c12a 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -154,7 +154,7 @@ smgropen(RelFileLocator rlocator, BackendId backend)
/* First time through: initialize the hash table */
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileLocatorBackend);
+ ctl.keysize = SizeOfRelFileLocatorBackend;
ctl.entrysize = sizeof(SMgrRelationData);
SMgrRelationHash = hash_create("smgr relation table", 400,
&ctl, HASH_ELEM | HASH_BLOBS);
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index d8ae082..5bbd847 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -878,7 +878,7 @@ pg_relation_filenode(PG_FUNCTION_ARGS)
if (!RelFileNumberIsValid(result))
PG_RETURN_NULL();
- PG_RETURN_OID(result);
+ PG_RETURN_INT64(result);
}
/*
@@ -898,7 +898,7 @@ Datum
pg_filenode_relation(PG_FUNCTION_ARGS)
{
Oid reltablespace = PG_GETARG_OID(0);
- RelFileNumber relfilenumber = PG_GETARG_OID(1);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(1);
Oid heaprel;
/* test needed so RelidByRelfilenumber doesn't misbehave */
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 4408c00..f5b6d41 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -98,10 +98,11 @@ binary_upgrade_set_next_heap_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_heap_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
binary_upgrade_next_heap_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -120,10 +121,11 @@ binary_upgrade_set_next_index_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_index_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
binary_upgrade_next_index_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -142,10 +144,11 @@ binary_upgrade_set_next_toast_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_toast_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
binary_upgrade_next_toast_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 9bab6af..c3c7e9d 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -3630,7 +3630,7 @@ RelationBuildLocalRelation(const char *relname,
if (mapped_relation)
{
- rel->rd_rel->relfilenode = InvalidOid;
+ rel->rd_rel->relfilenode = InvalidRelFileNumber;
/* Add it to the active mapping information */
RelationMapUpdateMap(relid, relfilenumber, shared_relation, true);
}
@@ -3709,7 +3709,7 @@ RelationSetNewRelfilenumber(Relation relation, char persistence)
/* Allocate a new relfilenumber */
newrelfilenumber = GetNewRelFileNumber(relation->rd_rel->reltablespace,
- NULL, persistence);
+ persistence);
/*
* Get a writable copy of the pg_class tuple for the given relation.
diff --git a/src/backend/utils/cache/relfilenumbermap.c b/src/backend/utils/cache/relfilenumbermap.c
index 3dc45e9..a5ec78c 100644
--- a/src/backend/utils/cache/relfilenumbermap.c
+++ b/src/backend/utils/cache/relfilenumbermap.c
@@ -196,7 +196,7 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
/* set scan arguments */
skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
- skey[1].sk_argument = ObjectIdGetDatum(relfilenumber);
+ skey[1].sk_argument = Int64GetDatum(relfilenumber);
scandesc = systable_beginscan(relation,
ClassTblspcRelfilenodeIndexId,
@@ -213,7 +213,7 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
if (found)
elog(ERROR,
- "unexpected duplicate for tablespace %u, relfilenumber %u",
+ "unexpected duplicate for tablespace %u, relfilenumber " INT64_FORMAT,
reltablespace, relfilenumber);
found = true;
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index 781f8b8..3c1fef4 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -79,8 +79,8 @@ pg_control_system(PG_FUNCTION_ARGS)
Datum
pg_control_checkpoint(PG_FUNCTION_ARGS)
{
- Datum values[18];
- bool nulls[18];
+ Datum values[19];
+ bool nulls[19];
TupleDesc tupdesc;
HeapTuple htup;
ControlFileData *ControlFile;
@@ -129,6 +129,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
XIDOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 18, "checkpoint_time",
TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 19, "next_relfilenumber",
+ INT8OID, -1, 0);
tupdesc = BlessTupleDesc(tupdesc);
/* Read the control file. */
@@ -202,6 +204,9 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
values[17] = TimestampTzGetDatum(time_t_to_timestamptz(ControlFile->checkPointCopy.time));
nulls[17] = false;
+ values[18] = Int64GetDatum(ControlFile->checkPointCopy.nextRelFileNumber);
+ nulls[18] = false;
+
htup = heap_form_tuple(tupdesc, values, nulls);
PG_RETURN_DATUM(HeapTupleGetDatum(htup));
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 21dfe1b..65fc623 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -489,9 +489,9 @@ main(int argc, char *argv[])
mode = PG_MODE_ENABLE;
break;
case 'f':
- if (!option_parse_int(optarg, "-f/--filenode", 0,
- INT_MAX,
- NULL))
+ if (!option_parse_int64(optarg, "-f/--filenode", 0,
+ LLONG_MAX,
+ NULL))
exit(1);
only_filenode = pstrdup(optarg);
break;
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index c390ec5..f727078 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -250,6 +250,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
EpochFromFullTransactionId(ControlFile->checkPointCopy.nextXid),
XidFromFullTransactionId(ControlFile->checkPointCopy.nextXid));
+ printf(_("Latest checkpoint's NextRelFileNumber: " INT64_FORMAT "\n"),
+ ControlFile->checkPointCopy.nextRelFileNumber);
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 30b2f85..2d70833 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4837,16 +4837,16 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
relkind = *PQgetvalue(upgrade_res, 0, PQfnumber(upgrade_res, "relkind"));
- relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "relfilenode")));
+ relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "relfilenode")));
toast_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "reltoastrelid")));
- toast_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_relfilenode")));
+ toast_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_relfilenode")));
toast_index_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "indexrelid")));
- toast_index_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_index_relfilenode")));
+ toast_index_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_index_relfilenode")));
appendPQExpBufferStr(upgrade_buffer,
"\n-- For binary upgrade, must preserve pg_class oids and relfilenodes\n");
@@ -4864,7 +4864,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
*/
if (RelFileNumberIsValid(relfilenumber) && relkind != RELKIND_PARTITIONED_TABLE)
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
/*
@@ -4878,7 +4878,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_toast_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_relfilenumber);
/* every toast table has an index */
@@ -4886,7 +4886,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_index_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_index_relfilenumber);
}
@@ -4899,7 +4899,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
pg_class_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
}
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 269ed64..8be5e66 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -538,7 +538,7 @@ isRelDataFile(const char *path)
segNo = 0;
matched = false;
- nmatch = sscanf(path, "global/%u.%u", &rlocator.relNumber, &segNo);
+ nmatch = sscanf(path, "global/" INT64_FORMAT ".%u", &rlocator.relNumber, &segNo);
if (nmatch == 1 || nmatch == 2)
{
rlocator.spcOid = GLOBALTABLESPACE_OID;
@@ -547,7 +547,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "base/%u/%u.%u",
+ nmatch = sscanf(path, "base/%u/" INT64_FORMAT ".%u",
&rlocator.dbOid, &rlocator.relNumber, &segNo);
if (nmatch == 2 || nmatch == 3)
{
@@ -556,7 +556,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/%u.%u",
+ nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/" INT64_FORMAT ".%u",
&rlocator.spcOid, &rlocator.dbOid, &rlocator.relNumber,
&segNo);
if (nmatch == 3 || nmatch == 4)
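The pg_rewind changes above widen the relNumber field parsed out of on-disk paths to 64 bits. A minimal standalone sketch of the "base/" case, using the standard SCNu64 conversion from <inttypes.h> where the backend uses its own INT64_FORMAT macro (the function name is illustrative, not from the patch):

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Sketch: parse "base/<dbOid>/<relNumber>[.<segNo>]" with a 64-bit
 * relNumber; returns the number of fields matched, as sscanf does. */
static int parse_base_path(const char *path, unsigned *dbOid,
                           uint64_t *relNumber, unsigned *segNo)
{
    *segNo = 0;                 /* segment 0 has no ".<segNo>" suffix */
    return sscanf(path, "base/%u/%" SCNu64 ".%u", dbOid, relNumber, segNo);
}
```

As in isRelDataFile(), a return of 2 means an unsegmented main file and 3 means a higher segment.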
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 5d30b87..ea62e7d 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -399,11 +399,11 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
i_reloid,
i_indtable,
i_toastheap,
- i_relfilenumber,
i_reltablespace;
- char query[QUERY_ALLOC];
- char *last_namespace = NULL,
- *last_tablespace = NULL;
+ RelFileNumber i_relfilenumber;
+ char query[QUERY_ALLOC];
+ char *last_namespace = NULL,
+ *last_tablespace = NULL;
query[0] = '\0'; /* initialize query string to empty */
@@ -527,7 +527,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
relname = PQgetvalue(res, relnum, i_relname);
curr->relname = pg_strdup(relname);
- curr->relfilenumber = atooid(PQgetvalue(res, relnum, i_relfilenumber));
+ curr->relfilenumber =
+ atorelnumber(PQgetvalue(res, relnum, i_relfilenumber));
curr->tblsp_alloc = false;
/* Is the tablespace oid non-default? */
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index 265d829..4c4f03a 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -15,10 +15,8 @@
* oids are the same between old and new clusters. This is important
* because toast oids are stored as toast pointers in user tables.
*
- * While pg_class.oid and pg_class.relfilenode are initially the same in a
- * cluster, they can diverge due to CLUSTER, REINDEX, or VACUUM FULL. We
- * control assignments of pg_class.relfilenode because we want the filenames
- * to match between the old and new cluster.
+ * We control assignments of pg_class.relfilenode because we want the
+ * filenames to match between the old and new cluster.
*
* We control assignment of pg_tablespace.oid because we want the oid to match
* between the old and new cluster.
diff --git a/src/bin/pg_upgrade/relfilenumber.c b/src/bin/pg_upgrade/relfilenumber.c
index b3ad820..50e94df 100644
--- a/src/bin/pg_upgrade/relfilenumber.c
+++ b/src/bin/pg_upgrade/relfilenumber.c
@@ -190,14 +190,14 @@ transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_fro
else
snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
- snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
+ snprintf(old_file, sizeof(old_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->old_tablespace,
map->old_tablespace_suffix,
map->db_oid,
map->relfilenumber,
type_suffix,
extent_suffix);
- snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
+ snprintf(new_file, sizeof(new_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->new_tablespace,
map->new_tablespace_suffix,
map->db_oid,
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 0fdde9d..e5b0b50 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -884,7 +884,7 @@ main(int argc, char **argv)
}
break;
case 'R':
- if (sscanf(optarg, "%u/%u/%u",
+ if (sscanf(optarg, "%u/%u/" INT64_FORMAT,
&config.filter_by_relation.spcOid,
&config.filter_by_relation.dbOid,
&config.filter_by_relation.relNumber) != 3 ||
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 1b6b620..0774e3f 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -149,10 +149,10 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
Assert(dbOid == 0);
Assert(backendId == InvalidBackendId);
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("global/%u_%s",
+ path = psprintf("global/" INT64_FORMAT "_%s",
relNumber, forkNames[forkNumber]);
else
- path = psprintf("global/%u", relNumber);
+ path = psprintf("global/" INT64_FORMAT, relNumber);
}
else if (spcOid == DEFAULTTABLESPACE_OID)
{
@@ -160,21 +160,21 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/%u_%s",
+ path = psprintf("base/%u/" INT64_FORMAT "_%s",
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/%u",
+ path = psprintf("base/%u/" INT64_FORMAT,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/t%d_%u_%s",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT "_%s",
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/t%d_%u",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT,
dbOid, backendId, relNumber);
}
}
@@ -184,24 +184,24 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber);
}
diff --git a/src/fe_utils/option_utils.c b/src/fe_utils/option_utils.c
index abea881..2cb3370 100644
--- a/src/fe_utils/option_utils.c
+++ b/src/fe_utils/option_utils.c
@@ -82,3 +82,45 @@ option_parse_int(const char *optarg, const char *optname,
*result = val;
return true;
}
+
+/*
+ * option_parse_int64
+ *
+ * Same as option_parse_int but parse int64.
+ */
+bool
+option_parse_int64(const char *optarg, const char *optname,
+ int64 min_range, int64 max_range,
+ int64 *result)
+{
+ char *endptr;
+ int64 val;
+
+ errno = 0;
+ val = strtoi64(optarg, &endptr, 10);
+
+ /*
+ * Skip any trailing whitespace; if anything but whitespace remains before
+ * the terminating character, fail.
+ */
+ while (*endptr != '\0' && isspace((unsigned char) *endptr))
+ endptr++;
+
+ if (*endptr != '\0')
+ {
+ pg_log_error("invalid value \"%s\" for option %s",
+ optarg, optname);
+ return false;
+ }
+
+ if (errno == ERANGE || val < min_range || val > max_range)
+ {
+ pg_log_error("%s must be in range " INT64_FORMAT ".." INT64_FORMAT,
+ optname, min_range, max_range);
+ return false;
+ }
+
+ if (result)
+ *result = val;
+ return true;
+}
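The essence of option_parse_int64() above is: parse with a 64-bit strtol variant, permit only trailing whitespace, then range-check. A standalone sketch of that logic using standard strtoll in place of the backend's strtoi64, and returning false instead of logging (the function name is illustrative):

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Sketch of the parse-and-range-check logic of option_parse_int64. */
static int parse_int64_opt(const char *arg, int64_t min_range,
                           int64_t max_range, int64_t *result)
{
    char       *endptr;
    long long   val;

    errno = 0;
    val = strtoll(arg, &endptr, 10);

    /* skip trailing whitespace; anything else is a parse failure */
    while (*endptr != '\0' && isspace((unsigned char) *endptr))
        endptr++;
    if (*endptr != '\0')
        return 0;

    if (errno == ERANGE || val < min_range || val > max_range)
        return 0;

    *result = val;
    return 1;
}
```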
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 775471d..e576091 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -213,6 +213,9 @@ typedef struct VariableCacheData
*/
Oid nextOid; /* next OID to assign */
uint32 oidCount; /* OIDs available before must do XLOG work */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber to assign */
+ uint32 relnumbercount; /* relfilenumbers available before must do
+ XLOG work */
/*
* These fields are protected by XidGenLock.
@@ -293,6 +296,8 @@ extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
extern bool ForceTransactionIdLimitUpdate(void);
extern Oid GetNewObjectId(void);
+extern RelFileNumber GenerateNewRelFileNumber(void);
+extern void SetNextRelFileNumber(RelFileNumber relnumber);
extern void StopGeneratingPinnedObjectIds(void);
#ifdef USE_ASSERT_CHECKING
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index cd674c3..4cae54b 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -234,6 +234,7 @@ extern void CreateCheckPoint(int flags);
extern bool CreateRestartPoint(int flags);
extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN);
extern void XLogPutNextOid(Oid nextOid);
+extern void XLogPutNextRelFileNumber(RelFileNumber nextrelnumber);
extern XLogRecPtr XLogRestorePoint(const char *rpName);
extern void UpdateFullPageWrites(void);
extern void GetFullPageWriteInfo(XLogRecPtr *RedoRecPtr_p, bool *doPageWrites_p);
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index 66900f1..eefee54 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -39,7 +39,6 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
- Relation pg_class,
char relpersistence);
#endif /* CATALOG_H */
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index e1f4eef..1cf039c 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -31,6 +31,10 @@
*/
CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,RelationRelation_Rowtype_Id) BKI_SCHEMA_MACRO
{
+ /* identifier of physical storage file */
+ /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
+ int64 relfilenode BKI_DEFAULT(0);
+
/* oid */
Oid oid;
@@ -52,10 +56,6 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* access method; 0 if not a table / index */
Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
- /* identifier of physical storage file */
- /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
- Oid relfilenode BKI_DEFAULT(0);
-
/* identifier of table space for relation (0 means default for database) */
Oid reltablespace BKI_DEFAULT(0) BKI_LOOKUP_OPT(pg_tablespace);
@@ -154,7 +154,7 @@ typedef FormData_pg_class *Form_pg_class;
DECLARE_UNIQUE_INDEX_PKEY(pg_class_oid_index, 2662, ClassOidIndexId, on pg_class using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_class_relname_nsp_index, 2663, ClassNameNspIndexId, on pg_class using btree(relname name_ops, relnamespace oid_ops));
-DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode oid_ops));
+DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode int8_ops));
#ifdef EXPOSE_TO_CLIENT_CODE
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 06368e2..d5e6172 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -41,6 +41,7 @@ typedef struct CheckPoint
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
FullTransactionId nextXid; /* next free transaction ID */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
@@ -78,6 +79,7 @@ typedef struct CheckPoint
#define XLOG_FPI 0xB0
/* 0xC0 is used in Postgres 9.5-11 */
#define XLOG_OVERWRITE_CONTRECORD 0xD0
+#define XLOG_NEXT_RELFILENUMBER 0xE0
/*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 87aa571..0df6411 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7321,11 +7321,11 @@
proname => 'pg_indexes_size', provolatile => 'v', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_indexes_size' },
{ oid => '2999', descr => 'filenode identifier of relation',
- proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'oid',
+ proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_relation_filenode' },
{ oid => '3454', descr => 'relation OID for filenode and tablespace',
proname => 'pg_filenode_relation', provolatile => 's',
- prorettype => 'regclass', proargtypes => 'oid oid',
+ prorettype => 'regclass', proargtypes => 'oid int8',
prosrc => 'pg_filenode_relation' },
{ oid => '3034', descr => 'file path of relation',
proname => 'pg_relation_filepath', provolatile => 's', prorettype => 'text',
@@ -11191,15 +11191,15 @@
prosrc => 'binary_upgrade_set_missing_value' },
{ oid => '4545', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_heap_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_heap_relfilenode' },
{ oid => '4546', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_index_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_index_relfilenode' },
{ oid => '4547', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_toast_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_toast_relfilenode' },
{ oid => '4548', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_pg_tablespace_oid', provolatile => 'v',
diff --git a/src/include/fe_utils/option_utils.h b/src/include/fe_utils/option_utils.h
index 03c09fd..8c0e818 100644
--- a/src/include/fe_utils/option_utils.h
+++ b/src/include/fe_utils/option_utils.h
@@ -22,5 +22,8 @@ extern void handle_help_version_opts(int argc, char *argv[],
extern bool option_parse_int(const char *optarg, const char *optname,
int min_range, int max_range,
int *result);
+extern bool option_parse_int64(const char *optarg, const char *optname,
+ int64 min_range, int64 max_range,
+ int64 *result);
#endif /* OPTION_UTILS_H */
diff --git a/src/include/postgres_ext.h b/src/include/postgres_ext.h
index d8af68b..51b9d49 100644
--- a/src/include/postgres_ext.h
+++ b/src/include/postgres_ext.h
@@ -48,11 +48,14 @@ typedef PG_INT64_TYPE pg_int64;
/*
* RelFileNumber data type identifies the specific relation file name.
+ * RelFileNumber is unique within a cluster.
*/
-typedef Oid RelFileNumber;
-#define InvalidRelFileNumber ((RelFileNumber) InvalidOid)
+typedef pg_int64 RelFileNumber;
+#define InvalidRelFileNumber ((RelFileNumber) 0)
+#define FirstNormalRelFileNumber ((RelFileNumber) 1)
#define RelFileNumberIsValid(relnumber) \
((bool) ((relnumber) != InvalidRelFileNumber))
+#define atorelnumber(x) ((RelFileNumber) strtoull((x), NULL, 10))
/*
* Identifiers of error message fields. Kept here to keep common
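One subtlety of atorelnumber(): strtoul returns unsigned long, which is only 32 bits on ILP32 and 64-bit Windows platforms and would silently truncate a large relfilenumber, so a 64-bit parse (strtoull) is needed. A quick standalone check (the helper name is illustrative, not from the patch):

```c
#include <stdint.h>
#include <stdlib.h>

/* Sketch of atorelnumber: parse a relfilenumber with a conversion
 * that is 64 bits wide on every platform. */
static uint64_t ato_relnumber(const char *s)
{
    return (uint64_t) strtoull(s, NULL, 10);
}
```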
diff --git a/src/include/storage/relfilelocator.h b/src/include/storage/relfilelocator.h
index 7211fe7..6046506 100644
--- a/src/include/storage/relfilelocator.h
+++ b/src/include/storage/relfilelocator.h
@@ -34,8 +34,7 @@
* relNumber identifies the specific relation. relNumber corresponds to
* pg_class.relfilenode (NOT pg_class.oid, because we need to be able
* to assign new physical files to relations in some situations).
- * Notice that relNumber is only unique within a database in a particular
- * tablespace.
+ * Notice that relNumber is unique within a cluster.
*
* Note: spcOid must be GLOBALTABLESPACE_OID if and only if dbOid is
* zero. We support shared relations only in the "global" tablespace.
@@ -75,6 +74,15 @@ typedef struct RelFileLocatorBackend
BackendId backend;
} RelFileLocatorBackend;
+#define SizeOfRelFileLocatorBackend \
+ (offsetof(RelFileLocatorBackend, backend) + sizeof(BackendId))
+
+/*
+ * Maximum value of the relfilenumber. RelFileNumber is 56 bits wide; for
+ * more details, see the comments atop BufferTag.
+ */
+#define MAX_RELFILENUMBER INT64CONST(0x00FFFFFFFFFFFFFF)
+
#define RelFileLocatorBackendIsTemp(rlocator) \
((rlocator).backend != InvalidBackendId)
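The reason for capping RelFileNumber at 56 bits, per the comment above, is so that it can share a single 64-bit word with another small field in BufferTag. A sketch of that style of packing, under the assumption that the remaining 8 bits hold a fork number (the function names are hypothetical, not from the patch):

```c
#include <stdint.h>

#define MAX_RELFILENUMBER ((uint64_t) 0x00FFFFFFFFFFFFFF)

/* Pack a 56-bit relNumber and an 8-bit fork number into one uint64. */
static uint64_t pack_relnum_fork(uint64_t relNumber, uint8_t forkNum)
{
    return (relNumber & MAX_RELFILENUMBER) | ((uint64_t) forkNum << 56);
}

static uint64_t unpack_relnum(uint64_t packed)
{
    return packed & MAX_RELFILENUMBER;
}

static uint8_t unpack_fork(uint64_t packed)
{
    return (uint8_t) (packed >> 56);
}
```

This keeps BufferTag the same size as before while still widening the relfilenumber well past 32 bits.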
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index 5ede56d..80af97e 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -2164,7 +2164,6 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
else 'OTHER'
end as storage,
@@ -2175,10 +2174,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+---------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | t | own | child 0 index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | t | own | child 1 index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | t | orig | child 0 index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | t | orig | child 1 index
at_partitioned_id_name_key | t | none | parent index
(6 rows)
@@ -2198,7 +2197,6 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
else 'OTHER'
end as storage,
@@ -2209,10 +2207,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+--------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | f | own | parent index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | f | own | parent index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | f | OTHER | parent index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | f | OTHER | parent index
at_partitioned_id_name_key | f | none | parent index
(6 rows)
@@ -2556,7 +2554,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql
index 52001e3..c0b7d39 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -1478,7 +1478,6 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
else 'OTHER'
end as storage,
@@ -1499,7 +1498,6 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
else 'OTHER'
end as storage,
@@ -1638,7 +1636,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
--
1.8.3.1
v3-0001-Rename-RelFileNode-to-RelFileLocator-and-relNode-.patch (application/x-patch)
From ab91baece490b02f14bd54eb5b05223ae29012e7 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Tue, 21 Jun 2022 14:04:01 +0530
Subject: [PATCH v3 1/4] Rename RelFileNode to RelFileLocator and relNode to
RelNumber
Currently, the way relfilenode and relnode are used is confusing.
There is some precedent for calling the number that pertains to the
file on disk "relnode", and that value combined with the database and
tablespace OIDs "relfilenode", but it is not the most obvious naming,
and the terminology is not used uniformly.
As part of this patchset, these variables are renamed to better match
their usage: RelFileNode becomes RelFileLocator, and all related
variable declarations change from relfilenode to relfilelocator.
Within RelFileLocator, relNode is renamed to relNumber, and dbNode and
spcNode become dbOid and spcOid. All other references to
relnode/relfilenode that concern the on-disk file are likewise renamed
to relnumber/relfilenumber.
---
contrib/bloom/blinsert.c | 2 +-
contrib/oid2name/oid2name.c | 28 +--
contrib/pg_buffercache/pg_buffercache_pages.c | 10 +-
contrib/pg_prewarm/autoprewarm.c | 26 +--
contrib/pg_visibility/pg_visibility.c | 2 +-
src/backend/access/common/syncscan.c | 29 +--
src/backend/access/gin/ginbtree.c | 2 +-
src/backend/access/gin/ginfast.c | 2 +-
src/backend/access/gin/ginutil.c | 2 +-
src/backend/access/gin/ginxlog.c | 6 +-
src/backend/access/gist/gistbuild.c | 4 +-
src/backend/access/gist/gistxlog.c | 11 +-
src/backend/access/hash/hash_xlog.c | 6 +-
src/backend/access/hash/hashpage.c | 4 +-
src/backend/access/heap/heapam.c | 78 +++----
src/backend/access/heap/heapam_handler.c | 26 +--
src/backend/access/heap/rewriteheap.c | 10 +-
src/backend/access/heap/visibilitymap.c | 4 +-
src/backend/access/nbtree/nbtpage.c | 2 +-
src/backend/access/nbtree/nbtree.c | 2 +-
src/backend/access/nbtree/nbtsort.c | 2 +-
src/backend/access/nbtree/nbtxlog.c | 8 +-
src/backend/access/rmgrdesc/genericdesc.c | 2 +-
src/backend/access/rmgrdesc/gindesc.c | 2 +-
src/backend/access/rmgrdesc/gistdesc.c | 6 +-
src/backend/access/rmgrdesc/heapdesc.c | 6 +-
src/backend/access/rmgrdesc/nbtdesc.c | 4 +-
src/backend/access/rmgrdesc/seqdesc.c | 4 +-
src/backend/access/rmgrdesc/smgrdesc.c | 4 +-
src/backend/access/rmgrdesc/xactdesc.c | 44 ++--
src/backend/access/rmgrdesc/xlogdesc.c | 10 +-
src/backend/access/spgist/spginsert.c | 6 +-
src/backend/access/spgist/spgxlog.c | 6 +-
src/backend/access/table/tableamapi.c | 2 +-
src/backend/access/transam/README | 14 +-
src/backend/access/transam/README.parallel | 2 +-
src/backend/access/transam/twophase.c | 38 ++--
src/backend/access/transam/varsup.c | 2 +-
src/backend/access/transam/xact.c | 40 ++--
src/backend/access/transam/xloginsert.c | 38 ++--
src/backend/access/transam/xlogprefetcher.c | 96 ++++----
src/backend/access/transam/xlogreader.c | 25 ++-
src/backend/access/transam/xlogrecovery.c | 18 +-
src/backend/access/transam/xlogutils.c | 73 +++---
src/backend/bootstrap/bootparse.y | 8 +-
src/backend/catalog/catalog.c | 30 +--
src/backend/catalog/heap.c | 56 ++---
src/backend/catalog/index.c | 37 +--
src/backend/catalog/storage.c | 119 +++++-----
src/backend/commands/cluster.c | 46 ++--
src/backend/commands/copyfrom.c | 6 +-
src/backend/commands/dbcommands.c | 104 ++++-----
src/backend/commands/indexcmds.c | 14 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/sequence.c | 29 +--
src/backend/commands/tablecmds.c | 87 ++++----
src/backend/commands/tablespace.c | 18 +-
src/backend/nodes/copyfuncs.c | 4 +-
src/backend/nodes/equalfuncs.c | 4 +-
src/backend/nodes/outfuncs.c | 4 +-
src/backend/parser/gram.y | 8 +-
src/backend/parser/parse_utilcmd.c | 8 +-
src/backend/postmaster/checkpointer.c | 2 +-
src/backend/replication/logical/decode.c | 40 ++--
src/backend/replication/logical/reorderbuffer.c | 50 ++---
src/backend/replication/logical/snapbuild.c | 2 +-
src/backend/storage/buffer/bufmgr.c | 284 ++++++++++++------------
src/backend/storage/buffer/localbuf.c | 34 +--
src/backend/storage/freespace/freespace.c | 6 +-
src/backend/storage/freespace/fsmpage.c | 6 +-
src/backend/storage/ipc/standby.c | 8 +-
src/backend/storage/lmgr/predicate.c | 24 +-
src/backend/storage/smgr/README | 2 +-
src/backend/storage/smgr/md.c | 126 +++++------
src/backend/storage/smgr/smgr.c | 44 ++--
src/backend/utils/adt/dbsize.c | 64 +++---
src/backend/utils/adt/pg_upgrade_support.c | 14 +-
src/backend/utils/cache/Makefile | 2 +-
src/backend/utils/cache/inval.c | 16 +-
src/backend/utils/cache/relcache.c | 180 +++++++--------
src/backend/utils/cache/relfilenodemap.c | 244 --------------------
src/backend/utils/cache/relfilenumbermap.c | 244 ++++++++++++++++++++
src/backend/utils/cache/relmapper.c | 85 +++----
src/bin/pg_dump/pg_dump.c | 36 +--
src/bin/pg_rewind/datapagemap.h | 2 +-
src/bin/pg_rewind/filemap.c | 34 +--
src/bin/pg_rewind/filemap.h | 4 +-
src/bin/pg_rewind/parsexlog.c | 10 +-
src/bin/pg_rewind/pg_rewind.h | 2 +-
src/bin/pg_upgrade/Makefile | 2 +-
src/bin/pg_upgrade/info.c | 10 +-
src/bin/pg_upgrade/pg_upgrade.h | 22 +-
src/bin/pg_upgrade/relfilenode.c | 259 ---------------------
src/bin/pg_upgrade/relfilenumber.c | 259 +++++++++++++++++++++
src/bin/pg_waldump/pg_waldump.c | 26 +--
src/common/relpath.c | 48 ++--
src/include/access/brin_xlog.h | 2 +-
src/include/access/ginxlog.h | 4 +-
src/include/access/gistxlog.h | 2 +-
src/include/access/heapam_xlog.h | 8 +-
src/include/access/nbtxlog.h | 4 +-
src/include/access/rewriteheap.h | 6 +-
src/include/access/tableam.h | 59 ++---
src/include/access/xact.h | 26 +--
src/include/access/xlog_internal.h | 2 +-
src/include/access/xloginsert.h | 8 +-
src/include/access/xlogreader.h | 6 +-
src/include/access/xlogrecord.h | 8 +-
src/include/access/xlogutils.h | 8 +-
src/include/catalog/binary_upgrade.h | 6 +-
src/include/catalog/catalog.h | 5 +-
src/include/catalog/heap.h | 2 +-
src/include/catalog/index.h | 2 +-
src/include/catalog/storage.h | 10 +-
src/include/catalog/storage_xlog.h | 8 +-
src/include/commands/sequence.h | 4 +-
src/include/commands/tablecmds.h | 2 +-
src/include/commands/tablespace.h | 2 +-
src/include/common/relpath.h | 24 +-
src/include/nodes/parsenodes.h | 8 +-
src/include/postgres_ext.h | 7 +
src/include/postmaster/bgwriter.h | 2 +-
src/include/replication/reorderbuffer.h | 6 +-
src/include/storage/buf_internals.h | 28 +--
src/include/storage/bufmgr.h | 16 +-
src/include/storage/freespace.h | 4 +-
src/include/storage/md.h | 6 +-
src/include/storage/relfilelocator.h | 99 +++++++++
src/include/storage/relfilenode.h | 99 ---------
src/include/storage/sinval.h | 4 +-
src/include/storage/smgr.h | 12 +-
src/include/storage/standby.h | 6 +-
src/include/storage/sync.h | 4 +-
src/include/utils/inval.h | 4 +-
src/include/utils/rel.h | 46 ++--
src/include/utils/relcache.h | 8 +-
src/include/utils/relfilenodemap.h | 18 --
src/include/utils/relfilenumbermap.h | 19 ++
src/include/utils/relmapper.h | 13 +-
src/test/recovery/t/018_wal_optimize.pl | 2 +-
src/tools/pgindent/typedefs.list | 10 +-
141 files changed, 2070 insertions(+), 2042 deletions(-)
delete mode 100644 src/backend/utils/cache/relfilenodemap.c
create mode 100644 src/backend/utils/cache/relfilenumbermap.c
delete mode 100644 src/bin/pg_upgrade/relfilenode.c
create mode 100644 src/bin/pg_upgrade/relfilenumber.c
create mode 100644 src/include/storage/relfilelocator.h
delete mode 100644 src/include/storage/relfilenode.h
delete mode 100644 src/include/utils/relfilenodemap.h
create mode 100644 src/include/utils/relfilenumbermap.h
diff --git a/contrib/bloom/blinsert.c b/contrib/bloom/blinsert.c
index 82378db..e64291e 100644
--- a/contrib/bloom/blinsert.c
+++ b/contrib/bloom/blinsert.c
@@ -179,7 +179,7 @@ blbuildempty(Relation index)
PageSetChecksumInplace(metapage, BLOOM_METAPAGE_BLKNO);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, BLOOM_METAPAGE_BLKNO,
(char *) metapage, true);
- log_newpage(&(RelationGetSmgr(index))->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&(RelationGetSmgr(index))->smgr_rlocator.locator, INIT_FORKNUM,
BLOOM_METAPAGE_BLKNO, metapage, true);
/*
diff --git a/contrib/oid2name/oid2name.c b/contrib/oid2name/oid2name.c
index a62a5ee..2e08bc7 100644
--- a/contrib/oid2name/oid2name.c
+++ b/contrib/oid2name/oid2name.c
@@ -30,7 +30,7 @@ struct options
{
eary *tables;
eary *oids;
- eary *filenodes;
+ eary *filenumbers;
bool quiet;
bool systables;
@@ -125,9 +125,9 @@ get_opts(int argc, char **argv, struct options *my_opts)
my_opts->dbname = pg_strdup(optarg);
break;
- /* specify one filenode to show */
+ /* specify one filenumber to show */
case 'f':
- add_one_elt(optarg, my_opts->filenodes);
+ add_one_elt(optarg, my_opts->filenumbers);
break;
/* host to connect to */
@@ -494,7 +494,7 @@ sql_exec_dumpalltables(PGconn *conn, struct options *opts)
}
/*
- * Show oid, filenode, name, schema and tablespace for each of the
+ * Show oid, filenumber, name, schema and tablespace for each of the
* given objects in the current database.
*/
void
@@ -504,19 +504,19 @@ sql_exec_searchtables(PGconn *conn, struct options *opts)
char *qualifiers,
*ptr;
char *comma_oids,
- *comma_filenodes,
+ *comma_filenumbers,
*comma_tables;
bool written = false;
char *addfields = ",c.oid AS \"Oid\", nspname AS \"Schema\", spcname as \"Tablespace\" ";
- /* get tables qualifiers, whether names, filenodes, or OIDs */
+ /* get tables qualifiers, whether names, filenumbers, or OIDs */
comma_oids = get_comma_elts(opts->oids);
comma_tables = get_comma_elts(opts->tables);
- comma_filenodes = get_comma_elts(opts->filenodes);
+ comma_filenumbers = get_comma_elts(opts->filenumbers);
/* 80 extra chars for SQL expression */
qualifiers = (char *) pg_malloc(strlen(comma_oids) + strlen(comma_tables) +
- strlen(comma_filenodes) + 80);
+ strlen(comma_filenumbers) + 80);
ptr = qualifiers;
if (opts->oids->num > 0)
@@ -524,11 +524,11 @@ sql_exec_searchtables(PGconn *conn, struct options *opts)
ptr += sprintf(ptr, "c.oid IN (%s)", comma_oids);
written = true;
}
- if (opts->filenodes->num > 0)
+ if (opts->filenumbers->num > 0)
{
if (written)
ptr += sprintf(ptr, " OR ");
- ptr += sprintf(ptr, "pg_catalog.pg_relation_filenode(c.oid) IN (%s)", comma_filenodes);
+ ptr += sprintf(ptr, "pg_catalog.pg_relation_filenode(c.oid) IN (%s)", comma_filenumbers);
written = true;
}
if (opts->tables->num > 0)
@@ -539,7 +539,7 @@ sql_exec_searchtables(PGconn *conn, struct options *opts)
}
free(comma_oids);
free(comma_tables);
- free(comma_filenodes);
+ free(comma_filenumbers);
/* now build the query */
todo = psprintf("SELECT pg_catalog.pg_relation_filenode(c.oid) as \"Filenode\", relname as \"Table Name\" %s\n"
@@ -588,11 +588,11 @@ main(int argc, char **argv)
my_opts->oids = (eary *) pg_malloc(sizeof(eary));
my_opts->tables = (eary *) pg_malloc(sizeof(eary));
- my_opts->filenodes = (eary *) pg_malloc(sizeof(eary));
+ my_opts->filenumbers = (eary *) pg_malloc(sizeof(eary));
my_opts->oids->num = my_opts->oids->alloc = 0;
my_opts->tables->num = my_opts->tables->alloc = 0;
- my_opts->filenodes->num = my_opts->filenodes->alloc = 0;
+ my_opts->filenumbers->num = my_opts->filenumbers->alloc = 0;
/* parse the opts */
get_opts(argc, argv, my_opts);
@@ -618,7 +618,7 @@ main(int argc, char **argv)
/* display the given elements in the database */
if (my_opts->oids->num > 0 ||
my_opts->tables->num > 0 ||
- my_opts->filenodes->num > 0)
+ my_opts->filenumbers->num > 0)
{
if (!my_opts->quiet)
printf("From database \"%s\":\n", my_opts->dbname);
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 1bd579f..713f52a 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -26,7 +26,7 @@ PG_MODULE_MAGIC;
typedef struct
{
uint32 bufferid;
- Oid relfilenode;
+ RelFileNumber relfilenumber;
Oid reltablespace;
Oid reldatabase;
ForkNumber forknum;
@@ -153,9 +153,9 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenode = bufHdr->tag.rnode.relNode;
- fctx->record[i].reltablespace = bufHdr->tag.rnode.spcNode;
- fctx->record[i].reldatabase = bufHdr->tag.rnode.dbNode;
+ fctx->record[i].relfilenumber = bufHdr->tag.rlocator.relNumber;
+ fctx->record[i].reltablespace = bufHdr->tag.rlocator.spcOid;
+ fctx->record[i].reldatabase = bufHdr->tag.rlocator.dbOid;
fctx->record[i].forknum = bufHdr->tag.forkNum;
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
@@ -209,7 +209,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
}
else
{
- values[1] = ObjectIdGetDatum(fctx->record[i].relfilenode);
+ values[1] = ObjectIdGetDatum(fctx->record[i].relfilenumber);
nulls[1] = false;
values[2] = ObjectIdGetDatum(fctx->record[i].reltablespace);
nulls[2] = false;
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index c0c4f5d..7f1d55c 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -52,7 +52,7 @@
#include "utils/guc.h"
#include "utils/memutils.h"
#include "utils/rel.h"
-#include "utils/relfilenodemap.h"
+#include "utils/relfilenumbermap.h"
#include "utils/resowner.h"
#define AUTOPREWARM_FILE "autoprewarm.blocks"
@@ -62,7 +62,7 @@ typedef struct BlockInfoRecord
{
Oid database;
Oid tablespace;
- Oid filenode;
+ RelFileNumber filenumber;
ForkNumber forknum;
BlockNumber blocknum;
} BlockInfoRecord;
@@ -347,7 +347,7 @@ apw_load_buffers(void)
unsigned forknum;
if (fscanf(file, "%u,%u,%u,%u,%u\n", &blkinfo[i].database,
- &blkinfo[i].tablespace, &blkinfo[i].filenode,
+ &blkinfo[i].tablespace, &blkinfo[i].filenumber,
&forknum, &blkinfo[i].blocknum) != 5)
ereport(ERROR,
(errmsg("autoprewarm block dump file is corrupted at line %d",
@@ -494,7 +494,7 @@ autoprewarm_database_main(Datum main_arg)
* relation. Note that rel will be NULL if try_relation_open failed
* previously; in that case, there is nothing to close.
*/
- if (old_blk != NULL && old_blk->filenode != blk->filenode &&
+ if (old_blk != NULL && old_blk->filenumber != blk->filenumber &&
rel != NULL)
{
relation_close(rel, AccessShareLock);
@@ -506,13 +506,13 @@ autoprewarm_database_main(Datum main_arg)
* Try to open each new relation, but only once, when we first
* encounter it. If it's been dropped, skip the associated blocks.
*/
- if (old_blk == NULL || old_blk->filenode != blk->filenode)
+ if (old_blk == NULL || old_blk->filenumber != blk->filenumber)
{
Oid reloid;
Assert(rel == NULL);
StartTransactionCommand();
- reloid = RelidByRelfilenode(blk->tablespace, blk->filenode);
+ reloid = RelidByRelfilenumber(blk->tablespace, blk->filenumber);
if (OidIsValid(reloid))
rel = try_relation_open(reloid, AccessShareLock);
@@ -527,7 +527,7 @@ autoprewarm_database_main(Datum main_arg)
/* Once per fork, check for fork existence and size. */
if (old_blk == NULL ||
- old_blk->filenode != blk->filenode ||
+ old_blk->filenumber != blk->filenumber ||
old_blk->forknum != blk->forknum)
{
/*
@@ -631,9 +631,9 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
if (buf_state & BM_TAG_VALID &&
((buf_state & BM_PERMANENT) || dump_unlogged))
{
- block_info_array[num_blocks].database = bufHdr->tag.rnode.dbNode;
- block_info_array[num_blocks].tablespace = bufHdr->tag.rnode.spcNode;
- block_info_array[num_blocks].filenode = bufHdr->tag.rnode.relNode;
+ block_info_array[num_blocks].database = bufHdr->tag.rlocator.dbOid;
+ block_info_array[num_blocks].tablespace = bufHdr->tag.rlocator.spcOid;
+ block_info_array[num_blocks].filenumber = bufHdr->tag.rlocator.relNumber;
block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
@@ -671,7 +671,7 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
ret = fprintf(file, "%u,%u,%u,%u,%u\n",
block_info_array[i].database,
block_info_array[i].tablespace,
- block_info_array[i].filenode,
+ block_info_array[i].filenumber,
(uint32) block_info_array[i].forknum,
block_info_array[i].blocknum);
if (ret < 0)
@@ -900,7 +900,7 @@ do { \
* We depend on all records for a particular database being consecutive
* in the dump file; each per-database worker will preload blocks until
* it sees a block for some other database. Sorting by tablespace,
- * filenode, forknum, and blocknum isn't critical for correctness, but
+ * filenumber, forknum, and blocknum isn't critical for correctness, but
* helps us get a sequential I/O pattern.
*/
static int
@@ -911,7 +911,7 @@ apw_compare_blockinfo(const void *p, const void *q)
cmp_member_elem(database);
cmp_member_elem(tablespace);
- cmp_member_elem(filenode);
+ cmp_member_elem(filenumber);
cmp_member_elem(forknum);
cmp_member_elem(blocknum);
diff --git a/contrib/pg_visibility/pg_visibility.c b/contrib/pg_visibility/pg_visibility.c
index 1853c35..4e2e9ea 100644
--- a/contrib/pg_visibility/pg_visibility.c
+++ b/contrib/pg_visibility/pg_visibility.c
@@ -407,7 +407,7 @@ pg_truncate_visibility_map(PG_FUNCTION_ARGS)
xl_smgr_truncate xlrec;
xlrec.blkno = 0;
- xlrec.rnode = rel->rd_node;
+ xlrec.rlocator = rel->rd_locator;
xlrec.flags = SMGR_TRUNCATE_VM;
XLogBeginInsert();
diff --git a/src/backend/access/common/syncscan.c b/src/backend/access/common/syncscan.c
index d5b16c5..ad48cb7 100644
--- a/src/backend/access/common/syncscan.c
+++ b/src/backend/access/common/syncscan.c
@@ -90,7 +90,7 @@ bool trace_syncscan = false;
*/
typedef struct ss_scan_location_t
{
- RelFileNode relfilenode; /* identity of a relation */
+ RelFileLocator relfilelocator; /* identity of a relation */
BlockNumber location; /* last-reported location in the relation */
} ss_scan_location_t;
@@ -115,7 +115,7 @@ typedef struct ss_scan_locations_t
static ss_scan_locations_t *scan_locations;
/* prototypes for internal functions */
-static BlockNumber ss_search(RelFileNode relfilenode,
+static BlockNumber ss_search(RelFileLocator relfilelocator,
BlockNumber location, bool set);
@@ -159,9 +159,9 @@ SyncScanShmemInit(void)
* these invalid entries will fall off the LRU list and get
* replaced with real entries.
*/
- item->location.relfilenode.spcNode = InvalidOid;
- item->location.relfilenode.dbNode = InvalidOid;
- item->location.relfilenode.relNode = InvalidOid;
+ item->location.relfilelocator.spcOid = InvalidOid;
+ item->location.relfilelocator.dbOid = InvalidOid;
+ item->location.relfilelocator.relNumber = InvalidRelFileNumber;
item->location.location = InvalidBlockNumber;
item->prev = (i > 0) ?
@@ -176,10 +176,10 @@ SyncScanShmemInit(void)
/*
* ss_search --- search the scan_locations structure for an entry with the
- * given relfilenode.
+ * given relfilelocator.
*
* If "set" is true, the location is updated to the given location. If no
- * entry for the given relfilenode is found, it will be created at the head
+ * entry for the given relfilelocator is found, it will be created at the head
* of the list with the given location, even if "set" is false.
*
* In any case, the location after possible update is returned.
@@ -188,7 +188,7 @@ SyncScanShmemInit(void)
* data structure.
*/
static BlockNumber
-ss_search(RelFileNode relfilenode, BlockNumber location, bool set)
+ss_search(RelFileLocator relfilelocator, BlockNumber location, bool set)
{
ss_lru_item_t *item;
@@ -197,7 +197,8 @@ ss_search(RelFileNode relfilenode, BlockNumber location, bool set)
{
bool match;
- match = RelFileNodeEquals(item->location.relfilenode, relfilenode);
+ match = RelFileLocatorEquals(item->location.relfilelocator,
+ relfilelocator);
if (match || item->next == NULL)
{
@@ -207,7 +208,7 @@ ss_search(RelFileNode relfilenode, BlockNumber location, bool set)
*/
if (!match)
{
- item->location.relfilenode = relfilenode;
+ item->location.relfilelocator = relfilelocator;
item->location.location = location;
}
else if (set)
@@ -255,7 +256,7 @@ ss_get_location(Relation rel, BlockNumber relnblocks)
BlockNumber startloc;
LWLockAcquire(SyncScanLock, LW_EXCLUSIVE);
- startloc = ss_search(rel->rd_node, 0, false);
+ startloc = ss_search(rel->rd_locator, 0, false);
LWLockRelease(SyncScanLock);
/*
@@ -281,8 +282,8 @@ ss_get_location(Relation rel, BlockNumber relnblocks)
* ss_report_location --- update the current scan location
*
* Writes an entry into the shared Sync Scan state of the form
- * (relfilenode, blocknumber), overwriting any existing entry for the
- * same relfilenode.
+ * (relfilelocator, blocknumber), overwriting any existing entry for the
+ * same relfilelocator.
*/
void
ss_report_location(Relation rel, BlockNumber location)
@@ -309,7 +310,7 @@ ss_report_location(Relation rel, BlockNumber location)
{
if (LWLockConditionalAcquire(SyncScanLock, LW_EXCLUSIVE))
{
- (void) ss_search(rel->rd_node, location, true);
+ (void) ss_search(rel->rd_locator, location, true);
LWLockRelease(SyncScanLock);
}
#ifdef TRACE_SYNCSCAN
diff --git a/src/backend/access/gin/ginbtree.c b/src/backend/access/gin/ginbtree.c
index cc6d4e6..c75bfc2 100644
--- a/src/backend/access/gin/ginbtree.c
+++ b/src/backend/access/gin/ginbtree.c
@@ -470,7 +470,7 @@ ginPlaceToPage(GinBtree btree, GinBtreeStack *stack,
savedRightLink = GinPageGetOpaque(page)->rightlink;
/* Begin setting up WAL record */
- data.node = btree->index->rd_node;
+ data.locator = btree->index->rd_locator;
data.flags = xlflags;
if (BufferIsValid(childbuf))
{
diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
index 7409fdc..6c67744 100644
--- a/src/backend/access/gin/ginfast.c
+++ b/src/backend/access/gin/ginfast.c
@@ -235,7 +235,7 @@ ginHeapTupleFastInsert(GinState *ginstate, GinTupleCollector *collector)
needWal = RelationNeedsWAL(index);
- data.node = index->rd_node;
+ data.locator = index->rd_locator;
data.ntuples = 0;
data.newRightlink = data.prevTail = InvalidBlockNumber;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 20f4706..6df7f2e 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -688,7 +688,7 @@ ginUpdateStats(Relation index, const GinStatsData *stats, bool is_build)
XLogRecPtr recptr;
ginxlogUpdateMeta data;
- data.node = index->rd_node;
+ data.locator = index->rd_locator;
data.ntuples = 0;
data.newRightlink = data.prevTail = InvalidBlockNumber;
memcpy(&data.metadata, metadata, sizeof(GinMetaPageData));
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 87e8366..41b9211 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -95,13 +95,13 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
if (PageAddItem(page, (Item) itup, IndexTupleSize(itup), offset, false, false) == InvalidOffsetNumber)
{
- RelFileNode node;
+ RelFileLocator locator;
ForkNumber forknum;
BlockNumber blknum;
- BufferGetTag(buffer, &node, &forknum, &blknum);
+ BufferGetTag(buffer, &locator, &forknum, &blknum);
elog(ERROR, "failed to add item to index page in %u/%u/%u",
- node.spcNode, node.dbNode, node.relNode);
+ locator.spcOid, locator.dbOid, locator.relNumber);
}
}
diff --git a/src/backend/access/gist/gistbuild.c b/src/backend/access/gist/gistbuild.c
index f5a5caf..374e64e 100644
--- a/src/backend/access/gist/gistbuild.c
+++ b/src/backend/access/gist/gistbuild.c
@@ -462,7 +462,7 @@ gist_indexsortbuild(GISTBuildState *state)
smgrwrite(RelationGetSmgr(state->indexrel), MAIN_FORKNUM, GIST_ROOT_BLKNO,
levelstate->pages[0], true);
if (RelationNeedsWAL(state->indexrel))
- log_newpage(&state->indexrel->rd_node, MAIN_FORKNUM, GIST_ROOT_BLKNO,
+ log_newpage(&state->indexrel->rd_locator, MAIN_FORKNUM, GIST_ROOT_BLKNO,
levelstate->pages[0], true);
pfree(levelstate->pages[0]);
@@ -663,7 +663,7 @@ gist_indexsortbuild_flush_ready_pages(GISTBuildState *state)
}
if (RelationNeedsWAL(state->indexrel))
- log_newpages(&state->indexrel->rd_node, MAIN_FORKNUM, state->ready_num_pages,
+ log_newpages(&state->indexrel->rd_locator, MAIN_FORKNUM, state->ready_num_pages,
state->ready_blknos, state->ready_pages, true);
for (int i = 0; i < state->ready_num_pages; i++)
diff --git a/src/backend/access/gist/gistxlog.c b/src/backend/access/gist/gistxlog.c
index df70f90..b4f629f 100644
--- a/src/backend/access/gist/gistxlog.c
+++ b/src/backend/access/gist/gistxlog.c
@@ -191,11 +191,12 @@ gistRedoDeleteRecord(XLogReaderState *record)
*/
if (InHotStandby)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, NULL);
- ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid, rnode);
+ ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid,
+ rlocator);
}
if (XLogReadBufferForRedo(record, 0, &buffer) == BLK_NEEDS_REDO)
@@ -395,7 +396,7 @@ gistRedoPageReuse(XLogReaderState *record)
*/
if (InHotStandby)
ResolveRecoveryConflictWithSnapshotFullXid(xlrec->latestRemovedFullXid,
- xlrec->node);
+ xlrec->locator);
}
void
@@ -607,7 +608,7 @@ gistXLogPageReuse(Relation rel, BlockNumber blkno, FullTransactionId latestRemov
*/
/* XLOG stuff */
- xlrec_reuse.node = rel->rd_node;
+ xlrec_reuse.locator = rel->rd_locator;
xlrec_reuse.block = blkno;
xlrec_reuse.latestRemovedFullXid = latestRemovedXid;
diff --git a/src/backend/access/hash/hash_xlog.c b/src/backend/access/hash/hash_xlog.c
index 62dbfc3..2e68303 100644
--- a/src/backend/access/hash/hash_xlog.c
+++ b/src/backend/access/hash/hash_xlog.c
@@ -999,10 +999,10 @@ hash_xlog_vacuum_one_page(XLogReaderState *record)
*/
if (InHotStandby)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
- ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid, rnode);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, NULL);
+ ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid, rlocator);
}
action = XLogReadBufferForRedoExtended(record, 0, RBM_NORMAL, true, &buffer);
diff --git a/src/backend/access/hash/hashpage.c b/src/backend/access/hash/hashpage.c
index 39206d1..d2edcd4 100644
--- a/src/backend/access/hash/hashpage.c
+++ b/src/backend/access/hash/hashpage.c
@@ -428,7 +428,7 @@ _hash_init(Relation rel, double num_tuples, ForkNumber forkNum)
MarkBufferDirty(buf);
if (use_wal)
- log_newpage(&rel->rd_node,
+ log_newpage(&rel->rd_locator,
forkNum,
blkno,
BufferGetPage(buf),
@@ -1019,7 +1019,7 @@ _hash_alloc_buckets(Relation rel, BlockNumber firstblock, uint32 nblocks)
ovflopaque->hasho_page_id = HASHO_PAGE_ID;
if (RelationNeedsWAL(rel))
- log_newpage(&rel->rd_node,
+ log_newpage(&rel->rd_locator,
MAIN_FORKNUM,
lastblock,
zerobuf.data,
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 637de11..aab8d6f 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -8189,7 +8189,7 @@ log_heap_freeze(Relation reln, Buffer buffer, TransactionId cutoff_xid,
* heap_buffer, if necessary.
*/
XLogRecPtr
-log_heap_visible(RelFileNode rnode, Buffer heap_buffer, Buffer vm_buffer,
+log_heap_visible(RelFileLocator rlocator, Buffer heap_buffer, Buffer vm_buffer,
TransactionId cutoff_xid, uint8 vmflags)
{
xl_heap_visible xlrec;
@@ -8454,7 +8454,7 @@ log_heap_new_cid(Relation relation, HeapTuple tup)
Assert(tup->t_tableOid != InvalidOid);
xlrec.top_xid = GetTopTransactionId();
- xlrec.target_node = relation->rd_node;
+ xlrec.target_locator = relation->rd_locator;
xlrec.target_tid = tup->t_self;
/*
@@ -8623,18 +8623,18 @@ heap_xlog_prune(XLogReaderState *record)
XLogRecPtr lsn = record->EndRecPtr;
xl_heap_prune *xlrec = (xl_heap_prune *) XLogRecGetData(record);
Buffer buffer;
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber blkno;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &blkno);
/*
* We're about to remove tuples. In Hot Standby mode, ensure that there's
* no queries running for which the removed tuples are still visible.
*/
if (InHotStandby)
- ResolveRecoveryConflictWithSnapshot(xlrec->latestRemovedXid, rnode);
+ ResolveRecoveryConflictWithSnapshot(xlrec->latestRemovedXid, rlocator);
/*
* If we have a full-page image, restore it (using a cleanup lock) and
@@ -8694,7 +8694,7 @@ heap_xlog_prune(XLogReaderState *record)
* Do this regardless of a full-page image being applied, since the
* FSM data is not in the page anyway.
*/
- XLogRecordPageWithFreeSpace(rnode, blkno, freespace);
+ XLogRecordPageWithFreeSpace(rlocator, blkno, freespace);
}
}
@@ -8751,9 +8751,9 @@ heap_xlog_vacuum(XLogReaderState *record)
if (BufferIsValid(buffer))
{
Size freespace = PageGetHeapFreeSpace(BufferGetPage(buffer));
- RelFileNode rnode;
+ RelFileLocator rlocator;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &blkno);
UnlockReleaseBuffer(buffer);
@@ -8766,7 +8766,7 @@ heap_xlog_vacuum(XLogReaderState *record)
* Do this regardless of a full-page image being applied, since the
* FSM data is not in the page anyway.
*/
- XLogRecordPageWithFreeSpace(rnode, blkno, freespace);
+ XLogRecordPageWithFreeSpace(rlocator, blkno, freespace);
}
}
@@ -8786,11 +8786,11 @@ heap_xlog_visible(XLogReaderState *record)
Buffer vmbuffer = InvalidBuffer;
Buffer buffer;
Page page;
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber blkno;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 1, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 1, &rlocator, NULL, &blkno);
/*
* If there are any Hot Standby transactions running that have an xmin
@@ -8802,7 +8802,7 @@ heap_xlog_visible(XLogReaderState *record)
* rather than killing the transaction outright.
*/
if (InHotStandby)
- ResolveRecoveryConflictWithSnapshot(xlrec->cutoff_xid, rnode);
+ ResolveRecoveryConflictWithSnapshot(xlrec->cutoff_xid, rlocator);
/*
* Read the heap page, if it still exists. If the heap file has dropped or
@@ -8865,7 +8865,7 @@ heap_xlog_visible(XLogReaderState *record)
* FSM data is not in the page anyway.
*/
if (xlrec->flags & VISIBILITYMAP_VALID_BITS)
- XLogRecordPageWithFreeSpace(rnode, blkno, space);
+ XLogRecordPageWithFreeSpace(rlocator, blkno, space);
}
/*
@@ -8890,7 +8890,7 @@ heap_xlog_visible(XLogReaderState *record)
*/
LockBuffer(vmbuffer, BUFFER_LOCK_UNLOCK);
- reln = CreateFakeRelcacheEntry(rnode);
+ reln = CreateFakeRelcacheEntry(rlocator);
visibilitymap_pin(reln, blkno, &vmbuffer);
/*
@@ -8933,13 +8933,13 @@ heap_xlog_freeze_page(XLogReaderState *record)
*/
if (InHotStandby)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
TransactionId latestRemovedXid = cutoff_xid;
TransactionIdRetreat(latestRemovedXid);
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
- ResolveRecoveryConflictWithSnapshot(latestRemovedXid, rnode);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, NULL);
+ ResolveRecoveryConflictWithSnapshot(latestRemovedXid, rlocator);
}
if (XLogReadBufferForRedo(record, 0, &buffer) == BLK_NEEDS_REDO)
@@ -9007,10 +9007,10 @@ heap_xlog_delete(XLogReaderState *record)
ItemId lp = NULL;
HeapTupleHeader htup;
BlockNumber blkno;
- RelFileNode target_node;
+ RelFileLocator target_locator;
ItemPointerData target_tid;
- XLogRecGetBlockTag(record, 0, &target_node, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &target_locator, NULL, &blkno);
ItemPointerSetBlockNumber(&target_tid, blkno);
ItemPointerSetOffsetNumber(&target_tid, xlrec->offnum);
@@ -9020,7 +9020,7 @@ heap_xlog_delete(XLogReaderState *record)
*/
if (xlrec->flags & XLH_DELETE_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(target_node);
+ Relation reln = CreateFakeRelcacheEntry(target_locator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, blkno, &vmbuffer);
@@ -9086,12 +9086,12 @@ heap_xlog_insert(XLogReaderState *record)
xl_heap_header xlhdr;
uint32 newlen;
Size freespace = 0;
- RelFileNode target_node;
+ RelFileLocator target_locator;
BlockNumber blkno;
ItemPointerData target_tid;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 0, &target_node, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &target_locator, NULL, &blkno);
ItemPointerSetBlockNumber(&target_tid, blkno);
ItemPointerSetOffsetNumber(&target_tid, xlrec->offnum);
@@ -9101,7 +9101,7 @@ heap_xlog_insert(XLogReaderState *record)
*/
if (xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(target_node);
+ Relation reln = CreateFakeRelcacheEntry(target_locator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, blkno, &vmbuffer);
@@ -9184,7 +9184,7 @@ heap_xlog_insert(XLogReaderState *record)
* totally accurate anyway.
*/
if (action == BLK_NEEDS_REDO && freespace < BLCKSZ / 5)
- XLogRecordPageWithFreeSpace(target_node, blkno, freespace);
+ XLogRecordPageWithFreeSpace(target_locator, blkno, freespace);
}
/*
@@ -9195,7 +9195,7 @@ heap_xlog_multi_insert(XLogReaderState *record)
{
XLogRecPtr lsn = record->EndRecPtr;
xl_heap_multi_insert *xlrec;
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber blkno;
Buffer buffer;
Page page;
@@ -9217,7 +9217,7 @@ heap_xlog_multi_insert(XLogReaderState *record)
*/
xlrec = (xl_heap_multi_insert *) XLogRecGetData(record);
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &blkno);
/* check that the mutually exclusive flags are not both set */
Assert(!((xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED) &&
@@ -9229,7 +9229,7 @@ heap_xlog_multi_insert(XLogReaderState *record)
*/
if (xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(rnode);
+ Relation reln = CreateFakeRelcacheEntry(rlocator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, blkno, &vmbuffer);
@@ -9331,7 +9331,7 @@ heap_xlog_multi_insert(XLogReaderState *record)
* totally accurate anyway.
*/
if (action == BLK_NEEDS_REDO && freespace < BLCKSZ / 5)
- XLogRecordPageWithFreeSpace(rnode, blkno, freespace);
+ XLogRecordPageWithFreeSpace(rlocator, blkno, freespace);
}
/*
@@ -9342,7 +9342,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
{
XLogRecPtr lsn = record->EndRecPtr;
xl_heap_update *xlrec = (xl_heap_update *) XLogRecGetData(record);
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber oldblk;
BlockNumber newblk;
ItemPointerData newtid;
@@ -9371,7 +9371,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
oldtup.t_data = NULL;
oldtup.t_len = 0;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &newblk);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &newblk);
if (XLogRecGetBlockTagExtended(record, 1, NULL, NULL, &oldblk, NULL))
{
/* HOT updates are never done across pages */
@@ -9388,7 +9388,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
*/
if (xlrec->flags & XLH_UPDATE_OLD_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(rnode);
+ Relation reln = CreateFakeRelcacheEntry(rlocator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, oldblk, &vmbuffer);
@@ -9472,7 +9472,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
*/
if (xlrec->flags & XLH_UPDATE_NEW_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(rnode);
+ Relation reln = CreateFakeRelcacheEntry(rlocator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, newblk, &vmbuffer);
@@ -9606,7 +9606,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
* totally accurate anyway.
*/
if (newaction == BLK_NEEDS_REDO && !hot_update && freespace < BLCKSZ / 5)
- XLogRecordPageWithFreeSpace(rnode, newblk, freespace);
+ XLogRecordPageWithFreeSpace(rlocator, newblk, freespace);
}
static void
@@ -9662,13 +9662,13 @@ heap_xlog_lock(XLogReaderState *record)
*/
if (xlrec->flags & XLH_LOCK_ALL_FROZEN_CLEARED)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
Buffer vmbuffer = InvalidBuffer;
BlockNumber block;
Relation reln;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &block);
- reln = CreateFakeRelcacheEntry(rnode);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &block);
+ reln = CreateFakeRelcacheEntry(rlocator);
visibilitymap_pin(reln, block, &vmbuffer);
visibilitymap_clear(reln, block, vmbuffer, VISIBILITYMAP_ALL_FROZEN);
@@ -9735,13 +9735,13 @@ heap_xlog_lock_updated(XLogReaderState *record)
*/
if (xlrec->flags & XLH_LOCK_ALL_FROZEN_CLEARED)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
Buffer vmbuffer = InvalidBuffer;
BlockNumber block;
Relation reln;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &block);
- reln = CreateFakeRelcacheEntry(rnode);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &block);
+ reln = CreateFakeRelcacheEntry(rlocator);
visibilitymap_pin(reln, block, &vmbuffer);
visibilitymap_clear(reln, block, vmbuffer, VISIBILITYMAP_ALL_FROZEN);
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 444f027..7f227be 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -566,11 +566,11 @@ tuple_lock_retry:
*/
static void
-heapam_relation_set_new_filenode(Relation rel,
- const RelFileNode *newrnode,
- char persistence,
- TransactionId *freezeXid,
- MultiXactId *minmulti)
+heapam_relation_set_new_filelocator(Relation rel,
+ const RelFileLocator *newrlocator,
+ char persistence,
+ TransactionId *freezeXid,
+ MultiXactId *minmulti)
{
SMgrRelation srel;
@@ -591,7 +591,7 @@ heapam_relation_set_new_filenode(Relation rel,
*/
*minmulti = GetOldestMultiXactId();
- srel = RelationCreateStorage(*newrnode, persistence, true);
+ srel = RelationCreateStorage(*newrlocator, persistence, true);
/*
* If required, set up an init fork for an unlogged table so that it can
@@ -608,7 +608,7 @@ heapam_relation_set_new_filenode(Relation rel,
rel->rd_rel->relkind == RELKIND_MATVIEW ||
rel->rd_rel->relkind == RELKIND_TOASTVALUE);
smgrcreate(srel, INIT_FORKNUM, false);
- log_smgrcreate(newrnode, INIT_FORKNUM);
+ log_smgrcreate(newrlocator, INIT_FORKNUM);
smgrimmedsync(srel, INIT_FORKNUM);
}
@@ -622,11 +622,11 @@ heapam_relation_nontransactional_truncate(Relation rel)
}
static void
-heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
+heapam_relation_copy_data(Relation rel, const RelFileLocator *newrlocator)
{
SMgrRelation dstrel;
- dstrel = smgropen(*newrnode, rel->rd_backend);
+ dstrel = smgropen(*newrlocator, rel->rd_backend);
/*
* Since we copy the file directly without looking at the shared buffers,
@@ -640,10 +640,10 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
* Create and copy all forks of the relation, and schedule unlinking of
* old physical files.
*
- * NOTE: any conflict in relfilenode value will be caught in
+ * NOTE: any conflict in relfilelocator value will be caught in
* RelationCreateStorage().
*/
- RelationCreateStorage(*newrnode, rel->rd_rel->relpersistence, true);
+ RelationCreateStorage(*newrlocator, rel->rd_rel->relpersistence, true);
/* copy main fork */
RelationCopyStorage(RelationGetSmgr(rel), dstrel, MAIN_FORKNUM,
@@ -664,7 +664,7 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
if (RelationIsPermanent(rel) ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
- log_smgrcreate(newrnode, forkNum);
+ log_smgrcreate(newrlocator, forkNum);
RelationCopyStorage(RelationGetSmgr(rel), dstrel, forkNum,
rel->rd_rel->relpersistence);
}
@@ -2569,7 +2569,7 @@ static const TableAmRoutine heapam_methods = {
.tuple_satisfies_snapshot = heapam_tuple_satisfies_snapshot,
.index_delete_tuples = heap_index_delete_tuples,
- .relation_set_new_filenode = heapam_relation_set_new_filenode,
+ .relation_set_new_filelocator = heapam_relation_set_new_filelocator,
.relation_nontransactional_truncate = heapam_relation_nontransactional_truncate,
.relation_copy_data = heapam_relation_copy_data,
.relation_copy_for_cluster = heapam_relation_copy_for_cluster,
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 2a53826..197f06b 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -318,7 +318,7 @@ end_heap_rewrite(RewriteState state)
if (state->rs_buffer_valid)
{
if (RelationNeedsWAL(state->rs_new_rel))
- log_newpage(&state->rs_new_rel->rd_node,
+ log_newpage(&state->rs_new_rel->rd_locator,
MAIN_FORKNUM,
state->rs_blockno,
state->rs_buffer,
@@ -679,7 +679,7 @@ raw_heap_insert(RewriteState state, HeapTuple tup)
/* XLOG stuff */
if (RelationNeedsWAL(state->rs_new_rel))
- log_newpage(&state->rs_new_rel->rd_node,
+ log_newpage(&state->rs_new_rel->rd_locator,
MAIN_FORKNUM,
state->rs_blockno,
page,
@@ -742,7 +742,7 @@ raw_heap_insert(RewriteState state, HeapTuple tup)
* When doing logical decoding - which relies on using cmin/cmax of catalog
* tuples, via xl_heap_new_cid records - heap rewrites have to log enough
* information to allow the decoding backend to update its internal mapping
- * of (relfilenode,ctid) => (cmin, cmax) to be correct for the rewritten heap.
+ * of (relfilelocator,ctid) => (cmin, cmax) to be correct for the rewritten heap.
*
* For that, every time we find a tuple that's been modified in a catalog
* relation within the xmin horizon of any decoding slot, we log a mapping
@@ -1080,9 +1080,9 @@ logical_rewrite_heap_tuple(RewriteState state, ItemPointerData old_tid,
return;
/* fill out mapping information */
- map.old_node = state->rs_old_rel->rd_node;
+ map.old_locator = state->rs_old_rel->rd_locator;
map.old_tid = old_tid;
- map.new_node = state->rs_new_rel->rd_node;
+ map.new_locator = state->rs_new_rel->rd_locator;
map.new_tid = new_tid;
/* ---
diff --git a/src/backend/access/heap/visibilitymap.c b/src/backend/access/heap/visibilitymap.c
index e09f25a..ed72eb7 100644
--- a/src/backend/access/heap/visibilitymap.c
+++ b/src/backend/access/heap/visibilitymap.c
@@ -283,7 +283,7 @@ visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf,
if (XLogRecPtrIsInvalid(recptr))
{
Assert(!InRecovery);
- recptr = log_heap_visible(rel->rd_node, heapBuf, vmBuf,
+ recptr = log_heap_visible(rel->rd_locator, heapBuf, vmBuf,
cutoff_xid, flags);
/*
@@ -668,7 +668,7 @@ vm_extend(Relation rel, BlockNumber vm_nblocks)
* to keep checking for creation or extension of the file, which happens
* infrequently.
*/
- CacheInvalidateSmgr(reln->smgr_rnode);
+ CacheInvalidateSmgr(reln->smgr_rlocator);
UnlockRelationForExtension(rel, ExclusiveLock);
}
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 20adb60..8b96708 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -836,7 +836,7 @@ _bt_log_reuse_page(Relation rel, BlockNumber blkno, FullTransactionId safexid)
*/
/* XLOG stuff */
- xlrec_reuse.node = rel->rd_node;
+ xlrec_reuse.locator = rel->rd_locator;
xlrec_reuse.block = blkno;
xlrec_reuse.latestRemovedFullXid = safexid;
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 9b730f3..b52eca8 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -166,7 +166,7 @@ btbuildempty(Relation index)
PageSetChecksumInplace(metapage, BTREE_METAPAGE);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, BTREE_METAPAGE,
(char *) metapage, true);
- log_newpage(&RelationGetSmgr(index)->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&RelationGetSmgr(index)->smgr_rlocator.locator, INIT_FORKNUM,
BTREE_METAPAGE, metapage, true);
/*
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index 9f60fa9..bd1685c 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -647,7 +647,7 @@ _bt_blwritepage(BTWriteState *wstate, Page page, BlockNumber blkno)
if (wstate->btws_use_wal)
{
/* We use the XLOG_FPI record type for this */
- log_newpage(&wstate->index->rd_node, MAIN_FORKNUM, blkno, page, true);
+ log_newpage(&wstate->index->rd_locator, MAIN_FORKNUM, blkno, page, true);
}
/*
diff --git a/src/backend/access/nbtree/nbtxlog.c b/src/backend/access/nbtree/nbtxlog.c
index f9186ca..ad489e3 100644
--- a/src/backend/access/nbtree/nbtxlog.c
+++ b/src/backend/access/nbtree/nbtxlog.c
@@ -664,11 +664,11 @@ btree_xlog_delete(XLogReaderState *record)
*/
if (InHotStandby)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, NULL);
- ResolveRecoveryConflictWithSnapshot(xlrec->latestRemovedXid, rnode);
+ ResolveRecoveryConflictWithSnapshot(xlrec->latestRemovedXid, rlocator);
}
/*
@@ -1006,7 +1006,7 @@ btree_xlog_reuse_page(XLogReaderState *record)
if (InHotStandby)
ResolveRecoveryConflictWithSnapshotFullXid(xlrec->latestRemovedFullXid,
- xlrec->node);
+ xlrec->locator);
}
void
diff --git a/src/backend/access/rmgrdesc/genericdesc.c b/src/backend/access/rmgrdesc/genericdesc.c
index 877beb5..d8509b8 100644
--- a/src/backend/access/rmgrdesc/genericdesc.c
+++ b/src/backend/access/rmgrdesc/genericdesc.c
@@ -15,7 +15,7 @@
#include "access/generic_xlog.h"
#include "lib/stringinfo.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* Description of generic xlog record: write page regions that this record
diff --git a/src/backend/access/rmgrdesc/gindesc.c b/src/backend/access/rmgrdesc/gindesc.c
index 57f7bce..7d147ce 100644
--- a/src/backend/access/rmgrdesc/gindesc.c
+++ b/src/backend/access/rmgrdesc/gindesc.c
@@ -17,7 +17,7 @@
#include "access/ginxlog.h"
#include "access/xlogutils.h"
#include "lib/stringinfo.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
static void
desc_recompress_leaf(StringInfo buf, ginxlogRecompressDataLeaf *insertData)
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index d0c8e24..7dd3c1d 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -16,7 +16,7 @@
#include "access/gistxlog.h"
#include "lib/stringinfo.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
static void
out_gistxlogPageUpdate(StringInfo buf, gistxlogPageUpdate *xlrec)
@@ -27,8 +27,8 @@ static void
out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec)
{
appendStringInfo(buf, "rel %u/%u/%u; blk %u; latestRemovedXid %u:%u",
- xlrec->node.spcNode, xlrec->node.dbNode,
- xlrec->node.relNode, xlrec->block,
+ xlrec->locator.spcOid, xlrec->locator.dbOid,
+ xlrec->locator.relNumber, xlrec->block,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
XidFromFullTransactionId(xlrec->latestRemovedFullXid));
}
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 6238085..923d3bc 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -170,9 +170,9 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
xl_heap_new_cid *xlrec = (xl_heap_new_cid *) rec;
appendStringInfo(buf, "rel %u/%u/%u; tid %u/%u",
- xlrec->target_node.spcNode,
- xlrec->target_node.dbNode,
- xlrec->target_node.relNode,
+ xlrec->target_locator.spcOid,
+ xlrec->target_locator.dbOid,
+ xlrec->target_locator.relNumber,
ItemPointerGetBlockNumber(&(xlrec->target_tid)),
ItemPointerGetOffsetNumber(&(xlrec->target_tid)));
appendStringInfo(buf, "; cmin: %u, cmax: %u, combo: %u",
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index dfbbf4e..4843cd5 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -101,8 +101,8 @@ btree_desc(StringInfo buf, XLogReaderState *record)
xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
appendStringInfo(buf, "rel %u/%u/%u; latestRemovedXid %u:%u",
- xlrec->node.spcNode, xlrec->node.dbNode,
- xlrec->node.relNode,
+ xlrec->locator.spcOid, xlrec->locator.dbOid,
+ xlrec->locator.relNumber,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
XidFromFullTransactionId(xlrec->latestRemovedFullXid));
break;
diff --git a/src/backend/access/rmgrdesc/seqdesc.c b/src/backend/access/rmgrdesc/seqdesc.c
index d9b1e60..b3845f9 100644
--- a/src/backend/access/rmgrdesc/seqdesc.c
+++ b/src/backend/access/rmgrdesc/seqdesc.c
@@ -26,8 +26,8 @@ seq_desc(StringInfo buf, XLogReaderState *record)
if (info == XLOG_SEQ_LOG)
appendStringInfo(buf, "rel %u/%u/%u",
- xlrec->node.spcNode, xlrec->node.dbNode,
- xlrec->node.relNode);
+ xlrec->locator.spcOid, xlrec->locator.dbOid,
+ xlrec->locator.relNumber);
}
const char *
diff --git a/src/backend/access/rmgrdesc/smgrdesc.c b/src/backend/access/rmgrdesc/smgrdesc.c
index 7547813..e0ee8a0 100644
--- a/src/backend/access/rmgrdesc/smgrdesc.c
+++ b/src/backend/access/rmgrdesc/smgrdesc.c
@@ -26,7 +26,7 @@ smgr_desc(StringInfo buf, XLogReaderState *record)
if (info == XLOG_SMGR_CREATE)
{
xl_smgr_create *xlrec = (xl_smgr_create *) rec;
- char *path = relpathperm(xlrec->rnode, xlrec->forkNum);
+ char *path = relpathperm(xlrec->rlocator, xlrec->forkNum);
appendStringInfoString(buf, path);
pfree(path);
@@ -34,7 +34,7 @@ smgr_desc(StringInfo buf, XLogReaderState *record)
else if (info == XLOG_SMGR_TRUNCATE)
{
xl_smgr_truncate *xlrec = (xl_smgr_truncate *) rec;
- char *path = relpathperm(xlrec->rnode, MAIN_FORKNUM);
+ char *path = relpathperm(xlrec->rlocator, MAIN_FORKNUM);
appendStringInfo(buf, "%s to %u blocks flags %d", path,
xlrec->blkno, xlrec->flags);
diff --git a/src/backend/access/rmgrdesc/xactdesc.c b/src/backend/access/rmgrdesc/xactdesc.c
index 90b6ac2..39752cf 100644
--- a/src/backend/access/rmgrdesc/xactdesc.c
+++ b/src/backend/access/rmgrdesc/xactdesc.c
@@ -73,15 +73,15 @@ ParseCommitRecord(uint8 info, xl_xact_commit *xlrec, xl_xact_parsed_commit *pars
data += parsed->nsubxacts * sizeof(TransactionId);
}
- if (parsed->xinfo & XACT_XINFO_HAS_RELFILENODES)
+ if (parsed->xinfo & XACT_XINFO_HAS_RELFILELOCATORS)
{
- xl_xact_relfilenodes *xl_relfilenodes = (xl_xact_relfilenodes *) data;
+ xl_xact_relfilelocators *xl_rellocators = (xl_xact_relfilelocators *) data;
- parsed->nrels = xl_relfilenodes->nrels;
- parsed->xnodes = xl_relfilenodes->xnodes;
+ parsed->nrels = xl_rellocators->nrels;
+ parsed->xlocators = xl_rellocators->xlocators;
- data += MinSizeOfXactRelfilenodes;
- data += xl_relfilenodes->nrels * sizeof(RelFileNode);
+ data += MinSizeOfXactRelfileLocators;
+ data += xl_rellocators->nrels * sizeof(RelFileLocator);
}
if (parsed->xinfo & XACT_XINFO_HAS_DROPPED_STATS)
@@ -179,15 +179,15 @@ ParseAbortRecord(uint8 info, xl_xact_abort *xlrec, xl_xact_parsed_abort *parsed)
data += parsed->nsubxacts * sizeof(TransactionId);
}
- if (parsed->xinfo & XACT_XINFO_HAS_RELFILENODES)
+ if (parsed->xinfo & XACT_XINFO_HAS_RELFILELOCATORS)
{
- xl_xact_relfilenodes *xl_relfilenodes = (xl_xact_relfilenodes *) data;
+ xl_xact_relfilelocators *xl_rellocators = (xl_xact_relfilelocators *) data;
- parsed->nrels = xl_relfilenodes->nrels;
- parsed->xnodes = xl_relfilenodes->xnodes;
+ parsed->nrels = xl_rellocators->nrels;
+ parsed->xlocators = xl_rellocators->xlocators;
- data += MinSizeOfXactRelfilenodes;
- data += xl_relfilenodes->nrels * sizeof(RelFileNode);
+ data += MinSizeOfXactRelfileLocators;
+ data += xl_rellocators->nrels * sizeof(RelFileLocator);
}
if (parsed->xinfo & XACT_XINFO_HAS_DROPPED_STATS)
@@ -260,11 +260,11 @@ ParsePrepareRecord(uint8 info, xl_xact_prepare *xlrec, xl_xact_parsed_prepare *p
parsed->subxacts = (TransactionId *) bufptr;
bufptr += MAXALIGN(xlrec->nsubxacts * sizeof(TransactionId));
- parsed->xnodes = (RelFileNode *) bufptr;
- bufptr += MAXALIGN(xlrec->ncommitrels * sizeof(RelFileNode));
+ parsed->xlocators = (RelFileLocator *) bufptr;
+ bufptr += MAXALIGN(xlrec->ncommitrels * sizeof(RelFileLocator));
- parsed->abortnodes = (RelFileNode *) bufptr;
- bufptr += MAXALIGN(xlrec->nabortrels * sizeof(RelFileNode));
+ parsed->abortlocators = (RelFileLocator *) bufptr;
+ bufptr += MAXALIGN(xlrec->nabortrels * sizeof(RelFileLocator));
parsed->stats = (xl_xact_stats_item *) bufptr;
bufptr += MAXALIGN(xlrec->ncommitstats * sizeof(xl_xact_stats_item));
@@ -278,7 +278,7 @@ ParsePrepareRecord(uint8 info, xl_xact_prepare *xlrec, xl_xact_parsed_prepare *p
static void
xact_desc_relations(StringInfo buf, char *label, int nrels,
- RelFileNode *xnodes)
+ RelFileLocator *xlocators)
{
int i;
@@ -287,7 +287,7 @@ xact_desc_relations(StringInfo buf, char *label, int nrels,
appendStringInfo(buf, "; %s:", label);
for (i = 0; i < nrels; i++)
{
- char *path = relpathperm(xnodes[i], MAIN_FORKNUM);
+ char *path = relpathperm(xlocators[i], MAIN_FORKNUM);
appendStringInfo(buf, " %s", path);
pfree(path);
@@ -340,7 +340,7 @@ xact_desc_commit(StringInfo buf, uint8 info, xl_xact_commit *xlrec, RepOriginId
appendStringInfoString(buf, timestamptz_to_str(xlrec->xact_time));
- xact_desc_relations(buf, "rels", parsed.nrels, parsed.xnodes);
+ xact_desc_relations(buf, "rels", parsed.nrels, parsed.xlocators);
xact_desc_subxacts(buf, parsed.nsubxacts, parsed.subxacts);
xact_desc_stats(buf, "", parsed.nstats, parsed.stats);
@@ -376,7 +376,7 @@ xact_desc_abort(StringInfo buf, uint8 info, xl_xact_abort *xlrec, RepOriginId or
appendStringInfoString(buf, timestamptz_to_str(xlrec->xact_time));
- xact_desc_relations(buf, "rels", parsed.nrels, parsed.xnodes);
+ xact_desc_relations(buf, "rels", parsed.nrels, parsed.xlocators);
xact_desc_subxacts(buf, parsed.nsubxacts, parsed.subxacts);
if (parsed.xinfo & XACT_XINFO_HAS_ORIGIN)
@@ -400,9 +400,9 @@ xact_desc_prepare(StringInfo buf, uint8 info, xl_xact_prepare *xlrec, RepOriginI
appendStringInfo(buf, "gid %s: ", parsed.twophase_gid);
appendStringInfoString(buf, timestamptz_to_str(parsed.xact_time));
- xact_desc_relations(buf, "rels(commit)", parsed.nrels, parsed.xnodes);
+ xact_desc_relations(buf, "rels(commit)", parsed.nrels, parsed.xlocators);
xact_desc_relations(buf, "rels(abort)", parsed.nabortrels,
- parsed.abortnodes);
+ parsed.abortlocators);
xact_desc_stats(buf, "commit ", parsed.nstats, parsed.stats);
xact_desc_stats(buf, "abort ", parsed.nabortstats, parsed.abortstats);
xact_desc_subxacts(buf, parsed.nsubxacts, parsed.subxacts);
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index fefc563..6fec485 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -219,12 +219,12 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
for (block_id = 0; block_id <= XLogRecMaxBlockId(record); block_id++)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blk;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blk, NULL))
+ &rlocator, &forknum, &blk, NULL))
continue;
if (detailed_format)
@@ -239,7 +239,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfo(buf,
"blkref #%d: rel %u/%u/%u fork %s blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
blk);
@@ -299,7 +299,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfo(buf,
", blkref #%d: rel %u/%u/%u fork %s blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
blk);
}
@@ -308,7 +308,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfo(buf,
", blkref #%d: rel %u/%u/%u blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
blk);
}
diff --git a/src/backend/access/spgist/spginsert.c b/src/backend/access/spgist/spginsert.c
index bfb7404..c6821b5 100644
--- a/src/backend/access/spgist/spginsert.c
+++ b/src/backend/access/spgist/spginsert.c
@@ -171,7 +171,7 @@ spgbuildempty(Relation index)
PageSetChecksumInplace(page, SPGIST_METAPAGE_BLKNO);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, SPGIST_METAPAGE_BLKNO,
(char *) page, true);
- log_newpage(&(RelationGetSmgr(index))->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&(RelationGetSmgr(index))->smgr_rlocator.locator, INIT_FORKNUM,
SPGIST_METAPAGE_BLKNO, page, true);
/* Likewise for the root page. */
@@ -180,7 +180,7 @@ spgbuildempty(Relation index)
PageSetChecksumInplace(page, SPGIST_ROOT_BLKNO);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, SPGIST_ROOT_BLKNO,
(char *) page, true);
- log_newpage(&(RelationGetSmgr(index))->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&(RelationGetSmgr(index))->smgr_rlocator.locator, INIT_FORKNUM,
SPGIST_ROOT_BLKNO, page, true);
/* Likewise for the null-tuples root page. */
@@ -189,7 +189,7 @@ spgbuildempty(Relation index)
PageSetChecksumInplace(page, SPGIST_NULL_BLKNO);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, SPGIST_NULL_BLKNO,
(char *) page, true);
- log_newpage(&(RelationGetSmgr(index))->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&(RelationGetSmgr(index))->smgr_rlocator.locator, INIT_FORKNUM,
SPGIST_NULL_BLKNO, page, true);
/*
diff --git a/src/backend/access/spgist/spgxlog.c b/src/backend/access/spgist/spgxlog.c
index b500b2c..4c9f402 100644
--- a/src/backend/access/spgist/spgxlog.c
+++ b/src/backend/access/spgist/spgxlog.c
@@ -877,11 +877,11 @@ spgRedoVacuumRedirect(XLogReaderState *record)
{
if (TransactionIdIsValid(xldata->newestRedirectXid))
{
- RelFileNode node;
+ RelFileLocator locator;
- XLogRecGetBlockTag(record, 0, &node, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, &locator, NULL, NULL);
ResolveRecoveryConflictWithSnapshot(xldata->newestRedirectXid,
- node);
+ locator);
}
}
diff --git a/src/backend/access/table/tableamapi.c b/src/backend/access/table/tableamapi.c
index 76df798..873d961 100644
--- a/src/backend/access/table/tableamapi.c
+++ b/src/backend/access/table/tableamapi.c
@@ -82,7 +82,7 @@ GetTableAmRoutine(Oid amhandler)
Assert(routine->tuple_update != NULL);
Assert(routine->tuple_lock != NULL);
- Assert(routine->relation_set_new_filenode != NULL);
+ Assert(routine->relation_set_new_filelocator != NULL);
Assert(routine->relation_nontransactional_truncate != NULL);
Assert(routine->relation_copy_data != NULL);
Assert(routine->relation_copy_for_cluster != NULL);
diff --git a/src/backend/access/transam/README b/src/backend/access/transam/README
index 1edc818..565f994 100644
--- a/src/backend/access/transam/README
+++ b/src/backend/access/transam/README
@@ -557,7 +557,7 @@ void XLogRegisterBuffer(uint8 block_id, Buffer buf, uint8 flags);
XLogRegisterBuffer adds information about a data block to the WAL record.
block_id is an arbitrary number used to identify this page reference in
the redo routine. The information needed to re-find the page at redo -
- relfilenode, fork, and block number - are included in the WAL record.
+ relfilenumber, fork, and block number - are included in the WAL record.
XLogInsert will automatically include a full copy of the page contents, if
this is the first modification of the buffer since the last checkpoint.
@@ -692,7 +692,7 @@ by having database restart search for files that don't have any committed
entry in pg_class, but that currently isn't done because of the possibility
of deleting data that is useful for forensic analysis of the crash.
Orphan files are harmless --- at worst they waste a bit of disk space ---
-because we check for on-disk collisions when allocating new relfilenode
+because we check for on-disk collisions when allocating new relfilenumber
OIDs. So cleaning up isn't really necessary.
3. Deleting a table, which requires an unlink() that could fail.
@@ -725,10 +725,10 @@ then restart recovery. This is part of the reason for not writing a WAL
entry until we've successfully done the original action.
-Skipping WAL for New RelFileNode
---------------------------------
+Skipping WAL for New RelFileLocator
+-----------------------------------
-Under wal_level=minimal, if a change modifies a relfilenode that ROLLBACK
+Under wal_level=minimal, if a change modifies a relfilenumber that ROLLBACK
would unlink, in-tree access methods write no WAL for that change. Code that
writes WAL without calling RelationNeedsWAL() must check for this case. This
skipping is mandatory. If a WAL-writing change preceded a WAL-skipping change
@@ -748,9 +748,9 @@ unconditionally for permanent relations. Under these approaches, the access
method callbacks must not call functions that react to RelationNeedsWAL().
This applies only to WAL records whose replay would modify bytes stored in the
-new relfilenode. It does not apply to other records about the relfilenode,
+new relfilenumber. It does not apply to other records about the relfilenumber,
such as XLOG_SMGR_CREATE. Because it operates at the level of individual
-relfilenodes, RelationNeedsWAL() can differ for tightly-coupled relations.
+relfilenumbers, RelationNeedsWAL() can differ for tightly-coupled relations.
Consider "CREATE TABLE t (); BEGIN; ALTER TABLE t ADD c text; ..." in which
ALTER TABLE adds a TOAST relation. The TOAST relation will skip WAL, while
the table owning it will not. ALTER TABLE SET TABLESPACE will cause a table
@@ -860,7 +860,7 @@ Changes to a temp table are not WAL-logged, hence could reach disk in
advance of T1's commit, but we don't care since temp table contents don't
survive crashes anyway.
-Database writes that skip WAL for new relfilenodes are also safe. In these
+Database writes that skip WAL for new relfilenumbers are also safe. In these
cases it's entirely possible for the data to reach disk before T1's commit,
because T1 will fsync it down to disk without any sort of interlock. However,
all these paths are designed to write data that no other transaction can see
diff --git a/src/backend/access/transam/README.parallel b/src/backend/access/transam/README.parallel
index 99c588d..e486bff 100644
--- a/src/backend/access/transam/README.parallel
+++ b/src/backend/access/transam/README.parallel
@@ -126,7 +126,7 @@ worker. This includes:
an index that is currently being rebuilt.
- Active relmapper.c mapping state. This is needed to allow consistent
- answers when fetching the current relfilenode for relation oids of
+ answers when fetching the current relfilenumber for relation oids of
mapped relations.
To prevent unprincipled deadlocks when running in parallel mode, this code
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 75551f6..41b31c5 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -204,7 +204,7 @@ static void RecordTransactionCommitPrepared(TransactionId xid,
int nchildren,
TransactionId *children,
int nrels,
- RelFileNode *rels,
+ RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
int ninvalmsgs,
@@ -215,7 +215,7 @@ static void RecordTransactionAbortPrepared(TransactionId xid,
int nchildren,
TransactionId *children,
int nrels,
- RelFileNode *rels,
+ RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
const char *gid);
@@ -951,8 +951,8 @@ TwoPhaseGetDummyProc(TransactionId xid, bool lock_held)
*
* 1. TwoPhaseFileHeader
* 2. TransactionId[] (subtransactions)
- * 3. RelFileNode[] (files to be deleted at commit)
- * 4. RelFileNode[] (files to be deleted at abort)
+ * 3. RelFileLocator[] (files to be deleted at commit)
+ * 4. RelFileLocator[] (files to be deleted at abort)
* 5. SharedInvalidationMessage[] (inval messages to be sent at commit)
* 6. TwoPhaseRecordOnDisk
* 7. ...
@@ -1047,8 +1047,8 @@ StartPrepare(GlobalTransaction gxact)
TransactionId xid = gxact->xid;
TwoPhaseFileHeader hdr;
TransactionId *children;
- RelFileNode *commitrels;
- RelFileNode *abortrels;
+ RelFileLocator *commitrels;
+ RelFileLocator *abortrels;
xl_xact_stats_item *abortstats = NULL;
xl_xact_stats_item *commitstats = NULL;
SharedInvalidationMessage *invalmsgs;
@@ -1102,12 +1102,12 @@ StartPrepare(GlobalTransaction gxact)
}
if (hdr.ncommitrels > 0)
{
- save_state_data(commitrels, hdr.ncommitrels * sizeof(RelFileNode));
+ save_state_data(commitrels, hdr.ncommitrels * sizeof(RelFileLocator));
pfree(commitrels);
}
if (hdr.nabortrels > 0)
{
- save_state_data(abortrels, hdr.nabortrels * sizeof(RelFileNode));
+ save_state_data(abortrels, hdr.nabortrels * sizeof(RelFileLocator));
pfree(abortrels);
}
if (hdr.ncommitstats > 0)
@@ -1489,9 +1489,9 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
TwoPhaseFileHeader *hdr;
TransactionId latestXid;
TransactionId *children;
- RelFileNode *commitrels;
- RelFileNode *abortrels;
- RelFileNode *delrels;
+ RelFileLocator *commitrels;
+ RelFileLocator *abortrels;
+ RelFileLocator *delrels;
int ndelrels;
xl_xact_stats_item *commitstats;
xl_xact_stats_item *abortstats;
@@ -1525,10 +1525,10 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
bufptr += MAXALIGN(hdr->gidlen);
children = (TransactionId *) bufptr;
bufptr += MAXALIGN(hdr->nsubxacts * sizeof(TransactionId));
- commitrels = (RelFileNode *) bufptr;
- bufptr += MAXALIGN(hdr->ncommitrels * sizeof(RelFileNode));
- abortrels = (RelFileNode *) bufptr;
- bufptr += MAXALIGN(hdr->nabortrels * sizeof(RelFileNode));
+ commitrels = (RelFileLocator *) bufptr;
+ bufptr += MAXALIGN(hdr->ncommitrels * sizeof(RelFileLocator));
+ abortrels = (RelFileLocator *) bufptr;
+ bufptr += MAXALIGN(hdr->nabortrels * sizeof(RelFileLocator));
commitstats = (xl_xact_stats_item *) bufptr;
bufptr += MAXALIGN(hdr->ncommitstats * sizeof(xl_xact_stats_item));
abortstats = (xl_xact_stats_item *) bufptr;
@@ -2100,8 +2100,8 @@ RecoverPreparedTransactions(void)
bufptr += MAXALIGN(hdr->gidlen);
subxids = (TransactionId *) bufptr;
bufptr += MAXALIGN(hdr->nsubxacts * sizeof(TransactionId));
- bufptr += MAXALIGN(hdr->ncommitrels * sizeof(RelFileNode));
- bufptr += MAXALIGN(hdr->nabortrels * sizeof(RelFileNode));
+ bufptr += MAXALIGN(hdr->ncommitrels * sizeof(RelFileLocator));
+ bufptr += MAXALIGN(hdr->nabortrels * sizeof(RelFileLocator));
bufptr += MAXALIGN(hdr->ncommitstats * sizeof(xl_xact_stats_item));
bufptr += MAXALIGN(hdr->nabortstats * sizeof(xl_xact_stats_item));
bufptr += MAXALIGN(hdr->ninvalmsgs * sizeof(SharedInvalidationMessage));
@@ -2285,7 +2285,7 @@ RecordTransactionCommitPrepared(TransactionId xid,
int nchildren,
TransactionId *children,
int nrels,
- RelFileNode *rels,
+ RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
int ninvalmsgs,
@@ -2383,7 +2383,7 @@ RecordTransactionAbortPrepared(TransactionId xid,
int nchildren,
TransactionId *children,
int nrels,
- RelFileNode *rels,
+ RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
const char *gid)
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 748120a..849a7ce 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -521,7 +521,7 @@ ForceTransactionIdLimitUpdate(void)
* wide, counter wraparound will occur eventually, and therefore it is unwise
* to assume they are unique unless precautions are taken to make them so.
* Hence, this routine should generally not be used directly. The only direct
- * callers should be GetNewOidWithIndex() and GetNewRelFileNode() in
+ * callers should be GetNewOidWithIndex() and GetNewRelFileNumber() in
* catalog/catalog.c.
*/
Oid
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 47d80b0..9379723 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -1282,7 +1282,7 @@ RecordTransactionCommit(void)
bool markXidCommitted = TransactionIdIsValid(xid);
TransactionId latestXid = InvalidTransactionId;
int nrels;
- RelFileNode *rels;
+ RelFileLocator *rels;
int nchildren;
TransactionId *children;
int ndroppedstats = 0;
@@ -1705,7 +1705,7 @@ RecordTransactionAbort(bool isSubXact)
TransactionId xid = GetCurrentTransactionIdIfAny();
TransactionId latestXid;
int nrels;
- RelFileNode *rels;
+ RelFileLocator *rels;
int ndroppedstats = 0;
xl_xact_stats_item *droppedstats = NULL;
int nchildren;
@@ -5586,7 +5586,7 @@ xactGetCommittedChildren(TransactionId **ptr)
XLogRecPtr
XactLogCommitRecord(TimestampTz commit_time,
int nsubxacts, TransactionId *subxacts,
- int nrels, RelFileNode *rels,
+ int nrels, RelFileLocator *rels,
int ndroppedstats, xl_xact_stats_item *droppedstats,
int nmsgs, SharedInvalidationMessage *msgs,
bool relcacheInval,
@@ -5597,7 +5597,7 @@ XactLogCommitRecord(TimestampTz commit_time,
xl_xact_xinfo xl_xinfo;
xl_xact_dbinfo xl_dbinfo;
xl_xact_subxacts xl_subxacts;
- xl_xact_relfilenodes xl_relfilenodes;
+ xl_xact_relfilelocators xl_relfilelocators;
xl_xact_stats_items xl_dropped_stats;
xl_xact_invals xl_invals;
xl_xact_twophase xl_twophase;
@@ -5651,8 +5651,8 @@ XactLogCommitRecord(TimestampTz commit_time,
if (nrels > 0)
{
- xl_xinfo.xinfo |= XACT_XINFO_HAS_RELFILENODES;
- xl_relfilenodes.nrels = nrels;
+ xl_xinfo.xinfo |= XACT_XINFO_HAS_RELFILELOCATORS;
+ xl_relfilelocators.nrels = nrels;
info |= XLR_SPECIAL_REL_UPDATE;
}
@@ -5710,12 +5710,12 @@ XactLogCommitRecord(TimestampTz commit_time,
nsubxacts * sizeof(TransactionId));
}
- if (xl_xinfo.xinfo & XACT_XINFO_HAS_RELFILENODES)
+ if (xl_xinfo.xinfo & XACT_XINFO_HAS_RELFILELOCATORS)
{
- XLogRegisterData((char *) (&xl_relfilenodes),
- MinSizeOfXactRelfilenodes);
+ XLogRegisterData((char *) (&xl_relfilelocators),
+ MinSizeOfXactRelfileLocators);
XLogRegisterData((char *) rels,
- nrels * sizeof(RelFileNode));
+ nrels * sizeof(RelFileLocator));
}
if (xl_xinfo.xinfo & XACT_XINFO_HAS_DROPPED_STATS)
@@ -5758,7 +5758,7 @@ XactLogCommitRecord(TimestampTz commit_time,
XLogRecPtr
XactLogAbortRecord(TimestampTz abort_time,
int nsubxacts, TransactionId *subxacts,
- int nrels, RelFileNode *rels,
+ int nrels, RelFileLocator *rels,
int ndroppedstats, xl_xact_stats_item *droppedstats,
int xactflags, TransactionId twophase_xid,
const char *twophase_gid)
@@ -5766,7 +5766,7 @@ XactLogAbortRecord(TimestampTz abort_time,
xl_xact_abort xlrec;
xl_xact_xinfo xl_xinfo;
xl_xact_subxacts xl_subxacts;
- xl_xact_relfilenodes xl_relfilenodes;
+ xl_xact_relfilelocators xl_relfilelocators;
xl_xact_stats_items xl_dropped_stats;
xl_xact_twophase xl_twophase;
xl_xact_dbinfo xl_dbinfo;
@@ -5800,8 +5800,8 @@ XactLogAbortRecord(TimestampTz abort_time,
if (nrels > 0)
{
- xl_xinfo.xinfo |= XACT_XINFO_HAS_RELFILENODES;
- xl_relfilenodes.nrels = nrels;
+ xl_xinfo.xinfo |= XACT_XINFO_HAS_RELFILELOCATORS;
+ xl_relfilelocators.nrels = nrels;
info |= XLR_SPECIAL_REL_UPDATE;
}
@@ -5864,12 +5864,12 @@ XactLogAbortRecord(TimestampTz abort_time,
nsubxacts * sizeof(TransactionId));
}
- if (xl_xinfo.xinfo & XACT_XINFO_HAS_RELFILENODES)
+ if (xl_xinfo.xinfo & XACT_XINFO_HAS_RELFILELOCATORS)
{
- XLogRegisterData((char *) (&xl_relfilenodes),
- MinSizeOfXactRelfilenodes);
+ XLogRegisterData((char *) (&xl_relfilelocators),
+ MinSizeOfXactRelfileLocators);
XLogRegisterData((char *) rels,
- nrels * sizeof(RelFileNode));
+ nrels * sizeof(RelFileLocator));
}
if (xl_xinfo.xinfo & XACT_XINFO_HAS_DROPPED_STATS)
@@ -6010,7 +6010,7 @@ xact_redo_commit(xl_xact_parsed_commit *parsed,
XLogFlush(lsn);
/* Make sure files supposed to be dropped are dropped */
- DropRelationFiles(parsed->xnodes, parsed->nrels, true);
+ DropRelationFiles(parsed->xlocators, parsed->nrels, true);
}
if (parsed->nstats > 0)
@@ -6121,7 +6121,7 @@ xact_redo_abort(xl_xact_parsed_abort *parsed, TransactionId xid,
*/
XLogFlush(lsn);
- DropRelationFiles(parsed->xnodes, parsed->nrels, true);
+ DropRelationFiles(parsed->xlocators, parsed->nrels, true);
}
if (parsed->nstats > 0)
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index 2ce9be2..ec27d36 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -70,7 +70,7 @@ typedef struct
{
bool in_use; /* is this slot in use? */
uint8 flags; /* REGBUF_* flags */
- RelFileNode rnode; /* identifies the relation and block */
+ RelFileLocator rlocator; /* identifies the relation and block */
ForkNumber forkno;
BlockNumber block;
Page page; /* page content */
@@ -257,7 +257,7 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
regbuf = &registered_buffers[block_id];
- BufferGetTag(buffer, &regbuf->rnode, &regbuf->forkno, &regbuf->block);
+ BufferGetTag(buffer, &regbuf->rlocator, &regbuf->forkno, &regbuf->block);
regbuf->page = BufferGetPage(buffer);
regbuf->flags = flags;
regbuf->rdata_tail = (XLogRecData *) &regbuf->rdata_head;
@@ -278,7 +278,7 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
if (i == block_id || !regbuf_old->in_use)
continue;
- Assert(!RelFileNodeEquals(regbuf_old->rnode, regbuf->rnode) ||
+ Assert(!RelFileLocatorEquals(regbuf_old->rlocator, regbuf->rlocator) ||
regbuf_old->forkno != regbuf->forkno ||
regbuf_old->block != regbuf->block);
}
@@ -293,7 +293,7 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
* shared buffer pool (i.e. when you don't have a Buffer for it).
*/
void
-XLogRegisterBlock(uint8 block_id, RelFileNode *rnode, ForkNumber forknum,
+XLogRegisterBlock(uint8 block_id, RelFileLocator *rlocator, ForkNumber forknum,
BlockNumber blknum, Page page, uint8 flags)
{
registered_buffer *regbuf;
@@ -308,7 +308,7 @@ XLogRegisterBlock(uint8 block_id, RelFileNode *rnode, ForkNumber forknum,
regbuf = &registered_buffers[block_id];
- regbuf->rnode = *rnode;
+ regbuf->rlocator = *rlocator;
regbuf->forkno = forknum;
regbuf->block = blknum;
regbuf->page = page;
@@ -331,7 +331,7 @@ XLogRegisterBlock(uint8 block_id, RelFileNode *rnode, ForkNumber forknum,
if (i == block_id || !regbuf_old->in_use)
continue;
- Assert(!RelFileNodeEquals(regbuf_old->rnode, regbuf->rnode) ||
+ Assert(!RelFileLocatorEquals(regbuf_old->rlocator, regbuf->rlocator) ||
regbuf_old->forkno != regbuf->forkno ||
regbuf_old->block != regbuf->block);
}
@@ -768,7 +768,7 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
rdt_datas_last = regbuf->rdata_tail;
}
- if (prev_regbuf && RelFileNodeEquals(regbuf->rnode, prev_regbuf->rnode))
+ if (prev_regbuf && RelFileLocatorEquals(regbuf->rlocator, prev_regbuf->rlocator))
{
samerel = true;
bkpb.fork_flags |= BKPBLOCK_SAME_REL;
@@ -793,8 +793,8 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
}
if (!samerel)
{
- memcpy(scratch, &regbuf->rnode, sizeof(RelFileNode));
- scratch += sizeof(RelFileNode);
+ memcpy(scratch, &regbuf->rlocator, sizeof(RelFileLocator));
+ scratch += sizeof(RelFileLocator);
}
memcpy(scratch, &regbuf->block, sizeof(BlockNumber));
scratch += sizeof(BlockNumber);
@@ -1031,7 +1031,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
int flags = 0;
PGAlignedBlock copied_buffer;
char *origdata = (char *) BufferGetBlock(buffer);
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forkno;
BlockNumber blkno;
@@ -1058,8 +1058,8 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
if (buffer_std)
flags |= REGBUF_STANDARD;
- BufferGetTag(buffer, &rnode, &forkno, &blkno);
- XLogRegisterBlock(0, &rnode, forkno, blkno, copied_buffer.data, flags);
+ BufferGetTag(buffer, &rlocator, &forkno, &blkno);
+ XLogRegisterBlock(0, &rlocator, forkno, blkno, copied_buffer.data, flags);
recptr = XLogInsert(RM_XLOG_ID, XLOG_FPI_FOR_HINT);
}
@@ -1080,7 +1080,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
* the unused space to be left out from the WAL record, making it smaller.
*/
XLogRecPtr
-log_newpage(RelFileNode *rnode, ForkNumber forkNum, BlockNumber blkno,
+log_newpage(RelFileLocator *rlocator, ForkNumber forkNum, BlockNumber blkno,
Page page, bool page_std)
{
int flags;
@@ -1091,7 +1091,7 @@ log_newpage(RelFileNode *rnode, ForkNumber forkNum, BlockNumber blkno,
flags |= REGBUF_STANDARD;
XLogBeginInsert();
- XLogRegisterBlock(0, rnode, forkNum, blkno, page, flags);
+ XLogRegisterBlock(0, rlocator, forkNum, blkno, page, flags);
recptr = XLogInsert(RM_XLOG_ID, XLOG_FPI);
/*
@@ -1112,7 +1112,7 @@ log_newpage(RelFileNode *rnode, ForkNumber forkNum, BlockNumber blkno,
* because we can write multiple pages in a single WAL record.
*/
void
-log_newpages(RelFileNode *rnode, ForkNumber forkNum, int num_pages,
+log_newpages(RelFileLocator *rlocator, ForkNumber forkNum, int num_pages,
BlockNumber *blknos, Page *pages, bool page_std)
{
int flags;
@@ -1142,7 +1142,7 @@ log_newpages(RelFileNode *rnode, ForkNumber forkNum, int num_pages,
nbatch = 0;
while (nbatch < XLR_MAX_BLOCK_ID && i < num_pages)
{
- XLogRegisterBlock(nbatch, rnode, forkNum, blknos[i], pages[i], flags);
+ XLogRegisterBlock(nbatch, rlocator, forkNum, blknos[i], pages[i], flags);
i++;
nbatch++;
}
@@ -1177,16 +1177,16 @@ XLogRecPtr
log_newpage_buffer(Buffer buffer, bool page_std)
{
Page page = BufferGetPage(buffer);
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forkNum;
BlockNumber blkno;
/* Shared buffers should be modified in a critical section. */
Assert(CritSectionCount > 0);
- BufferGetTag(buffer, &rnode, &forkNum, &blkno);
+ BufferGetTag(buffer, &rlocator, &forkNum, &blkno);
- return log_newpage(&rnode, forkNum, blkno, page, page_std);
+ return log_newpage(&rlocator, forkNum, blkno, page, page_std);
}
/*
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index 959e409..d1662f3 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -138,7 +138,7 @@ struct XLogPrefetcher
dlist_head filter_queue;
/* Book-keeping to avoid repeat prefetches. */
- RelFileNode recent_rnode[XLOGPREFETCHER_SEQ_WINDOW_SIZE];
+ RelFileLocator recent_rlocator[XLOGPREFETCHER_SEQ_WINDOW_SIZE];
BlockNumber recent_block[XLOGPREFETCHER_SEQ_WINDOW_SIZE];
int recent_idx;
@@ -161,7 +161,7 @@ struct XLogPrefetcher
*/
typedef struct XLogPrefetcherFilter
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
XLogRecPtr filter_until_replayed;
BlockNumber filter_from_block;
dlist_node link;
@@ -187,11 +187,11 @@ typedef struct XLogPrefetchStats
} XLogPrefetchStats;
static inline void XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher,
- RelFileNode rnode,
+ RelFileLocator rlocator,
BlockNumber blockno,
XLogRecPtr lsn);
static inline bool XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher,
- RelFileNode rnode,
+ RelFileLocator rlocator,
BlockNumber blockno);
static inline void XLogPrefetcherCompleteFilters(XLogPrefetcher *prefetcher,
XLogRecPtr replaying_lsn);
@@ -365,7 +365,7 @@ XLogPrefetcherAllocate(XLogReaderState *reader)
{
XLogPrefetcher *prefetcher;
static HASHCTL hash_table_ctl = {
- .keysize = sizeof(RelFileNode),
+ .keysize = sizeof(RelFileLocator),
.entrysize = sizeof(XLogPrefetcherFilter)
};
@@ -568,22 +568,22 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
xl_dbase_create_file_copy_rec *xlrec =
(xl_dbase_create_file_copy_rec *) record->main_data;
- RelFileNode rnode = {InvalidOid, xlrec->db_id, InvalidOid};
+ RelFileLocator rlocator = {InvalidOid, xlrec->db_id, InvalidOid};
/*
* Don't try to prefetch anything in this database until
* it has been created, or we might confuse the blocks of
- * different generations, if a database OID or relfilenode
- * is reused. It's also more efficient than discovering
- * that relations don't exist on disk yet with ENOENT
- * errors.
+ * different generations, if a database OID or
+ * relfilenumber is reused. It's also more efficient than
+ * discovering that relations don't exist on disk yet with
+ * ENOENT errors.
*/
- XLogPrefetcherAddFilter(prefetcher, rnode, 0, record->lsn);
+ XLogPrefetcherAddFilter(prefetcher, rlocator, 0, record->lsn);
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing prefetch in database %u until %X/%X is replayed due to raw file copy",
- rnode.dbNode,
+ rlocator.dbOid,
LSN_FORMAT_ARGS(record->lsn));
#endif
}
@@ -601,19 +601,19 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* Don't prefetch anything for this whole relation
* until it has been created. Otherwise we might
* confuse the blocks of different generations, if a
- * relfilenode is reused. This also avoids the need
+ * relfilenumber is reused. This also avoids the need
* to discover the problem via extra syscalls that
* report ENOENT.
*/
- XLogPrefetcherAddFilter(prefetcher, xlrec->rnode, 0,
+ XLogPrefetcherAddFilter(prefetcher, xlrec->rlocator, 0,
record->lsn);
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing prefetch in relation %u/%u/%u until %X/%X is replayed, which creates the relation",
- xlrec->rnode.spcNode,
- xlrec->rnode.dbNode,
- xlrec->rnode.relNode,
+ xlrec->rlocator.spcOid,
+ xlrec->rlocator.dbOid,
+ xlrec->rlocator.relNumber,
LSN_FORMAT_ARGS(record->lsn));
#endif
}
@@ -627,16 +627,16 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* Don't consider prefetching anything in the truncated
* range until the truncation has been performed.
*/
- XLogPrefetcherAddFilter(prefetcher, xlrec->rnode,
+ XLogPrefetcherAddFilter(prefetcher, xlrec->rlocator,
xlrec->blkno,
record->lsn);
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, which truncates the relation",
- xlrec->rnode.spcNode,
- xlrec->rnode.dbNode,
- xlrec->rnode.relNode,
+ xlrec->rlocator.spcOid,
+ xlrec->rlocator.dbOid,
+ xlrec->rlocator.relNumber,
xlrec->blkno,
LSN_FORMAT_ARGS(record->lsn));
#endif
@@ -688,7 +688,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
}
/* Should we skip prefetching this block due to a filter? */
- if (XLogPrefetcherIsFiltered(prefetcher, block->rnode, block->blkno))
+ if (XLogPrefetcherIsFiltered(prefetcher, block->rlocator, block->blkno))
{
XLogPrefetchIncrement(&SharedStats->skip_new);
return LRQ_NEXT_NO_IO;
@@ -698,7 +698,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
for (int i = 0; i < XLOGPREFETCHER_SEQ_WINDOW_SIZE; ++i)
{
if (block->blkno == prefetcher->recent_block[i] &&
- RelFileNodeEquals(block->rnode, prefetcher->recent_rnode[i]))
+ RelFileLocatorEquals(block->rlocator, prefetcher->recent_rlocator[i]))
{
/*
* XXX If we also remembered where it was, we could set
@@ -709,7 +709,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
return LRQ_NEXT_NO_IO;
}
}
- prefetcher->recent_rnode[prefetcher->recent_idx] = block->rnode;
+ prefetcher->recent_rlocator[prefetcher->recent_idx] = block->rlocator;
prefetcher->recent_block[prefetcher->recent_idx] = block->blkno;
prefetcher->recent_idx =
(prefetcher->recent_idx + 1) % XLOGPREFETCHER_SEQ_WINDOW_SIZE;
@@ -719,7 +719,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* same relation (with some scheme to handle invalidations
* safely), but for now we'll call smgropen() every time.
*/
- reln = smgropen(block->rnode, InvalidBackendId);
+ reln = smgropen(block->rlocator, InvalidBackendId);
/*
* If the relation file doesn't exist on disk, for example because
@@ -733,12 +733,12 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing all prefetch in relation %u/%u/%u until %X/%X is replayed, because the relation does not exist on disk",
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
LSN_FORMAT_ARGS(record->lsn));
#endif
- XLogPrefetcherAddFilter(prefetcher, block->rnode, 0,
+ XLogPrefetcherAddFilter(prefetcher, block->rlocator, 0,
record->lsn);
XLogPrefetchIncrement(&SharedStats->skip_new);
return LRQ_NEXT_NO_IO;
@@ -754,13 +754,13 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, because the relation is too small",
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
block->blkno,
LSN_FORMAT_ARGS(record->lsn));
#endif
- XLogPrefetcherAddFilter(prefetcher, block->rnode, block->blkno,
+ XLogPrefetcherAddFilter(prefetcher, block->rlocator, block->blkno,
record->lsn);
XLogPrefetchIncrement(&SharedStats->skip_new);
return LRQ_NEXT_NO_IO;
@@ -793,9 +793,9 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
*/
elog(ERROR,
"could not prefetch relation %u/%u/%u block %u",
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
block->blkno);
}
}
@@ -852,17 +852,17 @@ pg_stat_get_recovery_prefetch(PG_FUNCTION_ARGS)
}
/*
- * Don't prefetch any blocks >= 'blockno' from a given 'rnode', until 'lsn'
+ * Don't prefetch any blocks >= 'blockno' from a given 'rlocator', until 'lsn'
* has been replayed.
*/
static inline void
-XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher, RelFileNode rnode,
+XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
BlockNumber blockno, XLogRecPtr lsn)
{
XLogPrefetcherFilter *filter;
bool found;
- filter = hash_search(prefetcher->filter_table, &rnode, HASH_ENTER, &found);
+ filter = hash_search(prefetcher->filter_table, &rlocator, HASH_ENTER, &found);
if (!found)
{
/*
@@ -875,7 +875,7 @@ XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher, RelFileNode rnode,
else
{
/*
- * We were already filtering this rnode. Extend the filter's lifetime
+ * We were already filtering this rlocator. Extend the filter's lifetime
* to cover this WAL record, but leave the lower of the block numbers
* there because we don't want to have to track individual blocks.
*/
@@ -890,7 +890,7 @@ XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher, RelFileNode rnode,
* Have we replayed any records that caused us to begin filtering a block
* range? That means that relations should have been created, extended or
* dropped as required, so we can stop filtering out accesses to a given
- * relfilenode.
+ * relfilenumber.
*/
static inline void
XLogPrefetcherCompleteFilters(XLogPrefetcher *prefetcher, XLogRecPtr replaying_lsn)
@@ -913,7 +913,7 @@ XLogPrefetcherCompleteFilters(XLogPrefetcher *prefetcher, XLogRecPtr replaying_l
* Check if a given block should be skipped due to a filter.
*/
static inline bool
-XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileNode rnode,
+XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
BlockNumber blockno)
{
/*
@@ -925,13 +925,13 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileNode rnode,
XLogPrefetcherFilter *filter;
/* See if the block range is filtered. */
- filter = hash_search(prefetcher->filter_table, &rnode, HASH_FIND, NULL);
+ filter = hash_search(prefetcher->filter_table, &rlocator, HASH_FIND, NULL);
if (filter && filter->filter_from_block <= blockno)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
- rnode.spcNode, rnode.dbNode, rnode.relNode, blockno,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed),
filter->filter_from_block);
#endif
@@ -939,15 +939,15 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileNode rnode,
}
/* See if the whole database is filtered. */
- rnode.relNode = InvalidOid;
- rnode.spcNode = InvalidOid;
- filter = hash_search(prefetcher->filter_table, &rnode, HASH_FIND, NULL);
+ rlocator.relNumber = InvalidRelFileNumber;
+ rlocator.spcOid = InvalidOid;
+ filter = hash_search(prefetcher->filter_table, &rlocator, HASH_FIND, NULL);
if (filter)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
- rnode.spcNode, rnode.dbNode, rnode.relNode, blockno,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed));
#endif
return true;
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index cf5db23..f3dc4b7 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -1638,7 +1638,7 @@ DecodeXLogRecord(XLogReaderState *state,
char *out;
uint32 remaining;
uint32 datatotal;
- RelFileNode *rnode = NULL;
+ RelFileLocator *rlocator = NULL;
uint8 block_id;
decoded->header = *record;
@@ -1823,12 +1823,12 @@ DecodeXLogRecord(XLogReaderState *state,
}
if (!(fork_flags & BKPBLOCK_SAME_REL))
{
- COPY_HEADER_FIELD(&blk->rnode, sizeof(RelFileNode));
- rnode = &blk->rnode;
+ COPY_HEADER_FIELD(&blk->rlocator, sizeof(RelFileLocator));
+ rlocator = &blk->rlocator;
}
else
{
- if (rnode == NULL)
+ if (rlocator == NULL)
{
report_invalid_record(state,
"BKPBLOCK_SAME_REL set but no previous rel at %X/%X",
@@ -1836,7 +1836,7 @@ DecodeXLogRecord(XLogReaderState *state,
goto err;
}
- blk->rnode = *rnode;
+ blk->rlocator = *rlocator;
}
COPY_HEADER_FIELD(&blk->blkno, sizeof(BlockNumber));
}
@@ -1926,10 +1926,11 @@ err:
*/
void
XLogRecGetBlockTag(XLogReaderState *record, uint8 block_id,
- RelFileNode *rnode, ForkNumber *forknum, BlockNumber *blknum)
+ RelFileLocator *rlocator, ForkNumber *forknum,
+ BlockNumber *blknum)
{
- if (!XLogRecGetBlockTagExtended(record, block_id, rnode, forknum, blknum,
- NULL))
+ if (!XLogRecGetBlockTagExtended(record, block_id, rlocator, forknum,
+ blknum, NULL))
{
#ifndef FRONTEND
elog(ERROR, "failed to locate backup block with ID %d in WAL record",
@@ -1945,13 +1946,13 @@ XLogRecGetBlockTag(XLogReaderState *record, uint8 block_id,
* Returns information about the block that a block reference refers to,
* optionally including the buffer that the block may already be in.
*
- * If the WAL record contains a block reference with the given ID, *rnode,
+ * If the WAL record contains a block reference with the given ID, *rlocator,
* *forknum, *blknum and *prefetch_buffer are filled in (if not NULL), and
* returns true. Otherwise returns false.
*/
bool
XLogRecGetBlockTagExtended(XLogReaderState *record, uint8 block_id,
- RelFileNode *rnode, ForkNumber *forknum,
+ RelFileLocator *rlocator, ForkNumber *forknum,
BlockNumber *blknum,
Buffer *prefetch_buffer)
{
@@ -1961,8 +1962,8 @@ XLogRecGetBlockTagExtended(XLogReaderState *record, uint8 block_id,
return false;
bkpb = &record->record->blocks[block_id];
- if (rnode)
- *rnode = bkpb->rnode;
+ if (rlocator)
+ *rlocator = bkpb->rlocator;
if (forknum)
*forknum = bkpb->forknum;
if (blknum)
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 6eba626..8306518 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2166,24 +2166,26 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
/* decode block references */
for (block_id = 0; block_id <= XLogRecMaxBlockId(record); block_id++)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blk;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blk, NULL))
+ &rlocator, &forknum, &blk, NULL))
continue;
if (forknum != MAIN_FORKNUM)
appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, fork %u, blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid,
+ rlocator.relNumber,
forknum,
blk);
else
appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid,
+ rlocator.relNumber,
blk);
if (XLogRecHasBlockImage(record, block_id))
appendStringInfoString(buf, " FPW");
@@ -2285,7 +2287,7 @@ static void
verifyBackupPageConsistency(XLogReaderState *record)
{
RmgrData rmgr = GetRmgr(XLogRecGetRmid(record));
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blkno;
int block_id;
@@ -2302,7 +2304,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
Page page;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blkno, NULL))
+ &rlocator, &forknum, &blkno, NULL))
{
/*
* WAL record doesn't contain a block reference with the given id.
@@ -2327,7 +2329,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
* Read the contents from the current buffer and store it in a
* temporary page.
*/
- buf = XLogReadBufferExtended(rnode, forknum, blkno,
+ buf = XLogReadBufferExtended(rlocator, forknum, blkno,
RBM_NORMAL_NO_LOG,
InvalidBuffer);
if (!BufferIsValid(buf))
@@ -2377,7 +2379,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
{
elog(FATAL,
"inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forknum, blkno);
}
}
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 4851669..42a0f51 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -67,7 +67,7 @@ HotStandbyState standbyState = STANDBY_DISABLED;
*/
typedef struct xl_invalid_page_key
{
- RelFileNode node; /* the relation */
+ RelFileLocator locator; /* the relation */
ForkNumber forkno; /* the fork number */
BlockNumber blkno; /* the page */
} xl_invalid_page_key;
@@ -86,10 +86,10 @@ static int read_local_xlog_page_guts(XLogReaderState *state, XLogRecPtr targetPa
/* Report a reference to an invalid page */
static void
-report_invalid_page(int elevel, RelFileNode node, ForkNumber forkno,
+report_invalid_page(int elevel, RelFileLocator locator, ForkNumber forkno,
BlockNumber blkno, bool present)
{
- char *path = relpathperm(node, forkno);
+ char *path = relpathperm(locator, forkno);
if (present)
elog(elevel, "page %u of relation %s is uninitialized",
@@ -102,7 +102,7 @@ report_invalid_page(int elevel, RelFileNode node, ForkNumber forkno,
/* Log a reference to an invalid page */
static void
-log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
+log_invalid_page(RelFileLocator locator, ForkNumber forkno, BlockNumber blkno,
bool present)
{
xl_invalid_page_key key;
@@ -119,7 +119,7 @@ log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
*/
if (reachedConsistency)
{
- report_invalid_page(WARNING, node, forkno, blkno, present);
+ report_invalid_page(WARNING, locator, forkno, blkno, present);
elog(ignore_invalid_pages ? WARNING : PANIC,
"WAL contains references to invalid pages");
}
@@ -130,7 +130,7 @@ log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
* something about the XLOG record that generated the reference).
*/
if (message_level_is_interesting(DEBUG1))
- report_invalid_page(DEBUG1, node, forkno, blkno, present);
+ report_invalid_page(DEBUG1, locator, forkno, blkno, present);
if (invalid_page_tab == NULL)
{
@@ -147,7 +147,7 @@ log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
}
/* we currently assume xl_invalid_page_key contains no padding */
- key.node = node;
+ key.locator = locator;
key.forkno = forkno;
key.blkno = blkno;
hentry = (xl_invalid_page *)
@@ -166,7 +166,8 @@ log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
/* Forget any invalid pages >= minblkno, because they've been dropped */
static void
-forget_invalid_pages(RelFileNode node, ForkNumber forkno, BlockNumber minblkno)
+forget_invalid_pages(RelFileLocator locator, ForkNumber forkno,
+ BlockNumber minblkno)
{
HASH_SEQ_STATUS status;
xl_invalid_page *hentry;
@@ -178,13 +179,13 @@ forget_invalid_pages(RelFileNode node, ForkNumber forkno, BlockNumber minblkno)
while ((hentry = (xl_invalid_page *) hash_seq_search(&status)) != NULL)
{
- if (RelFileNodeEquals(hentry->key.node, node) &&
+ if (RelFileLocatorEquals(hentry->key.locator, locator) &&
hentry->key.forkno == forkno &&
hentry->key.blkno >= minblkno)
{
if (message_level_is_interesting(DEBUG2))
{
- char *path = relpathperm(hentry->key.node, forkno);
+ char *path = relpathperm(hentry->key.locator, forkno);
elog(DEBUG2, "page %u of relation %s has been dropped",
hentry->key.blkno, path);
@@ -213,11 +214,11 @@ forget_invalid_pages_db(Oid dbid)
while ((hentry = (xl_invalid_page *) hash_seq_search(&status)) != NULL)
{
- if (hentry->key.node.dbNode == dbid)
+ if (hentry->key.locator.dbOid == dbid)
{
if (message_level_is_interesting(DEBUG2))
{
- char *path = relpathperm(hentry->key.node, hentry->key.forkno);
+ char *path = relpathperm(hentry->key.locator, hentry->key.forkno);
elog(DEBUG2, "page %u of relation %s has been dropped",
hentry->key.blkno, path);
@@ -261,7 +262,7 @@ XLogCheckInvalidPages(void)
*/
while ((hentry = (xl_invalid_page *) hash_seq_search(&status)) != NULL)
{
- report_invalid_page(WARNING, hentry->key.node, hentry->key.forkno,
+ report_invalid_page(WARNING, hentry->key.locator, hentry->key.forkno,
hentry->key.blkno, hentry->present);
foundone = true;
}
@@ -356,7 +357,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
Buffer *buf)
{
XLogRecPtr lsn = record->EndRecPtr;
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blkno;
Buffer prefetch_buffer;
@@ -364,7 +365,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
bool zeromode;
bool willinit;
- if (!XLogRecGetBlockTagExtended(record, block_id, &rnode, &forknum, &blkno,
+ if (!XLogRecGetBlockTagExtended(record, block_id, &rlocator, &forknum, &blkno,
&prefetch_buffer))
{
/* Caller specified a bogus block_id */
@@ -387,7 +388,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
if (XLogRecBlockImageApply(record, block_id))
{
Assert(XLogRecHasBlockImage(record, block_id));
- *buf = XLogReadBufferExtended(rnode, forknum, blkno,
+ *buf = XLogReadBufferExtended(rlocator, forknum, blkno,
get_cleanup_lock ? RBM_ZERO_AND_CLEANUP_LOCK : RBM_ZERO_AND_LOCK,
prefetch_buffer);
page = BufferGetPage(*buf);
@@ -418,7 +419,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
}
else
{
- *buf = XLogReadBufferExtended(rnode, forknum, blkno, mode, prefetch_buffer);
+ *buf = XLogReadBufferExtended(rlocator, forknum, blkno, mode, prefetch_buffer);
if (BufferIsValid(*buf))
{
if (mode != RBM_ZERO_AND_LOCK && mode != RBM_ZERO_AND_CLEANUP_LOCK)
@@ -468,7 +469,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
* they will be invisible to tools that need to know which pages are modified.
*/
Buffer
-XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
+XLogReadBufferExtended(RelFileLocator rlocator, ForkNumber forknum,
BlockNumber blkno, ReadBufferMode mode,
Buffer recent_buffer)
{
@@ -481,14 +482,14 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
/* Do we have a clue where the buffer might be already? */
if (BufferIsValid(recent_buffer) &&
mode == RBM_NORMAL &&
- ReadRecentBuffer(rnode, forknum, blkno, recent_buffer))
+ ReadRecentBuffer(rlocator, forknum, blkno, recent_buffer))
{
buffer = recent_buffer;
goto recent_buffer_fast_path;
}
/* Open the relation at smgr level */
- smgr = smgropen(rnode, InvalidBackendId);
+ smgr = smgropen(rlocator, InvalidBackendId);
/*
* Create the target file if it doesn't already exist. This lets us cope
@@ -505,7 +506,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
if (blkno < lastblock)
{
/* page exists in file */
- buffer = ReadBufferWithoutRelcache(rnode, forknum, blkno,
+ buffer = ReadBufferWithoutRelcache(rlocator, forknum, blkno,
mode, NULL, true);
}
else
@@ -513,7 +514,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
/* hm, page doesn't exist in file */
if (mode == RBM_NORMAL)
{
- log_invalid_page(rnode, forknum, blkno, false);
+ log_invalid_page(rlocator, forknum, blkno, false);
return InvalidBuffer;
}
if (mode == RBM_NORMAL_NO_LOG)
@@ -530,7 +531,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
ReleaseBuffer(buffer);
}
- buffer = ReadBufferWithoutRelcache(rnode, forknum,
+ buffer = ReadBufferWithoutRelcache(rlocator, forknum,
P_NEW, mode, NULL, true);
}
while (BufferGetBlockNumber(buffer) < blkno);
@@ -540,7 +541,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
if (mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK)
LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
ReleaseBuffer(buffer);
- buffer = ReadBufferWithoutRelcache(rnode, forknum, blkno,
+ buffer = ReadBufferWithoutRelcache(rlocator, forknum, blkno,
mode, NULL, true);
}
}
@@ -559,7 +560,7 @@ recent_buffer_fast_path:
if (PageIsNew(page))
{
ReleaseBuffer(buffer);
- log_invalid_page(rnode, forknum, blkno, true);
+ log_invalid_page(rlocator, forknum, blkno, true);
return InvalidBuffer;
}
}
@@ -594,7 +595,7 @@ typedef FakeRelCacheEntryData *FakeRelCacheEntry;
* Caller must free the returned entry with FreeFakeRelcacheEntry().
*/
Relation
-CreateFakeRelcacheEntry(RelFileNode rnode)
+CreateFakeRelcacheEntry(RelFileLocator rlocator)
{
FakeRelCacheEntry fakeentry;
Relation rel;
@@ -604,7 +605,7 @@ CreateFakeRelcacheEntry(RelFileNode rnode)
rel = (Relation) fakeentry;
rel->rd_rel = &fakeentry->pgc;
- rel->rd_node = rnode;
+ rel->rd_locator = rlocator;
/*
* We will never be working with temp rels during recovery or while
@@ -615,18 +616,18 @@ CreateFakeRelcacheEntry(RelFileNode rnode)
/* It must be a permanent table here */
rel->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
- /* We don't know the name of the relation; use relfilenode instead */
- sprintf(RelationGetRelationName(rel), "%u", rnode.relNode);
+ /* We don't know the name of the relation; use relfilelocator instead */
+ sprintf(RelationGetRelationName(rel), "%u", rlocator.relNumber);
/*
* We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNode may be
+ * relation. Note that this is fairly bogus since relNumber may be
* different from the relation's OID. It shouldn't really matter though.
* In recovery, we are running by ourselves and can't have any lock
* conflicts. While syncing, we already hold AccessExclusiveLock.
*/
- rel->rd_lockInfo.lockRelId.dbId = rnode.dbNode;
- rel->rd_lockInfo.lockRelId.relId = rnode.relNode;
+ rel->rd_lockInfo.lockRelId.dbId = rlocator.dbOid;
+ rel->rd_lockInfo.lockRelId.relId = rlocator.relNumber;
rel->rd_smgr = NULL;
@@ -652,9 +653,9 @@ FreeFakeRelcacheEntry(Relation fakerel)
* any open "invalid-page" records for the relation.
*/
void
-XLogDropRelation(RelFileNode rnode, ForkNumber forknum)
+XLogDropRelation(RelFileLocator rlocator, ForkNumber forknum)
{
- forget_invalid_pages(rnode, forknum, 0);
+ forget_invalid_pages(rlocator, forknum, 0);
}
/*
@@ -682,10 +683,10 @@ XLogDropDatabase(Oid dbid)
* We need to clean up any open "invalid-page" records for the dropped pages.
*/
void
-XLogTruncateRelation(RelFileNode rnode, ForkNumber forkNum,
+XLogTruncateRelation(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber nblocks)
{
- forget_invalid_pages(rnode, forkNum, nblocks);
+ forget_invalid_pages(rlocator, forkNum, nblocks);
}
/*
diff --git a/src/backend/bootstrap/bootparse.y b/src/backend/bootstrap/bootparse.y
index e5cf1b3..a872199 100644
--- a/src/backend/bootstrap/bootparse.y
+++ b/src/backend/bootstrap/bootparse.y
@@ -287,9 +287,9 @@ Boot_DeclareIndexStmt:
stmt->excludeOpNames = NIL;
stmt->idxcomment = NULL;
stmt->indexOid = InvalidOid;
- stmt->oldNode = InvalidOid;
+ stmt->oldNumber = InvalidRelFileNumber;
stmt->oldCreateSubid = InvalidSubTransactionId;
- stmt->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ stmt->oldFirstRelfilenumberSubid = InvalidSubTransactionId;
stmt->unique = false;
stmt->primary = false;
stmt->isconstraint = false;
@@ -339,9 +339,9 @@ Boot_DeclareUniqueIndexStmt:
stmt->excludeOpNames = NIL;
stmt->idxcomment = NULL;
stmt->indexOid = InvalidOid;
- stmt->oldNode = InvalidOid;
+ stmt->oldNumber = InvalidRelFileNumber;
stmt->oldCreateSubid = InvalidSubTransactionId;
- stmt->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ stmt->oldFirstRelfilenumberSubid = InvalidSubTransactionId;
stmt->unique = true;
stmt->primary = false;
stmt->isconstraint = false;
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index e784538..2a33273 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -481,14 +481,14 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
}
/*
- * GetNewRelFileNode
- * Generate a new relfilenode number that is unique within the
+ * GetNewRelFileNumber
+ * Generate a new relfilenumber that is unique within the
* database of the given tablespace.
*
- * If the relfilenode will also be used as the relation's OID, pass the
+ * If the relfilenumber will also be used as the relation's OID, pass the
* opened pg_class catalog, and this routine will guarantee that the result
* is also an unused OID within pg_class. If the result is to be used only
- * as a relfilenode for an existing relation, pass NULL for pg_class.
+ * as a relfilenumber for an existing relation, pass NULL for pg_class.
*
* As with GetNewOidWithIndex(), there is some theoretical risk of a race
* condition, but it doesn't seem worth worrying about.
@@ -496,17 +496,17 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
* Note: we don't support using this in bootstrap mode. All relations
* created by bootstrap have preassigned OIDs, so there's no need.
*/
-Oid
-GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
+RelFileNumber
+GetNewRelFileNumber(Oid reltablespace, Relation pg_class, char relpersistence)
{
- RelFileNodeBackend rnode;
+ RelFileLocatorBackend rlocator;
char *rpath;
bool collides;
BackendId backend;
/*
* If we ever get here during pg_upgrade, there's something wrong; all
- * relfilenode assignments during a binary-upgrade run should be
+ * relfilenumber assignments during a binary-upgrade run should be
* determined by commands in the dump script.
*/
Assert(!IsBinaryUpgrade);
@@ -526,15 +526,15 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
}
/* This logic should match RelationInitPhysicalAddr */
- rnode.node.spcNode = reltablespace ? reltablespace : MyDatabaseTableSpace;
- rnode.node.dbNode = (rnode.node.spcNode == GLOBALTABLESPACE_OID) ? InvalidOid : MyDatabaseId;
+ rlocator.locator.spcOid = reltablespace ? reltablespace : MyDatabaseTableSpace;
+ rlocator.locator.dbOid = (rlocator.locator.spcOid == GLOBALTABLESPACE_OID) ? InvalidOid : MyDatabaseId;
/*
* The relpath will vary based on the backend ID, so we must initialize
* that properly here to make sure that any collisions based on filename
* are properly detected.
*/
- rnode.backend = backend;
+ rlocator.backend = backend;
do
{
@@ -542,13 +542,13 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
/* Generate the OID */
if (pg_class)
- rnode.node.relNode = GetNewOidWithIndex(pg_class, ClassOidIndexId,
+ rlocator.locator.relNumber = GetNewOidWithIndex(pg_class, ClassOidIndexId,
Anum_pg_class_oid);
else
- rnode.node.relNode = GetNewObjectId();
+ rlocator.locator.relNumber = GetNewObjectId();
/* Check for existing file of same name */
- rpath = relpath(rnode, MAIN_FORKNUM);
+ rpath = relpath(rlocator, MAIN_FORKNUM);
if (access(rpath, F_OK) == 0)
{
@@ -570,7 +570,7 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
pfree(rpath);
} while (collides);
- return rnode.node.relNode;
+ return rlocator.locator.relNumber;
}
/*
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 1803194..c69c923 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -77,9 +77,11 @@
/* Potentially set by pg_upgrade_support functions */
Oid binary_upgrade_next_heap_pg_class_oid = InvalidOid;
-Oid binary_upgrade_next_heap_pg_class_relfilenode = InvalidOid;
Oid binary_upgrade_next_toast_pg_class_oid = InvalidOid;
-Oid binary_upgrade_next_toast_pg_class_relfilenode = InvalidOid;
+RelFileNumber binary_upgrade_next_heap_pg_class_relfilenumber =
+ InvalidRelFileNumber;
+RelFileNumber binary_upgrade_next_toast_pg_class_relfilenumber =
+ InvalidRelFileNumber;
static void AddNewRelationTuple(Relation pg_class_desc,
Relation new_rel_desc,
@@ -273,7 +275,7 @@ SystemAttributeByName(const char *attname)
* heap_create - Create an uncataloged heap relation
*
* Note API change: the caller must now always provide the OID
- * to use for the relation. The relfilenode may be (and in
+ * to use for the relation. The relfilenumber may be (and in
* the simplest cases is) left unspecified.
*
* create_storage indicates whether or not to create the storage.
@@ -289,7 +291,7 @@ heap_create(const char *relname,
Oid relnamespace,
Oid reltablespace,
Oid relid,
- Oid relfilenode,
+ RelFileNumber relfilenumber,
Oid accessmtd,
TupleDesc tupDesc,
char relkind,
@@ -341,11 +343,11 @@ heap_create(const char *relname,
else
{
/*
- * If relfilenode is unspecified by the caller then create storage
+ * If relfilenumber is unspecified by the caller then create storage
* with oid same as relid.
*/
- if (!OidIsValid(relfilenode))
- relfilenode = relid;
+ if (!RelFileNumberIsValid(relfilenumber))
+ relfilenumber = relid;
}
/*
@@ -368,7 +370,7 @@ heap_create(const char *relname,
tupDesc,
relid,
accessmtd,
- relfilenode,
+ relfilenumber,
reltablespace,
shared_relation,
mapped_relation,
@@ -385,11 +387,11 @@ heap_create(const char *relname,
if (create_storage)
{
if (RELKIND_HAS_TABLE_AM(rel->rd_rel->relkind))
- table_relation_set_new_filenode(rel, &rel->rd_node,
- relpersistence,
- relfrozenxid, relminmxid);
+ table_relation_set_new_filelocator(rel, &rel->rd_locator,
+ relpersistence,
+ relfrozenxid, relminmxid);
else if (RELKIND_HAS_STORAGE(rel->rd_rel->relkind))
- RelationCreateStorage(rel->rd_node, relpersistence, true);
+ RelationCreateStorage(rel->rd_locator, relpersistence, true);
else
Assert(false);
}
@@ -1069,7 +1071,7 @@ AddNewRelationType(const char *typeName,
* relkind: relkind for new rel
* relpersistence: rel's persistence status (permanent, temp, or unlogged)
* shared_relation: true if it's to be a shared relation
- * mapped_relation: true if the relation will use the relfilenode map
+ * mapped_relation: true if the relation will use the relfilenumber map
* oncommit: ON COMMIT marking (only relevant if it's a temp table)
* reloptions: reloptions in Datum form, or (Datum) 0 if none
* use_user_acl: true if should look for user-defined default permissions;
@@ -1115,7 +1117,7 @@ heap_create_with_catalog(const char *relname,
Oid new_type_oid;
/* By default set to InvalidOid unless overridden by binary-upgrade */
- Oid relfilenode = InvalidOid;
+ RelFileNumber relfilenumber = InvalidRelFileNumber;
TransactionId relfrozenxid;
MultiXactId relminmxid;
@@ -1173,12 +1175,12 @@ heap_create_with_catalog(const char *relname,
/*
* Allocate an OID for the relation, unless we were told what to use.
*
- * The OID will be the relfilenode as well, so make sure it doesn't
+ * The OID will be the relfilenumber as well, so make sure it doesn't
* collide with either pg_class OIDs or existing physical files.
*/
if (!OidIsValid(relid))
{
- /* Use binary-upgrade override for pg_class.oid and relfilenode */
+ /* Use binary-upgrade override for pg_class.oid and relfilenumber */
if (IsBinaryUpgrade)
{
/*
@@ -1196,13 +1198,13 @@ heap_create_with_catalog(const char *relname,
relid = binary_upgrade_next_toast_pg_class_oid;
binary_upgrade_next_toast_pg_class_oid = InvalidOid;
- if (!OidIsValid(binary_upgrade_next_toast_pg_class_relfilenode))
+ if (!RelFileNumberIsValid(binary_upgrade_next_toast_pg_class_relfilenumber))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("toast relfilenode value not set when in binary upgrade mode")));
+ errmsg("toast relfilenumber value not set when in binary upgrade mode")));
- relfilenode = binary_upgrade_next_toast_pg_class_relfilenode;
- binary_upgrade_next_toast_pg_class_relfilenode = InvalidOid;
+ relfilenumber = binary_upgrade_next_toast_pg_class_relfilenumber;
+ binary_upgrade_next_toast_pg_class_relfilenumber = InvalidRelFileNumber;
}
}
else
@@ -1217,20 +1219,20 @@ heap_create_with_catalog(const char *relname,
if (RELKIND_HAS_STORAGE(relkind))
{
- if (!OidIsValid(binary_upgrade_next_heap_pg_class_relfilenode))
+ if (!RelFileNumberIsValid(binary_upgrade_next_heap_pg_class_relfilenumber))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("relfilenode value not set when in binary upgrade mode")));
+ errmsg("relfilenumber value not set when in binary upgrade mode")));
- relfilenode = binary_upgrade_next_heap_pg_class_relfilenode;
- binary_upgrade_next_heap_pg_class_relfilenode = InvalidOid;
+ relfilenumber = binary_upgrade_next_heap_pg_class_relfilenumber;
+ binary_upgrade_next_heap_pg_class_relfilenumber = InvalidRelFileNumber;
}
}
}
if (!OidIsValid(relid))
- relid = GetNewRelFileNode(reltablespace, pg_class_desc,
- relpersistence);
+ relid = GetNewRelFileNumber(reltablespace, pg_class_desc,
+ relpersistence);
}
/*
@@ -1273,7 +1275,7 @@ heap_create_with_catalog(const char *relname,
relnamespace,
reltablespace,
relid,
- relfilenode,
+ relfilenumber,
accessmtd,
tupdesc,
relkind,
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index bdd3c34..f245df8 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -87,7 +87,8 @@
/* Potentially set by pg_upgrade_support functions */
Oid binary_upgrade_next_index_pg_class_oid = InvalidOid;
-Oid binary_upgrade_next_index_pg_class_relfilenode = InvalidOid;
+RelFileNumber binary_upgrade_next_index_pg_class_relfilenumber =
+ InvalidRelFileNumber;
/*
* Pointer-free representation of variables used when reindexing system
@@ -662,7 +663,7 @@ UpdateIndexRelation(Oid indexoid,
* parent index; otherwise InvalidOid.
* parentConstraintId: if creating a constraint on a partition, the OID
* of the constraint in the parent; otherwise InvalidOid.
- * relFileNode: normally, pass InvalidOid to get new storage. May be
+ * relFileNumber: normally, pass InvalidRelFileNumber to get new storage. May be
* nonzero to attach an existing valid build.
* indexInfo: same info executor uses to insert into the index
* indexColNames: column names to use for index (List of char *)
@@ -703,7 +704,7 @@ index_create(Relation heapRelation,
Oid indexRelationId,
Oid parentIndexRelid,
Oid parentConstraintId,
- Oid relFileNode,
+ RelFileNumber relFileNumber,
IndexInfo *indexInfo,
List *indexColNames,
Oid accessMethodObjectId,
@@ -735,7 +736,7 @@ index_create(Relation heapRelation,
char relkind;
TransactionId relfrozenxid;
MultiXactId relminmxid;
- bool create_storage = !OidIsValid(relFileNode);
+ bool create_storage = !RelFileNumberIsValid(relFileNumber);
/* constraint flags can only be set when a constraint is requested */
Assert((constr_flags == 0) ||
@@ -751,7 +752,7 @@ index_create(Relation heapRelation,
/*
* The index will be in the same namespace as its parent table, and is
* shared across databases if and only if the parent is. Likewise, it
- * will use the relfilenode map if and only if the parent does; and it
+ * will use the relfilenumber map if and only if the parent does; and it
* inherits the parent's relpersistence.
*/
namespaceId = RelationGetNamespace(heapRelation);
@@ -902,12 +903,12 @@ index_create(Relation heapRelation,
/*
* Allocate an OID for the index, unless we were told what to use.
*
- * The OID will be the relfilenode as well, so make sure it doesn't
+ * The OID will be the relfilenumber as well, so make sure it doesn't
* collide with either pg_class OIDs or existing physical files.
*/
if (!OidIsValid(indexRelationId))
{
- /* Use binary-upgrade override for pg_class.oid and relfilenode */
+ /* Use binary-upgrade override for pg_class.oid and relfilenumber */
if (IsBinaryUpgrade)
{
if (!OidIsValid(binary_upgrade_next_index_pg_class_oid))
@@ -918,14 +919,14 @@ index_create(Relation heapRelation,
indexRelationId = binary_upgrade_next_index_pg_class_oid;
binary_upgrade_next_index_pg_class_oid = InvalidOid;
- /* Override the index relfilenode */
+ /* Override the index relfilenumber */
if ((relkind == RELKIND_INDEX) &&
- (!OidIsValid(binary_upgrade_next_index_pg_class_relfilenode)))
+ (!RelFileNumberIsValid(binary_upgrade_next_index_pg_class_relfilenumber)))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("index relfilenode value not set when in binary upgrade mode")));
- relFileNode = binary_upgrade_next_index_pg_class_relfilenode;
- binary_upgrade_next_index_pg_class_relfilenode = InvalidOid;
+ errmsg("index relfilenumber value not set when in binary upgrade mode")));
+ relFileNumber = binary_upgrade_next_index_pg_class_relfilenumber;
+ binary_upgrade_next_index_pg_class_relfilenumber = InvalidRelFileNumber;
/*
* Note that we want create_storage = true for binary upgrade. The
@@ -937,7 +938,7 @@ index_create(Relation heapRelation,
else
{
indexRelationId =
- GetNewRelFileNode(tableSpaceId, pg_class, relpersistence);
+ GetNewRelFileNumber(tableSpaceId, pg_class, relpersistence);
}
}
@@ -950,7 +951,7 @@ index_create(Relation heapRelation,
namespaceId,
tableSpaceId,
indexRelationId,
- relFileNode,
+ relFileNumber,
accessMethodObjectId,
indexTupDesc,
relkind,
@@ -1408,7 +1409,7 @@ index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId,
InvalidOid, /* indexRelationId */
InvalidOid, /* parentIndexRelid */
InvalidOid, /* parentConstraintId */
- InvalidOid, /* relFileNode */
+ InvalidRelFileNumber, /* relFileNumber */
newInfo,
indexColNames,
indexRelation->rd_rel->relam,
@@ -3024,7 +3025,7 @@ index_build(Relation heapRelation,
* it -- but we must first check whether one already exists. If, for
* example, an unlogged relation is truncated in the transaction that
* created it, or truncated twice in a subsequent transaction, the
- * relfilenode won't change, and nothing needs to be done here.
+ * relfilenumber won't change, and nothing needs to be done here.
*/
if (indexRelation->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
!smgrexists(RelationGetSmgr(indexRelation), INIT_FORKNUM))
@@ -3681,7 +3682,7 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
* Schedule unlinking of the old index storage at transaction commit.
*/
RelationDropStorage(iRel);
- RelationAssumeNewRelfilenode(iRel);
+ RelationAssumeNewRelfilelocator(iRel);
/* Make sure the reltablespace change is visible */
CommandCounterIncrement();
@@ -3711,7 +3712,7 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
SetReindexProcessing(heapId, indexId);
/* Create a new physical relation for the index */
- RelationSetNewRelfilenode(iRel, persistence);
+ RelationSetNewRelfilenumber(iRel, persistence);
/* Initialize the index and rebuild */
/* Note: we do not need to re-establish pkey setting */
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index c06e414..37dd2b9 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -38,7 +38,7 @@
int wal_skip_threshold = 2048; /* in kilobytes */
/*
- * We keep a list of all relations (represented as RelFileNode values)
+ * We keep a list of all relations (represented as RelFileLocator values)
* that have been created or deleted in the current transaction. When
* a relation is created, we create the physical file immediately, but
* remember it so that we can delete the file again if the current
@@ -59,7 +59,7 @@ int wal_skip_threshold = 2048; /* in kilobytes */
typedef struct PendingRelDelete
{
- RelFileNode relnode; /* relation that may need to be deleted */
+ RelFileLocator rlocator; /* relation that may need to be deleted */
BackendId backend; /* InvalidBackendId if not a temp rel */
bool atCommit; /* T=delete at commit; F=delete at abort */
int nestLevel; /* xact nesting level of request */
@@ -68,7 +68,7 @@ typedef struct PendingRelDelete
typedef struct PendingRelSync
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
bool is_truncated; /* Has the file experienced truncation? */
} PendingRelSync;
@@ -81,7 +81,7 @@ static HTAB *pendingSyncHash = NULL;
* Queue an at-commit fsync.
*/
static void
-AddPendingSync(const RelFileNode *rnode)
+AddPendingSync(const RelFileLocator *rlocator)
{
PendingRelSync *pending;
bool found;
@@ -91,14 +91,14 @@ AddPendingSync(const RelFileNode *rnode)
{
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileNode);
+ ctl.keysize = sizeof(RelFileLocator);
ctl.entrysize = sizeof(PendingRelSync);
ctl.hcxt = TopTransactionContext;
pendingSyncHash = hash_create("pending sync hash", 16, &ctl,
HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
}
- pending = hash_search(pendingSyncHash, rnode, HASH_ENTER, &found);
+ pending = hash_search(pendingSyncHash, rlocator, HASH_ENTER, &found);
Assert(!found);
pending->is_truncated = false;
}
@@ -117,7 +117,7 @@ AddPendingSync(const RelFileNode *rnode)
* pass register_delete = false.
*/
SMgrRelation
-RelationCreateStorage(RelFileNode rnode, char relpersistence,
+RelationCreateStorage(RelFileLocator rlocator, char relpersistence,
bool register_delete)
{
SMgrRelation srel;
@@ -145,11 +145,11 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence,
return NULL; /* placate compiler */
}
- srel = smgropen(rnode, backend);
+ srel = smgropen(rlocator, backend);
smgrcreate(srel, MAIN_FORKNUM, false);
if (needs_wal)
- log_smgrcreate(&srel->smgr_rnode.node, MAIN_FORKNUM);
+ log_smgrcreate(&srel->smgr_rlocator.locator, MAIN_FORKNUM);
/*
* Add the relation to the list of stuff to delete at abort, if we are
@@ -161,7 +161,7 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence,
pending = (PendingRelDelete *)
MemoryContextAlloc(TopMemoryContext, sizeof(PendingRelDelete));
- pending->relnode = rnode;
+ pending->rlocator = rlocator;
pending->backend = backend;
pending->atCommit = false; /* delete if abort */
pending->nestLevel = GetCurrentTransactionNestLevel();
@@ -172,7 +172,7 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence,
if (relpersistence == RELPERSISTENCE_PERMANENT && !XLogIsNeeded())
{
Assert(backend == InvalidBackendId);
- AddPendingSync(&rnode);
+ AddPendingSync(&rlocator);
}
return srel;
@@ -182,14 +182,14 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence,
* Perform XLogInsert of an XLOG_SMGR_CREATE record to WAL.
*/
void
-log_smgrcreate(const RelFileNode *rnode, ForkNumber forkNum)
+log_smgrcreate(const RelFileLocator *rlocator, ForkNumber forkNum)
{
xl_smgr_create xlrec;
/*
* Make an XLOG entry reporting the file creation.
*/
- xlrec.rnode = *rnode;
+ xlrec.rlocator = *rlocator;
xlrec.forkNum = forkNum;
XLogBeginInsert();
@@ -209,7 +209,7 @@ RelationDropStorage(Relation rel)
/* Add the relation to the list of stuff to delete at commit */
pending = (PendingRelDelete *)
MemoryContextAlloc(TopMemoryContext, sizeof(PendingRelDelete));
- pending->relnode = rel->rd_node;
+ pending->rlocator = rel->rd_locator;
pending->backend = rel->rd_backend;
pending->atCommit = true; /* delete if commit */
pending->nestLevel = GetCurrentTransactionNestLevel();
@@ -247,7 +247,7 @@ RelationDropStorage(Relation rel)
* No-op if the relation is not among those scheduled for deletion.
*/
void
-RelationPreserveStorage(RelFileNode rnode, bool atCommit)
+RelationPreserveStorage(RelFileLocator rlocator, bool atCommit)
{
PendingRelDelete *pending;
PendingRelDelete *prev;
@@ -257,7 +257,7 @@ RelationPreserveStorage(RelFileNode rnode, bool atCommit)
for (pending = pendingDeletes; pending != NULL; pending = next)
{
next = pending->next;
- if (RelFileNodeEquals(rnode, pending->relnode)
+ if (RelFileLocatorEquals(rlocator, pending->rlocator)
&& pending->atCommit == atCommit)
{
/* unlink and delete list entry */
@@ -369,7 +369,7 @@ RelationTruncate(Relation rel, BlockNumber nblocks)
xl_smgr_truncate xlrec;
xlrec.blkno = nblocks;
- xlrec.rnode = rel->rd_node;
+ xlrec.rlocator = rel->rd_locator;
xlrec.flags = SMGR_TRUNCATE_ALL;
XLogBeginInsert();
@@ -428,7 +428,7 @@ RelationPreTruncate(Relation rel)
return;
pending = hash_search(pendingSyncHash,
- &(RelationGetSmgr(rel)->smgr_rnode.node),
+ &(RelationGetSmgr(rel)->smgr_rlocator.locator),
HASH_FIND, NULL);
if (pending)
pending->is_truncated = true;
@@ -472,7 +472,7 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
* We need to log the copied data in WAL iff WAL archiving/streaming is
* enabled AND it's a permanent relation. This gives the same answer as
* "RelationNeedsWAL(rel) || copying_initfork", because we know the
- * current operation created a new relfilenode.
+ * current operation created a new relfilenumber.
*/
use_wal = XLogIsNeeded() &&
(relpersistence == RELPERSISTENCE_PERMANENT || copying_initfork);
@@ -496,8 +496,8 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
* (errcontext callbacks shouldn't be risking any such thing, but
* people have been known to forget that rule.)
*/
- char *relpath = relpathbackend(src->smgr_rnode.node,
- src->smgr_rnode.backend,
+ char *relpath = relpathbackend(src->smgr_rlocator.locator,
+ src->smgr_rlocator.backend,
forkNum);
ereport(ERROR,
@@ -512,7 +512,7 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
* space.
*/
if (use_wal)
- log_newpage(&dst->smgr_rnode.node, forkNum, blkno, page, false);
+ log_newpage(&dst->smgr_rlocator.locator, forkNum, blkno, page, false);
PageSetChecksumInplace(page, blkno);
@@ -538,19 +538,19 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
}
/*
- * RelFileNodeSkippingWAL
- * Check if a BM_PERMANENT relfilenode is using WAL.
+ * RelFileLocatorSkippingWAL
+ * Check if a BM_PERMANENT relfilelocator is using WAL.
*
- * Changes of certain relfilenodes must not write WAL; see "Skipping WAL for
- * New RelFileNode" in src/backend/access/transam/README. Though it is known
- * from Relation efficiently, this function is intended for the code paths not
- * having access to Relation.
+ * Changes of certain relfilelocators must not write WAL; see "Skipping WAL for
+ * New RelFileLocator" in src/backend/access/transam/README. Though it is
+ * known from Relation efficiently, this function is intended for the code
+ * paths not having access to Relation.
*/
bool
-RelFileNodeSkippingWAL(RelFileNode rnode)
+RelFileLocatorSkippingWAL(RelFileLocator rlocator)
{
if (!pendingSyncHash ||
- hash_search(pendingSyncHash, &rnode, HASH_FIND, NULL) == NULL)
+ hash_search(pendingSyncHash, &rlocator, HASH_FIND, NULL) == NULL)
return false;
return true;
@@ -566,7 +566,7 @@ EstimatePendingSyncsSpace(void)
long entries;
entries = pendingSyncHash ? hash_get_num_entries(pendingSyncHash) : 0;
- return mul_size(1 + entries, sizeof(RelFileNode));
+ return mul_size(1 + entries, sizeof(RelFileLocator));
}
/*
@@ -581,57 +581,58 @@ SerializePendingSyncs(Size maxSize, char *startAddress)
HASH_SEQ_STATUS scan;
PendingRelSync *sync;
PendingRelDelete *delete;
- RelFileNode *src;
- RelFileNode *dest = (RelFileNode *) startAddress;
+ RelFileLocator *src;
+ RelFileLocator *dest = (RelFileLocator *) startAddress;
if (!pendingSyncHash)
goto terminate;
- /* Create temporary hash to collect active relfilenodes */
- ctl.keysize = sizeof(RelFileNode);
- ctl.entrysize = sizeof(RelFileNode);
+ /* Create temporary hash to collect active relfilelocators */
+ ctl.keysize = sizeof(RelFileLocator);
+ ctl.entrysize = sizeof(RelFileLocator);
ctl.hcxt = CurrentMemoryContext;
- tmphash = hash_create("tmp relfilenodes",
+ tmphash = hash_create("tmp relfilelocators",
hash_get_num_entries(pendingSyncHash), &ctl,
HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
- /* collect all rnodes from pending syncs */
+ /* collect all rlocators from pending syncs */
hash_seq_init(&scan, pendingSyncHash);
while ((sync = (PendingRelSync *) hash_seq_search(&scan)))
- (void) hash_search(tmphash, &sync->rnode, HASH_ENTER, NULL);
+ (void) hash_search(tmphash, &sync->rlocator, HASH_ENTER, NULL);
- /* remove deleted rnodes */
+ /* remove deleted rlocators */
for (delete = pendingDeletes; delete != NULL; delete = delete->next)
if (delete->atCommit)
- (void) hash_search(tmphash, (void *) &delete->relnode,
+ (void) hash_search(tmphash, (void *) &delete->rlocator,
HASH_REMOVE, NULL);
hash_seq_init(&scan, tmphash);
- while ((src = (RelFileNode *) hash_seq_search(&scan)))
+ while ((src = (RelFileLocator *) hash_seq_search(&scan)))
*dest++ = *src;
hash_destroy(tmphash);
terminate:
- MemSet(dest, 0, sizeof(RelFileNode));
+ MemSet(dest, 0, sizeof(RelFileLocator));
}
/*
* RestorePendingSyncs
* Restore syncs within a parallel worker.
*
- * RelationNeedsWAL() and RelFileNodeSkippingWAL() must offer the correct
+ * RelationNeedsWAL() and RelFileLocatorSkippingWAL() must offer the correct
* answer to parallel workers. Only smgrDoPendingSyncs() reads the
* is_truncated field, at end of transaction. Hence, don't restore it.
*/
void
RestorePendingSyncs(char *startAddress)
{
- RelFileNode *rnode;
+ RelFileLocator *rlocator;
Assert(pendingSyncHash == NULL);
- for (rnode = (RelFileNode *) startAddress; rnode->relNode != 0; rnode++)
- AddPendingSync(rnode);
+ for (rlocator = (RelFileLocator *) startAddress; rlocator->relNumber != 0;
+ rlocator++)
+ AddPendingSync(rlocator);
}
/*
@@ -677,7 +678,7 @@ smgrDoPendingDeletes(bool isCommit)
{
SMgrRelation srel;
- srel = smgropen(pending->relnode, pending->backend);
+ srel = smgropen(pending->rlocator, pending->backend);
/* allocate the initial array, or extend it, if needed */
if (maxrels == 0)
@@ -747,7 +748,7 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
/* Skip syncing nodes that smgrDoPendingDeletes() will delete. */
for (pending = pendingDeletes; pending != NULL; pending = pending->next)
if (pending->atCommit)
- (void) hash_search(pendingSyncHash, (void *) &pending->relnode,
+ (void) hash_search(pendingSyncHash, (void *) &pending->rlocator,
HASH_REMOVE, NULL);
hash_seq_init(&scan, pendingSyncHash);
@@ -758,7 +759,7 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
BlockNumber total_blocks = 0;
SMgrRelation srel;
- srel = smgropen(pendingsync->rnode, InvalidBackendId);
+ srel = smgropen(pendingsync->rlocator, InvalidBackendId);
/*
* We emit newpage WAL records for smaller relations.
@@ -832,7 +833,7 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
* page including any unused space. ReadBufferExtended()
* counts some pgstat events; unfortunately, we discard them.
*/
- rel = CreateFakeRelcacheEntry(srel->smgr_rnode.node);
+ rel = CreateFakeRelcacheEntry(srel->smgr_rlocator.locator);
log_newpage_range(rel, fork, 0, n, false);
FreeFakeRelcacheEntry(rel);
}
@@ -852,7 +853,7 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
* smgrGetPendingDeletes() -- Get a list of non-temp relations to be deleted.
*
* The return value is the number of relations scheduled for termination.
- * *ptr is set to point to a freshly-palloc'd array of RelFileNodes.
+ * *ptr is set to point to a freshly-palloc'd array of RelFileLocators.
* If there are no relations to be deleted, *ptr is set to NULL.
*
* Only non-temporary relations are included in the returned list. This is OK
@@ -866,11 +867,11 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
* by upper-level transactions.
*/
int
-smgrGetPendingDeletes(bool forCommit, RelFileNode **ptr)
+smgrGetPendingDeletes(bool forCommit, RelFileLocator **ptr)
{
int nestLevel = GetCurrentTransactionNestLevel();
int nrels;
- RelFileNode *rptr;
+ RelFileLocator *rptr;
PendingRelDelete *pending;
nrels = 0;
@@ -885,14 +886,14 @@ smgrGetPendingDeletes(bool forCommit, RelFileNode **ptr)
*ptr = NULL;
return 0;
}
- rptr = (RelFileNode *) palloc(nrels * sizeof(RelFileNode));
+ rptr = (RelFileLocator *) palloc(nrels * sizeof(RelFileLocator));
*ptr = rptr;
for (pending = pendingDeletes; pending != NULL; pending = pending->next)
{
if (pending->nestLevel >= nestLevel && pending->atCommit == forCommit
&& pending->backend == InvalidBackendId)
{
- *rptr = pending->relnode;
+ *rptr = pending->rlocator;
rptr++;
}
}
@@ -967,7 +968,7 @@ smgr_redo(XLogReaderState *record)
xl_smgr_create *xlrec = (xl_smgr_create *) XLogRecGetData(record);
SMgrRelation reln;
- reln = smgropen(xlrec->rnode, InvalidBackendId);
+ reln = smgropen(xlrec->rlocator, InvalidBackendId);
smgrcreate(reln, xlrec->forkNum, true);
}
else if (info == XLOG_SMGR_TRUNCATE)
@@ -980,7 +981,7 @@ smgr_redo(XLogReaderState *record)
int nforks = 0;
bool need_fsm_vacuum = false;
- reln = smgropen(xlrec->rnode, InvalidBackendId);
+ reln = smgropen(xlrec->rlocator, InvalidBackendId);
/*
* Forcibly create relation if it doesn't exist (which suggests that
@@ -1015,11 +1016,11 @@ smgr_redo(XLogReaderState *record)
nforks++;
/* Also tell xlogutils.c about it */
- XLogTruncateRelation(xlrec->rnode, MAIN_FORKNUM, xlrec->blkno);
+ XLogTruncateRelation(xlrec->rlocator, MAIN_FORKNUM, xlrec->blkno);
}
/* Prepare for truncation of FSM and VM too */
- rel = CreateFakeRelcacheEntry(xlrec->rnode);
+ rel = CreateFakeRelcacheEntry(xlrec->rlocator);
if ((xlrec->flags & SMGR_TRUNCATE_FSM) != 0 &&
smgrexists(reln, FSM_FORKNUM))
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index cea2c8b..da137eb 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -293,7 +293,7 @@ cluster_multiple_rels(List *rtcs, ClusterParams *params)
* cluster_rel
*
* This clusters the table by creating a new, clustered table and
- * swapping the relfilenodes of the new table and the old table, so
+ * swapping the relfilenumbers of the new table and the old table, so
* the OID of the original table is preserved. Thus we do not lose
* GRANT, inheritance nor references to this table (this was a bug
* in releases through 7.3).
@@ -1025,8 +1025,8 @@ copy_table_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
/*
* Swap the physical files of two given relations.
*
- * We swap the physical identity (reltablespace, relfilenode) while keeping the
- * same logical identities of the two relations. relpersistence is also
+ * We swap the physical identity (reltablespace, relfilenumber) while keeping
+ * the same logical identities of the two relations. relpersistence is also
* swapped, which is critical since it determines where buffers live for each
* relation.
*
@@ -1061,9 +1061,9 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
reltup2;
Form_pg_class relform1,
relform2;
- Oid relfilenode1,
- relfilenode2;
- Oid swaptemp;
+ RelFileNumber relfilenumber1,
+ relfilenumber2;
+ RelFileNumber swaptemp;
char swptmpchr;
/* We need writable copies of both pg_class tuples. */
@@ -1079,13 +1079,14 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
elog(ERROR, "cache lookup failed for relation %u", r2);
relform2 = (Form_pg_class) GETSTRUCT(reltup2);
- relfilenode1 = relform1->relfilenode;
- relfilenode2 = relform2->relfilenode;
+ relfilenumber1 = relform1->relfilenode;
+ relfilenumber2 = relform2->relfilenode;
- if (OidIsValid(relfilenode1) && OidIsValid(relfilenode2))
+ if (RelFileNumberIsValid(relfilenumber1) &&
+ RelFileNumberIsValid(relfilenumber2))
{
/*
- * Normal non-mapped relations: swap relfilenodes, reltablespaces,
+ * Normal non-mapped relations: swap relfilenumbers, reltablespaces,
* relpersistence
*/
Assert(!target_is_pg_class);
@@ -1120,7 +1121,8 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
* Mapped-relation case. Here we have to swap the relation mappings
* instead of modifying the pg_class columns. Both must be mapped.
*/
- if (OidIsValid(relfilenode1) || OidIsValid(relfilenode2))
+ if (RelFileNumberIsValid(relfilenumber1) ||
+ RelFileNumberIsValid(relfilenumber2))
elog(ERROR, "cannot swap mapped relation \"%s\" with non-mapped relation",
NameStr(relform1->relname));
@@ -1148,12 +1150,12 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
/*
* Fetch the mappings --- shouldn't fail, but be paranoid
*/
- relfilenode1 = RelationMapOidToFilenode(r1, relform1->relisshared);
- if (!OidIsValid(relfilenode1))
+ relfilenumber1 = RelationMapOidToFilenumber(r1, relform1->relisshared);
+ if (!RelFileNumberIsValid(relfilenumber1))
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
NameStr(relform1->relname), r1);
- relfilenode2 = RelationMapOidToFilenode(r2, relform2->relisshared);
- if (!OidIsValid(relfilenode2))
+ relfilenumber2 = RelationMapOidToFilenumber(r2, relform2->relisshared);
+ if (!RelFileNumberIsValid(relfilenumber2))
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
NameStr(relform2->relname), r2);
@@ -1161,15 +1163,15 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
* Send replacement mappings to relmapper. Note these won't actually
* take effect until CommandCounterIncrement.
*/
- RelationMapUpdateMap(r1, relfilenode2, relform1->relisshared, false);
- RelationMapUpdateMap(r2, relfilenode1, relform2->relisshared, false);
+ RelationMapUpdateMap(r1, relfilenumber2, relform1->relisshared, false);
+ RelationMapUpdateMap(r2, relfilenumber1, relform2->relisshared, false);
/* Pass OIDs of mapped r2 tables back to caller */
*mapped_tables++ = r2;
}
/*
- * Recognize that rel1's relfilenode (swapped from rel2) is new in this
+ * Recognize that rel1's relfilenumber (swapped from rel2) is new in this
* subtransaction. The rel2 storage (swapped from rel1) may or may not be
* new.
*/
@@ -1180,9 +1182,9 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
rel1 = relation_open(r1, NoLock);
rel2 = relation_open(r2, NoLock);
rel2->rd_createSubid = rel1->rd_createSubid;
- rel2->rd_newRelfilenodeSubid = rel1->rd_newRelfilenodeSubid;
- rel2->rd_firstRelfilenodeSubid = rel1->rd_firstRelfilenodeSubid;
- RelationAssumeNewRelfilenode(rel1);
+ rel2->rd_newRelfilelocatorSubid = rel1->rd_newRelfilelocatorSubid;
+ rel2->rd_firstRelfilelocatorSubid = rel1->rd_firstRelfilelocatorSubid;
+ RelationAssumeNewRelfilelocator(rel1);
relation_close(rel1, NoLock);
relation_close(rel2, NoLock);
}
@@ -1523,7 +1525,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
table_close(relRelation, RowExclusiveLock);
}
- /* Destroy new heap with old filenode */
+ /* Destroy new heap with old filenumber */
object.classId = RelationRelationId;
object.objectId = OIDNewHeap;
object.objectSubId = 0;
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 35a1d3a..c985fea 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -593,11 +593,11 @@ CopyFrom(CopyFromState cstate)
*/
if (RELKIND_HAS_STORAGE(cstate->rel->rd_rel->relkind) &&
(cstate->rel->rd_createSubid != InvalidSubTransactionId ||
- cstate->rel->rd_firstRelfilenodeSubid != InvalidSubTransactionId))
+ cstate->rel->rd_firstRelfilelocatorSubid != InvalidSubTransactionId))
ti_options |= TABLE_INSERT_SKIP_FSM;
/*
- * Optimize if new relfilenode was created in this subxact or one of its
+ * Optimize if new relfilenumber was created in this subxact or one of its
* committed children and we won't see those rows later as part of an
* earlier scan or command. The subxact test ensures that if this subxact
* aborts then the frozen rows won't be visible after xact cleanup. Note
@@ -640,7 +640,7 @@ CopyFrom(CopyFromState cstate)
errmsg("cannot perform COPY FREEZE because of prior transaction activity")));
if (cstate->rel->rd_createSubid != GetCurrentSubTransactionId() &&
- cstate->rel->rd_newRelfilenodeSubid != GetCurrentSubTransactionId())
+ cstate->rel->rd_newRelfilelocatorSubid != GetCurrentSubTransactionId())
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("cannot perform COPY FREEZE because the table was not created or truncated in the current subtransaction")));
diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index f269168..ca2f884 100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -101,7 +101,7 @@ typedef struct
*/
typedef struct CreateDBRelInfo
{
- RelFileNode rnode; /* physical relation identifier */
+ RelFileLocator rlocator; /* physical relation identifier */
Oid reloid; /* relation oid */
bool permanent; /* relation is permanent or unlogged */
} CreateDBRelInfo;
@@ -127,7 +127,7 @@ static void CreateDatabaseUsingWalLog(Oid src_dboid, Oid dboid, Oid src_tsid,
static List *ScanSourceDatabasePgClass(Oid srctbid, Oid srcdbid, char *srcpath);
static List *ScanSourceDatabasePgClassPage(Page page, Buffer buf, Oid tbid,
Oid dbid, char *srcpath,
- List *rnodelist, Snapshot snapshot);
+ List *rlocatorlist, Snapshot snapshot);
static CreateDBRelInfo *ScanSourceDatabasePgClassTuple(HeapTupleData *tuple,
Oid tbid, Oid dbid,
char *srcpath);
@@ -147,12 +147,12 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
{
char *srcpath;
char *dstpath;
- List *rnodelist = NULL;
+ List *rlocatorlist = NULL;
ListCell *cell;
LockRelId srcrelid;
LockRelId dstrelid;
- RelFileNode srcrnode;
- RelFileNode dstrnode;
+ RelFileLocator srcrlocator;
+ RelFileLocator dstrlocator;
CreateDBRelInfo *relinfo;
/* Get source and destination database paths. */
@@ -165,9 +165,9 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
/* Copy relmap file from source database to the destination database. */
RelationMapCopy(dst_dboid, dst_tsid, srcpath, dstpath);
- /* Get list of relfilenodes to copy from the source database. */
- rnodelist = ScanSourceDatabasePgClass(src_tsid, src_dboid, srcpath);
- Assert(rnodelist != NIL);
+ /* Get list of relfilelocators to copy from the source database. */
+ rlocatorlist = ScanSourceDatabasePgClass(src_tsid, src_dboid, srcpath);
+ Assert(rlocatorlist != NIL);
/*
* Database IDs will be the same for all relations so set them before
@@ -176,11 +176,11 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
srcrelid.dbId = src_dboid;
dstrelid.dbId = dst_dboid;
- /* Loop over our list of relfilenodes and copy each one. */
- foreach(cell, rnodelist)
+ /* Loop over our list of relfilelocators and copy each one. */
+ foreach(cell, rlocatorlist)
{
relinfo = lfirst(cell);
- srcrnode = relinfo->rnode;
+ srcrlocator = relinfo->rlocator;
/*
* If the relation is from the source db's default tablespace then we
@@ -188,13 +188,13 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
* Otherwise, we need to create in the same tablespace as it is in the
* source database.
*/
- if (srcrnode.spcNode == src_tsid)
- dstrnode.spcNode = dst_tsid;
+ if (srcrlocator.spcOid == src_tsid)
+ dstrlocator.spcOid = dst_tsid;
else
- dstrnode.spcNode = srcrnode.spcNode;
+ dstrlocator.spcOid = srcrlocator.spcOid;
- dstrnode.dbNode = dst_dboid;
- dstrnode.relNode = srcrnode.relNode;
+ dstrlocator.dbOid = dst_dboid;
+ dstrlocator.relNumber = srcrlocator.relNumber;
/*
* Acquire locks on source and target relations before copying.
@@ -210,7 +210,7 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
LockRelationId(&dstrelid, AccessShareLock);
/* Copy relation storage from source to the destination. */
- CreateAndCopyRelationData(srcrnode, dstrnode, relinfo->permanent);
+ CreateAndCopyRelationData(srcrlocator, dstrlocator, relinfo->permanent);
/* Release the relation locks. */
UnlockRelationId(&srcrelid, AccessShareLock);
@@ -219,7 +219,7 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
pfree(srcpath);
pfree(dstpath);
- list_free_deep(rnodelist);
+ list_free_deep(rlocatorlist);
}
/*
@@ -246,31 +246,31 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
static List *
ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber nblocks;
BlockNumber blkno;
Buffer buf;
- Oid relfilenode;
+ RelFileNumber relfilenumber;
Page page;
- List *rnodelist = NIL;
+ List *rlocatorlist = NIL;
LockRelId relid;
Relation rel;
Snapshot snapshot;
BufferAccessStrategy bstrategy;
- /* Get pg_class relfilenode. */
- relfilenode = RelationMapOidToFilenodeForDatabase(srcpath,
- RelationRelationId);
+ /* Get pg_class relfilenumber. */
+ relfilenumber = RelationMapOidToFilenumberForDatabase(srcpath,
+ RelationRelationId);
/* Don't read data into shared_buffers without holding a relation lock. */
relid.dbId = dbid;
relid.relId = RelationRelationId;
LockRelationId(&relid, AccessShareLock);
- /* Prepare a RelFileNode for the pg_class relation. */
- rnode.spcNode = tbid;
- rnode.dbNode = dbid;
- rnode.relNode = relfilenode;
+ /* Prepare a RelFileLocator for the pg_class relation. */
+ rlocator.spcOid = tbid;
+ rlocator.dbOid = dbid;
+ rlocator.relNumber = relfilenumber;
/*
* We can't use a real relcache entry for a relation in some other
@@ -279,7 +279,7 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
* used the smgr layer directly, we would have to worry about
* invalidations.
*/
- rel = CreateFakeRelcacheEntry(rnode);
+ rel = CreateFakeRelcacheEntry(rlocator);
nblocks = smgrnblocks(RelationGetSmgr(rel), MAIN_FORKNUM);
FreeFakeRelcacheEntry(rel);
@@ -299,7 +299,7 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
{
CHECK_FOR_INTERRUPTS();
- buf = ReadBufferWithoutRelcache(rnode, MAIN_FORKNUM, blkno,
+ buf = ReadBufferWithoutRelcache(rlocator, MAIN_FORKNUM, blkno,
RBM_NORMAL, bstrategy, false);
LockBuffer(buf, BUFFER_LOCK_SHARE);
@@ -310,9 +310,9 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
continue;
}
- /* Append relevant pg_class tuples for current page to rnodelist. */
- rnodelist = ScanSourceDatabasePgClassPage(page, buf, tbid, dbid,
- srcpath, rnodelist,
+ /* Append relevant pg_class tuples for current page to rlocatorlist. */
+ rlocatorlist = ScanSourceDatabasePgClassPage(page, buf, tbid, dbid,
+ srcpath, rlocatorlist,
snapshot);
UnlockReleaseBuffer(buf);
@@ -321,16 +321,16 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
/* Release relation lock. */
UnlockRelationId(&relid, AccessShareLock);
- return rnodelist;
+ return rlocatorlist;
}
/*
* Scan one page of the source database's pg_class relation and add relevant
- * entries to rnodelist. The return value is the updated list.
+ * entries to rlocatorlist. The return value is the updated list.
*/
static List *
ScanSourceDatabasePgClassPage(Page page, Buffer buf, Oid tbid, Oid dbid,
- char *srcpath, List *rnodelist,
+ char *srcpath, List *rlocatorlist,
Snapshot snapshot)
{
BlockNumber blkno = BufferGetBlockNumber(buf);
@@ -376,11 +376,11 @@ ScanSourceDatabasePgClassPage(Page page, Buffer buf, Oid tbid, Oid dbid,
relinfo = ScanSourceDatabasePgClassTuple(&tuple, tbid, dbid,
srcpath);
if (relinfo != NULL)
- rnodelist = lappend(rnodelist, relinfo);
+ rlocatorlist = lappend(rlocatorlist, relinfo);
}
}
- return rnodelist;
+ return rlocatorlist;
}
/*
@@ -397,7 +397,7 @@ ScanSourceDatabasePgClassTuple(HeapTupleData *tuple, Oid tbid, Oid dbid,
{
CreateDBRelInfo *relinfo;
Form_pg_class classForm;
- Oid relfilenode = InvalidOid;
+ RelFileNumber relfilenumber = InvalidRelFileNumber;
classForm = (Form_pg_class) GETSTRUCT(tuple);
@@ -418,29 +418,29 @@ ScanSourceDatabasePgClassTuple(HeapTupleData *tuple, Oid tbid, Oid dbid,
return NULL;
/*
- * If relfilenode is valid then directly use it. Otherwise, consult the
+ * If relfilenumber is valid then directly use it. Otherwise, consult the
* relmap.
*/
if (OidIsValid(classForm->relfilenode))
- relfilenode = classForm->relfilenode;
+ relfilenumber = classForm->relfilenode;
else
- relfilenode = RelationMapOidToFilenodeForDatabase(srcpath,
- classForm->oid);
+ relfilenumber = RelationMapOidToFilenumberForDatabase(srcpath,
+ classForm->oid);
- /* We must have a valid relfilenode oid. */
- if (!OidIsValid(relfilenode))
- elog(ERROR, "relation with OID %u does not have a valid relfilenode",
+ /* We must have a valid relfilenumber. */
+ if (!RelFileNumberIsValid(relfilenumber))
+ elog(ERROR, "relation with OID %u does not have a valid relfilenumber",
classForm->oid);
/* Prepare a rel info element and add it to the list. */
relinfo = (CreateDBRelInfo *) palloc(sizeof(CreateDBRelInfo));
if (OidIsValid(classForm->reltablespace))
- relinfo->rnode.spcNode = classForm->reltablespace;
+ relinfo->rlocator.spcOid = classForm->reltablespace;
else
- relinfo->rnode.spcNode = tbid;
+ relinfo->rlocator.spcOid = tbid;
- relinfo->rnode.dbNode = dbid;
- relinfo->rnode.relNode = relfilenode;
+ relinfo->rlocator.dbOid = dbid;
+ relinfo->rlocator.relNumber = relfilenumber;
relinfo->reloid = classForm->oid;
/* Temporary relations were rejected above. */
@@ -2867,8 +2867,8 @@ remove_dbtablespaces(Oid db_id)
* try to remove that already-existing subdirectory during the cleanup in
* remove_dbtablespaces. Nuking existing files seems like a bad idea, so
* instead we make this extra check before settling on the OID of the new
- * database. This exactly parallels what GetNewRelFileNode() does for table
- * relfilenode values.
+ * database. This exactly parallels what GetNewRelFileNumber() does for table
+ * relfilenumber values.
*/
static bool
check_db_file_conflict(Oid db_id)
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 99f5ab8..7a827d4 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -1109,10 +1109,10 @@ DefineIndex(Oid relationId,
}
/*
- * A valid stmt->oldNode implies that we already have a built form of the
+ * A valid stmt->oldNumber implies that we already have a built form of the
* index. The caller should also decline any index build.
*/
- Assert(!OidIsValid(stmt->oldNode) || (skip_build && !concurrent));
+ Assert(!RelFileNumberIsValid(stmt->oldNumber) || (skip_build && !concurrent));
/*
* Make the catalog entries for the index, including constraints. This
@@ -1154,7 +1154,7 @@ DefineIndex(Oid relationId,
indexRelationId =
index_create(rel, indexRelationName, indexRelationId, parentIndexId,
parentConstraintId,
- stmt->oldNode, indexInfo, indexColNames,
+ stmt->oldNumber, indexInfo, indexColNames,
accessMethodId, tablespaceId,
collationObjectId, classObjectId,
coloptions, reloptions,
@@ -1361,15 +1361,15 @@ DefineIndex(Oid relationId,
* We can't use the same index name for the child index,
* so clear idxname to let the recursive invocation choose
* a new name. Likewise, the existing target relation
- * field is wrong, and if indexOid or oldNode are set,
+ * field is wrong, and if indexOid or oldNumber are set,
* they mustn't be applied to the child either.
*/
childStmt->idxname = NULL;
childStmt->relation = NULL;
childStmt->indexOid = InvalidOid;
- childStmt->oldNode = InvalidOid;
+ childStmt->oldNumber = InvalidRelFileNumber;
childStmt->oldCreateSubid = InvalidSubTransactionId;
- childStmt->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ childStmt->oldFirstRelfilenumberSubid = InvalidSubTransactionId;
/*
* Adjust any Vars (both in expressions and in the index's
@@ -3015,7 +3015,7 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
* particular this eliminates all shared catalogs.).
*/
if (RELKIND_HAS_STORAGE(classtuple->relkind) &&
- !OidIsValid(classtuple->relfilenode))
+ !RelFileNumberIsValid(classtuple->relfilenode))
skip_rel = true;
/*
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index d1ee106..9ac0383 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -118,7 +118,7 @@ SetMatViewPopulatedState(Relation relation, bool newstate)
* ExecRefreshMatView -- execute a REFRESH MATERIALIZED VIEW command
*
* This refreshes the materialized view by creating a new table and swapping
- * the relfilenodes of the new table and the old materialized view, so the OID
+ * the relfilenumbers of the new table and the old materialized view, so the OID
* of the original materialized view is preserved. Thus we do not lose GRANT
* nor references to this materialized view.
*
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index ddf219b..48d9d43 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -75,7 +75,7 @@ typedef struct sequence_magic
typedef struct SeqTableData
{
Oid relid; /* pg_class OID of this sequence (hash key) */
- Oid filenode; /* last seen relfilenode of this sequence */
+ RelFileNumber filenumber; /* last seen relfilenumber of this sequence */
LocalTransactionId lxid; /* xact in which we last did a seq op */
bool last_valid; /* do we have a valid "last" value? */
int64 last; /* value last returned by nextval */
@@ -255,7 +255,7 @@ DefineSequence(ParseState *pstate, CreateSeqStmt *seq)
*
* The change is made transactionally, so that on failure of the current
* transaction, the sequence will be restored to its previous state.
- * We do that by creating a whole new relfilenode for the sequence; so this
+ * We do that by creating a whole new relfilenumber for the sequence; so this
* works much like the rewriting forms of ALTER TABLE.
*
* Caller is assumed to have acquired AccessExclusiveLock on the sequence,
@@ -310,7 +310,7 @@ ResetSequence(Oid seq_relid)
/*
* Create a new storage file for the sequence.
*/
- RelationSetNewRelfilenode(seq_rel, seq_rel->rd_rel->relpersistence);
+ RelationSetNewRelfilenumber(seq_rel, seq_rel->rd_rel->relpersistence);
/*
* Ensure sequence's relfrozenxid is at 0, since it won't contain any
@@ -347,9 +347,9 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
{
SMgrRelation srel;
- srel = smgropen(rel->rd_node, InvalidBackendId);
+ srel = smgropen(rel->rd_locator, InvalidBackendId);
smgrcreate(srel, INIT_FORKNUM, false);
- log_smgrcreate(&rel->rd_node, INIT_FORKNUM);
+ log_smgrcreate(&rel->rd_locator, INIT_FORKNUM);
fill_seq_fork_with_data(rel, tuple, INIT_FORKNUM);
FlushRelationBuffers(rel);
smgrclose(srel);
@@ -418,7 +418,7 @@ fill_seq_fork_with_data(Relation rel, HeapTuple tuple, ForkNumber forkNum)
XLogBeginInsert();
XLogRegisterBuffer(0, buf, REGBUF_WILL_INIT);
- xlrec.node = rel->rd_node;
+ xlrec.locator = rel->rd_locator;
XLogRegisterData((char *) &xlrec, sizeof(xl_seq_rec));
XLogRegisterData((char *) tuple->t_data, tuple->t_len);
@@ -509,7 +509,7 @@ AlterSequence(ParseState *pstate, AlterSeqStmt *stmt)
* Create a new storage file for the sequence, making the state
* changes transactional.
*/
- RelationSetNewRelfilenode(seqrel, seqrel->rd_rel->relpersistence);
+ RelationSetNewRelfilenumber(seqrel, seqrel->rd_rel->relpersistence);
/*
* Ensure sequence's relfrozenxid is at 0, since it won't contain any
@@ -557,7 +557,7 @@ SequenceChangePersistence(Oid relid, char newrelpersistence)
GetTopTransactionId();
(void) read_seq_tuple(seqrel, &buf, &seqdatatuple);
- RelationSetNewRelfilenode(seqrel, newrelpersistence);
+ RelationSetNewRelfilenumber(seqrel, newrelpersistence);
fill_seq_with_data(seqrel, &seqdatatuple);
UnlockReleaseBuffer(buf);
@@ -836,7 +836,7 @@ nextval_internal(Oid relid, bool check_permissions)
seq->is_called = true;
seq->log_cnt = 0;
- xlrec.node = seqrel->rd_node;
+ xlrec.locator = seqrel->rd_locator;
XLogRegisterData((char *) &xlrec, sizeof(xl_seq_rec));
XLogRegisterData((char *) seqdatatuple.t_data, seqdatatuple.t_len);
@@ -1023,7 +1023,7 @@ do_setval(Oid relid, int64 next, bool iscalled)
XLogBeginInsert();
XLogRegisterBuffer(0, buf, REGBUF_WILL_INIT);
- xlrec.node = seqrel->rd_node;
+ xlrec.locator = seqrel->rd_locator;
XLogRegisterData((char *) &xlrec, sizeof(xl_seq_rec));
XLogRegisterData((char *) seqdatatuple.t_data, seqdatatuple.t_len);
@@ -1147,7 +1147,7 @@ init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel)
if (!found)
{
/* relid already filled in */
- elm->filenode = InvalidOid;
+ elm->filenumber = InvalidRelFileNumber;
elm->lxid = InvalidLocalTransactionId;
elm->last_valid = false;
elm->last = elm->cached = 0;
@@ -1169,9 +1169,9 @@ init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel)
* discard any cached-but-unissued values. We do not touch the currval()
* state, however.
*/
- if (seqrel->rd_rel->relfilenode != elm->filenode)
+ if (seqrel->rd_rel->relfilenode != elm->filenumber)
{
- elm->filenode = seqrel->rd_rel->relfilenode;
+ elm->filenumber = seqrel->rd_rel->relfilenode;
elm->cached = elm->last;
}
@@ -1254,7 +1254,8 @@ read_seq_tuple(Relation rel, Buffer *buf, HeapTuple seqdatatuple)
* changed. This allows ALTER SEQUENCE to behave transactionally. Currently,
* the only option that doesn't cause that is OWNED BY. It's *necessary* for
* ALTER SEQUENCE OWNED BY to not rewrite the sequence, because that would
- * break pg_upgrade by causing unwanted changes in the sequence's relfilenode.
+ * break pg_upgrade by causing unwanted changes in the sequence's
+ * relfilenumber.
*/
static void
init_params(ParseState *pstate, List *options, bool for_identity,
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 2de0eba..bf645b8 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -596,7 +596,7 @@ static void ATExecForceNoForceRowSecurity(Relation rel, bool force_rls);
static ObjectAddress ATExecSetCompression(AlteredTableInfo *tab, Relation rel,
const char *column, Node *newValue, LOCKMODE lockmode);
-static void index_copy_data(Relation rel, RelFileNode newrnode);
+static void index_copy_data(Relation rel, RelFileLocator newrlocator);
static const char *storage_name(char c);
static void RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid,
@@ -1986,12 +1986,12 @@ ExecuteTruncateGuts(List *explicit_rels,
/*
* Normally, we need a transaction-safe truncation here. However, if
* the table was either created in the current (sub)transaction or has
- * a new relfilenode in the current (sub)transaction, then we can just
+ * a new relfilenumber in the current (sub)transaction, then we can just
* truncate it in-place, because a rollback would cause the whole
* table or the current physical file to be thrown away anyway.
*/
if (rel->rd_createSubid == mySubid ||
- rel->rd_newRelfilenodeSubid == mySubid)
+ rel->rd_newRelfilelocatorSubid == mySubid)
{
/* Immediate, non-rollbackable truncation is OK */
heap_truncate_one_rel(rel);
@@ -2014,10 +2014,10 @@ ExecuteTruncateGuts(List *explicit_rels,
* Need the full transaction-safe pushups.
*
* Create a new empty storage file for the relation, and assign it
- * as the relfilenode value. The old storage file is scheduled for
+ * as the relfilenumber value. The old storage file is scheduled for
* deletion at commit.
*/
- RelationSetNewRelfilenode(rel, rel->rd_rel->relpersistence);
+ RelationSetNewRelfilenumber(rel, rel->rd_rel->relpersistence);
heap_relid = RelationGetRelid(rel);
@@ -2030,7 +2030,7 @@ ExecuteTruncateGuts(List *explicit_rels,
Relation toastrel = relation_open(toast_relid,
AccessExclusiveLock);
- RelationSetNewRelfilenode(toastrel,
+ RelationSetNewRelfilenumber(toastrel,
toastrel->rd_rel->relpersistence);
table_close(toastrel, NoLock);
}
@@ -3315,10 +3315,10 @@ CheckRelationTableSpaceMove(Relation rel, Oid newTableSpaceId)
/*
* SetRelationTableSpace
- * Set new reltablespace and relfilenode in pg_class entry.
+ * Set new reltablespace and relfilenumber in pg_class entry.
*
* newTableSpaceId is the new tablespace for the relation, and
- * newRelFileNode its new filenode. If newRelFileNode is InvalidOid,
+ * newRelFilenumber its new filenumber. If newRelFilenumber is
+ * InvalidRelFileNumber, this field is not updated.
*
* NOTE: The caller must hold AccessExclusiveLock on the relation.
@@ -3331,7 +3331,7 @@ CheckRelationTableSpaceMove(Relation rel, Oid newTableSpaceId)
void
SetRelationTableSpace(Relation rel,
Oid newTableSpaceId,
- Oid newRelFileNode)
+ RelFileNumber newRelFilenumber)
{
Relation pg_class;
HeapTuple tuple;
@@ -3351,8 +3351,8 @@ SetRelationTableSpace(Relation rel,
/* Update the pg_class row. */
rd_rel->reltablespace = (newTableSpaceId == MyDatabaseTableSpace) ?
InvalidOid : newTableSpaceId;
- if (OidIsValid(newRelFileNode))
- rd_rel->relfilenode = newRelFileNode;
+ if (RelFileNumberIsValid(newRelFilenumber))
+ rd_rel->relfilenode = newRelFilenumber;
CatalogTupleUpdate(pg_class, &tuple->t_self, tuple);
/*
@@ -5420,7 +5420,7 @@ ATRewriteTables(AlterTableStmt *parsetree, List **wqueue, LOCKMODE lockmode,
* persistence: on one hand, we need to ensure that the buffers
* belonging to each of the two relations are marked with or without
* BM_PERMANENT properly. On the other hand, since rewriting creates
- * and assigns a new relfilenode, we automatically create or drop an
+ * and assigns a new relfilenumber, we automatically create or drop an
* init fork for the relation as appropriate.
*/
if (tab->rewrite > 0 && tab->relkind != RELKIND_SEQUENCE)
@@ -5506,12 +5506,13 @@ ATRewriteTables(AlterTableStmt *parsetree, List **wqueue, LOCKMODE lockmode,
* Create transient table that will receive the modified data.
*
* Ensure it is marked correctly as logged or unlogged. We have
- * to do this here so that buffers for the new relfilenode will
+ * to do this here so that buffers for the new relfilenumber will
* have the right persistence set, and at the same time ensure
- * that the original filenode's buffers will get read in with the
- * correct setting (i.e. the original one). Otherwise a rollback
- * after the rewrite would possibly result with buffers for the
- * original filenode having the wrong persistence setting.
+ * that the original filenumber's buffers will get read in with
+ * the correct setting (i.e. the original one). Otherwise a
+ * rollback after the rewrite would possibly result in buffers
+ * for the original filenumber having the wrong persistence
+ * setting.
*
* NB: This relies on swap_relation_files() also swapping the
* persistence. That wouldn't work for pg_class, but that can't be
@@ -8597,7 +8598,7 @@ ATExecAddIndex(AlteredTableInfo *tab, Relation rel,
/* suppress schema rights check when rebuilding existing index */
check_rights = !is_rebuild;
/* skip index build if phase 3 will do it or we're reusing an old one */
- skip_build = tab->rewrite > 0 || OidIsValid(stmt->oldNode);
+ skip_build = tab->rewrite > 0 || RelFileNumberIsValid(stmt->oldNumber);
/* suppress notices when rebuilding existing index */
quiet = is_rebuild;
@@ -8613,7 +8614,7 @@ ATExecAddIndex(AlteredTableInfo *tab, Relation rel,
quiet);
/*
- * If TryReuseIndex() stashed a relfilenode for us, we used it for the new
+ * If TryReuseIndex() stashed a relfilenumber for us, we used it for the new
* index instead of building from scratch. Restore associated fields.
* This may store InvalidSubTransactionId in both fields, in which case
* relcache.c will assume it can rebuild the relcache entry. Hence, do
@@ -8621,13 +8622,13 @@ ATExecAddIndex(AlteredTableInfo *tab, Relation rel,
* DROP of the old edition of this index will have scheduled the storage
* for deletion at commit, so cancel that pending deletion.
*/
- if (OidIsValid(stmt->oldNode))
+ if (RelFileNumberIsValid(stmt->oldNumber))
{
Relation irel = index_open(address.objectId, NoLock);
irel->rd_createSubid = stmt->oldCreateSubid;
- irel->rd_firstRelfilenodeSubid = stmt->oldFirstRelfilenodeSubid;
- RelationPreserveStorage(irel->rd_node, true);
+ irel->rd_firstRelfilelocatorSubid = stmt->oldFirstRelfilenumberSubid;
+ RelationPreserveStorage(irel->rd_locator, true);
index_close(irel, NoLock);
}
@@ -13491,9 +13492,9 @@ TryReuseIndex(Oid oldId, IndexStmt *stmt)
/* If it's a partitioned index, there is no storage to share. */
if (irel->rd_rel->relkind != RELKIND_PARTITIONED_INDEX)
{
- stmt->oldNode = irel->rd_node.relNode;
+ stmt->oldNumber = irel->rd_locator.relNumber;
stmt->oldCreateSubid = irel->rd_createSubid;
- stmt->oldFirstRelfilenodeSubid = irel->rd_firstRelfilenodeSubid;
+ stmt->oldFirstRelfilenumberSubid = irel->rd_firstRelfilelocatorSubid;
}
index_close(irel, NoLock);
}
@@ -14340,8 +14341,8 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
{
Relation rel;
Oid reltoastrelid;
- Oid newrelfilenode;
- RelFileNode newrnode;
+ RelFileNumber newrelfilenumber;
+ RelFileLocator newrlocator;
List *reltoastidxids = NIL;
ListCell *lc;
@@ -14370,26 +14371,28 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
}
/*
- * Relfilenodes are not unique in databases across tablespaces, so we need
+ * Relfilenumbers are not unique in databases across tablespaces, so we need
* to allocate a new one in the new tablespace.
*/
- newrelfilenode = GetNewRelFileNode(newTableSpace, NULL,
- rel->rd_rel->relpersistence);
+ newrelfilenumber = GetNewRelFileNumber(newTableSpace, NULL,
+ rel->rd_rel->relpersistence);
/* Open old and new relation */
- newrnode = rel->rd_node;
- newrnode.relNode = newrelfilenode;
- newrnode.spcNode = newTableSpace;
+ newrlocator = rel->rd_locator;
+ newrlocator.relNumber = newrelfilenumber;
+ newrlocator.spcOid = newTableSpace;
- /* hand off to AM to actually create the new filenode and copy the data */
+ /*
+ * hand off to AM to actually create the new filelocator and copy the data
+ */
if (rel->rd_rel->relkind == RELKIND_INDEX)
{
- index_copy_data(rel, newrnode);
+ index_copy_data(rel, newrlocator);
}
else
{
Assert(RELKIND_HAS_TABLE_AM(rel->rd_rel->relkind));
- table_relation_copy_data(rel, &newrnode);
+ table_relation_copy_data(rel, &newrlocator);
}
/*
@@ -14400,11 +14403,11 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
* the updated pg_class entry), but that's forbidden with
* CheckRelationTableSpaceMove().
*/
- SetRelationTableSpace(rel, newTableSpace, newrelfilenode);
+ SetRelationTableSpace(rel, newTableSpace, newrelfilenumber);
InvokeObjectPostAlterHook(RelationRelationId, RelationGetRelid(rel), 0);
- RelationAssumeNewRelfilenode(rel);
+ RelationAssumeNewRelfilelocator(rel);
relation_close(rel, NoLock);
@@ -14630,11 +14633,11 @@ AlterTableMoveAll(AlterTableMoveAllStmt *stmt)
}
static void
-index_copy_data(Relation rel, RelFileNode newrnode)
+index_copy_data(Relation rel, RelFileLocator newrlocator)
{
SMgrRelation dstrel;
- dstrel = smgropen(newrnode, rel->rd_backend);
+ dstrel = smgropen(newrlocator, rel->rd_backend);
/*
* Since we copy the file directly without looking at the shared buffers,
@@ -14648,10 +14651,10 @@ index_copy_data(Relation rel, RelFileNode newrnode)
* Create and copy all forks of the relation, and schedule unlinking of
* old physical files.
*
- * NOTE: any conflict in relfilenode value will be caught in
+ * NOTE: any conflict in relfilelocator value will be caught in
* RelationCreateStorage().
*/
- RelationCreateStorage(newrnode, rel->rd_rel->relpersistence, true);
+ RelationCreateStorage(newrlocator, rel->rd_rel->relpersistence, true);
/* copy main fork */
RelationCopyStorage(RelationGetSmgr(rel), dstrel, MAIN_FORKNUM,
@@ -14672,7 +14675,7 @@ index_copy_data(Relation rel, RelFileNode newrnode)
if (RelationIsPermanent(rel) ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
- log_smgrcreate(&newrnode, forkNum);
+ log_smgrcreate(&newrlocator, forkNum);
RelationCopyStorage(RelationGetSmgr(rel), dstrel, forkNum,
rel->rd_rel->relpersistence);
}
diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index 00ca397..c8bdd99 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -12,12 +12,12 @@
* remove the possibility of having file name conflicts, we isolate
* files within a tablespace into database-specific subdirectories.
*
- * To support file access via the information given in RelFileNode, we
+ * To support file access via the information given in RelFileLocator, we
* maintain a symbolic-link map in $PGDATA/pg_tblspc. The symlinks are
* named by tablespace OIDs and point to the actual tablespace directories.
* There is also a per-cluster version directory in each tablespace.
* Thus the full path to an arbitrary file is
- * $PGDATA/pg_tblspc/spcoid/PG_MAJORVER_CATVER/dboid/relfilenode
+ * $PGDATA/pg_tblspc/spcoid/PG_MAJORVER_CATVER/dboid/relfilenumber
* e.g.
* $PGDATA/pg_tblspc/20981/PG_9.0_201002161/719849/83292814
*
@@ -25,8 +25,8 @@
* tables) and pg_default (for everything else). For backwards compatibility
* and to remain functional on platforms without symlinks, these tablespaces
* are accessed specially: they are respectively
- * $PGDATA/global/relfilenode
- * $PGDATA/base/dboid/relfilenode
+ * $PGDATA/global/relfilenumber
+ * $PGDATA/base/dboid/relfilenumber
*
* To allow CREATE DATABASE to give a new database a default tablespace
* that's different from the template database's default, we make the
@@ -115,7 +115,7 @@ static bool destroy_tablespace_directories(Oid tablespaceoid, bool redo);
* re-create a database subdirectory (of $PGDATA/base) during WAL replay.
*/
void
-TablespaceCreateDbspace(Oid spcNode, Oid dbNode, bool isRedo)
+TablespaceCreateDbspace(Oid spcOid, Oid dbOid, bool isRedo)
{
struct stat st;
char *dir;
@@ -124,13 +124,13 @@ TablespaceCreateDbspace(Oid spcNode, Oid dbNode, bool isRedo)
* The global tablespace doesn't have per-database subdirectories, so
* nothing to do for it.
*/
- if (spcNode == GLOBALTABLESPACE_OID)
+ if (spcOid == GLOBALTABLESPACE_OID)
return;
- Assert(OidIsValid(spcNode));
- Assert(OidIsValid(dbNode));
+ Assert(OidIsValid(spcOid));
+ Assert(OidIsValid(dbOid));
- dir = GetDatabasePath(dbNode, spcNode);
+ dir = GetDatabasePath(dbOid, spcOid);
if (stat(dir, &st) < 0)
{
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 51d630f..7d50b50 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4193,9 +4193,9 @@ _copyIndexStmt(const IndexStmt *from)
COPY_NODE_FIELD(excludeOpNames);
COPY_STRING_FIELD(idxcomment);
COPY_SCALAR_FIELD(indexOid);
- COPY_SCALAR_FIELD(oldNode);
+ COPY_SCALAR_FIELD(oldNumber);
COPY_SCALAR_FIELD(oldCreateSubid);
- COPY_SCALAR_FIELD(oldFirstRelfilenodeSubid);
+ COPY_SCALAR_FIELD(oldFirstRelfilenumberSubid);
COPY_SCALAR_FIELD(unique);
COPY_SCALAR_FIELD(nulls_not_distinct);
COPY_SCALAR_FIELD(primary);
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index e747e16..d63d326 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1752,9 +1752,9 @@ _equalIndexStmt(const IndexStmt *a, const IndexStmt *b)
COMPARE_NODE_FIELD(excludeOpNames);
COMPARE_STRING_FIELD(idxcomment);
COMPARE_SCALAR_FIELD(indexOid);
- COMPARE_SCALAR_FIELD(oldNode);
+ COMPARE_SCALAR_FIELD(oldNumber);
COMPARE_SCALAR_FIELD(oldCreateSubid);
- COMPARE_SCALAR_FIELD(oldFirstRelfilenodeSubid);
+ COMPARE_SCALAR_FIELD(oldFirstRelfilenumberSubid);
COMPARE_SCALAR_FIELD(unique);
COMPARE_SCALAR_FIELD(nulls_not_distinct);
COMPARE_SCALAR_FIELD(primary);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index ce12915..3724d48 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2928,9 +2928,9 @@ _outIndexStmt(StringInfo str, const IndexStmt *node)
WRITE_NODE_FIELD(excludeOpNames);
WRITE_STRING_FIELD(idxcomment);
WRITE_OID_FIELD(indexOid);
- WRITE_OID_FIELD(oldNode);
+ WRITE_OID_FIELD(oldNumber);
WRITE_UINT_FIELD(oldCreateSubid);
- WRITE_UINT_FIELD(oldFirstRelfilenodeSubid);
+ WRITE_UINT_FIELD(oldFirstRelfilenumberSubid);
WRITE_BOOL_FIELD(unique);
WRITE_BOOL_FIELD(nulls_not_distinct);
WRITE_BOOL_FIELD(primary);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 969c9c1..394404d 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -7990,9 +7990,9 @@ IndexStmt: CREATE opt_unique INDEX opt_concurrently opt_index_name
n->excludeOpNames = NIL;
n->idxcomment = NULL;
n->indexOid = InvalidOid;
- n->oldNode = InvalidOid;
+ n->oldNumber = InvalidRelFileNumber;
n->oldCreateSubid = InvalidSubTransactionId;
- n->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ n->oldFirstRelfilenumberSubid = InvalidSubTransactionId;
n->primary = false;
n->isconstraint = false;
n->deferrable = false;
@@ -8022,9 +8022,9 @@ IndexStmt: CREATE opt_unique INDEX opt_concurrently opt_index_name
n->excludeOpNames = NIL;
n->idxcomment = NULL;
n->indexOid = InvalidOid;
- n->oldNode = InvalidOid;
+ n->oldNumber = InvalidRelFileNumber;
n->oldCreateSubid = InvalidSubTransactionId;
- n->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ n->oldFirstRelfilenumberSubid = InvalidSubTransactionId;
n->primary = false;
n->isconstraint = false;
n->deferrable = false;
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index 1a64a52..390b454 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -1578,9 +1578,9 @@ generateClonedIndexStmt(RangeVar *heapRel, Relation source_idx,
index->excludeOpNames = NIL;
index->idxcomment = NULL;
index->indexOid = InvalidOid;
- index->oldNode = InvalidOid;
+ index->oldNumber = InvalidRelFileNumber;
index->oldCreateSubid = InvalidSubTransactionId;
- index->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ index->oldFirstRelfilenumberSubid = InvalidSubTransactionId;
index->unique = idxrec->indisunique;
index->nulls_not_distinct = idxrec->indnullsnotdistinct;
index->primary = idxrec->indisprimary;
@@ -2201,9 +2201,9 @@ transformIndexConstraint(Constraint *constraint, CreateStmtContext *cxt)
index->excludeOpNames = NIL;
index->idxcomment = NULL;
index->indexOid = InvalidOid;
- index->oldNode = InvalidOid;
+ index->oldNumber = InvalidRelFileNumber;
index->oldCreateSubid = InvalidSubTransactionId;
- index->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ index->oldFirstRelfilenumberSubid = InvalidSubTransactionId;
index->transformed = false;
index->concurrent = false;
index->if_not_exists = false;
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index c937c39..5fc076f 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -1207,7 +1207,7 @@ CompactCheckpointerRequestQueue(void)
* We use the request struct directly as a hashtable key. This
* assumes that any padding bytes in the structs are consistently the
* same, which should be okay because we zeroed them in
- * CheckpointerShmemInit. Note also that RelFileNode had better
+ * CheckpointerShmemInit. Note also that RelFileLocator had better
* contain no pad bytes.
*/
request = &CheckpointerShmem->requests[n];
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index aa2427b..c5c6a2b 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -845,7 +845,7 @@ DecodeInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
XLogReaderState *r = buf->record;
xl_heap_insert *xlrec;
ReorderBufferChange *change;
- RelFileNode target_node;
+ RelFileLocator target_locator;
xlrec = (xl_heap_insert *) XLogRecGetData(r);
@@ -857,8 +857,8 @@ DecodeInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
return;
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
- if (target_node.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &target_locator, NULL, NULL);
+ if (target_locator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -872,7 +872,7 @@ DecodeInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change->action = REORDER_BUFFER_CHANGE_INTERNAL_SPEC_INSERT;
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &target_node, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &target_locator, sizeof(RelFileLocator));
tupledata = XLogRecGetBlockData(r, 0, &datalen);
tuplelen = datalen - SizeOfHeapHeader;
@@ -902,13 +902,13 @@ DecodeUpdate(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
xl_heap_update *xlrec;
ReorderBufferChange *change;
char *data;
- RelFileNode target_node;
+ RelFileLocator target_locator;
xlrec = (xl_heap_update *) XLogRecGetData(r);
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
- if (target_node.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &target_locator, NULL, NULL);
+ if (target_locator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -918,7 +918,7 @@ DecodeUpdate(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change = ReorderBufferGetChange(ctx->reorder);
change->action = REORDER_BUFFER_CHANGE_UPDATE;
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &target_node, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &target_locator, sizeof(RelFileLocator));
if (xlrec->flags & XLH_UPDATE_CONTAINS_NEW_TUPLE)
{
@@ -968,13 +968,13 @@ DecodeDelete(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
XLogReaderState *r = buf->record;
xl_heap_delete *xlrec;
ReorderBufferChange *change;
- RelFileNode target_node;
+ RelFileLocator target_locator;
xlrec = (xl_heap_delete *) XLogRecGetData(r);
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
- if (target_node.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &target_locator, NULL, NULL);
+ if (target_locator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -990,7 +990,7 @@ DecodeDelete(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &target_node, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &target_locator, sizeof(RelFileLocator));
/* old primary key stored */
if (xlrec->flags & XLH_DELETE_CONTAINS_OLD)
@@ -1063,7 +1063,7 @@ DecodeMultiInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
char *data;
char *tupledata;
Size tuplelen;
- RelFileNode rnode;
+ RelFileLocator rlocator;
xlrec = (xl_heap_multi_insert *) XLogRecGetData(r);
@@ -1075,8 +1075,8 @@ DecodeMultiInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
return;
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &rnode, NULL, NULL);
- if (rnode.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &rlocator, NULL, NULL);
+ if (rlocator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -1103,7 +1103,7 @@ DecodeMultiInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change->action = REORDER_BUFFER_CHANGE_INSERT;
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &rnode, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &rlocator, sizeof(RelFileLocator));
xlhdr = (xl_multi_insert_tuple *) SHORTALIGN(data);
data = ((char *) xlhdr) + SizeOfMultiInsertTuple;
@@ -1165,11 +1165,11 @@ DecodeSpecConfirm(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
{
XLogReaderState *r = buf->record;
ReorderBufferChange *change;
- RelFileNode target_node;
+ RelFileLocator target_locator;
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
- if (target_node.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &target_locator, NULL, NULL);
+ if (target_locator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -1180,7 +1180,7 @@ DecodeSpecConfirm(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change->action = REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM;
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &target_node, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &target_locator, sizeof(RelFileLocator));
change->data.tp.clear_toast_afterwards = true;
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 8da5f90..f8fb228 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -106,7 +106,7 @@
#include "utils/memdebug.h"
#include "utils/memutils.h"
#include "utils/rel.h"
-#include "utils/relfilenodemap.h"
+#include "utils/relfilenumbermap.h"
/* entry for a hash table we use to map from xid to our transaction state */
@@ -116,10 +116,10 @@ typedef struct ReorderBufferTXNByIdEnt
ReorderBufferTXN *txn;
} ReorderBufferTXNByIdEnt;
-/* data structures for (relfilenode, ctid) => (cmin, cmax) mapping */
+/* data structures for (relfilelocator, ctid) => (cmin, cmax) mapping */
typedef struct ReorderBufferTupleCidKey
{
- RelFileNode relnode;
+ RelFileLocator rlocator;
ItemPointerData tid;
} ReorderBufferTupleCidKey;
@@ -1643,7 +1643,7 @@ ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, bool txn_prep
}
/*
- * Destroy the (relfilenode, ctid) hashtable, so that we don't leak any
+ * Destroy the (relfilelocator, ctid) hashtable, so that we don't leak any
* memory. We could also keep the hash table and update it with new ctid
* values, but this seems simpler and good enough for now.
*/
@@ -1673,7 +1673,7 @@ ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, bool txn_prep
}
/*
- * Build a hash with a (relfilenode, ctid) -> (cmin, cmax) mapping for use by
+ * Build a hash with a (relfilelocator, ctid) -> (cmin, cmax) mapping for use by
* HeapTupleSatisfiesHistoricMVCC.
*/
static void
@@ -1711,7 +1711,7 @@ ReorderBufferBuildTupleCidHash(ReorderBuffer *rb, ReorderBufferTXN *txn)
/* be careful about padding */
memset(&key, 0, sizeof(ReorderBufferTupleCidKey));
- key.relnode = change->data.tuplecid.node;
+ key.rlocator = change->data.tuplecid.locator;
ItemPointerCopy(&change->data.tuplecid.tid,
&key.tid);
@@ -2140,36 +2140,36 @@ ReorderBufferProcessTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
case REORDER_BUFFER_CHANGE_DELETE:
Assert(snapshot_now);
- reloid = RelidByRelfilenode(change->data.tp.relnode.spcNode,
- change->data.tp.relnode.relNode);
+ reloid = RelidByRelfilenumber(change->data.tp.rlocator.spcOid,
+ change->data.tp.rlocator.relNumber);
/*
* Mapped catalog tuple without data, emitted while
* catalog table was in the process of being rewritten. We
- * can fail to look up the relfilenode, because the
+ * can fail to look up the relfilenumber, because the
* relmapper has no "historic" view, in contrast to the
* normal catalog during decoding. Thus repeated rewrites
* can cause a lookup failure. That's OK because we do not
* decode catalog changes anyway. Normally such tuples
* would be skipped over below, but we can't identify
* whether the table should be logically logged without
- * mapping the relfilenode to the oid.
+ * mapping the relfilenumber to the oid.
*/
if (reloid == InvalidOid &&
change->data.tp.newtuple == NULL &&
change->data.tp.oldtuple == NULL)
goto change_done;
else if (reloid == InvalidOid)
- elog(ERROR, "could not map filenode \"%s\" to relation OID",
- relpathperm(change->data.tp.relnode,
+ elog(ERROR, "could not map filenumber \"%s\" to relation OID",
+ relpathperm(change->data.tp.rlocator,
MAIN_FORKNUM));
relation = RelationIdGetRelation(reloid);
if (!RelationIsValid(relation))
- elog(ERROR, "could not open relation with OID %u (for filenode \"%s\")",
+ elog(ERROR, "could not open relation with OID %u (for filenumber \"%s\")",
reloid,
- relpathperm(change->data.tp.relnode,
+ relpathperm(change->data.tp.rlocator,
MAIN_FORKNUM));
if (!RelationIsLogicallyLogged(relation))
@@ -3157,7 +3157,7 @@ ReorderBufferChangeMemoryUpdate(ReorderBuffer *rb,
}
/*
- * Add new (relfilenode, tid) -> (cmin, cmax) mappings.
+ * Add new (relfilelocator, tid) -> (cmin, cmax) mappings.
*
* We do not include this change type in memory accounting, because we
* keep CIDs in a separate list and do not evict them when reaching
@@ -3165,7 +3165,7 @@ ReorderBufferChangeMemoryUpdate(ReorderBuffer *rb,
*/
void
ReorderBufferAddNewTupleCids(ReorderBuffer *rb, TransactionId xid,
- XLogRecPtr lsn, RelFileNode node,
+ XLogRecPtr lsn, RelFileLocator locator,
ItemPointerData tid, CommandId cmin,
CommandId cmax, CommandId combocid)
{
@@ -3174,7 +3174,7 @@ ReorderBufferAddNewTupleCids(ReorderBuffer *rb, TransactionId xid,
txn = ReorderBufferTXNByXid(rb, xid, true, NULL, lsn, true);
- change->data.tuplecid.node = node;
+ change->data.tuplecid.locator = locator;
change->data.tuplecid.tid = tid;
change->data.tuplecid.cmin = cmin;
change->data.tuplecid.cmax = cmax;
@@ -4839,7 +4839,7 @@ ReorderBufferToastReset(ReorderBuffer *rb, ReorderBufferTXN *txn)
* need anymore.
*
* To resolve those problems we have a per-transaction hash of (cmin,
- * cmax) tuples keyed by (relfilenode, ctid) which contains the actual
+ * cmax) tuples keyed by (relfilelocator, ctid) which contains the actual
* (cmin, cmax) values. That also takes care of combo CIDs by simply
* not caring about them at all. As we have the real cmin/cmax values
* combo CIDs aren't interesting.
@@ -4870,9 +4870,9 @@ DisplayMapping(HTAB *tuplecid_data)
while ((ent = (ReorderBufferTupleCidEnt *) hash_seq_search(&hstat)) != NULL)
{
elog(DEBUG3, "mapping: node: %u/%u/%u tid: %u/%u cmin: %u, cmax: %u",
- ent->key.relnode.dbNode,
- ent->key.relnode.spcNode,
- ent->key.relnode.relNode,
+ ent->key.rlocator.dbOid,
+ ent->key.rlocator.spcOid,
+ ent->key.rlocator.relNumber,
ItemPointerGetBlockNumber(&ent->key.tid),
ItemPointerGetOffsetNumber(&ent->key.tid),
ent->cmin,
@@ -4932,7 +4932,7 @@ ApplyLogicalMappingFile(HTAB *tuplecid_data, Oid relid, const char *fname)
path, readBytes,
(int32) sizeof(LogicalRewriteMappingData))));
- key.relnode = map.old_node;
+ key.rlocator = map.old_locator;
ItemPointerCopy(&map.old_tid,
&key.tid);
@@ -4947,7 +4947,7 @@ ApplyLogicalMappingFile(HTAB *tuplecid_data, Oid relid, const char *fname)
if (!ent)
continue;
- key.relnode = map.new_node;
+ key.rlocator = map.new_locator;
ItemPointerCopy(&map.new_tid,
&key.tid);
@@ -5120,10 +5120,10 @@ ResolveCminCmaxDuringDecoding(HTAB *tuplecid_data,
Assert(!BufferIsLocal(buffer));
/*
- * get relfilenode from the buffer, no convenient way to access it other
+ * get relfilelocator from the buffer, no convenient way to access it other
* than that.
*/
- BufferGetTag(buffer, &key.relnode, &forkno, &blockno);
+ BufferGetTag(buffer, &key.rlocator, &forkno, &blockno);
/* tuples can only be in the main fork */
Assert(forkno == MAIN_FORKNUM);
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 1119a12..73c0f15 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -781,7 +781,7 @@ SnapBuildProcessNewCid(SnapBuild *builder, TransactionId xid,
ReorderBufferXidSetCatalogChanges(builder->reorder, xid, lsn);
ReorderBufferAddNewTupleCids(builder->reorder, xlrec->top_xid, lsn,
- xlrec->target_node, xlrec->target_tid,
+ xlrec->target_locator, xlrec->target_tid,
xlrec->cmin, xlrec->cmax,
xlrec->combocid);
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index ae13011..7071ff6 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -121,12 +121,12 @@ typedef struct CkptTsStatus
* Type for array used to sort SMgrRelations
*
* FlushRelationsAllBuffers shares the same comparator function with
- * DropRelFileNodesAllBuffers. Pointer to this struct and RelFileNode must be
+ * DropRelFileLocatorsAllBuffers. Pointer to this struct and RelFileLocator must be
* compatible.
*/
typedef struct SMgrSortArray
{
- RelFileNode rnode; /* This must be the first member */
+ RelFileLocator rlocator; /* This must be the first member */
SMgrRelation srel;
} SMgrSortArray;
@@ -483,7 +483,7 @@ static BufferDesc *BufferAlloc(SMgrRelation smgr,
BufferAccessStrategy strategy,
bool *foundPtr);
static void FlushBuffer(BufferDesc *buf, SMgrRelation reln);
-static void FindAndDropRelFileNodeBuffers(RelFileNode rnode,
+static void FindAndDropRelFileLocatorBuffers(RelFileLocator rlocator,
ForkNumber forkNum,
BlockNumber nForkBlock,
BlockNumber firstDelBlock);
@@ -492,7 +492,7 @@ static void RelationCopyStorageUsingBuffer(Relation src, Relation dst,
bool isunlogged);
static void AtProcExit_Buffers(int code, Datum arg);
static void CheckForBufferLeaks(void);
-static int rnode_comparator(const void *p1, const void *p2);
+static int rlocator_comparator(const void *p1, const void *p2);
static inline int buffertag_comparator(const BufferTag *a, const BufferTag *b);
static inline int ckpt_buforder_comparator(const CkptSortItem *a, const CkptSortItem *b);
static int ts_ckpt_progress_comparator(Datum a, Datum b, void *arg);
@@ -515,7 +515,7 @@ PrefetchSharedBuffer(SMgrRelation smgr_reln,
Assert(BlockNumberIsValid(blockNum));
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr_reln->smgr_rnode.node,
+ INIT_BUFFERTAG(newTag, smgr_reln->smgr_rlocator.locator,
forkNum, blockNum);
/* determine its hash code and partition lock ID */
@@ -620,7 +620,7 @@ PrefetchBuffer(Relation reln, ForkNumber forkNum, BlockNumber blockNum)
* tag. In that case, the buffer is pinned and the usage count is bumped.
*/
bool
-ReadRecentBuffer(RelFileNode rnode, ForkNumber forkNum, BlockNumber blockNum,
+ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockNum,
Buffer recent_buffer)
{
BufferDesc *bufHdr;
@@ -632,7 +632,7 @@ ReadRecentBuffer(RelFileNode rnode, ForkNumber forkNum, BlockNumber blockNum,
ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
ReservePrivateRefCountEntry();
- INIT_BUFFERTAG(tag, rnode, forkNum, blockNum);
+ INIT_BUFFERTAG(tag, rlocator, forkNum, blockNum);
if (BufferIsLocal(recent_buffer))
{
@@ -786,13 +786,13 @@ ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum,
* BackendId).
*/
Buffer
-ReadBufferWithoutRelcache(RelFileNode rnode, ForkNumber forkNum,
+ReadBufferWithoutRelcache(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber blockNum, ReadBufferMode mode,
BufferAccessStrategy strategy, bool permanent)
{
bool hit;
- SMgrRelation smgr = smgropen(rnode, InvalidBackendId);
+ SMgrRelation smgr = smgropen(rlocator, InvalidBackendId);
return ReadBuffer_common(smgr, permanent ? RELPERSISTENCE_PERMANENT :
RELPERSISTENCE_UNLOGGED, forkNum, blockNum,
@@ -824,10 +824,10 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
isExtend = (blockNum == P_NEW);
TRACE_POSTGRESQL_BUFFER_READ_START(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode,
- smgr->smgr_rnode.backend,
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber,
+ smgr->smgr_rlocator.backend,
isExtend);
/* Substitute proper block number if caller asked for P_NEW */
@@ -839,7 +839,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
ereport(ERROR,
(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
errmsg("cannot extend relation %s beyond %u blocks",
- relpath(smgr->smgr_rnode, forkNum),
+ relpath(smgr->smgr_rlocator, forkNum),
P_NEW)));
}
@@ -886,10 +886,10 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
VacuumCostBalance += VacuumCostPageHit;
TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode,
- smgr->smgr_rnode.backend,
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber,
+ smgr->smgr_rlocator.backend,
isExtend,
found);
@@ -926,7 +926,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
if (!PageIsNew((Page) bufBlock))
ereport(ERROR,
(errmsg("unexpected data beyond EOF in block %u of relation %s",
- blockNum, relpath(smgr->smgr_rnode, forkNum)),
+ blockNum, relpath(smgr->smgr_rlocator, forkNum)),
errhint("This has been seen to occur with buggy kernels; consider updating your system.")));
/*
@@ -1028,7 +1028,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
(errcode(ERRCODE_DATA_CORRUPTED),
errmsg("invalid page in block %u of relation %s; zeroing out page",
blockNum,
- relpath(smgr->smgr_rnode, forkNum))));
+ relpath(smgr->smgr_rlocator, forkNum))));
MemSet((char *) bufBlock, 0, BLCKSZ);
}
else
@@ -1036,7 +1036,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
(errcode(ERRCODE_DATA_CORRUPTED),
errmsg("invalid page in block %u of relation %s",
blockNum,
- relpath(smgr->smgr_rnode, forkNum))));
+ relpath(smgr->smgr_rlocator, forkNum))));
}
}
}
@@ -1076,10 +1076,10 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
VacuumCostBalance += VacuumCostPageMiss;
TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode,
- smgr->smgr_rnode.backend,
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber,
+ smgr->smgr_rlocator.backend,
isExtend,
found);
@@ -1124,7 +1124,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
/* determine its hash code and partition lock ID */
newHash = BufTableHashCode(&newTag);
@@ -1255,9 +1255,9 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
/* OK, do the I/O */
TRACE_POSTGRESQL_BUFFER_WRITE_DIRTY_START(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode);
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber);
FlushBuffer(buf, NULL);
LWLockRelease(BufferDescriptorGetContentLock(buf));
@@ -1266,9 +1266,9 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
&buf->tag);
TRACE_POSTGRESQL_BUFFER_WRITE_DIRTY_DONE(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode);
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber);
}
else
{
@@ -1647,7 +1647,7 @@ ReleaseAndReadBuffer(Buffer buffer,
{
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
bufHdr->tag.forkNum == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
@@ -1658,7 +1658,7 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetBufferDescriptor(buffer - 1);
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
bufHdr->tag.forkNum == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
@@ -2000,8 +2000,8 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
- item->tsId = bufHdr->tag.rnode.spcNode;
- item->relNode = bufHdr->tag.rnode.relNode;
+ item->tsId = bufHdr->tag.rlocator.spcOid;
+ item->relNumber = bufHdr->tag.rlocator.relNumber;
item->forkNum = bufHdr->tag.forkNum;
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2708,7 +2708,7 @@ PrintBufferLeakWarning(Buffer buffer)
}
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rnode, backend, buf->tag.forkNum);
+ path = relpathbackend(buf->tag.rlocator, backend, buf->tag.forkNum);
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2769,11 +2769,11 @@ BufferGetBlockNumber(Buffer buffer)
/*
* BufferGetTag
- * Returns the relfilenode, fork number and block number associated with
+ * Returns the relfilelocator, fork number and block number associated with
* a buffer.
*/
void
-BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
+BufferGetTag(Buffer buffer, RelFileLocator *rlocator, ForkNumber *forknum,
BlockNumber *blknum)
{
BufferDesc *bufHdr;
@@ -2787,7 +2787,7 @@ BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
bufHdr = GetBufferDescriptor(buffer - 1);
/* pinned, so OK to read tag without spinlock */
- *rnode = bufHdr->tag.rnode;
+ *rlocator = bufHdr->tag.rlocator;
*forknum = bufHdr->tag.forkNum;
*blknum = bufHdr->tag.blockNum;
}
@@ -2838,13 +2838,13 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rnode, InvalidBackendId);
+ reln = smgropen(buf->tag.rlocator, InvalidBackendId);
TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
buf->tag.blockNum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode);
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber);
buf_state = LockBufHdr(buf);
@@ -2922,9 +2922,9 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(buf->tag.forkNum,
buf->tag.blockNum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode);
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber);
/* Pop the error context stack */
error_context_stack = errcallback.previous;
@@ -3026,7 +3026,7 @@ BufferGetLSNAtomic(Buffer buffer)
}
/* ---------------------------------------------------------------------
- * DropRelFileNodeBuffers
+ * DropRelFileLocatorBuffers
*
* This function removes from the buffer pool all the pages of the
* specified relation forks that have block numbers >= firstDelBlock.
@@ -3047,24 +3047,24 @@ BufferGetLSNAtomic(Buffer buffer)
* --------------------------------------------------------------------
*/
void
-DropRelFileNodeBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
+DropRelFileLocatorBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
int nforks, BlockNumber *firstDelBlock)
{
int i;
int j;
- RelFileNodeBackend rnode;
+ RelFileLocatorBackend rlocator;
BlockNumber nForkBlock[MAX_FORKNUM];
uint64 nBlocksToInvalidate = 0;
- rnode = smgr_reln->smgr_rnode;
+ rlocator = smgr_reln->smgr_rlocator;
/* If it's a local relation, it's localbuf.c's problem. */
- if (RelFileNodeBackendIsTemp(rnode))
+ if (RelFileLocatorBackendIsTemp(rlocator))
{
- if (rnode.backend == MyBackendId)
+ if (rlocator.backend == MyBackendId)
{
for (j = 0; j < nforks; j++)
- DropRelFileNodeLocalBuffers(rnode.node, forkNum[j],
+ DropRelFileLocatorLocalBuffers(rlocator.locator, forkNum[j],
firstDelBlock[j]);
}
return;
@@ -3115,7 +3115,7 @@ DropRelFileNodeBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
nBlocksToInvalidate < BUF_DROP_FULL_SCAN_THRESHOLD)
{
for (j = 0; j < nforks; j++)
- FindAndDropRelFileNodeBuffers(rnode.node, forkNum[j],
+ FindAndDropRelFileLocatorBuffers(rlocator.locator, forkNum[j],
nForkBlock[j], firstDelBlock[j]);
return;
}
@@ -3138,17 +3138,17 @@ DropRelFileNodeBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
* false positives are safe because we'll recheck after getting the
* buffer lock.
*
- * We could check forkNum and blockNum as well as the rnode, but the
+ * We could check forkNum and blockNum as well as the rlocator, but the
* incremental win from doing so seems small.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rnode.node))
+ if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator))
continue;
buf_state = LockBufHdr(bufHdr);
for (j = 0; j < nforks; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator) &&
bufHdr->tag.forkNum == forkNum[j] &&
bufHdr->tag.blockNum >= firstDelBlock[j])
{
@@ -3162,16 +3162,16 @@ DropRelFileNodeBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
}
/* ---------------------------------------------------------------------
- * DropRelFileNodesAllBuffers
+ * DropRelFileLocatorsAllBuffers
*
* This function removes from the buffer pool all the pages of all
* forks of the specified relations. It's equivalent to calling
- * DropRelFileNodeBuffers once per fork per relation with
+ * DropRelFileLocatorBuffers once per fork per relation with
* firstDelBlock = 0.
* --------------------------------------------------------------------
*/
void
-DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
+DropRelFileLocatorsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
{
int i;
int j;
@@ -3179,22 +3179,22 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
SMgrRelation *rels;
BlockNumber (*block)[MAX_FORKNUM + 1];
uint64 nBlocksToInvalidate = 0;
- RelFileNode *nodes;
+ RelFileLocator *locators;
bool cached = true;
bool use_bsearch;
- if (nnodes == 0)
+ if (nlocators == 0)
return;
- rels = palloc(sizeof(SMgrRelation) * nnodes); /* non-local relations */
+ rels = palloc(sizeof(SMgrRelation) * nlocators); /* non-local relations */
/* If it's a local relation, it's localbuf.c's problem. */
- for (i = 0; i < nnodes; i++)
+ for (i = 0; i < nlocators; i++)
{
- if (RelFileNodeBackendIsTemp(smgr_reln[i]->smgr_rnode))
+ if (RelFileLocatorBackendIsTemp(smgr_reln[i]->smgr_rlocator))
{
- if (smgr_reln[i]->smgr_rnode.backend == MyBackendId)
- DropRelFileNodeAllLocalBuffers(smgr_reln[i]->smgr_rnode.node);
+ if (smgr_reln[i]->smgr_rlocator.backend == MyBackendId)
+ DropRelFileLocatorAllLocalBuffers(smgr_reln[i]->smgr_rlocator.locator);
}
else
rels[n++] = smgr_reln[i];
@@ -3219,7 +3219,7 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
/*
* We can avoid scanning the entire buffer pool if we know the exact size
- * of each of the given relation forks. See DropRelFileNodeBuffers.
+ * of each of the given relation forks. See DropRelFileLocatorBuffers.
*/
for (i = 0; i < n && cached; i++)
{
@@ -3257,7 +3257,7 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
continue;
/* drop all the buffers for a particular relation fork */
- FindAndDropRelFileNodeBuffers(rels[i]->smgr_rnode.node,
+ FindAndDropRelFileLocatorBuffers(rels[i]->smgr_rlocator.locator,
j, block[i][j], 0);
}
}
@@ -3268,9 +3268,9 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
}
pfree(block);
- nodes = palloc(sizeof(RelFileNode) * n); /* non-local relations */
+ locators = palloc(sizeof(RelFileLocator) * n); /* non-local relations */
for (i = 0; i < n; i++)
- nodes[i] = rels[i]->smgr_rnode.node;
+ locators[i] = rels[i]->smgr_rlocator.locator;
/*
* For low number of relations to drop just use a simple walk through, to
@@ -3280,18 +3280,18 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
*/
use_bsearch = n > RELS_BSEARCH_THRESHOLD;
- /* sort the list of rnodes if necessary */
+ /* sort the list of rlocators if necessary */
if (use_bsearch)
- pg_qsort(nodes, n, sizeof(RelFileNode), rnode_comparator);
+ pg_qsort(locators, n, sizeof(RelFileLocator), rlocator_comparator);
for (i = 0; i < NBuffers; i++)
{
- RelFileNode *rnode = NULL;
+ RelFileLocator *rlocator = NULL;
BufferDesc *bufHdr = GetBufferDescriptor(i);
uint32 buf_state;
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
@@ -3301,37 +3301,37 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
for (j = 0; j < n; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, nodes[j]))
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, locators[j]))
{
- rnode = &nodes[j];
+ rlocator = &locators[j];
break;
}
}
}
else
{
- rnode = bsearch((const void *) &(bufHdr->tag.rnode),
- nodes, n, sizeof(RelFileNode),
- rnode_comparator);
+ rlocator = bsearch((const void *) &(bufHdr->tag.rlocator),
+ locators, n, sizeof(RelFileLocator),
+ rlocator_comparator);
}
- /* buffer doesn't belong to any of the given relfilenodes; skip it */
- if (rnode == NULL)
+ /* buffer doesn't belong to any of the given relfilelocators; skip it */
+ if (rlocator == NULL)
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, (*rnode)))
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, (*rlocator)))
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
}
- pfree(nodes);
+ pfree(locators);
pfree(rels);
}
/* ---------------------------------------------------------------------
- * FindAndDropRelFileNodeBuffers
+ * FindAndDropRelFileLocatorBuffers
*
* This function performs look up in BufMapping table and removes from the
* buffer pool all the pages of the specified relation fork that has block
@@ -3340,9 +3340,9 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
* --------------------------------------------------------------------
*/
static void
-FindAndDropRelFileNodeBuffers(RelFileNode rnode, ForkNumber forkNum,
- BlockNumber nForkBlock,
- BlockNumber firstDelBlock)
+FindAndDropRelFileLocatorBuffers(RelFileLocator rlocator, ForkNumber forkNum,
+ BlockNumber nForkBlock,
+ BlockNumber firstDelBlock)
{
BlockNumber curBlock;
@@ -3356,7 +3356,7 @@ FindAndDropRelFileNodeBuffers(RelFileNode rnode, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(bufTag, rnode, forkNum, curBlock);
+ INIT_BUFFERTAG(bufTag, rlocator, forkNum, curBlock);
/* determine its hash code and partition lock ID */
bufHash = BufTableHashCode(&bufTag);
@@ -3380,7 +3380,7 @@ FindAndDropRelFileNodeBuffers(RelFileNode rnode, ForkNumber forkNum,
*/
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -3397,7 +3397,7 @@ FindAndDropRelFileNodeBuffers(RelFileNode rnode, ForkNumber forkNum,
* bothering to write them out first. This is used when we destroy a
* database, to avoid trying to flush data to disk when the directory
* tree no longer exists. Implementation is pretty similar to
- * DropRelFileNodeBuffers() which is for destroying just one relation.
+ * DropRelFileLocatorBuffers() which is for destroying just one relation.
* --------------------------------------------------------------------
*/
void
@@ -3416,14 +3416,14 @@ DropDatabaseBuffers(Oid dbid)
uint32 buf_state;
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.rlocator.dbOid != dbid)
continue;
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid)
+ if (bufHdr->tag.rlocator.dbOid == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3453,7 +3453,7 @@ PrintBufferDescs(void)
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathbackend(buf->tag.rnode, InvalidBackendId, buf->tag.forkNum),
+ relpathbackend(buf->tag.rlocator, InvalidBackendId, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3478,7 +3478,7 @@ PrintPinnedBufs(void)
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathperm(buf->tag.rnode, buf->tag.forkNum),
+ relpathperm(buf->tag.rlocator, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3517,7 +3517,7 @@ FlushRelationBuffers(Relation rel)
uint32 buf_state;
bufHdr = GetLocalBufferDescriptor(i);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
((buf_state = pg_atomic_read_u32(&bufHdr->state)) &
(BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
@@ -3561,16 +3561,16 @@ FlushRelationBuffers(Relation rel)
bufHdr = GetBufferDescriptor(i);
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node))
+ if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator))
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3608,21 +3608,21 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
for (i = 0; i < nrels; i++)
{
- Assert(!RelFileNodeBackendIsTemp(smgrs[i]->smgr_rnode));
+ Assert(!RelFileLocatorBackendIsTemp(smgrs[i]->smgr_rlocator));
- srels[i].rnode = smgrs[i]->smgr_rnode.node;
+ srels[i].rlocator = smgrs[i]->smgr_rlocator.locator;
srels[i].srel = smgrs[i];
}
/*
* Save the bsearch overhead for low number of relations to sync. See
- * DropRelFileNodesAllBuffers for details.
+ * DropRelFileLocatorsAllBuffers for details.
*/
use_bsearch = nrels > RELS_BSEARCH_THRESHOLD;
/* sort the list of SMgrRelations if necessary */
if (use_bsearch)
- pg_qsort(srels, nrels, sizeof(SMgrSortArray), rnode_comparator);
+ pg_qsort(srels, nrels, sizeof(SMgrSortArray), rlocator_comparator);
/* Make sure we can handle the pin inside the loop */
ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
@@ -3634,7 +3634,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
uint32 buf_state;
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
@@ -3644,7 +3644,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
for (j = 0; j < nrels; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, srels[j].rnode))
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, srels[j].rlocator))
{
srelent = &srels[j];
break;
@@ -3653,19 +3653,19 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
}
else
{
- srelent = bsearch((const void *) &(bufHdr->tag.rnode),
+ srelent = bsearch((const void *) &(bufHdr->tag.rlocator),
srels, nrels, sizeof(SMgrSortArray),
- rnode_comparator);
+ rlocator_comparator);
}
- /* buffer doesn't belong to any of the given relfilenodes; skip it */
+ /* buffer doesn't belong to any of the given relfilelocators; skip it */
if (srelent == NULL)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, srelent->rnode) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, srelent->rlocator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3729,7 +3729,7 @@ RelationCopyStorageUsingBuffer(Relation src, Relation dst, ForkNumber forkNum,
CHECK_FOR_INTERRUPTS();
/* Read block from source relation. */
- srcBuf = ReadBufferWithoutRelcache(src->rd_node, forkNum, blkno,
+ srcBuf = ReadBufferWithoutRelcache(src->rd_locator, forkNum, blkno,
RBM_NORMAL, bstrategy_src,
permanent);
srcPage = BufferGetPage(srcBuf);
@@ -3740,7 +3740,7 @@ RelationCopyStorageUsingBuffer(Relation src, Relation dst, ForkNumber forkNum,
}
/* Use P_NEW to extend the destination relation. */
- dstBuf = ReadBufferWithoutRelcache(dst->rd_node, forkNum, P_NEW,
+ dstBuf = ReadBufferWithoutRelcache(dst->rd_locator, forkNum, P_NEW,
RBM_NORMAL, bstrategy_dst,
permanent);
LockBuffer(dstBuf, BUFFER_LOCK_EXCLUSIVE);
@@ -3775,8 +3775,8 @@ RelationCopyStorageUsingBuffer(Relation src, Relation dst, ForkNumber forkNum,
* --------------------------------------------------------------------
*/
void
-CreateAndCopyRelationData(RelFileNode src_rnode, RelFileNode dst_rnode,
- bool permanent)
+CreateAndCopyRelationData(RelFileLocator src_rlocator,
+ RelFileLocator dst_rlocator, bool permanent)
{
Relation src_rel;
Relation dst_rel;
@@ -3793,8 +3793,8 @@ CreateAndCopyRelationData(RelFileNode src_rnode, RelFileNode dst_rnode,
* used the smgr layer directly, we would have to worry about
* invalidations.
*/
- src_rel = CreateFakeRelcacheEntry(src_rnode);
- dst_rel = CreateFakeRelcacheEntry(dst_rnode);
+ src_rel = CreateFakeRelcacheEntry(src_rlocator);
+ dst_rel = CreateFakeRelcacheEntry(dst_rlocator);
/*
* Create and copy all forks of the relation. During create database we
@@ -3802,7 +3802,7 @@ CreateAndCopyRelationData(RelFileNode src_rnode, RelFileNode dst_rnode,
* directory. Therefore, each individual relation doesn't need to be
* registered for cleanup.
*/
- RelationCreateStorage(dst_rnode, relpersistence, false);
+ RelationCreateStorage(dst_rlocator, relpersistence, false);
/* copy main fork. */
RelationCopyStorageUsingBuffer(src_rel, dst_rel, MAIN_FORKNUM, permanent);
@@ -3820,7 +3820,7 @@ CreateAndCopyRelationData(RelFileNode src_rnode, RelFileNode dst_rnode,
* init fork of an unlogged relation.
*/
if (permanent || forkNum == INIT_FORKNUM)
- log_smgrcreate(&dst_rnode, forkNum);
+ log_smgrcreate(&dst_rlocator, forkNum);
/* Copy a fork's data, block by block. */
RelationCopyStorageUsingBuffer(src_rel, dst_rel, forkNum,
@@ -3864,16 +3864,16 @@ FlushDatabaseBuffers(Oid dbid)
bufHdr = GetBufferDescriptor(i);
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.rlocator.dbOid != dbid)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid &&
+ if (bufHdr->tag.rlocator.dbOid == dbid &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -4034,7 +4034,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
(pg_atomic_read_u32(&bufHdr->state) & BM_PERMANENT))
{
/*
- * If we must not write WAL, due to a relfilenode-specific
+ * If we must not write WAL, due to a relfilelocator-specific
* condition or being in recovery, don't dirty the page. We can
* set the hint, just not dirty the page as a result so the hint
* is lost when we evict the page or shutdown.
@@ -4042,7 +4042,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
* See src/backend/storage/page/README for longer discussion.
*/
if (RecoveryInProgress() ||
- RelFileNodeSkippingWAL(bufHdr->tag.rnode))
+ RelFileLocatorSkippingWAL(bufHdr->tag.rlocator))
return;
/*
@@ -4651,7 +4651,7 @@ AbortBufferIO(void)
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
- path = relpathperm(buf->tag.rnode, buf->tag.forkNum);
+ path = relpathperm(buf->tag.rlocator, buf->tag.forkNum);
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4675,7 +4675,7 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rnode, bufHdr->tag.forkNum);
+ char *path = relpathperm(bufHdr->tag.rlocator, bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4693,7 +4693,7 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
- char *path = relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ char *path = relpathbackend(bufHdr->tag.rlocator, MyBackendId,
bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
@@ -4703,27 +4703,27 @@ local_buffer_write_error_callback(void *arg)
}
/*
- * RelFileNode qsort/bsearch comparator; see RelFileNodeEquals.
+ * RelFileLocator qsort/bsearch comparator; see RelFileLocatorEquals.
*/
static int
-rnode_comparator(const void *p1, const void *p2)
+rlocator_comparator(const void *p1, const void *p2)
{
- RelFileNode n1 = *(const RelFileNode *) p1;
- RelFileNode n2 = *(const RelFileNode *) p2;
+ RelFileLocator n1 = *(const RelFileLocator *) p1;
+ RelFileLocator n2 = *(const RelFileLocator *) p2;
- if (n1.relNode < n2.relNode)
+ if (n1.relNumber < n2.relNumber)
return -1;
- else if (n1.relNode > n2.relNode)
+ else if (n1.relNumber > n2.relNumber)
return 1;
- if (n1.dbNode < n2.dbNode)
+ if (n1.dbOid < n2.dbOid)
return -1;
- else if (n1.dbNode > n2.dbNode)
+ else if (n1.dbOid > n2.dbOid)
return 1;
- if (n1.spcNode < n2.spcNode)
+ if (n1.spcOid < n2.spcOid)
return -1;
- else if (n1.spcNode > n2.spcNode)
+ else if (n1.spcOid > n2.spcOid)
return 1;
else
return 0;
@@ -4789,7 +4789,7 @@ buffertag_comparator(const BufferTag *ba, const BufferTag *bb)
{
int ret;
- ret = rnode_comparator(&ba->rnode, &bb->rnode);
+ ret = rlocator_comparator(&ba->rlocator, &bb->rlocator);
if (ret != 0)
return ret;
@@ -4822,9 +4822,9 @@ ckpt_buforder_comparator(const CkptSortItem *a, const CkptSortItem *b)
else if (a->tsId > b->tsId)
return 1;
/* compare relation */
- if (a->relNode < b->relNode)
+ if (a->relNumber < b->relNumber)
return -1;
- else if (a->relNode > b->relNode)
+ else if (a->relNumber > b->relNumber)
return 1;
/* compare fork */
else if (a->forkNum < b->forkNum)
@@ -4960,7 +4960,7 @@ IssuePendingWritebacks(WritebackContext *context)
next = &context->pending_writebacks[i + ahead + 1];
/* different file, stop */
- if (!RelFileNodeEquals(cur->tag.rnode, next->tag.rnode) ||
+ if (!RelFileLocatorEquals(cur->tag.rlocator, next->tag.rlocator) ||
cur->tag.forkNum != next->tag.forkNum)
break;
@@ -4979,7 +4979,7 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rnode, InvalidBackendId);
+ reln = smgropen(tag.rlocator, InvalidBackendId);
smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
}
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index e71f95a..3dc9cc7 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -68,7 +68,7 @@ PrefetchLocalBuffer(SMgrRelation smgr, ForkNumber forkNum,
BufferTag newTag; /* identity of requested block */
LocalBufferLookupEnt *hresult;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -117,7 +117,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
bool found;
uint32 buf_state;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -134,7 +134,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
Assert(BUFFERTAGS_EQUAL(bufHdr->tag, newTag));
#ifdef LBDEBUG
fprintf(stderr, "LB ALLOC (%u,%d,%d) %d\n",
- smgr->smgr_rnode.node.relNode, forkNum, blockNum, -b - 1);
+ smgr->smgr_rlocator.locator.relNumber, forkNum, blockNum, -b - 1);
#endif
buf_state = pg_atomic_read_u32(&bufHdr->state);
@@ -162,7 +162,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
#ifdef LBDEBUG
fprintf(stderr, "LB ALLOC (%u,%d,%d) %d\n",
- smgr->smgr_rnode.node.relNode, forkNum, blockNum,
+ smgr->smgr_rlocator.locator.relNumber, forkNum, blockNum,
-nextFreeLocalBuf - 1);
#endif
@@ -215,7 +215,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rnode, MyBackendId);
+ oreln = smgropen(bufHdr->tag.rlocator, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
@@ -312,7 +312,7 @@ MarkLocalBufferDirty(Buffer buffer)
}
/*
- * DropRelFileNodeLocalBuffers
+ * DropRelFileLocatorLocalBuffers
* This function removes from the buffer pool all the pages of the
* specified relation that have block numbers >= firstDelBlock.
* (In particular, with firstDelBlock = 0, all pages are removed.)
@@ -320,11 +320,11 @@ MarkLocalBufferDirty(Buffer buffer)
* out first. Therefore, this is NOT rollback-able, and so should be
* used only with extreme caution!
*
- * See DropRelFileNodeBuffers in bufmgr.c for more notes.
+ * See DropRelFileLocatorBuffers in bufmgr.c for more notes.
*/
void
-DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
- BlockNumber firstDelBlock)
+DropRelFileLocatorLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
+ BlockNumber firstDelBlock)
{
int i;
@@ -337,14 +337,14 @@ DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
+ RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ relpathbackend(bufHdr->tag.rlocator, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
@@ -363,14 +363,14 @@ DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
}
/*
- * DropRelFileNodeAllLocalBuffers
+ * DropRelFileLocatorAllLocalBuffers
* This function removes from the buffer pool all pages of all forks
* of the specified relation.
*
- * See DropRelFileNodesAllBuffers in bufmgr.c for more notes.
+ * See DropRelFileLocatorsAllBuffers in bufmgr.c for more notes.
*/
void
-DropRelFileNodeAllLocalBuffers(RelFileNode rnode)
+DropRelFileLocatorAllLocalBuffers(RelFileLocator rlocator)
{
int i;
@@ -383,12 +383,12 @@ DropRelFileNodeAllLocalBuffers(RelFileNode rnode)
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode))
+ RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator))
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ relpathbackend(bufHdr->tag.rlocator, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
@@ -589,7 +589,7 @@ AtProcExit_LocalBuffers(void)
{
/*
* We shouldn't be holding any remaining pins; if we are, and assertions
- * aren't enabled, we'll fail later in DropRelFileNodeBuffers while trying
+ * aren't enabled, we'll fail later in DropRelFileLocatorBuffers while trying
* to drop the temp rels.
*/
CheckForLocalBufferLeaks();
diff --git a/src/backend/storage/freespace/freespace.c b/src/backend/storage/freespace/freespace.c
index d41ae37..005def5 100644
--- a/src/backend/storage/freespace/freespace.c
+++ b/src/backend/storage/freespace/freespace.c
@@ -196,7 +196,7 @@ RecordPageWithFreeSpace(Relation rel, BlockNumber heapBlk, Size spaceAvail)
* WAL replay
*/
void
-XLogRecordPageWithFreeSpace(RelFileNode rnode, BlockNumber heapBlk,
+XLogRecordPageWithFreeSpace(RelFileLocator rlocator, BlockNumber heapBlk,
Size spaceAvail)
{
int new_cat = fsm_space_avail_to_cat(spaceAvail);
@@ -211,8 +211,8 @@ XLogRecordPageWithFreeSpace(RelFileNode rnode, BlockNumber heapBlk,
blkno = fsm_logical_to_physical(addr);
/* If the page doesn't exist already, extend */
- buf = XLogReadBufferExtended(rnode, FSM_FORKNUM, blkno, RBM_ZERO_ON_ERROR,
- InvalidBuffer);
+ buf = XLogReadBufferExtended(rlocator, FSM_FORKNUM, blkno,
+ RBM_ZERO_ON_ERROR, InvalidBuffer);
LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
page = BufferGetPage(buf);
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index d165b35..af4dab7 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -268,13 +268,13 @@ restart:
*
* Fix the corruption and restart.
*/
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blknum;
- BufferGetTag(buf, &rnode, &forknum, &blknum);
+ BufferGetTag(buf, &rlocator, &forknum, &blknum);
elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
- blknum, rnode.spcNode, rnode.dbNode, rnode.relNode);
+ blknum, rlocator.spcOid, rlocator.dbOid, rlocator.relNumber);
/* make sure we hold an exclusive lock */
if (!exclusive_lock_held)
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 671b00a..9dab931 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -442,7 +442,7 @@ ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
}
void
-ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode node)
+ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileLocator locator)
{
VirtualTransactionId *backends;
@@ -461,7 +461,7 @@ ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode
return;
backends = GetConflictingVirtualXIDs(latestRemovedXid,
- node.dbNode);
+ locator.dbOid);
ResolveRecoveryConflictWithVirtualXIDs(backends,
PROCSIG_RECOVERY_CONFLICT_SNAPSHOT,
@@ -475,7 +475,7 @@ ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode
*/
void
ResolveRecoveryConflictWithSnapshotFullXid(FullTransactionId latestRemovedFullXid,
- RelFileNode node)
+ RelFileLocator locator)
{
/*
* ResolveRecoveryConflictWithSnapshot operates on 32-bit TransactionIds,
@@ -493,7 +493,7 @@ ResolveRecoveryConflictWithSnapshotFullXid(FullTransactionId latestRemovedFullXi
TransactionId latestRemovedXid;
latestRemovedXid = XidFromFullTransactionId(latestRemovedFullXid);
- ResolveRecoveryConflictWithSnapshot(latestRemovedXid, node);
+ ResolveRecoveryConflictWithSnapshot(latestRemovedXid, locator);
}
}
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 25e7e4e..5136da6 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -1997,7 +1997,7 @@ PageIsPredicateLocked(Relation relation, BlockNumber blkno)
PREDICATELOCKTARGET *target;
SET_PREDICATELOCKTARGETTAG_PAGE(targettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
blkno);
@@ -2576,7 +2576,7 @@ PredicateLockRelation(Relation relation, Snapshot snapshot)
return;
SET_PREDICATELOCKTARGETTAG_RELATION(tag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id);
PredicateLockAcquire(&tag);
}
@@ -2599,7 +2599,7 @@ PredicateLockPage(Relation relation, BlockNumber blkno, Snapshot snapshot)
return;
SET_PREDICATELOCKTARGETTAG_PAGE(tag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
blkno);
PredicateLockAcquire(&tag);
@@ -2638,13 +2638,13 @@ PredicateLockTID(Relation relation, ItemPointer tid, Snapshot snapshot,
* level lock.
*/
SET_PREDICATELOCKTARGETTAG_RELATION(tag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id);
if (PredicateLockExists(&tag))
return;
SET_PREDICATELOCKTARGETTAG_TUPLE(tag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
ItemPointerGetBlockNumber(tid),
ItemPointerGetOffsetNumber(tid));
@@ -2974,7 +2974,7 @@ DropAllPredicateLocksFromTable(Relation relation, bool transfer)
if (!PredicateLockingNeededForRelation(relation))
return;
- dbId = relation->rd_node.dbNode;
+ dbId = relation->rd_locator.dbOid;
relId = relation->rd_id;
if (relation->rd_index == NULL)
{
@@ -3194,11 +3194,11 @@ PredicateLockPageSplit(Relation relation, BlockNumber oldblkno,
Assert(BlockNumberIsValid(newblkno));
SET_PREDICATELOCKTARGETTAG_PAGE(oldtargettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
oldblkno);
SET_PREDICATELOCKTARGETTAG_PAGE(newtargettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
newblkno);
@@ -4478,7 +4478,7 @@ CheckForSerializableConflictIn(Relation relation, ItemPointer tid, BlockNumber b
if (tid != NULL)
{
SET_PREDICATELOCKTARGETTAG_TUPLE(targettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
ItemPointerGetBlockNumber(tid),
ItemPointerGetOffsetNumber(tid));
@@ -4488,14 +4488,14 @@ CheckForSerializableConflictIn(Relation relation, ItemPointer tid, BlockNumber b
if (blkno != InvalidBlockNumber)
{
SET_PREDICATELOCKTARGETTAG_PAGE(targettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
blkno);
CheckTargetForConflictsIn(&targettag);
}
SET_PREDICATELOCKTARGETTAG_RELATION(targettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id);
CheckTargetForConflictsIn(&targettag);
}
@@ -4556,7 +4556,7 @@ CheckTableForSerializableConflictIn(Relation relation)
Assert(relation->rd_index == NULL); /* not an index relation */
- dbId = relation->rd_node.dbNode;
+ dbId = relation->rd_locator.dbOid;
heapId = relation->rd_id;
LWLockAcquire(SerializablePredicateListLock, LW_EXCLUSIVE);
diff --git a/src/backend/storage/smgr/README b/src/backend/storage/smgr/README
index e1cfc6c..1dfc16f 100644
--- a/src/backend/storage/smgr/README
+++ b/src/backend/storage/smgr/README
@@ -46,7 +46,7 @@ physical relation in system catalogs.
It is assumed that the main fork, fork number 0 or MAIN_FORKNUM, always
exists. Fork numbers are assigned in src/include/common/relpath.h.
Functions in smgr.c and md.c take an extra fork number argument, in addition
-to relfilenode and block number, to identify which relation fork you want to
+to relfilenumber and block number, to identify which relation fork you want to
access. Since most code wants to access the main fork, a shortcut version of
ReadBuffer that accesses MAIN_FORKNUM is provided in the buffer manager for
convenience.
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 43edaf5..3998296 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -35,7 +35,7 @@
#include "storage/bufmgr.h"
#include "storage/fd.h"
#include "storage/md.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "storage/sync.h"
#include "utils/hsearch.h"
@@ -89,11 +89,11 @@ static MemoryContext MdCxt; /* context for all MdfdVec objects */
/* Populate a file tag describing an md.c segment file. */
-#define INIT_MD_FILETAG(a,xx_rnode,xx_forknum,xx_segno) \
+#define INIT_MD_FILETAG(a,xx_rlocator,xx_forknum,xx_segno) \
( \
memset(&(a), 0, sizeof(FileTag)), \
(a).handler = SYNC_HANDLER_MD, \
- (a).rnode = (xx_rnode), \
+ (a).rlocator = (xx_rlocator), \
(a).forknum = (xx_forknum), \
(a).segno = (xx_segno) \
)
@@ -121,14 +121,14 @@ static MemoryContext MdCxt; /* context for all MdfdVec objects */
/* local routines */
-static void mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum,
+static void mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forkNum,
bool isRedo);
static MdfdVec *mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior);
static void register_dirty_segment(SMgrRelation reln, ForkNumber forknum,
MdfdVec *seg);
-static void register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
+static void register_unlink_segment(RelFileLocatorBackend rlocator, ForkNumber forknum,
BlockNumber segno);
-static void register_forget_request(RelFileNodeBackend rnode, ForkNumber forknum,
+static void register_forget_request(RelFileLocatorBackend rlocator, ForkNumber forknum,
BlockNumber segno);
static void _fdvec_resize(SMgrRelation reln,
ForkNumber forknum,
@@ -199,11 +199,11 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* should be here and not in commands/tablespace.c? But that would imply
* importing a lot of stuff that smgr.c oughtn't know, either.
*/
- TablespaceCreateDbspace(reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
+ TablespaceCreateDbspace(reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
isRedo);
- path = relpath(reln->smgr_rnode, forkNum);
+ path = relpath(reln->smgr_rlocator, forkNum);
fd = PathNameOpenFile(path, O_RDWR | O_CREAT | O_EXCL | PG_BINARY);
@@ -234,7 +234,7 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
/*
* mdunlink() -- Unlink a relation.
*
- * Note that we're passed a RelFileNodeBackend --- by the time this is called,
+ * Note that we're passed a RelFileLocatorBackend --- by the time this is called,
* there won't be an SMgrRelation hashtable entry anymore.
*
* forkNum can be a fork number to delete a specific fork, or InvalidForkNumber
@@ -243,10 +243,10 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* For regular relations, we don't unlink the first segment file of the rel,
* but just truncate it to zero length, and record a request to unlink it after
* the next checkpoint. Additional segments can be unlinked immediately,
- * however. Leaving the empty file in place prevents that relfilenode
- * number from being reused. The scenario this protects us from is:
+ * however. Leaving the empty file in place prevents that relfilenumber
+ * from being reused. The scenario this protects us from is:
* 1. We delete a relation (and commit, and actually remove its file).
- * 2. We create a new relation, which by chance gets the same relfilenode as
+ * 2. We create a new relation, which by chance gets the same relfilenumber as
* the just-deleted one (OIDs must've wrapped around for that to happen).
* 3. We crash before another checkpoint occurs.
* During replay, we would delete the file and then recreate it, which is fine
@@ -254,18 +254,18 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* But if we didn't WAL-log insertions, but instead relied on fsyncing the
* file after populating it (as we do at wal_level=minimal), the contents of
* the file would be lost forever. By leaving the empty file until after the
- * next checkpoint, we prevent reassignment of the relfilenode number until
- * it's safe, because relfilenode assignment skips over any existing file.
+ * next checkpoint, we prevent reassignment of the relfilenumber until it's
+ * safe, because relfilenumber assignment skips over any existing file.
*
* We do not need to go through this dance for temp relations, though, because
* we never make WAL entries for temp rels, and so a temp rel poses no threat
- * to the health of a regular rel that has taken over its relfilenode number.
+ * to the health of a regular rel that has taken over its relfilenumber.
* The fact that temp rels and regular rels have different file naming
* patterns provides additional safety.
*
* All the above applies only to the relation's main fork; other forks can
* just be removed immediately, since they are not needed to prevent the
- * relfilenode number from being recycled. Also, we do not carefully
+ * relfilenumber from being recycled. Also, we do not carefully
* track whether other forks have been created or not, but just attempt to
* unlink them unconditionally; so we should never complain about ENOENT.
*
@@ -278,16 +278,16 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* we are usually not in a transaction anymore when this is called.
*/
void
-mdunlink(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
+mdunlink(RelFileLocatorBackend rlocator, ForkNumber forkNum, bool isRedo)
{
/* Now do the per-fork work */
if (forkNum == InvalidForkNumber)
{
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- mdunlinkfork(rnode, forkNum, isRedo);
+ mdunlinkfork(rlocator, forkNum, isRedo);
}
else
- mdunlinkfork(rnode, forkNum, isRedo);
+ mdunlinkfork(rlocator, forkNum, isRedo);
}
/*
@@ -315,25 +315,25 @@ do_truncate(const char *path)
}
static void
-mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
+mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forkNum, bool isRedo)
{
char *path;
int ret;
- path = relpath(rnode, forkNum);
+ path = relpath(rlocator, forkNum);
/*
* Delete or truncate the first segment.
*/
- if (isRedo || forkNum != MAIN_FORKNUM || RelFileNodeBackendIsTemp(rnode))
+ if (isRedo || forkNum != MAIN_FORKNUM || RelFileLocatorBackendIsTemp(rlocator))
{
- if (!RelFileNodeBackendIsTemp(rnode))
+ if (!RelFileLocatorBackendIsTemp(rlocator))
{
/* Prevent other backends' fds from holding on to the disk space */
ret = do_truncate(path);
/* Forget any pending sync requests for the first segment */
- register_forget_request(rnode, forkNum, 0 /* first seg */ );
+ register_forget_request(rlocator, forkNum, 0 /* first seg */ );
}
else
ret = 0;
@@ -354,7 +354,7 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
ret = do_truncate(path);
/* Register request to unlink first segment later */
- register_unlink_segment(rnode, forkNum, 0 /* first seg */ );
+ register_unlink_segment(rlocator, forkNum, 0 /* first seg */ );
}
/*
@@ -373,7 +373,7 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
{
sprintf(segpath, "%s.%u", path, segno);
- if (!RelFileNodeBackendIsTemp(rnode))
+ if (!RelFileLocatorBackendIsTemp(rlocator))
{
/*
* Prevent other backends' fds from holding on to the disk
@@ -386,7 +386,7 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
* Forget any pending sync requests for this segment before we
* try to unlink.
*/
- register_forget_request(rnode, forkNum, segno);
+ register_forget_request(rlocator, forkNum, segno);
}
if (unlink(segpath) < 0)
@@ -437,7 +437,7 @@ mdextend(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
ereport(ERROR,
(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
errmsg("cannot extend file \"%s\" beyond %u blocks",
- relpath(reln->smgr_rnode, forknum),
+ relpath(reln->smgr_rlocator, forknum),
InvalidBlockNumber)));
v = _mdfd_getseg(reln, forknum, blocknum, skipFsync, EXTENSION_CREATE);
@@ -490,7 +490,7 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
if (reln->md_num_open_segs[forknum] > 0)
return &reln->md_seg_fds[forknum][0];
- path = relpath(reln->smgr_rnode, forknum);
+ path = relpath(reln->smgr_rlocator, forknum);
fd = PathNameOpenFile(path, O_RDWR | PG_BINARY);
@@ -645,10 +645,10 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
MdfdVec *v;
TRACE_POSTGRESQL_SMGR_MD_READ_START(forknum, blocknum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
- reln->smgr_rnode.backend);
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
+ reln->smgr_rlocator.backend);
v = _mdfd_getseg(reln, forknum, blocknum, false,
EXTENSION_FAIL | EXTENSION_CREATE_RECOVERY);
@@ -660,10 +660,10 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
nbytes = FileRead(v->mdfd_vfd, buffer, BLCKSZ, seekpos, WAIT_EVENT_DATA_FILE_READ);
TRACE_POSTGRESQL_SMGR_MD_READ_DONE(forknum, blocknum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
- reln->smgr_rnode.backend,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
+ reln->smgr_rlocator.backend,
nbytes,
BLCKSZ);
@@ -715,10 +715,10 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
#endif
TRACE_POSTGRESQL_SMGR_MD_WRITE_START(forknum, blocknum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
- reln->smgr_rnode.backend);
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
+ reln->smgr_rlocator.backend);
v = _mdfd_getseg(reln, forknum, blocknum, skipFsync,
EXTENSION_FAIL | EXTENSION_CREATE_RECOVERY);
@@ -730,10 +730,10 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
nbytes = FileWrite(v->mdfd_vfd, buffer, BLCKSZ, seekpos, WAIT_EVENT_DATA_FILE_WRITE);
TRACE_POSTGRESQL_SMGR_MD_WRITE_DONE(forknum, blocknum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
- reln->smgr_rnode.backend,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
+ reln->smgr_rlocator.backend,
nbytes,
BLCKSZ);
@@ -842,7 +842,7 @@ mdtruncate(SMgrRelation reln, ForkNumber forknum, BlockNumber nblocks)
return;
ereport(ERROR,
(errmsg("could not truncate file \"%s\" to %u blocks: it's only %u blocks now",
- relpath(reln->smgr_rnode, forknum),
+ relpath(reln->smgr_rlocator, forknum),
nblocks, curnblk)));
}
if (nblocks == curnblk)
@@ -983,7 +983,7 @@ register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
{
FileTag tag;
- INIT_MD_FILETAG(tag, reln->smgr_rnode.node, forknum, seg->mdfd_segno);
+ INIT_MD_FILETAG(tag, reln->smgr_rlocator.locator, forknum, seg->mdfd_segno);
/* Temp relations should never be fsync'd */
Assert(!SmgrIsTemp(reln));
@@ -1005,15 +1005,15 @@ register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
* register_unlink_segment() -- Schedule a file to be deleted after next checkpoint
*/
static void
-register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
+register_unlink_segment(RelFileLocatorBackend rlocator, ForkNumber forknum,
BlockNumber segno)
{
FileTag tag;
- INIT_MD_FILETAG(tag, rnode.node, forknum, segno);
+ INIT_MD_FILETAG(tag, rlocator.locator, forknum, segno);
/* Should never be used with temp relations */
- Assert(!RelFileNodeBackendIsTemp(rnode));
+ Assert(!RelFileLocatorBackendIsTemp(rlocator));
RegisterSyncRequest(&tag, SYNC_UNLINK_REQUEST, true /* retryOnError */ );
}
@@ -1022,12 +1022,12 @@ register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
* register_forget_request() -- forget any fsyncs for a relation fork's segment
*/
static void
-register_forget_request(RelFileNodeBackend rnode, ForkNumber forknum,
+register_forget_request(RelFileLocatorBackend rlocator, ForkNumber forknum,
BlockNumber segno)
{
FileTag tag;
- INIT_MD_FILETAG(tag, rnode.node, forknum, segno);
+ INIT_MD_FILETAG(tag, rlocator.locator, forknum, segno);
RegisterSyncRequest(&tag, SYNC_FORGET_REQUEST, true /* retryOnError */ );
}
@@ -1039,13 +1039,13 @@ void
ForgetDatabaseSyncRequests(Oid dbid)
{
FileTag tag;
- RelFileNode rnode;
+ RelFileLocator rlocator;
- rnode.dbNode = dbid;
- rnode.spcNode = 0;
- rnode.relNode = 0;
+ rlocator.dbOid = dbid;
+ rlocator.spcOid = 0;
+ rlocator.relNumber = 0;
- INIT_MD_FILETAG(tag, rnode, InvalidForkNumber, InvalidBlockNumber);
+ INIT_MD_FILETAG(tag, rlocator, InvalidForkNumber, InvalidBlockNumber);
RegisterSyncRequest(&tag, SYNC_FILTER_REQUEST, true /* retryOnError */ );
}
@@ -1054,7 +1054,7 @@ ForgetDatabaseSyncRequests(Oid dbid)
* DropRelationFiles -- drop files of all given relations
*/
void
-DropRelationFiles(RelFileNode *delrels, int ndelrels, bool isRedo)
+DropRelationFiles(RelFileLocator *delrels, int ndelrels, bool isRedo)
{
SMgrRelation *srels;
int i;
@@ -1129,7 +1129,7 @@ _mdfd_segpath(SMgrRelation reln, ForkNumber forknum, BlockNumber segno)
char *path,
*fullpath;
- path = relpath(reln->smgr_rnode, forknum);
+ path = relpath(reln->smgr_rlocator, forknum);
if (segno > 0)
{
@@ -1345,7 +1345,7 @@ _mdnblocks(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
int
mdsyncfiletag(const FileTag *ftag, char *path)
{
- SMgrRelation reln = smgropen(ftag->rnode, InvalidBackendId);
+ SMgrRelation reln = smgropen(ftag->rlocator, InvalidBackendId);
File file;
bool need_to_close;
int result,
@@ -1395,7 +1395,7 @@ mdunlinkfiletag(const FileTag *ftag, char *path)
char *p;
/* Compute the path. */
- p = relpathperm(ftag->rnode, MAIN_FORKNUM);
+ p = relpathperm(ftag->rlocator, MAIN_FORKNUM);
strlcpy(path, p, MAXPGPATH);
pfree(p);
@@ -1417,5 +1417,5 @@ mdfiletagmatches(const FileTag *ftag, const FileTag *candidate)
* We'll return true for all candidates that have the same database OID as
* the ftag from the SYNC_FILTER_REQUEST request, so they're forgotten.
*/
- return ftag->rnode.dbNode == candidate->rnode.dbNode;
+ return ftag->rlocator.dbOid == candidate->rlocator.dbOid;
}
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index a477f70..b21d8c3 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -46,7 +46,7 @@ typedef struct f_smgr
void (*smgr_create) (SMgrRelation reln, ForkNumber forknum,
bool isRedo);
bool (*smgr_exists) (SMgrRelation reln, ForkNumber forknum);
- void (*smgr_unlink) (RelFileNodeBackend rnode, ForkNumber forknum,
+ void (*smgr_unlink) (RelFileLocatorBackend rlocator, ForkNumber forknum,
bool isRedo);
void (*smgr_extend) (SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum, char *buffer, bool skipFsync);
@@ -143,9 +143,9 @@ smgrshutdown(int code, Datum arg)
* This does not attempt to actually open the underlying file.
*/
SMgrRelation
-smgropen(RelFileNode rnode, BackendId backend)
+smgropen(RelFileLocator rlocator, BackendId backend)
{
- RelFileNodeBackend brnode;
+ RelFileLocatorBackend brlocator;
SMgrRelation reln;
bool found;
@@ -154,7 +154,7 @@ smgropen(RelFileNode rnode, BackendId backend)
/* First time through: initialize the hash table */
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileNodeBackend);
+ ctl.keysize = sizeof(RelFileLocatorBackend);
ctl.entrysize = sizeof(SMgrRelationData);
SMgrRelationHash = hash_create("smgr relation table", 400,
&ctl, HASH_ELEM | HASH_BLOBS);
@@ -162,10 +162,10 @@ smgropen(RelFileNode rnode, BackendId backend)
}
/* Look up or create an entry */
- brnode.node = rnode;
- brnode.backend = backend;
+ brlocator.locator = rlocator;
+ brlocator.backend = backend;
reln = (SMgrRelation) hash_search(SMgrRelationHash,
- (void *) &brnode,
+ (void *) &brlocator,
HASH_ENTER, &found);
/* Initialize it if not present before */
@@ -267,7 +267,7 @@ smgrclose(SMgrRelation reln)
dlist_delete(&reln->node);
if (hash_search(SMgrRelationHash,
- (void *) &(reln->smgr_rnode),
+ (void *) &(reln->smgr_rlocator),
HASH_REMOVE, NULL) == NULL)
elog(ERROR, "SMgrRelation hashtable corrupted");
@@ -335,15 +335,15 @@ smgrcloseall(void)
}
/*
- * smgrclosenode() -- Close SMgrRelation object for given RelFileNode,
+ * smgrcloserellocator() -- Close SMgrRelation object for given RelFileLocator,
* if one exists.
*
- * This has the same effects as smgrclose(smgropen(rnode)), but it avoids
+ * This has the same effects as smgrclose(smgropen(rlocator)), but it avoids
* uselessly creating a hashtable entry only to drop it again when no
* such entry exists already.
*/
void
-smgrclosenode(RelFileNodeBackend rnode)
+smgrcloserellocator(RelFileLocatorBackend rlocator)
{
SMgrRelation reln;
@@ -352,7 +352,7 @@ smgrclosenode(RelFileNodeBackend rnode)
return;
reln = (SMgrRelation) hash_search(SMgrRelationHash,
- (void *) &rnode,
+ (void *) &rlocator,
HASH_FIND, NULL);
if (reln != NULL)
smgrclose(reln);
@@ -420,7 +420,7 @@ void
smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo)
{
int i = 0;
- RelFileNodeBackend *rnodes;
+ RelFileLocatorBackend *rlocators;
ForkNumber forknum;
if (nrels == 0)
@@ -430,19 +430,19 @@ smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo)
* Get rid of any remaining buffers for the relations. bufmgr will just
* drop them without bothering to write the contents.
*/
- DropRelFileNodesAllBuffers(rels, nrels);
+ DropRelFileLocatorsAllBuffers(rels, nrels);
/*
* create an array which contains all relations to be dropped, and close
* each relation's forks at the smgr level while at it
*/
- rnodes = palloc(sizeof(RelFileNodeBackend) * nrels);
+ rlocators = palloc(sizeof(RelFileLocatorBackend) * nrels);
for (i = 0; i < nrels; i++)
{
- RelFileNodeBackend rnode = rels[i]->smgr_rnode;
+ RelFileLocatorBackend rlocator = rels[i]->smgr_rlocator;
int which = rels[i]->smgr_which;
- rnodes[i] = rnode;
+ rlocators[i] = rlocator;
/* Close the forks at smgr level */
for (forknum = 0; forknum <= MAX_FORKNUM; forknum++)
@@ -458,7 +458,7 @@ smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo)
* closed our own smgr rel.
*/
for (i = 0; i < nrels; i++)
- CacheInvalidateSmgr(rnodes[i]);
+ CacheInvalidateSmgr(rlocators[i]);
/*
* Delete the physical file(s).
@@ -473,10 +473,10 @@ smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo)
int which = rels[i]->smgr_which;
for (forknum = 0; forknum <= MAX_FORKNUM; forknum++)
- smgrsw[which].smgr_unlink(rnodes[i], forknum, isRedo);
+ smgrsw[which].smgr_unlink(rlocators[i], forknum, isRedo);
}
- pfree(rnodes);
+ pfree(rlocators);
}
@@ -631,7 +631,7 @@ smgrtruncate(SMgrRelation reln, ForkNumber *forknum, int nforks, BlockNumber *nb
* Get rid of any buffers for the about-to-be-deleted blocks. bufmgr will
* just drop them without bothering to write the contents.
*/
- DropRelFileNodeBuffers(reln, forknum, nforks, nblocks);
+ DropRelFileLocatorBuffers(reln, forknum, nforks, nblocks);
/*
* Send a shared-inval message to force other backends to close any smgr
@@ -643,7 +643,7 @@ smgrtruncate(SMgrRelation reln, ForkNumber *forknum, int nforks, BlockNumber *nb
* is a performance-critical path.) As in the unlink code, we want to be
* sure the message is sent before we start changing things on-disk.
*/
- CacheInvalidateSmgr(reln->smgr_rnode);
+ CacheInvalidateSmgr(reln->smgr_rlocator);
/* Do the truncation */
for (i = 0; i < nforks; i++)
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index b4a2c8d..d8ae082 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -27,7 +27,7 @@
#include "utils/builtins.h"
#include "utils/numeric.h"
#include "utils/rel.h"
-#include "utils/relfilenodemap.h"
+#include "utils/relfilenumbermap.h"
#include "utils/relmapper.h"
#include "utils/syscache.h"
@@ -292,7 +292,7 @@ pg_tablespace_size_name(PG_FUNCTION_ARGS)
* is no check here or at the call sites for that.
*/
static int64
-calculate_relation_size(RelFileNode *rfn, BackendId backend, ForkNumber forknum)
+calculate_relation_size(RelFileLocator *rfn, BackendId backend, ForkNumber forknum)
{
int64 totalsize = 0;
char *relationpath;
@@ -349,7 +349,7 @@ pg_relation_size(PG_FUNCTION_ARGS)
if (rel == NULL)
PG_RETURN_NULL();
- size = calculate_relation_size(&(rel->rd_node), rel->rd_backend,
+ size = calculate_relation_size(&(rel->rd_locator), rel->rd_backend,
forkname_to_number(text_to_cstring(forkName)));
relation_close(rel, AccessShareLock);
@@ -374,7 +374,7 @@ calculate_toast_table_size(Oid toastrelid)
/* toast heap size, including FSM and VM size */
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- size += calculate_relation_size(&(toastRel->rd_node),
+ size += calculate_relation_size(&(toastRel->rd_locator),
toastRel->rd_backend, forkNum);
/* toast index size, including FSM and VM size */
@@ -388,7 +388,7 @@ calculate_toast_table_size(Oid toastrelid)
toastIdxRel = relation_open(lfirst_oid(lc),
AccessShareLock);
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- size += calculate_relation_size(&(toastIdxRel->rd_node),
+ size += calculate_relation_size(&(toastIdxRel->rd_locator),
toastIdxRel->rd_backend, forkNum);
relation_close(toastIdxRel, AccessShareLock);
@@ -417,7 +417,7 @@ calculate_table_size(Relation rel)
* heap size, including FSM and VM
*/
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- size += calculate_relation_size(&(rel->rd_node), rel->rd_backend,
+ size += calculate_relation_size(&(rel->rd_locator), rel->rd_backend,
forkNum);
/*
@@ -456,7 +456,7 @@ calculate_indexes_size(Relation rel)
idxRel = relation_open(idxOid, AccessShareLock);
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- size += calculate_relation_size(&(idxRel->rd_node),
+ size += calculate_relation_size(&(idxRel->rd_locator),
idxRel->rd_backend,
forkNum);
@@ -850,7 +850,7 @@ Datum
pg_relation_filenode(PG_FUNCTION_ARGS)
{
Oid relid = PG_GETARG_OID(0);
- Oid result;
+ RelFileNumber result;
HeapTuple tuple;
Form_pg_class relform;
@@ -864,29 +864,29 @@ pg_relation_filenode(PG_FUNCTION_ARGS)
if (relform->relfilenode)
result = relform->relfilenode;
else /* Consult the relation mapper */
- result = RelationMapOidToFilenode(relid,
- relform->relisshared);
+ result = RelationMapOidToFilenumber(relid,
+ relform->relisshared);
}
else
{
/* no storage, return NULL */
- result = InvalidOid;
+ result = InvalidRelFileNumber;
}
ReleaseSysCache(tuple);
- if (!OidIsValid(result))
+ if (!RelFileNumberIsValid(result))
PG_RETURN_NULL();
PG_RETURN_OID(result);
}
/*
- * Get the relation via (reltablespace, relfilenode)
+ * Get the relation via (reltablespace, relfilenumber)
*
* This is expected to be used when somebody wants to match an individual file
* on the filesystem back to its table. That's not trivially possible via
- * pg_class, because that doesn't contain the relfilenodes of shared and nailed
+ * pg_class, because that doesn't contain the relfilenumbers of shared and nailed
* tables.
*
* We don't fail but return NULL if we cannot find a mapping.
@@ -898,14 +898,14 @@ Datum
pg_filenode_relation(PG_FUNCTION_ARGS)
{
Oid reltablespace = PG_GETARG_OID(0);
- Oid relfilenode = PG_GETARG_OID(1);
+ RelFileNumber relfilenumber = PG_GETARG_OID(1);
Oid heaprel;
- /* test needed so RelidByRelfilenode doesn't misbehave */
- if (!OidIsValid(relfilenode))
+ /* test needed so RelidByRelfilenumber doesn't misbehave */
+ if (!RelFileNumberIsValid(relfilenumber))
PG_RETURN_NULL();
- heaprel = RelidByRelfilenode(reltablespace, relfilenode);
+ heaprel = RelidByRelfilenumber(reltablespace, relfilenumber);
if (!OidIsValid(heaprel))
PG_RETURN_NULL();
@@ -924,7 +924,7 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
Oid relid = PG_GETARG_OID(0);
HeapTuple tuple;
Form_pg_class relform;
- RelFileNode rnode;
+ RelFileLocator rlocator;
BackendId backend;
char *path;
@@ -937,29 +937,29 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
{
/* This logic should match RelationInitPhysicalAddr */
if (relform->reltablespace)
- rnode.spcNode = relform->reltablespace;
+ rlocator.spcOid = relform->reltablespace;
else
- rnode.spcNode = MyDatabaseTableSpace;
- if (rnode.spcNode == GLOBALTABLESPACE_OID)
- rnode.dbNode = InvalidOid;
+ rlocator.spcOid = MyDatabaseTableSpace;
+ if (rlocator.spcOid == GLOBALTABLESPACE_OID)
+ rlocator.dbOid = InvalidOid;
else
- rnode.dbNode = MyDatabaseId;
+ rlocator.dbOid = MyDatabaseId;
if (relform->relfilenode)
- rnode.relNode = relform->relfilenode;
+ rlocator.relNumber = relform->relfilenode;
else /* Consult the relation mapper */
- rnode.relNode = RelationMapOidToFilenode(relid,
- relform->relisshared);
+ rlocator.relNumber = RelationMapOidToFilenumber(relid,
+ relform->relisshared);
}
else
{
/* no storage, return NULL */
- rnode.relNode = InvalidOid;
+ rlocator.relNumber = InvalidOid;
/* some compilers generate warnings without these next two lines */
- rnode.dbNode = InvalidOid;
- rnode.spcNode = InvalidOid;
+ rlocator.dbOid = InvalidOid;
+ rlocator.spcOid = InvalidOid;
}
- if (!OidIsValid(rnode.relNode))
+ if (!OidIsValid(rlocator.relNumber))
{
ReleaseSysCache(tuple);
PG_RETURN_NULL();
@@ -990,7 +990,7 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
ReleaseSysCache(tuple);
- path = relpathbackend(rnode, backend, MAIN_FORKNUM);
+ path = relpathbackend(rlocator, backend, MAIN_FORKNUM);
PG_RETURN_TEXT_P(cstring_to_text(path));
}
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 67b9675e..4408c00 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -2,7 +2,7 @@
* pg_upgrade_support.c
*
* server-side functions to set backend global variables
- * to control oid and relfilenode assignment, and do other special
+ * to control oid and relfilenumber assignment, and do other special
* hacks needed for pg_upgrade.
*
* Copyright (c) 2010-2022, PostgreSQL Global Development Group
@@ -98,10 +98,10 @@ binary_upgrade_set_next_heap_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_heap_relfilenode(PG_FUNCTION_ARGS)
{
- Oid nodeoid = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_OID(0);
CHECK_IS_BINARY_UPGRADE;
- binary_upgrade_next_heap_pg_class_relfilenode = nodeoid;
+ binary_upgrade_next_heap_pg_class_relfilenumber = relfilenumber;
PG_RETURN_VOID();
}
@@ -120,10 +120,10 @@ binary_upgrade_set_next_index_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_index_relfilenode(PG_FUNCTION_ARGS)
{
- Oid nodeoid = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_OID(0);
CHECK_IS_BINARY_UPGRADE;
- binary_upgrade_next_index_pg_class_relfilenode = nodeoid;
+ binary_upgrade_next_index_pg_class_relfilenumber = relfilenumber;
PG_RETURN_VOID();
}
@@ -142,10 +142,10 @@ binary_upgrade_set_next_toast_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_toast_relfilenode(PG_FUNCTION_ARGS)
{
- Oid nodeoid = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_OID(0);
CHECK_IS_BINARY_UPGRADE;
- binary_upgrade_next_toast_pg_class_relfilenode = nodeoid;
+ binary_upgrade_next_toast_pg_class_relfilenumber = relfilenumber;
PG_RETURN_VOID();
}
diff --git a/src/backend/utils/cache/Makefile b/src/backend/utils/cache/Makefile
index 38e46d2..5105018 100644
--- a/src/backend/utils/cache/Makefile
+++ b/src/backend/utils/cache/Makefile
@@ -21,7 +21,7 @@ OBJS = \
partcache.o \
plancache.o \
relcache.o \
- relfilenodemap.o \
+ relfilenumbermap.o \
relmapper.o \
spccache.o \
syscache.o \
diff --git a/src/backend/utils/cache/inval.c b/src/backend/utils/cache/inval.c
index af000d4..eb5782f 100644
--- a/src/backend/utils/cache/inval.c
+++ b/src/backend/utils/cache/inval.c
@@ -661,11 +661,11 @@ LocalExecuteInvalidationMessage(SharedInvalidationMessage *msg)
* We could have smgr entries for relations of other databases, so no
* short-circuit test is possible here.
*/
- RelFileNodeBackend rnode;
+ RelFileLocatorBackend rlocator;
- rnode.node = msg->sm.rnode;
- rnode.backend = (msg->sm.backend_hi << 16) | (int) msg->sm.backend_lo;
- smgrclosenode(rnode);
+ rlocator.locator = msg->sm.rlocator;
+ rlocator.backend = (msg->sm.backend_hi << 16) | (int) msg->sm.backend_lo;
+ smgrcloserellocator(rlocator);
}
else if (msg->id == SHAREDINVALRELMAP_ID)
{
@@ -1459,14 +1459,14 @@ CacheInvalidateRelcacheByRelid(Oid relid)
* Thus, the maximum possible backend ID is 2^23-1.
*/
void
-CacheInvalidateSmgr(RelFileNodeBackend rnode)
+CacheInvalidateSmgr(RelFileLocatorBackend rlocator)
{
SharedInvalidationMessage msg;
msg.sm.id = SHAREDINVALSMGR_ID;
- msg.sm.backend_hi = rnode.backend >> 16;
- msg.sm.backend_lo = rnode.backend & 0xffff;
- msg.sm.rnode = rnode.node;
+ msg.sm.backend_hi = rlocator.backend >> 16;
+ msg.sm.backend_lo = rlocator.backend & 0xffff;
+ msg.sm.rlocator = rlocator.locator;
/* check AddCatcacheInvalidationMessage() for an explanation */
VALGRIND_MAKE_MEM_DEFINED(&msg, sizeof(msg));
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 0e8fda9..9bab6af 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -369,7 +369,7 @@ ScanPgRelation(Oid targetRelId, bool indexOK, bool force_non_historic)
/*
* The caller might need a tuple that's newer than the one the historic
* snapshot; currently the only case requiring to do so is looking up the
- * relfilenode of non mapped system relations during decoding. That
+ * relfilenumber of non mapped system relations during decoding. That
* snapshot can't change in the midst of a relcache build, so there's no
* need to register the snapshot.
*/
@@ -1133,8 +1133,8 @@ retry:
relation->rd_refcnt = 0;
relation->rd_isnailed = false;
relation->rd_createSubid = InvalidSubTransactionId;
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
relation->rd_droppedSubid = InvalidSubTransactionId;
switch (relation->rd_rel->relpersistence)
{
@@ -1300,7 +1300,7 @@ retry:
}
/*
- * Initialize the physical addressing info (RelFileNode) for a relcache entry
+ * Initialize the physical addressing info (RelFileLocator) for a relcache entry
*
* Note: at the physical level, relations in the pg_global tablespace must
* be treated as shared, even if relisshared isn't set. Hence we do not
@@ -1309,20 +1309,20 @@ retry:
static void
RelationInitPhysicalAddr(Relation relation)
{
- Oid oldnode = relation->rd_node.relNode;
+ RelFileNumber oldnumber = relation->rd_locator.relNumber;
/* these relations kinds never have storage */
if (!RELKIND_HAS_STORAGE(relation->rd_rel->relkind))
return;
if (relation->rd_rel->reltablespace)
- relation->rd_node.spcNode = relation->rd_rel->reltablespace;
+ relation->rd_locator.spcOid = relation->rd_rel->reltablespace;
else
- relation->rd_node.spcNode = MyDatabaseTableSpace;
- if (relation->rd_node.spcNode == GLOBALTABLESPACE_OID)
- relation->rd_node.dbNode = InvalidOid;
+ relation->rd_locator.spcOid = MyDatabaseTableSpace;
+ if (relation->rd_locator.spcOid == GLOBALTABLESPACE_OID)
+ relation->rd_locator.dbOid = InvalidOid;
else
- relation->rd_node.dbNode = MyDatabaseId;
+ relation->rd_locator.dbOid = MyDatabaseId;
if (relation->rd_rel->relfilenode)
{
@@ -1356,30 +1356,30 @@ RelationInitPhysicalAddr(Relation relation)
heap_freetuple(phys_tuple);
}
- relation->rd_node.relNode = relation->rd_rel->relfilenode;
+ relation->rd_locator.relNumber = relation->rd_rel->relfilenode;
}
else
{
/* Consult the relation mapper */
- relation->rd_node.relNode =
- RelationMapOidToFilenode(relation->rd_id,
- relation->rd_rel->relisshared);
- if (!OidIsValid(relation->rd_node.relNode))
+ relation->rd_locator.relNumber =
+ RelationMapOidToFilenumber(relation->rd_id,
+ relation->rd_rel->relisshared);
+ if (!RelFileNumberIsValid(relation->rd_locator.relNumber))
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
RelationGetRelationName(relation), relation->rd_id);
}
/*
* For RelationNeedsWAL() to answer correctly on parallel workers, restore
- * rd_firstRelfilenodeSubid. No subtransactions start or end while in
+ * rd_firstRelfilelocatorSubid. No subtransactions start or end while in
* parallel mode, so the specific SubTransactionId does not matter.
*/
- if (IsParallelWorker() && oldnode != relation->rd_node.relNode)
+ if (IsParallelWorker() && oldnumber != relation->rd_locator.relNumber)
{
- if (RelFileNodeSkippingWAL(relation->rd_node))
- relation->rd_firstRelfilenodeSubid = TopSubTransactionId;
+ if (RelFileLocatorSkippingWAL(relation->rd_locator))
+ relation->rd_firstRelfilelocatorSubid = TopSubTransactionId;
else
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
}
}
@@ -1889,8 +1889,8 @@ formrdesc(const char *relationName, Oid relationReltype,
*/
relation->rd_isnailed = true;
relation->rd_createSubid = InvalidSubTransactionId;
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
relation->rd_droppedSubid = InvalidSubTransactionId;
relation->rd_backend = InvalidBackendId;
relation->rd_islocaltemp = false;
@@ -1978,9 +1978,9 @@ formrdesc(const char *relationName, Oid relationReltype,
/*
* All relations made with formrdesc are mapped. This is necessarily so
- * because there is no other way to know what filenode they currently
+ * because there is no other way to know what filenumber they currently
* have. In bootstrap mode, add them to the initial relation mapper data,
- * specifying that the initial filenode is the same as the OID.
+ * specifying that the initial filenumber is the same as the OID.
*/
relation->rd_rel->relfilenode = InvalidOid;
if (IsBootstrapProcessingMode())
@@ -2180,7 +2180,7 @@ RelationClose(Relation relation)
#ifdef RELCACHE_FORCE_RELEASE
if (RelationHasReferenceCountZero(relation) &&
relation->rd_createSubid == InvalidSubTransactionId &&
- relation->rd_firstRelfilenodeSubid == InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid == InvalidSubTransactionId)
RelationClearRelation(relation, false);
#endif
}
@@ -2352,7 +2352,7 @@ RelationReloadNailed(Relation relation)
{
/*
* If it's a nailed-but-not-mapped index, then we need to re-read the
- * pg_class row to see if its relfilenode changed.
+ * pg_class row to see if its relfilenumber changed.
*/
RelationReloadIndexInfo(relation);
}
@@ -2700,8 +2700,8 @@ RelationClearRelation(Relation relation, bool rebuild)
Assert(newrel->rd_isnailed == relation->rd_isnailed);
/* creation sub-XIDs must be preserved */
SWAPFIELD(SubTransactionId, rd_createSubid);
- SWAPFIELD(SubTransactionId, rd_newRelfilenodeSubid);
- SWAPFIELD(SubTransactionId, rd_firstRelfilenodeSubid);
+ SWAPFIELD(SubTransactionId, rd_newRelfilelocatorSubid);
+ SWAPFIELD(SubTransactionId, rd_firstRelfilelocatorSubid);
SWAPFIELD(SubTransactionId, rd_droppedSubid);
/* un-swap rd_rel pointers, swap contents instead */
SWAPFIELD(Form_pg_class, rd_rel);
@@ -2791,12 +2791,12 @@ static void
RelationFlushRelation(Relation relation)
{
if (relation->rd_createSubid != InvalidSubTransactionId ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId)
{
/*
* New relcache entries are always rebuilt, not flushed; else we'd
* forget the "new" status of the relation. Ditto for the
- * new-relfilenode status.
+ * new-relfilenumber status.
*
* The rel could have zero refcnt here, so temporarily increment the
* refcnt to ensure it's safe to rebuild it. We can assume that the
@@ -2835,7 +2835,7 @@ RelationForgetRelation(Oid rid)
Assert(relation->rd_droppedSubid == InvalidSubTransactionId);
if (relation->rd_createSubid != InvalidSubTransactionId ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId)
{
/*
* In the event of subtransaction rollback, we must not forget
@@ -2894,7 +2894,7 @@ RelationCacheInvalidateEntry(Oid relationId)
*
* Apart from debug_discard_caches, this is currently used only to recover
* from SI message buffer overflow, so we do not touch relations having
- * new-in-transaction relfilenodes; they cannot be targets of cross-backend
+ * new-in-transaction relfilenumbers; they cannot be targets of cross-backend
* SI updates (and our own updates now go through a separate linked list
* that isn't limited by the SI message buffer size).
*
@@ -2909,7 +2909,7 @@ RelationCacheInvalidateEntry(Oid relationId)
* so hash_seq_search will complete safely; (b) during the second pass we
* only hold onto pointers to nondeletable entries.
*
- * The two-phase approach also makes it easy to update relfilenodes for
+ * The two-phase approach also makes it easy to update relfilenumbers for
* mapped relations before we do anything else, and to ensure that the
* second pass processes nailed-in-cache items before other nondeletable
* items. This should ensure that system catalogs are up to date before
@@ -2948,12 +2948,12 @@ RelationCacheInvalidate(bool debug_discard)
/*
* Ignore new relations; no other backend will manipulate them before
- * we commit. Likewise, before replacing a relation's relfilenode, we
- * shall have acquired AccessExclusiveLock and drained any applicable
- * pending invalidations.
+ * we commit. Likewise, before replacing a relation's relfilenumber,
+ * we shall have acquired AccessExclusiveLock and drained any
+ * applicable pending invalidations.
*/
if (relation->rd_createSubid != InvalidSubTransactionId ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId)
continue;
relcacheInvalsReceived++;
@@ -2967,8 +2967,8 @@ RelationCacheInvalidate(bool debug_discard)
else
{
/*
- * If it's a mapped relation, immediately update its rd_node in
- * case its relfilenode changed. We must do this during phase 1
+ * If it's a mapped relation, immediately update its rd_locator in
+ * case its relfilenumber changed. We must do this during phase 1
* in case the relation is consulted during rebuild of other
* relcache entries in phase 2. It's safe since consulting the
* map doesn't involve any access to relcache entries.
@@ -3078,14 +3078,14 @@ AssertPendingSyncConsistency(Relation relation)
RelationIsPermanent(relation) &&
((relation->rd_createSubid != InvalidSubTransactionId &&
RELKIND_HAS_STORAGE(relation->rd_rel->relkind)) ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId);
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId);
- Assert(relcache_verdict == RelFileNodeSkippingWAL(relation->rd_node));
+ Assert(relcache_verdict == RelFileLocatorSkippingWAL(relation->rd_locator));
if (relation->rd_droppedSubid != InvalidSubTransactionId)
Assert(!relation->rd_isvalid &&
(relation->rd_createSubid != InvalidSubTransactionId ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId));
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId));
}
/*
@@ -3282,8 +3282,8 @@ AtEOXact_cleanup(Relation relation, bool isCommit)
* also lets RelationClearRelation() drop the relcache entry.
*/
relation->rd_createSubid = InvalidSubTransactionId;
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
relation->rd_droppedSubid = InvalidSubTransactionId;
if (clear_relcache)
@@ -3397,8 +3397,8 @@ AtEOSubXact_cleanup(Relation relation, bool isCommit,
{
/* allow the entry to be removed */
relation->rd_createSubid = InvalidSubTransactionId;
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
relation->rd_droppedSubid = InvalidSubTransactionId;
RelationClearRelation(relation, false);
return;
@@ -3419,23 +3419,23 @@ AtEOSubXact_cleanup(Relation relation, bool isCommit,
}
/*
- * Likewise, update or drop any new-relfilenode-in-subtransaction record
+ * Likewise, update or drop any new-relfilenumber-in-subtransaction record
* or drop record.
*/
- if (relation->rd_newRelfilenodeSubid == mySubid)
+ if (relation->rd_newRelfilelocatorSubid == mySubid)
{
if (isCommit)
- relation->rd_newRelfilenodeSubid = parentSubid;
+ relation->rd_newRelfilelocatorSubid = parentSubid;
else
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
}
- if (relation->rd_firstRelfilenodeSubid == mySubid)
+ if (relation->rd_firstRelfilelocatorSubid == mySubid)
{
if (isCommit)
- relation->rd_firstRelfilenodeSubid = parentSubid;
+ relation->rd_firstRelfilelocatorSubid = parentSubid;
else
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
}
if (relation->rd_droppedSubid == mySubid)
@@ -3459,7 +3459,7 @@ RelationBuildLocalRelation(const char *relname,
TupleDesc tupDesc,
Oid relid,
Oid accessmtd,
- Oid relfilenode,
+ RelFileNumber relfilenumber,
Oid reltablespace,
bool shared_relation,
bool mapped_relation,
@@ -3533,8 +3533,8 @@ RelationBuildLocalRelation(const char *relname,
/* it's being created in this transaction */
rel->rd_createSubid = GetCurrentSubTransactionId();
- rel->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- rel->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ rel->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ rel->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
rel->rd_droppedSubid = InvalidSubTransactionId;
/*
@@ -3616,7 +3616,7 @@ RelationBuildLocalRelation(const char *relname,
/*
* Insert relation physical and logical identifiers (OIDs) into the right
- * places. For a mapped relation, we set relfilenode to zero and rely on
+ * places. For a mapped relation, we set relfilenumber to zero and rely on
* RelationInitPhysicalAddr to consult the map.
*/
rel->rd_rel->relisshared = shared_relation;
@@ -3632,10 +3632,10 @@ RelationBuildLocalRelation(const char *relname,
{
rel->rd_rel->relfilenode = InvalidOid;
/* Add it to the active mapping information */
- RelationMapUpdateMap(relid, relfilenode, shared_relation, true);
+ RelationMapUpdateMap(relid, relfilenumber, shared_relation, true);
}
else
- rel->rd_rel->relfilenode = relfilenode;
+ rel->rd_rel->relfilenode = relfilenumber;
RelationInitLockInfo(rel); /* see lmgr.c */
@@ -3683,13 +3683,13 @@ RelationBuildLocalRelation(const char *relname,
/*
- * RelationSetNewRelfilenode
+ * RelationSetNewRelfilenumber
*
- * Assign a new relfilenode (physical file name), and possibly a new
+ * Assign a new relfilenumber (physical file name), and possibly a new
* persistence setting, to the relation.
*
* This allows a full rewrite of the relation to be done with transactional
- * safety (since the filenode assignment can be rolled back). Note however
+ * safety (since the filenumber assignment can be rolled back). Note however
* that there is no simple way to access the relation's old data for the
* remainder of the current transaction. This limits the usefulness to cases
* such as TRUNCATE or rebuilding an index from scratch.
@@ -3697,19 +3697,19 @@ RelationBuildLocalRelation(const char *relname,
* Caller must already hold exclusive lock on the relation.
*/
void
-RelationSetNewRelfilenode(Relation relation, char persistence)
+RelationSetNewRelfilenumber(Relation relation, char persistence)
{
- Oid newrelfilenode;
+ RelFileNumber newrelfilenumber;
Relation pg_class;
HeapTuple tuple;
Form_pg_class classform;
MultiXactId minmulti = InvalidMultiXactId;
TransactionId freezeXid = InvalidTransactionId;
- RelFileNode newrnode;
+ RelFileLocator newrlocator;
- /* Allocate a new relfilenode */
- newrelfilenode = GetNewRelFileNode(relation->rd_rel->reltablespace, NULL,
- persistence);
+ /* Allocate a new relfilenumber */
+ newrelfilenumber = GetNewRelFileNumber(relation->rd_rel->reltablespace,
+ NULL, persistence);
/*
* Get a writable copy of the pg_class tuple for the given relation.
@@ -3729,28 +3729,28 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
RelationDropStorage(relation);
/*
- * Create storage for the main fork of the new relfilenode. If it's a
+ * Create storage for the main fork of the new relfilenumber. If it's a
* table-like object, call into the table AM to do so, which'll also
* create the table's init fork if needed.
*
- * NOTE: If relevant for the AM, any conflict in relfilenode value will be
- * caught here, if GetNewRelFileNode messes up for any reason.
+ * NOTE: If relevant for the AM, any conflict in relfilenumber value will be
+ * caught here, if GetNewRelFileNumber messes up for any reason.
*/
- newrnode = relation->rd_node;
- newrnode.relNode = newrelfilenode;
+ newrlocator = relation->rd_locator;
+ newrlocator.relNumber = newrelfilenumber;
if (RELKIND_HAS_TABLE_AM(relation->rd_rel->relkind))
{
- table_relation_set_new_filenode(relation, &newrnode,
- persistence,
- &freezeXid, &minmulti);
+ table_relation_set_new_filelocator(relation, &newrlocator,
+ persistence,
+ &freezeXid, &minmulti);
}
else if (RELKIND_HAS_STORAGE(relation->rd_rel->relkind))
{
/* handle these directly, at least for now */
SMgrRelation srel;
- srel = RelationCreateStorage(newrnode, persistence, true);
+ srel = RelationCreateStorage(newrlocator, persistence, true);
smgrclose(srel);
}
else
@@ -3789,7 +3789,7 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
/* Do the deed */
RelationMapUpdateMap(RelationGetRelid(relation),
- newrelfilenode,
+ newrelfilenumber,
relation->rd_rel->relisshared,
false);
@@ -3799,7 +3799,7 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
else
{
/* Normal case, update the pg_class entry */
- classform->relfilenode = newrelfilenode;
+ classform->relfilenode = newrelfilenumber;
/* relpages etc. never change for sequences */
if (relation->rd_rel->relkind != RELKIND_SEQUENCE)
@@ -3825,27 +3825,27 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
*/
CommandCounterIncrement();
- RelationAssumeNewRelfilenode(relation);
+ RelationAssumeNewRelfilelocator(relation);
}
/*
- * RelationAssumeNewRelfilenode
+ * RelationAssumeNewRelfilelocator
*
* Code that modifies pg_class.reltablespace or pg_class.relfilenode must call
* this. The call shall precede any code that might insert WAL records whose
- * replay would modify bytes in the new RelFileNode, and the call shall follow
- * any WAL modifying bytes in the prior RelFileNode. See struct RelationData.
+ * replay would modify bytes in the new RelFileLocator, and the call shall follow
+ * any WAL modifying bytes in the prior RelFileLocator. See struct RelationData.
* Ideally, call this as near as possible to the CommandCounterIncrement()
* that makes the pg_class change visible (before it or after it); that
* minimizes the chance of future development adding a forbidden WAL insertion
- * between RelationAssumeNewRelfilenode() and CommandCounterIncrement().
+ * between RelationAssumeNewRelfilelocator() and CommandCounterIncrement().
*/
void
-RelationAssumeNewRelfilenode(Relation relation)
+RelationAssumeNewRelfilelocator(Relation relation)
{
- relation->rd_newRelfilenodeSubid = GetCurrentSubTransactionId();
- if (relation->rd_firstRelfilenodeSubid == InvalidSubTransactionId)
- relation->rd_firstRelfilenodeSubid = relation->rd_newRelfilenodeSubid;
+ relation->rd_newRelfilelocatorSubid = GetCurrentSubTransactionId();
+ if (relation->rd_firstRelfilelocatorSubid == InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid = relation->rd_newRelfilelocatorSubid;
/* Flag relation as needing eoxact cleanup (to clear these fields) */
EOXactListAdd(relation);
@@ -6254,8 +6254,8 @@ load_relcache_init_file(bool shared)
rel->rd_fkeyvalid = false;
rel->rd_fkeylist = NIL;
rel->rd_createSubid = InvalidSubTransactionId;
- rel->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- rel->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ rel->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ rel->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
rel->rd_droppedSubid = InvalidSubTransactionId;
rel->rd_amcache = NULL;
MemSet(&rel->pgstat_info, 0, sizeof(rel->pgstat_info));
diff --git a/src/backend/utils/cache/relfilenodemap.c b/src/backend/utils/cache/relfilenodemap.c
deleted file mode 100644
index 70c323c..0000000
--- a/src/backend/utils/cache/relfilenodemap.c
+++ /dev/null
@@ -1,244 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * relfilenodemap.c
- * relfilenode to oid mapping cache.
- *
- * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * IDENTIFICATION
- * src/backend/utils/cache/relfilenodemap.c
- *
- *-------------------------------------------------------------------------
- */
-#include "postgres.h"
-
-#include "access/genam.h"
-#include "access/htup_details.h"
-#include "access/table.h"
-#include "catalog/pg_class.h"
-#include "catalog/pg_tablespace.h"
-#include "miscadmin.h"
-#include "utils/builtins.h"
-#include "utils/catcache.h"
-#include "utils/fmgroids.h"
-#include "utils/hsearch.h"
-#include "utils/inval.h"
-#include "utils/rel.h"
-#include "utils/relfilenodemap.h"
-#include "utils/relmapper.h"
-
-/* Hash table for information about each relfilenode <-> oid pair */
-static HTAB *RelfilenodeMapHash = NULL;
-
-/* built first time through in InitializeRelfilenodeMap */
-static ScanKeyData relfilenode_skey[2];
-
-typedef struct
-{
- Oid reltablespace;
- Oid relfilenode;
-} RelfilenodeMapKey;
-
-typedef struct
-{
- RelfilenodeMapKey key; /* lookup key - must be first */
- Oid relid; /* pg_class.oid */
-} RelfilenodeMapEntry;
-
-/*
- * RelfilenodeMapInvalidateCallback
- * Flush mapping entries when pg_class is updated in a relevant fashion.
- */
-static void
-RelfilenodeMapInvalidateCallback(Datum arg, Oid relid)
-{
- HASH_SEQ_STATUS status;
- RelfilenodeMapEntry *entry;
-
- /* callback only gets registered after creating the hash */
- Assert(RelfilenodeMapHash != NULL);
-
- hash_seq_init(&status, RelfilenodeMapHash);
- while ((entry = (RelfilenodeMapEntry *) hash_seq_search(&status)) != NULL)
- {
- /*
- * If relid is InvalidOid, signaling a complete reset, we must remove
- * all entries, otherwise just remove the specific relation's entry.
- * Always remove negative cache entries.
- */
- if (relid == InvalidOid || /* complete reset */
- entry->relid == InvalidOid || /* negative cache entry */
- entry->relid == relid) /* individual flushed relation */
- {
- if (hash_search(RelfilenodeMapHash,
- (void *) &entry->key,
- HASH_REMOVE,
- NULL) == NULL)
- elog(ERROR, "hash table corrupted");
- }
- }
-}
-
-/*
- * InitializeRelfilenodeMap
- * Initialize cache, either on first use or after a reset.
- */
-static void
-InitializeRelfilenodeMap(void)
-{
- HASHCTL ctl;
- int i;
-
- /* Make sure we've initialized CacheMemoryContext. */
- if (CacheMemoryContext == NULL)
- CreateCacheMemoryContext();
-
- /* build skey */
- MemSet(&relfilenode_skey, 0, sizeof(relfilenode_skey));
-
- for (i = 0; i < 2; i++)
- {
- fmgr_info_cxt(F_OIDEQ,
- &relfilenode_skey[i].sk_func,
- CacheMemoryContext);
- relfilenode_skey[i].sk_strategy = BTEqualStrategyNumber;
- relfilenode_skey[i].sk_subtype = InvalidOid;
- relfilenode_skey[i].sk_collation = InvalidOid;
- }
-
- relfilenode_skey[0].sk_attno = Anum_pg_class_reltablespace;
- relfilenode_skey[1].sk_attno = Anum_pg_class_relfilenode;
-
- /*
- * Only create the RelfilenodeMapHash now, so we don't end up partially
- * initialized when fmgr_info_cxt() above ERRORs out with an out of memory
- * error.
- */
- ctl.keysize = sizeof(RelfilenodeMapKey);
- ctl.entrysize = sizeof(RelfilenodeMapEntry);
- ctl.hcxt = CacheMemoryContext;
-
- RelfilenodeMapHash =
- hash_create("RelfilenodeMap cache", 64, &ctl,
- HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
-
- /* Watch for invalidation events. */
- CacheRegisterRelcacheCallback(RelfilenodeMapInvalidateCallback,
- (Datum) 0);
-}
-
-/*
- * Map a relation's (tablespace, filenode) to a relation's oid and cache the
- * result.
- *
- * Returns InvalidOid if no relation matching the criteria could be found.
- */
-Oid
-RelidByRelfilenode(Oid reltablespace, Oid relfilenode)
-{
- RelfilenodeMapKey key;
- RelfilenodeMapEntry *entry;
- bool found;
- SysScanDesc scandesc;
- Relation relation;
- HeapTuple ntp;
- ScanKeyData skey[2];
- Oid relid;
-
- if (RelfilenodeMapHash == NULL)
- InitializeRelfilenodeMap();
-
- /* pg_class will show 0 when the value is actually MyDatabaseTableSpace */
- if (reltablespace == MyDatabaseTableSpace)
- reltablespace = 0;
-
- MemSet(&key, 0, sizeof(key));
- key.reltablespace = reltablespace;
- key.relfilenode = relfilenode;
-
- /*
- * Check cache and return entry if one is found. Even if no target
- * relation can be found later on we store the negative match and return a
- * InvalidOid from cache. That's not really necessary for performance
- * since querying invalid values isn't supposed to be a frequent thing,
- * but it's basically free.
- */
- entry = hash_search(RelfilenodeMapHash, (void *) &key, HASH_FIND, &found);
-
- if (found)
- return entry->relid;
-
- /* ok, no previous cache entry, do it the hard way */
-
- /* initialize empty/negative cache entry before doing the actual lookups */
- relid = InvalidOid;
-
- if (reltablespace == GLOBALTABLESPACE_OID)
- {
- /*
- * Ok, shared table, check relmapper.
- */
- relid = RelationMapFilenodeToOid(relfilenode, true);
- }
- else
- {
- /*
- * Not a shared table, could either be a plain relation or a
- * non-shared, nailed one, like e.g. pg_class.
- */
-
- /* check for plain relations by looking in pg_class */
- relation = table_open(RelationRelationId, AccessShareLock);
-
- /* copy scankey to local copy, it will be modified during the scan */
- memcpy(skey, relfilenode_skey, sizeof(skey));
-
- /* set scan arguments */
- skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
- skey[1].sk_argument = ObjectIdGetDatum(relfilenode);
-
- scandesc = systable_beginscan(relation,
- ClassTblspcRelfilenodeIndexId,
- true,
- NULL,
- 2,
- skey);
-
- found = false;
-
- while (HeapTupleIsValid(ntp = systable_getnext(scandesc)))
- {
- Form_pg_class classform = (Form_pg_class) GETSTRUCT(ntp);
-
- if (found)
- elog(ERROR,
- "unexpected duplicate for tablespace %u, relfilenode %u",
- reltablespace, relfilenode);
- found = true;
-
- Assert(classform->reltablespace == reltablespace);
- Assert(classform->relfilenode == relfilenode);
- relid = classform->oid;
- }
-
- systable_endscan(scandesc);
- table_close(relation, AccessShareLock);
-
- /* check for tables that are mapped but not shared */
- if (!found)
- relid = RelationMapFilenodeToOid(relfilenode, false);
- }
-
- /*
- * Only enter entry into cache now, our opening of pg_class could have
- * caused cache invalidations to be executed which would have deleted a
- * new entry if we had entered it above.
- */
- entry = hash_search(RelfilenodeMapHash, (void *) &key, HASH_ENTER, &found);
- if (found)
- elog(ERROR, "corrupted hashtable");
- entry->relid = relid;
-
- return relid;
-}
diff --git a/src/backend/utils/cache/relfilenumbermap.c b/src/backend/utils/cache/relfilenumbermap.c
new file mode 100644
index 0000000..3dc45e9
--- /dev/null
+++ b/src/backend/utils/cache/relfilenumbermap.c
@@ -0,0 +1,244 @@
+/*-------------------------------------------------------------------------
+ *
+ * relfilenumbermap.c
+ * relfilenumber to oid mapping cache.
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/backend/utils/cache/relfilenumbermap.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/htup_details.h"
+#include "access/table.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_tablespace.h"
+#include "miscadmin.h"
+#include "utils/builtins.h"
+#include "utils/catcache.h"
+#include "utils/fmgroids.h"
+#include "utils/hsearch.h"
+#include "utils/inval.h"
+#include "utils/rel.h"
+#include "utils/relfilenumbermap.h"
+#include "utils/relmapper.h"
+
+/* Hash table for information about each relfilenumber <-> oid pair */
+static HTAB *RelfilenumberMapHash = NULL;
+
+/* built first time through in InitializeRelfilenumberMap */
+static ScanKeyData relfilenumber_skey[2];
+
+typedef struct
+{
+ Oid reltablespace;
+ RelFileNumber relfilenumber;
+} RelfilenumberMapKey;
+
+typedef struct
+{
+ RelfilenumberMapKey key; /* lookup key - must be first */
+ Oid relid; /* pg_class.oid */
+} RelfilenumberMapEntry;
+
+/*
+ * RelfilenumberMapInvalidateCallback
+ * Flush mapping entries when pg_class is updated in a relevant fashion.
+ */
+static void
+RelfilenumberMapInvalidateCallback(Datum arg, Oid relid)
+{
+ HASH_SEQ_STATUS status;
+ RelfilenumberMapEntry *entry;
+
+ /* callback only gets registered after creating the hash */
+ Assert(RelfilenumberMapHash != NULL);
+
+ hash_seq_init(&status, RelfilenumberMapHash);
+ while ((entry = (RelfilenumberMapEntry *) hash_seq_search(&status)) != NULL)
+ {
+ /*
+ * If relid is InvalidOid, signaling a complete reset, we must remove
+	 * all entries; otherwise, just remove the specific relation's entry.
+ * Always remove negative cache entries.
+ */
+ if (relid == InvalidOid || /* complete reset */
+ entry->relid == InvalidOid || /* negative cache entry */
+ entry->relid == relid) /* individual flushed relation */
+ {
+ if (hash_search(RelfilenumberMapHash,
+ (void *) &entry->key,
+ HASH_REMOVE,
+ NULL) == NULL)
+ elog(ERROR, "hash table corrupted");
+ }
+ }
+}
+
+/*
+ * InitializeRelfilenumberMap
+ * Initialize cache, either on first use or after a reset.
+ */
+static void
+InitializeRelfilenumberMap(void)
+{
+ HASHCTL ctl;
+ int i;
+
+ /* Make sure we've initialized CacheMemoryContext. */
+ if (CacheMemoryContext == NULL)
+ CreateCacheMemoryContext();
+
+ /* build skey */
+ MemSet(&relfilenumber_skey, 0, sizeof(relfilenumber_skey));
+
+ for (i = 0; i < 2; i++)
+ {
+ fmgr_info_cxt(F_OIDEQ,
+ &relfilenumber_skey[i].sk_func,
+ CacheMemoryContext);
+ relfilenumber_skey[i].sk_strategy = BTEqualStrategyNumber;
+ relfilenumber_skey[i].sk_subtype = InvalidOid;
+ relfilenumber_skey[i].sk_collation = InvalidOid;
+ }
+
+ relfilenumber_skey[0].sk_attno = Anum_pg_class_reltablespace;
+ relfilenumber_skey[1].sk_attno = Anum_pg_class_relfilenode;
+
+ /*
+ * Only create the RelfilenumberMapHash now, so we don't end up partially
+ * initialized when fmgr_info_cxt() above ERRORs out with an out of memory
+ * error.
+ */
+ ctl.keysize = sizeof(RelfilenumberMapKey);
+ ctl.entrysize = sizeof(RelfilenumberMapEntry);
+ ctl.hcxt = CacheMemoryContext;
+
+ RelfilenumberMapHash =
+ hash_create("RelfilenumberMap cache", 64, &ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
+ /* Watch for invalidation events. */
+ CacheRegisterRelcacheCallback(RelfilenumberMapInvalidateCallback,
+ (Datum) 0);
+}
+
+/*
+ * Map a relation's (tablespace, relfilenumber) to a relation's oid and cache
+ * the result.
+ *
+ * Returns InvalidOid if no relation matching the criteria could be found.
+ */
+Oid
+RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
+{
+ RelfilenumberMapKey key;
+ RelfilenumberMapEntry *entry;
+ bool found;
+ SysScanDesc scandesc;
+ Relation relation;
+ HeapTuple ntp;
+ ScanKeyData skey[2];
+ Oid relid;
+
+ if (RelfilenumberMapHash == NULL)
+ InitializeRelfilenumberMap();
+
+ /* pg_class will show 0 when the value is actually MyDatabaseTableSpace */
+ if (reltablespace == MyDatabaseTableSpace)
+ reltablespace = 0;
+
+ MemSet(&key, 0, sizeof(key));
+ key.reltablespace = reltablespace;
+ key.relfilenumber = relfilenumber;
+
+ /*
+ * Check cache and return entry if one is found. Even if no target
+	 * relation can be found later on, we store the negative match and return an
+	 * InvalidOid from cache.  That's not really necessary for performance
+ * since querying invalid values isn't supposed to be a frequent thing,
+ * but it's basically free.
+ */
+ entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_FIND, &found);
+
+ if (found)
+ return entry->relid;
+
+ /* ok, no previous cache entry, do it the hard way */
+
+ /* initialize empty/negative cache entry before doing the actual lookups */
+ relid = InvalidOid;
+
+ if (reltablespace == GLOBALTABLESPACE_OID)
+ {
+ /*
+ * Ok, shared table, check relmapper.
+ */
+ relid = RelationMapFilenumberToOid(relfilenumber, true);
+ }
+ else
+ {
+ /*
+		 * Not a shared table, so it could be either a plain relation or a
+		 * non-shared, nailed one, e.g. pg_class.
+ */
+
+ /* check for plain relations by looking in pg_class */
+ relation = table_open(RelationRelationId, AccessShareLock);
+
+ /* copy scankey to local copy, it will be modified during the scan */
+ memcpy(skey, relfilenumber_skey, sizeof(skey));
+
+ /* set scan arguments */
+ skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
+ skey[1].sk_argument = ObjectIdGetDatum(relfilenumber);
+
+ scandesc = systable_beginscan(relation,
+ ClassTblspcRelfilenodeIndexId,
+ true,
+ NULL,
+ 2,
+ skey);
+
+ found = false;
+
+ while (HeapTupleIsValid(ntp = systable_getnext(scandesc)))
+ {
+ Form_pg_class classform = (Form_pg_class) GETSTRUCT(ntp);
+
+ if (found)
+ elog(ERROR,
+ "unexpected duplicate for tablespace %u, relfilenumber %u",
+ reltablespace, relfilenumber);
+ found = true;
+
+ Assert(classform->reltablespace == reltablespace);
+ Assert(classform->relfilenode == relfilenumber);
+ relid = classform->oid;
+ }
+
+ systable_endscan(scandesc);
+ table_close(relation, AccessShareLock);
+
+ /* check for tables that are mapped but not shared */
+ if (!found)
+ relid = RelationMapFilenumberToOid(relfilenumber, false);
+ }
+
+ /*
+ * Only enter entry into cache now, our opening of pg_class could have
+ * caused cache invalidations to be executed which would have deleted a
+ * new entry if we had entered it above.
+ */
+ entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_ENTER, &found);
+ if (found)
+ elog(ERROR, "corrupted hashtable");
+ entry->relid = relid;
+
+ return relid;
+}
diff --git a/src/backend/utils/cache/relmapper.c b/src/backend/utils/cache/relmapper.c
index 2a330cf..2dd236f 100644
--- a/src/backend/utils/cache/relmapper.c
+++ b/src/backend/utils/cache/relmapper.c
@@ -1,7 +1,7 @@
/*-------------------------------------------------------------------------
*
* relmapper.c
- * Catalog-to-filenode mapping
+ * Catalog-to-filenumber mapping
*
* For most tables, the physical file underlying the table is specified by
* pg_class.relfilenode. However, that obviously won't work for pg_class
@@ -11,7 +11,7 @@
* update other databases' pg_class entries when relocating a shared catalog.
* Therefore, for these special catalogs (henceforth referred to as "mapped
* catalogs") we rely on a separately maintained file that shows the mapping
- * from catalog OIDs to filenode numbers. Each database has a map file for
+ * from catalog OIDs to filenumbers. Each database has a map file for
* its local mapped catalogs, and there is a separate map file for shared
* catalogs. Mapped catalogs have zero in their pg_class.relfilenode entries.
*
@@ -78,8 +78,8 @@
typedef struct RelMapping
{
- Oid mapoid; /* OID of a catalog */
- Oid mapfilenode; /* its filenode number */
+ Oid mapoid; /* OID of a catalog */
+ RelFileNumber mapfilenumber; /* its rel file number */
} RelMapping;
typedef struct RelMapFile
@@ -116,7 +116,7 @@ static RelMapFile local_map;
* subtransactions, so one set of transaction-level changes is sufficient.
*
* The active_xxx variables contain updates that are valid in our transaction
- * and should be honored by RelationMapOidToFilenode. The pending_xxx
+ * and should be honored by RelationMapOidToFilenumber. The pending_xxx
* variables contain updates we have been told about that aren't active yet;
* they will become active at the next CommandCounterIncrement. This setup
* lets map updates act similarly to updates of pg_class rows, ie, they
@@ -132,8 +132,8 @@ static RelMapFile pending_local_updates;
/* non-export function prototypes */
-static void apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode,
- bool add_okay);
+static void apply_map_update(RelMapFile *map, Oid relationId,
+ RelFileNumber filenumber, bool add_okay);
static void merge_map_updates(RelMapFile *map, const RelMapFile *updates,
bool add_okay);
static void load_relmap_file(bool shared, bool lock_held);
@@ -146,9 +146,9 @@ static void perform_relmap_update(bool shared, const RelMapFile *updates);
/*
- * RelationMapOidToFilenode
+ * RelationMapOidToFilenumber
*
- * The raison d' etre ... given a relation OID, look up its filenode.
+ * The raison d' etre ... given a relation OID, look up its filenumber.
*
* Although shared and local relation OIDs should never overlap, the caller
* always knows which we need --- so pass that information to avoid useless
@@ -157,8 +157,8 @@ static void perform_relmap_update(bool shared, const RelMapFile *updates);
* Returns InvalidOid if the OID is not known (which should never happen,
* but the caller is in a better position to report a meaningful error).
*/
-Oid
-RelationMapOidToFilenode(Oid relationId, bool shared)
+RelFileNumber
+RelationMapOidToFilenumber(Oid relationId, bool shared)
{
const RelMapFile *map;
int32 i;
@@ -170,13 +170,13 @@ RelationMapOidToFilenode(Oid relationId, bool shared)
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return map->mappings[i].mapfilenumber;
}
map = &shared_map;
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return map->mappings[i].mapfilenumber;
}
}
else
@@ -185,33 +185,33 @@ RelationMapOidToFilenode(Oid relationId, bool shared)
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return map->mappings[i].mapfilenumber;
}
map = &local_map;
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return map->mappings[i].mapfilenumber;
}
}
- return InvalidOid;
+ return InvalidRelFileNumber;
}
/*
- * RelationMapFilenodeToOid
+ * RelationMapFilenumberToOid
*
* Do the reverse of the normal direction of mapping done in
- * RelationMapOidToFilenode.
+ * RelationMapOidToFilenumber.
*
* This is not supposed to be used during normal running but rather for
* information purposes when looking at the filesystem or xlog.
*
* Returns InvalidOid if the OID is not known; this can easily happen if the
- * relfilenode doesn't pertain to a mapped relation.
+ * relfilenumber doesn't pertain to a mapped relation.
*/
Oid
-RelationMapFilenodeToOid(Oid filenode, bool shared)
+RelationMapFilenumberToOid(RelFileNumber filenumber, bool shared)
{
const RelMapFile *map;
int32 i;
@@ -222,13 +222,13 @@ RelationMapFilenodeToOid(Oid filenode, bool shared)
map = &active_shared_updates;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenumber == map->mappings[i].mapfilenumber)
return map->mappings[i].mapoid;
}
map = &shared_map;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenumber == map->mappings[i].mapfilenumber)
return map->mappings[i].mapoid;
}
}
@@ -237,13 +237,13 @@ RelationMapFilenodeToOid(Oid filenode, bool shared)
map = &active_local_updates;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenumber == map->mappings[i].mapfilenumber)
return map->mappings[i].mapoid;
}
map = &local_map;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenumber == map->mappings[i].mapfilenumber)
return map->mappings[i].mapoid;
}
}
@@ -252,13 +252,13 @@ RelationMapFilenodeToOid(Oid filenode, bool shared)
}
/*
- * RelationMapOidToFilenodeForDatabase
+ * RelationMapOidToFilenumberForDatabase
*
- * Like RelationMapOidToFilenode, but reads the mapping from the indicated
+ * Like RelationMapOidToFilenumber, but reads the mapping from the indicated
* path instead of using the one for the current database.
*/
-Oid
-RelationMapOidToFilenodeForDatabase(char *dbpath, Oid relationId)
+RelFileNumber
+RelationMapOidToFilenumberForDatabase(char *dbpath, Oid relationId)
{
RelMapFile map;
int i;
@@ -270,10 +270,10 @@ RelationMapOidToFilenodeForDatabase(char *dbpath, Oid relationId)
for (i = 0; i < map.num_mappings; i++)
{
if (relationId == map.mappings[i].mapoid)
- return map.mappings[i].mapfilenode;
+ return map.mappings[i].mapfilenumber;
}
- return InvalidOid;
+ return InvalidRelFileNumber;
}
/*
@@ -311,13 +311,13 @@ RelationMapCopy(Oid dbid, Oid tsid, char *srcdbpath, char *dstdbpath)
/*
* RelationMapUpdateMap
*
- * Install a new relfilenode mapping for the specified relation.
+ * Install a new relfilenumber mapping for the specified relation.
*
* If immediate is true (or we're bootstrapping), the mapping is activated
* immediately. Otherwise it is made pending until CommandCounterIncrement.
*/
void
-RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
+RelationMapUpdateMap(Oid relationId, RelFileNumber fileNumber, bool shared,
bool immediate)
{
RelMapFile *map;
@@ -362,7 +362,7 @@ RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
map = &pending_local_updates;
}
}
- apply_map_update(map, relationId, fileNode, true);
+ apply_map_update(map, relationId, fileNumber, true);
}
/*
@@ -375,7 +375,8 @@ RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
* add_okay = false to draw an error if not.
*/
static void
-apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode, bool add_okay)
+apply_map_update(RelMapFile *map, Oid relationId, RelFileNumber fileNumber,
+ bool add_okay)
{
int32 i;
@@ -384,7 +385,7 @@ apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode, bool add_okay)
{
if (relationId == map->mappings[i].mapoid)
{
- map->mappings[i].mapfilenode = fileNode;
+ map->mappings[i].mapfilenumber = fileNumber;
return;
}
}
@@ -396,7 +397,7 @@ apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode, bool add_okay)
if (map->num_mappings >= MAX_MAPPINGS)
elog(ERROR, "ran out of space in relation map");
map->mappings[map->num_mappings].mapoid = relationId;
- map->mappings[map->num_mappings].mapfilenode = fileNode;
+ map->mappings[map->num_mappings].mapfilenumber = fileNumber;
map->num_mappings++;
}
@@ -415,7 +416,7 @@ merge_map_updates(RelMapFile *map, const RelMapFile *updates, bool add_okay)
{
apply_map_update(map,
updates->mappings[i].mapoid,
- updates->mappings[i].mapfilenode,
+ updates->mappings[i].mapfilenumber,
add_okay);
}
}
@@ -983,12 +984,12 @@ write_relmap_file(RelMapFile *newmap, bool write_wal, bool send_sinval,
for (i = 0; i < newmap->num_mappings; i++)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
- rnode.spcNode = tsid;
- rnode.dbNode = dbid;
- rnode.relNode = newmap->mappings[i].mapfilenode;
- RelationPreserveStorage(rnode, false);
+ rlocator.spcOid = tsid;
+ rlocator.dbOid = dbid;
+ rlocator.relNumber = newmap->mappings[i].mapfilenumber;
+ RelationPreserveStorage(rlocator, false);
}
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 7cc9c72..30b2f85 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4805,16 +4805,16 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
bool is_index)
{
PQExpBuffer upgrade_query = createPQExpBuffer();
- PGresult *upgrade_res;
- Oid relfilenode;
- Oid toast_oid;
- Oid toast_relfilenode;
- char relkind;
- Oid toast_index_oid;
- Oid toast_index_relfilenode;
+ PGresult *upgrade_res;
+ RelFileNumber relfilenumber;
+ Oid toast_oid;
+ RelFileNumber toast_relfilenumber;
+ char relkind;
+ Oid toast_index_oid;
+ RelFileNumber toast_index_relfilenumber;
/*
- * Preserve the OID and relfilenode of the table, table's index, table's
+ * Preserve the OID and relfilenumber of the table, table's index, table's
* toast table and toast table's index if any.
*
* One complexity is that the current table definition might not require
@@ -4837,15 +4837,15 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
relkind = *PQgetvalue(upgrade_res, 0, PQfnumber(upgrade_res, "relkind"));
- relfilenode = atooid(PQgetvalue(upgrade_res, 0,
+ relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "relfilenode")));
toast_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "reltoastrelid")));
- toast_relfilenode = atooid(PQgetvalue(upgrade_res, 0,
+ toast_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "toast_relfilenode")));
toast_index_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "indexrelid")));
- toast_index_relfilenode = atooid(PQgetvalue(upgrade_res, 0,
+ toast_index_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "toast_index_relfilenode")));
appendPQExpBufferStr(upgrade_buffer,
@@ -4859,13 +4859,13 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
/*
* Not every relation has storage. Also, in a pre-v12 database,
- * partitioned tables have a relfilenode, which should not be
+ * partitioned tables have a relfilenumber, which should not be
* preserved when upgrading.
*/
- if (OidIsValid(relfilenode) && relkind != RELKIND_PARTITIONED_TABLE)
+ if (RelFileNumberIsValid(relfilenumber) && relkind != RELKIND_PARTITIONED_TABLE)
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
- relfilenode);
+ relfilenumber);
/*
* In a pre-v12 database, partitioned tables might be marked as having
@@ -4879,7 +4879,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
toast_oid);
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('%u'::pg_catalog.oid);\n",
- toast_relfilenode);
+ toast_relfilenumber);
/* every toast table has an index */
appendPQExpBuffer(upgrade_buffer,
@@ -4887,20 +4887,20 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
toast_index_oid);
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
- toast_index_relfilenode);
+ toast_index_relfilenumber);
}
PQclear(upgrade_res);
}
else
{
- /* Preserve the OID and relfilenode of the index */
+ /* Preserve the OID and relfilenumber of the index */
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
pg_class_oid);
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
- relfilenode);
+ relfilenumber);
}
appendPQExpBufferChar(upgrade_buffer, '\n');
diff --git a/src/bin/pg_rewind/datapagemap.h b/src/bin/pg_rewind/datapagemap.h
index ae4965f..235b676 100644
--- a/src/bin/pg_rewind/datapagemap.h
+++ b/src/bin/pg_rewind/datapagemap.h
@@ -10,7 +10,7 @@
#define DATAPAGEMAP_H
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
struct datapagemap
{
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 6252931..269ed64 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -56,7 +56,7 @@ static uint32 hash_string_pointer(const char *s);
static filehash_hash *filehash;
static bool isRelDataFile(const char *path);
-static char *datasegpath(RelFileNode rnode, ForkNumber forknum,
+static char *datasegpath(RelFileLocator rlocator, ForkNumber forknum,
BlockNumber segno);
static file_entry_t *insert_filehash_entry(const char *path);
@@ -288,7 +288,7 @@ process_target_file(const char *path, file_type_t type, size_t size,
* hash table!
*/
void
-process_target_wal_block_change(ForkNumber forknum, RelFileNode rnode,
+process_target_wal_block_change(ForkNumber forknum, RelFileLocator rlocator,
BlockNumber blkno)
{
char *path;
@@ -299,7 +299,7 @@ process_target_wal_block_change(ForkNumber forknum, RelFileNode rnode,
segno = blkno / RELSEG_SIZE;
blkno_inseg = blkno % RELSEG_SIZE;
- path = datasegpath(rnode, forknum, segno);
+ path = datasegpath(rlocator, forknum, segno);
entry = lookup_filehash_entry(path);
pfree(path);
@@ -508,7 +508,7 @@ print_filemap(filemap_t *filemap)
static bool
isRelDataFile(const char *path)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
unsigned int segNo;
int nmatch;
bool matched;
@@ -532,32 +532,32 @@ isRelDataFile(const char *path)
*
*----
*/
- rnode.spcNode = InvalidOid;
- rnode.dbNode = InvalidOid;
- rnode.relNode = InvalidOid;
+ rlocator.spcOid = InvalidOid;
+ rlocator.dbOid = InvalidOid;
+ rlocator.relNumber = InvalidRelFileNumber;
segNo = 0;
matched = false;
- nmatch = sscanf(path, "global/%u.%u", &rnode.relNode, &segNo);
+ nmatch = sscanf(path, "global/%u.%u", &rlocator.relNumber, &segNo);
if (nmatch == 1 || nmatch == 2)
{
- rnode.spcNode = GLOBALTABLESPACE_OID;
- rnode.dbNode = 0;
+ rlocator.spcOid = GLOBALTABLESPACE_OID;
+ rlocator.dbOid = 0;
matched = true;
}
else
{
nmatch = sscanf(path, "base/%u/%u.%u",
- &rnode.dbNode, &rnode.relNode, &segNo);
+ &rlocator.dbOid, &rlocator.relNumber, &segNo);
if (nmatch == 2 || nmatch == 3)
{
- rnode.spcNode = DEFAULTTABLESPACE_OID;
+ rlocator.spcOid = DEFAULTTABLESPACE_OID;
matched = true;
}
else
{
nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/%u.%u",
- &rnode.spcNode, &rnode.dbNode, &rnode.relNode,
+ &rlocator.spcOid, &rlocator.dbOid, &rlocator.relNumber,
&segNo);
if (nmatch == 3 || nmatch == 4)
matched = true;
@@ -567,12 +567,12 @@ isRelDataFile(const char *path)
/*
* The sscanf tests above can match files that have extra characters at
* the end. To eliminate such cases, cross-check that GetRelationPath
- * creates the exact same filename, when passed the RelFileNode
+ * creates the exact same filename, when passed the RelFileLocator
* information we extracted from the filename.
*/
if (matched)
{
- char *check_path = datasegpath(rnode, MAIN_FORKNUM, segNo);
+ char *check_path = datasegpath(rlocator, MAIN_FORKNUM, segNo);
if (strcmp(check_path, path) != 0)
matched = false;
@@ -589,12 +589,12 @@ isRelDataFile(const char *path)
* The returned path is palloc'd
*/
static char *
-datasegpath(RelFileNode rnode, ForkNumber forknum, BlockNumber segno)
+datasegpath(RelFileLocator rlocator, ForkNumber forknum, BlockNumber segno)
{
char *path;
char *segpath;
- path = relpathperm(rnode, forknum);
+ path = relpathperm(rlocator, forknum);
if (segno > 0)
{
segpath = psprintf("%s.%u", path, segno);
diff --git a/src/bin/pg_rewind/filemap.h b/src/bin/pg_rewind/filemap.h
index 096f57a..0e011fb 100644
--- a/src/bin/pg_rewind/filemap.h
+++ b/src/bin/pg_rewind/filemap.h
@@ -10,7 +10,7 @@
#include "datapagemap.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/* these enum values are sorted in the order we want actions to be processed */
typedef enum
@@ -103,7 +103,7 @@ extern void process_source_file(const char *path, file_type_t type,
extern void process_target_file(const char *path, file_type_t type,
size_t size, const char *link_target);
extern void process_target_wal_block_change(ForkNumber forknum,
- RelFileNode rnode,
+ RelFileLocator rlocator,
BlockNumber blkno);
extern filemap_t *decide_file_actions(void);
diff --git a/src/bin/pg_rewind/parsexlog.c b/src/bin/pg_rewind/parsexlog.c
index c6792da..d97240e 100644
--- a/src/bin/pg_rewind/parsexlog.c
+++ b/src/bin/pg_rewind/parsexlog.c
@@ -445,18 +445,18 @@ extractPageInfo(XLogReaderState *record)
for (block_id = 0; block_id <= XLogRecMaxBlockId(record); block_id++)
{
- RelFileNode rnode;
- ForkNumber forknum;
- BlockNumber blkno;
+ RelFileLocator rlocator;
+ ForkNumber forknum;
+ BlockNumber blkno;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blkno, NULL))
+ &rlocator, &forknum, &blkno, NULL))
continue;
/* We only care about the main fork; others are copied in toto */
if (forknum != MAIN_FORKNUM)
continue;
- process_target_wal_block_change(forknum, rnode, blkno);
+ process_target_wal_block_change(forknum, rlocator, blkno);
}
}
diff --git a/src/bin/pg_rewind/pg_rewind.h b/src/bin/pg_rewind/pg_rewind.h
index 393182f..8b4b50a 100644
--- a/src/bin/pg_rewind/pg_rewind.h
+++ b/src/bin/pg_rewind/pg_rewind.h
@@ -16,7 +16,7 @@
#include "datapagemap.h"
#include "libpq-fe.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/* Configuration options */
extern char *datadir_target;
diff --git a/src/bin/pg_upgrade/Makefile b/src/bin/pg_upgrade/Makefile
index 587793e..7f8042f 100644
--- a/src/bin/pg_upgrade/Makefile
+++ b/src/bin/pg_upgrade/Makefile
@@ -19,7 +19,7 @@ OBJS = \
option.o \
parallel.o \
pg_upgrade.o \
- relfilenode.o \
+ relfilenumber.o \
server.o \
tablespace.o \
util.o \
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 36b0670..5d30b87 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -190,9 +190,9 @@ create_rel_filename_map(const char *old_data, const char *new_data,
map->new_tablespace_suffix = new_cluster.tablespace_suffix;
}
- /* DB oid and relfilenodes are preserved between old and new cluster */
+ /* DB oid and relfilenumbers are preserved between old and new cluster */
map->db_oid = old_db->db_oid;
- map->relfilenode = old_rel->relfilenode;
+ map->relfilenumber = old_rel->relfilenumber;
/* used only for logging and error reporting, old/new are identical */
map->nspname = old_rel->nspname;
@@ -399,7 +399,7 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
i_reloid,
i_indtable,
i_toastheap,
- i_relfilenode,
+ i_relfilenumber,
i_reltablespace;
char query[QUERY_ALLOC];
char *last_namespace = NULL,
@@ -495,7 +495,7 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
i_toastheap = PQfnumber(res, "toastheap");
i_nspname = PQfnumber(res, "nspname");
i_relname = PQfnumber(res, "relname");
- i_relfilenode = PQfnumber(res, "relfilenode");
+ i_relfilenumber = PQfnumber(res, "relfilenode");
i_reltablespace = PQfnumber(res, "reltablespace");
i_spclocation = PQfnumber(res, "spclocation");
@@ -527,7 +527,7 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
relname = PQgetvalue(res, relnum, i_relname);
curr->relname = pg_strdup(relname);
- curr->relfilenode = atooid(PQgetvalue(res, relnum, i_relfilenode));
+ curr->relfilenumber = atooid(PQgetvalue(res, relnum, i_relfilenumber));
curr->tblsp_alloc = false;
/* Is the tablespace oid non-default? */
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 55de244..30c3ee6 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -132,15 +132,15 @@ extern char *output_files[];
typedef struct
{
/* Can't use NAMEDATALEN; not guaranteed to be same on client */
- char *nspname; /* namespace name */
- char *relname; /* relation name */
- Oid reloid; /* relation OID */
- Oid relfilenode; /* relation file node */
- Oid indtable; /* if index, OID of its table, else 0 */
- Oid toastheap; /* if toast table, OID of base table, else 0 */
- char *tablespace; /* tablespace path; "" for cluster default */
- bool nsp_alloc; /* should nspname be freed? */
- bool tblsp_alloc; /* should tablespace be freed? */
+ char *nspname; /* namespace name */
+ char *relname; /* relation name */
+ Oid reloid; /* relation OID */
+ RelFileNumber relfilenumber; /* relation file number */
+ Oid indtable; /* if index, OID of its table, else 0 */
+ Oid toastheap; /* if toast table, OID of base table, else 0 */
+ char *tablespace; /* tablespace path; "" for cluster default */
+ bool nsp_alloc; /* should nspname be freed? */
+ bool tblsp_alloc; /* should tablespace be freed? */
} RelInfo;
typedef struct
@@ -159,7 +159,7 @@ typedef struct
const char *old_tablespace_suffix;
const char *new_tablespace_suffix;
Oid db_oid;
- Oid relfilenode;
+ RelFileNumber relfilenumber;
/* the rest are used only for logging and error reporting */
char *nspname; /* namespaces */
char *relname;
@@ -400,7 +400,7 @@ void parseCommandLine(int argc, char *argv[]);
void adjust_data_dir(ClusterInfo *cluster);
void get_sock_dir(ClusterInfo *cluster, bool live_check);
-/* relfilenode.c */
+/* relfilenumber.c */
void transfer_all_new_tablespaces(DbInfoArr *old_db_arr,
DbInfoArr *new_db_arr, char *old_pgdata, char *new_pgdata);
diff --git a/src/bin/pg_upgrade/relfilenode.c b/src/bin/pg_upgrade/relfilenode.c
deleted file mode 100644
index d23ac88..0000000
--- a/src/bin/pg_upgrade/relfilenode.c
+++ /dev/null
@@ -1,259 +0,0 @@
-/*
- * relfilenode.c
- *
- * relfilenode functions
- *
- * Copyright (c) 2010-2022, PostgreSQL Global Development Group
- * src/bin/pg_upgrade/relfilenode.c
- */
-
-#include "postgres_fe.h"
-
-#include <sys/stat.h>
-
-#include "access/transam.h"
-#include "catalog/pg_class_d.h"
-#include "pg_upgrade.h"
-
-static void transfer_single_new_db(FileNameMap *maps, int size, char *old_tablespace);
-static void transfer_relfile(FileNameMap *map, const char *suffix, bool vm_must_add_frozenbit);
-
-
-/*
- * transfer_all_new_tablespaces()
- *
- * Responsible for upgrading all database. invokes routines to generate mappings and then
- * physically link the databases.
- */
-void
-transfer_all_new_tablespaces(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
- char *old_pgdata, char *new_pgdata)
-{
- switch (user_opts.transfer_mode)
- {
- case TRANSFER_MODE_CLONE:
- prep_status_progress("Cloning user relation files");
- break;
- case TRANSFER_MODE_COPY:
- prep_status_progress("Copying user relation files");
- break;
- case TRANSFER_MODE_LINK:
- prep_status_progress("Linking user relation files");
- break;
- }
-
- /*
- * Transferring files by tablespace is tricky because a single database
- * can use multiple tablespaces. For non-parallel mode, we just pass a
- * NULL tablespace path, which matches all tablespaces. In parallel mode,
- * we pass the default tablespace and all user-created tablespaces and let
- * those operations happen in parallel.
- */
- if (user_opts.jobs <= 1)
- parallel_transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata,
- new_pgdata, NULL);
- else
- {
- int tblnum;
-
- /* transfer default tablespace */
- parallel_transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata,
- new_pgdata, old_pgdata);
-
- for (tblnum = 0; tblnum < os_info.num_old_tablespaces; tblnum++)
- parallel_transfer_all_new_dbs(old_db_arr,
- new_db_arr,
- old_pgdata,
- new_pgdata,
- os_info.old_tablespaces[tblnum]);
- /* reap all children */
- while (reap_child(true) == true)
- ;
- }
-
- end_progress_output();
- check_ok();
-}
-
-
-/*
- * transfer_all_new_dbs()
- *
- * Responsible for upgrading all database. invokes routines to generate mappings and then
- * physically link the databases.
- */
-void
-transfer_all_new_dbs(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
- char *old_pgdata, char *new_pgdata, char *old_tablespace)
-{
- int old_dbnum,
- new_dbnum;
-
- /* Scan the old cluster databases and transfer their files */
- for (old_dbnum = new_dbnum = 0;
- old_dbnum < old_db_arr->ndbs;
- old_dbnum++, new_dbnum++)
- {
- DbInfo *old_db = &old_db_arr->dbs[old_dbnum],
- *new_db = NULL;
- FileNameMap *mappings;
- int n_maps;
-
- /*
- * Advance past any databases that exist in the new cluster but not in
- * the old, e.g. "postgres". (The user might have removed the
- * 'postgres' database from the old cluster.)
- */
- for (; new_dbnum < new_db_arr->ndbs; new_dbnum++)
- {
- new_db = &new_db_arr->dbs[new_dbnum];
- if (strcmp(old_db->db_name, new_db->db_name) == 0)
- break;
- }
-
- if (new_dbnum >= new_db_arr->ndbs)
- pg_fatal("old database \"%s\" not found in the new cluster\n",
- old_db->db_name);
-
- mappings = gen_db_file_maps(old_db, new_db, &n_maps, old_pgdata,
- new_pgdata);
- if (n_maps)
- {
- transfer_single_new_db(mappings, n_maps, old_tablespace);
- }
- /* We allocate something even for n_maps == 0 */
- pg_free(mappings);
- }
-}
-
-/*
- * transfer_single_new_db()
- *
- * create links for mappings stored in "maps" array.
- */
-static void
-transfer_single_new_db(FileNameMap *maps, int size, char *old_tablespace)
-{
- int mapnum;
- bool vm_must_add_frozenbit = false;
-
- /*
- * Do we need to rewrite visibilitymap?
- */
- if (old_cluster.controldata.cat_ver < VISIBILITY_MAP_FROZEN_BIT_CAT_VER &&
- new_cluster.controldata.cat_ver >= VISIBILITY_MAP_FROZEN_BIT_CAT_VER)
- vm_must_add_frozenbit = true;
-
- for (mapnum = 0; mapnum < size; mapnum++)
- {
- if (old_tablespace == NULL ||
- strcmp(maps[mapnum].old_tablespace, old_tablespace) == 0)
- {
- /* transfer primary file */
- transfer_relfile(&maps[mapnum], "", vm_must_add_frozenbit);
-
- /*
- * Copy/link any fsm and vm files, if they exist
- */
- transfer_relfile(&maps[mapnum], "_fsm", vm_must_add_frozenbit);
- transfer_relfile(&maps[mapnum], "_vm", vm_must_add_frozenbit);
- }
- }
-}
-
-
-/*
- * transfer_relfile()
- *
- * Copy or link file from old cluster to new one. If vm_must_add_frozenbit
- * is true, visibility map forks are converted and rewritten, even in link
- * mode.
- */
-static void
-transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_frozenbit)
-{
- char old_file[MAXPGPATH];
- char new_file[MAXPGPATH];
- int segno;
- char extent_suffix[65];
- struct stat statbuf;
-
- /*
- * Now copy/link any related segments as well. Remember, PG breaks large
- * files into 1GB segments, the first segment has no extension, subsequent
- * segments are named relfilenode.1, relfilenode.2, relfilenode.3.
- */
- for (segno = 0;; segno++)
- {
- if (segno == 0)
- extent_suffix[0] = '\0';
- else
- snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
-
- snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
- map->old_tablespace,
- map->old_tablespace_suffix,
- map->db_oid,
- map->relfilenode,
- type_suffix,
- extent_suffix);
- snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
- map->new_tablespace,
- map->new_tablespace_suffix,
- map->db_oid,
- map->relfilenode,
- type_suffix,
- extent_suffix);
-
- /* Is it an extent, fsm, or vm file? */
- if (type_suffix[0] != '\0' || segno != 0)
- {
- /* Did file open fail? */
- if (stat(old_file, &statbuf) != 0)
- {
- /* File does not exist? That's OK, just return */
- if (errno == ENOENT)
- return;
- else
- pg_fatal("error while checking for file existence \"%s.%s\" (\"%s\" to \"%s\"): %s\n",
- map->nspname, map->relname, old_file, new_file,
- strerror(errno));
- }
-
- /* If file is empty, just return */
- if (statbuf.st_size == 0)
- return;
- }
-
- unlink(new_file);
-
- /* Copying files might take some time, so give feedback. */
- pg_log(PG_STATUS, "%s", old_file);
-
- if (vm_must_add_frozenbit && strcmp(type_suffix, "_vm") == 0)
- {
- /* Need to rewrite visibility map format */
- pg_log(PG_VERBOSE, "rewriting \"%s\" to \"%s\"\n",
- old_file, new_file);
- rewriteVisibilityMap(old_file, new_file, map->nspname, map->relname);
- }
- else
- switch (user_opts.transfer_mode)
- {
- case TRANSFER_MODE_CLONE:
- pg_log(PG_VERBOSE, "cloning \"%s\" to \"%s\"\n",
- old_file, new_file);
- cloneFile(old_file, new_file, map->nspname, map->relname);
- break;
- case TRANSFER_MODE_COPY:
- pg_log(PG_VERBOSE, "copying \"%s\" to \"%s\"\n",
- old_file, new_file);
- copyFile(old_file, new_file, map->nspname, map->relname);
- break;
- case TRANSFER_MODE_LINK:
- pg_log(PG_VERBOSE, "linking \"%s\" to \"%s\"\n",
- old_file, new_file);
- linkFile(old_file, new_file, map->nspname, map->relname);
- }
- }
-}
diff --git a/src/bin/pg_upgrade/relfilenumber.c b/src/bin/pg_upgrade/relfilenumber.c
new file mode 100644
index 0000000..b3ad820
--- /dev/null
+++ b/src/bin/pg_upgrade/relfilenumber.c
@@ -0,0 +1,259 @@
+/*
+ * relfilenumber.c
+ *
+ * relfilenumber functions
+ *
+ * Copyright (c) 2010-2022, PostgreSQL Global Development Group
+ * src/bin/pg_upgrade/relfilenumber.c
+ */
+
+#include "postgres_fe.h"
+
+#include <sys/stat.h>
+
+#include "access/transam.h"
+#include "catalog/pg_class_d.h"
+#include "pg_upgrade.h"
+
+static void transfer_single_new_db(FileNameMap *maps, int size, char *old_tablespace);
+static void transfer_relfile(FileNameMap *map, const char *suffix, bool vm_must_add_frozenbit);
+
+
+/*
+ * transfer_all_new_tablespaces()
+ *
+ * Responsible for upgrading all databases: invokes routines to generate
+ * mappings and then physically links the databases.
+ */
+void
+transfer_all_new_tablespaces(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
+ char *old_pgdata, char *new_pgdata)
+{
+ switch (user_opts.transfer_mode)
+ {
+ case TRANSFER_MODE_CLONE:
+ prep_status_progress("Cloning user relation files");
+ break;
+ case TRANSFER_MODE_COPY:
+ prep_status_progress("Copying user relation files");
+ break;
+ case TRANSFER_MODE_LINK:
+ prep_status_progress("Linking user relation files");
+ break;
+ }
+
+ /*
+ * Transferring files by tablespace is tricky because a single database
+ * can use multiple tablespaces. For non-parallel mode, we just pass a
+ * NULL tablespace path, which matches all tablespaces. In parallel mode,
+ * we pass the default tablespace and all user-created tablespaces and let
+ * those operations happen in parallel.
+ */
+ if (user_opts.jobs <= 1)
+ parallel_transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata,
+ new_pgdata, NULL);
+ else
+ {
+ int tblnum;
+
+ /* transfer default tablespace */
+ parallel_transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata,
+ new_pgdata, old_pgdata);
+
+ for (tblnum = 0; tblnum < os_info.num_old_tablespaces; tblnum++)
+ parallel_transfer_all_new_dbs(old_db_arr,
+ new_db_arr,
+ old_pgdata,
+ new_pgdata,
+ os_info.old_tablespaces[tblnum]);
+ /* reap all children */
+ while (reap_child(true) == true)
+ ;
+ }
+
+ end_progress_output();
+ check_ok();
+}
+
+
+/*
+ * transfer_all_new_dbs()
+ *
+ * Responsible for upgrading all databases: invokes routines to generate
+ * mappings and then physically links the databases.
+ */
+void
+transfer_all_new_dbs(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
+ char *old_pgdata, char *new_pgdata, char *old_tablespace)
+{
+ int old_dbnum,
+ new_dbnum;
+
+ /* Scan the old cluster databases and transfer their files */
+ for (old_dbnum = new_dbnum = 0;
+ old_dbnum < old_db_arr->ndbs;
+ old_dbnum++, new_dbnum++)
+ {
+ DbInfo *old_db = &old_db_arr->dbs[old_dbnum],
+ *new_db = NULL;
+ FileNameMap *mappings;
+ int n_maps;
+
+ /*
+ * Advance past any databases that exist in the new cluster but not in
+ * the old, e.g. "postgres". (The user might have removed the
+ * 'postgres' database from the old cluster.)
+ */
+ for (; new_dbnum < new_db_arr->ndbs; new_dbnum++)
+ {
+ new_db = &new_db_arr->dbs[new_dbnum];
+ if (strcmp(old_db->db_name, new_db->db_name) == 0)
+ break;
+ }
+
+ if (new_dbnum >= new_db_arr->ndbs)
+ pg_fatal("old database \"%s\" not found in the new cluster\n",
+ old_db->db_name);
+
+ mappings = gen_db_file_maps(old_db, new_db, &n_maps, old_pgdata,
+ new_pgdata);
+ if (n_maps)
+ {
+ transfer_single_new_db(mappings, n_maps, old_tablespace);
+ }
+ /* We allocate something even for n_maps == 0 */
+ pg_free(mappings);
+ }
+}
+
+/*
+ * transfer_single_new_db()
+ *
+ * Create links for the mappings stored in the "maps" array.
+ */
+static void
+transfer_single_new_db(FileNameMap *maps, int size, char *old_tablespace)
+{
+ int mapnum;
+ bool vm_must_add_frozenbit = false;
+
+ /*
+ * Do we need to rewrite the visibility map?
+ */
+ if (old_cluster.controldata.cat_ver < VISIBILITY_MAP_FROZEN_BIT_CAT_VER &&
+ new_cluster.controldata.cat_ver >= VISIBILITY_MAP_FROZEN_BIT_CAT_VER)
+ vm_must_add_frozenbit = true;
+
+ for (mapnum = 0; mapnum < size; mapnum++)
+ {
+ if (old_tablespace == NULL ||
+ strcmp(maps[mapnum].old_tablespace, old_tablespace) == 0)
+ {
+ /* transfer primary file */
+ transfer_relfile(&maps[mapnum], "", vm_must_add_frozenbit);
+
+ /*
+ * Copy/link any fsm and vm files, if they exist
+ */
+ transfer_relfile(&maps[mapnum], "_fsm", vm_must_add_frozenbit);
+ transfer_relfile(&maps[mapnum], "_vm", vm_must_add_frozenbit);
+ }
+ }
+}
+
+
+/*
+ * transfer_relfile()
+ *
+ * Copy or link file from old cluster to new one. If vm_must_add_frozenbit
+ * is true, visibility map forks are converted and rewritten, even in link
+ * mode.
+ */
+static void
+transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_frozenbit)
+{
+ char old_file[MAXPGPATH];
+ char new_file[MAXPGPATH];
+ int segno;
+ char extent_suffix[65];
+ struct stat statbuf;
+
+ /*
+ * Now copy/link any related segments as well. Remember, PG breaks large
+ * files into 1GB segments, the first segment has no extension, subsequent
+ * segments are named relfilenumber.1, relfilenumber.2, relfilenumber.3.
+ */
+ for (segno = 0;; segno++)
+ {
+ if (segno == 0)
+ extent_suffix[0] = '\0';
+ else
+ snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
+
+ snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
+ map->old_tablespace,
+ map->old_tablespace_suffix,
+ map->db_oid,
+ map->relfilenumber,
+ type_suffix,
+ extent_suffix);
+ snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
+ map->new_tablespace,
+ map->new_tablespace_suffix,
+ map->db_oid,
+ map->relfilenumber,
+ type_suffix,
+ extent_suffix);
+
+ /* Is it an extent, fsm, or vm file? */
+ if (type_suffix[0] != '\0' || segno != 0)
+ {
+ /* Did the stat() call fail? */
+ if (stat(old_file, &statbuf) != 0)
+ {
+ /* File does not exist? That's OK, just return */
+ if (errno == ENOENT)
+ return;
+ else
+ pg_fatal("error while checking for file existence \"%s.%s\" (\"%s\" to \"%s\"): %s\n",
+ map->nspname, map->relname, old_file, new_file,
+ strerror(errno));
+ }
+
+ /* If file is empty, just return */
+ if (statbuf.st_size == 0)
+ return;
+ }
+
+ unlink(new_file);
+
+ /* Copying files might take some time, so give feedback. */
+ pg_log(PG_STATUS, "%s", old_file);
+
+ if (vm_must_add_frozenbit && strcmp(type_suffix, "_vm") == 0)
+ {
+ /* Need to rewrite visibility map format */
+ pg_log(PG_VERBOSE, "rewriting \"%s\" to \"%s\"\n",
+ old_file, new_file);
+ rewriteVisibilityMap(old_file, new_file, map->nspname, map->relname);
+ }
+ else
+ switch (user_opts.transfer_mode)
+ {
+ case TRANSFER_MODE_CLONE:
+ pg_log(PG_VERBOSE, "cloning \"%s\" to \"%s\"\n",
+ old_file, new_file);
+ cloneFile(old_file, new_file, map->nspname, map->relname);
+ break;
+ case TRANSFER_MODE_COPY:
+ pg_log(PG_VERBOSE, "copying \"%s\" to \"%s\"\n",
+ old_file, new_file);
+ copyFile(old_file, new_file, map->nspname, map->relname);
+ break;
+ case TRANSFER_MODE_LINK:
+ pg_log(PG_VERBOSE, "linking \"%s\" to \"%s\"\n",
+ old_file, new_file);
+ linkFile(old_file, new_file, map->nspname, map->relname);
+ }
+ }
+}
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 5dc6010..0fdde9d 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -37,7 +37,7 @@ static const char *progname;
static int WalSegSz;
static volatile sig_atomic_t time_to_stop = false;
-static const RelFileNode emptyRelFileNode = {0, 0, 0};
+static const RelFileLocator emptyRelFileLocator = {0, 0, 0};
typedef struct XLogDumpPrivate
{
@@ -63,7 +63,7 @@ typedef struct XLogDumpConfig
bool filter_by_rmgr_enabled;
TransactionId filter_by_xid;
bool filter_by_xid_enabled;
- RelFileNode filter_by_relation;
+ RelFileLocator filter_by_relation;
bool filter_by_extended;
bool filter_by_relation_enabled;
BlockNumber filter_by_relation_block;
@@ -393,7 +393,7 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
*/
static bool
XLogRecordMatchesRelationBlock(XLogReaderState *record,
- RelFileNode matchRnode,
+ RelFileLocator matchRlocator,
BlockNumber matchBlock,
ForkNumber matchFork)
{
@@ -401,17 +401,17 @@ XLogRecordMatchesRelationBlock(XLogReaderState *record,
for (block_id = 0; block_id <= XLogRecMaxBlockId(record); block_id++)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blk;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blk, NULL))
+ &rlocator, &forknum, &blk, NULL))
continue;
if ((matchFork == InvalidForkNumber || matchFork == forknum) &&
- (RelFileNodeEquals(matchRnode, emptyRelFileNode) ||
- RelFileNodeEquals(matchRnode, rnode)) &&
+ (RelFileLocatorEquals(matchRlocator, emptyRelFileLocator) ||
+ RelFileLocatorEquals(matchRlocator, rlocator)) &&
(matchBlock == InvalidBlockNumber || matchBlock == blk))
return true;
}
@@ -885,11 +885,11 @@ main(int argc, char **argv)
break;
case 'R':
if (sscanf(optarg, "%u/%u/%u",
- &config.filter_by_relation.spcNode,
- &config.filter_by_relation.dbNode,
- &config.filter_by_relation.relNode) != 3 ||
- !OidIsValid(config.filter_by_relation.spcNode) ||
- !OidIsValid(config.filter_by_relation.relNode))
+ &config.filter_by_relation.spcOid,
+ &config.filter_by_relation.dbOid,
+ &config.filter_by_relation.relNumber) != 3 ||
+ !OidIsValid(config.filter_by_relation.spcOid) ||
+ !OidIsValid(config.filter_by_relation.relNumber))
{
pg_log_error("invalid relation specification: \"%s\"", optarg);
pg_log_error_detail("Expecting \"tablespace OID/database OID/relation filenode\".");
@@ -1132,7 +1132,7 @@ main(int argc, char **argv)
!XLogRecordMatchesRelationBlock(xlogreader_state,
config.filter_by_relation_enabled ?
config.filter_by_relation :
- emptyRelFileNode,
+ emptyRelFileLocator,
config.filter_by_relation_block_enabled ?
config.filter_by_relation_block :
InvalidBlockNumber,
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 636c96e..1b6b620 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -107,24 +107,24 @@ forkname_chars(const char *str, ForkNumber *fork)
* XXX this must agree with GetRelationPath()!
*/
char *
-GetDatabasePath(Oid dbNode, Oid spcNode)
+GetDatabasePath(Oid dbOid, Oid spcOid)
{
- if (spcNode == GLOBALTABLESPACE_OID)
+ if (spcOid == GLOBALTABLESPACE_OID)
{
/* Shared system relations live in {datadir}/global */
- Assert(dbNode == 0);
+ Assert(dbOid == 0);
return pstrdup("global");
}
- else if (spcNode == DEFAULTTABLESPACE_OID)
+ else if (spcOid == DEFAULTTABLESPACE_OID)
{
/* The default tablespace is {datadir}/base */
- return psprintf("base/%u", dbNode);
+ return psprintf("base/%u", dbOid);
}
else
{
/* All other tablespaces are accessed via symlinks */
return psprintf("pg_tblspc/%u/%s/%u",
- spcNode, TABLESPACE_VERSION_DIRECTORY, dbNode);
+ spcOid, TABLESPACE_VERSION_DIRECTORY, dbOid);
}
}
@@ -138,44 +138,44 @@ GetDatabasePath(Oid dbNode, Oid spcNode)
* the trouble considering BackendId is just int anyway.
*/
char *
-GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
+GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
int backendId, ForkNumber forkNumber)
{
char *path;
- if (spcNode == GLOBALTABLESPACE_OID)
+ if (spcOid == GLOBALTABLESPACE_OID)
{
/* Shared system relations live in {datadir}/global */
- Assert(dbNode == 0);
+ Assert(dbOid == 0);
Assert(backendId == InvalidBackendId);
if (forkNumber != MAIN_FORKNUM)
path = psprintf("global/%u_%s",
- relNode, forkNames[forkNumber]);
+ relNumber, forkNames[forkNumber]);
else
- path = psprintf("global/%u", relNode);
+ path = psprintf("global/%u", relNumber);
}
- else if (spcNode == DEFAULTTABLESPACE_OID)
+ else if (spcOid == DEFAULTTABLESPACE_OID)
{
/* The default tablespace is {datadir}/base */
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/%u_%s",
- dbNode, relNode,
+ dbOid, relNumber,
forkNames[forkNumber]);
else
path = psprintf("base/%u/%u",
- dbNode, relNode);
+ dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/t%d_%u_%s",
- dbNode, backendId, relNode,
+ dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
path = psprintf("base/%u/t%d_%u",
- dbNode, backendId, relNode);
+ dbOid, backendId, relNumber);
}
}
else
@@ -185,25 +185,25 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
{
if (forkNumber != MAIN_FORKNUM)
path = psprintf("pg_tblspc/%u/%s/%u/%u_%s",
- spcNode, TABLESPACE_VERSION_DIRECTORY,
- dbNode, relNode,
+ spcOid, TABLESPACE_VERSION_DIRECTORY,
+ dbOid, relNumber,
forkNames[forkNumber]);
else
path = psprintf("pg_tblspc/%u/%s/%u/%u",
- spcNode, TABLESPACE_VERSION_DIRECTORY,
- dbNode, relNode);
+ spcOid, TABLESPACE_VERSION_DIRECTORY,
+ dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u_%s",
- spcNode, TABLESPACE_VERSION_DIRECTORY,
- dbNode, backendId, relNode,
+ spcOid, TABLESPACE_VERSION_DIRECTORY,
+ dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u",
- spcNode, TABLESPACE_VERSION_DIRECTORY,
- dbNode, backendId, relNode);
+ spcOid, TABLESPACE_VERSION_DIRECTORY,
+ dbOid, backendId, relNumber);
}
}
return path;
diff --git a/src/include/access/brin_xlog.h b/src/include/access/brin_xlog.h
index 95bfc7e..012a9af 100644
--- a/src/include/access/brin_xlog.h
+++ b/src/include/access/brin_xlog.h
@@ -18,7 +18,7 @@
#include "lib/stringinfo.h"
#include "storage/bufpage.h"
#include "storage/itemptr.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
diff --git a/src/include/access/ginxlog.h b/src/include/access/ginxlog.h
index 21de389..7f98503 100644
--- a/src/include/access/ginxlog.h
+++ b/src/include/access/ginxlog.h
@@ -110,7 +110,7 @@ typedef struct
typedef struct ginxlogSplit
{
- RelFileNode node;
+ RelFileLocator locator;
BlockNumber rrlink; /* right link, or root's blocknumber if root
* split */
BlockNumber leftChildBlkno; /* valid on a non-leaf split */
@@ -167,7 +167,7 @@ typedef struct ginxlogDeletePage
*/
typedef struct ginxlogUpdateMeta
{
- RelFileNode node;
+ RelFileLocator locator;
GinMetaPageData metadata;
BlockNumber prevTail;
BlockNumber newRightlink;
diff --git a/src/include/access/gistxlog.h b/src/include/access/gistxlog.h
index 4537e67..9bbe4c2 100644
--- a/src/include/access/gistxlog.h
+++ b/src/include/access/gistxlog.h
@@ -97,7 +97,7 @@ typedef struct gistxlogPageDelete
*/
typedef struct gistxlogPageReuse
{
- RelFileNode node;
+ RelFileLocator locator;
BlockNumber block;
FullTransactionId latestRemovedFullXid;
} gistxlogPageReuse;
diff --git a/src/include/access/heapam_xlog.h b/src/include/access/heapam_xlog.h
index 2d8a7f6..1705e73 100644
--- a/src/include/access/heapam_xlog.h
+++ b/src/include/access/heapam_xlog.h
@@ -19,7 +19,7 @@
#include "lib/stringinfo.h"
#include "storage/buf.h"
#include "storage/bufpage.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
@@ -370,9 +370,9 @@ typedef struct xl_heap_new_cid
CommandId combocid; /* just for debugging */
/*
- * Store the relfilenode/ctid pair to facilitate lookups.
+ * Store the relfilelocator/ctid pair to facilitate lookups.
*/
- RelFileNode target_node;
+ RelFileLocator target_locator;
ItemPointerData target_tid;
} xl_heap_new_cid;
@@ -415,7 +415,7 @@ extern bool heap_prepare_freeze_tuple(HeapTupleHeader tuple,
MultiXactId *relminmxid_out);
extern void heap_execute_freeze_tuple(HeapTupleHeader tuple,
xl_heap_freeze_tuple *xlrec_tp);
-extern XLogRecPtr log_heap_visible(RelFileNode rnode, Buffer heap_buffer,
+extern XLogRecPtr log_heap_visible(RelFileLocator rlocator, Buffer heap_buffer,
Buffer vm_buffer, TransactionId cutoff_xid, uint8 flags);
#endif /* HEAPAM_XLOG_H */
diff --git a/src/include/access/nbtxlog.h b/src/include/access/nbtxlog.h
index de362d3..d79489e 100644
--- a/src/include/access/nbtxlog.h
+++ b/src/include/access/nbtxlog.h
@@ -180,12 +180,12 @@ typedef struct xl_btree_dedup
* This is what we need to know about page reuse within btree. This record
* only exists to generate a conflict point for Hot Standby.
*
- * Note that we must include a RelFileNode in the record because we don't
+ * Note that we must include a RelFileLocator in the record because we don't
* actually register the buffer with the record.
*/
typedef struct xl_btree_reuse_page
{
- RelFileNode node;
+ RelFileLocator locator;
BlockNumber block;
FullTransactionId latestRemovedFullXid;
} xl_btree_reuse_page;
diff --git a/src/include/access/rewriteheap.h b/src/include/access/rewriteheap.h
index 3e27790..353cbb2 100644
--- a/src/include/access/rewriteheap.h
+++ b/src/include/access/rewriteheap.h
@@ -15,7 +15,7 @@
#include "access/htup.h"
#include "storage/itemptr.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
/* struct definition is private to rewriteheap.c */
@@ -34,8 +34,8 @@ extern bool rewrite_heap_dead_tuple(RewriteState state, HeapTuple oldTuple);
*/
typedef struct LogicalRewriteMappingData
{
- RelFileNode old_node;
- RelFileNode new_node;
+ RelFileLocator old_locator;
+ RelFileLocator new_locator;
ItemPointerData old_tid;
ItemPointerData new_tid;
} LogicalRewriteMappingData;
diff --git a/src/include/access/tableam.h b/src/include/access/tableam.h
index fe869c6..83a8e7e 100644
--- a/src/include/access/tableam.h
+++ b/src/include/access/tableam.h
@@ -560,32 +560,32 @@ typedef struct TableAmRoutine
*/
/*
- * This callback needs to create a new relation filenode for `rel`, with
+ * This callback needs to create a new relation filelocator for `rel`, with
* appropriate durability behaviour for `persistence`.
*
* Note that only the subset of the relcache filled by
* RelationBuildLocalRelation() can be relied upon and that the relation's
* catalog entries will either not yet exist (new relation), or will still
- * reference the old relfilenode.
+ * reference the old relfilelocator.
*
* As output *freezeXid, *minmulti must be set to the values appropriate
* for pg_class.{relfrozenxid, relminmxid}. For AMs that don't need those
* fields to be filled they can be set to InvalidTransactionId and
* InvalidMultiXactId, respectively.
*
- * See also table_relation_set_new_filenode().
+ * See also table_relation_set_new_filelocator().
*/
- void (*relation_set_new_filenode) (Relation rel,
- const RelFileNode *newrnode,
- char persistence,
- TransactionId *freezeXid,
- MultiXactId *minmulti);
+ void (*relation_set_new_filelocator) (Relation rel,
+ const RelFileLocator *newrlocator,
+ char persistence,
+ TransactionId *freezeXid,
+ MultiXactId *minmulti);
/*
* This callback needs to remove all contents from `rel`'s current
- * relfilenode. No provisions for transactional behaviour need to be made.
- * Often this can be implemented by truncating the underlying storage to
- * its minimal size.
+ * relfilelocator. No provisions for transactional behaviour need to be
+ * made. Often this can be implemented by truncating the underlying
+ * storage to its minimal size.
*
* See also table_relation_nontransactional_truncate().
*/
@@ -598,7 +598,7 @@ typedef struct TableAmRoutine
* storage, unless it contains references to the tablespace internally.
*/
void (*relation_copy_data) (Relation rel,
- const RelFileNode *newrnode);
+ const RelFileLocator *newrlocator);
/* See table_relation_copy_for_cluster() */
void (*relation_copy_for_cluster) (Relation NewTable,
@@ -1348,7 +1348,7 @@ table_index_delete_tuples(Relation rel, TM_IndexDeleteOp *delstate)
* RelationGetBufferForTuple. See that method for more information.
*
* TABLE_INSERT_FROZEN should only be specified for inserts into
- * relfilenodes created during the current subtransaction and when
+ * relfilenumbers created during the current subtransaction and when
* there are no prior snapshots or pre-existing portals open.
* This causes rows to be frozen, which is an MVCC violation and
* requires explicit options chosen by user.
@@ -1577,33 +1577,34 @@ table_finish_bulk_insert(Relation rel, int options)
*/
/*
- * Create storage for `rel` in `newrnode`, with persistence set to
+ * Create storage for `rel` in `newrlocator`, with persistence set to
* `persistence`.
*
* This is used both during relation creation and various DDL operations to
- * create a new relfilenode that can be filled from scratch. When creating
- * new storage for an existing relfilenode, this should be called before the
+ * create a new relfilelocator that can be filled from scratch. When creating
+ * new storage for an existing relfilelocator, this should be called before the
* relcache entry has been updated.
*
* *freezeXid, *minmulti are set to the xid / multixact horizon for the table
* that pg_class.{relfrozenxid, relminmxid} have to be set to.
*/
static inline void
-table_relation_set_new_filenode(Relation rel,
- const RelFileNode *newrnode,
- char persistence,
- TransactionId *freezeXid,
- MultiXactId *minmulti)
+table_relation_set_new_filelocator(Relation rel,
+ const RelFileLocator *newrlocator,
+ char persistence,
+ TransactionId *freezeXid,
+ MultiXactId *minmulti)
{
- rel->rd_tableam->relation_set_new_filenode(rel, newrnode, persistence,
- freezeXid, minmulti);
+ rel->rd_tableam->relation_set_new_filelocator(rel, newrlocator,
+ persistence, freezeXid,
+ minmulti);
}
/*
* Remove all table contents from `rel`, in a non-transactional manner.
* Non-transactional meaning that there's no need to support rollbacks. This
- * commonly only is used to perform truncations for relfilenodes created in the
- * current transaction.
+ * commonly only is used to perform truncations for relfilelocators created in
+ * the current transaction.
*/
static inline void
table_relation_nontransactional_truncate(Relation rel)
@@ -1612,15 +1613,15 @@ table_relation_nontransactional_truncate(Relation rel)
}
/*
- * Copy data from `rel` into the new relfilenode `newrnode`. The new
- * relfilenode may not have storage associated before this function is
+ * Copy data from `rel` into the new relfilelocator `newrlocator`. The new
+ * relfilelocator may not have storage associated before this function is
* called. This is only supposed to be used for low level operations like
* changing a relation's tablespace.
*/
static inline void
-table_relation_copy_data(Relation rel, const RelFileNode *newrnode)
+table_relation_copy_data(Relation rel, const RelFileLocator *newrlocator)
{
- rel->rd_tableam->relation_copy_data(rel, newrnode);
+ rel->rd_tableam->relation_copy_data(rel, newrlocator);
}
/*
diff --git a/src/include/access/xact.h b/src/include/access/xact.h
index 4794941..7d2b352 100644
--- a/src/include/access/xact.h
+++ b/src/include/access/xact.h
@@ -19,7 +19,7 @@
#include "datatype/timestamp.h"
#include "lib/stringinfo.h"
#include "nodes/pg_list.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/sinval.h"
/*
@@ -174,7 +174,7 @@ typedef struct SavedTransactionCharacteristics
*/
#define XACT_XINFO_HAS_DBINFO (1U << 0)
#define XACT_XINFO_HAS_SUBXACTS (1U << 1)
-#define XACT_XINFO_HAS_RELFILENODES (1U << 2)
+#define XACT_XINFO_HAS_RELFILELOCATORS (1U << 2)
#define XACT_XINFO_HAS_INVALS (1U << 3)
#define XACT_XINFO_HAS_TWOPHASE (1U << 4)
#define XACT_XINFO_HAS_ORIGIN (1U << 5)
@@ -252,12 +252,12 @@ typedef struct xl_xact_subxacts
} xl_xact_subxacts;
#define MinSizeOfXactSubxacts offsetof(xl_xact_subxacts, subxacts)
-typedef struct xl_xact_relfilenodes
+typedef struct xl_xact_relfilelocators
{
int nrels; /* number of relations */
- RelFileNode xnodes[FLEXIBLE_ARRAY_MEMBER];
-} xl_xact_relfilenodes;
-#define MinSizeOfXactRelfilenodes offsetof(xl_xact_relfilenodes, xnodes)
+ RelFileLocator xlocators[FLEXIBLE_ARRAY_MEMBER];
+} xl_xact_relfilelocators;
+#define MinSizeOfXactRelfileLocators offsetof(xl_xact_relfilelocators, xlocators)
/*
* A transactionally dropped statistics entry.
@@ -305,7 +305,7 @@ typedef struct xl_xact_commit
/* xl_xact_xinfo follows if XLOG_XACT_HAS_INFO */
/* xl_xact_dbinfo follows if XINFO_HAS_DBINFO */
/* xl_xact_subxacts follows if XINFO_HAS_SUBXACT */
- /* xl_xact_relfilenodes follows if XINFO_HAS_RELFILENODES */
+ /* xl_xact_relfilelocators follows if XINFO_HAS_RELFILELOCATORS */
/* xl_xact_stats_items follows if XINFO_HAS_DROPPED_STATS */
/* xl_xact_invals follows if XINFO_HAS_INVALS */
/* xl_xact_twophase follows if XINFO_HAS_TWOPHASE */
@@ -321,7 +321,7 @@ typedef struct xl_xact_abort
/* xl_xact_xinfo follows if XLOG_XACT_HAS_INFO */
/* xl_xact_dbinfo follows if XINFO_HAS_DBINFO */
/* xl_xact_subxacts follows if XINFO_HAS_SUBXACT */
- /* xl_xact_relfilenodes follows if XINFO_HAS_RELFILENODES */
+ /* xl_xact_relfilelocators follows if XINFO_HAS_RELFILELOCATORS */
/* xl_xact_stats_items follows if XINFO_HAS_DROPPED_STATS */
/* No invalidation messages needed. */
/* xl_xact_twophase follows if XINFO_HAS_TWOPHASE */
@@ -367,7 +367,7 @@ typedef struct xl_xact_parsed_commit
TransactionId *subxacts;
int nrels;
- RelFileNode *xnodes;
+ RelFileLocator *xlocators;
int nstats;
xl_xact_stats_item *stats;
@@ -378,7 +378,7 @@ typedef struct xl_xact_parsed_commit
TransactionId twophase_xid; /* only for 2PC */
char twophase_gid[GIDSIZE]; /* only for 2PC */
int nabortrels; /* only for 2PC */
- RelFileNode *abortnodes; /* only for 2PC */
+ RelFileLocator *abortlocators; /* only for 2PC */
int nabortstats; /* only for 2PC */
xl_xact_stats_item *abortstats; /* only for 2PC */
@@ -400,7 +400,7 @@ typedef struct xl_xact_parsed_abort
TransactionId *subxacts;
int nrels;
- RelFileNode *xnodes;
+ RelFileLocator *xlocators;
int nstats;
xl_xact_stats_item *stats;
@@ -483,7 +483,7 @@ extern int xactGetCommittedChildren(TransactionId **ptr);
extern XLogRecPtr XactLogCommitRecord(TimestampTz commit_time,
int nsubxacts, TransactionId *subxacts,
- int nrels, RelFileNode *rels,
+ int nrels, RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
int nmsgs, SharedInvalidationMessage *msgs,
@@ -494,7 +494,7 @@ extern XLogRecPtr XactLogCommitRecord(TimestampTz commit_time,
extern XLogRecPtr XactLogAbortRecord(TimestampTz abort_time,
int nsubxacts, TransactionId *subxacts,
- int nrels, RelFileNode *rels,
+ int nrels, RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
int xactflags, TransactionId twophase_xid,
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index fae0bef..3524c39 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,7 +25,7 @@
#include "lib/stringinfo.h"
#include "pgtime.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
diff --git a/src/include/access/xloginsert.h b/src/include/access/xloginsert.h
index 5fc340c..c04f77b 100644
--- a/src/include/access/xloginsert.h
+++ b/src/include/access/xloginsert.h
@@ -15,7 +15,7 @@
#include "access/xlogdefs.h"
#include "storage/block.h"
#include "storage/buf.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
/*
@@ -45,16 +45,16 @@ extern XLogRecPtr XLogInsert(RmgrId rmid, uint8 info);
extern void XLogEnsureRecordSpace(int max_block_id, int ndatas);
extern void XLogRegisterData(char *data, int len);
extern void XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags);
-extern void XLogRegisterBlock(uint8 block_id, RelFileNode *rnode,
+extern void XLogRegisterBlock(uint8 block_id, RelFileLocator *rlocator,
ForkNumber forknum, BlockNumber blknum, char *page,
uint8 flags);
extern void XLogRegisterBufData(uint8 block_id, char *data, int len);
extern void XLogResetInsertion(void);
extern bool XLogCheckBufferNeedsBackup(Buffer buffer);
-extern XLogRecPtr log_newpage(RelFileNode *rnode, ForkNumber forkNum,
+extern XLogRecPtr log_newpage(RelFileLocator *rlocator, ForkNumber forkNum,
BlockNumber blk, char *page, bool page_std);
-extern void log_newpages(RelFileNode *rnode, ForkNumber forkNum, int num_pages,
+extern void log_newpages(RelFileLocator *rlocator, ForkNumber forkNum, int num_pages,
BlockNumber *blknos, char **pages, bool page_std);
extern XLogRecPtr log_newpage_buffer(Buffer buffer, bool page_std);
extern void log_newpage_range(Relation rel, ForkNumber forkNum,
diff --git a/src/include/access/xlogreader.h b/src/include/access/xlogreader.h
index e73ea4a..5395f15 100644
--- a/src/include/access/xlogreader.h
+++ b/src/include/access/xlogreader.h
@@ -122,7 +122,7 @@ typedef struct
bool in_use;
/* Identify the block this refers to */
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blkno;
@@ -430,10 +430,10 @@ extern FullTransactionId XLogRecGetFullXid(XLogReaderState *record);
extern bool RestoreBlockImage(XLogReaderState *record, uint8 block_id, char *page);
extern char *XLogRecGetBlockData(XLogReaderState *record, uint8 block_id, Size *len);
extern void XLogRecGetBlockTag(XLogReaderState *record, uint8 block_id,
- RelFileNode *rnode, ForkNumber *forknum,
+ RelFileLocator *rlocator, ForkNumber *forknum,
BlockNumber *blknum);
extern bool XLogRecGetBlockTagExtended(XLogReaderState *record, uint8 block_id,
- RelFileNode *rnode, ForkNumber *forknum,
+ RelFileLocator *rlocator, ForkNumber *forknum,
BlockNumber *blknum,
Buffer *prefetch_buffer);
diff --git a/src/include/access/xlogrecord.h b/src/include/access/xlogrecord.h
index 052ac68..7e467ef 100644
--- a/src/include/access/xlogrecord.h
+++ b/src/include/access/xlogrecord.h
@@ -15,7 +15,7 @@
#include "access/xlogdefs.h"
#include "port/pg_crc32c.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* The overall layout of an XLOG record is:
@@ -97,7 +97,7 @@ typedef struct XLogRecordBlockHeader
* image) */
/* If BKPBLOCK_HAS_IMAGE, an XLogRecordBlockImageHeader struct follows */
- /* If BKPBLOCK_SAME_REL is not set, a RelFileNode follows */
+ /* If BKPBLOCK_SAME_REL is not set, a RelFileLocator follows */
/* BlockNumber follows */
} XLogRecordBlockHeader;
@@ -175,7 +175,7 @@ typedef struct XLogRecordBlockCompressHeader
(SizeOfXLogRecordBlockHeader + \
SizeOfXLogRecordBlockImageHeader + \
SizeOfXLogRecordBlockCompressHeader + \
- sizeof(RelFileNode) + \
+ sizeof(RelFileLocator) + \
sizeof(BlockNumber))
/*
@@ -187,7 +187,7 @@ typedef struct XLogRecordBlockCompressHeader
#define BKPBLOCK_HAS_IMAGE 0x10 /* block data is an XLogRecordBlockImage */
#define BKPBLOCK_HAS_DATA 0x20
#define BKPBLOCK_WILL_INIT 0x40 /* redo will re-init the page */
-#define BKPBLOCK_SAME_REL 0x80 /* RelFileNode omitted, same as previous */
+#define BKPBLOCK_SAME_REL 0x80 /* RelFileLocator omitted, same as previous */
/*
* XLogRecordDataHeaderShort/Long are used for the "main data" portion of
diff --git a/src/include/access/xlogutils.h b/src/include/access/xlogutils.h
index c9d0b75..ef18297 100644
--- a/src/include/access/xlogutils.h
+++ b/src/include/access/xlogutils.h
@@ -60,9 +60,9 @@ extern PGDLLIMPORT HotStandbyState standbyState;
extern bool XLogHaveInvalidPages(void);
extern void XLogCheckInvalidPages(void);
-extern void XLogDropRelation(RelFileNode rnode, ForkNumber forknum);
+extern void XLogDropRelation(RelFileLocator rlocator, ForkNumber forknum);
extern void XLogDropDatabase(Oid dbid);
-extern void XLogTruncateRelation(RelFileNode rnode, ForkNumber forkNum,
+extern void XLogTruncateRelation(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber nblocks);
/* Result codes for XLogReadBufferForRedo[Extended] */
@@ -89,11 +89,11 @@ extern XLogRedoAction XLogReadBufferForRedoExtended(XLogReaderState *record,
ReadBufferMode mode, bool get_cleanup_lock,
Buffer *buf);
-extern Buffer XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
+extern Buffer XLogReadBufferExtended(RelFileLocator rlocator, ForkNumber forknum,
BlockNumber blkno, ReadBufferMode mode,
Buffer recent_buffer);
-extern Relation CreateFakeRelcacheEntry(RelFileNode rnode);
+extern Relation CreateFakeRelcacheEntry(RelFileLocator rlocator);
extern void FreeFakeRelcacheEntry(Relation fakerel);
extern int read_local_xlog_page(XLogReaderState *state,
diff --git a/src/include/catalog/binary_upgrade.h b/src/include/catalog/binary_upgrade.h
index 0b6944b..fd93442 100644
--- a/src/include/catalog/binary_upgrade.h
+++ b/src/include/catalog/binary_upgrade.h
@@ -22,11 +22,11 @@ extern PGDLLIMPORT Oid binary_upgrade_next_mrng_pg_type_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_mrng_array_pg_type_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_heap_pg_class_oid;
-extern PGDLLIMPORT Oid binary_upgrade_next_heap_pg_class_relfilenode;
+extern PGDLLIMPORT RelFileNumber binary_upgrade_next_heap_pg_class_relfilenumber;
extern PGDLLIMPORT Oid binary_upgrade_next_index_pg_class_oid;
-extern PGDLLIMPORT Oid binary_upgrade_next_index_pg_class_relfilenode;
+extern PGDLLIMPORT RelFileNumber binary_upgrade_next_index_pg_class_relfilenumber;
extern PGDLLIMPORT Oid binary_upgrade_next_toast_pg_class_oid;
-extern PGDLLIMPORT Oid binary_upgrade_next_toast_pg_class_relfilenode;
+extern PGDLLIMPORT RelFileNumber binary_upgrade_next_toast_pg_class_relfilenumber;
extern PGDLLIMPORT Oid binary_upgrade_next_pg_enum_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_pg_authid_oid;
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index 60c1215..66900f1 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -38,7 +38,8 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
-extern Oid GetNewRelFileNode(Oid reltablespace, Relation pg_class,
- char relpersistence);
+extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
+ Relation pg_class,
+ char relpersistence);
#endif /* CATALOG_H */
diff --git a/src/include/catalog/heap.h b/src/include/catalog/heap.h
index 07c5b88..5774c46 100644
--- a/src/include/catalog/heap.h
+++ b/src/include/catalog/heap.h
@@ -50,7 +50,7 @@ extern Relation heap_create(const char *relname,
Oid relnamespace,
Oid reltablespace,
Oid relid,
- Oid relfilenode,
+ RelFileNumber relfilenumber,
Oid accessmtd,
TupleDesc tupDesc,
char relkind,
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index a1d6e3b..1bdb00a 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -71,7 +71,7 @@ extern Oid index_create(Relation heapRelation,
Oid indexRelationId,
Oid parentIndexRelid,
Oid parentConstraintId,
- Oid relFileNode,
+ RelFileNumber relFileNumber,
IndexInfo *indexInfo,
List *indexColNames,
Oid accessMethodObjectId,
diff --git a/src/include/catalog/storage.h b/src/include/catalog/storage.h
index 59f3404..9964c31 100644
--- a/src/include/catalog/storage.h
+++ b/src/include/catalog/storage.h
@@ -15,23 +15,23 @@
#define STORAGE_H
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "utils/relcache.h"
/* GUC variables */
extern PGDLLIMPORT int wal_skip_threshold;
-extern SMgrRelation RelationCreateStorage(RelFileNode rnode,
+extern SMgrRelation RelationCreateStorage(RelFileLocator rlocator,
char relpersistence,
bool register_delete);
extern void RelationDropStorage(Relation rel);
-extern void RelationPreserveStorage(RelFileNode rnode, bool atCommit);
+extern void RelationPreserveStorage(RelFileLocator rlocator, bool atCommit);
extern void RelationPreTruncate(Relation rel);
extern void RelationTruncate(Relation rel, BlockNumber nblocks);
extern void RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
ForkNumber forkNum, char relpersistence);
-extern bool RelFileNodeSkippingWAL(RelFileNode rnode);
+extern bool RelFileLocatorSkippingWAL(RelFileLocator rlocator);
extern Size EstimatePendingSyncsSpace(void);
extern void SerializePendingSyncs(Size maxSize, char *startAddress);
extern void RestorePendingSyncs(char *startAddress);
@@ -42,7 +42,7 @@ extern void RestorePendingSyncs(char *startAddress);
*/
extern void smgrDoPendingDeletes(bool isCommit);
extern void smgrDoPendingSyncs(bool isCommit, bool isParallelWorker);
-extern int smgrGetPendingDeletes(bool forCommit, RelFileNode **ptr);
+extern int smgrGetPendingDeletes(bool forCommit, RelFileLocator **ptr);
extern void AtSubCommit_smgr(void);
extern void AtSubAbort_smgr(void);
extern void PostPrepare_smgr(void);
diff --git a/src/include/catalog/storage_xlog.h b/src/include/catalog/storage_xlog.h
index 622de22..44a5e20 100644
--- a/src/include/catalog/storage_xlog.h
+++ b/src/include/catalog/storage_xlog.h
@@ -17,7 +17,7 @@
#include "access/xlogreader.h"
#include "lib/stringinfo.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* Declarations for smgr-related XLOG records
@@ -32,7 +32,7 @@
typedef struct xl_smgr_create
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forkNum;
} xl_smgr_create;
@@ -46,11 +46,11 @@ typedef struct xl_smgr_create
typedef struct xl_smgr_truncate
{
BlockNumber blkno;
- RelFileNode rnode;
+ RelFileLocator rlocator;
int flags;
} xl_smgr_truncate;
-extern void log_smgrcreate(const RelFileNode *rnode, ForkNumber forkNum);
+extern void log_smgrcreate(const RelFileLocator *rlocator, ForkNumber forkNum);
extern void smgr_redo(XLogReaderState *record);
extern void smgr_desc(StringInfo buf, XLogReaderState *record);
diff --git a/src/include/commands/sequence.h b/src/include/commands/sequence.h
index 9da2300..d38c0e2 100644
--- a/src/include/commands/sequence.h
+++ b/src/include/commands/sequence.h
@@ -19,7 +19,7 @@
#include "lib/stringinfo.h"
#include "nodes/parsenodes.h"
#include "parser/parse_node.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
typedef struct FormData_pg_sequence_data
@@ -47,7 +47,7 @@ typedef FormData_pg_sequence_data *Form_pg_sequence_data;
typedef struct xl_seq_rec
{
- RelFileNode node;
+ RelFileLocator locator;
/* SEQUENCE TUPLE DATA FOLLOWS AT THE END */
} xl_seq_rec;
diff --git a/src/include/commands/tablecmds.h b/src/include/commands/tablecmds.h
index 5d4037f..0c48654 100644
--- a/src/include/commands/tablecmds.h
+++ b/src/include/commands/tablecmds.h
@@ -66,7 +66,7 @@ extern void SetRelationHasSubclass(Oid relationId, bool relhassubclass);
extern bool CheckRelationTableSpaceMove(Relation rel, Oid newTableSpaceId);
extern void SetRelationTableSpace(Relation rel, Oid newTableSpaceId,
- Oid newRelFileNode);
+ RelFileNumber newRelFileNumber);
extern ObjectAddress renameatt(RenameStmt *stmt);
diff --git a/src/include/commands/tablespace.h b/src/include/commands/tablespace.h
index 24b6473..1f80907 100644
--- a/src/include/commands/tablespace.h
+++ b/src/include/commands/tablespace.h
@@ -50,7 +50,7 @@ extern void DropTableSpace(DropTableSpaceStmt *stmt);
extern ObjectAddress RenameTableSpace(const char *oldname, const char *newname);
extern Oid AlterTableSpaceOptions(AlterTableSpaceOptionsStmt *stmt);
-extern void TablespaceCreateDbspace(Oid spcNode, Oid dbNode, bool isRedo);
+extern void TablespaceCreateDbspace(Oid spcOid, Oid dbOid, bool isRedo);
extern Oid GetDefaultTablespace(char relpersistence, bool partitioned);
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index 13849a3..3ab7132 100644
--- a/src/include/common/relpath.h
+++ b/src/include/common/relpath.h
@@ -64,27 +64,27 @@ extern int forkname_chars(const char *str, ForkNumber *fork);
/*
* Stuff for computing filesystem pathnames for relations.
*/
-extern char *GetDatabasePath(Oid dbNode, Oid spcNode);
+extern char *GetDatabasePath(Oid dbOid, Oid spcOid);
-extern char *GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
+extern char *GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
int backendId, ForkNumber forkNumber);
/*
* Wrapper macros for GetRelationPath. Beware of multiple
- * evaluation of the RelFileNode or RelFileNodeBackend argument!
+ * evaluation of the RelFileLocator or RelFileLocatorBackend argument!
*/
-/* First argument is a RelFileNode */
-#define relpathbackend(rnode, backend, forknum) \
- GetRelationPath((rnode).dbNode, (rnode).spcNode, (rnode).relNode, \
+/* First argument is a RelFileLocator */
+#define relpathbackend(rlocator, backend, forknum) \
+ GetRelationPath((rlocator).dbOid, (rlocator).spcOid, (rlocator).relNumber, \
backend, forknum)
-/* First argument is a RelFileNode */
-#define relpathperm(rnode, forknum) \
- relpathbackend(rnode, InvalidBackendId, forknum)
+/* First argument is a RelFileLocator */
+#define relpathperm(rlocator, forknum) \
+ relpathbackend(rlocator, InvalidBackendId, forknum)
-/* First argument is a RelFileNodeBackend */
-#define relpath(rnode, forknum) \
- relpathbackend((rnode).node, (rnode).backend, forknum)
+/* First argument is a RelFileLocatorBackend */
+#define relpath(rlocator, forknum) \
+ relpathbackend((rlocator).locator, (rlocator).backend, forknum)
#endif /* RELPATH_H */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 73f635b..562f21c 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3247,10 +3247,10 @@ typedef struct IndexStmt
List *excludeOpNames; /* exclusion operator names, or NIL if none */
char *idxcomment; /* comment to apply to index, or NULL */
Oid indexOid; /* OID of an existing index, if any */
- Oid oldNode; /* relfilenode of existing storage, if any */
- SubTransactionId oldCreateSubid; /* rd_createSubid of oldNode */
- SubTransactionId oldFirstRelfilenodeSubid; /* rd_firstRelfilenodeSubid of
- * oldNode */
+ RelFileNumber oldNumber; /* relfilenumber of existing storage, if any */
+ SubTransactionId oldCreateSubid; /* rd_createSubid of oldNumber */
+ SubTransactionId oldFirstRelfilenumberSubid; /* rd_firstRelfilelocatorSubid
+ * of oldNumber */
bool unique; /* is index unique? */
bool nulls_not_distinct; /* null treatment for UNIQUE constraints */
bool primary; /* is index a primary key? */
diff --git a/src/include/postgres_ext.h b/src/include/postgres_ext.h
index fdb61b7..d8af68b 100644
--- a/src/include/postgres_ext.h
+++ b/src/include/postgres_ext.h
@@ -46,6 +46,13 @@ typedef unsigned int Oid;
/* Define a signed 64-bit integer type for use in client API declarations. */
typedef PG_INT64_TYPE pg_int64;
+/*
+ * RelFileNumber data type identifies the specific relation file name.
+ */
+typedef Oid RelFileNumber;
+#define InvalidRelFileNumber ((RelFileNumber) InvalidOid)
+#define RelFileNumberIsValid(relnumber) \
+ ((bool) ((relnumber) != InvalidRelFileNumber))
/*
* Identifiers of error message fields. Kept here to keep common
diff --git a/src/include/postmaster/bgwriter.h b/src/include/postmaster/bgwriter.h
index 2511ef4..b67fb1e 100644
--- a/src/include/postmaster/bgwriter.h
+++ b/src/include/postmaster/bgwriter.h
@@ -16,7 +16,7 @@
#define _BGWRITER_H
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "storage/sync.h"
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 4a01f87..d109d0b 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -99,7 +99,7 @@ typedef struct ReorderBufferChange
struct
{
/* relation that has been changed */
- RelFileNode relnode;
+ RelFileLocator rlocator;
/* no previously reassembled toast chunks are necessary anymore */
bool clear_toast_afterwards;
@@ -145,7 +145,7 @@ typedef struct ReorderBufferChange
*/
struct
{
- RelFileNode node;
+ RelFileLocator locator;
ItemPointerData tid;
CommandId cmin;
CommandId cmax;
@@ -657,7 +657,7 @@ extern void ReorderBufferAddSnapshot(ReorderBuffer *, TransactionId, XLogRecPtr
extern void ReorderBufferAddNewCommandId(ReorderBuffer *, TransactionId, XLogRecPtr lsn,
CommandId cid);
extern void ReorderBufferAddNewTupleCids(ReorderBuffer *, TransactionId, XLogRecPtr lsn,
- RelFileNode node, ItemPointerData pt,
+ RelFileLocator locator, ItemPointerData pt,
CommandId cmin, CommandId cmax, CommandId combocid);
extern void ReorderBufferAddInvalidations(ReorderBuffer *, TransactionId, XLogRecPtr lsn,
Size nmsgs, SharedInvalidationMessage *msgs);
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index a17e7b2..d54e1f6 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,30 +90,30 @@
*/
typedef struct buftag
{
- RelFileNode rnode; /* physical relation identifier */
+ RelFileLocator rlocator; /* physical relation identifier */
ForkNumber forkNum;
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
#define CLEAR_BUFFERTAG(a) \
( \
- (a).rnode.spcNode = InvalidOid, \
- (a).rnode.dbNode = InvalidOid, \
- (a).rnode.relNode = InvalidOid, \
+ (a).rlocator.spcOid = InvalidOid, \
+ (a).rlocator.dbOid = InvalidOid, \
+ (a).rlocator.relNumber = InvalidRelFileNumber, \
(a).forkNum = InvalidForkNumber, \
(a).blockNum = InvalidBlockNumber \
)
-#define INIT_BUFFERTAG(a,xx_rnode,xx_forkNum,xx_blockNum) \
+#define INIT_BUFFERTAG(a,xx_rlocator,xx_forkNum,xx_blockNum) \
( \
- (a).rnode = (xx_rnode), \
+ (a).rlocator = (xx_rlocator), \
(a).forkNum = (xx_forkNum), \
(a).blockNum = (xx_blockNum) \
)
#define BUFFERTAGS_EQUAL(a,b) \
( \
- RelFileNodeEquals((a).rnode, (b).rnode) && \
+ RelFileLocatorEquals((a).rlocator, (b).rlocator) && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
)
@@ -291,11 +291,11 @@ extern PGDLLIMPORT BufferDesc *LocalBufferDescriptors;
*/
typedef struct CkptSortItem
{
- Oid tsId;
- Oid relNode;
- ForkNumber forkNum;
- BlockNumber blockNum;
- int buf_id;
+ Oid tsId;
+ RelFileNumber relNumber;
+ ForkNumber forkNum;
+ BlockNumber blockNum;
+ int buf_id;
} CkptSortItem;
extern PGDLLIMPORT CkptSortItem *CkptBufferIds;
@@ -337,9 +337,9 @@ extern PrefetchBufferResult PrefetchLocalBuffer(SMgrRelation smgr,
extern BufferDesc *LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum,
BlockNumber blockNum, bool *foundPtr);
extern void MarkLocalBufferDirty(Buffer buffer);
-extern void DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
+extern void DropRelFileLocatorLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber firstDelBlock);
-extern void DropRelFileNodeAllLocalBuffers(RelFileNode rnode);
+extern void DropRelFileLocatorAllLocalBuffers(RelFileLocator rlocator);
extern void AtEOXact_LocalBuffers(bool isCommit);
#endif /* BUFMGR_INTERNALS_H */
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 5839140..96e473e 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -17,7 +17,7 @@
#include "storage/block.h"
#include "storage/buf.h"
#include "storage/bufpage.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
#include "utils/snapmgr.h"
@@ -176,13 +176,13 @@ extern PrefetchBufferResult PrefetchSharedBuffer(struct SMgrRelationData *smgr_r
BlockNumber blockNum);
extern PrefetchBufferResult PrefetchBuffer(Relation reln, ForkNumber forkNum,
BlockNumber blockNum);
-extern bool ReadRecentBuffer(RelFileNode rnode, ForkNumber forkNum,
+extern bool ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber blockNum, Buffer recent_buffer);
extern Buffer ReadBuffer(Relation reln, BlockNumber blockNum);
extern Buffer ReadBufferExtended(Relation reln, ForkNumber forkNum,
BlockNumber blockNum, ReadBufferMode mode,
BufferAccessStrategy strategy);
-extern Buffer ReadBufferWithoutRelcache(RelFileNode rnode,
+extern Buffer ReadBufferWithoutRelcache(RelFileLocator rlocator,
ForkNumber forkNum, BlockNumber blockNum,
ReadBufferMode mode, BufferAccessStrategy strategy,
bool permanent);
@@ -204,13 +204,13 @@ extern BlockNumber RelationGetNumberOfBlocksInFork(Relation relation,
extern void FlushOneBuffer(Buffer buffer);
extern void FlushRelationBuffers(Relation rel);
extern void FlushRelationsAllBuffers(struct SMgrRelationData **smgrs, int nrels);
-extern void CreateAndCopyRelationData(RelFileNode src_rnode,
- RelFileNode dst_rnode,
+extern void CreateAndCopyRelationData(RelFileLocator src_rlocator,
+ RelFileLocator dst_rlocator,
bool permanent);
extern void FlushDatabaseBuffers(Oid dbid);
-extern void DropRelFileNodeBuffers(struct SMgrRelationData *smgr_reln, ForkNumber *forkNum,
+extern void DropRelFileLocatorBuffers(struct SMgrRelationData *smgr_reln, ForkNumber *forkNum,
int nforks, BlockNumber *firstDelBlock);
-extern void DropRelFileNodesAllBuffers(struct SMgrRelationData **smgr_reln, int nnodes);
+extern void DropRelFileLocatorsAllBuffers(struct SMgrRelationData **smgr_reln, int nlocators);
extern void DropDatabaseBuffers(Oid dbid);
#define RelationGetNumberOfBlocks(reln) \
@@ -223,7 +223,7 @@ extern XLogRecPtr BufferGetLSNAtomic(Buffer buffer);
extern void PrintPinnedBufs(void);
#endif
extern Size BufferShmemSize(void);
-extern void BufferGetTag(Buffer buffer, RelFileNode *rnode,
+extern void BufferGetTag(Buffer buffer, RelFileLocator *rlocator,
ForkNumber *forknum, BlockNumber *blknum);
extern void MarkBufferDirtyHint(Buffer buffer, bool buffer_std);
diff --git a/src/include/storage/freespace.h b/src/include/storage/freespace.h
index dcc40eb..fcb0802 100644
--- a/src/include/storage/freespace.h
+++ b/src/include/storage/freespace.h
@@ -15,7 +15,7 @@
#define FREESPACE_H_
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
/* prototypes for public functions in freespace.c */
@@ -27,7 +27,7 @@ extern BlockNumber RecordAndGetPageWithFreeSpace(Relation rel,
Size spaceNeeded);
extern void RecordPageWithFreeSpace(Relation rel, BlockNumber heapBlk,
Size spaceAvail);
-extern void XLogRecordPageWithFreeSpace(RelFileNode rnode, BlockNumber heapBlk,
+extern void XLogRecordPageWithFreeSpace(RelFileLocator rlocator, BlockNumber heapBlk,
Size spaceAvail);
extern BlockNumber FreeSpaceMapPrepareTruncateRel(Relation rel,
diff --git a/src/include/storage/md.h b/src/include/storage/md.h
index ffffa40..10aa1b0 100644
--- a/src/include/storage/md.h
+++ b/src/include/storage/md.h
@@ -15,7 +15,7 @@
#define MD_H
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "storage/sync.h"
@@ -25,7 +25,7 @@ extern void mdopen(SMgrRelation reln);
extern void mdclose(SMgrRelation reln, ForkNumber forknum);
extern void mdcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo);
extern bool mdexists(SMgrRelation reln, ForkNumber forknum);
-extern void mdunlink(RelFileNodeBackend rnode, ForkNumber forknum, bool isRedo);
+extern void mdunlink(RelFileLocatorBackend rlocator, ForkNumber forknum, bool isRedo);
extern void mdextend(SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum, char *buffer, bool skipFsync);
extern bool mdprefetch(SMgrRelation reln, ForkNumber forknum,
@@ -42,7 +42,7 @@ extern void mdtruncate(SMgrRelation reln, ForkNumber forknum,
extern void mdimmedsync(SMgrRelation reln, ForkNumber forknum);
extern void ForgetDatabaseSyncRequests(Oid dbid);
-extern void DropRelationFiles(RelFileNode *delrels, int ndelrels, bool isRedo);
+extern void DropRelationFiles(RelFileLocator *delrels, int ndelrels, bool isRedo);
/* md sync callbacks */
extern int mdsyncfiletag(const FileTag *ftag, char *path);
diff --git a/src/include/storage/relfilelocator.h b/src/include/storage/relfilelocator.h
new file mode 100644
index 0000000..7211fe7
--- /dev/null
+++ b/src/include/storage/relfilelocator.h
@@ -0,0 +1,99 @@
+/*-------------------------------------------------------------------------
+ *
+ * relfilelocator.h
+ * Physical access information for relations.
+ *
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/storage/relfilelocator.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef RELFILELOCATOR_H
+#define RELFILELOCATOR_H
+
+#include "common/relpath.h"
+#include "storage/backendid.h"
+
+/*
+ * RelFileLocator must provide all that we need to know to physically access
+ * a relation, with the exception of the backend ID, which can be provided
+ * separately. Note, however, that a "physical" relation is comprised of
+ * multiple files on the filesystem, as each fork is stored as a separate
+ * file, and each fork can be divided into multiple segments. See md.c.
+ *
+ * spcOid identifies the tablespace of the relation. It corresponds to
+ * pg_tablespace.oid.
+ *
+ * dbOid identifies the database of the relation. It is zero for
+ * "shared" relations (those common to all databases of a cluster).
+ * Nonzero dbOid values correspond to pg_database.oid.
+ *
+ * relNumber identifies the specific relation. relNumber corresponds to
+ * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
+ * to assign new physical files to relations in some situations).
+ * Notice that relNumber is only unique within a database in a particular
+ * tablespace.
+ *
+ * Note: spcOid must be GLOBALTABLESPACE_OID if and only if dbOid is
+ * zero. We support shared relations only in the "global" tablespace.
+ *
+ * Note: in pg_class we allow reltablespace == 0 to denote that the
+ * relation is stored in its database's "default" tablespace (as
+ * identified by pg_database.dattablespace). However this shorthand
+ * is NOT allowed in RelFileLocator structs --- the real tablespace ID
+ * must be supplied when setting spcOid.
+ *
+ * Note: in pg_class, relfilenode can be zero to denote that the relation
+ * is a "mapped" relation, whose current true filenode number is available
+ * from relmapper.c. Again, this case is NOT allowed in RelFileLocators.
+ *
+ * Note: various places use RelFileLocator in hashtable keys. Therefore,
+ * there *must not* be any unused padding bytes in this struct. That
+ * should be safe as long as all the fields are of type Oid.
+ */
+typedef struct RelFileLocator
+{
+ Oid spcOid; /* tablespace */
+ Oid dbOid; /* database */
+ RelFileNumber relNumber; /* relation */
+} RelFileLocator;
+
+/*
+ * Augmenting a relfilelocator with the backend ID provides all the information
+ * we need to locate the physical storage. The backend ID is InvalidBackendId
+ * for regular relations (those accessible to more than one backend), or the
+ * owning backend's ID for backend-local relations. Backend-local relations
+ * are always transient and removed in case of a database crash; they are
+ * never WAL-logged or fsync'd.
+ */
+typedef struct RelFileLocatorBackend
+{
+ RelFileLocator locator;
+ BackendId backend;
+} RelFileLocatorBackend;
+
+#define RelFileLocatorBackendIsTemp(rlocator) \
+ ((rlocator).backend != InvalidBackendId)
+
+/*
+ * Note: RelFileLocatorEquals and RelFileLocatorBackendEquals compare relNumber first
+ * since that is most likely to be different in two unequal RelFileLocators. It
+ * is probably redundant to compare spcOid if the other fields are found equal,
+ * but do it anyway to be sure. Likewise for checking the backend ID in
+ * RelFileLocatorBackendEquals.
+ */
+#define RelFileLocatorEquals(locator1, locator2) \
+ ((locator1).relNumber == (locator2).relNumber && \
+ (locator1).dbOid == (locator2).dbOid && \
+ (locator1).spcOid == (locator2).spcOid)
+
+#define RelFileLocatorBackendEquals(locator1, locator2) \
+ ((locator1).locator.relNumber == (locator2).locator.relNumber && \
+ (locator1).locator.dbOid == (locator2).locator.dbOid && \
+ (locator1).backend == (locator2).backend && \
+ (locator1).locator.spcOid == (locator2).locator.spcOid)
+
+#endif /* RELFILELOCATOR_H */
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
deleted file mode 100644
index 4fdc606..0000000
--- a/src/include/storage/relfilenode.h
+++ /dev/null
@@ -1,99 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * relfilenode.h
- * Physical access information for relations.
- *
- *
- * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * src/include/storage/relfilenode.h
- *
- *-------------------------------------------------------------------------
- */
-#ifndef RELFILENODE_H
-#define RELFILENODE_H
-
-#include "common/relpath.h"
-#include "storage/backendid.h"
-
-/*
- * RelFileNode must provide all that we need to know to physically access
- * a relation, with the exception of the backend ID, which can be provided
- * separately. Note, however, that a "physical" relation is comprised of
- * multiple files on the filesystem, as each fork is stored as a separate
- * file, and each fork can be divided into multiple segments. See md.c.
- *
- * spcNode identifies the tablespace of the relation. It corresponds to
- * pg_tablespace.oid.
- *
- * dbNode identifies the database of the relation. It is zero for
- * "shared" relations (those common to all databases of a cluster).
- * Nonzero dbNode values correspond to pg_database.oid.
- *
- * relNode identifies the specific relation. relNode corresponds to
- * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
- * to assign new physical files to relations in some situations).
- * Notice that relNode is only unique within a database in a particular
- * tablespace.
- *
- * Note: spcNode must be GLOBALTABLESPACE_OID if and only if dbNode is
- * zero. We support shared relations only in the "global" tablespace.
- *
- * Note: in pg_class we allow reltablespace == 0 to denote that the
- * relation is stored in its database's "default" tablespace (as
- * identified by pg_database.dattablespace). However this shorthand
- * is NOT allowed in RelFileNode structs --- the real tablespace ID
- * must be supplied when setting spcNode.
- *
- * Note: in pg_class, relfilenode can be zero to denote that the relation
- * is a "mapped" relation, whose current true filenode number is available
- * from relmapper.c. Again, this case is NOT allowed in RelFileNodes.
- *
- * Note: various places use RelFileNode in hashtable keys. Therefore,
- * there *must not* be any unused padding bytes in this struct. That
- * should be safe as long as all the fields are of type Oid.
- */
-typedef struct RelFileNode
-{
- Oid spcNode; /* tablespace */
- Oid dbNode; /* database */
- Oid relNode; /* relation */
-} RelFileNode;
-
-/*
- * Augmenting a relfilenode with the backend ID provides all the information
- * we need to locate the physical storage. The backend ID is InvalidBackendId
- * for regular relations (those accessible to more than one backend), or the
- * owning backend's ID for backend-local relations. Backend-local relations
- * are always transient and removed in case of a database crash; they are
- * never WAL-logged or fsync'd.
- */
-typedef struct RelFileNodeBackend
-{
- RelFileNode node;
- BackendId backend;
-} RelFileNodeBackend;
-
-#define RelFileNodeBackendIsTemp(rnode) \
- ((rnode).backend != InvalidBackendId)
-
-/*
- * Note: RelFileNodeEquals and RelFileNodeBackendEquals compare relNode first
- * since that is most likely to be different in two unequal RelFileNodes. It
- * is probably redundant to compare spcNode if the other fields are found equal,
- * but do it anyway to be sure. Likewise for checking the backend ID in
- * RelFileNodeBackendEquals.
- */
-#define RelFileNodeEquals(node1, node2) \
- ((node1).relNode == (node2).relNode && \
- (node1).dbNode == (node2).dbNode && \
- (node1).spcNode == (node2).spcNode)
-
-#define RelFileNodeBackendEquals(node1, node2) \
- ((node1).node.relNode == (node2).node.relNode && \
- (node1).node.dbNode == (node2).node.dbNode && \
- (node1).backend == (node2).backend && \
- (node1).node.spcNode == (node2).node.spcNode)
-
-#endif /* RELFILENODE_H */
diff --git a/src/include/storage/sinval.h b/src/include/storage/sinval.h
index e7cd456..56c6fc9 100644
--- a/src/include/storage/sinval.h
+++ b/src/include/storage/sinval.h
@@ -16,7 +16,7 @@
#include <signal.h>
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* We support several types of shared-invalidation messages:
@@ -90,7 +90,7 @@ typedef struct
int8 id; /* type field --- must be first */
int8 backend_hi; /* high bits of backend ID, if temprel */
uint16 backend_lo; /* low bits of backend ID, if temprel */
- RelFileNode rnode; /* spcNode, dbNode, relNode */
+ RelFileLocator rlocator; /* spcOid, dbOid, relNumber */
} SharedInvalSmgrMsg;
#define SHAREDINVALRELMAP_ID (-4)
diff --git a/src/include/storage/smgr.h b/src/include/storage/smgr.h
index 6b63c60..a077153 100644
--- a/src/include/storage/smgr.h
+++ b/src/include/storage/smgr.h
@@ -16,7 +16,7 @@
#include "lib/ilist.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* smgr.c maintains a table of SMgrRelation objects, which are essentially
@@ -38,8 +38,8 @@
*/
typedef struct SMgrRelationData
{
- /* rnode is the hashtable lookup key, so it must be first! */
- RelFileNodeBackend smgr_rnode; /* relation physical identifier */
+ /* rlocator is the hashtable lookup key, so it must be first! */
+ RelFileLocatorBackend smgr_rlocator; /* relation physical identifier */
/* pointer to owning pointer, or NULL if none */
struct SMgrRelationData **smgr_owner;
@@ -75,16 +75,16 @@ typedef struct SMgrRelationData
typedef SMgrRelationData *SMgrRelation;
#define SmgrIsTemp(smgr) \
- RelFileNodeBackendIsTemp((smgr)->smgr_rnode)
+ RelFileLocatorBackendIsTemp((smgr)->smgr_rlocator)
extern void smgrinit(void);
-extern SMgrRelation smgropen(RelFileNode rnode, BackendId backend);
+extern SMgrRelation smgropen(RelFileLocator rlocator, BackendId backend);
extern bool smgrexists(SMgrRelation reln, ForkNumber forknum);
extern void smgrsetowner(SMgrRelation *owner, SMgrRelation reln);
extern void smgrclearowner(SMgrRelation *owner, SMgrRelation reln);
extern void smgrclose(SMgrRelation reln);
extern void smgrcloseall(void);
-extern void smgrclosenode(RelFileNodeBackend rnode);
+extern void smgrcloserellocator(RelFileLocatorBackend rlocator);
extern void smgrrelease(SMgrRelation reln);
extern void smgrreleaseall(void);
extern void smgrcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo);
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index 6a77632..dacef92 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -17,7 +17,7 @@
#include "datatype/timestamp.h"
#include "storage/lock.h"
#include "storage/procsignal.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/standbydefs.h"
/* User-settable GUC parameters */
@@ -30,9 +30,9 @@ extern void InitRecoveryTransactionEnvironment(void);
extern void ShutdownRecoveryTransactionEnvironment(void);
extern void ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid,
- RelFileNode node);
+ RelFileLocator locator);
extern void ResolveRecoveryConflictWithSnapshotFullXid(FullTransactionId latestRemovedFullXid,
- RelFileNode node);
+ RelFileLocator locator);
extern void ResolveRecoveryConflictWithTablespace(Oid tsid);
extern void ResolveRecoveryConflictWithDatabase(Oid dbid);
diff --git a/src/include/storage/sync.h b/src/include/storage/sync.h
index 9737e1e..049af87 100644
--- a/src/include/storage/sync.h
+++ b/src/include/storage/sync.h
@@ -13,7 +13,7 @@
#ifndef SYNC_H
#define SYNC_H
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* Type of sync request. These are used to manage the set of pending
@@ -51,7 +51,7 @@ typedef struct FileTag
{
int16 handler; /* SyncRequestHandler value, saving space */
int16 forknum; /* ForkNumber, saving space */
- RelFileNode rnode;
+ RelFileLocator rlocator;
uint32 segno;
} FileTag;
diff --git a/src/include/utils/inval.h b/src/include/utils/inval.h
index 0e0323b..23748b7 100644
--- a/src/include/utils/inval.h
+++ b/src/include/utils/inval.h
@@ -15,7 +15,7 @@
#define INVAL_H
#include "access/htup.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
extern PGDLLIMPORT int debug_discard_caches;
@@ -48,7 +48,7 @@ extern void CacheInvalidateRelcacheByTuple(HeapTuple classTuple);
extern void CacheInvalidateRelcacheByRelid(Oid relid);
-extern void CacheInvalidateSmgr(RelFileNodeBackend rnode);
+extern void CacheInvalidateSmgr(RelFileLocatorBackend rlocator);
extern void CacheInvalidateRelmap(Oid databaseId);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 1896a9a..e5b6662 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -23,7 +23,7 @@
#include "partitioning/partdefs.h"
#include "rewrite/prs2lock.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "utils/relcache.h"
#include "utils/reltrigger.h"
@@ -53,7 +53,7 @@ typedef LockInfoData *LockInfo;
typedef struct RelationData
{
- RelFileNode rd_node; /* relation physical identifier */
+ RelFileLocator rd_locator; /* relation physical identifier */
SMgrRelation rd_smgr; /* cached file handle, or NULL */
int rd_refcnt; /* reference count */
BackendId rd_backend; /* owning backend id, if temporary relation */
@@ -66,44 +66,44 @@ typedef struct RelationData
/*----------
* rd_createSubid is the ID of the highest subtransaction the rel has
- * survived into or zero if the rel or its rd_node was created before the
- * current top transaction. (IndexStmt.oldNode leads to the case of a new
- * rel with an old rd_node.) rd_firstRelfilenodeSubid is the ID of the
- * highest subtransaction an rd_node change has survived into or zero if
- * rd_node matches the value it had at the start of the current top
+ * survived into or zero if the rel or its rd_locator was created before the
+ * current top transaction. (IndexStmt.oldNumber leads to the case of a new
+ * rel with an old rd_locator.) rd_firstRelfilelocatorSubid is the ID of the
+ * highest subtransaction an rd_locator change has survived into or zero if
+ * rd_locator matches the value it had at the start of the current top
* transaction. (Rolling back the subtransaction that
- * rd_firstRelfilenodeSubid denotes would restore rd_node to the value it
+ * rd_firstRelfilelocatorSubid denotes would restore rd_locator to the value it
* had at the start of the current top transaction. Rolling back any
* lower subtransaction would not.) Their accuracy is critical to
* RelationNeedsWAL().
*
- * rd_newRelfilenodeSubid is the ID of the highest subtransaction the
- * most-recent relfilenode change has survived into or zero if not changed
+ * rd_newRelfilelocatorSubid is the ID of the highest subtransaction the
+ * most-recent relfilenumber change has survived into or zero if not changed
* in the current transaction (or we have forgotten changing it). This
* field is accurate when non-zero, but it can be zero when a relation has
- * multiple new relfilenodes within a single transaction, with one of them
+ * multiple new relfilenumbers within a single transaction, with one of them
* occurring in a subsequently aborted subtransaction, e.g.
* BEGIN;
* TRUNCATE t;
* SAVEPOINT save;
* TRUNCATE t;
* ROLLBACK TO save;
- * -- rd_newRelfilenodeSubid is now forgotten
+ * -- rd_newRelfilelocatorSubid is now forgotten
*
* If every rd_*Subid field is zero, they are read-only outside
- * relcache.c. Files that trigger rd_node changes by updating
+ * relcache.c. Files that trigger rd_locator changes by updating
* pg_class.reltablespace and/or pg_class.relfilenode call
- * RelationAssumeNewRelfilenode() to update rd_*Subid.
+ * RelationAssumeNewRelfilelocator() to update rd_*Subid.
*
* rd_droppedSubid is the ID of the highest subtransaction that a drop of
* the rel has survived into. In entries visible outside relcache.c, this
* is always zero.
*/
SubTransactionId rd_createSubid; /* rel was created in current xact */
- SubTransactionId rd_newRelfilenodeSubid; /* highest subxact changing
- * rd_node to current value */
- SubTransactionId rd_firstRelfilenodeSubid; /* highest subxact changing
- * rd_node to any value */
+ SubTransactionId rd_newRelfilelocatorSubid; /* highest subxact changing
+ * rd_locator to current value */
+ SubTransactionId rd_firstRelfilelocatorSubid; /* highest subxact changing
+ * rd_locator to any value */
SubTransactionId rd_droppedSubid; /* dropped with another Subid set */
Form_pg_class rd_rel; /* RELATION tuple */
@@ -531,12 +531,12 @@ typedef struct ViewOptions
/*
* RelationIsMapped
- * True if the relation uses the relfilenode map. Note multiple eval
+ * True if the relation uses the relfilenumber map. Note multiple eval
* of argument!
*/
#define RelationIsMapped(relation) \
(RELKIND_HAS_STORAGE((relation)->rd_rel->relkind) && \
- ((relation)->rd_rel->relfilenode == InvalidOid))
+ ((relation)->rd_rel->relfilenode == InvalidRelFileNumber))
/*
* RelationGetSmgr
@@ -555,7 +555,7 @@ static inline SMgrRelation
RelationGetSmgr(Relation rel)
{
if (unlikely(rel->rd_smgr == NULL))
- smgrsetowner(&(rel->rd_smgr), smgropen(rel->rd_node, rel->rd_backend));
+ smgrsetowner(&(rel->rd_smgr), smgropen(rel->rd_locator, rel->rd_backend));
return rel->rd_smgr;
}
@@ -607,12 +607,12 @@ RelationGetSmgr(Relation rel)
*
* Returns false if wal_level = minimal and this relation is created or
* truncated in the current transaction. See "Skipping WAL for New
- * RelFileNode" in src/backend/access/transam/README.
+ * RelFileLocator" in src/backend/access/transam/README.
*/
#define RelationNeedsWAL(relation) \
(RelationIsPermanent(relation) && (XLogIsNeeded() || \
(relation->rd_createSubid == InvalidSubTransactionId && \
- relation->rd_firstRelfilenodeSubid == InvalidSubTransactionId)))
+ relation->rd_firstRelfilelocatorSubid == InvalidSubTransactionId)))
/*
* RelationUsesLocalBuffers
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index c93d865..ba35d6b 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -103,7 +103,7 @@ extern Relation RelationBuildLocalRelation(const char *relname,
TupleDesc tupDesc,
Oid relid,
Oid accessmtd,
- Oid relfilenode,
+ RelFileNumber relfilenumber,
Oid reltablespace,
bool shared_relation,
bool mapped_relation,
@@ -111,10 +111,10 @@ extern Relation RelationBuildLocalRelation(const char *relname,
char relkind);
/*
- * Routines to manage assignment of new relfilenode to a relation
+ * Routines to manage assignment of new relfilenumber to a relation
*/
-extern void RelationSetNewRelfilenode(Relation relation, char persistence);
-extern void RelationAssumeNewRelfilenode(Relation relation);
+extern void RelationSetNewRelfilenumber(Relation relation, char persistence);
+extern void RelationAssumeNewRelfilelocator(Relation relation);
/*
* Routines for flushing/rebuilding relcache entries in various scenarios
diff --git a/src/include/utils/relfilenodemap.h b/src/include/utils/relfilenodemap.h
deleted file mode 100644
index 77d8046..0000000
--- a/src/include/utils/relfilenodemap.h
+++ /dev/null
@@ -1,18 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * relfilenodemap.h
- * relfilenode to oid mapping cache.
- *
- * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * src/include/utils/relfilenodemap.h
- *
- *-------------------------------------------------------------------------
- */
-#ifndef RELFILENODEMAP_H
-#define RELFILENODEMAP_H
-
-extern Oid RelidByRelfilenode(Oid reltablespace, Oid relfilenode);
-
-#endif /* RELFILENODEMAP_H */
diff --git a/src/include/utils/relfilenumbermap.h b/src/include/utils/relfilenumbermap.h
new file mode 100644
index 0000000..c149a93
--- /dev/null
+++ b/src/include/utils/relfilenumbermap.h
@@ -0,0 +1,19 @@
+/*-------------------------------------------------------------------------
+ *
+ * relfilenumbermap.h
+ * relfilenumber to oid mapping cache.
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/relfilenumbermap.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef RELFILENUMBERMAP_H
+#define RELFILENUMBERMAP_H
+
+extern Oid RelidByRelfilenumber(Oid reltablespace,
+ RelFileNumber relfilenumber);
+
+#endif /* RELFILENUMBERMAP_H */
diff --git a/src/include/utils/relmapper.h b/src/include/utils/relmapper.h
index 557f77e..2bb2e25 100644
--- a/src/include/utils/relmapper.h
+++ b/src/include/utils/relmapper.h
@@ -1,7 +1,7 @@
/*-------------------------------------------------------------------------
*
* relmapper.h
- * Catalog-to-filenode mapping
+ * Catalog-to-filenumber mapping
*
*
* Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
@@ -35,14 +35,15 @@ typedef struct xl_relmap_update
#define MinSizeOfRelmapUpdate offsetof(xl_relmap_update, data)
-extern Oid RelationMapOidToFilenode(Oid relationId, bool shared);
+extern RelFileNumber RelationMapOidToFilenumber(Oid relationId, bool shared);
-extern Oid RelationMapFilenodeToOid(Oid relationId, bool shared);
-extern Oid RelationMapOidToFilenodeForDatabase(char *dbpath, Oid relationId);
+extern Oid RelationMapFilenumberToOid(RelFileNumber relationId, bool shared);
+extern RelFileNumber RelationMapOidToFilenumberForDatabase(char *dbpath,
+ Oid relationId);
extern void RelationMapCopy(Oid dbid, Oid tsid, char *srcdbpath,
char *dstdbpath);
-extern void RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
- bool immediate);
+extern void RelationMapUpdateMap(Oid relationId, RelFileNumber fileNumber,
+ bool shared, bool immediate);
extern void RelationMapRemoveMapping(Oid relationId);
diff --git a/src/test/recovery/t/018_wal_optimize.pl b/src/test/recovery/t/018_wal_optimize.pl
index 4700d49..869d9d5 100644
--- a/src/test/recovery/t/018_wal_optimize.pl
+++ b/src/test/recovery/t/018_wal_optimize.pl
@@ -5,7 +5,7 @@
#
# These tests exercise code that once violated the mandate described in
# src/backend/access/transam/README section "Skipping WAL for New
-# RelFileNode". The tests work by committing some transactions, initiating an
+# RelFileLocator". The tests work by committing some transactions, initiating an
# immediate shutdown, and confirming that the expected data survives recovery.
# For many years, individual commands made the decision to skip WAL, hence the
# frequent appearance of COPY in these tests.
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 4fb7469..11b68b6 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2255,8 +2255,8 @@ ReindexObjectType
ReindexParams
ReindexStmt
ReindexType
-RelFileNode
-RelFileNodeBackend
+RelFileLocator
+RelFileLocatorBackend
RelIdCacheEnt
RelInfo
RelInfoArr
@@ -2274,8 +2274,8 @@ RelationPtr
RelationSyncEntry
RelcacheCallbackFunction
ReleaseMatchCB
-RelfilenodeMapEntry
-RelfilenodeMapKey
+RelfilenumberMapEntry
+RelfilenumberMapKey
Relids
RelocationBufferInfo
RelptrFreePageBtree
@@ -3877,7 +3877,7 @@ xl_xact_parsed_abort
xl_xact_parsed_commit
xl_xact_parsed_prepare
xl_xact_prepare
-xl_xact_relfilenodes
+xl_xact_relfilelocators
xl_xact_stats_item
xl_xact_stats_items
xl_xact_subxacts
--
1.8.3.1
On Tue, 28 Jun 2022 at 19:18, Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
I will be the first to admit that it is quite unlikely to be common
practise, but this workload increases the number of dbOid+spcOid
combinations to 100s (even while using only a single tablespace),
Which should still fit nicely in 32bits then. Why does that present a
problem to this idea?
The reason to mention this now is that it would give more space than
56bit limit being suggested here. I am not opposed to the current
patch, just finding ways to remove some objections mentioned by
others, if those became blockers.
which in my opinion requires some more thought than just handwaving it
into an smgr array and/or checkpoint records.
The idea is that we would store the mapping as an array, with the
value in the RelFileNode as the offset in the array. The array would
be mostly static, so would cache nicely.
For convenience, I imagine that the mapping could be included in WAL
in or near the checkpoint record, to ensure that the mapping was
available in all backups.
--
Simon Riggs http://www.EnterpriseDB.com/
On Wed, 29 Jun 2022 at 14:41, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
On Tue, 28 Jun 2022 at 19:18, Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
I will be the first to admit that it is quite unlikely to be common
practise, but this workload increases the number of dbOid+spcOid
combinations to 100s (even while using only a single tablespace),
Which should still fit nicely in 32bits then. Why does that present a
problem to this idea?
It doesn't, or at least not the bitspace part. I think it is indeed
quite unlikely anyone will try to build as many tablespaces as the 100
million tables project, which utilized 1000 tablespaces to get around
file system limitations [0].
The potential problem is 'where to store such mapping efficiently'.
Especially considering that this mapping might (and likely: will)
change across restarts and when database churn (create + drop
database) happens in e.g. testing workloads.
The reason to mention this now is that it would give more space than
56bit limit being suggested here. I am not opposed to the current
patch, just finding ways to remove some objections mentioned by
others, if those became blockers.
which in my opinion requires some more thought than just handwaving it
into an smgr array and/or checkpoint records.
The idea is that we would store the mapping as an array, with the
value in the RelFileNode as the offset in the array. The array would
be mostly static, so would cache nicely.
That part is not quite clear to me. Any cluster may have anywhere
between 3 and hundreds or thousands of entries in that mapping. Do you
suggest to dynamically grow that (presumably shared, considering the
addressing is shared) array, or have a runtime parameter limiting the
amount of those entries (similar to max_connections)?
For convenience, I imagine that the mapping could be included in WAL
in or near the checkpoint record, to ensure that the mapping was
available in all backups.
Why would we need this mapping in backups, considering that it seems
to be transient state that is lost on restart? Won't we still use full
dbOid and spcOid in anything we communicate or store on disk (file
names, WAL, pg_class rows, etc.), or did I misunderstand your
proposal?
Kind regards,
Matthias van de Meent
[0]: https://www.pgcon.org/2013/schedule/attachments/283_Billion_Tables_Project-PgCon2013.pdf
On Thu, Jun 30, 2022 at 12:41 AM Simon Riggs
<simon.riggs@enterprisedb.com> wrote:
The reason to mention this now is that it would give more space than
56bit limit being suggested here.
Isn't 2^56 enough, though? Remembering that cluster time runs out
when we've generated 2^64 bytes of WAL, if you want to run out of 56
bit relfile numbers before the end of time you'll need to find a way
to allocate them in less than 2^8 bytes of WAL. That's technically
possible, since SMgr CREATE records are only 42 bytes long, so you
could craft some C code to do nothing but create (and leak)
relfilenodes, but real usage is always accompanied by catalogue
insertions to connect the new relfilenode to a database object,
without which they are utterly useless. So in real life, it takes
many hundreds or typically thousands of bytes, much more than 256.
On Tue, Jun 28, 2022 at 5:15 PM Simon Riggs
<simon.riggs@enterprisedb.com> wrote:
On Sat, 25 Jun 2022 at 02:30, Andres Freund <andres@anarazel.de> wrote:
And then like this in 0003:
typedef struct buftag
{
Oid spcOid;
Oid dbOid;
RelFileNumber fileNumber:56;
ForkNumber forkNum:8;
} BufferTag;
Probably worth checking the generated code / the performance effects of using
bitfields (vs manual maskery). I've seen some awful cases, but here it's at a
byte boundary, so it might be ok.
Another approach would be to condense spcOid and dbOid into a single
4-byte Oid-like number, since in most cases they are associated with
each other, and not often many of them anyway. So this new number
would indicate both the database and the tablespace. I know that we
want to be able to make file changes without doing catalog lookups,
but since the number of combinations is usually 1, but even then, low,
it can be cached easily in a smgr array and included in the checkpoint
record (or nearby) for ease of use.
typedef struct buftag
{
Oid db_spcOid;
ForkNumber uint32;
RelFileNumber uint64;
} BufferTag;
That way we could just have a simple 64-bit RelFileNumber, without
restriction, and probably some spare bytes on the ForkNumber, if we
needed them later.
Yeah this is possible but I am not seeing the clear advantage. Of
course we can widen the RelFileNumber to 64 instead of 56 but with the
added complexity of storing the mapping. I am not sure if it is
really worth it?
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
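To make the two layouts under discussion concrete, here is a minimal standalone sketch of both: the 56/8 bitfield version and a manual mask/shift version packed into one uint64. The struct and accessor names are illustrative only, not the patch's actual definitions, and the bitfield variant assumes a compiler that merges the two fields into a single 8-byte unit (which the "byte boundary" comment above suggests is likely):

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t RelFileNumber;

/* Bitfield variant: the compiler does the packing at the byte boundary. */
typedef struct BufferTagBitfield
{
	uint32_t	spcOid;
	uint32_t	dbOid;
	uint64_t	relNumber:56;
	uint64_t	forkNum:8;
} BufferTagBitfield;

/* Manual variant: one uint64 field with explicit shift/mask accessors. */
typedef struct BufferTagPacked
{
	uint32_t	spcOid;
	uint32_t	dbOid;
	uint64_t	relForkDetails; /* forkNum in high 8 bits, relNumber below */
} BufferTagPacked;

#define BUFTAG_RELNUMBER_MASK	((UINT64_C(1) << 56) - 1)

static inline void
BufTagSetRelFork(BufferTagPacked *tag, RelFileNumber relNumber, int forkNum)
{
	tag->relForkDetails = ((uint64_t) forkNum << 56) |
		(relNumber & BUFTAG_RELNUMBER_MASK);
}

static inline RelFileNumber
BufTagGetRelNumber(const BufferTagPacked *tag)
{
	return tag->relForkDetails & BUFTAG_RELNUMBER_MASK;
}

static inline int
BufTagGetForkNum(const BufferTagPacked *tag)
{
	return (int) (tag->relForkDetails >> 56);
}
```

Either way the tag stays at 16 bytes; the open question raised above is only whether the compiler-generated bitfield code is as good as the explicit masking.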
On Thu, 30 Jun 2022 at 03:43, Thomas Munro <thomas.munro@gmail.com> wrote:
On Thu, Jun 30, 2022 at 12:41 AM Simon Riggs
<simon.riggs@enterprisedb.com> wrote:
The reason to mention this now is that it would give more space than
56bit limit being suggested here.
Isn't 2^56 enough, though?
For me, yes.
To the above comment, I followed with:
I am not opposed to the current
patch, just finding ways to remove some objections mentioned by
others, if those became blockers.
So it seems we can continue with the patch.
--
Simon Riggs http://www.EnterpriseDB.com/
On Wed, Jun 29, 2022 at 5:15 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
- It looks to me like you need to give significantly more thought to
the proper way of adjusting the relfilenode-related test cases in
alter_table.out.
It seems to me that this test case is just testing whether the
table/child table are rewritten or not after the alter table. And for
that it is comparing the oid with the relfilenode, now that is not
possible so I think it's quite reasonable to just compare the current
relfilenode with the old relfilenode and if they are same the table is
not rewritten. So I am not sure why the original test case had two
cases 'own' and 'orig'. With respect to this test case they both have
the same meaning, in fact comparing old relfilenode with current
relfilenode is better way of testing than comparing the oid with
relfilenode.
I think you're right. However, I don't really like OTHER showing up in
the output, because that looks like a string that was chosen to be
slightly alarming, especially given that it's in ALL CAPS. How about
if we change 'ORIG' to 'new'?
--
Robert Haas
EDB: http://www.enterprisedb.com
On Wed, Jun 29, 2022 at 5:15 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
PFA, the remaining set of patches. It might need to fix some
indentation but let's first see how the overall idea looks, then we can
work on it
So just playing around with this patch set, and also looking at the
code a bit, here are a few random observations:
- The patch assigns relfilenumbers starting with 1. I don't see any
specific problem with that, but I wonder if it would be a good idea to
start with a random larger value just in case we ever need some fixed
values for some purpose or other. Maybe we should start with 100000 or
something?
- If I use ALTER TABLE .. SET TABLESPACE to move a table around, then
the relfilenode changes each time, but if I use ALTER DATABASE .. SET
TABLESPACE to move a database around, the relfilenodes don't change.
So, what this guarantees is that if the same filename is used twice,
it will be for the same relation and not some unrelated relation.
That's enough to avoid the hazard described in the comments for
mdunlink(), because that scenario intrinsically involves confusion
caused by two relations using the same filename after an OID
wraparound. And it also means that if we pursue the idea of using an
end-of-recovery record in all cases, we don't need to start creating
tombstones during crash recovery. The forced checkpoint at the end of
crash recovery means we don't currently need to do that, but if we
change that, then the same hazard would exist there as we already have
in normal running, and this fixes it. However, I don't find it
entirely obvious that there are no hazards of any kind stemming from
repeated use of ALTER DATABASE .. SET TABLESPACE resulting in
filenames getting reused. On the other hand avoiding filename reuse
completely would be more work, not closely related to what the rest of
the patch set does, probably somewhat controversial in terms of what
it would have to do, and I'm not sure that we really need it. It does
seem like it would be quite a bit easier to reason about, though,
because the current guarantee is suspiciously similar to "we don't do
X, except when we do." This is not really so much a review comment for
Dilip as a request for input from others ... thoughts?
- Again, not a review comment for this patch specifically, but I'm
wondering if we could use this as infrastructure for a tool to clean
orphaned files out of the data directory. Suppose we create a file for
a new relation and then crash, leaving a potentially large file on
disk that will never be removed. Well, if the relfilenumber as it
exists on disk is not in pg_class and old enough that a transaction
inserting into pg_class can't still be running, then it must be safe
to remove that file. Maybe that's safe even today, but it's a little
hard to reason about it in the face of a possible OID wraparound that
might result in reusing the same numbers over again. It feels like
this makes it easier to identify which files are old stuff that can never
again be touched.
- I might be missing something here, but this isn't actually making
the relfilenode 56 bits, is it? The reason to do that is to make the
BufferTag smaller, so I expected to see that BufferTag either used
bitfields like RelFileNumber relNumber:56 and ForkNumber forkNum:8, or
else that it just declared a single field for both as uint64 and used
accessor macros or static inlines to separate them out. But it doesn't
seem to do either of those things, which seems like it can't be right.
On a related note, I think it would be better to declare RelFileNumber
as an unsigned type even though we have no use for the high bit; we
have, equally, no use for negative values. It's easier to reason about
bit-shifting operations with unsigned types.
- I also think that the cross-version compatibility stuff in
pg_buffercache isn't quite right. It does values[1] =
ObjectIdGetDatum(fctx->record[i].relfilenumber). But I think what it
ought to do is dependent on the output type. If the output type is
int8, then it ought to do values[1] = Int64GetDatum((int64)
fctx->record[i].relfilenumber), and if it's OID, then it ought to do
values[1] = ObjectIdGetDatum((Oid) fctx->record[i].relfilenumber)).
The macro that you use needs to be based on the output SQL type, not
the C data type.
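A sketch of that dispatch, using simplified stand-ins for the Datum machinery (the real Int64GetDatum/ObjectIdGetDatum are PostgreSQL definitions from postgres.h and may represent values differently; relfilenumber_to_datum is a hypothetical helper, not actual pg_buffercache code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins, for illustration only. */
typedef uint64_t Datum;
typedef uint32_t Oid;
typedef uint64_t RelFileNumber;

#define Int64GetDatum(x)	((Datum) (int64_t) (x))
#define ObjectIdGetDatum(x)	((Datum) (Oid) (x))

/*
 * Choose the conversion based on the SQL output type declared for the
 * column, not the C type of the stored field: an int8 column carries
 * the full 56-bit value, while an OID column (older pg_buffercache
 * versions) gets the value truncated to 32 bits.
 */
static Datum
relfilenumber_to_datum(RelFileNumber relfilenumber, bool output_is_int8)
{
	if (output_is_int8)
		return Int64GetDatum((int64_t) relfilenumber);
	return ObjectIdGetDatum((Oid) relfilenumber);
}
```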
- I think it might be a good idea to allocate RelFileNumbers in much
smaller batches than we do OIDs. 8192 feels wasteful to me. It
shouldn't practically matter, because if we have 56 bits of bit space
and so even if we repeatedly allocate 2^13 RelFileNumbers and then
crash, we can still crash 2^41 times before we completely run out of
numbers, and 2 trillion crashes ought to be enough for anyone. But I
see little benefit from being so profligate. You can allocate an OID
as an identifier for a catalog tuple or a TOAST chunk, but a
RelFileNumber requires a filesystem operation, so the amount of work
that is needed to use up 8192 RelFileNumbers is a lot bigger than the
amount of work required to use up 8192 OIDs. If we dropped this down
to 128, or 64, or 256, would anything bad happen?
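The batching scheme in question can be sketched as follows. This is a hypothetical single-threaded illustration, not the patch's code: the names, the 100000 starting value, and the 64-entry batch size are assumptions, and the real implementation would hold a lock and emit an actual WAL record where log_next_relfilenumber() stands in here. The point is only that a crash wastes at most one batch of numbers:

```c
#include <assert.h>
#include <stdint.h>

#define RELNUMBER_PREFETCH 64	/* illustrative, vs. the 8192 questioned above */

typedef uint64_t RelFileNumber;

static struct
{
	RelFileNumber nextRelFileNumber;	/* next value to hand out */
	RelFileNumber loggedRelFileNumber;	/* highest value covered by WAL */
} shared = {100000, 100000};

/* Stand-in for emitting an XLOG_NEXT_RELFILENUMBER-style record. */
static void
log_next_relfilenumber(RelFileNumber next)
{
	shared.loggedRelFileNumber = next;
}

/*
 * Hand out one number; WAL-log a new high-water mark only once per
 * RELNUMBER_PREFETCH allocations.  Recovery restarts the counter from
 * the last logged value, so at most one batch is ever skipped.
 */
static RelFileNumber
GetNewRelFileNumberSketch(void)
{
	if (shared.nextRelFileNumber >= shared.loggedRelFileNumber)
		log_next_relfilenumber(shared.nextRelFileNumber + RELNUMBER_PREFETCH);
	return shared.nextRelFileNumber++;
}
```

With a batch of 64 instead of 8192, the WAL-record overhead per allocation stays negligible while the numbers burned per crash drop by two orders of magnitude.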
- Do we really want GetNewRelFileNumber() to call access() just for a
can't-happen scenario? Can't we catch this problem later when we
actually go to create the files on disk?
- The patch updates the comments in XLogPrefetcherNextBlock to talk
about relfilenumbers being reused rather than relfilenodes being
reused, which is fine except that we're sorta kinda not doing that any
more as noted above. I don't really know what these comments ought to
say instead but perhaps more than a mechanical update is in order.
This applies, even more, to the comments above mdunlink(). Apart from
updating the existing comments, I think that the patch needs a good
explanation of the new scheme someplace, and what it does and doesn't
guarantee, which relates to the point above about making sure we know
exactly what we're guaranteeing and why. I don't know where exactly
this text should be positioned yet, or what it should say, but it
needs to go someplace. This is a fairly significant change and needs
to be talked about somewhere.
- I think there's still a bit of a terminology problem here. With the
patch set, we use RelFileNumber to refer to a single, 56-bit integer
and RelFileLocator to refer to that integer combined with the DB and
TS OIDs. But sometimes in the comments we want to talk about the
logical sequence of files that is identified by a RelFileLocator, and
that's not quite the same as either of those things. For example, in
tableam.h we currently say "This callback needs to create a new
relation filenode for `rel`" and how should that be changed in this
new naming? We're not creating a new RelFileNumber - those would need
to be allocated, not created, as all the numbers in the universe exist
already. Neither are we creating a new locator; that sounds like it
means assembling it from pieces. What we're doing is creating the
first of what may end up being a series of similarly-named files on
disk. I'm not exactly sure how we can refer to that in a way that is
clear, but it's a problem that arises here and here throughout the
patch.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Thu, Jun 30, 2022 at 10:57 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Jun 29, 2022 at 5:15 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
- It looks to me like you need to give significantly more thought to
the proper way of adjusting the relfilenode-related test cases in
alter_table.out.
It seems to me that this test case is just testing whether the
table/child table are rewritten or not after the alter table. And for
that it is comparing the oid with the relfilenode, now that is not
possible so I think it's quite reasonable to just compare the current
relfilenode with the old relfilenode and if they are same the table is
not rewritten. So I am not sure why the original test case had two
cases 'own' and 'orig'. With respect to this test case they both have
the same meaning, in fact comparing old relfilenode with current
relfilenode is better way of testing than comparing the oid with
relfilenode.
I think you're right. However, I don't really like OTHER showing up in
the output, because that looks like a string that was chosen to be
slightly alarming, especially given that it's in ALL CAPS. How about
if we change 'ORIG' to 'new'?
I think you meant, rename 'OTHER' to 'new', yeah that makes sense.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Fri, Jul 1, 2022 at 12:54 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Jun 29, 2022 at 5:15 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
PFA, the remaining set of patches. It might need to fix some
indentation but let's first see how the overall idea looks, then we can
work on it
So just playing around with this patch set, and also looking at the
code a bit, here are a few random observations:
- The patch assigns relfilenumbers starting with 1. I don't see any
specific problem with that, but I wonder if it would be a good idea to
start with a random larger value just in case we ever need some fixed
values for some purpose or other. Maybe we should start with 100000 or
something?
Yeah we can do that, I have changed to 100000.
- If I use ALTER TABLE .. SET TABLESPACE to move a table around, then
the relfilenode changes each time, but if I use ALTER DATABASE .. SET
TABLESPACE to move a database around, the relfilenodes don't change.
So, what this guarantees is that if the same filename is used twice,
it will be for the same relation and not some unrelated relation.
That's enough to avoid the hazard described in the comments for
mdunlink(), because that scenario intrinsically involves confusion
caused by two relations using the same filename after an OID
wraparound. And it also means that if we pursue the idea of using an
end-of-recovery record in all cases, we don't need to start creating
tombstones during crash recovery. The forced checkpoint at the end of
crash recovery means we don't currently need to do that, but if we
change that, then the same hazard would exist there as we already have
in normal running, and this fixes it. However, I don't find it
entirely obvious that there are no hazards of any kind stemming from
repeated use of ALTER DATABASE .. SET TABLESPACE resulting in
filenames getting reused. On the other hand avoiding filename reuse
completely would be more work, not closely related to what the rest of
the patch set does, probably somewhat controversial in terms of what
it would have to do, and I'm not sure that we really need it. It does
seem like it would be quite a bit easier to reason about, though,
because the current guarantee is suspiciously similar to "we don't do
X, except when we do." This is not really so much a review comment for
Dilip as a request for input from others ... thoughts?
Yeah that can be done, but maybe as a separate patch. One option is
that when we will support the WAL method for the ALTER TABLE .. SET
TABLESPACE like we did for CREATE DATABASE, as part of that we will
generate the new relfilenumber.
- Again, not a review comment for this patch specifically, but I'm
wondering if we could use this as infrastructure for a tool to clean
orphaned files out of the data directory. Suppose we create a file for
a new relation and then crash, leaving a potentially large file on
disk that will never be removed. Well, if the relfilenumber as it
exists on disk is not in pg_class and old enough that a transaction
inserting into pg_class can't still be running, then it must be safe
to remove that file. Maybe that's safe even today, but it's a little
hard to reason about it in the face of a possible OID wraparound that
might result in reusing the same numbers over again. It feels like
this makes it easier to identify which files are old stuff that can never
again be touched.
Correct.
- I might be missing something here, but this isn't actually making
the relfilenode 56 bits, is it? The reason to do that is to make the
BufferTag smaller, so I expected to see that BufferTag either used
bitfields like RelFileNumber relNumber:56 and ForkNumber forkNum:8, or
else that it just declared a single field for both as uint64 and used
accessor macros or static inlines to separate them out. But it doesn't
seem to do either of those things, which seems like it can't be right.
On a related note, I think it would be better to declare RelFileNumber
as an unsigned type even though we have no use for the high bit; we
have, equally, no use for negative values. It's easier to reason about
bit-shifting operations with unsigned types.
Oops, I somehow missed merging that change into the patch. I have
changed it as below and adjusted the macros.
typedef struct buftag
{
	Oid			spcOid;			/* tablespace oid */
	Oid			dbOid;			/* database oid */
	uint32		relNumber_low;	/* relfilenumber low 32 bits */
	uint32		relNumber_hi:24;	/* relfilenumber high 24 bits */
	uint32		forkNum:8;		/* fork number */
	BlockNumber	blockNum;		/* blknum relative to begin of reln */
} BufferTag;
I think we need to split it like this to keep the BufferTag 4-byte
aligned; otherwise the size of the structure would increase.
- I also think that the cross-version compatibility stuff in
pg_buffercache isn't quite right. It does values[1] =
ObjectIdGetDatum(fctx->record[i].relfilenumber). But I think what it
ought to do is dependent on the output type. If the output type is
int8, then it ought to do values[1] = Int64GetDatum((int64)
fctx->record[i].relfilenumber), and if it's OID, then it ought to do
values[1] = ObjectIdGetDatum((Oid) fctx->record[i].relfilenumber).
The macro that you use needs to be based on the output SQL type, not
the C data type.
Fixed
- I think it might be a good idea to allocate RelFileNumbers in much
smaller batches than we do OIDs. 8192 feels wasteful to me. It
shouldn't practically matter, because we have 56 bits of number space,
so even if we repeatedly allocate 2^13 RelFileNumbers and then
crash, we can still crash 2^41 times before we completely run out of
numbers, and 2 trillion crashes ought to be enough for anyone. But I
see little benefit from being so profligate. You can allocate an OID
as an identifier for a catalog tuple or a TOAST chunk, but a
RelFileNumber requires a filesystem operation, so the amount of work
that is needed to use up 8192 RelFileNumbers is a lot bigger than the
amount of work required to use up 8192 OIDs. If we dropped this down
to 128, or 64, or 256, would anything bad happen?
This makes sense, so I have changed it to 64.
- Do we really want GetNewRelFileNumber() to call access() just for a
can't-happen scenario? Can't we catch this problem later when we
actually go to create the files on disk?
Yeah, we don't need to. Actually, we can get rid of the
GetNewRelFileNumber() function entirely and call
GenerateNewRelFileNumber() directly; in fact, we can then rename
GenerateNewRelFileNumber() to GetNewRelFileNumber(). I have made
these changes.
- The patch updates the comments in XLogPrefetcherNextBlock to talk
about relfilenumbers being reused rather than relfilenodes being
reused, which is fine except that we're sorta kinda not doing that any
more as noted above. I don't really know what these comments ought to
say instead but perhaps more than a mechanical update is in order.
Changed
This applies, even more, to the comments above mdunlink(). Apart from
updating the existing comments, I think that the patch needs a good
explanation of the new scheme someplace, and what it does and doesn't
guarantee, which relates to the point above about making sure we know
exactly what we're guaranteeing and why. I don't know where exactly
this text should be positioned yet, or what it should say, but it
needs to go someplace. This is a fairly significant change and needs
to be talked about somewhere.
For now, in v4_0004**, I have removed the comment explaining why we
need to keep the tombstone file, and added a note on why we no longer
need to keep those files from PG16 onwards.
- I think there's still a bit of a terminology problem here. With the
patch set, we use RelFileNumber to refer to a single, 56-bit integer
and RelFileLocator to refer to that integer combined with the DB and
TS OIDs. But sometimes in the comments we want to talk about the
logical sequence of files that is identified by a RelFileLocator, and
that's not quite the same as either of those things. For example, in
tableam.h we currently say "This callback needs to create a new
relation filenode for `rel`" and how should that be changed in this
new naming? We're not creating a new RelFileNumber - those would need
to be allocated, not created, as all the numbers in the universe exist
already. Neither are we creating a new locator; that sounds like it
means assembling it from pieces. What we're doing is creating the
first of what may end up being a series of similarly-named files on
disk. I'm not exactly sure how we can refer to that in a way that is
clear, but it's a problem that arises here and there throughout the
patch.
I think the comment can say
"This callback needs to create a new relnumber file for 'rel' " ?
I have not modified this yet; I will check the other places where we
have such terminology issues.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachments:
v4-0003-Use-56-bits-for-relfilenumber-to-avoid-wraparound.patchtext/x-patch; charset=UTF-8; name=v4-0003-Use-56-bits-for-relfilenumber-to-avoid-wraparound.patchDownload
From 5d48dccb9b5ebca755fdb3025924ae0e15bf18ca Mon Sep 17 00:00:00 2001
From: dilip kumar <dilipbalaut@localhost.localdomain>
Date: Sat, 25 Jun 2022 15:12:27 +0530
Subject: [PATCH v4 3/4] Use 56 bits for relfilenumber to avoid wraparound
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
As part of this patch, we make the relfilenumber 56 bits wide. The
problem is that widening it naively would also enlarge the BufferTag,
which would increase memory usage and might hurt performance. To
avoid that, inside the buffer tag we use 8 bits for the fork number
and 56 bits for the relfilenumber, so the pair still fits in 64 bits.
---
contrib/pg_buffercache/Makefile | 3 +-
.../pg_buffercache/pg_buffercache--1.3--1.4.sql | 30 +++++++
contrib/pg_buffercache/pg_buffercache.control | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 31 ++++++-
contrib/pg_walinspect/expected/pg_walinspect.out | 4 +-
contrib/pg_walinspect/sql/pg_walinspect.sql | 4 +-
doc/src/sgml/catalogs.sgml | 2 +-
doc/src/sgml/pgbuffercache.sgml | 2 +-
src/backend/access/gin/ginxlog.c | 2 +-
src/backend/access/rmgrdesc/gistdesc.c | 2 +-
src/backend/access/rmgrdesc/heapdesc.c | 2 +-
src/backend/access/rmgrdesc/nbtdesc.c | 2 +-
src/backend/access/rmgrdesc/seqdesc.c | 2 +-
src/backend/access/rmgrdesc/xlogdesc.c | 21 +++--
src/backend/access/transam/README | 4 +-
src/backend/access/transam/varsup.c | 94 +++++++++++++++++++++-
src/backend/access/transam/xlog.c | 48 +++++++++++
src/backend/access/transam/xlogprefetcher.c | 18 ++---
src/backend/access/transam/xlogrecovery.c | 6 +-
src/backend/access/transam/xlogutils.c | 8 +-
src/backend/catalog/catalog.c | 93 ---------------------
src/backend/catalog/heap.c | 8 +-
src/backend/catalog/index.c | 4 +-
src/backend/commands/tablecmds.c | 10 ++-
src/backend/nodes/outfuncs.c | 2 +-
src/backend/replication/logical/decode.c | 1 +
src/backend/replication/logical/reorderbuffer.c | 2 +-
src/backend/storage/freespace/fsmpage.c | 2 +-
src/backend/storage/lmgr/lwlocknames.txt | 1 +
src/backend/storage/smgr/smgr.c | 2 +-
src/backend/utils/adt/dbsize.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 9 ++-
src/backend/utils/cache/relcache.c | 5 +-
src/backend/utils/cache/relfilenumbermap.c | 4 +-
src/backend/utils/misc/pg_controldata.c | 9 ++-
src/bin/pg_checksums/pg_checksums.c | 6 +-
src/bin/pg_controldata/pg_controldata.c | 2 +
src/bin/pg_dump/pg_dump.c | 20 ++---
src/bin/pg_rewind/filemap.c | 6 +-
src/bin/pg_upgrade/info.c | 11 +--
src/bin/pg_upgrade/pg_upgrade.c | 6 +-
src/bin/pg_upgrade/relfilenumber.c | 4 +-
src/bin/pg_waldump/pg_waldump.c | 2 +-
src/common/relpath.c | 20 ++---
src/fe_utils/option_utils.c | 42 ++++++++++
src/include/access/transam.h | 5 ++
src/include/access/xlog.h | 1 +
src/include/catalog/catalog.h | 3 -
src/include/catalog/pg_class.h | 10 +--
src/include/catalog/pg_control.h | 2 +
src/include/catalog/pg_proc.dat | 10 +--
src/include/fe_utils/option_utils.h | 3 +
src/include/postgres_ext.h | 7 +-
src/include/storage/buf_internals.h | 18 +++--
src/include/storage/relfilelocator.h | 12 ++-
src/test/regress/expected/alter_table.out | 24 +++---
src/test/regress/sql/alter_table.sql | 8 +-
57 files changed, 423 insertions(+), 242 deletions(-)
create mode 100644 contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile
index 2ab8c65..2fbb62f 100644
--- a/contrib/pg_buffercache/Makefile
+++ b/contrib/pg_buffercache/Makefile
@@ -7,7 +7,8 @@ OBJS = \
EXTENSION = pg_buffercache
DATA = pg_buffercache--1.2.sql pg_buffercache--1.2--1.3.sql \
- pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql
+ pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql \
+ pg_buffercache--1.3--1.4.sql
PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time"
ifdef USE_PGXS
diff --git a/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
new file mode 100644
index 0000000..ee2d9c7
--- /dev/null
+++ b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
@@ -0,0 +1,30 @@
+/* contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.4'" to load this file. \quit
+
+/* First we have to remove them from the extension */
+ALTER EXTENSION pg_buffercache DROP VIEW pg_buffercache;
+ALTER EXTENSION pg_buffercache DROP FUNCTION pg_buffercache_pages();
+
+/* Then we can drop them */
+DROP VIEW pg_buffercache;
+DROP FUNCTION pg_buffercache_pages();
+
+/* Now redefine */
+CREATE OR REPLACE FUNCTION pg_buffercache_pages()
+RETURNS SETOF RECORD
+AS 'MODULE_PATHNAME', 'pg_buffercache_pages_v1_4'
+LANGUAGE C PARALLEL SAFE;
+
+CREATE OR REPLACE VIEW pg_buffercache AS
+ SELECT P.* FROM pg_buffercache_pages() AS P
+ (bufferid integer, relfilenode int8, reltablespace oid, reldatabase oid,
+ relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
+ pinning_backends int4);
+
+-- Don't want these to be available to public.
+REVOKE ALL ON FUNCTION pg_buffercache_pages() FROM PUBLIC;
+REVOKE ALL ON pg_buffercache FROM PUBLIC;
+GRANT EXECUTE ON FUNCTION pg_buffercache_pages() TO pg_monitor;
+GRANT SELECT ON pg_buffercache TO pg_monitor;
diff --git a/contrib/pg_buffercache/pg_buffercache.control b/contrib/pg_buffercache/pg_buffercache.control
index 8c060ae..a82ae5f 100644
--- a/contrib/pg_buffercache/pg_buffercache.control
+++ b/contrib/pg_buffercache/pg_buffercache.control
@@ -1,5 +1,5 @@
# pg_buffercache extension
comment = 'examine the shared buffer cache'
-default_version = '1.3'
+default_version = '1.4'
module_pathname = '$libdir/pg_buffercache'
relocatable = true
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index abc8813..4e3884b 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -59,9 +59,10 @@ typedef struct
* relation node/tablespace/database/blocknum and dirty indicator.
*/
PG_FUNCTION_INFO_V1(pg_buffercache_pages);
+PG_FUNCTION_INFO_V1(pg_buffercache_pages_v1_4);
-Datum
-pg_buffercache_pages(PG_FUNCTION_ARGS)
+static Datum
+pg_buffercache_pages_internal(PG_FUNCTION_ARGS, Oid rfn_typid)
{
FuncCallContext *funcctx;
Datum result;
@@ -103,7 +104,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
TupleDescInitEntry(tupledesc, (AttrNumber) 1, "bufferid",
INT4OID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 2, "relfilenode",
- OIDOID, -1, 0);
+ rfn_typid, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 3, "reltablespace",
OIDOID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 4, "reldatabase",
@@ -209,7 +210,16 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
}
else
{
- values[1] = ObjectIdGetDatum(fctx->record[i].relfilenumber);
+ if (rfn_typid == INT8OID)
+ values[1] =
+ Int64GetDatum((int64) fctx->record[i].relfilenumber);
+ else
+ {
+ Assert(rfn_typid == OIDOID);
+ values[1] =
+ ObjectIdGetDatum((Oid) fctx->record[i].relfilenumber);
+ }
+
nulls[1] = false;
values[2] = ObjectIdGetDatum(fctx->record[i].reltablespace);
nulls[2] = false;
@@ -237,3 +247,16 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
else
SRF_RETURN_DONE(funcctx);
}
+
+Datum
+pg_buffercache_pages(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, OIDOID);
+}
+
+/* entry point for extension version 1.4 and later */
+Datum
+pg_buffercache_pages_v1_4(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, INT8OID);
+}
diff --git a/contrib/pg_walinspect/expected/pg_walinspect.out b/contrib/pg_walinspect/expected/pg_walinspect.out
index a1ee743..e9b06ed 100644
--- a/contrib/pg_walinspect/expected/pg_walinspect.out
+++ b/contrib/pg_walinspect/expected/pg_walinspect.out
@@ -54,9 +54,9 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- ===================================================================
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
ok
----
t
diff --git a/contrib/pg_walinspect/sql/pg_walinspect.sql b/contrib/pg_walinspect/sql/pg_walinspect.sql
index 1b265ea..5393834 100644
--- a/contrib/pg_walinspect/sql/pg_walinspect.sql
+++ b/contrib/pg_walinspect/sql/pg_walinspect.sql
@@ -39,10 +39,10 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
-- ===================================================================
-- Test for filtering out WAL records based on resource_manager and
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 25b02c4..076bf8f 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1965,7 +1965,7 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
</para>
<para>
Name of the on-disk file of this relation; zero means this
diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
index a06fd3e..e222265 100644
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@@ -62,7 +62,7 @@
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
(references <link linkend="catalog-pg-class"><structname>pg_class</structname></link>.<structfield>relfilenode</structfield>)
</para>
<para>
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 41b9211..b75ad79 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -100,7 +100,7 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
BlockNumber blknum;
BufferGetTag(buffer, &locator, &forknum, &blknum);
- elog(ERROR, "failed to add item to index page in %u/%u/%u",
+ elog(ERROR, "failed to add item to index page in %u/%u/" INT64_FORMAT,
locator.spcOid, locator.dbOid, locator.relNumber);
}
}
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index 7dd3c1d..c699937 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -26,7 +26,7 @@ out_gistxlogPageUpdate(StringInfo buf, gistxlogPageUpdate *xlrec)
static void
out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec)
{
- appendStringInfo(buf, "rel %u/%u/%u; blk %u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; blk %u; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber, xlrec->block,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 923d3bc..f6d278b 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -169,7 +169,7 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
{
xl_heap_new_cid *xlrec = (xl_heap_new_cid *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; tid %u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; tid %u/%u",
xlrec->target_locator.spcOid,
xlrec->target_locator.dbOid,
xlrec->target_locator.relNumber,
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index 4843cd5..70feb2d 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -100,7 +100,7 @@ btree_desc(StringInfo buf, XLogReaderState *record)
{
xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/seqdesc.c b/src/backend/access/rmgrdesc/seqdesc.c
index b3845f9..45c6ee7 100644
--- a/src/backend/access/rmgrdesc/seqdesc.c
+++ b/src/backend/access/rmgrdesc/seqdesc.c
@@ -25,7 +25,7 @@ seq_desc(StringInfo buf, XLogReaderState *record)
xl_seq_rec *xlrec = (xl_seq_rec *) rec;
if (info == XLOG_SEQ_LOG)
- appendStringInfo(buf, "rel %u/%u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT,
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber);
}
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 6fec485..e21559d 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -45,8 +45,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
CheckPoint *checkpoint = (CheckPoint *) rec;
appendStringInfo(buf, "redo %X/%X; "
- "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; "
- "oldest xid %u in DB %u; oldest multi %u in DB %u; "
+ "tli %u; prev tli %u; fpw %s; xid %u:%u; relfilenumber " INT64_FORMAT "; oid %u; "
+ "multi %u; offset %u; oldest xid %u in DB %u; oldest multi %u in DB %u; "
"oldest/newest commit timestamp xid: %u/%u; "
"oldest running xid %u; %s",
LSN_FORMAT_ARGS(checkpoint->redo),
@@ -55,6 +55,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->fullPageWrites ? "true" : "false",
EpochFromFullTransactionId(checkpoint->nextXid),
XidFromFullTransactionId(checkpoint->nextXid),
+ checkpoint->nextRelFileNumber,
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
@@ -74,6 +75,13 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
memcpy(&nextOid, rec, sizeof(Oid));
appendStringInfo(buf, "%u", nextOid);
}
+ else if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, rec, sizeof(RelFileNumber));
+ appendStringInfo(buf, INT64_FORMAT, nextRelFileNumber);
+ }
else if (info == XLOG_RESTORE_POINT)
{
xl_restore_point *xlrec = (xl_restore_point *) rec;
@@ -169,6 +177,9 @@ xlog_identify(uint8 info)
case XLOG_NEXTOID:
id = "NEXTOID";
break;
+ case XLOG_NEXT_RELFILENUMBER:
+ id = "NEXT_RELFILENUMBER";
+ break;
case XLOG_SWITCH:
id = "SWITCH";
break;
@@ -237,7 +248,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfoChar(buf, ' ');
appendStringInfo(buf,
- "blkref #%d: rel %u/%u/%u fork %s blk %u",
+ "blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -297,7 +308,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
if (forknum != MAIN_FORKNUM)
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u fork %s blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -306,7 +317,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
else
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
blk);
diff --git a/src/backend/access/transam/README b/src/backend/access/transam/README
index 565f994..c72f4fb 100644
--- a/src/backend/access/transam/README
+++ b/src/backend/access/transam/README
@@ -692,8 +692,8 @@ by having database restart search for files that don't have any committed
entry in pg_class, but that currently isn't done because of the possibility
of deleting data that is useful for forensic analysis of the crash.
Orphan files are harmless --- at worst they waste a bit of disk space ---
-because we check for on-disk collisions when allocating new relfilenumber
-OIDs. So cleaning up isn't really necessary.
+because the relfilenumber is 56 bits wide, so collisions should not
+occur in practice. So cleaning up isn't really necessary.
3. Deleting a table, which requires an unlink() that could fail.
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 849a7ce..430e294 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -30,6 +30,9 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+/* Number of RelFileNumbers to prefetch (preallocate) per XLOG write */
+#define VAR_RFN_PREFETCH 64
+
/* pointer to "variable cache" in shared memory (set up by shmem.c) */
VariableCache ShmemVariableCache = NULL;
@@ -521,8 +524,7 @@ ForceTransactionIdLimitUpdate(void)
* wide, counter wraparound will occur eventually, and therefore it is unwise
* to assume they are unique unless precautions are taken to make them so.
* Hence, this routine should generally not be used directly. The only direct
- * callers should be GetNewOidWithIndex() and GetNewRelFileNumber() in
- * catalog/catalog.c.
+ * callers should be GetNewOidWithIndex() in catalog/catalog.c.
*/
Oid
GetNewObjectId(void)
@@ -613,6 +615,94 @@ SetNextObjectId(Oid nextOid)
}
/*
+ * GetNewRelFileNumber
+ *
+ * Similar to GetNewObjectId, but generates a new relfilenumber instead
+ * of a new Oid.
+ */
+RelFileNumber
+GetNewRelFileNumber(void)
+{
+ RelFileNumber result;
+
+ /* Safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot assign RelFileNumber during recovery");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /* Check for wraparound of the relfilenumber counter */
+ if (unlikely(ShmemVariableCache->nextRelFileNumber > MAX_RELFILENUMBER))
+ elog(ERROR, "relfilenumber is out of bounds");
+
+ /* If we have run out of WAL-logged RelFileNumbers, we must log more */
+ if (ShmemVariableCache->relnumbercount == 0)
+ {
+ XLogPutNextRelFileNumber(ShmemVariableCache->nextRelFileNumber +
+ VAR_RFN_PREFETCH);
+
+ ShmemVariableCache->relnumbercount = VAR_RFN_PREFETCH;
+ }
+
+ result = ShmemVariableCache->nextRelFileNumber;
+ (ShmemVariableCache->nextRelFileNumber)++;
+ (ShmemVariableCache->relnumbercount)--;
+
+ LWLockRelease(RelFileNumberGenLock);
+
+ return result;
+}
+
+/*
+ * SetNextRelFileNumber
+ *
+ * This may only be called during pg_upgrade; it advances the RelFileNumber
+ * counter to the specified value if the current value is smaller than the
+ * input value.
+ */
+void
+SetNextRelFileNumber(RelFileNumber relnumber)
+{
+ int relnumbercount;
+
+ /* Safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot forward RelFileNumber during recovery");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /*
+ * If the previously assigned value of nextRelFileNumber is already
+ * higher than the requested value, there is nothing to do. This is
+ * possible because during upgrade the relfilenumbers of objects can
+ * arrive in any order.
+ */
+ if (relnumber <= ShmemVariableCache->nextRelFileNumber)
+ {
+ LWLockRelease(RelFileNumberGenLock);
+ return;
+ }
+
+ /*
+ * If setting the new relfilenumber would exhaust the already-logged
+ * range, we need to WAL-log a new range. Otherwise, just adjust
+ * relnumbercount.
+ */
+ relnumbercount = relnumber - ShmemVariableCache->nextRelFileNumber;
+ if (ShmemVariableCache->relnumbercount <= relnumbercount)
+ {
+ XLogPutNextRelFileNumber(relnumber + VAR_RFN_PREFETCH);
+ ShmemVariableCache->relnumbercount = VAR_RFN_PREFETCH;
+ }
+ else
+ ShmemVariableCache->relnumbercount -= relnumbercount;
+
+ ShmemVariableCache->nextRelFileNumber = relnumber;
+
+ LWLockRelease(RelFileNumberGenLock);
+}
+
+/*
* StopGeneratingPinnedObjectIds
*
* This is called once during initdb to force the OID counter up to
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 8764084..302da4a 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4546,6 +4546,7 @@ BootStrapXLOG(void)
checkPoint.nextXid =
FullTransactionIdFromEpochAndXid(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstGenbkiObjectId;
+ checkPoint.nextRelFileNumber = FirstNormalRelFileNumber;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
checkPoint.oldestXid = FirstNormalTransactionId;
@@ -4559,7 +4560,9 @@ BootStrapXLOG(void)
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->relnumbercount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -5026,7 +5029,9 @@ StartupXLOG(void)
/* initialize shared memory variables from the checkpoint record */
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->relnumbercount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -6475,6 +6480,12 @@ CreateCheckPoint(int flags)
checkPoint.nextOid += ShmemVariableCache->oidCount;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_SHARED);
+ checkPoint.nextRelFileNumber = ShmemVariableCache->nextRelFileNumber;
+ if (!shutdown)
+ checkPoint.nextRelFileNumber += ShmemVariableCache->relnumbercount;
+ LWLockRelease(RelFileNumberGenLock);
+
MultiXactGetCheckptMulti(shutdown,
&checkPoint.nextMulti,
&checkPoint.nextMultiOffset,
@@ -7353,6 +7364,29 @@ XLogPutNextOid(Oid nextOid)
}
/*
+ * Similar to the XLogPutNextOid but instead of writing NEXTOID log record it
+ * writes a NEXT_RELFILENUMBER log record.
+ */
+void
+XLogPutNextRelFileNumber(RelFileNumber nextrelnumber)
+{
+ XLogRecPtr recptr;
+
+ XLogBeginInsert();
+ XLogRegisterData((char *) (&nextrelnumber), sizeof(RelFileNumber));
+ recptr = XLogInsert(RM_XLOG_ID, XLOG_NEXT_RELFILENUMBER);
+
+ /*
+ * Flush the xlog record to disk before returning, to protect against
+ * file system changes reaching the disk before the
+ * XLOG_NEXT_RELFILENUMBER record.
+ *
+ * This should not impact performance, because we WAL-log the next
+ * RelFileNumber only once per VAR_RFN_PREFETCH assignments.
+ */
+ XLogFlush(recptr);
+}
+
+/*
* Write an XLOG SWITCH record.
*
* Here we just blindly issue an XLogInsert request for the record.
@@ -7567,6 +7601,16 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
}
+ if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, XLogRecGetData(record), sizeof(RelFileNumber));
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = nextRelFileNumber;
+ ShmemVariableCache->relnumbercount = 0;
+ LWLockRelease(RelFileNumberGenLock);
+ }
else if (info == XLOG_CHECKPOINT_SHUTDOWN)
{
CheckPoint checkPoint;
@@ -7581,6 +7625,10 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->relnumbercount = 0;
+ LWLockRelease(RelFileNumberGenLock);
MultiXactSetNextMXact(checkPoint.nextMulti,
checkPoint.nextMultiOffset);
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index d1662f3..a2c57d0 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -572,9 +572,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
/*
* Don't try to prefetch anything in this database until
- * it has been created, or we might confuse the blocks of
- * different generations, if a database OID or
- * relfilenumber is reused. It's also more efficient than
+ * it has been created, because it's more efficient than
* discovering that relations don't exist on disk yet with
* ENOENT errors.
*/
@@ -610,7 +608,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u until %X/%X is replayed, which creates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, which creates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -633,7 +631,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, which truncates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, which truncates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -732,7 +730,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing all prefetch in relation %u/%u/%u until %X/%X is replayed, because the relation does not exist on disk",
+ "suppressing all prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, because the relation does not exist on disk",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -753,7 +751,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, because the relation is too small",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, because the relation is too small",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -792,7 +790,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* truncated beneath our feet?
*/
elog(ERROR,
- "could not prefetch relation %u/%u/%u block %u",
+ "could not prefetch relation %u/%u/" INT64_FORMAT " block %u",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -930,7 +928,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed),
filter->filter_from_block);
@@ -946,7 +944,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed));
#endif
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 5d6f1b5..1d95fc0 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2175,14 +2175,14 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
continue;
if (forknum != MAIN_FORKNUM)
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, fork %u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", fork %u, blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
forknum,
blk);
else
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
@@ -2378,7 +2378,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
if (memcmp(replay_image_masked, primary_image_masked, BLCKSZ) != 0)
{
elog(FATAL,
- "inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
+ "inconsistent page found, rel %u/%u/" INT64_FORMAT ", forknum %u, blkno %u",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forknum, blkno);
}
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 42a0f51..2f58e77 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -617,17 +617,17 @@ CreateFakeRelcacheEntry(RelFileLocator rlocator)
rel->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
/* We don't know the name of the relation; use relfilelocator instead */
- sprintf(RelationGetRelationName(rel), "%u", rlocator.relNumber);
+ sprintf(RelationGetRelationName(rel), INT64_FORMAT, rlocator.relNumber);
/*
* We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNumber may be
- * different from the relation's OID. It shouldn't really matter though.
+ * relation. Note that we set relId to FirstNormalObjectId, which is
+ * completely bogus; it shouldn't really matter, though.
* In recovery, we are running by ourselves and can't have any lock
* conflicts. While syncing, we already hold AccessExclusiveLock.
*/
rel->rd_lockInfo.lockRelId.dbId = rlocator.dbOid;
- rel->rd_lockInfo.lockRelId.relId = rlocator.relNumber;
+ rel->rd_lockInfo.lockRelId.relId = FirstNormalObjectId;
rel->rd_smgr = NULL;
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 2a33273..155400c 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -481,99 +481,6 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
}
/*
- * GetNewRelFileNumber
- * Generate a new relfilenumber that is unique within the
- * database of the given tablespace.
- *
- * If the relfilenumber will also be used as the relation's OID, pass the
- * opened pg_class catalog, and this routine will guarantee that the result
- * is also an unused OID within pg_class. If the result is to be used only
- * as a relfilenumber for an existing relation, pass NULL for pg_class.
- *
- * As with GetNewOidWithIndex(), there is some theoretical risk of a race
- * condition, but it doesn't seem worth worrying about.
- *
- * Note: we don't support using this in bootstrap mode. All relations
- * created by bootstrap have preassigned OIDs, so there's no need.
- */
-RelFileNumber
-GetNewRelFileNumber(Oid reltablespace, Relation pg_class, char relpersistence)
-{
- RelFileLocatorBackend rlocator;
- char *rpath;
- bool collides;
- BackendId backend;
-
- /*
- * If we ever get here during pg_upgrade, there's something wrong; all
- * relfilenumber assignments during a binary-upgrade run should be
- * determined by commands in the dump script.
- */
- Assert(!IsBinaryUpgrade);
-
- switch (relpersistence)
- {
- case RELPERSISTENCE_TEMP:
- backend = BackendIdForTempRelations();
- break;
- case RELPERSISTENCE_UNLOGGED:
- case RELPERSISTENCE_PERMANENT:
- backend = InvalidBackendId;
- break;
- default:
- elog(ERROR, "invalid relpersistence: %c", relpersistence);
- return InvalidOid; /* placate compiler */
- }
-
- /* This logic should match RelationInitPhysicalAddr */
- rlocator.locator.spcOid = reltablespace ? reltablespace : MyDatabaseTableSpace;
- rlocator.locator.dbOid = (rlocator.locator.spcOid == GLOBALTABLESPACE_OID) ? InvalidOid : MyDatabaseId;
-
- /*
- * The relpath will vary based on the backend ID, so we must initialize
- * that properly here to make sure that any collisions based on filename
- * are properly detected.
- */
- rlocator.backend = backend;
-
- do
- {
- CHECK_FOR_INTERRUPTS();
-
- /* Generate the OID */
- if (pg_class)
- rlocator.locator.relNumber = GetNewOidWithIndex(pg_class, ClassOidIndexId,
- Anum_pg_class_oid);
- else
- rlocator.locator.relNumber = GetNewObjectId();
-
- /* Check for existing file of same name */
- rpath = relpath(rlocator, MAIN_FORKNUM);
-
- if (access(rpath, F_OK) == 0)
- {
- /* definite collision */
- collides = true;
- }
- else
- {
- /*
- * Here we have a little bit of a dilemma: if errno is something
- * other than ENOENT, should we declare a collision and loop? In
- * practice it seems best to go ahead regardless of the errno. If
- * there is a colliding file we will get an smgr failure when we
- * attempt to create the new relation file.
- */
- collides = false;
- }
-
- pfree(rpath);
- } while (collides);
-
- return rlocator.locator.relNumber;
-}
-
-/*
* SQL callable interface for GetNewOidWithIndex(). Outside of initdb's
* direct insertions into catalog tables, and recovering from corruption, this
* should rarely be needed.
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index c69c923..02ed007 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -347,7 +347,7 @@ heap_create(const char *relname,
* with oid same as relid.
*/
if (!RelFileNumberIsValid(relfilenumber))
- relfilenumber = relid;
+ relfilenumber = GetNewRelFileNumber();
}
/*
@@ -900,7 +900,7 @@ InsertPgClassTuple(Relation pg_class_desc,
values[Anum_pg_class_reloftype - 1] = ObjectIdGetDatum(rd_rel->reloftype);
values[Anum_pg_class_relowner - 1] = ObjectIdGetDatum(rd_rel->relowner);
values[Anum_pg_class_relam - 1] = ObjectIdGetDatum(rd_rel->relam);
- values[Anum_pg_class_relfilenode - 1] = ObjectIdGetDatum(rd_rel->relfilenode);
+ values[Anum_pg_class_relfilenode - 1] = Int64GetDatum(rd_rel->relfilenode);
values[Anum_pg_class_reltablespace - 1] = ObjectIdGetDatum(rd_rel->reltablespace);
values[Anum_pg_class_relpages - 1] = Int32GetDatum(rd_rel->relpages);
values[Anum_pg_class_reltuples - 1] = Float4GetDatum(rd_rel->reltuples);
@@ -1231,8 +1231,8 @@ heap_create_with_catalog(const char *relname,
}
if (!OidIsValid(relid))
- relid = GetNewRelFileNumber(reltablespace, pg_class_desc,
- relpersistence);
+ relid = GetNewOidWithIndex(pg_class_desc, ClassOidIndexId,
+ Anum_pg_class_oid);
}
/*
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index f245df8..46b914b 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -937,8 +937,8 @@ index_create(Relation heapRelation,
}
else
{
- indexRelationId =
- GetNewRelFileNumber(tableSpaceId, pg_class, relpersistence);
+ indexRelationId = GetNewOidWithIndex(pg_class, ClassOidIndexId,
+ Anum_pg_class_oid);
}
}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index bf645b8..9270aac 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14371,11 +14371,13 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
}
/*
- * Relfilenumbers are not unique in databases across tablespaces, so we need
- * to allocate a new one in the new tablespace.
+ * Generate a new relfilenumber. Although relfilenumbers are now unique
+ * cluster-wide, we cannot reuse the old one: the old relation file is
+ * not unlinked until commit, so if the relation were moved back to the
+ * old tablespace within the same transaction, the new file would
+ * collide with the not-yet-unlinked old file.
*/
- newrelfilenumber = GetNewRelFileNumber(newTableSpace, NULL,
- rel->rd_rel->relpersistence);
+ newrelfilenumber = GetNewRelFileNumber();
/* Open old and new relation */
newrlocator = rel->rd_locator;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 3724d48..3f2618a 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2928,7 +2928,7 @@ _outIndexStmt(StringInfo str, const IndexStmt *node)
WRITE_NODE_FIELD(excludeOpNames);
WRITE_STRING_FIELD(idxcomment);
WRITE_OID_FIELD(indexOid);
- WRITE_OID_FIELD(oldNumber);
+ WRITE_UINT64_FIELD(oldNumber);
WRITE_UINT_FIELD(oldCreateSubid);
WRITE_UINT_FIELD(oldFirstRelfilenumberSubid);
WRITE_BOOL_FIELD(unique);
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index c5c6a2b..7029604 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -154,6 +154,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
break;
case XLOG_NOOP:
case XLOG_NEXTOID:
+ case XLOG_NEXT_RELFILENUMBER:
case XLOG_SWITCH:
case XLOG_BACKUP_END:
case XLOG_PARAMETER_CHANGE:
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index f8fb228..4366ae6 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -4869,7 +4869,7 @@ DisplayMapping(HTAB *tuplecid_data)
hash_seq_init(&hstat, tuplecid_data);
while ((ent = (ReorderBufferTupleCidEnt *) hash_seq_search(&hstat)) != NULL)
{
- elog(DEBUG3, "mapping: node: %u/%u/%u tid: %u/%u cmin: %u, cmax: %u",
+ elog(DEBUG3, "mapping: node: %u/%u/" INT64_FORMAT " tid: %u/%u cmin: %u, cmax: %u",
ent->key.rlocator.dbOid,
ent->key.rlocator.spcOid,
ent->key.rlocator.relNumber,
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index af4dab7..172225b 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -273,7 +273,7 @@ restart:
BlockNumber blknum;
BufferGetTag(buf, &rlocator, &forknum, &blknum);
- elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
+ elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/" INT64_FORMAT,
blknum, rlocator.spcOid, rlocator.dbOid, rlocator.relNumber);
/* make sure we hold an exclusive lock */
diff --git a/src/backend/storage/lmgr/lwlocknames.txt b/src/backend/storage/lmgr/lwlocknames.txt
index 6c7cf6c..b64dbe7 100644
--- a/src/backend/storage/lmgr/lwlocknames.txt
+++ b/src/backend/storage/lmgr/lwlocknames.txt
@@ -53,3 +53,4 @@ XactTruncationLock 44
# 45 was XactTruncationLock until removal of BackendRandomLock
WrapLimitsVacuumLock 46
NotifyQueueTailLock 47
+RelFileNumberGenLock 48
\ No newline at end of file
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index b21d8c3..5f6c12a 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -154,7 +154,7 @@ smgropen(RelFileLocator rlocator, BackendId backend)
/* First time through: initialize the hash table */
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileLocatorBackend);
+ ctl.keysize = SizeOfRelFileLocatorBackend;
ctl.entrysize = sizeof(SMgrRelationData);
SMgrRelationHash = hash_create("smgr relation table", 400,
&ctl, HASH_ELEM | HASH_BLOBS);
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index d8ae082..5bbd847 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -878,7 +878,7 @@ pg_relation_filenode(PG_FUNCTION_ARGS)
if (!RelFileNumberIsValid(result))
PG_RETURN_NULL();
- PG_RETURN_OID(result);
+ PG_RETURN_INT64(result);
}
/*
@@ -898,7 +898,7 @@ Datum
pg_filenode_relation(PG_FUNCTION_ARGS)
{
Oid reltablespace = PG_GETARG_OID(0);
- RelFileNumber relfilenumber = PG_GETARG_OID(1);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(1);
Oid heaprel;
/* test needed so RelidByRelfilenumber doesn't misbehave */
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 4408c00..f5b6d41 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -98,10 +98,11 @@ binary_upgrade_set_next_heap_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_heap_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
binary_upgrade_next_heap_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -120,10 +121,11 @@ binary_upgrade_set_next_index_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_index_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
binary_upgrade_next_index_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -142,10 +144,11 @@ binary_upgrade_set_next_toast_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_toast_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
binary_upgrade_next_toast_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index b80e2ec3..57d34cb 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -3630,7 +3630,7 @@ RelationBuildLocalRelation(const char *relname,
if (mapped_relation)
{
- rel->rd_rel->relfilenode = InvalidOid;
+ rel->rd_rel->relfilenode = InvalidRelFileNumber;
/* Add it to the active mapping information */
RelationMapUpdateMap(relid, relfilenumber, shared_relation, true);
}
@@ -3708,8 +3708,7 @@ RelationSetNewRelfilenumber(Relation relation, char persistence)
RelFileLocator newrlocator;
/* Allocate a new relfilenumber */
- newrelfilenumber = GetNewRelFileNumber(relation->rd_rel->reltablespace,
- NULL, persistence);
+ newrelfilenumber = GetNewRelFileNumber();
/*
* Get a writable copy of the pg_class tuple for the given relation.
diff --git a/src/backend/utils/cache/relfilenumbermap.c b/src/backend/utils/cache/relfilenumbermap.c
index 3dc45e9..a5ec78c 100644
--- a/src/backend/utils/cache/relfilenumbermap.c
+++ b/src/backend/utils/cache/relfilenumbermap.c
@@ -196,7 +196,7 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
/* set scan arguments */
skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
- skey[1].sk_argument = ObjectIdGetDatum(relfilenumber);
+ skey[1].sk_argument = Int64GetDatum(relfilenumber);
scandesc = systable_beginscan(relation,
ClassTblspcRelfilenodeIndexId,
@@ -213,7 +213,7 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
if (found)
elog(ERROR,
- "unexpected duplicate for tablespace %u, relfilenumber %u",
+ "unexpected duplicate for tablespace %u, relfilenumber " INT64_FORMAT,
reltablespace, relfilenumber);
found = true;
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index 781f8b8..3c1fef4 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -79,8 +79,8 @@ pg_control_system(PG_FUNCTION_ARGS)
Datum
pg_control_checkpoint(PG_FUNCTION_ARGS)
{
- Datum values[18];
- bool nulls[18];
+ Datum values[19];
+ bool nulls[19];
TupleDesc tupdesc;
HeapTuple htup;
ControlFileData *ControlFile;
@@ -129,6 +129,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
XIDOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 18, "checkpoint_time",
TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 19, "next_relfilenumber",
+ INT8OID, -1, 0);
tupdesc = BlessTupleDesc(tupdesc);
/* Read the control file. */
@@ -202,6 +204,9 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
values[17] = TimestampTzGetDatum(time_t_to_timestamptz(ControlFile->checkPointCopy.time));
nulls[17] = false;
+ values[18] = Int64GetDatum(ControlFile->checkPointCopy.nextRelFileNumber);
+ nulls[18] = false;
+
htup = heap_form_tuple(tupdesc, values, nulls);
PG_RETURN_DATUM(HeapTupleGetDatum(htup));
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 21dfe1b..65fc623 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -489,9 +489,9 @@ main(int argc, char *argv[])
mode = PG_MODE_ENABLE;
break;
case 'f':
- if (!option_parse_int(optarg, "-f/--filenode", 0,
- INT_MAX,
- NULL))
+ if (!option_parse_int64(optarg, "-f/--filenode", 0,
+ LLONG_MAX,
+ NULL))
exit(1);
only_filenode = pstrdup(optarg);
break;
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index c390ec5..f727078 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -250,6 +250,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
EpochFromFullTransactionId(ControlFile->checkPointCopy.nextXid),
XidFromFullTransactionId(ControlFile->checkPointCopy.nextXid));
+ printf(_("Latest checkpoint's NextRelFileNumber: " INT64_FORMAT "\n"),
+ ControlFile->checkPointCopy.nextRelFileNumber);
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 30b2f85..2d70833 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4837,16 +4837,16 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
relkind = *PQgetvalue(upgrade_res, 0, PQfnumber(upgrade_res, "relkind"));
- relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "relfilenode")));
+ relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "relfilenode")));
toast_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "reltoastrelid")));
- toast_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_relfilenode")));
+ toast_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_relfilenode")));
toast_index_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "indexrelid")));
- toast_index_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_index_relfilenode")));
+ toast_index_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_index_relfilenode")));
appendPQExpBufferStr(upgrade_buffer,
"\n-- For binary upgrade, must preserve pg_class oids and relfilenodes\n");
@@ -4864,7 +4864,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
*/
if (RelFileNumberIsValid(relfilenumber) && relkind != RELKIND_PARTITIONED_TABLE)
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
/*
@@ -4878,7 +4878,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_toast_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_relfilenumber);
/* every toast table has an index */
@@ -4886,7 +4886,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_index_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_index_relfilenumber);
}
@@ -4899,7 +4899,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
pg_class_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
}
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 269ed64..8be5e66 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -538,7 +538,7 @@ isRelDataFile(const char *path)
segNo = 0;
matched = false;
- nmatch = sscanf(path, "global/%u.%u", &rlocator.relNumber, &segNo);
+ nmatch = sscanf(path, "global/" INT64_FORMAT ".%u", &rlocator.relNumber, &segNo);
if (nmatch == 1 || nmatch == 2)
{
rlocator.spcOid = GLOBALTABLESPACE_OID;
@@ -547,7 +547,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "base/%u/%u.%u",
+ nmatch = sscanf(path, "base/%u/" INT64_FORMAT ".%u",
&rlocator.dbOid, &rlocator.relNumber, &segNo);
if (nmatch == 2 || nmatch == 3)
{
@@ -556,7 +556,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/%u.%u",
+ nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/" INT64_FORMAT ".%u",
&rlocator.spcOid, &rlocator.dbOid, &rlocator.relNumber,
&segNo);
if (nmatch == 3 || nmatch == 4)
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 5d30b87..ea62e7d 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -399,11 +399,11 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
i_reloid,
i_indtable,
i_toastheap,
- i_relfilenumber,
i_reltablespace;
- char query[QUERY_ALLOC];
- char *last_namespace = NULL,
- *last_tablespace = NULL;
+ int i_relfilenumber;   /* PQfnumber() result, i.e. a column number */
+ char query[QUERY_ALLOC];
+ char *last_namespace = NULL,
+ *last_tablespace = NULL;
query[0] = '\0'; /* initialize query string to empty */
@@ -527,7 +527,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
relname = PQgetvalue(res, relnum, i_relname);
curr->relname = pg_strdup(relname);
- curr->relfilenumber = atooid(PQgetvalue(res, relnum, i_relfilenumber));
+ curr->relfilenumber =
+ atorelnumber(PQgetvalue(res, relnum, i_relfilenumber));
curr->tblsp_alloc = false;
/* Is the tablespace oid non-default? */
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index 265d829..4c4f03a 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -15,10 +15,8 @@
* oids are the same between old and new clusters. This is important
* because toast oids are stored as toast pointers in user tables.
*
- * While pg_class.oid and pg_class.relfilenode are initially the same in a
- * cluster, they can diverge due to CLUSTER, REINDEX, or VACUUM FULL. We
- * control assignments of pg_class.relfilenode because we want the filenames
- * to match between the old and new cluster.
+ * We control assignments of pg_class.relfilenode because we want the
+ * filenames to match between the old and new cluster.
*
* We control assignment of pg_tablespace.oid because we want the oid to match
* between the old and new cluster.
diff --git a/src/bin/pg_upgrade/relfilenumber.c b/src/bin/pg_upgrade/relfilenumber.c
index b3ad820..50e94df 100644
--- a/src/bin/pg_upgrade/relfilenumber.c
+++ b/src/bin/pg_upgrade/relfilenumber.c
@@ -190,14 +190,14 @@ transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_fro
else
snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
- snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
+ snprintf(old_file, sizeof(old_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->old_tablespace,
map->old_tablespace_suffix,
map->db_oid,
map->relfilenumber,
type_suffix,
extent_suffix);
- snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
+ snprintf(new_file, sizeof(new_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->new_tablespace,
map->new_tablespace_suffix,
map->db_oid,
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 0fdde9d..e5b0b50 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -884,7 +884,7 @@ main(int argc, char **argv)
}
break;
case 'R':
- if (sscanf(optarg, "%u/%u/%u",
+ if (sscanf(optarg, "%u/%u/" INT64_FORMAT,
&config.filter_by_relation.spcOid,
&config.filter_by_relation.dbOid,
&config.filter_by_relation.relNumber) != 3 ||
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 1b6b620..0774e3f 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -149,10 +149,10 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
Assert(dbOid == 0);
Assert(backendId == InvalidBackendId);
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("global/%u_%s",
+ path = psprintf("global/" INT64_FORMAT "_%s",
relNumber, forkNames[forkNumber]);
else
- path = psprintf("global/%u", relNumber);
+ path = psprintf("global/" INT64_FORMAT, relNumber);
}
else if (spcOid == DEFAULTTABLESPACE_OID)
{
@@ -160,21 +160,21 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/%u_%s",
+ path = psprintf("base/%u/" INT64_FORMAT "_%s",
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/%u",
+ path = psprintf("base/%u/" INT64_FORMAT,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/t%d_%u_%s",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT "_%s",
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/t%d_%u",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT,
dbOid, backendId, relNumber);
}
}
@@ -184,24 +184,24 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber);
}
diff --git a/src/fe_utils/option_utils.c b/src/fe_utils/option_utils.c
index abea881..2cb3370 100644
--- a/src/fe_utils/option_utils.c
+++ b/src/fe_utils/option_utils.c
@@ -82,3 +82,45 @@ option_parse_int(const char *optarg, const char *optname,
*result = val;
return true;
}
+
+/*
+ * option_parse_int64
+ *
+ * Same as option_parse_int(), but parses an int64.
+ */
+bool
+option_parse_int64(const char *optarg, const char *optname,
+ int64 min_range, int64 max_range,
+ int64 *result)
+{
+ char *endptr;
+ int64 val;
+
+ errno = 0;
+ val = strtoi64(optarg, &endptr, 10);
+
+ /*
+ * Skip any trailing whitespace; if anything but whitespace remains before
+ * the terminating character, fail.
+ */
+ while (*endptr != '\0' && isspace((unsigned char) *endptr))
+ endptr++;
+
+ if (*endptr != '\0')
+ {
+ pg_log_error("invalid value \"%s\" for option %s",
+ optarg, optname);
+ return false;
+ }
+
+ if (errno == ERANGE || val < min_range || val > max_range)
+ {
+ pg_log_error("%s must be in range " INT64_FORMAT ".." INT64_FORMAT,
+ optname, min_range, max_range);
+ return false;
+ }
+
+ if (result)
+ *result = val;
+ return true;
+}
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 775471d..37afdd1 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -213,6 +213,9 @@ typedef struct VariableCacheData
*/
Oid nextOid; /* next OID to assign */
uint32 oidCount; /* OIDs available before must do XLOG work */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber to assign */
+ uint32 relnumbercount; /* relfilenumbers available before must do
+ XLOG work */
/*
* These fields are protected by XidGenLock.
@@ -293,6 +296,8 @@ extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
extern bool ForceTransactionIdLimitUpdate(void);
extern Oid GetNewObjectId(void);
+extern RelFileNumber GetNewRelFileNumber(void);
+extern void SetNextRelFileNumber(RelFileNumber relnumber);
extern void StopGeneratingPinnedObjectIds(void);
#ifdef USE_ASSERT_CHECKING
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index cd674c3..4cae54b 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -234,6 +234,7 @@ extern void CreateCheckPoint(int flags);
extern bool CreateRestartPoint(int flags);
extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN);
extern void XLogPutNextOid(Oid nextOid);
+extern void XLogPutNextRelFileNumber(RelFileNumber nextrelnumber);
extern XLogRecPtr XLogRestorePoint(const char *rpName);
extern void UpdateFullPageWrites(void);
extern void GetFullPageWriteInfo(XLogRecPtr *RedoRecPtr_p, bool *doPageWrites_p);
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index 66900f1..b452530 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -38,8 +38,5 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
-extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
- Relation pg_class,
- char relpersistence);
#endif /* CATALOG_H */
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index e1f4eef..1cf039c 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -31,6 +31,10 @@
*/
CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,RelationRelation_Rowtype_Id) BKI_SCHEMA_MACRO
{
+ /* identifier of physical storage file */
+ /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
+ int64 relfilenode BKI_DEFAULT(0);
+
/* oid */
Oid oid;
@@ -52,10 +56,6 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* access method; 0 if not a table / index */
Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
- /* identifier of physical storage file */
- /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
- Oid relfilenode BKI_DEFAULT(0);
-
/* identifier of table space for relation (0 means default for database) */
Oid reltablespace BKI_DEFAULT(0) BKI_LOOKUP_OPT(pg_tablespace);
@@ -154,7 +154,7 @@ typedef FormData_pg_class *Form_pg_class;
DECLARE_UNIQUE_INDEX_PKEY(pg_class_oid_index, 2662, ClassOidIndexId, on pg_class using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_class_relname_nsp_index, 2663, ClassNameNspIndexId, on pg_class using btree(relname name_ops, relnamespace oid_ops));
-DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode oid_ops));
+DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode int8_ops));
#ifdef EXPOSE_TO_CLIENT_CODE
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 06368e2..d5e6172 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -41,6 +41,7 @@ typedef struct CheckPoint
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
FullTransactionId nextXid; /* next free transaction ID */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
@@ -78,6 +79,7 @@ typedef struct CheckPoint
#define XLOG_FPI 0xB0
/* 0xC0 is used in Postgres 9.5-11 */
#define XLOG_OVERWRITE_CONTRECORD 0xD0
+#define XLOG_NEXT_RELFILENUMBER 0xE0
/*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index a77b293..68944fd 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7321,11 +7321,11 @@
proname => 'pg_indexes_size', provolatile => 'v', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_indexes_size' },
{ oid => '2999', descr => 'filenode identifier of relation',
- proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'oid',
+ proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_relation_filenode' },
{ oid => '3454', descr => 'relation OID for filenode and tablespace',
proname => 'pg_filenode_relation', provolatile => 's',
- prorettype => 'regclass', proargtypes => 'oid oid',
+ prorettype => 'regclass', proargtypes => 'oid int8',
prosrc => 'pg_filenode_relation' },
{ oid => '3034', descr => 'file path of relation',
proname => 'pg_relation_filepath', provolatile => 's', prorettype => 'text',
@@ -11191,15 +11191,15 @@
prosrc => 'binary_upgrade_set_missing_value' },
{ oid => '4545', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_heap_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_heap_relfilenode' },
{ oid => '4546', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_index_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_index_relfilenode' },
{ oid => '4547', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_toast_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_toast_relfilenode' },
{ oid => '4548', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_pg_tablespace_oid', provolatile => 'v',
diff --git a/src/include/fe_utils/option_utils.h b/src/include/fe_utils/option_utils.h
index 03c09fd..8c0e818 100644
--- a/src/include/fe_utils/option_utils.h
+++ b/src/include/fe_utils/option_utils.h
@@ -22,5 +22,8 @@ extern void handle_help_version_opts(int argc, char *argv[],
extern bool option_parse_int(const char *optarg, const char *optname,
int min_range, int max_range,
int *result);
+extern bool option_parse_int64(const char *optarg, const char *optname,
+ int64 min_range, int64 max_range,
+ int64 *result);
#endif /* OPTION_UTILS_H */
diff --git a/src/include/postgres_ext.h b/src/include/postgres_ext.h
index d8af68b..ecdfc90 100644
--- a/src/include/postgres_ext.h
+++ b/src/include/postgres_ext.h
@@ -48,11 +48,14 @@ typedef PG_INT64_TYPE pg_int64;
/*
* RelFileNumber data type identifies the specific relation file name.
+ * RelFileNumber is unique within a cluster.
*/
-typedef Oid RelFileNumber;
-#define InvalidRelFileNumber ((RelFileNumber) InvalidOid)
+typedef pg_int64 RelFileNumber;
+#define InvalidRelFileNumber ((RelFileNumber) 0)
+#define FirstNormalRelFileNumber ((RelFileNumber) 100000)
#define RelFileNumberIsValid(relnumber) \
((bool) ((relnumber) != InvalidRelFileNumber))
+#define atorelnumber(x) ((RelFileNumber) strtou64((x), NULL, 10))
/*
* Identifiers of error message fields. Kept here to keep common
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index b1b8061..bd74219 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -92,16 +92,19 @@ typedef struct buftag
{
Oid spcOid; /* tablespace oid. */
Oid dbOid; /* database oid. */
- RelFileNumber relNumber; /* relation file number. */
- ForkNumber forkNum;
+ uint32 relNumber_low; /* relfilenumber 32 lower bits */
+ uint32 relNumber_hi:24; /* relfilenumber 24 high bits */
+ uint32 forkNum:8; /* fork number */
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
-#define BufTagGetFileNumber(a) ((a).relNumber)
+#define BufTagGetFileNumber(a) \
+ ((((uint64) (a).relNumber_hi << 32) | ((uint32) (a).relNumber_low)))
#define BufTagSetFileNumber(a, relnumber) \
( \
- (a).relNumber = (relnumber) \
+ (a).relNumber_hi = (relnumber) >> 32, \
+ (a).relNumber_low = (relnumber) & 0xffffffff \
)
#define CLEAR_BUFFERTAG(a) \
@@ -126,7 +129,8 @@ typedef struct buftag
( \
(a).spcOid == (b).spcOid && \
(a).dbOid == (b).dbOid && \
- (a).relNumber == (b).relNumber && \
+ (a).relNumber_low == (b).relNumber_low && \
+ (a).relNumber_hi == (b).relNumber_hi && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
)
@@ -135,14 +139,14 @@ typedef struct buftag
do { \
(locator).spcOid = (a).spcOid; \
(locator).dbOid = (a).dbOid; \
- (locator).relNumber = (a).relNumber; \
+ (locator).relNumber = BufTagGetFileNumber(a); \
} while(0)
#define BuffTagRelFileLocatorEquals(a, locator) \
( \
(a).spcOid == (locator).spcOid && \
(a).dbOid == (locator).dbOid && \
- (a).relNumber == (locator).relNumber \
+ BufTagGetFileNumber(a) == (locator).relNumber \
)
/*
diff --git a/src/include/storage/relfilelocator.h b/src/include/storage/relfilelocator.h
index 7211fe7..6046506 100644
--- a/src/include/storage/relfilelocator.h
+++ b/src/include/storage/relfilelocator.h
@@ -34,8 +34,7 @@
* relNumber identifies the specific relation. relNumber corresponds to
* pg_class.relfilenode (NOT pg_class.oid, because we need to be able
* to assign new physical files to relations in some situations).
- * Notice that relNumber is only unique within a database in a particular
- * tablespace.
+ * Notice that relNumber is unique within a cluster.
*
* Note: spcOid must be GLOBALTABLESPACE_OID if and only if dbOid is
* zero. We support shared relations only in the "global" tablespace.
@@ -75,6 +74,15 @@ typedef struct RelFileLocatorBackend
BackendId backend;
} RelFileLocatorBackend;
+#define SizeOfRelFileLocatorBackend \
+ (offsetof(RelFileLocatorBackend, backend) + sizeof(BackendId))
+
+/*
+ * Max value of the relfilenumber. RelFileNumber is 56 bits wide; for
+ * more details, refer to the comments atop BufferTag.
+ */
+#define MAX_RELFILENUMBER INT64CONST(0x00FFFFFFFFFFFFFF)
+
#define RelFileLocatorBackendIsTemp(rlocator) \
((rlocator).backend != InvalidBackendId)
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index 5ede56d..6230fcb 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -2164,9 +2164,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2175,10 +2174,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+---------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | t | own | child 0 index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | t | own | child 1 index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | t | orig | child 0 index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | t | orig | child 1 index
at_partitioned_id_name_key | t | none | parent index
(6 rows)
@@ -2198,9 +2197,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2209,10 +2207,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+--------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | f | own | parent index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | f | own | parent index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | f | new | parent index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | f | new | parent index
at_partitioned_id_name_key | f | none | parent index
(6 rows)
@@ -2556,7 +2554,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql
index 52001e3..4190b12 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -1478,9 +1478,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1499,9 +1498,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1638,7 +1636,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
--
1.8.3.1
v4-0004-Don-t-delay-removing-Tombstone-file-until-next-ch.patch (text/x-patch; charset=US-ASCII)
From f6e8e0e7412198b02671e67d1859a7448fe83f38 Mon Sep 17 00:00:00 2001
From: dilip kumar <dilipbalaut@localhost.localdomain>
Date: Wed, 29 Jun 2022 13:24:32 +0530
Subject: [PATCH v4 4/4] Don't delay removing Tombstone file until next
checkpoint
Currently, we cannot remove an unused relfilenode until the next
checkpoint: if we removed it immediately, Oid wraparound could cause
the same relfilenode to be reused for two different relations within
a single checkpoint cycle.
The previous patches in this set widened relfilenode to 56 bits,
eliminating the wraparound risk, so we no longer need to wait for
the next checkpoint to remove an unused relation file; we can clean
it up at commit.
---
src/backend/access/transam/xlog.c | 5 --
src/backend/storage/smgr/md.c | 92 ++++++++--------------------------
src/backend/storage/sync/sync.c | 101 --------------------------------------
src/include/storage/sync.h | 2 -
4 files changed, 21 insertions(+), 179 deletions(-)
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 302da4a..50ac3ea 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6644,11 +6644,6 @@ CreateCheckPoint(int flags)
END_CRIT_SECTION();
/*
- * Let smgr do post-checkpoint cleanup (eg, deleting old files).
- */
- SyncPostCheckpoint();
-
- /*
* Update the average distance between checkpoints if the prior checkpoint
* exists.
*/
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 3998296..bb27516 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -126,8 +126,6 @@ static void mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forkNum,
static MdfdVec *mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior);
static void register_dirty_segment(SMgrRelation reln, ForkNumber forknum,
MdfdVec *seg);
-static void register_unlink_segment(RelFileLocatorBackend rlocator, ForkNumber forknum,
- BlockNumber segno);
static void register_forget_request(RelFileLocatorBackend rlocator, ForkNumber forknum,
BlockNumber segno);
static void _fdvec_resize(SMgrRelation reln,
@@ -240,34 +238,14 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* forkNum can be a fork number to delete a specific fork, or InvalidForkNumber
* to delete all forks.
*
- * For regular relations, we don't unlink the first segment file of the rel,
- * but just truncate it to zero length, and record a request to unlink it after
- * the next checkpoint. Additional segments can be unlinked immediately,
- * however. Leaving the empty file in place prevents that relfilenumber
- * from being reused. The scenario this protects us from is:
- * 1. We delete a relation (and commit, and actually remove its file).
- * 2. We create a new relation, which by chance gets the same relfilenumber as
- * the just-deleted one (OIDs must've wrapped around for that to happen).
- * 3. We crash before another checkpoint occurs.
- * During replay, we would delete the file and then recreate it, which is fine
- * if the contents of the file were repopulated by subsequent WAL entries.
- * But if we didn't WAL-log insertions, but instead relied on fsyncing the
- * file after populating it (as we do at wal_level=minimal), the contents of
- * the file would be lost forever. By leaving the empty file until after the
- * next checkpoint, we prevent reassignment of the relfilenumber until it's
- * safe, because relfilenumber assignment skips over any existing file.
- *
- * We do not need to go through this dance for temp relations, though, because
- * we never make WAL entries for temp rels, and so a temp rel poses no threat
- * to the health of a regular rel that has taken over its relfilenumber.
- * The fact that temp rels and regular rels have different file naming
- * patterns provides additional safety.
+ * We do not carefully track whether other forks have been created or not, but
+ * just attempt to unlink them unconditionally; so we should never complain
+ * about ENOENT.
*
- * All the above applies only to the relation's main fork; other forks can
- * just be removed immediately, since they are not needed to prevent the
- * relfilenumber from being recycled. Also, we do not carefully
- * track whether other forks have been created or not, but just attempt to
- * unlink them unconditionally; so we should never complain about ENOENT.
+ * Note that we can now immediately unlink the first segment of a regular
+ * relation's main fork as well, because the relfilenumber is 56 bits wide
+ * (since PG 16) and is not reused, so it cannot be recycled for some
+ * unrelated relation file.
*
* If isRedo is true, it's unsurprising for the relation to be already gone.
* Also, we should remove the file immediately instead of queuing a request
@@ -325,36 +303,25 @@ mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forkNum, bool isRedo)
/*
* Delete or truncate the first segment.
*/
- if (isRedo || forkNum != MAIN_FORKNUM || RelFileLocatorBackendIsTemp(rlocator))
+ if (!RelFileLocatorBackendIsTemp(rlocator))
{
- if (!RelFileLocatorBackendIsTemp(rlocator))
- {
- /* Prevent other backends' fds from holding on to the disk space */
- ret = do_truncate(path);
-
- /* Forget any pending sync requests for the first segment */
- register_forget_request(rlocator, forkNum, 0 /* first seg */ );
- }
- else
- ret = 0;
+ /* Prevent other backends' fds from holding on to the disk space */
+ ret = do_truncate(path);
- /* Next unlink the file, unless it was already found to be missing */
- if (ret == 0 || errno != ENOENT)
- {
- ret = unlink(path);
- if (ret < 0 && errno != ENOENT)
- ereport(WARNING,
- (errcode_for_file_access(),
- errmsg("could not remove file \"%s\": %m", path)));
- }
+ /* Forget any pending sync requests for the first segment */
+ register_forget_request(rlocator, forkNum, 0 /* first seg */ );
}
else
- {
- /* Prevent other backends' fds from holding on to the disk space */
- ret = do_truncate(path);
+ ret = 0;
- /* Register request to unlink first segment later */
- register_unlink_segment(rlocator, forkNum, 0 /* first seg */ );
+ /* Next unlink the file, unless it was already found to be missing */
+ if (ret == 0 || errno != ENOENT)
+ {
+ ret = unlink(path);
+ if (ret < 0 && errno != ENOENT)
+ ereport(WARNING,
+ (errcode_for_file_access(),
+ errmsg("could not remove file \"%s\": %m", path)));
}
/*
@@ -1002,23 +969,6 @@ register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
}
/*
- * register_unlink_segment() -- Schedule a file to be deleted after next checkpoint
- */
-static void
-register_unlink_segment(RelFileLocatorBackend rlocator, ForkNumber forknum,
- BlockNumber segno)
-{
- FileTag tag;
-
- INIT_MD_FILETAG(tag, rlocator.locator, forknum, segno);
-
- /* Should never be used with temp relations */
- Assert(!RelFileLocatorBackendIsTemp(rlocator));
-
- RegisterSyncRequest(&tag, SYNC_UNLINK_REQUEST, true /* retryOnError */ );
-}
-
-/*
* register_forget_request() -- forget any fsyncs for a relation fork's segment
*/
static void
diff --git a/src/backend/storage/sync/sync.c b/src/backend/storage/sync/sync.c
index e1fb631..9a4a31c 100644
--- a/src/backend/storage/sync/sync.c
+++ b/src/backend/storage/sync/sync.c
@@ -201,92 +201,6 @@ SyncPreCheckpoint(void)
}
/*
- * SyncPostCheckpoint() -- Do post-checkpoint work
- *
- * Remove any lingering files that can now be safely removed.
- */
-void
-SyncPostCheckpoint(void)
-{
- int absorb_counter;
- ListCell *lc;
-
- absorb_counter = UNLINKS_PER_ABSORB;
- foreach(lc, pendingUnlinks)
- {
- PendingUnlinkEntry *entry = (PendingUnlinkEntry *) lfirst(lc);
- char path[MAXPGPATH];
-
- /* Skip over any canceled entries */
- if (entry->canceled)
- continue;
-
- /*
- * New entries are appended to the end, so if the entry is new we've
- * reached the end of old entries.
- *
- * Note: if just the right number of consecutive checkpoints fail, we
- * could be fooled here by cycle_ctr wraparound. However, the only
- * consequence is that we'd delay unlinking for one more checkpoint,
- * which is perfectly tolerable.
- */
- if (entry->cycle_ctr == checkpoint_cycle_ctr)
- break;
-
- /* Unlink the file */
- if (syncsw[entry->tag.handler].sync_unlinkfiletag(&entry->tag,
- path) < 0)
- {
- /*
- * There's a race condition, when the database is dropped at the
- * same time that we process the pending unlink requests. If the
- * DROP DATABASE deletes the file before we do, we will get ENOENT
- * here. rmtree() also has to ignore ENOENT errors, to deal with
- * the possibility that we delete the file first.
- */
- if (errno != ENOENT)
- ereport(WARNING,
- (errcode_for_file_access(),
- errmsg("could not remove file \"%s\": %m", path)));
- }
-
- /* Mark the list entry as canceled, just in case */
- entry->canceled = true;
-
- /*
- * As in ProcessSyncRequests, we don't want to stop absorbing fsync
- * requests for a long time when there are many deletions to be done.
- * We can safely call AbsorbSyncRequests() at this point in the loop.
- */
- if (--absorb_counter <= 0)
- {
- AbsorbSyncRequests();
- absorb_counter = UNLINKS_PER_ABSORB;
- }
- }
-
- /*
- * If we reached the end of the list, we can just remove the whole list
- * (remembering to pfree all the PendingUnlinkEntry objects). Otherwise,
- * we must keep the entries at or after "lc".
- */
- if (lc == NULL)
- {
- list_free_deep(pendingUnlinks);
- pendingUnlinks = NIL;
- }
- else
- {
- int ntodelete = list_cell_number(pendingUnlinks, lc);
-
- for (int i = 0; i < ntodelete; i++)
- pfree(list_nth(pendingUnlinks, i));
-
- pendingUnlinks = list_delete_first_n(pendingUnlinks, ntodelete);
- }
-}
-
-/*
* ProcessSyncRequests() -- Process queued fsync requests.
*/
void
@@ -532,21 +446,6 @@ RememberSyncRequest(const FileTag *ftag, SyncRequestType type)
entry->canceled = true;
}
}
- else if (type == SYNC_UNLINK_REQUEST)
- {
- /* Unlink request: put it in the linked list */
- MemoryContext oldcxt = MemoryContextSwitchTo(pendingOpsCxt);
- PendingUnlinkEntry *entry;
-
- entry = palloc(sizeof(PendingUnlinkEntry));
- entry->tag = *ftag;
- entry->cycle_ctr = checkpoint_cycle_ctr;
- entry->canceled = false;
-
- pendingUnlinks = lappend(pendingUnlinks, entry);
-
- MemoryContextSwitchTo(oldcxt);
- }
else
{
/* Normal case: enter a request to fsync this segment */
diff --git a/src/include/storage/sync.h b/src/include/storage/sync.h
index 049af87..2c0b812 100644
--- a/src/include/storage/sync.h
+++ b/src/include/storage/sync.h
@@ -23,7 +23,6 @@
typedef enum SyncRequestType
{
SYNC_REQUEST, /* schedule a call of sync function */
- SYNC_UNLINK_REQUEST, /* schedule a call of unlink function */
SYNC_FORGET_REQUEST, /* forget all calls for a tag */
SYNC_FILTER_REQUEST /* forget all calls satisfying match fn */
} SyncRequestType;
@@ -57,7 +56,6 @@ typedef struct FileTag
extern void InitSync(void);
extern void SyncPreCheckpoint(void);
-extern void SyncPostCheckpoint(void);
extern void ProcessSyncRequests(void);
extern void RememberSyncRequest(const FileTag *ftag, SyncRequestType type);
extern bool RegisterSyncRequest(const FileTag *ftag, SyncRequestType type,
--
1.8.3.1
v4-0002-Preliminary-refactoring-for-supporting-larger-rel.patch (text/x-patch; charset=US-ASCII)
From f07ca9ef19e64922c6ee410707e93773d1a01d7c Mon Sep 17 00:00:00 2001
From: dilip kumar <dilipbalaut@localhost.localdomain>
Date: Sat, 25 Jun 2022 10:43:12 +0530
Subject: [PATCH v4 2/4] Preliminary refactoring for supporting larger
relfilenumber
Currently, relfilenumber is of type Oid, so it can wrap around. As
part of the larger patch set we make it 64 bits to avoid wraparound,
which also simplifies a couple of other things, as explained in the
next patches.
This is a preliminary refactoring patch toward that goal: in
BufferTag, instead of embedding a RelFileLocator, we store the
tablespace Oid, database Oid, and relfilenumber directly, so that
once relNumber in RelFileLocator becomes 64 bits, the buffer tag's
alignment padding will not change.
---
contrib/pg_buffercache/pg_buffercache_pages.c | 6 +-
contrib/pg_prewarm/autoprewarm.c | 7 +-
src/backend/storage/buffer/bufmgr.c | 113 ++++++++++++++++++--------
src/backend/storage/buffer/localbuf.c | 22 +++--
src/include/storage/buf_internals.h | 43 ++++++++--
5 files changed, 137 insertions(+), 54 deletions(-)
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 713f52a..abc8813 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -153,9 +153,9 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenumber = bufHdr->tag.rlocator.relNumber;
- fctx->record[i].reltablespace = bufHdr->tag.rlocator.spcOid;
- fctx->record[i].reldatabase = bufHdr->tag.rlocator.dbOid;
+ fctx->record[i].relfilenumber = BufTagGetFileNumber(bufHdr->tag);
+ fctx->record[i].reltablespace = bufHdr->tag.spcOid;
+ fctx->record[i].reldatabase = bufHdr->tag.dbOid;
fctx->record[i].forknum = bufHdr->tag.forkNum;
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index 7f1d55c..ca80d5a 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -631,9 +631,10 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
if (buf_state & BM_TAG_VALID &&
((buf_state & BM_PERMANENT) || dump_unlogged))
{
- block_info_array[num_blocks].database = bufHdr->tag.rlocator.dbOid;
- block_info_array[num_blocks].tablespace = bufHdr->tag.rlocator.spcOid;
- block_info_array[num_blocks].filenumber = bufHdr->tag.rlocator.relNumber;
+ block_info_array[num_blocks].database = bufHdr->tag.dbOid;
+ block_info_array[num_blocks].tablespace = bufHdr->tag.spcOid;
+ block_info_array[num_blocks].filenumber =
+ BufTagGetFileNumber(bufHdr->tag);
block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 7071ff6..d34fff3 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1647,7 +1647,7 @@ ReleaseAndReadBuffer(Buffer buffer,
{
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
+ BuffTagRelFileLocatorEquals(bufHdr->tag, relation->rd_locator) &&
bufHdr->tag.forkNum == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
@@ -1658,7 +1658,7 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetBufferDescriptor(buffer - 1);
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
+ BuffTagRelFileLocatorEquals(bufHdr->tag, relation->rd_locator) &&
bufHdr->tag.forkNum == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
@@ -2000,8 +2000,8 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
- item->tsId = bufHdr->tag.rlocator.spcOid;
- item->relNumber = bufHdr->tag.rlocator.relNumber;
+ item->tsId = bufHdr->tag.spcOid;
+ item->relNumber = BufTagGetFileNumber(bufHdr->tag);
item->forkNum = bufHdr->tag.forkNum;
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2692,6 +2692,7 @@ PrintBufferLeakWarning(Buffer buffer)
char *path;
BackendId backend;
uint32 buf_state;
+ RelFileLocator rlocator;
Assert(BufferIsValid(buffer));
if (BufferIsLocal(buffer))
@@ -2707,8 +2708,10 @@ PrintBufferLeakWarning(Buffer buffer)
backend = InvalidBackendId;
}
+ BuffTagCopyRelFileLocator(buf->tag, rlocator);
+
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rlocator, backend, buf->tag.forkNum);
+ path = relpathbackend(rlocator, backend, buf->tag.forkNum);
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2787,7 +2790,7 @@ BufferGetTag(Buffer buffer, RelFileLocator *rlocator, ForkNumber *forknum,
bufHdr = GetBufferDescriptor(buffer - 1);
/* pinned, so OK to read tag without spinlock */
- *rlocator = bufHdr->tag.rlocator;
+ BuffTagCopyRelFileLocator(bufHdr->tag, *rlocator);
*forknum = bufHdr->tag.forkNum;
*blknum = bufHdr->tag.blockNum;
}
@@ -2838,7 +2841,12 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rlocator, InvalidBackendId);
+ {
+ RelFileLocator rlocator;
+
+ BuffTagCopyRelFileLocator(buf->tag, rlocator);
+ reln = smgropen(rlocator, InvalidBackendId);
+ }
TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
buf->tag.blockNum,
@@ -3141,14 +3149,14 @@ DropRelFileLocatorBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
* We could check forkNum and blockNum as well as the rlocator, but the
* incremental win from doing so seems small.
*/
- if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator))
+ if (!BuffTagRelFileLocatorEquals(bufHdr->tag, rlocator.locator))
continue;
buf_state = LockBufHdr(bufHdr);
for (j = 0; j < nforks; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator) &&
+ if (BuffTagRelFileLocatorEquals(bufHdr->tag, rlocator.locator) &&
bufHdr->tag.forkNum == forkNum[j] &&
bufHdr->tag.blockNum >= firstDelBlock[j])
{
@@ -3301,7 +3309,7 @@ DropRelFileLocatorsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
for (j = 0; j < n; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, locators[j]))
+ if (BuffTagRelFileLocatorEquals(bufHdr->tag, locators[j]))
{
rlocator = &locators[j];
break;
@@ -3310,7 +3318,10 @@ DropRelFileLocatorsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
}
else
{
- rlocator = bsearch((const void *) &(bufHdr->tag.rlocator),
+ RelFileLocator locator;
+
+ BuffTagCopyRelFileLocator(bufHdr->tag, locator);
+ rlocator = bsearch((const void *) &(locator),
locators, n, sizeof(RelFileLocator),
rlocator_comparator);
}
@@ -3320,7 +3331,7 @@ DropRelFileLocatorsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, (*rlocator)))
+ if (BuffTagRelFileLocatorEquals(bufHdr->tag, (*rlocator)))
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3380,7 +3391,7 @@ FindAndDropRelFileLocatorBuffers(RelFileLocator rlocator, ForkNumber forkNum,
*/
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
+ if (BuffTagRelFileLocatorEquals(bufHdr->tag, rlocator) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -3419,11 +3430,11 @@ DropDatabaseBuffers(Oid dbid)
* As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rlocator.dbOid != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rlocator.dbOid == dbid)
+ if (bufHdr->tag.dbOid == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3447,13 +3458,16 @@ PrintBufferDescs(void)
{
BufferDesc *buf = GetBufferDescriptor(i);
Buffer b = BufferDescriptorGetBuffer(buf);
+ RelFileLocator rlocator;
+
+ BuffTagCopyRelFileLocator(buf->tag, rlocator);
/* theoretically we should lock the bufhdr here */
elog(LOG,
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathbackend(buf->tag.rlocator, InvalidBackendId, buf->tag.forkNum),
+ relpathbackend(rlocator, InvalidBackendId, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3473,12 +3487,16 @@ PrintPinnedBufs(void)
if (GetPrivateRefCount(b) > 0)
{
+ RelFileLocator rlocator;
+
+ BuffTagCopyRelFileLocator(buf->tag, rlocator);
+
/* theoretically we should lock the bufhdr here */
elog(LOG,
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathperm(buf->tag.rlocator, buf->tag.forkNum),
+ relpathperm(rlocator, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3517,7 +3535,7 @@ FlushRelationBuffers(Relation rel)
uint32 buf_state;
bufHdr = GetLocalBufferDescriptor(i);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
+ if (BuffTagRelFileLocatorEquals(bufHdr->tag, rel->rd_locator) &&
((buf_state = pg_atomic_read_u32(&bufHdr->state)) &
(BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
@@ -3564,13 +3582,13 @@ FlushRelationBuffers(Relation rel)
* As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator))
+ if (!BuffTagRelFileLocatorEquals(bufHdr->tag, rel->rd_locator))
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
+ if (BuffTagRelFileLocatorEquals(bufHdr->tag, rel->rd_locator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3644,7 +3662,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
for (j = 0; j < nrels; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, srels[j].rlocator))
+ if (BuffTagRelFileLocatorEquals(bufHdr->tag, srels[j].rlocator))
{
srelent = &srels[j];
break;
@@ -3653,7 +3671,10 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
}
else
{
- srelent = bsearch((const void *) &(bufHdr->tag.rlocator),
+ RelFileLocator rlocator;
+
+ BuffTagCopyRelFileLocator(bufHdr->tag, rlocator);
+ srelent = bsearch((const void *) &(rlocator),
srels, nrels, sizeof(SMgrSortArray),
rlocator_comparator);
}
@@ -3665,7 +3686,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, srelent->rlocator) &&
+ if (BuffTagRelFileLocatorEquals(bufHdr->tag, srelent->rlocator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3867,13 +3888,13 @@ FlushDatabaseBuffers(Oid dbid)
* As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rlocator.dbOid != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rlocator.dbOid == dbid &&
+ if (bufHdr->tag.dbOid == dbid &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -4033,6 +4054,10 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
if (XLogHintBitIsNeeded() &&
(pg_atomic_read_u32(&bufHdr->state) & BM_PERMANENT))
{
+ RelFileLocator rlocator;
+
+ BuffTagCopyRelFileLocator(bufHdr->tag, rlocator);
+
/*
* If we must not write WAL, due to a relfilelocator-specific
* condition or being in recovery, don't dirty the page. We can
@@ -4041,8 +4066,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
*
* See src/backend/storage/page/README for longer discussion.
*/
- if (RecoveryInProgress() ||
- RelFileLocatorSkippingWAL(bufHdr->tag.rlocator))
+ if (RecoveryInProgress() || RelFileLocatorSkippingWAL(rlocator))
return;
/*
@@ -4650,8 +4674,10 @@ AbortBufferIO(void)
{
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
+ RelFileLocator rlocator;
- path = relpathperm(buf->tag.rlocator, buf->tag.forkNum);
+ BuffTagCopyRelFileLocator(buf->tag, rlocator);
+ path = relpathperm(rlocator, buf->tag.forkNum);
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4675,7 +4701,11 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rlocator, bufHdr->tag.forkNum);
+ char *path;
+ RelFileLocator rlocator;
+
+ BuffTagCopyRelFileLocator(bufHdr->tag, rlocator);
+ path = relpathperm(rlocator, bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4693,8 +4723,11 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
- char *path = relpathbackend(bufHdr->tag.rlocator, MyBackendId,
- bufHdr->tag.forkNum);
+ char *path;
+ RelFileLocator rlocator;
+
+ BuffTagCopyRelFileLocator(bufHdr->tag, rlocator);
+ path = relpathbackend(rlocator, MyBackendId, bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4787,9 +4820,14 @@ WaitBufHdrUnlocked(BufferDesc *buf)
static inline int
buffertag_comparator(const BufferTag *ba, const BufferTag *bb)
{
- int ret;
+ int ret;
+ RelFileLocator rlocatora;
+ RelFileLocator rlocatorb;
+
+ BuffTagCopyRelFileLocator(*ba, rlocatora);
+ BuffTagCopyRelFileLocator(*bb, rlocatorb);
- ret = rlocator_comparator(&ba->rlocator, &bb->rlocator);
+ ret = rlocator_comparator(&rlocatora, &rlocatorb);
if (ret != 0)
return ret;
@@ -4946,10 +4984,12 @@ IssuePendingWritebacks(WritebackContext *context)
SMgrRelation reln;
int ahead;
BufferTag tag;
+ RelFileLocator currlocator;
Size nblocks = 1;
cur = &context->pending_writebacks[i];
tag = cur->tag;
+ BuffTagCopyRelFileLocator(tag, currlocator);
/*
* Peek ahead, into following writeback requests, to see if they can
@@ -4957,10 +4997,13 @@ IssuePendingWritebacks(WritebackContext *context)
*/
for (ahead = 0; i + ahead + 1 < context->nr_pending; ahead++)
{
+ RelFileLocator nextrlocator;
+
next = &context->pending_writebacks[i + ahead + 1];
+ BuffTagCopyRelFileLocator(next->tag, nextrlocator);
/* different file, stop */
- if (!RelFileLocatorEquals(cur->tag.rlocator, next->tag.rlocator) ||
+ if (!RelFileLocatorEquals(currlocator, nextrlocator) ||
cur->tag.forkNum != next->tag.forkNum)
break;
@@ -4979,7 +5022,7 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rlocator, InvalidBackendId);
+ reln = smgropen(currlocator, InvalidBackendId);
smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
}
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index 3dc9cc7..1d43f22 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -213,9 +213,12 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
{
SMgrRelation oreln;
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
+ RelFileLocator rlocator;
+
+ BuffTagCopyRelFileLocator(bufHdr->tag, rlocator);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rlocator, MyBackendId);
+ oreln = smgropen(rlocator, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
@@ -337,16 +340,22 @@ DropRelFileLocatorLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
+ BuffTagRelFileLocatorEquals(bufHdr->tag, rlocator) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
+ {
+ RelFileLocator rlocator;
+
+ BuffTagCopyRelFileLocator(bufHdr->tag, rlocator);
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rlocator, MyBackendId,
+ relpathbackend(rlocator, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
+ }
+
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
hash_search(LocalBufHash, (void *) &bufHdr->tag,
@@ -383,12 +392,15 @@ DropRelFileLocatorAllLocalBuffers(RelFileLocator rlocator)
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator))
+ BuffTagRelFileLocatorEquals(bufHdr->tag, rlocator))
{
+ RelFileLocator rlocator;
+
+ BuffTagCopyRelFileLocator(bufHdr->tag, rlocator);
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rlocator, MyBackendId,
+ relpathbackend(rlocator, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index d54e1f6..b1b8061 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,34 +90,61 @@
*/
typedef struct buftag
{
- RelFileLocator rlocator; /* physical relation identifier */
- ForkNumber forkNum;
- BlockNumber blockNum; /* blknum relative to begin of reln */
+ Oid spcOid; /* tablespace oid. */
+ Oid dbOid; /* database oid. */
+ RelFileNumber relNumber; /* relation file number. */
+ ForkNumber forkNum;
+ BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
+#define BufTagGetFileNumber(a) ((a).relNumber)
+
+#define BufTagSetFileNumber(a, relnumber) \
+( \
+ (a).relNumber = (relnumber) \
+)
+
#define CLEAR_BUFFERTAG(a) \
( \
- (a).rlocator.spcOid = InvalidOid, \
- (a).rlocator.dbOid = InvalidOid, \
- (a).rlocator.relNumber = InvalidOid, \
+ (a).spcOid = InvalidOid, \
+ (a).dbOid = InvalidOid, \
+ BufTagSetFileNumber(a, InvalidRelFileNumber), \
(a).forkNum = InvalidForkNumber, \
(a).blockNum = InvalidBlockNumber \
)
#define INIT_BUFFERTAG(a,xx_rlocator,xx_forkNum,xx_blockNum) \
( \
- (a).rlocator = (xx_rlocator), \
+ (a).spcOid = (xx_rlocator).spcOid, \
+ (a).dbOid = (xx_rlocator).dbOid, \
+ BufTagSetFileNumber(a, (xx_rlocator).relNumber), \
(a).forkNum = (xx_forkNum), \
(a).blockNum = (xx_blockNum) \
)
#define BUFFERTAGS_EQUAL(a,b) \
( \
- RelFileLocatorEquals((a).rlocator, (b).rlocator) && \
+ (a).spcOid == (b).spcOid && \
+ (a).dbOid == (b).dbOid && \
+ (a).relNumber == (b).relNumber && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
)
+#define BuffTagCopyRelFileLocator(a, locator) \
+do { \
+ (locator).spcOid = (a).spcOid; \
+ (locator).dbOid = (a).dbOid; \
+ (locator).relNumber = (a).relNumber; \
+} while(0)
+
+#define BuffTagRelFileLocatorEquals(a, locator) \
+( \
+ (a).spcOid == (locator).spcOid && \
+ (a).dbOid == (locator).dbOid && \
+ (a).relNumber == (locator).relNumber \
+)
+
/*
* The shared buffer mapping table is partitioned to reduce contention.
* To determine which partition lock a given tag requires, compute the tag's
--
1.8.3.1
Attachment: v4-0001-Rename-RelFileNode-to-RelFileLocator-and-relNode-.patch (text/x-patch; charset=US-ASCII)
From ad0673696c10a19f5f8710719f840ed8cf192f36 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Tue, 21 Jun 2022 14:04:01 +0530
Subject: [PATCH v4 1/4] Rename RelFileNode to RelFileLocator and relNode to
RelNumber
Currently, the way relfilenode and relnode are used is really confusing.
There is some precedent for calling the number that pertains to the file
on disk "relnode", and that value combined with the database and
tablespace OIDs "relfilenode", but it's not the most obvious terminology,
and it is not used uniformly.
As part of this patch set, these variables are renamed to better match
their usage: RelFileNode becomes RelFileLocator, and related variable
declarations change from relfilenode to relfilelocator. Within
RelFileLocator, relNode is renamed to relNumber, and dbNode and spcNode
are renamed to dbOid and spcOid. All other references to
relnode/relfilenode that concern the on-disk file are renamed to
relnumber/relfilenumber.
---
contrib/bloom/blinsert.c | 2 +-
contrib/oid2name/oid2name.c | 28 +--
contrib/pg_buffercache/pg_buffercache_pages.c | 10 +-
contrib/pg_prewarm/autoprewarm.c | 26 +--
contrib/pg_visibility/pg_visibility.c | 2 +-
src/backend/access/common/syncscan.c | 29 +--
src/backend/access/gin/ginbtree.c | 2 +-
src/backend/access/gin/ginfast.c | 2 +-
src/backend/access/gin/ginutil.c | 2 +-
src/backend/access/gin/ginxlog.c | 6 +-
src/backend/access/gist/gistbuild.c | 4 +-
src/backend/access/gist/gistxlog.c | 11 +-
src/backend/access/hash/hash_xlog.c | 6 +-
src/backend/access/hash/hashpage.c | 4 +-
src/backend/access/heap/heapam.c | 78 +++----
src/backend/access/heap/heapam_handler.c | 26 +--
src/backend/access/heap/rewriteheap.c | 10 +-
src/backend/access/heap/visibilitymap.c | 4 +-
src/backend/access/nbtree/nbtpage.c | 2 +-
src/backend/access/nbtree/nbtree.c | 2 +-
src/backend/access/nbtree/nbtsort.c | 2 +-
src/backend/access/nbtree/nbtxlog.c | 8 +-
src/backend/access/rmgrdesc/genericdesc.c | 2 +-
src/backend/access/rmgrdesc/gindesc.c | 2 +-
src/backend/access/rmgrdesc/gistdesc.c | 6 +-
src/backend/access/rmgrdesc/heapdesc.c | 6 +-
src/backend/access/rmgrdesc/nbtdesc.c | 4 +-
src/backend/access/rmgrdesc/seqdesc.c | 4 +-
src/backend/access/rmgrdesc/smgrdesc.c | 4 +-
src/backend/access/rmgrdesc/xactdesc.c | 44 ++--
src/backend/access/rmgrdesc/xlogdesc.c | 10 +-
src/backend/access/spgist/spginsert.c | 6 +-
src/backend/access/spgist/spgxlog.c | 6 +-
src/backend/access/table/tableamapi.c | 2 +-
src/backend/access/transam/README | 14 +-
src/backend/access/transam/README.parallel | 2 +-
src/backend/access/transam/twophase.c | 38 ++--
src/backend/access/transam/varsup.c | 2 +-
src/backend/access/transam/xact.c | 40 ++--
src/backend/access/transam/xloginsert.c | 38 ++--
src/backend/access/transam/xlogprefetcher.c | 96 ++++----
src/backend/access/transam/xlogreader.c | 25 ++-
src/backend/access/transam/xlogrecovery.c | 18 +-
src/backend/access/transam/xlogutils.c | 73 +++---
src/backend/bootstrap/bootparse.y | 8 +-
src/backend/catalog/catalog.c | 30 +--
src/backend/catalog/heap.c | 56 ++---
src/backend/catalog/index.c | 37 +--
src/backend/catalog/storage.c | 119 +++++-----
src/backend/commands/cluster.c | 46 ++--
src/backend/commands/copyfrom.c | 6 +-
src/backend/commands/dbcommands.c | 104 ++++-----
src/backend/commands/indexcmds.c | 14 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/sequence.c | 29 +--
src/backend/commands/tablecmds.c | 87 ++++----
src/backend/commands/tablespace.c | 18 +-
src/backend/nodes/copyfuncs.c | 4 +-
src/backend/nodes/equalfuncs.c | 4 +-
src/backend/nodes/outfuncs.c | 4 +-
src/backend/parser/gram.y | 8 +-
src/backend/parser/parse_utilcmd.c | 8 +-
src/backend/postmaster/checkpointer.c | 2 +-
src/backend/replication/logical/decode.c | 40 ++--
src/backend/replication/logical/reorderbuffer.c | 50 ++---
src/backend/replication/logical/snapbuild.c | 2 +-
src/backend/storage/buffer/bufmgr.c | 284 ++++++++++++------------
src/backend/storage/buffer/localbuf.c | 34 +--
src/backend/storage/freespace/freespace.c | 6 +-
src/backend/storage/freespace/fsmpage.c | 6 +-
src/backend/storage/ipc/standby.c | 8 +-
src/backend/storage/lmgr/predicate.c | 24 +-
src/backend/storage/smgr/README | 2 +-
src/backend/storage/smgr/md.c | 126 +++++------
src/backend/storage/smgr/smgr.c | 44 ++--
src/backend/utils/adt/dbsize.c | 64 +++---
src/backend/utils/adt/pg_upgrade_support.c | 14 +-
src/backend/utils/cache/Makefile | 2 +-
src/backend/utils/cache/inval.c | 16 +-
src/backend/utils/cache/relcache.c | 180 +++++++--------
src/backend/utils/cache/relfilenodemap.c | 244 --------------------
src/backend/utils/cache/relfilenumbermap.c | 244 ++++++++++++++++++++
src/backend/utils/cache/relmapper.c | 85 +++----
src/bin/pg_dump/pg_dump.c | 36 +--
src/bin/pg_rewind/datapagemap.h | 2 +-
src/bin/pg_rewind/filemap.c | 34 +--
src/bin/pg_rewind/filemap.h | 4 +-
src/bin/pg_rewind/parsexlog.c | 10 +-
src/bin/pg_rewind/pg_rewind.h | 2 +-
src/bin/pg_upgrade/Makefile | 2 +-
src/bin/pg_upgrade/info.c | 10 +-
src/bin/pg_upgrade/pg_upgrade.h | 22 +-
src/bin/pg_upgrade/relfilenode.c | 259 ---------------------
src/bin/pg_upgrade/relfilenumber.c | 259 +++++++++++++++++++++
src/bin/pg_waldump/pg_waldump.c | 26 +--
src/common/relpath.c | 48 ++--
src/include/access/brin_xlog.h | 2 +-
src/include/access/ginxlog.h | 4 +-
src/include/access/gistxlog.h | 2 +-
src/include/access/heapam_xlog.h | 8 +-
src/include/access/nbtxlog.h | 4 +-
src/include/access/rewriteheap.h | 6 +-
src/include/access/tableam.h | 59 ++---
src/include/access/xact.h | 26 +--
src/include/access/xlog_internal.h | 2 +-
src/include/access/xloginsert.h | 8 +-
src/include/access/xlogreader.h | 6 +-
src/include/access/xlogrecord.h | 8 +-
src/include/access/xlogutils.h | 8 +-
src/include/catalog/binary_upgrade.h | 6 +-
src/include/catalog/catalog.h | 5 +-
src/include/catalog/heap.h | 2 +-
src/include/catalog/index.h | 2 +-
src/include/catalog/storage.h | 10 +-
src/include/catalog/storage_xlog.h | 8 +-
src/include/commands/sequence.h | 4 +-
src/include/commands/tablecmds.h | 2 +-
src/include/commands/tablespace.h | 2 +-
src/include/common/relpath.h | 24 +-
src/include/nodes/parsenodes.h | 8 +-
src/include/postgres_ext.h | 7 +
src/include/postmaster/bgwriter.h | 2 +-
src/include/replication/reorderbuffer.h | 6 +-
src/include/storage/buf_internals.h | 28 +--
src/include/storage/bufmgr.h | 16 +-
src/include/storage/freespace.h | 4 +-
src/include/storage/md.h | 6 +-
src/include/storage/relfilelocator.h | 99 +++++++++
src/include/storage/relfilenode.h | 99 ---------
src/include/storage/sinval.h | 4 +-
src/include/storage/smgr.h | 12 +-
src/include/storage/standby.h | 6 +-
src/include/storage/sync.h | 4 +-
src/include/utils/inval.h | 4 +-
src/include/utils/rel.h | 46 ++--
src/include/utils/relcache.h | 8 +-
src/include/utils/relfilenodemap.h | 18 --
src/include/utils/relfilenumbermap.h | 19 ++
src/include/utils/relmapper.h | 13 +-
src/test/recovery/t/018_wal_optimize.pl | 2 +-
src/tools/pgindent/typedefs.list | 10 +-
141 files changed, 2070 insertions(+), 2042 deletions(-)
delete mode 100644 src/backend/utils/cache/relfilenodemap.c
create mode 100644 src/backend/utils/cache/relfilenumbermap.c
delete mode 100644 src/bin/pg_upgrade/relfilenode.c
create mode 100644 src/bin/pg_upgrade/relfilenumber.c
create mode 100644 src/include/storage/relfilelocator.h
delete mode 100644 src/include/storage/relfilenode.h
delete mode 100644 src/include/utils/relfilenodemap.h
create mode 100644 src/include/utils/relfilenumbermap.h
diff --git a/contrib/bloom/blinsert.c b/contrib/bloom/blinsert.c
index 82378db..e64291e 100644
--- a/contrib/bloom/blinsert.c
+++ b/contrib/bloom/blinsert.c
@@ -179,7 +179,7 @@ blbuildempty(Relation index)
PageSetChecksumInplace(metapage, BLOOM_METAPAGE_BLKNO);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, BLOOM_METAPAGE_BLKNO,
(char *) metapage, true);
- log_newpage(&(RelationGetSmgr(index))->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&(RelationGetSmgr(index))->smgr_rlocator.locator, INIT_FORKNUM,
BLOOM_METAPAGE_BLKNO, metapage, true);
/*
diff --git a/contrib/oid2name/oid2name.c b/contrib/oid2name/oid2name.c
index a3e358b..cadba3b 100644
--- a/contrib/oid2name/oid2name.c
+++ b/contrib/oid2name/oid2name.c
@@ -30,7 +30,7 @@ struct options
{
eary *tables;
eary *oids;
- eary *filenodes;
+ eary *filenumbers;
bool quiet;
bool systables;
@@ -125,9 +125,9 @@ get_opts(int argc, char **argv, struct options *my_opts)
my_opts->dbname = pg_strdup(optarg);
break;
- /* specify one filenode to show */
+ /* specify one filenumber to show */
case 'f':
- add_one_elt(optarg, my_opts->filenodes);
+ add_one_elt(optarg, my_opts->filenumbers);
break;
/* host to connect to */
@@ -494,7 +494,7 @@ sql_exec_dumpalltables(PGconn *conn, struct options *opts)
}
/*
- * Show oid, filenode, name, schema and tablespace for each of the
+ * Show oid, filenumber, name, schema and tablespace for each of the
* given objects in the current database.
*/
void
@@ -504,19 +504,19 @@ sql_exec_searchtables(PGconn *conn, struct options *opts)
char *qualifiers,
*ptr;
char *comma_oids,
- *comma_filenodes,
+ *comma_filenumbers,
*comma_tables;
bool written = false;
char *addfields = ",c.oid AS \"Oid\", nspname AS \"Schema\", spcname as \"Tablespace\" ";
- /* get tables qualifiers, whether names, filenodes, or OIDs */
+ /* get tables qualifiers, whether names, filenumbers, or OIDs */
comma_oids = get_comma_elts(opts->oids);
comma_tables = get_comma_elts(opts->tables);
- comma_filenodes = get_comma_elts(opts->filenodes);
+ comma_filenumbers = get_comma_elts(opts->filenumbers);
/* 80 extra chars for SQL expression */
qualifiers = (char *) pg_malloc(strlen(comma_oids) + strlen(comma_tables) +
- strlen(comma_filenodes) + 80);
+ strlen(comma_filenumbers) + 80);
ptr = qualifiers;
if (opts->oids->num > 0)
@@ -524,11 +524,11 @@ sql_exec_searchtables(PGconn *conn, struct options *opts)
ptr += sprintf(ptr, "c.oid IN (%s)", comma_oids);
written = true;
}
- if (opts->filenodes->num > 0)
+ if (opts->filenumbers->num > 0)
{
if (written)
ptr += sprintf(ptr, " OR ");
- ptr += sprintf(ptr, "pg_catalog.pg_relation_filenode(c.oid) IN (%s)", comma_filenodes);
+ ptr += sprintf(ptr, "pg_catalog.pg_relation_filenode(c.oid) IN (%s)", comma_filenumbers);
written = true;
}
if (opts->tables->num > 0)
@@ -539,7 +539,7 @@ sql_exec_searchtables(PGconn *conn, struct options *opts)
}
free(comma_oids);
free(comma_tables);
- free(comma_filenodes);
+ free(comma_filenumbers);
/* now build the query */
todo = psprintf("SELECT pg_catalog.pg_relation_filenode(c.oid) as \"Filenode\", relname as \"Table Name\" %s\n"
@@ -588,11 +588,11 @@ main(int argc, char **argv)
my_opts->oids = (eary *) pg_malloc(sizeof(eary));
my_opts->tables = (eary *) pg_malloc(sizeof(eary));
- my_opts->filenodes = (eary *) pg_malloc(sizeof(eary));
+ my_opts->filenumbers = (eary *) pg_malloc(sizeof(eary));
my_opts->oids->num = my_opts->oids->alloc = 0;
my_opts->tables->num = my_opts->tables->alloc = 0;
- my_opts->filenodes->num = my_opts->filenodes->alloc = 0;
+ my_opts->filenumbers->num = my_opts->filenumbers->alloc = 0;
/* parse the opts */
get_opts(argc, argv, my_opts);
@@ -618,7 +618,7 @@ main(int argc, char **argv)
/* display the given elements in the database */
if (my_opts->oids->num > 0 ||
my_opts->tables->num > 0 ||
- my_opts->filenodes->num > 0)
+ my_opts->filenumbers->num > 0)
{
if (!my_opts->quiet)
printf("From database \"%s\":\n", my_opts->dbname);
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 1bd579f..713f52a 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -26,7 +26,7 @@ PG_MODULE_MAGIC;
typedef struct
{
uint32 bufferid;
- Oid relfilenode;
+ RelFileNumber relfilenumber;
Oid reltablespace;
Oid reldatabase;
ForkNumber forknum;
@@ -153,9 +153,9 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenode = bufHdr->tag.rnode.relNode;
- fctx->record[i].reltablespace = bufHdr->tag.rnode.spcNode;
- fctx->record[i].reldatabase = bufHdr->tag.rnode.dbNode;
+ fctx->record[i].relfilenumber = bufHdr->tag.rlocator.relNumber;
+ fctx->record[i].reltablespace = bufHdr->tag.rlocator.spcOid;
+ fctx->record[i].reldatabase = bufHdr->tag.rlocator.dbOid;
fctx->record[i].forknum = bufHdr->tag.forkNum;
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
@@ -209,7 +209,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
}
else
{
- values[1] = ObjectIdGetDatum(fctx->record[i].relfilenode);
+ values[1] = ObjectIdGetDatum(fctx->record[i].relfilenumber);
nulls[1] = false;
values[2] = ObjectIdGetDatum(fctx->record[i].reltablespace);
nulls[2] = false;
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index c0c4f5d..7f1d55c 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -52,7 +52,7 @@
#include "utils/guc.h"
#include "utils/memutils.h"
#include "utils/rel.h"
-#include "utils/relfilenodemap.h"
+#include "utils/relfilenumbermap.h"
#include "utils/resowner.h"
#define AUTOPREWARM_FILE "autoprewarm.blocks"
@@ -62,7 +62,7 @@ typedef struct BlockInfoRecord
{
Oid database;
Oid tablespace;
- Oid filenode;
+ RelFileNumber filenumber;
ForkNumber forknum;
BlockNumber blocknum;
} BlockInfoRecord;
@@ -347,7 +347,7 @@ apw_load_buffers(void)
unsigned forknum;
if (fscanf(file, "%u,%u,%u,%u,%u\n", &blkinfo[i].database,
- &blkinfo[i].tablespace, &blkinfo[i].filenode,
+ &blkinfo[i].tablespace, &blkinfo[i].filenumber,
&forknum, &blkinfo[i].blocknum) != 5)
ereport(ERROR,
(errmsg("autoprewarm block dump file is corrupted at line %d",
@@ -494,7 +494,7 @@ autoprewarm_database_main(Datum main_arg)
* relation. Note that rel will be NULL if try_relation_open failed
* previously; in that case, there is nothing to close.
*/
- if (old_blk != NULL && old_blk->filenode != blk->filenode &&
+ if (old_blk != NULL && old_blk->filenumber != blk->filenumber &&
rel != NULL)
{
relation_close(rel, AccessShareLock);
@@ -506,13 +506,13 @@ autoprewarm_database_main(Datum main_arg)
* Try to open each new relation, but only once, when we first
* encounter it. If it's been dropped, skip the associated blocks.
*/
- if (old_blk == NULL || old_blk->filenode != blk->filenode)
+ if (old_blk == NULL || old_blk->filenumber != blk->filenumber)
{
Oid reloid;
Assert(rel == NULL);
StartTransactionCommand();
- reloid = RelidByRelfilenode(blk->tablespace, blk->filenode);
+ reloid = RelidByRelfilenumber(blk->tablespace, blk->filenumber);
if (OidIsValid(reloid))
rel = try_relation_open(reloid, AccessShareLock);
@@ -527,7 +527,7 @@ autoprewarm_database_main(Datum main_arg)
/* Once per fork, check for fork existence and size. */
if (old_blk == NULL ||
- old_blk->filenode != blk->filenode ||
+ old_blk->filenumber != blk->filenumber ||
old_blk->forknum != blk->forknum)
{
/*
@@ -631,9 +631,9 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
if (buf_state & BM_TAG_VALID &&
((buf_state & BM_PERMANENT) || dump_unlogged))
{
- block_info_array[num_blocks].database = bufHdr->tag.rnode.dbNode;
- block_info_array[num_blocks].tablespace = bufHdr->tag.rnode.spcNode;
- block_info_array[num_blocks].filenode = bufHdr->tag.rnode.relNode;
+ block_info_array[num_blocks].database = bufHdr->tag.rlocator.dbOid;
+ block_info_array[num_blocks].tablespace = bufHdr->tag.rlocator.spcOid;
+ block_info_array[num_blocks].filenumber = bufHdr->tag.rlocator.relNumber;
block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
@@ -671,7 +671,7 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
ret = fprintf(file, "%u,%u,%u,%u,%u\n",
block_info_array[i].database,
block_info_array[i].tablespace,
- block_info_array[i].filenode,
+ block_info_array[i].filenumber,
(uint32) block_info_array[i].forknum,
block_info_array[i].blocknum);
if (ret < 0)
@@ -900,7 +900,7 @@ do { \
* We depend on all records for a particular database being consecutive
* in the dump file; each per-database worker will preload blocks until
* it sees a block for some other database. Sorting by tablespace,
- * filenode, forknum, and blocknum isn't critical for correctness, but
+ * filenumber, forknum, and blocknum isn't critical for correctness, but
* helps us get a sequential I/O pattern.
*/
static int
@@ -911,7 +911,7 @@ apw_compare_blockinfo(const void *p, const void *q)
cmp_member_elem(database);
cmp_member_elem(tablespace);
- cmp_member_elem(filenode);
+ cmp_member_elem(filenumber);
cmp_member_elem(forknum);
cmp_member_elem(blocknum);
diff --git a/contrib/pg_visibility/pg_visibility.c b/contrib/pg_visibility/pg_visibility.c
index 1853c35..4e2e9ea 100644
--- a/contrib/pg_visibility/pg_visibility.c
+++ b/contrib/pg_visibility/pg_visibility.c
@@ -407,7 +407,7 @@ pg_truncate_visibility_map(PG_FUNCTION_ARGS)
xl_smgr_truncate xlrec;
xlrec.blkno = 0;
- xlrec.rnode = rel->rd_node;
+ xlrec.rlocator = rel->rd_locator;
xlrec.flags = SMGR_TRUNCATE_VM;
XLogBeginInsert();
diff --git a/src/backend/access/common/syncscan.c b/src/backend/access/common/syncscan.c
index d5b16c5..ad48cb7 100644
--- a/src/backend/access/common/syncscan.c
+++ b/src/backend/access/common/syncscan.c
@@ -90,7 +90,7 @@ bool trace_syncscan = false;
*/
typedef struct ss_scan_location_t
{
- RelFileNode relfilenode; /* identity of a relation */
+ RelFileLocator relfilelocator; /* identity of a relation */
BlockNumber location; /* last-reported location in the relation */
} ss_scan_location_t;
@@ -115,7 +115,7 @@ typedef struct ss_scan_locations_t
static ss_scan_locations_t *scan_locations;
/* prototypes for internal functions */
-static BlockNumber ss_search(RelFileNode relfilenode,
+static BlockNumber ss_search(RelFileLocator relfilelocator,
BlockNumber location, bool set);
@@ -159,9 +159,9 @@ SyncScanShmemInit(void)
* these invalid entries will fall off the LRU list and get
* replaced with real entries.
*/
- item->location.relfilenode.spcNode = InvalidOid;
- item->location.relfilenode.dbNode = InvalidOid;
- item->location.relfilenode.relNode = InvalidOid;
+ item->location.relfilelocator.spcOid = InvalidOid;
+ item->location.relfilelocator.dbOid = InvalidOid;
+ item->location.relfilelocator.relNumber = InvalidRelFileNumber;
item->location.location = InvalidBlockNumber;
item->prev = (i > 0) ?
@@ -176,10 +176,10 @@ SyncScanShmemInit(void)
/*
* ss_search --- search the scan_locations structure for an entry with the
- * given relfilenode.
+ * given relfilelocator.
*
* If "set" is true, the location is updated to the given location. If no
- * entry for the given relfilenode is found, it will be created at the head
+ * entry for the given relfilelocator is found, it will be created at the head
* of the list with the given location, even if "set" is false.
*
* In any case, the location after possible update is returned.
@@ -188,7 +188,7 @@ SyncScanShmemInit(void)
* data structure.
*/
static BlockNumber
-ss_search(RelFileNode relfilenode, BlockNumber location, bool set)
+ss_search(RelFileLocator relfilelocator, BlockNumber location, bool set)
{
ss_lru_item_t *item;
@@ -197,7 +197,8 @@ ss_search(RelFileNode relfilenode, BlockNumber location, bool set)
{
bool match;
- match = RelFileNodeEquals(item->location.relfilenode, relfilenode);
+ match = RelFileLocatorEquals(item->location.relfilelocator,
+ relfilelocator);
if (match || item->next == NULL)
{
@@ -207,7 +208,7 @@ ss_search(RelFileNode relfilenode, BlockNumber location, bool set)
*/
if (!match)
{
- item->location.relfilenode = relfilenode;
+ item->location.relfilelocator = relfilelocator;
item->location.location = location;
}
else if (set)
@@ -255,7 +256,7 @@ ss_get_location(Relation rel, BlockNumber relnblocks)
BlockNumber startloc;
LWLockAcquire(SyncScanLock, LW_EXCLUSIVE);
- startloc = ss_search(rel->rd_node, 0, false);
+ startloc = ss_search(rel->rd_locator, 0, false);
LWLockRelease(SyncScanLock);
/*
@@ -281,8 +282,8 @@ ss_get_location(Relation rel, BlockNumber relnblocks)
* ss_report_location --- update the current scan location
*
* Writes an entry into the shared Sync Scan state of the form
- * (relfilenode, blocknumber), overwriting any existing entry for the
- * same relfilenode.
+ * (relfilelocator, blocknumber), overwriting any existing entry for the
+ * same relfilelocator.
*/
void
ss_report_location(Relation rel, BlockNumber location)
@@ -309,7 +310,7 @@ ss_report_location(Relation rel, BlockNumber location)
{
if (LWLockConditionalAcquire(SyncScanLock, LW_EXCLUSIVE))
{
- (void) ss_search(rel->rd_node, location, true);
+ (void) ss_search(rel->rd_locator, location, true);
LWLockRelease(SyncScanLock);
}
#ifdef TRACE_SYNCSCAN
diff --git a/src/backend/access/gin/ginbtree.c b/src/backend/access/gin/ginbtree.c
index cc6d4e6..c75bfc2 100644
--- a/src/backend/access/gin/ginbtree.c
+++ b/src/backend/access/gin/ginbtree.c
@@ -470,7 +470,7 @@ ginPlaceToPage(GinBtree btree, GinBtreeStack *stack,
savedRightLink = GinPageGetOpaque(page)->rightlink;
/* Begin setting up WAL record */
- data.node = btree->index->rd_node;
+ data.locator = btree->index->rd_locator;
data.flags = xlflags;
if (BufferIsValid(childbuf))
{
diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
index 7409fdc..6c67744 100644
--- a/src/backend/access/gin/ginfast.c
+++ b/src/backend/access/gin/ginfast.c
@@ -235,7 +235,7 @@ ginHeapTupleFastInsert(GinState *ginstate, GinTupleCollector *collector)
needWal = RelationNeedsWAL(index);
- data.node = index->rd_node;
+ data.locator = index->rd_locator;
data.ntuples = 0;
data.newRightlink = data.prevTail = InvalidBlockNumber;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 20f4706..6df7f2e 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -688,7 +688,7 @@ ginUpdateStats(Relation index, const GinStatsData *stats, bool is_build)
XLogRecPtr recptr;
ginxlogUpdateMeta data;
- data.node = index->rd_node;
+ data.locator = index->rd_locator;
data.ntuples = 0;
data.newRightlink = data.prevTail = InvalidBlockNumber;
memcpy(&data.metadata, metadata, sizeof(GinMetaPageData));
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 87e8366..41b9211 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -95,13 +95,13 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
if (PageAddItem(page, (Item) itup, IndexTupleSize(itup), offset, false, false) == InvalidOffsetNumber)
{
- RelFileNode node;
+ RelFileLocator locator;
ForkNumber forknum;
BlockNumber blknum;
- BufferGetTag(buffer, &node, &forknum, &blknum);
+ BufferGetTag(buffer, &locator, &forknum, &blknum);
elog(ERROR, "failed to add item to index page in %u/%u/%u",
- node.spcNode, node.dbNode, node.relNode);
+ locator.spcOid, locator.dbOid, locator.relNumber);
}
}
diff --git a/src/backend/access/gist/gistbuild.c b/src/backend/access/gist/gistbuild.c
index f5a5caf..374e64e 100644
--- a/src/backend/access/gist/gistbuild.c
+++ b/src/backend/access/gist/gistbuild.c
@@ -462,7 +462,7 @@ gist_indexsortbuild(GISTBuildState *state)
smgrwrite(RelationGetSmgr(state->indexrel), MAIN_FORKNUM, GIST_ROOT_BLKNO,
levelstate->pages[0], true);
if (RelationNeedsWAL(state->indexrel))
- log_newpage(&state->indexrel->rd_node, MAIN_FORKNUM, GIST_ROOT_BLKNO,
+ log_newpage(&state->indexrel->rd_locator, MAIN_FORKNUM, GIST_ROOT_BLKNO,
levelstate->pages[0], true);
pfree(levelstate->pages[0]);
@@ -663,7 +663,7 @@ gist_indexsortbuild_flush_ready_pages(GISTBuildState *state)
}
if (RelationNeedsWAL(state->indexrel))
- log_newpages(&state->indexrel->rd_node, MAIN_FORKNUM, state->ready_num_pages,
+ log_newpages(&state->indexrel->rd_locator, MAIN_FORKNUM, state->ready_num_pages,
state->ready_blknos, state->ready_pages, true);
for (int i = 0; i < state->ready_num_pages; i++)
diff --git a/src/backend/access/gist/gistxlog.c b/src/backend/access/gist/gistxlog.c
index df70f90..b4f629f 100644
--- a/src/backend/access/gist/gistxlog.c
+++ b/src/backend/access/gist/gistxlog.c
@@ -191,11 +191,12 @@ gistRedoDeleteRecord(XLogReaderState *record)
*/
if (InHotStandby)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, NULL);
- ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid, rnode);
+ ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid,
+ rlocator);
}
if (XLogReadBufferForRedo(record, 0, &buffer) == BLK_NEEDS_REDO)
@@ -395,7 +396,7 @@ gistRedoPageReuse(XLogReaderState *record)
*/
if (InHotStandby)
ResolveRecoveryConflictWithSnapshotFullXid(xlrec->latestRemovedFullXid,
- xlrec->node);
+ xlrec->locator);
}
void
@@ -607,7 +608,7 @@ gistXLogPageReuse(Relation rel, BlockNumber blkno, FullTransactionId latestRemov
*/
/* XLOG stuff */
- xlrec_reuse.node = rel->rd_node;
+ xlrec_reuse.locator = rel->rd_locator;
xlrec_reuse.block = blkno;
xlrec_reuse.latestRemovedFullXid = latestRemovedXid;
diff --git a/src/backend/access/hash/hash_xlog.c b/src/backend/access/hash/hash_xlog.c
index 62dbfc3..2e68303 100644
--- a/src/backend/access/hash/hash_xlog.c
+++ b/src/backend/access/hash/hash_xlog.c
@@ -999,10 +999,10 @@ hash_xlog_vacuum_one_page(XLogReaderState *record)
*/
if (InHotStandby)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
- ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid, rnode);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, NULL);
+ ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid, rlocator);
}
action = XLogReadBufferForRedoExtended(record, 0, RBM_NORMAL, true, &buffer);
diff --git a/src/backend/access/hash/hashpage.c b/src/backend/access/hash/hashpage.c
index 39206d1..d2edcd4 100644
--- a/src/backend/access/hash/hashpage.c
+++ b/src/backend/access/hash/hashpage.c
@@ -428,7 +428,7 @@ _hash_init(Relation rel, double num_tuples, ForkNumber forkNum)
MarkBufferDirty(buf);
if (use_wal)
- log_newpage(&rel->rd_node,
+ log_newpage(&rel->rd_locator,
forkNum,
blkno,
BufferGetPage(buf),
@@ -1019,7 +1019,7 @@ _hash_alloc_buckets(Relation rel, BlockNumber firstblock, uint32 nblocks)
ovflopaque->hasho_page_id = HASHO_PAGE_ID;
if (RelationNeedsWAL(rel))
- log_newpage(&rel->rd_node,
+ log_newpage(&rel->rd_locator,
MAIN_FORKNUM,
lastblock,
zerobuf.data,
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 637de11..aab8d6f 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -8189,7 +8189,7 @@ log_heap_freeze(Relation reln, Buffer buffer, TransactionId cutoff_xid,
* heap_buffer, if necessary.
*/
XLogRecPtr
-log_heap_visible(RelFileNode rnode, Buffer heap_buffer, Buffer vm_buffer,
+log_heap_visible(RelFileLocator rlocator, Buffer heap_buffer, Buffer vm_buffer,
TransactionId cutoff_xid, uint8 vmflags)
{
xl_heap_visible xlrec;
@@ -8454,7 +8454,7 @@ log_heap_new_cid(Relation relation, HeapTuple tup)
Assert(tup->t_tableOid != InvalidOid);
xlrec.top_xid = GetTopTransactionId();
- xlrec.target_node = relation->rd_node;
+ xlrec.target_locator = relation->rd_locator;
xlrec.target_tid = tup->t_self;
/*
@@ -8623,18 +8623,18 @@ heap_xlog_prune(XLogReaderState *record)
XLogRecPtr lsn = record->EndRecPtr;
xl_heap_prune *xlrec = (xl_heap_prune *) XLogRecGetData(record);
Buffer buffer;
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber blkno;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &blkno);
/*
* We're about to remove tuples. In Hot Standby mode, ensure that there's
* no queries running for which the removed tuples are still visible.
*/
if (InHotStandby)
- ResolveRecoveryConflictWithSnapshot(xlrec->latestRemovedXid, rnode);
+ ResolveRecoveryConflictWithSnapshot(xlrec->latestRemovedXid, rlocator);
/*
* If we have a full-page image, restore it (using a cleanup lock) and
@@ -8694,7 +8694,7 @@ heap_xlog_prune(XLogReaderState *record)
* Do this regardless of a full-page image being applied, since the
* FSM data is not in the page anyway.
*/
- XLogRecordPageWithFreeSpace(rnode, blkno, freespace);
+ XLogRecordPageWithFreeSpace(rlocator, blkno, freespace);
}
}
@@ -8751,9 +8751,9 @@ heap_xlog_vacuum(XLogReaderState *record)
if (BufferIsValid(buffer))
{
Size freespace = PageGetHeapFreeSpace(BufferGetPage(buffer));
- RelFileNode rnode;
+ RelFileLocator rlocator;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &blkno);
UnlockReleaseBuffer(buffer);
@@ -8766,7 +8766,7 @@ heap_xlog_vacuum(XLogReaderState *record)
* Do this regardless of a full-page image being applied, since the
* FSM data is not in the page anyway.
*/
- XLogRecordPageWithFreeSpace(rnode, blkno, freespace);
+ XLogRecordPageWithFreeSpace(rlocator, blkno, freespace);
}
}
@@ -8786,11 +8786,11 @@ heap_xlog_visible(XLogReaderState *record)
Buffer vmbuffer = InvalidBuffer;
Buffer buffer;
Page page;
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber blkno;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 1, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 1, &rlocator, NULL, &blkno);
/*
* If there are any Hot Standby transactions running that have an xmin
@@ -8802,7 +8802,7 @@ heap_xlog_visible(XLogReaderState *record)
* rather than killing the transaction outright.
*/
if (InHotStandby)
- ResolveRecoveryConflictWithSnapshot(xlrec->cutoff_xid, rnode);
+ ResolveRecoveryConflictWithSnapshot(xlrec->cutoff_xid, rlocator);
/*
* Read the heap page, if it still exists. If the heap file has dropped or
@@ -8865,7 +8865,7 @@ heap_xlog_visible(XLogReaderState *record)
* FSM data is not in the page anyway.
*/
if (xlrec->flags & VISIBILITYMAP_VALID_BITS)
- XLogRecordPageWithFreeSpace(rnode, blkno, space);
+ XLogRecordPageWithFreeSpace(rlocator, blkno, space);
}
/*
@@ -8890,7 +8890,7 @@ heap_xlog_visible(XLogReaderState *record)
*/
LockBuffer(vmbuffer, BUFFER_LOCK_UNLOCK);
- reln = CreateFakeRelcacheEntry(rnode);
+ reln = CreateFakeRelcacheEntry(rlocator);
visibilitymap_pin(reln, blkno, &vmbuffer);
/*
@@ -8933,13 +8933,13 @@ heap_xlog_freeze_page(XLogReaderState *record)
*/
if (InHotStandby)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
TransactionId latestRemovedXid = cutoff_xid;
TransactionIdRetreat(latestRemovedXid);
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
- ResolveRecoveryConflictWithSnapshot(latestRemovedXid, rnode);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, NULL);
+ ResolveRecoveryConflictWithSnapshot(latestRemovedXid, rlocator);
}
if (XLogReadBufferForRedo(record, 0, &buffer) == BLK_NEEDS_REDO)
@@ -9007,10 +9007,10 @@ heap_xlog_delete(XLogReaderState *record)
ItemId lp = NULL;
HeapTupleHeader htup;
BlockNumber blkno;
- RelFileNode target_node;
+ RelFileLocator target_locator;
ItemPointerData target_tid;
- XLogRecGetBlockTag(record, 0, &target_node, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &target_locator, NULL, &blkno);
ItemPointerSetBlockNumber(&target_tid, blkno);
ItemPointerSetOffsetNumber(&target_tid, xlrec->offnum);
@@ -9020,7 +9020,7 @@ heap_xlog_delete(XLogReaderState *record)
*/
if (xlrec->flags & XLH_DELETE_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(target_node);
+ Relation reln = CreateFakeRelcacheEntry(target_locator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, blkno, &vmbuffer);
@@ -9086,12 +9086,12 @@ heap_xlog_insert(XLogReaderState *record)
xl_heap_header xlhdr;
uint32 newlen;
Size freespace = 0;
- RelFileNode target_node;
+ RelFileLocator target_locator;
BlockNumber blkno;
ItemPointerData target_tid;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 0, &target_node, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &target_locator, NULL, &blkno);
ItemPointerSetBlockNumber(&target_tid, blkno);
ItemPointerSetOffsetNumber(&target_tid, xlrec->offnum);
@@ -9101,7 +9101,7 @@ heap_xlog_insert(XLogReaderState *record)
*/
if (xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(target_node);
+ Relation reln = CreateFakeRelcacheEntry(target_locator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, blkno, &vmbuffer);
@@ -9184,7 +9184,7 @@ heap_xlog_insert(XLogReaderState *record)
* totally accurate anyway.
*/
if (action == BLK_NEEDS_REDO && freespace < BLCKSZ / 5)
- XLogRecordPageWithFreeSpace(target_node, blkno, freespace);
+ XLogRecordPageWithFreeSpace(target_locator, blkno, freespace);
}
/*
@@ -9195,7 +9195,7 @@ heap_xlog_multi_insert(XLogReaderState *record)
{
XLogRecPtr lsn = record->EndRecPtr;
xl_heap_multi_insert *xlrec;
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber blkno;
Buffer buffer;
Page page;
@@ -9217,7 +9217,7 @@ heap_xlog_multi_insert(XLogReaderState *record)
*/
xlrec = (xl_heap_multi_insert *) XLogRecGetData(record);
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &blkno);
/* check that the mutually exclusive flags are not both set */
Assert(!((xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED) &&
@@ -9229,7 +9229,7 @@ heap_xlog_multi_insert(XLogReaderState *record)
*/
if (xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(rnode);
+ Relation reln = CreateFakeRelcacheEntry(rlocator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, blkno, &vmbuffer);
@@ -9331,7 +9331,7 @@ heap_xlog_multi_insert(XLogReaderState *record)
* totally accurate anyway.
*/
if (action == BLK_NEEDS_REDO && freespace < BLCKSZ / 5)
- XLogRecordPageWithFreeSpace(rnode, blkno, freespace);
+ XLogRecordPageWithFreeSpace(rlocator, blkno, freespace);
}
/*
@@ -9342,7 +9342,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
{
XLogRecPtr lsn = record->EndRecPtr;
xl_heap_update *xlrec = (xl_heap_update *) XLogRecGetData(record);
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber oldblk;
BlockNumber newblk;
ItemPointerData newtid;
@@ -9371,7 +9371,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
oldtup.t_data = NULL;
oldtup.t_len = 0;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &newblk);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &newblk);
if (XLogRecGetBlockTagExtended(record, 1, NULL, NULL, &oldblk, NULL))
{
/* HOT updates are never done across pages */
@@ -9388,7 +9388,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
*/
if (xlrec->flags & XLH_UPDATE_OLD_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(rnode);
+ Relation reln = CreateFakeRelcacheEntry(rlocator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, oldblk, &vmbuffer);
@@ -9472,7 +9472,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
*/
if (xlrec->flags & XLH_UPDATE_NEW_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(rnode);
+ Relation reln = CreateFakeRelcacheEntry(rlocator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, newblk, &vmbuffer);
@@ -9606,7 +9606,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
* totally accurate anyway.
*/
if (newaction == BLK_NEEDS_REDO && !hot_update && freespace < BLCKSZ / 5)
- XLogRecordPageWithFreeSpace(rnode, newblk, freespace);
+ XLogRecordPageWithFreeSpace(rlocator, newblk, freespace);
}
static void
@@ -9662,13 +9662,13 @@ heap_xlog_lock(XLogReaderState *record)
*/
if (xlrec->flags & XLH_LOCK_ALL_FROZEN_CLEARED)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
Buffer vmbuffer = InvalidBuffer;
BlockNumber block;
Relation reln;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &block);
- reln = CreateFakeRelcacheEntry(rnode);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &block);
+ reln = CreateFakeRelcacheEntry(rlocator);
visibilitymap_pin(reln, block, &vmbuffer);
visibilitymap_clear(reln, block, vmbuffer, VISIBILITYMAP_ALL_FROZEN);
@@ -9735,13 +9735,13 @@ heap_xlog_lock_updated(XLogReaderState *record)
*/
if (xlrec->flags & XLH_LOCK_ALL_FROZEN_CLEARED)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
Buffer vmbuffer = InvalidBuffer;
BlockNumber block;
Relation reln;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &block);
- reln = CreateFakeRelcacheEntry(rnode);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &block);
+ reln = CreateFakeRelcacheEntry(rlocator);
visibilitymap_pin(reln, block, &vmbuffer);
visibilitymap_clear(reln, block, vmbuffer, VISIBILITYMAP_ALL_FROZEN);
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 444f027..7f227be 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -566,11 +566,11 @@ tuple_lock_retry:
*/
static void
-heapam_relation_set_new_filenode(Relation rel,
- const RelFileNode *newrnode,
- char persistence,
- TransactionId *freezeXid,
- MultiXactId *minmulti)
+heapam_relation_set_new_filelocator(Relation rel,
+ const RelFileLocator *newrlocator,
+ char persistence,
+ TransactionId *freezeXid,
+ MultiXactId *minmulti)
{
SMgrRelation srel;
@@ -591,7 +591,7 @@ heapam_relation_set_new_filenode(Relation rel,
*/
*minmulti = GetOldestMultiXactId();
- srel = RelationCreateStorage(*newrnode, persistence, true);
+ srel = RelationCreateStorage(*newrlocator, persistence, true);
/*
* If required, set up an init fork for an unlogged table so that it can
@@ -608,7 +608,7 @@ heapam_relation_set_new_filenode(Relation rel,
rel->rd_rel->relkind == RELKIND_MATVIEW ||
rel->rd_rel->relkind == RELKIND_TOASTVALUE);
smgrcreate(srel, INIT_FORKNUM, false);
- log_smgrcreate(newrnode, INIT_FORKNUM);
+ log_smgrcreate(newrlocator, INIT_FORKNUM);
smgrimmedsync(srel, INIT_FORKNUM);
}
@@ -622,11 +622,11 @@ heapam_relation_nontransactional_truncate(Relation rel)
}
static void
-heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
+heapam_relation_copy_data(Relation rel, const RelFileLocator *newrlocator)
{
SMgrRelation dstrel;
- dstrel = smgropen(*newrnode, rel->rd_backend);
+ dstrel = smgropen(*newrlocator, rel->rd_backend);
/*
* Since we copy the file directly without looking at the shared buffers,
@@ -640,10 +640,10 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
* Create and copy all forks of the relation, and schedule unlinking of
* old physical files.
*
- * NOTE: any conflict in relfilenode value will be caught in
+ * NOTE: any conflict in relfilelocator value will be caught in
* RelationCreateStorage().
*/
- RelationCreateStorage(*newrnode, rel->rd_rel->relpersistence, true);
+ RelationCreateStorage(*newrlocator, rel->rd_rel->relpersistence, true);
/* copy main fork */
RelationCopyStorage(RelationGetSmgr(rel), dstrel, MAIN_FORKNUM,
@@ -664,7 +664,7 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
if (RelationIsPermanent(rel) ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
- log_smgrcreate(newrnode, forkNum);
+ log_smgrcreate(newrlocator, forkNum);
RelationCopyStorage(RelationGetSmgr(rel), dstrel, forkNum,
rel->rd_rel->relpersistence);
}
@@ -2569,7 +2569,7 @@ static const TableAmRoutine heapam_methods = {
.tuple_satisfies_snapshot = heapam_tuple_satisfies_snapshot,
.index_delete_tuples = heap_index_delete_tuples,
- .relation_set_new_filenode = heapam_relation_set_new_filenode,
+ .relation_set_new_filelocator = heapam_relation_set_new_filelocator,
.relation_nontransactional_truncate = heapam_relation_nontransactional_truncate,
.relation_copy_data = heapam_relation_copy_data,
.relation_copy_for_cluster = heapam_relation_copy_for_cluster,
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 2a53826..197f06b 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -318,7 +318,7 @@ end_heap_rewrite(RewriteState state)
if (state->rs_buffer_valid)
{
if (RelationNeedsWAL(state->rs_new_rel))
- log_newpage(&state->rs_new_rel->rd_node,
+ log_newpage(&state->rs_new_rel->rd_locator,
MAIN_FORKNUM,
state->rs_blockno,
state->rs_buffer,
@@ -679,7 +679,7 @@ raw_heap_insert(RewriteState state, HeapTuple tup)
/* XLOG stuff */
if (RelationNeedsWAL(state->rs_new_rel))
- log_newpage(&state->rs_new_rel->rd_node,
+ log_newpage(&state->rs_new_rel->rd_locator,
MAIN_FORKNUM,
state->rs_blockno,
page,
@@ -742,7 +742,7 @@ raw_heap_insert(RewriteState state, HeapTuple tup)
* When doing logical decoding - which relies on using cmin/cmax of catalog
* tuples, via xl_heap_new_cid records - heap rewrites have to log enough
* information to allow the decoding backend to update its internal mapping
- * of (relfilenode,ctid) => (cmin, cmax) to be correct for the rewritten heap.
+ * of (relfilelocator,ctid) => (cmin, cmax) to be correct for the rewritten heap.
*
* For that, every time we find a tuple that's been modified in a catalog
* relation within the xmin horizon of any decoding slot, we log a mapping
@@ -1080,9 +1080,9 @@ logical_rewrite_heap_tuple(RewriteState state, ItemPointerData old_tid,
return;
/* fill out mapping information */
- map.old_node = state->rs_old_rel->rd_node;
+ map.old_locator = state->rs_old_rel->rd_locator;
map.old_tid = old_tid;
- map.new_node = state->rs_new_rel->rd_node;
+ map.new_locator = state->rs_new_rel->rd_locator;
map.new_tid = new_tid;
/* ---
diff --git a/src/backend/access/heap/visibilitymap.c b/src/backend/access/heap/visibilitymap.c
index e09f25a..ed72eb7 100644
--- a/src/backend/access/heap/visibilitymap.c
+++ b/src/backend/access/heap/visibilitymap.c
@@ -283,7 +283,7 @@ visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf,
if (XLogRecPtrIsInvalid(recptr))
{
Assert(!InRecovery);
- recptr = log_heap_visible(rel->rd_node, heapBuf, vmBuf,
+ recptr = log_heap_visible(rel->rd_locator, heapBuf, vmBuf,
cutoff_xid, flags);
/*
@@ -668,7 +668,7 @@ vm_extend(Relation rel, BlockNumber vm_nblocks)
* to keep checking for creation or extension of the file, which happens
* infrequently.
*/
- CacheInvalidateSmgr(reln->smgr_rnode);
+ CacheInvalidateSmgr(reln->smgr_rlocator);
UnlockRelationForExtension(rel, ExclusiveLock);
}
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 20adb60..8b96708 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -836,7 +836,7 @@ _bt_log_reuse_page(Relation rel, BlockNumber blkno, FullTransactionId safexid)
*/
/* XLOG stuff */
- xlrec_reuse.node = rel->rd_node;
+ xlrec_reuse.locator = rel->rd_locator;
xlrec_reuse.block = blkno;
xlrec_reuse.latestRemovedFullXid = safexid;
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 9b730f3..b52eca8 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -166,7 +166,7 @@ btbuildempty(Relation index)
PageSetChecksumInplace(metapage, BTREE_METAPAGE);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, BTREE_METAPAGE,
(char *) metapage, true);
- log_newpage(&RelationGetSmgr(index)->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&RelationGetSmgr(index)->smgr_rlocator.locator, INIT_FORKNUM,
BTREE_METAPAGE, metapage, true);
/*
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index 9f60fa9..bd1685c 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -647,7 +647,7 @@ _bt_blwritepage(BTWriteState *wstate, Page page, BlockNumber blkno)
if (wstate->btws_use_wal)
{
/* We use the XLOG_FPI record type for this */
- log_newpage(&wstate->index->rd_node, MAIN_FORKNUM, blkno, page, true);
+ log_newpage(&wstate->index->rd_locator, MAIN_FORKNUM, blkno, page, true);
}
/*
diff --git a/src/backend/access/nbtree/nbtxlog.c b/src/backend/access/nbtree/nbtxlog.c
index f9186ca..ad489e3 100644
--- a/src/backend/access/nbtree/nbtxlog.c
+++ b/src/backend/access/nbtree/nbtxlog.c
@@ -664,11 +664,11 @@ btree_xlog_delete(XLogReaderState *record)
*/
if (InHotStandby)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, NULL);
- ResolveRecoveryConflictWithSnapshot(xlrec->latestRemovedXid, rnode);
+ ResolveRecoveryConflictWithSnapshot(xlrec->latestRemovedXid, rlocator);
}
/*
@@ -1006,7 +1006,7 @@ btree_xlog_reuse_page(XLogReaderState *record)
if (InHotStandby)
ResolveRecoveryConflictWithSnapshotFullXid(xlrec->latestRemovedFullXid,
- xlrec->node);
+ xlrec->locator);
}
void
diff --git a/src/backend/access/rmgrdesc/genericdesc.c b/src/backend/access/rmgrdesc/genericdesc.c
index 877beb5..d8509b8 100644
--- a/src/backend/access/rmgrdesc/genericdesc.c
+++ b/src/backend/access/rmgrdesc/genericdesc.c
@@ -15,7 +15,7 @@
#include "access/generic_xlog.h"
#include "lib/stringinfo.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* Description of generic xlog record: write page regions that this record
diff --git a/src/backend/access/rmgrdesc/gindesc.c b/src/backend/access/rmgrdesc/gindesc.c
index 57f7bce..7d147ce 100644
--- a/src/backend/access/rmgrdesc/gindesc.c
+++ b/src/backend/access/rmgrdesc/gindesc.c
@@ -17,7 +17,7 @@
#include "access/ginxlog.h"
#include "access/xlogutils.h"
#include "lib/stringinfo.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
static void
desc_recompress_leaf(StringInfo buf, ginxlogRecompressDataLeaf *insertData)
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index d0c8e24..7dd3c1d 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -16,7 +16,7 @@
#include "access/gistxlog.h"
#include "lib/stringinfo.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
static void
out_gistxlogPageUpdate(StringInfo buf, gistxlogPageUpdate *xlrec)
@@ -27,8 +27,8 @@ static void
out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec)
{
appendStringInfo(buf, "rel %u/%u/%u; blk %u; latestRemovedXid %u:%u",
- xlrec->node.spcNode, xlrec->node.dbNode,
- xlrec->node.relNode, xlrec->block,
+ xlrec->locator.spcOid, xlrec->locator.dbOid,
+ xlrec->locator.relNumber, xlrec->block,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
XidFromFullTransactionId(xlrec->latestRemovedFullXid));
}
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 6238085..923d3bc 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -170,9 +170,9 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
xl_heap_new_cid *xlrec = (xl_heap_new_cid *) rec;
appendStringInfo(buf, "rel %u/%u/%u; tid %u/%u",
- xlrec->target_node.spcNode,
- xlrec->target_node.dbNode,
- xlrec->target_node.relNode,
+ xlrec->target_locator.spcOid,
+ xlrec->target_locator.dbOid,
+ xlrec->target_locator.relNumber,
ItemPointerGetBlockNumber(&(xlrec->target_tid)),
ItemPointerGetOffsetNumber(&(xlrec->target_tid)));
appendStringInfo(buf, "; cmin: %u, cmax: %u, combo: %u",
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index dfbbf4e..4843cd5 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -101,8 +101,8 @@ btree_desc(StringInfo buf, XLogReaderState *record)
xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
appendStringInfo(buf, "rel %u/%u/%u; latestRemovedXid %u:%u",
- xlrec->node.spcNode, xlrec->node.dbNode,
- xlrec->node.relNode,
+ xlrec->locator.spcOid, xlrec->locator.dbOid,
+ xlrec->locator.relNumber,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
XidFromFullTransactionId(xlrec->latestRemovedFullXid));
break;
diff --git a/src/backend/access/rmgrdesc/seqdesc.c b/src/backend/access/rmgrdesc/seqdesc.c
index d9b1e60..b3845f9 100644
--- a/src/backend/access/rmgrdesc/seqdesc.c
+++ b/src/backend/access/rmgrdesc/seqdesc.c
@@ -26,8 +26,8 @@ seq_desc(StringInfo buf, XLogReaderState *record)
if (info == XLOG_SEQ_LOG)
appendStringInfo(buf, "rel %u/%u/%u",
- xlrec->node.spcNode, xlrec->node.dbNode,
- xlrec->node.relNode);
+ xlrec->locator.spcOid, xlrec->locator.dbOid,
+ xlrec->locator.relNumber);
}
const char *
diff --git a/src/backend/access/rmgrdesc/smgrdesc.c b/src/backend/access/rmgrdesc/smgrdesc.c
index 7547813..e0ee8a0 100644
--- a/src/backend/access/rmgrdesc/smgrdesc.c
+++ b/src/backend/access/rmgrdesc/smgrdesc.c
@@ -26,7 +26,7 @@ smgr_desc(StringInfo buf, XLogReaderState *record)
if (info == XLOG_SMGR_CREATE)
{
xl_smgr_create *xlrec = (xl_smgr_create *) rec;
- char *path = relpathperm(xlrec->rnode, xlrec->forkNum);
+ char *path = relpathperm(xlrec->rlocator, xlrec->forkNum);
appendStringInfoString(buf, path);
pfree(path);
@@ -34,7 +34,7 @@ smgr_desc(StringInfo buf, XLogReaderState *record)
else if (info == XLOG_SMGR_TRUNCATE)
{
xl_smgr_truncate *xlrec = (xl_smgr_truncate *) rec;
- char *path = relpathperm(xlrec->rnode, MAIN_FORKNUM);
+ char *path = relpathperm(xlrec->rlocator, MAIN_FORKNUM);
appendStringInfo(buf, "%s to %u blocks flags %d", path,
xlrec->blkno, xlrec->flags);
diff --git a/src/backend/access/rmgrdesc/xactdesc.c b/src/backend/access/rmgrdesc/xactdesc.c
index 90b6ac2..39752cf 100644
--- a/src/backend/access/rmgrdesc/xactdesc.c
+++ b/src/backend/access/rmgrdesc/xactdesc.c
@@ -73,15 +73,15 @@ ParseCommitRecord(uint8 info, xl_xact_commit *xlrec, xl_xact_parsed_commit *pars
data += parsed->nsubxacts * sizeof(TransactionId);
}
- if (parsed->xinfo & XACT_XINFO_HAS_RELFILENODES)
+ if (parsed->xinfo & XACT_XINFO_HAS_RELFILELOCATORS)
{
- xl_xact_relfilenodes *xl_relfilenodes = (xl_xact_relfilenodes *) data;
+ xl_xact_relfilelocators *xl_rellocators = (xl_xact_relfilelocators *) data;
- parsed->nrels = xl_relfilenodes->nrels;
- parsed->xnodes = xl_relfilenodes->xnodes;
+ parsed->nrels = xl_rellocators->nrels;
+ parsed->xlocators = xl_rellocators->xlocators;
- data += MinSizeOfXactRelfilenodes;
- data += xl_relfilenodes->nrels * sizeof(RelFileNode);
+ data += MinSizeOfXactRelfileLocators;
+ data += xl_rellocators->nrels * sizeof(RelFileLocator);
}
if (parsed->xinfo & XACT_XINFO_HAS_DROPPED_STATS)
@@ -179,15 +179,15 @@ ParseAbortRecord(uint8 info, xl_xact_abort *xlrec, xl_xact_parsed_abort *parsed)
data += parsed->nsubxacts * sizeof(TransactionId);
}
- if (parsed->xinfo & XACT_XINFO_HAS_RELFILENODES)
+ if (parsed->xinfo & XACT_XINFO_HAS_RELFILELOCATORS)
{
- xl_xact_relfilenodes *xl_relfilenodes = (xl_xact_relfilenodes *) data;
+ xl_xact_relfilelocators *xl_rellocator = (xl_xact_relfilelocators *) data;
- parsed->nrels = xl_relfilenodes->nrels;
- parsed->xnodes = xl_relfilenodes->xnodes;
+ parsed->nrels = xl_rellocator->nrels;
+ parsed->xlocators = xl_rellocator->xlocators;
- data += MinSizeOfXactRelfilenodes;
- data += xl_relfilenodes->nrels * sizeof(RelFileNode);
+ data += MinSizeOfXactRelfileLocators;
+ data += xl_rellocator->nrels * sizeof(RelFileLocator);
}
if (parsed->xinfo & XACT_XINFO_HAS_DROPPED_STATS)
@@ -260,11 +260,11 @@ ParsePrepareRecord(uint8 info, xl_xact_prepare *xlrec, xl_xact_parsed_prepare *p
parsed->subxacts = (TransactionId *) bufptr;
bufptr += MAXALIGN(xlrec->nsubxacts * sizeof(TransactionId));
- parsed->xnodes = (RelFileNode *) bufptr;
- bufptr += MAXALIGN(xlrec->ncommitrels * sizeof(RelFileNode));
+ parsed->xlocators = (RelFileLocator *) bufptr;
+ bufptr += MAXALIGN(xlrec->ncommitrels * sizeof(RelFileLocator));
- parsed->abortnodes = (RelFileNode *) bufptr;
- bufptr += MAXALIGN(xlrec->nabortrels * sizeof(RelFileNode));
+ parsed->abortlocators = (RelFileLocator *) bufptr;
+ bufptr += MAXALIGN(xlrec->nabortrels * sizeof(RelFileLocator));
parsed->stats = (xl_xact_stats_item *) bufptr;
bufptr += MAXALIGN(xlrec->ncommitstats * sizeof(xl_xact_stats_item));
@@ -278,7 +278,7 @@ ParsePrepareRecord(uint8 info, xl_xact_prepare *xlrec, xl_xact_parsed_prepare *p
static void
xact_desc_relations(StringInfo buf, char *label, int nrels,
- RelFileNode *xnodes)
+ RelFileLocator *xlocators)
{
int i;
@@ -287,7 +287,7 @@ xact_desc_relations(StringInfo buf, char *label, int nrels,
appendStringInfo(buf, "; %s:", label);
for (i = 0; i < nrels; i++)
{
- char *path = relpathperm(xnodes[i], MAIN_FORKNUM);
+ char *path = relpathperm(xlocators[i], MAIN_FORKNUM);
appendStringInfo(buf, " %s", path);
pfree(path);
@@ -340,7 +340,7 @@ xact_desc_commit(StringInfo buf, uint8 info, xl_xact_commit *xlrec, RepOriginId
appendStringInfoString(buf, timestamptz_to_str(xlrec->xact_time));
- xact_desc_relations(buf, "rels", parsed.nrels, parsed.xnodes);
+ xact_desc_relations(buf, "rels", parsed.nrels, parsed.xlocators);
xact_desc_subxacts(buf, parsed.nsubxacts, parsed.subxacts);
xact_desc_stats(buf, "", parsed.nstats, parsed.stats);
@@ -376,7 +376,7 @@ xact_desc_abort(StringInfo buf, uint8 info, xl_xact_abort *xlrec, RepOriginId or
appendStringInfoString(buf, timestamptz_to_str(xlrec->xact_time));
- xact_desc_relations(buf, "rels", parsed.nrels, parsed.xnodes);
+ xact_desc_relations(buf, "rels", parsed.nrels, parsed.xlocators);
xact_desc_subxacts(buf, parsed.nsubxacts, parsed.subxacts);
if (parsed.xinfo & XACT_XINFO_HAS_ORIGIN)
@@ -400,9 +400,9 @@ xact_desc_prepare(StringInfo buf, uint8 info, xl_xact_prepare *xlrec, RepOriginI
appendStringInfo(buf, "gid %s: ", parsed.twophase_gid);
appendStringInfoString(buf, timestamptz_to_str(parsed.xact_time));
- xact_desc_relations(buf, "rels(commit)", parsed.nrels, parsed.xnodes);
+ xact_desc_relations(buf, "rels(commit)", parsed.nrels, parsed.xlocators);
xact_desc_relations(buf, "rels(abort)", parsed.nabortrels,
- parsed.abortnodes);
+ parsed.abortlocators);
xact_desc_stats(buf, "commit ", parsed.nstats, parsed.stats);
xact_desc_stats(buf, "abort ", parsed.nabortstats, parsed.abortstats);
xact_desc_subxacts(buf, parsed.nsubxacts, parsed.subxacts);
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index fefc563..6fec485 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -219,12 +219,12 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
for (block_id = 0; block_id <= XLogRecMaxBlockId(record); block_id++)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blk;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blk, NULL))
+ &rlocator, &forknum, &blk, NULL))
continue;
if (detailed_format)
@@ -239,7 +239,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfo(buf,
"blkref #%d: rel %u/%u/%u fork %s blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
blk);
@@ -299,7 +299,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfo(buf,
", blkref #%d: rel %u/%u/%u fork %s blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
blk);
}
@@ -308,7 +308,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfo(buf,
", blkref #%d: rel %u/%u/%u blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
blk);
}
diff --git a/src/backend/access/spgist/spginsert.c b/src/backend/access/spgist/spginsert.c
index bfb7404..c6821b5 100644
--- a/src/backend/access/spgist/spginsert.c
+++ b/src/backend/access/spgist/spginsert.c
@@ -171,7 +171,7 @@ spgbuildempty(Relation index)
PageSetChecksumInplace(page, SPGIST_METAPAGE_BLKNO);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, SPGIST_METAPAGE_BLKNO,
(char *) page, true);
- log_newpage(&(RelationGetSmgr(index))->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&(RelationGetSmgr(index))->smgr_rlocator.locator, INIT_FORKNUM,
SPGIST_METAPAGE_BLKNO, page, true);
/* Likewise for the root page. */
@@ -180,7 +180,7 @@ spgbuildempty(Relation index)
PageSetChecksumInplace(page, SPGIST_ROOT_BLKNO);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, SPGIST_ROOT_BLKNO,
(char *) page, true);
- log_newpage(&(RelationGetSmgr(index))->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&(RelationGetSmgr(index))->smgr_rlocator.locator, INIT_FORKNUM,
SPGIST_ROOT_BLKNO, page, true);
/* Likewise for the null-tuples root page. */
@@ -189,7 +189,7 @@ spgbuildempty(Relation index)
PageSetChecksumInplace(page, SPGIST_NULL_BLKNO);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, SPGIST_NULL_BLKNO,
(char *) page, true);
- log_newpage(&(RelationGetSmgr(index))->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&(RelationGetSmgr(index))->smgr_rlocator.locator, INIT_FORKNUM,
SPGIST_NULL_BLKNO, page, true);
/*
diff --git a/src/backend/access/spgist/spgxlog.c b/src/backend/access/spgist/spgxlog.c
index b500b2c..4c9f402 100644
--- a/src/backend/access/spgist/spgxlog.c
+++ b/src/backend/access/spgist/spgxlog.c
@@ -877,11 +877,11 @@ spgRedoVacuumRedirect(XLogReaderState *record)
{
if (TransactionIdIsValid(xldata->newestRedirectXid))
{
- RelFileNode node;
+ RelFileLocator locator;
- XLogRecGetBlockTag(record, 0, &node, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, &locator, NULL, NULL);
ResolveRecoveryConflictWithSnapshot(xldata->newestRedirectXid,
- node);
+ locator);
}
}
diff --git a/src/backend/access/table/tableamapi.c b/src/backend/access/table/tableamapi.c
index 76df798..873d961 100644
--- a/src/backend/access/table/tableamapi.c
+++ b/src/backend/access/table/tableamapi.c
@@ -82,7 +82,7 @@ GetTableAmRoutine(Oid amhandler)
Assert(routine->tuple_update != NULL);
Assert(routine->tuple_lock != NULL);
- Assert(routine->relation_set_new_filenode != NULL);
+ Assert(routine->relation_set_new_filelocator != NULL);
Assert(routine->relation_nontransactional_truncate != NULL);
Assert(routine->relation_copy_data != NULL);
Assert(routine->relation_copy_for_cluster != NULL);
diff --git a/src/backend/access/transam/README b/src/backend/access/transam/README
index 1edc818..565f994 100644
--- a/src/backend/access/transam/README
+++ b/src/backend/access/transam/README
@@ -557,7 +557,7 @@ void XLogRegisterBuffer(uint8 block_id, Buffer buf, uint8 flags);
XLogRegisterBuffer adds information about a data block to the WAL record.
block_id is an arbitrary number used to identify this page reference in
the redo routine. The information needed to re-find the page at redo -
- relfilenode, fork, and block number - are included in the WAL record.
+ relfilenumber, fork, and block number - are included in the WAL record.
XLogInsert will automatically include a full copy of the page contents, if
this is the first modification of the buffer since the last checkpoint.
@@ -692,7 +692,7 @@ by having database restart search for files that don't have any committed
entry in pg_class, but that currently isn't done because of the possibility
of deleting data that is useful for forensic analysis of the crash.
Orphan files are harmless --- at worst they waste a bit of disk space ---
-because we check for on-disk collisions when allocating new relfilenode
+because we check for on-disk collisions when allocating new relfilenumber
OIDs. So cleaning up isn't really necessary.
3. Deleting a table, which requires an unlink() that could fail.
@@ -725,10 +725,10 @@ then restart recovery. This is part of the reason for not writing a WAL
entry until we've successfully done the original action.
-Skipping WAL for New RelFileNode
+Skipping WAL for New RelFileLocator
--------------------------------
-Under wal_level=minimal, if a change modifies a relfilenode that ROLLBACK
+Under wal_level=minimal, if a change modifies a relfilenumber that ROLLBACK
would unlink, in-tree access methods write no WAL for that change. Code that
writes WAL without calling RelationNeedsWAL() must check for this case. This
skipping is mandatory. If a WAL-writing change preceded a WAL-skipping change
@@ -748,9 +748,9 @@ unconditionally for permanent relations. Under these approaches, the access
method callbacks must not call functions that react to RelationNeedsWAL().
This applies only to WAL records whose replay would modify bytes stored in the
-new relfilenode. It does not apply to other records about the relfilenode,
+new relfilenumber. It does not apply to other records about the relfilenumber,
such as XLOG_SMGR_CREATE. Because it operates at the level of individual
-relfilenodes, RelationNeedsWAL() can differ for tightly-coupled relations.
+relfilenumbers, RelationNeedsWAL() can differ for tightly-coupled relations.
Consider "CREATE TABLE t (); BEGIN; ALTER TABLE t ADD c text; ..." in which
ALTER TABLE adds a TOAST relation. The TOAST relation will skip WAL, while
the table owning it will not. ALTER TABLE SET TABLESPACE will cause a table
@@ -860,7 +860,7 @@ Changes to a temp table are not WAL-logged, hence could reach disk in
advance of T1's commit, but we don't care since temp table contents don't
survive crashes anyway.
-Database writes that skip WAL for new relfilenodes are also safe. In these
+Database writes that skip WAL for new relfilenumbers are also safe. In these
cases it's entirely possible for the data to reach disk before T1's commit,
because T1 will fsync it down to disk without any sort of interlock. However,
all these paths are designed to write data that no other transaction can see
diff --git a/src/backend/access/transam/README.parallel b/src/backend/access/transam/README.parallel
index 99c588d..e486bff 100644
--- a/src/backend/access/transam/README.parallel
+++ b/src/backend/access/transam/README.parallel
@@ -126,7 +126,7 @@ worker. This includes:
an index that is currently being rebuilt.
- Active relmapper.c mapping state. This is needed to allow consistent
- answers when fetching the current relfilenode for relation oids of
+ answers when fetching the current relfilenumber for relation oids of
mapped relations.
To prevent unprincipled deadlocks when running in parallel mode, this code
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 75551f6..41b31c5 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -204,7 +204,7 @@ static void RecordTransactionCommitPrepared(TransactionId xid,
int nchildren,
TransactionId *children,
int nrels,
- RelFileNode *rels,
+ RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
int ninvalmsgs,
@@ -215,7 +215,7 @@ static void RecordTransactionAbortPrepared(TransactionId xid,
int nchildren,
TransactionId *children,
int nrels,
- RelFileNode *rels,
+ RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
const char *gid);
@@ -951,8 +951,8 @@ TwoPhaseGetDummyProc(TransactionId xid, bool lock_held)
*
* 1. TwoPhaseFileHeader
* 2. TransactionId[] (subtransactions)
- * 3. RelFileNode[] (files to be deleted at commit)
- * 4. RelFileNode[] (files to be deleted at abort)
+ * 3. RelFileLocator[] (files to be deleted at commit)
+ * 4. RelFileLocator[] (files to be deleted at abort)
* 5. SharedInvalidationMessage[] (inval messages to be sent at commit)
* 6. TwoPhaseRecordOnDisk
* 7. ...
@@ -1047,8 +1047,8 @@ StartPrepare(GlobalTransaction gxact)
TransactionId xid = gxact->xid;
TwoPhaseFileHeader hdr;
TransactionId *children;
- RelFileNode *commitrels;
- RelFileNode *abortrels;
+ RelFileLocator *commitrels;
+ RelFileLocator *abortrels;
xl_xact_stats_item *abortstats = NULL;
xl_xact_stats_item *commitstats = NULL;
SharedInvalidationMessage *invalmsgs;
@@ -1102,12 +1102,12 @@ StartPrepare(GlobalTransaction gxact)
}
if (hdr.ncommitrels > 0)
{
- save_state_data(commitrels, hdr.ncommitrels * sizeof(RelFileNode));
+ save_state_data(commitrels, hdr.ncommitrels * sizeof(RelFileLocator));
pfree(commitrels);
}
if (hdr.nabortrels > 0)
{
- save_state_data(abortrels, hdr.nabortrels * sizeof(RelFileNode));
+ save_state_data(abortrels, hdr.nabortrels * sizeof(RelFileLocator));
pfree(abortrels);
}
if (hdr.ncommitstats > 0)
@@ -1489,9 +1489,9 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
TwoPhaseFileHeader *hdr;
TransactionId latestXid;
TransactionId *children;
- RelFileNode *commitrels;
- RelFileNode *abortrels;
- RelFileNode *delrels;
+ RelFileLocator *commitrels;
+ RelFileLocator *abortrels;
+ RelFileLocator *delrels;
int ndelrels;
xl_xact_stats_item *commitstats;
xl_xact_stats_item *abortstats;
@@ -1525,10 +1525,10 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
bufptr += MAXALIGN(hdr->gidlen);
children = (TransactionId *) bufptr;
bufptr += MAXALIGN(hdr->nsubxacts * sizeof(TransactionId));
- commitrels = (RelFileNode *) bufptr;
- bufptr += MAXALIGN(hdr->ncommitrels * sizeof(RelFileNode));
- abortrels = (RelFileNode *) bufptr;
- bufptr += MAXALIGN(hdr->nabortrels * sizeof(RelFileNode));
+ commitrels = (RelFileLocator *) bufptr;
+ bufptr += MAXALIGN(hdr->ncommitrels * sizeof(RelFileLocator));
+ abortrels = (RelFileLocator *) bufptr;
+ bufptr += MAXALIGN(hdr->nabortrels * sizeof(RelFileLocator));
commitstats = (xl_xact_stats_item *) bufptr;
bufptr += MAXALIGN(hdr->ncommitstats * sizeof(xl_xact_stats_item));
abortstats = (xl_xact_stats_item *) bufptr;
@@ -2100,8 +2100,8 @@ RecoverPreparedTransactions(void)
bufptr += MAXALIGN(hdr->gidlen);
subxids = (TransactionId *) bufptr;
bufptr += MAXALIGN(hdr->nsubxacts * sizeof(TransactionId));
- bufptr += MAXALIGN(hdr->ncommitrels * sizeof(RelFileNode));
- bufptr += MAXALIGN(hdr->nabortrels * sizeof(RelFileNode));
+ bufptr += MAXALIGN(hdr->ncommitrels * sizeof(RelFileLocator));
+ bufptr += MAXALIGN(hdr->nabortrels * sizeof(RelFileLocator));
bufptr += MAXALIGN(hdr->ncommitstats * sizeof(xl_xact_stats_item));
bufptr += MAXALIGN(hdr->nabortstats * sizeof(xl_xact_stats_item));
bufptr += MAXALIGN(hdr->ninvalmsgs * sizeof(SharedInvalidationMessage));
@@ -2285,7 +2285,7 @@ RecordTransactionCommitPrepared(TransactionId xid,
int nchildren,
TransactionId *children,
int nrels,
- RelFileNode *rels,
+ RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
int ninvalmsgs,
@@ -2383,7 +2383,7 @@ RecordTransactionAbortPrepared(TransactionId xid,
int nchildren,
TransactionId *children,
int nrels,
- RelFileNode *rels,
+ RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
const char *gid)
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 748120a..849a7ce 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -521,7 +521,7 @@ ForceTransactionIdLimitUpdate(void)
* wide, counter wraparound will occur eventually, and therefore it is unwise
* to assume they are unique unless precautions are taken to make them so.
* Hence, this routine should generally not be used directly. The only direct
- * callers should be GetNewOidWithIndex() and GetNewRelFileNode() in
+ * callers should be GetNewOidWithIndex() and GetNewRelFileNumber() in
* catalog/catalog.c.
*/
Oid
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index bd60b55..116de11 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -1282,7 +1282,7 @@ RecordTransactionCommit(void)
bool markXidCommitted = TransactionIdIsValid(xid);
TransactionId latestXid = InvalidTransactionId;
int nrels;
- RelFileNode *rels;
+ RelFileLocator *rels;
int nchildren;
TransactionId *children;
int ndroppedstats = 0;
@@ -1705,7 +1705,7 @@ RecordTransactionAbort(bool isSubXact)
TransactionId xid = GetCurrentTransactionIdIfAny();
TransactionId latestXid;
int nrels;
- RelFileNode *rels;
+ RelFileLocator *rels;
int ndroppedstats = 0;
xl_xact_stats_item *droppedstats = NULL;
int nchildren;
@@ -5586,7 +5586,7 @@ xactGetCommittedChildren(TransactionId **ptr)
XLogRecPtr
XactLogCommitRecord(TimestampTz commit_time,
int nsubxacts, TransactionId *subxacts,
- int nrels, RelFileNode *rels,
+ int nrels, RelFileLocator *rels,
int ndroppedstats, xl_xact_stats_item *droppedstats,
int nmsgs, SharedInvalidationMessage *msgs,
bool relcacheInval,
@@ -5597,7 +5597,7 @@ XactLogCommitRecord(TimestampTz commit_time,
xl_xact_xinfo xl_xinfo;
xl_xact_dbinfo xl_dbinfo;
xl_xact_subxacts xl_subxacts;
- xl_xact_relfilenodes xl_relfilenodes;
+ xl_xact_relfilelocators xl_relfilelocators;
xl_xact_stats_items xl_dropped_stats;
xl_xact_invals xl_invals;
xl_xact_twophase xl_twophase;
@@ -5651,8 +5651,8 @@ XactLogCommitRecord(TimestampTz commit_time,
if (nrels > 0)
{
- xl_xinfo.xinfo |= XACT_XINFO_HAS_RELFILENODES;
- xl_relfilenodes.nrels = nrels;
+ xl_xinfo.xinfo |= XACT_XINFO_HAS_RELFILELOCATORS;
+ xl_relfilelocators.nrels = nrels;
info |= XLR_SPECIAL_REL_UPDATE;
}
@@ -5710,12 +5710,12 @@ XactLogCommitRecord(TimestampTz commit_time,
nsubxacts * sizeof(TransactionId));
}
- if (xl_xinfo.xinfo & XACT_XINFO_HAS_RELFILENODES)
+ if (xl_xinfo.xinfo & XACT_XINFO_HAS_RELFILELOCATORS)
{
- XLogRegisterData((char *) (&xl_relfilenodes),
- MinSizeOfXactRelfilenodes);
+ XLogRegisterData((char *) (&xl_relfilelocators),
+ MinSizeOfXactRelfileLocators);
XLogRegisterData((char *) rels,
- nrels * sizeof(RelFileNode));
+ nrels * sizeof(RelFileLocator));
}
if (xl_xinfo.xinfo & XACT_XINFO_HAS_DROPPED_STATS)
@@ -5758,7 +5758,7 @@ XactLogCommitRecord(TimestampTz commit_time,
XLogRecPtr
XactLogAbortRecord(TimestampTz abort_time,
int nsubxacts, TransactionId *subxacts,
- int nrels, RelFileNode *rels,
+ int nrels, RelFileLocator *rels,
int ndroppedstats, xl_xact_stats_item *droppedstats,
int xactflags, TransactionId twophase_xid,
const char *twophase_gid)
@@ -5766,7 +5766,7 @@ XactLogAbortRecord(TimestampTz abort_time,
xl_xact_abort xlrec;
xl_xact_xinfo xl_xinfo;
xl_xact_subxacts xl_subxacts;
- xl_xact_relfilenodes xl_relfilenodes;
+ xl_xact_relfilelocators xl_relfilelocators;
xl_xact_stats_items xl_dropped_stats;
xl_xact_twophase xl_twophase;
xl_xact_dbinfo xl_dbinfo;
@@ -5800,8 +5800,8 @@ XactLogAbortRecord(TimestampTz abort_time,
if (nrels > 0)
{
- xl_xinfo.xinfo |= XACT_XINFO_HAS_RELFILENODES;
- xl_relfilenodes.nrels = nrels;
+ xl_xinfo.xinfo |= XACT_XINFO_HAS_RELFILELOCATORS;
+ xl_relfilelocators.nrels = nrels;
info |= XLR_SPECIAL_REL_UPDATE;
}
@@ -5864,12 +5864,12 @@ XactLogAbortRecord(TimestampTz abort_time,
nsubxacts * sizeof(TransactionId));
}
- if (xl_xinfo.xinfo & XACT_XINFO_HAS_RELFILENODES)
+ if (xl_xinfo.xinfo & XACT_XINFO_HAS_RELFILELOCATORS)
{
- XLogRegisterData((char *) (&xl_relfilenodes),
- MinSizeOfXactRelfilenodes);
+ XLogRegisterData((char *) (&xl_relfilelocators),
+ MinSizeOfXactRelfileLocators);
XLogRegisterData((char *) rels,
- nrels * sizeof(RelFileNode));
+ nrels * sizeof(RelFileLocator));
}
if (xl_xinfo.xinfo & XACT_XINFO_HAS_DROPPED_STATS)
@@ -6010,7 +6010,7 @@ xact_redo_commit(xl_xact_parsed_commit *parsed,
XLogFlush(lsn);
/* Make sure files supposed to be dropped are dropped */
- DropRelationFiles(parsed->xnodes, parsed->nrels, true);
+ DropRelationFiles(parsed->xlocators, parsed->nrels, true);
}
if (parsed->nstats > 0)
@@ -6121,7 +6121,7 @@ xact_redo_abort(xl_xact_parsed_abort *parsed, TransactionId xid,
*/
XLogFlush(lsn);
- DropRelationFiles(parsed->xnodes, parsed->nrels, true);
+ DropRelationFiles(parsed->xlocators, parsed->nrels, true);
}
if (parsed->nstats > 0)
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index 2ce9be2..ec27d36 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -70,7 +70,7 @@ typedef struct
{
bool in_use; /* is this slot in use? */
uint8 flags; /* REGBUF_* flags */
- RelFileNode rnode; /* identifies the relation and block */
+ RelFileLocator rlocator; /* identifies the relation and block */
ForkNumber forkno;
BlockNumber block;
Page page; /* page content */
@@ -257,7 +257,7 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
regbuf = &registered_buffers[block_id];
- BufferGetTag(buffer, &regbuf->rnode, &regbuf->forkno, &regbuf->block);
+ BufferGetTag(buffer, &regbuf->rlocator, &regbuf->forkno, &regbuf->block);
regbuf->page = BufferGetPage(buffer);
regbuf->flags = flags;
regbuf->rdata_tail = (XLogRecData *) &regbuf->rdata_head;
@@ -278,7 +278,7 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
if (i == block_id || !regbuf_old->in_use)
continue;
- Assert(!RelFileNodeEquals(regbuf_old->rnode, regbuf->rnode) ||
+ Assert(!RelFileLocatorEquals(regbuf_old->rlocator, regbuf->rlocator) ||
regbuf_old->forkno != regbuf->forkno ||
regbuf_old->block != regbuf->block);
}
@@ -293,7 +293,7 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
* shared buffer pool (i.e. when you don't have a Buffer for it).
*/
void
-XLogRegisterBlock(uint8 block_id, RelFileNode *rnode, ForkNumber forknum,
+XLogRegisterBlock(uint8 block_id, RelFileLocator *rlocator, ForkNumber forknum,
BlockNumber blknum, Page page, uint8 flags)
{
registered_buffer *regbuf;
@@ -308,7 +308,7 @@ XLogRegisterBlock(uint8 block_id, RelFileNode *rnode, ForkNumber forknum,
regbuf = &registered_buffers[block_id];
- regbuf->rnode = *rnode;
+ regbuf->rlocator = *rlocator;
regbuf->forkno = forknum;
regbuf->block = blknum;
regbuf->page = page;
@@ -331,7 +331,7 @@ XLogRegisterBlock(uint8 block_id, RelFileNode *rnode, ForkNumber forknum,
if (i == block_id || !regbuf_old->in_use)
continue;
- Assert(!RelFileNodeEquals(regbuf_old->rnode, regbuf->rnode) ||
+ Assert(!RelFileLocatorEquals(regbuf_old->rlocator, regbuf->rlocator) ||
regbuf_old->forkno != regbuf->forkno ||
regbuf_old->block != regbuf->block);
}
@@ -768,7 +768,7 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
rdt_datas_last = regbuf->rdata_tail;
}
- if (prev_regbuf && RelFileNodeEquals(regbuf->rnode, prev_regbuf->rnode))
+ if (prev_regbuf && RelFileLocatorEquals(regbuf->rlocator, prev_regbuf->rlocator))
{
samerel = true;
bkpb.fork_flags |= BKPBLOCK_SAME_REL;
@@ -793,8 +793,8 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
}
if (!samerel)
{
- memcpy(scratch, &regbuf->rnode, sizeof(RelFileNode));
- scratch += sizeof(RelFileNode);
+ memcpy(scratch, &regbuf->rlocator, sizeof(RelFileLocator));
+ scratch += sizeof(RelFileLocator);
}
memcpy(scratch, &regbuf->block, sizeof(BlockNumber));
scratch += sizeof(BlockNumber);
@@ -1031,7 +1031,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
int flags = 0;
PGAlignedBlock copied_buffer;
char *origdata = (char *) BufferGetBlock(buffer);
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forkno;
BlockNumber blkno;
@@ -1058,8 +1058,8 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
if (buffer_std)
flags |= REGBUF_STANDARD;
- BufferGetTag(buffer, &rnode, &forkno, &blkno);
- XLogRegisterBlock(0, &rnode, forkno, blkno, copied_buffer.data, flags);
+ BufferGetTag(buffer, &rlocator, &forkno, &blkno);
+ XLogRegisterBlock(0, &rlocator, forkno, blkno, copied_buffer.data, flags);
recptr = XLogInsert(RM_XLOG_ID, XLOG_FPI_FOR_HINT);
}
@@ -1080,7 +1080,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
* the unused space to be left out from the WAL record, making it smaller.
*/
XLogRecPtr
-log_newpage(RelFileNode *rnode, ForkNumber forkNum, BlockNumber blkno,
+log_newpage(RelFileLocator *rlocator, ForkNumber forkNum, BlockNumber blkno,
Page page, bool page_std)
{
int flags;
@@ -1091,7 +1091,7 @@ log_newpage(RelFileNode *rnode, ForkNumber forkNum, BlockNumber blkno,
flags |= REGBUF_STANDARD;
XLogBeginInsert();
- XLogRegisterBlock(0, rnode, forkNum, blkno, page, flags);
+ XLogRegisterBlock(0, rlocator, forkNum, blkno, page, flags);
recptr = XLogInsert(RM_XLOG_ID, XLOG_FPI);
/*
@@ -1112,7 +1112,7 @@ log_newpage(RelFileNode *rnode, ForkNumber forkNum, BlockNumber blkno,
* because we can write multiple pages in a single WAL record.
*/
void
-log_newpages(RelFileNode *rnode, ForkNumber forkNum, int num_pages,
+log_newpages(RelFileLocator *rlocator, ForkNumber forkNum, int num_pages,
BlockNumber *blknos, Page *pages, bool page_std)
{
int flags;
@@ -1142,7 +1142,7 @@ log_newpages(RelFileNode *rnode, ForkNumber forkNum, int num_pages,
nbatch = 0;
while (nbatch < XLR_MAX_BLOCK_ID && i < num_pages)
{
- XLogRegisterBlock(nbatch, rnode, forkNum, blknos[i], pages[i], flags);
+ XLogRegisterBlock(nbatch, rlocator, forkNum, blknos[i], pages[i], flags);
i++;
nbatch++;
}
@@ -1177,16 +1177,16 @@ XLogRecPtr
log_newpage_buffer(Buffer buffer, bool page_std)
{
Page page = BufferGetPage(buffer);
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forkNum;
BlockNumber blkno;
/* Shared buffers should be modified in a critical section. */
Assert(CritSectionCount > 0);
- BufferGetTag(buffer, &rnode, &forkNum, &blkno);
+ BufferGetTag(buffer, &rlocator, &forkNum, &blkno);
- return log_newpage(&rnode, forkNum, blkno, page, page_std);
+ return log_newpage(&rlocator, forkNum, blkno, page, page_std);
}
/*
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index 959e409..d1662f3 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -138,7 +138,7 @@ struct XLogPrefetcher
dlist_head filter_queue;
/* Book-keeping to avoid repeat prefetches. */
- RelFileNode recent_rnode[XLOGPREFETCHER_SEQ_WINDOW_SIZE];
+ RelFileLocator recent_rlocator[XLOGPREFETCHER_SEQ_WINDOW_SIZE];
BlockNumber recent_block[XLOGPREFETCHER_SEQ_WINDOW_SIZE];
int recent_idx;
@@ -161,7 +161,7 @@ struct XLogPrefetcher
*/
typedef struct XLogPrefetcherFilter
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
XLogRecPtr filter_until_replayed;
BlockNumber filter_from_block;
dlist_node link;
@@ -187,11 +187,11 @@ typedef struct XLogPrefetchStats
} XLogPrefetchStats;
static inline void XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher,
- RelFileNode rnode,
+ RelFileLocator rlocator,
BlockNumber blockno,
XLogRecPtr lsn);
static inline bool XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher,
- RelFileNode rnode,
+ RelFileLocator rlocator,
BlockNumber blockno);
static inline void XLogPrefetcherCompleteFilters(XLogPrefetcher *prefetcher,
XLogRecPtr replaying_lsn);
@@ -365,7 +365,7 @@ XLogPrefetcherAllocate(XLogReaderState *reader)
{
XLogPrefetcher *prefetcher;
static HASHCTL hash_table_ctl = {
- .keysize = sizeof(RelFileNode),
+ .keysize = sizeof(RelFileLocator),
.entrysize = sizeof(XLogPrefetcherFilter)
};
@@ -568,22 +568,22 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
xl_dbase_create_file_copy_rec *xlrec =
(xl_dbase_create_file_copy_rec *) record->main_data;
- RelFileNode rnode = {InvalidOid, xlrec->db_id, InvalidOid};
+ RelFileLocator rlocator = {InvalidOid, xlrec->db_id, InvalidOid};
/*
* Don't try to prefetch anything in this database until
* it has been created, or we might confuse the blocks of
- * different generations, if a database OID or relfilenode
- * is reused. It's also more efficient than discovering
- * that relations don't exist on disk yet with ENOENT
- * errors.
+ * different generations, if a database OID or
+ * relfilenumber is reused. It's also more efficient than
+ * discovering that relations don't exist on disk yet with
+ * ENOENT errors.
*/
- XLogPrefetcherAddFilter(prefetcher, rnode, 0, record->lsn);
+ XLogPrefetcherAddFilter(prefetcher, rlocator, 0, record->lsn);
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing prefetch in database %u until %X/%X is replayed due to raw file copy",
- rnode.dbNode,
+ rlocator.dbOid,
LSN_FORMAT_ARGS(record->lsn));
#endif
}
@@ -601,19 +601,19 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* Don't prefetch anything for this whole relation
* until it has been created. Otherwise we might
* confuse the blocks of different generations, if a
- * relfilenode is reused. This also avoids the need
+ * relfilenumber is reused. This also avoids the need
* to discover the problem via extra syscalls that
* report ENOENT.
*/
- XLogPrefetcherAddFilter(prefetcher, xlrec->rnode, 0,
+ XLogPrefetcherAddFilter(prefetcher, xlrec->rlocator, 0,
record->lsn);
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing prefetch in relation %u/%u/%u until %X/%X is replayed, which creates the relation",
- xlrec->rnode.spcNode,
- xlrec->rnode.dbNode,
- xlrec->rnode.relNode,
+ xlrec->rlocator.spcOid,
+ xlrec->rlocator.dbOid,
+ xlrec->rlocator.relNumber,
LSN_FORMAT_ARGS(record->lsn));
#endif
}
@@ -627,16 +627,16 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* Don't consider prefetching anything in the truncated
* range until the truncation has been performed.
*/
- XLogPrefetcherAddFilter(prefetcher, xlrec->rnode,
+ XLogPrefetcherAddFilter(prefetcher, xlrec->rlocator,
xlrec->blkno,
record->lsn);
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, which truncates the relation",
- xlrec->rnode.spcNode,
- xlrec->rnode.dbNode,
- xlrec->rnode.relNode,
+ xlrec->rlocator.spcOid,
+ xlrec->rlocator.dbOid,
+ xlrec->rlocator.relNumber,
xlrec->blkno,
LSN_FORMAT_ARGS(record->lsn));
#endif
@@ -688,7 +688,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
}
/* Should we skip prefetching this block due to a filter? */
- if (XLogPrefetcherIsFiltered(prefetcher, block->rnode, block->blkno))
+ if (XLogPrefetcherIsFiltered(prefetcher, block->rlocator, block->blkno))
{
XLogPrefetchIncrement(&SharedStats->skip_new);
return LRQ_NEXT_NO_IO;
@@ -698,7 +698,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
for (int i = 0; i < XLOGPREFETCHER_SEQ_WINDOW_SIZE; ++i)
{
if (block->blkno == prefetcher->recent_block[i] &&
- RelFileNodeEquals(block->rnode, prefetcher->recent_rnode[i]))
+ RelFileLocatorEquals(block->rlocator, prefetcher->recent_rlocator[i]))
{
/*
* XXX If we also remembered where it was, we could set
@@ -709,7 +709,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
return LRQ_NEXT_NO_IO;
}
}
- prefetcher->recent_rnode[prefetcher->recent_idx] = block->rnode;
+ prefetcher->recent_rlocator[prefetcher->recent_idx] = block->rlocator;
prefetcher->recent_block[prefetcher->recent_idx] = block->blkno;
prefetcher->recent_idx =
(prefetcher->recent_idx + 1) % XLOGPREFETCHER_SEQ_WINDOW_SIZE;
@@ -719,7 +719,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* same relation (with some scheme to handle invalidations
* safely), but for now we'll call smgropen() every time.
*/
- reln = smgropen(block->rnode, InvalidBackendId);
+ reln = smgropen(block->rlocator, InvalidBackendId);
/*
* If the relation file doesn't exist on disk, for example because
@@ -733,12 +733,12 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing all prefetch in relation %u/%u/%u until %X/%X is replayed, because the relation does not exist on disk",
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
LSN_FORMAT_ARGS(record->lsn));
#endif
- XLogPrefetcherAddFilter(prefetcher, block->rnode, 0,
+ XLogPrefetcherAddFilter(prefetcher, block->rlocator, 0,
record->lsn);
XLogPrefetchIncrement(&SharedStats->skip_new);
return LRQ_NEXT_NO_IO;
@@ -754,13 +754,13 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, because the relation is too small",
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
block->blkno,
LSN_FORMAT_ARGS(record->lsn));
#endif
- XLogPrefetcherAddFilter(prefetcher, block->rnode, block->blkno,
+ XLogPrefetcherAddFilter(prefetcher, block->rlocator, block->blkno,
record->lsn);
XLogPrefetchIncrement(&SharedStats->skip_new);
return LRQ_NEXT_NO_IO;
@@ -793,9 +793,9 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
*/
elog(ERROR,
"could not prefetch relation %u/%u/%u block %u",
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
block->blkno);
}
}
@@ -852,17 +852,17 @@ pg_stat_get_recovery_prefetch(PG_FUNCTION_ARGS)
}
/*
- * Don't prefetch any blocks >= 'blockno' from a given 'rnode', until 'lsn'
+ * Don't prefetch any blocks >= 'blockno' from a given 'rlocator', until 'lsn'
* has been replayed.
*/
static inline void
-XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher, RelFileNode rnode,
+XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
BlockNumber blockno, XLogRecPtr lsn)
{
XLogPrefetcherFilter *filter;
bool found;
- filter = hash_search(prefetcher->filter_table, &rnode, HASH_ENTER, &found);
+ filter = hash_search(prefetcher->filter_table, &rlocator, HASH_ENTER, &found);
if (!found)
{
/*
@@ -875,7 +875,7 @@ XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher, RelFileNode rnode,
else
{
/*
- * We were already filtering this rnode. Extend the filter's lifetime
+ * We were already filtering this rlocator. Extend the filter's lifetime
* to cover this WAL record, but leave the lower of the block numbers
* there because we don't want to have to track individual blocks.
*/
@@ -890,7 +890,7 @@ XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher, RelFileNode rnode,
* Have we replayed any records that caused us to begin filtering a block
* range? That means that relations should have been created, extended or
* dropped as required, so we can stop filtering out accesses to a given
- * relfilenode.
+ * relfilenumber.
*/
static inline void
XLogPrefetcherCompleteFilters(XLogPrefetcher *prefetcher, XLogRecPtr replaying_lsn)
@@ -913,7 +913,7 @@ XLogPrefetcherCompleteFilters(XLogPrefetcher *prefetcher, XLogRecPtr replaying_l
* Check if a given block should be skipped due to a filter.
*/
static inline bool
-XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileNode rnode,
+XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
BlockNumber blockno)
{
/*
@@ -925,13 +925,13 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileNode rnode,
XLogPrefetcherFilter *filter;
/* See if the block range is filtered. */
- filter = hash_search(prefetcher->filter_table, &rnode, HASH_FIND, NULL);
+ filter = hash_search(prefetcher->filter_table, &rlocator, HASH_FIND, NULL);
if (filter && filter->filter_from_block <= blockno)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
- rnode.spcNode, rnode.dbNode, rnode.relNode, blockno,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed),
filter->filter_from_block);
#endif
@@ -939,15 +939,15 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileNode rnode,
}
/* See if the whole database is filtered. */
- rnode.relNode = InvalidOid;
- rnode.spcNode = InvalidOid;
- filter = hash_search(prefetcher->filter_table, &rnode, HASH_FIND, NULL);
+ rlocator.relNumber = InvalidRelFileNumber;
+ rlocator.spcOid = InvalidOid;
+ filter = hash_search(prefetcher->filter_table, &rlocator, HASH_FIND, NULL);
if (filter)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
- rnode.spcNode, rnode.dbNode, rnode.relNode, blockno,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed));
#endif
return true;
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index cf5db23..f3dc4b7 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -1638,7 +1638,7 @@ DecodeXLogRecord(XLogReaderState *state,
char *out;
uint32 remaining;
uint32 datatotal;
- RelFileNode *rnode = NULL;
+ RelFileLocator *rlocator = NULL;
uint8 block_id;
decoded->header = *record;
@@ -1823,12 +1823,12 @@ DecodeXLogRecord(XLogReaderState *state,
}
if (!(fork_flags & BKPBLOCK_SAME_REL))
{
- COPY_HEADER_FIELD(&blk->rnode, sizeof(RelFileNode));
- rnode = &blk->rnode;
+ COPY_HEADER_FIELD(&blk->rlocator, sizeof(RelFileLocator));
+ rlocator = &blk->rlocator;
}
else
{
- if (rnode == NULL)
+ if (rlocator == NULL)
{
report_invalid_record(state,
"BKPBLOCK_SAME_REL set but no previous rel at %X/%X",
@@ -1836,7 +1836,7 @@ DecodeXLogRecord(XLogReaderState *state,
goto err;
}
- blk->rnode = *rnode;
+ blk->rlocator = *rlocator;
}
COPY_HEADER_FIELD(&blk->blkno, sizeof(BlockNumber));
}
@@ -1926,10 +1926,11 @@ err:
*/
void
XLogRecGetBlockTag(XLogReaderState *record, uint8 block_id,
- RelFileNode *rnode, ForkNumber *forknum, BlockNumber *blknum)
+ RelFileLocator *rlocator, ForkNumber *forknum,
+ BlockNumber *blknum)
{
- if (!XLogRecGetBlockTagExtended(record, block_id, rnode, forknum, blknum,
- NULL))
+ if (!XLogRecGetBlockTagExtended(record, block_id, rlocator, forknum,
+ blknum, NULL))
{
#ifndef FRONTEND
elog(ERROR, "failed to locate backup block with ID %d in WAL record",
@@ -1945,13 +1946,13 @@ XLogRecGetBlockTag(XLogReaderState *record, uint8 block_id,
* Returns information about the block that a block reference refers to,
* optionally including the buffer that the block may already be in.
*
- * If the WAL record contains a block reference with the given ID, *rnode,
+ * If the WAL record contains a block reference with the given ID, *rlocator,
* *forknum, *blknum and *prefetch_buffer are filled in (if not NULL), and
* returns true. Otherwise returns false.
*/
bool
XLogRecGetBlockTagExtended(XLogReaderState *record, uint8 block_id,
- RelFileNode *rnode, ForkNumber *forknum,
+ RelFileLocator *rlocator, ForkNumber *forknum,
BlockNumber *blknum,
Buffer *prefetch_buffer)
{
@@ -1961,8 +1962,8 @@ XLogRecGetBlockTagExtended(XLogReaderState *record, uint8 block_id,
return false;
bkpb = &record->record->blocks[block_id];
- if (rnode)
- *rnode = bkpb->rnode;
+ if (rlocator)
+ *rlocator = bkpb->rlocator;
if (forknum)
*forknum = bkpb->forknum;
if (blknum)
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index e23451b..5d6f1b5 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2166,24 +2166,26 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
/* decode block references */
for (block_id = 0; block_id <= XLogRecMaxBlockId(record); block_id++)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blk;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blk, NULL))
+ &rlocator, &forknum, &blk, NULL))
continue;
if (forknum != MAIN_FORKNUM)
appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, fork %u, blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid,
+ rlocator.relNumber,
forknum,
blk);
else
appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid,
+ rlocator.relNumber,
blk);
if (XLogRecHasBlockImage(record, block_id))
appendStringInfoString(buf, " FPW");
@@ -2285,7 +2287,7 @@ static void
verifyBackupPageConsistency(XLogReaderState *record)
{
RmgrData rmgr = GetRmgr(XLogRecGetRmid(record));
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blkno;
int block_id;
@@ -2302,7 +2304,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
Page page;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blkno, NULL))
+ &rlocator, &forknum, &blkno, NULL))
{
/*
* WAL record doesn't contain a block reference with the given id.
@@ -2327,7 +2329,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
* Read the contents from the current buffer and store it in a
* temporary page.
*/
- buf = XLogReadBufferExtended(rnode, forknum, blkno,
+ buf = XLogReadBufferExtended(rlocator, forknum, blkno,
RBM_NORMAL_NO_LOG,
InvalidBuffer);
if (!BufferIsValid(buf))
@@ -2377,7 +2379,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
{
elog(FATAL,
"inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forknum, blkno);
}
}
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 4851669..42a0f51 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -67,7 +67,7 @@ HotStandbyState standbyState = STANDBY_DISABLED;
*/
typedef struct xl_invalid_page_key
{
- RelFileNode node; /* the relation */
+ RelFileLocator locator; /* the relation */
ForkNumber forkno; /* the fork number */
BlockNumber blkno; /* the page */
} xl_invalid_page_key;
@@ -86,10 +86,10 @@ static int read_local_xlog_page_guts(XLogReaderState *state, XLogRecPtr targetPa
/* Report a reference to an invalid page */
static void
-report_invalid_page(int elevel, RelFileNode node, ForkNumber forkno,
+report_invalid_page(int elevel, RelFileLocator locator, ForkNumber forkno,
BlockNumber blkno, bool present)
{
- char *path = relpathperm(node, forkno);
+ char *path = relpathperm(locator, forkno);
if (present)
elog(elevel, "page %u of relation %s is uninitialized",
@@ -102,7 +102,7 @@ report_invalid_page(int elevel, RelFileNode node, ForkNumber forkno,
/* Log a reference to an invalid page */
static void
-log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
+log_invalid_page(RelFileLocator locator, ForkNumber forkno, BlockNumber blkno,
bool present)
{
xl_invalid_page_key key;
@@ -119,7 +119,7 @@ log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
*/
if (reachedConsistency)
{
- report_invalid_page(WARNING, node, forkno, blkno, present);
+ report_invalid_page(WARNING, locator, forkno, blkno, present);
elog(ignore_invalid_pages ? WARNING : PANIC,
"WAL contains references to invalid pages");
}
@@ -130,7 +130,7 @@ log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
* something about the XLOG record that generated the reference).
*/
if (message_level_is_interesting(DEBUG1))
- report_invalid_page(DEBUG1, node, forkno, blkno, present);
+ report_invalid_page(DEBUG1, locator, forkno, blkno, present);
if (invalid_page_tab == NULL)
{
@@ -147,7 +147,7 @@ log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
}
/* we currently assume xl_invalid_page_key contains no padding */
- key.node = node;
+ key.locator = locator;
key.forkno = forkno;
key.blkno = blkno;
hentry = (xl_invalid_page *)
@@ -166,7 +166,8 @@ log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
/* Forget any invalid pages >= minblkno, because they've been dropped */
static void
-forget_invalid_pages(RelFileNode node, ForkNumber forkno, BlockNumber minblkno)
+forget_invalid_pages(RelFileLocator locator, ForkNumber forkno,
+ BlockNumber minblkno)
{
HASH_SEQ_STATUS status;
xl_invalid_page *hentry;
@@ -178,13 +179,13 @@ forget_invalid_pages(RelFileNode node, ForkNumber forkno, BlockNumber minblkno)
while ((hentry = (xl_invalid_page *) hash_seq_search(&status)) != NULL)
{
- if (RelFileNodeEquals(hentry->key.node, node) &&
+ if (RelFileLocatorEquals(hentry->key.locator, locator) &&
hentry->key.forkno == forkno &&
hentry->key.blkno >= minblkno)
{
if (message_level_is_interesting(DEBUG2))
{
- char *path = relpathperm(hentry->key.node, forkno);
+ char *path = relpathperm(hentry->key.locator, forkno);
elog(DEBUG2, "page %u of relation %s has been dropped",
hentry->key.blkno, path);
@@ -213,11 +214,11 @@ forget_invalid_pages_db(Oid dbid)
while ((hentry = (xl_invalid_page *) hash_seq_search(&status)) != NULL)
{
- if (hentry->key.node.dbNode == dbid)
+ if (hentry->key.locator.dbOid == dbid)
{
if (message_level_is_interesting(DEBUG2))
{
- char *path = relpathperm(hentry->key.node, hentry->key.forkno);
+ char *path = relpathperm(hentry->key.locator, hentry->key.forkno);
elog(DEBUG2, "page %u of relation %s has been dropped",
hentry->key.blkno, path);
@@ -261,7 +262,7 @@ XLogCheckInvalidPages(void)
*/
while ((hentry = (xl_invalid_page *) hash_seq_search(&status)) != NULL)
{
- report_invalid_page(WARNING, hentry->key.node, hentry->key.forkno,
+ report_invalid_page(WARNING, hentry->key.locator, hentry->key.forkno,
hentry->key.blkno, hentry->present);
foundone = true;
}
@@ -356,7 +357,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
Buffer *buf)
{
XLogRecPtr lsn = record->EndRecPtr;
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blkno;
Buffer prefetch_buffer;
@@ -364,7 +365,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
bool zeromode;
bool willinit;
- if (!XLogRecGetBlockTagExtended(record, block_id, &rnode, &forknum, &blkno,
+ if (!XLogRecGetBlockTagExtended(record, block_id, &rlocator, &forknum, &blkno,
&prefetch_buffer))
{
/* Caller specified a bogus block_id */
@@ -387,7 +388,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
if (XLogRecBlockImageApply(record, block_id))
{
Assert(XLogRecHasBlockImage(record, block_id));
- *buf = XLogReadBufferExtended(rnode, forknum, blkno,
+ *buf = XLogReadBufferExtended(rlocator, forknum, blkno,
get_cleanup_lock ? RBM_ZERO_AND_CLEANUP_LOCK : RBM_ZERO_AND_LOCK,
prefetch_buffer);
page = BufferGetPage(*buf);
@@ -418,7 +419,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
}
else
{
- *buf = XLogReadBufferExtended(rnode, forknum, blkno, mode, prefetch_buffer);
+ *buf = XLogReadBufferExtended(rlocator, forknum, blkno, mode, prefetch_buffer);
if (BufferIsValid(*buf))
{
if (mode != RBM_ZERO_AND_LOCK && mode != RBM_ZERO_AND_CLEANUP_LOCK)
@@ -468,7 +469,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
* they will be invisible to tools that need to know which pages are modified.
*/
Buffer
-XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
+XLogReadBufferExtended(RelFileLocator rlocator, ForkNumber forknum,
BlockNumber blkno, ReadBufferMode mode,
Buffer recent_buffer)
{
@@ -481,14 +482,14 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
/* Do we have a clue where the buffer might be already? */
if (BufferIsValid(recent_buffer) &&
mode == RBM_NORMAL &&
- ReadRecentBuffer(rnode, forknum, blkno, recent_buffer))
+ ReadRecentBuffer(rlocator, forknum, blkno, recent_buffer))
{
buffer = recent_buffer;
goto recent_buffer_fast_path;
}
/* Open the relation at smgr level */
- smgr = smgropen(rnode, InvalidBackendId);
+ smgr = smgropen(rlocator, InvalidBackendId);
/*
* Create the target file if it doesn't already exist. This lets us cope
@@ -505,7 +506,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
if (blkno < lastblock)
{
/* page exists in file */
- buffer = ReadBufferWithoutRelcache(rnode, forknum, blkno,
+ buffer = ReadBufferWithoutRelcache(rlocator, forknum, blkno,
mode, NULL, true);
}
else
@@ -513,7 +514,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
/* hm, page doesn't exist in file */
if (mode == RBM_NORMAL)
{
- log_invalid_page(rnode, forknum, blkno, false);
+ log_invalid_page(rlocator, forknum, blkno, false);
return InvalidBuffer;
}
if (mode == RBM_NORMAL_NO_LOG)
@@ -530,7 +531,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
ReleaseBuffer(buffer);
}
- buffer = ReadBufferWithoutRelcache(rnode, forknum,
+ buffer = ReadBufferWithoutRelcache(rlocator, forknum,
P_NEW, mode, NULL, true);
}
while (BufferGetBlockNumber(buffer) < blkno);
@@ -540,7 +541,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
if (mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK)
LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
ReleaseBuffer(buffer);
- buffer = ReadBufferWithoutRelcache(rnode, forknum, blkno,
+ buffer = ReadBufferWithoutRelcache(rlocator, forknum, blkno,
mode, NULL, true);
}
}
@@ -559,7 +560,7 @@ recent_buffer_fast_path:
if (PageIsNew(page))
{
ReleaseBuffer(buffer);
- log_invalid_page(rnode, forknum, blkno, true);
+ log_invalid_page(rlocator, forknum, blkno, true);
return InvalidBuffer;
}
}
@@ -594,7 +595,7 @@ typedef FakeRelCacheEntryData *FakeRelCacheEntry;
* Caller must free the returned entry with FreeFakeRelcacheEntry().
*/
Relation
-CreateFakeRelcacheEntry(RelFileNode rnode)
+CreateFakeRelcacheEntry(RelFileLocator rlocator)
{
FakeRelCacheEntry fakeentry;
Relation rel;
@@ -604,7 +605,7 @@ CreateFakeRelcacheEntry(RelFileNode rnode)
rel = (Relation) fakeentry;
rel->rd_rel = &fakeentry->pgc;
- rel->rd_node = rnode;
+ rel->rd_locator = rlocator;
/*
* We will never be working with temp rels during recovery or while
@@ -615,18 +616,18 @@ CreateFakeRelcacheEntry(RelFileNode rnode)
/* It must be a permanent table here */
rel->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
- /* We don't know the name of the relation; use relfilenode instead */
- sprintf(RelationGetRelationName(rel), "%u", rnode.relNode);
+ /* We don't know the name of the relation; use relfilenumber instead */
+ sprintf(RelationGetRelationName(rel), "%u", rlocator.relNumber);
/*
* We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNode may be
+ * relation. Note that this is fairly bogus since relNumber may be
* different from the relation's OID. It shouldn't really matter though.
* In recovery, we are running by ourselves and can't have any lock
* conflicts. While syncing, we already hold AccessExclusiveLock.
*/
- rel->rd_lockInfo.lockRelId.dbId = rnode.dbNode;
- rel->rd_lockInfo.lockRelId.relId = rnode.relNode;
+ rel->rd_lockInfo.lockRelId.dbId = rlocator.dbOid;
+ rel->rd_lockInfo.lockRelId.relId = rlocator.relNumber;
rel->rd_smgr = NULL;
@@ -652,9 +653,9 @@ FreeFakeRelcacheEntry(Relation fakerel)
* any open "invalid-page" records for the relation.
*/
void
-XLogDropRelation(RelFileNode rnode, ForkNumber forknum)
+XLogDropRelation(RelFileLocator rlocator, ForkNumber forknum)
{
- forget_invalid_pages(rnode, forknum, 0);
+ forget_invalid_pages(rlocator, forknum, 0);
}
/*
@@ -682,10 +683,10 @@ XLogDropDatabase(Oid dbid)
* We need to clean up any open "invalid-page" records for the dropped pages.
*/
void
-XLogTruncateRelation(RelFileNode rnode, ForkNumber forkNum,
+XLogTruncateRelation(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber nblocks)
{
- forget_invalid_pages(rnode, forkNum, nblocks);
+ forget_invalid_pages(rlocator, forkNum, nblocks);
}
/*
diff --git a/src/backend/bootstrap/bootparse.y b/src/backend/bootstrap/bootparse.y
index e5cf1b3..a872199 100644
--- a/src/backend/bootstrap/bootparse.y
+++ b/src/backend/bootstrap/bootparse.y
@@ -287,9 +287,9 @@ Boot_DeclareIndexStmt:
stmt->excludeOpNames = NIL;
stmt->idxcomment = NULL;
stmt->indexOid = InvalidOid;
- stmt->oldNode = InvalidOid;
+ stmt->oldNumber = InvalidRelFileNumber;
stmt->oldCreateSubid = InvalidSubTransactionId;
- stmt->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ stmt->oldFirstRelfilenumberSubid = InvalidSubTransactionId;
stmt->unique = false;
stmt->primary = false;
stmt->isconstraint = false;
@@ -339,9 +339,9 @@ Boot_DeclareUniqueIndexStmt:
stmt->excludeOpNames = NIL;
stmt->idxcomment = NULL;
stmt->indexOid = InvalidOid;
- stmt->oldNode = InvalidOid;
+ stmt->oldNumber = InvalidRelFileNumber;
stmt->oldCreateSubid = InvalidSubTransactionId;
- stmt->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ stmt->oldFirstRelfilenumberSubid = InvalidSubTransactionId;
stmt->unique = true;
stmt->primary = false;
stmt->isconstraint = false;
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index e784538..2a33273 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -481,14 +481,14 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
}
/*
- * GetNewRelFileNode
- * Generate a new relfilenode number that is unique within the
+ * GetNewRelFileNumber
+ * Generate a new relfilenumber that is unique within the
* database of the given tablespace.
*
- * If the relfilenode will also be used as the relation's OID, pass the
+ * If the relfilenumber will also be used as the relation's OID, pass the
* opened pg_class catalog, and this routine will guarantee that the result
* is also an unused OID within pg_class. If the result is to be used only
- * as a relfilenode for an existing relation, pass NULL for pg_class.
+ * as a relfilenumber for an existing relation, pass NULL for pg_class.
*
* As with GetNewOidWithIndex(), there is some theoretical risk of a race
* condition, but it doesn't seem worth worrying about.
@@ -496,17 +496,17 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
* Note: we don't support using this in bootstrap mode. All relations
* created by bootstrap have preassigned OIDs, so there's no need.
*/
-Oid
-GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
+RelFileNumber
+GetNewRelFileNumber(Oid reltablespace, Relation pg_class, char relpersistence)
{
- RelFileNodeBackend rnode;
+ RelFileLocatorBackend rlocator;
char *rpath;
bool collides;
BackendId backend;
/*
* If we ever get here during pg_upgrade, there's something wrong; all
- * relfilenode assignments during a binary-upgrade run should be
+ * relfilenumber assignments during a binary-upgrade run should be
* determined by commands in the dump script.
*/
Assert(!IsBinaryUpgrade);
@@ -526,15 +526,15 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
}
/* This logic should match RelationInitPhysicalAddr */
- rnode.node.spcNode = reltablespace ? reltablespace : MyDatabaseTableSpace;
- rnode.node.dbNode = (rnode.node.spcNode == GLOBALTABLESPACE_OID) ? InvalidOid : MyDatabaseId;
+ rlocator.locator.spcOid = reltablespace ? reltablespace : MyDatabaseTableSpace;
+ rlocator.locator.dbOid = (rlocator.locator.spcOid == GLOBALTABLESPACE_OID) ? InvalidOid : MyDatabaseId;
/*
* The relpath will vary based on the backend ID, so we must initialize
* that properly here to make sure that any collisions based on filename
* are properly detected.
*/
- rnode.backend = backend;
+ rlocator.backend = backend;
do
{
@@ -542,13 +542,13 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
/* Generate the OID */
if (pg_class)
- rnode.node.relNode = GetNewOidWithIndex(pg_class, ClassOidIndexId,
+ rlocator.locator.relNumber = GetNewOidWithIndex(pg_class, ClassOidIndexId,
Anum_pg_class_oid);
else
- rnode.node.relNode = GetNewObjectId();
+ rlocator.locator.relNumber = GetNewObjectId();
/* Check for existing file of same name */
- rpath = relpath(rnode, MAIN_FORKNUM);
+ rpath = relpath(rlocator, MAIN_FORKNUM);
if (access(rpath, F_OK) == 0)
{
@@ -570,7 +570,7 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
pfree(rpath);
} while (collides);
- return rnode.node.relNode;
+ return rlocator.locator.relNumber;
}
/*
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 1803194..c69c923 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -77,9 +77,11 @@
/* Potentially set by pg_upgrade_support functions */
Oid binary_upgrade_next_heap_pg_class_oid = InvalidOid;
-Oid binary_upgrade_next_heap_pg_class_relfilenode = InvalidOid;
Oid binary_upgrade_next_toast_pg_class_oid = InvalidOid;
-Oid binary_upgrade_next_toast_pg_class_relfilenode = InvalidOid;
+RelFileNumber binary_upgrade_next_heap_pg_class_relfilenumber =
+ InvalidRelFileNumber;
+RelFileNumber binary_upgrade_next_toast_pg_class_relfilenumber =
+ InvalidRelFileNumber;
static void AddNewRelationTuple(Relation pg_class_desc,
Relation new_rel_desc,
@@ -273,7 +275,7 @@ SystemAttributeByName(const char *attname)
* heap_create - Create an uncataloged heap relation
*
* Note API change: the caller must now always provide the OID
- * to use for the relation. The relfilenode may be (and in
+ * to use for the relation. The relfilenumber may be (and in
* the simplest cases is) left unspecified.
*
* create_storage indicates whether or not to create the storage.
@@ -289,7 +291,7 @@ heap_create(const char *relname,
Oid relnamespace,
Oid reltablespace,
Oid relid,
- Oid relfilenode,
+ RelFileNumber relfilenumber,
Oid accessmtd,
TupleDesc tupDesc,
char relkind,
@@ -341,11 +343,11 @@ heap_create(const char *relname,
else
{
/*
- * If relfilenode is unspecified by the caller then create storage
+ * If relfilenumber is unspecified by the caller then create storage
* with oid same as relid.
*/
- if (!OidIsValid(relfilenode))
- relfilenode = relid;
+ if (!RelFileNumberIsValid(relfilenumber))
+ relfilenumber = relid;
}
/*
@@ -368,7 +370,7 @@ heap_create(const char *relname,
tupDesc,
relid,
accessmtd,
- relfilenode,
+ relfilenumber,
reltablespace,
shared_relation,
mapped_relation,
@@ -385,11 +387,11 @@ heap_create(const char *relname,
if (create_storage)
{
if (RELKIND_HAS_TABLE_AM(rel->rd_rel->relkind))
- table_relation_set_new_filenode(rel, &rel->rd_node,
- relpersistence,
- relfrozenxid, relminmxid);
+ table_relation_set_new_filelocator(rel, &rel->rd_locator,
+ relpersistence,
+ relfrozenxid, relminmxid);
else if (RELKIND_HAS_STORAGE(rel->rd_rel->relkind))
- RelationCreateStorage(rel->rd_node, relpersistence, true);
+ RelationCreateStorage(rel->rd_locator, relpersistence, true);
else
Assert(false);
}
@@ -1069,7 +1071,7 @@ AddNewRelationType(const char *typeName,
* relkind: relkind for new rel
* relpersistence: rel's persistence status (permanent, temp, or unlogged)
* shared_relation: true if it's to be a shared relation
- * mapped_relation: true if the relation will use the relfilenode map
+ * mapped_relation: true if the relation will use the relfilenumber map
* oncommit: ON COMMIT marking (only relevant if it's a temp table)
* reloptions: reloptions in Datum form, or (Datum) 0 if none
* use_user_acl: true if should look for user-defined default permissions;
@@ -1115,7 +1117,7 @@ heap_create_with_catalog(const char *relname,
Oid new_type_oid;
/* By default set to InvalidOid unless overridden by binary-upgrade */
- Oid relfilenode = InvalidOid;
+ RelFileNumber relfilenumber = InvalidRelFileNumber;
TransactionId relfrozenxid;
MultiXactId relminmxid;
@@ -1173,12 +1175,12 @@ heap_create_with_catalog(const char *relname,
/*
* Allocate an OID for the relation, unless we were told what to use.
*
- * The OID will be the relfilenode as well, so make sure it doesn't
+ * The OID will be the relfilenumber as well, so make sure it doesn't
* collide with either pg_class OIDs or existing physical files.
*/
if (!OidIsValid(relid))
{
- /* Use binary-upgrade override for pg_class.oid and relfilenode */
+ /* Use binary-upgrade override for pg_class.oid and relfilenumber */
if (IsBinaryUpgrade)
{
/*
@@ -1196,13 +1198,13 @@ heap_create_with_catalog(const char *relname,
relid = binary_upgrade_next_toast_pg_class_oid;
binary_upgrade_next_toast_pg_class_oid = InvalidOid;
- if (!OidIsValid(binary_upgrade_next_toast_pg_class_relfilenode))
+ if (!RelFileNumberIsValid(binary_upgrade_next_toast_pg_class_relfilenumber))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("toast relfilenode value not set when in binary upgrade mode")));
+ errmsg("toast relfilenumber value not set when in binary upgrade mode")));
- relfilenode = binary_upgrade_next_toast_pg_class_relfilenode;
- binary_upgrade_next_toast_pg_class_relfilenode = InvalidOid;
+ relfilenumber = binary_upgrade_next_toast_pg_class_relfilenumber;
+ binary_upgrade_next_toast_pg_class_relfilenumber = InvalidRelFileNumber;
}
}
else
@@ -1217,20 +1219,20 @@ heap_create_with_catalog(const char *relname,
if (RELKIND_HAS_STORAGE(relkind))
{
- if (!OidIsValid(binary_upgrade_next_heap_pg_class_relfilenode))
+ if (!RelFileNumberIsValid(binary_upgrade_next_heap_pg_class_relfilenumber))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("relfilenode value not set when in binary upgrade mode")));
+ errmsg("relfilenumber value not set when in binary upgrade mode")));
- relfilenode = binary_upgrade_next_heap_pg_class_relfilenode;
- binary_upgrade_next_heap_pg_class_relfilenode = InvalidOid;
+ relfilenumber = binary_upgrade_next_heap_pg_class_relfilenumber;
+ binary_upgrade_next_heap_pg_class_relfilenumber = InvalidRelFileNumber;
}
}
}
if (!OidIsValid(relid))
- relid = GetNewRelFileNode(reltablespace, pg_class_desc,
- relpersistence);
+ relid = GetNewRelFileNumber(reltablespace, pg_class_desc,
+ relpersistence);
}
/*
@@ -1273,7 +1275,7 @@ heap_create_with_catalog(const char *relname,
relnamespace,
reltablespace,
relid,
- relfilenode,
+ relfilenumber,
accessmtd,
tupdesc,
relkind,
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index bdd3c34..f245df8 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -87,7 +87,8 @@
/* Potentially set by pg_upgrade_support functions */
Oid binary_upgrade_next_index_pg_class_oid = InvalidOid;
-Oid binary_upgrade_next_index_pg_class_relfilenode = InvalidOid;
+RelFileNumber binary_upgrade_next_index_pg_class_relfilenumber =
+ InvalidRelFileNumber;
/*
* Pointer-free representation of variables used when reindexing system
@@ -662,7 +663,7 @@ UpdateIndexRelation(Oid indexoid,
* parent index; otherwise InvalidOid.
* parentConstraintId: if creating a constraint on a partition, the OID
* of the constraint in the parent; otherwise InvalidOid.
- * relFileNode: normally, pass InvalidOid to get new storage. May be
+ * relFileNumber: normally, pass InvalidRelFileNumber to get new storage. May be
* nonzero to attach an existing valid build.
* indexInfo: same info executor uses to insert into the index
* indexColNames: column names to use for index (List of char *)
@@ -703,7 +704,7 @@ index_create(Relation heapRelation,
Oid indexRelationId,
Oid parentIndexRelid,
Oid parentConstraintId,
- Oid relFileNode,
+ RelFileNumber relFileNumber,
IndexInfo *indexInfo,
List *indexColNames,
Oid accessMethodObjectId,
@@ -735,7 +736,7 @@ index_create(Relation heapRelation,
char relkind;
TransactionId relfrozenxid;
MultiXactId relminmxid;
- bool create_storage = !OidIsValid(relFileNode);
+ bool create_storage = !RelFileNumberIsValid(relFileNumber);
/* constraint flags can only be set when a constraint is requested */
Assert((constr_flags == 0) ||
@@ -751,7 +752,7 @@ index_create(Relation heapRelation,
/*
* The index will be in the same namespace as its parent table, and is
* shared across databases if and only if the parent is. Likewise, it
- * will use the relfilenode map if and only if the parent does; and it
+ * will use the relfilenumber map if and only if the parent does; and it
* inherits the parent's relpersistence.
*/
namespaceId = RelationGetNamespace(heapRelation);
@@ -902,12 +903,12 @@ index_create(Relation heapRelation,
/*
* Allocate an OID for the index, unless we were told what to use.
*
- * The OID will be the relfilenode as well, so make sure it doesn't
+ * The OID will be the relfilenumber as well, so make sure it doesn't
* collide with either pg_class OIDs or existing physical files.
*/
if (!OidIsValid(indexRelationId))
{
- /* Use binary-upgrade override for pg_class.oid and relfilenode */
+ /* Use binary-upgrade override for pg_class.oid and relfilenumber */
if (IsBinaryUpgrade)
{
if (!OidIsValid(binary_upgrade_next_index_pg_class_oid))
@@ -918,14 +919,14 @@ index_create(Relation heapRelation,
indexRelationId = binary_upgrade_next_index_pg_class_oid;
binary_upgrade_next_index_pg_class_oid = InvalidOid;
- /* Override the index relfilenode */
+ /* Override the index relfilenumber */
if ((relkind == RELKIND_INDEX) &&
- (!OidIsValid(binary_upgrade_next_index_pg_class_relfilenode)))
+ (!RelFileNumberIsValid(binary_upgrade_next_index_pg_class_relfilenumber)))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("index relfilenode value not set when in binary upgrade mode")));
- relFileNode = binary_upgrade_next_index_pg_class_relfilenode;
- binary_upgrade_next_index_pg_class_relfilenode = InvalidOid;
+ errmsg("index relfilenumber value not set when in binary upgrade mode")));
+ relFileNumber = binary_upgrade_next_index_pg_class_relfilenumber;
+ binary_upgrade_next_index_pg_class_relfilenumber = InvalidRelFileNumber;
/*
* Note that we want create_storage = true for binary upgrade. The
@@ -937,7 +938,7 @@ index_create(Relation heapRelation,
else
{
indexRelationId =
- GetNewRelFileNode(tableSpaceId, pg_class, relpersistence);
+ GetNewRelFileNumber(tableSpaceId, pg_class, relpersistence);
}
}
@@ -950,7 +951,7 @@ index_create(Relation heapRelation,
namespaceId,
tableSpaceId,
indexRelationId,
- relFileNode,
+ relFileNumber,
accessMethodObjectId,
indexTupDesc,
relkind,
@@ -1408,7 +1409,7 @@ index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId,
InvalidOid, /* indexRelationId */
InvalidOid, /* parentIndexRelid */
InvalidOid, /* parentConstraintId */
- InvalidOid, /* relFileNode */
+ InvalidRelFileNumber, /* relFileNumber */
newInfo,
indexColNames,
indexRelation->rd_rel->relam,
@@ -3024,7 +3025,7 @@ index_build(Relation heapRelation,
* it -- but we must first check whether one already exists. If, for
* example, an unlogged relation is truncated in the transaction that
* created it, or truncated twice in a subsequent transaction, the
- * relfilenode won't change, and nothing needs to be done here.
+ * relfilenumber won't change, and nothing needs to be done here.
*/
if (indexRelation->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
!smgrexists(RelationGetSmgr(indexRelation), INIT_FORKNUM))
@@ -3681,7 +3682,7 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
* Schedule unlinking of the old index storage at transaction commit.
*/
RelationDropStorage(iRel);
- RelationAssumeNewRelfilenode(iRel);
+ RelationAssumeNewRelfilelocator(iRel);
/* Make sure the reltablespace change is visible */
CommandCounterIncrement();
@@ -3711,7 +3712,7 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
SetReindexProcessing(heapId, indexId);
/* Create a new physical relation for the index */
- RelationSetNewRelfilenode(iRel, persistence);
+ RelationSetNewRelfilenumber(iRel, persistence);
/* Initialize the index and rebuild */
/* Note: we do not need to re-establish pkey setting */
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index c06e414..37dd2b9 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -38,7 +38,7 @@
int wal_skip_threshold = 2048; /* in kilobytes */
/*
- * We keep a list of all relations (represented as RelFileNode values)
+ * We keep a list of all relations (represented as RelFileLocator values)
* that have been created or deleted in the current transaction. When
* a relation is created, we create the physical file immediately, but
* remember it so that we can delete the file again if the current
@@ -59,7 +59,7 @@ int wal_skip_threshold = 2048; /* in kilobytes */
typedef struct PendingRelDelete
{
- RelFileNode relnode; /* relation that may need to be deleted */
+ RelFileLocator rlocator; /* relation that may need to be deleted */
BackendId backend; /* InvalidBackendId if not a temp rel */
bool atCommit; /* T=delete at commit; F=delete at abort */
int nestLevel; /* xact nesting level of request */
@@ -68,7 +68,7 @@ typedef struct PendingRelDelete
typedef struct PendingRelSync
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
bool is_truncated; /* Has the file experienced truncation? */
} PendingRelSync;
@@ -81,7 +81,7 @@ static HTAB *pendingSyncHash = NULL;
* Queue an at-commit fsync.
*/
static void
-AddPendingSync(const RelFileNode *rnode)
+AddPendingSync(const RelFileLocator *rlocator)
{
PendingRelSync *pending;
bool found;
@@ -91,14 +91,14 @@ AddPendingSync(const RelFileNode *rnode)
{
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileNode);
+ ctl.keysize = sizeof(RelFileLocator);
ctl.entrysize = sizeof(PendingRelSync);
ctl.hcxt = TopTransactionContext;
pendingSyncHash = hash_create("pending sync hash", 16, &ctl,
HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
}
- pending = hash_search(pendingSyncHash, rnode, HASH_ENTER, &found);
+ pending = hash_search(pendingSyncHash, rlocator, HASH_ENTER, &found);
Assert(!found);
pending->is_truncated = false;
}
@@ -117,7 +117,7 @@ AddPendingSync(const RelFileNode *rnode)
* pass register_delete = false.
*/
SMgrRelation
-RelationCreateStorage(RelFileNode rnode, char relpersistence,
+RelationCreateStorage(RelFileLocator rlocator, char relpersistence,
bool register_delete)
{
SMgrRelation srel;
@@ -145,11 +145,11 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence,
return NULL; /* placate compiler */
}
- srel = smgropen(rnode, backend);
+ srel = smgropen(rlocator, backend);
smgrcreate(srel, MAIN_FORKNUM, false);
if (needs_wal)
- log_smgrcreate(&srel->smgr_rnode.node, MAIN_FORKNUM);
+ log_smgrcreate(&srel->smgr_rlocator.locator, MAIN_FORKNUM);
/*
* Add the relation to the list of stuff to delete at abort, if we are
@@ -161,7 +161,7 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence,
pending = (PendingRelDelete *)
MemoryContextAlloc(TopMemoryContext, sizeof(PendingRelDelete));
- pending->relnode = rnode;
+ pending->rlocator = rlocator;
pending->backend = backend;
pending->atCommit = false; /* delete if abort */
pending->nestLevel = GetCurrentTransactionNestLevel();
@@ -172,7 +172,7 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence,
if (relpersistence == RELPERSISTENCE_PERMANENT && !XLogIsNeeded())
{
Assert(backend == InvalidBackendId);
- AddPendingSync(&rnode);
+ AddPendingSync(&rlocator);
}
return srel;
@@ -182,14 +182,14 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence,
* Perform XLogInsert of an XLOG_SMGR_CREATE record to WAL.
*/
void
-log_smgrcreate(const RelFileNode *rnode, ForkNumber forkNum)
+log_smgrcreate(const RelFileLocator *rlocator, ForkNumber forkNum)
{
xl_smgr_create xlrec;
/*
* Make an XLOG entry reporting the file creation.
*/
- xlrec.rnode = *rnode;
+ xlrec.rlocator = *rlocator;
xlrec.forkNum = forkNum;
XLogBeginInsert();
@@ -209,7 +209,7 @@ RelationDropStorage(Relation rel)
/* Add the relation to the list of stuff to delete at commit */
pending = (PendingRelDelete *)
MemoryContextAlloc(TopMemoryContext, sizeof(PendingRelDelete));
- pending->relnode = rel->rd_node;
+ pending->rlocator = rel->rd_locator;
pending->backend = rel->rd_backend;
pending->atCommit = true; /* delete if commit */
pending->nestLevel = GetCurrentTransactionNestLevel();
@@ -247,7 +247,7 @@ RelationDropStorage(Relation rel)
* No-op if the relation is not among those scheduled for deletion.
*/
void
-RelationPreserveStorage(RelFileNode rnode, bool atCommit)
+RelationPreserveStorage(RelFileLocator rlocator, bool atCommit)
{
PendingRelDelete *pending;
PendingRelDelete *prev;
@@ -257,7 +257,7 @@ RelationPreserveStorage(RelFileNode rnode, bool atCommit)
for (pending = pendingDeletes; pending != NULL; pending = next)
{
next = pending->next;
- if (RelFileNodeEquals(rnode, pending->relnode)
+ if (RelFileLocatorEquals(rlocator, pending->rlocator)
&& pending->atCommit == atCommit)
{
/* unlink and delete list entry */
@@ -369,7 +369,7 @@ RelationTruncate(Relation rel, BlockNumber nblocks)
xl_smgr_truncate xlrec;
xlrec.blkno = nblocks;
- xlrec.rnode = rel->rd_node;
+ xlrec.rlocator = rel->rd_locator;
xlrec.flags = SMGR_TRUNCATE_ALL;
XLogBeginInsert();
@@ -428,7 +428,7 @@ RelationPreTruncate(Relation rel)
return;
pending = hash_search(pendingSyncHash,
- &(RelationGetSmgr(rel)->smgr_rnode.node),
+ &(RelationGetSmgr(rel)->smgr_rlocator.locator),
HASH_FIND, NULL);
if (pending)
pending->is_truncated = true;
@@ -472,7 +472,7 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
* We need to log the copied data in WAL iff WAL archiving/streaming is
* enabled AND it's a permanent relation. This gives the same answer as
* "RelationNeedsWAL(rel) || copying_initfork", because we know the
- * current operation created a new relfilenode.
+ * current operation created a new relfilelocator.
*/
use_wal = XLogIsNeeded() &&
(relpersistence == RELPERSISTENCE_PERMANENT || copying_initfork);
@@ -496,8 +496,8 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
* (errcontext callbacks shouldn't be risking any such thing, but
* people have been known to forget that rule.)
*/
- char *relpath = relpathbackend(src->smgr_rnode.node,
- src->smgr_rnode.backend,
+ char *relpath = relpathbackend(src->smgr_rlocator.locator,
+ src->smgr_rlocator.backend,
forkNum);
ereport(ERROR,
@@ -512,7 +512,7 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
* space.
*/
if (use_wal)
- log_newpage(&dst->smgr_rnode.node, forkNum, blkno, page, false);
+ log_newpage(&dst->smgr_rlocator.locator, forkNum, blkno, page, false);
PageSetChecksumInplace(page, blkno);
@@ -538,19 +538,19 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
}
/*
- * RelFileNodeSkippingWAL
- * Check if a BM_PERMANENT relfilenode is using WAL.
+ * RelFileLocatorSkippingWAL
+ * Check if a BM_PERMANENT relfilelocator is using WAL.
*
- * Changes of certain relfilenodes must not write WAL; see "Skipping WAL for
- * New RelFileNode" in src/backend/access/transam/README. Though it is known
- * from Relation efficiently, this function is intended for the code paths not
- * having access to Relation.
+ * Changes of certain relfilelocators must not write WAL; see "Skipping WAL for
+ * New RelFileLocator" in src/backend/access/transam/README. Though it is
+ * known from Relation efficiently, this function is intended for the code
+ * paths not having access to Relation.
*/
bool
-RelFileNodeSkippingWAL(RelFileNode rnode)
+RelFileLocatorSkippingWAL(RelFileLocator rlocator)
{
if (!pendingSyncHash ||
- hash_search(pendingSyncHash, &rnode, HASH_FIND, NULL) == NULL)
+ hash_search(pendingSyncHash, &rlocator, HASH_FIND, NULL) == NULL)
return false;
return true;
@@ -566,7 +566,7 @@ EstimatePendingSyncsSpace(void)
long entries;
entries = pendingSyncHash ? hash_get_num_entries(pendingSyncHash) : 0;
- return mul_size(1 + entries, sizeof(RelFileNode));
+ return mul_size(1 + entries, sizeof(RelFileLocator));
}
/*
@@ -581,57 +581,58 @@ SerializePendingSyncs(Size maxSize, char *startAddress)
HASH_SEQ_STATUS scan;
PendingRelSync *sync;
PendingRelDelete *delete;
- RelFileNode *src;
- RelFileNode *dest = (RelFileNode *) startAddress;
+ RelFileLocator *src;
+ RelFileLocator *dest = (RelFileLocator *) startAddress;
if (!pendingSyncHash)
goto terminate;
- /* Create temporary hash to collect active relfilenodes */
- ctl.keysize = sizeof(RelFileNode);
- ctl.entrysize = sizeof(RelFileNode);
+ /* Create temporary hash to collect active relfilelocators */
+ ctl.keysize = sizeof(RelFileLocator);
+ ctl.entrysize = sizeof(RelFileLocator);
ctl.hcxt = CurrentMemoryContext;
- tmphash = hash_create("tmp relfilenodes",
+ tmphash = hash_create("tmp relfilelocators",
hash_get_num_entries(pendingSyncHash), &ctl,
HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
- /* collect all rnodes from pending syncs */
+ /* collect all rlocators from pending syncs */
hash_seq_init(&scan, pendingSyncHash);
while ((sync = (PendingRelSync *) hash_seq_search(&scan)))
- (void) hash_search(tmphash, &sync->rnode, HASH_ENTER, NULL);
+ (void) hash_search(tmphash, &sync->rlocator, HASH_ENTER, NULL);
/* remove deleted rnodes */
for (delete = pendingDeletes; delete != NULL; delete = delete->next)
if (delete->atCommit)
- (void) hash_search(tmphash, (void *) &delete->relnode,
+ (void) hash_search(tmphash, (void *) &delete->rlocator,
HASH_REMOVE, NULL);
hash_seq_init(&scan, tmphash);
- while ((src = (RelFileNode *) hash_seq_search(&scan)))
+ while ((src = (RelFileLocator *) hash_seq_search(&scan)))
*dest++ = *src;
hash_destroy(tmphash);
terminate:
- MemSet(dest, 0, sizeof(RelFileNode));
+ MemSet(dest, 0, sizeof(RelFileLocator));
}
/*
* RestorePendingSyncs
* Restore syncs within a parallel worker.
*
- * RelationNeedsWAL() and RelFileNodeSkippingWAL() must offer the correct
+ * RelationNeedsWAL() and RelFileLocatorSkippingWAL() must offer the correct
* answer to parallel workers. Only smgrDoPendingSyncs() reads the
* is_truncated field, at end of transaction. Hence, don't restore it.
*/
void
RestorePendingSyncs(char *startAddress)
{
- RelFileNode *rnode;
+ RelFileLocator *rlocator;
Assert(pendingSyncHash == NULL);
- for (rnode = (RelFileNode *) startAddress; rnode->relNode != 0; rnode++)
- AddPendingSync(rnode);
+ for (rlocator = (RelFileLocator *) startAddress; rlocator->relNumber != 0;
+ rlocator++)
+ AddPendingSync(rlocator);
}
/*
@@ -677,7 +678,7 @@ smgrDoPendingDeletes(bool isCommit)
{
SMgrRelation srel;
- srel = smgropen(pending->relnode, pending->backend);
+ srel = smgropen(pending->rlocator, pending->backend);
/* allocate the initial array, or extend it, if needed */
if (maxrels == 0)
@@ -747,7 +748,7 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
/* Skip syncing nodes that smgrDoPendingDeletes() will delete. */
for (pending = pendingDeletes; pending != NULL; pending = pending->next)
if (pending->atCommit)
- (void) hash_search(pendingSyncHash, (void *) &pending->relnode,
+ (void) hash_search(pendingSyncHash, (void *) &pending->rlocator,
HASH_REMOVE, NULL);
hash_seq_init(&scan, pendingSyncHash);
@@ -758,7 +759,7 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
BlockNumber total_blocks = 0;
SMgrRelation srel;
- srel = smgropen(pendingsync->rnode, InvalidBackendId);
+ srel = smgropen(pendingsync->rlocator, InvalidBackendId);
/*
* We emit newpage WAL records for smaller relations.
@@ -832,7 +833,7 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
* page including any unused space. ReadBufferExtended()
* counts some pgstat events; unfortunately, we discard them.
*/
- rel = CreateFakeRelcacheEntry(srel->smgr_rnode.node);
+ rel = CreateFakeRelcacheEntry(srel->smgr_rlocator.locator);
log_newpage_range(rel, fork, 0, n, false);
FreeFakeRelcacheEntry(rel);
}
@@ -852,7 +853,7 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
* smgrGetPendingDeletes() -- Get a list of non-temp relations to be deleted.
*
* The return value is the number of relations scheduled for termination.
- * *ptr is set to point to a freshly-palloc'd array of RelFileNodes.
+ * *ptr is set to point to a freshly-palloc'd array of RelFileLocators.
* If there are no relations to be deleted, *ptr is set to NULL.
*
* Only non-temporary relations are included in the returned list. This is OK
@@ -866,11 +867,11 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
* by upper-level transactions.
*/
int
-smgrGetPendingDeletes(bool forCommit, RelFileNode **ptr)
+smgrGetPendingDeletes(bool forCommit, RelFileLocator **ptr)
{
int nestLevel = GetCurrentTransactionNestLevel();
int nrels;
- RelFileNode *rptr;
+ RelFileLocator *rptr;
PendingRelDelete *pending;
nrels = 0;
@@ -885,14 +886,14 @@ smgrGetPendingDeletes(bool forCommit, RelFileNode **ptr)
*ptr = NULL;
return 0;
}
- rptr = (RelFileNode *) palloc(nrels * sizeof(RelFileNode));
+ rptr = (RelFileLocator *) palloc(nrels * sizeof(RelFileLocator));
*ptr = rptr;
for (pending = pendingDeletes; pending != NULL; pending = pending->next)
{
if (pending->nestLevel >= nestLevel && pending->atCommit == forCommit
&& pending->backend == InvalidBackendId)
{
- *rptr = pending->relnode;
+ *rptr = pending->rlocator;
rptr++;
}
}
@@ -967,7 +968,7 @@ smgr_redo(XLogReaderState *record)
xl_smgr_create *xlrec = (xl_smgr_create *) XLogRecGetData(record);
SMgrRelation reln;
- reln = smgropen(xlrec->rnode, InvalidBackendId);
+ reln = smgropen(xlrec->rlocator, InvalidBackendId);
smgrcreate(reln, xlrec->forkNum, true);
}
else if (info == XLOG_SMGR_TRUNCATE)
@@ -980,7 +981,7 @@ smgr_redo(XLogReaderState *record)
int nforks = 0;
bool need_fsm_vacuum = false;
- reln = smgropen(xlrec->rnode, InvalidBackendId);
+ reln = smgropen(xlrec->rlocator, InvalidBackendId);
/*
* Forcibly create relation if it doesn't exist (which suggests that
@@ -1015,11 +1016,11 @@ smgr_redo(XLogReaderState *record)
nforks++;
/* Also tell xlogutils.c about it */
- XLogTruncateRelation(xlrec->rnode, MAIN_FORKNUM, xlrec->blkno);
+ XLogTruncateRelation(xlrec->rlocator, MAIN_FORKNUM, xlrec->blkno);
}
/* Prepare for truncation of FSM and VM too */
- rel = CreateFakeRelcacheEntry(xlrec->rnode);
+ rel = CreateFakeRelcacheEntry(xlrec->rlocator);
if ((xlrec->flags & SMGR_TRUNCATE_FSM) != 0 &&
smgrexists(reln, FSM_FORKNUM))
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index cea2c8b..da137eb 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -293,7 +293,7 @@ cluster_multiple_rels(List *rtcs, ClusterParams *params)
* cluster_rel
*
* This clusters the table by creating a new, clustered table and
- * swapping the relfilenodes of the new table and the old table, so
+ * swapping the relfilenumbers of the new table and the old table, so
* the OID of the original table is preserved. Thus we do not lose
* GRANT, inheritance nor references to this table (this was a bug
* in releases through 7.3).
@@ -1025,8 +1025,8 @@ copy_table_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
/*
* Swap the physical files of two given relations.
*
- * We swap the physical identity (reltablespace, relfilenode) while keeping the
- * same logical identities of the two relations. relpersistence is also
+ * We swap the physical identity (reltablespace, relfilenumber) while keeping
+ * the same logical identities of the two relations. relpersistence is also
* swapped, which is critical since it determines where buffers live for each
* relation.
*
@@ -1061,9 +1061,9 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
reltup2;
Form_pg_class relform1,
relform2;
- Oid relfilenode1,
- relfilenode2;
- Oid swaptemp;
+ RelFileNumber relfilenumber1,
+ relfilenumber2;
+ RelFileNumber swaptemp;
char swptmpchr;
/* We need writable copies of both pg_class tuples. */
@@ -1079,13 +1079,14 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
elog(ERROR, "cache lookup failed for relation %u", r2);
relform2 = (Form_pg_class) GETSTRUCT(reltup2);
- relfilenode1 = relform1->relfilenode;
- relfilenode2 = relform2->relfilenode;
+ relfilenumber1 = relform1->relfilenode;
+ relfilenumber2 = relform2->relfilenode;
- if (OidIsValid(relfilenode1) && OidIsValid(relfilenode2))
+ if (RelFileNumberIsValid(relfilenumber1) &&
+ RelFileNumberIsValid(relfilenumber2))
{
/*
- * Normal non-mapped relations: swap relfilenodes, reltablespaces,
+ * Normal non-mapped relations: swap relfilenumbers, reltablespaces,
* relpersistence
*/
Assert(!target_is_pg_class);
@@ -1120,7 +1121,8 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
* Mapped-relation case. Here we have to swap the relation mappings
* instead of modifying the pg_class columns. Both must be mapped.
*/
- if (OidIsValid(relfilenode1) || OidIsValid(relfilenode2))
+ if (RelFileNumberIsValid(relfilenumber1) ||
+ RelFileNumberIsValid(relfilenumber2))
elog(ERROR, "cannot swap mapped relation \"%s\" with non-mapped relation",
NameStr(relform1->relname));
@@ -1148,12 +1150,12 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
/*
* Fetch the mappings --- shouldn't fail, but be paranoid
*/
- relfilenode1 = RelationMapOidToFilenode(r1, relform1->relisshared);
- if (!OidIsValid(relfilenode1))
+ relfilenumber1 = RelationMapOidToFilenumber(r1, relform1->relisshared);
+ if (!RelFileNumberIsValid(relfilenumber1))
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
NameStr(relform1->relname), r1);
- relfilenode2 = RelationMapOidToFilenode(r2, relform2->relisshared);
- if (!OidIsValid(relfilenode2))
+ relfilenumber2 = RelationMapOidToFilenumber(r2, relform2->relisshared);
+ if (!RelFileNumberIsValid(relfilenumber2))
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
NameStr(relform2->relname), r2);
@@ -1161,15 +1163,15 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
* Send replacement mappings to relmapper. Note these won't actually
* take effect until CommandCounterIncrement.
*/
- RelationMapUpdateMap(r1, relfilenode2, relform1->relisshared, false);
- RelationMapUpdateMap(r2, relfilenode1, relform2->relisshared, false);
+ RelationMapUpdateMap(r1, relfilenumber2, relform1->relisshared, false);
+ RelationMapUpdateMap(r2, relfilenumber1, relform2->relisshared, false);
/* Pass OIDs of mapped r2 tables back to caller */
*mapped_tables++ = r2;
}
/*
- * Recognize that rel1's relfilenode (swapped from rel2) is new in this
+ * Recognize that rel1's relfilenumber (swapped from rel2) is new in this
* subtransaction. The rel2 storage (swapped from rel1) may or may not be
* new.
*/
@@ -1180,9 +1182,9 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
rel1 = relation_open(r1, NoLock);
rel2 = relation_open(r2, NoLock);
rel2->rd_createSubid = rel1->rd_createSubid;
- rel2->rd_newRelfilenodeSubid = rel1->rd_newRelfilenodeSubid;
- rel2->rd_firstRelfilenodeSubid = rel1->rd_firstRelfilenodeSubid;
- RelationAssumeNewRelfilenode(rel1);
+ rel2->rd_newRelfilelocatorSubid = rel1->rd_newRelfilelocatorSubid;
+ rel2->rd_firstRelfilelocatorSubid = rel1->rd_firstRelfilelocatorSubid;
+ RelationAssumeNewRelfilelocator(rel1);
relation_close(rel1, NoLock);
relation_close(rel2, NoLock);
}
@@ -1523,7 +1525,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
table_close(relRelation, RowExclusiveLock);
}
- /* Destroy new heap with old filenode */
+ /* Destroy new heap with old filenumber */
object.classId = RelationRelationId;
object.objectId = OIDNewHeap;
object.objectSubId = 0;
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 35a1d3a..c985fea 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -593,11 +593,11 @@ CopyFrom(CopyFromState cstate)
*/
if (RELKIND_HAS_STORAGE(cstate->rel->rd_rel->relkind) &&
(cstate->rel->rd_createSubid != InvalidSubTransactionId ||
- cstate->rel->rd_firstRelfilenodeSubid != InvalidSubTransactionId))
+ cstate->rel->rd_firstRelfilelocatorSubid != InvalidSubTransactionId))
ti_options |= TABLE_INSERT_SKIP_FSM;
/*
- * Optimize if new relfilenode was created in this subxact or one of its
+ * Optimize if new relfilenumber was created in this subxact or one of its
* committed children and we won't see those rows later as part of an
* earlier scan or command. The subxact test ensures that if this subxact
* aborts then the frozen rows won't be visible after xact cleanup. Note
@@ -640,7 +640,7 @@ CopyFrom(CopyFromState cstate)
errmsg("cannot perform COPY FREEZE because of prior transaction activity")));
if (cstate->rel->rd_createSubid != GetCurrentSubTransactionId() &&
- cstate->rel->rd_newRelfilenodeSubid != GetCurrentSubTransactionId())
+ cstate->rel->rd_newRelfilelocatorSubid != GetCurrentSubTransactionId())
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("cannot perform COPY FREEZE because the table was not created or truncated in the current subtransaction")));
diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index f269168..ca2f884 100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -101,7 +101,7 @@ typedef struct
*/
typedef struct CreateDBRelInfo
{
- RelFileNode rnode; /* physical relation identifier */
+ RelFileLocator rlocator; /* physical relation identifier */
Oid reloid; /* relation oid */
bool permanent; /* relation is permanent or unlogged */
} CreateDBRelInfo;
@@ -127,7 +127,7 @@ static void CreateDatabaseUsingWalLog(Oid src_dboid, Oid dboid, Oid src_tsid,
static List *ScanSourceDatabasePgClass(Oid srctbid, Oid srcdbid, char *srcpath);
static List *ScanSourceDatabasePgClassPage(Page page, Buffer buf, Oid tbid,
Oid dbid, char *srcpath,
- List *rnodelist, Snapshot snapshot);
+ List *rlocatorlist, Snapshot snapshot);
static CreateDBRelInfo *ScanSourceDatabasePgClassTuple(HeapTupleData *tuple,
Oid tbid, Oid dbid,
char *srcpath);
@@ -147,12 +147,12 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
{
char *srcpath;
char *dstpath;
- List *rnodelist = NULL;
+ List *rlocatorlist = NULL;
ListCell *cell;
LockRelId srcrelid;
LockRelId dstrelid;
- RelFileNode srcrnode;
- RelFileNode dstrnode;
+ RelFileLocator srcrlocator;
+ RelFileLocator dstrlocator;
CreateDBRelInfo *relinfo;
/* Get source and destination database paths. */
@@ -165,9 +165,9 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
/* Copy relmap file from source database to the destination database. */
RelationMapCopy(dst_dboid, dst_tsid, srcpath, dstpath);
- /* Get list of relfilenodes to copy from the source database. */
- rnodelist = ScanSourceDatabasePgClass(src_tsid, src_dboid, srcpath);
- Assert(rnodelist != NIL);
+ /* Get list of relfilelocators to copy from the source database. */
+ rlocatorlist = ScanSourceDatabasePgClass(src_tsid, src_dboid, srcpath);
+ Assert(rlocatorlist != NIL);
/*
* Database IDs will be the same for all relations so set them before
@@ -176,11 +176,11 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
srcrelid.dbId = src_dboid;
dstrelid.dbId = dst_dboid;
- /* Loop over our list of relfilenodes and copy each one. */
- foreach(cell, rnodelist)
+ /* Loop over our list of relfilelocators and copy each one. */
+ foreach(cell, rlocatorlist)
{
relinfo = lfirst(cell);
- srcrnode = relinfo->rnode;
+ srcrlocator = relinfo->rlocator;
/*
* If the relation is from the source db's default tablespace then we
@@ -188,13 +188,13 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
* Otherwise, we need to create in the same tablespace as it is in the
* source database.
*/
- if (srcrnode.spcNode == src_tsid)
- dstrnode.spcNode = dst_tsid;
+ if (srcrlocator.spcOid == src_tsid)
+ dstrlocator.spcOid = dst_tsid;
else
- dstrnode.spcNode = srcrnode.spcNode;
+ dstrlocator.spcOid = srcrlocator.spcOid;
- dstrnode.dbNode = dst_dboid;
- dstrnode.relNode = srcrnode.relNode;
+ dstrlocator.dbOid = dst_dboid;
+ dstrlocator.relNumber = srcrlocator.relNumber;
/*
* Acquire locks on source and target relations before copying.
@@ -210,7 +210,7 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
LockRelationId(&dstrelid, AccessShareLock);
/* Copy relation storage from source to the destination. */
- CreateAndCopyRelationData(srcrnode, dstrnode, relinfo->permanent);
+ CreateAndCopyRelationData(srcrlocator, dstrlocator, relinfo->permanent);
/* Release the relation locks. */
UnlockRelationId(&srcrelid, AccessShareLock);
@@ -219,7 +219,7 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
pfree(srcpath);
pfree(dstpath);
- list_free_deep(rnodelist);
+ list_free_deep(rlocatorlist);
}
/*
@@ -246,31 +246,31 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
static List *
ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber nblocks;
BlockNumber blkno;
Buffer buf;
- Oid relfilenode;
+ RelFileNumber relfilenumber;
Page page;
- List *rnodelist = NIL;
+ List *rlocatorlist = NIL;
LockRelId relid;
Relation rel;
Snapshot snapshot;
BufferAccessStrategy bstrategy;
- /* Get pg_class relfilenode. */
- relfilenode = RelationMapOidToFilenodeForDatabase(srcpath,
- RelationRelationId);
+ /* Get pg_class relfilenumber. */
+ relfilenumber = RelationMapOidToFilenumberForDatabase(srcpath,
+ RelationRelationId);
/* Don't read data into shared_buffers without holding a relation lock. */
relid.dbId = dbid;
relid.relId = RelationRelationId;
LockRelationId(&relid, AccessShareLock);
- /* Prepare a RelFileNode for the pg_class relation. */
- rnode.spcNode = tbid;
- rnode.dbNode = dbid;
- rnode.relNode = relfilenode;
+ /* Prepare a RelFileLocator for the pg_class relation. */
+ rlocator.spcOid = tbid;
+ rlocator.dbOid = dbid;
+ rlocator.relNumber = relfilenumber;
/*
* We can't use a real relcache entry for a relation in some other
@@ -279,7 +279,7 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
* used the smgr layer directly, we would have to worry about
* invalidations.
*/
- rel = CreateFakeRelcacheEntry(rnode);
+ rel = CreateFakeRelcacheEntry(rlocator);
nblocks = smgrnblocks(RelationGetSmgr(rel), MAIN_FORKNUM);
FreeFakeRelcacheEntry(rel);
@@ -299,7 +299,7 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
{
CHECK_FOR_INTERRUPTS();
- buf = ReadBufferWithoutRelcache(rnode, MAIN_FORKNUM, blkno,
+ buf = ReadBufferWithoutRelcache(rlocator, MAIN_FORKNUM, blkno,
RBM_NORMAL, bstrategy, false);
LockBuffer(buf, BUFFER_LOCK_SHARE);
@@ -310,9 +310,9 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
continue;
}
- /* Append relevant pg_class tuples for current page to rnodelist. */
- rnodelist = ScanSourceDatabasePgClassPage(page, buf, tbid, dbid,
- srcpath, rnodelist,
+ /* Append relevant pg_class tuples for current page to rlocatorlist. */
+ rlocatorlist = ScanSourceDatabasePgClassPage(page, buf, tbid, dbid,
+ srcpath, rlocatorlist,
snapshot);
UnlockReleaseBuffer(buf);
@@ -321,16 +321,16 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
/* Release relation lock. */
UnlockRelationId(&relid, AccessShareLock);
- return rnodelist;
+ return rlocatorlist;
}
/*
* Scan one page of the source database's pg_class relation and add relevant
- * entries to rnodelist. The return value is the updated list.
+ * entries to rlocatorlist. The return value is the updated list.
*/
static List *
ScanSourceDatabasePgClassPage(Page page, Buffer buf, Oid tbid, Oid dbid,
- char *srcpath, List *rnodelist,
+ char *srcpath, List *rlocatorlist,
Snapshot snapshot)
{
BlockNumber blkno = BufferGetBlockNumber(buf);
@@ -376,11 +376,11 @@ ScanSourceDatabasePgClassPage(Page page, Buffer buf, Oid tbid, Oid dbid,
relinfo = ScanSourceDatabasePgClassTuple(&tuple, tbid, dbid,
srcpath);
if (relinfo != NULL)
- rnodelist = lappend(rnodelist, relinfo);
+ rlocatorlist = lappend(rlocatorlist, relinfo);
}
}
- return rnodelist;
+ return rlocatorlist;
}
/*
@@ -397,7 +397,7 @@ ScanSourceDatabasePgClassTuple(HeapTupleData *tuple, Oid tbid, Oid dbid,
{
CreateDBRelInfo *relinfo;
Form_pg_class classForm;
- Oid relfilenode = InvalidOid;
+ RelFileNumber relfilenumber = InvalidRelFileNumber;
classForm = (Form_pg_class) GETSTRUCT(tuple);
@@ -418,29 +418,29 @@ ScanSourceDatabasePgClassTuple(HeapTupleData *tuple, Oid tbid, Oid dbid,
return NULL;
/*
- * If relfilenode is valid then directly use it. Otherwise, consult the
+ * If relfilenumber is valid then directly use it. Otherwise, consult the
* relmap.
*/
if (OidIsValid(classForm->relfilenode))
- relfilenode = classForm->relfilenode;
+ relfilenumber = classForm->relfilenode;
else
- relfilenode = RelationMapOidToFilenodeForDatabase(srcpath,
- classForm->oid);
+ relfilenumber = RelationMapOidToFilenumberForDatabase(srcpath,
+ classForm->oid);
- /* We must have a valid relfilenode oid. */
- if (!OidIsValid(relfilenode))
- elog(ERROR, "relation with OID %u does not have a valid relfilenode",
+ /* We must have a valid relfilenumber. */
+ if (!RelFileNumberIsValid(relfilenumber))
+ elog(ERROR, "relation with OID %u does not have a valid relfilenumber",
classForm->oid);
/* Prepare a rel info element and add it to the list. */
relinfo = (CreateDBRelInfo *) palloc(sizeof(CreateDBRelInfo));
if (OidIsValid(classForm->reltablespace))
- relinfo->rnode.spcNode = classForm->reltablespace;
+ relinfo->rlocator.spcOid = classForm->reltablespace;
else
- relinfo->rnode.spcNode = tbid;
+ relinfo->rlocator.spcOid = tbid;
- relinfo->rnode.dbNode = dbid;
- relinfo->rnode.relNode = relfilenode;
+ relinfo->rlocator.dbOid = dbid;
+ relinfo->rlocator.relNumber = relfilenumber;
relinfo->reloid = classForm->oid;
/* Temporary relations were rejected above. */
@@ -2867,8 +2867,8 @@ remove_dbtablespaces(Oid db_id)
* try to remove that already-existing subdirectory during the cleanup in
* remove_dbtablespaces. Nuking existing files seems like a bad idea, so
* instead we make this extra check before settling on the OID of the new
- * database. This exactly parallels what GetNewRelFileNode() does for table
- * relfilenode values.
+ * database. This exactly parallels what GetNewRelFileNumber() does for table
+ * relfilenumber values.
*/
static bool
check_db_file_conflict(Oid db_id)
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 99f5ab8..7a827d4 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -1109,10 +1109,10 @@ DefineIndex(Oid relationId,
}
/*
- * A valid stmt->oldNode implies that we already have a built form of the
+ * A valid stmt->oldNumber implies that we already have a built form of the
* index. The caller should also decline any index build.
*/
- Assert(!OidIsValid(stmt->oldNode) || (skip_build && !concurrent));
+ Assert(!RelFileNumberIsValid(stmt->oldNumber) || (skip_build && !concurrent));
/*
* Make the catalog entries for the index, including constraints. This
@@ -1154,7 +1154,7 @@ DefineIndex(Oid relationId,
indexRelationId =
index_create(rel, indexRelationName, indexRelationId, parentIndexId,
parentConstraintId,
- stmt->oldNode, indexInfo, indexColNames,
+ stmt->oldNumber, indexInfo, indexColNames,
accessMethodId, tablespaceId,
collationObjectId, classObjectId,
coloptions, reloptions,
@@ -1361,15 +1361,15 @@ DefineIndex(Oid relationId,
* We can't use the same index name for the child index,
* so clear idxname to let the recursive invocation choose
* a new name. Likewise, the existing target relation
- * field is wrong, and if indexOid or oldNode are set,
+ * field is wrong, and if indexOid or oldNumber are set,
* they mustn't be applied to the child either.
*/
childStmt->idxname = NULL;
childStmt->relation = NULL;
childStmt->indexOid = InvalidOid;
- childStmt->oldNode = InvalidOid;
+ childStmt->oldNumber = InvalidRelFileNumber;
childStmt->oldCreateSubid = InvalidSubTransactionId;
- childStmt->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ childStmt->oldFirstRelfilenumberSubid = InvalidSubTransactionId;
/*
* Adjust any Vars (both in expressions and in the index's
@@ -3015,7 +3015,7 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
* particular this eliminates all shared catalogs.).
*/
if (RELKIND_HAS_STORAGE(classtuple->relkind) &&
- !OidIsValid(classtuple->relfilenode))
+ !RelFileNumberIsValid(classtuple->relfilenode))
skip_rel = true;
/*
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index d1ee106..9ac0383 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -118,7 +118,7 @@ SetMatViewPopulatedState(Relation relation, bool newstate)
* ExecRefreshMatView -- execute a REFRESH MATERIALIZED VIEW command
*
* This refreshes the materialized view by creating a new table and swapping
- * the relfilenodes of the new table and the old materialized view, so the OID
+ * the relfilenumbers of the new table and the old materialized view, so the OID
* of the original materialized view is preserved. Thus we do not lose GRANT
* nor references to this materialized view.
*
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index ddf219b..48d9d43 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -75,7 +75,7 @@ typedef struct sequence_magic
typedef struct SeqTableData
{
Oid relid; /* pg_class OID of this sequence (hash key) */
- Oid filenode; /* last seen relfilenode of this sequence */
+ RelFileNumber filenumber; /* last seen relfilenumber of this sequence */
LocalTransactionId lxid; /* xact in which we last did a seq op */
bool last_valid; /* do we have a valid "last" value? */
int64 last; /* value last returned by nextval */
@@ -255,7 +255,7 @@ DefineSequence(ParseState *pstate, CreateSeqStmt *seq)
*
* The change is made transactionally, so that on failure of the current
* transaction, the sequence will be restored to its previous state.
- * We do that by creating a whole new relfilenode for the sequence; so this
+ * We do that by creating a whole new relfilenumber for the sequence; so this
* works much like the rewriting forms of ALTER TABLE.
*
* Caller is assumed to have acquired AccessExclusiveLock on the sequence,
@@ -310,7 +310,7 @@ ResetSequence(Oid seq_relid)
/*
* Create a new storage file for the sequence.
*/
- RelationSetNewRelfilenode(seq_rel, seq_rel->rd_rel->relpersistence);
+ RelationSetNewRelfilenumber(seq_rel, seq_rel->rd_rel->relpersistence);
/*
* Ensure sequence's relfrozenxid is at 0, since it won't contain any
@@ -347,9 +347,9 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
{
SMgrRelation srel;
- srel = smgropen(rel->rd_node, InvalidBackendId);
+ srel = smgropen(rel->rd_locator, InvalidBackendId);
smgrcreate(srel, INIT_FORKNUM, false);
- log_smgrcreate(&rel->rd_node, INIT_FORKNUM);
+ log_smgrcreate(&rel->rd_locator, INIT_FORKNUM);
fill_seq_fork_with_data(rel, tuple, INIT_FORKNUM);
FlushRelationBuffers(rel);
smgrclose(srel);
@@ -418,7 +418,7 @@ fill_seq_fork_with_data(Relation rel, HeapTuple tuple, ForkNumber forkNum)
XLogBeginInsert();
XLogRegisterBuffer(0, buf, REGBUF_WILL_INIT);
- xlrec.node = rel->rd_node;
+ xlrec.locator = rel->rd_locator;
XLogRegisterData((char *) &xlrec, sizeof(xl_seq_rec));
XLogRegisterData((char *) tuple->t_data, tuple->t_len);
@@ -509,7 +509,7 @@ AlterSequence(ParseState *pstate, AlterSeqStmt *stmt)
* Create a new storage file for the sequence, making the state
* changes transactional.
*/
- RelationSetNewRelfilenode(seqrel, seqrel->rd_rel->relpersistence);
+ RelationSetNewRelfilenumber(seqrel, seqrel->rd_rel->relpersistence);
/*
* Ensure sequence's relfrozenxid is at 0, since it won't contain any
@@ -557,7 +557,7 @@ SequenceChangePersistence(Oid relid, char newrelpersistence)
GetTopTransactionId();
(void) read_seq_tuple(seqrel, &buf, &seqdatatuple);
- RelationSetNewRelfilenode(seqrel, newrelpersistence);
+ RelationSetNewRelfilenumber(seqrel, newrelpersistence);
fill_seq_with_data(seqrel, &seqdatatuple);
UnlockReleaseBuffer(buf);
@@ -836,7 +836,7 @@ nextval_internal(Oid relid, bool check_permissions)
seq->is_called = true;
seq->log_cnt = 0;
- xlrec.node = seqrel->rd_node;
+ xlrec.locator = seqrel->rd_locator;
XLogRegisterData((char *) &xlrec, sizeof(xl_seq_rec));
XLogRegisterData((char *) seqdatatuple.t_data, seqdatatuple.t_len);
@@ -1023,7 +1023,7 @@ do_setval(Oid relid, int64 next, bool iscalled)
XLogBeginInsert();
XLogRegisterBuffer(0, buf, REGBUF_WILL_INIT);
- xlrec.node = seqrel->rd_node;
+ xlrec.locator = seqrel->rd_locator;
XLogRegisterData((char *) &xlrec, sizeof(xl_seq_rec));
XLogRegisterData((char *) seqdatatuple.t_data, seqdatatuple.t_len);
@@ -1147,7 +1147,7 @@ init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel)
if (!found)
{
/* relid already filled in */
- elm->filenode = InvalidOid;
+ elm->filenumber = InvalidRelFileNumber;
elm->lxid = InvalidLocalTransactionId;
elm->last_valid = false;
elm->last = elm->cached = 0;
@@ -1169,9 +1169,9 @@ init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel)
* discard any cached-but-unissued values. We do not touch the currval()
* state, however.
*/
- if (seqrel->rd_rel->relfilenode != elm->filenode)
+ if (seqrel->rd_rel->relfilenode != elm->filenumber)
{
- elm->filenode = seqrel->rd_rel->relfilenode;
+ elm->filenumber = seqrel->rd_rel->relfilenode;
elm->cached = elm->last;
}
@@ -1254,7 +1254,8 @@ read_seq_tuple(Relation rel, Buffer *buf, HeapTuple seqdatatuple)
* changed. This allows ALTER SEQUENCE to behave transactionally. Currently,
* the only option that doesn't cause that is OWNED BY. It's *necessary* for
* ALTER SEQUENCE OWNED BY to not rewrite the sequence, because that would
- * break pg_upgrade by causing unwanted changes in the sequence's relfilenode.
+ * break pg_upgrade by causing unwanted changes in the sequence's
+ * relfilenumber.
*/
static void
init_params(ParseState *pstate, List *options, bool for_identity,
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 2de0eba..bf645b8 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -596,7 +596,7 @@ static void ATExecForceNoForceRowSecurity(Relation rel, bool force_rls);
static ObjectAddress ATExecSetCompression(AlteredTableInfo *tab, Relation rel,
const char *column, Node *newValue, LOCKMODE lockmode);
-static void index_copy_data(Relation rel, RelFileNode newrnode);
+static void index_copy_data(Relation rel, RelFileLocator newrlocator);
static const char *storage_name(char c);
static void RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid,
@@ -1986,12 +1986,12 @@ ExecuteTruncateGuts(List *explicit_rels,
/*
* Normally, we need a transaction-safe truncation here. However, if
* the table was either created in the current (sub)transaction or has
- * a new relfilenode in the current (sub)transaction, then we can just
+ * a new relfilenumber in the current (sub)transaction, then we can just
* truncate it in-place, because a rollback would cause the whole
* table or the current physical file to be thrown away anyway.
*/
if (rel->rd_createSubid == mySubid ||
- rel->rd_newRelfilenodeSubid == mySubid)
+ rel->rd_newRelfilelocatorSubid == mySubid)
{
/* Immediate, non-rollbackable truncation is OK */
heap_truncate_one_rel(rel);
@@ -2014,10 +2014,10 @@ ExecuteTruncateGuts(List *explicit_rels,
* Need the full transaction-safe pushups.
*
* Create a new empty storage file for the relation, and assign it
- * as the relfilenode value. The old storage file is scheduled for
+ * as the relfilenumber value. The old storage file is scheduled for
* deletion at commit.
*/
- RelationSetNewRelfilenode(rel, rel->rd_rel->relpersistence);
+ RelationSetNewRelfilenumber(rel, rel->rd_rel->relpersistence);
heap_relid = RelationGetRelid(rel);
@@ -2030,7 +2030,7 @@ ExecuteTruncateGuts(List *explicit_rels,
Relation toastrel = relation_open(toast_relid,
AccessExclusiveLock);
- RelationSetNewRelfilenode(toastrel,
+ RelationSetNewRelfilenumber(toastrel,
toastrel->rd_rel->relpersistence);
table_close(toastrel, NoLock);
}
@@ -3315,10 +3315,10 @@ CheckRelationTableSpaceMove(Relation rel, Oid newTableSpaceId)
/*
* SetRelationTableSpace
- * Set new reltablespace and relfilenode in pg_class entry.
+ * Set new reltablespace and relfilenumber in pg_class entry.
*
* newTableSpaceId is the new tablespace for the relation, and
- * newRelFileNode its new filenode. If newRelFileNode is InvalidOid,
+ * newRelFilenumber its new filenumber. If newRelFilenumber is InvalidRelFileNumber,
* this field is not updated.
*
* NOTE: The caller must hold AccessExclusiveLock on the relation.
@@ -3331,7 +3331,7 @@ CheckRelationTableSpaceMove(Relation rel, Oid newTableSpaceId)
void
SetRelationTableSpace(Relation rel,
Oid newTableSpaceId,
- Oid newRelFileNode)
+ RelFileNumber newRelFilenumber)
{
Relation pg_class;
HeapTuple tuple;
@@ -3351,8 +3351,8 @@ SetRelationTableSpace(Relation rel,
/* Update the pg_class row. */
rd_rel->reltablespace = (newTableSpaceId == MyDatabaseTableSpace) ?
InvalidOid : newTableSpaceId;
- if (OidIsValid(newRelFileNode))
- rd_rel->relfilenode = newRelFileNode;
+ if (RelFileNumberIsValid(newRelFilenumber))
+ rd_rel->relfilenode = newRelFilenumber;
CatalogTupleUpdate(pg_class, &tuple->t_self, tuple);
/*
@@ -5420,7 +5420,7 @@ ATRewriteTables(AlterTableStmt *parsetree, List **wqueue, LOCKMODE lockmode,
* persistence: on one hand, we need to ensure that the buffers
* belonging to each of the two relations are marked with or without
* BM_PERMANENT properly. On the other hand, since rewriting creates
- * and assigns a new relfilenode, we automatically create or drop an
+ * and assigns a new relfilenumber, we automatically create or drop an
* init fork for the relation as appropriate.
*/
if (tab->rewrite > 0 && tab->relkind != RELKIND_SEQUENCE)
@@ -5506,12 +5506,13 @@ ATRewriteTables(AlterTableStmt *parsetree, List **wqueue, LOCKMODE lockmode,
* Create transient table that will receive the modified data.
*
* Ensure it is marked correctly as logged or unlogged. We have
- * to do this here so that buffers for the new relfilenode will
+ * to do this here so that buffers for the new relfilenumber will
* have the right persistence set, and at the same time ensure
- * that the original filenode's buffers will get read in with the
- * correct setting (i.e. the original one). Otherwise a rollback
- * after the rewrite would possibly result with buffers for the
- * original filenode having the wrong persistence setting.
+ * that the original filenumber's buffers will get read in with
+ * the correct setting (i.e. the original one). Otherwise a
+ * rollback after the rewrite would possibly result in buffers
+ * for the original filenumber having the wrong persistence
+ * setting.
*
* NB: This relies on swap_relation_files() also swapping the
* persistence. That wouldn't work for pg_class, but that can't be
@@ -8597,7 +8598,7 @@ ATExecAddIndex(AlteredTableInfo *tab, Relation rel,
/* suppress schema rights check when rebuilding existing index */
check_rights = !is_rebuild;
/* skip index build if phase 3 will do it or we're reusing an old one */
- skip_build = tab->rewrite > 0 || OidIsValid(stmt->oldNode);
+ skip_build = tab->rewrite > 0 || RelFileNumberIsValid(stmt->oldNumber);
/* suppress notices when rebuilding existing index */
quiet = is_rebuild;
@@ -8613,7 +8614,7 @@ ATExecAddIndex(AlteredTableInfo *tab, Relation rel,
quiet);
/*
- * If TryReuseIndex() stashed a relfilenode for us, we used it for the new
+ * If TryReuseIndex() stashed a relfilenumber for us, we used it for the new
* index instead of building from scratch. Restore associated fields.
* This may store InvalidSubTransactionId in both fields, in which case
* relcache.c will assume it can rebuild the relcache entry. Hence, do
@@ -8621,13 +8622,13 @@ ATExecAddIndex(AlteredTableInfo *tab, Relation rel,
* DROP of the old edition of this index will have scheduled the storage
* for deletion at commit, so cancel that pending deletion.
*/
- if (OidIsValid(stmt->oldNode))
+ if (RelFileNumberIsValid(stmt->oldNumber))
{
Relation irel = index_open(address.objectId, NoLock);
irel->rd_createSubid = stmt->oldCreateSubid;
- irel->rd_firstRelfilenodeSubid = stmt->oldFirstRelfilenodeSubid;
- RelationPreserveStorage(irel->rd_node, true);
+ irel->rd_firstRelfilelocatorSubid = stmt->oldFirstRelfilenumberSubid;
+ RelationPreserveStorage(irel->rd_locator, true);
index_close(irel, NoLock);
}
@@ -13491,9 +13492,9 @@ TryReuseIndex(Oid oldId, IndexStmt *stmt)
/* If it's a partitioned index, there is no storage to share. */
if (irel->rd_rel->relkind != RELKIND_PARTITIONED_INDEX)
{
- stmt->oldNode = irel->rd_node.relNode;
+ stmt->oldNumber = irel->rd_locator.relNumber;
stmt->oldCreateSubid = irel->rd_createSubid;
- stmt->oldFirstRelfilenodeSubid = irel->rd_firstRelfilenodeSubid;
+ stmt->oldFirstRelfilenumberSubid = irel->rd_firstRelfilelocatorSubid;
}
index_close(irel, NoLock);
}
@@ -14340,8 +14341,8 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
{
Relation rel;
Oid reltoastrelid;
- Oid newrelfilenode;
- RelFileNode newrnode;
+ RelFileNumber newrelfilenumber;
+ RelFileLocator newrlocator;
List *reltoastidxids = NIL;
ListCell *lc;
@@ -14370,26 +14371,28 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
}
/*
- * Relfilenodes are not unique in databases across tablespaces, so we need
+ * Relfilenumbers are not unique in databases across tablespaces, so we need
* to allocate a new one in the new tablespace.
*/
- newrelfilenode = GetNewRelFileNode(newTableSpace, NULL,
- rel->rd_rel->relpersistence);
+ newrelfilenumber = GetNewRelFileNumber(newTableSpace, NULL,
+ rel->rd_rel->relpersistence);
/* Open old and new relation */
- newrnode = rel->rd_node;
- newrnode.relNode = newrelfilenode;
- newrnode.spcNode = newTableSpace;
+ newrlocator = rel->rd_locator;
+ newrlocator.relNumber = newrelfilenumber;
+ newrlocator.spcOid = newTableSpace;
- /* hand off to AM to actually create the new filenode and copy the data */
+ /*
+ * hand off to AM to actually create the new filelocator and copy the data
+ */
if (rel->rd_rel->relkind == RELKIND_INDEX)
{
- index_copy_data(rel, newrnode);
+ index_copy_data(rel, newrlocator);
}
else
{
Assert(RELKIND_HAS_TABLE_AM(rel->rd_rel->relkind));
- table_relation_copy_data(rel, &newrnode);
+ table_relation_copy_data(rel, &newrlocator);
}
/*
@@ -14400,11 +14403,11 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
* the updated pg_class entry), but that's forbidden with
* CheckRelationTableSpaceMove().
*/
- SetRelationTableSpace(rel, newTableSpace, newrelfilenode);
+ SetRelationTableSpace(rel, newTableSpace, newrelfilenumber);
InvokeObjectPostAlterHook(RelationRelationId, RelationGetRelid(rel), 0);
- RelationAssumeNewRelfilenode(rel);
+ RelationAssumeNewRelfilelocator(rel);
relation_close(rel, NoLock);
@@ -14630,11 +14633,11 @@ AlterTableMoveAll(AlterTableMoveAllStmt *stmt)
}
static void
-index_copy_data(Relation rel, RelFileNode newrnode)
+index_copy_data(Relation rel, RelFileLocator newrlocator)
{
SMgrRelation dstrel;
- dstrel = smgropen(newrnode, rel->rd_backend);
+ dstrel = smgropen(newrlocator, rel->rd_backend);
/*
* Since we copy the file directly without looking at the shared buffers,
@@ -14648,10 +14651,10 @@ index_copy_data(Relation rel, RelFileNode newrnode)
* Create and copy all forks of the relation, and schedule unlinking of
* old physical files.
*
- * NOTE: any conflict in relfilenode value will be caught in
+ * NOTE: any conflict in relfilelocator value will be caught in
* RelationCreateStorage().
*/
- RelationCreateStorage(newrnode, rel->rd_rel->relpersistence, true);
+ RelationCreateStorage(newrlocator, rel->rd_rel->relpersistence, true);
/* copy main fork */
RelationCopyStorage(RelationGetSmgr(rel), dstrel, MAIN_FORKNUM,
@@ -14672,7 +14675,7 @@ index_copy_data(Relation rel, RelFileNode newrnode)
if (RelationIsPermanent(rel) ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
- log_smgrcreate(&newrnode, forkNum);
+ log_smgrcreate(&newrlocator, forkNum);
RelationCopyStorage(RelationGetSmgr(rel), dstrel, forkNum,
rel->rd_rel->relpersistence);
}
diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index 00ca397..c8bdd99 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -12,12 +12,12 @@
* remove the possibility of having file name conflicts, we isolate
* files within a tablespace into database-specific subdirectories.
*
- * To support file access via the information given in RelFileNode, we
+ * To support file access via the information given in RelFileLocator, we
* maintain a symbolic-link map in $PGDATA/pg_tblspc. The symlinks are
* named by tablespace OIDs and point to the actual tablespace directories.
* There is also a per-cluster version directory in each tablespace.
* Thus the full path to an arbitrary file is
- * $PGDATA/pg_tblspc/spcoid/PG_MAJORVER_CATVER/dboid/relfilenode
+ * $PGDATA/pg_tblspc/spcoid/PG_MAJORVER_CATVER/dboid/relfilenumber
* e.g.
* $PGDATA/pg_tblspc/20981/PG_9.0_201002161/719849/83292814
*
@@ -25,8 +25,8 @@
* tables) and pg_default (for everything else). For backwards compatibility
* and to remain functional on platforms without symlinks, these tablespaces
* are accessed specially: they are respectively
- * $PGDATA/global/relfilenode
- * $PGDATA/base/dboid/relfilenode
+ * $PGDATA/global/relfilenumber
+ * $PGDATA/base/dboid/relfilenumber
*
* To allow CREATE DATABASE to give a new database a default tablespace
* that's different from the template database's default, we make the
@@ -115,7 +115,7 @@ static bool destroy_tablespace_directories(Oid tablespaceoid, bool redo);
* re-create a database subdirectory (of $PGDATA/base) during WAL replay.
*/
void
-TablespaceCreateDbspace(Oid spcNode, Oid dbNode, bool isRedo)
+TablespaceCreateDbspace(Oid spcOid, Oid dbOid, bool isRedo)
{
struct stat st;
char *dir;
@@ -124,13 +124,13 @@ TablespaceCreateDbspace(Oid spcNode, Oid dbNode, bool isRedo)
* The global tablespace doesn't have per-database subdirectories, so
* nothing to do for it.
*/
- if (spcNode == GLOBALTABLESPACE_OID)
+ if (spcOid == GLOBALTABLESPACE_OID)
return;
- Assert(OidIsValid(spcNode));
- Assert(OidIsValid(dbNode));
+ Assert(OidIsValid(spcOid));
+ Assert(OidIsValid(dbOid));
- dir = GetDatabasePath(dbNode, spcNode);
+ dir = GetDatabasePath(dbOid, spcOid);
if (stat(dir, &st) < 0)
{
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 51d630f..7d50b50 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4193,9 +4193,9 @@ _copyIndexStmt(const IndexStmt *from)
COPY_NODE_FIELD(excludeOpNames);
COPY_STRING_FIELD(idxcomment);
COPY_SCALAR_FIELD(indexOid);
- COPY_SCALAR_FIELD(oldNode);
+ COPY_SCALAR_FIELD(oldNumber);
COPY_SCALAR_FIELD(oldCreateSubid);
- COPY_SCALAR_FIELD(oldFirstRelfilenodeSubid);
+ COPY_SCALAR_FIELD(oldFirstRelfilenumberSubid);
COPY_SCALAR_FIELD(unique);
COPY_SCALAR_FIELD(nulls_not_distinct);
COPY_SCALAR_FIELD(primary);
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index e747e16..d63d326 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1752,9 +1752,9 @@ _equalIndexStmt(const IndexStmt *a, const IndexStmt *b)
COMPARE_NODE_FIELD(excludeOpNames);
COMPARE_STRING_FIELD(idxcomment);
COMPARE_SCALAR_FIELD(indexOid);
- COMPARE_SCALAR_FIELD(oldNode);
+ COMPARE_SCALAR_FIELD(oldNumber);
COMPARE_SCALAR_FIELD(oldCreateSubid);
- COMPARE_SCALAR_FIELD(oldFirstRelfilenodeSubid);
+ COMPARE_SCALAR_FIELD(oldFirstRelfilenumberSubid);
COMPARE_SCALAR_FIELD(unique);
COMPARE_SCALAR_FIELD(nulls_not_distinct);
COMPARE_SCALAR_FIELD(primary);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index ce12915..3724d48 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2928,9 +2928,9 @@ _outIndexStmt(StringInfo str, const IndexStmt *node)
WRITE_NODE_FIELD(excludeOpNames);
WRITE_STRING_FIELD(idxcomment);
WRITE_OID_FIELD(indexOid);
- WRITE_OID_FIELD(oldNode);
+ WRITE_OID_FIELD(oldNumber);
WRITE_UINT_FIELD(oldCreateSubid);
- WRITE_UINT_FIELD(oldFirstRelfilenodeSubid);
+ WRITE_UINT_FIELD(oldFirstRelfilenumberSubid);
WRITE_BOOL_FIELD(unique);
WRITE_BOOL_FIELD(nulls_not_distinct);
WRITE_BOOL_FIELD(primary);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 969c9c1..394404d 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -7990,9 +7990,9 @@ IndexStmt: CREATE opt_unique INDEX opt_concurrently opt_index_name
n->excludeOpNames = NIL;
n->idxcomment = NULL;
n->indexOid = InvalidOid;
- n->oldNode = InvalidOid;
+ n->oldNumber = InvalidRelFileNumber;
n->oldCreateSubid = InvalidSubTransactionId;
- n->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ n->oldFirstRelfilenumberSubid = InvalidSubTransactionId;
n->primary = false;
n->isconstraint = false;
n->deferrable = false;
@@ -8022,9 +8022,9 @@ IndexStmt: CREATE opt_unique INDEX opt_concurrently opt_index_name
n->excludeOpNames = NIL;
n->idxcomment = NULL;
n->indexOid = InvalidOid;
- n->oldNode = InvalidOid;
+ n->oldNumber = InvalidRelFileNumber;
n->oldCreateSubid = InvalidSubTransactionId;
- n->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ n->oldFirstRelfilenumberSubid = InvalidSubTransactionId;
n->primary = false;
n->isconstraint = false;
n->deferrable = false;
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index 1a64a52..390b454 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -1578,9 +1578,9 @@ generateClonedIndexStmt(RangeVar *heapRel, Relation source_idx,
index->excludeOpNames = NIL;
index->idxcomment = NULL;
index->indexOid = InvalidOid;
- index->oldNode = InvalidOid;
+ index->oldNumber = InvalidRelFileNumber;
index->oldCreateSubid = InvalidSubTransactionId;
- index->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ index->oldFirstRelfilenumberSubid = InvalidSubTransactionId;
index->unique = idxrec->indisunique;
index->nulls_not_distinct = idxrec->indnullsnotdistinct;
index->primary = idxrec->indisprimary;
@@ -2201,9 +2201,9 @@ transformIndexConstraint(Constraint *constraint, CreateStmtContext *cxt)
index->excludeOpNames = NIL;
index->idxcomment = NULL;
index->indexOid = InvalidOid;
- index->oldNode = InvalidOid;
+ index->oldNumber = InvalidRelFileNumber;
index->oldCreateSubid = InvalidSubTransactionId;
- index->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ index->oldFirstRelfilenumberSubid = InvalidSubTransactionId;
index->transformed = false;
index->concurrent = false;
index->if_not_exists = false;
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index c937c39..5fc076f 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -1207,7 +1207,7 @@ CompactCheckpointerRequestQueue(void)
* We use the request struct directly as a hashtable key. This
* assumes that any padding bytes in the structs are consistently the
* same, which should be okay because we zeroed them in
- * CheckpointerShmemInit. Note also that RelFileNode had better
+ * CheckpointerShmemInit. Note also that RelFileLocator had better
* contain no pad bytes.
*/
request = &CheckpointerShmem->requests[n];
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index aa2427b..c5c6a2b 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -845,7 +845,7 @@ DecodeInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
XLogReaderState *r = buf->record;
xl_heap_insert *xlrec;
ReorderBufferChange *change;
- RelFileNode target_node;
+ RelFileLocator target_locator;
xlrec = (xl_heap_insert *) XLogRecGetData(r);
@@ -857,8 +857,8 @@ DecodeInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
return;
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
- if (target_node.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &target_locator, NULL, NULL);
+ if (target_locator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -872,7 +872,7 @@ DecodeInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change->action = REORDER_BUFFER_CHANGE_INTERNAL_SPEC_INSERT;
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &target_node, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &target_locator, sizeof(RelFileLocator));
tupledata = XLogRecGetBlockData(r, 0, &datalen);
tuplelen = datalen - SizeOfHeapHeader;
@@ -902,13 +902,13 @@ DecodeUpdate(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
xl_heap_update *xlrec;
ReorderBufferChange *change;
char *data;
- RelFileNode target_node;
+ RelFileLocator target_locator;
xlrec = (xl_heap_update *) XLogRecGetData(r);
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
- if (target_node.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &target_locator, NULL, NULL);
+ if (target_locator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -918,7 +918,7 @@ DecodeUpdate(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change = ReorderBufferGetChange(ctx->reorder);
change->action = REORDER_BUFFER_CHANGE_UPDATE;
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &target_node, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &target_locator, sizeof(RelFileLocator));
if (xlrec->flags & XLH_UPDATE_CONTAINS_NEW_TUPLE)
{
@@ -968,13 +968,13 @@ DecodeDelete(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
XLogReaderState *r = buf->record;
xl_heap_delete *xlrec;
ReorderBufferChange *change;
- RelFileNode target_node;
+ RelFileLocator target_locator;
xlrec = (xl_heap_delete *) XLogRecGetData(r);
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
- if (target_node.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &target_locator, NULL, NULL);
+ if (target_locator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -990,7 +990,7 @@ DecodeDelete(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &target_node, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &target_locator, sizeof(RelFileLocator));
/* old primary key stored */
if (xlrec->flags & XLH_DELETE_CONTAINS_OLD)
@@ -1063,7 +1063,7 @@ DecodeMultiInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
char *data;
char *tupledata;
Size tuplelen;
- RelFileNode rnode;
+ RelFileLocator rlocator;
xlrec = (xl_heap_multi_insert *) XLogRecGetData(r);
@@ -1075,8 +1075,8 @@ DecodeMultiInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
return;
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &rnode, NULL, NULL);
- if (rnode.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &rlocator, NULL, NULL);
+ if (rlocator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -1103,7 +1103,7 @@ DecodeMultiInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change->action = REORDER_BUFFER_CHANGE_INSERT;
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &rnode, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &rlocator, sizeof(RelFileLocator));
xlhdr = (xl_multi_insert_tuple *) SHORTALIGN(data);
data = ((char *) xlhdr) + SizeOfMultiInsertTuple;
@@ -1165,11 +1165,11 @@ DecodeSpecConfirm(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
{
XLogReaderState *r = buf->record;
ReorderBufferChange *change;
- RelFileNode target_node;
+ RelFileLocator target_locator;
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
- if (target_node.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &target_locator, NULL, NULL);
+ if (target_locator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -1180,7 +1180,7 @@ DecodeSpecConfirm(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change->action = REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM;
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &target_node, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &target_locator, sizeof(RelFileLocator));
change->data.tp.clear_toast_afterwards = true;
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 8da5f90..f8fb228 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -106,7 +106,7 @@
#include "utils/memdebug.h"
#include "utils/memutils.h"
#include "utils/rel.h"
-#include "utils/relfilenodemap.h"
+#include "utils/relfilenumbermap.h"
/* entry for a hash table we use to map from xid to our transaction state */
@@ -116,10 +116,10 @@ typedef struct ReorderBufferTXNByIdEnt
ReorderBufferTXN *txn;
} ReorderBufferTXNByIdEnt;
-/* data structures for (relfilenode, ctid) => (cmin, cmax) mapping */
+/* data structures for (relfilelocator, ctid) => (cmin, cmax) mapping */
typedef struct ReorderBufferTupleCidKey
{
- RelFileNode relnode;
+ RelFileLocator rlocator;
ItemPointerData tid;
} ReorderBufferTupleCidKey;
@@ -1643,7 +1643,7 @@ ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, bool txn_prep
}
/*
- * Destroy the (relfilenode, ctid) hashtable, so that we don't leak any
+ * Destroy the (relfilelocator, ctid) hashtable, so that we don't leak any
* memory. We could also keep the hash table and update it with new ctid
* values, but this seems simpler and good enough for now.
*/
@@ -1673,7 +1673,7 @@ ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, bool txn_prep
}
/*
- * Build a hash with a (relfilenode, ctid) -> (cmin, cmax) mapping for use by
+ * Build a hash with a (relfilelocator, ctid) -> (cmin, cmax) mapping for use by
* HeapTupleSatisfiesHistoricMVCC.
*/
static void
@@ -1711,7 +1711,7 @@ ReorderBufferBuildTupleCidHash(ReorderBuffer *rb, ReorderBufferTXN *txn)
/* be careful about padding */
memset(&key, 0, sizeof(ReorderBufferTupleCidKey));
- key.relnode = change->data.tuplecid.node;
+ key.rlocator = change->data.tuplecid.locator;
ItemPointerCopy(&change->data.tuplecid.tid,
&key.tid);
@@ -2140,36 +2140,36 @@ ReorderBufferProcessTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
case REORDER_BUFFER_CHANGE_DELETE:
Assert(snapshot_now);
- reloid = RelidByRelfilenode(change->data.tp.relnode.spcNode,
- change->data.tp.relnode.relNode);
+ reloid = RelidByRelfilenumber(change->data.tp.rlocator.spcOid,
+ change->data.tp.rlocator.relNumber);
/*
* Mapped catalog tuple without data, emitted while
* catalog table was in the process of being rewritten. We
- * can fail to look up the relfilenode, because the
+ * can fail to look up the relfilenumber, because the
* relmapper has no "historic" view, in contrast to the
* normal catalog during decoding. Thus repeated rewrites
* can cause a lookup failure. That's OK because we do not
* decode catalog changes anyway. Normally such tuples
* would be skipped over below, but we can't identify
* whether the table should be logically logged without
- * mapping the relfilenode to the oid.
+ * mapping the relfilenumber to the oid.
*/
if (reloid == InvalidOid &&
change->data.tp.newtuple == NULL &&
change->data.tp.oldtuple == NULL)
goto change_done;
else if (reloid == InvalidOid)
- elog(ERROR, "could not map filenode \"%s\" to relation OID",
- relpathperm(change->data.tp.relnode,
+ elog(ERROR, "could not map filenumber \"%s\" to relation OID",
+ relpathperm(change->data.tp.rlocator,
MAIN_FORKNUM));
relation = RelationIdGetRelation(reloid);
if (!RelationIsValid(relation))
- elog(ERROR, "could not open relation with OID %u (for filenode \"%s\")",
+ elog(ERROR, "could not open relation with OID %u (for filenumber \"%s\")",
reloid,
- relpathperm(change->data.tp.relnode,
+ relpathperm(change->data.tp.rlocator,
MAIN_FORKNUM));
if (!RelationIsLogicallyLogged(relation))
@@ -3157,7 +3157,7 @@ ReorderBufferChangeMemoryUpdate(ReorderBuffer *rb,
}
/*
- * Add new (relfilenode, tid) -> (cmin, cmax) mappings.
+ * Add new (relfilelocator, tid) -> (cmin, cmax) mappings.
*
* We do not include this change type in memory accounting, because we
* keep CIDs in a separate list and do not evict them when reaching
@@ -3165,7 +3165,7 @@ ReorderBufferChangeMemoryUpdate(ReorderBuffer *rb,
*/
void
ReorderBufferAddNewTupleCids(ReorderBuffer *rb, TransactionId xid,
- XLogRecPtr lsn, RelFileNode node,
+ XLogRecPtr lsn, RelFileLocator locator,
ItemPointerData tid, CommandId cmin,
CommandId cmax, CommandId combocid)
{
@@ -3174,7 +3174,7 @@ ReorderBufferAddNewTupleCids(ReorderBuffer *rb, TransactionId xid,
txn = ReorderBufferTXNByXid(rb, xid, true, NULL, lsn, true);
- change->data.tuplecid.node = node;
+ change->data.tuplecid.locator = locator;
change->data.tuplecid.tid = tid;
change->data.tuplecid.cmin = cmin;
change->data.tuplecid.cmax = cmax;
@@ -4839,7 +4839,7 @@ ReorderBufferToastReset(ReorderBuffer *rb, ReorderBufferTXN *txn)
* need anymore.
*
* To resolve those problems we have a per-transaction hash of (cmin,
- * cmax) tuples keyed by (relfilenode, ctid) which contains the actual
+ * cmax) tuples keyed by (relfilelocator, ctid) which contains the actual
* (cmin, cmax) values. That also takes care of combo CIDs by simply
* not caring about them at all. As we have the real cmin/cmax values
* combo CIDs aren't interesting.
@@ -4870,9 +4870,9 @@ DisplayMapping(HTAB *tuplecid_data)
while ((ent = (ReorderBufferTupleCidEnt *) hash_seq_search(&hstat)) != NULL)
{
elog(DEBUG3, "mapping: node: %u/%u/%u tid: %u/%u cmin: %u, cmax: %u",
- ent->key.relnode.dbNode,
- ent->key.relnode.spcNode,
- ent->key.relnode.relNode,
+ ent->key.rlocator.dbOid,
+ ent->key.rlocator.spcOid,
+ ent->key.rlocator.relNumber,
ItemPointerGetBlockNumber(&ent->key.tid),
ItemPointerGetOffsetNumber(&ent->key.tid),
ent->cmin,
@@ -4932,7 +4932,7 @@ ApplyLogicalMappingFile(HTAB *tuplecid_data, Oid relid, const char *fname)
path, readBytes,
(int32) sizeof(LogicalRewriteMappingData))));
- key.relnode = map.old_node;
+ key.rlocator = map.old_locator;
ItemPointerCopy(&map.old_tid,
&key.tid);
@@ -4947,7 +4947,7 @@ ApplyLogicalMappingFile(HTAB *tuplecid_data, Oid relid, const char *fname)
if (!ent)
continue;
- key.relnode = map.new_node;
+ key.rlocator = map.new_locator;
ItemPointerCopy(&map.new_tid,
&key.tid);
@@ -5120,10 +5120,10 @@ ResolveCminCmaxDuringDecoding(HTAB *tuplecid_data,
Assert(!BufferIsLocal(buffer));
/*
- * get relfilenode from the buffer, no convenient way to access it other
+ * get relfilelocator from the buffer, no convenient way to access it other
* than that.
*/
- BufferGetTag(buffer, &key.relnode, &forkno, &blockno);
+ BufferGetTag(buffer, &key.rlocator, &forkno, &blockno);
/* tuples can only be in the main fork */
Assert(forkno == MAIN_FORKNUM);
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 1119a12..73c0f15 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -781,7 +781,7 @@ SnapBuildProcessNewCid(SnapBuild *builder, TransactionId xid,
ReorderBufferXidSetCatalogChanges(builder->reorder, xid, lsn);
ReorderBufferAddNewTupleCids(builder->reorder, xlrec->top_xid, lsn,
- xlrec->target_node, xlrec->target_tid,
+ xlrec->target_locator, xlrec->target_tid,
xlrec->cmin, xlrec->cmax,
xlrec->combocid);
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index ae13011..7071ff6 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -121,12 +121,12 @@ typedef struct CkptTsStatus
* Type for array used to sort SMgrRelations
*
* FlushRelationsAllBuffers shares the same comparator function with
- * DropRelFileNodesAllBuffers. Pointer to this struct and RelFileNode must be
+ * DropRelFileLocatorsAllBuffers. Pointer to this struct and RelFileLocator must be
* compatible.
*/
typedef struct SMgrSortArray
{
- RelFileNode rnode; /* This must be the first member */
+ RelFileLocator rlocator; /* This must be the first member */
SMgrRelation srel;
} SMgrSortArray;
@@ -483,7 +483,7 @@ static BufferDesc *BufferAlloc(SMgrRelation smgr,
BufferAccessStrategy strategy,
bool *foundPtr);
static void FlushBuffer(BufferDesc *buf, SMgrRelation reln);
-static void FindAndDropRelFileNodeBuffers(RelFileNode rnode,
+static void FindAndDropRelFileLocatorBuffers(RelFileLocator rlocator,
ForkNumber forkNum,
BlockNumber nForkBlock,
BlockNumber firstDelBlock);
@@ -492,7 +492,7 @@ static void RelationCopyStorageUsingBuffer(Relation src, Relation dst,
bool isunlogged);
static void AtProcExit_Buffers(int code, Datum arg);
static void CheckForBufferLeaks(void);
-static int rnode_comparator(const void *p1, const void *p2);
+static int rlocator_comparator(const void *p1, const void *p2);
static inline int buffertag_comparator(const BufferTag *a, const BufferTag *b);
static inline int ckpt_buforder_comparator(const CkptSortItem *a, const CkptSortItem *b);
static int ts_ckpt_progress_comparator(Datum a, Datum b, void *arg);
@@ -515,7 +515,7 @@ PrefetchSharedBuffer(SMgrRelation smgr_reln,
Assert(BlockNumberIsValid(blockNum));
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr_reln->smgr_rnode.node,
+ INIT_BUFFERTAG(newTag, smgr_reln->smgr_rlocator.locator,
forkNum, blockNum);
/* determine its hash code and partition lock ID */
@@ -620,7 +620,7 @@ PrefetchBuffer(Relation reln, ForkNumber forkNum, BlockNumber blockNum)
* tag. In that case, the buffer is pinned and the usage count is bumped.
*/
bool
-ReadRecentBuffer(RelFileNode rnode, ForkNumber forkNum, BlockNumber blockNum,
+ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockNum,
Buffer recent_buffer)
{
BufferDesc *bufHdr;
@@ -632,7 +632,7 @@ ReadRecentBuffer(RelFileNode rnode, ForkNumber forkNum, BlockNumber blockNum,
ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
ReservePrivateRefCountEntry();
- INIT_BUFFERTAG(tag, rnode, forkNum, blockNum);
+ INIT_BUFFERTAG(tag, rlocator, forkNum, blockNum);
if (BufferIsLocal(recent_buffer))
{
@@ -786,13 +786,13 @@ ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum,
* BackendId).
*/
Buffer
-ReadBufferWithoutRelcache(RelFileNode rnode, ForkNumber forkNum,
+ReadBufferWithoutRelcache(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber blockNum, ReadBufferMode mode,
BufferAccessStrategy strategy, bool permanent)
{
bool hit;
- SMgrRelation smgr = smgropen(rnode, InvalidBackendId);
+ SMgrRelation smgr = smgropen(rlocator, InvalidBackendId);
return ReadBuffer_common(smgr, permanent ? RELPERSISTENCE_PERMANENT :
RELPERSISTENCE_UNLOGGED, forkNum, blockNum,
@@ -824,10 +824,10 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
isExtend = (blockNum == P_NEW);
TRACE_POSTGRESQL_BUFFER_READ_START(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode,
- smgr->smgr_rnode.backend,
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber,
+ smgr->smgr_rlocator.backend,
isExtend);
/* Substitute proper block number if caller asked for P_NEW */
@@ -839,7 +839,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
ereport(ERROR,
(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
errmsg("cannot extend relation %s beyond %u blocks",
- relpath(smgr->smgr_rnode, forkNum),
+ relpath(smgr->smgr_rlocator, forkNum),
P_NEW)));
}
@@ -886,10 +886,10 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
VacuumCostBalance += VacuumCostPageHit;
TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode,
- smgr->smgr_rnode.backend,
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber,
+ smgr->smgr_rlocator.backend,
isExtend,
found);
@@ -926,7 +926,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
if (!PageIsNew((Page) bufBlock))
ereport(ERROR,
(errmsg("unexpected data beyond EOF in block %u of relation %s",
- blockNum, relpath(smgr->smgr_rnode, forkNum)),
+ blockNum, relpath(smgr->smgr_rlocator, forkNum)),
errhint("This has been seen to occur with buggy kernels; consider updating your system.")));
/*
@@ -1028,7 +1028,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
(errcode(ERRCODE_DATA_CORRUPTED),
errmsg("invalid page in block %u of relation %s; zeroing out page",
blockNum,
- relpath(smgr->smgr_rnode, forkNum))));
+ relpath(smgr->smgr_rlocator, forkNum))));
MemSet((char *) bufBlock, 0, BLCKSZ);
}
else
@@ -1036,7 +1036,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
(errcode(ERRCODE_DATA_CORRUPTED),
errmsg("invalid page in block %u of relation %s",
blockNum,
- relpath(smgr->smgr_rnode, forkNum))));
+ relpath(smgr->smgr_rlocator, forkNum))));
}
}
}
@@ -1076,10 +1076,10 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
VacuumCostBalance += VacuumCostPageMiss;
TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode,
- smgr->smgr_rnode.backend,
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber,
+ smgr->smgr_rlocator.backend,
isExtend,
found);
@@ -1124,7 +1124,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
/* determine its hash code and partition lock ID */
newHash = BufTableHashCode(&newTag);
@@ -1255,9 +1255,9 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
/* OK, do the I/O */
TRACE_POSTGRESQL_BUFFER_WRITE_DIRTY_START(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode);
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber);
FlushBuffer(buf, NULL);
LWLockRelease(BufferDescriptorGetContentLock(buf));
@@ -1266,9 +1266,9 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
&buf->tag);
TRACE_POSTGRESQL_BUFFER_WRITE_DIRTY_DONE(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode);
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber);
}
else
{
@@ -1647,7 +1647,7 @@ ReleaseAndReadBuffer(Buffer buffer,
{
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
bufHdr->tag.forkNum == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
@@ -1658,7 +1658,7 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetBufferDescriptor(buffer - 1);
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
bufHdr->tag.forkNum == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
@@ -2000,8 +2000,8 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
- item->tsId = bufHdr->tag.rnode.spcNode;
- item->relNode = bufHdr->tag.rnode.relNode;
+ item->tsId = bufHdr->tag.rlocator.spcOid;
+ item->relNumber = bufHdr->tag.rlocator.relNumber;
item->forkNum = bufHdr->tag.forkNum;
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2708,7 +2708,7 @@ PrintBufferLeakWarning(Buffer buffer)
}
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rnode, backend, buf->tag.forkNum);
+ path = relpathbackend(buf->tag.rlocator, backend, buf->tag.forkNum);
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2769,11 +2769,11 @@ BufferGetBlockNumber(Buffer buffer)
/*
* BufferGetTag
- * Returns the relfilenode, fork number and block number associated with
+ * Returns the relfilelocator, fork number and block number associated with
* a buffer.
*/
void
-BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
+BufferGetTag(Buffer buffer, RelFileLocator *rlocator, ForkNumber *forknum,
BlockNumber *blknum)
{
BufferDesc *bufHdr;
@@ -2787,7 +2787,7 @@ BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
bufHdr = GetBufferDescriptor(buffer - 1);
/* pinned, so OK to read tag without spinlock */
- *rnode = bufHdr->tag.rnode;
+ *rlocator = bufHdr->tag.rlocator;
*forknum = bufHdr->tag.forkNum;
*blknum = bufHdr->tag.blockNum;
}
@@ -2838,13 +2838,13 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rnode, InvalidBackendId);
+ reln = smgropen(buf->tag.rlocator, InvalidBackendId);
TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
buf->tag.blockNum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode);
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber);
buf_state = LockBufHdr(buf);
@@ -2922,9 +2922,9 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(buf->tag.forkNum,
buf->tag.blockNum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode);
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber);
/* Pop the error context stack */
error_context_stack = errcallback.previous;
@@ -3026,7 +3026,7 @@ BufferGetLSNAtomic(Buffer buffer)
}
/* ---------------------------------------------------------------------
- * DropRelFileNodeBuffers
+ * DropRelFileLocatorBuffers
*
* This function removes from the buffer pool all the pages of the
* specified relation forks that have block numbers >= firstDelBlock.
@@ -3047,24 +3047,24 @@ BufferGetLSNAtomic(Buffer buffer)
* --------------------------------------------------------------------
*/
void
-DropRelFileNodeBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
+DropRelFileLocatorBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
int nforks, BlockNumber *firstDelBlock)
{
int i;
int j;
- RelFileNodeBackend rnode;
+ RelFileLocatorBackend rlocator;
BlockNumber nForkBlock[MAX_FORKNUM];
uint64 nBlocksToInvalidate = 0;
- rnode = smgr_reln->smgr_rnode;
+ rlocator = smgr_reln->smgr_rlocator;
/* If it's a local relation, it's localbuf.c's problem. */
- if (RelFileNodeBackendIsTemp(rnode))
+ if (RelFileLocatorBackendIsTemp(rlocator))
{
- if (rnode.backend == MyBackendId)
+ if (rlocator.backend == MyBackendId)
{
for (j = 0; j < nforks; j++)
- DropRelFileNodeLocalBuffers(rnode.node, forkNum[j],
+ DropRelFileLocatorLocalBuffers(rlocator.locator, forkNum[j],
firstDelBlock[j]);
}
return;
@@ -3115,7 +3115,7 @@ DropRelFileNodeBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
nBlocksToInvalidate < BUF_DROP_FULL_SCAN_THRESHOLD)
{
for (j = 0; j < nforks; j++)
- FindAndDropRelFileNodeBuffers(rnode.node, forkNum[j],
+ FindAndDropRelFileLocatorBuffers(rlocator.locator, forkNum[j],
nForkBlock[j], firstDelBlock[j]);
return;
}
@@ -3138,17 +3138,17 @@ DropRelFileNodeBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
* false positives are safe because we'll recheck after getting the
* buffer lock.
*
- * We could check forkNum and blockNum as well as the rnode, but the
+ * We could check forkNum and blockNum as well as the rlocator, but the
* incremental win from doing so seems small.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rnode.node))
+ if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator))
continue;
buf_state = LockBufHdr(bufHdr);
for (j = 0; j < nforks; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator) &&
bufHdr->tag.forkNum == forkNum[j] &&
bufHdr->tag.blockNum >= firstDelBlock[j])
{
@@ -3162,16 +3162,16 @@ DropRelFileNodeBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
}
/* ---------------------------------------------------------------------
- * DropRelFileNodesAllBuffers
+ * DropRelFileLocatorsAllBuffers
*
* This function removes from the buffer pool all the pages of all
* forks of the specified relations. It's equivalent to calling
- * DropRelFileNodeBuffers once per fork per relation with
+ * DropRelFileLocatorBuffers once per fork per relation with
* firstDelBlock = 0.
* --------------------------------------------------------------------
*/
void
-DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
+DropRelFileLocatorsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
{
int i;
int j;
@@ -3179,22 +3179,22 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
SMgrRelation *rels;
BlockNumber (*block)[MAX_FORKNUM + 1];
uint64 nBlocksToInvalidate = 0;
- RelFileNode *nodes;
+ RelFileLocator *locators;
bool cached = true;
bool use_bsearch;
- if (nnodes == 0)
+ if (nlocators == 0)
return;
- rels = palloc(sizeof(SMgrRelation) * nnodes); /* non-local relations */
+ rels = palloc(sizeof(SMgrRelation) * nlocators); /* non-local relations */
/* If it's a local relation, it's localbuf.c's problem. */
- for (i = 0; i < nnodes; i++)
+ for (i = 0; i < nlocators; i++)
{
- if (RelFileNodeBackendIsTemp(smgr_reln[i]->smgr_rnode))
+ if (RelFileLocatorBackendIsTemp(smgr_reln[i]->smgr_rlocator))
{
- if (smgr_reln[i]->smgr_rnode.backend == MyBackendId)
- DropRelFileNodeAllLocalBuffers(smgr_reln[i]->smgr_rnode.node);
+ if (smgr_reln[i]->smgr_rlocator.backend == MyBackendId)
+ DropRelFileLocatorAllLocalBuffers(smgr_reln[i]->smgr_rlocator.locator);
}
else
rels[n++] = smgr_reln[i];
@@ -3219,7 +3219,7 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
/*
* We can avoid scanning the entire buffer pool if we know the exact size
- * of each of the given relation forks. See DropRelFileNodeBuffers.
+ * of each of the given relation forks. See DropRelFileLocatorBuffers.
*/
for (i = 0; i < n && cached; i++)
{
@@ -3257,7 +3257,7 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
continue;
/* drop all the buffers for a particular relation fork */
- FindAndDropRelFileNodeBuffers(rels[i]->smgr_rnode.node,
+ FindAndDropRelFileLocatorBuffers(rels[i]->smgr_rlocator.locator,
j, block[i][j], 0);
}
}
@@ -3268,9 +3268,9 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
}
pfree(block);
- nodes = palloc(sizeof(RelFileNode) * n); /* non-local relations */
+ locators = palloc(sizeof(RelFileLocator) * n); /* non-local relations */
for (i = 0; i < n; i++)
- nodes[i] = rels[i]->smgr_rnode.node;
+ locators[i] = rels[i]->smgr_rlocator.locator;
/*
* For low number of relations to drop just use a simple walk through, to
@@ -3280,18 +3280,18 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
*/
use_bsearch = n > RELS_BSEARCH_THRESHOLD;
- /* sort the list of rnodes if necessary */
+ /* sort the list of rlocators if necessary */
if (use_bsearch)
- pg_qsort(nodes, n, sizeof(RelFileNode), rnode_comparator);
+ pg_qsort(locators, n, sizeof(RelFileLocator), rlocator_comparator);
for (i = 0; i < NBuffers; i++)
{
- RelFileNode *rnode = NULL;
+ RelFileLocator *rlocator = NULL;
BufferDesc *bufHdr = GetBufferDescriptor(i);
uint32 buf_state;
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
@@ -3301,37 +3301,37 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
for (j = 0; j < n; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, nodes[j]))
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, locators[j]))
{
- rnode = &nodes[j];
+ rlocator = &locators[j];
break;
}
}
}
else
{
- rnode = bsearch((const void *) &(bufHdr->tag.rnode),
- nodes, n, sizeof(RelFileNode),
- rnode_comparator);
+ rlocator = bsearch((const void *) &(bufHdr->tag.rlocator),
+ locators, n, sizeof(RelFileLocator),
+ rlocator_comparator);
}
- /* buffer doesn't belong to any of the given relfilenodes; skip it */
- if (rnode == NULL)
+ /* buffer doesn't belong to any of the given relfilelocators; skip it */
+ if (rlocator == NULL)
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, (*rnode)))
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, (*rlocator)))
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
}
- pfree(nodes);
+ pfree(locators);
pfree(rels);
}
/* ---------------------------------------------------------------------
- * FindAndDropRelFileNodeBuffers
+ * FindAndDropRelFileLocatorBuffers
*
* This function performs look up in BufMapping table and removes from the
* buffer pool all the pages of the specified relation fork that has block
@@ -3340,9 +3340,9 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
* --------------------------------------------------------------------
*/
static void
-FindAndDropRelFileNodeBuffers(RelFileNode rnode, ForkNumber forkNum,
- BlockNumber nForkBlock,
- BlockNumber firstDelBlock)
+FindAndDropRelFileLocatorBuffers(RelFileLocator rlocator, ForkNumber forkNum,
+ BlockNumber nForkBlock,
+ BlockNumber firstDelBlock)
{
BlockNumber curBlock;
@@ -3356,7 +3356,7 @@ FindAndDropRelFileNodeBuffers(RelFileNode rnode, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(bufTag, rnode, forkNum, curBlock);
+ INIT_BUFFERTAG(bufTag, rlocator, forkNum, curBlock);
/* determine its hash code and partition lock ID */
bufHash = BufTableHashCode(&bufTag);
@@ -3380,7 +3380,7 @@ FindAndDropRelFileNodeBuffers(RelFileNode rnode, ForkNumber forkNum,
*/
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -3397,7 +3397,7 @@ FindAndDropRelFileNodeBuffers(RelFileNode rnode, ForkNumber forkNum,
* bothering to write them out first. This is used when we destroy a
* database, to avoid trying to flush data to disk when the directory
* tree no longer exists. Implementation is pretty similar to
- * DropRelFileNodeBuffers() which is for destroying just one relation.
+ * DropRelFileLocatorBuffers() which is for destroying just one relation.
* --------------------------------------------------------------------
*/
void
@@ -3416,14 +3416,14 @@ DropDatabaseBuffers(Oid dbid)
uint32 buf_state;
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.rlocator.dbOid != dbid)
continue;
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid)
+ if (bufHdr->tag.rlocator.dbOid == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3453,7 +3453,7 @@ PrintBufferDescs(void)
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathbackend(buf->tag.rnode, InvalidBackendId, buf->tag.forkNum),
+ relpathbackend(buf->tag.rlocator, InvalidBackendId, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3478,7 +3478,7 @@ PrintPinnedBufs(void)
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathperm(buf->tag.rnode, buf->tag.forkNum),
+ relpathperm(buf->tag.rlocator, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3517,7 +3517,7 @@ FlushRelationBuffers(Relation rel)
uint32 buf_state;
bufHdr = GetLocalBufferDescriptor(i);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
((buf_state = pg_atomic_read_u32(&bufHdr->state)) &
(BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
@@ -3561,16 +3561,16 @@ FlushRelationBuffers(Relation rel)
bufHdr = GetBufferDescriptor(i);
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node))
+ if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator))
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3608,21 +3608,21 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
for (i = 0; i < nrels; i++)
{
- Assert(!RelFileNodeBackendIsTemp(smgrs[i]->smgr_rnode));
+ Assert(!RelFileLocatorBackendIsTemp(smgrs[i]->smgr_rlocator));
- srels[i].rnode = smgrs[i]->smgr_rnode.node;
+ srels[i].rlocator = smgrs[i]->smgr_rlocator.locator;
srels[i].srel = smgrs[i];
}
/*
* Save the bsearch overhead for low number of relations to sync. See
- * DropRelFileNodesAllBuffers for details.
+ * DropRelFileLocatorsAllBuffers for details.
*/
use_bsearch = nrels > RELS_BSEARCH_THRESHOLD;
/* sort the list of SMgrRelations if necessary */
if (use_bsearch)
- pg_qsort(srels, nrels, sizeof(SMgrSortArray), rnode_comparator);
+ pg_qsort(srels, nrels, sizeof(SMgrSortArray), rlocator_comparator);
/* Make sure we can handle the pin inside the loop */
ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
@@ -3634,7 +3634,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
uint32 buf_state;
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
@@ -3644,7 +3644,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
for (j = 0; j < nrels; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, srels[j].rnode))
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, srels[j].rlocator))
{
srelent = &srels[j];
break;
@@ -3653,19 +3653,19 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
}
else
{
- srelent = bsearch((const void *) &(bufHdr->tag.rnode),
+ srelent = bsearch((const void *) &(bufHdr->tag.rlocator),
srels, nrels, sizeof(SMgrSortArray),
- rnode_comparator);
+ rlocator_comparator);
}
- /* buffer doesn't belong to any of the given relfilenodes; skip it */
+ /* buffer doesn't belong to any of the given relfilelocators; skip it */
if (srelent == NULL)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, srelent->rnode) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, srelent->rlocator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3729,7 +3729,7 @@ RelationCopyStorageUsingBuffer(Relation src, Relation dst, ForkNumber forkNum,
CHECK_FOR_INTERRUPTS();
/* Read block from source relation. */
- srcBuf = ReadBufferWithoutRelcache(src->rd_node, forkNum, blkno,
+ srcBuf = ReadBufferWithoutRelcache(src->rd_locator, forkNum, blkno,
RBM_NORMAL, bstrategy_src,
permanent);
srcPage = BufferGetPage(srcBuf);
@@ -3740,7 +3740,7 @@ RelationCopyStorageUsingBuffer(Relation src, Relation dst, ForkNumber forkNum,
}
/* Use P_NEW to extend the destination relation. */
- dstBuf = ReadBufferWithoutRelcache(dst->rd_node, forkNum, P_NEW,
+ dstBuf = ReadBufferWithoutRelcache(dst->rd_locator, forkNum, P_NEW,
RBM_NORMAL, bstrategy_dst,
permanent);
LockBuffer(dstBuf, BUFFER_LOCK_EXCLUSIVE);
@@ -3775,8 +3775,8 @@ RelationCopyStorageUsingBuffer(Relation src, Relation dst, ForkNumber forkNum,
* --------------------------------------------------------------------
*/
void
-CreateAndCopyRelationData(RelFileNode src_rnode, RelFileNode dst_rnode,
- bool permanent)
+CreateAndCopyRelationData(RelFileLocator src_rlocator,
+ RelFileLocator dst_rlocator, bool permanent)
{
Relation src_rel;
Relation dst_rel;
@@ -3793,8 +3793,8 @@ CreateAndCopyRelationData(RelFileNode src_rnode, RelFileNode dst_rnode,
* used the smgr layer directly, we would have to worry about
* invalidations.
*/
- src_rel = CreateFakeRelcacheEntry(src_rnode);
- dst_rel = CreateFakeRelcacheEntry(dst_rnode);
+ src_rel = CreateFakeRelcacheEntry(src_rlocator);
+ dst_rel = CreateFakeRelcacheEntry(dst_rlocator);
/*
* Create and copy all forks of the relation. During create database we
@@ -3802,7 +3802,7 @@ CreateAndCopyRelationData(RelFileNode src_rnode, RelFileNode dst_rnode,
* directory. Therefore, each individual relation doesn't need to be
* registered for cleanup.
*/
- RelationCreateStorage(dst_rnode, relpersistence, false);
+ RelationCreateStorage(dst_rlocator, relpersistence, false);
/* copy main fork. */
RelationCopyStorageUsingBuffer(src_rel, dst_rel, MAIN_FORKNUM, permanent);
@@ -3820,7 +3820,7 @@ CreateAndCopyRelationData(RelFileNode src_rnode, RelFileNode dst_rnode,
* init fork of an unlogged relation.
*/
if (permanent || forkNum == INIT_FORKNUM)
- log_smgrcreate(&dst_rnode, forkNum);
+ log_smgrcreate(&dst_rlocator, forkNum);
/* Copy a fork's data, block by block. */
RelationCopyStorageUsingBuffer(src_rel, dst_rel, forkNum,
@@ -3864,16 +3864,16 @@ FlushDatabaseBuffers(Oid dbid)
bufHdr = GetBufferDescriptor(i);
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.rlocator.dbOid != dbid)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid &&
+ if (bufHdr->tag.rlocator.dbOid == dbid &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -4034,7 +4034,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
(pg_atomic_read_u32(&bufHdr->state) & BM_PERMANENT))
{
/*
- * If we must not write WAL, due to a relfilenode-specific
+ * If we must not write WAL, due to a relfilelocator-specific
* condition or being in recovery, don't dirty the page. We can
* set the hint, just not dirty the page as a result so the hint
* is lost when we evict the page or shutdown.
@@ -4042,7 +4042,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
* See src/backend/storage/page/README for longer discussion.
*/
if (RecoveryInProgress() ||
- RelFileNodeSkippingWAL(bufHdr->tag.rnode))
+ RelFileLocatorSkippingWAL(bufHdr->tag.rlocator))
return;
/*
@@ -4651,7 +4651,7 @@ AbortBufferIO(void)
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
- path = relpathperm(buf->tag.rnode, buf->tag.forkNum);
+ path = relpathperm(buf->tag.rlocator, buf->tag.forkNum);
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4675,7 +4675,7 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rnode, bufHdr->tag.forkNum);
+ char *path = relpathperm(bufHdr->tag.rlocator, bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4693,7 +4693,7 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
- char *path = relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ char *path = relpathbackend(bufHdr->tag.rlocator, MyBackendId,
bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
@@ -4703,27 +4703,27 @@ local_buffer_write_error_callback(void *arg)
}
/*
- * RelFileNode qsort/bsearch comparator; see RelFileNodeEquals.
+ * RelFileLocator qsort/bsearch comparator; see RelFileLocatorEquals.
*/
static int
-rnode_comparator(const void *p1, const void *p2)
+rlocator_comparator(const void *p1, const void *p2)
{
- RelFileNode n1 = *(const RelFileNode *) p1;
- RelFileNode n2 = *(const RelFileNode *) p2;
+ RelFileLocator n1 = *(const RelFileLocator *) p1;
+ RelFileLocator n2 = *(const RelFileLocator *) p2;
- if (n1.relNode < n2.relNode)
+ if (n1.relNumber < n2.relNumber)
return -1;
- else if (n1.relNode > n2.relNode)
+ else if (n1.relNumber > n2.relNumber)
return 1;
- if (n1.dbNode < n2.dbNode)
+ if (n1.dbOid < n2.dbOid)
return -1;
- else if (n1.dbNode > n2.dbNode)
+ else if (n1.dbOid > n2.dbOid)
return 1;
- if (n1.spcNode < n2.spcNode)
+ if (n1.spcOid < n2.spcOid)
return -1;
- else if (n1.spcNode > n2.spcNode)
+ else if (n1.spcOid > n2.spcOid)
return 1;
else
return 0;
@@ -4789,7 +4789,7 @@ buffertag_comparator(const BufferTag *ba, const BufferTag *bb)
{
int ret;
- ret = rnode_comparator(&ba->rnode, &bb->rnode);
+ ret = rlocator_comparator(&ba->rlocator, &bb->rlocator);
if (ret != 0)
return ret;
@@ -4822,9 +4822,9 @@ ckpt_buforder_comparator(const CkptSortItem *a, const CkptSortItem *b)
else if (a->tsId > b->tsId)
return 1;
/* compare relation */
- if (a->relNode < b->relNode)
+ if (a->relNumber < b->relNumber)
return -1;
- else if (a->relNode > b->relNode)
+ else if (a->relNumber > b->relNumber)
return 1;
/* compare fork */
else if (a->forkNum < b->forkNum)
@@ -4960,7 +4960,7 @@ IssuePendingWritebacks(WritebackContext *context)
next = &context->pending_writebacks[i + ahead + 1];
/* different file, stop */
- if (!RelFileNodeEquals(cur->tag.rnode, next->tag.rnode) ||
+ if (!RelFileLocatorEquals(cur->tag.rlocator, next->tag.rlocator) ||
cur->tag.forkNum != next->tag.forkNum)
break;
@@ -4979,7 +4979,7 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rnode, InvalidBackendId);
+ reln = smgropen(tag.rlocator, InvalidBackendId);
smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
}
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index e71f95a..3dc9cc7 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -68,7 +68,7 @@ PrefetchLocalBuffer(SMgrRelation smgr, ForkNumber forkNum,
BufferTag newTag; /* identity of requested block */
LocalBufferLookupEnt *hresult;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -117,7 +117,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
bool found;
uint32 buf_state;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -134,7 +134,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
Assert(BUFFERTAGS_EQUAL(bufHdr->tag, newTag));
#ifdef LBDEBUG
fprintf(stderr, "LB ALLOC (%u,%d,%d) %d\n",
- smgr->smgr_rnode.node.relNode, forkNum, blockNum, -b - 1);
+ smgr->smgr_rlocator.locator.relNumber, forkNum, blockNum, -b - 1);
#endif
buf_state = pg_atomic_read_u32(&bufHdr->state);
@@ -162,7 +162,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
#ifdef LBDEBUG
fprintf(stderr, "LB ALLOC (%u,%d,%d) %d\n",
- smgr->smgr_rnode.node.relNode, forkNum, blockNum,
+ smgr->smgr_rlocator.locator.relNumber, forkNum, blockNum,
-nextFreeLocalBuf - 1);
#endif
@@ -215,7 +215,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rnode, MyBackendId);
+ oreln = smgropen(bufHdr->tag.rlocator, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
@@ -312,7 +312,7 @@ MarkLocalBufferDirty(Buffer buffer)
}
/*
- * DropRelFileNodeLocalBuffers
+ * DropRelFileLocatorLocalBuffers
* This function removes from the buffer pool all the pages of the
* specified relation that have block numbers >= firstDelBlock.
* (In particular, with firstDelBlock = 0, all pages are removed.)
@@ -320,11 +320,11 @@ MarkLocalBufferDirty(Buffer buffer)
* out first. Therefore, this is NOT rollback-able, and so should be
* used only with extreme caution!
*
- * See DropRelFileNodeBuffers in bufmgr.c for more notes.
+ * See DropRelFileLocatorBuffers in bufmgr.c for more notes.
*/
void
-DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
- BlockNumber firstDelBlock)
+DropRelFileLocatorLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
+ BlockNumber firstDelBlock)
{
int i;
@@ -337,14 +337,14 @@ DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
+ RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ relpathbackend(bufHdr->tag.rlocator, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
@@ -363,14 +363,14 @@ DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
}
/*
- * DropRelFileNodeAllLocalBuffers
+ * DropRelFileLocatorAllLocalBuffers
* This function removes from the buffer pool all pages of all forks
* of the specified relation.
*
- * See DropRelFileNodesAllBuffers in bufmgr.c for more notes.
+ * See DropRelFileLocatorsAllBuffers in bufmgr.c for more notes.
*/
void
-DropRelFileNodeAllLocalBuffers(RelFileNode rnode)
+DropRelFileLocatorAllLocalBuffers(RelFileLocator rlocator)
{
int i;
@@ -383,12 +383,12 @@ DropRelFileNodeAllLocalBuffers(RelFileNode rnode)
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode))
+ RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator))
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ relpathbackend(bufHdr->tag.rlocator, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
@@ -589,7 +589,7 @@ AtProcExit_LocalBuffers(void)
{
/*
* We shouldn't be holding any remaining pins; if we are, and assertions
- * aren't enabled, we'll fail later in DropRelFileNodeBuffers while trying
+ * aren't enabled, we'll fail later in DropRelFileLocatorBuffers while trying
* to drop the temp rels.
*/
CheckForLocalBufferLeaks();
diff --git a/src/backend/storage/freespace/freespace.c b/src/backend/storage/freespace/freespace.c
index d41ae37..005def5 100644
--- a/src/backend/storage/freespace/freespace.c
+++ b/src/backend/storage/freespace/freespace.c
@@ -196,7 +196,7 @@ RecordPageWithFreeSpace(Relation rel, BlockNumber heapBlk, Size spaceAvail)
* WAL replay
*/
void
-XLogRecordPageWithFreeSpace(RelFileNode rnode, BlockNumber heapBlk,
+XLogRecordPageWithFreeSpace(RelFileLocator rlocator, BlockNumber heapBlk,
Size spaceAvail)
{
int new_cat = fsm_space_avail_to_cat(spaceAvail);
@@ -211,8 +211,8 @@ XLogRecordPageWithFreeSpace(RelFileNode rnode, BlockNumber heapBlk,
blkno = fsm_logical_to_physical(addr);
/* If the page doesn't exist already, extend */
- buf = XLogReadBufferExtended(rnode, FSM_FORKNUM, blkno, RBM_ZERO_ON_ERROR,
- InvalidBuffer);
+ buf = XLogReadBufferExtended(rlocator, FSM_FORKNUM, blkno,
+ RBM_ZERO_ON_ERROR, InvalidBuffer);
LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
page = BufferGetPage(buf);
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index d165b35..af4dab7 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -268,13 +268,13 @@ restart:
*
* Fix the corruption and restart.
*/
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blknum;
- BufferGetTag(buf, &rnode, &forknum, &blknum);
+ BufferGetTag(buf, &rlocator, &forknum, &blknum);
elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
- blknum, rnode.spcNode, rnode.dbNode, rnode.relNode);
+ blknum, rlocator.spcOid, rlocator.dbOid, rlocator.relNumber);
/* make sure we hold an exclusive lock */
if (!exclusive_lock_held)
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 671b00a..9dab931 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -442,7 +442,7 @@ ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
}
void
-ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode node)
+ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileLocator locator)
{
VirtualTransactionId *backends;
@@ -461,7 +461,7 @@ ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode
return;
backends = GetConflictingVirtualXIDs(latestRemovedXid,
- node.dbNode);
+ locator.dbOid);
ResolveRecoveryConflictWithVirtualXIDs(backends,
PROCSIG_RECOVERY_CONFLICT_SNAPSHOT,
@@ -475,7 +475,7 @@ ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode
*/
void
ResolveRecoveryConflictWithSnapshotFullXid(FullTransactionId latestRemovedFullXid,
- RelFileNode node)
+ RelFileLocator locator)
{
/*
* ResolveRecoveryConflictWithSnapshot operates on 32-bit TransactionIds,
@@ -493,7 +493,7 @@ ResolveRecoveryConflictWithSnapshotFullXid(FullTransactionId latestRemovedFullXi
TransactionId latestRemovedXid;
latestRemovedXid = XidFromFullTransactionId(latestRemovedFullXid);
- ResolveRecoveryConflictWithSnapshot(latestRemovedXid, node);
+ ResolveRecoveryConflictWithSnapshot(latestRemovedXid, locator);
}
}
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 25e7e4e..5136da6 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -1997,7 +1997,7 @@ PageIsPredicateLocked(Relation relation, BlockNumber blkno)
PREDICATELOCKTARGET *target;
SET_PREDICATELOCKTARGETTAG_PAGE(targettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
blkno);
@@ -2576,7 +2576,7 @@ PredicateLockRelation(Relation relation, Snapshot snapshot)
return;
SET_PREDICATELOCKTARGETTAG_RELATION(tag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id);
PredicateLockAcquire(&tag);
}
@@ -2599,7 +2599,7 @@ PredicateLockPage(Relation relation, BlockNumber blkno, Snapshot snapshot)
return;
SET_PREDICATELOCKTARGETTAG_PAGE(tag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
blkno);
PredicateLockAcquire(&tag);
@@ -2638,13 +2638,13 @@ PredicateLockTID(Relation relation, ItemPointer tid, Snapshot snapshot,
* level lock.
*/
SET_PREDICATELOCKTARGETTAG_RELATION(tag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id);
if (PredicateLockExists(&tag))
return;
SET_PREDICATELOCKTARGETTAG_TUPLE(tag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
ItemPointerGetBlockNumber(tid),
ItemPointerGetOffsetNumber(tid));
@@ -2974,7 +2974,7 @@ DropAllPredicateLocksFromTable(Relation relation, bool transfer)
if (!PredicateLockingNeededForRelation(relation))
return;
- dbId = relation->rd_node.dbNode;
+ dbId = relation->rd_locator.dbOid;
relId = relation->rd_id;
if (relation->rd_index == NULL)
{
@@ -3194,11 +3194,11 @@ PredicateLockPageSplit(Relation relation, BlockNumber oldblkno,
Assert(BlockNumberIsValid(newblkno));
SET_PREDICATELOCKTARGETTAG_PAGE(oldtargettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
oldblkno);
SET_PREDICATELOCKTARGETTAG_PAGE(newtargettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
newblkno);
@@ -4478,7 +4478,7 @@ CheckForSerializableConflictIn(Relation relation, ItemPointer tid, BlockNumber b
if (tid != NULL)
{
SET_PREDICATELOCKTARGETTAG_TUPLE(targettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
ItemPointerGetBlockNumber(tid),
ItemPointerGetOffsetNumber(tid));
@@ -4488,14 +4488,14 @@ CheckForSerializableConflictIn(Relation relation, ItemPointer tid, BlockNumber b
if (blkno != InvalidBlockNumber)
{
SET_PREDICATELOCKTARGETTAG_PAGE(targettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
blkno);
CheckTargetForConflictsIn(&targettag);
}
SET_PREDICATELOCKTARGETTAG_RELATION(targettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id);
CheckTargetForConflictsIn(&targettag);
}
@@ -4556,7 +4556,7 @@ CheckTableForSerializableConflictIn(Relation relation)
Assert(relation->rd_index == NULL); /* not an index relation */
- dbId = relation->rd_node.dbNode;
+ dbId = relation->rd_locator.dbOid;
heapId = relation->rd_id;
LWLockAcquire(SerializablePredicateListLock, LW_EXCLUSIVE);
diff --git a/src/backend/storage/smgr/README b/src/backend/storage/smgr/README
index e1cfc6c..1dfc16f 100644
--- a/src/backend/storage/smgr/README
+++ b/src/backend/storage/smgr/README
@@ -46,7 +46,7 @@ physical relation in system catalogs.
It is assumed that the main fork, fork number 0 or MAIN_FORKNUM, always
exists. Fork numbers are assigned in src/include/common/relpath.h.
Functions in smgr.c and md.c take an extra fork number argument, in addition
-to relfilenode and block number, to identify which relation fork you want to
+to relfilenumber and block number, to identify which relation fork you want to
access. Since most code wants to access the main fork, a shortcut version of
ReadBuffer that accesses MAIN_FORKNUM is provided in the buffer manager for
convenience.
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 43edaf5..3998296 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -35,7 +35,7 @@
#include "storage/bufmgr.h"
#include "storage/fd.h"
#include "storage/md.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "storage/sync.h"
#include "utils/hsearch.h"
@@ -89,11 +89,11 @@ static MemoryContext MdCxt; /* context for all MdfdVec objects */
/* Populate a file tag describing an md.c segment file. */
-#define INIT_MD_FILETAG(a,xx_rnode,xx_forknum,xx_segno) \
+#define INIT_MD_FILETAG(a,xx_rlocator,xx_forknum,xx_segno) \
( \
memset(&(a), 0, sizeof(FileTag)), \
(a).handler = SYNC_HANDLER_MD, \
- (a).rnode = (xx_rnode), \
+ (a).rlocator = (xx_rlocator), \
(a).forknum = (xx_forknum), \
(a).segno = (xx_segno) \
)
@@ -121,14 +121,14 @@ static MemoryContext MdCxt; /* context for all MdfdVec objects */
/* local routines */
-static void mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum,
+static void mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forkNum,
bool isRedo);
static MdfdVec *mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior);
static void register_dirty_segment(SMgrRelation reln, ForkNumber forknum,
MdfdVec *seg);
-static void register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
+static void register_unlink_segment(RelFileLocatorBackend rlocator, ForkNumber forknum,
BlockNumber segno);
-static void register_forget_request(RelFileNodeBackend rnode, ForkNumber forknum,
+static void register_forget_request(RelFileLocatorBackend rlocator, ForkNumber forknum,
BlockNumber segno);
static void _fdvec_resize(SMgrRelation reln,
ForkNumber forknum,
@@ -199,11 +199,11 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* should be here and not in commands/tablespace.c? But that would imply
* importing a lot of stuff that smgr.c oughtn't know, either.
*/
- TablespaceCreateDbspace(reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
+ TablespaceCreateDbspace(reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
isRedo);
- path = relpath(reln->smgr_rnode, forkNum);
+ path = relpath(reln->smgr_rlocator, forkNum);
fd = PathNameOpenFile(path, O_RDWR | O_CREAT | O_EXCL | PG_BINARY);
@@ -234,7 +234,7 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
/*
* mdunlink() -- Unlink a relation.
*
- * Note that we're passed a RelFileNodeBackend --- by the time this is called,
+ * Note that we're passed a RelFileLocatorBackend --- by the time this is called,
* there won't be an SMgrRelation hashtable entry anymore.
*
* forkNum can be a fork number to delete a specific fork, or InvalidForkNumber
@@ -243,10 +243,10 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* For regular relations, we don't unlink the first segment file of the rel,
* but just truncate it to zero length, and record a request to unlink it after
* the next checkpoint. Additional segments can be unlinked immediately,
- * however. Leaving the empty file in place prevents that relfilenode
- * number from being reused. The scenario this protects us from is:
+ * however. Leaving the empty file in place prevents that relfilenumber
+ * from being reused. The scenario this protects us from is:
* 1. We delete a relation (and commit, and actually remove its file).
- * 2. We create a new relation, which by chance gets the same relfilenode as
+ * 2. We create a new relation, which by chance gets the same relfilenumber as
* the just-deleted one (OIDs must've wrapped around for that to happen).
* 3. We crash before another checkpoint occurs.
* During replay, we would delete the file and then recreate it, which is fine
@@ -254,18 +254,18 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* But if we didn't WAL-log insertions, but instead relied on fsyncing the
* file after populating it (as we do at wal_level=minimal), the contents of
* the file would be lost forever. By leaving the empty file until after the
- * next checkpoint, we prevent reassignment of the relfilenode number until
- * it's safe, because relfilenode assignment skips over any existing file.
+ * next checkpoint, we prevent reassignment of the relfilenumber until it's
+ * safe, because relfilenumber assignment skips over any existing file.
*
* We do not need to go through this dance for temp relations, though, because
* we never make WAL entries for temp rels, and so a temp rel poses no threat
- * to the health of a regular rel that has taken over its relfilenode number.
+ * to the health of a regular rel that has taken over its relfilenumber.
* The fact that temp rels and regular rels have different file naming
* patterns provides additional safety.
*
* All the above applies only to the relation's main fork; other forks can
* just be removed immediately, since they are not needed to prevent the
- * relfilenode number from being recycled. Also, we do not carefully
+ * relfilenumber from being recycled. Also, we do not carefully
* track whether other forks have been created or not, but just attempt to
* unlink them unconditionally; so we should never complain about ENOENT.
*
@@ -278,16 +278,16 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* we are usually not in a transaction anymore when this is called.
*/
void
-mdunlink(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
+mdunlink(RelFileLocatorBackend rlocator, ForkNumber forkNum, bool isRedo)
{
/* Now do the per-fork work */
if (forkNum == InvalidForkNumber)
{
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- mdunlinkfork(rnode, forkNum, isRedo);
+ mdunlinkfork(rlocator, forkNum, isRedo);
}
else
- mdunlinkfork(rnode, forkNum, isRedo);
+ mdunlinkfork(rlocator, forkNum, isRedo);
}
/*
@@ -315,25 +315,25 @@ do_truncate(const char *path)
}
static void
-mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
+mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forkNum, bool isRedo)
{
char *path;
int ret;
- path = relpath(rnode, forkNum);
+ path = relpath(rlocator, forkNum);
/*
* Delete or truncate the first segment.
*/
- if (isRedo || forkNum != MAIN_FORKNUM || RelFileNodeBackendIsTemp(rnode))
+ if (isRedo || forkNum != MAIN_FORKNUM || RelFileLocatorBackendIsTemp(rlocator))
{
- if (!RelFileNodeBackendIsTemp(rnode))
+ if (!RelFileLocatorBackendIsTemp(rlocator))
{
/* Prevent other backends' fds from holding on to the disk space */
ret = do_truncate(path);
/* Forget any pending sync requests for the first segment */
- register_forget_request(rnode, forkNum, 0 /* first seg */ );
+ register_forget_request(rlocator, forkNum, 0 /* first seg */ );
}
else
ret = 0;
@@ -354,7 +354,7 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
ret = do_truncate(path);
/* Register request to unlink first segment later */
- register_unlink_segment(rnode, forkNum, 0 /* first seg */ );
+ register_unlink_segment(rlocator, forkNum, 0 /* first seg */ );
}
/*
@@ -373,7 +373,7 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
{
sprintf(segpath, "%s.%u", path, segno);
- if (!RelFileNodeBackendIsTemp(rnode))
+ if (!RelFileLocatorBackendIsTemp(rlocator))
{
/*
* Prevent other backends' fds from holding on to the disk
@@ -386,7 +386,7 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
* Forget any pending sync requests for this segment before we
* try to unlink.
*/
- register_forget_request(rnode, forkNum, segno);
+ register_forget_request(rlocator, forkNum, segno);
}
if (unlink(segpath) < 0)
@@ -437,7 +437,7 @@ mdextend(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
ereport(ERROR,
(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
errmsg("cannot extend file \"%s\" beyond %u blocks",
- relpath(reln->smgr_rnode, forknum),
+ relpath(reln->smgr_rlocator, forknum),
InvalidBlockNumber)));
v = _mdfd_getseg(reln, forknum, blocknum, skipFsync, EXTENSION_CREATE);
@@ -490,7 +490,7 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
if (reln->md_num_open_segs[forknum] > 0)
return &reln->md_seg_fds[forknum][0];
- path = relpath(reln->smgr_rnode, forknum);
+ path = relpath(reln->smgr_rlocator, forknum);
fd = PathNameOpenFile(path, O_RDWR | PG_BINARY);
@@ -645,10 +645,10 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
MdfdVec *v;
TRACE_POSTGRESQL_SMGR_MD_READ_START(forknum, blocknum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
- reln->smgr_rnode.backend);
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
+ reln->smgr_rlocator.backend);
v = _mdfd_getseg(reln, forknum, blocknum, false,
EXTENSION_FAIL | EXTENSION_CREATE_RECOVERY);
@@ -660,10 +660,10 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
nbytes = FileRead(v->mdfd_vfd, buffer, BLCKSZ, seekpos, WAIT_EVENT_DATA_FILE_READ);
TRACE_POSTGRESQL_SMGR_MD_READ_DONE(forknum, blocknum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
- reln->smgr_rnode.backend,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
+ reln->smgr_rlocator.backend,
nbytes,
BLCKSZ);
@@ -715,10 +715,10 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
#endif
TRACE_POSTGRESQL_SMGR_MD_WRITE_START(forknum, blocknum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
- reln->smgr_rnode.backend);
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
+ reln->smgr_rlocator.backend);
v = _mdfd_getseg(reln, forknum, blocknum, skipFsync,
EXTENSION_FAIL | EXTENSION_CREATE_RECOVERY);
@@ -730,10 +730,10 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
nbytes = FileWrite(v->mdfd_vfd, buffer, BLCKSZ, seekpos, WAIT_EVENT_DATA_FILE_WRITE);
TRACE_POSTGRESQL_SMGR_MD_WRITE_DONE(forknum, blocknum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
- reln->smgr_rnode.backend,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
+ reln->smgr_rlocator.backend,
nbytes,
BLCKSZ);
@@ -842,7 +842,7 @@ mdtruncate(SMgrRelation reln, ForkNumber forknum, BlockNumber nblocks)
return;
ereport(ERROR,
(errmsg("could not truncate file \"%s\" to %u blocks: it's only %u blocks now",
- relpath(reln->smgr_rnode, forknum),
+ relpath(reln->smgr_rlocator, forknum),
nblocks, curnblk)));
}
if (nblocks == curnblk)
@@ -983,7 +983,7 @@ register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
{
FileTag tag;
- INIT_MD_FILETAG(tag, reln->smgr_rnode.node, forknum, seg->mdfd_segno);
+ INIT_MD_FILETAG(tag, reln->smgr_rlocator.locator, forknum, seg->mdfd_segno);
/* Temp relations should never be fsync'd */
Assert(!SmgrIsTemp(reln));
@@ -1005,15 +1005,15 @@ register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
* register_unlink_segment() -- Schedule a file to be deleted after next checkpoint
*/
static void
-register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
+register_unlink_segment(RelFileLocatorBackend rlocator, ForkNumber forknum,
BlockNumber segno)
{
FileTag tag;
- INIT_MD_FILETAG(tag, rnode.node, forknum, segno);
+ INIT_MD_FILETAG(tag, rlocator.locator, forknum, segno);
/* Should never be used with temp relations */
- Assert(!RelFileNodeBackendIsTemp(rnode));
+ Assert(!RelFileLocatorBackendIsTemp(rlocator));
RegisterSyncRequest(&tag, SYNC_UNLINK_REQUEST, true /* retryOnError */ );
}
@@ -1022,12 +1022,12 @@ register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
* register_forget_request() -- forget any fsyncs for a relation fork's segment
*/
static void
-register_forget_request(RelFileNodeBackend rnode, ForkNumber forknum,
+register_forget_request(RelFileLocatorBackend rlocator, ForkNumber forknum,
BlockNumber segno)
{
FileTag tag;
- INIT_MD_FILETAG(tag, rnode.node, forknum, segno);
+ INIT_MD_FILETAG(tag, rlocator.locator, forknum, segno);
RegisterSyncRequest(&tag, SYNC_FORGET_REQUEST, true /* retryOnError */ );
}
@@ -1039,13 +1039,13 @@ void
ForgetDatabaseSyncRequests(Oid dbid)
{
FileTag tag;
- RelFileNode rnode;
+ RelFileLocator rlocator;
- rnode.dbNode = dbid;
- rnode.spcNode = 0;
- rnode.relNode = 0;
+ rlocator.dbOid = dbid;
+ rlocator.spcOid = 0;
+ rlocator.relNumber = 0;
- INIT_MD_FILETAG(tag, rnode, InvalidForkNumber, InvalidBlockNumber);
+ INIT_MD_FILETAG(tag, rlocator, InvalidForkNumber, InvalidBlockNumber);
RegisterSyncRequest(&tag, SYNC_FILTER_REQUEST, true /* retryOnError */ );
}
@@ -1054,7 +1054,7 @@ ForgetDatabaseSyncRequests(Oid dbid)
* DropRelationFiles -- drop files of all given relations
*/
void
-DropRelationFiles(RelFileNode *delrels, int ndelrels, bool isRedo)
+DropRelationFiles(RelFileLocator *delrels, int ndelrels, bool isRedo)
{
SMgrRelation *srels;
int i;
@@ -1129,7 +1129,7 @@ _mdfd_segpath(SMgrRelation reln, ForkNumber forknum, BlockNumber segno)
char *path,
*fullpath;
- path = relpath(reln->smgr_rnode, forknum);
+ path = relpath(reln->smgr_rlocator, forknum);
if (segno > 0)
{
@@ -1345,7 +1345,7 @@ _mdnblocks(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
int
mdsyncfiletag(const FileTag *ftag, char *path)
{
- SMgrRelation reln = smgropen(ftag->rnode, InvalidBackendId);
+ SMgrRelation reln = smgropen(ftag->rlocator, InvalidBackendId);
File file;
bool need_to_close;
int result,
@@ -1395,7 +1395,7 @@ mdunlinkfiletag(const FileTag *ftag, char *path)
char *p;
/* Compute the path. */
- p = relpathperm(ftag->rnode, MAIN_FORKNUM);
+ p = relpathperm(ftag->rlocator, MAIN_FORKNUM);
strlcpy(path, p, MAXPGPATH);
pfree(p);
@@ -1417,5 +1417,5 @@ mdfiletagmatches(const FileTag *ftag, const FileTag *candidate)
* We'll return true for all candidates that have the same database OID as
* the ftag from the SYNC_FILTER_REQUEST request, so they're forgotten.
*/
- return ftag->rnode.dbNode == candidate->rnode.dbNode;
+ return ftag->rlocator.dbOid == candidate->rlocator.dbOid;
}
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index a477f70..b21d8c3 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -46,7 +46,7 @@ typedef struct f_smgr
void (*smgr_create) (SMgrRelation reln, ForkNumber forknum,
bool isRedo);
bool (*smgr_exists) (SMgrRelation reln, ForkNumber forknum);
- void (*smgr_unlink) (RelFileNodeBackend rnode, ForkNumber forknum,
+ void (*smgr_unlink) (RelFileLocatorBackend rlocator, ForkNumber forknum,
bool isRedo);
void (*smgr_extend) (SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum, char *buffer, bool skipFsync);
@@ -143,9 +143,9 @@ smgrshutdown(int code, Datum arg)
* This does not attempt to actually open the underlying file.
*/
SMgrRelation
-smgropen(RelFileNode rnode, BackendId backend)
+smgropen(RelFileLocator rlocator, BackendId backend)
{
- RelFileNodeBackend brnode;
+ RelFileLocatorBackend brlocator;
SMgrRelation reln;
bool found;
@@ -154,7 +154,7 @@ smgropen(RelFileNode rnode, BackendId backend)
/* First time through: initialize the hash table */
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileNodeBackend);
+ ctl.keysize = sizeof(RelFileLocatorBackend);
ctl.entrysize = sizeof(SMgrRelationData);
SMgrRelationHash = hash_create("smgr relation table", 400,
&ctl, HASH_ELEM | HASH_BLOBS);
@@ -162,10 +162,10 @@ smgropen(RelFileNode rnode, BackendId backend)
}
/* Look up or create an entry */
- brnode.node = rnode;
- brnode.backend = backend;
+ brlocator.locator = rlocator;
+ brlocator.backend = backend;
reln = (SMgrRelation) hash_search(SMgrRelationHash,
- (void *) &brnode,
+ (void *) &brlocator,
HASH_ENTER, &found);
/* Initialize it if not present before */
@@ -267,7 +267,7 @@ smgrclose(SMgrRelation reln)
dlist_delete(&reln->node);
if (hash_search(SMgrRelationHash,
- (void *) &(reln->smgr_rnode),
+ (void *) &(reln->smgr_rlocator),
HASH_REMOVE, NULL) == NULL)
elog(ERROR, "SMgrRelation hashtable corrupted");
@@ -335,15 +335,15 @@ smgrcloseall(void)
}
/*
- * smgrclosenode() -- Close SMgrRelation object for given RelFileNode,
+ * smgrcloserellocator() -- Close SMgrRelation object for given RelFileLocator,
* if one exists.
*
- * This has the same effects as smgrclose(smgropen(rnode)), but it avoids
+ * This has the same effects as smgrclose(smgropen(rlocator)), but it avoids
* uselessly creating a hashtable entry only to drop it again when no
* such entry exists already.
*/
void
-smgrclosenode(RelFileNodeBackend rnode)
+smgrcloserellocator(RelFileLocatorBackend rlocator)
{
SMgrRelation reln;
@@ -352,7 +352,7 @@ smgrclosenode(RelFileNodeBackend rnode)
return;
reln = (SMgrRelation) hash_search(SMgrRelationHash,
- (void *) &rnode,
+ (void *) &rlocator,
HASH_FIND, NULL);
if (reln != NULL)
smgrclose(reln);
@@ -420,7 +420,7 @@ void
smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo)
{
int i = 0;
- RelFileNodeBackend *rnodes;
+ RelFileLocatorBackend *rlocators;
ForkNumber forknum;
if (nrels == 0)
@@ -430,19 +430,19 @@ smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo)
* Get rid of any remaining buffers for the relations. bufmgr will just
* drop them without bothering to write the contents.
*/
- DropRelFileNodesAllBuffers(rels, nrels);
+ DropRelFileLocatorsAllBuffers(rels, nrels);
/*
* create an array which contains all relations to be dropped, and close
* each relation's forks at the smgr level while at it
*/
- rnodes = palloc(sizeof(RelFileNodeBackend) * nrels);
+ rlocators = palloc(sizeof(RelFileLocatorBackend) * nrels);
for (i = 0; i < nrels; i++)
{
- RelFileNodeBackend rnode = rels[i]->smgr_rnode;
+ RelFileLocatorBackend rlocator = rels[i]->smgr_rlocator;
int which = rels[i]->smgr_which;
- rnodes[i] = rnode;
+ rlocators[i] = rlocator;
/* Close the forks at smgr level */
for (forknum = 0; forknum <= MAX_FORKNUM; forknum++)
@@ -458,7 +458,7 @@ smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo)
* closed our own smgr rel.
*/
for (i = 0; i < nrels; i++)
- CacheInvalidateSmgr(rnodes[i]);
+ CacheInvalidateSmgr(rlocators[i]);
/*
* Delete the physical file(s).
@@ -473,10 +473,10 @@ smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo)
int which = rels[i]->smgr_which;
for (forknum = 0; forknum <= MAX_FORKNUM; forknum++)
- smgrsw[which].smgr_unlink(rnodes[i], forknum, isRedo);
+ smgrsw[which].smgr_unlink(rlocators[i], forknum, isRedo);
}
- pfree(rnodes);
+ pfree(rlocators);
}
@@ -631,7 +631,7 @@ smgrtruncate(SMgrRelation reln, ForkNumber *forknum, int nforks, BlockNumber *nb
* Get rid of any buffers for the about-to-be-deleted blocks. bufmgr will
* just drop them without bothering to write the contents.
*/
- DropRelFileNodeBuffers(reln, forknum, nforks, nblocks);
+ DropRelFileLocatorBuffers(reln, forknum, nforks, nblocks);
/*
* Send a shared-inval message to force other backends to close any smgr
@@ -643,7 +643,7 @@ smgrtruncate(SMgrRelation reln, ForkNumber *forknum, int nforks, BlockNumber *nb
* is a performance-critical path.) As in the unlink code, we want to be
* sure the message is sent before we start changing things on-disk.
*/
- CacheInvalidateSmgr(reln->smgr_rnode);
+ CacheInvalidateSmgr(reln->smgr_rlocator);
/* Do the truncation */
for (i = 0; i < nforks; i++)
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index b4a2c8d..d8ae082 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -27,7 +27,7 @@
#include "utils/builtins.h"
#include "utils/numeric.h"
#include "utils/rel.h"
-#include "utils/relfilenodemap.h"
+#include "utils/relfilenumbermap.h"
#include "utils/relmapper.h"
#include "utils/syscache.h"
@@ -292,7 +292,7 @@ pg_tablespace_size_name(PG_FUNCTION_ARGS)
* is no check here or at the call sites for that.
*/
static int64
-calculate_relation_size(RelFileNode *rfn, BackendId backend, ForkNumber forknum)
+calculate_relation_size(RelFileLocator *rfn, BackendId backend, ForkNumber forknum)
{
int64 totalsize = 0;
char *relationpath;
@@ -349,7 +349,7 @@ pg_relation_size(PG_FUNCTION_ARGS)
if (rel == NULL)
PG_RETURN_NULL();
- size = calculate_relation_size(&(rel->rd_node), rel->rd_backend,
+ size = calculate_relation_size(&(rel->rd_locator), rel->rd_backend,
forkname_to_number(text_to_cstring(forkName)));
relation_close(rel, AccessShareLock);
@@ -374,7 +374,7 @@ calculate_toast_table_size(Oid toastrelid)
/* toast heap size, including FSM and VM size */
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- size += calculate_relation_size(&(toastRel->rd_node),
+ size += calculate_relation_size(&(toastRel->rd_locator),
toastRel->rd_backend, forkNum);
/* toast index size, including FSM and VM size */
@@ -388,7 +388,7 @@ calculate_toast_table_size(Oid toastrelid)
toastIdxRel = relation_open(lfirst_oid(lc),
AccessShareLock);
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- size += calculate_relation_size(&(toastIdxRel->rd_node),
+ size += calculate_relation_size(&(toastIdxRel->rd_locator),
toastIdxRel->rd_backend, forkNum);
relation_close(toastIdxRel, AccessShareLock);
@@ -417,7 +417,7 @@ calculate_table_size(Relation rel)
* heap size, including FSM and VM
*/
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- size += calculate_relation_size(&(rel->rd_node), rel->rd_backend,
+ size += calculate_relation_size(&(rel->rd_locator), rel->rd_backend,
forkNum);
/*
@@ -456,7 +456,7 @@ calculate_indexes_size(Relation rel)
idxRel = relation_open(idxOid, AccessShareLock);
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- size += calculate_relation_size(&(idxRel->rd_node),
+ size += calculate_relation_size(&(idxRel->rd_locator),
idxRel->rd_backend,
forkNum);
@@ -850,7 +850,7 @@ Datum
pg_relation_filenode(PG_FUNCTION_ARGS)
{
Oid relid = PG_GETARG_OID(0);
- Oid result;
+ RelFileNumber result;
HeapTuple tuple;
Form_pg_class relform;
@@ -864,29 +864,29 @@ pg_relation_filenode(PG_FUNCTION_ARGS)
if (relform->relfilenode)
result = relform->relfilenode;
else /* Consult the relation mapper */
- result = RelationMapOidToFilenode(relid,
- relform->relisshared);
+ result = RelationMapOidToFilenumber(relid,
+ relform->relisshared);
}
else
{
/* no storage, return NULL */
- result = InvalidOid;
+ result = InvalidRelFileNumber;
}
ReleaseSysCache(tuple);
- if (!OidIsValid(result))
+ if (!RelFileNumberIsValid(result))
PG_RETURN_NULL();
PG_RETURN_OID(result);
}
/*
- * Get the relation via (reltablespace, relfilenode)
+ * Get the relation via (reltablespace, relfilenumber)
*
* This is expected to be used when somebody wants to match an individual file
* on the filesystem back to its table. That's not trivially possible via
- * pg_class, because that doesn't contain the relfilenodes of shared and nailed
+ * pg_class, because that doesn't contain the relfilenumbers of shared and nailed
* tables.
*
* We don't fail but return NULL if we cannot find a mapping.
@@ -898,14 +898,14 @@ Datum
pg_filenode_relation(PG_FUNCTION_ARGS)
{
Oid reltablespace = PG_GETARG_OID(0);
- Oid relfilenode = PG_GETARG_OID(1);
+ RelFileNumber relfilenumber = PG_GETARG_OID(1);
Oid heaprel;
- /* test needed so RelidByRelfilenode doesn't misbehave */
- if (!OidIsValid(relfilenode))
+ /* test needed so RelidByRelfilenumber doesn't misbehave */
+ if (!RelFileNumberIsValid(relfilenumber))
PG_RETURN_NULL();
- heaprel = RelidByRelfilenode(reltablespace, relfilenode);
+ heaprel = RelidByRelfilenumber(reltablespace, relfilenumber);
if (!OidIsValid(heaprel))
PG_RETURN_NULL();
@@ -924,7 +924,7 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
Oid relid = PG_GETARG_OID(0);
HeapTuple tuple;
Form_pg_class relform;
- RelFileNode rnode;
+ RelFileLocator rlocator;
BackendId backend;
char *path;
@@ -937,29 +937,29 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
{
/* This logic should match RelationInitPhysicalAddr */
if (relform->reltablespace)
- rnode.spcNode = relform->reltablespace;
+ rlocator.spcOid = relform->reltablespace;
else
- rnode.spcNode = MyDatabaseTableSpace;
- if (rnode.spcNode == GLOBALTABLESPACE_OID)
- rnode.dbNode = InvalidOid;
+ rlocator.spcOid = MyDatabaseTableSpace;
+ if (rlocator.spcOid == GLOBALTABLESPACE_OID)
+ rlocator.dbOid = InvalidOid;
else
- rnode.dbNode = MyDatabaseId;
+ rlocator.dbOid = MyDatabaseId;
if (relform->relfilenode)
- rnode.relNode = relform->relfilenode;
+ rlocator.relNumber = relform->relfilenode;
else /* Consult the relation mapper */
- rnode.relNode = RelationMapOidToFilenode(relid,
- relform->relisshared);
+ rlocator.relNumber = RelationMapOidToFilenumber(relid,
+ relform->relisshared);
}
else
{
/* no storage, return NULL */
- rnode.relNode = InvalidOid;
+ rlocator.relNumber = InvalidRelFileNumber;
/* some compilers generate warnings without these next two lines */
- rnode.dbNode = InvalidOid;
- rnode.spcNode = InvalidOid;
+ rlocator.dbOid = InvalidOid;
+ rlocator.spcOid = InvalidOid;
}
- if (!OidIsValid(rnode.relNode))
+ if (!RelFileNumberIsValid(rlocator.relNumber))
{
ReleaseSysCache(tuple);
PG_RETURN_NULL();
@@ -990,7 +990,7 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
ReleaseSysCache(tuple);
- path = relpathbackend(rnode, backend, MAIN_FORKNUM);
+ path = relpathbackend(rlocator, backend, MAIN_FORKNUM);
PG_RETURN_TEXT_P(cstring_to_text(path));
}
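calculate_relation_size, like _mdfd_segpath in the md.c hunk earlier, walks segment files that follow a simple naming rule: segment 0 is the bare relation path, segment N > 0 appends ".N". A sketch of that rule, with a made-up path (relpath() itself is not reproduced here):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Segment naming convention mirrored from _mdfd_segpath in the diff:
 * segment 0 is the bare relation path, later segments get a ".<segno>"
 * suffix. */
static void
seg_path(char *buf, size_t len, const char *relpath, unsigned segno)
{
    if (segno > 0)
        snprintf(buf, len, "%s.%u", relpath, segno);
    else
        snprintf(buf, len, "%s", relpath);
}
```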
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 67b9675e..4408c00 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -2,7 +2,7 @@
* pg_upgrade_support.c
*
* server-side functions to set backend global variables
- * to control oid and relfilenode assignment, and do other special
+ * to control oid and relfilenumber assignment, and do other special
* hacks needed for pg_upgrade.
*
* Copyright (c) 2010-2022, PostgreSQL Global Development Group
@@ -98,10 +98,10 @@ binary_upgrade_set_next_heap_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_heap_relfilenode(PG_FUNCTION_ARGS)
{
- Oid nodeoid = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_OID(0);
CHECK_IS_BINARY_UPGRADE;
- binary_upgrade_next_heap_pg_class_relfilenode = nodeoid;
+ binary_upgrade_next_heap_pg_class_relfilenumber = relfilenumber;
PG_RETURN_VOID();
}
@@ -120,10 +120,10 @@ binary_upgrade_set_next_index_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_index_relfilenode(PG_FUNCTION_ARGS)
{
- Oid nodeoid = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_OID(0);
CHECK_IS_BINARY_UPGRADE;
- binary_upgrade_next_index_pg_class_relfilenode = nodeoid;
+ binary_upgrade_next_index_pg_class_relfilenumber = relfilenumber;
PG_RETURN_VOID();
}
@@ -142,10 +142,10 @@ binary_upgrade_set_next_toast_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_toast_relfilenode(PG_FUNCTION_ARGS)
{
- Oid nodeoid = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_OID(0);
CHECK_IS_BINARY_UPGRADE;
- binary_upgrade_next_toast_pg_class_relfilenode = nodeoid;
+ binary_upgrade_next_toast_pg_class_relfilenumber = relfilenumber;
PG_RETURN_VOID();
}
diff --git a/src/backend/utils/cache/Makefile b/src/backend/utils/cache/Makefile
index 38e46d2..5105018 100644
--- a/src/backend/utils/cache/Makefile
+++ b/src/backend/utils/cache/Makefile
@@ -21,7 +21,7 @@ OBJS = \
partcache.o \
plancache.o \
relcache.o \
- relfilenodemap.o \
+ relfilenumbermap.o \
relmapper.o \
spccache.o \
syscache.o \
diff --git a/src/backend/utils/cache/inval.c b/src/backend/utils/cache/inval.c
index af000d4..eb5782f 100644
--- a/src/backend/utils/cache/inval.c
+++ b/src/backend/utils/cache/inval.c
@@ -661,11 +661,11 @@ LocalExecuteInvalidationMessage(SharedInvalidationMessage *msg)
* We could have smgr entries for relations of other databases, so no
* short-circuit test is possible here.
*/
- RelFileNodeBackend rnode;
+ RelFileLocatorBackend rlocator;
- rnode.node = msg->sm.rnode;
- rnode.backend = (msg->sm.backend_hi << 16) | (int) msg->sm.backend_lo;
- smgrclosenode(rnode);
+ rlocator.locator = msg->sm.rlocator;
+ rlocator.backend = (msg->sm.backend_hi << 16) | (int) msg->sm.backend_lo;
+ smgrcloserellocator(rlocator);
}
else if (msg->id == SHAREDINVALRELMAP_ID)
{
@@ -1459,14 +1459,14 @@ CacheInvalidateRelcacheByRelid(Oid relid)
* Thus, the maximum possible backend ID is 2^23-1.
*/
void
-CacheInvalidateSmgr(RelFileNodeBackend rnode)
+CacheInvalidateSmgr(RelFileLocatorBackend rlocator)
{
SharedInvalidationMessage msg;
msg.sm.id = SHAREDINVALSMGR_ID;
- msg.sm.backend_hi = rnode.backend >> 16;
- msg.sm.backend_lo = rnode.backend & 0xffff;
- msg.sm.rnode = rnode.node;
+ msg.sm.backend_hi = rlocator.backend >> 16;
+ msg.sm.backend_lo = rlocator.backend & 0xffff;
+ msg.sm.rlocator = rlocator.locator;
/* check AddCatcacheInvalidationMessage() for an explanation */
VALGRIND_MAKE_MEM_DEFINED(&msg, sizeof(msg));
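The backend_hi/backend_lo packing in CacheInvalidateSmgr exists because the sinval message format keeps only 24 bits for the backend ID (a signed 8-bit high part plus an unsigned 16-bit low part), which is why the comment above caps backend IDs at 2^23-1. A round-trip sketch; the field widths follow the PostgreSQL message struct, and the negative case relies on the usual arithmetic right-shift behavior (InvalidBackendId is -1):

```c
#include <assert.h>
#include <stdint.h>

/* Pack a backend ID into signed-8-bit hi / unsigned-16-bit lo parts and
 * reassemble it, mirroring CacheInvalidateSmgr and the unpack in inval.c.
 * Written with multiplication instead of shifting a negative value. */
static int
pack_unpack_backend(int backend)
{
    int8_t      hi = (int8_t) (backend >> 16);  /* signed high part */
    uint16_t    lo = (uint16_t) (backend & 0xffff);

    /* equivalent to the patch's (backend_hi << 16) | (int) backend_lo */
    return (int) hi * 65536 + (int) lo;
}
```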
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index f502df9..b80e2ec3 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -369,7 +369,7 @@ ScanPgRelation(Oid targetRelId, bool indexOK, bool force_non_historic)
/*
* The caller might need a tuple that's newer than the one the historic
* snapshot; currently the only case requiring to do so is looking up the
- * relfilenode of non mapped system relations during decoding. That
+ * relfilenumber of non mapped system relations during decoding. That
* snapshot can't change in the midst of a relcache build, so there's no
* need to register the snapshot.
*/
@@ -1133,8 +1133,8 @@ retry:
relation->rd_refcnt = 0;
relation->rd_isnailed = false;
relation->rd_createSubid = InvalidSubTransactionId;
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
relation->rd_droppedSubid = InvalidSubTransactionId;
switch (relation->rd_rel->relpersistence)
{
@@ -1300,7 +1300,7 @@ retry:
}
/*
- * Initialize the physical addressing info (RelFileNode) for a relcache entry
+ * Initialize the physical addressing info (RelFileLocator) for a relcache entry
*
* Note: at the physical level, relations in the pg_global tablespace must
* be treated as shared, even if relisshared isn't set. Hence we do not
@@ -1309,20 +1309,20 @@ retry:
static void
RelationInitPhysicalAddr(Relation relation)
{
- Oid oldnode = relation->rd_node.relNode;
+ RelFileNumber oldnumber = relation->rd_locator.relNumber;
/* these relations kinds never have storage */
if (!RELKIND_HAS_STORAGE(relation->rd_rel->relkind))
return;
if (relation->rd_rel->reltablespace)
- relation->rd_node.spcNode = relation->rd_rel->reltablespace;
+ relation->rd_locator.spcOid = relation->rd_rel->reltablespace;
else
- relation->rd_node.spcNode = MyDatabaseTableSpace;
- if (relation->rd_node.spcNode == GLOBALTABLESPACE_OID)
- relation->rd_node.dbNode = InvalidOid;
+ relation->rd_locator.spcOid = MyDatabaseTableSpace;
+ if (relation->rd_locator.spcOid == GLOBALTABLESPACE_OID)
+ relation->rd_locator.dbOid = InvalidOid;
else
- relation->rd_node.dbNode = MyDatabaseId;
+ relation->rd_locator.dbOid = MyDatabaseId;
if (relation->rd_rel->relfilenode)
{
@@ -1356,30 +1356,30 @@ RelationInitPhysicalAddr(Relation relation)
heap_freetuple(phys_tuple);
}
- relation->rd_node.relNode = relation->rd_rel->relfilenode;
+ relation->rd_locator.relNumber = relation->rd_rel->relfilenode;
}
else
{
/* Consult the relation mapper */
- relation->rd_node.relNode =
- RelationMapOidToFilenode(relation->rd_id,
- relation->rd_rel->relisshared);
- if (!OidIsValid(relation->rd_node.relNode))
+ relation->rd_locator.relNumber =
+ RelationMapOidToFilenumber(relation->rd_id,
+ relation->rd_rel->relisshared);
+ if (!RelFileNumberIsValid(relation->rd_locator.relNumber))
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
RelationGetRelationName(relation), relation->rd_id);
}
/*
* For RelationNeedsWAL() to answer correctly on parallel workers, restore
- * rd_firstRelfilenodeSubid. No subtransactions start or end while in
+ * rd_firstRelfilelocatorSubid. No subtransactions start or end while in
* parallel mode, so the specific SubTransactionId does not matter.
*/
- if (IsParallelWorker() && oldnode != relation->rd_node.relNode)
+ if (IsParallelWorker() && oldnumber != relation->rd_locator.relNumber)
{
- if (RelFileNodeSkippingWAL(relation->rd_node))
- relation->rd_firstRelfilenodeSubid = TopSubTransactionId;
+ if (RelFileLocatorSkippingWAL(relation->rd_locator))
+ relation->rd_firstRelfilelocatorSubid = TopSubTransactionId;
else
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
}
}
@@ -1889,8 +1889,8 @@ formrdesc(const char *relationName, Oid relationReltype,
*/
relation->rd_isnailed = true;
relation->rd_createSubid = InvalidSubTransactionId;
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
relation->rd_droppedSubid = InvalidSubTransactionId;
relation->rd_backend = InvalidBackendId;
relation->rd_islocaltemp = false;
@@ -1978,9 +1978,9 @@ formrdesc(const char *relationName, Oid relationReltype,
/*
* All relations made with formrdesc are mapped. This is necessarily so
- * because there is no other way to know what filenode they currently
+ * because there is no other way to know what filenumber they currently
* have. In bootstrap mode, add them to the initial relation mapper data,
- * specifying that the initial filenode is the same as the OID.
+ * specifying that the initial filenumber is the same as the OID.
*/
relation->rd_rel->relfilenode = InvalidOid;
if (IsBootstrapProcessingMode())
@@ -2180,7 +2180,7 @@ RelationClose(Relation relation)
#ifdef RELCACHE_FORCE_RELEASE
if (RelationHasReferenceCountZero(relation) &&
relation->rd_createSubid == InvalidSubTransactionId &&
- relation->rd_firstRelfilenodeSubid == InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid == InvalidSubTransactionId)
RelationClearRelation(relation, false);
#endif
}
@@ -2352,7 +2352,7 @@ RelationReloadNailed(Relation relation)
{
/*
* If it's a nailed-but-not-mapped index, then we need to re-read the
- * pg_class row to see if its relfilenode changed.
+ * pg_class row to see if its relfilenumber changed.
*/
RelationReloadIndexInfo(relation);
}
@@ -2700,8 +2700,8 @@ RelationClearRelation(Relation relation, bool rebuild)
Assert(newrel->rd_isnailed == relation->rd_isnailed);
/* creation sub-XIDs must be preserved */
SWAPFIELD(SubTransactionId, rd_createSubid);
- SWAPFIELD(SubTransactionId, rd_newRelfilenodeSubid);
- SWAPFIELD(SubTransactionId, rd_firstRelfilenodeSubid);
+ SWAPFIELD(SubTransactionId, rd_newRelfilelocatorSubid);
+ SWAPFIELD(SubTransactionId, rd_firstRelfilelocatorSubid);
SWAPFIELD(SubTransactionId, rd_droppedSubid);
/* un-swap rd_rel pointers, swap contents instead */
SWAPFIELD(Form_pg_class, rd_rel);
@@ -2791,12 +2791,12 @@ static void
RelationFlushRelation(Relation relation)
{
if (relation->rd_createSubid != InvalidSubTransactionId ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId)
{
/*
* New relcache entries are always rebuilt, not flushed; else we'd
* forget the "new" status of the relation. Ditto for the
- * new-relfilenode status.
+ * new-relfilenumber status.
*
* The rel could have zero refcnt here, so temporarily increment the
* refcnt to ensure it's safe to rebuild it. We can assume that the
@@ -2835,7 +2835,7 @@ RelationForgetRelation(Oid rid)
Assert(relation->rd_droppedSubid == InvalidSubTransactionId);
if (relation->rd_createSubid != InvalidSubTransactionId ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId)
{
/*
* In the event of subtransaction rollback, we must not forget
@@ -2894,7 +2894,7 @@ RelationCacheInvalidateEntry(Oid relationId)
*
* Apart from debug_discard_caches, this is currently used only to recover
* from SI message buffer overflow, so we do not touch relations having
- * new-in-transaction relfilenodes; they cannot be targets of cross-backend
+ * new-in-transaction relfilenumbers; they cannot be targets of cross-backend
* SI updates (and our own updates now go through a separate linked list
* that isn't limited by the SI message buffer size).
*
@@ -2909,7 +2909,7 @@ RelationCacheInvalidateEntry(Oid relationId)
* so hash_seq_search will complete safely; (b) during the second pass we
* only hold onto pointers to nondeletable entries.
*
- * The two-phase approach also makes it easy to update relfilenodes for
+ * The two-phase approach also makes it easy to update relfilenumbers for
* mapped relations before we do anything else, and to ensure that the
* second pass processes nailed-in-cache items before other nondeletable
* items. This should ensure that system catalogs are up to date before
@@ -2948,12 +2948,12 @@ RelationCacheInvalidate(bool debug_discard)
/*
* Ignore new relations; no other backend will manipulate them before
- * we commit. Likewise, before replacing a relation's relfilenode, we
- * shall have acquired AccessExclusiveLock and drained any applicable
- * pending invalidations.
+ * we commit. Likewise, before replacing a relation's relfilenumber,
+ * we shall have acquired AccessExclusiveLock and drained any
+ * applicable pending invalidations.
*/
if (relation->rd_createSubid != InvalidSubTransactionId ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId)
continue;
relcacheInvalsReceived++;
@@ -2967,8 +2967,8 @@ RelationCacheInvalidate(bool debug_discard)
else
{
/*
- * If it's a mapped relation, immediately update its rd_node in
- * case its relfilenode changed. We must do this during phase 1
+ * If it's a mapped relation, immediately update its rd_locator in
+ * case its relfilenumber changed. We must do this during phase 1
* in case the relation is consulted during rebuild of other
* relcache entries in phase 2. It's safe since consulting the
* map doesn't involve any access to relcache entries.
@@ -3078,14 +3078,14 @@ AssertPendingSyncConsistency(Relation relation)
RelationIsPermanent(relation) &&
((relation->rd_createSubid != InvalidSubTransactionId &&
RELKIND_HAS_STORAGE(relation->rd_rel->relkind)) ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId);
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId);
- Assert(relcache_verdict == RelFileNodeSkippingWAL(relation->rd_node));
+ Assert(relcache_verdict == RelFileLocatorSkippingWAL(relation->rd_locator));
if (relation->rd_droppedSubid != InvalidSubTransactionId)
Assert(!relation->rd_isvalid &&
(relation->rd_createSubid != InvalidSubTransactionId ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId));
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId));
}
/*
@@ -3282,8 +3282,8 @@ AtEOXact_cleanup(Relation relation, bool isCommit)
* also lets RelationClearRelation() drop the relcache entry.
*/
relation->rd_createSubid = InvalidSubTransactionId;
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
relation->rd_droppedSubid = InvalidSubTransactionId;
if (clear_relcache)
@@ -3397,8 +3397,8 @@ AtEOSubXact_cleanup(Relation relation, bool isCommit,
{
/* allow the entry to be removed */
relation->rd_createSubid = InvalidSubTransactionId;
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
relation->rd_droppedSubid = InvalidSubTransactionId;
RelationClearRelation(relation, false);
return;
@@ -3419,23 +3419,23 @@ AtEOSubXact_cleanup(Relation relation, bool isCommit,
}
/*
- * Likewise, update or drop any new-relfilenode-in-subtransaction record
+ * Likewise, update or drop any new-relfilenumber-in-subtransaction record
* or drop record.
*/
- if (relation->rd_newRelfilenodeSubid == mySubid)
+ if (relation->rd_newRelfilelocatorSubid == mySubid)
{
if (isCommit)
- relation->rd_newRelfilenodeSubid = parentSubid;
+ relation->rd_newRelfilelocatorSubid = parentSubid;
else
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
}
- if (relation->rd_firstRelfilenodeSubid == mySubid)
+ if (relation->rd_firstRelfilelocatorSubid == mySubid)
{
if (isCommit)
- relation->rd_firstRelfilenodeSubid = parentSubid;
+ relation->rd_firstRelfilelocatorSubid = parentSubid;
else
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
}
if (relation->rd_droppedSubid == mySubid)
@@ -3459,7 +3459,7 @@ RelationBuildLocalRelation(const char *relname,
TupleDesc tupDesc,
Oid relid,
Oid accessmtd,
- Oid relfilenode,
+ RelFileNumber relfilenumber,
Oid reltablespace,
bool shared_relation,
bool mapped_relation,
@@ -3533,8 +3533,8 @@ RelationBuildLocalRelation(const char *relname,
/* it's being created in this transaction */
rel->rd_createSubid = GetCurrentSubTransactionId();
- rel->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- rel->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ rel->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ rel->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
rel->rd_droppedSubid = InvalidSubTransactionId;
/*
@@ -3616,7 +3616,7 @@ RelationBuildLocalRelation(const char *relname,
/*
* Insert relation physical and logical identifiers (OIDs) into the right
- * places. For a mapped relation, we set relfilenode to zero and rely on
+ * places. For a mapped relation, we set relfilenumber to zero and rely on
* RelationInitPhysicalAddr to consult the map.
*/
rel->rd_rel->relisshared = shared_relation;
@@ -3632,10 +3632,10 @@ RelationBuildLocalRelation(const char *relname,
{
rel->rd_rel->relfilenode = InvalidOid;
/* Add it to the active mapping information */
- RelationMapUpdateMap(relid, relfilenode, shared_relation, true);
+ RelationMapUpdateMap(relid, relfilenumber, shared_relation, true);
}
else
- rel->rd_rel->relfilenode = relfilenode;
+ rel->rd_rel->relfilenode = relfilenumber;
RelationInitLockInfo(rel); /* see lmgr.c */
@@ -3683,13 +3683,13 @@ RelationBuildLocalRelation(const char *relname,
/*
- * RelationSetNewRelfilenode
+ * RelationSetNewRelfilenumber
*
- * Assign a new relfilenode (physical file name), and possibly a new
+ * Assign a new relfilenumber (physical file name), and possibly a new
* persistence setting, to the relation.
*
* This allows a full rewrite of the relation to be done with transactional
- * safety (since the filenode assignment can be rolled back). Note however
+ * safety (since the filenumber assignment can be rolled back). Note however
* that there is no simple way to access the relation's old data for the
* remainder of the current transaction. This limits the usefulness to cases
* such as TRUNCATE or rebuilding an index from scratch.
@@ -3697,19 +3697,19 @@ RelationBuildLocalRelation(const char *relname,
* Caller must already hold exclusive lock on the relation.
*/
void
-RelationSetNewRelfilenode(Relation relation, char persistence)
+RelationSetNewRelfilenumber(Relation relation, char persistence)
{
- Oid newrelfilenode;
+ RelFileNumber newrelfilenumber;
Relation pg_class;
HeapTuple tuple;
Form_pg_class classform;
MultiXactId minmulti = InvalidMultiXactId;
TransactionId freezeXid = InvalidTransactionId;
- RelFileNode newrnode;
+ RelFileLocator newrlocator;
- /* Allocate a new relfilenode */
- newrelfilenode = GetNewRelFileNode(relation->rd_rel->reltablespace, NULL,
- persistence);
+ /* Allocate a new relfilenumber */
+ newrelfilenumber = GetNewRelFileNumber(relation->rd_rel->reltablespace,
+ NULL, persistence);
/*
* Get a writable copy of the pg_class tuple for the given relation.
@@ -3729,28 +3729,28 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
RelationDropStorage(relation);
/*
- * Create storage for the main fork of the new relfilenode. If it's a
+ * Create storage for the main fork of the new relfilenumber. If it's a
* table-like object, call into the table AM to do so, which'll also
* create the table's init fork if needed.
*
- * NOTE: If relevant for the AM, any conflict in relfilenode value will be
- * caught here, if GetNewRelFileNode messes up for any reason.
+ * NOTE: If relevant for the AM, any conflict in relfilenumber value will be
+ * caught here, if GetNewRelFileNumber messes up for any reason.
*/
- newrnode = relation->rd_node;
- newrnode.relNode = newrelfilenode;
+ newrlocator = relation->rd_locator;
+ newrlocator.relNumber = newrelfilenumber;
if (RELKIND_HAS_TABLE_AM(relation->rd_rel->relkind))
{
- table_relation_set_new_filenode(relation, &newrnode,
- persistence,
- &freezeXid, &minmulti);
+ table_relation_set_new_filelocator(relation, &newrlocator,
+ persistence,
+ &freezeXid, &minmulti);
}
else if (RELKIND_HAS_STORAGE(relation->rd_rel->relkind))
{
/* handle these directly, at least for now */
SMgrRelation srel;
- srel = RelationCreateStorage(newrnode, persistence, true);
+ srel = RelationCreateStorage(newrlocator, persistence, true);
smgrclose(srel);
}
else
@@ -3789,7 +3789,7 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
/* Do the deed */
RelationMapUpdateMap(RelationGetRelid(relation),
- newrelfilenode,
+ newrelfilenumber,
relation->rd_rel->relisshared,
false);
@@ -3799,7 +3799,7 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
else
{
/* Normal case, update the pg_class entry */
- classform->relfilenode = newrelfilenode;
+ classform->relfilenode = newrelfilenumber;
/* relpages etc. never change for sequences */
if (relation->rd_rel->relkind != RELKIND_SEQUENCE)
@@ -3825,27 +3825,27 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
*/
CommandCounterIncrement();
- RelationAssumeNewRelfilenode(relation);
+ RelationAssumeNewRelfilelocator(relation);
}
/*
- * RelationAssumeNewRelfilenode
+ * RelationAssumeNewRelfilelocator
*
* Code that modifies pg_class.reltablespace or pg_class.relfilenode must call
* this. The call shall precede any code that might insert WAL records whose
- * replay would modify bytes in the new RelFileNode, and the call shall follow
- * any WAL modifying bytes in the prior RelFileNode. See struct RelationData.
+ * replay would modify bytes in the new RelFileLocator, and the call shall follow
+ * any WAL modifying bytes in the prior RelFileLocator. See struct RelationData.
* Ideally, call this as near as possible to the CommandCounterIncrement()
* that makes the pg_class change visible (before it or after it); that
* minimizes the chance of future development adding a forbidden WAL insertion
- * between RelationAssumeNewRelfilenode() and CommandCounterIncrement().
+ * between RelationAssumeNewRelfilelocator() and CommandCounterIncrement().
*/
void
-RelationAssumeNewRelfilenode(Relation relation)
+RelationAssumeNewRelfilelocator(Relation relation)
{
- relation->rd_newRelfilenodeSubid = GetCurrentSubTransactionId();
- if (relation->rd_firstRelfilenodeSubid == InvalidSubTransactionId)
- relation->rd_firstRelfilenodeSubid = relation->rd_newRelfilenodeSubid;
+ relation->rd_newRelfilelocatorSubid = GetCurrentSubTransactionId();
+ if (relation->rd_firstRelfilelocatorSubid == InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid = relation->rd_newRelfilelocatorSubid;
/* Flag relation as needing eoxact cleanup (to clear these fields) */
EOXactListAdd(relation);
@@ -6254,8 +6254,8 @@ load_relcache_init_file(bool shared)
rel->rd_fkeyvalid = false;
rel->rd_fkeylist = NIL;
rel->rd_createSubid = InvalidSubTransactionId;
- rel->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- rel->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ rel->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ rel->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
rel->rd_droppedSubid = InvalidSubTransactionId;
rel->rd_amcache = NULL;
rel->pgstat_info = NULL;
diff --git a/src/backend/utils/cache/relfilenodemap.c b/src/backend/utils/cache/relfilenodemap.c
deleted file mode 100644
index 70c323c..0000000
--- a/src/backend/utils/cache/relfilenodemap.c
+++ /dev/null
@@ -1,244 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * relfilenodemap.c
- * relfilenode to oid mapping cache.
- *
- * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * IDENTIFICATION
- * src/backend/utils/cache/relfilenodemap.c
- *
- *-------------------------------------------------------------------------
- */
-#include "postgres.h"
-
-#include "access/genam.h"
-#include "access/htup_details.h"
-#include "access/table.h"
-#include "catalog/pg_class.h"
-#include "catalog/pg_tablespace.h"
-#include "miscadmin.h"
-#include "utils/builtins.h"
-#include "utils/catcache.h"
-#include "utils/fmgroids.h"
-#include "utils/hsearch.h"
-#include "utils/inval.h"
-#include "utils/rel.h"
-#include "utils/relfilenodemap.h"
-#include "utils/relmapper.h"
-
-/* Hash table for information about each relfilenode <-> oid pair */
-static HTAB *RelfilenodeMapHash = NULL;
-
-/* built first time through in InitializeRelfilenodeMap */
-static ScanKeyData relfilenode_skey[2];
-
-typedef struct
-{
- Oid reltablespace;
- Oid relfilenode;
-} RelfilenodeMapKey;
-
-typedef struct
-{
- RelfilenodeMapKey key; /* lookup key - must be first */
- Oid relid; /* pg_class.oid */
-} RelfilenodeMapEntry;
-
-/*
- * RelfilenodeMapInvalidateCallback
- * Flush mapping entries when pg_class is updated in a relevant fashion.
- */
-static void
-RelfilenodeMapInvalidateCallback(Datum arg, Oid relid)
-{
- HASH_SEQ_STATUS status;
- RelfilenodeMapEntry *entry;
-
- /* callback only gets registered after creating the hash */
- Assert(RelfilenodeMapHash != NULL);
-
- hash_seq_init(&status, RelfilenodeMapHash);
- while ((entry = (RelfilenodeMapEntry *) hash_seq_search(&status)) != NULL)
- {
- /*
- * If relid is InvalidOid, signaling a complete reset, we must remove
- * all entries, otherwise just remove the specific relation's entry.
- * Always remove negative cache entries.
- */
- if (relid == InvalidOid || /* complete reset */
- entry->relid == InvalidOid || /* negative cache entry */
- entry->relid == relid) /* individual flushed relation */
- {
- if (hash_search(RelfilenodeMapHash,
- (void *) &entry->key,
- HASH_REMOVE,
- NULL) == NULL)
- elog(ERROR, "hash table corrupted");
- }
- }
-}
-
-/*
- * InitializeRelfilenodeMap
- * Initialize cache, either on first use or after a reset.
- */
-static void
-InitializeRelfilenodeMap(void)
-{
- HASHCTL ctl;
- int i;
-
- /* Make sure we've initialized CacheMemoryContext. */
- if (CacheMemoryContext == NULL)
- CreateCacheMemoryContext();
-
- /* build skey */
- MemSet(&relfilenode_skey, 0, sizeof(relfilenode_skey));
-
- for (i = 0; i < 2; i++)
- {
- fmgr_info_cxt(F_OIDEQ,
- &relfilenode_skey[i].sk_func,
- CacheMemoryContext);
- relfilenode_skey[i].sk_strategy = BTEqualStrategyNumber;
- relfilenode_skey[i].sk_subtype = InvalidOid;
- relfilenode_skey[i].sk_collation = InvalidOid;
- }
-
- relfilenode_skey[0].sk_attno = Anum_pg_class_reltablespace;
- relfilenode_skey[1].sk_attno = Anum_pg_class_relfilenode;
-
- /*
- * Only create the RelfilenodeMapHash now, so we don't end up partially
- * initialized when fmgr_info_cxt() above ERRORs out with an out of memory
- * error.
- */
- ctl.keysize = sizeof(RelfilenodeMapKey);
- ctl.entrysize = sizeof(RelfilenodeMapEntry);
- ctl.hcxt = CacheMemoryContext;
-
- RelfilenodeMapHash =
- hash_create("RelfilenodeMap cache", 64, &ctl,
- HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
-
- /* Watch for invalidation events. */
- CacheRegisterRelcacheCallback(RelfilenodeMapInvalidateCallback,
- (Datum) 0);
-}
-
-/*
- * Map a relation's (tablespace, filenode) to a relation's oid and cache the
- * result.
- *
- * Returns InvalidOid if no relation matching the criteria could be found.
- */
-Oid
-RelidByRelfilenode(Oid reltablespace, Oid relfilenode)
-{
- RelfilenodeMapKey key;
- RelfilenodeMapEntry *entry;
- bool found;
- SysScanDesc scandesc;
- Relation relation;
- HeapTuple ntp;
- ScanKeyData skey[2];
- Oid relid;
-
- if (RelfilenodeMapHash == NULL)
- InitializeRelfilenodeMap();
-
- /* pg_class will show 0 when the value is actually MyDatabaseTableSpace */
- if (reltablespace == MyDatabaseTableSpace)
- reltablespace = 0;
-
- MemSet(&key, 0, sizeof(key));
- key.reltablespace = reltablespace;
- key.relfilenode = relfilenode;
-
- /*
- * Check cache and return entry if one is found. Even if no target
- * relation can be found later on we store the negative match and return a
- * InvalidOid from cache. That's not really necessary for performance
- * since querying invalid values isn't supposed to be a frequent thing,
- * but it's basically free.
- */
- entry = hash_search(RelfilenodeMapHash, (void *) &key, HASH_FIND, &found);
-
- if (found)
- return entry->relid;
-
- /* ok, no previous cache entry, do it the hard way */
-
- /* initialize empty/negative cache entry before doing the actual lookups */
- relid = InvalidOid;
-
- if (reltablespace == GLOBALTABLESPACE_OID)
- {
- /*
- * Ok, shared table, check relmapper.
- */
- relid = RelationMapFilenodeToOid(relfilenode, true);
- }
- else
- {
- /*
- * Not a shared table, could either be a plain relation or a
- * non-shared, nailed one, like e.g. pg_class.
- */
-
- /* check for plain relations by looking in pg_class */
- relation = table_open(RelationRelationId, AccessShareLock);
-
- /* copy scankey to local copy, it will be modified during the scan */
- memcpy(skey, relfilenode_skey, sizeof(skey));
-
- /* set scan arguments */
- skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
- skey[1].sk_argument = ObjectIdGetDatum(relfilenode);
-
- scandesc = systable_beginscan(relation,
- ClassTblspcRelfilenodeIndexId,
- true,
- NULL,
- 2,
- skey);
-
- found = false;
-
- while (HeapTupleIsValid(ntp = systable_getnext(scandesc)))
- {
- Form_pg_class classform = (Form_pg_class) GETSTRUCT(ntp);
-
- if (found)
- elog(ERROR,
- "unexpected duplicate for tablespace %u, relfilenode %u",
- reltablespace, relfilenode);
- found = true;
-
- Assert(classform->reltablespace == reltablespace);
- Assert(classform->relfilenode == relfilenode);
- relid = classform->oid;
- }
-
- systable_endscan(scandesc);
- table_close(relation, AccessShareLock);
-
- /* check for tables that are mapped but not shared */
- if (!found)
- relid = RelationMapFilenodeToOid(relfilenode, false);
- }
-
- /*
- * Only enter entry into cache now, our opening of pg_class could have
- * caused cache invalidations to be executed which would have deleted a
- * new entry if we had entered it above.
- */
- entry = hash_search(RelfilenodeMapHash, (void *) &key, HASH_ENTER, &found);
- if (found)
- elog(ERROR, "corrupted hashtable");
- entry->relid = relid;
-
- return relid;
-}
diff --git a/src/backend/utils/cache/relfilenumbermap.c b/src/backend/utils/cache/relfilenumbermap.c
new file mode 100644
index 0000000..3dc45e9
--- /dev/null
+++ b/src/backend/utils/cache/relfilenumbermap.c
@@ -0,0 +1,244 @@
+/*-------------------------------------------------------------------------
+ *
+ * relfilenumbermap.c
+ * relfilenumber to oid mapping cache.
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/backend/utils/cache/relfilenumbermap.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/htup_details.h"
+#include "access/table.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_tablespace.h"
+#include "miscadmin.h"
+#include "utils/builtins.h"
+#include "utils/catcache.h"
+#include "utils/fmgroids.h"
+#include "utils/hsearch.h"
+#include "utils/inval.h"
+#include "utils/rel.h"
+#include "utils/relfilenumbermap.h"
+#include "utils/relmapper.h"
+
+/* Hash table for information about each relfilenumber <-> oid pair */
+static HTAB *RelfilenumberMapHash = NULL;
+
+/* built first time through in InitializeRelfilenumberMap */
+static ScanKeyData relfilenumber_skey[2];
+
+typedef struct
+{
+ Oid reltablespace;
+ RelFileNumber relfilenumber;
+} RelfilenumberMapKey;
+
+typedef struct
+{
+ RelfilenumberMapKey key; /* lookup key - must be first */
+ Oid relid; /* pg_class.oid */
+} RelfilenumberMapEntry;
+
+/*
+ * RelfilenumberMapInvalidateCallback
+ * Flush mapping entries when pg_class is updated in a relevant fashion.
+ */
+static void
+RelfilenumberMapInvalidateCallback(Datum arg, Oid relid)
+{
+ HASH_SEQ_STATUS status;
+ RelfilenumberMapEntry *entry;
+
+ /* callback only gets registered after creating the hash */
+ Assert(RelfilenumberMapHash != NULL);
+
+ hash_seq_init(&status, RelfilenumberMapHash);
+ while ((entry = (RelfilenumberMapEntry *) hash_seq_search(&status)) != NULL)
+ {
+ /*
+ * If relid is InvalidOid, signaling a complete reset, we must remove
+ * all entries, otherwise just remove the specific relation's entry.
+ * Always remove negative cache entries.
+ */
+ if (relid == InvalidOid || /* complete reset */
+ entry->relid == InvalidOid || /* negative cache entry */
+ entry->relid == relid) /* individual flushed relation */
+ {
+ if (hash_search(RelfilenumberMapHash,
+ (void *) &entry->key,
+ HASH_REMOVE,
+ NULL) == NULL)
+ elog(ERROR, "hash table corrupted");
+ }
+ }
+}
+
+/*
+ * InitializeRelfilenumberMap
+ * Initialize cache, either on first use or after a reset.
+ */
+static void
+InitializeRelfilenumberMap(void)
+{
+ HASHCTL ctl;
+ int i;
+
+ /* Make sure we've initialized CacheMemoryContext. */
+ if (CacheMemoryContext == NULL)
+ CreateCacheMemoryContext();
+
+ /* build skey */
+ MemSet(&relfilenumber_skey, 0, sizeof(relfilenumber_skey));
+
+ for (i = 0; i < 2; i++)
+ {
+ fmgr_info_cxt(F_OIDEQ,
+ &relfilenumber_skey[i].sk_func,
+ CacheMemoryContext);
+ relfilenumber_skey[i].sk_strategy = BTEqualStrategyNumber;
+ relfilenumber_skey[i].sk_subtype = InvalidOid;
+ relfilenumber_skey[i].sk_collation = InvalidOid;
+ }
+
+ relfilenumber_skey[0].sk_attno = Anum_pg_class_reltablespace;
+ relfilenumber_skey[1].sk_attno = Anum_pg_class_relfilenode;
+
+ /*
+ * Only create the RelfilenumberMapHash now, so we don't end up partially
+ * initialized when fmgr_info_cxt() above ERRORs out with an out of memory
+ * error.
+ */
+ ctl.keysize = sizeof(RelfilenumberMapKey);
+ ctl.entrysize = sizeof(RelfilenumberMapEntry);
+ ctl.hcxt = CacheMemoryContext;
+
+ RelfilenumberMapHash =
+ hash_create("RelfilenumberMap cache", 64, &ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
+ /* Watch for invalidation events. */
+ CacheRegisterRelcacheCallback(RelfilenumberMapInvalidateCallback,
+ (Datum) 0);
+}
+
+/*
+ * Map a relation's (tablespace, relfilenumber) to a relation's oid and cache
+ * the result.
+ *
+ * Returns InvalidOid if no relation matching the criteria could be found.
+ */
+Oid
+RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
+{
+ RelfilenumberMapKey key;
+ RelfilenumberMapEntry *entry;
+ bool found;
+ SysScanDesc scandesc;
+ Relation relation;
+ HeapTuple ntp;
+ ScanKeyData skey[2];
+ Oid relid;
+
+ if (RelfilenumberMapHash == NULL)
+ InitializeRelfilenumberMap();
+
+ /* pg_class will show 0 when the value is actually MyDatabaseTableSpace */
+ if (reltablespace == MyDatabaseTableSpace)
+ reltablespace = 0;
+
+ MemSet(&key, 0, sizeof(key));
+ key.reltablespace = reltablespace;
+ key.relfilenumber = relfilenumber;
+
+ /*
+ * Check the cache and return the entry if one is found. Even if no
+ * target relation can be found later on, we store the negative match
+ * and return InvalidOid from the cache. That's not really necessary
+ * for performance, since querying invalid values isn't supposed to be
+ * a frequent thing, but it's basically free.
+ */
+ entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_FIND, &found);
+
+ if (found)
+ return entry->relid;
+
+ /* ok, no previous cache entry, do it the hard way */
+
+ /* initialize empty/negative cache entry before doing the actual lookups */
+ relid = InvalidOid;
+
+ if (reltablespace == GLOBALTABLESPACE_OID)
+ {
+ /*
+ * Ok, shared table, check relmapper.
+ */
+ relid = RelationMapFilenumberToOid(relfilenumber, true);
+ }
+ else
+ {
+ /*
+ * Not a shared table; it could be either a plain relation or a
+ * non-shared, nailed one, e.g. pg_class.
+ */
+
+ /* check for plain relations by looking in pg_class */
+ relation = table_open(RelationRelationId, AccessShareLock);
+
+ /* copy scankey to local copy, it will be modified during the scan */
+ memcpy(skey, relfilenumber_skey, sizeof(skey));
+
+ /* set scan arguments */
+ skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
+ skey[1].sk_argument = ObjectIdGetDatum(relfilenumber);
+
+ scandesc = systable_beginscan(relation,
+ ClassTblspcRelfilenodeIndexId,
+ true,
+ NULL,
+ 2,
+ skey);
+
+ found = false;
+
+ while (HeapTupleIsValid(ntp = systable_getnext(scandesc)))
+ {
+ Form_pg_class classform = (Form_pg_class) GETSTRUCT(ntp);
+
+ if (found)
+ elog(ERROR,
+ "unexpected duplicate for tablespace %u, relfilenumber %u",
+ reltablespace, relfilenumber);
+ found = true;
+
+ Assert(classform->reltablespace == reltablespace);
+ Assert(classform->relfilenode == relfilenumber);
+ relid = classform->oid;
+ }
+
+ systable_endscan(scandesc);
+ table_close(relation, AccessShareLock);
+
+ /* check for tables that are mapped but not shared */
+ if (!found)
+ relid = RelationMapFilenumberToOid(relfilenumber, false);
+ }
+
+ /*
+ * Only enter entry into cache now, our opening of pg_class could have
+ * caused cache invalidations to be executed which would have deleted a
+ * new entry if we had entered it above.
+ */
+ entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_ENTER, &found);
+ if (found)
+ elog(ERROR, "corrupted hashtable");
+ entry->relid = relid;
+
+ return relid;
+}
diff --git a/src/backend/utils/cache/relmapper.c b/src/backend/utils/cache/relmapper.c
index 2a330cf..2dd236f 100644
--- a/src/backend/utils/cache/relmapper.c
+++ b/src/backend/utils/cache/relmapper.c
@@ -1,7 +1,7 @@
/*-------------------------------------------------------------------------
*
* relmapper.c
- * Catalog-to-filenode mapping
+ * Catalog-to-filenumber mapping
*
* For most tables, the physical file underlying the table is specified by
* pg_class.relfilenode. However, that obviously won't work for pg_class
@@ -11,7 +11,7 @@
* update other databases' pg_class entries when relocating a shared catalog.
* Therefore, for these special catalogs (henceforth referred to as "mapped
* catalogs") we rely on a separately maintained file that shows the mapping
- * from catalog OIDs to filenode numbers. Each database has a map file for
+ * from catalog OIDs to filenumbers. Each database has a map file for
* its local mapped catalogs, and there is a separate map file for shared
* catalogs. Mapped catalogs have zero in their pg_class.relfilenode entries.
*
@@ -78,8 +78,8 @@
typedef struct RelMapping
{
- Oid mapoid; /* OID of a catalog */
- Oid mapfilenode; /* its filenode number */
+ Oid mapoid; /* OID of a catalog */
+ RelFileNumber mapfilenumber; /* its relfilenumber */
} RelMapping;
typedef struct RelMapFile
@@ -116,7 +116,7 @@ static RelMapFile local_map;
* subtransactions, so one set of transaction-level changes is sufficient.
*
* The active_xxx variables contain updates that are valid in our transaction
- * and should be honored by RelationMapOidToFilenode. The pending_xxx
+ * and should be honored by RelationMapOidToFilenumber. The pending_xxx
* variables contain updates we have been told about that aren't active yet;
* they will become active at the next CommandCounterIncrement. This setup
* lets map updates act similarly to updates of pg_class rows, ie, they
@@ -132,8 +132,8 @@ static RelMapFile pending_local_updates;
/* non-export function prototypes */
-static void apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode,
- bool add_okay);
+static void apply_map_update(RelMapFile *map, Oid relationId,
+ RelFileNumber filenumber, bool add_okay);
static void merge_map_updates(RelMapFile *map, const RelMapFile *updates,
bool add_okay);
static void load_relmap_file(bool shared, bool lock_held);
@@ -146,9 +146,9 @@ static void perform_relmap_update(bool shared, const RelMapFile *updates);
/*
- * RelationMapOidToFilenode
+ * RelationMapOidToFilenumber
*
- * The raison d' etre ... given a relation OID, look up its filenode.
+ * The raison d' etre ... given a relation OID, look up its filenumber.
*
* Although shared and local relation OIDs should never overlap, the caller
* always knows which we need --- so pass that information to avoid useless
@@ -157,8 +157,8 @@ static void perform_relmap_update(bool shared, const RelMapFile *updates);
* Returns InvalidOid if the OID is not known (which should never happen,
* but the caller is in a better position to report a meaningful error).
*/
-Oid
-RelationMapOidToFilenode(Oid relationId, bool shared)
+RelFileNumber
+RelationMapOidToFilenumber(Oid relationId, bool shared)
{
const RelMapFile *map;
int32 i;
@@ -170,13 +170,13 @@ RelationMapOidToFilenode(Oid relationId, bool shared)
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return map->mappings[i].mapfilenumber;
}
map = &shared_map;
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return map->mappings[i].mapfilenumber;
}
}
else
@@ -185,33 +185,33 @@ RelationMapOidToFilenode(Oid relationId, bool shared)
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return map->mappings[i].mapfilenumber;
}
map = &local_map;
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return map->mappings[i].mapfilenumber;
}
}
- return InvalidOid;
+ return InvalidRelFileNumber;
}
/*
- * RelationMapFilenodeToOid
+ * RelationMapFilenumberToOid
*
* Do the reverse of the normal direction of mapping done in
- * RelationMapOidToFilenode.
+ * RelationMapOidToFilenumber.
*
* This is not supposed to be used during normal running but rather for
* information purposes when looking at the filesystem or xlog.
*
* Returns InvalidOid if the OID is not known; this can easily happen if the
- * relfilenode doesn't pertain to a mapped relation.
+ * relfilenumber doesn't pertain to a mapped relation.
*/
Oid
-RelationMapFilenodeToOid(Oid filenode, bool shared)
+RelationMapFilenumberToOid(RelFileNumber filenumber, bool shared)
{
const RelMapFile *map;
int32 i;
@@ -222,13 +222,13 @@ RelationMapFilenodeToOid(Oid filenode, bool shared)
map = &active_shared_updates;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenumber == map->mappings[i].mapfilenumber)
return map->mappings[i].mapoid;
}
map = &shared_map;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenumber == map->mappings[i].mapfilenumber)
return map->mappings[i].mapoid;
}
}
@@ -237,13 +237,13 @@ RelationMapFilenodeToOid(Oid filenode, bool shared)
map = &active_local_updates;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenumber == map->mappings[i].mapfilenumber)
return map->mappings[i].mapoid;
}
map = &local_map;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenumber == map->mappings[i].mapfilenumber)
return map->mappings[i].mapoid;
}
}
@@ -252,13 +252,13 @@ RelationMapFilenodeToOid(Oid filenode, bool shared)
}
/*
- * RelationMapOidToFilenodeForDatabase
+ * RelationMapOidToFilenumberForDatabase
*
- * Like RelationMapOidToFilenode, but reads the mapping from the indicated
+ * Like RelationMapOidToFilenumber, but reads the mapping from the indicated
* path instead of using the one for the current database.
*/
-Oid
-RelationMapOidToFilenodeForDatabase(char *dbpath, Oid relationId)
+RelFileNumber
+RelationMapOidToFilenumberForDatabase(char *dbpath, Oid relationId)
{
RelMapFile map;
int i;
@@ -270,10 +270,10 @@ RelationMapOidToFilenodeForDatabase(char *dbpath, Oid relationId)
for (i = 0; i < map.num_mappings; i++)
{
if (relationId == map.mappings[i].mapoid)
- return map.mappings[i].mapfilenode;
+ return map.mappings[i].mapfilenumber;
}
- return InvalidOid;
+ return InvalidRelFileNumber;
}
/*
@@ -311,13 +311,13 @@ RelationMapCopy(Oid dbid, Oid tsid, char *srcdbpath, char *dstdbpath)
/*
* RelationMapUpdateMap
*
- * Install a new relfilenode mapping for the specified relation.
+ * Install a new relfilenumber mapping for the specified relation.
*
* If immediate is true (or we're bootstrapping), the mapping is activated
* immediately. Otherwise it is made pending until CommandCounterIncrement.
*/
void
-RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
+RelationMapUpdateMap(Oid relationId, RelFileNumber fileNumber, bool shared,
bool immediate)
{
RelMapFile *map;
@@ -362,7 +362,7 @@ RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
map = &pending_local_updates;
}
}
- apply_map_update(map, relationId, fileNode, true);
+ apply_map_update(map, relationId, fileNumber, true);
}
/*
@@ -375,7 +375,8 @@ RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
* add_okay = false to draw an error if not.
*/
static void
-apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode, bool add_okay)
+apply_map_update(RelMapFile *map, Oid relationId, RelFileNumber fileNumber,
+ bool add_okay)
{
int32 i;
@@ -384,7 +385,7 @@ apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode, bool add_okay)
{
if (relationId == map->mappings[i].mapoid)
{
- map->mappings[i].mapfilenode = fileNode;
+ map->mappings[i].mapfilenumber = fileNumber;
return;
}
}
@@ -396,7 +397,7 @@ apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode, bool add_okay)
if (map->num_mappings >= MAX_MAPPINGS)
elog(ERROR, "ran out of space in relation map");
map->mappings[map->num_mappings].mapoid = relationId;
- map->mappings[map->num_mappings].mapfilenode = fileNode;
+ map->mappings[map->num_mappings].mapfilenumber = fileNumber;
map->num_mappings++;
}
@@ -415,7 +416,7 @@ merge_map_updates(RelMapFile *map, const RelMapFile *updates, bool add_okay)
{
apply_map_update(map,
updates->mappings[i].mapoid,
- updates->mappings[i].mapfilenode,
+ updates->mappings[i].mapfilenumber,
add_okay);
}
}
@@ -983,12 +984,12 @@ write_relmap_file(RelMapFile *newmap, bool write_wal, bool send_sinval,
for (i = 0; i < newmap->num_mappings; i++)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
- rnode.spcNode = tsid;
- rnode.dbNode = dbid;
- rnode.relNode = newmap->mappings[i].mapfilenode;
- RelationPreserveStorage(rnode, false);
+ rlocator.spcOid = tsid;
+ rlocator.dbOid = dbid;
+ rlocator.relNumber = newmap->mappings[i].mapfilenumber;
+ RelationPreserveStorage(rlocator, false);
}
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 7cc9c72..30b2f85 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4805,16 +4805,16 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
bool is_index)
{
PQExpBuffer upgrade_query = createPQExpBuffer();
- PGresult *upgrade_res;
- Oid relfilenode;
- Oid toast_oid;
- Oid toast_relfilenode;
- char relkind;
- Oid toast_index_oid;
- Oid toast_index_relfilenode;
+ PGresult *upgrade_res;
+ RelFileNumber relfilenumber;
+ Oid toast_oid;
+ RelFileNumber toast_relfilenumber;
+ char relkind;
+ Oid toast_index_oid;
+ RelFileNumber toast_index_relfilenumber;
/*
- * Preserve the OID and relfilenode of the table, table's index, table's
+ * Preserve the OID and relfilenumber of the table, table's index, table's
* toast table and toast table's index if any.
*
* One complexity is that the current table definition might not require
@@ -4837,15 +4837,15 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
relkind = *PQgetvalue(upgrade_res, 0, PQfnumber(upgrade_res, "relkind"));
- relfilenode = atooid(PQgetvalue(upgrade_res, 0,
+ relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "relfilenode")));
toast_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "reltoastrelid")));
- toast_relfilenode = atooid(PQgetvalue(upgrade_res, 0,
+ toast_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "toast_relfilenode")));
toast_index_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "indexrelid")));
- toast_index_relfilenode = atooid(PQgetvalue(upgrade_res, 0,
+ toast_index_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "toast_index_relfilenode")));
appendPQExpBufferStr(upgrade_buffer,
@@ -4859,13 +4859,13 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
/*
* Not every relation has storage. Also, in a pre-v12 database,
- * partitioned tables have a relfilenode, which should not be
+ * partitioned tables have a relfilenumber, which should not be
* preserved when upgrading.
*/
- if (OidIsValid(relfilenode) && relkind != RELKIND_PARTITIONED_TABLE)
+ if (RelFileNumberIsValid(relfilenumber) && relkind != RELKIND_PARTITIONED_TABLE)
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
- relfilenode);
+ relfilenumber);
/*
* In a pre-v12 database, partitioned tables might be marked as having
@@ -4879,7 +4879,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
toast_oid);
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('%u'::pg_catalog.oid);\n",
- toast_relfilenode);
+ toast_relfilenumber);
/* every toast table has an index */
appendPQExpBuffer(upgrade_buffer,
@@ -4887,20 +4887,20 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
toast_index_oid);
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
- toast_index_relfilenode);
+ toast_index_relfilenumber);
}
PQclear(upgrade_res);
}
else
{
- /* Preserve the OID and relfilenode of the index */
+ /* Preserve the OID and relfilenumber of the index */
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
pg_class_oid);
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
- relfilenode);
+ relfilenumber);
}
appendPQExpBufferChar(upgrade_buffer, '\n');
diff --git a/src/bin/pg_rewind/datapagemap.h b/src/bin/pg_rewind/datapagemap.h
index ae4965f..235b676 100644
--- a/src/bin/pg_rewind/datapagemap.h
+++ b/src/bin/pg_rewind/datapagemap.h
@@ -10,7 +10,7 @@
#define DATAPAGEMAP_H
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
struct datapagemap
{
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 6252931..269ed64 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -56,7 +56,7 @@ static uint32 hash_string_pointer(const char *s);
static filehash_hash *filehash;
static bool isRelDataFile(const char *path);
-static char *datasegpath(RelFileNode rnode, ForkNumber forknum,
+static char *datasegpath(RelFileLocator rlocator, ForkNumber forknum,
BlockNumber segno);
static file_entry_t *insert_filehash_entry(const char *path);
@@ -288,7 +288,7 @@ process_target_file(const char *path, file_type_t type, size_t size,
* hash table!
*/
void
-process_target_wal_block_change(ForkNumber forknum, RelFileNode rnode,
+process_target_wal_block_change(ForkNumber forknum, RelFileLocator rlocator,
BlockNumber blkno)
{
char *path;
@@ -299,7 +299,7 @@ process_target_wal_block_change(ForkNumber forknum, RelFileNode rnode,
segno = blkno / RELSEG_SIZE;
blkno_inseg = blkno % RELSEG_SIZE;
- path = datasegpath(rnode, forknum, segno);
+ path = datasegpath(rlocator, forknum, segno);
entry = lookup_filehash_entry(path);
pfree(path);
@@ -508,7 +508,7 @@ print_filemap(filemap_t *filemap)
static bool
isRelDataFile(const char *path)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
unsigned int segNo;
int nmatch;
bool matched;
@@ -532,32 +532,32 @@ isRelDataFile(const char *path)
*
*----
*/
- rnode.spcNode = InvalidOid;
- rnode.dbNode = InvalidOid;
- rnode.relNode = InvalidOid;
+ rlocator.spcOid = InvalidOid;
+ rlocator.dbOid = InvalidOid;
+ rlocator.relNumber = InvalidRelFileNumber;
segNo = 0;
matched = false;
- nmatch = sscanf(path, "global/%u.%u", &rnode.relNode, &segNo);
+ nmatch = sscanf(path, "global/%u.%u", &rlocator.relNumber, &segNo);
if (nmatch == 1 || nmatch == 2)
{
- rnode.spcNode = GLOBALTABLESPACE_OID;
- rnode.dbNode = 0;
+ rlocator.spcOid = GLOBALTABLESPACE_OID;
+ rlocator.dbOid = 0;
matched = true;
}
else
{
nmatch = sscanf(path, "base/%u/%u.%u",
- &rnode.dbNode, &rnode.relNode, &segNo);
+ &rlocator.dbOid, &rlocator.relNumber, &segNo);
if (nmatch == 2 || nmatch == 3)
{
- rnode.spcNode = DEFAULTTABLESPACE_OID;
+ rlocator.spcOid = DEFAULTTABLESPACE_OID;
matched = true;
}
else
{
nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/%u.%u",
- &rnode.spcNode, &rnode.dbNode, &rnode.relNode,
+ &rlocator.spcOid, &rlocator.dbOid, &rlocator.relNumber,
&segNo);
if (nmatch == 3 || nmatch == 4)
matched = true;
@@ -567,12 +567,12 @@ isRelDataFile(const char *path)
/*
* The sscanf tests above can match files that have extra characters at
* the end. To eliminate such cases, cross-check that GetRelationPath
- * creates the exact same filename, when passed the RelFileNode
+ * creates the exact same filename, when passed the RelFileLocator
* information we extracted from the filename.
*/
if (matched)
{
- char *check_path = datasegpath(rnode, MAIN_FORKNUM, segNo);
+ char *check_path = datasegpath(rlocator, MAIN_FORKNUM, segNo);
if (strcmp(check_path, path) != 0)
matched = false;
@@ -589,12 +589,12 @@ isRelDataFile(const char *path)
* The returned path is palloc'd
*/
static char *
-datasegpath(RelFileNode rnode, ForkNumber forknum, BlockNumber segno)
+datasegpath(RelFileLocator rlocator, ForkNumber forknum, BlockNumber segno)
{
char *path;
char *segpath;
- path = relpathperm(rnode, forknum);
+ path = relpathperm(rlocator, forknum);
if (segno > 0)
{
segpath = psprintf("%s.%u", path, segno);
diff --git a/src/bin/pg_rewind/filemap.h b/src/bin/pg_rewind/filemap.h
index 096f57a..0e011fb 100644
--- a/src/bin/pg_rewind/filemap.h
+++ b/src/bin/pg_rewind/filemap.h
@@ -10,7 +10,7 @@
#include "datapagemap.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/* these enum values are sorted in the order we want actions to be processed */
typedef enum
@@ -103,7 +103,7 @@ extern void process_source_file(const char *path, file_type_t type,
extern void process_target_file(const char *path, file_type_t type,
size_t size, const char *link_target);
extern void process_target_wal_block_change(ForkNumber forknum,
- RelFileNode rnode,
+ RelFileLocator rlocator,
BlockNumber blkno);
extern filemap_t *decide_file_actions(void);
diff --git a/src/bin/pg_rewind/parsexlog.c b/src/bin/pg_rewind/parsexlog.c
index c6792da..d97240e 100644
--- a/src/bin/pg_rewind/parsexlog.c
+++ b/src/bin/pg_rewind/parsexlog.c
@@ -445,18 +445,18 @@ extractPageInfo(XLogReaderState *record)
for (block_id = 0; block_id <= XLogRecMaxBlockId(record); block_id++)
{
- RelFileNode rnode;
- ForkNumber forknum;
- BlockNumber blkno;
+ RelFileLocator rlocator;
+ ForkNumber forknum;
+ BlockNumber blkno;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blkno, NULL))
+ &rlocator, &forknum, &blkno, NULL))
continue;
/* We only care about the main fork; others are copied in toto */
if (forknum != MAIN_FORKNUM)
continue;
- process_target_wal_block_change(forknum, rnode, blkno);
+ process_target_wal_block_change(forknum, rlocator, blkno);
}
}
diff --git a/src/bin/pg_rewind/pg_rewind.h b/src/bin/pg_rewind/pg_rewind.h
index 393182f..8b4b50a 100644
--- a/src/bin/pg_rewind/pg_rewind.h
+++ b/src/bin/pg_rewind/pg_rewind.h
@@ -16,7 +16,7 @@
#include "datapagemap.h"
#include "libpq-fe.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/* Configuration options */
extern char *datadir_target;
diff --git a/src/bin/pg_upgrade/Makefile b/src/bin/pg_upgrade/Makefile
index 587793e..7f8042f 100644
--- a/src/bin/pg_upgrade/Makefile
+++ b/src/bin/pg_upgrade/Makefile
@@ -19,7 +19,7 @@ OBJS = \
option.o \
parallel.o \
pg_upgrade.o \
- relfilenode.o \
+ relfilenumber.o \
server.o \
tablespace.o \
util.o \
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 36b0670..5d30b87 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -190,9 +190,9 @@ create_rel_filename_map(const char *old_data, const char *new_data,
map->new_tablespace_suffix = new_cluster.tablespace_suffix;
}
- /* DB oid and relfilenodes are preserved between old and new cluster */
+ /* DB oid and relfilenumbers are preserved between old and new cluster */
map->db_oid = old_db->db_oid;
- map->relfilenode = old_rel->relfilenode;
+ map->relfilenumber = old_rel->relfilenumber;
/* used only for logging and error reporting, old/new are identical */
map->nspname = old_rel->nspname;
@@ -399,7 +399,7 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
i_reloid,
i_indtable,
i_toastheap,
- i_relfilenode,
+ i_relfilenumber,
i_reltablespace;
char query[QUERY_ALLOC];
char *last_namespace = NULL,
@@ -495,7 +495,7 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
i_toastheap = PQfnumber(res, "toastheap");
i_nspname = PQfnumber(res, "nspname");
i_relname = PQfnumber(res, "relname");
- i_relfilenode = PQfnumber(res, "relfilenode");
+ i_relfilenumber = PQfnumber(res, "relfilenode");
i_reltablespace = PQfnumber(res, "reltablespace");
i_spclocation = PQfnumber(res, "spclocation");
@@ -527,7 +527,7 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
relname = PQgetvalue(res, relnum, i_relname);
curr->relname = pg_strdup(relname);
- curr->relfilenode = atooid(PQgetvalue(res, relnum, i_relfilenode));
+ curr->relfilenumber = atooid(PQgetvalue(res, relnum, i_relfilenumber));
curr->tblsp_alloc = false;
/* Is the tablespace oid non-default? */
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 55de244..30c3ee6 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -132,15 +132,15 @@ extern char *output_files[];
typedef struct
{
/* Can't use NAMEDATALEN; not guaranteed to be same on client */
- char *nspname; /* namespace name */
- char *relname; /* relation name */
- Oid reloid; /* relation OID */
- Oid relfilenode; /* relation file node */
- Oid indtable; /* if index, OID of its table, else 0 */
- Oid toastheap; /* if toast table, OID of base table, else 0 */
- char *tablespace; /* tablespace path; "" for cluster default */
- bool nsp_alloc; /* should nspname be freed? */
- bool tblsp_alloc; /* should tablespace be freed? */
+ char *nspname; /* namespace name */
+ char *relname; /* relation name */
+ Oid reloid; /* relation OID */
+ RelFileNumber relfilenumber; /* relation file number */
+ Oid indtable; /* if index, OID of its table, else 0 */
+ Oid toastheap; /* if toast table, OID of base table, else 0 */
+ char *tablespace; /* tablespace path; "" for cluster default */
+ bool nsp_alloc; /* should nspname be freed? */
+ bool tblsp_alloc; /* should tablespace be freed? */
} RelInfo;
typedef struct
@@ -159,7 +159,7 @@ typedef struct
const char *old_tablespace_suffix;
const char *new_tablespace_suffix;
Oid db_oid;
- Oid relfilenode;
+ RelFileNumber relfilenumber;
/* the rest are used only for logging and error reporting */
char *nspname; /* namespaces */
char *relname;
@@ -400,7 +400,7 @@ void parseCommandLine(int argc, char *argv[]);
void adjust_data_dir(ClusterInfo *cluster);
void get_sock_dir(ClusterInfo *cluster, bool live_check);
-/* relfilenode.c */
+/* relfilenumber.c */
void transfer_all_new_tablespaces(DbInfoArr *old_db_arr,
DbInfoArr *new_db_arr, char *old_pgdata, char *new_pgdata);
diff --git a/src/bin/pg_upgrade/relfilenode.c b/src/bin/pg_upgrade/relfilenode.c
deleted file mode 100644
index d23ac88..0000000
--- a/src/bin/pg_upgrade/relfilenode.c
+++ /dev/null
@@ -1,259 +0,0 @@
-/*
- * relfilenode.c
- *
- * relfilenode functions
- *
- * Copyright (c) 2010-2022, PostgreSQL Global Development Group
- * src/bin/pg_upgrade/relfilenode.c
- */
-
-#include "postgres_fe.h"
-
-#include <sys/stat.h>
-
-#include "access/transam.h"
-#include "catalog/pg_class_d.h"
-#include "pg_upgrade.h"
-
-static void transfer_single_new_db(FileNameMap *maps, int size, char *old_tablespace);
-static void transfer_relfile(FileNameMap *map, const char *suffix, bool vm_must_add_frozenbit);
-
-
-/*
- * transfer_all_new_tablespaces()
- *
- * Responsible for upgrading all database. invokes routines to generate mappings and then
- * physically link the databases.
- */
-void
-transfer_all_new_tablespaces(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
- char *old_pgdata, char *new_pgdata)
-{
- switch (user_opts.transfer_mode)
- {
- case TRANSFER_MODE_CLONE:
- prep_status_progress("Cloning user relation files");
- break;
- case TRANSFER_MODE_COPY:
- prep_status_progress("Copying user relation files");
- break;
- case TRANSFER_MODE_LINK:
- prep_status_progress("Linking user relation files");
- break;
- }
-
- /*
- * Transferring files by tablespace is tricky because a single database
- * can use multiple tablespaces. For non-parallel mode, we just pass a
- * NULL tablespace path, which matches all tablespaces. In parallel mode,
- * we pass the default tablespace and all user-created tablespaces and let
- * those operations happen in parallel.
- */
- if (user_opts.jobs <= 1)
- parallel_transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata,
- new_pgdata, NULL);
- else
- {
- int tblnum;
-
- /* transfer default tablespace */
- parallel_transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata,
- new_pgdata, old_pgdata);
-
- for (tblnum = 0; tblnum < os_info.num_old_tablespaces; tblnum++)
- parallel_transfer_all_new_dbs(old_db_arr,
- new_db_arr,
- old_pgdata,
- new_pgdata,
- os_info.old_tablespaces[tblnum]);
- /* reap all children */
- while (reap_child(true) == true)
- ;
- }
-
- end_progress_output();
- check_ok();
-}
-
-
-/*
- * transfer_all_new_dbs()
- *
- * Responsible for upgrading all database. invokes routines to generate mappings and then
- * physically link the databases.
- */
-void
-transfer_all_new_dbs(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
- char *old_pgdata, char *new_pgdata, char *old_tablespace)
-{
- int old_dbnum,
- new_dbnum;
-
- /* Scan the old cluster databases and transfer their files */
- for (old_dbnum = new_dbnum = 0;
- old_dbnum < old_db_arr->ndbs;
- old_dbnum++, new_dbnum++)
- {
- DbInfo *old_db = &old_db_arr->dbs[old_dbnum],
- *new_db = NULL;
- FileNameMap *mappings;
- int n_maps;
-
- /*
- * Advance past any databases that exist in the new cluster but not in
- * the old, e.g. "postgres". (The user might have removed the
- * 'postgres' database from the old cluster.)
- */
- for (; new_dbnum < new_db_arr->ndbs; new_dbnum++)
- {
- new_db = &new_db_arr->dbs[new_dbnum];
- if (strcmp(old_db->db_name, new_db->db_name) == 0)
- break;
- }
-
- if (new_dbnum >= new_db_arr->ndbs)
- pg_fatal("old database \"%s\" not found in the new cluster\n",
- old_db->db_name);
-
- mappings = gen_db_file_maps(old_db, new_db, &n_maps, old_pgdata,
- new_pgdata);
- if (n_maps)
- {
- transfer_single_new_db(mappings, n_maps, old_tablespace);
- }
- /* We allocate something even for n_maps == 0 */
- pg_free(mappings);
- }
-}
-
-/*
- * transfer_single_new_db()
- *
- * create links for mappings stored in "maps" array.
- */
-static void
-transfer_single_new_db(FileNameMap *maps, int size, char *old_tablespace)
-{
- int mapnum;
- bool vm_must_add_frozenbit = false;
-
- /*
- * Do we need to rewrite visibilitymap?
- */
- if (old_cluster.controldata.cat_ver < VISIBILITY_MAP_FROZEN_BIT_CAT_VER &&
- new_cluster.controldata.cat_ver >= VISIBILITY_MAP_FROZEN_BIT_CAT_VER)
- vm_must_add_frozenbit = true;
-
- for (mapnum = 0; mapnum < size; mapnum++)
- {
- if (old_tablespace == NULL ||
- strcmp(maps[mapnum].old_tablespace, old_tablespace) == 0)
- {
- /* transfer primary file */
- transfer_relfile(&maps[mapnum], "", vm_must_add_frozenbit);
-
- /*
- * Copy/link any fsm and vm files, if they exist
- */
- transfer_relfile(&maps[mapnum], "_fsm", vm_must_add_frozenbit);
- transfer_relfile(&maps[mapnum], "_vm", vm_must_add_frozenbit);
- }
- }
-}
-
-
-/*
- * transfer_relfile()
- *
- * Copy or link file from old cluster to new one. If vm_must_add_frozenbit
- * is true, visibility map forks are converted and rewritten, even in link
- * mode.
- */
-static void
-transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_frozenbit)
-{
- char old_file[MAXPGPATH];
- char new_file[MAXPGPATH];
- int segno;
- char extent_suffix[65];
- struct stat statbuf;
-
- /*
- * Now copy/link any related segments as well. Remember, PG breaks large
- * files into 1GB segments, the first segment has no extension, subsequent
- * segments are named relfilenode.1, relfilenode.2, relfilenode.3.
- */
- for (segno = 0;; segno++)
- {
- if (segno == 0)
- extent_suffix[0] = '\0';
- else
- snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
-
- snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
- map->old_tablespace,
- map->old_tablespace_suffix,
- map->db_oid,
- map->relfilenode,
- type_suffix,
- extent_suffix);
- snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
- map->new_tablespace,
- map->new_tablespace_suffix,
- map->db_oid,
- map->relfilenode,
- type_suffix,
- extent_suffix);
-
- /* Is it an extent, fsm, or vm file? */
- if (type_suffix[0] != '\0' || segno != 0)
- {
- /* Did file open fail? */
- if (stat(old_file, &statbuf) != 0)
- {
- /* File does not exist? That's OK, just return */
- if (errno == ENOENT)
- return;
- else
- pg_fatal("error while checking for file existence \"%s.%s\" (\"%s\" to \"%s\"): %s\n",
- map->nspname, map->relname, old_file, new_file,
- strerror(errno));
- }
-
- /* If file is empty, just return */
- if (statbuf.st_size == 0)
- return;
- }
-
- unlink(new_file);
-
- /* Copying files might take some time, so give feedback. */
- pg_log(PG_STATUS, "%s", old_file);
-
- if (vm_must_add_frozenbit && strcmp(type_suffix, "_vm") == 0)
- {
- /* Need to rewrite visibility map format */
- pg_log(PG_VERBOSE, "rewriting \"%s\" to \"%s\"\n",
- old_file, new_file);
- rewriteVisibilityMap(old_file, new_file, map->nspname, map->relname);
- }
- else
- switch (user_opts.transfer_mode)
- {
- case TRANSFER_MODE_CLONE:
- pg_log(PG_VERBOSE, "cloning \"%s\" to \"%s\"\n",
- old_file, new_file);
- cloneFile(old_file, new_file, map->nspname, map->relname);
- break;
- case TRANSFER_MODE_COPY:
- pg_log(PG_VERBOSE, "copying \"%s\" to \"%s\"\n",
- old_file, new_file);
- copyFile(old_file, new_file, map->nspname, map->relname);
- break;
- case TRANSFER_MODE_LINK:
- pg_log(PG_VERBOSE, "linking \"%s\" to \"%s\"\n",
- old_file, new_file);
- linkFile(old_file, new_file, map->nspname, map->relname);
- }
- }
-}
diff --git a/src/bin/pg_upgrade/relfilenumber.c b/src/bin/pg_upgrade/relfilenumber.c
new file mode 100644
index 0000000..b3ad820
--- /dev/null
+++ b/src/bin/pg_upgrade/relfilenumber.c
@@ -0,0 +1,259 @@
+/*
+ * relfilenumber.c
+ *
+ * relfilenumber functions
+ *
+ * Copyright (c) 2010-2022, PostgreSQL Global Development Group
+ * src/bin/pg_upgrade/relfilenumber.c
+ */
+
+#include "postgres_fe.h"
+
+#include <sys/stat.h>
+
+#include "access/transam.h"
+#include "catalog/pg_class_d.h"
+#include "pg_upgrade.h"
+
+static void transfer_single_new_db(FileNameMap *maps, int size, char *old_tablespace);
+static void transfer_relfile(FileNameMap *map, const char *suffix, bool vm_must_add_frozenbit);
+
+
+/*
+ * transfer_all_new_tablespaces()
+ *
+ * Responsible for upgrading all databases.  Invokes routines to generate
+ * mappings and then physically link the databases.
+ */
+void
+transfer_all_new_tablespaces(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
+ char *old_pgdata, char *new_pgdata)
+{
+ switch (user_opts.transfer_mode)
+ {
+ case TRANSFER_MODE_CLONE:
+ prep_status_progress("Cloning user relation files");
+ break;
+ case TRANSFER_MODE_COPY:
+ prep_status_progress("Copying user relation files");
+ break;
+ case TRANSFER_MODE_LINK:
+ prep_status_progress("Linking user relation files");
+ break;
+ }
+
+ /*
+ * Transferring files by tablespace is tricky because a single database
+ * can use multiple tablespaces. For non-parallel mode, we just pass a
+ * NULL tablespace path, which matches all tablespaces. In parallel mode,
+ * we pass the default tablespace and all user-created tablespaces and let
+ * those operations happen in parallel.
+ */
+ if (user_opts.jobs <= 1)
+ parallel_transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata,
+ new_pgdata, NULL);
+ else
+ {
+ int tblnum;
+
+ /* transfer default tablespace */
+ parallel_transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata,
+ new_pgdata, old_pgdata);
+
+ for (tblnum = 0; tblnum < os_info.num_old_tablespaces; tblnum++)
+ parallel_transfer_all_new_dbs(old_db_arr,
+ new_db_arr,
+ old_pgdata,
+ new_pgdata,
+ os_info.old_tablespaces[tblnum]);
+ /* reap all children */
+ while (reap_child(true) == true)
+ ;
+ }
+
+ end_progress_output();
+ check_ok();
+}
+
+
+/*
+ * transfer_all_new_dbs()
+ *
+ * Responsible for upgrading all databases.  Invokes routines to generate
+ * mappings and then physically link the databases.
+ */
+void
+transfer_all_new_dbs(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
+ char *old_pgdata, char *new_pgdata, char *old_tablespace)
+{
+ int old_dbnum,
+ new_dbnum;
+
+ /* Scan the old cluster databases and transfer their files */
+ for (old_dbnum = new_dbnum = 0;
+ old_dbnum < old_db_arr->ndbs;
+ old_dbnum++, new_dbnum++)
+ {
+ DbInfo *old_db = &old_db_arr->dbs[old_dbnum],
+ *new_db = NULL;
+ FileNameMap *mappings;
+ int n_maps;
+
+ /*
+ * Advance past any databases that exist in the new cluster but not in
+ * the old, e.g. "postgres". (The user might have removed the
+ * 'postgres' database from the old cluster.)
+ */
+ for (; new_dbnum < new_db_arr->ndbs; new_dbnum++)
+ {
+ new_db = &new_db_arr->dbs[new_dbnum];
+ if (strcmp(old_db->db_name, new_db->db_name) == 0)
+ break;
+ }
+
+ if (new_dbnum >= new_db_arr->ndbs)
+ pg_fatal("old database \"%s\" not found in the new cluster\n",
+ old_db->db_name);
+
+ mappings = gen_db_file_maps(old_db, new_db, &n_maps, old_pgdata,
+ new_pgdata);
+ if (n_maps)
+ {
+ transfer_single_new_db(mappings, n_maps, old_tablespace);
+ }
+ /* We allocate something even for n_maps == 0 */
+ pg_free(mappings);
+ }
+}
+
+/*
+ * transfer_single_new_db()
+ *
+ * create links for mappings stored in "maps" array.
+ */
+static void
+transfer_single_new_db(FileNameMap *maps, int size, char *old_tablespace)
+{
+ int mapnum;
+ bool vm_must_add_frozenbit = false;
+
+ /*
+ * Do we need to rewrite visibilitymap?
+ */
+ if (old_cluster.controldata.cat_ver < VISIBILITY_MAP_FROZEN_BIT_CAT_VER &&
+ new_cluster.controldata.cat_ver >= VISIBILITY_MAP_FROZEN_BIT_CAT_VER)
+ vm_must_add_frozenbit = true;
+
+ for (mapnum = 0; mapnum < size; mapnum++)
+ {
+ if (old_tablespace == NULL ||
+ strcmp(maps[mapnum].old_tablespace, old_tablespace) == 0)
+ {
+ /* transfer primary file */
+ transfer_relfile(&maps[mapnum], "", vm_must_add_frozenbit);
+
+ /*
+ * Copy/link any fsm and vm files, if they exist
+ */
+ transfer_relfile(&maps[mapnum], "_fsm", vm_must_add_frozenbit);
+ transfer_relfile(&maps[mapnum], "_vm", vm_must_add_frozenbit);
+ }
+ }
+}
+
+
+/*
+ * transfer_relfile()
+ *
+ * Copy or link file from old cluster to new one. If vm_must_add_frozenbit
+ * is true, visibility map forks are converted and rewritten, even in link
+ * mode.
+ */
+static void
+transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_frozenbit)
+{
+ char old_file[MAXPGPATH];
+ char new_file[MAXPGPATH];
+ int segno;
+ char extent_suffix[65];
+ struct stat statbuf;
+
+ /*
+ * Now copy/link any related segments as well. Remember, PG breaks large
+ * files into 1GB segments, the first segment has no extension, subsequent
+ * segments are named relfilenumber.1, relfilenumber.2, relfilenumber.3.
+ */
+ for (segno = 0;; segno++)
+ {
+ if (segno == 0)
+ extent_suffix[0] = '\0';
+ else
+ snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
+
+ snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
+ map->old_tablespace,
+ map->old_tablespace_suffix,
+ map->db_oid,
+ map->relfilenumber,
+ type_suffix,
+ extent_suffix);
+ snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
+ map->new_tablespace,
+ map->new_tablespace_suffix,
+ map->db_oid,
+ map->relfilenumber,
+ type_suffix,
+ extent_suffix);
+
+ /* Is it an extent, fsm, or vm file? */
+ if (type_suffix[0] != '\0' || segno != 0)
+ {
+ /* Did file open fail? */
+ if (stat(old_file, &statbuf) != 0)
+ {
+ /* File does not exist? That's OK, just return */
+ if (errno == ENOENT)
+ return;
+ else
+ pg_fatal("error while checking for file existence \"%s.%s\" (\"%s\" to \"%s\"): %s\n",
+ map->nspname, map->relname, old_file, new_file,
+ strerror(errno));
+ }
+
+ /* If file is empty, just return */
+ if (statbuf.st_size == 0)
+ return;
+ }
+
+ unlink(new_file);
+
+ /* Copying files might take some time, so give feedback. */
+ pg_log(PG_STATUS, "%s", old_file);
+
+ if (vm_must_add_frozenbit && strcmp(type_suffix, "_vm") == 0)
+ {
+ /* Need to rewrite visibility map format */
+ pg_log(PG_VERBOSE, "rewriting \"%s\" to \"%s\"\n",
+ old_file, new_file);
+ rewriteVisibilityMap(old_file, new_file, map->nspname, map->relname);
+ }
+ else
+ switch (user_opts.transfer_mode)
+ {
+ case TRANSFER_MODE_CLONE:
+ pg_log(PG_VERBOSE, "cloning \"%s\" to \"%s\"\n",
+ old_file, new_file);
+ cloneFile(old_file, new_file, map->nspname, map->relname);
+ break;
+ case TRANSFER_MODE_COPY:
+ pg_log(PG_VERBOSE, "copying \"%s\" to \"%s\"\n",
+ old_file, new_file);
+ copyFile(old_file, new_file, map->nspname, map->relname);
+ break;
+ case TRANSFER_MODE_LINK:
+ pg_log(PG_VERBOSE, "linking \"%s\" to \"%s\"\n",
+ old_file, new_file);
+ linkFile(old_file, new_file, map->nspname, map->relname);
+ }
+ }
+}
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 5dc6010..0fdde9d 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -37,7 +37,7 @@ static const char *progname;
static int WalSegSz;
static volatile sig_atomic_t time_to_stop = false;
-static const RelFileNode emptyRelFileNode = {0, 0, 0};
+static const RelFileLocator emptyRelFileLocator = {0, 0, 0};
typedef struct XLogDumpPrivate
{
@@ -63,7 +63,7 @@ typedef struct XLogDumpConfig
bool filter_by_rmgr_enabled;
TransactionId filter_by_xid;
bool filter_by_xid_enabled;
- RelFileNode filter_by_relation;
+ RelFileLocator filter_by_relation;
bool filter_by_extended;
bool filter_by_relation_enabled;
BlockNumber filter_by_relation_block;
@@ -393,7 +393,7 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
*/
static bool
XLogRecordMatchesRelationBlock(XLogReaderState *record,
- RelFileNode matchRnode,
+ RelFileLocator matchRlocator,
BlockNumber matchBlock,
ForkNumber matchFork)
{
@@ -401,17 +401,17 @@ XLogRecordMatchesRelationBlock(XLogReaderState *record,
for (block_id = 0; block_id <= XLogRecMaxBlockId(record); block_id++)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blk;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blk, NULL))
+ &rlocator, &forknum, &blk, NULL))
continue;
if ((matchFork == InvalidForkNumber || matchFork == forknum) &&
- (RelFileNodeEquals(matchRnode, emptyRelFileNode) ||
- RelFileNodeEquals(matchRnode, rnode)) &&
+ (RelFileLocatorEquals(matchRlocator, emptyRelFileLocator) ||
+ RelFileLocatorEquals(matchRlocator, rlocator)) &&
(matchBlock == InvalidBlockNumber || matchBlock == blk))
return true;
}
@@ -885,11 +885,11 @@ main(int argc, char **argv)
break;
case 'R':
if (sscanf(optarg, "%u/%u/%u",
- &config.filter_by_relation.spcNode,
- &config.filter_by_relation.dbNode,
- &config.filter_by_relation.relNode) != 3 ||
- !OidIsValid(config.filter_by_relation.spcNode) ||
- !OidIsValid(config.filter_by_relation.relNode))
+ &config.filter_by_relation.spcOid,
+ &config.filter_by_relation.dbOid,
+ &config.filter_by_relation.relNumber) != 3 ||
+ !OidIsValid(config.filter_by_relation.spcOid) ||
+ !OidIsValid(config.filter_by_relation.relNumber))
{
pg_log_error("invalid relation specification: \"%s\"", optarg);
pg_log_error_detail("Expecting \"tablespace OID/database OID/relation filenode\".");
@@ -1132,7 +1132,7 @@ main(int argc, char **argv)
!XLogRecordMatchesRelationBlock(xlogreader_state,
config.filter_by_relation_enabled ?
config.filter_by_relation :
- emptyRelFileNode,
+ emptyRelFileLocator,
config.filter_by_relation_block_enabled ?
config.filter_by_relation_block :
InvalidBlockNumber,
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 636c96e..1b6b620 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -107,24 +107,24 @@ forkname_chars(const char *str, ForkNumber *fork)
* XXX this must agree with GetRelationPath()!
*/
char *
-GetDatabasePath(Oid dbNode, Oid spcNode)
+GetDatabasePath(Oid dbOid, Oid spcOid)
{
- if (spcNode == GLOBALTABLESPACE_OID)
+ if (spcOid == GLOBALTABLESPACE_OID)
{
/* Shared system relations live in {datadir}/global */
- Assert(dbNode == 0);
+ Assert(dbOid == 0);
return pstrdup("global");
}
- else if (spcNode == DEFAULTTABLESPACE_OID)
+ else if (spcOid == DEFAULTTABLESPACE_OID)
{
/* The default tablespace is {datadir}/base */
- return psprintf("base/%u", dbNode);
+ return psprintf("base/%u", dbOid);
}
else
{
/* All other tablespaces are accessed via symlinks */
return psprintf("pg_tblspc/%u/%s/%u",
- spcNode, TABLESPACE_VERSION_DIRECTORY, dbNode);
+ spcOid, TABLESPACE_VERSION_DIRECTORY, dbOid);
}
}
@@ -138,44 +138,44 @@ GetDatabasePath(Oid dbNode, Oid spcNode)
* the trouble considering BackendId is just int anyway.
*/
char *
-GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
+GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
int backendId, ForkNumber forkNumber)
{
char *path;
- if (spcNode == GLOBALTABLESPACE_OID)
+ if (spcOid == GLOBALTABLESPACE_OID)
{
/* Shared system relations live in {datadir}/global */
- Assert(dbNode == 0);
+ Assert(dbOid == 0);
Assert(backendId == InvalidBackendId);
if (forkNumber != MAIN_FORKNUM)
path = psprintf("global/%u_%s",
- relNode, forkNames[forkNumber]);
+ relNumber, forkNames[forkNumber]);
else
- path = psprintf("global/%u", relNode);
+ path = psprintf("global/%u", relNumber);
}
- else if (spcNode == DEFAULTTABLESPACE_OID)
+ else if (spcOid == DEFAULTTABLESPACE_OID)
{
/* The default tablespace is {datadir}/base */
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/%u_%s",
- dbNode, relNode,
+ dbOid, relNumber,
forkNames[forkNumber]);
else
path = psprintf("base/%u/%u",
- dbNode, relNode);
+ dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/t%d_%u_%s",
- dbNode, backendId, relNode,
+ dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
path = psprintf("base/%u/t%d_%u",
- dbNode, backendId, relNode);
+ dbOid, backendId, relNumber);
}
}
else
@@ -185,25 +185,25 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
{
if (forkNumber != MAIN_FORKNUM)
path = psprintf("pg_tblspc/%u/%s/%u/%u_%s",
- spcNode, TABLESPACE_VERSION_DIRECTORY,
- dbNode, relNode,
+ spcOid, TABLESPACE_VERSION_DIRECTORY,
+ dbOid, relNumber,
forkNames[forkNumber]);
else
path = psprintf("pg_tblspc/%u/%s/%u/%u",
- spcNode, TABLESPACE_VERSION_DIRECTORY,
- dbNode, relNode);
+ spcOid, TABLESPACE_VERSION_DIRECTORY,
+ dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u_%s",
- spcNode, TABLESPACE_VERSION_DIRECTORY,
- dbNode, backendId, relNode,
+ spcOid, TABLESPACE_VERSION_DIRECTORY,
+ dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u",
- spcNode, TABLESPACE_VERSION_DIRECTORY,
- dbNode, backendId, relNode);
+ spcOid, TABLESPACE_VERSION_DIRECTORY,
+ dbOid, backendId, relNumber);
}
}
return path;
diff --git a/src/include/access/brin_xlog.h b/src/include/access/brin_xlog.h
index 95bfc7e..012a9af 100644
--- a/src/include/access/brin_xlog.h
+++ b/src/include/access/brin_xlog.h
@@ -18,7 +18,7 @@
#include "lib/stringinfo.h"
#include "storage/bufpage.h"
#include "storage/itemptr.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
diff --git a/src/include/access/ginxlog.h b/src/include/access/ginxlog.h
index 21de389..7f98503 100644
--- a/src/include/access/ginxlog.h
+++ b/src/include/access/ginxlog.h
@@ -110,7 +110,7 @@ typedef struct
typedef struct ginxlogSplit
{
- RelFileNode node;
+ RelFileLocator locator;
BlockNumber rrlink; /* right link, or root's blocknumber if root
* split */
BlockNumber leftChildBlkno; /* valid on a non-leaf split */
@@ -167,7 +167,7 @@ typedef struct ginxlogDeletePage
*/
typedef struct ginxlogUpdateMeta
{
- RelFileNode node;
+ RelFileLocator locator;
GinMetaPageData metadata;
BlockNumber prevTail;
BlockNumber newRightlink;
diff --git a/src/include/access/gistxlog.h b/src/include/access/gistxlog.h
index 4537e67..9bbe4c2 100644
--- a/src/include/access/gistxlog.h
+++ b/src/include/access/gistxlog.h
@@ -97,7 +97,7 @@ typedef struct gistxlogPageDelete
*/
typedef struct gistxlogPageReuse
{
- RelFileNode node;
+ RelFileLocator locator;
BlockNumber block;
FullTransactionId latestRemovedFullXid;
} gistxlogPageReuse;
diff --git a/src/include/access/heapam_xlog.h b/src/include/access/heapam_xlog.h
index 2d8a7f6..1705e73 100644
--- a/src/include/access/heapam_xlog.h
+++ b/src/include/access/heapam_xlog.h
@@ -19,7 +19,7 @@
#include "lib/stringinfo.h"
#include "storage/buf.h"
#include "storage/bufpage.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
@@ -370,9 +370,9 @@ typedef struct xl_heap_new_cid
CommandId combocid; /* just for debugging */
/*
- * Store the relfilenode/ctid pair to facilitate lookups.
+ * Store the relfilelocator/ctid pair to facilitate lookups.
*/
- RelFileNode target_node;
+ RelFileLocator target_locator;
ItemPointerData target_tid;
} xl_heap_new_cid;
@@ -415,7 +415,7 @@ extern bool heap_prepare_freeze_tuple(HeapTupleHeader tuple,
MultiXactId *relminmxid_out);
extern void heap_execute_freeze_tuple(HeapTupleHeader tuple,
xl_heap_freeze_tuple *xlrec_tp);
-extern XLogRecPtr log_heap_visible(RelFileNode rnode, Buffer heap_buffer,
+extern XLogRecPtr log_heap_visible(RelFileLocator rlocator, Buffer heap_buffer,
Buffer vm_buffer, TransactionId cutoff_xid, uint8 flags);
#endif /* HEAPAM_XLOG_H */
diff --git a/src/include/access/nbtxlog.h b/src/include/access/nbtxlog.h
index de362d3..d79489e 100644
--- a/src/include/access/nbtxlog.h
+++ b/src/include/access/nbtxlog.h
@@ -180,12 +180,12 @@ typedef struct xl_btree_dedup
* This is what we need to know about page reuse within btree. This record
* only exists to generate a conflict point for Hot Standby.
*
- * Note that we must include a RelFileNode in the record because we don't
+ * Note that we must include a RelFileLocator in the record because we don't
* actually register the buffer with the record.
*/
typedef struct xl_btree_reuse_page
{
- RelFileNode node;
+ RelFileLocator locator;
BlockNumber block;
FullTransactionId latestRemovedFullXid;
} xl_btree_reuse_page;
diff --git a/src/include/access/rewriteheap.h b/src/include/access/rewriteheap.h
index 3e27790..353cbb2 100644
--- a/src/include/access/rewriteheap.h
+++ b/src/include/access/rewriteheap.h
@@ -15,7 +15,7 @@
#include "access/htup.h"
#include "storage/itemptr.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
/* struct definition is private to rewriteheap.c */
@@ -34,8 +34,8 @@ extern bool rewrite_heap_dead_tuple(RewriteState state, HeapTuple oldTuple);
*/
typedef struct LogicalRewriteMappingData
{
- RelFileNode old_node;
- RelFileNode new_node;
+ RelFileLocator old_locator;
+ RelFileLocator new_locator;
ItemPointerData old_tid;
ItemPointerData new_tid;
} LogicalRewriteMappingData;
diff --git a/src/include/access/tableam.h b/src/include/access/tableam.h
index fe869c6..83a8e7e 100644
--- a/src/include/access/tableam.h
+++ b/src/include/access/tableam.h
@@ -560,32 +560,32 @@ typedef struct TableAmRoutine
*/
/*
- * This callback needs to create a new relation filenode for `rel`, with
+ * This callback needs to create a new relation filelocator for `rel`, with
* appropriate durability behaviour for `persistence`.
*
* Note that only the subset of the relcache filled by
* RelationBuildLocalRelation() can be relied upon and that the relation's
* catalog entries will either not yet exist (new relation), or will still
- * reference the old relfilenode.
+ * reference the old relfilelocator.
*
* As output *freezeXid, *minmulti must be set to the values appropriate
* for pg_class.{relfrozenxid, relminmxid}. For AMs that don't need those
* fields to be filled they can be set to InvalidTransactionId and
* InvalidMultiXactId, respectively.
*
- * See also table_relation_set_new_filenode().
+ * See also table_relation_set_new_filelocator().
*/
- void (*relation_set_new_filenode) (Relation rel,
- const RelFileNode *newrnode,
- char persistence,
- TransactionId *freezeXid,
- MultiXactId *minmulti);
+ void (*relation_set_new_filelocator) (Relation rel,
+ const RelFileLocator *newrlocator,
+ char persistence,
+ TransactionId *freezeXid,
+ MultiXactId *minmulti);
/*
* This callback needs to remove all contents from `rel`'s current
- * relfilenode. No provisions for transactional behaviour need to be made.
- * Often this can be implemented by truncating the underlying storage to
- * its minimal size.
+ * relfilelocator. No provisions for transactional behaviour need to be
+ * made. Often this can be implemented by truncating the underlying
+ * storage to its minimal size.
*
* See also table_relation_nontransactional_truncate().
*/
@@ -598,7 +598,7 @@ typedef struct TableAmRoutine
* storage, unless it contains references to the tablespace internally.
*/
void (*relation_copy_data) (Relation rel,
- const RelFileNode *newrnode);
+ const RelFileLocator *newrlocator);
/* See table_relation_copy_for_cluster() */
void (*relation_copy_for_cluster) (Relation NewTable,
@@ -1348,7 +1348,7 @@ table_index_delete_tuples(Relation rel, TM_IndexDeleteOp *delstate)
* RelationGetBufferForTuple. See that method for more information.
*
* TABLE_INSERT_FROZEN should only be specified for inserts into
- * relfilenodes created during the current subtransaction and when
+ * relfilenumbers created during the current subtransaction and when
* there are no prior snapshots or pre-existing portals open.
* This causes rows to be frozen, which is an MVCC violation and
* requires explicit options chosen by user.
@@ -1577,33 +1577,34 @@ table_finish_bulk_insert(Relation rel, int options)
*/
/*
- * Create storage for `rel` in `newrnode`, with persistence set to
+ * Create storage for `rel` in `newrlocator`, with persistence set to
* `persistence`.
*
* This is used both during relation creation and various DDL operations to
- * create a new relfilenode that can be filled from scratch. When creating
- * new storage for an existing relfilenode, this should be called before the
+ * create a new relfilelocator that can be filled from scratch. When creating
+ * new storage for an existing relfilelocator, this should be called before the
* relcache entry has been updated.
*
* *freezeXid, *minmulti are set to the xid / multixact horizon for the table
* that pg_class.{relfrozenxid, relminmxid} have to be set to.
*/
static inline void
-table_relation_set_new_filenode(Relation rel,
- const RelFileNode *newrnode,
- char persistence,
- TransactionId *freezeXid,
- MultiXactId *minmulti)
+table_relation_set_new_filelocator(Relation rel,
+ const RelFileLocator *newrlocator,
+ char persistence,
+ TransactionId *freezeXid,
+ MultiXactId *minmulti)
{
- rel->rd_tableam->relation_set_new_filenode(rel, newrnode, persistence,
- freezeXid, minmulti);
+ rel->rd_tableam->relation_set_new_filelocator(rel, newrlocator,
+ persistence, freezeXid,
+ minmulti);
}
/*
* Remove all table contents from `rel`, in a non-transactional manner.
* Non-transactional meaning that there's no need to support rollbacks. This
- * commonly only is used to perform truncations for relfilenodes created in the
- * current transaction.
+ * commonly only is used to perform truncations for relfilelocators created in
+ * the current transaction.
*/
static inline void
table_relation_nontransactional_truncate(Relation rel)
@@ -1612,15 +1613,15 @@ table_relation_nontransactional_truncate(Relation rel)
}
/*
- * Copy data from `rel` into the new relfilenode `newrnode`. The new
- * relfilenode may not have storage associated before this function is
+ * Copy data from `rel` into the new relfilelocator `newrlocator`. The new
+ * relfilelocator may not have storage associated before this function is
* called. This is only supposed to be used for low level operations like
* changing a relation's tablespace.
*/
static inline void
-table_relation_copy_data(Relation rel, const RelFileNode *newrnode)
+table_relation_copy_data(Relation rel, const RelFileLocator *newrlocator)
{
- rel->rd_tableam->relation_copy_data(rel, newrnode);
+ rel->rd_tableam->relation_copy_data(rel, newrlocator);
}
/*
diff --git a/src/include/access/xact.h b/src/include/access/xact.h
index 4794941..7d2b352 100644
--- a/src/include/access/xact.h
+++ b/src/include/access/xact.h
@@ -19,7 +19,7 @@
#include "datatype/timestamp.h"
#include "lib/stringinfo.h"
#include "nodes/pg_list.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/sinval.h"
/*
@@ -174,7 +174,7 @@ typedef struct SavedTransactionCharacteristics
*/
#define XACT_XINFO_HAS_DBINFO (1U << 0)
#define XACT_XINFO_HAS_SUBXACTS (1U << 1)
-#define XACT_XINFO_HAS_RELFILENODES (1U << 2)
+#define XACT_XINFO_HAS_RELFILELOCATORS (1U << 2)
#define XACT_XINFO_HAS_INVALS (1U << 3)
#define XACT_XINFO_HAS_TWOPHASE (1U << 4)
#define XACT_XINFO_HAS_ORIGIN (1U << 5)
@@ -252,12 +252,12 @@ typedef struct xl_xact_subxacts
} xl_xact_subxacts;
#define MinSizeOfXactSubxacts offsetof(xl_xact_subxacts, subxacts)
-typedef struct xl_xact_relfilenodes
+typedef struct xl_xact_relfilelocators
{
int nrels; /* number of relations */
- RelFileNode xnodes[FLEXIBLE_ARRAY_MEMBER];
-} xl_xact_relfilenodes;
-#define MinSizeOfXactRelfilenodes offsetof(xl_xact_relfilenodes, xnodes)
+ RelFileLocator xlocators[FLEXIBLE_ARRAY_MEMBER];
+} xl_xact_relfilelocators;
+#define MinSizeOfXactRelfileLocators offsetof(xl_xact_relfilelocators, xlocators)
/*
* A transactionally dropped statistics entry.
@@ -305,7 +305,7 @@ typedef struct xl_xact_commit
/* xl_xact_xinfo follows if XLOG_XACT_HAS_INFO */
/* xl_xact_dbinfo follows if XINFO_HAS_DBINFO */
/* xl_xact_subxacts follows if XINFO_HAS_SUBXACT */
- /* xl_xact_relfilenodes follows if XINFO_HAS_RELFILENODES */
+ /* xl_xact_relfilelocators follows if XINFO_HAS_RELFILELOCATORS */
/* xl_xact_stats_items follows if XINFO_HAS_DROPPED_STATS */
/* xl_xact_invals follows if XINFO_HAS_INVALS */
/* xl_xact_twophase follows if XINFO_HAS_TWOPHASE */
@@ -321,7 +321,7 @@ typedef struct xl_xact_abort
/* xl_xact_xinfo follows if XLOG_XACT_HAS_INFO */
/* xl_xact_dbinfo follows if XINFO_HAS_DBINFO */
/* xl_xact_subxacts follows if XINFO_HAS_SUBXACT */
- /* xl_xact_relfilenodes follows if XINFO_HAS_RELFILENODES */
+ /* xl_xact_relfilelocators follows if XINFO_HAS_RELFILELOCATORS */
/* xl_xact_stats_items follows if XINFO_HAS_DROPPED_STATS */
/* No invalidation messages needed. */
/* xl_xact_twophase follows if XINFO_HAS_TWOPHASE */
@@ -367,7 +367,7 @@ typedef struct xl_xact_parsed_commit
TransactionId *subxacts;
int nrels;
- RelFileNode *xnodes;
+ RelFileLocator *xlocators;
int nstats;
xl_xact_stats_item *stats;
@@ -378,7 +378,7 @@ typedef struct xl_xact_parsed_commit
TransactionId twophase_xid; /* only for 2PC */
char twophase_gid[GIDSIZE]; /* only for 2PC */
int nabortrels; /* only for 2PC */
- RelFileNode *abortnodes; /* only for 2PC */
+ RelFileLocator *abortlocators; /* only for 2PC */
int nabortstats; /* only for 2PC */
xl_xact_stats_item *abortstats; /* only for 2PC */
@@ -400,7 +400,7 @@ typedef struct xl_xact_parsed_abort
TransactionId *subxacts;
int nrels;
- RelFileNode *xnodes;
+ RelFileLocator *xlocators;
int nstats;
xl_xact_stats_item *stats;
@@ -483,7 +483,7 @@ extern int xactGetCommittedChildren(TransactionId **ptr);
extern XLogRecPtr XactLogCommitRecord(TimestampTz commit_time,
int nsubxacts, TransactionId *subxacts,
- int nrels, RelFileNode *rels,
+ int nrels, RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
int nmsgs, SharedInvalidationMessage *msgs,
@@ -494,7 +494,7 @@ extern XLogRecPtr XactLogCommitRecord(TimestampTz commit_time,
extern XLogRecPtr XactLogAbortRecord(TimestampTz abort_time,
int nsubxacts, TransactionId *subxacts,
- int nrels, RelFileNode *rels,
+ int nrels, RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
int xactflags, TransactionId twophase_xid,
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index fae0bef..3524c39 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,7 +25,7 @@
#include "lib/stringinfo.h"
#include "pgtime.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
diff --git a/src/include/access/xloginsert.h b/src/include/access/xloginsert.h
index 5fc340c..c04f77b 100644
--- a/src/include/access/xloginsert.h
+++ b/src/include/access/xloginsert.h
@@ -15,7 +15,7 @@
#include "access/xlogdefs.h"
#include "storage/block.h"
#include "storage/buf.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
/*
@@ -45,16 +45,16 @@ extern XLogRecPtr XLogInsert(RmgrId rmid, uint8 info);
extern void XLogEnsureRecordSpace(int max_block_id, int ndatas);
extern void XLogRegisterData(char *data, int len);
extern void XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags);
-extern void XLogRegisterBlock(uint8 block_id, RelFileNode *rnode,
+extern void XLogRegisterBlock(uint8 block_id, RelFileLocator *rlocator,
ForkNumber forknum, BlockNumber blknum, char *page,
uint8 flags);
extern void XLogRegisterBufData(uint8 block_id, char *data, int len);
extern void XLogResetInsertion(void);
extern bool XLogCheckBufferNeedsBackup(Buffer buffer);
-extern XLogRecPtr log_newpage(RelFileNode *rnode, ForkNumber forkNum,
+extern XLogRecPtr log_newpage(RelFileLocator *rlocator, ForkNumber forkNum,
BlockNumber blk, char *page, bool page_std);
-extern void log_newpages(RelFileNode *rnode, ForkNumber forkNum, int num_pages,
+extern void log_newpages(RelFileLocator *rlocator, ForkNumber forkNum, int num_pages,
BlockNumber *blknos, char **pages, bool page_std);
extern XLogRecPtr log_newpage_buffer(Buffer buffer, bool page_std);
extern void log_newpage_range(Relation rel, ForkNumber forkNum,
diff --git a/src/include/access/xlogreader.h b/src/include/access/xlogreader.h
index e73ea4a..5395f15 100644
--- a/src/include/access/xlogreader.h
+++ b/src/include/access/xlogreader.h
@@ -122,7 +122,7 @@ typedef struct
bool in_use;
/* Identify the block this refers to */
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blkno;
@@ -430,10 +430,10 @@ extern FullTransactionId XLogRecGetFullXid(XLogReaderState *record);
extern bool RestoreBlockImage(XLogReaderState *record, uint8 block_id, char *page);
extern char *XLogRecGetBlockData(XLogReaderState *record, uint8 block_id, Size *len);
extern void XLogRecGetBlockTag(XLogReaderState *record, uint8 block_id,
- RelFileNode *rnode, ForkNumber *forknum,
+ RelFileLocator *rlocator, ForkNumber *forknum,
BlockNumber *blknum);
extern bool XLogRecGetBlockTagExtended(XLogReaderState *record, uint8 block_id,
- RelFileNode *rnode, ForkNumber *forknum,
+ RelFileLocator *rlocator, ForkNumber *forknum,
BlockNumber *blknum,
Buffer *prefetch_buffer);
diff --git a/src/include/access/xlogrecord.h b/src/include/access/xlogrecord.h
index 052ac68..7e467ef 100644
--- a/src/include/access/xlogrecord.h
+++ b/src/include/access/xlogrecord.h
@@ -15,7 +15,7 @@
#include "access/xlogdefs.h"
#include "port/pg_crc32c.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* The overall layout of an XLOG record is:
@@ -97,7 +97,7 @@ typedef struct XLogRecordBlockHeader
* image) */
/* If BKPBLOCK_HAS_IMAGE, an XLogRecordBlockImageHeader struct follows */
- /* If BKPBLOCK_SAME_REL is not set, a RelFileNode follows */
+ /* If BKPBLOCK_SAME_REL is not set, a RelFileLocator follows */
/* BlockNumber follows */
} XLogRecordBlockHeader;
@@ -175,7 +175,7 @@ typedef struct XLogRecordBlockCompressHeader
(SizeOfXLogRecordBlockHeader + \
SizeOfXLogRecordBlockImageHeader + \
SizeOfXLogRecordBlockCompressHeader + \
- sizeof(RelFileNode) + \
+ sizeof(RelFileLocator) + \
sizeof(BlockNumber))
/*
@@ -187,7 +187,7 @@ typedef struct XLogRecordBlockCompressHeader
#define BKPBLOCK_HAS_IMAGE 0x10 /* block data is an XLogRecordBlockImage */
#define BKPBLOCK_HAS_DATA 0x20
#define BKPBLOCK_WILL_INIT 0x40 /* redo will re-init the page */
-#define BKPBLOCK_SAME_REL 0x80 /* RelFileNode omitted, same as previous */
+#define BKPBLOCK_SAME_REL 0x80 /* RelFileLocator omitted, same as previous */
/*
* XLogRecordDataHeaderShort/Long are used for the "main data" portion of
diff --git a/src/include/access/xlogutils.h b/src/include/access/xlogutils.h
index c9d0b75..ef18297 100644
--- a/src/include/access/xlogutils.h
+++ b/src/include/access/xlogutils.h
@@ -60,9 +60,9 @@ extern PGDLLIMPORT HotStandbyState standbyState;
extern bool XLogHaveInvalidPages(void);
extern void XLogCheckInvalidPages(void);
-extern void XLogDropRelation(RelFileNode rnode, ForkNumber forknum);
+extern void XLogDropRelation(RelFileLocator rlocator, ForkNumber forknum);
extern void XLogDropDatabase(Oid dbid);
-extern void XLogTruncateRelation(RelFileNode rnode, ForkNumber forkNum,
+extern void XLogTruncateRelation(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber nblocks);
/* Result codes for XLogReadBufferForRedo[Extended] */
@@ -89,11 +89,11 @@ extern XLogRedoAction XLogReadBufferForRedoExtended(XLogReaderState *record,
ReadBufferMode mode, bool get_cleanup_lock,
Buffer *buf);
-extern Buffer XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
+extern Buffer XLogReadBufferExtended(RelFileLocator rlocator, ForkNumber forknum,
BlockNumber blkno, ReadBufferMode mode,
Buffer recent_buffer);
-extern Relation CreateFakeRelcacheEntry(RelFileNode rnode);
+extern Relation CreateFakeRelcacheEntry(RelFileLocator rlocator);
extern void FreeFakeRelcacheEntry(Relation fakerel);
extern int read_local_xlog_page(XLogReaderState *state,
diff --git a/src/include/catalog/binary_upgrade.h b/src/include/catalog/binary_upgrade.h
index 0b6944b..fd93442 100644
--- a/src/include/catalog/binary_upgrade.h
+++ b/src/include/catalog/binary_upgrade.h
@@ -22,11 +22,11 @@ extern PGDLLIMPORT Oid binary_upgrade_next_mrng_pg_type_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_mrng_array_pg_type_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_heap_pg_class_oid;
-extern PGDLLIMPORT Oid binary_upgrade_next_heap_pg_class_relfilenode;
+extern PGDLLIMPORT RelFileNumber binary_upgrade_next_heap_pg_class_relfilenumber;
extern PGDLLIMPORT Oid binary_upgrade_next_index_pg_class_oid;
-extern PGDLLIMPORT Oid binary_upgrade_next_index_pg_class_relfilenode;
+extern PGDLLIMPORT RelFileNumber binary_upgrade_next_index_pg_class_relfilenumber;
extern PGDLLIMPORT Oid binary_upgrade_next_toast_pg_class_oid;
-extern PGDLLIMPORT Oid binary_upgrade_next_toast_pg_class_relfilenode;
+extern PGDLLIMPORT RelFileNumber binary_upgrade_next_toast_pg_class_relfilenumber;
extern PGDLLIMPORT Oid binary_upgrade_next_pg_enum_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_pg_authid_oid;
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index 60c1215..66900f1 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -38,7 +38,8 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
-extern Oid GetNewRelFileNode(Oid reltablespace, Relation pg_class,
- char relpersistence);
+extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
+ Relation pg_class,
+ char relpersistence);
#endif /* CATALOG_H */
diff --git a/src/include/catalog/heap.h b/src/include/catalog/heap.h
index 07c5b88..5774c46 100644
--- a/src/include/catalog/heap.h
+++ b/src/include/catalog/heap.h
@@ -50,7 +50,7 @@ extern Relation heap_create(const char *relname,
Oid relnamespace,
Oid reltablespace,
Oid relid,
- Oid relfilenode,
+ RelFileNumber relfilenumber,
Oid accessmtd,
TupleDesc tupDesc,
char relkind,
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index a1d6e3b..1bdb00a 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -71,7 +71,7 @@ extern Oid index_create(Relation heapRelation,
Oid indexRelationId,
Oid parentIndexRelid,
Oid parentConstraintId,
- Oid relFileNode,
+ RelFileNumber relFileNumber,
IndexInfo *indexInfo,
List *indexColNames,
Oid accessMethodObjectId,
diff --git a/src/include/catalog/storage.h b/src/include/catalog/storage.h
index 59f3404..9964c31 100644
--- a/src/include/catalog/storage.h
+++ b/src/include/catalog/storage.h
@@ -15,23 +15,23 @@
#define STORAGE_H
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "utils/relcache.h"
/* GUC variables */
extern PGDLLIMPORT int wal_skip_threshold;
-extern SMgrRelation RelationCreateStorage(RelFileNode rnode,
+extern SMgrRelation RelationCreateStorage(RelFileLocator rlocator,
char relpersistence,
bool register_delete);
extern void RelationDropStorage(Relation rel);
-extern void RelationPreserveStorage(RelFileNode rnode, bool atCommit);
+extern void RelationPreserveStorage(RelFileLocator rlocator, bool atCommit);
extern void RelationPreTruncate(Relation rel);
extern void RelationTruncate(Relation rel, BlockNumber nblocks);
extern void RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
ForkNumber forkNum, char relpersistence);
-extern bool RelFileNodeSkippingWAL(RelFileNode rnode);
+extern bool RelFileLocatorSkippingWAL(RelFileLocator rlocator);
extern Size EstimatePendingSyncsSpace(void);
extern void SerializePendingSyncs(Size maxSize, char *startAddress);
extern void RestorePendingSyncs(char *startAddress);
@@ -42,7 +42,7 @@ extern void RestorePendingSyncs(char *startAddress);
*/
extern void smgrDoPendingDeletes(bool isCommit);
extern void smgrDoPendingSyncs(bool isCommit, bool isParallelWorker);
-extern int smgrGetPendingDeletes(bool forCommit, RelFileNode **ptr);
+extern int smgrGetPendingDeletes(bool forCommit, RelFileLocator **ptr);
extern void AtSubCommit_smgr(void);
extern void AtSubAbort_smgr(void);
extern void PostPrepare_smgr(void);
diff --git a/src/include/catalog/storage_xlog.h b/src/include/catalog/storage_xlog.h
index 622de22..44a5e20 100644
--- a/src/include/catalog/storage_xlog.h
+++ b/src/include/catalog/storage_xlog.h
@@ -17,7 +17,7 @@
#include "access/xlogreader.h"
#include "lib/stringinfo.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* Declarations for smgr-related XLOG records
@@ -32,7 +32,7 @@
typedef struct xl_smgr_create
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forkNum;
} xl_smgr_create;
@@ -46,11 +46,11 @@ typedef struct xl_smgr_create
typedef struct xl_smgr_truncate
{
BlockNumber blkno;
- RelFileNode rnode;
+ RelFileLocator rlocator;
int flags;
} xl_smgr_truncate;
-extern void log_smgrcreate(const RelFileNode *rnode, ForkNumber forkNum);
+extern void log_smgrcreate(const RelFileLocator *rlocator, ForkNumber forkNum);
extern void smgr_redo(XLogReaderState *record);
extern void smgr_desc(StringInfo buf, XLogReaderState *record);
diff --git a/src/include/commands/sequence.h b/src/include/commands/sequence.h
index 9da2300..d38c0e2 100644
--- a/src/include/commands/sequence.h
+++ b/src/include/commands/sequence.h
@@ -19,7 +19,7 @@
#include "lib/stringinfo.h"
#include "nodes/parsenodes.h"
#include "parser/parse_node.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
typedef struct FormData_pg_sequence_data
@@ -47,7 +47,7 @@ typedef FormData_pg_sequence_data *Form_pg_sequence_data;
typedef struct xl_seq_rec
{
- RelFileNode node;
+ RelFileLocator locator;
/* SEQUENCE TUPLE DATA FOLLOWS AT THE END */
} xl_seq_rec;
diff --git a/src/include/commands/tablecmds.h b/src/include/commands/tablecmds.h
index 5d4037f..0c48654 100644
--- a/src/include/commands/tablecmds.h
+++ b/src/include/commands/tablecmds.h
@@ -66,7 +66,7 @@ extern void SetRelationHasSubclass(Oid relationId, bool relhassubclass);
extern bool CheckRelationTableSpaceMove(Relation rel, Oid newTableSpaceId);
extern void SetRelationTableSpace(Relation rel, Oid newTableSpaceId,
- Oid newRelFileNode);
+ RelFileNumber newRelFileNumber);
extern ObjectAddress renameatt(RenameStmt *stmt);
diff --git a/src/include/commands/tablespace.h b/src/include/commands/tablespace.h
index 24b6473..1f80907 100644
--- a/src/include/commands/tablespace.h
+++ b/src/include/commands/tablespace.h
@@ -50,7 +50,7 @@ extern void DropTableSpace(DropTableSpaceStmt *stmt);
extern ObjectAddress RenameTableSpace(const char *oldname, const char *newname);
extern Oid AlterTableSpaceOptions(AlterTableSpaceOptionsStmt *stmt);
-extern void TablespaceCreateDbspace(Oid spcNode, Oid dbNode, bool isRedo);
+extern void TablespaceCreateDbspace(Oid spcOid, Oid dbOid, bool isRedo);
extern Oid GetDefaultTablespace(char relpersistence, bool partitioned);
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index 13849a3..3ab7132 100644
--- a/src/include/common/relpath.h
+++ b/src/include/common/relpath.h
@@ -64,27 +64,27 @@ extern int forkname_chars(const char *str, ForkNumber *fork);
/*
* Stuff for computing filesystem pathnames for relations.
*/
-extern char *GetDatabasePath(Oid dbNode, Oid spcNode);
+extern char *GetDatabasePath(Oid dbOid, Oid spcOid);
-extern char *GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
+extern char *GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
int backendId, ForkNumber forkNumber);
/*
* Wrapper macros for GetRelationPath. Beware of multiple
- * evaluation of the RelFileNode or RelFileNodeBackend argument!
+ * evaluation of the RelFileLocator or RelFileLocatorBackend argument!
*/
-/* First argument is a RelFileNode */
-#define relpathbackend(rnode, backend, forknum) \
- GetRelationPath((rnode).dbNode, (rnode).spcNode, (rnode).relNode, \
+/* First argument is a RelFileLocator */
+#define relpathbackend(rlocator, backend, forknum) \
+ GetRelationPath((rlocator).dbOid, (rlocator).spcOid, (rlocator).relNumber, \
backend, forknum)
-/* First argument is a RelFileNode */
-#define relpathperm(rnode, forknum) \
- relpathbackend(rnode, InvalidBackendId, forknum)
+/* First argument is a RelFileLocator */
+#define relpathperm(rlocator, forknum) \
+ relpathbackend(rlocator, InvalidBackendId, forknum)
-/* First argument is a RelFileNodeBackend */
-#define relpath(rnode, forknum) \
- relpathbackend((rnode).node, (rnode).backend, forknum)
+/* First argument is a RelFileLocatorBackend */
+#define relpath(rlocator, forknum) \
+ relpathbackend((rlocator).locator, (rlocator).backend, forknum)
#endif /* RELPATH_H */
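As an aside for reviewers of the relpath.h hunk above: the macros only shuffle field names, the path layout they compute is unchanged. A minimal, self-contained sketch of that mapping follows. This is not the real GetRelationPath; sketch_relation_path is a hypothetical helper, forks and segment files are ignored, and the tablespace branch elides the version directory with a `<version-dir>` placeholder.

```c
/* Simplified sketch of how a RelFileLocator maps to an on-disk path,
 * assuming the standard layout: global/ for shared relations,
 * base/<dbOid>/ for the default tablespace, pg_tblspc/<spcOid>/ otherwise. */
#include <stdio.h>
#include <string.h>

typedef unsigned int Oid;
typedef Oid RelFileNumber;

typedef struct RelFileLocator
{
    Oid           spcOid;     /* tablespace */
    Oid           dbOid;      /* database, 0 for shared relations */
    RelFileNumber relNumber;  /* file name component */
} RelFileLocator;

#define DEFAULTTABLESPACE_OID 1663
#define GLOBALTABLESPACE_OID  1664

/* Build the main-fork path into buf; illustrative only. */
static void
sketch_relation_path(const RelFileLocator *rl, char *buf, size_t len)
{
    if (rl->spcOid == GLOBALTABLESPACE_OID)
        snprintf(buf, len, "global/%u", rl->relNumber);
    else if (rl->spcOid == DEFAULTTABLESPACE_OID)
        snprintf(buf, len, "base/%u/%u", rl->dbOid, rl->relNumber);
    else
        snprintf(buf, len, "pg_tblspc/%u/<version-dir>/%u/%u",
                 rl->spcOid, rl->dbOid, rl->relNumber);
}
```

Under this layout a shared relation always lives in global/ and carries no database component, which is the invariant the new relfilelocator.h comment states: spcOid is GLOBALTABLESPACE_OID if and only if dbOid is zero.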
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 73f635b..562f21c 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3247,10 +3247,10 @@ typedef struct IndexStmt
List *excludeOpNames; /* exclusion operator names, or NIL if none */
char *idxcomment; /* comment to apply to index, or NULL */
Oid indexOid; /* OID of an existing index, if any */
- Oid oldNode; /* relfilenode of existing storage, if any */
- SubTransactionId oldCreateSubid; /* rd_createSubid of oldNode */
- SubTransactionId oldFirstRelfilenodeSubid; /* rd_firstRelfilenodeSubid of
- * oldNode */
+ RelFileNumber oldNumber; /* relfilenumber of existing storage, if any */
+ SubTransactionId oldCreateSubid; /* rd_createSubid of oldNumber */
+ SubTransactionId oldFirstRelfilenumberSubid; /* rd_firstRelfilelocatorSubid
+ * of oldNumber */
bool unique; /* is index unique? */
bool nulls_not_distinct; /* null treatment for UNIQUE constraints */
bool primary; /* is index a primary key? */
diff --git a/src/include/postgres_ext.h b/src/include/postgres_ext.h
index fdb61b7..d8af68b 100644
--- a/src/include/postgres_ext.h
+++ b/src/include/postgres_ext.h
@@ -46,6 +46,13 @@ typedef unsigned int Oid;
/* Define a signed 64-bit integer type for use in client API declarations. */
typedef PG_INT64_TYPE pg_int64;
+/*
+ * RelFileNumber data type identifies the specific relation file name.
+ */
+typedef Oid RelFileNumber;
+#define InvalidRelFileNumber ((RelFileNumber) InvalidOid)
+#define RelFileNumberIsValid(relnumber) \
+ ((bool) ((relnumber) != InvalidRelFileNumber))
/*
* Identifiers of error message fields. Kept here to keep common
diff --git a/src/include/postmaster/bgwriter.h b/src/include/postmaster/bgwriter.h
index 2511ef4..b67fb1e 100644
--- a/src/include/postmaster/bgwriter.h
+++ b/src/include/postmaster/bgwriter.h
@@ -16,7 +16,7 @@
#define _BGWRITER_H
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "storage/sync.h"
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 4a01f87..d109d0b 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -99,7 +99,7 @@ typedef struct ReorderBufferChange
struct
{
/* relation that has been changed */
- RelFileNode relnode;
+ RelFileLocator rlocator;
/* no previously reassembled toast chunks are necessary anymore */
bool clear_toast_afterwards;
@@ -145,7 +145,7 @@ typedef struct ReorderBufferChange
*/
struct
{
- RelFileNode node;
+ RelFileLocator locator;
ItemPointerData tid;
CommandId cmin;
CommandId cmax;
@@ -657,7 +657,7 @@ extern void ReorderBufferAddSnapshot(ReorderBuffer *, TransactionId, XLogRecPtr
extern void ReorderBufferAddNewCommandId(ReorderBuffer *, TransactionId, XLogRecPtr lsn,
CommandId cid);
extern void ReorderBufferAddNewTupleCids(ReorderBuffer *, TransactionId, XLogRecPtr lsn,
- RelFileNode node, ItemPointerData pt,
+ RelFileLocator locator, ItemPointerData pt,
CommandId cmin, CommandId cmax, CommandId combocid);
extern void ReorderBufferAddInvalidations(ReorderBuffer *, TransactionId, XLogRecPtr lsn,
Size nmsgs, SharedInvalidationMessage *msgs);
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index a17e7b2..d54e1f6 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,30 +90,30 @@
*/
typedef struct buftag
{
- RelFileNode rnode; /* physical relation identifier */
+ RelFileLocator rlocator; /* physical relation identifier */
ForkNumber forkNum;
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
#define CLEAR_BUFFERTAG(a) \
( \
- (a).rnode.spcNode = InvalidOid, \
- (a).rnode.dbNode = InvalidOid, \
- (a).rnode.relNode = InvalidOid, \
+ (a).rlocator.spcOid = InvalidOid, \
+ (a).rlocator.dbOid = InvalidOid, \
+ (a).rlocator.relNumber = InvalidRelFileNumber, \
(a).forkNum = InvalidForkNumber, \
(a).blockNum = InvalidBlockNumber \
)
-#define INIT_BUFFERTAG(a,xx_rnode,xx_forkNum,xx_blockNum) \
+#define INIT_BUFFERTAG(a,xx_rlocator,xx_forkNum,xx_blockNum) \
( \
- (a).rnode = (xx_rnode), \
+ (a).rlocator = (xx_rlocator), \
(a).forkNum = (xx_forkNum), \
(a).blockNum = (xx_blockNum) \
)
#define BUFFERTAGS_EQUAL(a,b) \
( \
- RelFileNodeEquals((a).rnode, (b).rnode) && \
+ RelFileLocatorEquals((a).rlocator, (b).rlocator) && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
)
@@ -291,11 +291,11 @@ extern PGDLLIMPORT BufferDesc *LocalBufferDescriptors;
*/
typedef struct CkptSortItem
{
- Oid tsId;
- Oid relNode;
- ForkNumber forkNum;
- BlockNumber blockNum;
- int buf_id;
+ Oid tsId;
+ RelFileNumber relNumber;
+ ForkNumber forkNum;
+ BlockNumber blockNum;
+ int buf_id;
} CkptSortItem;
extern PGDLLIMPORT CkptSortItem *CkptBufferIds;
@@ -337,9 +337,9 @@ extern PrefetchBufferResult PrefetchLocalBuffer(SMgrRelation smgr,
extern BufferDesc *LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum,
BlockNumber blockNum, bool *foundPtr);
extern void MarkLocalBufferDirty(Buffer buffer);
-extern void DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
+extern void DropRelFileLocatorLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber firstDelBlock);
-extern void DropRelFileNodeAllLocalBuffers(RelFileNode rnode);
+extern void DropRelFileLocatorAllLocalBuffers(RelFileLocator rlocator);
extern void AtEOXact_LocalBuffers(bool isCommit);
#endif /* BUFMGR_INTERNALS_H */
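To see the buf_internals.h rename in isolation, here is a stand-alone reduction of the BufferTag macros from the hunk above. The typedefs stub out the real PostgreSQL headers, and RelFileLocatorEquals is written inline as an assumption about its definition (mirroring the old RelFileNodeEquals); everything else follows the patched macros verbatim.

```c
/* Stand-alone reduction of the renamed BufferTag macros; the typedefs
 * here stub out the real PostgreSQL headers for illustration. */
typedef unsigned int Oid;
typedef Oid RelFileNumber;
typedef int ForkNumber;
typedef unsigned int BlockNumber;

#define InvalidOid         ((Oid) 0)
#define InvalidForkNumber  ((ForkNumber) -1)
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

typedef struct RelFileLocator
{
    Oid           spcOid;     /* tablespace */
    Oid           dbOid;      /* database */
    RelFileNumber relNumber;  /* relation */
} RelFileLocator;

typedef struct buftag
{
    RelFileLocator rlocator;  /* physical relation identifier */
    ForkNumber     forkNum;
    BlockNumber    blockNum;  /* blknum relative to begin of reln */
} BufferTag;

/* assumed shape of RelFileLocatorEquals, mirroring RelFileNodeEquals */
#define RelFileLocatorEquals(a,b) \
    ((a).spcOid == (b).spcOid && \
     (a).dbOid == (b).dbOid && \
     (a).relNumber == (b).relNumber)

#define CLEAR_BUFFERTAG(a) \
( \
    (a).rlocator.spcOid = InvalidOid, \
    (a).rlocator.dbOid = InvalidOid, \
    (a).rlocator.relNumber = InvalidOid, \
    (a).forkNum = InvalidForkNumber, \
    (a).blockNum = InvalidBlockNumber \
)

#define INIT_BUFFERTAG(a,xx_rlocator,xx_forkNum,xx_blockNum) \
( \
    (a).rlocator = (xx_rlocator), \
    (a).forkNum = (xx_forkNum), \
    (a).blockNum = (xx_blockNum) \
)

#define BUFFERTAGS_EQUAL(a,b) \
( \
    RelFileLocatorEquals((a).rlocator, (b).rlocator) && \
    (a).blockNum == (b).blockNum && \
    (a).forkNum == (b).forkNum \
)
```

Note the multiple-evaluation hazard the relpath.h comment warns about applies here too: the macro arguments are expanded more than once, so callers must not pass expressions with side effects.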
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 5839140..96e473e 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -17,7 +17,7 @@
#include "storage/block.h"
#include "storage/buf.h"
#include "storage/bufpage.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
#include "utils/snapmgr.h"
@@ -176,13 +176,13 @@ extern PrefetchBufferResult PrefetchSharedBuffer(struct SMgrRelationData *smgr_r
BlockNumber blockNum);
extern PrefetchBufferResult PrefetchBuffer(Relation reln, ForkNumber forkNum,
BlockNumber blockNum);
-extern bool ReadRecentBuffer(RelFileNode rnode, ForkNumber forkNum,
+extern bool ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber blockNum, Buffer recent_buffer);
extern Buffer ReadBuffer(Relation reln, BlockNumber blockNum);
extern Buffer ReadBufferExtended(Relation reln, ForkNumber forkNum,
BlockNumber blockNum, ReadBufferMode mode,
BufferAccessStrategy strategy);
-extern Buffer ReadBufferWithoutRelcache(RelFileNode rnode,
+extern Buffer ReadBufferWithoutRelcache(RelFileLocator rlocator,
ForkNumber forkNum, BlockNumber blockNum,
ReadBufferMode mode, BufferAccessStrategy strategy,
bool permanent);
@@ -204,13 +204,13 @@ extern BlockNumber RelationGetNumberOfBlocksInFork(Relation relation,
extern void FlushOneBuffer(Buffer buffer);
extern void FlushRelationBuffers(Relation rel);
extern void FlushRelationsAllBuffers(struct SMgrRelationData **smgrs, int nrels);
-extern void CreateAndCopyRelationData(RelFileNode src_rnode,
- RelFileNode dst_rnode,
+extern void CreateAndCopyRelationData(RelFileLocator src_rlocator,
+ RelFileLocator dst_rlocator,
bool permanent);
extern void FlushDatabaseBuffers(Oid dbid);
-extern void DropRelFileNodeBuffers(struct SMgrRelationData *smgr_reln, ForkNumber *forkNum,
+extern void DropRelFileLocatorBuffers(struct SMgrRelationData *smgr_reln, ForkNumber *forkNum,
int nforks, BlockNumber *firstDelBlock);
-extern void DropRelFileNodesAllBuffers(struct SMgrRelationData **smgr_reln, int nnodes);
+extern void DropRelFileLocatorsAllBuffers(struct SMgrRelationData **smgr_reln, int nlocators);
extern void DropDatabaseBuffers(Oid dbid);
#define RelationGetNumberOfBlocks(reln) \
@@ -223,7 +223,7 @@ extern XLogRecPtr BufferGetLSNAtomic(Buffer buffer);
extern void PrintPinnedBufs(void);
#endif
extern Size BufferShmemSize(void);
-extern void BufferGetTag(Buffer buffer, RelFileNode *rnode,
+extern void BufferGetTag(Buffer buffer, RelFileLocator *rlocator,
ForkNumber *forknum, BlockNumber *blknum);
extern void MarkBufferDirtyHint(Buffer buffer, bool buffer_std);
diff --git a/src/include/storage/freespace.h b/src/include/storage/freespace.h
index dcc40eb..fcb0802 100644
--- a/src/include/storage/freespace.h
+++ b/src/include/storage/freespace.h
@@ -15,7 +15,7 @@
#define FREESPACE_H_
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
/* prototypes for public functions in freespace.c */
@@ -27,7 +27,7 @@ extern BlockNumber RecordAndGetPageWithFreeSpace(Relation rel,
Size spaceNeeded);
extern void RecordPageWithFreeSpace(Relation rel, BlockNumber heapBlk,
Size spaceAvail);
-extern void XLogRecordPageWithFreeSpace(RelFileNode rnode, BlockNumber heapBlk,
+extern void XLogRecordPageWithFreeSpace(RelFileLocator rlocator, BlockNumber heapBlk,
Size spaceAvail);
extern BlockNumber FreeSpaceMapPrepareTruncateRel(Relation rel,
diff --git a/src/include/storage/md.h b/src/include/storage/md.h
index ffffa40..10aa1b0 100644
--- a/src/include/storage/md.h
+++ b/src/include/storage/md.h
@@ -15,7 +15,7 @@
#define MD_H
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "storage/sync.h"
@@ -25,7 +25,7 @@ extern void mdopen(SMgrRelation reln);
extern void mdclose(SMgrRelation reln, ForkNumber forknum);
extern void mdcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo);
extern bool mdexists(SMgrRelation reln, ForkNumber forknum);
-extern void mdunlink(RelFileNodeBackend rnode, ForkNumber forknum, bool isRedo);
+extern void mdunlink(RelFileLocatorBackend rlocator, ForkNumber forknum, bool isRedo);
extern void mdextend(SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum, char *buffer, bool skipFsync);
extern bool mdprefetch(SMgrRelation reln, ForkNumber forknum,
@@ -42,7 +42,7 @@ extern void mdtruncate(SMgrRelation reln, ForkNumber forknum,
extern void mdimmedsync(SMgrRelation reln, ForkNumber forknum);
extern void ForgetDatabaseSyncRequests(Oid dbid);
-extern void DropRelationFiles(RelFileNode *delrels, int ndelrels, bool isRedo);
+extern void DropRelationFiles(RelFileLocator *delrels, int ndelrels, bool isRedo);
/* md sync callbacks */
extern int mdsyncfiletag(const FileTag *ftag, char *path);
diff --git a/src/include/storage/relfilelocator.h b/src/include/storage/relfilelocator.h
new file mode 100644
index 0000000..7211fe7
--- /dev/null
+++ b/src/include/storage/relfilelocator.h
@@ -0,0 +1,99 @@
+/*-------------------------------------------------------------------------
+ *
+ * relfilelocator.h
+ * Physical access information for relations.
+ *
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/storage/relfilelocator.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef RELFILELOCATOR_H
+#define RELFILELOCATOR_H
+
+#include "common/relpath.h"
+#include "storage/backendid.h"
+
+/*
+ * RelFileLocator must provide all that we need to know to physically access
+ * a relation, with the exception of the backend ID, which can be provided
+ * separately. Note, however, that a "physical" relation is comprised of
+ * multiple files on the filesystem, as each fork is stored as a separate
+ * file, and each fork can be divided into multiple segments. See md.c.
+ *
+ * spcOid identifies the tablespace of the relation. It corresponds to
+ * pg_tablespace.oid.
+ *
+ * dbOid identifies the database of the relation. It is zero for
+ * "shared" relations (those common to all databases of a cluster).
+ * Nonzero dbOid values correspond to pg_database.oid.
+ *
+ * relNumber identifies the specific relation. relNumber corresponds to
+ * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
+ * to assign new physical files to relations in some situations).
+ * Notice that relNumber is only unique within a database in a particular
+ * tablespace.
+ *
+ * Note: spcOid must be GLOBALTABLESPACE_OID if and only if dbOid is
+ * zero. We support shared relations only in the "global" tablespace.
+ *
+ * Note: in pg_class we allow reltablespace == 0 to denote that the
+ * relation is stored in its database's "default" tablespace (as
+ * identified by pg_database.dattablespace). However this shorthand
+ * is NOT allowed in RelFileLocator structs --- the real tablespace ID
+ * must be supplied when setting spcOid.
+ *
+ * Note: in pg_class, relfilenode can be zero to denote that the relation
+ * is a "mapped" relation, whose current true filenode number is available
+ * from relmapper.c. Again, this case is NOT allowed in RelFileLocators.
+ *
+ * Note: various places use RelFileLocator in hashtable keys. Therefore,
+ * there *must not* be any unused padding bytes in this struct. That
+ * should be safe as long as all the fields are of type Oid.
+ */
+typedef struct RelFileLocator
+{
+ Oid spcOid; /* tablespace */
+ Oid dbOid; /* database */
+ RelFileNumber relNumber; /* relation */
+} RelFileLocator;
+
+/*
+ * Augmenting a relfilelocator with the backend ID provides all the information
+ * we need to locate the physical storage. The backend ID is InvalidBackendId
+ * for regular relations (those accessible to more than one backend), or the
+ * owning backend's ID for backend-local relations. Backend-local relations
+ * are always transient and removed in case of a database crash; they are
+ * never WAL-logged or fsync'd.
+ */
+typedef struct RelFileLocatorBackend
+{
+ RelFileLocator locator;
+ BackendId backend;
+} RelFileLocatorBackend;
+
+#define RelFileLocatorBackendIsTemp(rlocator) \
+ ((rlocator).backend != InvalidBackendId)
+
+/*
+ * Note: RelFileLocatorEquals and RelFileLocatorBackendEquals compare relNumber first
+ * since that is most likely to be different in two unequal RelFileLocators. It
+ * is probably redundant to compare spcOid if the other fields are found equal,
+ * but do it anyway to be sure. Likewise for checking the backend ID in
+ * RelFileLocatorBackendEquals.
+ */
+#define RelFileLocatorEquals(locator1, locator2) \
+ ((locator1).relNumber == (locator2).relNumber && \
+ (locator1).dbOid == (locator2).dbOid && \
+ (locator1).spcOid == (locator2).spcOid)
+
+#define RelFileLocatorBackendEquals(locator1, locator2) \
+ ((locator1).locator.relNumber == (locator2).locator.relNumber && \
+ (locator1).locator.dbOid == (locator2).locator.dbOid && \
+ (locator1).backend == (locator2).backend && \
+ (locator1).locator.spcOid == (locator2).locator.spcOid)
+
+#endif /* RELFILELOCATOR_H */
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
deleted file mode 100644
index 4fdc606..0000000
--- a/src/include/storage/relfilenode.h
+++ /dev/null
@@ -1,99 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * relfilenode.h
- * Physical access information for relations.
- *
- *
- * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * src/include/storage/relfilenode.h
- *
- *-------------------------------------------------------------------------
- */
-#ifndef RELFILENODE_H
-#define RELFILENODE_H
-
-#include "common/relpath.h"
-#include "storage/backendid.h"
-
-/*
- * RelFileNode must provide all that we need to know to physically access
- * a relation, with the exception of the backend ID, which can be provided
- * separately. Note, however, that a "physical" relation is comprised of
- * multiple files on the filesystem, as each fork is stored as a separate
- * file, and each fork can be divided into multiple segments. See md.c.
- *
- * spcNode identifies the tablespace of the relation. It corresponds to
- * pg_tablespace.oid.
- *
- * dbNode identifies the database of the relation. It is zero for
- * "shared" relations (those common to all databases of a cluster).
- * Nonzero dbNode values correspond to pg_database.oid.
- *
- * relNode identifies the specific relation. relNode corresponds to
- * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
- * to assign new physical files to relations in some situations).
- * Notice that relNode is only unique within a database in a particular
- * tablespace.
- *
- * Note: spcNode must be GLOBALTABLESPACE_OID if and only if dbNode is
- * zero. We support shared relations only in the "global" tablespace.
- *
- * Note: in pg_class we allow reltablespace == 0 to denote that the
- * relation is stored in its database's "default" tablespace (as
- * identified by pg_database.dattablespace). However this shorthand
- * is NOT allowed in RelFileNode structs --- the real tablespace ID
- * must be supplied when setting spcNode.
- *
- * Note: in pg_class, relfilenode can be zero to denote that the relation
- * is a "mapped" relation, whose current true filenode number is available
- * from relmapper.c. Again, this case is NOT allowed in RelFileNodes.
- *
- * Note: various places use RelFileNode in hashtable keys. Therefore,
- * there *must not* be any unused padding bytes in this struct. That
- * should be safe as long as all the fields are of type Oid.
- */
-typedef struct RelFileNode
-{
- Oid spcNode; /* tablespace */
- Oid dbNode; /* database */
- Oid relNode; /* relation */
-} RelFileNode;
-
-/*
- * Augmenting a relfilenode with the backend ID provides all the information
- * we need to locate the physical storage. The backend ID is InvalidBackendId
- * for regular relations (those accessible to more than one backend), or the
- * owning backend's ID for backend-local relations. Backend-local relations
- * are always transient and removed in case of a database crash; they are
- * never WAL-logged or fsync'd.
- */
-typedef struct RelFileNodeBackend
-{
- RelFileNode node;
- BackendId backend;
-} RelFileNodeBackend;
-
-#define RelFileNodeBackendIsTemp(rnode) \
- ((rnode).backend != InvalidBackendId)
-
-/*
- * Note: RelFileNodeEquals and RelFileNodeBackendEquals compare relNode first
- * since that is most likely to be different in two unequal RelFileNodes. It
- * is probably redundant to compare spcNode if the other fields are found equal,
- * but do it anyway to be sure. Likewise for checking the backend ID in
- * RelFileNodeBackendEquals.
- */
-#define RelFileNodeEquals(node1, node2) \
- ((node1).relNode == (node2).relNode && \
- (node1).dbNode == (node2).dbNode && \
- (node1).spcNode == (node2).spcNode)
-
-#define RelFileNodeBackendEquals(node1, node2) \
- ((node1).node.relNode == (node2).node.relNode && \
- (node1).node.dbNode == (node2).node.dbNode && \
- (node1).backend == (node2).backend && \
- (node1).node.spcNode == (node2).node.spcNode)
-
-#endif /* RELFILENODE_H */
diff --git a/src/include/storage/sinval.h b/src/include/storage/sinval.h
index e7cd456..56c6fc9 100644
--- a/src/include/storage/sinval.h
+++ b/src/include/storage/sinval.h
@@ -16,7 +16,7 @@
#include <signal.h>
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* We support several types of shared-invalidation messages:
@@ -90,7 +90,7 @@ typedef struct
int8 id; /* type field --- must be first */
int8 backend_hi; /* high bits of backend ID, if temprel */
uint16 backend_lo; /* low bits of backend ID, if temprel */
- RelFileNode rnode; /* spcNode, dbNode, relNode */
+ RelFileLocator rlocator; /* spcOid, dbOid, relNumber */
} SharedInvalSmgrMsg;
#define SHAREDINVALRELMAP_ID (-4)
diff --git a/src/include/storage/smgr.h b/src/include/storage/smgr.h
index 6b63c60..a077153 100644
--- a/src/include/storage/smgr.h
+++ b/src/include/storage/smgr.h
@@ -16,7 +16,7 @@
#include "lib/ilist.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* smgr.c maintains a table of SMgrRelation objects, which are essentially
@@ -38,8 +38,8 @@
*/
typedef struct SMgrRelationData
{
- /* rnode is the hashtable lookup key, so it must be first! */
- RelFileNodeBackend smgr_rnode; /* relation physical identifier */
+ /* rlocator is the hashtable lookup key, so it must be first! */
+ RelFileLocatorBackend smgr_rlocator; /* relation physical identifier */
/* pointer to owning pointer, or NULL if none */
struct SMgrRelationData **smgr_owner;
@@ -75,16 +75,16 @@ typedef struct SMgrRelationData
typedef SMgrRelationData *SMgrRelation;
#define SmgrIsTemp(smgr) \
- RelFileNodeBackendIsTemp((smgr)->smgr_rnode)
+ RelFileLocatorBackendIsTemp((smgr)->smgr_rlocator)
extern void smgrinit(void);
-extern SMgrRelation smgropen(RelFileNode rnode, BackendId backend);
+extern SMgrRelation smgropen(RelFileLocator rlocator, BackendId backend);
extern bool smgrexists(SMgrRelation reln, ForkNumber forknum);
extern void smgrsetowner(SMgrRelation *owner, SMgrRelation reln);
extern void smgrclearowner(SMgrRelation *owner, SMgrRelation reln);
extern void smgrclose(SMgrRelation reln);
extern void smgrcloseall(void);
-extern void smgrclosenode(RelFileNodeBackend rnode);
+extern void smgrcloserellocator(RelFileLocatorBackend rlocator);
extern void smgrrelease(SMgrRelation reln);
extern void smgrreleaseall(void);
extern void smgrcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo);
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index 6a77632..dacef92 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -17,7 +17,7 @@
#include "datatype/timestamp.h"
#include "storage/lock.h"
#include "storage/procsignal.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/standbydefs.h"
/* User-settable GUC parameters */
@@ -30,9 +30,9 @@ extern void InitRecoveryTransactionEnvironment(void);
extern void ShutdownRecoveryTransactionEnvironment(void);
extern void ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid,
- RelFileNode node);
+ RelFileLocator locator);
extern void ResolveRecoveryConflictWithSnapshotFullXid(FullTransactionId latestRemovedFullXid,
- RelFileNode node);
+ RelFileLocator locator);
extern void ResolveRecoveryConflictWithTablespace(Oid tsid);
extern void ResolveRecoveryConflictWithDatabase(Oid dbid);
diff --git a/src/include/storage/sync.h b/src/include/storage/sync.h
index 9737e1e..049af87 100644
--- a/src/include/storage/sync.h
+++ b/src/include/storage/sync.h
@@ -13,7 +13,7 @@
#ifndef SYNC_H
#define SYNC_H
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* Type of sync request. These are used to manage the set of pending
@@ -51,7 +51,7 @@ typedef struct FileTag
{
int16 handler; /* SyncRequestHandler value, saving space */
int16 forknum; /* ForkNumber, saving space */
- RelFileNode rnode;
+ RelFileLocator rlocator;
uint32 segno;
} FileTag;
diff --git a/src/include/utils/inval.h b/src/include/utils/inval.h
index 0e0323b..23748b7 100644
--- a/src/include/utils/inval.h
+++ b/src/include/utils/inval.h
@@ -15,7 +15,7 @@
#define INVAL_H
#include "access/htup.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
extern PGDLLIMPORT int debug_discard_caches;
@@ -48,7 +48,7 @@ extern void CacheInvalidateRelcacheByTuple(HeapTuple classTuple);
extern void CacheInvalidateRelcacheByRelid(Oid relid);
-extern void CacheInvalidateSmgr(RelFileNodeBackend rnode);
+extern void CacheInvalidateSmgr(RelFileLocatorBackend rlocator);
extern void CacheInvalidateRelmap(Oid databaseId);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 1896a9a..e5b6662 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -23,7 +23,7 @@
#include "partitioning/partdefs.h"
#include "rewrite/prs2lock.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "utils/relcache.h"
#include "utils/reltrigger.h"
@@ -53,7 +53,7 @@ typedef LockInfoData *LockInfo;
typedef struct RelationData
{
- RelFileNode rd_node; /* relation physical identifier */
+ RelFileLocator rd_locator; /* relation physical identifier */
SMgrRelation rd_smgr; /* cached file handle, or NULL */
int rd_refcnt; /* reference count */
BackendId rd_backend; /* owning backend id, if temporary relation */
@@ -66,44 +66,44 @@ typedef struct RelationData
/*----------
* rd_createSubid is the ID of the highest subtransaction the rel has
- * survived into or zero if the rel or its rd_node was created before the
- * current top transaction. (IndexStmt.oldNode leads to the case of a new
- * rel with an old rd_node.) rd_firstRelfilenodeSubid is the ID of the
- * highest subtransaction an rd_node change has survived into or zero if
- * rd_node matches the value it had at the start of the current top
+ * survived into or zero if the rel or its rd_locator was created before the
+ * current top transaction. (IndexStmt.oldNumber leads to the case of a new
+ * rel with an old rd_locator.) rd_firstRelfilelocatorSubid is the ID of the
+ * highest subtransaction an rd_locator change has survived into or zero if
+ * rd_locator matches the value it had at the start of the current top
* transaction. (Rolling back the subtransaction that
- * rd_firstRelfilenodeSubid denotes would restore rd_node to the value it
+ * rd_firstRelfilelocatorSubid denotes would restore rd_locator to the value it
* had at the start of the current top transaction. Rolling back any
* lower subtransaction would not.) Their accuracy is critical to
* RelationNeedsWAL().
*
- * rd_newRelfilenodeSubid is the ID of the highest subtransaction the
- * most-recent relfilenode change has survived into or zero if not changed
+ * rd_newRelfilelocatorSubid is the ID of the highest subtransaction the
+ * most-recent relfilenumber change has survived into or zero if not changed
* in the current transaction (or we have forgotten changing it). This
* field is accurate when non-zero, but it can be zero when a relation has
- * multiple new relfilenodes within a single transaction, with one of them
+ * multiple new relfilenumbers within a single transaction, with one of them
* occurring in a subsequently aborted subtransaction, e.g.
* BEGIN;
* TRUNCATE t;
* SAVEPOINT save;
* TRUNCATE t;
* ROLLBACK TO save;
- * -- rd_newRelfilenodeSubid is now forgotten
+ * -- rd_newRelfilelocatorSubid is now forgotten
*
* If every rd_*Subid field is zero, they are read-only outside
- * relcache.c. Files that trigger rd_node changes by updating
+ * relcache.c. Files that trigger rd_locator changes by updating
* pg_class.reltablespace and/or pg_class.relfilenode call
- * RelationAssumeNewRelfilenode() to update rd_*Subid.
+ * RelationAssumeNewRelfilelocator() to update rd_*Subid.
*
* rd_droppedSubid is the ID of the highest subtransaction that a drop of
* the rel has survived into. In entries visible outside relcache.c, this
* is always zero.
*/
SubTransactionId rd_createSubid; /* rel was created in current xact */
- SubTransactionId rd_newRelfilenodeSubid; /* highest subxact changing
- * rd_node to current value */
- SubTransactionId rd_firstRelfilenodeSubid; /* highest subxact changing
- * rd_node to any value */
+ SubTransactionId rd_newRelfilelocatorSubid; /* highest subxact changing
+ * rd_locator to current value */
+ SubTransactionId rd_firstRelfilelocatorSubid; /* highest subxact changing
+ * rd_locator to any value */
SubTransactionId rd_droppedSubid; /* dropped with another Subid set */
Form_pg_class rd_rel; /* RELATION tuple */
@@ -531,12 +531,12 @@ typedef struct ViewOptions
/*
* RelationIsMapped
- * True if the relation uses the relfilenode map. Note multiple eval
+ * True if the relation uses the relfilenumber map. Note multiple eval
* of argument!
*/
#define RelationIsMapped(relation) \
(RELKIND_HAS_STORAGE((relation)->rd_rel->relkind) && \
- ((relation)->rd_rel->relfilenode == InvalidOid))
+ ((relation)->rd_rel->relfilenode == InvalidRelFileNumber))
/*
* RelationGetSmgr
@@ -555,7 +555,7 @@ static inline SMgrRelation
RelationGetSmgr(Relation rel)
{
if (unlikely(rel->rd_smgr == NULL))
- smgrsetowner(&(rel->rd_smgr), smgropen(rel->rd_node, rel->rd_backend));
+ smgrsetowner(&(rel->rd_smgr), smgropen(rel->rd_locator, rel->rd_backend));
return rel->rd_smgr;
}
@@ -607,12 +607,12 @@ RelationGetSmgr(Relation rel)
*
* Returns false if wal_level = minimal and this relation is created or
* truncated in the current transaction. See "Skipping WAL for New
- * RelFileNode" in src/backend/access/transam/README.
+ * RelFileLocator" in src/backend/access/transam/README.
*/
#define RelationNeedsWAL(relation) \
(RelationIsPermanent(relation) && (XLogIsNeeded() || \
(relation->rd_createSubid == InvalidSubTransactionId && \
- relation->rd_firstRelfilenodeSubid == InvalidSubTransactionId)))
+ relation->rd_firstRelfilelocatorSubid == InvalidSubTransactionId)))
/*
* RelationUsesLocalBuffers
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index c93d865..ba35d6b 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -103,7 +103,7 @@ extern Relation RelationBuildLocalRelation(const char *relname,
TupleDesc tupDesc,
Oid relid,
Oid accessmtd,
- Oid relfilenode,
+ RelFileNumber relfilenumber,
Oid reltablespace,
bool shared_relation,
bool mapped_relation,
@@ -111,10 +111,10 @@ extern Relation RelationBuildLocalRelation(const char *relname,
char relkind);
/*
- * Routines to manage assignment of new relfilenode to a relation
+ * Routines to manage assignment of new relfilenumber to a relation
*/
-extern void RelationSetNewRelfilenode(Relation relation, char persistence);
-extern void RelationAssumeNewRelfilenode(Relation relation);
+extern void RelationSetNewRelfilenumber(Relation relation, char persistence);
+extern void RelationAssumeNewRelfilelocator(Relation relation);
/*
* Routines for flushing/rebuilding relcache entries in various scenarios
diff --git a/src/include/utils/relfilenodemap.h b/src/include/utils/relfilenodemap.h
deleted file mode 100644
index 77d8046..0000000
--- a/src/include/utils/relfilenodemap.h
+++ /dev/null
@@ -1,18 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * relfilenodemap.h
- * relfilenode to oid mapping cache.
- *
- * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * src/include/utils/relfilenodemap.h
- *
- *-------------------------------------------------------------------------
- */
-#ifndef RELFILENODEMAP_H
-#define RELFILENODEMAP_H
-
-extern Oid RelidByRelfilenode(Oid reltablespace, Oid relfilenode);
-
-#endif /* RELFILENODEMAP_H */
diff --git a/src/include/utils/relfilenumbermap.h b/src/include/utils/relfilenumbermap.h
new file mode 100644
index 0000000..c149a93
--- /dev/null
+++ b/src/include/utils/relfilenumbermap.h
@@ -0,0 +1,19 @@
+/*-------------------------------------------------------------------------
+ *
+ * relfilenumbermap.h
+ * relfilenumber to oid mapping cache.
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/relfilenumbermap.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef RELFILENUMBERMAP_H
+#define RELFILENUMBERMAP_H
+
+extern Oid RelidByRelfilenumber(Oid reltablespace,
+ RelFileNumber relfilenumber);
+
+#endif /* RELFILENUMBERMAP_H */
diff --git a/src/include/utils/relmapper.h b/src/include/utils/relmapper.h
index 557f77e..2bb2e25 100644
--- a/src/include/utils/relmapper.h
+++ b/src/include/utils/relmapper.h
@@ -1,7 +1,7 @@
/*-------------------------------------------------------------------------
*
* relmapper.h
- * Catalog-to-filenode mapping
+ * Catalog-to-filenumber mapping
*
*
* Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
@@ -35,14 +35,15 @@ typedef struct xl_relmap_update
#define MinSizeOfRelmapUpdate offsetof(xl_relmap_update, data)
-extern Oid RelationMapOidToFilenode(Oid relationId, bool shared);
+extern RelFileNumber RelationMapOidToFilenumber(Oid relationId, bool shared);
-extern Oid RelationMapFilenodeToOid(Oid relationId, bool shared);
-extern Oid RelationMapOidToFilenodeForDatabase(char *dbpath, Oid relationId);
+extern Oid RelationMapFilenumberToOid(RelFileNumber relationId, bool shared);
+extern RelFileNumber RelationMapOidToFilenumberForDatabase(char *dbpath,
+ Oid relationId);
extern void RelationMapCopy(Oid dbid, Oid tsid, char *srcdbpath,
char *dstdbpath);
-extern void RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
- bool immediate);
+extern void RelationMapUpdateMap(Oid relationId, RelFileNumber fileNumber,
+ bool shared, bool immediate);
extern void RelationMapRemoveMapping(Oid relationId);
diff --git a/src/test/recovery/t/018_wal_optimize.pl b/src/test/recovery/t/018_wal_optimize.pl
index 4700d49..869d9d5 100644
--- a/src/test/recovery/t/018_wal_optimize.pl
+++ b/src/test/recovery/t/018_wal_optimize.pl
@@ -5,7 +5,7 @@
#
# These tests exercise code that once violated the mandate described in
# src/backend/access/transam/README section "Skipping WAL for New
-# RelFileNode". The tests work by committing some transactions, initiating an
+# RelFileLocator". The tests work by committing some transactions, initiating an
# immediate shutdown, and confirming that the expected data survives recovery.
# For many years, individual commands made the decision to skip WAL, hence the
# frequent appearance of COPY in these tests.
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 4fb7469..11b68b6 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2255,8 +2255,8 @@ ReindexObjectType
ReindexParams
ReindexStmt
ReindexType
-RelFileNode
-RelFileNodeBackend
+RelFileLocator
+RelFileLocatorBackend
RelIdCacheEnt
RelInfo
RelInfoArr
@@ -2274,8 +2274,8 @@ RelationPtr
RelationSyncEntry
RelcacheCallbackFunction
ReleaseMatchCB
-RelfilenodeMapEntry
-RelfilenodeMapKey
+RelfilenumberMapEntry
+RelfilenumberMapKey
Relids
RelocationBufferInfo
RelptrFreePageBtree
@@ -3877,7 +3877,7 @@ xl_xact_parsed_abort
xl_xact_parsed_commit
xl_xact_parsed_prepare
xl_xact_prepare
-xl_xact_relfilenodes
+xl_xact_relfilelocators
xl_xact_stats_item
xl_xact_stats_items
xl_xact_subxacts
--
1.8.3.1
Hi,
I'm not feeling inspired by "locator", tbh. But I don't really have a great
alternative, so ...
On 2022-07-01 16:12:01 +0530, Dilip Kumar wrote:
From f07ca9ef19e64922c6ee410707e93773d1a01d7c Mon Sep 17 00:00:00 2001
From: dilip kumar <dilipbalaut@localhost.localdomain>
Date: Sat, 25 Jun 2022 10:43:12 +0530
Subject: [PATCH v4 2/4] Preliminary refactoring for supporting larger
relfilenumber
I don't think we have abbreviated buffer as 'buff' in many places? I think we
should either spell buffer out or use 'buf'. This is in regard to BuffTag etc.
Subject: [PATCH v4 3/4] Use 56 bits for relfilenumber to avoid wraparound
+/*
+ * GenerateNewRelFileNumber
+ *
+ * Similar to GetNewObjectId but instead of new Oid it generates new
+ * relfilenumber.
+ */
+RelFileNumber
+GetNewRelFileNumber(void)
+{
+	RelFileNumber result;
+
+	/* Safety check, we should never get this far in a HS standby */
Normally we don't capitalize the first character of a comment that's not a
full sentence (i.e. ending with a punctuation mark).
+	if (RecoveryInProgress())
+		elog(ERROR, "cannot assign RelFileNumber during recovery");
+
+	LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+	/* Check for the wraparound for the relfilenumber counter */
+	if (unlikely (ShmemVariableCache->nextRelFileNumber > MAX_RELFILENUMBER))
+		elog(ERROR, "relfilenumber is out of bound");
+
+	/* If we run out of logged for use RelFileNumber then we must log more */
"logged for use" - looks like you reformulated the sentence incompletely.
+	if (ShmemVariableCache->relnumbercount == 0)
+	{
+		XLogPutNextRelFileNumber(ShmemVariableCache->nextRelFileNumber +
+								 VAR_RFN_PREFETCH);
I know this is just copied, but I find "XLogPut" as a prefix pretty unhelpful.
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index e1f4eef..1cf039c 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -31,6 +31,10 @@
 */
CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,RelationRelation_Rowtype_Id) BKI_SCHEMA_MACRO
{
+	/* identifier of physical storage file */
+	/* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
+	int64		relfilenode BKI_DEFAULT(0);
+
	/* oid */
	Oid			oid;
@@ -52,10 +56,6 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
	/* access method; 0 if not a table / index */
	Oid			relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
-	/* identifier of physical storage file */
-	/* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
-	Oid			relfilenode BKI_DEFAULT(0);
-
	/* identifier of table space for relation (0 means default for database) */
	Oid			reltablespace BKI_DEFAULT(0) BKI_LOOKUP_OPT(pg_tablespace);
What's the story behind moving relfilenode to the front? Alignment
consideration? Seems odd to move the relfilenode before the oid. If there's an
alignment issue, can't you just swap it with reltablespace or such to resolve
it?
From f6e8e0e7412198b02671e67d1859a7448fe83f38 Mon Sep 17 00:00:00 2001
From: dilip kumar <dilipbalaut@localhost.localdomain>
Date: Wed, 29 Jun 2022 13:24:32 +0530
Subject: [PATCH v4 4/4] Don't delay removing Tombstone file until next
checkpoint

Currently, we can not remove the unused relfilenode until the
next checkpoint because if we remove them immediately then
there is a risk of reusing the same relfilenode for two
different relations during single checkpoint due to Oid
wraparound.
Well, not quite "currently", because at this point we've fixed that in a prior
commit ;)
Now as part of the previous patch set we have made relfilenode
56 bit wider and removed the risk of wraparound so now we don't
need to wait till the next checkpoint for removing the unused
relation file and we can clean them up on commit.
Hm. Wasn't there also some issue around crash-restarts benefiting from having
those files around until later? I think what I'm remembering is what is
referenced in this comment:
- * For regular relations, we don't unlink the first segment file of the rel,
- * but just truncate it to zero length, and record a request to unlink it after
- * the next checkpoint. Additional segments can be unlinked immediately,
- * however. Leaving the empty file in place prevents that relfilenumber
- * from being reused. The scenario this protects us from is:
- * 1. We delete a relation (and commit, and actually remove its file).
- * 2. We create a new relation, which by chance gets the same relfilenumber as
- * the just-deleted one (OIDs must've wrapped around for that to happen).
- * 3. We crash before another checkpoint occurs.
- * During replay, we would delete the file and then recreate it, which is fine
- * if the contents of the file were repopulated by subsequent WAL entries.
- * But if we didn't WAL-log insertions, but instead relied on fsyncing the
- * file after populating it (as we do at wal_level=minimal), the contents of
- * the file would be lost forever. By leaving the empty file until after the
- * next checkpoint, we prevent reassignment of the relfilenumber until it's
- * safe, because relfilenumber assignment skips over any existing file.
This isn't related to oid wraparound, just crashes. It's possible that the
XLogFlush() in XLogPutNextRelFileNumber() prevents such a scenario, but if so
it still ought to be explained here, I think.
+ * Note that now we can immediately unlink the first segment of the regular
+ * relation as well because the relfilenumber is 56 bits wide since PG 16.  So
+ * we don't have to worry about relfilenumber getting reused for some unrelated
+ * relation file.
I'm doubtful it's a good idea to start dropping at the first segment. I'm
fairly certain that there's smgrexists() checks in some places, and they'll
now stop working, even if there are later segments that remained, e.g. because
of an error in the middle of removing later segments.
Greetings,
Andres Freund
On Sat, Jul 2, 2022 at 9:38 AM Andres Freund <andres@anarazel.de> wrote:
Thanks for the review,
I'm not feeling inspired by "locator", tbh. But I don't really have a great
alternative, so ...

On 2022-07-01 16:12:01 +0530, Dilip Kumar wrote:
From f07ca9ef19e64922c6ee410707e93773d1a01d7c Mon Sep 17 00:00:00 2001
From: dilip kumar <dilipbalaut@localhost.localdomain>
Date: Sat, 25 Jun 2022 10:43:12 +0530
Subject: [PATCH v4 2/4] Preliminary refactoring for supporting larger
relfilenumber

I don't think we have abbreviated buffer as 'buff' in many places? I think we
should either spell buffer out or use 'buf'. This is in regard to BuffTag etc.
Okay, I will change it to 'buf'
Subject: [PATCH v4 3/4] Use 56 bits for relfilenumber to avoid wraparound
Normally we don't capitalize the first character of a comment that's not a
full sentence (i.e. ending with a punctuation mark).
Okay.
"logged for use" - looks like you reformulated the sentence incompletely.
Right, I will fix it.
+	if (ShmemVariableCache->relnumbercount == 0)
+	{
+		XLogPutNextRelFileNumber(ShmemVariableCache->nextRelFileNumber +
+								 VAR_RFN_PREFETCH);

I know this is just copied, but I find "XLogPut" as a prefix pretty unhelpful.
Maybe we can change to LogNextRelFileNumber()?
What's the story behind moving relfilenode to the front? Alignment
consideration? Seems odd to move the relfilenode before the oid. If there's an
alignment issue, can't you just swap it with reltablespace or such to resolve
it?
Because of a test case added in commit
79b716cfb7a1be2a61ebb4418099db1258f35e30. I did not like moving
relfilenode before oid either, but that commit expects such columns
to be properly aligned and, per the comment it added, to come before
any NameData column:
===
+--
+-- Keep such columns before the first NameData column of the
+-- catalog, since packagers can override NAMEDATALEN to an odd number.
+--
===
From f6e8e0e7412198b02671e67d1859a7448fe83f38 Mon Sep 17 00:00:00 2001
From: dilip kumar <dilipbalaut@localhost.localdomain>
Date: Wed, 29 Jun 2022 13:24:32 +0530
Subject: [PATCH v4 4/4] Don't delay removing Tombstone file until next
checkpoint
Currently, we can not remove the unused relfilenode until the
next checkpoint because if we remove them immediately then
there is a risk of reusing the same relfilenode for two
different relations during single checkpoint due to Oid
wraparound.
Well, not quite "currently", because at this point we've fixed that in a prior
commit ;)
Right, I will change that, but I'm not sure whether we want to commit 0003
and 0004 as independent patches or as a single patch.
Now as part of the previous patch set we have made relfilenode
56 bit wider and removed the risk of wraparound so now we don't
need to wait till the next checkpoint for removing the unused
relation file and we can clean them up on commit.
Hm. Wasn't there also some issue around crash-restarts benefiting from having
those files around until later? I think what I'm remembering is what is
referenced in this comment:
I think due to wraparound if relfilenode gets reused by another
relation in the same checkpoint then there was an issue in crash
recovery with wal level minimal. But the root of the issue is a
wraparound, right?
- * For regular relations, we don't unlink the first segment file of the rel,
- * but just truncate it to zero length, and record a request to unlink it after
- * the next checkpoint. Additional segments can be unlinked immediately,
- * however. Leaving the empty file in place prevents that relfilenumber
- * from being reused. The scenario this protects us from is:
- * 1. We delete a relation (and commit, and actually remove its file).
- * 2. We create a new relation, which by chance gets the same relfilenumber as
- * the just-deleted one (OIDs must've wrapped around for that to happen).
- * 3. We crash before another checkpoint occurs.
- * During replay, we would delete the file and then recreate it, which is fine
- * if the contents of the file were repopulated by subsequent WAL entries.
- * But if we didn't WAL-log insertions, but instead relied on fsyncing the
- * file after populating it (as we do at wal_level=minimal), the contents of
- * the file would be lost forever. By leaving the empty file until after the
- * next checkpoint, we prevent reassignment of the relfilenumber until it's
- * safe, because relfilenumber assignment skips over any existing file.
This isn't related to oid wraparound, just crashes. It's possible that the
XLogFlush() in XLogPutNextRelFileNumber() prevents such a scenario, but if so
it still ought to be explained here, I think.
I think the root cause of the problem is oid reuse, which is due to
relfilenode wraparound, and the problem appears if there is a crash
after that. Now we have removed the wraparound, so there won't be any
more reuse of relfilenodes and there is no problem during crash
recovery.
In XLogPutNextRelFileNumber() we need XLogFlush() just to ensure that
after crash recovery we do not reuse a relfilenode, because now we are
not checking for an existing disk file, as we have completely removed
the wraparound.
So in short, the problem this comment was explaining arises when a
relfilenode gets reused within the same checkpoint cycle due to
wraparound: crash recovery would then lose the contents of the new
relation that reused the relfilenode, at wal_level=minimal. But by
adding XLogFlush() in XLogPutNextRelFileNumber() we ensure that after
crash recovery we do not reuse the same relfilenode, because that WAL
goes to disk before we create the relation file on disk.
+ * Note that now we can immediately unlink the first segment of the regular
+ * relation as well because the relfilenumber is 56 bits wide since PG 16. So
+ * we don't have to worry about relfilenumber getting reused for some unrelated
+ * relation file.
I'm doubtful it's a good idea to start dropping at the first segment. I'm
fairly certain that there's smgrexists() checks in some places, and they'll
now stop working, even if there are later segments that remained, e.g. because
of an error in the middle of removing later segments.
Okay, so you mean to say that we can first drop the remaining segment
and at last we drop the segment 0 right?
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Sat, Jul 2, 2022 at 4:53 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
I'm doubtful it's a good idea to start dropping at the first segment. I'm
fairly certain that there's smgrexists() checks in some places, and they'll
now stop working, even if there are later segments that remained, e.g. because
of an error in the middle of removing later segments.
Okay, so you mean to say that we can first drop the remaining segment
and at last we drop the segment 0 right?
I think we need to do it in descending order, starting with the
highest-numbered segment and working down. md.c isn't smart about gaps
in the sequence of files, so it's better if we don't create any gaps.
--
Robert Haas
EDB: http://www.enterprisedb.com
Hi,
On 2022-07-02 14:23:08 +0530, Dilip Kumar wrote:
+   if (ShmemVariableCache->relnumbercount == 0)
+   {
+       XLogPutNextRelFileNumber(ShmemVariableCache->nextRelFileNumber +
+                                VAR_RFN_PREFETCH);
I know this is just copied, but I find "XLogPut" as a prefix pretty unhelpful.
Maybe we can change to LogNextRelFileNumber()?
Much better.
Hm. Now that I think about it, isn't the XlogFlush() in
XLogPutNextRelFileNumber() problematic performance wise? Yes, we'll spread the
cost across a number of GetNewRelFileNumber() calls, but still, an additional
f[data]sync for every 64 relfilenodes assigned isn't cheap - today there's
zero fsyncs when creating a sequence or table inside a transaction (there are
some for indexes, but there's patches to fix that).
Not that I really see an obvious alternative.
I guess we could try to invent a flush-log-before-write type logic for
relfilenodes somehow? So that the first block actually written to a file needs
to ensure the WAL record that created the relation is flushed. But getting
that to work reliably seems nontrivial.
One thing that would be good is to add an assertion to a few places ensuring
that relfilenodes aren't above ->nextRelFileNumber, most importantly somewhere
in the recovery path.
Why did you choose a quite small value for VAR_RFN_PREFETCH? VAR_OID_PREFETCH
is 8192, but you chose 64 for VAR_RFN_PREFETCH?
I'd spell out RFN in VAR_RFN_PREFETCH btw, it took me a bit to expand RFN to
relfilenode.
What's the story behind moving relfilenode to the front? Alignment
consideration? Seems odd to move the relfilenode before the oid. If there's an
alignment issue, can't you just swap it with reltablespace or such to resolve
it?
Because of a test case added in commit
79b716cfb7a1be2a61ebb4418099db1258f35e30. I did not like moving
relfilenode before oid either, but that commit expects such columns to
be aligned and to come before any NameData column, per this comment:
===
+--
+-- Keep such columns before the first NameData column of the
+-- catalog, since packagers can override NAMEDATALEN to an odd number.
+--
===
This is embarrassing. Trying to keep alignment match between C and catalog
alignment on AIX, without actually making the system understand the alignment
rules, is a remarkably shortsighted approach.
I started a separate thread about it, since it's not really relevant to this thread:
/messages/by-id/20220702183354.a6uhja35wta7agew@alap3.anarazel.de
Maybe we could at least make the field order to be something like
oid, relam, relfilenode, relname
that should be ok alignment wise, keep oid first, and seems to make sense from
an "importance" POV? Can't really interpret later fields without knowing relam
etc.
Now as part of the previous patch set we have made relfilenode
56 bit wider and removed the risk of wraparound so now we don't
need to wait till the next checkpoint for removing the unused
relation file and we can clean them up on commit.
Hm. Wasn't there also some issue around crash-restarts benefiting from having
those files around until later? I think what I'm remembering is what is
referenced in this comment:
I think due to wraparound if relfilenode gets reused by another
relation in the same checkpoint then there was an issue in crash
recovery with wal level minimal. But the root of the issue is a
wraparound, right?
I'm not convinced the tombstones were required solely in the oid wraparound
case before, despite the comment saying so, with wal_level=minimal. I gotta do
some non-work stuff for a bit, so I need to stop pondering this now :)
I think it might be a good idea to have a few weeks in which we do *not*
remove the tombstones, but have assertion checks against such files existing
when we don't expect them to. I.e. commit 1-3, add the asserts, then commit 4
a bit later.
I'm doubtful it's a good idea to start dropping at the first segment. I'm
fairly certain that there's smgrexists() checks in some places, and they'll
now stop working, even if there are later segments that remained, e.g. because
of an error in the middle of removing later segments.Okay, so you mean to say that we can first drop the remaining segment
and at last we drop the segment 0 right?
I'd use the approach Robert suggested and delete from the end, going down.
Greetings,
Andres Freund
On Sun, Jul 3, 2022 at 12:59 AM Andres Freund <andres@anarazel.de> wrote:
Hm. Now that I think about it, isn't the XlogFlush() in
XLogPutNextRelFileNumber() problematic performance wise? Yes, we'll spread the
cost across a number of GetNewRelFileNumber() calls, but still, an additional
f[data]sync for every 64 relfilenodes assigned isn't cheap - today there's
zero fsyncs when creating a sequence or table inside a transaction (there are
some for indexes, but there's patches to fix that).
Not that I really see an obvious alternative.
I think to see the impact we need a workload which frequently creates
the relfilenode, maybe we can test where parallel to pgbench we are
frequently creating the relation/indexes and check how much
performance hit we see. And if we see the impact then increasing
VAR_RFN_PREFETCH value can help in resolving that.
I guess we could try to invent a flush-log-before-write type logic for
relfilenodes somehow? So that the first block actually written to a file needs
to ensure the WAL record that created the relation is flushed. But getting
that to work reliably seems nontrivial.
One thing that would be good is to add an assertion to a few places ensuring
that relfilenodes aren't above ->nextRelFileNumber, most importantly somewhere
in the recovery path.
Yes, it makes sense.
Why did you choose a quite small value for VAR_RFN_PREFETCH? VAR_OID_PREFETCH
is 8192, but you chose 64 for VAR_RFN_PREFETCH?
Earlier it was 8192, then there was a comment from Robert that we can
use Oid for many other things like an identifier for a catalog tuple
or a TOAST chunk, but a RelFileNumber requires a filesystem operation,
so the amount of work that is needed to use up 8192 RelFileNumbers is
a lot bigger than the amount of work required to use up 8192 OIDs.
I think that made sense, so I reduced it to 64, but now I tend to
think that we should also consider the point that after consuming
VAR_RFN_PREFETCH numbers we are going to do XLogFlush(), so it's
better to keep it high. And as Robert said upthread, even keeping it
at 8192 we can still crash 2^41 (about 2 trillion) times before we
completely run out of numbers. So I think we can easily keep it at
8192, and we do not really need to worry much about the performance
impact of XLogFlush().
I'd spell out RFN in VAR_RFN_PREFETCH btw, it took me a bit to expand RFN to
relfilenode.
Okay.
This is embarrassing. Trying to keep alignment match between C and catalog
alignment on AIX, without actually making the system understand the alignment
rules, is a remarkably shortsighted approach.
I started a separate thread about it, since it's not really relevant to this thread:
/messages/by-id/20220702183354.a6uhja35wta7agew@alap3.anarazel.de
Maybe we could at least make the field order to be something like
oid, relam, relfilenode, relname
Yeah that we can do.
that should be ok alignment wise, keep oid first, and seems to make sense from
an "importance" POV? Can't really interpret later fields without knowing relam
etc.
Right.
I think due to wraparound if relfilenode gets reused by another
relation in the same checkpoint then there was an issue in crash
recovery with wal level minimal. But the root of the issue is a
wraparound, right?
I'm not convinced the tombstones were required solely in the oid wraparound
case before, despite the comment saying so, with wal_level=minimal. I gotta do
some non-work stuff for a bit, so I need to stop pondering this now :)
I think it might be a good idea to have a few weeks in which we do *not*
remove the tombstones, but have assertion checks against such files existing
when we don't expect them to. I.e. commit 1-3, add the asserts, then commit 4
a bit later.
I think this is a good idea.
Okay, so you mean to say that we can first drop the remaining segment
and at last we drop the segment 0 right?
I'd use the approach Robert suggested and delete from the end, going down.
Yeah, I got it, thanks.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Sat, Jul 2, 2022 at 3:29 PM Andres Freund <andres@anarazel.de> wrote:
Why did you choose a quite small value for VAR_RFN_PREFETCH? VAR_OID_PREFETCH
is 8192, but you chose 64 for VAR_RFN_PREFETCH?
As Dilip mentioned, I suggested a lower value. If that's too low, we
can go higher, but I think there is value in not making this
excessively large. Somebody somewhere is going to have a database
that's crash-restarting like mad, and I don't want that person to run
through an insane number of relfilenodes for no reason. I don't think
there are going to be a lot of people creating thousands upon
thousands of relations in a short period of time, and I'm not sure
that it's a big deal if those who do end up having to wait for a few
extra xlog flushes.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Sun, Jul 3, 2022 at 8:02 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Sat, Jul 2, 2022 at 3:29 PM Andres Freund <andres@anarazel.de> wrote:
Why did you choose a quite small value for VAR_RFN_PREFETCH? VAR_OID_PREFETCH
is 8192, but you chose 64 for VAR_RFN_PREFETCH?
As Dilip mentioned, I suggested a lower value. If that's too low, we
can go higher, but I think there is value in not making this
excessively large. Somebody somewhere is going to have a database
that's crash-restarting like mad, and I don't want that person to run
through an insane number of relfilenodes for no reason. I don't think
there are going to be a lot of people creating thousands upon
thousands of relations in a short period of time, and I'm not sure
that it's a big deal if those who do end up having to wait for a few
extra xlog flushes.
Here is the updated version of the patch.
Patches 0001-0003 are the same, with fixes for the review comments
given by Andres; 0004 is an extra assert patch suggested by Andres,
which can be merged with 0003. Basically, during recovery we add
asserts checking that relfilenumbers aren't above ->nextRelFileNumber,
and also an assert checking that after we allocate a new relfilenumber
the file does not already exist on disk. Once we are sure these
assertions are not hitting, we should be safe to remove the tombstone
files immediately, which is what 0005 does.
In 0005 I fixed the file delete order: now we delete in descending
order, so first we count the number of segments by doing stat() on
each file, and then we unlink them in descending order.
VAR_RELFILENUMBER_PREFETCH is still 64, as we have not yet concluded
on that; as discussed, I will run some performance tests to see
whether different values have an obvious impact. Maybe I will start
with some very small numbers so that any impact is visible.
I thought about this comment from Robert
that's not quite the same as either of those things. For example, in
tableam.h we currently say "This callback needs to create a new
relation filenode for `rel`" and how should that be changed in this
new naming? We're not creating a new RelFileNumber - those would need
to be allocated, not created, as all the numbers in the universe exist
already. Neither are we creating a new locator; that sounds like it
means assembling it from pieces.
I think that "This callback needs to create a new relation storage
for `rel`" looks better.
I have again reviewed 0001 and 0003 and found some discrepancies in
the usage of relfilenumber vs. relfilelocator and fixed those; also,
in some places InvalidOid was used instead of InvalidRelFileNumber.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachments:
v5-0005-Don-t-delay-removing-Tombstone-file-until-next.patchapplication/x-patch; name=v5-0005-Don-t-delay-removing-Tombstone-file-until-next.patchDownload
From e6b7873fa160423b4eb6f2d5bfd86fe2815992a1 Mon Sep 17 00:00:00 2001
From: dilip kumar <dilipbalaut@localhost.localdomain>
Date: Tue, 5 Jul 2022 13:25:39 +0530
Subject: [PATCH v5 5/5] Don't delay removing Tombstone file until next
checkpoint
Prior to making relfilenode 56 bits wide, we could not
remove an unused relfilenode until the next checkpoint,
because if we removed it immediately then there was a risk
of reusing the same relfilenode for two different relations
during a single checkpoint cycle due to Oid wraparound.
Now, as part of the previous patch set, we have made relfilenode
56 bits wide and removed the risk of wraparound, so we no longer
need to wait until the next checkpoint to remove the unused
relation file, and we can clean it up on commit.
---
src/backend/access/transam/xlog.c | 5 --
src/backend/storage/smgr/md.c | 151 +++++++++++---------------------------
src/backend/storage/sync/sync.c | 101 -------------------------
src/include/storage/sync.h | 2 -
4 files changed, 43 insertions(+), 216 deletions(-)
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 1fbb5af..e298318 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6641,11 +6641,6 @@ CreateCheckPoint(int flags)
END_CRIT_SECTION();
/*
- * Let smgr do post-checkpoint cleanup (eg, deleting old files).
- */
- SyncPostCheckpoint();
-
- /*
* Update the average distance between checkpoints if the prior checkpoint
* exists.
*/
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index aaf8881..7d15920 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -24,6 +24,7 @@
#include <unistd.h>
#include <fcntl.h>
#include <sys/file.h>
+#include <sys/stat.h>
#include "access/xlog.h"
#include "access/xlogutils.h"
@@ -126,8 +127,6 @@ static void mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forkNum,
static MdfdVec *mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior);
static void register_dirty_segment(SMgrRelation reln, ForkNumber forknum,
MdfdVec *seg);
-static void register_unlink_segment(RelFileLocatorBackend rlocator, ForkNumber forknum,
- BlockNumber segno);
static void register_forget_request(RelFileLocatorBackend rlocator, ForkNumber forknum,
BlockNumber segno);
static void _fdvec_resize(SMgrRelation reln,
@@ -240,38 +239,14 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* forkNum can be a fork number to delete a specific fork, or InvalidForkNumber
* to delete all forks.
*
- * For regular relations, we don't unlink the first segment file of the rel,
- * but just truncate it to zero length, and record a request to unlink it after
- * the next checkpoint. Additional segments can be unlinked immediately,
- * however. Leaving the empty file in place prevents that relfilenumber
- * from being reused. The scenario this protects us from is:
- * 1. We delete a relation (and commit, and actually remove its file).
- * 2. We create a new relation, which by chance gets the same relfilenumber as
- * the just-deleted one (OIDs must've wrapped around for that to happen).
- * 3. We crash before another checkpoint occurs.
- * During replay, we would delete the file and then recreate it, which is fine
- * if the contents of the file were repopulated by subsequent WAL entries.
- * But if we didn't WAL-log insertions, but instead relied on fsyncing the
- * file after populating it (as we do at wal_level=minimal), the contents of
- * the file would be lost forever. By leaving the empty file until after the
- * next checkpoint, we prevent reassignment of the relfilenumber until it's
- * safe, because relfilenumber assignment skips over any existing file.
- *
- * XXX although this all was true when we had 32bits relfilenumber but now we
- * have 56bits relfilenumber so we don't have risk of relfilenumber being
- * reused so in future we can immediately unlink the first segment as well.
- *
- * We do not need to go through this dance for temp relations, though, because
- * we never make WAL entries for temp rels, and so a temp rel poses no threat
- * to the health of a regular rel that has taken over its relfilenumber.
- * The fact that temp rels and regular rels have different file naming
- * patterns provides additional safety.
+ * We do not carefully track whether other forks have been created or not, but
+ * just attempt to unlink them unconditionally; so we should never complain
+ * about ENOENT.
*
- * All the above applies only to the relation's main fork; other forks can
- * just be removed immediately, since they are not needed to prevent the
- * relfilenumber from being recycled. Also, we do not carefully
- * track whether other forks have been created or not, but just attempt to
- * unlink them unconditionally; so we should never complain about ENOENT.
+ * Note that now we can immediately unlink the first segment of the regular
+ * relation as well because the relfilenumber is 56 bits wide since PG 16. So
+ * we don't have to worry about relfilenumber getting reused for some unrelated
+ * relation file.
*
* If isRedo is true, it's unsurprising for the relation to be already gone.
* Also, we should remove the file immediately instead of queuing a request
@@ -322,90 +297,67 @@ static void
mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forkNum, bool isRedo)
{
char *path;
- int ret;
+ char *segpath;
+ int segno;
+ int lastsegment = -1;
+ struct stat statbuf;
path = relpath(rlocator, forkNum);
+ segpath = (char *) palloc(strlen(path) + 12);
- /*
- * Delete or truncate the first segment.
- */
- if (isRedo || forkNum != MAIN_FORKNUM || RelFileLocatorBackendIsTemp(rlocator))
+ /* compute number of segments. */
+ for (segno = 0;; segno++)
{
- if (!RelFileLocatorBackendIsTemp(rlocator))
- {
- /* Prevent other backends' fds from holding on to the disk space */
- ret = do_truncate(path);
-
- /* Forget any pending sync requests for the first segment */
- register_forget_request(rlocator, forkNum, 0 /* first seg */ );
- }
+ if (segno == 0)
+ sprintf(segpath, "%s", path);
else
- ret = 0;
+ sprintf(segpath, "%s.%u", path, segno);
- /* Next unlink the file, unless it was already found to be missing */
- if (ret == 0 || errno != ENOENT)
+ if (stat(segpath, &statbuf) != 0)
{
- ret = unlink(path);
- if (ret < 0 && errno != ENOENT)
- ereport(WARNING,
- (errcode_for_file_access(),
- errmsg("could not remove file \"%s\": %m", path)));
+ /* ENOENT is expected after the last segment... */
+ if (errno == ENOENT)
+ break;
}
- }
- else
- {
- /* Prevent other backends' fds from holding on to the disk space */
- ret = do_truncate(path);
-
- /* Register request to unlink first segment later */
- register_unlink_segment(rlocator, forkNum, 0 /* first seg */ );
+ lastsegment = segno;
}
/*
- * Delete any additional segments.
+ * Unlink segment files in descending order so that if there is any failure
+ * while deleting any of the segment files, we do not create any gaps in
+ * segment files sequence.
*/
- if (ret >= 0)
+ for (segno = lastsegment; segno >= 0; segno--)
{
- char *segpath = (char *) palloc(strlen(path) + 12);
- BlockNumber segno;
-
- /*
- * Note that because we loop until getting ENOENT, we will correctly
- * remove all inactive segments as well as active ones.
- */
- for (segno = 1;; segno++)
- {
+ if (segno == 0)
+ sprintf(segpath, "%s", path);
+ else
sprintf(segpath, "%s.%u", path, segno);
if (!RelFileLocatorBackendIsTemp(rlocator))
{
/*
- * Prevent other backends' fds from holding on to the disk
+ * prevent other backends' fds from holding on to the disk
* space.
*/
- if (do_truncate(segpath) < 0 && errno == ENOENT)
- break;
+ do_truncate(segpath);
- /*
- * Forget any pending sync requests for this segment before we
- * try to unlink.
- */
+ /* forget any pending sync requests for the first segment. */
register_forget_request(rlocator, forkNum, segno);
}
- if (unlink(segpath) < 0)
- {
- /* ENOENT is expected after the last segment... */
- if (errno != ENOENT)
- ereport(WARNING,
- (errcode_for_file_access(),
- errmsg("could not remove file \"%s\": %m", segpath)));
- break;
- }
- }
- pfree(segpath);
+ /*
+ * Unlink the file; we have already checked for file existence in
+ * the loop above while counting the segments, so we do not need to
+ * check for ENOENT.
+ */
+ if (unlink(segpath))
+ ereport(WARNING,
+ (errcode_for_file_access(),
+ errmsg("could not remove file \"%s\": %m", segpath)));
}
+ pfree(segpath);
pfree(path);
}
@@ -1006,23 +958,6 @@ register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
}
/*
- * register_unlink_segment() -- Schedule a file to be deleted after next checkpoint
- */
-static void
-register_unlink_segment(RelFileLocatorBackend rlocator, ForkNumber forknum,
- BlockNumber segno)
-{
- FileTag tag;
-
- INIT_MD_FILETAG(tag, rlocator.locator, forknum, segno);
-
- /* Should never be used with temp relations */
- Assert(!RelFileLocatorBackendIsTemp(rlocator));
-
- RegisterSyncRequest(&tag, SYNC_UNLINK_REQUEST, true /* retryOnError */ );
-}
-
-/*
* register_forget_request() -- forget any fsyncs for a relation fork's segment
*/
static void
diff --git a/src/backend/storage/sync/sync.c b/src/backend/storage/sync/sync.c
index e1fb631..9a4a31c 100644
--- a/src/backend/storage/sync/sync.c
+++ b/src/backend/storage/sync/sync.c
@@ -201,92 +201,6 @@ SyncPreCheckpoint(void)
}
/*
- * SyncPostCheckpoint() -- Do post-checkpoint work
- *
- * Remove any lingering files that can now be safely removed.
- */
-void
-SyncPostCheckpoint(void)
-{
- int absorb_counter;
- ListCell *lc;
-
- absorb_counter = UNLINKS_PER_ABSORB;
- foreach(lc, pendingUnlinks)
- {
- PendingUnlinkEntry *entry = (PendingUnlinkEntry *) lfirst(lc);
- char path[MAXPGPATH];
-
- /* Skip over any canceled entries */
- if (entry->canceled)
- continue;
-
- /*
- * New entries are appended to the end, so if the entry is new we've
- * reached the end of old entries.
- *
- * Note: if just the right number of consecutive checkpoints fail, we
- * could be fooled here by cycle_ctr wraparound. However, the only
- * consequence is that we'd delay unlinking for one more checkpoint,
- * which is perfectly tolerable.
- */
- if (entry->cycle_ctr == checkpoint_cycle_ctr)
- break;
-
- /* Unlink the file */
- if (syncsw[entry->tag.handler].sync_unlinkfiletag(&entry->tag,
- path) < 0)
- {
- /*
- * There's a race condition, when the database is dropped at the
- * same time that we process the pending unlink requests. If the
- * DROP DATABASE deletes the file before we do, we will get ENOENT
- * here. rmtree() also has to ignore ENOENT errors, to deal with
- * the possibility that we delete the file first.
- */
- if (errno != ENOENT)
- ereport(WARNING,
- (errcode_for_file_access(),
- errmsg("could not remove file \"%s\": %m", path)));
- }
-
- /* Mark the list entry as canceled, just in case */
- entry->canceled = true;
-
- /*
- * As in ProcessSyncRequests, we don't want to stop absorbing fsync
- * requests for a long time when there are many deletions to be done.
- * We can safely call AbsorbSyncRequests() at this point in the loop.
- */
- if (--absorb_counter <= 0)
- {
- AbsorbSyncRequests();
- absorb_counter = UNLINKS_PER_ABSORB;
- }
- }
-
- /*
- * If we reached the end of the list, we can just remove the whole list
- * (remembering to pfree all the PendingUnlinkEntry objects). Otherwise,
- * we must keep the entries at or after "lc".
- */
- if (lc == NULL)
- {
- list_free_deep(pendingUnlinks);
- pendingUnlinks = NIL;
- }
- else
- {
- int ntodelete = list_cell_number(pendingUnlinks, lc);
-
- for (int i = 0; i < ntodelete; i++)
- pfree(list_nth(pendingUnlinks, i));
-
- pendingUnlinks = list_delete_first_n(pendingUnlinks, ntodelete);
- }
-}
-
-/*
* ProcessSyncRequests() -- Process queued fsync requests.
*/
void
@@ -532,21 +446,6 @@ RememberSyncRequest(const FileTag *ftag, SyncRequestType type)
entry->canceled = true;
}
}
- else if (type == SYNC_UNLINK_REQUEST)
- {
- /* Unlink request: put it in the linked list */
- MemoryContext oldcxt = MemoryContextSwitchTo(pendingOpsCxt);
- PendingUnlinkEntry *entry;
-
- entry = palloc(sizeof(PendingUnlinkEntry));
- entry->tag = *ftag;
- entry->cycle_ctr = checkpoint_cycle_ctr;
- entry->canceled = false;
-
- pendingUnlinks = lappend(pendingUnlinks, entry);
-
- MemoryContextSwitchTo(oldcxt);
- }
else
{
/* Normal case: enter a request to fsync this segment */
diff --git a/src/include/storage/sync.h b/src/include/storage/sync.h
index 049af87..2c0b812 100644
--- a/src/include/storage/sync.h
+++ b/src/include/storage/sync.h
@@ -23,7 +23,6 @@
typedef enum SyncRequestType
{
SYNC_REQUEST, /* schedule a call of sync function */
- SYNC_UNLINK_REQUEST, /* schedule a call of unlink function */
SYNC_FORGET_REQUEST, /* forget all calls for a tag */
SYNC_FILTER_REQUEST /* forget all calls satisfying match fn */
} SyncRequestType;
@@ -57,7 +56,6 @@ typedef struct FileTag
extern void InitSync(void);
extern void SyncPreCheckpoint(void);
-extern void SyncPostCheckpoint(void);
extern void ProcessSyncRequests(void);
extern void RememberSyncRequest(const FileTag *ftag, SyncRequestType type);
extern bool RegisterSyncRequest(const FileTag *ftag, SyncRequestType type,
--
1.8.3.1
v5-0004-Assert-checking-to-be-merged-with-0003.patchapplication/x-patch; name=v5-0004-Assert-checking-to-be-merged-with-0003.patchDownload
From c1588b883a9cc213feed010d0e8907d8807d3e01 Mon Sep 17 00:00:00 2001
From: dilip kumar <dilipbalaut@localhost.localdomain>
Date: Mon, 4 Jul 2022 14:51:21 +0530
Subject: [PATCH v5 4/5] Assert checking (to be merged with 0003)
---
src/backend/catalog/catalog.c | 54 ++++++++++++++++++++++++++++++++++++++++
src/backend/catalog/heap.c | 5 ++++
src/backend/catalog/storage.c | 6 +++++
src/backend/commands/tablecmds.c | 3 +++
src/include/catalog/catalog.h | 9 +++++++
5 files changed, 77 insertions(+)
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 155400c..9a22203 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -583,3 +583,57 @@ pg_stop_making_pinned_objects(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+
+#ifdef USE_ASSERT_CHECKING
+
+/*
+ * Assert that there is no existing diskfile for input relnumber.
+ */
+void
+AssertRelfileNumberFileNotExists(Oid spcoid, RelFileNumber relnumber,
+ char relpersistence)
+{
+ RelFileLocatorBackend rlocator;
+ char *rpath;
+ BackendId backend;
+
+ /*
+ * If we ever get here during pg_upgrade, there's something wrong; all
+ * relfilenode assignments during a binary-upgrade run should be
+ * determined by commands in the dump script.
+ */
+ Assert(!IsBinaryUpgrade);
+
+ switch (relpersistence)
+ {
+ case RELPERSISTENCE_TEMP:
+ backend = BackendIdForTempRelations();
+ break;
+ case RELPERSISTENCE_UNLOGGED:
+ case RELPERSISTENCE_PERMANENT:
+ backend = InvalidBackendId;
+ break;
+ default:
+ elog(ERROR, "invalid relpersistence: %c", relpersistence);
+ return; /* placate compiler */
+ }
+
+ /* this logic should match RelationInitPhysicalAddr */
+ rlocator.locator.spcOid = spcoid ? spcoid : MyDatabaseTableSpace;
+ rlocator.locator.dbOid =
+ (rlocator.locator.spcOid == GLOBALTABLESPACE_OID) ? InvalidOid :
+ MyDatabaseId;
+
+ /*
+ * The relpath will vary based on the backend ID, so we must initialize
+ * that properly here to make sure that any collisions based on filename
+ * are properly detected.
+ */
+ rlocator.backend = backend;
+
+ /* check for existing file of same name. */
+ rpath = relpath(rlocator, MAIN_FORKNUM);
+
+ Assert(access(rpath, F_OK) != 0);
+}
+#endif
\ No newline at end of file
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 4b813e9..d8c25a6 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -347,7 +347,12 @@ heap_create(const char *relname,
* with oid same as relid.
*/
if (!RelFileNumberIsValid(relfilenumber))
+ {
relfilenumber = GetNewRelFileNumber();
+ AssertRelfileNumberFileNotExists(reltablespace,
+ relfilenumber,
+ relpersistence);
+ }
}
/*
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index c25adbb..974dbc0 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -968,6 +968,9 @@ smgr_redo(XLogReaderState *record)
xl_smgr_create *xlrec = (xl_smgr_create *) XLogRecGetData(record);
SMgrRelation reln;
+ Assert(xlrec->rlocator.relNumber <=
+ ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
smgrcreate(reln, xlrec->forkNum, true);
}
@@ -981,6 +984,9 @@ smgr_redo(XLogReaderState *record)
int nforks = 0;
bool need_fsm_vacuum = false;
+ Assert(xlrec->rlocator.relNumber <=
+ ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
/*
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index c0a2ab2..d3fcf9e 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14378,6 +14378,9 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
* relfilenumber file.
*/
newrelfilenumber = GetNewRelFileNumber();
+ AssertRelfileNumberFileNotExists(newTableSpace,
+ newrelfilenumber,
+ rel->rd_rel->relpersistence);
/* Open old and new relation */
newrlocator = rel->rd_locator;
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index b452530..be6ba13 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -39,4 +39,13 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
+#ifdef USE_ASSERT_CHECKING
+extern void AssertRelfileNumberFileNotExists(Oid spcoid,
+ RelFileNumber relnumber,
+ char relpersistence);
+#else
+#define AssertRelfileNumberFileNotExists(spcoid, relnumber, relpersistence) \
+ ((void)true)
+#endif
+
#endif /* CATALOG_H */
--
1.8.3.1
v5-0002-Preliminary-refactoring-for-supporting-larger.patchapplication/x-patch; name=v5-0002-Preliminary-refactoring-for-supporting-larger.patchDownload
From da3b44626b2167f8db8e9fb9e47c6eab73c9f862 Mon Sep 17 00:00:00 2001
From: dilip kumar <dilipbalaut@localhost.localdomain>
Date: Tue, 5 Jul 2022 12:51:25 +0530
Subject: [PATCH v5 2/5] Preliminary refactoring for supporting larger
relfilenumber
Currently, relfilenumber is an Oid and can wrap around. As part of the
larger patch set we are making it 64 bits wide to avoid wraparound,
which will also make a couple of other things simpler, as explained in
the next patches.
This is a preliminary refactoring step: in BufferTag, instead of
embedding a RelFileLocator, keep the tablespace Oid, database Oid, and
relfilenumber directly, so that once relNumber in RelFileLocator grows
to 64 bits the buffer tag's alignment padding will not change.
---
contrib/pg_buffercache/pg_buffercache_pages.c | 6 +-
contrib/pg_prewarm/autoprewarm.c | 7 +-
src/backend/storage/buffer/bufmgr.c | 113 ++++++++++++++++++--------
src/backend/storage/buffer/localbuf.c | 22 +++--
src/include/storage/buf_internals.h | 43 ++++++++--
5 files changed, 137 insertions(+), 54 deletions(-)
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 713f52a..abc8813 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -153,9 +153,9 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenumber = bufHdr->tag.rlocator.relNumber;
- fctx->record[i].reltablespace = bufHdr->tag.rlocator.spcOid;
- fctx->record[i].reldatabase = bufHdr->tag.rlocator.dbOid;
+ fctx->record[i].relfilenumber = BufTagGetFileNumber(bufHdr->tag);
+ fctx->record[i].reltablespace = bufHdr->tag.spcOid;
+ fctx->record[i].reldatabase = bufHdr->tag.dbOid;
fctx->record[i].forknum = bufHdr->tag.forkNum;
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index 7f1d55c..ca80d5a 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -631,9 +631,10 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
if (buf_state & BM_TAG_VALID &&
((buf_state & BM_PERMANENT) || dump_unlogged))
{
- block_info_array[num_blocks].database = bufHdr->tag.rlocator.dbOid;
- block_info_array[num_blocks].tablespace = bufHdr->tag.rlocator.spcOid;
- block_info_array[num_blocks].filenumber = bufHdr->tag.rlocator.relNumber;
+ block_info_array[num_blocks].database = bufHdr->tag.dbOid;
+ block_info_array[num_blocks].tablespace = bufHdr->tag.spcOid;
+ block_info_array[num_blocks].filenumber =
+ BufTagGetFileNumber(bufHdr->tag);
block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 7071ff6..a2c1e81 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1647,7 +1647,7 @@ ReleaseAndReadBuffer(Buffer buffer,
{
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
+ BufTagRelFileLocatorEquals(bufHdr->tag, relation->rd_locator) &&
bufHdr->tag.forkNum == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
@@ -1658,7 +1658,7 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetBufferDescriptor(buffer - 1);
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
+ BufTagRelFileLocatorEquals(bufHdr->tag, relation->rd_locator) &&
bufHdr->tag.forkNum == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
@@ -2000,8 +2000,8 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
- item->tsId = bufHdr->tag.rlocator.spcOid;
- item->relNumber = bufHdr->tag.rlocator.relNumber;
+ item->tsId = bufHdr->tag.spcOid;
+ item->relNumber = BufTagGetFileNumber(bufHdr->tag);
item->forkNum = bufHdr->tag.forkNum;
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2692,6 +2692,7 @@ PrintBufferLeakWarning(Buffer buffer)
char *path;
BackendId backend;
uint32 buf_state;
+ RelFileLocator rlocator;
Assert(BufferIsValid(buffer));
if (BufferIsLocal(buffer))
@@ -2707,8 +2708,10 @@ PrintBufferLeakWarning(Buffer buffer)
backend = InvalidBackendId;
}
+ BufTagCopyRelFileLocator(buf->tag, rlocator);
+
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rlocator, backend, buf->tag.forkNum);
+ path = relpathbackend(rlocator, backend, buf->tag.forkNum);
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2787,7 +2790,7 @@ BufferGetTag(Buffer buffer, RelFileLocator *rlocator, ForkNumber *forknum,
bufHdr = GetBufferDescriptor(buffer - 1);
/* pinned, so OK to read tag without spinlock */
- *rlocator = bufHdr->tag.rlocator;
+ BufTagCopyRelFileLocator(bufHdr->tag, *rlocator);
*forknum = bufHdr->tag.forkNum;
*blknum = bufHdr->tag.blockNum;
}
@@ -2838,7 +2841,12 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rlocator, InvalidBackendId);
+ {
+ RelFileLocator rlocator;
+
+ BufTagCopyRelFileLocator(buf->tag, rlocator);
+ reln = smgropen(rlocator, InvalidBackendId);
+ }
TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
buf->tag.blockNum,
@@ -3141,14 +3149,14 @@ DropRelFileLocatorBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
* We could check forkNum and blockNum as well as the rlocator, but the
* incremental win from doing so seems small.
*/
- if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator))
+ if (!BufTagRelFileLocatorEquals(bufHdr->tag, rlocator.locator))
continue;
buf_state = LockBufHdr(bufHdr);
for (j = 0; j < nforks; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator) &&
+ if (BufTagRelFileLocatorEquals(bufHdr->tag, rlocator.locator) &&
bufHdr->tag.forkNum == forkNum[j] &&
bufHdr->tag.blockNum >= firstDelBlock[j])
{
@@ -3301,7 +3309,7 @@ DropRelFileLocatorsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
for (j = 0; j < n; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, locators[j]))
+ if (BufTagRelFileLocatorEquals(bufHdr->tag, locators[j]))
{
rlocator = &locators[j];
break;
@@ -3310,7 +3318,10 @@ DropRelFileLocatorsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
}
else
{
- rlocator = bsearch((const void *) &(bufHdr->tag.rlocator),
+ RelFileLocator locator;
+
+ BufTagCopyRelFileLocator(bufHdr->tag, locator);
+ rlocator = bsearch((const void *) &(locator),
locators, n, sizeof(RelFileLocator),
rlocator_comparator);
}
@@ -3320,7 +3331,7 @@ DropRelFileLocatorsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, (*rlocator)))
+ if (BufTagRelFileLocatorEquals(bufHdr->tag, (*rlocator)))
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3380,7 +3391,7 @@ FindAndDropRelFileLocatorBuffers(RelFileLocator rlocator, ForkNumber forkNum,
*/
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
+ if (BufTagRelFileLocatorEquals(bufHdr->tag, rlocator) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -3419,11 +3430,11 @@ DropDatabaseBuffers(Oid dbid)
* As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rlocator.dbOid != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rlocator.dbOid == dbid)
+ if (bufHdr->tag.dbOid == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3447,13 +3458,16 @@ PrintBufferDescs(void)
{
BufferDesc *buf = GetBufferDescriptor(i);
Buffer b = BufferDescriptorGetBuffer(buf);
+ RelFileLocator rlocator;
+
+ BufTagCopyRelFileLocator(buf->tag, rlocator);
/* theoretically we should lock the bufhdr here */
elog(LOG,
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathbackend(buf->tag.rlocator, InvalidBackendId, buf->tag.forkNum),
+ relpathbackend(rlocator, InvalidBackendId, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3473,12 +3487,16 @@ PrintPinnedBufs(void)
if (GetPrivateRefCount(b) > 0)
{
+ RelFileLocator rlocator;
+
+ BufTagCopyRelFileLocator(buf->tag, rlocator);
+
/* theoretically we should lock the bufhdr here */
elog(LOG,
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathperm(buf->tag.rlocator, buf->tag.forkNum),
+ relpathperm(rlocator, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3517,7 +3535,7 @@ FlushRelationBuffers(Relation rel)
uint32 buf_state;
bufHdr = GetLocalBufferDescriptor(i);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
+ if (BufTagRelFileLocatorEquals(bufHdr->tag, rel->rd_locator) &&
((buf_state = pg_atomic_read_u32(&bufHdr->state)) &
(BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
@@ -3564,13 +3582,13 @@ FlushRelationBuffers(Relation rel)
* As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator))
+ if (!BufTagRelFileLocatorEquals(bufHdr->tag, rel->rd_locator))
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
+ if (BufTagRelFileLocatorEquals(bufHdr->tag, rel->rd_locator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3644,7 +3662,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
for (j = 0; j < nrels; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, srels[j].rlocator))
+ if (BufTagRelFileLocatorEquals(bufHdr->tag, srels[j].rlocator))
{
srelent = &srels[j];
break;
@@ -3653,7 +3671,10 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
}
else
{
- srelent = bsearch((const void *) &(bufHdr->tag.rlocator),
+ RelFileLocator rlocator;
+
+ BufTagCopyRelFileLocator(bufHdr->tag, rlocator);
+ srelent = bsearch((const void *) &(rlocator),
srels, nrels, sizeof(SMgrSortArray),
rlocator_comparator);
}
@@ -3665,7 +3686,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, srelent->rlocator) &&
+ if (BufTagRelFileLocatorEquals(bufHdr->tag, srelent->rlocator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3867,13 +3888,13 @@ FlushDatabaseBuffers(Oid dbid)
* As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rlocator.dbOid != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rlocator.dbOid == dbid &&
+ if (bufHdr->tag.dbOid == dbid &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -4033,6 +4054,10 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
if (XLogHintBitIsNeeded() &&
(pg_atomic_read_u32(&bufHdr->state) & BM_PERMANENT))
{
+ RelFileLocator rlocator;
+
+ BufTagCopyRelFileLocator(bufHdr->tag, rlocator);
+
/*
* If we must not write WAL, due to a relfilelocator-specific
* condition or being in recovery, don't dirty the page. We can
@@ -4041,8 +4066,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
*
* See src/backend/storage/page/README for longer discussion.
*/
- if (RecoveryInProgress() ||
- RelFileLocatorSkippingWAL(bufHdr->tag.rlocator))
+ if (RecoveryInProgress() || RelFileLocatorSkippingWAL(rlocator))
return;
/*
@@ -4650,8 +4674,10 @@ AbortBufferIO(void)
{
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
+ RelFileLocator rlocator;
- path = relpathperm(buf->tag.rlocator, buf->tag.forkNum);
+ BufTagCopyRelFileLocator(buf->tag, rlocator);
+ path = relpathperm(rlocator, buf->tag.forkNum);
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4675,7 +4701,11 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rlocator, bufHdr->tag.forkNum);
+ char *path;
+ RelFileLocator rlocator;
+
+ BufTagCopyRelFileLocator(bufHdr->tag, rlocator);
+ path = relpathperm(rlocator, bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4693,8 +4723,11 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
- char *path = relpathbackend(bufHdr->tag.rlocator, MyBackendId,
- bufHdr->tag.forkNum);
+ char *path;
+ RelFileLocator rlocator;
+
+ BufTagCopyRelFileLocator(bufHdr->tag, rlocator);
+ path = relpathbackend(rlocator, MyBackendId, bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4787,9 +4820,14 @@ WaitBufHdrUnlocked(BufferDesc *buf)
static inline int
buffertag_comparator(const BufferTag *ba, const BufferTag *bb)
{
- int ret;
+ int ret;
+ RelFileLocator rlocatora;
+ RelFileLocator rlocatorb;
+
+ BufTagCopyRelFileLocator(*ba, rlocatora);
+ BufTagCopyRelFileLocator(*bb, rlocatorb);
- ret = rlocator_comparator(&ba->rlocator, &bb->rlocator);
+ ret = rlocator_comparator(&rlocatora, &rlocatorb);
if (ret != 0)
return ret;
@@ -4946,10 +4984,12 @@ IssuePendingWritebacks(WritebackContext *context)
SMgrRelation reln;
int ahead;
BufferTag tag;
+ RelFileLocator currlocator;
Size nblocks = 1;
cur = &context->pending_writebacks[i];
tag = cur->tag;
+ BufTagCopyRelFileLocator(tag, currlocator);
/*
* Peek ahead, into following writeback requests, to see if they can
@@ -4957,10 +4997,13 @@ IssuePendingWritebacks(WritebackContext *context)
*/
for (ahead = 0; i + ahead + 1 < context->nr_pending; ahead++)
{
+ RelFileLocator nextrlocator;
+
next = &context->pending_writebacks[i + ahead + 1];
+ BufTagCopyRelFileLocator(next->tag, nextrlocator);
/* different file, stop */
- if (!RelFileLocatorEquals(cur->tag.rlocator, next->tag.rlocator) ||
+ if (!RelFileLocatorEquals(currlocator, nextrlocator) ||
cur->tag.forkNum != next->tag.forkNum)
break;
@@ -4979,7 +5022,7 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rlocator, InvalidBackendId);
+ reln = smgropen(currlocator, InvalidBackendId);
smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
}
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index 3dc9cc7..ce73172 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -213,9 +213,12 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
{
SMgrRelation oreln;
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
+ RelFileLocator rlocator;
+
+ BufTagCopyRelFileLocator(bufHdr->tag, rlocator);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rlocator, MyBackendId);
+ oreln = smgropen(rlocator, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
@@ -337,16 +340,22 @@ DropRelFileLocatorLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
+ BufTagRelFileLocatorEquals(bufHdr->tag, rlocator) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
+ {
+ RelFileLocator rlocator;
+
+ BufTagCopyRelFileLocator(bufHdr->tag, rlocator);
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rlocator, MyBackendId,
+ relpathbackend(rlocator, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
+ }
+
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
hash_search(LocalBufHash, (void *) &bufHdr->tag,
@@ -383,12 +392,15 @@ DropRelFileLocatorAllLocalBuffers(RelFileLocator rlocator)
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator))
+ BufTagRelFileLocatorEquals(bufHdr->tag, rlocator))
{
+ RelFileLocator rlocator;
+
+ BufTagCopyRelFileLocator(bufHdr->tag, rlocator);
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rlocator, MyBackendId,
+ relpathbackend(rlocator, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index b85b94f..78484a9 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,34 +90,61 @@
*/
typedef struct buftag
{
- RelFileLocator rlocator; /* physical relation identifier */
- ForkNumber forkNum;
- BlockNumber blockNum; /* blknum relative to begin of reln */
+ Oid spcOid; /* tablespace oid. */
+ Oid dbOid; /* database oid. */
+ RelFileNumber relNumber; /* relation file number. */
+ ForkNumber forkNum;
+ BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
+#define BufTagGetFileNumber(a) ((a).relNumber)
+
+#define BufTagSetFileNumber(a, relnumber) \
+( \
+ (a).relNumber = (relnumber) \
+)
+
#define CLEAR_BUFFERTAG(a) \
( \
- (a).rlocator.spcOid = InvalidOid, \
- (a).rlocator.dbOid = InvalidOid, \
- (a).rlocator.relNumber = InvalidRelFileNumber, \
+ (a).spcOid = InvalidOid, \
+ (a).dbOid = InvalidOid, \
+ BufTagSetFileNumber(a, InvalidRelFileNumber), \
(a).forkNum = InvalidForkNumber, \
(a).blockNum = InvalidBlockNumber \
)
#define INIT_BUFFERTAG(a,xx_rlocator,xx_forkNum,xx_blockNum) \
( \
- (a).rlocator = (xx_rlocator), \
+ (a).spcOid = (xx_rlocator).spcOid, \
+ (a).dbOid = (xx_rlocator).dbOid, \
+ BufTagSetFileNumber(a, (xx_rlocator).relNumber), \
(a).forkNum = (xx_forkNum), \
(a).blockNum = (xx_blockNum) \
)
#define BUFFERTAGS_EQUAL(a,b) \
( \
- RelFileLocatorEquals((a).rlocator, (b).rlocator) && \
+ (a).spcOid == (b).spcOid && \
+ (a).dbOid == (b).dbOid && \
+ (a).relNumber == (b).relNumber && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
)
+#define BufTagCopyRelFileLocator(a, locator) \
+do { \
+ (locator).spcOid = (a).spcOid; \
+ (locator).dbOid = (a).dbOid; \
+ (locator).relNumber = (a).relNumber; \
+} while(0)
+
+#define BufTagRelFileLocatorEquals(a, locator) \
+( \
+ (a).spcOid == (locator).spcOid && \
+ (a).dbOid == (locator).dbOid && \
+ (a).relNumber == (locator).relNumber \
+)
+
/*
* The shared buffer mapping table is partitioned to reduce contention.
* To determine which partition lock a given tag requires, compute the tag's
--
1.8.3.1
v5-0003-Use-56-bits-for-relfilenumber-to-avoid-wraparound.patchapplication/x-patch; name=v5-0003-Use-56-bits-for-relfilenumber-to-avoid-wraparound.patchDownload
From a0068367d82b6d089c40c39374d1fa86778beb15 Mon Sep 17 00:00:00 2001
From: dilip kumar <dilipbalaut@localhost.localdomain>
Date: Tue, 5 Jul 2022 12:56:31 +0530
Subject: [PATCH v5 3/5] Use 56 bits for relfilenumber to avoid wraparound
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This patch makes the relfilenumber 56 bits wide. The problem is that
widening it naively would enlarge the BufferTag, increasing memory
usage and potentially hurting performance. To avoid that, inside the
buffer tag we pack the two values into a single 64-bit field: 8 bits
for the fork number and 56 bits for the relfilenumber.
---
contrib/pg_buffercache/Makefile | 3 +-
.../pg_buffercache/pg_buffercache--1.3--1.4.sql | 30 +++++++
contrib/pg_buffercache/pg_buffercache.control | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 31 ++++++-
contrib/pg_walinspect/expected/pg_walinspect.out | 4 +-
contrib/pg_walinspect/sql/pg_walinspect.sql | 4 +-
doc/src/sgml/catalogs.sgml | 2 +-
doc/src/sgml/pgbuffercache.sgml | 2 +-
src/backend/access/gin/ginxlog.c | 2 +-
src/backend/access/rmgrdesc/gistdesc.c | 2 +-
src/backend/access/rmgrdesc/heapdesc.c | 2 +-
src/backend/access/rmgrdesc/nbtdesc.c | 2 +-
src/backend/access/rmgrdesc/seqdesc.c | 2 +-
src/backend/access/rmgrdesc/xlogdesc.c | 21 +++--
src/backend/access/transam/README | 4 +-
src/backend/access/transam/varsup.c | 94 +++++++++++++++++++++-
src/backend/access/transam/xlog.c | 48 +++++++++++
src/backend/access/transam/xlogprefetcher.c | 24 +++---
src/backend/access/transam/xlogrecovery.c | 6 +-
src/backend/access/transam/xlogutils.c | 8 +-
src/backend/catalog/catalog.c | 93 ---------------------
src/backend/catalog/heap.c | 15 ++--
src/backend/catalog/index.c | 11 +--
src/backend/commands/tablecmds.c | 10 ++-
src/backend/nodes/outfuncs.c | 2 +-
src/backend/replication/logical/decode.c | 1 +
src/backend/replication/logical/reorderbuffer.c | 2 +-
src/backend/storage/freespace/fsmpage.c | 2 +-
src/backend/storage/lmgr/lwlocknames.txt | 1 +
src/backend/storage/smgr/md.c | 4 +
src/backend/storage/smgr/smgr.c | 2 +-
src/backend/utils/adt/dbsize.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 9 ++-
src/backend/utils/cache/relcache.c | 3 +-
src/backend/utils/cache/relfilenumbermap.c | 4 +-
src/backend/utils/misc/pg_controldata.c | 9 ++-
src/bin/pg_checksums/pg_checksums.c | 6 +-
src/bin/pg_controldata/pg_controldata.c | 2 +
src/bin/pg_dump/pg_dump.c | 20 ++---
src/bin/pg_rewind/filemap.c | 6 +-
src/bin/pg_upgrade/info.c | 11 +--
src/bin/pg_upgrade/pg_upgrade.c | 6 +-
src/bin/pg_upgrade/relfilenumber.c | 4 +-
src/bin/pg_waldump/pg_waldump.c | 2 +-
src/common/relpath.c | 20 ++---
src/fe_utils/option_utils.c | 42 ++++++++++
src/include/access/transam.h | 5 ++
src/include/access/xlog.h | 1 +
src/include/catalog/catalog.h | 3 -
src/include/catalog/pg_class.h | 16 ++--
src/include/catalog/pg_control.h | 2 +
src/include/catalog/pg_proc.dat | 10 +--
src/include/fe_utils/option_utils.h | 3 +
src/include/postgres_ext.h | 7 +-
src/include/storage/buf_internals.h | 18 +++--
src/include/storage/relfilelocator.h | 12 ++-
src/test/regress/expected/alter_table.out | 24 +++---
src/test/regress/expected/oidjoins.out | 2 +-
src/test/regress/sql/alter_table.sql | 8 +-
59 files changed, 434 insertions(+), 261 deletions(-)
create mode 100644 contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile
index 2ab8c65..2fbb62f 100644
--- a/contrib/pg_buffercache/Makefile
+++ b/contrib/pg_buffercache/Makefile
@@ -7,7 +7,8 @@ OBJS = \
EXTENSION = pg_buffercache
DATA = pg_buffercache--1.2.sql pg_buffercache--1.2--1.3.sql \
- pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql
+ pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql \
+ pg_buffercache--1.3--1.4.sql
PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time"
ifdef USE_PGXS
diff --git a/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
new file mode 100644
index 0000000..ee2d9c7
--- /dev/null
+++ b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
@@ -0,0 +1,30 @@
+/* contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.4'" to load this file. \quit
+
+/* First we have to remove them from the extension */
+ALTER EXTENSION pg_buffercache DROP VIEW pg_buffercache;
+ALTER EXTENSION pg_buffercache DROP FUNCTION pg_buffercache_pages();
+
+/* Then we can drop them */
+DROP VIEW pg_buffercache;
+DROP FUNCTION pg_buffercache_pages();
+
+/* Now redefine */
+CREATE OR REPLACE FUNCTION pg_buffercache_pages()
+RETURNS SETOF RECORD
+AS 'MODULE_PATHNAME', 'pg_buffercache_pages_v1_4'
+LANGUAGE C PARALLEL SAFE;
+
+CREATE OR REPLACE VIEW pg_buffercache AS
+ SELECT P.* FROM pg_buffercache_pages() AS P
+ (bufferid integer, relfilenode int8, reltablespace oid, reldatabase oid,
+ relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
+ pinning_backends int4);
+
+-- Don't want these to be available to public.
+REVOKE ALL ON FUNCTION pg_buffercache_pages() FROM PUBLIC;
+REVOKE ALL ON pg_buffercache FROM PUBLIC;
+GRANT EXECUTE ON FUNCTION pg_buffercache_pages() TO pg_monitor;
+GRANT SELECT ON pg_buffercache TO pg_monitor;
diff --git a/contrib/pg_buffercache/pg_buffercache.control b/contrib/pg_buffercache/pg_buffercache.control
index 8c060ae..a82ae5f 100644
--- a/contrib/pg_buffercache/pg_buffercache.control
+++ b/contrib/pg_buffercache/pg_buffercache.control
@@ -1,5 +1,5 @@
# pg_buffercache extension
comment = 'examine the shared buffer cache'
-default_version = '1.3'
+default_version = '1.4'
module_pathname = '$libdir/pg_buffercache'
relocatable = true
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index abc8813..4e3884b 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -59,9 +59,10 @@ typedef struct
* relation node/tablespace/database/blocknum and dirty indicator.
*/
PG_FUNCTION_INFO_V1(pg_buffercache_pages);
+PG_FUNCTION_INFO_V1(pg_buffercache_pages_v1_4);
-Datum
-pg_buffercache_pages(PG_FUNCTION_ARGS)
+static Datum
+pg_buffercache_pages_internal(PG_FUNCTION_ARGS, Oid rfn_typid)
{
FuncCallContext *funcctx;
Datum result;
@@ -103,7 +104,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
TupleDescInitEntry(tupledesc, (AttrNumber) 1, "bufferid",
INT4OID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 2, "relfilenode",
- OIDOID, -1, 0);
+ rfn_typid, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 3, "reltablespace",
OIDOID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 4, "reldatabase",
@@ -209,7 +210,16 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
}
else
{
- values[1] = ObjectIdGetDatum(fctx->record[i].relfilenumber);
+ if (rfn_typid == INT8OID)
+ values[1] =
+ Int64GetDatum((int64) fctx->record[i].relfilenumber);
+ else
+ {
+ Assert(rfn_typid == OIDOID);
+ values[1] =
+ ObjectIdGetDatum((Oid) fctx->record[i].relfilenumber);
+ }
+
nulls[1] = false;
values[2] = ObjectIdGetDatum(fctx->record[i].reltablespace);
nulls[2] = false;
@@ -237,3 +247,16 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
else
SRF_RETURN_DONE(funcctx);
}
+
+Datum
+pg_buffercache_pages(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, OIDOID);
+}
+
+/* entry point for the new (1.4) extension version */
+Datum
+pg_buffercache_pages_v1_4(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, INT8OID);
+}
diff --git a/contrib/pg_walinspect/expected/pg_walinspect.out b/contrib/pg_walinspect/expected/pg_walinspect.out
index a1ee743..e9b06ed 100644
--- a/contrib/pg_walinspect/expected/pg_walinspect.out
+++ b/contrib/pg_walinspect/expected/pg_walinspect.out
@@ -54,9 +54,9 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- ===================================================================
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
ok
----
t
diff --git a/contrib/pg_walinspect/sql/pg_walinspect.sql b/contrib/pg_walinspect/sql/pg_walinspect.sql
index 1b265ea..5393834 100644
--- a/contrib/pg_walinspect/sql/pg_walinspect.sql
+++ b/contrib/pg_walinspect/sql/pg_walinspect.sql
@@ -39,10 +39,10 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
-- ===================================================================
-- Test for filtering out WAL records based on resource_manager and
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index ef7ae7e..118edd1 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1965,7 +1965,7 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
</para>
<para>
Name of the on-disk file of this relation; zero means this
diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
index a06fd3e..e222265 100644
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@@ -62,7 +62,7 @@
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
(references <link linkend="catalog-pg-class"><structname>pg_class</structname></link>.<structfield>relfilenode</structfield>)
</para>
<para>
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 41b9211..b75ad79 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -100,7 +100,7 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
BlockNumber blknum;
BufferGetTag(buffer, &locator, &forknum, &blknum);
- elog(ERROR, "failed to add item to index page in %u/%u/%u",
+ elog(ERROR, "failed to add item to index page in %u/%u/" INT64_FORMAT,
locator.spcOid, locator.dbOid, locator.relNumber);
}
}
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index 7dd3c1d..c699937 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -26,7 +26,7 @@ out_gistxlogPageUpdate(StringInfo buf, gistxlogPageUpdate *xlrec)
static void
out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec)
{
- appendStringInfo(buf, "rel %u/%u/%u; blk %u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; blk %u; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber, xlrec->block,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 923d3bc..f6d278b 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -169,7 +169,7 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
{
xl_heap_new_cid *xlrec = (xl_heap_new_cid *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; tid %u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; tid %u/%u",
xlrec->target_locator.spcOid,
xlrec->target_locator.dbOid,
xlrec->target_locator.relNumber,
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index 4843cd5..70feb2d 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -100,7 +100,7 @@ btree_desc(StringInfo buf, XLogReaderState *record)
{
xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/seqdesc.c b/src/backend/access/rmgrdesc/seqdesc.c
index b3845f9..45c6ee7 100644
--- a/src/backend/access/rmgrdesc/seqdesc.c
+++ b/src/backend/access/rmgrdesc/seqdesc.c
@@ -25,7 +25,7 @@ seq_desc(StringInfo buf, XLogReaderState *record)
xl_seq_rec *xlrec = (xl_seq_rec *) rec;
if (info == XLOG_SEQ_LOG)
- appendStringInfo(buf, "rel %u/%u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT,
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber);
}
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 6fec485..e21559d 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -45,8 +45,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
CheckPoint *checkpoint = (CheckPoint *) rec;
appendStringInfo(buf, "redo %X/%X; "
- "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; "
- "oldest xid %u in DB %u; oldest multi %u in DB %u; "
+ "tli %u; prev tli %u; fpw %s; xid %u:%u; relfilenumber " INT64_FORMAT "; oid %u; "
+ "multi %u; offset %u; oldest xid %u in DB %u; oldest multi %u in DB %u; "
"oldest/newest commit timestamp xid: %u/%u; "
"oldest running xid %u; %s",
LSN_FORMAT_ARGS(checkpoint->redo),
@@ -55,6 +55,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->fullPageWrites ? "true" : "false",
EpochFromFullTransactionId(checkpoint->nextXid),
XidFromFullTransactionId(checkpoint->nextXid),
+ checkpoint->nextRelFileNumber,
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
@@ -74,6 +75,13 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
memcpy(&nextOid, rec, sizeof(Oid));
appendStringInfo(buf, "%u", nextOid);
}
+ else if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, rec, sizeof(RelFileNumber));
+ appendStringInfo(buf, INT64_FORMAT, nextRelFileNumber);
+ }
else if (info == XLOG_RESTORE_POINT)
{
xl_restore_point *xlrec = (xl_restore_point *) rec;
@@ -169,6 +177,9 @@ xlog_identify(uint8 info)
case XLOG_NEXTOID:
id = "NEXTOID";
break;
+ case XLOG_NEXT_RELFILENUMBER:
+ id = "NEXT_RELFILENUMBER";
+ break;
case XLOG_SWITCH:
id = "SWITCH";
break;
@@ -237,7 +248,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfoChar(buf, ' ');
appendStringInfo(buf,
- "blkref #%d: rel %u/%u/%u fork %s blk %u",
+ "blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -297,7 +308,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
if (forknum != MAIN_FORKNUM)
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u fork %s blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -306,7 +317,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
else
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
blk);
diff --git a/src/backend/access/transam/README b/src/backend/access/transam/README
index 734c39a..18ee70d 100644
--- a/src/backend/access/transam/README
+++ b/src/backend/access/transam/README
@@ -692,8 +692,8 @@ by having database restart search for files that don't have any committed
entry in pg_class, but that currently isn't done because of the possibility
of deleting data that is useful for forensic analysis of the crash.
Orphan files are harmless --- at worst they waste a bit of disk space ---
-because we check for on-disk collisions when allocating new relfilenumber
-OIDs. So cleaning up isn't really necessary.
+because relfilenumbers are 56 bits wide, so collisions are not possible in
+practice.  So cleaning up isn't really necessary.
3. Deleting a table, which requires an unlink() that could fail.
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 849a7ce..1421198 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -30,6 +30,9 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+/* Number of RelFileNumbers to prefetch (preallocate) per XLOG write */
+#define VAR_RELNUMBER_PREFETCH 64
+
/* pointer to "variable cache" in shared memory (set up by shmem.c) */
VariableCache ShmemVariableCache = NULL;
@@ -521,8 +524,7 @@ ForceTransactionIdLimitUpdate(void)
* wide, counter wraparound will occur eventually, and therefore it is unwise
* to assume they are unique unless precautions are taken to make them so.
* Hence, this routine should generally not be used directly. The only direct
- * callers should be GetNewOidWithIndex() and GetNewRelFileNumber() in
- * catalog/catalog.c.
+ * caller should be GetNewOidWithIndex() in catalog/catalog.c.
*/
Oid
GetNewObjectId(void)
@@ -613,6 +615,94 @@ SetNextObjectId(Oid nextOid)
}
/*
+ * GetNewRelFileNumber
+ *
+ * Similar to GetNewObjectId(), but generates a new relfilenumber instead
+ * of a new OID.
+ */
+RelFileNumber
+GetNewRelFileNumber(void)
+{
+ RelFileNumber result;
+
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot assign RelFileNumber during recovery");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /* check for wraparound of the relfilenumber counter */
+ if (unlikely(ShmemVariableCache->nextRelFileNumber > MAX_RELFILENUMBER))
+ elog(ERROR, "relfilenumber is out of bounds");
+
+ /* if we have run out of logged RelFileNumbers, log some more */
+ if (ShmemVariableCache->relnumbercount == 0)
+ {
+ LogNextRelFileNumber(ShmemVariableCache->nextRelFileNumber +
+ VAR_RELNUMBER_PREFETCH);
+
+ ShmemVariableCache->relnumbercount = VAR_RELNUMBER_PREFETCH;
+ }
+
+ result = ShmemVariableCache->nextRelFileNumber;
+ (ShmemVariableCache->nextRelFileNumber)++;
+ (ShmemVariableCache->relnumbercount)--;
+
+ LWLockRelease(RelFileNumberGenLock);
+
+ return result;
+}
+
+/*
+ * SetNextRelFileNumber
+ *
+ * This may only be called during pg_upgrade; it advances the RelFileNumber
+ * counter to the specified value if the current value is smaller than the
+ * input value.
+ */
+void
+SetNextRelFileNumber(RelFileNumber relnumber)
+{
+ int relnumbercount;
+
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot forward RelFileNumber during recovery");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /*
+ * If the previously assigned value of nextRelFileNumber is already
+ * higher than the requested value, there is nothing to do. This is
+ * possible because during a binary upgrade the relfilenumbers of
+ * objects can arrive in any order.
+ */
+ if (relnumber <= ShmemVariableCache->nextRelFileNumber)
+ {
+ LWLockRelease(RelFileNumberGenLock);
+ return;
+ }
+
+ /*
+ * If advancing to the new relfilenumber would exhaust the range already
+ * WAL-logged, log a new range; otherwise, just reduce relnumbercount by
+ * the number of values we are skipping over.
+ */
+ relnumbercount = relnumber - ShmemVariableCache->nextRelFileNumber;
+ if (ShmemVariableCache->relnumbercount <= relnumbercount)
+ {
+ LogNextRelFileNumber(relnumber + VAR_RELNUMBER_PREFETCH);
+ ShmemVariableCache->relnumbercount = VAR_RELNUMBER_PREFETCH;
+ }
+ else
+ ShmemVariableCache->relnumbercount -= relnumbercount;
+
+ ShmemVariableCache->nextRelFileNumber = relnumber;
+
+ LWLockRelease(RelFileNumberGenLock);
+}
+
+/*
* StopGeneratingPinnedObjectIds
*
* This is called once during initdb to force the OID counter up to
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 1b2f240..1fbb5af 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4543,6 +4543,7 @@ BootStrapXLOG(void)
checkPoint.nextXid =
FullTransactionIdFromEpochAndXid(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstGenbkiObjectId;
+ checkPoint.nextRelFileNumber = FirstNormalRelFileNumber;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
checkPoint.oldestXid = FirstNormalTransactionId;
@@ -4556,7 +4557,9 @@ BootStrapXLOG(void)
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->relnumbercount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -5023,7 +5026,9 @@ StartupXLOG(void)
/* initialize shared memory variables from the checkpoint record */
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->relnumbercount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -6472,6 +6477,12 @@ CreateCheckPoint(int flags)
checkPoint.nextOid += ShmemVariableCache->oidCount;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_SHARED);
+ checkPoint.nextRelFileNumber = ShmemVariableCache->nextRelFileNumber;
+ if (!shutdown)
+ checkPoint.nextRelFileNumber += ShmemVariableCache->relnumbercount;
+ LWLockRelease(RelFileNumberGenLock);
+
MultiXactGetCheckptMulti(shutdown,
&checkPoint.nextMulti,
&checkPoint.nextMultiOffset,
@@ -7350,6 +7361,29 @@ XLogPutNextOid(Oid nextOid)
}
/*
+ * Similar to XLogPutNextOid(), but writes a NEXT_RELFILENUMBER log record
+ * instead of a NEXTOID log record.
+ */
+void
+LogNextRelFileNumber(RelFileNumber nextrelnumber)
+{
+ XLogRecPtr recptr;
+
+ XLogBeginInsert();
+ XLogRegisterData((char *) (&nextrelnumber), sizeof(RelFileNumber));
+ recptr = XLogInsert(RM_XLOG_ID, XLOG_NEXT_RELFILENUMBER);
+
+ /*
+ * Flush the xlog record to disk before returning, to protect against file
+ * system changes reaching disk ahead of the XLOG_NEXT_RELFILENUMBER record.
+ *
+ * This should not hurt performance, because we WAL-log a new range only
+ * once per VAR_RELNUMBER_PREFETCH (64) assignments.
+ */
+ XLogFlush(recptr);
+}
+
+/*
* Write an XLOG SWITCH record.
*
* Here we just blindly issue an XLogInsert request for the record.
@@ -7564,6 +7598,16 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
}
+ else if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, XLogRecGetData(record), sizeof(RelFileNumber));
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = nextRelFileNumber;
+ ShmemVariableCache->relnumbercount = 0;
+ LWLockRelease(RelFileNumberGenLock);
+ }
else if (info == XLOG_CHECKPOINT_SHUTDOWN)
{
CheckPoint checkPoint;
@@ -7578,6 +7622,10 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->relnumbercount = 0;
+ LWLockRelease(RelFileNumberGenLock);
MultiXactSetNextMXact(checkPoint.nextMulti,
checkPoint.nextMultiOffset);
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index c469610..6ce6d29 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -573,9 +573,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
/*
* Don't try to prefetch anything in this database until
- * it has been created, or we might confuse the blocks of
- * different generations, if a database OID or
- * relfilenumber is reused. It's also more efficient than
+ * it has been created; this is more efficient than
* discovering that relations don't exist on disk yet with
* ENOENT errors.
*/
@@ -600,10 +598,8 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
/*
* Don't prefetch anything for this whole relation
- * until it has been created. Otherwise we might
- * confuse the blocks of different generations, if a
- * relfilenumber is reused. This also avoids the need
- * to discover the problem via extra syscalls that
+ * until it has been created. This avoids the need to
+ * discover the problem via extra syscalls that
* report ENOENT.
*/
XLogPrefetcherAddFilter(prefetcher, xlrec->rlocator, 0,
@@ -611,7 +607,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u until %X/%X is replayed, which creates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, which creates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -634,7 +630,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, which truncates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, which truncates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -733,7 +729,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing all prefetch in relation %u/%u/%u until %X/%X is replayed, because the relation does not exist on disk",
+ "suppressing all prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, because the relation does not exist on disk",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -754,7 +750,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, because the relation is too small",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, because the relation is too small",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -793,7 +789,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* truncated beneath our feet?
*/
elog(ERROR,
- "could not prefetch relation %u/%u/%u block %u",
+ "could not prefetch relation %u/%u/" INT64_FORMAT " block %u",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -931,7 +927,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed),
filter->filter_from_block);
@@ -947,7 +943,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed));
#endif
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 5d6f1b5..1d95fc0 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2175,14 +2175,14 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
continue;
if (forknum != MAIN_FORKNUM)
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, fork %u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", fork %u, blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
forknum,
blk);
else
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
@@ -2378,7 +2378,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
if (memcmp(replay_image_masked, primary_image_masked, BLCKSZ) != 0)
{
elog(FATAL,
- "inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
+ "inconsistent page found, rel %u/%u/" INT64_FORMAT ", forknum %u, blkno %u",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forknum, blkno);
}
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 0cda225..8a56e8d 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -617,17 +617,17 @@ CreateFakeRelcacheEntry(RelFileLocator rlocator)
rel->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
/* We don't know the name of the relation; use relfilenumber instead */
- sprintf(RelationGetRelationName(rel), "%u", rlocator.relNumber);
+ sprintf(RelationGetRelationName(rel), INT64_FORMAT, rlocator.relNumber);
/*
* We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNumber may be
- * different from the relation's OID. It shouldn't really matter though.
+ * relation. Note that we set relId to FirstNormalObjectId, which is
+ * completely bogus. It shouldn't really matter though.
* In recovery, we are running by ourselves and can't have any lock
* conflicts. While syncing, we already hold AccessExclusiveLock.
*/
rel->rd_lockInfo.lockRelId.dbId = rlocator.dbOid;
- rel->rd_lockInfo.lockRelId.relId = rlocator.relNumber;
+ rel->rd_lockInfo.lockRelId.relId = FirstNormalObjectId;
rel->rd_smgr = NULL;
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 2a33273..155400c 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -481,99 +481,6 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
}
/*
- * GetNewRelFileNumber
- * Generate a new relfilenumber that is unique within the
- * database of the given tablespace.
- *
- * If the relfilenumber will also be used as the relation's OID, pass the
- * opened pg_class catalog, and this routine will guarantee that the result
- * is also an unused OID within pg_class. If the result is to be used only
- * as a relfilenumber for an existing relation, pass NULL for pg_class.
- *
- * As with GetNewOidWithIndex(), there is some theoretical risk of a race
- * condition, but it doesn't seem worth worrying about.
- *
- * Note: we don't support using this in bootstrap mode. All relations
- * created by bootstrap have preassigned OIDs, so there's no need.
- */
-RelFileNumber
-GetNewRelFileNumber(Oid reltablespace, Relation pg_class, char relpersistence)
-{
- RelFileLocatorBackend rlocator;
- char *rpath;
- bool collides;
- BackendId backend;
-
- /*
- * If we ever get here during pg_upgrade, there's something wrong; all
- * relfilenumber assignments during a binary-upgrade run should be
- * determined by commands in the dump script.
- */
- Assert(!IsBinaryUpgrade);
-
- switch (relpersistence)
- {
- case RELPERSISTENCE_TEMP:
- backend = BackendIdForTempRelations();
- break;
- case RELPERSISTENCE_UNLOGGED:
- case RELPERSISTENCE_PERMANENT:
- backend = InvalidBackendId;
- break;
- default:
- elog(ERROR, "invalid relpersistence: %c", relpersistence);
- return InvalidOid; /* placate compiler */
- }
-
- /* This logic should match RelationInitPhysicalAddr */
- rlocator.locator.spcOid = reltablespace ? reltablespace : MyDatabaseTableSpace;
- rlocator.locator.dbOid = (rlocator.locator.spcOid == GLOBALTABLESPACE_OID) ? InvalidOid : MyDatabaseId;
-
- /*
- * The relpath will vary based on the backend ID, so we must initialize
- * that properly here to make sure that any collisions based on filename
- * are properly detected.
- */
- rlocator.backend = backend;
-
- do
- {
- CHECK_FOR_INTERRUPTS();
-
- /* Generate the OID */
- if (pg_class)
- rlocator.locator.relNumber = GetNewOidWithIndex(pg_class, ClassOidIndexId,
- Anum_pg_class_oid);
- else
- rlocator.locator.relNumber = GetNewObjectId();
-
- /* Check for existing file of same name */
- rpath = relpath(rlocator, MAIN_FORKNUM);
-
- if (access(rpath, F_OK) == 0)
- {
- /* definite collision */
- collides = true;
- }
- else
- {
- /*
- * Here we have a little bit of a dilemma: if errno is something
- * other than ENOENT, should we declare a collision and loop? In
- * practice it seems best to go ahead regardless of the errno. If
- * there is a colliding file we will get an smgr failure when we
- * attempt to create the new relation file.
- */
- collides = false;
- }
-
- pfree(rpath);
- } while (collides);
-
- return rlocator.locator.relNumber;
-}
-
-/*
* SQL callable interface for GetNewOidWithIndex(). Outside of initdb's
* direct insertions into catalog tables, and recovering from corruption, this
* should rarely be needed.
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index c69c923..4b813e9 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -347,7 +347,7 @@ heap_create(const char *relname,
* with oid same as relid.
*/
if (!RelFileNumberIsValid(relfilenumber))
- relfilenumber = relid;
+ relfilenumber = GetNewRelFileNumber();
}
/*
@@ -900,7 +900,7 @@ InsertPgClassTuple(Relation pg_class_desc,
values[Anum_pg_class_reloftype - 1] = ObjectIdGetDatum(rd_rel->reloftype);
values[Anum_pg_class_relowner - 1] = ObjectIdGetDatum(rd_rel->relowner);
values[Anum_pg_class_relam - 1] = ObjectIdGetDatum(rd_rel->relam);
- values[Anum_pg_class_relfilenode - 1] = ObjectIdGetDatum(rd_rel->relfilenode);
+ values[Anum_pg_class_relfilenode - 1] = Int64GetDatum(rd_rel->relfilenode);
values[Anum_pg_class_reltablespace - 1] = ObjectIdGetDatum(rd_rel->reltablespace);
values[Anum_pg_class_relpages - 1] = Int32GetDatum(rd_rel->relpages);
values[Anum_pg_class_reltuples - 1] = Float4GetDatum(rd_rel->reltuples);
@@ -1172,12 +1172,7 @@ heap_create_with_catalog(const char *relname,
if (shared_relation && reltablespace != GLOBALTABLESPACE_OID)
elog(ERROR, "shared relations must be placed in pg_global tablespace");
- /*
- * Allocate an OID for the relation, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the relation, unless we were told what to use. */
if (!OidIsValid(relid))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -1231,8 +1226,8 @@ heap_create_with_catalog(const char *relname,
}
if (!OidIsValid(relid))
- relid = GetNewRelFileNumber(reltablespace, pg_class_desc,
- relpersistence);
+ relid = GetNewOidWithIndex(pg_class_desc, ClassOidIndexId,
+ Anum_pg_class_oid);
}
/*
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 3dc535e..68dabeb 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -900,12 +900,7 @@ index_create(Relation heapRelation,
collationObjectId,
classObjectId);
- /*
- * Allocate an OID for the index, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the index, unless we were told what to use. */
if (!OidIsValid(indexRelationId))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -937,8 +932,8 @@ index_create(Relation heapRelation,
}
else
{
- indexRelationId =
- GetNewRelFileNumber(tableSpaceId, pg_class, relpersistence);
+ indexRelationId = GetNewOidWithIndex(pg_class, ClassOidIndexId,
+ Anum_pg_class_oid);
}
}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 1249c89..c0a2ab2 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14371,11 +14371,13 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
}
/*
- * Relfilenumbers are not unique in databases across tablespaces, so we need
- * to allocate a new one in the new tablespace.
+ * Generate a new relfilenumber. Although relfilenumbers are unique within
+ * a cluster, we cannot reuse the old one, because the old file is not
+ * unlinked until commit. If the relation were moved back to the old
+ * tablespace within the same transaction, the new file would collide
+ * with the not-yet-unlinked old one.
*/
- newrelfilenumber = GetNewRelFileNumber(newTableSpace, NULL,
- rel->rd_rel->relpersistence);
+ newrelfilenumber = GetNewRelFileNumber();
/* Open old and new relation */
newrlocator = rel->rd_locator;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 05f27f0..c30fca2 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2932,7 +2932,7 @@ _outIndexStmt(StringInfo str, const IndexStmt *node)
WRITE_NODE_FIELD(excludeOpNames);
WRITE_STRING_FIELD(idxcomment);
WRITE_OID_FIELD(indexOid);
- WRITE_OID_FIELD(oldNumber);
+ WRITE_UINT64_FIELD(oldNumber);
WRITE_UINT_FIELD(oldCreateSubid);
WRITE_UINT_FIELD(oldFirstRelfilelocatorSubid);
WRITE_BOOL_FIELD(unique);
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index c5c6a2b..7029604 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -154,6 +154,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
break;
case XLOG_NOOP:
case XLOG_NEXTOID:
+ case XLOG_NEXT_RELFILENUMBER:
case XLOG_SWITCH:
case XLOG_BACKUP_END:
case XLOG_PARAMETER_CHANGE:
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index f8fb228..4366ae6 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -4869,7 +4869,7 @@ DisplayMapping(HTAB *tuplecid_data)
hash_seq_init(&hstat, tuplecid_data);
while ((ent = (ReorderBufferTupleCidEnt *) hash_seq_search(&hstat)) != NULL)
{
- elog(DEBUG3, "mapping: node: %u/%u/%u tid: %u/%u cmin: %u, cmax: %u",
+ elog(DEBUG3, "mapping: node: %u/%u/" INT64_FORMAT " tid: %u/%u cmin: %u, cmax: %u",
ent->key.rlocator.dbOid,
ent->key.rlocator.spcOid,
ent->key.rlocator.relNumber,
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index af4dab7..172225b 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -273,7 +273,7 @@ restart:
BlockNumber blknum;
BufferGetTag(buf, &rlocator, &forknum, &blknum);
- elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
+ elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/" INT64_FORMAT,
blknum, rlocator.spcOid, rlocator.dbOid, rlocator.relNumber);
/* make sure we hold an exclusive lock */
diff --git a/src/backend/storage/lmgr/lwlocknames.txt b/src/backend/storage/lmgr/lwlocknames.txt
index 6c7cf6c..b64dbe7 100644
--- a/src/backend/storage/lmgr/lwlocknames.txt
+++ b/src/backend/storage/lmgr/lwlocknames.txt
@@ -53,3 +53,4 @@ XactTruncationLock 44
# 45 was XactTruncationLock until removal of BackendRandomLock
WrapLimitsVacuumLock 46
NotifyQueueTailLock 47
+RelFileNumberGenLock 48
\ No newline at end of file
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 3998296..aaf8881 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -257,6 +257,10 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* next checkpoint, we prevent reassignment of the relfilenumber until it's
* safe, because relfilenumber assignment skips over any existing file.
*
+ * XXX All of the above was true when relfilenumbers were 32 bits wide, but
+ * now that they are 56 bits wide there is no risk of a relfilenumber being
+ * reused, so in the future we could immediately unlink the first segment too.
+ *
* We do not need to go through this dance for temp relations, though, because
* we never make WAL entries for temp rels, and so a temp rel poses no threat
* to the health of a regular rel that has taken over its relfilenumber.
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index b21d8c3..5f6c12a 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -154,7 +154,7 @@ smgropen(RelFileLocator rlocator, BackendId backend)
/* First time through: initialize the hash table */
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileLocatorBackend);
+ ctl.keysize = SizeOfRelFileLocatorBackend;
ctl.entrysize = sizeof(SMgrRelationData);
SMgrRelationHash = hash_create("smgr relation table", 400,
&ctl, HASH_ELEM | HASH_BLOBS);
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 36ec845..65f76ce 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -878,7 +878,7 @@ pg_relation_filenode(PG_FUNCTION_ARGS)
if (!RelFileNumberIsValid(result))
PG_RETURN_NULL();
- PG_RETURN_OID(result);
+ PG_RETURN_INT64(result);
}
/*
@@ -898,7 +898,7 @@ Datum
pg_filenode_relation(PG_FUNCTION_ARGS)
{
Oid reltablespace = PG_GETARG_OID(0);
- RelFileNumber relfilenumber = PG_GETARG_OID(1);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(1);
Oid heaprel;
/* test needed so RelidByRelfilenumber doesn't misbehave */
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index c260c97..291dff0 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -98,10 +98,11 @@ binary_upgrade_set_next_heap_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_heap_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
binary_upgrade_next_heap_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -120,10 +121,11 @@ binary_upgrade_set_next_index_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_index_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
binary_upgrade_next_index_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -142,10 +144,11 @@ binary_upgrade_set_next_toast_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_toast_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
binary_upgrade_next_toast_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 0639875..a1c159b 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -3708,8 +3708,7 @@ RelationSetNewRelfilenumber(Relation relation, char persistence)
RelFileLocator newrlocator;
/* Allocate a new relfilenumber */
- newrelfilenumber = GetNewRelFileNumber(relation->rd_rel->reltablespace,
- NULL, persistence);
+ newrelfilenumber = GetNewRelFileNumber();
/*
* Get a writable copy of the pg_class tuple for the given relation.
diff --git a/src/backend/utils/cache/relfilenumbermap.c b/src/backend/utils/cache/relfilenumbermap.c
index 3dc45e9..a5ec78c 100644
--- a/src/backend/utils/cache/relfilenumbermap.c
+++ b/src/backend/utils/cache/relfilenumbermap.c
@@ -196,7 +196,7 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
/* set scan arguments */
skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
- skey[1].sk_argument = ObjectIdGetDatum(relfilenumber);
+ skey[1].sk_argument = Int64GetDatum(relfilenumber);
scandesc = systable_beginscan(relation,
ClassTblspcRelfilenodeIndexId,
@@ -213,7 +213,7 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
if (found)
elog(ERROR,
- "unexpected duplicate for tablespace %u, relfilenumber %u",
+ "unexpected duplicate for tablespace %u, relfilenumber " INT64_FORMAT,
reltablespace, relfilenumber);
found = true;
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index 781f8b8..3c1fef4 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -79,8 +79,8 @@ pg_control_system(PG_FUNCTION_ARGS)
Datum
pg_control_checkpoint(PG_FUNCTION_ARGS)
{
- Datum values[18];
- bool nulls[18];
+ Datum values[19];
+ bool nulls[19];
TupleDesc tupdesc;
HeapTuple htup;
ControlFileData *ControlFile;
@@ -129,6 +129,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
XIDOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 18, "checkpoint_time",
TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 19, "next_relfilenumber",
+ INT8OID, -1, 0);
tupdesc = BlessTupleDesc(tupdesc);
/* Read the control file. */
@@ -202,6 +204,9 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
values[17] = TimestampTzGetDatum(time_t_to_timestamptz(ControlFile->checkPointCopy.time));
nulls[17] = false;
+ values[18] = Int64GetDatum(ControlFile->checkPointCopy.nextRelFileNumber);
+ nulls[18] = false;
+
htup = heap_form_tuple(tupdesc, values, nulls);
PG_RETURN_DATUM(HeapTupleGetDatum(htup));
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 21dfe1b..65fc623 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -489,9 +489,9 @@ main(int argc, char *argv[])
mode = PG_MODE_ENABLE;
break;
case 'f':
- if (!option_parse_int(optarg, "-f/--filenode", 0,
- INT_MAX,
- NULL))
+ if (!option_parse_int64(optarg, "-f/--filenode", 0,
+ LLONG_MAX,
+ NULL))
exit(1);
only_filenode = pstrdup(optarg);
break;
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index c390ec5..f727078 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -250,6 +250,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
EpochFromFullTransactionId(ControlFile->checkPointCopy.nextXid),
XidFromFullTransactionId(ControlFile->checkPointCopy.nextXid));
+ printf(_("Latest checkpoint's NextRelFileNumber: " INT64_FORMAT "\n"),
+ ControlFile->checkPointCopy.nextRelFileNumber);
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 6b90e7c..54861d5 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4835,16 +4835,16 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
relkind = *PQgetvalue(upgrade_res, 0, PQfnumber(upgrade_res, "relkind"));
- relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "relfilenode")));
+ relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "relfilenode")));
toast_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "reltoastrelid")));
- toast_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_relfilenode")));
+ toast_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_relfilenode")));
toast_index_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "indexrelid")));
- toast_index_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_index_relfilenode")));
+ toast_index_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_index_relfilenode")));
appendPQExpBufferStr(upgrade_buffer,
"\n-- For binary upgrade, must preserve pg_class oids and relfilenodes\n");
@@ -4862,7 +4862,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
*/
if (RelFileNumberIsValid(relfilenumber) && relkind != RELKIND_PARTITIONED_TABLE)
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
/*
@@ -4876,7 +4876,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_toast_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_relfilenumber);
/* every toast table has an index */
@@ -4884,7 +4884,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_index_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_index_relfilenumber);
}
@@ -4897,7 +4897,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
pg_class_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
}
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 269ed64..8be5e66 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -538,7 +538,7 @@ isRelDataFile(const char *path)
segNo = 0;
matched = false;
- nmatch = sscanf(path, "global/%u.%u", &rlocator.relNumber, &segNo);
+ nmatch = sscanf(path, "global/" INT64_FORMAT ".%u", &rlocator.relNumber, &segNo);
if (nmatch == 1 || nmatch == 2)
{
rlocator.spcOid = GLOBALTABLESPACE_OID;
@@ -547,7 +547,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "base/%u/%u.%u",
+ nmatch = sscanf(path, "base/%u/" INT64_FORMAT ".%u",
&rlocator.dbOid, &rlocator.relNumber, &segNo);
if (nmatch == 2 || nmatch == 3)
{
@@ -556,7 +556,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/%u.%u",
+ nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/" INT64_FORMAT ".%u",
&rlocator.spcOid, &rlocator.dbOid, &rlocator.relNumber,
&segNo);
if (nmatch == 3 || nmatch == 4)
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 5d30b87..ea62e7d 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -399,11 +399,11 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
i_reloid,
i_indtable,
i_toastheap,
- i_relfilenumber,
i_reltablespace;
- char query[QUERY_ALLOC];
- char *last_namespace = NULL,
- *last_tablespace = NULL;
+ RelFileNumber i_relfilenumber;
+ char query[QUERY_ALLOC];
+ char *last_namespace = NULL,
+ *last_tablespace = NULL;
query[0] = '\0'; /* initialize query string to empty */
@@ -527,7 +527,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
relname = PQgetvalue(res, relnum, i_relname);
curr->relname = pg_strdup(relname);
- curr->relfilenumber = atooid(PQgetvalue(res, relnum, i_relfilenumber));
+ curr->relfilenumber =
+ atorelnumber(PQgetvalue(res, relnum, i_relfilenumber));
curr->tblsp_alloc = false;
/* Is the tablespace oid non-default? */
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index 265d829..4c4f03a 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -15,10 +15,8 @@
* oids are the same between old and new clusters. This is important
* because toast oids are stored as toast pointers in user tables.
*
- * While pg_class.oid and pg_class.relfilenode are initially the same in a
- * cluster, they can diverge due to CLUSTER, REINDEX, or VACUUM FULL. We
- * control assignments of pg_class.relfilenode because we want the filenames
- * to match between the old and new cluster.
+ * We control assignments of pg_class.relfilenode because we want the
+ * filenames to match between the old and new cluster.
*
* We control assignment of pg_tablespace.oid because we want the oid to match
* between the old and new cluster.
diff --git a/src/bin/pg_upgrade/relfilenumber.c b/src/bin/pg_upgrade/relfilenumber.c
index b3ad820..50e94df 100644
--- a/src/bin/pg_upgrade/relfilenumber.c
+++ b/src/bin/pg_upgrade/relfilenumber.c
@@ -190,14 +190,14 @@ transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_fro
else
snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
- snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
+ snprintf(old_file, sizeof(old_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->old_tablespace,
map->old_tablespace_suffix,
map->db_oid,
map->relfilenumber,
type_suffix,
extent_suffix);
- snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
+ snprintf(new_file, sizeof(new_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->new_tablespace,
map->new_tablespace_suffix,
map->db_oid,
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 6528113..9aca583 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -884,7 +884,7 @@ main(int argc, char **argv)
}
break;
case 'R':
- if (sscanf(optarg, "%u/%u/%u",
+ if (sscanf(optarg, "%u/%u/" INT64_FORMAT,
&config.filter_by_relation.spcOid,
&config.filter_by_relation.dbOid,
&config.filter_by_relation.relNumber) != 3 ||
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 1b6b620..0774e3f 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -149,10 +149,10 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
Assert(dbOid == 0);
Assert(backendId == InvalidBackendId);
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("global/%u_%s",
+ path = psprintf("global/" INT64_FORMAT "_%s",
relNumber, forkNames[forkNumber]);
else
- path = psprintf("global/%u", relNumber);
+ path = psprintf("global/" INT64_FORMAT, relNumber);
}
else if (spcOid == DEFAULTTABLESPACE_OID)
{
@@ -160,21 +160,21 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/%u_%s",
+ path = psprintf("base/%u/" INT64_FORMAT "_%s",
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/%u",
+ path = psprintf("base/%u/" INT64_FORMAT,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/t%d_%u_%s",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT "_%s",
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/t%d_%u",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT,
dbOid, backendId, relNumber);
}
}
@@ -184,24 +184,24 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber);
}
diff --git a/src/fe_utils/option_utils.c b/src/fe_utils/option_utils.c
index abea881..2cb3370 100644
--- a/src/fe_utils/option_utils.c
+++ b/src/fe_utils/option_utils.c
@@ -82,3 +82,45 @@ option_parse_int(const char *optarg, const char *optname,
*result = val;
return true;
}
+
+/*
+ * option_parse_int64
+ *
+ * Same as option_parse_int but parse int64.
+ */
+bool
+option_parse_int64(const char *optarg, const char *optname,
+ int64 min_range, int64 max_range,
+ int64 *result)
+{
+ char *endptr;
+ int64 val;
+
+ errno = 0;
+ val = strtoi64(optarg, &endptr, 10);
+
+ /*
+ * Skip any trailing whitespace; if anything but whitespace remains before
+ * the terminating character, fail.
+ */
+ while (*endptr != '\0' && isspace((unsigned char) *endptr))
+ endptr++;
+
+ if (*endptr != '\0')
+ {
+ pg_log_error("invalid value \"%s\" for option %s",
+ optarg, optname);
+ return false;
+ }
+
+ if (errno == ERANGE || val < min_range || val > max_range)
+ {
+ pg_log_error("%s must be in range " INT64_FORMAT ".." INT64_FORMAT,
+ optname, min_range, max_range);
+ return false;
+ }
+
+ if (result)
+ *result = val;
+ return true;
+}
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 775471d..37afdd1 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -213,6 +213,9 @@ typedef struct VariableCacheData
*/
Oid nextOid; /* next OID to assign */
uint32 oidCount; /* OIDs available before must do XLOG work */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber to assign */
+ uint32 relnumbercount; /* relfilenumbers available before must do
+ XLOG work */
/*
* These fields are protected by XidGenLock.
@@ -293,6 +296,8 @@ extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
extern bool ForceTransactionIdLimitUpdate(void);
extern Oid GetNewObjectId(void);
+extern RelFileNumber GetNewRelFileNumber(void);
+extern void SetNextRelFileNumber(RelFileNumber relnumber);
extern void StopGeneratingPinnedObjectIds(void);
#ifdef USE_ASSERT_CHECKING
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index cd674c3..bd683cc 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -234,6 +234,7 @@ extern void CreateCheckPoint(int flags);
extern bool CreateRestartPoint(int flags);
extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN);
extern void XLogPutNextOid(Oid nextOid);
+extern void LogNextRelFileNumber(RelFileNumber nextrelnumber);
extern XLogRecPtr XLogRestorePoint(const char *rpName);
extern void UpdateFullPageWrites(void);
extern void GetFullPageWriteInfo(XLogRecPtr *RedoRecPtr_p, bool *doPageWrites_p);
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index 66900f1..b452530 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -38,8 +38,5 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
-extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
- Relation pg_class,
- char relpersistence);
#endif /* CATALOG_H */
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index e1f4eef..4768e5e 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -34,6 +34,13 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* oid */
Oid oid;
+ /* access method; 0 if not a table / index */
+ Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
+
+ /* identifier of physical storage file */
+ /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
+ int64 relfilenode BKI_DEFAULT(0);
+
/* class name */
NameData relname;
@@ -49,13 +56,6 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* class owner */
Oid relowner BKI_DEFAULT(POSTGRES) BKI_LOOKUP(pg_authid);
- /* access method; 0 if not a table / index */
- Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
-
- /* identifier of physical storage file */
- /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
- Oid relfilenode BKI_DEFAULT(0);
-
/* identifier of table space for relation (0 means default for database) */
Oid reltablespace BKI_DEFAULT(0) BKI_LOOKUP_OPT(pg_tablespace);
@@ -154,7 +154,7 @@ typedef FormData_pg_class *Form_pg_class;
DECLARE_UNIQUE_INDEX_PKEY(pg_class_oid_index, 2662, ClassOidIndexId, on pg_class using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_class_relname_nsp_index, 2663, ClassNameNspIndexId, on pg_class using btree(relname name_ops, relnamespace oid_ops));
-DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode oid_ops));
+DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode int8_ops));
#ifdef EXPOSE_TO_CLIENT_CODE
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 06368e2..d5e6172 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -41,6 +41,7 @@ typedef struct CheckPoint
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
FullTransactionId nextXid; /* next free transaction ID */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
@@ -78,6 +79,7 @@ typedef struct CheckPoint
#define XLOG_FPI 0xB0
/* 0xC0 is used in Postgres 9.5-11 */
#define XLOG_OVERWRITE_CONTRECORD 0xD0
+#define XLOG_NEXT_RELFILENUMBER 0xE0
/*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 2e41f4d..f6a5b49 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7321,11 +7321,11 @@
proname => 'pg_indexes_size', provolatile => 'v', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_indexes_size' },
{ oid => '2999', descr => 'filenode identifier of relation',
- proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'oid',
+ proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_relation_filenode' },
{ oid => '3454', descr => 'relation OID for filenode and tablespace',
proname => 'pg_filenode_relation', provolatile => 's',
- prorettype => 'regclass', proargtypes => 'oid oid',
+ prorettype => 'regclass', proargtypes => 'oid int8',
prosrc => 'pg_filenode_relation' },
{ oid => '3034', descr => 'file path of relation',
proname => 'pg_relation_filepath', provolatile => 's', prorettype => 'text',
@@ -11191,15 +11191,15 @@
prosrc => 'binary_upgrade_set_missing_value' },
{ oid => '4545', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_heap_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_heap_relfilenode' },
{ oid => '4546', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_index_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_index_relfilenode' },
{ oid => '4547', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_toast_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_toast_relfilenode' },
{ oid => '4548', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_pg_tablespace_oid', provolatile => 'v',
diff --git a/src/include/fe_utils/option_utils.h b/src/include/fe_utils/option_utils.h
index 03c09fd..8c0e818 100644
--- a/src/include/fe_utils/option_utils.h
+++ b/src/include/fe_utils/option_utils.h
@@ -22,5 +22,8 @@ extern void handle_help_version_opts(int argc, char *argv[],
extern bool option_parse_int(const char *optarg, const char *optname,
int min_range, int max_range,
int *result);
+extern bool option_parse_int64(const char *optarg, const char *optname,
+ int64 min_range, int64 max_range,
+ int64 *result);
#endif /* OPTION_UTILS_H */
diff --git a/src/include/postgres_ext.h b/src/include/postgres_ext.h
index d8af68b..ecdfc90 100644
--- a/src/include/postgres_ext.h
+++ b/src/include/postgres_ext.h
@@ -48,11 +48,14 @@ typedef PG_INT64_TYPE pg_int64;
/*
* RelFileNumber data type identifies the specific relation file name.
+ * RelFileNumber is unique within a cluster.
*/
-typedef Oid RelFileNumber;
-#define InvalidRelFileNumber ((RelFileNumber) InvalidOid)
+typedef pg_int64 RelFileNumber;
+#define InvalidRelFileNumber ((RelFileNumber) 0)
+#define FirstNormalRelFileNumber ((RelFileNumber) 100000)
#define RelFileNumberIsValid(relnumber) \
((bool) ((relnumber) != InvalidRelFileNumber))
+#define atorelnumber(x) ((RelFileNumber) strtoull((x), NULL, 10))
/*
* Identifiers of error message fields. Kept here to keep common
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 78484a9..91f64d9 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -92,16 +92,19 @@ typedef struct buftag
{
Oid spcOid; /* tablespace oid. */
Oid dbOid; /* database oid. */
- RelFileNumber relNumber; /* relation file number. */
- ForkNumber forkNum;
+ uint32 relNumber_low; /* relfilenumber 32 lower bits */
+ uint32 relNumber_hi:24; /* relfilenumber 24 high bits */
+ uint32 forkNum:8; /* fork number */
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
-#define BufTagGetFileNumber(a) ((a).relNumber)
+#define BufTagGetFileNumber(a) \
+ ((((uint64) (a).relNumber_hi << 32) | ((uint32) (a).relNumber_low)))
#define BufTagSetFileNumber(a, relnumber) \
( \
- (a).relNumber = (relnumber) \
+ (a).relNumber_hi = (relnumber) >> 32, \
+ (a).relNumber_low = (relnumber) & 0xffffffff \
)
#define CLEAR_BUFFERTAG(a) \
@@ -126,7 +129,8 @@ typedef struct buftag
( \
(a).spcOid == (b).spcOid && \
(a).dbOid == (b).dbOid && \
- (a).relNumber == (b).relNumber && \
+ (a).relNumber_low == (b).relNumber_low && \
+ (a).relNumber_hi == (b).relNumber_hi && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
)
@@ -135,14 +139,14 @@ typedef struct buftag
do { \
(locator).spcOid = (a).spcOid; \
(locator).dbOid = (a).dbOid; \
- (locator).relNumber = (a).relNumber; \
+ (locator).relNumber = BufTagGetFileNumber(a); \
} while(0)
#define BufTagRelFileLocatorEquals(a, locator) \
( \
(a).spcOid == (locator).spcOid && \
(a).dbOid == (locator).dbOid && \
- (a).relNumber == (locator).relNumber \
+ BufTagGetFileNumber(a) == (locator).relNumber \
)
/*
diff --git a/src/include/storage/relfilelocator.h b/src/include/storage/relfilelocator.h
index 7211fe7..6046506 100644
--- a/src/include/storage/relfilelocator.h
+++ b/src/include/storage/relfilelocator.h
@@ -34,8 +34,7 @@
* relNumber identifies the specific relation. relNumber corresponds to
* pg_class.relfilenode (NOT pg_class.oid, because we need to be able
* to assign new physical files to relations in some situations).
- * Notice that relNumber is only unique within a database in a particular
- * tablespace.
+ * Notice that relNumber is unique within a cluster.
*
* Note: spcOid must be GLOBALTABLESPACE_OID if and only if dbOid is
* zero. We support shared relations only in the "global" tablespace.
@@ -75,6 +74,15 @@ typedef struct RelFileLocatorBackend
BackendId backend;
} RelFileLocatorBackend;
+#define SizeOfRelFileLocatorBackend \
+ (offsetof(RelFileLocatorBackend, backend) + sizeof(BackendId))
+
+/*
+ * Maximum value of a relfilenumber.  RelFileNumber is 56 bits wide; for the
+ * details, see the comments atop BufferTag.
+ */
+#define MAX_RELFILENUMBER INT64CONST(0x00FFFFFFFFFFFFFF)
+
#define RelFileLocatorBackendIsTemp(rlocator) \
((rlocator).backend != InvalidBackendId)
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index 5ede56d..6230fcb 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -2164,9 +2164,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2175,10 +2174,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+---------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | t | own | child 0 index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | t | own | child 1 index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | t | orig | child 0 index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | t | orig | child 1 index
at_partitioned_id_name_key | t | none | parent index
(6 rows)
@@ -2198,9 +2197,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2209,10 +2207,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+--------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | f | own | parent index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | f | own | parent index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | f | new | parent index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | f | new | parent index
at_partitioned_id_name_key | f | none | parent index
(6 rows)
@@ -2556,7 +2554,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/expected/oidjoins.out b/src/test/regress/expected/oidjoins.out
index 215eb89..af57470 100644
--- a/src/test/regress/expected/oidjoins.out
+++ b/src/test/regress/expected/oidjoins.out
@@ -74,11 +74,11 @@ NOTICE: checking pg_type {typcollation} => pg_collation {oid}
NOTICE: checking pg_attribute {attrelid} => pg_class {oid}
NOTICE: checking pg_attribute {atttypid} => pg_type {oid}
NOTICE: checking pg_attribute {attcollation} => pg_collation {oid}
+NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {relnamespace} => pg_namespace {oid}
NOTICE: checking pg_class {reltype} => pg_type {oid}
NOTICE: checking pg_class {reloftype} => pg_type {oid}
NOTICE: checking pg_class {relowner} => pg_authid {oid}
-NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {reltablespace} => pg_tablespace {oid}
NOTICE: checking pg_class {reltoastrelid} => pg_class {oid}
NOTICE: checking pg_class {relrewrite} => pg_class {oid}
diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql
index 52001e3..4190b12 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -1478,9 +1478,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1499,9 +1498,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1638,7 +1636,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
--
1.8.3.1
Attachment: v5-0001-Rename-RelFileNode-to-RelFileLocator-and-relNode-.patch (application/x-patch)
From deaa3b28321baf4c85aa19a5be64fd69301b7154 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Tue, 21 Jun 2022 14:04:01 +0530
Subject: [PATCH v5 1/5] Rename RelFileNode to RelFileLocator and relNode to
RelNumber
Currently, the way relfilenode and relnode are used is really confusing.
There is some precedent for calling the number that pertains to the file
on disk "relnode", and that value combined with the database and
tablespace OIDs "relfilenode", but it's definitely not the most obvious
thing, and this terminology is also not used uniformly.
As part of this patch set, these variables are renamed to better match
their usage. RelFileNode is renamed to RelFileLocator, and all related
variable declarations from relfilenode to relfilelocator. The relNode
field of RelFileLocator is renamed to relNumber, and along with that
dbNode and spcNode are renamed to dbOid and spcOid. All other references
to relnode/relfilenode that pertain to the on-disk file are renamed to
relnumber/relfilenumber.
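
To make the rename concrete, here is a minimal standalone sketch of the renamed struct and its equality test, using only the field names visible in this patch (spcOid, dbOid, relNumber) and the RelFileLocatorEquals() comparison used in, e.g., ss_search(). The typedefs are simplified stand-ins; the real definitions live in src/include/storage/relfilelocator.h and src/include/postgres_ext.h and may differ in detail:

```c
#include <assert.h>

/* Simplified stand-ins for the PostgreSQL typedefs */
typedef unsigned int Oid;
typedef Oid RelFileNumber;          /* was: the "relNode" Oid */

/*
 * Identifies the physical file backing a relation: tablespace OID,
 * database OID, and the per-relation file number.  Formerly RelFileNode
 * with fields spcNode/dbNode/relNode.
 */
typedef struct RelFileLocator
{
    Oid             spcOid;         /* tablespace */
    Oid             dbOid;          /* database */
    RelFileNumber   relNumber;      /* relation file number */
} RelFileLocator;

/* Field-by-field equality, analogous to RelFileLocatorEquals() */
#define RelFileLocatorEquals(a, b) \
    ((a).spcOid == (b).spcOid && \
     (a).dbOid == (b).dbOid && \
     (a).relNumber == (b).relNumber)
```

Callers that previously compared rd_node values field by field (spcNode/dbNode/relNode) compare rd_locator values the same way under the new names; only the spelling changes, not the semantics.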
---
contrib/bloom/blinsert.c | 2 +-
contrib/oid2name/oid2name.c | 28 +--
contrib/pg_buffercache/pg_buffercache_pages.c | 10 +-
contrib/pg_prewarm/autoprewarm.c | 26 +--
contrib/pg_visibility/pg_visibility.c | 2 +-
src/backend/access/common/syncscan.c | 29 +--
src/backend/access/gin/ginbtree.c | 2 +-
src/backend/access/gin/ginfast.c | 2 +-
src/backend/access/gin/ginutil.c | 2 +-
src/backend/access/gin/ginxlog.c | 6 +-
src/backend/access/gist/gistbuild.c | 4 +-
src/backend/access/gist/gistxlog.c | 11 +-
src/backend/access/hash/hash_xlog.c | 6 +-
src/backend/access/hash/hashpage.c | 4 +-
src/backend/access/heap/heapam.c | 78 +++----
src/backend/access/heap/heapam_handler.c | 26 +--
src/backend/access/heap/rewriteheap.c | 10 +-
src/backend/access/heap/visibilitymap.c | 4 +-
src/backend/access/nbtree/nbtpage.c | 2 +-
src/backend/access/nbtree/nbtree.c | 2 +-
src/backend/access/nbtree/nbtsort.c | 2 +-
src/backend/access/nbtree/nbtxlog.c | 8 +-
src/backend/access/rmgrdesc/genericdesc.c | 2 +-
src/backend/access/rmgrdesc/gindesc.c | 2 +-
src/backend/access/rmgrdesc/gistdesc.c | 6 +-
src/backend/access/rmgrdesc/heapdesc.c | 6 +-
src/backend/access/rmgrdesc/nbtdesc.c | 4 +-
src/backend/access/rmgrdesc/seqdesc.c | 4 +-
src/backend/access/rmgrdesc/smgrdesc.c | 4 +-
src/backend/access/rmgrdesc/xactdesc.c | 44 ++--
src/backend/access/rmgrdesc/xlogdesc.c | 10 +-
src/backend/access/spgist/spginsert.c | 6 +-
src/backend/access/spgist/spgxlog.c | 6 +-
src/backend/access/table/tableamapi.c | 2 +-
src/backend/access/transam/README | 14 +-
src/backend/access/transam/README.parallel | 2 +-
src/backend/access/transam/twophase.c | 38 ++--
src/backend/access/transam/varsup.c | 2 +-
src/backend/access/transam/xact.c | 40 ++--
src/backend/access/transam/xloginsert.c | 38 ++--
src/backend/access/transam/xlogprefetcher.c | 97 ++++----
src/backend/access/transam/xlogreader.c | 25 ++-
src/backend/access/transam/xlogrecovery.c | 18 +-
src/backend/access/transam/xlogutils.c | 73 +++---
src/backend/bootstrap/bootparse.y | 8 +-
src/backend/catalog/catalog.c | 30 +--
src/backend/catalog/heap.c | 56 ++---
src/backend/catalog/index.c | 39 ++--
src/backend/catalog/storage.c | 119 +++++-----
src/backend/commands/cluster.c | 46 ++--
src/backend/commands/copyfrom.c | 6 +-
src/backend/commands/dbcommands.c | 106 ++++-----
src/backend/commands/indexcmds.c | 14 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/sequence.c | 29 +--
src/backend/commands/tablecmds.c | 89 ++++----
src/backend/commands/tablespace.c | 18 +-
src/backend/nodes/copyfuncs.c | 4 +-
src/backend/nodes/equalfuncs.c | 4 +-
src/backend/nodes/outfuncs.c | 4 +-
src/backend/parser/gram.y | 8 +-
src/backend/parser/parse_utilcmd.c | 8 +-
src/backend/postmaster/checkpointer.c | 2 +-
src/backend/replication/logical/decode.c | 40 ++--
src/backend/replication/logical/reorderbuffer.c | 50 ++---
src/backend/replication/logical/snapbuild.c | 2 +-
src/backend/storage/buffer/bufmgr.c | 284 ++++++++++++------------
src/backend/storage/buffer/localbuf.c | 34 +--
src/backend/storage/freespace/freespace.c | 6 +-
src/backend/storage/freespace/fsmpage.c | 6 +-
src/backend/storage/ipc/standby.c | 8 +-
src/backend/storage/lmgr/predicate.c | 24 +-
src/backend/storage/smgr/README | 2 +-
src/backend/storage/smgr/md.c | 126 +++++------
src/backend/storage/smgr/smgr.c | 44 ++--
src/backend/utils/adt/dbsize.c | 64 +++---
src/backend/utils/adt/pg_upgrade_support.c | 14 +-
src/backend/utils/cache/Makefile | 2 +-
src/backend/utils/cache/inval.c | 16 +-
src/backend/utils/cache/relcache.c | 184 +++++++--------
src/backend/utils/cache/relfilenodemap.c | 244 --------------------
src/backend/utils/cache/relfilenumbermap.c | 244 ++++++++++++++++++++
src/backend/utils/cache/relmapper.c | 90 ++++----
src/bin/pg_dump/pg_dump.c | 36 +--
src/bin/pg_rewind/datapagemap.h | 2 +-
src/bin/pg_rewind/filemap.c | 34 +--
src/bin/pg_rewind/filemap.h | 4 +-
src/bin/pg_rewind/parsexlog.c | 10 +-
src/bin/pg_rewind/pg_rewind.h | 2 +-
src/bin/pg_upgrade/Makefile | 2 +-
src/bin/pg_upgrade/info.c | 10 +-
src/bin/pg_upgrade/pg_upgrade.h | 22 +-
src/bin/pg_upgrade/relfilenode.c | 259 ---------------------
src/bin/pg_upgrade/relfilenumber.c | 259 +++++++++++++++++++++
src/bin/pg_waldump/pg_waldump.c | 26 +--
src/common/relpath.c | 48 ++--
src/include/access/brin_xlog.h | 2 +-
src/include/access/ginxlog.h | 4 +-
src/include/access/gistxlog.h | 2 +-
src/include/access/heapam_xlog.h | 8 +-
src/include/access/nbtxlog.h | 4 +-
src/include/access/rewriteheap.h | 6 +-
src/include/access/tableam.h | 59 ++---
src/include/access/xact.h | 26 +--
src/include/access/xlog_internal.h | 2 +-
src/include/access/xloginsert.h | 8 +-
src/include/access/xlogreader.h | 6 +-
src/include/access/xlogrecord.h | 8 +-
src/include/access/xlogutils.h | 8 +-
src/include/catalog/binary_upgrade.h | 6 +-
src/include/catalog/catalog.h | 5 +-
src/include/catalog/heap.h | 2 +-
src/include/catalog/index.h | 2 +-
src/include/catalog/storage.h | 10 +-
src/include/catalog/storage_xlog.h | 8 +-
src/include/commands/sequence.h | 4 +-
src/include/commands/tablecmds.h | 2 +-
src/include/commands/tablespace.h | 2 +-
src/include/common/relpath.h | 24 +-
src/include/nodes/parsenodes.h | 8 +-
src/include/postgres_ext.h | 7 +
src/include/postmaster/bgwriter.h | 2 +-
src/include/replication/reorderbuffer.h | 6 +-
src/include/storage/buf_internals.h | 28 +--
src/include/storage/bufmgr.h | 16 +-
src/include/storage/freespace.h | 4 +-
src/include/storage/md.h | 6 +-
src/include/storage/relfilelocator.h | 99 +++++++++
src/include/storage/relfilenode.h | 99 ---------
src/include/storage/sinval.h | 4 +-
src/include/storage/smgr.h | 12 +-
src/include/storage/standby.h | 6 +-
src/include/storage/sync.h | 4 +-
src/include/utils/inval.h | 4 +-
src/include/utils/rel.h | 46 ++--
src/include/utils/relcache.h | 8 +-
src/include/utils/relfilenodemap.h | 18 --
src/include/utils/relfilenumbermap.h | 19 ++
src/include/utils/relmapper.h | 13 +-
src/test/recovery/t/018_wal_optimize.pl | 2 +-
src/tools/pgindent/typedefs.list | 10 +-
141 files changed, 2079 insertions(+), 2049 deletions(-)
delete mode 100644 src/backend/utils/cache/relfilenodemap.c
create mode 100644 src/backend/utils/cache/relfilenumbermap.c
delete mode 100644 src/bin/pg_upgrade/relfilenode.c
create mode 100644 src/bin/pg_upgrade/relfilenumber.c
create mode 100644 src/include/storage/relfilelocator.h
delete mode 100644 src/include/storage/relfilenode.h
delete mode 100644 src/include/utils/relfilenodemap.h
create mode 100644 src/include/utils/relfilenumbermap.h
diff --git a/contrib/bloom/blinsert.c b/contrib/bloom/blinsert.c
index 82378db..e64291e 100644
--- a/contrib/bloom/blinsert.c
+++ b/contrib/bloom/blinsert.c
@@ -179,7 +179,7 @@ blbuildempty(Relation index)
PageSetChecksumInplace(metapage, BLOOM_METAPAGE_BLKNO);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, BLOOM_METAPAGE_BLKNO,
(char *) metapage, true);
- log_newpage(&(RelationGetSmgr(index))->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&(RelationGetSmgr(index))->smgr_rlocator.locator, INIT_FORKNUM,
BLOOM_METAPAGE_BLKNO, metapage, true);
/*
diff --git a/contrib/oid2name/oid2name.c b/contrib/oid2name/oid2name.c
index a3e358b..cadba3b 100644
--- a/contrib/oid2name/oid2name.c
+++ b/contrib/oid2name/oid2name.c
@@ -30,7 +30,7 @@ struct options
{
eary *tables;
eary *oids;
- eary *filenodes;
+ eary *filenumbers;
bool quiet;
bool systables;
@@ -125,9 +125,9 @@ get_opts(int argc, char **argv, struct options *my_opts)
my_opts->dbname = pg_strdup(optarg);
break;
- /* specify one filenode to show */
+ /* specify one filenumber to show */
case 'f':
- add_one_elt(optarg, my_opts->filenodes);
+ add_one_elt(optarg, my_opts->filenumbers);
break;
/* host to connect to */
@@ -494,7 +494,7 @@ sql_exec_dumpalltables(PGconn *conn, struct options *opts)
}
/*
- * Show oid, filenode, name, schema and tablespace for each of the
+ * Show oid, filenumber, name, schema and tablespace for each of the
* given objects in the current database.
*/
void
@@ -504,19 +504,19 @@ sql_exec_searchtables(PGconn *conn, struct options *opts)
char *qualifiers,
*ptr;
char *comma_oids,
- *comma_filenodes,
+ *comma_filenumbers,
*comma_tables;
bool written = false;
char *addfields = ",c.oid AS \"Oid\", nspname AS \"Schema\", spcname as \"Tablespace\" ";
- /* get tables qualifiers, whether names, filenodes, or OIDs */
+ /* get tables qualifiers, whether names, filenumbers, or OIDs */
comma_oids = get_comma_elts(opts->oids);
comma_tables = get_comma_elts(opts->tables);
- comma_filenodes = get_comma_elts(opts->filenodes);
+ comma_filenumbers = get_comma_elts(opts->filenumbers);
/* 80 extra chars for SQL expression */
qualifiers = (char *) pg_malloc(strlen(comma_oids) + strlen(comma_tables) +
- strlen(comma_filenodes) + 80);
+ strlen(comma_filenumbers) + 80);
ptr = qualifiers;
if (opts->oids->num > 0)
@@ -524,11 +524,11 @@ sql_exec_searchtables(PGconn *conn, struct options *opts)
ptr += sprintf(ptr, "c.oid IN (%s)", comma_oids);
written = true;
}
- if (opts->filenodes->num > 0)
+ if (opts->filenumbers->num > 0)
{
if (written)
ptr += sprintf(ptr, " OR ");
- ptr += sprintf(ptr, "pg_catalog.pg_relation_filenode(c.oid) IN (%s)", comma_filenodes);
+ ptr += sprintf(ptr, "pg_catalog.pg_relation_filenode(c.oid) IN (%s)", comma_filenumbers);
written = true;
}
if (opts->tables->num > 0)
@@ -539,7 +539,7 @@ sql_exec_searchtables(PGconn *conn, struct options *opts)
}
free(comma_oids);
free(comma_tables);
- free(comma_filenodes);
+ free(comma_filenumbers);
/* now build the query */
todo = psprintf("SELECT pg_catalog.pg_relation_filenode(c.oid) as \"Filenode\", relname as \"Table Name\" %s\n"
@@ -588,11 +588,11 @@ main(int argc, char **argv)
my_opts->oids = (eary *) pg_malloc(sizeof(eary));
my_opts->tables = (eary *) pg_malloc(sizeof(eary));
- my_opts->filenodes = (eary *) pg_malloc(sizeof(eary));
+ my_opts->filenumbers = (eary *) pg_malloc(sizeof(eary));
my_opts->oids->num = my_opts->oids->alloc = 0;
my_opts->tables->num = my_opts->tables->alloc = 0;
- my_opts->filenodes->num = my_opts->filenodes->alloc = 0;
+ my_opts->filenumbers->num = my_opts->filenumbers->alloc = 0;
/* parse the opts */
get_opts(argc, argv, my_opts);
@@ -618,7 +618,7 @@ main(int argc, char **argv)
/* display the given elements in the database */
if (my_opts->oids->num > 0 ||
my_opts->tables->num > 0 ||
- my_opts->filenodes->num > 0)
+ my_opts->filenumbers->num > 0)
{
if (!my_opts->quiet)
printf("From database \"%s\":\n", my_opts->dbname);
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 1bd579f..713f52a 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -26,7 +26,7 @@ PG_MODULE_MAGIC;
typedef struct
{
uint32 bufferid;
- Oid relfilenode;
+ RelFileNumber relfilenumber;
Oid reltablespace;
Oid reldatabase;
ForkNumber forknum;
@@ -153,9 +153,9 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenode = bufHdr->tag.rnode.relNode;
- fctx->record[i].reltablespace = bufHdr->tag.rnode.spcNode;
- fctx->record[i].reldatabase = bufHdr->tag.rnode.dbNode;
+ fctx->record[i].relfilenumber = bufHdr->tag.rlocator.relNumber;
+ fctx->record[i].reltablespace = bufHdr->tag.rlocator.spcOid;
+ fctx->record[i].reldatabase = bufHdr->tag.rlocator.dbOid;
fctx->record[i].forknum = bufHdr->tag.forkNum;
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
@@ -209,7 +209,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
}
else
{
- values[1] = ObjectIdGetDatum(fctx->record[i].relfilenode);
+ values[1] = ObjectIdGetDatum(fctx->record[i].relfilenumber);
nulls[1] = false;
values[2] = ObjectIdGetDatum(fctx->record[i].reltablespace);
nulls[2] = false;
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index c0c4f5d..7f1d55c 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -52,7 +52,7 @@
#include "utils/guc.h"
#include "utils/memutils.h"
#include "utils/rel.h"
-#include "utils/relfilenodemap.h"
+#include "utils/relfilenumbermap.h"
#include "utils/resowner.h"
#define AUTOPREWARM_FILE "autoprewarm.blocks"
@@ -62,7 +62,7 @@ typedef struct BlockInfoRecord
{
Oid database;
Oid tablespace;
- Oid filenode;
+ RelFileNumber filenumber;
ForkNumber forknum;
BlockNumber blocknum;
} BlockInfoRecord;
@@ -347,7 +347,7 @@ apw_load_buffers(void)
unsigned forknum;
if (fscanf(file, "%u,%u,%u,%u,%u\n", &blkinfo[i].database,
- &blkinfo[i].tablespace, &blkinfo[i].filenode,
+ &blkinfo[i].tablespace, &blkinfo[i].filenumber,
&forknum, &blkinfo[i].blocknum) != 5)
ereport(ERROR,
(errmsg("autoprewarm block dump file is corrupted at line %d",
@@ -494,7 +494,7 @@ autoprewarm_database_main(Datum main_arg)
* relation. Note that rel will be NULL if try_relation_open failed
* previously; in that case, there is nothing to close.
*/
- if (old_blk != NULL && old_blk->filenode != blk->filenode &&
+ if (old_blk != NULL && old_blk->filenumber != blk->filenumber &&
rel != NULL)
{
relation_close(rel, AccessShareLock);
@@ -506,13 +506,13 @@ autoprewarm_database_main(Datum main_arg)
* Try to open each new relation, but only once, when we first
* encounter it. If it's been dropped, skip the associated blocks.
*/
- if (old_blk == NULL || old_blk->filenode != blk->filenode)
+ if (old_blk == NULL || old_blk->filenumber != blk->filenumber)
{
Oid reloid;
Assert(rel == NULL);
StartTransactionCommand();
- reloid = RelidByRelfilenode(blk->tablespace, blk->filenode);
+ reloid = RelidByRelfilenumber(blk->tablespace, blk->filenumber);
if (OidIsValid(reloid))
rel = try_relation_open(reloid, AccessShareLock);
@@ -527,7 +527,7 @@ autoprewarm_database_main(Datum main_arg)
/* Once per fork, check for fork existence and size. */
if (old_blk == NULL ||
- old_blk->filenode != blk->filenode ||
+ old_blk->filenumber != blk->filenumber ||
old_blk->forknum != blk->forknum)
{
/*
@@ -631,9 +631,9 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
if (buf_state & BM_TAG_VALID &&
((buf_state & BM_PERMANENT) || dump_unlogged))
{
- block_info_array[num_blocks].database = bufHdr->tag.rnode.dbNode;
- block_info_array[num_blocks].tablespace = bufHdr->tag.rnode.spcNode;
- block_info_array[num_blocks].filenode = bufHdr->tag.rnode.relNode;
+ block_info_array[num_blocks].database = bufHdr->tag.rlocator.dbOid;
+ block_info_array[num_blocks].tablespace = bufHdr->tag.rlocator.spcOid;
+ block_info_array[num_blocks].filenumber = bufHdr->tag.rlocator.relNumber;
block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
@@ -671,7 +671,7 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
ret = fprintf(file, "%u,%u,%u,%u,%u\n",
block_info_array[i].database,
block_info_array[i].tablespace,
- block_info_array[i].filenode,
+ block_info_array[i].filenumber,
(uint32) block_info_array[i].forknum,
block_info_array[i].blocknum);
if (ret < 0)
@@ -900,7 +900,7 @@ do { \
* We depend on all records for a particular database being consecutive
* in the dump file; each per-database worker will preload blocks until
* it sees a block for some other database. Sorting by tablespace,
- * filenode, forknum, and blocknum isn't critical for correctness, but
+ * filenumber, forknum, and blocknum isn't critical for correctness, but
* helps us get a sequential I/O pattern.
*/
static int
@@ -911,7 +911,7 @@ apw_compare_blockinfo(const void *p, const void *q)
cmp_member_elem(database);
cmp_member_elem(tablespace);
- cmp_member_elem(filenode);
+ cmp_member_elem(filenumber);
cmp_member_elem(forknum);
cmp_member_elem(blocknum);
diff --git a/contrib/pg_visibility/pg_visibility.c b/contrib/pg_visibility/pg_visibility.c
index 1853c35..4e2e9ea 100644
--- a/contrib/pg_visibility/pg_visibility.c
+++ b/contrib/pg_visibility/pg_visibility.c
@@ -407,7 +407,7 @@ pg_truncate_visibility_map(PG_FUNCTION_ARGS)
xl_smgr_truncate xlrec;
xlrec.blkno = 0;
- xlrec.rnode = rel->rd_node;
+ xlrec.rlocator = rel->rd_locator;
xlrec.flags = SMGR_TRUNCATE_VM;
XLogBeginInsert();
diff --git a/src/backend/access/common/syncscan.c b/src/backend/access/common/syncscan.c
index d5b16c5..ad48cb7 100644
--- a/src/backend/access/common/syncscan.c
+++ b/src/backend/access/common/syncscan.c
@@ -90,7 +90,7 @@ bool trace_syncscan = false;
*/
typedef struct ss_scan_location_t
{
- RelFileNode relfilenode; /* identity of a relation */
+ RelFileLocator relfilelocator; /* identity of a relation */
BlockNumber location; /* last-reported location in the relation */
} ss_scan_location_t;
@@ -115,7 +115,7 @@ typedef struct ss_scan_locations_t
static ss_scan_locations_t *scan_locations;
/* prototypes for internal functions */
-static BlockNumber ss_search(RelFileNode relfilenode,
+static BlockNumber ss_search(RelFileLocator relfilelocator,
BlockNumber location, bool set);
@@ -159,9 +159,9 @@ SyncScanShmemInit(void)
* these invalid entries will fall off the LRU list and get
* replaced with real entries.
*/
- item->location.relfilenode.spcNode = InvalidOid;
- item->location.relfilenode.dbNode = InvalidOid;
- item->location.relfilenode.relNode = InvalidOid;
+ item->location.relfilelocator.spcOid = InvalidOid;
+ item->location.relfilelocator.dbOid = InvalidOid;
+ item->location.relfilelocator.relNumber = InvalidRelFileNumber;
item->location.location = InvalidBlockNumber;
item->prev = (i > 0) ?
@@ -176,10 +176,10 @@ SyncScanShmemInit(void)
/*
* ss_search --- search the scan_locations structure for an entry with the
- * given relfilenode.
+ * given relfilelocator.
*
* If "set" is true, the location is updated to the given location. If no
- * entry for the given relfilenode is found, it will be created at the head
+ * entry for the given relfilelocator is found, it will be created at the head
* of the list with the given location, even if "set" is false.
*
* In any case, the location after possible update is returned.
@@ -188,7 +188,7 @@ SyncScanShmemInit(void)
* data structure.
*/
static BlockNumber
-ss_search(RelFileNode relfilenode, BlockNumber location, bool set)
+ss_search(RelFileLocator relfilelocator, BlockNumber location, bool set)
{
ss_lru_item_t *item;
@@ -197,7 +197,8 @@ ss_search(RelFileNode relfilenode, BlockNumber location, bool set)
{
bool match;
- match = RelFileNodeEquals(item->location.relfilenode, relfilenode);
+ match = RelFileLocatorEquals(item->location.relfilelocator,
+ relfilelocator);
if (match || item->next == NULL)
{
@@ -207,7 +208,7 @@ ss_search(RelFileNode relfilenode, BlockNumber location, bool set)
*/
if (!match)
{
- item->location.relfilenode = relfilenode;
+ item->location.relfilelocator = relfilelocator;
item->location.location = location;
}
else if (set)
@@ -255,7 +256,7 @@ ss_get_location(Relation rel, BlockNumber relnblocks)
BlockNumber startloc;
LWLockAcquire(SyncScanLock, LW_EXCLUSIVE);
- startloc = ss_search(rel->rd_node, 0, false);
+ startloc = ss_search(rel->rd_locator, 0, false);
LWLockRelease(SyncScanLock);
/*
@@ -281,8 +282,8 @@ ss_get_location(Relation rel, BlockNumber relnblocks)
* ss_report_location --- update the current scan location
*
* Writes an entry into the shared Sync Scan state of the form
- * (relfilenode, blocknumber), overwriting any existing entry for the
- * same relfilenode.
+ * (relfilelocator, blocknumber), overwriting any existing entry for the
+ * same relfilelocator.
*/
void
ss_report_location(Relation rel, BlockNumber location)
@@ -309,7 +310,7 @@ ss_report_location(Relation rel, BlockNumber location)
{
if (LWLockConditionalAcquire(SyncScanLock, LW_EXCLUSIVE))
{
- (void) ss_search(rel->rd_node, location, true);
+ (void) ss_search(rel->rd_locator, location, true);
LWLockRelease(SyncScanLock);
}
#ifdef TRACE_SYNCSCAN
diff --git a/src/backend/access/gin/ginbtree.c b/src/backend/access/gin/ginbtree.c
index cc6d4e6..c75bfc2 100644
--- a/src/backend/access/gin/ginbtree.c
+++ b/src/backend/access/gin/ginbtree.c
@@ -470,7 +470,7 @@ ginPlaceToPage(GinBtree btree, GinBtreeStack *stack,
savedRightLink = GinPageGetOpaque(page)->rightlink;
/* Begin setting up WAL record */
- data.node = btree->index->rd_node;
+ data.locator = btree->index->rd_locator;
data.flags = xlflags;
if (BufferIsValid(childbuf))
{
diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
index 7409fdc..6c67744 100644
--- a/src/backend/access/gin/ginfast.c
+++ b/src/backend/access/gin/ginfast.c
@@ -235,7 +235,7 @@ ginHeapTupleFastInsert(GinState *ginstate, GinTupleCollector *collector)
needWal = RelationNeedsWAL(index);
- data.node = index->rd_node;
+ data.locator = index->rd_locator;
data.ntuples = 0;
data.newRightlink = data.prevTail = InvalidBlockNumber;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 20f4706..6df7f2e 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -688,7 +688,7 @@ ginUpdateStats(Relation index, const GinStatsData *stats, bool is_build)
XLogRecPtr recptr;
ginxlogUpdateMeta data;
- data.node = index->rd_node;
+ data.locator = index->rd_locator;
data.ntuples = 0;
data.newRightlink = data.prevTail = InvalidBlockNumber;
memcpy(&data.metadata, metadata, sizeof(GinMetaPageData));
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 87e8366..41b9211 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -95,13 +95,13 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
if (PageAddItem(page, (Item) itup, IndexTupleSize(itup), offset, false, false) == InvalidOffsetNumber)
{
- RelFileNode node;
+ RelFileLocator locator;
ForkNumber forknum;
BlockNumber blknum;
- BufferGetTag(buffer, &node, &forknum, &blknum);
+ BufferGetTag(buffer, &locator, &forknum, &blknum);
elog(ERROR, "failed to add item to index page in %u/%u/%u",
- node.spcNode, node.dbNode, node.relNode);
+ locator.spcOid, locator.dbOid, locator.relNumber);
}
}
diff --git a/src/backend/access/gist/gistbuild.c b/src/backend/access/gist/gistbuild.c
index f5a5caf..374e64e 100644
--- a/src/backend/access/gist/gistbuild.c
+++ b/src/backend/access/gist/gistbuild.c
@@ -462,7 +462,7 @@ gist_indexsortbuild(GISTBuildState *state)
smgrwrite(RelationGetSmgr(state->indexrel), MAIN_FORKNUM, GIST_ROOT_BLKNO,
levelstate->pages[0], true);
if (RelationNeedsWAL(state->indexrel))
- log_newpage(&state->indexrel->rd_node, MAIN_FORKNUM, GIST_ROOT_BLKNO,
+ log_newpage(&state->indexrel->rd_locator, MAIN_FORKNUM, GIST_ROOT_BLKNO,
levelstate->pages[0], true);
pfree(levelstate->pages[0]);
@@ -663,7 +663,7 @@ gist_indexsortbuild_flush_ready_pages(GISTBuildState *state)
}
if (RelationNeedsWAL(state->indexrel))
- log_newpages(&state->indexrel->rd_node, MAIN_FORKNUM, state->ready_num_pages,
+ log_newpages(&state->indexrel->rd_locator, MAIN_FORKNUM, state->ready_num_pages,
state->ready_blknos, state->ready_pages, true);
for (int i = 0; i < state->ready_num_pages; i++)
diff --git a/src/backend/access/gist/gistxlog.c b/src/backend/access/gist/gistxlog.c
index df70f90..b4f629f 100644
--- a/src/backend/access/gist/gistxlog.c
+++ b/src/backend/access/gist/gistxlog.c
@@ -191,11 +191,12 @@ gistRedoDeleteRecord(XLogReaderState *record)
*/
if (InHotStandby)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, NULL);
- ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid, rnode);
+ ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid,
+ rlocator);
}
if (XLogReadBufferForRedo(record, 0, &buffer) == BLK_NEEDS_REDO)
@@ -395,7 +396,7 @@ gistRedoPageReuse(XLogReaderState *record)
*/
if (InHotStandby)
ResolveRecoveryConflictWithSnapshotFullXid(xlrec->latestRemovedFullXid,
- xlrec->node);
+ xlrec->locator);
}
void
@@ -607,7 +608,7 @@ gistXLogPageReuse(Relation rel, BlockNumber blkno, FullTransactionId latestRemov
*/
/* XLOG stuff */
- xlrec_reuse.node = rel->rd_node;
+ xlrec_reuse.locator = rel->rd_locator;
xlrec_reuse.block = blkno;
xlrec_reuse.latestRemovedFullXid = latestRemovedXid;
diff --git a/src/backend/access/hash/hash_xlog.c b/src/backend/access/hash/hash_xlog.c
index 62dbfc3..2e68303 100644
--- a/src/backend/access/hash/hash_xlog.c
+++ b/src/backend/access/hash/hash_xlog.c
@@ -999,10 +999,10 @@ hash_xlog_vacuum_one_page(XLogReaderState *record)
*/
if (InHotStandby)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
- ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid, rnode);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, NULL);
+ ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid, rlocator);
}
action = XLogReadBufferForRedoExtended(record, 0, RBM_NORMAL, true, &buffer);
diff --git a/src/backend/access/hash/hashpage.c b/src/backend/access/hash/hashpage.c
index 39206d1..d2edcd4 100644
--- a/src/backend/access/hash/hashpage.c
+++ b/src/backend/access/hash/hashpage.c
@@ -428,7 +428,7 @@ _hash_init(Relation rel, double num_tuples, ForkNumber forkNum)
MarkBufferDirty(buf);
if (use_wal)
- log_newpage(&rel->rd_node,
+ log_newpage(&rel->rd_locator,
forkNum,
blkno,
BufferGetPage(buf),
@@ -1019,7 +1019,7 @@ _hash_alloc_buckets(Relation rel, BlockNumber firstblock, uint32 nblocks)
ovflopaque->hasho_page_id = HASHO_PAGE_ID;
if (RelationNeedsWAL(rel))
- log_newpage(&rel->rd_node,
+ log_newpage(&rel->rd_locator,
MAIN_FORKNUM,
lastblock,
zerobuf.data,
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 637de11..aab8d6f 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -8189,7 +8189,7 @@ log_heap_freeze(Relation reln, Buffer buffer, TransactionId cutoff_xid,
* heap_buffer, if necessary.
*/
XLogRecPtr
-log_heap_visible(RelFileNode rnode, Buffer heap_buffer, Buffer vm_buffer,
+log_heap_visible(RelFileLocator rlocator, Buffer heap_buffer, Buffer vm_buffer,
TransactionId cutoff_xid, uint8 vmflags)
{
xl_heap_visible xlrec;
@@ -8454,7 +8454,7 @@ log_heap_new_cid(Relation relation, HeapTuple tup)
Assert(tup->t_tableOid != InvalidOid);
xlrec.top_xid = GetTopTransactionId();
- xlrec.target_node = relation->rd_node;
+ xlrec.target_locator = relation->rd_locator;
xlrec.target_tid = tup->t_self;
/*
@@ -8623,18 +8623,18 @@ heap_xlog_prune(XLogReaderState *record)
XLogRecPtr lsn = record->EndRecPtr;
xl_heap_prune *xlrec = (xl_heap_prune *) XLogRecGetData(record);
Buffer buffer;
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber blkno;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &blkno);
/*
* We're about to remove tuples. In Hot Standby mode, ensure that there's
* no queries running for which the removed tuples are still visible.
*/
if (InHotStandby)
- ResolveRecoveryConflictWithSnapshot(xlrec->latestRemovedXid, rnode);
+ ResolveRecoveryConflictWithSnapshot(xlrec->latestRemovedXid, rlocator);
/*
* If we have a full-page image, restore it (using a cleanup lock) and
@@ -8694,7 +8694,7 @@ heap_xlog_prune(XLogReaderState *record)
* Do this regardless of a full-page image being applied, since the
* FSM data is not in the page anyway.
*/
- XLogRecordPageWithFreeSpace(rnode, blkno, freespace);
+ XLogRecordPageWithFreeSpace(rlocator, blkno, freespace);
}
}
@@ -8751,9 +8751,9 @@ heap_xlog_vacuum(XLogReaderState *record)
if (BufferIsValid(buffer))
{
Size freespace = PageGetHeapFreeSpace(BufferGetPage(buffer));
- RelFileNode rnode;
+ RelFileLocator rlocator;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &blkno);
UnlockReleaseBuffer(buffer);
@@ -8766,7 +8766,7 @@ heap_xlog_vacuum(XLogReaderState *record)
* Do this regardless of a full-page image being applied, since the
* FSM data is not in the page anyway.
*/
- XLogRecordPageWithFreeSpace(rnode, blkno, freespace);
+ XLogRecordPageWithFreeSpace(rlocator, blkno, freespace);
}
}
@@ -8786,11 +8786,11 @@ heap_xlog_visible(XLogReaderState *record)
Buffer vmbuffer = InvalidBuffer;
Buffer buffer;
Page page;
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber blkno;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 1, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 1, &rlocator, NULL, &blkno);
/*
* If there are any Hot Standby transactions running that have an xmin
@@ -8802,7 +8802,7 @@ heap_xlog_visible(XLogReaderState *record)
* rather than killing the transaction outright.
*/
if (InHotStandby)
- ResolveRecoveryConflictWithSnapshot(xlrec->cutoff_xid, rnode);
+ ResolveRecoveryConflictWithSnapshot(xlrec->cutoff_xid, rlocator);
/*
* Read the heap page, if it still exists. If the heap file has dropped or
@@ -8865,7 +8865,7 @@ heap_xlog_visible(XLogReaderState *record)
* FSM data is not in the page anyway.
*/
if (xlrec->flags & VISIBILITYMAP_VALID_BITS)
- XLogRecordPageWithFreeSpace(rnode, blkno, space);
+ XLogRecordPageWithFreeSpace(rlocator, blkno, space);
}
/*
@@ -8890,7 +8890,7 @@ heap_xlog_visible(XLogReaderState *record)
*/
LockBuffer(vmbuffer, BUFFER_LOCK_UNLOCK);
- reln = CreateFakeRelcacheEntry(rnode);
+ reln = CreateFakeRelcacheEntry(rlocator);
visibilitymap_pin(reln, blkno, &vmbuffer);
/*
@@ -8933,13 +8933,13 @@ heap_xlog_freeze_page(XLogReaderState *record)
*/
if (InHotStandby)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
TransactionId latestRemovedXid = cutoff_xid;
TransactionIdRetreat(latestRemovedXid);
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
- ResolveRecoveryConflictWithSnapshot(latestRemovedXid, rnode);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, NULL);
+ ResolveRecoveryConflictWithSnapshot(latestRemovedXid, rlocator);
}
if (XLogReadBufferForRedo(record, 0, &buffer) == BLK_NEEDS_REDO)
@@ -9007,10 +9007,10 @@ heap_xlog_delete(XLogReaderState *record)
ItemId lp = NULL;
HeapTupleHeader htup;
BlockNumber blkno;
- RelFileNode target_node;
+ RelFileLocator target_locator;
ItemPointerData target_tid;
- XLogRecGetBlockTag(record, 0, &target_node, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &target_locator, NULL, &blkno);
ItemPointerSetBlockNumber(&target_tid, blkno);
ItemPointerSetOffsetNumber(&target_tid, xlrec->offnum);
@@ -9020,7 +9020,7 @@ heap_xlog_delete(XLogReaderState *record)
*/
if (xlrec->flags & XLH_DELETE_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(target_node);
+ Relation reln = CreateFakeRelcacheEntry(target_locator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, blkno, &vmbuffer);
@@ -9086,12 +9086,12 @@ heap_xlog_insert(XLogReaderState *record)
xl_heap_header xlhdr;
uint32 newlen;
Size freespace = 0;
- RelFileNode target_node;
+ RelFileLocator target_locator;
BlockNumber blkno;
ItemPointerData target_tid;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 0, &target_node, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &target_locator, NULL, &blkno);
ItemPointerSetBlockNumber(&target_tid, blkno);
ItemPointerSetOffsetNumber(&target_tid, xlrec->offnum);
@@ -9101,7 +9101,7 @@ heap_xlog_insert(XLogReaderState *record)
*/
if (xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(target_node);
+ Relation reln = CreateFakeRelcacheEntry(target_locator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, blkno, &vmbuffer);
@@ -9184,7 +9184,7 @@ heap_xlog_insert(XLogReaderState *record)
* totally accurate anyway.
*/
if (action == BLK_NEEDS_REDO && freespace < BLCKSZ / 5)
- XLogRecordPageWithFreeSpace(target_node, blkno, freespace);
+ XLogRecordPageWithFreeSpace(target_locator, blkno, freespace);
}
/*
@@ -9195,7 +9195,7 @@ heap_xlog_multi_insert(XLogReaderState *record)
{
XLogRecPtr lsn = record->EndRecPtr;
xl_heap_multi_insert *xlrec;
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber blkno;
Buffer buffer;
Page page;
@@ -9217,7 +9217,7 @@ heap_xlog_multi_insert(XLogReaderState *record)
*/
xlrec = (xl_heap_multi_insert *) XLogRecGetData(record);
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &blkno);
/* check that the mutually exclusive flags are not both set */
Assert(!((xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED) &&
@@ -9229,7 +9229,7 @@ heap_xlog_multi_insert(XLogReaderState *record)
*/
if (xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(rnode);
+ Relation reln = CreateFakeRelcacheEntry(rlocator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, blkno, &vmbuffer);
@@ -9331,7 +9331,7 @@ heap_xlog_multi_insert(XLogReaderState *record)
* totally accurate anyway.
*/
if (action == BLK_NEEDS_REDO && freespace < BLCKSZ / 5)
- XLogRecordPageWithFreeSpace(rnode, blkno, freespace);
+ XLogRecordPageWithFreeSpace(rlocator, blkno, freespace);
}
/*
@@ -9342,7 +9342,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
{
XLogRecPtr lsn = record->EndRecPtr;
xl_heap_update *xlrec = (xl_heap_update *) XLogRecGetData(record);
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber oldblk;
BlockNumber newblk;
ItemPointerData newtid;
@@ -9371,7 +9371,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
oldtup.t_data = NULL;
oldtup.t_len = 0;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &newblk);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &newblk);
if (XLogRecGetBlockTagExtended(record, 1, NULL, NULL, &oldblk, NULL))
{
/* HOT updates are never done across pages */
@@ -9388,7 +9388,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
*/
if (xlrec->flags & XLH_UPDATE_OLD_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(rnode);
+ Relation reln = CreateFakeRelcacheEntry(rlocator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, oldblk, &vmbuffer);
@@ -9472,7 +9472,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
*/
if (xlrec->flags & XLH_UPDATE_NEW_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(rnode);
+ Relation reln = CreateFakeRelcacheEntry(rlocator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, newblk, &vmbuffer);
@@ -9606,7 +9606,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
* totally accurate anyway.
*/
if (newaction == BLK_NEEDS_REDO && !hot_update && freespace < BLCKSZ / 5)
- XLogRecordPageWithFreeSpace(rnode, newblk, freespace);
+ XLogRecordPageWithFreeSpace(rlocator, newblk, freespace);
}
static void
@@ -9662,13 +9662,13 @@ heap_xlog_lock(XLogReaderState *record)
*/
if (xlrec->flags & XLH_LOCK_ALL_FROZEN_CLEARED)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
Buffer vmbuffer = InvalidBuffer;
BlockNumber block;
Relation reln;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &block);
- reln = CreateFakeRelcacheEntry(rnode);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &block);
+ reln = CreateFakeRelcacheEntry(rlocator);
visibilitymap_pin(reln, block, &vmbuffer);
visibilitymap_clear(reln, block, vmbuffer, VISIBILITYMAP_ALL_FROZEN);
@@ -9735,13 +9735,13 @@ heap_xlog_lock_updated(XLogReaderState *record)
*/
if (xlrec->flags & XLH_LOCK_ALL_FROZEN_CLEARED)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
Buffer vmbuffer = InvalidBuffer;
BlockNumber block;
Relation reln;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &block);
- reln = CreateFakeRelcacheEntry(rnode);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &block);
+ reln = CreateFakeRelcacheEntry(rlocator);
visibilitymap_pin(reln, block, &vmbuffer);
visibilitymap_clear(reln, block, vmbuffer, VISIBILITYMAP_ALL_FROZEN);
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 444f027..a3414a7 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -566,11 +566,11 @@ tuple_lock_retry:
*/
static void
-heapam_relation_set_new_filenode(Relation rel,
- const RelFileNode *newrnode,
- char persistence,
- TransactionId *freezeXid,
- MultiXactId *minmulti)
+heapam_relation_set_new_filelocator(Relation rel,
+ const RelFileLocator *newrlocator,
+ char persistence,
+ TransactionId *freezeXid,
+ MultiXactId *minmulti)
{
SMgrRelation srel;
@@ -591,7 +591,7 @@ heapam_relation_set_new_filenode(Relation rel,
*/
*minmulti = GetOldestMultiXactId();
- srel = RelationCreateStorage(*newrnode, persistence, true);
+ srel = RelationCreateStorage(*newrlocator, persistence, true);
/*
* If required, set up an init fork for an unlogged table so that it can
@@ -608,7 +608,7 @@ heapam_relation_set_new_filenode(Relation rel,
rel->rd_rel->relkind == RELKIND_MATVIEW ||
rel->rd_rel->relkind == RELKIND_TOASTVALUE);
smgrcreate(srel, INIT_FORKNUM, false);
- log_smgrcreate(newrnode, INIT_FORKNUM);
+ log_smgrcreate(newrlocator, INIT_FORKNUM);
smgrimmedsync(srel, INIT_FORKNUM);
}
@@ -622,11 +622,11 @@ heapam_relation_nontransactional_truncate(Relation rel)
}
static void
-heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
+heapam_relation_copy_data(Relation rel, const RelFileLocator *newrlocator)
{
SMgrRelation dstrel;
- dstrel = smgropen(*newrnode, rel->rd_backend);
+ dstrel = smgropen(*newrlocator, rel->rd_backend);
/*
* Since we copy the file directly without looking at the shared buffers,
@@ -640,10 +640,10 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
* Create and copy all forks of the relation, and schedule unlinking of
* old physical files.
*
- * NOTE: any conflict in relfilenode value will be caught in
+ * NOTE: any conflict in relfilenumber value will be caught in
* RelationCreateStorage().
*/
- RelationCreateStorage(*newrnode, rel->rd_rel->relpersistence, true);
+ RelationCreateStorage(*newrlocator, rel->rd_rel->relpersistence, true);
/* copy main fork */
RelationCopyStorage(RelationGetSmgr(rel), dstrel, MAIN_FORKNUM,
@@ -664,7 +664,7 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
if (RelationIsPermanent(rel) ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
- log_smgrcreate(newrnode, forkNum);
+ log_smgrcreate(newrlocator, forkNum);
RelationCopyStorage(RelationGetSmgr(rel), dstrel, forkNum,
rel->rd_rel->relpersistence);
}
@@ -2569,7 +2569,7 @@ static const TableAmRoutine heapam_methods = {
.tuple_satisfies_snapshot = heapam_tuple_satisfies_snapshot,
.index_delete_tuples = heap_index_delete_tuples,
- .relation_set_new_filenode = heapam_relation_set_new_filenode,
+ .relation_set_new_filelocator = heapam_relation_set_new_filelocator,
.relation_nontransactional_truncate = heapam_relation_nontransactional_truncate,
.relation_copy_data = heapam_relation_copy_data,
.relation_copy_for_cluster = heapam_relation_copy_for_cluster,
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 2a53826..197f06b 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -318,7 +318,7 @@ end_heap_rewrite(RewriteState state)
if (state->rs_buffer_valid)
{
if (RelationNeedsWAL(state->rs_new_rel))
- log_newpage(&state->rs_new_rel->rd_node,
+ log_newpage(&state->rs_new_rel->rd_locator,
MAIN_FORKNUM,
state->rs_blockno,
state->rs_buffer,
@@ -679,7 +679,7 @@ raw_heap_insert(RewriteState state, HeapTuple tup)
/* XLOG stuff */
if (RelationNeedsWAL(state->rs_new_rel))
- log_newpage(&state->rs_new_rel->rd_node,
+ log_newpage(&state->rs_new_rel->rd_locator,
MAIN_FORKNUM,
state->rs_blockno,
page,
@@ -742,7 +742,7 @@ raw_heap_insert(RewriteState state, HeapTuple tup)
* When doing logical decoding - which relies on using cmin/cmax of catalog
* tuples, via xl_heap_new_cid records - heap rewrites have to log enough
* information to allow the decoding backend to update its internal mapping
- * of (relfilenode,ctid) => (cmin, cmax) to be correct for the rewritten heap.
+ * of (relfilelocator,ctid) => (cmin, cmax) to be correct for the rewritten heap.
*
* For that, every time we find a tuple that's been modified in a catalog
* relation within the xmin horizon of any decoding slot, we log a mapping
@@ -1080,9 +1080,9 @@ logical_rewrite_heap_tuple(RewriteState state, ItemPointerData old_tid,
return;
/* fill out mapping information */
- map.old_node = state->rs_old_rel->rd_node;
+ map.old_locator = state->rs_old_rel->rd_locator;
map.old_tid = old_tid;
- map.new_node = state->rs_new_rel->rd_node;
+ map.new_locator = state->rs_new_rel->rd_locator;
map.new_tid = new_tid;
/* ---
diff --git a/src/backend/access/heap/visibilitymap.c b/src/backend/access/heap/visibilitymap.c
index e09f25a..ed72eb7 100644
--- a/src/backend/access/heap/visibilitymap.c
+++ b/src/backend/access/heap/visibilitymap.c
@@ -283,7 +283,7 @@ visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf,
if (XLogRecPtrIsInvalid(recptr))
{
Assert(!InRecovery);
- recptr = log_heap_visible(rel->rd_node, heapBuf, vmBuf,
+ recptr = log_heap_visible(rel->rd_locator, heapBuf, vmBuf,
cutoff_xid, flags);
/*
@@ -668,7 +668,7 @@ vm_extend(Relation rel, BlockNumber vm_nblocks)
* to keep checking for creation or extension of the file, which happens
* infrequently.
*/
- CacheInvalidateSmgr(reln->smgr_rnode);
+ CacheInvalidateSmgr(reln->smgr_rlocator);
UnlockRelationForExtension(rel, ExclusiveLock);
}
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 20adb60..8b96708 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -836,7 +836,7 @@ _bt_log_reuse_page(Relation rel, BlockNumber blkno, FullTransactionId safexid)
*/
/* XLOG stuff */
- xlrec_reuse.node = rel->rd_node;
+ xlrec_reuse.locator = rel->rd_locator;
xlrec_reuse.block = blkno;
xlrec_reuse.latestRemovedFullXid = safexid;
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 9b730f3..b52eca8 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -166,7 +166,7 @@ btbuildempty(Relation index)
PageSetChecksumInplace(metapage, BTREE_METAPAGE);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, BTREE_METAPAGE,
(char *) metapage, true);
- log_newpage(&RelationGetSmgr(index)->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&RelationGetSmgr(index)->smgr_rlocator.locator, INIT_FORKNUM,
BTREE_METAPAGE, metapage, true);
/*
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index 9f60fa9..bd1685c 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -647,7 +647,7 @@ _bt_blwritepage(BTWriteState *wstate, Page page, BlockNumber blkno)
if (wstate->btws_use_wal)
{
/* We use the XLOG_FPI record type for this */
- log_newpage(&wstate->index->rd_node, MAIN_FORKNUM, blkno, page, true);
+ log_newpage(&wstate->index->rd_locator, MAIN_FORKNUM, blkno, page, true);
}
/*
diff --git a/src/backend/access/nbtree/nbtxlog.c b/src/backend/access/nbtree/nbtxlog.c
index f9186ca..ad489e3 100644
--- a/src/backend/access/nbtree/nbtxlog.c
+++ b/src/backend/access/nbtree/nbtxlog.c
@@ -664,11 +664,11 @@ btree_xlog_delete(XLogReaderState *record)
*/
if (InHotStandby)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, NULL);
- ResolveRecoveryConflictWithSnapshot(xlrec->latestRemovedXid, rnode);
+ ResolveRecoveryConflictWithSnapshot(xlrec->latestRemovedXid, rlocator);
}
/*
@@ -1006,7 +1006,7 @@ btree_xlog_reuse_page(XLogReaderState *record)
if (InHotStandby)
ResolveRecoveryConflictWithSnapshotFullXid(xlrec->latestRemovedFullXid,
- xlrec->node);
+ xlrec->locator);
}
void
diff --git a/src/backend/access/rmgrdesc/genericdesc.c b/src/backend/access/rmgrdesc/genericdesc.c
index 877beb5..d8509b8 100644
--- a/src/backend/access/rmgrdesc/genericdesc.c
+++ b/src/backend/access/rmgrdesc/genericdesc.c
@@ -15,7 +15,7 @@
#include "access/generic_xlog.h"
#include "lib/stringinfo.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* Description of generic xlog record: write page regions that this record
diff --git a/src/backend/access/rmgrdesc/gindesc.c b/src/backend/access/rmgrdesc/gindesc.c
index 57f7bce..7d147ce 100644
--- a/src/backend/access/rmgrdesc/gindesc.c
+++ b/src/backend/access/rmgrdesc/gindesc.c
@@ -17,7 +17,7 @@
#include "access/ginxlog.h"
#include "access/xlogutils.h"
#include "lib/stringinfo.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
static void
desc_recompress_leaf(StringInfo buf, ginxlogRecompressDataLeaf *insertData)
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index d0c8e24..7dd3c1d 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -16,7 +16,7 @@
#include "access/gistxlog.h"
#include "lib/stringinfo.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
static void
out_gistxlogPageUpdate(StringInfo buf, gistxlogPageUpdate *xlrec)
@@ -27,8 +27,8 @@ static void
out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec)
{
appendStringInfo(buf, "rel %u/%u/%u; blk %u; latestRemovedXid %u:%u",
- xlrec->node.spcNode, xlrec->node.dbNode,
- xlrec->node.relNode, xlrec->block,
+ xlrec->locator.spcOid, xlrec->locator.dbOid,
+ xlrec->locator.relNumber, xlrec->block,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
XidFromFullTransactionId(xlrec->latestRemovedFullXid));
}
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 6238085..923d3bc 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -170,9 +170,9 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
xl_heap_new_cid *xlrec = (xl_heap_new_cid *) rec;
appendStringInfo(buf, "rel %u/%u/%u; tid %u/%u",
- xlrec->target_node.spcNode,
- xlrec->target_node.dbNode,
- xlrec->target_node.relNode,
+ xlrec->target_locator.spcOid,
+ xlrec->target_locator.dbOid,
+ xlrec->target_locator.relNumber,
ItemPointerGetBlockNumber(&(xlrec->target_tid)),
ItemPointerGetOffsetNumber(&(xlrec->target_tid)));
appendStringInfo(buf, "; cmin: %u, cmax: %u, combo: %u",
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index dfbbf4e..4843cd5 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -101,8 +101,8 @@ btree_desc(StringInfo buf, XLogReaderState *record)
xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
appendStringInfo(buf, "rel %u/%u/%u; latestRemovedXid %u:%u",
- xlrec->node.spcNode, xlrec->node.dbNode,
- xlrec->node.relNode,
+ xlrec->locator.spcOid, xlrec->locator.dbOid,
+ xlrec->locator.relNumber,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
XidFromFullTransactionId(xlrec->latestRemovedFullXid));
break;
diff --git a/src/backend/access/rmgrdesc/seqdesc.c b/src/backend/access/rmgrdesc/seqdesc.c
index d9b1e60..b3845f9 100644
--- a/src/backend/access/rmgrdesc/seqdesc.c
+++ b/src/backend/access/rmgrdesc/seqdesc.c
@@ -26,8 +26,8 @@ seq_desc(StringInfo buf, XLogReaderState *record)
if (info == XLOG_SEQ_LOG)
appendStringInfo(buf, "rel %u/%u/%u",
- xlrec->node.spcNode, xlrec->node.dbNode,
- xlrec->node.relNode);
+ xlrec->locator.spcOid, xlrec->locator.dbOid,
+ xlrec->locator.relNumber);
}
const char *
diff --git a/src/backend/access/rmgrdesc/smgrdesc.c b/src/backend/access/rmgrdesc/smgrdesc.c
index 7547813..e0ee8a0 100644
--- a/src/backend/access/rmgrdesc/smgrdesc.c
+++ b/src/backend/access/rmgrdesc/smgrdesc.c
@@ -26,7 +26,7 @@ smgr_desc(StringInfo buf, XLogReaderState *record)
if (info == XLOG_SMGR_CREATE)
{
xl_smgr_create *xlrec = (xl_smgr_create *) rec;
- char *path = relpathperm(xlrec->rnode, xlrec->forkNum);
+ char *path = relpathperm(xlrec->rlocator, xlrec->forkNum);
appendStringInfoString(buf, path);
pfree(path);
@@ -34,7 +34,7 @@ smgr_desc(StringInfo buf, XLogReaderState *record)
else if (info == XLOG_SMGR_TRUNCATE)
{
xl_smgr_truncate *xlrec = (xl_smgr_truncate *) rec;
- char *path = relpathperm(xlrec->rnode, MAIN_FORKNUM);
+ char *path = relpathperm(xlrec->rlocator, MAIN_FORKNUM);
appendStringInfo(buf, "%s to %u blocks flags %d", path,
xlrec->blkno, xlrec->flags);
diff --git a/src/backend/access/rmgrdesc/xactdesc.c b/src/backend/access/rmgrdesc/xactdesc.c
index 90b6ac2..39752cf 100644
--- a/src/backend/access/rmgrdesc/xactdesc.c
+++ b/src/backend/access/rmgrdesc/xactdesc.c
@@ -73,15 +73,15 @@ ParseCommitRecord(uint8 info, xl_xact_commit *xlrec, xl_xact_parsed_commit *pars
data += parsed->nsubxacts * sizeof(TransactionId);
}
- if (parsed->xinfo & XACT_XINFO_HAS_RELFILENODES)
+ if (parsed->xinfo & XACT_XINFO_HAS_RELFILELOCATORS)
{
- xl_xact_relfilenodes *xl_relfilenodes = (xl_xact_relfilenodes *) data;
+ xl_xact_relfilelocators *xl_rellocators = (xl_xact_relfilelocators *) data;
- parsed->nrels = xl_relfilenodes->nrels;
- parsed->xnodes = xl_relfilenodes->xnodes;
+ parsed->nrels = xl_rellocators->nrels;
+ parsed->xlocators = xl_rellocators->xlocators;
- data += MinSizeOfXactRelfilenodes;
- data += xl_relfilenodes->nrels * sizeof(RelFileNode);
+ data += MinSizeOfXactRelfileLocators;
+ data += xl_rellocators->nrels * sizeof(RelFileLocator);
}
if (parsed->xinfo & XACT_XINFO_HAS_DROPPED_STATS)
@@ -179,15 +179,15 @@ ParseAbortRecord(uint8 info, xl_xact_abort *xlrec, xl_xact_parsed_abort *parsed)
data += parsed->nsubxacts * sizeof(TransactionId);
}
- if (parsed->xinfo & XACT_XINFO_HAS_RELFILENODES)
+ if (parsed->xinfo & XACT_XINFO_HAS_RELFILELOCATORS)
{
- xl_xact_relfilenodes *xl_relfilenodes = (xl_xact_relfilenodes *) data;
+ xl_xact_relfilelocators *xl_rellocators = (xl_xact_relfilelocators *) data;
- parsed->nrels = xl_relfilenodes->nrels;
- parsed->xnodes = xl_relfilenodes->xnodes;
+ parsed->nrels = xl_rellocators->nrels;
+ parsed->xlocators = xl_rellocators->xlocators;
- data += MinSizeOfXactRelfilenodes;
- data += xl_relfilenodes->nrels * sizeof(RelFileNode);
+ data += MinSizeOfXactRelfileLocators;
+ data += xl_rellocators->nrels * sizeof(RelFileLocator);
}
if (parsed->xinfo & XACT_XINFO_HAS_DROPPED_STATS)
@@ -260,11 +260,11 @@ ParsePrepareRecord(uint8 info, xl_xact_prepare *xlrec, xl_xact_parsed_prepare *p
parsed->subxacts = (TransactionId *) bufptr;
bufptr += MAXALIGN(xlrec->nsubxacts * sizeof(TransactionId));
- parsed->xnodes = (RelFileNode *) bufptr;
- bufptr += MAXALIGN(xlrec->ncommitrels * sizeof(RelFileNode));
+ parsed->xlocators = (RelFileLocator *) bufptr;
+ bufptr += MAXALIGN(xlrec->ncommitrels * sizeof(RelFileLocator));
- parsed->abortnodes = (RelFileNode *) bufptr;
- bufptr += MAXALIGN(xlrec->nabortrels * sizeof(RelFileNode));
+ parsed->abortlocators = (RelFileLocator *) bufptr;
+ bufptr += MAXALIGN(xlrec->nabortrels * sizeof(RelFileLocator));
parsed->stats = (xl_xact_stats_item *) bufptr;
bufptr += MAXALIGN(xlrec->ncommitstats * sizeof(xl_xact_stats_item));
@@ -278,7 +278,7 @@ ParsePrepareRecord(uint8 info, xl_xact_prepare *xlrec, xl_xact_parsed_prepare *p
static void
xact_desc_relations(StringInfo buf, char *label, int nrels,
- RelFileNode *xnodes)
+ RelFileLocator *xlocators)
{
int i;
@@ -287,7 +287,7 @@ xact_desc_relations(StringInfo buf, char *label, int nrels,
appendStringInfo(buf, "; %s:", label);
for (i = 0; i < nrels; i++)
{
- char *path = relpathperm(xnodes[i], MAIN_FORKNUM);
+ char *path = relpathperm(xlocators[i], MAIN_FORKNUM);
appendStringInfo(buf, " %s", path);
pfree(path);
@@ -340,7 +340,7 @@ xact_desc_commit(StringInfo buf, uint8 info, xl_xact_commit *xlrec, RepOriginId
appendStringInfoString(buf, timestamptz_to_str(xlrec->xact_time));
- xact_desc_relations(buf, "rels", parsed.nrels, parsed.xnodes);
+ xact_desc_relations(buf, "rels", parsed.nrels, parsed.xlocators);
xact_desc_subxacts(buf, parsed.nsubxacts, parsed.subxacts);
xact_desc_stats(buf, "", parsed.nstats, parsed.stats);
@@ -376,7 +376,7 @@ xact_desc_abort(StringInfo buf, uint8 info, xl_xact_abort *xlrec, RepOriginId or
appendStringInfoString(buf, timestamptz_to_str(xlrec->xact_time));
- xact_desc_relations(buf, "rels", parsed.nrels, parsed.xnodes);
+ xact_desc_relations(buf, "rels", parsed.nrels, parsed.xlocators);
xact_desc_subxacts(buf, parsed.nsubxacts, parsed.subxacts);
if (parsed.xinfo & XACT_XINFO_HAS_ORIGIN)
@@ -400,9 +400,9 @@ xact_desc_prepare(StringInfo buf, uint8 info, xl_xact_prepare *xlrec, RepOriginI
appendStringInfo(buf, "gid %s: ", parsed.twophase_gid);
appendStringInfoString(buf, timestamptz_to_str(parsed.xact_time));
- xact_desc_relations(buf, "rels(commit)", parsed.nrels, parsed.xnodes);
+ xact_desc_relations(buf, "rels(commit)", parsed.nrels, parsed.xlocators);
xact_desc_relations(buf, "rels(abort)", parsed.nabortrels,
- parsed.abortnodes);
+ parsed.abortlocators);
xact_desc_stats(buf, "commit ", parsed.nstats, parsed.stats);
xact_desc_stats(buf, "abort ", parsed.nabortstats, parsed.abortstats);
xact_desc_subxacts(buf, parsed.nsubxacts, parsed.subxacts);
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index fefc563..6fec485 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -219,12 +219,12 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
for (block_id = 0; block_id <= XLogRecMaxBlockId(record); block_id++)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blk;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blk, NULL))
+ &rlocator, &forknum, &blk, NULL))
continue;
if (detailed_format)
@@ -239,7 +239,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfo(buf,
"blkref #%d: rel %u/%u/%u fork %s blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
blk);
@@ -299,7 +299,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfo(buf,
", blkref #%d: rel %u/%u/%u fork %s blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
blk);
}
@@ -308,7 +308,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfo(buf,
", blkref #%d: rel %u/%u/%u blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
blk);
}
diff --git a/src/backend/access/spgist/spginsert.c b/src/backend/access/spgist/spginsert.c
index bfb7404..c6821b5 100644
--- a/src/backend/access/spgist/spginsert.c
+++ b/src/backend/access/spgist/spginsert.c
@@ -171,7 +171,7 @@ spgbuildempty(Relation index)
PageSetChecksumInplace(page, SPGIST_METAPAGE_BLKNO);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, SPGIST_METAPAGE_BLKNO,
(char *) page, true);
- log_newpage(&(RelationGetSmgr(index))->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&(RelationGetSmgr(index))->smgr_rlocator.locator, INIT_FORKNUM,
SPGIST_METAPAGE_BLKNO, page, true);
/* Likewise for the root page. */
@@ -180,7 +180,7 @@ spgbuildempty(Relation index)
PageSetChecksumInplace(page, SPGIST_ROOT_BLKNO);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, SPGIST_ROOT_BLKNO,
(char *) page, true);
- log_newpage(&(RelationGetSmgr(index))->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&(RelationGetSmgr(index))->smgr_rlocator.locator, INIT_FORKNUM,
SPGIST_ROOT_BLKNO, page, true);
/* Likewise for the null-tuples root page. */
@@ -189,7 +189,7 @@ spgbuildempty(Relation index)
PageSetChecksumInplace(page, SPGIST_NULL_BLKNO);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, SPGIST_NULL_BLKNO,
(char *) page, true);
- log_newpage(&(RelationGetSmgr(index))->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&(RelationGetSmgr(index))->smgr_rlocator.locator, INIT_FORKNUM,
SPGIST_NULL_BLKNO, page, true);
/*
diff --git a/src/backend/access/spgist/spgxlog.c b/src/backend/access/spgist/spgxlog.c
index b500b2c..4c9f402 100644
--- a/src/backend/access/spgist/spgxlog.c
+++ b/src/backend/access/spgist/spgxlog.c
@@ -877,11 +877,11 @@ spgRedoVacuumRedirect(XLogReaderState *record)
{
if (TransactionIdIsValid(xldata->newestRedirectXid))
{
- RelFileNode node;
+ RelFileLocator locator;
- XLogRecGetBlockTag(record, 0, &node, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, &locator, NULL, NULL);
ResolveRecoveryConflictWithSnapshot(xldata->newestRedirectXid,
- node);
+ locator);
}
}
diff --git a/src/backend/access/table/tableamapi.c b/src/backend/access/table/tableamapi.c
index 76df798..873d961 100644
--- a/src/backend/access/table/tableamapi.c
+++ b/src/backend/access/table/tableamapi.c
@@ -82,7 +82,7 @@ GetTableAmRoutine(Oid amhandler)
Assert(routine->tuple_update != NULL);
Assert(routine->tuple_lock != NULL);
- Assert(routine->relation_set_new_filenode != NULL);
+ Assert(routine->relation_set_new_filelocator != NULL);
Assert(routine->relation_nontransactional_truncate != NULL);
Assert(routine->relation_copy_data != NULL);
Assert(routine->relation_copy_for_cluster != NULL);
diff --git a/src/backend/access/transam/README b/src/backend/access/transam/README
index 1edc818..734c39a 100644
--- a/src/backend/access/transam/README
+++ b/src/backend/access/transam/README
@@ -557,7 +557,7 @@ void XLogRegisterBuffer(uint8 block_id, Buffer buf, uint8 flags);
XLogRegisterBuffer adds information about a data block to the WAL record.
block_id is an arbitrary number used to identify this page reference in
the redo routine. The information needed to re-find the page at redo -
- relfilenode, fork, and block number - are included in the WAL record.
+ relfilelocator, fork, and block number - are included in the WAL record.
XLogInsert will automatically include a full copy of the page contents, if
this is the first modification of the buffer since the last checkpoint.
@@ -692,7 +692,7 @@ by having database restart search for files that don't have any committed
entry in pg_class, but that currently isn't done because of the possibility
of deleting data that is useful for forensic analysis of the crash.
Orphan files are harmless --- at worst they waste a bit of disk space ---
-because we check for on-disk collisions when allocating new relfilenode
+because we check for on-disk collisions when allocating new relfilenumber
OIDs. So cleaning up isn't really necessary.
3. Deleting a table, which requires an unlink() that could fail.
@@ -725,10 +725,10 @@ then restart recovery. This is part of the reason for not writing a WAL
entry until we've successfully done the original action.
-Skipping WAL for New RelFileNode
+Skipping WAL for New RelFileLocator
---------------------------------
+-----------------------------------
-Under wal_level=minimal, if a change modifies a relfilenode that ROLLBACK
+Under wal_level=minimal, if a change modifies a relfilenumber that ROLLBACK
would unlink, in-tree access methods write no WAL for that change. Code that
writes WAL without calling RelationNeedsWAL() must check for this case. This
skipping is mandatory. If a WAL-writing change preceded a WAL-skipping change
@@ -748,9 +748,9 @@ unconditionally for permanent relations. Under these approaches, the access
method callbacks must not call functions that react to RelationNeedsWAL().
This applies only to WAL records whose replay would modify bytes stored in the
-new relfilenode. It does not apply to other records about the relfilenode,
+new relfilenumber. It does not apply to other records about the relfilenumber,
such as XLOG_SMGR_CREATE. Because it operates at the level of individual
-relfilenodes, RelationNeedsWAL() can differ for tightly-coupled relations.
+relfilenumbers, RelationNeedsWAL() can differ for tightly-coupled relations.
Consider "CREATE TABLE t (); BEGIN; ALTER TABLE t ADD c text; ..." in which
ALTER TABLE adds a TOAST relation. The TOAST relation will skip WAL, while
the table owning it will not. ALTER TABLE SET TABLESPACE will cause a table
@@ -860,7 +860,7 @@ Changes to a temp table are not WAL-logged, hence could reach disk in
advance of T1's commit, but we don't care since temp table contents don't
survive crashes anyway.
-Database writes that skip WAL for new relfilenodes are also safe. In these
+Database writes that skip WAL for new relfilenumbers are also safe. In these
cases it's entirely possible for the data to reach disk before T1's commit,
because T1 will fsync it down to disk without any sort of interlock. However,
all these paths are designed to write data that no other transaction can see
diff --git a/src/backend/access/transam/README.parallel b/src/backend/access/transam/README.parallel
index 99c588d..e486bff 100644
--- a/src/backend/access/transam/README.parallel
+++ b/src/backend/access/transam/README.parallel
@@ -126,7 +126,7 @@ worker. This includes:
an index that is currently being rebuilt.
- Active relmapper.c mapping state. This is needed to allow consistent
- answers when fetching the current relfilenode for relation oids of
+ answers when fetching the current relfilenumber for relation oids of
mapped relations.
To prevent unprincipled deadlocks when running in parallel mode, this code
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 75551f6..41b31c5 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -204,7 +204,7 @@ static void RecordTransactionCommitPrepared(TransactionId xid,
int nchildren,
TransactionId *children,
int nrels,
- RelFileNode *rels,
+ RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
int ninvalmsgs,
@@ -215,7 +215,7 @@ static void RecordTransactionAbortPrepared(TransactionId xid,
int nchildren,
TransactionId *children,
int nrels,
- RelFileNode *rels,
+ RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
const char *gid);
@@ -951,8 +951,8 @@ TwoPhaseGetDummyProc(TransactionId xid, bool lock_held)
*
* 1. TwoPhaseFileHeader
* 2. TransactionId[] (subtransactions)
- * 3. RelFileNode[] (files to be deleted at commit)
- * 4. RelFileNode[] (files to be deleted at abort)
+ * 3. RelFileLocator[] (files to be deleted at commit)
+ * 4. RelFileLocator[] (files to be deleted at abort)
* 5. SharedInvalidationMessage[] (inval messages to be sent at commit)
* 6. TwoPhaseRecordOnDisk
* 7. ...
@@ -1047,8 +1047,8 @@ StartPrepare(GlobalTransaction gxact)
TransactionId xid = gxact->xid;
TwoPhaseFileHeader hdr;
TransactionId *children;
- RelFileNode *commitrels;
- RelFileNode *abortrels;
+ RelFileLocator *commitrels;
+ RelFileLocator *abortrels;
xl_xact_stats_item *abortstats = NULL;
xl_xact_stats_item *commitstats = NULL;
SharedInvalidationMessage *invalmsgs;
@@ -1102,12 +1102,12 @@ StartPrepare(GlobalTransaction gxact)
}
if (hdr.ncommitrels > 0)
{
- save_state_data(commitrels, hdr.ncommitrels * sizeof(RelFileNode));
+ save_state_data(commitrels, hdr.ncommitrels * sizeof(RelFileLocator));
pfree(commitrels);
}
if (hdr.nabortrels > 0)
{
- save_state_data(abortrels, hdr.nabortrels * sizeof(RelFileNode));
+ save_state_data(abortrels, hdr.nabortrels * sizeof(RelFileLocator));
pfree(abortrels);
}
if (hdr.ncommitstats > 0)
@@ -1489,9 +1489,9 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
TwoPhaseFileHeader *hdr;
TransactionId latestXid;
TransactionId *children;
- RelFileNode *commitrels;
- RelFileNode *abortrels;
- RelFileNode *delrels;
+ RelFileLocator *commitrels;
+ RelFileLocator *abortrels;
+ RelFileLocator *delrels;
int ndelrels;
xl_xact_stats_item *commitstats;
xl_xact_stats_item *abortstats;
@@ -1525,10 +1525,10 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
bufptr += MAXALIGN(hdr->gidlen);
children = (TransactionId *) bufptr;
bufptr += MAXALIGN(hdr->nsubxacts * sizeof(TransactionId));
- commitrels = (RelFileNode *) bufptr;
- bufptr += MAXALIGN(hdr->ncommitrels * sizeof(RelFileNode));
- abortrels = (RelFileNode *) bufptr;
- bufptr += MAXALIGN(hdr->nabortrels * sizeof(RelFileNode));
+ commitrels = (RelFileLocator *) bufptr;
+ bufptr += MAXALIGN(hdr->ncommitrels * sizeof(RelFileLocator));
+ abortrels = (RelFileLocator *) bufptr;
+ bufptr += MAXALIGN(hdr->nabortrels * sizeof(RelFileLocator));
commitstats = (xl_xact_stats_item *) bufptr;
bufptr += MAXALIGN(hdr->ncommitstats * sizeof(xl_xact_stats_item));
abortstats = (xl_xact_stats_item *) bufptr;
@@ -2100,8 +2100,8 @@ RecoverPreparedTransactions(void)
bufptr += MAXALIGN(hdr->gidlen);
subxids = (TransactionId *) bufptr;
bufptr += MAXALIGN(hdr->nsubxacts * sizeof(TransactionId));
- bufptr += MAXALIGN(hdr->ncommitrels * sizeof(RelFileNode));
- bufptr += MAXALIGN(hdr->nabortrels * sizeof(RelFileNode));
+ bufptr += MAXALIGN(hdr->ncommitrels * sizeof(RelFileLocator));
+ bufptr += MAXALIGN(hdr->nabortrels * sizeof(RelFileLocator));
bufptr += MAXALIGN(hdr->ncommitstats * sizeof(xl_xact_stats_item));
bufptr += MAXALIGN(hdr->nabortstats * sizeof(xl_xact_stats_item));
bufptr += MAXALIGN(hdr->ninvalmsgs * sizeof(SharedInvalidationMessage));
@@ -2285,7 +2285,7 @@ RecordTransactionCommitPrepared(TransactionId xid,
int nchildren,
TransactionId *children,
int nrels,
- RelFileNode *rels,
+ RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
int ninvalmsgs,
@@ -2383,7 +2383,7 @@ RecordTransactionAbortPrepared(TransactionId xid,
int nchildren,
TransactionId *children,
int nrels,
- RelFileNode *rels,
+ RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
const char *gid)
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 748120a..849a7ce 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -521,7 +521,7 @@ ForceTransactionIdLimitUpdate(void)
* wide, counter wraparound will occur eventually, and therefore it is unwise
* to assume they are unique unless precautions are taken to make them so.
* Hence, this routine should generally not be used directly. The only direct
- * callers should be GetNewOidWithIndex() and GetNewRelFileNode() in
+ * callers should be GetNewOidWithIndex() and GetNewRelFileNumber() in
* catalog/catalog.c.
*/
Oid
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index bd60b55..116de11 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -1282,7 +1282,7 @@ RecordTransactionCommit(void)
bool markXidCommitted = TransactionIdIsValid(xid);
TransactionId latestXid = InvalidTransactionId;
int nrels;
- RelFileNode *rels;
+ RelFileLocator *rels;
int nchildren;
TransactionId *children;
int ndroppedstats = 0;
@@ -1705,7 +1705,7 @@ RecordTransactionAbort(bool isSubXact)
TransactionId xid = GetCurrentTransactionIdIfAny();
TransactionId latestXid;
int nrels;
- RelFileNode *rels;
+ RelFileLocator *rels;
int ndroppedstats = 0;
xl_xact_stats_item *droppedstats = NULL;
int nchildren;
@@ -5586,7 +5586,7 @@ xactGetCommittedChildren(TransactionId **ptr)
XLogRecPtr
XactLogCommitRecord(TimestampTz commit_time,
int nsubxacts, TransactionId *subxacts,
- int nrels, RelFileNode *rels,
+ int nrels, RelFileLocator *rels,
int ndroppedstats, xl_xact_stats_item *droppedstats,
int nmsgs, SharedInvalidationMessage *msgs,
bool relcacheInval,
@@ -5597,7 +5597,7 @@ XactLogCommitRecord(TimestampTz commit_time,
xl_xact_xinfo xl_xinfo;
xl_xact_dbinfo xl_dbinfo;
xl_xact_subxacts xl_subxacts;
- xl_xact_relfilenodes xl_relfilenodes;
+ xl_xact_relfilelocators xl_relfilelocators;
xl_xact_stats_items xl_dropped_stats;
xl_xact_invals xl_invals;
xl_xact_twophase xl_twophase;
@@ -5651,8 +5651,8 @@ XactLogCommitRecord(TimestampTz commit_time,
if (nrels > 0)
{
- xl_xinfo.xinfo |= XACT_XINFO_HAS_RELFILENODES;
- xl_relfilenodes.nrels = nrels;
+ xl_xinfo.xinfo |= XACT_XINFO_HAS_RELFILELOCATORS;
+ xl_relfilelocators.nrels = nrels;
info |= XLR_SPECIAL_REL_UPDATE;
}
@@ -5710,12 +5710,12 @@ XactLogCommitRecord(TimestampTz commit_time,
nsubxacts * sizeof(TransactionId));
}
- if (xl_xinfo.xinfo & XACT_XINFO_HAS_RELFILENODES)
+ if (xl_xinfo.xinfo & XACT_XINFO_HAS_RELFILELOCATORS)
{
- XLogRegisterData((char *) (&xl_relfilenodes),
- MinSizeOfXactRelfilenodes);
+ XLogRegisterData((char *) (&xl_relfilelocators),
+ MinSizeOfXactRelfileLocators);
XLogRegisterData((char *) rels,
- nrels * sizeof(RelFileNode));
+ nrels * sizeof(RelFileLocator));
}
if (xl_xinfo.xinfo & XACT_XINFO_HAS_DROPPED_STATS)
@@ -5758,7 +5758,7 @@ XactLogCommitRecord(TimestampTz commit_time,
XLogRecPtr
XactLogAbortRecord(TimestampTz abort_time,
int nsubxacts, TransactionId *subxacts,
- int nrels, RelFileNode *rels,
+ int nrels, RelFileLocator *rels,
int ndroppedstats, xl_xact_stats_item *droppedstats,
int xactflags, TransactionId twophase_xid,
const char *twophase_gid)
@@ -5766,7 +5766,7 @@ XactLogAbortRecord(TimestampTz abort_time,
xl_xact_abort xlrec;
xl_xact_xinfo xl_xinfo;
xl_xact_subxacts xl_subxacts;
- xl_xact_relfilenodes xl_relfilenodes;
+ xl_xact_relfilelocators xl_relfilelocators;
xl_xact_stats_items xl_dropped_stats;
xl_xact_twophase xl_twophase;
xl_xact_dbinfo xl_dbinfo;
@@ -5800,8 +5800,8 @@ XactLogAbortRecord(TimestampTz abort_time,
if (nrels > 0)
{
- xl_xinfo.xinfo |= XACT_XINFO_HAS_RELFILENODES;
- xl_relfilenodes.nrels = nrels;
+ xl_xinfo.xinfo |= XACT_XINFO_HAS_RELFILELOCATORS;
+ xl_relfilelocators.nrels = nrels;
info |= XLR_SPECIAL_REL_UPDATE;
}
@@ -5864,12 +5864,12 @@ XactLogAbortRecord(TimestampTz abort_time,
nsubxacts * sizeof(TransactionId));
}
- if (xl_xinfo.xinfo & XACT_XINFO_HAS_RELFILENODES)
+ if (xl_xinfo.xinfo & XACT_XINFO_HAS_RELFILELOCATORS)
{
- XLogRegisterData((char *) (&xl_relfilenodes),
- MinSizeOfXactRelfilenodes);
+ XLogRegisterData((char *) (&xl_relfilelocators),
+ MinSizeOfXactRelfileLocators);
XLogRegisterData((char *) rels,
- nrels * sizeof(RelFileNode));
+ nrels * sizeof(RelFileLocator));
}
if (xl_xinfo.xinfo & XACT_XINFO_HAS_DROPPED_STATS)
@@ -6010,7 +6010,7 @@ xact_redo_commit(xl_xact_parsed_commit *parsed,
XLogFlush(lsn);
/* Make sure files supposed to be dropped are dropped */
- DropRelationFiles(parsed->xnodes, parsed->nrels, true);
+ DropRelationFiles(parsed->xlocators, parsed->nrels, true);
}
if (parsed->nstats > 0)
@@ -6121,7 +6121,7 @@ xact_redo_abort(xl_xact_parsed_abort *parsed, TransactionId xid,
*/
XLogFlush(lsn);
- DropRelationFiles(parsed->xnodes, parsed->nrels, true);
+ DropRelationFiles(parsed->xlocators, parsed->nrels, true);
}
if (parsed->nstats > 0)
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index 2ce9be2..ec27d36 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -70,7 +70,7 @@ typedef struct
{
bool in_use; /* is this slot in use? */
uint8 flags; /* REGBUF_* flags */
- RelFileNode rnode; /* identifies the relation and block */
+ RelFileLocator rlocator; /* identifies the relation and block */
ForkNumber forkno;
BlockNumber block;
Page page; /* page content */
@@ -257,7 +257,7 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
regbuf = &registered_buffers[block_id];
- BufferGetTag(buffer, &regbuf->rnode, &regbuf->forkno, &regbuf->block);
+ BufferGetTag(buffer, &regbuf->rlocator, &regbuf->forkno, &regbuf->block);
regbuf->page = BufferGetPage(buffer);
regbuf->flags = flags;
regbuf->rdata_tail = (XLogRecData *) &regbuf->rdata_head;
@@ -278,7 +278,7 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
if (i == block_id || !regbuf_old->in_use)
continue;
- Assert(!RelFileNodeEquals(regbuf_old->rnode, regbuf->rnode) ||
+ Assert(!RelFileLocatorEquals(regbuf_old->rlocator, regbuf->rlocator) ||
regbuf_old->forkno != regbuf->forkno ||
regbuf_old->block != regbuf->block);
}
@@ -293,7 +293,7 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
* shared buffer pool (i.e. when you don't have a Buffer for it).
*/
void
-XLogRegisterBlock(uint8 block_id, RelFileNode *rnode, ForkNumber forknum,
+XLogRegisterBlock(uint8 block_id, RelFileLocator *rlocator, ForkNumber forknum,
BlockNumber blknum, Page page, uint8 flags)
{
registered_buffer *regbuf;
@@ -308,7 +308,7 @@ XLogRegisterBlock(uint8 block_id, RelFileNode *rnode, ForkNumber forknum,
regbuf = &registered_buffers[block_id];
- regbuf->rnode = *rnode;
+ regbuf->rlocator = *rlocator;
regbuf->forkno = forknum;
regbuf->block = blknum;
regbuf->page = page;
@@ -331,7 +331,7 @@ XLogRegisterBlock(uint8 block_id, RelFileNode *rnode, ForkNumber forknum,
if (i == block_id || !regbuf_old->in_use)
continue;
- Assert(!RelFileNodeEquals(regbuf_old->rnode, regbuf->rnode) ||
+ Assert(!RelFileLocatorEquals(regbuf_old->rlocator, regbuf->rlocator) ||
regbuf_old->forkno != regbuf->forkno ||
regbuf_old->block != regbuf->block);
}
@@ -768,7 +768,7 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
rdt_datas_last = regbuf->rdata_tail;
}
- if (prev_regbuf && RelFileNodeEquals(regbuf->rnode, prev_regbuf->rnode))
+ if (prev_regbuf && RelFileLocatorEquals(regbuf->rlocator, prev_regbuf->rlocator))
{
samerel = true;
bkpb.fork_flags |= BKPBLOCK_SAME_REL;
@@ -793,8 +793,8 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
}
if (!samerel)
{
- memcpy(scratch, &regbuf->rnode, sizeof(RelFileNode));
- scratch += sizeof(RelFileNode);
+ memcpy(scratch, &regbuf->rlocator, sizeof(RelFileLocator));
+ scratch += sizeof(RelFileLocator);
}
memcpy(scratch, &regbuf->block, sizeof(BlockNumber));
scratch += sizeof(BlockNumber);
@@ -1031,7 +1031,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
int flags = 0;
PGAlignedBlock copied_buffer;
char *origdata = (char *) BufferGetBlock(buffer);
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forkno;
BlockNumber blkno;
@@ -1058,8 +1058,8 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
if (buffer_std)
flags |= REGBUF_STANDARD;
- BufferGetTag(buffer, &rnode, &forkno, &blkno);
- XLogRegisterBlock(0, &rnode, forkno, blkno, copied_buffer.data, flags);
+ BufferGetTag(buffer, &rlocator, &forkno, &blkno);
+ XLogRegisterBlock(0, &rlocator, forkno, blkno, copied_buffer.data, flags);
recptr = XLogInsert(RM_XLOG_ID, XLOG_FPI_FOR_HINT);
}
@@ -1080,7 +1080,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
* the unused space to be left out from the WAL record, making it smaller.
*/
XLogRecPtr
-log_newpage(RelFileNode *rnode, ForkNumber forkNum, BlockNumber blkno,
+log_newpage(RelFileLocator *rlocator, ForkNumber forkNum, BlockNumber blkno,
Page page, bool page_std)
{
int flags;
@@ -1091,7 +1091,7 @@ log_newpage(RelFileNode *rnode, ForkNumber forkNum, BlockNumber blkno,
flags |= REGBUF_STANDARD;
XLogBeginInsert();
- XLogRegisterBlock(0, rnode, forkNum, blkno, page, flags);
+ XLogRegisterBlock(0, rlocator, forkNum, blkno, page, flags);
recptr = XLogInsert(RM_XLOG_ID, XLOG_FPI);
/*
@@ -1112,7 +1112,7 @@ log_newpage(RelFileNode *rnode, ForkNumber forkNum, BlockNumber blkno,
* because we can write multiple pages in a single WAL record.
*/
void
-log_newpages(RelFileNode *rnode, ForkNumber forkNum, int num_pages,
+log_newpages(RelFileLocator *rlocator, ForkNumber forkNum, int num_pages,
BlockNumber *blknos, Page *pages, bool page_std)
{
int flags;
@@ -1142,7 +1142,7 @@ log_newpages(RelFileNode *rnode, ForkNumber forkNum, int num_pages,
nbatch = 0;
while (nbatch < XLR_MAX_BLOCK_ID && i < num_pages)
{
- XLogRegisterBlock(nbatch, rnode, forkNum, blknos[i], pages[i], flags);
+ XLogRegisterBlock(nbatch, rlocator, forkNum, blknos[i], pages[i], flags);
i++;
nbatch++;
}
@@ -1177,16 +1177,16 @@ XLogRecPtr
log_newpage_buffer(Buffer buffer, bool page_std)
{
Page page = BufferGetPage(buffer);
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forkNum;
BlockNumber blkno;
/* Shared buffers should be modified in a critical section. */
Assert(CritSectionCount > 0);
- BufferGetTag(buffer, &rnode, &forkNum, &blkno);
+ BufferGetTag(buffer, &rlocator, &forkNum, &blkno);
- return log_newpage(&rnode, forkNum, blkno, page, page_std);
+ return log_newpage(&rlocator, forkNum, blkno, page, page_std);
}
/*
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index 959e409..c469610 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -138,7 +138,7 @@ struct XLogPrefetcher
dlist_head filter_queue;
/* Book-keeping to avoid repeat prefetches. */
- RelFileNode recent_rnode[XLOGPREFETCHER_SEQ_WINDOW_SIZE];
+ RelFileLocator recent_rlocator[XLOGPREFETCHER_SEQ_WINDOW_SIZE];
BlockNumber recent_block[XLOGPREFETCHER_SEQ_WINDOW_SIZE];
int recent_idx;
@@ -161,7 +161,7 @@ struct XLogPrefetcher
*/
typedef struct XLogPrefetcherFilter
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
XLogRecPtr filter_until_replayed;
BlockNumber filter_from_block;
dlist_node link;
@@ -187,11 +187,11 @@ typedef struct XLogPrefetchStats
} XLogPrefetchStats;
static inline void XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher,
- RelFileNode rnode,
+ RelFileLocator rlocator,
BlockNumber blockno,
XLogRecPtr lsn);
static inline bool XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher,
- RelFileNode rnode,
+ RelFileLocator rlocator,
BlockNumber blockno);
static inline void XLogPrefetcherCompleteFilters(XLogPrefetcher *prefetcher,
XLogRecPtr replaying_lsn);
@@ -365,7 +365,7 @@ XLogPrefetcherAllocate(XLogReaderState *reader)
{
XLogPrefetcher *prefetcher;
static HASHCTL hash_table_ctl = {
- .keysize = sizeof(RelFileNode),
+ .keysize = sizeof(RelFileLocator),
.entrysize = sizeof(XLogPrefetcherFilter)
};
@@ -568,22 +568,23 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
xl_dbase_create_file_copy_rec *xlrec =
(xl_dbase_create_file_copy_rec *) record->main_data;
- RelFileNode rnode = {InvalidOid, xlrec->db_id, InvalidOid};
+ RelFileLocator rlocator =
+ {InvalidOid, xlrec->db_id, InvalidRelFileNumber};
/*
* Don't try to prefetch anything in this database until
* it has been created, or we might confuse the blocks of
- * different generations, if a database OID or relfilenode
- * is reused. It's also more efficient than discovering
- * that relations don't exist on disk yet with ENOENT
- * errors.
+ * different generations, if a database OID or
+ * relfilenumber is reused. It's also more efficient than
+ * discovering that relations don't exist on disk yet with
+ * ENOENT errors.
*/
- XLogPrefetcherAddFilter(prefetcher, rnode, 0, record->lsn);
+ XLogPrefetcherAddFilter(prefetcher, rlocator, 0, record->lsn);
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing prefetch in database %u until %X/%X is replayed due to raw file copy",
- rnode.dbNode,
+ rlocator.dbOid,
LSN_FORMAT_ARGS(record->lsn));
#endif
}
@@ -601,19 +602,19 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* Don't prefetch anything for this whole relation
* until it has been created. Otherwise we might
* confuse the blocks of different generations, if a
- * relfilenode is reused. This also avoids the need
+ * relfilenumber is reused. This also avoids the need
* to discover the problem via extra syscalls that
* report ENOENT.
*/
- XLogPrefetcherAddFilter(prefetcher, xlrec->rnode, 0,
+ XLogPrefetcherAddFilter(prefetcher, xlrec->rlocator, 0,
record->lsn);
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing prefetch in relation %u/%u/%u until %X/%X is replayed, which creates the relation",
- xlrec->rnode.spcNode,
- xlrec->rnode.dbNode,
- xlrec->rnode.relNode,
+ xlrec->rlocator.spcOid,
+ xlrec->rlocator.dbOid,
+ xlrec->rlocator.relNumber,
LSN_FORMAT_ARGS(record->lsn));
#endif
}
@@ -627,16 +628,16 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* Don't consider prefetching anything in the truncated
* range until the truncation has been performed.
*/
- XLogPrefetcherAddFilter(prefetcher, xlrec->rnode,
+ XLogPrefetcherAddFilter(prefetcher, xlrec->rlocator,
xlrec->blkno,
record->lsn);
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, which truncates the relation",
- xlrec->rnode.spcNode,
- xlrec->rnode.dbNode,
- xlrec->rnode.relNode,
+ xlrec->rlocator.spcOid,
+ xlrec->rlocator.dbOid,
+ xlrec->rlocator.relNumber,
xlrec->blkno,
LSN_FORMAT_ARGS(record->lsn));
#endif
@@ -688,7 +689,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
}
/* Should we skip prefetching this block due to a filter? */
- if (XLogPrefetcherIsFiltered(prefetcher, block->rnode, block->blkno))
+ if (XLogPrefetcherIsFiltered(prefetcher, block->rlocator, block->blkno))
{
XLogPrefetchIncrement(&SharedStats->skip_new);
return LRQ_NEXT_NO_IO;
@@ -698,7 +699,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
for (int i = 0; i < XLOGPREFETCHER_SEQ_WINDOW_SIZE; ++i)
{
if (block->blkno == prefetcher->recent_block[i] &&
- RelFileNodeEquals(block->rnode, prefetcher->recent_rnode[i]))
+ RelFileLocatorEquals(block->rlocator, prefetcher->recent_rlocator[i]))
{
/*
* XXX If we also remembered where it was, we could set
@@ -709,7 +710,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
return LRQ_NEXT_NO_IO;
}
}
- prefetcher->recent_rnode[prefetcher->recent_idx] = block->rnode;
+ prefetcher->recent_rlocator[prefetcher->recent_idx] = block->rlocator;
prefetcher->recent_block[prefetcher->recent_idx] = block->blkno;
prefetcher->recent_idx =
(prefetcher->recent_idx + 1) % XLOGPREFETCHER_SEQ_WINDOW_SIZE;
@@ -719,7 +720,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* same relation (with some scheme to handle invalidations
* safely), but for now we'll call smgropen() every time.
*/
- reln = smgropen(block->rnode, InvalidBackendId);
+ reln = smgropen(block->rlocator, InvalidBackendId);
/*
* If the relation file doesn't exist on disk, for example because
@@ -733,12 +734,12 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing all prefetch in relation %u/%u/%u until %X/%X is replayed, because the relation does not exist on disk",
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
LSN_FORMAT_ARGS(record->lsn));
#endif
- XLogPrefetcherAddFilter(prefetcher, block->rnode, 0,
+ XLogPrefetcherAddFilter(prefetcher, block->rlocator, 0,
record->lsn);
XLogPrefetchIncrement(&SharedStats->skip_new);
return LRQ_NEXT_NO_IO;
@@ -754,13 +755,13 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, because the relation is too small",
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
block->blkno,
LSN_FORMAT_ARGS(record->lsn));
#endif
- XLogPrefetcherAddFilter(prefetcher, block->rnode, block->blkno,
+ XLogPrefetcherAddFilter(prefetcher, block->rlocator, block->blkno,
record->lsn);
XLogPrefetchIncrement(&SharedStats->skip_new);
return LRQ_NEXT_NO_IO;
@@ -793,9 +794,9 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
*/
elog(ERROR,
"could not prefetch relation %u/%u/%u block %u",
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
block->blkno);
}
}
@@ -852,17 +853,17 @@ pg_stat_get_recovery_prefetch(PG_FUNCTION_ARGS)
}
/*
- * Don't prefetch any blocks >= 'blockno' from a given 'rnode', until 'lsn'
+ * Don't prefetch any blocks >= 'blockno' from a given 'rlocator', until 'lsn'
* has been replayed.
*/
static inline void
-XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher, RelFileNode rnode,
+XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
BlockNumber blockno, XLogRecPtr lsn)
{
XLogPrefetcherFilter *filter;
bool found;
- filter = hash_search(prefetcher->filter_table, &rnode, HASH_ENTER, &found);
+ filter = hash_search(prefetcher->filter_table, &rlocator, HASH_ENTER, &found);
if (!found)
{
/*
@@ -875,7 +876,7 @@ XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher, RelFileNode rnode,
else
{
/*
- * We were already filtering this rnode. Extend the filter's lifetime
+ * We were already filtering this rlocator. Extend the filter's lifetime
* to cover this WAL record, but leave the lower of the block numbers
* there because we don't want to have to track individual blocks.
*/
@@ -890,7 +891,7 @@ XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher, RelFileNode rnode,
* Have we replayed any records that caused us to begin filtering a block
* range? That means that relations should have been created, extended or
* dropped as required, so we can stop filtering out accesses to a given
- * relfilenode.
+ * relfilenumber.
*/
static inline void
XLogPrefetcherCompleteFilters(XLogPrefetcher *prefetcher, XLogRecPtr replaying_lsn)
@@ -913,7 +914,7 @@ XLogPrefetcherCompleteFilters(XLogPrefetcher *prefetcher, XLogRecPtr replaying_l
* Check if a given block should be skipped due to a filter.
*/
static inline bool
-XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileNode rnode,
+XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
BlockNumber blockno)
{
/*
@@ -925,13 +926,13 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileNode rnode,
XLogPrefetcherFilter *filter;
/* See if the block range is filtered. */
- filter = hash_search(prefetcher->filter_table, &rnode, HASH_FIND, NULL);
+ filter = hash_search(prefetcher->filter_table, &rlocator, HASH_FIND, NULL);
if (filter && filter->filter_from_block <= blockno)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
- rnode.spcNode, rnode.dbNode, rnode.relNode, blockno,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed),
filter->filter_from_block);
#endif
@@ -939,15 +940,15 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileNode rnode,
}
/* See if the whole database is filtered. */
- rnode.relNode = InvalidOid;
- rnode.spcNode = InvalidOid;
- filter = hash_search(prefetcher->filter_table, &rnode, HASH_FIND, NULL);
+ rlocator.relNumber = InvalidRelFileNumber;
+ rlocator.spcOid = InvalidOid;
+ filter = hash_search(prefetcher->filter_table, &rlocator, HASH_FIND, NULL);
if (filter)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
- rnode.spcNode, rnode.dbNode, rnode.relNode, blockno,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed));
#endif
return true;
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index cf5db23..f3dc4b7 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -1638,7 +1638,7 @@ DecodeXLogRecord(XLogReaderState *state,
char *out;
uint32 remaining;
uint32 datatotal;
- RelFileNode *rnode = NULL;
+ RelFileLocator *rlocator = NULL;
uint8 block_id;
decoded->header = *record;
@@ -1823,12 +1823,12 @@ DecodeXLogRecord(XLogReaderState *state,
}
if (!(fork_flags & BKPBLOCK_SAME_REL))
{
- COPY_HEADER_FIELD(&blk->rnode, sizeof(RelFileNode));
- rnode = &blk->rnode;
+ COPY_HEADER_FIELD(&blk->rlocator, sizeof(RelFileLocator));
+ rlocator = &blk->rlocator;
}
else
{
- if (rnode == NULL)
+ if (rlocator == NULL)
{
report_invalid_record(state,
"BKPBLOCK_SAME_REL set but no previous rel at %X/%X",
@@ -1836,7 +1836,7 @@ DecodeXLogRecord(XLogReaderState *state,
goto err;
}
- blk->rnode = *rnode;
+ blk->rlocator = *rlocator;
}
COPY_HEADER_FIELD(&blk->blkno, sizeof(BlockNumber));
}
@@ -1926,10 +1926,11 @@ err:
*/
void
XLogRecGetBlockTag(XLogReaderState *record, uint8 block_id,
- RelFileNode *rnode, ForkNumber *forknum, BlockNumber *blknum)
+ RelFileLocator *rlocator, ForkNumber *forknum,
+ BlockNumber *blknum)
{
- if (!XLogRecGetBlockTagExtended(record, block_id, rnode, forknum, blknum,
- NULL))
+ if (!XLogRecGetBlockTagExtended(record, block_id, rlocator, forknum,
+ blknum, NULL))
{
#ifndef FRONTEND
elog(ERROR, "failed to locate backup block with ID %d in WAL record",
@@ -1945,13 +1946,13 @@ XLogRecGetBlockTag(XLogReaderState *record, uint8 block_id,
* Returns information about the block that a block reference refers to,
* optionally including the buffer that the block may already be in.
*
- * If the WAL record contains a block reference with the given ID, *rnode,
+ * If the WAL record contains a block reference with the given ID, *rlocator,
* *forknum, *blknum and *prefetch_buffer are filled in (if not NULL), and
* returns true. Otherwise returns false.
*/
bool
XLogRecGetBlockTagExtended(XLogReaderState *record, uint8 block_id,
- RelFileNode *rnode, ForkNumber *forknum,
+ RelFileLocator *rlocator, ForkNumber *forknum,
BlockNumber *blknum,
Buffer *prefetch_buffer)
{
@@ -1961,8 +1962,8 @@ XLogRecGetBlockTagExtended(XLogReaderState *record, uint8 block_id,
return false;
bkpb = &record->record->blocks[block_id];
- if (rnode)
- *rnode = bkpb->rnode;
+ if (rlocator)
+ *rlocator = bkpb->rlocator;
if (forknum)
*forknum = bkpb->forknum;
if (blknum)
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index e23451b..5d6f1b5 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2166,24 +2166,26 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
/* decode block references */
for (block_id = 0; block_id <= XLogRecMaxBlockId(record); block_id++)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blk;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blk, NULL))
+ &rlocator, &forknum, &blk, NULL))
continue;
if (forknum != MAIN_FORKNUM)
appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, fork %u, blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid,
+ rlocator.relNumber,
forknum,
blk);
else
appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid,
+ rlocator.relNumber,
blk);
if (XLogRecHasBlockImage(record, block_id))
appendStringInfoString(buf, " FPW");
@@ -2285,7 +2287,7 @@ static void
verifyBackupPageConsistency(XLogReaderState *record)
{
RmgrData rmgr = GetRmgr(XLogRecGetRmid(record));
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blkno;
int block_id;
@@ -2302,7 +2304,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
Page page;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blkno, NULL))
+ &rlocator, &forknum, &blkno, NULL))
{
/*
* WAL record doesn't contain a block reference with the given id.
@@ -2327,7 +2329,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
* Read the contents from the current buffer and store it in a
* temporary page.
*/
- buf = XLogReadBufferExtended(rnode, forknum, blkno,
+ buf = XLogReadBufferExtended(rlocator, forknum, blkno,
RBM_NORMAL_NO_LOG,
InvalidBuffer);
if (!BufferIsValid(buf))
@@ -2377,7 +2379,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
{
elog(FATAL,
"inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forknum, blkno);
}
}
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 4851669..0cda225 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -67,7 +67,7 @@ HotStandbyState standbyState = STANDBY_DISABLED;
*/
typedef struct xl_invalid_page_key
{
- RelFileNode node; /* the relation */
+ RelFileLocator locator; /* the relation */
ForkNumber forkno; /* the fork number */
BlockNumber blkno; /* the page */
} xl_invalid_page_key;
@@ -86,10 +86,10 @@ static int read_local_xlog_page_guts(XLogReaderState *state, XLogRecPtr targetPa
/* Report a reference to an invalid page */
static void
-report_invalid_page(int elevel, RelFileNode node, ForkNumber forkno,
+report_invalid_page(int elevel, RelFileLocator locator, ForkNumber forkno,
BlockNumber blkno, bool present)
{
- char *path = relpathperm(node, forkno);
+ char *path = relpathperm(locator, forkno);
if (present)
elog(elevel, "page %u of relation %s is uninitialized",
@@ -102,7 +102,7 @@ report_invalid_page(int elevel, RelFileNode node, ForkNumber forkno,
/* Log a reference to an invalid page */
static void
-log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
+log_invalid_page(RelFileLocator locator, ForkNumber forkno, BlockNumber blkno,
bool present)
{
xl_invalid_page_key key;
@@ -119,7 +119,7 @@ log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
*/
if (reachedConsistency)
{
- report_invalid_page(WARNING, node, forkno, blkno, present);
+ report_invalid_page(WARNING, locator, forkno, blkno, present);
elog(ignore_invalid_pages ? WARNING : PANIC,
"WAL contains references to invalid pages");
}
@@ -130,7 +130,7 @@ log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
* something about the XLOG record that generated the reference).
*/
if (message_level_is_interesting(DEBUG1))
- report_invalid_page(DEBUG1, node, forkno, blkno, present);
+ report_invalid_page(DEBUG1, locator, forkno, blkno, present);
if (invalid_page_tab == NULL)
{
@@ -147,7 +147,7 @@ log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
}
/* we currently assume xl_invalid_page_key contains no padding */
- key.node = node;
+ key.locator = locator;
key.forkno = forkno;
key.blkno = blkno;
hentry = (xl_invalid_page *)
@@ -166,7 +166,8 @@ log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
/* Forget any invalid pages >= minblkno, because they've been dropped */
static void
-forget_invalid_pages(RelFileNode node, ForkNumber forkno, BlockNumber minblkno)
+forget_invalid_pages(RelFileLocator locator, ForkNumber forkno,
+ BlockNumber minblkno)
{
HASH_SEQ_STATUS status;
xl_invalid_page *hentry;
@@ -178,13 +179,13 @@ forget_invalid_pages(RelFileNode node, ForkNumber forkno, BlockNumber minblkno)
while ((hentry = (xl_invalid_page *) hash_seq_search(&status)) != NULL)
{
- if (RelFileNodeEquals(hentry->key.node, node) &&
+ if (RelFileLocatorEquals(hentry->key.locator, locator) &&
hentry->key.forkno == forkno &&
hentry->key.blkno >= minblkno)
{
if (message_level_is_interesting(DEBUG2))
{
- char *path = relpathperm(hentry->key.node, forkno);
+ char *path = relpathperm(hentry->key.locator, forkno);
elog(DEBUG2, "page %u of relation %s has been dropped",
hentry->key.blkno, path);
@@ -213,11 +214,11 @@ forget_invalid_pages_db(Oid dbid)
while ((hentry = (xl_invalid_page *) hash_seq_search(&status)) != NULL)
{
- if (hentry->key.node.dbNode == dbid)
+ if (hentry->key.locator.dbOid == dbid)
{
if (message_level_is_interesting(DEBUG2))
{
- char *path = relpathperm(hentry->key.node, hentry->key.forkno);
+ char *path = relpathperm(hentry->key.locator, hentry->key.forkno);
elog(DEBUG2, "page %u of relation %s has been dropped",
hentry->key.blkno, path);
@@ -261,7 +262,7 @@ XLogCheckInvalidPages(void)
*/
while ((hentry = (xl_invalid_page *) hash_seq_search(&status)) != NULL)
{
- report_invalid_page(WARNING, hentry->key.node, hentry->key.forkno,
+ report_invalid_page(WARNING, hentry->key.locator, hentry->key.forkno,
hentry->key.blkno, hentry->present);
foundone = true;
}
@@ -356,7 +357,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
Buffer *buf)
{
XLogRecPtr lsn = record->EndRecPtr;
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blkno;
Buffer prefetch_buffer;
@@ -364,7 +365,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
bool zeromode;
bool willinit;
- if (!XLogRecGetBlockTagExtended(record, block_id, &rnode, &forknum, &blkno,
+ if (!XLogRecGetBlockTagExtended(record, block_id, &rlocator, &forknum, &blkno,
&prefetch_buffer))
{
/* Caller specified a bogus block_id */
@@ -387,7 +388,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
if (XLogRecBlockImageApply(record, block_id))
{
Assert(XLogRecHasBlockImage(record, block_id));
- *buf = XLogReadBufferExtended(rnode, forknum, blkno,
+ *buf = XLogReadBufferExtended(rlocator, forknum, blkno,
get_cleanup_lock ? RBM_ZERO_AND_CLEANUP_LOCK : RBM_ZERO_AND_LOCK,
prefetch_buffer);
page = BufferGetPage(*buf);
@@ -418,7 +419,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
}
else
{
- *buf = XLogReadBufferExtended(rnode, forknum, blkno, mode, prefetch_buffer);
+ *buf = XLogReadBufferExtended(rlocator, forknum, blkno, mode, prefetch_buffer);
if (BufferIsValid(*buf))
{
if (mode != RBM_ZERO_AND_LOCK && mode != RBM_ZERO_AND_CLEANUP_LOCK)
@@ -468,7 +469,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
* they will be invisible to tools that need to know which pages are modified.
*/
Buffer
-XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
+XLogReadBufferExtended(RelFileLocator rlocator, ForkNumber forknum,
BlockNumber blkno, ReadBufferMode mode,
Buffer recent_buffer)
{
@@ -481,14 +482,14 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
/* Do we have a clue where the buffer might be already? */
if (BufferIsValid(recent_buffer) &&
mode == RBM_NORMAL &&
- ReadRecentBuffer(rnode, forknum, blkno, recent_buffer))
+ ReadRecentBuffer(rlocator, forknum, blkno, recent_buffer))
{
buffer = recent_buffer;
goto recent_buffer_fast_path;
}
/* Open the relation at smgr level */
- smgr = smgropen(rnode, InvalidBackendId);
+ smgr = smgropen(rlocator, InvalidBackendId);
/*
* Create the target file if it doesn't already exist. This lets us cope
@@ -505,7 +506,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
if (blkno < lastblock)
{
/* page exists in file */
- buffer = ReadBufferWithoutRelcache(rnode, forknum, blkno,
+ buffer = ReadBufferWithoutRelcache(rlocator, forknum, blkno,
mode, NULL, true);
}
else
@@ -513,7 +514,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
/* hm, page doesn't exist in file */
if (mode == RBM_NORMAL)
{
- log_invalid_page(rnode, forknum, blkno, false);
+ log_invalid_page(rlocator, forknum, blkno, false);
return InvalidBuffer;
}
if (mode == RBM_NORMAL_NO_LOG)
@@ -530,7 +531,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
ReleaseBuffer(buffer);
}
- buffer = ReadBufferWithoutRelcache(rnode, forknum,
+ buffer = ReadBufferWithoutRelcache(rlocator, forknum,
P_NEW, mode, NULL, true);
}
while (BufferGetBlockNumber(buffer) < blkno);
@@ -540,7 +541,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
if (mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK)
LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
ReleaseBuffer(buffer);
- buffer = ReadBufferWithoutRelcache(rnode, forknum, blkno,
+ buffer = ReadBufferWithoutRelcache(rlocator, forknum, blkno,
mode, NULL, true);
}
}
@@ -559,7 +560,7 @@ recent_buffer_fast_path:
if (PageIsNew(page))
{
ReleaseBuffer(buffer);
- log_invalid_page(rnode, forknum, blkno, true);
+ log_invalid_page(rlocator, forknum, blkno, true);
return InvalidBuffer;
}
}
@@ -594,7 +595,7 @@ typedef FakeRelCacheEntryData *FakeRelCacheEntry;
* Caller must free the returned entry with FreeFakeRelcacheEntry().
*/
Relation
-CreateFakeRelcacheEntry(RelFileNode rnode)
+CreateFakeRelcacheEntry(RelFileLocator rlocator)
{
FakeRelCacheEntry fakeentry;
Relation rel;
@@ -604,7 +605,7 @@ CreateFakeRelcacheEntry(RelFileNode rnode)
rel = (Relation) fakeentry;
rel->rd_rel = &fakeentry->pgc;
- rel->rd_node = rnode;
+ rel->rd_locator = rlocator;
/*
* We will never be working with temp rels during recovery or while
@@ -615,18 +616,18 @@ CreateFakeRelcacheEntry(RelFileNode rnode)
/* It must be a permanent table here */
rel->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
- /* We don't know the name of the relation; use relfilenode instead */
- sprintf(RelationGetRelationName(rel), "%u", rnode.relNode);
+ /* We don't know the name of the relation; use relfilenumber instead */
+ sprintf(RelationGetRelationName(rel), "%u", rlocator.relNumber);
/*
* We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNode may be
+ * relation. Note that this is fairly bogus since relNumber may be
* different from the relation's OID. It shouldn't really matter though.
* In recovery, we are running by ourselves and can't have any lock
* conflicts. While syncing, we already hold AccessExclusiveLock.
*/
- rel->rd_lockInfo.lockRelId.dbId = rnode.dbNode;
- rel->rd_lockInfo.lockRelId.relId = rnode.relNode;
+ rel->rd_lockInfo.lockRelId.dbId = rlocator.dbOid;
+ rel->rd_lockInfo.lockRelId.relId = rlocator.relNumber;
rel->rd_smgr = NULL;
@@ -652,9 +653,9 @@ FreeFakeRelcacheEntry(Relation fakerel)
* any open "invalid-page" records for the relation.
*/
void
-XLogDropRelation(RelFileNode rnode, ForkNumber forknum)
+XLogDropRelation(RelFileLocator rlocator, ForkNumber forknum)
{
- forget_invalid_pages(rnode, forknum, 0);
+ forget_invalid_pages(rlocator, forknum, 0);
}
/*
@@ -682,10 +683,10 @@ XLogDropDatabase(Oid dbid)
* We need to clean up any open "invalid-page" records for the dropped pages.
*/
void
-XLogTruncateRelation(RelFileNode rnode, ForkNumber forkNum,
+XLogTruncateRelation(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber nblocks)
{
- forget_invalid_pages(rnode, forkNum, nblocks);
+ forget_invalid_pages(rlocator, forkNum, nblocks);
}
/*
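As context for the xl_invalid_page_key change in the xlogutils.c hunks above, here is a minimal compilable sketch of the renamed key layout. The typedefs and the equality helper are illustrative stand-ins for the real headers, not the tree's definitions; the padding assertion assumes common ABIs where five 4-byte members pack without gaps (the comment in the patch notes the key is assumed padding-free so it can be hashed bytewise).

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t Oid;
typedef uint32_t RelFileNumber;
typedef int32_t ForkNumber;
typedef uint32_t BlockNumber;

/* spcNode/dbNode/relNode become spcOid/dbOid/relNumber */
typedef struct RelFileLocator
{
    Oid           spcOid;       /* tablespace */
    Oid           dbOid;        /* database */
    RelFileNumber relNumber;    /* relation storage file number */
} RelFileLocator;

/* Hash key for the invalid-page table; assumed to contain no
 * padding so it can be hashed and compared bytewise. */
typedef struct xl_invalid_page_key
{
    RelFileLocator locator;     /* the relation (was: node) */
    ForkNumber     forkno;      /* the fork number */
    BlockNumber    blkno;       /* the page */
} xl_invalid_page_key;

/* Field-wise equality, in the spirit of RelFileLocatorEquals */
static int
rel_file_locator_equals(RelFileLocator a, RelFileLocator b)
{
    return a.spcOid == b.spcOid &&
           a.dbOid == b.dbOid &&
           a.relNumber == b.relNumber;
}
```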
diff --git a/src/backend/bootstrap/bootparse.y b/src/backend/bootstrap/bootparse.y
index e5cf1b3..7d7655d 100644
--- a/src/backend/bootstrap/bootparse.y
+++ b/src/backend/bootstrap/bootparse.y
@@ -287,9 +287,9 @@ Boot_DeclareIndexStmt:
stmt->excludeOpNames = NIL;
stmt->idxcomment = NULL;
stmt->indexOid = InvalidOid;
- stmt->oldNode = InvalidOid;
+ stmt->oldNumber = InvalidRelFileNumber;
stmt->oldCreateSubid = InvalidSubTransactionId;
- stmt->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ stmt->oldFirstRelfilelocatorSubid = InvalidSubTransactionId;
stmt->unique = false;
stmt->primary = false;
stmt->isconstraint = false;
@@ -339,9 +339,9 @@ Boot_DeclareUniqueIndexStmt:
stmt->excludeOpNames = NIL;
stmt->idxcomment = NULL;
stmt->indexOid = InvalidOid;
- stmt->oldNode = InvalidOid;
+ stmt->oldNumber = InvalidRelFileNumber;
stmt->oldCreateSubid = InvalidSubTransactionId;
- stmt->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ stmt->oldFirstRelfilelocatorSubid = InvalidSubTransactionId;
stmt->unique = true;
stmt->primary = false;
stmt->isconstraint = false;
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index e784538..2a33273 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -481,14 +481,14 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
}
/*
- * GetNewRelFileNode
- * Generate a new relfilenode number that is unique within the
+ * GetNewRelFileNumber
+ * Generate a new relfilenumber that is unique within the
* database of the given tablespace.
*
- * If the relfilenode will also be used as the relation's OID, pass the
+ * If the relfilenumber will also be used as the relation's OID, pass the
* opened pg_class catalog, and this routine will guarantee that the result
* is also an unused OID within pg_class. If the result is to be used only
- * as a relfilenode for an existing relation, pass NULL for pg_class.
+ * as a relfilenumber for an existing relation, pass NULL for pg_class.
*
* As with GetNewOidWithIndex(), there is some theoretical risk of a race
* condition, but it doesn't seem worth worrying about.
@@ -496,17 +496,17 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
* Note: we don't support using this in bootstrap mode. All relations
* created by bootstrap have preassigned OIDs, so there's no need.
*/
-Oid
-GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
+RelFileNumber
+GetNewRelFileNumber(Oid reltablespace, Relation pg_class, char relpersistence)
{
- RelFileNodeBackend rnode;
+ RelFileLocatorBackend rlocator;
char *rpath;
bool collides;
BackendId backend;
/*
* If we ever get here during pg_upgrade, there's something wrong; all
- * relfilenode assignments during a binary-upgrade run should be
+ * relfilenumber assignments during a binary-upgrade run should be
* determined by commands in the dump script.
*/
Assert(!IsBinaryUpgrade);
@@ -526,15 +526,15 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
}
/* This logic should match RelationInitPhysicalAddr */
- rnode.node.spcNode = reltablespace ? reltablespace : MyDatabaseTableSpace;
- rnode.node.dbNode = (rnode.node.spcNode == GLOBALTABLESPACE_OID) ? InvalidOid : MyDatabaseId;
+ rlocator.locator.spcOid = reltablespace ? reltablespace : MyDatabaseTableSpace;
+ rlocator.locator.dbOid = (rlocator.locator.spcOid == GLOBALTABLESPACE_OID) ? InvalidOid : MyDatabaseId;
/*
* The relpath will vary based on the backend ID, so we must initialize
* that properly here to make sure that any collisions based on filename
* are properly detected.
*/
- rnode.backend = backend;
+ rlocator.backend = backend;
do
{
@@ -542,13 +542,13 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
/* Generate the OID */
if (pg_class)
- rnode.node.relNode = GetNewOidWithIndex(pg_class, ClassOidIndexId,
+ rlocator.locator.relNumber = GetNewOidWithIndex(pg_class, ClassOidIndexId,
Anum_pg_class_oid);
else
- rnode.node.relNode = GetNewObjectId();
+ rlocator.locator.relNumber = GetNewObjectId();
/* Check for existing file of same name */
- rpath = relpath(rnode, MAIN_FORKNUM);
+ rpath = relpath(rlocator, MAIN_FORKNUM);
if (access(rpath, F_OK) == 0)
{
@@ -570,7 +570,7 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
pfree(rpath);
} while (collides);
- return rnode.node.relNode;
+ return rlocator.locator.relNumber;
}
/*
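The renamed GetNewRelFileNumber keeps the same collision-avoidance shape as before: draw a candidate number, build its path, and retry while a file of that name already exists on disk. A hedged sketch of that loop follows; `pick_unused_relfilenumber`, `next_oid`, and the `exists` callback are hypothetical stand-ins for GetNewObjectId()/GetNewOidWithIndex() and the access(rpath, F_OK) probe, not real APIs.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t RelFileNumber;

/* Stand-in for access(relpath(rlocator, MAIN_FORKNUM), F_OK):
 * pretend two leftover files collide on disk. */
static int
file_exists_stub(RelFileNumber n)
{
    return n == 100 || n == 101;
}

/* Loop shape of GetNewRelFileNumber: keep drawing candidates until
 * one has no existing file of the same name. */
static RelFileNumber
pick_unused_relfilenumber(RelFileNumber next_oid,
                          int (*exists)(RelFileNumber))
{
    RelFileNumber candidate;

    do
    {
        candidate = next_oid++;     /* GetNewObjectId() analogue */
    } while (exists(candidate));    /* collision: leftover file found */

    return candidate;
}
```

This retry-on-collision check is exactly what the tombstone files discussed in this thread exist to make safe: without them, a just-dropped number could be handed out again before the next checkpoint.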
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 1803194..c69c923 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -77,9 +77,11 @@
/* Potentially set by pg_upgrade_support functions */
Oid binary_upgrade_next_heap_pg_class_oid = InvalidOid;
-Oid binary_upgrade_next_heap_pg_class_relfilenode = InvalidOid;
Oid binary_upgrade_next_toast_pg_class_oid = InvalidOid;
-Oid binary_upgrade_next_toast_pg_class_relfilenode = InvalidOid;
+RelFileNumber binary_upgrade_next_heap_pg_class_relfilenumber =
+ InvalidRelFileNumber;
+RelFileNumber binary_upgrade_next_toast_pg_class_relfilenumber =
+ InvalidRelFileNumber;
static void AddNewRelationTuple(Relation pg_class_desc,
Relation new_rel_desc,
@@ -273,7 +275,7 @@ SystemAttributeByName(const char *attname)
* heap_create - Create an uncataloged heap relation
*
* Note API change: the caller must now always provide the OID
- * to use for the relation. The relfilenode may be (and in
+ * to use for the relation. The relfilenumber may be (and in
* the simplest cases is) left unspecified.
*
* create_storage indicates whether or not to create the storage.
@@ -289,7 +291,7 @@ heap_create(const char *relname,
Oid relnamespace,
Oid reltablespace,
Oid relid,
- Oid relfilenode,
+ RelFileNumber relfilenumber,
Oid accessmtd,
TupleDesc tupDesc,
char relkind,
@@ -341,11 +343,11 @@ heap_create(const char *relname,
else
{
/*
- * If relfilenode is unspecified by the caller then create storage
+ * If relfilenumber is unspecified by the caller then create storage
* with oid same as relid.
*/
- if (!OidIsValid(relfilenode))
- relfilenode = relid;
+ if (!RelFileNumberIsValid(relfilenumber))
+ relfilenumber = relid;
}
/*
@@ -368,7 +370,7 @@ heap_create(const char *relname,
tupDesc,
relid,
accessmtd,
- relfilenode,
+ relfilenumber,
reltablespace,
shared_relation,
mapped_relation,
@@ -385,11 +387,11 @@ heap_create(const char *relname,
if (create_storage)
{
if (RELKIND_HAS_TABLE_AM(rel->rd_rel->relkind))
- table_relation_set_new_filenode(rel, &rel->rd_node,
- relpersistence,
- relfrozenxid, relminmxid);
+ table_relation_set_new_filelocator(rel, &rel->rd_locator,
+ relpersistence,
+ relfrozenxid, relminmxid);
else if (RELKIND_HAS_STORAGE(rel->rd_rel->relkind))
- RelationCreateStorage(rel->rd_node, relpersistence, true);
+ RelationCreateStorage(rel->rd_locator, relpersistence, true);
else
Assert(false);
}
@@ -1069,7 +1071,7 @@ AddNewRelationType(const char *typeName,
* relkind: relkind for new rel
* relpersistence: rel's persistence status (permanent, temp, or unlogged)
* shared_relation: true if it's to be a shared relation
- * mapped_relation: true if the relation will use the relfilenode map
+ * mapped_relation: true if the relation will use the relfilenumber map
* oncommit: ON COMMIT marking (only relevant if it's a temp table)
* reloptions: reloptions in Datum form, or (Datum) 0 if none
* use_user_acl: true if should look for user-defined default permissions;
@@ -1115,7 +1117,7 @@ heap_create_with_catalog(const char *relname,
Oid new_type_oid;
/* By default set to InvalidOid unless overridden by binary-upgrade */
- Oid relfilenode = InvalidOid;
+ RelFileNumber relfilenumber = InvalidRelFileNumber;
TransactionId relfrozenxid;
MultiXactId relminmxid;
@@ -1173,12 +1175,12 @@ heap_create_with_catalog(const char *relname,
/*
* Allocate an OID for the relation, unless we were told what to use.
*
- * The OID will be the relfilenode as well, so make sure it doesn't
+ * The OID will be the relfilenumber as well, so make sure it doesn't
* collide with either pg_class OIDs or existing physical files.
*/
if (!OidIsValid(relid))
{
- /* Use binary-upgrade override for pg_class.oid and relfilenode */
+ /* Use binary-upgrade override for pg_class.oid and relfilenumber */
if (IsBinaryUpgrade)
{
/*
@@ -1196,13 +1198,13 @@ heap_create_with_catalog(const char *relname,
relid = binary_upgrade_next_toast_pg_class_oid;
binary_upgrade_next_toast_pg_class_oid = InvalidOid;
- if (!OidIsValid(binary_upgrade_next_toast_pg_class_relfilenode))
+ if (!RelFileNumberIsValid(binary_upgrade_next_toast_pg_class_relfilenumber))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("toast relfilenode value not set when in binary upgrade mode")));
+ errmsg("toast relfilenumber value not set when in binary upgrade mode")));
- relfilenode = binary_upgrade_next_toast_pg_class_relfilenode;
- binary_upgrade_next_toast_pg_class_relfilenode = InvalidOid;
+ relfilenumber = binary_upgrade_next_toast_pg_class_relfilenumber;
+ binary_upgrade_next_toast_pg_class_relfilenumber = InvalidRelFileNumber;
}
}
else
@@ -1217,20 +1219,20 @@ heap_create_with_catalog(const char *relname,
if (RELKIND_HAS_STORAGE(relkind))
{
- if (!OidIsValid(binary_upgrade_next_heap_pg_class_relfilenode))
+ if (!RelFileNumberIsValid(binary_upgrade_next_heap_pg_class_relfilenumber))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("relfilenode value not set when in binary upgrade mode")));
+ errmsg("relfilenumber value not set when in binary upgrade mode")));
- relfilenode = binary_upgrade_next_heap_pg_class_relfilenode;
- binary_upgrade_next_heap_pg_class_relfilenode = InvalidOid;
+ relfilenumber = binary_upgrade_next_heap_pg_class_relfilenumber;
+ binary_upgrade_next_heap_pg_class_relfilenumber = InvalidRelFileNumber;
}
}
}
if (!OidIsValid(relid))
- relid = GetNewRelFileNode(reltablespace, pg_class_desc,
- relpersistence);
+ relid = GetNewRelFileNumber(reltablespace, pg_class_desc,
+ relpersistence);
}
/*
@@ -1273,7 +1275,7 @@ heap_create_with_catalog(const char *relname,
relnamespace,
reltablespace,
relid,
- relfilenode,
+ relfilenumber,
accessmtd,
tupdesc,
relkind,
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index bdd3c34..3dc535e 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -87,7 +87,8 @@
/* Potentially set by pg_upgrade_support functions */
Oid binary_upgrade_next_index_pg_class_oid = InvalidOid;
-Oid binary_upgrade_next_index_pg_class_relfilenode = InvalidOid;
+RelFileNumber binary_upgrade_next_index_pg_class_relfilenumber =
+ InvalidRelFileNumber;
/*
* Pointer-free representation of variables used when reindexing system
@@ -662,8 +663,8 @@ UpdateIndexRelation(Oid indexoid,
* parent index; otherwise InvalidOid.
* parentConstraintId: if creating a constraint on a partition, the OID
* of the constraint in the parent; otherwise InvalidOid.
- * relFileNode: normally, pass InvalidOid to get new storage. May be
- * nonzero to attach an existing valid build.
+ * relFileNumber: normally, pass InvalidRelFileNumber to get new storage.
+ * May be nonzero to attach an existing valid build.
* indexInfo: same info executor uses to insert into the index
* indexColNames: column names to use for index (List of char *)
* accessMethodObjectId: OID of index AM to use
@@ -703,7 +704,7 @@ index_create(Relation heapRelation,
Oid indexRelationId,
Oid parentIndexRelid,
Oid parentConstraintId,
- Oid relFileNode,
+ RelFileNumber relFileNumber,
IndexInfo *indexInfo,
List *indexColNames,
Oid accessMethodObjectId,
@@ -735,7 +736,7 @@ index_create(Relation heapRelation,
char relkind;
TransactionId relfrozenxid;
MultiXactId relminmxid;
- bool create_storage = !OidIsValid(relFileNode);
+ bool create_storage = !RelFileNumberIsValid(relFileNumber);
/* constraint flags can only be set when a constraint is requested */
Assert((constr_flags == 0) ||
@@ -751,7 +752,7 @@ index_create(Relation heapRelation,
/*
* The index will be in the same namespace as its parent table, and is
* shared across databases if and only if the parent is. Likewise, it
- * will use the relfilenode map if and only if the parent does; and it
+ * will use the relfilenumber map if and only if the parent does; and it
* inherits the parent's relpersistence.
*/
namespaceId = RelationGetNamespace(heapRelation);
@@ -902,12 +903,12 @@ index_create(Relation heapRelation,
/*
* Allocate an OID for the index, unless we were told what to use.
*
- * The OID will be the relfilenode as well, so make sure it doesn't
+ * The OID will be the relfilenumber as well, so make sure it doesn't
* collide with either pg_class OIDs or existing physical files.
*/
if (!OidIsValid(indexRelationId))
{
- /* Use binary-upgrade override for pg_class.oid and relfilenode */
+ /* Use binary-upgrade override for pg_class.oid and relfilenumber */
if (IsBinaryUpgrade)
{
if (!OidIsValid(binary_upgrade_next_index_pg_class_oid))
@@ -918,14 +919,14 @@ index_create(Relation heapRelation,
indexRelationId = binary_upgrade_next_index_pg_class_oid;
binary_upgrade_next_index_pg_class_oid = InvalidOid;
- /* Override the index relfilenode */
+ /* Override the index relfilenumber */
if ((relkind == RELKIND_INDEX) &&
- (!OidIsValid(binary_upgrade_next_index_pg_class_relfilenode)))
+ (!RelFileNumberIsValid(binary_upgrade_next_index_pg_class_relfilenumber)))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("index relfilenode value not set when in binary upgrade mode")));
- relFileNode = binary_upgrade_next_index_pg_class_relfilenode;
- binary_upgrade_next_index_pg_class_relfilenode = InvalidOid;
+ errmsg("index relfilenumber value not set when in binary upgrade mode")));
+ relFileNumber = binary_upgrade_next_index_pg_class_relfilenumber;
+ binary_upgrade_next_index_pg_class_relfilenumber = InvalidRelFileNumber;
/*
* Note that we want create_storage = true for binary upgrade. The
@@ -937,7 +938,7 @@ index_create(Relation heapRelation,
else
{
indexRelationId =
- GetNewRelFileNode(tableSpaceId, pg_class, relpersistence);
+ GetNewRelFileNumber(tableSpaceId, pg_class, relpersistence);
}
}
@@ -950,7 +951,7 @@ index_create(Relation heapRelation,
namespaceId,
tableSpaceId,
indexRelationId,
- relFileNode,
+ relFileNumber,
accessMethodObjectId,
indexTupDesc,
relkind,
@@ -1408,7 +1409,7 @@ index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId,
InvalidOid, /* indexRelationId */
InvalidOid, /* parentIndexRelid */
InvalidOid, /* parentConstraintId */
- InvalidOid, /* relFileNode */
+ InvalidRelFileNumber, /* relFileNumber */
newInfo,
indexColNames,
indexRelation->rd_rel->relam,
@@ -3024,7 +3025,7 @@ index_build(Relation heapRelation,
* it -- but we must first check whether one already exists. If, for
* example, an unlogged relation is truncated in the transaction that
* created it, or truncated twice in a subsequent transaction, the
- * relfilenode won't change, and nothing needs to be done here.
+ * relfilenumber won't change, and nothing needs to be done here.
*/
if (indexRelation->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
!smgrexists(RelationGetSmgr(indexRelation), INIT_FORKNUM))
@@ -3681,7 +3682,7 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
* Schedule unlinking of the old index storage at transaction commit.
*/
RelationDropStorage(iRel);
- RelationAssumeNewRelfilenode(iRel);
+ RelationAssumeNewRelfilelocator(iRel);
/* Make sure the reltablespace change is visible */
CommandCounterIncrement();
@@ -3711,7 +3712,7 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
SetReindexProcessing(heapId, indexId);
/* Create a new physical relation for the index */
- RelationSetNewRelfilenode(iRel, persistence);
+ RelationSetNewRelfilenumber(iRel, persistence);
/* Initialize the index and rebuild */
/* Note: we do not need to re-establish pkey setting */
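The hunks above systematically replace OidIsValid tests on relfilenode fields with RelFileNumberIsValid tests. A sketch of the assumed definitions is below; InvalidRelFileNumber being zero matches the `relNumber != 0` sentinel the patch uses when restoring pending syncs, but the exact macro spellings here are assumptions, not quoted from the tree.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t RelFileNumber;

/* Assumed definitions; in the tree these live in the relfile headers. */
#define InvalidRelFileNumber    ((RelFileNumber) 0)
#define RelFileNumberIsValid(r) ((r) != InvalidRelFileNumber)
```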
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index c06e414..c25adbb 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -38,7 +38,7 @@
int wal_skip_threshold = 2048; /* in kilobytes */
/*
- * We keep a list of all relations (represented as RelFileNode values)
+ * We keep a list of all relations (represented as RelFileLocator values)
* that have been created or deleted in the current transaction. When
* a relation is created, we create the physical file immediately, but
* remember it so that we can delete the file again if the current
@@ -59,7 +59,7 @@ int wal_skip_threshold = 2048; /* in kilobytes */
typedef struct PendingRelDelete
{
- RelFileNode relnode; /* relation that may need to be deleted */
+ RelFileLocator rlocator; /* relation that may need to be deleted */
BackendId backend; /* InvalidBackendId if not a temp rel */
bool atCommit; /* T=delete at commit; F=delete at abort */
int nestLevel; /* xact nesting level of request */
@@ -68,7 +68,7 @@ typedef struct PendingRelDelete
typedef struct PendingRelSync
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
bool is_truncated; /* Has the file experienced truncation? */
} PendingRelSync;
@@ -81,7 +81,7 @@ static HTAB *pendingSyncHash = NULL;
* Queue an at-commit fsync.
*/
static void
-AddPendingSync(const RelFileNode *rnode)
+AddPendingSync(const RelFileLocator *rlocator)
{
PendingRelSync *pending;
bool found;
@@ -91,14 +91,14 @@ AddPendingSync(const RelFileNode *rnode)
{
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileNode);
+ ctl.keysize = sizeof(RelFileLocator);
ctl.entrysize = sizeof(PendingRelSync);
ctl.hcxt = TopTransactionContext;
pendingSyncHash = hash_create("pending sync hash", 16, &ctl,
HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
}
- pending = hash_search(pendingSyncHash, rnode, HASH_ENTER, &found);
+ pending = hash_search(pendingSyncHash, rlocator, HASH_ENTER, &found);
Assert(!found);
pending->is_truncated = false;
}
@@ -117,7 +117,7 @@ AddPendingSync(const RelFileNode *rnode)
* pass register_delete = false.
*/
SMgrRelation
-RelationCreateStorage(RelFileNode rnode, char relpersistence,
+RelationCreateStorage(RelFileLocator rlocator, char relpersistence,
bool register_delete)
{
SMgrRelation srel;
@@ -145,11 +145,11 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence,
return NULL; /* placate compiler */
}
- srel = smgropen(rnode, backend);
+ srel = smgropen(rlocator, backend);
smgrcreate(srel, MAIN_FORKNUM, false);
if (needs_wal)
- log_smgrcreate(&srel->smgr_rnode.node, MAIN_FORKNUM);
+ log_smgrcreate(&srel->smgr_rlocator.locator, MAIN_FORKNUM);
/*
* Add the relation to the list of stuff to delete at abort, if we are
@@ -161,7 +161,7 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence,
pending = (PendingRelDelete *)
MemoryContextAlloc(TopMemoryContext, sizeof(PendingRelDelete));
- pending->relnode = rnode;
+ pending->rlocator = rlocator;
pending->backend = backend;
pending->atCommit = false; /* delete if abort */
pending->nestLevel = GetCurrentTransactionNestLevel();
@@ -172,7 +172,7 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence,
if (relpersistence == RELPERSISTENCE_PERMANENT && !XLogIsNeeded())
{
Assert(backend == InvalidBackendId);
- AddPendingSync(&rnode);
+ AddPendingSync(&rlocator);
}
return srel;
@@ -182,14 +182,14 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence,
* Perform XLogInsert of an XLOG_SMGR_CREATE record to WAL.
*/
void
-log_smgrcreate(const RelFileNode *rnode, ForkNumber forkNum)
+log_smgrcreate(const RelFileLocator *rlocator, ForkNumber forkNum)
{
xl_smgr_create xlrec;
/*
* Make an XLOG entry reporting the file creation.
*/
- xlrec.rnode = *rnode;
+ xlrec.rlocator = *rlocator;
xlrec.forkNum = forkNum;
XLogBeginInsert();
@@ -209,7 +209,7 @@ RelationDropStorage(Relation rel)
/* Add the relation to the list of stuff to delete at commit */
pending = (PendingRelDelete *)
MemoryContextAlloc(TopMemoryContext, sizeof(PendingRelDelete));
- pending->relnode = rel->rd_node;
+ pending->rlocator = rel->rd_locator;
pending->backend = rel->rd_backend;
pending->atCommit = true; /* delete if commit */
pending->nestLevel = GetCurrentTransactionNestLevel();
@@ -247,7 +247,7 @@ RelationDropStorage(Relation rel)
* No-op if the relation is not among those scheduled for deletion.
*/
void
-RelationPreserveStorage(RelFileNode rnode, bool atCommit)
+RelationPreserveStorage(RelFileLocator rlocator, bool atCommit)
{
PendingRelDelete *pending;
PendingRelDelete *prev;
@@ -257,7 +257,7 @@ RelationPreserveStorage(RelFileNode rnode, bool atCommit)
for (pending = pendingDeletes; pending != NULL; pending = next)
{
next = pending->next;
- if (RelFileNodeEquals(rnode, pending->relnode)
+ if (RelFileLocatorEquals(rlocator, pending->rlocator)
&& pending->atCommit == atCommit)
{
/* unlink and delete list entry */
@@ -369,7 +369,7 @@ RelationTruncate(Relation rel, BlockNumber nblocks)
xl_smgr_truncate xlrec;
xlrec.blkno = nblocks;
- xlrec.rnode = rel->rd_node;
+ xlrec.rlocator = rel->rd_locator;
xlrec.flags = SMGR_TRUNCATE_ALL;
XLogBeginInsert();
@@ -428,7 +428,7 @@ RelationPreTruncate(Relation rel)
return;
pending = hash_search(pendingSyncHash,
- &(RelationGetSmgr(rel)->smgr_rnode.node),
+ &(RelationGetSmgr(rel)->smgr_rlocator.locator),
HASH_FIND, NULL);
if (pending)
pending->is_truncated = true;
@@ -472,7 +472,7 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
* We need to log the copied data in WAL iff WAL archiving/streaming is
* enabled AND it's a permanent relation. This gives the same answer as
* "RelationNeedsWAL(rel) || copying_initfork", because we know the
- * current operation created a new relfilenode.
+ * current operation created new relation storage.
*/
use_wal = XLogIsNeeded() &&
(relpersistence == RELPERSISTENCE_PERMANENT || copying_initfork);
@@ -496,8 +496,8 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
* (errcontext callbacks shouldn't be risking any such thing, but
* people have been known to forget that rule.)
*/
- char *relpath = relpathbackend(src->smgr_rnode.node,
- src->smgr_rnode.backend,
+ char *relpath = relpathbackend(src->smgr_rlocator.locator,
+ src->smgr_rlocator.backend,
forkNum);
ereport(ERROR,
@@ -512,7 +512,7 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
* space.
*/
if (use_wal)
- log_newpage(&dst->smgr_rnode.node, forkNum, blkno, page, false);
+ log_newpage(&dst->smgr_rlocator.locator, forkNum, blkno, page, false);
PageSetChecksumInplace(page, blkno);
@@ -538,19 +538,19 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
}
/*
- * RelFileNodeSkippingWAL
- * Check if a BM_PERMANENT relfilenode is using WAL.
+ * RelFileLocatorSkippingWAL
+ * Check if a BM_PERMANENT relfilelocator is using WAL.
*
- * Changes of certain relfilenodes must not write WAL; see "Skipping WAL for
- * New RelFileNode" in src/backend/access/transam/README. Though it is known
- * from Relation efficiently, this function is intended for the code paths not
- * having access to Relation.
+ * Changes to certain relfilelocators must not write WAL; see "Skipping WAL
+ * for New RelFileLocator" in src/backend/access/transam/README. Though this
+ * can be determined efficiently from a Relation, this function serves code
+ * paths that have no Relation at hand.
*/
bool
-RelFileNodeSkippingWAL(RelFileNode rnode)
+RelFileLocatorSkippingWAL(RelFileLocator rlocator)
{
if (!pendingSyncHash ||
- hash_search(pendingSyncHash, &rnode, HASH_FIND, NULL) == NULL)
+ hash_search(pendingSyncHash, &rlocator, HASH_FIND, NULL) == NULL)
return false;
return true;
@@ -566,7 +566,7 @@ EstimatePendingSyncsSpace(void)
long entries;
entries = pendingSyncHash ? hash_get_num_entries(pendingSyncHash) : 0;
- return mul_size(1 + entries, sizeof(RelFileNode));
+ return mul_size(1 + entries, sizeof(RelFileLocator));
}
/*
@@ -581,57 +581,58 @@ SerializePendingSyncs(Size maxSize, char *startAddress)
HASH_SEQ_STATUS scan;
PendingRelSync *sync;
PendingRelDelete *delete;
- RelFileNode *src;
- RelFileNode *dest = (RelFileNode *) startAddress;
+ RelFileLocator *src;
+ RelFileLocator *dest = (RelFileLocator *) startAddress;
if (!pendingSyncHash)
goto terminate;
- /* Create temporary hash to collect active relfilenodes */
- ctl.keysize = sizeof(RelFileNode);
- ctl.entrysize = sizeof(RelFileNode);
+ /* Create temporary hash to collect active relfilelocators */
+ ctl.keysize = sizeof(RelFileLocator);
+ ctl.entrysize = sizeof(RelFileLocator);
ctl.hcxt = CurrentMemoryContext;
- tmphash = hash_create("tmp relfilenodes",
+ tmphash = hash_create("tmp relfilelocators",
hash_get_num_entries(pendingSyncHash), &ctl,
HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
- /* collect all rnodes from pending syncs */
+ /* collect all rlocators from pending syncs */
hash_seq_init(&scan, pendingSyncHash);
while ((sync = (PendingRelSync *) hash_seq_search(&scan)))
- (void) hash_search(tmphash, &sync->rnode, HASH_ENTER, NULL);
+ (void) hash_search(tmphash, &sync->rlocator, HASH_ENTER, NULL);
/* remove deleted rnodes */
for (delete = pendingDeletes; delete != NULL; delete = delete->next)
if (delete->atCommit)
- (void) hash_search(tmphash, (void *) &delete->relnode,
+ (void) hash_search(tmphash, (void *) &delete->rlocator,
HASH_REMOVE, NULL);
hash_seq_init(&scan, tmphash);
- while ((src = (RelFileNode *) hash_seq_search(&scan)))
+ while ((src = (RelFileLocator *) hash_seq_search(&scan)))
*dest++ = *src;
hash_destroy(tmphash);
terminate:
- MemSet(dest, 0, sizeof(RelFileNode));
+ MemSet(dest, 0, sizeof(RelFileLocator));
}
/*
* RestorePendingSyncs
* Restore syncs within a parallel worker.
*
- * RelationNeedsWAL() and RelFileNodeSkippingWAL() must offer the correct
+ * RelationNeedsWAL() and RelFileLocatorSkippingWAL() must offer the correct
* answer to parallel workers. Only smgrDoPendingSyncs() reads the
* is_truncated field, at end of transaction. Hence, don't restore it.
*/
void
RestorePendingSyncs(char *startAddress)
{
- RelFileNode *rnode;
+ RelFileLocator *rlocator;
Assert(pendingSyncHash == NULL);
- for (rnode = (RelFileNode *) startAddress; rnode->relNode != 0; rnode++)
- AddPendingSync(rnode);
+ for (rlocator = (RelFileLocator *) startAddress; rlocator->relNumber != 0;
+ rlocator++)
+ AddPendingSync(rlocator);
}
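For anyone reading the hunk above without the tree handy: SerializePendingSyncs() and RestorePendingSyncs() share a simple wire format, a flat array of RelFileLocator values terminated by a zeroed sentinel entry (the MemSet at the end of serialization), which the restore loop walks until relNumber == 0. A minimal standalone sketch of that convention follows; the simplified struct and helper names are mine, not from the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for RelFileLocator; field names follow the patch. */
typedef struct RelFileLocator
{
	unsigned	spcOid;			/* tablespace OID */
	unsigned	dbOid;			/* database OID */
	unsigned	relNumber;		/* relfilenumber; 0 terminates the array */
} RelFileLocator;

/* Copy n entries into dest and append a zeroed sentinel entry. */
static void
serialize_locators(RelFileLocator *dest, const RelFileLocator *src, size_t n)
{
	memcpy(dest, src, n * sizeof(RelFileLocator));
	memset(dest + n, 0, sizeof(RelFileLocator));	/* sentinel */
}

/* Walk the array until the sentinel, as RestorePendingSyncs() does. */
static size_t
count_locators(const RelFileLocator *arr)
{
	size_t		n = 0;

	while (arr[n].relNumber != 0)
		n++;
	return n;
}
```

This is why InvalidRelFileNumber (zero) can never appear as a live entry in the serialized array: it doubles as the terminator.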
/*
@@ -677,7 +678,7 @@ smgrDoPendingDeletes(bool isCommit)
{
SMgrRelation srel;
- srel = smgropen(pending->relnode, pending->backend);
+ srel = smgropen(pending->rlocator, pending->backend);
/* allocate the initial array, or extend it, if needed */
if (maxrels == 0)
@@ -747,7 +748,7 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
/* Skip syncing nodes that smgrDoPendingDeletes() will delete. */
for (pending = pendingDeletes; pending != NULL; pending = pending->next)
if (pending->atCommit)
- (void) hash_search(pendingSyncHash, (void *) &pending->relnode,
+ (void) hash_search(pendingSyncHash, (void *) &pending->rlocator,
HASH_REMOVE, NULL);
hash_seq_init(&scan, pendingSyncHash);
@@ -758,7 +759,7 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
BlockNumber total_blocks = 0;
SMgrRelation srel;
- srel = smgropen(pendingsync->rnode, InvalidBackendId);
+ srel = smgropen(pendingsync->rlocator, InvalidBackendId);
/*
* We emit newpage WAL records for smaller relations.
@@ -832,7 +833,7 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
* page including any unused space. ReadBufferExtended()
* counts some pgstat events; unfortunately, we discard them.
*/
- rel = CreateFakeRelcacheEntry(srel->smgr_rnode.node);
+ rel = CreateFakeRelcacheEntry(srel->smgr_rlocator.locator);
log_newpage_range(rel, fork, 0, n, false);
FreeFakeRelcacheEntry(rel);
}
@@ -852,7 +853,7 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
* smgrGetPendingDeletes() -- Get a list of non-temp relations to be deleted.
*
* The return value is the number of relations scheduled for termination.
- * *ptr is set to point to a freshly-palloc'd array of RelFileNodes.
+ * *ptr is set to point to a freshly-palloc'd array of RelFileLocators.
* If there are no relations to be deleted, *ptr is set to NULL.
*
* Only non-temporary relations are included in the returned list. This is OK
@@ -866,11 +867,11 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
* by upper-level transactions.
*/
int
-smgrGetPendingDeletes(bool forCommit, RelFileNode **ptr)
+smgrGetPendingDeletes(bool forCommit, RelFileLocator **ptr)
{
int nestLevel = GetCurrentTransactionNestLevel();
int nrels;
- RelFileNode *rptr;
+ RelFileLocator *rptr;
PendingRelDelete *pending;
nrels = 0;
@@ -885,14 +886,14 @@ smgrGetPendingDeletes(bool forCommit, RelFileNode **ptr)
*ptr = NULL;
return 0;
}
- rptr = (RelFileNode *) palloc(nrels * sizeof(RelFileNode));
+ rptr = (RelFileLocator *) palloc(nrels * sizeof(RelFileLocator));
*ptr = rptr;
for (pending = pendingDeletes; pending != NULL; pending = pending->next)
{
if (pending->nestLevel >= nestLevel && pending->atCommit == forCommit
&& pending->backend == InvalidBackendId)
{
- *rptr = pending->relnode;
+ *rptr = pending->rlocator;
rptr++;
}
}
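The smgrGetPendingDeletes() changes are purely mechanical, but the function's shape is worth spelling out since the same pattern recurs in this file: one pass over the pendingDeletes list to count matching entries, an allocation of exactly that size, then a second pass to fill the array. A toy version of that count-then-fill pattern, using plain malloc instead of palloc and invented field names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-in for PendingRelDelete. */
typedef struct PendingDelete
{
	unsigned	relNumber;		/* stand-in for the full RelFileLocator */
	bool		atCommit;		/* delete at commit (vs. at abort)? */
	struct PendingDelete *next;
} PendingDelete;

/*
 * Collect the relNumbers of entries matching forCommit into a freshly
 * malloc'd array, mirroring the two-pass shape of smgrGetPendingDeletes().
 * Returns the entry count; *ptr is set to NULL when there are none.
 */
static int
get_pending_deletes(const PendingDelete *head, bool forCommit, unsigned **ptr)
{
	int			nrels = 0;
	unsigned   *rptr;
	const PendingDelete *p;

	for (p = head; p != NULL; p = p->next)
		if (p->atCommit == forCommit)
			nrels++;

	if (nrels == 0)
	{
		*ptr = NULL;
		return 0;
	}

	rptr = malloc(nrels * sizeof(unsigned));
	*ptr = rptr;
	for (p = head; p != NULL; p = p->next)
		if (p->atCommit == forCommit)
			*rptr++ = p->relNumber;
	return nrels;
}
```

The real function also filters on nestLevel and backend, omitted here for brevity.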
@@ -967,7 +968,7 @@ smgr_redo(XLogReaderState *record)
xl_smgr_create *xlrec = (xl_smgr_create *) XLogRecGetData(record);
SMgrRelation reln;
- reln = smgropen(xlrec->rnode, InvalidBackendId);
+ reln = smgropen(xlrec->rlocator, InvalidBackendId);
smgrcreate(reln, xlrec->forkNum, true);
}
else if (info == XLOG_SMGR_TRUNCATE)
@@ -980,7 +981,7 @@ smgr_redo(XLogReaderState *record)
int nforks = 0;
bool need_fsm_vacuum = false;
- reln = smgropen(xlrec->rnode, InvalidBackendId);
+ reln = smgropen(xlrec->rlocator, InvalidBackendId);
/*
* Forcibly create relation if it doesn't exist (which suggests that
@@ -1015,11 +1016,11 @@ smgr_redo(XLogReaderState *record)
nforks++;
/* Also tell xlogutils.c about it */
- XLogTruncateRelation(xlrec->rnode, MAIN_FORKNUM, xlrec->blkno);
+ XLogTruncateRelation(xlrec->rlocator, MAIN_FORKNUM, xlrec->blkno);
}
/* Prepare for truncation of FSM and VM too */
- rel = CreateFakeRelcacheEntry(xlrec->rnode);
+ rel = CreateFakeRelcacheEntry(xlrec->rlocator);
if ((xlrec->flags & SMGR_TRUNCATE_FSM) != 0 &&
smgrexists(reln, FSM_FORKNUM))
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index cea2c8b..da137eb 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -293,7 +293,7 @@ cluster_multiple_rels(List *rtcs, ClusterParams *params)
* cluster_rel
*
* This clusters the table by creating a new, clustered table and
- * swapping the relfilenodes of the new table and the old table, so
+ * swapping the relfilenumbers of the new table and the old table, so
* the OID of the original table is preserved. Thus we do not lose
* GRANT, inheritance nor references to this table (this was a bug
* in releases through 7.3).
@@ -1025,8 +1025,8 @@ copy_table_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
/*
* Swap the physical files of two given relations.
*
- * We swap the physical identity (reltablespace, relfilenode) while keeping the
- * same logical identities of the two relations. relpersistence is also
+ * We swap the physical identity (reltablespace, relfilenumber) while keeping
+ * the same logical identities of the two relations. relpersistence is also
* swapped, which is critical since it determines where buffers live for each
* relation.
*
@@ -1061,9 +1061,9 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
reltup2;
Form_pg_class relform1,
relform2;
- Oid relfilenode1,
- relfilenode2;
- Oid swaptemp;
+ RelFileNumber relfilenumber1,
+ relfilenumber2;
+ RelFileNumber swaptemp;
char swptmpchr;
/* We need writable copies of both pg_class tuples. */
@@ -1079,13 +1079,14 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
elog(ERROR, "cache lookup failed for relation %u", r2);
relform2 = (Form_pg_class) GETSTRUCT(reltup2);
- relfilenode1 = relform1->relfilenode;
- relfilenode2 = relform2->relfilenode;
+ relfilenumber1 = relform1->relfilenode;
+ relfilenumber2 = relform2->relfilenode;
- if (OidIsValid(relfilenode1) && OidIsValid(relfilenode2))
+ if (RelFileNumberIsValid(relfilenumber1) &&
+ RelFileNumberIsValid(relfilenumber2))
{
/*
- * Normal non-mapped relations: swap relfilenodes, reltablespaces,
+ * Normal non-mapped relations: swap relfilenumbers, reltablespaces,
* relpersistence
*/
Assert(!target_is_pg_class);
@@ -1120,7 +1121,8 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
* Mapped-relation case. Here we have to swap the relation mappings
* instead of modifying the pg_class columns. Both must be mapped.
*/
- if (OidIsValid(relfilenode1) || OidIsValid(relfilenode2))
+ if (RelFileNumberIsValid(relfilenumber1) ||
+ RelFileNumberIsValid(relfilenumber2))
elog(ERROR, "cannot swap mapped relation \"%s\" with non-mapped relation",
NameStr(relform1->relname));
@@ -1148,12 +1150,12 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
/*
* Fetch the mappings --- shouldn't fail, but be paranoid
*/
- relfilenode1 = RelationMapOidToFilenode(r1, relform1->relisshared);
- if (!OidIsValid(relfilenode1))
+ relfilenumber1 = RelationMapOidToFilenumber(r1, relform1->relisshared);
+ if (!RelFileNumberIsValid(relfilenumber1))
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
NameStr(relform1->relname), r1);
- relfilenode2 = RelationMapOidToFilenode(r2, relform2->relisshared);
- if (!OidIsValid(relfilenode2))
+ relfilenumber2 = RelationMapOidToFilenumber(r2, relform2->relisshared);
+ if (!RelFileNumberIsValid(relfilenumber2))
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
NameStr(relform2->relname), r2);
@@ -1161,15 +1163,15 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
* Send replacement mappings to relmapper. Note these won't actually
* take effect until CommandCounterIncrement.
*/
- RelationMapUpdateMap(r1, relfilenode2, relform1->relisshared, false);
- RelationMapUpdateMap(r2, relfilenode1, relform2->relisshared, false);
+ RelationMapUpdateMap(r1, relfilenumber2, relform1->relisshared, false);
+ RelationMapUpdateMap(r2, relfilenumber1, relform2->relisshared, false);
/* Pass OIDs of mapped r2 tables back to caller */
*mapped_tables++ = r2;
}
/*
- * Recognize that rel1's relfilenode (swapped from rel2) is new in this
+ * Recognize that rel1's relfilenumber (swapped from rel2) is new in this
* subtransaction. The rel2 storage (swapped from rel1) may or may not be
* new.
*/
@@ -1180,9 +1182,9 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
rel1 = relation_open(r1, NoLock);
rel2 = relation_open(r2, NoLock);
rel2->rd_createSubid = rel1->rd_createSubid;
- rel2->rd_newRelfilenodeSubid = rel1->rd_newRelfilenodeSubid;
- rel2->rd_firstRelfilenodeSubid = rel1->rd_firstRelfilenodeSubid;
- RelationAssumeNewRelfilenode(rel1);
+ rel2->rd_newRelfilelocatorSubid = rel1->rd_newRelfilelocatorSubid;
+ rel2->rd_firstRelfilelocatorSubid = rel1->rd_firstRelfilelocatorSubid;
+ RelationAssumeNewRelfilelocator(rel1);
relation_close(rel1, NoLock);
relation_close(rel2, NoLock);
}
@@ -1523,7 +1525,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
table_close(relRelation, RowExclusiveLock);
}
- /* Destroy new heap with old filenode */
+ /* Destroy new heap with old filenumber */
object.classId = RelationRelationId;
object.objectId = OIDNewHeap;
object.objectSubId = 0;
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 35a1d3a..c985fea 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -593,11 +593,11 @@ CopyFrom(CopyFromState cstate)
*/
if (RELKIND_HAS_STORAGE(cstate->rel->rd_rel->relkind) &&
(cstate->rel->rd_createSubid != InvalidSubTransactionId ||
- cstate->rel->rd_firstRelfilenodeSubid != InvalidSubTransactionId))
+ cstate->rel->rd_firstRelfilelocatorSubid != InvalidSubTransactionId))
ti_options |= TABLE_INSERT_SKIP_FSM;
/*
- * Optimize if new relfilenode was created in this subxact or one of its
+ * Optimize if new relfilenumber was created in this subxact or one of its
* committed children and we won't see those rows later as part of an
* earlier scan or command. The subxact test ensures that if this subxact
* aborts then the frozen rows won't be visible after xact cleanup. Note
@@ -640,7 +640,7 @@ CopyFrom(CopyFromState cstate)
errmsg("cannot perform COPY FREEZE because of prior transaction activity")));
if (cstate->rel->rd_createSubid != GetCurrentSubTransactionId() &&
- cstate->rel->rd_newRelfilenodeSubid != GetCurrentSubTransactionId())
+ cstate->rel->rd_newRelfilelocatorSubid != GetCurrentSubTransactionId())
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("cannot perform COPY FREEZE because the table was not created or truncated in the current subtransaction")));
diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index f269168..c78bab5 100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -101,7 +101,7 @@ typedef struct
*/
typedef struct CreateDBRelInfo
{
- RelFileNode rnode; /* physical relation identifier */
+ RelFileLocator rlocator; /* physical relation identifier */
Oid reloid; /* relation oid */
bool permanent; /* relation is permanent or unlogged */
} CreateDBRelInfo;
@@ -127,7 +127,7 @@ static void CreateDatabaseUsingWalLog(Oid src_dboid, Oid dboid, Oid src_tsid,
static List *ScanSourceDatabasePgClass(Oid srctbid, Oid srcdbid, char *srcpath);
static List *ScanSourceDatabasePgClassPage(Page page, Buffer buf, Oid tbid,
Oid dbid, char *srcpath,
- List *rnodelist, Snapshot snapshot);
+ List *rlocatorlist, Snapshot snapshot);
static CreateDBRelInfo *ScanSourceDatabasePgClassTuple(HeapTupleData *tuple,
Oid tbid, Oid dbid,
char *srcpath);
@@ -147,12 +147,12 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
{
char *srcpath;
char *dstpath;
- List *rnodelist = NULL;
+ List *rlocatorlist = NULL;
ListCell *cell;
LockRelId srcrelid;
LockRelId dstrelid;
- RelFileNode srcrnode;
- RelFileNode dstrnode;
+ RelFileLocator srcrlocator;
+ RelFileLocator dstrlocator;
CreateDBRelInfo *relinfo;
/* Get source and destination database paths. */
@@ -165,9 +165,9 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
/* Copy relmap file from source database to the destination database. */
RelationMapCopy(dst_dboid, dst_tsid, srcpath, dstpath);
- /* Get list of relfilenodes to copy from the source database. */
- rnodelist = ScanSourceDatabasePgClass(src_tsid, src_dboid, srcpath);
- Assert(rnodelist != NIL);
+ /* Get list of relfilelocators to copy from the source database. */
+ rlocatorlist = ScanSourceDatabasePgClass(src_tsid, src_dboid, srcpath);
+ Assert(rlocatorlist != NIL);
/*
* Database IDs will be the same for all relations so set them before
@@ -176,11 +176,11 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
srcrelid.dbId = src_dboid;
dstrelid.dbId = dst_dboid;
- /* Loop over our list of relfilenodes and copy each one. */
- foreach(cell, rnodelist)
+ /* Loop over our list of relfilelocators and copy each one. */
+ foreach(cell, rlocatorlist)
{
relinfo = lfirst(cell);
- srcrnode = relinfo->rnode;
+ srcrlocator = relinfo->rlocator;
/*
* If the relation is from the source db's default tablespace then we
@@ -188,13 +188,13 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
* Otherwise, we need to create in the same tablespace as it is in the
* source database.
*/
- if (srcrnode.spcNode == src_tsid)
- dstrnode.spcNode = dst_tsid;
+ if (srcrlocator.spcOid == src_tsid)
+ dstrlocator.spcOid = dst_tsid;
else
- dstrnode.spcNode = srcrnode.spcNode;
+ dstrlocator.spcOid = srcrlocator.spcOid;
- dstrnode.dbNode = dst_dboid;
- dstrnode.relNode = srcrnode.relNode;
+ dstrlocator.dbOid = dst_dboid;
+ dstrlocator.relNumber = srcrlocator.relNumber;
/*
* Acquire locks on source and target relations before copying.
@@ -210,7 +210,7 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
LockRelationId(&dstrelid, AccessShareLock);
/* Copy relation storage from source to the destination. */
- CreateAndCopyRelationData(srcrnode, dstrnode, relinfo->permanent);
+ CreateAndCopyRelationData(srcrlocator, dstrlocator, relinfo->permanent);
/* Release the relation locks. */
UnlockRelationId(&srcrelid, AccessShareLock);
@@ -219,7 +219,7 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
pfree(srcpath);
pfree(dstpath);
- list_free_deep(rnodelist);
+ list_free_deep(rlocatorlist);
}
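The tablespace-mapping logic in CreateDatabaseUsingWalLog() is easy to lose in the rename noise: a relation in the source database's default tablespace moves to the destination's default tablespace, a relation in any other tablespace keeps its spcOid, the dbOid is always rewritten, and the relNumber is preserved. A self-contained sketch of just that mapping, with a simplified struct and a helper name of my own invention:

```c
#include <assert.h>

/* Simplified stand-in for RelFileLocator; field names follow the patch. */
typedef struct RelFileLocator
{
	unsigned	spcOid;
	unsigned	dbOid;
	unsigned	relNumber;
} RelFileLocator;

/*
 * Map a source-database locator to its destination-database counterpart,
 * following the rules in CreateDatabaseUsingWalLog(): swap the default
 * tablespace, keep any other tablespace, always rewrite the database OID,
 * and preserve the relfilenumber.
 */
static RelFileLocator
map_dest_locator(RelFileLocator src, unsigned src_tsid, unsigned dst_tsid,
				 unsigned dst_dboid)
{
	RelFileLocator dst;

	dst.spcOid = (src.spcOid == src_tsid) ? dst_tsid : src.spcOid;
	dst.dbOid = dst_dboid;
	dst.relNumber = src.relNumber;
	return dst;
}
```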
/*
@@ -246,31 +246,31 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
static List *
ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber nblocks;
BlockNumber blkno;
Buffer buf;
- Oid relfilenode;
+ Oid relfilenumber;
Page page;
- List *rnodelist = NIL;
+ List *rlocatorlist = NIL;
LockRelId relid;
Relation rel;
Snapshot snapshot;
BufferAccessStrategy bstrategy;
- /* Get pg_class relfilenode. */
- relfilenode = RelationMapOidToFilenodeForDatabase(srcpath,
- RelationRelationId);
+ /* Get pg_class relfilenumber. */
+ relfilenumber = RelationMapOidToFilenumberForDatabase(srcpath,
+ RelationRelationId);
/* Don't read data into shared_buffers without holding a relation lock. */
relid.dbId = dbid;
relid.relId = RelationRelationId;
LockRelationId(&relid, AccessShareLock);
- /* Prepare a RelFileNode for the pg_class relation. */
- rnode.spcNode = tbid;
- rnode.dbNode = dbid;
- rnode.relNode = relfilenode;
+ /* Prepare a RelFileLocator for the pg_class relation. */
+ rlocator.spcOid = tbid;
+ rlocator.dbOid = dbid;
+ rlocator.relNumber = relfilenumber;
/*
* We can't use a real relcache entry for a relation in some other
@@ -279,7 +279,7 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
* used the smgr layer directly, we would have to worry about
* invalidations.
*/
- rel = CreateFakeRelcacheEntry(rnode);
+ rel = CreateFakeRelcacheEntry(rlocator);
nblocks = smgrnblocks(RelationGetSmgr(rel), MAIN_FORKNUM);
FreeFakeRelcacheEntry(rel);
@@ -299,7 +299,7 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
{
CHECK_FOR_INTERRUPTS();
- buf = ReadBufferWithoutRelcache(rnode, MAIN_FORKNUM, blkno,
+ buf = ReadBufferWithoutRelcache(rlocator, MAIN_FORKNUM, blkno,
RBM_NORMAL, bstrategy, false);
LockBuffer(buf, BUFFER_LOCK_SHARE);
@@ -310,9 +310,9 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
continue;
}
- /* Append relevant pg_class tuples for current page to rnodelist. */
- rnodelist = ScanSourceDatabasePgClassPage(page, buf, tbid, dbid,
- srcpath, rnodelist,
+ /* Append relevant pg_class tuples for current page to rlocatorlist. */
+ rlocatorlist = ScanSourceDatabasePgClassPage(page, buf, tbid, dbid,
+ srcpath, rlocatorlist,
snapshot);
UnlockReleaseBuffer(buf);
@@ -321,16 +321,16 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
/* Release relation lock. */
UnlockRelationId(&relid, AccessShareLock);
- return rnodelist;
+ return rlocatorlist;
}
/*
* Scan one page of the source database's pg_class relation and add relevant
- * entries to rnodelist. The return value is the updated list.
+ * entries to rlocatorlist. The return value is the updated list.
*/
static List *
ScanSourceDatabasePgClassPage(Page page, Buffer buf, Oid tbid, Oid dbid,
- char *srcpath, List *rnodelist,
+ char *srcpath, List *rlocatorlist,
Snapshot snapshot)
{
BlockNumber blkno = BufferGetBlockNumber(buf);
@@ -376,11 +376,11 @@ ScanSourceDatabasePgClassPage(Page page, Buffer buf, Oid tbid, Oid dbid,
relinfo = ScanSourceDatabasePgClassTuple(&tuple, tbid, dbid,
srcpath);
if (relinfo != NULL)
- rnodelist = lappend(rnodelist, relinfo);
+ rlocatorlist = lappend(rlocatorlist, relinfo);
}
}
- return rnodelist;
+ return rlocatorlist;
}
/*
@@ -397,7 +397,7 @@ ScanSourceDatabasePgClassTuple(HeapTupleData *tuple, Oid tbid, Oid dbid,
{
CreateDBRelInfo *relinfo;
Form_pg_class classForm;
- Oid relfilenode = InvalidOid;
+ Oid relfilenumber = InvalidRelFileNumber;
classForm = (Form_pg_class) GETSTRUCT(tuple);
@@ -418,29 +418,29 @@ ScanSourceDatabasePgClassTuple(HeapTupleData *tuple, Oid tbid, Oid dbid,
return NULL;
/*
- * If relfilenode is valid then directly use it. Otherwise, consult the
+ * If relfilenumber is valid then directly use it. Otherwise, consult the
* relmap.
*/
- if (OidIsValid(classForm->relfilenode))
- relfilenode = classForm->relfilenode;
+ if (RelFileNumberIsValid(classForm->relfilenode))
+ relfilenumber = classForm->relfilenode;
else
- relfilenode = RelationMapOidToFilenodeForDatabase(srcpath,
- classForm->oid);
+ relfilenumber = RelationMapOidToFilenumberForDatabase(srcpath,
+ classForm->oid);
- /* We must have a valid relfilenode oid. */
- if (!OidIsValid(relfilenode))
- elog(ERROR, "relation with OID %u does not have a valid relfilenode",
+ /* We must have a valid relfilenumber. */
+ if (!RelFileNumberIsValid(relfilenumber))
+ elog(ERROR, "relation with OID %u does not have a valid relfilenumber",
classForm->oid);
/* Prepare a rel info element and add it to the list. */
relinfo = (CreateDBRelInfo *) palloc(sizeof(CreateDBRelInfo));
if (OidIsValid(classForm->reltablespace))
- relinfo->rnode.spcNode = classForm->reltablespace;
+ relinfo->rlocator.spcOid = classForm->reltablespace;
else
- relinfo->rnode.spcNode = tbid;
+ relinfo->rlocator.spcOid = tbid;
- relinfo->rnode.dbNode = dbid;
- relinfo->rnode.relNode = relfilenode;
+ relinfo->rlocator.dbOid = dbid;
+ relinfo->rlocator.relNumber = relfilenumber;
relinfo->reloid = classForm->oid;
/* Temporary relations were rejected above. */
@@ -2867,8 +2867,8 @@ remove_dbtablespaces(Oid db_id)
* try to remove that already-existing subdirectory during the cleanup in
* remove_dbtablespaces. Nuking existing files seems like a bad idea, so
* instead we make this extra check before settling on the OID of the new
- * database. This exactly parallels what GetNewRelFileNode() does for table
- * relfilenode values.
+ * database. This exactly parallels what GetNewRelFileNumber() does for table
+ * relfilenumber values.
*/
static bool
check_db_file_conflict(Oid db_id)
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 99f5ab8..1868608 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -1109,10 +1109,10 @@ DefineIndex(Oid relationId,
}
/*
- * A valid stmt->oldNode implies that we already have a built form of the
+ * A valid stmt->oldNumber implies that we already have a built form of the
* index. The caller should also decline any index build.
*/
- Assert(!OidIsValid(stmt->oldNode) || (skip_build && !concurrent));
+ Assert(!RelFileNumberIsValid(stmt->oldNumber) || (skip_build && !concurrent));
/*
* Make the catalog entries for the index, including constraints. This
@@ -1154,7 +1154,7 @@ DefineIndex(Oid relationId,
indexRelationId =
index_create(rel, indexRelationName, indexRelationId, parentIndexId,
parentConstraintId,
- stmt->oldNode, indexInfo, indexColNames,
+ stmt->oldNumber, indexInfo, indexColNames,
accessMethodId, tablespaceId,
collationObjectId, classObjectId,
coloptions, reloptions,
@@ -1361,15 +1361,15 @@ DefineIndex(Oid relationId,
* We can't use the same index name for the child index,
* so clear idxname to let the recursive invocation choose
* a new name. Likewise, the existing target relation
- * field is wrong, and if indexOid or oldNode are set,
+ * field is wrong, and if indexOid or oldNumber are set,
* they mustn't be applied to the child either.
*/
childStmt->idxname = NULL;
childStmt->relation = NULL;
childStmt->indexOid = InvalidOid;
- childStmt->oldNode = InvalidOid;
+ childStmt->oldNumber = InvalidRelFileNumber;
childStmt->oldCreateSubid = InvalidSubTransactionId;
- childStmt->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ childStmt->oldFirstRelfilelocatorSubid = InvalidSubTransactionId;
/*
* Adjust any Vars (both in expressions and in the index's
@@ -3015,7 +3015,7 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
* particular this eliminates all shared catalogs.).
*/
if (RELKIND_HAS_STORAGE(classtuple->relkind) &&
- !OidIsValid(classtuple->relfilenode))
+ !RelFileNumberIsValid(classtuple->relfilenode))
skip_rel = true;
/*
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index d1ee106..9ac0383 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -118,7 +118,7 @@ SetMatViewPopulatedState(Relation relation, bool newstate)
* ExecRefreshMatView -- execute a REFRESH MATERIALIZED VIEW command
*
* This refreshes the materialized view by creating a new table and swapping
- * the relfilenodes of the new table and the old materialized view, so the OID
+ * the relfilenumbers of the new table and the old materialized view, so the OID
* of the original materialized view is preserved. Thus we do not lose GRANT
* nor references to this materialized view.
*
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index ddf219b..48d9d43 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -75,7 +75,7 @@ typedef struct sequence_magic
typedef struct SeqTableData
{
Oid relid; /* pg_class OID of this sequence (hash key) */
- Oid filenode; /* last seen relfilenode of this sequence */
+ RelFileNumber filenumber; /* last seen relfilenumber of this sequence */
LocalTransactionId lxid; /* xact in which we last did a seq op */
bool last_valid; /* do we have a valid "last" value? */
int64 last; /* value last returned by nextval */
@@ -255,7 +255,7 @@ DefineSequence(ParseState *pstate, CreateSeqStmt *seq)
*
* The change is made transactionally, so that on failure of the current
* transaction, the sequence will be restored to its previous state.
- * We do that by creating a whole new relfilenode for the sequence; so this
+ * We do that by creating a whole new relfilenumber for the sequence; so this
* works much like the rewriting forms of ALTER TABLE.
*
* Caller is assumed to have acquired AccessExclusiveLock on the sequence,
@@ -310,7 +310,7 @@ ResetSequence(Oid seq_relid)
/*
* Create a new storage file for the sequence.
*/
- RelationSetNewRelfilenode(seq_rel, seq_rel->rd_rel->relpersistence);
+ RelationSetNewRelfilenumber(seq_rel, seq_rel->rd_rel->relpersistence);
/*
* Ensure sequence's relfrozenxid is at 0, since it won't contain any
@@ -347,9 +347,9 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
{
SMgrRelation srel;
- srel = smgropen(rel->rd_node, InvalidBackendId);
+ srel = smgropen(rel->rd_locator, InvalidBackendId);
smgrcreate(srel, INIT_FORKNUM, false);
- log_smgrcreate(&rel->rd_node, INIT_FORKNUM);
+ log_smgrcreate(&rel->rd_locator, INIT_FORKNUM);
fill_seq_fork_with_data(rel, tuple, INIT_FORKNUM);
FlushRelationBuffers(rel);
smgrclose(srel);
@@ -418,7 +418,7 @@ fill_seq_fork_with_data(Relation rel, HeapTuple tuple, ForkNumber forkNum)
XLogBeginInsert();
XLogRegisterBuffer(0, buf, REGBUF_WILL_INIT);
- xlrec.node = rel->rd_node;
+ xlrec.locator = rel->rd_locator;
XLogRegisterData((char *) &xlrec, sizeof(xl_seq_rec));
XLogRegisterData((char *) tuple->t_data, tuple->t_len);
@@ -509,7 +509,7 @@ AlterSequence(ParseState *pstate, AlterSeqStmt *stmt)
* Create a new storage file for the sequence, making the state
* changes transactional.
*/
- RelationSetNewRelfilenode(seqrel, seqrel->rd_rel->relpersistence);
+ RelationSetNewRelfilenumber(seqrel, seqrel->rd_rel->relpersistence);
/*
* Ensure sequence's relfrozenxid is at 0, since it won't contain any
@@ -557,7 +557,7 @@ SequenceChangePersistence(Oid relid, char newrelpersistence)
GetTopTransactionId();
(void) read_seq_tuple(seqrel, &buf, &seqdatatuple);
- RelationSetNewRelfilenode(seqrel, newrelpersistence);
+ RelationSetNewRelfilenumber(seqrel, newrelpersistence);
fill_seq_with_data(seqrel, &seqdatatuple);
UnlockReleaseBuffer(buf);
@@ -836,7 +836,7 @@ nextval_internal(Oid relid, bool check_permissions)
seq->is_called = true;
seq->log_cnt = 0;
- xlrec.node = seqrel->rd_node;
+ xlrec.locator = seqrel->rd_locator;
XLogRegisterData((char *) &xlrec, sizeof(xl_seq_rec));
XLogRegisterData((char *) seqdatatuple.t_data, seqdatatuple.t_len);
@@ -1023,7 +1023,7 @@ do_setval(Oid relid, int64 next, bool iscalled)
XLogBeginInsert();
XLogRegisterBuffer(0, buf, REGBUF_WILL_INIT);
- xlrec.node = seqrel->rd_node;
+ xlrec.locator = seqrel->rd_locator;
XLogRegisterData((char *) &xlrec, sizeof(xl_seq_rec));
XLogRegisterData((char *) seqdatatuple.t_data, seqdatatuple.t_len);
@@ -1147,7 +1147,7 @@ init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel)
if (!found)
{
/* relid already filled in */
- elm->filenode = InvalidOid;
+ elm->filenumber = InvalidRelFileNumber;
elm->lxid = InvalidLocalTransactionId;
elm->last_valid = false;
elm->last = elm->cached = 0;
@@ -1169,9 +1169,9 @@ init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel)
* discard any cached-but-unissued values. We do not touch the currval()
* state, however.
*/
- if (seqrel->rd_rel->relfilenode != elm->filenode)
+ if (seqrel->rd_rel->relfilenode != elm->filenumber)
{
- elm->filenode = seqrel->rd_rel->relfilenode;
+ elm->filenumber = seqrel->rd_rel->relfilenode;
elm->cached = elm->last;
}
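The init_sequence() hunk above shows why SeqTableData caches the last-seen relfilenumber at all: when a rewriting operation such as ALTER SEQUENCE assigns a new filenumber, the backend-local cache of pre-fetched values must be discarded, since those values belong to the old storage file. A minimal sketch of that invalidation check, with a simplified struct and a helper name that is mine, not the patch's:

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for SeqTableData. */
typedef struct SeqTableData
{
	unsigned	relid;			/* pg_class OID of the sequence */
	unsigned	filenumber;		/* last seen relfilenumber */
	long		last;			/* value last returned by nextval */
	long		cached;			/* last value fetched ahead into the cache */
} SeqTableData;

/*
 * Mirror of the init_sequence() check: if the relfilenumber changed since
 * we last touched this sequence (i.e. its storage was rewritten), discard
 * any cached-but-unissued values.  Returns true when the cache was reset.
 */
static bool
maybe_discard_cache(SeqTableData *elm, unsigned current_filenumber)
{
	if (elm->filenumber != current_filenumber)
	{
		elm->filenumber = current_filenumber;
		elm->cached = elm->last;
		return true;
	}
	return false;
}
```

As the comment in the patched code notes, currval() state is deliberately left untouched by this reset.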
@@ -1254,7 +1254,8 @@ read_seq_tuple(Relation rel, Buffer *buf, HeapTuple seqdatatuple)
* changed. This allows ALTER SEQUENCE to behave transactionally. Currently,
* the only option that doesn't cause that is OWNED BY. It's *necessary* for
* ALTER SEQUENCE OWNED BY to not rewrite the sequence, because that would
- * break pg_upgrade by causing unwanted changes in the sequence's relfilenode.
+ * break pg_upgrade by causing unwanted changes in the sequence's
+ * relfilenumber.
*/
static void
init_params(ParseState *pstate, List *options, bool for_identity,
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 2de0eba..1249c89 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -596,7 +596,7 @@ static void ATExecForceNoForceRowSecurity(Relation rel, bool force_rls);
static ObjectAddress ATExecSetCompression(AlteredTableInfo *tab, Relation rel,
const char *column, Node *newValue, LOCKMODE lockmode);
-static void index_copy_data(Relation rel, RelFileNode newrnode);
+static void index_copy_data(Relation rel, RelFileLocator newrlocator);
static const char *storage_name(char c);
static void RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid,
@@ -1986,12 +1986,12 @@ ExecuteTruncateGuts(List *explicit_rels,
/*
* Normally, we need a transaction-safe truncation here. However, if
* the table was either created in the current (sub)transaction or has
- * a new relfilenode in the current (sub)transaction, then we can just
+ * a new relfilenumber in the current (sub)transaction, then we can just
* truncate it in-place, because a rollback would cause the whole
* table or the current physical file to be thrown away anyway.
*/
if (rel->rd_createSubid == mySubid ||
- rel->rd_newRelfilenodeSubid == mySubid)
+ rel->rd_newRelfilelocatorSubid == mySubid)
{
/* Immediate, non-rollbackable truncation is OK */
heap_truncate_one_rel(rel);
@@ -2014,10 +2014,10 @@ ExecuteTruncateGuts(List *explicit_rels,
* Need the full transaction-safe pushups.
*
* Create a new empty storage file for the relation, and assign it
- * as the relfilenode value. The old storage file is scheduled for
+ * as the relfilenumber value. The old storage file is scheduled for
* deletion at commit.
*/
- RelationSetNewRelfilenode(rel, rel->rd_rel->relpersistence);
+ RelationSetNewRelfilenumber(rel, rel->rd_rel->relpersistence);
heap_relid = RelationGetRelid(rel);
@@ -2030,7 +2030,7 @@ ExecuteTruncateGuts(List *explicit_rels,
Relation toastrel = relation_open(toast_relid,
AccessExclusiveLock);
- RelationSetNewRelfilenode(toastrel,
+ RelationSetNewRelfilenumber(toastrel,
toastrel->rd_rel->relpersistence);
table_close(toastrel, NoLock);
}
@@ -3315,11 +3315,11 @@ CheckRelationTableSpaceMove(Relation rel, Oid newTableSpaceId)
/*
* SetRelationTableSpace
- * Set new reltablespace and relfilenode in pg_class entry.
+ * Set new reltablespace and relfilenumber in pg_class entry.
*
* newTableSpaceId is the new tablespace for the relation, and
- * newRelFileNode its new filenode. If newRelFileNode is InvalidOid,
- * this field is not updated.
+ * newRelFilenumber its new filenumber. If newRelFilenumber is
+ * InvalidRelFileNumber, this field is not updated.
*
* NOTE: The caller must hold AccessExclusiveLock on the relation.
*
@@ -3331,7 +3331,7 @@ CheckRelationTableSpaceMove(Relation rel, Oid newTableSpaceId)
void
SetRelationTableSpace(Relation rel,
Oid newTableSpaceId,
- Oid newRelFileNode)
+ RelFileNumber newRelFilenumber)
{
Relation pg_class;
HeapTuple tuple;
@@ -3351,8 +3351,8 @@ SetRelationTableSpace(Relation rel,
/* Update the pg_class row. */
rd_rel->reltablespace = (newTableSpaceId == MyDatabaseTableSpace) ?
InvalidOid : newTableSpaceId;
- if (OidIsValid(newRelFileNode))
- rd_rel->relfilenode = newRelFileNode;
+ if (RelFileNumberIsValid(newRelFilenumber))
+ rd_rel->relfilenode = newRelFilenumber;
CatalogTupleUpdate(pg_class, &tuple->t_self, tuple);
/*
@@ -5420,7 +5420,7 @@ ATRewriteTables(AlterTableStmt *parsetree, List **wqueue, LOCKMODE lockmode,
* persistence: on one hand, we need to ensure that the buffers
* belonging to each of the two relations are marked with or without
* BM_PERMANENT properly. On the other hand, since rewriting creates
- * and assigns a new relfilenode, we automatically create or drop an
+ * and assigns a new relfilenumber, we automatically create or drop an
* init fork for the relation as appropriate.
*/
if (tab->rewrite > 0 && tab->relkind != RELKIND_SEQUENCE)
@@ -5506,12 +5506,13 @@ ATRewriteTables(AlterTableStmt *parsetree, List **wqueue, LOCKMODE lockmode,
* Create transient table that will receive the modified data.
*
* Ensure it is marked correctly as logged or unlogged. We have
- * to do this here so that buffers for the new relfilenode will
+ * to do this here so that buffers for the new relfilenumber will
* have the right persistence set, and at the same time ensure
- * that the original filenode's buffers will get read in with the
- * correct setting (i.e. the original one). Otherwise a rollback
- * after the rewrite would possibly result with buffers for the
- * original filenode having the wrong persistence setting.
+ * that the original filenumber's buffers will get read in with
+ * the correct setting (i.e. the original one). Otherwise a
+ * rollback after the rewrite would possibly result with buffers
+ * for the original filenumber having the wrong persistence
+ * setting.
*
* NB: This relies on swap_relation_files() also swapping the
* persistence. That wouldn't work for pg_class, but that can't be
@@ -8597,7 +8598,7 @@ ATExecAddIndex(AlteredTableInfo *tab, Relation rel,
/* suppress schema rights check when rebuilding existing index */
check_rights = !is_rebuild;
/* skip index build if phase 3 will do it or we're reusing an old one */
- skip_build = tab->rewrite > 0 || OidIsValid(stmt->oldNode);
+ skip_build = tab->rewrite > 0 || RelFileNumberIsValid(stmt->oldNumber);
/* suppress notices when rebuilding existing index */
quiet = is_rebuild;
@@ -8613,7 +8614,7 @@ ATExecAddIndex(AlteredTableInfo *tab, Relation rel,
quiet);
/*
- * If TryReuseIndex() stashed a relfilenode for us, we used it for the new
+ * If TryReuseIndex() stashed a relfilenumber for us, we used it for the new
* index instead of building from scratch. Restore associated fields.
* This may store InvalidSubTransactionId in both fields, in which case
* relcache.c will assume it can rebuild the relcache entry. Hence, do
@@ -8621,13 +8622,13 @@ ATExecAddIndex(AlteredTableInfo *tab, Relation rel,
* DROP of the old edition of this index will have scheduled the storage
* for deletion at commit, so cancel that pending deletion.
*/
- if (OidIsValid(stmt->oldNode))
+ if (RelFileNumberIsValid(stmt->oldNumber))
{
Relation irel = index_open(address.objectId, NoLock);
irel->rd_createSubid = stmt->oldCreateSubid;
- irel->rd_firstRelfilenodeSubid = stmt->oldFirstRelfilenodeSubid;
- RelationPreserveStorage(irel->rd_node, true);
+ irel->rd_firstRelfilelocatorSubid = stmt->oldFirstRelfilelocatorSubid;
+ RelationPreserveStorage(irel->rd_locator, true);
index_close(irel, NoLock);
}
@@ -13491,9 +13492,9 @@ TryReuseIndex(Oid oldId, IndexStmt *stmt)
/* If it's a partitioned index, there is no storage to share. */
if (irel->rd_rel->relkind != RELKIND_PARTITIONED_INDEX)
{
- stmt->oldNode = irel->rd_node.relNode;
+ stmt->oldNumber = irel->rd_locator.relNumber;
stmt->oldCreateSubid = irel->rd_createSubid;
- stmt->oldFirstRelfilenodeSubid = irel->rd_firstRelfilenodeSubid;
+ stmt->oldFirstRelfilelocatorSubid = irel->rd_firstRelfilelocatorSubid;
}
index_close(irel, NoLock);
}
@@ -14340,8 +14341,8 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
{
Relation rel;
Oid reltoastrelid;
- Oid newrelfilenode;
- RelFileNode newrnode;
+ RelFileNumber newrelfilenumber;
+ RelFileLocator newrlocator;
List *reltoastidxids = NIL;
ListCell *lc;
@@ -14370,26 +14371,28 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
}
/*
- * Relfilenodes are not unique in databases across tablespaces, so we need
+ * Relfilenumbers are not unique in databases across tablespaces, so we need
* to allocate a new one in the new tablespace.
*/
- newrelfilenode = GetNewRelFileNode(newTableSpace, NULL,
- rel->rd_rel->relpersistence);
+ newrelfilenumber = GetNewRelFileNumber(newTableSpace, NULL,
+ rel->rd_rel->relpersistence);
/* Open old and new relation */
- newrnode = rel->rd_node;
- newrnode.relNode = newrelfilenode;
- newrnode.spcNode = newTableSpace;
+ newrlocator = rel->rd_locator;
+ newrlocator.relNumber = newrelfilenumber;
+ newrlocator.spcOid = newTableSpace;
- /* hand off to AM to actually create the new filenode and copy the data */
+ /*
+ * hand off to AM to actually create the new filelocator and copy the data
+ */
if (rel->rd_rel->relkind == RELKIND_INDEX)
{
- index_copy_data(rel, newrnode);
+ index_copy_data(rel, newrlocator);
}
else
{
Assert(RELKIND_HAS_TABLE_AM(rel->rd_rel->relkind));
- table_relation_copy_data(rel, &newrnode);
+ table_relation_copy_data(rel, &newrlocator);
}
/*
@@ -14400,11 +14403,11 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
* the updated pg_class entry), but that's forbidden with
* CheckRelationTableSpaceMove().
*/
- SetRelationTableSpace(rel, newTableSpace, newrelfilenode);
+ SetRelationTableSpace(rel, newTableSpace, newrelfilenumber);
InvokeObjectPostAlterHook(RelationRelationId, RelationGetRelid(rel), 0);
- RelationAssumeNewRelfilenode(rel);
+ RelationAssumeNewRelfilelocator(rel);
relation_close(rel, NoLock);
@@ -14630,11 +14633,11 @@ AlterTableMoveAll(AlterTableMoveAllStmt *stmt)
}
static void
-index_copy_data(Relation rel, RelFileNode newrnode)
+index_copy_data(Relation rel, RelFileLocator newrlocator)
{
SMgrRelation dstrel;
- dstrel = smgropen(newrnode, rel->rd_backend);
+ dstrel = smgropen(newrlocator, rel->rd_backend);
/*
* Since we copy the file directly without looking at the shared buffers,
@@ -14648,10 +14651,10 @@ index_copy_data(Relation rel, RelFileNode newrnode)
* Create and copy all forks of the relation, and schedule unlinking of
* old physical files.
*
- * NOTE: any conflict in relfilenode value will be caught in
+ * NOTE: any conflict in relfilenumber value will be caught in
* RelationCreateStorage().
*/
- RelationCreateStorage(newrnode, rel->rd_rel->relpersistence, true);
+ RelationCreateStorage(newrlocator, rel->rd_rel->relpersistence, true);
/* copy main fork */
RelationCopyStorage(RelationGetSmgr(rel), dstrel, MAIN_FORKNUM,
@@ -14672,7 +14675,7 @@ index_copy_data(Relation rel, RelFileNode newrnode)
if (RelationIsPermanent(rel) ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
- log_smgrcreate(&newrnode, forkNum);
+ log_smgrcreate(&newrlocator, forkNum);
RelationCopyStorage(RelationGetSmgr(rel), dstrel, forkNum,
rel->rd_rel->relpersistence);
}
diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index 00ca397..c8bdd99 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -12,12 +12,12 @@
* remove the possibility of having file name conflicts, we isolate
* files within a tablespace into database-specific subdirectories.
*
- * To support file access via the information given in RelFileNode, we
+ * To support file access via the information given in RelFileLocator, we
* maintain a symbolic-link map in $PGDATA/pg_tblspc. The symlinks are
* named by tablespace OIDs and point to the actual tablespace directories.
* There is also a per-cluster version directory in each tablespace.
* Thus the full path to an arbitrary file is
- * $PGDATA/pg_tblspc/spcoid/PG_MAJORVER_CATVER/dboid/relfilenode
+ * $PGDATA/pg_tblspc/spcoid/PG_MAJORVER_CATVER/dboid/relfilenumber
* e.g.
* $PGDATA/pg_tblspc/20981/PG_9.0_201002161/719849/83292814
*
@@ -25,8 +25,8 @@
* tables) and pg_default (for everything else). For backwards compatibility
* and to remain functional on platforms without symlinks, these tablespaces
* are accessed specially: they are respectively
- * $PGDATA/global/relfilenode
- * $PGDATA/base/dboid/relfilenode
+ * $PGDATA/global/relfilenumber
+ * $PGDATA/base/dboid/relfilenumber
*
* To allow CREATE DATABASE to give a new database a default tablespace
* that's different from the template database's default, we make the
@@ -115,7 +115,7 @@ static bool destroy_tablespace_directories(Oid tablespaceoid, bool redo);
* re-create a database subdirectory (of $PGDATA/base) during WAL replay.
*/
void
-TablespaceCreateDbspace(Oid spcNode, Oid dbNode, bool isRedo)
+TablespaceCreateDbspace(Oid spcOid, Oid dbOid, bool isRedo)
{
struct stat st;
char *dir;
@@ -124,13 +124,13 @@ TablespaceCreateDbspace(Oid spcNode, Oid dbNode, bool isRedo)
* The global tablespace doesn't have per-database subdirectories, so
* nothing to do for it.
*/
- if (spcNode == GLOBALTABLESPACE_OID)
+ if (spcOid == GLOBALTABLESPACE_OID)
return;
- Assert(OidIsValid(spcNode));
- Assert(OidIsValid(dbNode));
+ Assert(OidIsValid(spcOid));
+ Assert(OidIsValid(dbOid));
- dir = GetDatabasePath(dbNode, spcNode);
+ dir = GetDatabasePath(dbOid, spcOid);
if (stat(dir, &st) < 0)
{
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 51d630f..9c28efa 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4193,9 +4193,9 @@ _copyIndexStmt(const IndexStmt *from)
COPY_NODE_FIELD(excludeOpNames);
COPY_STRING_FIELD(idxcomment);
COPY_SCALAR_FIELD(indexOid);
- COPY_SCALAR_FIELD(oldNode);
+ COPY_SCALAR_FIELD(oldNumber);
COPY_SCALAR_FIELD(oldCreateSubid);
- COPY_SCALAR_FIELD(oldFirstRelfilenodeSubid);
+ COPY_SCALAR_FIELD(oldFirstRelfilelocatorSubid);
COPY_SCALAR_FIELD(unique);
COPY_SCALAR_FIELD(nulls_not_distinct);
COPY_SCALAR_FIELD(primary);
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index e747e16..5b30005 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1752,9 +1752,9 @@ _equalIndexStmt(const IndexStmt *a, const IndexStmt *b)
COMPARE_NODE_FIELD(excludeOpNames);
COMPARE_STRING_FIELD(idxcomment);
COMPARE_SCALAR_FIELD(indexOid);
- COMPARE_SCALAR_FIELD(oldNode);
+ COMPARE_SCALAR_FIELD(oldNumber);
COMPARE_SCALAR_FIELD(oldCreateSubid);
- COMPARE_SCALAR_FIELD(oldFirstRelfilenodeSubid);
+ COMPARE_SCALAR_FIELD(oldFirstRelfilelocatorSubid);
COMPARE_SCALAR_FIELD(unique);
COMPARE_SCALAR_FIELD(nulls_not_distinct);
COMPARE_SCALAR_FIELD(primary);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 4315c53..05f27f0 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2932,9 +2932,9 @@ _outIndexStmt(StringInfo str, const IndexStmt *node)
WRITE_NODE_FIELD(excludeOpNames);
WRITE_STRING_FIELD(idxcomment);
WRITE_OID_FIELD(indexOid);
- WRITE_OID_FIELD(oldNode);
+ WRITE_OID_FIELD(oldNumber);
WRITE_UINT_FIELD(oldCreateSubid);
- WRITE_UINT_FIELD(oldFirstRelfilenodeSubid);
+ WRITE_UINT_FIELD(oldFirstRelfilelocatorSubid);
WRITE_BOOL_FIELD(unique);
WRITE_BOOL_FIELD(nulls_not_distinct);
WRITE_BOOL_FIELD(primary);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 969c9c1..0523013 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -7990,9 +7990,9 @@ IndexStmt: CREATE opt_unique INDEX opt_concurrently opt_index_name
n->excludeOpNames = NIL;
n->idxcomment = NULL;
n->indexOid = InvalidOid;
- n->oldNode = InvalidOid;
+ n->oldNumber = InvalidRelFileNumber;
n->oldCreateSubid = InvalidSubTransactionId;
- n->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ n->oldFirstRelfilelocatorSubid = InvalidSubTransactionId;
n->primary = false;
n->isconstraint = false;
n->deferrable = false;
@@ -8022,9 +8022,9 @@ IndexStmt: CREATE opt_unique INDEX opt_concurrently opt_index_name
n->excludeOpNames = NIL;
n->idxcomment = NULL;
n->indexOid = InvalidOid;
- n->oldNode = InvalidOid;
+ n->oldNumber = InvalidRelFileNumber;
n->oldCreateSubid = InvalidSubTransactionId;
- n->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ n->oldFirstRelfilelocatorSubid = InvalidSubTransactionId;
n->primary = false;
n->isconstraint = false;
n->deferrable = false;
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index f889726..b572534 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -1578,9 +1578,9 @@ generateClonedIndexStmt(RangeVar *heapRel, Relation source_idx,
index->excludeOpNames = NIL;
index->idxcomment = NULL;
index->indexOid = InvalidOid;
- index->oldNode = InvalidOid;
+ index->oldNumber = InvalidRelFileNumber;
index->oldCreateSubid = InvalidSubTransactionId;
- index->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ index->oldFirstRelfilelocatorSubid = InvalidSubTransactionId;
index->unique = idxrec->indisunique;
index->nulls_not_distinct = idxrec->indnullsnotdistinct;
index->primary = idxrec->indisprimary;
@@ -2199,9 +2199,9 @@ transformIndexConstraint(Constraint *constraint, CreateStmtContext *cxt)
index->excludeOpNames = NIL;
index->idxcomment = NULL;
index->indexOid = InvalidOid;
- index->oldNode = InvalidOid;
+ index->oldNumber = InvalidRelFileNumber;
index->oldCreateSubid = InvalidSubTransactionId;
- index->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ index->oldFirstRelfilelocatorSubid = InvalidSubTransactionId;
index->transformed = false;
index->concurrent = false;
index->if_not_exists = false;
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index c937c39..5fc076f 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -1207,7 +1207,7 @@ CompactCheckpointerRequestQueue(void)
* We use the request struct directly as a hashtable key. This
* assumes that any padding bytes in the structs are consistently the
* same, which should be okay because we zeroed them in
- * CheckpointerShmemInit. Note also that RelFileNode had better
+ * CheckpointerShmemInit. Note also that RelFileLocator had better
* contain no pad bytes.
*/
request = &CheckpointerShmem->requests[n];
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index aa2427b..c5c6a2b 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -845,7 +845,7 @@ DecodeInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
XLogReaderState *r = buf->record;
xl_heap_insert *xlrec;
ReorderBufferChange *change;
- RelFileNode target_node;
+ RelFileLocator target_locator;
xlrec = (xl_heap_insert *) XLogRecGetData(r);
@@ -857,8 +857,8 @@ DecodeInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
return;
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
- if (target_node.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &target_locator, NULL, NULL);
+ if (target_locator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -872,7 +872,7 @@ DecodeInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change->action = REORDER_BUFFER_CHANGE_INTERNAL_SPEC_INSERT;
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &target_node, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &target_locator, sizeof(RelFileLocator));
tupledata = XLogRecGetBlockData(r, 0, &datalen);
tuplelen = datalen - SizeOfHeapHeader;
@@ -902,13 +902,13 @@ DecodeUpdate(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
xl_heap_update *xlrec;
ReorderBufferChange *change;
char *data;
- RelFileNode target_node;
+ RelFileLocator target_locator;
xlrec = (xl_heap_update *) XLogRecGetData(r);
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
- if (target_node.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &target_locator, NULL, NULL);
+ if (target_locator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -918,7 +918,7 @@ DecodeUpdate(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change = ReorderBufferGetChange(ctx->reorder);
change->action = REORDER_BUFFER_CHANGE_UPDATE;
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &target_node, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &target_locator, sizeof(RelFileLocator));
if (xlrec->flags & XLH_UPDATE_CONTAINS_NEW_TUPLE)
{
@@ -968,13 +968,13 @@ DecodeDelete(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
XLogReaderState *r = buf->record;
xl_heap_delete *xlrec;
ReorderBufferChange *change;
- RelFileNode target_node;
+ RelFileLocator target_locator;
xlrec = (xl_heap_delete *) XLogRecGetData(r);
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
- if (target_node.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &target_locator, NULL, NULL);
+ if (target_locator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -990,7 +990,7 @@ DecodeDelete(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &target_node, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &target_locator, sizeof(RelFileLocator));
/* old primary key stored */
if (xlrec->flags & XLH_DELETE_CONTAINS_OLD)
@@ -1063,7 +1063,7 @@ DecodeMultiInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
char *data;
char *tupledata;
Size tuplelen;
- RelFileNode rnode;
+ RelFileLocator rlocator;
xlrec = (xl_heap_multi_insert *) XLogRecGetData(r);
@@ -1075,8 +1075,8 @@ DecodeMultiInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
return;
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &rnode, NULL, NULL);
- if (rnode.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &rlocator, NULL, NULL);
+ if (rlocator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -1103,7 +1103,7 @@ DecodeMultiInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change->action = REORDER_BUFFER_CHANGE_INSERT;
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &rnode, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &rlocator, sizeof(RelFileLocator));
xlhdr = (xl_multi_insert_tuple *) SHORTALIGN(data);
data = ((char *) xlhdr) + SizeOfMultiInsertTuple;
@@ -1165,11 +1165,11 @@ DecodeSpecConfirm(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
{
XLogReaderState *r = buf->record;
ReorderBufferChange *change;
- RelFileNode target_node;
+ RelFileLocator target_locator;
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
- if (target_node.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &target_locator, NULL, NULL);
+ if (target_locator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -1180,7 +1180,7 @@ DecodeSpecConfirm(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change->action = REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM;
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &target_node, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &target_locator, sizeof(RelFileLocator));
change->data.tp.clear_toast_afterwards = true;
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 8da5f90..f8fb228 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -106,7 +106,7 @@
#include "utils/memdebug.h"
#include "utils/memutils.h"
#include "utils/rel.h"
-#include "utils/relfilenodemap.h"
+#include "utils/relfilenumbermap.h"
/* entry for a hash table we use to map from xid to our transaction state */
@@ -116,10 +116,10 @@ typedef struct ReorderBufferTXNByIdEnt
ReorderBufferTXN *txn;
} ReorderBufferTXNByIdEnt;
-/* data structures for (relfilenode, ctid) => (cmin, cmax) mapping */
+/* data structures for (relfilelocator, ctid) => (cmin, cmax) mapping */
typedef struct ReorderBufferTupleCidKey
{
- RelFileNode relnode;
+ RelFileLocator rlocator;
ItemPointerData tid;
} ReorderBufferTupleCidKey;
@@ -1643,7 +1643,7 @@ ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, bool txn_prep
}
/*
- * Destroy the (relfilenode, ctid) hashtable, so that we don't leak any
+ * Destroy the (relfilelocator, ctid) hashtable, so that we don't leak any
* memory. We could also keep the hash table and update it with new ctid
* values, but this seems simpler and good enough for now.
*/
@@ -1673,7 +1673,7 @@ ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, bool txn_prep
}
/*
- * Build a hash with a (relfilenode, ctid) -> (cmin, cmax) mapping for use by
+ * Build a hash with a (relfilelocator, ctid) -> (cmin, cmax) mapping for use by
* HeapTupleSatisfiesHistoricMVCC.
*/
static void
@@ -1711,7 +1711,7 @@ ReorderBufferBuildTupleCidHash(ReorderBuffer *rb, ReorderBufferTXN *txn)
/* be careful about padding */
memset(&key, 0, sizeof(ReorderBufferTupleCidKey));
- key.relnode = change->data.tuplecid.node;
+ key.rlocator = change->data.tuplecid.locator;
ItemPointerCopy(&change->data.tuplecid.tid,
&key.tid);
@@ -2140,36 +2140,36 @@ ReorderBufferProcessTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
case REORDER_BUFFER_CHANGE_DELETE:
Assert(snapshot_now);
- reloid = RelidByRelfilenode(change->data.tp.relnode.spcNode,
- change->data.tp.relnode.relNode);
+ reloid = RelidByRelfilenumber(change->data.tp.rlocator.spcOid,
+ change->data.tp.rlocator.relNumber);
/*
* Mapped catalog tuple without data, emitted while
* catalog table was in the process of being rewritten. We
- * can fail to look up the relfilenode, because the
+ * can fail to look up the relfilenumber, because the
* relmapper has no "historic" view, in contrast to the
* normal catalog during decoding. Thus repeated rewrites
* can cause a lookup failure. That's OK because we do not
* decode catalog changes anyway. Normally such tuples
* would be skipped over below, but we can't identify
* whether the table should be logically logged without
- * mapping the relfilenode to the oid.
+ * mapping the relfilenumber to the oid.
*/
if (reloid == InvalidOid &&
change->data.tp.newtuple == NULL &&
change->data.tp.oldtuple == NULL)
goto change_done;
else if (reloid == InvalidOid)
- elog(ERROR, "could not map filenode \"%s\" to relation OID",
- relpathperm(change->data.tp.relnode,
+ elog(ERROR, "could not map filenumber \"%s\" to relation OID",
+ relpathperm(change->data.tp.rlocator,
MAIN_FORKNUM));
relation = RelationIdGetRelation(reloid);
if (!RelationIsValid(relation))
- elog(ERROR, "could not open relation with OID %u (for filenode \"%s\")",
+ elog(ERROR, "could not open relation with OID %u (for filenumber \"%s\")",
reloid,
- relpathperm(change->data.tp.relnode,
+ relpathperm(change->data.tp.rlocator,
MAIN_FORKNUM));
if (!RelationIsLogicallyLogged(relation))
@@ -3157,7 +3157,7 @@ ReorderBufferChangeMemoryUpdate(ReorderBuffer *rb,
}
/*
- * Add new (relfilenode, tid) -> (cmin, cmax) mappings.
+ * Add new (relfilelocator, tid) -> (cmin, cmax) mappings.
*
* We do not include this change type in memory accounting, because we
* keep CIDs in a separate list and do not evict them when reaching
@@ -3165,7 +3165,7 @@ ReorderBufferChangeMemoryUpdate(ReorderBuffer *rb,
*/
void
ReorderBufferAddNewTupleCids(ReorderBuffer *rb, TransactionId xid,
- XLogRecPtr lsn, RelFileNode node,
+ XLogRecPtr lsn, RelFileLocator locator,
ItemPointerData tid, CommandId cmin,
CommandId cmax, CommandId combocid)
{
@@ -3174,7 +3174,7 @@ ReorderBufferAddNewTupleCids(ReorderBuffer *rb, TransactionId xid,
txn = ReorderBufferTXNByXid(rb, xid, true, NULL, lsn, true);
- change->data.tuplecid.node = node;
+ change->data.tuplecid.locator = locator;
change->data.tuplecid.tid = tid;
change->data.tuplecid.cmin = cmin;
change->data.tuplecid.cmax = cmax;
@@ -4839,7 +4839,7 @@ ReorderBufferToastReset(ReorderBuffer *rb, ReorderBufferTXN *txn)
* need anymore.
*
* To resolve those problems we have a per-transaction hash of (cmin,
- * cmax) tuples keyed by (relfilenode, ctid) which contains the actual
+ * cmax) tuples keyed by (relfilelocator, ctid) which contains the actual
* (cmin, cmax) values. That also takes care of combo CIDs by simply
* not caring about them at all. As we have the real cmin/cmax values
* combo CIDs aren't interesting.
@@ -4870,9 +4870,9 @@ DisplayMapping(HTAB *tuplecid_data)
while ((ent = (ReorderBufferTupleCidEnt *) hash_seq_search(&hstat)) != NULL)
{
elog(DEBUG3, "mapping: node: %u/%u/%u tid: %u/%u cmin: %u, cmax: %u",
- ent->key.relnode.dbNode,
- ent->key.relnode.spcNode,
- ent->key.relnode.relNode,
+ ent->key.rlocator.dbOid,
+ ent->key.rlocator.spcOid,
+ ent->key.rlocator.relNumber,
ItemPointerGetBlockNumber(&ent->key.tid),
ItemPointerGetOffsetNumber(&ent->key.tid),
ent->cmin,
@@ -4932,7 +4932,7 @@ ApplyLogicalMappingFile(HTAB *tuplecid_data, Oid relid, const char *fname)
path, readBytes,
(int32) sizeof(LogicalRewriteMappingData))));
- key.relnode = map.old_node;
+ key.rlocator = map.old_locator;
ItemPointerCopy(&map.old_tid,
&key.tid);
@@ -4947,7 +4947,7 @@ ApplyLogicalMappingFile(HTAB *tuplecid_data, Oid relid, const char *fname)
if (!ent)
continue;
- key.relnode = map.new_node;
+ key.rlocator = map.new_locator;
ItemPointerCopy(&map.new_tid,
&key.tid);
@@ -5120,10 +5120,10 @@ ResolveCminCmaxDuringDecoding(HTAB *tuplecid_data,
Assert(!BufferIsLocal(buffer));
/*
- * get relfilenode from the buffer, no convenient way to access it other
+ * get relfilelocator from the buffer, no convenient way to access it other
* than that.
*/
- BufferGetTag(buffer, &key.relnode, &forkno, &blockno);
+ BufferGetTag(buffer, &key.rlocator, &forkno, &blockno);
/* tuples can only be in the main fork */
Assert(forkno == MAIN_FORKNUM);
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 1119a12..73c0f15 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -781,7 +781,7 @@ SnapBuildProcessNewCid(SnapBuild *builder, TransactionId xid,
ReorderBufferXidSetCatalogChanges(builder->reorder, xid, lsn);
ReorderBufferAddNewTupleCids(builder->reorder, xlrec->top_xid, lsn,
- xlrec->target_node, xlrec->target_tid,
+ xlrec->target_locator, xlrec->target_tid,
xlrec->cmin, xlrec->cmax,
xlrec->combocid);
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index ae13011..7071ff6 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -121,12 +121,12 @@ typedef struct CkptTsStatus
* Type for array used to sort SMgrRelations
*
* FlushRelationsAllBuffers shares the same comparator function with
- * DropRelFileNodesAllBuffers. Pointer to this struct and RelFileNode must be
+ * DropRelFileLocatorsAllBuffers. Pointer to this struct and RelFileLocator must be
* compatible.
*/
typedef struct SMgrSortArray
{
- RelFileNode rnode; /* This must be the first member */
+ RelFileLocator rlocator; /* This must be the first member */
SMgrRelation srel;
} SMgrSortArray;
@@ -483,7 +483,7 @@ static BufferDesc *BufferAlloc(SMgrRelation smgr,
BufferAccessStrategy strategy,
bool *foundPtr);
static void FlushBuffer(BufferDesc *buf, SMgrRelation reln);
-static void FindAndDropRelFileNodeBuffers(RelFileNode rnode,
+static void FindAndDropRelFileLocatorBuffers(RelFileLocator rlocator,
ForkNumber forkNum,
BlockNumber nForkBlock,
BlockNumber firstDelBlock);
@@ -492,7 +492,7 @@ static void RelationCopyStorageUsingBuffer(Relation src, Relation dst,
bool isunlogged);
static void AtProcExit_Buffers(int code, Datum arg);
static void CheckForBufferLeaks(void);
-static int rnode_comparator(const void *p1, const void *p2);
+static int rlocator_comparator(const void *p1, const void *p2);
static inline int buffertag_comparator(const BufferTag *a, const BufferTag *b);
static inline int ckpt_buforder_comparator(const CkptSortItem *a, const CkptSortItem *b);
static int ts_ckpt_progress_comparator(Datum a, Datum b, void *arg);
@@ -515,7 +515,7 @@ PrefetchSharedBuffer(SMgrRelation smgr_reln,
Assert(BlockNumberIsValid(blockNum));
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr_reln->smgr_rnode.node,
+ INIT_BUFFERTAG(newTag, smgr_reln->smgr_rlocator.locator,
forkNum, blockNum);
/* determine its hash code and partition lock ID */
@@ -620,7 +620,7 @@ PrefetchBuffer(Relation reln, ForkNumber forkNum, BlockNumber blockNum)
* tag. In that case, the buffer is pinned and the usage count is bumped.
*/
bool
-ReadRecentBuffer(RelFileNode rnode, ForkNumber forkNum, BlockNumber blockNum,
+ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockNum,
Buffer recent_buffer)
{
BufferDesc *bufHdr;
@@ -632,7 +632,7 @@ ReadRecentBuffer(RelFileNode rnode, ForkNumber forkNum, BlockNumber blockNum,
ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
ReservePrivateRefCountEntry();
- INIT_BUFFERTAG(tag, rnode, forkNum, blockNum);
+ INIT_BUFFERTAG(tag, rlocator, forkNum, blockNum);
if (BufferIsLocal(recent_buffer))
{
@@ -786,13 +786,13 @@ ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum,
* BackendId).
*/
Buffer
-ReadBufferWithoutRelcache(RelFileNode rnode, ForkNumber forkNum,
+ReadBufferWithoutRelcache(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber blockNum, ReadBufferMode mode,
BufferAccessStrategy strategy, bool permanent)
{
bool hit;
- SMgrRelation smgr = smgropen(rnode, InvalidBackendId);
+ SMgrRelation smgr = smgropen(rlocator, InvalidBackendId);
return ReadBuffer_common(smgr, permanent ? RELPERSISTENCE_PERMANENT :
RELPERSISTENCE_UNLOGGED, forkNum, blockNum,
@@ -824,10 +824,10 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
isExtend = (blockNum == P_NEW);
TRACE_POSTGRESQL_BUFFER_READ_START(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode,
- smgr->smgr_rnode.backend,
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber,
+ smgr->smgr_rlocator.backend,
isExtend);
/* Substitute proper block number if caller asked for P_NEW */
@@ -839,7 +839,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
ereport(ERROR,
(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
errmsg("cannot extend relation %s beyond %u blocks",
- relpath(smgr->smgr_rnode, forkNum),
+ relpath(smgr->smgr_rlocator, forkNum),
P_NEW)));
}
@@ -886,10 +886,10 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
VacuumCostBalance += VacuumCostPageHit;
TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode,
- smgr->smgr_rnode.backend,
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber,
+ smgr->smgr_rlocator.backend,
isExtend,
found);
@@ -926,7 +926,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
if (!PageIsNew((Page) bufBlock))
ereport(ERROR,
(errmsg("unexpected data beyond EOF in block %u of relation %s",
- blockNum, relpath(smgr->smgr_rnode, forkNum)),
+ blockNum, relpath(smgr->smgr_rlocator, forkNum)),
errhint("This has been seen to occur with buggy kernels; consider updating your system.")));
/*
@@ -1028,7 +1028,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
(errcode(ERRCODE_DATA_CORRUPTED),
errmsg("invalid page in block %u of relation %s; zeroing out page",
blockNum,
- relpath(smgr->smgr_rnode, forkNum))));
+ relpath(smgr->smgr_rlocator, forkNum))));
MemSet((char *) bufBlock, 0, BLCKSZ);
}
else
@@ -1036,7 +1036,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
(errcode(ERRCODE_DATA_CORRUPTED),
errmsg("invalid page in block %u of relation %s",
blockNum,
- relpath(smgr->smgr_rnode, forkNum))));
+ relpath(smgr->smgr_rlocator, forkNum))));
}
}
}
@@ -1076,10 +1076,10 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
VacuumCostBalance += VacuumCostPageMiss;
TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode,
- smgr->smgr_rnode.backend,
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber,
+ smgr->smgr_rlocator.backend,
isExtend,
found);
@@ -1124,7 +1124,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
/* determine its hash code and partition lock ID */
newHash = BufTableHashCode(&newTag);
@@ -1255,9 +1255,9 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
/* OK, do the I/O */
TRACE_POSTGRESQL_BUFFER_WRITE_DIRTY_START(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode);
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber);
FlushBuffer(buf, NULL);
LWLockRelease(BufferDescriptorGetContentLock(buf));
@@ -1266,9 +1266,9 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
&buf->tag);
TRACE_POSTGRESQL_BUFFER_WRITE_DIRTY_DONE(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode);
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber);
}
else
{
@@ -1647,7 +1647,7 @@ ReleaseAndReadBuffer(Buffer buffer,
{
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
bufHdr->tag.forkNum == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
@@ -1658,7 +1658,7 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetBufferDescriptor(buffer - 1);
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
bufHdr->tag.forkNum == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
@@ -2000,8 +2000,8 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
- item->tsId = bufHdr->tag.rnode.spcNode;
- item->relNode = bufHdr->tag.rnode.relNode;
+ item->tsId = bufHdr->tag.rlocator.spcOid;
+ item->relNumber = bufHdr->tag.rlocator.relNumber;
item->forkNum = bufHdr->tag.forkNum;
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2708,7 +2708,7 @@ PrintBufferLeakWarning(Buffer buffer)
}
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rnode, backend, buf->tag.forkNum);
+ path = relpathbackend(buf->tag.rlocator, backend, buf->tag.forkNum);
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2769,11 +2769,11 @@ BufferGetBlockNumber(Buffer buffer)
/*
* BufferGetTag
- * Returns the relfilenode, fork number and block number associated with
+ * Returns the relfilelocator, fork number and block number associated with
* a buffer.
*/
void
-BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
+BufferGetTag(Buffer buffer, RelFileLocator *rlocator, ForkNumber *forknum,
BlockNumber *blknum)
{
BufferDesc *bufHdr;
@@ -2787,7 +2787,7 @@ BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
bufHdr = GetBufferDescriptor(buffer - 1);
/* pinned, so OK to read tag without spinlock */
- *rnode = bufHdr->tag.rnode;
+ *rlocator = bufHdr->tag.rlocator;
*forknum = bufHdr->tag.forkNum;
*blknum = bufHdr->tag.blockNum;
}
@@ -2838,13 +2838,13 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rnode, InvalidBackendId);
+ reln = smgropen(buf->tag.rlocator, InvalidBackendId);
TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
buf->tag.blockNum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode);
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber);
buf_state = LockBufHdr(buf);
@@ -2922,9 +2922,9 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(buf->tag.forkNum,
buf->tag.blockNum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode);
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber);
/* Pop the error context stack */
error_context_stack = errcallback.previous;
@@ -3026,7 +3026,7 @@ BufferGetLSNAtomic(Buffer buffer)
}
/* ---------------------------------------------------------------------
- * DropRelFileNodeBuffers
+ * DropRelFileLocatorBuffers
*
* This function removes from the buffer pool all the pages of the
* specified relation forks that have block numbers >= firstDelBlock.
@@ -3047,24 +3047,24 @@ BufferGetLSNAtomic(Buffer buffer)
* --------------------------------------------------------------------
*/
void
-DropRelFileNodeBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
+DropRelFileLocatorBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
int nforks, BlockNumber *firstDelBlock)
{
int i;
int j;
- RelFileNodeBackend rnode;
+ RelFileLocatorBackend rlocator;
BlockNumber nForkBlock[MAX_FORKNUM];
uint64 nBlocksToInvalidate = 0;
- rnode = smgr_reln->smgr_rnode;
+ rlocator = smgr_reln->smgr_rlocator;
/* If it's a local relation, it's localbuf.c's problem. */
- if (RelFileNodeBackendIsTemp(rnode))
+ if (RelFileLocatorBackendIsTemp(rlocator))
{
- if (rnode.backend == MyBackendId)
+ if (rlocator.backend == MyBackendId)
{
for (j = 0; j < nforks; j++)
- DropRelFileNodeLocalBuffers(rnode.node, forkNum[j],
+ DropRelFileLocatorLocalBuffers(rlocator.locator, forkNum[j],
firstDelBlock[j]);
}
return;
@@ -3115,7 +3115,7 @@ DropRelFileNodeBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
nBlocksToInvalidate < BUF_DROP_FULL_SCAN_THRESHOLD)
{
for (j = 0; j < nforks; j++)
- FindAndDropRelFileNodeBuffers(rnode.node, forkNum[j],
+ FindAndDropRelFileLocatorBuffers(rlocator.locator, forkNum[j],
nForkBlock[j], firstDelBlock[j]);
return;
}
@@ -3138,17 +3138,17 @@ DropRelFileNodeBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
* false positives are safe because we'll recheck after getting the
* buffer lock.
*
- * We could check forkNum and blockNum as well as the rnode, but the
+ * We could check forkNum and blockNum as well as the rlocator, but the
* incremental win from doing so seems small.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rnode.node))
+ if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator))
continue;
buf_state = LockBufHdr(bufHdr);
for (j = 0; j < nforks; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator) &&
bufHdr->tag.forkNum == forkNum[j] &&
bufHdr->tag.blockNum >= firstDelBlock[j])
{
@@ -3162,16 +3162,16 @@ DropRelFileNodeBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
}
/* ---------------------------------------------------------------------
- * DropRelFileNodesAllBuffers
+ * DropRelFileLocatorsAllBuffers
*
* This function removes from the buffer pool all the pages of all
* forks of the specified relations. It's equivalent to calling
- * DropRelFileNodeBuffers once per fork per relation with
+ * DropRelFileLocatorBuffers once per fork per relation with
* firstDelBlock = 0.
* --------------------------------------------------------------------
*/
void
-DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
+DropRelFileLocatorsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
{
int i;
int j;
@@ -3179,22 +3179,22 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
SMgrRelation *rels;
BlockNumber (*block)[MAX_FORKNUM + 1];
uint64 nBlocksToInvalidate = 0;
- RelFileNode *nodes;
+ RelFileLocator *locators;
bool cached = true;
bool use_bsearch;
- if (nnodes == 0)
+ if (nlocators == 0)
return;
- rels = palloc(sizeof(SMgrRelation) * nnodes); /* non-local relations */
+ rels = palloc(sizeof(SMgrRelation) * nlocators); /* non-local relations */
/* If it's a local relation, it's localbuf.c's problem. */
- for (i = 0; i < nnodes; i++)
+ for (i = 0; i < nlocators; i++)
{
- if (RelFileNodeBackendIsTemp(smgr_reln[i]->smgr_rnode))
+ if (RelFileLocatorBackendIsTemp(smgr_reln[i]->smgr_rlocator))
{
- if (smgr_reln[i]->smgr_rnode.backend == MyBackendId)
- DropRelFileNodeAllLocalBuffers(smgr_reln[i]->smgr_rnode.node);
+ if (smgr_reln[i]->smgr_rlocator.backend == MyBackendId)
+ DropRelFileLocatorAllLocalBuffers(smgr_reln[i]->smgr_rlocator.locator);
}
else
rels[n++] = smgr_reln[i];
@@ -3219,7 +3219,7 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
/*
* We can avoid scanning the entire buffer pool if we know the exact size
- * of each of the given relation forks. See DropRelFileNodeBuffers.
+ * of each of the given relation forks. See DropRelFileLocatorBuffers.
*/
for (i = 0; i < n && cached; i++)
{
@@ -3257,7 +3257,7 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
continue;
/* drop all the buffers for a particular relation fork */
- FindAndDropRelFileNodeBuffers(rels[i]->smgr_rnode.node,
+ FindAndDropRelFileLocatorBuffers(rels[i]->smgr_rlocator.locator,
j, block[i][j], 0);
}
}
@@ -3268,9 +3268,9 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
}
pfree(block);
- nodes = palloc(sizeof(RelFileNode) * n); /* non-local relations */
+ locators = palloc(sizeof(RelFileLocator) * n); /* non-local relations */
for (i = 0; i < n; i++)
- nodes[i] = rels[i]->smgr_rnode.node;
+ locators[i] = rels[i]->smgr_rlocator.locator;
/*
* For low number of relations to drop just use a simple walk through, to
@@ -3280,18 +3280,18 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
*/
use_bsearch = n > RELS_BSEARCH_THRESHOLD;
- /* sort the list of rnodes if necessary */
+ /* sort the list of rlocators if necessary */
if (use_bsearch)
- pg_qsort(nodes, n, sizeof(RelFileNode), rnode_comparator);
+ pg_qsort(locators, n, sizeof(RelFileLocator), rlocator_comparator);
for (i = 0; i < NBuffers; i++)
{
- RelFileNode *rnode = NULL;
+ RelFileLocator *rlocator = NULL;
BufferDesc *bufHdr = GetBufferDescriptor(i);
uint32 buf_state;
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
@@ -3301,37 +3301,37 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
for (j = 0; j < n; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, nodes[j]))
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, locators[j]))
{
- rnode = &nodes[j];
+ rlocator = &locators[j];
break;
}
}
}
else
{
- rnode = bsearch((const void *) &(bufHdr->tag.rnode),
- nodes, n, sizeof(RelFileNode),
- rnode_comparator);
+ rlocator = bsearch((const void *) &(bufHdr->tag.rlocator),
+ locators, n, sizeof(RelFileLocator),
+ rlocator_comparator);
}
- /* buffer doesn't belong to any of the given relfilenodes; skip it */
- if (rnode == NULL)
+ /* buffer doesn't belong to any of the given relfilelocators; skip it */
+ if (rlocator == NULL)
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, (*rnode)))
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, (*rlocator)))
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
}
- pfree(nodes);
+ pfree(locators);
pfree(rels);
}
/* ---------------------------------------------------------------------
- * FindAndDropRelFileNodeBuffers
+ * FindAndDropRelFileLocatorBuffers
*
* This function performs look up in BufMapping table and removes from the
* buffer pool all the pages of the specified relation fork that has block
@@ -3340,9 +3340,9 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
* --------------------------------------------------------------------
*/
static void
-FindAndDropRelFileNodeBuffers(RelFileNode rnode, ForkNumber forkNum,
- BlockNumber nForkBlock,
- BlockNumber firstDelBlock)
+FindAndDropRelFileLocatorBuffers(RelFileLocator rlocator, ForkNumber forkNum,
+ BlockNumber nForkBlock,
+ BlockNumber firstDelBlock)
{
BlockNumber curBlock;
@@ -3356,7 +3356,7 @@ FindAndDropRelFileNodeBuffers(RelFileNode rnode, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(bufTag, rnode, forkNum, curBlock);
+ INIT_BUFFERTAG(bufTag, rlocator, forkNum, curBlock);
/* determine its hash code and partition lock ID */
bufHash = BufTableHashCode(&bufTag);
@@ -3380,7 +3380,7 @@ FindAndDropRelFileNodeBuffers(RelFileNode rnode, ForkNumber forkNum,
*/
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -3397,7 +3397,7 @@ FindAndDropRelFileNodeBuffers(RelFileNode rnode, ForkNumber forkNum,
* bothering to write them out first. This is used when we destroy a
* database, to avoid trying to flush data to disk when the directory
* tree no longer exists. Implementation is pretty similar to
- * DropRelFileNodeBuffers() which is for destroying just one relation.
+ * DropRelFileLocatorBuffers() which is for destroying just one relation.
* --------------------------------------------------------------------
*/
void
@@ -3416,14 +3416,14 @@ DropDatabaseBuffers(Oid dbid)
uint32 buf_state;
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.rlocator.dbOid != dbid)
continue;
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid)
+ if (bufHdr->tag.rlocator.dbOid == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3453,7 +3453,7 @@ PrintBufferDescs(void)
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathbackend(buf->tag.rnode, InvalidBackendId, buf->tag.forkNum),
+ relpathbackend(buf->tag.rlocator, InvalidBackendId, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3478,7 +3478,7 @@ PrintPinnedBufs(void)
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathperm(buf->tag.rnode, buf->tag.forkNum),
+ relpathperm(buf->tag.rlocator, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3517,7 +3517,7 @@ FlushRelationBuffers(Relation rel)
uint32 buf_state;
bufHdr = GetLocalBufferDescriptor(i);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
((buf_state = pg_atomic_read_u32(&bufHdr->state)) &
(BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
@@ -3561,16 +3561,16 @@ FlushRelationBuffers(Relation rel)
bufHdr = GetBufferDescriptor(i);
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node))
+ if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator))
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3608,21 +3608,21 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
for (i = 0; i < nrels; i++)
{
- Assert(!RelFileNodeBackendIsTemp(smgrs[i]->smgr_rnode));
+ Assert(!RelFileLocatorBackendIsTemp(smgrs[i]->smgr_rlocator));
- srels[i].rnode = smgrs[i]->smgr_rnode.node;
+ srels[i].rlocator = smgrs[i]->smgr_rlocator.locator;
srels[i].srel = smgrs[i];
}
/*
* Save the bsearch overhead for low number of relations to sync. See
- * DropRelFileNodesAllBuffers for details.
+ * DropRelFileLocatorsAllBuffers for details.
*/
use_bsearch = nrels > RELS_BSEARCH_THRESHOLD;
/* sort the list of SMgrRelations if necessary */
if (use_bsearch)
- pg_qsort(srels, nrels, sizeof(SMgrSortArray), rnode_comparator);
+ pg_qsort(srels, nrels, sizeof(SMgrSortArray), rlocator_comparator);
/* Make sure we can handle the pin inside the loop */
ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
@@ -3634,7 +3634,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
uint32 buf_state;
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
@@ -3644,7 +3644,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
for (j = 0; j < nrels; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, srels[j].rnode))
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, srels[j].rlocator))
{
srelent = &srels[j];
break;
@@ -3653,19 +3653,19 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
}
else
{
- srelent = bsearch((const void *) &(bufHdr->tag.rnode),
+ srelent = bsearch((const void *) &(bufHdr->tag.rlocator),
srels, nrels, sizeof(SMgrSortArray),
- rnode_comparator);
+ rlocator_comparator);
}
- /* buffer doesn't belong to any of the given relfilenodes; skip it */
+ /* buffer doesn't belong to any of the given relfilelocators; skip it */
if (srelent == NULL)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, srelent->rnode) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, srelent->rlocator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3729,7 +3729,7 @@ RelationCopyStorageUsingBuffer(Relation src, Relation dst, ForkNumber forkNum,
CHECK_FOR_INTERRUPTS();
/* Read block from source relation. */
- srcBuf = ReadBufferWithoutRelcache(src->rd_node, forkNum, blkno,
+ srcBuf = ReadBufferWithoutRelcache(src->rd_locator, forkNum, blkno,
RBM_NORMAL, bstrategy_src,
permanent);
srcPage = BufferGetPage(srcBuf);
@@ -3740,7 +3740,7 @@ RelationCopyStorageUsingBuffer(Relation src, Relation dst, ForkNumber forkNum,
}
/* Use P_NEW to extend the destination relation. */
- dstBuf = ReadBufferWithoutRelcache(dst->rd_node, forkNum, P_NEW,
+ dstBuf = ReadBufferWithoutRelcache(dst->rd_locator, forkNum, P_NEW,
RBM_NORMAL, bstrategy_dst,
permanent);
LockBuffer(dstBuf, BUFFER_LOCK_EXCLUSIVE);
@@ -3775,8 +3775,8 @@ RelationCopyStorageUsingBuffer(Relation src, Relation dst, ForkNumber forkNum,
* --------------------------------------------------------------------
*/
void
-CreateAndCopyRelationData(RelFileNode src_rnode, RelFileNode dst_rnode,
- bool permanent)
+CreateAndCopyRelationData(RelFileLocator src_rlocator,
+ RelFileLocator dst_rlocator, bool permanent)
{
Relation src_rel;
Relation dst_rel;
@@ -3793,8 +3793,8 @@ CreateAndCopyRelationData(RelFileNode src_rnode, RelFileNode dst_rnode,
* used the smgr layer directly, we would have to worry about
* invalidations.
*/
- src_rel = CreateFakeRelcacheEntry(src_rnode);
- dst_rel = CreateFakeRelcacheEntry(dst_rnode);
+ src_rel = CreateFakeRelcacheEntry(src_rlocator);
+ dst_rel = CreateFakeRelcacheEntry(dst_rlocator);
/*
* Create and copy all forks of the relation. During create database we
@@ -3802,7 +3802,7 @@ CreateAndCopyRelationData(RelFileNode src_rnode, RelFileNode dst_rnode,
* directory. Therefore, each individual relation doesn't need to be
* registered for cleanup.
*/
- RelationCreateStorage(dst_rnode, relpersistence, false);
+ RelationCreateStorage(dst_rlocator, relpersistence, false);
/* copy main fork. */
RelationCopyStorageUsingBuffer(src_rel, dst_rel, MAIN_FORKNUM, permanent);
@@ -3820,7 +3820,7 @@ CreateAndCopyRelationData(RelFileNode src_rnode, RelFileNode dst_rnode,
* init fork of an unlogged relation.
*/
if (permanent || forkNum == INIT_FORKNUM)
- log_smgrcreate(&dst_rnode, forkNum);
+ log_smgrcreate(&dst_rlocator, forkNum);
/* Copy a fork's data, block by block. */
RelationCopyStorageUsingBuffer(src_rel, dst_rel, forkNum,
@@ -3864,16 +3864,16 @@ FlushDatabaseBuffers(Oid dbid)
bufHdr = GetBufferDescriptor(i);
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.rlocator.dbOid != dbid)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid &&
+ if (bufHdr->tag.rlocator.dbOid == dbid &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -4034,7 +4034,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
(pg_atomic_read_u32(&bufHdr->state) & BM_PERMANENT))
{
/*
- * If we must not write WAL, due to a relfilenode-specific
+ * If we must not write WAL, due to a relfilelocator-specific
* condition or being in recovery, don't dirty the page. We can
* set the hint, just not dirty the page as a result so the hint
* is lost when we evict the page or shutdown.
@@ -4042,7 +4042,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
* See src/backend/storage/page/README for longer discussion.
*/
if (RecoveryInProgress() ||
- RelFileNodeSkippingWAL(bufHdr->tag.rnode))
+ RelFileLocatorSkippingWAL(bufHdr->tag.rlocator))
return;
/*
@@ -4651,7 +4651,7 @@ AbortBufferIO(void)
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
- path = relpathperm(buf->tag.rnode, buf->tag.forkNum);
+ path = relpathperm(buf->tag.rlocator, buf->tag.forkNum);
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4675,7 +4675,7 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rnode, bufHdr->tag.forkNum);
+ char *path = relpathperm(bufHdr->tag.rlocator, bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4693,7 +4693,7 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
- char *path = relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ char *path = relpathbackend(bufHdr->tag.rlocator, MyBackendId,
bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
@@ -4703,27 +4703,27 @@ local_buffer_write_error_callback(void *arg)
}
/*
- * RelFileNode qsort/bsearch comparator; see RelFileNodeEquals.
+ * RelFileLocator qsort/bsearch comparator; see RelFileLocatorEquals.
*/
static int
-rnode_comparator(const void *p1, const void *p2)
+rlocator_comparator(const void *p1, const void *p2)
{
- RelFileNode n1 = *(const RelFileNode *) p1;
- RelFileNode n2 = *(const RelFileNode *) p2;
+ RelFileLocator n1 = *(const RelFileLocator *) p1;
+ RelFileLocator n2 = *(const RelFileLocator *) p2;
- if (n1.relNode < n2.relNode)
+ if (n1.relNumber < n2.relNumber)
return -1;
- else if (n1.relNode > n2.relNode)
+ else if (n1.relNumber > n2.relNumber)
return 1;
- if (n1.dbNode < n2.dbNode)
+ if (n1.dbOid < n2.dbOid)
return -1;
- else if (n1.dbNode > n2.dbNode)
+ else if (n1.dbOid > n2.dbOid)
return 1;
- if (n1.spcNode < n2.spcNode)
+ if (n1.spcOid < n2.spcOid)
return -1;
- else if (n1.spcNode > n2.spcNode)
+ else if (n1.spcOid > n2.spcOid)
return 1;
else
return 0;
@@ -4789,7 +4789,7 @@ buffertag_comparator(const BufferTag *ba, const BufferTag *bb)
{
int ret;
- ret = rnode_comparator(&ba->rnode, &bb->rnode);
+ ret = rlocator_comparator(&ba->rlocator, &bb->rlocator);
if (ret != 0)
return ret;
@@ -4822,9 +4822,9 @@ ckpt_buforder_comparator(const CkptSortItem *a, const CkptSortItem *b)
else if (a->tsId > b->tsId)
return 1;
/* compare relation */
- if (a->relNode < b->relNode)
+ if (a->relNumber < b->relNumber)
return -1;
- else if (a->relNode > b->relNode)
+ else if (a->relNumber > b->relNumber)
return 1;
/* compare fork */
else if (a->forkNum < b->forkNum)
@@ -4960,7 +4960,7 @@ IssuePendingWritebacks(WritebackContext *context)
next = &context->pending_writebacks[i + ahead + 1];
/* different file, stop */
- if (!RelFileNodeEquals(cur->tag.rnode, next->tag.rnode) ||
+ if (!RelFileLocatorEquals(cur->tag.rlocator, next->tag.rlocator) ||
cur->tag.forkNum != next->tag.forkNum)
break;
@@ -4979,7 +4979,7 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rnode, InvalidBackendId);
+ reln = smgropen(tag.rlocator, InvalidBackendId);
smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
}
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index e71f95a..3dc9cc7 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -68,7 +68,7 @@ PrefetchLocalBuffer(SMgrRelation smgr, ForkNumber forkNum,
BufferTag newTag; /* identity of requested block */
LocalBufferLookupEnt *hresult;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -117,7 +117,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
bool found;
uint32 buf_state;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -134,7 +134,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
Assert(BUFFERTAGS_EQUAL(bufHdr->tag, newTag));
#ifdef LBDEBUG
fprintf(stderr, "LB ALLOC (%u,%d,%d) %d\n",
- smgr->smgr_rnode.node.relNode, forkNum, blockNum, -b - 1);
+ smgr->smgr_rlocator.locator.relNumber, forkNum, blockNum, -b - 1);
#endif
buf_state = pg_atomic_read_u32(&bufHdr->state);
@@ -162,7 +162,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
#ifdef LBDEBUG
fprintf(stderr, "LB ALLOC (%u,%d,%d) %d\n",
- smgr->smgr_rnode.node.relNode, forkNum, blockNum,
+ smgr->smgr_rlocator.locator.relNumber, forkNum, blockNum,
-nextFreeLocalBuf - 1);
#endif
@@ -215,7 +215,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rnode, MyBackendId);
+ oreln = smgropen(bufHdr->tag.rlocator, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
@@ -312,7 +312,7 @@ MarkLocalBufferDirty(Buffer buffer)
}
/*
- * DropRelFileNodeLocalBuffers
+ * DropRelFileLocatorLocalBuffers
* This function removes from the buffer pool all the pages of the
* specified relation that have block numbers >= firstDelBlock.
* (In particular, with firstDelBlock = 0, all pages are removed.)
@@ -320,11 +320,11 @@ MarkLocalBufferDirty(Buffer buffer)
* out first. Therefore, this is NOT rollback-able, and so should be
* used only with extreme caution!
*
- * See DropRelFileNodeBuffers in bufmgr.c for more notes.
+ * See DropRelFileLocatorBuffers in bufmgr.c for more notes.
*/
void
-DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
- BlockNumber firstDelBlock)
+DropRelFileLocatorLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
+ BlockNumber firstDelBlock)
{
int i;
@@ -337,14 +337,14 @@ DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
+ RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ relpathbackend(bufHdr->tag.rlocator, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
@@ -363,14 +363,14 @@ DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
}
/*
- * DropRelFileNodeAllLocalBuffers
+ * DropRelFileLocatorAllLocalBuffers
* This function removes from the buffer pool all pages of all forks
* of the specified relation.
*
- * See DropRelFileNodesAllBuffers in bufmgr.c for more notes.
+ * See DropRelFileLocatorsAllBuffers in bufmgr.c for more notes.
*/
void
-DropRelFileNodeAllLocalBuffers(RelFileNode rnode)
+DropRelFileLocatorAllLocalBuffers(RelFileLocator rlocator)
{
int i;
@@ -383,12 +383,12 @@ DropRelFileNodeAllLocalBuffers(RelFileNode rnode)
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode))
+ RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator))
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ relpathbackend(bufHdr->tag.rlocator, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
@@ -589,7 +589,7 @@ AtProcExit_LocalBuffers(void)
{
/*
* We shouldn't be holding any remaining pins; if we are, and assertions
- * aren't enabled, we'll fail later in DropRelFileNodeBuffers while trying
- * to drop the temp rels.
+ * aren't enabled, we'll fail later in DropRelFileLocatorBuffers while
+ * trying to drop the temp rels.
*/
CheckForLocalBufferLeaks();
diff --git a/src/backend/storage/freespace/freespace.c b/src/backend/storage/freespace/freespace.c
index d41ae37..005def5 100644
--- a/src/backend/storage/freespace/freespace.c
+++ b/src/backend/storage/freespace/freespace.c
@@ -196,7 +196,7 @@ RecordPageWithFreeSpace(Relation rel, BlockNumber heapBlk, Size spaceAvail)
* WAL replay
*/
void
-XLogRecordPageWithFreeSpace(RelFileNode rnode, BlockNumber heapBlk,
+XLogRecordPageWithFreeSpace(RelFileLocator rlocator, BlockNumber heapBlk,
Size spaceAvail)
{
int new_cat = fsm_space_avail_to_cat(spaceAvail);
@@ -211,8 +211,8 @@ XLogRecordPageWithFreeSpace(RelFileNode rnode, BlockNumber heapBlk,
blkno = fsm_logical_to_physical(addr);
/* If the page doesn't exist already, extend */
- buf = XLogReadBufferExtended(rnode, FSM_FORKNUM, blkno, RBM_ZERO_ON_ERROR,
- InvalidBuffer);
+ buf = XLogReadBufferExtended(rlocator, FSM_FORKNUM, blkno,
+ RBM_ZERO_ON_ERROR, InvalidBuffer);
LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
page = BufferGetPage(buf);
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index d165b35..af4dab7 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -268,13 +268,13 @@ restart:
*
* Fix the corruption and restart.
*/
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blknum;
- BufferGetTag(buf, &rnode, &forknum, &blknum);
+ BufferGetTag(buf, &rlocator, &forknum, &blknum);
elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
- blknum, rnode.spcNode, rnode.dbNode, rnode.relNode);
+ blknum, rlocator.spcOid, rlocator.dbOid, rlocator.relNumber);
/* make sure we hold an exclusive lock */
if (!exclusive_lock_held)
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 671b00a..9dab931 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -442,7 +442,7 @@ ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
}
void
-ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode node)
+ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileLocator locator)
{
VirtualTransactionId *backends;
@@ -461,7 +461,7 @@ ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode
return;
backends = GetConflictingVirtualXIDs(latestRemovedXid,
- node.dbNode);
+ locator.dbOid);
ResolveRecoveryConflictWithVirtualXIDs(backends,
PROCSIG_RECOVERY_CONFLICT_SNAPSHOT,
@@ -475,7 +475,7 @@ ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode
*/
void
ResolveRecoveryConflictWithSnapshotFullXid(FullTransactionId latestRemovedFullXid,
- RelFileNode node)
+ RelFileLocator locator)
{
/*
* ResolveRecoveryConflictWithSnapshot operates on 32-bit TransactionIds,
@@ -493,7 +493,7 @@ ResolveRecoveryConflictWithSnapshotFullXid(FullTransactionId latestRemovedFullXi
TransactionId latestRemovedXid;
latestRemovedXid = XidFromFullTransactionId(latestRemovedFullXid);
- ResolveRecoveryConflictWithSnapshot(latestRemovedXid, node);
+ ResolveRecoveryConflictWithSnapshot(latestRemovedXid, locator);
}
}
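(ResolveRecoveryConflictWithSnapshotFullXid above just narrows a FullTransactionId to the 32-bit xid that the conflict machinery operates on. A minimal stand-in for XidFromFullTransactionId, with the types simplified — the real definitions live in access/transam.h:)

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t SketchFullTransactionId;	/* epoch in high 32 bits, xid low */
typedef uint32_t SketchTransactionId;

/* drop the epoch, keeping only the 32-bit xid */
static SketchTransactionId
sketch_xid_from_full(SketchFullTransactionId fxid)
{
    return (SketchTransactionId) (fxid & 0xFFFFFFFF);
}
```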
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 25e7e4e..5136da6 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -1997,7 +1997,7 @@ PageIsPredicateLocked(Relation relation, BlockNumber blkno)
PREDICATELOCKTARGET *target;
SET_PREDICATELOCKTARGETTAG_PAGE(targettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
blkno);
@@ -2576,7 +2576,7 @@ PredicateLockRelation(Relation relation, Snapshot snapshot)
return;
SET_PREDICATELOCKTARGETTAG_RELATION(tag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id);
PredicateLockAcquire(&tag);
}
@@ -2599,7 +2599,7 @@ PredicateLockPage(Relation relation, BlockNumber blkno, Snapshot snapshot)
return;
SET_PREDICATELOCKTARGETTAG_PAGE(tag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
blkno);
PredicateLockAcquire(&tag);
@@ -2638,13 +2638,13 @@ PredicateLockTID(Relation relation, ItemPointer tid, Snapshot snapshot,
* level lock.
*/
SET_PREDICATELOCKTARGETTAG_RELATION(tag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id);
if (PredicateLockExists(&tag))
return;
SET_PREDICATELOCKTARGETTAG_TUPLE(tag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
ItemPointerGetBlockNumber(tid),
ItemPointerGetOffsetNumber(tid));
@@ -2974,7 +2974,7 @@ DropAllPredicateLocksFromTable(Relation relation, bool transfer)
if (!PredicateLockingNeededForRelation(relation))
return;
- dbId = relation->rd_node.dbNode;
+ dbId = relation->rd_locator.dbOid;
relId = relation->rd_id;
if (relation->rd_index == NULL)
{
@@ -3194,11 +3194,11 @@ PredicateLockPageSplit(Relation relation, BlockNumber oldblkno,
Assert(BlockNumberIsValid(newblkno));
SET_PREDICATELOCKTARGETTAG_PAGE(oldtargettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
oldblkno);
SET_PREDICATELOCKTARGETTAG_PAGE(newtargettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
newblkno);
@@ -4478,7 +4478,7 @@ CheckForSerializableConflictIn(Relation relation, ItemPointer tid, BlockNumber b
if (tid != NULL)
{
SET_PREDICATELOCKTARGETTAG_TUPLE(targettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
ItemPointerGetBlockNumber(tid),
ItemPointerGetOffsetNumber(tid));
@@ -4488,14 +4488,14 @@ CheckForSerializableConflictIn(Relation relation, ItemPointer tid, BlockNumber b
if (blkno != InvalidBlockNumber)
{
SET_PREDICATELOCKTARGETTAG_PAGE(targettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
blkno);
CheckTargetForConflictsIn(&targettag);
}
SET_PREDICATELOCKTARGETTAG_RELATION(targettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id);
CheckTargetForConflictsIn(&targettag);
}
@@ -4556,7 +4556,7 @@ CheckTableForSerializableConflictIn(Relation relation)
Assert(relation->rd_index == NULL); /* not an index relation */
- dbId = relation->rd_node.dbNode;
+ dbId = relation->rd_locator.dbOid;
heapId = relation->rd_id;
LWLockAcquire(SerializablePredicateListLock, LW_EXCLUSIVE);
diff --git a/src/backend/storage/smgr/README b/src/backend/storage/smgr/README
index e1cfc6c..cf3aa56 100644
--- a/src/backend/storage/smgr/README
+++ b/src/backend/storage/smgr/README
@@ -46,7 +46,7 @@ physical relation in system catalogs.
It is assumed that the main fork, fork number 0 or MAIN_FORKNUM, always
exists. Fork numbers are assigned in src/include/common/relpath.h.
Functions in smgr.c and md.c take an extra fork number argument, in addition
-to relfilenode and block number, to identify which relation fork you want to
+to relfilelocator and block number, to identify which relation fork you want to
access. Since most code wants to access the main fork, a shortcut version of
ReadBuffer that accesses MAIN_FORKNUM is provided in the buffer manager for
convenience.
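(For anyone skimming the rename: the hunks consistently map spcNode -> spcOid, dbNode -> dbOid, and relNode -> relNumber. A toy version of the new identifier and an equality helper, sketched here with simplified typedefs — not the actual definitions from storage/relfilelocator.h:)

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int Oid;			/* simplified stand-ins */
typedef unsigned int RelFileNumber;

/* mirrors the patch's renamed fields */
typedef struct SketchRelFileLocator
{
    Oid			spcOid;			/* tablespace */
    Oid			dbOid;			/* database */
    RelFileNumber relNumber;	/* relation storage file number */
} SketchRelFileLocator;

static SketchRelFileLocator
sketch_make_locator(Oid spc, Oid db, RelFileNumber rel)
{
    SketchRelFileLocator locator = {spc, db, rel};

    return locator;
}

/* in the spirit of RelFileLocatorEquals: all three fields must match */
static bool
sketch_locator_equals(SketchRelFileLocator a, SketchRelFileLocator b)
{
    return a.spcOid == b.spcOid &&
        a.dbOid == b.dbOid &&
        a.relNumber == b.relNumber;
}
```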
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 43edaf5..3998296 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -35,7 +35,7 @@
#include "storage/bufmgr.h"
#include "storage/fd.h"
#include "storage/md.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "storage/sync.h"
#include "utils/hsearch.h"
@@ -89,11 +89,11 @@ static MemoryContext MdCxt; /* context for all MdfdVec objects */
/* Populate a file tag describing an md.c segment file. */
-#define INIT_MD_FILETAG(a,xx_rnode,xx_forknum,xx_segno) \
+#define INIT_MD_FILETAG(a,xx_rlocator,xx_forknum,xx_segno) \
( \
memset(&(a), 0, sizeof(FileTag)), \
(a).handler = SYNC_HANDLER_MD, \
- (a).rnode = (xx_rnode), \
+ (a).rlocator = (xx_rlocator), \
(a).forknum = (xx_forknum), \
(a).segno = (xx_segno) \
)
@@ -121,14 +121,14 @@ static MemoryContext MdCxt; /* context for all MdfdVec objects */
/* local routines */
-static void mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum,
+static void mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forkNum,
bool isRedo);
static MdfdVec *mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior);
static void register_dirty_segment(SMgrRelation reln, ForkNumber forknum,
MdfdVec *seg);
-static void register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
+static void register_unlink_segment(RelFileLocatorBackend rlocator, ForkNumber forknum,
BlockNumber segno);
-static void register_forget_request(RelFileNodeBackend rnode, ForkNumber forknum,
+static void register_forget_request(RelFileLocatorBackend rlocator, ForkNumber forknum,
BlockNumber segno);
static void _fdvec_resize(SMgrRelation reln,
ForkNumber forknum,
@@ -199,11 +199,11 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* should be here and not in commands/tablespace.c? But that would imply
* importing a lot of stuff that smgr.c oughtn't know, either.
*/
- TablespaceCreateDbspace(reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
+ TablespaceCreateDbspace(reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
isRedo);
- path = relpath(reln->smgr_rnode, forkNum);
+ path = relpath(reln->smgr_rlocator, forkNum);
fd = PathNameOpenFile(path, O_RDWR | O_CREAT | O_EXCL | PG_BINARY);
@@ -234,7 +234,7 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
/*
* mdunlink() -- Unlink a relation.
*
- * Note that we're passed a RelFileNodeBackend --- by the time this is called,
+ * Note that we're passed a RelFileLocatorBackend --- by the time this is called,
* there won't be an SMgrRelation hashtable entry anymore.
*
* forkNum can be a fork number to delete a specific fork, or InvalidForkNumber
@@ -243,10 +243,10 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* For regular relations, we don't unlink the first segment file of the rel,
* but just truncate it to zero length, and record a request to unlink it after
* the next checkpoint. Additional segments can be unlinked immediately,
- * however. Leaving the empty file in place prevents that relfilenode
- * number from being reused. The scenario this protects us from is:
+ * however. Leaving the empty file in place prevents that relfilenumber
+ * from being reused. The scenario this protects us from is:
* 1. We delete a relation (and commit, and actually remove its file).
- * 2. We create a new relation, which by chance gets the same relfilenode as
+ * 2. We create a new relation, which by chance gets the same relfilenumber as
* the just-deleted one (OIDs must've wrapped around for that to happen).
* 3. We crash before another checkpoint occurs.
* During replay, we would delete the file and then recreate it, which is fine
@@ -254,18 +254,18 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* But if we didn't WAL-log insertions, but instead relied on fsyncing the
* file after populating it (as we do at wal_level=minimal), the contents of
* the file would be lost forever. By leaving the empty file until after the
- * next checkpoint, we prevent reassignment of the relfilenode number until
- * it's safe, because relfilenode assignment skips over any existing file.
+ * next checkpoint, we prevent reassignment of the relfilenumber until it's
+ * safe, because relfilenumber assignment skips over any existing file.
*
* We do not need to go through this dance for temp relations, though, because
* we never make WAL entries for temp rels, and so a temp rel poses no threat
- * to the health of a regular rel that has taken over its relfilenode number.
+ * to the health of a regular rel that has taken over its relfilenumber.
* The fact that temp rels and regular rels have different file naming
* patterns provides additional safety.
*
* All the above applies only to the relation's main fork; other forks can
* just be removed immediately, since they are not needed to prevent the
- * relfilenode number from being recycled. Also, we do not carefully
+ * relfilenumber from being recycled. Also, we do not carefully
* track whether other forks have been created or not, but just attempt to
* unlink them unconditionally; so we should never complain about ENOENT.
*
@@ -278,16 +278,16 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* we are usually not in a transaction anymore when this is called.
*/
void
-mdunlink(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
+mdunlink(RelFileLocatorBackend rlocator, ForkNumber forkNum, bool isRedo)
{
/* Now do the per-fork work */
if (forkNum == InvalidForkNumber)
{
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- mdunlinkfork(rnode, forkNum, isRedo);
+ mdunlinkfork(rlocator, forkNum, isRedo);
}
else
- mdunlinkfork(rnode, forkNum, isRedo);
+ mdunlinkfork(rlocator, forkNum, isRedo);
}
/*
@@ -315,25 +315,25 @@ do_truncate(const char *path)
}
static void
-mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
+mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forkNum, bool isRedo)
{
char *path;
int ret;
- path = relpath(rnode, forkNum);
+ path = relpath(rlocator, forkNum);
/*
* Delete or truncate the first segment.
*/
- if (isRedo || forkNum != MAIN_FORKNUM || RelFileNodeBackendIsTemp(rnode))
+ if (isRedo || forkNum != MAIN_FORKNUM || RelFileLocatorBackendIsTemp(rlocator))
{
- if (!RelFileNodeBackendIsTemp(rnode))
+ if (!RelFileLocatorBackendIsTemp(rlocator))
{
/* Prevent other backends' fds from holding on to the disk space */
ret = do_truncate(path);
/* Forget any pending sync requests for the first segment */
- register_forget_request(rnode, forkNum, 0 /* first seg */ );
+ register_forget_request(rlocator, forkNum, 0 /* first seg */ );
}
else
ret = 0;
@@ -354,7 +354,7 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
ret = do_truncate(path);
/* Register request to unlink first segment later */
- register_unlink_segment(rnode, forkNum, 0 /* first seg */ );
+ register_unlink_segment(rlocator, forkNum, 0 /* first seg */ );
}
/*
@@ -373,7 +373,7 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
{
sprintf(segpath, "%s.%u", path, segno);
- if (!RelFileNodeBackendIsTemp(rnode))
+ if (!RelFileLocatorBackendIsTemp(rlocator))
{
/*
* Prevent other backends' fds from holding on to the disk
@@ -386,7 +386,7 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
* Forget any pending sync requests for this segment before we
* try to unlink.
*/
- register_forget_request(rnode, forkNum, segno);
+ register_forget_request(rlocator, forkNum, segno);
}
if (unlink(segpath) < 0)
@@ -437,7 +437,7 @@ mdextend(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
ereport(ERROR,
(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
errmsg("cannot extend file \"%s\" beyond %u blocks",
- relpath(reln->smgr_rnode, forknum),
+ relpath(reln->smgr_rlocator, forknum),
InvalidBlockNumber)));
v = _mdfd_getseg(reln, forknum, blocknum, skipFsync, EXTENSION_CREATE);
@@ -490,7 +490,7 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
if (reln->md_num_open_segs[forknum] > 0)
return &reln->md_seg_fds[forknum][0];
- path = relpath(reln->smgr_rnode, forknum);
+ path = relpath(reln->smgr_rlocator, forknum);
fd = PathNameOpenFile(path, O_RDWR | PG_BINARY);
@@ -645,10 +645,10 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
MdfdVec *v;
TRACE_POSTGRESQL_SMGR_MD_READ_START(forknum, blocknum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
- reln->smgr_rnode.backend);
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
+ reln->smgr_rlocator.backend);
v = _mdfd_getseg(reln, forknum, blocknum, false,
EXTENSION_FAIL | EXTENSION_CREATE_RECOVERY);
@@ -660,10 +660,10 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
nbytes = FileRead(v->mdfd_vfd, buffer, BLCKSZ, seekpos, WAIT_EVENT_DATA_FILE_READ);
TRACE_POSTGRESQL_SMGR_MD_READ_DONE(forknum, blocknum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
- reln->smgr_rnode.backend,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
+ reln->smgr_rlocator.backend,
nbytes,
BLCKSZ);
@@ -715,10 +715,10 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
#endif
TRACE_POSTGRESQL_SMGR_MD_WRITE_START(forknum, blocknum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
- reln->smgr_rnode.backend);
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
+ reln->smgr_rlocator.backend);
v = _mdfd_getseg(reln, forknum, blocknum, skipFsync,
EXTENSION_FAIL | EXTENSION_CREATE_RECOVERY);
@@ -730,10 +730,10 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
nbytes = FileWrite(v->mdfd_vfd, buffer, BLCKSZ, seekpos, WAIT_EVENT_DATA_FILE_WRITE);
TRACE_POSTGRESQL_SMGR_MD_WRITE_DONE(forknum, blocknum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
- reln->smgr_rnode.backend,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
+ reln->smgr_rlocator.backend,
nbytes,
BLCKSZ);
@@ -842,7 +842,7 @@ mdtruncate(SMgrRelation reln, ForkNumber forknum, BlockNumber nblocks)
return;
ereport(ERROR,
(errmsg("could not truncate file \"%s\" to %u blocks: it's only %u blocks now",
- relpath(reln->smgr_rnode, forknum),
+ relpath(reln->smgr_rlocator, forknum),
nblocks, curnblk)));
}
if (nblocks == curnblk)
@@ -983,7 +983,7 @@ register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
{
FileTag tag;
- INIT_MD_FILETAG(tag, reln->smgr_rnode.node, forknum, seg->mdfd_segno);
+ INIT_MD_FILETAG(tag, reln->smgr_rlocator.locator, forknum, seg->mdfd_segno);
/* Temp relations should never be fsync'd */
Assert(!SmgrIsTemp(reln));
@@ -1005,15 +1005,15 @@ register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
* register_unlink_segment() -- Schedule a file to be deleted after next checkpoint
*/
static void
-register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
+register_unlink_segment(RelFileLocatorBackend rlocator, ForkNumber forknum,
BlockNumber segno)
{
FileTag tag;
- INIT_MD_FILETAG(tag, rnode.node, forknum, segno);
+ INIT_MD_FILETAG(tag, rlocator.locator, forknum, segno);
/* Should never be used with temp relations */
- Assert(!RelFileNodeBackendIsTemp(rnode));
+ Assert(!RelFileLocatorBackendIsTemp(rlocator));
RegisterSyncRequest(&tag, SYNC_UNLINK_REQUEST, true /* retryOnError */ );
}
@@ -1022,12 +1022,12 @@ register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
* register_forget_request() -- forget any fsyncs for a relation fork's segment
*/
static void
-register_forget_request(RelFileNodeBackend rnode, ForkNumber forknum,
+register_forget_request(RelFileLocatorBackend rlocator, ForkNumber forknum,
BlockNumber segno)
{
FileTag tag;
- INIT_MD_FILETAG(tag, rnode.node, forknum, segno);
+ INIT_MD_FILETAG(tag, rlocator.locator, forknum, segno);
RegisterSyncRequest(&tag, SYNC_FORGET_REQUEST, true /* retryOnError */ );
}
@@ -1039,13 +1039,13 @@ void
ForgetDatabaseSyncRequests(Oid dbid)
{
FileTag tag;
- RelFileNode rnode;
+ RelFileLocator rlocator;
- rnode.dbNode = dbid;
- rnode.spcNode = 0;
- rnode.relNode = 0;
+ rlocator.dbOid = dbid;
+ rlocator.spcOid = 0;
+ rlocator.relNumber = 0;
- INIT_MD_FILETAG(tag, rnode, InvalidForkNumber, InvalidBlockNumber);
+ INIT_MD_FILETAG(tag, rlocator, InvalidForkNumber, InvalidBlockNumber);
RegisterSyncRequest(&tag, SYNC_FILTER_REQUEST, true /* retryOnError */ );
}
@@ -1054,7 +1054,7 @@ ForgetDatabaseSyncRequests(Oid dbid)
* DropRelationFiles -- drop files of all given relations
*/
void
-DropRelationFiles(RelFileNode *delrels, int ndelrels, bool isRedo)
+DropRelationFiles(RelFileLocator *delrels, int ndelrels, bool isRedo)
{
SMgrRelation *srels;
int i;
@@ -1129,7 +1129,7 @@ _mdfd_segpath(SMgrRelation reln, ForkNumber forknum, BlockNumber segno)
char *path,
*fullpath;
- path = relpath(reln->smgr_rnode, forknum);
+ path = relpath(reln->smgr_rlocator, forknum);
if (segno > 0)
{
@@ -1345,7 +1345,7 @@ _mdnblocks(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
int
mdsyncfiletag(const FileTag *ftag, char *path)
{
- SMgrRelation reln = smgropen(ftag->rnode, InvalidBackendId);
+ SMgrRelation reln = smgropen(ftag->rlocator, InvalidBackendId);
File file;
bool need_to_close;
int result,
@@ -1395,7 +1395,7 @@ mdunlinkfiletag(const FileTag *ftag, char *path)
char *p;
/* Compute the path. */
- p = relpathperm(ftag->rnode, MAIN_FORKNUM);
+ p = relpathperm(ftag->rlocator, MAIN_FORKNUM);
strlcpy(path, p, MAXPGPATH);
pfree(p);
@@ -1417,5 +1417,5 @@ mdfiletagmatches(const FileTag *ftag, const FileTag *candidate)
* We'll return true for all candidates that have the same database OID as
* the ftag from the SYNC_FILTER_REQUEST request, so they're forgotten.
*/
- return ftag->rnode.dbNode == candidate->rnode.dbNode;
+ return ftag->rlocator.dbOid == candidate->rlocator.dbOid;
}
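(The mdunlink() comment above is the heart of the tombstone mechanism this thread is about: the truncated-but-not-unlinked first segment keeps its relfilenumber out of circulation until the post-checkpoint unlink, because relfilenumber assignment skips any number whose file still exists. A toy simulation of that skipping — the function and array names are invented for illustration:)

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* two tombstone files left behind by dropped relations */
static const unsigned sketch_tombstones[] = {16384, 16385};
static const size_t sketch_ntombstones = 2;

/* stand-in for probing the filesystem for an existing (possibly empty) file */
static bool
sketch_file_exists(unsigned candidate)
{
    for (size_t i = 0; i < sketch_ntombstones; i++)
        if (sketch_tombstones[i] == candidate)
            return true;
    return false;
}

/* toy GetNewRelFileNode(): advance past every number whose file is on disk */
static unsigned
sketch_get_new_relfilenumber(unsigned counter)
{
    while (sketch_file_exists(counter))
        counter++;
    return counter;
}
```

Without the tombstones, 16384 could be handed out again before the next checkpoint, setting up the crash-recovery corruption scenario the comment describes.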
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index a477f70..b21d8c3 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -46,7 +46,7 @@ typedef struct f_smgr
void (*smgr_create) (SMgrRelation reln, ForkNumber forknum,
bool isRedo);
bool (*smgr_exists) (SMgrRelation reln, ForkNumber forknum);
- void (*smgr_unlink) (RelFileNodeBackend rnode, ForkNumber forknum,
+ void (*smgr_unlink) (RelFileLocatorBackend rlocator, ForkNumber forknum,
bool isRedo);
void (*smgr_extend) (SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum, char *buffer, bool skipFsync);
@@ -143,9 +143,9 @@ smgrshutdown(int code, Datum arg)
* This does not attempt to actually open the underlying file.
*/
SMgrRelation
-smgropen(RelFileNode rnode, BackendId backend)
+smgropen(RelFileLocator rlocator, BackendId backend)
{
- RelFileNodeBackend brnode;
+ RelFileLocatorBackend brlocator;
SMgrRelation reln;
bool found;
@@ -154,7 +154,7 @@ smgropen(RelFileNode rnode, BackendId backend)
/* First time through: initialize the hash table */
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileNodeBackend);
+ ctl.keysize = sizeof(RelFileLocatorBackend);
ctl.entrysize = sizeof(SMgrRelationData);
SMgrRelationHash = hash_create("smgr relation table", 400,
&ctl, HASH_ELEM | HASH_BLOBS);
@@ -162,10 +162,10 @@ smgropen(RelFileNode rnode, BackendId backend)
}
/* Look up or create an entry */
- brnode.node = rnode;
- brnode.backend = backend;
+ brlocator.locator = rlocator;
+ brlocator.backend = backend;
reln = (SMgrRelation) hash_search(SMgrRelationHash,
- (void *) &brnode,
+ (void *) &brlocator,
HASH_ENTER, &found);
/* Initialize it if not present before */
@@ -267,7 +267,7 @@ smgrclose(SMgrRelation reln)
dlist_delete(&reln->node);
if (hash_search(SMgrRelationHash,
- (void *) &(reln->smgr_rnode),
+ (void *) &(reln->smgr_rlocator),
HASH_REMOVE, NULL) == NULL)
elog(ERROR, "SMgrRelation hashtable corrupted");
@@ -335,15 +335,15 @@ smgrcloseall(void)
}
/*
- * smgrclosenode() -- Close SMgrRelation object for given RelFileNode,
+ * smgrcloserellocator() -- Close SMgrRelation object for given RelFileLocator,
* if one exists.
*
- * This has the same effects as smgrclose(smgropen(rnode)), but it avoids
+ * This has the same effects as smgrclose(smgropen(rlocator)), but it avoids
* uselessly creating a hashtable entry only to drop it again when no
* such entry exists already.
*/
void
-smgrclosenode(RelFileNodeBackend rnode)
+smgrcloserellocator(RelFileLocatorBackend rlocator)
{
SMgrRelation reln;
@@ -352,7 +352,7 @@ smgrclosenode(RelFileNodeBackend rnode)
return;
reln = (SMgrRelation) hash_search(SMgrRelationHash,
- (void *) &rnode,
+ (void *) &rlocator,
HASH_FIND, NULL);
if (reln != NULL)
smgrclose(reln);
@@ -420,7 +420,7 @@ void
smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo)
{
int i = 0;
- RelFileNodeBackend *rnodes;
+ RelFileLocatorBackend *rlocators;
ForkNumber forknum;
if (nrels == 0)
@@ -430,19 +430,19 @@ smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo)
* Get rid of any remaining buffers for the relations. bufmgr will just
* drop them without bothering to write the contents.
*/
- DropRelFileNodesAllBuffers(rels, nrels);
+ DropRelFileLocatorsAllBuffers(rels, nrels);
/*
* create an array which contains all relations to be dropped, and close
* each relation's forks at the smgr level while at it
*/
- rnodes = palloc(sizeof(RelFileNodeBackend) * nrels);
+ rlocators = palloc(sizeof(RelFileLocatorBackend) * nrels);
for (i = 0; i < nrels; i++)
{
- RelFileNodeBackend rnode = rels[i]->smgr_rnode;
+ RelFileLocatorBackend rlocator = rels[i]->smgr_rlocator;
int which = rels[i]->smgr_which;
- rnodes[i] = rnode;
+ rlocators[i] = rlocator;
/* Close the forks at smgr level */
for (forknum = 0; forknum <= MAX_FORKNUM; forknum++)
@@ -458,7 +458,7 @@ smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo)
* closed our own smgr rel.
*/
for (i = 0; i < nrels; i++)
- CacheInvalidateSmgr(rnodes[i]);
+ CacheInvalidateSmgr(rlocators[i]);
/*
* Delete the physical file(s).
@@ -473,10 +473,10 @@ smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo)
int which = rels[i]->smgr_which;
for (forknum = 0; forknum <= MAX_FORKNUM; forknum++)
- smgrsw[which].smgr_unlink(rnodes[i], forknum, isRedo);
+ smgrsw[which].smgr_unlink(rlocators[i], forknum, isRedo);
}
- pfree(rnodes);
+ pfree(rlocators);
}
@@ -631,7 +631,7 @@ smgrtruncate(SMgrRelation reln, ForkNumber *forknum, int nforks, BlockNumber *nb
* Get rid of any buffers for the about-to-be-deleted blocks. bufmgr will
* just drop them without bothering to write the contents.
*/
- DropRelFileNodeBuffers(reln, forknum, nforks, nblocks);
+ DropRelFileLocatorBuffers(reln, forknum, nforks, nblocks);
/*
* Send a shared-inval message to force other backends to close any smgr
@@ -643,7 +643,7 @@ smgrtruncate(SMgrRelation reln, ForkNumber *forknum, int nforks, BlockNumber *nb
* is a performance-critical path.) As in the unlink code, we want to be
* sure the message is sent before we start changing things on-disk.
*/
- CacheInvalidateSmgr(reln->smgr_rnode);
+ CacheInvalidateSmgr(reln->smgr_rlocator);
/* Do the truncation */
for (i = 0; i < nforks; i++)
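(One subtlety worth noting in the smgropen() hunk above: the hash table is created with HASH_BLOBS, so the RelFileLocatorBackend key is hashed and compared as raw bytes. That is why the key struct is filled field by field into a clean local before the lookup. A sketch of the idea, with zero-initialization making the byte-wise comparison safe even in the presence of padding — the types here are simplified stand-ins:)

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

typedef unsigned int Oid;
typedef unsigned int RelFileNumber;
typedef int BackendId;

typedef struct
{
    Oid			spcOid;
    Oid			dbOid;
    RelFileNumber relNumber;
} SketchRelFileLocator;

typedef struct
{
    SketchRelFileLocator locator;
    BackendId	backend;
} SketchRelFileLocatorBackend;

static SketchRelFileLocatorBackend
sketch_make_key(Oid spc, Oid db, RelFileNumber rel, BackendId backend)
{
    SketchRelFileLocatorBackend key;

    memset(&key, 0, sizeof(key));	/* clear padding before byte-wise hashing */
    key.locator.spcOid = spc;
    key.locator.dbOid = db;
    key.locator.relNumber = rel;
    key.backend = backend;
    return key;
}

/* HASH_BLOBS-style key comparison: raw memcmp over the whole struct */
static bool
sketch_demo_match(BackendId b1, BackendId b2)
{
    SketchRelFileLocatorBackend k1 = sketch_make_key(1663, 5, 16384, b1);
    SketchRelFileLocatorBackend k2 = sketch_make_key(1663, 5, 16384, b2);

    return memcmp(&k1, &k2, sizeof(k1)) == 0;
}
```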
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index b4a2c8d..36ec845 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -27,7 +27,7 @@
#include "utils/builtins.h"
#include "utils/numeric.h"
#include "utils/rel.h"
-#include "utils/relfilenodemap.h"
+#include "utils/relfilenumbermap.h"
#include "utils/relmapper.h"
#include "utils/syscache.h"
@@ -292,7 +292,7 @@ pg_tablespace_size_name(PG_FUNCTION_ARGS)
* is no check here or at the call sites for that.
*/
static int64
-calculate_relation_size(RelFileNode *rfn, BackendId backend, ForkNumber forknum)
+calculate_relation_size(RelFileLocator *rfn, BackendId backend, ForkNumber forknum)
{
int64 totalsize = 0;
char *relationpath;
@@ -349,7 +349,7 @@ pg_relation_size(PG_FUNCTION_ARGS)
if (rel == NULL)
PG_RETURN_NULL();
- size = calculate_relation_size(&(rel->rd_node), rel->rd_backend,
+ size = calculate_relation_size(&(rel->rd_locator), rel->rd_backend,
forkname_to_number(text_to_cstring(forkName)));
relation_close(rel, AccessShareLock);
@@ -374,7 +374,7 @@ calculate_toast_table_size(Oid toastrelid)
/* toast heap size, including FSM and VM size */
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- size += calculate_relation_size(&(toastRel->rd_node),
+ size += calculate_relation_size(&(toastRel->rd_locator),
toastRel->rd_backend, forkNum);
/* toast index size, including FSM and VM size */
@@ -388,7 +388,7 @@ calculate_toast_table_size(Oid toastrelid)
toastIdxRel = relation_open(lfirst_oid(lc),
AccessShareLock);
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- size += calculate_relation_size(&(toastIdxRel->rd_node),
+ size += calculate_relation_size(&(toastIdxRel->rd_locator),
toastIdxRel->rd_backend, forkNum);
relation_close(toastIdxRel, AccessShareLock);
@@ -417,7 +417,7 @@ calculate_table_size(Relation rel)
* heap size, including FSM and VM
*/
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- size += calculate_relation_size(&(rel->rd_node), rel->rd_backend,
+ size += calculate_relation_size(&(rel->rd_locator), rel->rd_backend,
forkNum);
/*
@@ -456,7 +456,7 @@ calculate_indexes_size(Relation rel)
idxRel = relation_open(idxOid, AccessShareLock);
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- size += calculate_relation_size(&(idxRel->rd_node),
+ size += calculate_relation_size(&(idxRel->rd_locator),
idxRel->rd_backend,
forkNum);
@@ -850,7 +850,7 @@ Datum
pg_relation_filenode(PG_FUNCTION_ARGS)
{
Oid relid = PG_GETARG_OID(0);
- Oid result;
+ RelFileNumber result;
HeapTuple tuple;
Form_pg_class relform;
@@ -864,29 +864,29 @@ pg_relation_filenode(PG_FUNCTION_ARGS)
if (relform->relfilenode)
result = relform->relfilenode;
else /* Consult the relation mapper */
- result = RelationMapOidToFilenode(relid,
- relform->relisshared);
+ result = RelationMapOidToFilenumber(relid,
+ relform->relisshared);
}
else
{
/* no storage, return NULL */
- result = InvalidOid;
+ result = InvalidRelFileNumber;
}
ReleaseSysCache(tuple);
- if (!OidIsValid(result))
+ if (!RelFileNumberIsValid(result))
PG_RETURN_NULL();
PG_RETURN_OID(result);
}
/*
- * Get the relation via (reltablespace, relfilenode)
+ * Get the relation via (reltablespace, relfilenumber)
*
* This is expected to be used when somebody wants to match an individual file
* on the filesystem back to its table. That's not trivially possible via
- * pg_class, because that doesn't contain the relfilenodes of shared and nailed
+ * pg_class, because that doesn't contain the relfilenumbers of shared and nailed
* tables.
*
* We don't fail but return NULL if we cannot find a mapping.
@@ -898,14 +898,14 @@ Datum
pg_filenode_relation(PG_FUNCTION_ARGS)
{
Oid reltablespace = PG_GETARG_OID(0);
- Oid relfilenode = PG_GETARG_OID(1);
+ RelFileNumber relfilenumber = PG_GETARG_OID(1);
Oid heaprel;
- /* test needed so RelidByRelfilenode doesn't misbehave */
- if (!OidIsValid(relfilenode))
+ /* test needed so RelidByRelfilenumber doesn't misbehave */
+ if (!RelFileNumberIsValid(relfilenumber))
PG_RETURN_NULL();
- heaprel = RelidByRelfilenode(reltablespace, relfilenode);
+ heaprel = RelidByRelfilenumber(reltablespace, relfilenumber);
if (!OidIsValid(heaprel))
PG_RETURN_NULL();
@@ -924,7 +924,7 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
Oid relid = PG_GETARG_OID(0);
HeapTuple tuple;
Form_pg_class relform;
- RelFileNode rnode;
+ RelFileLocator rlocator;
BackendId backend;
char *path;
@@ -937,29 +937,29 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
{
/* This logic should match RelationInitPhysicalAddr */
if (relform->reltablespace)
- rnode.spcNode = relform->reltablespace;
+ rlocator.spcOid = relform->reltablespace;
else
- rnode.spcNode = MyDatabaseTableSpace;
- if (rnode.spcNode == GLOBALTABLESPACE_OID)
- rnode.dbNode = InvalidOid;
+ rlocator.spcOid = MyDatabaseTableSpace;
+ if (rlocator.spcOid == GLOBALTABLESPACE_OID)
+ rlocator.dbOid = InvalidOid;
else
- rnode.dbNode = MyDatabaseId;
+ rlocator.dbOid = MyDatabaseId;
if (relform->relfilenode)
- rnode.relNode = relform->relfilenode;
+ rlocator.relNumber = relform->relfilenode;
else /* Consult the relation mapper */
- rnode.relNode = RelationMapOidToFilenode(relid,
- relform->relisshared);
+ rlocator.relNumber = RelationMapOidToFilenumber(relid,
+ relform->relisshared);
}
else
{
/* no storage, return NULL */
- rnode.relNode = InvalidOid;
+ rlocator.relNumber = InvalidRelFileNumber;
/* some compilers generate warnings without these next two lines */
- rnode.dbNode = InvalidOid;
- rnode.spcNode = InvalidOid;
+ rlocator.dbOid = InvalidOid;
+ rlocator.spcOid = InvalidOid;
}
- if (!OidIsValid(rnode.relNode))
+ if (!RelFileNumberIsValid(rlocator.relNumber))
{
ReleaseSysCache(tuple);
PG_RETURN_NULL();
@@ -990,7 +990,7 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
ReleaseSysCache(tuple);
- path = relpathbackend(rnode, backend, MAIN_FORKNUM);
+ path = relpathbackend(rlocator, backend, MAIN_FORKNUM);
PG_RETURN_TEXT_P(cstring_to_text(path));
}
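(pg_relation_filepath() above ends by handing the assembled locator to relpathbackend(). The basic layout it computes can be sketched as follows — a deliberately simplified version that handles permanent relations only, hard-codes the built-in tablespace OIDs as assumptions, and elides the versioned subdirectory under pg_tblspc:)

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* built-in tablespace OIDs, taken as assumptions here */
#define SKETCH_GLOBALTABLESPACE_OID 1664
#define SKETCH_DEFAULTTABLESPACE_OID 1663

static void
sketch_relation_path(char *buf, size_t len,
                     unsigned spcOid, unsigned dbOid, unsigned relNumber)
{
    if (spcOid == SKETCH_GLOBALTABLESPACE_OID)
        snprintf(buf, len, "global/%u", relNumber);	/* shared catalogs */
    else if (spcOid == SKETCH_DEFAULTTABLESPACE_OID)
        snprintf(buf, len, "base/%u/%u", dbOid, relNumber);
    else
        snprintf(buf, len, "pg_tblspc/%u/%u/%u", spcOid, dbOid, relNumber);
}

/* helper so the behavior is easy to assert */
static int
sketch_path_is(unsigned spc, unsigned db, unsigned rel, const char *expected)
{
    char		buf[64];

    sketch_relation_path(buf, sizeof(buf), spc, db, rel);
    return strcmp(buf, expected) == 0;
}
```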
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 65764d7..c260c97 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -2,7 +2,7 @@
* pg_upgrade_support.c
*
* server-side functions to set backend global variables
- * to control oid and relfilenode assignment, and do other special
+ * to control oid and relfilenumber assignment, and do other special
* hacks needed for pg_upgrade.
*
* Copyright (c) 2010-2022, PostgreSQL Global Development Group
@@ -98,10 +98,10 @@ binary_upgrade_set_next_heap_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_heap_relfilenode(PG_FUNCTION_ARGS)
{
- Oid nodeoid = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_OID(0);
CHECK_IS_BINARY_UPGRADE;
- binary_upgrade_next_heap_pg_class_relfilenode = nodeoid;
+ binary_upgrade_next_heap_pg_class_relfilenumber = relfilenumber;
PG_RETURN_VOID();
}
@@ -120,10 +120,10 @@ binary_upgrade_set_next_index_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_index_relfilenode(PG_FUNCTION_ARGS)
{
- Oid nodeoid = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_OID(0);
CHECK_IS_BINARY_UPGRADE;
- binary_upgrade_next_index_pg_class_relfilenode = nodeoid;
+ binary_upgrade_next_index_pg_class_relfilenumber = relfilenumber;
PG_RETURN_VOID();
}
@@ -142,10 +142,10 @@ binary_upgrade_set_next_toast_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_toast_relfilenode(PG_FUNCTION_ARGS)
{
- Oid nodeoid = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_OID(0);
CHECK_IS_BINARY_UPGRADE;
- binary_upgrade_next_toast_pg_class_relfilenode = nodeoid;
+ binary_upgrade_next_toast_pg_class_relfilenumber = relfilenumber;
PG_RETURN_VOID();
}
diff --git a/src/backend/utils/cache/Makefile b/src/backend/utils/cache/Makefile
index 38e46d2..5105018 100644
--- a/src/backend/utils/cache/Makefile
+++ b/src/backend/utils/cache/Makefile
@@ -21,7 +21,7 @@ OBJS = \
partcache.o \
plancache.o \
relcache.o \
- relfilenodemap.o \
+ relfilenumbermap.o \
relmapper.o \
spccache.o \
syscache.o \
diff --git a/src/backend/utils/cache/inval.c b/src/backend/utils/cache/inval.c
index af000d4..eb5782f 100644
--- a/src/backend/utils/cache/inval.c
+++ b/src/backend/utils/cache/inval.c
@@ -661,11 +661,11 @@ LocalExecuteInvalidationMessage(SharedInvalidationMessage *msg)
* We could have smgr entries for relations of other databases, so no
* short-circuit test is possible here.
*/
- RelFileNodeBackend rnode;
+ RelFileLocatorBackend rlocator;
- rnode.node = msg->sm.rnode;
- rnode.backend = (msg->sm.backend_hi << 16) | (int) msg->sm.backend_lo;
- smgrclosenode(rnode);
+ rlocator.locator = msg->sm.rlocator;
+ rlocator.backend = (msg->sm.backend_hi << 16) | (int) msg->sm.backend_lo;
+ smgrcloserellocator(rlocator);
}
else if (msg->id == SHAREDINVALRELMAP_ID)
{
@@ -1459,14 +1459,14 @@ CacheInvalidateRelcacheByRelid(Oid relid)
* Thus, the maximum possible backend ID is 2^23-1.
*/
void
-CacheInvalidateSmgr(RelFileNodeBackend rnode)
+CacheInvalidateSmgr(RelFileLocatorBackend rlocator)
{
SharedInvalidationMessage msg;
msg.sm.id = SHAREDINVALSMGR_ID;
- msg.sm.backend_hi = rnode.backend >> 16;
- msg.sm.backend_lo = rnode.backend & 0xffff;
- msg.sm.rnode = rnode.node;
+ msg.sm.backend_hi = rlocator.backend >> 16;
+ msg.sm.backend_lo = rlocator.backend & 0xffff;
+ msg.sm.rlocator = rlocator.locator;
/* check AddCatcacheInvalidationMessage() for an explanation */
VALGRIND_MAKE_MEM_DEFINED(&msg, sizeof(msg));
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index f502df9..0639875 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -369,7 +369,7 @@ ScanPgRelation(Oid targetRelId, bool indexOK, bool force_non_historic)
/*
* The caller might need a tuple that's newer than the one the historic
* snapshot; currently the only case requiring to do so is looking up the
- * relfilenode of non mapped system relations during decoding. That
+ * relfilenumber of non mapped system relations during decoding. That
* snapshot can't change in the midst of a relcache build, so there's no
* need to register the snapshot.
*/
@@ -1133,8 +1133,8 @@ retry:
relation->rd_refcnt = 0;
relation->rd_isnailed = false;
relation->rd_createSubid = InvalidSubTransactionId;
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
relation->rd_droppedSubid = InvalidSubTransactionId;
switch (relation->rd_rel->relpersistence)
{
@@ -1300,7 +1300,7 @@ retry:
}
/*
- * Initialize the physical addressing info (RelFileNode) for a relcache entry
+ * Initialize the physical addressing info (RelFileLocator) for a relcache entry
*
* Note: at the physical level, relations in the pg_global tablespace must
* be treated as shared, even if relisshared isn't set. Hence we do not
@@ -1309,20 +1309,20 @@ retry:
static void
RelationInitPhysicalAddr(Relation relation)
{
- Oid oldnode = relation->rd_node.relNode;
+ RelFileNumber oldnumber = relation->rd_locator.relNumber;
/* these relations kinds never have storage */
if (!RELKIND_HAS_STORAGE(relation->rd_rel->relkind))
return;
if (relation->rd_rel->reltablespace)
- relation->rd_node.spcNode = relation->rd_rel->reltablespace;
+ relation->rd_locator.spcOid = relation->rd_rel->reltablespace;
else
- relation->rd_node.spcNode = MyDatabaseTableSpace;
- if (relation->rd_node.spcNode == GLOBALTABLESPACE_OID)
- relation->rd_node.dbNode = InvalidOid;
+ relation->rd_locator.spcOid = MyDatabaseTableSpace;
+ if (relation->rd_locator.spcOid == GLOBALTABLESPACE_OID)
+ relation->rd_locator.dbOid = InvalidOid;
else
- relation->rd_node.dbNode = MyDatabaseId;
+ relation->rd_locator.dbOid = MyDatabaseId;
if (relation->rd_rel->relfilenode)
{
@@ -1356,30 +1356,30 @@ RelationInitPhysicalAddr(Relation relation)
heap_freetuple(phys_tuple);
}
- relation->rd_node.relNode = relation->rd_rel->relfilenode;
+ relation->rd_locator.relNumber = relation->rd_rel->relfilenode;
}
else
{
/* Consult the relation mapper */
- relation->rd_node.relNode =
- RelationMapOidToFilenode(relation->rd_id,
- relation->rd_rel->relisshared);
- if (!OidIsValid(relation->rd_node.relNode))
+ relation->rd_locator.relNumber =
+ RelationMapOidToFilenumber(relation->rd_id,
+ relation->rd_rel->relisshared);
+ if (!RelFileNumberIsValid(relation->rd_locator.relNumber))
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
RelationGetRelationName(relation), relation->rd_id);
}
/*
* For RelationNeedsWAL() to answer correctly on parallel workers, restore
- * rd_firstRelfilenodeSubid. No subtransactions start or end while in
+ * rd_firstRelfilelocatorSubid. No subtransactions start or end while in
* parallel mode, so the specific SubTransactionId does not matter.
*/
- if (IsParallelWorker() && oldnode != relation->rd_node.relNode)
+ if (IsParallelWorker() && oldnumber != relation->rd_locator.relNumber)
{
- if (RelFileNodeSkippingWAL(relation->rd_node))
- relation->rd_firstRelfilenodeSubid = TopSubTransactionId;
+ if (RelFileLocatorSkippingWAL(relation->rd_locator))
+ relation->rd_firstRelfilelocatorSubid = TopSubTransactionId;
else
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
}
}
@@ -1889,8 +1889,8 @@ formrdesc(const char *relationName, Oid relationReltype,
*/
relation->rd_isnailed = true;
relation->rd_createSubid = InvalidSubTransactionId;
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
relation->rd_droppedSubid = InvalidSubTransactionId;
relation->rd_backend = InvalidBackendId;
relation->rd_islocaltemp = false;
@@ -1978,11 +1978,11 @@ formrdesc(const char *relationName, Oid relationReltype,
/*
* All relations made with formrdesc are mapped. This is necessarily so
- * because there is no other way to know what filenode they currently
+ * because there is no other way to know what filenumber they currently
* have. In bootstrap mode, add them to the initial relation mapper data,
- * specifying that the initial filenode is the same as the OID.
+ * specifying that the initial filenumber is the same as the OID.
*/
- relation->rd_rel->relfilenode = InvalidOid;
+ relation->rd_rel->relfilenode = InvalidRelFileNumber;
if (IsBootstrapProcessingMode())
RelationMapUpdateMap(RelationGetRelid(relation),
RelationGetRelid(relation),
@@ -2180,7 +2180,7 @@ RelationClose(Relation relation)
#ifdef RELCACHE_FORCE_RELEASE
if (RelationHasReferenceCountZero(relation) &&
relation->rd_createSubid == InvalidSubTransactionId &&
- relation->rd_firstRelfilenodeSubid == InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid == InvalidSubTransactionId)
RelationClearRelation(relation, false);
#endif
}
@@ -2352,7 +2352,7 @@ RelationReloadNailed(Relation relation)
{
/*
* If it's a nailed-but-not-mapped index, then we need to re-read the
- * pg_class row to see if its relfilenode changed.
+ * pg_class row to see if its relfilenumber changed.
*/
RelationReloadIndexInfo(relation);
}
@@ -2700,8 +2700,8 @@ RelationClearRelation(Relation relation, bool rebuild)
Assert(newrel->rd_isnailed == relation->rd_isnailed);
/* creation sub-XIDs must be preserved */
SWAPFIELD(SubTransactionId, rd_createSubid);
- SWAPFIELD(SubTransactionId, rd_newRelfilenodeSubid);
- SWAPFIELD(SubTransactionId, rd_firstRelfilenodeSubid);
+ SWAPFIELD(SubTransactionId, rd_newRelfilelocatorSubid);
+ SWAPFIELD(SubTransactionId, rd_firstRelfilelocatorSubid);
SWAPFIELD(SubTransactionId, rd_droppedSubid);
/* un-swap rd_rel pointers, swap contents instead */
SWAPFIELD(Form_pg_class, rd_rel);
@@ -2791,12 +2791,12 @@ static void
RelationFlushRelation(Relation relation)
{
if (relation->rd_createSubid != InvalidSubTransactionId ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId)
{
/*
* New relcache entries are always rebuilt, not flushed; else we'd
* forget the "new" status of the relation. Ditto for the
- * new-relfilenode status.
+ * new-relfilenumber status.
*
* The rel could have zero refcnt here, so temporarily increment the
* refcnt to ensure it's safe to rebuild it. We can assume that the
@@ -2835,7 +2835,7 @@ RelationForgetRelation(Oid rid)
Assert(relation->rd_droppedSubid == InvalidSubTransactionId);
if (relation->rd_createSubid != InvalidSubTransactionId ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId)
{
/*
* In the event of subtransaction rollback, we must not forget
@@ -2894,7 +2894,7 @@ RelationCacheInvalidateEntry(Oid relationId)
*
* Apart from debug_discard_caches, this is currently used only to recover
* from SI message buffer overflow, so we do not touch relations having
- * new-in-transaction relfilenodes; they cannot be targets of cross-backend
+ * new-in-transaction relfilenumbers; they cannot be targets of cross-backend
* SI updates (and our own updates now go through a separate linked list
* that isn't limited by the SI message buffer size).
*
@@ -2909,7 +2909,7 @@ RelationCacheInvalidateEntry(Oid relationId)
* so hash_seq_search will complete safely; (b) during the second pass we
* only hold onto pointers to nondeletable entries.
*
- * The two-phase approach also makes it easy to update relfilenodes for
+ * The two-phase approach also makes it easy to update relfilenumbers for
* mapped relations before we do anything else, and to ensure that the
* second pass processes nailed-in-cache items before other nondeletable
* items. This should ensure that system catalogs are up to date before
@@ -2948,12 +2948,12 @@ RelationCacheInvalidate(bool debug_discard)
/*
* Ignore new relations; no other backend will manipulate them before
- * we commit. Likewise, before replacing a relation's relfilenode, we
- * shall have acquired AccessExclusiveLock and drained any applicable
- * pending invalidations.
+ * we commit. Likewise, before replacing a relation's relfilelocator,
+ * we shall have acquired AccessExclusiveLock and drained any
+ * applicable pending invalidations.
*/
if (relation->rd_createSubid != InvalidSubTransactionId ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId)
continue;
relcacheInvalsReceived++;
@@ -2967,8 +2967,8 @@ RelationCacheInvalidate(bool debug_discard)
else
{
/*
- * If it's a mapped relation, immediately update its rd_node in
- * case its relfilenode changed. We must do this during phase 1
+ * If it's a mapped relation, immediately update its rd_locator in
+ * case its relfilenumber changed. We must do this during phase 1
* in case the relation is consulted during rebuild of other
* relcache entries in phase 2. It's safe since consulting the
* map doesn't involve any access to relcache entries.
@@ -3078,14 +3078,14 @@ AssertPendingSyncConsistency(Relation relation)
RelationIsPermanent(relation) &&
((relation->rd_createSubid != InvalidSubTransactionId &&
RELKIND_HAS_STORAGE(relation->rd_rel->relkind)) ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId);
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId);
- Assert(relcache_verdict == RelFileNodeSkippingWAL(relation->rd_node));
+ Assert(relcache_verdict == RelFileLocatorSkippingWAL(relation->rd_locator));
if (relation->rd_droppedSubid != InvalidSubTransactionId)
Assert(!relation->rd_isvalid &&
(relation->rd_createSubid != InvalidSubTransactionId ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId));
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId));
}
/*
@@ -3282,8 +3282,8 @@ AtEOXact_cleanup(Relation relation, bool isCommit)
* also lets RelationClearRelation() drop the relcache entry.
*/
relation->rd_createSubid = InvalidSubTransactionId;
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
relation->rd_droppedSubid = InvalidSubTransactionId;
if (clear_relcache)
@@ -3397,8 +3397,8 @@ AtEOSubXact_cleanup(Relation relation, bool isCommit,
{
/* allow the entry to be removed */
relation->rd_createSubid = InvalidSubTransactionId;
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
relation->rd_droppedSubid = InvalidSubTransactionId;
RelationClearRelation(relation, false);
return;
@@ -3419,23 +3419,23 @@ AtEOSubXact_cleanup(Relation relation, bool isCommit,
}
/*
- * Likewise, update or drop any new-relfilenode-in-subtransaction record
+ * Likewise, update or drop any new-relfilenumber-in-subtransaction record
* or drop record.
*/
- if (relation->rd_newRelfilenodeSubid == mySubid)
+ if (relation->rd_newRelfilelocatorSubid == mySubid)
{
if (isCommit)
- relation->rd_newRelfilenodeSubid = parentSubid;
+ relation->rd_newRelfilelocatorSubid = parentSubid;
else
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
}
- if (relation->rd_firstRelfilenodeSubid == mySubid)
+ if (relation->rd_firstRelfilelocatorSubid == mySubid)
{
if (isCommit)
- relation->rd_firstRelfilenodeSubid = parentSubid;
+ relation->rd_firstRelfilelocatorSubid = parentSubid;
else
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
}
if (relation->rd_droppedSubid == mySubid)
@@ -3459,7 +3459,7 @@ RelationBuildLocalRelation(const char *relname,
TupleDesc tupDesc,
Oid relid,
Oid accessmtd,
- Oid relfilenode,
+ RelFileNumber relfilenumber,
Oid reltablespace,
bool shared_relation,
bool mapped_relation,
@@ -3533,8 +3533,8 @@ RelationBuildLocalRelation(const char *relname,
/* it's being created in this transaction */
rel->rd_createSubid = GetCurrentSubTransactionId();
- rel->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- rel->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ rel->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ rel->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
rel->rd_droppedSubid = InvalidSubTransactionId;
/*
@@ -3616,7 +3616,7 @@ RelationBuildLocalRelation(const char *relname,
/*
* Insert relation physical and logical identifiers (OIDs) into the right
- * places. For a mapped relation, we set relfilenode to zero and rely on
+ * places. For a mapped relation, we set relfilenumber to zero and rely on
* RelationInitPhysicalAddr to consult the map.
*/
rel->rd_rel->relisshared = shared_relation;
@@ -3630,12 +3630,12 @@ RelationBuildLocalRelation(const char *relname,
if (mapped_relation)
{
- rel->rd_rel->relfilenode = InvalidOid;
+ rel->rd_rel->relfilenode = InvalidRelFileNumber;
/* Add it to the active mapping information */
- RelationMapUpdateMap(relid, relfilenode, shared_relation, true);
+ RelationMapUpdateMap(relid, relfilenumber, shared_relation, true);
}
else
- rel->rd_rel->relfilenode = relfilenode;
+ rel->rd_rel->relfilenode = relfilenumber;
RelationInitLockInfo(rel); /* see lmgr.c */
@@ -3683,13 +3683,13 @@ RelationBuildLocalRelation(const char *relname,
/*
- * RelationSetNewRelfilenode
+ * RelationSetNewRelfilenumber
*
- * Assign a new relfilenode (physical file name), and possibly a new
+ * Assign a new relfilenumber (physical file name), and possibly a new
* persistence setting, to the relation.
*
* This allows a full rewrite of the relation to be done with transactional
- * safety (since the filenode assignment can be rolled back). Note however
+ * safety (since the filenumber assignment can be rolled back). Note however
* that there is no simple way to access the relation's old data for the
* remainder of the current transaction. This limits the usefulness to cases
* such as TRUNCATE or rebuilding an index from scratch.
@@ -3697,19 +3697,19 @@ RelationBuildLocalRelation(const char *relname,
* Caller must already hold exclusive lock on the relation.
*/
void
-RelationSetNewRelfilenode(Relation relation, char persistence)
+RelationSetNewRelfilenumber(Relation relation, char persistence)
{
- Oid newrelfilenode;
+ RelFileNumber newrelfilenumber;
Relation pg_class;
HeapTuple tuple;
Form_pg_class classform;
MultiXactId minmulti = InvalidMultiXactId;
TransactionId freezeXid = InvalidTransactionId;
- RelFileNode newrnode;
+ RelFileLocator newrlocator;
- /* Allocate a new relfilenode */
- newrelfilenode = GetNewRelFileNode(relation->rd_rel->reltablespace, NULL,
- persistence);
+ /* Allocate a new relfilenumber */
+ newrelfilenumber = GetNewRelFileNumber(relation->rd_rel->reltablespace,
+ NULL, persistence);
/*
* Get a writable copy of the pg_class tuple for the given relation.
@@ -3729,28 +3729,28 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
RelationDropStorage(relation);
/*
- * Create storage for the main fork of the new relfilenode. If it's a
+ * Create storage for the main fork of the new relfilenumber. If it's a
* table-like object, call into the table AM to do so, which'll also
* create the table's init fork if needed.
*
- * NOTE: If relevant for the AM, any conflict in relfilenode value will be
- * caught here, if GetNewRelFileNode messes up for any reason.
+ * NOTE: If relevant for the AM, any conflict in relfilenumber value will be
+ * caught here, if GetNewRelFileNumber messes up for any reason.
*/
- newrnode = relation->rd_node;
- newrnode.relNode = newrelfilenode;
+ newrlocator = relation->rd_locator;
+ newrlocator.relNumber = newrelfilenumber;
if (RELKIND_HAS_TABLE_AM(relation->rd_rel->relkind))
{
- table_relation_set_new_filenode(relation, &newrnode,
- persistence,
- &freezeXid, &minmulti);
+ table_relation_set_new_filelocator(relation, &newrlocator,
+ persistence,
+ &freezeXid, &minmulti);
}
else if (RELKIND_HAS_STORAGE(relation->rd_rel->relkind))
{
/* handle these directly, at least for now */
SMgrRelation srel;
- srel = RelationCreateStorage(newrnode, persistence, true);
+ srel = RelationCreateStorage(newrlocator, persistence, true);
smgrclose(srel);
}
else
@@ -3789,7 +3789,7 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
/* Do the deed */
RelationMapUpdateMap(RelationGetRelid(relation),
- newrelfilenode,
+ newrelfilenumber,
relation->rd_rel->relisshared,
false);
@@ -3799,7 +3799,7 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
else
{
/* Normal case, update the pg_class entry */
- classform->relfilenode = newrelfilenode;
+ classform->relfilenode = newrelfilenumber;
/* relpages etc. never change for sequences */
if (relation->rd_rel->relkind != RELKIND_SEQUENCE)
@@ -3825,27 +3825,27 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
*/
CommandCounterIncrement();
- RelationAssumeNewRelfilenode(relation);
+ RelationAssumeNewRelfilelocator(relation);
}
/*
- * RelationAssumeNewRelfilenode
+ * RelationAssumeNewRelfilelocator
*
* Code that modifies pg_class.reltablespace or pg_class.relfilenode must call
* this. The call shall precede any code that might insert WAL records whose
- * replay would modify bytes in the new RelFileNode, and the call shall follow
- * any WAL modifying bytes in the prior RelFileNode. See struct RelationData.
+ * replay would modify bytes in the new RelFileLocator, and the call shall follow
+ * any WAL modifying bytes in the prior RelFileLocator. See struct RelationData.
* Ideally, call this as near as possible to the CommandCounterIncrement()
* that makes the pg_class change visible (before it or after it); that
* minimizes the chance of future development adding a forbidden WAL insertion
- * between RelationAssumeNewRelfilenode() and CommandCounterIncrement().
+ * between RelationAssumeNewRelfilelocator() and CommandCounterIncrement().
*/
void
-RelationAssumeNewRelfilenode(Relation relation)
+RelationAssumeNewRelfilelocator(Relation relation)
{
- relation->rd_newRelfilenodeSubid = GetCurrentSubTransactionId();
- if (relation->rd_firstRelfilenodeSubid == InvalidSubTransactionId)
- relation->rd_firstRelfilenodeSubid = relation->rd_newRelfilenodeSubid;
+ relation->rd_newRelfilelocatorSubid = GetCurrentSubTransactionId();
+ if (relation->rd_firstRelfilelocatorSubid == InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid = relation->rd_newRelfilelocatorSubid;
/* Flag relation as needing eoxact cleanup (to clear these fields) */
EOXactListAdd(relation);
@@ -6254,8 +6254,8 @@ load_relcache_init_file(bool shared)
rel->rd_fkeyvalid = false;
rel->rd_fkeylist = NIL;
rel->rd_createSubid = InvalidSubTransactionId;
- rel->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- rel->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ rel->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ rel->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
rel->rd_droppedSubid = InvalidSubTransactionId;
rel->rd_amcache = NULL;
rel->pgstat_info = NULL;
diff --git a/src/backend/utils/cache/relfilenodemap.c b/src/backend/utils/cache/relfilenodemap.c
deleted file mode 100644
index 70c323c..0000000
--- a/src/backend/utils/cache/relfilenodemap.c
+++ /dev/null
@@ -1,244 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * relfilenodemap.c
- * relfilenode to oid mapping cache.
- *
- * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * IDENTIFICATION
- * src/backend/utils/cache/relfilenodemap.c
- *
- *-------------------------------------------------------------------------
- */
-#include "postgres.h"
-
-#include "access/genam.h"
-#include "access/htup_details.h"
-#include "access/table.h"
-#include "catalog/pg_class.h"
-#include "catalog/pg_tablespace.h"
-#include "miscadmin.h"
-#include "utils/builtins.h"
-#include "utils/catcache.h"
-#include "utils/fmgroids.h"
-#include "utils/hsearch.h"
-#include "utils/inval.h"
-#include "utils/rel.h"
-#include "utils/relfilenodemap.h"
-#include "utils/relmapper.h"
-
-/* Hash table for information about each relfilenode <-> oid pair */
-static HTAB *RelfilenodeMapHash = NULL;
-
-/* built first time through in InitializeRelfilenodeMap */
-static ScanKeyData relfilenode_skey[2];
-
-typedef struct
-{
- Oid reltablespace;
- Oid relfilenode;
-} RelfilenodeMapKey;
-
-typedef struct
-{
- RelfilenodeMapKey key; /* lookup key - must be first */
- Oid relid; /* pg_class.oid */
-} RelfilenodeMapEntry;
-
-/*
- * RelfilenodeMapInvalidateCallback
- * Flush mapping entries when pg_class is updated in a relevant fashion.
- */
-static void
-RelfilenodeMapInvalidateCallback(Datum arg, Oid relid)
-{
- HASH_SEQ_STATUS status;
- RelfilenodeMapEntry *entry;
-
- /* callback only gets registered after creating the hash */
- Assert(RelfilenodeMapHash != NULL);
-
- hash_seq_init(&status, RelfilenodeMapHash);
- while ((entry = (RelfilenodeMapEntry *) hash_seq_search(&status)) != NULL)
- {
- /*
- * If relid is InvalidOid, signaling a complete reset, we must remove
- * all entries, otherwise just remove the specific relation's entry.
- * Always remove negative cache entries.
- */
- if (relid == InvalidOid || /* complete reset */
- entry->relid == InvalidOid || /* negative cache entry */
- entry->relid == relid) /* individual flushed relation */
- {
- if (hash_search(RelfilenodeMapHash,
- (void *) &entry->key,
- HASH_REMOVE,
- NULL) == NULL)
- elog(ERROR, "hash table corrupted");
- }
- }
-}
-
-/*
- * InitializeRelfilenodeMap
- * Initialize cache, either on first use or after a reset.
- */
-static void
-InitializeRelfilenodeMap(void)
-{
- HASHCTL ctl;
- int i;
-
- /* Make sure we've initialized CacheMemoryContext. */
- if (CacheMemoryContext == NULL)
- CreateCacheMemoryContext();
-
- /* build skey */
- MemSet(&relfilenode_skey, 0, sizeof(relfilenode_skey));
-
- for (i = 0; i < 2; i++)
- {
- fmgr_info_cxt(F_OIDEQ,
- &relfilenode_skey[i].sk_func,
- CacheMemoryContext);
- relfilenode_skey[i].sk_strategy = BTEqualStrategyNumber;
- relfilenode_skey[i].sk_subtype = InvalidOid;
- relfilenode_skey[i].sk_collation = InvalidOid;
- }
-
- relfilenode_skey[0].sk_attno = Anum_pg_class_reltablespace;
- relfilenode_skey[1].sk_attno = Anum_pg_class_relfilenode;
-
- /*
- * Only create the RelfilenodeMapHash now, so we don't end up partially
- * initialized when fmgr_info_cxt() above ERRORs out with an out of memory
- * error.
- */
- ctl.keysize = sizeof(RelfilenodeMapKey);
- ctl.entrysize = sizeof(RelfilenodeMapEntry);
- ctl.hcxt = CacheMemoryContext;
-
- RelfilenodeMapHash =
- hash_create("RelfilenodeMap cache", 64, &ctl,
- HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
-
- /* Watch for invalidation events. */
- CacheRegisterRelcacheCallback(RelfilenodeMapInvalidateCallback,
- (Datum) 0);
-}
-
-/*
- * Map a relation's (tablespace, filenode) to a relation's oid and cache the
- * result.
- *
- * Returns InvalidOid if no relation matching the criteria could be found.
- */
-Oid
-RelidByRelfilenode(Oid reltablespace, Oid relfilenode)
-{
- RelfilenodeMapKey key;
- RelfilenodeMapEntry *entry;
- bool found;
- SysScanDesc scandesc;
- Relation relation;
- HeapTuple ntp;
- ScanKeyData skey[2];
- Oid relid;
-
- if (RelfilenodeMapHash == NULL)
- InitializeRelfilenodeMap();
-
- /* pg_class will show 0 when the value is actually MyDatabaseTableSpace */
- if (reltablespace == MyDatabaseTableSpace)
- reltablespace = 0;
-
- MemSet(&key, 0, sizeof(key));
- key.reltablespace = reltablespace;
- key.relfilenode = relfilenode;
-
- /*
- * Check cache and return entry if one is found. Even if no target
- * relation can be found later on we store the negative match and return a
- * InvalidOid from cache. That's not really necessary for performance
- * since querying invalid values isn't supposed to be a frequent thing,
- * but it's basically free.
- */
- entry = hash_search(RelfilenodeMapHash, (void *) &key, HASH_FIND, &found);
-
- if (found)
- return entry->relid;
-
- /* ok, no previous cache entry, do it the hard way */
-
- /* initialize empty/negative cache entry before doing the actual lookups */
- relid = InvalidOid;
-
- if (reltablespace == GLOBALTABLESPACE_OID)
- {
- /*
- * Ok, shared table, check relmapper.
- */
- relid = RelationMapFilenodeToOid(relfilenode, true);
- }
- else
- {
- /*
- * Not a shared table, could either be a plain relation or a
- * non-shared, nailed one, like e.g. pg_class.
- */
-
- /* check for plain relations by looking in pg_class */
- relation = table_open(RelationRelationId, AccessShareLock);
-
- /* copy scankey to local copy, it will be modified during the scan */
- memcpy(skey, relfilenode_skey, sizeof(skey));
-
- /* set scan arguments */
- skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
- skey[1].sk_argument = ObjectIdGetDatum(relfilenode);
-
- scandesc = systable_beginscan(relation,
- ClassTblspcRelfilenodeIndexId,
- true,
- NULL,
- 2,
- skey);
-
- found = false;
-
- while (HeapTupleIsValid(ntp = systable_getnext(scandesc)))
- {
- Form_pg_class classform = (Form_pg_class) GETSTRUCT(ntp);
-
- if (found)
- elog(ERROR,
- "unexpected duplicate for tablespace %u, relfilenode %u",
- reltablespace, relfilenode);
- found = true;
-
- Assert(classform->reltablespace == reltablespace);
- Assert(classform->relfilenode == relfilenode);
- relid = classform->oid;
- }
-
- systable_endscan(scandesc);
- table_close(relation, AccessShareLock);
-
- /* check for tables that are mapped but not shared */
- if (!found)
- relid = RelationMapFilenodeToOid(relfilenode, false);
- }
-
- /*
- * Only enter entry into cache now, our opening of pg_class could have
- * caused cache invalidations to be executed which would have deleted a
- * new entry if we had entered it above.
- */
- entry = hash_search(RelfilenodeMapHash, (void *) &key, HASH_ENTER, &found);
- if (found)
- elog(ERROR, "corrupted hashtable");
- entry->relid = relid;
-
- return relid;
-}
diff --git a/src/backend/utils/cache/relfilenumbermap.c b/src/backend/utils/cache/relfilenumbermap.c
new file mode 100644
index 0000000..3dc45e9
--- /dev/null
+++ b/src/backend/utils/cache/relfilenumbermap.c
@@ -0,0 +1,244 @@
+/*-------------------------------------------------------------------------
+ *
+ * relfilenumbermap.c
+ * relfilenumber to oid mapping cache.
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/backend/utils/cache/relfilenumbermap.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/htup_details.h"
+#include "access/table.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_tablespace.h"
+#include "miscadmin.h"
+#include "utils/builtins.h"
+#include "utils/catcache.h"
+#include "utils/fmgroids.h"
+#include "utils/hsearch.h"
+#include "utils/inval.h"
+#include "utils/rel.h"
+#include "utils/relfilenumbermap.h"
+#include "utils/relmapper.h"
+
+/* Hash table for information about each relfilenumber <-> oid pair */
+static HTAB *RelfilenumberMapHash = NULL;
+
+/* built first time through in InitializeRelfilenumberMap */
+static ScanKeyData relfilenumber_skey[2];
+
+typedef struct
+{
+ Oid reltablespace;
+ RelFileNumber relfilenumber;
+} RelfilenumberMapKey;
+
+typedef struct
+{
+ RelfilenumberMapKey key; /* lookup key - must be first */
+ Oid relid; /* pg_class.oid */
+} RelfilenumberMapEntry;
+
+/*
+ * RelfilenumberMapInvalidateCallback
+ * Flush mapping entries when pg_class is updated in a relevant fashion.
+ */
+static void
+RelfilenumberMapInvalidateCallback(Datum arg, Oid relid)
+{
+ HASH_SEQ_STATUS status;
+ RelfilenumberMapEntry *entry;
+
+ /* callback only gets registered after creating the hash */
+ Assert(RelfilenumberMapHash != NULL);
+
+ hash_seq_init(&status, RelfilenumberMapHash);
+ while ((entry = (RelfilenumberMapEntry *) hash_seq_search(&status)) != NULL)
+ {
+ /*
+ * If relid is InvalidOid, signaling a complete reset, we must remove
+ * all entries, otherwise just remove the specific relation's entry.
+ * Always remove negative cache entries.
+ */
+ if (relid == InvalidOid || /* complete reset */
+ entry->relid == InvalidOid || /* negative cache entry */
+ entry->relid == relid) /* individual flushed relation */
+ {
+ if (hash_search(RelfilenumberMapHash,
+ (void *) &entry->key,
+ HASH_REMOVE,
+ NULL) == NULL)
+ elog(ERROR, "hash table corrupted");
+ }
+ }
+}
+
+/*
+ * InitializeRelfilenumberMap
+ * Initialize cache, either on first use or after a reset.
+ */
+static void
+InitializeRelfilenumberMap(void)
+{
+ HASHCTL ctl;
+ int i;
+
+ /* Make sure we've initialized CacheMemoryContext. */
+ if (CacheMemoryContext == NULL)
+ CreateCacheMemoryContext();
+
+ /* build skey */
+ MemSet(&relfilenumber_skey, 0, sizeof(relfilenumber_skey));
+
+ for (i = 0; i < 2; i++)
+ {
+ fmgr_info_cxt(F_OIDEQ,
+ &relfilenumber_skey[i].sk_func,
+ CacheMemoryContext);
+ relfilenumber_skey[i].sk_strategy = BTEqualStrategyNumber;
+ relfilenumber_skey[i].sk_subtype = InvalidOid;
+ relfilenumber_skey[i].sk_collation = InvalidOid;
+ }
+
+ relfilenumber_skey[0].sk_attno = Anum_pg_class_reltablespace;
+ relfilenumber_skey[1].sk_attno = Anum_pg_class_relfilenode;
+
+ /*
+ * Only create the RelfilenumberMapHash now, so we don't end up partially
+ * initialized when fmgr_info_cxt() above ERRORs out with an out-of-memory
+ * error.
+ */
+ ctl.keysize = sizeof(RelfilenumberMapKey);
+ ctl.entrysize = sizeof(RelfilenumberMapEntry);
+ ctl.hcxt = CacheMemoryContext;
+
+ RelfilenumberMapHash =
+ hash_create("RelfilenumberMap cache", 64, &ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
+ /* Watch for invalidation events. */
+ CacheRegisterRelcacheCallback(RelfilenumberMapInvalidateCallback,
+ (Datum) 0);
+}
+
+/*
+ * Map a relation's (tablespace, relfilenumber) to a relation's oid and cache
+ * the result.
+ *
+ * Returns InvalidOid if no relation matching the criteria could be found.
+ */
+Oid
+RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
+{
+ RelfilenumberMapKey key;
+ RelfilenumberMapEntry *entry;
+ bool found;
+ SysScanDesc scandesc;
+ Relation relation;
+ HeapTuple ntp;
+ ScanKeyData skey[2];
+ Oid relid;
+
+ if (RelfilenumberMapHash == NULL)
+ InitializeRelfilenumberMap();
+
+ /* pg_class will show 0 when the value is actually MyDatabaseTableSpace */
+ if (reltablespace == MyDatabaseTableSpace)
+ reltablespace = 0;
+
+ MemSet(&key, 0, sizeof(key));
+ key.reltablespace = reltablespace;
+ key.relfilenumber = relfilenumber;
+
+ /*
+ * Check cache and return entry if one is found. Even if no target
+ * relation can be found later on, we store the negative match and return
+ * InvalidOid from cache. That's not really necessary for performance,
+ * since querying invalid values isn't supposed to be a frequent thing,
+ * but it's basically free.
+ */
+ entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_FIND, &found);
+
+ if (found)
+ return entry->relid;
+
+ /* ok, no previous cache entry, do it the hard way */
+
+ /* initialize empty/negative cache entry before doing the actual lookups */
+ relid = InvalidOid;
+
+ if (reltablespace == GLOBALTABLESPACE_OID)
+ {
+ /*
+ * Ok, shared table, check relmapper.
+ */
+ relid = RelationMapFilenumberToOid(relfilenumber, true);
+ }
+ else
+ {
+ /*
+ * Not a shared table; could either be a plain relation or a
+ * non-shared, nailed one, e.g. pg_class.
+ */
+
+ /* check for plain relations by looking in pg_class */
+ relation = table_open(RelationRelationId, AccessShareLock);
+
+ /* copy scankey to local copy, it will be modified during the scan */
+ memcpy(skey, relfilenumber_skey, sizeof(skey));
+
+ /* set scan arguments */
+ skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
+ skey[1].sk_argument = ObjectIdGetDatum(relfilenumber);
+
+ scandesc = systable_beginscan(relation,
+ ClassTblspcRelfilenodeIndexId,
+ true,
+ NULL,
+ 2,
+ skey);
+
+ found = false;
+
+ while (HeapTupleIsValid(ntp = systable_getnext(scandesc)))
+ {
+ Form_pg_class classform = (Form_pg_class) GETSTRUCT(ntp);
+
+ if (found)
+ elog(ERROR,
+ "unexpected duplicate for tablespace %u, relfilenumber %u",
+ reltablespace, relfilenumber);
+ found = true;
+
+ Assert(classform->reltablespace == reltablespace);
+ Assert(classform->relfilenode == relfilenumber);
+ relid = classform->oid;
+ }
+
+ systable_endscan(scandesc);
+ table_close(relation, AccessShareLock);
+
+ /* check for tables that are mapped but not shared */
+ if (!found)
+ relid = RelationMapFilenumberToOid(relfilenumber, false);
+ }
+
+ /*
+ * Only enter the entry into the cache now; our opening of pg_class could
+ * have caused cache invalidations to be executed, which would have
+ * deleted a new entry if we had entered it above.
+ */
+ entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_ENTER, &found);
+ if (found)
+ elog(ERROR, "corrupted hashtable");
+ entry->relid = relid;
+
+ return relid;
+}
diff --git a/src/backend/utils/cache/relmapper.c b/src/backend/utils/cache/relmapper.c
index 2a330cf..e2ac0fa 100644
--- a/src/backend/utils/cache/relmapper.c
+++ b/src/backend/utils/cache/relmapper.c
@@ -1,7 +1,7 @@
/*-------------------------------------------------------------------------
*
* relmapper.c
- * Catalog-to-filenode mapping
+ * Catalog-to-filenumber mapping
*
* For most tables, the physical file underlying the table is specified by
* pg_class.relfilenode. However, that obviously won't work for pg_class
@@ -11,7 +11,7 @@
* update other databases' pg_class entries when relocating a shared catalog.
* Therefore, for these special catalogs (henceforth referred to as "mapped
* catalogs") we rely on a separately maintained file that shows the mapping
- * from catalog OIDs to filenode numbers. Each database has a map file for
+ * from catalog OIDs to filenumbers. Each database has a map file for
* its local mapped catalogs, and there is a separate map file for shared
* catalogs. Mapped catalogs have zero in their pg_class.relfilenode entries.
*
@@ -78,8 +78,8 @@
typedef struct RelMapping
{
- Oid mapoid; /* OID of a catalog */
- Oid mapfilenode; /* its filenode number */
+ Oid mapoid; /* OID of a catalog */
+ RelFileNumber mapfilenumber; /* its rel file number */
} RelMapping;
typedef struct RelMapFile
@@ -116,7 +116,7 @@ static RelMapFile local_map;
* subtransactions, so one set of transaction-level changes is sufficient.
*
* The active_xxx variables contain updates that are valid in our transaction
- * and should be honored by RelationMapOidToFilenode. The pending_xxx
+ * and should be honored by RelationMapOidToFilenumber. The pending_xxx
* variables contain updates we have been told about that aren't active yet;
* they will become active at the next CommandCounterIncrement. This setup
* lets map updates act similarly to updates of pg_class rows, ie, they
@@ -132,8 +132,8 @@ static RelMapFile pending_local_updates;
/* non-export function prototypes */
-static void apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode,
- bool add_okay);
+static void apply_map_update(RelMapFile *map, Oid relationId,
+ RelFileNumber filenumber, bool add_okay);
static void merge_map_updates(RelMapFile *map, const RelMapFile *updates,
bool add_okay);
static void load_relmap_file(bool shared, bool lock_held);
@@ -146,19 +146,20 @@ static void perform_relmap_update(bool shared, const RelMapFile *updates);
/*
- * RelationMapOidToFilenode
+ * RelationMapOidToFilenumber
*
- * The raison d' etre ... given a relation OID, look up its filenode.
+ * The raison d' etre ... given a relation OID, look up its filenumber.
*
* Although shared and local relation OIDs should never overlap, the caller
* always knows which we need --- so pass that information to avoid useless
* searching.
*
- * Returns InvalidOid if the OID is not known (which should never happen,
- * but the caller is in a better position to report a meaningful error).
+ * Returns InvalidRelFileNumber if the OID is not known (which should never
+ * happen, but the caller is in a better position to report a meaningful
+ * error).
*/
-Oid
-RelationMapOidToFilenode(Oid relationId, bool shared)
+RelFileNumber
+RelationMapOidToFilenumber(Oid relationId, bool shared)
{
const RelMapFile *map;
int32 i;
@@ -170,13 +171,13 @@ RelationMapOidToFilenode(Oid relationId, bool shared)
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return map->mappings[i].mapfilenumber;
}
map = &shared_map;
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return map->mappings[i].mapfilenumber;
}
}
else
@@ -185,33 +186,33 @@ RelationMapOidToFilenode(Oid relationId, bool shared)
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return map->mappings[i].mapfilenumber;
}
map = &local_map;
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return map->mappings[i].mapfilenumber;
}
}
- return InvalidOid;
+ return InvalidRelFileNumber;
}
/*
- * RelationMapFilenodeToOid
+ * RelationMapFilenumberToOid
*
* Do the reverse of the normal direction of mapping done in
- * RelationMapOidToFilenode.
+ * RelationMapOidToFilenumber.
*
* This is not supposed to be used during normal running but rather for
* information purposes when looking at the filesystem or xlog.
*
* Returns InvalidOid if the OID is not known; this can easily happen if the
- * relfilenode doesn't pertain to a mapped relation.
+ * relfilenumber doesn't pertain to a mapped relation.
*/
Oid
-RelationMapFilenodeToOid(Oid filenode, bool shared)
+RelationMapFilenumberToOid(RelFileNumber filenumber, bool shared)
{
const RelMapFile *map;
int32 i;
@@ -222,13 +223,13 @@ RelationMapFilenodeToOid(Oid filenode, bool shared)
map = &active_shared_updates;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenumber == map->mappings[i].mapfilenumber)
return map->mappings[i].mapoid;
}
map = &shared_map;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenumber == map->mappings[i].mapfilenumber)
return map->mappings[i].mapoid;
}
}
@@ -237,13 +238,13 @@ RelationMapFilenodeToOid(Oid filenode, bool shared)
map = &active_local_updates;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenumber == map->mappings[i].mapfilenumber)
return map->mappings[i].mapoid;
}
map = &local_map;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenumber == map->mappings[i].mapfilenumber)
return map->mappings[i].mapoid;
}
}
@@ -252,13 +253,13 @@ RelationMapFilenodeToOid(Oid filenode, bool shared)
}
/*
- * RelationMapOidToFilenodeForDatabase
+ * RelationMapOidToFilenumberForDatabase
*
- * Like RelationMapOidToFilenode, but reads the mapping from the indicated
+ * Like RelationMapOidToFilenumber, but reads the mapping from the indicated
* path instead of using the one for the current database.
*/
-Oid
-RelationMapOidToFilenodeForDatabase(char *dbpath, Oid relationId)
+RelFileNumber
+RelationMapOidToFilenumberForDatabase(char *dbpath, Oid relationId)
{
RelMapFile map;
int i;
@@ -270,10 +271,10 @@ RelationMapOidToFilenodeForDatabase(char *dbpath, Oid relationId)
for (i = 0; i < map.num_mappings; i++)
{
if (relationId == map.mappings[i].mapoid)
- return map.mappings[i].mapfilenode;
+ return map.mappings[i].mapfilenumber;
}
- return InvalidOid;
+ return InvalidRelFileNumber;
}
/*
@@ -311,13 +312,13 @@ RelationMapCopy(Oid dbid, Oid tsid, char *srcdbpath, char *dstdbpath)
/*
* RelationMapUpdateMap
*
- * Install a new relfilenode mapping for the specified relation.
+ * Install a new relfilenumber mapping for the specified relation.
*
* If immediate is true (or we're bootstrapping), the mapping is activated
* immediately. Otherwise it is made pending until CommandCounterIncrement.
*/
void
-RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
+RelationMapUpdateMap(Oid relationId, RelFileNumber fileNumber, bool shared,
bool immediate)
{
RelMapFile *map;
@@ -362,7 +363,7 @@ RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
map = &pending_local_updates;
}
}
- apply_map_update(map, relationId, fileNode, true);
+ apply_map_update(map, relationId, fileNumber, true);
}
/*
@@ -375,7 +376,8 @@ RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
* add_okay = false to draw an error if not.
*/
static void
-apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode, bool add_okay)
+apply_map_update(RelMapFile *map, Oid relationId, RelFileNumber fileNumber,
+ bool add_okay)
{
int32 i;
@@ -384,7 +386,7 @@ apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode, bool add_okay)
{
if (relationId == map->mappings[i].mapoid)
{
- map->mappings[i].mapfilenode = fileNode;
+ map->mappings[i].mapfilenumber = fileNumber;
return;
}
}
@@ -396,7 +398,7 @@ apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode, bool add_okay)
if (map->num_mappings >= MAX_MAPPINGS)
elog(ERROR, "ran out of space in relation map");
map->mappings[map->num_mappings].mapoid = relationId;
- map->mappings[map->num_mappings].mapfilenode = fileNode;
+ map->mappings[map->num_mappings].mapfilenumber = fileNumber;
map->num_mappings++;
}
@@ -415,7 +417,7 @@ merge_map_updates(RelMapFile *map, const RelMapFile *updates, bool add_okay)
{
apply_map_update(map,
updates->mappings[i].mapoid,
- updates->mappings[i].mapfilenode,
+ updates->mappings[i].mapfilenumber,
add_okay);
}
}
@@ -983,12 +985,12 @@ write_relmap_file(RelMapFile *newmap, bool write_wal, bool send_sinval,
for (i = 0; i < newmap->num_mappings; i++)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
- rnode.spcNode = tsid;
- rnode.dbNode = dbid;
- rnode.relNode = newmap->mappings[i].mapfilenode;
- RelationPreserveStorage(rnode, false);
+ rlocator.spcOid = tsid;
+ rlocator.dbOid = dbid;
+ rlocator.relNumber = newmap->mappings[i].mapfilenumber;
+ RelationPreserveStorage(rlocator, false);
}
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index c871cb7..6b90e7c 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4803,16 +4803,16 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
bool is_index)
{
PQExpBuffer upgrade_query = createPQExpBuffer();
- PGresult *upgrade_res;
- Oid relfilenode;
- Oid toast_oid;
- Oid toast_relfilenode;
- char relkind;
- Oid toast_index_oid;
- Oid toast_index_relfilenode;
+ PGresult *upgrade_res;
+ RelFileNumber relfilenumber;
+ Oid toast_oid;
+ RelFileNumber toast_relfilenumber;
+ char relkind;
+ Oid toast_index_oid;
+ RelFileNumber toast_index_relfilenumber;
/*
- * Preserve the OID and relfilenode of the table, table's index, table's
+ * Preserve the OID and relfilenumber of the table, table's index, table's
* toast table and toast table's index if any.
*
* One complexity is that the current table definition might not require
@@ -4835,15 +4835,15 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
relkind = *PQgetvalue(upgrade_res, 0, PQfnumber(upgrade_res, "relkind"));
- relfilenode = atooid(PQgetvalue(upgrade_res, 0,
+ relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "relfilenode")));
toast_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "reltoastrelid")));
- toast_relfilenode = atooid(PQgetvalue(upgrade_res, 0,
+ toast_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "toast_relfilenode")));
toast_index_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "indexrelid")));
- toast_index_relfilenode = atooid(PQgetvalue(upgrade_res, 0,
+ toast_index_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "toast_index_relfilenode")));
appendPQExpBufferStr(upgrade_buffer,
@@ -4857,13 +4857,13 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
/*
* Not every relation has storage. Also, in a pre-v12 database,
- * partitioned tables have a relfilenode, which should not be
+ * partitioned tables have a relfilenumber, which should not be
* preserved when upgrading.
*/
- if (OidIsValid(relfilenode) && relkind != RELKIND_PARTITIONED_TABLE)
+ if (RelFileNumberIsValid(relfilenumber) && relkind != RELKIND_PARTITIONED_TABLE)
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
- relfilenode);
+ relfilenumber);
/*
* In a pre-v12 database, partitioned tables might be marked as having
@@ -4877,7 +4877,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
toast_oid);
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('%u'::pg_catalog.oid);\n",
- toast_relfilenode);
+ toast_relfilenumber);
/* every toast table has an index */
appendPQExpBuffer(upgrade_buffer,
@@ -4885,20 +4885,20 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
toast_index_oid);
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
- toast_index_relfilenode);
+ toast_index_relfilenumber);
}
PQclear(upgrade_res);
}
else
{
- /* Preserve the OID and relfilenode of the index */
+ /* Preserve the OID and relfilenumber of the index */
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
pg_class_oid);
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
- relfilenode);
+ relfilenumber);
}
appendPQExpBufferChar(upgrade_buffer, '\n');
diff --git a/src/bin/pg_rewind/datapagemap.h b/src/bin/pg_rewind/datapagemap.h
index ae4965f..235b676 100644
--- a/src/bin/pg_rewind/datapagemap.h
+++ b/src/bin/pg_rewind/datapagemap.h
@@ -10,7 +10,7 @@
#define DATAPAGEMAP_H
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
struct datapagemap
{
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 6252931..269ed64 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -56,7 +56,7 @@ static uint32 hash_string_pointer(const char *s);
static filehash_hash *filehash;
static bool isRelDataFile(const char *path);
-static char *datasegpath(RelFileNode rnode, ForkNumber forknum,
+static char *datasegpath(RelFileLocator rlocator, ForkNumber forknum,
BlockNumber segno);
static file_entry_t *insert_filehash_entry(const char *path);
@@ -288,7 +288,7 @@ process_target_file(const char *path, file_type_t type, size_t size,
* hash table!
*/
void
-process_target_wal_block_change(ForkNumber forknum, RelFileNode rnode,
+process_target_wal_block_change(ForkNumber forknum, RelFileLocator rlocator,
BlockNumber blkno)
{
char *path;
@@ -299,7 +299,7 @@ process_target_wal_block_change(ForkNumber forknum, RelFileNode rnode,
segno = blkno / RELSEG_SIZE;
blkno_inseg = blkno % RELSEG_SIZE;
- path = datasegpath(rnode, forknum, segno);
+ path = datasegpath(rlocator, forknum, segno);
entry = lookup_filehash_entry(path);
pfree(path);
@@ -508,7 +508,7 @@ print_filemap(filemap_t *filemap)
static bool
isRelDataFile(const char *path)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
unsigned int segNo;
int nmatch;
bool matched;
@@ -532,32 +532,32 @@ isRelDataFile(const char *path)
*
*----
*/
- rnode.spcNode = InvalidOid;
- rnode.dbNode = InvalidOid;
- rnode.relNode = InvalidOid;
+ rlocator.spcOid = InvalidOid;
+ rlocator.dbOid = InvalidOid;
+ rlocator.relNumber = InvalidRelFileNumber;
segNo = 0;
matched = false;
- nmatch = sscanf(path, "global/%u.%u", &rnode.relNode, &segNo);
+ nmatch = sscanf(path, "global/%u.%u", &rlocator.relNumber, &segNo);
if (nmatch == 1 || nmatch == 2)
{
- rnode.spcNode = GLOBALTABLESPACE_OID;
- rnode.dbNode = 0;
+ rlocator.spcOid = GLOBALTABLESPACE_OID;
+ rlocator.dbOid = 0;
matched = true;
}
else
{
nmatch = sscanf(path, "base/%u/%u.%u",
- &rnode.dbNode, &rnode.relNode, &segNo);
+ &rlocator.dbOid, &rlocator.relNumber, &segNo);
if (nmatch == 2 || nmatch == 3)
{
- rnode.spcNode = DEFAULTTABLESPACE_OID;
+ rlocator.spcOid = DEFAULTTABLESPACE_OID;
matched = true;
}
else
{
nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/%u.%u",
- &rnode.spcNode, &rnode.dbNode, &rnode.relNode,
+ &rlocator.spcOid, &rlocator.dbOid, &rlocator.relNumber,
&segNo);
if (nmatch == 3 || nmatch == 4)
matched = true;
@@ -567,12 +567,12 @@ isRelDataFile(const char *path)
/*
* The sscanf tests above can match files that have extra characters at
* the end. To eliminate such cases, cross-check that GetRelationPath
- * creates the exact same filename, when passed the RelFileNode
+ * creates the exact same filename, when passed the RelFileLocator
* information we extracted from the filename.
*/
if (matched)
{
- char *check_path = datasegpath(rnode, MAIN_FORKNUM, segNo);
+ char *check_path = datasegpath(rlocator, MAIN_FORKNUM, segNo);
if (strcmp(check_path, path) != 0)
matched = false;
@@ -589,12 +589,12 @@ isRelDataFile(const char *path)
* The returned path is palloc'd
*/
static char *
-datasegpath(RelFileNode rnode, ForkNumber forknum, BlockNumber segno)
+datasegpath(RelFileLocator rlocator, ForkNumber forknum, BlockNumber segno)
{
char *path;
char *segpath;
- path = relpathperm(rnode, forknum);
+ path = relpathperm(rlocator, forknum);
if (segno > 0)
{
segpath = psprintf("%s.%u", path, segno);
diff --git a/src/bin/pg_rewind/filemap.h b/src/bin/pg_rewind/filemap.h
index 096f57a..0e011fb 100644
--- a/src/bin/pg_rewind/filemap.h
+++ b/src/bin/pg_rewind/filemap.h
@@ -10,7 +10,7 @@
#include "datapagemap.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/* these enum values are sorted in the order we want actions to be processed */
typedef enum
@@ -103,7 +103,7 @@ extern void process_source_file(const char *path, file_type_t type,
extern void process_target_file(const char *path, file_type_t type,
size_t size, const char *link_target);
extern void process_target_wal_block_change(ForkNumber forknum,
- RelFileNode rnode,
+ RelFileLocator rlocator,
BlockNumber blkno);
extern filemap_t *decide_file_actions(void);
diff --git a/src/bin/pg_rewind/parsexlog.c b/src/bin/pg_rewind/parsexlog.c
index c6792da..d97240e 100644
--- a/src/bin/pg_rewind/parsexlog.c
+++ b/src/bin/pg_rewind/parsexlog.c
@@ -445,18 +445,18 @@ extractPageInfo(XLogReaderState *record)
for (block_id = 0; block_id <= XLogRecMaxBlockId(record); block_id++)
{
- RelFileNode rnode;
- ForkNumber forknum;
- BlockNumber blkno;
+ RelFileLocator rlocator;
+ ForkNumber forknum;
+ BlockNumber blkno;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blkno, NULL))
+ &rlocator, &forknum, &blkno, NULL))
continue;
/* We only care about the main fork; others are copied in toto */
if (forknum != MAIN_FORKNUM)
continue;
- process_target_wal_block_change(forknum, rnode, blkno);
+ process_target_wal_block_change(forknum, rlocator, blkno);
}
}
diff --git a/src/bin/pg_rewind/pg_rewind.h b/src/bin/pg_rewind/pg_rewind.h
index 393182f..8b4b50a 100644
--- a/src/bin/pg_rewind/pg_rewind.h
+++ b/src/bin/pg_rewind/pg_rewind.h
@@ -16,7 +16,7 @@
#include "datapagemap.h"
#include "libpq-fe.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/* Configuration options */
extern char *datadir_target;
diff --git a/src/bin/pg_upgrade/Makefile b/src/bin/pg_upgrade/Makefile
index 587793e..7f8042f 100644
--- a/src/bin/pg_upgrade/Makefile
+++ b/src/bin/pg_upgrade/Makefile
@@ -19,7 +19,7 @@ OBJS = \
option.o \
parallel.o \
pg_upgrade.o \
- relfilenode.o \
+ relfilenumber.o \
server.o \
tablespace.o \
util.o \
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 36b0670..5d30b87 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -190,9 +190,9 @@ create_rel_filename_map(const char *old_data, const char *new_data,
map->new_tablespace_suffix = new_cluster.tablespace_suffix;
}
- /* DB oid and relfilenodes are preserved between old and new cluster */
+ /* DB oid and relfilenumbers are preserved between old and new cluster */
map->db_oid = old_db->db_oid;
- map->relfilenode = old_rel->relfilenode;
+ map->relfilenumber = old_rel->relfilenumber;
/* used only for logging and error reporting, old/new are identical */
map->nspname = old_rel->nspname;
@@ -399,7 +399,7 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
i_reloid,
i_indtable,
i_toastheap,
- i_relfilenode,
+ i_relfilenumber,
i_reltablespace;
char query[QUERY_ALLOC];
char *last_namespace = NULL,
@@ -495,7 +495,7 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
i_toastheap = PQfnumber(res, "toastheap");
i_nspname = PQfnumber(res, "nspname");
i_relname = PQfnumber(res, "relname");
- i_relfilenode = PQfnumber(res, "relfilenode");
+ i_relfilenumber = PQfnumber(res, "relfilenode");
i_reltablespace = PQfnumber(res, "reltablespace");
i_spclocation = PQfnumber(res, "spclocation");
@@ -527,7 +527,7 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
relname = PQgetvalue(res, relnum, i_relname);
curr->relname = pg_strdup(relname);
- curr->relfilenode = atooid(PQgetvalue(res, relnum, i_relfilenode));
+ curr->relfilenumber = atooid(PQgetvalue(res, relnum, i_relfilenumber));
curr->tblsp_alloc = false;
/* Is the tablespace oid non-default? */
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 55de244..30c3ee6 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -132,15 +132,15 @@ extern char *output_files[];
typedef struct
{
/* Can't use NAMEDATALEN; not guaranteed to be same on client */
- char *nspname; /* namespace name */
- char *relname; /* relation name */
- Oid reloid; /* relation OID */
- Oid relfilenode; /* relation file node */
- Oid indtable; /* if index, OID of its table, else 0 */
- Oid toastheap; /* if toast table, OID of base table, else 0 */
- char *tablespace; /* tablespace path; "" for cluster default */
- bool nsp_alloc; /* should nspname be freed? */
- bool tblsp_alloc; /* should tablespace be freed? */
+ char *nspname; /* namespace name */
+ char *relname; /* relation name */
+ Oid reloid; /* relation OID */
+ RelFileNumber relfilenumber; /* relation file number */
+ Oid indtable; /* if index, OID of its table, else 0 */
+ Oid toastheap; /* if toast table, OID of base table, else 0 */
+ char *tablespace; /* tablespace path; "" for cluster default */
+ bool nsp_alloc; /* should nspname be freed? */
+ bool tblsp_alloc; /* should tablespace be freed? */
} RelInfo;
typedef struct
@@ -159,7 +159,7 @@ typedef struct
const char *old_tablespace_suffix;
const char *new_tablespace_suffix;
Oid db_oid;
- Oid relfilenode;
+ RelFileNumber relfilenumber;
/* the rest are used only for logging and error reporting */
char *nspname; /* namespaces */
char *relname;
@@ -400,7 +400,7 @@ void parseCommandLine(int argc, char *argv[]);
void adjust_data_dir(ClusterInfo *cluster);
void get_sock_dir(ClusterInfo *cluster, bool live_check);
-/* relfilenode.c */
+/* relfilenumber.c */
void transfer_all_new_tablespaces(DbInfoArr *old_db_arr,
DbInfoArr *new_db_arr, char *old_pgdata, char *new_pgdata);
diff --git a/src/bin/pg_upgrade/relfilenode.c b/src/bin/pg_upgrade/relfilenode.c
deleted file mode 100644
index d23ac88..0000000
--- a/src/bin/pg_upgrade/relfilenode.c
+++ /dev/null
@@ -1,259 +0,0 @@
-/*
- * relfilenode.c
- *
- * relfilenode functions
- *
- * Copyright (c) 2010-2022, PostgreSQL Global Development Group
- * src/bin/pg_upgrade/relfilenode.c
- */
-
-#include "postgres_fe.h"
-
-#include <sys/stat.h>
-
-#include "access/transam.h"
-#include "catalog/pg_class_d.h"
-#include "pg_upgrade.h"
-
-static void transfer_single_new_db(FileNameMap *maps, int size, char *old_tablespace);
-static void transfer_relfile(FileNameMap *map, const char *suffix, bool vm_must_add_frozenbit);
-
-
-/*
- * transfer_all_new_tablespaces()
- *
- * Responsible for upgrading all database. invokes routines to generate mappings and then
- * physically link the databases.
- */
-void
-transfer_all_new_tablespaces(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
- char *old_pgdata, char *new_pgdata)
-{
- switch (user_opts.transfer_mode)
- {
- case TRANSFER_MODE_CLONE:
- prep_status_progress("Cloning user relation files");
- break;
- case TRANSFER_MODE_COPY:
- prep_status_progress("Copying user relation files");
- break;
- case TRANSFER_MODE_LINK:
- prep_status_progress("Linking user relation files");
- break;
- }
-
- /*
- * Transferring files by tablespace is tricky because a single database
- * can use multiple tablespaces. For non-parallel mode, we just pass a
- * NULL tablespace path, which matches all tablespaces. In parallel mode,
- * we pass the default tablespace and all user-created tablespaces and let
- * those operations happen in parallel.
- */
- if (user_opts.jobs <= 1)
- parallel_transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata,
- new_pgdata, NULL);
- else
- {
- int tblnum;
-
- /* transfer default tablespace */
- parallel_transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata,
- new_pgdata, old_pgdata);
-
- for (tblnum = 0; tblnum < os_info.num_old_tablespaces; tblnum++)
- parallel_transfer_all_new_dbs(old_db_arr,
- new_db_arr,
- old_pgdata,
- new_pgdata,
- os_info.old_tablespaces[tblnum]);
- /* reap all children */
- while (reap_child(true) == true)
- ;
- }
-
- end_progress_output();
- check_ok();
-}
-
-
-/*
- * transfer_all_new_dbs()
- *
- * Responsible for upgrading all database. invokes routines to generate mappings and then
- * physically link the databases.
- */
-void
-transfer_all_new_dbs(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
- char *old_pgdata, char *new_pgdata, char *old_tablespace)
-{
- int old_dbnum,
- new_dbnum;
-
- /* Scan the old cluster databases and transfer their files */
- for (old_dbnum = new_dbnum = 0;
- old_dbnum < old_db_arr->ndbs;
- old_dbnum++, new_dbnum++)
- {
- DbInfo *old_db = &old_db_arr->dbs[old_dbnum],
- *new_db = NULL;
- FileNameMap *mappings;
- int n_maps;
-
- /*
- * Advance past any databases that exist in the new cluster but not in
- * the old, e.g. "postgres". (The user might have removed the
- * 'postgres' database from the old cluster.)
- */
- for (; new_dbnum < new_db_arr->ndbs; new_dbnum++)
- {
- new_db = &new_db_arr->dbs[new_dbnum];
- if (strcmp(old_db->db_name, new_db->db_name) == 0)
- break;
- }
-
- if (new_dbnum >= new_db_arr->ndbs)
- pg_fatal("old database \"%s\" not found in the new cluster\n",
- old_db->db_name);
-
- mappings = gen_db_file_maps(old_db, new_db, &n_maps, old_pgdata,
- new_pgdata);
- if (n_maps)
- {
- transfer_single_new_db(mappings, n_maps, old_tablespace);
- }
- /* We allocate something even for n_maps == 0 */
- pg_free(mappings);
- }
-}
-
-/*
- * transfer_single_new_db()
- *
- * create links for mappings stored in "maps" array.
- */
-static void
-transfer_single_new_db(FileNameMap *maps, int size, char *old_tablespace)
-{
- int mapnum;
- bool vm_must_add_frozenbit = false;
-
- /*
- * Do we need to rewrite visibilitymap?
- */
- if (old_cluster.controldata.cat_ver < VISIBILITY_MAP_FROZEN_BIT_CAT_VER &&
- new_cluster.controldata.cat_ver >= VISIBILITY_MAP_FROZEN_BIT_CAT_VER)
- vm_must_add_frozenbit = true;
-
- for (mapnum = 0; mapnum < size; mapnum++)
- {
- if (old_tablespace == NULL ||
- strcmp(maps[mapnum].old_tablespace, old_tablespace) == 0)
- {
- /* transfer primary file */
- transfer_relfile(&maps[mapnum], "", vm_must_add_frozenbit);
-
- /*
- * Copy/link any fsm and vm files, if they exist
- */
- transfer_relfile(&maps[mapnum], "_fsm", vm_must_add_frozenbit);
- transfer_relfile(&maps[mapnum], "_vm", vm_must_add_frozenbit);
- }
- }
-}
-
-
-/*
- * transfer_relfile()
- *
- * Copy or link file from old cluster to new one. If vm_must_add_frozenbit
- * is true, visibility map forks are converted and rewritten, even in link
- * mode.
- */
-static void
-transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_frozenbit)
-{
- char old_file[MAXPGPATH];
- char new_file[MAXPGPATH];
- int segno;
- char extent_suffix[65];
- struct stat statbuf;
-
- /*
- * Now copy/link any related segments as well. Remember, PG breaks large
- * files into 1GB segments, the first segment has no extension, subsequent
- * segments are named relfilenode.1, relfilenode.2, relfilenode.3.
- */
- for (segno = 0;; segno++)
- {
- if (segno == 0)
- extent_suffix[0] = '\0';
- else
- snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
-
- snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
- map->old_tablespace,
- map->old_tablespace_suffix,
- map->db_oid,
- map->relfilenode,
- type_suffix,
- extent_suffix);
- snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
- map->new_tablespace,
- map->new_tablespace_suffix,
- map->db_oid,
- map->relfilenode,
- type_suffix,
- extent_suffix);
-
- /* Is it an extent, fsm, or vm file? */
- if (type_suffix[0] != '\0' || segno != 0)
- {
- /* Did file open fail? */
- if (stat(old_file, &statbuf) != 0)
- {
- /* File does not exist? That's OK, just return */
- if (errno == ENOENT)
- return;
- else
- pg_fatal("error while checking for file existence \"%s.%s\" (\"%s\" to \"%s\"): %s\n",
- map->nspname, map->relname, old_file, new_file,
- strerror(errno));
- }
-
- /* If file is empty, just return */
- if (statbuf.st_size == 0)
- return;
- }
-
- unlink(new_file);
-
- /* Copying files might take some time, so give feedback. */
- pg_log(PG_STATUS, "%s", old_file);
-
- if (vm_must_add_frozenbit && strcmp(type_suffix, "_vm") == 0)
- {
- /* Need to rewrite visibility map format */
- pg_log(PG_VERBOSE, "rewriting \"%s\" to \"%s\"\n",
- old_file, new_file);
- rewriteVisibilityMap(old_file, new_file, map->nspname, map->relname);
- }
- else
- switch (user_opts.transfer_mode)
- {
- case TRANSFER_MODE_CLONE:
- pg_log(PG_VERBOSE, "cloning \"%s\" to \"%s\"\n",
- old_file, new_file);
- cloneFile(old_file, new_file, map->nspname, map->relname);
- break;
- case TRANSFER_MODE_COPY:
- pg_log(PG_VERBOSE, "copying \"%s\" to \"%s\"\n",
- old_file, new_file);
- copyFile(old_file, new_file, map->nspname, map->relname);
- break;
- case TRANSFER_MODE_LINK:
- pg_log(PG_VERBOSE, "linking \"%s\" to \"%s\"\n",
- old_file, new_file);
- linkFile(old_file, new_file, map->nspname, map->relname);
- }
- }
-}
diff --git a/src/bin/pg_upgrade/relfilenumber.c b/src/bin/pg_upgrade/relfilenumber.c
new file mode 100644
index 0000000..b3ad820
--- /dev/null
+++ b/src/bin/pg_upgrade/relfilenumber.c
@@ -0,0 +1,259 @@
+/*
+ * relfilenumber.c
+ *
+ * relfilenumber functions
+ *
+ * Copyright (c) 2010-2022, PostgreSQL Global Development Group
+ * src/bin/pg_upgrade/relfilenumber.c
+ */
+
+#include "postgres_fe.h"
+
+#include <sys/stat.h>
+
+#include "access/transam.h"
+#include "catalog/pg_class_d.h"
+#include "pg_upgrade.h"
+
+static void transfer_single_new_db(FileNameMap *maps, int size, char *old_tablespace);
+static void transfer_relfile(FileNameMap *map, const char *suffix, bool vm_must_add_frozenbit);
+
+
+/*
+ * transfer_all_new_tablespaces()
+ *
+ * Responsible for upgrading all databases: invokes routines to generate mappings and then
+ * physically link the databases.
+ */
+void
+transfer_all_new_tablespaces(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
+ char *old_pgdata, char *new_pgdata)
+{
+ switch (user_opts.transfer_mode)
+ {
+ case TRANSFER_MODE_CLONE:
+ prep_status_progress("Cloning user relation files");
+ break;
+ case TRANSFER_MODE_COPY:
+ prep_status_progress("Copying user relation files");
+ break;
+ case TRANSFER_MODE_LINK:
+ prep_status_progress("Linking user relation files");
+ break;
+ }
+
+ /*
+ * Transferring files by tablespace is tricky because a single database
+ * can use multiple tablespaces. For non-parallel mode, we just pass a
+ * NULL tablespace path, which matches all tablespaces. In parallel mode,
+ * we pass the default tablespace and all user-created tablespaces and let
+ * those operations happen in parallel.
+ */
+ if (user_opts.jobs <= 1)
+ parallel_transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata,
+ new_pgdata, NULL);
+ else
+ {
+ int tblnum;
+
+ /* transfer default tablespace */
+ parallel_transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata,
+ new_pgdata, old_pgdata);
+
+ for (tblnum = 0; tblnum < os_info.num_old_tablespaces; tblnum++)
+ parallel_transfer_all_new_dbs(old_db_arr,
+ new_db_arr,
+ old_pgdata,
+ new_pgdata,
+ os_info.old_tablespaces[tblnum]);
+ /* reap all children */
+ while (reap_child(true) == true)
+ ;
+ }
+
+ end_progress_output();
+ check_ok();
+}
+
+
+/*
+ * transfer_all_new_dbs()
+ *
+ * Responsible for upgrading all databases: invokes routines to generate mappings and then
+ * physically link the databases.
+ */
+void
+transfer_all_new_dbs(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
+ char *old_pgdata, char *new_pgdata, char *old_tablespace)
+{
+ int old_dbnum,
+ new_dbnum;
+
+ /* Scan the old cluster databases and transfer their files */
+ for (old_dbnum = new_dbnum = 0;
+ old_dbnum < old_db_arr->ndbs;
+ old_dbnum++, new_dbnum++)
+ {
+ DbInfo *old_db = &old_db_arr->dbs[old_dbnum],
+ *new_db = NULL;
+ FileNameMap *mappings;
+ int n_maps;
+
+ /*
+ * Advance past any databases that exist in the new cluster but not in
+ * the old, e.g. "postgres". (The user might have removed the
+ * 'postgres' database from the old cluster.)
+ */
+ for (; new_dbnum < new_db_arr->ndbs; new_dbnum++)
+ {
+ new_db = &new_db_arr->dbs[new_dbnum];
+ if (strcmp(old_db->db_name, new_db->db_name) == 0)
+ break;
+ }
+
+ if (new_dbnum >= new_db_arr->ndbs)
+ pg_fatal("old database \"%s\" not found in the new cluster\n",
+ old_db->db_name);
+
+ mappings = gen_db_file_maps(old_db, new_db, &n_maps, old_pgdata,
+ new_pgdata);
+ if (n_maps)
+ {
+ transfer_single_new_db(mappings, n_maps, old_tablespace);
+ }
+ /* We allocate something even for n_maps == 0 */
+ pg_free(mappings);
+ }
+}
+
+/*
+ * transfer_single_new_db()
+ *
+ * Create links for the mappings stored in the "maps" array.
+ */
+static void
+transfer_single_new_db(FileNameMap *maps, int size, char *old_tablespace)
+{
+ int mapnum;
+ bool vm_must_add_frozenbit = false;
+
+ /*
+ * Do we need to rewrite visibilitymap?
+ */
+ if (old_cluster.controldata.cat_ver < VISIBILITY_MAP_FROZEN_BIT_CAT_VER &&
+ new_cluster.controldata.cat_ver >= VISIBILITY_MAP_FROZEN_BIT_CAT_VER)
+ vm_must_add_frozenbit = true;
+
+ for (mapnum = 0; mapnum < size; mapnum++)
+ {
+ if (old_tablespace == NULL ||
+ strcmp(maps[mapnum].old_tablespace, old_tablespace) == 0)
+ {
+ /* transfer primary file */
+ transfer_relfile(&maps[mapnum], "", vm_must_add_frozenbit);
+
+ /*
+ * Copy/link any fsm and vm files, if they exist
+ */
+ transfer_relfile(&maps[mapnum], "_fsm", vm_must_add_frozenbit);
+ transfer_relfile(&maps[mapnum], "_vm", vm_must_add_frozenbit);
+ }
+ }
+}
+
+
+/*
+ * transfer_relfile()
+ *
+ * Copy or link file from old cluster to new one. If vm_must_add_frozenbit
+ * is true, visibility map forks are converted and rewritten, even in link
+ * mode.
+ */
+static void
+transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_frozenbit)
+{
+ char old_file[MAXPGPATH];
+ char new_file[MAXPGPATH];
+ int segno;
+ char extent_suffix[65];
+ struct stat statbuf;
+
+ /*
+ * Now copy/link any related segments as well. Remember, PG breaks large
+ * files into 1GB segments, the first segment has no extension, subsequent
+ * segments are named relfilenumber.1, relfilenumber.2, relfilenumber.3.
+ */
+ for (segno = 0;; segno++)
+ {
+ if (segno == 0)
+ extent_suffix[0] = '\0';
+ else
+ snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
+
+ snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
+ map->old_tablespace,
+ map->old_tablespace_suffix,
+ map->db_oid,
+ map->relfilenumber,
+ type_suffix,
+ extent_suffix);
+ snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
+ map->new_tablespace,
+ map->new_tablespace_suffix,
+ map->db_oid,
+ map->relfilenumber,
+ type_suffix,
+ extent_suffix);
+
+ /* Is it an extent, fsm, or vm file? */
+ if (type_suffix[0] != '\0' || segno != 0)
+ {
+ /* Did file open fail? */
+ if (stat(old_file, &statbuf) != 0)
+ {
+ /* File does not exist? That's OK, just return */
+ if (errno == ENOENT)
+ return;
+ else
+ pg_fatal("error while checking for file existence \"%s.%s\" (\"%s\" to \"%s\"): %s\n",
+ map->nspname, map->relname, old_file, new_file,
+ strerror(errno));
+ }
+
+ /* If file is empty, just return */
+ if (statbuf.st_size == 0)
+ return;
+ }
+
+ unlink(new_file);
+
+ /* Copying files might take some time, so give feedback. */
+ pg_log(PG_STATUS, "%s", old_file);
+
+ if (vm_must_add_frozenbit && strcmp(type_suffix, "_vm") == 0)
+ {
+ /* Need to rewrite visibility map format */
+ pg_log(PG_VERBOSE, "rewriting \"%s\" to \"%s\"\n",
+ old_file, new_file);
+ rewriteVisibilityMap(old_file, new_file, map->nspname, map->relname);
+ }
+ else
+ switch (user_opts.transfer_mode)
+ {
+ case TRANSFER_MODE_CLONE:
+ pg_log(PG_VERBOSE, "cloning \"%s\" to \"%s\"\n",
+ old_file, new_file);
+ cloneFile(old_file, new_file, map->nspname, map->relname);
+ break;
+ case TRANSFER_MODE_COPY:
+ pg_log(PG_VERBOSE, "copying \"%s\" to \"%s\"\n",
+ old_file, new_file);
+ copyFile(old_file, new_file, map->nspname, map->relname);
+ break;
+ case TRANSFER_MODE_LINK:
+ pg_log(PG_VERBOSE, "linking \"%s\" to \"%s\"\n",
+ old_file, new_file);
+ linkFile(old_file, new_file, map->nspname, map->relname);
+ }
+ }
+}
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 5dc6010..6528113 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -37,7 +37,7 @@ static const char *progname;
static int WalSegSz;
static volatile sig_atomic_t time_to_stop = false;
-static const RelFileNode emptyRelFileNode = {0, 0, 0};
+static const RelFileLocator emptyRelFileLocator = {0, 0, 0};
typedef struct XLogDumpPrivate
{
@@ -63,7 +63,7 @@ typedef struct XLogDumpConfig
bool filter_by_rmgr_enabled;
TransactionId filter_by_xid;
bool filter_by_xid_enabled;
- RelFileNode filter_by_relation;
+ RelFileLocator filter_by_relation;
bool filter_by_extended;
bool filter_by_relation_enabled;
BlockNumber filter_by_relation_block;
@@ -393,7 +393,7 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
*/
static bool
XLogRecordMatchesRelationBlock(XLogReaderState *record,
- RelFileNode matchRnode,
+ RelFileLocator matchRlocator,
BlockNumber matchBlock,
ForkNumber matchFork)
{
@@ -401,17 +401,17 @@ XLogRecordMatchesRelationBlock(XLogReaderState *record,
for (block_id = 0; block_id <= XLogRecMaxBlockId(record); block_id++)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blk;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blk, NULL))
+ &rlocator, &forknum, &blk, NULL))
continue;
if ((matchFork == InvalidForkNumber || matchFork == forknum) &&
- (RelFileNodeEquals(matchRnode, emptyRelFileNode) ||
- RelFileNodeEquals(matchRnode, rnode)) &&
+ (RelFileLocatorEquals(matchRlocator, emptyRelFileLocator) ||
+ RelFileLocatorEquals(matchRlocator, rlocator)) &&
(matchBlock == InvalidBlockNumber || matchBlock == blk))
return true;
}
@@ -885,11 +885,11 @@ main(int argc, char **argv)
break;
case 'R':
if (sscanf(optarg, "%u/%u/%u",
- &config.filter_by_relation.spcNode,
- &config.filter_by_relation.dbNode,
- &config.filter_by_relation.relNode) != 3 ||
- !OidIsValid(config.filter_by_relation.spcNode) ||
- !OidIsValid(config.filter_by_relation.relNode))
+ &config.filter_by_relation.spcOid,
+ &config.filter_by_relation.dbOid,
+ &config.filter_by_relation.relNumber) != 3 ||
+ !OidIsValid(config.filter_by_relation.spcOid) ||
+ !RelFileNumberIsValid(config.filter_by_relation.relNumber))
{
pg_log_error("invalid relation specification: \"%s\"", optarg);
pg_log_error_detail("Expecting \"tablespace OID/database OID/relation filenode\".");
@@ -1132,7 +1132,7 @@ main(int argc, char **argv)
!XLogRecordMatchesRelationBlock(xlogreader_state,
config.filter_by_relation_enabled ?
config.filter_by_relation :
- emptyRelFileNode,
+ emptyRelFileLocator,
config.filter_by_relation_block_enabled ?
config.filter_by_relation_block :
InvalidBlockNumber,
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 636c96e..1b6b620 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -107,24 +107,24 @@ forkname_chars(const char *str, ForkNumber *fork)
* XXX this must agree with GetRelationPath()!
*/
char *
-GetDatabasePath(Oid dbNode, Oid spcNode)
+GetDatabasePath(Oid dbOid, Oid spcOid)
{
- if (spcNode == GLOBALTABLESPACE_OID)
+ if (spcOid == GLOBALTABLESPACE_OID)
{
/* Shared system relations live in {datadir}/global */
- Assert(dbNode == 0);
+ Assert(dbOid == 0);
return pstrdup("global");
}
- else if (spcNode == DEFAULTTABLESPACE_OID)
+ else if (spcOid == DEFAULTTABLESPACE_OID)
{
/* The default tablespace is {datadir}/base */
- return psprintf("base/%u", dbNode);
+ return psprintf("base/%u", dbOid);
}
else
{
/* All other tablespaces are accessed via symlinks */
return psprintf("pg_tblspc/%u/%s/%u",
- spcNode, TABLESPACE_VERSION_DIRECTORY, dbNode);
+ spcOid, TABLESPACE_VERSION_DIRECTORY, dbOid);
}
}
@@ -138,44 +138,44 @@ GetDatabasePath(Oid dbNode, Oid spcNode)
* the trouble considering BackendId is just int anyway.
*/
char *
-GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
+GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
int backendId, ForkNumber forkNumber)
{
char *path;
- if (spcNode == GLOBALTABLESPACE_OID)
+ if (spcOid == GLOBALTABLESPACE_OID)
{
/* Shared system relations live in {datadir}/global */
- Assert(dbNode == 0);
+ Assert(dbOid == 0);
Assert(backendId == InvalidBackendId);
if (forkNumber != MAIN_FORKNUM)
path = psprintf("global/%u_%s",
- relNode, forkNames[forkNumber]);
+ relNumber, forkNames[forkNumber]);
else
- path = psprintf("global/%u", relNode);
+ path = psprintf("global/%u", relNumber);
}
- else if (spcNode == DEFAULTTABLESPACE_OID)
+ else if (spcOid == DEFAULTTABLESPACE_OID)
{
/* The default tablespace is {datadir}/base */
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/%u_%s",
- dbNode, relNode,
+ dbOid, relNumber,
forkNames[forkNumber]);
else
path = psprintf("base/%u/%u",
- dbNode, relNode);
+ dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/t%d_%u_%s",
- dbNode, backendId, relNode,
+ dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
path = psprintf("base/%u/t%d_%u",
- dbNode, backendId, relNode);
+ dbOid, backendId, relNumber);
}
}
else
@@ -185,25 +185,25 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
{
if (forkNumber != MAIN_FORKNUM)
path = psprintf("pg_tblspc/%u/%s/%u/%u_%s",
- spcNode, TABLESPACE_VERSION_DIRECTORY,
- dbNode, relNode,
+ spcOid, TABLESPACE_VERSION_DIRECTORY,
+ dbOid, relNumber,
forkNames[forkNumber]);
else
path = psprintf("pg_tblspc/%u/%s/%u/%u",
- spcNode, TABLESPACE_VERSION_DIRECTORY,
- dbNode, relNode);
+ spcOid, TABLESPACE_VERSION_DIRECTORY,
+ dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u_%s",
- spcNode, TABLESPACE_VERSION_DIRECTORY,
- dbNode, backendId, relNode,
+ spcOid, TABLESPACE_VERSION_DIRECTORY,
+ dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u",
- spcNode, TABLESPACE_VERSION_DIRECTORY,
- dbNode, backendId, relNode);
+ spcOid, TABLESPACE_VERSION_DIRECTORY,
+ dbOid, backendId, relNumber);
}
}
return path;
diff --git a/src/include/access/brin_xlog.h b/src/include/access/brin_xlog.h
index 95bfc7e..012a9af 100644
--- a/src/include/access/brin_xlog.h
+++ b/src/include/access/brin_xlog.h
@@ -18,7 +18,7 @@
#include "lib/stringinfo.h"
#include "storage/bufpage.h"
#include "storage/itemptr.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
diff --git a/src/include/access/ginxlog.h b/src/include/access/ginxlog.h
index 21de389..7f98503 100644
--- a/src/include/access/ginxlog.h
+++ b/src/include/access/ginxlog.h
@@ -110,7 +110,7 @@ typedef struct
typedef struct ginxlogSplit
{
- RelFileNode node;
+ RelFileLocator locator;
BlockNumber rrlink; /* right link, or root's blocknumber if root
* split */
BlockNumber leftChildBlkno; /* valid on a non-leaf split */
@@ -167,7 +167,7 @@ typedef struct ginxlogDeletePage
*/
typedef struct ginxlogUpdateMeta
{
- RelFileNode node;
+ RelFileLocator locator;
GinMetaPageData metadata;
BlockNumber prevTail;
BlockNumber newRightlink;
diff --git a/src/include/access/gistxlog.h b/src/include/access/gistxlog.h
index 4537e67..9bbe4c2 100644
--- a/src/include/access/gistxlog.h
+++ b/src/include/access/gistxlog.h
@@ -97,7 +97,7 @@ typedef struct gistxlogPageDelete
*/
typedef struct gistxlogPageReuse
{
- RelFileNode node;
+ RelFileLocator locator;
BlockNumber block;
FullTransactionId latestRemovedFullXid;
} gistxlogPageReuse;
diff --git a/src/include/access/heapam_xlog.h b/src/include/access/heapam_xlog.h
index 2d8a7f6..1705e73 100644
--- a/src/include/access/heapam_xlog.h
+++ b/src/include/access/heapam_xlog.h
@@ -19,7 +19,7 @@
#include "lib/stringinfo.h"
#include "storage/buf.h"
#include "storage/bufpage.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
@@ -370,9 +370,9 @@ typedef struct xl_heap_new_cid
CommandId combocid; /* just for debugging */
/*
- * Store the relfilenode/ctid pair to facilitate lookups.
+ * Store the relfilelocator/ctid pair to facilitate lookups.
*/
- RelFileNode target_node;
+ RelFileLocator target_locator;
ItemPointerData target_tid;
} xl_heap_new_cid;
@@ -415,7 +415,7 @@ extern bool heap_prepare_freeze_tuple(HeapTupleHeader tuple,
MultiXactId *relminmxid_out);
extern void heap_execute_freeze_tuple(HeapTupleHeader tuple,
xl_heap_freeze_tuple *xlrec_tp);
-extern XLogRecPtr log_heap_visible(RelFileNode rnode, Buffer heap_buffer,
+extern XLogRecPtr log_heap_visible(RelFileLocator rlocator, Buffer heap_buffer,
Buffer vm_buffer, TransactionId cutoff_xid, uint8 flags);
#endif /* HEAPAM_XLOG_H */
diff --git a/src/include/access/nbtxlog.h b/src/include/access/nbtxlog.h
index de362d3..d79489e 100644
--- a/src/include/access/nbtxlog.h
+++ b/src/include/access/nbtxlog.h
@@ -180,12 +180,12 @@ typedef struct xl_btree_dedup
* This is what we need to know about page reuse within btree. This record
* only exists to generate a conflict point for Hot Standby.
*
- * Note that we must include a RelFileNode in the record because we don't
+ * Note that we must include a RelFileLocator in the record because we don't
* actually register the buffer with the record.
*/
typedef struct xl_btree_reuse_page
{
- RelFileNode node;
+ RelFileLocator locator;
BlockNumber block;
FullTransactionId latestRemovedFullXid;
} xl_btree_reuse_page;
diff --git a/src/include/access/rewriteheap.h b/src/include/access/rewriteheap.h
index 3e27790..353cbb2 100644
--- a/src/include/access/rewriteheap.h
+++ b/src/include/access/rewriteheap.h
@@ -15,7 +15,7 @@
#include "access/htup.h"
#include "storage/itemptr.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
/* struct definition is private to rewriteheap.c */
@@ -34,8 +34,8 @@ extern bool rewrite_heap_dead_tuple(RewriteState state, HeapTuple oldTuple);
*/
typedef struct LogicalRewriteMappingData
{
- RelFileNode old_node;
- RelFileNode new_node;
+ RelFileLocator old_locator;
+ RelFileLocator new_locator;
ItemPointerData old_tid;
ItemPointerData new_tid;
} LogicalRewriteMappingData;
diff --git a/src/include/access/tableam.h b/src/include/access/tableam.h
index fe869c6..54f54fe 100644
--- a/src/include/access/tableam.h
+++ b/src/include/access/tableam.h
@@ -560,32 +560,32 @@ typedef struct TableAmRoutine
*/
/*
- * This callback needs to create a new relation filenode for `rel`, with
+ * This callback needs to create a new relation storage for `rel`, with
* appropriate durability behaviour for `persistence`.
*
* Note that only the subset of the relcache filled by
* RelationBuildLocalRelation() can be relied upon and that the relation's
* catalog entries will either not yet exist (new relation), or will still
- * reference the old relfilenode.
+ * reference the old relfilelocator.
*
* As output *freezeXid, *minmulti must be set to the values appropriate
* for pg_class.{relfrozenxid, relminmxid}. For AMs that don't need those
* fields to be filled they can be set to InvalidTransactionId and
* InvalidMultiXactId, respectively.
*
- * See also table_relation_set_new_filenode().
+ * See also table_relation_set_new_filelocator().
*/
- void (*relation_set_new_filenode) (Relation rel,
- const RelFileNode *newrnode,
- char persistence,
- TransactionId *freezeXid,
- MultiXactId *minmulti);
+ void (*relation_set_new_filelocator) (Relation rel,
+ const RelFileLocator *newrlocator,
+ char persistence,
+ TransactionId *freezeXid,
+ MultiXactId *minmulti);
/*
* This callback needs to remove all contents from `rel`'s current
- * relfilenode. No provisions for transactional behaviour need to be made.
- * Often this can be implemented by truncating the underlying storage to
- * its minimal size.
+ * relfilelocator. No provisions for transactional behaviour need to be
+ * made. Often this can be implemented by truncating the underlying
+ * storage to its minimal size.
*
* See also table_relation_nontransactional_truncate().
*/
@@ -598,7 +598,7 @@ typedef struct TableAmRoutine
* storage, unless it contains references to the tablespace internally.
*/
void (*relation_copy_data) (Relation rel,
- const RelFileNode *newrnode);
+ const RelFileLocator *newrlocator);
/* See table_relation_copy_for_cluster() */
void (*relation_copy_for_cluster) (Relation NewTable,
@@ -1348,7 +1348,7 @@ table_index_delete_tuples(Relation rel, TM_IndexDeleteOp *delstate)
* RelationGetBufferForTuple. See that method for more information.
*
* TABLE_INSERT_FROZEN should only be specified for inserts into
- * relfilenodes created during the current subtransaction and when
+ * relfilenumbers created during the current subtransaction and when
* there are no prior snapshots or pre-existing portals open.
* This causes rows to be frozen, which is an MVCC violation and
* requires explicit options chosen by user.
@@ -1577,33 +1577,34 @@ table_finish_bulk_insert(Relation rel, int options)
*/
/*
- * Create storage for `rel` in `newrnode`, with persistence set to
+ * Create storage for `rel` in `newrlocator`, with persistence set to
* `persistence`.
*
* This is used both during relation creation and various DDL operations to
- * create a new relfilenode that can be filled from scratch. When creating
- * new storage for an existing relfilenode, this should be called before the
+ * create new storage that can be filled from scratch. When creating
+ * new storage for an existing relfilelocator, this should be called before the
* relcache entry has been updated.
*
* *freezeXid, *minmulti are set to the xid / multixact horizon for the table
* that pg_class.{relfrozenxid, relminmxid} have to be set to.
*/
static inline void
-table_relation_set_new_filenode(Relation rel,
- const RelFileNode *newrnode,
- char persistence,
- TransactionId *freezeXid,
- MultiXactId *minmulti)
+table_relation_set_new_filelocator(Relation rel,
+ const RelFileLocator *newrlocator,
+ char persistence,
+ TransactionId *freezeXid,
+ MultiXactId *minmulti)
{
- rel->rd_tableam->relation_set_new_filenode(rel, newrnode, persistence,
- freezeXid, minmulti);
+ rel->rd_tableam->relation_set_new_filelocator(rel, newrlocator,
+ persistence, freezeXid,
+ minmulti);
}
/*
* Remove all table contents from `rel`, in a non-transactional manner.
* Non-transactional meaning that there's no need to support rollbacks. This
- * commonly only is used to perform truncations for relfilenodes created in the
- * current transaction.
+ * commonly only is used to perform truncations for relation storage created in
+ * the current transaction.
*/
static inline void
table_relation_nontransactional_truncate(Relation rel)
@@ -1612,15 +1613,15 @@ table_relation_nontransactional_truncate(Relation rel)
}
/*
- * Copy data from `rel` into the new relfilenode `newrnode`. The new
- * relfilenode may not have storage associated before this function is
+ * Copy data from `rel` into the new relfilelocator `newrlocator`. The new
+ * relfilelocator may not have storage associated before this function is
* called. This is only supposed to be used for low level operations like
* changing a relation's tablespace.
*/
static inline void
-table_relation_copy_data(Relation rel, const RelFileNode *newrnode)
+table_relation_copy_data(Relation rel, const RelFileLocator *newrlocator)
{
- rel->rd_tableam->relation_copy_data(rel, newrnode);
+ rel->rd_tableam->relation_copy_data(rel, newrlocator);
}
/*
diff --git a/src/include/access/xact.h b/src/include/access/xact.h
index 4794941..7d2b352 100644
--- a/src/include/access/xact.h
+++ b/src/include/access/xact.h
@@ -19,7 +19,7 @@
#include "datatype/timestamp.h"
#include "lib/stringinfo.h"
#include "nodes/pg_list.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/sinval.h"
/*
@@ -174,7 +174,7 @@ typedef struct SavedTransactionCharacteristics
*/
#define XACT_XINFO_HAS_DBINFO (1U << 0)
#define XACT_XINFO_HAS_SUBXACTS (1U << 1)
-#define XACT_XINFO_HAS_RELFILENODES (1U << 2)
+#define XACT_XINFO_HAS_RELFILELOCATORS (1U << 2)
#define XACT_XINFO_HAS_INVALS (1U << 3)
#define XACT_XINFO_HAS_TWOPHASE (1U << 4)
#define XACT_XINFO_HAS_ORIGIN (1U << 5)
@@ -252,12 +252,12 @@ typedef struct xl_xact_subxacts
} xl_xact_subxacts;
#define MinSizeOfXactSubxacts offsetof(xl_xact_subxacts, subxacts)
-typedef struct xl_xact_relfilenodes
+typedef struct xl_xact_relfilelocators
{
int nrels; /* number of relations */
- RelFileNode xnodes[FLEXIBLE_ARRAY_MEMBER];
-} xl_xact_relfilenodes;
-#define MinSizeOfXactRelfilenodes offsetof(xl_xact_relfilenodes, xnodes)
+ RelFileLocator xlocators[FLEXIBLE_ARRAY_MEMBER];
+} xl_xact_relfilelocators;
+#define MinSizeOfXactRelfileLocators offsetof(xl_xact_relfilelocators, xlocators)
/*
* A transactionally dropped statistics entry.
@@ -305,7 +305,7 @@ typedef struct xl_xact_commit
/* xl_xact_xinfo follows if XLOG_XACT_HAS_INFO */
/* xl_xact_dbinfo follows if XINFO_HAS_DBINFO */
/* xl_xact_subxacts follows if XINFO_HAS_SUBXACT */
- /* xl_xact_relfilenodes follows if XINFO_HAS_RELFILENODES */
+ /* xl_xact_relfilelocators follows if XINFO_HAS_RELFILELOCATORS */
/* xl_xact_stats_items follows if XINFO_HAS_DROPPED_STATS */
/* xl_xact_invals follows if XINFO_HAS_INVALS */
/* xl_xact_twophase follows if XINFO_HAS_TWOPHASE */
@@ -321,7 +321,7 @@ typedef struct xl_xact_abort
/* xl_xact_xinfo follows if XLOG_XACT_HAS_INFO */
/* xl_xact_dbinfo follows if XINFO_HAS_DBINFO */
/* xl_xact_subxacts follows if XINFO_HAS_SUBXACT */
- /* xl_xact_relfilenodes follows if XINFO_HAS_RELFILENODES */
+ /* xl_xact_relfilelocators follows if XINFO_HAS_RELFILELOCATORS */
/* xl_xact_stats_items follows if XINFO_HAS_DROPPED_STATS */
/* No invalidation messages needed. */
/* xl_xact_twophase follows if XINFO_HAS_TWOPHASE */
@@ -367,7 +367,7 @@ typedef struct xl_xact_parsed_commit
TransactionId *subxacts;
int nrels;
- RelFileNode *xnodes;
+ RelFileLocator *xlocators;
int nstats;
xl_xact_stats_item *stats;
@@ -378,7 +378,7 @@ typedef struct xl_xact_parsed_commit
TransactionId twophase_xid; /* only for 2PC */
char twophase_gid[GIDSIZE]; /* only for 2PC */
int nabortrels; /* only for 2PC */
- RelFileNode *abortnodes; /* only for 2PC */
+ RelFileLocator *abortlocators; /* only for 2PC */
int nabortstats; /* only for 2PC */
xl_xact_stats_item *abortstats; /* only for 2PC */
@@ -400,7 +400,7 @@ typedef struct xl_xact_parsed_abort
TransactionId *subxacts;
int nrels;
- RelFileNode *xnodes;
+ RelFileLocator *xlocators;
int nstats;
xl_xact_stats_item *stats;
@@ -483,7 +483,7 @@ extern int xactGetCommittedChildren(TransactionId **ptr);
extern XLogRecPtr XactLogCommitRecord(TimestampTz commit_time,
int nsubxacts, TransactionId *subxacts,
- int nrels, RelFileNode *rels,
+ int nrels, RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
int nmsgs, SharedInvalidationMessage *msgs,
@@ -494,7 +494,7 @@ extern XLogRecPtr XactLogCommitRecord(TimestampTz commit_time,
extern XLogRecPtr XactLogAbortRecord(TimestampTz abort_time,
int nsubxacts, TransactionId *subxacts,
- int nrels, RelFileNode *rels,
+ int nrels, RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
int xactflags, TransactionId twophase_xid,
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index fae0bef..3524c39 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,7 +25,7 @@
#include "lib/stringinfo.h"
#include "pgtime.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
diff --git a/src/include/access/xloginsert.h b/src/include/access/xloginsert.h
index 5fc340c..c04f77b 100644
--- a/src/include/access/xloginsert.h
+++ b/src/include/access/xloginsert.h
@@ -15,7 +15,7 @@
#include "access/xlogdefs.h"
#include "storage/block.h"
#include "storage/buf.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
/*
@@ -45,16 +45,16 @@ extern XLogRecPtr XLogInsert(RmgrId rmid, uint8 info);
extern void XLogEnsureRecordSpace(int max_block_id, int ndatas);
extern void XLogRegisterData(char *data, int len);
extern void XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags);
-extern void XLogRegisterBlock(uint8 block_id, RelFileNode *rnode,
+extern void XLogRegisterBlock(uint8 block_id, RelFileLocator *rlocator,
ForkNumber forknum, BlockNumber blknum, char *page,
uint8 flags);
extern void XLogRegisterBufData(uint8 block_id, char *data, int len);
extern void XLogResetInsertion(void);
extern bool XLogCheckBufferNeedsBackup(Buffer buffer);
-extern XLogRecPtr log_newpage(RelFileNode *rnode, ForkNumber forkNum,
+extern XLogRecPtr log_newpage(RelFileLocator *rlocator, ForkNumber forkNum,
BlockNumber blk, char *page, bool page_std);
-extern void log_newpages(RelFileNode *rnode, ForkNumber forkNum, int num_pages,
+extern void log_newpages(RelFileLocator *rlocator, ForkNumber forkNum, int num_pages,
BlockNumber *blknos, char **pages, bool page_std);
extern XLogRecPtr log_newpage_buffer(Buffer buffer, bool page_std);
extern void log_newpage_range(Relation rel, ForkNumber forkNum,
diff --git a/src/include/access/xlogreader.h b/src/include/access/xlogreader.h
index e73ea4a..5395f15 100644
--- a/src/include/access/xlogreader.h
+++ b/src/include/access/xlogreader.h
@@ -122,7 +122,7 @@ typedef struct
bool in_use;
/* Identify the block this refers to */
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blkno;
@@ -430,10 +430,10 @@ extern FullTransactionId XLogRecGetFullXid(XLogReaderState *record);
extern bool RestoreBlockImage(XLogReaderState *record, uint8 block_id, char *page);
extern char *XLogRecGetBlockData(XLogReaderState *record, uint8 block_id, Size *len);
extern void XLogRecGetBlockTag(XLogReaderState *record, uint8 block_id,
- RelFileNode *rnode, ForkNumber *forknum,
+ RelFileLocator *rlocator, ForkNumber *forknum,
BlockNumber *blknum);
extern bool XLogRecGetBlockTagExtended(XLogReaderState *record, uint8 block_id,
- RelFileNode *rnode, ForkNumber *forknum,
+ RelFileLocator *rlocator, ForkNumber *forknum,
BlockNumber *blknum,
Buffer *prefetch_buffer);
diff --git a/src/include/access/xlogrecord.h b/src/include/access/xlogrecord.h
index 052ac68..7e467ef 100644
--- a/src/include/access/xlogrecord.h
+++ b/src/include/access/xlogrecord.h
@@ -15,7 +15,7 @@
#include "access/xlogdefs.h"
#include "port/pg_crc32c.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* The overall layout of an XLOG record is:
@@ -97,7 +97,7 @@ typedef struct XLogRecordBlockHeader
* image) */
/* If BKPBLOCK_HAS_IMAGE, an XLogRecordBlockImageHeader struct follows */
- /* If BKPBLOCK_SAME_REL is not set, a RelFileNode follows */
+ /* If BKPBLOCK_SAME_REL is not set, a RelFileLocator follows */
/* BlockNumber follows */
} XLogRecordBlockHeader;
@@ -175,7 +175,7 @@ typedef struct XLogRecordBlockCompressHeader
(SizeOfXLogRecordBlockHeader + \
SizeOfXLogRecordBlockImageHeader + \
SizeOfXLogRecordBlockCompressHeader + \
- sizeof(RelFileNode) + \
+ sizeof(RelFileLocator) + \
sizeof(BlockNumber))
/*
@@ -187,7 +187,7 @@ typedef struct XLogRecordBlockCompressHeader
#define BKPBLOCK_HAS_IMAGE 0x10 /* block data is an XLogRecordBlockImage */
#define BKPBLOCK_HAS_DATA 0x20
#define BKPBLOCK_WILL_INIT 0x40 /* redo will re-init the page */
-#define BKPBLOCK_SAME_REL 0x80 /* RelFileNode omitted, same as previous */
+#define BKPBLOCK_SAME_REL 0x80 /* RelFileLocator omitted, same as previous */
/*
* XLogRecordDataHeaderShort/Long are used for the "main data" portion of
diff --git a/src/include/access/xlogutils.h b/src/include/access/xlogutils.h
index c9d0b75..ef18297 100644
--- a/src/include/access/xlogutils.h
+++ b/src/include/access/xlogutils.h
@@ -60,9 +60,9 @@ extern PGDLLIMPORT HotStandbyState standbyState;
extern bool XLogHaveInvalidPages(void);
extern void XLogCheckInvalidPages(void);
-extern void XLogDropRelation(RelFileNode rnode, ForkNumber forknum);
+extern void XLogDropRelation(RelFileLocator rlocator, ForkNumber forknum);
extern void XLogDropDatabase(Oid dbid);
-extern void XLogTruncateRelation(RelFileNode rnode, ForkNumber forkNum,
+extern void XLogTruncateRelation(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber nblocks);
/* Result codes for XLogReadBufferForRedo[Extended] */
@@ -89,11 +89,11 @@ extern XLogRedoAction XLogReadBufferForRedoExtended(XLogReaderState *record,
ReadBufferMode mode, bool get_cleanup_lock,
Buffer *buf);
-extern Buffer XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
+extern Buffer XLogReadBufferExtended(RelFileLocator rlocator, ForkNumber forknum,
BlockNumber blkno, ReadBufferMode mode,
Buffer recent_buffer);
-extern Relation CreateFakeRelcacheEntry(RelFileNode rnode);
+extern Relation CreateFakeRelcacheEntry(RelFileLocator rlocator);
extern void FreeFakeRelcacheEntry(Relation fakerel);
extern int read_local_xlog_page(XLogReaderState *state,
diff --git a/src/include/catalog/binary_upgrade.h b/src/include/catalog/binary_upgrade.h
index 0b6944b..fd93442 100644
--- a/src/include/catalog/binary_upgrade.h
+++ b/src/include/catalog/binary_upgrade.h
@@ -22,11 +22,11 @@ extern PGDLLIMPORT Oid binary_upgrade_next_mrng_pg_type_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_mrng_array_pg_type_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_heap_pg_class_oid;
-extern PGDLLIMPORT Oid binary_upgrade_next_heap_pg_class_relfilenode;
+extern PGDLLIMPORT RelFileNumber binary_upgrade_next_heap_pg_class_relfilenumber;
extern PGDLLIMPORT Oid binary_upgrade_next_index_pg_class_oid;
-extern PGDLLIMPORT Oid binary_upgrade_next_index_pg_class_relfilenode;
+extern PGDLLIMPORT RelFileNumber binary_upgrade_next_index_pg_class_relfilenumber;
extern PGDLLIMPORT Oid binary_upgrade_next_toast_pg_class_oid;
-extern PGDLLIMPORT Oid binary_upgrade_next_toast_pg_class_relfilenode;
+extern PGDLLIMPORT RelFileNumber binary_upgrade_next_toast_pg_class_relfilenumber;
extern PGDLLIMPORT Oid binary_upgrade_next_pg_enum_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_pg_authid_oid;
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index 60c1215..66900f1 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -38,7 +38,8 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
-extern Oid GetNewRelFileNode(Oid reltablespace, Relation pg_class,
- char relpersistence);
+extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
+ Relation pg_class,
+ char relpersistence);
#endif /* CATALOG_H */
diff --git a/src/include/catalog/heap.h b/src/include/catalog/heap.h
index 07c5b88..5774c46 100644
--- a/src/include/catalog/heap.h
+++ b/src/include/catalog/heap.h
@@ -50,7 +50,7 @@ extern Relation heap_create(const char *relname,
Oid relnamespace,
Oid reltablespace,
Oid relid,
- Oid relfilenode,
+ RelFileNumber relfilenumber,
Oid accessmtd,
TupleDesc tupDesc,
char relkind,
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index a1d6e3b..1bdb00a 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -71,7 +71,7 @@ extern Oid index_create(Relation heapRelation,
Oid indexRelationId,
Oid parentIndexRelid,
Oid parentConstraintId,
- Oid relFileNode,
+ RelFileNumber relFileNumber,
IndexInfo *indexInfo,
List *indexColNames,
Oid accessMethodObjectId,
diff --git a/src/include/catalog/storage.h b/src/include/catalog/storage.h
index 59f3404..9964c31 100644
--- a/src/include/catalog/storage.h
+++ b/src/include/catalog/storage.h
@@ -15,23 +15,23 @@
#define STORAGE_H
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "utils/relcache.h"
/* GUC variables */
extern PGDLLIMPORT int wal_skip_threshold;
-extern SMgrRelation RelationCreateStorage(RelFileNode rnode,
+extern SMgrRelation RelationCreateStorage(RelFileLocator rlocator,
char relpersistence,
bool register_delete);
extern void RelationDropStorage(Relation rel);
-extern void RelationPreserveStorage(RelFileNode rnode, bool atCommit);
+extern void RelationPreserveStorage(RelFileLocator rlocator, bool atCommit);
extern void RelationPreTruncate(Relation rel);
extern void RelationTruncate(Relation rel, BlockNumber nblocks);
extern void RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
ForkNumber forkNum, char relpersistence);
-extern bool RelFileNodeSkippingWAL(RelFileNode rnode);
+extern bool RelFileLocatorSkippingWAL(RelFileLocator rlocator);
extern Size EstimatePendingSyncsSpace(void);
extern void SerializePendingSyncs(Size maxSize, char *startAddress);
extern void RestorePendingSyncs(char *startAddress);
@@ -42,7 +42,7 @@ extern void RestorePendingSyncs(char *startAddress);
*/
extern void smgrDoPendingDeletes(bool isCommit);
extern void smgrDoPendingSyncs(bool isCommit, bool isParallelWorker);
-extern int smgrGetPendingDeletes(bool forCommit, RelFileNode **ptr);
+extern int smgrGetPendingDeletes(bool forCommit, RelFileLocator **ptr);
extern void AtSubCommit_smgr(void);
extern void AtSubAbort_smgr(void);
extern void PostPrepare_smgr(void);
diff --git a/src/include/catalog/storage_xlog.h b/src/include/catalog/storage_xlog.h
index 622de22..44a5e20 100644
--- a/src/include/catalog/storage_xlog.h
+++ b/src/include/catalog/storage_xlog.h
@@ -17,7 +17,7 @@
#include "access/xlogreader.h"
#include "lib/stringinfo.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* Declarations for smgr-related XLOG records
@@ -32,7 +32,7 @@
typedef struct xl_smgr_create
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forkNum;
} xl_smgr_create;
@@ -46,11 +46,11 @@ typedef struct xl_smgr_create
typedef struct xl_smgr_truncate
{
BlockNumber blkno;
- RelFileNode rnode;
+ RelFileLocator rlocator;
int flags;
} xl_smgr_truncate;
-extern void log_smgrcreate(const RelFileNode *rnode, ForkNumber forkNum);
+extern void log_smgrcreate(const RelFileLocator *rlocator, ForkNumber forkNum);
extern void smgr_redo(XLogReaderState *record);
extern void smgr_desc(StringInfo buf, XLogReaderState *record);
diff --git a/src/include/commands/sequence.h b/src/include/commands/sequence.h
index 9da2300..d38c0e2 100644
--- a/src/include/commands/sequence.h
+++ b/src/include/commands/sequence.h
@@ -19,7 +19,7 @@
#include "lib/stringinfo.h"
#include "nodes/parsenodes.h"
#include "parser/parse_node.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
typedef struct FormData_pg_sequence_data
@@ -47,7 +47,7 @@ typedef FormData_pg_sequence_data *Form_pg_sequence_data;
typedef struct xl_seq_rec
{
- RelFileNode node;
+ RelFileLocator locator;
/* SEQUENCE TUPLE DATA FOLLOWS AT THE END */
} xl_seq_rec;
diff --git a/src/include/commands/tablecmds.h b/src/include/commands/tablecmds.h
index 5d4037f..0c48654 100644
--- a/src/include/commands/tablecmds.h
+++ b/src/include/commands/tablecmds.h
@@ -66,7 +66,7 @@ extern void SetRelationHasSubclass(Oid relationId, bool relhassubclass);
extern bool CheckRelationTableSpaceMove(Relation rel, Oid newTableSpaceId);
extern void SetRelationTableSpace(Relation rel, Oid newTableSpaceId,
- Oid newRelFileNode);
+ RelFileNumber newRelFileNumber);
extern ObjectAddress renameatt(RenameStmt *stmt);
diff --git a/src/include/commands/tablespace.h b/src/include/commands/tablespace.h
index 24b6473..1f80907 100644
--- a/src/include/commands/tablespace.h
+++ b/src/include/commands/tablespace.h
@@ -50,7 +50,7 @@ extern void DropTableSpace(DropTableSpaceStmt *stmt);
extern ObjectAddress RenameTableSpace(const char *oldname, const char *newname);
extern Oid AlterTableSpaceOptions(AlterTableSpaceOptionsStmt *stmt);
-extern void TablespaceCreateDbspace(Oid spcNode, Oid dbNode, bool isRedo);
+extern void TablespaceCreateDbspace(Oid spcOid, Oid dbOid, bool isRedo);
extern Oid GetDefaultTablespace(char relpersistence, bool partitioned);
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index 13849a3..3ab7132 100644
--- a/src/include/common/relpath.h
+++ b/src/include/common/relpath.h
@@ -64,27 +64,27 @@ extern int forkname_chars(const char *str, ForkNumber *fork);
/*
* Stuff for computing filesystem pathnames for relations.
*/
-extern char *GetDatabasePath(Oid dbNode, Oid spcNode);
+extern char *GetDatabasePath(Oid dbOid, Oid spcOid);
-extern char *GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
+extern char *GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
int backendId, ForkNumber forkNumber);
/*
* Wrapper macros for GetRelationPath. Beware of multiple
- * evaluation of the RelFileNode or RelFileNodeBackend argument!
+ * evaluation of the RelFileLocator or RelFileLocatorBackend argument!
*/
-/* First argument is a RelFileNode */
-#define relpathbackend(rnode, backend, forknum) \
- GetRelationPath((rnode).dbNode, (rnode).spcNode, (rnode).relNode, \
+/* First argument is a RelFileLocator */
+#define relpathbackend(rlocator, backend, forknum) \
+ GetRelationPath((rlocator).dbOid, (rlocator).spcOid, (rlocator).relNumber, \
backend, forknum)
-/* First argument is a RelFileNode */
-#define relpathperm(rnode, forknum) \
- relpathbackend(rnode, InvalidBackendId, forknum)
+/* First argument is a RelFileLocator */
+#define relpathperm(rlocator, forknum) \
+ relpathbackend(rlocator, InvalidBackendId, forknum)
-/* First argument is a RelFileNodeBackend */
-#define relpath(rnode, forknum) \
- relpathbackend((rnode).node, (rnode).backend, forknum)
+/* First argument is a RelFileLocatorBackend */
+#define relpath(rlocator, forknum) \
+ relpathbackend((rlocator).locator, (rlocator).backend, forknum)
#endif /* RELPATH_H */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index f93d866..9a21417 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3248,10 +3248,10 @@ typedef struct IndexStmt
List *excludeOpNames; /* exclusion operator names, or NIL if none */
char *idxcomment; /* comment to apply to index, or NULL */
Oid indexOid; /* OID of an existing index, if any */
- Oid oldNode; /* relfilenode of existing storage, if any */
- SubTransactionId oldCreateSubid; /* rd_createSubid of oldNode */
- SubTransactionId oldFirstRelfilenodeSubid; /* rd_firstRelfilenodeSubid of
- * oldNode */
+ RelFileNumber oldNumber; /* relfilenumber of existing storage, if any */
+ SubTransactionId oldCreateSubid; /* rd_createSubid of oldNumber */
+ SubTransactionId oldFirstRelfilelocatorSubid; /* rd_firstRelfilelocatorSubid
+ * of oldNumber */
bool unique; /* is index unique? */
bool nulls_not_distinct; /* null treatment for UNIQUE constraints */
bool primary; /* is index a primary key? */
diff --git a/src/include/postgres_ext.h b/src/include/postgres_ext.h
index fdb61b7..d8af68b 100644
--- a/src/include/postgres_ext.h
+++ b/src/include/postgres_ext.h
@@ -46,6 +46,13 @@ typedef unsigned int Oid;
/* Define a signed 64-bit integer type for use in client API declarations. */
typedef PG_INT64_TYPE pg_int64;
+/*
+ * RelFileNumber data type identifies the specific relation file name.
+ */
+typedef Oid RelFileNumber;
+#define InvalidRelFileNumber ((RelFileNumber) InvalidOid)
+#define RelFileNumberIsValid(relnumber) \
+ ((bool) ((relnumber) != InvalidRelFileNumber))
/*
* Identifiers of error message fields. Kept here to keep common
diff --git a/src/include/postmaster/bgwriter.h b/src/include/postmaster/bgwriter.h
index 2511ef4..b67fb1e 100644
--- a/src/include/postmaster/bgwriter.h
+++ b/src/include/postmaster/bgwriter.h
@@ -16,7 +16,7 @@
#define _BGWRITER_H
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "storage/sync.h"
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 4a01f87..d109d0b 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -99,7 +99,7 @@ typedef struct ReorderBufferChange
struct
{
/* relation that has been changed */
- RelFileNode relnode;
+ RelFileLocator rlocator;
/* no previously reassembled toast chunks are necessary anymore */
bool clear_toast_afterwards;
@@ -145,7 +145,7 @@ typedef struct ReorderBufferChange
*/
struct
{
- RelFileNode node;
+ RelFileLocator locator;
ItemPointerData tid;
CommandId cmin;
CommandId cmax;
@@ -657,7 +657,7 @@ extern void ReorderBufferAddSnapshot(ReorderBuffer *, TransactionId, XLogRecPtr
extern void ReorderBufferAddNewCommandId(ReorderBuffer *, TransactionId, XLogRecPtr lsn,
CommandId cid);
extern void ReorderBufferAddNewTupleCids(ReorderBuffer *, TransactionId, XLogRecPtr lsn,
- RelFileNode node, ItemPointerData pt,
+ RelFileLocator locator, ItemPointerData pt,
CommandId cmin, CommandId cmax, CommandId combocid);
extern void ReorderBufferAddInvalidations(ReorderBuffer *, TransactionId, XLogRecPtr lsn,
Size nmsgs, SharedInvalidationMessage *msgs);
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index a17e7b2..b85b94f 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,30 +90,30 @@
*/
typedef struct buftag
{
- RelFileNode rnode; /* physical relation identifier */
+ RelFileLocator rlocator; /* physical relation identifier */
ForkNumber forkNum;
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
#define CLEAR_BUFFERTAG(a) \
( \
- (a).rnode.spcNode = InvalidOid, \
- (a).rnode.dbNode = InvalidOid, \
- (a).rnode.relNode = InvalidOid, \
+ (a).rlocator.spcOid = InvalidOid, \
+ (a).rlocator.dbOid = InvalidOid, \
+ (a).rlocator.relNumber = InvalidRelFileNumber, \
(a).forkNum = InvalidForkNumber, \
(a).blockNum = InvalidBlockNumber \
)
-#define INIT_BUFFERTAG(a,xx_rnode,xx_forkNum,xx_blockNum) \
+#define INIT_BUFFERTAG(a,xx_rlocator,xx_forkNum,xx_blockNum) \
( \
- (a).rnode = (xx_rnode), \
+ (a).rlocator = (xx_rlocator), \
(a).forkNum = (xx_forkNum), \
(a).blockNum = (xx_blockNum) \
)
#define BUFFERTAGS_EQUAL(a,b) \
( \
- RelFileNodeEquals((a).rnode, (b).rnode) && \
+ RelFileLocatorEquals((a).rlocator, (b).rlocator) && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
)
@@ -291,11 +291,11 @@ extern PGDLLIMPORT BufferDesc *LocalBufferDescriptors;
*/
typedef struct CkptSortItem
{
- Oid tsId;
- Oid relNode;
- ForkNumber forkNum;
- BlockNumber blockNum;
- int buf_id;
+ Oid tsId;
+ RelFileNumber relNumber;
+ ForkNumber forkNum;
+ BlockNumber blockNum;
+ int buf_id;
} CkptSortItem;
extern PGDLLIMPORT CkptSortItem *CkptBufferIds;
@@ -337,9 +337,9 @@ extern PrefetchBufferResult PrefetchLocalBuffer(SMgrRelation smgr,
extern BufferDesc *LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum,
BlockNumber blockNum, bool *foundPtr);
extern void MarkLocalBufferDirty(Buffer buffer);
-extern void DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
+extern void DropRelFileLocatorLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber firstDelBlock);
-extern void DropRelFileNodeAllLocalBuffers(RelFileNode rnode);
+extern void DropRelFileLocatorAllLocalBuffers(RelFileLocator rlocator);
extern void AtEOXact_LocalBuffers(bool isCommit);
#endif /* BUFMGR_INTERNALS_H */
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 5839140..96e473e 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -17,7 +17,7 @@
#include "storage/block.h"
#include "storage/buf.h"
#include "storage/bufpage.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
#include "utils/snapmgr.h"
@@ -176,13 +176,13 @@ extern PrefetchBufferResult PrefetchSharedBuffer(struct SMgrRelationData *smgr_r
BlockNumber blockNum);
extern PrefetchBufferResult PrefetchBuffer(Relation reln, ForkNumber forkNum,
BlockNumber blockNum);
-extern bool ReadRecentBuffer(RelFileNode rnode, ForkNumber forkNum,
+extern bool ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber blockNum, Buffer recent_buffer);
extern Buffer ReadBuffer(Relation reln, BlockNumber blockNum);
extern Buffer ReadBufferExtended(Relation reln, ForkNumber forkNum,
BlockNumber blockNum, ReadBufferMode mode,
BufferAccessStrategy strategy);
-extern Buffer ReadBufferWithoutRelcache(RelFileNode rnode,
+extern Buffer ReadBufferWithoutRelcache(RelFileLocator rlocator,
ForkNumber forkNum, BlockNumber blockNum,
ReadBufferMode mode, BufferAccessStrategy strategy,
bool permanent);
@@ -204,13 +204,13 @@ extern BlockNumber RelationGetNumberOfBlocksInFork(Relation relation,
extern void FlushOneBuffer(Buffer buffer);
extern void FlushRelationBuffers(Relation rel);
extern void FlushRelationsAllBuffers(struct SMgrRelationData **smgrs, int nrels);
-extern void CreateAndCopyRelationData(RelFileNode src_rnode,
- RelFileNode dst_rnode,
+extern void CreateAndCopyRelationData(RelFileLocator src_rlocator,
+ RelFileLocator dst_rlocator,
bool permanent);
extern void FlushDatabaseBuffers(Oid dbid);
-extern void DropRelFileNodeBuffers(struct SMgrRelationData *smgr_reln, ForkNumber *forkNum,
+extern void DropRelFileLocatorBuffers(struct SMgrRelationData *smgr_reln, ForkNumber *forkNum,
int nforks, BlockNumber *firstDelBlock);
-extern void DropRelFileNodesAllBuffers(struct SMgrRelationData **smgr_reln, int nnodes);
+extern void DropRelFileLocatorsAllBuffers(struct SMgrRelationData **smgr_reln, int nlocators);
extern void DropDatabaseBuffers(Oid dbid);
#define RelationGetNumberOfBlocks(reln) \
@@ -223,7 +223,7 @@ extern XLogRecPtr BufferGetLSNAtomic(Buffer buffer);
extern void PrintPinnedBufs(void);
#endif
extern Size BufferShmemSize(void);
-extern void BufferGetTag(Buffer buffer, RelFileNode *rnode,
+extern void BufferGetTag(Buffer buffer, RelFileLocator *rlocator,
ForkNumber *forknum, BlockNumber *blknum);
extern void MarkBufferDirtyHint(Buffer buffer, bool buffer_std);
diff --git a/src/include/storage/freespace.h b/src/include/storage/freespace.h
index dcc40eb..fcb0802 100644
--- a/src/include/storage/freespace.h
+++ b/src/include/storage/freespace.h
@@ -15,7 +15,7 @@
#define FREESPACE_H_
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
/* prototypes for public functions in freespace.c */
@@ -27,7 +27,7 @@ extern BlockNumber RecordAndGetPageWithFreeSpace(Relation rel,
Size spaceNeeded);
extern void RecordPageWithFreeSpace(Relation rel, BlockNumber heapBlk,
Size spaceAvail);
-extern void XLogRecordPageWithFreeSpace(RelFileNode rnode, BlockNumber heapBlk,
+extern void XLogRecordPageWithFreeSpace(RelFileLocator rlocator, BlockNumber heapBlk,
Size spaceAvail);
extern BlockNumber FreeSpaceMapPrepareTruncateRel(Relation rel,
diff --git a/src/include/storage/md.h b/src/include/storage/md.h
index ffffa40..10aa1b0 100644
--- a/src/include/storage/md.h
+++ b/src/include/storage/md.h
@@ -15,7 +15,7 @@
#define MD_H
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "storage/sync.h"
@@ -25,7 +25,7 @@ extern void mdopen(SMgrRelation reln);
extern void mdclose(SMgrRelation reln, ForkNumber forknum);
extern void mdcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo);
extern bool mdexists(SMgrRelation reln, ForkNumber forknum);
-extern void mdunlink(RelFileNodeBackend rnode, ForkNumber forknum, bool isRedo);
+extern void mdunlink(RelFileLocatorBackend rlocator, ForkNumber forknum, bool isRedo);
extern void mdextend(SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum, char *buffer, bool skipFsync);
extern bool mdprefetch(SMgrRelation reln, ForkNumber forknum,
@@ -42,7 +42,7 @@ extern void mdtruncate(SMgrRelation reln, ForkNumber forknum,
extern void mdimmedsync(SMgrRelation reln, ForkNumber forknum);
extern void ForgetDatabaseSyncRequests(Oid dbid);
-extern void DropRelationFiles(RelFileNode *delrels, int ndelrels, bool isRedo);
+extern void DropRelationFiles(RelFileLocator *delrels, int ndelrels, bool isRedo);
/* md sync callbacks */
extern int mdsyncfiletag(const FileTag *ftag, char *path);
diff --git a/src/include/storage/relfilelocator.h b/src/include/storage/relfilelocator.h
new file mode 100644
index 0000000..7211fe7
--- /dev/null
+++ b/src/include/storage/relfilelocator.h
@@ -0,0 +1,99 @@
+/*-------------------------------------------------------------------------
+ *
+ * relfilelocator.h
+ * Physical access information for relations.
+ *
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/storage/relfilelocator.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef RELFILELOCATOR_H
+#define RELFILELOCATOR_H
+
+#include "common/relpath.h"
+#include "storage/backendid.h"
+
+/*
+ * RelFileLocator must provide all that we need to know to physically access
+ * a relation, with the exception of the backend ID, which can be provided
+ * separately. Note, however, that a "physical" relation is comprised of
+ * multiple files on the filesystem, as each fork is stored as a separate
+ * file, and each fork can be divided into multiple segments. See md.c.
+ *
+ * spcOid identifies the tablespace of the relation. It corresponds to
+ * pg_tablespace.oid.
+ *
+ * dbOid identifies the database of the relation. It is zero for
+ * "shared" relations (those common to all databases of a cluster).
+ * Nonzero dbOid values correspond to pg_database.oid.
+ *
+ * relNumber identifies the specific relation. relNumber corresponds to
+ * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
+ * to assign new physical files to relations in some situations).
+ * Notice that relNumber is only unique within a database in a particular
+ * tablespace.
+ *
+ * Note: spcOid must be GLOBALTABLESPACE_OID if and only if dbOid is
+ * zero. We support shared relations only in the "global" tablespace.
+ *
+ * Note: in pg_class we allow reltablespace == 0 to denote that the
+ * relation is stored in its database's "default" tablespace (as
+ * identified by pg_database.dattablespace). However this shorthand
+ * is NOT allowed in RelFileLocator structs --- the real tablespace ID
+ * must be supplied when setting spcOid.
+ *
+ * Note: in pg_class, relfilenode can be zero to denote that the relation
+ * is a "mapped" relation, whose current true filenode number is available
+ * from relmapper.c. Again, this case is NOT allowed in RelFileLocators.
+ *
+ * Note: various places use RelFileLocator in hashtable keys. Therefore,
+ * there *must not* be any unused padding bytes in this struct. That
+ * should be safe as long as all the fields are of type Oid.
+ */
+typedef struct RelFileLocator
+{
+ Oid spcOid; /* tablespace */
+ Oid dbOid; /* database */
+ RelFileNumber relNumber; /* relation */
+} RelFileLocator;
+
+/*
+ * Augmenting a relfilelocator with the backend ID provides all the information
+ * we need to locate the physical storage. The backend ID is InvalidBackendId
+ * for regular relations (those accessible to more than one backend), or the
+ * owning backend's ID for backend-local relations. Backend-local relations
+ * are always transient and removed in case of a database crash; they are
+ * never WAL-logged or fsync'd.
+ */
+typedef struct RelFileLocatorBackend
+{
+ RelFileLocator locator;
+ BackendId backend;
+} RelFileLocatorBackend;
+
+#define RelFileLocatorBackendIsTemp(rlocator) \
+ ((rlocator).backend != InvalidBackendId)
+
+/*
+ * Note: RelFileLocatorEquals and RelFileLocatorBackendEquals compare relNumber first
+ * since that is most likely to be different in two unequal RelFileLocators. It
+ * is probably redundant to compare spcOid if the other fields are found equal,
+ * but do it anyway to be sure. Likewise for checking the backend ID in
+ * RelFileLocatorBackendEquals.
+ */
+#define RelFileLocatorEquals(locator1, locator2) \
+ ((locator1).relNumber == (locator2).relNumber && \
+ (locator1).dbOid == (locator2).dbOid && \
+ (locator1).spcOid == (locator2).spcOid)
+
+#define RelFileLocatorBackendEquals(locator1, locator2) \
+ ((locator1).locator.relNumber == (locator2).locator.relNumber && \
+ (locator1).locator.dbOid == (locator2).locator.dbOid && \
+ (locator1).backend == (locator2).backend && \
+ (locator1).locator.spcOid == (locator2).locator.spcOid)
+
+#endif /* RELFILELOCATOR_H */
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
deleted file mode 100644
index 4fdc606..0000000
--- a/src/include/storage/relfilenode.h
+++ /dev/null
@@ -1,99 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * relfilenode.h
- * Physical access information for relations.
- *
- *
- * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * src/include/storage/relfilenode.h
- *
- *-------------------------------------------------------------------------
- */
-#ifndef RELFILENODE_H
-#define RELFILENODE_H
-
-#include "common/relpath.h"
-#include "storage/backendid.h"
-
-/*
- * RelFileNode must provide all that we need to know to physically access
- * a relation, with the exception of the backend ID, which can be provided
- * separately. Note, however, that a "physical" relation is comprised of
- * multiple files on the filesystem, as each fork is stored as a separate
- * file, and each fork can be divided into multiple segments. See md.c.
- *
- * spcNode identifies the tablespace of the relation. It corresponds to
- * pg_tablespace.oid.
- *
- * dbNode identifies the database of the relation. It is zero for
- * "shared" relations (those common to all databases of a cluster).
- * Nonzero dbNode values correspond to pg_database.oid.
- *
- * relNode identifies the specific relation. relNode corresponds to
- * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
- * to assign new physical files to relations in some situations).
- * Notice that relNode is only unique within a database in a particular
- * tablespace.
- *
- * Note: spcNode must be GLOBALTABLESPACE_OID if and only if dbNode is
- * zero. We support shared relations only in the "global" tablespace.
- *
- * Note: in pg_class we allow reltablespace == 0 to denote that the
- * relation is stored in its database's "default" tablespace (as
- * identified by pg_database.dattablespace). However this shorthand
- * is NOT allowed in RelFileNode structs --- the real tablespace ID
- * must be supplied when setting spcNode.
- *
- * Note: in pg_class, relfilenode can be zero to denote that the relation
- * is a "mapped" relation, whose current true filenode number is available
- * from relmapper.c. Again, this case is NOT allowed in RelFileNodes.
- *
- * Note: various places use RelFileNode in hashtable keys. Therefore,
- * there *must not* be any unused padding bytes in this struct. That
- * should be safe as long as all the fields are of type Oid.
- */
-typedef struct RelFileNode
-{
- Oid spcNode; /* tablespace */
- Oid dbNode; /* database */
- Oid relNode; /* relation */
-} RelFileNode;
-
-/*
- * Augmenting a relfilenode with the backend ID provides all the information
- * we need to locate the physical storage. The backend ID is InvalidBackendId
- * for regular relations (those accessible to more than one backend), or the
- * owning backend's ID for backend-local relations. Backend-local relations
- * are always transient and removed in case of a database crash; they are
- * never WAL-logged or fsync'd.
- */
-typedef struct RelFileNodeBackend
-{
- RelFileNode node;
- BackendId backend;
-} RelFileNodeBackend;
-
-#define RelFileNodeBackendIsTemp(rnode) \
- ((rnode).backend != InvalidBackendId)
-
-/*
- * Note: RelFileNodeEquals and RelFileNodeBackendEquals compare relNode first
- * since that is most likely to be different in two unequal RelFileNodes. It
- * is probably redundant to compare spcNode if the other fields are found equal,
- * but do it anyway to be sure. Likewise for checking the backend ID in
- * RelFileNodeBackendEquals.
- */
-#define RelFileNodeEquals(node1, node2) \
- ((node1).relNode == (node2).relNode && \
- (node1).dbNode == (node2).dbNode && \
- (node1).spcNode == (node2).spcNode)
-
-#define RelFileNodeBackendEquals(node1, node2) \
- ((node1).node.relNode == (node2).node.relNode && \
- (node1).node.dbNode == (node2).node.dbNode && \
- (node1).backend == (node2).backend && \
- (node1).node.spcNode == (node2).node.spcNode)
-
-#endif /* RELFILENODE_H */
diff --git a/src/include/storage/sinval.h b/src/include/storage/sinval.h
index e7cd456..56c6fc9 100644
--- a/src/include/storage/sinval.h
+++ b/src/include/storage/sinval.h
@@ -16,7 +16,7 @@
#include <signal.h>
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* We support several types of shared-invalidation messages:
@@ -90,7 +90,7 @@ typedef struct
int8 id; /* type field --- must be first */
int8 backend_hi; /* high bits of backend ID, if temprel */
uint16 backend_lo; /* low bits of backend ID, if temprel */
- RelFileNode rnode; /* spcNode, dbNode, relNode */
+ RelFileLocator rlocator; /* spcOid, dbOid, relNumber */
} SharedInvalSmgrMsg;
#define SHAREDINVALRELMAP_ID (-4)
diff --git a/src/include/storage/smgr.h b/src/include/storage/smgr.h
index 6b63c60..a077153 100644
--- a/src/include/storage/smgr.h
+++ b/src/include/storage/smgr.h
@@ -16,7 +16,7 @@
#include "lib/ilist.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* smgr.c maintains a table of SMgrRelation objects, which are essentially
@@ -38,8 +38,8 @@
*/
typedef struct SMgrRelationData
{
- /* rnode is the hashtable lookup key, so it must be first! */
- RelFileNodeBackend smgr_rnode; /* relation physical identifier */
+ /* rlocator is the hashtable lookup key, so it must be first! */
+ RelFileLocatorBackend smgr_rlocator; /* relation physical identifier */
/* pointer to owning pointer, or NULL if none */
struct SMgrRelationData **smgr_owner;
@@ -75,16 +75,16 @@ typedef struct SMgrRelationData
typedef SMgrRelationData *SMgrRelation;
#define SmgrIsTemp(smgr) \
- RelFileNodeBackendIsTemp((smgr)->smgr_rnode)
+ RelFileLocatorBackendIsTemp((smgr)->smgr_rlocator)
extern void smgrinit(void);
-extern SMgrRelation smgropen(RelFileNode rnode, BackendId backend);
+extern SMgrRelation smgropen(RelFileLocator rlocator, BackendId backend);
extern bool smgrexists(SMgrRelation reln, ForkNumber forknum);
extern void smgrsetowner(SMgrRelation *owner, SMgrRelation reln);
extern void smgrclearowner(SMgrRelation *owner, SMgrRelation reln);
extern void smgrclose(SMgrRelation reln);
extern void smgrcloseall(void);
-extern void smgrclosenode(RelFileNodeBackend rnode);
+extern void smgrcloserellocator(RelFileLocatorBackend rlocator);
extern void smgrrelease(SMgrRelation reln);
extern void smgrreleaseall(void);
extern void smgrcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo);
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index 6a77632..dacef92 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -17,7 +17,7 @@
#include "datatype/timestamp.h"
#include "storage/lock.h"
#include "storage/procsignal.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/standbydefs.h"
/* User-settable GUC parameters */
@@ -30,9 +30,9 @@ extern void InitRecoveryTransactionEnvironment(void);
extern void ShutdownRecoveryTransactionEnvironment(void);
extern void ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid,
- RelFileNode node);
+ RelFileLocator locator);
extern void ResolveRecoveryConflictWithSnapshotFullXid(FullTransactionId latestRemovedFullXid,
- RelFileNode node);
+ RelFileLocator locator);
extern void ResolveRecoveryConflictWithTablespace(Oid tsid);
extern void ResolveRecoveryConflictWithDatabase(Oid dbid);
diff --git a/src/include/storage/sync.h b/src/include/storage/sync.h
index 9737e1e..049af87 100644
--- a/src/include/storage/sync.h
+++ b/src/include/storage/sync.h
@@ -13,7 +13,7 @@
#ifndef SYNC_H
#define SYNC_H
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* Type of sync request. These are used to manage the set of pending
@@ -51,7 +51,7 @@ typedef struct FileTag
{
int16 handler; /* SyncRequestHandler value, saving space */
int16 forknum; /* ForkNumber, saving space */
- RelFileNode rnode;
+ RelFileLocator rlocator;
uint32 segno;
} FileTag;
diff --git a/src/include/utils/inval.h b/src/include/utils/inval.h
index 0e0323b..23748b7 100644
--- a/src/include/utils/inval.h
+++ b/src/include/utils/inval.h
@@ -15,7 +15,7 @@
#define INVAL_H
#include "access/htup.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
extern PGDLLIMPORT int debug_discard_caches;
@@ -48,7 +48,7 @@ extern void CacheInvalidateRelcacheByTuple(HeapTuple classTuple);
extern void CacheInvalidateRelcacheByRelid(Oid relid);
-extern void CacheInvalidateSmgr(RelFileNodeBackend rnode);
+extern void CacheInvalidateSmgr(RelFileLocatorBackend rlocator);
extern void CacheInvalidateRelmap(Oid databaseId);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 1896a9a..e5b6662 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -23,7 +23,7 @@
#include "partitioning/partdefs.h"
#include "rewrite/prs2lock.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "utils/relcache.h"
#include "utils/reltrigger.h"
@@ -53,7 +53,7 @@ typedef LockInfoData *LockInfo;
typedef struct RelationData
{
- RelFileNode rd_node; /* relation physical identifier */
+ RelFileLocator rd_locator; /* relation physical identifier */
SMgrRelation rd_smgr; /* cached file handle, or NULL */
int rd_refcnt; /* reference count */
BackendId rd_backend; /* owning backend id, if temporary relation */
@@ -66,44 +66,44 @@ typedef struct RelationData
/*----------
* rd_createSubid is the ID of the highest subtransaction the rel has
- * survived into or zero if the rel or its rd_node was created before the
- * current top transaction. (IndexStmt.oldNode leads to the case of a new
- * rel with an old rd_node.) rd_firstRelfilenodeSubid is the ID of the
- * highest subtransaction an rd_node change has survived into or zero if
- * rd_node matches the value it had at the start of the current top
+ * survived into or zero if the rel or its rd_locator was created before the
+ * current top transaction. (IndexStmt.oldNumber leads to the case of a new
+ * rel with an old rd_locator.) rd_firstRelfilelocatorSubid is the ID of the
+ * highest subtransaction an rd_locator change has survived into or zero if
+ * rd_locator matches the value it had at the start of the current top
* transaction. (Rolling back the subtransaction that
- * rd_firstRelfilenodeSubid denotes would restore rd_node to the value it
+ * rd_firstRelfilelocatorSubid denotes would restore rd_locator to the value it
* had at the start of the current top transaction. Rolling back any
* lower subtransaction would not.) Their accuracy is critical to
* RelationNeedsWAL().
*
- * rd_newRelfilenodeSubid is the ID of the highest subtransaction the
- * most-recent relfilenode change has survived into or zero if not changed
+ * rd_newRelfilelocatorSubid is the ID of the highest subtransaction the
+ * most-recent relfilenumber change has survived into or zero if not changed
* in the current transaction (or we have forgotten changing it). This
* field is accurate when non-zero, but it can be zero when a relation has
- * multiple new relfilenodes within a single transaction, with one of them
+ * multiple new relfilenumbers within a single transaction, with one of them
* occurring in a subsequently aborted subtransaction, e.g.
* BEGIN;
* TRUNCATE t;
* SAVEPOINT save;
* TRUNCATE t;
* ROLLBACK TO save;
- * -- rd_newRelfilenodeSubid is now forgotten
+ * -- rd_newRelfilelocatorSubid is now forgotten
*
* If every rd_*Subid field is zero, they are read-only outside
- * relcache.c. Files that trigger rd_node changes by updating
+ * relcache.c. Files that trigger rd_locator changes by updating
* pg_class.reltablespace and/or pg_class.relfilenode call
- * RelationAssumeNewRelfilenode() to update rd_*Subid.
+ * RelationAssumeNewRelfilelocator() to update rd_*Subid.
*
* rd_droppedSubid is the ID of the highest subtransaction that a drop of
* the rel has survived into. In entries visible outside relcache.c, this
* is always zero.
*/
SubTransactionId rd_createSubid; /* rel was created in current xact */
- SubTransactionId rd_newRelfilenodeSubid; /* highest subxact changing
- * rd_node to current value */
- SubTransactionId rd_firstRelfilenodeSubid; /* highest subxact changing
- * rd_node to any value */
+ SubTransactionId rd_newRelfilelocatorSubid; /* highest subxact changing
+ * rd_locator to current value */
+ SubTransactionId rd_firstRelfilelocatorSubid; /* highest subxact changing
+ * rd_locator to any value */
SubTransactionId rd_droppedSubid; /* dropped with another Subid set */
Form_pg_class rd_rel; /* RELATION tuple */
@@ -531,12 +531,12 @@ typedef struct ViewOptions
/*
* RelationIsMapped
- * True if the relation uses the relfilenode map. Note multiple eval
+ * True if the relation uses the relfilenumber map. Note multiple eval
* of argument!
*/
#define RelationIsMapped(relation) \
(RELKIND_HAS_STORAGE((relation)->rd_rel->relkind) && \
- ((relation)->rd_rel->relfilenode == InvalidOid))
+ ((relation)->rd_rel->relfilenode == InvalidRelFileNumber))
/*
* RelationGetSmgr
@@ -555,7 +555,7 @@ static inline SMgrRelation
RelationGetSmgr(Relation rel)
{
if (unlikely(rel->rd_smgr == NULL))
- smgrsetowner(&(rel->rd_smgr), smgropen(rel->rd_node, rel->rd_backend));
+ smgrsetowner(&(rel->rd_smgr), smgropen(rel->rd_locator, rel->rd_backend));
return rel->rd_smgr;
}
@@ -607,12 +607,12 @@ RelationGetSmgr(Relation rel)
*
* Returns false if wal_level = minimal and this relation is created or
* truncated in the current transaction. See "Skipping WAL for New
- * RelFileNode" in src/backend/access/transam/README.
+ * RelFileLocator" in src/backend/access/transam/README.
*/
#define RelationNeedsWAL(relation) \
(RelationIsPermanent(relation) && (XLogIsNeeded() || \
(relation->rd_createSubid == InvalidSubTransactionId && \
- relation->rd_firstRelfilenodeSubid == InvalidSubTransactionId)))
+ relation->rd_firstRelfilelocatorSubid == InvalidSubTransactionId)))
/*
* RelationUsesLocalBuffers
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index c93d865..ba35d6b 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -103,7 +103,7 @@ extern Relation RelationBuildLocalRelation(const char *relname,
TupleDesc tupDesc,
Oid relid,
Oid accessmtd,
- Oid relfilenode,
+ RelFileNumber relfilenumber,
Oid reltablespace,
bool shared_relation,
bool mapped_relation,
@@ -111,10 +111,10 @@ extern Relation RelationBuildLocalRelation(const char *relname,
char relkind);
/*
- * Routines to manage assignment of new relfilenode to a relation
+ * Routines to manage assignment of new relfilenumber to a relation
*/
-extern void RelationSetNewRelfilenode(Relation relation, char persistence);
-extern void RelationAssumeNewRelfilenode(Relation relation);
+extern void RelationSetNewRelfilenumber(Relation relation, char persistence);
+extern void RelationAssumeNewRelfilelocator(Relation relation);
/*
* Routines for flushing/rebuilding relcache entries in various scenarios
diff --git a/src/include/utils/relfilenodemap.h b/src/include/utils/relfilenodemap.h
deleted file mode 100644
index 77d8046..0000000
--- a/src/include/utils/relfilenodemap.h
+++ /dev/null
@@ -1,18 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * relfilenodemap.h
- * relfilenode to oid mapping cache.
- *
- * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * src/include/utils/relfilenodemap.h
- *
- *-------------------------------------------------------------------------
- */
-#ifndef RELFILENODEMAP_H
-#define RELFILENODEMAP_H
-
-extern Oid RelidByRelfilenode(Oid reltablespace, Oid relfilenode);
-
-#endif /* RELFILENODEMAP_H */
diff --git a/src/include/utils/relfilenumbermap.h b/src/include/utils/relfilenumbermap.h
new file mode 100644
index 0000000..c149a93
--- /dev/null
+++ b/src/include/utils/relfilenumbermap.h
@@ -0,0 +1,19 @@
+/*-------------------------------------------------------------------------
+ *
+ * relfilenumbermap.h
+ * relfilenumber to oid mapping cache.
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/relfilenumbermap.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef RELFILENUMBERMAP_H
+#define RELFILENUMBERMAP_H
+
+extern Oid RelidByRelfilenumber(Oid reltablespace,
+ RelFileNumber relfilenumber);
+
+#endif /* RELFILENUMBERMAP_H */
diff --git a/src/include/utils/relmapper.h b/src/include/utils/relmapper.h
index 557f77e..2bb2e25 100644
--- a/src/include/utils/relmapper.h
+++ b/src/include/utils/relmapper.h
@@ -1,7 +1,7 @@
/*-------------------------------------------------------------------------
*
* relmapper.h
- * Catalog-to-filenode mapping
+ * Catalog-to-filenumber mapping
*
*
* Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
@@ -35,14 +35,15 @@ typedef struct xl_relmap_update
#define MinSizeOfRelmapUpdate offsetof(xl_relmap_update, data)
-extern Oid RelationMapOidToFilenode(Oid relationId, bool shared);
+extern RelFileNumber RelationMapOidToFilenumber(Oid relationId, bool shared);
-extern Oid RelationMapFilenodeToOid(Oid relationId, bool shared);
-extern Oid RelationMapOidToFilenodeForDatabase(char *dbpath, Oid relationId);
+extern Oid RelationMapFilenumberToOid(RelFileNumber relationId, bool shared);
+extern RelFileNumber RelationMapOidToFilenumberForDatabase(char *dbpath,
+ Oid relationId);
extern void RelationMapCopy(Oid dbid, Oid tsid, char *srcdbpath,
char *dstdbpath);
-extern void RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
- bool immediate);
+extern void RelationMapUpdateMap(Oid relationId, RelFileNumber fileNumber,
+ bool shared, bool immediate);
extern void RelationMapRemoveMapping(Oid relationId);
diff --git a/src/test/recovery/t/018_wal_optimize.pl b/src/test/recovery/t/018_wal_optimize.pl
index 4700d49..869d9d5 100644
--- a/src/test/recovery/t/018_wal_optimize.pl
+++ b/src/test/recovery/t/018_wal_optimize.pl
@@ -5,7 +5,7 @@
#
# These tests exercise code that once violated the mandate described in
# src/backend/access/transam/README section "Skipping WAL for New
-# RelFileNode". The tests work by committing some transactions, initiating an
+# RelFileLocator". The tests work by committing some transactions, initiating an
# immediate shutdown, and confirming that the expected data survives recovery.
# For many years, individual commands made the decision to skip WAL, hence the
# frequent appearance of COPY in these tests.
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 4fb7469..11b68b6 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2255,8 +2255,8 @@ ReindexObjectType
ReindexParams
ReindexStmt
ReindexType
-RelFileNode
-RelFileNodeBackend
+RelFileLocator
+RelFileLocatorBackend
RelIdCacheEnt
RelInfo
RelInfoArr
@@ -2274,8 +2274,8 @@ RelationPtr
RelationSyncEntry
RelcacheCallbackFunction
ReleaseMatchCB
-RelfilenodeMapEntry
-RelfilenodeMapKey
+RelfilenumberMapEntry
+RelfilenumberMapKey
Relids
RelocationBufferInfo
RelptrFreePageBtree
@@ -3877,7 +3877,7 @@ xl_xact_parsed_abort
xl_xact_parsed_commit
xl_xact_parsed_prepare
xl_xact_prepare
-xl_xact_relfilenodes
+xl_xact_relfilelocators
xl_xact_stats_item
xl_xact_stats_items
xl_xact_subxacts
--
1.8.3.1
On Fri, Jul 1, 2022 at 6:42 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
- I might be missing something here, but this isn't actually making
the relfilenode 56 bits, is it? The reason to do that is to make the
BufferTag smaller, so I expected to see that BufferTag either used
bitfields like RelFileNumber relNumber:56 and ForkNumber forkNum:8, or
else that it just declared a single field for both as uint64 and used
accessor macros or static inlines to separate them out. But it doesn't
seem to do either of those things, which seems like it can't be right.
On a related note, I think it would be better to declare RelFileNumber
as an unsigned type even though we have no use for the high bit; we
have, equally, no use for negative values. It's easier to reason about
bit-shifting operations with unsigned types.

Oops, somehow I missed merging that change into the patch. I changed it
like below and adjusted the macros.
typedef struct buftag
{
Oid spcOid; /* tablespace oid. */
Oid dbOid; /* database oid. */
uint32 relNumber_low; /* relfilenumber 32 lower bits */
uint32 relNumber_hi:24; /* relfilenumber 24 high bits */
uint32 forkNum:8; /* fork number */
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;

I think we need to split it like this to keep the BufferTag 4-byte
aligned; otherwise the size of the structure will increase.
Well, I guess you're right. That's a bummer. In that case I'm a little
unsure whether it's worth using bit fields at all. Maybe we should
just write uint32 something[2] and use macros after that.
Another approach could be to accept the padding and define a constant
SizeOfBufferTag and use that as the hash table element size, like we
do for the sizes of xlog records.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Tue, Jul 5, 2022 at 4:33 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
I thought about this comment from Robert
that's not quite the same as either of those things. For example, in
tableam.h we currently say "This callback needs to create a new
relation filenode for `rel`" and how should that be changed in this
new naming? We're not creating a new RelFileNumber - those would need
to be allocated, not created, as all the numbers in the universe exist
already. Neither are we creating a new locator; that sounds like it
means assembling it from pieces.

I think that "This callback needs to create a new relation storage
for `rel`" looks better.
I like the idea, but it would sound better to say "create new relation
storage" rather than "create a new relation storage."
--
Robert Haas
EDB: http://www.enterprisedb.com
On Wed, Jul 6, 2022 at 2:32 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Jul 5, 2022 at 4:33 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
I thought about this comment from Robert
that's not quite the same as either of those things. For example, in
tableam.h we currently say "This callback needs to create a new
relation filenode for `rel`" and how should that be changed in this
new naming? We're not creating a new RelFileNumber - those would need
to be allocated, not created, as all the numbers in the universe exist
already. Neither are we creating a new locator; that sounds like it
means assembling it from pieces.I think that "This callback needs to create a new relation storage
for `rel`" looks better.I like the idea, but it would sound better to say "create new relation
storage" rather than "create a new relation storage."
Okay, I changed that, along with a few more similar occurrences in
0001. I also tested pgbench performance while concurrently running a
script that creates and drops relations, and I see no regression even
with fairly small values of VAR_RELNUMBER_PREFETCH; the smallest value
I tried was 8. That doesn't mean I am suggesting such a small value,
but I think we can keep it at something like 512 or 1024 without
worrying much about performance, so I changed it to 512 in the latest
patch.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachments:
v6-0004-Assert-checking-to-be-merged-with-0003.patchtext/x-patch; charset=US-ASCII; name=v6-0004-Assert-checking-to-be-merged-with-0003.patchDownload
From 172022c565a51478062ee8afcd0722880c2cb62e Mon Sep 17 00:00:00 2001
From: dilip kumar <dilipbalaut@localhost.localdomain>
Date: Mon, 4 Jul 2022 14:51:21 +0530
Subject: [PATCH v6 4/5] Assert checking (to be merged with 0003)
---
src/backend/catalog/catalog.c | 54 ++++++++++++++++++++++++++++++++++++++++
src/backend/catalog/heap.c | 5 ++++
src/backend/catalog/storage.c | 6 +++++
src/backend/commands/tablecmds.c | 3 +++
src/include/catalog/catalog.h | 9 +++++++
5 files changed, 77 insertions(+)
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 155400c..9a22203 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -583,3 +583,57 @@ pg_stop_making_pinned_objects(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+
+#ifdef USE_ASSERT_CHECKING
+
+/*
+ * Assert that there is no existing diskfile for input relnumber.
+ */
+void
+AssertRelfileNumberFileNotExists(Oid spcoid, RelFileNumber relnumber,
+ char relpersistence)
+{
+ RelFileLocatorBackend rlocator;
+ char *rpath;
+ BackendId backend;
+
+ /*
+ * If we ever get here during pg_upgrade, there's something wrong; all
+ * relfilenode assignments during a binary-upgrade run should be
+ * determined by commands in the dump script.
+ */
+ Assert(!IsBinaryUpgrade);
+
+ switch (relpersistence)
+ {
+ case RELPERSISTENCE_TEMP:
+ backend = BackendIdForTempRelations();
+ break;
+ case RELPERSISTENCE_UNLOGGED:
+ case RELPERSISTENCE_PERMANENT:
+ backend = InvalidBackendId;
+ break;
+ default:
+ elog(ERROR, "invalid relpersistence: %c", relpersistence);
+ return; /* placate compiler */
+ }
+
+ /* this logic should match RelationInitPhysicalAddr */
+ rlocator.locator.spcOid = spcoid ? spcoid : MyDatabaseTableSpace;
+ rlocator.locator.dbOid =
+ (rlocator.locator.spcOid == GLOBALTABLESPACE_OID) ? InvalidOid :
+ MyDatabaseId;
+
+ /*
+ * The relpath will vary based on the backend ID, so we must initialize
+ * that properly here to make sure that any collisions based on filename
+ * are properly detected.
+ */
+ rlocator.backend = backend;
+
+ /* check for existing file of same name. */
+ rpath = relpath(rlocator, MAIN_FORKNUM);
+
+ Assert(access(rpath, F_OK) != 0);
+}
+#endif
\ No newline at end of file
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 4b813e9..d8c25a6 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -347,7 +347,12 @@ heap_create(const char *relname,
* with oid same as relid.
*/
if (!RelFileNumberIsValid(relfilenumber))
+ {
relfilenumber = GetNewRelFileNumber();
+ AssertRelfileNumberFileNotExists(reltablespace,
+ relfilenumber,
+ relpersistence);
+ }
}
/*
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index d024d94..7ab2400 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -968,6 +968,9 @@ smgr_redo(XLogReaderState *record)
xl_smgr_create *xlrec = (xl_smgr_create *) XLogRecGetData(record);
SMgrRelation reln;
+ Assert(xlrec->rlocator.relNumber <=
+ ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
smgrcreate(reln, xlrec->forkNum, true);
}
@@ -981,6 +984,9 @@ smgr_redo(XLogReaderState *record)
int nforks = 0;
bool need_fsm_vacuum = false;
+ Assert(xlrec->rlocator.relNumber <=
+ ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
/*
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 671b04d..8647801 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14378,6 +14378,9 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
* relfilenumber file.
*/
newrelfilenumber = GetNewRelFileNumber();
+ AssertRelfileNumberFileNotExists(newTableSpace,
+ newrelfilenumber,
+ rel->rd_rel->relpersistence);
/* Open old and new relation */
newrlocator = rel->rd_locator;
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index b452530..be6ba13 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -39,4 +39,13 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
+#ifdef USE_ASSERT_CHECKING
+extern void AssertRelfileNumberFileNotExists(Oid spcoid,
+ RelFileNumber relnumber,
+ char relpersistence);
+#else
+#define AssertRelfileNumberFileNotExists(spcoid, relnumber, relpersistence) \
+ ((void)true)
+#endif
+
#endif /* CATALOG_H */
--
1.8.3.1
v6-0003-Use-56-bits-for-relfilenumber-to-avoid-wraparound.patchtext/x-patch; charset=UTF-8; name=v6-0003-Use-56-bits-for-relfilenumber-to-avoid-wraparound.patchDownload
From 54858e52c0bdfe7b4dbcc81fbf251a0c97972f13 Mon Sep 17 00:00:00 2001
From: dilip kumar <dilipbalaut@localhost.localdomain>
Date: Tue, 5 Jul 2022 12:56:31 +0530
Subject: [PATCH v6 3/5] Use 56 bits for relfilenumber to avoid wraparound
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This patch widens the relfilenumber to 56 bits. Storing a full 64-bit
relfilenumber in the BufferTag would enlarge the tag, increasing
memory usage and potentially hurting performance. To avoid that, the
buffer tag packs the fork number into 8 bits alongside the 56-bit
relfilenumber instead of dedicating 64 bits to the relfilenumber.
---
contrib/pg_buffercache/Makefile | 3 +-
.../pg_buffercache/pg_buffercache--1.3--1.4.sql | 30 +++++++
contrib/pg_buffercache/pg_buffercache.control | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 31 ++++++-
contrib/pg_walinspect/expected/pg_walinspect.out | 4 +-
contrib/pg_walinspect/sql/pg_walinspect.sql | 4 +-
doc/src/sgml/catalogs.sgml | 2 +-
doc/src/sgml/pgbuffercache.sgml | 2 +-
src/backend/access/gin/ginxlog.c | 2 +-
src/backend/access/rmgrdesc/gistdesc.c | 2 +-
src/backend/access/rmgrdesc/heapdesc.c | 2 +-
src/backend/access/rmgrdesc/nbtdesc.c | 2 +-
src/backend/access/rmgrdesc/seqdesc.c | 2 +-
src/backend/access/rmgrdesc/xlogdesc.c | 21 +++--
src/backend/access/transam/README | 4 +-
src/backend/access/transam/varsup.c | 94 +++++++++++++++++++++-
src/backend/access/transam/xlog.c | 48 +++++++++++
src/backend/access/transam/xlogprefetcher.c | 24 +++---
src/backend/access/transam/xlogrecovery.c | 6 +-
src/backend/access/transam/xlogutils.c | 8 +-
src/backend/catalog/catalog.c | 93 ---------------------
src/backend/catalog/heap.c | 15 ++--
src/backend/catalog/index.c | 11 +--
src/backend/commands/tablecmds.c | 10 ++-
src/backend/nodes/outfuncs.c | 2 +-
src/backend/replication/logical/decode.c | 1 +
src/backend/replication/logical/reorderbuffer.c | 2 +-
src/backend/storage/freespace/fsmpage.c | 2 +-
src/backend/storage/lmgr/lwlocknames.txt | 1 +
src/backend/storage/smgr/md.c | 4 +
src/backend/storage/smgr/smgr.c | 2 +-
src/backend/utils/adt/dbsize.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 9 ++-
src/backend/utils/cache/relcache.c | 3 +-
src/backend/utils/cache/relfilenumbermap.c | 4 +-
src/backend/utils/misc/pg_controldata.c | 9 ++-
src/bin/pg_checksums/pg_checksums.c | 6 +-
src/bin/pg_controldata/pg_controldata.c | 2 +
src/bin/pg_dump/pg_dump.c | 20 ++---
src/bin/pg_rewind/filemap.c | 6 +-
src/bin/pg_upgrade/info.c | 11 +--
src/bin/pg_upgrade/pg_upgrade.c | 6 +-
src/bin/pg_upgrade/relfilenumber.c | 4 +-
src/bin/pg_waldump/pg_waldump.c | 2 +-
src/common/relpath.c | 20 ++---
src/fe_utils/option_utils.c | 42 ++++++++++
src/include/access/transam.h | 5 ++
src/include/access/xlog.h | 1 +
src/include/catalog/catalog.h | 3 -
src/include/catalog/pg_class.h | 16 ++--
src/include/catalog/pg_control.h | 2 +
src/include/catalog/pg_proc.dat | 10 +--
src/include/fe_utils/option_utils.h | 3 +
src/include/postgres_ext.h | 7 +-
src/include/storage/buf_internals.h | 18 +++--
src/include/storage/relfilelocator.h | 12 ++-
src/test/regress/expected/alter_table.out | 24 +++---
src/test/regress/expected/oidjoins.out | 2 +-
src/test/regress/sql/alter_table.sql | 8 +-
59 files changed, 434 insertions(+), 261 deletions(-)
create mode 100644 contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile
index 2ab8c65..2fbb62f 100644
--- a/contrib/pg_buffercache/Makefile
+++ b/contrib/pg_buffercache/Makefile
@@ -7,7 +7,8 @@ OBJS = \
EXTENSION = pg_buffercache
DATA = pg_buffercache--1.2.sql pg_buffercache--1.2--1.3.sql \
- pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql
+ pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql \
+ pg_buffercache--1.3--1.4.sql
PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time"
ifdef USE_PGXS
diff --git a/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
new file mode 100644
index 0000000..ee2d9c7
--- /dev/null
+++ b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
@@ -0,0 +1,30 @@
+/* contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.4'" to load this file. \quit
+
+/* First we have to remove them from the extension */
+ALTER EXTENSION pg_buffercache DROP VIEW pg_buffercache;
+ALTER EXTENSION pg_buffercache DROP FUNCTION pg_buffercache_pages();
+
+/* Then we can drop them */
+DROP VIEW pg_buffercache;
+DROP FUNCTION pg_buffercache_pages();
+
+/* Now redefine */
+CREATE OR REPLACE FUNCTION pg_buffercache_pages()
+RETURNS SETOF RECORD
+AS 'MODULE_PATHNAME', 'pg_buffercache_pages_v1_4'
+LANGUAGE C PARALLEL SAFE;
+
+CREATE OR REPLACE VIEW pg_buffercache AS
+ SELECT P.* FROM pg_buffercache_pages() AS P
+ (bufferid integer, relfilenode int8, reltablespace oid, reldatabase oid,
+ relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
+ pinning_backends int4);
+
+-- Don't want these to be available to public.
+REVOKE ALL ON FUNCTION pg_buffercache_pages() FROM PUBLIC;
+REVOKE ALL ON pg_buffercache FROM PUBLIC;
+GRANT EXECUTE ON FUNCTION pg_buffercache_pages() TO pg_monitor;
+GRANT SELECT ON pg_buffercache TO pg_monitor;
diff --git a/contrib/pg_buffercache/pg_buffercache.control b/contrib/pg_buffercache/pg_buffercache.control
index 8c060ae..a82ae5f 100644
--- a/contrib/pg_buffercache/pg_buffercache.control
+++ b/contrib/pg_buffercache/pg_buffercache.control
@@ -1,5 +1,5 @@
# pg_buffercache extension
comment = 'examine the shared buffer cache'
-default_version = '1.3'
+default_version = '1.4'
module_pathname = '$libdir/pg_buffercache'
relocatable = true
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index abc8813..4e3884b 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -59,9 +59,10 @@ typedef struct
* relation node/tablespace/database/blocknum and dirty indicator.
*/
PG_FUNCTION_INFO_V1(pg_buffercache_pages);
+PG_FUNCTION_INFO_V1(pg_buffercache_pages_v1_4);
-Datum
-pg_buffercache_pages(PG_FUNCTION_ARGS)
+static Datum
+pg_buffercache_pages_internal(PG_FUNCTION_ARGS, Oid rfn_typid)
{
FuncCallContext *funcctx;
Datum result;
@@ -103,7 +104,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
TupleDescInitEntry(tupledesc, (AttrNumber) 1, "bufferid",
INT4OID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 2, "relfilenode",
- OIDOID, -1, 0);
+ rfn_typid, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 3, "reltablespace",
OIDOID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 4, "reldatabase",
@@ -209,7 +210,16 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
}
else
{
- values[1] = ObjectIdGetDatum(fctx->record[i].relfilenumber);
+ if (rfn_typid == INT8OID)
+ values[1] =
+ Int64GetDatum((int64) fctx->record[i].relfilenumber);
+ else
+ {
+ Assert(rfn_typid == OIDOID);
+ values[1] =
+ ObjectIdGetDatum((Oid) fctx->record[i].relfilenumber);
+ }
+
nulls[1] = false;
values[2] = ObjectIdGetDatum(fctx->record[i].reltablespace);
nulls[2] = false;
@@ -237,3 +247,16 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
else
SRF_RETURN_DONE(funcctx);
}
+
+Datum
+pg_buffercache_pages(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, OIDOID);
+}
+
+/* entry point for old extension version */
+Datum
+pg_buffercache_pages_v1_4(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, INT8OID);
+}
diff --git a/contrib/pg_walinspect/expected/pg_walinspect.out b/contrib/pg_walinspect/expected/pg_walinspect.out
index a1ee743..e9b06ed 100644
--- a/contrib/pg_walinspect/expected/pg_walinspect.out
+++ b/contrib/pg_walinspect/expected/pg_walinspect.out
@@ -54,9 +54,9 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- ===================================================================
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
ok
----
t
diff --git a/contrib/pg_walinspect/sql/pg_walinspect.sql b/contrib/pg_walinspect/sql/pg_walinspect.sql
index 1b265ea..5393834 100644
--- a/contrib/pg_walinspect/sql/pg_walinspect.sql
+++ b/contrib/pg_walinspect/sql/pg_walinspect.sql
@@ -39,10 +39,10 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
-- ===================================================================
-- Test for filtering out WAL records based on resource_manager and
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 4f3f375..3a48c35 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1965,7 +1965,7 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
</para>
<para>
Name of the on-disk file of this relation; zero means this
diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
index a06fd3e..e222265 100644
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@@ -62,7 +62,7 @@
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
(references <link linkend="catalog-pg-class"><structname>pg_class</structname></link>.<structfield>relfilenode</structfield>)
</para>
<para>
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 41b9211..b75ad79 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -100,7 +100,7 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
BlockNumber blknum;
BufferGetTag(buffer, &locator, &forknum, &blknum);
- elog(ERROR, "failed to add item to index page in %u/%u/%u",
+ elog(ERROR, "failed to add item to index page in %u/%u/" INT64_FORMAT,
locator.spcOid, locator.dbOid, locator.relNumber);
}
}
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index 7dd3c1d..c699937 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -26,7 +26,7 @@ out_gistxlogPageUpdate(StringInfo buf, gistxlogPageUpdate *xlrec)
static void
out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec)
{
- appendStringInfo(buf, "rel %u/%u/%u; blk %u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; blk %u; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber, xlrec->block,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 923d3bc..f6d278b 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -169,7 +169,7 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
{
xl_heap_new_cid *xlrec = (xl_heap_new_cid *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; tid %u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; tid %u/%u",
xlrec->target_locator.spcOid,
xlrec->target_locator.dbOid,
xlrec->target_locator.relNumber,
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index 4843cd5..70feb2d 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -100,7 +100,7 @@ btree_desc(StringInfo buf, XLogReaderState *record)
{
xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/seqdesc.c b/src/backend/access/rmgrdesc/seqdesc.c
index b3845f9..45c6ee7 100644
--- a/src/backend/access/rmgrdesc/seqdesc.c
+++ b/src/backend/access/rmgrdesc/seqdesc.c
@@ -25,7 +25,7 @@ seq_desc(StringInfo buf, XLogReaderState *record)
xl_seq_rec *xlrec = (xl_seq_rec *) rec;
if (info == XLOG_SEQ_LOG)
- appendStringInfo(buf, "rel %u/%u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT,
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber);
}
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 6fec485..e21559d 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -45,8 +45,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
CheckPoint *checkpoint = (CheckPoint *) rec;
appendStringInfo(buf, "redo %X/%X; "
- "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; "
- "oldest xid %u in DB %u; oldest multi %u in DB %u; "
+ "tli %u; prev tli %u; fpw %s; xid %u:%u; relfilenumber " INT64_FORMAT "; oid %u; "
+ "multi %u; offset %u; oldest xid %u in DB %u; oldest multi %u in DB %u; "
"oldest/newest commit timestamp xid: %u/%u; "
"oldest running xid %u; %s",
LSN_FORMAT_ARGS(checkpoint->redo),
@@ -55,6 +55,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->fullPageWrites ? "true" : "false",
EpochFromFullTransactionId(checkpoint->nextXid),
XidFromFullTransactionId(checkpoint->nextXid),
+ checkpoint->nextRelFileNumber,
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
@@ -74,6 +75,13 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
memcpy(&nextOid, rec, sizeof(Oid));
appendStringInfo(buf, "%u", nextOid);
}
+ else if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, rec, sizeof(RelFileNumber));
+ appendStringInfo(buf, INT64_FORMAT, nextRelFileNumber);
+ }
else if (info == XLOG_RESTORE_POINT)
{
xl_restore_point *xlrec = (xl_restore_point *) rec;
@@ -169,6 +177,9 @@ xlog_identify(uint8 info)
case XLOG_NEXTOID:
id = "NEXTOID";
break;
+ case XLOG_NEXT_RELFILENUMBER:
+ id = "NEXT_RELFILENUMBER";
+ break;
case XLOG_SWITCH:
id = "SWITCH";
break;
@@ -237,7 +248,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfoChar(buf, ' ');
appendStringInfo(buf,
- "blkref #%d: rel %u/%u/%u fork %s blk %u",
+ "blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -297,7 +308,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
if (forknum != MAIN_FORKNUM)
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u fork %s blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -306,7 +317,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
else
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
blk);
diff --git a/src/backend/access/transam/README b/src/backend/access/transam/README
index 734c39a..18ee70d 100644
--- a/src/backend/access/transam/README
+++ b/src/backend/access/transam/README
@@ -692,8 +692,8 @@ by having database restart search for files that don't have any committed
entry in pg_class, but that currently isn't done because of the possibility
of deleting data that is useful for forensic analysis of the crash.
Orphan files are harmless --- at worst they waste a bit of disk space ---
-because we check for on-disk collisions when allocating new relfilenumber
-OIDs. So cleaning up isn't really necessary.
+because the relfilenumber counter is 56 bits wide, so there should be no
+collisions in practice. So cleaning up isn't really necessary.
3. Deleting a table, which requires an unlink() that could fail.
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 849a7ce..eb35f0f 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -30,6 +30,9 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+/* Number of RelFileNumbers to prefetch (preallocate) per XLOG write */
+#define VAR_RELNUMBER_PREFETCH 512
+
/* pointer to "variable cache" in shared memory (set up by shmem.c) */
VariableCache ShmemVariableCache = NULL;
@@ -521,8 +524,7 @@ ForceTransactionIdLimitUpdate(void)
* wide, counter wraparound will occur eventually, and therefore it is unwise
* to assume they are unique unless precautions are taken to make them so.
* Hence, this routine should generally not be used directly. The only direct
- * callers should be GetNewOidWithIndex() and GetNewRelFileNumber() in
- * catalog/catalog.c.
+ * callers should be GetNewOidWithIndex() in catalog/catalog.c.
*/
Oid
GetNewObjectId(void)
@@ -613,6 +615,94 @@ SetNextObjectId(Oid nextOid)
}
/*
+ * GetNewRelFileNumber
+ *
+ * Similar to GetNewObjectId, but generates a new relfilenumber instead of
+ * a new OID.
+ */
+RelFileNumber
+GetNewRelFileNumber(void)
+{
+ RelFileNumber result;
+
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot assign RelFileNumber during recovery");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /* check for wraparound of the relfilenumber counter */
+ if (unlikely(ShmemVariableCache->nextRelFileNumber > MAX_RELFILENUMBER))
+ elog(ERROR, "relfilenumber is out of bounds");
+
+ /* if we have run out of WAL-logged RelFileNumbers, log some more */
+ if (ShmemVariableCache->relnumbercount == 0)
+ {
+ LogNextRelFileNumber(ShmemVariableCache->nextRelFileNumber +
+ VAR_RELNUMBER_PREFETCH);
+
+ ShmemVariableCache->relnumbercount = VAR_RELNUMBER_PREFETCH;
+ }
+
+ result = ShmemVariableCache->nextRelFileNumber;
+ (ShmemVariableCache->nextRelFileNumber)++;
+ (ShmemVariableCache->relnumbercount)--;
+
+ LWLockRelease(RelFileNumberGenLock);
+
+ return result;
+}
+
+/*
+ * SetNextRelFileNumber
+ *
+ * This may only be called during pg_upgrade; it advances the RelFileNumber
+ * counter to the specified value if the current value is smaller than the
+ * input value.
+ */
+void
+SetNextRelFileNumber(RelFileNumber relnumber)
+{
+ int relnumbercount;
+
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot forward RelFileNumber during recovery");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /*
+ * If the previously assigned nextRelFileNumber is already higher than
+ * the input value, there is nothing to do. This is possible because
+ * during upgrade the relfilenumbers of objects can be restored in any
+ * order.
+ */
+ if (relnumber <= ShmemVariableCache->nextRelFileNumber)
+ {
+ LWLockRelease(RelFileNumberGenLock);
+ return;
+ }
+
+ /*
+ * If advancing to the new relfilenumber would exhaust the WAL-logged
+ * range, log a fresh batch; otherwise, just reduce relnumbercount by the
+ * amount we are skipping over.
+ */
+ relnumbercount = relnumber - ShmemVariableCache->nextRelFileNumber;
+ if (ShmemVariableCache->relnumbercount <= relnumbercount)
+ {
+ LogNextRelFileNumber(relnumber + VAR_RELNUMBER_PREFETCH);
+ ShmemVariableCache->relnumbercount = VAR_RELNUMBER_PREFETCH;
+ }
+ else
+ ShmemVariableCache->relnumbercount -= relnumbercount;
+
+ ShmemVariableCache->nextRelFileNumber = relnumber;
+
+ LWLockRelease(RelFileNumberGenLock);
+}
+
+/*
* StopGeneratingPinnedObjectIds
*
* This is called once during initdb to force the OID counter up to
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 1b2f240..1fbb5af 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4543,6 +4543,7 @@ BootStrapXLOG(void)
checkPoint.nextXid =
FullTransactionIdFromEpochAndXid(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstGenbkiObjectId;
+ checkPoint.nextRelFileNumber = FirstNormalRelFileNumber;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
checkPoint.oldestXid = FirstNormalTransactionId;
@@ -4556,7 +4557,9 @@ BootStrapXLOG(void)
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->relnumbercount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -5023,7 +5026,9 @@ StartupXLOG(void)
/* initialize shared memory variables from the checkpoint record */
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->relnumbercount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -6472,6 +6477,12 @@ CreateCheckPoint(int flags)
checkPoint.nextOid += ShmemVariableCache->oidCount;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_SHARED);
+ checkPoint.nextRelFileNumber = ShmemVariableCache->nextRelFileNumber;
+ if (!shutdown)
+ checkPoint.nextRelFileNumber += ShmemVariableCache->relnumbercount;
+ LWLockRelease(RelFileNumberGenLock);
+
MultiXactGetCheckptMulti(shutdown,
&checkPoint.nextMulti,
&checkPoint.nextMultiOffset,
@@ -7350,6 +7361,29 @@ XLogPutNextOid(Oid nextOid)
}
/*
+ * Similar to XLogPutNextOid, but writes a NEXT_RELFILENUMBER record instead
+ * of a NEXTOID record.
+ */
+void
+LogNextRelFileNumber(RelFileNumber nextrelnumber)
+{
+ XLogRecPtr recptr;
+
+ XLogBeginInsert();
+ XLogRegisterData((char *) (&nextrelnumber), sizeof(RelFileNumber));
+ recptr = XLogInsert(RM_XLOG_ID, XLOG_NEXT_RELFILENUMBER);
+
+ /*
+ * Flush the record to disk before returning, so that no file system
+ * change that depends on the new relfilenumbers can reach disk ahead of
+ * the XLOG_NEXT_RELFILENUMBER record.
+ *
+ * This should not hurt performance, because we only WAL-log a new value
+ * once per VAR_RELNUMBER_PREFETCH assignments.
+ */
+ XLogFlush(recptr);
+}
+
+/*
* Write an XLOG SWITCH record.
*
* Here we just blindly issue an XLogInsert request for the record.
@@ -7564,6 +7598,16 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
}
+ if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, XLogRecGetData(record), sizeof(RelFileNumber));
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = nextRelFileNumber;
+ ShmemVariableCache->relnumbercount = 0;
+ LWLockRelease(RelFileNumberGenLock);
+ }
else if (info == XLOG_CHECKPOINT_SHUTDOWN)
{
CheckPoint checkPoint;
@@ -7578,6 +7622,10 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->relnumbercount = 0;
+ LWLockRelease(RelFileNumberGenLock);
MultiXactSetNextMXact(checkPoint.nextMulti,
checkPoint.nextMultiOffset);
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index c469610..6ce6d29 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -573,9 +573,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
/*
* Don't try to prefetch anything in this database until
- * it has been created, or we might confuse the blocks of
- * different generations, if a database OID or
- * relfilenumber is reused. It's also more efficient than
+ * it has been created, because it's more efficient than
* discovering that relations don't exist on disk yet with
* ENOENT errors.
*/
@@ -600,10 +598,8 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
/*
* Don't prefetch anything for this whole relation
- * until it has been created. Otherwise we might
- * confuse the blocks of different generations, if a
- * relfilenumber is reused. This also avoids the need
- * to discover the problem via extra syscalls that
+ * until it has been created. This avoids the need to
+ * discover the problem via extra syscalls that
* report ENOENT.
*/
XLogPrefetcherAddFilter(prefetcher, xlrec->rlocator, 0,
@@ -611,7 +607,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u until %X/%X is replayed, which creates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, which creates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -634,7 +630,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, which truncates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, which truncates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -733,7 +729,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing all prefetch in relation %u/%u/%u until %X/%X is replayed, because the relation does not exist on disk",
+ "suppressing all prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, because the relation does not exist on disk",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -754,7 +750,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, because the relation is too small",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, because the relation is too small",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -793,7 +789,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* truncated beneath our feet?
*/
elog(ERROR,
- "could not prefetch relation %u/%u/%u block %u",
+ "could not prefetch relation %u/%u/" INT64_FORMAT " block %u",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -931,7 +927,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed),
filter->filter_from_block);
@@ -947,7 +943,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed));
#endif
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 5d6f1b5..1d95fc0 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2175,14 +2175,14 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
continue;
if (forknum != MAIN_FORKNUM)
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, fork %u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", fork %u, blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
forknum,
blk);
else
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
@@ -2378,7 +2378,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
if (memcmp(replay_image_masked, primary_image_masked, BLCKSZ) != 0)
{
elog(FATAL,
- "inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
+ "inconsistent page found, rel %u/%u/" INT64_FORMAT ", forknum %u, blkno %u",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forknum, blkno);
}
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 0cda225..8a56e8d 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -617,17 +617,17 @@ CreateFakeRelcacheEntry(RelFileLocator rlocator)
rel->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
/* We don't know the name of the relation; use relfilenumber instead */
- sprintf(RelationGetRelationName(rel), "%u", rlocator.relNumber);
+ sprintf(RelationGetRelationName(rel), INT64_FORMAT, rlocator.relNumber);
/*
* We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNumber may be
- * different from the relation's OID. It shouldn't really matter though.
+ * relation. Note that we set relId to FirstNormalObjectId, which is
+ * completely bogus; it shouldn't really matter, though.
* In recovery, we are running by ourselves and can't have any lock
* conflicts. While syncing, we already hold AccessExclusiveLock.
*/
rel->rd_lockInfo.lockRelId.dbId = rlocator.dbOid;
- rel->rd_lockInfo.lockRelId.relId = rlocator.relNumber;
+ rel->rd_lockInfo.lockRelId.relId = FirstNormalObjectId;
rel->rd_smgr = NULL;
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 2a33273..155400c 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -481,99 +481,6 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
}
/*
- * GetNewRelFileNumber
- * Generate a new relfilenumber that is unique within the
- * database of the given tablespace.
- *
- * If the relfilenumber will also be used as the relation's OID, pass the
- * opened pg_class catalog, and this routine will guarantee that the result
- * is also an unused OID within pg_class. If the result is to be used only
- * as a relfilenumber for an existing relation, pass NULL for pg_class.
- *
- * As with GetNewOidWithIndex(), there is some theoretical risk of a race
- * condition, but it doesn't seem worth worrying about.
- *
- * Note: we don't support using this in bootstrap mode. All relations
- * created by bootstrap have preassigned OIDs, so there's no need.
- */
-RelFileNumber
-GetNewRelFileNumber(Oid reltablespace, Relation pg_class, char relpersistence)
-{
- RelFileLocatorBackend rlocator;
- char *rpath;
- bool collides;
- BackendId backend;
-
- /*
- * If we ever get here during pg_upgrade, there's something wrong; all
- * relfilenumber assignments during a binary-upgrade run should be
- * determined by commands in the dump script.
- */
- Assert(!IsBinaryUpgrade);
-
- switch (relpersistence)
- {
- case RELPERSISTENCE_TEMP:
- backend = BackendIdForTempRelations();
- break;
- case RELPERSISTENCE_UNLOGGED:
- case RELPERSISTENCE_PERMANENT:
- backend = InvalidBackendId;
- break;
- default:
- elog(ERROR, "invalid relpersistence: %c", relpersistence);
- return InvalidOid; /* placate compiler */
- }
-
- /* This logic should match RelationInitPhysicalAddr */
- rlocator.locator.spcOid = reltablespace ? reltablespace : MyDatabaseTableSpace;
- rlocator.locator.dbOid = (rlocator.locator.spcOid == GLOBALTABLESPACE_OID) ? InvalidOid : MyDatabaseId;
-
- /*
- * The relpath will vary based on the backend ID, so we must initialize
- * that properly here to make sure that any collisions based on filename
- * are properly detected.
- */
- rlocator.backend = backend;
-
- do
- {
- CHECK_FOR_INTERRUPTS();
-
- /* Generate the OID */
- if (pg_class)
- rlocator.locator.relNumber = GetNewOidWithIndex(pg_class, ClassOidIndexId,
- Anum_pg_class_oid);
- else
- rlocator.locator.relNumber = GetNewObjectId();
-
- /* Check for existing file of same name */
- rpath = relpath(rlocator, MAIN_FORKNUM);
-
- if (access(rpath, F_OK) == 0)
- {
- /* definite collision */
- collides = true;
- }
- else
- {
- /*
- * Here we have a little bit of a dilemma: if errno is something
- * other than ENOENT, should we declare a collision and loop? In
- * practice it seems best to go ahead regardless of the errno. If
- * there is a colliding file we will get an smgr failure when we
- * attempt to create the new relation file.
- */
- collides = false;
- }
-
- pfree(rpath);
- } while (collides);
-
- return rlocator.locator.relNumber;
-}
-
-/*
* SQL callable interface for GetNewOidWithIndex(). Outside of initdb's
* direct insertions into catalog tables, and recovering from corruption, this
* should rarely be needed.
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index c69c923..4b813e9 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -347,7 +347,7 @@ heap_create(const char *relname,
* with oid same as relid.
*/
if (!RelFileNumberIsValid(relfilenumber))
- relfilenumber = relid;
+ relfilenumber = GetNewRelFileNumber();
}
/*
@@ -900,7 +900,7 @@ InsertPgClassTuple(Relation pg_class_desc,
values[Anum_pg_class_reloftype - 1] = ObjectIdGetDatum(rd_rel->reloftype);
values[Anum_pg_class_relowner - 1] = ObjectIdGetDatum(rd_rel->relowner);
values[Anum_pg_class_relam - 1] = ObjectIdGetDatum(rd_rel->relam);
- values[Anum_pg_class_relfilenode - 1] = ObjectIdGetDatum(rd_rel->relfilenode);
+ values[Anum_pg_class_relfilenode - 1] = Int64GetDatum(rd_rel->relfilenode);
values[Anum_pg_class_reltablespace - 1] = ObjectIdGetDatum(rd_rel->reltablespace);
values[Anum_pg_class_relpages - 1] = Int32GetDatum(rd_rel->relpages);
values[Anum_pg_class_reltuples - 1] = Float4GetDatum(rd_rel->reltuples);
@@ -1172,12 +1172,7 @@ heap_create_with_catalog(const char *relname,
if (shared_relation && reltablespace != GLOBALTABLESPACE_OID)
elog(ERROR, "shared relations must be placed in pg_global tablespace");
- /*
- * Allocate an OID for the relation, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the relation, unless we were told what to use. */
if (!OidIsValid(relid))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -1231,8 +1226,8 @@ heap_create_with_catalog(const char *relname,
}
if (!OidIsValid(relid))
- relid = GetNewRelFileNumber(reltablespace, pg_class_desc,
- relpersistence);
+ relid = GetNewOidWithIndex(pg_class_desc, ClassOidIndexId,
+ Anum_pg_class_oid);
}
/*
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 3dc535e..68dabeb 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -900,12 +900,7 @@ index_create(Relation heapRelation,
collationObjectId,
classObjectId);
- /*
- * Allocate an OID for the index, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the index, unless we were told what to use. */
if (!OidIsValid(indexRelationId))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -937,8 +932,8 @@ index_create(Relation heapRelation,
}
else
{
- indexRelationId =
- GetNewRelFileNumber(tableSpaceId, pg_class, relpersistence);
+ indexRelationId = GetNewOidWithIndex(pg_class, ClassOidIndexId,
+ Anum_pg_class_oid);
}
}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index d9530a3..671b04d 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14371,11 +14371,13 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
}
/*
- * Relfilenumbers are not unique in databases across tablespaces, so we need
- * to allocate a new one in the new tablespace.
+ * Generate a new relfilenumber. Although relfilenumbers are unique across
+ * the cluster, we cannot reuse the old one, because the old file is not
+ * unlinked until commit. If the relation were moved back to its original
+ * tablespace within the same transaction, the new file would collide with
+ * the not-yet-unlinked old one.
*/
- newrelfilenumber = GetNewRelFileNumber(newTableSpace, NULL,
- rel->rd_rel->relpersistence);
+ newrelfilenumber = GetNewRelFileNumber();
/* Open old and new relation */
newrlocator = rel->rd_locator;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 05f27f0..c30fca2 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2932,7 +2932,7 @@ _outIndexStmt(StringInfo str, const IndexStmt *node)
WRITE_NODE_FIELD(excludeOpNames);
WRITE_STRING_FIELD(idxcomment);
WRITE_OID_FIELD(indexOid);
- WRITE_OID_FIELD(oldNumber);
+ WRITE_UINT64_FIELD(oldNumber);
WRITE_UINT_FIELD(oldCreateSubid);
WRITE_UINT_FIELD(oldFirstRelfilelocatorSubid);
WRITE_BOOL_FIELD(unique);
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index c5c6a2b..7029604 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -154,6 +154,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
break;
case XLOG_NOOP:
case XLOG_NEXTOID:
+ case XLOG_NEXT_RELFILENUMBER:
case XLOG_SWITCH:
case XLOG_BACKUP_END:
case XLOG_PARAMETER_CHANGE:
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index f8fb228..4366ae6 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -4869,7 +4869,7 @@ DisplayMapping(HTAB *tuplecid_data)
hash_seq_init(&hstat, tuplecid_data);
while ((ent = (ReorderBufferTupleCidEnt *) hash_seq_search(&hstat)) != NULL)
{
- elog(DEBUG3, "mapping: node: %u/%u/%u tid: %u/%u cmin: %u, cmax: %u",
+ elog(DEBUG3, "mapping: node: %u/%u/" INT64_FORMAT " tid: %u/%u cmin: %u, cmax: %u",
ent->key.rlocator.dbOid,
ent->key.rlocator.spcOid,
ent->key.rlocator.relNumber,
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index af4dab7..172225b 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -273,7 +273,7 @@ restart:
BlockNumber blknum;
BufferGetTag(buf, &rlocator, &forknum, &blknum);
- elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
+ elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/" INT64_FORMAT,
blknum, rlocator.spcOid, rlocator.dbOid, rlocator.relNumber);
/* make sure we hold an exclusive lock */
diff --git a/src/backend/storage/lmgr/lwlocknames.txt b/src/backend/storage/lmgr/lwlocknames.txt
index 6c7cf6c..b64dbe7 100644
--- a/src/backend/storage/lmgr/lwlocknames.txt
+++ b/src/backend/storage/lmgr/lwlocknames.txt
@@ -53,3 +53,4 @@ XactTruncationLock 44
# 45 was XactTruncationLock until removal of BackendRandomLock
WrapLimitsVacuumLock 46
NotifyQueueTailLock 47
+RelFileNumberGenLock 48
\ No newline at end of file
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 3998296..aaf8881 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -257,6 +257,10 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* next checkpoint, we prevent reassignment of the relfilenumber until it's
* safe, because relfilenumber assignment skips over any existing file.
*
+ * XXX all of the above held when relfilenumbers were 32 bits wide. Now that
+ * they are 56 bits wide there is no real risk of a relfilenumber being
+ * reused, so in the future we could immediately unlink the first segment too.
+ *
* We do not need to go through this dance for temp relations, though, because
* we never make WAL entries for temp rels, and so a temp rel poses no threat
* to the health of a regular rel that has taken over its relfilenumber.
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index b21d8c3..5f6c12a 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -154,7 +154,7 @@ smgropen(RelFileLocator rlocator, BackendId backend)
/* First time through: initialize the hash table */
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileLocatorBackend);
+ ctl.keysize = SizeOfRelFileLocatorBackend;
ctl.entrysize = sizeof(SMgrRelationData);
SMgrRelationHash = hash_create("smgr relation table", 400,
&ctl, HASH_ELEM | HASH_BLOBS);
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 36ec845..65f76ce 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -878,7 +878,7 @@ pg_relation_filenode(PG_FUNCTION_ARGS)
if (!RelFileNumberIsValid(result))
PG_RETURN_NULL();
- PG_RETURN_OID(result);
+ PG_RETURN_INT64(result);
}
/*
@@ -898,7 +898,7 @@ Datum
pg_filenode_relation(PG_FUNCTION_ARGS)
{
Oid reltablespace = PG_GETARG_OID(0);
- RelFileNumber relfilenumber = PG_GETARG_OID(1);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(1);
Oid heaprel;
/* test needed so RelidByRelfilenumber doesn't misbehave */
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index c260c97..291dff0 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -98,10 +98,11 @@ binary_upgrade_set_next_heap_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_heap_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
binary_upgrade_next_heap_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -120,10 +121,11 @@ binary_upgrade_set_next_index_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_index_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
binary_upgrade_next_index_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -142,10 +144,11 @@ binary_upgrade_set_next_toast_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_toast_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
binary_upgrade_next_toast_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 0639875..a1c159b 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -3708,8 +3708,7 @@ RelationSetNewRelfilenumber(Relation relation, char persistence)
RelFileLocator newrlocator;
/* Allocate a new relfilenumber */
- newrelfilenumber = GetNewRelFileNumber(relation->rd_rel->reltablespace,
- NULL, persistence);
+ newrelfilenumber = GetNewRelFileNumber();
/*
* Get a writable copy of the pg_class tuple for the given relation.
diff --git a/src/backend/utils/cache/relfilenumbermap.c b/src/backend/utils/cache/relfilenumbermap.c
index 3dc45e9..a5ec78c 100644
--- a/src/backend/utils/cache/relfilenumbermap.c
+++ b/src/backend/utils/cache/relfilenumbermap.c
@@ -196,7 +196,7 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
/* set scan arguments */
skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
- skey[1].sk_argument = ObjectIdGetDatum(relfilenumber);
+ skey[1].sk_argument = Int64GetDatum(relfilenumber);
scandesc = systable_beginscan(relation,
ClassTblspcRelfilenodeIndexId,
@@ -213,7 +213,7 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
if (found)
elog(ERROR,
- "unexpected duplicate for tablespace %u, relfilenumber %u",
+ "unexpected duplicate for tablespace %u, relfilenumber " INT64_FORMAT,
reltablespace, relfilenumber);
found = true;
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index 781f8b8..3c1fef4 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -79,8 +79,8 @@ pg_control_system(PG_FUNCTION_ARGS)
Datum
pg_control_checkpoint(PG_FUNCTION_ARGS)
{
- Datum values[18];
- bool nulls[18];
+ Datum values[19];
+ bool nulls[19];
TupleDesc tupdesc;
HeapTuple htup;
ControlFileData *ControlFile;
@@ -129,6 +129,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
XIDOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 18, "checkpoint_time",
TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 19, "next_relfilenumber",
+ INT8OID, -1, 0);
tupdesc = BlessTupleDesc(tupdesc);
/* Read the control file. */
@@ -202,6 +204,9 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
values[17] = TimestampTzGetDatum(time_t_to_timestamptz(ControlFile->checkPointCopy.time));
nulls[17] = false;
+ values[18] = Int64GetDatum(ControlFile->checkPointCopy.nextRelFileNumber);
+ nulls[18] = false;
+
htup = heap_form_tuple(tupdesc, values, nulls);
PG_RETURN_DATUM(HeapTupleGetDatum(htup));
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 21dfe1b..65fc623 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -489,9 +489,9 @@ main(int argc, char *argv[])
mode = PG_MODE_ENABLE;
break;
case 'f':
- if (!option_parse_int(optarg, "-f/--filenode", 0,
- INT_MAX,
- NULL))
+ if (!option_parse_int64(optarg, "-f/--filenode", 0,
+ LLONG_MAX,
+ NULL))
exit(1);
only_filenode = pstrdup(optarg);
break;
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index c390ec5..f727078 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -250,6 +250,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
EpochFromFullTransactionId(ControlFile->checkPointCopy.nextXid),
XidFromFullTransactionId(ControlFile->checkPointCopy.nextXid));
+ printf(_("Latest checkpoint's NextRelFileNumber: " INT64_FORMAT "\n"),
+ ControlFile->checkPointCopy.nextRelFileNumber);
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 6b90e7c..54861d5 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4835,16 +4835,16 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
relkind = *PQgetvalue(upgrade_res, 0, PQfnumber(upgrade_res, "relkind"));
- relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "relfilenode")));
+ relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "relfilenode")));
toast_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "reltoastrelid")));
- toast_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_relfilenode")));
+ toast_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_relfilenode")));
toast_index_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "indexrelid")));
- toast_index_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_index_relfilenode")));
+ toast_index_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_index_relfilenode")));
appendPQExpBufferStr(upgrade_buffer,
"\n-- For binary upgrade, must preserve pg_class oids and relfilenodes\n");
@@ -4862,7 +4862,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
*/
if (RelFileNumberIsValid(relfilenumber) && relkind != RELKIND_PARTITIONED_TABLE)
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
/*
@@ -4876,7 +4876,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_toast_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_relfilenumber);
/* every toast table has an index */
@@ -4884,7 +4884,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_index_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_index_relfilenumber);
}
@@ -4897,7 +4897,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
pg_class_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
}
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 269ed64..8be5e66 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -538,7 +538,7 @@ isRelDataFile(const char *path)
segNo = 0;
matched = false;
- nmatch = sscanf(path, "global/%u.%u", &rlocator.relNumber, &segNo);
+ nmatch = sscanf(path, "global/" INT64_FORMAT ".%u", &rlocator.relNumber, &segNo);
if (nmatch == 1 || nmatch == 2)
{
rlocator.spcOid = GLOBALTABLESPACE_OID;
@@ -547,7 +547,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "base/%u/%u.%u",
+ nmatch = sscanf(path, "base/%u/" INT64_FORMAT ".%u",
&rlocator.dbOid, &rlocator.relNumber, &segNo);
if (nmatch == 2 || nmatch == 3)
{
@@ -556,7 +556,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/%u.%u",
+ nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/" INT64_FORMAT ".%u",
&rlocator.spcOid, &rlocator.dbOid, &rlocator.relNumber,
&segNo);
if (nmatch == 3 || nmatch == 4)
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 5d30b87..ea62e7d 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -399,11 +399,11 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
i_reloid,
i_indtable,
i_toastheap,
- i_relfilenumber,
i_reltablespace;
- char query[QUERY_ALLOC];
- char *last_namespace = NULL,
- *last_tablespace = NULL;
+ RelFileNumber i_relfilenumber;
+ char query[QUERY_ALLOC];
+ char *last_namespace = NULL,
+ *last_tablespace = NULL;
query[0] = '\0'; /* initialize query string to empty */
@@ -527,7 +527,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
relname = PQgetvalue(res, relnum, i_relname);
curr->relname = pg_strdup(relname);
- curr->relfilenumber = atooid(PQgetvalue(res, relnum, i_relfilenumber));
+ curr->relfilenumber =
+ atorelnumber(PQgetvalue(res, relnum, i_relfilenumber));
curr->tblsp_alloc = false;
/* Is the tablespace oid non-default? */
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index 265d829..4c4f03a 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -15,10 +15,8 @@
* oids are the same between old and new clusters. This is important
* because toast oids are stored as toast pointers in user tables.
*
- * While pg_class.oid and pg_class.relfilenode are initially the same in a
- * cluster, they can diverge due to CLUSTER, REINDEX, or VACUUM FULL. We
- * control assignments of pg_class.relfilenode because we want the filenames
- * to match between the old and new cluster.
+ * We control assignments of pg_class.relfilenode because we want the
+ * filenames to match between the old and new cluster.
*
* We control assignment of pg_tablespace.oid because we want the oid to match
* between the old and new cluster.
diff --git a/src/bin/pg_upgrade/relfilenumber.c b/src/bin/pg_upgrade/relfilenumber.c
index b3ad820..50e94df 100644
--- a/src/bin/pg_upgrade/relfilenumber.c
+++ b/src/bin/pg_upgrade/relfilenumber.c
@@ -190,14 +190,14 @@ transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_fro
else
snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
- snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
+ snprintf(old_file, sizeof(old_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->old_tablespace,
map->old_tablespace_suffix,
map->db_oid,
map->relfilenumber,
type_suffix,
extent_suffix);
- snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
+ snprintf(new_file, sizeof(new_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->new_tablespace,
map->new_tablespace_suffix,
map->db_oid,
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 6528113..9aca583 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -884,7 +884,7 @@ main(int argc, char **argv)
}
break;
case 'R':
- if (sscanf(optarg, "%u/%u/%u",
+ if (sscanf(optarg, "%u/%u/" INT64_FORMAT,
&config.filter_by_relation.spcOid,
&config.filter_by_relation.dbOid,
&config.filter_by_relation.relNumber) != 3 ||
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 1b6b620..0774e3f 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -149,10 +149,10 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
Assert(dbOid == 0);
Assert(backendId == InvalidBackendId);
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("global/%u_%s",
+ path = psprintf("global/" INT64_FORMAT "_%s",
relNumber, forkNames[forkNumber]);
else
- path = psprintf("global/%u", relNumber);
+ path = psprintf("global/" INT64_FORMAT, relNumber);
}
else if (spcOid == DEFAULTTABLESPACE_OID)
{
@@ -160,21 +160,21 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/%u_%s",
+ path = psprintf("base/%u/" INT64_FORMAT "_%s",
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/%u",
+ path = psprintf("base/%u/" INT64_FORMAT,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/t%d_%u_%s",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT "_%s",
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/t%d_%u",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT,
dbOid, backendId, relNumber);
}
}
@@ -184,24 +184,24 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber);
}
diff --git a/src/fe_utils/option_utils.c b/src/fe_utils/option_utils.c
index abea881..2cb3370 100644
--- a/src/fe_utils/option_utils.c
+++ b/src/fe_utils/option_utils.c
@@ -82,3 +82,45 @@ option_parse_int(const char *optarg, const char *optname,
*result = val;
return true;
}
+
+/*
+ * option_parse_int64
+ *
+ * Same as option_parse_int, but parses an int64.
+ */
+bool
+option_parse_int64(const char *optarg, const char *optname,
+ int64 min_range, int64 max_range,
+ int64 *result)
+{
+ char *endptr;
+ int64 val;
+
+ errno = 0;
+ val = strtoi64(optarg, &endptr, 10);
+
+ /*
+ * Skip any trailing whitespace; if anything but whitespace remains before
+ * the terminating character, fail.
+ */
+ while (*endptr != '\0' && isspace((unsigned char) *endptr))
+ endptr++;
+
+ if (*endptr != '\0')
+ {
+ pg_log_error("invalid value \"%s\" for option %s",
+ optarg, optname);
+ return false;
+ }
+
+ if (errno == ERANGE || val < min_range || val > max_range)
+ {
+ pg_log_error("%s must be in range " INT64_FORMAT ".." INT64_FORMAT,
+ optname, min_range, max_range);
+ return false;
+ }
+
+ if (result)
+ *result = val;
+ return true;
+}
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 775471d..37afdd1 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -213,6 +213,9 @@ typedef struct VariableCacheData
*/
Oid nextOid; /* next OID to assign */
uint32 oidCount; /* OIDs available before must do XLOG work */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber to assign */
+ uint32 relnumbercount; /* relfilenumbers available before must do
+ XLOG work */
/*
* These fields are protected by XidGenLock.
@@ -293,6 +296,8 @@ extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
extern bool ForceTransactionIdLimitUpdate(void);
extern Oid GetNewObjectId(void);
+extern RelFileNumber GetNewRelFileNumber(void);
+extern void SetNextRelFileNumber(RelFileNumber relnumber);
extern void StopGeneratingPinnedObjectIds(void);
#ifdef USE_ASSERT_CHECKING
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index cd674c3..bd683cc 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -234,6 +234,7 @@ extern void CreateCheckPoint(int flags);
extern bool CreateRestartPoint(int flags);
extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN);
extern void XLogPutNextOid(Oid nextOid);
+extern void LogNextRelFileNumber(RelFileNumber nextrelnumber);
extern XLogRecPtr XLogRestorePoint(const char *rpName);
extern void UpdateFullPageWrites(void);
extern void GetFullPageWriteInfo(XLogRecPtr *RedoRecPtr_p, bool *doPageWrites_p);
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index 66900f1..b452530 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -38,8 +38,5 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
-extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
- Relation pg_class,
- char relpersistence);
#endif /* CATALOG_H */
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index e1f4eef..4768e5e 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -34,6 +34,13 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* oid */
Oid oid;
+ /* access method; 0 if not a table / index */
+ Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
+
+ /* identifier of physical storage file */
+ /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
+ int64 relfilenode BKI_DEFAULT(0);
+
/* class name */
NameData relname;
@@ -49,13 +56,6 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* class owner */
Oid relowner BKI_DEFAULT(POSTGRES) BKI_LOOKUP(pg_authid);
- /* access method; 0 if not a table / index */
- Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
-
- /* identifier of physical storage file */
- /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
- Oid relfilenode BKI_DEFAULT(0);
-
/* identifier of table space for relation (0 means default for database) */
Oid reltablespace BKI_DEFAULT(0) BKI_LOOKUP_OPT(pg_tablespace);
@@ -154,7 +154,7 @@ typedef FormData_pg_class *Form_pg_class;
DECLARE_UNIQUE_INDEX_PKEY(pg_class_oid_index, 2662, ClassOidIndexId, on pg_class using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_class_relname_nsp_index, 2663, ClassNameNspIndexId, on pg_class using btree(relname name_ops, relnamespace oid_ops));
-DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode oid_ops));
+DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode int8_ops));
#ifdef EXPOSE_TO_CLIENT_CODE
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 06368e2..d5e6172 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -41,6 +41,7 @@ typedef struct CheckPoint
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
FullTransactionId nextXid; /* next free transaction ID */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
@@ -78,6 +79,7 @@ typedef struct CheckPoint
#define XLOG_FPI 0xB0
/* 0xC0 is used in Postgres 9.5-11 */
#define XLOG_OVERWRITE_CONTRECORD 0xD0
+#define XLOG_NEXT_RELFILENUMBER 0xE0
/*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 2e41f4d..f6a5b49 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7321,11 +7321,11 @@
proname => 'pg_indexes_size', provolatile => 'v', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_indexes_size' },
{ oid => '2999', descr => 'filenode identifier of relation',
- proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'oid',
+ proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_relation_filenode' },
{ oid => '3454', descr => 'relation OID for filenode and tablespace',
proname => 'pg_filenode_relation', provolatile => 's',
- prorettype => 'regclass', proargtypes => 'oid oid',
+ prorettype => 'regclass', proargtypes => 'oid int8',
prosrc => 'pg_filenode_relation' },
{ oid => '3034', descr => 'file path of relation',
proname => 'pg_relation_filepath', provolatile => 's', prorettype => 'text',
@@ -11191,15 +11191,15 @@
prosrc => 'binary_upgrade_set_missing_value' },
{ oid => '4545', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_heap_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_heap_relfilenode' },
{ oid => '4546', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_index_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_index_relfilenode' },
{ oid => '4547', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_toast_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_toast_relfilenode' },
{ oid => '4548', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_pg_tablespace_oid', provolatile => 'v',
diff --git a/src/include/fe_utils/option_utils.h b/src/include/fe_utils/option_utils.h
index 03c09fd..8c0e818 100644
--- a/src/include/fe_utils/option_utils.h
+++ b/src/include/fe_utils/option_utils.h
@@ -22,5 +22,8 @@ extern void handle_help_version_opts(int argc, char *argv[],
extern bool option_parse_int(const char *optarg, const char *optname,
int min_range, int max_range,
int *result);
+extern bool option_parse_int64(const char *optarg, const char *optname,
+ int64 min_range, int64 max_range,
+ int64 *result);
#endif /* OPTION_UTILS_H */
diff --git a/src/include/postgres_ext.h b/src/include/postgres_ext.h
index d8af68b..ecdfc90 100644
--- a/src/include/postgres_ext.h
+++ b/src/include/postgres_ext.h
@@ -48,11 +48,14 @@ typedef PG_INT64_TYPE pg_int64;
/*
* RelFileNumber data type identifies the specific relation file name.
+ * RelFileNumber is unique within a cluster.
*/
-typedef Oid RelFileNumber;
-#define InvalidRelFileNumber ((RelFileNumber) InvalidOid)
+typedef pg_int64 RelFileNumber;
+#define InvalidRelFileNumber ((RelFileNumber) 0)
+#define FirstNormalRelFileNumber ((RelFileNumber) 100000)
#define RelFileNumberIsValid(relnumber) \
((bool) ((relnumber) != InvalidRelFileNumber))
+#define atorelnumber(x) ((RelFileNumber) strtoull((x), NULL, 10))
/*
* Identifiers of error message fields. Kept here to keep common
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 78484a9..91f64d9 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -92,16 +92,19 @@ typedef struct buftag
{
Oid spcOid; /* tablespace oid. */
Oid dbOid; /* database oid. */
- RelFileNumber relNumber; /* relation file number. */
- ForkNumber forkNum;
+ uint32 relNumber_low; /* relfilenumber 32 lower bits */
+ uint32 relNumber_hi:24; /* relfilenumber 24 high bits */
+ uint32 forkNum:8; /* fork number */
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
-#define BufTagGetFileNumber(a) ((a).relNumber)
+#define BufTagGetFileNumber(a) \
+ ((((uint64) (a).relNumber_hi << 32) | ((uint32) (a).relNumber_low)))
#define BufTagSetFileNumber(a, relnumber) \
( \
- (a).relNumber = (relnumber) \
+ (a).relNumber_hi = (relnumber) >> 32, \
+ (a).relNumber_low = (relnumber) & 0xffffffff \
)
#define CLEAR_BUFFERTAG(a) \
@@ -126,7 +129,8 @@ typedef struct buftag
( \
(a).spcOid == (b).spcOid && \
(a).dbOid == (b).dbOid && \
- (a).relNumber == (b).relNumber && \
+ (a).relNumber_low == (b).relNumber_low && \
+ (a).relNumber_hi == (b).relNumber_hi && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
)
@@ -135,14 +139,14 @@ typedef struct buftag
do { \
(locator).spcOid = (a).spcOid; \
(locator).dbOid = (a).dbOid; \
- (locator).relNumber = (a).relNumber; \
+ (locator).relNumber = BufTagGetFileNumber(a); \
} while(0)
#define BufTagRelFileLocatorEquals(a, locator) \
( \
(a).spcOid == (locator).spcOid && \
(a).dbOid == (locator).dbOid && \
- (a).relNumber == (locator).relNumber \
+ BufTagGetFileNumber(a) == (locator).relNumber \
)
/*
diff --git a/src/include/storage/relfilelocator.h b/src/include/storage/relfilelocator.h
index 7211fe7..6046506 100644
--- a/src/include/storage/relfilelocator.h
+++ b/src/include/storage/relfilelocator.h
@@ -34,8 +34,7 @@
* relNumber identifies the specific relation. relNumber corresponds to
* pg_class.relfilenode (NOT pg_class.oid, because we need to be able
* to assign new physical files to relations in some situations).
- * Notice that relNumber is only unique within a database in a particular
- * tablespace.
+ * Notice that relNumber is unique within a cluster.
*
* Note: spcOid must be GLOBALTABLESPACE_OID if and only if dbOid is
* zero. We support shared relations only in the "global" tablespace.
@@ -75,6 +74,15 @@ typedef struct RelFileLocatorBackend
BackendId backend;
} RelFileLocatorBackend;
+#define SizeOfRelFileLocatorBackend \
+ (offsetof(RelFileLocatorBackend, backend) + sizeof(BackendId))
+
+/*
+ * Maximum value of the relfilenumber. RelFileNumber is 56 bits wide; for
+ * more details, see the comments atop BufferTag.
+ */
+#define MAX_RELFILENUMBER INT64CONST(0x00FFFFFFFFFFFFFF)
+
#define RelFileLocatorBackendIsTemp(rlocator) \
((rlocator).backend != InvalidBackendId)
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index 5ede56d..6230fcb 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -2164,9 +2164,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2175,10 +2174,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+---------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | t | own | child 0 index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | t | own | child 1 index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | t | orig | child 0 index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | t | orig | child 1 index
at_partitioned_id_name_key | t | none | parent index
(6 rows)
@@ -2198,9 +2197,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2209,10 +2207,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+--------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | f | own | parent index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | f | own | parent index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | f | new | parent index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | f | new | parent index
at_partitioned_id_name_key | f | none | parent index
(6 rows)
@@ -2556,7 +2554,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/expected/oidjoins.out b/src/test/regress/expected/oidjoins.out
index 215eb89..af57470 100644
--- a/src/test/regress/expected/oidjoins.out
+++ b/src/test/regress/expected/oidjoins.out
@@ -74,11 +74,11 @@ NOTICE: checking pg_type {typcollation} => pg_collation {oid}
NOTICE: checking pg_attribute {attrelid} => pg_class {oid}
NOTICE: checking pg_attribute {atttypid} => pg_type {oid}
NOTICE: checking pg_attribute {attcollation} => pg_collation {oid}
+NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {relnamespace} => pg_namespace {oid}
NOTICE: checking pg_class {reltype} => pg_type {oid}
NOTICE: checking pg_class {reloftype} => pg_type {oid}
NOTICE: checking pg_class {relowner} => pg_authid {oid}
-NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {reltablespace} => pg_tablespace {oid}
NOTICE: checking pg_class {reltoastrelid} => pg_class {oid}
NOTICE: checking pg_class {relrewrite} => pg_class {oid}
diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql
index 52001e3..4190b12 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -1478,9 +1478,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1499,9 +1498,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1638,7 +1636,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
--
1.8.3.1
v6-0005-Don-t-delay-removing-Tombstone-file-until-next.patchtext/x-patch; charset=US-ASCII; name=v6-0005-Don-t-delay-removing-Tombstone-file-until-next.patchDownload
From 554a748e04bd4c2529443e47fd3f3ee6838ad9fd Mon Sep 17 00:00:00 2001
From: dilip kumar <dilipbalaut@localhost.localdomain>
Date: Tue, 5 Jul 2022 13:25:39 +0530
Subject: [PATCH v6 5/5] Don't delay removing Tombstone file until next
checkpoint
Before relfilenumbers were widened to 56 bits, we could not
remove an unused relation file until the next checkpoint:
removing it immediately risked the same relfilenumber being
reused for two different relations within a single checkpoint
cycle, due to OID wraparound.
Now that the preceding patches in this set have widened
relfilenumbers to 56 bits and eliminated the wraparound risk,
we no longer need to wait for the next checkpoint to remove an
unused relation file; we can clean it up at commit.
---
src/backend/access/transam/xlog.c | 5 --
src/backend/storage/smgr/md.c | 151 +++++++++++---------------------------
src/backend/storage/sync/sync.c | 101 -------------------------
src/include/storage/sync.h | 2 -
4 files changed, 43 insertions(+), 216 deletions(-)
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 1fbb5af..e298318 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6641,11 +6641,6 @@ CreateCheckPoint(int flags)
END_CRIT_SECTION();
/*
- * Let smgr do post-checkpoint cleanup (eg, deleting old files).
- */
- SyncPostCheckpoint();
-
- /*
* Update the average distance between checkpoints if the prior checkpoint
* exists.
*/
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index aaf8881..7d15920 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -24,6 +24,7 @@
#include <unistd.h>
#include <fcntl.h>
#include <sys/file.h>
+#include <sys/stat.h>
#include "access/xlog.h"
#include "access/xlogutils.h"
@@ -126,8 +127,6 @@ static void mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forkNum,
static MdfdVec *mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior);
static void register_dirty_segment(SMgrRelation reln, ForkNumber forknum,
MdfdVec *seg);
-static void register_unlink_segment(RelFileLocatorBackend rlocator, ForkNumber forknum,
- BlockNumber segno);
static void register_forget_request(RelFileLocatorBackend rlocator, ForkNumber forknum,
BlockNumber segno);
static void _fdvec_resize(SMgrRelation reln,
@@ -240,38 +239,14 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* forkNum can be a fork number to delete a specific fork, or InvalidForkNumber
* to delete all forks.
*
- * For regular relations, we don't unlink the first segment file of the rel,
- * but just truncate it to zero length, and record a request to unlink it after
- * the next checkpoint. Additional segments can be unlinked immediately,
- * however. Leaving the empty file in place prevents that relfilenumber
- * from being reused. The scenario this protects us from is:
- * 1. We delete a relation (and commit, and actually remove its file).
- * 2. We create a new relation, which by chance gets the same relfilenumber as
- * the just-deleted one (OIDs must've wrapped around for that to happen).
- * 3. We crash before another checkpoint occurs.
- * During replay, we would delete the file and then recreate it, which is fine
- * if the contents of the file were repopulated by subsequent WAL entries.
- * But if we didn't WAL-log insertions, but instead relied on fsyncing the
- * file after populating it (as we do at wal_level=minimal), the contents of
- * the file would be lost forever. By leaving the empty file until after the
- * next checkpoint, we prevent reassignment of the relfilenumber until it's
- * safe, because relfilenumber assignment skips over any existing file.
- *
- * XXX although this all was true when we had 32bits relfilenumber but now we
- * have 56bits relfilenumber so we don't have risk of relfilenumber being
- * reused so in future we can immediately unlink the first segment as well.
- *
- * We do not need to go through this dance for temp relations, though, because
- * we never make WAL entries for temp rels, and so a temp rel poses no threat
- * to the health of a regular rel that has taken over its relfilenumber.
- * The fact that temp rels and regular rels have different file naming
- * patterns provides additional safety.
+ * We do not carefully track whether other forks have been created or not, but
+ * just attempt to unlink them unconditionally; so we should never complain
+ * about ENOENT.
*
- * All the above applies only to the relation's main fork; other forks can
- * just be removed immediately, since they are not needed to prevent the
- * relfilenumber from being recycled. Also, we do not carefully
- * track whether other forks have been created or not, but just attempt to
- * unlink them unconditionally; so we should never complain about ENOENT.
+ * Note that now we can immediately unlink the first segment of the regular
+ * relation as well because the relfilenumber is 56 bits wide since PG 16. So
+ * we don't have to worry about relfilenumber getting reused for some unrelated
+ * relation file.
*
* If isRedo is true, it's unsurprising for the relation to be already gone.
* Also, we should remove the file immediately instead of queuing a request
@@ -322,90 +297,67 @@ static void
mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forkNum, bool isRedo)
{
char *path;
- int ret;
+ char *segpath;
+ int segno;
+ int lastsegment = -1;
+ struct stat statbuf;
path = relpath(rlocator, forkNum);
+ segpath = (char *) palloc(strlen(path) + 12);
- /*
- * Delete or truncate the first segment.
- */
- if (isRedo || forkNum != MAIN_FORKNUM || RelFileLocatorBackendIsTemp(rlocator))
+ /* compute number of segments. */
+ for (segno = 0;; segno++)
{
- if (!RelFileLocatorBackendIsTemp(rlocator))
- {
- /* Prevent other backends' fds from holding on to the disk space */
- ret = do_truncate(path);
-
- /* Forget any pending sync requests for the first segment */
- register_forget_request(rlocator, forkNum, 0 /* first seg */ );
- }
+ if (segno == 0)
+ sprintf(segpath, "%s", path);
else
- ret = 0;
+ sprintf(segpath, "%s.%u", path, segno);
- /* Next unlink the file, unless it was already found to be missing */
- if (ret == 0 || errno != ENOENT)
+ if (stat(segpath, &statbuf) != 0)
{
- ret = unlink(path);
- if (ret < 0 && errno != ENOENT)
- ereport(WARNING,
- (errcode_for_file_access(),
- errmsg("could not remove file \"%s\": %m", path)));
+ /* ENOENT is expected after the last segment... */
+ if (errno == ENOENT)
+ break;
}
- }
- else
- {
- /* Prevent other backends' fds from holding on to the disk space */
- ret = do_truncate(path);
-
- /* Register request to unlink first segment later */
- register_unlink_segment(rlocator, forkNum, 0 /* first seg */ );
+ lastsegment = segno;
}
/*
- * Delete any additional segments.
+ * Unlink segment files in descending order so that a failure while
+ * deleting one of them does not leave a gap in the segment file
+ * sequence.
*/
- if (ret >= 0)
+ for (segno = lastsegment; segno >= 0; segno--)
{
- char *segpath = (char *) palloc(strlen(path) + 12);
- BlockNumber segno;
-
- /*
- * Note that because we loop until getting ENOENT, we will correctly
- * remove all inactive segments as well as active ones.
- */
- for (segno = 1;; segno++)
- {
+ if (segno == 0)
+ sprintf(segpath, "%s", path);
+ else
sprintf(segpath, "%s.%u", path, segno);
if (!RelFileLocatorBackendIsTemp(rlocator))
{
/*
- * Prevent other backends' fds from holding on to the disk
+ * prevent other backends' fds from holding on to the disk
* space.
*/
- if (do_truncate(segpath) < 0 && errno == ENOENT)
- break;
+ do_truncate(segpath);
- /*
- * Forget any pending sync requests for this segment before we
- * try to unlink.
- */
+ /* Forget any pending sync requests for this segment. */
register_forget_request(rlocator, forkNum, segno);
}
- if (unlink(segpath) < 0)
- {
- /* ENOENT is expected after the last segment... */
- if (errno != ENOENT)
- ereport(WARNING,
- (errcode_for_file_access(),
- errmsg("could not remove file \"%s\": %m", segpath)));
- break;
- }
- }
- pfree(segpath);
+ /*
+ * Unlink the file. We already checked that the file exists in the
+ * loop above while counting the segments, so we do not need to
+ * check for ENOENT here.
+ */
+ if (unlink(segpath))
+ ereport(WARNING,
+ (errcode_for_file_access(),
+ errmsg("could not remove file \"%s\": %m", segpath)));
}
+ pfree(segpath);
pfree(path);
}
@@ -1006,23 +958,6 @@ register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
}
/*
- * register_unlink_segment() -- Schedule a file to be deleted after next checkpoint
- */
-static void
-register_unlink_segment(RelFileLocatorBackend rlocator, ForkNumber forknum,
- BlockNumber segno)
-{
- FileTag tag;
-
- INIT_MD_FILETAG(tag, rlocator.locator, forknum, segno);
-
- /* Should never be used with temp relations */
- Assert(!RelFileLocatorBackendIsTemp(rlocator));
-
- RegisterSyncRequest(&tag, SYNC_UNLINK_REQUEST, true /* retryOnError */ );
-}
-
-/*
* register_forget_request() -- forget any fsyncs for a relation fork's segment
*/
static void
diff --git a/src/backend/storage/sync/sync.c b/src/backend/storage/sync/sync.c
index e1fb631..9a4a31c 100644
--- a/src/backend/storage/sync/sync.c
+++ b/src/backend/storage/sync/sync.c
@@ -201,92 +201,6 @@ SyncPreCheckpoint(void)
}
/*
- * SyncPostCheckpoint() -- Do post-checkpoint work
- *
- * Remove any lingering files that can now be safely removed.
- */
-void
-SyncPostCheckpoint(void)
-{
- int absorb_counter;
- ListCell *lc;
-
- absorb_counter = UNLINKS_PER_ABSORB;
- foreach(lc, pendingUnlinks)
- {
- PendingUnlinkEntry *entry = (PendingUnlinkEntry *) lfirst(lc);
- char path[MAXPGPATH];
-
- /* Skip over any canceled entries */
- if (entry->canceled)
- continue;
-
- /*
- * New entries are appended to the end, so if the entry is new we've
- * reached the end of old entries.
- *
- * Note: if just the right number of consecutive checkpoints fail, we
- * could be fooled here by cycle_ctr wraparound. However, the only
- * consequence is that we'd delay unlinking for one more checkpoint,
- * which is perfectly tolerable.
- */
- if (entry->cycle_ctr == checkpoint_cycle_ctr)
- break;
-
- /* Unlink the file */
- if (syncsw[entry->tag.handler].sync_unlinkfiletag(&entry->tag,
- path) < 0)
- {
- /*
- * There's a race condition, when the database is dropped at the
- * same time that we process the pending unlink requests. If the
- * DROP DATABASE deletes the file before we do, we will get ENOENT
- * here. rmtree() also has to ignore ENOENT errors, to deal with
- * the possibility that we delete the file first.
- */
- if (errno != ENOENT)
- ereport(WARNING,
- (errcode_for_file_access(),
- errmsg("could not remove file \"%s\": %m", path)));
- }
-
- /* Mark the list entry as canceled, just in case */
- entry->canceled = true;
-
- /*
- * As in ProcessSyncRequests, we don't want to stop absorbing fsync
- * requests for a long time when there are many deletions to be done.
- * We can safely call AbsorbSyncRequests() at this point in the loop.
- */
- if (--absorb_counter <= 0)
- {
- AbsorbSyncRequests();
- absorb_counter = UNLINKS_PER_ABSORB;
- }
- }
-
- /*
- * If we reached the end of the list, we can just remove the whole list
- * (remembering to pfree all the PendingUnlinkEntry objects). Otherwise,
- * we must keep the entries at or after "lc".
- */
- if (lc == NULL)
- {
- list_free_deep(pendingUnlinks);
- pendingUnlinks = NIL;
- }
- else
- {
- int ntodelete = list_cell_number(pendingUnlinks, lc);
-
- for (int i = 0; i < ntodelete; i++)
- pfree(list_nth(pendingUnlinks, i));
-
- pendingUnlinks = list_delete_first_n(pendingUnlinks, ntodelete);
- }
-}
-
-/*
* ProcessSyncRequests() -- Process queued fsync requests.
*/
void
@@ -532,21 +446,6 @@ RememberSyncRequest(const FileTag *ftag, SyncRequestType type)
entry->canceled = true;
}
}
- else if (type == SYNC_UNLINK_REQUEST)
- {
- /* Unlink request: put it in the linked list */
- MemoryContext oldcxt = MemoryContextSwitchTo(pendingOpsCxt);
- PendingUnlinkEntry *entry;
-
- entry = palloc(sizeof(PendingUnlinkEntry));
- entry->tag = *ftag;
- entry->cycle_ctr = checkpoint_cycle_ctr;
- entry->canceled = false;
-
- pendingUnlinks = lappend(pendingUnlinks, entry);
-
- MemoryContextSwitchTo(oldcxt);
- }
else
{
/* Normal case: enter a request to fsync this segment */
diff --git a/src/include/storage/sync.h b/src/include/storage/sync.h
index 049af87..2c0b812 100644
--- a/src/include/storage/sync.h
+++ b/src/include/storage/sync.h
@@ -23,7 +23,6 @@
typedef enum SyncRequestType
{
SYNC_REQUEST, /* schedule a call of sync function */
- SYNC_UNLINK_REQUEST, /* schedule a call of unlink function */
SYNC_FORGET_REQUEST, /* forget all calls for a tag */
SYNC_FILTER_REQUEST /* forget all calls satisfying match fn */
} SyncRequestType;
@@ -57,7 +56,6 @@ typedef struct FileTag
extern void InitSync(void);
extern void SyncPreCheckpoint(void);
-extern void SyncPostCheckpoint(void);
extern void ProcessSyncRequests(void);
extern void RememberSyncRequest(const FileTag *ftag, SyncRequestType type);
extern bool RegisterSyncRequest(const FileTag *ftag, SyncRequestType type,
--
1.8.3.1
Attachment: v6-0001-Rename-RelFileNode-to-RelFileLocator-and-relNode-.patch (text/x-patch; charset=US-ASCII)
From 6b9c6f17d354fb742db47e2756cddaff43e2f72c Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Tue, 21 Jun 2022 14:04:01 +0530
Subject: [PATCH v6 1/5] Rename RelFileNode to RelFileLocator and relNode to
relNumber
Currently, the way relfilenode and relnode are used is confusing.
There is some precedent for calling the number that pertains to the
file on disk "relnode", and that value combined with the database
and tablespace OIDs "relfilenode", but it is not the most obvious
terminology and it is not used uniformly.
As part of this patch set, these variables are renamed to better
match their usage: RelFileNode becomes RelFileLocator, and related
variables are renamed from relfilenode to relfilelocator. Within
RelFileLocator, relNode becomes relNumber, and dbNode and spcNode
become dbOid and spcOid. All other references to relnode/relfilenode
that refer to the on-disk file are renamed to relnumber/relfilenumber.
---
contrib/bloom/blinsert.c | 2 +-
contrib/oid2name/oid2name.c | 28 +--
contrib/pg_buffercache/pg_buffercache_pages.c | 10 +-
contrib/pg_prewarm/autoprewarm.c | 26 +--
contrib/pg_visibility/pg_visibility.c | 2 +-
src/backend/access/common/syncscan.c | 29 +--
src/backend/access/gin/ginbtree.c | 2 +-
src/backend/access/gin/ginfast.c | 2 +-
src/backend/access/gin/ginutil.c | 2 +-
src/backend/access/gin/ginxlog.c | 6 +-
src/backend/access/gist/gistbuild.c | 4 +-
src/backend/access/gist/gistxlog.c | 11 +-
src/backend/access/hash/hash_xlog.c | 6 +-
src/backend/access/hash/hashpage.c | 4 +-
src/backend/access/heap/heapam.c | 78 +++----
src/backend/access/heap/heapam_handler.c | 26 +--
src/backend/access/heap/rewriteheap.c | 10 +-
src/backend/access/heap/visibilitymap.c | 4 +-
src/backend/access/nbtree/nbtpage.c | 2 +-
src/backend/access/nbtree/nbtree.c | 2 +-
src/backend/access/nbtree/nbtsort.c | 2 +-
src/backend/access/nbtree/nbtxlog.c | 8 +-
src/backend/access/rmgrdesc/genericdesc.c | 2 +-
src/backend/access/rmgrdesc/gindesc.c | 2 +-
src/backend/access/rmgrdesc/gistdesc.c | 6 +-
src/backend/access/rmgrdesc/heapdesc.c | 6 +-
src/backend/access/rmgrdesc/nbtdesc.c | 4 +-
src/backend/access/rmgrdesc/seqdesc.c | 4 +-
src/backend/access/rmgrdesc/smgrdesc.c | 4 +-
src/backend/access/rmgrdesc/xactdesc.c | 44 ++--
src/backend/access/rmgrdesc/xlogdesc.c | 10 +-
src/backend/access/spgist/spginsert.c | 6 +-
src/backend/access/spgist/spgxlog.c | 6 +-
src/backend/access/table/tableamapi.c | 2 +-
src/backend/access/transam/README | 14 +-
src/backend/access/transam/README.parallel | 2 +-
src/backend/access/transam/twophase.c | 38 ++--
src/backend/access/transam/varsup.c | 2 +-
src/backend/access/transam/xact.c | 40 ++--
src/backend/access/transam/xloginsert.c | 38 ++--
src/backend/access/transam/xlogprefetcher.c | 97 ++++----
src/backend/access/transam/xlogreader.c | 25 ++-
src/backend/access/transam/xlogrecovery.c | 18 +-
src/backend/access/transam/xlogutils.c | 73 +++---
src/backend/bootstrap/bootparse.y | 8 +-
src/backend/catalog/catalog.c | 30 +--
src/backend/catalog/heap.c | 56 ++---
src/backend/catalog/index.c | 39 ++--
src/backend/catalog/storage.c | 119 +++++-----
src/backend/commands/cluster.c | 46 ++--
src/backend/commands/copyfrom.c | 8 +-
src/backend/commands/dbcommands.c | 106 ++++-----
src/backend/commands/indexcmds.c | 14 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/sequence.c | 29 +--
src/backend/commands/tablecmds.c | 87 ++++----
src/backend/commands/tablespace.c | 18 +-
src/backend/nodes/copyfuncs.c | 4 +-
src/backend/nodes/equalfuncs.c | 4 +-
src/backend/nodes/outfuncs.c | 4 +-
src/backend/parser/gram.y | 8 +-
src/backend/parser/parse_utilcmd.c | 8 +-
src/backend/postmaster/checkpointer.c | 2 +-
src/backend/replication/logical/decode.c | 40 ++--
src/backend/replication/logical/reorderbuffer.c | 50 ++---
src/backend/replication/logical/snapbuild.c | 2 +-
src/backend/storage/buffer/bufmgr.c | 284 ++++++++++++------------
src/backend/storage/buffer/localbuf.c | 34 +--
src/backend/storage/freespace/freespace.c | 6 +-
src/backend/storage/freespace/fsmpage.c | 6 +-
src/backend/storage/ipc/standby.c | 8 +-
src/backend/storage/lmgr/predicate.c | 24 +-
src/backend/storage/smgr/README | 2 +-
src/backend/storage/smgr/md.c | 126 +++++------
src/backend/storage/smgr/smgr.c | 44 ++--
src/backend/utils/adt/dbsize.c | 64 +++---
src/backend/utils/adt/pg_upgrade_support.c | 14 +-
src/backend/utils/cache/Makefile | 2 +-
src/backend/utils/cache/inval.c | 16 +-
src/backend/utils/cache/relcache.c | 184 +++++++--------
src/backend/utils/cache/relfilenodemap.c | 244 --------------------
src/backend/utils/cache/relfilenumbermap.c | 244 ++++++++++++++++++++
src/backend/utils/cache/relmapper.c | 90 ++++----
src/bin/pg_dump/pg_dump.c | 36 +--
src/bin/pg_rewind/datapagemap.h | 2 +-
src/bin/pg_rewind/filemap.c | 34 +--
src/bin/pg_rewind/filemap.h | 4 +-
src/bin/pg_rewind/parsexlog.c | 10 +-
src/bin/pg_rewind/pg_rewind.h | 2 +-
src/bin/pg_upgrade/Makefile | 2 +-
src/bin/pg_upgrade/info.c | 10 +-
src/bin/pg_upgrade/pg_upgrade.h | 22 +-
src/bin/pg_upgrade/relfilenode.c | 259 ---------------------
src/bin/pg_upgrade/relfilenumber.c | 259 +++++++++++++++++++++
src/bin/pg_waldump/pg_waldump.c | 26 +--
src/common/relpath.c | 48 ++--
src/include/access/brin_xlog.h | 2 +-
src/include/access/ginxlog.h | 4 +-
src/include/access/gistxlog.h | 2 +-
src/include/access/heapam_xlog.h | 8 +-
src/include/access/nbtxlog.h | 4 +-
src/include/access/rewriteheap.h | 6 +-
src/include/access/tableam.h | 59 ++---
src/include/access/xact.h | 26 +--
src/include/access/xlog_internal.h | 2 +-
src/include/access/xloginsert.h | 8 +-
src/include/access/xlogreader.h | 6 +-
src/include/access/xlogrecord.h | 8 +-
src/include/access/xlogutils.h | 8 +-
src/include/catalog/binary_upgrade.h | 6 +-
src/include/catalog/catalog.h | 5 +-
src/include/catalog/heap.h | 2 +-
src/include/catalog/index.h | 2 +-
src/include/catalog/storage.h | 10 +-
src/include/catalog/storage_xlog.h | 8 +-
src/include/commands/sequence.h | 4 +-
src/include/commands/tablecmds.h | 2 +-
src/include/commands/tablespace.h | 2 +-
src/include/common/relpath.h | 24 +-
src/include/nodes/parsenodes.h | 8 +-
src/include/postgres_ext.h | 7 +
src/include/postmaster/bgwriter.h | 2 +-
src/include/replication/reorderbuffer.h | 6 +-
src/include/storage/buf_internals.h | 28 +--
src/include/storage/bufmgr.h | 16 +-
src/include/storage/freespace.h | 4 +-
src/include/storage/md.h | 6 +-
src/include/storage/relfilelocator.h | 99 +++++++++
src/include/storage/relfilenode.h | 99 ---------
src/include/storage/sinval.h | 4 +-
src/include/storage/smgr.h | 12 +-
src/include/storage/standby.h | 6 +-
src/include/storage/sync.h | 4 +-
src/include/utils/inval.h | 4 +-
src/include/utils/rel.h | 46 ++--
src/include/utils/relcache.h | 8 +-
src/include/utils/relfilenodemap.h | 18 --
src/include/utils/relfilenumbermap.h | 19 ++
src/include/utils/relmapper.h | 13 +-
src/test/recovery/t/018_wal_optimize.pl | 2 +-
src/tools/pgindent/typedefs.list | 10 +-
141 files changed, 2078 insertions(+), 2050 deletions(-)
delete mode 100644 src/backend/utils/cache/relfilenodemap.c
create mode 100644 src/backend/utils/cache/relfilenumbermap.c
delete mode 100644 src/bin/pg_upgrade/relfilenode.c
create mode 100644 src/bin/pg_upgrade/relfilenumber.c
create mode 100644 src/include/storage/relfilelocator.h
delete mode 100644 src/include/storage/relfilenode.h
delete mode 100644 src/include/utils/relfilenodemap.h
create mode 100644 src/include/utils/relfilenumbermap.h
diff --git a/contrib/bloom/blinsert.c b/contrib/bloom/blinsert.c
index 82378db..e64291e 100644
--- a/contrib/bloom/blinsert.c
+++ b/contrib/bloom/blinsert.c
@@ -179,7 +179,7 @@ blbuildempty(Relation index)
PageSetChecksumInplace(metapage, BLOOM_METAPAGE_BLKNO);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, BLOOM_METAPAGE_BLKNO,
(char *) metapage, true);
- log_newpage(&(RelationGetSmgr(index))->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&(RelationGetSmgr(index))->smgr_rlocator.locator, INIT_FORKNUM,
BLOOM_METAPAGE_BLKNO, metapage, true);
/*
diff --git a/contrib/oid2name/oid2name.c b/contrib/oid2name/oid2name.c
index a3e358b..cadba3b 100644
--- a/contrib/oid2name/oid2name.c
+++ b/contrib/oid2name/oid2name.c
@@ -30,7 +30,7 @@ struct options
{
eary *tables;
eary *oids;
- eary *filenodes;
+ eary *filenumbers;
bool quiet;
bool systables;
@@ -125,9 +125,9 @@ get_opts(int argc, char **argv, struct options *my_opts)
my_opts->dbname = pg_strdup(optarg);
break;
- /* specify one filenode to show */
+ /* specify one filenumber to show */
case 'f':
- add_one_elt(optarg, my_opts->filenodes);
+ add_one_elt(optarg, my_opts->filenumbers);
break;
/* host to connect to */
@@ -494,7 +494,7 @@ sql_exec_dumpalltables(PGconn *conn, struct options *opts)
}
/*
- * Show oid, filenode, name, schema and tablespace for each of the
+ * Show oid, filenumber, name, schema and tablespace for each of the
* given objects in the current database.
*/
void
@@ -504,19 +504,19 @@ sql_exec_searchtables(PGconn *conn, struct options *opts)
char *qualifiers,
*ptr;
char *comma_oids,
- *comma_filenodes,
+ *comma_filenumbers,
*comma_tables;
bool written = false;
char *addfields = ",c.oid AS \"Oid\", nspname AS \"Schema\", spcname as \"Tablespace\" ";
- /* get tables qualifiers, whether names, filenodes, or OIDs */
+ /* get tables qualifiers, whether names, filenumbers, or OIDs */
comma_oids = get_comma_elts(opts->oids);
comma_tables = get_comma_elts(opts->tables);
- comma_filenodes = get_comma_elts(opts->filenodes);
+ comma_filenumbers = get_comma_elts(opts->filenumbers);
/* 80 extra chars for SQL expression */
qualifiers = (char *) pg_malloc(strlen(comma_oids) + strlen(comma_tables) +
- strlen(comma_filenodes) + 80);
+ strlen(comma_filenumbers) + 80);
ptr = qualifiers;
if (opts->oids->num > 0)
@@ -524,11 +524,11 @@ sql_exec_searchtables(PGconn *conn, struct options *opts)
ptr += sprintf(ptr, "c.oid IN (%s)", comma_oids);
written = true;
}
- if (opts->filenodes->num > 0)
+ if (opts->filenumbers->num > 0)
{
if (written)
ptr += sprintf(ptr, " OR ");
- ptr += sprintf(ptr, "pg_catalog.pg_relation_filenode(c.oid) IN (%s)", comma_filenodes);
+ ptr += sprintf(ptr, "pg_catalog.pg_relation_filenode(c.oid) IN (%s)", comma_filenumbers);
written = true;
}
if (opts->tables->num > 0)
@@ -539,7 +539,7 @@ sql_exec_searchtables(PGconn *conn, struct options *opts)
}
free(comma_oids);
free(comma_tables);
- free(comma_filenodes);
+ free(comma_filenumbers);
/* now build the query */
todo = psprintf("SELECT pg_catalog.pg_relation_filenode(c.oid) as \"Filenode\", relname as \"Table Name\" %s\n"
@@ -588,11 +588,11 @@ main(int argc, char **argv)
my_opts->oids = (eary *) pg_malloc(sizeof(eary));
my_opts->tables = (eary *) pg_malloc(sizeof(eary));
- my_opts->filenodes = (eary *) pg_malloc(sizeof(eary));
+ my_opts->filenumbers = (eary *) pg_malloc(sizeof(eary));
my_opts->oids->num = my_opts->oids->alloc = 0;
my_opts->tables->num = my_opts->tables->alloc = 0;
- my_opts->filenodes->num = my_opts->filenodes->alloc = 0;
+ my_opts->filenumbers->num = my_opts->filenumbers->alloc = 0;
/* parse the opts */
get_opts(argc, argv, my_opts);
@@ -618,7 +618,7 @@ main(int argc, char **argv)
/* display the given elements in the database */
if (my_opts->oids->num > 0 ||
my_opts->tables->num > 0 ||
- my_opts->filenodes->num > 0)
+ my_opts->filenumbers->num > 0)
{
if (!my_opts->quiet)
printf("From database \"%s\":\n", my_opts->dbname);
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 1bd579f..713f52a 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -26,7 +26,7 @@ PG_MODULE_MAGIC;
typedef struct
{
uint32 bufferid;
- Oid relfilenode;
+ RelFileNumber relfilenumber;
Oid reltablespace;
Oid reldatabase;
ForkNumber forknum;
@@ -153,9 +153,9 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenode = bufHdr->tag.rnode.relNode;
- fctx->record[i].reltablespace = bufHdr->tag.rnode.spcNode;
- fctx->record[i].reldatabase = bufHdr->tag.rnode.dbNode;
+ fctx->record[i].relfilenumber = bufHdr->tag.rlocator.relNumber;
+ fctx->record[i].reltablespace = bufHdr->tag.rlocator.spcOid;
+ fctx->record[i].reldatabase = bufHdr->tag.rlocator.dbOid;
fctx->record[i].forknum = bufHdr->tag.forkNum;
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
@@ -209,7 +209,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
}
else
{
- values[1] = ObjectIdGetDatum(fctx->record[i].relfilenode);
+ values[1] = ObjectIdGetDatum(fctx->record[i].relfilenumber);
nulls[1] = false;
values[2] = ObjectIdGetDatum(fctx->record[i].reltablespace);
nulls[2] = false;
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index c0c4f5d..7f1d55c 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -52,7 +52,7 @@
#include "utils/guc.h"
#include "utils/memutils.h"
#include "utils/rel.h"
-#include "utils/relfilenodemap.h"
+#include "utils/relfilenumbermap.h"
#include "utils/resowner.h"
#define AUTOPREWARM_FILE "autoprewarm.blocks"
@@ -62,7 +62,7 @@ typedef struct BlockInfoRecord
{
Oid database;
Oid tablespace;
- Oid filenode;
+ RelFileNumber filenumber;
ForkNumber forknum;
BlockNumber blocknum;
} BlockInfoRecord;
@@ -347,7 +347,7 @@ apw_load_buffers(void)
unsigned forknum;
if (fscanf(file, "%u,%u,%u,%u,%u\n", &blkinfo[i].database,
- &blkinfo[i].tablespace, &blkinfo[i].filenode,
+ &blkinfo[i].tablespace, &blkinfo[i].filenumber,
&forknum, &blkinfo[i].blocknum) != 5)
ereport(ERROR,
(errmsg("autoprewarm block dump file is corrupted at line %d",
@@ -494,7 +494,7 @@ autoprewarm_database_main(Datum main_arg)
* relation. Note that rel will be NULL if try_relation_open failed
* previously; in that case, there is nothing to close.
*/
- if (old_blk != NULL && old_blk->filenode != blk->filenode &&
+ if (old_blk != NULL && old_blk->filenumber != blk->filenumber &&
rel != NULL)
{
relation_close(rel, AccessShareLock);
@@ -506,13 +506,13 @@ autoprewarm_database_main(Datum main_arg)
* Try to open each new relation, but only once, when we first
* encounter it. If it's been dropped, skip the associated blocks.
*/
- if (old_blk == NULL || old_blk->filenode != blk->filenode)
+ if (old_blk == NULL || old_blk->filenumber != blk->filenumber)
{
Oid reloid;
Assert(rel == NULL);
StartTransactionCommand();
- reloid = RelidByRelfilenode(blk->tablespace, blk->filenode);
+ reloid = RelidByRelfilenumber(blk->tablespace, blk->filenumber);
if (OidIsValid(reloid))
rel = try_relation_open(reloid, AccessShareLock);
@@ -527,7 +527,7 @@ autoprewarm_database_main(Datum main_arg)
/* Once per fork, check for fork existence and size. */
if (old_blk == NULL ||
- old_blk->filenode != blk->filenode ||
+ old_blk->filenumber != blk->filenumber ||
old_blk->forknum != blk->forknum)
{
/*
@@ -631,9 +631,9 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
if (buf_state & BM_TAG_VALID &&
((buf_state & BM_PERMANENT) || dump_unlogged))
{
- block_info_array[num_blocks].database = bufHdr->tag.rnode.dbNode;
- block_info_array[num_blocks].tablespace = bufHdr->tag.rnode.spcNode;
- block_info_array[num_blocks].filenode = bufHdr->tag.rnode.relNode;
+ block_info_array[num_blocks].database = bufHdr->tag.rlocator.dbOid;
+ block_info_array[num_blocks].tablespace = bufHdr->tag.rlocator.spcOid;
+ block_info_array[num_blocks].filenumber = bufHdr->tag.rlocator.relNumber;
block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
@@ -671,7 +671,7 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
ret = fprintf(file, "%u,%u,%u,%u,%u\n",
block_info_array[i].database,
block_info_array[i].tablespace,
- block_info_array[i].filenode,
+ block_info_array[i].filenumber,
(uint32) block_info_array[i].forknum,
block_info_array[i].blocknum);
if (ret < 0)
@@ -900,7 +900,7 @@ do { \
* We depend on all records for a particular database being consecutive
* in the dump file; each per-database worker will preload blocks until
* it sees a block for some other database. Sorting by tablespace,
- * filenode, forknum, and blocknum isn't critical for correctness, but
+ * filenumber, forknum, and blocknum isn't critical for correctness, but
* helps us get a sequential I/O pattern.
*/
static int
@@ -911,7 +911,7 @@ apw_compare_blockinfo(const void *p, const void *q)
cmp_member_elem(database);
cmp_member_elem(tablespace);
- cmp_member_elem(filenode);
+ cmp_member_elem(filenumber);
cmp_member_elem(forknum);
cmp_member_elem(blocknum);
diff --git a/contrib/pg_visibility/pg_visibility.c b/contrib/pg_visibility/pg_visibility.c
index 1853c35..4e2e9ea 100644
--- a/contrib/pg_visibility/pg_visibility.c
+++ b/contrib/pg_visibility/pg_visibility.c
@@ -407,7 +407,7 @@ pg_truncate_visibility_map(PG_FUNCTION_ARGS)
xl_smgr_truncate xlrec;
xlrec.blkno = 0;
- xlrec.rnode = rel->rd_node;
+ xlrec.rlocator = rel->rd_locator;
xlrec.flags = SMGR_TRUNCATE_VM;
XLogBeginInsert();
diff --git a/src/backend/access/common/syncscan.c b/src/backend/access/common/syncscan.c
index d5b16c5..ad48cb7 100644
--- a/src/backend/access/common/syncscan.c
+++ b/src/backend/access/common/syncscan.c
@@ -90,7 +90,7 @@ bool trace_syncscan = false;
*/
typedef struct ss_scan_location_t
{
- RelFileNode relfilenode; /* identity of a relation */
+ RelFileLocator relfilelocator; /* identity of a relation */
BlockNumber location; /* last-reported location in the relation */
} ss_scan_location_t;
@@ -115,7 +115,7 @@ typedef struct ss_scan_locations_t
static ss_scan_locations_t *scan_locations;
/* prototypes for internal functions */
-static BlockNumber ss_search(RelFileNode relfilenode,
+static BlockNumber ss_search(RelFileLocator relfilelocator,
BlockNumber location, bool set);
@@ -159,9 +159,9 @@ SyncScanShmemInit(void)
* these invalid entries will fall off the LRU list and get
* replaced with real entries.
*/
- item->location.relfilenode.spcNode = InvalidOid;
- item->location.relfilenode.dbNode = InvalidOid;
- item->location.relfilenode.relNode = InvalidOid;
+ item->location.relfilelocator.spcOid = InvalidOid;
+ item->location.relfilelocator.dbOid = InvalidOid;
+ item->location.relfilelocator.relNumber = InvalidRelFileNumber;
item->location.location = InvalidBlockNumber;
item->prev = (i > 0) ?
@@ -176,10 +176,10 @@ SyncScanShmemInit(void)
/*
* ss_search --- search the scan_locations structure for an entry with the
- * given relfilenode.
+ * given relfilelocator.
*
* If "set" is true, the location is updated to the given location. If no
- * entry for the given relfilenode is found, it will be created at the head
+ * entry for the given relfilelocator is found, it will be created at the head
* of the list with the given location, even if "set" is false.
*
* In any case, the location after possible update is returned.
@@ -188,7 +188,7 @@ SyncScanShmemInit(void)
* data structure.
*/
static BlockNumber
-ss_search(RelFileNode relfilenode, BlockNumber location, bool set)
+ss_search(RelFileLocator relfilelocator, BlockNumber location, bool set)
{
ss_lru_item_t *item;
@@ -197,7 +197,8 @@ ss_search(RelFileNode relfilenode, BlockNumber location, bool set)
{
bool match;
- match = RelFileNodeEquals(item->location.relfilenode, relfilenode);
+ match = RelFileLocatorEquals(item->location.relfilelocator,
+ relfilelocator);
if (match || item->next == NULL)
{
@@ -207,7 +208,7 @@ ss_search(RelFileNode relfilenode, BlockNumber location, bool set)
*/
if (!match)
{
- item->location.relfilenode = relfilenode;
+ item->location.relfilelocator = relfilelocator;
item->location.location = location;
}
else if (set)
@@ -255,7 +256,7 @@ ss_get_location(Relation rel, BlockNumber relnblocks)
BlockNumber startloc;
LWLockAcquire(SyncScanLock, LW_EXCLUSIVE);
- startloc = ss_search(rel->rd_node, 0, false);
+ startloc = ss_search(rel->rd_locator, 0, false);
LWLockRelease(SyncScanLock);
/*
@@ -281,8 +282,8 @@ ss_get_location(Relation rel, BlockNumber relnblocks)
* ss_report_location --- update the current scan location
*
* Writes an entry into the shared Sync Scan state of the form
- * (relfilenode, blocknumber), overwriting any existing entry for the
- * same relfilenode.
+ * (relfilelocator, blocknumber), overwriting any existing entry for the
+ * same relfilelocator.
*/
void
ss_report_location(Relation rel, BlockNumber location)
@@ -309,7 +310,7 @@ ss_report_location(Relation rel, BlockNumber location)
{
if (LWLockConditionalAcquire(SyncScanLock, LW_EXCLUSIVE))
{
- (void) ss_search(rel->rd_node, location, true);
+ (void) ss_search(rel->rd_locator, location, true);
LWLockRelease(SyncScanLock);
}
#ifdef TRACE_SYNCSCAN
diff --git a/src/backend/access/gin/ginbtree.c b/src/backend/access/gin/ginbtree.c
index cc6d4e6..c75bfc2 100644
--- a/src/backend/access/gin/ginbtree.c
+++ b/src/backend/access/gin/ginbtree.c
@@ -470,7 +470,7 @@ ginPlaceToPage(GinBtree btree, GinBtreeStack *stack,
savedRightLink = GinPageGetOpaque(page)->rightlink;
/* Begin setting up WAL record */
- data.node = btree->index->rd_node;
+ data.locator = btree->index->rd_locator;
data.flags = xlflags;
if (BufferIsValid(childbuf))
{
diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
index 7409fdc..6c67744 100644
--- a/src/backend/access/gin/ginfast.c
+++ b/src/backend/access/gin/ginfast.c
@@ -235,7 +235,7 @@ ginHeapTupleFastInsert(GinState *ginstate, GinTupleCollector *collector)
needWal = RelationNeedsWAL(index);
- data.node = index->rd_node;
+ data.locator = index->rd_locator;
data.ntuples = 0;
data.newRightlink = data.prevTail = InvalidBlockNumber;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 20f4706..6df7f2e 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -688,7 +688,7 @@ ginUpdateStats(Relation index, const GinStatsData *stats, bool is_build)
XLogRecPtr recptr;
ginxlogUpdateMeta data;
- data.node = index->rd_node;
+ data.locator = index->rd_locator;
data.ntuples = 0;
data.newRightlink = data.prevTail = InvalidBlockNumber;
memcpy(&data.metadata, metadata, sizeof(GinMetaPageData));
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 87e8366..41b9211 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -95,13 +95,13 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
if (PageAddItem(page, (Item) itup, IndexTupleSize(itup), offset, false, false) == InvalidOffsetNumber)
{
- RelFileNode node;
+ RelFileLocator locator;
ForkNumber forknum;
BlockNumber blknum;
- BufferGetTag(buffer, &node, &forknum, &blknum);
+ BufferGetTag(buffer, &locator, &forknum, &blknum);
elog(ERROR, "failed to add item to index page in %u/%u/%u",
- node.spcNode, node.dbNode, node.relNode);
+ locator.spcOid, locator.dbOid, locator.relNumber);
}
}
diff --git a/src/backend/access/gist/gistbuild.c b/src/backend/access/gist/gistbuild.c
index f5a5caf..374e64e 100644
--- a/src/backend/access/gist/gistbuild.c
+++ b/src/backend/access/gist/gistbuild.c
@@ -462,7 +462,7 @@ gist_indexsortbuild(GISTBuildState *state)
smgrwrite(RelationGetSmgr(state->indexrel), MAIN_FORKNUM, GIST_ROOT_BLKNO,
levelstate->pages[0], true);
if (RelationNeedsWAL(state->indexrel))
- log_newpage(&state->indexrel->rd_node, MAIN_FORKNUM, GIST_ROOT_BLKNO,
+ log_newpage(&state->indexrel->rd_locator, MAIN_FORKNUM, GIST_ROOT_BLKNO,
levelstate->pages[0], true);
pfree(levelstate->pages[0]);
@@ -663,7 +663,7 @@ gist_indexsortbuild_flush_ready_pages(GISTBuildState *state)
}
if (RelationNeedsWAL(state->indexrel))
- log_newpages(&state->indexrel->rd_node, MAIN_FORKNUM, state->ready_num_pages,
+ log_newpages(&state->indexrel->rd_locator, MAIN_FORKNUM, state->ready_num_pages,
state->ready_blknos, state->ready_pages, true);
for (int i = 0; i < state->ready_num_pages; i++)
diff --git a/src/backend/access/gist/gistxlog.c b/src/backend/access/gist/gistxlog.c
index df70f90..b4f629f 100644
--- a/src/backend/access/gist/gistxlog.c
+++ b/src/backend/access/gist/gistxlog.c
@@ -191,11 +191,12 @@ gistRedoDeleteRecord(XLogReaderState *record)
*/
if (InHotStandby)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, NULL);
- ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid, rnode);
+ ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid,
+ rlocator);
}
if (XLogReadBufferForRedo(record, 0, &buffer) == BLK_NEEDS_REDO)
@@ -395,7 +396,7 @@ gistRedoPageReuse(XLogReaderState *record)
*/
if (InHotStandby)
ResolveRecoveryConflictWithSnapshotFullXid(xlrec->latestRemovedFullXid,
- xlrec->node);
+ xlrec->locator);
}
void
@@ -607,7 +608,7 @@ gistXLogPageReuse(Relation rel, BlockNumber blkno, FullTransactionId latestRemov
*/
/* XLOG stuff */
- xlrec_reuse.node = rel->rd_node;
+ xlrec_reuse.locator = rel->rd_locator;
xlrec_reuse.block = blkno;
xlrec_reuse.latestRemovedFullXid = latestRemovedXid;
diff --git a/src/backend/access/hash/hash_xlog.c b/src/backend/access/hash/hash_xlog.c
index 62dbfc3..2e68303 100644
--- a/src/backend/access/hash/hash_xlog.c
+++ b/src/backend/access/hash/hash_xlog.c
@@ -999,10 +999,10 @@ hash_xlog_vacuum_one_page(XLogReaderState *record)
*/
if (InHotStandby)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
- ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid, rnode);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, NULL);
+ ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid, rlocator);
}
action = XLogReadBufferForRedoExtended(record, 0, RBM_NORMAL, true, &buffer);
diff --git a/src/backend/access/hash/hashpage.c b/src/backend/access/hash/hashpage.c
index 39206d1..d2edcd4 100644
--- a/src/backend/access/hash/hashpage.c
+++ b/src/backend/access/hash/hashpage.c
@@ -428,7 +428,7 @@ _hash_init(Relation rel, double num_tuples, ForkNumber forkNum)
MarkBufferDirty(buf);
if (use_wal)
- log_newpage(&rel->rd_node,
+ log_newpage(&rel->rd_locator,
forkNum,
blkno,
BufferGetPage(buf),
@@ -1019,7 +1019,7 @@ _hash_alloc_buckets(Relation rel, BlockNumber firstblock, uint32 nblocks)
ovflopaque->hasho_page_id = HASHO_PAGE_ID;
if (RelationNeedsWAL(rel))
- log_newpage(&rel->rd_node,
+ log_newpage(&rel->rd_locator,
MAIN_FORKNUM,
lastblock,
zerobuf.data,
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 637de11..aab8d6f 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -8189,7 +8189,7 @@ log_heap_freeze(Relation reln, Buffer buffer, TransactionId cutoff_xid,
* heap_buffer, if necessary.
*/
XLogRecPtr
-log_heap_visible(RelFileNode rnode, Buffer heap_buffer, Buffer vm_buffer,
+log_heap_visible(RelFileLocator rlocator, Buffer heap_buffer, Buffer vm_buffer,
TransactionId cutoff_xid, uint8 vmflags)
{
xl_heap_visible xlrec;
@@ -8454,7 +8454,7 @@ log_heap_new_cid(Relation relation, HeapTuple tup)
Assert(tup->t_tableOid != InvalidOid);
xlrec.top_xid = GetTopTransactionId();
- xlrec.target_node = relation->rd_node;
+ xlrec.target_locator = relation->rd_locator;
xlrec.target_tid = tup->t_self;
/*
@@ -8623,18 +8623,18 @@ heap_xlog_prune(XLogReaderState *record)
XLogRecPtr lsn = record->EndRecPtr;
xl_heap_prune *xlrec = (xl_heap_prune *) XLogRecGetData(record);
Buffer buffer;
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber blkno;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &blkno);
/*
* We're about to remove tuples. In Hot Standby mode, ensure that there's
* no queries running for which the removed tuples are still visible.
*/
if (InHotStandby)
- ResolveRecoveryConflictWithSnapshot(xlrec->latestRemovedXid, rnode);
+ ResolveRecoveryConflictWithSnapshot(xlrec->latestRemovedXid, rlocator);
/*
* If we have a full-page image, restore it (using a cleanup lock) and
@@ -8694,7 +8694,7 @@ heap_xlog_prune(XLogReaderState *record)
* Do this regardless of a full-page image being applied, since the
* FSM data is not in the page anyway.
*/
- XLogRecordPageWithFreeSpace(rnode, blkno, freespace);
+ XLogRecordPageWithFreeSpace(rlocator, blkno, freespace);
}
}
@@ -8751,9 +8751,9 @@ heap_xlog_vacuum(XLogReaderState *record)
if (BufferIsValid(buffer))
{
Size freespace = PageGetHeapFreeSpace(BufferGetPage(buffer));
- RelFileNode rnode;
+ RelFileLocator rlocator;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &blkno);
UnlockReleaseBuffer(buffer);
@@ -8766,7 +8766,7 @@ heap_xlog_vacuum(XLogReaderState *record)
* Do this regardless of a full-page image being applied, since the
* FSM data is not in the page anyway.
*/
- XLogRecordPageWithFreeSpace(rnode, blkno, freespace);
+ XLogRecordPageWithFreeSpace(rlocator, blkno, freespace);
}
}
@@ -8786,11 +8786,11 @@ heap_xlog_visible(XLogReaderState *record)
Buffer vmbuffer = InvalidBuffer;
Buffer buffer;
Page page;
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber blkno;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 1, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 1, &rlocator, NULL, &blkno);
/*
* If there are any Hot Standby transactions running that have an xmin
@@ -8802,7 +8802,7 @@ heap_xlog_visible(XLogReaderState *record)
* rather than killing the transaction outright.
*/
if (InHotStandby)
- ResolveRecoveryConflictWithSnapshot(xlrec->cutoff_xid, rnode);
+ ResolveRecoveryConflictWithSnapshot(xlrec->cutoff_xid, rlocator);
/*
* Read the heap page, if it still exists. If the heap file has dropped or
@@ -8865,7 +8865,7 @@ heap_xlog_visible(XLogReaderState *record)
* FSM data is not in the page anyway.
*/
if (xlrec->flags & VISIBILITYMAP_VALID_BITS)
- XLogRecordPageWithFreeSpace(rnode, blkno, space);
+ XLogRecordPageWithFreeSpace(rlocator, blkno, space);
}
/*
@@ -8890,7 +8890,7 @@ heap_xlog_visible(XLogReaderState *record)
*/
LockBuffer(vmbuffer, BUFFER_LOCK_UNLOCK);
- reln = CreateFakeRelcacheEntry(rnode);
+ reln = CreateFakeRelcacheEntry(rlocator);
visibilitymap_pin(reln, blkno, &vmbuffer);
/*
@@ -8933,13 +8933,13 @@ heap_xlog_freeze_page(XLogReaderState *record)
*/
if (InHotStandby)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
TransactionId latestRemovedXid = cutoff_xid;
TransactionIdRetreat(latestRemovedXid);
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
- ResolveRecoveryConflictWithSnapshot(latestRemovedXid, rnode);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, NULL);
+ ResolveRecoveryConflictWithSnapshot(latestRemovedXid, rlocator);
}
if (XLogReadBufferForRedo(record, 0, &buffer) == BLK_NEEDS_REDO)
@@ -9007,10 +9007,10 @@ heap_xlog_delete(XLogReaderState *record)
ItemId lp = NULL;
HeapTupleHeader htup;
BlockNumber blkno;
- RelFileNode target_node;
+ RelFileLocator target_locator;
ItemPointerData target_tid;
- XLogRecGetBlockTag(record, 0, &target_node, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &target_locator, NULL, &blkno);
ItemPointerSetBlockNumber(&target_tid, blkno);
ItemPointerSetOffsetNumber(&target_tid, xlrec->offnum);
@@ -9020,7 +9020,7 @@ heap_xlog_delete(XLogReaderState *record)
*/
if (xlrec->flags & XLH_DELETE_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(target_node);
+ Relation reln = CreateFakeRelcacheEntry(target_locator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, blkno, &vmbuffer);
@@ -9086,12 +9086,12 @@ heap_xlog_insert(XLogReaderState *record)
xl_heap_header xlhdr;
uint32 newlen;
Size freespace = 0;
- RelFileNode target_node;
+ RelFileLocator target_locator;
BlockNumber blkno;
ItemPointerData target_tid;
XLogRedoAction action;
- XLogRecGetBlockTag(record, 0, &target_node, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &target_locator, NULL, &blkno);
ItemPointerSetBlockNumber(&target_tid, blkno);
ItemPointerSetOffsetNumber(&target_tid, xlrec->offnum);
@@ -9101,7 +9101,7 @@ heap_xlog_insert(XLogReaderState *record)
*/
if (xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(target_node);
+ Relation reln = CreateFakeRelcacheEntry(target_locator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, blkno, &vmbuffer);
@@ -9184,7 +9184,7 @@ heap_xlog_insert(XLogReaderState *record)
* totally accurate anyway.
*/
if (action == BLK_NEEDS_REDO && freespace < BLCKSZ / 5)
- XLogRecordPageWithFreeSpace(target_node, blkno, freespace);
+ XLogRecordPageWithFreeSpace(target_locator, blkno, freespace);
}
/*
@@ -9195,7 +9195,7 @@ heap_xlog_multi_insert(XLogReaderState *record)
{
XLogRecPtr lsn = record->EndRecPtr;
xl_heap_multi_insert *xlrec;
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber blkno;
Buffer buffer;
Page page;
@@ -9217,7 +9217,7 @@ heap_xlog_multi_insert(XLogReaderState *record)
*/
xlrec = (xl_heap_multi_insert *) XLogRecGetData(record);
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &blkno);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &blkno);
/* check that the mutually exclusive flags are not both set */
Assert(!((xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED) &&
@@ -9229,7 +9229,7 @@ heap_xlog_multi_insert(XLogReaderState *record)
*/
if (xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(rnode);
+ Relation reln = CreateFakeRelcacheEntry(rlocator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, blkno, &vmbuffer);
@@ -9331,7 +9331,7 @@ heap_xlog_multi_insert(XLogReaderState *record)
* totally accurate anyway.
*/
if (action == BLK_NEEDS_REDO && freespace < BLCKSZ / 5)
- XLogRecordPageWithFreeSpace(rnode, blkno, freespace);
+ XLogRecordPageWithFreeSpace(rlocator, blkno, freespace);
}
/*
@@ -9342,7 +9342,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
{
XLogRecPtr lsn = record->EndRecPtr;
xl_heap_update *xlrec = (xl_heap_update *) XLogRecGetData(record);
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber oldblk;
BlockNumber newblk;
ItemPointerData newtid;
@@ -9371,7 +9371,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
oldtup.t_data = NULL;
oldtup.t_len = 0;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &newblk);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &newblk);
if (XLogRecGetBlockTagExtended(record, 1, NULL, NULL, &oldblk, NULL))
{
/* HOT updates are never done across pages */
@@ -9388,7 +9388,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
*/
if (xlrec->flags & XLH_UPDATE_OLD_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(rnode);
+ Relation reln = CreateFakeRelcacheEntry(rlocator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, oldblk, &vmbuffer);
@@ -9472,7 +9472,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
*/
if (xlrec->flags & XLH_UPDATE_NEW_ALL_VISIBLE_CLEARED)
{
- Relation reln = CreateFakeRelcacheEntry(rnode);
+ Relation reln = CreateFakeRelcacheEntry(rlocator);
Buffer vmbuffer = InvalidBuffer;
visibilitymap_pin(reln, newblk, &vmbuffer);
@@ -9606,7 +9606,7 @@ heap_xlog_update(XLogReaderState *record, bool hot_update)
* totally accurate anyway.
*/
if (newaction == BLK_NEEDS_REDO && !hot_update && freespace < BLCKSZ / 5)
- XLogRecordPageWithFreeSpace(rnode, newblk, freespace);
+ XLogRecordPageWithFreeSpace(rlocator, newblk, freespace);
}
static void
@@ -9662,13 +9662,13 @@ heap_xlog_lock(XLogReaderState *record)
*/
if (xlrec->flags & XLH_LOCK_ALL_FROZEN_CLEARED)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
Buffer vmbuffer = InvalidBuffer;
BlockNumber block;
Relation reln;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &block);
- reln = CreateFakeRelcacheEntry(rnode);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &block);
+ reln = CreateFakeRelcacheEntry(rlocator);
visibilitymap_pin(reln, block, &vmbuffer);
visibilitymap_clear(reln, block, vmbuffer, VISIBILITYMAP_ALL_FROZEN);
@@ -9735,13 +9735,13 @@ heap_xlog_lock_updated(XLogReaderState *record)
*/
if (xlrec->flags & XLH_LOCK_ALL_FROZEN_CLEARED)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
Buffer vmbuffer = InvalidBuffer;
BlockNumber block;
Relation reln;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, &block);
- reln = CreateFakeRelcacheEntry(rnode);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, &block);
+ reln = CreateFakeRelcacheEntry(rlocator);
visibilitymap_pin(reln, block, &vmbuffer);
visibilitymap_clear(reln, block, vmbuffer, VISIBILITYMAP_ALL_FROZEN);
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 444f027..a3414a7 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -566,11 +566,11 @@ tuple_lock_retry:
*/
static void
-heapam_relation_set_new_filenode(Relation rel,
- const RelFileNode *newrnode,
- char persistence,
- TransactionId *freezeXid,
- MultiXactId *minmulti)
+heapam_relation_set_new_filelocator(Relation rel,
+ const RelFileLocator *newrlocator,
+ char persistence,
+ TransactionId *freezeXid,
+ MultiXactId *minmulti)
{
SMgrRelation srel;
@@ -591,7 +591,7 @@ heapam_relation_set_new_filenode(Relation rel,
*/
*minmulti = GetOldestMultiXactId();
- srel = RelationCreateStorage(*newrnode, persistence, true);
+ srel = RelationCreateStorage(*newrlocator, persistence, true);
/*
* If required, set up an init fork for an unlogged table so that it can
@@ -608,7 +608,7 @@ heapam_relation_set_new_filenode(Relation rel,
rel->rd_rel->relkind == RELKIND_MATVIEW ||
rel->rd_rel->relkind == RELKIND_TOASTVALUE);
smgrcreate(srel, INIT_FORKNUM, false);
- log_smgrcreate(newrnode, INIT_FORKNUM);
+ log_smgrcreate(newrlocator, INIT_FORKNUM);
smgrimmedsync(srel, INIT_FORKNUM);
}
@@ -622,11 +622,11 @@ heapam_relation_nontransactional_truncate(Relation rel)
}
static void
-heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
+heapam_relation_copy_data(Relation rel, const RelFileLocator *newrlocator)
{
SMgrRelation dstrel;
- dstrel = smgropen(*newrnode, rel->rd_backend);
+ dstrel = smgropen(*newrlocator, rel->rd_backend);
/*
* Since we copy the file directly without looking at the shared buffers,
@@ -640,10 +640,10 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
* Create and copy all forks of the relation, and schedule unlinking of
* old physical files.
*
- * NOTE: any conflict in relfilenode value will be caught in
+ * NOTE: any conflict in relfilenumber value will be caught in
* RelationCreateStorage().
*/
- RelationCreateStorage(*newrnode, rel->rd_rel->relpersistence, true);
+ RelationCreateStorage(*newrlocator, rel->rd_rel->relpersistence, true);
/* copy main fork */
RelationCopyStorage(RelationGetSmgr(rel), dstrel, MAIN_FORKNUM,
@@ -664,7 +664,7 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
if (RelationIsPermanent(rel) ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
- log_smgrcreate(newrnode, forkNum);
+ log_smgrcreate(newrlocator, forkNum);
RelationCopyStorage(RelationGetSmgr(rel), dstrel, forkNum,
rel->rd_rel->relpersistence);
}
@@ -2569,7 +2569,7 @@ static const TableAmRoutine heapam_methods = {
.tuple_satisfies_snapshot = heapam_tuple_satisfies_snapshot,
.index_delete_tuples = heap_index_delete_tuples,
- .relation_set_new_filenode = heapam_relation_set_new_filenode,
+ .relation_set_new_filelocator = heapam_relation_set_new_filelocator,
.relation_nontransactional_truncate = heapam_relation_nontransactional_truncate,
.relation_copy_data = heapam_relation_copy_data,
.relation_copy_for_cluster = heapam_relation_copy_for_cluster,
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 2a53826..197f06b 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -318,7 +318,7 @@ end_heap_rewrite(RewriteState state)
if (state->rs_buffer_valid)
{
if (RelationNeedsWAL(state->rs_new_rel))
- log_newpage(&state->rs_new_rel->rd_node,
+ log_newpage(&state->rs_new_rel->rd_locator,
MAIN_FORKNUM,
state->rs_blockno,
state->rs_buffer,
@@ -679,7 +679,7 @@ raw_heap_insert(RewriteState state, HeapTuple tup)
/* XLOG stuff */
if (RelationNeedsWAL(state->rs_new_rel))
- log_newpage(&state->rs_new_rel->rd_node,
+ log_newpage(&state->rs_new_rel->rd_locator,
MAIN_FORKNUM,
state->rs_blockno,
page,
@@ -742,7 +742,7 @@ raw_heap_insert(RewriteState state, HeapTuple tup)
* When doing logical decoding - which relies on using cmin/cmax of catalog
* tuples, via xl_heap_new_cid records - heap rewrites have to log enough
* information to allow the decoding backend to update its internal mapping
- * of (relfilenode,ctid) => (cmin, cmax) to be correct for the rewritten heap.
+ * of (relfilelocator,ctid) => (cmin, cmax) to be correct for the rewritten heap.
*
* For that, every time we find a tuple that's been modified in a catalog
* relation within the xmin horizon of any decoding slot, we log a mapping
@@ -1080,9 +1080,9 @@ logical_rewrite_heap_tuple(RewriteState state, ItemPointerData old_tid,
return;
/* fill out mapping information */
- map.old_node = state->rs_old_rel->rd_node;
+ map.old_locator = state->rs_old_rel->rd_locator;
map.old_tid = old_tid;
- map.new_node = state->rs_new_rel->rd_node;
+ map.new_locator = state->rs_new_rel->rd_locator;
map.new_tid = new_tid;
/* ---
diff --git a/src/backend/access/heap/visibilitymap.c b/src/backend/access/heap/visibilitymap.c
index e09f25a..ed72eb7 100644
--- a/src/backend/access/heap/visibilitymap.c
+++ b/src/backend/access/heap/visibilitymap.c
@@ -283,7 +283,7 @@ visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf,
if (XLogRecPtrIsInvalid(recptr))
{
Assert(!InRecovery);
- recptr = log_heap_visible(rel->rd_node, heapBuf, vmBuf,
+ recptr = log_heap_visible(rel->rd_locator, heapBuf, vmBuf,
cutoff_xid, flags);
/*
@@ -668,7 +668,7 @@ vm_extend(Relation rel, BlockNumber vm_nblocks)
* to keep checking for creation or extension of the file, which happens
* infrequently.
*/
- CacheInvalidateSmgr(reln->smgr_rnode);
+ CacheInvalidateSmgr(reln->smgr_rlocator);
UnlockRelationForExtension(rel, ExclusiveLock);
}
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 20adb60..8b96708 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -836,7 +836,7 @@ _bt_log_reuse_page(Relation rel, BlockNumber blkno, FullTransactionId safexid)
*/
/* XLOG stuff */
- xlrec_reuse.node = rel->rd_node;
+ xlrec_reuse.locator = rel->rd_locator;
xlrec_reuse.block = blkno;
xlrec_reuse.latestRemovedFullXid = safexid;
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 9b730f3..b52eca8 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -166,7 +166,7 @@ btbuildempty(Relation index)
PageSetChecksumInplace(metapage, BTREE_METAPAGE);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, BTREE_METAPAGE,
(char *) metapage, true);
- log_newpage(&RelationGetSmgr(index)->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&RelationGetSmgr(index)->smgr_rlocator.locator, INIT_FORKNUM,
BTREE_METAPAGE, metapage, true);
/*
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index 9f60fa9..bd1685c 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -647,7 +647,7 @@ _bt_blwritepage(BTWriteState *wstate, Page page, BlockNumber blkno)
if (wstate->btws_use_wal)
{
/* We use the XLOG_FPI record type for this */
- log_newpage(&wstate->index->rd_node, MAIN_FORKNUM, blkno, page, true);
+ log_newpage(&wstate->index->rd_locator, MAIN_FORKNUM, blkno, page, true);
}
/*
diff --git a/src/backend/access/nbtree/nbtxlog.c b/src/backend/access/nbtree/nbtxlog.c
index f9186ca..ad489e3 100644
--- a/src/backend/access/nbtree/nbtxlog.c
+++ b/src/backend/access/nbtree/nbtxlog.c
@@ -664,11 +664,11 @@ btree_xlog_delete(XLogReaderState *record)
*/
if (InHotStandby)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
- XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, &rlocator, NULL, NULL);
- ResolveRecoveryConflictWithSnapshot(xlrec->latestRemovedXid, rnode);
+ ResolveRecoveryConflictWithSnapshot(xlrec->latestRemovedXid, rlocator);
}
/*
@@ -1006,7 +1006,7 @@ btree_xlog_reuse_page(XLogReaderState *record)
if (InHotStandby)
ResolveRecoveryConflictWithSnapshotFullXid(xlrec->latestRemovedFullXid,
- xlrec->node);
+ xlrec->locator);
}
void
diff --git a/src/backend/access/rmgrdesc/genericdesc.c b/src/backend/access/rmgrdesc/genericdesc.c
index 877beb5..d8509b8 100644
--- a/src/backend/access/rmgrdesc/genericdesc.c
+++ b/src/backend/access/rmgrdesc/genericdesc.c
@@ -15,7 +15,7 @@
#include "access/generic_xlog.h"
#include "lib/stringinfo.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* Description of generic xlog record: write page regions that this record
diff --git a/src/backend/access/rmgrdesc/gindesc.c b/src/backend/access/rmgrdesc/gindesc.c
index 57f7bce..7d147ce 100644
--- a/src/backend/access/rmgrdesc/gindesc.c
+++ b/src/backend/access/rmgrdesc/gindesc.c
@@ -17,7 +17,7 @@
#include "access/ginxlog.h"
#include "access/xlogutils.h"
#include "lib/stringinfo.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
static void
desc_recompress_leaf(StringInfo buf, ginxlogRecompressDataLeaf *insertData)
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index d0c8e24..7dd3c1d 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -16,7 +16,7 @@
#include "access/gistxlog.h"
#include "lib/stringinfo.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
static void
out_gistxlogPageUpdate(StringInfo buf, gistxlogPageUpdate *xlrec)
@@ -27,8 +27,8 @@ static void
out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec)
{
appendStringInfo(buf, "rel %u/%u/%u; blk %u; latestRemovedXid %u:%u",
- xlrec->node.spcNode, xlrec->node.dbNode,
- xlrec->node.relNode, xlrec->block,
+ xlrec->locator.spcOid, xlrec->locator.dbOid,
+ xlrec->locator.relNumber, xlrec->block,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
XidFromFullTransactionId(xlrec->latestRemovedFullXid));
}
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 6238085..923d3bc 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -170,9 +170,9 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
xl_heap_new_cid *xlrec = (xl_heap_new_cid *) rec;
appendStringInfo(buf, "rel %u/%u/%u; tid %u/%u",
- xlrec->target_node.spcNode,
- xlrec->target_node.dbNode,
- xlrec->target_node.relNode,
+ xlrec->target_locator.spcOid,
+ xlrec->target_locator.dbOid,
+ xlrec->target_locator.relNumber,
ItemPointerGetBlockNumber(&(xlrec->target_tid)),
ItemPointerGetOffsetNumber(&(xlrec->target_tid)));
appendStringInfo(buf, "; cmin: %u, cmax: %u, combo: %u",
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index dfbbf4e..4843cd5 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -101,8 +101,8 @@ btree_desc(StringInfo buf, XLogReaderState *record)
xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
appendStringInfo(buf, "rel %u/%u/%u; latestRemovedXid %u:%u",
- xlrec->node.spcNode, xlrec->node.dbNode,
- xlrec->node.relNode,
+ xlrec->locator.spcOid, xlrec->locator.dbOid,
+ xlrec->locator.relNumber,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
XidFromFullTransactionId(xlrec->latestRemovedFullXid));
break;
diff --git a/src/backend/access/rmgrdesc/seqdesc.c b/src/backend/access/rmgrdesc/seqdesc.c
index d9b1e60..b3845f9 100644
--- a/src/backend/access/rmgrdesc/seqdesc.c
+++ b/src/backend/access/rmgrdesc/seqdesc.c
@@ -26,8 +26,8 @@ seq_desc(StringInfo buf, XLogReaderState *record)
if (info == XLOG_SEQ_LOG)
appendStringInfo(buf, "rel %u/%u/%u",
- xlrec->node.spcNode, xlrec->node.dbNode,
- xlrec->node.relNode);
+ xlrec->locator.spcOid, xlrec->locator.dbOid,
+ xlrec->locator.relNumber);
}
const char *
diff --git a/src/backend/access/rmgrdesc/smgrdesc.c b/src/backend/access/rmgrdesc/smgrdesc.c
index 7547813..e0ee8a0 100644
--- a/src/backend/access/rmgrdesc/smgrdesc.c
+++ b/src/backend/access/rmgrdesc/smgrdesc.c
@@ -26,7 +26,7 @@ smgr_desc(StringInfo buf, XLogReaderState *record)
if (info == XLOG_SMGR_CREATE)
{
xl_smgr_create *xlrec = (xl_smgr_create *) rec;
- char *path = relpathperm(xlrec->rnode, xlrec->forkNum);
+ char *path = relpathperm(xlrec->rlocator, xlrec->forkNum);
appendStringInfoString(buf, path);
pfree(path);
@@ -34,7 +34,7 @@ smgr_desc(StringInfo buf, XLogReaderState *record)
else if (info == XLOG_SMGR_TRUNCATE)
{
xl_smgr_truncate *xlrec = (xl_smgr_truncate *) rec;
- char *path = relpathperm(xlrec->rnode, MAIN_FORKNUM);
+ char *path = relpathperm(xlrec->rlocator, MAIN_FORKNUM);
appendStringInfo(buf, "%s to %u blocks flags %d", path,
xlrec->blkno, xlrec->flags);
diff --git a/src/backend/access/rmgrdesc/xactdesc.c b/src/backend/access/rmgrdesc/xactdesc.c
index 90b6ac2..39752cf 100644
--- a/src/backend/access/rmgrdesc/xactdesc.c
+++ b/src/backend/access/rmgrdesc/xactdesc.c
@@ -73,15 +73,15 @@ ParseCommitRecord(uint8 info, xl_xact_commit *xlrec, xl_xact_parsed_commit *pars
data += parsed->nsubxacts * sizeof(TransactionId);
}
- if (parsed->xinfo & XACT_XINFO_HAS_RELFILENODES)
+ if (parsed->xinfo & XACT_XINFO_HAS_RELFILELOCATORS)
{
- xl_xact_relfilenodes *xl_relfilenodes = (xl_xact_relfilenodes *) data;
+ xl_xact_relfilelocators *xl_rellocators = (xl_xact_relfilelocators *) data;
- parsed->nrels = xl_relfilenodes->nrels;
- parsed->xnodes = xl_relfilenodes->xnodes;
+ parsed->nrels = xl_rellocators->nrels;
+ parsed->xlocators = xl_rellocators->xlocators;
- data += MinSizeOfXactRelfilenodes;
- data += xl_relfilenodes->nrels * sizeof(RelFileNode);
+ data += MinSizeOfXactRelfileLocators;
+ data += xl_rellocators->nrels * sizeof(RelFileLocator);
}
if (parsed->xinfo & XACT_XINFO_HAS_DROPPED_STATS)
@@ -179,15 +179,15 @@ ParseAbortRecord(uint8 info, xl_xact_abort *xlrec, xl_xact_parsed_abort *parsed)
data += parsed->nsubxacts * sizeof(TransactionId);
}
- if (parsed->xinfo & XACT_XINFO_HAS_RELFILENODES)
+ if (parsed->xinfo & XACT_XINFO_HAS_RELFILELOCATORS)
{
- xl_xact_relfilenodes *xl_relfilenodes = (xl_xact_relfilenodes *) data;
+ xl_xact_relfilelocators *xl_rellocator = (xl_xact_relfilelocators *) data;
- parsed->nrels = xl_relfilenodes->nrels;
- parsed->xnodes = xl_relfilenodes->xnodes;
+ parsed->nrels = xl_rellocator->nrels;
+ parsed->xlocators = xl_rellocator->xlocators;
- data += MinSizeOfXactRelfilenodes;
- data += xl_relfilenodes->nrels * sizeof(RelFileNode);
+ data += MinSizeOfXactRelfileLocators;
+ data += xl_rellocator->nrels * sizeof(RelFileLocator);
}
if (parsed->xinfo & XACT_XINFO_HAS_DROPPED_STATS)
@@ -260,11 +260,11 @@ ParsePrepareRecord(uint8 info, xl_xact_prepare *xlrec, xl_xact_parsed_prepare *p
parsed->subxacts = (TransactionId *) bufptr;
bufptr += MAXALIGN(xlrec->nsubxacts * sizeof(TransactionId));
- parsed->xnodes = (RelFileNode *) bufptr;
- bufptr += MAXALIGN(xlrec->ncommitrels * sizeof(RelFileNode));
+ parsed->xlocators = (RelFileLocator *) bufptr;
+ bufptr += MAXALIGN(xlrec->ncommitrels * sizeof(RelFileLocator));
- parsed->abortnodes = (RelFileNode *) bufptr;
- bufptr += MAXALIGN(xlrec->nabortrels * sizeof(RelFileNode));
+ parsed->abortlocators = (RelFileLocator *) bufptr;
+ bufptr += MAXALIGN(xlrec->nabortrels * sizeof(RelFileLocator));
parsed->stats = (xl_xact_stats_item *) bufptr;
bufptr += MAXALIGN(xlrec->ncommitstats * sizeof(xl_xact_stats_item));
@@ -278,7 +278,7 @@ ParsePrepareRecord(uint8 info, xl_xact_prepare *xlrec, xl_xact_parsed_prepare *p
static void
xact_desc_relations(StringInfo buf, char *label, int nrels,
- RelFileNode *xnodes)
+ RelFileLocator *xlocators)
{
int i;
@@ -287,7 +287,7 @@ xact_desc_relations(StringInfo buf, char *label, int nrels,
appendStringInfo(buf, "; %s:", label);
for (i = 0; i < nrels; i++)
{
- char *path = relpathperm(xnodes[i], MAIN_FORKNUM);
+ char *path = relpathperm(xlocators[i], MAIN_FORKNUM);
appendStringInfo(buf, " %s", path);
pfree(path);
@@ -340,7 +340,7 @@ xact_desc_commit(StringInfo buf, uint8 info, xl_xact_commit *xlrec, RepOriginId
appendStringInfoString(buf, timestamptz_to_str(xlrec->xact_time));
- xact_desc_relations(buf, "rels", parsed.nrels, parsed.xnodes);
+ xact_desc_relations(buf, "rels", parsed.nrels, parsed.xlocators);
xact_desc_subxacts(buf, parsed.nsubxacts, parsed.subxacts);
xact_desc_stats(buf, "", parsed.nstats, parsed.stats);
@@ -376,7 +376,7 @@ xact_desc_abort(StringInfo buf, uint8 info, xl_xact_abort *xlrec, RepOriginId or
appendStringInfoString(buf, timestamptz_to_str(xlrec->xact_time));
- xact_desc_relations(buf, "rels", parsed.nrels, parsed.xnodes);
+ xact_desc_relations(buf, "rels", parsed.nrels, parsed.xlocators);
xact_desc_subxacts(buf, parsed.nsubxacts, parsed.subxacts);
if (parsed.xinfo & XACT_XINFO_HAS_ORIGIN)
@@ -400,9 +400,9 @@ xact_desc_prepare(StringInfo buf, uint8 info, xl_xact_prepare *xlrec, RepOriginI
appendStringInfo(buf, "gid %s: ", parsed.twophase_gid);
appendStringInfoString(buf, timestamptz_to_str(parsed.xact_time));
- xact_desc_relations(buf, "rels(commit)", parsed.nrels, parsed.xnodes);
+ xact_desc_relations(buf, "rels(commit)", parsed.nrels, parsed.xlocators);
xact_desc_relations(buf, "rels(abort)", parsed.nabortrels,
- parsed.abortnodes);
+ parsed.abortlocators);
xact_desc_stats(buf, "commit ", parsed.nstats, parsed.stats);
xact_desc_stats(buf, "abort ", parsed.nabortstats, parsed.abortstats);
xact_desc_subxacts(buf, parsed.nsubxacts, parsed.subxacts);
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index fefc563..6fec485 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -219,12 +219,12 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
for (block_id = 0; block_id <= XLogRecMaxBlockId(record); block_id++)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blk;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blk, NULL))
+ &rlocator, &forknum, &blk, NULL))
continue;
if (detailed_format)
@@ -239,7 +239,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfo(buf,
"blkref #%d: rel %u/%u/%u fork %s blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
blk);
@@ -299,7 +299,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfo(buf,
", blkref #%d: rel %u/%u/%u fork %s blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
blk);
}
@@ -308,7 +308,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfo(buf,
", blkref #%d: rel %u/%u/%u blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
blk);
}
diff --git a/src/backend/access/spgist/spginsert.c b/src/backend/access/spgist/spginsert.c
index bfb7404..c6821b5 100644
--- a/src/backend/access/spgist/spginsert.c
+++ b/src/backend/access/spgist/spginsert.c
@@ -171,7 +171,7 @@ spgbuildempty(Relation index)
PageSetChecksumInplace(page, SPGIST_METAPAGE_BLKNO);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, SPGIST_METAPAGE_BLKNO,
(char *) page, true);
- log_newpage(&(RelationGetSmgr(index))->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&(RelationGetSmgr(index))->smgr_rlocator.locator, INIT_FORKNUM,
SPGIST_METAPAGE_BLKNO, page, true);
/* Likewise for the root page. */
@@ -180,7 +180,7 @@ spgbuildempty(Relation index)
PageSetChecksumInplace(page, SPGIST_ROOT_BLKNO);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, SPGIST_ROOT_BLKNO,
(char *) page, true);
- log_newpage(&(RelationGetSmgr(index))->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&(RelationGetSmgr(index))->smgr_rlocator.locator, INIT_FORKNUM,
SPGIST_ROOT_BLKNO, page, true);
/* Likewise for the null-tuples root page. */
@@ -189,7 +189,7 @@ spgbuildempty(Relation index)
PageSetChecksumInplace(page, SPGIST_NULL_BLKNO);
smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, SPGIST_NULL_BLKNO,
(char *) page, true);
- log_newpage(&(RelationGetSmgr(index))->smgr_rnode.node, INIT_FORKNUM,
+ log_newpage(&(RelationGetSmgr(index))->smgr_rlocator.locator, INIT_FORKNUM,
SPGIST_NULL_BLKNO, page, true);
/*
diff --git a/src/backend/access/spgist/spgxlog.c b/src/backend/access/spgist/spgxlog.c
index b500b2c..4c9f402 100644
--- a/src/backend/access/spgist/spgxlog.c
+++ b/src/backend/access/spgist/spgxlog.c
@@ -877,11 +877,11 @@ spgRedoVacuumRedirect(XLogReaderState *record)
{
if (TransactionIdIsValid(xldata->newestRedirectXid))
{
- RelFileNode node;
+ RelFileLocator locator;
- XLogRecGetBlockTag(record, 0, &node, NULL, NULL);
+ XLogRecGetBlockTag(record, 0, &locator, NULL, NULL);
ResolveRecoveryConflictWithSnapshot(xldata->newestRedirectXid,
- node);
+ locator);
}
}
diff --git a/src/backend/access/table/tableamapi.c b/src/backend/access/table/tableamapi.c
index 76df798..873d961 100644
--- a/src/backend/access/table/tableamapi.c
+++ b/src/backend/access/table/tableamapi.c
@@ -82,7 +82,7 @@ GetTableAmRoutine(Oid amhandler)
Assert(routine->tuple_update != NULL);
Assert(routine->tuple_lock != NULL);
- Assert(routine->relation_set_new_filenode != NULL);
+ Assert(routine->relation_set_new_filelocator != NULL);
Assert(routine->relation_nontransactional_truncate != NULL);
Assert(routine->relation_copy_data != NULL);
Assert(routine->relation_copy_for_cluster != NULL);
diff --git a/src/backend/access/transam/README b/src/backend/access/transam/README
index 1edc818..734c39a 100644
--- a/src/backend/access/transam/README
+++ b/src/backend/access/transam/README
@@ -557,7 +557,7 @@ void XLogRegisterBuffer(uint8 block_id, Buffer buf, uint8 flags);
XLogRegisterBuffer adds information about a data block to the WAL record.
block_id is an arbitrary number used to identify this page reference in
the redo routine. The information needed to re-find the page at redo -
- relfilenode, fork, and block number - are included in the WAL record.
+ relfilelocator, fork, and block number - are included in the WAL record.
XLogInsert will automatically include a full copy of the page contents, if
this is the first modification of the buffer since the last checkpoint.
@@ -692,7 +692,7 @@ by having database restart search for files that don't have any committed
entry in pg_class, but that currently isn't done because of the possibility
of deleting data that is useful for forensic analysis of the crash.
Orphan files are harmless --- at worst they waste a bit of disk space ---
-because we check for on-disk collisions when allocating new relfilenode
+because we check for on-disk collisions when allocating new relfilenumber
OIDs. So cleaning up isn't really necessary.
3. Deleting a table, which requires an unlink() that could fail.
@@ -725,10 +725,10 @@ then restart recovery. This is part of the reason for not writing a WAL
entry until we've successfully done the original action.
-Skipping WAL for New RelFileNode
+Skipping WAL for New RelFileLocator
--------------------------------
-Under wal_level=minimal, if a change modifies a relfilenode that ROLLBACK
+Under wal_level=minimal, if a change modifies a relfilenumber that ROLLBACK
would unlink, in-tree access methods write no WAL for that change. Code that
writes WAL without calling RelationNeedsWAL() must check for this case. This
skipping is mandatory. If a WAL-writing change preceded a WAL-skipping change
@@ -748,9 +748,9 @@ unconditionally for permanent relations. Under these approaches, the access
method callbacks must not call functions that react to RelationNeedsWAL().
This applies only to WAL records whose replay would modify bytes stored in the
-new relfilenode. It does not apply to other records about the relfilenode,
+new relfilenumber. It does not apply to other records about the relfilenumber,
such as XLOG_SMGR_CREATE. Because it operates at the level of individual
-relfilenodes, RelationNeedsWAL() can differ for tightly-coupled relations.
+relfilenumbers, RelationNeedsWAL() can differ for tightly-coupled relations.
Consider "CREATE TABLE t (); BEGIN; ALTER TABLE t ADD c text; ..." in which
ALTER TABLE adds a TOAST relation. The TOAST relation will skip WAL, while
the table owning it will not. ALTER TABLE SET TABLESPACE will cause a table
@@ -860,7 +860,7 @@ Changes to a temp table are not WAL-logged, hence could reach disk in
advance of T1's commit, but we don't care since temp table contents don't
survive crashes anyway.
-Database writes that skip WAL for new relfilenodes are also safe. In these
+Database writes that skip WAL for new relfilenumbers are also safe. In these
cases it's entirely possible for the data to reach disk before T1's commit,
because T1 will fsync it down to disk without any sort of interlock. However,
all these paths are designed to write data that no other transaction can see
diff --git a/src/backend/access/transam/README.parallel b/src/backend/access/transam/README.parallel
index 99c588d..e486bff 100644
--- a/src/backend/access/transam/README.parallel
+++ b/src/backend/access/transam/README.parallel
@@ -126,7 +126,7 @@ worker. This includes:
an index that is currently being rebuilt.
- Active relmapper.c mapping state. This is needed to allow consistent
- answers when fetching the current relfilenode for relation oids of
+ answers when fetching the current relfilenumber for relation oids of
mapped relations.
To prevent unprincipled deadlocks when running in parallel mode, this code
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 75551f6..41b31c5 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -204,7 +204,7 @@ static void RecordTransactionCommitPrepared(TransactionId xid,
int nchildren,
TransactionId *children,
int nrels,
- RelFileNode *rels,
+ RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
int ninvalmsgs,
@@ -215,7 +215,7 @@ static void RecordTransactionAbortPrepared(TransactionId xid,
int nchildren,
TransactionId *children,
int nrels,
- RelFileNode *rels,
+ RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
const char *gid);
@@ -951,8 +951,8 @@ TwoPhaseGetDummyProc(TransactionId xid, bool lock_held)
*
* 1. TwoPhaseFileHeader
* 2. TransactionId[] (subtransactions)
- * 3. RelFileNode[] (files to be deleted at commit)
- * 4. RelFileNode[] (files to be deleted at abort)
+ * 3. RelFileLocator[] (files to be deleted at commit)
+ * 4. RelFileLocator[] (files to be deleted at abort)
* 5. SharedInvalidationMessage[] (inval messages to be sent at commit)
* 6. TwoPhaseRecordOnDisk
* 7. ...
@@ -1047,8 +1047,8 @@ StartPrepare(GlobalTransaction gxact)
TransactionId xid = gxact->xid;
TwoPhaseFileHeader hdr;
TransactionId *children;
- RelFileNode *commitrels;
- RelFileNode *abortrels;
+ RelFileLocator *commitrels;
+ RelFileLocator *abortrels;
xl_xact_stats_item *abortstats = NULL;
xl_xact_stats_item *commitstats = NULL;
SharedInvalidationMessage *invalmsgs;
@@ -1102,12 +1102,12 @@ StartPrepare(GlobalTransaction gxact)
}
if (hdr.ncommitrels > 0)
{
- save_state_data(commitrels, hdr.ncommitrels * sizeof(RelFileNode));
+ save_state_data(commitrels, hdr.ncommitrels * sizeof(RelFileLocator));
pfree(commitrels);
}
if (hdr.nabortrels > 0)
{
- save_state_data(abortrels, hdr.nabortrels * sizeof(RelFileNode));
+ save_state_data(abortrels, hdr.nabortrels * sizeof(RelFileLocator));
pfree(abortrels);
}
if (hdr.ncommitstats > 0)
@@ -1489,9 +1489,9 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
TwoPhaseFileHeader *hdr;
TransactionId latestXid;
TransactionId *children;
- RelFileNode *commitrels;
- RelFileNode *abortrels;
- RelFileNode *delrels;
+ RelFileLocator *commitrels;
+ RelFileLocator *abortrels;
+ RelFileLocator *delrels;
int ndelrels;
xl_xact_stats_item *commitstats;
xl_xact_stats_item *abortstats;
@@ -1525,10 +1525,10 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
bufptr += MAXALIGN(hdr->gidlen);
children = (TransactionId *) bufptr;
bufptr += MAXALIGN(hdr->nsubxacts * sizeof(TransactionId));
- commitrels = (RelFileNode *) bufptr;
- bufptr += MAXALIGN(hdr->ncommitrels * sizeof(RelFileNode));
- abortrels = (RelFileNode *) bufptr;
- bufptr += MAXALIGN(hdr->nabortrels * sizeof(RelFileNode));
+ commitrels = (RelFileLocator *) bufptr;
+ bufptr += MAXALIGN(hdr->ncommitrels * sizeof(RelFileLocator));
+ abortrels = (RelFileLocator *) bufptr;
+ bufptr += MAXALIGN(hdr->nabortrels * sizeof(RelFileLocator));
commitstats = (xl_xact_stats_item *) bufptr;
bufptr += MAXALIGN(hdr->ncommitstats * sizeof(xl_xact_stats_item));
abortstats = (xl_xact_stats_item *) bufptr;
@@ -2100,8 +2100,8 @@ RecoverPreparedTransactions(void)
bufptr += MAXALIGN(hdr->gidlen);
subxids = (TransactionId *) bufptr;
bufptr += MAXALIGN(hdr->nsubxacts * sizeof(TransactionId));
- bufptr += MAXALIGN(hdr->ncommitrels * sizeof(RelFileNode));
- bufptr += MAXALIGN(hdr->nabortrels * sizeof(RelFileNode));
+ bufptr += MAXALIGN(hdr->ncommitrels * sizeof(RelFileLocator));
+ bufptr += MAXALIGN(hdr->nabortrels * sizeof(RelFileLocator));
bufptr += MAXALIGN(hdr->ncommitstats * sizeof(xl_xact_stats_item));
bufptr += MAXALIGN(hdr->nabortstats * sizeof(xl_xact_stats_item));
bufptr += MAXALIGN(hdr->ninvalmsgs * sizeof(SharedInvalidationMessage));
@@ -2285,7 +2285,7 @@ RecordTransactionCommitPrepared(TransactionId xid,
int nchildren,
TransactionId *children,
int nrels,
- RelFileNode *rels,
+ RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
int ninvalmsgs,
@@ -2383,7 +2383,7 @@ RecordTransactionAbortPrepared(TransactionId xid,
int nchildren,
TransactionId *children,
int nrels,
- RelFileNode *rels,
+ RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
const char *gid)
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 748120a..849a7ce 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -521,7 +521,7 @@ ForceTransactionIdLimitUpdate(void)
* wide, counter wraparound will occur eventually, and therefore it is unwise
* to assume they are unique unless precautions are taken to make them so.
* Hence, this routine should generally not be used directly. The only direct
- * callers should be GetNewOidWithIndex() and GetNewRelFileNode() in
+ * callers should be GetNewOidWithIndex() and GetNewRelFileNumber() in
* catalog/catalog.c.
*/
Oid
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index bd60b55..116de11 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -1282,7 +1282,7 @@ RecordTransactionCommit(void)
bool markXidCommitted = TransactionIdIsValid(xid);
TransactionId latestXid = InvalidTransactionId;
int nrels;
- RelFileNode *rels;
+ RelFileLocator *rels;
int nchildren;
TransactionId *children;
int ndroppedstats = 0;
@@ -1705,7 +1705,7 @@ RecordTransactionAbort(bool isSubXact)
TransactionId xid = GetCurrentTransactionIdIfAny();
TransactionId latestXid;
int nrels;
- RelFileNode *rels;
+ RelFileLocator *rels;
int ndroppedstats = 0;
xl_xact_stats_item *droppedstats = NULL;
int nchildren;
@@ -5586,7 +5586,7 @@ xactGetCommittedChildren(TransactionId **ptr)
XLogRecPtr
XactLogCommitRecord(TimestampTz commit_time,
int nsubxacts, TransactionId *subxacts,
- int nrels, RelFileNode *rels,
+ int nrels, RelFileLocator *rels,
int ndroppedstats, xl_xact_stats_item *droppedstats,
int nmsgs, SharedInvalidationMessage *msgs,
bool relcacheInval,
@@ -5597,7 +5597,7 @@ XactLogCommitRecord(TimestampTz commit_time,
xl_xact_xinfo xl_xinfo;
xl_xact_dbinfo xl_dbinfo;
xl_xact_subxacts xl_subxacts;
- xl_xact_relfilenodes xl_relfilenodes;
+ xl_xact_relfilelocators xl_relfilelocators;
xl_xact_stats_items xl_dropped_stats;
xl_xact_invals xl_invals;
xl_xact_twophase xl_twophase;
@@ -5651,8 +5651,8 @@ XactLogCommitRecord(TimestampTz commit_time,
if (nrels > 0)
{
- xl_xinfo.xinfo |= XACT_XINFO_HAS_RELFILENODES;
- xl_relfilenodes.nrels = nrels;
+ xl_xinfo.xinfo |= XACT_XINFO_HAS_RELFILELOCATORS;
+ xl_relfilelocators.nrels = nrels;
info |= XLR_SPECIAL_REL_UPDATE;
}
@@ -5710,12 +5710,12 @@ XactLogCommitRecord(TimestampTz commit_time,
nsubxacts * sizeof(TransactionId));
}
- if (xl_xinfo.xinfo & XACT_XINFO_HAS_RELFILENODES)
+ if (xl_xinfo.xinfo & XACT_XINFO_HAS_RELFILELOCATORS)
{
- XLogRegisterData((char *) (&xl_relfilenodes),
- MinSizeOfXactRelfilenodes);
+ XLogRegisterData((char *) (&xl_relfilelocators),
+ MinSizeOfXactRelfileLocators);
XLogRegisterData((char *) rels,
- nrels * sizeof(RelFileNode));
+ nrels * sizeof(RelFileLocator));
}
if (xl_xinfo.xinfo & XACT_XINFO_HAS_DROPPED_STATS)
@@ -5758,7 +5758,7 @@ XactLogCommitRecord(TimestampTz commit_time,
XLogRecPtr
XactLogAbortRecord(TimestampTz abort_time,
int nsubxacts, TransactionId *subxacts,
- int nrels, RelFileNode *rels,
+ int nrels, RelFileLocator *rels,
int ndroppedstats, xl_xact_stats_item *droppedstats,
int xactflags, TransactionId twophase_xid,
const char *twophase_gid)
@@ -5766,7 +5766,7 @@ XactLogAbortRecord(TimestampTz abort_time,
xl_xact_abort xlrec;
xl_xact_xinfo xl_xinfo;
xl_xact_subxacts xl_subxacts;
- xl_xact_relfilenodes xl_relfilenodes;
+ xl_xact_relfilelocators xl_relfilelocators;
xl_xact_stats_items xl_dropped_stats;
xl_xact_twophase xl_twophase;
xl_xact_dbinfo xl_dbinfo;
@@ -5800,8 +5800,8 @@ XactLogAbortRecord(TimestampTz abort_time,
if (nrels > 0)
{
- xl_xinfo.xinfo |= XACT_XINFO_HAS_RELFILENODES;
- xl_relfilenodes.nrels = nrels;
+ xl_xinfo.xinfo |= XACT_XINFO_HAS_RELFILELOCATORS;
+ xl_relfilelocators.nrels = nrels;
info |= XLR_SPECIAL_REL_UPDATE;
}
@@ -5864,12 +5864,12 @@ XactLogAbortRecord(TimestampTz abort_time,
nsubxacts * sizeof(TransactionId));
}
- if (xl_xinfo.xinfo & XACT_XINFO_HAS_RELFILENODES)
+ if (xl_xinfo.xinfo & XACT_XINFO_HAS_RELFILELOCATORS)
{
- XLogRegisterData((char *) (&xl_relfilenodes),
- MinSizeOfXactRelfilenodes);
+ XLogRegisterData((char *) (&xl_relfilelocators),
+ MinSizeOfXactRelfileLocators);
XLogRegisterData((char *) rels,
- nrels * sizeof(RelFileNode));
+ nrels * sizeof(RelFileLocator));
}
if (xl_xinfo.xinfo & XACT_XINFO_HAS_DROPPED_STATS)
@@ -6010,7 +6010,7 @@ xact_redo_commit(xl_xact_parsed_commit *parsed,
XLogFlush(lsn);
/* Make sure files supposed to be dropped are dropped */
- DropRelationFiles(parsed->xnodes, parsed->nrels, true);
+ DropRelationFiles(parsed->xlocators, parsed->nrels, true);
}
if (parsed->nstats > 0)
@@ -6121,7 +6121,7 @@ xact_redo_abort(xl_xact_parsed_abort *parsed, TransactionId xid,
*/
XLogFlush(lsn);
- DropRelationFiles(parsed->xnodes, parsed->nrels, true);
+ DropRelationFiles(parsed->xlocators, parsed->nrels, true);
}
if (parsed->nstats > 0)
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index 2ce9be2..ec27d36 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -70,7 +70,7 @@ typedef struct
{
bool in_use; /* is this slot in use? */
uint8 flags; /* REGBUF_* flags */
- RelFileNode rnode; /* identifies the relation and block */
+ RelFileLocator rlocator; /* identifies the relation and block */
ForkNumber forkno;
BlockNumber block;
Page page; /* page content */
@@ -257,7 +257,7 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
regbuf = &registered_buffers[block_id];
- BufferGetTag(buffer, &regbuf->rnode, &regbuf->forkno, &regbuf->block);
+ BufferGetTag(buffer, &regbuf->rlocator, &regbuf->forkno, &regbuf->block);
regbuf->page = BufferGetPage(buffer);
regbuf->flags = flags;
regbuf->rdata_tail = (XLogRecData *) &regbuf->rdata_head;
@@ -278,7 +278,7 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
if (i == block_id || !regbuf_old->in_use)
continue;
- Assert(!RelFileNodeEquals(regbuf_old->rnode, regbuf->rnode) ||
+ Assert(!RelFileLocatorEquals(regbuf_old->rlocator, regbuf->rlocator) ||
regbuf_old->forkno != regbuf->forkno ||
regbuf_old->block != regbuf->block);
}
@@ -293,7 +293,7 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
* shared buffer pool (i.e. when you don't have a Buffer for it).
*/
void
-XLogRegisterBlock(uint8 block_id, RelFileNode *rnode, ForkNumber forknum,
+XLogRegisterBlock(uint8 block_id, RelFileLocator *rlocator, ForkNumber forknum,
BlockNumber blknum, Page page, uint8 flags)
{
registered_buffer *regbuf;
@@ -308,7 +308,7 @@ XLogRegisterBlock(uint8 block_id, RelFileNode *rnode, ForkNumber forknum,
regbuf = &registered_buffers[block_id];
- regbuf->rnode = *rnode;
+ regbuf->rlocator = *rlocator;
regbuf->forkno = forknum;
regbuf->block = blknum;
regbuf->page = page;
@@ -331,7 +331,7 @@ XLogRegisterBlock(uint8 block_id, RelFileNode *rnode, ForkNumber forknum,
if (i == block_id || !regbuf_old->in_use)
continue;
- Assert(!RelFileNodeEquals(regbuf_old->rnode, regbuf->rnode) ||
+ Assert(!RelFileLocatorEquals(regbuf_old->rlocator, regbuf->rlocator) ||
regbuf_old->forkno != regbuf->forkno ||
regbuf_old->block != regbuf->block);
}
@@ -768,7 +768,7 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
rdt_datas_last = regbuf->rdata_tail;
}
- if (prev_regbuf && RelFileNodeEquals(regbuf->rnode, prev_regbuf->rnode))
+ if (prev_regbuf && RelFileLocatorEquals(regbuf->rlocator, prev_regbuf->rlocator))
{
samerel = true;
bkpb.fork_flags |= BKPBLOCK_SAME_REL;
@@ -793,8 +793,8 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
}
if (!samerel)
{
- memcpy(scratch, &regbuf->rnode, sizeof(RelFileNode));
- scratch += sizeof(RelFileNode);
+ memcpy(scratch, &regbuf->rlocator, sizeof(RelFileLocator));
+ scratch += sizeof(RelFileLocator);
}
memcpy(scratch, ®buf->block, sizeof(BlockNumber));
scratch += sizeof(BlockNumber);
@@ -1031,7 +1031,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
int flags = 0;
PGAlignedBlock copied_buffer;
char *origdata = (char *) BufferGetBlock(buffer);
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forkno;
BlockNumber blkno;
@@ -1058,8 +1058,8 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
if (buffer_std)
flags |= REGBUF_STANDARD;
- BufferGetTag(buffer, &rnode, &forkno, &blkno);
- XLogRegisterBlock(0, &rnode, forkno, blkno, copied_buffer.data, flags);
+ BufferGetTag(buffer, &rlocator, &forkno, &blkno);
+ XLogRegisterBlock(0, &rlocator, forkno, blkno, copied_buffer.data, flags);
recptr = XLogInsert(RM_XLOG_ID, XLOG_FPI_FOR_HINT);
}
@@ -1080,7 +1080,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
* the unused space to be left out from the WAL record, making it smaller.
*/
XLogRecPtr
-log_newpage(RelFileNode *rnode, ForkNumber forkNum, BlockNumber blkno,
+log_newpage(RelFileLocator *rlocator, ForkNumber forkNum, BlockNumber blkno,
Page page, bool page_std)
{
int flags;
@@ -1091,7 +1091,7 @@ log_newpage(RelFileNode *rnode, ForkNumber forkNum, BlockNumber blkno,
flags |= REGBUF_STANDARD;
XLogBeginInsert();
- XLogRegisterBlock(0, rnode, forkNum, blkno, page, flags);
+ XLogRegisterBlock(0, rlocator, forkNum, blkno, page, flags);
recptr = XLogInsert(RM_XLOG_ID, XLOG_FPI);
/*
@@ -1112,7 +1112,7 @@ log_newpage(RelFileNode *rnode, ForkNumber forkNum, BlockNumber blkno,
* because we can write multiple pages in a single WAL record.
*/
void
-log_newpages(RelFileNode *rnode, ForkNumber forkNum, int num_pages,
+log_newpages(RelFileLocator *rlocator, ForkNumber forkNum, int num_pages,
BlockNumber *blknos, Page *pages, bool page_std)
{
int flags;
@@ -1142,7 +1142,7 @@ log_newpages(RelFileNode *rnode, ForkNumber forkNum, int num_pages,
nbatch = 0;
while (nbatch < XLR_MAX_BLOCK_ID && i < num_pages)
{
- XLogRegisterBlock(nbatch, rnode, forkNum, blknos[i], pages[i], flags);
+ XLogRegisterBlock(nbatch, rlocator, forkNum, blknos[i], pages[i], flags);
i++;
nbatch++;
}
@@ -1177,16 +1177,16 @@ XLogRecPtr
log_newpage_buffer(Buffer buffer, bool page_std)
{
Page page = BufferGetPage(buffer);
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forkNum;
BlockNumber blkno;
/* Shared buffers should be modified in a critical section. */
Assert(CritSectionCount > 0);
- BufferGetTag(buffer, &rnode, &forkNum, &blkno);
+ BufferGetTag(buffer, &rlocator, &forkNum, &blkno);
- return log_newpage(&rnode, forkNum, blkno, page, page_std);
+ return log_newpage(&rlocator, forkNum, blkno, page, page_std);
}
/*
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index 959e409..c469610 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -138,7 +138,7 @@ struct XLogPrefetcher
dlist_head filter_queue;
/* Book-keeping to avoid repeat prefetches. */
- RelFileNode recent_rnode[XLOGPREFETCHER_SEQ_WINDOW_SIZE];
+ RelFileLocator recent_rlocator[XLOGPREFETCHER_SEQ_WINDOW_SIZE];
BlockNumber recent_block[XLOGPREFETCHER_SEQ_WINDOW_SIZE];
int recent_idx;
@@ -161,7 +161,7 @@ struct XLogPrefetcher
*/
typedef struct XLogPrefetcherFilter
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
XLogRecPtr filter_until_replayed;
BlockNumber filter_from_block;
dlist_node link;
@@ -187,11 +187,11 @@ typedef struct XLogPrefetchStats
} XLogPrefetchStats;
static inline void XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher,
- RelFileNode rnode,
+ RelFileLocator rlocator,
BlockNumber blockno,
XLogRecPtr lsn);
static inline bool XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher,
- RelFileNode rnode,
+ RelFileLocator rlocator,
BlockNumber blockno);
static inline void XLogPrefetcherCompleteFilters(XLogPrefetcher *prefetcher,
XLogRecPtr replaying_lsn);
@@ -365,7 +365,7 @@ XLogPrefetcherAllocate(XLogReaderState *reader)
{
XLogPrefetcher *prefetcher;
static HASHCTL hash_table_ctl = {
- .keysize = sizeof(RelFileNode),
+ .keysize = sizeof(RelFileLocator),
.entrysize = sizeof(XLogPrefetcherFilter)
};
@@ -568,22 +568,23 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
xl_dbase_create_file_copy_rec *xlrec =
(xl_dbase_create_file_copy_rec *) record->main_data;
- RelFileNode rnode = {InvalidOid, xlrec->db_id, InvalidOid};
+ RelFileLocator rlocator =
+ {InvalidOid, xlrec->db_id, InvalidRelFileNumber};
/*
* Don't try to prefetch anything in this database until
* it has been created, or we might confuse the blocks of
- * different generations, if a database OID or relfilenode
- * is reused. It's also more efficient than discovering
- * that relations don't exist on disk yet with ENOENT
- * errors.
+ * different generations, if a database OID or
+ * relfilenumber is reused. It's also more efficient than
+ * discovering that relations don't exist on disk yet with
+ * ENOENT errors.
*/
- XLogPrefetcherAddFilter(prefetcher, rnode, 0, record->lsn);
+ XLogPrefetcherAddFilter(prefetcher, rlocator, 0, record->lsn);
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing prefetch in database %u until %X/%X is replayed due to raw file copy",
- rnode.dbNode,
+ rlocator.dbOid,
LSN_FORMAT_ARGS(record->lsn));
#endif
}
@@ -601,19 +602,19 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* Don't prefetch anything for this whole relation
* until it has been created. Otherwise we might
* confuse the blocks of different generations, if a
- * relfilenode is reused. This also avoids the need
+ * relfilenumber is reused. This also avoids the need
* to discover the problem via extra syscalls that
* report ENOENT.
*/
- XLogPrefetcherAddFilter(prefetcher, xlrec->rnode, 0,
+ XLogPrefetcherAddFilter(prefetcher, xlrec->rlocator, 0,
record->lsn);
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing prefetch in relation %u/%u/%u until %X/%X is replayed, which creates the relation",
- xlrec->rnode.spcNode,
- xlrec->rnode.dbNode,
- xlrec->rnode.relNode,
+ xlrec->rlocator.spcOid,
+ xlrec->rlocator.dbOid,
+ xlrec->rlocator.relNumber,
LSN_FORMAT_ARGS(record->lsn));
#endif
}
@@ -627,16 +628,16 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* Don't consider prefetching anything in the truncated
* range until the truncation has been performed.
*/
- XLogPrefetcherAddFilter(prefetcher, xlrec->rnode,
+ XLogPrefetcherAddFilter(prefetcher, xlrec->rlocator,
xlrec->blkno,
record->lsn);
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, which truncates the relation",
- xlrec->rnode.spcNode,
- xlrec->rnode.dbNode,
- xlrec->rnode.relNode,
+ xlrec->rlocator.spcOid,
+ xlrec->rlocator.dbOid,
+ xlrec->rlocator.relNumber,
xlrec->blkno,
LSN_FORMAT_ARGS(record->lsn));
#endif
@@ -688,7 +689,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
}
/* Should we skip prefetching this block due to a filter? */
- if (XLogPrefetcherIsFiltered(prefetcher, block->rnode, block->blkno))
+ if (XLogPrefetcherIsFiltered(prefetcher, block->rlocator, block->blkno))
{
XLogPrefetchIncrement(&SharedStats->skip_new);
return LRQ_NEXT_NO_IO;
@@ -698,7 +699,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
for (int i = 0; i < XLOGPREFETCHER_SEQ_WINDOW_SIZE; ++i)
{
if (block->blkno == prefetcher->recent_block[i] &&
- RelFileNodeEquals(block->rnode, prefetcher->recent_rnode[i]))
+ RelFileLocatorEquals(block->rlocator, prefetcher->recent_rlocator[i]))
{
/*
* XXX If we also remembered where it was, we could set
@@ -709,7 +710,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
return LRQ_NEXT_NO_IO;
}
}
- prefetcher->recent_rnode[prefetcher->recent_idx] = block->rnode;
+ prefetcher->recent_rlocator[prefetcher->recent_idx] = block->rlocator;
prefetcher->recent_block[prefetcher->recent_idx] = block->blkno;
prefetcher->recent_idx =
(prefetcher->recent_idx + 1) % XLOGPREFETCHER_SEQ_WINDOW_SIZE;
@@ -719,7 +720,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* same relation (with some scheme to handle invalidations
* safely), but for now we'll call smgropen() every time.
*/
- reln = smgropen(block->rnode, InvalidBackendId);
+ reln = smgropen(block->rlocator, InvalidBackendId);
/*
* If the relation file doesn't exist on disk, for example because
@@ -733,12 +734,12 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing all prefetch in relation %u/%u/%u until %X/%X is replayed, because the relation does not exist on disk",
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
LSN_FORMAT_ARGS(record->lsn));
#endif
- XLogPrefetcherAddFilter(prefetcher, block->rnode, 0,
+ XLogPrefetcherAddFilter(prefetcher, block->rlocator, 0,
record->lsn);
XLogPrefetchIncrement(&SharedStats->skip_new);
return LRQ_NEXT_NO_IO;
@@ -754,13 +755,13 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, because the relation is too small",
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
block->blkno,
LSN_FORMAT_ARGS(record->lsn));
#endif
- XLogPrefetcherAddFilter(prefetcher, block->rnode, block->blkno,
+ XLogPrefetcherAddFilter(prefetcher, block->rlocator, block->blkno,
record->lsn);
XLogPrefetchIncrement(&SharedStats->skip_new);
return LRQ_NEXT_NO_IO;
@@ -793,9 +794,9 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
*/
elog(ERROR,
"could not prefetch relation %u/%u/%u block %u",
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
block->blkno);
}
}
@@ -852,17 +853,17 @@ pg_stat_get_recovery_prefetch(PG_FUNCTION_ARGS)
}
/*
- * Don't prefetch any blocks >= 'blockno' from a given 'rnode', until 'lsn'
+ * Don't prefetch any blocks >= 'blockno' from a given 'rlocator', until 'lsn'
* has been replayed.
*/
static inline void
-XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher, RelFileNode rnode,
+XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
BlockNumber blockno, XLogRecPtr lsn)
{
XLogPrefetcherFilter *filter;
bool found;
- filter = hash_search(prefetcher->filter_table, &rnode, HASH_ENTER, &found);
+ filter = hash_search(prefetcher->filter_table, &rlocator, HASH_ENTER, &found);
if (!found)
{
/*
@@ -875,7 +876,7 @@ XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher, RelFileNode rnode,
else
{
/*
- * We were already filtering this rnode. Extend the filter's lifetime
+ * We were already filtering this rlocator. Extend the filter's lifetime
* to cover this WAL record, but leave the lower of the block numbers
* there because we don't want to have to track individual blocks.
*/
@@ -890,7 +891,7 @@ XLogPrefetcherAddFilter(XLogPrefetcher *prefetcher, RelFileNode rnode,
* Have we replayed any records that caused us to begin filtering a block
* range? That means that relations should have been created, extended or
* dropped as required, so we can stop filtering out accesses to a given
- * relfilenode.
+ * relfilenumber.
*/
static inline void
XLogPrefetcherCompleteFilters(XLogPrefetcher *prefetcher, XLogRecPtr replaying_lsn)
@@ -913,7 +914,7 @@ XLogPrefetcherCompleteFilters(XLogPrefetcher *prefetcher, XLogRecPtr replaying_l
* Check if a given block should be skipped due to a filter.
*/
static inline bool
-XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileNode rnode,
+XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
BlockNumber blockno)
{
/*
@@ -925,13 +926,13 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileNode rnode,
XLogPrefetcherFilter *filter;
/* See if the block range is filtered. */
- filter = hash_search(prefetcher->filter_table, &rnode, HASH_FIND, NULL);
+ filter = hash_search(prefetcher->filter_table, &rlocator, HASH_FIND, NULL);
if (filter && filter->filter_from_block <= blockno)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
- rnode.spcNode, rnode.dbNode, rnode.relNode, blockno,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed),
filter->filter_from_block);
#endif
@@ -939,15 +940,15 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileNode rnode,
}
/* See if the whole database is filtered. */
- rnode.relNode = InvalidOid;
- rnode.spcNode = InvalidOid;
- filter = hash_search(prefetcher->filter_table, &rnode, HASH_FIND, NULL);
+ rlocator.relNumber = InvalidRelFileNumber;
+ rlocator.spcOid = InvalidOid;
+ filter = hash_search(prefetcher->filter_table, &rlocator, HASH_FIND, NULL);
if (filter)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
"prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
- rnode.spcNode, rnode.dbNode, rnode.relNode, blockno,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed));
#endif
return true;
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index cf5db23..f3dc4b7 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -1638,7 +1638,7 @@ DecodeXLogRecord(XLogReaderState *state,
char *out;
uint32 remaining;
uint32 datatotal;
- RelFileNode *rnode = NULL;
+ RelFileLocator *rlocator = NULL;
uint8 block_id;
decoded->header = *record;
@@ -1823,12 +1823,12 @@ DecodeXLogRecord(XLogReaderState *state,
}
if (!(fork_flags & BKPBLOCK_SAME_REL))
{
- COPY_HEADER_FIELD(&blk->rnode, sizeof(RelFileNode));
- rnode = &blk->rnode;
+ COPY_HEADER_FIELD(&blk->rlocator, sizeof(RelFileLocator));
+ rlocator = &blk->rlocator;
}
else
{
- if (rnode == NULL)
+ if (rlocator == NULL)
{
report_invalid_record(state,
"BKPBLOCK_SAME_REL set but no previous rel at %X/%X",
@@ -1836,7 +1836,7 @@ DecodeXLogRecord(XLogReaderState *state,
goto err;
}
- blk->rnode = *rnode;
+ blk->rlocator = *rlocator;
}
COPY_HEADER_FIELD(&blk->blkno, sizeof(BlockNumber));
}
@@ -1926,10 +1926,11 @@ err:
*/
void
XLogRecGetBlockTag(XLogReaderState *record, uint8 block_id,
- RelFileNode *rnode, ForkNumber *forknum, BlockNumber *blknum)
+ RelFileLocator *rlocator, ForkNumber *forknum,
+ BlockNumber *blknum)
{
- if (!XLogRecGetBlockTagExtended(record, block_id, rnode, forknum, blknum,
- NULL))
+ if (!XLogRecGetBlockTagExtended(record, block_id, rlocator, forknum,
+ blknum, NULL))
{
#ifndef FRONTEND
elog(ERROR, "failed to locate backup block with ID %d in WAL record",
@@ -1945,13 +1946,13 @@ XLogRecGetBlockTag(XLogReaderState *record, uint8 block_id,
* Returns information about the block that a block reference refers to,
* optionally including the buffer that the block may already be in.
*
- * If the WAL record contains a block reference with the given ID, *rnode,
+ * If the WAL record contains a block reference with the given ID, *rlocator,
* *forknum, *blknum and *prefetch_buffer are filled in (if not NULL), and
* returns true. Otherwise returns false.
*/
bool
XLogRecGetBlockTagExtended(XLogReaderState *record, uint8 block_id,
- RelFileNode *rnode, ForkNumber *forknum,
+ RelFileLocator *rlocator, ForkNumber *forknum,
BlockNumber *blknum,
Buffer *prefetch_buffer)
{
@@ -1961,8 +1962,8 @@ XLogRecGetBlockTagExtended(XLogReaderState *record, uint8 block_id,
return false;
bkpb = &record->record->blocks[block_id];
- if (rnode)
- *rnode = bkpb->rnode;
+ if (rlocator)
+ *rlocator = bkpb->rlocator;
if (forknum)
*forknum = bkpb->forknum;
if (blknum)
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index e23451b..5d6f1b5 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2166,24 +2166,26 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
/* decode block references */
for (block_id = 0; block_id <= XLogRecMaxBlockId(record); block_id++)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blk;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blk, NULL))
+ &rlocator, &forknum, &blk, NULL))
continue;
if (forknum != MAIN_FORKNUM)
appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, fork %u, blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid,
+ rlocator.relNumber,
forknum,
blk);
else
appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, blk %u",
block_id,
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid,
+ rlocator.relNumber,
blk);
if (XLogRecHasBlockImage(record, block_id))
appendStringInfoString(buf, " FPW");
@@ -2285,7 +2287,7 @@ static void
verifyBackupPageConsistency(XLogReaderState *record)
{
RmgrData rmgr = GetRmgr(XLogRecGetRmid(record));
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blkno;
int block_id;
@@ -2302,7 +2304,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
Page page;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blkno, NULL))
+ &rlocator, &forknum, &blkno, NULL))
{
/*
* WAL record doesn't contain a block reference with the given id.
@@ -2327,7 +2329,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
* Read the contents from the current buffer and store it in a
* temporary page.
*/
- buf = XLogReadBufferExtended(rnode, forknum, blkno,
+ buf = XLogReadBufferExtended(rlocator, forknum, blkno,
RBM_NORMAL_NO_LOG,
InvalidBuffer);
if (!BufferIsValid(buf))
@@ -2377,7 +2379,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
{
elog(FATAL,
"inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
- rnode.spcNode, rnode.dbNode, rnode.relNode,
+ rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forknum, blkno);
}
}
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 4851669..0cda225 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -67,7 +67,7 @@ HotStandbyState standbyState = STANDBY_DISABLED;
*/
typedef struct xl_invalid_page_key
{
- RelFileNode node; /* the relation */
+ RelFileLocator locator; /* the relation */
ForkNumber forkno; /* the fork number */
BlockNumber blkno; /* the page */
} xl_invalid_page_key;
@@ -86,10 +86,10 @@ static int read_local_xlog_page_guts(XLogReaderState *state, XLogRecPtr targetPa
/* Report a reference to an invalid page */
static void
-report_invalid_page(int elevel, RelFileNode node, ForkNumber forkno,
+report_invalid_page(int elevel, RelFileLocator locator, ForkNumber forkno,
BlockNumber blkno, bool present)
{
- char *path = relpathperm(node, forkno);
+ char *path = relpathperm(locator, forkno);
if (present)
elog(elevel, "page %u of relation %s is uninitialized",
@@ -102,7 +102,7 @@ report_invalid_page(int elevel, RelFileNode node, ForkNumber forkno,
/* Log a reference to an invalid page */
static void
-log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
+log_invalid_page(RelFileLocator locator, ForkNumber forkno, BlockNumber blkno,
bool present)
{
xl_invalid_page_key key;
@@ -119,7 +119,7 @@ log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
*/
if (reachedConsistency)
{
- report_invalid_page(WARNING, node, forkno, blkno, present);
+ report_invalid_page(WARNING, locator, forkno, blkno, present);
elog(ignore_invalid_pages ? WARNING : PANIC,
"WAL contains references to invalid pages");
}
@@ -130,7 +130,7 @@ log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
* something about the XLOG record that generated the reference).
*/
if (message_level_is_interesting(DEBUG1))
- report_invalid_page(DEBUG1, node, forkno, blkno, present);
+ report_invalid_page(DEBUG1, locator, forkno, blkno, present);
if (invalid_page_tab == NULL)
{
@@ -147,7 +147,7 @@ log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
}
/* we currently assume xl_invalid_page_key contains no padding */
- key.node = node;
+ key.locator = locator;
key.forkno = forkno;
key.blkno = blkno;
hentry = (xl_invalid_page *)
@@ -166,7 +166,8 @@ log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
/* Forget any invalid pages >= minblkno, because they've been dropped */
static void
-forget_invalid_pages(RelFileNode node, ForkNumber forkno, BlockNumber minblkno)
+forget_invalid_pages(RelFileLocator locator, ForkNumber forkno,
+ BlockNumber minblkno)
{
HASH_SEQ_STATUS status;
xl_invalid_page *hentry;
@@ -178,13 +179,13 @@ forget_invalid_pages(RelFileNode node, ForkNumber forkno, BlockNumber minblkno)
while ((hentry = (xl_invalid_page *) hash_seq_search(&status)) != NULL)
{
- if (RelFileNodeEquals(hentry->key.node, node) &&
+ if (RelFileLocatorEquals(hentry->key.locator, locator) &&
hentry->key.forkno == forkno &&
hentry->key.blkno >= minblkno)
{
if (message_level_is_interesting(DEBUG2))
{
- char *path = relpathperm(hentry->key.node, forkno);
+ char *path = relpathperm(hentry->key.locator, forkno);
elog(DEBUG2, "page %u of relation %s has been dropped",
hentry->key.blkno, path);
@@ -213,11 +214,11 @@ forget_invalid_pages_db(Oid dbid)
while ((hentry = (xl_invalid_page *) hash_seq_search(&status)) != NULL)
{
- if (hentry->key.node.dbNode == dbid)
+ if (hentry->key.locator.dbOid == dbid)
{
if (message_level_is_interesting(DEBUG2))
{
- char *path = relpathperm(hentry->key.node, hentry->key.forkno);
+ char *path = relpathperm(hentry->key.locator, hentry->key.forkno);
elog(DEBUG2, "page %u of relation %s has been dropped",
hentry->key.blkno, path);
@@ -261,7 +262,7 @@ XLogCheckInvalidPages(void)
*/
while ((hentry = (xl_invalid_page *) hash_seq_search(&status)) != NULL)
{
- report_invalid_page(WARNING, hentry->key.node, hentry->key.forkno,
+ report_invalid_page(WARNING, hentry->key.locator, hentry->key.forkno,
hentry->key.blkno, hentry->present);
foundone = true;
}
@@ -356,7 +357,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
Buffer *buf)
{
XLogRecPtr lsn = record->EndRecPtr;
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blkno;
Buffer prefetch_buffer;
@@ -364,7 +365,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
bool zeromode;
bool willinit;
- if (!XLogRecGetBlockTagExtended(record, block_id, &rnode, &forknum, &blkno,
+ if (!XLogRecGetBlockTagExtended(record, block_id, &rlocator, &forknum, &blkno,
&prefetch_buffer))
{
/* Caller specified a bogus block_id */
@@ -387,7 +388,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
if (XLogRecBlockImageApply(record, block_id))
{
Assert(XLogRecHasBlockImage(record, block_id));
- *buf = XLogReadBufferExtended(rnode, forknum, blkno,
+ *buf = XLogReadBufferExtended(rlocator, forknum, blkno,
get_cleanup_lock ? RBM_ZERO_AND_CLEANUP_LOCK : RBM_ZERO_AND_LOCK,
prefetch_buffer);
page = BufferGetPage(*buf);
@@ -418,7 +419,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
}
else
{
- *buf = XLogReadBufferExtended(rnode, forknum, blkno, mode, prefetch_buffer);
+ *buf = XLogReadBufferExtended(rlocator, forknum, blkno, mode, prefetch_buffer);
if (BufferIsValid(*buf))
{
if (mode != RBM_ZERO_AND_LOCK && mode != RBM_ZERO_AND_CLEANUP_LOCK)
@@ -468,7 +469,7 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
* they will be invisible to tools that need to know which pages are modified.
*/
Buffer
-XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
+XLogReadBufferExtended(RelFileLocator rlocator, ForkNumber forknum,
BlockNumber blkno, ReadBufferMode mode,
Buffer recent_buffer)
{
@@ -481,14 +482,14 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
/* Do we have a clue where the buffer might be already? */
if (BufferIsValid(recent_buffer) &&
mode == RBM_NORMAL &&
- ReadRecentBuffer(rnode, forknum, blkno, recent_buffer))
+ ReadRecentBuffer(rlocator, forknum, blkno, recent_buffer))
{
buffer = recent_buffer;
goto recent_buffer_fast_path;
}
/* Open the relation at smgr level */
- smgr = smgropen(rnode, InvalidBackendId);
+ smgr = smgropen(rlocator, InvalidBackendId);
/*
* Create the target file if it doesn't already exist. This lets us cope
@@ -505,7 +506,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
if (blkno < lastblock)
{
/* page exists in file */
- buffer = ReadBufferWithoutRelcache(rnode, forknum, blkno,
+ buffer = ReadBufferWithoutRelcache(rlocator, forknum, blkno,
mode, NULL, true);
}
else
@@ -513,7 +514,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
/* hm, page doesn't exist in file */
if (mode == RBM_NORMAL)
{
- log_invalid_page(rnode, forknum, blkno, false);
+ log_invalid_page(rlocator, forknum, blkno, false);
return InvalidBuffer;
}
if (mode == RBM_NORMAL_NO_LOG)
@@ -530,7 +531,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
ReleaseBuffer(buffer);
}
- buffer = ReadBufferWithoutRelcache(rnode, forknum,
+ buffer = ReadBufferWithoutRelcache(rlocator, forknum,
P_NEW, mode, NULL, true);
}
while (BufferGetBlockNumber(buffer) < blkno);
@@ -540,7 +541,7 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
if (mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK)
LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
ReleaseBuffer(buffer);
- buffer = ReadBufferWithoutRelcache(rnode, forknum, blkno,
+ buffer = ReadBufferWithoutRelcache(rlocator, forknum, blkno,
mode, NULL, true);
}
}
@@ -559,7 +560,7 @@ recent_buffer_fast_path:
if (PageIsNew(page))
{
ReleaseBuffer(buffer);
- log_invalid_page(rnode, forknum, blkno, true);
+ log_invalid_page(rlocator, forknum, blkno, true);
return InvalidBuffer;
}
}
@@ -594,7 +595,7 @@ typedef FakeRelCacheEntryData *FakeRelCacheEntry;
* Caller must free the returned entry with FreeFakeRelcacheEntry().
*/
Relation
-CreateFakeRelcacheEntry(RelFileNode rnode)
+CreateFakeRelcacheEntry(RelFileLocator rlocator)
{
FakeRelCacheEntry fakeentry;
Relation rel;
@@ -604,7 +605,7 @@ CreateFakeRelcacheEntry(RelFileNode rnode)
rel = (Relation) fakeentry;
rel->rd_rel = &fakeentry->pgc;
- rel->rd_node = rnode;
+ rel->rd_locator = rlocator;
/*
* We will never be working with temp rels during recovery or while
@@ -615,18 +616,18 @@ CreateFakeRelcacheEntry(RelFileNode rnode)
/* It must be a permanent table here */
rel->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
- /* We don't know the name of the relation; use relfilenode instead */
- sprintf(RelationGetRelationName(rel), "%u", rnode.relNode);
+ /* We don't know the name of the relation; use relfilenumber instead */
+ sprintf(RelationGetRelationName(rel), "%u", rlocator.relNumber);
/*
* We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNode may be
+ * relation. Note that this is fairly bogus since relNumber may be
* different from the relation's OID. It shouldn't really matter though.
* In recovery, we are running by ourselves and can't have any lock
* conflicts. While syncing, we already hold AccessExclusiveLock.
*/
- rel->rd_lockInfo.lockRelId.dbId = rnode.dbNode;
- rel->rd_lockInfo.lockRelId.relId = rnode.relNode;
+ rel->rd_lockInfo.lockRelId.dbId = rlocator.dbOid;
+ rel->rd_lockInfo.lockRelId.relId = rlocator.relNumber;
rel->rd_smgr = NULL;
@@ -652,9 +653,9 @@ FreeFakeRelcacheEntry(Relation fakerel)
* any open "invalid-page" records for the relation.
*/
void
-XLogDropRelation(RelFileNode rnode, ForkNumber forknum)
+XLogDropRelation(RelFileLocator rlocator, ForkNumber forknum)
{
- forget_invalid_pages(rnode, forknum, 0);
+ forget_invalid_pages(rlocator, forknum, 0);
}
/*
@@ -682,10 +683,10 @@ XLogDropDatabase(Oid dbid)
* We need to clean up any open "invalid-page" records for the dropped pages.
*/
void
-XLogTruncateRelation(RelFileNode rnode, ForkNumber forkNum,
+XLogTruncateRelation(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber nblocks)
{
- forget_invalid_pages(rnode, forkNum, nblocks);
+ forget_invalid_pages(rlocator, forkNum, nblocks);
}
/*
diff --git a/src/backend/bootstrap/bootparse.y b/src/backend/bootstrap/bootparse.y
index e5cf1b3..7d7655d 100644
--- a/src/backend/bootstrap/bootparse.y
+++ b/src/backend/bootstrap/bootparse.y
@@ -287,9 +287,9 @@ Boot_DeclareIndexStmt:
stmt->excludeOpNames = NIL;
stmt->idxcomment = NULL;
stmt->indexOid = InvalidOid;
- stmt->oldNode = InvalidOid;
+ stmt->oldNumber = InvalidRelFileNumber;
stmt->oldCreateSubid = InvalidSubTransactionId;
- stmt->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ stmt->oldFirstRelfilelocatorSubid = InvalidSubTransactionId;
stmt->unique = false;
stmt->primary = false;
stmt->isconstraint = false;
@@ -339,9 +339,9 @@ Boot_DeclareUniqueIndexStmt:
stmt->excludeOpNames = NIL;
stmt->idxcomment = NULL;
stmt->indexOid = InvalidOid;
- stmt->oldNode = InvalidOid;
+ stmt->oldNumber = InvalidRelFileNumber;
stmt->oldCreateSubid = InvalidSubTransactionId;
- stmt->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ stmt->oldFirstRelfilelocatorSubid = InvalidSubTransactionId;
stmt->unique = true;
stmt->primary = false;
stmt->isconstraint = false;
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index e784538..2a33273 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -481,14 +481,14 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
}
/*
- * GetNewRelFileNode
- * Generate a new relfilenode number that is unique within the
+ * GetNewRelFileNumber
+ * Generate a new relfilenumber that is unique within the
* database of the given tablespace.
*
- * If the relfilenode will also be used as the relation's OID, pass the
+ * If the relfilenumber will also be used as the relation's OID, pass the
* opened pg_class catalog, and this routine will guarantee that the result
* is also an unused OID within pg_class. If the result is to be used only
- * as a relfilenode for an existing relation, pass NULL for pg_class.
+ * as a relfilenumber for an existing relation, pass NULL for pg_class.
*
* As with GetNewOidWithIndex(), there is some theoretical risk of a race
* condition, but it doesn't seem worth worrying about.
@@ -496,17 +496,17 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
* Note: we don't support using this in bootstrap mode. All relations
* created by bootstrap have preassigned OIDs, so there's no need.
*/
-Oid
-GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
+RelFileNumber
+GetNewRelFileNumber(Oid reltablespace, Relation pg_class, char relpersistence)
{
- RelFileNodeBackend rnode;
+ RelFileLocatorBackend rlocator;
char *rpath;
bool collides;
BackendId backend;
/*
* If we ever get here during pg_upgrade, there's something wrong; all
- * relfilenode assignments during a binary-upgrade run should be
+ * relfilenumber assignments during a binary-upgrade run should be
* determined by commands in the dump script.
*/
Assert(!IsBinaryUpgrade);
@@ -526,15 +526,15 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
}
/* This logic should match RelationInitPhysicalAddr */
- rnode.node.spcNode = reltablespace ? reltablespace : MyDatabaseTableSpace;
- rnode.node.dbNode = (rnode.node.spcNode == GLOBALTABLESPACE_OID) ? InvalidOid : MyDatabaseId;
+ rlocator.locator.spcOid = reltablespace ? reltablespace : MyDatabaseTableSpace;
+ rlocator.locator.dbOid = (rlocator.locator.spcOid == GLOBALTABLESPACE_OID) ? InvalidOid : MyDatabaseId;
/*
* The relpath will vary based on the backend ID, so we must initialize
* that properly here to make sure that any collisions based on filename
* are properly detected.
*/
- rnode.backend = backend;
+ rlocator.backend = backend;
do
{
@@ -542,13 +542,13 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
/* Generate the OID */
if (pg_class)
- rnode.node.relNode = GetNewOidWithIndex(pg_class, ClassOidIndexId,
+ rlocator.locator.relNumber = GetNewOidWithIndex(pg_class, ClassOidIndexId,
Anum_pg_class_oid);
else
- rnode.node.relNode = GetNewObjectId();
+ rlocator.locator.relNumber = GetNewObjectId();
/* Check for existing file of same name */
- rpath = relpath(rnode, MAIN_FORKNUM);
+ rpath = relpath(rlocator, MAIN_FORKNUM);
if (access(rpath, F_OK) == 0)
{
@@ -570,7 +570,7 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
pfree(rpath);
} while (collides);
- return rnode.node.relNode;
+ return rlocator.locator.relNumber;
}
/*
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 1803194..c69c923 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -77,9 +77,11 @@
/* Potentially set by pg_upgrade_support functions */
Oid binary_upgrade_next_heap_pg_class_oid = InvalidOid;
-Oid binary_upgrade_next_heap_pg_class_relfilenode = InvalidOid;
Oid binary_upgrade_next_toast_pg_class_oid = InvalidOid;
-Oid binary_upgrade_next_toast_pg_class_relfilenode = InvalidOid;
+RelFileNumber binary_upgrade_next_heap_pg_class_relfilenumber =
+ InvalidRelFileNumber;
+RelFileNumber binary_upgrade_next_toast_pg_class_relfilenumber =
+ InvalidRelFileNumber;
static void AddNewRelationTuple(Relation pg_class_desc,
Relation new_rel_desc,
@@ -273,7 +275,7 @@ SystemAttributeByName(const char *attname)
* heap_create - Create an uncataloged heap relation
*
* Note API change: the caller must now always provide the OID
- * to use for the relation. The relfilenode may be (and in
+ * to use for the relation. The relfilenumber may be (and in
* the simplest cases is) left unspecified.
*
* create_storage indicates whether or not to create the storage.
@@ -289,7 +291,7 @@ heap_create(const char *relname,
Oid relnamespace,
Oid reltablespace,
Oid relid,
- Oid relfilenode,
+ RelFileNumber relfilenumber,
Oid accessmtd,
TupleDesc tupDesc,
char relkind,
@@ -341,11 +343,11 @@ heap_create(const char *relname,
else
{
/*
- * If relfilenode is unspecified by the caller then create storage
+ * If relfilenumber is unspecified by the caller then create storage
* with oid same as relid.
*/
- if (!OidIsValid(relfilenode))
- relfilenode = relid;
+ if (!RelFileNumberIsValid(relfilenumber))
+ relfilenumber = relid;
}
/*
@@ -368,7 +370,7 @@ heap_create(const char *relname,
tupDesc,
relid,
accessmtd,
- relfilenode,
+ relfilenumber,
reltablespace,
shared_relation,
mapped_relation,
@@ -385,11 +387,11 @@ heap_create(const char *relname,
if (create_storage)
{
if (RELKIND_HAS_TABLE_AM(rel->rd_rel->relkind))
- table_relation_set_new_filenode(rel, &rel->rd_node,
- relpersistence,
- relfrozenxid, relminmxid);
+ table_relation_set_new_filelocator(rel, &rel->rd_locator,
+ relpersistence,
+ relfrozenxid, relminmxid);
else if (RELKIND_HAS_STORAGE(rel->rd_rel->relkind))
- RelationCreateStorage(rel->rd_node, relpersistence, true);
+ RelationCreateStorage(rel->rd_locator, relpersistence, true);
else
Assert(false);
}
@@ -1069,7 +1071,7 @@ AddNewRelationType(const char *typeName,
* relkind: relkind for new rel
* relpersistence: rel's persistence status (permanent, temp, or unlogged)
* shared_relation: true if it's to be a shared relation
- * mapped_relation: true if the relation will use the relfilenode map
+ * mapped_relation: true if the relation will use the relfilenumber map
* oncommit: ON COMMIT marking (only relevant if it's a temp table)
* reloptions: reloptions in Datum form, or (Datum) 0 if none
* use_user_acl: true if should look for user-defined default permissions;
@@ -1115,7 +1117,7 @@ heap_create_with_catalog(const char *relname,
Oid new_type_oid;
/* By default set to InvalidOid unless overridden by binary-upgrade */
- Oid relfilenode = InvalidOid;
+ RelFileNumber relfilenumber = InvalidRelFileNumber;
TransactionId relfrozenxid;
MultiXactId relminmxid;
@@ -1173,12 +1175,12 @@ heap_create_with_catalog(const char *relname,
/*
* Allocate an OID for the relation, unless we were told what to use.
*
- * The OID will be the relfilenode as well, so make sure it doesn't
+ * The OID will be the relfilenumber as well, so make sure it doesn't
* collide with either pg_class OIDs or existing physical files.
*/
if (!OidIsValid(relid))
{
- /* Use binary-upgrade override for pg_class.oid and relfilenode */
+ /* Use binary-upgrade override for pg_class.oid and relfilenumber */
if (IsBinaryUpgrade)
{
/*
@@ -1196,13 +1198,13 @@ heap_create_with_catalog(const char *relname,
relid = binary_upgrade_next_toast_pg_class_oid;
binary_upgrade_next_toast_pg_class_oid = InvalidOid;
- if (!OidIsValid(binary_upgrade_next_toast_pg_class_relfilenode))
+ if (!RelFileNumberIsValid(binary_upgrade_next_toast_pg_class_relfilenumber))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("toast relfilenode value not set when in binary upgrade mode")));
+ errmsg("toast relfilenumber value not set when in binary upgrade mode")));
- relfilenode = binary_upgrade_next_toast_pg_class_relfilenode;
- binary_upgrade_next_toast_pg_class_relfilenode = InvalidOid;
+ relfilenumber = binary_upgrade_next_toast_pg_class_relfilenumber;
+ binary_upgrade_next_toast_pg_class_relfilenumber = InvalidRelFileNumber;
}
}
else
@@ -1217,20 +1219,20 @@ heap_create_with_catalog(const char *relname,
if (RELKIND_HAS_STORAGE(relkind))
{
- if (!OidIsValid(binary_upgrade_next_heap_pg_class_relfilenode))
+ if (!RelFileNumberIsValid(binary_upgrade_next_heap_pg_class_relfilenumber))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("relfilenode value not set when in binary upgrade mode")));
+ errmsg("relfilenumber value not set when in binary upgrade mode")));
- relfilenode = binary_upgrade_next_heap_pg_class_relfilenode;
- binary_upgrade_next_heap_pg_class_relfilenode = InvalidOid;
+ relfilenumber = binary_upgrade_next_heap_pg_class_relfilenumber;
+ binary_upgrade_next_heap_pg_class_relfilenumber = InvalidRelFileNumber;
}
}
}
if (!OidIsValid(relid))
- relid = GetNewRelFileNode(reltablespace, pg_class_desc,
- relpersistence);
+ relid = GetNewRelFileNumber(reltablespace, pg_class_desc,
+ relpersistence);
}
/*
@@ -1273,7 +1275,7 @@ heap_create_with_catalog(const char *relname,
relnamespace,
reltablespace,
relid,
- relfilenode,
+ relfilenumber,
accessmtd,
tupdesc,
relkind,
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index bdd3c34..3dc535e 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -87,7 +87,8 @@
/* Potentially set by pg_upgrade_support functions */
Oid binary_upgrade_next_index_pg_class_oid = InvalidOid;
-Oid binary_upgrade_next_index_pg_class_relfilenode = InvalidOid;
+RelFileNumber binary_upgrade_next_index_pg_class_relfilenumber =
+ InvalidRelFileNumber;
/*
* Pointer-free representation of variables used when reindexing system
@@ -662,8 +663,8 @@ UpdateIndexRelation(Oid indexoid,
* parent index; otherwise InvalidOid.
* parentConstraintId: if creating a constraint on a partition, the OID
* of the constraint in the parent; otherwise InvalidOid.
- * relFileNode: normally, pass InvalidOid to get new storage. May be
- * nonzero to attach an existing valid build.
+ * relFileNumber: normally, pass InvalidRelFileNumber to get new storage.
+ * May be nonzero to attach an existing valid build.
* indexInfo: same info executor uses to insert into the index
* indexColNames: column names to use for index (List of char *)
* accessMethodObjectId: OID of index AM to use
@@ -703,7 +704,7 @@ index_create(Relation heapRelation,
Oid indexRelationId,
Oid parentIndexRelid,
Oid parentConstraintId,
- Oid relFileNode,
+ RelFileNumber relFileNumber,
IndexInfo *indexInfo,
List *indexColNames,
Oid accessMethodObjectId,
@@ -735,7 +736,7 @@ index_create(Relation heapRelation,
char relkind;
TransactionId relfrozenxid;
MultiXactId relminmxid;
- bool create_storage = !OidIsValid(relFileNode);
+ bool create_storage = !RelFileNumberIsValid(relFileNumber);
/* constraint flags can only be set when a constraint is requested */
Assert((constr_flags == 0) ||
@@ -751,7 +752,7 @@ index_create(Relation heapRelation,
/*
* The index will be in the same namespace as its parent table, and is
* shared across databases if and only if the parent is. Likewise, it
- * will use the relfilenode map if and only if the parent does; and it
+ * will use the relfilenumber map if and only if the parent does; and it
* inherits the parent's relpersistence.
*/
namespaceId = RelationGetNamespace(heapRelation);
@@ -902,12 +903,12 @@ index_create(Relation heapRelation,
/*
* Allocate an OID for the index, unless we were told what to use.
*
- * The OID will be the relfilenode as well, so make sure it doesn't
+ * The OID will be the relfilenumber as well, so make sure it doesn't
* collide with either pg_class OIDs or existing physical files.
*/
if (!OidIsValid(indexRelationId))
{
- /* Use binary-upgrade override for pg_class.oid and relfilenode */
+ /* Use binary-upgrade override for pg_class.oid and relfilenumber */
if (IsBinaryUpgrade)
{
if (!OidIsValid(binary_upgrade_next_index_pg_class_oid))
@@ -918,14 +919,14 @@ index_create(Relation heapRelation,
indexRelationId = binary_upgrade_next_index_pg_class_oid;
binary_upgrade_next_index_pg_class_oid = InvalidOid;
- /* Override the index relfilenode */
+ /* Override the index relfilenumber */
if ((relkind == RELKIND_INDEX) &&
- (!OidIsValid(binary_upgrade_next_index_pg_class_relfilenode)))
+ (!RelFileNumberIsValid(binary_upgrade_next_index_pg_class_relfilenumber)))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("index relfilenode value not set when in binary upgrade mode")));
- relFileNode = binary_upgrade_next_index_pg_class_relfilenode;
- binary_upgrade_next_index_pg_class_relfilenode = InvalidOid;
+ errmsg("index relfilenumber value not set when in binary upgrade mode")));
+ relFileNumber = binary_upgrade_next_index_pg_class_relfilenumber;
+ binary_upgrade_next_index_pg_class_relfilenumber = InvalidRelFileNumber;
/*
* Note that we want create_storage = true for binary upgrade. The
@@ -937,7 +938,7 @@ index_create(Relation heapRelation,
else
{
indexRelationId =
- GetNewRelFileNode(tableSpaceId, pg_class, relpersistence);
+ GetNewRelFileNumber(tableSpaceId, pg_class, relpersistence);
}
}
@@ -950,7 +951,7 @@ index_create(Relation heapRelation,
namespaceId,
tableSpaceId,
indexRelationId,
- relFileNode,
+ relFileNumber,
accessMethodObjectId,
indexTupDesc,
relkind,
@@ -1408,7 +1409,7 @@ index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId,
InvalidOid, /* indexRelationId */
InvalidOid, /* parentIndexRelid */
InvalidOid, /* parentConstraintId */
- InvalidOid, /* relFileNode */
+ InvalidRelFileNumber, /* relFileNumber */
newInfo,
indexColNames,
indexRelation->rd_rel->relam,
@@ -3024,7 +3025,7 @@ index_build(Relation heapRelation,
* it -- but we must first check whether one already exists. If, for
* example, an unlogged relation is truncated in the transaction that
* created it, or truncated twice in a subsequent transaction, the
- * relfilenode won't change, and nothing needs to be done here.
+ * relfilenumber won't change, and nothing needs to be done here.
*/
if (indexRelation->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
!smgrexists(RelationGetSmgr(indexRelation), INIT_FORKNUM))
@@ -3681,7 +3682,7 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
* Schedule unlinking of the old index storage at transaction commit.
*/
RelationDropStorage(iRel);
- RelationAssumeNewRelfilenode(iRel);
+ RelationAssumeNewRelfilelocator(iRel);
/* Make sure the reltablespace change is visible */
CommandCounterIncrement();
@@ -3711,7 +3712,7 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
SetReindexProcessing(heapId, indexId);
/* Create a new physical relation for the index */
- RelationSetNewRelfilenode(iRel, persistence);
+ RelationSetNewRelfilenumber(iRel, persistence);
/* Initialize the index and rebuild */
/* Note: we do not need to re-establish pkey setting */
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index c06e414..d024d94 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -38,7 +38,7 @@
int wal_skip_threshold = 2048; /* in kilobytes */
/*
- * We keep a list of all relations (represented as RelFileNode values)
+ * We keep a list of all relations (represented as RelFileLocator values)
* that have been created or deleted in the current transaction. When
* a relation is created, we create the physical file immediately, but
* remember it so that we can delete the file again if the current
@@ -59,7 +59,7 @@ int wal_skip_threshold = 2048; /* in kilobytes */
typedef struct PendingRelDelete
{
- RelFileNode relnode; /* relation that may need to be deleted */
+ RelFileLocator rlocator; /* relation that may need to be deleted */
BackendId backend; /* InvalidBackendId if not a temp rel */
bool atCommit; /* T=delete at commit; F=delete at abort */
int nestLevel; /* xact nesting level of request */
@@ -68,7 +68,7 @@ typedef struct PendingRelDelete
typedef struct PendingRelSync
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
bool is_truncated; /* Has the file experienced truncation? */
} PendingRelSync;
@@ -81,7 +81,7 @@ static HTAB *pendingSyncHash = NULL;
* Queue an at-commit fsync.
*/
static void
-AddPendingSync(const RelFileNode *rnode)
+AddPendingSync(const RelFileLocator *rlocator)
{
PendingRelSync *pending;
bool found;
@@ -91,14 +91,14 @@ AddPendingSync(const RelFileNode *rnode)
{
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileNode);
+ ctl.keysize = sizeof(RelFileLocator);
ctl.entrysize = sizeof(PendingRelSync);
ctl.hcxt = TopTransactionContext;
pendingSyncHash = hash_create("pending sync hash", 16, &ctl,
HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
}
- pending = hash_search(pendingSyncHash, rnode, HASH_ENTER, &found);
+ pending = hash_search(pendingSyncHash, rlocator, HASH_ENTER, &found);
Assert(!found);
pending->is_truncated = false;
}
@@ -117,7 +117,7 @@ AddPendingSync(const RelFileNode *rnode)
* pass register_delete = false.
*/
SMgrRelation
-RelationCreateStorage(RelFileNode rnode, char relpersistence,
+RelationCreateStorage(RelFileLocator rlocator, char relpersistence,
bool register_delete)
{
SMgrRelation srel;
@@ -145,11 +145,11 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence,
return NULL; /* placate compiler */
}
- srel = smgropen(rnode, backend);
+ srel = smgropen(rlocator, backend);
smgrcreate(srel, MAIN_FORKNUM, false);
if (needs_wal)
- log_smgrcreate(&srel->smgr_rnode.node, MAIN_FORKNUM);
+ log_smgrcreate(&srel->smgr_rlocator.locator, MAIN_FORKNUM);
/*
* Add the relation to the list of stuff to delete at abort, if we are
@@ -161,7 +161,7 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence,
pending = (PendingRelDelete *)
MemoryContextAlloc(TopMemoryContext, sizeof(PendingRelDelete));
- pending->relnode = rnode;
+ pending->rlocator = rlocator;
pending->backend = backend;
pending->atCommit = false; /* delete if abort */
pending->nestLevel = GetCurrentTransactionNestLevel();
@@ -172,7 +172,7 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence,
if (relpersistence == RELPERSISTENCE_PERMANENT && !XLogIsNeeded())
{
Assert(backend == InvalidBackendId);
- AddPendingSync(&rnode);
+ AddPendingSync(&rlocator);
}
return srel;
@@ -182,14 +182,14 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence,
* Perform XLogInsert of an XLOG_SMGR_CREATE record to WAL.
*/
void
-log_smgrcreate(const RelFileNode *rnode, ForkNumber forkNum)
+log_smgrcreate(const RelFileLocator *rlocator, ForkNumber forkNum)
{
xl_smgr_create xlrec;
/*
* Make an XLOG entry reporting the file creation.
*/
- xlrec.rnode = *rnode;
+ xlrec.rlocator = *rlocator;
xlrec.forkNum = forkNum;
XLogBeginInsert();
@@ -209,7 +209,7 @@ RelationDropStorage(Relation rel)
/* Add the relation to the list of stuff to delete at commit */
pending = (PendingRelDelete *)
MemoryContextAlloc(TopMemoryContext, sizeof(PendingRelDelete));
- pending->relnode = rel->rd_node;
+ pending->rlocator = rel->rd_locator;
pending->backend = rel->rd_backend;
pending->atCommit = true; /* delete if commit */
pending->nestLevel = GetCurrentTransactionNestLevel();
@@ -247,7 +247,7 @@ RelationDropStorage(Relation rel)
* No-op if the relation is not among those scheduled for deletion.
*/
void
-RelationPreserveStorage(RelFileNode rnode, bool atCommit)
+RelationPreserveStorage(RelFileLocator rlocator, bool atCommit)
{
PendingRelDelete *pending;
PendingRelDelete *prev;
@@ -257,7 +257,7 @@ RelationPreserveStorage(RelFileNode rnode, bool atCommit)
for (pending = pendingDeletes; pending != NULL; pending = next)
{
next = pending->next;
- if (RelFileNodeEquals(rnode, pending->relnode)
+ if (RelFileLocatorEquals(rlocator, pending->rlocator)
&& pending->atCommit == atCommit)
{
/* unlink and delete list entry */
@@ -369,7 +369,7 @@ RelationTruncate(Relation rel, BlockNumber nblocks)
xl_smgr_truncate xlrec;
xlrec.blkno = nblocks;
- xlrec.rnode = rel->rd_node;
+ xlrec.rlocator = rel->rd_locator;
xlrec.flags = SMGR_TRUNCATE_ALL;
XLogBeginInsert();
@@ -428,7 +428,7 @@ RelationPreTruncate(Relation rel)
return;
pending = hash_search(pendingSyncHash,
- &(RelationGetSmgr(rel)->smgr_rnode.node),
+ &(RelationGetSmgr(rel)->smgr_rlocator.locator),
HASH_FIND, NULL);
if (pending)
pending->is_truncated = true;
@@ -472,7 +472,7 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
* We need to log the copied data in WAL iff WAL archiving/streaming is
* enabled AND it's a permanent relation. This gives the same answer as
* "RelationNeedsWAL(rel) || copying_initfork", because we know the
- * current operation created a new relfilenode.
+ * current operation created new relation storage.
*/
use_wal = XLogIsNeeded() &&
(relpersistence == RELPERSISTENCE_PERMANENT || copying_initfork);
@@ -496,8 +496,8 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
* (errcontext callbacks shouldn't be risking any such thing, but
* people have been known to forget that rule.)
*/
- char *relpath = relpathbackend(src->smgr_rnode.node,
- src->smgr_rnode.backend,
+ char *relpath = relpathbackend(src->smgr_rlocator.locator,
+ src->smgr_rlocator.backend,
forkNum);
ereport(ERROR,
@@ -512,7 +512,7 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
* space.
*/
if (use_wal)
- log_newpage(&dst->smgr_rnode.node, forkNum, blkno, page, false);
+ log_newpage(&dst->smgr_rlocator.locator, forkNum, blkno, page, false);
PageSetChecksumInplace(page, blkno);
@@ -538,19 +538,19 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
}
/*
- * RelFileNodeSkippingWAL
- * Check if a BM_PERMANENT relfilenode is using WAL.
+ * RelFileLocatorSkippingWAL
+ * Check if a BM_PERMANENT relfilelocator is using WAL.
*
- * Changes of certain relfilenodes must not write WAL; see "Skipping WAL for
- * New RelFileNode" in src/backend/access/transam/README. Though it is known
- * from Relation efficiently, this function is intended for the code paths not
- * having access to Relation.
+ * Changes of certain relfilelocators must not write WAL; see "Skipping WAL for
+ * New RelFileLocator" in src/backend/access/transam/README. Though it is
+ * known from Relation efficiently, this function is intended for the code
+ * paths not having access to Relation.
*/
bool
-RelFileNodeSkippingWAL(RelFileNode rnode)
+RelFileLocatorSkippingWAL(RelFileLocator rlocator)
{
if (!pendingSyncHash ||
- hash_search(pendingSyncHash, &rnode, HASH_FIND, NULL) == NULL)
+ hash_search(pendingSyncHash, &rlocator, HASH_FIND, NULL) == NULL)
return false;
return true;
@@ -566,7 +566,7 @@ EstimatePendingSyncsSpace(void)
long entries;
entries = pendingSyncHash ? hash_get_num_entries(pendingSyncHash) : 0;
- return mul_size(1 + entries, sizeof(RelFileNode));
+ return mul_size(1 + entries, sizeof(RelFileLocator));
}
/*
@@ -581,57 +581,58 @@ SerializePendingSyncs(Size maxSize, char *startAddress)
HASH_SEQ_STATUS scan;
PendingRelSync *sync;
PendingRelDelete *delete;
- RelFileNode *src;
- RelFileNode *dest = (RelFileNode *) startAddress;
+ RelFileLocator *src;
+ RelFileLocator *dest = (RelFileLocator *) startAddress;
if (!pendingSyncHash)
goto terminate;
- /* Create temporary hash to collect active relfilenodes */
- ctl.keysize = sizeof(RelFileNode);
- ctl.entrysize = sizeof(RelFileNode);
+ /* Create temporary hash to collect active relfilelocators */
+ ctl.keysize = sizeof(RelFileLocator);
+ ctl.entrysize = sizeof(RelFileLocator);
ctl.hcxt = CurrentMemoryContext;
- tmphash = hash_create("tmp relfilenodes",
+ tmphash = hash_create("tmp relfilelocators",
hash_get_num_entries(pendingSyncHash), &ctl,
HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
- /* collect all rnodes from pending syncs */
+ /* collect all rlocators from pending syncs */
hash_seq_init(&scan, pendingSyncHash);
while ((sync = (PendingRelSync *) hash_seq_search(&scan)))
- (void) hash_search(tmphash, &sync->rnode, HASH_ENTER, NULL);
+ (void) hash_search(tmphash, &sync->rlocator, HASH_ENTER, NULL);
- /* remove deleted rnodes */
+ /* remove deleted rlocators */
for (delete = pendingDeletes; delete != NULL; delete = delete->next)
if (delete->atCommit)
- (void) hash_search(tmphash, (void *) &delete->relnode,
+ (void) hash_search(tmphash, (void *) &delete->rlocator,
HASH_REMOVE, NULL);
hash_seq_init(&scan, tmphash);
- while ((src = (RelFileNode *) hash_seq_search(&scan)))
+ while ((src = (RelFileLocator *) hash_seq_search(&scan)))
*dest++ = *src;
hash_destroy(tmphash);
terminate:
- MemSet(dest, 0, sizeof(RelFileNode));
+ MemSet(dest, 0, sizeof(RelFileLocator));
}
/*
* RestorePendingSyncs
* Restore syncs within a parallel worker.
*
- * RelationNeedsWAL() and RelFileNodeSkippingWAL() must offer the correct
+ * RelationNeedsWAL() and RelFileLocatorSkippingWAL() must offer the correct
* answer to parallel workers. Only smgrDoPendingSyncs() reads the
* is_truncated field, at end of transaction. Hence, don't restore it.
*/
void
RestorePendingSyncs(char *startAddress)
{
- RelFileNode *rnode;
+ RelFileLocator *rlocator;
Assert(pendingSyncHash == NULL);
- for (rnode = (RelFileNode *) startAddress; rnode->relNode != 0; rnode++)
- AddPendingSync(rnode);
+ for (rlocator = (RelFileLocator *) startAddress; rlocator->relNumber != 0;
+ rlocator++)
+ AddPendingSync(rlocator);
}
/*
@@ -677,7 +678,7 @@ smgrDoPendingDeletes(bool isCommit)
{
SMgrRelation srel;
- srel = smgropen(pending->relnode, pending->backend);
+ srel = smgropen(pending->rlocator, pending->backend);
/* allocate the initial array, or extend it, if needed */
if (maxrels == 0)
@@ -747,7 +748,7 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
/* Skip syncing nodes that smgrDoPendingDeletes() will delete. */
for (pending = pendingDeletes; pending != NULL; pending = pending->next)
if (pending->atCommit)
- (void) hash_search(pendingSyncHash, (void *) &pending->relnode,
+ (void) hash_search(pendingSyncHash, (void *) &pending->rlocator,
HASH_REMOVE, NULL);
hash_seq_init(&scan, pendingSyncHash);
@@ -758,7 +759,7 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
BlockNumber total_blocks = 0;
SMgrRelation srel;
- srel = smgropen(pendingsync->rnode, InvalidBackendId);
+ srel = smgropen(pendingsync->rlocator, InvalidBackendId);
/*
* We emit newpage WAL records for smaller relations.
@@ -832,7 +833,7 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
* page including any unused space. ReadBufferExtended()
* counts some pgstat events; unfortunately, we discard them.
*/
- rel = CreateFakeRelcacheEntry(srel->smgr_rnode.node);
+ rel = CreateFakeRelcacheEntry(srel->smgr_rlocator.locator);
log_newpage_range(rel, fork, 0, n, false);
FreeFakeRelcacheEntry(rel);
}
@@ -852,7 +853,7 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
* smgrGetPendingDeletes() -- Get a list of non-temp relations to be deleted.
*
* The return value is the number of relations scheduled for termination.
- * *ptr is set to point to a freshly-palloc'd array of RelFileNodes.
+ * *ptr is set to point to a freshly-palloc'd array of RelFileLocators.
* If there are no relations to be deleted, *ptr is set to NULL.
*
* Only non-temporary relations are included in the returned list. This is OK
@@ -866,11 +867,11 @@ smgrDoPendingSyncs(bool isCommit, bool isParallelWorker)
* by upper-level transactions.
*/
int
-smgrGetPendingDeletes(bool forCommit, RelFileNode **ptr)
+smgrGetPendingDeletes(bool forCommit, RelFileLocator **ptr)
{
int nestLevel = GetCurrentTransactionNestLevel();
int nrels;
- RelFileNode *rptr;
+ RelFileLocator *rptr;
PendingRelDelete *pending;
nrels = 0;
@@ -885,14 +886,14 @@ smgrGetPendingDeletes(bool forCommit, RelFileNode **ptr)
*ptr = NULL;
return 0;
}
- rptr = (RelFileNode *) palloc(nrels * sizeof(RelFileNode));
+ rptr = (RelFileLocator *) palloc(nrels * sizeof(RelFileLocator));
*ptr = rptr;
for (pending = pendingDeletes; pending != NULL; pending = pending->next)
{
if (pending->nestLevel >= nestLevel && pending->atCommit == forCommit
&& pending->backend == InvalidBackendId)
{
- *rptr = pending->relnode;
+ *rptr = pending->rlocator;
rptr++;
}
}
@@ -967,7 +968,7 @@ smgr_redo(XLogReaderState *record)
xl_smgr_create *xlrec = (xl_smgr_create *) XLogRecGetData(record);
SMgrRelation reln;
- reln = smgropen(xlrec->rnode, InvalidBackendId);
+ reln = smgropen(xlrec->rlocator, InvalidBackendId);
smgrcreate(reln, xlrec->forkNum, true);
}
else if (info == XLOG_SMGR_TRUNCATE)
@@ -980,7 +981,7 @@ smgr_redo(XLogReaderState *record)
int nforks = 0;
bool need_fsm_vacuum = false;
- reln = smgropen(xlrec->rnode, InvalidBackendId);
+ reln = smgropen(xlrec->rlocator, InvalidBackendId);
/*
* Forcibly create relation if it doesn't exist (which suggests that
@@ -1015,11 +1016,11 @@ smgr_redo(XLogReaderState *record)
nforks++;
/* Also tell xlogutils.c about it */
- XLogTruncateRelation(xlrec->rnode, MAIN_FORKNUM, xlrec->blkno);
+ XLogTruncateRelation(xlrec->rlocator, MAIN_FORKNUM, xlrec->blkno);
}
/* Prepare for truncation of FSM and VM too */
- rel = CreateFakeRelcacheEntry(xlrec->rnode);
+ rel = CreateFakeRelcacheEntry(xlrec->rlocator);
if ((xlrec->flags & SMGR_TRUNCATE_FSM) != 0 &&
smgrexists(reln, FSM_FORKNUM))
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index cea2c8b..da137eb 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -293,7 +293,7 @@ cluster_multiple_rels(List *rtcs, ClusterParams *params)
* cluster_rel
*
* This clusters the table by creating a new, clustered table and
- * swapping the relfilenodes of the new table and the old table, so
+ * swapping the relfilenumbers of the new table and the old table, so
* the OID of the original table is preserved. Thus we do not lose
* GRANT, inheritance nor references to this table (this was a bug
* in releases through 7.3).
@@ -1025,8 +1025,8 @@ copy_table_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
/*
* Swap the physical files of two given relations.
*
- * We swap the physical identity (reltablespace, relfilenode) while keeping the
- * same logical identities of the two relations. relpersistence is also
+ * We swap the physical identity (reltablespace, relfilenumber) while keeping
+ * the same logical identities of the two relations. relpersistence is also
* swapped, which is critical since it determines where buffers live for each
* relation.
*
@@ -1061,9 +1061,9 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
reltup2;
Form_pg_class relform1,
relform2;
- Oid relfilenode1,
- relfilenode2;
- Oid swaptemp;
+ RelFileNumber relfilenumber1,
+ relfilenumber2;
+ RelFileNumber swaptemp;
char swptmpchr;
/* We need writable copies of both pg_class tuples. */
@@ -1079,13 +1079,14 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
elog(ERROR, "cache lookup failed for relation %u", r2);
relform2 = (Form_pg_class) GETSTRUCT(reltup2);
- relfilenode1 = relform1->relfilenode;
- relfilenode2 = relform2->relfilenode;
+ relfilenumber1 = relform1->relfilenode;
+ relfilenumber2 = relform2->relfilenode;
- if (OidIsValid(relfilenode1) && OidIsValid(relfilenode2))
+ if (RelFileNumberIsValid(relfilenumber1) &&
+ RelFileNumberIsValid(relfilenumber2))
{
/*
- * Normal non-mapped relations: swap relfilenodes, reltablespaces,
+ * Normal non-mapped relations: swap relfilenumbers, reltablespaces,
* relpersistence
*/
Assert(!target_is_pg_class);
@@ -1120,7 +1121,8 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
* Mapped-relation case. Here we have to swap the relation mappings
* instead of modifying the pg_class columns. Both must be mapped.
*/
- if (OidIsValid(relfilenode1) || OidIsValid(relfilenode2))
+ if (RelFileNumberIsValid(relfilenumber1) ||
+ RelFileNumberIsValid(relfilenumber2))
elog(ERROR, "cannot swap mapped relation \"%s\" with non-mapped relation",
NameStr(relform1->relname));
@@ -1148,12 +1150,12 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
/*
* Fetch the mappings --- shouldn't fail, but be paranoid
*/
- relfilenode1 = RelationMapOidToFilenode(r1, relform1->relisshared);
- if (!OidIsValid(relfilenode1))
+ relfilenumber1 = RelationMapOidToFilenumber(r1, relform1->relisshared);
+ if (!RelFileNumberIsValid(relfilenumber1))
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
NameStr(relform1->relname), r1);
- relfilenode2 = RelationMapOidToFilenode(r2, relform2->relisshared);
- if (!OidIsValid(relfilenode2))
+ relfilenumber2 = RelationMapOidToFilenumber(r2, relform2->relisshared);
+ if (!RelFileNumberIsValid(relfilenumber2))
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
NameStr(relform2->relname), r2);
@@ -1161,15 +1163,15 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
* Send replacement mappings to relmapper. Note these won't actually
* take effect until CommandCounterIncrement.
*/
- RelationMapUpdateMap(r1, relfilenode2, relform1->relisshared, false);
- RelationMapUpdateMap(r2, relfilenode1, relform2->relisshared, false);
+ RelationMapUpdateMap(r1, relfilenumber2, relform1->relisshared, false);
+ RelationMapUpdateMap(r2, relfilenumber1, relform2->relisshared, false);
/* Pass OIDs of mapped r2 tables back to caller */
*mapped_tables++ = r2;
}
/*
- * Recognize that rel1's relfilenode (swapped from rel2) is new in this
+ * Recognize that rel1's relfilenumber (swapped from rel2) is new in this
* subtransaction. The rel2 storage (swapped from rel1) may or may not be
* new.
*/
@@ -1180,9 +1182,9 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
rel1 = relation_open(r1, NoLock);
rel2 = relation_open(r2, NoLock);
rel2->rd_createSubid = rel1->rd_createSubid;
- rel2->rd_newRelfilenodeSubid = rel1->rd_newRelfilenodeSubid;
- rel2->rd_firstRelfilenodeSubid = rel1->rd_firstRelfilenodeSubid;
- RelationAssumeNewRelfilenode(rel1);
+ rel2->rd_newRelfilelocatorSubid = rel1->rd_newRelfilelocatorSubid;
+ rel2->rd_firstRelfilelocatorSubid = rel1->rd_firstRelfilelocatorSubid;
+ RelationAssumeNewRelfilelocator(rel1);
relation_close(rel1, NoLock);
relation_close(rel2, NoLock);
}
@@ -1523,7 +1525,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
table_close(relRelation, RowExclusiveLock);
}
- /* Destroy new heap with old filenode */
+ /* Destroy new heap with old filenumber */
object.classId = RelationRelationId;
object.objectId = OIDNewHeap;
object.objectSubId = 0;
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 35a1d3a..a976008 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -593,12 +593,12 @@ CopyFrom(CopyFromState cstate)
*/
if (RELKIND_HAS_STORAGE(cstate->rel->rd_rel->relkind) &&
(cstate->rel->rd_createSubid != InvalidSubTransactionId ||
- cstate->rel->rd_firstRelfilenodeSubid != InvalidSubTransactionId))
+ cstate->rel->rd_firstRelfilelocatorSubid != InvalidSubTransactionId))
ti_options |= TABLE_INSERT_SKIP_FSM;
/*
- * Optimize if new relfilenode was created in this subxact or one of its
- * committed children and we won't see those rows later as part of an
+ * Optimize if new relation storage was created in this subxact or one of
+ * its committed children and we won't see those rows later as part of an
* earlier scan or command. The subxact test ensures that if this subxact
* aborts then the frozen rows won't be visible after xact cleanup. Note
* that the stronger test of exactly which subtransaction created it is
@@ -640,7 +640,7 @@ CopyFrom(CopyFromState cstate)
errmsg("cannot perform COPY FREEZE because of prior transaction activity")));
if (cstate->rel->rd_createSubid != GetCurrentSubTransactionId() &&
- cstate->rel->rd_newRelfilenodeSubid != GetCurrentSubTransactionId())
+ cstate->rel->rd_newRelfilelocatorSubid != GetCurrentSubTransactionId())
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("cannot perform COPY FREEZE because the table was not created or truncated in the current subtransaction")));
diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index f269168..c78bab5 100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -101,7 +101,7 @@ typedef struct
*/
typedef struct CreateDBRelInfo
{
- RelFileNode rnode; /* physical relation identifier */
+ RelFileLocator rlocator; /* physical relation identifier */
Oid reloid; /* relation oid */
bool permanent; /* relation is permanent or unlogged */
} CreateDBRelInfo;
@@ -127,7 +127,7 @@ static void CreateDatabaseUsingWalLog(Oid src_dboid, Oid dboid, Oid src_tsid,
static List *ScanSourceDatabasePgClass(Oid srctbid, Oid srcdbid, char *srcpath);
static List *ScanSourceDatabasePgClassPage(Page page, Buffer buf, Oid tbid,
Oid dbid, char *srcpath,
- List *rnodelist, Snapshot snapshot);
+ List *rlocatorlist, Snapshot snapshot);
static CreateDBRelInfo *ScanSourceDatabasePgClassTuple(HeapTupleData *tuple,
Oid tbid, Oid dbid,
char *srcpath);
@@ -147,12 +147,12 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
{
char *srcpath;
char *dstpath;
- List *rnodelist = NULL;
+ List *rlocatorlist = NULL;
ListCell *cell;
LockRelId srcrelid;
LockRelId dstrelid;
- RelFileNode srcrnode;
- RelFileNode dstrnode;
+ RelFileLocator srcrlocator;
+ RelFileLocator dstrlocator;
CreateDBRelInfo *relinfo;
/* Get source and destination database paths. */
@@ -165,9 +165,9 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
/* Copy relmap file from source database to the destination database. */
RelationMapCopy(dst_dboid, dst_tsid, srcpath, dstpath);
- /* Get list of relfilenodes to copy from the source database. */
- rnodelist = ScanSourceDatabasePgClass(src_tsid, src_dboid, srcpath);
- Assert(rnodelist != NIL);
+ /* Get list of relfilelocators to copy from the source database. */
+ rlocatorlist = ScanSourceDatabasePgClass(src_tsid, src_dboid, srcpath);
+ Assert(rlocatorlist != NIL);
/*
* Database IDs will be the same for all relations so set them before
@@ -176,11 +176,11 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
srcrelid.dbId = src_dboid;
dstrelid.dbId = dst_dboid;
- /* Loop over our list of relfilenodes and copy each one. */
- foreach(cell, rnodelist)
+ /* Loop over our list of relfilelocators and copy each one. */
+ foreach(cell, rlocatorlist)
{
relinfo = lfirst(cell);
- srcrnode = relinfo->rnode;
+ srcrlocator = relinfo->rlocator;
/*
* If the relation is from the source db's default tablespace then we
@@ -188,13 +188,13 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
* Otherwise, we need to create in the same tablespace as it is in the
* source database.
*/
- if (srcrnode.spcNode == src_tsid)
- dstrnode.spcNode = dst_tsid;
+ if (srcrlocator.spcOid == src_tsid)
+ dstrlocator.spcOid = dst_tsid;
else
- dstrnode.spcNode = srcrnode.spcNode;
+ dstrlocator.spcOid = srcrlocator.spcOid;
- dstrnode.dbNode = dst_dboid;
- dstrnode.relNode = srcrnode.relNode;
+ dstrlocator.dbOid = dst_dboid;
+ dstrlocator.relNumber = srcrlocator.relNumber;
/*
* Acquire locks on source and target relations before copying.
@@ -210,7 +210,7 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
LockRelationId(&dstrelid, AccessShareLock);
/* Copy relation storage from source to the destination. */
- CreateAndCopyRelationData(srcrnode, dstrnode, relinfo->permanent);
+ CreateAndCopyRelationData(srcrlocator, dstrlocator, relinfo->permanent);
/* Release the relation locks. */
UnlockRelationId(&srcrelid, AccessShareLock);
@@ -219,7 +219,7 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
pfree(srcpath);
pfree(dstpath);
- list_free_deep(rnodelist);
+ list_free_deep(rlocatorlist);
}
/*
@@ -246,31 +246,31 @@ CreateDatabaseUsingWalLog(Oid src_dboid, Oid dst_dboid,
static List *
ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
BlockNumber nblocks;
BlockNumber blkno;
Buffer buf;
- Oid relfilenode;
+ RelFileNumber relfilenumber;
Page page;
- List *rnodelist = NIL;
+ List *rlocatorlist = NIL;
LockRelId relid;
Relation rel;
Snapshot snapshot;
BufferAccessStrategy bstrategy;
- /* Get pg_class relfilenode. */
- relfilenode = RelationMapOidToFilenodeForDatabase(srcpath,
- RelationRelationId);
+ /* Get pg_class relfilenumber. */
+ relfilenumber = RelationMapOidToFilenumberForDatabase(srcpath,
+ RelationRelationId);
/* Don't read data into shared_buffers without holding a relation lock. */
relid.dbId = dbid;
relid.relId = RelationRelationId;
LockRelationId(&relid, AccessShareLock);
- /* Prepare a RelFileNode for the pg_class relation. */
- rnode.spcNode = tbid;
- rnode.dbNode = dbid;
- rnode.relNode = relfilenode;
+ /* Prepare a RelFileLocator for the pg_class relation. */
+ rlocator.spcOid = tbid;
+ rlocator.dbOid = dbid;
+ rlocator.relNumber = relfilenumber;
/*
* We can't use a real relcache entry for a relation in some other
@@ -279,7 +279,7 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
* used the smgr layer directly, we would have to worry about
* invalidations.
*/
- rel = CreateFakeRelcacheEntry(rnode);
+ rel = CreateFakeRelcacheEntry(rlocator);
nblocks = smgrnblocks(RelationGetSmgr(rel), MAIN_FORKNUM);
FreeFakeRelcacheEntry(rel);
@@ -299,7 +299,7 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
{
CHECK_FOR_INTERRUPTS();
- buf = ReadBufferWithoutRelcache(rnode, MAIN_FORKNUM, blkno,
+ buf = ReadBufferWithoutRelcache(rlocator, MAIN_FORKNUM, blkno,
RBM_NORMAL, bstrategy, false);
LockBuffer(buf, BUFFER_LOCK_SHARE);
@@ -310,9 +310,9 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
continue;
}
- /* Append relevant pg_class tuples for current page to rnodelist. */
- rnodelist = ScanSourceDatabasePgClassPage(page, buf, tbid, dbid,
- srcpath, rnodelist,
+ /* Append relevant pg_class tuples for current page to rlocatorlist. */
+ rlocatorlist = ScanSourceDatabasePgClassPage(page, buf, tbid, dbid,
+ srcpath, rlocatorlist,
snapshot);
UnlockReleaseBuffer(buf);
@@ -321,16 +321,16 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
/* Release relation lock. */
UnlockRelationId(&relid, AccessShareLock);
- return rnodelist;
+ return rlocatorlist;
}
/*
* Scan one page of the source database's pg_class relation and add relevant
- * entries to rnodelist. The return value is the updated list.
+ * entries to rlocatorlist. The return value is the updated list.
*/
static List *
ScanSourceDatabasePgClassPage(Page page, Buffer buf, Oid tbid, Oid dbid,
- char *srcpath, List *rnodelist,
+ char *srcpath, List *rlocatorlist,
Snapshot snapshot)
{
BlockNumber blkno = BufferGetBlockNumber(buf);
@@ -376,11 +376,11 @@ ScanSourceDatabasePgClassPage(Page page, Buffer buf, Oid tbid, Oid dbid,
relinfo = ScanSourceDatabasePgClassTuple(&tuple, tbid, dbid,
srcpath);
if (relinfo != NULL)
- rnodelist = lappend(rnodelist, relinfo);
+ rlocatorlist = lappend(rlocatorlist, relinfo);
}
}
- return rnodelist;
+ return rlocatorlist;
}
/*
@@ -397,7 +397,7 @@ ScanSourceDatabasePgClassTuple(HeapTupleData *tuple, Oid tbid, Oid dbid,
{
CreateDBRelInfo *relinfo;
Form_pg_class classForm;
- Oid relfilenode = InvalidOid;
+ RelFileNumber relfilenumber = InvalidRelFileNumber;
classForm = (Form_pg_class) GETSTRUCT(tuple);
@@ -418,29 +418,29 @@ ScanSourceDatabasePgClassTuple(HeapTupleData *tuple, Oid tbid, Oid dbid,
return NULL;
/*
- * If relfilenode is valid then directly use it. Otherwise, consult the
+ * If relfilenumber is valid then directly use it. Otherwise, consult the
* relmap.
*/
- if (OidIsValid(classForm->relfilenode))
- relfilenode = classForm->relfilenode;
+ if (RelFileNumberIsValid(classForm->relfilenode))
+ relfilenumber = classForm->relfilenode;
else
- relfilenode = RelationMapOidToFilenodeForDatabase(srcpath,
- classForm->oid);
+ relfilenumber = RelationMapOidToFilenumberForDatabase(srcpath,
+ classForm->oid);
- /* We must have a valid relfilenode oid. */
- if (!OidIsValid(relfilenode))
- elog(ERROR, "relation with OID %u does not have a valid relfilenode",
+ /* We must have a valid relfilenumber. */
+ if (!RelFileNumberIsValid(relfilenumber))
+ elog(ERROR, "relation with OID %u does not have a valid relfilenumber",
classForm->oid);
/* Prepare a rel info element and add it to the list. */
relinfo = (CreateDBRelInfo *) palloc(sizeof(CreateDBRelInfo));
if (OidIsValid(classForm->reltablespace))
- relinfo->rnode.spcNode = classForm->reltablespace;
+ relinfo->rlocator.spcOid = classForm->reltablespace;
else
- relinfo->rnode.spcNode = tbid;
+ relinfo->rlocator.spcOid = tbid;
- relinfo->rnode.dbNode = dbid;
- relinfo->rnode.relNode = relfilenode;
+ relinfo->rlocator.dbOid = dbid;
+ relinfo->rlocator.relNumber = relfilenumber;
relinfo->reloid = classForm->oid;
/* Temporary relations were rejected above. */
@@ -2867,8 +2867,8 @@ remove_dbtablespaces(Oid db_id)
* try to remove that already-existing subdirectory during the cleanup in
* remove_dbtablespaces. Nuking existing files seems like a bad idea, so
* instead we make this extra check before settling on the OID of the new
- * database. This exactly parallels what GetNewRelFileNode() does for table
- * relfilenode values.
+ * database. This exactly parallels what GetNewRelFileNumber() does for table
+ * relfilenumber values.
*/
static bool
check_db_file_conflict(Oid db_id)
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 99f5ab8..1868608 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -1109,10 +1109,10 @@ DefineIndex(Oid relationId,
}
/*
- * A valid stmt->oldNode implies that we already have a built form of the
+ * A valid stmt->oldNumber implies that we already have a built form of the
* index. The caller should also decline any index build.
*/
- Assert(!OidIsValid(stmt->oldNode) || (skip_build && !concurrent));
+ Assert(!RelFileNumberIsValid(stmt->oldNumber) || (skip_build && !concurrent));
/*
* Make the catalog entries for the index, including constraints. This
@@ -1154,7 +1154,7 @@ DefineIndex(Oid relationId,
indexRelationId =
index_create(rel, indexRelationName, indexRelationId, parentIndexId,
parentConstraintId,
- stmt->oldNode, indexInfo, indexColNames,
+ stmt->oldNumber, indexInfo, indexColNames,
accessMethodId, tablespaceId,
collationObjectId, classObjectId,
coloptions, reloptions,
@@ -1361,15 +1361,15 @@ DefineIndex(Oid relationId,
* We can't use the same index name for the child index,
* so clear idxname to let the recursive invocation choose
* a new name. Likewise, the existing target relation
- * field is wrong, and if indexOid or oldNode are set,
+ * field is wrong, and if indexOid or oldNumber are set,
* they mustn't be applied to the child either.
*/
childStmt->idxname = NULL;
childStmt->relation = NULL;
childStmt->indexOid = InvalidOid;
- childStmt->oldNode = InvalidOid;
+ childStmt->oldNumber = InvalidRelFileNumber;
childStmt->oldCreateSubid = InvalidSubTransactionId;
- childStmt->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ childStmt->oldFirstRelfilelocatorSubid = InvalidSubTransactionId;
/*
* Adjust any Vars (both in expressions and in the index's
@@ -3015,7 +3015,7 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
* particular this eliminates all shared catalogs.).
*/
if (RELKIND_HAS_STORAGE(classtuple->relkind) &&
- !OidIsValid(classtuple->relfilenode))
+ !RelFileNumberIsValid(classtuple->relfilenode))
skip_rel = true;
/*
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index d1ee106..9ac0383 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -118,7 +118,7 @@ SetMatViewPopulatedState(Relation relation, bool newstate)
* ExecRefreshMatView -- execute a REFRESH MATERIALIZED VIEW command
*
* This refreshes the materialized view by creating a new table and swapping
- * the relfilenodes of the new table and the old materialized view, so the OID
+ * the relfilenumbers of the new table and the old materialized view, so the OID
* of the original materialized view is preserved. Thus we do not lose GRANT
* nor references to this materialized view.
*
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index ddf219b..48d9d43 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -75,7 +75,7 @@ typedef struct sequence_magic
typedef struct SeqTableData
{
Oid relid; /* pg_class OID of this sequence (hash key) */
- Oid filenode; /* last seen relfilenode of this sequence */
+ RelFileNumber filenumber; /* last seen relfilenumber of this sequence */
LocalTransactionId lxid; /* xact in which we last did a seq op */
bool last_valid; /* do we have a valid "last" value? */
int64 last; /* value last returned by nextval */
@@ -255,7 +255,7 @@ DefineSequence(ParseState *pstate, CreateSeqStmt *seq)
*
* The change is made transactionally, so that on failure of the current
* transaction, the sequence will be restored to its previous state.
- * We do that by creating a whole new relfilenode for the sequence; so this
+ * We do that by creating a whole new relfilenumber for the sequence; so this
* works much like the rewriting forms of ALTER TABLE.
*
* Caller is assumed to have acquired AccessExclusiveLock on the sequence,
@@ -310,7 +310,7 @@ ResetSequence(Oid seq_relid)
/*
* Create a new storage file for the sequence.
*/
- RelationSetNewRelfilenode(seq_rel, seq_rel->rd_rel->relpersistence);
+ RelationSetNewRelfilenumber(seq_rel, seq_rel->rd_rel->relpersistence);
/*
* Ensure sequence's relfrozenxid is at 0, since it won't contain any
@@ -347,9 +347,9 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
{
SMgrRelation srel;
- srel = smgropen(rel->rd_node, InvalidBackendId);
+ srel = smgropen(rel->rd_locator, InvalidBackendId);
smgrcreate(srel, INIT_FORKNUM, false);
- log_smgrcreate(&rel->rd_node, INIT_FORKNUM);
+ log_smgrcreate(&rel->rd_locator, INIT_FORKNUM);
fill_seq_fork_with_data(rel, tuple, INIT_FORKNUM);
FlushRelationBuffers(rel);
smgrclose(srel);
@@ -418,7 +418,7 @@ fill_seq_fork_with_data(Relation rel, HeapTuple tuple, ForkNumber forkNum)
XLogBeginInsert();
XLogRegisterBuffer(0, buf, REGBUF_WILL_INIT);
- xlrec.node = rel->rd_node;
+ xlrec.locator = rel->rd_locator;
XLogRegisterData((char *) &xlrec, sizeof(xl_seq_rec));
XLogRegisterData((char *) tuple->t_data, tuple->t_len);
@@ -509,7 +509,7 @@ AlterSequence(ParseState *pstate, AlterSeqStmt *stmt)
* Create a new storage file for the sequence, making the state
* changes transactional.
*/
- RelationSetNewRelfilenode(seqrel, seqrel->rd_rel->relpersistence);
+ RelationSetNewRelfilenumber(seqrel, seqrel->rd_rel->relpersistence);
/*
* Ensure sequence's relfrozenxid is at 0, since it won't contain any
@@ -557,7 +557,7 @@ SequenceChangePersistence(Oid relid, char newrelpersistence)
GetTopTransactionId();
(void) read_seq_tuple(seqrel, &buf, &seqdatatuple);
- RelationSetNewRelfilenode(seqrel, newrelpersistence);
+ RelationSetNewRelfilenumber(seqrel, newrelpersistence);
fill_seq_with_data(seqrel, &seqdatatuple);
UnlockReleaseBuffer(buf);
@@ -836,7 +836,7 @@ nextval_internal(Oid relid, bool check_permissions)
seq->is_called = true;
seq->log_cnt = 0;
- xlrec.node = seqrel->rd_node;
+ xlrec.locator = seqrel->rd_locator;
XLogRegisterData((char *) &xlrec, sizeof(xl_seq_rec));
XLogRegisterData((char *) seqdatatuple.t_data, seqdatatuple.t_len);
@@ -1023,7 +1023,7 @@ do_setval(Oid relid, int64 next, bool iscalled)
XLogBeginInsert();
XLogRegisterBuffer(0, buf, REGBUF_WILL_INIT);
- xlrec.node = seqrel->rd_node;
+ xlrec.locator = seqrel->rd_locator;
XLogRegisterData((char *) &xlrec, sizeof(xl_seq_rec));
XLogRegisterData((char *) seqdatatuple.t_data, seqdatatuple.t_len);
@@ -1147,7 +1147,7 @@ init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel)
if (!found)
{
/* relid already filled in */
- elm->filenode = InvalidOid;
+ elm->filenumber = InvalidRelFileNumber;
elm->lxid = InvalidLocalTransactionId;
elm->last_valid = false;
elm->last = elm->cached = 0;
@@ -1169,9 +1169,9 @@ init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel)
* discard any cached-but-unissued values. We do not touch the currval()
* state, however.
*/
- if (seqrel->rd_rel->relfilenode != elm->filenode)
+ if (seqrel->rd_rel->relfilenode != elm->filenumber)
{
- elm->filenode = seqrel->rd_rel->relfilenode;
+ elm->filenumber = seqrel->rd_rel->relfilenode;
elm->cached = elm->last;
}
@@ -1254,7 +1254,8 @@ read_seq_tuple(Relation rel, Buffer *buf, HeapTuple seqdatatuple)
* changed. This allows ALTER SEQUENCE to behave transactionally. Currently,
* the only option that doesn't cause that is OWNED BY. It's *necessary* for
* ALTER SEQUENCE OWNED BY to not rewrite the sequence, because that would
- * break pg_upgrade by causing unwanted changes in the sequence's relfilenode.
+ * break pg_upgrade by causing unwanted changes in the sequence's
+ * relfilenumber.
*/
static void
init_params(ParseState *pstate, List *options, bool for_identity,
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 2de0eba..d9530a3 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -596,7 +596,7 @@ static void ATExecForceNoForceRowSecurity(Relation rel, bool force_rls);
static ObjectAddress ATExecSetCompression(AlteredTableInfo *tab, Relation rel,
const char *column, Node *newValue, LOCKMODE lockmode);
-static void index_copy_data(Relation rel, RelFileNode newrnode);
+static void index_copy_data(Relation rel, RelFileLocator newrlocator);
static const char *storage_name(char c);
static void RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid,
@@ -1986,12 +1986,12 @@ ExecuteTruncateGuts(List *explicit_rels,
/*
* Normally, we need a transaction-safe truncation here. However, if
* the table was either created in the current (sub)transaction or has
- * a new relfilenode in the current (sub)transaction, then we can just
+ * a new relfilenumber in the current (sub)transaction, then we can just
* truncate it in-place, because a rollback would cause the whole
* table or the current physical file to be thrown away anyway.
*/
if (rel->rd_createSubid == mySubid ||
- rel->rd_newRelfilenodeSubid == mySubid)
+ rel->rd_newRelfilelocatorSubid == mySubid)
{
/* Immediate, non-rollbackable truncation is OK */
heap_truncate_one_rel(rel);
@@ -2014,10 +2014,10 @@ ExecuteTruncateGuts(List *explicit_rels,
* Need the full transaction-safe pushups.
*
* Create a new empty storage file for the relation, and assign it
- * as the relfilenode value. The old storage file is scheduled for
+ * as the relfilenumber value. The old storage file is scheduled for
* deletion at commit.
*/
- RelationSetNewRelfilenode(rel, rel->rd_rel->relpersistence);
+ RelationSetNewRelfilenumber(rel, rel->rd_rel->relpersistence);
heap_relid = RelationGetRelid(rel);
@@ -2030,7 +2030,7 @@ ExecuteTruncateGuts(List *explicit_rels,
Relation toastrel = relation_open(toast_relid,
AccessExclusiveLock);
- RelationSetNewRelfilenode(toastrel,
+ RelationSetNewRelfilenumber(toastrel,
toastrel->rd_rel->relpersistence);
table_close(toastrel, NoLock);
}
@@ -3315,11 +3315,11 @@ CheckRelationTableSpaceMove(Relation rel, Oid newTableSpaceId)
/*
* SetRelationTableSpace
- * Set new reltablespace and relfilenode in pg_class entry.
+ * Set new reltablespace and relfilenumber in pg_class entry.
*
* newTableSpaceId is the new tablespace for the relation, and
- * newRelFileNode its new filenode. If newRelFileNode is InvalidOid,
- * this field is not updated.
+ * newRelFilenumber its new filenumber. If newRelFilenumber is
+ * InvalidRelFileNumber, this field is not updated.
*
* NOTE: The caller must hold AccessExclusiveLock on the relation.
*
@@ -3331,7 +3331,7 @@ CheckRelationTableSpaceMove(Relation rel, Oid newTableSpaceId)
void
SetRelationTableSpace(Relation rel,
Oid newTableSpaceId,
- Oid newRelFileNode)
+ RelFileNumber newRelFilenumber)
{
Relation pg_class;
HeapTuple tuple;
@@ -3351,8 +3351,8 @@ SetRelationTableSpace(Relation rel,
/* Update the pg_class row. */
rd_rel->reltablespace = (newTableSpaceId == MyDatabaseTableSpace) ?
InvalidOid : newTableSpaceId;
- if (OidIsValid(newRelFileNode))
- rd_rel->relfilenode = newRelFileNode;
+ if (RelFileNumberIsValid(newRelFilenumber))
+ rd_rel->relfilenode = newRelFilenumber;
CatalogTupleUpdate(pg_class, &tuple->t_self, tuple);
/*
@@ -5420,7 +5420,7 @@ ATRewriteTables(AlterTableStmt *parsetree, List **wqueue, LOCKMODE lockmode,
* persistence: on one hand, we need to ensure that the buffers
* belonging to each of the two relations are marked with or without
* BM_PERMANENT properly. On the other hand, since rewriting creates
- * and assigns a new relfilenode, we automatically create or drop an
+ * and assigns a new relfilenumber, we automatically create or drop an
* init fork for the relation as appropriate.
*/
if (tab->rewrite > 0 && tab->relkind != RELKIND_SEQUENCE)
@@ -5506,12 +5506,13 @@ ATRewriteTables(AlterTableStmt *parsetree, List **wqueue, LOCKMODE lockmode,
* Create transient table that will receive the modified data.
*
* Ensure it is marked correctly as logged or unlogged. We have
- * to do this here so that buffers for the new relfilenode will
+ * to do this here so that buffers for the new relfilenumber will
* have the right persistence set, and at the same time ensure
- * that the original filenode's buffers will get read in with the
- * correct setting (i.e. the original one). Otherwise a rollback
- * after the rewrite would possibly result with buffers for the
- * original filenode having the wrong persistence setting.
+ * that the original filenumber's buffers will get read in with
+ * the correct setting (i.e. the original one). Otherwise a
+ * rollback after the rewrite would possibly result in buffers
+ * for the original filenumber having the wrong persistence
+ * setting.
*
* NB: This relies on swap_relation_files() also swapping the
* persistence. That wouldn't work for pg_class, but that can't be
@@ -8597,7 +8598,7 @@ ATExecAddIndex(AlteredTableInfo *tab, Relation rel,
/* suppress schema rights check when rebuilding existing index */
check_rights = !is_rebuild;
/* skip index build if phase 3 will do it or we're reusing an old one */
- skip_build = tab->rewrite > 0 || OidIsValid(stmt->oldNode);
+ skip_build = tab->rewrite > 0 || RelFileNumberIsValid(stmt->oldNumber);
/* suppress notices when rebuilding existing index */
quiet = is_rebuild;
@@ -8613,7 +8614,7 @@ ATExecAddIndex(AlteredTableInfo *tab, Relation rel,
quiet);
/*
- * If TryReuseIndex() stashed a relfilenode for us, we used it for the new
+ * If TryReuseIndex() stashed a relfilenumber for us, we used it for the new
* index instead of building from scratch. Restore associated fields.
* This may store InvalidSubTransactionId in both fields, in which case
* relcache.c will assume it can rebuild the relcache entry. Hence, do
@@ -8621,13 +8622,13 @@ ATExecAddIndex(AlteredTableInfo *tab, Relation rel,
* DROP of the old edition of this index will have scheduled the storage
* for deletion at commit, so cancel that pending deletion.
*/
- if (OidIsValid(stmt->oldNode))
+ if (RelFileNumberIsValid(stmt->oldNumber))
{
Relation irel = index_open(address.objectId, NoLock);
irel->rd_createSubid = stmt->oldCreateSubid;
- irel->rd_firstRelfilenodeSubid = stmt->oldFirstRelfilenodeSubid;
- RelationPreserveStorage(irel->rd_node, true);
+ irel->rd_firstRelfilelocatorSubid = stmt->oldFirstRelfilelocatorSubid;
+ RelationPreserveStorage(irel->rd_locator, true);
index_close(irel, NoLock);
}
@@ -13491,9 +13492,9 @@ TryReuseIndex(Oid oldId, IndexStmt *stmt)
/* If it's a partitioned index, there is no storage to share. */
if (irel->rd_rel->relkind != RELKIND_PARTITIONED_INDEX)
{
- stmt->oldNode = irel->rd_node.relNode;
+ stmt->oldNumber = irel->rd_locator.relNumber;
stmt->oldCreateSubid = irel->rd_createSubid;
- stmt->oldFirstRelfilenodeSubid = irel->rd_firstRelfilenodeSubid;
+ stmt->oldFirstRelfilelocatorSubid = irel->rd_firstRelfilelocatorSubid;
}
index_close(irel, NoLock);
}
@@ -14340,8 +14341,8 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
{
Relation rel;
Oid reltoastrelid;
- Oid newrelfilenode;
- RelFileNode newrnode;
+ RelFileNumber newrelfilenumber;
+ RelFileLocator newrlocator;
List *reltoastidxids = NIL;
ListCell *lc;
@@ -14370,26 +14371,26 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
}
/*
- * Relfilenodes are not unique in databases across tablespaces, so we need
+ * Relfilenumbers are not unique in databases across tablespaces, so we need
* to allocate a new one in the new tablespace.
*/
- newrelfilenode = GetNewRelFileNode(newTableSpace, NULL,
- rel->rd_rel->relpersistence);
+ newrelfilenumber = GetNewRelFileNumber(newTableSpace, NULL,
+ rel->rd_rel->relpersistence);
/* Open old and new relation */
- newrnode = rel->rd_node;
- newrnode.relNode = newrelfilenode;
- newrnode.spcNode = newTableSpace;
+ newrlocator = rel->rd_locator;
+ newrlocator.relNumber = newrelfilenumber;
+ newrlocator.spcOid = newTableSpace;
- /* hand off to AM to actually create the new filenode and copy the data */
+ /* hand off to AM to actually create new rel storage and copy the data */
if (rel->rd_rel->relkind == RELKIND_INDEX)
{
- index_copy_data(rel, newrnode);
+ index_copy_data(rel, newrlocator);
}
else
{
Assert(RELKIND_HAS_TABLE_AM(rel->rd_rel->relkind));
- table_relation_copy_data(rel, &newrnode);
+ table_relation_copy_data(rel, &newrlocator);
}
/*
@@ -14400,11 +14401,11 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
* the updated pg_class entry), but that's forbidden with
* CheckRelationTableSpaceMove().
*/
- SetRelationTableSpace(rel, newTableSpace, newrelfilenode);
+ SetRelationTableSpace(rel, newTableSpace, newrelfilenumber);
InvokeObjectPostAlterHook(RelationRelationId, RelationGetRelid(rel), 0);
- RelationAssumeNewRelfilenode(rel);
+ RelationAssumeNewRelfilelocator(rel);
relation_close(rel, NoLock);
@@ -14630,11 +14631,11 @@ AlterTableMoveAll(AlterTableMoveAllStmt *stmt)
}
static void
-index_copy_data(Relation rel, RelFileNode newrnode)
+index_copy_data(Relation rel, RelFileLocator newrlocator)
{
SMgrRelation dstrel;
- dstrel = smgropen(newrnode, rel->rd_backend);
+ dstrel = smgropen(newrlocator, rel->rd_backend);
/*
* Since we copy the file directly without looking at the shared buffers,
@@ -14648,10 +14649,10 @@ index_copy_data(Relation rel, RelFileNode newrnode)
* Create and copy all forks of the relation, and schedule unlinking of
* old physical files.
*
- * NOTE: any conflict in relfilenode value will be caught in
+ * NOTE: any conflict in relfilenumber value will be caught in
* RelationCreateStorage().
*/
- RelationCreateStorage(newrnode, rel->rd_rel->relpersistence, true);
+ RelationCreateStorage(newrlocator, rel->rd_rel->relpersistence, true);
/* copy main fork */
RelationCopyStorage(RelationGetSmgr(rel), dstrel, MAIN_FORKNUM,
@@ -14672,7 +14673,7 @@ index_copy_data(Relation rel, RelFileNode newrnode)
if (RelationIsPermanent(rel) ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
- log_smgrcreate(&newrnode, forkNum);
+ log_smgrcreate(&newrlocator, forkNum);
RelationCopyStorage(RelationGetSmgr(rel), dstrel, forkNum,
rel->rd_rel->relpersistence);
}
diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index 00ca397..c8bdd99 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -12,12 +12,12 @@
* remove the possibility of having file name conflicts, we isolate
* files within a tablespace into database-specific subdirectories.
*
- * To support file access via the information given in RelFileNode, we
+ * To support file access via the information given in RelFileLocator, we
* maintain a symbolic-link map in $PGDATA/pg_tblspc. The symlinks are
* named by tablespace OIDs and point to the actual tablespace directories.
* There is also a per-cluster version directory in each tablespace.
* Thus the full path to an arbitrary file is
- * $PGDATA/pg_tblspc/spcoid/PG_MAJORVER_CATVER/dboid/relfilenode
+ * $PGDATA/pg_tblspc/spcoid/PG_MAJORVER_CATVER/dboid/relfilenumber
* e.g.
* $PGDATA/pg_tblspc/20981/PG_9.0_201002161/719849/83292814
*
@@ -25,8 +25,8 @@
* tables) and pg_default (for everything else). For backwards compatibility
* and to remain functional on platforms without symlinks, these tablespaces
* are accessed specially: they are respectively
- * $PGDATA/global/relfilenode
- * $PGDATA/base/dboid/relfilenode
+ * $PGDATA/global/relfilenumber
+ * $PGDATA/base/dboid/relfilenumber
*
* To allow CREATE DATABASE to give a new database a default tablespace
* that's different from the template database's default, we make the
@@ -115,7 +115,7 @@ static bool destroy_tablespace_directories(Oid tablespaceoid, bool redo);
* re-create a database subdirectory (of $PGDATA/base) during WAL replay.
*/
void
-TablespaceCreateDbspace(Oid spcNode, Oid dbNode, bool isRedo)
+TablespaceCreateDbspace(Oid spcOid, Oid dbOid, bool isRedo)
{
struct stat st;
char *dir;
@@ -124,13 +124,13 @@ TablespaceCreateDbspace(Oid spcNode, Oid dbNode, bool isRedo)
* The global tablespace doesn't have per-database subdirectories, so
* nothing to do for it.
*/
- if (spcNode == GLOBALTABLESPACE_OID)
+ if (spcOid == GLOBALTABLESPACE_OID)
return;
- Assert(OidIsValid(spcNode));
- Assert(OidIsValid(dbNode));
+ Assert(OidIsValid(spcOid));
+ Assert(OidIsValid(dbOid));
- dir = GetDatabasePath(dbNode, spcNode);
+ dir = GetDatabasePath(dbOid, spcOid);
if (stat(dir, &st) < 0)
{
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 706d283..8313b5e 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4194,9 +4194,9 @@ _copyIndexStmt(const IndexStmt *from)
COPY_NODE_FIELD(excludeOpNames);
COPY_STRING_FIELD(idxcomment);
COPY_SCALAR_FIELD(indexOid);
- COPY_SCALAR_FIELD(oldNode);
+ COPY_SCALAR_FIELD(oldNumber);
COPY_SCALAR_FIELD(oldCreateSubid);
- COPY_SCALAR_FIELD(oldFirstRelfilenodeSubid);
+ COPY_SCALAR_FIELD(oldFirstRelfilelocatorSubid);
COPY_SCALAR_FIELD(unique);
COPY_SCALAR_FIELD(nulls_not_distinct);
COPY_SCALAR_FIELD(primary);
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index fccc0b4..4493526 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1768,9 +1768,9 @@ _equalIndexStmt(const IndexStmt *a, const IndexStmt *b)
COMPARE_NODE_FIELD(excludeOpNames);
COMPARE_STRING_FIELD(idxcomment);
COMPARE_SCALAR_FIELD(indexOid);
- COMPARE_SCALAR_FIELD(oldNode);
+ COMPARE_SCALAR_FIELD(oldNumber);
COMPARE_SCALAR_FIELD(oldCreateSubid);
- COMPARE_SCALAR_FIELD(oldFirstRelfilenodeSubid);
+ COMPARE_SCALAR_FIELD(oldFirstRelfilelocatorSubid);
COMPARE_SCALAR_FIELD(unique);
COMPARE_SCALAR_FIELD(nulls_not_distinct);
COMPARE_SCALAR_FIELD(primary);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 4315c53..05f27f0 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2932,9 +2932,9 @@ _outIndexStmt(StringInfo str, const IndexStmt *node)
WRITE_NODE_FIELD(excludeOpNames);
WRITE_STRING_FIELD(idxcomment);
WRITE_OID_FIELD(indexOid);
- WRITE_OID_FIELD(oldNode);
+ WRITE_OID_FIELD(oldNumber);
WRITE_UINT_FIELD(oldCreateSubid);
- WRITE_UINT_FIELD(oldFirstRelfilenodeSubid);
+ WRITE_UINT_FIELD(oldFirstRelfilelocatorSubid);
WRITE_BOOL_FIELD(unique);
WRITE_BOOL_FIELD(nulls_not_distinct);
WRITE_BOOL_FIELD(primary);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 969c9c1..0523013 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -7990,9 +7990,9 @@ IndexStmt: CREATE opt_unique INDEX opt_concurrently opt_index_name
n->excludeOpNames = NIL;
n->idxcomment = NULL;
n->indexOid = InvalidOid;
- n->oldNode = InvalidOid;
+ n->oldNumber = InvalidRelFileNumber;
n->oldCreateSubid = InvalidSubTransactionId;
- n->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ n->oldFirstRelfilelocatorSubid = InvalidSubTransactionId;
n->primary = false;
n->isconstraint = false;
n->deferrable = false;
@@ -8022,9 +8022,9 @@ IndexStmt: CREATE opt_unique INDEX opt_concurrently opt_index_name
n->excludeOpNames = NIL;
n->idxcomment = NULL;
n->indexOid = InvalidOid;
- n->oldNode = InvalidOid;
+ n->oldNumber = InvalidRelFileNumber;
n->oldCreateSubid = InvalidSubTransactionId;
- n->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ n->oldFirstRelfilelocatorSubid = InvalidSubTransactionId;
n->primary = false;
n->isconstraint = false;
n->deferrable = false;
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index f889726..b572534 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -1578,9 +1578,9 @@ generateClonedIndexStmt(RangeVar *heapRel, Relation source_idx,
index->excludeOpNames = NIL;
index->idxcomment = NULL;
index->indexOid = InvalidOid;
- index->oldNode = InvalidOid;
+ index->oldNumber = InvalidRelFileNumber;
index->oldCreateSubid = InvalidSubTransactionId;
- index->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ index->oldFirstRelfilelocatorSubid = InvalidSubTransactionId;
index->unique = idxrec->indisunique;
index->nulls_not_distinct = idxrec->indnullsnotdistinct;
index->primary = idxrec->indisprimary;
@@ -2199,9 +2199,9 @@ transformIndexConstraint(Constraint *constraint, CreateStmtContext *cxt)
index->excludeOpNames = NIL;
index->idxcomment = NULL;
index->indexOid = InvalidOid;
- index->oldNode = InvalidOid;
+ index->oldNumber = InvalidRelFileNumber;
index->oldCreateSubid = InvalidSubTransactionId;
- index->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
+ index->oldFirstRelfilelocatorSubid = InvalidSubTransactionId;
index->transformed = false;
index->concurrent = false;
index->if_not_exists = false;
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index c937c39..5fc076f 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -1207,7 +1207,7 @@ CompactCheckpointerRequestQueue(void)
* We use the request struct directly as a hashtable key. This
* assumes that any padding bytes in the structs are consistently the
* same, which should be okay because we zeroed them in
- * CheckpointerShmemInit. Note also that RelFileNode had better
+ * CheckpointerShmemInit. Note also that RelFileLocator had better
* contain no pad bytes.
*/
request = &CheckpointerShmem->requests[n];
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index aa2427b..c5c6a2b 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -845,7 +845,7 @@ DecodeInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
XLogReaderState *r = buf->record;
xl_heap_insert *xlrec;
ReorderBufferChange *change;
- RelFileNode target_node;
+ RelFileLocator target_locator;
xlrec = (xl_heap_insert *) XLogRecGetData(r);
@@ -857,8 +857,8 @@ DecodeInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
return;
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
- if (target_node.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &target_locator, NULL, NULL);
+ if (target_locator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -872,7 +872,7 @@ DecodeInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change->action = REORDER_BUFFER_CHANGE_INTERNAL_SPEC_INSERT;
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &target_node, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &target_locator, sizeof(RelFileLocator));
tupledata = XLogRecGetBlockData(r, 0, &datalen);
tuplelen = datalen - SizeOfHeapHeader;
@@ -902,13 +902,13 @@ DecodeUpdate(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
xl_heap_update *xlrec;
ReorderBufferChange *change;
char *data;
- RelFileNode target_node;
+ RelFileLocator target_locator;
xlrec = (xl_heap_update *) XLogRecGetData(r);
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
- if (target_node.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &target_locator, NULL, NULL);
+ if (target_locator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -918,7 +918,7 @@ DecodeUpdate(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change = ReorderBufferGetChange(ctx->reorder);
change->action = REORDER_BUFFER_CHANGE_UPDATE;
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &target_node, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &target_locator, sizeof(RelFileLocator));
if (xlrec->flags & XLH_UPDATE_CONTAINS_NEW_TUPLE)
{
@@ -968,13 +968,13 @@ DecodeDelete(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
XLogReaderState *r = buf->record;
xl_heap_delete *xlrec;
ReorderBufferChange *change;
- RelFileNode target_node;
+ RelFileLocator target_locator;
xlrec = (xl_heap_delete *) XLogRecGetData(r);
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
- if (target_node.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &target_locator, NULL, NULL);
+ if (target_locator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -990,7 +990,7 @@ DecodeDelete(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &target_node, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &target_locator, sizeof(RelFileLocator));
/* old primary key stored */
if (xlrec->flags & XLH_DELETE_CONTAINS_OLD)
@@ -1063,7 +1063,7 @@ DecodeMultiInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
char *data;
char *tupledata;
Size tuplelen;
- RelFileNode rnode;
+ RelFileLocator rlocator;
xlrec = (xl_heap_multi_insert *) XLogRecGetData(r);
@@ -1075,8 +1075,8 @@ DecodeMultiInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
return;
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &rnode, NULL, NULL);
- if (rnode.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &rlocator, NULL, NULL);
+ if (rlocator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -1103,7 +1103,7 @@ DecodeMultiInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change->action = REORDER_BUFFER_CHANGE_INSERT;
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &rnode, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &rlocator, sizeof(RelFileLocator));
xlhdr = (xl_multi_insert_tuple *) SHORTALIGN(data);
data = ((char *) xlhdr) + SizeOfMultiInsertTuple;
@@ -1165,11 +1165,11 @@ DecodeSpecConfirm(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
{
XLogReaderState *r = buf->record;
ReorderBufferChange *change;
- RelFileNode target_node;
+ RelFileLocator target_locator;
/* only interested in our database */
- XLogRecGetBlockTag(r, 0, &target_node, NULL, NULL);
- if (target_node.dbNode != ctx->slot->data.database)
+ XLogRecGetBlockTag(r, 0, &target_locator, NULL, NULL);
+ if (target_locator.dbOid != ctx->slot->data.database)
return;
/* output plugin doesn't look for this origin, no need to queue */
@@ -1180,7 +1180,7 @@ DecodeSpecConfirm(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
change->action = REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM;
change->origin_id = XLogRecGetOrigin(r);
- memcpy(&change->data.tp.relnode, &target_node, sizeof(RelFileNode));
+ memcpy(&change->data.tp.rlocator, &target_locator, sizeof(RelFileLocator));
change->data.tp.clear_toast_afterwards = true;
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 8da5f90..f8fb228 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -106,7 +106,7 @@
#include "utils/memdebug.h"
#include "utils/memutils.h"
#include "utils/rel.h"
-#include "utils/relfilenodemap.h"
+#include "utils/relfilenumbermap.h"
/* entry for a hash table we use to map from xid to our transaction state */
@@ -116,10 +116,10 @@ typedef struct ReorderBufferTXNByIdEnt
ReorderBufferTXN *txn;
} ReorderBufferTXNByIdEnt;
-/* data structures for (relfilenode, ctid) => (cmin, cmax) mapping */
+/* data structures for (relfilelocator, ctid) => (cmin, cmax) mapping */
typedef struct ReorderBufferTupleCidKey
{
- RelFileNode relnode;
+ RelFileLocator rlocator;
ItemPointerData tid;
} ReorderBufferTupleCidKey;
@@ -1643,7 +1643,7 @@ ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, bool txn_prep
}
/*
- * Destroy the (relfilenode, ctid) hashtable, so that we don't leak any
+ * Destroy the (relfilelocator, ctid) hashtable, so that we don't leak any
* memory. We could also keep the hash table and update it with new ctid
* values, but this seems simpler and good enough for now.
*/
@@ -1673,7 +1673,7 @@ ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, bool txn_prep
}
/*
- * Build a hash with a (relfilenode, ctid) -> (cmin, cmax) mapping for use by
+ * Build a hash with a (relfilelocator, ctid) -> (cmin, cmax) mapping for use by
* HeapTupleSatisfiesHistoricMVCC.
*/
static void
@@ -1711,7 +1711,7 @@ ReorderBufferBuildTupleCidHash(ReorderBuffer *rb, ReorderBufferTXN *txn)
/* be careful about padding */
memset(&key, 0, sizeof(ReorderBufferTupleCidKey));
- key.relnode = change->data.tuplecid.node;
+ key.rlocator = change->data.tuplecid.locator;
ItemPointerCopy(&change->data.tuplecid.tid,
&key.tid);
@@ -2140,36 +2140,36 @@ ReorderBufferProcessTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
case REORDER_BUFFER_CHANGE_DELETE:
Assert(snapshot_now);
- reloid = RelidByRelfilenode(change->data.tp.relnode.spcNode,
- change->data.tp.relnode.relNode);
+ reloid = RelidByRelfilenumber(change->data.tp.rlocator.spcOid,
+ change->data.tp.rlocator.relNumber);
/*
* Mapped catalog tuple without data, emitted while
* catalog table was in the process of being rewritten. We
- * can fail to look up the relfilenode, because the
+ * can fail to look up the relfilenumber, because the
* relmapper has no "historic" view, in contrast to the
* normal catalog during decoding. Thus repeated rewrites
* can cause a lookup failure. That's OK because we do not
* decode catalog changes anyway. Normally such tuples
* would be skipped over below, but we can't identify
* whether the table should be logically logged without
- * mapping the relfilenode to the oid.
+ * mapping the relfilenumber to the oid.
*/
if (reloid == InvalidOid &&
change->data.tp.newtuple == NULL &&
change->data.tp.oldtuple == NULL)
goto change_done;
else if (reloid == InvalidOid)
- elog(ERROR, "could not map filenode \"%s\" to relation OID",
- relpathperm(change->data.tp.relnode,
+ elog(ERROR, "could not map filenumber \"%s\" to relation OID",
+ relpathperm(change->data.tp.rlocator,
MAIN_FORKNUM));
relation = RelationIdGetRelation(reloid);
if (!RelationIsValid(relation))
- elog(ERROR, "could not open relation with OID %u (for filenode \"%s\")",
+ elog(ERROR, "could not open relation with OID %u (for filenumber \"%s\")",
reloid,
- relpathperm(change->data.tp.relnode,
+ relpathperm(change->data.tp.rlocator,
MAIN_FORKNUM));
if (!RelationIsLogicallyLogged(relation))
@@ -3157,7 +3157,7 @@ ReorderBufferChangeMemoryUpdate(ReorderBuffer *rb,
}
/*
- * Add new (relfilenode, tid) -> (cmin, cmax) mappings.
+ * Add new (relfilelocator, tid) -> (cmin, cmax) mappings.
*
* We do not include this change type in memory accounting, because we
* keep CIDs in a separate list and do not evict them when reaching
@@ -3165,7 +3165,7 @@ ReorderBufferChangeMemoryUpdate(ReorderBuffer *rb,
*/
void
ReorderBufferAddNewTupleCids(ReorderBuffer *rb, TransactionId xid,
- XLogRecPtr lsn, RelFileNode node,
+ XLogRecPtr lsn, RelFileLocator locator,
ItemPointerData tid, CommandId cmin,
CommandId cmax, CommandId combocid)
{
@@ -3174,7 +3174,7 @@ ReorderBufferAddNewTupleCids(ReorderBuffer *rb, TransactionId xid,
txn = ReorderBufferTXNByXid(rb, xid, true, NULL, lsn, true);
- change->data.tuplecid.node = node;
+ change->data.tuplecid.locator = locator;
change->data.tuplecid.tid = tid;
change->data.tuplecid.cmin = cmin;
change->data.tuplecid.cmax = cmax;
@@ -4839,7 +4839,7 @@ ReorderBufferToastReset(ReorderBuffer *rb, ReorderBufferTXN *txn)
* need anymore.
*
* To resolve those problems we have a per-transaction hash of (cmin,
- * cmax) tuples keyed by (relfilenode, ctid) which contains the actual
+ * cmax) tuples keyed by (relfilelocator, ctid) which contains the actual
* (cmin, cmax) values. That also takes care of combo CIDs by simply
* not caring about them at all. As we have the real cmin/cmax values
* combo CIDs aren't interesting.
@@ -4870,9 +4870,9 @@ DisplayMapping(HTAB *tuplecid_data)
while ((ent = (ReorderBufferTupleCidEnt *) hash_seq_search(&hstat)) != NULL)
{
elog(DEBUG3, "mapping: node: %u/%u/%u tid: %u/%u cmin: %u, cmax: %u",
- ent->key.relnode.dbNode,
- ent->key.relnode.spcNode,
- ent->key.relnode.relNode,
+ ent->key.rlocator.dbOid,
+ ent->key.rlocator.spcOid,
+ ent->key.rlocator.relNumber,
ItemPointerGetBlockNumber(&ent->key.tid),
ItemPointerGetOffsetNumber(&ent->key.tid),
ent->cmin,
@@ -4932,7 +4932,7 @@ ApplyLogicalMappingFile(HTAB *tuplecid_data, Oid relid, const char *fname)
path, readBytes,
(int32) sizeof(LogicalRewriteMappingData))));
- key.relnode = map.old_node;
+ key.rlocator = map.old_locator;
ItemPointerCopy(&map.old_tid,
&key.tid);
@@ -4947,7 +4947,7 @@ ApplyLogicalMappingFile(HTAB *tuplecid_data, Oid relid, const char *fname)
if (!ent)
continue;
- key.relnode = map.new_node;
+ key.rlocator = map.new_locator;
ItemPointerCopy(&map.new_tid,
&key.tid);
@@ -5120,10 +5120,10 @@ ResolveCminCmaxDuringDecoding(HTAB *tuplecid_data,
Assert(!BufferIsLocal(buffer));
/*
- * get relfilenode from the buffer, no convenient way to access it other
+ * get relfilelocator from the buffer, no convenient way to access it other
* than that.
*/
- BufferGetTag(buffer, &key.relnode, &forkno, &blockno);
+ BufferGetTag(buffer, &key.rlocator, &forkno, &blockno);
/* tuples can only be in the main fork */
Assert(forkno == MAIN_FORKNUM);
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 1119a12..73c0f15 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -781,7 +781,7 @@ SnapBuildProcessNewCid(SnapBuild *builder, TransactionId xid,
ReorderBufferXidSetCatalogChanges(builder->reorder, xid, lsn);
ReorderBufferAddNewTupleCids(builder->reorder, xlrec->top_xid, lsn,
- xlrec->target_node, xlrec->target_tid,
+ xlrec->target_locator, xlrec->target_tid,
xlrec->cmin, xlrec->cmax,
xlrec->combocid);
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index ae13011..7071ff6 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -121,12 +121,12 @@ typedef struct CkptTsStatus
* Type for array used to sort SMgrRelations
*
* FlushRelationsAllBuffers shares the same comparator function with
- * DropRelFileNodesAllBuffers. Pointer to this struct and RelFileNode must be
+ * DropRelFileLocatorsAllBuffers. Pointer to this struct and RelFileLocator must be
* compatible.
*/
typedef struct SMgrSortArray
{
- RelFileNode rnode; /* This must be the first member */
+ RelFileLocator rlocator; /* This must be the first member */
SMgrRelation srel;
} SMgrSortArray;
@@ -483,7 +483,7 @@ static BufferDesc *BufferAlloc(SMgrRelation smgr,
BufferAccessStrategy strategy,
bool *foundPtr);
static void FlushBuffer(BufferDesc *buf, SMgrRelation reln);
-static void FindAndDropRelFileNodeBuffers(RelFileNode rnode,
+static void FindAndDropRelFileLocatorBuffers(RelFileLocator rlocator,
ForkNumber forkNum,
BlockNumber nForkBlock,
BlockNumber firstDelBlock);
@@ -492,7 +492,7 @@ static void RelationCopyStorageUsingBuffer(Relation src, Relation dst,
bool isunlogged);
static void AtProcExit_Buffers(int code, Datum arg);
static void CheckForBufferLeaks(void);
-static int rnode_comparator(const void *p1, const void *p2);
+static int rlocator_comparator(const void *p1, const void *p2);
static inline int buffertag_comparator(const BufferTag *a, const BufferTag *b);
static inline int ckpt_buforder_comparator(const CkptSortItem *a, const CkptSortItem *b);
static int ts_ckpt_progress_comparator(Datum a, Datum b, void *arg);
@@ -515,7 +515,7 @@ PrefetchSharedBuffer(SMgrRelation smgr_reln,
Assert(BlockNumberIsValid(blockNum));
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr_reln->smgr_rnode.node,
+ INIT_BUFFERTAG(newTag, smgr_reln->smgr_rlocator.locator,
forkNum, blockNum);
/* determine its hash code and partition lock ID */
@@ -620,7 +620,7 @@ PrefetchBuffer(Relation reln, ForkNumber forkNum, BlockNumber blockNum)
* tag. In that case, the buffer is pinned and the usage count is bumped.
*/
bool
-ReadRecentBuffer(RelFileNode rnode, ForkNumber forkNum, BlockNumber blockNum,
+ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockNum,
Buffer recent_buffer)
{
BufferDesc *bufHdr;
@@ -632,7 +632,7 @@ ReadRecentBuffer(RelFileNode rnode, ForkNumber forkNum, BlockNumber blockNum,
ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
ReservePrivateRefCountEntry();
- INIT_BUFFERTAG(tag, rnode, forkNum, blockNum);
+ INIT_BUFFERTAG(tag, rlocator, forkNum, blockNum);
if (BufferIsLocal(recent_buffer))
{
@@ -786,13 +786,13 @@ ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum,
* BackendId).
*/
Buffer
-ReadBufferWithoutRelcache(RelFileNode rnode, ForkNumber forkNum,
+ReadBufferWithoutRelcache(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber blockNum, ReadBufferMode mode,
BufferAccessStrategy strategy, bool permanent)
{
bool hit;
- SMgrRelation smgr = smgropen(rnode, InvalidBackendId);
+ SMgrRelation smgr = smgropen(rlocator, InvalidBackendId);
return ReadBuffer_common(smgr, permanent ? RELPERSISTENCE_PERMANENT :
RELPERSISTENCE_UNLOGGED, forkNum, blockNum,
@@ -824,10 +824,10 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
isExtend = (blockNum == P_NEW);
TRACE_POSTGRESQL_BUFFER_READ_START(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode,
- smgr->smgr_rnode.backend,
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber,
+ smgr->smgr_rlocator.backend,
isExtend);
/* Substitute proper block number if caller asked for P_NEW */
@@ -839,7 +839,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
ereport(ERROR,
(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
errmsg("cannot extend relation %s beyond %u blocks",
- relpath(smgr->smgr_rnode, forkNum),
+ relpath(smgr->smgr_rlocator, forkNum),
P_NEW)));
}
@@ -886,10 +886,10 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
VacuumCostBalance += VacuumCostPageHit;
TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode,
- smgr->smgr_rnode.backend,
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber,
+ smgr->smgr_rlocator.backend,
isExtend,
found);
@@ -926,7 +926,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
if (!PageIsNew((Page) bufBlock))
ereport(ERROR,
(errmsg("unexpected data beyond EOF in block %u of relation %s",
- blockNum, relpath(smgr->smgr_rnode, forkNum)),
+ blockNum, relpath(smgr->smgr_rlocator, forkNum)),
errhint("This has been seen to occur with buggy kernels; consider updating your system.")));
/*
@@ -1028,7 +1028,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
(errcode(ERRCODE_DATA_CORRUPTED),
errmsg("invalid page in block %u of relation %s; zeroing out page",
blockNum,
- relpath(smgr->smgr_rnode, forkNum))));
+ relpath(smgr->smgr_rlocator, forkNum))));
MemSet((char *) bufBlock, 0, BLCKSZ);
}
else
@@ -1036,7 +1036,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
(errcode(ERRCODE_DATA_CORRUPTED),
errmsg("invalid page in block %u of relation %s",
blockNum,
- relpath(smgr->smgr_rnode, forkNum))));
+ relpath(smgr->smgr_rlocator, forkNum))));
}
}
}
@@ -1076,10 +1076,10 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
VacuumCostBalance += VacuumCostPageMiss;
TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode,
- smgr->smgr_rnode.backend,
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber,
+ smgr->smgr_rlocator.backend,
isExtend,
found);
@@ -1124,7 +1124,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
/* determine its hash code and partition lock ID */
newHash = BufTableHashCode(&newTag);
@@ -1255,9 +1255,9 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
/* OK, do the I/O */
TRACE_POSTGRESQL_BUFFER_WRITE_DIRTY_START(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode);
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber);
FlushBuffer(buf, NULL);
LWLockRelease(BufferDescriptorGetContentLock(buf));
@@ -1266,9 +1266,9 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
&buf->tag);
TRACE_POSTGRESQL_BUFFER_WRITE_DIRTY_DONE(forkNum, blockNum,
- smgr->smgr_rnode.node.spcNode,
- smgr->smgr_rnode.node.dbNode,
- smgr->smgr_rnode.node.relNode);
+ smgr->smgr_rlocator.locator.spcOid,
+ smgr->smgr_rlocator.locator.dbOid,
+ smgr->smgr_rlocator.locator.relNumber);
}
else
{
@@ -1647,7 +1647,7 @@ ReleaseAndReadBuffer(Buffer buffer,
{
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
bufHdr->tag.forkNum == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
@@ -1658,7 +1658,7 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetBufferDescriptor(buffer - 1);
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
bufHdr->tag.forkNum == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
@@ -2000,8 +2000,8 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
- item->tsId = bufHdr->tag.rnode.spcNode;
- item->relNode = bufHdr->tag.rnode.relNode;
+ item->tsId = bufHdr->tag.rlocator.spcOid;
+ item->relNumber = bufHdr->tag.rlocator.relNumber;
item->forkNum = bufHdr->tag.forkNum;
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2708,7 +2708,7 @@ PrintBufferLeakWarning(Buffer buffer)
}
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rnode, backend, buf->tag.forkNum);
+ path = relpathbackend(buf->tag.rlocator, backend, buf->tag.forkNum);
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2769,11 +2769,11 @@ BufferGetBlockNumber(Buffer buffer)
/*
* BufferGetTag
- * Returns the relfilenode, fork number and block number associated with
+ * Returns the relfilelocator, fork number and block number associated with
* a buffer.
*/
void
-BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
+BufferGetTag(Buffer buffer, RelFileLocator *rlocator, ForkNumber *forknum,
BlockNumber *blknum)
{
BufferDesc *bufHdr;
@@ -2787,7 +2787,7 @@ BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
bufHdr = GetBufferDescriptor(buffer - 1);
/* pinned, so OK to read tag without spinlock */
- *rnode = bufHdr->tag.rnode;
+ *rlocator = bufHdr->tag.rlocator;
*forknum = bufHdr->tag.forkNum;
*blknum = bufHdr->tag.blockNum;
}
@@ -2838,13 +2838,13 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rnode, InvalidBackendId);
+ reln = smgropen(buf->tag.rlocator, InvalidBackendId);
TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
buf->tag.blockNum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode);
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber);
buf_state = LockBufHdr(buf);
@@ -2922,9 +2922,9 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(buf->tag.forkNum,
buf->tag.blockNum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode);
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber);
/* Pop the error context stack */
error_context_stack = errcallback.previous;
@@ -3026,7 +3026,7 @@ BufferGetLSNAtomic(Buffer buffer)
}
/* ---------------------------------------------------------------------
- * DropRelFileNodeBuffers
+ * DropRelFileLocatorBuffers
*
* This function removes from the buffer pool all the pages of the
* specified relation forks that have block numbers >= firstDelBlock.
@@ -3047,24 +3047,24 @@ BufferGetLSNAtomic(Buffer buffer)
* --------------------------------------------------------------------
*/
void
-DropRelFileNodeBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
+DropRelFileLocatorBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
int nforks, BlockNumber *firstDelBlock)
{
int i;
int j;
- RelFileNodeBackend rnode;
+ RelFileLocatorBackend rlocator;
BlockNumber nForkBlock[MAX_FORKNUM];
uint64 nBlocksToInvalidate = 0;
- rnode = smgr_reln->smgr_rnode;
+ rlocator = smgr_reln->smgr_rlocator;
/* If it's a local relation, it's localbuf.c's problem. */
- if (RelFileNodeBackendIsTemp(rnode))
+ if (RelFileLocatorBackendIsTemp(rlocator))
{
- if (rnode.backend == MyBackendId)
+ if (rlocator.backend == MyBackendId)
{
for (j = 0; j < nforks; j++)
- DropRelFileNodeLocalBuffers(rnode.node, forkNum[j],
+ DropRelFileLocatorLocalBuffers(rlocator.locator, forkNum[j],
firstDelBlock[j]);
}
return;
@@ -3115,7 +3115,7 @@ DropRelFileNodeBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
nBlocksToInvalidate < BUF_DROP_FULL_SCAN_THRESHOLD)
{
for (j = 0; j < nforks; j++)
- FindAndDropRelFileNodeBuffers(rnode.node, forkNum[j],
+ FindAndDropRelFileLocatorBuffers(rlocator.locator, forkNum[j],
nForkBlock[j], firstDelBlock[j]);
return;
}
@@ -3138,17 +3138,17 @@ DropRelFileNodeBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
* false positives are safe because we'll recheck after getting the
* buffer lock.
*
- * We could check forkNum and blockNum as well as the rnode, but the
+ * We could check forkNum and blockNum as well as the rlocator, but the
* incremental win from doing so seems small.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rnode.node))
+ if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator))
continue;
buf_state = LockBufHdr(bufHdr);
for (j = 0; j < nforks; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator) &&
bufHdr->tag.forkNum == forkNum[j] &&
bufHdr->tag.blockNum >= firstDelBlock[j])
{
@@ -3162,16 +3162,16 @@ DropRelFileNodeBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
}
/* ---------------------------------------------------------------------
- * DropRelFileNodesAllBuffers
+ * DropRelFileLocatorsAllBuffers
*
* This function removes from the buffer pool all the pages of all
* forks of the specified relations. It's equivalent to calling
- * DropRelFileNodeBuffers once per fork per relation with
+ * DropRelFileLocatorBuffers once per fork per relation with
* firstDelBlock = 0.
* --------------------------------------------------------------------
*/
void
-DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
+DropRelFileLocatorsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
{
int i;
int j;
@@ -3179,22 +3179,22 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
SMgrRelation *rels;
BlockNumber (*block)[MAX_FORKNUM + 1];
uint64 nBlocksToInvalidate = 0;
- RelFileNode *nodes;
+ RelFileLocator *locators;
bool cached = true;
bool use_bsearch;
- if (nnodes == 0)
+ if (nlocators == 0)
return;
- rels = palloc(sizeof(SMgrRelation) * nnodes); /* non-local relations */
+ rels = palloc(sizeof(SMgrRelation) * nlocators); /* non-local relations */
/* If it's a local relation, it's localbuf.c's problem. */
- for (i = 0; i < nnodes; i++)
+ for (i = 0; i < nlocators; i++)
{
- if (RelFileNodeBackendIsTemp(smgr_reln[i]->smgr_rnode))
+ if (RelFileLocatorBackendIsTemp(smgr_reln[i]->smgr_rlocator))
{
- if (smgr_reln[i]->smgr_rnode.backend == MyBackendId)
- DropRelFileNodeAllLocalBuffers(smgr_reln[i]->smgr_rnode.node);
+ if (smgr_reln[i]->smgr_rlocator.backend == MyBackendId)
+ DropRelFileLocatorAllLocalBuffers(smgr_reln[i]->smgr_rlocator.locator);
}
else
rels[n++] = smgr_reln[i];
@@ -3219,7 +3219,7 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
/*
* We can avoid scanning the entire buffer pool if we know the exact size
- * of each of the given relation forks. See DropRelFileNodeBuffers.
+ * of each of the given relation forks. See DropRelFileLocatorBuffers.
*/
for (i = 0; i < n && cached; i++)
{
@@ -3257,7 +3257,7 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
continue;
/* drop all the buffers for a particular relation fork */
- FindAndDropRelFileNodeBuffers(rels[i]->smgr_rnode.node,
+ FindAndDropRelFileLocatorBuffers(rels[i]->smgr_rlocator.locator,
j, block[i][j], 0);
}
}
@@ -3268,9 +3268,9 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
}
pfree(block);
- nodes = palloc(sizeof(RelFileNode) * n); /* non-local relations */
+ locators = palloc(sizeof(RelFileLocator) * n); /* non-local relations */
for (i = 0; i < n; i++)
- nodes[i] = rels[i]->smgr_rnode.node;
+ locators[i] = rels[i]->smgr_rlocator.locator;
/*
* For low number of relations to drop just use a simple walk through, to
@@ -3280,18 +3280,18 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
*/
use_bsearch = n > RELS_BSEARCH_THRESHOLD;
- /* sort the list of rnodes if necessary */
+ /* sort the list of rlocators if necessary */
if (use_bsearch)
- pg_qsort(nodes, n, sizeof(RelFileNode), rnode_comparator);
+ pg_qsort(locators, n, sizeof(RelFileLocator), rlocator_comparator);
for (i = 0; i < NBuffers; i++)
{
- RelFileNode *rnode = NULL;
+ RelFileLocator *rlocator = NULL;
BufferDesc *bufHdr = GetBufferDescriptor(i);
uint32 buf_state;
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
@@ -3301,37 +3301,37 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
for (j = 0; j < n; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, nodes[j]))
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, locators[j]))
{
- rnode = &nodes[j];
+ rlocator = &locators[j];
break;
}
}
}
else
{
- rnode = bsearch((const void *) &(bufHdr->tag.rnode),
- nodes, n, sizeof(RelFileNode),
- rnode_comparator);
+ rlocator = bsearch((const void *) &(bufHdr->tag.rlocator),
+ locators, n, sizeof(RelFileLocator),
+ rlocator_comparator);
}
- /* buffer doesn't belong to any of the given relfilenodes; skip it */
- if (rnode == NULL)
+ /* buffer doesn't belong to any of the given relfilelocators; skip it */
+ if (rlocator == NULL)
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, (*rnode)))
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, (*rlocator)))
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
}
- pfree(nodes);
+ pfree(locators);
pfree(rels);
}
/* ---------------------------------------------------------------------
- * FindAndDropRelFileNodeBuffers
+ * FindAndDropRelFileLocatorBuffers
*
* This function performs look up in BufMapping table and removes from the
* buffer pool all the pages of the specified relation fork that has block
@@ -3340,9 +3340,9 @@ DropRelFileNodesAllBuffers(SMgrRelation *smgr_reln, int nnodes)
* --------------------------------------------------------------------
*/
static void
-FindAndDropRelFileNodeBuffers(RelFileNode rnode, ForkNumber forkNum,
- BlockNumber nForkBlock,
- BlockNumber firstDelBlock)
+FindAndDropRelFileLocatorBuffers(RelFileLocator rlocator, ForkNumber forkNum,
+ BlockNumber nForkBlock,
+ BlockNumber firstDelBlock)
{
BlockNumber curBlock;
@@ -3356,7 +3356,7 @@ FindAndDropRelFileNodeBuffers(RelFileNode rnode, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(bufTag, rnode, forkNum, curBlock);
+ INIT_BUFFERTAG(bufTag, rlocator, forkNum, curBlock);
/* determine its hash code and partition lock ID */
bufHash = BufTableHashCode(&bufTag);
@@ -3380,7 +3380,7 @@ FindAndDropRelFileNodeBuffers(RelFileNode rnode, ForkNumber forkNum,
*/
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -3397,7 +3397,7 @@ FindAndDropRelFileNodeBuffers(RelFileNode rnode, ForkNumber forkNum,
* bothering to write them out first. This is used when we destroy a
* database, to avoid trying to flush data to disk when the directory
* tree no longer exists. Implementation is pretty similar to
- * DropRelFileNodeBuffers() which is for destroying just one relation.
+ * DropRelFileLocatorBuffers() which is for destroying just one relation.
* --------------------------------------------------------------------
*/
void
@@ -3416,14 +3416,14 @@ DropDatabaseBuffers(Oid dbid)
uint32 buf_state;
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.rlocator.dbOid != dbid)
continue;
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid)
+ if (bufHdr->tag.rlocator.dbOid == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3453,7 +3453,7 @@ PrintBufferDescs(void)
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathbackend(buf->tag.rnode, InvalidBackendId, buf->tag.forkNum),
+ relpathbackend(buf->tag.rlocator, InvalidBackendId, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3478,7 +3478,7 @@ PrintPinnedBufs(void)
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathperm(buf->tag.rnode, buf->tag.forkNum),
+ relpathperm(buf->tag.rlocator, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3517,7 +3517,7 @@ FlushRelationBuffers(Relation rel)
uint32 buf_state;
bufHdr = GetLocalBufferDescriptor(i);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
((buf_state = pg_atomic_read_u32(&bufHdr->state)) &
(BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
@@ -3561,16 +3561,16 @@ FlushRelationBuffers(Relation rel)
bufHdr = GetBufferDescriptor(i);
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node))
+ if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator))
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3608,21 +3608,21 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
for (i = 0; i < nrels; i++)
{
- Assert(!RelFileNodeBackendIsTemp(smgrs[i]->smgr_rnode));
+ Assert(!RelFileLocatorBackendIsTemp(smgrs[i]->smgr_rlocator));
- srels[i].rnode = smgrs[i]->smgr_rnode.node;
+ srels[i].rlocator = smgrs[i]->smgr_rlocator.locator;
srels[i].srel = smgrs[i];
}
/*
* Save the bsearch overhead for low number of relations to sync. See
- * DropRelFileNodesAllBuffers for details.
+ * DropRelFileLocatorsAllBuffers for details.
*/
use_bsearch = nrels > RELS_BSEARCH_THRESHOLD;
/* sort the list of SMgrRelations if necessary */
if (use_bsearch)
- pg_qsort(srels, nrels, sizeof(SMgrSortArray), rnode_comparator);
+ pg_qsort(srels, nrels, sizeof(SMgrSortArray), rlocator_comparator);
/* Make sure we can handle the pin inside the loop */
ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
@@ -3634,7 +3634,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
uint32 buf_state;
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
@@ -3644,7 +3644,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
for (j = 0; j < nrels; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, srels[j].rnode))
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, srels[j].rlocator))
{
srelent = &srels[j];
break;
@@ -3653,19 +3653,19 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
}
else
{
- srelent = bsearch((const void *) &(bufHdr->tag.rnode),
+ srelent = bsearch((const void *) &(bufHdr->tag.rlocator),
srels, nrels, sizeof(SMgrSortArray),
- rnode_comparator);
+ rlocator_comparator);
}
- /* buffer doesn't belong to any of the given relfilenodes; skip it */
+ /* buffer doesn't belong to any of the given relfilelocators; skip it */
if (srelent == NULL)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, srelent->rnode) &&
+ if (RelFileLocatorEquals(bufHdr->tag.rlocator, srelent->rlocator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3729,7 +3729,7 @@ RelationCopyStorageUsingBuffer(Relation src, Relation dst, ForkNumber forkNum,
CHECK_FOR_INTERRUPTS();
/* Read block from source relation. */
- srcBuf = ReadBufferWithoutRelcache(src->rd_node, forkNum, blkno,
+ srcBuf = ReadBufferWithoutRelcache(src->rd_locator, forkNum, blkno,
RBM_NORMAL, bstrategy_src,
permanent);
srcPage = BufferGetPage(srcBuf);
@@ -3740,7 +3740,7 @@ RelationCopyStorageUsingBuffer(Relation src, Relation dst, ForkNumber forkNum,
}
/* Use P_NEW to extend the destination relation. */
- dstBuf = ReadBufferWithoutRelcache(dst->rd_node, forkNum, P_NEW,
+ dstBuf = ReadBufferWithoutRelcache(dst->rd_locator, forkNum, P_NEW,
RBM_NORMAL, bstrategy_dst,
permanent);
LockBuffer(dstBuf, BUFFER_LOCK_EXCLUSIVE);
@@ -3775,8 +3775,8 @@ RelationCopyStorageUsingBuffer(Relation src, Relation dst, ForkNumber forkNum,
* --------------------------------------------------------------------
*/
void
-CreateAndCopyRelationData(RelFileNode src_rnode, RelFileNode dst_rnode,
- bool permanent)
+CreateAndCopyRelationData(RelFileLocator src_rlocator,
+ RelFileLocator dst_rlocator, bool permanent)
{
Relation src_rel;
Relation dst_rel;
@@ -3793,8 +3793,8 @@ CreateAndCopyRelationData(RelFileNode src_rnode, RelFileNode dst_rnode,
* used the smgr layer directly, we would have to worry about
* invalidations.
*/
- src_rel = CreateFakeRelcacheEntry(src_rnode);
- dst_rel = CreateFakeRelcacheEntry(dst_rnode);
+ src_rel = CreateFakeRelcacheEntry(src_rlocator);
+ dst_rel = CreateFakeRelcacheEntry(dst_rlocator);
/*
* Create and copy all forks of the relation. During create database we
@@ -3802,7 +3802,7 @@ CreateAndCopyRelationData(RelFileNode src_rnode, RelFileNode dst_rnode,
* directory. Therefore, each individual relation doesn't need to be
* registered for cleanup.
*/
- RelationCreateStorage(dst_rnode, relpersistence, false);
+ RelationCreateStorage(dst_rlocator, relpersistence, false);
/* copy main fork. */
RelationCopyStorageUsingBuffer(src_rel, dst_rel, MAIN_FORKNUM, permanent);
@@ -3820,7 +3820,7 @@ CreateAndCopyRelationData(RelFileNode src_rnode, RelFileNode dst_rnode,
* init fork of an unlogged relation.
*/
if (permanent || forkNum == INIT_FORKNUM)
- log_smgrcreate(&dst_rnode, forkNum);
+ log_smgrcreate(&dst_rlocator, forkNum);
/* Copy a fork's data, block by block. */
RelationCopyStorageUsingBuffer(src_rel, dst_rel, forkNum,
@@ -3864,16 +3864,16 @@ FlushDatabaseBuffers(Oid dbid)
bufHdr = GetBufferDescriptor(i);
/*
- * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
+ * As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.rlocator.dbOid != dbid)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid &&
+ if (bufHdr->tag.rlocator.dbOid == dbid &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -4034,7 +4034,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
(pg_atomic_read_u32(&bufHdr->state) & BM_PERMANENT))
{
/*
- * If we must not write WAL, due to a relfilenode-specific
+ * If we must not write WAL, due to a relfilelocator-specific
* condition or being in recovery, don't dirty the page. We can
* set the hint, just not dirty the page as a result so the hint
* is lost when we evict the page or shutdown.
@@ -4042,7 +4042,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
* See src/backend/storage/page/README for longer discussion.
*/
if (RecoveryInProgress() ||
- RelFileNodeSkippingWAL(bufHdr->tag.rnode))
+ RelFileLocatorSkippingWAL(bufHdr->tag.rlocator))
return;
/*
@@ -4651,7 +4651,7 @@ AbortBufferIO(void)
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
- path = relpathperm(buf->tag.rnode, buf->tag.forkNum);
+ path = relpathperm(buf->tag.rlocator, buf->tag.forkNum);
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4675,7 +4675,7 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rnode, bufHdr->tag.forkNum);
+ char *path = relpathperm(bufHdr->tag.rlocator, bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4693,7 +4693,7 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
- char *path = relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ char *path = relpathbackend(bufHdr->tag.rlocator, MyBackendId,
bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
@@ -4703,27 +4703,27 @@ local_buffer_write_error_callback(void *arg)
}
/*
- * RelFileNode qsort/bsearch comparator; see RelFileNodeEquals.
+ * RelFileLocator qsort/bsearch comparator; see RelFileLocatorEquals.
*/
static int
-rnode_comparator(const void *p1, const void *p2)
+rlocator_comparator(const void *p1, const void *p2)
{
- RelFileNode n1 = *(const RelFileNode *) p1;
- RelFileNode n2 = *(const RelFileNode *) p2;
+ RelFileLocator n1 = *(const RelFileLocator *) p1;
+ RelFileLocator n2 = *(const RelFileLocator *) p2;
- if (n1.relNode < n2.relNode)
+ if (n1.relNumber < n2.relNumber)
return -1;
- else if (n1.relNode > n2.relNode)
+ else if (n1.relNumber > n2.relNumber)
return 1;
- if (n1.dbNode < n2.dbNode)
+ if (n1.dbOid < n2.dbOid)
return -1;
- else if (n1.dbNode > n2.dbNode)
+ else if (n1.dbOid > n2.dbOid)
return 1;
- if (n1.spcNode < n2.spcNode)
+ if (n1.spcOid < n2.spcOid)
return -1;
- else if (n1.spcNode > n2.spcNode)
+ else if (n1.spcOid > n2.spcOid)
return 1;
else
return 0;
@@ -4789,7 +4789,7 @@ buffertag_comparator(const BufferTag *ba, const BufferTag *bb)
{
int ret;
- ret = rnode_comparator(&ba->rnode, &bb->rnode);
+ ret = rlocator_comparator(&ba->rlocator, &bb->rlocator);
if (ret != 0)
return ret;
@@ -4822,9 +4822,9 @@ ckpt_buforder_comparator(const CkptSortItem *a, const CkptSortItem *b)
else if (a->tsId > b->tsId)
return 1;
/* compare relation */
- if (a->relNode < b->relNode)
+ if (a->relNumber < b->relNumber)
return -1;
- else if (a->relNode > b->relNode)
+ else if (a->relNumber > b->relNumber)
return 1;
/* compare fork */
else if (a->forkNum < b->forkNum)
@@ -4960,7 +4960,7 @@ IssuePendingWritebacks(WritebackContext *context)
next = &context->pending_writebacks[i + ahead + 1];
/* different file, stop */
- if (!RelFileNodeEquals(cur->tag.rnode, next->tag.rnode) ||
+ if (!RelFileLocatorEquals(cur->tag.rlocator, next->tag.rlocator) ||
cur->tag.forkNum != next->tag.forkNum)
break;
@@ -4979,7 +4979,7 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rnode, InvalidBackendId);
+ reln = smgropen(tag.rlocator, InvalidBackendId);
smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
}
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index e71f95a..3dc9cc7 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -68,7 +68,7 @@ PrefetchLocalBuffer(SMgrRelation smgr, ForkNumber forkNum,
BufferTag newTag; /* identity of requested block */
LocalBufferLookupEnt *hresult;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -117,7 +117,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
bool found;
uint32 buf_state;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -134,7 +134,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
Assert(BUFFERTAGS_EQUAL(bufHdr->tag, newTag));
#ifdef LBDEBUG
fprintf(stderr, "LB ALLOC (%u,%d,%d) %d\n",
- smgr->smgr_rnode.node.relNode, forkNum, blockNum, -b - 1);
+ smgr->smgr_rlocator.locator.relNumber, forkNum, blockNum, -b - 1);
#endif
buf_state = pg_atomic_read_u32(&bufHdr->state);
@@ -162,7 +162,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
#ifdef LBDEBUG
fprintf(stderr, "LB ALLOC (%u,%d,%d) %d\n",
- smgr->smgr_rnode.node.relNode, forkNum, blockNum,
+ smgr->smgr_rlocator.locator.relNumber, forkNum, blockNum,
-nextFreeLocalBuf - 1);
#endif
@@ -215,7 +215,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rnode, MyBackendId);
+ oreln = smgropen(bufHdr->tag.rlocator, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
@@ -312,7 +312,7 @@ MarkLocalBufferDirty(Buffer buffer)
}
/*
- * DropRelFileNodeLocalBuffers
+ * DropRelFileLocatorLocalBuffers
* This function removes from the buffer pool all the pages of the
* specified relation that have block numbers >= firstDelBlock.
* (In particular, with firstDelBlock = 0, all pages are removed.)
@@ -320,11 +320,11 @@ MarkLocalBufferDirty(Buffer buffer)
* out first. Therefore, this is NOT rollback-able, and so should be
* used only with extreme caution!
*
- * See DropRelFileNodeBuffers in bufmgr.c for more notes.
+ * See DropRelFileLocatorBuffers in bufmgr.c for more notes.
*/
void
-DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
- BlockNumber firstDelBlock)
+DropRelFileLocatorLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
+ BlockNumber firstDelBlock)
{
int i;
@@ -337,14 +337,14 @@ DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
+ RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ relpathbackend(bufHdr->tag.rlocator, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
@@ -363,14 +363,14 @@ DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
}
/*
- * DropRelFileNodeAllLocalBuffers
+ * DropRelFileLocatorAllLocalBuffers
* This function removes from the buffer pool all pages of all forks
* of the specified relation.
*
- * See DropRelFileNodesAllBuffers in bufmgr.c for more notes.
+ * See DropRelFileLocatorsAllBuffers in bufmgr.c for more notes.
*/
void
-DropRelFileNodeAllLocalBuffers(RelFileNode rnode)
+DropRelFileLocatorAllLocalBuffers(RelFileLocator rlocator)
{
int i;
@@ -383,12 +383,12 @@ DropRelFileNodeAllLocalBuffers(RelFileNode rnode)
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode))
+ RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator))
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ relpathbackend(bufHdr->tag.rlocator, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
@@ -589,7 +589,7 @@ AtProcExit_LocalBuffers(void)
{
/*
* We shouldn't be holding any remaining pins; if we are, and assertions
- * aren't enabled, we'll fail later in DropRelFileNodeBuffers while trying
+ * aren't enabled, we'll fail later in DropRelFileLocatorBuffers while trying
* to drop the temp rels.
*/
CheckForLocalBufferLeaks();
diff --git a/src/backend/storage/freespace/freespace.c b/src/backend/storage/freespace/freespace.c
index d41ae37..005def5 100644
--- a/src/backend/storage/freespace/freespace.c
+++ b/src/backend/storage/freespace/freespace.c
@@ -196,7 +196,7 @@ RecordPageWithFreeSpace(Relation rel, BlockNumber heapBlk, Size spaceAvail)
* WAL replay
*/
void
-XLogRecordPageWithFreeSpace(RelFileNode rnode, BlockNumber heapBlk,
+XLogRecordPageWithFreeSpace(RelFileLocator rlocator, BlockNumber heapBlk,
Size spaceAvail)
{
int new_cat = fsm_space_avail_to_cat(spaceAvail);
@@ -211,8 +211,8 @@ XLogRecordPageWithFreeSpace(RelFileNode rnode, BlockNumber heapBlk,
blkno = fsm_logical_to_physical(addr);
/* If the page doesn't exist already, extend */
- buf = XLogReadBufferExtended(rnode, FSM_FORKNUM, blkno, RBM_ZERO_ON_ERROR,
- InvalidBuffer);
+ buf = XLogReadBufferExtended(rlocator, FSM_FORKNUM, blkno,
+ RBM_ZERO_ON_ERROR, InvalidBuffer);
LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
page = BufferGetPage(buf);
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index d165b35..af4dab7 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -268,13 +268,13 @@ restart:
*
* Fix the corruption and restart.
*/
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blknum;
- BufferGetTag(buf, &rnode, &forknum, &blknum);
+ BufferGetTag(buf, &rlocator, &forknum, &blknum);
elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
- blknum, rnode.spcNode, rnode.dbNode, rnode.relNode);
+ blknum, rlocator.spcOid, rlocator.dbOid, rlocator.relNumber);
/* make sure we hold an exclusive lock */
if (!exclusive_lock_held)
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 671b00a..9dab931 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -442,7 +442,7 @@ ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
}
void
-ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode node)
+ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileLocator locator)
{
VirtualTransactionId *backends;
@@ -461,7 +461,7 @@ ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode
return;
backends = GetConflictingVirtualXIDs(latestRemovedXid,
- node.dbNode);
+ locator.dbOid);
ResolveRecoveryConflictWithVirtualXIDs(backends,
PROCSIG_RECOVERY_CONFLICT_SNAPSHOT,
@@ -475,7 +475,7 @@ ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode
*/
void
ResolveRecoveryConflictWithSnapshotFullXid(FullTransactionId latestRemovedFullXid,
- RelFileNode node)
+ RelFileLocator locator)
{
/*
* ResolveRecoveryConflictWithSnapshot operates on 32-bit TransactionIds,
@@ -493,7 +493,7 @@ ResolveRecoveryConflictWithSnapshotFullXid(FullTransactionId latestRemovedFullXi
TransactionId latestRemovedXid;
latestRemovedXid = XidFromFullTransactionId(latestRemovedFullXid);
- ResolveRecoveryConflictWithSnapshot(latestRemovedXid, node);
+ ResolveRecoveryConflictWithSnapshot(latestRemovedXid, locator);
}
}
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 25e7e4e..5136da6 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -1997,7 +1997,7 @@ PageIsPredicateLocked(Relation relation, BlockNumber blkno)
PREDICATELOCKTARGET *target;
SET_PREDICATELOCKTARGETTAG_PAGE(targettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
blkno);
@@ -2576,7 +2576,7 @@ PredicateLockRelation(Relation relation, Snapshot snapshot)
return;
SET_PREDICATELOCKTARGETTAG_RELATION(tag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id);
PredicateLockAcquire(&tag);
}
@@ -2599,7 +2599,7 @@ PredicateLockPage(Relation relation, BlockNumber blkno, Snapshot snapshot)
return;
SET_PREDICATELOCKTARGETTAG_PAGE(tag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
blkno);
PredicateLockAcquire(&tag);
@@ -2638,13 +2638,13 @@ PredicateLockTID(Relation relation, ItemPointer tid, Snapshot snapshot,
* level lock.
*/
SET_PREDICATELOCKTARGETTAG_RELATION(tag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id);
if (PredicateLockExists(&tag))
return;
SET_PREDICATELOCKTARGETTAG_TUPLE(tag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
ItemPointerGetBlockNumber(tid),
ItemPointerGetOffsetNumber(tid));
@@ -2974,7 +2974,7 @@ DropAllPredicateLocksFromTable(Relation relation, bool transfer)
if (!PredicateLockingNeededForRelation(relation))
return;
- dbId = relation->rd_node.dbNode;
+ dbId = relation->rd_locator.dbOid;
relId = relation->rd_id;
if (relation->rd_index == NULL)
{
@@ -3194,11 +3194,11 @@ PredicateLockPageSplit(Relation relation, BlockNumber oldblkno,
Assert(BlockNumberIsValid(newblkno));
SET_PREDICATELOCKTARGETTAG_PAGE(oldtargettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
oldblkno);
SET_PREDICATELOCKTARGETTAG_PAGE(newtargettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
newblkno);
@@ -4478,7 +4478,7 @@ CheckForSerializableConflictIn(Relation relation, ItemPointer tid, BlockNumber b
if (tid != NULL)
{
SET_PREDICATELOCKTARGETTAG_TUPLE(targettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
ItemPointerGetBlockNumber(tid),
ItemPointerGetOffsetNumber(tid));
@@ -4488,14 +4488,14 @@ CheckForSerializableConflictIn(Relation relation, ItemPointer tid, BlockNumber b
if (blkno != InvalidBlockNumber)
{
SET_PREDICATELOCKTARGETTAG_PAGE(targettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id,
blkno);
CheckTargetForConflictsIn(&targettag);
}
SET_PREDICATELOCKTARGETTAG_RELATION(targettag,
- relation->rd_node.dbNode,
+ relation->rd_locator.dbOid,
relation->rd_id);
CheckTargetForConflictsIn(&targettag);
}
@@ -4556,7 +4556,7 @@ CheckTableForSerializableConflictIn(Relation relation)
Assert(relation->rd_index == NULL); /* not an index relation */
- dbId = relation->rd_node.dbNode;
+ dbId = relation->rd_locator.dbOid;
heapId = relation->rd_id;
LWLockAcquire(SerializablePredicateListLock, LW_EXCLUSIVE);
diff --git a/src/backend/storage/smgr/README b/src/backend/storage/smgr/README
index e1cfc6c..cf3aa56 100644
--- a/src/backend/storage/smgr/README
+++ b/src/backend/storage/smgr/README
@@ -46,7 +46,7 @@ physical relation in system catalogs.
It is assumed that the main fork, fork number 0 or MAIN_FORKNUM, always
exists. Fork numbers are assigned in src/include/common/relpath.h.
Functions in smgr.c and md.c take an extra fork number argument, in addition
-to relfilenode and block number, to identify which relation fork you want to
+to relfilelocator and block number, to identify which relation fork you want to
access. Since most code wants to access the main fork, a shortcut version of
ReadBuffer that accesses MAIN_FORKNUM is provided in the buffer manager for
convenience.
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 43edaf5..3998296 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -35,7 +35,7 @@
#include "storage/bufmgr.h"
#include "storage/fd.h"
#include "storage/md.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "storage/sync.h"
#include "utils/hsearch.h"
@@ -89,11 +89,11 @@ static MemoryContext MdCxt; /* context for all MdfdVec objects */
/* Populate a file tag describing an md.c segment file. */
-#define INIT_MD_FILETAG(a,xx_rnode,xx_forknum,xx_segno) \
+#define INIT_MD_FILETAG(a,xx_rlocator,xx_forknum,xx_segno) \
( \
memset(&(a), 0, sizeof(FileTag)), \
(a).handler = SYNC_HANDLER_MD, \
- (a).rnode = (xx_rnode), \
+ (a).rlocator = (xx_rlocator), \
(a).forknum = (xx_forknum), \
(a).segno = (xx_segno) \
)
@@ -121,14 +121,14 @@ static MemoryContext MdCxt; /* context for all MdfdVec objects */
/* local routines */
-static void mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum,
+static void mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forkNum,
bool isRedo);
static MdfdVec *mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior);
static void register_dirty_segment(SMgrRelation reln, ForkNumber forknum,
MdfdVec *seg);
-static void register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
+static void register_unlink_segment(RelFileLocatorBackend rlocator, ForkNumber forknum,
BlockNumber segno);
-static void register_forget_request(RelFileNodeBackend rnode, ForkNumber forknum,
+static void register_forget_request(RelFileLocatorBackend rlocator, ForkNumber forknum,
BlockNumber segno);
static void _fdvec_resize(SMgrRelation reln,
ForkNumber forknum,
@@ -199,11 +199,11 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* should be here and not in commands/tablespace.c? But that would imply
* importing a lot of stuff that smgr.c oughtn't know, either.
*/
- TablespaceCreateDbspace(reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
+ TablespaceCreateDbspace(reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
isRedo);
- path = relpath(reln->smgr_rnode, forkNum);
+ path = relpath(reln->smgr_rlocator, forkNum);
fd = PathNameOpenFile(path, O_RDWR | O_CREAT | O_EXCL | PG_BINARY);
@@ -234,7 +234,7 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
/*
* mdunlink() -- Unlink a relation.
*
- * Note that we're passed a RelFileNodeBackend --- by the time this is called,
+ * Note that we're passed a RelFileLocatorBackend --- by the time this is called,
* there won't be an SMgrRelation hashtable entry anymore.
*
* forkNum can be a fork number to delete a specific fork, or InvalidForkNumber
@@ -243,10 +243,10 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* For regular relations, we don't unlink the first segment file of the rel,
* but just truncate it to zero length, and record a request to unlink it after
* the next checkpoint. Additional segments can be unlinked immediately,
- * however. Leaving the empty file in place prevents that relfilenode
- * number from being reused. The scenario this protects us from is:
+ * however. Leaving the empty file in place prevents that relfilenumber
+ * from being reused. The scenario this protects us from is:
* 1. We delete a relation (and commit, and actually remove its file).
- * 2. We create a new relation, which by chance gets the same relfilenode as
+ * 2. We create a new relation, which by chance gets the same relfilenumber as
* the just-deleted one (OIDs must've wrapped around for that to happen).
* 3. We crash before another checkpoint occurs.
* During replay, we would delete the file and then recreate it, which is fine
@@ -254,18 +254,18 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* But if we didn't WAL-log insertions, but instead relied on fsyncing the
* file after populating it (as we do at wal_level=minimal), the contents of
* the file would be lost forever. By leaving the empty file until after the
- * next checkpoint, we prevent reassignment of the relfilenode number until
- * it's safe, because relfilenode assignment skips over any existing file.
+ * next checkpoint, we prevent reassignment of the relfilenumber until it's
+ * safe, because relfilenumber assignment skips over any existing file.
*
* We do not need to go through this dance for temp relations, though, because
* we never make WAL entries for temp rels, and so a temp rel poses no threat
- * to the health of a regular rel that has taken over its relfilenode number.
+ * to the health of a regular rel that has taken over its relfilenumber.
* The fact that temp rels and regular rels have different file naming
* patterns provides additional safety.
*
* All the above applies only to the relation's main fork; other forks can
* just be removed immediately, since they are not needed to prevent the
- * relfilenode number from being recycled. Also, we do not carefully
+ * relfilenumber from being recycled. Also, we do not carefully
* track whether other forks have been created or not, but just attempt to
* unlink them unconditionally; so we should never complain about ENOENT.
*
@@ -278,16 +278,16 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* we are usually not in a transaction anymore when this is called.
*/
void
-mdunlink(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
+mdunlink(RelFileLocatorBackend rlocator, ForkNumber forkNum, bool isRedo)
{
/* Now do the per-fork work */
if (forkNum == InvalidForkNumber)
{
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- mdunlinkfork(rnode, forkNum, isRedo);
+ mdunlinkfork(rlocator, forkNum, isRedo);
}
else
- mdunlinkfork(rnode, forkNum, isRedo);
+ mdunlinkfork(rlocator, forkNum, isRedo);
}
/*
@@ -315,25 +315,25 @@ do_truncate(const char *path)
}
static void
-mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
+mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forkNum, bool isRedo)
{
char *path;
int ret;
- path = relpath(rnode, forkNum);
+ path = relpath(rlocator, forkNum);
/*
* Delete or truncate the first segment.
*/
- if (isRedo || forkNum != MAIN_FORKNUM || RelFileNodeBackendIsTemp(rnode))
+ if (isRedo || forkNum != MAIN_FORKNUM || RelFileLocatorBackendIsTemp(rlocator))
{
- if (!RelFileNodeBackendIsTemp(rnode))
+ if (!RelFileLocatorBackendIsTemp(rlocator))
{
/* Prevent other backends' fds from holding on to the disk space */
ret = do_truncate(path);
/* Forget any pending sync requests for the first segment */
- register_forget_request(rnode, forkNum, 0 /* first seg */ );
+ register_forget_request(rlocator, forkNum, 0 /* first seg */ );
}
else
ret = 0;
@@ -354,7 +354,7 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
ret = do_truncate(path);
/* Register request to unlink first segment later */
- register_unlink_segment(rnode, forkNum, 0 /* first seg */ );
+ register_unlink_segment(rlocator, forkNum, 0 /* first seg */ );
}
/*
@@ -373,7 +373,7 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
{
sprintf(segpath, "%s.%u", path, segno);
- if (!RelFileNodeBackendIsTemp(rnode))
+ if (!RelFileLocatorBackendIsTemp(rlocator))
{
/*
* Prevent other backends' fds from holding on to the disk
@@ -386,7 +386,7 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
* Forget any pending sync requests for this segment before we
* try to unlink.
*/
- register_forget_request(rnode, forkNum, segno);
+ register_forget_request(rlocator, forkNum, segno);
}
if (unlink(segpath) < 0)
@@ -437,7 +437,7 @@ mdextend(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
ereport(ERROR,
(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
errmsg("cannot extend file \"%s\" beyond %u blocks",
- relpath(reln->smgr_rnode, forknum),
+ relpath(reln->smgr_rlocator, forknum),
InvalidBlockNumber)));
v = _mdfd_getseg(reln, forknum, blocknum, skipFsync, EXTENSION_CREATE);
@@ -490,7 +490,7 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
if (reln->md_num_open_segs[forknum] > 0)
return &reln->md_seg_fds[forknum][0];
- path = relpath(reln->smgr_rnode, forknum);
+ path = relpath(reln->smgr_rlocator, forknum);
fd = PathNameOpenFile(path, O_RDWR | PG_BINARY);
@@ -645,10 +645,10 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
MdfdVec *v;
TRACE_POSTGRESQL_SMGR_MD_READ_START(forknum, blocknum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
- reln->smgr_rnode.backend);
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
+ reln->smgr_rlocator.backend);
v = _mdfd_getseg(reln, forknum, blocknum, false,
EXTENSION_FAIL | EXTENSION_CREATE_RECOVERY);
@@ -660,10 +660,10 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
nbytes = FileRead(v->mdfd_vfd, buffer, BLCKSZ, seekpos, WAIT_EVENT_DATA_FILE_READ);
TRACE_POSTGRESQL_SMGR_MD_READ_DONE(forknum, blocknum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
- reln->smgr_rnode.backend,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
+ reln->smgr_rlocator.backend,
nbytes,
BLCKSZ);
@@ -715,10 +715,10 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
#endif
TRACE_POSTGRESQL_SMGR_MD_WRITE_START(forknum, blocknum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
- reln->smgr_rnode.backend);
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
+ reln->smgr_rlocator.backend);
v = _mdfd_getseg(reln, forknum, blocknum, skipFsync,
EXTENSION_FAIL | EXTENSION_CREATE_RECOVERY);
@@ -730,10 +730,10 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
nbytes = FileWrite(v->mdfd_vfd, buffer, BLCKSZ, seekpos, WAIT_EVENT_DATA_FILE_WRITE);
TRACE_POSTGRESQL_SMGR_MD_WRITE_DONE(forknum, blocknum,
- reln->smgr_rnode.node.spcNode,
- reln->smgr_rnode.node.dbNode,
- reln->smgr_rnode.node.relNode,
- reln->smgr_rnode.backend,
+ reln->smgr_rlocator.locator.spcOid,
+ reln->smgr_rlocator.locator.dbOid,
+ reln->smgr_rlocator.locator.relNumber,
+ reln->smgr_rlocator.backend,
nbytes,
BLCKSZ);
@@ -842,7 +842,7 @@ mdtruncate(SMgrRelation reln, ForkNumber forknum, BlockNumber nblocks)
return;
ereport(ERROR,
(errmsg("could not truncate file \"%s\" to %u blocks: it's only %u blocks now",
- relpath(reln->smgr_rnode, forknum),
+ relpath(reln->smgr_rlocator, forknum),
nblocks, curnblk)));
}
if (nblocks == curnblk)
@@ -983,7 +983,7 @@ register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
{
FileTag tag;
- INIT_MD_FILETAG(tag, reln->smgr_rnode.node, forknum, seg->mdfd_segno);
+ INIT_MD_FILETAG(tag, reln->smgr_rlocator.locator, forknum, seg->mdfd_segno);
/* Temp relations should never be fsync'd */
Assert(!SmgrIsTemp(reln));
@@ -1005,15 +1005,15 @@ register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
* register_unlink_segment() -- Schedule a file to be deleted after next checkpoint
*/
static void
-register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
+register_unlink_segment(RelFileLocatorBackend rlocator, ForkNumber forknum,
BlockNumber segno)
{
FileTag tag;
- INIT_MD_FILETAG(tag, rnode.node, forknum, segno);
+ INIT_MD_FILETAG(tag, rlocator.locator, forknum, segno);
/* Should never be used with temp relations */
- Assert(!RelFileNodeBackendIsTemp(rnode));
+ Assert(!RelFileLocatorBackendIsTemp(rlocator));
RegisterSyncRequest(&tag, SYNC_UNLINK_REQUEST, true /* retryOnError */ );
}
@@ -1022,12 +1022,12 @@ register_unlink_segment(RelFileNodeBackend rnode, ForkNumber forknum,
* register_forget_request() -- forget any fsyncs for a relation fork's segment
*/
static void
-register_forget_request(RelFileNodeBackend rnode, ForkNumber forknum,
+register_forget_request(RelFileLocatorBackend rlocator, ForkNumber forknum,
BlockNumber segno)
{
FileTag tag;
- INIT_MD_FILETAG(tag, rnode.node, forknum, segno);
+ INIT_MD_FILETAG(tag, rlocator.locator, forknum, segno);
RegisterSyncRequest(&tag, SYNC_FORGET_REQUEST, true /* retryOnError */ );
}
@@ -1039,13 +1039,13 @@ void
ForgetDatabaseSyncRequests(Oid dbid)
{
FileTag tag;
- RelFileNode rnode;
+ RelFileLocator rlocator;
- rnode.dbNode = dbid;
- rnode.spcNode = 0;
- rnode.relNode = 0;
+ rlocator.dbOid = dbid;
+ rlocator.spcOid = 0;
+ rlocator.relNumber = 0;
- INIT_MD_FILETAG(tag, rnode, InvalidForkNumber, InvalidBlockNumber);
+ INIT_MD_FILETAG(tag, rlocator, InvalidForkNumber, InvalidBlockNumber);
RegisterSyncRequest(&tag, SYNC_FILTER_REQUEST, true /* retryOnError */ );
}
@@ -1054,7 +1054,7 @@ ForgetDatabaseSyncRequests(Oid dbid)
* DropRelationFiles -- drop files of all given relations
*/
void
-DropRelationFiles(RelFileNode *delrels, int ndelrels, bool isRedo)
+DropRelationFiles(RelFileLocator *delrels, int ndelrels, bool isRedo)
{
SMgrRelation *srels;
int i;
@@ -1129,7 +1129,7 @@ _mdfd_segpath(SMgrRelation reln, ForkNumber forknum, BlockNumber segno)
char *path,
*fullpath;
- path = relpath(reln->smgr_rnode, forknum);
+ path = relpath(reln->smgr_rlocator, forknum);
if (segno > 0)
{
@@ -1345,7 +1345,7 @@ _mdnblocks(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
int
mdsyncfiletag(const FileTag *ftag, char *path)
{
- SMgrRelation reln = smgropen(ftag->rnode, InvalidBackendId);
+ SMgrRelation reln = smgropen(ftag->rlocator, InvalidBackendId);
File file;
bool need_to_close;
int result,
@@ -1395,7 +1395,7 @@ mdunlinkfiletag(const FileTag *ftag, char *path)
char *p;
/* Compute the path. */
- p = relpathperm(ftag->rnode, MAIN_FORKNUM);
+ p = relpathperm(ftag->rlocator, MAIN_FORKNUM);
strlcpy(path, p, MAXPGPATH);
pfree(p);
@@ -1417,5 +1417,5 @@ mdfiletagmatches(const FileTag *ftag, const FileTag *candidate)
* We'll return true for all candidates that have the same database OID as
* the ftag from the SYNC_FILTER_REQUEST request, so they're forgotten.
*/
- return ftag->rnode.dbNode == candidate->rnode.dbNode;
+ return ftag->rlocator.dbOid == candidate->rlocator.dbOid;
}
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index a477f70..b21d8c3 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -46,7 +46,7 @@ typedef struct f_smgr
void (*smgr_create) (SMgrRelation reln, ForkNumber forknum,
bool isRedo);
bool (*smgr_exists) (SMgrRelation reln, ForkNumber forknum);
- void (*smgr_unlink) (RelFileNodeBackend rnode, ForkNumber forknum,
+ void (*smgr_unlink) (RelFileLocatorBackend rlocator, ForkNumber forknum,
bool isRedo);
void (*smgr_extend) (SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum, char *buffer, bool skipFsync);
@@ -143,9 +143,9 @@ smgrshutdown(int code, Datum arg)
* This does not attempt to actually open the underlying file.
*/
SMgrRelation
-smgropen(RelFileNode rnode, BackendId backend)
+smgropen(RelFileLocator rlocator, BackendId backend)
{
- RelFileNodeBackend brnode;
+ RelFileLocatorBackend brlocator;
SMgrRelation reln;
bool found;
@@ -154,7 +154,7 @@ smgropen(RelFileNode rnode, BackendId backend)
/* First time through: initialize the hash table */
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileNodeBackend);
+ ctl.keysize = sizeof(RelFileLocatorBackend);
ctl.entrysize = sizeof(SMgrRelationData);
SMgrRelationHash = hash_create("smgr relation table", 400,
&ctl, HASH_ELEM | HASH_BLOBS);
@@ -162,10 +162,10 @@ smgropen(RelFileNode rnode, BackendId backend)
}
/* Look up or create an entry */
- brnode.node = rnode;
- brnode.backend = backend;
+ brlocator.locator = rlocator;
+ brlocator.backend = backend;
reln = (SMgrRelation) hash_search(SMgrRelationHash,
- (void *) &brnode,
+ (void *) &brlocator,
HASH_ENTER, &found);
/* Initialize it if not present before */
@@ -267,7 +267,7 @@ smgrclose(SMgrRelation reln)
dlist_delete(&reln->node);
if (hash_search(SMgrRelationHash,
- (void *) &(reln->smgr_rnode),
+ (void *) &(reln->smgr_rlocator),
HASH_REMOVE, NULL) == NULL)
elog(ERROR, "SMgrRelation hashtable corrupted");
@@ -335,15 +335,15 @@ smgrcloseall(void)
}
/*
- * smgrclosenode() -- Close SMgrRelation object for given RelFileNode,
+ * smgrcloserellocator() -- Close SMgrRelation object for given RelFileLocator,
* if one exists.
*
- * This has the same effects as smgrclose(smgropen(rnode)), but it avoids
+ * This has the same effects as smgrclose(smgropen(rlocator)), but it avoids
* uselessly creating a hashtable entry only to drop it again when no
* such entry exists already.
*/
void
-smgrclosenode(RelFileNodeBackend rnode)
+smgrcloserellocator(RelFileLocatorBackend rlocator)
{
SMgrRelation reln;
@@ -352,7 +352,7 @@ smgrclosenode(RelFileNodeBackend rnode)
return;
reln = (SMgrRelation) hash_search(SMgrRelationHash,
- (void *) &rnode,
+ (void *) &rlocator,
HASH_FIND, NULL);
if (reln != NULL)
smgrclose(reln);
@@ -420,7 +420,7 @@ void
smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo)
{
int i = 0;
- RelFileNodeBackend *rnodes;
+ RelFileLocatorBackend *rlocators;
ForkNumber forknum;
if (nrels == 0)
@@ -430,19 +430,19 @@ smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo)
* Get rid of any remaining buffers for the relations. bufmgr will just
* drop them without bothering to write the contents.
*/
- DropRelFileNodesAllBuffers(rels, nrels);
+ DropRelFileLocatorsAllBuffers(rels, nrels);
/*
* create an array which contains all relations to be dropped, and close
* each relation's forks at the smgr level while at it
*/
- rnodes = palloc(sizeof(RelFileNodeBackend) * nrels);
+ rlocators = palloc(sizeof(RelFileLocatorBackend) * nrels);
for (i = 0; i < nrels; i++)
{
- RelFileNodeBackend rnode = rels[i]->smgr_rnode;
+ RelFileLocatorBackend rlocator = rels[i]->smgr_rlocator;
int which = rels[i]->smgr_which;
- rnodes[i] = rnode;
+ rlocators[i] = rlocator;
/* Close the forks at smgr level */
for (forknum = 0; forknum <= MAX_FORKNUM; forknum++)
@@ -458,7 +458,7 @@ smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo)
* closed our own smgr rel.
*/
for (i = 0; i < nrels; i++)
- CacheInvalidateSmgr(rnodes[i]);
+ CacheInvalidateSmgr(rlocators[i]);
/*
* Delete the physical file(s).
@@ -473,10 +473,10 @@ smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo)
int which = rels[i]->smgr_which;
for (forknum = 0; forknum <= MAX_FORKNUM; forknum++)
- smgrsw[which].smgr_unlink(rnodes[i], forknum, isRedo);
+ smgrsw[which].smgr_unlink(rlocators[i], forknum, isRedo);
}
- pfree(rnodes);
+ pfree(rlocators);
}
@@ -631,7 +631,7 @@ smgrtruncate(SMgrRelation reln, ForkNumber *forknum, int nforks, BlockNumber *nb
* Get rid of any buffers for the about-to-be-deleted blocks. bufmgr will
* just drop them without bothering to write the contents.
*/
- DropRelFileNodeBuffers(reln, forknum, nforks, nblocks);
+ DropRelFileLocatorBuffers(reln, forknum, nforks, nblocks);
/*
* Send a shared-inval message to force other backends to close any smgr
@@ -643,7 +643,7 @@ smgrtruncate(SMgrRelation reln, ForkNumber *forknum, int nforks, BlockNumber *nb
* is a performance-critical path.) As in the unlink code, we want to be
* sure the message is sent before we start changing things on-disk.
*/
- CacheInvalidateSmgr(reln->smgr_rnode);
+ CacheInvalidateSmgr(reln->smgr_rlocator);
/* Do the truncation */
for (i = 0; i < nforks; i++)
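In the smgr.c hunks above, smgropen() keys its hash table on the whole RelFileLocatorBackend struct via HASH_BLOBS, i.e. raw-byte hashing and comparison of the key. A minimal sketch of that equality rule (simplified stand-in types, not PostgreSQL code; it assumes the struct has no padding bytes, as the real 16-byte layout does on common ABIs):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

typedef unsigned int Oid;
typedef int BackendId;

typedef struct RelFileLocator
{
    Oid spcOid;
    Oid dbOid;
    Oid relNumber;
} RelFileLocator;

/*
 * Mirrors RelFileLocatorBackend: the hash key is the locator plus the
 * owning backend (InvalidBackendId, i.e. -1, for regular relations).
 */
typedef struct RelFileLocatorBackend
{
    RelFileLocator locator;
    BackendId      backend;
} RelFileLocatorBackend;

/*
 * With HASH_BLOBS, two keys are equal exactly when their bytes are equal,
 * which is why smgropen() fully initializes brlocator before the lookup.
 */
static bool
smgr_key_equal(const RelFileLocatorBackend *a, const RelFileLocatorBackend *b)
{
    return memcmp(a, b, sizeof(*a)) == 0;
}
```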
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index b4a2c8d..36ec845 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -27,7 +27,7 @@
#include "utils/builtins.h"
#include "utils/numeric.h"
#include "utils/rel.h"
-#include "utils/relfilenodemap.h"
+#include "utils/relfilenumbermap.h"
#include "utils/relmapper.h"
#include "utils/syscache.h"
@@ -292,7 +292,7 @@ pg_tablespace_size_name(PG_FUNCTION_ARGS)
* is no check here or at the call sites for that.
*/
static int64
-calculate_relation_size(RelFileNode *rfn, BackendId backend, ForkNumber forknum)
+calculate_relation_size(RelFileLocator *rfn, BackendId backend, ForkNumber forknum)
{
int64 totalsize = 0;
char *relationpath;
@@ -349,7 +349,7 @@ pg_relation_size(PG_FUNCTION_ARGS)
if (rel == NULL)
PG_RETURN_NULL();
- size = calculate_relation_size(&(rel->rd_node), rel->rd_backend,
+ size = calculate_relation_size(&(rel->rd_locator), rel->rd_backend,
forkname_to_number(text_to_cstring(forkName)));
relation_close(rel, AccessShareLock);
@@ -374,7 +374,7 @@ calculate_toast_table_size(Oid toastrelid)
/* toast heap size, including FSM and VM size */
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- size += calculate_relation_size(&(toastRel->rd_node),
+ size += calculate_relation_size(&(toastRel->rd_locator),
toastRel->rd_backend, forkNum);
/* toast index size, including FSM and VM size */
@@ -388,7 +388,7 @@ calculate_toast_table_size(Oid toastrelid)
toastIdxRel = relation_open(lfirst_oid(lc),
AccessShareLock);
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- size += calculate_relation_size(&(toastIdxRel->rd_node),
+ size += calculate_relation_size(&(toastIdxRel->rd_locator),
toastIdxRel->rd_backend, forkNum);
relation_close(toastIdxRel, AccessShareLock);
@@ -417,7 +417,7 @@ calculate_table_size(Relation rel)
* heap size, including FSM and VM
*/
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- size += calculate_relation_size(&(rel->rd_node), rel->rd_backend,
+ size += calculate_relation_size(&(rel->rd_locator), rel->rd_backend,
forkNum);
/*
@@ -456,7 +456,7 @@ calculate_indexes_size(Relation rel)
idxRel = relation_open(idxOid, AccessShareLock);
for (forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
- size += calculate_relation_size(&(idxRel->rd_node),
+ size += calculate_relation_size(&(idxRel->rd_locator),
idxRel->rd_backend,
forkNum);
@@ -850,7 +850,7 @@ Datum
pg_relation_filenode(PG_FUNCTION_ARGS)
{
Oid relid = PG_GETARG_OID(0);
- Oid result;
+ RelFileNumber result;
HeapTuple tuple;
Form_pg_class relform;
@@ -864,29 +864,29 @@ pg_relation_filenode(PG_FUNCTION_ARGS)
if (relform->relfilenode)
result = relform->relfilenode;
else /* Consult the relation mapper */
- result = RelationMapOidToFilenode(relid,
- relform->relisshared);
+ result = RelationMapOidToFilenumber(relid,
+ relform->relisshared);
}
else
{
/* no storage, return NULL */
- result = InvalidOid;
+ result = InvalidRelFileNumber;
}
ReleaseSysCache(tuple);
- if (!OidIsValid(result))
+ if (!RelFileNumberIsValid(result))
PG_RETURN_NULL();
PG_RETURN_OID(result);
}
/*
- * Get the relation via (reltablespace, relfilenode)
+ * Get the relation via (reltablespace, relfilenumber)
*
* This is expected to be used when somebody wants to match an individual file
* on the filesystem back to its table. That's not trivially possible via
- * pg_class, because that doesn't contain the relfilenodes of shared and nailed
+ * pg_class, because that doesn't contain the relfilenumbers of shared and nailed
* tables.
*
* We don't fail but return NULL if we cannot find a mapping.
@@ -898,14 +898,14 @@ Datum
pg_filenode_relation(PG_FUNCTION_ARGS)
{
Oid reltablespace = PG_GETARG_OID(0);
- Oid relfilenode = PG_GETARG_OID(1);
+ RelFileNumber relfilenumber = PG_GETARG_OID(1);
Oid heaprel;
- /* test needed so RelidByRelfilenode doesn't misbehave */
- if (!OidIsValid(relfilenode))
+ /* test needed so RelidByRelfilenumber doesn't misbehave */
+ if (!RelFileNumberIsValid(relfilenumber))
PG_RETURN_NULL();
- heaprel = RelidByRelfilenode(reltablespace, relfilenode);
+ heaprel = RelidByRelfilenumber(reltablespace, relfilenumber);
if (!OidIsValid(heaprel))
PG_RETURN_NULL();
@@ -924,7 +924,7 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
Oid relid = PG_GETARG_OID(0);
HeapTuple tuple;
Form_pg_class relform;
- RelFileNode rnode;
+ RelFileLocator rlocator;
BackendId backend;
char *path;
@@ -937,29 +937,29 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
{
/* This logic should match RelationInitPhysicalAddr */
if (relform->reltablespace)
- rnode.spcNode = relform->reltablespace;
+ rlocator.spcOid = relform->reltablespace;
else
- rnode.spcNode = MyDatabaseTableSpace;
- if (rnode.spcNode == GLOBALTABLESPACE_OID)
- rnode.dbNode = InvalidOid;
+ rlocator.spcOid = MyDatabaseTableSpace;
+ if (rlocator.spcOid == GLOBALTABLESPACE_OID)
+ rlocator.dbOid = InvalidOid;
else
- rnode.dbNode = MyDatabaseId;
+ rlocator.dbOid = MyDatabaseId;
if (relform->relfilenode)
- rnode.relNode = relform->relfilenode;
+ rlocator.relNumber = relform->relfilenode;
else /* Consult the relation mapper */
- rnode.relNode = RelationMapOidToFilenode(relid,
- relform->relisshared);
+ rlocator.relNumber = RelationMapOidToFilenumber(relid,
+ relform->relisshared);
}
else
{
/* no storage, return NULL */
- rnode.relNode = InvalidOid;
+ rlocator.relNumber = InvalidRelFileNumber;
/* some compilers generate warnings without these next two lines */
- rnode.dbNode = InvalidOid;
- rnode.spcNode = InvalidOid;
+ rlocator.dbOid = InvalidOid;
+ rlocator.spcOid = InvalidOid;
}
- if (!OidIsValid(rnode.relNode))
+ if (!RelFileNumberIsValid(rlocator.relNumber))
{
ReleaseSysCache(tuple);
PG_RETURN_NULL();
@@ -990,7 +990,7 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
ReleaseSysCache(tuple);
- path = relpathbackend(rnode, backend, MAIN_FORKNUM);
+ path = relpathbackend(rlocator, backend, MAIN_FORKNUM);
PG_RETURN_TEXT_P(cstring_to_text(path));
}
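The pg_relation_filepath() hunk above repeats the addressing rule from RelationInitPhysicalAddr(): an empty reltablespace means the database's default tablespace, and anything in pg_global (OID 1664) is shared, so it carries no database OID. A standalone sketch of just that branch logic (simplified types, not PostgreSQL code; the relation-mapper fallback for relfilenode = 0 is omitted):

```c
#include <assert.h>

typedef unsigned int Oid;

#define InvalidOid            ((Oid) 0)
#define GLOBALTABLESPACE_OID  1664    /* OID of pg_global */

typedef struct RelFileLocator
{
    Oid spcOid;
    Oid dbOid;
    Oid relNumber;
} RelFileLocator;

/*
 * Mirrors the branch in pg_relation_filepath() / RelationInitPhysicalAddr():
 * reltablespace = 0 falls back to the session database's default tablespace,
 * and shared (pg_global) relations get InvalidOid for dbOid.
 */
static RelFileLocator
physical_address(Oid reltablespace, Oid my_db_tablespace, Oid my_db,
                 Oid relfilenumber)
{
    RelFileLocator rlocator;

    rlocator.spcOid = reltablespace ? reltablespace : my_db_tablespace;
    rlocator.dbOid = (rlocator.spcOid == GLOBALTABLESPACE_OID)
        ? InvalidOid : my_db;
    rlocator.relNumber = relfilenumber;
    return rlocator;
}
```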
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 65764d7..c260c97 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -2,7 +2,7 @@
* pg_upgrade_support.c
*
* server-side functions to set backend global variables
- * to control oid and relfilenode assignment, and do other special
+ * to control oid and relfilenumber assignment, and do other special
* hacks needed for pg_upgrade.
*
* Copyright (c) 2010-2022, PostgreSQL Global Development Group
@@ -98,10 +98,10 @@ binary_upgrade_set_next_heap_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_heap_relfilenode(PG_FUNCTION_ARGS)
{
- Oid nodeoid = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_OID(0);
CHECK_IS_BINARY_UPGRADE;
- binary_upgrade_next_heap_pg_class_relfilenode = nodeoid;
+ binary_upgrade_next_heap_pg_class_relfilenumber = relfilenumber;
PG_RETURN_VOID();
}
@@ -120,10 +120,10 @@ binary_upgrade_set_next_index_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_index_relfilenode(PG_FUNCTION_ARGS)
{
- Oid nodeoid = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_OID(0);
CHECK_IS_BINARY_UPGRADE;
- binary_upgrade_next_index_pg_class_relfilenode = nodeoid;
+ binary_upgrade_next_index_pg_class_relfilenumber = relfilenumber;
PG_RETURN_VOID();
}
@@ -142,10 +142,10 @@ binary_upgrade_set_next_toast_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_toast_relfilenode(PG_FUNCTION_ARGS)
{
- Oid nodeoid = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_OID(0);
CHECK_IS_BINARY_UPGRADE;
- binary_upgrade_next_toast_pg_class_relfilenode = nodeoid;
+ binary_upgrade_next_toast_pg_class_relfilenumber = relfilenumber;
PG_RETURN_VOID();
}
diff --git a/src/backend/utils/cache/Makefile b/src/backend/utils/cache/Makefile
index 38e46d2..5105018 100644
--- a/src/backend/utils/cache/Makefile
+++ b/src/backend/utils/cache/Makefile
@@ -21,7 +21,7 @@ OBJS = \
partcache.o \
plancache.o \
relcache.o \
- relfilenodemap.o \
+ relfilenumbermap.o \
relmapper.o \
spccache.o \
syscache.o \
diff --git a/src/backend/utils/cache/inval.c b/src/backend/utils/cache/inval.c
index af000d4..eb5782f 100644
--- a/src/backend/utils/cache/inval.c
+++ b/src/backend/utils/cache/inval.c
@@ -661,11 +661,11 @@ LocalExecuteInvalidationMessage(SharedInvalidationMessage *msg)
* We could have smgr entries for relations of other databases, so no
* short-circuit test is possible here.
*/
- RelFileNodeBackend rnode;
+ RelFileLocatorBackend rlocator;
- rnode.node = msg->sm.rnode;
- rnode.backend = (msg->sm.backend_hi << 16) | (int) msg->sm.backend_lo;
- smgrclosenode(rnode);
+ rlocator.locator = msg->sm.rlocator;
+ rlocator.backend = (msg->sm.backend_hi << 16) | (int) msg->sm.backend_lo;
+ smgrcloserellocator(rlocator);
}
else if (msg->id == SHAREDINVALRELMAP_ID)
{
@@ -1459,14 +1459,14 @@ CacheInvalidateRelcacheByRelid(Oid relid)
* Thus, the maximum possible backend ID is 2^23-1.
*/
void
-CacheInvalidateSmgr(RelFileNodeBackend rnode)
+CacheInvalidateSmgr(RelFileLocatorBackend rlocator)
{
SharedInvalidationMessage msg;
msg.sm.id = SHAREDINVALSMGR_ID;
- msg.sm.backend_hi = rnode.backend >> 16;
- msg.sm.backend_lo = rnode.backend & 0xffff;
- msg.sm.rnode = rnode.node;
+ msg.sm.backend_hi = rlocator.backend >> 16;
+ msg.sm.backend_lo = rlocator.backend & 0xffff;
+ msg.sm.rlocator = rlocator.locator;
/* check AddCatcacheInvalidationMessage() for an explanation */
VALGRIND_MAKE_MEM_DEFINED(&msg, sizeof(msg));
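The inval.c hunks above keep the existing trick of splitting the backend ID across the sinval message's 8-bit backend_hi and 16-bit backend_lo fields, which is where the 2^23-1 ceiling mentioned in the comment comes from. A round-trip sketch of that packing (standalone, with stdint stand-ins for the message fields; tested only with non-negative IDs, since left-shifting a negative hi byte is implementation-defined in portable C):

```c
#include <assert.h>
#include <stdint.h>

typedef int BackendId;

/*
 * Mirrors CacheInvalidateSmgr(): the backend ID is split across a signed
 * 8-bit "hi" field and an unsigned 16-bit "lo" field to keep the shared
 * invalidation message small.
 */
static void
pack_backend(BackendId backend, int8_t *hi, uint16_t *lo)
{
    *hi = backend >> 16;
    *lo = backend & 0xffff;
}

/* Mirrors the reassembly in LocalExecuteInvalidationMessage(). */
static BackendId
unpack_backend(int8_t hi, uint16_t lo)
{
    return (hi << 16) | (int) lo;
}
```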
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index f502df9..0639875 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -369,7 +369,7 @@ ScanPgRelation(Oid targetRelId, bool indexOK, bool force_non_historic)
/*
* The caller might need a tuple that's newer than the one visible to the historic
* snapshot; currently the only case requiring to do so is looking up the
- * relfilenode of non mapped system relations during decoding. That
+ * relfilenumber of non mapped system relations during decoding. That
* snapshot can't change in the midst of a relcache build, so there's no
* need to register the snapshot.
*/
@@ -1133,8 +1133,8 @@ retry:
relation->rd_refcnt = 0;
relation->rd_isnailed = false;
relation->rd_createSubid = InvalidSubTransactionId;
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
relation->rd_droppedSubid = InvalidSubTransactionId;
switch (relation->rd_rel->relpersistence)
{
@@ -1300,7 +1300,7 @@ retry:
}
/*
- * Initialize the physical addressing info (RelFileNode) for a relcache entry
+ * Initialize the physical addressing info (RelFileLocator) for a relcache entry
*
* Note: at the physical level, relations in the pg_global tablespace must
* be treated as shared, even if relisshared isn't set. Hence we do not
@@ -1309,20 +1309,20 @@ retry:
static void
RelationInitPhysicalAddr(Relation relation)
{
- Oid oldnode = relation->rd_node.relNode;
+ RelFileNumber oldnumber = relation->rd_locator.relNumber;
/* these relations kinds never have storage */
if (!RELKIND_HAS_STORAGE(relation->rd_rel->relkind))
return;
if (relation->rd_rel->reltablespace)
- relation->rd_node.spcNode = relation->rd_rel->reltablespace;
+ relation->rd_locator.spcOid = relation->rd_rel->reltablespace;
else
- relation->rd_node.spcNode = MyDatabaseTableSpace;
- if (relation->rd_node.spcNode == GLOBALTABLESPACE_OID)
- relation->rd_node.dbNode = InvalidOid;
+ relation->rd_locator.spcOid = MyDatabaseTableSpace;
+ if (relation->rd_locator.spcOid == GLOBALTABLESPACE_OID)
+ relation->rd_locator.dbOid = InvalidOid;
else
- relation->rd_node.dbNode = MyDatabaseId;
+ relation->rd_locator.dbOid = MyDatabaseId;
if (relation->rd_rel->relfilenode)
{
@@ -1356,30 +1356,30 @@ RelationInitPhysicalAddr(Relation relation)
heap_freetuple(phys_tuple);
}
- relation->rd_node.relNode = relation->rd_rel->relfilenode;
+ relation->rd_locator.relNumber = relation->rd_rel->relfilenode;
}
else
{
/* Consult the relation mapper */
- relation->rd_node.relNode =
- RelationMapOidToFilenode(relation->rd_id,
- relation->rd_rel->relisshared);
- if (!OidIsValid(relation->rd_node.relNode))
+ relation->rd_locator.relNumber =
+ RelationMapOidToFilenumber(relation->rd_id,
+ relation->rd_rel->relisshared);
+ if (!RelFileNumberIsValid(relation->rd_locator.relNumber))
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
RelationGetRelationName(relation), relation->rd_id);
}
/*
* For RelationNeedsWAL() to answer correctly on parallel workers, restore
- * rd_firstRelfilenodeSubid. No subtransactions start or end while in
+ * rd_firstRelfilelocatorSubid. No subtransactions start or end while in
* parallel mode, so the specific SubTransactionId does not matter.
*/
- if (IsParallelWorker() && oldnode != relation->rd_node.relNode)
+ if (IsParallelWorker() && oldnumber != relation->rd_locator.relNumber)
{
- if (RelFileNodeSkippingWAL(relation->rd_node))
- relation->rd_firstRelfilenodeSubid = TopSubTransactionId;
+ if (RelFileLocatorSkippingWAL(relation->rd_locator))
+ relation->rd_firstRelfilelocatorSubid = TopSubTransactionId;
else
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
}
}
@@ -1889,8 +1889,8 @@ formrdesc(const char *relationName, Oid relationReltype,
*/
relation->rd_isnailed = true;
relation->rd_createSubid = InvalidSubTransactionId;
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
relation->rd_droppedSubid = InvalidSubTransactionId;
relation->rd_backend = InvalidBackendId;
relation->rd_islocaltemp = false;
@@ -1978,11 +1978,11 @@ formrdesc(const char *relationName, Oid relationReltype,
/*
* All relations made with formrdesc are mapped. This is necessarily so
- * because there is no other way to know what filenode they currently
+ * because there is no other way to know what filenumber they currently
* have. In bootstrap mode, add them to the initial relation mapper data,
- * specifying that the initial filenode is the same as the OID.
+ * specifying that the initial filenumber is the same as the OID.
*/
- relation->rd_rel->relfilenode = InvalidOid;
+ relation->rd_rel->relfilenode = InvalidRelFileNumber;
if (IsBootstrapProcessingMode())
RelationMapUpdateMap(RelationGetRelid(relation),
RelationGetRelid(relation),
@@ -2180,7 +2180,7 @@ RelationClose(Relation relation)
#ifdef RELCACHE_FORCE_RELEASE
if (RelationHasReferenceCountZero(relation) &&
relation->rd_createSubid == InvalidSubTransactionId &&
- relation->rd_firstRelfilenodeSubid == InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid == InvalidSubTransactionId)
RelationClearRelation(relation, false);
#endif
}
@@ -2352,7 +2352,7 @@ RelationReloadNailed(Relation relation)
{
/*
* If it's a nailed-but-not-mapped index, then we need to re-read the
- * pg_class row to see if its relfilenode changed.
+ * pg_class row to see if its relfilenumber changed.
*/
RelationReloadIndexInfo(relation);
}
@@ -2700,8 +2700,8 @@ RelationClearRelation(Relation relation, bool rebuild)
Assert(newrel->rd_isnailed == relation->rd_isnailed);
/* creation sub-XIDs must be preserved */
SWAPFIELD(SubTransactionId, rd_createSubid);
- SWAPFIELD(SubTransactionId, rd_newRelfilenodeSubid);
- SWAPFIELD(SubTransactionId, rd_firstRelfilenodeSubid);
+ SWAPFIELD(SubTransactionId, rd_newRelfilelocatorSubid);
+ SWAPFIELD(SubTransactionId, rd_firstRelfilelocatorSubid);
SWAPFIELD(SubTransactionId, rd_droppedSubid);
/* un-swap rd_rel pointers, swap contents instead */
SWAPFIELD(Form_pg_class, rd_rel);
@@ -2791,12 +2791,12 @@ static void
RelationFlushRelation(Relation relation)
{
if (relation->rd_createSubid != InvalidSubTransactionId ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId)
{
/*
* New relcache entries are always rebuilt, not flushed; else we'd
* forget the "new" status of the relation. Ditto for the
- * new-relfilenode status.
+ * new-relfilenumber status.
*
* The rel could have zero refcnt here, so temporarily increment the
* refcnt to ensure it's safe to rebuild it. We can assume that the
@@ -2835,7 +2835,7 @@ RelationForgetRelation(Oid rid)
Assert(relation->rd_droppedSubid == InvalidSubTransactionId);
if (relation->rd_createSubid != InvalidSubTransactionId ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId)
{
/*
* In the event of subtransaction rollback, we must not forget
@@ -2894,7 +2894,7 @@ RelationCacheInvalidateEntry(Oid relationId)
*
* Apart from debug_discard_caches, this is currently used only to recover
* from SI message buffer overflow, so we do not touch relations having
- * new-in-transaction relfilenodes; they cannot be targets of cross-backend
+ * new-in-transaction relfilenumbers; they cannot be targets of cross-backend
* SI updates (and our own updates now go through a separate linked list
* that isn't limited by the SI message buffer size).
*
@@ -2909,7 +2909,7 @@ RelationCacheInvalidateEntry(Oid relationId)
* so hash_seq_search will complete safely; (b) during the second pass we
* only hold onto pointers to nondeletable entries.
*
- * The two-phase approach also makes it easy to update relfilenodes for
+ * The two-phase approach also makes it easy to update relfilenumbers for
* mapped relations before we do anything else, and to ensure that the
* second pass processes nailed-in-cache items before other nondeletable
* items. This should ensure that system catalogs are up to date before
@@ -2948,12 +2948,12 @@ RelationCacheInvalidate(bool debug_discard)
/*
* Ignore new relations; no other backend will manipulate them before
- * we commit. Likewise, before replacing a relation's relfilenode, we
- * shall have acquired AccessExclusiveLock and drained any applicable
- * pending invalidations.
+ * we commit. Likewise, before replacing a relation's relfilelocator,
+ * we shall have acquired AccessExclusiveLock and drained any
+ * applicable pending invalidations.
*/
if (relation->rd_createSubid != InvalidSubTransactionId ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId)
continue;
relcacheInvalsReceived++;
@@ -2967,8 +2967,8 @@ RelationCacheInvalidate(bool debug_discard)
else
{
/*
- * If it's a mapped relation, immediately update its rd_node in
- * case its relfilenode changed. We must do this during phase 1
+ * If it's a mapped relation, immediately update its rd_locator in
+ * case its relfilenumber changed. We must do this during phase 1
* in case the relation is consulted during rebuild of other
* relcache entries in phase 2. It's safe since consulting the
* map doesn't involve any access to relcache entries.
@@ -3078,14 +3078,14 @@ AssertPendingSyncConsistency(Relation relation)
RelationIsPermanent(relation) &&
((relation->rd_createSubid != InvalidSubTransactionId &&
RELKIND_HAS_STORAGE(relation->rd_rel->relkind)) ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId);
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId);
- Assert(relcache_verdict == RelFileNodeSkippingWAL(relation->rd_node));
+ Assert(relcache_verdict == RelFileLocatorSkippingWAL(relation->rd_locator));
if (relation->rd_droppedSubid != InvalidSubTransactionId)
Assert(!relation->rd_isvalid &&
(relation->rd_createSubid != InvalidSubTransactionId ||
- relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId));
+ relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId));
}
/*
@@ -3282,8 +3282,8 @@ AtEOXact_cleanup(Relation relation, bool isCommit)
* also lets RelationClearRelation() drop the relcache entry.
*/
relation->rd_createSubid = InvalidSubTransactionId;
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
relation->rd_droppedSubid = InvalidSubTransactionId;
if (clear_relcache)
@@ -3397,8 +3397,8 @@ AtEOSubXact_cleanup(Relation relation, bool isCommit,
{
/* allow the entry to be removed */
relation->rd_createSubid = InvalidSubTransactionId;
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
relation->rd_droppedSubid = InvalidSubTransactionId;
RelationClearRelation(relation, false);
return;
@@ -3419,23 +3419,23 @@ AtEOSubXact_cleanup(Relation relation, bool isCommit,
}
/*
- * Likewise, update or drop any new-relfilenode-in-subtransaction record
+ * Likewise, update or drop any new-relfilenumber-in-subtransaction record
* or drop record.
*/
- if (relation->rd_newRelfilenodeSubid == mySubid)
+ if (relation->rd_newRelfilelocatorSubid == mySubid)
{
if (isCommit)
- relation->rd_newRelfilenodeSubid = parentSubid;
+ relation->rd_newRelfilelocatorSubid = parentSubid;
else
- relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
}
- if (relation->rd_firstRelfilenodeSubid == mySubid)
+ if (relation->rd_firstRelfilelocatorSubid == mySubid)
{
if (isCommit)
- relation->rd_firstRelfilenodeSubid = parentSubid;
+ relation->rd_firstRelfilelocatorSubid = parentSubid;
else
- relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ relation->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
}
if (relation->rd_droppedSubid == mySubid)
@@ -3459,7 +3459,7 @@ RelationBuildLocalRelation(const char *relname,
TupleDesc tupDesc,
Oid relid,
Oid accessmtd,
- Oid relfilenode,
+ RelFileNumber relfilenumber,
Oid reltablespace,
bool shared_relation,
bool mapped_relation,
@@ -3533,8 +3533,8 @@ RelationBuildLocalRelation(const char *relname,
/* it's being created in this transaction */
rel->rd_createSubid = GetCurrentSubTransactionId();
- rel->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- rel->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ rel->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ rel->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
rel->rd_droppedSubid = InvalidSubTransactionId;
/*
@@ -3616,7 +3616,7 @@ RelationBuildLocalRelation(const char *relname,
/*
* Insert relation physical and logical identifiers (OIDs) into the right
- * places. For a mapped relation, we set relfilenode to zero and rely on
+ * places. For a mapped relation, we set relfilenumber to zero and rely on
* RelationInitPhysicalAddr to consult the map.
*/
rel->rd_rel->relisshared = shared_relation;
@@ -3630,12 +3630,12 @@ RelationBuildLocalRelation(const char *relname,
if (mapped_relation)
{
- rel->rd_rel->relfilenode = InvalidOid;
+ rel->rd_rel->relfilenode = InvalidRelFileNumber;
/* Add it to the active mapping information */
- RelationMapUpdateMap(relid, relfilenode, shared_relation, true);
+ RelationMapUpdateMap(relid, relfilenumber, shared_relation, true);
}
else
- rel->rd_rel->relfilenode = relfilenode;
+ rel->rd_rel->relfilenode = relfilenumber;
RelationInitLockInfo(rel); /* see lmgr.c */
@@ -3683,13 +3683,13 @@ RelationBuildLocalRelation(const char *relname,
/*
- * RelationSetNewRelfilenode
+ * RelationSetNewRelfilenumber
*
- * Assign a new relfilenode (physical file name), and possibly a new
+ * Assign a new relfilenumber (physical file name), and possibly a new
* persistence setting, to the relation.
*
* This allows a full rewrite of the relation to be done with transactional
- * safety (since the filenode assignment can be rolled back). Note however
+ * safety (since the filenumber assignment can be rolled back). Note however
* that there is no simple way to access the relation's old data for the
* remainder of the current transaction. This limits the usefulness to cases
* such as TRUNCATE or rebuilding an index from scratch.
@@ -3697,19 +3697,19 @@ RelationBuildLocalRelation(const char *relname,
* Caller must already hold exclusive lock on the relation.
*/
void
-RelationSetNewRelfilenode(Relation relation, char persistence)
+RelationSetNewRelfilenumber(Relation relation, char persistence)
{
- Oid newrelfilenode;
+ RelFileNumber newrelfilenumber;
Relation pg_class;
HeapTuple tuple;
Form_pg_class classform;
MultiXactId minmulti = InvalidMultiXactId;
TransactionId freezeXid = InvalidTransactionId;
- RelFileNode newrnode;
+ RelFileLocator newrlocator;
- /* Allocate a new relfilenode */
- newrelfilenode = GetNewRelFileNode(relation->rd_rel->reltablespace, NULL,
- persistence);
+ /* Allocate a new relfilenumber */
+ newrelfilenumber = GetNewRelFileNumber(relation->rd_rel->reltablespace,
+ NULL, persistence);
/*
* Get a writable copy of the pg_class tuple for the given relation.
@@ -3729,28 +3729,28 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
RelationDropStorage(relation);
/*
- * Create storage for the main fork of the new relfilenode. If it's a
+ * Create storage for the main fork of the new relfilenumber. If it's a
* table-like object, call into the table AM to do so, which'll also
* create the table's init fork if needed.
*
- * NOTE: If relevant for the AM, any conflict in relfilenode value will be
- * caught here, if GetNewRelFileNode messes up for any reason.
+ * NOTE: If relevant for the AM, any conflict in relfilenumber value will be
+ * caught here, if GetNewRelFileNumber messes up for any reason.
*/
- newrnode = relation->rd_node;
- newrnode.relNode = newrelfilenode;
+ newrlocator = relation->rd_locator;
+ newrlocator.relNumber = newrelfilenumber;
if (RELKIND_HAS_TABLE_AM(relation->rd_rel->relkind))
{
- table_relation_set_new_filenode(relation, &newrnode,
- persistence,
- &freezeXid, &minmulti);
+ table_relation_set_new_filelocator(relation, &newrlocator,
+ persistence,
+ &freezeXid, &minmulti);
}
else if (RELKIND_HAS_STORAGE(relation->rd_rel->relkind))
{
/* handle these directly, at least for now */
SMgrRelation srel;
- srel = RelationCreateStorage(newrnode, persistence, true);
+ srel = RelationCreateStorage(newrlocator, persistence, true);
smgrclose(srel);
}
else
@@ -3789,7 +3789,7 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
/* Do the deed */
RelationMapUpdateMap(RelationGetRelid(relation),
- newrelfilenode,
+ newrelfilenumber,
relation->rd_rel->relisshared,
false);
@@ -3799,7 +3799,7 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
else
{
/* Normal case, update the pg_class entry */
- classform->relfilenode = newrelfilenode;
+ classform->relfilenode = newrelfilenumber;
/* relpages etc. never change for sequences */
if (relation->rd_rel->relkind != RELKIND_SEQUENCE)
@@ -3825,27 +3825,27 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
*/
CommandCounterIncrement();
- RelationAssumeNewRelfilenode(relation);
+ RelationAssumeNewRelfilelocator(relation);
}
/*
- * RelationAssumeNewRelfilenode
+ * RelationAssumeNewRelfilelocator
*
* Code that modifies pg_class.reltablespace or pg_class.relfilenode must call
* this. The call shall precede any code that might insert WAL records whose
- * replay would modify bytes in the new RelFileNode, and the call shall follow
- * any WAL modifying bytes in the prior RelFileNode. See struct RelationData.
+ * replay would modify bytes in the new RelFileLocator, and the call shall follow
+ * any WAL modifying bytes in the prior RelFileLocator. See struct RelationData.
* Ideally, call this as near as possible to the CommandCounterIncrement()
* that makes the pg_class change visible (before it or after it); that
* minimizes the chance of future development adding a forbidden WAL insertion
- * between RelationAssumeNewRelfilenode() and CommandCounterIncrement().
+ * between RelationAssumeNewRelfilelocator() and CommandCounterIncrement().
*/
void
-RelationAssumeNewRelfilenode(Relation relation)
+RelationAssumeNewRelfilelocator(Relation relation)
{
- relation->rd_newRelfilenodeSubid = GetCurrentSubTransactionId();
- if (relation->rd_firstRelfilenodeSubid == InvalidSubTransactionId)
- relation->rd_firstRelfilenodeSubid = relation->rd_newRelfilenodeSubid;
+ relation->rd_newRelfilelocatorSubid = GetCurrentSubTransactionId();
+ if (relation->rd_firstRelfilelocatorSubid == InvalidSubTransactionId)
+ relation->rd_firstRelfilelocatorSubid = relation->rd_newRelfilelocatorSubid;
/* Flag relation as needing eoxact cleanup (to clear these fields) */
EOXactListAdd(relation);
@@ -6254,8 +6254,8 @@ load_relcache_init_file(bool shared)
rel->rd_fkeyvalid = false;
rel->rd_fkeylist = NIL;
rel->rd_createSubid = InvalidSubTransactionId;
- rel->rd_newRelfilenodeSubid = InvalidSubTransactionId;
- rel->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ rel->rd_newRelfilelocatorSubid = InvalidSubTransactionId;
+ rel->rd_firstRelfilelocatorSubid = InvalidSubTransactionId;
rel->rd_droppedSubid = InvalidSubTransactionId;
rel->rd_amcache = NULL;
rel->pgstat_info = NULL;
diff --git a/src/backend/utils/cache/relfilenodemap.c b/src/backend/utils/cache/relfilenodemap.c
deleted file mode 100644
index 70c323c..0000000
--- a/src/backend/utils/cache/relfilenodemap.c
+++ /dev/null
@@ -1,244 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * relfilenodemap.c
- * relfilenode to oid mapping cache.
- *
- * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * IDENTIFICATION
- * src/backend/utils/cache/relfilenodemap.c
- *
- *-------------------------------------------------------------------------
- */
-#include "postgres.h"
-
-#include "access/genam.h"
-#include "access/htup_details.h"
-#include "access/table.h"
-#include "catalog/pg_class.h"
-#include "catalog/pg_tablespace.h"
-#include "miscadmin.h"
-#include "utils/builtins.h"
-#include "utils/catcache.h"
-#include "utils/fmgroids.h"
-#include "utils/hsearch.h"
-#include "utils/inval.h"
-#include "utils/rel.h"
-#include "utils/relfilenodemap.h"
-#include "utils/relmapper.h"
-
-/* Hash table for information about each relfilenode <-> oid pair */
-static HTAB *RelfilenodeMapHash = NULL;
-
-/* built first time through in InitializeRelfilenodeMap */
-static ScanKeyData relfilenode_skey[2];
-
-typedef struct
-{
- Oid reltablespace;
- Oid relfilenode;
-} RelfilenodeMapKey;
-
-typedef struct
-{
- RelfilenodeMapKey key; /* lookup key - must be first */
- Oid relid; /* pg_class.oid */
-} RelfilenodeMapEntry;
-
-/*
- * RelfilenodeMapInvalidateCallback
- * Flush mapping entries when pg_class is updated in a relevant fashion.
- */
-static void
-RelfilenodeMapInvalidateCallback(Datum arg, Oid relid)
-{
- HASH_SEQ_STATUS status;
- RelfilenodeMapEntry *entry;
-
- /* callback only gets registered after creating the hash */
- Assert(RelfilenodeMapHash != NULL);
-
- hash_seq_init(&status, RelfilenodeMapHash);
- while ((entry = (RelfilenodeMapEntry *) hash_seq_search(&status)) != NULL)
- {
- /*
- * If relid is InvalidOid, signaling a complete reset, we must remove
- * all entries, otherwise just remove the specific relation's entry.
- * Always remove negative cache entries.
- */
- if (relid == InvalidOid || /* complete reset */
- entry->relid == InvalidOid || /* negative cache entry */
- entry->relid == relid) /* individual flushed relation */
- {
- if (hash_search(RelfilenodeMapHash,
- (void *) &entry->key,
- HASH_REMOVE,
- NULL) == NULL)
- elog(ERROR, "hash table corrupted");
- }
- }
-}
-
-/*
- * InitializeRelfilenodeMap
- * Initialize cache, either on first use or after a reset.
- */
-static void
-InitializeRelfilenodeMap(void)
-{
- HASHCTL ctl;
- int i;
-
- /* Make sure we've initialized CacheMemoryContext. */
- if (CacheMemoryContext == NULL)
- CreateCacheMemoryContext();
-
- /* build skey */
- MemSet(&relfilenode_skey, 0, sizeof(relfilenode_skey));
-
- for (i = 0; i < 2; i++)
- {
- fmgr_info_cxt(F_OIDEQ,
- &relfilenode_skey[i].sk_func,
- CacheMemoryContext);
- relfilenode_skey[i].sk_strategy = BTEqualStrategyNumber;
- relfilenode_skey[i].sk_subtype = InvalidOid;
- relfilenode_skey[i].sk_collation = InvalidOid;
- }
-
- relfilenode_skey[0].sk_attno = Anum_pg_class_reltablespace;
- relfilenode_skey[1].sk_attno = Anum_pg_class_relfilenode;
-
- /*
- * Only create the RelfilenodeMapHash now, so we don't end up partially
- * initialized when fmgr_info_cxt() above ERRORs out with an out of memory
- * error.
- */
- ctl.keysize = sizeof(RelfilenodeMapKey);
- ctl.entrysize = sizeof(RelfilenodeMapEntry);
- ctl.hcxt = CacheMemoryContext;
-
- RelfilenodeMapHash =
- hash_create("RelfilenodeMap cache", 64, &ctl,
- HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
-
- /* Watch for invalidation events. */
- CacheRegisterRelcacheCallback(RelfilenodeMapInvalidateCallback,
- (Datum) 0);
-}
-
-/*
- * Map a relation's (tablespace, filenode) to a relation's oid and cache the
- * result.
- *
- * Returns InvalidOid if no relation matching the criteria could be found.
- */
-Oid
-RelidByRelfilenode(Oid reltablespace, Oid relfilenode)
-{
- RelfilenodeMapKey key;
- RelfilenodeMapEntry *entry;
- bool found;
- SysScanDesc scandesc;
- Relation relation;
- HeapTuple ntp;
- ScanKeyData skey[2];
- Oid relid;
-
- if (RelfilenodeMapHash == NULL)
- InitializeRelfilenodeMap();
-
- /* pg_class will show 0 when the value is actually MyDatabaseTableSpace */
- if (reltablespace == MyDatabaseTableSpace)
- reltablespace = 0;
-
- MemSet(&key, 0, sizeof(key));
- key.reltablespace = reltablespace;
- key.relfilenode = relfilenode;
-
- /*
- * Check cache and return entry if one is found. Even if no target
- * relation can be found later on we store the negative match and return a
- * InvalidOid from cache. That's not really necessary for performance
- * since querying invalid values isn't supposed to be a frequent thing,
- * but it's basically free.
- */
- entry = hash_search(RelfilenodeMapHash, (void *) &key, HASH_FIND, &found);
-
- if (found)
- return entry->relid;
-
- /* ok, no previous cache entry, do it the hard way */
-
- /* initialize empty/negative cache entry before doing the actual lookups */
- relid = InvalidOid;
-
- if (reltablespace == GLOBALTABLESPACE_OID)
- {
- /*
- * Ok, shared table, check relmapper.
- */
- relid = RelationMapFilenodeToOid(relfilenode, true);
- }
- else
- {
- /*
- * Not a shared table, could either be a plain relation or a
- * non-shared, nailed one, like e.g. pg_class.
- */
-
- /* check for plain relations by looking in pg_class */
- relation = table_open(RelationRelationId, AccessShareLock);
-
- /* copy scankey to local copy, it will be modified during the scan */
- memcpy(skey, relfilenode_skey, sizeof(skey));
-
- /* set scan arguments */
- skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
- skey[1].sk_argument = ObjectIdGetDatum(relfilenode);
-
- scandesc = systable_beginscan(relation,
- ClassTblspcRelfilenodeIndexId,
- true,
- NULL,
- 2,
- skey);
-
- found = false;
-
- while (HeapTupleIsValid(ntp = systable_getnext(scandesc)))
- {
- Form_pg_class classform = (Form_pg_class) GETSTRUCT(ntp);
-
- if (found)
- elog(ERROR,
- "unexpected duplicate for tablespace %u, relfilenode %u",
- reltablespace, relfilenode);
- found = true;
-
- Assert(classform->reltablespace == reltablespace);
- Assert(classform->relfilenode == relfilenode);
- relid = classform->oid;
- }
-
- systable_endscan(scandesc);
- table_close(relation, AccessShareLock);
-
- /* check for tables that are mapped but not shared */
- if (!found)
- relid = RelationMapFilenodeToOid(relfilenode, false);
- }
-
- /*
- * Only enter entry into cache now, our opening of pg_class could have
- * caused cache invalidations to be executed which would have deleted a
- * new entry if we had entered it above.
- */
- entry = hash_search(RelfilenodeMapHash, (void *) &key, HASH_ENTER, &found);
- if (found)
- elog(ERROR, "corrupted hashtable");
- entry->relid = relid;
-
- return relid;
-}
diff --git a/src/backend/utils/cache/relfilenumbermap.c b/src/backend/utils/cache/relfilenumbermap.c
new file mode 100644
index 0000000..3dc45e9
--- /dev/null
+++ b/src/backend/utils/cache/relfilenumbermap.c
@@ -0,0 +1,244 @@
+/*-------------------------------------------------------------------------
+ *
+ * relfilenumbermap.c
+ * relfilenumber to oid mapping cache.
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/backend/utils/cache/relfilenumbermap.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/htup_details.h"
+#include "access/table.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_tablespace.h"
+#include "miscadmin.h"
+#include "utils/builtins.h"
+#include "utils/catcache.h"
+#include "utils/fmgroids.h"
+#include "utils/hsearch.h"
+#include "utils/inval.h"
+#include "utils/rel.h"
+#include "utils/relfilenumbermap.h"
+#include "utils/relmapper.h"
+
+/* Hash table for information about each relfilenumber <-> oid pair */
+static HTAB *RelfilenumberMapHash = NULL;
+
+/* built first time through in InitializeRelfilenumberMap */
+static ScanKeyData relfilenumber_skey[2];
+
+typedef struct
+{
+ Oid reltablespace;
+ RelFileNumber relfilenumber;
+} RelfilenumberMapKey;
+
+typedef struct
+{
+ RelfilenumberMapKey key; /* lookup key - must be first */
+ Oid relid; /* pg_class.oid */
+} RelfilenumberMapEntry;
+
+/*
+ * RelfilenumberMapInvalidateCallback
+ * Flush mapping entries when pg_class is updated in a relevant fashion.
+ */
+static void
+RelfilenumberMapInvalidateCallback(Datum arg, Oid relid)
+{
+ HASH_SEQ_STATUS status;
+ RelfilenumberMapEntry *entry;
+
+ /* callback only gets registered after creating the hash */
+ Assert(RelfilenumberMapHash != NULL);
+
+ hash_seq_init(&status, RelfilenumberMapHash);
+ while ((entry = (RelfilenumberMapEntry *) hash_seq_search(&status)) != NULL)
+ {
+ /*
+ * If relid is InvalidOid, signaling a complete reset, we must remove
+ * all entries, otherwise just remove the specific relation's entry.
+ * Always remove negative cache entries.
+ */
+ if (relid == InvalidOid || /* complete reset */
+ entry->relid == InvalidOid || /* negative cache entry */
+ entry->relid == relid) /* individual flushed relation */
+ {
+ if (hash_search(RelfilenumberMapHash,
+ (void *) &entry->key,
+ HASH_REMOVE,
+ NULL) == NULL)
+ elog(ERROR, "hash table corrupted");
+ }
+ }
+}
+
+/*
+ * InitializeRelfilenumberMap
+ * Initialize cache, either on first use or after a reset.
+ */
+static void
+InitializeRelfilenumberMap(void)
+{
+ HASHCTL ctl;
+ int i;
+
+ /* Make sure we've initialized CacheMemoryContext. */
+ if (CacheMemoryContext == NULL)
+ CreateCacheMemoryContext();
+
+ /* build skey */
+ MemSet(&relfilenumber_skey, 0, sizeof(relfilenumber_skey));
+
+ for (i = 0; i < 2; i++)
+ {
+ fmgr_info_cxt(F_OIDEQ,
+ &relfilenumber_skey[i].sk_func,
+ CacheMemoryContext);
+ relfilenumber_skey[i].sk_strategy = BTEqualStrategyNumber;
+ relfilenumber_skey[i].sk_subtype = InvalidOid;
+ relfilenumber_skey[i].sk_collation = InvalidOid;
+ }
+
+ relfilenumber_skey[0].sk_attno = Anum_pg_class_reltablespace;
+ relfilenumber_skey[1].sk_attno = Anum_pg_class_relfilenode;
+
+ /*
+ * Only create the RelfilenumberMapHash now, so we don't end up partially
+ * initialized when fmgr_info_cxt() above ERRORs out with an out of memory
+ * error.
+ */
+ ctl.keysize = sizeof(RelfilenumberMapKey);
+ ctl.entrysize = sizeof(RelfilenumberMapEntry);
+ ctl.hcxt = CacheMemoryContext;
+
+ RelfilenumberMapHash =
+ hash_create("RelfilenumberMap cache", 64, &ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
+ /* Watch for invalidation events. */
+ CacheRegisterRelcacheCallback(RelfilenumberMapInvalidateCallback,
+ (Datum) 0);
+}
+
+/*
+ * Map a relation's (tablespace, relfilenumber) to a relation's oid and cache
+ * the result.
+ *
+ * Returns InvalidOid if no relation matching the criteria could be found.
+ */
+Oid
+RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
+{
+ RelfilenumberMapKey key;
+ RelfilenumberMapEntry *entry;
+ bool found;
+ SysScanDesc scandesc;
+ Relation relation;
+ HeapTuple ntp;
+ ScanKeyData skey[2];
+ Oid relid;
+
+ if (RelfilenumberMapHash == NULL)
+ InitializeRelfilenumberMap();
+
+ /* pg_class will show 0 when the value is actually MyDatabaseTableSpace */
+ if (reltablespace == MyDatabaseTableSpace)
+ reltablespace = 0;
+
+ MemSet(&key, 0, sizeof(key));
+ key.reltablespace = reltablespace;
+ key.relfilenumber = relfilenumber;
+
+ /*
+ * Check cache and return entry if one is found. Even if no target
+ * relation can be found later on, we store the negative match and return
+ * an InvalidOid from cache. That's not really necessary for performance
+ * since querying invalid values isn't supposed to be a frequent thing,
+ * but it's basically free.
+ */
+ entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_FIND, &found);
+
+ if (found)
+ return entry->relid;
+
+ /* ok, no previous cache entry, do it the hard way */
+
+ /* initialize empty/negative cache entry before doing the actual lookups */
+ relid = InvalidOid;
+
+ if (reltablespace == GLOBALTABLESPACE_OID)
+ {
+ /*
+ * Ok, shared table, check relmapper.
+ */
+ relid = RelationMapFilenumberToOid(relfilenumber, true);
+ }
+ else
+ {
+ /*
+ * Not a shared table, could either be a plain relation or a
+ * non-shared, nailed one, like e.g. pg_class.
+ */
+
+ /* check for plain relations by looking in pg_class */
+ relation = table_open(RelationRelationId, AccessShareLock);
+
+ /* copy scankey to local copy, it will be modified during the scan */
+ memcpy(skey, relfilenumber_skey, sizeof(skey));
+
+ /* set scan arguments */
+ skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
+ skey[1].sk_argument = ObjectIdGetDatum(relfilenumber);
+
+ scandesc = systable_beginscan(relation,
+ ClassTblspcRelfilenodeIndexId,
+ true,
+ NULL,
+ 2,
+ skey);
+
+ found = false;
+
+ while (HeapTupleIsValid(ntp = systable_getnext(scandesc)))
+ {
+ Form_pg_class classform = (Form_pg_class) GETSTRUCT(ntp);
+
+ if (found)
+ elog(ERROR,
+ "unexpected duplicate for tablespace %u, relfilenumber %u",
+ reltablespace, relfilenumber);
+ found = true;
+
+ Assert(classform->reltablespace == reltablespace);
+ Assert(classform->relfilenode == relfilenumber);
+ relid = classform->oid;
+ }
+
+ systable_endscan(scandesc);
+ table_close(relation, AccessShareLock);
+
+ /* check for tables that are mapped but not shared */
+ if (!found)
+ relid = RelationMapFilenumberToOid(relfilenumber, false);
+ }
+
+ /*
+ * Only enter the entry into the cache now; our opening of pg_class could have
+ * caused cache invalidations to be executed which would have deleted a
+ * new entry if we had entered it above.
+ */
+ entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_ENTER, &found);
+ if (found)
+ elog(ERROR, "corrupted hashtable");
+ entry->relid = relid;
+
+ return relid;
+}
diff --git a/src/backend/utils/cache/relmapper.c b/src/backend/utils/cache/relmapper.c
index 2a330cf..e2ac0fa 100644
--- a/src/backend/utils/cache/relmapper.c
+++ b/src/backend/utils/cache/relmapper.c
@@ -1,7 +1,7 @@
/*-------------------------------------------------------------------------
*
* relmapper.c
- * Catalog-to-filenode mapping
+ * Catalog-to-filenumber mapping
*
* For most tables, the physical file underlying the table is specified by
* pg_class.relfilenode. However, that obviously won't work for pg_class
@@ -11,7 +11,7 @@
* update other databases' pg_class entries when relocating a shared catalog.
* Therefore, for these special catalogs (henceforth referred to as "mapped
* catalogs") we rely on a separately maintained file that shows the mapping
- * from catalog OIDs to filenode numbers. Each database has a map file for
+ * from catalog OIDs to filenumbers. Each database has a map file for
* its local mapped catalogs, and there is a separate map file for shared
* catalogs. Mapped catalogs have zero in their pg_class.relfilenode entries.
*
@@ -78,8 +78,8 @@
typedef struct RelMapping
{
- Oid mapoid; /* OID of a catalog */
- Oid mapfilenode; /* its filenode number */
+ Oid mapoid; /* OID of a catalog */
+ RelFileNumber mapfilenumber; /* its relfilenumber */
} RelMapping;
typedef struct RelMapFile
@@ -116,7 +116,7 @@ static RelMapFile local_map;
* subtransactions, so one set of transaction-level changes is sufficient.
*
* The active_xxx variables contain updates that are valid in our transaction
- * and should be honored by RelationMapOidToFilenode. The pending_xxx
+ * and should be honored by RelationMapOidToFilenumber. The pending_xxx
* variables contain updates we have been told about that aren't active yet;
* they will become active at the next CommandCounterIncrement. This setup
* lets map updates act similarly to updates of pg_class rows, ie, they
@@ -132,8 +132,8 @@ static RelMapFile pending_local_updates;
/* non-export function prototypes */
-static void apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode,
- bool add_okay);
+static void apply_map_update(RelMapFile *map, Oid relationId,
+ RelFileNumber filenumber, bool add_okay);
static void merge_map_updates(RelMapFile *map, const RelMapFile *updates,
bool add_okay);
static void load_relmap_file(bool shared, bool lock_held);
@@ -146,19 +146,20 @@ static void perform_relmap_update(bool shared, const RelMapFile *updates);
/*
- * RelationMapOidToFilenode
+ * RelationMapOidToFilenumber
*
- * The raison d' etre ... given a relation OID, look up its filenode.
+ * The raison d' etre ... given a relation OID, look up its filenumber.
*
* Although shared and local relation OIDs should never overlap, the caller
* always knows which we need --- so pass that information to avoid useless
* searching.
*
- * Returns InvalidOid if the OID is not known (which should never happen,
- * but the caller is in a better position to report a meaningful error).
+ * Returns InvalidRelFileNumber if the OID is not known (which should never
+ * happen, but the caller is in a better position to report a meaningful
+ * error).
*/
-Oid
-RelationMapOidToFilenode(Oid relationId, bool shared)
+RelFileNumber
+RelationMapOidToFilenumber(Oid relationId, bool shared)
{
const RelMapFile *map;
int32 i;
@@ -170,13 +171,13 @@ RelationMapOidToFilenode(Oid relationId, bool shared)
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return map->mappings[i].mapfilenumber;
}
map = &shared_map;
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return map->mappings[i].mapfilenumber;
}
}
else
@@ -185,33 +186,33 @@ RelationMapOidToFilenode(Oid relationId, bool shared)
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return map->mappings[i].mapfilenumber;
}
map = &local_map;
for (i = 0; i < map->num_mappings; i++)
{
if (relationId == map->mappings[i].mapoid)
- return map->mappings[i].mapfilenode;
+ return map->mappings[i].mapfilenumber;
}
}
- return InvalidOid;
+ return InvalidRelFileNumber;
}
/*
- * RelationMapFilenodeToOid
+ * RelationMapFilenumberToOid
*
* Do the reverse of the normal direction of mapping done in
- * RelationMapOidToFilenode.
+ * RelationMapOidToFilenumber.
*
* This is not supposed to be used during normal running but rather for
* information purposes when looking at the filesystem or xlog.
*
* Returns InvalidOid if the OID is not known; this can easily happen if the
- * relfilenode doesn't pertain to a mapped relation.
+ * relfilenumber doesn't pertain to a mapped relation.
*/
Oid
-RelationMapFilenodeToOid(Oid filenode, bool shared)
+RelationMapFilenumberToOid(RelFileNumber filenumber, bool shared)
{
const RelMapFile *map;
int32 i;
@@ -222,13 +223,13 @@ RelationMapFilenodeToOid(Oid filenode, bool shared)
map = &active_shared_updates;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenumber == map->mappings[i].mapfilenumber)
return map->mappings[i].mapoid;
}
map = &shared_map;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenumber == map->mappings[i].mapfilenumber)
return map->mappings[i].mapoid;
}
}
@@ -237,13 +238,13 @@ RelationMapFilenodeToOid(Oid filenode, bool shared)
map = &active_local_updates;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenumber == map->mappings[i].mapfilenumber)
return map->mappings[i].mapoid;
}
map = &local_map;
for (i = 0; i < map->num_mappings; i++)
{
- if (filenode == map->mappings[i].mapfilenode)
+ if (filenumber == map->mappings[i].mapfilenumber)
return map->mappings[i].mapoid;
}
}
@@ -252,13 +253,13 @@ RelationMapFilenodeToOid(Oid filenode, bool shared)
}
/*
- * RelationMapOidToFilenodeForDatabase
+ * RelationMapOidToFilenumberForDatabase
*
- * Like RelationMapOidToFilenode, but reads the mapping from the indicated
+ * Like RelationMapOidToFilenumber, but reads the mapping from the indicated
* path instead of using the one for the current database.
*/
-Oid
-RelationMapOidToFilenodeForDatabase(char *dbpath, Oid relationId)
+RelFileNumber
+RelationMapOidToFilenumberForDatabase(char *dbpath, Oid relationId)
{
RelMapFile map;
int i;
@@ -270,10 +271,10 @@ RelationMapOidToFilenodeForDatabase(char *dbpath, Oid relationId)
for (i = 0; i < map.num_mappings; i++)
{
if (relationId == map.mappings[i].mapoid)
- return map.mappings[i].mapfilenode;
+ return map.mappings[i].mapfilenumber;
}
- return InvalidOid;
+ return InvalidRelFileNumber;
}
/*
@@ -311,13 +312,13 @@ RelationMapCopy(Oid dbid, Oid tsid, char *srcdbpath, char *dstdbpath)
/*
* RelationMapUpdateMap
*
- * Install a new relfilenode mapping for the specified relation.
+ * Install a new relfilenumber mapping for the specified relation.
*
* If immediate is true (or we're bootstrapping), the mapping is activated
* immediately. Otherwise it is made pending until CommandCounterIncrement.
*/
void
-RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
+RelationMapUpdateMap(Oid relationId, RelFileNumber fileNumber, bool shared,
bool immediate)
{
RelMapFile *map;
@@ -362,7 +363,7 @@ RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
map = &pending_local_updates;
}
}
- apply_map_update(map, relationId, fileNode, true);
+ apply_map_update(map, relationId, fileNumber, true);
}
/*
@@ -375,7 +376,8 @@ RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
* add_okay = false to draw an error if not.
*/
static void
-apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode, bool add_okay)
+apply_map_update(RelMapFile *map, Oid relationId, RelFileNumber fileNumber,
+ bool add_okay)
{
int32 i;
@@ -384,7 +386,7 @@ apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode, bool add_okay)
{
if (relationId == map->mappings[i].mapoid)
{
- map->mappings[i].mapfilenode = fileNode;
+ map->mappings[i].mapfilenumber = fileNumber;
return;
}
}
@@ -396,7 +398,7 @@ apply_map_update(RelMapFile *map, Oid relationId, Oid fileNode, bool add_okay)
if (map->num_mappings >= MAX_MAPPINGS)
elog(ERROR, "ran out of space in relation map");
map->mappings[map->num_mappings].mapoid = relationId;
- map->mappings[map->num_mappings].mapfilenode = fileNode;
+ map->mappings[map->num_mappings].mapfilenumber = fileNumber;
map->num_mappings++;
}
@@ -415,7 +417,7 @@ merge_map_updates(RelMapFile *map, const RelMapFile *updates, bool add_okay)
{
apply_map_update(map,
updates->mappings[i].mapoid,
- updates->mappings[i].mapfilenode,
+ updates->mappings[i].mapfilenumber,
add_okay);
}
}
@@ -983,12 +985,12 @@ write_relmap_file(RelMapFile *newmap, bool write_wal, bool send_sinval,
for (i = 0; i < newmap->num_mappings; i++)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
- rnode.spcNode = tsid;
- rnode.dbNode = dbid;
- rnode.relNode = newmap->mappings[i].mapfilenode;
- RelationPreserveStorage(rnode, false);
+ rlocator.spcOid = tsid;
+ rlocator.dbOid = dbid;
+ rlocator.relNumber = newmap->mappings[i].mapfilenumber;
+ RelationPreserveStorage(rlocator, false);
}
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index c871cb7..6b90e7c 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4803,16 +4803,16 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
bool is_index)
{
PQExpBuffer upgrade_query = createPQExpBuffer();
- PGresult *upgrade_res;
- Oid relfilenode;
- Oid toast_oid;
- Oid toast_relfilenode;
- char relkind;
- Oid toast_index_oid;
- Oid toast_index_relfilenode;
+ PGresult *upgrade_res;
+ RelFileNumber relfilenumber;
+ Oid toast_oid;
+ RelFileNumber toast_relfilenumber;
+ char relkind;
+ Oid toast_index_oid;
+ RelFileNumber toast_index_relfilenumber;
/*
- * Preserve the OID and relfilenode of the table, table's index, table's
+ * Preserve the OID and relfilenumber of the table, table's index, table's
* toast table and toast table's index if any.
*
* One complexity is that the current table definition might not require
@@ -4835,15 +4835,15 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
relkind = *PQgetvalue(upgrade_res, 0, PQfnumber(upgrade_res, "relkind"));
- relfilenode = atooid(PQgetvalue(upgrade_res, 0,
+ relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "relfilenode")));
toast_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "reltoastrelid")));
- toast_relfilenode = atooid(PQgetvalue(upgrade_res, 0,
+ toast_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "toast_relfilenode")));
toast_index_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "indexrelid")));
- toast_index_relfilenode = atooid(PQgetvalue(upgrade_res, 0,
+ toast_index_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "toast_index_relfilenode")));
appendPQExpBufferStr(upgrade_buffer,
@@ -4857,13 +4857,13 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
/*
* Not every relation has storage. Also, in a pre-v12 database,
- * partitioned tables have a relfilenode, which should not be
+ * partitioned tables have a relfilenumber, which should not be
* preserved when upgrading.
*/
- if (OidIsValid(relfilenode) && relkind != RELKIND_PARTITIONED_TABLE)
+ if (RelFileNumberIsValid(relfilenumber) && relkind != RELKIND_PARTITIONED_TABLE)
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
- relfilenode);
+ relfilenumber);
/*
* In a pre-v12 database, partitioned tables might be marked as having
@@ -4877,7 +4877,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
toast_oid);
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('%u'::pg_catalog.oid);\n",
- toast_relfilenode);
+ toast_relfilenumber);
/* every toast table has an index */
appendPQExpBuffer(upgrade_buffer,
@@ -4885,20 +4885,20 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
toast_index_oid);
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
- toast_index_relfilenode);
+ toast_index_relfilenumber);
}
PQclear(upgrade_res);
}
else
{
- /* Preserve the OID and relfilenode of the index */
+ /* Preserve the OID and relfilenumber of the index */
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
pg_class_oid);
appendPQExpBuffer(upgrade_buffer,
"SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
- relfilenode);
+ relfilenumber);
}
appendPQExpBufferChar(upgrade_buffer, '\n');
diff --git a/src/bin/pg_rewind/datapagemap.h b/src/bin/pg_rewind/datapagemap.h
index ae4965f..235b676 100644
--- a/src/bin/pg_rewind/datapagemap.h
+++ b/src/bin/pg_rewind/datapagemap.h
@@ -10,7 +10,7 @@
#define DATAPAGEMAP_H
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
struct datapagemap
{
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 6252931..269ed64 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -56,7 +56,7 @@ static uint32 hash_string_pointer(const char *s);
static filehash_hash *filehash;
static bool isRelDataFile(const char *path);
-static char *datasegpath(RelFileNode rnode, ForkNumber forknum,
+static char *datasegpath(RelFileLocator rlocator, ForkNumber forknum,
BlockNumber segno);
static file_entry_t *insert_filehash_entry(const char *path);
@@ -288,7 +288,7 @@ process_target_file(const char *path, file_type_t type, size_t size,
* hash table!
*/
void
-process_target_wal_block_change(ForkNumber forknum, RelFileNode rnode,
+process_target_wal_block_change(ForkNumber forknum, RelFileLocator rlocator,
BlockNumber blkno)
{
char *path;
@@ -299,7 +299,7 @@ process_target_wal_block_change(ForkNumber forknum, RelFileNode rnode,
segno = blkno / RELSEG_SIZE;
blkno_inseg = blkno % RELSEG_SIZE;
- path = datasegpath(rnode, forknum, segno);
+ path = datasegpath(rlocator, forknum, segno);
entry = lookup_filehash_entry(path);
pfree(path);
@@ -508,7 +508,7 @@ print_filemap(filemap_t *filemap)
static bool
isRelDataFile(const char *path)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
unsigned int segNo;
int nmatch;
bool matched;
@@ -532,32 +532,32 @@ isRelDataFile(const char *path)
*
*----
*/
- rnode.spcNode = InvalidOid;
- rnode.dbNode = InvalidOid;
- rnode.relNode = InvalidOid;
+ rlocator.spcOid = InvalidOid;
+ rlocator.dbOid = InvalidOid;
+ rlocator.relNumber = InvalidRelFileNumber;
segNo = 0;
matched = false;
- nmatch = sscanf(path, "global/%u.%u", &rnode.relNode, &segNo);
+ nmatch = sscanf(path, "global/%u.%u", &rlocator.relNumber, &segNo);
if (nmatch == 1 || nmatch == 2)
{
- rnode.spcNode = GLOBALTABLESPACE_OID;
- rnode.dbNode = 0;
+ rlocator.spcOid = GLOBALTABLESPACE_OID;
+ rlocator.dbOid = 0;
matched = true;
}
else
{
nmatch = sscanf(path, "base/%u/%u.%u",
- &rnode.dbNode, &rnode.relNode, &segNo);
+ &rlocator.dbOid, &rlocator.relNumber, &segNo);
if (nmatch == 2 || nmatch == 3)
{
- rnode.spcNode = DEFAULTTABLESPACE_OID;
+ rlocator.spcOid = DEFAULTTABLESPACE_OID;
matched = true;
}
else
{
nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/%u.%u",
- &rnode.spcNode, &rnode.dbNode, &rnode.relNode,
+ &rlocator.spcOid, &rlocator.dbOid, &rlocator.relNumber,
&segNo);
if (nmatch == 3 || nmatch == 4)
matched = true;
@@ -567,12 +567,12 @@ isRelDataFile(const char *path)
/*
* The sscanf tests above can match files that have extra characters at
* the end. To eliminate such cases, cross-check that GetRelationPath
- * creates the exact same filename, when passed the RelFileNode
+ * creates the exact same filename, when passed the RelFileLocator
* information we extracted from the filename.
*/
if (matched)
{
- char *check_path = datasegpath(rnode, MAIN_FORKNUM, segNo);
+ char *check_path = datasegpath(rlocator, MAIN_FORKNUM, segNo);
if (strcmp(check_path, path) != 0)
matched = false;
@@ -589,12 +589,12 @@ isRelDataFile(const char *path)
* The returned path is palloc'd
*/
static char *
-datasegpath(RelFileNode rnode, ForkNumber forknum, BlockNumber segno)
+datasegpath(RelFileLocator rlocator, ForkNumber forknum, BlockNumber segno)
{
char *path;
char *segpath;
- path = relpathperm(rnode, forknum);
+ path = relpathperm(rlocator, forknum);
if (segno > 0)
{
segpath = psprintf("%s.%u", path, segno);
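For readers following the rename, the cross-check that isRelDataFile() performs can be sketched in isolation: parse a data-file path with sscanf, rebuild the path from the extracted fields, and reject anything that does not reproduce byte-for-byte, since sscanf tolerates trailing junk. This is a standalone sketch with hypothetical names (DemoLocator, demo_is_rel_data_file), covering only the base/ case, not the patch's code:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the fields the patch calls RelFileLocator. */
typedef struct
{
	unsigned	spcOid;
	unsigned	dbOid;
	unsigned	relNumber;
} DemoLocator;

/*
 * Parse "base/<dbOid>/<relNumber>[.<segno>]", then rebuild the path with
 * the same formatter and compare, rejecting names with extra characters
 * (e.g. fork suffixes like "_fsm") that sscanf alone would accept.
 */
int
demo_is_rel_data_file(const char *path, DemoLocator *loc, unsigned *segno)
{
	char		check[128];
	int			nmatch;

	loc->spcOid = 1663;			/* DEFAULTTABLESPACE_OID, for the demo */
	*segno = 0;
	nmatch = sscanf(path, "base/%u/%u.%u",
					&loc->dbOid, &loc->relNumber, segno);
	if (nmatch != 2 && nmatch != 3)
		return 0;
	if (*segno > 0)
		snprintf(check, sizeof(check), "base/%u/%u.%u",
				 loc->dbOid, loc->relNumber, *segno);
	else
		snprintf(check, sizeof(check), "base/%u/%u",
				 loc->dbOid, loc->relNumber);
	return strcmp(check, path) == 0;
}
```

The round trip is what makes the match exact: "base/5/16384_fsm" parses as far as the relNumber, but the rebuilt path no longer equals the input, so it is rejected.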
diff --git a/src/bin/pg_rewind/filemap.h b/src/bin/pg_rewind/filemap.h
index 096f57a..0e011fb 100644
--- a/src/bin/pg_rewind/filemap.h
+++ b/src/bin/pg_rewind/filemap.h
@@ -10,7 +10,7 @@
#include "datapagemap.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/* these enum values are sorted in the order we want actions to be processed */
typedef enum
@@ -103,7 +103,7 @@ extern void process_source_file(const char *path, file_type_t type,
extern void process_target_file(const char *path, file_type_t type,
size_t size, const char *link_target);
extern void process_target_wal_block_change(ForkNumber forknum,
- RelFileNode rnode,
+ RelFileLocator rlocator,
BlockNumber blkno);
extern filemap_t *decide_file_actions(void);
diff --git a/src/bin/pg_rewind/parsexlog.c b/src/bin/pg_rewind/parsexlog.c
index c6792da..d97240e 100644
--- a/src/bin/pg_rewind/parsexlog.c
+++ b/src/bin/pg_rewind/parsexlog.c
@@ -445,18 +445,18 @@ extractPageInfo(XLogReaderState *record)
for (block_id = 0; block_id <= XLogRecMaxBlockId(record); block_id++)
{
- RelFileNode rnode;
- ForkNumber forknum;
- BlockNumber blkno;
+ RelFileLocator rlocator;
+ ForkNumber forknum;
+ BlockNumber blkno;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blkno, NULL))
+ &rlocator, &forknum, &blkno, NULL))
continue;
/* We only care about the main fork; others are copied in toto */
if (forknum != MAIN_FORKNUM)
continue;
- process_target_wal_block_change(forknum, rnode, blkno);
+ process_target_wal_block_change(forknum, rlocator, blkno);
}
}
diff --git a/src/bin/pg_rewind/pg_rewind.h b/src/bin/pg_rewind/pg_rewind.h
index 393182f..8b4b50a 100644
--- a/src/bin/pg_rewind/pg_rewind.h
+++ b/src/bin/pg_rewind/pg_rewind.h
@@ -16,7 +16,7 @@
#include "datapagemap.h"
#include "libpq-fe.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/* Configuration options */
extern char *datadir_target;
diff --git a/src/bin/pg_upgrade/Makefile b/src/bin/pg_upgrade/Makefile
index 587793e..7f8042f 100644
--- a/src/bin/pg_upgrade/Makefile
+++ b/src/bin/pg_upgrade/Makefile
@@ -19,7 +19,7 @@ OBJS = \
option.o \
parallel.o \
pg_upgrade.o \
- relfilenode.o \
+ relfilenumber.o \
server.o \
tablespace.o \
util.o \
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 36b0670..5d30b87 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -190,9 +190,9 @@ create_rel_filename_map(const char *old_data, const char *new_data,
map->new_tablespace_suffix = new_cluster.tablespace_suffix;
}
- /* DB oid and relfilenodes are preserved between old and new cluster */
+ /* DB oid and relfilenumbers are preserved between old and new cluster */
map->db_oid = old_db->db_oid;
- map->relfilenode = old_rel->relfilenode;
+ map->relfilenumber = old_rel->relfilenumber;
/* used only for logging and error reporting, old/new are identical */
map->nspname = old_rel->nspname;
@@ -399,7 +399,7 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
i_reloid,
i_indtable,
i_toastheap,
- i_relfilenode,
+ i_relfilenumber,
i_reltablespace;
char query[QUERY_ALLOC];
char *last_namespace = NULL,
@@ -495,7 +495,7 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
i_toastheap = PQfnumber(res, "toastheap");
i_nspname = PQfnumber(res, "nspname");
i_relname = PQfnumber(res, "relname");
- i_relfilenode = PQfnumber(res, "relfilenode");
+ i_relfilenumber = PQfnumber(res, "relfilenode");
i_reltablespace = PQfnumber(res, "reltablespace");
i_spclocation = PQfnumber(res, "spclocation");
@@ -527,7 +527,7 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
relname = PQgetvalue(res, relnum, i_relname);
curr->relname = pg_strdup(relname);
- curr->relfilenode = atooid(PQgetvalue(res, relnum, i_relfilenode));
+ curr->relfilenumber = atooid(PQgetvalue(res, relnum, i_relfilenumber));
curr->tblsp_alloc = false;
/* Is the tablespace oid non-default? */
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 55de244..30c3ee6 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -132,15 +132,15 @@ extern char *output_files[];
typedef struct
{
/* Can't use NAMEDATALEN; not guaranteed to be same on client */
- char *nspname; /* namespace name */
- char *relname; /* relation name */
- Oid reloid; /* relation OID */
- Oid relfilenode; /* relation file node */
- Oid indtable; /* if index, OID of its table, else 0 */
- Oid toastheap; /* if toast table, OID of base table, else 0 */
- char *tablespace; /* tablespace path; "" for cluster default */
- bool nsp_alloc; /* should nspname be freed? */
- bool tblsp_alloc; /* should tablespace be freed? */
+ char *nspname; /* namespace name */
+ char *relname; /* relation name */
+ Oid reloid; /* relation OID */
+ RelFileNumber relfilenumber; /* relation file number */
+ Oid indtable; /* if index, OID of its table, else 0 */
+ Oid toastheap; /* if toast table, OID of base table, else 0 */
+ char *tablespace; /* tablespace path; "" for cluster default */
+ bool nsp_alloc; /* should nspname be freed? */
+ bool tblsp_alloc; /* should tablespace be freed? */
} RelInfo;
typedef struct
@@ -159,7 +159,7 @@ typedef struct
const char *old_tablespace_suffix;
const char *new_tablespace_suffix;
Oid db_oid;
- Oid relfilenode;
+ RelFileNumber relfilenumber;
/* the rest are used only for logging and error reporting */
char *nspname; /* namespaces */
char *relname;
@@ -400,7 +400,7 @@ void parseCommandLine(int argc, char *argv[]);
void adjust_data_dir(ClusterInfo *cluster);
void get_sock_dir(ClusterInfo *cluster, bool live_check);
-/* relfilenode.c */
+/* relfilenumber.c */
void transfer_all_new_tablespaces(DbInfoArr *old_db_arr,
DbInfoArr *new_db_arr, char *old_pgdata, char *new_pgdata);
diff --git a/src/bin/pg_upgrade/relfilenode.c b/src/bin/pg_upgrade/relfilenode.c
deleted file mode 100644
index d23ac88..0000000
--- a/src/bin/pg_upgrade/relfilenode.c
+++ /dev/null
@@ -1,259 +0,0 @@
-/*
- * relfilenode.c
- *
- * relfilenode functions
- *
- * Copyright (c) 2010-2022, PostgreSQL Global Development Group
- * src/bin/pg_upgrade/relfilenode.c
- */
-
-#include "postgres_fe.h"
-
-#include <sys/stat.h>
-
-#include "access/transam.h"
-#include "catalog/pg_class_d.h"
-#include "pg_upgrade.h"
-
-static void transfer_single_new_db(FileNameMap *maps, int size, char *old_tablespace);
-static void transfer_relfile(FileNameMap *map, const char *suffix, bool vm_must_add_frozenbit);
-
-
-/*
- * transfer_all_new_tablespaces()
- *
- * Responsible for upgrading all database. invokes routines to generate mappings and then
- * physically link the databases.
- */
-void
-transfer_all_new_tablespaces(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
- char *old_pgdata, char *new_pgdata)
-{
- switch (user_opts.transfer_mode)
- {
- case TRANSFER_MODE_CLONE:
- prep_status_progress("Cloning user relation files");
- break;
- case TRANSFER_MODE_COPY:
- prep_status_progress("Copying user relation files");
- break;
- case TRANSFER_MODE_LINK:
- prep_status_progress("Linking user relation files");
- break;
- }
-
- /*
- * Transferring files by tablespace is tricky because a single database
- * can use multiple tablespaces. For non-parallel mode, we just pass a
- * NULL tablespace path, which matches all tablespaces. In parallel mode,
- * we pass the default tablespace and all user-created tablespaces and let
- * those operations happen in parallel.
- */
- if (user_opts.jobs <= 1)
- parallel_transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata,
- new_pgdata, NULL);
- else
- {
- int tblnum;
-
- /* transfer default tablespace */
- parallel_transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata,
- new_pgdata, old_pgdata);
-
- for (tblnum = 0; tblnum < os_info.num_old_tablespaces; tblnum++)
- parallel_transfer_all_new_dbs(old_db_arr,
- new_db_arr,
- old_pgdata,
- new_pgdata,
- os_info.old_tablespaces[tblnum]);
- /* reap all children */
- while (reap_child(true) == true)
- ;
- }
-
- end_progress_output();
- check_ok();
-}
-
-
-/*
- * transfer_all_new_dbs()
- *
- * Responsible for upgrading all database. invokes routines to generate mappings and then
- * physically link the databases.
- */
-void
-transfer_all_new_dbs(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
- char *old_pgdata, char *new_pgdata, char *old_tablespace)
-{
- int old_dbnum,
- new_dbnum;
-
- /* Scan the old cluster databases and transfer their files */
- for (old_dbnum = new_dbnum = 0;
- old_dbnum < old_db_arr->ndbs;
- old_dbnum++, new_dbnum++)
- {
- DbInfo *old_db = &old_db_arr->dbs[old_dbnum],
- *new_db = NULL;
- FileNameMap *mappings;
- int n_maps;
-
- /*
- * Advance past any databases that exist in the new cluster but not in
- * the old, e.g. "postgres". (The user might have removed the
- * 'postgres' database from the old cluster.)
- */
- for (; new_dbnum < new_db_arr->ndbs; new_dbnum++)
- {
- new_db = &new_db_arr->dbs[new_dbnum];
- if (strcmp(old_db->db_name, new_db->db_name) == 0)
- break;
- }
-
- if (new_dbnum >= new_db_arr->ndbs)
- pg_fatal("old database \"%s\" not found in the new cluster\n",
- old_db->db_name);
-
- mappings = gen_db_file_maps(old_db, new_db, &n_maps, old_pgdata,
- new_pgdata);
- if (n_maps)
- {
- transfer_single_new_db(mappings, n_maps, old_tablespace);
- }
- /* We allocate something even for n_maps == 0 */
- pg_free(mappings);
- }
-}
-
-/*
- * transfer_single_new_db()
- *
- * create links for mappings stored in "maps" array.
- */
-static void
-transfer_single_new_db(FileNameMap *maps, int size, char *old_tablespace)
-{
- int mapnum;
- bool vm_must_add_frozenbit = false;
-
- /*
- * Do we need to rewrite visibilitymap?
- */
- if (old_cluster.controldata.cat_ver < VISIBILITY_MAP_FROZEN_BIT_CAT_VER &&
- new_cluster.controldata.cat_ver >= VISIBILITY_MAP_FROZEN_BIT_CAT_VER)
- vm_must_add_frozenbit = true;
-
- for (mapnum = 0; mapnum < size; mapnum++)
- {
- if (old_tablespace == NULL ||
- strcmp(maps[mapnum].old_tablespace, old_tablespace) == 0)
- {
- /* transfer primary file */
- transfer_relfile(&maps[mapnum], "", vm_must_add_frozenbit);
-
- /*
- * Copy/link any fsm and vm files, if they exist
- */
- transfer_relfile(&maps[mapnum], "_fsm", vm_must_add_frozenbit);
- transfer_relfile(&maps[mapnum], "_vm", vm_must_add_frozenbit);
- }
- }
-}
-
-
-/*
- * transfer_relfile()
- *
- * Copy or link file from old cluster to new one. If vm_must_add_frozenbit
- * is true, visibility map forks are converted and rewritten, even in link
- * mode.
- */
-static void
-transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_frozenbit)
-{
- char old_file[MAXPGPATH];
- char new_file[MAXPGPATH];
- int segno;
- char extent_suffix[65];
- struct stat statbuf;
-
- /*
- * Now copy/link any related segments as well. Remember, PG breaks large
- * files into 1GB segments, the first segment has no extension, subsequent
- * segments are named relfilenode.1, relfilenode.2, relfilenode.3.
- */
- for (segno = 0;; segno++)
- {
- if (segno == 0)
- extent_suffix[0] = '\0';
- else
- snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
-
- snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
- map->old_tablespace,
- map->old_tablespace_suffix,
- map->db_oid,
- map->relfilenode,
- type_suffix,
- extent_suffix);
- snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
- map->new_tablespace,
- map->new_tablespace_suffix,
- map->db_oid,
- map->relfilenode,
- type_suffix,
- extent_suffix);
-
- /* Is it an extent, fsm, or vm file? */
- if (type_suffix[0] != '\0' || segno != 0)
- {
- /* Did file open fail? */
- if (stat(old_file, &statbuf) != 0)
- {
- /* File does not exist? That's OK, just return */
- if (errno == ENOENT)
- return;
- else
- pg_fatal("error while checking for file existence \"%s.%s\" (\"%s\" to \"%s\"): %s\n",
- map->nspname, map->relname, old_file, new_file,
- strerror(errno));
- }
-
- /* If file is empty, just return */
- if (statbuf.st_size == 0)
- return;
- }
-
- unlink(new_file);
-
- /* Copying files might take some time, so give feedback. */
- pg_log(PG_STATUS, "%s", old_file);
-
- if (vm_must_add_frozenbit && strcmp(type_suffix, "_vm") == 0)
- {
- /* Need to rewrite visibility map format */
- pg_log(PG_VERBOSE, "rewriting \"%s\" to \"%s\"\n",
- old_file, new_file);
- rewriteVisibilityMap(old_file, new_file, map->nspname, map->relname);
- }
- else
- switch (user_opts.transfer_mode)
- {
- case TRANSFER_MODE_CLONE:
- pg_log(PG_VERBOSE, "cloning \"%s\" to \"%s\"\n",
- old_file, new_file);
- cloneFile(old_file, new_file, map->nspname, map->relname);
- break;
- case TRANSFER_MODE_COPY:
- pg_log(PG_VERBOSE, "copying \"%s\" to \"%s\"\n",
- old_file, new_file);
- copyFile(old_file, new_file, map->nspname, map->relname);
- break;
- case TRANSFER_MODE_LINK:
- pg_log(PG_VERBOSE, "linking \"%s\" to \"%s\"\n",
- old_file, new_file);
- linkFile(old_file, new_file, map->nspname, map->relname);
- }
- }
-}
diff --git a/src/bin/pg_upgrade/relfilenumber.c b/src/bin/pg_upgrade/relfilenumber.c
new file mode 100644
index 0000000..b3ad820
--- /dev/null
+++ b/src/bin/pg_upgrade/relfilenumber.c
@@ -0,0 +1,259 @@
+/*
+ * relfilenumber.c
+ *
+ * relfilenumber functions
+ *
+ * Copyright (c) 2010-2022, PostgreSQL Global Development Group
+ * src/bin/pg_upgrade/relfilenumber.c
+ */
+
+#include "postgres_fe.h"
+
+#include <sys/stat.h>
+
+#include "access/transam.h"
+#include "catalog/pg_class_d.h"
+#include "pg_upgrade.h"
+
+static void transfer_single_new_db(FileNameMap *maps, int size, char *old_tablespace);
+static void transfer_relfile(FileNameMap *map, const char *suffix, bool vm_must_add_frozenbit);
+
+
+/*
+ * transfer_all_new_tablespaces()
+ *
+ * Responsible for upgrading all databases. Invokes routines to generate mappings and then
+ * physically links the databases.
+ */
+void
+transfer_all_new_tablespaces(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
+ char *old_pgdata, char *new_pgdata)
+{
+ switch (user_opts.transfer_mode)
+ {
+ case TRANSFER_MODE_CLONE:
+ prep_status_progress("Cloning user relation files");
+ break;
+ case TRANSFER_MODE_COPY:
+ prep_status_progress("Copying user relation files");
+ break;
+ case TRANSFER_MODE_LINK:
+ prep_status_progress("Linking user relation files");
+ break;
+ }
+
+ /*
+ * Transferring files by tablespace is tricky because a single database
+ * can use multiple tablespaces. For non-parallel mode, we just pass a
+ * NULL tablespace path, which matches all tablespaces. In parallel mode,
+ * we pass the default tablespace and all user-created tablespaces and let
+ * those operations happen in parallel.
+ */
+ if (user_opts.jobs <= 1)
+ parallel_transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata,
+ new_pgdata, NULL);
+ else
+ {
+ int tblnum;
+
+ /* transfer default tablespace */
+ parallel_transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata,
+ new_pgdata, old_pgdata);
+
+ for (tblnum = 0; tblnum < os_info.num_old_tablespaces; tblnum++)
+ parallel_transfer_all_new_dbs(old_db_arr,
+ new_db_arr,
+ old_pgdata,
+ new_pgdata,
+ os_info.old_tablespaces[tblnum]);
+ /* reap all children */
+ while (reap_child(true) == true)
+ ;
+ }
+
+ end_progress_output();
+ check_ok();
+}
+
+
+/*
+ * transfer_all_new_dbs()
+ *
+ * Responsible for upgrading all databases. Invokes routines to generate mappings and then
+ * physically links the databases.
+ */
+void
+transfer_all_new_dbs(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
+ char *old_pgdata, char *new_pgdata, char *old_tablespace)
+{
+ int old_dbnum,
+ new_dbnum;
+
+ /* Scan the old cluster databases and transfer their files */
+ for (old_dbnum = new_dbnum = 0;
+ old_dbnum < old_db_arr->ndbs;
+ old_dbnum++, new_dbnum++)
+ {
+ DbInfo *old_db = &old_db_arr->dbs[old_dbnum],
+ *new_db = NULL;
+ FileNameMap *mappings;
+ int n_maps;
+
+ /*
+ * Advance past any databases that exist in the new cluster but not in
+ * the old, e.g. "postgres". (The user might have removed the
+ * 'postgres' database from the old cluster.)
+ */
+ for (; new_dbnum < new_db_arr->ndbs; new_dbnum++)
+ {
+ new_db = &new_db_arr->dbs[new_dbnum];
+ if (strcmp(old_db->db_name, new_db->db_name) == 0)
+ break;
+ }
+
+ if (new_dbnum >= new_db_arr->ndbs)
+ pg_fatal("old database \"%s\" not found in the new cluster\n",
+ old_db->db_name);
+
+ mappings = gen_db_file_maps(old_db, new_db, &n_maps, old_pgdata,
+ new_pgdata);
+ if (n_maps)
+ {
+ transfer_single_new_db(mappings, n_maps, old_tablespace);
+ }
+ /* We allocate something even for n_maps == 0 */
+ pg_free(mappings);
+ }
+}
+
+/*
+ * transfer_single_new_db()
+ *
+ * create links for mappings stored in "maps" array.
+ */
+static void
+transfer_single_new_db(FileNameMap *maps, int size, char *old_tablespace)
+{
+ int mapnum;
+ bool vm_must_add_frozenbit = false;
+
+ /*
+ * Do we need to rewrite visibilitymap?
+ */
+ if (old_cluster.controldata.cat_ver < VISIBILITY_MAP_FROZEN_BIT_CAT_VER &&
+ new_cluster.controldata.cat_ver >= VISIBILITY_MAP_FROZEN_BIT_CAT_VER)
+ vm_must_add_frozenbit = true;
+
+ for (mapnum = 0; mapnum < size; mapnum++)
+ {
+ if (old_tablespace == NULL ||
+ strcmp(maps[mapnum].old_tablespace, old_tablespace) == 0)
+ {
+ /* transfer primary file */
+ transfer_relfile(&maps[mapnum], "", vm_must_add_frozenbit);
+
+ /*
+ * Copy/link any fsm and vm files, if they exist
+ */
+ transfer_relfile(&maps[mapnum], "_fsm", vm_must_add_frozenbit);
+ transfer_relfile(&maps[mapnum], "_vm", vm_must_add_frozenbit);
+ }
+ }
+}
+
+
+/*
+ * transfer_relfile()
+ *
+ * Copy or link file from old cluster to new one. If vm_must_add_frozenbit
+ * is true, visibility map forks are converted and rewritten, even in link
+ * mode.
+ */
+static void
+transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_frozenbit)
+{
+ char old_file[MAXPGPATH];
+ char new_file[MAXPGPATH];
+ int segno;
+ char extent_suffix[65];
+ struct stat statbuf;
+
+ /*
+ * Now copy/link any related segments as well. Remember, PG breaks large
+ * files into 1GB segments, the first segment has no extension, subsequent
+ * segments are named relfilenumber.1, relfilenumber.2, relfilenumber.3.
+ */
+ for (segno = 0;; segno++)
+ {
+ if (segno == 0)
+ extent_suffix[0] = '\0';
+ else
+ snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
+
+ snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
+ map->old_tablespace,
+ map->old_tablespace_suffix,
+ map->db_oid,
+ map->relfilenumber,
+ type_suffix,
+ extent_suffix);
+ snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
+ map->new_tablespace,
+ map->new_tablespace_suffix,
+ map->db_oid,
+ map->relfilenumber,
+ type_suffix,
+ extent_suffix);
+
+ /* Is it an extent, fsm, or vm file? */
+ if (type_suffix[0] != '\0' || segno != 0)
+ {
+ /* Did file open fail? */
+ if (stat(old_file, &statbuf) != 0)
+ {
+ /* File does not exist? That's OK, just return */
+ if (errno == ENOENT)
+ return;
+ else
+ pg_fatal("error while checking for file existence \"%s.%s\" (\"%s\" to \"%s\"): %s\n",
+ map->nspname, map->relname, old_file, new_file,
+ strerror(errno));
+ }
+
+ /* If file is empty, just return */
+ if (statbuf.st_size == 0)
+ return;
+ }
+
+ unlink(new_file);
+
+ /* Copying files might take some time, so give feedback. */
+ pg_log(PG_STATUS, "%s", old_file);
+
+ if (vm_must_add_frozenbit && strcmp(type_suffix, "_vm") == 0)
+ {
+ /* Need to rewrite visibility map format */
+ pg_log(PG_VERBOSE, "rewriting \"%s\" to \"%s\"\n",
+ old_file, new_file);
+ rewriteVisibilityMap(old_file, new_file, map->nspname, map->relname);
+ }
+ else
+ switch (user_opts.transfer_mode)
+ {
+ case TRANSFER_MODE_CLONE:
+ pg_log(PG_VERBOSE, "cloning \"%s\" to \"%s\"\n",
+ old_file, new_file);
+ cloneFile(old_file, new_file, map->nspname, map->relname);
+ break;
+ case TRANSFER_MODE_COPY:
+ pg_log(PG_VERBOSE, "copying \"%s\" to \"%s\"\n",
+ old_file, new_file);
+ copyFile(old_file, new_file, map->nspname, map->relname);
+ break;
+ case TRANSFER_MODE_LINK:
+ pg_log(PG_VERBOSE, "linking \"%s\" to \"%s\"\n",
+ old_file, new_file);
+ linkFile(old_file, new_file, map->nspname, map->relname);
+ }
+ }
+}
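The segment-naming convention that transfer_relfile() loops over can be shown as a tiny sketch: segment 0 of a relation file has no extension, and each subsequent 1GB segment appends ".1", ".2", and so on, after any fork suffix such as "_fsm" or "_vm". The function name and the fixed "base/" prefix below are illustrative assumptions, not the patch's API:

```c
#include <stdio.h>

/*
 * Build a relation segment file name the way transfer_relfile() does:
 * "<dir>/<dbOid>/<relNumber><fork suffix><extent suffix>", where the
 * extent suffix is empty for segment 0 and ".<segno>" afterwards.
 */
void
demo_relfile_name(char *dst, size_t dstlen,
				  unsigned dbOid, unsigned relNumber,
				  const char *type_suffix, int segno)
{
	char		extent_suffix[65];

	if (segno == 0)
		extent_suffix[0] = '\0';
	else
		snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);

	snprintf(dst, dstlen, "base/%u/%u%s%s",
			 dbOid, relNumber, type_suffix, extent_suffix);
}
```

So the main fork of relfilenumber 16384 in database 5 is "base/5/16384", its second visibility-map segment is "base/5/16384_vm.2", and the transfer loop simply counts segno upward until stat() reports ENOENT.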
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 5dc6010..6528113 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -37,7 +37,7 @@ static const char *progname;
static int WalSegSz;
static volatile sig_atomic_t time_to_stop = false;
-static const RelFileNode emptyRelFileNode = {0, 0, 0};
+static const RelFileLocator emptyRelFileLocator = {0, 0, 0};
typedef struct XLogDumpPrivate
{
@@ -63,7 +63,7 @@ typedef struct XLogDumpConfig
bool filter_by_rmgr_enabled;
TransactionId filter_by_xid;
bool filter_by_xid_enabled;
- RelFileNode filter_by_relation;
+ RelFileLocator filter_by_relation;
bool filter_by_extended;
bool filter_by_relation_enabled;
BlockNumber filter_by_relation_block;
@@ -393,7 +393,7 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
*/
static bool
XLogRecordMatchesRelationBlock(XLogReaderState *record,
- RelFileNode matchRnode,
+ RelFileLocator matchRlocator,
BlockNumber matchBlock,
ForkNumber matchFork)
{
@@ -401,17 +401,17 @@ XLogRecordMatchesRelationBlock(XLogReaderState *record,
for (block_id = 0; block_id <= XLogRecMaxBlockId(record); block_id++)
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blk;
if (!XLogRecGetBlockTagExtended(record, block_id,
- &rnode, &forknum, &blk, NULL))
+ &rlocator, &forknum, &blk, NULL))
continue;
if ((matchFork == InvalidForkNumber || matchFork == forknum) &&
- (RelFileNodeEquals(matchRnode, emptyRelFileNode) ||
- RelFileNodeEquals(matchRnode, rnode)) &&
+ (RelFileLocatorEquals(matchRlocator, emptyRelFileLocator) ||
+ RelFileLocatorEquals(matchRlocator, rlocator)) &&
(matchBlock == InvalidBlockNumber || matchBlock == blk))
return true;
}
@@ -885,11 +885,11 @@ main(int argc, char **argv)
break;
case 'R':
if (sscanf(optarg, "%u/%u/%u",
- &config.filter_by_relation.spcNode,
- &config.filter_by_relation.dbNode,
- &config.filter_by_relation.relNode) != 3 ||
- !OidIsValid(config.filter_by_relation.spcNode) ||
- !OidIsValid(config.filter_by_relation.relNode))
+ &config.filter_by_relation.spcOid,
+ &config.filter_by_relation.dbOid,
+ &config.filter_by_relation.relNumber) != 3 ||
+ !OidIsValid(config.filter_by_relation.spcOid) ||
+ !RelFileNumberIsValid(config.filter_by_relation.relNumber))
{
pg_log_error("invalid relation specification: \"%s\"", optarg);
pg_log_error_detail("Expecting \"tablespace OID/database OID/relation filenode\".");
@@ -1132,7 +1132,7 @@ main(int argc, char **argv)
!XLogRecordMatchesRelationBlock(xlogreader_state,
config.filter_by_relation_enabled ?
config.filter_by_relation :
- emptyRelFileNode,
+ emptyRelFileLocator,
config.filter_by_relation_block_enabled ?
config.filter_by_relation_block :
InvalidBlockNumber,
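The pg_waldump hunk above tightens the -R option parsing: all three numbers must parse, and both the tablespace OID and the relation file number must be valid (nonzero), while a zero database OID stays legal because shared relations have dbOid 0. A minimal sketch of that check, with a hypothetical function name and plain unsigned fields instead of the real Oid/RelFileNumber types:

```c
#include <stdio.h>

/*
 * Parse a "spcOid/dbOid/relNumber" relation specification the way the
 * patched pg_waldump option handling does.  Returns 1 on success; 0 if
 * the string is malformed or if the tablespace OID or relation number
 * is 0 (InvalidOid / InvalidRelFileNumber are both represented as 0).
 */
int
demo_parse_relation_spec(const char *arg,
						 unsigned *spcOid, unsigned *dbOid,
						 unsigned *relNumber)
{
	if (sscanf(arg, "%u/%u/%u", spcOid, dbOid, relNumber) != 3)
		return 0;
	return *spcOid != 0 && *relNumber != 0;
}
```

Note the asymmetry: "1664/0/1262" (a shared catalog) passes, but "0/5/16384" and a truncated "1663/5" both fail.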
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 636c96e..1b6b620 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -107,24 +107,24 @@ forkname_chars(const char *str, ForkNumber *fork)
* XXX this must agree with GetRelationPath()!
*/
char *
-GetDatabasePath(Oid dbNode, Oid spcNode)
+GetDatabasePath(Oid dbOid, Oid spcOid)
{
- if (spcNode == GLOBALTABLESPACE_OID)
+ if (spcOid == GLOBALTABLESPACE_OID)
{
/* Shared system relations live in {datadir}/global */
- Assert(dbNode == 0);
+ Assert(dbOid == 0);
return pstrdup("global");
}
- else if (spcNode == DEFAULTTABLESPACE_OID)
+ else if (spcOid == DEFAULTTABLESPACE_OID)
{
/* The default tablespace is {datadir}/base */
- return psprintf("base/%u", dbNode);
+ return psprintf("base/%u", dbOid);
}
else
{
/* All other tablespaces are accessed via symlinks */
return psprintf("pg_tblspc/%u/%s/%u",
- spcNode, TABLESPACE_VERSION_DIRECTORY, dbNode);
+ spcOid, TABLESPACE_VERSION_DIRECTORY, dbOid);
}
}
@@ -138,44 +138,44 @@ GetDatabasePath(Oid dbNode, Oid spcNode)
* the trouble considering BackendId is just int anyway.
*/
char *
-GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
+GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
int backendId, ForkNumber forkNumber)
{
char *path;
- if (spcNode == GLOBALTABLESPACE_OID)
+ if (spcOid == GLOBALTABLESPACE_OID)
{
/* Shared system relations live in {datadir}/global */
- Assert(dbNode == 0);
+ Assert(dbOid == 0);
Assert(backendId == InvalidBackendId);
if (forkNumber != MAIN_FORKNUM)
path = psprintf("global/%u_%s",
- relNode, forkNames[forkNumber]);
+ relNumber, forkNames[forkNumber]);
else
- path = psprintf("global/%u", relNode);
+ path = psprintf("global/%u", relNumber);
}
- else if (spcNode == DEFAULTTABLESPACE_OID)
+ else if (spcOid == DEFAULTTABLESPACE_OID)
{
/* The default tablespace is {datadir}/base */
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/%u_%s",
- dbNode, relNode,
+ dbOid, relNumber,
forkNames[forkNumber]);
else
path = psprintf("base/%u/%u",
- dbNode, relNode);
+ dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/t%d_%u_%s",
- dbNode, backendId, relNode,
+ dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
path = psprintf("base/%u/t%d_%u",
- dbNode, backendId, relNode);
+ dbOid, backendId, relNumber);
}
}
else
@@ -185,25 +185,25 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
{
if (forkNumber != MAIN_FORKNUM)
path = psprintf("pg_tblspc/%u/%s/%u/%u_%s",
- spcNode, TABLESPACE_VERSION_DIRECTORY,
- dbNode, relNode,
+ spcOid, TABLESPACE_VERSION_DIRECTORY,
+ dbOid, relNumber,
forkNames[forkNumber]);
else
path = psprintf("pg_tblspc/%u/%s/%u/%u",
- spcNode, TABLESPACE_VERSION_DIRECTORY,
- dbNode, relNode);
+ spcOid, TABLESPACE_VERSION_DIRECTORY,
+ dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u_%s",
- spcNode, TABLESPACE_VERSION_DIRECTORY,
- dbNode, backendId, relNode,
+ spcOid, TABLESPACE_VERSION_DIRECTORY,
+ dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u",
- spcNode, TABLESPACE_VERSION_DIRECTORY,
- dbNode, backendId, relNode);
+ spcOid, TABLESPACE_VERSION_DIRECTORY,
+ dbOid, backendId, relNumber);
}
}
return path;
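The three layouts that GetRelationPath() chooses between can be condensed into a sketch for a permanent relation's main fork: shared relations live directly in global/, default-tablespace relations under base/&lt;dbOid&gt;/, and everything else behind a pg_tblspc symlink. This simplification drops the backend-local ("t&lt;backendId&gt;_") and non-main-fork branches, and uses a placeholder for TABLESPACE_VERSION_DIRECTORY:

```c
#include <stdio.h>

#define DEMO_DEFAULTTABLESPACE_OID	1663
#define DEMO_GLOBALTABLESPACE_OID	1664

/*
 * Simplified GetRelationPath(): pick the directory layout from the
 * tablespace OID.  "PG_VERSION_DIR" stands in for the real
 * TABLESPACE_VERSION_DIRECTORY string (e.g. "PG_15_202209061").
 */
char *
demo_relation_path(char *buf, size_t buflen,
				   unsigned dbOid, unsigned spcOid, unsigned relNumber)
{
	if (spcOid == DEMO_GLOBALTABLESPACE_OID)
		snprintf(buf, buflen, "global/%u", relNumber);
	else if (spcOid == DEMO_DEFAULTTABLESPACE_OID)
		snprintf(buf, buflen, "base/%u/%u", dbOid, relNumber);
	else
		snprintf(buf, buflen, "pg_tblspc/%u/PG_VERSION_DIR/%u/%u",
				 spcOid, dbOid, relNumber);
	return buf;
}
```

The rename leaves all three formats untouched; only the field names feeding them change from spcNode/dbNode/relNode to spcOid/dbOid/relNumber.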
diff --git a/src/include/access/brin_xlog.h b/src/include/access/brin_xlog.h
index 95bfc7e..012a9af 100644
--- a/src/include/access/brin_xlog.h
+++ b/src/include/access/brin_xlog.h
@@ -18,7 +18,7 @@
#include "lib/stringinfo.h"
#include "storage/bufpage.h"
#include "storage/itemptr.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
diff --git a/src/include/access/ginxlog.h b/src/include/access/ginxlog.h
index 21de389..7f98503 100644
--- a/src/include/access/ginxlog.h
+++ b/src/include/access/ginxlog.h
@@ -110,7 +110,7 @@ typedef struct
typedef struct ginxlogSplit
{
- RelFileNode node;
+ RelFileLocator locator;
BlockNumber rrlink; /* right link, or root's blocknumber if root
* split */
BlockNumber leftChildBlkno; /* valid on a non-leaf split */
@@ -167,7 +167,7 @@ typedef struct ginxlogDeletePage
*/
typedef struct ginxlogUpdateMeta
{
- RelFileNode node;
+ RelFileLocator locator;
GinMetaPageData metadata;
BlockNumber prevTail;
BlockNumber newRightlink;
diff --git a/src/include/access/gistxlog.h b/src/include/access/gistxlog.h
index 4537e67..9bbe4c2 100644
--- a/src/include/access/gistxlog.h
+++ b/src/include/access/gistxlog.h
@@ -97,7 +97,7 @@ typedef struct gistxlogPageDelete
*/
typedef struct gistxlogPageReuse
{
- RelFileNode node;
+ RelFileLocator locator;
BlockNumber block;
FullTransactionId latestRemovedFullXid;
} gistxlogPageReuse;
diff --git a/src/include/access/heapam_xlog.h b/src/include/access/heapam_xlog.h
index 2d8a7f6..1705e73 100644
--- a/src/include/access/heapam_xlog.h
+++ b/src/include/access/heapam_xlog.h
@@ -19,7 +19,7 @@
#include "lib/stringinfo.h"
#include "storage/buf.h"
#include "storage/bufpage.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
@@ -370,9 +370,9 @@ typedef struct xl_heap_new_cid
CommandId combocid; /* just for debugging */
/*
- * Store the relfilenode/ctid pair to facilitate lookups.
+ * Store the relfilelocator/ctid pair to facilitate lookups.
*/
- RelFileNode target_node;
+ RelFileLocator target_locator;
ItemPointerData target_tid;
} xl_heap_new_cid;
@@ -415,7 +415,7 @@ extern bool heap_prepare_freeze_tuple(HeapTupleHeader tuple,
MultiXactId *relminmxid_out);
extern void heap_execute_freeze_tuple(HeapTupleHeader tuple,
xl_heap_freeze_tuple *xlrec_tp);
-extern XLogRecPtr log_heap_visible(RelFileNode rnode, Buffer heap_buffer,
+extern XLogRecPtr log_heap_visible(RelFileLocator rlocator, Buffer heap_buffer,
Buffer vm_buffer, TransactionId cutoff_xid, uint8 flags);
#endif /* HEAPAM_XLOG_H */
diff --git a/src/include/access/nbtxlog.h b/src/include/access/nbtxlog.h
index de362d3..d79489e 100644
--- a/src/include/access/nbtxlog.h
+++ b/src/include/access/nbtxlog.h
@@ -180,12 +180,12 @@ typedef struct xl_btree_dedup
* This is what we need to know about page reuse within btree. This record
* only exists to generate a conflict point for Hot Standby.
*
- * Note that we must include a RelFileNode in the record because we don't
+ * Note that we must include a RelFileLocator in the record because we don't
* actually register the buffer with the record.
*/
typedef struct xl_btree_reuse_page
{
- RelFileNode node;
+ RelFileLocator locator;
BlockNumber block;
FullTransactionId latestRemovedFullXid;
} xl_btree_reuse_page;
diff --git a/src/include/access/rewriteheap.h b/src/include/access/rewriteheap.h
index 3e27790..353cbb2 100644
--- a/src/include/access/rewriteheap.h
+++ b/src/include/access/rewriteheap.h
@@ -15,7 +15,7 @@
#include "access/htup.h"
#include "storage/itemptr.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
/* struct definition is private to rewriteheap.c */
@@ -34,8 +34,8 @@ extern bool rewrite_heap_dead_tuple(RewriteState state, HeapTuple oldTuple);
*/
typedef struct LogicalRewriteMappingData
{
- RelFileNode old_node;
- RelFileNode new_node;
+ RelFileLocator old_locator;
+ RelFileLocator new_locator;
ItemPointerData old_tid;
ItemPointerData new_tid;
} LogicalRewriteMappingData;
diff --git a/src/include/access/tableam.h b/src/include/access/tableam.h
index fe869c6..9df4e7c 100644
--- a/src/include/access/tableam.h
+++ b/src/include/access/tableam.h
@@ -560,32 +560,32 @@ typedef struct TableAmRoutine
*/
/*
- * This callback needs to create a new relation filenode for `rel`, with
+ * This callback needs to create new relation storage for `rel`, with
* appropriate durability behaviour for `persistence`.
*
* Note that only the subset of the relcache filled by
* RelationBuildLocalRelation() can be relied upon and that the relation's
* catalog entries will either not yet exist (new relation), or will still
- * reference the old relfilenode.
+ * reference the old relfilelocator.
*
* As output *freezeXid, *minmulti must be set to the values appropriate
* for pg_class.{relfrozenxid, relminmxid}. For AMs that don't need those
* fields to be filled they can be set to InvalidTransactionId and
* InvalidMultiXactId, respectively.
*
- * See also table_relation_set_new_filenode().
+ * See also table_relation_set_new_filelocator().
*/
- void (*relation_set_new_filenode) (Relation rel,
- const RelFileNode *newrnode,
- char persistence,
- TransactionId *freezeXid,
- MultiXactId *minmulti);
+ void (*relation_set_new_filelocator) (Relation rel,
+ const RelFileLocator *newrlocator,
+ char persistence,
+ TransactionId *freezeXid,
+ MultiXactId *minmulti);
/*
* This callback needs to remove all contents from `rel`'s current
- * relfilenode. No provisions for transactional behaviour need to be made.
- * Often this can be implemented by truncating the underlying storage to
- * its minimal size.
+ * relfilelocator. No provisions for transactional behaviour need to be
+ * made. Often this can be implemented by truncating the underlying
+ * storage to its minimal size.
*
* See also table_relation_nontransactional_truncate().
*/
@@ -598,7 +598,7 @@ typedef struct TableAmRoutine
* storage, unless it contains references to the tablespace internally.
*/
void (*relation_copy_data) (Relation rel,
- const RelFileNode *newrnode);
+ const RelFileLocator *newrlocator);
/* See table_relation_copy_for_cluster() */
void (*relation_copy_for_cluster) (Relation NewTable,
@@ -1348,7 +1348,7 @@ table_index_delete_tuples(Relation rel, TM_IndexDeleteOp *delstate)
* RelationGetBufferForTuple. See that method for more information.
*
* TABLE_INSERT_FROZEN should only be specified for inserts into
- * relfilenodes created during the current subtransaction and when
+ * relation storage created during the current subtransaction and when
* there are no prior snapshots or pre-existing portals open.
* This causes rows to be frozen, which is an MVCC violation and
* requires explicit options chosen by user.
@@ -1577,33 +1577,34 @@ table_finish_bulk_insert(Relation rel, int options)
*/
/*
- * Create storage for `rel` in `newrnode`, with persistence set to
+ * Create storage for `rel` in `newrlocator`, with persistence set to
* `persistence`.
*
* This is used both during relation creation and various DDL operations to
- * create a new relfilenode that can be filled from scratch. When creating
- * new storage for an existing relfilenode, this should be called before the
+ * create new rel storage that can be filled from scratch. When creating
+ * new storage for an existing relfilelocator, this should be called before the
* relcache entry has been updated.
*
* *freezeXid, *minmulti are set to the xid / multixact horizon for the table
* that pg_class.{relfrozenxid, relminmxid} have to be set to.
*/
static inline void
-table_relation_set_new_filenode(Relation rel,
- const RelFileNode *newrnode,
- char persistence,
- TransactionId *freezeXid,
- MultiXactId *minmulti)
+table_relation_set_new_filelocator(Relation rel,
+ const RelFileLocator *newrlocator,
+ char persistence,
+ TransactionId *freezeXid,
+ MultiXactId *minmulti)
{
- rel->rd_tableam->relation_set_new_filenode(rel, newrnode, persistence,
- freezeXid, minmulti);
+ rel->rd_tableam->relation_set_new_filelocator(rel, newrlocator,
+ persistence, freezeXid,
+ minmulti);
}
/*
* Remove all table contents from `rel`, in a non-transactional manner.
* Non-transactional meaning that there's no need to support rollbacks. This
- * commonly only is used to perform truncations for relfilenodes created in the
- * current transaction.
+ * commonly only is used to perform truncations for relation storage created in
+ * the current transaction.
*/
static inline void
table_relation_nontransactional_truncate(Relation rel)
@@ -1612,15 +1613,15 @@ table_relation_nontransactional_truncate(Relation rel)
}
/*
- * Copy data from `rel` into the new relfilenode `newrnode`. The new
- * relfilenode may not have storage associated before this function is
+ * Copy data from `rel` into the new relfilelocator `newrlocator`. The new
+ * relfilelocator may not have storage associated before this function is
* called. This is only supposed to be used for low level operations like
* changing a relation's tablespace.
*/
static inline void
-table_relation_copy_data(Relation rel, const RelFileNode *newrnode)
+table_relation_copy_data(Relation rel, const RelFileLocator *newrlocator)
{
- rel->rd_tableam->relation_copy_data(rel, newrnode);
+ rel->rd_tableam->relation_copy_data(rel, newrlocator);
}
/*
diff --git a/src/include/access/xact.h b/src/include/access/xact.h
index 4794941..7d2b352 100644
--- a/src/include/access/xact.h
+++ b/src/include/access/xact.h
@@ -19,7 +19,7 @@
#include "datatype/timestamp.h"
#include "lib/stringinfo.h"
#include "nodes/pg_list.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/sinval.h"
/*
@@ -174,7 +174,7 @@ typedef struct SavedTransactionCharacteristics
*/
#define XACT_XINFO_HAS_DBINFO (1U << 0)
#define XACT_XINFO_HAS_SUBXACTS (1U << 1)
-#define XACT_XINFO_HAS_RELFILENODES (1U << 2)
+#define XACT_XINFO_HAS_RELFILELOCATORS (1U << 2)
#define XACT_XINFO_HAS_INVALS (1U << 3)
#define XACT_XINFO_HAS_TWOPHASE (1U << 4)
#define XACT_XINFO_HAS_ORIGIN (1U << 5)
@@ -252,12 +252,12 @@ typedef struct xl_xact_subxacts
} xl_xact_subxacts;
#define MinSizeOfXactSubxacts offsetof(xl_xact_subxacts, subxacts)
-typedef struct xl_xact_relfilenodes
+typedef struct xl_xact_relfilelocators
{
int nrels; /* number of relations */
- RelFileNode xnodes[FLEXIBLE_ARRAY_MEMBER];
-} xl_xact_relfilenodes;
-#define MinSizeOfXactRelfilenodes offsetof(xl_xact_relfilenodes, xnodes)
+ RelFileLocator xlocators[FLEXIBLE_ARRAY_MEMBER];
+} xl_xact_relfilelocators;
+#define MinSizeOfXactRelfileLocators offsetof(xl_xact_relfilelocators, xlocators)
/*
* A transactionally dropped statistics entry.
@@ -305,7 +305,7 @@ typedef struct xl_xact_commit
/* xl_xact_xinfo follows if XLOG_XACT_HAS_INFO */
/* xl_xact_dbinfo follows if XINFO_HAS_DBINFO */
/* xl_xact_subxacts follows if XINFO_HAS_SUBXACT */
- /* xl_xact_relfilenodes follows if XINFO_HAS_RELFILENODES */
+ /* xl_xact_relfilelocators follows if XINFO_HAS_RELFILELOCATORS */
/* xl_xact_stats_items follows if XINFO_HAS_DROPPED_STATS */
/* xl_xact_invals follows if XINFO_HAS_INVALS */
/* xl_xact_twophase follows if XINFO_HAS_TWOPHASE */
@@ -321,7 +321,7 @@ typedef struct xl_xact_abort
/* xl_xact_xinfo follows if XLOG_XACT_HAS_INFO */
/* xl_xact_dbinfo follows if XINFO_HAS_DBINFO */
/* xl_xact_subxacts follows if XINFO_HAS_SUBXACT */
- /* xl_xact_relfilenodes follows if XINFO_HAS_RELFILENODES */
+ /* xl_xact_relfilelocators follows if XINFO_HAS_RELFILELOCATORS */
/* xl_xact_stats_items follows if XINFO_HAS_DROPPED_STATS */
/* No invalidation messages needed. */
/* xl_xact_twophase follows if XINFO_HAS_TWOPHASE */
@@ -367,7 +367,7 @@ typedef struct xl_xact_parsed_commit
TransactionId *subxacts;
int nrels;
- RelFileNode *xnodes;
+ RelFileLocator *xlocators;
int nstats;
xl_xact_stats_item *stats;
@@ -378,7 +378,7 @@ typedef struct xl_xact_parsed_commit
TransactionId twophase_xid; /* only for 2PC */
char twophase_gid[GIDSIZE]; /* only for 2PC */
int nabortrels; /* only for 2PC */
- RelFileNode *abortnodes; /* only for 2PC */
+ RelFileLocator *abortlocators; /* only for 2PC */
int nabortstats; /* only for 2PC */
xl_xact_stats_item *abortstats; /* only for 2PC */
@@ -400,7 +400,7 @@ typedef struct xl_xact_parsed_abort
TransactionId *subxacts;
int nrels;
- RelFileNode *xnodes;
+ RelFileLocator *xlocators;
int nstats;
xl_xact_stats_item *stats;
@@ -483,7 +483,7 @@ extern int xactGetCommittedChildren(TransactionId **ptr);
extern XLogRecPtr XactLogCommitRecord(TimestampTz commit_time,
int nsubxacts, TransactionId *subxacts,
- int nrels, RelFileNode *rels,
+ int nrels, RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
int nmsgs, SharedInvalidationMessage *msgs,
@@ -494,7 +494,7 @@ extern XLogRecPtr XactLogCommitRecord(TimestampTz commit_time,
extern XLogRecPtr XactLogAbortRecord(TimestampTz abort_time,
int nsubxacts, TransactionId *subxacts,
- int nrels, RelFileNode *rels,
+ int nrels, RelFileLocator *rels,
int nstats,
xl_xact_stats_item *stats,
int xactflags, TransactionId twophase_xid,
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index fae0bef..3524c39 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,7 +25,7 @@
#include "lib/stringinfo.h"
#include "pgtime.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
diff --git a/src/include/access/xloginsert.h b/src/include/access/xloginsert.h
index 5fc340c..c04f77b 100644
--- a/src/include/access/xloginsert.h
+++ b/src/include/access/xloginsert.h
@@ -15,7 +15,7 @@
#include "access/xlogdefs.h"
#include "storage/block.h"
#include "storage/buf.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
/*
@@ -45,16 +45,16 @@ extern XLogRecPtr XLogInsert(RmgrId rmid, uint8 info);
extern void XLogEnsureRecordSpace(int max_block_id, int ndatas);
extern void XLogRegisterData(char *data, int len);
extern void XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags);
-extern void XLogRegisterBlock(uint8 block_id, RelFileNode *rnode,
+extern void XLogRegisterBlock(uint8 block_id, RelFileLocator *rlocator,
ForkNumber forknum, BlockNumber blknum, char *page,
uint8 flags);
extern void XLogRegisterBufData(uint8 block_id, char *data, int len);
extern void XLogResetInsertion(void);
extern bool XLogCheckBufferNeedsBackup(Buffer buffer);
-extern XLogRecPtr log_newpage(RelFileNode *rnode, ForkNumber forkNum,
+extern XLogRecPtr log_newpage(RelFileLocator *rlocator, ForkNumber forkNum,
BlockNumber blk, char *page, bool page_std);
-extern void log_newpages(RelFileNode *rnode, ForkNumber forkNum, int num_pages,
+extern void log_newpages(RelFileLocator *rlocator, ForkNumber forkNum, int num_pages,
BlockNumber *blknos, char **pages, bool page_std);
extern XLogRecPtr log_newpage_buffer(Buffer buffer, bool page_std);
extern void log_newpage_range(Relation rel, ForkNumber forkNum,
diff --git a/src/include/access/xlogreader.h b/src/include/access/xlogreader.h
index e73ea4a..5395f15 100644
--- a/src/include/access/xlogreader.h
+++ b/src/include/access/xlogreader.h
@@ -122,7 +122,7 @@ typedef struct
bool in_use;
/* Identify the block this refers to */
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forknum;
BlockNumber blkno;
@@ -430,10 +430,10 @@ extern FullTransactionId XLogRecGetFullXid(XLogReaderState *record);
extern bool RestoreBlockImage(XLogReaderState *record, uint8 block_id, char *page);
extern char *XLogRecGetBlockData(XLogReaderState *record, uint8 block_id, Size *len);
extern void XLogRecGetBlockTag(XLogReaderState *record, uint8 block_id,
- RelFileNode *rnode, ForkNumber *forknum,
+ RelFileLocator *rlocator, ForkNumber *forknum,
BlockNumber *blknum);
extern bool XLogRecGetBlockTagExtended(XLogReaderState *record, uint8 block_id,
- RelFileNode *rnode, ForkNumber *forknum,
+ RelFileLocator *rlocator, ForkNumber *forknum,
BlockNumber *blknum,
Buffer *prefetch_buffer);
diff --git a/src/include/access/xlogrecord.h b/src/include/access/xlogrecord.h
index 052ac68..7e467ef 100644
--- a/src/include/access/xlogrecord.h
+++ b/src/include/access/xlogrecord.h
@@ -15,7 +15,7 @@
#include "access/xlogdefs.h"
#include "port/pg_crc32c.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* The overall layout of an XLOG record is:
@@ -97,7 +97,7 @@ typedef struct XLogRecordBlockHeader
* image) */
/* If BKPBLOCK_HAS_IMAGE, an XLogRecordBlockImageHeader struct follows */
- /* If BKPBLOCK_SAME_REL is not set, a RelFileNode follows */
+ /* If BKPBLOCK_SAME_REL is not set, a RelFileLocator follows */
/* BlockNumber follows */
} XLogRecordBlockHeader;
@@ -175,7 +175,7 @@ typedef struct XLogRecordBlockCompressHeader
(SizeOfXLogRecordBlockHeader + \
SizeOfXLogRecordBlockImageHeader + \
SizeOfXLogRecordBlockCompressHeader + \
- sizeof(RelFileNode) + \
+ sizeof(RelFileLocator) + \
sizeof(BlockNumber))
/*
@@ -187,7 +187,7 @@ typedef struct XLogRecordBlockCompressHeader
#define BKPBLOCK_HAS_IMAGE 0x10 /* block data is an XLogRecordBlockImage */
#define BKPBLOCK_HAS_DATA 0x20
#define BKPBLOCK_WILL_INIT 0x40 /* redo will re-init the page */
-#define BKPBLOCK_SAME_REL 0x80 /* RelFileNode omitted, same as previous */
+#define BKPBLOCK_SAME_REL 0x80 /* RelFileLocator omitted, same as previous */
/*
* XLogRecordDataHeaderShort/Long are used for the "main data" portion of
diff --git a/src/include/access/xlogutils.h b/src/include/access/xlogutils.h
index c9d0b75..ef18297 100644
--- a/src/include/access/xlogutils.h
+++ b/src/include/access/xlogutils.h
@@ -60,9 +60,9 @@ extern PGDLLIMPORT HotStandbyState standbyState;
extern bool XLogHaveInvalidPages(void);
extern void XLogCheckInvalidPages(void);
-extern void XLogDropRelation(RelFileNode rnode, ForkNumber forknum);
+extern void XLogDropRelation(RelFileLocator rlocator, ForkNumber forknum);
extern void XLogDropDatabase(Oid dbid);
-extern void XLogTruncateRelation(RelFileNode rnode, ForkNumber forkNum,
+extern void XLogTruncateRelation(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber nblocks);
/* Result codes for XLogReadBufferForRedo[Extended] */
@@ -89,11 +89,11 @@ extern XLogRedoAction XLogReadBufferForRedoExtended(XLogReaderState *record,
ReadBufferMode mode, bool get_cleanup_lock,
Buffer *buf);
-extern Buffer XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
+extern Buffer XLogReadBufferExtended(RelFileLocator rlocator, ForkNumber forknum,
BlockNumber blkno, ReadBufferMode mode,
Buffer recent_buffer);
-extern Relation CreateFakeRelcacheEntry(RelFileNode rnode);
+extern Relation CreateFakeRelcacheEntry(RelFileLocator rlocator);
extern void FreeFakeRelcacheEntry(Relation fakerel);
extern int read_local_xlog_page(XLogReaderState *state,
diff --git a/src/include/catalog/binary_upgrade.h b/src/include/catalog/binary_upgrade.h
index 0b6944b..fd93442 100644
--- a/src/include/catalog/binary_upgrade.h
+++ b/src/include/catalog/binary_upgrade.h
@@ -22,11 +22,11 @@ extern PGDLLIMPORT Oid binary_upgrade_next_mrng_pg_type_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_mrng_array_pg_type_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_heap_pg_class_oid;
-extern PGDLLIMPORT Oid binary_upgrade_next_heap_pg_class_relfilenode;
+extern PGDLLIMPORT RelFileNumber binary_upgrade_next_heap_pg_class_relfilenumber;
extern PGDLLIMPORT Oid binary_upgrade_next_index_pg_class_oid;
-extern PGDLLIMPORT Oid binary_upgrade_next_index_pg_class_relfilenode;
+extern PGDLLIMPORT RelFileNumber binary_upgrade_next_index_pg_class_relfilenumber;
extern PGDLLIMPORT Oid binary_upgrade_next_toast_pg_class_oid;
-extern PGDLLIMPORT Oid binary_upgrade_next_toast_pg_class_relfilenode;
+extern PGDLLIMPORT RelFileNumber binary_upgrade_next_toast_pg_class_relfilenumber;
extern PGDLLIMPORT Oid binary_upgrade_next_pg_enum_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_pg_authid_oid;
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index 60c1215..66900f1 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -38,7 +38,8 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
-extern Oid GetNewRelFileNode(Oid reltablespace, Relation pg_class,
- char relpersistence);
+extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
+ Relation pg_class,
+ char relpersistence);
#endif /* CATALOG_H */
diff --git a/src/include/catalog/heap.h b/src/include/catalog/heap.h
index 07c5b88..5774c46 100644
--- a/src/include/catalog/heap.h
+++ b/src/include/catalog/heap.h
@@ -50,7 +50,7 @@ extern Relation heap_create(const char *relname,
Oid relnamespace,
Oid reltablespace,
Oid relid,
- Oid relfilenode,
+ RelFileNumber relfilenumber,
Oid accessmtd,
TupleDesc tupDesc,
char relkind,
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index a1d6e3b..1bdb00a 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -71,7 +71,7 @@ extern Oid index_create(Relation heapRelation,
Oid indexRelationId,
Oid parentIndexRelid,
Oid parentConstraintId,
- Oid relFileNode,
+ RelFileNumber relFileNumber,
IndexInfo *indexInfo,
List *indexColNames,
Oid accessMethodObjectId,
diff --git a/src/include/catalog/storage.h b/src/include/catalog/storage.h
index 59f3404..9964c31 100644
--- a/src/include/catalog/storage.h
+++ b/src/include/catalog/storage.h
@@ -15,23 +15,23 @@
#define STORAGE_H
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "utils/relcache.h"
/* GUC variables */
extern PGDLLIMPORT int wal_skip_threshold;
-extern SMgrRelation RelationCreateStorage(RelFileNode rnode,
+extern SMgrRelation RelationCreateStorage(RelFileLocator rlocator,
char relpersistence,
bool register_delete);
extern void RelationDropStorage(Relation rel);
-extern void RelationPreserveStorage(RelFileNode rnode, bool atCommit);
+extern void RelationPreserveStorage(RelFileLocator rlocator, bool atCommit);
extern void RelationPreTruncate(Relation rel);
extern void RelationTruncate(Relation rel, BlockNumber nblocks);
extern void RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
ForkNumber forkNum, char relpersistence);
-extern bool RelFileNodeSkippingWAL(RelFileNode rnode);
+extern bool RelFileLocatorSkippingWAL(RelFileLocator rlocator);
extern Size EstimatePendingSyncsSpace(void);
extern void SerializePendingSyncs(Size maxSize, char *startAddress);
extern void RestorePendingSyncs(char *startAddress);
@@ -42,7 +42,7 @@ extern void RestorePendingSyncs(char *startAddress);
*/
extern void smgrDoPendingDeletes(bool isCommit);
extern void smgrDoPendingSyncs(bool isCommit, bool isParallelWorker);
-extern int smgrGetPendingDeletes(bool forCommit, RelFileNode **ptr);
+extern int smgrGetPendingDeletes(bool forCommit, RelFileLocator **ptr);
extern void AtSubCommit_smgr(void);
extern void AtSubAbort_smgr(void);
extern void PostPrepare_smgr(void);
diff --git a/src/include/catalog/storage_xlog.h b/src/include/catalog/storage_xlog.h
index 622de22..44a5e20 100644
--- a/src/include/catalog/storage_xlog.h
+++ b/src/include/catalog/storage_xlog.h
@@ -17,7 +17,7 @@
#include "access/xlogreader.h"
#include "lib/stringinfo.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* Declarations for smgr-related XLOG records
@@ -32,7 +32,7 @@
typedef struct xl_smgr_create
{
- RelFileNode rnode;
+ RelFileLocator rlocator;
ForkNumber forkNum;
} xl_smgr_create;
@@ -46,11 +46,11 @@ typedef struct xl_smgr_create
typedef struct xl_smgr_truncate
{
BlockNumber blkno;
- RelFileNode rnode;
+ RelFileLocator rlocator;
int flags;
} xl_smgr_truncate;
-extern void log_smgrcreate(const RelFileNode *rnode, ForkNumber forkNum);
+extern void log_smgrcreate(const RelFileLocator *rlocator, ForkNumber forkNum);
extern void smgr_redo(XLogReaderState *record);
extern void smgr_desc(StringInfo buf, XLogReaderState *record);
diff --git a/src/include/commands/sequence.h b/src/include/commands/sequence.h
index 9da2300..d38c0e2 100644
--- a/src/include/commands/sequence.h
+++ b/src/include/commands/sequence.h
@@ -19,7 +19,7 @@
#include "lib/stringinfo.h"
#include "nodes/parsenodes.h"
#include "parser/parse_node.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
typedef struct FormData_pg_sequence_data
@@ -47,7 +47,7 @@ typedef FormData_pg_sequence_data *Form_pg_sequence_data;
typedef struct xl_seq_rec
{
- RelFileNode node;
+ RelFileLocator locator;
/* SEQUENCE TUPLE DATA FOLLOWS AT THE END */
} xl_seq_rec;
diff --git a/src/include/commands/tablecmds.h b/src/include/commands/tablecmds.h
index 5d4037f..0c48654 100644
--- a/src/include/commands/tablecmds.h
+++ b/src/include/commands/tablecmds.h
@@ -66,7 +66,7 @@ extern void SetRelationHasSubclass(Oid relationId, bool relhassubclass);
extern bool CheckRelationTableSpaceMove(Relation rel, Oid newTableSpaceId);
extern void SetRelationTableSpace(Relation rel, Oid newTableSpaceId,
- Oid newRelFileNode);
+ RelFileNumber newRelFileNumber);
extern ObjectAddress renameatt(RenameStmt *stmt);
diff --git a/src/include/commands/tablespace.h b/src/include/commands/tablespace.h
index 24b6473..1f80907 100644
--- a/src/include/commands/tablespace.h
+++ b/src/include/commands/tablespace.h
@@ -50,7 +50,7 @@ extern void DropTableSpace(DropTableSpaceStmt *stmt);
extern ObjectAddress RenameTableSpace(const char *oldname, const char *newname);
extern Oid AlterTableSpaceOptions(AlterTableSpaceOptionsStmt *stmt);
-extern void TablespaceCreateDbspace(Oid spcNode, Oid dbNode, bool isRedo);
+extern void TablespaceCreateDbspace(Oid spcOid, Oid dbOid, bool isRedo);
extern Oid GetDefaultTablespace(char relpersistence, bool partitioned);
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index 13849a3..3ab7132 100644
--- a/src/include/common/relpath.h
+++ b/src/include/common/relpath.h
@@ -64,27 +64,27 @@ extern int forkname_chars(const char *str, ForkNumber *fork);
/*
* Stuff for computing filesystem pathnames for relations.
*/
-extern char *GetDatabasePath(Oid dbNode, Oid spcNode);
+extern char *GetDatabasePath(Oid dbOid, Oid spcOid);
-extern char *GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
+extern char *GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
int backendId, ForkNumber forkNumber);
/*
* Wrapper macros for GetRelationPath. Beware of multiple
- * evaluation of the RelFileNode or RelFileNodeBackend argument!
+ * evaluation of the RelFileLocator or RelFileLocatorBackend argument!
*/
-/* First argument is a RelFileNode */
-#define relpathbackend(rnode, backend, forknum) \
- GetRelationPath((rnode).dbNode, (rnode).spcNode, (rnode).relNode, \
+/* First argument is a RelFileLocator */
+#define relpathbackend(rlocator, backend, forknum) \
+ GetRelationPath((rlocator).dbOid, (rlocator).spcOid, (rlocator).relNumber, \
backend, forknum)
-/* First argument is a RelFileNode */
-#define relpathperm(rnode, forknum) \
- relpathbackend(rnode, InvalidBackendId, forknum)
+/* First argument is a RelFileLocator */
+#define relpathperm(rlocator, forknum) \
+ relpathbackend(rlocator, InvalidBackendId, forknum)
-/* First argument is a RelFileNodeBackend */
-#define relpath(rnode, forknum) \
- relpathbackend((rnode).node, (rnode).backend, forknum)
+/* First argument is a RelFileLocatorBackend */
+#define relpath(rlocator, forknum) \
+ relpathbackend((rlocator).locator, (rlocator).backend, forknum)
#endif /* RELPATH_H */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index f93d866..9a21417 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3248,10 +3248,10 @@ typedef struct IndexStmt
List *excludeOpNames; /* exclusion operator names, or NIL if none */
char *idxcomment; /* comment to apply to index, or NULL */
Oid indexOid; /* OID of an existing index, if any */
- Oid oldNode; /* relfilenode of existing storage, if any */
- SubTransactionId oldCreateSubid; /* rd_createSubid of oldNode */
- SubTransactionId oldFirstRelfilenodeSubid; /* rd_firstRelfilenodeSubid of
- * oldNode */
+ RelFileNumber oldNumber; /* relfilenumber of existing storage, if any */
+ SubTransactionId oldCreateSubid; /* rd_createSubid of oldNumber */
+ SubTransactionId oldFirstRelfilelocatorSubid; /* rd_firstRelfilelocatorSubid
+ * of oldNumber */
bool unique; /* is index unique? */
bool nulls_not_distinct; /* null treatment for UNIQUE constraints */
bool primary; /* is index a primary key? */
diff --git a/src/include/postgres_ext.h b/src/include/postgres_ext.h
index fdb61b7..d8af68b 100644
--- a/src/include/postgres_ext.h
+++ b/src/include/postgres_ext.h
@@ -46,6 +46,13 @@ typedef unsigned int Oid;
/* Define a signed 64-bit integer type for use in client API declarations. */
typedef PG_INT64_TYPE pg_int64;
+/*
+ * RelFileNumber data type identifies the specific relation file name.
+ */
+typedef Oid RelFileNumber;
+#define InvalidRelFileNumber ((RelFileNumber) InvalidOid)
+#define RelFileNumberIsValid(relnumber) \
+ ((bool) ((relnumber) != InvalidRelFileNumber))
/*
* Identifiers of error message fields. Kept here to keep common
diff --git a/src/include/postmaster/bgwriter.h b/src/include/postmaster/bgwriter.h
index 2511ef4..b67fb1e 100644
--- a/src/include/postmaster/bgwriter.h
+++ b/src/include/postmaster/bgwriter.h
@@ -16,7 +16,7 @@
#define _BGWRITER_H
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "storage/sync.h"
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 4a01f87..d109d0b 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -99,7 +99,7 @@ typedef struct ReorderBufferChange
struct
{
/* relation that has been changed */
- RelFileNode relnode;
+ RelFileLocator rlocator;
/* no previously reassembled toast chunks are necessary anymore */
bool clear_toast_afterwards;
@@ -145,7 +145,7 @@ typedef struct ReorderBufferChange
*/
struct
{
- RelFileNode node;
+ RelFileLocator locator;
ItemPointerData tid;
CommandId cmin;
CommandId cmax;
@@ -657,7 +657,7 @@ extern void ReorderBufferAddSnapshot(ReorderBuffer *, TransactionId, XLogRecPtr
extern void ReorderBufferAddNewCommandId(ReorderBuffer *, TransactionId, XLogRecPtr lsn,
CommandId cid);
extern void ReorderBufferAddNewTupleCids(ReorderBuffer *, TransactionId, XLogRecPtr lsn,
- RelFileNode node, ItemPointerData pt,
+ RelFileLocator locator, ItemPointerData pt,
CommandId cmin, CommandId cmax, CommandId combocid);
extern void ReorderBufferAddInvalidations(ReorderBuffer *, TransactionId, XLogRecPtr lsn,
Size nmsgs, SharedInvalidationMessage *msgs);
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index a17e7b2..b85b94f 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,30 +90,30 @@
*/
typedef struct buftag
{
- RelFileNode rnode; /* physical relation identifier */
+ RelFileLocator rlocator; /* physical relation identifier */
ForkNumber forkNum;
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
#define CLEAR_BUFFERTAG(a) \
( \
- (a).rnode.spcNode = InvalidOid, \
- (a).rnode.dbNode = InvalidOid, \
- (a).rnode.relNode = InvalidOid, \
+ (a).rlocator.spcOid = InvalidOid, \
+ (a).rlocator.dbOid = InvalidOid, \
+ (a).rlocator.relNumber = InvalidRelFileNumber, \
(a).forkNum = InvalidForkNumber, \
(a).blockNum = InvalidBlockNumber \
)
-#define INIT_BUFFERTAG(a,xx_rnode,xx_forkNum,xx_blockNum) \
+#define INIT_BUFFERTAG(a,xx_rlocator,xx_forkNum,xx_blockNum) \
( \
- (a).rnode = (xx_rnode), \
+ (a).rlocator = (xx_rlocator), \
(a).forkNum = (xx_forkNum), \
(a).blockNum = (xx_blockNum) \
)
#define BUFFERTAGS_EQUAL(a,b) \
( \
- RelFileNodeEquals((a).rnode, (b).rnode) && \
+ RelFileLocatorEquals((a).rlocator, (b).rlocator) && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
)
@@ -291,11 +291,11 @@ extern PGDLLIMPORT BufferDesc *LocalBufferDescriptors;
*/
typedef struct CkptSortItem
{
- Oid tsId;
- Oid relNode;
- ForkNumber forkNum;
- BlockNumber blockNum;
- int buf_id;
+ Oid tsId;
+ RelFileNumber relNumber;
+ ForkNumber forkNum;
+ BlockNumber blockNum;
+ int buf_id;
} CkptSortItem;
extern PGDLLIMPORT CkptSortItem *CkptBufferIds;
@@ -337,9 +337,9 @@ extern PrefetchBufferResult PrefetchLocalBuffer(SMgrRelation smgr,
extern BufferDesc *LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum,
BlockNumber blockNum, bool *foundPtr);
extern void MarkLocalBufferDirty(Buffer buffer);
-extern void DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
+extern void DropRelFileLocatorLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber firstDelBlock);
-extern void DropRelFileNodeAllLocalBuffers(RelFileNode rnode);
+extern void DropRelFileLocatorAllLocalBuffers(RelFileLocator rlocator);
extern void AtEOXact_LocalBuffers(bool isCommit);
#endif /* BUFMGR_INTERNALS_H */
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 5839140..96e473e 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -17,7 +17,7 @@
#include "storage/block.h"
#include "storage/buf.h"
#include "storage/bufpage.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
#include "utils/snapmgr.h"
@@ -176,13 +176,13 @@ extern PrefetchBufferResult PrefetchSharedBuffer(struct SMgrRelationData *smgr_r
BlockNumber blockNum);
extern PrefetchBufferResult PrefetchBuffer(Relation reln, ForkNumber forkNum,
BlockNumber blockNum);
-extern bool ReadRecentBuffer(RelFileNode rnode, ForkNumber forkNum,
+extern bool ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum,
BlockNumber blockNum, Buffer recent_buffer);
extern Buffer ReadBuffer(Relation reln, BlockNumber blockNum);
extern Buffer ReadBufferExtended(Relation reln, ForkNumber forkNum,
BlockNumber blockNum, ReadBufferMode mode,
BufferAccessStrategy strategy);
-extern Buffer ReadBufferWithoutRelcache(RelFileNode rnode,
+extern Buffer ReadBufferWithoutRelcache(RelFileLocator rlocator,
ForkNumber forkNum, BlockNumber blockNum,
ReadBufferMode mode, BufferAccessStrategy strategy,
bool permanent);
@@ -204,13 +204,13 @@ extern BlockNumber RelationGetNumberOfBlocksInFork(Relation relation,
extern void FlushOneBuffer(Buffer buffer);
extern void FlushRelationBuffers(Relation rel);
extern void FlushRelationsAllBuffers(struct SMgrRelationData **smgrs, int nrels);
-extern void CreateAndCopyRelationData(RelFileNode src_rnode,
- RelFileNode dst_rnode,
+extern void CreateAndCopyRelationData(RelFileLocator src_rlocator,
+ RelFileLocator dst_rlocator,
bool permanent);
extern void FlushDatabaseBuffers(Oid dbid);
-extern void DropRelFileNodeBuffers(struct SMgrRelationData *smgr_reln, ForkNumber *forkNum,
+extern void DropRelFileLocatorBuffers(struct SMgrRelationData *smgr_reln, ForkNumber *forkNum,
int nforks, BlockNumber *firstDelBlock);
-extern void DropRelFileNodesAllBuffers(struct SMgrRelationData **smgr_reln, int nnodes);
+extern void DropRelFileLocatorsAllBuffers(struct SMgrRelationData **smgr_reln, int nlocators);
extern void DropDatabaseBuffers(Oid dbid);
#define RelationGetNumberOfBlocks(reln) \
@@ -223,7 +223,7 @@ extern XLogRecPtr BufferGetLSNAtomic(Buffer buffer);
extern void PrintPinnedBufs(void);
#endif
extern Size BufferShmemSize(void);
-extern void BufferGetTag(Buffer buffer, RelFileNode *rnode,
+extern void BufferGetTag(Buffer buffer, RelFileLocator *rlocator,
ForkNumber *forknum, BlockNumber *blknum);
extern void MarkBufferDirtyHint(Buffer buffer, bool buffer_std);
diff --git a/src/include/storage/freespace.h b/src/include/storage/freespace.h
index dcc40eb..fcb0802 100644
--- a/src/include/storage/freespace.h
+++ b/src/include/storage/freespace.h
@@ -15,7 +15,7 @@
#define FREESPACE_H_
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
/* prototypes for public functions in freespace.c */
@@ -27,7 +27,7 @@ extern BlockNumber RecordAndGetPageWithFreeSpace(Relation rel,
Size spaceNeeded);
extern void RecordPageWithFreeSpace(Relation rel, BlockNumber heapBlk,
Size spaceAvail);
-extern void XLogRecordPageWithFreeSpace(RelFileNode rnode, BlockNumber heapBlk,
+extern void XLogRecordPageWithFreeSpace(RelFileLocator rlocator, BlockNumber heapBlk,
Size spaceAvail);
extern BlockNumber FreeSpaceMapPrepareTruncateRel(Relation rel,
diff --git a/src/include/storage/md.h b/src/include/storage/md.h
index ffffa40..10aa1b0 100644
--- a/src/include/storage/md.h
+++ b/src/include/storage/md.h
@@ -15,7 +15,7 @@
#define MD_H
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "storage/sync.h"
@@ -25,7 +25,7 @@ extern void mdopen(SMgrRelation reln);
extern void mdclose(SMgrRelation reln, ForkNumber forknum);
extern void mdcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo);
extern bool mdexists(SMgrRelation reln, ForkNumber forknum);
-extern void mdunlink(RelFileNodeBackend rnode, ForkNumber forknum, bool isRedo);
+extern void mdunlink(RelFileLocatorBackend rlocator, ForkNumber forknum, bool isRedo);
extern void mdextend(SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum, char *buffer, bool skipFsync);
extern bool mdprefetch(SMgrRelation reln, ForkNumber forknum,
@@ -42,7 +42,7 @@ extern void mdtruncate(SMgrRelation reln, ForkNumber forknum,
extern void mdimmedsync(SMgrRelation reln, ForkNumber forknum);
extern void ForgetDatabaseSyncRequests(Oid dbid);
-extern void DropRelationFiles(RelFileNode *delrels, int ndelrels, bool isRedo);
+extern void DropRelationFiles(RelFileLocator *delrels, int ndelrels, bool isRedo);
/* md sync callbacks */
extern int mdsyncfiletag(const FileTag *ftag, char *path);
diff --git a/src/include/storage/relfilelocator.h b/src/include/storage/relfilelocator.h
new file mode 100644
index 0000000..7211fe7
--- /dev/null
+++ b/src/include/storage/relfilelocator.h
@@ -0,0 +1,99 @@
+/*-------------------------------------------------------------------------
+ *
+ * relfilelocator.h
+ * Physical access information for relations.
+ *
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/storage/relfilelocator.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef RELFILELOCATOR_H
+#define RELFILELOCATOR_H
+
+#include "common/relpath.h"
+#include "storage/backendid.h"
+
+/*
+ * RelFileLocator must provide all that we need to know to physically access
+ * a relation, with the exception of the backend ID, which can be provided
+ * separately. Note, however, that a "physical" relation is comprised of
+ * multiple files on the filesystem, as each fork is stored as a separate
+ * file, and each fork can be divided into multiple segments. See md.c.
+ *
+ * spcOid identifies the tablespace of the relation. It corresponds to
+ * pg_tablespace.oid.
+ *
+ * dbOid identifies the database of the relation. It is zero for
+ * "shared" relations (those common to all databases of a cluster).
+ * Nonzero dbOid values correspond to pg_database.oid.
+ *
+ * relNumber identifies the specific relation. relNumber corresponds to
+ * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
+ * to assign new physical files to relations in some situations).
+ * Notice that relNumber is only unique within a database in a particular
+ * tablespace.
+ *
+ * Note: spcOid must be GLOBALTABLESPACE_OID if and only if dbOid is
+ * zero. We support shared relations only in the "global" tablespace.
+ *
+ * Note: in pg_class we allow reltablespace == 0 to denote that the
+ * relation is stored in its database's "default" tablespace (as
+ * identified by pg_database.dattablespace). However this shorthand
+ * is NOT allowed in RelFileLocator structs --- the real tablespace ID
+ * must be supplied when setting spcOid.
+ *
+ * Note: in pg_class, relfilenode can be zero to denote that the relation
+ * is a "mapped" relation, whose current true filenode number is available
+ * from relmapper.c. Again, this case is NOT allowed in RelFileLocators.
+ *
+ * Note: various places use RelFileLocator in hashtable keys. Therefore,
+ * there *must not* be any unused padding bytes in this struct. That
+ * should be safe as long as all the fields are of type Oid.
+ */
+typedef struct RelFileLocator
+{
+ Oid spcOid; /* tablespace */
+ Oid dbOid; /* database */
+ RelFileNumber relNumber; /* relation */
+} RelFileLocator;
+
+/*
+ * Augmenting a relfilelocator with the backend ID provides all the information
+ * we need to locate the physical storage. The backend ID is InvalidBackendId
+ * for regular relations (those accessible to more than one backend), or the
+ * owning backend's ID for backend-local relations. Backend-local relations
+ * are always transient and removed in case of a database crash; they are
+ * never WAL-logged or fsync'd.
+ */
+typedef struct RelFileLocatorBackend
+{
+ RelFileLocator locator;
+ BackendId backend;
+} RelFileLocatorBackend;
+
+#define RelFileLocatorBackendIsTemp(rlocator) \
+ ((rlocator).backend != InvalidBackendId)
+
+/*
+ * Note: RelFileLocatorEquals and RelFileLocatorBackendEquals compare relNumber first
+ * since that is most likely to be different in two unequal RelFileLocators. It
+ * is probably redundant to compare spcOid if the other fields are found equal,
+ * but do it anyway to be sure. Likewise for checking the backend ID in
+ * RelFileLocatorBackendEquals.
+ */
+#define RelFileLocatorEquals(locator1, locator2) \
+ ((locator1).relNumber == (locator2).relNumber && \
+ (locator1).dbOid == (locator2).dbOid && \
+ (locator1).spcOid == (locator2).spcOid)
+
+#define RelFileLocatorBackendEquals(locator1, locator2) \
+ ((locator1).locator.relNumber == (locator2).locator.relNumber && \
+ (locator1).locator.dbOid == (locator2).locator.dbOid && \
+ (locator1).backend == (locator2).backend && \
+ (locator1).locator.spcOid == (locator2).locator.spcOid)
+
+#endif /* RELFILELOCATOR_H */
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
deleted file mode 100644
index 4fdc606..0000000
--- a/src/include/storage/relfilenode.h
+++ /dev/null
@@ -1,99 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * relfilenode.h
- * Physical access information for relations.
- *
- *
- * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * src/include/storage/relfilenode.h
- *
- *-------------------------------------------------------------------------
- */
-#ifndef RELFILENODE_H
-#define RELFILENODE_H
-
-#include "common/relpath.h"
-#include "storage/backendid.h"
-
-/*
- * RelFileNode must provide all that we need to know to physically access
- * a relation, with the exception of the backend ID, which can be provided
- * separately. Note, however, that a "physical" relation is comprised of
- * multiple files on the filesystem, as each fork is stored as a separate
- * file, and each fork can be divided into multiple segments. See md.c.
- *
- * spcNode identifies the tablespace of the relation. It corresponds to
- * pg_tablespace.oid.
- *
- * dbNode identifies the database of the relation. It is zero for
- * "shared" relations (those common to all databases of a cluster).
- * Nonzero dbNode values correspond to pg_database.oid.
- *
- * relNode identifies the specific relation. relNode corresponds to
- * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
- * to assign new physical files to relations in some situations).
- * Notice that relNode is only unique within a database in a particular
- * tablespace.
- *
- * Note: spcNode must be GLOBALTABLESPACE_OID if and only if dbNode is
- * zero. We support shared relations only in the "global" tablespace.
- *
- * Note: in pg_class we allow reltablespace == 0 to denote that the
- * relation is stored in its database's "default" tablespace (as
- * identified by pg_database.dattablespace). However this shorthand
- * is NOT allowed in RelFileNode structs --- the real tablespace ID
- * must be supplied when setting spcNode.
- *
- * Note: in pg_class, relfilenode can be zero to denote that the relation
- * is a "mapped" relation, whose current true filenode number is available
- * from relmapper.c. Again, this case is NOT allowed in RelFileNodes.
- *
- * Note: various places use RelFileNode in hashtable keys. Therefore,
- * there *must not* be any unused padding bytes in this struct. That
- * should be safe as long as all the fields are of type Oid.
- */
-typedef struct RelFileNode
-{
- Oid spcNode; /* tablespace */
- Oid dbNode; /* database */
- Oid relNode; /* relation */
-} RelFileNode;
-
-/*
- * Augmenting a relfilenode with the backend ID provides all the information
- * we need to locate the physical storage. The backend ID is InvalidBackendId
- * for regular relations (those accessible to more than one backend), or the
- * owning backend's ID for backend-local relations. Backend-local relations
- * are always transient and removed in case of a database crash; they are
- * never WAL-logged or fsync'd.
- */
-typedef struct RelFileNodeBackend
-{
- RelFileNode node;
- BackendId backend;
-} RelFileNodeBackend;
-
-#define RelFileNodeBackendIsTemp(rnode) \
- ((rnode).backend != InvalidBackendId)
-
-/*
- * Note: RelFileNodeEquals and RelFileNodeBackendEquals compare relNode first
- * since that is most likely to be different in two unequal RelFileNodes. It
- * is probably redundant to compare spcNode if the other fields are found equal,
- * but do it anyway to be sure. Likewise for checking the backend ID in
- * RelFileNodeBackendEquals.
- */
-#define RelFileNodeEquals(node1, node2) \
- ((node1).relNode == (node2).relNode && \
- (node1).dbNode == (node2).dbNode && \
- (node1).spcNode == (node2).spcNode)
-
-#define RelFileNodeBackendEquals(node1, node2) \
- ((node1).node.relNode == (node2).node.relNode && \
- (node1).node.dbNode == (node2).node.dbNode && \
- (node1).backend == (node2).backend && \
- (node1).node.spcNode == (node2).node.spcNode)
-
-#endif /* RELFILENODE_H */
diff --git a/src/include/storage/sinval.h b/src/include/storage/sinval.h
index e7cd456..56c6fc9 100644
--- a/src/include/storage/sinval.h
+++ b/src/include/storage/sinval.h
@@ -16,7 +16,7 @@
#include <signal.h>
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* We support several types of shared-invalidation messages:
@@ -90,7 +90,7 @@ typedef struct
int8 id; /* type field --- must be first */
int8 backend_hi; /* high bits of backend ID, if temprel */
uint16 backend_lo; /* low bits of backend ID, if temprel */
- RelFileNode rnode; /* spcNode, dbNode, relNode */
+ RelFileLocator rlocator; /* spcOid, dbOid, relNumber */
} SharedInvalSmgrMsg;
#define SHAREDINVALRELMAP_ID (-4)
diff --git a/src/include/storage/smgr.h b/src/include/storage/smgr.h
index 6b63c60..a077153 100644
--- a/src/include/storage/smgr.h
+++ b/src/include/storage/smgr.h
@@ -16,7 +16,7 @@
#include "lib/ilist.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* smgr.c maintains a table of SMgrRelation objects, which are essentially
@@ -38,8 +38,8 @@
*/
typedef struct SMgrRelationData
{
- /* rnode is the hashtable lookup key, so it must be first! */
- RelFileNodeBackend smgr_rnode; /* relation physical identifier */
+ /* rlocator is the hashtable lookup key, so it must be first! */
+ RelFileLocatorBackend smgr_rlocator; /* relation physical identifier */
/* pointer to owning pointer, or NULL if none */
struct SMgrRelationData **smgr_owner;
@@ -75,16 +75,16 @@ typedef struct SMgrRelationData
typedef SMgrRelationData *SMgrRelation;
#define SmgrIsTemp(smgr) \
- RelFileNodeBackendIsTemp((smgr)->smgr_rnode)
+ RelFileLocatorBackendIsTemp((smgr)->smgr_rlocator)
extern void smgrinit(void);
-extern SMgrRelation smgropen(RelFileNode rnode, BackendId backend);
+extern SMgrRelation smgropen(RelFileLocator rlocator, BackendId backend);
extern bool smgrexists(SMgrRelation reln, ForkNumber forknum);
extern void smgrsetowner(SMgrRelation *owner, SMgrRelation reln);
extern void smgrclearowner(SMgrRelation *owner, SMgrRelation reln);
extern void smgrclose(SMgrRelation reln);
extern void smgrcloseall(void);
-extern void smgrclosenode(RelFileNodeBackend rnode);
+extern void smgrcloserellocator(RelFileLocatorBackend rlocator);
extern void smgrrelease(SMgrRelation reln);
extern void smgrreleaseall(void);
extern void smgrcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo);
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index 6a77632..dacef92 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -17,7 +17,7 @@
#include "datatype/timestamp.h"
#include "storage/lock.h"
#include "storage/procsignal.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/standbydefs.h"
/* User-settable GUC parameters */
@@ -30,9 +30,9 @@ extern void InitRecoveryTransactionEnvironment(void);
extern void ShutdownRecoveryTransactionEnvironment(void);
extern void ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid,
- RelFileNode node);
+ RelFileLocator locator);
extern void ResolveRecoveryConflictWithSnapshotFullXid(FullTransactionId latestRemovedFullXid,
- RelFileNode node);
+ RelFileLocator locator);
extern void ResolveRecoveryConflictWithTablespace(Oid tsid);
extern void ResolveRecoveryConflictWithDatabase(Oid dbid);
diff --git a/src/include/storage/sync.h b/src/include/storage/sync.h
index 9737e1e..049af87 100644
--- a/src/include/storage/sync.h
+++ b/src/include/storage/sync.h
@@ -13,7 +13,7 @@
#ifndef SYNC_H
#define SYNC_H
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
/*
* Type of sync request. These are used to manage the set of pending
@@ -51,7 +51,7 @@ typedef struct FileTag
{
int16 handler; /* SyncRequestHandler value, saving space */
int16 forknum; /* ForkNumber, saving space */
- RelFileNode rnode;
+ RelFileLocator rlocator;
uint32 segno;
} FileTag;
diff --git a/src/include/utils/inval.h b/src/include/utils/inval.h
index 0e0323b..23748b7 100644
--- a/src/include/utils/inval.h
+++ b/src/include/utils/inval.h
@@ -15,7 +15,7 @@
#define INVAL_H
#include "access/htup.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "utils/relcache.h"
extern PGDLLIMPORT int debug_discard_caches;
@@ -48,7 +48,7 @@ extern void CacheInvalidateRelcacheByTuple(HeapTuple classTuple);
extern void CacheInvalidateRelcacheByRelid(Oid relid);
-extern void CacheInvalidateSmgr(RelFileNodeBackend rnode);
+extern void CacheInvalidateSmgr(RelFileLocatorBackend rlocator);
extern void CacheInvalidateRelmap(Oid databaseId);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 1896a9a..54f9c5c 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -23,7 +23,7 @@
#include "partitioning/partdefs.h"
#include "rewrite/prs2lock.h"
#include "storage/block.h"
-#include "storage/relfilenode.h"
+#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "utils/relcache.h"
#include "utils/reltrigger.h"
@@ -53,7 +53,7 @@ typedef LockInfoData *LockInfo;
typedef struct RelationData
{
- RelFileNode rd_node; /* relation physical identifier */
+ RelFileLocator rd_locator; /* relation physical identifier */
SMgrRelation rd_smgr; /* cached file handle, or NULL */
int rd_refcnt; /* reference count */
BackendId rd_backend; /* owning backend id, if temporary relation */
@@ -66,44 +66,44 @@ typedef struct RelationData
/*----------
* rd_createSubid is the ID of the highest subtransaction the rel has
- * survived into or zero if the rel or its rd_node was created before the
- * current top transaction. (IndexStmt.oldNode leads to the case of a new
- * rel with an old rd_node.) rd_firstRelfilenodeSubid is the ID of the
- * highest subtransaction an rd_node change has survived into or zero if
- * rd_node matches the value it had at the start of the current top
+ * survived into or zero if the rel or its storage was created before the
+ * current top transaction. (IndexStmt.oldNumber leads to the case of a new
+ * rel with an old rd_locator.) rd_firstRelfilelocatorSubid is the ID of the
+ * highest subtransaction an rd_locator change has survived into or zero if
+ * rd_locator matches the value it had at the start of the current top
* transaction. (Rolling back the subtransaction that
- * rd_firstRelfilenodeSubid denotes would restore rd_node to the value it
+ * rd_firstRelfilelocatorSubid denotes would restore rd_locator to the value it
* had at the start of the current top transaction. Rolling back any
* lower subtransaction would not.) Their accuracy is critical to
* RelationNeedsWAL().
*
- * rd_newRelfilenodeSubid is the ID of the highest subtransaction the
- * most-recent relfilenode change has survived into or zero if not changed
+ * rd_newRelfilelocatorSubid is the ID of the highest subtransaction the
+ * most-recent relfilenumber change has survived into or zero if not changed
* in the current transaction (or we have forgotten changing it). This
* field is accurate when non-zero, but it can be zero when a relation has
- * multiple new relfilenodes within a single transaction, with one of them
+ * multiple new relfilenumbers within a single transaction, with one of them
* occurring in a subsequently aborted subtransaction, e.g.
* BEGIN;
* TRUNCATE t;
* SAVEPOINT save;
* TRUNCATE t;
* ROLLBACK TO save;
- * -- rd_newRelfilenodeSubid is now forgotten
+ * -- rd_newRelfilelocatorSubid is now forgotten
*
* If every rd_*Subid field is zero, they are read-only outside
- * relcache.c. Files that trigger rd_node changes by updating
+ * relcache.c. Files that trigger rd_locator changes by updating
* pg_class.reltablespace and/or pg_class.relfilenode call
- * RelationAssumeNewRelfilenode() to update rd_*Subid.
+ * RelationAssumeNewRelfilelocator() to update rd_*Subid.
*
* rd_droppedSubid is the ID of the highest subtransaction that a drop of
* the rel has survived into. In entries visible outside relcache.c, this
* is always zero.
*/
SubTransactionId rd_createSubid; /* rel was created in current xact */
- SubTransactionId rd_newRelfilenodeSubid; /* highest subxact changing
- * rd_node to current value */
- SubTransactionId rd_firstRelfilenodeSubid; /* highest subxact changing
- * rd_node to any value */
+ SubTransactionId rd_newRelfilelocatorSubid; /* highest subxact changing
+ * rd_locator to current value */
+ SubTransactionId rd_firstRelfilelocatorSubid; /* highest subxact changing
+ * rd_locator to any value */
SubTransactionId rd_droppedSubid; /* dropped with another Subid set */
Form_pg_class rd_rel; /* RELATION tuple */
@@ -531,12 +531,12 @@ typedef struct ViewOptions
/*
* RelationIsMapped
- * True if the relation uses the relfilenode map. Note multiple eval
+ * True if the relation uses the relfilenumber map. Note multiple eval
* of argument!
*/
#define RelationIsMapped(relation) \
(RELKIND_HAS_STORAGE((relation)->rd_rel->relkind) && \
- ((relation)->rd_rel->relfilenode == InvalidOid))
+ ((relation)->rd_rel->relfilenode == InvalidRelFileNumber))
/*
* RelationGetSmgr
@@ -555,7 +555,7 @@ static inline SMgrRelation
RelationGetSmgr(Relation rel)
{
if (unlikely(rel->rd_smgr == NULL))
- smgrsetowner(&(rel->rd_smgr), smgropen(rel->rd_node, rel->rd_backend));
+ smgrsetowner(&(rel->rd_smgr), smgropen(rel->rd_locator, rel->rd_backend));
return rel->rd_smgr;
}
@@ -607,12 +607,12 @@ RelationGetSmgr(Relation rel)
*
* Returns false if wal_level = minimal and this relation is created or
* truncated in the current transaction. See "Skipping WAL for New
- * RelFileNode" in src/backend/access/transam/README.
+ * RelFileLocator" in src/backend/access/transam/README.
*/
#define RelationNeedsWAL(relation) \
(RelationIsPermanent(relation) && (XLogIsNeeded() || \
(relation->rd_createSubid == InvalidSubTransactionId && \
- relation->rd_firstRelfilenodeSubid == InvalidSubTransactionId)))
+ relation->rd_firstRelfilelocatorSubid == InvalidSubTransactionId)))
/*
* RelationUsesLocalBuffers
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index c93d865..ba35d6b 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -103,7 +103,7 @@ extern Relation RelationBuildLocalRelation(const char *relname,
TupleDesc tupDesc,
Oid relid,
Oid accessmtd,
- Oid relfilenode,
+ RelFileNumber relfilenumber,
Oid reltablespace,
bool shared_relation,
bool mapped_relation,
@@ -111,10 +111,10 @@ extern Relation RelationBuildLocalRelation(const char *relname,
char relkind);
/*
- * Routines to manage assignment of new relfilenode to a relation
+ * Routines to manage assignment of new relfilenumber to a relation
*/
-extern void RelationSetNewRelfilenode(Relation relation, char persistence);
-extern void RelationAssumeNewRelfilenode(Relation relation);
+extern void RelationSetNewRelfilenumber(Relation relation, char persistence);
+extern void RelationAssumeNewRelfilelocator(Relation relation);
/*
* Routines for flushing/rebuilding relcache entries in various scenarios
diff --git a/src/include/utils/relfilenodemap.h b/src/include/utils/relfilenodemap.h
deleted file mode 100644
index 77d8046..0000000
--- a/src/include/utils/relfilenodemap.h
+++ /dev/null
@@ -1,18 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * relfilenodemap.h
- * relfilenode to oid mapping cache.
- *
- * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * src/include/utils/relfilenodemap.h
- *
- *-------------------------------------------------------------------------
- */
-#ifndef RELFILENODEMAP_H
-#define RELFILENODEMAP_H
-
-extern Oid RelidByRelfilenode(Oid reltablespace, Oid relfilenode);
-
-#endif /* RELFILENODEMAP_H */
diff --git a/src/include/utils/relfilenumbermap.h b/src/include/utils/relfilenumbermap.h
new file mode 100644
index 0000000..c149a93
--- /dev/null
+++ b/src/include/utils/relfilenumbermap.h
@@ -0,0 +1,19 @@
+/*-------------------------------------------------------------------------
+ *
+ * relfilenumbermap.h
+ * relfilenumber to oid mapping cache.
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/relfilenumbermap.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef RELFILENUMBERMAP_H
+#define RELFILENUMBERMAP_H
+
+extern Oid RelidByRelfilenumber(Oid reltablespace,
+ RelFileNumber relfilenumber);
+
+#endif /* RELFILENUMBERMAP_H */
diff --git a/src/include/utils/relmapper.h b/src/include/utils/relmapper.h
index 557f77e..2bb2e25 100644
--- a/src/include/utils/relmapper.h
+++ b/src/include/utils/relmapper.h
@@ -1,7 +1,7 @@
/*-------------------------------------------------------------------------
*
* relmapper.h
- * Catalog-to-filenode mapping
+ * Catalog-to-filenumber mapping
*
*
* Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
@@ -35,14 +35,15 @@ typedef struct xl_relmap_update
#define MinSizeOfRelmapUpdate offsetof(xl_relmap_update, data)
-extern Oid RelationMapOidToFilenode(Oid relationId, bool shared);
+extern RelFileNumber RelationMapOidToFilenumber(Oid relationId, bool shared);
-extern Oid RelationMapFilenodeToOid(Oid relationId, bool shared);
-extern Oid RelationMapOidToFilenodeForDatabase(char *dbpath, Oid relationId);
+extern Oid RelationMapFilenumberToOid(RelFileNumber relationId, bool shared);
+extern RelFileNumber RelationMapOidToFilenumberForDatabase(char *dbpath,
+ Oid relationId);
extern void RelationMapCopy(Oid dbid, Oid tsid, char *srcdbpath,
char *dstdbpath);
-extern void RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
- bool immediate);
+extern void RelationMapUpdateMap(Oid relationId, RelFileNumber fileNumber,
+ bool shared, bool immediate);
extern void RelationMapRemoveMapping(Oid relationId);
diff --git a/src/test/recovery/t/018_wal_optimize.pl b/src/test/recovery/t/018_wal_optimize.pl
index 4700d49..869d9d5 100644
--- a/src/test/recovery/t/018_wal_optimize.pl
+++ b/src/test/recovery/t/018_wal_optimize.pl
@@ -5,7 +5,7 @@
#
# These tests exercise code that once violated the mandate described in
# src/backend/access/transam/README section "Skipping WAL for New
-# RelFileNode". The tests work by committing some transactions, initiating an
+# RelFileLocator". The tests work by committing some transactions, initiating an
# immediate shutdown, and confirming that the expected data survives recovery.
# For many years, individual commands made the decision to skip WAL, hence the
# frequent appearance of COPY in these tests.
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index c7f550e..34a76ce 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2257,8 +2257,8 @@ ReindexObjectType
ReindexParams
ReindexStmt
ReindexType
-RelFileNode
-RelFileNodeBackend
+RelFileLocator
+RelFileLocatorBackend
RelIdCacheEnt
RelInfo
RelInfoArr
@@ -2276,8 +2276,8 @@ RelationPtr
RelationSyncEntry
RelcacheCallbackFunction
ReleaseMatchCB
-RelfilenodeMapEntry
-RelfilenodeMapKey
+RelfilenumberMapEntry
+RelfilenumberMapKey
Relids
RelocationBufferInfo
RelptrFreePageBtree
@@ -3879,7 +3879,7 @@ xl_xact_parsed_abort
xl_xact_parsed_commit
xl_xact_parsed_prepare
xl_xact_prepare
-xl_xact_relfilenodes
+xl_xact_relfilelocators
xl_xact_stats_item
xl_xact_stats_items
xl_xact_subxacts
--
1.8.3.1
Attachment: v6-0002-Preliminary-refactoring-for-supporting-larger.patch (text/x-patch; charset=US-ASCII)
From a7f69bfbfa64eee8267242c6538b76ad8679a21f Mon Sep 17 00:00:00 2001
From: dilip kumar <dilipbalaut@localhost.localdomain>
Date: Tue, 5 Jul 2022 12:51:25 +0530
Subject: [PATCH v6 2/5] Preliminary refactoring for supporting larger
relfilenumber
Currently, relfilenumber is of type Oid, which can wrap around. As part
of the larger patch set we are trying to make it 64 bits to avoid
wraparound, which will also make a couple of other things simpler, as
explained in the next patches.
This is a preliminary refactoring patch toward that goal: in BufferTag,
instead of embedding a RelFileLocator, we store the tablespace Oid,
database Oid, and relfilenumber directly, so that once relNumber in
RelFileLocator becomes 64 bits the buffer tag's alignment padding will
not change.
---
contrib/pg_buffercache/pg_buffercache_pages.c | 6 +-
contrib/pg_prewarm/autoprewarm.c | 7 +-
src/backend/storage/buffer/bufmgr.c | 113 ++++++++++++++++++--------
src/backend/storage/buffer/localbuf.c | 22 +++--
src/include/storage/buf_internals.h | 43 ++++++++--
5 files changed, 137 insertions(+), 54 deletions(-)
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 713f52a..abc8813 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -153,9 +153,9 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenumber = bufHdr->tag.rlocator.relNumber;
- fctx->record[i].reltablespace = bufHdr->tag.rlocator.spcOid;
- fctx->record[i].reldatabase = bufHdr->tag.rlocator.dbOid;
+ fctx->record[i].relfilenumber = BufTagGetFileNumber(bufHdr->tag);
+ fctx->record[i].reltablespace = bufHdr->tag.spcOid;
+ fctx->record[i].reldatabase = bufHdr->tag.dbOid;
fctx->record[i].forknum = bufHdr->tag.forkNum;
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index 7f1d55c..ca80d5a 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -631,9 +631,10 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
if (buf_state & BM_TAG_VALID &&
((buf_state & BM_PERMANENT) || dump_unlogged))
{
- block_info_array[num_blocks].database = bufHdr->tag.rlocator.dbOid;
- block_info_array[num_blocks].tablespace = bufHdr->tag.rlocator.spcOid;
- block_info_array[num_blocks].filenumber = bufHdr->tag.rlocator.relNumber;
+ block_info_array[num_blocks].database = bufHdr->tag.dbOid;
+ block_info_array[num_blocks].tablespace = bufHdr->tag.spcOid;
+ block_info_array[num_blocks].filenumber =
+ BufTagGetFileNumber(bufHdr->tag);
block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 7071ff6..a2c1e81 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1647,7 +1647,7 @@ ReleaseAndReadBuffer(Buffer buffer,
{
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
+ BufTagRelFileLocatorEquals(bufHdr->tag, relation->rd_locator) &&
bufHdr->tag.forkNum == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
@@ -1658,7 +1658,7 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetBufferDescriptor(buffer - 1);
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
+ BufTagRelFileLocatorEquals(bufHdr->tag, relation->rd_locator) &&
bufHdr->tag.forkNum == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
@@ -2000,8 +2000,8 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
- item->tsId = bufHdr->tag.rlocator.spcOid;
- item->relNumber = bufHdr->tag.rlocator.relNumber;
+ item->tsId = bufHdr->tag.spcOid;
+ item->relNumber = BufTagGetFileNumber(bufHdr->tag);
item->forkNum = bufHdr->tag.forkNum;
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2692,6 +2692,7 @@ PrintBufferLeakWarning(Buffer buffer)
char *path;
BackendId backend;
uint32 buf_state;
+ RelFileLocator rlocator;
Assert(BufferIsValid(buffer));
if (BufferIsLocal(buffer))
@@ -2707,8 +2708,10 @@ PrintBufferLeakWarning(Buffer buffer)
backend = InvalidBackendId;
}
+ BufTagCopyRelFileLocator(buf->tag, rlocator);
+
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rlocator, backend, buf->tag.forkNum);
+ path = relpathbackend(rlocator, backend, buf->tag.forkNum);
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2787,7 +2790,7 @@ BufferGetTag(Buffer buffer, RelFileLocator *rlocator, ForkNumber *forknum,
bufHdr = GetBufferDescriptor(buffer - 1);
/* pinned, so OK to read tag without spinlock */
- *rlocator = bufHdr->tag.rlocator;
+ BufTagCopyRelFileLocator(bufHdr->tag, *rlocator);
*forknum = bufHdr->tag.forkNum;
*blknum = bufHdr->tag.blockNum;
}
@@ -2838,7 +2841,12 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rlocator, InvalidBackendId);
+ {
+ RelFileLocator rlocator;
+
+ BufTagCopyRelFileLocator(buf->tag, rlocator);
+ reln = smgropen(rlocator, InvalidBackendId);
+ }
TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
buf->tag.blockNum,
@@ -3141,14 +3149,14 @@ DropRelFileLocatorBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
* We could check forkNum and blockNum as well as the rlocator, but the
* incremental win from doing so seems small.
*/
- if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator))
+ if (!BufTagRelFileLocatorEquals(bufHdr->tag, rlocator.locator))
continue;
buf_state = LockBufHdr(bufHdr);
for (j = 0; j < nforks; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator) &&
+ if (BufTagRelFileLocatorEquals(bufHdr->tag, rlocator.locator) &&
bufHdr->tag.forkNum == forkNum[j] &&
bufHdr->tag.blockNum >= firstDelBlock[j])
{
@@ -3301,7 +3309,7 @@ DropRelFileLocatorsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
for (j = 0; j < n; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, locators[j]))
+ if (BufTagRelFileLocatorEquals(bufHdr->tag, locators[j]))
{
rlocator = &locators[j];
break;
@@ -3310,7 +3318,10 @@ DropRelFileLocatorsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
}
else
{
- rlocator = bsearch((const void *) &(bufHdr->tag.rlocator),
+ RelFileLocator locator;
+
+ BufTagCopyRelFileLocator(bufHdr->tag, locator);
+ rlocator = bsearch((const void *) &(locator),
locators, n, sizeof(RelFileLocator),
rlocator_comparator);
}
@@ -3320,7 +3331,7 @@ DropRelFileLocatorsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, (*rlocator)))
+ if (BufTagRelFileLocatorEquals(bufHdr->tag, (*rlocator)))
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3380,7 +3391,7 @@ FindAndDropRelFileLocatorBuffers(RelFileLocator rlocator, ForkNumber forkNum,
*/
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
+ if (BufTagRelFileLocatorEquals(bufHdr->tag, rlocator) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -3419,11 +3430,11 @@ DropDatabaseBuffers(Oid dbid)
* As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rlocator.dbOid != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rlocator.dbOid == dbid)
+ if (bufHdr->tag.dbOid == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3447,13 +3458,16 @@ PrintBufferDescs(void)
{
BufferDesc *buf = GetBufferDescriptor(i);
Buffer b = BufferDescriptorGetBuffer(buf);
+ RelFileLocator rlocator;
+
+ BufTagCopyRelFileLocator(buf->tag, rlocator);
/* theoretically we should lock the bufhdr here */
elog(LOG,
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathbackend(buf->tag.rlocator, InvalidBackendId, buf->tag.forkNum),
+ relpathbackend(rlocator, InvalidBackendId, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3473,12 +3487,16 @@ PrintPinnedBufs(void)
if (GetPrivateRefCount(b) > 0)
{
+ RelFileLocator rlocator;
+
+ BufTagCopyRelFileLocator(buf->tag, rlocator);
+
/* theoretically we should lock the bufhdr here */
elog(LOG,
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathperm(buf->tag.rlocator, buf->tag.forkNum),
+ relpathperm(rlocator, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3517,7 +3535,7 @@ FlushRelationBuffers(Relation rel)
uint32 buf_state;
bufHdr = GetLocalBufferDescriptor(i);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
+ if (BufTagRelFileLocatorEquals(bufHdr->tag, rel->rd_locator) &&
((buf_state = pg_atomic_read_u32(&bufHdr->state)) &
(BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
@@ -3564,13 +3582,13 @@ FlushRelationBuffers(Relation rel)
* As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator))
+ if (!BufTagRelFileLocatorEquals(bufHdr->tag, rel->rd_locator))
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
+ if (BufTagRelFileLocatorEquals(bufHdr->tag, rel->rd_locator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3644,7 +3662,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
for (j = 0; j < nrels; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, srels[j].rlocator))
+ if (BufTagRelFileLocatorEquals(bufHdr->tag, srels[j].rlocator))
{
srelent = &srels[j];
break;
@@ -3653,7 +3671,10 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
}
else
{
- srelent = bsearch((const void *) &(bufHdr->tag.rlocator),
+ RelFileLocator rlocator;
+
+ BufTagCopyRelFileLocator(bufHdr->tag, rlocator);
+ srelent = bsearch((const void *) &(rlocator),
srels, nrels, sizeof(SMgrSortArray),
rlocator_comparator);
}
@@ -3665,7 +3686,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, srelent->rlocator) &&
+ if (BufTagRelFileLocatorEquals(bufHdr->tag, srelent->rlocator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3867,13 +3888,13 @@ FlushDatabaseBuffers(Oid dbid)
* As in DropRelFileLocatorBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rlocator.dbOid != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rlocator.dbOid == dbid &&
+ if (bufHdr->tag.dbOid == dbid &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -4033,6 +4054,10 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
if (XLogHintBitIsNeeded() &&
(pg_atomic_read_u32(&bufHdr->state) & BM_PERMANENT))
{
+ RelFileLocator rlocator;
+
+ BufTagCopyRelFileLocator(bufHdr->tag, rlocator);
+
/*
* If we must not write WAL, due to a relfilelocator-specific
* condition or being in recovery, don't dirty the page. We can
@@ -4041,8 +4066,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
*
* See src/backend/storage/page/README for longer discussion.
*/
- if (RecoveryInProgress() ||
- RelFileLocatorSkippingWAL(bufHdr->tag.rlocator))
+ if (RecoveryInProgress() || RelFileLocatorSkippingWAL(rlocator))
return;
/*
@@ -4650,8 +4674,10 @@ AbortBufferIO(void)
{
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
+ RelFileLocator rlocator;
- path = relpathperm(buf->tag.rlocator, buf->tag.forkNum);
+ BufTagCopyRelFileLocator(buf->tag, rlocator);
+ path = relpathperm(rlocator, buf->tag.forkNum);
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4675,7 +4701,11 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rlocator, bufHdr->tag.forkNum);
+ char *path;
+ RelFileLocator rlocator;
+
+ BufTagCopyRelFileLocator(bufHdr->tag, rlocator);
+ path = relpathperm(rlocator, bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4693,8 +4723,11 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
- char *path = relpathbackend(bufHdr->tag.rlocator, MyBackendId,
- bufHdr->tag.forkNum);
+ char *path;
+ RelFileLocator rlocator;
+
+ BufTagCopyRelFileLocator(bufHdr->tag, rlocator);
+ path = relpathbackend(rlocator, MyBackendId, bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4787,9 +4820,14 @@ WaitBufHdrUnlocked(BufferDesc *buf)
static inline int
buffertag_comparator(const BufferTag *ba, const BufferTag *bb)
{
- int ret;
+ int ret;
+ RelFileLocator rlocatora;
+ RelFileLocator rlocatorb;
+
+ BufTagCopyRelFileLocator(*ba, rlocatora);
+ BufTagCopyRelFileLocator(*bb, rlocatorb);
- ret = rlocator_comparator(&ba->rlocator, &bb->rlocator);
+ ret = rlocator_comparator(&rlocatora, &rlocatorb);
if (ret != 0)
return ret;
@@ -4946,10 +4984,12 @@ IssuePendingWritebacks(WritebackContext *context)
SMgrRelation reln;
int ahead;
BufferTag tag;
+ RelFileLocator currlocator;
Size nblocks = 1;
cur = &context->pending_writebacks[i];
tag = cur->tag;
+ BufTagCopyRelFileLocator(tag, currlocator);
/*
* Peek ahead, into following writeback requests, to see if they can
@@ -4957,10 +4997,13 @@ IssuePendingWritebacks(WritebackContext *context)
*/
for (ahead = 0; i + ahead + 1 < context->nr_pending; ahead++)
{
+ RelFileLocator nextrlocator;
+
next = &context->pending_writebacks[i + ahead + 1];
+ BufTagCopyRelFileLocator(next->tag, nextrlocator);
/* different file, stop */
- if (!RelFileLocatorEquals(cur->tag.rlocator, next->tag.rlocator) ||
+ if (!RelFileLocatorEquals(currlocator, nextrlocator) ||
cur->tag.forkNum != next->tag.forkNum)
break;
@@ -4979,7 +5022,7 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rlocator, InvalidBackendId);
+ reln = smgropen(currlocator, InvalidBackendId);
smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
}
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index 3dc9cc7..ce73172 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -213,9 +213,12 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
{
SMgrRelation oreln;
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
+ RelFileLocator rlocator;
+
+ BufTagCopyRelFileLocator(bufHdr->tag, rlocator);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rlocator, MyBackendId);
+ oreln = smgropen(rlocator, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
@@ -337,16 +340,22 @@ DropRelFileLocatorLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
+ BufTagRelFileLocatorEquals(bufHdr->tag, rlocator) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
+ {
+ RelFileLocator rlocator;
+
+ BufTagCopyRelFileLocator(bufHdr->tag, rlocator);
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rlocator, MyBackendId,
+ relpathbackend(rlocator, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
+ }
+
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
hash_search(LocalBufHash, (void *) &bufHdr->tag,
@@ -383,12 +392,15 @@ DropRelFileLocatorAllLocalBuffers(RelFileLocator rlocator)
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator))
+ BufTagRelFileLocatorEquals(bufHdr->tag, rlocator))
{
+ RelFileLocator rlocator;
+
+ BufTagCopyRelFileLocator(bufHdr->tag, rlocator);
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rlocator, MyBackendId,
+ relpathbackend(rlocator, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index b85b94f..78484a9 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,34 +90,61 @@
*/
typedef struct buftag
{
- RelFileLocator rlocator; /* physical relation identifier */
- ForkNumber forkNum;
- BlockNumber blockNum; /* blknum relative to begin of reln */
+ Oid spcOid; /* tablespace oid. */
+ Oid dbOid; /* database oid. */
+ RelFileNumber relNumber; /* relation file number. */
+ ForkNumber forkNum;
+ BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
+#define BufTagGetFileNumber(a) ((a).relNumber)
+
+#define BufTagSetFileNumber(a, relnumber) \
+( \
+ (a).relNumber = (relnumber) \
+)
+
#define CLEAR_BUFFERTAG(a) \
( \
- (a).rlocator.spcOid = InvalidOid, \
- (a).rlocator.dbOid = InvalidOid, \
- (a).rlocator.relNumber = InvalidRelFileNumber, \
+ (a).spcOid = InvalidOid, \
+ (a).dbOid = InvalidOid, \
+ BufTagSetFileNumber(a, InvalidRelFileNumber), \
(a).forkNum = InvalidForkNumber, \
(a).blockNum = InvalidBlockNumber \
)
#define INIT_BUFFERTAG(a,xx_rlocator,xx_forkNum,xx_blockNum) \
( \
- (a).rlocator = (xx_rlocator), \
+ (a).spcOid = (xx_rlocator).spcOid, \
+ (a).dbOid = (xx_rlocator).dbOid, \
+ BufTagSetFileNumber(a, (xx_rlocator).relNumber), \
(a).forkNum = (xx_forkNum), \
(a).blockNum = (xx_blockNum) \
)
#define BUFFERTAGS_EQUAL(a,b) \
( \
- RelFileLocatorEquals((a).rlocator, (b).rlocator) && \
+ (a).spcOid == (b).spcOid && \
+ (a).dbOid == (b).dbOid && \
+ (a).relNumber == (b).relNumber && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
)
+#define BufTagCopyRelFileLocator(a, locator) \
+do { \
+ (locator).spcOid = (a).spcOid; \
+ (locator).dbOid = (a).dbOid; \
+ (locator).relNumber = (a).relNumber; \
+} while(0)
+
+#define BufTagRelFileLocatorEquals(a, locator) \
+( \
+ (a).spcOid == (locator).spcOid && \
+ (a).dbOid == (locator).dbOid && \
+ (a).relNumber == (locator).relNumber \
+)
+
/*
* The shared buffer mapping table is partitioned to reduce contention.
* To determine which partition lock a given tag requires, compute the tag's
--
1.8.3.1
On Wed, Jul 6, 2022 at 7:55 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
Okay, changed that, and changed a few more occurrences in 0001 that
were along similar lines. I also tested pgbench performance while
concurrently running a script that creates/drops relations, and I do
not see any regression even with fairly small values of
VAR_RELNUMBER_PREFETCH; the smallest value I tried was 8. That
doesn't mean I am suggesting such a small value, but I think we can
keep the value at something like 512 or 1024 without worrying much
about performance, so I changed it to 512 in the latest patch.
OK, I have committed 0001 now with a few changes. pgindent did not
agree with some of your whitespace changes, and I also cleaned up a
few long lines. I replaced one instance of InvalidOid with
InvalidRelFileNumber also, and changed a word in a comment.
I think 0002 and 0003 need more work yet; I'll try to write a review
of those soon.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Wed, Jul 6, 2022 at 11:57 AM Robert Haas <robertmhaas@gmail.com> wrote:
I think 0002 and 0003 need more work yet; I'll try to write a review
of those soon.
Regarding 0002:
I don't particularly like the names BufTagCopyRelFileLocator and
BufTagRelFileLocatorEquals. My suggestion is to rename
BufTagRelFileLocatorEquals to BufTagMatchesRelFileLocator, because it
doesn't really make sense to me to talk about equality between values
of different data types. Instead of BufTagCopyRelFileLocator I would
prefer BufTagGetRelFileLocator. That would make it more similar to
BufTagGetFileNumber and BufTagSetFileNumber, which I think would be a
good thing.
Other than that I think 0002 seems fine.
Regarding 0003:
 /*
  * Don't try to prefetch anything in this database until
- * it has been created, or we might confuse the blocks of
- * different generations, if a database OID or
- * relfilenumber is reused. It's also more efficient than
+ * it has been created, because it's more efficient than
  * discovering that relations don't exist on disk yet with
  * ENOENT errors.
  */
I'm worried that this might not be correct. The comment changes here
(and I think also in some other places) imply that we've eliminated
relfilenode reuse, but I think that's not true. createdb() and movedb()
don't seem to be modified, so I think it's possible to just copy a
template database over without change, which means that relfilenumbers
and even relfilelocators could be reused. So I feel like maybe this
and similar places shouldn't be modified in this way. Am I
misunderstanding?
 /*
- * Relfilenumbers are not unique in databases across tablespaces, so we need
- * to allocate a new one in the new tablespace.
+ * Generate a new relfilenumber. Although relfilenumber are unique within a
+ * cluster, we are unable to use the old relfilenumber since unused
+ * relfilenumber are not unlinked until commit. So if within a
+ * transaction, if we set the old tablespace again, we will get conflicting
+ * relfilenumber file.
  */
- newrelfilenumber = GetNewRelFileNumber(newTableSpace, NULL,
-                                        rel->rd_rel->relpersistence);
+ newrelfilenumber = GetNewRelFileNumber();
I can't clearly understand this comment. Is it saying that the code
which follows is broken and needs to be fixed by a future patch before
things are OK again? If so, that's not good.
- * callers should be GetNewOidWithIndex() and GetNewRelFileNumber() in
- * catalog/catalog.c.
+ * callers should be GetNewOidWithIndex() in catalog/catalog.c.
If there is only one, it should say "caller", not "callers".
Orphan files are harmless --- at worst they waste a bit of disk space ---
-because we check for on-disk collisions when allocating new relfilenumber
-OIDs. So cleaning up isn't really necessary.
+because relfilenumber is 56 bit wide so logically there should not be any
+collisions. So cleaning up isn't really necessary.
I don't agree that orphaned files are harmless, but changing that is
beyond the scope of this patch. I think that the way you've ended the
sentence isn't sufficiently clear and correct even if we accept the
principle that orphaned files are harmless. What I think we should
stay instead is "because the relfilenode counter is monotonically
increasing. The maximum value is 2^56-1, and there is no provision for
wraparound."
+ /*
+ * Check if we set the new relfilenumber then do we run out of the logged
+ * relnumber, if so then we need to WAL log again. Otherwise, just adjust
+ * the relnumbercount.
+ */
+ relnumbercount = relnumber - ShmemVariableCache->nextRelFileNumber;
+ if (ShmemVariableCache->relnumbercount <= relnumbercount)
+ {
+ LogNextRelFileNumber(relnumber + VAR_RELNUMBER_PREFETCH);
+ ShmemVariableCache->relnumbercount = VAR_RELNUMBER_PREFETCH;
+ }
+ else
+ ShmemVariableCache->relnumbercount -= relnumbercount;
Would it be clearer, here and elsewhere, if VariableCacheData tracked
nextRelFileNumber and nextUnloggedRelFileNumber instead of
nextRelFileNumber and relnumbercount? I'm not 100% sure, but the idea
seems worth considering.
+ * Flush xlog record to disk before returning. To protect against file
+ * system changes reaching the disk before the XLOG_NEXT_RELFILENUMBER log.
The way this is worded, you would need it to be just one sentence,
like "Flush xlog record to disk before returning to protect
against...". Or else add "this is," like "This is to protect
against..."
But I'm thinking maybe we could reword it a little more, perhaps
something like this: "Flush xlog record to disk before returning. We
want to be sure that the in-memory nextRelFileNumber value is always
larger than any relfilenumber that is already in use on disk. To
maintain that invariant, we must make sure that the record we just
logged reaches the disk before any new files are created."
This isn't a full review, I think, but I'm kind of out of time and
energy for today.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Thu, Jul 7, 2022 at 2:54 AM Robert Haas <robertmhaas@gmail.com> wrote:
Thanks for committing 0001.
On Wed, Jul 6, 2022 at 11:57 AM Robert Haas <robertmhaas@gmail.com> wrote:
I think 0002 and 0003 need more work yet; I'll try to write a review
of those soon.

Regarding 0002:
I don't particularly like the names BufTagCopyRelFileLocator and
BufTagRelFileLocatorEquals. My suggestion is to rename
BufTagRelFileLocatorEquals to BufTagMatchesRelFileLocator, because it
doesn't really make sense to me to talk about equality between values
of different data types. Instead of BufTagCopyRelFileLocator I would
prefer BufTagGetRelFileLocator. That would make it more similar to
BufTagGetFileNumber and BufTagSetFileNumber, which I think would be a
good thing.

Other than that I think 0002 seems fine.
Changed as suggested. Although I feel BufTagCopyRelFileLocator is
actually copying the relfilelocator from buffer tag to an input
variable, I am fine with BufTagGetRelFileLocator so that it is similar
to the other names.
Changed some other macro names as below, because the field name they
are getting/setting is relNumber:
BufTagSetFileNumber -> BufTagSetRelNumber
BufTagGetFileNumber -> BufTagGetRelNumber
Regarding 0003:
I'm worried that this might not be correct. The comment changes here
(and I think also in some other places) imply that we've eliminated
relfilenode reuse, but I think that's not true. createdb() and movedb()
don't seem to be modified, so I think it's possible to just copy a
template database over without change, which means that relfilenumbers
and even relfilelocators could be reused. So I feel like maybe this
and similar places shouldn't be modified in this way. Am I
misunderstanding?
I think you are right, so I changed it.
 /*
- * Relfilenumbers are not unique in databases across tablespaces, so we need
- * to allocate a new one in the new tablespace.
+ * Generate a new relfilenumber. Although relfilenumber are unique within a
+ * cluster, we are unable to use the old relfilenumber since unused
+ * relfilenumber are not unlinked until commit. So if within a
+ * transaction, if we set the old tablespace again, we will get conflicting
+ * relfilenumber file.
  */
- newrelfilenumber = GetNewRelFileNumber(newTableSpace, NULL,
-                                        rel->rd_rel->relpersistence);
+ newrelfilenumber = GetNewRelFileNumber();

I can't clearly understand this comment. Is it saying that the code
which follows is broken and needs to be fixed by a future patch before
things are OK again? If so, that's not good.
No, it is not broken in this patch. Before our patch, the reason for
allocating a new relfilenumber was that if we created the file with the
old relfilenumber in the new tablespace, a file with the same name might
already exist there, because relfilenumbers were not unique across
tablespaces, so there could be a conflict. Now that is no longer the
case, but we still cannot reuse the old relfilenumber, because the old
relfilenumber's file is not removed from the old tablespace until the
next checkpoint, so if we move the table back to the old tablespace
there could be a conflict. And even once we have the final patch that
removes the tombstone file at commit, we still cannot reuse the old
relfilenumber, because within a transaction we can switch between
tablespaces multiple times, and the relfilenumber file in the old
tablespace is removed only at commit. This is what I am trying to
explain in the comment.
Now I have modified the comment slightly, such that in 0002 I am
saying files are not removed until the next checkpoint and in 0004 I
am modifying that and saying not removed until commit.
- * callers should be GetNewOidWithIndex() and GetNewRelFileNumber() in
- * catalog/catalog.c.
+ * callers should be GetNewOidWithIndex() in catalog/catalog.c.

If there is only one, it should say "caller", not "callers".
Orphan files are harmless --- at worst they waste a bit of disk space ---
-because we check for on-disk collisions when allocating new relfilenumber
-OIDs. So cleaning up isn't really necessary.
+because relfilenumber is 56 bit wide so logically there should not be any
+collisions. So cleaning up isn't really necessary.

I don't agree that orphaned files are harmless, but changing that is
beyond the scope of this patch. I think that the way you've ended the
sentence isn't sufficiently clear and correct even if we accept the
principle that orphaned files are harmless. What I think we should
say instead is "because the relfilenode counter is monotonically
increasing. The maximum value is 2^56-1, and there is no provision for
wraparound."
Done
+ /*
+ * Check if we set the new relfilenumber then do we run out of the logged
+ * relnumber, if so then we need to WAL log again. Otherwise, just adjust
+ * the relnumbercount.
+ */
+ relnumbercount = relnumber - ShmemVariableCache->nextRelFileNumber;
+ if (ShmemVariableCache->relnumbercount <= relnumbercount)
+ {
+     LogNextRelFileNumber(relnumber + VAR_RELNUMBER_PREFETCH);
+     ShmemVariableCache->relnumbercount = VAR_RELNUMBER_PREFETCH;
+ }
+ else
+     ShmemVariableCache->relnumbercount -= relnumbercount;

Would it be clearer, here and elsewhere, if VariableCacheData tracked
nextRelFileNumber and nextUnloggedRelFileNumber instead of
nextRelFileNumber and relnumbercount? I'm not 100% sure, but the idea
seems worth considering.
I think it is in line with oidCount; what do you think?
+ * Flush xlog record to disk before returning. To protect against file
+ * system changes reaching the disk before the XLOG_NEXT_RELFILENUMBER log.

The way this is worded, you would need it to be just one sentence,
like "Flush xlog record to disk before returning to protect
against...". Or else add "this is," like "This is to protect
against..."

But I'm thinking maybe we could reword it a little more, perhaps
something like this: "Flush xlog record to disk before returning. We
want to be sure that the in-memory nextRelFileNumber value is always
larger than any relfilenumber that is already in use on disk. To
maintain that invariant, we must make sure that the record we just
logged reaches the disk before any new files are created."
Done
This isn't a full review, I think, but I'm kind of out of time and
energy for today.
I have updated some other comments as well.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachments:
v7-0001-Preliminary-refactoring-for-supporting-larger.patchtext/x-patch; charset=US-ASCII; name=v7-0001-Preliminary-refactoring-for-supporting-larger.patchDownload
From b39b6d0081a417375c19d7966bb2d94ced5e8a03 Mon Sep 17 00:00:00 2001
From: dilip kumar <dilipbalaut@localhost.localdomain>
Date: Thu, 7 Jul 2022 10:15:23 +0530
Subject: [PATCH v7 1/4] Preliminary refactoring for supporting larger
relfilenumber
Currently, relfilenumber is of Oid type and can wrap around, so as part of
the larger patch set we are trying to make it 64 bits to avoid wraparound;
that will also make a couple of other things simpler, as explained in the
next patches.

This is a preliminary refactoring patch toward that goal: in BufferTag,
instead of keeping a RelFileLocator, we keep the tablespace Oid, database
Oid, and relfilenumber directly, so that once we change relNumber in
RelFileLocator to 64 bits the buffer tag alignment padding will not change.
---
contrib/pg_buffercache/pg_buffercache_pages.c | 6 +-
contrib/pg_prewarm/autoprewarm.c | 7 +-
src/backend/storage/buffer/bufmgr.c | 111 ++++++++++++++++++--------
src/backend/storage/buffer/localbuf.c | 22 +++--
src/include/storage/buf_internals.h | 41 ++++++++--
5 files changed, 135 insertions(+), 52 deletions(-)
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 131bd62..f5eb197 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -153,9 +153,9 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenumber = bufHdr->tag.rlocator.relNumber;
- fctx->record[i].reltablespace = bufHdr->tag.rlocator.spcOid;
- fctx->record[i].reldatabase = bufHdr->tag.rlocator.dbOid;
+ fctx->record[i].relfilenumber = BufTagGetRelNumber(bufHdr->tag);
+ fctx->record[i].reltablespace = bufHdr->tag.spcOid;
+ fctx->record[i].reldatabase = bufHdr->tag.dbOid;
fctx->record[i].forknum = bufHdr->tag.forkNum;
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index 13eee4a..cc67aa6 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -631,9 +631,10 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
if (buf_state & BM_TAG_VALID &&
((buf_state & BM_PERMANENT) || dump_unlogged))
{
- block_info_array[num_blocks].database = bufHdr->tag.rlocator.dbOid;
- block_info_array[num_blocks].tablespace = bufHdr->tag.rlocator.spcOid;
- block_info_array[num_blocks].filenumber = bufHdr->tag.rlocator.relNumber;
+ block_info_array[num_blocks].database = bufHdr->tag.dbOid;
+ block_info_array[num_blocks].tablespace = bufHdr->tag.spcOid;
+ block_info_array[num_blocks].filenumber =
+ BufTagGetRelNumber(bufHdr->tag);
block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index e4de4b3..0086716 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1647,7 +1647,7 @@ ReleaseAndReadBuffer(Buffer buffer,
{
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
+ BufTagMatchesRelFileLocator(bufHdr->tag, relation->rd_locator) &&
bufHdr->tag.forkNum == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
@@ -1658,7 +1658,7 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetBufferDescriptor(buffer - 1);
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
+ BufTagMatchesRelFileLocator(bufHdr->tag, relation->rd_locator) &&
bufHdr->tag.forkNum == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
@@ -2000,8 +2000,8 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
- item->tsId = bufHdr->tag.rlocator.spcOid;
- item->relNumber = bufHdr->tag.rlocator.relNumber;
+ item->tsId = bufHdr->tag.spcOid;
+ item->relNumber = BufTagGetRelNumber(bufHdr->tag);
item->forkNum = bufHdr->tag.forkNum;
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2692,6 +2692,7 @@ PrintBufferLeakWarning(Buffer buffer)
char *path;
BackendId backend;
uint32 buf_state;
+ RelFileLocator rlocator;
Assert(BufferIsValid(buffer));
if (BufferIsLocal(buffer))
@@ -2707,8 +2708,10 @@ PrintBufferLeakWarning(Buffer buffer)
backend = InvalidBackendId;
}
+ BufTagGetRelFileLocator(buf->tag, rlocator);
+
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rlocator, backend, buf->tag.forkNum);
+ path = relpathbackend(rlocator, backend, buf->tag.forkNum);
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2787,7 +2790,7 @@ BufferGetTag(Buffer buffer, RelFileLocator *rlocator, ForkNumber *forknum,
bufHdr = GetBufferDescriptor(buffer - 1);
/* pinned, so OK to read tag without spinlock */
- *rlocator = bufHdr->tag.rlocator;
+ BufTagGetRelFileLocator(bufHdr->tag, *rlocator);
*forknum = bufHdr->tag.forkNum;
*blknum = bufHdr->tag.blockNum;
}
@@ -2838,7 +2841,12 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rlocator, InvalidBackendId);
+ {
+ RelFileLocator rlocator;
+
+ BufTagGetRelFileLocator(buf->tag, rlocator);
+ reln = smgropen(rlocator, InvalidBackendId);
+ }
TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
buf->tag.blockNum,
@@ -3141,14 +3149,14 @@ DropRelFileLocatorBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
* We could check forkNum and blockNum as well as the rlocator, but
* the incremental win from doing so seems small.
*/
- if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator))
+ if (!BufTagMatchesRelFileLocator(bufHdr->tag, rlocator.locator))
continue;
buf_state = LockBufHdr(bufHdr);
for (j = 0; j < nforks; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator) &&
+ if (BufTagMatchesRelFileLocator(bufHdr->tag, rlocator.locator) &&
bufHdr->tag.forkNum == forkNum[j] &&
bufHdr->tag.blockNum >= firstDelBlock[j])
{
@@ -3301,7 +3309,7 @@ DropRelFileLocatorsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
for (j = 0; j < n; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, locators[j]))
+ if (BufTagMatchesRelFileLocator(bufHdr->tag, locators[j]))
{
rlocator = &locators[j];
break;
@@ -3310,7 +3318,10 @@ DropRelFileLocatorsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
}
else
{
- rlocator = bsearch((const void *) &(bufHdr->tag.rlocator),
+ RelFileLocator locator;
+
+ BufTagGetRelFileLocator(bufHdr->tag, locator);
+ rlocator = bsearch((const void *) &(locator),
locators, n, sizeof(RelFileLocator),
rlocator_comparator);
}
@@ -3320,7 +3331,7 @@ DropRelFileLocatorsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, (*rlocator)))
+ if (BufTagMatchesRelFileLocator(bufHdr->tag, (*rlocator)))
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3380,7 +3391,7 @@ FindAndDropRelFileLocatorBuffers(RelFileLocator rlocator, ForkNumber forkNum,
*/
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
+ if (BufTagMatchesRelFileLocator(bufHdr->tag, rlocator) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -3419,11 +3430,11 @@ DropDatabaseBuffers(Oid dbid)
* As in DropRelFileLocatorBuffers, an unlocked precheck should be
* safe and saves some cycles.
*/
- if (bufHdr->tag.rlocator.dbOid != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rlocator.dbOid == dbid)
+ if (bufHdr->tag.dbOid == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3447,13 +3458,16 @@ PrintBufferDescs(void)
{
BufferDesc *buf = GetBufferDescriptor(i);
Buffer b = BufferDescriptorGetBuffer(buf);
+ RelFileLocator rlocator;
+
+ BufTagGetRelFileLocator(buf->tag, rlocator);
/* theoretically we should lock the bufhdr here */
elog(LOG,
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathbackend(buf->tag.rlocator, InvalidBackendId, buf->tag.forkNum),
+ relpathbackend(rlocator, InvalidBackendId, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3473,12 +3487,16 @@ PrintPinnedBufs(void)
if (GetPrivateRefCount(b) > 0)
{
+ RelFileLocator rlocator;
+
+ BufTagGetRelFileLocator(buf->tag, rlocator);
+
/* theoretically we should lock the bufhdr here */
elog(LOG,
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathperm(buf->tag.rlocator, buf->tag.forkNum),
+ relpathperm(rlocator, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3517,7 +3535,7 @@ FlushRelationBuffers(Relation rel)
uint32 buf_state;
bufHdr = GetLocalBufferDescriptor(i);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
+ if (BufTagMatchesRelFileLocator(bufHdr->tag, rel->rd_locator) &&
((buf_state = pg_atomic_read_u32(&bufHdr->state)) &
(BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
@@ -3564,13 +3582,13 @@ FlushRelationBuffers(Relation rel)
* As in DropRelFileLocatorBuffers, an unlocked precheck should be
* safe and saves some cycles.
*/
- if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator))
+ if (!BufTagMatchesRelFileLocator(bufHdr->tag, rel->rd_locator))
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
+ if (BufTagMatchesRelFileLocator(bufHdr->tag, rel->rd_locator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3644,7 +3662,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
for (j = 0; j < nrels; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, srels[j].rlocator))
+ if (BufTagMatchesRelFileLocator(bufHdr->tag, srels[j].rlocator))
{
srelent = &srels[j];
break;
@@ -3653,7 +3671,10 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
}
else
{
- srelent = bsearch((const void *) &(bufHdr->tag.rlocator),
+ RelFileLocator rlocator;
+
+ BufTagGetRelFileLocator(bufHdr->tag, rlocator);
+ srelent = bsearch((const void *) &(rlocator),
srels, nrels, sizeof(SMgrSortArray),
rlocator_comparator);
}
@@ -3665,7 +3686,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, srelent->rlocator) &&
+ if (BufTagMatchesRelFileLocator(bufHdr->tag, srelent->rlocator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3867,13 +3888,13 @@ FlushDatabaseBuffers(Oid dbid)
* As in DropRelFileLocatorBuffers, an unlocked precheck should be
* safe and saves some cycles.
*/
- if (bufHdr->tag.rlocator.dbOid != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rlocator.dbOid == dbid &&
+ if (bufHdr->tag.dbOid == dbid &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -4033,6 +4054,10 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
if (XLogHintBitIsNeeded() &&
(pg_atomic_read_u32(&bufHdr->state) & BM_PERMANENT))
{
+ RelFileLocator rlocator;
+
+ BufTagGetRelFileLocator(bufHdr->tag, rlocator);
+
/*
* If we must not write WAL, due to a relfilelocator-specific
* condition or being in recovery, don't dirty the page. We can
@@ -4041,8 +4066,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
*
* See src/backend/storage/page/README for longer discussion.
*/
- if (RecoveryInProgress() ||
- RelFileLocatorSkippingWAL(bufHdr->tag.rlocator))
+ if (RecoveryInProgress() || RelFileLocatorSkippingWAL(rlocator))
return;
/*
@@ -4650,8 +4674,10 @@ AbortBufferIO(void)
{
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
+ RelFileLocator rlocator;
- path = relpathperm(buf->tag.rlocator, buf->tag.forkNum);
+ BufTagGetRelFileLocator(buf->tag, rlocator);
+ path = relpathperm(rlocator, buf->tag.forkNum);
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4675,7 +4701,11 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rlocator, bufHdr->tag.forkNum);
+ char *path;
+ RelFileLocator rlocator;
+
+ BufTagGetRelFileLocator(bufHdr->tag, rlocator);
+ path = relpathperm(rlocator, bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4693,8 +4723,11 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
- char *path = relpathbackend(bufHdr->tag.rlocator, MyBackendId,
- bufHdr->tag.forkNum);
+ char *path;
+ RelFileLocator rlocator;
+
+ BufTagGetRelFileLocator(bufHdr->tag, rlocator);
+ path = relpathbackend(rlocator, MyBackendId, bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4788,8 +4821,13 @@ static inline int
buffertag_comparator(const BufferTag *ba, const BufferTag *bb)
{
int ret;
+ RelFileLocator rlocatora;
+ RelFileLocator rlocatorb;
+
+ BufTagGetRelFileLocator(*ba, rlocatora);
+ BufTagGetRelFileLocator(*bb, rlocatorb);
- ret = rlocator_comparator(&ba->rlocator, &bb->rlocator);
+ ret = rlocator_comparator(&rlocatora, &rlocatorb);
if (ret != 0)
return ret;
@@ -4946,10 +4984,12 @@ IssuePendingWritebacks(WritebackContext *context)
SMgrRelation reln;
int ahead;
BufferTag tag;
+ RelFileLocator currlocator;
Size nblocks = 1;
cur = &context->pending_writebacks[i];
tag = cur->tag;
+ BufTagGetRelFileLocator(tag, currlocator);
/*
* Peek ahead, into following writeback requests, to see if they can
@@ -4957,10 +4997,13 @@ IssuePendingWritebacks(WritebackContext *context)
*/
for (ahead = 0; i + ahead + 1 < context->nr_pending; ahead++)
{
+ RelFileLocator nextrlocator;
+
next = &context->pending_writebacks[i + ahead + 1];
+ BufTagGetRelFileLocator(next->tag, nextrlocator);
/* different file, stop */
- if (!RelFileLocatorEquals(cur->tag.rlocator, next->tag.rlocator) ||
+ if (!RelFileLocatorEquals(currlocator, nextrlocator) ||
cur->tag.forkNum != next->tag.forkNum)
break;
@@ -4979,7 +5022,7 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rlocator, InvalidBackendId);
+ reln = smgropen(currlocator, InvalidBackendId);
smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
}
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index 41a0807..76e8556 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -213,9 +213,12 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
{
SMgrRelation oreln;
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
+ RelFileLocator rlocator;
+
+ BufTagGetRelFileLocator(bufHdr->tag, rlocator);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rlocator, MyBackendId);
+ oreln = smgropen(rlocator, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
@@ -337,16 +340,22 @@ DropRelFileLocatorLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
+ BufTagMatchesRelFileLocator(bufHdr->tag, rlocator) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
+ {
+ RelFileLocator rlocator;
+
+ BufTagGetRelFileLocator(bufHdr->tag, rlocator);
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rlocator, MyBackendId,
+ relpathbackend(rlocator, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
+ }
+
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
hash_search(LocalBufHash, (void *) &bufHdr->tag,
@@ -383,12 +392,15 @@ DropRelFileLocatorAllLocalBuffers(RelFileLocator rlocator)
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator))
+ BufTagMatchesRelFileLocator(bufHdr->tag, rlocator))
{
+ RelFileLocator rlocator;
+
+ BufTagGetRelFileLocator(bufHdr->tag, rlocator);
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rlocator, MyBackendId,
+ relpathbackend(rlocator, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index aded5e8..4c36d55 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,34 +90,61 @@
*/
typedef struct buftag
{
- RelFileLocator rlocator; /* physical relation identifier */
- ForkNumber forkNum;
+ Oid spcOid; /* tablespace oid */
+ Oid dbOid; /* database oid */
+ RelFileNumber relNumber; /* relation file number */
+ ForkNumber forkNum; /* fork number */
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
+#define BufTagGetRelNumber(a) ((a).relNumber)
+
+#define BufTagSetRelNumber(a, relnumber) \
+( \
+ (a).relNumber = (relnumber) \
+)
+
+#define BufTagGetRelFileLocator(a, locator) \
+do { \
+ (locator).spcOid = (a).spcOid; \
+ (locator).dbOid = (a).dbOid; \
+ (locator).relNumber = (a).relNumber; \
+} while(0)
+
#define CLEAR_BUFFERTAG(a) \
( \
- (a).rlocator.spcOid = InvalidOid, \
- (a).rlocator.dbOid = InvalidOid, \
- (a).rlocator.relNumber = InvalidRelFileNumber, \
+ (a).spcOid = InvalidOid, \
+ (a).dbOid = InvalidOid, \
+ BufTagSetRelNumber(a, InvalidRelFileNumber), \
(a).forkNum = InvalidForkNumber, \
(a).blockNum = InvalidBlockNumber \
)
#define INIT_BUFFERTAG(a,xx_rlocator,xx_forkNum,xx_blockNum) \
( \
- (a).rlocator = (xx_rlocator), \
+ (a).spcOid = (xx_rlocator).spcOid, \
+ (a).dbOid = (xx_rlocator).dbOid, \
+ BufTagSetRelNumber(a, (xx_rlocator).relNumber), \
(a).forkNum = (xx_forkNum), \
(a).blockNum = (xx_blockNum) \
)
#define BUFFERTAGS_EQUAL(a,b) \
( \
- RelFileLocatorEquals((a).rlocator, (b).rlocator) && \
+ (a).spcOid == (b).spcOid && \
+ (a).dbOid == (b).dbOid && \
+ (a).relNumber == (b).relNumber && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
)
+#define BufTagMatchesRelFileLocator(a, locator) \
+( \
+ (a).spcOid == (locator).spcOid && \
+ (a).dbOid == (locator).dbOid && \
+ (a).relNumber == (locator).relNumber \
+)
+
/*
* The shared buffer mapping table is partitioned to reduce contention.
* To determine which partition lock a given tag requires, compute the tag's
--
1.8.3.1
v7-0003-Assert-checking-to-be-merged-with-0002.patch (text/x-patch; charset=US-ASCII)
From 44965db6004a8fce320ac09519e4b49d7243ae40 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Thu, 7 Jul 2022 16:30:26 +0530
Subject: [PATCH v7 3/4] Assert checking (to be merged with 0002)
---
src/backend/catalog/catalog.c | 54 ++++++++++++++++++++++++++++++++++++++++
src/backend/catalog/heap.c | 5 ++++
src/backend/catalog/storage.c | 6 +++++
src/backend/commands/tablecmds.c | 3 +++
src/include/catalog/catalog.h | 9 +++++++
5 files changed, 77 insertions(+)
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 155400c..9a22203 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -583,3 +583,57 @@ pg_stop_making_pinned_objects(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+
+#ifdef USE_ASSERT_CHECKING
+
+/*
+ * Assert that no disk file already exists for the given relfilenumber.
+ */
+void
+AssertRelfileNumberFileNotExists(Oid spcoid, RelFileNumber relnumber,
+ char relpersistence)
+{
+ RelFileLocatorBackend rlocator;
+ char *rpath;
+ BackendId backend;
+
+ /*
+ * If we ever get here during pg_upgrade, there's something wrong; all
+ * relfilenode assignments during a binary-upgrade run should be
+ * determined by commands in the dump script.
+ */
+ Assert(!IsBinaryUpgrade);
+
+ switch (relpersistence)
+ {
+ case RELPERSISTENCE_TEMP:
+ backend = BackendIdForTempRelations();
+ break;
+ case RELPERSISTENCE_UNLOGGED:
+ case RELPERSISTENCE_PERMANENT:
+ backend = InvalidBackendId;
+ break;
+ default:
+ elog(ERROR, "invalid relpersistence: %c", relpersistence);
+ return; /* placate compiler */
+ }
+
+ /* this logic should match RelationInitPhysicalAddr */
+ rlocator.locator.spcOid = spcoid ? spcoid : MyDatabaseTableSpace;
+ rlocator.locator.dbOid =
+ (rlocator.locator.spcOid == GLOBALTABLESPACE_OID) ? InvalidOid :
+ MyDatabaseId;
+
+ /*
+ * The relpath will vary based on the backend ID, so we must initialize
+ * that properly here to make sure that any collisions based on filename
+ * are properly detected.
+ */
+ rlocator.backend = backend;
+
+ /* check for existing file of same name. */
+ rpath = relpath(rlocator, MAIN_FORKNUM);
+
+ Assert(access(rpath, F_OK) != 0);
+}
+#endif
\ No newline at end of file
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 955ae3b..30fd15c 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -345,7 +345,12 @@ heap_create(const char *relname,
* with oid same as relid.
*/
if (!RelFileNumberIsValid(relfilenumber))
+ {
relfilenumber = GetNewRelFileNumber();
+ AssertRelfileNumberFileNotExists(reltablespace,
+ relfilenumber,
+ relpersistence);
+ }
}
/*
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index d708af1..db85bd8 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -968,6 +968,9 @@ smgr_redo(XLogReaderState *record)
xl_smgr_create *xlrec = (xl_smgr_create *) XLogRecGetData(record);
SMgrRelation reln;
+ Assert(xlrec->rlocator.relNumber <=
+ ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
smgrcreate(reln, xlrec->forkNum, true);
}
@@ -981,6 +984,9 @@ smgr_redo(XLogReaderState *record)
int nforks = 0;
bool need_fsm_vacuum = false;
+ Assert(xlrec->rlocator.relNumber <=
+ ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
/*
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 3f48523..0678681 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14377,6 +14377,9 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
* will get the conflicting relfilenumber file.
*/
newrelfilenumber = GetNewRelFileNumber();
+ AssertRelfileNumberFileNotExists(newTableSpace,
+ newrelfilenumber,
+ rel->rd_rel->relpersistence);
/* Open old and new relation */
newrlocator = rel->rd_locator;
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index b452530..be6ba13 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -39,4 +39,13 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
+#ifdef USE_ASSERT_CHECKING
+extern void AssertRelfileNumberFileNotExists(Oid spcoid,
+ RelFileNumber relnumber,
+ char relpersistence);
+#else
+#define AssertRelfileNumberFileNotExists(spcoid, relnumber, relpersistence) \
+ ((void)true)
+#endif
+
#endif /* CATALOG_H */
--
1.8.3.1
v7-0002-Widen-relfilenumber-from-32-bits-to-56-bits.patch (text/x-patch; charset=UTF-8)
From b1c252264a1f9225a026e24341625da66533b62d Mon Sep 17 00:00:00 2001
From: dilip kumar <dilipbalaut@localhost.localdomain>
Date: Thu, 7 Jul 2022 10:50:25 +0530
Subject: [PATCH v7 2/4] Widen relfilenumber from 32 bits to 56 bits
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Currently the relfilenumber is 32 bits wide, so it is at risk of wraparound and
a relfilenumber can be reused. To guard against such reuse, a complicated hack
leaves a 0-length tombstone file around until the next checkpoint, and when we
allocate a new relfilenumber we also have to loop to check for an on-disk
conflict.
This patch widens the relfilenumber to 56 bits, with no provision for
wraparound. After this change we can get rid of the 0-length tombstone files
and of the loop that checks for on-disk relfilenumber conflicts.
We widen to 56 bits rather than a full 64 because a 64-bit field would enlarge
the BufferTag, increasing memory usage and potentially hurting performance. To
avoid that, the buffer tag uses 8 bits for the fork number and 56 bits for the
relfilenumber.
---
contrib/pg_buffercache/Makefile | 3 +-
.../pg_buffercache/pg_buffercache--1.3--1.4.sql | 30 +++++++
contrib/pg_buffercache/pg_buffercache.control | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 31 ++++++-
contrib/pg_walinspect/expected/pg_walinspect.out | 4 +-
contrib/pg_walinspect/sql/pg_walinspect.sql | 4 +-
doc/src/sgml/catalogs.sgml | 2 +-
doc/src/sgml/pgbuffercache.sgml | 2 +-
src/backend/access/gin/ginxlog.c | 2 +-
src/backend/access/rmgrdesc/gistdesc.c | 2 +-
src/backend/access/rmgrdesc/heapdesc.c | 2 +-
src/backend/access/rmgrdesc/nbtdesc.c | 2 +-
src/backend/access/rmgrdesc/seqdesc.c | 2 +-
src/backend/access/rmgrdesc/xlogdesc.c | 21 +++--
src/backend/access/transam/README | 5 +-
src/backend/access/transam/varsup.c | 97 +++++++++++++++++++++-
src/backend/access/transam/xlog.c | 51 ++++++++++++
src/backend/access/transam/xlogprefetcher.c | 14 ++--
src/backend/access/transam/xlogrecovery.c | 6 +-
src/backend/access/transam/xlogutils.c | 12 +--
src/backend/catalog/catalog.c | 95 ---------------------
src/backend/catalog/heap.c | 15 ++--
src/backend/catalog/index.c | 11 +--
src/backend/commands/tablecmds.c | 9 +-
src/backend/nodes/outfuncs.c | 7 +-
src/backend/replication/logical/decode.c | 1 +
src/backend/replication/logical/reorderbuffer.c | 2 +-
src/backend/storage/freespace/fsmpage.c | 2 +-
src/backend/storage/lmgr/lwlocknames.txt | 1 +
src/backend/storage/smgr/md.c | 7 ++
src/backend/storage/smgr/smgr.c | 2 +-
src/backend/utils/adt/dbsize.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 9 +-
src/backend/utils/cache/relcache.c | 3 +-
src/backend/utils/cache/relfilenumbermap.c | 4 +-
src/backend/utils/misc/pg_controldata.c | 9 +-
src/bin/pg_checksums/pg_checksums.c | 6 +-
src/bin/pg_controldata/pg_controldata.c | 2 +
src/bin/pg_dump/pg_dump.c | 20 ++---
src/bin/pg_rewind/filemap.c | 6 +-
src/bin/pg_upgrade/info.c | 5 +-
src/bin/pg_upgrade/pg_upgrade.c | 6 +-
src/bin/pg_upgrade/relfilenumber.c | 4 +-
src/bin/pg_waldump/pg_waldump.c | 2 +-
src/common/relpath.c | 20 ++---
src/fe_utils/option_utils.c | 42 ++++++++++
src/include/access/transam.h | 5 ++
src/include/access/xlog.h | 1 +
src/include/catalog/catalog.h | 3 -
src/include/catalog/pg_class.h | 16 ++--
src/include/catalog/pg_control.h | 2 +
src/include/catalog/pg_proc.dat | 10 +--
src/include/fe_utils/option_utils.h | 3 +
src/include/postgres_ext.h | 7 +-
src/include/storage/buf_internals.h | 22 +++--
src/include/storage/relfilelocator.h | 9 +-
src/test/regress/expected/alter_table.out | 24 +++---
src/test/regress/expected/oidjoins.out | 2 +-
src/test/regress/sql/alter_table.sql | 8 +-
59 files changed, 443 insertions(+), 257 deletions(-)
create mode 100644 contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile
index 2ab8c65..2fbb62f 100644
--- a/contrib/pg_buffercache/Makefile
+++ b/contrib/pg_buffercache/Makefile
@@ -7,7 +7,8 @@ OBJS = \
EXTENSION = pg_buffercache
DATA = pg_buffercache--1.2.sql pg_buffercache--1.2--1.3.sql \
- pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql
+ pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql \
+ pg_buffercache--1.3--1.4.sql
PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time"
ifdef USE_PGXS
diff --git a/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
new file mode 100644
index 0000000..ee2d9c7
--- /dev/null
+++ b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
@@ -0,0 +1,30 @@
+/* contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.4'" to load this file. \quit
+
+/* First we have to remove them from the extension */
+ALTER EXTENSION pg_buffercache DROP VIEW pg_buffercache;
+ALTER EXTENSION pg_buffercache DROP FUNCTION pg_buffercache_pages();
+
+/* Then we can drop them */
+DROP VIEW pg_buffercache;
+DROP FUNCTION pg_buffercache_pages();
+
+/* Now redefine */
+CREATE OR REPLACE FUNCTION pg_buffercache_pages()
+RETURNS SETOF RECORD
+AS 'MODULE_PATHNAME', 'pg_buffercache_pages_v1_4'
+LANGUAGE C PARALLEL SAFE;
+
+CREATE OR REPLACE VIEW pg_buffercache AS
+ SELECT P.* FROM pg_buffercache_pages() AS P
+ (bufferid integer, relfilenode int8, reltablespace oid, reldatabase oid,
+ relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
+ pinning_backends int4);
+
+-- Don't want these to be available to public.
+REVOKE ALL ON FUNCTION pg_buffercache_pages() FROM PUBLIC;
+REVOKE ALL ON pg_buffercache FROM PUBLIC;
+GRANT EXECUTE ON FUNCTION pg_buffercache_pages() TO pg_monitor;
+GRANT SELECT ON pg_buffercache TO pg_monitor;
diff --git a/contrib/pg_buffercache/pg_buffercache.control b/contrib/pg_buffercache/pg_buffercache.control
index 8c060ae..a82ae5f 100644
--- a/contrib/pg_buffercache/pg_buffercache.control
+++ b/contrib/pg_buffercache/pg_buffercache.control
@@ -1,5 +1,5 @@
# pg_buffercache extension
comment = 'examine the shared buffer cache'
-default_version = '1.3'
+default_version = '1.4'
module_pathname = '$libdir/pg_buffercache'
relocatable = true
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index f5eb197..31aa332 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -59,9 +59,10 @@ typedef struct
* relation node/tablespace/database/blocknum and dirty indicator.
*/
PG_FUNCTION_INFO_V1(pg_buffercache_pages);
+PG_FUNCTION_INFO_V1(pg_buffercache_pages_v1_4);
-Datum
-pg_buffercache_pages(PG_FUNCTION_ARGS)
+static Datum
+pg_buffercache_pages_internal(PG_FUNCTION_ARGS, Oid rfn_typid)
{
FuncCallContext *funcctx;
Datum result;
@@ -103,7 +104,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
TupleDescInitEntry(tupledesc, (AttrNumber) 1, "bufferid",
INT4OID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 2, "relfilenode",
- OIDOID, -1, 0);
+ rfn_typid, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 3, "reltablespace",
OIDOID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 4, "reldatabase",
@@ -209,7 +210,16 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
}
else
{
- values[1] = ObjectIdGetDatum(fctx->record[i].relfilenumber);
+ if (rfn_typid == INT8OID)
+ values[1] =
+ Int64GetDatum((int64) fctx->record[i].relfilenumber);
+ else
+ {
+ Assert(rfn_typid == OIDOID);
+ values[1] =
+ ObjectIdGetDatum((Oid) fctx->record[i].relfilenumber);
+ }
+
nulls[1] = false;
values[2] = ObjectIdGetDatum(fctx->record[i].reltablespace);
nulls[2] = false;
@@ -237,3 +247,16 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
else
SRF_RETURN_DONE(funcctx);
}
+
+Datum
+pg_buffercache_pages(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, OIDOID);
+}
+
+/* entry point for old extension version */
+Datum
+pg_buffercache_pages_v1_4(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, INT8OID);
+}
diff --git a/contrib/pg_walinspect/expected/pg_walinspect.out b/contrib/pg_walinspect/expected/pg_walinspect.out
index a1ee743..e9b06ed 100644
--- a/contrib/pg_walinspect/expected/pg_walinspect.out
+++ b/contrib/pg_walinspect/expected/pg_walinspect.out
@@ -54,9 +54,9 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- ===================================================================
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
ok
----
t
diff --git a/contrib/pg_walinspect/sql/pg_walinspect.sql b/contrib/pg_walinspect/sql/pg_walinspect.sql
index 1b265ea..5393834 100644
--- a/contrib/pg_walinspect/sql/pg_walinspect.sql
+++ b/contrib/pg_walinspect/sql/pg_walinspect.sql
@@ -39,10 +39,10 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
-- ===================================================================
-- Test for filtering out WAL records based on resource_manager and
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 4f3f375..3a48c35 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1965,7 +1965,7 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
</para>
<para>
Name of the on-disk file of this relation; zero means this
diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
index a06fd3e..e222265 100644
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@@ -62,7 +62,7 @@
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
(references <link linkend="catalog-pg-class"><structname>pg_class</structname></link>.<structfield>relfilenode</structfield>)
</para>
<para>
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 41b9211..b75ad79 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -100,7 +100,7 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
BlockNumber blknum;
BufferGetTag(buffer, &locator, &forknum, &blknum);
- elog(ERROR, "failed to add item to index page in %u/%u/%u",
+ elog(ERROR, "failed to add item to index page in %u/%u/" INT64_FORMAT,
locator.spcOid, locator.dbOid, locator.relNumber);
}
}
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index 7dd3c1d..c699937 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -26,7 +26,7 @@ out_gistxlogPageUpdate(StringInfo buf, gistxlogPageUpdate *xlrec)
static void
out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec)
{
- appendStringInfo(buf, "rel %u/%u/%u; blk %u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; blk %u; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber, xlrec->block,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 923d3bc..f6d278b 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -169,7 +169,7 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
{
xl_heap_new_cid *xlrec = (xl_heap_new_cid *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; tid %u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; tid %u/%u",
xlrec->target_locator.spcOid,
xlrec->target_locator.dbOid,
xlrec->target_locator.relNumber,
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index 4843cd5..70feb2d 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -100,7 +100,7 @@ btree_desc(StringInfo buf, XLogReaderState *record)
{
xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/seqdesc.c b/src/backend/access/rmgrdesc/seqdesc.c
index b3845f9..45c6ee7 100644
--- a/src/backend/access/rmgrdesc/seqdesc.c
+++ b/src/backend/access/rmgrdesc/seqdesc.c
@@ -25,7 +25,7 @@ seq_desc(StringInfo buf, XLogReaderState *record)
xl_seq_rec *xlrec = (xl_seq_rec *) rec;
if (info == XLOG_SEQ_LOG)
- appendStringInfo(buf, "rel %u/%u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT,
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber);
}
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 6fec485..a723790 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -45,8 +45,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
CheckPoint *checkpoint = (CheckPoint *) rec;
appendStringInfo(buf, "redo %X/%X; "
- "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; "
- "oldest xid %u in DB %u; oldest multi %u in DB %u; "
+ "tli %u; prev tli %u; fpw %s; xid %u:%u; relfilenumber " INT64_FORMAT "; oid %u; "
+ "multi %u; offset %u; oldest xid %u in DB %u; oldest multi %u in DB %u; "
"oldest/newest commit timestamp xid: %u/%u; "
"oldest running xid %u; %s",
LSN_FORMAT_ARGS(checkpoint->redo),
@@ -55,6 +55,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->fullPageWrites ? "true" : "false",
EpochFromFullTransactionId(checkpoint->nextXid),
XidFromFullTransactionId(checkpoint->nextXid),
+ checkpoint->nextRelFileNumber,
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
@@ -74,6 +75,13 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
memcpy(&nextOid, rec, sizeof(Oid));
appendStringInfo(buf, "%u", nextOid);
}
+ else if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, rec, sizeof(RelFileNumber));
+ appendStringInfo(buf, INT64_FORMAT, nextRelFileNumber);
+ }
else if (info == XLOG_RESTORE_POINT)
{
xl_restore_point *xlrec = (xl_restore_point *) rec;
@@ -169,6 +177,9 @@ xlog_identify(uint8 info)
case XLOG_NEXTOID:
id = "NEXTOID";
break;
+ case XLOG_NEXT_RELFILENUMBER:
+ id = "NEXT_RELFILENUMBER";
+ break;
case XLOG_SWITCH:
id = "SWITCH";
break;
@@ -237,7 +248,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfoChar(buf, ' ');
appendStringInfo(buf,
- "blkref #%d: rel %u/%u/%u fork %s blk %u",
+ "blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -297,7 +308,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
if (forknum != MAIN_FORKNUM)
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u fork %s blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -306,7 +317,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
else
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
blk);
diff --git a/src/backend/access/transam/README b/src/backend/access/transam/README
index 734c39a..e7116c3 100644
--- a/src/backend/access/transam/README
+++ b/src/backend/access/transam/README
@@ -692,8 +692,9 @@ by having database restart search for files that don't have any committed
entry in pg_class, but that currently isn't done because of the possibility
of deleting data that is useful for forensic analysis of the crash.
Orphan files are harmless --- at worst they waste a bit of disk space ---
-because we check for on-disk collisions when allocating new relfilenumber
-OIDs. So cleaning up isn't really necessary.
+because the relfilenumber counter is monotonically increasing. The maximum
+value is 2^56-1, and there is no provision for wraparound. So cleaning up
+isn't really necessary.
3. Deleting a table, which requires an unlink() that could fail.
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 849a7ce..1711936 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -30,6 +30,9 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+/* Number of RelFileNumbers to prefetch (preallocate) per XLOG write */
+#define VAR_RELNUMBER_PREFETCH 512
+
/* pointer to "variable cache" in shared memory (set up by shmem.c) */
VariableCache ShmemVariableCache = NULL;
@@ -521,8 +524,7 @@ ForceTransactionIdLimitUpdate(void)
* wide, counter wraparound will occur eventually, and therefore it is unwise
* to assume they are unique unless precautions are taken to make them so.
* Hence, this routine should generally not be used directly. The only direct
- * callers should be GetNewOidWithIndex() and GetNewRelFileNumber() in
- * catalog/catalog.c.
+ * caller should be GetNewOidWithIndex() in catalog/catalog.c.
*/
Oid
GetNewObjectId(void)
@@ -613,6 +615,97 @@ SetNextObjectId(Oid nextOid)
}
/*
+ * GetNewRelFileNumber
+ *
+ * Similar to GetNewObjectId, but generates a new RelFileNumber instead
+ * of a new Oid.
+ */
+RelFileNumber
+GetNewRelFileNumber(void)
+{
+ RelFileNumber result;
+
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot assign RelFileNumber during recovery");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /* check for wraparound of the relfilenumber counter */
+ if (unlikely(ShmemVariableCache->nextRelFileNumber > MAX_RELFILENUMBER))
+ elog(ERROR, "relfilenumber is out of bounds");
+
+ /* if we have run out of WAL-logged RelFileNumbers, log some more */
+ if (ShmemVariableCache->relnumbercount == 0)
+ {
+ LogNextRelFileNumber(ShmemVariableCache->nextRelFileNumber +
+ VAR_RELNUMBER_PREFETCH);
+
+ ShmemVariableCache->relnumbercount = VAR_RELNUMBER_PREFETCH;
+ }
+
+ result = ShmemVariableCache->nextRelFileNumber;
+ (ShmemVariableCache->nextRelFileNumber)++;
+ (ShmemVariableCache->relnumbercount)--;
+
+ LWLockRelease(RelFileNumberGenLock);
+
+ return result;
+}
+
+/*
+ * SetNextRelFileNumber
+ *
+ * This may only be called during pg_upgrade; it advances the RelFileNumber
+ * counter to the specified value if the current value is smaller than the
+ * input value.
+ */
+void
+SetNextRelFileNumber(RelFileNumber relnumber)
+{
+ int relnumbercount;
+
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot forward RelFileNumber during recovery");
+
+ if (!IsBinaryUpgrade)
+ elog(ERROR, "the RelFileNumber can be set only during binary upgrade");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /*
+ * If the previously assigned value of nextRelFileNumber is already
+ * higher than the requested value, there is nothing to do. This can
+ * happen because, during upgrade, objects are not necessarily created
+ * in relfilenumber order.
+ */
+ if (relnumber <= ShmemVariableCache->nextRelFileNumber)
+ {
+ LWLockRelease(RelFileNumberGenLock);
+ return;
+ }
+
+ /*
+ * If setting the new relfilenumber would exhaust the range of values
+ * already WAL-logged, log a new range; otherwise, just adjust the
+ * relnumbercount counter.
+ */
+ relnumbercount = relnumber - ShmemVariableCache->nextRelFileNumber;
+ if (ShmemVariableCache->relnumbercount <= relnumbercount)
+ {
+ LogNextRelFileNumber(relnumber + VAR_RELNUMBER_PREFETCH);
+ ShmemVariableCache->relnumbercount = VAR_RELNUMBER_PREFETCH;
+ }
+ else
+ ShmemVariableCache->relnumbercount -= relnumbercount;
+
+ ShmemVariableCache->nextRelFileNumber = relnumber;
+
+ LWLockRelease(RelFileNumberGenLock);
+}
+
+/*
* StopGeneratingPinnedObjectIds
*
* This is called once during initdb to force the OID counter up to
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 1b2f240..28c3d31 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4543,6 +4543,7 @@ BootStrapXLOG(void)
checkPoint.nextXid =
FullTransactionIdFromEpochAndXid(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstGenbkiObjectId;
+ checkPoint.nextRelFileNumber = FirstNormalRelFileNumber;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
checkPoint.oldestXid = FirstNormalTransactionId;
@@ -4556,7 +4557,9 @@ BootStrapXLOG(void)
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->relnumbercount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -5023,7 +5026,9 @@ StartupXLOG(void)
/* initialize shared memory variables from the checkpoint record */
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->relnumbercount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -6472,6 +6477,12 @@ CreateCheckPoint(int flags)
checkPoint.nextOid += ShmemVariableCache->oidCount;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_SHARED);
+ checkPoint.nextRelFileNumber = ShmemVariableCache->nextRelFileNumber;
+ if (!shutdown)
+ checkPoint.nextRelFileNumber += ShmemVariableCache->relnumbercount;
+ LWLockRelease(RelFileNumberGenLock);
+
MultiXactGetCheckptMulti(shutdown,
&checkPoint.nextMulti,
&checkPoint.nextMultiOffset,
@@ -7350,6 +7361,32 @@ XLogPutNextOid(Oid nextOid)
}
/*
+ * Similar to XLogPutNextOid, but writes a NEXT_RELFILENUMBER record
+ * instead of a NEXTOID record.
+ */
+void
+LogNextRelFileNumber(RelFileNumber nextrelnumber)
+{
+ XLogRecPtr recptr;
+
+ XLogBeginInsert();
+ XLogRegisterData((char *) (&nextrelnumber), sizeof(RelFileNumber));
+ recptr = XLogInsert(RM_XLOG_ID, XLOG_NEXT_RELFILENUMBER);
+
+ /*
+ * Flush xlog record to disk before returning. We want to be sure that the
+ * in-memory nextRelFileNumber value is always larger than any
+ * relfilenumber that is already in use on disk. To maintain that
+ * invariant, we must make sure that the record we just logged reaches the
+ * disk before any new files are created.
+ *
+ * This should not hurt performance, since we do not WAL-log every
+ * allocation, only one record per VAR_RELNUMBER_PREFETCH allocations.
+ */
+ XLogFlush(recptr);
+}
+
+/*
* Write an XLOG SWITCH record.
*
* Here we just blindly issue an XLogInsert request for the record.
@@ -7564,6 +7601,16 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
}
+ if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, XLogRecGetData(record), sizeof(RelFileNumber));
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = nextRelFileNumber;
+ ShmemVariableCache->relnumbercount = 0;
+ LWLockRelease(RelFileNumberGenLock);
+ }
else if (info == XLOG_CHECKPOINT_SHUTDOWN)
{
CheckPoint checkPoint;
@@ -7578,6 +7625,10 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->relnumbercount = 0;
+ LWLockRelease(RelFileNumberGenLock);
MultiXactSetNextMXact(checkPoint.nextMulti,
checkPoint.nextMultiOffset);
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index 87d1421..d27c7fc 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -611,7 +611,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u until %X/%X is replayed, which creates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, which creates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -634,7 +634,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, which truncates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, which truncates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -733,7 +733,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing all prefetch in relation %u/%u/%u until %X/%X is replayed, because the relation does not exist on disk",
+ "suppressing all prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, because the relation does not exist on disk",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -754,7 +754,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, because the relation is too small",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, because the relation is too small",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -793,7 +793,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* truncated beneath our feet?
*/
elog(ERROR,
- "could not prefetch relation %u/%u/%u block %u",
+ "could not prefetch relation %u/%u/" INT64_FORMAT " block %u",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -932,7 +932,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed),
filter->filter_from_block);
@@ -948,7 +948,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed));
#endif
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 5d6f1b5..1d95fc0 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2175,14 +2175,14 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
continue;
if (forknum != MAIN_FORKNUM)
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, fork %u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", fork %u, blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
forknum,
blk);
else
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
@@ -2378,7 +2378,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
if (memcmp(replay_image_masked, primary_image_masked, BLCKSZ) != 0)
{
elog(FATAL,
- "inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
+ "inconsistent page found, rel %u/%u/" INT64_FORMAT ", forknum %u, blkno %u",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forknum, blkno);
}
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 0cda225..b4398e1 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -617,17 +617,17 @@ CreateFakeRelcacheEntry(RelFileLocator rlocator)
rel->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
/* We don't know the name of the relation; use relfilenumber instead */
- sprintf(RelationGetRelationName(rel), "%u", rlocator.relNumber);
+ sprintf(RelationGetRelationName(rel), INT64_FORMAT, rlocator.relNumber);
/*
* We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNumber may be
- * different from the relation's OID. It shouldn't really matter though.
- * In recovery, we are running by ourselves and can't have any lock
- * conflicts. While syncing, we already hold AccessExclusiveLock.
+ * relation. Note that we set relId to FirstNormalObjectId, which is
+ * completely bogus. It shouldn't really matter though. In recovery,
+ * we are running by ourselves and can't have any lock conflicts. While
+ * syncing, we already hold AccessExclusiveLock.
*/
rel->rd_lockInfo.lockRelId.dbId = rlocator.dbOid;
- rel->rd_lockInfo.lockRelId.relId = rlocator.relNumber;
+ rel->rd_lockInfo.lockRelId.relId = FirstNormalObjectId;
rel->rd_smgr = NULL;
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 6f43870..155400c 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -481,101 +481,6 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
}
/*
- * GetNewRelFileNumber
- * Generate a new relfilenumber that is unique within the
- * database of the given tablespace.
- *
- * If the relfilenumber will also be used as the relation's OID, pass the
- * opened pg_class catalog, and this routine will guarantee that the result
- * is also an unused OID within pg_class. If the result is to be used only
- * as a relfilenumber for an existing relation, pass NULL for pg_class.
- *
- * As with GetNewOidWithIndex(), there is some theoretical risk of a race
- * condition, but it doesn't seem worth worrying about.
- *
- * Note: we don't support using this in bootstrap mode. All relations
- * created by bootstrap have preassigned OIDs, so there's no need.
- */
-RelFileNumber
-GetNewRelFileNumber(Oid reltablespace, Relation pg_class, char relpersistence)
-{
- RelFileLocatorBackend rlocator;
- char *rpath;
- bool collides;
- BackendId backend;
-
- /*
- * If we ever get here during pg_upgrade, there's something wrong; all
- * relfilenumber assignments during a binary-upgrade run should be
- * determined by commands in the dump script.
- */
- Assert(!IsBinaryUpgrade);
-
- switch (relpersistence)
- {
- case RELPERSISTENCE_TEMP:
- backend = BackendIdForTempRelations();
- break;
- case RELPERSISTENCE_UNLOGGED:
- case RELPERSISTENCE_PERMANENT:
- backend = InvalidBackendId;
- break;
- default:
- elog(ERROR, "invalid relpersistence: %c", relpersistence);
- return InvalidRelFileNumber; /* placate compiler */
- }
-
- /* This logic should match RelationInitPhysicalAddr */
- rlocator.locator.spcOid = reltablespace ? reltablespace : MyDatabaseTableSpace;
- rlocator.locator.dbOid =
- (rlocator.locator.spcOid == GLOBALTABLESPACE_OID) ?
- InvalidOid : MyDatabaseId;
-
- /*
- * The relpath will vary based on the backend ID, so we must initialize
- * that properly here to make sure that any collisions based on filename
- * are properly detected.
- */
- rlocator.backend = backend;
-
- do
- {
- CHECK_FOR_INTERRUPTS();
-
- /* Generate the OID */
- if (pg_class)
- rlocator.locator.relNumber = GetNewOidWithIndex(pg_class, ClassOidIndexId,
- Anum_pg_class_oid);
- else
- rlocator.locator.relNumber = GetNewObjectId();
-
- /* Check for existing file of same name */
- rpath = relpath(rlocator, MAIN_FORKNUM);
-
- if (access(rpath, F_OK) == 0)
- {
- /* definite collision */
- collides = true;
- }
- else
- {
- /*
- * Here we have a little bit of a dilemma: if errno is something
- * other than ENOENT, should we declare a collision and loop? In
- * practice it seems best to go ahead regardless of the errno. If
- * there is a colliding file we will get an smgr failure when we
- * attempt to create the new relation file.
- */
- collides = false;
- }
-
- pfree(rpath);
- } while (collides);
-
- return rlocator.locator.relNumber;
-}
-
-/*
* SQL callable interface for GetNewOidWithIndex(). Outside of initdb's
* direct insertions into catalog tables, and recovering from corruption, this
* should rarely be needed.
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index e770ea6..955ae3b 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -345,7 +345,7 @@ heap_create(const char *relname,
* with oid same as relid.
*/
if (!RelFileNumberIsValid(relfilenumber))
- relfilenumber = relid;
+ relfilenumber = GetNewRelFileNumber();
}
/*
@@ -898,7 +898,7 @@ InsertPgClassTuple(Relation pg_class_desc,
values[Anum_pg_class_reloftype - 1] = ObjectIdGetDatum(rd_rel->reloftype);
values[Anum_pg_class_relowner - 1] = ObjectIdGetDatum(rd_rel->relowner);
values[Anum_pg_class_relam - 1] = ObjectIdGetDatum(rd_rel->relam);
- values[Anum_pg_class_relfilenode - 1] = ObjectIdGetDatum(rd_rel->relfilenode);
+ values[Anum_pg_class_relfilenode - 1] = Int64GetDatum(rd_rel->relfilenode);
values[Anum_pg_class_reltablespace - 1] = ObjectIdGetDatum(rd_rel->reltablespace);
values[Anum_pg_class_relpages - 1] = Int32GetDatum(rd_rel->relpages);
values[Anum_pg_class_reltuples - 1] = Float4GetDatum(rd_rel->reltuples);
@@ -1170,12 +1170,7 @@ heap_create_with_catalog(const char *relname,
if (shared_relation && reltablespace != GLOBALTABLESPACE_OID)
elog(ERROR, "shared relations must be placed in pg_global tablespace");
- /*
- * Allocate an OID for the relation, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the relation, unless we were told what to use. */
if (!OidIsValid(relid))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -1229,8 +1224,8 @@ heap_create_with_catalog(const char *relname,
}
if (!OidIsValid(relid))
- relid = GetNewRelFileNumber(reltablespace, pg_class_desc,
- relpersistence);
+ relid = GetNewOidWithIndex(pg_class_desc, ClassOidIndexId,
+ Anum_pg_class_oid);
}
/*
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index c5d463a..3402f49 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -900,12 +900,7 @@ index_create(Relation heapRelation,
collationObjectId,
classObjectId);
- /*
- * Allocate an OID for the index, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the index, unless we were told what to use. */
if (!OidIsValid(indexRelationId))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -937,8 +932,8 @@ index_create(Relation heapRelation,
}
else
{
- indexRelationId =
- GetNewRelFileNumber(tableSpaceId, pg_class, relpersistence);
+ indexRelationId = GetNewOidWithIndex(pg_class, ClassOidIndexId,
+ Anum_pg_class_oid);
}
}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index ef5b34a..3f48523 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14371,11 +14371,12 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
}
/*
- * Relfilenumbers are not unique in databases across tablespaces, so we
- * need to allocate a new one in the new tablespace.
+ * Generate a new relfilenumber. We cannot reuse the old relfilenumber
+ * because the unused relfilenumber files are not unlinked until the next
+ * checkpoint. So if we moved the relation back to the old tablespace, we
+ * could find a conflicting relfilenumber file there.
*/
- newrelfilenumber = GetNewRelFileNumber(newTableSpace, NULL,
- rel->rd_rel->relpersistence);
+ newrelfilenumber = GetNewRelFileNumber();
/* Open old and new relation */
newrlocator = rel->rd_locator;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 05f27f0..8b53cd3 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -55,6 +55,11 @@ static void outChar(StringInfo str, char c);
#define WRITE_INT_FIELD(fldname) \
appendStringInfo(str, " :" CppAsString(fldname) " %d", node->fldname)
+/* Write a signed 64-bit integer field (anything written with INT64_FORMAT) */
+#define WRITE_INT64_FIELD(fldname) \
+ appendStringInfo(str, " :" CppAsString(fldname) " " INT64_FORMAT, \
+ node->fldname)
+
/* Write an unsigned integer field (anything written as ":fldname %u") */
#define WRITE_UINT_FIELD(fldname) \
appendStringInfo(str, " :" CppAsString(fldname) " %u", node->fldname)
@@ -2932,7 +2937,7 @@ _outIndexStmt(StringInfo str, const IndexStmt *node)
WRITE_NODE_FIELD(excludeOpNames);
WRITE_STRING_FIELD(idxcomment);
WRITE_OID_FIELD(indexOid);
- WRITE_OID_FIELD(oldNumber);
+ WRITE_INT64_FIELD(oldNumber);
WRITE_UINT_FIELD(oldCreateSubid);
WRITE_UINT_FIELD(oldFirstRelfilelocatorSubid);
WRITE_BOOL_FIELD(unique);
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index c5c6a2b..7029604 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -154,6 +154,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
break;
case XLOG_NOOP:
case XLOG_NEXTOID:
+ case XLOG_NEXT_RELFILENUMBER:
case XLOG_SWITCH:
case XLOG_BACKUP_END:
case XLOG_PARAMETER_CHANGE:
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 88a37fd..99580a4 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -4869,7 +4869,7 @@ DisplayMapping(HTAB *tuplecid_data)
hash_seq_init(&hstat, tuplecid_data);
while ((ent = (ReorderBufferTupleCidEnt *) hash_seq_search(&hstat)) != NULL)
{
- elog(DEBUG3, "mapping: node: %u/%u/%u tid: %u/%u cmin: %u, cmax: %u",
+ elog(DEBUG3, "mapping: node: %u/%u/" INT64_FORMAT " tid: %u/%u cmin: %u, cmax: %u",
ent->key.rlocator.dbOid,
ent->key.rlocator.spcOid,
ent->key.rlocator.relNumber,
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index af4dab7..172225b 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -273,7 +273,7 @@ restart:
BlockNumber blknum;
BufferGetTag(buf, &rlocator, &forknum, &blknum);
- elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
+ elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/" INT64_FORMAT,
blknum, rlocator.spcOid, rlocator.dbOid, rlocator.relNumber);
/* make sure we hold an exclusive lock */
diff --git a/src/backend/storage/lmgr/lwlocknames.txt b/src/backend/storage/lmgr/lwlocknames.txt
index 6c7cf6c..3c5d041 100644
--- a/src/backend/storage/lmgr/lwlocknames.txt
+++ b/src/backend/storage/lmgr/lwlocknames.txt
@@ -53,3 +53,4 @@ XactTruncationLock 44
# 45 was XactTruncationLock until removal of BackendRandomLock
WrapLimitsVacuumLock 46
NotifyQueueTailLock 47
+RelFileNumberGenLock 48
\ No newline at end of file
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 3998296..d9e72cf 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -257,6 +257,13 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* next checkpoint, we prevent reassignment of the relfilenumber until it's
* safe, because relfilenumber assignment skips over any existing file.
*
+ * XXX all of the above was true when relfilenumbers were 32 bits wide, but
+ * now that they are 56 bits wide there is no risk of a relfilenumber being
+ * reused, so in the future we could immediately unlink the first segment as
+ * well. A relfilenumber can still be reused during createdb() with the
+ * file-copy strategy, or during movedb(), but the scenario above applies
+ * only when creating a new relation.
+ *
* We do not need to go through this dance for temp relations, though, because
* we never make WAL entries for temp rels, and so a temp rel poses no threat
* to the health of a regular rel that has taken over its relfilenumber.
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index b21d8c3..5f6c12a 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -154,7 +154,7 @@ smgropen(RelFileLocator rlocator, BackendId backend)
/* First time through: initialize the hash table */
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileLocatorBackend);
+ ctl.keysize = SizeOfRelFileLocatorBackend;
ctl.entrysize = sizeof(SMgrRelationData);
SMgrRelationHash = hash_create("smgr relation table", 400,
&ctl, HASH_ELEM | HASH_BLOBS);
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 34efa12..344ee1b 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -878,7 +878,7 @@ pg_relation_filenode(PG_FUNCTION_ARGS)
if (!RelFileNumberIsValid(result))
PG_RETURN_NULL();
- PG_RETURN_OID(result);
+ PG_RETURN_INT64(result);
}
/*
@@ -898,7 +898,7 @@ Datum
pg_filenode_relation(PG_FUNCTION_ARGS)
{
Oid reltablespace = PG_GETARG_OID(0);
- RelFileNumber relfilenumber = PG_GETARG_OID(1);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(1);
Oid heaprel;
/* test needed so RelidByRelfilenumber doesn't misbehave */
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 797f5f5..1f473d7 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -98,10 +98,11 @@ binary_upgrade_set_next_heap_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_heap_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
binary_upgrade_next_heap_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -120,10 +121,11 @@ binary_upgrade_set_next_index_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_index_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
binary_upgrade_next_index_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -142,10 +144,11 @@ binary_upgrade_set_next_toast_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_toast_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
binary_upgrade_next_toast_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index bdb771d..44c14b8 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -3708,8 +3708,7 @@ RelationSetNewRelfilenumber(Relation relation, char persistence)
RelFileLocator newrlocator;
/* Allocate a new relfilenumber */
- newrelfilenumber = GetNewRelFileNumber(relation->rd_rel->reltablespace,
- NULL, persistence);
+ newrelfilenumber = GetNewRelFileNumber();
/*
* Get a writable copy of the pg_class tuple for the given relation.
diff --git a/src/backend/utils/cache/relfilenumbermap.c b/src/backend/utils/cache/relfilenumbermap.c
index c4245d5..fa9dac8 100644
--- a/src/backend/utils/cache/relfilenumbermap.c
+++ b/src/backend/utils/cache/relfilenumbermap.c
@@ -196,7 +196,7 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
/* set scan arguments */
skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
- skey[1].sk_argument = ObjectIdGetDatum(relfilenumber);
+ skey[1].sk_argument = Int64GetDatum(relfilenumber);
scandesc = systable_beginscan(relation,
ClassTblspcRelfilenodeIndexId,
@@ -213,7 +213,7 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
if (found)
elog(ERROR,
- "unexpected duplicate for tablespace %u, relfilenumber %u",
+ "unexpected duplicate for tablespace %u, relfilenumber " INT64_FORMAT,
reltablespace, relfilenumber);
found = true;
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index 781f8b8..3c1fef4 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -79,8 +79,8 @@ pg_control_system(PG_FUNCTION_ARGS)
Datum
pg_control_checkpoint(PG_FUNCTION_ARGS)
{
- Datum values[18];
- bool nulls[18];
+ Datum values[19];
+ bool nulls[19];
TupleDesc tupdesc;
HeapTuple htup;
ControlFileData *ControlFile;
@@ -129,6 +129,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
XIDOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 18, "checkpoint_time",
TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 19, "next_relfilenumber",
+ INT8OID, -1, 0);
tupdesc = BlessTupleDesc(tupdesc);
/* Read the control file. */
@@ -202,6 +204,9 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
values[17] = TimestampTzGetDatum(time_t_to_timestamptz(ControlFile->checkPointCopy.time));
nulls[17] = false;
+ values[18] = Int64GetDatum(ControlFile->checkPointCopy.nextRelFileNumber);
+ nulls[18] = false;
+
htup = heap_form_tuple(tupdesc, values, nulls);
PG_RETURN_DATUM(HeapTupleGetDatum(htup));
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 21dfe1b..65fc623 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -489,9 +489,9 @@ main(int argc, char *argv[])
mode = PG_MODE_ENABLE;
break;
case 'f':
- if (!option_parse_int(optarg, "-f/--filenode", 0,
- INT_MAX,
- NULL))
+ if (!option_parse_int64(optarg, "-f/--filenode", 0,
+ LLONG_MAX,
+ NULL))
exit(1);
only_filenode = pstrdup(optarg);
break;
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index c390ec5..f727078 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -250,6 +250,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
EpochFromFullTransactionId(ControlFile->checkPointCopy.nextXid),
XidFromFullTransactionId(ControlFile->checkPointCopy.nextXid));
+ printf(_("Latest checkpoint's NextRelFileNumber: " INT64_FORMAT "\n"),
+ ControlFile->checkPointCopy.nextRelFileNumber);
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index f317f0a..a85fa3d 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4835,16 +4835,16 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
relkind = *PQgetvalue(upgrade_res, 0, PQfnumber(upgrade_res, "relkind"));
- relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "relfilenode")));
+ relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "relfilenode")));
toast_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "reltoastrelid")));
- toast_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_relfilenode")));
+ toast_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_relfilenode")));
toast_index_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "indexrelid")));
- toast_index_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_index_relfilenode")));
+ toast_index_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_index_relfilenode")));
appendPQExpBufferStr(upgrade_buffer,
"\n-- For binary upgrade, must preserve pg_class oids and relfilenodes\n");
@@ -4862,7 +4862,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
*/
if (RelFileNumberIsValid(relfilenumber) && relkind != RELKIND_PARTITIONED_TABLE)
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
/*
@@ -4876,7 +4876,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_toast_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_relfilenumber);
/* every toast table has an index */
@@ -4884,7 +4884,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_index_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_index_relfilenumber);
}
@@ -4897,7 +4897,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
pg_class_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
}
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 269ed64..8be5e66 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -538,7 +538,7 @@ isRelDataFile(const char *path)
segNo = 0;
matched = false;
- nmatch = sscanf(path, "global/%u.%u", &rlocator.relNumber, &segNo);
+ nmatch = sscanf(path, "global/" INT64_FORMAT ".%u", &rlocator.relNumber, &segNo);
if (nmatch == 1 || nmatch == 2)
{
rlocator.spcOid = GLOBALTABLESPACE_OID;
@@ -547,7 +547,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "base/%u/%u.%u",
+ nmatch = sscanf(path, "base/%u/" INT64_FORMAT ".%u",
&rlocator.dbOid, &rlocator.relNumber, &segNo);
if (nmatch == 2 || nmatch == 3)
{
@@ -556,7 +556,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/%u.%u",
+ nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/" INT64_FORMAT ".%u",
&rlocator.spcOid, &rlocator.dbOid, &rlocator.relNumber,
&segNo);
if (nmatch == 3 || nmatch == 4)
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 5d30b87..0f88cb2 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -399,8 +399,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
i_reloid,
i_indtable,
i_toastheap,
- i_relfilenumber,
i_reltablespace;
+ RelFileNumber i_relfilenumber;
char query[QUERY_ALLOC];
char *last_namespace = NULL,
*last_tablespace = NULL;
@@ -527,7 +527,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
relname = PQgetvalue(res, relnum, i_relname);
curr->relname = pg_strdup(relname);
- curr->relfilenumber = atooid(PQgetvalue(res, relnum, i_relfilenumber));
+ curr->relfilenumber =
+ atorelnumber(PQgetvalue(res, relnum, i_relfilenumber));
curr->tblsp_alloc = false;
/* Is the tablespace oid non-default? */
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index 265d829..4c4f03a 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -15,10 +15,8 @@
* oids are the same between old and new clusters. This is important
* because toast oids are stored as toast pointers in user tables.
*
- * While pg_class.oid and pg_class.relfilenode are initially the same in a
- * cluster, they can diverge due to CLUSTER, REINDEX, or VACUUM FULL. We
- * control assignments of pg_class.relfilenode because we want the filenames
- * to match between the old and new cluster.
+ * We control assignments of pg_class.relfilenode because we want the
+ * filenames to match between the old and new cluster.
*
* We control assignment of pg_tablespace.oid because we want the oid to match
* between the old and new cluster.
diff --git a/src/bin/pg_upgrade/relfilenumber.c b/src/bin/pg_upgrade/relfilenumber.c
index b3ad820..50e94df 100644
--- a/src/bin/pg_upgrade/relfilenumber.c
+++ b/src/bin/pg_upgrade/relfilenumber.c
@@ -190,14 +190,14 @@ transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_fro
else
snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
- snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
+ snprintf(old_file, sizeof(old_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->old_tablespace,
map->old_tablespace_suffix,
map->db_oid,
map->relfilenumber,
type_suffix,
extent_suffix);
- snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
+ snprintf(new_file, sizeof(new_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->new_tablespace,
map->new_tablespace_suffix,
map->db_oid,
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 6528113..9aca583 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -884,7 +884,7 @@ main(int argc, char **argv)
}
break;
case 'R':
- if (sscanf(optarg, "%u/%u/%u",
+ if (sscanf(optarg, "%u/%u/" INT64_FORMAT,
&config.filter_by_relation.spcOid,
&config.filter_by_relation.dbOid,
&config.filter_by_relation.relNumber) != 3 ||
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 1b6b620..0774e3f 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -149,10 +149,10 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
Assert(dbOid == 0);
Assert(backendId == InvalidBackendId);
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("global/%u_%s",
+ path = psprintf("global/" INT64_FORMAT "_%s",
relNumber, forkNames[forkNumber]);
else
- path = psprintf("global/%u", relNumber);
+ path = psprintf("global/" INT64_FORMAT, relNumber);
}
else if (spcOid == DEFAULTTABLESPACE_OID)
{
@@ -160,21 +160,21 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/%u_%s",
+ path = psprintf("base/%u/" INT64_FORMAT "_%s",
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/%u",
+ path = psprintf("base/%u/" INT64_FORMAT,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/t%d_%u_%s",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT "_%s",
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/t%d_%u",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT,
dbOid, backendId, relNumber);
}
}
@@ -184,24 +184,24 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber);
}
diff --git a/src/fe_utils/option_utils.c b/src/fe_utils/option_utils.c
index abea881..53528fd 100644
--- a/src/fe_utils/option_utils.c
+++ b/src/fe_utils/option_utils.c
@@ -82,3 +82,45 @@ option_parse_int(const char *optarg, const char *optname,
*result = val;
return true;
}
+
+/*
+ * option_parse_int64
+ *
+ * Same as option_parse_int but parse int64.
+ */
+bool
+option_parse_int64(const char *optarg, const char *optname,
+ int64 min_range, int64 max_range,
+ int64 *result)
+{
+ char *endptr;
+ int64 val;
+
+ errno = 0;
+ val = strtoi64(optarg, &endptr, 10);
+
+ /*
+ * Skip any trailing whitespace; if anything but whitespace remains before
+ * the terminating character, fail.
+ */
+ while (*endptr != '\0' && isspace((unsigned char) *endptr))
+ endptr++;
+
+ if (*endptr != '\0')
+ {
+ pg_log_error("invalid value \"%s\" for option %s",
+ optarg, optname);
+ return false;
+ }
+
+ if (errno == ERANGE || val < min_range || val > max_range)
+ {
+ pg_log_error("%s must be in range " INT64_FORMAT ".." INT64_FORMAT,
+ optname, min_range, max_range);
+ return false;
+ }
+
+ if (result)
+ *result = val;
+ return true;
+}
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 775471d..360696c 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -213,6 +213,9 @@ typedef struct VariableCacheData
*/
Oid nextOid; /* next OID to assign */
uint32 oidCount; /* OIDs available before must do XLOG work */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber to assign */
+ uint32 relnumbercount; /* relfilenumbers available before must do
+ * XLOG work */
/*
* These fields are protected by XidGenLock.
@@ -293,6 +296,8 @@ extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
extern bool ForceTransactionIdLimitUpdate(void);
extern Oid GetNewObjectId(void);
+extern RelFileNumber GetNewRelFileNumber(void);
+extern void SetNextRelFileNumber(RelFileNumber relnumber);
extern void StopGeneratingPinnedObjectIds(void);
#ifdef USE_ASSERT_CHECKING
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index cd674c3..bd683cc 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -234,6 +234,7 @@ extern void CreateCheckPoint(int flags);
extern bool CreateRestartPoint(int flags);
extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN);
extern void XLogPutNextOid(Oid nextOid);
+extern void LogNextRelFileNumber(RelFileNumber nextrelnumber);
extern XLogRecPtr XLogRestorePoint(const char *rpName);
extern void UpdateFullPageWrites(void);
extern void GetFullPageWriteInfo(XLogRecPtr *RedoRecPtr_p, bool *doPageWrites_p);
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index e1c85f9..b452530 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -38,8 +38,5 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
-extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
- Relation pg_class,
- char relpersistence);
#endif /* CATALOG_H */
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index e1f4eef..4768e5e 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -34,6 +34,13 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* oid */
Oid oid;
+ /* access method; 0 if not a table / index */
+ Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
+
+ /* identifier of physical storage file */
+ /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
+ int64 relfilenode BKI_DEFAULT(0);
+
/* class name */
NameData relname;
@@ -49,13 +56,6 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* class owner */
Oid relowner BKI_DEFAULT(POSTGRES) BKI_LOOKUP(pg_authid);
- /* access method; 0 if not a table / index */
- Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
-
- /* identifier of physical storage file */
- /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
- Oid relfilenode BKI_DEFAULT(0);
-
/* identifier of table space for relation (0 means default for database) */
Oid reltablespace BKI_DEFAULT(0) BKI_LOOKUP_OPT(pg_tablespace);
@@ -154,7 +154,7 @@ typedef FormData_pg_class *Form_pg_class;
DECLARE_UNIQUE_INDEX_PKEY(pg_class_oid_index, 2662, ClassOidIndexId, on pg_class using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_class_relname_nsp_index, 2663, ClassNameNspIndexId, on pg_class using btree(relname name_ops, relnamespace oid_ops));
-DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode oid_ops));
+DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode int8_ops));
#ifdef EXPOSE_TO_CLIENT_CODE
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 06368e2..096222f 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -41,6 +41,7 @@ typedef struct CheckPoint
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
FullTransactionId nextXid; /* next free transaction ID */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
@@ -78,6 +79,7 @@ typedef struct CheckPoint
#define XLOG_FPI 0xB0
/* 0xC0 is used in Postgres 9.5-11 */
#define XLOG_OVERWRITE_CONTRECORD 0xD0
+#define XLOG_NEXT_RELFILENUMBER 0xE0
/*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 2e41f4d..f6a5b49 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7321,11 +7321,11 @@
proname => 'pg_indexes_size', provolatile => 'v', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_indexes_size' },
{ oid => '2999', descr => 'filenode identifier of relation',
- proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'oid',
+ proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_relation_filenode' },
{ oid => '3454', descr => 'relation OID for filenode and tablespace',
proname => 'pg_filenode_relation', provolatile => 's',
- prorettype => 'regclass', proargtypes => 'oid oid',
+ prorettype => 'regclass', proargtypes => 'oid int8',
prosrc => 'pg_filenode_relation' },
{ oid => '3034', descr => 'file path of relation',
proname => 'pg_relation_filepath', provolatile => 's', prorettype => 'text',
@@ -11191,15 +11191,15 @@
prosrc => 'binary_upgrade_set_missing_value' },
{ oid => '4545', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_heap_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_heap_relfilenode' },
{ oid => '4546', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_index_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_index_relfilenode' },
{ oid => '4547', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_toast_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_toast_relfilenode' },
{ oid => '4548', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_pg_tablespace_oid', provolatile => 'v',
diff --git a/src/include/fe_utils/option_utils.h b/src/include/fe_utils/option_utils.h
index 03c09fd..8c0e818 100644
--- a/src/include/fe_utils/option_utils.h
+++ b/src/include/fe_utils/option_utils.h
@@ -22,5 +22,8 @@ extern void handle_help_version_opts(int argc, char *argv[],
extern bool option_parse_int(const char *optarg, const char *optname,
int min_range, int max_range,
int *result);
+extern bool option_parse_int64(const char *optarg, const char *optname,
+ int64 min_range, int64 max_range,
+ int64 *result);
#endif /* OPTION_UTILS_H */
diff --git a/src/include/postgres_ext.h b/src/include/postgres_ext.h
index c9774fa..a69ee72 100644
--- a/src/include/postgres_ext.h
+++ b/src/include/postgres_ext.h
@@ -48,11 +48,14 @@ typedef PG_INT64_TYPE pg_int64;
/*
* RelFileNumber data type identifies the specific relation file name.
+ * RelFileNumber is unique within a cluster.
*/
-typedef Oid RelFileNumber;
-#define InvalidRelFileNumber ((RelFileNumber) InvalidOid)
+typedef pg_int64 RelFileNumber;
+#define InvalidRelFileNumber ((RelFileNumber) 0)
+#define FirstNormalRelFileNumber ((RelFileNumber) 100000)
#define RelFileNumberIsValid(relnumber) \
((bool) ((relnumber) != InvalidRelFileNumber))
+#define atorelnumber(x) ((RelFileNumber) strtoul((x), NULL, 10))
/*
* Identifiers of error message fields. Kept here to keep common
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 4c36d55..3b6a973 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,25 +90,28 @@
*/
typedef struct buftag
{
- Oid spcOid; /* tablespace oid */
- Oid dbOid; /* database oid */
- RelFileNumber relNumber; /* relation file number */
- ForkNumber forkNum; /* fork number */
+ Oid spcOid; /* tablespace oid. */
+ Oid dbOid; /* database oid. */
+ uint32 relNumber_low; /* low 32 bits of relfilenumber */
+ uint32 relNumber_hi:24; /* high 24 bits of relfilenumber */
+ uint32 forkNum:8; /* fork number */
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
-#define BufTagGetRelNumber(a) ((a).relNumber)
+#define BufTagGetRelNumber(a) \
+ ((((uint64) (a).relNumber_hi << 32) | ((uint32) (a).relNumber_low)))
#define BufTagSetRelNumber(a, relnumber) \
( \
- (a).relNumber = (relnumber) \
+ (a).relNumber_hi = (relnumber) >> 32, \
+ (a).relNumber_low = (relnumber) & 0xffffffff \
)
#define BufTagGetRelFileLocator(a, locator) \
do { \
(locator).spcOid = (a).spcOid; \
(locator).dbOid = (a).dbOid; \
- (locator).relNumber = (a).relNumber; \
+ (locator).relNumber = BufTagGetRelNumber(a); \
} while(0)
#define CLEAR_BUFFERTAG(a) \
@@ -133,7 +136,8 @@ do { \
( \
(a).spcOid == (b).spcOid && \
(a).dbOid == (b).dbOid && \
- (a).relNumber == (b).relNumber && \
+ (a).relNumber_low == (b).relNumber_low && \
+ (a).relNumber_hi == (b).relNumber_hi && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
)
@@ -142,7 +146,7 @@ do { \
( \
(a).spcOid == (locator).spcOid && \
(a).dbOid == (locator).dbOid && \
- (a).relNumber == (locator).relNumber \
+ BufTagGetRelNumber(a) == (locator).relNumber \
)
/*
diff --git a/src/include/storage/relfilelocator.h b/src/include/storage/relfilelocator.h
index 10f41f3..7e68480 100644
--- a/src/include/storage/relfilelocator.h
+++ b/src/include/storage/relfilelocator.h
@@ -34,8 +34,7 @@
* relNumber identifies the specific relation. relNumber corresponds to
* pg_class.relfilenode (NOT pg_class.oid, because we need to be able
* to assign new physical files to relations in some situations).
- * Notice that relNumber is only unique within a database in a particular
- * tablespace.
+ * Notice that relNumber is unique within a cluster.
*
* Note: spcOid must be GLOBALTABLESPACE_OID if and only if dbOid is
* zero. We support shared relations only in the "global" tablespace.
@@ -75,6 +74,12 @@ typedef struct RelFileLocatorBackend
BackendId backend;
} RelFileLocatorBackend;
+#define SizeOfRelFileLocatorBackend \
+ (offsetof(RelFileLocatorBackend, backend) + sizeof(BackendId))
+
+/* Max value of the relfilenumber; relfilenumbers are 56 bits wide. */
+#define MAX_RELFILENUMBER INT64CONST(0x00FFFFFFFFFFFFFF)
+
#define RelFileLocatorBackendIsTemp(rlocator) \
((rlocator).backend != InvalidBackendId)
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index 5ede56d..6230fcb 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -2164,9 +2164,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2175,10 +2174,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+---------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | t | own | child 0 index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | t | own | child 1 index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | t | orig | child 0 index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | t | orig | child 1 index
at_partitioned_id_name_key | t | none | parent index
(6 rows)
@@ -2198,9 +2197,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2209,10 +2207,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+--------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | f | own | parent index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | f | own | parent index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | f | new | parent index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | f | new | parent index
at_partitioned_id_name_key | f | none | parent index
(6 rows)
@@ -2556,7 +2554,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/expected/oidjoins.out b/src/test/regress/expected/oidjoins.out
index 215eb89..af57470 100644
--- a/src/test/regress/expected/oidjoins.out
+++ b/src/test/regress/expected/oidjoins.out
@@ -74,11 +74,11 @@ NOTICE: checking pg_type {typcollation} => pg_collation {oid}
NOTICE: checking pg_attribute {attrelid} => pg_class {oid}
NOTICE: checking pg_attribute {atttypid} => pg_type {oid}
NOTICE: checking pg_attribute {attcollation} => pg_collation {oid}
+NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {relnamespace} => pg_namespace {oid}
NOTICE: checking pg_class {reltype} => pg_type {oid}
NOTICE: checking pg_class {reloftype} => pg_type {oid}
NOTICE: checking pg_class {relowner} => pg_authid {oid}
-NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {reltablespace} => pg_tablespace {oid}
NOTICE: checking pg_class {reltoastrelid} => pg_class {oid}
NOTICE: checking pg_class {relrewrite} => pg_class {oid}
diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql
index 52001e3..4190b12 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -1478,9 +1478,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1499,9 +1498,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1638,7 +1636,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
--
1.8.3.1
v7-0004-Don-t-delay-removing-Tombstone-file-until-next.patch (text/x-patch; charset=US-ASCII) Download
From 159cf437cab348a358332262b5f3944b0330b6e6 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Thu, 7 Jul 2022 16:34:45 +0530
Subject: [PATCH v7 4/4] Don't delay removing Tombstone file until next
checkpoint
Before relfilenumbers were widened to 56 bits, we could not remove
an unused relfilenode file until the next checkpoint: removing it
immediately risked reusing the same relfilenode for two different
relations within a single checkpoint cycle due to OID wraparound.
The previous patches in this set widened relfilenumbers to 56 bits
and eliminated the wraparound risk, so we no longer need to wait
for the next checkpoint to remove an unused relation file; we can
clean it up at commit.
---
src/backend/access/transam/xlog.c | 5 --
src/backend/commands/tablecmds.c | 6 +-
src/backend/storage/smgr/md.c | 154 +++++++++++---------------------------
src/backend/storage/sync/sync.c | 101 -------------------------
src/include/storage/sync.h | 2 -
5 files changed, 46 insertions(+), 222 deletions(-)
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 28c3d31..73531c1 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6641,11 +6641,6 @@ CreateCheckPoint(int flags)
END_CRIT_SECTION();
/*
- * Let smgr do post-checkpoint cleanup (eg, deleting old files).
- */
- SyncPostCheckpoint();
-
- /*
* Update the average distance between checkpoints if the prior checkpoint
* exists.
*/
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 0678681..f5d653c 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14372,9 +14372,9 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
/*
* Generate a new relfilenumber. We can not reuse the old relfilenumber
- * because the unused relfilenumber files are not unlinked until the next
- * checkpoint. So if move the relation to the old tablespace again, we
- * will get the conflicting relfilenumber file.
+ * because the unused relfilenumber files are not unlinked until commit.
+ * So if we move the relation back to the old tablespace within the same
+ * transaction, we would get a conflicting relfilenumber file.
*/
newrelfilenumber = GetNewRelFileNumber();
AssertRelfileNumberFileNotExists(newTableSpace,
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index d9e72cf..7d15920 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -24,6 +24,7 @@
#include <unistd.h>
#include <fcntl.h>
#include <sys/file.h>
+#include <sys/stat.h>
#include "access/xlog.h"
#include "access/xlogutils.h"
@@ -126,8 +127,6 @@ static void mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forkNum,
static MdfdVec *mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior);
static void register_dirty_segment(SMgrRelation reln, ForkNumber forknum,
MdfdVec *seg);
-static void register_unlink_segment(RelFileLocatorBackend rlocator, ForkNumber forknum,
- BlockNumber segno);
static void register_forget_request(RelFileLocatorBackend rlocator, ForkNumber forknum,
BlockNumber segno);
static void _fdvec_resize(SMgrRelation reln,
@@ -240,41 +239,14 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* forkNum can be a fork number to delete a specific fork, or InvalidForkNumber
* to delete all forks.
*
- * For regular relations, we don't unlink the first segment file of the rel,
- * but just truncate it to zero length, and record a request to unlink it after
- * the next checkpoint. Additional segments can be unlinked immediately,
- * however. Leaving the empty file in place prevents that relfilenumber
- * from being reused. The scenario this protects us from is:
- * 1. We delete a relation (and commit, and actually remove its file).
- * 2. We create a new relation, which by chance gets the same relfilenumber as
- * the just-deleted one (OIDs must've wrapped around for that to happen).
- * 3. We crash before another checkpoint occurs.
- * During replay, we would delete the file and then recreate it, which is fine
- * if the contents of the file were repopulated by subsequent WAL entries.
- * But if we didn't WAL-log insertions, but instead relied on fsyncing the
- * file after populating it (as we do at wal_level=minimal), the contents of
- * the file would be lost forever. By leaving the empty file until after the
- * next checkpoint, we prevent reassignment of the relfilenumber until it's
- * safe, because relfilenumber assignment skips over any existing file.
- *
- * XXX although this all was true when the relfilenumbers were 32 bits wide but
- * now the relfilenumbers are 56 bits wide so we don't have risk of
- * relfilenumber being reused so in future we can immediately unlink the first
- * segment as well. Although we can reuse the relfilenumber during createdb()
- * using file copy method or during movedb() but the above scenario is only
- * applicable when we create a new relation.
- *
- * We do not need to go through this dance for temp relations, though, because
- * we never make WAL entries for temp rels, and so a temp rel poses no threat
- * to the health of a regular rel that has taken over its relfilenumber.
- * The fact that temp rels and regular rels have different file naming
- * patterns provides additional safety.
+ * We do not carefully track whether other forks have been created or not, but
+ * just attempt to unlink them unconditionally; so we should never complain
+ * about ENOENT.
*
- * All the above applies only to the relation's main fork; other forks can
- * just be removed immediately, since they are not needed to prevent the
- * relfilenumber from being recycled. Also, we do not carefully
- * track whether other forks have been created or not, but just attempt to
- * unlink them unconditionally; so we should never complain about ENOENT.
+ * Note that we can now immediately unlink even the first segment of a
+ * regular relation, because relfilenumbers are 56 bits wide as of PG 16
+ * and do not wrap around, so there is no risk of the number being
+ * recycled for some unrelated relation file.
*
* If isRedo is true, it's unsurprising for the relation to be already gone.
* Also, we should remove the file immediately instead of queuing a request
@@ -325,90 +297,67 @@ static void
mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forkNum, bool isRedo)
{
char *path;
- int ret;
+ char *segpath;
+ int segno;
+ int lastsegment = -1;
+ struct stat statbuf;
path = relpath(rlocator, forkNum);
+ segpath = (char *) palloc(strlen(path) + 12);
- /*
- * Delete or truncate the first segment.
- */
- if (isRedo || forkNum != MAIN_FORKNUM || RelFileLocatorBackendIsTemp(rlocator))
+ /* Compute the number of segments. */
+ for (segno = 0;; segno++)
{
- if (!RelFileLocatorBackendIsTemp(rlocator))
- {
- /* Prevent other backends' fds from holding on to the disk space */
- ret = do_truncate(path);
-
- /* Forget any pending sync requests for the first segment */
- register_forget_request(rlocator, forkNum, 0 /* first seg */ );
- }
+ if (segno == 0)
+ sprintf(segpath, "%s", path);
else
- ret = 0;
+ sprintf(segpath, "%s.%u", path, segno);
- /* Next unlink the file, unless it was already found to be missing */
- if (ret == 0 || errno != ENOENT)
+ if (stat(segpath, &statbuf) != 0)
{
- ret = unlink(path);
- if (ret < 0 && errno != ENOENT)
- ereport(WARNING,
- (errcode_for_file_access(),
- errmsg("could not remove file \"%s\": %m", path)));
+ /* ENOENT is expected after the last segment... */
+ if (errno != ENOENT)
+ ereport(WARNING,
+ (errcode_for_file_access(),
+ errmsg("could not stat file \"%s\": %m", segpath)));
+ break;
}
- }
- else
- {
- /* Prevent other backends' fds from holding on to the disk space */
- ret = do_truncate(path);
-
- /* Register request to unlink first segment later */
- register_unlink_segment(rlocator, forkNum, 0 /* first seg */ );
+ lastsegment = segno;
}
/*
- * Delete any additional segments.
+ * Unlink the segment files in descending order, so that a failure
+ * partway through does not leave a gap in the sequence of segment
+ * files.
*/
- if (ret >= 0)
+ for (segno = lastsegment; segno >= 0; segno--)
{
- char *segpath = (char *) palloc(strlen(path) + 12);
- BlockNumber segno;
-
- /*
- * Note that because we loop until getting ENOENT, we will correctly
- * remove all inactive segments as well as active ones.
- */
- for (segno = 1;; segno++)
- {
+ if (segno == 0)
+ sprintf(segpath, "%s", path);
+ else
sprintf(segpath, "%s.%u", path, segno);
if (!RelFileLocatorBackendIsTemp(rlocator))
{
/*
- * Prevent other backends' fds from holding on to the disk
+ * Prevent other backends' fds from holding on to the disk
* space.
*/
- if (do_truncate(segpath) < 0 && errno == ENOENT)
- break;
+ do_truncate(segpath);
- /*
- * Forget any pending sync requests for this segment before we
- * try to unlink.
- */
+ /* Forget any pending sync requests for this segment. */
register_forget_request(rlocator, forkNum, segno);
}
- if (unlink(segpath) < 0)
- {
- /* ENOENT is expected after the last segment... */
- if (errno != ENOENT)
- ereport(WARNING,
- (errcode_for_file_access(),
- errmsg("could not remove file \"%s\": %m", segpath)));
- break;
- }
- }
- pfree(segpath);
+ /*
+ * Unlink the file.  We already checked that it exists in the loop
+ * above while counting the segments, so we need not check for
+ * ENOENT here.
+ */
+ if (unlink(segpath))
+ ereport(WARNING,
+ (errcode_for_file_access(),
+ errmsg("could not remove file \"%s\": %m", segpath)));
}
+ pfree(segpath);
pfree(path);
}
@@ -1009,23 +958,6 @@ register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
}
/*
- * register_unlink_segment() -- Schedule a file to be deleted after next checkpoint
- */
-static void
-register_unlink_segment(RelFileLocatorBackend rlocator, ForkNumber forknum,
- BlockNumber segno)
-{
- FileTag tag;
-
- INIT_MD_FILETAG(tag, rlocator.locator, forknum, segno);
-
- /* Should never be used with temp relations */
- Assert(!RelFileLocatorBackendIsTemp(rlocator));
-
- RegisterSyncRequest(&tag, SYNC_UNLINK_REQUEST, true /* retryOnError */ );
-}
-
-/*
* register_forget_request() -- forget any fsyncs for a relation fork's segment
*/
static void
diff --git a/src/backend/storage/sync/sync.c b/src/backend/storage/sync/sync.c
index e1fb631..9a4a31c 100644
--- a/src/backend/storage/sync/sync.c
+++ b/src/backend/storage/sync/sync.c
@@ -201,92 +201,6 @@ SyncPreCheckpoint(void)
}
/*
- * SyncPostCheckpoint() -- Do post-checkpoint work
- *
- * Remove any lingering files that can now be safely removed.
- */
-void
-SyncPostCheckpoint(void)
-{
- int absorb_counter;
- ListCell *lc;
-
- absorb_counter = UNLINKS_PER_ABSORB;
- foreach(lc, pendingUnlinks)
- {
- PendingUnlinkEntry *entry = (PendingUnlinkEntry *) lfirst(lc);
- char path[MAXPGPATH];
-
- /* Skip over any canceled entries */
- if (entry->canceled)
- continue;
-
- /*
- * New entries are appended to the end, so if the entry is new we've
- * reached the end of old entries.
- *
- * Note: if just the right number of consecutive checkpoints fail, we
- * could be fooled here by cycle_ctr wraparound. However, the only
- * consequence is that we'd delay unlinking for one more checkpoint,
- * which is perfectly tolerable.
- */
- if (entry->cycle_ctr == checkpoint_cycle_ctr)
- break;
-
- /* Unlink the file */
- if (syncsw[entry->tag.handler].sync_unlinkfiletag(&entry->tag,
- path) < 0)
- {
- /*
- * There's a race condition, when the database is dropped at the
- * same time that we process the pending unlink requests. If the
- * DROP DATABASE deletes the file before we do, we will get ENOENT
- * here. rmtree() also has to ignore ENOENT errors, to deal with
- * the possibility that we delete the file first.
- */
- if (errno != ENOENT)
- ereport(WARNING,
- (errcode_for_file_access(),
- errmsg("could not remove file \"%s\": %m", path)));
- }
-
- /* Mark the list entry as canceled, just in case */
- entry->canceled = true;
-
- /*
- * As in ProcessSyncRequests, we don't want to stop absorbing fsync
- * requests for a long time when there are many deletions to be done.
- * We can safely call AbsorbSyncRequests() at this point in the loop.
- */
- if (--absorb_counter <= 0)
- {
- AbsorbSyncRequests();
- absorb_counter = UNLINKS_PER_ABSORB;
- }
- }
-
- /*
- * If we reached the end of the list, we can just remove the whole list
- * (remembering to pfree all the PendingUnlinkEntry objects). Otherwise,
- * we must keep the entries at or after "lc".
- */
- if (lc == NULL)
- {
- list_free_deep(pendingUnlinks);
- pendingUnlinks = NIL;
- }
- else
- {
- int ntodelete = list_cell_number(pendingUnlinks, lc);
-
- for (int i = 0; i < ntodelete; i++)
- pfree(list_nth(pendingUnlinks, i));
-
- pendingUnlinks = list_delete_first_n(pendingUnlinks, ntodelete);
- }
-}
-
-/*
* ProcessSyncRequests() -- Process queued fsync requests.
*/
void
@@ -532,21 +446,6 @@ RememberSyncRequest(const FileTag *ftag, SyncRequestType type)
entry->canceled = true;
}
}
- else if (type == SYNC_UNLINK_REQUEST)
- {
- /* Unlink request: put it in the linked list */
- MemoryContext oldcxt = MemoryContextSwitchTo(pendingOpsCxt);
- PendingUnlinkEntry *entry;
-
- entry = palloc(sizeof(PendingUnlinkEntry));
- entry->tag = *ftag;
- entry->cycle_ctr = checkpoint_cycle_ctr;
- entry->canceled = false;
-
- pendingUnlinks = lappend(pendingUnlinks, entry);
-
- MemoryContextSwitchTo(oldcxt);
- }
else
{
/* Normal case: enter a request to fsync this segment */
diff --git a/src/include/storage/sync.h b/src/include/storage/sync.h
index 049af87..2c0b812 100644
--- a/src/include/storage/sync.h
+++ b/src/include/storage/sync.h
@@ -23,7 +23,6 @@
typedef enum SyncRequestType
{
SYNC_REQUEST, /* schedule a call of sync function */
- SYNC_UNLINK_REQUEST, /* schedule a call of unlink function */
SYNC_FORGET_REQUEST, /* forget all calls for a tag */
SYNC_FILTER_REQUEST /* forget all calls satisfying match fn */
} SyncRequestType;
@@ -57,7 +56,6 @@ typedef struct FileTag
extern void InitSync(void);
extern void SyncPreCheckpoint(void);
-extern void SyncPostCheckpoint(void);
extern void ProcessSyncRequests(void);
extern void RememberSyncRequest(const FileTag *ftag, SyncRequestType type);
extern bool RegisterSyncRequest(const FileTag *ftag, SyncRequestType type,
--
1.8.3.1
Trying to compile with 0001 and 0002 applied and -Wall -Werror in use, I get:
buf_init.c:119:4: error: implicit truncation from 'int' to bit-field
changes value from -1 to 255 [-Werror,-Wbitfield-constant-conversion]
CLEAR_BUFFERTAG(buf->tag);
^~~~~~~~~~~~~~~~~~~~~~~~~
../../../../src/include/storage/buf_internals.h:122:14: note: expanded
from macro 'CLEAR_BUFFERTAG'
(a).forkNum = InvalidForkNumber, \
^ ~~~~~~~~~~~~~~~~~
1 error generated.
More review comments:
In pg_buffercache_pages_internal(), I suggest that we add an error
check. If fctx->record[i].relfilenumber is greater than the largest
value that can be represented as an OID, then let's do something like:
ERROR: relfilenode is too large to be represented as an OID
HINT: Upgrade the extension using ALTER EXTENSION pg_buffercache UPDATE
That way, instead of confusing people by giving them an incorrect
answer, we'll push them toward a step that they may have overlooked.
In src/backend/access/transam/README, I think the sentence "So
cleaning up isn't really necessary." isn't too helpful. I suggest
replacing it with "Thus, on-disk collisions aren't possible."
I think it is in line with oidCount, what do you think?
Oh it definitely is, and maybe it's OK the way you have it. But the
OID stuff has wraparound to worry about, and this doesn't; and this
has the SetNextRelFileNumber and that doesn't; so it is not
necessarily the case that the design which is best for that case is
also best for this case.
I believe that the persistence model for SetNextRelFileNumber needs
more thought. Right now I believe it's relying on the fact that, after
we try to restore the dump, we'll try to perform a clean shutdown of
the server before doing anything important, and that will persist the
final value, whatever it ends up being. However, there's no comment
explaining that theory of operation, and it seems pretty fragile
anyway. What if things don't go as planned? Suppose the power goes out
halfway through restoring the dump, and the user for some reason then
gives up on running pg_upgrade and just tries to do random things with
that server? Then I think there will be trouble, because nothing has
updated the nextrelfilenumber value and yet there are potentially new
files on disk. Maybe that's a stretch since I think other things might
also break if you do that, but I'm also not sure that's the only
scenario to worry about, especially if you factor in the possibility
of future code changes, like changes to the timing of when we shut
down and restart the server during pg_upgrade, or other uses of
binary-upgrade mode, or whatever. I don't know. Perhaps it's not
actually broken but I'm inclined to think it should be logging its
changes.
A related thought is that I don't think this patch has as many
cross-checks as it could have. For instance, suppose that when we
replay a WAL record that creates relation storage, we cross-check that
the value is less than the counter. I think you have a check in there
someplace that will error out if there is an actual collision --
although I can't find it at the moment, and possibly we want to add
some comments there even if it's in existing code -- but this kind of
thing would detect bugs that could lead to collisions even if no
collision actually occurs, e.g. because a duplicate relfilenumber is
used but in a different database or tablespace. It might be worth
spending some time thinking about other possible cross-checks too.
We're trying to create a system where the relfilenumber counter is
always ahead of all the relfilenumbers used on disk, but the coupling
between the relfilenumber-advancement machinery and the
make-files-on-disk machinery is pretty loose, and so there is a risk
that bugs could escape detection. Whatever we can do to increase the
probability of noticing when things have gone wrong, and/or to notice
it quicker, will be good.
+ if (!IsBinaryUpgrade)
+ elog(ERROR, "the RelFileNumber can be set only during binary upgrade");
I think you should remove the word "the". Primary error messages are
written telegram-style and "the" is usually omitted, especially at the
beginning of the message.
+ * This should not impact the performance, since we are not WAL logging
+ * it for every allocation, but only after allocating 512 RelFileNumber.
I think this claim is overly bold, and it would be better if the
current value of the constant weren't encoded in the comment. I'm not
sure we really need this part of the comment at all, but if we do,
maybe it should be reworded to something like: This is potentially a
somewhat expensive operation, but fortunately we only need to do it
for every VAR_RELNUMBER_PREFETCH new relfilenodes. Or maybe it's
better to put this explanation in GetNewRelFileNumber instead, e.g.
"If we run out of logged RelFileNumbers, then we must log more, and
also wait for the xlog record to be flushed to disk. This is somewhat
expensive, but hopefully VAR_RELNUMBER_PREFETCH is large enough that
this doesn't slow things down too much."
One thing that isn't great about this whole scheme is that it can lead
to lock pile-ups. Once somebody is waiting for an
XLOG_NEXT_RELFILENUMBER record to reach the disk, any other backend
that tries to get a new relfilenumber is going to block waiting for
RelFileNumberGenLock. I wonder whether this effect is observable in
practice: suppose we just create relations in a tight loop from inside
a stored procedure, and do that simultaneously in multiple backends?
What does the wait event distribution look like? Can we observe a lot
of RelFileNumberGenLock events or not really? I guess if we reduce
VAR_RELNUMBER_PREFETCH enough we can probably create a problem, but
how small a value is needed?
One thing we could think about doing here is try to stagger the xlog
and the flush. When we've used VAR_RELNUMBER_PREFETCH/2
relfilenumbers, log a record reserving VAR_RELNUMBER_PREFETCH from
where we are now, and remember the LSN. When we've used up our entire
previous allocation, XLogFlush() that record before allowing the
additional values to be used. The bookkeeping would be a bit more
complicated than currently, but I don't think it would be too bad. I'm
not sure how much it would actually help, though, or whether we need
it. If new relfilenumbers are being used up really quickly, then maybe
the record won't get flushed into the background before we run out of
available numbers anyway, and if they aren't, then maybe it doesn't
matter. On the other hand, even one transaction commit between when
the record is logged and when we run out of the previous allocation is
enough to force a flush, at least with synchronous_commit=on, so maybe
the chances of being able to piggyback on an existing flush are not so
bad after all. I'm not sure.
+ * Generate a new relfilenumber. We can not reuse the old relfilenumber
+ * because the unused relfilenumber files are not unlinked until the next
+ * checkpoint. So if move the relation to the old tablespace again, we
+ * will get the conflicting relfilenumber file.
This is much clearer now but the grammar has some issues, e.g. "the
unused relfilenumber" should be just "unused relfilenumber" and "So if
move" is not right either. I suggest: We cannot reuse the old
relfilenumber because of the possibility that that relation will be
moved back to the original tablespace before the next checkpoint. At
that point, the first segment of the main fork won't have been
unlinked yet, and an attempt to create new relation storage with that
same relfilenumber will fail."
In theory I suppose there's another way we could solve this problem:
keep using the same relfilenumber, and if the scenario described here
occurs, just reuse the old file. The reason why we can't do that today
is because we could be running with wal_level=minimal and replace a
relation with one whose contents aren't logged. If WAL replay then
replays the drop, we're in trouble. But if the only time we reuse a
relfilenumber for new relation storage is when relations are moved
around, then I think that scenario can't happen. However, I think
assigning a new relfilenumber is probably better, because it gets us
closer to a world in which relfilenumbers are never reused at all. It
doesn't get us all the way there because of createdb() and movedb(),
but it gets us closer and I prefer that.
+ * XXX although this all was true when the relfilenumbers were 32 bits wide but
+ * now the relfilenumbers are 56 bits wide so we don't have risk of
+ * relfilenumber being reused so in future we can immediately unlink the first
+ * segment as well. Although we can reuse the relfilenumber during createdb()
+ * using file copy method or during movedb() but the above scenario is only
+ * applicable when we create a new relation.
Here is an edited version:
XXX. Although all of this was true when relfilenumbers were 32 bits wide, they
are now 56 bits wide and do not wrap around, so in the future we can change
the code to immediately unlink the first segment of the relation along
with all the
others. We still do reuse relfilenumbers when createdb() is performed using the
file-copy method or during movedb(), but the scenario described above can only
happen when creating a new relation.
I think that pg_filenode_relation,
binary_upgrade_set_next_heap_relfilenode, and other functions that are
now going to be accepting a RelFileNode using the SQL int8 datatype
should bounds-check the argument. It could be <0 or >2^56, and I
believe it'd be best to throw an error for that straight off. The
three functions in pg_upgrade_support.c could share a static
subroutine for this, to avoid duplicating code.
This bounds-checking issue also applies to the -f argument to pg_checksums.
I notice that the patch makes no changes to relmapper.c, and I think
that's a problem. Notice in particular:
#define MAX_MAPPINGS 62 /* 62 * 8 + 16 = 512 */
I believe that making RelFileNumber into a 64-bit value will cause the
8 in the calculation above to change to 16, defeating the intention
that the size of the file ought to be the smallest imaginable size of
a disk sector. It does seem like it would have been smart to include a
StaticAssertStmt in this file someplace that checks that the data
structure has the expected size, and now might be a good time, perhaps
in a separate patch, to add one. If we do nothing fancy here, the
maximum number of mappings will have to be reduced from 62 to 31,
which is a problem because global/pg_filenode.map currently has 48
entries. We could try to arrange to squeeze padding out of the
RelMapping struct, which would let us use just 12 bytes per mapping,
which would increase the limit to 41, but that's still less than we're
using already, never mind leaving room for future growth.
I don't know what to do about this exactly. I believe it's been
previously suggested that the actual minimum sector size on reasonably
modern hardware is never as small as 512 bytes, so maybe the file size
can just be increased to 1kB or something. If that idea is judged
unsafe, I can think of two other possible approaches offhand. One is
that we could move away from the idea of storing the OIDs in the file
along with the RelFileNodes, and instead store the offset for a given
RelFileNode at a fixed offset in the file. That would require either
hard-wiring offset tables into the code someplace, or generating them
as part of the build process, with separate tables for shared and
database-local relation map files. The other is that we could have
multiple 512-byte sectors and try to arrange for each relation to be
in the same sector with the indexes of that relation, since the
comments in relmapper.c say this:
* aborts. An important factor here is that the indexes and toast table of
* a mapped catalog must also be mapped, so that the rewrites/relocations of
* all these files commit in a single map file update rather than being tied
* to transaction commit.
This suggests that atomicity is required across a table and its
indexes, but not that it's needed across arbitrary sets of entries in
the file.
Whatever we do, we shouldn't forget to bump RELMAPPER_FILEMAGIC.
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -34,6 +34,13 @@ CATALOG(pg_class,1259,RelationRelationId)
BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* oid */
Oid oid;
+ /* access method; 0 if not a table / index */
+ Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
+
+ /* identifier of physical storage file */
+ /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
+ int64 relfilenode BKI_DEFAULT(0);
+
/* class name */
NameData relname;
@@ -49,13 +56,6 @@ CATALOG(pg_class,1259,RelationRelationId)
BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* class owner */
Oid relowner BKI_DEFAULT(POSTGRES)
BKI_LOOKUP(pg_authid);
- /* access method; 0 if not a table / index */
- Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
-
- /* identifier of physical storage file */
- /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
- Oid relfilenode BKI_DEFAULT(0);
-
/* identifier of table space for relation (0 means default for
database) */
Oid reltablespace BKI_DEFAULT(0)
BKI_LOOKUP_OPT(pg_tablespace);
As Andres said elsewhere, this stinks. Not sure what the resolution of
the discussion over on the "AIX support" thread is going to be yet,
but hopefully not this.
+ uint32 relNumber_low; /* relfilenumber 32 lower bits */
+ uint32 relNumber_hi:24; /* relfilenumber 24 high bits */
+ uint32 forkNum:8; /* fork number */
I still think we'd be better off with something like uint32
relForkDetails[2]. The bitfields would be nice if they meant that we
didn't have to do bit-shifting and masking operations ourselves, but
with the field split this way, we do anyway. So what's the point in
mixing the approaches?
* relNumber identifies the specific relation. relNumber corresponds to
* pg_class.relfilenode (NOT pg_class.oid, because we need to be able
* to assign new physical files to relations in some situations).
- * Notice that relNumber is only unique within a database in a particular
- * tablespace.
+ * Notice that relNumber is unique within a cluster.
I think this paragraph would benefit from more revision. I think that
we should just nuke the parenthesized part altogether, since we'll now
never use pg_class.oid as relNumber, and to suggest otherwise is just
confusing. As for the last sentence, "Notice that relNumber is unique
within a cluster." isn't wrong, but I think we could be more precise
and informative. Perhaps: "relNumber values are assigned by
GetNewRelFileNumber(), which will only ever assign the same value once
during the lifetime of a cluster. However, since CREATE DATABASE
duplicates the relfilenumbers of the template database, the values are
in practice only unique within a database, not globally."
That's all I've got for now.
Thanks,
--
Robert Haas
EDB: http://www.enterprisedb.com
On Thu, Jul 7, 2022 at 10:56 PM Robert Haas <robertmhaas@gmail.com> wrote:
I have accepted all the suggestions; find my inline replies where we
need more thought.
buf_init.c:119:4: error: implicit truncation from 'int' to bit-field
changes value from -1 to 255 [-Werror,-Wbitfield-constant-conversion]
CLEAR_BUFFERTAG(buf->tag);
^~~~~~~~~~~~~~~~~~~~~~~~~
../../../../src/include/storage/buf_internals.h:122:14: note: expanded
from macro 'CLEAR_BUFFERTAG'
(a).forkNum = InvalidForkNumber, \
^ ~~~~~~~~~~~~~~~~~
1 error generated.
Hmm, so now that we are using an unsigned int field, IMHO we can make
InvalidForkNumber 255 instead of -1?
I think it is in line with oidCount, what do you think?
Oh it definitely is, and maybe it's OK the way you have it. But the
OID stuff has wraparound to worry about, and this doesn't; and this
has the SetNextRelFileNumber and that doesn't; so it is not
necessarily the case that the design which is best for that case is
also best for this case.
Yeah, right. But now, with the latest changes for piggybacking on
XLogFlush, I think it is cleaner to have the count.
I believe that the persistence model for SetNextRelFileNumber needs
more thought. Right now I believe it's relying on the fact that, after
we try to restore the dump, we'll try to perform a clean shutdown of
the server before doing anything important, and that will persist the
final value, whatever it ends up being. However, there's no comment
explaining that theory of operation, and it seems pretty fragile
anyway. What if things don't go as planned? Suppose the power goes out
halfway through restoring the dump, and the user for some reason then
gives up on running pg_upgrade and just tries to do random things with
that server? Then I think there will be trouble, because nothing has
updated the nextrelfilenumber value and yet there are potentially new
files on disk. Maybe that's a stretch since I think other things might
also break if you do that, but I'm also not sure that's the only
scenario to worry about, especially if you factor in the possibility
of future code changes, like changes to the timing of when we shut
down and restart the server during pg_upgrade, or other uses of
binary-upgrade mode, or whatever. I don't know. Perhaps it's not
actually broken but I'm inclined to think it should be logging its
changes.
But we already log this when the relfilenumber being set is outside
the already-logged range; am I missing something? Check this change:
+ relnumbercount = relnumber - ShmemVariableCache->nextRelFileNumber;
+ if (ShmemVariableCache->relnumbercount <= relnumbercount)
+ {
+ LogNextRelFileNumber(relnumber + VAR_RELNUMBER_PREFETCH, NULL);
+ ShmemVariableCache->relnumbercount = VAR_RELNUMBER_PREFETCH;
+ }
+ else
+ ShmemVariableCache->relnumbercount -= relnumbercount;
A related thought is that I don't think this patch has as many
cross-checks as it could have. For instance, suppose that when we
replay a WAL record that creates relation storage, we cross-check that
the value is less than the counter. I think you have a check in there
someplace that will error out if there is an actual collision --
although I can't find it at the moment, and possibly we want to add
some comments there even if it's in existing code -- but this kind of
thing would detect bugs that could lead to collisions even if no
collision actually occurs, e.g. because a duplicate relfilenumber is
used but in a different database or tablespace. It might be worth
spending some time thinking about other possible cross-checks too.
We're trying to create a system where the relfilenumber counter is
always ahead of all the relfilenumbers used on disk, but the coupling
between the relfilenumber-advancement machinery and the
make-files-on-disk machinery is pretty loose, and so there is a risk
that bugs could escape detection. Whatever we can do to increase the
probability of noticing when things have gone wrong, and/or to notice
it quicker, will be good.
I had those changes in v7-0003; now I have merged them into 0002. It
adds assertion checks while replaying WAL for smgr create and smgr
truncate, and in the normal path, when allocating a new relfilenumber,
we assert that no file with that number already exists.
One thing that isn't great about this whole scheme is that it can lead
to lock pile-ups. Once somebody is waiting for an
XLOG_NEXT_RELFILENUMBER record to reach the disk, any other backend
that tries to get a new relfilenumber is going to block waiting for
RelFileNumberGenLock. I wonder whether this effect is observable in
practice: suppose we just create relations in a tight loop from inside
a stored procedure, and do that simultaneously in multiple backends?
What does the wait event distribution look like? Can we observe a lot
of RelFileNumberGenLock events or not really? I guess if we reduce
VAR_RELNUMBER_PREFETCH enough we can probably create a problem, but
how small a value is needed?
I have done some performance tests: with very small values I can see a
lot of wait events for RelFileNumberGen, but with bigger numbers like
256 or 512 it is not really bad. See the results at the end of this
mail [1] (wait event details).
One thing we could think about doing here is try to stagger the xlog
and the flush. When we've used VAR_RELNUMBER_PREFETCH/2
relfilenumbers, log a record reserving VAR_RELNUMBER_PREFETCH from
where we are now, and remember the LSN. When we've used up our entire
previous allocation, XLogFlush() that record before allowing the
additional values to be used. The bookkeeping would be a bit more
complicated than currently, but I don't think it would be too bad. I'm
not sure how much it would actually help, though, or whether we need
it. If new relfilenumbers are being used up really quickly, then maybe
the record won't get flushed into the background before we run out of
available numbers anyway, and if they aren't, then maybe it doesn't
matter. On the other hand, even one transaction commit between when
the record is logged and when we run out of the previous allocation is
enough to force a flush, at least with synchronous_commit=on, so maybe
the chances of being able to piggyback on an existing flush are not so
bad after all. I'm not sure.
I have made these changes in GetNewRelFileNumber(). This required
tracking the last logged record pointer as well, but I think it looks
clean. With this I can see some reduction in the RelFileNumberGen wait
events [1] (wait event details).
In theory I suppose there's another way we could solve this problem:
keep using the same relfilenumber, and if the scenario described here
occurs, just reuse the old file. The reason why we can't do that today
is because we could be running with wal_level=minimal and replace a
relation with one whose contents aren't logged. If WAL replay then
replays the drop, we're in trouble. But if the only time we reuse a
relfilenumber for new relation storage is when relations are moved
around, then I think that scenario can't happen. However, I think
assigning a new relfilenumber is probably better, because it gets us
closer to a world in which relfilenumbers are never reused at all. It
doesn't get us all the way there because of createdb() and movedb(),
but it gets us closer and I prefer that.
I agree with you.
I notice that the patch makes no changes to relmapper.c, and I think
that's a problem. Notice in particular:

#define MAX_MAPPINGS 62 /* 62 * 8 + 16 = 512 */
I believe that making RelFileNumber into a 64-bit value will cause the
8 in the calculation above to change to 16, defeating the intention
that the size of the file ought to be the smallest imaginable size of
a disk sector. It does seem like it would have been smart to include a
StaticAssertStmt in this file someplace that checks that the data
structure has the expected size, and now might be a good time, perhaps
in a separate patch, to add one. If we do nothing fancy here, the
maximum number of mappings will have to be reduced from 62 to 31,
which is a problem because global/pg_filenode.map currently has 48
entries. We could try to arrange to squeeze padding out of the
RelMapping struct, which would let us use just 12 bytes per mapping,
which would increase the limit to 41, but that's still less than we're
using already, never mind leaving room for future growth.

I don't know what to do about this exactly. I believe it's been
previously suggested that the actual minimum sector size on reasonably
modern hardware is never as small as 512 bytes, so maybe the file size
can just be increased to 1kB or something. If that idea is judged
unsafe, I can think of two other possible approaches offhand. One is
that we could move away from the idea of storing the OIDs in the file
along with the RelFileNodes, and instead store the offset for a given
RelFileNode at a fixed offset in the file. That would require either
hard-wiring offset tables into the code someplace, or generating them
as part of the build process, with separate tables for shared and
database-local relation map files. The other is that we could have
multiple 512-byte sectors and try to arrange for each relation to be
in the same sector with the indexes of that relation, since the
comments in relmapper.c say this:

 * aborts.  An important factor here is that the indexes and toast table of
 * a mapped catalog must also be mapped, so that the rewrites/relocations of
 * all these files commit in a single map file update rather than being tied
 * to transaction commit.

This suggests that atomicity is required across a table and its
indexes, but not that it's needed across arbitrary sets of entries in
the file.

Whatever we do, we shouldn't forget to bump RELMAPPER_FILEMAGIC.
I am not sure what the best solution is here, but I agree that most
modern hardware has a sector size bigger than 512 bytes, so we can
just change the file size to 1024.
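The sizing arithmetic can be checked at compile time along the lines suggested earlier; the struct layout and the 16-byte header below are illustrative assumptions, not the actual relmapper.c definitions:

```c
#include <stdint.h>

/* Hypothetical mirror of a relmapper entry once relfilenumber is 64-bit;
 * alignment padding makes this 16 bytes, the "8 changes to 16" concern. */
typedef struct
{
	uint32_t	mapoid;			/* pg_class OID */
	uint64_t	mapfilenumber;	/* widened relfilenumber */
} RelMapping;

#define RELMAP_HEADER_SIZE 16	/* magic + nummappings + CRC + pad (assumed) */
#define MAPPINGS_IN(filesz) (((filesz) - RELMAP_HEADER_SIZE) / sizeof(RelMapping))

/* StaticAssert-style checks of the kind suggested above: a 512-byte file
 * can no longer hold the 48 entries in use, but a 1024-byte file can. */
_Static_assert(sizeof(RelMapping) == 16, "unexpected mapping size");
_Static_assert(MAPPINGS_IN(512) == 31, "512-byte file holds only 31 entries");
_Static_assert(MAPPINGS_IN(1024) >= 48, "1024-byte file must fit 48 entries");
```

If the padding were squeezed out to get 12-byte entries, the same macro would give 41 entries at 512 bytes, still short of the 48 needed.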
The current value of RELMAPPER_FILEMAGIC is 0x592717; I am not sure
how this version ID is decided. Is it some random magic number, or is
it based on some logic?
+	uint32		relNumber_low;	/* relfilenumber 32 lower bits */
+	uint32		relNumber_hi:24;	/* relfilenumber 24 high bits */
+	uint32		forkNum:8;	/* fork number */

I still think we'd be better off with something like uint32
relForkDetails[2]. The bitfields would be nice if they meant that we
didn't have to do bit-shifting and masking operations ourselves, but
with the field split this way, we do anyway. So what's the point in
mixing the approaches?
Actually, with this we were able to access forkNum directly, but I
also think changing it to relForkDetails[2] is cleaner, so I have done
that. As part of the related changes in 0001 I have also removed the
direct access to forkNum.
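For illustration, here is what the relForkDetails[2] encoding amounts to, with hypothetical helper names standing in for the buf_internals.h macros: the first word carries the high 24 bits of the 56-bit relfilenumber plus the 8-bit fork number, and the second word carries the low 32 bits:

```c
#include <assert.h>
#include <stdint.h>

typedef struct
{
	uint32_t	relForkDetails[2];	/* hi 24 bits + fork, then low 32 bits */
} BufTagBits;

static void
set_relnumber_fork(BufTagBits *tag, uint64_t relnum, uint8_t forknum)
{
	assert(relnum < ((uint64_t) 1 << 56));	/* must fit in 56 bits */
	tag->relForkDetails[0] = ((uint32_t) (relnum >> 32) << 8) | forknum;
	tag->relForkDetails[1] = (uint32_t) relnum;
}

static uint64_t
get_relnumber(const BufTagBits *tag)
{
	return ((uint64_t) (tag->relForkDetails[0] >> 8) << 32) |
		tag->relForkDetails[1];
}

static uint8_t
get_forknum(const BufTagBits *tag)
{
	return (uint8_t) (tag->relForkDetails[0] & 0xFF);
}
```

Since every access already does a shift and a mask, nothing would be gained by layering bitfields on top, which is the point made above.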
[1]: Wait event details
Procedure:
CREATE OR REPLACE FUNCTION create_table(count int) RETURNS void AS $$
DECLARE
relname varchar;
pid int;
i int;
BEGIN
SELECT pg_backend_pid() INTO pid;
relname := 'test_' || pid;
FOR i IN 1..count LOOP
EXECUTE format('CREATE TABLE %s(a int)', relname);
EXECUTE format('DROP TABLE %s', relname);
END LOOP;
END;
$$ LANGUAGE plpgsql;
Target test: Executed "select create_table(100);" query from pgbench
with 32 concurrent backends.
VAR_RELNUMBER_PREFETCH = 8
905 LWLock | LockManager
346 LWLock | RelFileNumberGen
192
190 Activity | WalWriterMain
VAR_RELNUMBER_PREFETCH=128
1187 LWLock | LockManager
247 LWLock | RelFileNumberGen
139 Activity | CheckpointerMain
VAR_RELNUMBER_PREFETCH=256
1029 LWLock | LockManager
158 LWLock | BufferContent
134 Activity | CheckpointerMain
134 Activity | AutoVacuumMain
133 Activity | BgWriterMain
132 Activity | WalWriterMain
130 Activity | LogicalLauncherMain
123 LWLock | RelFileNumberGen
VAR_RELNUMBER_PREFETCH=512
1174 LWLock | LockManager
136 Activity | CheckpointerMain
136 Activity | BgWriterMain
136 Activity | AutoVacuumMain
134 Activity | WalWriterMain
134 Activity | LogicalLauncherMain
99 LWLock | BufferContent
35 LWLock | RelFileNumberGen
VAR_RELNUMBER_PREFETCH=2048
1070 LWLock | LockManager
160 LWLock | BufferContent
156 Activity | CheckpointerMain
156
155 Activity | BgWriterMain
154 Activity | AutoVacuumMain
153 Activity | WalWriterMain
149 Activity | LogicalLauncherMain
31 LWLock | RelFileNumberGen
28 Timeout | VacuumDelay
VAR_RELNUMBER_PREFETCH=4096
Note: no wait event for RelFileNumberGen at value 4096
New patch with piggybacking XLogFlush()
VAR_RELNUMBER_PREFETCH = 8
1105 LWLock | LockManager
143 LWLock | BufferContent
140 Activity | CheckpointerMain
140 Activity | BgWriterMain
139 Activity | WalWriterMain
138 Activity | AutoVacuumMain
137 Activity | LogicalLauncherMain
115 LWLock | RelFileNumberGen
VAR_RELNUMBER_PREFETCH = 256
1130 LWLock | LockManager
141 Activity | CheckpointerMain
139 Activity | BgWriterMain
137 Activity | AutoVacuumMain
136 Activity | LogicalLauncherMain
135 Activity | WalWriterMain
69 LWLock | BufferContent
31 LWLock | RelFileNumberGen
VAR_RELNUMBER_PREFETCH = 1024
Note: no wait event for RelFileNumberGen at value 1024
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachments:
v8-0001-Preliminary-refactoring-for-supporting-larger-rel.patchtext/x-patch; charset=US-ASCII; name=v8-0001-Preliminary-refactoring-for-supporting-larger-rel.patchDownload
From 2ff9c1409367f9f42ca4812fdafcba417b1768c8 Mon Sep 17 00:00:00 2001
From: dilip kumar <dilipbalaut@localhost.localdomain>
Date: Thu, 7 Jul 2022 10:15:23 +0530
Subject: [PATCH v8 1/2] Preliminary refactoring for supporting larger
relfilenumber
Currently, relfilenumber is of Oid type and can wrap around, so as part
of the larger patch set we are trying to make it 64 bits wide to avoid
wraparound; that will also make a couple of other things simpler, as
explained in the next patches.
This is just a preliminary refactoring patch toward that: in BufferTag,
instead of keeping the RelFileLocator, we keep the tablespace Oid, the
database Oid, and the relfilenumber directly, so that once we change
relNumber in RelFileLocator to 64 bits the buffer tag alignment padding
will not change.
---
contrib/pg_buffercache/pg_buffercache_pages.c | 8 +-
contrib/pg_prewarm/autoprewarm.c | 10 +-
src/backend/storage/buffer/bufmgr.c | 140 +++++++++++++++++---------
src/backend/storage/buffer/localbuf.c | 30 ++++--
src/include/storage/buf_internals.h | 43 ++++++--
5 files changed, 159 insertions(+), 72 deletions(-)
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 131bd62..6966e54 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -153,10 +153,10 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenumber = bufHdr->tag.rlocator.relNumber;
- fctx->record[i].reltablespace = bufHdr->tag.rlocator.spcOid;
- fctx->record[i].reldatabase = bufHdr->tag.rlocator.dbOid;
- fctx->record[i].forknum = bufHdr->tag.forkNum;
+ fctx->record[i].relfilenumber = BufTagGetRelNumber(bufHdr->tag);
+ fctx->record[i].reltablespace = bufHdr->tag.spcOid;
+ fctx->record[i].reldatabase = bufHdr->tag.dbOid;
+ fctx->record[i].forknum = BufTagGetForkNum(bufHdr->tag);
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
fctx->record[i].pinning_backends = BUF_STATE_GET_REFCOUNT(buf_state);
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index 13eee4a..55670c2 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -631,10 +631,12 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
if (buf_state & BM_TAG_VALID &&
((buf_state & BM_PERMANENT) || dump_unlogged))
{
- block_info_array[num_blocks].database = bufHdr->tag.rlocator.dbOid;
- block_info_array[num_blocks].tablespace = bufHdr->tag.rlocator.spcOid;
- block_info_array[num_blocks].filenumber = bufHdr->tag.rlocator.relNumber;
- block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
+ block_info_array[num_blocks].database = bufHdr->tag.dbOid;
+ block_info_array[num_blocks].tablespace = bufHdr->tag.spcOid;
+ block_info_array[num_blocks].filenumber =
+ BufTagGetRelNumber(bufHdr->tag);
+ block_info_array[num_blocks].forknum =
+ BufTagGetForkNum(bufHdr->tag);
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
}
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index e4de4b3..0734bb4 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1647,8 +1647,8 @@ ReleaseAndReadBuffer(Buffer buffer,
{
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
- bufHdr->tag.forkNum == forkNum)
+ BufTagMatchesRelFileLocator(bufHdr->tag, relation->rd_locator) &&
+ BufTagGetForkNum(bufHdr->tag) == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
LocalRefCount[-buffer - 1]--;
@@ -1658,8 +1658,8 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetBufferDescriptor(buffer - 1);
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
- bufHdr->tag.forkNum == forkNum)
+ BufTagMatchesRelFileLocator(bufHdr->tag, relation->rd_locator) &&
+ BufTagGetForkNum(bufHdr->tag) == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
}
@@ -2000,9 +2000,9 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
- item->tsId = bufHdr->tag.rlocator.spcOid;
- item->relNumber = bufHdr->tag.rlocator.relNumber;
- item->forkNum = bufHdr->tag.forkNum;
+ item->tsId = bufHdr->tag.spcOid;
+ item->relNumber = BufTagGetRelNumber(bufHdr->tag);
+ item->forkNum = BufTagGetForkNum(bufHdr->tag);
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2692,6 +2692,7 @@ PrintBufferLeakWarning(Buffer buffer)
char *path;
BackendId backend;
uint32 buf_state;
+ RelFileLocator rlocator;
Assert(BufferIsValid(buffer));
if (BufferIsLocal(buffer))
@@ -2707,8 +2708,10 @@ PrintBufferLeakWarning(Buffer buffer)
backend = InvalidBackendId;
}
+ BufTagGetRelFileLocator(buf->tag, rlocator);
+
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rlocator, backend, buf->tag.forkNum);
+ path = relpathbackend(rlocator, backend, BufTagGetForkNum(buf->tag));
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2787,8 +2790,8 @@ BufferGetTag(Buffer buffer, RelFileLocator *rlocator, ForkNumber *forknum,
bufHdr = GetBufferDescriptor(buffer - 1);
/* pinned, so OK to read tag without spinlock */
- *rlocator = bufHdr->tag.rlocator;
- *forknum = bufHdr->tag.forkNum;
+ BufTagGetRelFileLocator(bufHdr->tag, *rlocator);
+ *forknum = BufTagGetForkNum(bufHdr->tag);
*blknum = bufHdr->tag.blockNum;
}
@@ -2838,9 +2841,14 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rlocator, InvalidBackendId);
+ {
+ RelFileLocator rlocator;
+
+ BufTagGetRelFileLocator(buf->tag, rlocator);
+ reln = smgropen(rlocator, InvalidBackendId);
+ }
- TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
+ TRACE_POSTGRESQL_BUFFER_FLUSH_START(BufTagGetForkNum(buf->tag),
buf->tag.blockNum,
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
@@ -2899,7 +2907,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
* bufToWrite is either the shared buffer or a copy, as appropriate.
*/
smgrwrite(reln,
- buf->tag.forkNum,
+ BufTagGetForkNum(buf->tag),
buf->tag.blockNum,
bufToWrite,
false);
@@ -2920,7 +2928,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
*/
TerminateBufferIO(buf, true, 0);
- TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(buf->tag.forkNum,
+ TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(BufTagGetForkNum(buf->tag),
buf->tag.blockNum,
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
@@ -3141,15 +3149,15 @@ DropRelFileLocatorBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
* We could check forkNum and blockNum as well as the rlocator, but
* the incremental win from doing so seems small.
*/
- if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator))
+ if (!BufTagMatchesRelFileLocator(bufHdr->tag, rlocator.locator))
continue;
buf_state = LockBufHdr(bufHdr);
for (j = 0; j < nforks; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator) &&
- bufHdr->tag.forkNum == forkNum[j] &&
+ if (BufTagMatchesRelFileLocator(bufHdr->tag, rlocator.locator) &&
+ BufTagGetForkNum(bufHdr->tag) == forkNum[j] &&
bufHdr->tag.blockNum >= firstDelBlock[j])
{
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -3301,7 +3309,7 @@ DropRelFileLocatorsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
for (j = 0; j < n; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, locators[j]))
+ if (BufTagMatchesRelFileLocator(bufHdr->tag, locators[j]))
{
rlocator = &locators[j];
break;
@@ -3310,7 +3318,10 @@ DropRelFileLocatorsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
}
else
{
- rlocator = bsearch((const void *) &(bufHdr->tag.rlocator),
+ RelFileLocator locator;
+
+ BufTagGetRelFileLocator(bufHdr->tag, locator);
+ rlocator = bsearch((const void *) &(locator),
locators, n, sizeof(RelFileLocator),
rlocator_comparator);
}
@@ -3320,7 +3331,7 @@ DropRelFileLocatorsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, (*rlocator)))
+ if (BufTagMatchesRelFileLocator(bufHdr->tag, (*rlocator)))
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3380,8 +3391,8 @@ FindAndDropRelFileLocatorBuffers(RelFileLocator rlocator, ForkNumber forkNum,
*/
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
- bufHdr->tag.forkNum == forkNum &&
+ if (BufTagMatchesRelFileLocator(bufHdr->tag, rlocator) &&
+ BufTagGetForkNum(bufHdr->tag) == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
@@ -3419,11 +3430,11 @@ DropDatabaseBuffers(Oid dbid)
* As in DropRelFileLocatorBuffers, an unlocked precheck should be
* safe and saves some cycles.
*/
- if (bufHdr->tag.rlocator.dbOid != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rlocator.dbOid == dbid)
+ if (bufHdr->tag.dbOid == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3447,13 +3458,16 @@ PrintBufferDescs(void)
{
BufferDesc *buf = GetBufferDescriptor(i);
Buffer b = BufferDescriptorGetBuffer(buf);
+ RelFileLocator rlocator;
+
+ BufTagGetRelFileLocator(buf->tag, rlocator);
/* theoretically we should lock the bufhdr here */
elog(LOG,
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathbackend(buf->tag.rlocator, InvalidBackendId, buf->tag.forkNum),
+ relpathbackend(rlocator, InvalidBackendId, BufTagGetForkNum(buf->tag)),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3473,12 +3487,16 @@ PrintPinnedBufs(void)
if (GetPrivateRefCount(b) > 0)
{
+ RelFileLocator rlocator;
+
+ BufTagGetRelFileLocator(buf->tag, rlocator);
+
/* theoretically we should lock the bufhdr here */
elog(LOG,
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathperm(buf->tag.rlocator, buf->tag.forkNum),
+ relpathperm(rlocator, BufTagGetForkNum(buf->tag)),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3517,7 +3535,7 @@ FlushRelationBuffers(Relation rel)
uint32 buf_state;
bufHdr = GetLocalBufferDescriptor(i);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
+ if (BufTagMatchesRelFileLocator(bufHdr->tag, rel->rd_locator) &&
((buf_state = pg_atomic_read_u32(&bufHdr->state)) &
(BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
@@ -3535,7 +3553,7 @@ FlushRelationBuffers(Relation rel)
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
smgrwrite(RelationGetSmgr(rel),
- bufHdr->tag.forkNum,
+ BufTagGetForkNum(bufHdr->tag),
bufHdr->tag.blockNum,
localpage,
false);
@@ -3564,13 +3582,13 @@ FlushRelationBuffers(Relation rel)
* As in DropRelFileLocatorBuffers, an unlocked precheck should be
* safe and saves some cycles.
*/
- if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator))
+ if (!BufTagMatchesRelFileLocator(bufHdr->tag, rel->rd_locator))
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
+ if (BufTagMatchesRelFileLocator(bufHdr->tag, rel->rd_locator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3644,7 +3662,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
for (j = 0; j < nrels; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, srels[j].rlocator))
+ if (BufTagMatchesRelFileLocator(bufHdr->tag, srels[j].rlocator))
{
srelent = &srels[j];
break;
@@ -3653,7 +3671,10 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
}
else
{
- srelent = bsearch((const void *) &(bufHdr->tag.rlocator),
+ RelFileLocator rlocator;
+
+ BufTagGetRelFileLocator(bufHdr->tag, rlocator);
+ srelent = bsearch((const void *) &(rlocator),
srels, nrels, sizeof(SMgrSortArray),
rlocator_comparator);
}
@@ -3665,7 +3686,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, srelent->rlocator) &&
+ if (BufTagMatchesRelFileLocator(bufHdr->tag, srelent->rlocator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3867,13 +3888,13 @@ FlushDatabaseBuffers(Oid dbid)
* As in DropRelFileLocatorBuffers, an unlocked precheck should be
* safe and saves some cycles.
*/
- if (bufHdr->tag.rlocator.dbOid != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rlocator.dbOid == dbid &&
+ if (bufHdr->tag.dbOid == dbid &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -4033,6 +4054,10 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
if (XLogHintBitIsNeeded() &&
(pg_atomic_read_u32(&bufHdr->state) & BM_PERMANENT))
{
+ RelFileLocator rlocator;
+
+ BufTagGetRelFileLocator(bufHdr->tag, rlocator);
+
/*
* If we must not write WAL, due to a relfilelocator-specific
* condition or being in recovery, don't dirty the page. We can
@@ -4041,8 +4066,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
*
* See src/backend/storage/page/README for longer discussion.
*/
- if (RecoveryInProgress() ||
- RelFileLocatorSkippingWAL(bufHdr->tag.rlocator))
+ if (RecoveryInProgress() || RelFileLocatorSkippingWAL(rlocator))
return;
/*
@@ -4650,8 +4674,10 @@ AbortBufferIO(void)
{
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
+ RelFileLocator rlocator;
- path = relpathperm(buf->tag.rlocator, buf->tag.forkNum);
+ BufTagGetRelFileLocator(buf->tag, rlocator);
+ path = relpathperm(rlocator, BufTagGetForkNum(buf->tag));
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4675,7 +4701,11 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rlocator, bufHdr->tag.forkNum);
+ char *path;
+ RelFileLocator rlocator;
+
+ BufTagGetRelFileLocator(bufHdr->tag, rlocator);
+ path = relpathperm(rlocator, BufTagGetForkNum(bufHdr->tag));
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4693,8 +4723,12 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
- char *path = relpathbackend(bufHdr->tag.rlocator, MyBackendId,
- bufHdr->tag.forkNum);
+ char *path;
+ RelFileLocator rlocator;
+
+ BufTagGetRelFileLocator(bufHdr->tag, rlocator);
+ path = relpathbackend(rlocator, MyBackendId,
+ BufTagGetForkNum(bufHdr->tag));
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4788,15 +4822,20 @@ static inline int
buffertag_comparator(const BufferTag *ba, const BufferTag *bb)
{
int ret;
+ RelFileLocator rlocatora;
+ RelFileLocator rlocatorb;
+
+ BufTagGetRelFileLocator(*ba, rlocatora);
+ BufTagGetRelFileLocator(*bb, rlocatorb);
- ret = rlocator_comparator(&ba->rlocator, &bb->rlocator);
+ ret = rlocator_comparator(&rlocatora, &rlocatorb);
if (ret != 0)
return ret;
- if (ba->forkNum < bb->forkNum)
+ if (BufTagGetForkNum(*ba) < BufTagGetForkNum(*bb))
return -1;
- if (ba->forkNum > bb->forkNum)
+ if (BufTagGetForkNum(*ba) > BufTagGetForkNum(*bb))
return 1;
if (ba->blockNum < bb->blockNum)
@@ -4946,10 +4985,12 @@ IssuePendingWritebacks(WritebackContext *context)
SMgrRelation reln;
int ahead;
BufferTag tag;
+ RelFileLocator currlocator;
Size nblocks = 1;
cur = &context->pending_writebacks[i];
tag = cur->tag;
+ BufTagGetRelFileLocator(tag, currlocator);
/*
* Peek ahead, into following writeback requests, to see if they can
@@ -4957,11 +4998,14 @@ IssuePendingWritebacks(WritebackContext *context)
*/
for (ahead = 0; i + ahead + 1 < context->nr_pending; ahead++)
{
+ RelFileLocator nextrlocator;
+
next = &context->pending_writebacks[i + ahead + 1];
+ BufTagGetRelFileLocator(next->tag, nextrlocator);
/* different file, stop */
- if (!RelFileLocatorEquals(cur->tag.rlocator, next->tag.rlocator) ||
- cur->tag.forkNum != next->tag.forkNum)
+ if (!RelFileLocatorEquals(currlocator, nextrlocator) ||
+ BufTagGetForkNum(cur->tag) != BufTagGetForkNum(next->tag))
break;
/* ok, block queued twice, skip */
@@ -4979,8 +5023,8 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rlocator, InvalidBackendId);
- smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
+ reln = smgropen(currlocator, InvalidBackendId);
+ smgrwriteback(reln, BufTagGetForkNum(tag), tag.blockNum, nblocks);
}
context->nr_pending = 0;
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index 41a0807..0d42086 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -213,15 +213,18 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
{
SMgrRelation oreln;
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
+ RelFileLocator rlocator;
+
+ BufTagGetRelFileLocator(bufHdr->tag, rlocator);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rlocator, MyBackendId);
+ oreln = smgropen(rlocator, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
/* And write... */
smgrwrite(oreln,
- bufHdr->tag.forkNum,
+ BufTagGetForkNum(bufHdr->tag),
bufHdr->tag.blockNum,
localpage,
false);
@@ -337,16 +340,22 @@ DropRelFileLocatorLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
- bufHdr->tag.forkNum == forkNum &&
+ BufTagMatchesRelFileLocator(bufHdr->tag, rlocator) &&
+ BufTagGetForkNum(bufHdr->tag) == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
+ {
+ RelFileLocator rlocator;
+
+ BufTagGetRelFileLocator(bufHdr->tag, rlocator);
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rlocator, MyBackendId,
- bufHdr->tag.forkNum),
+ relpathbackend(rlocator, MyBackendId,
+ BufTagGetForkNum(bufHdr->tag)),
LocalRefCount[i]);
+ }
+
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
hash_search(LocalBufHash, (void *) &bufHdr->tag,
@@ -383,13 +392,16 @@ DropRelFileLocatorAllLocalBuffers(RelFileLocator rlocator)
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator))
+ BufTagMatchesRelFileLocator(bufHdr->tag, rlocator))
{
+ RelFileLocator rlocator;
+
+ BufTagGetRelFileLocator(bufHdr->tag, rlocator);
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rlocator, MyBackendId,
- bufHdr->tag.forkNum),
+ relpathbackend(rlocator, MyBackendId,
+ BufTagGetForkNum(bufHdr->tag)),
LocalRefCount[i]);
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index aded5e8..40af078 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,34 +90,63 @@
*/
typedef struct buftag
{
- RelFileLocator rlocator; /* physical relation identifier */
- ForkNumber forkNum;
+ Oid spcOid; /* tablespace oid */
+ Oid dbOid; /* database oid */
+ RelFileNumber relNumber; /* relation file number */
+ ForkNumber forkNum; /* fork number */
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
+#define BufTagGetRelNumber(a) ((a).relNumber)
+
+#define BufTagSetRelNumber(a, relnumber) \
+( \
+ (a).relNumber = (relnumber) \
+)
+
+#define BufTagGetForkNum(a) ((a).forkNum)
+
+#define BufTagGetRelFileLocator(a, locator) \
+do { \
+ (locator).spcOid = (a).spcOid; \
+ (locator).dbOid = (a).dbOid; \
+ (locator).relNumber = (a).relNumber; \
+} while(0)
+
#define CLEAR_BUFFERTAG(a) \
( \
- (a).rlocator.spcOid = InvalidOid, \
- (a).rlocator.dbOid = InvalidOid, \
- (a).rlocator.relNumber = InvalidRelFileNumber, \
+ (a).spcOid = InvalidOid, \
+ (a).dbOid = InvalidOid, \
+ BufTagSetRelNumber(a, InvalidRelFileNumber), \
(a).forkNum = InvalidForkNumber, \
(a).blockNum = InvalidBlockNumber \
)
#define INIT_BUFFERTAG(a,xx_rlocator,xx_forkNum,xx_blockNum) \
( \
- (a).rlocator = (xx_rlocator), \
+ (a).spcOid = (xx_rlocator).spcOid, \
+ (a).dbOid = (xx_rlocator).dbOid, \
+ BufTagSetRelNumber(a, (xx_rlocator).relNumber), \
(a).forkNum = (xx_forkNum), \
(a).blockNum = (xx_blockNum) \
)
#define BUFFERTAGS_EQUAL(a,b) \
( \
- RelFileLocatorEquals((a).rlocator, (b).rlocator) && \
+ (a).spcOid == (b).spcOid && \
+ (a).dbOid == (b).dbOid && \
+ (a).relNumber == (b).relNumber && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
)
+#define BufTagMatchesRelFileLocator(a, locator) \
+( \
+ (a).spcOid == (locator).spcOid && \
+ (a).dbOid == (locator).dbOid && \
+ (a).relNumber == (locator).relNumber \
+)
+
/*
* The shared buffer mapping table is partitioned to reduce contention.
* To determine which partition lock a given tag requires, compute the tag's
--
1.8.3.1
v8-0002-Widen-relfilenumber-from-32-bits-to-56-bits.patchtext/x-patch; charset=UTF-8; name=v8-0002-Widen-relfilenumber-from-32-bits-to-56-bits.patchDownload
From 6180e92e810a162b2cd3dfc5c5e2e101655a3c25 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Mon, 11 Jul 2022 14:31:38 +0530
Subject: [PATCH v8 2/2] Widen relfilenumber from 32 bits to 56 bits
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Currently relfilenumber is 32 bits wide, which carries a risk of
wraparound, so relfilenumbers can be reused. To guard against such reuse
there is a complicated hack that leaves a 0-length tombstone file around
until the next checkpoint, and when we allocate a new relfilenumber we
also need to loop to check for an on-disk conflict.
As part of this patch we make the relfilenumber 56 bits wide, with no
provision for wraparound. After this change we will be able to get rid
of the 0-length tombstone files and of the loop that checks for on-disk
conflicts of relfilenumbers.
The reason for making it 56 bits wide instead of 64 is that at 64 bits
the size of the BufferTag would grow, which would increase memory usage
and might also hurt performance. To avoid that, inside the buffer tag we
use 8 bits for the fork number and 56 bits for the relfilenumber.
---
contrib/pg_buffercache/Makefile | 4 +-
.../pg_buffercache/pg_buffercache--1.3--1.4.sql | 30 +++++
contrib/pg_buffercache/pg_buffercache.control | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 39 +++++-
contrib/pg_prewarm/autoprewarm.c | 2 +-
contrib/pg_walinspect/expected/pg_walinspect.out | 4 +-
contrib/pg_walinspect/sql/pg_walinspect.sql | 4 +-
doc/src/sgml/catalogs.sgml | 2 +-
doc/src/sgml/pgbuffercache.sgml | 2 +-
src/backend/access/gin/ginxlog.c | 2 +-
src/backend/access/rmgrdesc/gistdesc.c | 2 +-
src/backend/access/rmgrdesc/heapdesc.c | 2 +-
src/backend/access/rmgrdesc/nbtdesc.c | 2 +-
src/backend/access/rmgrdesc/seqdesc.c | 2 +-
src/backend/access/rmgrdesc/xlogdesc.c | 21 ++-
src/backend/access/transam/README | 5 +-
src/backend/access/transam/varsup.c | 131 +++++++++++++++++-
src/backend/access/transam/xlog.c | 54 ++++++++
src/backend/access/transam/xlogprefetcher.c | 14 +-
src/backend/access/transam/xlogrecovery.c | 6 +-
src/backend/access/transam/xlogutils.c | 12 +-
src/backend/catalog/catalog.c | 149 ++++++++-------------
src/backend/catalog/heap.c | 20 +--
src/backend/catalog/index.c | 11 +-
src/backend/catalog/storage.c | 6 +
src/backend/commands/tablecmds.c | 16 ++-
src/backend/nodes/outfuncs.c | 5 +
src/backend/replication/logical/decode.c | 1 +
src/backend/replication/logical/reorderbuffer.c | 2 +-
src/backend/storage/freespace/fsmpage.c | 2 +-
src/backend/storage/lmgr/lwlocknames.txt | 1 +
src/backend/storage/smgr/md.c | 7 +
src/backend/storage/smgr/smgr.c | 2 +-
src/backend/utils/adt/dbsize.c | 11 +-
src/backend/utils/adt/pg_upgrade_support.c | 22 ++-
src/backend/utils/cache/relcache.c | 3 +-
src/backend/utils/cache/relfilenumbermap.c | 4 +-
src/backend/utils/cache/relmapper.c | 4 +-
src/backend/utils/misc/pg_controldata.c | 9 +-
src/bin/pg_checksums/pg_checksums.c | 4 +-
src/bin/pg_controldata/pg_controldata.c | 2 +
src/bin/pg_dump/pg_dump.c | 20 +--
src/bin/pg_rewind/filemap.c | 6 +-
src/bin/pg_upgrade/info.c | 5 +-
src/bin/pg_upgrade/pg_upgrade.c | 6 +-
src/bin/pg_upgrade/relfilenumber.c | 4 +-
src/bin/pg_waldump/pg_waldump.c | 2 +-
src/common/relpath.c | 20 +--
src/fe_utils/option_utils.c | 39 ++++++
src/include/access/transam.h | 7 +
src/include/access/xlog.h | 2 +
src/include/catalog/catalog.h | 12 +-
src/include/catalog/pg_class.h | 16 +--
src/include/catalog/pg_control.h | 2 +
src/include/catalog/pg_proc.dat | 10 +-
src/include/common/relpath.h | 4 +-
src/include/fe_utils/option_utils.h | 2 +
src/include/postgres_ext.h | 10 +-
src/include/storage/buf_internals.h | 49 ++++---
src/include/storage/relfilelocator.h | 12 +-
src/test/regress/expected/alter_table.out | 24 ++--
src/test/regress/expected/oidjoins.out | 2 +-
src/test/regress/sql/alter_table.sql | 8 +-
63 files changed, 614 insertions(+), 271 deletions(-)
create mode 100644 contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile
index 2ab8c65..c20439a 100644
--- a/contrib/pg_buffercache/Makefile
+++ b/contrib/pg_buffercache/Makefile
@@ -6,8 +6,8 @@ OBJS = \
pg_buffercache_pages.o
EXTENSION = pg_buffercache
-DATA = pg_buffercache--1.2.sql pg_buffercache--1.2--1.3.sql \
- pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql
+DATA = pg_buffercache--1.0--1.1.sql pg_buffercache--1.1--1.2.sql pg_buffercache--1.2.sql \
+ pg_buffercache--1.2--1.3.sql pg_buffercache--1.3--1.4.sql
PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time"
ifdef USE_PGXS
diff --git a/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
new file mode 100644
index 0000000..ee2d9c7
--- /dev/null
+++ b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
@@ -0,0 +1,30 @@
+/* contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.4'" to load this file. \quit
+
+/* First we have to remove them from the extension */
+ALTER EXTENSION pg_buffercache DROP VIEW pg_buffercache;
+ALTER EXTENSION pg_buffercache DROP FUNCTION pg_buffercache_pages();
+
+/* Then we can drop them */
+DROP VIEW pg_buffercache;
+DROP FUNCTION pg_buffercache_pages();
+
+/* Now redefine */
+CREATE OR REPLACE FUNCTION pg_buffercache_pages()
+RETURNS SETOF RECORD
+AS 'MODULE_PATHNAME', 'pg_buffercache_pages_v1_4'
+LANGUAGE C PARALLEL SAFE;
+
+CREATE OR REPLACE VIEW pg_buffercache AS
+ SELECT P.* FROM pg_buffercache_pages() AS P
+ (bufferid integer, relfilenode int8, reltablespace oid, reldatabase oid,
+ relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
+ pinning_backends int4);
+
+-- Don't want these to be available to public.
+REVOKE ALL ON FUNCTION pg_buffercache_pages() FROM PUBLIC;
+REVOKE ALL ON pg_buffercache FROM PUBLIC;
+GRANT EXECUTE ON FUNCTION pg_buffercache_pages() TO pg_monitor;
+GRANT SELECT ON pg_buffercache TO pg_monitor;
diff --git a/contrib/pg_buffercache/pg_buffercache.control b/contrib/pg_buffercache/pg_buffercache.control
index 8c060ae..a82ae5f 100644
--- a/contrib/pg_buffercache/pg_buffercache.control
+++ b/contrib/pg_buffercache/pg_buffercache.control
@@ -1,5 +1,5 @@
# pg_buffercache extension
comment = 'examine the shared buffer cache'
-default_version = '1.3'
+default_version = '1.4'
module_pathname = '$libdir/pg_buffercache'
relocatable = true
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 6966e54..9274825 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -59,9 +59,10 @@ typedef struct
* relation node/tablespace/database/blocknum and dirty indicator.
*/
PG_FUNCTION_INFO_V1(pg_buffercache_pages);
+PG_FUNCTION_INFO_V1(pg_buffercache_pages_v1_4);
-Datum
-pg_buffercache_pages(PG_FUNCTION_ARGS)
+static Datum
+pg_buffercache_pages_internal(PG_FUNCTION_ARGS, Oid rfn_typid)
{
FuncCallContext *funcctx;
Datum result;
@@ -103,7 +104,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
TupleDescInitEntry(tupledesc, (AttrNumber) 1, "bufferid",
INT4OID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 2, "relfilenode",
- OIDOID, -1, 0);
+ rfn_typid, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 3, "reltablespace",
OIDOID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 4, "reldatabase",
@@ -209,7 +210,24 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
}
else
{
- values[1] = ObjectIdGetDatum(fctx->record[i].relfilenumber);
+ if (rfn_typid == INT8OID)
+ values[1] =
+ Int64GetDatum((int64) fctx->record[i].relfilenumber);
+ else
+ {
+ Assert(rfn_typid == OIDOID);
+
+ if (fctx->record[i].relfilenumber > OID_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("relfilenode " INT64_FORMAT " is too large to be represented as an OID",
+ fctx->record[i].relfilenumber),
+ errhint("Upgrade the extension using ALTER EXTENSION pg_buffercache UPDATE")));
+
+ values[1] =
+ ObjectIdGetDatum((Oid) fctx->record[i].relfilenumber);
+ }
+
nulls[1] = false;
values[2] = ObjectIdGetDatum(fctx->record[i].reltablespace);
nulls[2] = false;
@@ -237,3 +255,16 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
else
SRF_RETURN_DONE(funcctx);
}
+
+Datum
+pg_buffercache_pages(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, OIDOID);
+}
+
+/* entry point for extension version 1.4 and later */
+Datum
+pg_buffercache_pages_v1_4(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, INT8OID);
+}
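As an aside, the key subtlety in the 1.3-and-earlier code path above is that a 56-bit relfilenumber may no longer fit in the legacy OID-typed column. The range check can be illustrated with a standalone sketch (MY_OID_MAX and fits_in_oid are illustrative names, not the patch's code):

```c
#include <stdint.h>
#include <stdbool.h>

/* Stand-in for PostgreSQL's OID_MAX; OIDs are unsigned 32-bit. */
#define MY_OID_MAX 0xFFFFFFFFULL

/*
 * Returns true if a 56-bit relfilenumber can still be reported through
 * the legacy OID-typed column of pg_buffercache <= 1.3; otherwise the
 * patch raises an error and hints at upgrading the extension.
 */
bool
fits_in_oid(int64_t relfilenumber)
{
    return relfilenumber >= 0 && (uint64_t) relfilenumber <= MY_OID_MAX;
}
```

Once the counter passes 2^32-1, old extension versions error out rather than silently truncating.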
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index 55670c2..5de54f8 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -534,7 +534,7 @@ autoprewarm_database_main(Datum main_arg)
* smgrexists is not safe for illegal forknum, hence check whether
* the passed forknum is valid before using it in smgrexists.
*/
- if (blk->forknum > InvalidForkNumber &&
+ if (blk->forknum != InvalidForkNumber &&
blk->forknum <= MAX_FORKNUM &&
smgrexists(RelationGetSmgr(rel), blk->forknum))
nblocks = RelationGetNumberOfBlocksInFork(rel, blk->forknum);
diff --git a/contrib/pg_walinspect/expected/pg_walinspect.out b/contrib/pg_walinspect/expected/pg_walinspect.out
index a1ee743..e9b06ed 100644
--- a/contrib/pg_walinspect/expected/pg_walinspect.out
+++ b/contrib/pg_walinspect/expected/pg_walinspect.out
@@ -54,9 +54,9 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- ===================================================================
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
ok
----
t
diff --git a/contrib/pg_walinspect/sql/pg_walinspect.sql b/contrib/pg_walinspect/sql/pg_walinspect.sql
index 1b265ea..5393834 100644
--- a/contrib/pg_walinspect/sql/pg_walinspect.sql
+++ b/contrib/pg_walinspect/sql/pg_walinspect.sql
@@ -39,10 +39,10 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
-- ===================================================================
-- Test for filtering out WAL records based on resource_manager and
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 4f3f375..3a48c35 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1965,7 +1965,7 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
</para>
<para>
Name of the on-disk file of this relation; zero means this
diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
index a06fd3e..e222265 100644
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@@ -62,7 +62,7 @@
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
(references <link linkend="catalog-pg-class"><structname>pg_class</structname></link>.<structfield>relfilenode</structfield>)
</para>
<para>
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 41b9211..b75ad79 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -100,7 +100,7 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
BlockNumber blknum;
BufferGetTag(buffer, &locator, &forknum, &blknum);
- elog(ERROR, "failed to add item to index page in %u/%u/%u",
+ elog(ERROR, "failed to add item to index page in %u/%u/" INT64_FORMAT,
locator.spcOid, locator.dbOid, locator.relNumber);
}
}
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index 7dd3c1d..c699937 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -26,7 +26,7 @@ out_gistxlogPageUpdate(StringInfo buf, gistxlogPageUpdate *xlrec)
static void
out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec)
{
- appendStringInfo(buf, "rel %u/%u/%u; blk %u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; blk %u; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber, xlrec->block,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 923d3bc..f6d278b 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -169,7 +169,7 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
{
xl_heap_new_cid *xlrec = (xl_heap_new_cid *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; tid %u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; tid %u/%u",
xlrec->target_locator.spcOid,
xlrec->target_locator.dbOid,
xlrec->target_locator.relNumber,
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index 4843cd5..70feb2d 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -100,7 +100,7 @@ btree_desc(StringInfo buf, XLogReaderState *record)
{
xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/seqdesc.c b/src/backend/access/rmgrdesc/seqdesc.c
index b3845f9..45c6ee7 100644
--- a/src/backend/access/rmgrdesc/seqdesc.c
+++ b/src/backend/access/rmgrdesc/seqdesc.c
@@ -25,7 +25,7 @@ seq_desc(StringInfo buf, XLogReaderState *record)
xl_seq_rec *xlrec = (xl_seq_rec *) rec;
if (info == XLOG_SEQ_LOG)
- appendStringInfo(buf, "rel %u/%u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT,
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber);
}
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 6fec485..a723790 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -45,8 +45,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
CheckPoint *checkpoint = (CheckPoint *) rec;
appendStringInfo(buf, "redo %X/%X; "
- "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; "
- "oldest xid %u in DB %u; oldest multi %u in DB %u; "
+ "tli %u; prev tli %u; fpw %s; xid %u:%u; relfilenumber " INT64_FORMAT ";oid %u; "
+ "multi %u; offset %u; oldest xid %u in DB %u; oldest multi %u in DB %u; "
"oldest/newest commit timestamp xid: %u/%u; "
"oldest running xid %u; %s",
LSN_FORMAT_ARGS(checkpoint->redo),
@@ -55,6 +55,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->fullPageWrites ? "true" : "false",
EpochFromFullTransactionId(checkpoint->nextXid),
XidFromFullTransactionId(checkpoint->nextXid),
+ checkpoint->nextRelFileNumber,
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
@@ -74,6 +75,13 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
memcpy(&nextOid, rec, sizeof(Oid));
appendStringInfo(buf, "%u", nextOid);
}
+ else if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, rec, sizeof(RelFileNumber));
+ appendStringInfo(buf, INT64_FORMAT, nextRelFileNumber);
+ }
else if (info == XLOG_RESTORE_POINT)
{
xl_restore_point *xlrec = (xl_restore_point *) rec;
@@ -169,6 +177,9 @@ xlog_identify(uint8 info)
case XLOG_NEXTOID:
id = "NEXTOID";
break;
+ case XLOG_NEXT_RELFILENUMBER:
+ id = "NEXT_RELFILENUMBER";
+ break;
case XLOG_SWITCH:
id = "SWITCH";
break;
@@ -237,7 +248,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfoChar(buf, ' ');
appendStringInfo(buf,
- "blkref #%d: rel %u/%u/%u fork %s blk %u",
+ "blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -297,7 +308,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
if (forknum != MAIN_FORKNUM)
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u fork %s blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -306,7 +317,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
else
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
blk);
diff --git a/src/backend/access/transam/README b/src/backend/access/transam/README
index 734c39a..8911671 100644
--- a/src/backend/access/transam/README
+++ b/src/backend/access/transam/README
@@ -692,8 +692,9 @@ by having database restart search for files that don't have any committed
entry in pg_class, but that currently isn't done because of the possibility
of deleting data that is useful for forensic analysis of the crash.
Orphan files are harmless --- at worst they waste a bit of disk space ---
-because we check for on-disk collisions when allocating new relfilenumber
-OIDs. So cleaning up isn't really necessary.
+because the relfilenumber counter is monotonically increasing. The maximum
+value is 2^56-1, and there is no provision for wraparound. Thus, on-disk
+collisions aren't possible.
3. Deleting a table, which requires an unlink() that could fail.
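To put the 2^56-1 ceiling mentioned in the README change in perspective, here is a back-of-the-envelope sketch (MY_MAX_RELFILENUMBER is an illustrative stand-in for the patch's MAX_RELFILENUMBER macro):

```c
#include <stdint.h>

/* Relfilenumbers are capped at 2^56 - 1 so they pack into 56 bits. */
#define MY_MAX_RELFILENUMBER ((UINT64_C(1) << 56) - 1)

/*
 * At a sustained rate of relfilenumber allocations per second, how many
 * years until the counter could be exhausted?  At one million per second
 * the answer is over two thousand years, which is why the README can say
 * there is no provision for wraparound.
 */
double
years_until_exhaustion(double allocs_per_second)
{
    return (double) MY_MAX_RELFILENUMBER /
           (allocs_per_second * 60.0 * 60.0 * 24.0 * 365.0);
}
```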
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 849a7ce..e13e3a5 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -30,6 +30,9 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+/* Number of RelFileNumbers to prefetch (preallocate) per XLOG write */
+#define VAR_RELNUMBER_PREFETCH 512
+
/* pointer to "variable cache" in shared memory (set up by shmem.c) */
VariableCache ShmemVariableCache = NULL;
@@ -521,8 +524,7 @@ ForceTransactionIdLimitUpdate(void)
* wide, counter wraparound will occur eventually, and therefore it is unwise
* to assume they are unique unless precautions are taken to make them so.
* Hence, this routine should generally not be used directly. The only direct
- * callers should be GetNewOidWithIndex() and GetNewRelFileNumber() in
- * catalog/catalog.c.
+ * caller should be GetNewOidWithIndex() in catalog/catalog.c.
*/
Oid
GetNewObjectId(void)
@@ -613,6 +615,131 @@ SetNextObjectId(Oid nextOid)
}
/*
+ * GetNewRelFileNumber
+ *
+ * Similar to GetNewObjectId but instead of new Oid it generates new
+ * relfilenumber.
+ */
+RelFileNumber
+GetNewRelFileNumber(void)
+{
+ RelFileNumber result;
+
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot assign RelFileNumber during recovery");
+
+ if (IsBinaryUpgrade)
+ elog(ERROR, "cannot assign RelFileNumber during binary upgrade");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /* check for relfilenumber counter wraparound */
+ if (unlikely(ShmemVariableCache->nextRelFileNumber > MAX_RELFILENUMBER))
+ elog(ERROR, "relfilenumber is out of bounds");
+
+ /*
+ * If we have consumed more than half of the logged RelFileNumbers, log
+ * more. Ideally, we could wait until the logged range is fully consumed
+ * before logging more, but then we would have to flush the new WAL
+ * record immediately: the invariant is that nextRelFileNumber must
+ * always be larger than any relfilenumber already in use on disk, so
+ * the record covering a range must reach disk before any file is
+ * created from that range. By logging early, while part of the previous
+ * range remains, we only need to flush the previously logged record
+ * before consuming values from the new range, and by then it has
+ * usually already been flushed as a side effect of some other XLogFlush
+ * call. VAR_RELNUMBER_PREFETCH is large enough that these flushes are
+ * rare anyway, but avoiding synchronous flushes is still worthwhile.
+ */
+ if (ShmemVariableCache->relnumbercount <= VAR_RELNUMBER_PREFETCH / 2)
+ {
+ /*
+ * The first time through, this flushes the newly logged WAL record
+ * immediately, since nothing has been logged in advance. On later
+ * calls we remember the previously logged record pointer and flush
+ * up to that point instead.
+ *
+ * XXX The second call may flush the same record that the first call
+ * already flushed, but that flush is logically a no-op, so it is not
+ * worth adding extra complexity to avoid it.
+ */
+ LogNextRelFileNumber(ShmemVariableCache->nextRelFileNumber +
+ VAR_RELNUMBER_PREFETCH,
+ &ShmemVariableCache->lastRelFileNumberRecPtr);
+
+ ShmemVariableCache->relnumbercount = VAR_RELNUMBER_PREFETCH;
+ }
+
+ result = ShmemVariableCache->nextRelFileNumber;
+ (ShmemVariableCache->nextRelFileNumber)++;
+ (ShmemVariableCache->relnumbercount)--;
+
+ LWLockRelease(RelFileNumberGenLock);
+
+ return result;
+}
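For review purposes, the log-ahead scheme in GetNewRelFileNumber() can be modelled with a tiny standalone counter. None of the names below are from the patch; the point is the invariant that no value is ever handed out beyond the logged bound:

```c
#include <stdint.h>

#define PREFETCH 512            /* stands in for VAR_RELNUMBER_PREFETCH */

typedef struct
{
    int64_t next;               /* next value to hand out */
    int64_t count;              /* values remaining before we must log again */
    int64_t logged_upto;        /* highest value covered by a simulated WAL record */
    int     log_calls;          /* how many times we "logged" a new range */
} Counter;

/* Allocate one value, logging a new range once fewer than half remain. */
int64_t
counter_alloc(Counter *c)
{
    if (c->count <= PREFETCH / 2)
    {
        /* simulates LogNextRelFileNumber(): extend the logged range */
        c->logged_upto = c->next + PREFETCH;
        c->count = PREFETCH;
        c->log_calls++;
    }
    c->count--;
    return c->next++;
}

/* Run n allocations and verify every value stayed below the logged bound. */
int
counter_invariant_holds(int n)
{
    Counter c = {1, 0, 0, 0};

    for (int i = 0; i < n; i++)
    {
        int64_t v = counter_alloc(&c);

        if (v >= c.logged_upto)
            return 0;
    }
    return 1;
}
```

Because a fresh range is logged while half of the previous one is still unconsumed, the logged bound always stays ahead of the values handed out.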
+
+/*
+ * SetNextRelFileNumber
+ *
+ * This may only be called during pg_upgrade; it advances the RelFileNumber
+ * counter to the specified value if the current value is smaller than the
+ * input value.
+ */
+void
+SetNextRelFileNumber(RelFileNumber relnumber)
+{
+ int relnumbercount;
+
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot forward RelFileNumber during recovery");
+
+ if (!IsBinaryUpgrade)
+ elog(ERROR, "RelFileNumber can be set only during binary upgrade");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /*
+ * If nextRelFileNumber is already past the requested value, there is
+ * nothing to do. This can happen because objects are not created in
+ * relfilenumber order during an upgrade.
+ */
+ if (relnumber <= ShmemVariableCache->nextRelFileNumber)
+ {
+ LWLockRelease(RelFileNumberGenLock);
+ return;
+ }
+
+ /*
+ * If setting this new relfilenumber would exhaust the already-logged
+ * range, WAL-log a new range; otherwise just adjust the relnumbercount
+ * counter.
+ *
+ * XXX This only happens during binary upgrade, so we don't bother with
+ * the logic for avoiding extra XLogFlush calls that
+ * GetNewRelFileNumber() uses.
+ */
+ relnumbercount = relnumber - ShmemVariableCache->nextRelFileNumber;
+ if (ShmemVariableCache->relnumbercount <= relnumbercount)
+ {
+ LogNextRelFileNumber(relnumber + VAR_RELNUMBER_PREFETCH, NULL);
+ ShmemVariableCache->relnumbercount = VAR_RELNUMBER_PREFETCH;
+ }
+ else
+ ShmemVariableCache->relnumbercount -= relnumbercount;
+
+ ShmemVariableCache->nextRelFileNumber = relnumber;
+
+ LWLockRelease(RelFileNumberGenLock);
+}
+
+/*
* StopGeneratingPinnedObjectIds
*
* This is called once during initdb to force the OID counter up to
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index b809a21..9bb9feb 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4543,6 +4543,7 @@ BootStrapXLOG(void)
checkPoint.nextXid =
FullTransactionIdFromEpochAndXid(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstGenbkiObjectId;
+ checkPoint.nextRelFileNumber = FirstNormalRelFileNumber;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
checkPoint.oldestXid = FirstNormalTransactionId;
@@ -4556,7 +4557,10 @@ BootStrapXLOG(void)
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->relnumbercount = 0;
+ ShmemVariableCache->lastRelFileNumberRecPtr = InvalidXLogRecPtr;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -5023,7 +5027,9 @@ StartupXLOG(void)
/* initialize shared memory variables from the checkpoint record */
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->relnumbercount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -6483,6 +6489,12 @@ CreateCheckPoint(int flags)
checkPoint.nextOid += ShmemVariableCache->oidCount;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_SHARED);
+ checkPoint.nextRelFileNumber = ShmemVariableCache->nextRelFileNumber;
+ if (!shutdown)
+ checkPoint.nextRelFileNumber += ShmemVariableCache->relnumbercount;
+ LWLockRelease(RelFileNumberGenLock);
+
MultiXactGetCheckptMulti(shutdown,
&checkPoint.nextMulti,
&checkPoint.nextMultiOffset,
@@ -7361,6 +7373,34 @@ XLogPutNextOid(Oid nextOid)
}
/*
+ * Similar to XLogPutNextOid, but writes a NEXT_RELFILENUMBER record instead
+ * of a NEXTOID record. If '*prevrecptr' is valid, flush WAL up to that
+ * record pointer; otherwise flush up to the record just written. If
+ * 'prevrecptr' is not NULL, also store the new record's pointer there.
+ */
+void
+LogNextRelFileNumber(RelFileNumber nextrelnumber, XLogRecPtr *prevrecptr)
+{
+ XLogRecPtr recptr;
+
+ XLogBeginInsert();
+ XLogRegisterData((char *) (&nextrelnumber), sizeof(RelFileNumber));
+ recptr = XLogInsert(RM_XLOG_ID, XLOG_NEXT_RELFILENUMBER);
+
+ /*
+ * If a valid prevrecptr was passed, flush WAL up to that record;
+ * otherwise flush the newly logged record.
+ */
+ if ((prevrecptr != NULL) && !XLogRecPtrIsInvalid(*prevrecptr))
+ XLogFlush(*prevrecptr);
+ else
+ XLogFlush(recptr);
+
+ if (prevrecptr != NULL)
+ *prevrecptr = recptr;
+}
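The deferred-flush trick in LogNextRelFileNumber() is easier to see in a toy model (illustrative names and pretend LSNs only): we flush the previous range's record, which already covers everything consumed so far, instead of the record just written.

```c
#include <stdint.h>

typedef uint64_t Lsn;

Lsn wal_insert_pos = 100;       /* pretend WAL insert pointer */
Lsn wal_flushed_upto = 0;       /* pretend WAL flush pointer */

/* pretend XLogInsert(): each record advances the insert pointer */
Lsn
log_record(void)
{
    wal_insert_pos += 10;
    return wal_insert_pos;
}

/* pretend XLogFlush() */
void
flush_upto(Lsn lsn)
{
    if (lsn > wal_flushed_upto)
        wal_flushed_upto = lsn;
}

/*
 * Toy version of LogNextRelFileNumber()'s flush strategy: flush the
 * previously logged record if there is one, not the record just written,
 * and return the new record's LSN for the caller to remember.
 */
Lsn
log_next_range(Lsn prev)
{
    Lsn rec = log_record();

    flush_upto(prev != 0 ? prev : rec); /* first call must flush itself */
    return rec;
}
```

Only the very first call pays for a synchronous flush of its own record; every later call flushes a strictly older LSN, which some unrelated XLogFlush has often covered already.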
+
+/*
* Write an XLOG SWITCH record.
*
* Here we just blindly issue an XLogInsert request for the record.
@@ -7575,6 +7615,16 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
}
+ if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, XLogRecGetData(record), sizeof(RelFileNumber));
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = nextRelFileNumber;
+ ShmemVariableCache->relnumbercount = 0;
+ LWLockRelease(RelFileNumberGenLock);
+ }
else if (info == XLOG_CHECKPOINT_SHUTDOWN)
{
CheckPoint checkPoint;
@@ -7589,6 +7639,10 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->relnumbercount = 0;
+ LWLockRelease(RelFileNumberGenLock);
MultiXactSetNextMXact(checkPoint.nextMulti,
checkPoint.nextMultiOffset);
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index 87d1421..d27c7fc 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -611,7 +611,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u until %X/%X is replayed, which creates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, which creates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -634,7 +634,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, which truncates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, which truncates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -733,7 +733,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing all prefetch in relation %u/%u/%u until %X/%X is replayed, because the relation does not exist on disk",
+ "suppressing all prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, because the relation does not exist on disk",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -754,7 +754,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, because the relation is too small",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, because the relation is too small",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -793,7 +793,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* truncated beneath our feet?
*/
elog(ERROR,
- "could not prefetch relation %u/%u/%u block %u",
+ "could not prefetch relation %u/%u/" INT64_FORMAT " block %u",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -932,7 +932,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed),
filter->filter_from_block);
@@ -948,7 +948,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed));
#endif
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 5d6f1b5..1d95fc0 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2175,14 +2175,14 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
continue;
if (forknum != MAIN_FORKNUM)
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, fork %u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", fork %u, blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
forknum,
blk);
else
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
@@ -2378,7 +2378,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
if (memcmp(replay_image_masked, primary_image_masked, BLCKSZ) != 0)
{
elog(FATAL,
- "inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
+ "inconsistent page found, rel %u/%u/" INT64_FORMAT ", forknum %u, blkno %u",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forknum, blkno);
}
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 0cda225..b4398e1 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -617,17 +617,17 @@ CreateFakeRelcacheEntry(RelFileLocator rlocator)
rel->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
/* We don't know the name of the relation; use relfilenumber instead */
- sprintf(RelationGetRelationName(rel), "%u", rlocator.relNumber);
+ sprintf(RelationGetRelationName(rel), INT64_FORMAT, rlocator.relNumber);
/*
* We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNumber may be
- * different from the relation's OID. It shouldn't really matter though.
- * In recovery, we are running by ourselves and can't have any lock
- * conflicts. While syncing, we already hold AccessExclusiveLock.
+ * relation. Note we are setting relId to just FirstNormalObjectId which
+ * is completely bogus. It shouldn't really matter though. In recovery,
+ * we are running by ourselves and can't have any lock conflicts. While
+ * syncing, we already hold AccessExclusiveLock.
*/
rel->rd_lockInfo.lockRelId.dbId = rlocator.dbOid;
- rel->rd_lockInfo.lockRelId.relId = rlocator.relNumber;
+ rel->rd_lockInfo.lockRelId.relId = FirstNormalObjectId;
rel->rd_smgr = NULL;
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 6f43870..945690d 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -481,101 +481,6 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
}
/*
- * GetNewRelFileNumber
- * Generate a new relfilenumber that is unique within the
- * database of the given tablespace.
- *
- * If the relfilenumber will also be used as the relation's OID, pass the
- * opened pg_class catalog, and this routine will guarantee that the result
- * is also an unused OID within pg_class. If the result is to be used only
- * as a relfilenumber for an existing relation, pass NULL for pg_class.
- *
- * As with GetNewOidWithIndex(), there is some theoretical risk of a race
- * condition, but it doesn't seem worth worrying about.
- *
- * Note: we don't support using this in bootstrap mode. All relations
- * created by bootstrap have preassigned OIDs, so there's no need.
- */
-RelFileNumber
-GetNewRelFileNumber(Oid reltablespace, Relation pg_class, char relpersistence)
-{
- RelFileLocatorBackend rlocator;
- char *rpath;
- bool collides;
- BackendId backend;
-
- /*
- * If we ever get here during pg_upgrade, there's something wrong; all
- * relfilenumber assignments during a binary-upgrade run should be
- * determined by commands in the dump script.
- */
- Assert(!IsBinaryUpgrade);
-
- switch (relpersistence)
- {
- case RELPERSISTENCE_TEMP:
- backend = BackendIdForTempRelations();
- break;
- case RELPERSISTENCE_UNLOGGED:
- case RELPERSISTENCE_PERMANENT:
- backend = InvalidBackendId;
- break;
- default:
- elog(ERROR, "invalid relpersistence: %c", relpersistence);
- return InvalidRelFileNumber; /* placate compiler */
- }
-
- /* This logic should match RelationInitPhysicalAddr */
- rlocator.locator.spcOid = reltablespace ? reltablespace : MyDatabaseTableSpace;
- rlocator.locator.dbOid =
- (rlocator.locator.spcOid == GLOBALTABLESPACE_OID) ?
- InvalidOid : MyDatabaseId;
-
- /*
- * The relpath will vary based on the backend ID, so we must initialize
- * that properly here to make sure that any collisions based on filename
- * are properly detected.
- */
- rlocator.backend = backend;
-
- do
- {
- CHECK_FOR_INTERRUPTS();
-
- /* Generate the OID */
- if (pg_class)
- rlocator.locator.relNumber = GetNewOidWithIndex(pg_class, ClassOidIndexId,
- Anum_pg_class_oid);
- else
- rlocator.locator.relNumber = GetNewObjectId();
-
- /* Check for existing file of same name */
- rpath = relpath(rlocator, MAIN_FORKNUM);
-
- if (access(rpath, F_OK) == 0)
- {
- /* definite collision */
- collides = true;
- }
- else
- {
- /*
- * Here we have a little bit of a dilemma: if errno is something
- * other than ENOENT, should we declare a collision and loop? In
- * practice it seems best to go ahead regardless of the errno. If
- * there is a colliding file we will get an smgr failure when we
- * attempt to create the new relation file.
- */
- collides = false;
- }
-
- pfree(rpath);
- } while (collides);
-
- return rlocator.locator.relNumber;
-}
-
-/*
* SQL callable interface for GetNewOidWithIndex(). Outside of initdb's
* direct insertions into catalog tables, and recovering from corruption, this
* should rarely be needed.
@@ -678,3 +583,57 @@ pg_stop_making_pinned_objects(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+
+#ifdef USE_ASSERT_CHECKING
+
+/*
+ * Assert that no file already exists on disk for the given relfilenumber.
+ */
+void
+AssertRelfileNumberFileNotExists(Oid spcoid, RelFileNumber relnumber,
+ char relpersistence)
+{
+ RelFileLocatorBackend rlocator;
+ char *rpath;
+ BackendId backend;
+
+ /*
+ * If we ever get here during pg_upgrade, there's something wrong; all
+ * relfilenumber assignments during a binary-upgrade run should be
+ * determined by commands in the dump script.
+ */
+ Assert(!IsBinaryUpgrade);
+
+ switch (relpersistence)
+ {
+ case RELPERSISTENCE_TEMP:
+ backend = BackendIdForTempRelations();
+ break;
+ case RELPERSISTENCE_UNLOGGED:
+ case RELPERSISTENCE_PERMANENT:
+ backend = InvalidBackendId;
+ break;
+ default:
+ elog(ERROR, "invalid relpersistence: %c", relpersistence);
+ return; /* placate compiler */
+ }
+
+ /* this logic should match RelationInitPhysicalAddr */
+ rlocator.locator.spcOid = spcoid ? spcoid : MyDatabaseTableSpace;
+ rlocator.locator.dbOid =
+ (rlocator.locator.spcOid == GLOBALTABLESPACE_OID) ? InvalidOid :
+ MyDatabaseId;
+
+ /*
+ * The relpath will vary based on the backend ID, so we must initialize
+ * that properly here to make sure that any collisions based on filename
+ * are properly detected.
+ */
+ rlocator.backend = backend;
+
+ /* check for existing file of same name. */
+ rpath = relpath(rlocator, MAIN_FORKNUM);
+
+ Assert(access(rpath, F_OK) != 0);
+}
+#endif
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index e770ea6..30fd15c 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -345,7 +345,12 @@ heap_create(const char *relname,
* with oid same as relid.
*/
if (!RelFileNumberIsValid(relfilenumber))
- relfilenumber = relid;
+ {
+ relfilenumber = GetNewRelFileNumber();
+ AssertRelfileNumberFileNotExists(reltablespace,
+ relfilenumber,
+ relpersistence);
+ }
}
/*
@@ -898,7 +903,7 @@ InsertPgClassTuple(Relation pg_class_desc,
values[Anum_pg_class_reloftype - 1] = ObjectIdGetDatum(rd_rel->reloftype);
values[Anum_pg_class_relowner - 1] = ObjectIdGetDatum(rd_rel->relowner);
values[Anum_pg_class_relam - 1] = ObjectIdGetDatum(rd_rel->relam);
- values[Anum_pg_class_relfilenode - 1] = ObjectIdGetDatum(rd_rel->relfilenode);
+ values[Anum_pg_class_relfilenode - 1] = Int64GetDatum(rd_rel->relfilenode);
values[Anum_pg_class_reltablespace - 1] = ObjectIdGetDatum(rd_rel->reltablespace);
values[Anum_pg_class_relpages - 1] = Int32GetDatum(rd_rel->relpages);
values[Anum_pg_class_reltuples - 1] = Float4GetDatum(rd_rel->reltuples);
@@ -1170,12 +1175,7 @@ heap_create_with_catalog(const char *relname,
if (shared_relation && reltablespace != GLOBALTABLESPACE_OID)
elog(ERROR, "shared relations must be placed in pg_global tablespace");
- /*
- * Allocate an OID for the relation, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the relation, unless we were told what to use. */
if (!OidIsValid(relid))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -1229,8 +1229,8 @@ heap_create_with_catalog(const char *relname,
}
if (!OidIsValid(relid))
- relid = GetNewRelFileNumber(reltablespace, pg_class_desc,
- relpersistence);
+ relid = GetNewOidWithIndex(pg_class_desc, ClassOidIndexId,
+ Anum_pg_class_oid);
}
/*
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index c5d463a..3402f49 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -900,12 +900,7 @@ index_create(Relation heapRelation,
collationObjectId,
classObjectId);
- /*
- * Allocate an OID for the index, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the index, unless we were told what to use. */
if (!OidIsValid(indexRelationId))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -937,8 +932,8 @@ index_create(Relation heapRelation,
}
else
{
- indexRelationId =
- GetNewRelFileNumber(tableSpaceId, pg_class, relpersistence);
+ indexRelationId = GetNewOidWithIndex(pg_class, ClassOidIndexId,
+ Anum_pg_class_oid);
}
}
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index d708af1..db85bd8 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -968,6 +968,9 @@ smgr_redo(XLogReaderState *record)
xl_smgr_create *xlrec = (xl_smgr_create *) XLogRecGetData(record);
SMgrRelation reln;
+ Assert(xlrec->rlocator.relNumber <=
+ ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
smgrcreate(reln, xlrec->forkNum, true);
}
@@ -981,6 +984,9 @@ smgr_redo(XLogReaderState *record)
int nforks = 0;
bool need_fsm_vacuum = false;
+ Assert(xlrec->rlocator.relNumber <=
+ ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
/*
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index ef5b34a..9726b6b 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14371,11 +14371,17 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
}
/*
- * Relfilenumbers are not unique in databases across tablespaces, so we
- * need to allocate a new one in the new tablespace.
- */
- newrelfilenumber = GetNewRelFileNumber(newTableSpace, NULL,
- rel->rd_rel->relpersistence);
+ * Generate a new relfilenumber. We cannot reuse the old relfilenumber
+ * because of the possibility that that relation will be moved back to the
+ * original tablespace before the next checkpoint. At that point, the
+ * first segment of the main fork won't have been unlinked yet, and an
+ * attempt to create new relation storage with that same relfilenumber
+ * will fail.
+ */
+ newrelfilenumber = GetNewRelFileNumber();
+ AssertRelfileNumberFileNotExists(newTableSpace,
+ newrelfilenumber,
+ rel->rd_rel->relpersistence);
/* Open old and new relation */
newrlocator = rel->rd_locator;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 4d776e7..2ca0811 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -41,6 +41,11 @@ static void outChar(StringInfo str, char c);
#define WRITE_INT_FIELD(fldname) \
appendStringInfo(str, " :" CppAsString(fldname) " %d", node->fldname)
+/* Write a signed 64-bit integer field (anything written with INT64_FORMAT) */
+#define WRITE_INT64_FIELD(fldname) \
+ appendStringInfo(str, " :" CppAsString(fldname) " " INT64_FORMAT, \
+ node->fldname)
+
/* Write an unsigned integer field (anything written as ":fldname %u") */
#define WRITE_UINT_FIELD(fldname) \
appendStringInfo(str, " :" CppAsString(fldname) " %u", node->fldname)
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index c5c6a2b..7029604 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -154,6 +154,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
break;
case XLOG_NOOP:
case XLOG_NEXTOID:
+ case XLOG_NEXT_RELFILENUMBER:
case XLOG_SWITCH:
case XLOG_BACKUP_END:
case XLOG_PARAMETER_CHANGE:
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 88a37fd..99580a4 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -4869,7 +4869,7 @@ DisplayMapping(HTAB *tuplecid_data)
hash_seq_init(&hstat, tuplecid_data);
while ((ent = (ReorderBufferTupleCidEnt *) hash_seq_search(&hstat)) != NULL)
{
- elog(DEBUG3, "mapping: node: %u/%u/%u tid: %u/%u cmin: %u, cmax: %u",
+ elog(DEBUG3, "mapping: node: %u/%u/" INT64_FORMAT " tid: %u/%u cmin: %u, cmax: %u",
ent->key.rlocator.dbOid,
ent->key.rlocator.spcOid,
ent->key.rlocator.relNumber,
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index af4dab7..172225b 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -273,7 +273,7 @@ restart:
BlockNumber blknum;
BufferGetTag(buf, &rlocator, &forknum, &blknum);
- elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
+ elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/" INT64_FORMAT,
blknum, rlocator.spcOid, rlocator.dbOid, rlocator.relNumber);
/* make sure we hold an exclusive lock */
diff --git a/src/backend/storage/lmgr/lwlocknames.txt b/src/backend/storage/lmgr/lwlocknames.txt
index 6c7cf6c..3c5d041 100644
--- a/src/backend/storage/lmgr/lwlocknames.txt
+++ b/src/backend/storage/lmgr/lwlocknames.txt
@@ -53,3 +53,4 @@ XactTruncationLock 44
# 45 was XactTruncationLock until removal of BackendRandomLock
WrapLimitsVacuumLock 46
NotifyQueueTailLock 47
+RelFileNumberGenLock 48
\ No newline at end of file
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 3998296..a88fef2 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -257,6 +257,13 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* next checkpoint, we prevent reassignment of the relfilenumber until it's
* safe, because relfilenumber assignment skips over any existing file.
*
+ * XXX. Although all of this was true when relfilenumbers were 32 bits wide,
+ * they are now 56 bits wide and do not wrap around, so in the future we can
+ * change the code to immediately unlink the first segment of the relation
+ * along with all the others. We still do reuse relfilenumbers when createdb()
+ * is performed using the file-copy method or during movedb(), but the scenario
+ * described above can only happen when creating a new relation.
+ *
* We do not need to go through this dance for temp relations, though, because
* we never make WAL entries for temp rels, and so a temp rel poses no threat
* to the health of a regular rel that has taken over its relfilenumber.
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index b21d8c3..5f6c12a 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -154,7 +154,7 @@ smgropen(RelFileLocator rlocator, BackendId backend)
/* First time through: initialize the hash table */
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileLocatorBackend);
+ ctl.keysize = SizeOfRelFileLocatorBackend;
ctl.entrysize = sizeof(SMgrRelationData);
SMgrRelationHash = hash_create("smgr relation table", 400,
&ctl, HASH_ELEM | HASH_BLOBS);
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 34efa12..31d7471 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -878,7 +878,7 @@ pg_relation_filenode(PG_FUNCTION_ARGS)
if (!RelFileNumberIsValid(result))
PG_RETURN_NULL();
- PG_RETURN_OID(result);
+ PG_RETURN_INT64(result);
}
/*
@@ -898,9 +898,16 @@ Datum
pg_filenode_relation(PG_FUNCTION_ARGS)
{
Oid reltablespace = PG_GETARG_OID(0);
- RelFileNumber relfilenumber = PG_GETARG_OID(1);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(1);
Oid heaprel;
+ /* check whether the relfilenumber is within the valid range */
+ if (relfilenumber < 0 || relfilenumber > MAX_RELFILENUMBER)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("relfilenumber " INT64_FORMAT " is out of range",
+ relfilenumber)));
+
/* test needed so RelidByRelfilenumber doesn't misbehave */
if (!RelFileNumberIsValid(relfilenumber))
PG_RETURN_NULL();
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 797f5f5..ff8f0c2 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -17,6 +17,7 @@
#include "catalog/pg_type.h"
#include "commands/extension.h"
#include "miscadmin.h"
+#include "storage/relfilelocator.h"
#include "utils/array.h"
#include "utils/builtins.h"
@@ -29,6 +30,15 @@ do { \
errmsg("function can only be called when server is in binary upgrade mode"))); \
} while (0)
+#define CHECK_RELFILENUMBER_RANGE(relfilenumber) \
+do { \
+ if ((relfilenumber) < 0 || (relfilenumber) > MAX_RELFILENUMBER) \
+ ereport(ERROR, \
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE), \
+ errmsg("relfilenumber " INT64_FORMAT " is out of range", \
+ (relfilenumber)))); \
+} while (0)
+
Datum
binary_upgrade_set_next_pg_tablespace_oid(PG_FUNCTION_ARGS)
{
@@ -98,10 +108,12 @@ binary_upgrade_set_next_heap_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_heap_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_heap_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -120,10 +132,12 @@ binary_upgrade_set_next_index_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_index_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_index_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -142,10 +156,12 @@ binary_upgrade_set_next_toast_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_toast_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_toast_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index bdb771d..44c14b8 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -3708,8 +3708,7 @@ RelationSetNewRelfilenumber(Relation relation, char persistence)
RelFileLocator newrlocator;
/* Allocate a new relfilenumber */
- newrelfilenumber = GetNewRelFileNumber(relation->rd_rel->reltablespace,
- NULL, persistence);
+ newrelfilenumber = GetNewRelFileNumber();
/*
* Get a writable copy of the pg_class tuple for the given relation.
diff --git a/src/backend/utils/cache/relfilenumbermap.c b/src/backend/utils/cache/relfilenumbermap.c
index c4245d5..fa9dac8 100644
--- a/src/backend/utils/cache/relfilenumbermap.c
+++ b/src/backend/utils/cache/relfilenumbermap.c
@@ -196,7 +196,7 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
/* set scan arguments */
skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
- skey[1].sk_argument = ObjectIdGetDatum(relfilenumber);
+ skey[1].sk_argument = Int64GetDatum(relfilenumber);
scandesc = systable_beginscan(relation,
ClassTblspcRelfilenodeIndexId,
@@ -213,7 +213,7 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
if (found)
elog(ERROR,
- "unexpected duplicate for tablespace %u, relfilenumber %u",
+ "unexpected duplicate for tablespace %u, relfilenumber " INT64_FORMAT,
reltablespace, relfilenumber);
found = true;
diff --git a/src/backend/utils/cache/relmapper.c b/src/backend/utils/cache/relmapper.c
index 8e5595b..cbc1239 100644
--- a/src/backend/utils/cache/relmapper.c
+++ b/src/backend/utils/cache/relmapper.c
@@ -61,7 +61,7 @@
* The map file is critical data: we have no automatic method for recovering
* from loss or corruption of it. We use a CRC so that we can detect
* corruption. To minimize the risk of failed updates, the map file should
- * be kept to no more than one standard-size disk sector (ie 512 bytes),
+ * be kept to no more than a couple of standard-size disk sectors (ie 1024 bytes),
* and we use overwrite-in-place rather than playing renaming games.
* The struct layout below is designed to occupy exactly 512 bytes, which
* might make filesystem updates a bit more efficient.
@@ -74,7 +74,7 @@
#define RELMAPPER_FILEMAGIC 0x592717 /* version ID value */
-#define MAX_MAPPINGS 62 /* 62 * 8 + 16 = 512 */
+#define MAX_MAPPINGS 62 /* 62 * 16 + 16 < 1024 */
typedef struct RelMapping
{
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index 781f8b8..3c1fef4 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -79,8 +79,8 @@ pg_control_system(PG_FUNCTION_ARGS)
Datum
pg_control_checkpoint(PG_FUNCTION_ARGS)
{
- Datum values[18];
- bool nulls[18];
+ Datum values[19];
+ bool nulls[19];
TupleDesc tupdesc;
HeapTuple htup;
ControlFileData *ControlFile;
@@ -129,6 +129,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
XIDOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 18, "checkpoint_time",
TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 19, "next_relfilenumber",
+ INT8OID, -1, 0);
tupdesc = BlessTupleDesc(tupdesc);
/* Read the control file. */
@@ -202,6 +204,9 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
values[17] = TimestampTzGetDatum(time_t_to_timestamptz(ControlFile->checkPointCopy.time));
nulls[17] = false;
+ values[18] = Int64GetDatum(ControlFile->checkPointCopy.nextRelFileNumber);
+ nulls[18] = false;
+
htup = heap_form_tuple(tupdesc, values, nulls);
PG_RETURN_DATUM(HeapTupleGetDatum(htup));
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index dc20122..30933fd 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -489,9 +489,7 @@ main(int argc, char *argv[])
mode = PG_MODE_ENABLE;
break;
case 'f':
- if (!option_parse_int(optarg, "-f/--filenode", 0,
- INT_MAX,
- NULL))
+ if (!option_parse_relfilenumber(optarg, "-f/--filenode"))
exit(1);
only_filenode = pstrdup(optarg);
break;
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index c390ec5..f727078 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -250,6 +250,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
EpochFromFullTransactionId(ControlFile->checkPointCopy.nextXid),
XidFromFullTransactionId(ControlFile->checkPointCopy.nextXid));
+ printf(_("Latest checkpoint's NextRelFileNumber: " INT64_FORMAT "\n"),
+ ControlFile->checkPointCopy.nextRelFileNumber);
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index e4fdb6b..5179642 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4842,16 +4842,16 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
relkind = *PQgetvalue(upgrade_res, 0, PQfnumber(upgrade_res, "relkind"));
- relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "relfilenode")));
+ relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "relfilenode")));
toast_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "reltoastrelid")));
- toast_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_relfilenode")));
+ toast_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_relfilenode")));
toast_index_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "indexrelid")));
- toast_index_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_index_relfilenode")));
+ toast_index_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_index_relfilenode")));
appendPQExpBufferStr(upgrade_buffer,
"\n-- For binary upgrade, must preserve pg_class oids and relfilenodes\n");
@@ -4869,7 +4869,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
*/
if (RelFileNumberIsValid(relfilenumber) && relkind != RELKIND_PARTITIONED_TABLE)
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
/*
@@ -4883,7 +4883,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_toast_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_relfilenumber);
/* every toast table has an index */
@@ -4891,7 +4891,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_index_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_index_relfilenumber);
}
@@ -4904,7 +4904,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
pg_class_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
}
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 269ed64..8be5e66 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -538,7 +538,7 @@ isRelDataFile(const char *path)
segNo = 0;
matched = false;
- nmatch = sscanf(path, "global/%u.%u", &rlocator.relNumber, &segNo);
+ nmatch = sscanf(path, "global/" INT64_FORMAT ".%u", &rlocator.relNumber, &segNo);
if (nmatch == 1 || nmatch == 2)
{
rlocator.spcOid = GLOBALTABLESPACE_OID;
@@ -547,7 +547,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "base/%u/%u.%u",
+ nmatch = sscanf(path, "base/%u/" INT64_FORMAT ".%u",
&rlocator.dbOid, &rlocator.relNumber, &segNo);
if (nmatch == 2 || nmatch == 3)
{
@@ -556,7 +556,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/%u.%u",
+ nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/" INT64_FORMAT ".%u",
&rlocator.spcOid, &rlocator.dbOid, &rlocator.relNumber,
&segNo);
if (nmatch == 3 || nmatch == 4)
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 5d30b87..0f88cb2 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -399,8 +399,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
i_reloid,
i_indtable,
i_toastheap,
- i_relfilenumber,
i_reltablespace;
+ RelFileNumber i_relfilenumber;
char query[QUERY_ALLOC];
char *last_namespace = NULL,
*last_tablespace = NULL;
@@ -527,7 +527,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
relname = PQgetvalue(res, relnum, i_relname);
curr->relname = pg_strdup(relname);
- curr->relfilenumber = atooid(PQgetvalue(res, relnum, i_relfilenumber));
+ curr->relfilenumber =
+ atorelnumber(PQgetvalue(res, relnum, i_relfilenumber));
curr->tblsp_alloc = false;
/* Is the tablespace oid non-default? */
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index 265d829..4c4f03a 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -15,10 +15,8 @@
* oids are the same between old and new clusters. This is important
* because toast oids are stored as toast pointers in user tables.
*
- * While pg_class.oid and pg_class.relfilenode are initially the same in a
- * cluster, they can diverge due to CLUSTER, REINDEX, or VACUUM FULL. We
- * control assignments of pg_class.relfilenode because we want the filenames
- * to match between the old and new cluster.
+ * We control assignments of pg_class.relfilenode because we want the
+ * filenames to match between the old and new cluster.
*
* We control assignment of pg_tablespace.oid because we want the oid to match
* between the old and new cluster.
diff --git a/src/bin/pg_upgrade/relfilenumber.c b/src/bin/pg_upgrade/relfilenumber.c
index b3ad820..50e94df 100644
--- a/src/bin/pg_upgrade/relfilenumber.c
+++ b/src/bin/pg_upgrade/relfilenumber.c
@@ -190,14 +190,14 @@ transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_fro
else
snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
- snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
+ snprintf(old_file, sizeof(old_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->old_tablespace,
map->old_tablespace_suffix,
map->db_oid,
map->relfilenumber,
type_suffix,
extent_suffix);
- snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
+ snprintf(new_file, sizeof(new_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->new_tablespace,
map->new_tablespace_suffix,
map->db_oid,
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 6528113..9aca583 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -884,7 +884,7 @@ main(int argc, char **argv)
}
break;
case 'R':
- if (sscanf(optarg, "%u/%u/%u",
+ if (sscanf(optarg, "%u/%u/" INT64_FORMAT,
&config.filter_by_relation.spcOid,
&config.filter_by_relation.dbOid,
&config.filter_by_relation.relNumber) != 3 ||
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 1b6b620..0774e3f 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -149,10 +149,10 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
Assert(dbOid == 0);
Assert(backendId == InvalidBackendId);
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("global/%u_%s",
+ path = psprintf("global/" INT64_FORMAT "_%s",
relNumber, forkNames[forkNumber]);
else
- path = psprintf("global/%u", relNumber);
+ path = psprintf("global/" INT64_FORMAT, relNumber);
}
else if (spcOid == DEFAULTTABLESPACE_OID)
{
@@ -160,21 +160,21 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/%u_%s",
+ path = psprintf("base/%u/" INT64_FORMAT "_%s",
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/%u",
+ path = psprintf("base/%u/" INT64_FORMAT,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/t%d_%u_%s",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT "_%s",
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/t%d_%u",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT,
dbOid, backendId, relNumber);
}
}
@@ -184,24 +184,24 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber);
}
diff --git a/src/fe_utils/option_utils.c b/src/fe_utils/option_utils.c
index abea881..7d49359 100644
--- a/src/fe_utils/option_utils.c
+++ b/src/fe_utils/option_utils.c
@@ -82,3 +82,42 @@ option_parse_int(const char *optarg, const char *optname,
*result = val;
return true;
}
+
+/*
+ * option_parse_relfilenumber
+ *
+ * Parse a relfilenumber value for an option. Returns true if the parsing
+ * is successful, or false (after printing an error) if it fails.
+ */
+bool
+option_parse_relfilenumber(const char *optarg, const char *optname)
+{
+ char *endptr;
+ int64 val;
+
+ errno = 0;
+ val = strtoi64(optarg, &endptr, 10);
+
+ /*
+ * Skip any trailing whitespace; if anything but whitespace remains before
+ * the terminating character, fail.
+ */
+ while (*endptr != '\0' && isspace((unsigned char) *endptr))
+ endptr++;
+
+ if (*endptr != '\0')
+ {
+ pg_log_error("invalid value \"%s\" for option %s",
+ optarg, optname);
+ return false;
+ }
+
+ if (val < 0 || val > MAX_RELFILENUMBER)
+ {
+ pg_log_error("%s must be in range " INT64_FORMAT ".." INT64_FORMAT,
+ optname, (RelFileNumber) 0, MAX_RELFILENUMBER);
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 775471d..9135ae6 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -213,6 +213,11 @@ typedef struct VariableCacheData
*/
Oid nextOid; /* next OID to assign */
uint32 oidCount; /* OIDs available before must do XLOG work */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber to assign */
+ uint32 relnumbercount; /* relfilenumbers available before must do
+ * XLOG work */
+ XLogRecPtr lastRelFileNumberRecPtr; /* record pointer w.r.t. the
+ * last logged relfilenumber */
/*
* These fields are protected by XidGenLock.
@@ -293,6 +298,8 @@ extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
extern bool ForceTransactionIdLimitUpdate(void);
extern Oid GetNewObjectId(void);
+extern RelFileNumber GetNewRelFileNumber(void);
+extern void SetNextRelFileNumber(RelFileNumber relnumber);
extern void StopGeneratingPinnedObjectIds(void);
#ifdef USE_ASSERT_CHECKING
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index cd674c3..78660f1 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -234,6 +234,8 @@ extern void CreateCheckPoint(int flags);
extern bool CreateRestartPoint(int flags);
extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN);
extern void XLogPutNextOid(Oid nextOid);
+extern void LogNextRelFileNumber(RelFileNumber nextrelnumber,
+ XLogRecPtr *prevrecptr);
extern XLogRecPtr XLogRestorePoint(const char *rpName);
extern void UpdateFullPageWrites(void);
extern void GetFullPageWriteInfo(XLogRecPtr *RedoRecPtr_p, bool *doPageWrites_p);
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index e1c85f9..be6ba13 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -38,8 +38,14 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
-extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
- Relation pg_class,
- char relpersistence);
+
+#ifdef USE_ASSERT_CHECKING
+extern void AssertRelfileNumberFileNotExists(Oid spcoid,
+ RelFileNumber relnumber,
+ char relpersistence);
+#else
+#define AssertRelfileNumberFileNotExists(spcoid, relnumber, relpersistence) \
+ ((void)true)
+#endif
#endif /* CATALOG_H */
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index e1f4eef..4768e5e 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -34,6 +34,13 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* oid */
Oid oid;
+ /* access method; 0 if not a table / index */
+ Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
+
+ /* identifier of physical storage file */
+ /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
+ int64 relfilenode BKI_DEFAULT(0);
+
/* class name */
NameData relname;
@@ -49,13 +56,6 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* class owner */
Oid relowner BKI_DEFAULT(POSTGRES) BKI_LOOKUP(pg_authid);
- /* access method; 0 if not a table / index */
- Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
-
- /* identifier of physical storage file */
- /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
- Oid relfilenode BKI_DEFAULT(0);
-
/* identifier of table space for relation (0 means default for database) */
Oid reltablespace BKI_DEFAULT(0) BKI_LOOKUP_OPT(pg_tablespace);
@@ -154,7 +154,7 @@ typedef FormData_pg_class *Form_pg_class;
DECLARE_UNIQUE_INDEX_PKEY(pg_class_oid_index, 2662, ClassOidIndexId, on pg_class using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_class_relname_nsp_index, 2663, ClassNameNspIndexId, on pg_class using btree(relname name_ops, relnamespace oid_ops));
-DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode oid_ops));
+DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode int8_ops));
#ifdef EXPOSE_TO_CLIENT_CODE
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 06368e2..096222f 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -41,6 +41,7 @@ typedef struct CheckPoint
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
FullTransactionId nextXid; /* next free transaction ID */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
@@ -78,6 +79,7 @@ typedef struct CheckPoint
#define XLOG_FPI 0xB0
/* 0xC0 is used in Postgres 9.5-11 */
#define XLOG_OVERWRITE_CONTRECORD 0xD0
+#define XLOG_NEXT_RELFILENUMBER 0xE0
/*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 2e41f4d..f6a5b49 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7321,11 +7321,11 @@
proname => 'pg_indexes_size', provolatile => 'v', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_indexes_size' },
{ oid => '2999', descr => 'filenode identifier of relation',
- proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'oid',
+ proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_relation_filenode' },
{ oid => '3454', descr => 'relation OID for filenode and tablespace',
proname => 'pg_filenode_relation', provolatile => 's',
- prorettype => 'regclass', proargtypes => 'oid oid',
+ prorettype => 'regclass', proargtypes => 'oid int8',
prosrc => 'pg_filenode_relation' },
{ oid => '3034', descr => 'file path of relation',
proname => 'pg_relation_filepath', provolatile => 's', prorettype => 'text',
@@ -11191,15 +11191,15 @@
prosrc => 'binary_upgrade_set_missing_value' },
{ oid => '4545', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_heap_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_heap_relfilenode' },
{ oid => '4546', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_index_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_index_relfilenode' },
{ oid => '4547', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_toast_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_toast_relfilenode' },
{ oid => '4548', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_pg_tablespace_oid', provolatile => 'v',
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index 3ab7132..c3b5f91 100644
--- a/src/include/common/relpath.h
+++ b/src/include/common/relpath.h
@@ -39,17 +39,17 @@
*/
typedef enum ForkNumber
{
- InvalidForkNumber = -1,
MAIN_FORKNUM = 0,
FSM_FORKNUM,
VISIBILITYMAP_FORKNUM,
- INIT_FORKNUM
+ INIT_FORKNUM,
/*
* NOTE: if you add a new fork, change MAX_FORKNUM and possibly
* FORKNAMECHARS below, and update the forkNames array in
* src/common/relpath.c
*/
+ InvalidForkNumber = 255
} ForkNumber;
#define MAX_FORKNUM INIT_FORKNUM
diff --git a/src/include/fe_utils/option_utils.h b/src/include/fe_utils/option_utils.h
index 03c09fd..2508a61 100644
--- a/src/include/fe_utils/option_utils.h
+++ b/src/include/fe_utils/option_utils.h
@@ -22,5 +22,7 @@ extern void handle_help_version_opts(int argc, char *argv[],
extern bool option_parse_int(const char *optarg, const char *optname,
int min_range, int max_range,
int *result);
+extern bool option_parse_relfilenumber(const char *optarg,
+ const char *optname);
#endif /* OPTION_UTILS_H */
diff --git a/src/include/postgres_ext.h b/src/include/postgres_ext.h
index c9774fa..d871d4d 100644
--- a/src/include/postgres_ext.h
+++ b/src/include/postgres_ext.h
@@ -48,11 +48,17 @@ typedef PG_INT64_TYPE pg_int64;
/*
* RelFileNumber data type identifies the specific relation file name.
+ * RelFileNumber is unique within a cluster.
*/
-typedef Oid RelFileNumber;
-#define InvalidRelFileNumber ((RelFileNumber) InvalidOid)
+typedef pg_int64 RelFileNumber;
+#define InvalidRelFileNumber ((RelFileNumber) 0)
+#define FirstNormalRelFileNumber ((RelFileNumber) 100000)
#define RelFileNumberIsValid(relnumber) \
((bool) ((relnumber) != InvalidRelFileNumber))
+#define atorelnumber(x) ((RelFileNumber) strtoull((x), NULL, 10))
+
+/* Max value of the relfilenumber; the relfilenumber is 56 bits wide. */
+#define MAX_RELFILENUMBER INT64CONST(0x00FFFFFFFFFFFFFF)
/*
* Identifiers of error message fields. Kept here to keep common
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 40af078..e92cadc 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -92,33 +92,53 @@ typedef struct buftag
{
Oid spcOid; /* tablespace oid */
Oid dbOid; /* database oid */
- RelFileNumber relNumber; /* relation file number */
- ForkNumber forkNum; /* fork number */
+
+ /*
+ * Represents the relfilenumber and the fork number. The high 8 bits of
+ * the first 32-bit integer hold the fork number; the remaining 24 bits
+ * of the first integer and all 32 bits of the second hold the
+ * relfilenumber, making the relfilenumber 56 bits wide. We stop at 56
+ * bits rather than a full 64 because a 64-bit field would increase the
+ * size of the BufferTag, and we use two 32-bit integers instead of a
+ * single 64-bit one to avoid 8-byte alignment padding in the BufferTag
+ * structure.
+ */
+ uint32 relForkDetails[2];
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
-#define BufTagGetRelNumber(a) ((a).relNumber)
+/* Number of bits to represent relNumber in relForkDetails[0] */
+#define BUFFERTAG_RELNUMBER_BITS 24
+
+/* Mask to fetch high bits of relNumber from relForkDetails[0] */
+#define BUFFERTAG_RELNUMBER_MASK ((1U << BUFFERTAG_RELNUMBER_BITS) - 1)
+
+#define BufTagGetRelNumber(a) \
+ ((((uint64) ((a).relForkDetails[0] & BUFFERTAG_RELNUMBER_MASK)) << 32) | \
+ ((uint64) (a).relForkDetails[1]))
-#define BufTagSetRelNumber(a, relnumber) \
+#define BufTagSetRelForkDetails(a, relnumber, forknum) \
( \
- (a).relNumber = (relnumber) \
+ (a).relForkDetails[0] = (((relnumber) >> 32) & BUFFERTAG_RELNUMBER_MASK) | \
+ ((forknum) << BUFFERTAG_RELNUMBER_BITS), \
+ (a).relForkDetails[1] = (relnumber) & 0xffffffff \
)
-#define BufTagGetForkNum(a) ((a).forkNum)
+#define BufTagGetForkNum(a) ((a).relForkDetails[0] >> BUFFERTAG_RELNUMBER_BITS)
#define BufTagGetRelFileLocator(a, locator) \
do { \
(locator).spcOid = (a).spcOid; \
(locator).dbOid = (a).dbOid; \
- (locator).relNumber = (a).relNumber; \
+ (locator).relNumber = BufTagGetRelNumber((a)); \
} while(0)
#define CLEAR_BUFFERTAG(a) \
( \
(a).spcOid = InvalidOid, \
(a).dbOid = InvalidOid, \
- BufTagSetRelNumber(a, InvalidRelFileNumber), \
- (a).forkNum = InvalidForkNumber, \
+ BufTagSetRelForkDetails(a, InvalidRelFileNumber, InvalidForkNumber), \
(a).blockNum = InvalidBlockNumber \
)
@@ -126,8 +146,7 @@ do { \
( \
(a).spcOid = (xx_rlocator).spcOid, \
(a).dbOid = (xx_rlocator).dbOid, \
- BufTagSetRelNumber(a, (xx_rlocator).relNumber), \
- (a).forkNum = (xx_forkNum), \
+ BufTagSetRelForkDetails(a, (xx_rlocator).relNumber, (xx_forkNum)), \
(a).blockNum = (xx_blockNum) \
)
@@ -135,16 +154,16 @@ do { \
( \
(a).spcOid == (b).spcOid && \
(a).dbOid == (b).dbOid && \
- (a).relNumber == (b).relNumber && \
- (a).blockNum == (b).blockNum && \
- (a).forkNum == (b).forkNum \
+ (a).relForkDetails[0] == (b).relForkDetails[0] && \
+ (a).relForkDetails[1] == (b).relForkDetails[1] && \
+ (a).blockNum == (b).blockNum \
)
#define BufTagMatchesRelFileLocator(a, locator) \
( \
(a).spcOid == (locator).spcOid && \
(a).dbOid == (locator).dbOid && \
- (a).relNumber == (locator).relNumber \
+ BufTagGetRelNumber(a) == (locator).relNumber \
)
/*
diff --git a/src/include/storage/relfilelocator.h b/src/include/storage/relfilelocator.h
index 10f41f3..ef90464 100644
--- a/src/include/storage/relfilelocator.h
+++ b/src/include/storage/relfilelocator.h
@@ -32,10 +32,11 @@
* Nonzero dbOid values correspond to pg_database.oid.
*
* relNumber identifies the specific relation. relNumber corresponds to
- * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
- * to assign new physical files to relations in some situations).
- * Notice that relNumber is only unique within a database in a particular
- * tablespace.
+ * pg_class.relfilenode. Notice that relNumber values are assigned by
+ * GetNewRelFileNumber(), which never assigns the same value twice
+ * during the lifetime of a cluster. However, since CREATE DATABASE duplicates
+ * the relfilenumbers of the template database, the values are in practice only
+ * unique within a database, not globally.
*
* Note: spcOid must be GLOBALTABLESPACE_OID if and only if dbOid is
* zero. We support shared relations only in the "global" tablespace.
@@ -75,6 +76,9 @@ typedef struct RelFileLocatorBackend
BackendId backend;
} RelFileLocatorBackend;
+#define SizeOfRelFileLocatorBackend \
+ (offsetof(RelFileLocatorBackend, backend) + sizeof(BackendId))
+
#define RelFileLocatorBackendIsTemp(rlocator) \
((rlocator).backend != InvalidBackendId)
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index 5ede56d..6230fcb 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -2164,9 +2164,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2175,10 +2174,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+---------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | t | own | child 0 index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | t | own | child 1 index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | t | orig | child 0 index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | t | orig | child 1 index
at_partitioned_id_name_key | t | none | parent index
(6 rows)
@@ -2198,9 +2197,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2209,10 +2207,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+--------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | f | own | parent index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | f | own | parent index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | f | new | parent index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | f | new | parent index
at_partitioned_id_name_key | f | none | parent index
(6 rows)
@@ -2556,7 +2554,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/expected/oidjoins.out b/src/test/regress/expected/oidjoins.out
index 215eb89..af57470 100644
--- a/src/test/regress/expected/oidjoins.out
+++ b/src/test/regress/expected/oidjoins.out
@@ -74,11 +74,11 @@ NOTICE: checking pg_type {typcollation} => pg_collation {oid}
NOTICE: checking pg_attribute {attrelid} => pg_class {oid}
NOTICE: checking pg_attribute {atttypid} => pg_type {oid}
NOTICE: checking pg_attribute {attcollation} => pg_collation {oid}
+NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {relnamespace} => pg_namespace {oid}
NOTICE: checking pg_class {reltype} => pg_type {oid}
NOTICE: checking pg_class {reloftype} => pg_type {oid}
NOTICE: checking pg_class {relowner} => pg_authid {oid}
-NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {reltablespace} => pg_tablespace {oid}
NOTICE: checking pg_class {reltoastrelid} => pg_class {oid}
NOTICE: checking pg_class {relrewrite} => pg_class {oid}
diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql
index 52001e3..4190b12 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -1478,9 +1478,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1499,9 +1498,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1638,7 +1636,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
--
1.8.3.1
On Mon, Jul 11, 2022 at 7:39 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
buf_init.c:119:4: error: implicit truncation from 'int' to bit-field
changes value from -1 to 255 [-Werror,-Wbitfield-constant-conversion]
CLEAR_BUFFERTAG(buf->tag);
^~~~~~~~~~~~~~~~~~~~~~~~~
../../../../src/include/storage/buf_internals.h:122:14: note: expanded
from macro 'CLEAR_BUFFERTAG'
(a).forkNum = InvalidForkNumber, \
^ ~~~~~~~~~~~~~~~~~
1 error generated.

Hmm, so now we are using an unsigned int field, so IMHO we can make
InvalidForkNumber 255 instead of -1?
If we're going to do that I think we had better do it as a separate,
preparatory patch.
It also makes me wonder why we're using macros rather than static
inline functions in buf_internals.h. I wonder whether we could do
something like this, for example, and keep InvalidForkNumber as -1:
static inline ForkNumber
BufTagGetForkNum(BufferTag *tagPtr)
{
int8 ret;
StaticAssertStmt(MAX_FORKNUM <= INT8_MAX);
ret = (int8) (tagPtr->relForkDetails[0] >> BUFFERTAG_RELNUMBER_BITS);
return (ForkNumber) ret;
}
Even if we don't use that particular trick, I think we've generally
been moving toward using static inline functions rather than macros,
because it provides better type-safety and the code is often easier to
read. Maybe we should also approach it that way here. Or even commit a
preparatory patch replacing the existing macros with inline functions.
Or maybe it's best to leave it alone, not sure.
It feels like some of the changes to buf_internals.h in 0002 could be
moved into 0001. If we're going to introduce a combined method to set
the relnumber and fork, I think we could do that in 0001 rather than
making 0001 introduce a macro to set just the relfilenumber and then
having 0002 change it around again.
BUFFERTAG_RELNUMBER_BITS feels like a lie. It's defined to be 24, but
based on the name you'd expect it to be 56.
But we are already logging this if we are setting a relfilenumber that
is out of the already-logged range; am I missing something? Check this
change.

+ relnumbercount = relnumber - ShmemVariableCache->nextRelFileNumber;
+ if (ShmemVariableCache->relnumbercount <= relnumbercount)
+ {
+     LogNextRelFileNumber(relnumber + VAR_RELNUMBER_PREFETCH, NULL);
+     ShmemVariableCache->relnumbercount = VAR_RELNUMBER_PREFETCH;
+ }
+ else
+     ShmemVariableCache->relnumbercount -= relnumbercount;
Oh, I guess I missed that.
I had those changes in v7-0003; now I have merged them into 0002. This
adds an assert check while replaying the WAL for smgr create and smgr
truncate, and during the normal path, when allocating a new
relfilenumber, we assert that no file already exists for it.
I think a test-and-elog might be better. Most users won't be running
assert-enabled builds, but this seems worth checking regardless.
I have done some performance tests: with very small values I can see a
lot of wait events for RelFileNumberGen, but with bigger numbers like
256 or 512 it is not really bad. See results at the end of the
mail [1].
It's a little hard to interpret these results because you don't say
how often you were checking the wait events, or how long the
operation took to complete. I suppose we can guess the relative time
scale from the number of Activity events: if there were 190
WalWriterMain events observed, then the time to complete the operation
is probably 190 times how often you were checking the wait events, but
was that every second or every half second or every tenth of a second?
I have done these changes during GetNewRelFileNumber(); this required
tracking the last logged record pointer as well, but I think this
looks clean. With this I can see some reduction in the
RelFileNumberGen wait event [1].
I find the code you wrote here a little bit magical. I believe it
depends heavily on choosing to issue the new WAL record when we've
exhausted exactly 50% of the available space. I suggest having two
constants, one of which is the number of relfilenumber values per WAL
record, and the other of which is the threshold for issuing a new WAL
record. Maybe something like RFN_VALUES_PER_XLOG and
RFN_NEW_XLOG_THRESHOLD, or something. And then write code that works
correctly for any value of RFN_NEW_XLOG_THRESHOLD between 0 (don't log
new RFNs until old allocation is completely exhausted) and
RFN_VALUES_PER_XLOG - 1 (log new RFNs after using just 1 item from the
previous allocation). That way, if in the future someone decides to
change the constant values, they can do that and the code still works.
I am not sure what the best solution is here, but I agree that most
modern hardware will have a bigger sector size than 512, so we can
just change the file size to 1024.
I went looking for previous discussion of this topic. Here's Heikki
doubting whether even 512 is too big:
/messages/by-id/f03d9166-ad12-2a3c-f605-c1873ee86ae4@iki.fi
Here's Thomas saying that he thinks it's probably mostly 4kB these
days, except when it isn't:
/messages/by-id/CAEepm=1e91zMk-vZszCOGDtKd=DhMLQjgENRSxcbSEhxuEPpfA@mail.gmail.com
Here's Tom with another idea how to reduce space usage:
/messages/by-id/7235.1566626302@sss.pgh.pa.us
It doesn't look to me like there's a consensus that some bigger number is safe.
The current value of RELMAPPER_FILEMAGIC is 0x592717. I am not sure
how this version ID is decided: is it some random magic number, or
based on some logic?
Hmm, maybe we're not supposed to bump this value after all. I guess
maybe it's intended strictly as a magic number, rather than as a
version indicator. At least, we've never changed it up until now.
--
Robert Haas
EDB: http://www.enterprisedb.com
Hi,
On 2022-07-07 13:26:29 -0400, Robert Haas wrote:
We're trying to create a system where the relfilenumber counter is
always ahead of all the relfilenumbers used on disk, but the coupling
between the relfilenumber-advancement machinery and the
make-files-on-disk machinery is pretty loose, and so there is a risk
that bugs could escape detection. Whatever we can do to increase the
probability of noticing when things have gone wrong, and/or to notice
it quicker, will be good.
ISTM that we should redefine pg_class_tblspc_relfilenode_index to only cover
relfilenode - afaics there's no real connection to the tablespace
anymore. That'd a) reduce the size of the index b) guarantee uniqueness across
tablespaces.
I don't know where we could fit a sanity check that connects to all databases
and detects duplicates across all the pg_class instances. Perhaps pg_amcheck?
It may be worth changing RelidByRelfilenumber() / its infrastructure to not
use reltablespace anymore.
One thing we could think about doing here is try to stagger the xlog
and the flush. When we've used VAR_RELNUMBER_PREFETCH/2
relfilenumbers, log a record reserving VAR_RELNUMBER_PREFETCH from
where we are now, and remember the LSN. When we've used up our entire
previous allocation, XLogFlush() that record before allowing the
additional values to be used. The bookkeeping would be a bit more
complicated than currently, but I don't think it would be too bad. I'm
not sure how much it would actually help, though, or whether we need
it.
I think that's a very good idea. My concern around doing an XLogFlush() is
that it could lead to a lot of tiny f[data]syncs(), because not much else
needs to be written out. But the scheme you describe would likely lead the
XLogFlush() flushing plenty other WAL writes out, addressing that.
If new relfilenumbers are being used up really quickly, then maybe
the record won't get flushed into the background before we run out of
available numbers anyway, and if they aren't, then maybe it doesn't
matter. On the other hand, even one transaction commit between when
the record is logged and when we run out of the previous allocation is
enough to force a flush, at least with synchronous_commit=on, so maybe
the chances of being able to piggyback on an existing flush are not so
bad after all. I'm not sure.
Even if the record isn't yet flushed out by the time we need to, the
deferred-ness means that there's a good chance more useful records can also be
flushed out at the same time...
I notice that the patch makes no changes to relmapper.c, and I think
that's a problem. Notice in particular:

#define MAX_MAPPINGS 62 /* 62 * 8 + 16 = 512 */
I believe that making RelFileNumber into a 64-bit value will cause the
8 in the calculation above to change to 16, defeating the intention
that the size of the file ought to be the smallest imaginable size of
a disk sector. It does seem like it would have been smart to include a
StaticAssertStmt in this file someplace that checks that the data
structure has the expected size, and now might be a good time, perhaps
in a separate patch, to add one.
+1
Perhaps MAX_MAPPINGS should be at least partially computed instead of doing
the math in a comment? sizeof(RelMapping) could directly be used, and we could
define SIZEOF_RELMAPFILE_START with a StaticAssert() enforcing it to be equal
to offsetof(RelMapFile, mappings), if we move crc & pad to *before* mappings -
afaics that should be entirely doable.
If we do nothing fancy here, the maximum number of mappings will have to be
reduced from 62 to 31, which is a problem because global/pg_filenode.map
currently has 48 entries. We could try to arrange to squeeze padding out of
the RelMapping struct, which would let us use just 12 bytes per mapping,
which would increase the limit to 41, but that's still less than we're using
already, never mind leaving room for future growth.
Ugh.
I don't know what to do about this exactly. I believe it's been
previously suggested that the actual minimum sector size on reasonably
modern hardware is never as small as 512 bytes, so maybe the file size
can just be increased to 1kB or something.
I'm not so sure that's a good idea - while the hardware sector size likely
isn't 512 on much storage anymore, it's still the size that most storage
protocols use. Which then means you need to be confident that you not just
rely on storage atomicity, but also that nothing in the
filesystem <-> block layer <-> driver
stack somehow causes a single larger write to be split up into two.
And if you use a filesystem with a smaller filesystem block size, there might
not even be a choice for the write to be split into two writes. E.g. XFS still
supports 512 byte blocks (although I think it's deprecating block size < 1024).
Maybe the easiest fix here would be to replace the file atomically. Then we
don't need this <= 512 byte stuff. These are done rarely enough that I don't
think the overhead of creating a separate file, fsyncing that, renaming,
fsyncing, would be a problem?
Greetings,
Andres Freund
On Mon, Jul 11, 2022 at 2:57 PM Andres Freund <andres@anarazel.de> wrote:
ISTM that we should redefine pg_class_tblspc_relfilenode_index to only cover
relfilenode - afaics there's no real connection to the tablespace
anymore. That'd a) reduce the size of the index b) guarantee uniqueness across
tablespaces.
Sounds like a good idea.
I don't know where we could fit a sanity check that connects to all databases
and detects duplicates across all the pg_class instances. Perhaps pg_amcheck?
Unless we're going to change the way CREATE DATABASE works, uniqueness
across databases is not guaranteed.
I think that's a very good idea. My concern around doing an XLogFlush() is
that it could lead to a lot of tiny f[data]syncs(), because not much else
needs to be written out. But the scheme you describe would likely lead the
XLogFlush() flushing plenty other WAL writes out, addressing that.
Oh, interesting. I hadn't considered that angle.
Maybe the easiest fix here would be to replace the file atomically. Then we
don't need this <= 512 byte stuff. These are done rarely enough that I don't
think the overhead of creating a separate file, fsyncing that, renaming,
fsyncing, would be a problem?
Anything we can reasonably do to reduce the number of places where
we're relying on things being <= 512 bytes seems like a step in the
right direction to me. It's very difficult to know whether such code
is correct, or what the probability is that crossing the 512-byte
boundary would break anything.
--
Robert Haas
EDB: http://www.enterprisedb.com
Hi,
On 2022-07-11 15:08:57 -0400, Robert Haas wrote:
On Mon, Jul 11, 2022 at 2:57 PM Andres Freund <andres@anarazel.de> wrote:
I don't know where we could fit a sanity check that connects to all databases
and detects duplicates across all the pg_class instances. Perhaps pg_amcheck?

Unless we're going to change the way CREATE DATABASE works, uniqueness
across databases is not guaranteed.
You could likely address that by not flagging conflicts iff oid also matches?
Not sure if worth it, but ...
Maybe the easiest fix here would be to replace the file atomically. Then we
don't need this <= 512 byte stuff. These are done rarely enough that I don't
think the overhead of creating a separate file, fsyncing that, renaming,
fsyncing, would be a problem?

Anything we can reasonably do to reduce the number of places where
we're relying on things being <= 512 bytes seems like a step in the
right direction to me. It's very difficult to know whether such code
is correct, or what the probability is that crossing the 512-byte
boundary would break anything.
Seems pretty simple to do. Have write_relmapper_file() write to a .tmp file
first (likely adding O_TRUNC to flags), use durable_rename() to rename it into
place. The tempfile should probably be written out before the XLogInsert(),
the durable_rename() after, although I think it'd also be correct to more
closely approximate the current sequence.
It's a lot more problematic to do this for the control file, because we can
end up updating that at a high frequency on standbys, due to minRecoveryPoint.
I have wondered about maintaining that in a dedicated file instead, and
perhaps even doing so on a primary.
Greetings,
Andres Freund
On Mon, Jul 11, 2022 at 3:34 PM Andres Freund <andres@anarazel.de> wrote:
Seems pretty simple to do. Have write_relmapper_file() write to a .tmp file
first (likely adding O_TRUNC to flags), use durable_rename() to rename it into
place. The tempfile should probably be written out before the XLogInsert(),
the durable_rename() after, although I think it'd also be correct to more
closely approximate the current sequence.
Something like this?
I chose not to use durable_rename() here, because that allowed me to
do more of the work before starting the critical section, and it's
probably slightly more efficient this way, too. That could be changed,
though, if you really want to stick with durable_rename().
I haven't done anything about actually making the file variable-length
here, either, which I think is what we would want to do. If this seems
more or less all right, I can work on that next.
--
Robert Haas
EDB: http://www.enterprisedb.com
Attachments:
v1-0001-rough-draft-of-removing-relmap-size-restriction.patchapplication/octet-stream; name=v1-0001-rough-draft-of-removing-relmap-size-restriction.patchDownload
From ff0e75ea7ff0e6eb791e6d60333de2c45790b4af Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Mon, 11 Jul 2022 16:01:27 -0400
Subject: [PATCH v1] rough draft of removing relmap size restriction
---
doc/src/sgml/monitoring.sgml | 6 +-
src/backend/utils/activity/wait_event.c | 3 +
src/backend/utils/cache/relmapper.c | 99 +++++++++++++++++--------
src/include/utils/wait_event.h | 1 +
4 files changed, 77 insertions(+), 32 deletions(-)
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 4549c2560e..2ba1c157ac 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -1403,9 +1403,13 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
<entry><literal>RelationMapRead</literal></entry>
<entry>Waiting for a read of the relation map file.</entry>
</row>
+ <row>
+ <entry><literal>RelationMapReplace</literal></entry>
+ <entry>Waiting for durable replacement of a relation map file.</entry>
+ </row>
<row>
<entry><literal>RelationMapSync</literal></entry>
- <entry>Waiting for the relation map file to reach durable storage.</entry>
+ <entry>Waiting for a temporary relation map file to reach durable storage.</entry>
</row>
<row>
<entry><literal>RelationMapWrite</literal></entry>
diff --git a/src/backend/utils/activity/wait_event.c b/src/backend/utils/activity/wait_event.c
index 87c15b9c6f..9f3715fa31 100644
--- a/src/backend/utils/activity/wait_event.c
+++ b/src/backend/utils/activity/wait_event.c
@@ -630,6 +630,9 @@ pgstat_get_wait_io(WaitEventIO w)
case WAIT_EVENT_RELATION_MAP_READ:
event_name = "RelationMapRead";
break;
+ case WAIT_EVENT_RELATION_MAP_REPLACE:
+ event_name = "RelationMapReplace";
+ break;
case WAIT_EVENT_RELATION_MAP_SYNC:
event_name = "RelationMapSync";
break;
diff --git a/src/backend/utils/cache/relmapper.c b/src/backend/utils/cache/relmapper.c
index 8e5595b468..977223fedf 100644
--- a/src/backend/utils/cache/relmapper.c
+++ b/src/backend/utils/cache/relmapper.c
@@ -71,6 +71,7 @@
* worth the trouble given the intended size of the mapping sets.
*/
#define RELMAPPER_FILENAME "pg_filenode.map"
+#define RELMAPPER_TEMP_FILENAME "pg_filenode.map.tmp"
#define RELMAPPER_FILEMAGIC 0x592717 /* version ID value */
@@ -877,6 +878,7 @@ write_relmap_file(RelMapFile *newmap, bool write_wal, bool send_sinval,
{
int fd;
char mapfilename[MAXPGPATH];
+ char maptempfilename[MAXPGPATH];
/*
* Fill in the overhead fields and update CRC.
@@ -890,17 +892,62 @@ write_relmap_file(RelMapFile *newmap, bool write_wal, bool send_sinval,
FIN_CRC32C(newmap->crc);
/*
- * Open the target file. We prefer to do this before entering the
- * critical section, so that an open() failure need not force PANIC.
+ * Construct filenames -- a temporary file that we'll create to write the
+ * data initially, and then the permanent name to which we will rename it.
*/
snprintf(mapfilename, sizeof(mapfilename), "%s/%s",
dbpath, RELMAPPER_FILENAME);
- fd = OpenTransientFile(mapfilename, O_WRONLY | O_CREAT | PG_BINARY);
+ snprintf(maptempfilename, sizeof(maptempfilename), "%s/%s",
+ dbpath, RELMAPPER_TEMP_FILENAME);
+
+ /*
+ * Open a temporary file. If a file already exists with this name, it must
+ * be left over from a previous crash, so we can overwrite it. Concurrent
+ * calls to this function are not allowed.
+ */
+ fd = OpenTransientFile(maptempfilename,
+ O_WRONLY | O_CREAT | O_TRUNC | PG_BINARY);
if (fd < 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not open file \"%s\": %m",
- mapfilename)));
+ maptempfilename)));
+
+ /* Write new data to the file. */
+ pgstat_report_wait_start(WAIT_EVENT_RELATION_MAP_WRITE);
+ if (write(fd, newmap, sizeof(RelMapFile)) != sizeof(RelMapFile))
+ {
+ /* if write didn't set errno, assume problem is no disk space */
+ if (errno == 0)
+ errno = ENOSPC;
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not write file \"%s\": %m",
+ maptempfilename)));
+ }
+ pgstat_report_wait_end();
+
+ /*
+ * Make sure it's durably on disk.
+ *
+ * We must fsync() the parent directory too, to make sure that the new
+ * file can't vanish after a crash.
+ */
+ pgstat_report_wait_start(WAIT_EVENT_RELATION_MAP_SYNC);
+ if (pg_fsync(fd) != 0)
+ ereport(data_sync_elevel(ERROR),
+ (errcode_for_file_access(),
+ errmsg("could not fsync file \"%s\": %m",
+ maptempfilename)));
+ fsync_fname_ext(dbpath, true, false, ERROR);
+ pgstat_report_wait_end();
+
+ /* And close the file. */
+ if (CloseTransientFile(fd) != 0)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not close file \"%s\": %m",
+ maptempfilename)));
if (write_wal)
{
@@ -924,40 +971,30 @@ write_relmap_file(RelMapFile *newmap, bool write_wal, bool send_sinval,
XLogFlush(lsn);
}
- errno = 0;
- pgstat_report_wait_start(WAIT_EVENT_RELATION_MAP_WRITE);
- if (write(fd, newmap, sizeof(RelMapFile)) != sizeof(RelMapFile))
- {
- /* if write didn't set errno, assume problem is no disk space */
- if (errno == 0)
- errno = ENOSPC;
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not write file \"%s\": %m",
- mapfilename)));
- }
- pgstat_report_wait_end();
-
/*
- * We choose to fsync the data to disk before considering the task done.
- * It would be possible to relax this if it turns out to be a performance
- * issue, but it would complicate checkpointing --- see notes for
+ * We could use durable_rename() here and skip the calls to pg_fsync()
+ * and fsync_fname_ext() above, but by doing it this way, we minimize
+ * the amount of work that must be done in the critical section. We first
+ * rename the file, and then fsync it under the new name, and also the
+ * containing directory.
+ *
+ * It is possible that we could avoid waiting for fsync() here as well,
+ * but it would complicate checkpointing --- see notes for
* CheckPointRelationMap.
+ *
+ * NB: Although we instruct fsync_fname_ext() to use ERROR, we will often
+ * be in a critical section at this point; if so, ERROR will become PANIC.
*/
- pgstat_report_wait_start(WAIT_EVENT_RELATION_MAP_SYNC);
- if (pg_fsync(fd) != 0)
+ pgstat_report_wait_start(WAIT_EVENT_RELATION_MAP_REPLACE);
+ if (rename(maptempfilename, mapfilename) < 0)
ereport(data_sync_elevel(ERROR),
(errcode_for_file_access(),
- errmsg("could not fsync file \"%s\": %m",
- mapfilename)));
+ errmsg("could not rename file \"%s\" to \"%s\": %m",
+ maptempfilename, mapfilename)));
+ fsync_fname_ext(mapfilename, false, false, ERROR);
+ fsync_fname_ext(dbpath, true, false, ERROR);
pgstat_report_wait_end();
- if (CloseTransientFile(fd) != 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not close file \"%s\": %m",
- mapfilename)));
-
/*
* Now that the file is safely on disk, send sinval message to let other
* backends know to re-read it. We must do this inside the critical
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index b578e2ec75..f570c6b7f3 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -193,6 +193,7 @@ typedef enum
WAIT_EVENT_LOGICAL_REWRITE_TRUNCATE,
WAIT_EVENT_LOGICAL_REWRITE_WRITE,
WAIT_EVENT_RELATION_MAP_READ,
+ WAIT_EVENT_RELATION_MAP_REPLACE,
WAIT_EVENT_RELATION_MAP_SYNC,
WAIT_EVENT_RELATION_MAP_WRITE,
WAIT_EVENT_REORDER_BUFFER_READ,
--
2.24.3 (Apple Git-128)
On 2022-07-11 16:11:53 -0400, Robert Haas wrote:
On Mon, Jul 11, 2022 at 3:34 PM Andres Freund <andres@anarazel.de> wrote:
Seems pretty simple to do. Have write_relmapper_file() write to a .tmp file
first (likely adding O_TRUNC to flags), use durable_rename() to rename it into
place. The tempfile should probably be written out before the XLogInsert(),
the durable_rename() after, although I think it'd also be correct to more
closely approximate the current sequence.
Something like this?
Yea. I've not looked carefully, but on a quick skim it looks good.
I chose not to use durable_rename() here, because that allowed me to
do more of the work before starting the critical section, and it's
probably slightly more efficient this way, too. That could be changed,
though, if you really want to stick with durable_rename().
I guess I'm not enthused about duplicating the necessary knowledge in ever more
places. We've forgotten one of the magic incantations in the past, and needing
to find all the places that need to be patched is a bit bothersome.
Perhaps we could extract helpers out of durable_rename()?
OTOH, I don't really see what we gain by keeping things out of the critical
section? It does seem good to have the temp-file creation/truncation and write
separately, but after that I don't think it's worth much to avoid a
PANIC. What legitimate issue does it avoid?
Greetings,
Andres Freund
On Mon, Jul 11, 2022 at 7:22 PM Andres Freund <andres@anarazel.de> wrote:
I guess I'm not enthused about duplicating the necessary knowledge in ever more
places. We've forgotten one of the magic incantations in the past, and needing
to find all the places that need to be patched is a bit bothersome.
Perhaps we could extract helpers out of durable_rename()?
OTOH, I don't really see what we gain by keeping things out of the critical
section? It does seem good to have the temp-file creation/truncation and write
separately, but after that I don't think it's worth much to avoid a
PANIC. What legitimate issue does it avoid?
OK, so then I think we should just use durable_rename(). Here's a
patch that does it that way. I briefly considered the idea of
extracting helpers, but it doesn't seem worthwhile to me. There's not
that much code in durable_rename() in the first place.
In this version, I also removed the struct padding, changed the limit
on the number of entries to a nice round 64, and made some comment
updates. I considered trying to go further and actually make the file
variable-size, so that we never again need to worry about the limit on
the number of entries, but I don't actually think that's a good idea.
It would require substantially more changes to the code in this file,
and that means there's more risk of introducing bugs, and I don't see
that there's much value anyway, because if we ever do hit the current
limit, we can just raise the limit.
If we were going to split up durable_rename(), the only intelligible
split I can see would be to have a second version of the function, or
a flag to the existing function, that caters to the situation where
the old file is already known to have been fsync()'d.
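For reference, the durable-replace sequence being debated (fsync the temporary file, rename() it over the final name, then fsync the containing directory) can be sketched in plain POSIX C. This is a simplified stand-in for PostgreSQL's durable_rename(), not the actual fd.c code; the function name and the collapsed error handling are illustrative only.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Sketch of the durable-replace pattern discussed above: fsync the new
 * file, rename() it over the final name, then fsync the containing
 * directory so the rename itself survives a crash.  Simplified stand-in
 * for PostgreSQL's durable_rename(); all errors collapse to -1 here.
 */
int
durable_replace(const char *tmppath, const char *finalpath, const char *dirpath)
{
	int			fd;

	/* Make sure the new file's contents are on disk first. */
	fd = open(tmppath, O_RDWR);
	if (fd < 0)
		return -1;
	if (fsync(fd) != 0)
	{
		close(fd);
		return -1;
	}
	close(fd);

	/* Atomically replace the old file with the new one. */
	if (rename(tmppath, finalpath) != 0)
		return -1;

	/* Persist the directory entry change as well. */
	fd = open(dirpath, O_RDONLY);
	if (fd < 0)
		return -1;
	if (fsync(fd) != 0)
	{
		close(fd);
		return -1;
	}
	close(fd);
	return 0;
}
```

A caller following the sequence Andres describes would write and fsync the temporary file, emit and flush the WAL record, and only then perform the rename and directory fsync.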
--
Robert Haas
EDB: http://www.enterprisedb.com
Attachments:
v2-0001-Remove-the-restriction-that-the-relmap-must-be-51.patch (application/octet-stream)
From 64f65a23edf0133585c919ebeb7fad3b2950b45c Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Tue, 12 Jul 2022 09:04:44 -0400
Subject: [PATCH v2] Remove the restriction that the relmap must be 512 bytes.
Instead of relying on the ability to atomically overwrite the
entire relmap file in one shot, write a new one and durably
rename it into place. Remove the struct padding and the
calculation showing why the map is exactly 512 bytes, and change
the maximum number of entries to a nearby round number.
Patch by me, reviewed by Andres Freund.
Discussion: http://postgr.es/m/CA+TgmoacMgLv_0edhN=oWjnUvJyFjXww4Q4re4kfm+qkSBtjaQ@mail.gmail.com
---
doc/src/sgml/monitoring.sgml | 4 +-
src/backend/utils/activity/wait_event.c | 4 +-
src/backend/utils/cache/relmapper.c | 94 ++++++++++++++-----------
src/include/utils/wait_event.h | 2 +-
4 files changed, 58 insertions(+), 46 deletions(-)
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 4549c2560e..3c611a175f 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -1404,8 +1404,8 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
<entry>Waiting for a read of the relation map file.</entry>
</row>
<row>
- <entry><literal>RelationMapSync</literal></entry>
- <entry>Waiting for the relation map file to reach durable storage.</entry>
+ <entry><literal>RelationMapRename</literal></entry>
+ <entry>Waiting for durable replacement of a relation map file.</entry>
</row>
<row>
<entry><literal>RelationMapWrite</literal></entry>
diff --git a/src/backend/utils/activity/wait_event.c b/src/backend/utils/activity/wait_event.c
index 87c15b9c6f..b5d4841d1e 100644
--- a/src/backend/utils/activity/wait_event.c
+++ b/src/backend/utils/activity/wait_event.c
@@ -630,8 +630,8 @@ pgstat_get_wait_io(WaitEventIO w)
case WAIT_EVENT_RELATION_MAP_READ:
event_name = "RelationMapRead";
break;
- case WAIT_EVENT_RELATION_MAP_SYNC:
- event_name = "RelationMapSync";
+ case WAIT_EVENT_RELATION_MAP_RENAME:
+ event_name = "RelationMapRename";
break;
case WAIT_EVENT_RELATION_MAP_WRITE:
event_name = "RelationMapWrite";
diff --git a/src/backend/utils/cache/relmapper.c b/src/backend/utils/cache/relmapper.c
index 8e5595b468..17ce5f3946 100644
--- a/src/backend/utils/cache/relmapper.c
+++ b/src/backend/utils/cache/relmapper.c
@@ -60,21 +60,26 @@
/*
* The map file is critical data: we have no automatic method for recovering
* from loss or corruption of it. We use a CRC so that we can detect
- * corruption. To minimize the risk of failed updates, the map file should
- * be kept to no more than one standard-size disk sector (ie 512 bytes),
- * and we use overwrite-in-place rather than playing renaming games.
- * The struct layout below is designed to occupy exactly 512 bytes, which
- * might make filesystem updates a bit more efficient.
+ * corruption. Since the file might be more than one standard-size disk
+ * sector in size, we cannot rely on overwrite-in-place. Instead, we generate
+ * a new file and rename it into place, atomically replacing the original file.
*
* Entries in the mappings[] array are in no particular order. We could
* speed searching by insisting on OID order, but it really shouldn't be
* worth the trouble given the intended size of the mapping sets.
*/
#define RELMAPPER_FILENAME "pg_filenode.map"
+#define RELMAPPER_TEMP_FILENAME "pg_filenode.map.tmp"
#define RELMAPPER_FILEMAGIC 0x592717 /* version ID value */
-#define MAX_MAPPINGS 62 /* 62 * 8 + 16 = 512 */
+/*
+ * There's no need for this constant to have any particular value, and we
+ * can raise it as necessary if we end up with more mapped relations. For
+ * now, we just pick a round number that is modestly larger than the expected
+ * number of mappings.
+ */
+#define MAX_MAPPINGS 64
typedef struct RelMapping
{
@@ -88,7 +93,6 @@ typedef struct RelMapFile
int32 num_mappings; /* number of valid RelMapping entries */
RelMapping mappings[MAX_MAPPINGS];
pg_crc32c crc; /* CRC of all above */
- int32 pad; /* to make the struct size be 512 exactly */
} RelMapFile;
/*
@@ -877,6 +881,7 @@ write_relmap_file(RelMapFile *newmap, bool write_wal, bool send_sinval,
{
int fd;
char mapfilename[MAXPGPATH];
+ char maptempfilename[MAXPGPATH];
/*
* Fill in the overhead fields and update CRC.
@@ -890,17 +895,47 @@ write_relmap_file(RelMapFile *newmap, bool write_wal, bool send_sinval,
FIN_CRC32C(newmap->crc);
/*
- * Open the target file. We prefer to do this before entering the
- * critical section, so that an open() failure need not force PANIC.
+ * Construct filenames -- a temporary file that we'll create to write the
+ * data initially, and then the permanent name to which we will rename it.
*/
snprintf(mapfilename, sizeof(mapfilename), "%s/%s",
dbpath, RELMAPPER_FILENAME);
- fd = OpenTransientFile(mapfilename, O_WRONLY | O_CREAT | PG_BINARY);
+ snprintf(maptempfilename, sizeof(maptempfilename), "%s/%s",
+ dbpath, RELMAPPER_TEMP_FILENAME);
+
+ /*
+ * Open a temporary file. If a file already exists with this name, it must
+ * be left over from a previous crash, so we can overwrite it. Concurrent
+ * calls to this function are not allowed.
+ */
+ fd = OpenTransientFile(maptempfilename,
+ O_WRONLY | O_CREAT | O_TRUNC | PG_BINARY);
if (fd < 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not open file \"%s\": %m",
- mapfilename)));
+ maptempfilename)));
+
+ /* Write new data to the file. */
+ pgstat_report_wait_start(WAIT_EVENT_RELATION_MAP_WRITE);
+ if (write(fd, newmap, sizeof(RelMapFile)) != sizeof(RelMapFile))
+ {
+ /* if write didn't set errno, assume problem is no disk space */
+ if (errno == 0)
+ errno = ENOSPC;
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not write file \"%s\": %m",
+ maptempfilename)));
+ }
+ pgstat_report_wait_end();
+
+ /* And close the file. */
+ if (CloseTransientFile(fd) != 0)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not close file \"%s\": %m",
+ maptempfilename)));
if (write_wal)
{
@@ -924,40 +959,17 @@ write_relmap_file(RelMapFile *newmap, bool write_wal, bool send_sinval,
XLogFlush(lsn);
}
- errno = 0;
- pgstat_report_wait_start(WAIT_EVENT_RELATION_MAP_WRITE);
- if (write(fd, newmap, sizeof(RelMapFile)) != sizeof(RelMapFile))
- {
- /* if write didn't set errno, assume problem is no disk space */
- if (errno == 0)
- errno = ENOSPC;
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not write file \"%s\": %m",
- mapfilename)));
- }
- pgstat_report_wait_end();
-
/*
- * We choose to fsync the data to disk before considering the task done.
- * It would be possible to relax this if it turns out to be a performance
- * issue, but it would complicate checkpointing --- see notes for
- * CheckPointRelationMap.
+ * durable_rename() does all the hard work of making sure that we rename
+ * the temporary file into place in a crash-safe manner.
+ *
+ * NB: Although we instruct durable_rename() to use ERROR, we will often
+ * be in a critical section at this point; if so, ERROR will become PANIC.
*/
- pgstat_report_wait_start(WAIT_EVENT_RELATION_MAP_SYNC);
- if (pg_fsync(fd) != 0)
- ereport(data_sync_elevel(ERROR),
- (errcode_for_file_access(),
- errmsg("could not fsync file \"%s\": %m",
- mapfilename)));
+ pgstat_report_wait_start(WAIT_EVENT_RELATION_MAP_RENAME);
+ durable_rename(maptempfilename, mapfilename, ERROR);
pgstat_report_wait_end();
- if (CloseTransientFile(fd) != 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not close file \"%s\": %m",
- mapfilename)));
-
/*
* Now that the file is safely on disk, send sinval message to let other
* backends know to re-read it. We must do this inside the critical
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index b578e2ec75..5d3775ccde 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -193,7 +193,7 @@ typedef enum
WAIT_EVENT_LOGICAL_REWRITE_TRUNCATE,
WAIT_EVENT_LOGICAL_REWRITE_WRITE,
WAIT_EVENT_RELATION_MAP_READ,
- WAIT_EVENT_RELATION_MAP_SYNC,
+ WAIT_EVENT_RELATION_MAP_RENAME,
WAIT_EVENT_RELATION_MAP_WRITE,
WAIT_EVENT_REORDER_BUFFER_READ,
WAIT_EVENT_REORDER_BUFFER_WRITE,
--
2.24.3 (Apple Git-128)
On Mon, Jul 11, 2022 at 9:49 PM Robert Haas <robertmhaas@gmail.com> wrote:
It also makes me wonder why we're using macros rather than static
inline functions in buf_internals.h. I wonder whether we could do
something like this, for example, and keep InvalidForkNumber as -1:
static inline ForkNumber
BufTagGetForkNum(BufferTag *tagPtr)
{
int8 ret;

StaticAssertStmt(MAX_FORKNUM <= INT8_MAX);
ret = (int8) (tagPtr->relForkDetails[0] >> BUFFERTAG_RELNUMBER_BITS);
return (ForkNumber) ret;
}
Even if we don't use that particular trick, I think we've generally
been moving toward using static inline functions rather than macros,
because it provides better type-safety and the code is often easier to
read. Maybe we should also approach it that way here. Or even commit a
preparatory patch replacing the existing macros with inline functions.
Or maybe it's best to leave it alone, not sure.
I think it makes sense to convert the existing macros as well; I have
attached a patch for the same.
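The type-safety argument for static inline functions can be seen in a toy example. The names below are hypothetical, not the actual buf_internals.h definitions: with a macro the expression is substituted textually and nothing checks the argument or result types, while the inline function fixes both.

```c
#include <assert.h>

/*
 * Toy illustration (hypothetical names, not the actual buf_internals.h
 * code) of the macro-vs-inline-function tradeoff discussed above.
 */
typedef enum ForkNumber
{
	InvalidForkNumber = -1,
	MAIN_FORKNUM = 0,
	FSM_FORKNUM,
	VISIBILITYMAP_FORKNUM,
	INIT_FORKNUM
} ForkNumber;

/* Macro version: no type checking; misuse fails only at run time, if ever. */
#define FORKNUM_FROM_BITS_MACRO(bits) ((bits) >> 4)

/*
 * Inline version: the argument must be an integer and the result is a
 * ForkNumber, so the compiler can catch misuse at the call site.
 */
static inline ForkNumber
forknum_from_bits(unsigned int bits)
{
	return (ForkNumber) (bits >> 4);
}
```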
I had those changes in v7-0003; now I have merged them with 0002. This has
an assert check while replaying the WAL for smgr create and smgr
truncate, and in the normal path, when allocating a new
relfilenumber, we assert that no file already exists for it.
I think a test-and-elog might be better. Most users won't be running
assert-enabled builds, but this seems worth checking regardless.
IMHO the recovery-time asserts we can convert to elog, but the one we do
after each GetNewRelFileNumber() is better kept as an assert, since it
involves file access and so can be costly?
I have done some performance tests, with very small values I can see a
lot of wait events for RelFileNumberGen but with bigger numbers like
256 or 512 it is not really bad. See results at the end of the
mail [1].
It's a little hard to interpret these results because you don't say
how often you were checking the wait events, or how long the
operation took to complete. I suppose we can guess the relative time
scale from the number of Activity events: if there were 190
WalWriterMain events observed, then the time to complete the operation
is probably 190 times how often you were checking the wait events, but
was that every second or every half second or every tenth of a second?
I am executing it after every 0.5 sec using below script in psql
\t
select wait_event_type, wait_event from pg_stat_activity where pid !=
pg_backend_pid()
\watch 0.5
And running test for 60 sec
./pgbench -c 32 -j 32 -T 60 -f create_script.sql -p 54321 postgres
$ cat create_script.sql
select create_table(100);
// function body 'create_table'
CREATE OR REPLACE FUNCTION create_table(count int) RETURNS void AS $$
DECLARE
relname varchar;
pid int;
i int;
BEGIN
SELECT pg_backend_pid() INTO pid;
relname := 'test_' || pid;
FOR i IN 1..count LOOP
EXECUTE format('CREATE TABLE %s(a int)', relname);
EXECUTE format('DROP TABLE %s', relname);
END LOOP;
END;
$$ LANGUAGE plpgsql;
I have done these changes during GetNewRelFileNumber(); this required
tracking the last logged record pointer as well, but I think this looks
clean. With this I can see some reduction in the RelFileNumberGen wait
event [1].
I find the code you wrote here a little bit magical. I believe it
depends heavily on choosing to issue the new WAL record when we've
exhausted exactly 50% of the available space. I suggest having two
constants, one of which is the number of relfilenumber values per WAL
record, and the other of which is the threshold for issuing a new WAL
record. Maybe something like RFN_VALUES_PER_XLOG and
RFN_NEW_XLOG_THRESHOLD, or something. And then work code that works
correctly for any value of RFN_NEW_XLOG_THRESHOLD between 0 (don't log
new RFNs until old allocation is completely exhausted) and
RFN_VALUES_PER_XLOG - 1 (log new RFNs after using just 1 item from the
previous allocation). That way, if in the future someone decides to
change the constant values, they can do that and the code still works.
ok
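Robert's two-constant scheme can be sketched as follows. The constant names come from his message; the state struct, its initialization, and the logging stub are hypothetical, and the real patch would emit an XLOG record where the stub merely advances the logged horizon. The idea is to log a new record once the remaining pre-logged values drop to the threshold, so any threshold between 0 and RFN_VALUES_PER_XLOG - 1 works.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Values made available per WAL record, and the remaining-count at which
 * we log the next batch.  Any threshold in [0, RFN_VALUES_PER_XLOG - 1]
 * must work.
 */
#define RFN_VALUES_PER_XLOG		512
#define RFN_NEW_XLOG_THRESHOLD	256

/* Hypothetical in-memory state; not the actual patch's data structure. */
typedef struct RelFileNumberState
{
	uint64_t	nextRelFileNumber;		/* next value to hand out */
	uint64_t	loggedRelFileNumber;	/* first value NOT covered by WAL */
} RelFileNumberState;

/* Stand-in for emitting an XLOG_NEXT_RELFILENUMBER record. */
static void
log_next_relfilenumber(RelFileNumberState *state)
{
	state->loggedRelFileNumber = state->nextRelFileNumber + RFN_VALUES_PER_XLOG;
}

uint64_t
get_new_relfilenumber(RelFileNumberState *state)
{
	uint64_t	result = state->nextRelFileNumber++;

	/*
	 * Log a new batch once the pre-logged headroom drops to the
	 * threshold, so we never hand out a value >= loggedRelFileNumber.
	 */
	if (state->loggedRelFileNumber - state->nextRelFileNumber <=
		RFN_NEW_XLOG_THRESHOLD)
		log_next_relfilenumber(state);

	return result;
}
```

With threshold 0 this logs only when the previous allocation is fully exhausted; with RFN_VALUES_PER_XLOG - 1 it logs after a single value is consumed, matching the two endpoints Robert describes.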
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachments:
v1-0001-Convert-buf_internal.h-macros-to-static-inline-fu.patch (text/x-patch)
From 553a6b185ee7270bf206387421e50b48c7a8b97b Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Tue, 12 Jul 2022 17:10:04 +0530
Subject: [PATCH v1] Convert buf_internal.h macros to static inline functions
Readability wise inline functions are better compared to macros and this
will also help to write cleaner and readable code for 64-bit relfilenode
because as part of that patch we will have to do some complex bitwise
operation so doing that inside the inline function will be cleaner.
---
src/backend/storage/buffer/buf_init.c | 2 +-
src/backend/storage/buffer/bufmgr.c | 16 ++--
src/backend/storage/buffer/localbuf.c | 12 +--
src/include/storage/buf_internals.h | 156 +++++++++++++++++++++-------------
4 files changed, 111 insertions(+), 75 deletions(-)
diff --git a/src/backend/storage/buffer/buf_init.c b/src/backend/storage/buffer/buf_init.c
index 2862e9e..55f646d 100644
--- a/src/backend/storage/buffer/buf_init.c
+++ b/src/backend/storage/buffer/buf_init.c
@@ -116,7 +116,7 @@ InitBufferPool(void)
{
BufferDesc *buf = GetBufferDescriptor(i);
- CLEAR_BUFFERTAG(buf->tag);
+ CLEAR_BUFFERTAG(&buf->tag);
pg_atomic_init_u32(&buf->state, 0);
buf->wait_backend_pgprocno = INVALID_PGPROCNO;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index e257ae2..0ec5891 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -515,7 +515,7 @@ PrefetchSharedBuffer(SMgrRelation smgr_reln,
Assert(BlockNumberIsValid(blockNum));
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr_reln->smgr_rlocator.locator,
+ INIT_BUFFERTAG(&newTag, smgr_reln->smgr_rlocator.locator,
forkNum, blockNum);
/* determine its hash code and partition lock ID */
@@ -632,7 +632,7 @@ ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockN
ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
ReservePrivateRefCountEntry();
- INIT_BUFFERTAG(tag, rlocator, forkNum, blockNum);
+ INIT_BUFFERTAG(&tag, rlocator, forkNum, blockNum);
if (BufferIsLocal(recent_buffer))
{
@@ -640,7 +640,7 @@ ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockN
buf_state = pg_atomic_read_u32(&bufHdr->state);
/* Is it still valid and holding the right tag? */
- if ((buf_state & BM_VALID) && BUFFERTAGS_EQUAL(tag, bufHdr->tag))
+ if ((buf_state & BM_VALID) && BUFFERTAGS_EQUAL(&tag, &bufHdr->tag))
{
/* Bump local buffer's ref and usage counts. */
ResourceOwnerRememberBuffer(CurrentResourceOwner, recent_buffer);
@@ -669,7 +669,7 @@ ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockN
else
buf_state = LockBufHdr(bufHdr);
- if ((buf_state & BM_VALID) && BUFFERTAGS_EQUAL(tag, bufHdr->tag))
+ if ((buf_state & BM_VALID) && BUFFERTAGS_EQUAL(&tag, &bufHdr->tag))
{
/*
* It's now safe to pin the buffer. We can't pin first and ask
@@ -1124,7 +1124,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
+ INIT_BUFFERTAG(&newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
/* determine its hash code and partition lock ID */
newHash = BufTableHashCode(&newTag);
@@ -1507,7 +1507,7 @@ retry:
buf_state = LockBufHdr(buf);
/* If it's changed while we were waiting for lock, do nothing */
- if (!BUFFERTAGS_EQUAL(buf->tag, oldTag))
+ if (!BUFFERTAGS_EQUAL(&buf->tag, &oldTag))
{
UnlockBufHdr(buf, buf_state);
LWLockRelease(oldPartitionLock);
@@ -1539,7 +1539,7 @@ retry:
* linear scans of the buffer array don't think the buffer is valid.
*/
oldFlags = buf_state & BUF_FLAG_MASK;
- CLEAR_BUFFERTAG(buf->tag);
+ CLEAR_BUFFERTAG(&buf->tag);
buf_state &= ~(BUF_FLAG_MASK | BUF_USAGECOUNT_MASK);
UnlockBufHdr(buf, buf_state);
@@ -3356,7 +3356,7 @@ FindAndDropRelFileLocatorBuffers(RelFileLocator rlocator, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(bufTag, rlocator, forkNum, curBlock);
+ INIT_BUFFERTAG(&bufTag, rlocator, forkNum, curBlock);
/* determine its hash code and partition lock ID */
bufHash = BufTableHashCode(&bufTag);
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index 41a0807..6618ef1 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -68,7 +68,7 @@ PrefetchLocalBuffer(SMgrRelation smgr, ForkNumber forkNum,
BufferTag newTag; /* identity of requested block */
LocalBufferLookupEnt *hresult;
- INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
+ INIT_BUFFERTAG(&newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -117,7 +117,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
bool found;
uint32 buf_state;
- INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
+ INIT_BUFFERTAG(&newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -131,7 +131,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
{
b = hresult->id;
bufHdr = GetLocalBufferDescriptor(b);
- Assert(BUFFERTAGS_EQUAL(bufHdr->tag, newTag));
+ Assert(BUFFERTAGS_EQUAL(&bufHdr->tag, &newTag));
#ifdef LBDEBUG
fprintf(stderr, "LB ALLOC (%u,%d,%d) %d\n",
smgr->smgr_rlocator.locator.relNumber, forkNum, blockNum, -b - 1);
@@ -253,7 +253,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
if (!hresult) /* shouldn't happen */
elog(ERROR, "local buffer hash table corrupted");
/* mark buffer invalid just in case hash insert fails */
- CLEAR_BUFFERTAG(bufHdr->tag);
+ CLEAR_BUFFERTAG(&bufHdr->tag);
buf_state &= ~(BM_VALID | BM_TAG_VALID);
pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
}
@@ -354,7 +354,7 @@ DropRelFileLocatorLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
if (!hresult) /* shouldn't happen */
elog(ERROR, "local buffer hash table corrupted");
/* Mark buffer invalid */
- CLEAR_BUFFERTAG(bufHdr->tag);
+ CLEAR_BUFFERTAG(&bufHdr->tag);
buf_state &= ~BUF_FLAG_MASK;
buf_state &= ~BUF_USAGECOUNT_MASK;
pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
@@ -398,7 +398,7 @@ DropRelFileLocatorAllLocalBuffers(RelFileLocator rlocator)
if (!hresult) /* shouldn't happen */
elog(ERROR, "local buffer hash table corrupted");
/* Mark buffer invalid */
- CLEAR_BUFFERTAG(bufHdr->tag);
+ CLEAR_BUFFERTAG(&bufHdr->tag);
buf_state &= ~BUF_FLAG_MASK;
buf_state &= ~BUF_USAGECOUNT_MASK;
pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index aded5e8..d8ddb54 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -95,28 +95,32 @@ typedef struct buftag
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
-#define CLEAR_BUFFERTAG(a) \
-( \
- (a).rlocator.spcOid = InvalidOid, \
- (a).rlocator.dbOid = InvalidOid, \
- (a).rlocator.relNumber = InvalidRelFileNumber, \
- (a).forkNum = InvalidForkNumber, \
- (a).blockNum = InvalidBlockNumber \
-)
-
-#define INIT_BUFFERTAG(a,xx_rlocator,xx_forkNum,xx_blockNum) \
-( \
- (a).rlocator = (xx_rlocator), \
- (a).forkNum = (xx_forkNum), \
- (a).blockNum = (xx_blockNum) \
-)
-
-#define BUFFERTAGS_EQUAL(a,b) \
-( \
- RelFileLocatorEquals((a).rlocator, (b).rlocator) && \
- (a).blockNum == (b).blockNum && \
- (a).forkNum == (b).forkNum \
-)
+static inline void
+CLEAR_BUFFERTAG(BufferTag *tag)
+{
+ tag->rlocator.spcOid = InvalidOid;
+ tag->rlocator.dbOid = InvalidOid;
+ tag->rlocator.relNumber = InvalidRelFileNumber;
+ tag->forkNum = InvalidForkNumber;
+ tag->blockNum = InvalidBlockNumber;
+}
+
+static inline void
+INIT_BUFFERTAG(BufferTag *tag, RelFileLocator rlocator,
+ ForkNumber forkNum, BlockNumber blockNum)
+{
+ tag->rlocator = rlocator;
+ tag->forkNum = forkNum;
+ tag->blockNum = blockNum;
+}
+
+static inline bool
+BUFFERTAGS_EQUAL(BufferTag *tag1, BufferTag *tag2)
+{
+ return RelFileLocatorEquals(tag1->rlocator, tag2->rlocator) &&
+ (tag1->blockNum == tag2->blockNum) &&
+ (tag1->forkNum == tag2->forkNum);
+}
/*
* The shared buffer mapping table is partitioned to reduce contention.
@@ -124,13 +128,24 @@ typedef struct buftag
* hash code with BufTableHashCode(), then apply BufMappingPartitionLock().
* NB: NUM_BUFFER_PARTITIONS must be a power of 2!
*/
-#define BufTableHashPartition(hashcode) \
- ((hashcode) % NUM_BUFFER_PARTITIONS)
-#define BufMappingPartitionLock(hashcode) \
- (&MainLWLockArray[BUFFER_MAPPING_LWLOCK_OFFSET + \
- BufTableHashPartition(hashcode)].lock)
-#define BufMappingPartitionLockByIndex(i) \
- (&MainLWLockArray[BUFFER_MAPPING_LWLOCK_OFFSET + (i)].lock)
+static inline uint32
+BufTableHashPartition(uint32 hashcode)
+{
+ return hashcode % NUM_BUFFER_PARTITIONS;
+}
+
+static inline LWLock *
+BufMappingPartitionLock(uint32 hashcode)
+{
+ return &MainLWLockArray[BUFFER_MAPPING_LWLOCK_OFFSET +
+ BufTableHashPartition(hashcode)].lock;
+}
+
+static inline LWLock *
+BufMappingPartitionLockByIndex(uint32 index)
+{
+ return &MainLWLockArray[BUFFER_MAPPING_LWLOCK_OFFSET + index].lock;
+}
/*
* BufferDesc -- shared descriptor/state data for a single shared buffer.
@@ -220,37 +235,6 @@ typedef union BufferDescPadded
char pad[BUFFERDESC_PAD_TO_SIZE];
} BufferDescPadded;
-#define GetBufferDescriptor(id) (&BufferDescriptors[(id)].bufferdesc)
-#define GetLocalBufferDescriptor(id) (&LocalBufferDescriptors[(id)])
-
-#define BufferDescriptorGetBuffer(bdesc) ((bdesc)->buf_id + 1)
-
-#define BufferDescriptorGetIOCV(bdesc) \
- (&(BufferIOCVArray[(bdesc)->buf_id]).cv)
-#define BufferDescriptorGetContentLock(bdesc) \
- ((LWLock*) (&(bdesc)->content_lock))
-
-extern PGDLLIMPORT ConditionVariableMinimallyPadded *BufferIOCVArray;
-
-/*
- * The freeNext field is either the index of the next freelist entry,
- * or one of these special values:
- */
-#define FREENEXT_END_OF_LIST (-1)
-#define FREENEXT_NOT_IN_LIST (-2)
-
-/*
- * Functions for acquiring/releasing a shared buffer header's spinlock. Do
- * not apply these to local buffers!
- */
-extern uint32 LockBufHdr(BufferDesc *desc);
-#define UnlockBufHdr(desc, s) \
- do { \
- pg_write_barrier(); \
- pg_atomic_write_u32(&(desc)->state, (s) & (~BM_LOCKED)); \
- } while (0)
-
-
/*
* The PendingWriteback & WritebackContext structure are used to keep
* information about pending flush requests to be issued to the OS.
@@ -276,11 +260,63 @@ typedef struct WritebackContext
/* in buf_init.c */
extern PGDLLIMPORT BufferDescPadded *BufferDescriptors;
+extern PGDLLIMPORT ConditionVariableMinimallyPadded *BufferIOCVArray;
extern PGDLLIMPORT WritebackContext BackendWritebackContext;
/* in localbuf.c */
extern PGDLLIMPORT BufferDesc *LocalBufferDescriptors;
+
+static inline BufferDesc *
+GetBufferDescriptor(uint32 id)
+{
+ return &(BufferDescriptors[id]).bufferdesc;
+}
+
+static inline BufferDesc *
+GetLocalBufferDescriptor(uint32 id)
+{
+ return &LocalBufferDescriptors[id];
+}
+
+static inline Buffer
+BufferDescriptorGetBuffer(BufferDesc *bdesc)
+{
+ return (Buffer) (bdesc->buf_id + 1);
+}
+
+static inline ConditionVariable *
+BufferDescriptorGetIOCV(BufferDesc *bdesc)
+{
+ return &(BufferIOCVArray[bdesc->buf_id]).cv;
+}
+
+static inline LWLock *
+BufferDescriptorGetContentLock(BufferDesc *bdesc)
+{
+ return (LWLock *) (&bdesc->content_lock);
+}
+
+/*
+ * The freeNext field is either the index of the next freelist entry,
+ * or one of these special values:
+ */
+#define FREENEXT_END_OF_LIST (-1)
+#define FREENEXT_NOT_IN_LIST (-2)
+
+/*
+ * Functions for acquiring/releasing a shared buffer header's spinlock. Do
+ * not apply these to local buffers!
+ */
+extern uint32 LockBufHdr(BufferDesc *desc);
+
+static inline void
+UnlockBufHdr(BufferDesc *desc, uint32 buf_state)
+{
+ pg_write_barrier();
+ pg_atomic_write_u32(&desc->state, buf_state & (~BM_LOCKED));
+}
+
/* in bufmgr.c */
/*
--
1.8.3.1
Hi,
On 2022-07-12 09:51:12 -0400, Robert Haas wrote:
On Mon, Jul 11, 2022 at 7:22 PM Andres Freund <andres@anarazel.de> wrote:
I guess I'm not enthused about duplicating the necessary knowledge in ever more
places. We've forgotten one of the magic incantations in the past, and needing
to find all the places that need to be patched is a bit bothersome.
Perhaps we could extract helpers out of durable_rename()?
OTOH, I don't really see what we gain by keeping things out of the critical
section? It does seem good to have the temp-file creation/truncation and write
separately, but after that I don't think it's worth much to avoid a
PANIC. What legitimate issue does it avoid?
OK, so then I think we should just use durable_rename(). Here's a
patch that does it that way. I briefly considered the idea of
extracting helpers, but it doesn't seem worthwhile to me. There's not
that much code in durable_rename() in the first place.
Cool.
In this version, I also removed the struct padding, changed the limit
on the number of entries to a nice round 64, and made some comment
updates.
What does currently happen if we exceed that?
I wonder if we should just reference a new define generated by genbki.pl
documenting the number of relations that need to be tracked. Then we don't
need to maintain this manually going forward.
I considered trying to go further and actually make the file
variable-size, so that we never again need to worry about the limit on
the number of entries, but I don't actually think that's a good idea.
Yea, I don't really see what we'd gain. For this stuff to change we need to
recompile anyway.
If we were going to split up durable_rename(), the only intelligible
split I can see would be to have a second version of the function, or
a flag to the existing function, that caters to the situation where
the old file is already known to have been fsync()'d.
I was thinking of something like durable_rename_prep() that'd fsync the
file/directories under their old names, and then durable_rename_exec() that
actually renames and then fsyncs. But without a clear usecase...
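For readers following along, the durable-rename pattern under discussion can be sketched roughly as follows. This is a simplified illustration under POSIX, not PostgreSQL's actual durable_rename(); the name durable_replace and the minimal error handling are mine:

```c
#include <fcntl.h>
#include <libgen.h>
#include <stdio.h>
#include <unistd.h>

/* Sketch of the durable-rename pattern: write the new contents to a
 * temporary file, fsync it, rename it over the target, then fsync the
 * containing directory so the rename itself survives a crash. Error
 * handling is deliberately minimal compared to the real function. */
static int
durable_replace(const char *tmppath, const char *path,
                const void *data, size_t len)
{
    char    dirbuf[1024];
    int     fd;
    int     dfd;
    int     rc;

    fd = open(tmppath, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    if (write(fd, data, len) != (ssize_t) len || fsync(fd) != 0)
    {
        close(fd);
        return -1;
    }
    close(fd);

    if (rename(tmppath, path) != 0)
        return -1;

    /* fsync the parent directory so the new directory entry is durable */
    snprintf(dirbuf, sizeof(dirbuf), "%s", path);
    dfd = open(dirname(dirbuf), O_RDONLY);
    if (dfd < 0)
        return -1;
    rc = fsync(dfd);
    close(dfd);
    return rc;
}
```

A durable_rename_prep()/durable_rename_exec() split of the kind mentioned above would essentially cut this sequence between the fsync of the temporary file and the rename.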
+ /* Write new data to the file. */
+ pgstat_report_wait_start(WAIT_EVENT_RELATION_MAP_WRITE);
+ if (write(fd, newmap, sizeof(RelMapFile)) != sizeof(RelMapFile))
...
+ pgstat_report_wait_end();
+
Not for this patch, but we eventually should move this sequence into a
wrapper. Perhaps combined with retry handling for short writes, the ENOSPC
stuff and an error message when the write fails. It's a bit insane how many
copies of this we have.
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index b578e2ec75..5d3775ccde 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -193,7 +193,7 @@ typedef enum
 WAIT_EVENT_LOGICAL_REWRITE_TRUNCATE,
 WAIT_EVENT_LOGICAL_REWRITE_WRITE,
 WAIT_EVENT_RELATION_MAP_READ,
- WAIT_EVENT_RELATION_MAP_SYNC,
+ WAIT_EVENT_RELATION_MAP_RENAME,
Very minor nitpick: To me REPLACE would be a bit more accurate than RENAME,
since it includes fsync etc?
Greetings,
Andres Freund
On Tue, Jul 12, 2022 at 1:09 PM Andres Freund <andres@anarazel.de> wrote:
What does currently happen if we exceed that?
elog
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index b578e2ec75..5d3775ccde 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -193,7 +193,7 @@ typedef enum
 WAIT_EVENT_LOGICAL_REWRITE_TRUNCATE,
 WAIT_EVENT_LOGICAL_REWRITE_WRITE,
 WAIT_EVENT_RELATION_MAP_READ,
- WAIT_EVENT_RELATION_MAP_SYNC,
+ WAIT_EVENT_RELATION_MAP_RENAME,
Very minor nitpick: To me REPLACE would be a bit more accurate than RENAME,
since it includes fsync etc?
Sure, I had it that way for a while and changed it at the last minute.
I can change it back.
--
Robert Haas
EDB: http://www.enterprisedb.com
Re: staticAssertStmt(MAX_FORKNUM <= INT8_MAX);
Have you really thought through making ForkNum 8-bit?
For example, this would limit a columnar storage with each column
stored in its own fork (which I'd say is not entirely unreasonable)
to having just about ~250 columns.
And there can easily be other use cases where we do not want to limit
the number of forks so much.
Cheers
Hannu
Hi,
Please don't top-quote, as mentioned a couple of times recently.
On 2022-07-12 23:00:22 +0200, Hannu Krosing wrote:
Re: staticAssertStmt(MAX_FORKNUM <= INT8_MAX);
Have you really thought through making the ForkNum 8-bit ?
MAX_FORKNUM is way lower right now. And hardcoded. So this doesn't imply a new
restriction. As we iterate over 0..MAX_FORKNUM in a bunch of places (with
filesystem access each time), it's not feasible to make that number large.
Greetings,
Andres Freund
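To make the cost Andres describes concrete, several code paths probe every possible fork with a filesystem call, so each increment of MAX_FORKNUM adds a syscall per relation scanned. A simplified sketch of the pattern, with hypothetical path construction rather than PostgreSQL's actual relpath machinery:

```c
#include <stdio.h>
#include <unistd.h>

#define MAX_FORKNUM 3   /* current PostgreSQL value: main, fsm, vm, init */

/* Count which forks of a relation exist on disk. Each candidate fork
 * costs one access() call, which is why iterating 0..MAX_FORKNUM is
 * only feasible while MAX_FORKNUM stays small. */
static int
count_existing_forks(const char *basepath)
{
    static const char *const suffixes[] = {"", "_fsm", "_vm", "_init"};
    char    path[1024];
    int     nforks = 0;

    for (int forknum = 0; forknum <= MAX_FORKNUM; forknum++)
    {
        snprintf(path, sizeof(path), "%s%s", basepath, suffixes[forknum]);
        if (access(path, F_OK) == 0)
            nforks++;
    }
    return nforks;
}
```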
On Tue, Jul 12, 2022 at 6:02 PM Andres Freund <andres@anarazel.de> wrote:
MAX_FORKNUM is way lower right now. And hardcoded. So this doesn't imply a new
restriction. As we iterate over 0..MAX_FORKNUM in a bunch of places (with
filesystem access each time), it's not feasible to make that number large.
Yeah. TBH, what I'd really like to do is kill the entire fork system
with fire and replace it with something more scalable, which would
maybe permit the sort of thing Hannu suggests here. With the current
system, forget it.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Tue, Jul 12, 2022 at 7:21 PM Robert Haas <robertmhaas@gmail.com> wrote:
In this version, I also removed the struct padding, changed the limit
on the number of entries to a nice round 64, and made some comment
updates. I considered trying to go further and actually make the file
variable-size, so that we never again need to worry about the limit on
the number of entries, but I don't actually think that's a good idea.
It would require substantially more changes to the code in this file,
and that means there's more risk of introducing bugs, and I don't see
that there's much value anyway, because if we ever do hit the current
limit, we can just raise the limit.
If we were going to split up durable_rename(),
split I can see would be to have a second version of the function, or
a flag to the existing function, that caters to the situation where
the old file is already known to have been fsync()'d.
The patch looks good except for one minor comment:
+ * corruption. Since the file might be more tha none standard-size disk
+ * sector in size, we cannot rely on overwrite-in-place. Instead, we generate
typo "more tha none" -> "more than one"
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Wed, Jul 13, 2022 at 9:35 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Tue, Jul 12, 2022 at 7:21 PM Robert Haas <robertmhaas@gmail.com> wrote:
In this version, I also removed the struct padding, changed the limit
on the number of entries to a nice round 64, and made some comment
updates. I considered trying to go further and actually make the file
variable-size, so that we never again need to worry about the limit on
the number of entries, but I don't actually think that's a good idea.
It would require substantially more changes to the code in this file,
and that means there's more risk of introducing bugs, and I don't see
that there's much value anyway, because if we ever do hit the current
limit, we can just raise the limit.
If we were going to split up durable_rename()
split I can see would be to have a second version of the function, or
a flag to the existing function, that caters to the situation where
the old file is already known to have been fsync()'d.
The patch looks good except one minor comment
+ * corruption. Since the file might be more tha none standard-size disk
+ * sector in size, we cannot rely on overwrite-in-place. Instead, we generate
typo "more tha none" -> "more than one"
I have fixed this and included this change in the new patch series.
Apart from this I have fixed all the pending issues, which include:
- Change existing macros to inline functions done in 0001.
- Change the pg_class index from (tbspc, relfilenode) to just relfilenode, and
also change RelidByRelfilenumber(). In RelidByRelfilenumber() I have
changed the hash to be keyed on just the relfilenumber, but we
still need to pass the tablespace to identify whether it is a shared
relation or not. If we want we can make that a bool, but I don't think
that is really needed here.
- Changed the logic of GetNewRelFileNumber() based on what Robert
described; instead of tracking the pending logged relnumber count
I am now tracking the last loggedRelNumber, which helps a little in
making SetNextRelFileNumber cleaner, but otherwise it doesn't
make much difference.
- Some new asserts in buf_internals inline functions to validate the values
of computed/input relfilenumbers.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
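As background for the macro-to-inline conversion in 0001: a classic hazard of function-like macros is multiple evaluation of their arguments, which static inline functions avoid while also adding type checking. A generic illustration of the difference (not PostgreSQL code):

```c
/* A function-like macro evaluates its argument at every mention... */
#define SQUARE_MACRO(x) ((x) * (x))

/* ...whereas a static inline evaluates it exactly once and type-checks
 * it, like the buf_internals.h conversions in 0001. */
static inline int
square_inline(int x)
{
    return x * x;
}

/* An argument with a side effect makes the difference observable. */
static int
next_val(int *counter)
{
    return ++(*counter);
}
```

Passing `next_val(&counter)` to SQUARE_MACRO bumps the counter twice; passing it to square_inline bumps it once.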
Attachments:
v9-0001-Convert-buf_internal.h-macros-to-static-inline-fu.patchtext/x-patch; charset=US-ASCII; name=v9-0001-Convert-buf_internal.h-macros-to-static-inline-fu.patchDownload
From 117afa75ece07f300624c7721f790ab467835b29 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Tue, 12 Jul 2022 17:10:04 +0530
Subject: [PATCH v9 1/4] Convert buf_internal.h macros to static inline
functions
Readability-wise, inline functions are better than macros, and this will
also help us write cleaner, more readable code for the 64-bit relfilenode
work: as part of that patch we will have to do some complex bitwise
operations, and doing that inside inline functions will be cleaner.
---
src/backend/storage/buffer/buf_init.c | 2 +-
src/backend/storage/buffer/bufmgr.c | 16 ++--
src/backend/storage/buffer/localbuf.c | 12 +--
src/include/storage/buf_internals.h | 156 +++++++++++++++++++++-------------
4 files changed, 111 insertions(+), 75 deletions(-)
diff --git a/src/backend/storage/buffer/buf_init.c b/src/backend/storage/buffer/buf_init.c
index 2862e9e..55f646d 100644
--- a/src/backend/storage/buffer/buf_init.c
+++ b/src/backend/storage/buffer/buf_init.c
@@ -116,7 +116,7 @@ InitBufferPool(void)
{
BufferDesc *buf = GetBufferDescriptor(i);
- CLEAR_BUFFERTAG(buf->tag);
+ CLEAR_BUFFERTAG(&buf->tag);
pg_atomic_init_u32(&buf->state, 0);
buf->wait_backend_pgprocno = INVALID_PGPROCNO;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index c7d7abc..24d894e 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -515,7 +515,7 @@ PrefetchSharedBuffer(SMgrRelation smgr_reln,
Assert(BlockNumberIsValid(blockNum));
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr_reln->smgr_rlocator.locator,
+ INIT_BUFFERTAG(&newTag, &smgr_reln->smgr_rlocator.locator,
forkNum, blockNum);
/* determine its hash code and partition lock ID */
@@ -632,7 +632,7 @@ ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockN
ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
ReservePrivateRefCountEntry();
- INIT_BUFFERTAG(tag, rlocator, forkNum, blockNum);
+ INIT_BUFFERTAG(&tag, &rlocator, forkNum, blockNum);
if (BufferIsLocal(recent_buffer))
{
@@ -640,7 +640,7 @@ ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockN
buf_state = pg_atomic_read_u32(&bufHdr->state);
/* Is it still valid and holding the right tag? */
- if ((buf_state & BM_VALID) && BUFFERTAGS_EQUAL(tag, bufHdr->tag))
+ if ((buf_state & BM_VALID) && BUFFERTAGS_EQUAL(&tag, &bufHdr->tag))
{
/* Bump local buffer's ref and usage counts. */
ResourceOwnerRememberBuffer(CurrentResourceOwner, recent_buffer);
@@ -669,7 +669,7 @@ ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockN
else
buf_state = LockBufHdr(bufHdr);
- if ((buf_state & BM_VALID) && BUFFERTAGS_EQUAL(tag, bufHdr->tag))
+ if ((buf_state & BM_VALID) && BUFFERTAGS_EQUAL(&tag, &bufHdr->tag))
{
/*
* It's now safe to pin the buffer. We can't pin first and ask
@@ -1124,7 +1124,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
+ INIT_BUFFERTAG(&newTag, &smgr->smgr_rlocator.locator, forkNum, blockNum);
/* determine its hash code and partition lock ID */
newHash = BufTableHashCode(&newTag);
@@ -1507,7 +1507,7 @@ retry:
buf_state = LockBufHdr(buf);
/* If it's changed while we were waiting for lock, do nothing */
- if (!BUFFERTAGS_EQUAL(buf->tag, oldTag))
+ if (!BUFFERTAGS_EQUAL(&buf->tag, &oldTag))
{
UnlockBufHdr(buf, buf_state);
LWLockRelease(oldPartitionLock);
@@ -1539,7 +1539,7 @@ retry:
* linear scans of the buffer array don't think the buffer is valid.
*/
oldFlags = buf_state & BUF_FLAG_MASK;
- CLEAR_BUFFERTAG(buf->tag);
+ CLEAR_BUFFERTAG(&buf->tag);
buf_state &= ~(BUF_FLAG_MASK | BUF_USAGECOUNT_MASK);
UnlockBufHdr(buf, buf_state);
@@ -3355,7 +3355,7 @@ FindAndDropRelationBuffers(RelFileLocator rlocator, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(bufTag, rlocator, forkNum, curBlock);
+ INIT_BUFFERTAG(&bufTag, &rlocator, forkNum, curBlock);
/* determine its hash code and partition lock ID */
bufHash = BufTableHashCode(&bufTag);
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index 9c03885..91e174e 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -68,7 +68,7 @@ PrefetchLocalBuffer(SMgrRelation smgr, ForkNumber forkNum,
BufferTag newTag; /* identity of requested block */
LocalBufferLookupEnt *hresult;
- INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
+ INIT_BUFFERTAG(&newTag, &smgr->smgr_rlocator.locator, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -117,7 +117,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
bool found;
uint32 buf_state;
- INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
+ INIT_BUFFERTAG(&newTag, &smgr->smgr_rlocator.locator, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -131,7 +131,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
{
b = hresult->id;
bufHdr = GetLocalBufferDescriptor(b);
- Assert(BUFFERTAGS_EQUAL(bufHdr->tag, newTag));
+ Assert(BUFFERTAGS_EQUAL(&bufHdr->tag, &newTag));
#ifdef LBDEBUG
fprintf(stderr, "LB ALLOC (%u,%d,%d) %d\n",
smgr->smgr_rlocator.locator.relNumber, forkNum, blockNum, -b - 1);
@@ -253,7 +253,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
if (!hresult) /* shouldn't happen */
elog(ERROR, "local buffer hash table corrupted");
/* mark buffer invalid just in case hash insert fails */
- CLEAR_BUFFERTAG(bufHdr->tag);
+ CLEAR_BUFFERTAG(&bufHdr->tag);
buf_state &= ~(BM_VALID | BM_TAG_VALID);
pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
}
@@ -354,7 +354,7 @@ DropRelationLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
if (!hresult) /* shouldn't happen */
elog(ERROR, "local buffer hash table corrupted");
/* Mark buffer invalid */
- CLEAR_BUFFERTAG(bufHdr->tag);
+ CLEAR_BUFFERTAG(&bufHdr->tag);
buf_state &= ~BUF_FLAG_MASK;
buf_state &= ~BUF_USAGECOUNT_MASK;
pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
@@ -398,7 +398,7 @@ DropRelationAllLocalBuffers(RelFileLocator rlocator)
if (!hresult) /* shouldn't happen */
elog(ERROR, "local buffer hash table corrupted");
/* Mark buffer invalid */
- CLEAR_BUFFERTAG(bufHdr->tag);
+ CLEAR_BUFFERTAG(&bufHdr->tag);
buf_state &= ~BUF_FLAG_MASK;
buf_state &= ~BUF_USAGECOUNT_MASK;
pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 69e4590..2f0e60e 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -95,28 +95,32 @@ typedef struct buftag
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
-#define CLEAR_BUFFERTAG(a) \
-( \
- (a).rlocator.spcOid = InvalidOid, \
- (a).rlocator.dbOid = InvalidOid, \
- (a).rlocator.relNumber = InvalidRelFileNumber, \
- (a).forkNum = InvalidForkNumber, \
- (a).blockNum = InvalidBlockNumber \
-)
-
-#define INIT_BUFFERTAG(a,xx_rlocator,xx_forkNum,xx_blockNum) \
-( \
- (a).rlocator = (xx_rlocator), \
- (a).forkNum = (xx_forkNum), \
- (a).blockNum = (xx_blockNum) \
-)
-
-#define BUFFERTAGS_EQUAL(a,b) \
-( \
- RelFileLocatorEquals((a).rlocator, (b).rlocator) && \
- (a).blockNum == (b).blockNum && \
- (a).forkNum == (b).forkNum \
-)
+static inline void
+CLEAR_BUFFERTAG(BufferTag *tag)
+{
+ tag->rlocator.spcOid = InvalidOid;
+ tag->rlocator.dbOid = InvalidOid;
+ tag->rlocator.relNumber = InvalidRelFileNumber;
+ tag->forkNum = InvalidForkNumber;
+ tag->blockNum = InvalidBlockNumber;
+}
+
+static inline void
+INIT_BUFFERTAG(BufferTag *tag, const RelFileLocator *rlocator,
+ ForkNumber forkNum, BlockNumber blockNum)
+{
+ tag->rlocator = *rlocator;
+ tag->forkNum = forkNum;
+ tag->blockNum = blockNum;
+}
+
+static inline bool
+BUFFERTAGS_EQUAL(const BufferTag *tag1, const BufferTag *tag2)
+{
+ return RelFileLocatorEquals(tag1->rlocator, tag2->rlocator) &&
+ (tag1->blockNum == tag2->blockNum) &&
+ (tag1->forkNum == tag2->forkNum);
+}
/*
* The shared buffer mapping table is partitioned to reduce contention.
@@ -124,13 +128,24 @@ typedef struct buftag
* hash code with BufTableHashCode(), then apply BufMappingPartitionLock().
* NB: NUM_BUFFER_PARTITIONS must be a power of 2!
*/
-#define BufTableHashPartition(hashcode) \
- ((hashcode) % NUM_BUFFER_PARTITIONS)
-#define BufMappingPartitionLock(hashcode) \
- (&MainLWLockArray[BUFFER_MAPPING_LWLOCK_OFFSET + \
- BufTableHashPartition(hashcode)].lock)
-#define BufMappingPartitionLockByIndex(i) \
- (&MainLWLockArray[BUFFER_MAPPING_LWLOCK_OFFSET + (i)].lock)
+static inline uint32
+BufTableHashPartition(uint32 hashcode)
+{
+ return hashcode % NUM_BUFFER_PARTITIONS;
+}
+
+static inline LWLock *
+BufMappingPartitionLock(uint32 hashcode)
+{
+ return &MainLWLockArray[BUFFER_MAPPING_LWLOCK_OFFSET +
+ BufTableHashPartition(hashcode)].lock;
+}
+
+static inline LWLock *
+BufMappingPartitionLockByIndex(uint32 index)
+{
+ return &MainLWLockArray[BUFFER_MAPPING_LWLOCK_OFFSET + index].lock;
+}
/*
* BufferDesc -- shared descriptor/state data for a single shared buffer.
@@ -220,37 +235,6 @@ typedef union BufferDescPadded
char pad[BUFFERDESC_PAD_TO_SIZE];
} BufferDescPadded;
-#define GetBufferDescriptor(id) (&BufferDescriptors[(id)].bufferdesc)
-#define GetLocalBufferDescriptor(id) (&LocalBufferDescriptors[(id)])
-
-#define BufferDescriptorGetBuffer(bdesc) ((bdesc)->buf_id + 1)
-
-#define BufferDescriptorGetIOCV(bdesc) \
- (&(BufferIOCVArray[(bdesc)->buf_id]).cv)
-#define BufferDescriptorGetContentLock(bdesc) \
- ((LWLock*) (&(bdesc)->content_lock))
-
-extern PGDLLIMPORT ConditionVariableMinimallyPadded *BufferIOCVArray;
-
-/*
- * The freeNext field is either the index of the next freelist entry,
- * or one of these special values:
- */
-#define FREENEXT_END_OF_LIST (-1)
-#define FREENEXT_NOT_IN_LIST (-2)
-
-/*
- * Functions for acquiring/releasing a shared buffer header's spinlock. Do
- * not apply these to local buffers!
- */
-extern uint32 LockBufHdr(BufferDesc *desc);
-#define UnlockBufHdr(desc, s) \
- do { \
- pg_write_barrier(); \
- pg_atomic_write_u32(&(desc)->state, (s) & (~BM_LOCKED)); \
- } while (0)
-
-
/*
* The PendingWriteback & WritebackContext structure are used to keep
* information about pending flush requests to be issued to the OS.
@@ -276,11 +260,63 @@ typedef struct WritebackContext
/* in buf_init.c */
extern PGDLLIMPORT BufferDescPadded *BufferDescriptors;
+extern PGDLLIMPORT ConditionVariableMinimallyPadded *BufferIOCVArray;
extern PGDLLIMPORT WritebackContext BackendWritebackContext;
/* in localbuf.c */
extern PGDLLIMPORT BufferDesc *LocalBufferDescriptors;
+
+static inline BufferDesc *
+GetBufferDescriptor(uint32 id)
+{
+ return &(BufferDescriptors[id]).bufferdesc;
+}
+
+static inline BufferDesc *
+GetLocalBufferDescriptor(uint32 id)
+{
+ return &LocalBufferDescriptors[id];
+}
+
+static inline Buffer
+BufferDescriptorGetBuffer(const BufferDesc *bdesc)
+{
+ return (Buffer) (bdesc->buf_id + 1);
+}
+
+static inline ConditionVariable *
+BufferDescriptorGetIOCV(const BufferDesc *bdesc)
+{
+ return &(BufferIOCVArray[bdesc->buf_id]).cv;
+}
+
+static inline LWLock *
+BufferDescriptorGetContentLock(const BufferDesc *bdesc)
+{
+ return (LWLock *) (&bdesc->content_lock);
+}
+
+/*
+ * The freeNext field is either the index of the next freelist entry,
+ * or one of these special values:
+ */
+#define FREENEXT_END_OF_LIST (-1)
+#define FREENEXT_NOT_IN_LIST (-2)
+
+/*
+ * Functions for acquiring/releasing a shared buffer header's spinlock. Do
+ * not apply these to local buffers!
+ */
+extern uint32 LockBufHdr(BufferDesc *desc);
+
+static inline void
+UnlockBufHdr(BufferDesc *desc, uint32 buf_state)
+{
+ pg_write_barrier();
+ pg_atomic_write_u32(&desc->state, buf_state & (~BM_LOCKED));
+}
+
/* in bufmgr.c */
/*
--
1.8.3.1
v9-0002-Preliminary-refactoring-for-supporting-larger-rel.patchtext/x-patch; charset=US-ASCII; name=v9-0002-Preliminary-refactoring-for-supporting-larger-rel.patchDownload
From c7e1f721564724228a47dc952f1fd88038d08457 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Wed, 13 Jul 2022 13:12:53 +0530
Subject: [PATCH v9 2/4] Preliminary refactoring for supporting larger
relfilenumber
Currently, relfilenumber is an Oid and can wrap around, so as part of the
larger patch set we are trying to make it 64 bit to avoid wraparound;
that will also make a couple of other things simpler, as explained in the
next patches.
This is a preliminary refactoring patch toward that goal: in BufferTag,
instead of keeping the RelFileLocator, we keep the tablespace Oid,
database Oid, and the relfilenumber directly, so that once we change
relNumber in RelFileLocator to 64 bits the buffer tag's alignment
padding will not change.
---
contrib/pg_buffercache/pg_buffercache_pages.c | 8 +-
contrib/pg_prewarm/autoprewarm.c | 10 +-
src/backend/storage/buffer/bufmgr.c | 140 +++++++++++++++++---------
src/backend/storage/buffer/localbuf.c | 30 ++++--
src/include/storage/buf_internals.h | 64 ++++++++++--
5 files changed, 178 insertions(+), 74 deletions(-)
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 131bd62..c5754ea 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -153,10 +153,10 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenumber = bufHdr->tag.rlocator.relNumber;
- fctx->record[i].reltablespace = bufHdr->tag.rlocator.spcOid;
- fctx->record[i].reldatabase = bufHdr->tag.rlocator.dbOid;
- fctx->record[i].forknum = bufHdr->tag.forkNum;
+ fctx->record[i].relfilenumber = BufTagGetRelNumber(&bufHdr->tag);
+ fctx->record[i].reltablespace = bufHdr->tag.spcOid;
+ fctx->record[i].reldatabase = bufHdr->tag.dbOid;
+ fctx->record[i].forknum = BufTagGetForkNum(&bufHdr->tag);
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
fctx->record[i].pinning_backends = BUF_STATE_GET_REFCOUNT(buf_state);
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index 13eee4a..d863981 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -631,10 +631,12 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
if (buf_state & BM_TAG_VALID &&
((buf_state & BM_PERMANENT) || dump_unlogged))
{
- block_info_array[num_blocks].database = bufHdr->tag.rlocator.dbOid;
- block_info_array[num_blocks].tablespace = bufHdr->tag.rlocator.spcOid;
- block_info_array[num_blocks].filenumber = bufHdr->tag.rlocator.relNumber;
- block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
+ block_info_array[num_blocks].database = bufHdr->tag.dbOid;
+ block_info_array[num_blocks].tablespace = bufHdr->tag.spcOid;
+ block_info_array[num_blocks].filenumber =
+ BufTagGetRelNumber(&bufHdr->tag);
+ block_info_array[num_blocks].forknum =
+ BufTagGetForkNum(&bufHdr->tag);
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
}
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 24d894e..e6e6ae7 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1647,8 +1647,8 @@ ReleaseAndReadBuffer(Buffer buffer,
{
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
- bufHdr->tag.forkNum == forkNum)
+ BufTagMatchesRelFileLocator(&bufHdr->tag, &relation->rd_locator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
LocalRefCount[-buffer - 1]--;
@@ -1658,8 +1658,8 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetBufferDescriptor(buffer - 1);
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
- bufHdr->tag.forkNum == forkNum)
+ BufTagMatchesRelFileLocator(&bufHdr->tag, &relation->rd_locator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
}
@@ -2000,9 +2000,9 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
- item->tsId = bufHdr->tag.rlocator.spcOid;
- item->relNumber = bufHdr->tag.rlocator.relNumber;
- item->forkNum = bufHdr->tag.forkNum;
+ item->tsId = bufHdr->tag.spcOid;
+ item->relNumber = BufTagGetRelNumber(&bufHdr->tag);
+ item->forkNum = BufTagGetForkNum(&bufHdr->tag);
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2692,6 +2692,7 @@ PrintBufferLeakWarning(Buffer buffer)
char *path;
BackendId backend;
uint32 buf_state;
+ RelFileLocator rlocator;
Assert(BufferIsValid(buffer));
if (BufferIsLocal(buffer))
@@ -2707,8 +2708,10 @@ PrintBufferLeakWarning(Buffer buffer)
backend = InvalidBackendId;
}
+ rlocator = BufTagGetRelFileLocator(&buf->tag);
+
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rlocator, backend, buf->tag.forkNum);
+ path = relpathbackend(rlocator, backend, BufTagGetForkNum(&buf->tag));
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2787,8 +2790,8 @@ BufferGetTag(Buffer buffer, RelFileLocator *rlocator, ForkNumber *forknum,
bufHdr = GetBufferDescriptor(buffer - 1);
/* pinned, so OK to read tag without spinlock */
- *rlocator = bufHdr->tag.rlocator;
- *forknum = bufHdr->tag.forkNum;
+ *rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
+ *forknum = BufTagGetForkNum(&bufHdr->tag);
*blknum = bufHdr->tag.blockNum;
}
@@ -2838,9 +2841,14 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rlocator, InvalidBackendId);
+ {
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&buf->tag);
+ reln = smgropen(rlocator, InvalidBackendId);
+ }
- TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
+ TRACE_POSTGRESQL_BUFFER_FLUSH_START(BufTagGetForkNum(&buf->tag),
buf->tag.blockNum,
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
@@ -2899,7 +2907,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
* bufToWrite is either the shared buffer or a copy, as appropriate.
*/
smgrwrite(reln,
- buf->tag.forkNum,
+ BufTagGetForkNum(&buf->tag),
buf->tag.blockNum,
bufToWrite,
false);
@@ -2920,7 +2928,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
*/
TerminateBufferIO(buf, true, 0);
- TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(buf->tag.forkNum,
+ TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(BufTagGetForkNum(&buf->tag),
buf->tag.blockNum,
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
@@ -3141,15 +3149,15 @@ DropRelationBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
* We could check forkNum and blockNum as well as the rlocator, but
* the incremental win from doing so seems small.
*/
- if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator))
+ if (!BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator.locator))
continue;
buf_state = LockBufHdr(bufHdr);
for (j = 0; j < nforks; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator) &&
- bufHdr->tag.forkNum == forkNum[j] &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator.locator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum[j] &&
bufHdr->tag.blockNum >= firstDelBlock[j])
{
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -3300,7 +3308,7 @@ DropRelationsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
for (j = 0; j < n; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, locators[j]))
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &locators[j]))
{
rlocator = &locators[j];
break;
@@ -3309,7 +3317,10 @@ DropRelationsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
}
else
{
- rlocator = bsearch((const void *) &(bufHdr->tag.rlocator),
+ RelFileLocator locator;
+
+ locator = BufTagGetRelFileLocator(&bufHdr->tag);
+ rlocator = bsearch((const void *) &(locator),
locators, n, sizeof(RelFileLocator),
rlocator_comparator);
}
@@ -3319,7 +3330,7 @@ DropRelationsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, (*rlocator)))
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, rlocator))
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3379,8 +3390,8 @@ FindAndDropRelationBuffers(RelFileLocator rlocator, ForkNumber forkNum,
*/
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
- bufHdr->tag.forkNum == forkNum &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
@@ -3418,11 +3429,11 @@ DropDatabaseBuffers(Oid dbid)
* As in DropRelationBuffers, an unlocked precheck should be
* safe and saves some cycles.
*/
- if (bufHdr->tag.rlocator.dbOid != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rlocator.dbOid == dbid)
+ if (bufHdr->tag.dbOid == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3446,13 +3457,16 @@ PrintBufferDescs(void)
{
BufferDesc *buf = GetBufferDescriptor(i);
Buffer b = BufferDescriptorGetBuffer(buf);
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&buf->tag);
/* theoretically we should lock the bufhdr here */
elog(LOG,
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathbackend(buf->tag.rlocator, InvalidBackendId, buf->tag.forkNum),
+ relpathbackend(rlocator, InvalidBackendId, BufTagGetForkNum(&buf->tag)),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3472,12 +3486,16 @@ PrintPinnedBufs(void)
if (GetPrivateRefCount(b) > 0)
{
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&buf->tag);
+
/* theoretically we should lock the bufhdr here */
elog(LOG,
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathperm(buf->tag.rlocator, buf->tag.forkNum),
+ relpathperm(rlocator, BufTagGetForkNum(&buf->tag)),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3516,7 +3534,7 @@ FlushRelationBuffers(Relation rel)
uint32 buf_state;
bufHdr = GetLocalBufferDescriptor(i);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rel->rd_locator) &&
((buf_state = pg_atomic_read_u32(&bufHdr->state)) &
(BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
@@ -3534,7 +3552,7 @@ FlushRelationBuffers(Relation rel)
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
smgrwrite(RelationGetSmgr(rel),
- bufHdr->tag.forkNum,
+ BufTagGetForkNum(&bufHdr->tag),
bufHdr->tag.blockNum,
localpage,
false);
@@ -3563,13 +3581,13 @@ FlushRelationBuffers(Relation rel)
* As in DropRelationBuffers, an unlocked precheck should be
* safe and saves some cycles.
*/
- if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator))
+ if (!BufTagMatchesRelFileLocator(&bufHdr->tag, &rel->rd_locator))
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rel->rd_locator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3643,7 +3661,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
for (j = 0; j < nrels; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, srels[j].rlocator))
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &srels[j].rlocator))
{
srelent = &srels[j];
break;
@@ -3652,7 +3670,10 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
}
else
{
- srelent = bsearch((const void *) &(bufHdr->tag.rlocator),
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
+ srelent = bsearch((const void *) &(rlocator),
srels, nrels, sizeof(SMgrSortArray),
rlocator_comparator);
}
@@ -3664,7 +3685,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, srelent->rlocator) &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &srelent->rlocator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3866,13 +3887,13 @@ FlushDatabaseBuffers(Oid dbid)
* As in DropRelationBuffers, an unlocked precheck should be
* safe and saves some cycles.
*/
- if (bufHdr->tag.rlocator.dbOid != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rlocator.dbOid == dbid &&
+ if (bufHdr->tag.dbOid == dbid &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -4032,6 +4053,10 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
if (XLogHintBitIsNeeded() &&
(pg_atomic_read_u32(&bufHdr->state) & BM_PERMANENT))
{
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
+
/*
* If we must not write WAL, due to a relfilelocator-specific
* condition or being in recovery, don't dirty the page. We can
@@ -4040,8 +4065,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
*
* See src/backend/storage/page/README for longer discussion.
*/
- if (RecoveryInProgress() ||
- RelFileLocatorSkippingWAL(bufHdr->tag.rlocator))
+ if (RecoveryInProgress() || RelFileLocatorSkippingWAL(rlocator))
return;
/*
@@ -4649,8 +4673,10 @@ AbortBufferIO(void)
{
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
+ RelFileLocator rlocator;
- path = relpathperm(buf->tag.rlocator, buf->tag.forkNum);
+ rlocator = BufTagGetRelFileLocator(&buf->tag);
+ path = relpathperm(rlocator, BufTagGetForkNum(&buf->tag));
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4674,7 +4700,11 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rlocator, bufHdr->tag.forkNum);
+ char *path;
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
+ path = relpathperm(rlocator, BufTagGetForkNum(&bufHdr->tag));
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4692,8 +4722,12 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
- char *path = relpathbackend(bufHdr->tag.rlocator, MyBackendId,
- bufHdr->tag.forkNum);
+ char *path;
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
+ path = relpathbackend(rlocator, MyBackendId,
+ BufTagGetForkNum(&bufHdr->tag));
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4787,15 +4821,20 @@ static inline int
buffertag_comparator(const BufferTag *ba, const BufferTag *bb)
{
int ret;
+ RelFileLocator rlocatora;
+ RelFileLocator rlocatorb;
+
+ rlocatora = BufTagGetRelFileLocator(ba);
+ rlocatorb = BufTagGetRelFileLocator(bb);
- ret = rlocator_comparator(&ba->rlocator, &bb->rlocator);
+ ret = rlocator_comparator(&rlocatora, &rlocatorb);
if (ret != 0)
return ret;
- if (ba->forkNum < bb->forkNum)
+ if (BufTagGetForkNum(ba) < BufTagGetForkNum(bb))
return -1;
- if (ba->forkNum > bb->forkNum)
+ if (BufTagGetForkNum(ba) > BufTagGetForkNum(bb))
return 1;
if (ba->blockNum < bb->blockNum)
@@ -4945,10 +4984,12 @@ IssuePendingWritebacks(WritebackContext *context)
SMgrRelation reln;
int ahead;
BufferTag tag;
+ RelFileLocator currlocator;
Size nblocks = 1;
cur = &context->pending_writebacks[i];
tag = cur->tag;
+ currlocator = BufTagGetRelFileLocator(&tag);
/*
* Peek ahead, into following writeback requests, to see if they can
@@ -4956,11 +4997,14 @@ IssuePendingWritebacks(WritebackContext *context)
*/
for (ahead = 0; i + ahead + 1 < context->nr_pending; ahead++)
{
+ RelFileLocator nextrlocator;
+
next = &context->pending_writebacks[i + ahead + 1];
+ nextrlocator = BufTagGetRelFileLocator(&next->tag);
/* different file, stop */
- if (!RelFileLocatorEquals(cur->tag.rlocator, next->tag.rlocator) ||
- cur->tag.forkNum != next->tag.forkNum)
+ if (!RelFileLocatorEquals(currlocator, nextrlocator) ||
+ BufTagGetForkNum(&cur->tag) != BufTagGetForkNum(&next->tag))
break;
/* ok, block queued twice, skip */
@@ -4978,8 +5022,8 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rlocator, InvalidBackendId);
- smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
+ reln = smgropen(currlocator, InvalidBackendId);
+ smgrwriteback(reln, BufTagGetForkNum(&tag), tag.blockNum, nblocks);
}
context->nr_pending = 0;
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index 91e174e..972f3f3 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -213,15 +213,18 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
{
SMgrRelation oreln;
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rlocator, MyBackendId);
+ oreln = smgropen(rlocator, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
/* And write... */
smgrwrite(oreln,
- bufHdr->tag.forkNum,
+ BufTagGetForkNum(&bufHdr->tag),
bufHdr->tag.blockNum,
localpage,
false);
@@ -337,16 +340,22 @@ DropRelationLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
- bufHdr->tag.forkNum == forkNum &&
+ BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
+ {
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rlocator, MyBackendId,
- bufHdr->tag.forkNum),
+ relpathbackend(rlocator, MyBackendId,
+ BufTagGetForkNum(&bufHdr->tag)),
LocalRefCount[i]);
+ }
+
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
hash_search(LocalBufHash, (void *) &bufHdr->tag,
@@ -383,13 +392,16 @@ DropRelationAllLocalBuffers(RelFileLocator rlocator)
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator))
+ BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator))
{
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rlocator, MyBackendId,
- bufHdr->tag.forkNum),
+ relpathbackend(rlocator, MyBackendId,
+ BufTagGetForkNum(&bufHdr->tag)),
LocalRefCount[i]);
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 2f0e60e..092e959 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,18 +90,51 @@
*/
typedef struct buftag
{
- RelFileLocator rlocator; /* physical relation identifier */
- ForkNumber forkNum;
+ Oid spcOid; /* tablespace oid */
+ Oid dbOid; /* database oid */
+ RelFileNumber relNumber; /* relation file number */
+ ForkNumber forkNum; /* fork number */
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
+static inline RelFileNumber
+BufTagGetRelNumber(const BufferTag *tag)
+{
+ return tag->relNumber;
+}
+
+static inline ForkNumber
+BufTagGetForkNum(const BufferTag *tag)
+{
+ return tag->forkNum;
+}
+
+static inline void
+BufTagSetRelForkDetails(BufferTag *tag, RelFileNumber relnumber,
+ ForkNumber forknum)
+{
+ tag->relNumber = relnumber;
+ tag->forkNum = forknum;
+}
+
+static inline RelFileLocator
+BufTagGetRelFileLocator(const BufferTag *tag)
+{
+ RelFileLocator rlocator;
+
+ rlocator.spcOid = tag->spcOid;
+ rlocator.dbOid = tag->dbOid;
+ rlocator.relNumber = BufTagGetRelNumber(tag);
+
+ return rlocator;
+}
+
static inline void
CLEAR_BUFFERTAG(BufferTag *tag)
{
- tag->rlocator.spcOid = InvalidOid;
- tag->rlocator.dbOid = InvalidOid;
- tag->rlocator.relNumber = InvalidRelFileNumber;
- tag->forkNum = InvalidForkNumber;
+ tag->spcOid = InvalidOid;
+ tag->dbOid = InvalidOid;
+ BufTagSetRelForkDetails(tag, InvalidRelFileNumber, InvalidForkNumber);
tag->blockNum = InvalidBlockNumber;
}
@@ -109,19 +142,32 @@ static inline void
INIT_BUFFERTAG(BufferTag *tag, const RelFileLocator *rlocator,
ForkNumber forkNum, BlockNumber blockNum)
{
- tag->rlocator = *rlocator;
- tag->forkNum = forkNum;
+ tag->spcOid = rlocator->spcOid;
+ tag->dbOid = rlocator->dbOid;
+ BufTagSetRelForkDetails(tag, rlocator->relNumber, forkNum);
tag->blockNum = blockNum;
}
static inline bool
BUFFERTAGS_EQUAL(const BufferTag *tag1, const BufferTag *tag2)
{
- return RelFileLocatorEquals(tag1->rlocator, tag2->rlocator) &&
+ return (tag1->spcOid == tag2->spcOid) &&
+ (tag1->dbOid == tag2->dbOid) &&
+ (tag1->relNumber == tag2->relNumber) &&
(tag1->blockNum == tag2->blockNum) &&
(tag1->forkNum == tag2->forkNum);
}
+static inline bool
+BufTagMatchesRelFileLocator(const BufferTag *tag,
+ const RelFileLocator *rlocator)
+{
+ return (tag->spcOid == rlocator->spcOid) &&
+ (tag->dbOid == rlocator->dbOid) &&
+ (BufTagGetRelNumber(tag) == rlocator->relNumber);
+}
+
+
/*
* The shared buffer mapping table is partitioned to reduce contention.
* To determine which partition lock a given tag requires, compute the tag's
--
1.8.3.1
v9-0003-Remove-the-restriction-that-the-relmap-must-be-51.patch (text/x-patch; charset=US-ASCII)
From 614d41ab24d8a3d4a08fbc64c59309e32cf82dc9 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Tue, 12 Jul 2022 09:04:44 -0400
Subject: [PATCH v9 3/4] Remove the restriction that the relmap must be 512
bytes.
Instead of relying on the ability to atomically overwrite the
entire relmap file in one shot, write a new one and durably
rename it into place. Removing the struct padding and the
calculation showing why the map is exactly 512 bytes, and change
the maximum number of entries to a nearby round number.
Patch by me, reviewed by Andres Freund.
Discussion: http://postgr.es/m/CA+TgmoacMgLv_0edhN=oWjnUvJyFjXww4Q4re4kfm+qkSBtjaQ@mail.gmail.com
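The write-a-new-file-and-durably-rename approach described in this commit message can be sketched in miniature as follows. This is an illustrative standalone example, not the patch's code: PostgreSQL's durable_rename() additionally fsyncs the old and new files and the containing directory, and handles platform quirks. The helper name atomic_replace is hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Sketch of the crash-safe replacement pattern: write the new contents
 * to a temporary file, flush them to disk, then rename() the temporary
 * file over the target.  rename() within one filesystem is atomic, so
 * readers see either the old file or the new one, never a torn mixture.
 * This is what lets the relmap drop the one-disk-sector size cap that
 * overwrite-in-place relied on.
 */
static int
atomic_replace(const char *path, const char *tmppath,
			   const void *data, size_t len)
{
	int			fd = open(tmppath, O_WRONLY | O_CREAT | O_TRUNC, 0600);

	if (fd < 0)
		return -1;

	/* the data must be durable on disk before we rename it into place */
	if (write(fd, data, len) != (ssize_t) len || fsync(fd) != 0)
	{
		close(fd);
		return -1;
	}
	if (close(fd) != 0)
		return -1;

	/*
	 * Atomically install the new file.  A production implementation
	 * would also fsync the containing directory afterwards, so the
	 * rename itself survives a crash.
	 */
	return rename(tmppath, path);
}
```

If the process crashes before the rename(), the target file is untouched and only a stale temporary file is left behind, which a later writer simply overwrites.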
---
doc/src/sgml/monitoring.sgml | 4 +-
src/backend/utils/activity/wait_event.c | 4 +-
src/backend/utils/cache/relmapper.c | 94 +++++++++++++++++++--------------
src/include/utils/wait_event.h | 2 +-
4 files changed, 58 insertions(+), 46 deletions(-)
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 4549c25..3c611a1 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -1404,8 +1404,8 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
<entry>Waiting for a read of the relation map file.</entry>
</row>
<row>
- <entry><literal>RelationMapSync</literal></entry>
- <entry>Waiting for the relation map file to reach durable storage.</entry>
+ <entry><literal>RelationMapRename</literal></entry>
+ <entry>Waiting for durable replacement of a relation map file.</entry>
</row>
<row>
<entry><literal>RelationMapWrite</literal></entry>
diff --git a/src/backend/utils/activity/wait_event.c b/src/backend/utils/activity/wait_event.c
index 87c15b9..b5d4841 100644
--- a/src/backend/utils/activity/wait_event.c
+++ b/src/backend/utils/activity/wait_event.c
@@ -630,8 +630,8 @@ pgstat_get_wait_io(WaitEventIO w)
case WAIT_EVENT_RELATION_MAP_READ:
event_name = "RelationMapRead";
break;
- case WAIT_EVENT_RELATION_MAP_SYNC:
- event_name = "RelationMapSync";
+ case WAIT_EVENT_RELATION_MAP_RENAME:
+ event_name = "RelationMapRename";
break;
case WAIT_EVENT_RELATION_MAP_WRITE:
event_name = "RelationMapWrite";
diff --git a/src/backend/utils/cache/relmapper.c b/src/backend/utils/cache/relmapper.c
index 8e5595b..b3b4756 100644
--- a/src/backend/utils/cache/relmapper.c
+++ b/src/backend/utils/cache/relmapper.c
@@ -60,21 +60,26 @@
/*
* The map file is critical data: we have no automatic method for recovering
* from loss or corruption of it. We use a CRC so that we can detect
- * corruption. To minimize the risk of failed updates, the map file should
- * be kept to no more than one standard-size disk sector (ie 512 bytes),
- * and we use overwrite-in-place rather than playing renaming games.
- * The struct layout below is designed to occupy exactly 512 bytes, which
- * might make filesystem updates a bit more efficient.
+ * corruption. Since the file might be more than one standard-size disk
+ * sector in size, we cannot rely on overwrite-in-place. Instead, we generate
+ * a new file and rename it into place, atomically replacing the original file.
*
* Entries in the mappings[] array are in no particular order. We could
* speed searching by insisting on OID order, but it really shouldn't be
* worth the trouble given the intended size of the mapping sets.
*/
#define RELMAPPER_FILENAME "pg_filenode.map"
+#define RELMAPPER_TEMP_FILENAME "pg_filenode.map.tmp"
#define RELMAPPER_FILEMAGIC 0x592717 /* version ID value */
-#define MAX_MAPPINGS 62 /* 62 * 8 + 16 = 512 */
+/*
+ * There's no need for this constant to have any particular value, and we
+ * can raise it as necessary if we end up with more mapped relations. For
+ * now, we just pick a round number that is modestly larger than the expected
+ * number of mappings.
+ */
+#define MAX_MAPPINGS 64
typedef struct RelMapping
{
@@ -88,7 +93,6 @@ typedef struct RelMapFile
int32 num_mappings; /* number of valid RelMapping entries */
RelMapping mappings[MAX_MAPPINGS];
pg_crc32c crc; /* CRC of all above */
- int32 pad; /* to make the struct size be 512 exactly */
} RelMapFile;
/*
@@ -877,6 +881,7 @@ write_relmap_file(RelMapFile *newmap, bool write_wal, bool send_sinval,
{
int fd;
char mapfilename[MAXPGPATH];
+ char maptempfilename[MAXPGPATH];
/*
* Fill in the overhead fields and update CRC.
@@ -890,17 +895,47 @@ write_relmap_file(RelMapFile *newmap, bool write_wal, bool send_sinval,
FIN_CRC32C(newmap->crc);
/*
- * Open the target file. We prefer to do this before entering the
- * critical section, so that an open() failure need not force PANIC.
+ * Construct filenames -- a temporary file that we'll create to write the
+ * data initially, and then the permanent name to which we will rename it.
*/
snprintf(mapfilename, sizeof(mapfilename), "%s/%s",
dbpath, RELMAPPER_FILENAME);
- fd = OpenTransientFile(mapfilename, O_WRONLY | O_CREAT | PG_BINARY);
+ snprintf(maptempfilename, sizeof(maptempfilename), "%s/%s",
+ dbpath, RELMAPPER_TEMP_FILENAME);
+
+ /*
+ * Open a temporary file. If a file already exists with this name, it must
+ * be left over from a previous crash, so we can overwrite it. Concurrent
+ * calls to this function are not allowed.
+ */
+ fd = OpenTransientFile(maptempfilename,
+ O_WRONLY | O_CREAT | O_TRUNC | PG_BINARY);
if (fd < 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not open file \"%s\": %m",
- mapfilename)));
+ maptempfilename)));
+
+ /* Write new data to the file. */
+ pgstat_report_wait_start(WAIT_EVENT_RELATION_MAP_WRITE);
+ if (write(fd, newmap, sizeof(RelMapFile)) != sizeof(RelMapFile))
+ {
+ /* if write didn't set errno, assume problem is no disk space */
+ if (errno == 0)
+ errno = ENOSPC;
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not write file \"%s\": %m",
+ maptempfilename)));
+ }
+ pgstat_report_wait_end();
+
+ /* And close the file. */
+ if (CloseTransientFile(fd) != 0)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not close file \"%s\": %m",
+ maptempfilename)));
if (write_wal)
{
@@ -924,40 +959,17 @@ write_relmap_file(RelMapFile *newmap, bool write_wal, bool send_sinval,
XLogFlush(lsn);
}
- errno = 0;
- pgstat_report_wait_start(WAIT_EVENT_RELATION_MAP_WRITE);
- if (write(fd, newmap, sizeof(RelMapFile)) != sizeof(RelMapFile))
- {
- /* if write didn't set errno, assume problem is no disk space */
- if (errno == 0)
- errno = ENOSPC;
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not write file \"%s\": %m",
- mapfilename)));
- }
- pgstat_report_wait_end();
-
/*
- * We choose to fsync the data to disk before considering the task done.
- * It would be possible to relax this if it turns out to be a performance
- * issue, but it would complicate checkpointing --- see notes for
- * CheckPointRelationMap.
+ * durable_rename() does all the hard work of making sure that we rename
+ * the temporary file into place in a crash-safe manner.
+ *
+ * NB: Although we instruct durable_rename() to use ERROR, we will often
+ * be in a critical section at this point; if so, ERROR will become PANIC.
*/
- pgstat_report_wait_start(WAIT_EVENT_RELATION_MAP_SYNC);
- if (pg_fsync(fd) != 0)
- ereport(data_sync_elevel(ERROR),
- (errcode_for_file_access(),
- errmsg("could not fsync file \"%s\": %m",
- mapfilename)));
+ pgstat_report_wait_start(WAIT_EVENT_RELATION_MAP_RENAME);
+ durable_rename(maptempfilename, mapfilename, ERROR);
pgstat_report_wait_end();
- if (CloseTransientFile(fd) != 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not close file \"%s\": %m",
- mapfilename)));
-
/*
* Now that the file is safely on disk, send sinval message to let other
* backends know to re-read it. We must do this inside the critical
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index b578e2e..5d3775c 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -193,7 +193,7 @@ typedef enum
WAIT_EVENT_LOGICAL_REWRITE_TRUNCATE,
WAIT_EVENT_LOGICAL_REWRITE_WRITE,
WAIT_EVENT_RELATION_MAP_READ,
- WAIT_EVENT_RELATION_MAP_SYNC,
+ WAIT_EVENT_RELATION_MAP_RENAME,
WAIT_EVENT_RELATION_MAP_WRITE,
WAIT_EVENT_REORDER_BUFFER_READ,
WAIT_EVENT_REORDER_BUFFER_WRITE,
--
1.8.3.1
v9-0004-Widen-relfilenumber-from-32-bits-to-56-bits.patch (text/x-patch; charset=UTF-8)
From c08263566f3104b788178629c4dd902d69fcec29 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Thu, 14 Jul 2022 14:25:50 +0530
Subject: [PATCH v9 4/4] Widen relfilenumber from 32 bits to 56 bits
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Currently, relfilenumber is 32 bits wide, so it can wrap around and a
relfilenumber can be reused. To guard against such reuse, a complicated
hack leaves a 0-length tombstone file around until the next checkpoint,
and when we allocate a new relfilenumber we must also loop to check for
an on-disk conflict.
This patch makes the relfilenumber 56 bits wide, with no provision for
wraparound. After this change we can get rid of the 0-length tombstone
files and of the loop that checks for on-disk relfilenumber conflicts.
We make it 56 bits wide rather than 64 because a full 64-bit
relfilenumber would enlarge the BufferTag, increasing memory usage and
possibly hurting performance. To avoid that, the buffer tag packs 8 bits
of fork number and 56 bits of relfilenumber into a single 64-bit field.
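The 56/8-bit packing described above can be illustrated with a small standalone sketch. The helper names below are hypothetical and the bit layout is one plausible choice, not necessarily the patch's exact encoding; the point is that relnumber and fork number together still fit in one 64-bit word, so the buffer tag does not grow.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch: pack an 8-bit fork number and a 56-bit relfilenumber into a
 * single 64-bit word.  Widening relfilenumber this way keeps sizeof
 * the buffer tag unchanged relative to a 32-bit number plus a 32-bit
 * fork field.
 */
#define MAX_RELFILENUMBER ((UINT64_C(1) << 56) - 1)

static inline uint64_t
pack_rel_fork(uint64_t relnumber, uint8_t forknum)
{
	/* a 56-bit relfilenumber must fit below bit 56 */
	assert(relnumber <= MAX_RELFILENUMBER);
	return (relnumber << 8) | forknum;
}

static inline uint64_t
unpack_relnumber(uint64_t packed)
{
	return packed >> 8;
}

static inline uint8_t
unpack_forknum(uint64_t packed)
{
	return (uint8_t) (packed & 0xFF);
}
```

Because both fields round-trip losslessly through the packed word, accessor functions like BufTagGetRelNumber()/BufTagGetForkNum() can hide the representation from all callers, which is why the earlier refactoring patch converted direct tag field access to accessors first.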
---
contrib/pg_buffercache/Makefile | 4 +-
.../pg_buffercache/pg_buffercache--1.3--1.4.sql | 30 +++++
contrib/pg_buffercache/pg_buffercache.control | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 39 +++++-
contrib/pg_walinspect/expected/pg_walinspect.out | 4 +-
contrib/pg_walinspect/sql/pg_walinspect.sql | 4 +-
contrib/test_decoding/expected/rewrite.out | 2 +-
contrib/test_decoding/sql/rewrite.sql | 2 +-
doc/src/sgml/catalogs.sgml | 2 +-
doc/src/sgml/pgbuffercache.sgml | 2 +-
src/backend/access/gin/ginxlog.c | 2 +-
src/backend/access/rmgrdesc/gistdesc.c | 2 +-
src/backend/access/rmgrdesc/heapdesc.c | 2 +-
src/backend/access/rmgrdesc/nbtdesc.c | 2 +-
src/backend/access/rmgrdesc/seqdesc.c | 2 +-
src/backend/access/rmgrdesc/xlogdesc.c | 21 ++-
src/backend/access/transam/README | 5 +-
src/backend/access/transam/varsup.c | 138 ++++++++++++++++++-
src/backend/access/transam/xlog.c | 60 +++++++++
src/backend/access/transam/xlogprefetcher.c | 14 +-
src/backend/access/transam/xlogrecovery.c | 6 +-
src/backend/access/transam/xlogutils.c | 12 +-
src/backend/catalog/catalog.c | 149 ++++++++-------------
src/backend/catalog/heap.c | 20 +--
src/backend/catalog/index.c | 11 +-
src/backend/catalog/storage.c | 8 ++
src/backend/commands/tablecmds.c | 16 ++-
src/backend/nodes/gen_node_support.pl | 4 +-
src/backend/nodes/outfuncs.c | 5 +
src/backend/replication/logical/decode.c | 1 +
src/backend/replication/logical/reorderbuffer.c | 2 +-
src/backend/storage/freespace/fsmpage.c | 2 +-
src/backend/storage/lmgr/lwlocknames.txt | 1 +
src/backend/storage/smgr/md.c | 7 +
src/backend/storage/smgr/smgr.c | 2 +-
src/backend/utils/adt/dbsize.c | 11 +-
src/backend/utils/adt/pg_upgrade_support.c | 22 ++-
src/backend/utils/cache/relcache.c | 3 +-
src/backend/utils/cache/relfilenumbermap.c | 44 +++---
src/backend/utils/misc/pg_controldata.c | 9 +-
src/bin/pg_checksums/pg_checksums.c | 4 +-
src/bin/pg_controldata/pg_controldata.c | 2 +
src/bin/pg_dump/pg_dump.c | 20 +--
src/bin/pg_rewind/filemap.c | 6 +-
src/bin/pg_upgrade/info.c | 5 +-
src/bin/pg_upgrade/pg_upgrade.c | 6 +-
src/bin/pg_upgrade/relfilenumber.c | 4 +-
src/bin/pg_waldump/pg_waldump.c | 2 +-
src/common/relpath.c | 20 +--
src/fe_utils/option_utils.c | 39 ++++++
src/include/access/transam.h | 6 +
src/include/access/xlog.h | 2 +
src/include/catalog/catalog.h | 12 +-
src/include/catalog/pg_class.h | 16 +--
src/include/catalog/pg_control.h | 2 +
src/include/catalog/pg_proc.dat | 10 +-
src/include/fe_utils/option_utils.h | 2 +
src/include/postgres_ext.h | 11 +-
src/include/storage/buf_internals.h | 63 +++++++--
src/include/storage/relfilelocator.h | 12 +-
src/test/regress/expected/alter_table.out | 24 ++--
src/test/regress/expected/oidjoins.out | 2 +-
src/test/regress/sql/alter_table.sql | 8 +-
63 files changed, 661 insertions(+), 291 deletions(-)
create mode 100644 contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile
index 2ab8c65..c20439a 100644
--- a/contrib/pg_buffercache/Makefile
+++ b/contrib/pg_buffercache/Makefile
@@ -6,8 +6,8 @@ OBJS = \
pg_buffercache_pages.o
EXTENSION = pg_buffercache
-DATA = pg_buffercache--1.2.sql pg_buffercache--1.2--1.3.sql \
- pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql
+DATA = pg_buffercache--1.0--1.1.sql pg_buffercache--1.1--1.2.sql pg_buffercache--1.2.sql \
+ pg_buffercache--1.2--1.3.sql pg_buffercache--1.3--1.4.sql
PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time"
ifdef USE_PGXS
diff --git a/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
new file mode 100644
index 0000000..ee2d9c7
--- /dev/null
+++ b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
@@ -0,0 +1,30 @@
+/* contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.4'" to load this file. \quit
+
+/* First we have to remove them from the extension */
+ALTER EXTENSION pg_buffercache DROP VIEW pg_buffercache;
+ALTER EXTENSION pg_buffercache DROP FUNCTION pg_buffercache_pages();
+
+/* Then we can drop them */
+DROP VIEW pg_buffercache;
+DROP FUNCTION pg_buffercache_pages();
+
+/* Now redefine */
+CREATE OR REPLACE FUNCTION pg_buffercache_pages()
+RETURNS SETOF RECORD
+AS 'MODULE_PATHNAME', 'pg_buffercache_pages_v1_4'
+LANGUAGE C PARALLEL SAFE;
+
+CREATE OR REPLACE VIEW pg_buffercache AS
+ SELECT P.* FROM pg_buffercache_pages() AS P
+ (bufferid integer, relfilenode int8, reltablespace oid, reldatabase oid,
+ relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
+ pinning_backends int4);
+
+-- Don't want these to be available to public.
+REVOKE ALL ON FUNCTION pg_buffercache_pages() FROM PUBLIC;
+REVOKE ALL ON pg_buffercache FROM PUBLIC;
+GRANT EXECUTE ON FUNCTION pg_buffercache_pages() TO pg_monitor;
+GRANT SELECT ON pg_buffercache TO pg_monitor;
diff --git a/contrib/pg_buffercache/pg_buffercache.control b/contrib/pg_buffercache/pg_buffercache.control
index 8c060ae..a82ae5f 100644
--- a/contrib/pg_buffercache/pg_buffercache.control
+++ b/contrib/pg_buffercache/pg_buffercache.control
@@ -1,5 +1,5 @@
# pg_buffercache extension
comment = 'examine the shared buffer cache'
-default_version = '1.3'
+default_version = '1.4'
module_pathname = '$libdir/pg_buffercache'
relocatable = true
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index c5754ea..42e3767 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -59,9 +59,10 @@ typedef struct
* relation node/tablespace/database/blocknum and dirty indicator.
*/
PG_FUNCTION_INFO_V1(pg_buffercache_pages);
+PG_FUNCTION_INFO_V1(pg_buffercache_pages_v1_4);
-Datum
-pg_buffercache_pages(PG_FUNCTION_ARGS)
+static Datum
+pg_buffercache_pages_internal(PG_FUNCTION_ARGS, Oid rfn_typid)
{
FuncCallContext *funcctx;
Datum result;
@@ -103,7 +104,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
TupleDescInitEntry(tupledesc, (AttrNumber) 1, "bufferid",
INT4OID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 2, "relfilenode",
- OIDOID, -1, 0);
+ rfn_typid, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 3, "reltablespace",
OIDOID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 4, "reldatabase",
@@ -209,7 +210,24 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
}
else
{
- values[1] = ObjectIdGetDatum(fctx->record[i].relfilenumber);
+ if (rfn_typid == INT8OID)
+ values[1] =
+ Int64GetDatum((int64) fctx->record[i].relfilenumber);
+ else
+ {
+ Assert(rfn_typid == OIDOID);
+
+ if (fctx->record[i].relfilenumber > OID_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("relfilenode " INT64_FORMAT " is too large to be represented as an OID",
+ fctx->record[i].relfilenumber),
+ errhint("Upgrade the extension using ALTER EXTENSION pg_buffercache UPDATE")));
+
+ values[1] =
+ ObjectIdGetDatum((Oid) fctx->record[i].relfilenumber);
+ }
+
nulls[1] = false;
values[2] = ObjectIdGetDatum(fctx->record[i].reltablespace);
nulls[2] = false;
@@ -237,3 +255,16 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
else
SRF_RETURN_DONE(funcctx);
}
+
+Datum
+pg_buffercache_pages(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, OIDOID);
+}
+
+/* entry point for old extension version */
+Datum
+pg_buffercache_pages_v1_4(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, INT8OID);
+}
diff --git a/contrib/pg_walinspect/expected/pg_walinspect.out b/contrib/pg_walinspect/expected/pg_walinspect.out
index a1ee743..e9b06ed 100644
--- a/contrib/pg_walinspect/expected/pg_walinspect.out
+++ b/contrib/pg_walinspect/expected/pg_walinspect.out
@@ -54,9 +54,9 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- ===================================================================
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
ok
----
t
diff --git a/contrib/pg_walinspect/sql/pg_walinspect.sql b/contrib/pg_walinspect/sql/pg_walinspect.sql
index 1b265ea..5393834 100644
--- a/contrib/pg_walinspect/sql/pg_walinspect.sql
+++ b/contrib/pg_walinspect/sql/pg_walinspect.sql
@@ -39,10 +39,10 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
-- ===================================================================
-- Test for filtering out WAL records based on resource_manager and
diff --git a/contrib/test_decoding/expected/rewrite.out b/contrib/test_decoding/expected/rewrite.out
index b30999c..8f1aa48 100644
--- a/contrib/test_decoding/expected/rewrite.out
+++ b/contrib/test_decoding/expected/rewrite.out
@@ -106,7 +106,7 @@ VACUUM FULL pg_class;
-- reindexing of important relations / indexes
REINDEX TABLE pg_class;
REINDEX INDEX pg_class_oid_index;
-REINDEX INDEX pg_class_tblspc_relfilenode_index;
+REINDEX INDEX pg_class_relfilenode_index;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (5, 3);
BEGIN;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (6, 4);
diff --git a/contrib/test_decoding/sql/rewrite.sql b/contrib/test_decoding/sql/rewrite.sql
index 62dead3..8983704 100644
--- a/contrib/test_decoding/sql/rewrite.sql
+++ b/contrib/test_decoding/sql/rewrite.sql
@@ -77,7 +77,7 @@ VACUUM FULL pg_class;
-- reindexing of important relations / indexes
REINDEX TABLE pg_class;
REINDEX INDEX pg_class_oid_index;
-REINDEX INDEX pg_class_tblspc_relfilenode_index;
+REINDEX INDEX pg_class_relfilenode_index;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (5, 3);
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 4f3f375..3a48c35 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1965,7 +1965,7 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
</para>
<para>
Name of the on-disk file of this relation; zero means this
diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
index a06fd3e..e222265 100644
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@@ -62,7 +62,7 @@
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
(references <link linkend="catalog-pg-class"><structname>pg_class</structname></link>.<structfield>relfilenode</structfield>)
</para>
<para>
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 41b9211..b75ad79 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -100,7 +100,7 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
BlockNumber blknum;
BufferGetTag(buffer, &locator, &forknum, &blknum);
- elog(ERROR, "failed to add item to index page in %u/%u/%u",
+ elog(ERROR, "failed to add item to index page in %u/%u/" INT64_FORMAT,
locator.spcOid, locator.dbOid, locator.relNumber);
}
}
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index 7dd3c1d..c699937 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -26,7 +26,7 @@ out_gistxlogPageUpdate(StringInfo buf, gistxlogPageUpdate *xlrec)
static void
out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec)
{
- appendStringInfo(buf, "rel %u/%u/%u; blk %u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; blk %u; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber, xlrec->block,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 923d3bc..f6d278b 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -169,7 +169,7 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
{
xl_heap_new_cid *xlrec = (xl_heap_new_cid *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; tid %u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; tid %u/%u",
xlrec->target_locator.spcOid,
xlrec->target_locator.dbOid,
xlrec->target_locator.relNumber,
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index 4843cd5..70feb2d 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -100,7 +100,7 @@ btree_desc(StringInfo buf, XLogReaderState *record)
{
xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/seqdesc.c b/src/backend/access/rmgrdesc/seqdesc.c
index b3845f9..45c6ee7 100644
--- a/src/backend/access/rmgrdesc/seqdesc.c
+++ b/src/backend/access/rmgrdesc/seqdesc.c
@@ -25,7 +25,7 @@ seq_desc(StringInfo buf, XLogReaderState *record)
xl_seq_rec *xlrec = (xl_seq_rec *) rec;
if (info == XLOG_SEQ_LOG)
- appendStringInfo(buf, "rel %u/%u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT,
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber);
}
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 6fec485..a723790 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -45,8 +45,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
CheckPoint *checkpoint = (CheckPoint *) rec;
appendStringInfo(buf, "redo %X/%X; "
- "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; "
- "oldest xid %u in DB %u; oldest multi %u in DB %u; "
+ "tli %u; prev tli %u; fpw %s; xid %u:%u; relfilenumber " INT64_FORMAT "; oid %u; "
+ "multi %u; offset %u; oldest xid %u in DB %u; oldest multi %u in DB %u; "
"oldest/newest commit timestamp xid: %u/%u; "
"oldest running xid %u; %s",
LSN_FORMAT_ARGS(checkpoint->redo),
@@ -55,6 +55,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->fullPageWrites ? "true" : "false",
EpochFromFullTransactionId(checkpoint->nextXid),
XidFromFullTransactionId(checkpoint->nextXid),
+ checkpoint->nextRelFileNumber,
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
@@ -74,6 +75,13 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
memcpy(&nextOid, rec, sizeof(Oid));
appendStringInfo(buf, "%u", nextOid);
}
+ else if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, rec, sizeof(RelFileNumber));
+ appendStringInfo(buf, INT64_FORMAT, nextRelFileNumber);
+ }
else if (info == XLOG_RESTORE_POINT)
{
xl_restore_point *xlrec = (xl_restore_point *) rec;
@@ -169,6 +177,9 @@ xlog_identify(uint8 info)
case XLOG_NEXTOID:
id = "NEXTOID";
break;
+ case XLOG_NEXT_RELFILENUMBER:
+ id = "NEXT_RELFILENUMBER";
+ break;
case XLOG_SWITCH:
id = "SWITCH";
break;
@@ -237,7 +248,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfoChar(buf, ' ');
appendStringInfo(buf,
- "blkref #%d: rel %u/%u/%u fork %s blk %u",
+ "blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -297,7 +308,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
if (forknum != MAIN_FORKNUM)
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u fork %s blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -306,7 +317,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
else
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
blk);
diff --git a/src/backend/access/transam/README b/src/backend/access/transam/README
index 734c39a..8911671 100644
--- a/src/backend/access/transam/README
+++ b/src/backend/access/transam/README
@@ -692,8 +692,9 @@ by having database restart search for files that don't have any committed
entry in pg_class, but that currently isn't done because of the possibility
of deleting data that is useful for forensic analysis of the crash.
Orphan files are harmless --- at worst they waste a bit of disk space ---
-because we check for on-disk collisions when allocating new relfilenumber
-OIDs. So cleaning up isn't really necessary.
+because the relfilenumber counter is monotonically increasing. The maximum
+value is 2^56-1, and there is no provision for wraparound. Thus, on-disk
+collisions aren't possible.
3. Deleting a table, which requires an unlink() that could fail.
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 849a7ce..e3564e2 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -30,6 +30,15 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+/* Number of RelFileNumbers to be logged per XLOG write */
+#define VAR_RELNUMBER_PER_XLOG 512
+
+/*
+ * Log more RelFileNumbers when the number of remaining pre-logged values
+ * drops to this threshold. Valid values are 0 to VAR_RELNUMBER_PER_XLOG - 1.
+ */
+#define VAR_RELNUMBER_NEW_XLOG_THRESHOLD 256
+
/* pointer to "variable cache" in shared memory (set up by shmem.c) */
VariableCache ShmemVariableCache = NULL;
@@ -521,8 +530,7 @@ ForceTransactionIdLimitUpdate(void)
* wide, counter wraparound will occur eventually, and therefore it is unwise
* to assume they are unique unless precautions are taken to make them so.
* Hence, this routine should generally not be used directly. The only direct
- * callers should be GetNewOidWithIndex() and GetNewRelFileNumber() in
- * catalog/catalog.c.
+ * caller should be GetNewOidWithIndex() in catalog/catalog.c.
*/
Oid
GetNewObjectId(void)
@@ -613,6 +621,132 @@ SetNextObjectId(Oid nextOid)
}
/*
+ * GetNewRelFileNumber
+ *
+ * Similar to GetNewObjectId, but generates a new relfilenumber instead of
+ * a new OID.
+ */
+RelFileNumber
+GetNewRelFileNumber(void)
+{
+ RelFileNumber result;
+
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot assign RelFileNumber during recovery");
+
+ if (IsBinaryUpgrade)
+ elog(ERROR, "cannot assign RelFileNumber during binary upgrade");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /* check for the wraparound for the relfilenumber counter */
+ if (unlikely(ShmemVariableCache->nextRelFileNumber > MAX_RELFILENUMBER))
+ elog(ERROR, "relfilenumber is out of bounds");
+
+ /*
+ * If fewer logged values than the threshold remain, log more. In
+ * principle we could wait until every logged relfilenumber has been
+ * consumed before logging more, but then we would have to flush the new
+ * WAL record immediately: nextRelFileNumber must always be larger than
+ * any relfilenumber already in use on disk, and to maintain that
+ * invariant the record we log must reach disk before any file is
+ * created from the newly logged range. By instead logging the next
+ * range before the current one is exhausted, we only need to flush the
+ * newly logged record before consuming values from that new range --
+ * and by then it has usually been flushed already by some unrelated
+ * XLogFlush operation. Even though VAR_RELNUMBER_PER_XLOG is large
+ * enough that the extra flush might not slow things down, it is still
+ * better to avoid it when we can.
+ */
+ if (ShmemVariableCache->loggedRelFileNumber -
+ ShmemVariableCache->nextRelFileNumber <=
+ VAR_RELNUMBER_NEW_XLOG_THRESHOLD)
+ {
+ RelFileNumber newlogrelnum;
+
+ /*
+ * The first time through, this immediately flushes the newly logged
+ * WAL record, since nothing has been logged in advance. On subsequent
+ * calls we remember the previously logged record pointer and flush up
+ * to that point instead.
+ *
+ * XXX the second call may ask to flush what the first call already
+ * flushed, but that flush is logically a no-op, so it is not worth
+ * adding extra complexity to avoid it.
+ */
+ newlogrelnum = ShmemVariableCache->nextRelFileNumber +
+ VAR_RELNUMBER_PER_XLOG;
+ LogNextRelFileNumber(newlogrelnum,
+ &ShmemVariableCache->loggedRelFileNumberRecPtr);
+
+ ShmemVariableCache->loggedRelFileNumber = newlogrelnum;
+ }
+
+ result = ShmemVariableCache->nextRelFileNumber;
+ (ShmemVariableCache->nextRelFileNumber)++;
+
+ LWLockRelease(RelFileNumberGenLock);
+
+ return result;
+}
+
+/*
+ * SetNextRelFileNumber
+ *
+ * This may only be called during pg_upgrade; it advances the RelFileNumber
+ * counter to the specified value if the current value is smaller than the
+ * input value.
+ */
+void
+SetNextRelFileNumber(RelFileNumber relnumber)
+{
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot forward RelFileNumber during recovery");
+
+ if (!IsBinaryUpgrade)
+ elog(ERROR, "RelFileNumber can be set only during binary upgrade");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /*
+ * If the previously assigned nextRelFileNumber is already higher than
+ * the requested value, there is nothing to do. This can happen because
+ * during upgrade objects are not created in relfilenumber order.
+ */
+ if (relnumber <= ShmemVariableCache->nextRelFileNumber)
+ {
+ LWLockRelease(RelFileNumberGenLock);
+ return;
+ }
+
+ /*
+ * If the new relfilenumber is greater than or equal to the already
+ * logged relfilenumber, log again.
+ *
+ * XXX The new 'relnumber' can come from any range, so we cannot arrange
+ * to piggyback the XLogFlush by logging in advance. That should not
+ * matter much, as this is only called during binary upgrade.
+ */
+ if (relnumber >= ShmemVariableCache->loggedRelFileNumber)
+ {
+ RelFileNumber newlogrelnum;
+
+ newlogrelnum = relnumber + VAR_RELNUMBER_PER_XLOG;
+ LogNextRelFileNumber(newlogrelnum, NULL);
+ ShmemVariableCache->loggedRelFileNumber = newlogrelnum;
+ }
+
+ ShmemVariableCache->nextRelFileNumber = relnumber;
+
+ LWLockRelease(RelFileNumberGenLock);
+}
+
+/*
* StopGeneratingPinnedObjectIds
*
* This is called once during initdb to force the OID counter up to
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index b809a21..0941214 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4543,6 +4543,7 @@ BootStrapXLOG(void)
checkPoint.nextXid =
FullTransactionIdFromEpochAndXid(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstGenbkiObjectId;
+ checkPoint.nextRelFileNumber = FirstNormalRelFileNumber;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
checkPoint.oldestXid = FirstNormalTransactionId;
@@ -4556,7 +4557,10 @@ BootStrapXLOG(void)
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumberRecPtr = InvalidXLogRecPtr;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -5023,7 +5027,9 @@ StartupXLOG(void)
/* initialize shared memory variables from the checkpoint record */
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -6483,6 +6489,18 @@ CreateCheckPoint(int flags)
checkPoint.nextOid += ShmemVariableCache->oidCount;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_SHARED);
+ checkPoint.nextRelFileNumber = ShmemVariableCache->nextRelFileNumber;
+ if (!shutdown)
+ {
+ if (ShmemVariableCache->loggedRelFileNumber < checkPoint.nextRelFileNumber)
+ elog(ERROR, "nextRelFileNumber cannot go backward from " INT64_FORMAT " to " INT64_FORMAT,
+ checkPoint.nextRelFileNumber, ShmemVariableCache->loggedRelFileNumber);
+
+ checkPoint.nextRelFileNumber = ShmemVariableCache->loggedRelFileNumber;
+ }
+ LWLockRelease(RelFileNumberGenLock);
+
MultiXactGetCheckptMulti(shutdown,
&checkPoint.nextMulti,
&checkPoint.nextMultiOffset,
@@ -7361,6 +7379,34 @@ XLogPutNextOid(Oid nextOid)
}
/*
+ * Similar to XLogPutNextOid, but writes a NEXT_RELFILENUMBER log record
+ * instead of a NEXTOID one. If '*prevrecptr' is valid, flush the WAL up to
+ * that record pointer; otherwise flush up to the record just written. If
+ * 'prevrecptr' is non-NULL, also store the new record's pointer into it.
+ */
+void
+LogNextRelFileNumber(RelFileNumber nextrelnumber, XLogRecPtr *prevrecptr)
+{
+ XLogRecPtr recptr;
+
+ XLogBeginInsert();
+ XLogRegisterData((char *) (&nextrelnumber), sizeof(RelFileNumber));
+ recptr = XLogInsert(RM_XLOG_ID, XLOG_NEXT_RELFILENUMBER);
+
+ /*
+ * If a valid prevrecptr was passed, flush the WAL up to that record;
+ * otherwise flush the newly logged record.
+ */
+ if ((prevrecptr != NULL) && !XLogRecPtrIsInvalid(*prevrecptr))
+ XLogFlush(*prevrecptr);
+ else
+ XLogFlush(recptr);
+
+ if (prevrecptr != NULL)
+ *prevrecptr = recptr;
+}
+
+/*
* Write an XLOG SWITCH record.
*
* Here we just blindly issue an XLogInsert request for the record.
@@ -7575,6 +7621,16 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
}
+ if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, XLogRecGetData(record), sizeof(RelFileNumber));
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumber = nextRelFileNumber;
+ LWLockRelease(RelFileNumberGenLock);
+ }
else if (info == XLOG_CHECKPOINT_SHUTDOWN)
{
CheckPoint checkPoint;
@@ -7589,6 +7645,10 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
+ LWLockRelease(RelFileNumberGenLock);
MultiXactSetNextMXact(checkPoint.nextMulti,
checkPoint.nextMultiOffset);
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index 87d1421..d27c7fc 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -611,7 +611,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u until %X/%X is replayed, which creates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, which creates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -634,7 +634,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, which truncates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, which truncates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -733,7 +733,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing all prefetch in relation %u/%u/%u until %X/%X is replayed, because the relation does not exist on disk",
+ "suppressing all prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, because the relation does not exist on disk",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -754,7 +754,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, because the relation is too small",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, because the relation is too small",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -793,7 +793,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* truncated beneath our feet?
*/
elog(ERROR,
- "could not prefetch relation %u/%u/%u block %u",
+ "could not prefetch relation %u/%u/" INT64_FORMAT " block %u",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -932,7 +932,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed),
filter->filter_from_block);
@@ -948,7 +948,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed));
#endif
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 5d6f1b5..1d95fc0 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2175,14 +2175,14 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
continue;
if (forknum != MAIN_FORKNUM)
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, fork %u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", fork %u, blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
forknum,
blk);
else
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
@@ -2378,7 +2378,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
if (memcmp(replay_image_masked, primary_image_masked, BLCKSZ) != 0)
{
elog(FATAL,
- "inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
+ "inconsistent page found, rel %u/%u/" INT64_FORMAT ", forknum %u, blkno %u",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forknum, blkno);
}
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 0cda225..b4398e1 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -617,17 +617,17 @@ CreateFakeRelcacheEntry(RelFileLocator rlocator)
rel->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
/* We don't know the name of the relation; use relfilenumber instead */
- sprintf(RelationGetRelationName(rel), "%u", rlocator.relNumber);
+ sprintf(RelationGetRelationName(rel), INT64_FORMAT, rlocator.relNumber);
/*
* We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNumber may be
- * different from the relation's OID. It shouldn't really matter though.
- * In recovery, we are running by ourselves and can't have any lock
- * conflicts. While syncing, we already hold AccessExclusiveLock.
+ * relation. Note we are setting relId to just FirstNormalObjectId which
+ * is completely bogus. It shouldn't really matter though. In recovery,
+ * we are running by ourselves and can't have any lock conflicts. While
+ * syncing, we already hold AccessExclusiveLock.
*/
rel->rd_lockInfo.lockRelId.dbId = rlocator.dbOid;
- rel->rd_lockInfo.lockRelId.relId = rlocator.relNumber;
+ rel->rd_lockInfo.lockRelId.relId = FirstNormalObjectId;
rel->rd_smgr = NULL;
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 6f43870..945690d 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -481,101 +481,6 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
}
/*
- * GetNewRelFileNumber
- * Generate a new relfilenumber that is unique within the
- * database of the given tablespace.
- *
- * If the relfilenumber will also be used as the relation's OID, pass the
- * opened pg_class catalog, and this routine will guarantee that the result
- * is also an unused OID within pg_class. If the result is to be used only
- * as a relfilenumber for an existing relation, pass NULL for pg_class.
- *
- * As with GetNewOidWithIndex(), there is some theoretical risk of a race
- * condition, but it doesn't seem worth worrying about.
- *
- * Note: we don't support using this in bootstrap mode. All relations
- * created by bootstrap have preassigned OIDs, so there's no need.
- */
-RelFileNumber
-GetNewRelFileNumber(Oid reltablespace, Relation pg_class, char relpersistence)
-{
- RelFileLocatorBackend rlocator;
- char *rpath;
- bool collides;
- BackendId backend;
-
- /*
- * If we ever get here during pg_upgrade, there's something wrong; all
- * relfilenumber assignments during a binary-upgrade run should be
- * determined by commands in the dump script.
- */
- Assert(!IsBinaryUpgrade);
-
- switch (relpersistence)
- {
- case RELPERSISTENCE_TEMP:
- backend = BackendIdForTempRelations();
- break;
- case RELPERSISTENCE_UNLOGGED:
- case RELPERSISTENCE_PERMANENT:
- backend = InvalidBackendId;
- break;
- default:
- elog(ERROR, "invalid relpersistence: %c", relpersistence);
- return InvalidRelFileNumber; /* placate compiler */
- }
-
- /* This logic should match RelationInitPhysicalAddr */
- rlocator.locator.spcOid = reltablespace ? reltablespace : MyDatabaseTableSpace;
- rlocator.locator.dbOid =
- (rlocator.locator.spcOid == GLOBALTABLESPACE_OID) ?
- InvalidOid : MyDatabaseId;
-
- /*
- * The relpath will vary based on the backend ID, so we must initialize
- * that properly here to make sure that any collisions based on filename
- * are properly detected.
- */
- rlocator.backend = backend;
-
- do
- {
- CHECK_FOR_INTERRUPTS();
-
- /* Generate the OID */
- if (pg_class)
- rlocator.locator.relNumber = GetNewOidWithIndex(pg_class, ClassOidIndexId,
- Anum_pg_class_oid);
- else
- rlocator.locator.relNumber = GetNewObjectId();
-
- /* Check for existing file of same name */
- rpath = relpath(rlocator, MAIN_FORKNUM);
-
- if (access(rpath, F_OK) == 0)
- {
- /* definite collision */
- collides = true;
- }
- else
- {
- /*
- * Here we have a little bit of a dilemma: if errno is something
- * other than ENOENT, should we declare a collision and loop? In
- * practice it seems best to go ahead regardless of the errno. If
- * there is a colliding file we will get an smgr failure when we
- * attempt to create the new relation file.
- */
- collides = false;
- }
-
- pfree(rpath);
- } while (collides);
-
- return rlocator.locator.relNumber;
-}
-
-/*
* SQL callable interface for GetNewOidWithIndex(). Outside of initdb's
* direct insertions into catalog tables, and recovering from corruption, this
* should rarely be needed.
@@ -678,3 +583,57 @@ pg_stop_making_pinned_objects(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+
+#ifdef USE_ASSERT_CHECKING
+
+/*
+ * Assert that no file already exists on disk for the given relfilenumber.
+ */
+void
+AssertRelfileNumberFileNotExists(Oid spcoid, RelFileNumber relnumber,
+ char relpersistence)
+{
+ RelFileLocatorBackend rlocator;
+ char *rpath;
+ BackendId backend;
+
+ /*
+ * If we ever get here during pg_upgrade, there's something wrong; all
+ * relfilenumber assignments during a binary-upgrade run should be
+ * determined by commands in the dump script.
+ */
+ Assert(!IsBinaryUpgrade);
+
+ switch (relpersistence)
+ {
+ case RELPERSISTENCE_TEMP:
+ backend = BackendIdForTempRelations();
+ break;
+ case RELPERSISTENCE_UNLOGGED:
+ case RELPERSISTENCE_PERMANENT:
+ backend = InvalidBackendId;
+ break;
+ default:
+ elog(ERROR, "invalid relpersistence: %c", relpersistence);
+ return; /* placate compiler */
+ }
+
+ /* this logic should match RelationInitPhysicalAddr */
+ rlocator.locator.spcOid = spcoid ? spcoid : MyDatabaseTableSpace;
+ rlocator.locator.dbOid =
+ (rlocator.locator.spcOid == GLOBALTABLESPACE_OID) ? InvalidOid :
+ MyDatabaseId;
+
+ /*
+ * The relpath will vary based on the backend ID, so we must initialize
+ * that properly here to make sure that any collisions based on filename
+ * are properly detected.
+ */
+ rlocator.backend = backend;
+
+ /* check for existing file of same name. */
+ rpath = relpath(rlocator, MAIN_FORKNUM);
+
+ Assert(access(rpath, F_OK) != 0);
+}
+#endif
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index e770ea6..30fd15c 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -345,7 +345,12 @@ heap_create(const char *relname,
* with oid same as relid.
*/
if (!RelFileNumberIsValid(relfilenumber))
- relfilenumber = relid;
+ {
+ relfilenumber = GetNewRelFileNumber();
+ AssertRelfileNumberFileNotExists(reltablespace,
+ relfilenumber,
+ relpersistence);
+ }
}
/*
@@ -898,7 +903,7 @@ InsertPgClassTuple(Relation pg_class_desc,
values[Anum_pg_class_reloftype - 1] = ObjectIdGetDatum(rd_rel->reloftype);
values[Anum_pg_class_relowner - 1] = ObjectIdGetDatum(rd_rel->relowner);
values[Anum_pg_class_relam - 1] = ObjectIdGetDatum(rd_rel->relam);
- values[Anum_pg_class_relfilenode - 1] = ObjectIdGetDatum(rd_rel->relfilenode);
+ values[Anum_pg_class_relfilenode - 1] = Int64GetDatum(rd_rel->relfilenode);
values[Anum_pg_class_reltablespace - 1] = ObjectIdGetDatum(rd_rel->reltablespace);
values[Anum_pg_class_relpages - 1] = Int32GetDatum(rd_rel->relpages);
values[Anum_pg_class_reltuples - 1] = Float4GetDatum(rd_rel->reltuples);
@@ -1170,12 +1175,7 @@ heap_create_with_catalog(const char *relname,
if (shared_relation && reltablespace != GLOBALTABLESPACE_OID)
elog(ERROR, "shared relations must be placed in pg_global tablespace");
- /*
- * Allocate an OID for the relation, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the relation, unless we were told what to use. */
if (!OidIsValid(relid))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -1229,8 +1229,8 @@ heap_create_with_catalog(const char *relname,
}
if (!OidIsValid(relid))
- relid = GetNewRelFileNumber(reltablespace, pg_class_desc,
- relpersistence);
+ relid = GetNewOidWithIndex(pg_class_desc, ClassOidIndexId,
+ Anum_pg_class_oid);
}
/*
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index c5d463a..3402f49 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -900,12 +900,7 @@ index_create(Relation heapRelation,
collationObjectId,
classObjectId);
- /*
- * Allocate an OID for the index, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the index, unless we were told what to use. */
if (!OidIsValid(indexRelationId))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -937,8 +932,8 @@ index_create(Relation heapRelation,
}
else
{
- indexRelationId =
- GetNewRelFileNumber(tableSpaceId, pg_class, relpersistence);
+ indexRelationId = GetNewOidWithIndex(pg_class, ClassOidIndexId,
+ Anum_pg_class_oid);
}
}
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index d708af1..c1bd4f8 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -968,6 +968,10 @@ smgr_redo(XLogReaderState *record)
xl_smgr_create *xlrec = (xl_smgr_create *) XLogRecGetData(record);
SMgrRelation reln;
+ if (xlrec->rlocator.relNumber > ShmemVariableCache->nextRelFileNumber)
+ elog(ERROR, "unexpected relnumber " INT64_FORMAT " that is bigger than nextRelFileNumber " INT64_FORMAT,
+ xlrec->rlocator.relNumber, ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
smgrcreate(reln, xlrec->forkNum, true);
}
@@ -981,6 +985,10 @@ smgr_redo(XLogReaderState *record)
int nforks = 0;
bool need_fsm_vacuum = false;
+ if (xlrec->rlocator.relNumber > ShmemVariableCache->nextRelFileNumber)
+ elog(ERROR, "unexpected relnumber " INT64_FORMAT " that is bigger than nextRelFileNumber " INT64_FORMAT,
+ xlrec->rlocator.relNumber, ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
/*
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index f2947ea..f99ba7c 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14346,11 +14346,17 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
}
/*
- * Relfilenumbers are not unique in databases across tablespaces, so we
- * need to allocate a new one in the new tablespace.
- */
- newrelfilenumber = GetNewRelFileNumber(newTableSpace, NULL,
- rel->rd_rel->relpersistence);
+ * Generate a new relfilenumber. We cannot reuse the old relfilenumber
+ * because of the possibility that the relation will be moved back to the
+ * original tablespace before the next checkpoint. At that point, the
+ * first segment of the main fork won't have been unlinked yet, and an
+ * attempt to create new relation storage with that same relfilenumber
+ * will fail.
+ */
+ newrelfilenumber = GetNewRelFileNumber();
+ AssertRelfileNumberFileNotExists(newTableSpace,
+ newrelfilenumber,
+ rel->rd_rel->relpersistence);
/* Open old and new relation */
newrlocator = rel->rd_locator;
diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index 96af175..76d62df 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -883,12 +883,12 @@ _read${n}(void)
print $off "\tWRITE_UINT_FIELD($f);\n";
print $rff "\tREAD_UINT_FIELD($f);\n" unless $no_read;
}
- elsif ($t eq 'uint64')
+ elsif ($t eq 'uint64' || $t eq 'RelFileNumber')
{
print $off "\tWRITE_UINT64_FIELD($f);\n";
print $rff "\tREAD_UINT64_FIELD($f);\n" unless $no_read;
}
- elsif ($t eq 'Oid' || $t eq 'RelFileNumber')
+ elsif ($t eq 'Oid')
{
print $off "\tWRITE_OID_FIELD($f);\n";
print $rff "\tREAD_OID_FIELD($f);\n" unless $no_read;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 2b85f97..1149e98 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -41,6 +41,11 @@ static void outChar(StringInfo str, char c);
#define WRITE_INT_FIELD(fldname) \
appendStringInfo(str, " :" CppAsString(fldname) " %d", node->fldname)
+/* Write a signed 64-bit integer field (anything written with INT64_FORMAT) */
+#define WRITE_INT64_FIELD(fldname) \
+ appendStringInfo(str, " :" CppAsString(fldname) " " INT64_FORMAT, \
+ node->fldname)
+
/* Write an unsigned integer field (anything written as ":fldname %u") */
#define WRITE_UINT_FIELD(fldname) \
appendStringInfo(str, " :" CppAsString(fldname) " %u", node->fldname)
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index c5c6a2b..7029604 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -154,6 +154,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
break;
case XLOG_NOOP:
case XLOG_NEXTOID:
+ case XLOG_NEXT_RELFILENUMBER:
case XLOG_SWITCH:
case XLOG_BACKUP_END:
case XLOG_PARAMETER_CHANGE:
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 88a37fd..99580a4 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -4869,7 +4869,7 @@ DisplayMapping(HTAB *tuplecid_data)
hash_seq_init(&hstat, tuplecid_data);
while ((ent = (ReorderBufferTupleCidEnt *) hash_seq_search(&hstat)) != NULL)
{
- elog(DEBUG3, "mapping: node: %u/%u/%u tid: %u/%u cmin: %u, cmax: %u",
+ elog(DEBUG3, "mapping: node: %u/%u/" INT64_FORMAT " tid: %u/%u cmin: %u, cmax: %u",
ent->key.rlocator.dbOid,
ent->key.rlocator.spcOid,
ent->key.rlocator.relNumber,
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index af4dab7..172225b 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -273,7 +273,7 @@ restart:
BlockNumber blknum;
BufferGetTag(buf, &rlocator, &forknum, &blknum);
- elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
+ elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/" INT64_FORMAT,
blknum, rlocator.spcOid, rlocator.dbOid, rlocator.relNumber);
/* make sure we hold an exclusive lock */
diff --git a/src/backend/storage/lmgr/lwlocknames.txt b/src/backend/storage/lmgr/lwlocknames.txt
index 6c7cf6c..3c5d041 100644
--- a/src/backend/storage/lmgr/lwlocknames.txt
+++ b/src/backend/storage/lmgr/lwlocknames.txt
@@ -53,3 +53,4 @@ XactTruncationLock 44
# 45 was XactTruncationLock until removal of BackendRandomLock
WrapLimitsVacuumLock 46
NotifyQueueTailLock 47
+RelFileNumberGenLock 48
\ No newline at end of file
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 3998296..a88fef2 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -257,6 +257,13 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* next checkpoint, we prevent reassignment of the relfilenumber until it's
* safe, because relfilenumber assignment skips over any existing file.
*
+ * XXX. Although all of this was true when relfilenumbers were 32 bits wide,
+ * they are now 56 bits wide and do not wrap around, so in the future we can
+ * change the code to immediately unlink the first segment of the relation
+ * along with all the others. We still do reuse relfilenumbers when createdb()
+ * is performed using the file-copy method or during movedb(), but the scenario
+ * described above can only happen when creating a new relation.
+ *
* We do not need to go through this dance for temp relations, though, because
* we never make WAL entries for temp rels, and so a temp rel poses no threat
* to the health of a regular rel that has taken over its relfilenumber.
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index c1a5feb..ed46ac3 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -154,7 +154,7 @@ smgropen(RelFileLocator rlocator, BackendId backend)
/* First time through: initialize the hash table */
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileLocatorBackend);
+ ctl.keysize = SizeOfRelFileLocatorBackend;
ctl.entrysize = sizeof(SMgrRelationData);
SMgrRelationHash = hash_create("smgr relation table", 400,
&ctl, HASH_ELEM | HASH_BLOBS);
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 34efa12..31d7471 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -878,7 +878,7 @@ pg_relation_filenode(PG_FUNCTION_ARGS)
if (!RelFileNumberIsValid(result))
PG_RETURN_NULL();
- PG_RETURN_OID(result);
+ PG_RETURN_INT64(result);
}
/*
@@ -898,9 +898,16 @@ Datum
pg_filenode_relation(PG_FUNCTION_ARGS)
{
Oid reltablespace = PG_GETARG_OID(0);
- RelFileNumber relfilenumber = PG_GETARG_OID(1);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(1);
Oid heaprel;
+ /* check whether the relfilenumber is within a valid range */
+ if ((relfilenumber) < 0 || (relfilenumber) > MAX_RELFILENUMBER)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("relfilenumber " INT64_FORMAT " is out of range",
+ (relfilenumber))));
+
/* test needed so RelidByRelfilenumber doesn't misbehave */
if (!RelFileNumberIsValid(relfilenumber))
PG_RETURN_NULL();
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 797f5f5..ff8f0c2 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -17,6 +17,7 @@
#include "catalog/pg_type.h"
#include "commands/extension.h"
#include "miscadmin.h"
+#include "storage/relfilelocator.h"
#include "utils/array.h"
#include "utils/builtins.h"
@@ -29,6 +30,15 @@ do { \
errmsg("function can only be called when server is in binary upgrade mode"))); \
} while (0)
+#define CHECK_RELFILENUMBER_RANGE(relfilenumber) \
+do { \
+ if ((relfilenumber) < 0 || (relfilenumber) > MAX_RELFILENUMBER) \
+ ereport(ERROR, \
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE), \
+ errmsg("relfilenumber " INT64_FORMAT " is out of range", \
+ (relfilenumber)))); \
+} while (0)
+
Datum
binary_upgrade_set_next_pg_tablespace_oid(PG_FUNCTION_ARGS)
{
@@ -98,10 +108,12 @@ binary_upgrade_set_next_heap_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_heap_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_heap_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -120,10 +132,12 @@ binary_upgrade_set_next_index_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_index_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_index_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -142,10 +156,12 @@ binary_upgrade_set_next_toast_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_toast_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_toast_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index bdb771d..44c14b8 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -3708,8 +3708,7 @@ RelationSetNewRelfilenumber(Relation relation, char persistence)
RelFileLocator newrlocator;
/* Allocate a new relfilenumber */
- newrelfilenumber = GetNewRelFileNumber(relation->rd_rel->reltablespace,
- NULL, persistence);
+ newrelfilenumber = GetNewRelFileNumber();
/*
* Get a writable copy of the pg_class tuple for the given relation.
diff --git a/src/backend/utils/cache/relfilenumbermap.c b/src/backend/utils/cache/relfilenumbermap.c
index c4245d5..b09266a 100644
--- a/src/backend/utils/cache/relfilenumbermap.c
+++ b/src/backend/utils/cache/relfilenumbermap.c
@@ -32,11 +32,10 @@
static HTAB *RelfilenumberMapHash = NULL;
/* built first time through in InitializeRelfilenumberMap */
-static ScanKeyData relfilenumber_skey[2];
+static ScanKeyData relfilenumber_skey[1];
typedef struct
{
- Oid reltablespace;
RelFileNumber relfilenumber;
} RelfilenumberMapKey;
@@ -88,7 +87,6 @@ static void
InitializeRelfilenumberMap(void)
{
HASHCTL ctl;
- int i;
/* Make sure we've initialized CacheMemoryContext. */
if (CacheMemoryContext == NULL)
@@ -97,18 +95,13 @@ InitializeRelfilenumberMap(void)
/* build skey */
MemSet(&relfilenumber_skey, 0, sizeof(relfilenumber_skey));
- for (i = 0; i < 2; i++)
- {
- fmgr_info_cxt(F_OIDEQ,
- &relfilenumber_skey[i].sk_func,
- CacheMemoryContext);
- relfilenumber_skey[i].sk_strategy = BTEqualStrategyNumber;
- relfilenumber_skey[i].sk_subtype = InvalidOid;
- relfilenumber_skey[i].sk_collation = InvalidOid;
- }
-
- relfilenumber_skey[0].sk_attno = Anum_pg_class_reltablespace;
- relfilenumber_skey[1].sk_attno = Anum_pg_class_relfilenode;
+ fmgr_info_cxt(F_INT8EQ,
+ &relfilenumber_skey[0].sk_func,
+ CacheMemoryContext);
+ relfilenumber_skey[0].sk_strategy = BTEqualStrategyNumber;
+ relfilenumber_skey[0].sk_subtype = InvalidOid;
+ relfilenumber_skey[0].sk_collation = InvalidOid;
+ relfilenumber_skey[0].sk_attno = Anum_pg_class_relfilenode;
/*
* Only create the RelfilenumberMapHash now, so we don't end up partially
@@ -129,7 +122,7 @@ InitializeRelfilenumberMap(void)
}
/*
- * Map a relation's (tablespace, relfilenumber) to a relation's oid and cache
+ * Map a relation's relfilenumber to a relation's oid and cache
* the result.
*
* Returns InvalidOid if no relation matching the criteria could be found.
@@ -143,18 +136,13 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
SysScanDesc scandesc;
Relation relation;
HeapTuple ntp;
- ScanKeyData skey[2];
+ ScanKeyData skey[1];
Oid relid;
if (RelfilenumberMapHash == NULL)
InitializeRelfilenumberMap();
- /* pg_class will show 0 when the value is actually MyDatabaseTableSpace */
- if (reltablespace == MyDatabaseTableSpace)
- reltablespace = 0;
-
MemSet(&key, 0, sizeof(key));
- key.reltablespace = reltablespace;
key.relfilenumber = relfilenumber;
/*
@@ -195,14 +183,13 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
memcpy(skey, relfilenumber_skey, sizeof(skey));
/* set scan arguments */
- skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
- skey[1].sk_argument = ObjectIdGetDatum(relfilenumber);
+ skey[0].sk_argument = Int64GetDatum(relfilenumber);
scandesc = systable_beginscan(relation,
- ClassTblspcRelfilenodeIndexId,
+ ClassRelfilenodeIndexId,
true,
NULL,
- 2,
+ 1,
skey);
found = false;
@@ -213,11 +200,10 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
if (found)
elog(ERROR,
- "unexpected duplicate for tablespace %u, relfilenumber %u",
- reltablespace, relfilenumber);
+ "unexpected duplicate for relfilenumber " INT64_FORMAT,
+ relfilenumber);
found = true;
- Assert(classform->reltablespace == reltablespace);
Assert(classform->relfilenode == relfilenumber);
relid = classform->oid;
}
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index 781f8b8..3c1fef4 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -79,8 +79,8 @@ pg_control_system(PG_FUNCTION_ARGS)
Datum
pg_control_checkpoint(PG_FUNCTION_ARGS)
{
- Datum values[18];
- bool nulls[18];
+ Datum values[19];
+ bool nulls[19];
TupleDesc tupdesc;
HeapTuple htup;
ControlFileData *ControlFile;
@@ -129,6 +129,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
XIDOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 18, "checkpoint_time",
TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 19, "next_relfilenumber",
+ INT8OID, -1, 0);
tupdesc = BlessTupleDesc(tupdesc);
/* Read the control file. */
@@ -202,6 +204,9 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
values[17] = TimestampTzGetDatum(time_t_to_timestamptz(ControlFile->checkPointCopy.time));
nulls[17] = false;
+ values[18] = Int64GetDatum(ControlFile->checkPointCopy.nextRelFileNumber);
+ nulls[18] = false;
+
htup = heap_form_tuple(tupdesc, values, nulls);
PG_RETURN_DATUM(HeapTupleGetDatum(htup));
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index dc20122..30933fd 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -489,9 +489,7 @@ main(int argc, char *argv[])
mode = PG_MODE_ENABLE;
break;
case 'f':
- if (!option_parse_int(optarg, "-f/--filenode", 0,
- INT_MAX,
- NULL))
+ if (!option_parse_relfilenumber(optarg, "-f/--filenode"))
exit(1);
only_filenode = pstrdup(optarg);
break;
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index c390ec5..f727078 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -250,6 +250,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
EpochFromFullTransactionId(ControlFile->checkPointCopy.nextXid),
XidFromFullTransactionId(ControlFile->checkPointCopy.nextXid));
+ printf(_("Latest checkpoint's NextRelFileNumber: " INT64_FORMAT "\n"),
+ ControlFile->checkPointCopy.nextRelFileNumber);
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index e4fdb6b..5179642 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4842,16 +4842,16 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
relkind = *PQgetvalue(upgrade_res, 0, PQfnumber(upgrade_res, "relkind"));
- relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "relfilenode")));
+ relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "relfilenode")));
toast_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "reltoastrelid")));
- toast_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_relfilenode")));
+ toast_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_relfilenode")));
toast_index_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "indexrelid")));
- toast_index_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_index_relfilenode")));
+ toast_index_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_index_relfilenode")));
appendPQExpBufferStr(upgrade_buffer,
"\n-- For binary upgrade, must preserve pg_class oids and relfilenodes\n");
@@ -4869,7 +4869,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
*/
if (RelFileNumberIsValid(relfilenumber) && relkind != RELKIND_PARTITIONED_TABLE)
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
/*
@@ -4883,7 +4883,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_toast_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_relfilenumber);
/* every toast table has an index */
@@ -4891,7 +4891,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_index_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_index_relfilenumber);
}
@@ -4904,7 +4904,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
pg_class_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
}
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 269ed64..8be5e66 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -538,7 +538,7 @@ isRelDataFile(const char *path)
segNo = 0;
matched = false;
- nmatch = sscanf(path, "global/%u.%u", &rlocator.relNumber, &segNo);
+ nmatch = sscanf(path, "global/" INT64_FORMAT ".%u", &rlocator.relNumber, &segNo);
if (nmatch == 1 || nmatch == 2)
{
rlocator.spcOid = GLOBALTABLESPACE_OID;
@@ -547,7 +547,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "base/%u/%u.%u",
+ nmatch = sscanf(path, "base/%u/" INT64_FORMAT ".%u",
&rlocator.dbOid, &rlocator.relNumber, &segNo);
if (nmatch == 2 || nmatch == 3)
{
@@ -556,7 +556,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/%u.%u",
+ nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/" INT64_FORMAT ".%u",
&rlocator.spcOid, &rlocator.dbOid, &rlocator.relNumber,
&segNo);
if (nmatch == 3 || nmatch == 4)
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index df374ce..e97ba4d 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -399,8 +399,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
i_reloid,
i_indtable,
i_toastheap,
- i_relfilenumber,
i_reltablespace;
+ RelFileNumber i_relfilenumber;
char query[QUERY_ALLOC];
char *last_namespace = NULL,
*last_tablespace = NULL;
@@ -527,7 +527,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
relname = PQgetvalue(res, relnum, i_relname);
curr->relname = pg_strdup(relname);
- curr->relfilenumber = atooid(PQgetvalue(res, relnum, i_relfilenumber));
+ curr->relfilenumber =
+ atorelnumber(PQgetvalue(res, relnum, i_relfilenumber));
curr->tblsp_alloc = false;
/* Is the tablespace oid non-default? */
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index 115faa2..7ab1bcc 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -15,10 +15,8 @@
* oids are the same between old and new clusters. This is important
* because toast oids are stored as toast pointers in user tables.
*
- * While pg_class.oid and pg_class.relfilenode are initially the same in a
- * cluster, they can diverge due to CLUSTER, REINDEX, or VACUUM FULL. We
- * control assignments of pg_class.relfilenode because we want the filenames
- * to match between the old and new cluster.
+ * We control assignments of pg_class.relfilenode because we want the
+ * filenames to match between the old and new cluster.
*
* We control assignment of pg_tablespace.oid because we want the oid to match
* between the old and new cluster.
diff --git a/src/bin/pg_upgrade/relfilenumber.c b/src/bin/pg_upgrade/relfilenumber.c
index c3c7402..c8b10e4 100644
--- a/src/bin/pg_upgrade/relfilenumber.c
+++ b/src/bin/pg_upgrade/relfilenumber.c
@@ -190,14 +190,14 @@ transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_fro
else
snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
- snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
+ snprintf(old_file, sizeof(old_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->old_tablespace,
map->old_tablespace_suffix,
map->db_oid,
map->relfilenumber,
type_suffix,
extent_suffix);
- snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
+ snprintf(new_file, sizeof(new_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->new_tablespace,
map->new_tablespace_suffix,
map->db_oid,
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 6528113..9aca583 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -884,7 +884,7 @@ main(int argc, char **argv)
}
break;
case 'R':
- if (sscanf(optarg, "%u/%u/%u",
+ if (sscanf(optarg, "%u/%u/" INT64_FORMAT,
&config.filter_by_relation.spcOid,
&config.filter_by_relation.dbOid,
&config.filter_by_relation.relNumber) != 3 ||
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 1b6b620..0774e3f 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -149,10 +149,10 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
Assert(dbOid == 0);
Assert(backendId == InvalidBackendId);
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("global/%u_%s",
+ path = psprintf("global/" INT64_FORMAT "_%s",
relNumber, forkNames[forkNumber]);
else
- path = psprintf("global/%u", relNumber);
+ path = psprintf("global/" INT64_FORMAT, relNumber);
}
else if (spcOid == DEFAULTTABLESPACE_OID)
{
@@ -160,21 +160,21 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/%u_%s",
+ path = psprintf("base/%u/" INT64_FORMAT "_%s",
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/%u",
+ path = psprintf("base/%u/" INT64_FORMAT,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/t%d_%u_%s",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT "_%s",
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/t%d_%u",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT,
dbOid, backendId, relNumber);
}
}
@@ -184,24 +184,24 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber);
}
diff --git a/src/fe_utils/option_utils.c b/src/fe_utils/option_utils.c
index abea881..7d49359 100644
--- a/src/fe_utils/option_utils.c
+++ b/src/fe_utils/option_utils.c
@@ -82,3 +82,42 @@ option_parse_int(const char *optarg, const char *optname,
*result = val;
return true;
}
+
+/*
+ * option_parse_relfilenumber
+ *
+ * Parse relfilenumber value for an option. If the parsing is successful,
+ * returns true; if parsing fails, reports an error and returns false.
+ */
+bool
+option_parse_relfilenumber(const char *optarg, const char *optname)
+{
+ char *endptr;
+ int64 val;
+
+ errno = 0;
+ val = strtoi64(optarg, &endptr, 10);
+
+ /*
+ * Skip any trailing whitespace; if anything but whitespace remains before
+ * the terminating character, fail.
+ */
+ while (*endptr != '\0' && isspace((unsigned char) *endptr))
+ endptr++;
+
+ if (*endptr != '\0')
+ {
+ pg_log_error("invalid value \"%s\" for option %s",
+ optarg, optname);
+ return false;
+ }
+
+ if (val < 0 || val > MAX_RELFILENUMBER)
+ {
+ pg_log_error("%s must be in range " INT64_FORMAT ".." INT64_FORMAT,
+ optname, (RelFileNumber) 0, MAX_RELFILENUMBER);
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 775471d..3b38937 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -213,6 +213,10 @@ typedef struct VariableCacheData
*/
Oid nextOid; /* next OID to assign */
uint32 oidCount; /* OIDs available before must do XLOG work */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber to assign */
+ RelFileNumber loggedRelFileNumber; /* last logged relfilenumber */
+ XLogRecPtr loggedRelFileNumberRecPtr; /* xlog record pointer w.r.t.
+ * loggedRelFileNumber */
/*
* These fields are protected by XidGenLock.
@@ -293,6 +297,8 @@ extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
extern bool ForceTransactionIdLimitUpdate(void);
extern Oid GetNewObjectId(void);
+extern RelFileNumber GetNewRelFileNumber(void);
+extern void SetNextRelFileNumber(RelFileNumber relnumber);
extern void StopGeneratingPinnedObjectIds(void);
#ifdef USE_ASSERT_CHECKING
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index cd674c3..78660f1 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -234,6 +234,8 @@ extern void CreateCheckPoint(int flags);
extern bool CreateRestartPoint(int flags);
extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN);
extern void XLogPutNextOid(Oid nextOid);
+extern void LogNextRelFileNumber(RelFileNumber nextrelnumber,
+ XLogRecPtr *prevrecptr);
extern XLogRecPtr XLogRestorePoint(const char *rpName);
extern void UpdateFullPageWrites(void);
extern void GetFullPageWriteInfo(XLogRecPtr *RedoRecPtr_p, bool *doPageWrites_p);
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index e1c85f9..be6ba13 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -38,8 +38,14 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
-extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
- Relation pg_class,
- char relpersistence);
+
+#ifdef USE_ASSERT_CHECKING
+extern void AssertRelfileNumberFileNotExists(Oid spcoid,
+ RelFileNumber relnumber,
+ char relpersistence);
+#else
+#define AssertRelfileNumberFileNotExists(spcoid, relnumber, relpersistence) \
+ ((void)true)
+#endif
#endif /* CATALOG_H */
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index e1f4eef..cacfc11 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -34,6 +34,13 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* oid */
Oid oid;
+ /* access method; 0 if not a table / index */
+ Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
+
+ /* identifier of physical storage file */
+ /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
+ int64 relfilenode BKI_DEFAULT(0);
+
/* class name */
NameData relname;
@@ -49,13 +56,6 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* class owner */
Oid relowner BKI_DEFAULT(POSTGRES) BKI_LOOKUP(pg_authid);
- /* access method; 0 if not a table / index */
- Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
-
- /* identifier of physical storage file */
- /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
- Oid relfilenode BKI_DEFAULT(0);
-
/* identifier of table space for relation (0 means default for database) */
Oid reltablespace BKI_DEFAULT(0) BKI_LOOKUP_OPT(pg_tablespace);
@@ -154,7 +154,7 @@ typedef FormData_pg_class *Form_pg_class;
DECLARE_UNIQUE_INDEX_PKEY(pg_class_oid_index, 2662, ClassOidIndexId, on pg_class using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_class_relname_nsp_index, 2663, ClassNameNspIndexId, on pg_class using btree(relname name_ops, relnamespace oid_ops));
-DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode oid_ops));
+DECLARE_INDEX(pg_class_relfilenode_index, 3455, ClassRelfilenodeIndexId, on pg_class using btree(relfilenode int8_ops));
#ifdef EXPOSE_TO_CLIENT_CODE
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 06368e2..096222f 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -41,6 +41,7 @@ typedef struct CheckPoint
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
FullTransactionId nextXid; /* next free transaction ID */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
@@ -78,6 +79,7 @@ typedef struct CheckPoint
#define XLOG_FPI 0xB0
/* 0xC0 is used in Postgres 9.5-11 */
#define XLOG_OVERWRITE_CONTRECORD 0xD0
+#define XLOG_NEXT_RELFILENUMBER 0xE0
/*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 2e41f4d..f6a5b49 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7321,11 +7321,11 @@
proname => 'pg_indexes_size', provolatile => 'v', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_indexes_size' },
{ oid => '2999', descr => 'filenode identifier of relation',
- proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'oid',
+ proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_relation_filenode' },
{ oid => '3454', descr => 'relation OID for filenode and tablespace',
proname => 'pg_filenode_relation', provolatile => 's',
- prorettype => 'regclass', proargtypes => 'oid oid',
+ prorettype => 'regclass', proargtypes => 'oid int8',
prosrc => 'pg_filenode_relation' },
{ oid => '3034', descr => 'file path of relation',
proname => 'pg_relation_filepath', provolatile => 's', prorettype => 'text',
@@ -11191,15 +11191,15 @@
prosrc => 'binary_upgrade_set_missing_value' },
{ oid => '4545', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_heap_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_heap_relfilenode' },
{ oid => '4546', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_index_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_index_relfilenode' },
{ oid => '4547', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_toast_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_toast_relfilenode' },
{ oid => '4548', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_pg_tablespace_oid', provolatile => 'v',
diff --git a/src/include/fe_utils/option_utils.h b/src/include/fe_utils/option_utils.h
index 03c09fd..2508a61 100644
--- a/src/include/fe_utils/option_utils.h
+++ b/src/include/fe_utils/option_utils.h
@@ -22,5 +22,7 @@ extern void handle_help_version_opts(int argc, char *argv[],
extern bool option_parse_int(const char *optarg, const char *optname,
int min_range, int max_range,
int *result);
+extern bool option_parse_relfilenumber(const char *optarg,
+ const char *optname);
#endif /* OPTION_UTILS_H */
diff --git a/src/include/postgres_ext.h b/src/include/postgres_ext.h
index c9774fa..448bc6e 100644
--- a/src/include/postgres_ext.h
+++ b/src/include/postgres_ext.h
@@ -48,11 +48,18 @@ typedef PG_INT64_TYPE pg_int64;
/*
* RelFileNumber data type identifies the specific relation file name.
+ * RelFileNumber is unique within a cluster.
*/
-typedef Oid RelFileNumber;
-#define InvalidRelFileNumber ((RelFileNumber) InvalidOid)
+typedef pg_int64 RelFileNumber;
+
+#define InvalidRelFileNumber ((RelFileNumber) 0)
+#define FirstNormalRelFileNumber ((RelFileNumber) 100000)
#define RelFileNumberIsValid(relnumber) \
((bool) ((relnumber) != InvalidRelFileNumber))
+#define atorelnumber(x) ((RelFileNumber) strtoull((x), NULL, 10))
+
+/* Maximum value of the relfilenumber, which is 56 bits wide. */
+#define MAX_RELFILENUMBER INT64CONST(0x00FFFFFFFFFFFFFF)
/*
* Identifiers of error message fields. Kept here to keep common
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 092e959..cdb42aa 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -92,29 +92,74 @@ typedef struct buftag
{
Oid spcOid; /* tablespace oid */
Oid dbOid; /* database oid */
- RelFileNumber relNumber; /* relation file number */
- ForkNumber forkNum; /* fork number */
+
+ /*
+ * Represents the relfilenumber and the fork number. The high 8 bits of
+ * the first 32-bit integer hold the fork number, while the remaining 24
+ * bits of the first integer and all 32 bits of the second integer hold
+ * the relfilenumber, making the relfilenumber 56 bits wide. We use 56
+ * bits rather than a full 64 so that the size of the BufferTag does not
+ * grow, and we use two 32-bit integers rather than a single 64-bit one
+ * to avoid 8-byte alignment padding in the BufferTag structure.
+ */
+ uint32 relForkDetails[2];
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
+/* High relNumber bits in relForkDetails[0] */
+#define BUFTAG_RELNUM_HIGH_BITS 24
+
+/* Low relNumber bits in relForkDetails[1] */
+#define BUFTAG_RELNUM_LOW_BITS 32
+
+/* Mask to fetch high bits of relNumber from relForkDetails[0] */
+#define BUFTAG_RELNUM_HIGH_MASK ((1U << BUFTAG_RELNUM_HIGH_BITS) - 1)
+
+/* Mask to fetch low bits of relNumber from relForkDetails[1] */
+#define BUFTAG_RELNUM_LOW_MASK 0XFFFFFFFF
+
static inline RelFileNumber
BufTagGetRelNumber(const BufferTag *tag)
{
- return tag->relNumber;
+ uint64 relnum;
+
+ relnum = ((uint64) (tag->relForkDetails[0] & BUFTAG_RELNUM_HIGH_MASK)) <<
+ BUFTAG_RELNUM_LOW_BITS;
+ relnum |= tag->relForkDetails[1];
+
+ Assert(relnum <= MAX_RELFILENUMBER);
+ return (RelFileNumber) relnum;
}
static inline ForkNumber
BufTagGetForkNum(const BufferTag *tag)
{
- return tag->forkNum;
+ int8 ret;
+
+ StaticAssertStmt(MAX_FORKNUM <= INT8_MAX,
+ "MAX_FORKNUM can't be greater than INT8_MAX");
+
+ ret = (int8) (tag->relForkDetails[0] >> BUFTAG_RELNUM_HIGH_BITS);
+ return (ForkNumber) ret;
}
static inline void
BufTagSetRelForkDetails(BufferTag *tag, RelFileNumber relnumber,
ForkNumber forknum)
{
- tag->relNumber = relnumber;
- tag->forkNum = forknum;
+ uint64 relnum;
+
+ Assert(relnumber <= MAX_RELFILENUMBER);
+ Assert(forknum <= MAX_FORKNUM);
+
+ relnum = relnumber;
+
+ tag->relForkDetails[0] = (relnum >> BUFTAG_RELNUM_LOW_BITS) &
+ BUFTAG_RELNUM_HIGH_MASK;
+ tag->relForkDetails[0] |= (forknum << BUFTAG_RELNUM_HIGH_BITS);
+ tag->relForkDetails[1] = relnum & BUFTAG_RELNUM_LOW_MASK;
}
static inline RelFileLocator
@@ -153,9 +198,9 @@ BUFFERTAGS_EQUAL(const BufferTag *tag1, const BufferTag *tag2)
{
return (tag1->spcOid == tag2->spcOid) &&
(tag1->dbOid == tag2->dbOid) &&
- (tag1->relNumber == tag2->relNumber) &&
- (tag1->blockNum == tag2->blockNum) &&
- (tag1->forkNum == tag2->forkNum);
+ (tag1->relForkDetails[0] == tag2->relForkDetails[0]) &&
+ (tag1->relForkDetails[1] == tag2->relForkDetails[1]) &&
+ (tag1->blockNum == tag2->blockNum);
}
static inline bool
diff --git a/src/include/storage/relfilelocator.h b/src/include/storage/relfilelocator.h
index 10f41f3..ef90464 100644
--- a/src/include/storage/relfilelocator.h
+++ b/src/include/storage/relfilelocator.h
@@ -32,10 +32,11 @@
* Nonzero dbOid values correspond to pg_database.oid.
*
* relNumber identifies the specific relation. relNumber corresponds to
- * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
- * to assign new physical files to relations in some situations).
- * Notice that relNumber is only unique within a database in a particular
- * tablespace.
+ * pg_class.relfilenode. Notice that relNumber values are assigned by
+ * GetNewRelFileNumber(), which will only ever assign the same value once
+ * during the lifetime of a cluster. However, since CREATE DATABASE duplicates
+ * the relfilenumbers of the template database, the values are in practice only
+ * unique within a database, not globally.
*
* Note: spcOid must be GLOBALTABLESPACE_OID if and only if dbOid is
* zero. We support shared relations only in the "global" tablespace.
@@ -75,6 +76,9 @@ typedef struct RelFileLocatorBackend
BackendId backend;
} RelFileLocatorBackend;
+#define SizeOfRelFileLocatorBackend \
+ (offsetof(RelFileLocatorBackend, backend) + sizeof(BackendId))
+
#define RelFileLocatorBackendIsTemp(rlocator) \
((rlocator).backend != InvalidBackendId)
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index e3dac16..c0084ae 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -2164,9 +2164,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2175,10 +2174,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+---------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | t | own | child 0 index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | t | own | child 1 index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | t | orig | child 0 index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | t | orig | child 1 index
at_partitioned_id_name_key | t | none | parent index
(6 rows)
@@ -2198,9 +2197,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2209,10 +2207,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+--------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | f | own | parent index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | f | own | parent index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | f | new | parent index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | f | new | parent index
at_partitioned_id_name_key | f | none | parent index
(6 rows)
@@ -2560,7 +2558,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/expected/oidjoins.out b/src/test/regress/expected/oidjoins.out
index 215eb89..af57470 100644
--- a/src/test/regress/expected/oidjoins.out
+++ b/src/test/regress/expected/oidjoins.out
@@ -74,11 +74,11 @@ NOTICE: checking pg_type {typcollation} => pg_collation {oid}
NOTICE: checking pg_attribute {attrelid} => pg_class {oid}
NOTICE: checking pg_attribute {atttypid} => pg_type {oid}
NOTICE: checking pg_attribute {attcollation} => pg_collation {oid}
+NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {relnamespace} => pg_namespace {oid}
NOTICE: checking pg_class {reltype} => pg_type {oid}
NOTICE: checking pg_class {reloftype} => pg_type {oid}
NOTICE: checking pg_class {relowner} => pg_authid {oid}
-NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {reltablespace} => pg_tablespace {oid}
NOTICE: checking pg_class {reltoastrelid} => pg_class {oid}
NOTICE: checking pg_class {relrewrite} => pg_class {oid}
diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql
index e7013f5..c9622f6 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -1478,9 +1478,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1499,9 +1498,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1641,7 +1639,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
--
1.8.3.1
On Thu, Jul 14, 2022 at 5:18 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
Apart from this, I have fixed all the pending issues, which include:
- Change existing macros to inline functions, done in 0001.
- Change the pg_class index from (reltablespace, relfilenode) to just
relfilenode, and update RelidByRelfilenumber() accordingly. In
RelidByRelfilenumber() the hash is now keyed on the relfilenumber
alone, but we still need to pass the tablespace to identify whether
the relation is shared. If we wanted, we could make that a bool
instead, but I don't think that is really needed here.
- Changed the logic of GetNewRelFileNumber() along the lines Robert
described: instead of tracking the count of pending logged
relfilenumbers, I now track the last logged RelFileNumber, which makes
SetNextRelFileNumber a little cleaner but otherwise doesn't make much
difference.
- Added new asserts in the buf_internals.h inline functions to
validate the computed/input relfilenumber values.
While doing some more testing with FirstNormalRelFileNumber set to a
high value (more than 32 bits), I noticed a couple of problems. For
example, relpath is still using the OIDCHARS macro, which assumes a
relfilenumber file name can be at most 10 characters long; that is no
longer true, so we need to change that value to 20 and also carefully
rename the macros and other variables used for this purpose.
Similarly, there was an issue in the buf_internals.h code that fetches
the relfilenumber. I will look into all of these issues and repost the
patch soon.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Mon, Jul 18, 2022 at 4:51 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
While doing some more testing with FirstNormalRelFileNumber set to a
high value (more than 32 bits), I noticed a couple of problems. For
example, relpath is still using the OIDCHARS macro, which assumes a
relfilenumber file name can be at most 10 characters long; that is no
longer true, so we need to change that value to 20 and also carefully
rename the macros and other variables used for this purpose.
Similarly, there was an issue in the buf_internals.h code that fetches
the relfilenumber. I will look into all of these issues and repost the
patch soon.
I have fixed these existing issues. There was also a problem in
pg_dump.c that broke upgrades to the same version when using the
higher range of relfilenumbers.
There was also an issue where a user table's relfilenode from the old
cluster could conflict with a system table in the new cluster. As a
solution, system table objects currently keep the low range of
relfilenumbers when their storage is first created; basically we use
the same relfilenumber as the OID, so that during an upgrade a normal
user table from the old cluster will not conflict with the system
tables in the new cluster. But Robert pointed out to me (in an
off-list chat) a problem with this approach: if in the future we want
to make relfilenumbers completely unique within a cluster by
implementing CREATE DATABASE differently, we can't, because we have
created fixed relfilenodes for the system tables.
I am not sure what we can do to avoid that: even if we avoid it in the
new cluster, the old cluster might already contain non-unique
relfilenodes, so after an upgrade the new cluster would inherit them.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachments:
v10-0001-Convert-buf_internal.h-macros-to-static-inline-f.patchtext/x-patch; charset=US-ASCII; name=v10-0001-Convert-buf_internal.h-macros-to-static-inline-f.patchDownload
From a834e80cb9cebe3bd32d67c4ffc364a43aff8ff8 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Tue, 12 Jul 2022 17:10:04 +0530
Subject: [PATCH v10 1/4] Convert buf_internal.h macros to static inline
functions
Inline functions are more readable than macros, and this will also
help us write cleaner code for the 64-bit relfilenode work, which
requires some complex bitwise operations that are better done inside
inline functions.
---
src/backend/storage/buffer/buf_init.c | 2 +-
src/backend/storage/buffer/bufmgr.c | 16 ++--
src/backend/storage/buffer/localbuf.c | 12 +--
src/include/storage/buf_internals.h | 156 +++++++++++++++++++++-------------
4 files changed, 111 insertions(+), 75 deletions(-)
diff --git a/src/backend/storage/buffer/buf_init.c b/src/backend/storage/buffer/buf_init.c
index 2862e9e..55f646d 100644
--- a/src/backend/storage/buffer/buf_init.c
+++ b/src/backend/storage/buffer/buf_init.c
@@ -116,7 +116,7 @@ InitBufferPool(void)
{
BufferDesc *buf = GetBufferDescriptor(i);
- CLEAR_BUFFERTAG(buf->tag);
+ CLEAR_BUFFERTAG(&buf->tag);
pg_atomic_init_u32(&buf->state, 0);
buf->wait_backend_pgprocno = INVALID_PGPROCNO;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index c7d7abc..24d894e 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -515,7 +515,7 @@ PrefetchSharedBuffer(SMgrRelation smgr_reln,
Assert(BlockNumberIsValid(blockNum));
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr_reln->smgr_rlocator.locator,
+ INIT_BUFFERTAG(&newTag, &smgr_reln->smgr_rlocator.locator,
forkNum, blockNum);
/* determine its hash code and partition lock ID */
@@ -632,7 +632,7 @@ ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockN
ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
ReservePrivateRefCountEntry();
- INIT_BUFFERTAG(tag, rlocator, forkNum, blockNum);
+ INIT_BUFFERTAG(&tag, &rlocator, forkNum, blockNum);
if (BufferIsLocal(recent_buffer))
{
@@ -640,7 +640,7 @@ ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockN
buf_state = pg_atomic_read_u32(&bufHdr->state);
/* Is it still valid and holding the right tag? */
- if ((buf_state & BM_VALID) && BUFFERTAGS_EQUAL(tag, bufHdr->tag))
+ if ((buf_state & BM_VALID) && BUFFERTAGS_EQUAL(&tag, &bufHdr->tag))
{
/* Bump local buffer's ref and usage counts. */
ResourceOwnerRememberBuffer(CurrentResourceOwner, recent_buffer);
@@ -669,7 +669,7 @@ ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockN
else
buf_state = LockBufHdr(bufHdr);
- if ((buf_state & BM_VALID) && BUFFERTAGS_EQUAL(tag, bufHdr->tag))
+ if ((buf_state & BM_VALID) && BUFFERTAGS_EQUAL(&tag, &bufHdr->tag))
{
/*
* It's now safe to pin the buffer. We can't pin first and ask
@@ -1124,7 +1124,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
+ INIT_BUFFERTAG(&newTag, &smgr->smgr_rlocator.locator, forkNum, blockNum);
/* determine its hash code and partition lock ID */
newHash = BufTableHashCode(&newTag);
@@ -1507,7 +1507,7 @@ retry:
buf_state = LockBufHdr(buf);
/* If it's changed while we were waiting for lock, do nothing */
- if (!BUFFERTAGS_EQUAL(buf->tag, oldTag))
+ if (!BUFFERTAGS_EQUAL(&buf->tag, &oldTag))
{
UnlockBufHdr(buf, buf_state);
LWLockRelease(oldPartitionLock);
@@ -1539,7 +1539,7 @@ retry:
* linear scans of the buffer array don't think the buffer is valid.
*/
oldFlags = buf_state & BUF_FLAG_MASK;
- CLEAR_BUFFERTAG(buf->tag);
+ CLEAR_BUFFERTAG(&buf->tag);
buf_state &= ~(BUF_FLAG_MASK | BUF_USAGECOUNT_MASK);
UnlockBufHdr(buf, buf_state);
@@ -3355,7 +3355,7 @@ FindAndDropRelationBuffers(RelFileLocator rlocator, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(bufTag, rlocator, forkNum, curBlock);
+ INIT_BUFFERTAG(&bufTag, &rlocator, forkNum, curBlock);
/* determine its hash code and partition lock ID */
bufHash = BufTableHashCode(&bufTag);
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index 9c03885..91e174e 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -68,7 +68,7 @@ PrefetchLocalBuffer(SMgrRelation smgr, ForkNumber forkNum,
BufferTag newTag; /* identity of requested block */
LocalBufferLookupEnt *hresult;
- INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
+ INIT_BUFFERTAG(&newTag, &smgr->smgr_rlocator.locator, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -117,7 +117,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
bool found;
uint32 buf_state;
- INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
+ INIT_BUFFERTAG(&newTag, &smgr->smgr_rlocator.locator, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -131,7 +131,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
{
b = hresult->id;
bufHdr = GetLocalBufferDescriptor(b);
- Assert(BUFFERTAGS_EQUAL(bufHdr->tag, newTag));
+ Assert(BUFFERTAGS_EQUAL(&bufHdr->tag, &newTag));
#ifdef LBDEBUG
fprintf(stderr, "LB ALLOC (%u,%d,%d) %d\n",
smgr->smgr_rlocator.locator.relNumber, forkNum, blockNum, -b - 1);
@@ -253,7 +253,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
if (!hresult) /* shouldn't happen */
elog(ERROR, "local buffer hash table corrupted");
/* mark buffer invalid just in case hash insert fails */
- CLEAR_BUFFERTAG(bufHdr->tag);
+ CLEAR_BUFFERTAG(&bufHdr->tag);
buf_state &= ~(BM_VALID | BM_TAG_VALID);
pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
}
@@ -354,7 +354,7 @@ DropRelationLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
if (!hresult) /* shouldn't happen */
elog(ERROR, "local buffer hash table corrupted");
/* Mark buffer invalid */
- CLEAR_BUFFERTAG(bufHdr->tag);
+ CLEAR_BUFFERTAG(&bufHdr->tag);
buf_state &= ~BUF_FLAG_MASK;
buf_state &= ~BUF_USAGECOUNT_MASK;
pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
@@ -398,7 +398,7 @@ DropRelationAllLocalBuffers(RelFileLocator rlocator)
if (!hresult) /* shouldn't happen */
elog(ERROR, "local buffer hash table corrupted");
/* Mark buffer invalid */
- CLEAR_BUFFERTAG(bufHdr->tag);
+ CLEAR_BUFFERTAG(&bufHdr->tag);
buf_state &= ~BUF_FLAG_MASK;
buf_state &= ~BUF_USAGECOUNT_MASK;
pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 69e4590..2f0e60e 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -95,28 +95,32 @@ typedef struct buftag
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
-#define CLEAR_BUFFERTAG(a) \
-( \
- (a).rlocator.spcOid = InvalidOid, \
- (a).rlocator.dbOid = InvalidOid, \
- (a).rlocator.relNumber = InvalidRelFileNumber, \
- (a).forkNum = InvalidForkNumber, \
- (a).blockNum = InvalidBlockNumber \
-)
-
-#define INIT_BUFFERTAG(a,xx_rlocator,xx_forkNum,xx_blockNum) \
-( \
- (a).rlocator = (xx_rlocator), \
- (a).forkNum = (xx_forkNum), \
- (a).blockNum = (xx_blockNum) \
-)
-
-#define BUFFERTAGS_EQUAL(a,b) \
-( \
- RelFileLocatorEquals((a).rlocator, (b).rlocator) && \
- (a).blockNum == (b).blockNum && \
- (a).forkNum == (b).forkNum \
-)
+static inline void
+CLEAR_BUFFERTAG(BufferTag *tag)
+{
+ tag->rlocator.spcOid = InvalidOid;
+ tag->rlocator.dbOid = InvalidOid;
+ tag->rlocator.relNumber = InvalidRelFileNumber;
+ tag->forkNum = InvalidForkNumber;
+ tag->blockNum = InvalidBlockNumber;
+}
+
+static inline void
+INIT_BUFFERTAG(BufferTag *tag, const RelFileLocator *rlocator,
+ ForkNumber forkNum, BlockNumber blockNum)
+{
+ tag->rlocator = *rlocator;
+ tag->forkNum = forkNum;
+ tag->blockNum = blockNum;
+}
+
+static inline bool
+BUFFERTAGS_EQUAL(const BufferTag *tag1, const BufferTag *tag2)
+{
+ return RelFileLocatorEquals(tag1->rlocator, tag2->rlocator) &&
+ (tag1->blockNum == tag2->blockNum) &&
+ (tag1->forkNum == tag2->forkNum);
+}
/*
* The shared buffer mapping table is partitioned to reduce contention.
@@ -124,13 +128,24 @@ typedef struct buftag
* hash code with BufTableHashCode(), then apply BufMappingPartitionLock().
* NB: NUM_BUFFER_PARTITIONS must be a power of 2!
*/
-#define BufTableHashPartition(hashcode) \
- ((hashcode) % NUM_BUFFER_PARTITIONS)
-#define BufMappingPartitionLock(hashcode) \
- (&MainLWLockArray[BUFFER_MAPPING_LWLOCK_OFFSET + \
- BufTableHashPartition(hashcode)].lock)
-#define BufMappingPartitionLockByIndex(i) \
- (&MainLWLockArray[BUFFER_MAPPING_LWLOCK_OFFSET + (i)].lock)
+static inline uint32
+BufTableHashPartition(uint32 hashcode)
+{
+ return hashcode % NUM_BUFFER_PARTITIONS;
+}
+
+static inline LWLock *
+BufMappingPartitionLock(uint32 hashcode)
+{
+ return &MainLWLockArray[BUFFER_MAPPING_LWLOCK_OFFSET +
+ BufTableHashPartition(hashcode)].lock;
+}
+
+static inline LWLock *
+BufMappingPartitionLockByIndex(uint32 index)
+{
+ return &MainLWLockArray[BUFFER_MAPPING_LWLOCK_OFFSET + index].lock;
+}
/*
* BufferDesc -- shared descriptor/state data for a single shared buffer.
@@ -220,37 +235,6 @@ typedef union BufferDescPadded
char pad[BUFFERDESC_PAD_TO_SIZE];
} BufferDescPadded;
-#define GetBufferDescriptor(id) (&BufferDescriptors[(id)].bufferdesc)
-#define GetLocalBufferDescriptor(id) (&LocalBufferDescriptors[(id)])
-
-#define BufferDescriptorGetBuffer(bdesc) ((bdesc)->buf_id + 1)
-
-#define BufferDescriptorGetIOCV(bdesc) \
- (&(BufferIOCVArray[(bdesc)->buf_id]).cv)
-#define BufferDescriptorGetContentLock(bdesc) \
- ((LWLock*) (&(bdesc)->content_lock))
-
-extern PGDLLIMPORT ConditionVariableMinimallyPadded *BufferIOCVArray;
-
-/*
- * The freeNext field is either the index of the next freelist entry,
- * or one of these special values:
- */
-#define FREENEXT_END_OF_LIST (-1)
-#define FREENEXT_NOT_IN_LIST (-2)
-
-/*
- * Functions for acquiring/releasing a shared buffer header's spinlock. Do
- * not apply these to local buffers!
- */
-extern uint32 LockBufHdr(BufferDesc *desc);
-#define UnlockBufHdr(desc, s) \
- do { \
- pg_write_barrier(); \
- pg_atomic_write_u32(&(desc)->state, (s) & (~BM_LOCKED)); \
- } while (0)
-
-
/*
* The PendingWriteback & WritebackContext structure are used to keep
* information about pending flush requests to be issued to the OS.
@@ -276,11 +260,63 @@ typedef struct WritebackContext
/* in buf_init.c */
extern PGDLLIMPORT BufferDescPadded *BufferDescriptors;
+extern PGDLLIMPORT ConditionVariableMinimallyPadded *BufferIOCVArray;
extern PGDLLIMPORT WritebackContext BackendWritebackContext;
/* in localbuf.c */
extern PGDLLIMPORT BufferDesc *LocalBufferDescriptors;
+
+static inline BufferDesc *
+GetBufferDescriptor(uint32 id)
+{
+ return &(BufferDescriptors[id]).bufferdesc;
+}
+
+static inline BufferDesc *
+GetLocalBufferDescriptor(uint32 id)
+{
+ return &LocalBufferDescriptors[id];
+}
+
+static inline Buffer
+BufferDescriptorGetBuffer(const BufferDesc *bdesc)
+{
+ return (Buffer) (bdesc->buf_id + 1);
+}
+
+static inline ConditionVariable *
+BufferDescriptorGetIOCV(const BufferDesc *bdesc)
+{
+ return &(BufferIOCVArray[bdesc->buf_id]).cv;
+}
+
+static inline LWLock *
+BufferDescriptorGetContentLock(const BufferDesc *bdesc)
+{
+ return (LWLock *) (&bdesc->content_lock);
+}
+
+/*
+ * The freeNext field is either the index of the next freelist entry,
+ * or one of these special values:
+ */
+#define FREENEXT_END_OF_LIST (-1)
+#define FREENEXT_NOT_IN_LIST (-2)
+
+/*
+ * Functions for acquiring/releasing a shared buffer header's spinlock. Do
+ * not apply these to local buffers!
+ */
+extern uint32 LockBufHdr(BufferDesc *desc);
+
+static inline void
+UnlockBufHdr(BufferDesc *desc, uint32 buf_state)
+{
+ pg_write_barrier();
+ pg_atomic_write_u32(&desc->state, buf_state & (~BM_LOCKED));
+}
+
/* in bufmgr.c */
/*
--
1.8.3.1
v10-0002-Preliminary-refactoring-for-supporting-larger-re.patchtext/x-patch; charset=US-ASCII; name=v10-0002-Preliminary-refactoring-for-supporting-larger-re.patchDownload
From c67af5e52f274571634e4f003ccfb1bb8d844f89 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Wed, 13 Jul 2022 13:12:53 +0530
Subject: [PATCH v10 2/4] Preliminary refactoring for supporting larger
relfilenumber
Currently, relfilenumber is of type Oid and can wrap around, so as
part of the larger patch set we are making it 64 bits wide to avoid
wraparound; that will also make a couple of other things simpler, as
explained in the next patches.
This is a preliminary refactoring patch toward that goal: in
BufferTag, instead of storing a RelFileLocator, we store the
tablespace Oid, database Oid, and relfilenumber directly, so that once
relNumber in RelFileLocator becomes 64 bits the buffer tag's alignment
padding will not change.
---
contrib/pg_buffercache/pg_buffercache_pages.c | 8 +-
contrib/pg_prewarm/autoprewarm.c | 10 +-
src/backend/storage/buffer/bufmgr.c | 140 +++++++++++++++++---------
src/backend/storage/buffer/localbuf.c | 30 ++++--
src/include/storage/buf_internals.h | 64 ++++++++++--
5 files changed, 178 insertions(+), 74 deletions(-)
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 131bd62..c5754ea 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -153,10 +153,10 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenumber = bufHdr->tag.rlocator.relNumber;
- fctx->record[i].reltablespace = bufHdr->tag.rlocator.spcOid;
- fctx->record[i].reldatabase = bufHdr->tag.rlocator.dbOid;
- fctx->record[i].forknum = bufHdr->tag.forkNum;
+ fctx->record[i].relfilenumber = BufTagGetRelNumber(&bufHdr->tag);
+ fctx->record[i].reltablespace = bufHdr->tag.spcOid;
+ fctx->record[i].reldatabase = bufHdr->tag.dbOid;
+ fctx->record[i].forknum = BufTagGetForkNum(&bufHdr->tag);
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
fctx->record[i].pinning_backends = BUF_STATE_GET_REFCOUNT(buf_state);
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index b2d6026..63f0d41 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -630,10 +630,12 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
if (buf_state & BM_TAG_VALID &&
((buf_state & BM_PERMANENT) || dump_unlogged))
{
- block_info_array[num_blocks].database = bufHdr->tag.rlocator.dbOid;
- block_info_array[num_blocks].tablespace = bufHdr->tag.rlocator.spcOid;
- block_info_array[num_blocks].filenumber = bufHdr->tag.rlocator.relNumber;
- block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
+ block_info_array[num_blocks].database = bufHdr->tag.dbOid;
+ block_info_array[num_blocks].tablespace = bufHdr->tag.spcOid;
+ block_info_array[num_blocks].filenumber =
+ BufTagGetRelNumber(&bufHdr->tag);
+ block_info_array[num_blocks].forknum =
+ BufTagGetForkNum(&bufHdr->tag);
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
}
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 24d894e..e6e6ae7 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1647,8 +1647,8 @@ ReleaseAndReadBuffer(Buffer buffer,
{
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
- bufHdr->tag.forkNum == forkNum)
+ BufTagMatchesRelFileLocator(&bufHdr->tag, &relation->rd_locator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
LocalRefCount[-buffer - 1]--;
@@ -1658,8 +1658,8 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetBufferDescriptor(buffer - 1);
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
- bufHdr->tag.forkNum == forkNum)
+ BufTagMatchesRelFileLocator(&bufHdr->tag, &relation->rd_locator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
}
@@ -2000,9 +2000,9 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
- item->tsId = bufHdr->tag.rlocator.spcOid;
- item->relNumber = bufHdr->tag.rlocator.relNumber;
- item->forkNum = bufHdr->tag.forkNum;
+ item->tsId = bufHdr->tag.spcOid;
+ item->relNumber = BufTagGetRelNumber(&bufHdr->tag);
+ item->forkNum = BufTagGetForkNum(&bufHdr->tag);
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2692,6 +2692,7 @@ PrintBufferLeakWarning(Buffer buffer)
char *path;
BackendId backend;
uint32 buf_state;
+ RelFileLocator rlocator;
Assert(BufferIsValid(buffer));
if (BufferIsLocal(buffer))
@@ -2707,8 +2708,10 @@ PrintBufferLeakWarning(Buffer buffer)
backend = InvalidBackendId;
}
+ rlocator = BufTagGetRelFileLocator(&buf->tag);
+
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rlocator, backend, buf->tag.forkNum);
+ path = relpathbackend(rlocator, backend, BufTagGetForkNum(&buf->tag));
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2787,8 +2790,8 @@ BufferGetTag(Buffer buffer, RelFileLocator *rlocator, ForkNumber *forknum,
bufHdr = GetBufferDescriptor(buffer - 1);
/* pinned, so OK to read tag without spinlock */
- *rlocator = bufHdr->tag.rlocator;
- *forknum = bufHdr->tag.forkNum;
+ *rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
+ *forknum = BufTagGetForkNum(&bufHdr->tag);
*blknum = bufHdr->tag.blockNum;
}
@@ -2838,9 +2841,14 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rlocator, InvalidBackendId);
+ {
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&buf->tag);
+ reln = smgropen(rlocator, InvalidBackendId);
+ }
- TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
+ TRACE_POSTGRESQL_BUFFER_FLUSH_START(BufTagGetForkNum(&buf->tag),
buf->tag.blockNum,
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
@@ -2899,7 +2907,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
* bufToWrite is either the shared buffer or a copy, as appropriate.
*/
smgrwrite(reln,
- buf->tag.forkNum,
+ BufTagGetForkNum(&buf->tag),
buf->tag.blockNum,
bufToWrite,
false);
@@ -2920,7 +2928,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
*/
TerminateBufferIO(buf, true, 0);
- TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(buf->tag.forkNum,
+ TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(BufTagGetForkNum(&buf->tag),
buf->tag.blockNum,
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
@@ -3141,15 +3149,15 @@ DropRelationBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
* We could check forkNum and blockNum as well as the rlocator, but
* the incremental win from doing so seems small.
*/
- if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator))
+ if (!BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator.locator))
continue;
buf_state = LockBufHdr(bufHdr);
for (j = 0; j < nforks; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator) &&
- bufHdr->tag.forkNum == forkNum[j] &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator.locator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum[j] &&
bufHdr->tag.blockNum >= firstDelBlock[j])
{
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -3300,7 +3308,7 @@ DropRelationsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
for (j = 0; j < n; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, locators[j]))
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &locators[j]))
{
rlocator = &locators[j];
break;
@@ -3309,7 +3317,10 @@ DropRelationsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
}
else
{
- rlocator = bsearch((const void *) &(bufHdr->tag.rlocator),
+ RelFileLocator locator;
+
+ locator = BufTagGetRelFileLocator(&bufHdr->tag);
+ rlocator = bsearch((const void *) &(locator),
locators, n, sizeof(RelFileLocator),
rlocator_comparator);
}
@@ -3319,7 +3330,7 @@ DropRelationsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, (*rlocator)))
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, rlocator))
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3379,8 +3390,8 @@ FindAndDropRelationBuffers(RelFileLocator rlocator, ForkNumber forkNum,
*/
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
- bufHdr->tag.forkNum == forkNum &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
@@ -3418,11 +3429,11 @@ DropDatabaseBuffers(Oid dbid)
* As in DropRelationBuffers, an unlocked precheck should be
* safe and saves some cycles.
*/
- if (bufHdr->tag.rlocator.dbOid != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rlocator.dbOid == dbid)
+ if (bufHdr->tag.dbOid == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3446,13 +3457,16 @@ PrintBufferDescs(void)
{
BufferDesc *buf = GetBufferDescriptor(i);
Buffer b = BufferDescriptorGetBuffer(buf);
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&buf->tag);
/* theoretically we should lock the bufhdr here */
elog(LOG,
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathbackend(buf->tag.rlocator, InvalidBackendId, buf->tag.forkNum),
+ relpathbackend(rlocator, InvalidBackendId, BufTagGetForkNum(&buf->tag)),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3472,12 +3486,16 @@ PrintPinnedBufs(void)
if (GetPrivateRefCount(b) > 0)
{
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&buf->tag);
+
/* theoretically we should lock the bufhdr here */
elog(LOG,
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathperm(buf->tag.rlocator, buf->tag.forkNum),
+ relpathperm(rlocator, BufTagGetForkNum(&buf->tag)),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3516,7 +3534,7 @@ FlushRelationBuffers(Relation rel)
uint32 buf_state;
bufHdr = GetLocalBufferDescriptor(i);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rel->rd_locator) &&
((buf_state = pg_atomic_read_u32(&bufHdr->state)) &
(BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
@@ -3534,7 +3552,7 @@ FlushRelationBuffers(Relation rel)
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
smgrwrite(RelationGetSmgr(rel),
- bufHdr->tag.forkNum,
+ BufTagGetForkNum(&bufHdr->tag),
bufHdr->tag.blockNum,
localpage,
false);
@@ -3563,13 +3581,13 @@ FlushRelationBuffers(Relation rel)
* As in DropRelationBuffers, an unlocked precheck should be
* safe and saves some cycles.
*/
- if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator))
+ if (!BufTagMatchesRelFileLocator(&bufHdr->tag, &rel->rd_locator))
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rel->rd_locator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3643,7 +3661,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
for (j = 0; j < nrels; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, srels[j].rlocator))
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &srels[j].rlocator))
{
srelent = &srels[j];
break;
@@ -3652,7 +3670,10 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
}
else
{
- srelent = bsearch((const void *) &(bufHdr->tag.rlocator),
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
+ srelent = bsearch((const void *) &(rlocator),
srels, nrels, sizeof(SMgrSortArray),
rlocator_comparator);
}
@@ -3664,7 +3685,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, srelent->rlocator) &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &srelent->rlocator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3866,13 +3887,13 @@ FlushDatabaseBuffers(Oid dbid)
* As in DropRelationBuffers, an unlocked precheck should be
* safe and saves some cycles.
*/
- if (bufHdr->tag.rlocator.dbOid != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rlocator.dbOid == dbid &&
+ if (bufHdr->tag.dbOid == dbid &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -4032,6 +4053,10 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
if (XLogHintBitIsNeeded() &&
(pg_atomic_read_u32(&bufHdr->state) & BM_PERMANENT))
{
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
+
/*
* If we must not write WAL, due to a relfilelocator-specific
* condition or being in recovery, don't dirty the page. We can
@@ -4040,8 +4065,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
*
* See src/backend/storage/page/README for longer discussion.
*/
- if (RecoveryInProgress() ||
- RelFileLocatorSkippingWAL(bufHdr->tag.rlocator))
+ if (RecoveryInProgress() || RelFileLocatorSkippingWAL(rlocator))
return;
/*
@@ -4649,8 +4673,10 @@ AbortBufferIO(void)
{
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
+ RelFileLocator rlocator;
- path = relpathperm(buf->tag.rlocator, buf->tag.forkNum);
+ rlocator = BufTagGetRelFileLocator(&buf->tag);
+ path = relpathperm(rlocator, BufTagGetForkNum(&buf->tag));
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4674,7 +4700,11 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rlocator, bufHdr->tag.forkNum);
+ char *path;
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
+ path = relpathperm(rlocator, BufTagGetForkNum(&bufHdr->tag));
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4692,8 +4722,12 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
- char *path = relpathbackend(bufHdr->tag.rlocator, MyBackendId,
- bufHdr->tag.forkNum);
+ char *path;
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
+ path = relpathbackend(rlocator, MyBackendId,
+ BufTagGetForkNum(&bufHdr->tag));
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4787,15 +4821,20 @@ static inline int
buffertag_comparator(const BufferTag *ba, const BufferTag *bb)
{
int ret;
+ RelFileLocator rlocatora;
+ RelFileLocator rlocatorb;
+
+ rlocatora = BufTagGetRelFileLocator(ba);
+ rlocatorb = BufTagGetRelFileLocator(bb);
- ret = rlocator_comparator(&ba->rlocator, &bb->rlocator);
+ ret = rlocator_comparator(&rlocatora, &rlocatorb);
if (ret != 0)
return ret;
- if (ba->forkNum < bb->forkNum)
+ if (BufTagGetForkNum(ba) < BufTagGetForkNum(bb))
return -1;
- if (ba->forkNum > bb->forkNum)
+ if (BufTagGetForkNum(ba) > BufTagGetForkNum(bb))
return 1;
if (ba->blockNum < bb->blockNum)
@@ -4945,10 +4984,12 @@ IssuePendingWritebacks(WritebackContext *context)
SMgrRelation reln;
int ahead;
BufferTag tag;
+ RelFileLocator currlocator;
Size nblocks = 1;
cur = &context->pending_writebacks[i];
tag = cur->tag;
+ currlocator = BufTagGetRelFileLocator(&tag);
/*
* Peek ahead, into following writeback requests, to see if they can
@@ -4956,11 +4997,14 @@ IssuePendingWritebacks(WritebackContext *context)
*/
for (ahead = 0; i + ahead + 1 < context->nr_pending; ahead++)
{
+ RelFileLocator nextrlocator;
+
next = &context->pending_writebacks[i + ahead + 1];
+ nextrlocator = BufTagGetRelFileLocator(&next->tag);
/* different file, stop */
- if (!RelFileLocatorEquals(cur->tag.rlocator, next->tag.rlocator) ||
- cur->tag.forkNum != next->tag.forkNum)
+ if (!RelFileLocatorEquals(currlocator, nextrlocator) ||
+ BufTagGetForkNum(&cur->tag) != BufTagGetForkNum(&next->tag))
break;
/* ok, block queued twice, skip */
@@ -4978,8 +5022,8 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rlocator, InvalidBackendId);
- smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
+ reln = smgropen(currlocator, InvalidBackendId);
+ smgrwriteback(reln, BufTagGetForkNum(&tag), tag.blockNum, nblocks);
}
context->nr_pending = 0;
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index 91e174e..972f3f3 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -213,15 +213,18 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
{
SMgrRelation oreln;
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rlocator, MyBackendId);
+ oreln = smgropen(rlocator, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
/* And write... */
smgrwrite(oreln,
- bufHdr->tag.forkNum,
+ BufTagGetForkNum(&bufHdr->tag),
bufHdr->tag.blockNum,
localpage,
false);
@@ -337,16 +340,22 @@ DropRelationLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
- bufHdr->tag.forkNum == forkNum &&
+ BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
+ {
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rlocator, MyBackendId,
- bufHdr->tag.forkNum),
+ relpathbackend(rlocator, MyBackendId,
+ BufTagGetForkNum(&bufHdr->tag)),
LocalRefCount[i]);
+ }
+
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
hash_search(LocalBufHash, (void *) &bufHdr->tag,
@@ -383,13 +392,16 @@ DropRelationAllLocalBuffers(RelFileLocator rlocator)
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator))
+ BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator))
{
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rlocator, MyBackendId,
- bufHdr->tag.forkNum),
+ relpathbackend(rlocator, MyBackendId,
+ BufTagGetForkNum(&bufHdr->tag)),
LocalRefCount[i]);
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 2f0e60e..092e959 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,18 +90,51 @@
*/
typedef struct buftag
{
- RelFileLocator rlocator; /* physical relation identifier */
- ForkNumber forkNum;
+ Oid spcOid; /* tablespace oid */
+ Oid dbOid; /* database oid */
+ RelFileNumber relNumber; /* relation file number */
+ ForkNumber forkNum; /* fork number */
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
+static inline RelFileNumber
+BufTagGetRelNumber(const BufferTag *tag)
+{
+ return tag->relNumber;
+}
+
+static inline ForkNumber
+BufTagGetForkNum(const BufferTag *tag)
+{
+ return tag->forkNum;
+}
+
+static inline void
+BufTagSetRelForkDetails(BufferTag *tag, RelFileNumber relnumber,
+ ForkNumber forknum)
+{
+ tag->relNumber = relnumber;
+ tag->forkNum = forknum;
+}
+
+static inline RelFileLocator
+BufTagGetRelFileLocator(const BufferTag *tag)
+{
+ RelFileLocator rlocator;
+
+ rlocator.spcOid = tag->spcOid;
+ rlocator.dbOid = tag->dbOid;
+ rlocator.relNumber = BufTagGetRelNumber(tag);
+
+ return rlocator;
+}
+
static inline void
CLEAR_BUFFERTAG(BufferTag *tag)
{
- tag->rlocator.spcOid = InvalidOid;
- tag->rlocator.dbOid = InvalidOid;
- tag->rlocator.relNumber = InvalidRelFileNumber;
- tag->forkNum = InvalidForkNumber;
+ tag->spcOid = InvalidOid;
+ tag->dbOid = InvalidOid;
+ BufTagSetRelForkDetails(tag, InvalidRelFileNumber, InvalidForkNumber);
tag->blockNum = InvalidBlockNumber;
}
@@ -109,19 +142,32 @@ static inline void
INIT_BUFFERTAG(BufferTag *tag, const RelFileLocator *rlocator,
ForkNumber forkNum, BlockNumber blockNum)
{
- tag->rlocator = *rlocator;
- tag->forkNum = forkNum;
+ tag->spcOid = rlocator->spcOid;
+ tag->dbOid = rlocator->dbOid;
+ BufTagSetRelForkDetails(tag, rlocator->relNumber, forkNum);
tag->blockNum = blockNum;
}
static inline bool
BUFFERTAGS_EQUAL(const BufferTag *tag1, const BufferTag *tag2)
{
- return RelFileLocatorEquals(tag1->rlocator, tag2->rlocator) &&
+ return (tag1->spcOid == tag2->spcOid) &&
+ (tag1->dbOid == tag2->dbOid) &&
+ (tag1->relNumber == tag2->relNumber) &&
(tag1->blockNum == tag2->blockNum) &&
(tag1->forkNum == tag2->forkNum);
}
+static inline bool
+BufTagMatchesRelFileLocator(const BufferTag *tag,
+ const RelFileLocator *rlocator)
+{
+ return (tag->spcOid == rlocator->spcOid) &&
+ (tag->dbOid == rlocator->dbOid) &&
+ (BufTagGetRelNumber(tag) == rlocator->relNumber);
+}
+
+
/*
* The shared buffer mapping table is partitioned to reduce contention.
* To determine which partition lock a given tag requires, compute the tag's
--
1.8.3.1
Attachment: v10-0003-Remove-the-restriction-that-the-relmap-must-be-5.patch (text/x-patch)
From bd0a17ef6536da5e63794e7a50808eb4ffa32180 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Tue, 12 Jul 2022 09:04:44 -0400
Subject: [PATCH v10 3/4] Remove the restriction that the relmap must be 512
bytes.
Instead of relying on the ability to atomically overwrite the
entire relmap file in one shot, write a new one and durably
rename it into place. Removing the struct padding and the
calculation showing why the map is exactly 512 bytes, and change
the maximum number of entries to a nearby round number.
Patch by me, reviewed by Andres Freund.
Discussion: http://postgr.es/m/CA+TgmoacMgLv_0edhN=oWjnUvJyFjXww4Q4re4kfm+qkSBtjaQ@mail.gmail.com
---
doc/src/sgml/monitoring.sgml | 4 +-
src/backend/utils/activity/wait_event.c | 4 +-
src/backend/utils/cache/relmapper.c | 94 +++++++++++++++++++--------------
src/include/utils/wait_event.h | 2 +-
4 files changed, 58 insertions(+), 46 deletions(-)
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 7dbbab6..475223c 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -1409,8 +1409,8 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
<entry>Waiting for a read of the relation map file.</entry>
</row>
<row>
- <entry><literal>RelationMapSync</literal></entry>
- <entry>Waiting for the relation map file to reach durable storage.</entry>
+ <entry><literal>RelationMapRename</literal></entry>
+ <entry>Waiting for durable replacement of a relation map file.</entry>
</row>
<row>
<entry><literal>RelationMapWrite</literal></entry>
diff --git a/src/backend/utils/activity/wait_event.c b/src/backend/utils/activity/wait_event.c
index da57a93..9ebd6fc 100644
--- a/src/backend/utils/activity/wait_event.c
+++ b/src/backend/utils/activity/wait_event.c
@@ -633,8 +633,8 @@ pgstat_get_wait_io(WaitEventIO w)
case WAIT_EVENT_RELATION_MAP_READ:
event_name = "RelationMapRead";
break;
- case WAIT_EVENT_RELATION_MAP_SYNC:
- event_name = "RelationMapSync";
+ case WAIT_EVENT_RELATION_MAP_RENAME:
+ event_name = "RelationMapRename";
break;
case WAIT_EVENT_RELATION_MAP_WRITE:
event_name = "RelationMapWrite";
diff --git a/src/backend/utils/cache/relmapper.c b/src/backend/utils/cache/relmapper.c
index 8e5595b..b3b4756 100644
--- a/src/backend/utils/cache/relmapper.c
+++ b/src/backend/utils/cache/relmapper.c
@@ -60,21 +60,26 @@
/*
* The map file is critical data: we have no automatic method for recovering
* from loss or corruption of it. We use a CRC so that we can detect
- * corruption. To minimize the risk of failed updates, the map file should
- * be kept to no more than one standard-size disk sector (ie 512 bytes),
- * and we use overwrite-in-place rather than playing renaming games.
- * The struct layout below is designed to occupy exactly 512 bytes, which
- * might make filesystem updates a bit more efficient.
+ * corruption. Since the file might be more than one standard-size disk
+ * sector in size, we cannot rely on overwrite-in-place. Instead, we generate
+ * a new file and rename it into place, atomically replacing the original file.
*
* Entries in the mappings[] array are in no particular order. We could
* speed searching by insisting on OID order, but it really shouldn't be
* worth the trouble given the intended size of the mapping sets.
*/
#define RELMAPPER_FILENAME "pg_filenode.map"
+#define RELMAPPER_TEMP_FILENAME "pg_filenode.map.tmp"
#define RELMAPPER_FILEMAGIC 0x592717 /* version ID value */
-#define MAX_MAPPINGS 62 /* 62 * 8 + 16 = 512 */
+/*
+ * There's no need for this constant to have any particular value, and we
+ * can raise it as necessary if we end up with more mapped relations. For
+ * now, we just pick a round number that is modestly larger than the expected
+ * number of mappings.
+ */
+#define MAX_MAPPINGS 64
typedef struct RelMapping
{
@@ -88,7 +93,6 @@ typedef struct RelMapFile
int32 num_mappings; /* number of valid RelMapping entries */
RelMapping mappings[MAX_MAPPINGS];
pg_crc32c crc; /* CRC of all above */
- int32 pad; /* to make the struct size be 512 exactly */
} RelMapFile;
/*
@@ -877,6 +881,7 @@ write_relmap_file(RelMapFile *newmap, bool write_wal, bool send_sinval,
{
int fd;
char mapfilename[MAXPGPATH];
+ char maptempfilename[MAXPGPATH];
/*
* Fill in the overhead fields and update CRC.
@@ -890,17 +895,47 @@ write_relmap_file(RelMapFile *newmap, bool write_wal, bool send_sinval,
FIN_CRC32C(newmap->crc);
/*
- * Open the target file. We prefer to do this before entering the
- * critical section, so that an open() failure need not force PANIC.
+ * Construct filenames -- a temporary file that we'll create to write the
+ * data initially, and then the permanent name to which we will rename it.
*/
snprintf(mapfilename, sizeof(mapfilename), "%s/%s",
dbpath, RELMAPPER_FILENAME);
- fd = OpenTransientFile(mapfilename, O_WRONLY | O_CREAT | PG_BINARY);
+ snprintf(maptempfilename, sizeof(maptempfilename), "%s/%s",
+ dbpath, RELMAPPER_TEMP_FILENAME);
+
+ /*
+ * Open a temporary file. If a file already exists with this name, it must
+ * be left over from a previous crash, so we can overwrite it. Concurrent
+ * calls to this function are not allowed.
+ */
+ fd = OpenTransientFile(maptempfilename,
+ O_WRONLY | O_CREAT | O_TRUNC | PG_BINARY);
if (fd < 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not open file \"%s\": %m",
- mapfilename)));
+ maptempfilename)));
+
+ /* Write new data to the file. */
+ pgstat_report_wait_start(WAIT_EVENT_RELATION_MAP_WRITE);
+ if (write(fd, newmap, sizeof(RelMapFile)) != sizeof(RelMapFile))
+ {
+ /* if write didn't set errno, assume problem is no disk space */
+ if (errno == 0)
+ errno = ENOSPC;
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not write file \"%s\": %m",
+ maptempfilename)));
+ }
+ pgstat_report_wait_end();
+
+ /* And close the file. */
+ if (CloseTransientFile(fd) != 0)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not close file \"%s\": %m",
+ maptempfilename)));
if (write_wal)
{
@@ -924,40 +959,17 @@ write_relmap_file(RelMapFile *newmap, bool write_wal, bool send_sinval,
XLogFlush(lsn);
}
- errno = 0;
- pgstat_report_wait_start(WAIT_EVENT_RELATION_MAP_WRITE);
- if (write(fd, newmap, sizeof(RelMapFile)) != sizeof(RelMapFile))
- {
- /* if write didn't set errno, assume problem is no disk space */
- if (errno == 0)
- errno = ENOSPC;
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not write file \"%s\": %m",
- mapfilename)));
- }
- pgstat_report_wait_end();
-
/*
- * We choose to fsync the data to disk before considering the task done.
- * It would be possible to relax this if it turns out to be a performance
- * issue, but it would complicate checkpointing --- see notes for
- * CheckPointRelationMap.
+ * durable_rename() does all the hard work of making sure that we rename
+ * the temporary file into place in a crash-safe manner.
+ *
+ * NB: Although we instruct durable_rename() to use ERROR, we will often
+ * be in a critical section at this point; if so, ERROR will become PANIC.
*/
- pgstat_report_wait_start(WAIT_EVENT_RELATION_MAP_SYNC);
- if (pg_fsync(fd) != 0)
- ereport(data_sync_elevel(ERROR),
- (errcode_for_file_access(),
- errmsg("could not fsync file \"%s\": %m",
- mapfilename)));
+ pgstat_report_wait_start(WAIT_EVENT_RELATION_MAP_RENAME);
+ durable_rename(maptempfilename, mapfilename, ERROR);
pgstat_report_wait_end();
- if (CloseTransientFile(fd) != 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not close file \"%s\": %m",
- mapfilename)));
-
/*
* Now that the file is safely on disk, send sinval message to let other
* backends know to re-read it. We must do this inside the critical
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index c3ade01..39f770a 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -194,7 +194,7 @@ typedef enum
WAIT_EVENT_LOGICAL_REWRITE_TRUNCATE,
WAIT_EVENT_LOGICAL_REWRITE_WRITE,
WAIT_EVENT_RELATION_MAP_READ,
- WAIT_EVENT_RELATION_MAP_SYNC,
+ WAIT_EVENT_RELATION_MAP_RENAME,
WAIT_EVENT_RELATION_MAP_WRITE,
WAIT_EVENT_REORDER_BUFFER_READ,
WAIT_EVENT_REORDER_BUFFER_WRITE,
--
1.8.3.1
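Patch v10-0003 replaces overwrite-in-place of pg_filenode.map with write-a-temp-file-then-rename, delegating the hard part to PostgreSQL's `durable_rename()`. The underlying crash-safety recipe can be sketched with plain POSIX calls; `durable_replace()` and its arguments are invented here for illustration and omit some error-path cleanup the real code does:

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Write data to a temporary file, fsync it, atomically rename it over the
 * final name, then fsync the directory so the rename itself is durable.
 * Returns 0 on success, -1 on failure (errno set by the failing call).
 */
static int
durable_replace(const char *dir, const char *tmpname, const char *finalname,
				const void *data, size_t len)
{
	char		tmppath[1024];
	char		finalpath[1024];
	int			fd;

	snprintf(tmppath, sizeof(tmppath), "%s/%s", dir, tmpname);
	snprintf(finalpath, sizeof(finalpath), "%s/%s", dir, finalname);

	/* O_TRUNC: a leftover temp file from a previous crash is overwritten. */
	fd = open(tmppath, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;
	if (write(fd, data, len) != (ssize_t) len ||
		fsync(fd) != 0 || close(fd) != 0)
		return -1;

	/* Atomic replacement: readers see either the old file or the new one. */
	if (rename(tmppath, finalpath) != 0)
		return -1;

	/* Persist the directory entry so the rename survives a crash. */
	fd = open(dir, O_RDONLY);
	if (fd < 0)
		return -1;
	if (fsync(fd) != 0)
	{
		close(fd);
		return -1;
	}
	return close(fd);
}
```

Because the replacement is a single rename, there is never a moment when the map file is partially written on disk, which is why the size-must-fit-one-sector restriction can be dropped.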
Attachment: v10-0004-Widen-relfilenumber-from-32-bits-to-56-bits.patch (text/x-patch)
From 8513ad85c58aa703a7dc2da0d44491da6fd2a24a Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Thu, 14 Jul 2022 14:25:50 +0530
Subject: [PATCH v10 4/4] Widen relfilenumber from 32 bits to 56 bits
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Currently relfilenumber is 32 bits wide, so it can wrap around and a
relfilenumber can be reused. To guard against such reuse there is a
complicated hack that leaves a 0-length tombstone file around until the
next checkpoint, and when we allocate a new relfilenumber we must also
loop to check for an on-disk conflict.
This patch makes the relfilenumber 56 bits wide, with no provision for
wraparound. After this change we can get rid of the 0-length tombstone
files and of the loop that checks for on-disk relfilenumber conflicts.
The reason for making it 56 bits wide rather than 64 is that a 64-bit
relfilenumber would enlarge the BufferTag, increasing memory usage and
possibly hurting performance. To avoid that, inside the buffer tag we use
8 bits for the fork number and 56 bits for the relfilenumber.
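The 8-bits-of-fork-plus-56-bits-of-relfilenumber layout described above can be sketched as a pair of pack/unpack helpers. The names and macros below are invented for illustration; the actual patch defines its own accessors in buf_internals.h:

```c
#include <assert.h>
#include <stdint.h>

/* 56 low bits carry the relfilenumber; the top 8 bits carry the fork. */
#define RELNUMBER_BITS 56
#define RELNUMBER_MASK ((UINT64_C(1) << RELNUMBER_BITS) - 1)

/* Pack a 56-bit relfilenumber and an 8-bit fork number into one uint64. */
static inline uint64_t
pack_rel_fork(uint64_t relnumber, uint8_t forknum)
{
	return (relnumber & RELNUMBER_MASK) |
		((uint64_t) forknum << RELNUMBER_BITS);
}

/* Recover the relfilenumber by masking off the fork byte. */
static inline uint64_t
unpack_relnumber(uint64_t packed)
{
	return packed & RELNUMBER_MASK;
}

/* Recover the fork number from the top byte. */
static inline uint8_t
unpack_forknum(uint64_t packed)
{
	return (uint8_t) (packed >> RELNUMBER_BITS);
}
```

Packing both values into a single 64-bit field is what keeps the BufferTag from growing beyond what a 32-bit relfilenumber plus a separate fork field would occupy.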
---
contrib/pg_buffercache/Makefile | 4 +-
.../pg_buffercache/pg_buffercache--1.3--1.4.sql | 30 +++++
contrib/pg_buffercache/pg_buffercache.control | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 39 +++++-
contrib/pg_walinspect/expected/pg_walinspect.out | 4 +-
contrib/pg_walinspect/sql/pg_walinspect.sql | 4 +-
contrib/test_decoding/expected/rewrite.out | 2 +-
contrib/test_decoding/sql/rewrite.sql | 2 +-
doc/src/sgml/catalogs.sgml | 2 +-
doc/src/sgml/pgbuffercache.sgml | 2 +-
src/backend/access/gin/ginxlog.c | 2 +-
src/backend/access/rmgrdesc/gistdesc.c | 2 +-
src/backend/access/rmgrdesc/heapdesc.c | 2 +-
src/backend/access/rmgrdesc/nbtdesc.c | 2 +-
src/backend/access/rmgrdesc/seqdesc.c | 2 +-
src/backend/access/rmgrdesc/xlogdesc.c | 21 ++-
src/backend/access/transam/README | 5 +-
src/backend/access/transam/varsup.c | 138 +++++++++++++++++++-
src/backend/access/transam/xlog.c | 61 +++++++++
src/backend/access/transam/xlogprefetcher.c | 14 +-
src/backend/access/transam/xlogrecovery.c | 6 +-
src/backend/access/transam/xlogutils.c | 12 +-
src/backend/catalog/catalog.c | 142 +++++++--------------
src/backend/catalog/heap.c | 25 ++--
src/backend/catalog/index.c | 11 +-
src/backend/catalog/storage.c | 8 ++
src/backend/commands/dbcommands.c | 4 +-
src/backend/commands/tablecmds.c | 16 ++-
src/backend/commands/tablespace.c | 3 +-
src/backend/nodes/gen_node_support.pl | 4 +-
src/backend/replication/basebackup.c | 12 +-
src/backend/replication/logical/decode.c | 1 +
src/backend/replication/logical/reorderbuffer.c | 2 +-
src/backend/storage/file/reinit.c | 46 +++----
src/backend/storage/freespace/fsmpage.c | 2 +-
src/backend/storage/lmgr/lwlocknames.txt | 1 +
src/backend/storage/smgr/md.c | 7 +
src/backend/storage/smgr/smgr.c | 2 +-
src/backend/utils/adt/dbsize.c | 11 +-
src/backend/utils/adt/pg_upgrade_support.c | 22 +++-
src/backend/utils/cache/relcache.c | 6 +-
src/backend/utils/cache/relfilenumbermap.c | 65 ++++------
src/backend/utils/misc/pg_controldata.c | 9 +-
src/bin/pg_checksums/pg_checksums.c | 4 +-
src/bin/pg_controldata/pg_controldata.c | 2 +
src/bin/pg_dump/pg_dump.c | 26 ++--
src/bin/pg_rewind/filemap.c | 6 +-
src/bin/pg_upgrade/info.c | 5 +-
src/bin/pg_upgrade/pg_upgrade.c | 6 +-
src/bin/pg_upgrade/relfilenumber.c | 4 +-
src/bin/pg_waldump/pg_waldump.c | 2 +-
src/common/relpath.c | 20 +--
src/fe_utils/option_utils.c | 39 ++++++
src/include/access/transam.h | 21 +++
src/include/access/xlog.h | 2 +
src/include/catalog/catalog.h | 12 +-
src/include/catalog/pg_class.h | 16 +--
src/include/catalog/pg_control.h | 2 +
src/include/catalog/pg_proc.dat | 10 +-
src/include/common/relpath.h | 2 +
src/include/fe_utils/option_utils.h | 2 +
src/include/postgres_ext.h | 10 +-
src/include/storage/buf_internals.h | 62 +++++++--
src/include/storage/reinit.h | 3 +-
src/include/storage/relfilelocator.h | 12 +-
src/test/regress/expected/alter_table.out | 24 ++--
src/test/regress/expected/fast_default.out | 4 +-
src/test/regress/expected/oidjoins.out | 2 +-
src/test/regress/sql/alter_table.sql | 8 +-
src/test/regress/sql/fast_default.sql | 4 +-
70 files changed, 721 insertions(+), 346 deletions(-)
create mode 100644 contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile
index 2ab8c65..c20439a 100644
--- a/contrib/pg_buffercache/Makefile
+++ b/contrib/pg_buffercache/Makefile
@@ -6,8 +6,8 @@ OBJS = \
pg_buffercache_pages.o
EXTENSION = pg_buffercache
-DATA = pg_buffercache--1.2.sql pg_buffercache--1.2--1.3.sql \
- pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql
+DATA = pg_buffercache--1.0--1.1.sql pg_buffercache--1.1--1.2.sql pg_buffercache--1.2.sql \
+ pg_buffercache--1.2--1.3.sql pg_buffercache--1.3--1.4.sql
PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time"
ifdef USE_PGXS
diff --git a/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
new file mode 100644
index 0000000..ee2d9c7
--- /dev/null
+++ b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
@@ -0,0 +1,30 @@
+/* contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.4'" to load this file. \quit
+
+/* First we have to remove them from the extension */
+ALTER EXTENSION pg_buffercache DROP VIEW pg_buffercache;
+ALTER EXTENSION pg_buffercache DROP FUNCTION pg_buffercache_pages();
+
+/* Then we can drop them */
+DROP VIEW pg_buffercache;
+DROP FUNCTION pg_buffercache_pages();
+
+/* Now redefine */
+CREATE OR REPLACE FUNCTION pg_buffercache_pages()
+RETURNS SETOF RECORD
+AS 'MODULE_PATHNAME', 'pg_buffercache_pages_v1_4'
+LANGUAGE C PARALLEL SAFE;
+
+CREATE OR REPLACE VIEW pg_buffercache AS
+ SELECT P.* FROM pg_buffercache_pages() AS P
+ (bufferid integer, relfilenode int8, reltablespace oid, reldatabase oid,
+ relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
+ pinning_backends int4);
+
+-- Don't want these to be available to public.
+REVOKE ALL ON FUNCTION pg_buffercache_pages() FROM PUBLIC;
+REVOKE ALL ON pg_buffercache FROM PUBLIC;
+GRANT EXECUTE ON FUNCTION pg_buffercache_pages() TO pg_monitor;
+GRANT SELECT ON pg_buffercache TO pg_monitor;
diff --git a/contrib/pg_buffercache/pg_buffercache.control b/contrib/pg_buffercache/pg_buffercache.control
index 8c060ae..a82ae5f 100644
--- a/contrib/pg_buffercache/pg_buffercache.control
+++ b/contrib/pg_buffercache/pg_buffercache.control
@@ -1,5 +1,5 @@
# pg_buffercache extension
comment = 'examine the shared buffer cache'
-default_version = '1.3'
+default_version = '1.4'
module_pathname = '$libdir/pg_buffercache'
relocatable = true
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index c5754ea..42e3767 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -59,9 +59,10 @@ typedef struct
* relation node/tablespace/database/blocknum and dirty indicator.
*/
PG_FUNCTION_INFO_V1(pg_buffercache_pages);
+PG_FUNCTION_INFO_V1(pg_buffercache_pages_v1_4);
-Datum
-pg_buffercache_pages(PG_FUNCTION_ARGS)
+static Datum
+pg_buffercache_pages_internal(PG_FUNCTION_ARGS, Oid rfn_typid)
{
FuncCallContext *funcctx;
Datum result;
@@ -103,7 +104,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
TupleDescInitEntry(tupledesc, (AttrNumber) 1, "bufferid",
INT4OID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 2, "relfilenode",
- OIDOID, -1, 0);
+ rfn_typid, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 3, "reltablespace",
OIDOID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 4, "reldatabase",
@@ -209,7 +210,24 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
}
else
{
- values[1] = ObjectIdGetDatum(fctx->record[i].relfilenumber);
+ if (rfn_typid == INT8OID)
+ values[1] =
+ Int64GetDatum((int64) fctx->record[i].relfilenumber);
+ else
+ {
+ Assert(rfn_typid == OIDOID);
+
+ if (fctx->record[i].relfilenumber > OID_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("relfilenode " INT64_FORMAT " is too large to be represented as an OID",
+ fctx->record[i].relfilenumber),
+ errhint("Upgrade the extension using ALTER EXTENSION pg_buffercache UPDATE.")));
+
+ values[1] =
+ ObjectIdGetDatum((Oid) fctx->record[i].relfilenumber);
+ }
+
nulls[1] = false;
values[2] = ObjectIdGetDatum(fctx->record[i].reltablespace);
nulls[2] = false;
@@ -237,3 +255,16 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
else
SRF_RETURN_DONE(funcctx);
}
+
+Datum
+pg_buffercache_pages(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, OIDOID);
+}
+
+/* entry point for the new (1.4) extension version */
+Datum
+pg_buffercache_pages_v1_4(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, INT8OID);
+}
diff --git a/contrib/pg_walinspect/expected/pg_walinspect.out b/contrib/pg_walinspect/expected/pg_walinspect.out
index a1ee743..e9b06ed 100644
--- a/contrib/pg_walinspect/expected/pg_walinspect.out
+++ b/contrib/pg_walinspect/expected/pg_walinspect.out
@@ -54,9 +54,9 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- ===================================================================
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
ok
----
t
diff --git a/contrib/pg_walinspect/sql/pg_walinspect.sql b/contrib/pg_walinspect/sql/pg_walinspect.sql
index 1b265ea..5393834 100644
--- a/contrib/pg_walinspect/sql/pg_walinspect.sql
+++ b/contrib/pg_walinspect/sql/pg_walinspect.sql
@@ -39,10 +39,10 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
-- ===================================================================
-- Test for filtering out WAL records based on resource_manager and
diff --git a/contrib/test_decoding/expected/rewrite.out b/contrib/test_decoding/expected/rewrite.out
index b30999c..8f1aa48 100644
--- a/contrib/test_decoding/expected/rewrite.out
+++ b/contrib/test_decoding/expected/rewrite.out
@@ -106,7 +106,7 @@ VACUUM FULL pg_class;
-- reindexing of important relations / indexes
REINDEX TABLE pg_class;
REINDEX INDEX pg_class_oid_index;
-REINDEX INDEX pg_class_tblspc_relfilenode_index;
+REINDEX INDEX pg_class_relfilenode_index;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (5, 3);
BEGIN;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (6, 4);
diff --git a/contrib/test_decoding/sql/rewrite.sql b/contrib/test_decoding/sql/rewrite.sql
index 62dead3..8983704 100644
--- a/contrib/test_decoding/sql/rewrite.sql
+++ b/contrib/test_decoding/sql/rewrite.sql
@@ -77,7 +77,7 @@ VACUUM FULL pg_class;
-- reindexing of important relations / indexes
REINDEX TABLE pg_class;
REINDEX INDEX pg_class_oid_index;
-REINDEX INDEX pg_class_tblspc_relfilenode_index;
+REINDEX INDEX pg_class_relfilenode_index;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (5, 3);
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 670a540..70ef032 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1965,7 +1965,7 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
</para>
<para>
Name of the on-disk file of this relation; zero means this
diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
index a06fd3e..e222265 100644
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@@ -62,7 +62,7 @@
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
(references <link linkend="catalog-pg-class"><structname>pg_class</structname></link>.<structfield>relfilenode</structfield>)
</para>
<para>
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 41b9211..b75ad79 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -100,7 +100,7 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
BlockNumber blknum;
BufferGetTag(buffer, &locator, &forknum, &blknum);
- elog(ERROR, "failed to add item to index page in %u/%u/%u",
+ elog(ERROR, "failed to add item to index page in %u/%u/" INT64_FORMAT,
locator.spcOid, locator.dbOid, locator.relNumber);
}
}
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index 7dd3c1d..c699937 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -26,7 +26,7 @@ out_gistxlogPageUpdate(StringInfo buf, gistxlogPageUpdate *xlrec)
static void
out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec)
{
- appendStringInfo(buf, "rel %u/%u/%u; blk %u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; blk %u; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber, xlrec->block,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 923d3bc..f6d278b 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -169,7 +169,7 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
{
xl_heap_new_cid *xlrec = (xl_heap_new_cid *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; tid %u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; tid %u/%u",
xlrec->target_locator.spcOid,
xlrec->target_locator.dbOid,
xlrec->target_locator.relNumber,
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index 4843cd5..70feb2d 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -100,7 +100,7 @@ btree_desc(StringInfo buf, XLogReaderState *record)
{
xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/seqdesc.c b/src/backend/access/rmgrdesc/seqdesc.c
index b3845f9..45c6ee7 100644
--- a/src/backend/access/rmgrdesc/seqdesc.c
+++ b/src/backend/access/rmgrdesc/seqdesc.c
@@ -25,7 +25,7 @@ seq_desc(StringInfo buf, XLogReaderState *record)
xl_seq_rec *xlrec = (xl_seq_rec *) rec;
if (info == XLOG_SEQ_LOG)
- appendStringInfo(buf, "rel %u/%u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT,
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber);
}
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 6fec485..a723790 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -45,8 +45,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
CheckPoint *checkpoint = (CheckPoint *) rec;
appendStringInfo(buf, "redo %X/%X; "
- "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; "
- "oldest xid %u in DB %u; oldest multi %u in DB %u; "
+ "tli %u; prev tli %u; fpw %s; xid %u:%u; relfilenumber " INT64_FORMAT "; oid %u; "
+ "multi %u; offset %u; oldest xid %u in DB %u; oldest multi %u in DB %u; "
"oldest/newest commit timestamp xid: %u/%u; "
"oldest running xid %u; %s",
LSN_FORMAT_ARGS(checkpoint->redo),
@@ -55,6 +55,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->fullPageWrites ? "true" : "false",
EpochFromFullTransactionId(checkpoint->nextXid),
XidFromFullTransactionId(checkpoint->nextXid),
+ checkpoint->nextRelFileNumber,
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
@@ -74,6 +75,13 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
memcpy(&nextOid, rec, sizeof(Oid));
appendStringInfo(buf, "%u", nextOid);
}
+ else if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, rec, sizeof(RelFileNumber));
+ appendStringInfo(buf, INT64_FORMAT, nextRelFileNumber);
+ }
else if (info == XLOG_RESTORE_POINT)
{
xl_restore_point *xlrec = (xl_restore_point *) rec;
@@ -169,6 +177,9 @@ xlog_identify(uint8 info)
case XLOG_NEXTOID:
id = "NEXTOID";
break;
+ case XLOG_NEXT_RELFILENUMBER:
+ id = "NEXT_RELFILENUMBER";
+ break;
case XLOG_SWITCH:
id = "SWITCH";
break;
@@ -237,7 +248,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfoChar(buf, ' ');
appendStringInfo(buf,
- "blkref #%d: rel %u/%u/%u fork %s blk %u",
+ "blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -297,7 +308,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
if (forknum != MAIN_FORKNUM)
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u fork %s blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -306,7 +317,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
else
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
blk);
diff --git a/src/backend/access/transam/README b/src/backend/access/transam/README
index 734c39a..8911671 100644
--- a/src/backend/access/transam/README
+++ b/src/backend/access/transam/README
@@ -692,8 +692,9 @@ by having database restart search for files that don't have any committed
entry in pg_class, but that currently isn't done because of the possibility
of deleting data that is useful for forensic analysis of the crash.
Orphan files are harmless --- at worst they waste a bit of disk space ---
-because we check for on-disk collisions when allocating new relfilenumber
-OIDs. So cleaning up isn't really necessary.
+because the relfilenumber counter is monotonically increasing. The maximum
+value is 2^56-1, and there is no provision for wraparound. Thus, on-disk
+collisions aren't possible.
3. Deleting a table, which requires an unlink() that could fail.
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 849a7ce..e3564e2 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -30,6 +30,15 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+/* Number of RelFileNumbers to be logged per XLOG write */
+#define VAR_RELNUMBER_PER_XLOG 512
+
+/*
+ * Log more RelFileNumbers when the number of logged-but-unused values drops
+ * below this threshold.  The valid range is 0 to VAR_RELNUMBER_PER_XLOG - 1.
+ */
+#define VAR_RELNUMBER_NEW_XLOG_THRESHOLD 256
+
/* pointer to "variable cache" in shared memory (set up by shmem.c) */
VariableCache ShmemVariableCache = NULL;
@@ -521,8 +530,7 @@ ForceTransactionIdLimitUpdate(void)
* wide, counter wraparound will occur eventually, and therefore it is unwise
* to assume they are unique unless precautions are taken to make them so.
* Hence, this routine should generally not be used directly. The only direct
- * callers should be GetNewOidWithIndex() and GetNewRelFileNumber() in
- * catalog/catalog.c.
+ * caller should be GetNewOidWithIndex() in catalog/catalog.c.
*/
Oid
GetNewObjectId(void)
@@ -613,6 +621,132 @@ SetNextObjectId(Oid nextOid)
}
/*
+ * GetNewRelFileNumber
+ *
+ * Similar to GetNewObjectId, but generates a new relfilenumber instead of
+ * a new OID.
+ */
+RelFileNumber
+GetNewRelFileNumber(void)
+{
+ RelFileNumber result;
+
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot assign RelFileNumber during recovery");
+
+ if (IsBinaryUpgrade)
+ elog(ERROR, "cannot assign RelFileNumber during binary upgrade");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /* check for the wraparound for the relfilenumber counter */
+ if (unlikely(ShmemVariableCache->nextRelFileNumber > MAX_RELFILENUMBER))
+ elog(ERROR, "relfilenumber is out of bounds");
+
+ /*
+ * If the number of logged-but-unused values has dropped below the
+ * threshold, log more.  Ideally we could wait until every logged
+ * relfilenumber has been consumed before logging again, but then we would
+ * have to flush the new WAL record immediately, because nextRelFileNumber
+ * must always be larger than any relfilenumber already in use on disk:
+ * the record that logs a new range must reach disk before any file is
+ * created from that range.  By always logging before the current range is
+ * exhausted, we need only flush the previously logged record before
+ * consuming values from the new range, and by that time it has hopefully
+ * already been flushed by some other XLogFlush operation.
+ * VAR_RELNUMBER_PER_XLOG is large enough that such flushes should be
+ * rare, but avoiding an extra XLogFlush is still worthwhile.
+ */
+ if (ShmemVariableCache->loggedRelFileNumber -
+ ShmemVariableCache->nextRelFileNumber <=
+ VAR_RELNUMBER_NEW_XLOG_THRESHOLD)
+ {
+ RelFileNumber newlogrelnum;
+
+ /*
+ * The first time through, this flushes the newly logged WAL record
+ * immediately, since nothing has been logged in advance.  On later
+ * calls we remember the previously logged record pointer and flush up
+ * to that point instead.
+ *
+ * XXX the second call may try to flush what the first call already
+ * flushed, but that is logically a no-op, so it is not worth adding
+ * complexity to avoid it.
+ */
+ newlogrelnum = ShmemVariableCache->nextRelFileNumber +
+ VAR_RELNUMBER_PER_XLOG;
+ LogNextRelFileNumber(newlogrelnum,
+ &ShmemVariableCache->loggedRelFileNumberRecPtr);
+
+ ShmemVariableCache->loggedRelFileNumber = newlogrelnum;
+ }
+
+ result = ShmemVariableCache->nextRelFileNumber;
+ (ShmemVariableCache->nextRelFileNumber)++;
+
+ LWLockRelease(RelFileNumberGenLock);
+
+ return result;
+}
+
+/*
+ * SetNextRelFileNumber
+ *
+ * This may only be called during pg_upgrade; it advances the RelFileNumber
+ * counter to the specified value if the current value is smaller than the
+ * input value.
+ */
+void
+SetNextRelFileNumber(RelFileNumber relnumber)
+{
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot forward RelFileNumber during recovery");
+
+ if (!IsBinaryUpgrade)
+ elog(ERROR, "RelFileNumber can be set only during binary upgrade");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /*
+ * If the previously assigned nextRelFileNumber is already higher than the
+ * requested value, there is nothing to do.  This can happen because
+ * objects are not created in relfilenumber order during an upgrade.
+ */
+ if (relnumber <= ShmemVariableCache->nextRelFileNumber)
+ {
+ LWLockRelease(RelFileNumberGenLock);
+ return;
+ }
+
+ /*
+ * If the new relfilenumber is greater than or equal to the already logged
+ * relfilenumber, log again.
+ *
+ * XXX The new 'relnumber' can come from any range, so we cannot arrange
+ * to piggyback on a later XLogFlush by logging in advance.  That should
+ * not matter, since this is only called during binary upgrade.
+ */
+ if (relnumber >= ShmemVariableCache->loggedRelFileNumber)
+ {
+ RelFileNumber newlogrelnum;
+
+ newlogrelnum = relnumber + VAR_RELNUMBER_PER_XLOG;
+ LogNextRelFileNumber(newlogrelnum, NULL);
+ ShmemVariableCache->loggedRelFileNumber = newlogrelnum;
+ }
+
+ ShmemVariableCache->nextRelFileNumber = relnumber;
+
+ LWLockRelease(RelFileNumberGenLock);
+}
+
+/*
* StopGeneratingPinnedObjectIds
*
* This is called once during initdb to force the OID counter up to
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 9854b51..e09e189 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4543,6 +4543,7 @@ BootStrapXLOG(void)
checkPoint.nextXid =
FullTransactionIdFromEpochAndXid(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstGenbkiObjectId;
+ checkPoint.nextRelFileNumber = FirstNormalRelFileNumber;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
checkPoint.oldestXid = FirstNormalTransactionId;
@@ -4556,7 +4557,10 @@ BootStrapXLOG(void)
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumberRecPtr = InvalidXLogRecPtr;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -5023,7 +5027,9 @@ StartupXLOG(void)
/* initialize shared memory variables from the checkpoint record */
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -6483,6 +6489,18 @@ CreateCheckPoint(int flags)
checkPoint.nextOid += ShmemVariableCache->oidCount;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_SHARED);
+ checkPoint.nextRelFileNumber = ShmemVariableCache->nextRelFileNumber;
+ if (!shutdown)
+ {
+ if (ShmemVariableCache->loggedRelFileNumber < checkPoint.nextRelFileNumber)
+ elog(ERROR, "nextRelFileNumber cannot go backward from " INT64_FORMAT " to " INT64_FORMAT,
+ checkPoint.nextRelFileNumber, ShmemVariableCache->loggedRelFileNumber);
+
+ checkPoint.nextRelFileNumber = ShmemVariableCache->loggedRelFileNumber;
+ }
+ LWLockRelease(RelFileNumberGenLock);
+
MultiXactGetCheckptMulti(shutdown,
&checkPoint.nextMulti,
&checkPoint.nextMultiOffset,
@@ -7361,6 +7379,35 @@ XLogPutNextOid(Oid nextOid)
}
/*
+ * Similar to XLogPutNextOid, but writes a NEXT_RELFILENUMBER log record
+ * instead of a NEXTOID record.  If '*prevrecptr' is a valid XLogRecPtr,
+ * flush the WAL up to that record pointer; otherwise flush up to the
+ * record just logged.  If prevrecptr is not NULL, also store the pointer
+ * to the new record in '*prevrecptr'.
+ */
+void
+LogNextRelFileNumber(RelFileNumber nextrelnumber, XLogRecPtr *prevrecptr)
+{
+ XLogRecPtr recptr;
+
+ XLogBeginInsert();
+ XLogRegisterData((char *) (&nextrelnumber), sizeof(RelFileNumber));
+ recptr = XLogInsert(RM_XLOG_ID, XLOG_NEXT_RELFILENUMBER);
+
+ /*
+ * If a valid prevrecptr was passed, flush that WAL record to disk;
+ * otherwise flush the newly logged record.
+ */
+ if ((prevrecptr != NULL) && !XLogRecPtrIsInvalid(*prevrecptr))
+ XLogFlush(*prevrecptr);
+ else
+ XLogFlush(recptr);
+
+ if (prevrecptr != NULL)
+ *prevrecptr = recptr;
+}
+
+/*
* Write an XLOG SWITCH record.
*
* Here we just blindly issue an XLogInsert request for the record.
@@ -7575,6 +7622,16 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
}
+ else if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, XLogRecGetData(record), sizeof(RelFileNumber));
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumber = nextRelFileNumber;
+ LWLockRelease(RelFileNumberGenLock);
+ }
else if (info == XLOG_CHECKPOINT_SHUTDOWN)
{
CheckPoint checkPoint;
@@ -7589,6 +7646,10 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
+ LWLockRelease(RelFileNumberGenLock);
MultiXactSetNextMXact(checkPoint.nextMulti,
checkPoint.nextMultiOffset);
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index 87d1421..d27c7fc 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -611,7 +611,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u until %X/%X is replayed, which creates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, which creates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -634,7 +634,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, which truncates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, which truncates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -733,7 +733,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing all prefetch in relation %u/%u/%u until %X/%X is replayed, because the relation does not exist on disk",
+ "suppressing all prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, because the relation does not exist on disk",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -754,7 +754,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, because the relation is too small",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, because the relation is too small",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -793,7 +793,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* truncated beneath our feet?
*/
elog(ERROR,
- "could not prefetch relation %u/%u/%u block %u",
+ "could not prefetch relation %u/%u/" INT64_FORMAT " block %u",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -932,7 +932,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed),
filter->filter_from_block);
@@ -948,7 +948,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed));
#endif
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 5d6f1b5..1d95fc0 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2175,14 +2175,14 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
continue;
if (forknum != MAIN_FORKNUM)
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, fork %u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", fork %u, blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
forknum,
blk);
else
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
@@ -2378,7 +2378,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
if (memcmp(replay_image_masked, primary_image_masked, BLCKSZ) != 0)
{
elog(FATAL,
- "inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
+ "inconsistent page found, rel %u/%u/" INT64_FORMAT ", forknum %u, blkno %u",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forknum, blkno);
}
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 0cda225..b4398e1 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -617,17 +617,17 @@ CreateFakeRelcacheEntry(RelFileLocator rlocator)
rel->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
/* We don't know the name of the relation; use relfilenumber instead */
- sprintf(RelationGetRelationName(rel), "%u", rlocator.relNumber);
+ sprintf(RelationGetRelationName(rel), INT64_FORMAT, rlocator.relNumber);
/*
* We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNumber may be
- * different from the relation's OID. It shouldn't really matter though.
- * In recovery, we are running by ourselves and can't have any lock
- * conflicts. While syncing, we already hold AccessExclusiveLock.
+ * relation. Note that we set relId to FirstNormalObjectId, which is
+ * completely bogus. It shouldn't really matter though. In recovery,
+ * we are running by ourselves and can't have any lock conflicts. While
+ * syncing, we already hold AccessExclusiveLock.
*/
rel->rd_lockInfo.lockRelId.dbId = rlocator.dbOid;
- rel->rd_lockInfo.lockRelId.relId = rlocator.relNumber;
+ rel->rd_lockInfo.lockRelId.relId = FirstNormalObjectId;
rel->rd_smgr = NULL;
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 6f43870..a51dce6 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -481,101 +481,6 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
}
/*
- * GetNewRelFileNumber
- * Generate a new relfilenumber that is unique within the
- * database of the given tablespace.
- *
- * If the relfilenumber will also be used as the relation's OID, pass the
- * opened pg_class catalog, and this routine will guarantee that the result
- * is also an unused OID within pg_class. If the result is to be used only
- * as a relfilenumber for an existing relation, pass NULL for pg_class.
- *
- * As with GetNewOidWithIndex(), there is some theoretical risk of a race
- * condition, but it doesn't seem worth worrying about.
- *
- * Note: we don't support using this in bootstrap mode. All relations
- * created by bootstrap have preassigned OIDs, so there's no need.
- */
-RelFileNumber
-GetNewRelFileNumber(Oid reltablespace, Relation pg_class, char relpersistence)
-{
- RelFileLocatorBackend rlocator;
- char *rpath;
- bool collides;
- BackendId backend;
-
- /*
- * If we ever get here during pg_upgrade, there's something wrong; all
- * relfilenumber assignments during a binary-upgrade run should be
- * determined by commands in the dump script.
- */
- Assert(!IsBinaryUpgrade);
-
- switch (relpersistence)
- {
- case RELPERSISTENCE_TEMP:
- backend = BackendIdForTempRelations();
- break;
- case RELPERSISTENCE_UNLOGGED:
- case RELPERSISTENCE_PERMANENT:
- backend = InvalidBackendId;
- break;
- default:
- elog(ERROR, "invalid relpersistence: %c", relpersistence);
- return InvalidRelFileNumber; /* placate compiler */
- }
-
- /* This logic should match RelationInitPhysicalAddr */
- rlocator.locator.spcOid = reltablespace ? reltablespace : MyDatabaseTableSpace;
- rlocator.locator.dbOid =
- (rlocator.locator.spcOid == GLOBALTABLESPACE_OID) ?
- InvalidOid : MyDatabaseId;
-
- /*
- * The relpath will vary based on the backend ID, so we must initialize
- * that properly here to make sure that any collisions based on filename
- * are properly detected.
- */
- rlocator.backend = backend;
-
- do
- {
- CHECK_FOR_INTERRUPTS();
-
- /* Generate the OID */
- if (pg_class)
- rlocator.locator.relNumber = GetNewOidWithIndex(pg_class, ClassOidIndexId,
- Anum_pg_class_oid);
- else
- rlocator.locator.relNumber = GetNewObjectId();
-
- /* Check for existing file of same name */
- rpath = relpath(rlocator, MAIN_FORKNUM);
-
- if (access(rpath, F_OK) == 0)
- {
- /* definite collision */
- collides = true;
- }
- else
- {
- /*
- * Here we have a little bit of a dilemma: if errno is something
- * other than ENOENT, should we declare a collision and loop? In
- * practice it seems best to go ahead regardless of the errno. If
- * there is a colliding file we will get an smgr failure when we
- * attempt to create the new relation file.
- */
- collides = false;
- }
-
- pfree(rpath);
- } while (collides);
-
- return rlocator.locator.relNumber;
-}
-
-/*
* SQL callable interface for GetNewOidWithIndex(). Outside of initdb's
* direct insertions into catalog tables, and recovering from corruption, this
* should rarely be needed.
@@ -678,3 +583,50 @@ pg_stop_making_pinned_objects(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+
+#ifdef USE_ASSERT_CHECKING
+
+/*
+ * Assert that no disk file already exists for the given relfilenumber.
+ */
+void
+AssertRelfileNumberFileNotExists(Oid spcoid, RelFileNumber relnumber,
+ char relpersistence)
+{
+ RelFileLocatorBackend rlocator;
+ char *rpath;
+ BackendId backend;
+
+ switch (relpersistence)
+ {
+ case RELPERSISTENCE_TEMP:
+ backend = BackendIdForTempRelations();
+ break;
+ case RELPERSISTENCE_UNLOGGED:
+ case RELPERSISTENCE_PERMANENT:
+ backend = InvalidBackendId;
+ break;
+ default:
+ elog(ERROR, "invalid relpersistence: %c", relpersistence);
+ return; /* placate compiler */
+ }
+
+ /* this logic should match RelationInitPhysicalAddr */
+ rlocator.locator.spcOid = spcoid ? spcoid : MyDatabaseTableSpace;
+ rlocator.locator.dbOid =
+ (rlocator.locator.spcOid == GLOBALTABLESPACE_OID) ? InvalidOid :
+ MyDatabaseId;
+
+ /*
+ * The relpath will vary based on the backend ID, so we must initialize
+ * that properly here to make sure that any collisions based on filename
+ * are properly detected.
+ */
+ rlocator.backend = backend;
+
+ /* Check for existing file of same name */
+ rpath = relpath(rlocator, MAIN_FORKNUM);
+
+ Assert(access(rpath, F_OK) != 0);
+}
+#endif
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 9b03579..9d58451 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -342,10 +342,18 @@ heap_create(const char *relname,
{
/*
* If relfilenumber is unspecified by the caller then create storage
- * with oid same as relid.
+ * with a relfilenumber equal to the relid if it is a system table;
+ * otherwise, allocate a new relfilenumber. For more details, read the
+ * comments atop the FirstNormalRelFileNumber declaration.
*/
if (!RelFileNumberIsValid(relfilenumber))
- relfilenumber = relid;
+ {
+ relfilenumber = relid < FirstNormalObjectId ?
+ relid : GetNewRelFileNumber();
+ AssertRelfileNumberFileNotExists(reltablespace,
+ relfilenumber,
+ relpersistence);
+ }
}
/*
@@ -898,7 +906,7 @@ InsertPgClassTuple(Relation pg_class_desc,
values[Anum_pg_class_reloftype - 1] = ObjectIdGetDatum(rd_rel->reloftype);
values[Anum_pg_class_relowner - 1] = ObjectIdGetDatum(rd_rel->relowner);
values[Anum_pg_class_relam - 1] = ObjectIdGetDatum(rd_rel->relam);
- values[Anum_pg_class_relfilenode - 1] = ObjectIdGetDatum(rd_rel->relfilenode);
+ values[Anum_pg_class_relfilenode - 1] = Int64GetDatum(rd_rel->relfilenode);
values[Anum_pg_class_reltablespace - 1] = ObjectIdGetDatum(rd_rel->reltablespace);
values[Anum_pg_class_relpages - 1] = Int32GetDatum(rd_rel->relpages);
values[Anum_pg_class_reltuples - 1] = Float4GetDatum(rd_rel->reltuples);
@@ -1170,12 +1178,7 @@ heap_create_with_catalog(const char *relname,
if (shared_relation && reltablespace != GLOBALTABLESPACE_OID)
elog(ERROR, "shared relations must be placed in pg_global tablespace");
- /*
- * Allocate an OID for the relation, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the relation, unless we were told what to use. */
if (!OidIsValid(relid))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -1229,8 +1232,8 @@ heap_create_with_catalog(const char *relname,
}
if (!OidIsValid(relid))
- relid = GetNewRelFileNumber(reltablespace, pg_class_desc,
- relpersistence);
+ relid = GetNewOidWithIndex(pg_class_desc, ClassOidIndexId,
+ Anum_pg_class_oid);
}
/*
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index d7192f3..a509c3e 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -898,12 +898,7 @@ index_create(Relation heapRelation,
collationObjectId,
classObjectId);
- /*
- * Allocate an OID for the index, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the index, unless we were told what to use. */
if (!OidIsValid(indexRelationId))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -935,8 +930,8 @@ index_create(Relation heapRelation,
}
else
{
- indexRelationId =
- GetNewRelFileNumber(tableSpaceId, pg_class, relpersistence);
+ indexRelationId = GetNewOidWithIndex(pg_class, ClassOidIndexId,
+ Anum_pg_class_oid);
}
}
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index d708af1..c1bd4f8 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -968,6 +968,10 @@ smgr_redo(XLogReaderState *record)
xl_smgr_create *xlrec = (xl_smgr_create *) XLogRecGetData(record);
SMgrRelation reln;
+ if (xlrec->rlocator.relNumber > ShmemVariableCache->nextRelFileNumber)
+ elog(ERROR, "unexpected relnumber " INT64_FORMAT " that is bigger than nextRelFileNumber " INT64_FORMAT,
+ xlrec->rlocator.relNumber, ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
smgrcreate(reln, xlrec->forkNum, true);
}
@@ -981,6 +985,10 @@ smgr_redo(XLogReaderState *record)
int nforks = 0;
bool need_fsm_vacuum = false;
+ if (xlrec->rlocator.relNumber > ShmemVariableCache->nextRelFileNumber)
+ elog(ERROR, "unexpected relnumber " INT64_FORMAT " that is bigger than nextRelFileNumber " INT64_FORMAT,
+ xlrec->rlocator.relNumber, ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
/*
diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index 099d369..d09168d 100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -250,7 +250,7 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
BlockNumber nblocks;
BlockNumber blkno;
Buffer buf;
- Oid relfilenumber;
+ RelFileNumber relfilenumber;
Page page;
List *rlocatorlist = NIL;
LockRelId relid;
@@ -397,7 +397,7 @@ ScanSourceDatabasePgClassTuple(HeapTupleData *tuple, Oid tbid, Oid dbid,
{
CreateDBRelInfo *relinfo;
Form_pg_class classForm;
- Oid relfilenumber = InvalidRelFileNumber;
+ RelFileNumber relfilenumber = InvalidRelFileNumber;
classForm = (Form_pg_class) GETSTRUCT(tuple);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 7fbee0c..b0539d6 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14331,11 +14331,17 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
}
/*
- * Relfilenumbers are not unique in databases across tablespaces, so we
- * need to allocate a new one in the new tablespace.
- */
- newrelfilenumber = GetNewRelFileNumber(newTableSpace, NULL,
- rel->rd_rel->relpersistence);
+ * Generate a new relfilenumber. We cannot reuse the old relfilenumber
+ * because of the possibility that the relation will be moved back to the
+ * original tablespace before the next checkpoint. At that point, the
+ * first segment of the main fork won't have been unlinked yet, and an
+ * attempt to create new relation storage with that same relfilenumber
+ * will fail.
+ */
+ newrelfilenumber = GetNewRelFileNumber();
+ AssertRelfileNumberFileNotExists(newTableSpace,
+ newrelfilenumber,
+ rel->rd_rel->relpersistence);
/* Open old and new relation */
newrlocator = rel->rd_locator;
diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index cb7d460..66edc64 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -290,7 +290,8 @@ CreateTableSpace(CreateTableSpaceStmt *stmt)
* parts.
*/
if (strlen(location) + 1 + strlen(TABLESPACE_VERSION_DIRECTORY) + 1 +
- OIDCHARS + 1 + OIDCHARS + 1 + FORKNAMECHARS + 1 + OIDCHARS > MAXPGPATH)
+ OIDCHARS + 1 + RELNUMBERCHARS + 1 + FORKNAMECHARS + 1 + OIDCHARS >
+ MAXPGPATH)
ereport(ERROR,
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
errmsg("tablespace location \"%s\" is too long",
diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index f3309c3..9ddefc2 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -953,12 +953,12 @@ _read${n}(void)
print $off "\tWRITE_UINT_FIELD($f);\n";
print $rff "\tREAD_UINT_FIELD($f);\n" unless $no_read;
}
- elsif ($t eq 'uint64')
+ elsif ($t eq 'uint64' || $t eq 'RelFileNumber')
{
print $off "\tWRITE_UINT64_FIELD($f);\n";
print $rff "\tREAD_UINT64_FIELD($f);\n" unless $no_read;
}
- elsif ($t eq 'Oid' || $t eq 'RelFileNumber')
+ elsif ($t eq 'Oid')
{
print $off "\tWRITE_OID_FIELD($f);\n";
print $rff "\tREAD_OID_FIELD($f);\n" unless $no_read;
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 637c0ce..4e62d85 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -1172,7 +1172,7 @@ sendDir(bbsink *sink, const char *path, int basepathlen, bool sizeonly,
int excludeIdx;
bool excludeFound;
ForkNumber relForkNum; /* Type of fork if file is a relation */
- int relOidChars; /* Chars in filename that are the rel oid */
+ int relnumchars; /* Chars in filename that are the relnumber */
/* Skip special stuff */
if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
@@ -1222,23 +1222,23 @@ sendDir(bbsink *sink, const char *path, int basepathlen, bool sizeonly,
/* Exclude all forks for unlogged tables except the init fork */
if (isDbDir &&
- parse_filename_for_nontemp_relation(de->d_name, &relOidChars,
+ parse_filename_for_nontemp_relation(de->d_name, &relnumchars,
&relForkNum))
{
/* Never exclude init forks */
if (relForkNum != INIT_FORKNUM)
{
char initForkFile[MAXPGPATH];
- char relOid[OIDCHARS + 1];
+ char relNumber[RELNUMBERCHARS + 1];
/*
* If any other type of fork, check if there is an init fork
* with the same OID. If so, the file can be excluded.
*/
- memcpy(relOid, de->d_name, relOidChars);
- relOid[relOidChars] = '\0';
+ memcpy(relNumber, de->d_name, relnumchars);
+ relNumber[relnumchars] = '\0';
snprintf(initForkFile, sizeof(initForkFile), "%s/%s_init",
- path, relOid);
+ path, relNumber);
if (lstat(initForkFile, &statbuf) == 0)
{
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index c5c6a2b..7029604 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -154,6 +154,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
break;
case XLOG_NOOP:
case XLOG_NEXTOID:
+ case XLOG_NEXT_RELFILENUMBER:
case XLOG_SWITCH:
case XLOG_BACKUP_END:
case XLOG_PARAMETER_CHANGE:
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 88a37fd..99580a4 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -4869,7 +4869,7 @@ DisplayMapping(HTAB *tuplecid_data)
hash_seq_init(&hstat, tuplecid_data);
while ((ent = (ReorderBufferTupleCidEnt *) hash_seq_search(&hstat)) != NULL)
{
- elog(DEBUG3, "mapping: node: %u/%u/%u tid: %u/%u cmin: %u, cmax: %u",
+ elog(DEBUG3, "mapping: node: %u/%u/" INT64_FORMAT " tid: %u/%u cmin: %u, cmax: %u",
ent->key.rlocator.dbOid,
ent->key.rlocator.spcOid,
ent->key.rlocator.relNumber,
diff --git a/src/backend/storage/file/reinit.c b/src/backend/storage/file/reinit.c
index f053fe0..b31b88a 100644
--- a/src/backend/storage/file/reinit.c
+++ b/src/backend/storage/file/reinit.c
@@ -195,11 +195,11 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
while ((de = ReadDir(dbspace_dir, dbspacedirname)) != NULL)
{
ForkNumber forkNum;
- int oidchars;
+ int relnumchars;
unlogged_relation_entry ent;
/* Skip anything that doesn't look like a relation data file. */
- if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars,
+ if (!parse_filename_for_nontemp_relation(de->d_name, &relnumchars,
&forkNum))
continue;
@@ -235,11 +235,11 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
while ((de = ReadDir(dbspace_dir, dbspacedirname)) != NULL)
{
ForkNumber forkNum;
- int oidchars;
+ int relnumchars;
unlogged_relation_entry ent;
/* Skip anything that doesn't look like a relation data file. */
- if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars,
+ if (!parse_filename_for_nontemp_relation(de->d_name, &relnumchars,
&forkNum))
continue;
@@ -285,13 +285,13 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
while ((de = ReadDir(dbspace_dir, dbspacedirname)) != NULL)
{
ForkNumber forkNum;
- int oidchars;
- char oidbuf[OIDCHARS + 1];
+ int relnumchars;
+ char relnumbuf[RELNUMBERCHARS + 1];
char srcpath[MAXPGPATH * 2];
char dstpath[MAXPGPATH];
/* Skip anything that doesn't look like a relation data file. */
- if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars,
+ if (!parse_filename_for_nontemp_relation(de->d_name, &relnumchars,
&forkNum))
continue;
@@ -304,10 +304,10 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
dbspacedirname, de->d_name);
/* Construct destination pathname. */
- memcpy(oidbuf, de->d_name, oidchars);
- oidbuf[oidchars] = '\0';
+ memcpy(relnumbuf, de->d_name, relnumchars);
+ relnumbuf[relnumchars] = '\0';
snprintf(dstpath, sizeof(dstpath), "%s/%s%s",
- dbspacedirname, oidbuf, de->d_name + oidchars + 1 +
+ dbspacedirname, relnumbuf, de->d_name + relnumchars + 1 +
strlen(forkNames[INIT_FORKNUM]));
/* OK, we're ready to perform the actual copy. */
@@ -328,12 +328,12 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
while ((de = ReadDir(dbspace_dir, dbspacedirname)) != NULL)
{
ForkNumber forkNum;
- int oidchars;
- char oidbuf[OIDCHARS + 1];
+ int relnumchars;
+ char relnumbuf[RELNUMBERCHARS + 1];
char mainpath[MAXPGPATH];
/* Skip anything that doesn't look like a relation data file. */
- if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars,
+ if (!parse_filename_for_nontemp_relation(de->d_name, &relnumchars,
&forkNum))
continue;
@@ -342,10 +342,10 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
continue;
/* Construct main fork pathname. */
- memcpy(oidbuf, de->d_name, oidchars);
- oidbuf[oidchars] = '\0';
+ memcpy(relnumbuf, de->d_name, relnumchars);
+ relnumbuf[relnumchars] = '\0';
snprintf(mainpath, sizeof(mainpath), "%s/%s%s",
- dbspacedirname, oidbuf, de->d_name + oidchars + 1 +
+ dbspacedirname, relnumbuf, de->d_name + relnumchars + 1 +
strlen(forkNames[INIT_FORKNUM]));
fsync_fname(mainpath, false);
@@ -372,13 +372,13 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
* for a non-temporary relation and false otherwise.
*
* NB: If this function returns true, the caller is entitled to assume that
- * *oidchars has been set to the a value no more than OIDCHARS, and thus
- * that a buffer of OIDCHARS+1 characters is sufficient to hold the OID
- * portion of the filename. This is critical to protect against a possible
- * buffer overrun.
+ * *relnumchars has been set to a value no more than RELNUMBERCHARS, and thus
+ * that a buffer of RELNUMBERCHARS+1 characters is sufficient to hold the
+ * RelFileNumber portion of the filename. This is critical to protect against
+ * a possible buffer overrun.
*/
bool
-parse_filename_for_nontemp_relation(const char *name, int *oidchars,
+parse_filename_for_nontemp_relation(const char *name, int *relnumchars,
ForkNumber *fork)
{
int pos;
@@ -386,9 +386,9 @@ parse_filename_for_nontemp_relation(const char *name, int *oidchars,
/* Look for a non-empty string of digits (that isn't too long). */
for (pos = 0; isdigit((unsigned char) name[pos]); ++pos)
;
- if (pos == 0 || pos > OIDCHARS)
+ if (pos == 0 || pos > RELNUMBERCHARS)
return false;
- *oidchars = pos;
+ *relnumchars = pos;
/* Check for a fork name. */
if (name[pos] != '_')
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index af4dab7..172225b 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -273,7 +273,7 @@ restart:
BlockNumber blknum;
BufferGetTag(buf, &rlocator, &forknum, &blknum);
- elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
+ elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/" INT64_FORMAT,
blknum, rlocator.spcOid, rlocator.dbOid, rlocator.relNumber);
/* make sure we hold an exclusive lock */
diff --git a/src/backend/storage/lmgr/lwlocknames.txt b/src/backend/storage/lmgr/lwlocknames.txt
index 6c7cf6c..3c5d041 100644
--- a/src/backend/storage/lmgr/lwlocknames.txt
+++ b/src/backend/storage/lmgr/lwlocknames.txt
@@ -53,3 +53,4 @@ XactTruncationLock 44
# 45 was XactTruncationLock until removal of BackendRandomLock
WrapLimitsVacuumLock 46
NotifyQueueTailLock 47
+RelFileNumberGenLock 48
\ No newline at end of file
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 3998296..a88fef2 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -257,6 +257,13 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* next checkpoint, we prevent reassignment of the relfilenumber until it's
* safe, because relfilenumber assignment skips over any existing file.
*
+ * XXX. Although all of this was true when relfilenumbers were 32 bits wide,
+ * they are now 56 bits wide and do not wrap around, so in the future we can
+ * change the code to immediately unlink the first segment of the relation
+ * along with all the others. We still do reuse relfilenumbers when createdb()
+ * is performed using the file-copy method or during movedb(), but the scenario
+ * described above can only happen when creating a new relation.
+ *
* We do not need to go through this dance for temp relations, though, because
* we never make WAL entries for temp rels, and so a temp rel poses no threat
* to the health of a regular rel that has taken over its relfilenumber.
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index c1a5feb..ed46ac3 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -154,7 +154,7 @@ smgropen(RelFileLocator rlocator, BackendId backend)
/* First time through: initialize the hash table */
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileLocatorBackend);
+ ctl.keysize = SizeOfRelFileLocatorBackend;
ctl.entrysize = sizeof(SMgrRelationData);
SMgrRelationHash = hash_create("smgr relation table", 400,
&ctl, HASH_ELEM | HASH_BLOBS);
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 34efa12..31d7471 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -878,7 +878,7 @@ pg_relation_filenode(PG_FUNCTION_ARGS)
if (!RelFileNumberIsValid(result))
PG_RETURN_NULL();
- PG_RETURN_OID(result);
+ PG_RETURN_INT64(result);
}
/*
@@ -898,9 +898,16 @@ Datum
pg_filenode_relation(PG_FUNCTION_ARGS)
{
Oid reltablespace = PG_GETARG_OID(0);
- RelFileNumber relfilenumber = PG_GETARG_OID(1);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(1);
Oid heaprel;
+ /* Check whether the relfilenumber is within the valid range. */
+ if (relfilenumber < 0 || relfilenumber > MAX_RELFILENUMBER)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("relfilenumber " INT64_FORMAT " is out of range",
+ relfilenumber)));
+
/* test needed so RelidByRelfilenumber doesn't misbehave */
if (!RelFileNumberIsValid(relfilenumber))
PG_RETURN_NULL();
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 797f5f5..ff8f0c2 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -17,6 +17,7 @@
#include "catalog/pg_type.h"
#include "commands/extension.h"
#include "miscadmin.h"
+#include "storage/relfilelocator.h"
#include "utils/array.h"
#include "utils/builtins.h"
@@ -29,6 +30,15 @@ do { \
errmsg("function can only be called when server is in binary upgrade mode"))); \
} while (0)
+#define CHECK_RELFILENUMBER_RANGE(relfilenumber) \
+do { \
+ if ((relfilenumber) < 0 || (relfilenumber) > MAX_RELFILENUMBER) \
+ ereport(ERROR, \
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE), \
+ errmsg("relfilenumber " INT64_FORMAT " is out of range", \
+ (relfilenumber)))); \
+} while (0)
+
Datum
binary_upgrade_set_next_pg_tablespace_oid(PG_FUNCTION_ARGS)
{
@@ -98,10 +108,12 @@ binary_upgrade_set_next_heap_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_heap_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_heap_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -120,10 +132,12 @@ binary_upgrade_set_next_index_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_index_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_index_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -142,10 +156,12 @@ binary_upgrade_set_next_toast_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_toast_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_toast_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index bdb771d..c0de6b1 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -3708,8 +3708,10 @@ RelationSetNewRelfilenumber(Relation relation, char persistence)
RelFileLocator newrlocator;
/* Allocate a new relfilenumber */
- newrelfilenumber = GetNewRelFileNumber(relation->rd_rel->reltablespace,
- NULL, persistence);
+ newrelfilenumber = GetNewRelFileNumber();
+ AssertRelfileNumberFileNotExists(relation->rd_rel->reltablespace,
+ newrelfilenumber,
+ persistence);
/*
* Get a writable copy of the pg_class tuple for the given relation.
diff --git a/src/backend/utils/cache/relfilenumbermap.c b/src/backend/utils/cache/relfilenumbermap.c
index c4245d5..cbb18f0 100644
--- a/src/backend/utils/cache/relfilenumbermap.c
+++ b/src/backend/utils/cache/relfilenumbermap.c
@@ -32,17 +32,11 @@
static HTAB *RelfilenumberMapHash = NULL;
/* built first time through in InitializeRelfilenumberMap */
-static ScanKeyData relfilenumber_skey[2];
+static ScanKeyData relfilenumber_skey[1];
typedef struct
{
- Oid reltablespace;
- RelFileNumber relfilenumber;
-} RelfilenumberMapKey;
-
-typedef struct
-{
- RelfilenumberMapKey key; /* lookup key - must be first */
+ RelFileNumber relfilenumber; /* lookup key - must be first */
Oid relid; /* pg_class.oid */
} RelfilenumberMapEntry;
@@ -72,7 +66,7 @@ RelfilenumberMapInvalidateCallback(Datum arg, Oid relid)
entry->relid == relid) /* individual flushed relation */
{
if (hash_search(RelfilenumberMapHash,
- (void *) &entry->key,
+ (void *) &entry->relfilenumber,
HASH_REMOVE,
NULL) == NULL)
elog(ERROR, "hash table corrupted");
@@ -88,7 +82,6 @@ static void
InitializeRelfilenumberMap(void)
{
HASHCTL ctl;
- int i;
/* Make sure we've initialized CacheMemoryContext. */
if (CacheMemoryContext == NULL)
@@ -97,25 +90,20 @@ InitializeRelfilenumberMap(void)
/* build skey */
MemSet(&relfilenumber_skey, 0, sizeof(relfilenumber_skey));
- for (i = 0; i < 2; i++)
- {
- fmgr_info_cxt(F_OIDEQ,
- &relfilenumber_skey[i].sk_func,
- CacheMemoryContext);
- relfilenumber_skey[i].sk_strategy = BTEqualStrategyNumber;
- relfilenumber_skey[i].sk_subtype = InvalidOid;
- relfilenumber_skey[i].sk_collation = InvalidOid;
- }
-
- relfilenumber_skey[0].sk_attno = Anum_pg_class_reltablespace;
- relfilenumber_skey[1].sk_attno = Anum_pg_class_relfilenode;
+ fmgr_info_cxt(F_INT8EQ,
+ &relfilenumber_skey[0].sk_func,
+ CacheMemoryContext);
+ relfilenumber_skey[0].sk_strategy = BTEqualStrategyNumber;
+ relfilenumber_skey[0].sk_subtype = InvalidOid;
+ relfilenumber_skey[0].sk_collation = InvalidOid;
+ relfilenumber_skey[0].sk_attno = Anum_pg_class_relfilenode;
/*
* Only create the RelfilenumberMapHash now, so we don't end up partially
* initialized when fmgr_info_cxt() above ERRORs out with an out of memory
* error.
*/
- ctl.keysize = sizeof(RelfilenumberMapKey);
+ ctl.keysize = sizeof(RelFileNumber);
ctl.entrysize = sizeof(RelfilenumberMapEntry);
ctl.hcxt = CacheMemoryContext;
@@ -129,7 +117,7 @@ InitializeRelfilenumberMap(void)
}
/*
- * Map a relation's (tablespace, relfilenumber) to a relation's oid and cache
+ * Map a relation's relfilenumber to a relation's oid and cache
* the result.
*
* Returns InvalidOid if no relation matching the criteria could be found.
@@ -137,26 +125,17 @@ InitializeRelfilenumberMap(void)
Oid
RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
{
- RelfilenumberMapKey key;
RelfilenumberMapEntry *entry;
bool found;
SysScanDesc scandesc;
Relation relation;
HeapTuple ntp;
- ScanKeyData skey[2];
+ ScanKeyData skey[1];
Oid relid;
if (RelfilenumberMapHash == NULL)
InitializeRelfilenumberMap();
- /* pg_class will show 0 when the value is actually MyDatabaseTableSpace */
- if (reltablespace == MyDatabaseTableSpace)
- reltablespace = 0;
-
- MemSet(&key, 0, sizeof(key));
- key.reltablespace = reltablespace;
- key.relfilenumber = relfilenumber;
-
/*
* Check cache and return entry if one is found. Even if no target
* relation can be found later on we store the negative match and return a
@@ -164,7 +143,8 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
* since querying invalid values isn't supposed to be a frequent thing,
* but it's basically free.
*/
- entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_FIND, &found);
+ entry = hash_search(RelfilenumberMapHash, (void *) &relfilenumber,
+ HASH_FIND, &found);
if (found)
return entry->relid;
@@ -195,14 +175,13 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
memcpy(skey, relfilenumber_skey, sizeof(skey));
/* set scan arguments */
- skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
- skey[1].sk_argument = ObjectIdGetDatum(relfilenumber);
+ skey[0].sk_argument = Int64GetDatum(relfilenumber);
scandesc = systable_beginscan(relation,
- ClassTblspcRelfilenodeIndexId,
+ ClassRelfilenodeIndexId,
true,
NULL,
- 2,
+ 1,
skey);
found = false;
@@ -213,11 +192,10 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
if (found)
elog(ERROR,
- "unexpected duplicate for tablespace %u, relfilenumber %u",
- reltablespace, relfilenumber);
+ "unexpected duplicate for relfilenumber " INT64_FORMAT,
+ relfilenumber);
found = true;
- Assert(classform->reltablespace == reltablespace);
Assert(classform->relfilenode == relfilenumber);
relid = classform->oid;
}
@@ -235,7 +213,8 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
* caused cache invalidations to be executed which would have deleted a
* new entry if we had entered it above.
*/
- entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_ENTER, &found);
+ entry = hash_search(RelfilenumberMapHash, (void *) &relfilenumber,
+ HASH_ENTER, &found);
if (found)
elog(ERROR, "corrupted hashtable");
entry->relid = relid;
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index 781f8b8..3c1fef4 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -79,8 +79,8 @@ pg_control_system(PG_FUNCTION_ARGS)
Datum
pg_control_checkpoint(PG_FUNCTION_ARGS)
{
- Datum values[18];
- bool nulls[18];
+ Datum values[19];
+ bool nulls[19];
TupleDesc tupdesc;
HeapTuple htup;
ControlFileData *ControlFile;
@@ -129,6 +129,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
XIDOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 18, "checkpoint_time",
TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 19, "next_relfilenumber",
+ INT8OID, -1, 0);
tupdesc = BlessTupleDesc(tupdesc);
/* Read the control file. */
@@ -202,6 +204,9 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
values[17] = TimestampTzGetDatum(time_t_to_timestamptz(ControlFile->checkPointCopy.time));
nulls[17] = false;
+ values[18] = Int64GetDatum(ControlFile->checkPointCopy.nextRelFileNumber);
+ nulls[18] = false;
+
htup = heap_form_tuple(tupdesc, values, nulls);
PG_RETURN_DATUM(HeapTupleGetDatum(htup));
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index dc20122..30933fd 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -489,9 +489,7 @@ main(int argc, char *argv[])
mode = PG_MODE_ENABLE;
break;
case 'f':
- if (!option_parse_int(optarg, "-f/--filenode", 0,
- INT_MAX,
- NULL))
+ if (!option_parse_relfilenumber(optarg, "-f/--filenode"))
exit(1);
only_filenode = pstrdup(optarg);
break;
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index c390ec5..f727078 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -250,6 +250,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
EpochFromFullTransactionId(ControlFile->checkPointCopy.nextXid),
XidFromFullTransactionId(ControlFile->checkPointCopy.nextXid));
+ printf(_("Latest checkpoint's NextRelFileNumber: " INT64_FORMAT "\n"),
+ ControlFile->checkPointCopy.nextRelFileNumber);
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index e4fdb6b..546ca23 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3142,9 +3142,9 @@ dumpDatabase(Archive *fout)
PQExpBuffer loFrozenQry = createPQExpBuffer();
PQExpBuffer loOutQry = createPQExpBuffer();
int i_relfrozenxid,
- i_relfilenode,
i_oid,
i_relminmxid;
+ RelFileNumber i_relfilenode;
/*
* pg_largeobject
@@ -3170,11 +3170,11 @@ dumpDatabase(Archive *fout)
appendPQExpBufferStr(loOutQry, "\n-- For binary upgrade, preserve values for pg_largeobject and its index\n");
for (int i = 0; i < PQntuples(lo_res); ++i)
appendPQExpBuffer(loOutQry, "UPDATE pg_catalog.pg_class\n"
- "SET relfrozenxid = '%u', relminmxid = '%u', relfilenode = '%u'\n"
+ "SET relfrozenxid = '%u', relminmxid = '%u', relfilenode = " INT64_FORMAT "\n"
"WHERE oid = %u;\n",
atooid(PQgetvalue(lo_res, i, i_relfrozenxid)),
atooid(PQgetvalue(lo_res, i, i_relminmxid)),
- atooid(PQgetvalue(lo_res, i, i_relfilenode)),
+ atorelnumber(PQgetvalue(lo_res, i, i_relfilenode)),
atooid(PQgetvalue(lo_res, i, i_oid)));
ArchiveEntry(fout, nilCatalogId, createDumpId(),
@@ -4842,16 +4842,16 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
relkind = *PQgetvalue(upgrade_res, 0, PQfnumber(upgrade_res, "relkind"));
- relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "relfilenode")));
+ relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "relfilenode")));
toast_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "reltoastrelid")));
- toast_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_relfilenode")));
+ toast_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_relfilenode")));
toast_index_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "indexrelid")));
- toast_index_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_index_relfilenode")));
+ toast_index_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_index_relfilenode")));
appendPQExpBufferStr(upgrade_buffer,
"\n-- For binary upgrade, must preserve pg_class oids and relfilenodes\n");
@@ -4869,7 +4869,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
*/
if (RelFileNumberIsValid(relfilenumber) && relkind != RELKIND_PARTITIONED_TABLE)
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
/*
@@ -4883,7 +4883,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_toast_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_relfilenumber);
/* every toast table has an index */
@@ -4891,7 +4891,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_index_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_index_relfilenumber);
}
@@ -4904,7 +4904,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
pg_class_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
}
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 269ed64..8be5e66 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -538,7 +538,7 @@ isRelDataFile(const char *path)
segNo = 0;
matched = false;
- nmatch = sscanf(path, "global/%u.%u", &rlocator.relNumber, &segNo);
+ nmatch = sscanf(path, "global/" INT64_FORMAT ".%u", &rlocator.relNumber, &segNo);
if (nmatch == 1 || nmatch == 2)
{
rlocator.spcOid = GLOBALTABLESPACE_OID;
@@ -547,7 +547,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "base/%u/%u.%u",
+ nmatch = sscanf(path, "base/%u/" INT64_FORMAT ".%u",
&rlocator.dbOid, &rlocator.relNumber, &segNo);
if (nmatch == 2 || nmatch == 3)
{
@@ -556,7 +556,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/%u.%u",
+ nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/" INT64_FORMAT ".%u",
&rlocator.spcOid, &rlocator.dbOid, &rlocator.relNumber,
&segNo);
if (nmatch == 3 || nmatch == 4)
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index df374ce..e97ba4d 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -399,8 +399,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
i_reloid,
i_indtable,
i_toastheap,
- i_relfilenumber,
i_reltablespace;
+ RelFileNumber i_relfilenumber;
char query[QUERY_ALLOC];
char *last_namespace = NULL,
*last_tablespace = NULL;
@@ -527,7 +527,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
relname = PQgetvalue(res, relnum, i_relname);
curr->relname = pg_strdup(relname);
- curr->relfilenumber = atooid(PQgetvalue(res, relnum, i_relfilenumber));
+ curr->relfilenumber =
+ atorelnumber(PQgetvalue(res, relnum, i_relfilenumber));
curr->tblsp_alloc = false;
/* Is the tablespace oid non-default? */
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index 115faa2..7ab1bcc 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -15,10 +15,8 @@
* oids are the same between old and new clusters. This is important
* because toast oids are stored as toast pointers in user tables.
*
- * While pg_class.oid and pg_class.relfilenode are initially the same in a
- * cluster, they can diverge due to CLUSTER, REINDEX, or VACUUM FULL. We
- * control assignments of pg_class.relfilenode because we want the filenames
- * to match between the old and new cluster.
+ * We control assignments of pg_class.relfilenode because we want the
+ * filenames to match between the old and new cluster.
*
* We control assignment of pg_tablespace.oid because we want the oid to match
* between the old and new cluster.
diff --git a/src/bin/pg_upgrade/relfilenumber.c b/src/bin/pg_upgrade/relfilenumber.c
index c3c7402..c8b10e4 100644
--- a/src/bin/pg_upgrade/relfilenumber.c
+++ b/src/bin/pg_upgrade/relfilenumber.c
@@ -190,14 +190,14 @@ transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_fro
else
snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
- snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
+ snprintf(old_file, sizeof(old_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->old_tablespace,
map->old_tablespace_suffix,
map->db_oid,
map->relfilenumber,
type_suffix,
extent_suffix);
- snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
+ snprintf(new_file, sizeof(new_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->new_tablespace,
map->new_tablespace_suffix,
map->db_oid,
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 6528113..9aca583 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -884,7 +884,7 @@ main(int argc, char **argv)
}
break;
case 'R':
- if (sscanf(optarg, "%u/%u/%u",
+ if (sscanf(optarg, "%u/%u/" INT64_FORMAT,
&config.filter_by_relation.spcOid,
&config.filter_by_relation.dbOid,
&config.filter_by_relation.relNumber) != 3 ||
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 1b6b620..0774e3f 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -149,10 +149,10 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
Assert(dbOid == 0);
Assert(backendId == InvalidBackendId);
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("global/%u_%s",
+ path = psprintf("global/" INT64_FORMAT "_%s",
relNumber, forkNames[forkNumber]);
else
- path = psprintf("global/%u", relNumber);
+ path = psprintf("global/" INT64_FORMAT, relNumber);
}
else if (spcOid == DEFAULTTABLESPACE_OID)
{
@@ -160,21 +160,21 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/%u_%s",
+ path = psprintf("base/%u/" INT64_FORMAT "_%s",
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/%u",
+ path = psprintf("base/%u/" INT64_FORMAT,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/t%d_%u_%s",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT "_%s",
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/t%d_%u",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT,
dbOid, backendId, relNumber);
}
}
@@ -184,24 +184,24 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber);
}
diff --git a/src/fe_utils/option_utils.c b/src/fe_utils/option_utils.c
index abea881..7d49359 100644
--- a/src/fe_utils/option_utils.c
+++ b/src/fe_utils/option_utils.c
@@ -82,3 +82,42 @@ option_parse_int(const char *optarg, const char *optname,
*result = val;
return true;
}
+
+/*
+ * option_parse_relfilenumber
+ *
+ * Parse a relfilenumber value for an option.  Returns true if parsing is
+ * successful; if parsing fails, logs an error and returns false.
+ */
+bool
+option_parse_relfilenumber(const char *optarg, const char *optname)
+{
+ char *endptr;
+ int64 val;
+
+ errno = 0;
+ val = strtoi64(optarg, &endptr, 10);
+
+ /*
+ * Skip any trailing whitespace; if anything but whitespace remains before
+ * the terminating character, fail.
+ */
+ while (*endptr != '\0' && isspace((unsigned char) *endptr))
+ endptr++;
+
+ if (*endptr != '\0')
+ {
+ pg_log_error("invalid value \"%s\" for option %s",
+ optarg, optname);
+ return false;
+ }
+
+ if (val < 0 || val > MAX_RELFILENUMBER)
+ {
+ pg_log_error("%s must be in range " INT64_FORMAT ".." INT64_FORMAT,
+ optname, (RelFileNumber) 0, MAX_RELFILENUMBER);
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 775471d..931c187 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -196,6 +196,21 @@ FullTransactionIdAdvance(FullTransactionId *dest)
#define FirstUnpinnedObjectId 12000
#define FirstNormalObjectId 16384
+/* ----------
+ * RelFileNumber zero is InvalidRelFileNumber.
+ *
+ * For system tables (OID < FirstNormalObjectId) the initial storage is
+ * created with a relfilenumber equal to the table's OID.  Any relfilenumber
+ * subsequently allocated by GetNewRelFileNumber() starts at 100000.  Thus,
+ * when upgrading from an older cluster, the storage paths of user tables
+ * from the old cluster cannot conflict with the storage paths of system
+ * tables in the new cluster.  (The new cluster must not contain any user
+ * tables while upgrading, so those need no such protection.)
+ * ----------
+ */
+#define FirstNormalRelFileNumber ((RelFileNumber) 100000)
+
/*
* VariableCache is a data structure in shared memory that is used to track
* OID and XID assignment state. For largely historical reasons, there is
@@ -213,6 +228,10 @@ typedef struct VariableCacheData
*/
Oid nextOid; /* next OID to assign */
uint32 oidCount; /* OIDs available before must do XLOG work */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber to assign */
+ RelFileNumber loggedRelFileNumber; /* last logged relfilenumber */
+ XLogRecPtr loggedRelFileNumberRecPtr; /* xlog record pointer w.r.t.
+ * loggedRelFileNumber */
/*
* These fields are protected by XidGenLock.
@@ -293,6 +312,8 @@ extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
extern bool ForceTransactionIdLimitUpdate(void);
extern Oid GetNewObjectId(void);
+extern RelFileNumber GetNewRelFileNumber(void);
+extern void SetNextRelFileNumber(RelFileNumber relnumber);
extern void StopGeneratingPinnedObjectIds(void);
#ifdef USE_ASSERT_CHECKING
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index cd674c3..78660f1 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -234,6 +234,8 @@ extern void CreateCheckPoint(int flags);
extern bool CreateRestartPoint(int flags);
extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN);
extern void XLogPutNextOid(Oid nextOid);
+extern void LogNextRelFileNumber(RelFileNumber nextrelnumber,
+ XLogRecPtr *prevrecptr);
extern XLogRecPtr XLogRestorePoint(const char *rpName);
extern void UpdateFullPageWrites(void);
extern void GetFullPageWriteInfo(XLogRecPtr *RedoRecPtr_p, bool *doPageWrites_p);
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index e1c85f9..be6ba13 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -38,8 +38,14 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
-extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
- Relation pg_class,
- char relpersistence);
+
+#ifdef USE_ASSERT_CHECKING
+extern void AssertRelfileNumberFileNotExists(Oid spcoid,
+ RelFileNumber relnumber,
+ char relpersistence);
+#else
+#define AssertRelfileNumberFileNotExists(spcoid, relnumber, relpersistence) \
+ ((void)true)
+#endif
#endif /* CATALOG_H */
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index e1f4eef..cacfc11 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -34,6 +34,13 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* oid */
Oid oid;
+ /* access method; 0 if not a table / index */
+ Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
+
+ /* identifier of physical storage file */
+ /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
+ int64 relfilenode BKI_DEFAULT(0);
+
/* class name */
NameData relname;
@@ -49,13 +56,6 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* class owner */
Oid relowner BKI_DEFAULT(POSTGRES) BKI_LOOKUP(pg_authid);
- /* access method; 0 if not a table / index */
- Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
-
- /* identifier of physical storage file */
- /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
- Oid relfilenode BKI_DEFAULT(0);
-
/* identifier of table space for relation (0 means default for database) */
Oid reltablespace BKI_DEFAULT(0) BKI_LOOKUP_OPT(pg_tablespace);
@@ -154,7 +154,7 @@ typedef FormData_pg_class *Form_pg_class;
DECLARE_UNIQUE_INDEX_PKEY(pg_class_oid_index, 2662, ClassOidIndexId, on pg_class using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_class_relname_nsp_index, 2663, ClassNameNspIndexId, on pg_class using btree(relname name_ops, relnamespace oid_ops));
-DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode oid_ops));
+DECLARE_INDEX(pg_class_relfilenode_index, 3455, ClassRelfilenodeIndexId, on pg_class using btree(relfilenode int8_ops));
#ifdef EXPOSE_TO_CLIENT_CODE
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 06368e2..096222f 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -41,6 +41,7 @@ typedef struct CheckPoint
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
FullTransactionId nextXid; /* next free transaction ID */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
@@ -78,6 +79,7 @@ typedef struct CheckPoint
#define XLOG_FPI 0xB0
/* 0xC0 is used in Postgres 9.5-11 */
#define XLOG_OVERWRITE_CONTRECORD 0xD0
+#define XLOG_NEXT_RELFILENUMBER 0xE0
/*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 2e41f4d..f6a5b49 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7321,11 +7321,11 @@
proname => 'pg_indexes_size', provolatile => 'v', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_indexes_size' },
{ oid => '2999', descr => 'filenode identifier of relation',
- proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'oid',
+ proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_relation_filenode' },
{ oid => '3454', descr => 'relation OID for filenode and tablespace',
proname => 'pg_filenode_relation', provolatile => 's',
- prorettype => 'regclass', proargtypes => 'oid oid',
+ prorettype => 'regclass', proargtypes => 'oid int8',
prosrc => 'pg_filenode_relation' },
{ oid => '3034', descr => 'file path of relation',
proname => 'pg_relation_filepath', provolatile => 's', prorettype => 'text',
@@ -11191,15 +11191,15 @@
prosrc => 'binary_upgrade_set_missing_value' },
{ oid => '4545', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_heap_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_heap_relfilenode' },
{ oid => '4546', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_index_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_index_relfilenode' },
{ oid => '4547', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_toast_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_toast_relfilenode' },
{ oid => '4548', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_pg_tablespace_oid', provolatile => 'v',
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index 3ab7132..f873306 100644
--- a/src/include/common/relpath.h
+++ b/src/include/common/relpath.h
@@ -29,6 +29,8 @@
/* Characters to allow for an OID in a relation path */
#define OIDCHARS 10 /* max chars printed by %u */
+/* Characters to allow for a RelFileNumber in a relation path */
+#define RELNUMBERCHARS 20 /* max chars printed by INT64_FORMAT */
/*
* Stuff for fork names.
*
diff --git a/src/include/fe_utils/option_utils.h b/src/include/fe_utils/option_utils.h
index 03c09fd..2508a61 100644
--- a/src/include/fe_utils/option_utils.h
+++ b/src/include/fe_utils/option_utils.h
@@ -22,5 +22,7 @@ extern void handle_help_version_opts(int argc, char *argv[],
extern bool option_parse_int(const char *optarg, const char *optname,
int min_range, int max_range,
int *result);
+extern bool option_parse_relfilenumber(const char *optarg,
+ const char *optname);
#endif /* OPTION_UTILS_H */
diff --git a/src/include/postgres_ext.h b/src/include/postgres_ext.h
index c9774fa..12b6b80 100644
--- a/src/include/postgres_ext.h
+++ b/src/include/postgres_ext.h
@@ -48,11 +48,17 @@ typedef PG_INT64_TYPE pg_int64;
/*
* RelFileNumber data type identifies the specific relation file name.
+ * RelFileNumber is unique within a cluster.
*/
-typedef Oid RelFileNumber;
-#define InvalidRelFileNumber ((RelFileNumber) InvalidOid)
+typedef pg_int64 RelFileNumber;
+
+#define InvalidRelFileNumber ((RelFileNumber) 0)
#define RelFileNumberIsValid(relnumber) \
((bool) ((relnumber) != InvalidRelFileNumber))
+#define atorelnumber(x) ((RelFileNumber) strtoull((x), NULL, 10))
+
+/* Max value of the relfilenumber; relfilenumber is 56 bits wide. */
+#define MAX_RELFILENUMBER INT64CONST(0x00FFFFFFFFFFFFFF)
/*
* Identifiers of error message fields. Kept here to keep common
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 092e959..6e85873 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -92,29 +92,73 @@ typedef struct buftag
{
Oid spcOid; /* tablespace oid */
Oid dbOid; /* database oid */
- RelFileNumber relNumber; /* relation file number */
- ForkNumber forkNum; /* fork number */
+
+ /*
+ * Stores the relfilenumber and the fork number.  The high 8 bits of the
+ * first 32-bit integer hold the fork number; the remaining 24 bits of the
+ * first integer and all 32 bits of the second integer hold the
+ * relfilenumber, making it 56 bits wide.  We stop at 56 bits rather than a
+ * full 64 because widening to 64 would enlarge the BufferTag, and we use
+ * two 32-bit integers instead of a single 64-bit one to avoid 8-byte
+ * alignment padding in the BufferTag structure.
+ */
+ uint32 relForkDetails[2];
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
+/* High relNumber bits in relForkDetails[0] */
+#define BUFTAG_RELNUM_HIGH_BITS 24
+
+/* Low relNumber bits in relForkDetails[1] */
+#define BUFTAG_RELNUM_LOW_BITS 32
+
+/* Mask to fetch high bits of relNumber from relForkDetails[0] */
+#define BUFTAG_RELNUM_HIGH_MASK ((1U << BUFTAG_RELNUM_HIGH_BITS) - 1)
+
+/* Mask to fetch low bits of relNumber from relForkDetails[1] */
+#define BUFTAG_RELNUM_LOW_MASK 0XFFFFFFFF
+
static inline RelFileNumber
BufTagGetRelNumber(const BufferTag *tag)
{
- return tag->relNumber;
+ uint64 relnum;
+
+ relnum = ((uint64) tag->relForkDetails[0]) & BUFTAG_RELNUM_HIGH_MASK;
+ relnum = (relnum << BUFTAG_RELNUM_LOW_BITS) | tag->relForkDetails[1];
+
+ Assert(relnum <= MAX_RELFILENUMBER);
+ return (RelFileNumber) relnum;
}
static inline ForkNumber
BufTagGetForkNum(const BufferTag *tag)
{
- return tag->forkNum;
+ int8 ret;
+
+ StaticAssertStmt(MAX_FORKNUM <= INT8_MAX,
+ "MAX_FORKNUM can't be greater than INT8_MAX");
+
+ ret = (int8) (tag->relForkDetails[0] >> BUFTAG_RELNUM_HIGH_BITS);
+ return (ForkNumber) ret;
}
static inline void
BufTagSetRelForkDetails(BufferTag *tag, RelFileNumber relnumber,
ForkNumber forknum)
{
- tag->relNumber = relnumber;
- tag->forkNum = forknum;
+ uint64 relnum;
+
+ Assert(relnumber <= MAX_RELFILENUMBER);
+ Assert(forknum <= MAX_FORKNUM);
+
+ relnum = relnumber;
+
+ tag->relForkDetails[0] = (relnum >> BUFTAG_RELNUM_LOW_BITS) &
+ BUFTAG_RELNUM_HIGH_MASK;
+ tag->relForkDetails[0] |= (forknum << BUFTAG_RELNUM_HIGH_BITS);
+ tag->relForkDetails[1] = relnum & BUFTAG_RELNUM_LOW_MASK;
}
static inline RelFileLocator
@@ -153,9 +197,9 @@ BUFFERTAGS_EQUAL(const BufferTag *tag1, const BufferTag *tag2)
{
return (tag1->spcOid == tag2->spcOid) &&
(tag1->dbOid == tag2->dbOid) &&
- (tag1->relNumber == tag2->relNumber) &&
- (tag1->blockNum == tag2->blockNum) &&
- (tag1->forkNum == tag2->forkNum);
+ (tag1->relForkDetails[0] == tag2->relForkDetails[0]) &&
+ (tag1->relForkDetails[1] == tag2->relForkDetails[1]) &&
+ (tag1->blockNum == tag2->blockNum);
}
static inline bool
diff --git a/src/include/storage/reinit.h b/src/include/storage/reinit.h
index bf2c10d..b990d28 100644
--- a/src/include/storage/reinit.h
+++ b/src/include/storage/reinit.h
@@ -20,7 +20,8 @@
extern void ResetUnloggedRelations(int op);
extern bool parse_filename_for_nontemp_relation(const char *name,
- int *oidchars, ForkNumber *fork);
+ int *relnumchars,
+ ForkNumber *fork);
#define UNLOGGED_RELATION_CLEANUP 0x0001
#define UNLOGGED_RELATION_INIT 0x0002
diff --git a/src/include/storage/relfilelocator.h b/src/include/storage/relfilelocator.h
index 10f41f3..ef90464 100644
--- a/src/include/storage/relfilelocator.h
+++ b/src/include/storage/relfilelocator.h
@@ -32,10 +32,11 @@
* Nonzero dbOid values correspond to pg_database.oid.
*
* relNumber identifies the specific relation. relNumber corresponds to
- * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
- * to assign new physical files to relations in some situations).
- * Notice that relNumber is only unique within a database in a particular
- * tablespace.
+ * pg_class.relfilenode.  Notice that relNumber values are assigned by
+ * GetNewRelFileNumber(), which never assigns the same value twice during
+ * the lifetime of a cluster.  However, since CREATE DATABASE duplicates
+ * the relfilenumbers of the template database, the values are in practice
+ * only unique within a database, not globally.
*
* Note: spcOid must be GLOBALTABLESPACE_OID if and only if dbOid is
* zero. We support shared relations only in the "global" tablespace.
@@ -75,6 +76,9 @@ typedef struct RelFileLocatorBackend
BackendId backend;
} RelFileLocatorBackend;
+#define SizeOfRelFileLocatorBackend \
+ (offsetof(RelFileLocatorBackend, backend) + sizeof(BackendId))
+
#define RelFileLocatorBackendIsTemp(rlocator) \
((rlocator).backend != InvalidBackendId)
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index e3dac16..c0084ae 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -2164,9 +2164,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2175,10 +2174,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+---------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | t | own | child 0 index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | t | own | child 1 index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | t | orig | child 0 index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | t | orig | child 1 index
at_partitioned_id_name_key | t | none | parent index
(6 rows)
@@ -2198,9 +2197,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2209,10 +2207,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+--------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | f | own | parent index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | f | own | parent index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | f | new | parent index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | f | new | parent index
at_partitioned_id_name_key | f | none | parent index
(6 rows)
@@ -2560,7 +2558,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/expected/fast_default.out b/src/test/regress/expected/fast_default.out
index 91f2571..0a35f33 100644
--- a/src/test/regress/expected/fast_default.out
+++ b/src/test/regress/expected/fast_default.out
@@ -3,8 +3,8 @@
--
SET search_path = fast_default;
CREATE SCHEMA fast_default;
-CREATE TABLE m(id OID);
-INSERT INTO m VALUES (NULL::OID);
+CREATE TABLE m(id BIGINT);
+INSERT INTO m VALUES (NULL::BIGINT);
CREATE FUNCTION set(tabname name) RETURNS VOID
AS $$
BEGIN
diff --git a/src/test/regress/expected/oidjoins.out b/src/test/regress/expected/oidjoins.out
index 215eb89..af57470 100644
--- a/src/test/regress/expected/oidjoins.out
+++ b/src/test/regress/expected/oidjoins.out
@@ -74,11 +74,11 @@ NOTICE: checking pg_type {typcollation} => pg_collation {oid}
NOTICE: checking pg_attribute {attrelid} => pg_class {oid}
NOTICE: checking pg_attribute {atttypid} => pg_type {oid}
NOTICE: checking pg_attribute {attcollation} => pg_collation {oid}
+NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {relnamespace} => pg_namespace {oid}
NOTICE: checking pg_class {reltype} => pg_type {oid}
NOTICE: checking pg_class {reloftype} => pg_type {oid}
NOTICE: checking pg_class {relowner} => pg_authid {oid}
-NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {reltablespace} => pg_tablespace {oid}
NOTICE: checking pg_class {reltoastrelid} => pg_class {oid}
NOTICE: checking pg_class {relrewrite} => pg_class {oid}
diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql
index e7013f5..c9622f6 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -1478,9 +1478,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1499,9 +1498,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1641,7 +1639,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/sql/fast_default.sql b/src/test/regress/sql/fast_default.sql
index 16a3b7c..819ec40 100644
--- a/src/test/regress/sql/fast_default.sql
+++ b/src/test/regress/sql/fast_default.sql
@@ -4,8 +4,8 @@
SET search_path = fast_default;
CREATE SCHEMA fast_default;
-CREATE TABLE m(id OID);
-INSERT INTO m VALUES (NULL::OID);
+CREATE TABLE m(id BIGINT);
+INSERT INTO m VALUES (NULL::BIGINT);
CREATE FUNCTION set(tabname name) RETURNS VOID
AS $$
--
1.8.3.1
On Wed, Jul 20, 2022 at 11:27 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
[v10 patch set]
Hi Dilip, I'm experimenting with these patches and will hopefully have
more to say soon, but I just wanted to point out that this builds with
warnings and failed on 3/4 of the CI OSes on cfbot's last run. Maybe
there is the good kind of uninitialised data on Linux, and the bad
kind of uninitialised data on those other pesky systems?
On Wed, Jul 20, 2022 at 4:57 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Mon, Jul 18, 2022 at 4:51 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
I was doing some more testing by setting FirstNormalRelFileNumber
to a high value (more than 32 bits) and I noticed a couple of problems
there, e.g. relpath is still using the OIDCHARS macro, which says a
relfilenumber file name can be at most 10 characters long, which is no
longer true. So there we need to change this value to 20, and we also
need to carefully rename the macros and other variable names used for
this purpose. Similarly, there was some issue in a macro in
buf_internals.h while fetching the relfilenumber. So I will relook
into all those issues and repost the patch soon.

I have fixed these existing issues, and there was also an issue in
pg_dump.c which was creating problems when upgrading to the same
version while using the higher range of the relfilenumber.

There was also an issue where a user table from the old cluster's
relfilenode could conflict with a system table of the new cluster. As
a solution, currently for system table objects (while creating their
storage the first time) we keep the low range of relfilenumbers,
basically using the same relfilenumber as the OID, so that during
upgrade a normal user table from the old cluster will not conflict
with the system tables in the new cluster. But with this solution
Robert pointed out to me (in an off-list chat) a problem: if in the
future we want to make relfilenumbers completely unique within a
cluster by implementing CREATEDB differently, then we cannot do that,
because we have created fixed relfilenodes for the system tables.

I am not sure what exactly we can do to avoid that, because even if we
do something to avoid it in the new cluster, the old cluster might
already be using the non-unique relfilenodes, so after upgrading the
new cluster will also have those non-unique relfilenodes.
Thanks for the patch, my comments from the initial review:
1) Since we have changed the macros to inline functions, should we
rename them to match the other inline functions in the same file,
i.e. ClearBufferTag, InitBufferTag & BufferTagsEqual:
-#define BUFFERTAGS_EQUAL(a,b) \
-( \
- RelFileLocatorEquals((a).rlocator, (b).rlocator) && \
- (a).blockNum == (b).blockNum && \
- (a).forkNum == (b).forkNum \
-)
+static inline void
+CLEAR_BUFFERTAG(BufferTag *tag)
+{
+ tag->rlocator.spcOid = InvalidOid;
+ tag->rlocator.dbOid = InvalidOid;
+ tag->rlocator.relNumber = InvalidRelFileNumber;
+ tag->forkNum = InvalidForkNumber;
+ tag->blockNum = InvalidBlockNumber;
+}
2) We could move these macros along with the other macros at the top of the file:
+/*
+ * The freeNext field is either the index of the next freelist entry,
+ * or one of these special values:
+ */
+#define FREENEXT_END_OF_LIST (-1)
+#define FREENEXT_NOT_IN_LIST (-2)
3) typo thn should be then:
+ * can raise it as necessary if we end up with more mapped relations. For
+ * now, we just pick a round number that is modestly larger thn the expected
+ * number of mappings.
+ */
4) There is one whitespace issue:
git am v10-0004-Widen-relfilenumber-from-32-bits-to-56-bits.patch
Applying: Widen relfilenumber from 32 bits to 56 bits
.git/rebase-apply/patch:1500: space before tab in indent.
(relfilenumber)))); \
warning: 1 line adds whitespace errors.
Regards,
Vignesh
Hi,
As OID and relfilenumber are linked with each other, I still see that
if the OID value reaches its limit, we are unable to create a table
with storage. For example, I set FirstNormalObjectId to 4294967294
(one less than the range limit of 2^32 - 1 = 4294967295). Now when I
try to create a table, the CREATE TABLE command gets stuck because it
is unable to find an OID for the composite type, although it can find
a new relfilenumber.
postgres=# create table t1(a int);
CREATE TABLE
postgres=# select oid, reltype, relfilenode from pg_class where relname =
't1';
oid | reltype | relfilenode
------------+------------+-------------
4294967295 | 4294967294 | 100000
(1 row)
postgres=# create table t2(a int);
^CCancel request sent
ERROR: canceling statement due to user request
Creation of table t2 gets stuck as it is unable to find a new OID.
Basically, the point I am trying to make here is that even though we
will be able to find a new relfilenumber by increasing the
relfilenumber size, commands like the above will still not execute if
the (32-bit) OID value has reached its limit.
--
With Regards,
Ashutosh Sharma.
On Fri, Jul 22, 2022 at 4:21 PM vignesh C <vignesh21@gmail.com> wrote:
Few more typos in 0004 patch as well:
the a value
interger
previosly
currenly
Regards,
Amul
On Mon, Jul 25, 2022 at 9:51 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
Hi,
As oid and relfilenumber are linked with each other, I still see that if the oid value reaches the threshold limit, we are unable to create a table with storage. For example I set FirstNormalObjectId to 4294967294 (one value less than the range limit of 2^32 -1 = 4294967295). Now when I try to create a table, the CREATE TABLE command gets stuck because it is unable to find the OID for the comp type although it can find a new relfilenumber.
First of all, if the OID value reaches the maximum OID then it should
wrap around to FirstNormalObjectId and find a new non-conflicting OID.
Since in your case the first normal OID is 4294967294 and the maximum
OID is 4294967295, there is no scope for wraparound: you can create at
most one object, and once you have created it there are no unused OIDs
left, and the current patch is not trying to do anything about that.

Now, coming to the problem we are trying to solve with 56-bit
relfilenumbers: we are not trying to extend the limit of the system to
create more than 4294967294 objects. What we are trying to solve is
not reusing the same disk file names for different objects. Also
notice that relfilenumbers can get consumed much faster than OIDs, so
the chances of wraparound are higher; I mean, you can truncate/rewrite
the same relation multiple times, so that relation will keep the same
OID but consume multiple relfilenumbers.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Thu, Jul 21, 2022 at 9:53 AM Thomas Munro <thomas.munro@gmail.com> wrote:
On Wed, Jul 20, 2022 at 11:27 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
[v10 patch set]
Hi Dilip, I'm experimenting with these patches and will hopefully have
more to say soon, but I just wanted to point out that this builds with
warnings and failed on 3/4 of the CI OSes on cfbot's last run. Maybe
there is the good kind of uninitialised data on Linux, and the bad
kind of uninitialised data on those other pesky systems?
Thanks, I have figured out the issue, I will post the patch soon.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Fri, Jul 22, 2022 at 4:21 PM vignesh C <vignesh21@gmail.com> wrote:
On Wed, Jul 20, 2022 at 4:57 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
Thanks for the patch, my comments from the initial review:
1) Since we have changed the macros to inline functions, should we
change the function names similar to the other inline functions in the
same file like: ClearBufferTag, InitBufferTag & BufferTagsEqual:
I thought about it while doing this, but I am not sure whether it is
a good idea or not: before my change these were all macros following
two naming conventions, and I only converted them to inline
functions, so why change the names?
-#define BUFFERTAGS_EQUAL(a,b) \
-( \
-	RelFileLocatorEquals((a).rlocator, (b).rlocator) && \
-	(a).blockNum == (b).blockNum && \
-	(a).forkNum == (b).forkNum \
-)
+static inline void
+CLEAR_BUFFERTAG(BufferTag *tag)
+{
+	tag->rlocator.spcOid = InvalidOid;
+	tag->rlocator.dbOid = InvalidOid;
+	tag->rlocator.relNumber = InvalidRelFileNumber;
+	tag->forkNum = InvalidForkNumber;
+	tag->blockNum = InvalidBlockNumber;
+}

2) We could move these macros along with the other macros at the top of the file:
+/*
+ * The freeNext field is either the index of the next freelist entry,
+ * or one of these special values:
+ */
+#define FREENEXT_END_OF_LIST	(-1)
+#define FREENEXT_NOT_IN_LIST	(-2)
Yeah we can do that.
3) typo thn should be then:
+ * can raise it as necessary if we end up with more mapped relations.  For
+ * now, we just pick a round number that is modestly larger thn the expected
+ * number of mappings.
+ */

4) There is one whitespace issue:
git am v10-0004-Widen-relfilenumber-from-32-bits-to-56-bits.patch
Applying: Widen relfilenumber from 32 bits to 56 bits
.git/rebase-apply/patch:1500: space before tab in indent.
(relfilenumber)))); \
warning: 1 line adds whitespace errors.
Okay, I will fix it.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, Jul 26, 2022 at 10:05 AM Amul Sul <sulamul@gmail.com> wrote:
Few more typos in 0004 patch as well:
the a value
interger
previosly
currenly
Thanks for the review, I will fix it in the next version.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Thu, Jul 21, 2022 at 9:53 AM Thomas Munro <thomas.munro@gmail.com> wrote:
On Wed, Jul 20, 2022 at 11:27 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
[v10 patch set]
Hi Dilip, I'm experimenting with these patches and will hopefully have
more to say soon, but I just wanted to point out that this builds with
warnings and failed on 3/4 of the CI OSes on cfbot's last run. Maybe
there is the good kind of uninitialised data on Linux, and the bad
kind of uninitialised data on those other pesky systems?
Here is the patch to fix the issue. Basically, while asserting that
the file exists, the code was not setting the relfilenumber in the
RelFileLocator before generating the path, so it was checking the
existence of a garbage path and the assertion fired at random.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachments:
v11-0001-Convert-buf_internal.h-macros-to-static-inline-f.patchtext/x-patch; charset=US-ASCII; name=v11-0001-Convert-buf_internal.h-macros-to-static-inline-f.patchDownload
From 728c82feb0fb70899a3eb51be34574abd3a1ef5d Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Tue, 12 Jul 2022 17:10:04 +0530
Subject: [PATCH v11 1/4] Convert buf_internal.h macros to static inline
functions
Readability wise inline functions are better compared to macros and this
will also help to write cleaner and readable code for 64-bit relfilenode
because as part of that patch we will have to do some complex bitwise
operation so doing that inside the inline function will be cleaner.
---
src/backend/storage/buffer/buf_init.c | 2 +-
src/backend/storage/buffer/bufmgr.c | 16 ++--
src/backend/storage/buffer/localbuf.c | 12 +--
src/include/storage/buf_internals.h | 156 +++++++++++++++++++++-------------
4 files changed, 111 insertions(+), 75 deletions(-)
diff --git a/src/backend/storage/buffer/buf_init.c b/src/backend/storage/buffer/buf_init.c
index 2862e9e..55f646d 100644
--- a/src/backend/storage/buffer/buf_init.c
+++ b/src/backend/storage/buffer/buf_init.c
@@ -116,7 +116,7 @@ InitBufferPool(void)
{
BufferDesc *buf = GetBufferDescriptor(i);
- CLEAR_BUFFERTAG(buf->tag);
+ CLEAR_BUFFERTAG(&buf->tag);
pg_atomic_init_u32(&buf->state, 0);
buf->wait_backend_pgprocno = INVALID_PGPROCNO;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index b7488b5..4a8e294 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -515,7 +515,7 @@ PrefetchSharedBuffer(SMgrRelation smgr_reln,
Assert(BlockNumberIsValid(blockNum));
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr_reln->smgr_rlocator.locator,
+ INIT_BUFFERTAG(&newTag, &smgr_reln->smgr_rlocator.locator,
forkNum, blockNum);
/* determine its hash code and partition lock ID */
@@ -632,7 +632,7 @@ ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockN
ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
ReservePrivateRefCountEntry();
- INIT_BUFFERTAG(tag, rlocator, forkNum, blockNum);
+ INIT_BUFFERTAG(&tag, &rlocator, forkNum, blockNum);
if (BufferIsLocal(recent_buffer))
{
@@ -642,7 +642,7 @@ ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockN
buf_state = pg_atomic_read_u32(&bufHdr->state);
/* Is it still valid and holding the right tag? */
- if ((buf_state & BM_VALID) && BUFFERTAGS_EQUAL(tag, bufHdr->tag))
+ if ((buf_state & BM_VALID) && BUFFERTAGS_EQUAL(&tag, &bufHdr->tag))
{
/*
* Bump buffer's ref and usage counts. This is equivalent of
@@ -679,7 +679,7 @@ ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockN
else
buf_state = LockBufHdr(bufHdr);
- if ((buf_state & BM_VALID) && BUFFERTAGS_EQUAL(tag, bufHdr->tag))
+ if ((buf_state & BM_VALID) && BUFFERTAGS_EQUAL(&tag, &bufHdr->tag))
{
/*
* It's now safe to pin the buffer. We can't pin first and ask
@@ -1134,7 +1134,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
+ INIT_BUFFERTAG(&newTag, &smgr->smgr_rlocator.locator, forkNum, blockNum);
/* determine its hash code and partition lock ID */
newHash = BufTableHashCode(&newTag);
@@ -1517,7 +1517,7 @@ retry:
buf_state = LockBufHdr(buf);
/* If it's changed while we were waiting for lock, do nothing */
- if (!BUFFERTAGS_EQUAL(buf->tag, oldTag))
+ if (!BUFFERTAGS_EQUAL(&buf->tag, &oldTag))
{
UnlockBufHdr(buf, buf_state);
LWLockRelease(oldPartitionLock);
@@ -1549,7 +1549,7 @@ retry:
* linear scans of the buffer array don't think the buffer is valid.
*/
oldFlags = buf_state & BUF_FLAG_MASK;
- CLEAR_BUFFERTAG(buf->tag);
+ CLEAR_BUFFERTAG(&buf->tag);
buf_state &= ~(BUF_FLAG_MASK | BUF_USAGECOUNT_MASK);
UnlockBufHdr(buf, buf_state);
@@ -3365,7 +3365,7 @@ FindAndDropRelationBuffers(RelFileLocator rlocator, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(bufTag, rlocator, forkNum, curBlock);
+ INIT_BUFFERTAG(&bufTag, &rlocator, forkNum, curBlock);
/* determine its hash code and partition lock ID */
bufHash = BufTableHashCode(&bufTag);
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index 9c03885..91e174e 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -68,7 +68,7 @@ PrefetchLocalBuffer(SMgrRelation smgr, ForkNumber forkNum,
BufferTag newTag; /* identity of requested block */
LocalBufferLookupEnt *hresult;
- INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
+ INIT_BUFFERTAG(&newTag, &smgr->smgr_rlocator.locator, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -117,7 +117,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
bool found;
uint32 buf_state;
- INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
+ INIT_BUFFERTAG(&newTag, &smgr->smgr_rlocator.locator, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -131,7 +131,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
{
b = hresult->id;
bufHdr = GetLocalBufferDescriptor(b);
- Assert(BUFFERTAGS_EQUAL(bufHdr->tag, newTag));
+ Assert(BUFFERTAGS_EQUAL(&bufHdr->tag, &newTag));
#ifdef LBDEBUG
fprintf(stderr, "LB ALLOC (%u,%d,%d) %d\n",
smgr->smgr_rlocator.locator.relNumber, forkNum, blockNum, -b - 1);
@@ -253,7 +253,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
if (!hresult) /* shouldn't happen */
elog(ERROR, "local buffer hash table corrupted");
/* mark buffer invalid just in case hash insert fails */
- CLEAR_BUFFERTAG(bufHdr->tag);
+ CLEAR_BUFFERTAG(&bufHdr->tag);
buf_state &= ~(BM_VALID | BM_TAG_VALID);
pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
}
@@ -354,7 +354,7 @@ DropRelationLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
if (!hresult) /* shouldn't happen */
elog(ERROR, "local buffer hash table corrupted");
/* Mark buffer invalid */
- CLEAR_BUFFERTAG(bufHdr->tag);
+ CLEAR_BUFFERTAG(&bufHdr->tag);
buf_state &= ~BUF_FLAG_MASK;
buf_state &= ~BUF_USAGECOUNT_MASK;
pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
@@ -398,7 +398,7 @@ DropRelationAllLocalBuffers(RelFileLocator rlocator)
if (!hresult) /* shouldn't happen */
elog(ERROR, "local buffer hash table corrupted");
/* Mark buffer invalid */
- CLEAR_BUFFERTAG(bufHdr->tag);
+ CLEAR_BUFFERTAG(&bufHdr->tag);
buf_state &= ~BUF_FLAG_MASK;
buf_state &= ~BUF_USAGECOUNT_MASK;
pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 69e4590..2f0e60e 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -95,28 +95,32 @@ typedef struct buftag
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
-#define CLEAR_BUFFERTAG(a) \
-( \
- (a).rlocator.spcOid = InvalidOid, \
- (a).rlocator.dbOid = InvalidOid, \
- (a).rlocator.relNumber = InvalidRelFileNumber, \
- (a).forkNum = InvalidForkNumber, \
- (a).blockNum = InvalidBlockNumber \
-)
-
-#define INIT_BUFFERTAG(a,xx_rlocator,xx_forkNum,xx_blockNum) \
-( \
- (a).rlocator = (xx_rlocator), \
- (a).forkNum = (xx_forkNum), \
- (a).blockNum = (xx_blockNum) \
-)
-
-#define BUFFERTAGS_EQUAL(a,b) \
-( \
- RelFileLocatorEquals((a).rlocator, (b).rlocator) && \
- (a).blockNum == (b).blockNum && \
- (a).forkNum == (b).forkNum \
-)
+static inline void
+CLEAR_BUFFERTAG(BufferTag *tag)
+{
+ tag->rlocator.spcOid = InvalidOid;
+ tag->rlocator.dbOid = InvalidOid;
+ tag->rlocator.relNumber = InvalidRelFileNumber;
+ tag->forkNum = InvalidForkNumber;
+ tag->blockNum = InvalidBlockNumber;
+}
+
+static inline void
+INIT_BUFFERTAG(BufferTag *tag, const RelFileLocator *rlocator,
+ ForkNumber forkNum, BlockNumber blockNum)
+{
+ tag->rlocator = *rlocator;
+ tag->forkNum = forkNum;
+ tag->blockNum = blockNum;
+}
+
+static inline bool
+BUFFERTAGS_EQUAL(const BufferTag *tag1, const BufferTag *tag2)
+{
+ return RelFileLocatorEquals(tag1->rlocator, tag2->rlocator) &&
+ (tag1->blockNum == tag2->blockNum) &&
+ (tag1->forkNum == tag2->forkNum);
+}
/*
* The shared buffer mapping table is partitioned to reduce contention.
@@ -124,13 +128,24 @@ typedef struct buftag
* hash code with BufTableHashCode(), then apply BufMappingPartitionLock().
* NB: NUM_BUFFER_PARTITIONS must be a power of 2!
*/
-#define BufTableHashPartition(hashcode) \
- ((hashcode) % NUM_BUFFER_PARTITIONS)
-#define BufMappingPartitionLock(hashcode) \
- (&MainLWLockArray[BUFFER_MAPPING_LWLOCK_OFFSET + \
- BufTableHashPartition(hashcode)].lock)
-#define BufMappingPartitionLockByIndex(i) \
- (&MainLWLockArray[BUFFER_MAPPING_LWLOCK_OFFSET + (i)].lock)
+static inline uint32
+BufTableHashPartition(uint32 hashcode)
+{
+ return hashcode % NUM_BUFFER_PARTITIONS;
+}
+
+static inline LWLock *
+BufMappingPartitionLock(uint32 hashcode)
+{
+ return &MainLWLockArray[BUFFER_MAPPING_LWLOCK_OFFSET +
+ BufTableHashPartition(hashcode)].lock;
+}
+
+static inline LWLock *
+BufMappingPartitionLockByIndex(uint32 index)
+{
+ return &MainLWLockArray[BUFFER_MAPPING_LWLOCK_OFFSET + index].lock;
+}
/*
* BufferDesc -- shared descriptor/state data for a single shared buffer.
@@ -220,37 +235,6 @@ typedef union BufferDescPadded
char pad[BUFFERDESC_PAD_TO_SIZE];
} BufferDescPadded;
-#define GetBufferDescriptor(id) (&BufferDescriptors[(id)].bufferdesc)
-#define GetLocalBufferDescriptor(id) (&LocalBufferDescriptors[(id)])
-
-#define BufferDescriptorGetBuffer(bdesc) ((bdesc)->buf_id + 1)
-
-#define BufferDescriptorGetIOCV(bdesc) \
- (&(BufferIOCVArray[(bdesc)->buf_id]).cv)
-#define BufferDescriptorGetContentLock(bdesc) \
- ((LWLock*) (&(bdesc)->content_lock))
-
-extern PGDLLIMPORT ConditionVariableMinimallyPadded *BufferIOCVArray;
-
-/*
- * The freeNext field is either the index of the next freelist entry,
- * or one of these special values:
- */
-#define FREENEXT_END_OF_LIST (-1)
-#define FREENEXT_NOT_IN_LIST (-2)
-
-/*
- * Functions for acquiring/releasing a shared buffer header's spinlock. Do
- * not apply these to local buffers!
- */
-extern uint32 LockBufHdr(BufferDesc *desc);
-#define UnlockBufHdr(desc, s) \
- do { \
- pg_write_barrier(); \
- pg_atomic_write_u32(&(desc)->state, (s) & (~BM_LOCKED)); \
- } while (0)
-
-
/*
* The PendingWriteback & WritebackContext structure are used to keep
* information about pending flush requests to be issued to the OS.
@@ -276,11 +260,63 @@ typedef struct WritebackContext
/* in buf_init.c */
extern PGDLLIMPORT BufferDescPadded *BufferDescriptors;
+extern PGDLLIMPORT ConditionVariableMinimallyPadded *BufferIOCVArray;
extern PGDLLIMPORT WritebackContext BackendWritebackContext;
/* in localbuf.c */
extern PGDLLIMPORT BufferDesc *LocalBufferDescriptors;
+
+static inline BufferDesc *
+GetBufferDescriptor(uint32 id)
+{
+ return &(BufferDescriptors[id]).bufferdesc;
+}
+
+static inline BufferDesc *
+GetLocalBufferDescriptor(uint32 id)
+{
+ return &LocalBufferDescriptors[id];
+}
+
+static inline Buffer
+BufferDescriptorGetBuffer(const BufferDesc *bdesc)
+{
+ return (Buffer) (bdesc->buf_id + 1);
+}
+
+static inline ConditionVariable *
+BufferDescriptorGetIOCV(const BufferDesc *bdesc)
+{
+ return &(BufferIOCVArray[bdesc->buf_id]).cv;
+}
+
+static inline LWLock *
+BufferDescriptorGetContentLock(const BufferDesc *bdesc)
+{
+ return (LWLock *) (&bdesc->content_lock);
+}
+
+/*
+ * The freeNext field is either the index of the next freelist entry,
+ * or one of these special values:
+ */
+#define FREENEXT_END_OF_LIST (-1)
+#define FREENEXT_NOT_IN_LIST (-2)
+
+/*
+ * Functions for acquiring/releasing a shared buffer header's spinlock. Do
+ * not apply these to local buffers!
+ */
+extern uint32 LockBufHdr(BufferDesc *desc);
+
+static inline void
+UnlockBufHdr(BufferDesc *desc, uint32 buf_state)
+{
+ pg_write_barrier();
+ pg_atomic_write_u32(&desc->state, buf_state & (~BM_LOCKED));
+}
+
/* in bufmgr.c */
/*
--
1.8.3.1
v11-0002-Preliminary-refactoring-for-supporting-larger-re.patchtext/x-patch; charset=US-ASCII; name=v11-0002-Preliminary-refactoring-for-supporting-larger-re.patchDownload
From 45cf21f55b3f25ad8b6953972bcac66a99914484 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Wed, 13 Jul 2022 13:12:53 +0530
Subject: [PATCH v11 2/4] Preliminary refactoring for supporting larger
relfilenumber
Currently, relfilenumber is Oid type and it can wrap around so as part of
the larger patch set we are trying to make it 64 bit to avoid wraparound
and that will make a couple of other things simpler as explained in the
next patches.
So this is just a preliminary refactoring patch as part of this, in
BufferTag, instead of keeping the RelFileLocator, we will keep the
tablespace Oid, database Oid, and the relfilenumber directly. So that
once we change the relNumber in RelFileLocator to 64 bits the buffer tag
alignment padding will not change.
---
contrib/pg_buffercache/pg_buffercache_pages.c | 8 +-
contrib/pg_prewarm/autoprewarm.c | 10 +-
src/backend/storage/buffer/bufmgr.c | 140 +++++++++++++++++---------
src/backend/storage/buffer/localbuf.c | 30 ++++--
src/include/storage/buf_internals.h | 64 ++++++++++--
5 files changed, 178 insertions(+), 74 deletions(-)
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 131bd62..c5754ea 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -153,10 +153,10 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenumber = bufHdr->tag.rlocator.relNumber;
- fctx->record[i].reltablespace = bufHdr->tag.rlocator.spcOid;
- fctx->record[i].reldatabase = bufHdr->tag.rlocator.dbOid;
- fctx->record[i].forknum = bufHdr->tag.forkNum;
+ fctx->record[i].relfilenumber = BufTagGetRelNumber(&bufHdr->tag);
+ fctx->record[i].reltablespace = bufHdr->tag.spcOid;
+ fctx->record[i].reldatabase = bufHdr->tag.dbOid;
+ fctx->record[i].forknum = BufTagGetForkNum(&bufHdr->tag);
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
fctx->record[i].pinning_backends = BUF_STATE_GET_REFCOUNT(buf_state);
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index b2d6026..63f0d41 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -630,10 +630,12 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
if (buf_state & BM_TAG_VALID &&
((buf_state & BM_PERMANENT) || dump_unlogged))
{
- block_info_array[num_blocks].database = bufHdr->tag.rlocator.dbOid;
- block_info_array[num_blocks].tablespace = bufHdr->tag.rlocator.spcOid;
- block_info_array[num_blocks].filenumber = bufHdr->tag.rlocator.relNumber;
- block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
+ block_info_array[num_blocks].database = bufHdr->tag.dbOid;
+ block_info_array[num_blocks].tablespace = bufHdr->tag.spcOid;
+ block_info_array[num_blocks].filenumber =
+ BufTagGetRelNumber(&bufHdr->tag);
+ block_info_array[num_blocks].forknum =
+ BufTagGetForkNum(&bufHdr->tag);
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
}
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 4a8e294..f872c34 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1657,8 +1657,8 @@ ReleaseAndReadBuffer(Buffer buffer,
{
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
- bufHdr->tag.forkNum == forkNum)
+ BufTagMatchesRelFileLocator(&bufHdr->tag, &relation->rd_locator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
LocalRefCount[-buffer - 1]--;
@@ -1668,8 +1668,8 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetBufferDescriptor(buffer - 1);
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
- bufHdr->tag.forkNum == forkNum)
+ BufTagMatchesRelFileLocator(&bufHdr->tag, &relation->rd_locator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
}
@@ -2010,9 +2010,9 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
- item->tsId = bufHdr->tag.rlocator.spcOid;
- item->relNumber = bufHdr->tag.rlocator.relNumber;
- item->forkNum = bufHdr->tag.forkNum;
+ item->tsId = bufHdr->tag.spcOid;
+ item->relNumber = BufTagGetRelNumber(&bufHdr->tag);
+ item->forkNum = BufTagGetForkNum(&bufHdr->tag);
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2702,6 +2702,7 @@ PrintBufferLeakWarning(Buffer buffer)
char *path;
BackendId backend;
uint32 buf_state;
+ RelFileLocator rlocator;
Assert(BufferIsValid(buffer));
if (BufferIsLocal(buffer))
@@ -2717,8 +2718,10 @@ PrintBufferLeakWarning(Buffer buffer)
backend = InvalidBackendId;
}
+ rlocator = BufTagGetRelFileLocator(&buf->tag);
+
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rlocator, backend, buf->tag.forkNum);
+ path = relpathbackend(rlocator, backend, BufTagGetForkNum(&buf->tag));
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2797,8 +2800,8 @@ BufferGetTag(Buffer buffer, RelFileLocator *rlocator, ForkNumber *forknum,
bufHdr = GetBufferDescriptor(buffer - 1);
/* pinned, so OK to read tag without spinlock */
- *rlocator = bufHdr->tag.rlocator;
- *forknum = bufHdr->tag.forkNum;
+ *rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
+ *forknum = BufTagGetForkNum(&bufHdr->tag);
*blknum = bufHdr->tag.blockNum;
}
@@ -2848,9 +2851,14 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rlocator, InvalidBackendId);
+ {
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&buf->tag);
+ reln = smgropen(rlocator, InvalidBackendId);
+ }
- TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
+ TRACE_POSTGRESQL_BUFFER_FLUSH_START(BufTagGetForkNum(&buf->tag),
buf->tag.blockNum,
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
@@ -2909,7 +2917,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
* bufToWrite is either the shared buffer or a copy, as appropriate.
*/
smgrwrite(reln,
- buf->tag.forkNum,
+ BufTagGetForkNum(&buf->tag),
buf->tag.blockNum,
bufToWrite,
false);
@@ -2930,7 +2938,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
*/
TerminateBufferIO(buf, true, 0);
- TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(buf->tag.forkNum,
+ TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(BufTagGetForkNum(&buf->tag),
buf->tag.blockNum,
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
@@ -3151,15 +3159,15 @@ DropRelationBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
* We could check forkNum and blockNum as well as the rlocator, but
* the incremental win from doing so seems small.
*/
- if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator))
+ if (!BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator.locator))
continue;
buf_state = LockBufHdr(bufHdr);
for (j = 0; j < nforks; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator) &&
- bufHdr->tag.forkNum == forkNum[j] &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator.locator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum[j] &&
bufHdr->tag.blockNum >= firstDelBlock[j])
{
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -3310,7 +3318,7 @@ DropRelationsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
for (j = 0; j < n; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, locators[j]))
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &locators[j]))
{
rlocator = &locators[j];
break;
@@ -3319,7 +3327,10 @@ DropRelationsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
}
else
{
- rlocator = bsearch((const void *) &(bufHdr->tag.rlocator),
+ RelFileLocator locator;
+
+ locator = BufTagGetRelFileLocator(&bufHdr->tag);
+ rlocator = bsearch((const void *) &(locator),
locators, n, sizeof(RelFileLocator),
rlocator_comparator);
}
@@ -3329,7 +3340,7 @@ DropRelationsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, (*rlocator)))
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, rlocator))
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3389,8 +3400,8 @@ FindAndDropRelationBuffers(RelFileLocator rlocator, ForkNumber forkNum,
*/
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
- bufHdr->tag.forkNum == forkNum &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
@@ -3428,11 +3439,11 @@ DropDatabaseBuffers(Oid dbid)
* As in DropRelationBuffers, an unlocked precheck should be
* safe and saves some cycles.
*/
- if (bufHdr->tag.rlocator.dbOid != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rlocator.dbOid == dbid)
+ if (bufHdr->tag.dbOid == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3456,13 +3467,16 @@ PrintBufferDescs(void)
{
BufferDesc *buf = GetBufferDescriptor(i);
Buffer b = BufferDescriptorGetBuffer(buf);
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&buf->tag);
/* theoretically we should lock the bufhdr here */
elog(LOG,
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathbackend(buf->tag.rlocator, InvalidBackendId, buf->tag.forkNum),
+ relpathbackend(rlocator, InvalidBackendId, BufTagGetForkNum(&buf->tag)),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3482,12 +3496,16 @@ PrintPinnedBufs(void)
if (GetPrivateRefCount(b) > 0)
{
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&buf->tag);
+
/* theoretically we should lock the bufhdr here */
elog(LOG,
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathperm(buf->tag.rlocator, buf->tag.forkNum),
+ relpathperm(rlocator, BufTagGetForkNum(&buf->tag)),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3526,7 +3544,7 @@ FlushRelationBuffers(Relation rel)
uint32 buf_state;
bufHdr = GetLocalBufferDescriptor(i);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rel->rd_locator) &&
((buf_state = pg_atomic_read_u32(&bufHdr->state)) &
(BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
@@ -3544,7 +3562,7 @@ FlushRelationBuffers(Relation rel)
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
smgrwrite(RelationGetSmgr(rel),
- bufHdr->tag.forkNum,
+ BufTagGetForkNum(&bufHdr->tag),
bufHdr->tag.blockNum,
localpage,
false);
@@ -3573,13 +3591,13 @@ FlushRelationBuffers(Relation rel)
* As in DropRelationBuffers, an unlocked precheck should be
* safe and saves some cycles.
*/
- if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator))
+ if (!BufTagMatchesRelFileLocator(&bufHdr->tag, &rel->rd_locator))
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rel->rd_locator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3653,7 +3671,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
for (j = 0; j < nrels; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, srels[j].rlocator))
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &srels[j].rlocator))
{
srelent = &srels[j];
break;
@@ -3662,7 +3680,10 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
}
else
{
- srelent = bsearch((const void *) &(bufHdr->tag.rlocator),
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
+ srelent = bsearch((const void *) &(rlocator),
srels, nrels, sizeof(SMgrSortArray),
rlocator_comparator);
}
@@ -3674,7 +3695,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, srelent->rlocator) &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &srelent->rlocator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3876,13 +3897,13 @@ FlushDatabaseBuffers(Oid dbid)
* As in DropRelationBuffers, an unlocked precheck should be
* safe and saves some cycles.
*/
- if (bufHdr->tag.rlocator.dbOid != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rlocator.dbOid == dbid &&
+ if (bufHdr->tag.dbOid == dbid &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -4042,6 +4063,10 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
if (XLogHintBitIsNeeded() &&
(pg_atomic_read_u32(&bufHdr->state) & BM_PERMANENT))
{
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
+
/*
* If we must not write WAL, due to a relfilelocator-specific
* condition or being in recovery, don't dirty the page. We can
@@ -4050,8 +4075,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
*
* See src/backend/storage/page/README for longer discussion.
*/
- if (RecoveryInProgress() ||
- RelFileLocatorSkippingWAL(bufHdr->tag.rlocator))
+ if (RecoveryInProgress() || RelFileLocatorSkippingWAL(rlocator))
return;
/*
@@ -4659,8 +4683,10 @@ AbortBufferIO(void)
{
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
+ RelFileLocator rlocator;
- path = relpathperm(buf->tag.rlocator, buf->tag.forkNum);
+ rlocator = BufTagGetRelFileLocator(&buf->tag);
+ path = relpathperm(rlocator, BufTagGetForkNum(&buf->tag));
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4684,7 +4710,11 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rlocator, bufHdr->tag.forkNum);
+ char *path;
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
+ path = relpathperm(rlocator, BufTagGetForkNum(&bufHdr->tag));
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4702,8 +4732,12 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
- char *path = relpathbackend(bufHdr->tag.rlocator, MyBackendId,
- bufHdr->tag.forkNum);
+ char *path;
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
+ path = relpathbackend(rlocator, MyBackendId,
+ BufTagGetForkNum(&bufHdr->tag));
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4797,15 +4831,20 @@ static inline int
buffertag_comparator(const BufferTag *ba, const BufferTag *bb)
{
int ret;
+ RelFileLocator rlocatora;
+ RelFileLocator rlocatorb;
+
+ rlocatora = BufTagGetRelFileLocator(ba);
+ rlocatorb = BufTagGetRelFileLocator(bb);
- ret = rlocator_comparator(&ba->rlocator, &bb->rlocator);
+ ret = rlocator_comparator(&rlocatora, &rlocatorb);
if (ret != 0)
return ret;
- if (ba->forkNum < bb->forkNum)
+ if (BufTagGetForkNum(ba) < BufTagGetForkNum(bb))
return -1;
- if (ba->forkNum > bb->forkNum)
+ if (BufTagGetForkNum(ba) > BufTagGetForkNum(bb))
return 1;
if (ba->blockNum < bb->blockNum)
@@ -4955,10 +4994,12 @@ IssuePendingWritebacks(WritebackContext *context)
SMgrRelation reln;
int ahead;
BufferTag tag;
+ RelFileLocator currlocator;
Size nblocks = 1;
cur = &context->pending_writebacks[i];
tag = cur->tag;
+ currlocator = BufTagGetRelFileLocator(&tag);
/*
* Peek ahead, into following writeback requests, to see if they can
@@ -4966,11 +5007,14 @@ IssuePendingWritebacks(WritebackContext *context)
*/
for (ahead = 0; i + ahead + 1 < context->nr_pending; ahead++)
{
+ RelFileLocator nextrlocator;
+
next = &context->pending_writebacks[i + ahead + 1];
+ nextrlocator = BufTagGetRelFileLocator(&next->tag);
/* different file, stop */
- if (!RelFileLocatorEquals(cur->tag.rlocator, next->tag.rlocator) ||
- cur->tag.forkNum != next->tag.forkNum)
+ if (!RelFileLocatorEquals(currlocator, nextrlocator) ||
+ BufTagGetForkNum(&cur->tag) != BufTagGetForkNum(&next->tag))
break;
/* ok, block queued twice, skip */
@@ -4988,8 +5032,8 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rlocator, InvalidBackendId);
- smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
+ reln = smgropen(currlocator, InvalidBackendId);
+ smgrwriteback(reln, BufTagGetForkNum(&tag), tag.blockNum, nblocks);
}
context->nr_pending = 0;
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index 91e174e..972f3f3 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -213,15 +213,18 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
{
SMgrRelation oreln;
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rlocator, MyBackendId);
+ oreln = smgropen(rlocator, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
/* And write... */
smgrwrite(oreln,
- bufHdr->tag.forkNum,
+ BufTagGetForkNum(&bufHdr->tag),
bufHdr->tag.blockNum,
localpage,
false);
@@ -337,16 +340,22 @@ DropRelationLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
- bufHdr->tag.forkNum == forkNum &&
+ BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
+ {
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rlocator, MyBackendId,
- bufHdr->tag.forkNum),
+ relpathbackend(rlocator, MyBackendId,
+ BufTagGetForkNum(&bufHdr->tag)),
LocalRefCount[i]);
+ }
+
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
hash_search(LocalBufHash, (void *) &bufHdr->tag,
@@ -383,13 +392,16 @@ DropRelationAllLocalBuffers(RelFileLocator rlocator)
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator))
+ BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator))
{
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rlocator, MyBackendId,
- bufHdr->tag.forkNum),
+ relpathbackend(rlocator, MyBackendId,
+ BufTagGetForkNum(&bufHdr->tag)),
LocalRefCount[i]);
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 2f0e60e..092e959 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,18 +90,51 @@
*/
typedef struct buftag
{
- RelFileLocator rlocator; /* physical relation identifier */
- ForkNumber forkNum;
+ Oid spcOid; /* tablespace oid */
+ Oid dbOid; /* database oid */
+ RelFileNumber relNumber; /* relation file number */
+ ForkNumber forkNum; /* fork number */
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
+static inline RelFileNumber
+BufTagGetRelNumber(const BufferTag *tag)
+{
+ return tag->relNumber;
+}
+
+static inline ForkNumber
+BufTagGetForkNum(const BufferTag *tag)
+{
+ return tag->forkNum;
+}
+
+static inline void
+BufTagSetRelForkDetails(BufferTag *tag, RelFileNumber relnumber,
+ ForkNumber forknum)
+{
+ tag->relNumber = relnumber;
+ tag->forkNum = forknum;
+}
+
+static inline RelFileLocator
+BufTagGetRelFileLocator(const BufferTag *tag)
+{
+ RelFileLocator rlocator;
+
+ rlocator.spcOid = tag->spcOid;
+ rlocator.dbOid = tag->dbOid;
+ rlocator.relNumber = BufTagGetRelNumber(tag);
+
+ return rlocator;
+}
+
static inline void
CLEAR_BUFFERTAG(BufferTag *tag)
{
- tag->rlocator.spcOid = InvalidOid;
- tag->rlocator.dbOid = InvalidOid;
- tag->rlocator.relNumber = InvalidRelFileNumber;
- tag->forkNum = InvalidForkNumber;
+ tag->spcOid = InvalidOid;
+ tag->dbOid = InvalidOid;
+ BufTagSetRelForkDetails(tag, InvalidRelFileNumber, InvalidForkNumber);
tag->blockNum = InvalidBlockNumber;
}
@@ -109,19 +142,32 @@ static inline void
INIT_BUFFERTAG(BufferTag *tag, const RelFileLocator *rlocator,
ForkNumber forkNum, BlockNumber blockNum)
{
- tag->rlocator = *rlocator;
- tag->forkNum = forkNum;
+ tag->spcOid = rlocator->spcOid;
+ tag->dbOid = rlocator->dbOid;
+ BufTagSetRelForkDetails(tag, rlocator->relNumber, forkNum);
tag->blockNum = blockNum;
}
static inline bool
BUFFERTAGS_EQUAL(const BufferTag *tag1, const BufferTag *tag2)
{
- return RelFileLocatorEquals(tag1->rlocator, tag2->rlocator) &&
+ return (tag1->spcOid == tag2->spcOid) &&
+ (tag1->dbOid == tag2->dbOid) &&
+ (tag1->relNumber == tag2->relNumber) &&
(tag1->blockNum == tag2->blockNum) &&
(tag1->forkNum == tag2->forkNum);
}
+static inline bool
+BufTagMatchesRelFileLocator(const BufferTag *tag,
+ const RelFileLocator *rlocator)
+{
+ return (tag->spcOid == rlocator->spcOid) &&
+ (tag->dbOid == rlocator->dbOid) &&
+ (BufTagGetRelNumber(tag) == rlocator->relNumber);
+}
+
+
/*
* The shared buffer mapping table is partitioned to reduce contention.
* To determine which partition lock a given tag requires, compute the tag's
--
1.8.3.1
v11-0003-Remove-the-restriction-that-the-relmap-must-be-5.patchtext/x-patch; charset=US-ASCII; name=v11-0003-Remove-the-restriction-that-the-relmap-must-be-5.patchDownload
From 0555444cdcf3118400a655c6ca9e437b52dcc5ec Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Tue, 12 Jul 2022 09:04:44 -0400
Subject: [PATCH v11 3/4] Remove the restriction that the relmap must be 512
bytes.
Instead of relying on the ability to atomically overwrite the
entire relmap file in one shot, write a new one and durably
rename it into place. Remove the struct padding and the calculation
showing why the map is exactly 512 bytes, and change the maximum
number of entries to a nearby round number.
Patch by me, reviewed by Andres Freund.
Discussion: http://postgr.es/m/CA+TgmoacMgLv_0edhN=oWjnUvJyFjXww4Q4re4kfm+qkSBtjaQ@mail.gmail.com
---
doc/src/sgml/monitoring.sgml | 4 +-
src/backend/utils/activity/wait_event.c | 4 +-
src/backend/utils/cache/relmapper.c | 94 +++++++++++++++++++--------------
src/include/utils/wait_event.h | 2 +-
4 files changed, 58 insertions(+), 46 deletions(-)
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 7dbbab6..475223c 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -1409,8 +1409,8 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
<entry>Waiting for a read of the relation map file.</entry>
</row>
<row>
- <entry><literal>RelationMapSync</literal></entry>
- <entry>Waiting for the relation map file to reach durable storage.</entry>
+ <entry><literal>RelationMapRename</literal></entry>
+ <entry>Waiting for durable replacement of a relation map file.</entry>
</row>
<row>
<entry><literal>RelationMapWrite</literal></entry>
diff --git a/src/backend/utils/activity/wait_event.c b/src/backend/utils/activity/wait_event.c
index da57a93..9ebd6fc 100644
--- a/src/backend/utils/activity/wait_event.c
+++ b/src/backend/utils/activity/wait_event.c
@@ -633,8 +633,8 @@ pgstat_get_wait_io(WaitEventIO w)
case WAIT_EVENT_RELATION_MAP_READ:
event_name = "RelationMapRead";
break;
- case WAIT_EVENT_RELATION_MAP_SYNC:
- event_name = "RelationMapSync";
+ case WAIT_EVENT_RELATION_MAP_RENAME:
+ event_name = "RelationMapRename";
break;
case WAIT_EVENT_RELATION_MAP_WRITE:
event_name = "RelationMapWrite";
diff --git a/src/backend/utils/cache/relmapper.c b/src/backend/utils/cache/relmapper.c
index 8e5595b..afbab52 100644
--- a/src/backend/utils/cache/relmapper.c
+++ b/src/backend/utils/cache/relmapper.c
@@ -60,21 +60,26 @@
/*
* The map file is critical data: we have no automatic method for recovering
* from loss or corruption of it. We use a CRC so that we can detect
- * corruption. To minimize the risk of failed updates, the map file should
- * be kept to no more than one standard-size disk sector (ie 512 bytes),
- * and we use overwrite-in-place rather than playing renaming games.
- * The struct layout below is designed to occupy exactly 512 bytes, which
- * might make filesystem updates a bit more efficient.
+ * corruption. Since the file might be more than one standard-size disk
+ * sector in size, we cannot rely on overwrite-in-place. Instead, we generate
+ * a new file and rename it into place, atomically replacing the original file.
*
* Entries in the mappings[] array are in no particular order. We could
* speed searching by insisting on OID order, but it really shouldn't be
* worth the trouble given the intended size of the mapping sets.
*/
#define RELMAPPER_FILENAME "pg_filenode.map"
+#define RELMAPPER_TEMP_FILENAME "pg_filenode.map.tmp"
#define RELMAPPER_FILEMAGIC 0x592717 /* version ID value */
-#define MAX_MAPPINGS 62 /* 62 * 8 + 16 = 512 */
+/*
+ * There's no need for this constant to have any particular value, and we
+ * can raise it as necessary if we end up with more mapped relations. For
+ * now, we just pick a round number that is modestly larger than the expected
+ * number of mappings.
+ */
+#define MAX_MAPPINGS 64
typedef struct RelMapping
{
@@ -88,7 +93,6 @@ typedef struct RelMapFile
int32 num_mappings; /* number of valid RelMapping entries */
RelMapping mappings[MAX_MAPPINGS];
pg_crc32c crc; /* CRC of all above */
- int32 pad; /* to make the struct size be 512 exactly */
} RelMapFile;
/*
@@ -877,6 +881,7 @@ write_relmap_file(RelMapFile *newmap, bool write_wal, bool send_sinval,
{
int fd;
char mapfilename[MAXPGPATH];
+ char maptempfilename[MAXPGPATH];
/*
* Fill in the overhead fields and update CRC.
@@ -890,17 +895,47 @@ write_relmap_file(RelMapFile *newmap, bool write_wal, bool send_sinval,
FIN_CRC32C(newmap->crc);
/*
- * Open the target file. We prefer to do this before entering the
- * critical section, so that an open() failure need not force PANIC.
+ * Construct filenames -- a temporary file that we'll create to write the
+ * data initially, and then the permanent name to which we will rename it.
*/
snprintf(mapfilename, sizeof(mapfilename), "%s/%s",
dbpath, RELMAPPER_FILENAME);
- fd = OpenTransientFile(mapfilename, O_WRONLY | O_CREAT | PG_BINARY);
+ snprintf(maptempfilename, sizeof(maptempfilename), "%s/%s",
+ dbpath, RELMAPPER_TEMP_FILENAME);
+
+ /*
+ * Open a temporary file. If a file already exists with this name, it must
+ * be left over from a previous crash, so we can overwrite it. Concurrent
+ * calls to this function are not allowed.
+ */
+ fd = OpenTransientFile(maptempfilename,
+ O_WRONLY | O_CREAT | O_TRUNC | PG_BINARY);
if (fd < 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not open file \"%s\": %m",
- mapfilename)));
+ maptempfilename)));
+
+ /* Write new data to the file. */
+ pgstat_report_wait_start(WAIT_EVENT_RELATION_MAP_WRITE);
+ if (write(fd, newmap, sizeof(RelMapFile)) != sizeof(RelMapFile))
+ {
+ /* if write didn't set errno, assume problem is no disk space */
+ if (errno == 0)
+ errno = ENOSPC;
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not write file \"%s\": %m",
+ maptempfilename)));
+ }
+ pgstat_report_wait_end();
+
+ /* And close the file. */
+ if (CloseTransientFile(fd) != 0)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not close file \"%s\": %m",
+ maptempfilename)));
if (write_wal)
{
@@ -924,40 +959,17 @@ write_relmap_file(RelMapFile *newmap, bool write_wal, bool send_sinval,
XLogFlush(lsn);
}
- errno = 0;
- pgstat_report_wait_start(WAIT_EVENT_RELATION_MAP_WRITE);
- if (write(fd, newmap, sizeof(RelMapFile)) != sizeof(RelMapFile))
- {
- /* if write didn't set errno, assume problem is no disk space */
- if (errno == 0)
- errno = ENOSPC;
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not write file \"%s\": %m",
- mapfilename)));
- }
- pgstat_report_wait_end();
-
/*
- * We choose to fsync the data to disk before considering the task done.
- * It would be possible to relax this if it turns out to be a performance
- * issue, but it would complicate checkpointing --- see notes for
- * CheckPointRelationMap.
+ * durable_rename() does all the hard work of making sure that we rename
+ * the temporary file into place in a crash-safe manner.
+ *
+ * NB: Although we instruct durable_rename() to use ERROR, we will often
+ * be in a critical section at this point; if so, ERROR will become PANIC.
*/
- pgstat_report_wait_start(WAIT_EVENT_RELATION_MAP_SYNC);
- if (pg_fsync(fd) != 0)
- ereport(data_sync_elevel(ERROR),
- (errcode_for_file_access(),
- errmsg("could not fsync file \"%s\": %m",
- mapfilename)));
+ pgstat_report_wait_start(WAIT_EVENT_RELATION_MAP_RENAME);
+ durable_rename(maptempfilename, mapfilename, ERROR);
pgstat_report_wait_end();
- if (CloseTransientFile(fd) != 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not close file \"%s\": %m",
- mapfilename)));
-
/*
* Now that the file is safely on disk, send sinval message to let other
* backends know to re-read it. We must do this inside the critical
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index c3ade01..39f770a 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -194,7 +194,7 @@ typedef enum
WAIT_EVENT_LOGICAL_REWRITE_TRUNCATE,
WAIT_EVENT_LOGICAL_REWRITE_WRITE,
WAIT_EVENT_RELATION_MAP_READ,
- WAIT_EVENT_RELATION_MAP_SYNC,
+ WAIT_EVENT_RELATION_MAP_RENAME,
WAIT_EVENT_RELATION_MAP_WRITE,
WAIT_EVENT_REORDER_BUFFER_READ,
WAIT_EVENT_REORDER_BUFFER_WRITE,
--
1.8.3.1
v11-0004-Widen-relfilenumber-from-32-bits-to-56-bits.patchtext/x-patch; charset=UTF-8; name=v11-0004-Widen-relfilenumber-from-32-bits-to-56-bits.patchDownload
From 1ea8da5d33c1d6e39e77d03280ea1d8cc1fb38aa Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Thu, 14 Jul 2022 14:25:50 +0530
Subject: [PATCH v11 4/4] Widen relfilenumber from 32 bits to 56 bits
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Currently relfilenumber is 32 bits wide, so it can wrap around and a
relfilenumber can be reused. To guard against such reuse there is a
complicated hack that leaves a 0-length tombstone file around until
the next checkpoint, and when we allocate a new relfilenumber we also
have to loop to check for an on-disk conflict.
This patch makes the relfilenumber 56 bits wide, with no provision
for wraparound. After this change we can get rid of the 0-length
tombstone files and of the loop checking for on-disk relfilenumber
conflicts.
We widen to 56 bits rather than 64 because a full 64-bit
relfilenumber would increase the size of the BufferTag, which would
increase memory usage and might also hurt performance. To avoid that,
the buffer tag uses 8 bits for the fork number and 56 bits for the
relfilenumber.
---
contrib/pg_buffercache/Makefile | 4 +-
.../pg_buffercache/pg_buffercache--1.3--1.4.sql | 30 ++++
contrib/pg_buffercache/pg_buffercache.control | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 39 ++++-
contrib/pg_walinspect/expected/pg_walinspect.out | 4 +-
contrib/pg_walinspect/sql/pg_walinspect.sql | 4 +-
contrib/test_decoding/expected/rewrite.out | 2 +-
contrib/test_decoding/sql/rewrite.sql | 2 +-
doc/src/sgml/catalogs.sgml | 2 +-
doc/src/sgml/pgbuffercache.sgml | 2 +-
src/backend/access/gin/ginxlog.c | 2 +-
src/backend/access/rmgrdesc/gistdesc.c | 2 +-
src/backend/access/rmgrdesc/heapdesc.c | 2 +-
src/backend/access/rmgrdesc/nbtdesc.c | 2 +-
src/backend/access/rmgrdesc/seqdesc.c | 2 +-
src/backend/access/rmgrdesc/xlogdesc.c | 21 ++-
src/backend/access/transam/README | 5 +-
src/backend/access/transam/varsup.c | 183 ++++++++++++++++++++-
src/backend/access/transam/xlog.c | 61 +++++++
src/backend/access/transam/xlogprefetcher.c | 14 +-
src/backend/access/transam/xlogrecovery.c | 6 +-
src/backend/access/transam/xlogutils.c | 12 +-
src/backend/catalog/catalog.c | 95 -----------
src/backend/catalog/heap.c | 25 +--
src/backend/catalog/index.c | 11 +-
src/backend/catalog/storage.c | 8 +
src/backend/commands/dbcommands.c | 4 +-
src/backend/commands/tablecmds.c | 12 +-
src/backend/commands/tablespace.c | 3 +-
src/backend/nodes/gen_node_support.pl | 4 +-
src/backend/replication/basebackup.c | 13 +-
src/backend/replication/logical/decode.c | 1 +
src/backend/replication/logical/reorderbuffer.c | 2 +-
src/backend/storage/file/reinit.c | 46 +++---
src/backend/storage/freespace/fsmpage.c | 2 +-
src/backend/storage/lmgr/lwlocknames.txt | 1 +
src/backend/storage/smgr/md.c | 7 +
src/backend/storage/smgr/smgr.c | 2 +-
src/backend/utils/adt/dbsize.c | 11 +-
src/backend/utils/adt/pg_upgrade_support.c | 22 ++-
src/backend/utils/cache/relcache.c | 2 +-
src/backend/utils/cache/relfilenumbermap.c | 65 +++-----
src/backend/utils/misc/pg_controldata.c | 9 +-
src/bin/pg_checksums/pg_checksums.c | 4 +-
src/bin/pg_controldata/pg_controldata.c | 2 +
src/bin/pg_dump/pg_dump.c | 26 +--
src/bin/pg_rewind/filemap.c | 6 +-
src/bin/pg_upgrade/info.c | 5 +-
src/bin/pg_upgrade/pg_upgrade.c | 6 +-
src/bin/pg_upgrade/relfilenumber.c | 4 +-
src/bin/pg_waldump/pg_waldump.c | 2 +-
src/common/relpath.c | 20 +--
src/fe_utils/option_utils.c | 39 +++++
src/include/access/transam.h | 22 +++
src/include/access/xlog.h | 2 +
src/include/catalog/catalog.h | 3 -
src/include/catalog/pg_class.h | 16 +-
src/include/catalog/pg_control.h | 2 +
src/include/catalog/pg_proc.dat | 10 +-
src/include/common/relpath.h | 2 +
src/include/fe_utils/option_utils.h | 2 +
src/include/postgres_ext.h | 10 +-
src/include/storage/buf_internals.h | 62 ++++++-
src/include/storage/reinit.h | 3 +-
src/include/storage/relfilelocator.h | 12 +-
src/test/regress/expected/alter_table.out | 24 ++-
src/test/regress/expected/fast_default.out | 4 +-
src/test/regress/expected/oidjoins.out | 2 +-
src/test/regress/sql/alter_table.sql | 8 +-
src/test/regress/sql/fast_default.sql | 4 +-
70 files changed, 706 insertions(+), 344 deletions(-)
create mode 100644 contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile
index 2ab8c65..c20439a 100644
--- a/contrib/pg_buffercache/Makefile
+++ b/contrib/pg_buffercache/Makefile
@@ -6,8 +6,8 @@ OBJS = \
pg_buffercache_pages.o
EXTENSION = pg_buffercache
-DATA = pg_buffercache--1.2.sql pg_buffercache--1.2--1.3.sql \
- pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql
+DATA = pg_buffercache--1.0--1.1.sql pg_buffercache--1.1--1.2.sql pg_buffercache--1.2.sql \
+ pg_buffercache--1.2--1.3.sql pg_buffercache--1.3--1.4.sql
PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time"
ifdef USE_PGXS
diff --git a/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
new file mode 100644
index 0000000..ee2d9c7
--- /dev/null
+++ b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
@@ -0,0 +1,30 @@
+/* contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.4'" to load this file. \quit
+
+/* First we have to remove them from the extension */
+ALTER EXTENSION pg_buffercache DROP VIEW pg_buffercache;
+ALTER EXTENSION pg_buffercache DROP FUNCTION pg_buffercache_pages();
+
+/* Then we can drop them */
+DROP VIEW pg_buffercache;
+DROP FUNCTION pg_buffercache_pages();
+
+/* Now redefine */
+CREATE OR REPLACE FUNCTION pg_buffercache_pages()
+RETURNS SETOF RECORD
+AS 'MODULE_PATHNAME', 'pg_buffercache_pages_v1_4'
+LANGUAGE C PARALLEL SAFE;
+
+CREATE OR REPLACE VIEW pg_buffercache AS
+ SELECT P.* FROM pg_buffercache_pages() AS P
+ (bufferid integer, relfilenode int8, reltablespace oid, reldatabase oid,
+ relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
+ pinning_backends int4);
+
+-- Don't want these to be available to public.
+REVOKE ALL ON FUNCTION pg_buffercache_pages() FROM PUBLIC;
+REVOKE ALL ON pg_buffercache FROM PUBLIC;
+GRANT EXECUTE ON FUNCTION pg_buffercache_pages() TO pg_monitor;
+GRANT SELECT ON pg_buffercache TO pg_monitor;
diff --git a/contrib/pg_buffercache/pg_buffercache.control b/contrib/pg_buffercache/pg_buffercache.control
index 8c060ae..a82ae5f 100644
--- a/contrib/pg_buffercache/pg_buffercache.control
+++ b/contrib/pg_buffercache/pg_buffercache.control
@@ -1,5 +1,5 @@
# pg_buffercache extension
comment = 'examine the shared buffer cache'
-default_version = '1.3'
+default_version = '1.4'
module_pathname = '$libdir/pg_buffercache'
relocatable = true
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index c5754ea..42e3767 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -59,9 +59,10 @@ typedef struct
* relation node/tablespace/database/blocknum and dirty indicator.
*/
PG_FUNCTION_INFO_V1(pg_buffercache_pages);
+PG_FUNCTION_INFO_V1(pg_buffercache_pages_v1_4);
-Datum
-pg_buffercache_pages(PG_FUNCTION_ARGS)
+static Datum
+pg_buffercache_pages_internal(PG_FUNCTION_ARGS, Oid rfn_typid)
{
FuncCallContext *funcctx;
Datum result;
@@ -103,7 +104,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
TupleDescInitEntry(tupledesc, (AttrNumber) 1, "bufferid",
INT4OID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 2, "relfilenode",
- OIDOID, -1, 0);
+ rfn_typid, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 3, "reltablespace",
OIDOID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 4, "reldatabase",
@@ -209,7 +210,24 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
}
else
{
- values[1] = ObjectIdGetDatum(fctx->record[i].relfilenumber);
+ if (rfn_typid == INT8OID)
+ values[1] =
+ Int64GetDatum((int64) fctx->record[i].relfilenumber);
+ else
+ {
+ Assert(rfn_typid == OIDOID);
+
+ if (fctx->record[i].relfilenumber > OID_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("relfilenode " INT64_FORMAT " is too large to be represented as an OID",
+ fctx->record[i].relfilenumber),
+ errhint("Upgrade the extension using ALTER EXTENSION pg_buffercache UPDATE")));
+
+ values[1] =
+ ObjectIdGetDatum((Oid) fctx->record[i].relfilenumber);
+ }
+
nulls[1] = false;
values[2] = ObjectIdGetDatum(fctx->record[i].reltablespace);
nulls[2] = false;
@@ -237,3 +255,16 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
else
SRF_RETURN_DONE(funcctx);
}
+
+Datum
+pg_buffercache_pages(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, OIDOID);
+}
+
+/* entry point for extension version 1.4 and later */
+Datum
+pg_buffercache_pages_v1_4(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, INT8OID);
+}
diff --git a/contrib/pg_walinspect/expected/pg_walinspect.out b/contrib/pg_walinspect/expected/pg_walinspect.out
index a1ee743..e9b06ed 100644
--- a/contrib/pg_walinspect/expected/pg_walinspect.out
+++ b/contrib/pg_walinspect/expected/pg_walinspect.out
@@ -54,9 +54,9 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- ===================================================================
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
ok
----
t
diff --git a/contrib/pg_walinspect/sql/pg_walinspect.sql b/contrib/pg_walinspect/sql/pg_walinspect.sql
index 1b265ea..5393834 100644
--- a/contrib/pg_walinspect/sql/pg_walinspect.sql
+++ b/contrib/pg_walinspect/sql/pg_walinspect.sql
@@ -39,10 +39,10 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
-- ===================================================================
-- Test for filtering out WAL records based on resource_manager and
diff --git a/contrib/test_decoding/expected/rewrite.out b/contrib/test_decoding/expected/rewrite.out
index b30999c..8f1aa48 100644
--- a/contrib/test_decoding/expected/rewrite.out
+++ b/contrib/test_decoding/expected/rewrite.out
@@ -106,7 +106,7 @@ VACUUM FULL pg_class;
-- reindexing of important relations / indexes
REINDEX TABLE pg_class;
REINDEX INDEX pg_class_oid_index;
-REINDEX INDEX pg_class_tblspc_relfilenode_index;
+REINDEX INDEX pg_class_relfilenode_index;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (5, 3);
BEGIN;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (6, 4);
diff --git a/contrib/test_decoding/sql/rewrite.sql b/contrib/test_decoding/sql/rewrite.sql
index 62dead3..8983704 100644
--- a/contrib/test_decoding/sql/rewrite.sql
+++ b/contrib/test_decoding/sql/rewrite.sql
@@ -77,7 +77,7 @@ VACUUM FULL pg_class;
-- reindexing of important relations / indexes
REINDEX TABLE pg_class;
REINDEX INDEX pg_class_oid_index;
-REINDEX INDEX pg_class_tblspc_relfilenode_index;
+REINDEX INDEX pg_class_relfilenode_index;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (5, 3);
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index a186e35..b77940f 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1965,7 +1965,7 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
</para>
<para>
Name of the on-disk file of this relation; zero means this
diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
index a06fd3e..e222265 100644
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@@ -62,7 +62,7 @@
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
(references <link linkend="catalog-pg-class"><structname>pg_class</structname></link>.<structfield>relfilenode</structfield>)
</para>
<para>
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 41b9211..b75ad79 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -100,7 +100,7 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
BlockNumber blknum;
BufferGetTag(buffer, &locator, &forknum, &blknum);
- elog(ERROR, "failed to add item to index page in %u/%u/%u",
+ elog(ERROR, "failed to add item to index page in %u/%u/" INT64_FORMAT,
locator.spcOid, locator.dbOid, locator.relNumber);
}
}
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index 7dd3c1d..c699937 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -26,7 +26,7 @@ out_gistxlogPageUpdate(StringInfo buf, gistxlogPageUpdate *xlrec)
static void
out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec)
{
- appendStringInfo(buf, "rel %u/%u/%u; blk %u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; blk %u; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber, xlrec->block,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 923d3bc..f6d278b 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -169,7 +169,7 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
{
xl_heap_new_cid *xlrec = (xl_heap_new_cid *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; tid %u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; tid %u/%u",
xlrec->target_locator.spcOid,
xlrec->target_locator.dbOid,
xlrec->target_locator.relNumber,
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index 4843cd5..70feb2d 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -100,7 +100,7 @@ btree_desc(StringInfo buf, XLogReaderState *record)
{
xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/seqdesc.c b/src/backend/access/rmgrdesc/seqdesc.c
index b3845f9..45c6ee7 100644
--- a/src/backend/access/rmgrdesc/seqdesc.c
+++ b/src/backend/access/rmgrdesc/seqdesc.c
@@ -25,7 +25,7 @@ seq_desc(StringInfo buf, XLogReaderState *record)
xl_seq_rec *xlrec = (xl_seq_rec *) rec;
if (info == XLOG_SEQ_LOG)
- appendStringInfo(buf, "rel %u/%u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT,
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber);
}
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 6fec485..a723790 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -45,8 +45,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
CheckPoint *checkpoint = (CheckPoint *) rec;
appendStringInfo(buf, "redo %X/%X; "
- "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; "
- "oldest xid %u in DB %u; oldest multi %u in DB %u; "
+ "tli %u; prev tli %u; fpw %s; xid %u:%u; relfilenumber " INT64_FORMAT "; oid %u; "
+ "multi %u; offset %u; oldest xid %u in DB %u; oldest multi %u in DB %u; "
"oldest/newest commit timestamp xid: %u/%u; "
"oldest running xid %u; %s",
LSN_FORMAT_ARGS(checkpoint->redo),
@@ -55,6 +55,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->fullPageWrites ? "true" : "false",
EpochFromFullTransactionId(checkpoint->nextXid),
XidFromFullTransactionId(checkpoint->nextXid),
+ checkpoint->nextRelFileNumber,
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
@@ -74,6 +75,13 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
memcpy(&nextOid, rec, sizeof(Oid));
appendStringInfo(buf, "%u", nextOid);
}
+ else if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, rec, sizeof(RelFileNumber));
+ appendStringInfo(buf, INT64_FORMAT, nextRelFileNumber);
+ }
else if (info == XLOG_RESTORE_POINT)
{
xl_restore_point *xlrec = (xl_restore_point *) rec;
@@ -169,6 +177,9 @@ xlog_identify(uint8 info)
case XLOG_NEXTOID:
id = "NEXTOID";
break;
+ case XLOG_NEXT_RELFILENUMBER:
+ id = "NEXT_RELFILENUMBER";
+ break;
case XLOG_SWITCH:
id = "SWITCH";
break;
@@ -237,7 +248,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfoChar(buf, ' ');
appendStringInfo(buf,
- "blkref #%d: rel %u/%u/%u fork %s blk %u",
+ "blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -297,7 +308,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
if (forknum != MAIN_FORKNUM)
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u fork %s blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -306,7 +317,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
else
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
blk);
diff --git a/src/backend/access/transam/README b/src/backend/access/transam/README
index 734c39a..8911671 100644
--- a/src/backend/access/transam/README
+++ b/src/backend/access/transam/README
@@ -692,8 +692,9 @@ by having database restart search for files that don't have any committed
entry in pg_class, but that currently isn't done because of the possibility
of deleting data that is useful for forensic analysis of the crash.
Orphan files are harmless --- at worst they waste a bit of disk space ---
-because we check for on-disk collisions when allocating new relfilenumber
-OIDs. So cleaning up isn't really necessary.
+because the relfilenumber counter is monotonically increasing. The maximum
+value is 2^56-1, and there is no provision for wraparound. Thus, on-disk
+collisions aren't possible.
3. Deleting a table, which requires an unlink() that could fail.
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 849a7ce..a2f0d35 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -13,12 +13,16 @@
#include "postgres.h"
+#include <unistd.h>
+
#include "access/clog.h"
#include "access/commit_ts.h"
#include "access/subtrans.h"
#include "access/transam.h"
#include "access/xact.h"
#include "access/xlogutils.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_tablespace.h"
#include "commands/dbcommands.h"
#include "miscadmin.h"
#include "postmaster/autovacuum.h"
@@ -30,6 +34,15 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+/* Number of RelFileNumbers to be logged per XLOG write */
+#define VAR_RELNUMBER_PER_XLOG 512
+
+/*
+ * Need to log more if remaining logged RelFileNumbers are less than the
+ * threshold. Valid range could be between 0 to VAR_RELNUMBER_PER_XLOG - 1.
+ */
+#define VAR_RELNUMBER_NEW_XLOG_THRESHOLD 256
+
/* pointer to "variable cache" in shared memory (set up by shmem.c) */
VariableCache ShmemVariableCache = NULL;
@@ -521,8 +534,7 @@ ForceTransactionIdLimitUpdate(void)
* wide, counter wraparound will occur eventually, and therefore it is unwise
* to assume they are unique unless precautions are taken to make them so.
* Hence, this routine should generally not be used directly. The only direct
- * callers should be GetNewOidWithIndex() and GetNewRelFileNumber() in
- * catalog/catalog.c.
+ * caller should be GetNewOidWithIndex() in catalog/catalog.c.
*/
Oid
GetNewObjectId(void)
@@ -613,6 +625,173 @@ SetNextObjectId(Oid nextOid)
}
/*
+ * GetNewRelFileNumber
+ *
+ * Similar to GetNewObjectId, but generates a new relfilenumber instead of
+ * a new Oid.
+ */
+RelFileNumber
+GetNewRelFileNumber(Oid reltablespace, char relpersistence)
+{
+ RelFileNumber result;
+
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot assign RelFileNumber during recovery");
+
+ if (IsBinaryUpgrade)
+ elog(ERROR, "cannot assign RelFileNumber during binary upgrade");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /* check for overflow of the relfilenumber counter */
+ if (unlikely(ShmemVariableCache->nextRelFileNumber > MAX_RELFILENUMBER))
+ elog(ERROR, "relfilenumber is out of bounds");
+
+ /*
+ * If the number of remaining logged values drops below the threshold,
+ * log more. Ideally we could wait until all logged relfilenumbers have
+ * been consumed before logging more, but then we would have to flush the
+ * new WAL record immediately: nextRelFileNumber must always be larger
+ * than any relfilenumber already in use on disk, and to maintain that
+ * invariant the record must reach disk before any file is created from
+ * the newly logged range. By always logging before the current range is
+ * exhausted, we only need to flush the previously logged record before
+ * consuming from the new range, and by then it has usually been flushed
+ * already by some other XLogFlush operation. VAR_RELNUMBER_PER_XLOG is
+ * large enough that the extra flushes should rarely slow things down,
+ * but avoiding them is still worthwhile.
+ */
+ if (ShmemVariableCache->loggedRelFileNumber -
+ ShmemVariableCache->nextRelFileNumber <=
+ VAR_RELNUMBER_NEW_XLOG_THRESHOLD)
+ {
+ RelFileNumber newlogrelnum;
+
+ /*
+ * The first time through, this immediately flushes the newly logged
+ * WAL record, since nothing has been logged in advance. From then
+ * on, we remember the previously logged record pointer and flush up
+ * to that point.
+ *
+ * XXX The second time through, it may flush what the first pass
+ * already flushed, but that is logically a no-op, so it is not worth
+ * adding extra complexity to avoid it.
+ */
+ newlogrelnum = ShmemVariableCache->nextRelFileNumber +
+ VAR_RELNUMBER_PER_XLOG;
+ LogNextRelFileNumber(newlogrelnum,
+ &ShmemVariableCache->loggedRelFileNumberRecPtr);
+
+ ShmemVariableCache->loggedRelFileNumber = newlogrelnum;
+ }
+
+ result = ShmemVariableCache->nextRelFileNumber;
+ (ShmemVariableCache->nextRelFileNumber)++;
+
+ LWLockRelease(RelFileNumberGenLock);
+
+#ifdef USE_ASSERT_CHECKING
+
+ {
+ RelFileLocatorBackend rlocator;
+ char *rpath;
+ BackendId backend;
+
+ switch (relpersistence)
+ {
+ case RELPERSISTENCE_TEMP:
+ backend = BackendIdForTempRelations();
+ break;
+ case RELPERSISTENCE_UNLOGGED:
+ case RELPERSISTENCE_PERMANENT:
+ backend = InvalidBackendId;
+ break;
+ default:
+ elog(ERROR, "invalid relpersistence: %c", relpersistence);
+ return InvalidRelFileNumber; /* placate compiler */
+ }
+
+ /* this logic should match RelationInitPhysicalAddr */
+ rlocator.locator.spcOid =
+ reltablespace ? reltablespace : MyDatabaseTableSpace;
+ rlocator.locator.dbOid = (reltablespace == GLOBALTABLESPACE_OID) ?
+ InvalidOid : MyDatabaseId;
+ rlocator.locator.relNumber = result;
+
+ /*
+ * The relpath will vary based on the backend ID, so we must
+ * initialize that properly here to make sure that any collisions
+ * based on filename are properly detected.
+ */
+ rlocator.backend = backend;
+
+ /* check for existing file of same name. */
+ rpath = relpath(rlocator, MAIN_FORKNUM);
+ Assert(access(rpath, F_OK) != 0);
+ }
+#endif
+
+ return result;
+}
+
+/*
+ * SetNextRelFileNumber
+ *
+ * This may only be called during pg_upgrade; it advances the RelFileNumber
+ * counter to the specified value if the current value is smaller than the
+ * input value.
+ */
+void
+SetNextRelFileNumber(RelFileNumber relnumber)
+{
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot forward RelFileNumber during recovery");
+
+ if (!IsBinaryUpgrade)
+ elog(ERROR, "RelFileNumber can be set only during binary upgrade");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /*
+ * If the previously assigned nextRelFileNumber is already higher than
+ * the requested value, there is nothing to do. This can happen because
+ * during upgrade objects are not created in relfilenumber order.
+ */
+ if (relnumber <= ShmemVariableCache->nextRelFileNumber)
+ {
+ LWLockRelease(RelFileNumberGenLock);
+ return;
+ }
+
+ /*
+ * If the new relfilenumber to be set is greater than or equal to already
+ * logged relfilenumber then log again.
+ *
+ * XXX The new 'relnumber' can come from any range, so we cannot plan to
+ * piggyback the XLogFlush by logging in advance. That should not really
+ * matter, since this is only called during binary upgrade.
+ */
+ if (relnumber >= ShmemVariableCache->loggedRelFileNumber)
+ {
+ RelFileNumber newlogrelnum;
+
+ newlogrelnum = relnumber + VAR_RELNUMBER_PER_XLOG;
+ LogNextRelFileNumber(newlogrelnum, NULL);
+ ShmemVariableCache->loggedRelFileNumber = newlogrelnum;
+ }
+
+ ShmemVariableCache->nextRelFileNumber = relnumber;
+
+ LWLockRelease(RelFileNumberGenLock);
+}
+
+/*
* StopGeneratingPinnedObjectIds
*
* This is called once during initdb to force the OID counter up to
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 15ab8d9..fa1436b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4543,6 +4543,7 @@ BootStrapXLOG(void)
checkPoint.nextXid =
FullTransactionIdFromEpochAndXid(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstGenbkiObjectId;
+ checkPoint.nextRelFileNumber = FirstNormalRelFileNumber;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
checkPoint.oldestXid = FirstNormalTransactionId;
@@ -4556,7 +4557,10 @@ BootStrapXLOG(void)
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumberRecPtr = InvalidXLogRecPtr;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -5023,7 +5027,9 @@ StartupXLOG(void)
/* initialize shared memory variables from the checkpoint record */
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -6483,6 +6489,18 @@ CreateCheckPoint(int flags)
checkPoint.nextOid += ShmemVariableCache->oidCount;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_SHARED);
+ checkPoint.nextRelFileNumber = ShmemVariableCache->nextRelFileNumber;
+ if (!shutdown)
+ {
+ if (ShmemVariableCache->loggedRelFileNumber < checkPoint.nextRelFileNumber)
+ elog(ERROR, "nextRelFileNumber cannot go backward from " INT64_FORMAT " to " INT64_FORMAT,
+ checkPoint.nextRelFileNumber, ShmemVariableCache->loggedRelFileNumber);
+
+ checkPoint.nextRelFileNumber = ShmemVariableCache->loggedRelFileNumber;
+ }
+ LWLockRelease(RelFileNumberGenLock);
+
MultiXactGetCheckptMulti(shutdown,
&checkPoint.nextMulti,
&checkPoint.nextMultiOffset,
@@ -7361,6 +7379,35 @@ XLogPutNextOid(Oid nextOid)
}
/*
+ * Similar to XLogPutNextOid, but writes a NEXT_RELFILENUMBER log record
+ * instead of a NEXTOID record. If '*prevrecptr' is a valid XLogRecPtr,
+ * flush the WAL up to that record pointer; otherwise flush up to the
+ * record just logged. Also store the just-logged record pointer in
+ * '*prevrecptr' if prevrecptr is not NULL.
+ */
+void
+LogNextRelFileNumber(RelFileNumber nextrelnumber, XLogRecPtr *prevrecptr)
+{
+ XLogRecPtr recptr;
+
+ XLogBeginInsert();
+ XLogRegisterData((char *) (&nextrelnumber), sizeof(RelFileNumber));
+ recptr = XLogInsert(RM_XLOG_ID, XLOG_NEXT_RELFILENUMBER);
+
+ /*
+ * If a valid prevrecptr is passed then flush that xlog record to disk
+ * otherwise flush the newly logged record.
+ */
+ if ((prevrecptr != NULL) && !XLogRecPtrIsInvalid(*prevrecptr))
+ XLogFlush(*prevrecptr);
+ else
+ XLogFlush(recptr);
+
+ if (prevrecptr != NULL)
+ *prevrecptr = recptr;
+}
+
+/*
* Write an XLOG SWITCH record.
*
* Here we just blindly issue an XLogInsert request for the record.
@@ -7575,6 +7622,16 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
}
+ if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, XLogRecGetData(record), sizeof(RelFileNumber));
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumber = nextRelFileNumber;
+ LWLockRelease(RelFileNumberGenLock);
+ }
else if (info == XLOG_CHECKPOINT_SHUTDOWN)
{
CheckPoint checkPoint;
@@ -7589,6 +7646,10 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
+ LWLockRelease(RelFileNumberGenLock);
MultiXactSetNextMXact(checkPoint.nextMulti,
checkPoint.nextMultiOffset);
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index 87d1421..d27c7fc 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -611,7 +611,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u until %X/%X is replayed, which creates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, which creates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -634,7 +634,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, which truncates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, which truncates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -733,7 +733,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing all prefetch in relation %u/%u/%u until %X/%X is replayed, because the relation does not exist on disk",
+ "suppressing all prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, because the relation does not exist on disk",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -754,7 +754,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, because the relation is too small",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, because the relation is too small",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -793,7 +793,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* truncated beneath our feet?
*/
elog(ERROR,
- "could not prefetch relation %u/%u/%u block %u",
+ "could not prefetch relation %u/%u/" INT64_FORMAT " block %u",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -932,7 +932,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed),
filter->filter_from_block);
@@ -948,7 +948,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed));
#endif
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index e383c21..81d1f27 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2175,14 +2175,14 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
continue;
if (forknum != MAIN_FORKNUM)
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, fork %u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", fork %u, blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
forknum,
blk);
else
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
@@ -2378,7 +2378,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
if (memcmp(replay_image_masked, primary_image_masked, BLCKSZ) != 0)
{
elog(FATAL,
- "inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
+ "inconsistent page found, rel %u/%u/" INT64_FORMAT ", forknum %u, blkno %u",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forknum, blkno);
}
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 0cda225..b4398e1 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -617,17 +617,17 @@ CreateFakeRelcacheEntry(RelFileLocator rlocator)
rel->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
/* We don't know the name of the relation; use relfilenumber instead */
- sprintf(RelationGetRelationName(rel), "%u", rlocator.relNumber);
+ sprintf(RelationGetRelationName(rel), INT64_FORMAT, rlocator.relNumber);
/*
* We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNumber may be
- * different from the relation's OID. It shouldn't really matter though.
- * In recovery, we are running by ourselves and can't have any lock
- * conflicts. While syncing, we already hold AccessExclusiveLock.
+ * relation. Note we are setting relId to just FirstNormalObjectId, which
+ * is completely bogus. It shouldn't really matter though. In recovery,
+ * we are running by ourselves and can't have any lock conflicts. While
+ * syncing, we already hold AccessExclusiveLock.
*/
rel->rd_lockInfo.lockRelId.dbId = rlocator.dbOid;
- rel->rd_lockInfo.lockRelId.relId = rlocator.relNumber;
+ rel->rd_lockInfo.lockRelId.relId = FirstNormalObjectId;
rel->rd_smgr = NULL;
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 6f43870..155400c 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -481,101 +481,6 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
}
/*
- * GetNewRelFileNumber
- * Generate a new relfilenumber that is unique within the
- * database of the given tablespace.
- *
- * If the relfilenumber will also be used as the relation's OID, pass the
- * opened pg_class catalog, and this routine will guarantee that the result
- * is also an unused OID within pg_class. If the result is to be used only
- * as a relfilenumber for an existing relation, pass NULL for pg_class.
- *
- * As with GetNewOidWithIndex(), there is some theoretical risk of a race
- * condition, but it doesn't seem worth worrying about.
- *
- * Note: we don't support using this in bootstrap mode. All relations
- * created by bootstrap have preassigned OIDs, so there's no need.
- */
-RelFileNumber
-GetNewRelFileNumber(Oid reltablespace, Relation pg_class, char relpersistence)
-{
- RelFileLocatorBackend rlocator;
- char *rpath;
- bool collides;
- BackendId backend;
-
- /*
- * If we ever get here during pg_upgrade, there's something wrong; all
- * relfilenumber assignments during a binary-upgrade run should be
- * determined by commands in the dump script.
- */
- Assert(!IsBinaryUpgrade);
-
- switch (relpersistence)
- {
- case RELPERSISTENCE_TEMP:
- backend = BackendIdForTempRelations();
- break;
- case RELPERSISTENCE_UNLOGGED:
- case RELPERSISTENCE_PERMANENT:
- backend = InvalidBackendId;
- break;
- default:
- elog(ERROR, "invalid relpersistence: %c", relpersistence);
- return InvalidRelFileNumber; /* placate compiler */
- }
-
- /* This logic should match RelationInitPhysicalAddr */
- rlocator.locator.spcOid = reltablespace ? reltablespace : MyDatabaseTableSpace;
- rlocator.locator.dbOid =
- (rlocator.locator.spcOid == GLOBALTABLESPACE_OID) ?
- InvalidOid : MyDatabaseId;
-
- /*
- * The relpath will vary based on the backend ID, so we must initialize
- * that properly here to make sure that any collisions based on filename
- * are properly detected.
- */
- rlocator.backend = backend;
-
- do
- {
- CHECK_FOR_INTERRUPTS();
-
- /* Generate the OID */
- if (pg_class)
- rlocator.locator.relNumber = GetNewOidWithIndex(pg_class, ClassOidIndexId,
- Anum_pg_class_oid);
- else
- rlocator.locator.relNumber = GetNewObjectId();
-
- /* Check for existing file of same name */
- rpath = relpath(rlocator, MAIN_FORKNUM);
-
- if (access(rpath, F_OK) == 0)
- {
- /* definite collision */
- collides = true;
- }
- else
- {
- /*
- * Here we have a little bit of a dilemma: if errno is something
- * other than ENOENT, should we declare a collision and loop? In
- * practice it seems best to go ahead regardless of the errno. If
- * there is a colliding file we will get an smgr failure when we
- * attempt to create the new relation file.
- */
- collides = false;
- }
-
- pfree(rpath);
- } while (collides);
-
- return rlocator.locator.relNumber;
-}
-
-/*
* SQL callable interface for GetNewOidWithIndex(). Outside of initdb's
* direct insertions into catalog tables, and recovering from corruption, this
* should rarely be needed.
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 9b03579..c17f60f 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -342,10 +342,18 @@ heap_create(const char *relname,
{
/*
* If relfilenumber is unspecified by the caller then create storage
- * with oid same as relid.
+ * with the same relfilenumber as the relid if it is a system table;
+ * otherwise, allocate a new relfilenumber. For more details, see the
+ * comments atop the FirstNormalRelFileNumber declaration.
*/
if (!RelFileNumberIsValid(relfilenumber))
- relfilenumber = relid;
+ {
+ if (relid < FirstNormalObjectId)
+ relfilenumber = relid;
+ else
+ relfilenumber = GetNewRelFileNumber(reltablespace,
+ relpersistence);
+ }
}
/*
@@ -898,7 +906,7 @@ InsertPgClassTuple(Relation pg_class_desc,
values[Anum_pg_class_reloftype - 1] = ObjectIdGetDatum(rd_rel->reloftype);
values[Anum_pg_class_relowner - 1] = ObjectIdGetDatum(rd_rel->relowner);
values[Anum_pg_class_relam - 1] = ObjectIdGetDatum(rd_rel->relam);
- values[Anum_pg_class_relfilenode - 1] = ObjectIdGetDatum(rd_rel->relfilenode);
+ values[Anum_pg_class_relfilenode - 1] = Int64GetDatum(rd_rel->relfilenode);
values[Anum_pg_class_reltablespace - 1] = ObjectIdGetDatum(rd_rel->reltablespace);
values[Anum_pg_class_relpages - 1] = Int32GetDatum(rd_rel->relpages);
values[Anum_pg_class_reltuples - 1] = Float4GetDatum(rd_rel->reltuples);
@@ -1170,12 +1178,7 @@ heap_create_with_catalog(const char *relname,
if (shared_relation && reltablespace != GLOBALTABLESPACE_OID)
elog(ERROR, "shared relations must be placed in pg_global tablespace");
- /*
- * Allocate an OID for the relation, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the relation, unless we were told what to use. */
if (!OidIsValid(relid))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -1229,8 +1232,8 @@ heap_create_with_catalog(const char *relname,
}
if (!OidIsValid(relid))
- relid = GetNewRelFileNumber(reltablespace, pg_class_desc,
- relpersistence);
+ relid = GetNewOidWithIndex(pg_class_desc, ClassOidIndexId,
+ Anum_pg_class_oid);
}
/*
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index d7192f3..a509c3e 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -898,12 +898,7 @@ index_create(Relation heapRelation,
collationObjectId,
classObjectId);
- /*
- * Allocate an OID for the index, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the index, unless we were told what to use. */
if (!OidIsValid(indexRelationId))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -935,8 +930,8 @@ index_create(Relation heapRelation,
}
else
{
- indexRelationId =
- GetNewRelFileNumber(tableSpaceId, pg_class, relpersistence);
+ indexRelationId = GetNewOidWithIndex(pg_class, ClassOidIndexId,
+ Anum_pg_class_oid);
}
}
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index d708af1..c1bd4f8 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -968,6 +968,10 @@ smgr_redo(XLogReaderState *record)
xl_smgr_create *xlrec = (xl_smgr_create *) XLogRecGetData(record);
SMgrRelation reln;
+ if (xlrec->rlocator.relNumber > ShmemVariableCache->nextRelFileNumber)
+ elog(ERROR, "unexpected relnumber " INT64_FORMAT " that is bigger than nextRelFileNumber " INT64_FORMAT,
+ xlrec->rlocator.relNumber, ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
smgrcreate(reln, xlrec->forkNum, true);
}
@@ -981,6 +985,10 @@ smgr_redo(XLogReaderState *record)
int nforks = 0;
bool need_fsm_vacuum = false;
+ if (xlrec->rlocator.relNumber > ShmemVariableCache->nextRelFileNumber)
+ elog(ERROR, "unexpected relnumber " INT64_FORMAT " that is bigger than nextRelFileNumber " INT64_FORMAT,
+ xlrec->rlocator.relNumber, ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
/*
diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index 099d369..d09168d 100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -250,7 +250,7 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
BlockNumber nblocks;
BlockNumber blkno;
Buffer buf;
- Oid relfilenumber;
+ RelFileNumber relfilenumber;
Page page;
List *rlocatorlist = NIL;
LockRelId relid;
@@ -397,7 +397,7 @@ ScanSourceDatabasePgClassTuple(HeapTupleData *tuple, Oid tbid, Oid dbid,
{
CreateDBRelInfo *relinfo;
Form_pg_class classForm;
- Oid relfilenumber = InvalidRelFileNumber;
+ RelFileNumber relfilenumber = InvalidRelFileNumber;
classForm = (Form_pg_class) GETSTRUCT(tuple);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 7fbee0c..430ade2 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14331,10 +14331,14 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
}
/*
- * Relfilenumbers are not unique in databases across tablespaces, so we
- * need to allocate a new one in the new tablespace.
- */
- newrelfilenumber = GetNewRelFileNumber(newTableSpace, NULL,
+ * Generate a new relfilenumber. We cannot reuse the old relfilenumber
+ * because of the possibility that the relation will be moved back to the
+ * original tablespace before the next checkpoint. At that point, the
+ * first segment of the main fork won't have been unlinked yet, and an
+ * attempt to create new relation storage with that same relfilenumber
+ * will fail.
+ */
+ newrelfilenumber = GetNewRelFileNumber(newTableSpace,
rel->rd_rel->relpersistence);
/* Open old and new relation */
diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index cb7d460..66edc64 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -290,7 +290,8 @@ CreateTableSpace(CreateTableSpaceStmt *stmt)
* parts.
*/
if (strlen(location) + 1 + strlen(TABLESPACE_VERSION_DIRECTORY) + 1 +
- OIDCHARS + 1 + OIDCHARS + 1 + FORKNAMECHARS + 1 + OIDCHARS > MAXPGPATH)
+ OIDCHARS + 1 + RELNUMBERCHARS + 1 + FORKNAMECHARS + 1 + OIDCHARS >
+ MAXPGPATH)
ereport(ERROR,
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
errmsg("tablespace location \"%s\" is too long",
diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index 86cf1b3..dc8b9da 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -961,12 +961,12 @@ _read${n}(void)
print $off "\tWRITE_UINT_FIELD($f);\n";
print $rff "\tREAD_UINT_FIELD($f);\n" unless $no_read;
}
- elsif ($t eq 'uint64')
+ elsif ($t eq 'uint64' || $t eq 'RelFileNumber')
{
print $off "\tWRITE_UINT64_FIELD($f);\n";
print $rff "\tREAD_UINT64_FIELD($f);\n" unless $no_read;
}
- elsif ($t eq 'Oid' || $t eq 'RelFileNumber')
+ elsif ($t eq 'Oid')
{
print $off "\tWRITE_OID_FIELD($f);\n";
print $rff "\tREAD_OID_FIELD($f);\n" unless $no_read;
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 637c0ce..4de249b 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -1172,7 +1172,8 @@ sendDir(bbsink *sink, const char *path, int basepathlen, bool sizeonly,
int excludeIdx;
bool excludeFound;
ForkNumber relForkNum; /* Type of fork if file is a relation */
- int relOidChars; /* Chars in filename that are the rel oid */
+ int relnumchars; /* Chars in filename that are the
+ * relnumber */
/* Skip special stuff */
if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
@@ -1222,23 +1223,23 @@ sendDir(bbsink *sink, const char *path, int basepathlen, bool sizeonly,
/* Exclude all forks for unlogged tables except the init fork */
if (isDbDir &&
- parse_filename_for_nontemp_relation(de->d_name, &relOidChars,
+ parse_filename_for_nontemp_relation(de->d_name, &relnumchars,
&relForkNum))
{
/* Never exclude init forks */
if (relForkNum != INIT_FORKNUM)
{
char initForkFile[MAXPGPATH];
- char relOid[OIDCHARS + 1];
+ char relNumber[RELNUMBERCHARS + 1];
/*
* If any other type of fork, check if there is an init fork
* with the same OID. If so, the file can be excluded.
*/
- memcpy(relOid, de->d_name, relOidChars);
- relOid[relOidChars] = '\0';
+ memcpy(relNumber, de->d_name, relnumchars);
+ relNumber[relnumchars] = '\0';
snprintf(initForkFile, sizeof(initForkFile), "%s/%s_init",
- path, relOid);
+ path, relNumber);
if (lstat(initForkFile, &statbuf) == 0)
{
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index c5c6a2b..7029604 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -154,6 +154,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
break;
case XLOG_NOOP:
case XLOG_NEXTOID:
+ case XLOG_NEXT_RELFILENUMBER:
case XLOG_SWITCH:
case XLOG_BACKUP_END:
case XLOG_PARAMETER_CHANGE:
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 88a37fd..99580a4 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -4869,7 +4869,7 @@ DisplayMapping(HTAB *tuplecid_data)
hash_seq_init(&hstat, tuplecid_data);
while ((ent = (ReorderBufferTupleCidEnt *) hash_seq_search(&hstat)) != NULL)
{
- elog(DEBUG3, "mapping: node: %u/%u/%u tid: %u/%u cmin: %u, cmax: %u",
+ elog(DEBUG3, "mapping: node: %u/%u/" INT64_FORMAT " tid: %u/%u cmin: %u, cmax: %u",
ent->key.rlocator.dbOid,
ent->key.rlocator.spcOid,
ent->key.rlocator.relNumber,
diff --git a/src/backend/storage/file/reinit.c b/src/backend/storage/file/reinit.c
index f053fe0..a4bae7c 100644
--- a/src/backend/storage/file/reinit.c
+++ b/src/backend/storage/file/reinit.c
@@ -195,11 +195,11 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
while ((de = ReadDir(dbspace_dir, dbspacedirname)) != NULL)
{
ForkNumber forkNum;
- int oidchars;
+ int relnumchars;
unlogged_relation_entry ent;
/* Skip anything that doesn't look like a relation data file. */
- if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars,
+ if (!parse_filename_for_nontemp_relation(de->d_name, &relnumchars,
&forkNum))
continue;
@@ -235,11 +235,11 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
while ((de = ReadDir(dbspace_dir, dbspacedirname)) != NULL)
{
ForkNumber forkNum;
- int oidchars;
+ int relnumchars;
unlogged_relation_entry ent;
/* Skip anything that doesn't look like a relation data file. */
- if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars,
+ if (!parse_filename_for_nontemp_relation(de->d_name, &relnumchars,
&forkNum))
continue;
@@ -285,13 +285,13 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
while ((de = ReadDir(dbspace_dir, dbspacedirname)) != NULL)
{
ForkNumber forkNum;
- int oidchars;
- char oidbuf[OIDCHARS + 1];
+ int relnumchars;
+ char relnumbuf[RELNUMBERCHARS + 1];
char srcpath[MAXPGPATH * 2];
char dstpath[MAXPGPATH];
/* Skip anything that doesn't look like a relation data file. */
- if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars,
+ if (!parse_filename_for_nontemp_relation(de->d_name, &relnumchars,
&forkNum))
continue;
@@ -304,10 +304,10 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
dbspacedirname, de->d_name);
/* Construct destination pathname. */
- memcpy(oidbuf, de->d_name, oidchars);
- oidbuf[oidchars] = '\0';
+ memcpy(relnumbuf, de->d_name, relnumchars);
+ relnumbuf[relnumchars] = '\0';
snprintf(dstpath, sizeof(dstpath), "%s/%s%s",
- dbspacedirname, oidbuf, de->d_name + oidchars + 1 +
+ dbspacedirname, relnumbuf, de->d_name + relnumchars + 1 +
strlen(forkNames[INIT_FORKNUM]));
/* OK, we're ready to perform the actual copy. */
@@ -328,12 +328,12 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
while ((de = ReadDir(dbspace_dir, dbspacedirname)) != NULL)
{
ForkNumber forkNum;
- int oidchars;
- char oidbuf[OIDCHARS + 1];
+ int relnumchars;
+ char relnumbuf[RELNUMBERCHARS + 1];
char mainpath[MAXPGPATH];
/* Skip anything that doesn't look like a relation data file. */
- if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars,
+ if (!parse_filename_for_nontemp_relation(de->d_name, &relnumchars,
&forkNum))
continue;
@@ -342,10 +342,10 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
continue;
/* Construct main fork pathname. */
- memcpy(oidbuf, de->d_name, oidchars);
- oidbuf[oidchars] = '\0';
+ memcpy(relnumbuf, de->d_name, relnumchars);
+ relnumbuf[relnumchars] = '\0';
snprintf(mainpath, sizeof(mainpath), "%s/%s%s",
- dbspacedirname, oidbuf, de->d_name + oidchars + 1 +
+ dbspacedirname, relnumbuf, de->d_name + relnumchars + 1 +
strlen(forkNames[INIT_FORKNUM]));
fsync_fname(mainpath, false);
@@ -372,13 +372,13 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
* for a non-temporary relation and false otherwise.
*
* NB: If this function returns true, the caller is entitled to assume that
- * *oidchars has been set to the a value no more than OIDCHARS, and thus
- * that a buffer of OIDCHARS+1 characters is sufficient to hold the OID
- * portion of the filename. This is critical to protect against a possible
- * buffer overrun.
+ * *relnumchars has been set to a value no more than RELNUMBERCHARS, and thus
+ * that a buffer of RELNUMBERCHARS+1 characters is sufficient to hold the
+ * RelFileNumber portion of the filename. This is critical to protect against
+ * a possible buffer overrun.
*/
bool
-parse_filename_for_nontemp_relation(const char *name, int *oidchars,
+parse_filename_for_nontemp_relation(const char *name, int *relnumchars,
ForkNumber *fork)
{
int pos;
@@ -386,9 +386,9 @@ parse_filename_for_nontemp_relation(const char *name, int *oidchars,
/* Look for a non-empty string of digits (that isn't too long). */
for (pos = 0; isdigit((unsigned char) name[pos]); ++pos)
;
- if (pos == 0 || pos > OIDCHARS)
+ if (pos == 0 || pos > RELNUMBERCHARS)
return false;
- *oidchars = pos;
+ *relnumchars = pos;
/* Check for a fork name. */
if (name[pos] != '_')
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index af4dab7..172225b 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -273,7 +273,7 @@ restart:
BlockNumber blknum;
BufferGetTag(buf, &rlocator, &forknum, &blknum);
- elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
+ elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/" INT64_FORMAT,
blknum, rlocator.spcOid, rlocator.dbOid, rlocator.relNumber);
/* make sure we hold an exclusive lock */
diff --git a/src/backend/storage/lmgr/lwlocknames.txt b/src/backend/storage/lmgr/lwlocknames.txt
index 6c7cf6c..3c5d041 100644
--- a/src/backend/storage/lmgr/lwlocknames.txt
+++ b/src/backend/storage/lmgr/lwlocknames.txt
@@ -53,3 +53,4 @@ XactTruncationLock 44
# 45 was XactTruncationLock until removal of BackendRandomLock
WrapLimitsVacuumLock 46
NotifyQueueTailLock 47
+RelFileNumberGenLock 48
\ No newline at end of file
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 3998296..a88fef2 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -257,6 +257,13 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* next checkpoint, we prevent reassignment of the relfilenumber until it's
* safe, because relfilenumber assignment skips over any existing file.
*
+ * XXX. Although all of this was true when relfilenumbers were 32 bits wide,
+ * they are now 56 bits wide and do not wrap around, so in the future we can
+ * change the code to immediately unlink the first segment of the relation
+ * along with all the others. We still do reuse relfilenumbers when createdb()
+ * is performed using the file-copy method or during movedb(), but the scenario
+ * described above can only happen when creating a new relation.
+ *
* We do not need to go through this dance for temp relations, though, because
* we never make WAL entries for temp rels, and so a temp rel poses no threat
* to the health of a regular rel that has taken over its relfilenumber.
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index c1a5feb..ed46ac3 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -154,7 +154,7 @@ smgropen(RelFileLocator rlocator, BackendId backend)
/* First time through: initialize the hash table */
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileLocatorBackend);
+ ctl.keysize = SizeOfRelFileLocatorBackend;
ctl.entrysize = sizeof(SMgrRelationData);
SMgrRelationHash = hash_create("smgr relation table", 400,
&ctl, HASH_ELEM | HASH_BLOBS);
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 34efa12..31d7471 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -878,7 +878,7 @@ pg_relation_filenode(PG_FUNCTION_ARGS)
if (!RelFileNumberIsValid(result))
PG_RETURN_NULL();
- PG_RETURN_OID(result);
+ PG_RETURN_INT64(result);
}
/*
@@ -898,9 +898,16 @@ Datum
pg_filenode_relation(PG_FUNCTION_ARGS)
{
Oid reltablespace = PG_GETARG_OID(0);
- RelFileNumber relfilenumber = PG_GETARG_OID(1);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(1);
Oid heaprel;
+ /* check whether the relfilenumber is within a valid range */
+ if (relfilenumber < 0 || relfilenumber > MAX_RELFILENUMBER)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("relfilenumber " INT64_FORMAT " is out of range",
+ relfilenumber)));
+
/* test needed so RelidByRelfilenumber doesn't misbehave */
if (!RelFileNumberIsValid(relfilenumber))
PG_RETURN_NULL();
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 797f5f5..ef5af83 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -17,6 +17,7 @@
#include "catalog/pg_type.h"
#include "commands/extension.h"
#include "miscadmin.h"
+#include "storage/relfilelocator.h"
#include "utils/array.h"
#include "utils/builtins.h"
@@ -29,6 +30,15 @@ do { \
errmsg("function can only be called when server is in binary upgrade mode"))); \
} while (0)
+#define CHECK_RELFILENUMBER_RANGE(relfilenumber) \
+do { \
+ if ((relfilenumber) < 0 || (relfilenumber) > MAX_RELFILENUMBER) \
+ ereport(ERROR, \
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE), \
+ errmsg("relfilenumber " INT64_FORMAT " is out of range", \
+ (relfilenumber)))); \
+} while (0)
+
Datum
binary_upgrade_set_next_pg_tablespace_oid(PG_FUNCTION_ARGS)
{
@@ -98,10 +108,12 @@ binary_upgrade_set_next_heap_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_heap_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_heap_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -120,10 +132,12 @@ binary_upgrade_set_next_index_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_index_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_index_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -142,10 +156,12 @@ binary_upgrade_set_next_toast_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_toast_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_toast_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index bdb771d..a0b7c5b 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -3709,7 +3709,7 @@ RelationSetNewRelfilenumber(Relation relation, char persistence)
/* Allocate a new relfilenumber */
newrelfilenumber = GetNewRelFileNumber(relation->rd_rel->reltablespace,
- NULL, persistence);
+ persistence);
/*
* Get a writable copy of the pg_class tuple for the given relation.
diff --git a/src/backend/utils/cache/relfilenumbermap.c b/src/backend/utils/cache/relfilenumbermap.c
index c4245d5..cbb18f0 100644
--- a/src/backend/utils/cache/relfilenumbermap.c
+++ b/src/backend/utils/cache/relfilenumbermap.c
@@ -32,17 +32,11 @@
static HTAB *RelfilenumberMapHash = NULL;
/* built first time through in InitializeRelfilenumberMap */
-static ScanKeyData relfilenumber_skey[2];
+static ScanKeyData relfilenumber_skey[1];
typedef struct
{
- Oid reltablespace;
- RelFileNumber relfilenumber;
-} RelfilenumberMapKey;
-
-typedef struct
-{
- RelfilenumberMapKey key; /* lookup key - must be first */
+ RelFileNumber relfilenumber; /* lookup key - must be first */
Oid relid; /* pg_class.oid */
} RelfilenumberMapEntry;
@@ -72,7 +66,7 @@ RelfilenumberMapInvalidateCallback(Datum arg, Oid relid)
entry->relid == relid) /* individual flushed relation */
{
if (hash_search(RelfilenumberMapHash,
- (void *) &entry->key,
+ (void *) &entry->relfilenumber,
HASH_REMOVE,
NULL) == NULL)
elog(ERROR, "hash table corrupted");
@@ -88,7 +82,6 @@ static void
InitializeRelfilenumberMap(void)
{
HASHCTL ctl;
- int i;
/* Make sure we've initialized CacheMemoryContext. */
if (CacheMemoryContext == NULL)
@@ -97,25 +90,20 @@ InitializeRelfilenumberMap(void)
/* build skey */
MemSet(&relfilenumber_skey, 0, sizeof(relfilenumber_skey));
- for (i = 0; i < 2; i++)
- {
- fmgr_info_cxt(F_OIDEQ,
- &relfilenumber_skey[i].sk_func,
- CacheMemoryContext);
- relfilenumber_skey[i].sk_strategy = BTEqualStrategyNumber;
- relfilenumber_skey[i].sk_subtype = InvalidOid;
- relfilenumber_skey[i].sk_collation = InvalidOid;
- }
-
- relfilenumber_skey[0].sk_attno = Anum_pg_class_reltablespace;
- relfilenumber_skey[1].sk_attno = Anum_pg_class_relfilenode;
+ fmgr_info_cxt(F_INT8EQ,
+ &relfilenumber_skey[0].sk_func,
+ CacheMemoryContext);
+ relfilenumber_skey[0].sk_strategy = BTEqualStrategyNumber;
+ relfilenumber_skey[0].sk_subtype = InvalidOid;
+ relfilenumber_skey[0].sk_collation = InvalidOid;
+ relfilenumber_skey[0].sk_attno = Anum_pg_class_relfilenode;
/*
* Only create the RelfilenumberMapHash now, so we don't end up partially
* initialized when fmgr_info_cxt() above ERRORs out with an out of memory
* error.
*/
- ctl.keysize = sizeof(RelfilenumberMapKey);
+ ctl.keysize = sizeof(RelFileNumber);
ctl.entrysize = sizeof(RelfilenumberMapEntry);
ctl.hcxt = CacheMemoryContext;
@@ -129,7 +117,7 @@ InitializeRelfilenumberMap(void)
}
/*
- * Map a relation's (tablespace, relfilenumber) to a relation's oid and cache
+ * Map a relation's relfilenumber to a relation's oid and cache
* the result.
*
* Returns InvalidOid if no relation matching the criteria could be found.
@@ -137,26 +125,17 @@ InitializeRelfilenumberMap(void)
Oid
RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
{
- RelfilenumberMapKey key;
RelfilenumberMapEntry *entry;
bool found;
SysScanDesc scandesc;
Relation relation;
HeapTuple ntp;
- ScanKeyData skey[2];
+ ScanKeyData skey[1];
Oid relid;
if (RelfilenumberMapHash == NULL)
InitializeRelfilenumberMap();
- /* pg_class will show 0 when the value is actually MyDatabaseTableSpace */
- if (reltablespace == MyDatabaseTableSpace)
- reltablespace = 0;
-
- MemSet(&key, 0, sizeof(key));
- key.reltablespace = reltablespace;
- key.relfilenumber = relfilenumber;
-
/*
* Check cache and return entry if one is found. Even if no target
* relation can be found later on we store the negative match and return a
@@ -164,7 +143,8 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
* since querying invalid values isn't supposed to be a frequent thing,
* but it's basically free.
*/
- entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_FIND, &found);
+ entry = hash_search(RelfilenumberMapHash, (void *) &relfilenumber,
+ HASH_FIND, &found);
if (found)
return entry->relid;
@@ -195,14 +175,13 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
memcpy(skey, relfilenumber_skey, sizeof(skey));
/* set scan arguments */
- skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
- skey[1].sk_argument = ObjectIdGetDatum(relfilenumber);
+ skey[0].sk_argument = Int64GetDatum(relfilenumber);
scandesc = systable_beginscan(relation,
- ClassTblspcRelfilenodeIndexId,
+ ClassRelfilenodeIndexId,
true,
NULL,
- 2,
+ 1,
skey);
found = false;
@@ -213,11 +192,10 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
if (found)
elog(ERROR,
- "unexpected duplicate for tablespace %u, relfilenumber %u",
- reltablespace, relfilenumber);
+ "unexpected duplicate for relfilenumber " INT64_FORMAT,
+ relfilenumber);
found = true;
- Assert(classform->reltablespace == reltablespace);
Assert(classform->relfilenode == relfilenumber);
relid = classform->oid;
}
@@ -235,7 +213,8 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
* caused cache invalidations to be executed which would have deleted a
* new entry if we had entered it above.
*/
- entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_ENTER, &found);
+ entry = hash_search(RelfilenumberMapHash, (void *) &relfilenumber,
+ HASH_ENTER, &found);
if (found)
elog(ERROR, "corrupted hashtable");
entry->relid = relid;
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index 781f8b8..3c1fef4 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -79,8 +79,8 @@ pg_control_system(PG_FUNCTION_ARGS)
Datum
pg_control_checkpoint(PG_FUNCTION_ARGS)
{
- Datum values[18];
- bool nulls[18];
+ Datum values[19];
+ bool nulls[19];
TupleDesc tupdesc;
HeapTuple htup;
ControlFileData *ControlFile;
@@ -129,6 +129,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
XIDOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 18, "checkpoint_time",
TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 19, "next_relfilenumber",
+ INT8OID, -1, 0);
tupdesc = BlessTupleDesc(tupdesc);
/* Read the control file. */
@@ -202,6 +204,9 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
values[17] = TimestampTzGetDatum(time_t_to_timestamptz(ControlFile->checkPointCopy.time));
nulls[17] = false;
+ values[18] = Int64GetDatum(ControlFile->checkPointCopy.nextRelFileNumber);
+ nulls[18] = false;
+
htup = heap_form_tuple(tupdesc, values, nulls);
PG_RETURN_DATUM(HeapTupleGetDatum(htup));
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index dc20122..30933fd 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -489,9 +489,7 @@ main(int argc, char *argv[])
mode = PG_MODE_ENABLE;
break;
case 'f':
- if (!option_parse_int(optarg, "-f/--filenode", 0,
- INT_MAX,
- NULL))
+ if (!option_parse_relfilenumber(optarg, "-f/--filenode"))
exit(1);
only_filenode = pstrdup(optarg);
break;
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index c390ec5..f727078 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -250,6 +250,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
EpochFromFullTransactionId(ControlFile->checkPointCopy.nextXid),
XidFromFullTransactionId(ControlFile->checkPointCopy.nextXid));
+ printf(_("Latest checkpoint's NextRelFileNumber: " INT64_FORMAT "\n"),
+ ControlFile->checkPointCopy.nextRelFileNumber);
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index f9c51d1..0db33ef 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3142,9 +3142,9 @@ dumpDatabase(Archive *fout)
PQExpBuffer loFrozenQry = createPQExpBuffer();
PQExpBuffer loOutQry = createPQExpBuffer();
int i_relfrozenxid,
- i_relfilenode,
i_oid,
i_relminmxid;
+ RelFileNumber i_relfilenode;
/*
* pg_largeobject
@@ -3170,11 +3170,11 @@ dumpDatabase(Archive *fout)
appendPQExpBufferStr(loOutQry, "\n-- For binary upgrade, preserve values for pg_largeobject and its index\n");
for (int i = 0; i < PQntuples(lo_res); ++i)
appendPQExpBuffer(loOutQry, "UPDATE pg_catalog.pg_class\n"
- "SET relfrozenxid = '%u', relminmxid = '%u', relfilenode = '%u'\n"
+ "SET relfrozenxid = '%u', relminmxid = '%u', relfilenode = " INT64_FORMAT "\n"
"WHERE oid = %u;\n",
atooid(PQgetvalue(lo_res, i, i_relfrozenxid)),
atooid(PQgetvalue(lo_res, i, i_relminmxid)),
- atooid(PQgetvalue(lo_res, i, i_relfilenode)),
+ atorelnumber(PQgetvalue(lo_res, i, i_relfilenode)),
atooid(PQgetvalue(lo_res, i, i_oid)));
ArchiveEntry(fout, nilCatalogId, createDumpId(),
@@ -4853,16 +4853,16 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
relkind = *PQgetvalue(upgrade_res, 0, PQfnumber(upgrade_res, "relkind"));
- relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "relfilenode")));
+ relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "relfilenode")));
toast_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "reltoastrelid")));
- toast_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_relfilenode")));
+ toast_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_relfilenode")));
toast_index_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "indexrelid")));
- toast_index_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_index_relfilenode")));
+ toast_index_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_index_relfilenode")));
appendPQExpBufferStr(upgrade_buffer,
"\n-- For binary upgrade, must preserve pg_class oids and relfilenodes\n");
@@ -4880,7 +4880,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
*/
if (RelFileNumberIsValid(relfilenumber) && relkind != RELKIND_PARTITIONED_TABLE)
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
/*
@@ -4894,7 +4894,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_toast_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_relfilenumber);
/* every toast table has an index */
@@ -4902,7 +4902,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_index_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_index_relfilenumber);
}
@@ -4915,7 +4915,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
pg_class_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
}
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 269ed64..8be5e66 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -538,7 +538,7 @@ isRelDataFile(const char *path)
segNo = 0;
matched = false;
- nmatch = sscanf(path, "global/%u.%u", &rlocator.relNumber, &segNo);
+ nmatch = sscanf(path, "global/" INT64_FORMAT ".%u", &rlocator.relNumber, &segNo);
if (nmatch == 1 || nmatch == 2)
{
rlocator.spcOid = GLOBALTABLESPACE_OID;
@@ -547,7 +547,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "base/%u/%u.%u",
+ nmatch = sscanf(path, "base/%u/" INT64_FORMAT ".%u",
&rlocator.dbOid, &rlocator.relNumber, &segNo);
if (nmatch == 2 || nmatch == 3)
{
@@ -556,7 +556,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/%u.%u",
+ nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/" INT64_FORMAT ".%u",
&rlocator.spcOid, &rlocator.dbOid, &rlocator.relNumber,
&segNo);
if (nmatch == 3 || nmatch == 4)
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index df374ce..e97ba4d 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -399,8 +399,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
i_reloid,
i_indtable,
i_toastheap,
- i_relfilenumber,
i_reltablespace;
+ RelFileNumber i_relfilenumber;
char query[QUERY_ALLOC];
char *last_namespace = NULL,
*last_tablespace = NULL;
@@ -527,7 +527,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
relname = PQgetvalue(res, relnum, i_relname);
curr->relname = pg_strdup(relname);
- curr->relfilenumber = atooid(PQgetvalue(res, relnum, i_relfilenumber));
+ curr->relfilenumber =
+ atorelnumber(PQgetvalue(res, relnum, i_relfilenumber));
curr->tblsp_alloc = false;
/* Is the tablespace oid non-default? */
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index 115faa2..7ab1bcc 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -15,10 +15,8 @@
* oids are the same between old and new clusters. This is important
* because toast oids are stored as toast pointers in user tables.
*
- * While pg_class.oid and pg_class.relfilenode are initially the same in a
- * cluster, they can diverge due to CLUSTER, REINDEX, or VACUUM FULL. We
- * control assignments of pg_class.relfilenode because we want the filenames
- * to match between the old and new cluster.
+ * We control assignments of pg_class.relfilenode because we want the
+ * filenames to match between the old and new cluster.
*
* We control assignment of pg_tablespace.oid because we want the oid to match
* between the old and new cluster.
diff --git a/src/bin/pg_upgrade/relfilenumber.c b/src/bin/pg_upgrade/relfilenumber.c
index c3c7402..c8b10e4 100644
--- a/src/bin/pg_upgrade/relfilenumber.c
+++ b/src/bin/pg_upgrade/relfilenumber.c
@@ -190,14 +190,14 @@ transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_fro
else
snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
- snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
+ snprintf(old_file, sizeof(old_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->old_tablespace,
map->old_tablespace_suffix,
map->db_oid,
map->relfilenumber,
type_suffix,
extent_suffix);
- snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
+ snprintf(new_file, sizeof(new_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->new_tablespace,
map->new_tablespace_suffix,
map->db_oid,
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 6528113..9aca583 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -884,7 +884,7 @@ main(int argc, char **argv)
}
break;
case 'R':
- if (sscanf(optarg, "%u/%u/%u",
+ if (sscanf(optarg, "%u/%u/" INT64_FORMAT,
&config.filter_by_relation.spcOid,
&config.filter_by_relation.dbOid,
&config.filter_by_relation.relNumber) != 3 ||
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 1b6b620..0774e3f 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -149,10 +149,10 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
Assert(dbOid == 0);
Assert(backendId == InvalidBackendId);
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("global/%u_%s",
+ path = psprintf("global/" INT64_FORMAT "_%s",
relNumber, forkNames[forkNumber]);
else
- path = psprintf("global/%u", relNumber);
+ path = psprintf("global/" INT64_FORMAT, relNumber);
}
else if (spcOid == DEFAULTTABLESPACE_OID)
{
@@ -160,21 +160,21 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/%u_%s",
+ path = psprintf("base/%u/" INT64_FORMAT "_%s",
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/%u",
+ path = psprintf("base/%u/" INT64_FORMAT,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/t%d_%u_%s",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT "_%s",
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/t%d_%u",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT,
dbOid, backendId, relNumber);
}
}
@@ -184,24 +184,24 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber);
}
diff --git a/src/fe_utils/option_utils.c b/src/fe_utils/option_utils.c
index abea881..7d49359 100644
--- a/src/fe_utils/option_utils.c
+++ b/src/fe_utils/option_utils.c
@@ -82,3 +82,42 @@ option_parse_int(const char *optarg, const char *optname,
*result = val;
return true;
}
+
+/*
+ * option_parse_relfilenumber
+ *
+ * Parse a relfilenumber value for an option. If parsing is successful,
+ * returns true; if parsing fails, reports an error and returns false.
+ */
+bool
+option_parse_relfilenumber(const char *optarg, const char *optname)
+{
+ char *endptr;
+ int64 val;
+
+ errno = 0;
+ val = strtoi64(optarg, &endptr, 10);
+
+ /*
+ * Skip any trailing whitespace; if anything but whitespace remains before
+ * the terminating character, fail.
+ */
+ while (*endptr != '\0' && isspace((unsigned char) *endptr))
+ endptr++;
+
+ if (*endptr != '\0')
+ {
+ pg_log_error("invalid value \"%s\" for option %s",
+ optarg, optname);
+ return false;
+ }
+
+ if (val < 0 || val > MAX_RELFILENUMBER)
+ {
+ pg_log_error("%s must be in range " INT64_FORMAT ".." INT64_FORMAT,
+ optname, (RelFileNumber) 0, MAX_RELFILENUMBER);
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 775471d..c7c51f2 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -196,6 +196,21 @@ FullTransactionIdAdvance(FullTransactionId *dest)
#define FirstUnpinnedObjectId 12000
#define FirstNormalObjectId 16384
+/* ----------
+ * RelFileNumber zero is InvalidRelFileNumber.
+ *
+ * For the system tables (OID < FirstNormalObjectId) the initial storage
+ * will be created with the relfilenumber same as their oid. And, later for
+ * any storage the relfilenumber allocated by GetNewRelFileNumber() will start
+ * at 100000. Thus, when upgrading from an older cluster, the relation storage
+ * path for the user table from the old cluster will not conflict with the
+ * relation storage path for the system table from the new cluster. Anyway,
+ * the new cluster must not have any user tables while upgrading, so we needn't
+ * worry about them.
+ * ----------
+ */
+#define FirstNormalRelFileNumber ((RelFileNumber) 100000)
+
/*
* VariableCache is a data structure in shared memory that is used to track
* OID and XID assignment state. For largely historical reasons, there is
@@ -213,6 +228,10 @@ typedef struct VariableCacheData
*/
Oid nextOid; /* next OID to assign */
uint32 oidCount; /* OIDs available before must do XLOG work */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber to assign */
+ RelFileNumber loggedRelFileNumber; /* last logged relfilenumber */
+ XLogRecPtr loggedRelFileNumberRecPtr; /* xlog record pointer w.r.t.
+ * loggedRelFileNumber */
/*
* These fields are protected by XidGenLock.
@@ -293,6 +312,9 @@ extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
extern bool ForceTransactionIdLimitUpdate(void);
extern Oid GetNewObjectId(void);
+extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
+ char relpersistence);
+extern void SetNextRelFileNumber(RelFileNumber relnumber);
extern void StopGeneratingPinnedObjectIds(void);
#ifdef USE_ASSERT_CHECKING
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index cd674c3..78660f1 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -234,6 +234,8 @@ extern void CreateCheckPoint(int flags);
extern bool CreateRestartPoint(int flags);
extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN);
extern void XLogPutNextOid(Oid nextOid);
+extern void LogNextRelFileNumber(RelFileNumber nextrelnumber,
+ XLogRecPtr *prevrecptr);
extern XLogRecPtr XLogRestorePoint(const char *rpName);
extern void UpdateFullPageWrites(void);
extern void GetFullPageWriteInfo(XLogRecPtr *RedoRecPtr_p, bool *doPageWrites_p);
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index e1c85f9..b452530 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -38,8 +38,5 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
-extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
- Relation pg_class,
- char relpersistence);
#endif /* CATALOG_H */
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index e1f4eef..cacfc11 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -34,6 +34,13 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* oid */
Oid oid;
+ /* access method; 0 if not a table / index */
+ Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
+
+ /* identifier of physical storage file */
+ /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
+ int64 relfilenode BKI_DEFAULT(0);
+
/* class name */
NameData relname;
@@ -49,13 +56,6 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* class owner */
Oid relowner BKI_DEFAULT(POSTGRES) BKI_LOOKUP(pg_authid);
- /* access method; 0 if not a table / index */
- Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
-
- /* identifier of physical storage file */
- /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
- Oid relfilenode BKI_DEFAULT(0);
-
/* identifier of table space for relation (0 means default for database) */
Oid reltablespace BKI_DEFAULT(0) BKI_LOOKUP_OPT(pg_tablespace);
@@ -154,7 +154,7 @@ typedef FormData_pg_class *Form_pg_class;
DECLARE_UNIQUE_INDEX_PKEY(pg_class_oid_index, 2662, ClassOidIndexId, on pg_class using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_class_relname_nsp_index, 2663, ClassNameNspIndexId, on pg_class using btree(relname name_ops, relnamespace oid_ops));
-DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode oid_ops));
+DECLARE_INDEX(pg_class_relfilenode_index, 3455, ClassRelfilenodeIndexId, on pg_class using btree(relfilenode int8_ops));
#ifdef EXPOSE_TO_CLIENT_CODE
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 06368e2..096222f 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -41,6 +41,7 @@ typedef struct CheckPoint
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
FullTransactionId nextXid; /* next free transaction ID */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
@@ -78,6 +79,7 @@ typedef struct CheckPoint
#define XLOG_FPI 0xB0
/* 0xC0 is used in Postgres 9.5-11 */
#define XLOG_OVERWRITE_CONTRECORD 0xD0
+#define XLOG_NEXT_RELFILENUMBER 0xE0
/*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 2e41f4d..f6a5b49 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7321,11 +7321,11 @@
proname => 'pg_indexes_size', provolatile => 'v', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_indexes_size' },
{ oid => '2999', descr => 'filenode identifier of relation',
- proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'oid',
+ proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_relation_filenode' },
{ oid => '3454', descr => 'relation OID for filenode and tablespace',
proname => 'pg_filenode_relation', provolatile => 's',
- prorettype => 'regclass', proargtypes => 'oid oid',
+ prorettype => 'regclass', proargtypes => 'oid int8',
prosrc => 'pg_filenode_relation' },
{ oid => '3034', descr => 'file path of relation',
proname => 'pg_relation_filepath', provolatile => 's', prorettype => 'text',
@@ -11191,15 +11191,15 @@
prosrc => 'binary_upgrade_set_missing_value' },
{ oid => '4545', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_heap_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_heap_relfilenode' },
{ oid => '4546', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_index_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_index_relfilenode' },
{ oid => '4547', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_toast_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_toast_relfilenode' },
{ oid => '4548', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_pg_tablespace_oid', provolatile => 'v',
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index 3ab7132..f873306 100644
--- a/src/include/common/relpath.h
+++ b/src/include/common/relpath.h
@@ -29,6 +29,8 @@
/* Characters to allow for an OID in a relation path */
#define OIDCHARS 10 /* max chars printed by %u */
+/* Characters to allow for a RelFileNumber in a relation path */
+#define RELNUMBERCHARS 20 /* max chars printed by INT64_FORMAT */
/*
* Stuff for fork names.
*
diff --git a/src/include/fe_utils/option_utils.h b/src/include/fe_utils/option_utils.h
index 03c09fd..2508a61 100644
--- a/src/include/fe_utils/option_utils.h
+++ b/src/include/fe_utils/option_utils.h
@@ -22,5 +22,7 @@ extern void handle_help_version_opts(int argc, char *argv[],
extern bool option_parse_int(const char *optarg, const char *optname,
int min_range, int max_range,
int *result);
+extern bool option_parse_relfilenumber(const char *optarg,
+ const char *optname);
#endif /* OPTION_UTILS_H */
diff --git a/src/include/postgres_ext.h b/src/include/postgres_ext.h
index c9774fa..12b6b80 100644
--- a/src/include/postgres_ext.h
+++ b/src/include/postgres_ext.h
@@ -48,11 +48,17 @@ typedef PG_INT64_TYPE pg_int64;
/*
* RelFileNumber data type identifies the specific relation file name.
+ * RelFileNumber is unique within a cluster.
*/
-typedef Oid RelFileNumber;
-#define InvalidRelFileNumber ((RelFileNumber) InvalidOid)
+typedef pg_int64 RelFileNumber;
+
+#define InvalidRelFileNumber ((RelFileNumber) 0)
#define RelFileNumberIsValid(relnumber) \
((bool) ((relnumber) != InvalidRelFileNumber))
+#define atorelnumber(x) ((RelFileNumber) strtou64((x), NULL, 10))
+
+/* Max value of the relfilenumber; relfilenumbers are 56 bits wide. */
+#define MAX_RELFILENUMBER INT64CONST(0x00FFFFFFFFFFFFFF)
/*
* Identifiers of error message fields. Kept here to keep common
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 092e959..70c33d1 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -92,29 +92,73 @@ typedef struct buftag
{
Oid spcOid; /* tablespace oid */
Oid dbOid; /* database oid */
- RelFileNumber relNumber; /* relation file number */
- ForkNumber forkNum; /* fork number */
+
+ /*
+ * Represents the relfilenumber and the fork number. The 8 high bits of
+ * the first 32-bit integer hold the fork number, and the remaining 24
+ * bits of the first integer plus the 32 bits of the second integer hold
+ * the relfilenumber, making it 56 bits wide. We use 56 bits rather than
+ * a full 64-bit integer so that the size of BufferTag does not grow,
+ * and two 32-bit integers rather than a single 64-bit integer to avoid
+ * 8-byte alignment padding in the BufferTag structure.
+ */
+ uint32 relForkDetails[2];
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
+/* High relNumber bits in relForkDetails[0] */
+#define BUFTAG_RELNUM_HIGH_BITS 24
+
+/* Low relNumber bits in relForkDetails[1] */
+#define BUFTAG_RELNUM_LOW_BITS 32
+
+/* Mask to fetch high bits of relNumber from relForkDetails[0] */
+#define BUFTAG_RELNUM_HIGH_MASK ((1U << BUFTAG_RELNUM_HIGH_BITS) - 1)
+
+/* Mask to fetch low bits of relNumber from relForkDetails[1] */
+#define BUFTAG_RELNUM_LOW_MASK 0XFFFFFFFF
+
static inline RelFileNumber
BufTagGetRelNumber(const BufferTag *tag)
{
- return tag->relNumber;
+ uint64 relnum;
+
+ relnum = ((uint64) tag->relForkDetails[0]) & BUFTAG_RELNUM_HIGH_MASK;
+ relnum = (relnum << BUFTAG_RELNUM_LOW_BITS) | tag->relForkDetails[1];
+
+ Assert(relnum <= MAX_RELFILENUMBER);
+ return (RelFileNumber) relnum;
}
static inline ForkNumber
BufTagGetForkNum(const BufferTag *tag)
{
- return tag->forkNum;
+ int8 ret;
+
+ StaticAssertStmt(MAX_FORKNUM <= INT8_MAX,
+ "MAX_FORKNUM can't be greater than INT8_MAX");
+
+ ret = (int8) (tag->relForkDetails[0] >> BUFTAG_RELNUM_HIGH_BITS);
+ return (ForkNumber) ret;
}
static inline void
BufTagSetRelForkDetails(BufferTag *tag, RelFileNumber relnumber,
ForkNumber forknum)
{
- tag->relNumber = relnumber;
- tag->forkNum = forknum;
+ uint64 relnum;
+
+ Assert(relnumber <= MAX_RELFILENUMBER);
+ Assert(forknum <= MAX_FORKNUM);
+
+ relnum = relnumber;
+
+ tag->relForkDetails[0] = (relnum >> BUFTAG_RELNUM_LOW_BITS) &
+ BUFTAG_RELNUM_HIGH_MASK;
+ tag->relForkDetails[0] |= (forknum << BUFTAG_RELNUM_HIGH_BITS);
+ tag->relForkDetails[1] = relnum & BUFTAG_RELNUM_LOW_MASK;
}
static inline RelFileLocator
@@ -153,9 +197,9 @@ BUFFERTAGS_EQUAL(const BufferTag *tag1, const BufferTag *tag2)
{
return (tag1->spcOid == tag2->spcOid) &&
(tag1->dbOid == tag2->dbOid) &&
- (tag1->relNumber == tag2->relNumber) &&
- (tag1->blockNum == tag2->blockNum) &&
- (tag1->forkNum == tag2->forkNum);
+ (tag1->relForkDetails[0] == tag2->relForkDetails[0]) &&
+ (tag1->relForkDetails[1] == tag2->relForkDetails[1]) &&
+ (tag1->blockNum == tag2->blockNum);
}
static inline bool
diff --git a/src/include/storage/reinit.h b/src/include/storage/reinit.h
index bf2c10d..b990d28 100644
--- a/src/include/storage/reinit.h
+++ b/src/include/storage/reinit.h
@@ -20,7 +20,8 @@
extern void ResetUnloggedRelations(int op);
extern bool parse_filename_for_nontemp_relation(const char *name,
- int *oidchars, ForkNumber *fork);
+ int *relnumchars,
+ ForkNumber *fork);
#define UNLOGGED_RELATION_CLEANUP 0x0001
#define UNLOGGED_RELATION_INIT 0x0002
diff --git a/src/include/storage/relfilelocator.h b/src/include/storage/relfilelocator.h
index 10f41f3..ef90464 100644
--- a/src/include/storage/relfilelocator.h
+++ b/src/include/storage/relfilelocator.h
@@ -32,10 +32,11 @@
* Nonzero dbOid values correspond to pg_database.oid.
*
* relNumber identifies the specific relation. relNumber corresponds to
- * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
- * to assign new physical files to relations in some situations).
- * Notice that relNumber is only unique within a database in a particular
- * tablespace.
+ * pg_class.relfilenode. Notice that relNumber values are assigned by
+ * GetNewRelFileNumber(), which will only ever assign the same value once
+ * during the lifetime of a cluster. However, since CREATE DATABASE duplicates
+ * the relfilenumbers of the template database, the values are in practice only
+ * unique within a database, not globally.
*
* Note: spcOid must be GLOBALTABLESPACE_OID if and only if dbOid is
* zero. We support shared relations only in the "global" tablespace.
@@ -75,6 +76,9 @@ typedef struct RelFileLocatorBackend
BackendId backend;
} RelFileLocatorBackend;
+#define SizeOfRelFileLocatorBackend \
+ (offsetof(RelFileLocatorBackend, backend) + sizeof(BackendId))
+
#define RelFileLocatorBackendIsTemp(rlocator) \
((rlocator).backend != InvalidBackendId)
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index e3dac16..c0084ae 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -2164,9 +2164,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2175,10 +2174,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+---------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | t | own | child 0 index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | t | own | child 1 index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | t | orig | child 0 index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | t | orig | child 1 index
at_partitioned_id_name_key | t | none | parent index
(6 rows)
@@ -2198,9 +2197,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2209,10 +2207,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+--------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | f | own | parent index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | f | own | parent index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | f | new | parent index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | f | new | parent index
at_partitioned_id_name_key | f | none | parent index
(6 rows)
@@ -2560,7 +2558,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/expected/fast_default.out b/src/test/regress/expected/fast_default.out
index 91f2571..0a35f33 100644
--- a/src/test/regress/expected/fast_default.out
+++ b/src/test/regress/expected/fast_default.out
@@ -3,8 +3,8 @@
--
SET search_path = fast_default;
CREATE SCHEMA fast_default;
-CREATE TABLE m(id OID);
-INSERT INTO m VALUES (NULL::OID);
+CREATE TABLE m(id BIGINT);
+INSERT INTO m VALUES (NULL::BIGINT);
CREATE FUNCTION set(tabname name) RETURNS VOID
AS $$
BEGIN
diff --git a/src/test/regress/expected/oidjoins.out b/src/test/regress/expected/oidjoins.out
index 215eb89..af57470 100644
--- a/src/test/regress/expected/oidjoins.out
+++ b/src/test/regress/expected/oidjoins.out
@@ -74,11 +74,11 @@ NOTICE: checking pg_type {typcollation} => pg_collation {oid}
NOTICE: checking pg_attribute {attrelid} => pg_class {oid}
NOTICE: checking pg_attribute {atttypid} => pg_type {oid}
NOTICE: checking pg_attribute {attcollation} => pg_collation {oid}
+NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {relnamespace} => pg_namespace {oid}
NOTICE: checking pg_class {reltype} => pg_type {oid}
NOTICE: checking pg_class {reloftype} => pg_type {oid}
NOTICE: checking pg_class {relowner} => pg_authid {oid}
-NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {reltablespace} => pg_tablespace {oid}
NOTICE: checking pg_class {reltoastrelid} => pg_class {oid}
NOTICE: checking pg_class {relrewrite} => pg_class {oid}
diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql
index e7013f5..c9622f6 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -1478,9 +1478,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1499,9 +1498,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1641,7 +1639,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/sql/fast_default.sql b/src/test/regress/sql/fast_default.sql
index 16a3b7c..819ec40 100644
--- a/src/test/regress/sql/fast_default.sql
+++ b/src/test/regress/sql/fast_default.sql
@@ -4,8 +4,8 @@
SET search_path = fast_default;
CREATE SCHEMA fast_default;
-CREATE TABLE m(id OID);
-INSERT INTO m VALUES (NULL::OID);
+CREATE TABLE m(id BIGINT);
+INSERT INTO m VALUES (NULL::BIGINT);
CREATE FUNCTION set(tabname name) RETURNS VOID
AS $$
--
1.8.3.1
/*
 * If relfilenumber is unspecified by the caller then create storage
- * with oid same as relid.
+ * with relfilenumber same as relid if it is a system table otherwise
+ * allocate a new relfilenumber. For more details read comments atop
+ * FirstNormalRelFileNumber declaration.
 */
 if (!RelFileNumberIsValid(relfilenumber))
-     relfilenumber = relid;
+ {
+     relfilenumber = relid < FirstNormalObjectId ?
+         relid : GetNewRelFileNumber();
Above code says that in the case of system table we want relfilenode to be
the same as object id. This technically means that the relfilenode or oid
for the system tables would not be exceeding 16383. However in the below
lines of code added in the patch, it says there is some chance for the
storage path of the user tables from the old cluster conflicting with the
storage path of the system tables in the new cluster. Assuming that the
OIDs for the user tables on the old cluster would start with 16384 (the
first object ID), I see no reason why there would be a conflict.
+/* ----------
+ * RelFileNumber zero is InvalidRelFileNumber.
+ *
+ * For the system tables (OID < FirstNormalObjectId) the initial storage
+ * will be created with the relfilenumber same as their oid. And, later for
+ * any storage the relfilenumber allocated by GetNewRelFileNumber() will start
+ * at 100000. Thus, when upgrading from an older cluster, the relation storage
+ * path for the user table from the old cluster will not conflict with the
+ * relation storage path for the system table from the new cluster. Anyway,
+ * the new cluster must not have any user tables while upgrading, so we needn't
+ * worry about them.
+ * ----------
+ */
+#define FirstNormalRelFileNumber ((RelFileNumber) 100000)
==
When WAL logging the next object id, we have chosen the xlog threshold
value 8192, whereas for relfilenumber it is 512. Any reason for choosing
this lower value in the relfilenumber case?
--
With Regards,
Ashutosh Sharma.
On Tue, Jul 26, 2022 at 1:32 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Thu, Jul 21, 2022 at 9:53 AM Thomas Munro <thomas.munro@gmail.com> wrote:
On Wed, Jul 20, 2022 at 11:27 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
[v10 patch set]
Hi Dilip, I'm experimenting with these patches and will hopefully have
more to say soon, but I just wanted to point out that this builds with
warnings and failed on 3/4 of the CI OSes on cfbot's last run. Maybe
there is the good kind of uninitialised data on Linux, and the bad
kind of uninitialised data on those other pesky systems?

Here is the patch to fix the issue: while asserting for the file
existence it was not setting the relfilenumber in the relfilelocator
before generating the path, so it was just checking for the existence
of a random path and therefore asserting randomly.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, Jul 26, 2022 at 6:06 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
Hi,
Note: please avoid top posting.
/*
 * If relfilenumber is unspecified by the caller then create storage
- * with oid same as relid.
+ * with relfilenumber same as relid if it is a system table otherwise
+ * allocate a new relfilenumber. For more details read comments atop
+ * FirstNormalRelFileNumber declaration.
 */
 if (!RelFileNumberIsValid(relfilenumber))
-     relfilenumber = relid;
+ {
+     relfilenumber = relid < FirstNormalObjectId ?
+         relid : GetNewRelFileNumber();

Above code says that in the case of system table we want relfilenode to
be the same as object id. This technically means that the relfilenode or
oid for the system tables would not be exceeding 16383. However in the
below lines of code added in the patch, it says there is some chance for
the storage path of the user tables from the old cluster conflicting with
the storage path of the system tables in the new cluster. Assuming that
the OIDs for the user tables on the old cluster would start with 16384
(the first normal object ID), I see no reason why there would be a
conflict.
Basically, the above comment says that the initial system table storage
will be created with the same relfilenumber as the Oid, so you are right
that it will not exceed 16383. The code below explains the reason: we do
it this way in order to avoid conflicts with user tables from an older
cluster. Otherwise, in the new design, we have no intention of keeping
the relfilenode the same as the Oid. But an older cluster, which does not
follow this new design, might have user-table relfilenodes that conflict
with system tables in the new cluster, so with the new design we must
still ensure that, when creating the initial cluster, system-table
relfilenodes stay in a low range. Directly using the Oid is the best way
to do that, instead of defining a completely new range and maintaining a
separate counter for it.
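The rule being discussed can be condensed into a small sketch. The constants mirror FirstNormalObjectId (16384) and FirstNormalRelFileNumber (100000) from the quoted patch, but the functions are simplified stand-ins, not the actual PostgreSQL code:

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the quoted patch discussion. */
#define FIRST_NORMAL_OBJECT_ID      16384     /* FirstNormalObjectId */
#define FIRST_NORMAL_RELFILENUMBER  100000    /* FirstNormalRelFileNumber */

static uint64_t next_relfilenumber = FIRST_NORMAL_RELFILENUMBER;

/* Simplified GetNewRelFileNumber(): a monotonically increasing counter. */
static uint64_t
get_new_relfilenumber(void)
{
    return next_relfilenumber++;
}

/*
 * The allocation rule: system catalogs (OID below FirstNormalObjectId)
 * keep relfilenumber == OID; everything else gets a fresh number that
 * starts at 100000, well above any bootstrap-assigned value.
 */
static uint64_t
choose_relfilenumber(uint64_t relid)
{
    return relid < FIRST_NORMAL_OBJECT_ID ? relid : get_new_relfilenumber();
}
```

Since old-cluster user relfilenodes start at 16384 and new-cluster system tables stay below 16384 while fresh allocations start at 100000, the two ranges in the new cluster never meet.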
+/* ----------
+ * RelFileNumber zero is InvalidRelFileNumber.
+ *
+ * For the system tables (OID < FirstNormalObjectId) the initial storage
+ * will be created with the relfilenumber same as their oid. And, later for
+ * any storage the relfilenumber allocated by GetNewRelFileNumber() will start
+ * at 100000. Thus, when upgrading from an older cluster, the relation storage
+ * path for the user table from the old cluster will not conflict with the
+ * relation storage path for the system table from the new cluster. Anyway,
+ * the new cluster must not have any user tables while upgrading, so we needn't
+ * worry about them.
+ * ----------
+ */
+#define FirstNormalRelFileNumber ((RelFileNumber) 100000)

==
When WAL logging the next object id, we have chosen the xlog threshold
value 8192, whereas for relfilenumber it is 512. Any reason for choosing
this lower value in the relfilenumber case?
For Oid, when we cross the max value we wrap around, whereas for
relfilenumber we do not expect a wraparound within the cluster's
lifetime. So it is better not to log forward a really large number of
relfilenumbers as we do for Oid. OTOH if we make it really low, like 64,
then we can see RelFileNumberGenLock wait events under very high
concurrency, e.g. 32 backends continuously creating/dropping tables. So
we chose 512: it is not so low that it creates lock contention, and not
so high that we need to worry about wasting that many relfilenumbers on
a crash.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
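The chunked-logging trade-off described above (a larger chunk means fewer WAL records but more numbers wasted after a crash) can be modelled with a toy counter. The names and the recovery behaviour here are illustrative assumptions, not the patch's real implementation:

```c
#include <assert.h>
#include <stdint.h>

#define VAR_RELNUMBER_PER_XLOG 512   /* the 512 threshold discussed above */

/* Toy model: in-memory next counter plus the value durably "logged". */
static uint64_t next_relnumber = 100000;
static uint64_t logged_relnumber = 100000;

/* Allocate one number, "WAL logging" a new chunk only when needed. */
static uint64_t
alloc_relnumber(void)
{
    if (next_relnumber >= logged_relnumber)
        logged_relnumber = next_relnumber + VAR_RELNUMBER_PER_XLOG;
    return next_relnumber++;
}

/*
 * After a crash, only the logged value survives, so allocation restarts
 * from there: at most one chunk of numbers is skipped, never reused.
 */
static uint64_t
crash_recover(void)
{
    next_relnumber = logged_relnumber;
    return next_relnumber;
}
```

One WAL record per 512 allocations keeps RelFileNumberGenLock traffic low, while a crash skips at most 512 numbers out of a 2^56 space.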
Thanks Dilip. Here are few comments that could find upon quickly reviewing
the v11 patch:
/*
+ * Similar to the XLogPutNextOid but instead of writing NEXTOID log record it
+ * writes a NEXT_RELFILENUMBER log record. If '*prevrecptr' is a valid
+ * XLogRecPtrthen flush the wal upto this record pointer otherwise flush upto

XLogRecPtrthen -> XLogRecPtr then
==
+ switch (relpersistence)
+ {
+ case RELPERSISTENCE_TEMP:
+ backend = BackendIdForTempRelations();
+ break;
+ case RELPERSISTENCE_UNLOGGED:
+ case RELPERSISTENCE_PERMANENT:
+ backend = InvalidBackendId;
+ break;
+ default:
+ elog(ERROR, "invalid relpersistence: %c", relpersistence);
+ return InvalidRelFileNumber; /* placate compiler */
+ }
I think the above check should be added at the beginning of the function
for the reason that if we come to the default switch case we won't be
acquiring the lwlock and do other stuff to get a new relfilenumber.
==
- newrelfilenumber = GetNewRelFileNumber(newTableSpace, NULL,
+ * Generate a new relfilenumber. We cannot reuse the old relfilenumber
+ * because of the possibility that that relation will be moved back to the

that that relation -> that relation.
==
+ * option_parse_relfilenumber
+ *
+ * Parse relfilenumber value for an option. If the parsing is successful,
+ * returns; if parsing fails, returns false.
+ */
If parsing is successful, returns true;
--
With Regards,
Ashutosh Sharma.
On Tue, Jul 26, 2022 at 2:07 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
I have thought about it while doing so but I am not sure whether it is
a good idea or not, because before my change these all were macros
with 2 naming conventions so I just changed to inline function so why
to change the name.
Well, the reason to change the name would be for consistency. It feels
weird to have some NAMES_LIKETHIS() and other NamesLikeThis().
Now, an argument against that is that it will make back-patching more
annoying, if any code using these functions/macros is touched. But
since the calling sequence is changing anyway (you now have to pass a
pointer rather than the object itself) that argument doesn't really
carry any weight. So I would favor ClearBufferTag(), InitBufferTag(),
etc.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Tue, Jul 12, 2022 at 4:35 PM Robert Haas <robertmhaas@gmail.com> wrote:
Very minor nitpick: To me REPLACE would be a bit more accurate than RENAME,
since it includes fsync etc?

Sure, I had it that way for a while and changed it at the last minute.
I can change it back.
Committed that way, also with the fix for the typo Dilip found.
--
Robert Haas
EDB: http://www.enterprisedb.com
Some more comments:
==
Shouldn't we retry for a new relfilenumber if
"ShmemVariableCache->nextRelFileNumber > MAX_RELFILENUMBER"? There can be
cases where some of the tables are dropped by the user and the
relfilenumbers of those tables can be reused, for which we would need to
find a relfilenumber that can be reused. For e.g. consider the below
example:
postgres=# create table t1(a int);
CREATE TABLE
postgres=# create table t2(a int);
CREATE TABLE
postgres=# create table t3(a int);
ERROR: relfilenumber is out of bound
postgres=# drop table t1, t2;
DROP TABLE
postgres=# checkpoint;
CHECKPOINT
postgres=# vacuum;
VACUUM
Now if I try to recreate table t3, it should succeed, shouldn't it? But it
doesn't because we simply error out by seeing the nextRelFileNumber saved
in the shared memory.
postgres=# create table t1(a int);
ERROR: relfilenumber is out of bound
I think, above should have worked.
==
<caution>
<para>
Note that while a table's filenode often matches its OID, this is
<emphasis>not</emphasis> necessarily the case; some operations, like
<command>TRUNCATE</command>, <command>REINDEX</command>,
<command>CLUSTER</command> and some forms
of <command>ALTER TABLE</command>, can change the filenode while preserving
the OID.
I think this note needs some improvement in storage.sgml. It says the
table's relfilenode often matches its OID, but with this change it won't.
That will now be true only for system tables, and essentially never for
user tables.
==
postgres=# create table t2(a int);
ERROR: relfilenumber is out of bound
Since this is a user-visible error, I think it would be good to mention
relfilenode instead of relfilenumber. Elsewhere (including the user manual)
we refer to this as a relfilenode.
--
With Regards,
Ashutosh Sharma.
On Wed, Jul 27, 2022 at 1:24 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
Some more comments:
Note: Please don't top post.
==
Shouldn't we retry for the new relfilenumber if
"ShmemVariableCache->nextRelFileNumber > MAX_RELFILENUMBER". There can be
cases where some of the tables are dropped by the user and the
relfilenumbers of those tables can be reused, for which we would need to
find a relfilenumber that can be reused. For e.g. consider below example:

postgres=# create table t1(a int);
CREATE TABLE
postgres=# create table t2(a int);
CREATE TABLE
postgres=# create table t3(a int);
ERROR: relfilenumber is out of bound
postgres=# drop table t1, t2;
DROP TABLE
postgres=# checkpoint;
CHECKPOINT
postgres=# vacuum;
VACUUM

Now if I try to recreate table t3, it should succeed, shouldn't it? But it
doesn't because we simply error out by seeing the nextRelFileNumber saved
in the shared memory.

postgres=# create table t1(a int);
ERROR: relfilenumber is out of bound

I think, above should have worked.
No, it should not; the whole point of this design is not to reuse a
relfilenumber ever within a cluster's lifetime. You might want to read
this mail [1], which argues that by the time we use 2^56 relfilenumbers
the cluster will anyway reach its lifetime by other factors.
[1]: /messages/by-id/CA+hUKG+ZrDms7gSjckme8YV2tzxgZ0KVfGcsjaFoKyzQX_f_Mw@mail.gmail.com
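As a rough back-of-envelope check of that lifetime argument (my own arithmetic, not taken from the thread): even at an implausibly high allocation rate, a 56-bit space is effectively inexhaustible:

```c
#include <assert.h>
#include <stdint.h>

/*
 * How many years until a 2^56 relfilenumber space runs out at a given
 * sustained allocation rate? Integer arithmetic, rounding down.
 */
static uint64_t
years_to_exhaust(uint64_t allocs_per_second)
{
    const uint64_t space = UINT64_C(1) << 56;      /* 2^56 relfilenumbers */
    const uint64_t seconds_per_year = UINT64_C(31536000);

    return space / allocs_per_second / seconds_per_year;
}
```

At a million allocations per second, sustained forever, exhaustion takes over two thousand years, so never reusing a number is safe in practice.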
==
<caution>
<para>
Note that while a table's filenode often matches its OID, this is
<emphasis>not</emphasis> necessarily the case; some operations, like
<command>TRUNCATE</command>, <command>REINDEX</command>, <command>CLUSTER</command> and some forms
of <command>ALTER TABLE</command>, can change the filenode while preserving the OID.

I think this note needs some improvement in storage.sgml. It says the
table's relfilenode mostly matches its OID, but it doesn't. This will
happen only in case of system tables and maybe never for user tables.
Yes, this should be changed.
postgres=# create table t2(a int);
ERROR: relfilenumber is out of bound

Since this is a user-visible error, I think it would be good to mention
relfilenode instead of relfilenumber. Elsewhere (including the user
manual) we refer to this as a relfilenode.
No, this is expected to be an internal error because, during the cluster
lifetime, we should ideally never reach this number. We are putting in
this check so that we do not reach this number due to some other
computational/programming mistake.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, Jul 26, 2022 at 1:32 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
Thanks for the updated patch, Few comments:
1) The format specifier should be changed from %u to INT64_FORMAT
autoprewarm.c -> apw_load_buffers
...............
if (fscanf(file, "%u,%u,%u,%u,%u\n", &blkinfo[i].database,
&blkinfo[i].tablespace, &blkinfo[i].filenumber,
&forknum, &blkinfo[i].blocknum) != 5)
...............
2) The format specifier should be changed from %u to INT64_FORMAT
autoprewarm.c -> apw_dump_now
...............
ret = fprintf(file, "%u,%u,%u,%u,%u\n",
block_info_array[i].database,
block_info_array[i].tablespace,
block_info_array[i].filenumber,
(uint32) block_info_array[i].forknum,
block_info_array[i].blocknum);
...............
3) should the comment "entry point for old extension version" be on
top of pg_buffercache_pages, as the current version will use
pg_buffercache_pages_v1_4
+
+Datum
+pg_buffercache_pages(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, OIDOID);
+}
+
+/* entry point for old extension version */
+Datum
+pg_buffercache_pages_v1_4(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, INT8OID);
+}
4) we could use the new style or ereport by removing the brackets
around errcode:
+ if (fctx->record[i].relfilenumber > OID_MAX)
+     ereport(ERROR,
+             (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+              errmsg("relfilenode" INT64_FORMAT " is too large to be represented as an OID",
+                     fctx->record[i].relfilenumber),
+              errhint("Upgrade the extension using ALTER EXTENSION pg_buffercache UPDATE")));

like:

ereport(ERROR,
        errcode(ERRCODE_INVALID_PARAMETER_VALUE),
        errmsg("relfilenode" INT64_FORMAT " is too large to be represented as an OID",
               fctx->record[i].relfilenumber),
        errhint("Upgrade the extension using ALTER EXTENSION pg_buffercache UPDATE"));
5) Similarly in the below code too:
+ /* check whether the relfilenumber is within a valid range */
+ if ((relfilenumber) < 0 || (relfilenumber) > MAX_RELFILENUMBER)
+     ereport(ERROR,
+             (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+              errmsg("relfilenumber " INT64_FORMAT " is out of range",
+                     (relfilenumber))));
6) Similarly in the below code too:
+#define CHECK_RELFILENUMBER_RANGE(relfilenumber) \
+do { \
+   if ((relfilenumber) < 0 || (relfilenumber) > MAX_RELFILENUMBER) \
+       ereport(ERROR, \
+               (errcode(ERRCODE_INVALID_PARAMETER_VALUE), \
+                errmsg("relfilenumber " INT64_FORMAT " is out of range", \
+                       (relfilenumber)))); \
+} while (0)
+
+
7) This error code looks similar to CHECK_RELFILENUMBER_RANGE, can
this macro be used here too:
pg_filenode_relation(PG_FUNCTION_ARGS)
{
Oid reltablespace = PG_GETARG_OID(0);
- RelFileNumber relfilenumber = PG_GETARG_OID(1);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(1);
Oid heaprel;
+ /* check whether the relfilenumber is within a valid range */
+ if ((relfilenumber) < 0 || (relfilenumber) > MAX_RELFILENUMBER)
+     ereport(ERROR,
+             (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+              errmsg("relfilenumber " INT64_FORMAT " is out of range",
+                     (relfilenumber))));
8) I felt this include is not required:
diff --git a/src/backend/access/transam/varsup.c
b/src/backend/access/transam/varsup.c
index 849a7ce..a2f0d35 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -13,12 +13,16 @@
#include "postgres.h"
+#include <unistd.h>
+
#include "access/clog.h"
#include "access/commit_ts.h"
9) should we change elog to ereport to use the New-style error reporting API
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot assign RelFileNumber during recovery");
+
+ if (IsBinaryUpgrade)
+ elog(ERROR, "cannot assign RelFileNumber during binary
upgrade");
10) Here nextRelFileNumber is protected by RelFileNumberGenLock, the
comment stated OidGenLock. It should be slightly adjusted.
typedef struct VariableCacheData
{
/*
* These fields are protected by OidGenLock.
*/
Oid nextOid; /* next OID to assign */
uint32 oidCount; /* OIDs available before must do XLOG work */
RelFileNumber nextRelFileNumber; /* next relfilenumber to assign */
RelFileNumber loggedRelFileNumber; /* last logged relfilenumber */
XLogRecPtr loggedRelFileNumberRecPtr; /* xlog record pointer w.r.t.
* loggedRelFileNumber */
Regards,
Vignesh
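Vignesh's items 1 and 2 above boil down to using 64-bit format specifiers on both the fprintf and fscanf sides. A minimal, self-contained illustration using the portable <inttypes.h> macros (PostgreSQL itself spells the printf-side macro INT64_FORMAT; the helper name below is made up for illustration):

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/*
 * Round-trip a 64-bit filenumber through text. "%u" would silently
 * truncate values above 2^32 on platforms where unsigned int is 32 bits;
 * the PRIu64/SCNu64 macros expand to the correct specifier everywhere.
 */
static int
roundtrip_filenumber(uint64_t in, uint64_t *out)
{
    char buf[64];

    snprintf(buf, sizeof(buf), "%" PRIu64, in);
    return sscanf(buf, "%" SCNu64, out) == 1;
}
```

The same pattern applies to the autoprewarm dump file: the filenumber field needs the 64-bit specifier while the other fields can stay %u.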
On Wed, Jul 27, 2022 at 3:27 PM vignesh C <vignesh21@gmail.com> wrote:
Thanks for the review. I have fixed these, except:
9) Should we change elog to ereport to use the new-style error reporting API?
These are internal errors, and with ereport we would need to supply an
error code and so on; I think that is not necessary for internal errors.
8) I felt this include is not required:
It is using the access() API, so we do need <unistd.h>.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachments:
v12-0001-Convert-buf_internal.h-macros-to-static-inline-f.patchtext/x-patch; charset=US-ASCII; name=v12-0001-Convert-buf_internal.h-macros-to-static-inline-f.patchDownload
From d57b3733bd3541fd13dc324bc7ca4d5962544066 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Tue, 12 Jul 2022 17:10:04 +0530
Subject: [PATCH v12 1/3] Convert buf_internal.h macros to static inline
functions
Readability-wise, inline functions are better than macros, and this will
also help us write cleaner, more readable code for 64-bit relfilenodes:
as part of that patch we will have to do some complex bitwise operations,
and doing that inside inline functions will be cleaner.
---
src/backend/storage/buffer/buf_init.c | 2 +-
src/backend/storage/buffer/bufmgr.c | 16 ++--
src/backend/storage/buffer/localbuf.c | 12 +--
src/include/storage/buf_internals.h | 156 +++++++++++++++++++++-------------
4 files changed, 111 insertions(+), 75 deletions(-)
diff --git a/src/backend/storage/buffer/buf_init.c b/src/backend/storage/buffer/buf_init.c
index 2862e9e..55f646d 100644
--- a/src/backend/storage/buffer/buf_init.c
+++ b/src/backend/storage/buffer/buf_init.c
@@ -116,7 +116,7 @@ InitBufferPool(void)
{
BufferDesc *buf = GetBufferDescriptor(i);
- CLEAR_BUFFERTAG(buf->tag);
+ CLEAR_BUFFERTAG(&buf->tag);
pg_atomic_init_u32(&buf->state, 0);
buf->wait_backend_pgprocno = INVALID_PGPROCNO;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index b7488b5..4a8e294 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -515,7 +515,7 @@ PrefetchSharedBuffer(SMgrRelation smgr_reln,
Assert(BlockNumberIsValid(blockNum));
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr_reln->smgr_rlocator.locator,
+ INIT_BUFFERTAG(&newTag, &smgr_reln->smgr_rlocator.locator,
forkNum, blockNum);
/* determine its hash code and partition lock ID */
@@ -632,7 +632,7 @@ ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockN
ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
ReservePrivateRefCountEntry();
- INIT_BUFFERTAG(tag, rlocator, forkNum, blockNum);
+ INIT_BUFFERTAG(&tag, &rlocator, forkNum, blockNum);
if (BufferIsLocal(recent_buffer))
{
@@ -642,7 +642,7 @@ ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockN
buf_state = pg_atomic_read_u32(&bufHdr->state);
/* Is it still valid and holding the right tag? */
- if ((buf_state & BM_VALID) && BUFFERTAGS_EQUAL(tag, bufHdr->tag))
+ if ((buf_state & BM_VALID) && BUFFERTAGS_EQUAL(&tag, &bufHdr->tag))
{
/*
* Bump buffer's ref and usage counts. This is equivalent of
@@ -679,7 +679,7 @@ ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockN
else
buf_state = LockBufHdr(bufHdr);
- if ((buf_state & BM_VALID) && BUFFERTAGS_EQUAL(tag, bufHdr->tag))
+ if ((buf_state & BM_VALID) && BUFFERTAGS_EQUAL(&tag, &bufHdr->tag))
{
/*
* It's now safe to pin the buffer. We can't pin first and ask
@@ -1134,7 +1134,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
+ INIT_BUFFERTAG(&newTag, &smgr->smgr_rlocator.locator, forkNum, blockNum);
/* determine its hash code and partition lock ID */
newHash = BufTableHashCode(&newTag);
@@ -1517,7 +1517,7 @@ retry:
buf_state = LockBufHdr(buf);
/* If it's changed while we were waiting for lock, do nothing */
- if (!BUFFERTAGS_EQUAL(buf->tag, oldTag))
+ if (!BUFFERTAGS_EQUAL(&buf->tag, &oldTag))
{
UnlockBufHdr(buf, buf_state);
LWLockRelease(oldPartitionLock);
@@ -1549,7 +1549,7 @@ retry:
* linear scans of the buffer array don't think the buffer is valid.
*/
oldFlags = buf_state & BUF_FLAG_MASK;
- CLEAR_BUFFERTAG(buf->tag);
+ CLEAR_BUFFERTAG(&buf->tag);
buf_state &= ~(BUF_FLAG_MASK | BUF_USAGECOUNT_MASK);
UnlockBufHdr(buf, buf_state);
@@ -3365,7 +3365,7 @@ FindAndDropRelationBuffers(RelFileLocator rlocator, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(bufTag, rlocator, forkNum, curBlock);
+ INIT_BUFFERTAG(&bufTag, &rlocator, forkNum, curBlock);
/* determine its hash code and partition lock ID */
bufHash = BufTableHashCode(&bufTag);
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index 9c03885..91e174e 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -68,7 +68,7 @@ PrefetchLocalBuffer(SMgrRelation smgr, ForkNumber forkNum,
BufferTag newTag; /* identity of requested block */
LocalBufferLookupEnt *hresult;
- INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
+ INIT_BUFFERTAG(&newTag, &smgr->smgr_rlocator.locator, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -117,7 +117,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
bool found;
uint32 buf_state;
- INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
+ INIT_BUFFERTAG(&newTag, &smgr->smgr_rlocator.locator, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -131,7 +131,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
{
b = hresult->id;
bufHdr = GetLocalBufferDescriptor(b);
- Assert(BUFFERTAGS_EQUAL(bufHdr->tag, newTag));
+ Assert(BUFFERTAGS_EQUAL(&bufHdr->tag, &newTag));
#ifdef LBDEBUG
fprintf(stderr, "LB ALLOC (%u,%d,%d) %d\n",
smgr->smgr_rlocator.locator.relNumber, forkNum, blockNum, -b - 1);
@@ -253,7 +253,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
if (!hresult) /* shouldn't happen */
elog(ERROR, "local buffer hash table corrupted");
/* mark buffer invalid just in case hash insert fails */
- CLEAR_BUFFERTAG(bufHdr->tag);
+ CLEAR_BUFFERTAG(&bufHdr->tag);
buf_state &= ~(BM_VALID | BM_TAG_VALID);
pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
}
@@ -354,7 +354,7 @@ DropRelationLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
if (!hresult) /* shouldn't happen */
elog(ERROR, "local buffer hash table corrupted");
/* Mark buffer invalid */
- CLEAR_BUFFERTAG(bufHdr->tag);
+ CLEAR_BUFFERTAG(&bufHdr->tag);
buf_state &= ~BUF_FLAG_MASK;
buf_state &= ~BUF_USAGECOUNT_MASK;
pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
@@ -398,7 +398,7 @@ DropRelationAllLocalBuffers(RelFileLocator rlocator)
if (!hresult) /* shouldn't happen */
elog(ERROR, "local buffer hash table corrupted");
/* Mark buffer invalid */
- CLEAR_BUFFERTAG(bufHdr->tag);
+ CLEAR_BUFFERTAG(&bufHdr->tag);
buf_state &= ~BUF_FLAG_MASK;
buf_state &= ~BUF_USAGECOUNT_MASK;
pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 69e4590..2f0e60e 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -95,28 +95,32 @@ typedef struct buftag
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
-#define CLEAR_BUFFERTAG(a) \
-( \
- (a).rlocator.spcOid = InvalidOid, \
- (a).rlocator.dbOid = InvalidOid, \
- (a).rlocator.relNumber = InvalidRelFileNumber, \
- (a).forkNum = InvalidForkNumber, \
- (a).blockNum = InvalidBlockNumber \
-)
-
-#define INIT_BUFFERTAG(a,xx_rlocator,xx_forkNum,xx_blockNum) \
-( \
- (a).rlocator = (xx_rlocator), \
- (a).forkNum = (xx_forkNum), \
- (a).blockNum = (xx_blockNum) \
-)
-
-#define BUFFERTAGS_EQUAL(a,b) \
-( \
- RelFileLocatorEquals((a).rlocator, (b).rlocator) && \
- (a).blockNum == (b).blockNum && \
- (a).forkNum == (b).forkNum \
-)
+static inline void
+CLEAR_BUFFERTAG(BufferTag *tag)
+{
+ tag->rlocator.spcOid = InvalidOid;
+ tag->rlocator.dbOid = InvalidOid;
+ tag->rlocator.relNumber = InvalidRelFileNumber;
+ tag->forkNum = InvalidForkNumber;
+ tag->blockNum = InvalidBlockNumber;
+}
+
+static inline void
+INIT_BUFFERTAG(BufferTag *tag, const RelFileLocator *rlocator,
+ ForkNumber forkNum, BlockNumber blockNum)
+{
+ tag->rlocator = *rlocator;
+ tag->forkNum = forkNum;
+ tag->blockNum = blockNum;
+}
+
+static inline bool
+BUFFERTAGS_EQUAL(const BufferTag *tag1, const BufferTag *tag2)
+{
+ return RelFileLocatorEquals(tag1->rlocator, tag2->rlocator) &&
+ (tag1->blockNum == tag2->blockNum) &&
+ (tag1->forkNum == tag2->forkNum);
+}
/*
* The shared buffer mapping table is partitioned to reduce contention.
@@ -124,13 +128,24 @@ typedef struct buftag
* hash code with BufTableHashCode(), then apply BufMappingPartitionLock().
* NB: NUM_BUFFER_PARTITIONS must be a power of 2!
*/
-#define BufTableHashPartition(hashcode) \
- ((hashcode) % NUM_BUFFER_PARTITIONS)
-#define BufMappingPartitionLock(hashcode) \
- (&MainLWLockArray[BUFFER_MAPPING_LWLOCK_OFFSET + \
- BufTableHashPartition(hashcode)].lock)
-#define BufMappingPartitionLockByIndex(i) \
- (&MainLWLockArray[BUFFER_MAPPING_LWLOCK_OFFSET + (i)].lock)
+static inline uint32
+BufTableHashPartition(uint32 hashcode)
+{
+ return hashcode % NUM_BUFFER_PARTITIONS;
+}
+
+static inline LWLock *
+BufMappingPartitionLock(uint32 hashcode)
+{
+ return &MainLWLockArray[BUFFER_MAPPING_LWLOCK_OFFSET +
+ BufTableHashPartition(hashcode)].lock;
+}
+
+static inline LWLock *
+BufMappingPartitionLockByIndex(uint32 index)
+{
+ return &MainLWLockArray[BUFFER_MAPPING_LWLOCK_OFFSET + index].lock;
+}
/*
* BufferDesc -- shared descriptor/state data for a single shared buffer.
@@ -220,37 +235,6 @@ typedef union BufferDescPadded
char pad[BUFFERDESC_PAD_TO_SIZE];
} BufferDescPadded;
-#define GetBufferDescriptor(id) (&BufferDescriptors[(id)].bufferdesc)
-#define GetLocalBufferDescriptor(id) (&LocalBufferDescriptors[(id)])
-
-#define BufferDescriptorGetBuffer(bdesc) ((bdesc)->buf_id + 1)
-
-#define BufferDescriptorGetIOCV(bdesc) \
- (&(BufferIOCVArray[(bdesc)->buf_id]).cv)
-#define BufferDescriptorGetContentLock(bdesc) \
- ((LWLock*) (&(bdesc)->content_lock))
-
-extern PGDLLIMPORT ConditionVariableMinimallyPadded *BufferIOCVArray;
-
-/*
- * The freeNext field is either the index of the next freelist entry,
- * or one of these special values:
- */
-#define FREENEXT_END_OF_LIST (-1)
-#define FREENEXT_NOT_IN_LIST (-2)
-
-/*
- * Functions for acquiring/releasing a shared buffer header's spinlock. Do
- * not apply these to local buffers!
- */
-extern uint32 LockBufHdr(BufferDesc *desc);
-#define UnlockBufHdr(desc, s) \
- do { \
- pg_write_barrier(); \
- pg_atomic_write_u32(&(desc)->state, (s) & (~BM_LOCKED)); \
- } while (0)
-
-
/*
* The PendingWriteback & WritebackContext structure are used to keep
* information about pending flush requests to be issued to the OS.
@@ -276,11 +260,63 @@ typedef struct WritebackContext
/* in buf_init.c */
extern PGDLLIMPORT BufferDescPadded *BufferDescriptors;
+extern PGDLLIMPORT ConditionVariableMinimallyPadded *BufferIOCVArray;
extern PGDLLIMPORT WritebackContext BackendWritebackContext;
/* in localbuf.c */
extern PGDLLIMPORT BufferDesc *LocalBufferDescriptors;
+
+static inline BufferDesc *
+GetBufferDescriptor(uint32 id)
+{
+ return &(BufferDescriptors[id]).bufferdesc;
+}
+
+static inline BufferDesc *
+GetLocalBufferDescriptor(uint32 id)
+{
+ return &LocalBufferDescriptors[id];
+}
+
+static inline Buffer
+BufferDescriptorGetBuffer(const BufferDesc *bdesc)
+{
+ return (Buffer) (bdesc->buf_id + 1);
+}
+
+static inline ConditionVariable *
+BufferDescriptorGetIOCV(const BufferDesc *bdesc)
+{
+ return &(BufferIOCVArray[bdesc->buf_id]).cv;
+}
+
+static inline LWLock *
+BufferDescriptorGetContentLock(const BufferDesc *bdesc)
+{
+ return (LWLock *) (&bdesc->content_lock);
+}
+
+/*
+ * The freeNext field is either the index of the next freelist entry,
+ * or one of these special values:
+ */
+#define FREENEXT_END_OF_LIST (-1)
+#define FREENEXT_NOT_IN_LIST (-2)
+
+/*
+ * Functions for acquiring/releasing a shared buffer header's spinlock. Do
+ * not apply these to local buffers!
+ */
+extern uint32 LockBufHdr(BufferDesc *desc);
+
+static inline void
+UnlockBufHdr(BufferDesc *desc, uint32 buf_state)
+{
+ pg_write_barrier();
+ pg_atomic_write_u32(&desc->state, buf_state & (~BM_LOCKED));
+}
+
/* in bufmgr.c */
/*
--
1.8.3.1
v12-0002-Preliminary-refactoring-for-supporting-larger-re.patchtext/x-patch; charset=US-ASCII; name=v12-0002-Preliminary-refactoring-for-supporting-larger-re.patchDownload
From a1e106c4da4720a2174adb3b4d1702ac02d3c0e3 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Wed, 13 Jul 2022 13:12:53 +0530
Subject: [PATCH v12 2/3] Preliminary refactoring for supporting larger
relfilenumber
Currently, relfilenumber is of Oid type and can wrap around, so as part of
the larger patch set we are making it 64 bits to avoid wraparound; that
will also make a couple of other things simpler, as explained in the next
patches.
This is a preliminary refactoring patch for that work: in BufferTag,
instead of embedding a RelFileLocator, we keep the tablespace Oid,
database Oid, and relfilenumber directly, so that once relNumber in
RelFileLocator is widened to 64 bits the buffer tag's alignment padding
will not change.
---
contrib/pg_buffercache/pg_buffercache_pages.c | 8 +-
contrib/pg_prewarm/autoprewarm.c | 10 +-
src/backend/storage/buffer/bufmgr.c | 140 +++++++++++++++++---------
src/backend/storage/buffer/localbuf.c | 30 ++++--
src/include/storage/buf_internals.h | 64 ++++++++++--
5 files changed, 178 insertions(+), 74 deletions(-)
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 131bd62..c5754ea 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -153,10 +153,10 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenumber = bufHdr->tag.rlocator.relNumber;
- fctx->record[i].reltablespace = bufHdr->tag.rlocator.spcOid;
- fctx->record[i].reldatabase = bufHdr->tag.rlocator.dbOid;
- fctx->record[i].forknum = bufHdr->tag.forkNum;
+ fctx->record[i].relfilenumber = BufTagGetRelNumber(&bufHdr->tag);
+ fctx->record[i].reltablespace = bufHdr->tag.spcOid;
+ fctx->record[i].reldatabase = bufHdr->tag.dbOid;
+ fctx->record[i].forknum = BufTagGetForkNum(&bufHdr->tag);
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
fctx->record[i].pinning_backends = BUF_STATE_GET_REFCOUNT(buf_state);
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index b2d6026..63f0d41 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -630,10 +630,12 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
if (buf_state & BM_TAG_VALID &&
((buf_state & BM_PERMANENT) || dump_unlogged))
{
- block_info_array[num_blocks].database = bufHdr->tag.rlocator.dbOid;
- block_info_array[num_blocks].tablespace = bufHdr->tag.rlocator.spcOid;
- block_info_array[num_blocks].filenumber = bufHdr->tag.rlocator.relNumber;
- block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
+ block_info_array[num_blocks].database = bufHdr->tag.dbOid;
+ block_info_array[num_blocks].tablespace = bufHdr->tag.spcOid;
+ block_info_array[num_blocks].filenumber =
+ BufTagGetRelNumber(&bufHdr->tag);
+ block_info_array[num_blocks].forknum =
+ BufTagGetForkNum(&bufHdr->tag);
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
}
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 4a8e294..f872c34 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1657,8 +1657,8 @@ ReleaseAndReadBuffer(Buffer buffer,
{
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
- bufHdr->tag.forkNum == forkNum)
+ BufTagMatchesRelFileLocator(&bufHdr->tag, &relation->rd_locator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
LocalRefCount[-buffer - 1]--;
@@ -1668,8 +1668,8 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetBufferDescriptor(buffer - 1);
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
- bufHdr->tag.forkNum == forkNum)
+ BufTagMatchesRelFileLocator(&bufHdr->tag, &relation->rd_locator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
}
@@ -2010,9 +2010,9 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
- item->tsId = bufHdr->tag.rlocator.spcOid;
- item->relNumber = bufHdr->tag.rlocator.relNumber;
- item->forkNum = bufHdr->tag.forkNum;
+ item->tsId = bufHdr->tag.spcOid;
+ item->relNumber = BufTagGetRelNumber(&bufHdr->tag);
+ item->forkNum = BufTagGetForkNum(&bufHdr->tag);
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2702,6 +2702,7 @@ PrintBufferLeakWarning(Buffer buffer)
char *path;
BackendId backend;
uint32 buf_state;
+ RelFileLocator rlocator;
Assert(BufferIsValid(buffer));
if (BufferIsLocal(buffer))
@@ -2717,8 +2718,10 @@ PrintBufferLeakWarning(Buffer buffer)
backend = InvalidBackendId;
}
+ rlocator = BufTagGetRelFileLocator(&buf->tag);
+
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rlocator, backend, buf->tag.forkNum);
+ path = relpathbackend(rlocator, backend, BufTagGetForkNum(&buf->tag));
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2797,8 +2800,8 @@ BufferGetTag(Buffer buffer, RelFileLocator *rlocator, ForkNumber *forknum,
bufHdr = GetBufferDescriptor(buffer - 1);
/* pinned, so OK to read tag without spinlock */
- *rlocator = bufHdr->tag.rlocator;
- *forknum = bufHdr->tag.forkNum;
+ *rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
+ *forknum = BufTagGetForkNum(&bufHdr->tag);
*blknum = bufHdr->tag.blockNum;
}
@@ -2848,9 +2851,14 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rlocator, InvalidBackendId);
+ {
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&buf->tag);
+ reln = smgropen(rlocator, InvalidBackendId);
+ }
- TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
+ TRACE_POSTGRESQL_BUFFER_FLUSH_START(BufTagGetForkNum(&buf->tag),
buf->tag.blockNum,
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
@@ -2909,7 +2917,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
* bufToWrite is either the shared buffer or a copy, as appropriate.
*/
smgrwrite(reln,
- buf->tag.forkNum,
+ BufTagGetForkNum(&buf->tag),
buf->tag.blockNum,
bufToWrite,
false);
@@ -2930,7 +2938,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
*/
TerminateBufferIO(buf, true, 0);
- TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(buf->tag.forkNum,
+ TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(BufTagGetForkNum(&buf->tag),
buf->tag.blockNum,
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
@@ -3151,15 +3159,15 @@ DropRelationBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
* We could check forkNum and blockNum as well as the rlocator, but
* the incremental win from doing so seems small.
*/
- if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator))
+ if (!BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator.locator))
continue;
buf_state = LockBufHdr(bufHdr);
for (j = 0; j < nforks; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator) &&
- bufHdr->tag.forkNum == forkNum[j] &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator.locator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum[j] &&
bufHdr->tag.blockNum >= firstDelBlock[j])
{
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -3310,7 +3318,7 @@ DropRelationsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
for (j = 0; j < n; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, locators[j]))
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &locators[j]))
{
rlocator = &locators[j];
break;
@@ -3319,7 +3327,10 @@ DropRelationsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
}
else
{
- rlocator = bsearch((const void *) &(bufHdr->tag.rlocator),
+ RelFileLocator locator;
+
+ locator = BufTagGetRelFileLocator(&bufHdr->tag);
+ rlocator = bsearch((const void *) &(locator),
locators, n, sizeof(RelFileLocator),
rlocator_comparator);
}
@@ -3329,7 +3340,7 @@ DropRelationsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, (*rlocator)))
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, rlocator))
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3389,8 +3400,8 @@ FindAndDropRelationBuffers(RelFileLocator rlocator, ForkNumber forkNum,
*/
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
- bufHdr->tag.forkNum == forkNum &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
@@ -3428,11 +3439,11 @@ DropDatabaseBuffers(Oid dbid)
* As in DropRelationBuffers, an unlocked precheck should be
* safe and saves some cycles.
*/
- if (bufHdr->tag.rlocator.dbOid != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rlocator.dbOid == dbid)
+ if (bufHdr->tag.dbOid == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3456,13 +3467,16 @@ PrintBufferDescs(void)
{
BufferDesc *buf = GetBufferDescriptor(i);
Buffer b = BufferDescriptorGetBuffer(buf);
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&buf->tag);
/* theoretically we should lock the bufhdr here */
elog(LOG,
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathbackend(buf->tag.rlocator, InvalidBackendId, buf->tag.forkNum),
+ relpathbackend(rlocator, InvalidBackendId, BufTagGetForkNum(&buf->tag)),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3482,12 +3496,16 @@ PrintPinnedBufs(void)
if (GetPrivateRefCount(b) > 0)
{
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&buf->tag);
+
/* theoretically we should lock the bufhdr here */
elog(LOG,
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathperm(buf->tag.rlocator, buf->tag.forkNum),
+ relpathperm(rlocator, BufTagGetForkNum(&buf->tag)),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3526,7 +3544,7 @@ FlushRelationBuffers(Relation rel)
uint32 buf_state;
bufHdr = GetLocalBufferDescriptor(i);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rel->rd_locator) &&
((buf_state = pg_atomic_read_u32(&bufHdr->state)) &
(BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
@@ -3544,7 +3562,7 @@ FlushRelationBuffers(Relation rel)
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
smgrwrite(RelationGetSmgr(rel),
- bufHdr->tag.forkNum,
+ BufTagGetForkNum(&bufHdr->tag),
bufHdr->tag.blockNum,
localpage,
false);
@@ -3573,13 +3591,13 @@ FlushRelationBuffers(Relation rel)
* As in DropRelationBuffers, an unlocked precheck should be
* safe and saves some cycles.
*/
- if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator))
+ if (!BufTagMatchesRelFileLocator(&bufHdr->tag, &rel->rd_locator))
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rel->rd_locator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3653,7 +3671,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
for (j = 0; j < nrels; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, srels[j].rlocator))
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &srels[j].rlocator))
{
srelent = &srels[j];
break;
@@ -3662,7 +3680,10 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
}
else
{
- srelent = bsearch((const void *) &(bufHdr->tag.rlocator),
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
+ srelent = bsearch((const void *) &(rlocator),
srels, nrels, sizeof(SMgrSortArray),
rlocator_comparator);
}
@@ -3674,7 +3695,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, srelent->rlocator) &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &srelent->rlocator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3876,13 +3897,13 @@ FlushDatabaseBuffers(Oid dbid)
* As in DropRelationBuffers, an unlocked precheck should be
* safe and saves some cycles.
*/
- if (bufHdr->tag.rlocator.dbOid != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rlocator.dbOid == dbid &&
+ if (bufHdr->tag.dbOid == dbid &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -4042,6 +4063,10 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
if (XLogHintBitIsNeeded() &&
(pg_atomic_read_u32(&bufHdr->state) & BM_PERMANENT))
{
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
+
/*
* If we must not write WAL, due to a relfilelocator-specific
* condition or being in recovery, don't dirty the page. We can
@@ -4050,8 +4075,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
*
* See src/backend/storage/page/README for longer discussion.
*/
- if (RecoveryInProgress() ||
- RelFileLocatorSkippingWAL(bufHdr->tag.rlocator))
+ if (RecoveryInProgress() || RelFileLocatorSkippingWAL(rlocator))
return;
/*
@@ -4659,8 +4683,10 @@ AbortBufferIO(void)
{
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
+ RelFileLocator rlocator;
- path = relpathperm(buf->tag.rlocator, buf->tag.forkNum);
+ rlocator = BufTagGetRelFileLocator(&buf->tag);
+ path = relpathperm(rlocator, BufTagGetForkNum(&buf->tag));
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4684,7 +4710,11 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rlocator, bufHdr->tag.forkNum);
+ char *path;
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
+ path = relpathperm(rlocator, BufTagGetForkNum(&bufHdr->tag));
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4702,8 +4732,12 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
- char *path = relpathbackend(bufHdr->tag.rlocator, MyBackendId,
- bufHdr->tag.forkNum);
+ char *path;
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
+ path = relpathbackend(rlocator, MyBackendId,
+ BufTagGetForkNum(&bufHdr->tag));
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4797,15 +4831,20 @@ static inline int
buffertag_comparator(const BufferTag *ba, const BufferTag *bb)
{
int ret;
+ RelFileLocator rlocatora;
+ RelFileLocator rlocatorb;
+
+ rlocatora = BufTagGetRelFileLocator(ba);
+ rlocatorb = BufTagGetRelFileLocator(bb);
- ret = rlocator_comparator(&ba->rlocator, &bb->rlocator);
+ ret = rlocator_comparator(&rlocatora, &rlocatorb);
if (ret != 0)
return ret;
- if (ba->forkNum < bb->forkNum)
+ if (BufTagGetForkNum(ba) < BufTagGetForkNum(bb))
return -1;
- if (ba->forkNum > bb->forkNum)
+ if (BufTagGetForkNum(ba) > BufTagGetForkNum(bb))
return 1;
if (ba->blockNum < bb->blockNum)
@@ -4955,10 +4994,12 @@ IssuePendingWritebacks(WritebackContext *context)
SMgrRelation reln;
int ahead;
BufferTag tag;
+ RelFileLocator currlocator;
Size nblocks = 1;
cur = &context->pending_writebacks[i];
tag = cur->tag;
+ currlocator = BufTagGetRelFileLocator(&tag);
/*
* Peek ahead, into following writeback requests, to see if they can
@@ -4966,11 +5007,14 @@ IssuePendingWritebacks(WritebackContext *context)
*/
for (ahead = 0; i + ahead + 1 < context->nr_pending; ahead++)
{
+ RelFileLocator nextrlocator;
+
next = &context->pending_writebacks[i + ahead + 1];
+ nextrlocator = BufTagGetRelFileLocator(&next->tag);
/* different file, stop */
- if (!RelFileLocatorEquals(cur->tag.rlocator, next->tag.rlocator) ||
- cur->tag.forkNum != next->tag.forkNum)
+ if (!RelFileLocatorEquals(currlocator, nextrlocator) ||
+ BufTagGetForkNum(&cur->tag) != BufTagGetForkNum(&next->tag))
break;
/* ok, block queued twice, skip */
@@ -4988,8 +5032,8 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rlocator, InvalidBackendId);
- smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
+ reln = smgropen(currlocator, InvalidBackendId);
+ smgrwriteback(reln, BufTagGetForkNum(&tag), tag.blockNum, nblocks);
}
context->nr_pending = 0;
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index 91e174e..972f3f3 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -213,15 +213,18 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
{
SMgrRelation oreln;
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rlocator, MyBackendId);
+ oreln = smgropen(rlocator, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
/* And write... */
smgrwrite(oreln,
- bufHdr->tag.forkNum,
+ BufTagGetForkNum(&bufHdr->tag),
bufHdr->tag.blockNum,
localpage,
false);
@@ -337,16 +340,22 @@ DropRelationLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
- bufHdr->tag.forkNum == forkNum &&
+ BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
+ {
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rlocator, MyBackendId,
- bufHdr->tag.forkNum),
+ relpathbackend(rlocator, MyBackendId,
+ BufTagGetForkNum(&bufHdr->tag)),
LocalRefCount[i]);
+ }
+
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
hash_search(LocalBufHash, (void *) &bufHdr->tag,
@@ -383,13 +392,16 @@ DropRelationAllLocalBuffers(RelFileLocator rlocator)
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator))
+ BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator))
{
+ RelFileLocator tag_rlocator;	/* renamed to avoid shadowing the parameter */
+
+ tag_rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rlocator, MyBackendId,
- bufHdr->tag.forkNum),
+ relpathbackend(tag_rlocator, MyBackendId,
+ BufTagGetForkNum(&bufHdr->tag)),
LocalRefCount[i]);
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 2f0e60e..092e959 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,18 +90,51 @@
*/
typedef struct buftag
{
- RelFileLocator rlocator; /* physical relation identifier */
- ForkNumber forkNum;
+ Oid spcOid; /* tablespace oid */
+ Oid dbOid; /* database oid */
+ RelFileNumber relNumber; /* relation file number */
+ ForkNumber forkNum; /* fork number */
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
+static inline RelFileNumber
+BufTagGetRelNumber(const BufferTag *tag)
+{
+ return tag->relNumber;
+}
+
+static inline ForkNumber
+BufTagGetForkNum(const BufferTag *tag)
+{
+ return tag->forkNum;
+}
+
+static inline void
+BufTagSetRelForkDetails(BufferTag *tag, RelFileNumber relnumber,
+ ForkNumber forknum)
+{
+ tag->relNumber = relnumber;
+ tag->forkNum = forknum;
+}
+
+static inline RelFileLocator
+BufTagGetRelFileLocator(const BufferTag *tag)
+{
+ RelFileLocator rlocator;
+
+ rlocator.spcOid = tag->spcOid;
+ rlocator.dbOid = tag->dbOid;
+ rlocator.relNumber = BufTagGetRelNumber(tag);
+
+ return rlocator;
+}
+
static inline void
CLEAR_BUFFERTAG(BufferTag *tag)
{
- tag->rlocator.spcOid = InvalidOid;
- tag->rlocator.dbOid = InvalidOid;
- tag->rlocator.relNumber = InvalidRelFileNumber;
- tag->forkNum = InvalidForkNumber;
+ tag->spcOid = InvalidOid;
+ tag->dbOid = InvalidOid;
+ BufTagSetRelForkDetails(tag, InvalidRelFileNumber, InvalidForkNumber);
tag->blockNum = InvalidBlockNumber;
}
@@ -109,19 +142,32 @@ static inline void
INIT_BUFFERTAG(BufferTag *tag, const RelFileLocator *rlocator,
ForkNumber forkNum, BlockNumber blockNum)
{
- tag->rlocator = *rlocator;
- tag->forkNum = forkNum;
+ tag->spcOid = rlocator->spcOid;
+ tag->dbOid = rlocator->dbOid;
+ BufTagSetRelForkDetails(tag, rlocator->relNumber, forkNum);
tag->blockNum = blockNum;
}
static inline bool
BUFFERTAGS_EQUAL(const BufferTag *tag1, const BufferTag *tag2)
{
- return RelFileLocatorEquals(tag1->rlocator, tag2->rlocator) &&
+ return (tag1->spcOid == tag2->spcOid) &&
+ (tag1->dbOid == tag2->dbOid) &&
+ (tag1->relNumber == tag2->relNumber) &&
(tag1->blockNum == tag2->blockNum) &&
(tag1->forkNum == tag2->forkNum);
}
+static inline bool
+BufTagMatchesRelFileLocator(const BufferTag *tag,
+ const RelFileLocator *rlocator)
+{
+ return (tag->spcOid == rlocator->spcOid) &&
+ (tag->dbOid == rlocator->dbOid) &&
+ (BufTagGetRelNumber(tag) == rlocator->relNumber);
+}
+
/*
* The shared buffer mapping table is partitioned to reduce contention.
* To determine which partition lock a given tag requires, compute the tag's
--
1.8.3.1
Attachment: v12-0003-Widen-relfilenumber-from-32-bits-to-56-bits.patch (text/x-patch; charset=UTF-8)
From a9aba81a34c6f96d57a92f1a4d6c0c4be638929d Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Thu, 14 Jul 2022 14:25:50 +0530
Subject: [PATCH v12 3/3] Widen relfilenumber from 32 bits to 56 bits
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Currently relfilenumber is 32 bits wide, so it is at risk of wraparound and
relfilenumbers can be reused. To guard against such reuse, a complicated hack
leaves a 0-length tombstone file around until the next checkpoint, and when we
allocate a new relfilenumber we must loop to check for an on-disk conflict.
This patch widens relfilenumber to 56 bits, with no provision for wraparound.
That lets us get rid of the 0-length tombstone files and of the loop that
checks for on-disk relfilenumber conflicts.
The reason for making it 56 bits wide rather than a full 64 bits is that a
64-bit field would enlarge the BufferTag, increasing memory usage and
potentially hurting performance. To avoid that, the buffer tag uses 8 bits
for the fork number and 56 bits for the relfilenumber.
---
contrib/pg_buffercache/Makefile | 4 +-
.../pg_buffercache/pg_buffercache--1.3--1.4.sql | 30 ++++
contrib/pg_buffercache/pg_buffercache.control | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 39 ++++-
contrib/pg_prewarm/autoprewarm.c | 4 +-
contrib/pg_walinspect/expected/pg_walinspect.out | 4 +-
contrib/pg_walinspect/sql/pg_walinspect.sql | 4 +-
contrib/test_decoding/expected/rewrite.out | 2 +-
contrib/test_decoding/sql/rewrite.sql | 2 +-
doc/src/sgml/catalogs.sgml | 2 +-
doc/src/sgml/pgbuffercache.sgml | 2 +-
src/backend/access/gin/ginxlog.c | 2 +-
src/backend/access/rmgrdesc/gistdesc.c | 2 +-
src/backend/access/rmgrdesc/heapdesc.c | 2 +-
src/backend/access/rmgrdesc/nbtdesc.c | 2 +-
src/backend/access/rmgrdesc/seqdesc.c | 2 +-
src/backend/access/rmgrdesc/xlogdesc.c | 21 ++-
src/backend/access/transam/README | 5 +-
src/backend/access/transam/varsup.c | 183 ++++++++++++++++++++-
src/backend/access/transam/xlog.c | 61 +++++++
src/backend/access/transam/xlogprefetcher.c | 14 +-
src/backend/access/transam/xlogrecovery.c | 6 +-
src/backend/access/transam/xlogutils.c | 12 +-
src/backend/catalog/catalog.c | 95 -----------
src/backend/catalog/heap.c | 25 +--
src/backend/catalog/index.c | 11 +-
src/backend/catalog/storage.c | 8 +
src/backend/commands/dbcommands.c | 4 +-
src/backend/commands/tablecmds.c | 12 +-
src/backend/commands/tablespace.c | 3 +-
src/backend/nodes/gen_node_support.pl | 4 +-
src/backend/replication/basebackup.c | 13 +-
src/backend/replication/logical/decode.c | 1 +
src/backend/replication/logical/reorderbuffer.c | 2 +-
src/backend/storage/file/reinit.c | 46 +++---
src/backend/storage/freespace/fsmpage.c | 2 +-
src/backend/storage/lmgr/lwlocknames.txt | 1 +
src/backend/storage/smgr/md.c | 7 +
src/backend/storage/smgr/smgr.c | 2 +-
src/backend/utils/adt/dbsize.c | 11 +-
src/backend/utils/adt/pg_upgrade_support.c | 22 ++-
src/backend/utils/cache/relcache.c | 2 +-
src/backend/utils/cache/relfilenumbermap.c | 65 +++-----
src/backend/utils/misc/pg_controldata.c | 9 +-
src/bin/pg_checksums/pg_checksums.c | 4 +-
src/bin/pg_controldata/pg_controldata.c | 2 +
src/bin/pg_dump/pg_dump.c | 26 +--
src/bin/pg_rewind/filemap.c | 6 +-
src/bin/pg_upgrade/info.c | 5 +-
src/bin/pg_upgrade/pg_upgrade.c | 6 +-
src/bin/pg_upgrade/relfilenumber.c | 4 +-
src/bin/pg_waldump/pg_waldump.c | 2 +-
src/common/relpath.c | 20 +--
src/fe_utils/option_utils.c | 39 +++++
src/include/access/transam.h | 26 +++
src/include/access/xlog.h | 2 +
src/include/catalog/catalog.h | 3 -
src/include/catalog/pg_class.h | 16 +-
src/include/catalog/pg_control.h | 2 +
src/include/catalog/pg_proc.dat | 10 +-
src/include/common/relpath.h | 2 +
src/include/fe_utils/option_utils.h | 2 +
src/include/postgres_ext.h | 10 +-
src/include/storage/buf_internals.h | 62 ++++++-
src/include/storage/reinit.h | 3 +-
src/include/storage/relfilelocator.h | 12 +-
src/test/regress/expected/alter_table.out | 24 ++-
src/test/regress/expected/fast_default.out | 4 +-
src/test/regress/expected/oidjoins.out | 2 +-
src/test/regress/sql/alter_table.sql | 8 +-
src/test/regress/sql/fast_default.sql | 4 +-
71 files changed, 712 insertions(+), 346 deletions(-)
create mode 100644 contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile
index 2ab8c65..c20439a 100644
--- a/contrib/pg_buffercache/Makefile
+++ b/contrib/pg_buffercache/Makefile
@@ -6,8 +6,8 @@ OBJS = \
pg_buffercache_pages.o
EXTENSION = pg_buffercache
-DATA = pg_buffercache--1.2.sql pg_buffercache--1.2--1.3.sql \
- pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql
+DATA = pg_buffercache--1.0--1.1.sql pg_buffercache--1.1--1.2.sql pg_buffercache--1.2.sql \
+ pg_buffercache--1.2--1.3.sql pg_buffercache--1.3--1.4.sql
PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time"
ifdef USE_PGXS
diff --git a/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
new file mode 100644
index 0000000..ee2d9c7
--- /dev/null
+++ b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
@@ -0,0 +1,30 @@
+/* contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.4'" to load this file. \quit
+
+/* First we have to remove them from the extension */
+ALTER EXTENSION pg_buffercache DROP VIEW pg_buffercache;
+ALTER EXTENSION pg_buffercache DROP FUNCTION pg_buffercache_pages();
+
+/* Then we can drop them */
+DROP VIEW pg_buffercache;
+DROP FUNCTION pg_buffercache_pages();
+
+/* Now redefine */
+CREATE OR REPLACE FUNCTION pg_buffercache_pages()
+RETURNS SETOF RECORD
+AS 'MODULE_PATHNAME', 'pg_buffercache_pages_v1_4'
+LANGUAGE C PARALLEL SAFE;
+
+CREATE OR REPLACE VIEW pg_buffercache AS
+ SELECT P.* FROM pg_buffercache_pages() AS P
+ (bufferid integer, relfilenode int8, reltablespace oid, reldatabase oid,
+ relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
+ pinning_backends int4);
+
+-- Don't want these to be available to public.
+REVOKE ALL ON FUNCTION pg_buffercache_pages() FROM PUBLIC;
+REVOKE ALL ON pg_buffercache FROM PUBLIC;
+GRANT EXECUTE ON FUNCTION pg_buffercache_pages() TO pg_monitor;
+GRANT SELECT ON pg_buffercache TO pg_monitor;
diff --git a/contrib/pg_buffercache/pg_buffercache.control b/contrib/pg_buffercache/pg_buffercache.control
index 8c060ae..a82ae5f 100644
--- a/contrib/pg_buffercache/pg_buffercache.control
+++ b/contrib/pg_buffercache/pg_buffercache.control
@@ -1,5 +1,5 @@
# pg_buffercache extension
comment = 'examine the shared buffer cache'
-default_version = '1.3'
+default_version = '1.4'
module_pathname = '$libdir/pg_buffercache'
relocatable = true
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index c5754ea..15c4d38 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -59,9 +59,10 @@ typedef struct
* relation node/tablespace/database/blocknum and dirty indicator.
*/
PG_FUNCTION_INFO_V1(pg_buffercache_pages);
+PG_FUNCTION_INFO_V1(pg_buffercache_pages_v1_4);
-Datum
-pg_buffercache_pages(PG_FUNCTION_ARGS)
+static Datum
+pg_buffercache_pages_internal(PG_FUNCTION_ARGS, Oid rfn_typid)
{
FuncCallContext *funcctx;
Datum result;
@@ -103,7 +104,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
TupleDescInitEntry(tupledesc, (AttrNumber) 1, "bufferid",
INT4OID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 2, "relfilenode",
- OIDOID, -1, 0);
+ rfn_typid, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 3, "reltablespace",
OIDOID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 4, "reldatabase",
@@ -209,7 +210,24 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
}
else
{
- values[1] = ObjectIdGetDatum(fctx->record[i].relfilenumber);
+ if (rfn_typid == INT8OID)
+ values[1] =
+ Int64GetDatum((int64) fctx->record[i].relfilenumber);
+ else
+ {
+ Assert(rfn_typid == OIDOID);
+
+ if (fctx->record[i].relfilenumber > OID_MAX)
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("relfilenode " INT64_FORMAT " is too large to be represented as an OID",
+ fctx->record[i].relfilenumber),
+ errhint("Upgrade the extension using ALTER EXTENSION pg_buffercache UPDATE"));
+
+ values[1] =
+ ObjectIdGetDatum((Oid) fctx->record[i].relfilenumber);
+ }
+
nulls[1] = false;
values[2] = ObjectIdGetDatum(fctx->record[i].reltablespace);
nulls[2] = false;
@@ -237,3 +255,16 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
else
SRF_RETURN_DONE(funcctx);
}
+
+/* entry point for old extension version */
+Datum
+pg_buffercache_pages(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, OIDOID);
+}
+
+Datum
+pg_buffercache_pages_v1_4(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, INT8OID);
+}
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index 63f0d41..3a23795 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -345,7 +345,7 @@ apw_load_buffers(void)
{
unsigned forknum;
- if (fscanf(file, "%u,%u,%u,%u,%u\n", &blkinfo[i].database,
+ if (fscanf(file, "%u,%u," INT64_FORMAT ",%u,%u\n", &blkinfo[i].database,
&blkinfo[i].tablespace, &blkinfo[i].filenumber,
&forknum, &blkinfo[i].blocknum) != 5)
ereport(ERROR,
@@ -669,7 +669,7 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
{
CHECK_FOR_INTERRUPTS();
- ret = fprintf(file, "%u,%u,%u,%u,%u\n",
+ ret = fprintf(file, "%u,%u," INT64_FORMAT ",%u,%u\n",
block_info_array[i].database,
block_info_array[i].tablespace,
block_info_array[i].filenumber,
diff --git a/contrib/pg_walinspect/expected/pg_walinspect.out b/contrib/pg_walinspect/expected/pg_walinspect.out
index a1ee743..e9b06ed 100644
--- a/contrib/pg_walinspect/expected/pg_walinspect.out
+++ b/contrib/pg_walinspect/expected/pg_walinspect.out
@@ -54,9 +54,9 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- ===================================================================
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
ok
----
t
diff --git a/contrib/pg_walinspect/sql/pg_walinspect.sql b/contrib/pg_walinspect/sql/pg_walinspect.sql
index 1b265ea..5393834 100644
--- a/contrib/pg_walinspect/sql/pg_walinspect.sql
+++ b/contrib/pg_walinspect/sql/pg_walinspect.sql
@@ -39,10 +39,10 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
-- ===================================================================
-- Test for filtering out WAL records based on resource_manager and
diff --git a/contrib/test_decoding/expected/rewrite.out b/contrib/test_decoding/expected/rewrite.out
index b30999c..8f1aa48 100644
--- a/contrib/test_decoding/expected/rewrite.out
+++ b/contrib/test_decoding/expected/rewrite.out
@@ -106,7 +106,7 @@ VACUUM FULL pg_class;
-- reindexing of important relations / indexes
REINDEX TABLE pg_class;
REINDEX INDEX pg_class_oid_index;
-REINDEX INDEX pg_class_tblspc_relfilenode_index;
+REINDEX INDEX pg_class_relfilenode_index;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (5, 3);
BEGIN;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (6, 4);
diff --git a/contrib/test_decoding/sql/rewrite.sql b/contrib/test_decoding/sql/rewrite.sql
index 62dead3..8983704 100644
--- a/contrib/test_decoding/sql/rewrite.sql
+++ b/contrib/test_decoding/sql/rewrite.sql
@@ -77,7 +77,7 @@ VACUUM FULL pg_class;
-- reindexing of important relations / indexes
REINDEX TABLE pg_class;
REINDEX INDEX pg_class_oid_index;
-REINDEX INDEX pg_class_tblspc_relfilenode_index;
+REINDEX INDEX pg_class_relfilenode_index;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (5, 3);
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index a186e35..b77940f 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1965,7 +1965,7 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
</para>
<para>
Name of the on-disk file of this relation; zero means this
diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
index a06fd3e..e222265 100644
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@@ -62,7 +62,7 @@
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
(references <link linkend="catalog-pg-class"><structname>pg_class</structname></link>.<structfield>relfilenode</structfield>)
</para>
<para>
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 41b9211..b75ad79 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -100,7 +100,7 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
BlockNumber blknum;
BufferGetTag(buffer, &locator, &forknum, &blknum);
- elog(ERROR, "failed to add item to index page in %u/%u/%u",
+ elog(ERROR, "failed to add item to index page in %u/%u/" INT64_FORMAT,
locator.spcOid, locator.dbOid, locator.relNumber);
}
}
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index 7dd3c1d..c699937 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -26,7 +26,7 @@ out_gistxlogPageUpdate(StringInfo buf, gistxlogPageUpdate *xlrec)
static void
out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec)
{
- appendStringInfo(buf, "rel %u/%u/%u; blk %u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; blk %u; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber, xlrec->block,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 923d3bc..f6d278b 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -169,7 +169,7 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
{
xl_heap_new_cid *xlrec = (xl_heap_new_cid *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; tid %u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; tid %u/%u",
xlrec->target_locator.spcOid,
xlrec->target_locator.dbOid,
xlrec->target_locator.relNumber,
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index 4843cd5..70feb2d 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -100,7 +100,7 @@ btree_desc(StringInfo buf, XLogReaderState *record)
{
xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/seqdesc.c b/src/backend/access/rmgrdesc/seqdesc.c
index b3845f9..45c6ee7 100644
--- a/src/backend/access/rmgrdesc/seqdesc.c
+++ b/src/backend/access/rmgrdesc/seqdesc.c
@@ -25,7 +25,7 @@ seq_desc(StringInfo buf, XLogReaderState *record)
xl_seq_rec *xlrec = (xl_seq_rec *) rec;
if (info == XLOG_SEQ_LOG)
- appendStringInfo(buf, "rel %u/%u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT,
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber);
}
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 6fec485..a723790 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -45,8 +45,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
CheckPoint *checkpoint = (CheckPoint *) rec;
appendStringInfo(buf, "redo %X/%X; "
- "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; "
- "oldest xid %u in DB %u; oldest multi %u in DB %u; "
+ "tli %u; prev tli %u; fpw %s; xid %u:%u; relfilenumber " INT64_FORMAT "; oid %u; "
+ "multi %u; offset %u; oldest xid %u in DB %u; oldest multi %u in DB %u; "
"oldest/newest commit timestamp xid: %u/%u; "
"oldest running xid %u; %s",
LSN_FORMAT_ARGS(checkpoint->redo),
@@ -55,6 +55,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->fullPageWrites ? "true" : "false",
EpochFromFullTransactionId(checkpoint->nextXid),
XidFromFullTransactionId(checkpoint->nextXid),
+ checkpoint->nextRelFileNumber,
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
@@ -74,6 +75,13 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
memcpy(&nextOid, rec, sizeof(Oid));
appendStringInfo(buf, "%u", nextOid);
}
+ else if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, rec, sizeof(RelFileNumber));
+ appendStringInfo(buf, INT64_FORMAT, nextRelFileNumber);
+ }
else if (info == XLOG_RESTORE_POINT)
{
xl_restore_point *xlrec = (xl_restore_point *) rec;
@@ -169,6 +177,9 @@ xlog_identify(uint8 info)
case XLOG_NEXTOID:
id = "NEXTOID";
break;
+ case XLOG_NEXT_RELFILENUMBER:
+ id = "NEXT_RELFILENUMBER";
+ break;
case XLOG_SWITCH:
id = "SWITCH";
break;
@@ -237,7 +248,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfoChar(buf, ' ');
appendStringInfo(buf,
- "blkref #%d: rel %u/%u/%u fork %s blk %u",
+ "blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -297,7 +308,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
if (forknum != MAIN_FORKNUM)
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u fork %s blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -306,7 +317,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
else
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
blk);
diff --git a/src/backend/access/transam/README b/src/backend/access/transam/README
index 734c39a..8911671 100644
--- a/src/backend/access/transam/README
+++ b/src/backend/access/transam/README
@@ -692,8 +692,9 @@ by having database restart search for files that don't have any committed
entry in pg_class, but that currently isn't done because of the possibility
of deleting data that is useful for forensic analysis of the crash.
Orphan files are harmless --- at worst they waste a bit of disk space ---
-because we check for on-disk collisions when allocating new relfilenumber
-OIDs. So cleaning up isn't really necessary.
+because the relfilenumber counter is monotonically increasing. The maximum
+value is 2^56-1, and there is no provision for wraparound. Thus, on-disk
+collisions aren't possible.
3. Deleting a table, which requires an unlink() that could fail.
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 849a7ce..a2f0d35 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -13,12 +13,16 @@
#include "postgres.h"
+#include <unistd.h>
+
#include "access/clog.h"
#include "access/commit_ts.h"
#include "access/subtrans.h"
#include "access/transam.h"
#include "access/xact.h"
#include "access/xlogutils.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_tablespace.h"
#include "commands/dbcommands.h"
#include "miscadmin.h"
#include "postmaster/autovacuum.h"
@@ -30,6 +34,15 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+/* Number of RelFileNumbers to be logged per XLOG write */
+#define VAR_RELNUMBER_PER_XLOG 512
+
+/*
+ * Log more RelFileNumbers when the number of remaining logged values falls
+ * below this threshold. Valid values range from 0 to VAR_RELNUMBER_PER_XLOG - 1.
+ */
+#define VAR_RELNUMBER_NEW_XLOG_THRESHOLD 256
+
/* pointer to "variable cache" in shared memory (set up by shmem.c) */
VariableCache ShmemVariableCache = NULL;
@@ -521,8 +534,7 @@ ForceTransactionIdLimitUpdate(void)
* wide, counter wraparound will occur eventually, and therefore it is unwise
* to assume they are unique unless precautions are taken to make them so.
* Hence, this routine should generally not be used directly. The only direct
- * callers should be GetNewOidWithIndex() and GetNewRelFileNumber() in
- * catalog/catalog.c.
+ * caller should be GetNewOidWithIndex() in catalog/catalog.c.
*/
Oid
GetNewObjectId(void)
@@ -613,6 +625,173 @@ SetNextObjectId(Oid nextOid)
}
/*
+ * GetNewRelFileNumber
+ *
+ * Similar to GetNewObjectId, but generates a new relfilenumber rather
+ * than a new Oid.
+ */
+RelFileNumber
+GetNewRelFileNumber(Oid reltablespace, char relpersistence)
+{
+ RelFileNumber result;
+
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot assign RelFileNumber during recovery");
+
+ if (IsBinaryUpgrade)
+ elog(ERROR, "cannot assign RelFileNumber during binary upgrade");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /* check for the wraparound for the relfilenumber counter */
+ if (unlikely(ShmemVariableCache->nextRelFileNumber > MAX_RELFILENUMBER))
+ elog(ERROR, "relfilenumber is out of bounds");
+
+ /*
+ * If the number of remaining logged values has fallen below the
+ * threshold, log more. Ideally we could wait until all logged
+ * relfilenumbers have been consumed before logging more, but then we
+ * would have to flush the logged WAL record immediately, because we
+ * must ensure that nextRelFileNumber is always larger than any
+ * relfilenumber already in use on disk. To maintain that invariant,
+ * the record we log must reach disk before any new files are created
+ * from the newly logged range. To avoid an immediate flush, we
+ * instead always log before all relfilenumbers are consumed, and then
+ * we only have to flush the new record before consuming values from
+ * the new range. By the time that flush is needed, the record has
+ * hopefully already been flushed by some other XLogFlush operation.
+ * VAR_RELNUMBER_PER_XLOG is large enough that this should rarely slow
+ * things down, but avoiding extra XLogFlush calls is always better.
+ */
+ if (ShmemVariableCache->loggedRelFileNumber -
+ ShmemVariableCache->nextRelFileNumber <=
+ VAR_RELNUMBER_NEW_XLOG_THRESHOLD)
+ {
+ RelFileNumber newlogrelnum;
+
+ /*
+ * The first time through, this immediately flushes the newly logged
+ * WAL record, since nothing was logged in advance. From then on we
+ * remember the previously logged record pointer and flush up to that
+ * point.
+ *
+ * XXX the second time through, this may try to flush what the first
+ * pass already flushed, but that is logically a no-op, so it is not
+ * worth adding extra complexity to avoid it.
+ */
+ newlogrelnum = ShmemVariableCache->nextRelFileNumber +
+ VAR_RELNUMBER_PER_XLOG;
+ LogNextRelFileNumber(newlogrelnum,
+ &ShmemVariableCache->loggedRelFileNumberRecPtr);
+
+ ShmemVariableCache->loggedRelFileNumber = newlogrelnum;
+ }
+
+ result = ShmemVariableCache->nextRelFileNumber;
+ (ShmemVariableCache->nextRelFileNumber)++;
+
+ LWLockRelease(RelFileNumberGenLock);
+
+#ifdef USE_ASSERT_CHECKING
+
+ {
+ RelFileLocatorBackend rlocator;
+ char *rpath;
+ BackendId backend;
+
+ switch (relpersistence)
+ {
+ case RELPERSISTENCE_TEMP:
+ backend = BackendIdForTempRelations();
+ break;
+ case RELPERSISTENCE_UNLOGGED:
+ case RELPERSISTENCE_PERMANENT:
+ backend = InvalidBackendId;
+ break;
+ default:
+ elog(ERROR, "invalid relpersistence: %c", relpersistence);
+ return InvalidRelFileNumber; /* placate compiler */
+ }
+
+ /* this logic should match RelationInitPhysicalAddr */
+ rlocator.locator.spcOid =
+ reltablespace ? reltablespace : MyDatabaseTableSpace;
+ rlocator.locator.dbOid = (reltablespace == GLOBALTABLESPACE_OID) ?
+ InvalidOid : MyDatabaseId;
+ rlocator.locator.relNumber = result;
+
+ /*
+ * The relpath will vary based on the backend ID, so we must
+ * initialize that properly here to make sure that any collisions
+ * based on filename are properly detected.
+ */
+ rlocator.backend = backend;
+
+ /* check for existing file of same name. */
+ rpath = relpath(rlocator, MAIN_FORKNUM);
+ Assert(access(rpath, F_OK) != 0);
+ }
+#endif
+
+ return result;
+}
+
+/*
+ * SetNextRelFileNumber
+ *
+ * This may only be called during pg_upgrade; it advances the RelFileNumber
+ * counter to the specified value if the current value is smaller than the
+ * input value.
+ */
+void
+SetNextRelFileNumber(RelFileNumber relnumber)
+{
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot forward RelFileNumber during recovery");
+
+ if (!IsBinaryUpgrade)
+ elog(ERROR, "RelFileNumber can be set only during binary upgrade");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /*
+ * If the previously assigned nextRelFileNumber is already higher than
+ * the requested value, there is nothing to do. This can happen because
+ * during upgrade objects are not created in relfilenumber order.
+ */
+ if (relnumber <= ShmemVariableCache->nextRelFileNumber)
+ {
+ LWLockRelease(RelFileNumberGenLock);
+ return;
+ }
+
+ /*
+ * If the new relfilenumber to be set is greater than or equal to already
+ * logged relfilenumber then log again.
+ *
+ * XXX The new 'relnumber' can come from any range, so we cannot plan
+ * to piggyback on the XLogFlush by logging in advance. That should not
+ * matter much, as this is only called during binary upgrade.
+ */
+ if (relnumber >= ShmemVariableCache->loggedRelFileNumber)
+ {
+ RelFileNumber newlogrelnum;
+
+ newlogrelnum = relnumber + VAR_RELNUMBER_PER_XLOG;
+ LogNextRelFileNumber(newlogrelnum, NULL);
+ ShmemVariableCache->loggedRelFileNumber = newlogrelnum;
+ }
+
+ ShmemVariableCache->nextRelFileNumber = relnumber;
+
+ LWLockRelease(RelFileNumberGenLock);
+}
+
+/*
* StopGeneratingPinnedObjectIds
*
* This is called once during initdb to force the OID counter up to
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 15ab8d9..fa1436b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4543,6 +4543,7 @@ BootStrapXLOG(void)
checkPoint.nextXid =
FullTransactionIdFromEpochAndXid(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstGenbkiObjectId;
+ checkPoint.nextRelFileNumber = FirstNormalRelFileNumber;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
checkPoint.oldestXid = FirstNormalTransactionId;
@@ -4556,7 +4557,10 @@ BootStrapXLOG(void)
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumberRecPtr = InvalidXLogRecPtr;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -5023,7 +5027,9 @@ StartupXLOG(void)
/* initialize shared memory variables from the checkpoint record */
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -6483,6 +6489,18 @@ CreateCheckPoint(int flags)
checkPoint.nextOid += ShmemVariableCache->oidCount;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_SHARED);
+ checkPoint.nextRelFileNumber = ShmemVariableCache->nextRelFileNumber;
+ if (!shutdown)
+ {
+ if (ShmemVariableCache->loggedRelFileNumber < checkPoint.nextRelFileNumber)
+ elog(ERROR, "nextRelFileNumber cannot go backward from " INT64_FORMAT " to " INT64_FORMAT,
+ checkPoint.nextRelFileNumber, ShmemVariableCache->loggedRelFileNumber);
+
+ checkPoint.nextRelFileNumber = ShmemVariableCache->loggedRelFileNumber;
+ }
+ LWLockRelease(RelFileNumberGenLock);
+
MultiXactGetCheckptMulti(shutdown,
&checkPoint.nextMulti,
&checkPoint.nextMultiOffset,
@@ -7361,6 +7379,35 @@ XLogPutNextOid(Oid nextOid)
}
/*
+ * Similar to XLogPutNextOid, but instead of writing a NEXTOID log record it
+ * writes a NEXT_RELFILENUMBER log record. If '*prevrecptr' is a valid
+ * XLogRecPtr then flush the WAL up to that record pointer, otherwise flush
+ * up to the record just logged. Also store the just-logged record pointer
+ * in '*prevrecptr' if prevrecptr is not NULL.
+ */
+void
+LogNextRelFileNumber(RelFileNumber nextrelnumber, XLogRecPtr *prevrecptr)
+{
+ XLogRecPtr recptr;
+
+ XLogBeginInsert();
+ XLogRegisterData((char *) (&nextrelnumber), sizeof(RelFileNumber));
+ recptr = XLogInsert(RM_XLOG_ID, XLOG_NEXT_RELFILENUMBER);
+
+ /*
+ * If a valid prevrecptr is passed then flush that xlog record to disk
+ * otherwise flush the newly logged record.
+ */
+ if ((prevrecptr != NULL) && !XLogRecPtrIsInvalid(*prevrecptr))
+ XLogFlush(*prevrecptr);
+ else
+ XLogFlush(recptr);
+
+ if (prevrecptr != NULL)
+ *prevrecptr = recptr;
+}
+
+/*
* Write an XLOG SWITCH record.
*
* Here we just blindly issue an XLogInsert request for the record.
@@ -7575,6 +7622,16 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
}
+ if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, XLogRecGetData(record), sizeof(RelFileNumber));
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumber = nextRelFileNumber;
+ LWLockRelease(RelFileNumberGenLock);
+ }
else if (info == XLOG_CHECKPOINT_SHUTDOWN)
{
CheckPoint checkPoint;
@@ -7589,6 +7646,10 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
+ LWLockRelease(RelFileNumberGenLock);
MultiXactSetNextMXact(checkPoint.nextMulti,
checkPoint.nextMultiOffset);
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index 87d1421..d27c7fc 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -611,7 +611,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u until %X/%X is replayed, which creates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, which creates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -634,7 +634,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, which truncates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, which truncates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -733,7 +733,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing all prefetch in relation %u/%u/%u until %X/%X is replayed, because the relation does not exist on disk",
+ "suppressing all prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, because the relation does not exist on disk",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -754,7 +754,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, because the relation is too small",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, because the relation is too small",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -793,7 +793,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* truncated beneath our feet?
*/
elog(ERROR,
- "could not prefetch relation %u/%u/%u block %u",
+ "could not prefetch relation %u/%u/" INT64_FORMAT " block %u",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -932,7 +932,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed),
filter->filter_from_block);
@@ -948,7 +948,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed));
#endif
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index e383c21..81d1f27 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2175,14 +2175,14 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
continue;
if (forknum != MAIN_FORKNUM)
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, fork %u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", fork %u, blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
forknum,
blk);
else
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
@@ -2378,7 +2378,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
if (memcmp(replay_image_masked, primary_image_masked, BLCKSZ) != 0)
{
elog(FATAL,
- "inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
+ "inconsistent page found, rel %u/%u/" INT64_FORMAT ", forknum %u, blkno %u",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forknum, blkno);
}
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 0cda225..b4398e1 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -617,17 +617,17 @@ CreateFakeRelcacheEntry(RelFileLocator rlocator)
rel->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
/* We don't know the name of the relation; use relfilenumber instead */
- sprintf(RelationGetRelationName(rel), "%u", rlocator.relNumber);
+ sprintf(RelationGetRelationName(rel), INT64_FORMAT, rlocator.relNumber);
/*
* We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNumber may be
- * different from the relation's OID. It shouldn't really matter though.
- * In recovery, we are running by ourselves and can't have any lock
- * conflicts. While syncing, we already hold AccessExclusiveLock.
+ * relation. Note that we set relId to FirstNormalObjectId, which is
+ * completely bogus. It shouldn't really matter though. In recovery,
+ * we are running by ourselves and can't have any lock conflicts. While
+ * syncing, we already hold AccessExclusiveLock.
*/
rel->rd_lockInfo.lockRelId.dbId = rlocator.dbOid;
- rel->rd_lockInfo.lockRelId.relId = rlocator.relNumber;
+ rel->rd_lockInfo.lockRelId.relId = FirstNormalObjectId;
rel->rd_smgr = NULL;
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 6f43870..155400c 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -481,101 +481,6 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
}
/*
- * GetNewRelFileNumber
- * Generate a new relfilenumber that is unique within the
- * database of the given tablespace.
- *
- * If the relfilenumber will also be used as the relation's OID, pass the
- * opened pg_class catalog, and this routine will guarantee that the result
- * is also an unused OID within pg_class. If the result is to be used only
- * as a relfilenumber for an existing relation, pass NULL for pg_class.
- *
- * As with GetNewOidWithIndex(), there is some theoretical risk of a race
- * condition, but it doesn't seem worth worrying about.
- *
- * Note: we don't support using this in bootstrap mode. All relations
- * created by bootstrap have preassigned OIDs, so there's no need.
- */
-RelFileNumber
-GetNewRelFileNumber(Oid reltablespace, Relation pg_class, char relpersistence)
-{
- RelFileLocatorBackend rlocator;
- char *rpath;
- bool collides;
- BackendId backend;
-
- /*
- * If we ever get here during pg_upgrade, there's something wrong; all
- * relfilenumber assignments during a binary-upgrade run should be
- * determined by commands in the dump script.
- */
- Assert(!IsBinaryUpgrade);
-
- switch (relpersistence)
- {
- case RELPERSISTENCE_TEMP:
- backend = BackendIdForTempRelations();
- break;
- case RELPERSISTENCE_UNLOGGED:
- case RELPERSISTENCE_PERMANENT:
- backend = InvalidBackendId;
- break;
- default:
- elog(ERROR, "invalid relpersistence: %c", relpersistence);
- return InvalidRelFileNumber; /* placate compiler */
- }
-
- /* This logic should match RelationInitPhysicalAddr */
- rlocator.locator.spcOid = reltablespace ? reltablespace : MyDatabaseTableSpace;
- rlocator.locator.dbOid =
- (rlocator.locator.spcOid == GLOBALTABLESPACE_OID) ?
- InvalidOid : MyDatabaseId;
-
- /*
- * The relpath will vary based on the backend ID, so we must initialize
- * that properly here to make sure that any collisions based on filename
- * are properly detected.
- */
- rlocator.backend = backend;
-
- do
- {
- CHECK_FOR_INTERRUPTS();
-
- /* Generate the OID */
- if (pg_class)
- rlocator.locator.relNumber = GetNewOidWithIndex(pg_class, ClassOidIndexId,
- Anum_pg_class_oid);
- else
- rlocator.locator.relNumber = GetNewObjectId();
-
- /* Check for existing file of same name */
- rpath = relpath(rlocator, MAIN_FORKNUM);
-
- if (access(rpath, F_OK) == 0)
- {
- /* definite collision */
- collides = true;
- }
- else
- {
- /*
- * Here we have a little bit of a dilemma: if errno is something
- * other than ENOENT, should we declare a collision and loop? In
- * practice it seems best to go ahead regardless of the errno. If
- * there is a colliding file we will get an smgr failure when we
- * attempt to create the new relation file.
- */
- collides = false;
- }
-
- pfree(rpath);
- } while (collides);
-
- return rlocator.locator.relNumber;
-}
-
-/*
* SQL callable interface for GetNewOidWithIndex(). Outside of initdb's
* direct insertions into catalog tables, and recovering from corruption, this
* should rarely be needed.
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 9b03579..c17f60f 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -342,10 +342,18 @@ heap_create(const char *relname,
{
/*
* If relfilenumber is unspecified by the caller then create storage
- * with oid same as relid.
+ * with relfilenumber same as relid if it is a system table otherwise
+ * allocate a new relfilenumber. For more details read comments atop
+ * FirstNormalRelFileNumber declaration.
*/
if (!RelFileNumberIsValid(relfilenumber))
- relfilenumber = relid;
+ {
+ if (relid < FirstNormalObjectId)
+ relfilenumber = relid;
+ else
+ relfilenumber = GetNewRelFileNumber(reltablespace,
+ relpersistence);
+ }
}
/*
@@ -898,7 +906,7 @@ InsertPgClassTuple(Relation pg_class_desc,
values[Anum_pg_class_reloftype - 1] = ObjectIdGetDatum(rd_rel->reloftype);
values[Anum_pg_class_relowner - 1] = ObjectIdGetDatum(rd_rel->relowner);
values[Anum_pg_class_relam - 1] = ObjectIdGetDatum(rd_rel->relam);
- values[Anum_pg_class_relfilenode - 1] = ObjectIdGetDatum(rd_rel->relfilenode);
+ values[Anum_pg_class_relfilenode - 1] = Int64GetDatum(rd_rel->relfilenode);
values[Anum_pg_class_reltablespace - 1] = ObjectIdGetDatum(rd_rel->reltablespace);
values[Anum_pg_class_relpages - 1] = Int32GetDatum(rd_rel->relpages);
values[Anum_pg_class_reltuples - 1] = Float4GetDatum(rd_rel->reltuples);
@@ -1170,12 +1178,7 @@ heap_create_with_catalog(const char *relname,
if (shared_relation && reltablespace != GLOBALTABLESPACE_OID)
elog(ERROR, "shared relations must be placed in pg_global tablespace");
- /*
- * Allocate an OID for the relation, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the relation, unless we were told what to use. */
if (!OidIsValid(relid))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -1229,8 +1232,8 @@ heap_create_with_catalog(const char *relname,
}
if (!OidIsValid(relid))
- relid = GetNewRelFileNumber(reltablespace, pg_class_desc,
- relpersistence);
+ relid = GetNewOidWithIndex(pg_class_desc, ClassOidIndexId,
+ Anum_pg_class_oid);
}
/*
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index d7192f3..a509c3e 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -898,12 +898,7 @@ index_create(Relation heapRelation,
collationObjectId,
classObjectId);
- /*
- * Allocate an OID for the index, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the index, unless we were told what to use. */
if (!OidIsValid(indexRelationId))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -935,8 +930,8 @@ index_create(Relation heapRelation,
}
else
{
- indexRelationId =
- GetNewRelFileNumber(tableSpaceId, pg_class, relpersistence);
+ indexRelationId = GetNewOidWithIndex(pg_class, ClassOidIndexId,
+ Anum_pg_class_oid);
}
}
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index d708af1..c1bd4f8 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -968,6 +968,10 @@ smgr_redo(XLogReaderState *record)
xl_smgr_create *xlrec = (xl_smgr_create *) XLogRecGetData(record);
SMgrRelation reln;
+ if (xlrec->rlocator.relNumber > ShmemVariableCache->nextRelFileNumber)
+ elog(ERROR, "unexpected relnumber " INT64_FORMAT " that is bigger than nextRelFileNumber " INT64_FORMAT,
+ xlrec->rlocator.relNumber, ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
smgrcreate(reln, xlrec->forkNum, true);
}
@@ -981,6 +985,10 @@ smgr_redo(XLogReaderState *record)
int nforks = 0;
bool need_fsm_vacuum = false;
+ if (xlrec->rlocator.relNumber > ShmemVariableCache->nextRelFileNumber)
+ elog(ERROR, "unexpected relnumber " INT64_FORMAT " that is bigger than nextRelFileNumber " INT64_FORMAT,
+ xlrec->rlocator.relNumber, ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
/*
diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index 099d369..d09168d 100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -250,7 +250,7 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
BlockNumber nblocks;
BlockNumber blkno;
Buffer buf;
- Oid relfilenumber;
+ RelFileNumber relfilenumber;
Page page;
List *rlocatorlist = NIL;
LockRelId relid;
@@ -397,7 +397,7 @@ ScanSourceDatabasePgClassTuple(HeapTupleData *tuple, Oid tbid, Oid dbid,
{
CreateDBRelInfo *relinfo;
Form_pg_class classForm;
- Oid relfilenumber = InvalidRelFileNumber;
+ RelFileNumber relfilenumber = InvalidRelFileNumber;
classForm = (Form_pg_class) GETSTRUCT(tuple);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 7fbee0c..430ade2 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14331,10 +14331,14 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
}
/*
- * Relfilenumbers are not unique in databases across tablespaces, so we
- * need to allocate a new one in the new tablespace.
- */
- newrelfilenumber = GetNewRelFileNumber(newTableSpace, NULL,
+ * Generate a new relfilenumber. We cannot reuse the old relfilenumber
+ * because of the possibility that the relation will be moved back to the
+ * original tablespace before the next checkpoint. At that point, the
+ * first segment of the main fork won't have been unlinked yet, and an
+ * attempt to create new relation storage with that same relfilenumber
+ * will fail.
+ */
+ newrelfilenumber = GetNewRelFileNumber(newTableSpace,
rel->rd_rel->relpersistence);
/* Open old and new relation */
diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index cb7d460..66edc64 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -290,7 +290,8 @@ CreateTableSpace(CreateTableSpaceStmt *stmt)
* parts.
*/
if (strlen(location) + 1 + strlen(TABLESPACE_VERSION_DIRECTORY) + 1 +
- OIDCHARS + 1 + OIDCHARS + 1 + FORKNAMECHARS + 1 + OIDCHARS > MAXPGPATH)
+ OIDCHARS + 1 + RELNUMBERCHARS + 1 + FORKNAMECHARS + 1 + OIDCHARS >
+ MAXPGPATH)
ereport(ERROR,
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
errmsg("tablespace location \"%s\" is too long",
diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index 86cf1b3..dc8b9da 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -961,12 +961,12 @@ _read${n}(void)
print $off "\tWRITE_UINT_FIELD($f);\n";
print $rff "\tREAD_UINT_FIELD($f);\n" unless $no_read;
}
- elsif ($t eq 'uint64')
+ elsif ($t eq 'uint64' || $t eq 'RelFileNumber')
{
print $off "\tWRITE_UINT64_FIELD($f);\n";
print $rff "\tREAD_UINT64_FIELD($f);\n" unless $no_read;
}
- elsif ($t eq 'Oid' || $t eq 'RelFileNumber')
+ elsif ($t eq 'Oid')
{
print $off "\tWRITE_OID_FIELD($f);\n";
print $rff "\tREAD_OID_FIELD($f);\n" unless $no_read;
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 637c0ce..4de249b 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -1172,7 +1172,8 @@ sendDir(bbsink *sink, const char *path, int basepathlen, bool sizeonly,
int excludeIdx;
bool excludeFound;
ForkNumber relForkNum; /* Type of fork if file is a relation */
- int relOidChars; /* Chars in filename that are the rel oid */
+ int relnumchars; /* Chars in filename that are the
+ * relnumber */
/* Skip special stuff */
if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
@@ -1222,23 +1223,23 @@ sendDir(bbsink *sink, const char *path, int basepathlen, bool sizeonly,
/* Exclude all forks for unlogged tables except the init fork */
if (isDbDir &&
- parse_filename_for_nontemp_relation(de->d_name, &relOidChars,
+ parse_filename_for_nontemp_relation(de->d_name, &relnumchars,
&relForkNum))
{
/* Never exclude init forks */
if (relForkNum != INIT_FORKNUM)
{
char initForkFile[MAXPGPATH];
- char relOid[OIDCHARS + 1];
+ char relNumber[RELNUMBERCHARS + 1];
/*
* If any other type of fork, check if there is an init fork
* with the same OID. If so, the file can be excluded.
*/
- memcpy(relOid, de->d_name, relOidChars);
- relOid[relOidChars] = '\0';
+ memcpy(relNumber, de->d_name, relnumchars);
+ relNumber[relnumchars] = '\0';
snprintf(initForkFile, sizeof(initForkFile), "%s/%s_init",
- path, relOid);
+ path, relNumber);
if (lstat(initForkFile, &statbuf) == 0)
{
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index c5c6a2b..7029604 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -154,6 +154,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
break;
case XLOG_NOOP:
case XLOG_NEXTOID:
+ case XLOG_NEXT_RELFILENUMBER:
case XLOG_SWITCH:
case XLOG_BACKUP_END:
case XLOG_PARAMETER_CHANGE:
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 88a37fd..99580a4 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -4869,7 +4869,7 @@ DisplayMapping(HTAB *tuplecid_data)
hash_seq_init(&hstat, tuplecid_data);
while ((ent = (ReorderBufferTupleCidEnt *) hash_seq_search(&hstat)) != NULL)
{
- elog(DEBUG3, "mapping: node: %u/%u/%u tid: %u/%u cmin: %u, cmax: %u",
+ elog(DEBUG3, "mapping: node: %u/%u/" INT64_FORMAT " tid: %u/%u cmin: %u, cmax: %u",
ent->key.rlocator.dbOid,
ent->key.rlocator.spcOid,
ent->key.rlocator.relNumber,
diff --git a/src/backend/storage/file/reinit.c b/src/backend/storage/file/reinit.c
index f053fe0..a4bae7c 100644
--- a/src/backend/storage/file/reinit.c
+++ b/src/backend/storage/file/reinit.c
@@ -195,11 +195,11 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
while ((de = ReadDir(dbspace_dir, dbspacedirname)) != NULL)
{
ForkNumber forkNum;
- int oidchars;
+ int relnumchars;
unlogged_relation_entry ent;
/* Skip anything that doesn't look like a relation data file. */
- if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars,
+ if (!parse_filename_for_nontemp_relation(de->d_name, &relnumchars,
&forkNum))
continue;
@@ -235,11 +235,11 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
while ((de = ReadDir(dbspace_dir, dbspacedirname)) != NULL)
{
ForkNumber forkNum;
- int oidchars;
+ int relnumchars;
unlogged_relation_entry ent;
/* Skip anything that doesn't look like a relation data file. */
- if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars,
+ if (!parse_filename_for_nontemp_relation(de->d_name, &relnumchars,
&forkNum))
continue;
@@ -285,13 +285,13 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
while ((de = ReadDir(dbspace_dir, dbspacedirname)) != NULL)
{
ForkNumber forkNum;
- int oidchars;
- char oidbuf[OIDCHARS + 1];
+ int relnumchars;
+ char relnumbuf[RELNUMBERCHARS + 1];
char srcpath[MAXPGPATH * 2];
char dstpath[MAXPGPATH];
/* Skip anything that doesn't look like a relation data file. */
- if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars,
+ if (!parse_filename_for_nontemp_relation(de->d_name, &relnumchars,
&forkNum))
continue;
@@ -304,10 +304,10 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
dbspacedirname, de->d_name);
/* Construct destination pathname. */
- memcpy(oidbuf, de->d_name, oidchars);
- oidbuf[oidchars] = '\0';
+ memcpy(relnumbuf, de->d_name, relnumchars);
+ relnumbuf[relnumchars] = '\0';
snprintf(dstpath, sizeof(dstpath), "%s/%s%s",
- dbspacedirname, oidbuf, de->d_name + oidchars + 1 +
+ dbspacedirname, relnumbuf, de->d_name + relnumchars + 1 +
strlen(forkNames[INIT_FORKNUM]));
/* OK, we're ready to perform the actual copy. */
@@ -328,12 +328,12 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
while ((de = ReadDir(dbspace_dir, dbspacedirname)) != NULL)
{
ForkNumber forkNum;
- int oidchars;
- char oidbuf[OIDCHARS + 1];
+ int relnumchars;
+ char relnumbuf[RELNUMBERCHARS + 1];
char mainpath[MAXPGPATH];
/* Skip anything that doesn't look like a relation data file. */
- if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars,
+ if (!parse_filename_for_nontemp_relation(de->d_name, &relnumchars,
&forkNum))
continue;
@@ -342,10 +342,10 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
continue;
/* Construct main fork pathname. */
- memcpy(oidbuf, de->d_name, oidchars);
- oidbuf[oidchars] = '\0';
+ memcpy(relnumbuf, de->d_name, relnumchars);
+ relnumbuf[relnumchars] = '\0';
snprintf(mainpath, sizeof(mainpath), "%s/%s%s",
- dbspacedirname, oidbuf, de->d_name + oidchars + 1 +
+ dbspacedirname, relnumbuf, de->d_name + relnumchars + 1 +
strlen(forkNames[INIT_FORKNUM]));
fsync_fname(mainpath, false);
@@ -372,13 +372,13 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
* for a non-temporary relation and false otherwise.
*
* NB: If this function returns true, the caller is entitled to assume that
- * *oidchars has been set to the a value no more than OIDCHARS, and thus
- * that a buffer of OIDCHARS+1 characters is sufficient to hold the OID
- * portion of the filename. This is critical to protect against a possible
- * buffer overrun.
+ * *relnumchars has been set to a value no more than RELNUMBERCHARS, and thus
+ * that a buffer of RELNUMBERCHARS+1 characters is sufficient to hold the
+ * RelFileNumber portion of the filename. This is critical to protect against
+ * a possible buffer overrun.
*/
bool
-parse_filename_for_nontemp_relation(const char *name, int *oidchars,
+parse_filename_for_nontemp_relation(const char *name, int *relnumchars,
ForkNumber *fork)
{
int pos;
@@ -386,9 +386,9 @@ parse_filename_for_nontemp_relation(const char *name, int *oidchars,
/* Look for a non-empty string of digits (that isn't too long). */
for (pos = 0; isdigit((unsigned char) name[pos]); ++pos)
;
- if (pos == 0 || pos > OIDCHARS)
+ if (pos == 0 || pos > RELNUMBERCHARS)
return false;
- *oidchars = pos;
+ *relnumchars = pos;
/* Check for a fork name. */
if (name[pos] != '_')
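The reinit.c changes widen the digit-prefix parser from OIDCHARS to RELNUMBERCHARS. The core loop can be exercised standalone as below; RELNUMBERCHARS is assumed here to be 20 (enough digits for any 64-bit decimal value), which may not match the patch's actual definition, and the fork-name handling is omitted:

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>

/* Assumed bound on digits in a relfilenumber filename prefix. */
#define RELNUMBERCHARS 20

/*
 * Return true if 'name' begins with a plausible relfilenumber, storing the
 * number of digit characters in *relnumchars. Mirrors the shape of
 * parse_filename_for_nontemp_relation, minus the fork-name checks.
 */
static bool
parse_relnumber_prefix(const char *name, int *relnumchars)
{
	int			pos;

	/* Look for a non-empty string of digits (that isn't too long). */
	for (pos = 0; isdigit((unsigned char) name[pos]); ++pos)
		;
	if (pos == 0 || pos > RELNUMBERCHARS)
		return false;
	*relnumchars = pos;
	return true;
}
```

Because callers copy at most *relnumchars bytes into a RELNUMBERCHARS+1 buffer, the upper bound in the loop is what makes those fixed-size copies safe.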
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index af4dab7..172225b 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -273,7 +273,7 @@ restart:
BlockNumber blknum;
BufferGetTag(buf, &rlocator, &forknum, &blknum);
- elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
+ elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/" INT64_FORMAT,
blknum, rlocator.spcOid, rlocator.dbOid, rlocator.relNumber);
/* make sure we hold an exclusive lock */
diff --git a/src/backend/storage/lmgr/lwlocknames.txt b/src/backend/storage/lmgr/lwlocknames.txt
index 6c7cf6c..3c5d041 100644
--- a/src/backend/storage/lmgr/lwlocknames.txt
+++ b/src/backend/storage/lmgr/lwlocknames.txt
@@ -53,3 +53,4 @@ XactTruncationLock 44
# 45 was XactTruncationLock until removal of BackendRandomLock
WrapLimitsVacuumLock 46
NotifyQueueTailLock 47
+RelFileNumberGenLock 48
\ No newline at end of file
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 3998296..a88fef2 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -257,6 +257,13 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* next checkpoint, we prevent reassignment of the relfilenumber until it's
* safe, because relfilenumber assignment skips over any existing file.
*
+ * XXX. Although all of this was true when relfilenumbers were 32 bits wide,
+ * they are now 56 bits wide and do not wrap around, so in the future we can
+ * change the code to immediately unlink the first segment of the relation
+ * along with all the others. We still do reuse relfilenumbers when createdb()
+ * is performed using the file-copy method or during movedb(), but the scenario
+ * described above can only happen when creating a new relation.
+ *
* We do not need to go through this dance for temp relations, though, because
* we never make WAL entries for temp rels, and so a temp rel poses no threat
* to the health of a regular rel that has taken over its relfilenumber.
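The md.c comment above argues that 56-bit relfilenumbers "do not wrap around" in practice, which is what would let the tombstone dance be dropped. As a back-of-envelope check of that claim (plain arithmetic, not code from the patch), even an implausibly high allocation rate leaves millennia of headroom:

```c
#include <stdint.h>

/* Seconds until a 56-bit counter is exhausted at a given allocation rate. */
static double
relfilenumber_headroom_seconds(double allocations_per_second)
{
	const double space = (double) ((((uint64_t) 1) << 56) - 1);

	return space / allocations_per_second;
}
```

At a million relation creations per second, that is still on the order of thousands of years, so exhaustion is not a realistic failure mode.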
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index c1a5feb..ed46ac3 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -154,7 +154,7 @@ smgropen(RelFileLocator rlocator, BackendId backend)
/* First time through: initialize the hash table */
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileLocatorBackend);
+ ctl.keysize = SizeOfRelFileLocatorBackend;
ctl.entrysize = sizeof(SMgrRelationData);
SMgrRelationHash = hash_create("smgr relation table", 400,
&ctl, HASH_ELEM | HASH_BLOBS);
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 34efa12..2b313aa 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -878,7 +878,7 @@ pg_relation_filenode(PG_FUNCTION_ARGS)
if (!RelFileNumberIsValid(result))
PG_RETURN_NULL();
- PG_RETURN_OID(result);
+ PG_RETURN_INT64(result);
}
/*
@@ -898,9 +898,16 @@ Datum
pg_filenode_relation(PG_FUNCTION_ARGS)
{
Oid reltablespace = PG_GETARG_OID(0);
- RelFileNumber relfilenumber = PG_GETARG_OID(1);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(1);
Oid heaprel;
+ /* check whether the relfilenumber is within a valid range */
+ if (relfilenumber < 0 || relfilenumber > MAX_RELFILENUMBER)
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("relfilenumber " INT64_FORMAT " is out of range",
+ relfilenumber));
+
/* test needed so RelidByRelfilenumber doesn't misbehave */
if (!RelFileNumberIsValid(relfilenumber))
PG_RETURN_NULL();
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 797f5f5..b1d60bd 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -17,6 +17,7 @@
#include "catalog/pg_type.h"
#include "commands/extension.h"
#include "miscadmin.h"
+#include "storage/relfilelocator.h"
#include "utils/array.h"
#include "utils/builtins.h"
@@ -29,6 +30,15 @@ do { \
errmsg("function can only be called when server is in binary upgrade mode"))); \
} while (0)
+#define CHECK_RELFILENUMBER_RANGE(relfilenumber) \
+do { \
+ if ((relfilenumber) < 0 || (relfilenumber) > MAX_RELFILENUMBER) \
+ ereport(ERROR, \
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE), \
+ errmsg("relfilenumber " INT64_FORMAT " is out of range", \
+ (relfilenumber))); \
+} while (0)
+
Datum
binary_upgrade_set_next_pg_tablespace_oid(PG_FUNCTION_ARGS)
{
@@ -98,10 +108,12 @@ binary_upgrade_set_next_heap_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_heap_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_heap_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -120,10 +132,12 @@ binary_upgrade_set_next_index_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_index_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_index_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -142,10 +156,12 @@ binary_upgrade_set_next_toast_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_toast_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_toast_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index bdb771d..a0b7c5b 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -3709,7 +3709,7 @@ RelationSetNewRelfilenumber(Relation relation, char persistence)
/* Allocate a new relfilenumber */
newrelfilenumber = GetNewRelFileNumber(relation->rd_rel->reltablespace,
- NULL, persistence);
+ persistence);
/*
* Get a writable copy of the pg_class tuple for the given relation.
diff --git a/src/backend/utils/cache/relfilenumbermap.c b/src/backend/utils/cache/relfilenumbermap.c
index c4245d5..cbb18f0 100644
--- a/src/backend/utils/cache/relfilenumbermap.c
+++ b/src/backend/utils/cache/relfilenumbermap.c
@@ -32,17 +32,11 @@
static HTAB *RelfilenumberMapHash = NULL;
/* built first time through in InitializeRelfilenumberMap */
-static ScanKeyData relfilenumber_skey[2];
+static ScanKeyData relfilenumber_skey[1];
typedef struct
{
- Oid reltablespace;
- RelFileNumber relfilenumber;
-} RelfilenumberMapKey;
-
-typedef struct
-{
- RelfilenumberMapKey key; /* lookup key - must be first */
+ RelFileNumber relfilenumber; /* lookup key - must be first */
Oid relid; /* pg_class.oid */
} RelfilenumberMapEntry;
@@ -72,7 +66,7 @@ RelfilenumberMapInvalidateCallback(Datum arg, Oid relid)
entry->relid == relid) /* individual flushed relation */
{
if (hash_search(RelfilenumberMapHash,
- (void *) &entry->key,
+ (void *) &entry->relfilenumber,
HASH_REMOVE,
NULL) == NULL)
elog(ERROR, "hash table corrupted");
@@ -88,7 +82,6 @@ static void
InitializeRelfilenumberMap(void)
{
HASHCTL ctl;
- int i;
/* Make sure we've initialized CacheMemoryContext. */
if (CacheMemoryContext == NULL)
@@ -97,25 +90,20 @@ InitializeRelfilenumberMap(void)
/* build skey */
MemSet(&relfilenumber_skey, 0, sizeof(relfilenumber_skey));
- for (i = 0; i < 2; i++)
- {
- fmgr_info_cxt(F_OIDEQ,
- &relfilenumber_skey[i].sk_func,
- CacheMemoryContext);
- relfilenumber_skey[i].sk_strategy = BTEqualStrategyNumber;
- relfilenumber_skey[i].sk_subtype = InvalidOid;
- relfilenumber_skey[i].sk_collation = InvalidOid;
- }
-
- relfilenumber_skey[0].sk_attno = Anum_pg_class_reltablespace;
- relfilenumber_skey[1].sk_attno = Anum_pg_class_relfilenode;
+ fmgr_info_cxt(F_INT8EQ,
+ &relfilenumber_skey[0].sk_func,
+ CacheMemoryContext);
+ relfilenumber_skey[0].sk_strategy = BTEqualStrategyNumber;
+ relfilenumber_skey[0].sk_subtype = InvalidOid;
+ relfilenumber_skey[0].sk_collation = InvalidOid;
+ relfilenumber_skey[0].sk_attno = Anum_pg_class_relfilenode;
/*
* Only create the RelfilenumberMapHash now, so we don't end up partially
* initialized when fmgr_info_cxt() above ERRORs out with an out of memory
* error.
*/
- ctl.keysize = sizeof(RelfilenumberMapKey);
+ ctl.keysize = sizeof(RelFileNumber);
ctl.entrysize = sizeof(RelfilenumberMapEntry);
ctl.hcxt = CacheMemoryContext;
@@ -129,7 +117,7 @@ InitializeRelfilenumberMap(void)
}
/*
- * Map a relation's (tablespace, relfilenumber) to a relation's oid and cache
+ * Map a relation's relfilenumber to a relation's oid and cache
* the result.
*
* Returns InvalidOid if no relation matching the criteria could be found.
@@ -137,26 +125,17 @@ InitializeRelfilenumberMap(void)
Oid
RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
{
- RelfilenumberMapKey key;
RelfilenumberMapEntry *entry;
bool found;
SysScanDesc scandesc;
Relation relation;
HeapTuple ntp;
- ScanKeyData skey[2];
+ ScanKeyData skey[1];
Oid relid;
if (RelfilenumberMapHash == NULL)
InitializeRelfilenumberMap();
- /* pg_class will show 0 when the value is actually MyDatabaseTableSpace */
- if (reltablespace == MyDatabaseTableSpace)
- reltablespace = 0;
-
- MemSet(&key, 0, sizeof(key));
- key.reltablespace = reltablespace;
- key.relfilenumber = relfilenumber;
-
/*
* Check cache and return entry if one is found. Even if no target
* relation can be found later on we store the negative match and return a
@@ -164,7 +143,8 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
* since querying invalid values isn't supposed to be a frequent thing,
* but it's basically free.
*/
- entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_FIND, &found);
+ entry = hash_search(RelfilenumberMapHash, (void *) &relfilenumber,
+ HASH_FIND, &found);
if (found)
return entry->relid;
@@ -195,14 +175,13 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
memcpy(skey, relfilenumber_skey, sizeof(skey));
/* set scan arguments */
- skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
- skey[1].sk_argument = ObjectIdGetDatum(relfilenumber);
+ skey[0].sk_argument = Int64GetDatum(relfilenumber);
scandesc = systable_beginscan(relation,
- ClassTblspcRelfilenodeIndexId,
+ ClassRelfilenodeIndexId,
true,
NULL,
- 2,
+ 1,
skey);
found = false;
@@ -213,11 +192,10 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
if (found)
elog(ERROR,
- "unexpected duplicate for tablespace %u, relfilenumber %u",
- reltablespace, relfilenumber);
+ "unexpected duplicate for relfilenumber " INT64_FORMAT,
+ relfilenumber);
found = true;
- Assert(classform->reltablespace == reltablespace);
Assert(classform->relfilenode == relfilenumber);
relid = classform->oid;
}
@@ -235,7 +213,8 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
* caused cache invalidations to be executed which would have deleted a
* new entry if we had entered it above.
*/
- entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_ENTER, &found);
+ entry = hash_search(RelfilenumberMapHash, (void *) &relfilenumber,
+ HASH_ENTER, &found);
if (found)
elog(ERROR, "corrupted hashtable");
entry->relid = relid;
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index 781f8b8..3c1fef4 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -79,8 +79,8 @@ pg_control_system(PG_FUNCTION_ARGS)
Datum
pg_control_checkpoint(PG_FUNCTION_ARGS)
{
- Datum values[18];
- bool nulls[18];
+ Datum values[19];
+ bool nulls[19];
TupleDesc tupdesc;
HeapTuple htup;
ControlFileData *ControlFile;
@@ -129,6 +129,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
XIDOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 18, "checkpoint_time",
TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 19, "next_relfilenumber",
+ INT8OID, -1, 0);
tupdesc = BlessTupleDesc(tupdesc);
/* Read the control file. */
@@ -202,6 +204,9 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
values[17] = TimestampTzGetDatum(time_t_to_timestamptz(ControlFile->checkPointCopy.time));
nulls[17] = false;
+ values[18] = Int64GetDatum(ControlFile->checkPointCopy.nextRelFileNumber);
+ nulls[18] = false;
+
htup = heap_form_tuple(tupdesc, values, nulls);
PG_RETURN_DATUM(HeapTupleGetDatum(htup));
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index dc20122..30933fd 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -489,9 +489,7 @@ main(int argc, char *argv[])
mode = PG_MODE_ENABLE;
break;
case 'f':
- if (!option_parse_int(optarg, "-f/--filenode", 0,
- INT_MAX,
- NULL))
+ if (!option_parse_relfilenumber(optarg, "-f/--filenode"))
exit(1);
only_filenode = pstrdup(optarg);
break;
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index c390ec5..f727078 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -250,6 +250,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
EpochFromFullTransactionId(ControlFile->checkPointCopy.nextXid),
XidFromFullTransactionId(ControlFile->checkPointCopy.nextXid));
+ printf(_("Latest checkpoint's NextRelFileNumber: " INT64_FORMAT "\n"),
+ ControlFile->checkPointCopy.nextRelFileNumber);
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index f9c51d1..0db33ef 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3142,9 +3142,9 @@ dumpDatabase(Archive *fout)
PQExpBuffer loFrozenQry = createPQExpBuffer();
PQExpBuffer loOutQry = createPQExpBuffer();
int i_relfrozenxid,
i_relfilenode,
i_oid,
i_relminmxid;
/*
* pg_largeobject
@@ -3170,11 +3170,11 @@ dumpDatabase(Archive *fout)
appendPQExpBufferStr(loOutQry, "\n-- For binary upgrade, preserve values for pg_largeobject and its index\n");
for (int i = 0; i < PQntuples(lo_res); ++i)
appendPQExpBuffer(loOutQry, "UPDATE pg_catalog.pg_class\n"
- "SET relfrozenxid = '%u', relminmxid = '%u', relfilenode = '%u'\n"
+ "SET relfrozenxid = '%u', relminmxid = '%u', relfilenode = " INT64_FORMAT "\n"
"WHERE oid = %u;\n",
atooid(PQgetvalue(lo_res, i, i_relfrozenxid)),
atooid(PQgetvalue(lo_res, i, i_relminmxid)),
- atooid(PQgetvalue(lo_res, i, i_relfilenode)),
+ atorelnumber(PQgetvalue(lo_res, i, i_relfilenode)),
atooid(PQgetvalue(lo_res, i, i_oid)));
ArchiveEntry(fout, nilCatalogId, createDumpId(),
@@ -4853,16 +4853,16 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
relkind = *PQgetvalue(upgrade_res, 0, PQfnumber(upgrade_res, "relkind"));
- relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "relfilenode")));
+ relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "relfilenode")));
toast_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "reltoastrelid")));
- toast_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_relfilenode")));
+ toast_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_relfilenode")));
toast_index_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "indexrelid")));
- toast_index_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_index_relfilenode")));
+ toast_index_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_index_relfilenode")));
appendPQExpBufferStr(upgrade_buffer,
"\n-- For binary upgrade, must preserve pg_class oids and relfilenodes\n");
@@ -4880,7 +4880,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
*/
if (RelFileNumberIsValid(relfilenumber) && relkind != RELKIND_PARTITIONED_TABLE)
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
/*
@@ -4894,7 +4894,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_toast_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_relfilenumber);
/* every toast table has an index */
@@ -4902,7 +4902,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_index_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_index_relfilenumber);
}
@@ -4915,7 +4915,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
pg_class_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
}
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 269ed64..8be5e66 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -538,7 +538,7 @@ isRelDataFile(const char *path)
segNo = 0;
matched = false;
- nmatch = sscanf(path, "global/%u.%u", &rlocator.relNumber, &segNo);
+ nmatch = sscanf(path, "global/" INT64_FORMAT ".%u", &rlocator.relNumber, &segNo);
if (nmatch == 1 || nmatch == 2)
{
rlocator.spcOid = GLOBALTABLESPACE_OID;
@@ -547,7 +547,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "base/%u/%u.%u",
+ nmatch = sscanf(path, "base/%u/" INT64_FORMAT ".%u",
&rlocator.dbOid, &rlocator.relNumber, &segNo);
if (nmatch == 2 || nmatch == 3)
{
@@ -556,7 +556,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/%u.%u",
+ nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/" INT64_FORMAT ".%u",
&rlocator.spcOid, &rlocator.dbOid, &rlocator.relNumber,
&segNo);
if (nmatch == 3 || nmatch == 4)
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index df374ce..e97ba4d 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -399,8 +399,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
i_reloid,
i_indtable,
i_toastheap,
i_relfilenumber,
i_reltablespace;
char query[QUERY_ALLOC];
char *last_namespace = NULL,
*last_tablespace = NULL;
@@ -527,7 +527,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
relname = PQgetvalue(res, relnum, i_relname);
curr->relname = pg_strdup(relname);
- curr->relfilenumber = atooid(PQgetvalue(res, relnum, i_relfilenumber));
+ curr->relfilenumber =
+ atorelnumber(PQgetvalue(res, relnum, i_relfilenumber));
curr->tblsp_alloc = false;
/* Is the tablespace oid non-default? */
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index 115faa2..7ab1bcc 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -15,10 +15,8 @@
* oids are the same between old and new clusters. This is important
* because toast oids are stored as toast pointers in user tables.
*
- * While pg_class.oid and pg_class.relfilenode are initially the same in a
- * cluster, they can diverge due to CLUSTER, REINDEX, or VACUUM FULL. We
- * control assignments of pg_class.relfilenode because we want the filenames
- * to match between the old and new cluster.
+ * We control assignments of pg_class.relfilenode because we want the
+ * filenames to match between the old and new cluster.
*
* We control assignment of pg_tablespace.oid because we want the oid to match
* between the old and new cluster.
diff --git a/src/bin/pg_upgrade/relfilenumber.c b/src/bin/pg_upgrade/relfilenumber.c
index c3c7402..c8b10e4 100644
--- a/src/bin/pg_upgrade/relfilenumber.c
+++ b/src/bin/pg_upgrade/relfilenumber.c
@@ -190,14 +190,14 @@ transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_fro
else
snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
- snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
+ snprintf(old_file, sizeof(old_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->old_tablespace,
map->old_tablespace_suffix,
map->db_oid,
map->relfilenumber,
type_suffix,
extent_suffix);
- snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
+ snprintf(new_file, sizeof(new_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->new_tablespace,
map->new_tablespace_suffix,
map->db_oid,
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 6528113..9aca583 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -884,7 +884,7 @@ main(int argc, char **argv)
}
break;
case 'R':
- if (sscanf(optarg, "%u/%u/%u",
+ if (sscanf(optarg, "%u/%u/" INT64_FORMAT,
&config.filter_by_relation.spcOid,
&config.filter_by_relation.dbOid,
&config.filter_by_relation.relNumber) != 3 ||
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 1b6b620..0774e3f 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -149,10 +149,10 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
Assert(dbOid == 0);
Assert(backendId == InvalidBackendId);
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("global/%u_%s",
+ path = psprintf("global/" INT64_FORMAT "_%s",
relNumber, forkNames[forkNumber]);
else
- path = psprintf("global/%u", relNumber);
+ path = psprintf("global/" INT64_FORMAT, relNumber);
}
else if (spcOid == DEFAULTTABLESPACE_OID)
{
@@ -160,21 +160,21 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/%u_%s",
+ path = psprintf("base/%u/" INT64_FORMAT "_%s",
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/%u",
+ path = psprintf("base/%u/" INT64_FORMAT,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/t%d_%u_%s",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT "_%s",
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/t%d_%u",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT,
dbOid, backendId, relNumber);
}
}
@@ -184,24 +184,24 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber);
}
diff --git a/src/fe_utils/option_utils.c b/src/fe_utils/option_utils.c
index abea881..7d49359 100644
--- a/src/fe_utils/option_utils.c
+++ b/src/fe_utils/option_utils.c
@@ -82,3 +82,42 @@ option_parse_int(const char *optarg, const char *optname,
*result = val;
return true;
}
+
+/*
+ * option_parse_relfilenumber
+ *
+ * Parse a relfilenumber value for an option.  If parsing is successful,
+ * returns true; if parsing fails, reports an error and returns false.
+ */
+bool
+option_parse_relfilenumber(const char *optarg, const char *optname)
+{
+ char *endptr;
+ int64 val;
+
+ errno = 0;
+ val = strtoi64(optarg, &endptr, 10);
+
+ /*
+ * Skip any trailing whitespace; if anything but whitespace remains before
+ * the terminating character, fail.
+ */
+ while (*endptr != '\0' && isspace((unsigned char) *endptr))
+ endptr++;
+
+ if (*endptr != '\0')
+ {
+ pg_log_error("invalid value \"%s\" for option %s",
+ optarg, optname);
+ return false;
+ }
+
+ if (val < 0 || val > MAX_RELFILENUMBER)
+ {
+ pg_log_error("%s must be in range " INT64_FORMAT ".." INT64_FORMAT,
+ optname, (RelFileNumber) 0, MAX_RELFILENUMBER);
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 775471d..c62fbc2 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -196,6 +196,21 @@ FullTransactionIdAdvance(FullTransactionId *dest)
#define FirstUnpinnedObjectId 12000
#define FirstNormalObjectId 16384
+/* ----------
+ * RelFileNumber zero is InvalidRelFileNumber.
+ *
+ * For the system tables (OID < FirstNormalObjectId), the initial storage
+ * is created with a relfilenumber equal to the table's OID; any
+ * relfilenumber subsequently allocated by GetNewRelFileNumber() starts at
+ * 100000. Thus, when upgrading from an older cluster, the relation storage
+ * paths for user tables from the old cluster cannot conflict with the
+ * relation storage paths for system tables in the new cluster. (The new
+ * cluster must not contain any user tables while upgrading, so those need
+ * no consideration here.)
+ * ----------
+ */
+#define FirstNormalRelFileNumber ((RelFileNumber) 100000)
+
/*
* VariableCache is a data structure in shared memory that is used to track
* OID and XID assignment state. For largely historical reasons, there is
@@ -215,6 +230,14 @@ typedef struct VariableCacheData
uint32 oidCount; /* OIDs available before must do XLOG work */
/*
+ * These fields are protected by RelFileNumberGenLock.
+ */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber to assign */
+ RelFileNumber loggedRelFileNumber; /* last logged relfilenumber */
+ XLogRecPtr loggedRelFileNumberRecPtr; /* xlog record pointer w.r.t.
+ * loggedRelFileNumber */
+
+ /*
* These fields are protected by XidGenLock.
*/
FullTransactionId nextXid; /* next XID to assign */
@@ -293,6 +316,9 @@ extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
extern bool ForceTransactionIdLimitUpdate(void);
extern Oid GetNewObjectId(void);
+extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
+ char relpersistence);
+extern void SetNextRelFileNumber(RelFileNumber relnumber);
extern void StopGeneratingPinnedObjectIds(void);
#ifdef USE_ASSERT_CHECKING
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index cd674c3..78660f1 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -234,6 +234,8 @@ extern void CreateCheckPoint(int flags);
extern bool CreateRestartPoint(int flags);
extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN);
extern void XLogPutNextOid(Oid nextOid);
+extern void LogNextRelFileNumber(RelFileNumber nextrelnumber,
+ XLogRecPtr *prevrecptr);
extern XLogRecPtr XLogRestorePoint(const char *rpName);
extern void UpdateFullPageWrites(void);
extern void GetFullPageWriteInfo(XLogRecPtr *RedoRecPtr_p, bool *doPageWrites_p);
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index e1c85f9..b452530 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -38,8 +38,5 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
-extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
- Relation pg_class,
- char relpersistence);
#endif /* CATALOG_H */
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index e1f4eef..cacfc11 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -34,6 +34,13 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* oid */
Oid oid;
+ /* access method; 0 if not a table / index */
+ Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
+
+ /* identifier of physical storage file */
+ /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
+ int64 relfilenode BKI_DEFAULT(0);
+
/* class name */
NameData relname;
@@ -49,13 +56,6 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* class owner */
Oid relowner BKI_DEFAULT(POSTGRES) BKI_LOOKUP(pg_authid);
- /* access method; 0 if not a table / index */
- Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
-
- /* identifier of physical storage file */
- /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
- Oid relfilenode BKI_DEFAULT(0);
-
/* identifier of table space for relation (0 means default for database) */
Oid reltablespace BKI_DEFAULT(0) BKI_LOOKUP_OPT(pg_tablespace);
@@ -154,7 +154,7 @@ typedef FormData_pg_class *Form_pg_class;
DECLARE_UNIQUE_INDEX_PKEY(pg_class_oid_index, 2662, ClassOidIndexId, on pg_class using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_class_relname_nsp_index, 2663, ClassNameNspIndexId, on pg_class using btree(relname name_ops, relnamespace oid_ops));
-DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode oid_ops));
+DECLARE_INDEX(pg_class_relfilenode_index, 3455, ClassRelfilenodeIndexId, on pg_class using btree(relfilenode int8_ops));
#ifdef EXPOSE_TO_CLIENT_CODE
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 06368e2..096222f 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -41,6 +41,7 @@ typedef struct CheckPoint
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
FullTransactionId nextXid; /* next free transaction ID */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
@@ -78,6 +79,7 @@ typedef struct CheckPoint
#define XLOG_FPI 0xB0
/* 0xC0 is used in Postgres 9.5-11 */
#define XLOG_OVERWRITE_CONTRECORD 0xD0
+#define XLOG_NEXT_RELFILENUMBER 0xE0
/*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 2e41f4d..f6a5b49 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7321,11 +7321,11 @@
proname => 'pg_indexes_size', provolatile => 'v', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_indexes_size' },
{ oid => '2999', descr => 'filenode identifier of relation',
- proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'oid',
+ proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_relation_filenode' },
{ oid => '3454', descr => 'relation OID for filenode and tablespace',
proname => 'pg_filenode_relation', provolatile => 's',
- prorettype => 'regclass', proargtypes => 'oid oid',
+ prorettype => 'regclass', proargtypes => 'oid int8',
prosrc => 'pg_filenode_relation' },
{ oid => '3034', descr => 'file path of relation',
proname => 'pg_relation_filepath', provolatile => 's', prorettype => 'text',
@@ -11191,15 +11191,15 @@
prosrc => 'binary_upgrade_set_missing_value' },
{ oid => '4545', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_heap_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_heap_relfilenode' },
{ oid => '4546', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_index_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_index_relfilenode' },
{ oid => '4547', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_toast_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_toast_relfilenode' },
{ oid => '4548', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_pg_tablespace_oid', provolatile => 'v',
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index 3ab7132..f873306 100644
--- a/src/include/common/relpath.h
+++ b/src/include/common/relpath.h
@@ -29,6 +29,8 @@
/* Characters to allow for an OID in a relation path */
#define OIDCHARS 10 /* max chars printed by %u */
+/* Characters to allow for a RelFileNumber in a relation path */
+#define RELNUMBERCHARS 20 /* max chars printed by INT64_FORMAT */
/*
* Stuff for fork names.
*
diff --git a/src/include/fe_utils/option_utils.h b/src/include/fe_utils/option_utils.h
index 03c09fd..2508a61 100644
--- a/src/include/fe_utils/option_utils.h
+++ b/src/include/fe_utils/option_utils.h
@@ -22,5 +22,7 @@ extern void handle_help_version_opts(int argc, char *argv[],
extern bool option_parse_int(const char *optarg, const char *optname,
int min_range, int max_range,
int *result);
+extern bool option_parse_relfilenumber(const char *optarg,
+ const char *optname);
#endif /* OPTION_UTILS_H */
diff --git a/src/include/postgres_ext.h b/src/include/postgres_ext.h
index c9774fa..12b6b80 100644
--- a/src/include/postgres_ext.h
+++ b/src/include/postgres_ext.h
@@ -48,11 +48,17 @@ typedef PG_INT64_TYPE pg_int64;
/*
* RelFileNumber data type identifies the specific relation file name.
+ * RelFileNumber is unique within a cluster.
*/
-typedef Oid RelFileNumber;
-#define InvalidRelFileNumber ((RelFileNumber) InvalidOid)
+typedef pg_int64 RelFileNumber;
+
+#define InvalidRelFileNumber ((RelFileNumber) 0)
#define RelFileNumberIsValid(relnumber) \
((bool) ((relnumber) != InvalidRelFileNumber))
+#define atorelnumber(x) ((RelFileNumber) strtoull((x), NULL, 10))
+
+/* Maximum value of the relfilenumber; relfilenumbers are 56 bits wide. */
+#define MAX_RELFILENUMBER INT64CONST(0x00FFFFFFFFFFFFFF)
/*
* Identifiers of error message fields. Kept here to keep common
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 092e959..70c33d1 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -92,29 +92,73 @@ typedef struct buftag
{
Oid spcOid; /* tablespace oid */
Oid dbOid; /* database oid */
- RelFileNumber relNumber; /* relation file number */
- ForkNumber forkNum; /* fork number */
+
+ /*
+ * Represents the relfilenumber and the fork number. The high 8 bits of
+ * the first 32-bit integer hold the fork number; the remaining 24 bits
+ * of the first integer plus all 32 bits of the second hold the
+ * relfilenumber, which is therefore 56 bits wide. It is 56 bits rather
+ * than a full 64 because a 64-bit field would increase the size of the
+ * BufferTag. Two 32-bit integers are used instead of a single 64-bit one
+ * to avoid 8-byte alignment padding in the BufferTag structure.
+ */
+ uint32 relForkDetails[2];
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
+/* High relNumber bits in relForkDetails[0] */
+#define BUFTAG_RELNUM_HIGH_BITS 24
+
+/* Low relNumber bits in relForkDetails[1] */
+#define BUFTAG_RELNUM_LOW_BITS 32
+
+/* Mask to fetch high bits of relNumber from relForkDetails[0] */
+#define BUFTAG_RELNUM_HIGH_MASK ((1U << BUFTAG_RELNUM_HIGH_BITS) - 1)
+
+/* Mask to fetch low bits of relNumber from relForkDetails[1] */
+#define BUFTAG_RELNUM_LOW_MASK 0XFFFFFFFF
+
static inline RelFileNumber
BufTagGetRelNumber(const BufferTag *tag)
{
- return tag->relNumber;
+ uint64 relnum;
+
+ relnum = ((uint64) tag->relForkDetails[0]) & BUFTAG_RELNUM_HIGH_MASK;
+ relnum = (relnum << BUFTAG_RELNUM_LOW_BITS) | tag->relForkDetails[1];
+
+ Assert(relnum <= MAX_RELFILENUMBER);
+ return (RelFileNumber) relnum;
}
static inline ForkNumber
BufTagGetForkNum(const BufferTag *tag)
{
- return tag->forkNum;
+ int8 ret;
+
+ StaticAssertStmt(MAX_FORKNUM <= INT8_MAX,
+ "MAX_FORKNUM can't be greater than INT8_MAX");
+
+ ret = (int8) (tag->relForkDetails[0] >> BUFTAG_RELNUM_HIGH_BITS);
+ return (ForkNumber) ret;
}
static inline void
BufTagSetRelForkDetails(BufferTag *tag, RelFileNumber relnumber,
ForkNumber forknum)
{
- tag->relNumber = relnumber;
- tag->forkNum = forknum;
+ uint64 relnum;
+
+ Assert(relnumber <= MAX_RELFILENUMBER);
+ Assert(forknum <= MAX_FORKNUM);
+
+ relnum = relnumber;
+
+ tag->relForkDetails[0] = (relnum >> BUFTAG_RELNUM_LOW_BITS) &
+ BUFTAG_RELNUM_HIGH_MASK;
+ tag->relForkDetails[0] |= (forknum << BUFTAG_RELNUM_HIGH_BITS);
+ tag->relForkDetails[1] = relnum & BUFTAG_RELNUM_LOW_MASK;
}
static inline RelFileLocator
@@ -153,9 +197,9 @@ BUFFERTAGS_EQUAL(const BufferTag *tag1, const BufferTag *tag2)
{
return (tag1->spcOid == tag2->spcOid) &&
(tag1->dbOid == tag2->dbOid) &&
- (tag1->relNumber == tag2->relNumber) &&
- (tag1->blockNum == tag2->blockNum) &&
- (tag1->forkNum == tag2->forkNum);
+ (tag1->relForkDetails[0] == tag2->relForkDetails[0]) &&
+ (tag1->relForkDetails[1] == tag2->relForkDetails[1]) &&
+ (tag1->blockNum == tag2->blockNum);
}
static inline bool
diff --git a/src/include/storage/reinit.h b/src/include/storage/reinit.h
index bf2c10d..b990d28 100644
--- a/src/include/storage/reinit.h
+++ b/src/include/storage/reinit.h
@@ -20,7 +20,8 @@
extern void ResetUnloggedRelations(int op);
extern bool parse_filename_for_nontemp_relation(const char *name,
- int *oidchars, ForkNumber *fork);
+ int *relnumchars,
+ ForkNumber *fork);
#define UNLOGGED_RELATION_CLEANUP 0x0001
#define UNLOGGED_RELATION_INIT 0x0002
diff --git a/src/include/storage/relfilelocator.h b/src/include/storage/relfilelocator.h
index 10f41f3..ef90464 100644
--- a/src/include/storage/relfilelocator.h
+++ b/src/include/storage/relfilelocator.h
@@ -32,10 +32,11 @@
* Nonzero dbOid values correspond to pg_database.oid.
*
* relNumber identifies the specific relation. relNumber corresponds to
- * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
- * to assign new physical files to relations in some situations).
- * Notice that relNumber is only unique within a database in a particular
- * tablespace.
+ * pg_class.relfilenode. Notice that relNumber values are assigned by
+ * GetNewRelFileNumber(), which will only ever assign the same value once
+ * during the lifetime of a cluster. However, since CREATE DATABASE duplicates
+ * the relfilenumbers of the template database, the values are in practice only
+ * unique within a database, not globally.
*
* Note: spcOid must be GLOBALTABLESPACE_OID if and only if dbOid is
* zero. We support shared relations only in the "global" tablespace.
@@ -75,6 +76,9 @@ typedef struct RelFileLocatorBackend
BackendId backend;
} RelFileLocatorBackend;
+#define SizeOfRelFileLocatorBackend \
+ (offsetof(RelFileLocatorBackend, backend) + sizeof(BackendId))
+
#define RelFileLocatorBackendIsTemp(rlocator) \
((rlocator).backend != InvalidBackendId)
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index d63f4f1..a489ccc 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -2164,9 +2164,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2175,10 +2174,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+---------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | t | own | child 0 index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | t | own | child 1 index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | t | orig | child 0 index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | t | orig | child 1 index
at_partitioned_id_name_key | t | none | parent index
(6 rows)
@@ -2198,9 +2197,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2209,10 +2207,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+--------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | f | own | parent index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | f | own | parent index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | f | new | parent index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | f | new | parent index
at_partitioned_id_name_key | f | none | parent index
(6 rows)
@@ -2560,7 +2558,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/expected/fast_default.out b/src/test/regress/expected/fast_default.out
index 91f2571..0a35f33 100644
--- a/src/test/regress/expected/fast_default.out
+++ b/src/test/regress/expected/fast_default.out
@@ -3,8 +3,8 @@
--
SET search_path = fast_default;
CREATE SCHEMA fast_default;
-CREATE TABLE m(id OID);
-INSERT INTO m VALUES (NULL::OID);
+CREATE TABLE m(id BIGINT);
+INSERT INTO m VALUES (NULL::BIGINT);
CREATE FUNCTION set(tabname name) RETURNS VOID
AS $$
BEGIN
diff --git a/src/test/regress/expected/oidjoins.out b/src/test/regress/expected/oidjoins.out
index 215eb89..af57470 100644
--- a/src/test/regress/expected/oidjoins.out
+++ b/src/test/regress/expected/oidjoins.out
@@ -74,11 +74,11 @@ NOTICE: checking pg_type {typcollation} => pg_collation {oid}
NOTICE: checking pg_attribute {attrelid} => pg_class {oid}
NOTICE: checking pg_attribute {atttypid} => pg_type {oid}
NOTICE: checking pg_attribute {attcollation} => pg_collation {oid}
+NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {relnamespace} => pg_namespace {oid}
NOTICE: checking pg_class {reltype} => pg_type {oid}
NOTICE: checking pg_class {reloftype} => pg_type {oid}
NOTICE: checking pg_class {relowner} => pg_authid {oid}
-NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {reltablespace} => pg_tablespace {oid}
NOTICE: checking pg_class {reltoastrelid} => pg_class {oid}
NOTICE: checking pg_class {relrewrite} => pg_class {oid}
diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql
index e7013f5..c9622f6 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -1478,9 +1478,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1499,9 +1498,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1641,7 +1639,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/sql/fast_default.sql b/src/test/regress/sql/fast_default.sql
index 16a3b7c..819ec40 100644
--- a/src/test/regress/sql/fast_default.sql
+++ b/src/test/regress/sql/fast_default.sql
@@ -4,8 +4,8 @@
SET search_path = fast_default;
CREATE SCHEMA fast_default;
-CREATE TABLE m(id OID);
-INSERT INTO m VALUES (NULL::OID);
+CREATE TABLE m(id BIGINT);
+INSERT INTO m VALUES (NULL::BIGINT);
CREATE FUNCTION set(tabname name) RETURNS VOID
AS $$
--
1.8.3.1
On Wed, Jul 27, 2022 at 12:07 AM Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Jul 26, 2022 at 2:07 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > I have thought about it while doing so but I am not sure whether it is
> > a good idea or not, because before my change these all were macros
> > with 2 naming conventions so I just changed to inline function so why
> > to change the name.
>
> Well, the reason to change the name would be for consistency. It feels
> weird to have some NAMES_LIKETHIS() and other NamesLikeThis().
>
> Now, an argument against that is that it will make back-patching more
> annoying, if any code using these functions/macros is touched. But
> since the calling sequence is changing anyway (you now have to pass a
> pointer rather than the object itself) that argument doesn't really
> carry any weight. So I would favor ClearBufferTag(), InitBufferTag(),
> etc.
Okay, so I have renamed these 2 functions and BUFFERTAGS_EQUAL as well
to BufferTagEqual().
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachments:
v13-0001-Convert-buf_internal.h-macros-to-static-inline-f.patchtext/x-patch; charset=US-ASCII; name=v13-0001-Convert-buf_internal.h-macros-to-static-inline-f.patchDownload
From b45303d19db19e93e723d07e91371f19e86386db Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Tue, 12 Jul 2022 17:10:04 +0530
Subject: [PATCH v13 1/3] Convert buf_internal.h macros to static inline
functions
Inline functions are more readable than macros, and they will also help
us write cleaner code for the 64-bit relfilenode work: that patch has to
perform some complex bitwise operations, which are easier to follow when
done inside an inline function.
---
src/backend/storage/buffer/buf_init.c | 2 +-
src/backend/storage/buffer/bufmgr.c | 16 ++--
src/backend/storage/buffer/localbuf.c | 12 +--
src/include/storage/buf_internals.h | 158 +++++++++++++++++++++-------------
4 files changed, 112 insertions(+), 76 deletions(-)
diff --git a/src/backend/storage/buffer/buf_init.c b/src/backend/storage/buffer/buf_init.c
index 2862e9e..6b62648 100644
--- a/src/backend/storage/buffer/buf_init.c
+++ b/src/backend/storage/buffer/buf_init.c
@@ -116,7 +116,7 @@ InitBufferPool(void)
{
BufferDesc *buf = GetBufferDescriptor(i);
- CLEAR_BUFFERTAG(buf->tag);
+ ClearBufferTag(&buf->tag);
pg_atomic_init_u32(&buf->state, 0);
buf->wait_backend_pgprocno = INVALID_PGPROCNO;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index b7488b5..ecd1a10 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -515,7 +515,7 @@ PrefetchSharedBuffer(SMgrRelation smgr_reln,
Assert(BlockNumberIsValid(blockNum));
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr_reln->smgr_rlocator.locator,
+ InitBufferTag(&newTag, &smgr_reln->smgr_rlocator.locator,
forkNum, blockNum);
/* determine its hash code and partition lock ID */
@@ -632,7 +632,7 @@ ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockN
ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
ReservePrivateRefCountEntry();
- INIT_BUFFERTAG(tag, rlocator, forkNum, blockNum);
+ InitBufferTag(&tag, &rlocator, forkNum, blockNum);
if (BufferIsLocal(recent_buffer))
{
@@ -642,7 +642,7 @@ ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockN
buf_state = pg_atomic_read_u32(&bufHdr->state);
/* Is it still valid and holding the right tag? */
- if ((buf_state & BM_VALID) && BUFFERTAGS_EQUAL(tag, bufHdr->tag))
+ if ((buf_state & BM_VALID) && BufferTagEqual(&tag, &bufHdr->tag))
{
/*
* Bump buffer's ref and usage counts. This is equivalent of
@@ -679,7 +679,7 @@ ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockN
else
buf_state = LockBufHdr(bufHdr);
- if ((buf_state & BM_VALID) && BUFFERTAGS_EQUAL(tag, bufHdr->tag))
+ if ((buf_state & BM_VALID) && BufferTagEqual(&tag, &bufHdr->tag))
{
/*
* It's now safe to pin the buffer. We can't pin first and ask
@@ -1134,7 +1134,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
+ InitBufferTag(&newTag, &smgr->smgr_rlocator.locator, forkNum, blockNum);
/* determine its hash code and partition lock ID */
newHash = BufTableHashCode(&newTag);
@@ -1517,7 +1517,7 @@ retry:
buf_state = LockBufHdr(buf);
/* If it's changed while we were waiting for lock, do nothing */
- if (!BUFFERTAGS_EQUAL(buf->tag, oldTag))
+ if (!BufferTagEqual(&buf->tag, &oldTag))
{
UnlockBufHdr(buf, buf_state);
LWLockRelease(oldPartitionLock);
@@ -1549,7 +1549,7 @@ retry:
* linear scans of the buffer array don't think the buffer is valid.
*/
oldFlags = buf_state & BUF_FLAG_MASK;
- CLEAR_BUFFERTAG(buf->tag);
+ ClearBufferTag(&buf->tag);
buf_state &= ~(BUF_FLAG_MASK | BUF_USAGECOUNT_MASK);
UnlockBufHdr(buf, buf_state);
@@ -3365,7 +3365,7 @@ FindAndDropRelationBuffers(RelFileLocator rlocator, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(bufTag, rlocator, forkNum, curBlock);
+ InitBufferTag(&bufTag, &rlocator, forkNum, curBlock);
/* determine its hash code and partition lock ID */
bufHash = BufTableHashCode(&bufTag);
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index 9c03885..09c2f09 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -68,7 +68,7 @@ PrefetchLocalBuffer(SMgrRelation smgr, ForkNumber forkNum,
BufferTag newTag; /* identity of requested block */
LocalBufferLookupEnt *hresult;
- INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
+ InitBufferTag(&newTag, &smgr->smgr_rlocator.locator, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -117,7 +117,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
bool found;
uint32 buf_state;
- INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
+ InitBufferTag(&newTag, &smgr->smgr_rlocator.locator, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -131,7 +131,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
{
b = hresult->id;
bufHdr = GetLocalBufferDescriptor(b);
- Assert(BUFFERTAGS_EQUAL(bufHdr->tag, newTag));
+ Assert(BufferTagEqual(&bufHdr->tag, &newTag));
#ifdef LBDEBUG
fprintf(stderr, "LB ALLOC (%u,%d,%d) %d\n",
smgr->smgr_rlocator.locator.relNumber, forkNum, blockNum, -b - 1);
@@ -253,7 +253,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
if (!hresult) /* shouldn't happen */
elog(ERROR, "local buffer hash table corrupted");
/* mark buffer invalid just in case hash insert fails */
- CLEAR_BUFFERTAG(bufHdr->tag);
+ ClearBufferTag(&bufHdr->tag);
buf_state &= ~(BM_VALID | BM_TAG_VALID);
pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
}
@@ -354,7 +354,7 @@ DropRelationLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
if (!hresult) /* shouldn't happen */
elog(ERROR, "local buffer hash table corrupted");
/* Mark buffer invalid */
- CLEAR_BUFFERTAG(bufHdr->tag);
+ ClearBufferTag(&bufHdr->tag);
buf_state &= ~BUF_FLAG_MASK;
buf_state &= ~BUF_USAGECOUNT_MASK;
pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
@@ -398,7 +398,7 @@ DropRelationAllLocalBuffers(RelFileLocator rlocator)
if (!hresult) /* shouldn't happen */
elog(ERROR, "local buffer hash table corrupted");
/* Mark buffer invalid */
- CLEAR_BUFFERTAG(bufHdr->tag);
+ ClearBufferTag(&bufHdr->tag);
buf_state &= ~BUF_FLAG_MASK;
buf_state &= ~BUF_USAGECOUNT_MASK;
pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 69e4590..0f6bc46 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -85,7 +85,7 @@
* relation is visible yet (its xact may have started before the xact that
* created the rel). The storage manager must be able to cope anyway.
*
- * Note: if there's any pad bytes in the struct, INIT_BUFFERTAG will have
+ * Note: if there's any pad bytes in the struct, InitBufferTag will have
* to be fixed to zero them, since this struct is used as a hash key.
*/
typedef struct buftag
@@ -95,28 +95,32 @@ typedef struct buftag
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
-#define CLEAR_BUFFERTAG(a) \
-( \
- (a).rlocator.spcOid = InvalidOid, \
- (a).rlocator.dbOid = InvalidOid, \
- (a).rlocator.relNumber = InvalidRelFileNumber, \
- (a).forkNum = InvalidForkNumber, \
- (a).blockNum = InvalidBlockNumber \
-)
-
-#define INIT_BUFFERTAG(a,xx_rlocator,xx_forkNum,xx_blockNum) \
-( \
- (a).rlocator = (xx_rlocator), \
- (a).forkNum = (xx_forkNum), \
- (a).blockNum = (xx_blockNum) \
-)
-
-#define BUFFERTAGS_EQUAL(a,b) \
-( \
- RelFileLocatorEquals((a).rlocator, (b).rlocator) && \
- (a).blockNum == (b).blockNum && \
- (a).forkNum == (b).forkNum \
-)
+static inline void
+ClearBufferTag(BufferTag *tag)
+{
+ tag->rlocator.spcOid = InvalidOid;
+ tag->rlocator.dbOid = InvalidOid;
+ tag->rlocator.relNumber = InvalidRelFileNumber;
+ tag->forkNum = InvalidForkNumber;
+ tag->blockNum = InvalidBlockNumber;
+}
+
+static inline void
+InitBufferTag(BufferTag *tag, const RelFileLocator *rlocator,
+ ForkNumber forkNum, BlockNumber blockNum)
+{
+ tag->rlocator = *rlocator;
+ tag->forkNum = forkNum;
+ tag->blockNum = blockNum;
+}
+
+static inline bool
+BufferTagEqual(const BufferTag *tag1, const BufferTag *tag2)
+{
+ return RelFileLocatorEquals(tag1->rlocator, tag2->rlocator) &&
+ (tag1->blockNum == tag2->blockNum) &&
+ (tag1->forkNum == tag2->forkNum);
+}
/*
* The shared buffer mapping table is partitioned to reduce contention.
@@ -124,13 +128,24 @@ typedef struct buftag
* hash code with BufTableHashCode(), then apply BufMappingPartitionLock().
* NB: NUM_BUFFER_PARTITIONS must be a power of 2!
*/
-#define BufTableHashPartition(hashcode) \
- ((hashcode) % NUM_BUFFER_PARTITIONS)
-#define BufMappingPartitionLock(hashcode) \
- (&MainLWLockArray[BUFFER_MAPPING_LWLOCK_OFFSET + \
- BufTableHashPartition(hashcode)].lock)
-#define BufMappingPartitionLockByIndex(i) \
- (&MainLWLockArray[BUFFER_MAPPING_LWLOCK_OFFSET + (i)].lock)
+static inline uint32
+BufTableHashPartition(uint32 hashcode)
+{
+ return hashcode % NUM_BUFFER_PARTITIONS;
+}
+
+static inline LWLock *
+BufMappingPartitionLock(uint32 hashcode)
+{
+ return &MainLWLockArray[BUFFER_MAPPING_LWLOCK_OFFSET +
+ BufTableHashPartition(hashcode)].lock;
+}
+
+static inline LWLock *
+BufMappingPartitionLockByIndex(uint32 index)
+{
+ return &MainLWLockArray[BUFFER_MAPPING_LWLOCK_OFFSET + index].lock;
+}
/*
* BufferDesc -- shared descriptor/state data for a single shared buffer.
@@ -220,37 +235,6 @@ typedef union BufferDescPadded
char pad[BUFFERDESC_PAD_TO_SIZE];
} BufferDescPadded;
-#define GetBufferDescriptor(id) (&BufferDescriptors[(id)].bufferdesc)
-#define GetLocalBufferDescriptor(id) (&LocalBufferDescriptors[(id)])
-
-#define BufferDescriptorGetBuffer(bdesc) ((bdesc)->buf_id + 1)
-
-#define BufferDescriptorGetIOCV(bdesc) \
- (&(BufferIOCVArray[(bdesc)->buf_id]).cv)
-#define BufferDescriptorGetContentLock(bdesc) \
- ((LWLock*) (&(bdesc)->content_lock))
-
-extern PGDLLIMPORT ConditionVariableMinimallyPadded *BufferIOCVArray;
-
-/*
- * The freeNext field is either the index of the next freelist entry,
- * or one of these special values:
- */
-#define FREENEXT_END_OF_LIST (-1)
-#define FREENEXT_NOT_IN_LIST (-2)
-
-/*
- * Functions for acquiring/releasing a shared buffer header's spinlock. Do
- * not apply these to local buffers!
- */
-extern uint32 LockBufHdr(BufferDesc *desc);
-#define UnlockBufHdr(desc, s) \
- do { \
- pg_write_barrier(); \
- pg_atomic_write_u32(&(desc)->state, (s) & (~BM_LOCKED)); \
- } while (0)
-
-
/*
* The PendingWriteback & WritebackContext structure are used to keep
* information about pending flush requests to be issued to the OS.
@@ -276,11 +260,63 @@ typedef struct WritebackContext
/* in buf_init.c */
extern PGDLLIMPORT BufferDescPadded *BufferDescriptors;
+extern PGDLLIMPORT ConditionVariableMinimallyPadded *BufferIOCVArray;
extern PGDLLIMPORT WritebackContext BackendWritebackContext;
/* in localbuf.c */
extern PGDLLIMPORT BufferDesc *LocalBufferDescriptors;
+
+static inline BufferDesc *
+GetBufferDescriptor(uint32 id)
+{
+ return &(BufferDescriptors[id]).bufferdesc;
+}
+
+static inline BufferDesc *
+GetLocalBufferDescriptor(uint32 id)
+{
+ return &LocalBufferDescriptors[id];
+}
+
+static inline Buffer
+BufferDescriptorGetBuffer(const BufferDesc *bdesc)
+{
+ return (Buffer) (bdesc->buf_id + 1);
+}
+
+static inline ConditionVariable *
+BufferDescriptorGetIOCV(const BufferDesc *bdesc)
+{
+ return &(BufferIOCVArray[bdesc->buf_id]).cv;
+}
+
+static inline LWLock *
+BufferDescriptorGetContentLock(const BufferDesc *bdesc)
+{
+ return (LWLock *) (&bdesc->content_lock);
+}
+
+/*
+ * The freeNext field is either the index of the next freelist entry,
+ * or one of these special values:
+ */
+#define FREENEXT_END_OF_LIST (-1)
+#define FREENEXT_NOT_IN_LIST (-2)
+
+/*
+ * Functions for acquiring/releasing a shared buffer header's spinlock. Do
+ * not apply these to local buffers!
+ */
+extern uint32 LockBufHdr(BufferDesc *desc);
+
+static inline void
+UnlockBufHdr(BufferDesc *desc, uint32 buf_state)
+{
+ pg_write_barrier();
+ pg_atomic_write_u32(&desc->state, buf_state & (~BM_LOCKED));
+}
+
/* in bufmgr.c */
/*
--
1.8.3.1
v13-0002-Preliminary-refactoring-for-supporting-larger.patchtext/x-patch; charset=US-ASCII; name=v13-0002-Preliminary-refactoring-for-supporting-larger.patchDownload
From 2310d7a85088e28086578d641e93ca0390d5c281 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Wed, 27 Jul 2022 21:27:20 +0530
Subject: [PATCH v13 2/3] Preliminary refactoring for supporting larger
relfilenumber
Currently, relfilenumber is of Oid type and can wrap around. As part of
the larger patch set we are making it 64 bits wide to avoid wraparound,
which will also simplify a couple of other things, as explained in the
next patches.
This is a preliminary refactoring patch toward that goal: in BufferTag,
instead of embedding a RelFileLocator, we store the tablespace Oid, the
database Oid, and the relfilenumber directly, so that once relNumber in
RelFileLocator grows to 64 bits the buffer tag's alignment padding will
not change.
---
contrib/pg_buffercache/pg_buffercache_pages.c | 8 +-
contrib/pg_prewarm/autoprewarm.c | 10 +-
src/backend/storage/buffer/bufmgr.c | 140 +++++++++++++++++---------
src/backend/storage/buffer/localbuf.c | 30 ++++--
src/include/storage/buf_internals.h | 64 ++++++++++--
5 files changed, 178 insertions(+), 74 deletions(-)
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 131bd62..c5754ea 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -153,10 +153,10 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenumber = bufHdr->tag.rlocator.relNumber;
- fctx->record[i].reltablespace = bufHdr->tag.rlocator.spcOid;
- fctx->record[i].reldatabase = bufHdr->tag.rlocator.dbOid;
- fctx->record[i].forknum = bufHdr->tag.forkNum;
+ fctx->record[i].relfilenumber = BufTagGetRelNumber(&bufHdr->tag);
+ fctx->record[i].reltablespace = bufHdr->tag.spcOid;
+ fctx->record[i].reldatabase = bufHdr->tag.dbOid;
+ fctx->record[i].forknum = BufTagGetForkNum(&bufHdr->tag);
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
fctx->record[i].pinning_backends = BUF_STATE_GET_REFCOUNT(buf_state);
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index b2d6026..63f0d41 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -630,10 +630,12 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
if (buf_state & BM_TAG_VALID &&
((buf_state & BM_PERMANENT) || dump_unlogged))
{
- block_info_array[num_blocks].database = bufHdr->tag.rlocator.dbOid;
- block_info_array[num_blocks].tablespace = bufHdr->tag.rlocator.spcOid;
- block_info_array[num_blocks].filenumber = bufHdr->tag.rlocator.relNumber;
- block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
+ block_info_array[num_blocks].database = bufHdr->tag.dbOid;
+ block_info_array[num_blocks].tablespace = bufHdr->tag.spcOid;
+ block_info_array[num_blocks].filenumber =
+ BufTagGetRelNumber(&bufHdr->tag);
+ block_info_array[num_blocks].forknum =
+ BufTagGetForkNum(&bufHdr->tag);
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
}
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index ecd1a10..80f05f5 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1657,8 +1657,8 @@ ReleaseAndReadBuffer(Buffer buffer,
{
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
- bufHdr->tag.forkNum == forkNum)
+ BufTagMatchesRelFileLocator(&bufHdr->tag, &relation->rd_locator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
LocalRefCount[-buffer - 1]--;
@@ -1668,8 +1668,8 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetBufferDescriptor(buffer - 1);
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
- bufHdr->tag.forkNum == forkNum)
+ BufTagMatchesRelFileLocator(&bufHdr->tag, &relation->rd_locator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
}
@@ -2010,9 +2010,9 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
- item->tsId = bufHdr->tag.rlocator.spcOid;
- item->relNumber = bufHdr->tag.rlocator.relNumber;
- item->forkNum = bufHdr->tag.forkNum;
+ item->tsId = bufHdr->tag.spcOid;
+ item->relNumber = BufTagGetRelNumber(&bufHdr->tag);
+ item->forkNum = BufTagGetForkNum(&bufHdr->tag);
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2702,6 +2702,7 @@ PrintBufferLeakWarning(Buffer buffer)
char *path;
BackendId backend;
uint32 buf_state;
+ RelFileLocator rlocator;
Assert(BufferIsValid(buffer));
if (BufferIsLocal(buffer))
@@ -2717,8 +2718,10 @@ PrintBufferLeakWarning(Buffer buffer)
backend = InvalidBackendId;
}
+ rlocator = BufTagGetRelFileLocator(&buf->tag);
+
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rlocator, backend, buf->tag.forkNum);
+ path = relpathbackend(rlocator, backend, BufTagGetForkNum(&buf->tag));
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2797,8 +2800,8 @@ BufferGetTag(Buffer buffer, RelFileLocator *rlocator, ForkNumber *forknum,
bufHdr = GetBufferDescriptor(buffer - 1);
/* pinned, so OK to read tag without spinlock */
- *rlocator = bufHdr->tag.rlocator;
- *forknum = bufHdr->tag.forkNum;
+ *rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
+ *forknum = BufTagGetForkNum(&bufHdr->tag);
*blknum = bufHdr->tag.blockNum;
}
@@ -2848,9 +2851,14 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rlocator, InvalidBackendId);
+ {
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&buf->tag);
+ reln = smgropen(rlocator, InvalidBackendId);
+ }
- TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
+ TRACE_POSTGRESQL_BUFFER_FLUSH_START(BufTagGetForkNum(&buf->tag),
buf->tag.blockNum,
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
@@ -2909,7 +2917,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
* bufToWrite is either the shared buffer or a copy, as appropriate.
*/
smgrwrite(reln,
- buf->tag.forkNum,
+ BufTagGetForkNum(&buf->tag),
buf->tag.blockNum,
bufToWrite,
false);
@@ -2930,7 +2938,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
*/
TerminateBufferIO(buf, true, 0);
- TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(buf->tag.forkNum,
+ TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(BufTagGetForkNum(&buf->tag),
buf->tag.blockNum,
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
@@ -3151,15 +3159,15 @@ DropRelationBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
* We could check forkNum and blockNum as well as the rlocator, but
* the incremental win from doing so seems small.
*/
- if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator))
+ if (!BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator.locator))
continue;
buf_state = LockBufHdr(bufHdr);
for (j = 0; j < nforks; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator) &&
- bufHdr->tag.forkNum == forkNum[j] &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator.locator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum[j] &&
bufHdr->tag.blockNum >= firstDelBlock[j])
{
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -3310,7 +3318,7 @@ DropRelationsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
for (j = 0; j < n; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, locators[j]))
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &locators[j]))
{
rlocator = &locators[j];
break;
@@ -3319,7 +3327,10 @@ DropRelationsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
}
else
{
- rlocator = bsearch((const void *) &(bufHdr->tag.rlocator),
+ RelFileLocator locator;
+
+ locator = BufTagGetRelFileLocator(&bufHdr->tag);
+ rlocator = bsearch((const void *) &(locator),
locators, n, sizeof(RelFileLocator),
rlocator_comparator);
}
@@ -3329,7 +3340,7 @@ DropRelationsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, (*rlocator)))
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, rlocator))
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3389,8 +3400,8 @@ FindAndDropRelationBuffers(RelFileLocator rlocator, ForkNumber forkNum,
*/
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
- bufHdr->tag.forkNum == forkNum &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
@@ -3428,11 +3439,11 @@ DropDatabaseBuffers(Oid dbid)
* As in DropRelationBuffers, an unlocked precheck should be
* safe and saves some cycles.
*/
- if (bufHdr->tag.rlocator.dbOid != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rlocator.dbOid == dbid)
+ if (bufHdr->tag.dbOid == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3456,13 +3467,16 @@ PrintBufferDescs(void)
{
BufferDesc *buf = GetBufferDescriptor(i);
Buffer b = BufferDescriptorGetBuffer(buf);
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&buf->tag);
/* theoretically we should lock the bufhdr here */
elog(LOG,
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathbackend(buf->tag.rlocator, InvalidBackendId, buf->tag.forkNum),
+ relpathbackend(rlocator, InvalidBackendId, BufTagGetForkNum(&buf->tag)),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3482,12 +3496,16 @@ PrintPinnedBufs(void)
if (GetPrivateRefCount(b) > 0)
{
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&buf->tag);
+
/* theoretically we should lock the bufhdr here */
elog(LOG,
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathperm(buf->tag.rlocator, buf->tag.forkNum),
+ relpathperm(rlocator, BufTagGetForkNum(&buf->tag)),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3526,7 +3544,7 @@ FlushRelationBuffers(Relation rel)
uint32 buf_state;
bufHdr = GetLocalBufferDescriptor(i);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rel->rd_locator) &&
((buf_state = pg_atomic_read_u32(&bufHdr->state)) &
(BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
@@ -3544,7 +3562,7 @@ FlushRelationBuffers(Relation rel)
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
smgrwrite(RelationGetSmgr(rel),
- bufHdr->tag.forkNum,
+ BufTagGetForkNum(&bufHdr->tag),
bufHdr->tag.blockNum,
localpage,
false);
@@ -3573,13 +3591,13 @@ FlushRelationBuffers(Relation rel)
* As in DropRelationBuffers, an unlocked precheck should be
* safe and saves some cycles.
*/
- if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator))
+ if (!BufTagMatchesRelFileLocator(&bufHdr->tag, &rel->rd_locator))
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rel->rd_locator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3653,7 +3671,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
for (j = 0; j < nrels; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, srels[j].rlocator))
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &srels[j].rlocator))
{
srelent = &srels[j];
break;
@@ -3662,7 +3680,10 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
}
else
{
- srelent = bsearch((const void *) &(bufHdr->tag.rlocator),
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
+ srelent = bsearch((const void *) &(rlocator),
srels, nrels, sizeof(SMgrSortArray),
rlocator_comparator);
}
@@ -3674,7 +3695,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, srelent->rlocator) &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &srelent->rlocator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3876,13 +3897,13 @@ FlushDatabaseBuffers(Oid dbid)
* As in DropRelationBuffers, an unlocked precheck should be
* safe and saves some cycles.
*/
- if (bufHdr->tag.rlocator.dbOid != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rlocator.dbOid == dbid &&
+ if (bufHdr->tag.dbOid == dbid &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -4042,6 +4063,10 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
if (XLogHintBitIsNeeded() &&
(pg_atomic_read_u32(&bufHdr->state) & BM_PERMANENT))
{
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
+
/*
* If we must not write WAL, due to a relfilelocator-specific
* condition or being in recovery, don't dirty the page. We can
@@ -4050,8 +4075,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
*
* See src/backend/storage/page/README for longer discussion.
*/
- if (RecoveryInProgress() ||
- RelFileLocatorSkippingWAL(bufHdr->tag.rlocator))
+ if (RecoveryInProgress() || RelFileLocatorSkippingWAL(rlocator))
return;
/*
@@ -4659,8 +4683,10 @@ AbortBufferIO(void)
{
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
+ RelFileLocator rlocator;
- path = relpathperm(buf->tag.rlocator, buf->tag.forkNum);
+ rlocator = BufTagGetRelFileLocator(&buf->tag);
+ path = relpathperm(rlocator, BufTagGetForkNum(&buf->tag));
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4684,7 +4710,11 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rlocator, bufHdr->tag.forkNum);
+ char *path;
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
+ path = relpathperm(rlocator, BufTagGetForkNum(&bufHdr->tag));
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4702,8 +4732,12 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
- char *path = relpathbackend(bufHdr->tag.rlocator, MyBackendId,
- bufHdr->tag.forkNum);
+ char *path;
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
+ path = relpathbackend(rlocator, MyBackendId,
+ BufTagGetForkNum(&bufHdr->tag));
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4797,15 +4831,20 @@ static inline int
buffertag_comparator(const BufferTag *ba, const BufferTag *bb)
{
int ret;
+ RelFileLocator rlocatora;
+ RelFileLocator rlocatorb;
+
+ rlocatora = BufTagGetRelFileLocator(ba);
+ rlocatorb = BufTagGetRelFileLocator(bb);
- ret = rlocator_comparator(&ba->rlocator, &bb->rlocator);
+ ret = rlocator_comparator(&rlocatora, &rlocatorb);
if (ret != 0)
return ret;
- if (ba->forkNum < bb->forkNum)
+ if (BufTagGetForkNum(ba) < BufTagGetForkNum(bb))
return -1;
- if (ba->forkNum > bb->forkNum)
+ if (BufTagGetForkNum(ba) > BufTagGetForkNum(bb))
return 1;
if (ba->blockNum < bb->blockNum)
@@ -4955,10 +4994,12 @@ IssuePendingWritebacks(WritebackContext *context)
SMgrRelation reln;
int ahead;
BufferTag tag;
+ RelFileLocator currlocator;
Size nblocks = 1;
cur = &context->pending_writebacks[i];
tag = cur->tag;
+ currlocator = BufTagGetRelFileLocator(&tag);
/*
* Peek ahead, into following writeback requests, to see if they can
@@ -4966,11 +5007,14 @@ IssuePendingWritebacks(WritebackContext *context)
*/
for (ahead = 0; i + ahead + 1 < context->nr_pending; ahead++)
{
+ RelFileLocator nextrlocator;
+
next = &context->pending_writebacks[i + ahead + 1];
+ nextrlocator = BufTagGetRelFileLocator(&next->tag);
/* different file, stop */
- if (!RelFileLocatorEquals(cur->tag.rlocator, next->tag.rlocator) ||
- cur->tag.forkNum != next->tag.forkNum)
+ if (!RelFileLocatorEquals(currlocator, nextrlocator) ||
+ BufTagGetForkNum(&cur->tag) != BufTagGetForkNum(&next->tag))
break;
/* ok, block queued twice, skip */
@@ -4988,8 +5032,8 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rlocator, InvalidBackendId);
- smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
+ reln = smgropen(currlocator, InvalidBackendId);
+ smgrwriteback(reln, BufTagGetForkNum(&tag), tag.blockNum, nblocks);
}
context->nr_pending = 0;
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index 09c2f09..61a4c6d 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -213,15 +213,18 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
{
SMgrRelation oreln;
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rlocator, MyBackendId);
+ oreln = smgropen(rlocator, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
/* And write... */
smgrwrite(oreln,
- bufHdr->tag.forkNum,
+ BufTagGetForkNum(&bufHdr->tag),
bufHdr->tag.blockNum,
localpage,
false);
@@ -337,16 +340,22 @@ DropRelationLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
- bufHdr->tag.forkNum == forkNum &&
+ BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
+ {
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rlocator, MyBackendId,
- bufHdr->tag.forkNum),
+ relpathbackend(rlocator, MyBackendId,
+ BufTagGetForkNum(&bufHdr->tag)),
LocalRefCount[i]);
+ }
+
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
hash_search(LocalBufHash, (void *) &bufHdr->tag,
@@ -383,13 +392,16 @@ DropRelationAllLocalBuffers(RelFileLocator rlocator)
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator))
+ BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator))
{
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rlocator, MyBackendId,
- bufHdr->tag.forkNum),
+ relpathbackend(rlocator, MyBackendId,
+ BufTagGetForkNum(&bufHdr->tag)),
LocalRefCount[i]);
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 0f6bc46..5612ca3 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,18 +90,51 @@
*/
typedef struct buftag
{
- RelFileLocator rlocator; /* physical relation identifier */
- ForkNumber forkNum;
+ Oid spcOid; /* tablespace oid */
+ Oid dbOid; /* database oid */
+ RelFileNumber relNumber; /* relation file number */
+ ForkNumber forkNum; /* fork number */
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
+static inline RelFileNumber
+BufTagGetRelNumber(const BufferTag *tag)
+{
+ return tag->relNumber;
+}
+
+static inline ForkNumber
+BufTagGetForkNum(const BufferTag *tag)
+{
+ return tag->forkNum;
+}
+
+static inline void
+BufTagSetRelForkDetails(BufferTag *tag, RelFileNumber relnumber,
+ ForkNumber forknum)
+{
+ tag->relNumber = relnumber;
+ tag->forkNum = forknum;
+}
+
+static inline RelFileLocator
+BufTagGetRelFileLocator(const BufferTag *tag)
+{
+ RelFileLocator rlocator;
+
+ rlocator.spcOid = tag->spcOid;
+ rlocator.dbOid = tag->dbOid;
+ rlocator.relNumber = BufTagGetRelNumber(tag);
+
+ return rlocator;
+}
+
static inline void
ClearBufferTag(BufferTag *tag)
{
- tag->rlocator.spcOid = InvalidOid;
- tag->rlocator.dbOid = InvalidOid;
- tag->rlocator.relNumber = InvalidRelFileNumber;
- tag->forkNum = InvalidForkNumber;
+ tag->spcOid = InvalidOid;
+ tag->dbOid = InvalidOid;
+ BufTagSetRelForkDetails(tag, InvalidRelFileNumber, InvalidForkNumber);
tag->blockNum = InvalidBlockNumber;
}
@@ -109,19 +142,32 @@ static inline void
InitBufferTag(BufferTag *tag, const RelFileLocator *rlocator,
ForkNumber forkNum, BlockNumber blockNum)
{
- tag->rlocator = *rlocator;
- tag->forkNum = forkNum;
+ tag->spcOid = rlocator->spcOid;
+ tag->dbOid = rlocator->dbOid;
+ BufTagSetRelForkDetails(tag, rlocator->relNumber, forkNum);
tag->blockNum = blockNum;
}
static inline bool
BufferTagEqual(const BufferTag *tag1, const BufferTag *tag2)
{
- return RelFileLocatorEquals(tag1->rlocator, tag2->rlocator) &&
+ return (tag1->spcOid == tag2->spcOid) &&
+ (tag1->dbOid == tag2->dbOid) &&
+ (tag1->relNumber == tag2->relNumber) &&
(tag1->blockNum == tag2->blockNum) &&
(tag1->forkNum == tag2->forkNum);
}
+static inline bool
+BufTagMatchesRelFileLocator(const BufferTag *tag,
+ const RelFileLocator *rlocator)
+{
+ return (tag->spcOid == rlocator->spcOid) &&
+ (tag->dbOid == rlocator->dbOid) &&
+ (BufTagGetRelNumber(tag) == rlocator->relNumber);
+}
+
+
/*
* The shared buffer mapping table is partitioned to reduce contention.
* To determine which partition lock a given tag requires, compute the tag's
--
1.8.3.1
Attachment: v13-0003-Widen-relfilenumber-from-32-bits-to-56-bits.patch (text/x-patch; charset=UTF-8)
From 521d1919b178ba8715530d1492f712e92a4e9e9d Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Thu, 14 Jul 2022 14:25:50 +0530
Subject: [PATCH v13 3/3] Widen relfilenumber from 32 bits to 56 bits
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Currently, relfilenumber is 32 bits wide, so the counter can wrap around and
relfilenumbers can be reused. To guard against such reuse, a complicated hack
leaves a 0-length tombstone file around until the next checkpoint, and
allocating a new relfilenumber requires a loop that checks for on-disk
conflicts.
This patch makes the relfilenumber 56 bits wide, with no provision for
wraparound. That lets us get rid of both the 0-length tombstone files and the
loop that checks for on-disk relfilenumber conflicts.
We use 56 bits rather than 64 because a full 64-bit field would enlarge the
BufferTag, increasing memory usage and potentially hurting performance. To
avoid that, the buffer tag packs the fork number into 8 bits and the
relfilenumber into the remaining 56 bits of a single 64-bit field.
---
contrib/pg_buffercache/Makefile | 4 +-
.../pg_buffercache/pg_buffercache--1.3--1.4.sql | 30 ++++
contrib/pg_buffercache/pg_buffercache.control | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 39 ++++-
contrib/pg_prewarm/autoprewarm.c | 4 +-
contrib/pg_walinspect/expected/pg_walinspect.out | 4 +-
contrib/pg_walinspect/sql/pg_walinspect.sql | 4 +-
contrib/test_decoding/expected/rewrite.out | 2 +-
contrib/test_decoding/sql/rewrite.sql | 2 +-
doc/src/sgml/catalogs.sgml | 2 +-
doc/src/sgml/pgbuffercache.sgml | 2 +-
src/backend/access/gin/ginxlog.c | 2 +-
src/backend/access/rmgrdesc/gistdesc.c | 2 +-
src/backend/access/rmgrdesc/heapdesc.c | 2 +-
src/backend/access/rmgrdesc/nbtdesc.c | 2 +-
src/backend/access/rmgrdesc/seqdesc.c | 2 +-
src/backend/access/rmgrdesc/xlogdesc.c | 21 ++-
src/backend/access/transam/README | 5 +-
src/backend/access/transam/varsup.c | 183 ++++++++++++++++++++-
src/backend/access/transam/xlog.c | 61 +++++++
src/backend/access/transam/xlogprefetcher.c | 14 +-
src/backend/access/transam/xlogrecovery.c | 6 +-
src/backend/access/transam/xlogutils.c | 12 +-
src/backend/catalog/catalog.c | 95 -----------
src/backend/catalog/heap.c | 25 +--
src/backend/catalog/index.c | 11 +-
src/backend/catalog/storage.c | 8 +
src/backend/commands/dbcommands.c | 4 +-
src/backend/commands/tablecmds.c | 12 +-
src/backend/commands/tablespace.c | 3 +-
src/backend/nodes/gen_node_support.pl | 4 +-
src/backend/replication/basebackup.c | 13 +-
src/backend/replication/logical/decode.c | 1 +
src/backend/replication/logical/reorderbuffer.c | 2 +-
src/backend/storage/file/reinit.c | 46 +++---
src/backend/storage/freespace/fsmpage.c | 2 +-
src/backend/storage/lmgr/lwlocknames.txt | 1 +
src/backend/storage/smgr/md.c | 7 +
src/backend/storage/smgr/smgr.c | 2 +-
src/backend/utils/adt/dbsize.c | 11 +-
src/backend/utils/adt/pg_upgrade_support.c | 22 ++-
src/backend/utils/cache/relcache.c | 2 +-
src/backend/utils/cache/relfilenumbermap.c | 65 +++-----
src/backend/utils/misc/pg_controldata.c | 9 +-
src/bin/pg_checksums/pg_checksums.c | 4 +-
src/bin/pg_controldata/pg_controldata.c | 2 +
src/bin/pg_dump/pg_dump.c | 26 +--
src/bin/pg_rewind/filemap.c | 6 +-
src/bin/pg_upgrade/info.c | 5 +-
src/bin/pg_upgrade/pg_upgrade.c | 6 +-
src/bin/pg_upgrade/relfilenumber.c | 4 +-
src/bin/pg_waldump/pg_waldump.c | 2 +-
src/common/relpath.c | 20 +--
src/fe_utils/option_utils.c | 39 +++++
src/include/access/transam.h | 26 +++
src/include/access/xlog.h | 2 +
src/include/catalog/catalog.h | 3 -
src/include/catalog/pg_class.h | 16 +-
src/include/catalog/pg_control.h | 2 +
src/include/catalog/pg_proc.dat | 10 +-
src/include/common/relpath.h | 2 +
src/include/fe_utils/option_utils.h | 2 +
src/include/postgres_ext.h | 10 +-
src/include/storage/buf_internals.h | 62 ++++++-
src/include/storage/reinit.h | 3 +-
src/include/storage/relfilelocator.h | 12 +-
src/test/regress/expected/alter_table.out | 24 ++-
src/test/regress/expected/fast_default.out | 4 +-
src/test/regress/expected/oidjoins.out | 2 +-
src/test/regress/sql/alter_table.sql | 8 +-
src/test/regress/sql/fast_default.sql | 4 +-
71 files changed, 712 insertions(+), 346 deletions(-)
create mode 100644 contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile
index 2ab8c65..c20439a 100644
--- a/contrib/pg_buffercache/Makefile
+++ b/contrib/pg_buffercache/Makefile
@@ -6,8 +6,8 @@ OBJS = \
pg_buffercache_pages.o
EXTENSION = pg_buffercache
-DATA = pg_buffercache--1.2.sql pg_buffercache--1.2--1.3.sql \
- pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql
+DATA = pg_buffercache--1.0--1.1.sql pg_buffercache--1.1--1.2.sql pg_buffercache--1.2.sql \
+ pg_buffercache--1.2--1.3.sql pg_buffercache--1.3--1.4.sql
PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time"
ifdef USE_PGXS
diff --git a/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
new file mode 100644
index 0000000..ee2d9c7
--- /dev/null
+++ b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
@@ -0,0 +1,30 @@
+/* contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.4'" to load this file. \quit
+
+/* First we have to remove them from the extension */
+ALTER EXTENSION pg_buffercache DROP VIEW pg_buffercache;
+ALTER EXTENSION pg_buffercache DROP FUNCTION pg_buffercache_pages();
+
+/* Then we can drop them */
+DROP VIEW pg_buffercache;
+DROP FUNCTION pg_buffercache_pages();
+
+/* Now redefine */
+CREATE OR REPLACE FUNCTION pg_buffercache_pages()
+RETURNS SETOF RECORD
+AS 'MODULE_PATHNAME', 'pg_buffercache_pages_v1_4'
+LANGUAGE C PARALLEL SAFE;
+
+CREATE OR REPLACE VIEW pg_buffercache AS
+ SELECT P.* FROM pg_buffercache_pages() AS P
+ (bufferid integer, relfilenode int8, reltablespace oid, reldatabase oid,
+ relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
+ pinning_backends int4);
+
+-- Don't want these to be available to public.
+REVOKE ALL ON FUNCTION pg_buffercache_pages() FROM PUBLIC;
+REVOKE ALL ON pg_buffercache FROM PUBLIC;
+GRANT EXECUTE ON FUNCTION pg_buffercache_pages() TO pg_monitor;
+GRANT SELECT ON pg_buffercache TO pg_monitor;
diff --git a/contrib/pg_buffercache/pg_buffercache.control b/contrib/pg_buffercache/pg_buffercache.control
index 8c060ae..a82ae5f 100644
--- a/contrib/pg_buffercache/pg_buffercache.control
+++ b/contrib/pg_buffercache/pg_buffercache.control
@@ -1,5 +1,5 @@
# pg_buffercache extension
comment = 'examine the shared buffer cache'
-default_version = '1.3'
+default_version = '1.4'
module_pathname = '$libdir/pg_buffercache'
relocatable = true
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index c5754ea..15c4d38 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -59,9 +59,10 @@ typedef struct
* relation node/tablespace/database/blocknum and dirty indicator.
*/
PG_FUNCTION_INFO_V1(pg_buffercache_pages);
+PG_FUNCTION_INFO_V1(pg_buffercache_pages_v1_4);
-Datum
-pg_buffercache_pages(PG_FUNCTION_ARGS)
+static Datum
+pg_buffercache_pages_internal(PG_FUNCTION_ARGS, Oid rfn_typid)
{
FuncCallContext *funcctx;
Datum result;
@@ -103,7 +104,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
TupleDescInitEntry(tupledesc, (AttrNumber) 1, "bufferid",
INT4OID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 2, "relfilenode",
- OIDOID, -1, 0);
+ rfn_typid, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 3, "reltablespace",
OIDOID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 4, "reldatabase",
@@ -209,7 +210,24 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
}
else
{
- values[1] = ObjectIdGetDatum(fctx->record[i].relfilenumber);
+ if (rfn_typid == INT8OID)
+ values[1] =
+ Int64GetDatum((int64) fctx->record[i].relfilenumber);
+ else
+ {
+ Assert(rfn_typid == OIDOID);
+
+ if (fctx->record[i].relfilenumber > OID_MAX)
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("relfilenode" INT64_FORMAT " is too large to be represented as an OID",
+ fctx->record[i].relfilenumber),
+ errhint("Upgrade the extension using ALTER EXTENSION pg_buffercache UPDATE"));
+
+ values[1] =
+ ObjectIdGetDatum((Oid) fctx->record[i].relfilenumber);
+ }
+
nulls[1] = false;
values[2] = ObjectIdGetDatum(fctx->record[i].reltablespace);
nulls[2] = false;
@@ -237,3 +255,16 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
else
SRF_RETURN_DONE(funcctx);
}
+
+/* entry point for old extension version */
+Datum
+pg_buffercache_pages(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, OIDOID);
+}
+
+Datum
+pg_buffercache_pages_v1_4(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, INT8OID);
+}
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index 63f0d41..3a23795 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -345,7 +345,7 @@ apw_load_buffers(void)
{
unsigned forknum;
- if (fscanf(file, "%u,%u,%u,%u,%u\n", &blkinfo[i].database,
+ if (fscanf(file, "%u,%u," INT64_FORMAT ",%u,%u\n", &blkinfo[i].database,
&blkinfo[i].tablespace, &blkinfo[i].filenumber,
&forknum, &blkinfo[i].blocknum) != 5)
ereport(ERROR,
@@ -669,7 +669,7 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
{
CHECK_FOR_INTERRUPTS();
- ret = fprintf(file, "%u,%u,%u,%u,%u\n",
+ ret = fprintf(file, "%u,%u," INT64_FORMAT ",%u,%u\n",
block_info_array[i].database,
block_info_array[i].tablespace,
block_info_array[i].filenumber,
diff --git a/contrib/pg_walinspect/expected/pg_walinspect.out b/contrib/pg_walinspect/expected/pg_walinspect.out
index a1ee743..e9b06ed 100644
--- a/contrib/pg_walinspect/expected/pg_walinspect.out
+++ b/contrib/pg_walinspect/expected/pg_walinspect.out
@@ -54,9 +54,9 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- ===================================================================
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
ok
----
t
diff --git a/contrib/pg_walinspect/sql/pg_walinspect.sql b/contrib/pg_walinspect/sql/pg_walinspect.sql
index 1b265ea..5393834 100644
--- a/contrib/pg_walinspect/sql/pg_walinspect.sql
+++ b/contrib/pg_walinspect/sql/pg_walinspect.sql
@@ -39,10 +39,10 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
-- ===================================================================
-- Test for filtering out WAL records based on resource_manager and
diff --git a/contrib/test_decoding/expected/rewrite.out b/contrib/test_decoding/expected/rewrite.out
index b30999c..8f1aa48 100644
--- a/contrib/test_decoding/expected/rewrite.out
+++ b/contrib/test_decoding/expected/rewrite.out
@@ -106,7 +106,7 @@ VACUUM FULL pg_class;
-- reindexing of important relations / indexes
REINDEX TABLE pg_class;
REINDEX INDEX pg_class_oid_index;
-REINDEX INDEX pg_class_tblspc_relfilenode_index;
+REINDEX INDEX pg_class_relfilenode_index;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (5, 3);
BEGIN;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (6, 4);
diff --git a/contrib/test_decoding/sql/rewrite.sql b/contrib/test_decoding/sql/rewrite.sql
index 62dead3..8983704 100644
--- a/contrib/test_decoding/sql/rewrite.sql
+++ b/contrib/test_decoding/sql/rewrite.sql
@@ -77,7 +77,7 @@ VACUUM FULL pg_class;
-- reindexing of important relations / indexes
REINDEX TABLE pg_class;
REINDEX INDEX pg_class_oid_index;
-REINDEX INDEX pg_class_tblspc_relfilenode_index;
+REINDEX INDEX pg_class_relfilenode_index;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (5, 3);
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index a186e35..b77940f 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1965,7 +1965,7 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
</para>
<para>
Name of the on-disk file of this relation; zero means this
diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
index a06fd3e..e222265 100644
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@@ -62,7 +62,7 @@
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
(references <link linkend="catalog-pg-class"><structname>pg_class</structname></link>.<structfield>relfilenode</structfield>)
</para>
<para>
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 41b9211..b75ad79 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -100,7 +100,7 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
BlockNumber blknum;
BufferGetTag(buffer, &locator, &forknum, &blknum);
- elog(ERROR, "failed to add item to index page in %u/%u/%u",
+ elog(ERROR, "failed to add item to index page in %u/%u/" INT64_FORMAT,
locator.spcOid, locator.dbOid, locator.relNumber);
}
}
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index 7dd3c1d..c699937 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -26,7 +26,7 @@ out_gistxlogPageUpdate(StringInfo buf, gistxlogPageUpdate *xlrec)
static void
out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec)
{
- appendStringInfo(buf, "rel %u/%u/%u; blk %u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; blk %u; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber, xlrec->block,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 923d3bc..f6d278b 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -169,7 +169,7 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
{
xl_heap_new_cid *xlrec = (xl_heap_new_cid *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; tid %u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; tid %u/%u",
xlrec->target_locator.spcOid,
xlrec->target_locator.dbOid,
xlrec->target_locator.relNumber,
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index 4843cd5..70feb2d 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -100,7 +100,7 @@ btree_desc(StringInfo buf, XLogReaderState *record)
{
xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/seqdesc.c b/src/backend/access/rmgrdesc/seqdesc.c
index b3845f9..45c6ee7 100644
--- a/src/backend/access/rmgrdesc/seqdesc.c
+++ b/src/backend/access/rmgrdesc/seqdesc.c
@@ -25,7 +25,7 @@ seq_desc(StringInfo buf, XLogReaderState *record)
xl_seq_rec *xlrec = (xl_seq_rec *) rec;
if (info == XLOG_SEQ_LOG)
- appendStringInfo(buf, "rel %u/%u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT,
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber);
}
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 6fec485..a723790 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -45,8 +45,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
CheckPoint *checkpoint = (CheckPoint *) rec;
appendStringInfo(buf, "redo %X/%X; "
- "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; "
- "oldest xid %u in DB %u; oldest multi %u in DB %u; "
+ "tli %u; prev tli %u; fpw %s; xid %u:%u; relfilenumber " INT64_FORMAT ";oid %u; "
+ "multi %u; offset %u; oldest xid %u in DB %u; oldest multi %u in DB %u; "
"oldest/newest commit timestamp xid: %u/%u; "
"oldest running xid %u; %s",
LSN_FORMAT_ARGS(checkpoint->redo),
@@ -55,6 +55,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->fullPageWrites ? "true" : "false",
EpochFromFullTransactionId(checkpoint->nextXid),
XidFromFullTransactionId(checkpoint->nextXid),
+ checkpoint->nextRelFileNumber,
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
@@ -74,6 +75,13 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
memcpy(&nextOid, rec, sizeof(Oid));
appendStringInfo(buf, "%u", nextOid);
}
+ else if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, rec, sizeof(RelFileNumber));
+ appendStringInfo(buf, INT64_FORMAT, nextRelFileNumber);
+ }
else if (info == XLOG_RESTORE_POINT)
{
xl_restore_point *xlrec = (xl_restore_point *) rec;
@@ -169,6 +177,9 @@ xlog_identify(uint8 info)
case XLOG_NEXTOID:
id = "NEXTOID";
break;
+ case XLOG_NEXT_RELFILENUMBER:
+ id = "NEXT_RELFILENUMBER";
+ break;
case XLOG_SWITCH:
id = "SWITCH";
break;
@@ -237,7 +248,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfoChar(buf, ' ');
appendStringInfo(buf,
- "blkref #%d: rel %u/%u/%u fork %s blk %u",
+ "blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -297,7 +308,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
if (forknum != MAIN_FORKNUM)
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u fork %s blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -306,7 +317,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
else
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
blk);
diff --git a/src/backend/access/transam/README b/src/backend/access/transam/README
index 734c39a..8911671 100644
--- a/src/backend/access/transam/README
+++ b/src/backend/access/transam/README
@@ -692,8 +692,9 @@ by having database restart search for files that don't have any committed
entry in pg_class, but that currently isn't done because of the possibility
of deleting data that is useful for forensic analysis of the crash.
Orphan files are harmless --- at worst they waste a bit of disk space ---
-because we check for on-disk collisions when allocating new relfilenumber
-OIDs. So cleaning up isn't really necessary.
+because the relfilenumber counter is monotonically increasing. The maximum
+value is 2^56-1, and there is no provision for wraparound. Thus, on-disk
+collisions aren't possible.
3. Deleting a table, which requires an unlink() that could fail.
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 849a7ce..a2f0d35 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -13,12 +13,16 @@
#include "postgres.h"
+#include <unistd.h>
+
#include "access/clog.h"
#include "access/commit_ts.h"
#include "access/subtrans.h"
#include "access/transam.h"
#include "access/xact.h"
#include "access/xlogutils.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_tablespace.h"
#include "commands/dbcommands.h"
#include "miscadmin.h"
#include "postmaster/autovacuum.h"
@@ -30,6 +34,15 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+/* Number of RelFileNumbers to be logged per XLOG write */
+#define VAR_RELNUMBER_PER_XLOG 512
+
+/*
+ * Log more relfilenumbers when the remaining logged range drops below this
+ * threshold. Valid values are 0 to VAR_RELNUMBER_PER_XLOG - 1.
+ */
+#define VAR_RELNUMBER_NEW_XLOG_THRESHOLD 256
+
/* pointer to "variable cache" in shared memory (set up by shmem.c) */
VariableCache ShmemVariableCache = NULL;
@@ -521,8 +534,7 @@ ForceTransactionIdLimitUpdate(void)
* wide, counter wraparound will occur eventually, and therefore it is unwise
* to assume they are unique unless precautions are taken to make them so.
* Hence, this routine should generally not be used directly. The only direct
- * callers should be GetNewOidWithIndex() and GetNewRelFileNumber() in
- * catalog/catalog.c.
+ * caller should be GetNewOidWithIndex() in catalog/catalog.c.
*/
Oid
GetNewObjectId(void)
@@ -613,6 +625,173 @@ SetNextObjectId(Oid nextOid)
}
/*
+ * GetNewRelFileNumber
+ *
+ * Similar to GetNewObjectId, but generates a new relfilenumber instead of
+ * a new Oid.
+ */
+RelFileNumber
+GetNewRelFileNumber(Oid reltablespace, char relpersistence)
+{
+ RelFileNumber result;
+
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot assign RelFileNumber during recovery");
+
+ if (IsBinaryUpgrade)
+ elog(ERROR, "cannot assign RelFileNumber during binary upgrade");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /* check for wraparound of the relfilenumber counter */
+ if (unlikely(ShmemVariableCache->nextRelFileNumber > MAX_RELFILENUMBER))
+ elog(ERROR, "relfilenumber is out of bounds");
+
+ /*
+ * If the number of values remaining in the logged range has fallen below
+ * the threshold, log more. In principle we could wait until all logged
+ * relfilenumbers have been consumed before logging more, but then we
+ * would have to flush the newly logged WAL record immediately, because
+ * nextRelFileNumber must always be larger than any relfilenumber already
+ * in use on disk. To maintain that invariant, the record that logs a
+ * range must reach disk before any new file is created from that range.
+ * By logging a new range before the old one is exhausted, we avoid the
+ * immediate flush: we only need to flush the new record before consuming
+ * values from the new range, and by then it has hopefully already been
+ * flushed by some other XLogFlush operation. VAR_RELNUMBER_PER_XLOG is
+ * large enough that the extra flush should rarely be needed, but it is
+ * still better avoided.
+ */
+ if (ShmemVariableCache->loggedRelFileNumber -
+ ShmemVariableCache->nextRelFileNumber <=
+ VAR_RELNUMBER_NEW_XLOG_THRESHOLD)
+ {
+ RelFileNumber newlogrelnum;
+
+ /*
+ * The first time through, this flushes the newly logged WAL record
+ * immediately, since nothing has been logged in advance. On
+ * subsequent calls we remember the previously logged record pointer
+ * and flush up to that point instead.
+ *
+ * XXX The second call may try to flush a record that the first call
+ * already flushed, but that is logically a no-op, so it isn't worth
+ * adding complexity to avoid it.
+ */
+ newlogrelnum = ShmemVariableCache->nextRelFileNumber +
+ VAR_RELNUMBER_PER_XLOG;
+ LogNextRelFileNumber(newlogrelnum,
+ &ShmemVariableCache->loggedRelFileNumberRecPtr);
+
+ ShmemVariableCache->loggedRelFileNumber = newlogrelnum;
+ }
+
+ result = ShmemVariableCache->nextRelFileNumber;
+ (ShmemVariableCache->nextRelFileNumber)++;
+
+ LWLockRelease(RelFileNumberGenLock);
+
+#ifdef USE_ASSERT_CHECKING
+
+ {
+ RelFileLocatorBackend rlocator;
+ char *rpath;
+ BackendId backend;
+
+ switch (relpersistence)
+ {
+ case RELPERSISTENCE_TEMP:
+ backend = BackendIdForTempRelations();
+ break;
+ case RELPERSISTENCE_UNLOGGED:
+ case RELPERSISTENCE_PERMANENT:
+ backend = InvalidBackendId;
+ break;
+ default:
+ elog(ERROR, "invalid relpersistence: %c", relpersistence);
+ return InvalidRelFileNumber; /* placate compiler */
+ }
+
+ /* this logic should match RelationInitPhysicalAddr */
+ rlocator.locator.spcOid =
+ reltablespace ? reltablespace : MyDatabaseTableSpace;
+ rlocator.locator.dbOid = (reltablespace == GLOBALTABLESPACE_OID) ?
+ InvalidOid : MyDatabaseId;
+ rlocator.locator.relNumber = result;
+
+ /*
+ * The relpath will vary based on the backend ID, so we must
+ * initialize that properly here to make sure that any collisions
+ * based on filename are properly detected.
+ */
+ rlocator.backend = backend;
+
+ /* check for an existing file of the same name */
+ rpath = relpath(rlocator, MAIN_FORKNUM);
+ Assert(access(rpath, F_OK) != 0);
+ }
+#endif
+
+ return result;
+}
+
+/*
+ * SetNextRelFileNumber
+ *
+ * This may only be called during pg_upgrade; it advances the RelFileNumber
+ * counter to the specified value if the current value is smaller than the
+ * input value.
+ */
+void
+SetNextRelFileNumber(RelFileNumber relnumber)
+{
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot forward RelFileNumber during recovery");
+
+ if (!IsBinaryUpgrade)
+ elog(ERROR, "RelFileNumber can be set only during binary upgrade");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /*
+ * If the previously assigned nextRelFileNumber is already higher than
+ * the requested value, there is nothing to do. This is possible
+ * because during upgrade the objects are not created in relfilenumber
+ * order.
+ */
+ if (relnumber <= ShmemVariableCache->nextRelFileNumber)
+ {
+ LWLockRelease(RelFileNumberGenLock);
+ return;
+ }
+
+ /*
+ * If the new relfilenumber to be set is greater than or equal to already
+ * logged relfilenumber then log again.
+ *
+ * XXX The new 'relnumber' can come from any range, so we cannot arrange
+ * to piggyback on XLogFlush by logging in advance. That shouldn't
+ * really matter, as this is only called during binary upgrade.
+ */
+ if (relnumber >= ShmemVariableCache->loggedRelFileNumber)
+ {
+ RelFileNumber newlogrelnum;
+
+ newlogrelnum = relnumber + VAR_RELNUMBER_PER_XLOG;
+ LogNextRelFileNumber(newlogrelnum, NULL);
+ ShmemVariableCache->loggedRelFileNumber = newlogrelnum;
+ }
+
+ ShmemVariableCache->nextRelFileNumber = relnumber;
+
+ LWLockRelease(RelFileNumberGenLock);
+}
+
+/*
* StopGeneratingPinnedObjectIds
*
* This is called once during initdb to force the OID counter up to
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 15ab8d9..fa1436b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4543,6 +4543,7 @@ BootStrapXLOG(void)
checkPoint.nextXid =
FullTransactionIdFromEpochAndXid(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstGenbkiObjectId;
+ checkPoint.nextRelFileNumber = FirstNormalRelFileNumber;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
checkPoint.oldestXid = FirstNormalTransactionId;
@@ -4556,7 +4557,10 @@ BootStrapXLOG(void)
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumberRecPtr = InvalidXLogRecPtr;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -5023,7 +5027,9 @@ StartupXLOG(void)
/* initialize shared memory variables from the checkpoint record */
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -6483,6 +6489,18 @@ CreateCheckPoint(int flags)
checkPoint.nextOid += ShmemVariableCache->oidCount;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_SHARED);
+ checkPoint.nextRelFileNumber = ShmemVariableCache->nextRelFileNumber;
+ if (!shutdown)
+ {
+ if (ShmemVariableCache->loggedRelFileNumber < checkPoint.nextRelFileNumber)
elog(ERROR, "nextRelFileNumber cannot go backward from " INT64_FORMAT " to " INT64_FORMAT,
+ checkPoint.nextRelFileNumber, ShmemVariableCache->loggedRelFileNumber);
+
+ checkPoint.nextRelFileNumber = ShmemVariableCache->loggedRelFileNumber;
+ }
+ LWLockRelease(RelFileNumberGenLock);
+
MultiXactGetCheckptMulti(shutdown,
&checkPoint.nextMulti,
&checkPoint.nextMultiOffset,
@@ -7361,6 +7379,35 @@ XLogPutNextOid(Oid nextOid)
}
/*
+ * Similar to XLogPutNextOid, but writes a NEXT_RELFILENUMBER log record
+ * instead of a NEXTOID one. If '*prevrecptr' is a valid XLogRecPtr, flush
+ * the WAL up to that record pointer; otherwise flush up to the record just
+ * logged. Also, if prevrecptr is not NULL, store the newly logged record
+ * pointer in '*prevrecptr'.
+ */
+void
+LogNextRelFileNumber(RelFileNumber nextrelnumber, XLogRecPtr *prevrecptr)
+{
+ XLogRecPtr recptr;
+
+ XLogBeginInsert();
+ XLogRegisterData((char *) (&nextrelnumber), sizeof(RelFileNumber));
+ recptr = XLogInsert(RM_XLOG_ID, XLOG_NEXT_RELFILENUMBER);
+
+ /*
+ * If a valid prevrecptr is passed, flush that xlog record to disk;
+ * otherwise flush the newly logged record.
+ */
+ if ((prevrecptr != NULL) && !XLogRecPtrIsInvalid(*prevrecptr))
+ XLogFlush(*prevrecptr);
+ else
+ XLogFlush(recptr);
+
+ if (prevrecptr != NULL)
+ *prevrecptr = recptr;
+}
+
+/*
* Write an XLOG SWITCH record.
*
* Here we just blindly issue an XLogInsert request for the record.
@@ -7575,6 +7622,16 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
}
+ if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, XLogRecGetData(record), sizeof(RelFileNumber));
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumber = nextRelFileNumber;
+ LWLockRelease(RelFileNumberGenLock);
+ }
else if (info == XLOG_CHECKPOINT_SHUTDOWN)
{
CheckPoint checkPoint;
@@ -7589,6 +7646,10 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
+ LWLockRelease(RelFileNumberGenLock);
MultiXactSetNextMXact(checkPoint.nextMulti,
checkPoint.nextMultiOffset);
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index 87d1421..d27c7fc 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -611,7 +611,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u until %X/%X is replayed, which creates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, which creates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -634,7 +634,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, which truncates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, which truncates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -733,7 +733,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing all prefetch in relation %u/%u/%u until %X/%X is replayed, because the relation does not exist on disk",
+ "suppressing all prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, because the relation does not exist on disk",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -754,7 +754,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, because the relation is too small",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, because the relation is too small",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -793,7 +793,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* truncated beneath our feet?
*/
elog(ERROR,
- "could not prefetch relation %u/%u/%u block %u",
+ "could not prefetch relation %u/%u/" INT64_FORMAT " block %u",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -932,7 +932,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed),
filter->filter_from_block);
@@ -948,7 +948,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed));
#endif
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index e383c21..81d1f27 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2175,14 +2175,14 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
continue;
if (forknum != MAIN_FORKNUM)
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, fork %u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", fork %u, blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
forknum,
blk);
else
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
@@ -2378,7 +2378,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
if (memcmp(replay_image_masked, primary_image_masked, BLCKSZ) != 0)
{
elog(FATAL,
- "inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
+ "inconsistent page found, rel %u/%u/" INT64_FORMAT ", forknum %u, blkno %u",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forknum, blkno);
}
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 0cda225..b4398e1 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -617,17 +617,17 @@ CreateFakeRelcacheEntry(RelFileLocator rlocator)
rel->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
/* We don't know the name of the relation; use relfilenumber instead */
- sprintf(RelationGetRelationName(rel), "%u", rlocator.relNumber);
+ sprintf(RelationGetRelationName(rel), INT64_FORMAT, rlocator.relNumber);
/*
* We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNumber may be
- * different from the relation's OID. It shouldn't really matter though.
- * In recovery, we are running by ourselves and can't have any lock
- * conflicts. While syncing, we already hold AccessExclusiveLock.
+ * relation. Note that we set relId to FirstNormalObjectId, which is
+ * completely bogus. It shouldn't really matter though. In recovery,
+ * we are running by ourselves and can't have any lock conflicts. While
+ * syncing, we already hold AccessExclusiveLock.
*/
rel->rd_lockInfo.lockRelId.dbId = rlocator.dbOid;
- rel->rd_lockInfo.lockRelId.relId = rlocator.relNumber;
+ rel->rd_lockInfo.lockRelId.relId = FirstNormalObjectId;
rel->rd_smgr = NULL;
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 6f43870..155400c 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -481,101 +481,6 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
}
/*
- * GetNewRelFileNumber
- * Generate a new relfilenumber that is unique within the
- * database of the given tablespace.
- *
- * If the relfilenumber will also be used as the relation's OID, pass the
- * opened pg_class catalog, and this routine will guarantee that the result
- * is also an unused OID within pg_class. If the result is to be used only
- * as a relfilenumber for an existing relation, pass NULL for pg_class.
- *
- * As with GetNewOidWithIndex(), there is some theoretical risk of a race
- * condition, but it doesn't seem worth worrying about.
- *
- * Note: we don't support using this in bootstrap mode. All relations
- * created by bootstrap have preassigned OIDs, so there's no need.
- */
-RelFileNumber
-GetNewRelFileNumber(Oid reltablespace, Relation pg_class, char relpersistence)
-{
- RelFileLocatorBackend rlocator;
- char *rpath;
- bool collides;
- BackendId backend;
-
- /*
- * If we ever get here during pg_upgrade, there's something wrong; all
- * relfilenumber assignments during a binary-upgrade run should be
- * determined by commands in the dump script.
- */
- Assert(!IsBinaryUpgrade);
-
- switch (relpersistence)
- {
- case RELPERSISTENCE_TEMP:
- backend = BackendIdForTempRelations();
- break;
- case RELPERSISTENCE_UNLOGGED:
- case RELPERSISTENCE_PERMANENT:
- backend = InvalidBackendId;
- break;
- default:
- elog(ERROR, "invalid relpersistence: %c", relpersistence);
- return InvalidRelFileNumber; /* placate compiler */
- }
-
- /* This logic should match RelationInitPhysicalAddr */
- rlocator.locator.spcOid = reltablespace ? reltablespace : MyDatabaseTableSpace;
- rlocator.locator.dbOid =
- (rlocator.locator.spcOid == GLOBALTABLESPACE_OID) ?
- InvalidOid : MyDatabaseId;
-
- /*
- * The relpath will vary based on the backend ID, so we must initialize
- * that properly here to make sure that any collisions based on filename
- * are properly detected.
- */
- rlocator.backend = backend;
-
- do
- {
- CHECK_FOR_INTERRUPTS();
-
- /* Generate the OID */
- if (pg_class)
- rlocator.locator.relNumber = GetNewOidWithIndex(pg_class, ClassOidIndexId,
- Anum_pg_class_oid);
- else
- rlocator.locator.relNumber = GetNewObjectId();
-
- /* Check for existing file of same name */
- rpath = relpath(rlocator, MAIN_FORKNUM);
-
- if (access(rpath, F_OK) == 0)
- {
- /* definite collision */
- collides = true;
- }
- else
- {
- /*
- * Here we have a little bit of a dilemma: if errno is something
- * other than ENOENT, should we declare a collision and loop? In
- * practice it seems best to go ahead regardless of the errno. If
- * there is a colliding file we will get an smgr failure when we
- * attempt to create the new relation file.
- */
- collides = false;
- }
-
- pfree(rpath);
- } while (collides);
-
- return rlocator.locator.relNumber;
-}
-
-/*
* SQL callable interface for GetNewOidWithIndex(). Outside of initdb's
* direct insertions into catalog tables, and recovering from corruption, this
* should rarely be needed.
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 9b03579..c17f60f 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -342,10 +342,18 @@ heap_create(const char *relname,
{
/*
* If relfilenumber is unspecified by the caller then create storage
- * with oid same as relid.
+ * with relfilenumber equal to relid if it is a system table; otherwise
+ * allocate a new relfilenumber. For details, see the comments atop the
+ * FirstNormalRelFileNumber declaration.
*/
if (!RelFileNumberIsValid(relfilenumber))
- relfilenumber = relid;
+ {
+ if (relid < FirstNormalObjectId)
+ relfilenumber = relid;
+ else
+ relfilenumber = GetNewRelFileNumber(reltablespace,
+ relpersistence);
+ }
}
/*
@@ -898,7 +906,7 @@ InsertPgClassTuple(Relation pg_class_desc,
values[Anum_pg_class_reloftype - 1] = ObjectIdGetDatum(rd_rel->reloftype);
values[Anum_pg_class_relowner - 1] = ObjectIdGetDatum(rd_rel->relowner);
values[Anum_pg_class_relam - 1] = ObjectIdGetDatum(rd_rel->relam);
- values[Anum_pg_class_relfilenode - 1] = ObjectIdGetDatum(rd_rel->relfilenode);
+ values[Anum_pg_class_relfilenode - 1] = Int64GetDatum(rd_rel->relfilenode);
values[Anum_pg_class_reltablespace - 1] = ObjectIdGetDatum(rd_rel->reltablespace);
values[Anum_pg_class_relpages - 1] = Int32GetDatum(rd_rel->relpages);
values[Anum_pg_class_reltuples - 1] = Float4GetDatum(rd_rel->reltuples);
@@ -1170,12 +1178,7 @@ heap_create_with_catalog(const char *relname,
if (shared_relation && reltablespace != GLOBALTABLESPACE_OID)
elog(ERROR, "shared relations must be placed in pg_global tablespace");
- /*
- * Allocate an OID for the relation, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the relation, unless we were told what to use. */
if (!OidIsValid(relid))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -1229,8 +1232,8 @@ heap_create_with_catalog(const char *relname,
}
if (!OidIsValid(relid))
- relid = GetNewRelFileNumber(reltablespace, pg_class_desc,
- relpersistence);
+ relid = GetNewOidWithIndex(pg_class_desc, ClassOidIndexId,
+ Anum_pg_class_oid);
}
/*
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index d7192f3..a509c3e 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -898,12 +898,7 @@ index_create(Relation heapRelation,
collationObjectId,
classObjectId);
- /*
- * Allocate an OID for the index, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the index, unless we were told what to use. */
if (!OidIsValid(indexRelationId))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -935,8 +930,8 @@ index_create(Relation heapRelation,
}
else
{
- indexRelationId =
- GetNewRelFileNumber(tableSpaceId, pg_class, relpersistence);
+ indexRelationId = GetNewOidWithIndex(pg_class, ClassOidIndexId,
+ Anum_pg_class_oid);
}
}
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index d708af1..c1bd4f8 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -968,6 +968,10 @@ smgr_redo(XLogReaderState *record)
xl_smgr_create *xlrec = (xl_smgr_create *) XLogRecGetData(record);
SMgrRelation reln;
+ if (xlrec->rlocator.relNumber > ShmemVariableCache->nextRelFileNumber)
+ elog(ERROR, "unexpected relnumber " INT64_FORMAT " that is bigger than nextRelFileNumber " INT64_FORMAT,
+ xlrec->rlocator.relNumber, ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
smgrcreate(reln, xlrec->forkNum, true);
}
@@ -981,6 +985,10 @@ smgr_redo(XLogReaderState *record)
int nforks = 0;
bool need_fsm_vacuum = false;
+ if (xlrec->rlocator.relNumber > ShmemVariableCache->nextRelFileNumber)
+ elog(ERROR, "unexpected relnumber " INT64_FORMAT " that is bigger than nextRelFileNumber " INT64_FORMAT,
+ xlrec->rlocator.relNumber, ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
/*
diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index 099d369..d09168d 100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -250,7 +250,7 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
BlockNumber nblocks;
BlockNumber blkno;
Buffer buf;
- Oid relfilenumber;
+ RelFileNumber relfilenumber;
Page page;
List *rlocatorlist = NIL;
LockRelId relid;
@@ -397,7 +397,7 @@ ScanSourceDatabasePgClassTuple(HeapTupleData *tuple, Oid tbid, Oid dbid,
{
CreateDBRelInfo *relinfo;
Form_pg_class classForm;
- Oid relfilenumber = InvalidRelFileNumber;
+ RelFileNumber relfilenumber = InvalidRelFileNumber;
classForm = (Form_pg_class) GETSTRUCT(tuple);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 7fbee0c..430ade2 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14331,10 +14331,14 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
}
/*
- * Relfilenumbers are not unique in databases across tablespaces, so we
- * need to allocate a new one in the new tablespace.
- */
- newrelfilenumber = GetNewRelFileNumber(newTableSpace, NULL,
+ * Generate a new relfilenumber. We cannot reuse the old relfilenumber
+ * because of the possibility that the relation will be moved back to the
+ * original tablespace before the next checkpoint. At that point, the
+ * first segment of the main fork won't have been unlinked yet, and an
+ * attempt to create new relation storage with that same relfilenumber
+ * will fail.
+ */
+ newrelfilenumber = GetNewRelFileNumber(newTableSpace,
rel->rd_rel->relpersistence);
/* Open old and new relation */
diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index cb7d460..66edc64 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -290,7 +290,8 @@ CreateTableSpace(CreateTableSpaceStmt *stmt)
* parts.
*/
if (strlen(location) + 1 + strlen(TABLESPACE_VERSION_DIRECTORY) + 1 +
- OIDCHARS + 1 + OIDCHARS + 1 + FORKNAMECHARS + 1 + OIDCHARS > MAXPGPATH)
+ OIDCHARS + 1 + RELNUMBERCHARS + 1 + FORKNAMECHARS + 1 + OIDCHARS >
+ MAXPGPATH)
ereport(ERROR,
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
errmsg("tablespace location \"%s\" is too long",
diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index 86cf1b3..dc8b9da 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -961,12 +961,12 @@ _read${n}(void)
print $off "\tWRITE_UINT_FIELD($f);\n";
print $rff "\tREAD_UINT_FIELD($f);\n" unless $no_read;
}
- elsif ($t eq 'uint64')
+ elsif ($t eq 'uint64' || $t eq 'RelFileNumber')
{
print $off "\tWRITE_UINT64_FIELD($f);\n";
print $rff "\tREAD_UINT64_FIELD($f);\n" unless $no_read;
}
- elsif ($t eq 'Oid' || $t eq 'RelFileNumber')
+ elsif ($t eq 'Oid')
{
print $off "\tWRITE_OID_FIELD($f);\n";
print $rff "\tREAD_OID_FIELD($f);\n" unless $no_read;
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 637c0ce..4de249b 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -1172,7 +1172,8 @@ sendDir(bbsink *sink, const char *path, int basepathlen, bool sizeonly,
int excludeIdx;
bool excludeFound;
ForkNumber relForkNum; /* Type of fork if file is a relation */
- int relOidChars; /* Chars in filename that are the rel oid */
+ int relnumchars; /* Chars in filename that are the
+ * relnumber */
/* Skip special stuff */
if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
@@ -1222,23 +1223,23 @@ sendDir(bbsink *sink, const char *path, int basepathlen, bool sizeonly,
/* Exclude all forks for unlogged tables except the init fork */
if (isDbDir &&
- parse_filename_for_nontemp_relation(de->d_name, &relOidChars,
+ parse_filename_for_nontemp_relation(de->d_name, &relnumchars,
&relForkNum))
{
/* Never exclude init forks */
if (relForkNum != INIT_FORKNUM)
{
char initForkFile[MAXPGPATH];
- char relOid[OIDCHARS + 1];
+ char relNumber[RELNUMBERCHARS + 1];
/*
* If any other type of fork, check if there is an init fork
* with the same OID. If so, the file can be excluded.
*/
- memcpy(relOid, de->d_name, relOidChars);
- relOid[relOidChars] = '\0';
+ memcpy(relNumber, de->d_name, relnumchars);
+ relNumber[relnumchars] = '\0';
snprintf(initForkFile, sizeof(initForkFile), "%s/%s_init",
- path, relOid);
+ path, relNumber);
if (lstat(initForkFile, &statbuf) == 0)
{
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index c5c6a2b..7029604 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -154,6 +154,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
break;
case XLOG_NOOP:
case XLOG_NEXTOID:
+ case XLOG_NEXT_RELFILENUMBER:
case XLOG_SWITCH:
case XLOG_BACKUP_END:
case XLOG_PARAMETER_CHANGE:
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 88a37fd..99580a4 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -4869,7 +4869,7 @@ DisplayMapping(HTAB *tuplecid_data)
hash_seq_init(&hstat, tuplecid_data);
while ((ent = (ReorderBufferTupleCidEnt *) hash_seq_search(&hstat)) != NULL)
{
- elog(DEBUG3, "mapping: node: %u/%u/%u tid: %u/%u cmin: %u, cmax: %u",
+ elog(DEBUG3, "mapping: node: %u/%u/" INT64_FORMAT " tid: %u/%u cmin: %u, cmax: %u",
ent->key.rlocator.dbOid,
ent->key.rlocator.spcOid,
ent->key.rlocator.relNumber,
diff --git a/src/backend/storage/file/reinit.c b/src/backend/storage/file/reinit.c
index f053fe0..a4bae7c 100644
--- a/src/backend/storage/file/reinit.c
+++ b/src/backend/storage/file/reinit.c
@@ -195,11 +195,11 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
while ((de = ReadDir(dbspace_dir, dbspacedirname)) != NULL)
{
ForkNumber forkNum;
- int oidchars;
+ int relnumchars;
unlogged_relation_entry ent;
/* Skip anything that doesn't look like a relation data file. */
- if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars,
+ if (!parse_filename_for_nontemp_relation(de->d_name, &relnumchars,
&forkNum))
continue;
@@ -235,11 +235,11 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
while ((de = ReadDir(dbspace_dir, dbspacedirname)) != NULL)
{
ForkNumber forkNum;
- int oidchars;
+ int relnumchars;
unlogged_relation_entry ent;
/* Skip anything that doesn't look like a relation data file. */
- if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars,
+ if (!parse_filename_for_nontemp_relation(de->d_name, &relnumchars,
&forkNum))
continue;
@@ -285,13 +285,13 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
while ((de = ReadDir(dbspace_dir, dbspacedirname)) != NULL)
{
ForkNumber forkNum;
- int oidchars;
- char oidbuf[OIDCHARS + 1];
+ int relnumchars;
+ char relnumbuf[RELNUMBERCHARS + 1];
char srcpath[MAXPGPATH * 2];
char dstpath[MAXPGPATH];
/* Skip anything that doesn't look like a relation data file. */
- if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars,
+ if (!parse_filename_for_nontemp_relation(de->d_name, &relnumchars,
&forkNum))
continue;
@@ -304,10 +304,10 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
dbspacedirname, de->d_name);
/* Construct destination pathname. */
- memcpy(oidbuf, de->d_name, oidchars);
- oidbuf[oidchars] = '\0';
+ memcpy(relnumbuf, de->d_name, relnumchars);
+ relnumbuf[relnumchars] = '\0';
snprintf(dstpath, sizeof(dstpath), "%s/%s%s",
- dbspacedirname, oidbuf, de->d_name + oidchars + 1 +
+ dbspacedirname, relnumbuf, de->d_name + relnumchars + 1 +
strlen(forkNames[INIT_FORKNUM]));
/* OK, we're ready to perform the actual copy. */
@@ -328,12 +328,12 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
while ((de = ReadDir(dbspace_dir, dbspacedirname)) != NULL)
{
ForkNumber forkNum;
- int oidchars;
- char oidbuf[OIDCHARS + 1];
+ int relnumchars;
+ char relnumbuf[RELNUMBERCHARS + 1];
char mainpath[MAXPGPATH];
/* Skip anything that doesn't look like a relation data file. */
- if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars,
+ if (!parse_filename_for_nontemp_relation(de->d_name, &relnumchars,
&forkNum))
continue;
@@ -342,10 +342,10 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
continue;
/* Construct main fork pathname. */
- memcpy(oidbuf, de->d_name, oidchars);
- oidbuf[oidchars] = '\0';
+ memcpy(relnumbuf, de->d_name, relnumchars);
+ relnumbuf[relnumchars] = '\0';
snprintf(mainpath, sizeof(mainpath), "%s/%s%s",
- dbspacedirname, oidbuf, de->d_name + oidchars + 1 +
+ dbspacedirname, relnumbuf, de->d_name + relnumchars + 1 +
strlen(forkNames[INIT_FORKNUM]));
fsync_fname(mainpath, false);
@@ -372,13 +372,13 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
* for a non-temporary relation and false otherwise.
*
* NB: If this function returns true, the caller is entitled to assume that
- * *oidchars has been set to the a value no more than OIDCHARS, and thus
- * that a buffer of OIDCHARS+1 characters is sufficient to hold the OID
- * portion of the filename. This is critical to protect against a possible
- * buffer overrun.
+ * *relnumchars has been set to a value no more than RELNUMBERCHARS, and thus
+ * that a buffer of RELNUMBERCHARS+1 characters is sufficient to hold the
+ * RelFileNumber portion of the filename. This is critical to protect against
+ * a possible buffer overrun.
*/
bool
-parse_filename_for_nontemp_relation(const char *name, int *oidchars,
+parse_filename_for_nontemp_relation(const char *name, int *relnumchars,
ForkNumber *fork)
{
int pos;
@@ -386,9 +386,9 @@ parse_filename_for_nontemp_relation(const char *name, int *oidchars,
/* Look for a non-empty string of digits (that isn't too long). */
for (pos = 0; isdigit((unsigned char) name[pos]); ++pos)
;
- if (pos == 0 || pos > OIDCHARS)
+ if (pos == 0 || pos > RELNUMBERCHARS)
return false;
- *oidchars = pos;
+ *relnumchars = pos;
/* Check for a fork name. */
if (name[pos] != '_')
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index af4dab7..172225b 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -273,7 +273,7 @@ restart:
BlockNumber blknum;
BufferGetTag(buf, &rlocator, &forknum, &blknum);
- elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
+ elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/" INT64_FORMAT,
blknum, rlocator.spcOid, rlocator.dbOid, rlocator.relNumber);
/* make sure we hold an exclusive lock */
diff --git a/src/backend/storage/lmgr/lwlocknames.txt b/src/backend/storage/lmgr/lwlocknames.txt
index 6c7cf6c..3c5d041 100644
--- a/src/backend/storage/lmgr/lwlocknames.txt
+++ b/src/backend/storage/lmgr/lwlocknames.txt
@@ -53,3 +53,4 @@ XactTruncationLock 44
# 45 was XactTruncationLock until removal of BackendRandomLock
WrapLimitsVacuumLock 46
NotifyQueueTailLock 47
+RelFileNumberGenLock 48
\ No newline at end of file
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 3998296..a88fef2 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -257,6 +257,13 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* next checkpoint, we prevent reassignment of the relfilenumber until it's
* safe, because relfilenumber assignment skips over any existing file.
*
+ * XXX. Although all of this was true when relfilenumbers were 32 bits wide,
+ * they are now 56 bits wide and do not wrap around, so in the future we can
+ * change the code to immediately unlink the first segment of the relation
+ * along with all the others. We still do reuse relfilenumbers when createdb()
+ * is performed using the file-copy method or during movedb(), but the scenario
+ * described above can only happen when creating a new relation.
+ *
* We do not need to go through this dance for temp relations, though, because
* we never make WAL entries for temp rels, and so a temp rel poses no threat
* to the health of a regular rel that has taken over its relfilenumber.
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index c1a5feb..ed46ac3 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -154,7 +154,7 @@ smgropen(RelFileLocator rlocator, BackendId backend)
/* First time through: initialize the hash table */
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileLocatorBackend);
+ ctl.keysize = SizeOfRelFileLocatorBackend;
ctl.entrysize = sizeof(SMgrRelationData);
SMgrRelationHash = hash_create("smgr relation table", 400,
&ctl, HASH_ELEM | HASH_BLOBS);
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 34efa12..2b313aa 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -878,7 +878,7 @@ pg_relation_filenode(PG_FUNCTION_ARGS)
if (!RelFileNumberIsValid(result))
PG_RETURN_NULL();
- PG_RETURN_OID(result);
+ PG_RETURN_INT64(result);
}
/*
@@ -898,9 +898,16 @@ Datum
pg_filenode_relation(PG_FUNCTION_ARGS)
{
Oid reltablespace = PG_GETARG_OID(0);
- RelFileNumber relfilenumber = PG_GETARG_OID(1);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(1);
Oid heaprel;
+ /* check whether the relfilenumber is within a valid range */
+ if (relfilenumber < 0 || relfilenumber > MAX_RELFILENUMBER)
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("relfilenumber " INT64_FORMAT " is out of range",
+ relfilenumber));
+
/* test needed so RelidByRelfilenumber doesn't misbehave */
if (!RelFileNumberIsValid(relfilenumber))
PG_RETURN_NULL();
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 797f5f5..b1d60bd 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -17,6 +17,7 @@
#include "catalog/pg_type.h"
#include "commands/extension.h"
#include "miscadmin.h"
+#include "storage/relfilelocator.h"
#include "utils/array.h"
#include "utils/builtins.h"
@@ -29,6 +30,15 @@ do { \
errmsg("function can only be called when server is in binary upgrade mode"))); \
} while (0)
+#define CHECK_RELFILENUMBER_RANGE(relfilenumber) \
+do { \
+ if ((relfilenumber) < 0 || (relfilenumber) > MAX_RELFILENUMBER) \
+ ereport(ERROR, \
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE), \
+ errmsg("relfilenumber " INT64_FORMAT " is out of range", \
+ (relfilenumber))); \
+} while (0)
+
Datum
binary_upgrade_set_next_pg_tablespace_oid(PG_FUNCTION_ARGS)
{
@@ -98,10 +108,12 @@ binary_upgrade_set_next_heap_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_heap_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_heap_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -120,10 +132,12 @@ binary_upgrade_set_next_index_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_index_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_index_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -142,10 +156,12 @@ binary_upgrade_set_next_toast_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_toast_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_toast_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index bdb771d..a0b7c5b 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -3709,7 +3709,7 @@ RelationSetNewRelfilenumber(Relation relation, char persistence)
/* Allocate a new relfilenumber */
newrelfilenumber = GetNewRelFileNumber(relation->rd_rel->reltablespace,
- NULL, persistence);
+ persistence);
/*
* Get a writable copy of the pg_class tuple for the given relation.
diff --git a/src/backend/utils/cache/relfilenumbermap.c b/src/backend/utils/cache/relfilenumbermap.c
index c4245d5..cbb18f0 100644
--- a/src/backend/utils/cache/relfilenumbermap.c
+++ b/src/backend/utils/cache/relfilenumbermap.c
@@ -32,17 +32,11 @@
static HTAB *RelfilenumberMapHash = NULL;
/* built first time through in InitializeRelfilenumberMap */
-static ScanKeyData relfilenumber_skey[2];
+static ScanKeyData relfilenumber_skey[1];
typedef struct
{
- Oid reltablespace;
- RelFileNumber relfilenumber;
-} RelfilenumberMapKey;
-
-typedef struct
-{
- RelfilenumberMapKey key; /* lookup key - must be first */
+ RelFileNumber relfilenumber; /* lookup key - must be first */
Oid relid; /* pg_class.oid */
} RelfilenumberMapEntry;
@@ -72,7 +66,7 @@ RelfilenumberMapInvalidateCallback(Datum arg, Oid relid)
entry->relid == relid) /* individual flushed relation */
{
if (hash_search(RelfilenumberMapHash,
- (void *) &entry->key,
+ (void *) &entry->relfilenumber,
HASH_REMOVE,
NULL) == NULL)
elog(ERROR, "hash table corrupted");
@@ -88,7 +82,6 @@ static void
InitializeRelfilenumberMap(void)
{
HASHCTL ctl;
- int i;
/* Make sure we've initialized CacheMemoryContext. */
if (CacheMemoryContext == NULL)
@@ -97,25 +90,20 @@ InitializeRelfilenumberMap(void)
/* build skey */
MemSet(&relfilenumber_skey, 0, sizeof(relfilenumber_skey));
- for (i = 0; i < 2; i++)
- {
- fmgr_info_cxt(F_OIDEQ,
- &relfilenumber_skey[i].sk_func,
- CacheMemoryContext);
- relfilenumber_skey[i].sk_strategy = BTEqualStrategyNumber;
- relfilenumber_skey[i].sk_subtype = InvalidOid;
- relfilenumber_skey[i].sk_collation = InvalidOid;
- }
-
- relfilenumber_skey[0].sk_attno = Anum_pg_class_reltablespace;
- relfilenumber_skey[1].sk_attno = Anum_pg_class_relfilenode;
+ fmgr_info_cxt(F_INT8EQ,
+ &relfilenumber_skey[0].sk_func,
+ CacheMemoryContext);
+ relfilenumber_skey[0].sk_strategy = BTEqualStrategyNumber;
+ relfilenumber_skey[0].sk_subtype = InvalidOid;
+ relfilenumber_skey[0].sk_collation = InvalidOid;
+ relfilenumber_skey[0].sk_attno = Anum_pg_class_relfilenode;
/*
* Only create the RelfilenumberMapHash now, so we don't end up partially
* initialized when fmgr_info_cxt() above ERRORs out with an out of memory
* error.
*/
- ctl.keysize = sizeof(RelfilenumberMapKey);
+ ctl.keysize = sizeof(RelFileNumber);
ctl.entrysize = sizeof(RelfilenumberMapEntry);
ctl.hcxt = CacheMemoryContext;
@@ -129,7 +117,7 @@ InitializeRelfilenumberMap(void)
}
/*
- * Map a relation's (tablespace, relfilenumber) to a relation's oid and cache
+ * Map a relation's relfilenumber to a relation's oid and cache
* the result.
*
* Returns InvalidOid if no relation matching the criteria could be found.
@@ -137,26 +125,17 @@ InitializeRelfilenumberMap(void)
Oid
RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
{
- RelfilenumberMapKey key;
RelfilenumberMapEntry *entry;
bool found;
SysScanDesc scandesc;
Relation relation;
HeapTuple ntp;
- ScanKeyData skey[2];
+ ScanKeyData skey[1];
Oid relid;
if (RelfilenumberMapHash == NULL)
InitializeRelfilenumberMap();
- /* pg_class will show 0 when the value is actually MyDatabaseTableSpace */
- if (reltablespace == MyDatabaseTableSpace)
- reltablespace = 0;
-
- MemSet(&key, 0, sizeof(key));
- key.reltablespace = reltablespace;
- key.relfilenumber = relfilenumber;
-
/*
* Check cache and return entry if one is found. Even if no target
* relation can be found later on we store the negative match and return a
@@ -164,7 +143,8 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
* since querying invalid values isn't supposed to be a frequent thing,
* but it's basically free.
*/
- entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_FIND, &found);
+ entry = hash_search(RelfilenumberMapHash, (void *) &relfilenumber,
+ HASH_FIND, &found);
if (found)
return entry->relid;
@@ -195,14 +175,13 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
memcpy(skey, relfilenumber_skey, sizeof(skey));
/* set scan arguments */
- skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
- skey[1].sk_argument = ObjectIdGetDatum(relfilenumber);
+ skey[0].sk_argument = Int64GetDatum(relfilenumber);
scandesc = systable_beginscan(relation,
- ClassTblspcRelfilenodeIndexId,
+ ClassRelfilenodeIndexId,
true,
NULL,
- 2,
+ 1,
skey);
found = false;
@@ -213,11 +192,10 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
if (found)
elog(ERROR,
- "unexpected duplicate for tablespace %u, relfilenumber %u",
- reltablespace, relfilenumber);
+ "unexpected duplicate for relfilenumber " INT64_FORMAT,
+ relfilenumber);
found = true;
- Assert(classform->reltablespace == reltablespace);
Assert(classform->relfilenode == relfilenumber);
relid = classform->oid;
}
@@ -235,7 +213,8 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
* caused cache invalidations to be executed which would have deleted a
* new entry if we had entered it above.
*/
- entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_ENTER, &found);
+ entry = hash_search(RelfilenumberMapHash, (void *) &relfilenumber,
+ HASH_ENTER, &found);
if (found)
elog(ERROR, "corrupted hashtable");
entry->relid = relid;
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index 781f8b8..3c1fef4 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -79,8 +79,8 @@ pg_control_system(PG_FUNCTION_ARGS)
Datum
pg_control_checkpoint(PG_FUNCTION_ARGS)
{
- Datum values[18];
- bool nulls[18];
+ Datum values[19];
+ bool nulls[19];
TupleDesc tupdesc;
HeapTuple htup;
ControlFileData *ControlFile;
@@ -129,6 +129,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
XIDOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 18, "checkpoint_time",
TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 19, "next_relfilenumber",
+ INT8OID, -1, 0);
tupdesc = BlessTupleDesc(tupdesc);
/* Read the control file. */
@@ -202,6 +204,9 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
values[17] = TimestampTzGetDatum(time_t_to_timestamptz(ControlFile->checkPointCopy.time));
nulls[17] = false;
+ values[18] = Int64GetDatum(ControlFile->checkPointCopy.nextRelFileNumber);
+ nulls[18] = false;
+
htup = heap_form_tuple(tupdesc, values, nulls);
PG_RETURN_DATUM(HeapTupleGetDatum(htup));
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index dc20122..30933fd 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -489,9 +489,7 @@ main(int argc, char *argv[])
mode = PG_MODE_ENABLE;
break;
case 'f':
- if (!option_parse_int(optarg, "-f/--filenode", 0,
- INT_MAX,
- NULL))
+ if (!option_parse_relfilenumber(optarg, "-f/--filenode"))
exit(1);
only_filenode = pstrdup(optarg);
break;
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index c390ec5..f727078 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -250,6 +250,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
EpochFromFullTransactionId(ControlFile->checkPointCopy.nextXid),
XidFromFullTransactionId(ControlFile->checkPointCopy.nextXid));
+ printf(_("Latest checkpoint's NextRelFileNumber: " INT64_FORMAT "\n"),
+ ControlFile->checkPointCopy.nextRelFileNumber);
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index f9c51d1..0db33ef 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3142,9 +3142,9 @@ dumpDatabase(Archive *fout)
PQExpBuffer loFrozenQry = createPQExpBuffer();
PQExpBuffer loOutQry = createPQExpBuffer();
int i_relfrozenxid,
- i_relfilenode,
i_oid,
i_relminmxid;
+ RelFileNumber i_relfilenode;
/*
* pg_largeobject
@@ -3170,11 +3170,11 @@ dumpDatabase(Archive *fout)
appendPQExpBufferStr(loOutQry, "\n-- For binary upgrade, preserve values for pg_largeobject and its index\n");
for (int i = 0; i < PQntuples(lo_res); ++i)
appendPQExpBuffer(loOutQry, "UPDATE pg_catalog.pg_class\n"
- "SET relfrozenxid = '%u', relminmxid = '%u', relfilenode = '%u'\n"
+ "SET relfrozenxid = '%u', relminmxid = '%u', relfilenode = " INT64_FORMAT "\n"
"WHERE oid = %u;\n",
atooid(PQgetvalue(lo_res, i, i_relfrozenxid)),
atooid(PQgetvalue(lo_res, i, i_relminmxid)),
- atooid(PQgetvalue(lo_res, i, i_relfilenode)),
+ atorelnumber(PQgetvalue(lo_res, i, i_relfilenode)),
atooid(PQgetvalue(lo_res, i, i_oid)));
ArchiveEntry(fout, nilCatalogId, createDumpId(),
@@ -4853,16 +4853,16 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
relkind = *PQgetvalue(upgrade_res, 0, PQfnumber(upgrade_res, "relkind"));
- relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "relfilenode")));
+ relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "relfilenode")));
toast_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "reltoastrelid")));
- toast_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_relfilenode")));
+ toast_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_relfilenode")));
toast_index_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "indexrelid")));
- toast_index_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_index_relfilenode")));
+ toast_index_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_index_relfilenode")));
appendPQExpBufferStr(upgrade_buffer,
"\n-- For binary upgrade, must preserve pg_class oids and relfilenodes\n");
@@ -4880,7 +4880,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
*/
if (RelFileNumberIsValid(relfilenumber) && relkind != RELKIND_PARTITIONED_TABLE)
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
/*
@@ -4894,7 +4894,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_toast_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_relfilenumber);
/* every toast table has an index */
@@ -4902,7 +4902,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_index_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_index_relfilenumber);
}
@@ -4915,7 +4915,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
pg_class_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
}
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 269ed64..8be5e66 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -538,7 +538,7 @@ isRelDataFile(const char *path)
segNo = 0;
matched = false;
- nmatch = sscanf(path, "global/%u.%u", &rlocator.relNumber, &segNo);
+ nmatch = sscanf(path, "global/" INT64_FORMAT ".%u", &rlocator.relNumber, &segNo);
if (nmatch == 1 || nmatch == 2)
{
rlocator.spcOid = GLOBALTABLESPACE_OID;
@@ -547,7 +547,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "base/%u/%u.%u",
+ nmatch = sscanf(path, "base/%u/" INT64_FORMAT ".%u",
&rlocator.dbOid, &rlocator.relNumber, &segNo);
if (nmatch == 2 || nmatch == 3)
{
@@ -556,7 +556,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/%u.%u",
+ nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/" INT64_FORMAT ".%u",
&rlocator.spcOid, &rlocator.dbOid, &rlocator.relNumber,
&segNo);
if (nmatch == 3 || nmatch == 4)
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index df374ce..e97ba4d 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -399,8 +399,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
i_reloid,
i_indtable,
i_toastheap,
- i_relfilenumber,
i_reltablespace;
+ RelFileNumber i_relfilenumber;
char query[QUERY_ALLOC];
char *last_namespace = NULL,
*last_tablespace = NULL;
@@ -527,7 +527,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
relname = PQgetvalue(res, relnum, i_relname);
curr->relname = pg_strdup(relname);
- curr->relfilenumber = atooid(PQgetvalue(res, relnum, i_relfilenumber));
+ curr->relfilenumber =
+ atorelnumber(PQgetvalue(res, relnum, i_relfilenumber));
curr->tblsp_alloc = false;
/* Is the tablespace oid non-default? */
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index 115faa2..7ab1bcc 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -15,10 +15,8 @@
* oids are the same between old and new clusters. This is important
* because toast oids are stored as toast pointers in user tables.
*
- * While pg_class.oid and pg_class.relfilenode are initially the same in a
- * cluster, they can diverge due to CLUSTER, REINDEX, or VACUUM FULL. We
- * control assignments of pg_class.relfilenode because we want the filenames
- * to match between the old and new cluster.
+ * We control assignments of pg_class.relfilenode because we want the
+ * filenames to match between the old and new cluster.
*
* We control assignment of pg_tablespace.oid because we want the oid to match
* between the old and new cluster.
diff --git a/src/bin/pg_upgrade/relfilenumber.c b/src/bin/pg_upgrade/relfilenumber.c
index c3c7402..c8b10e4 100644
--- a/src/bin/pg_upgrade/relfilenumber.c
+++ b/src/bin/pg_upgrade/relfilenumber.c
@@ -190,14 +190,14 @@ transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_fro
else
snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
- snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
+ snprintf(old_file, sizeof(old_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->old_tablespace,
map->old_tablespace_suffix,
map->db_oid,
map->relfilenumber,
type_suffix,
extent_suffix);
- snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
+ snprintf(new_file, sizeof(new_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->new_tablespace,
map->new_tablespace_suffix,
map->db_oid,
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 6528113..9aca583 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -884,7 +884,7 @@ main(int argc, char **argv)
}
break;
case 'R':
- if (sscanf(optarg, "%u/%u/%u",
+ if (sscanf(optarg, "%u/%u/" INT64_FORMAT,
&config.filter_by_relation.spcOid,
&config.filter_by_relation.dbOid,
&config.filter_by_relation.relNumber) != 3 ||
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 1b6b620..0774e3f 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -149,10 +149,10 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
Assert(dbOid == 0);
Assert(backendId == InvalidBackendId);
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("global/%u_%s",
+ path = psprintf("global/" INT64_FORMAT "_%s",
relNumber, forkNames[forkNumber]);
else
- path = psprintf("global/%u", relNumber);
+ path = psprintf("global/" INT64_FORMAT, relNumber);
}
else if (spcOid == DEFAULTTABLESPACE_OID)
{
@@ -160,21 +160,21 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/%u_%s",
+ path = psprintf("base/%u/" INT64_FORMAT "_%s",
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/%u",
+ path = psprintf("base/%u/" INT64_FORMAT,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/t%d_%u_%s",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT "_%s",
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/t%d_%u",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT,
dbOid, backendId, relNumber);
}
}
@@ -184,24 +184,24 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber);
}
diff --git a/src/fe_utils/option_utils.c b/src/fe_utils/option_utils.c
index abea881..7d49359 100644
--- a/src/fe_utils/option_utils.c
+++ b/src/fe_utils/option_utils.c
@@ -82,3 +82,42 @@ option_parse_int(const char *optarg, const char *optname,
*result = val;
return true;
}
+
+/*
+ * option_parse_relfilenumber
+ *
+ * Parse a relfilenumber value for an option. If parsing succeeds, returns
+ * true; if parsing fails, logs an error and returns false.
+ */
+bool
+option_parse_relfilenumber(const char *optarg, const char *optname)
+{
+ char *endptr;
+ int64 val;
+
+ errno = 0;
+ val = strtoi64(optarg, &endptr, 10);
+
+ /*
+ * Skip any trailing whitespace; if anything but whitespace remains before
+ * the terminating character, fail.
+ */
+ while (*endptr != '\0' && isspace((unsigned char) *endptr))
+ endptr++;
+
+ if (*endptr != '\0')
+ {
+ pg_log_error("invalid value \"%s\" for option %s",
+ optarg, optname);
+ return false;
+ }
+
+ if (val < 0 || val > MAX_RELFILENUMBER)
+ {
+ pg_log_error("%s must be in range " INT64_FORMAT ".." INT64_FORMAT,
+ optname, (RelFileNumber) 0, MAX_RELFILENUMBER);
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 775471d..c62fbc2 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -196,6 +196,21 @@ FullTransactionIdAdvance(FullTransactionId *dest)
#define FirstUnpinnedObjectId 12000
#define FirstNormalObjectId 16384
+/* ----------
+ * RelFileNumber zero is InvalidRelFileNumber.
+ *
+ * For the system tables (OID < FirstNormalObjectId), the initial storage
+ * is created with a relfilenumber equal to the table's OID. Relfilenumbers
+ * allocated later by GetNewRelFileNumber() start at 100000. Thus, when
+ * upgrading from an older cluster, the storage paths of user tables from
+ * the old cluster cannot conflict with the storage paths of system tables
+ * in the new cluster. (The new cluster must not contain any user tables
+ * while the upgrade runs, so those need no special handling.)
+ * ----------
+ */
+#define FirstNormalRelFileNumber ((RelFileNumber) 100000)
+
/*
* VariableCache is a data structure in shared memory that is used to track
* OID and XID assignment state. For largely historical reasons, there is
@@ -215,6 +230,14 @@ typedef struct VariableCacheData
uint32 oidCount; /* OIDs available before must do XLOG work */
/*
+ * These fields are protected by RelFileNumberGenLock.
+ */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber to assign */
+ RelFileNumber loggedRelFileNumber; /* last logged relfilenumber */
+ XLogRecPtr loggedRelFileNumberRecPtr; /* xlog record pointer w.r.t.
+ * loggedRelFileNumber */
+
+ /*
* These fields are protected by XidGenLock.
*/
FullTransactionId nextXid; /* next XID to assign */
@@ -293,6 +316,9 @@ extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
extern bool ForceTransactionIdLimitUpdate(void);
extern Oid GetNewObjectId(void);
+extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
+ char relpersistence);
+extern void SetNextRelFileNumber(RelFileNumber relnumber);
extern void StopGeneratingPinnedObjectIds(void);
#ifdef USE_ASSERT_CHECKING
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index cd674c3..78660f1 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -234,6 +234,8 @@ extern void CreateCheckPoint(int flags);
extern bool CreateRestartPoint(int flags);
extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN);
extern void XLogPutNextOid(Oid nextOid);
+extern void LogNextRelFileNumber(RelFileNumber nextrelnumber,
+ XLogRecPtr *prevrecptr);
extern XLogRecPtr XLogRestorePoint(const char *rpName);
extern void UpdateFullPageWrites(void);
extern void GetFullPageWriteInfo(XLogRecPtr *RedoRecPtr_p, bool *doPageWrites_p);
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index e1c85f9..b452530 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -38,8 +38,5 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
-extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
- Relation pg_class,
- char relpersistence);
#endif /* CATALOG_H */
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index e1f4eef..cacfc11 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -34,6 +34,13 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* oid */
Oid oid;
+ /* access method; 0 if not a table / index */
+ Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
+
+ /* identifier of physical storage file */
+ /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
+ int64 relfilenode BKI_DEFAULT(0);
+
/* class name */
NameData relname;
@@ -49,13 +56,6 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* class owner */
Oid relowner BKI_DEFAULT(POSTGRES) BKI_LOOKUP(pg_authid);
- /* access method; 0 if not a table / index */
- Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
-
- /* identifier of physical storage file */
- /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
- Oid relfilenode BKI_DEFAULT(0);
-
/* identifier of table space for relation (0 means default for database) */
Oid reltablespace BKI_DEFAULT(0) BKI_LOOKUP_OPT(pg_tablespace);
@@ -154,7 +154,7 @@ typedef FormData_pg_class *Form_pg_class;
DECLARE_UNIQUE_INDEX_PKEY(pg_class_oid_index, 2662, ClassOidIndexId, on pg_class using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_class_relname_nsp_index, 2663, ClassNameNspIndexId, on pg_class using btree(relname name_ops, relnamespace oid_ops));
-DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode oid_ops));
+DECLARE_INDEX(pg_class_relfilenode_index, 3455, ClassRelfilenodeIndexId, on pg_class using btree(relfilenode int8_ops));
#ifdef EXPOSE_TO_CLIENT_CODE
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 06368e2..096222f 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -41,6 +41,7 @@ typedef struct CheckPoint
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
FullTransactionId nextXid; /* next free transaction ID */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
@@ -78,6 +79,7 @@ typedef struct CheckPoint
#define XLOG_FPI 0xB0
/* 0xC0 is used in Postgres 9.5-11 */
#define XLOG_OVERWRITE_CONTRECORD 0xD0
+#define XLOG_NEXT_RELFILENUMBER 0xE0
/*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 2e41f4d..f6a5b49 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7321,11 +7321,11 @@
proname => 'pg_indexes_size', provolatile => 'v', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_indexes_size' },
{ oid => '2999', descr => 'filenode identifier of relation',
- proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'oid',
+ proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_relation_filenode' },
{ oid => '3454', descr => 'relation OID for filenode and tablespace',
proname => 'pg_filenode_relation', provolatile => 's',
- prorettype => 'regclass', proargtypes => 'oid oid',
+ prorettype => 'regclass', proargtypes => 'oid int8',
prosrc => 'pg_filenode_relation' },
{ oid => '3034', descr => 'file path of relation',
proname => 'pg_relation_filepath', provolatile => 's', prorettype => 'text',
@@ -11191,15 +11191,15 @@
prosrc => 'binary_upgrade_set_missing_value' },
{ oid => '4545', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_heap_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_heap_relfilenode' },
{ oid => '4546', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_index_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_index_relfilenode' },
{ oid => '4547', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_toast_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_toast_relfilenode' },
{ oid => '4548', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_pg_tablespace_oid', provolatile => 'v',
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index 3ab7132..f873306 100644
--- a/src/include/common/relpath.h
+++ b/src/include/common/relpath.h
@@ -29,6 +29,8 @@
/* Characters to allow for an OID in a relation path */
#define OIDCHARS 10 /* max chars printed by %u */
+/* Characters to allow for a RelFileNumber in a relation path */
+#define RELNUMBERCHARS 20 /* max chars printed by %lu */
/*
* Stuff for fork names.
*
diff --git a/src/include/fe_utils/option_utils.h b/src/include/fe_utils/option_utils.h
index 03c09fd..2508a61 100644
--- a/src/include/fe_utils/option_utils.h
+++ b/src/include/fe_utils/option_utils.h
@@ -22,5 +22,7 @@ extern void handle_help_version_opts(int argc, char *argv[],
extern bool option_parse_int(const char *optarg, const char *optname,
int min_range, int max_range,
int *result);
+extern bool option_parse_relfilenumber(const char *optarg,
+ const char *optname);
#endif /* OPTION_UTILS_H */
diff --git a/src/include/postgres_ext.h b/src/include/postgres_ext.h
index c9774fa..12b6b80 100644
--- a/src/include/postgres_ext.h
+++ b/src/include/postgres_ext.h
@@ -48,11 +48,17 @@ typedef PG_INT64_TYPE pg_int64;
/*
* RelFileNumber data type identifies the specific relation file name.
+ * RelFileNumber is unique within a cluster.
*/
-typedef Oid RelFileNumber;
-#define InvalidRelFileNumber ((RelFileNumber) InvalidOid)
+typedef pg_int64 RelFileNumber;
+
+#define InvalidRelFileNumber ((RelFileNumber) 0)
#define RelFileNumberIsValid(relnumber) \
((bool) ((relnumber) != InvalidRelFileNumber))
+#define atorelnumber(x) ((RelFileNumber) strtoull((x), NULL, 10))
+
+/* Max value of the relfilenumber; the relfilenumber is 56 bits wide. */
+#define MAX_RELFILENUMBER INT64CONST(0x00FFFFFFFFFFFFFF)
/*
* Identifiers of error message fields. Kept here to keep common
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 5612ca3..ca603c2 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -92,29 +92,73 @@ typedef struct buftag
{
Oid spcOid; /* tablespace oid */
Oid dbOid; /* database oid */
- RelFileNumber relNumber; /* relation file number */
- ForkNumber forkNum; /* fork number */
+
+
+ /*
+ * Represents the relfilenumber and the fork number. The high 8 bits of
+ * the first 32-bit integer hold the fork number; the remaining 24 bits
+ * of the first integer plus the 32 bits of the second integer hold the
+ * relfilenumber, making it 56 bits wide. We stop at 56 bits rather than
+ * a full 64 because a 64-bit field would enlarge the BufferTag, and we
+ * use two 32-bit integers instead of a single 64-bit one to avoid
+ * 8-byte alignment padding in the BufferTag structure.
+ */
+ uint32 relForkDetails[2];
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
+/* High relNumber bits in relForkDetails[0] */
+#define BUFTAG_RELNUM_HIGH_BITS 24
+
+/* Low relNumber bits in relForkDetails[1] */
+#define BUFTAG_RELNUM_LOW_BITS 32
+
+/* Mask to fetch high bits of relNumber from relForkDetails[0] */
+#define BUFTAG_RELNUM_HIGH_MASK ((1U << BUFTAG_RELNUM_HIGH_BITS) - 1)
+
+/* Mask to fetch low bits of relNumber from relForkDetails[1] */
+#define BUFTAG_RELNUM_LOW_MASK 0XFFFFFFFF
+
static inline RelFileNumber
BufTagGetRelNumber(const BufferTag *tag)
{
- return tag->relNumber;
+ uint64 relnum;
+
+ relnum = ((uint64) tag->relForkDetails[0]) & BUFTAG_RELNUM_HIGH_MASK;
+ relnum = (relnum << BUFTAG_RELNUM_LOW_BITS) | tag->relForkDetails[1];
+
+ Assert(relnum <= MAX_RELFILENUMBER);
+ return (RelFileNumber) relnum;
}
static inline ForkNumber
BufTagGetForkNum(const BufferTag *tag)
{
- return tag->forkNum;
+ int8 ret;
+
+ StaticAssertStmt(MAX_FORKNUM <= INT8_MAX,
+ "MAX_FORKNUM can't be greater than INT8_MAX");
+
+ ret = (int8) (tag->relForkDetails[0] >> BUFTAG_RELNUM_HIGH_BITS);
+ return (ForkNumber) ret;
}
static inline void
BufTagSetRelForkDetails(BufferTag *tag, RelFileNumber relnumber,
ForkNumber forknum)
{
- tag->relNumber = relnumber;
- tag->forkNum = forknum;
+ uint64 relnum;
+
+ Assert(relnumber <= MAX_RELFILENUMBER);
+ Assert(forknum <= MAX_FORKNUM);
+
+ relnum = relnumber;
+
+ tag->relForkDetails[0] = (relnum >> BUFTAG_RELNUM_LOW_BITS) &
+ BUFTAG_RELNUM_HIGH_MASK;
+ tag->relForkDetails[0] |= (forknum << BUFTAG_RELNUM_HIGH_BITS);
+ tag->relForkDetails[1] = relnum & BUFTAG_RELNUM_LOW_MASK;
}
static inline RelFileLocator
@@ -153,9 +197,9 @@ BufferTagEqual(const BufferTag *tag1, const BufferTag *tag2)
{
return (tag1->spcOid == tag2->spcOid) &&
(tag1->dbOid == tag2->dbOid) &&
- (tag1->relNumber == tag2->relNumber) &&
- (tag1->blockNum == tag2->blockNum) &&
- (tag1->forkNum == tag2->forkNum);
+ (tag1->relForkDetails[0] == tag2->relForkDetails[0]) &&
+ (tag1->relForkDetails[1] == tag2->relForkDetails[1]) &&
+ (tag1->blockNum == tag2->blockNum);
}
static inline bool
diff --git a/src/include/storage/reinit.h b/src/include/storage/reinit.h
index bf2c10d..b990d28 100644
--- a/src/include/storage/reinit.h
+++ b/src/include/storage/reinit.h
@@ -20,7 +20,8 @@
extern void ResetUnloggedRelations(int op);
extern bool parse_filename_for_nontemp_relation(const char *name,
- int *oidchars, ForkNumber *fork);
+ int *relnumchars,
+ ForkNumber *fork);
#define UNLOGGED_RELATION_CLEANUP 0x0001
#define UNLOGGED_RELATION_INIT 0x0002
diff --git a/src/include/storage/relfilelocator.h b/src/include/storage/relfilelocator.h
index 10f41f3..ef90464 100644
--- a/src/include/storage/relfilelocator.h
+++ b/src/include/storage/relfilelocator.h
@@ -32,10 +32,11 @@
* Nonzero dbOid values correspond to pg_database.oid.
*
* relNumber identifies the specific relation. relNumber corresponds to
- * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
- * to assign new physical files to relations in some situations).
- * Notice that relNumber is only unique within a database in a particular
- * tablespace.
+ * pg_class.relfilenode. Notice that relNumber values are assigned by
+ * GetNewRelFileNumber(), which will only ever assign the same value once
+ * during the lifetime of a cluster. However, since CREATE DATABASE duplicates
+ * the relfilenumbers of the template database, the values are in practice only
+ * unique within a database, not globally.
*
* Note: spcOid must be GLOBALTABLESPACE_OID if and only if dbOid is
* zero. We support shared relations only in the "global" tablespace.
@@ -75,6 +76,9 @@ typedef struct RelFileLocatorBackend
BackendId backend;
} RelFileLocatorBackend;
+#define SizeOfRelFileLocatorBackend \
+ (offsetof(RelFileLocatorBackend, backend) + sizeof(BackendId))
+
#define RelFileLocatorBackendIsTemp(rlocator) \
((rlocator).backend != InvalidBackendId)
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index d63f4f1..a489ccc 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -2164,9 +2164,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2175,10 +2174,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+---------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | t | own | child 0 index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | t | own | child 1 index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | t | orig | child 0 index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | t | orig | child 1 index
at_partitioned_id_name_key | t | none | parent index
(6 rows)
@@ -2198,9 +2197,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2209,10 +2207,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+--------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | f | own | parent index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | f | own | parent index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | f | new | parent index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | f | new | parent index
at_partitioned_id_name_key | f | none | parent index
(6 rows)
@@ -2560,7 +2558,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/expected/fast_default.out b/src/test/regress/expected/fast_default.out
index 91f2571..0a35f33 100644
--- a/src/test/regress/expected/fast_default.out
+++ b/src/test/regress/expected/fast_default.out
@@ -3,8 +3,8 @@
--
SET search_path = fast_default;
CREATE SCHEMA fast_default;
-CREATE TABLE m(id OID);
-INSERT INTO m VALUES (NULL::OID);
+CREATE TABLE m(id BIGINT);
+INSERT INTO m VALUES (NULL::BIGINT);
CREATE FUNCTION set(tabname name) RETURNS VOID
AS $$
BEGIN
diff --git a/src/test/regress/expected/oidjoins.out b/src/test/regress/expected/oidjoins.out
index 215eb89..af57470 100644
--- a/src/test/regress/expected/oidjoins.out
+++ b/src/test/regress/expected/oidjoins.out
@@ -74,11 +74,11 @@ NOTICE: checking pg_type {typcollation} => pg_collation {oid}
NOTICE: checking pg_attribute {attrelid} => pg_class {oid}
NOTICE: checking pg_attribute {atttypid} => pg_type {oid}
NOTICE: checking pg_attribute {attcollation} => pg_collation {oid}
+NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {relnamespace} => pg_namespace {oid}
NOTICE: checking pg_class {reltype} => pg_type {oid}
NOTICE: checking pg_class {reloftype} => pg_type {oid}
NOTICE: checking pg_class {relowner} => pg_authid {oid}
-NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {reltablespace} => pg_tablespace {oid}
NOTICE: checking pg_class {reltoastrelid} => pg_class {oid}
NOTICE: checking pg_class {relrewrite} => pg_class {oid}
diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql
index e7013f5..c9622f6 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -1478,9 +1478,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1499,9 +1498,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1641,7 +1639,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/sql/fast_default.sql b/src/test/regress/sql/fast_default.sql
index 16a3b7c..819ec40 100644
--- a/src/test/regress/sql/fast_default.sql
+++ b/src/test/regress/sql/fast_default.sql
@@ -4,8 +4,8 @@
SET search_path = fast_default;
CREATE SCHEMA fast_default;
-CREATE TABLE m(id OID);
-INSERT INTO m VALUES (NULL::OID);
+CREATE TABLE m(id BIGINT);
+INSERT INTO m VALUES (NULL::BIGINT);
CREATE FUNCTION set(tabname name) RETURNS VOID
AS $$
--
1.8.3.1
On Wed, 27 Jul 2022 at 9:49 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Wed, Jul 27, 2022 at 12:07 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Jul 26, 2022 at 2:07 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
I have thought about it while doing so, but I am not sure whether it is
a good idea or not, because before my change these were all macros
with two naming conventions, so I just changed them to inline functions;
why change the names as well?

Well, the reason to change the name would be for consistency. It feels
weird to have some NAMES_LIKETHIS() and other NamesLikeThis().

Now, an argument against that is that it will make back-patching more
annoying, if any code using these functions/macros is touched. But
since the calling sequence is changing anyway (you now have to pass a
pointer rather than the object itself) that argument doesn't really
carry any weight. So I would favor ClearBufferTag(), InitBufferTag(),
etc.

Okay, so I have renamed these 2 functions and BUFFERTAGS_EQUAL as well
to BufferTagEqual().
Just realised that this should have been BufferTagsEqual instead of
BufferTagEqual
I will modify this and send an updated patch tomorrow.
—
Dilip
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Wed, Jul 27, 2022 at 12:37 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
Just realised that this should have been BufferTagsEqual instead of BufferTagEqual
I will modify this and send an updated patch tomorrow.
I changed it and committed.
What was formerly 0002 will need minor rebasing.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Wed, Jul 27, 2022 at 11:39 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Jul 27, 2022 at 12:37 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
Just realised that this should have been BufferTagsEqual instead of BufferTagEqual
I will modify this and send an updated patch tomorrow.
I changed it and committed.
What was formerly 0002 will need minor rebasing.
Thanks, I have rebased the other patches; there is now a new 0001
patch. It seems that during the renaming of relnode-related Oids to
RelFileNumber, some references were missed. In the last patch set I
had kept those fixes as part of the main 0003 patch, but I think it's
better to keep them separate, so I took them out and created 0001. If
you think they should be committed as part of 0003 instead, that's
also fine with me.
I have done some cleanup in 0002 as well: earlier we were storing the
result of BufTagGetRelFileLocator() in a separate variable, which is
not required everywhere, so wherever possible I have avoided the
intermediate variable.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachments:
v14-0002-Preliminary-refactoring-for-supporting-larger-re.patchtext/x-patch; charset=US-ASCII; name=v14-0002-Preliminary-refactoring-for-supporting-larger-re.patchDownload
From de216a910b5c3cd5d8d8ce4a9dfac0c37d8f159e Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Thu, 28 Jul 2022 11:49:30 +0530
Subject: [PATCH v14 2/3] Preliminary refactoring for supporting larger
relfilenumber
Currently, relfilenumber is of type Oid and can wrap around. As part of
the larger patch set we are making it 64 bits wide to avoid wraparound,
which will also make a couple of other things simpler, as explained in
the next patches.
This is a preliminary refactoring patch toward that goal: in BufferTag,
instead of keeping a RelFileLocator, we keep the tablespace Oid, the
database Oid, and the relfilenumber directly, so that once relNumber in
RelFileLocator becomes 64 bits wide the buffer tag's alignment padding
will not change.
---
contrib/pg_buffercache/pg_buffercache_pages.c | 8 +-
contrib/pg_prewarm/autoprewarm.c | 10 ++-
src/backend/storage/buffer/bufmgr.c | 115 +++++++++++++++-----------
src/backend/storage/buffer/localbuf.c | 21 +++--
src/include/storage/buf_internals.h | 64 ++++++++++++--
5 files changed, 145 insertions(+), 73 deletions(-)
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 131bd62..c5754ea 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -153,10 +153,10 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenumber = bufHdr->tag.rlocator.relNumber;
- fctx->record[i].reltablespace = bufHdr->tag.rlocator.spcOid;
- fctx->record[i].reldatabase = bufHdr->tag.rlocator.dbOid;
- fctx->record[i].forknum = bufHdr->tag.forkNum;
+ fctx->record[i].relfilenumber = BufTagGetRelNumber(&bufHdr->tag);
+ fctx->record[i].reltablespace = bufHdr->tag.spcOid;
+ fctx->record[i].reldatabase = bufHdr->tag.dbOid;
+ fctx->record[i].forknum = BufTagGetForkNum(&bufHdr->tag);
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
fctx->record[i].pinning_backends = BUF_STATE_GET_REFCOUNT(buf_state);
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index d9ab39d..c8d673a 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -630,10 +630,12 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
if (buf_state & BM_TAG_VALID &&
((buf_state & BM_PERMANENT) || dump_unlogged))
{
- block_info_array[num_blocks].database = bufHdr->tag.rlocator.dbOid;
- block_info_array[num_blocks].tablespace = bufHdr->tag.rlocator.spcOid;
- block_info_array[num_blocks].filenumber = bufHdr->tag.rlocator.relNumber;
- block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
+ block_info_array[num_blocks].database = bufHdr->tag.dbOid;
+ block_info_array[num_blocks].tablespace = bufHdr->tag.spcOid;
+ block_info_array[num_blocks].filenumber =
+ BufTagGetRelNumber(&bufHdr->tag);
+ block_info_array[num_blocks].forknum =
+ BufTagGetForkNum(&bufHdr->tag);
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
}
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 6b30138..7a75711 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1657,8 +1657,8 @@ ReleaseAndReadBuffer(Buffer buffer,
{
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
- bufHdr->tag.forkNum == forkNum)
+ BufTagMatchesRelFileLocator(&bufHdr->tag, &relation->rd_locator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
LocalRefCount[-buffer - 1]--;
@@ -1668,8 +1668,8 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetBufferDescriptor(buffer - 1);
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
- bufHdr->tag.forkNum == forkNum)
+ BufTagMatchesRelFileLocator(&bufHdr->tag, &relation->rd_locator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
}
@@ -2010,9 +2010,9 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
- item->tsId = bufHdr->tag.rlocator.spcOid;
- item->relNumber = bufHdr->tag.rlocator.relNumber;
- item->forkNum = bufHdr->tag.forkNum;
+ item->tsId = bufHdr->tag.spcOid;
+ item->relNumber = BufTagGetRelNumber(&bufHdr->tag);
+ item->forkNum = BufTagGetForkNum(&bufHdr->tag);
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2718,7 +2718,8 @@ PrintBufferLeakWarning(Buffer buffer)
}
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rlocator, backend, buf->tag.forkNum);
+ path = relpathbackend(BufTagGetRelFileLocator(&buf->tag), backend,
+ BufTagGetForkNum(&buf->tag));
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2797,8 +2798,8 @@ BufferGetTag(Buffer buffer, RelFileLocator *rlocator, ForkNumber *forknum,
bufHdr = GetBufferDescriptor(buffer - 1);
/* pinned, so OK to read tag without spinlock */
- *rlocator = bufHdr->tag.rlocator;
- *forknum = bufHdr->tag.forkNum;
+ *rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
+ *forknum = BufTagGetForkNum(&bufHdr->tag);
*blknum = bufHdr->tag.blockNum;
}
@@ -2848,9 +2849,9 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rlocator, InvalidBackendId);
+ reln = smgropen(BufTagGetRelFileLocator(&buf->tag), InvalidBackendId);
- TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
+ TRACE_POSTGRESQL_BUFFER_FLUSH_START(BufTagGetForkNum(&buf->tag),
buf->tag.blockNum,
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
@@ -2909,7 +2910,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
* bufToWrite is either the shared buffer or a copy, as appropriate.
*/
smgrwrite(reln,
- buf->tag.forkNum,
+ BufTagGetForkNum(&buf->tag),
buf->tag.blockNum,
bufToWrite,
false);
@@ -2930,7 +2931,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
*/
TerminateBufferIO(buf, true, 0);
- TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(buf->tag.forkNum,
+ TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(BufTagGetForkNum(&buf->tag),
buf->tag.blockNum,
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
@@ -3151,15 +3152,15 @@ DropRelationBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
* We could check forkNum and blockNum as well as the rlocator, but
* the incremental win from doing so seems small.
*/
- if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator))
+ if (!BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator.locator))
continue;
buf_state = LockBufHdr(bufHdr);
for (j = 0; j < nforks; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator) &&
- bufHdr->tag.forkNum == forkNum[j] &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator.locator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum[j] &&
bufHdr->tag.blockNum >= firstDelBlock[j])
{
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -3310,7 +3311,7 @@ DropRelationsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
for (j = 0; j < n; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, locators[j]))
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &locators[j]))
{
rlocator = &locators[j];
break;
@@ -3319,7 +3320,10 @@ DropRelationsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
}
else
{
- rlocator = bsearch((const void *) &(bufHdr->tag.rlocator),
+ RelFileLocator locator;
+
+ locator = BufTagGetRelFileLocator(&bufHdr->tag);
+ rlocator = bsearch((const void *) &(locator),
locators, n, sizeof(RelFileLocator),
rlocator_comparator);
}
@@ -3329,7 +3333,7 @@ DropRelationsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, (*rlocator)))
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, rlocator))
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3389,8 +3393,8 @@ FindAndDropRelationBuffers(RelFileLocator rlocator, ForkNumber forkNum,
*/
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
- bufHdr->tag.forkNum == forkNum &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
@@ -3428,11 +3432,11 @@ DropDatabaseBuffers(Oid dbid)
* As in DropRelationBuffers, an unlocked precheck should be
* safe and saves some cycles.
*/
- if (bufHdr->tag.rlocator.dbOid != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rlocator.dbOid == dbid)
+ if (bufHdr->tag.dbOid == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3462,7 +3466,8 @@ PrintBufferDescs(void)
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathbackend(buf->tag.rlocator, InvalidBackendId, buf->tag.forkNum),
+ relpathbackend(BufTagGetRelFileLocator(&buf->tag),
+ InvalidBackendId, BufTagGetForkNum(&buf->tag)),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3487,7 +3492,8 @@ PrintPinnedBufs(void)
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathperm(buf->tag.rlocator, buf->tag.forkNum),
+ relpathperm(BufTagGetRelFileLocator(&buf->tag),
+ BufTagGetForkNum(&buf->tag)),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3526,7 +3532,7 @@ FlushRelationBuffers(Relation rel)
uint32 buf_state;
bufHdr = GetLocalBufferDescriptor(i);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rel->rd_locator) &&
((buf_state = pg_atomic_read_u32(&bufHdr->state)) &
(BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
@@ -3544,7 +3550,7 @@ FlushRelationBuffers(Relation rel)
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
smgrwrite(RelationGetSmgr(rel),
- bufHdr->tag.forkNum,
+ BufTagGetForkNum(&bufHdr->tag),
bufHdr->tag.blockNum,
localpage,
false);
@@ -3573,13 +3579,13 @@ FlushRelationBuffers(Relation rel)
* As in DropRelationBuffers, an unlocked precheck should be
* safe and saves some cycles.
*/
- if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator))
+ if (!BufTagMatchesRelFileLocator(&bufHdr->tag, &rel->rd_locator))
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rel->rd_locator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3653,7 +3659,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
for (j = 0; j < nrels; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, srels[j].rlocator))
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &srels[j].rlocator))
{
srelent = &srels[j];
break;
@@ -3662,7 +3668,10 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
}
else
{
- srelent = bsearch((const void *) &(bufHdr->tag.rlocator),
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
+ srelent = bsearch((const void *) &(rlocator),
srels, nrels, sizeof(SMgrSortArray),
rlocator_comparator);
}
@@ -3674,7 +3683,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, srelent->rlocator) &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &srelent->rlocator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3876,13 +3885,13 @@ FlushDatabaseBuffers(Oid dbid)
* As in DropRelationBuffers, an unlocked precheck should be
* safe and saves some cycles.
*/
- if (bufHdr->tag.rlocator.dbOid != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rlocator.dbOid == dbid &&
+ if (bufHdr->tag.dbOid == dbid &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -4051,7 +4060,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
* See src/backend/storage/page/README for longer discussion.
*/
if (RecoveryInProgress() ||
- RelFileLocatorSkippingWAL(bufHdr->tag.rlocator))
+ RelFileLocatorSkippingWAL(BufTagGetRelFileLocator(&bufHdr->tag)))
return;
/*
@@ -4660,7 +4669,8 @@ AbortBufferIO(void)
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
- path = relpathperm(buf->tag.rlocator, buf->tag.forkNum);
+ path = relpathperm(BufTagGetRelFileLocator(&buf->tag),
+ BufTagGetForkNum(&buf->tag));
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4684,7 +4694,8 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rlocator, bufHdr->tag.forkNum);
+ char *path = relpathperm(BufTagGetRelFileLocator(&bufHdr->tag),
+ BufTagGetForkNum(&bufHdr->tag));
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4702,8 +4713,9 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
- char *path = relpathbackend(bufHdr->tag.rlocator, MyBackendId,
- bufHdr->tag.forkNum);
+ char *path = relpathbackend(BufTagGetRelFileLocator(&bufHdr->tag),
+ MyBackendId,
+ BufTagGetForkNum(&bufHdr->tag));
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4797,15 +4809,20 @@ static inline int
buffertag_comparator(const BufferTag *ba, const BufferTag *bb)
{
int ret;
+ RelFileLocator rlocatora;
+ RelFileLocator rlocatorb;
- ret = rlocator_comparator(&ba->rlocator, &bb->rlocator);
+ rlocatora = BufTagGetRelFileLocator(ba);
+ rlocatorb = BufTagGetRelFileLocator(bb);
+
+ ret = rlocator_comparator(&rlocatora, &rlocatorb);
if (ret != 0)
return ret;
- if (ba->forkNum < bb->forkNum)
+ if (BufTagGetForkNum(ba) < BufTagGetForkNum(bb))
return -1;
- if (ba->forkNum > bb->forkNum)
+ if (BufTagGetForkNum(ba) > BufTagGetForkNum(bb))
return 1;
if (ba->blockNum < bb->blockNum)
@@ -4955,10 +4972,12 @@ IssuePendingWritebacks(WritebackContext *context)
SMgrRelation reln;
int ahead;
BufferTag tag;
+ RelFileLocator currlocator;
Size nblocks = 1;
cur = &context->pending_writebacks[i];
tag = cur->tag;
+ currlocator = BufTagGetRelFileLocator(&tag);
/*
* Peek ahead, into following writeback requests, to see if they can
@@ -4966,11 +4985,13 @@ IssuePendingWritebacks(WritebackContext *context)
*/
for (ahead = 0; i + ahead + 1 < context->nr_pending; ahead++)
{
+
next = &context->pending_writebacks[i + ahead + 1];
/* different file, stop */
- if (!RelFileLocatorEquals(cur->tag.rlocator, next->tag.rlocator) ||
- cur->tag.forkNum != next->tag.forkNum)
+ if (!RelFileLocatorEquals(currlocator,
+ BufTagGetRelFileLocator(&next->tag)) ||
+ BufTagGetForkNum(&cur->tag) != BufTagGetForkNum(&next->tag))
break;
/* ok, block queued twice, skip */
@@ -4988,8 +5009,8 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rlocator, InvalidBackendId);
- smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
+ reln = smgropen(currlocator, InvalidBackendId);
+ smgrwriteback(reln, BufTagGetForkNum(&tag), tag.blockNum, nblocks);
}
context->nr_pending = 0;
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index 014f644..9853007 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -215,13 +215,13 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rlocator, MyBackendId);
+ oreln = smgropen(BufTagGetRelFileLocator(&bufHdr->tag), MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
/* And write... */
smgrwrite(oreln,
- bufHdr->tag.forkNum,
+ BufTagGetForkNum(&bufHdr->tag),
bufHdr->tag.blockNum,
localpage,
false);
@@ -337,16 +337,18 @@ DropRelationLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
- bufHdr->tag.forkNum == forkNum &&
+ BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rlocator, MyBackendId,
- bufHdr->tag.forkNum),
+ relpathbackend(BufTagGetRelFileLocator(&bufHdr->tag),
+ MyBackendId,
+ BufTagGetForkNum(&bufHdr->tag)),
LocalRefCount[i]);
+
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
hash_search(LocalBufHash, (void *) &bufHdr->tag,
@@ -383,13 +385,14 @@ DropRelationAllLocalBuffers(RelFileLocator rlocator)
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator))
+ BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator))
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rlocator, MyBackendId,
- bufHdr->tag.forkNum),
+ relpathbackend(BufTagGetRelFileLocator(&bufHdr->tag),
+ MyBackendId,
+ BufTagGetForkNum(&bufHdr->tag)),
LocalRefCount[i]);
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 7246655..406db6b 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,18 +90,51 @@
*/
typedef struct buftag
{
- RelFileLocator rlocator; /* physical relation identifier */
- ForkNumber forkNum;
+ Oid spcOid; /* tablespace oid */
+ Oid dbOid; /* database oid */
+ RelFileNumber relNumber; /* relation file number */
+ ForkNumber forkNum; /* fork number */
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
+static inline RelFileNumber
+BufTagGetRelNumber(const BufferTag *tag)
+{
+ return tag->relNumber;
+}
+
+static inline ForkNumber
+BufTagGetForkNum(const BufferTag *tag)
+{
+ return tag->forkNum;
+}
+
+static inline void
+BufTagSetRelForkDetails(BufferTag *tag, RelFileNumber relnumber,
+ ForkNumber forknum)
+{
+ tag->relNumber = relnumber;
+ tag->forkNum = forknum;
+}
+
+static inline RelFileLocator
+BufTagGetRelFileLocator(const BufferTag *tag)
+{
+ RelFileLocator rlocator;
+
+ rlocator.spcOid = tag->spcOid;
+ rlocator.dbOid = tag->dbOid;
+ rlocator.relNumber = BufTagGetRelNumber(tag);
+
+ return rlocator;
+}
+
static inline void
ClearBufferTag(BufferTag *tag)
{
- tag->rlocator.spcOid = InvalidOid;
- tag->rlocator.dbOid = InvalidOid;
- tag->rlocator.relNumber = InvalidRelFileNumber;
- tag->forkNum = InvalidForkNumber;
+ tag->spcOid = InvalidOid;
+ tag->dbOid = InvalidOid;
+ BufTagSetRelForkDetails(tag, InvalidRelFileNumber, InvalidForkNumber);
tag->blockNum = InvalidBlockNumber;
}
@@ -109,19 +142,32 @@ static inline void
InitBufferTag(BufferTag *tag, const RelFileLocator *rlocator,
ForkNumber forkNum, BlockNumber blockNum)
{
- tag->rlocator = *rlocator;
- tag->forkNum = forkNum;
+ tag->spcOid = rlocator->spcOid;
+ tag->dbOid = rlocator->dbOid;
+ BufTagSetRelForkDetails(tag, rlocator->relNumber, forkNum);
tag->blockNum = blockNum;
}
static inline bool
BufferTagsEqual(const BufferTag *tag1, const BufferTag *tag2)
{
- return RelFileLocatorEquals(tag1->rlocator, tag2->rlocator) &&
+ return (tag1->spcOid == tag2->spcOid) &&
+ (tag1->dbOid == tag2->dbOid) &&
+ (tag1->relNumber == tag2->relNumber) &&
(tag1->blockNum == tag2->blockNum) &&
(tag1->forkNum == tag2->forkNum);
}
+static inline bool
+BufTagMatchesRelFileLocator(const BufferTag *tag,
+ const RelFileLocator *rlocator)
+{
+ return (tag->spcOid == rlocator->spcOid) &&
+ (tag->dbOid == rlocator->dbOid) &&
+ (BufTagGetRelNumber(tag) == rlocator->relNumber);
+}
+
+
/*
* The shared buffer mapping table is partitioned to reduce contention.
* To determine which partition lock a given tag requires, compute the tag's
--
1.8.3.1
Attachment: v14-0001-Fixup-Oid-to-RelfileNumber.patch (text/x-patch)
From 78b1e22f67020f20215a17f12abb5d493d8b16f4 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Thu, 28 Jul 2022 14:20:04 +0530
Subject: [PATCH v14 1/3] Fixup: Oid to RelFileNumber
In commit b0a55e43299c4ea2a9a8c757f9c26352407d0ccc, some Oid
references were missed that should have been renamed to
RelFileNumber.
---
src/backend/commands/dbcommands.c | 4 ++--
src/backend/commands/tablespace.c | 3 ++-
src/backend/replication/basebackup.c | 13 +++++-----
src/backend/storage/file/reinit.c | 46 ++++++++++++++++++------------------
src/include/common/relpath.h | 3 +++
src/include/storage/reinit.h | 3 ++-
6 files changed, 39 insertions(+), 33 deletions(-)
diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index 099d369..d09168d 100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -250,7 +250,7 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
BlockNumber nblocks;
BlockNumber blkno;
Buffer buf;
- Oid relfilenumber;
+ RelFileNumber relfilenumber;
Page page;
List *rlocatorlist = NIL;
LockRelId relid;
@@ -397,7 +397,7 @@ ScanSourceDatabasePgClassTuple(HeapTupleData *tuple, Oid tbid, Oid dbid,
{
CreateDBRelInfo *relinfo;
Form_pg_class classForm;
- Oid relfilenumber = InvalidRelFileNumber;
+ RelFileNumber relfilenumber = InvalidRelFileNumber;
classForm = (Form_pg_class) GETSTRUCT(tuple);
diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index cb7d460..66edc64 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -290,7 +290,8 @@ CreateTableSpace(CreateTableSpaceStmt *stmt)
* parts.
*/
if (strlen(location) + 1 + strlen(TABLESPACE_VERSION_DIRECTORY) + 1 +
- OIDCHARS + 1 + OIDCHARS + 1 + FORKNAMECHARS + 1 + OIDCHARS > MAXPGPATH)
+ OIDCHARS + 1 + RELNUMBERCHARS + 1 + FORKNAMECHARS + 1 + OIDCHARS >
+ MAXPGPATH)
ereport(ERROR,
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
errmsg("tablespace location \"%s\" is too long",
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 637c0ce..4de249b 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -1172,7 +1172,8 @@ sendDir(bbsink *sink, const char *path, int basepathlen, bool sizeonly,
int excludeIdx;
bool excludeFound;
ForkNumber relForkNum; /* Type of fork if file is a relation */
- int relOidChars; /* Chars in filename that are the rel oid */
+ int relnumchars; /* Chars in filename that are the
+ * relnumber */
/* Skip special stuff */
if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
@@ -1222,23 +1223,23 @@ sendDir(bbsink *sink, const char *path, int basepathlen, bool sizeonly,
/* Exclude all forks for unlogged tables except the init fork */
if (isDbDir &&
- parse_filename_for_nontemp_relation(de->d_name, &relOidChars,
+ parse_filename_for_nontemp_relation(de->d_name, &relnumchars,
&relForkNum))
{
/* Never exclude init forks */
if (relForkNum != INIT_FORKNUM)
{
char initForkFile[MAXPGPATH];
- char relOid[OIDCHARS + 1];
+ char relNumber[RELNUMBERCHARS + 1];
/*
* If any other type of fork, check if there is an init fork
* with the same OID. If so, the file can be excluded.
*/
- memcpy(relOid, de->d_name, relOidChars);
- relOid[relOidChars] = '\0';
+ memcpy(relNumber, de->d_name, relnumchars);
+ relNumber[relnumchars] = '\0';
snprintf(initForkFile, sizeof(initForkFile), "%s/%s_init",
- path, relOid);
+ path, relNumber);
if (lstat(initForkFile, &statbuf) == 0)
{
diff --git a/src/backend/storage/file/reinit.c b/src/backend/storage/file/reinit.c
index f053fe0..a4bae7c 100644
--- a/src/backend/storage/file/reinit.c
+++ b/src/backend/storage/file/reinit.c
@@ -195,11 +195,11 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
while ((de = ReadDir(dbspace_dir, dbspacedirname)) != NULL)
{
ForkNumber forkNum;
- int oidchars;
+ int relnumchars;
unlogged_relation_entry ent;
/* Skip anything that doesn't look like a relation data file. */
- if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars,
+ if (!parse_filename_for_nontemp_relation(de->d_name, &relnumchars,
&forkNum))
continue;
@@ -235,11 +235,11 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
while ((de = ReadDir(dbspace_dir, dbspacedirname)) != NULL)
{
ForkNumber forkNum;
- int oidchars;
+ int relnumchars;
unlogged_relation_entry ent;
/* Skip anything that doesn't look like a relation data file. */
- if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars,
+ if (!parse_filename_for_nontemp_relation(de->d_name, &relnumchars,
&forkNum))
continue;
@@ -285,13 +285,13 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
while ((de = ReadDir(dbspace_dir, dbspacedirname)) != NULL)
{
ForkNumber forkNum;
- int oidchars;
- char oidbuf[OIDCHARS + 1];
+ int relnumchars;
+ char relnumbuf[RELNUMBERCHARS + 1];
char srcpath[MAXPGPATH * 2];
char dstpath[MAXPGPATH];
/* Skip anything that doesn't look like a relation data file. */
- if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars,
+ if (!parse_filename_for_nontemp_relation(de->d_name, &relnumchars,
&forkNum))
continue;
@@ -304,10 +304,10 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
dbspacedirname, de->d_name);
/* Construct destination pathname. */
- memcpy(oidbuf, de->d_name, oidchars);
- oidbuf[oidchars] = '\0';
+ memcpy(relnumbuf, de->d_name, relnumchars);
+ relnumbuf[relnumchars] = '\0';
snprintf(dstpath, sizeof(dstpath), "%s/%s%s",
- dbspacedirname, oidbuf, de->d_name + oidchars + 1 +
+ dbspacedirname, relnumbuf, de->d_name + relnumchars + 1 +
strlen(forkNames[INIT_FORKNUM]));
/* OK, we're ready to perform the actual copy. */
@@ -328,12 +328,12 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
while ((de = ReadDir(dbspace_dir, dbspacedirname)) != NULL)
{
ForkNumber forkNum;
- int oidchars;
- char oidbuf[OIDCHARS + 1];
+ int relnumchars;
+ char relnumbuf[RELNUMBERCHARS + 1];
char mainpath[MAXPGPATH];
/* Skip anything that doesn't look like a relation data file. */
- if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars,
+ if (!parse_filename_for_nontemp_relation(de->d_name, &relnumchars,
&forkNum))
continue;
@@ -342,10 +342,10 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
continue;
/* Construct main fork pathname. */
- memcpy(oidbuf, de->d_name, oidchars);
- oidbuf[oidchars] = '\0';
+ memcpy(relnumbuf, de->d_name, relnumchars);
+ relnumbuf[relnumchars] = '\0';
snprintf(mainpath, sizeof(mainpath), "%s/%s%s",
- dbspacedirname, oidbuf, de->d_name + oidchars + 1 +
+ dbspacedirname, relnumbuf, de->d_name + relnumchars + 1 +
strlen(forkNames[INIT_FORKNUM]));
fsync_fname(mainpath, false);
@@ -372,13 +372,13 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
* for a non-temporary relation and false otherwise.
*
* NB: If this function returns true, the caller is entitled to assume that
- * *oidchars has been set to the a value no more than OIDCHARS, and thus
- * that a buffer of OIDCHARS+1 characters is sufficient to hold the OID
- * portion of the filename. This is critical to protect against a possible
- * buffer overrun.
+ * *relnumchars has been set to a value no more than RELNUMBERCHARS, and thus
+ * that a buffer of RELNUMBERCHARS+1 characters is sufficient to hold the
+ * RelFileNumber portion of the filename. This is critical to protect against
+ * a possible buffer overrun.
*/
bool
-parse_filename_for_nontemp_relation(const char *name, int *oidchars,
+parse_filename_for_nontemp_relation(const char *name, int *relnumchars,
ForkNumber *fork)
{
int pos;
@@ -386,9 +386,9 @@ parse_filename_for_nontemp_relation(const char *name, int *oidchars,
/* Look for a non-empty string of digits (that isn't too long). */
for (pos = 0; isdigit((unsigned char) name[pos]); ++pos)
;
- if (pos == 0 || pos > OIDCHARS)
+ if (pos == 0 || pos > RELNUMBERCHARS)
return false;
- *oidchars = pos;
+ *relnumchars = pos;
/* Check for a fork name. */
if (name[pos] != '_')
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index 3ab7132..785c3f6 100644
--- a/src/include/common/relpath.h
+++ b/src/include/common/relpath.h
@@ -29,6 +29,9 @@
/* Characters to allow for an OID in a relation path */
#define OIDCHARS 10 /* max chars printed by %u */
+/* Characters to allow for a RelFileNumber in a relation path */
+#define RELNUMBERCHARS OIDCHARS /* same as OIDCHARS */
+
/*
* Stuff for fork names.
*
diff --git a/src/include/storage/reinit.h b/src/include/storage/reinit.h
index bf2c10d..b990d28 100644
--- a/src/include/storage/reinit.h
+++ b/src/include/storage/reinit.h
@@ -20,7 +20,8 @@
extern void ResetUnloggedRelations(int op);
extern bool parse_filename_for_nontemp_relation(const char *name,
- int *oidchars, ForkNumber *fork);
+ int *relnumchars,
+ ForkNumber *fork);
#define UNLOGGED_RELATION_CLEANUP 0x0001
#define UNLOGGED_RELATION_INIT 0x0002
--
1.8.3.1
Attachment: v14-0003-Widen-relfilenumber-from-32-bits-to-56-bits.patch (text/x-patch)
From 64e2567e708d8fd9cb993cbbb387f4e7faf9627c Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Thu, 28 Jul 2022 16:25:04 +0530
Subject: [PATCH v14 3/3] Widen relfilenumber from 32 bits to 56 bits
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Currently relfilenumber is 32 bits wide, so it is at risk of
wraparound and a relfilenumber can be reused.  To guard against such
reuse there is a complicated hack that leaves a 0-length tombstone
file around until the next checkpoint, and when we allocate a new
relfilenumber we must loop to check for an on-disk conflict.

This patch widens relfilenumber to 56 bits, with no provision for
wraparound.  After this change we can get rid of the 0-length
tombstone files and of the loop that checks for on-disk relfilenumber
conflicts.

We use 56 bits rather than a full 64 because widening the field to 64
bits would enlarge the BufferTag, increasing memory usage and
potentially hurting performance.  To avoid that, within the buffer
tag we use 8 bits for the fork number and 56 bits for the
relfilenumber.
---
contrib/pg_buffercache/Makefile | 4 +-
.../pg_buffercache/pg_buffercache--1.3--1.4.sql | 30 ++++
contrib/pg_buffercache/pg_buffercache.control | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 39 ++++-
contrib/pg_prewarm/autoprewarm.c | 4 +-
contrib/pg_walinspect/expected/pg_walinspect.out | 4 +-
contrib/pg_walinspect/sql/pg_walinspect.sql | 4 +-
contrib/test_decoding/expected/rewrite.out | 2 +-
contrib/test_decoding/sql/rewrite.sql | 2 +-
doc/src/sgml/catalogs.sgml | 2 +-
doc/src/sgml/pgbuffercache.sgml | 2 +-
doc/src/sgml/storage.sgml | 5 +-
src/backend/access/gin/ginxlog.c | 2 +-
src/backend/access/rmgrdesc/gistdesc.c | 2 +-
src/backend/access/rmgrdesc/heapdesc.c | 2 +-
src/backend/access/rmgrdesc/nbtdesc.c | 2 +-
src/backend/access/rmgrdesc/seqdesc.c | 2 +-
src/backend/access/rmgrdesc/xlogdesc.c | 21 ++-
src/backend/access/transam/README | 5 +-
src/backend/access/transam/varsup.c | 183 ++++++++++++++++++++-
src/backend/access/transam/xlog.c | 61 +++++++
src/backend/access/transam/xlogprefetcher.c | 14 +-
src/backend/access/transam/xlogrecovery.c | 6 +-
src/backend/access/transam/xlogutils.c | 12 +-
src/backend/catalog/catalog.c | 95 -----------
src/backend/catalog/heap.c | 25 +--
src/backend/catalog/index.c | 11 +-
src/backend/catalog/storage.c | 8 +
src/backend/commands/tablecmds.c | 12 +-
src/backend/nodes/gen_node_support.pl | 4 +-
src/backend/replication/logical/decode.c | 1 +
src/backend/replication/logical/reorderbuffer.c | 2 +-
src/backend/storage/freespace/fsmpage.c | 2 +-
src/backend/storage/lmgr/lwlocknames.txt | 1 +
src/backend/storage/smgr/md.c | 7 +
src/backend/storage/smgr/smgr.c | 2 +-
src/backend/utils/adt/dbsize.c | 11 +-
src/backend/utils/adt/pg_upgrade_support.c | 22 ++-
src/backend/utils/cache/relcache.c | 2 +-
src/backend/utils/cache/relfilenumbermap.c | 65 +++-----
src/backend/utils/misc/pg_controldata.c | 9 +-
src/bin/pg_checksums/pg_checksums.c | 4 +-
src/bin/pg_controldata/pg_controldata.c | 2 +
src/bin/pg_dump/pg_dump.c | 26 +--
src/bin/pg_rewind/filemap.c | 6 +-
src/bin/pg_upgrade/info.c | 5 +-
src/bin/pg_upgrade/pg_upgrade.c | 6 +-
src/bin/pg_upgrade/relfilenumber.c | 4 +-
src/bin/pg_waldump/pg_waldump.c | 2 +-
src/common/relpath.c | 20 +--
src/fe_utils/option_utils.c | 39 +++++
src/include/access/transam.h | 26 +++
src/include/access/xlog.h | 2 +
src/include/catalog/catalog.h | 3 -
src/include/catalog/pg_class.h | 16 +-
src/include/catalog/pg_control.h | 2 +
src/include/catalog/pg_proc.dat | 10 +-
src/include/common/relpath.h | 2 +-
src/include/fe_utils/option_utils.h | 2 +
src/include/postgres_ext.h | 10 +-
src/include/storage/buf_internals.h | 62 ++++++-
src/include/storage/relfilelocator.h | 12 +-
src/test/regress/expected/alter_table.out | 24 ++-
src/test/regress/expected/fast_default.out | 4 +-
src/test/regress/expected/oidjoins.out | 2 +-
src/test/regress/sql/alter_table.sql | 8 +-
src/test/regress/sql/fast_default.sql | 4 +-
67 files changed, 677 insertions(+), 317 deletions(-)
create mode 100644 contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile
index 2ab8c65..c20439a 100644
--- a/contrib/pg_buffercache/Makefile
+++ b/contrib/pg_buffercache/Makefile
@@ -6,8 +6,8 @@ OBJS = \
pg_buffercache_pages.o
EXTENSION = pg_buffercache
-DATA = pg_buffercache--1.2.sql pg_buffercache--1.2--1.3.sql \
- pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql
+DATA = pg_buffercache--1.0--1.1.sql pg_buffercache--1.1--1.2.sql pg_buffercache--1.2.sql \
+ pg_buffercache--1.2--1.3.sql pg_buffercache--1.3--1.4.sql
PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time"
ifdef USE_PGXS
diff --git a/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
new file mode 100644
index 0000000..ee2d9c7
--- /dev/null
+++ b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
@@ -0,0 +1,30 @@
+/* contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.4'" to load this file. \quit
+
+/* First we have to remove them from the extension */
+ALTER EXTENSION pg_buffercache DROP VIEW pg_buffercache;
+ALTER EXTENSION pg_buffercache DROP FUNCTION pg_buffercache_pages();
+
+/* Then we can drop them */
+DROP VIEW pg_buffercache;
+DROP FUNCTION pg_buffercache_pages();
+
+/* Now redefine */
+CREATE OR REPLACE FUNCTION pg_buffercache_pages()
+RETURNS SETOF RECORD
+AS 'MODULE_PATHNAME', 'pg_buffercache_pages_v1_4'
+LANGUAGE C PARALLEL SAFE;
+
+CREATE OR REPLACE VIEW pg_buffercache AS
+ SELECT P.* FROM pg_buffercache_pages() AS P
+ (bufferid integer, relfilenode int8, reltablespace oid, reldatabase oid,
+ relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
+ pinning_backends int4);
+
+-- Don't want these to be available to public.
+REVOKE ALL ON FUNCTION pg_buffercache_pages() FROM PUBLIC;
+REVOKE ALL ON pg_buffercache FROM PUBLIC;
+GRANT EXECUTE ON FUNCTION pg_buffercache_pages() TO pg_monitor;
+GRANT SELECT ON pg_buffercache TO pg_monitor;
diff --git a/contrib/pg_buffercache/pg_buffercache.control b/contrib/pg_buffercache/pg_buffercache.control
index 8c060ae..a82ae5f 100644
--- a/contrib/pg_buffercache/pg_buffercache.control
+++ b/contrib/pg_buffercache/pg_buffercache.control
@@ -1,5 +1,5 @@
# pg_buffercache extension
comment = 'examine the shared buffer cache'
-default_version = '1.3'
+default_version = '1.4'
module_pathname = '$libdir/pg_buffercache'
relocatable = true
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index c5754ea..15c4d38 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -59,9 +59,10 @@ typedef struct
* relation node/tablespace/database/blocknum and dirty indicator.
*/
PG_FUNCTION_INFO_V1(pg_buffercache_pages);
+PG_FUNCTION_INFO_V1(pg_buffercache_pages_v1_4);
-Datum
-pg_buffercache_pages(PG_FUNCTION_ARGS)
+static Datum
+pg_buffercache_pages_internal(PG_FUNCTION_ARGS, Oid rfn_typid)
{
FuncCallContext *funcctx;
Datum result;
@@ -103,7 +104,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
TupleDescInitEntry(tupledesc, (AttrNumber) 1, "bufferid",
INT4OID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 2, "relfilenode",
- OIDOID, -1, 0);
+ rfn_typid, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 3, "reltablespace",
OIDOID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 4, "reldatabase",
@@ -209,7 +210,24 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
}
else
{
- values[1] = ObjectIdGetDatum(fctx->record[i].relfilenumber);
+ if (rfn_typid == INT8OID)
+ values[1] =
+ Int64GetDatum((int64) fctx->record[i].relfilenumber);
+ else
+ {
+ Assert(rfn_typid == OIDOID);
+
+ if (fctx->record[i].relfilenumber > OID_MAX)
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("relfilenode " INT64_FORMAT " is too large to be represented as an OID",
+ fctx->record[i].relfilenumber),
+ errhint("Upgrade the extension using ALTER EXTENSION pg_buffercache UPDATE"));
+
+ values[1] =
+ ObjectIdGetDatum((Oid) fctx->record[i].relfilenumber);
+ }
+
nulls[1] = false;
values[2] = ObjectIdGetDatum(fctx->record[i].reltablespace);
nulls[2] = false;
@@ -237,3 +255,16 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
else
SRF_RETURN_DONE(funcctx);
}
+
+/* entry point for old extension version */
+Datum
+pg_buffercache_pages(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, OIDOID);
+}
+
+Datum
+pg_buffercache_pages_v1_4(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, INT8OID);
+}
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index c8d673a..6bb6da6 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -345,7 +345,7 @@ apw_load_buffers(void)
{
unsigned forknum;
- if (fscanf(file, "%u,%u,%u,%u,%u\n", &blkinfo[i].database,
+ if (fscanf(file, "%u,%u," INT64_FORMAT ",%u,%u\n", &blkinfo[i].database,
&blkinfo[i].tablespace, &blkinfo[i].filenumber,
&forknum, &blkinfo[i].blocknum) != 5)
ereport(ERROR,
@@ -669,7 +669,7 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
{
CHECK_FOR_INTERRUPTS();
- ret = fprintf(file, "%u,%u,%u,%u,%u\n",
+ ret = fprintf(file, "%u,%u," INT64_FORMAT ",%u,%u\n",
block_info_array[i].database,
block_info_array[i].tablespace,
block_info_array[i].filenumber,
diff --git a/contrib/pg_walinspect/expected/pg_walinspect.out b/contrib/pg_walinspect/expected/pg_walinspect.out
index a1ee743..e9b06ed 100644
--- a/contrib/pg_walinspect/expected/pg_walinspect.out
+++ b/contrib/pg_walinspect/expected/pg_walinspect.out
@@ -54,9 +54,9 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- ===================================================================
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
ok
----
t
diff --git a/contrib/pg_walinspect/sql/pg_walinspect.sql b/contrib/pg_walinspect/sql/pg_walinspect.sql
index 1b265ea..5393834 100644
--- a/contrib/pg_walinspect/sql/pg_walinspect.sql
+++ b/contrib/pg_walinspect/sql/pg_walinspect.sql
@@ -39,10 +39,10 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
-- ===================================================================
-- Test for filtering out WAL records based on resource_manager and
diff --git a/contrib/test_decoding/expected/rewrite.out b/contrib/test_decoding/expected/rewrite.out
index b30999c..8f1aa48 100644
--- a/contrib/test_decoding/expected/rewrite.out
+++ b/contrib/test_decoding/expected/rewrite.out
@@ -106,7 +106,7 @@ VACUUM FULL pg_class;
-- reindexing of important relations / indexes
REINDEX TABLE pg_class;
REINDEX INDEX pg_class_oid_index;
-REINDEX INDEX pg_class_tblspc_relfilenode_index;
+REINDEX INDEX pg_class_relfilenode_index;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (5, 3);
BEGIN;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (6, 4);
diff --git a/contrib/test_decoding/sql/rewrite.sql b/contrib/test_decoding/sql/rewrite.sql
index 62dead3..8983704 100644
--- a/contrib/test_decoding/sql/rewrite.sql
+++ b/contrib/test_decoding/sql/rewrite.sql
@@ -77,7 +77,7 @@ VACUUM FULL pg_class;
-- reindexing of important relations / indexes
REINDEX TABLE pg_class;
REINDEX INDEX pg_class_oid_index;
-REINDEX INDEX pg_class_tblspc_relfilenode_index;
+REINDEX INDEX pg_class_relfilenode_index;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (5, 3);
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index a186e35..b77940f 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1965,7 +1965,7 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
</para>
<para>
Name of the on-disk file of this relation; zero means this
diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
index a06fd3e..e222265 100644
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@@ -62,7 +62,7 @@
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
(references <link linkend="catalog-pg-class"><structname>pg_class</structname></link>.<structfield>relfilenode</structfield>)
</para>
<para>
diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml
index f4b9f66..55a80fa 100644
--- a/doc/src/sgml/storage.sgml
+++ b/doc/src/sgml/storage.sgml
@@ -217,11 +217,10 @@ with the suffix <literal>_init</literal> (see <xref linkend="storage-init"/>).
<caution>
<para>
-Note that while a table's filenode often matches its OID, this is
-<emphasis>not</emphasis> necessarily the case; some operations, like
+Note that a table's filenode is logically distinct from its OID. Although a
+system catalog's initial filenode matches its OID, some operations, like
<command>TRUNCATE</command>, <command>REINDEX</command>, <command>CLUSTER</command> and some forms
of <command>ALTER TABLE</command>, can change the filenode while preserving the OID.
-Avoid assuming that filenode and table OID are the same.
Also, for certain system catalogs including <structname>pg_class</structname> itself,
<structname>pg_class</structname>.<structfield>relfilenode</structfield> contains zero. The
actual filenode number of these catalogs is stored in a lower-level data
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 41b9211..b75ad79 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -100,7 +100,7 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
BlockNumber blknum;
BufferGetTag(buffer, &locator, &forknum, &blknum);
- elog(ERROR, "failed to add item to index page in %u/%u/%u",
+ elog(ERROR, "failed to add item to index page in %u/%u/" INT64_FORMAT,
locator.spcOid, locator.dbOid, locator.relNumber);
}
}
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index 7dd3c1d..c699937 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -26,7 +26,7 @@ out_gistxlogPageUpdate(StringInfo buf, gistxlogPageUpdate *xlrec)
static void
out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec)
{
- appendStringInfo(buf, "rel %u/%u/%u; blk %u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; blk %u; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber, xlrec->block,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 923d3bc..f6d278b 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -169,7 +169,7 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
{
xl_heap_new_cid *xlrec = (xl_heap_new_cid *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; tid %u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; tid %u/%u",
xlrec->target_locator.spcOid,
xlrec->target_locator.dbOid,
xlrec->target_locator.relNumber,
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index 4843cd5..70feb2d 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -100,7 +100,7 @@ btree_desc(StringInfo buf, XLogReaderState *record)
{
xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/seqdesc.c b/src/backend/access/rmgrdesc/seqdesc.c
index b3845f9..45c6ee7 100644
--- a/src/backend/access/rmgrdesc/seqdesc.c
+++ b/src/backend/access/rmgrdesc/seqdesc.c
@@ -25,7 +25,7 @@ seq_desc(StringInfo buf, XLogReaderState *record)
xl_seq_rec *xlrec = (xl_seq_rec *) rec;
if (info == XLOG_SEQ_LOG)
- appendStringInfo(buf, "rel %u/%u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT,
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber);
}
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 6fec485..a723790 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -45,8 +45,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
CheckPoint *checkpoint = (CheckPoint *) rec;
appendStringInfo(buf, "redo %X/%X; "
- "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; "
- "oldest xid %u in DB %u; oldest multi %u in DB %u; "
+ "tli %u; prev tli %u; fpw %s; xid %u:%u; relfilenumber " INT64_FORMAT "; oid %u; "
+ "multi %u; offset %u; oldest xid %u in DB %u; oldest multi %u in DB %u; "
"oldest/newest commit timestamp xid: %u/%u; "
"oldest running xid %u; %s",
LSN_FORMAT_ARGS(checkpoint->redo),
@@ -55,6 +55,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->fullPageWrites ? "true" : "false",
EpochFromFullTransactionId(checkpoint->nextXid),
XidFromFullTransactionId(checkpoint->nextXid),
+ checkpoint->nextRelFileNumber,
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
@@ -74,6 +75,13 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
memcpy(&nextOid, rec, sizeof(Oid));
appendStringInfo(buf, "%u", nextOid);
}
+ else if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, rec, sizeof(RelFileNumber));
+ appendStringInfo(buf, INT64_FORMAT, nextRelFileNumber);
+ }
else if (info == XLOG_RESTORE_POINT)
{
xl_restore_point *xlrec = (xl_restore_point *) rec;
@@ -169,6 +177,9 @@ xlog_identify(uint8 info)
case XLOG_NEXTOID:
id = "NEXTOID";
break;
+ case XLOG_NEXT_RELFILENUMBER:
+ id = "NEXT_RELFILENUMBER";
+ break;
case XLOG_SWITCH:
id = "SWITCH";
break;
@@ -237,7 +248,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfoChar(buf, ' ');
appendStringInfo(buf,
- "blkref #%d: rel %u/%u/%u fork %s blk %u",
+ "blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -297,7 +308,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
if (forknum != MAIN_FORKNUM)
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u fork %s blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -306,7 +317,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
else
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
blk);
diff --git a/src/backend/access/transam/README b/src/backend/access/transam/README
index 734c39a..8911671 100644
--- a/src/backend/access/transam/README
+++ b/src/backend/access/transam/README
@@ -692,8 +692,9 @@ by having database restart search for files that don't have any committed
entry in pg_class, but that currently isn't done because of the possibility
of deleting data that is useful for forensic analysis of the crash.
Orphan files are harmless --- at worst they waste a bit of disk space ---
-because we check for on-disk collisions when allocating new relfilenumber
-OIDs. So cleaning up isn't really necessary.
+because the relfilenumber counter is monotonically increasing. The maximum
+value is 2^56-1, and there is no provision for wraparound. Thus, on-disk
+collisions aren't possible.
3. Deleting a table, which requires an unlink() that could fail.
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 849a7ce..a2f0d35 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -13,12 +13,16 @@
#include "postgres.h"
+#include <unistd.h>
+
#include "access/clog.h"
#include "access/commit_ts.h"
#include "access/subtrans.h"
#include "access/transam.h"
#include "access/xact.h"
#include "access/xlogutils.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_tablespace.h"
#include "commands/dbcommands.h"
#include "miscadmin.h"
#include "postmaster/autovacuum.h"
@@ -30,6 +34,15 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+/* Number of RelFileNumbers to be logged per XLOG write */
+#define VAR_RELNUMBER_PER_XLOG 512
+
+/*
+ * Log more when fewer than this many logged RelFileNumbers remain. The
+ * valid range is 0 to VAR_RELNUMBER_PER_XLOG - 1.
+ */
+#define VAR_RELNUMBER_NEW_XLOG_THRESHOLD 256
+
/* pointer to "variable cache" in shared memory (set up by shmem.c) */
VariableCache ShmemVariableCache = NULL;
@@ -521,8 +534,7 @@ ForceTransactionIdLimitUpdate(void)
* wide, counter wraparound will occur eventually, and therefore it is unwise
* to assume they are unique unless precautions are taken to make them so.
* Hence, this routine should generally not be used directly. The only direct
- * callers should be GetNewOidWithIndex() and GetNewRelFileNumber() in
- * catalog/catalog.c.
+ * caller should be GetNewOidWithIndex() in catalog/catalog.c.
*/
Oid
GetNewObjectId(void)
@@ -613,6 +625,173 @@ SetNextObjectId(Oid nextOid)
}
/*
+ * GetNewRelFileNumber
+ *
+ * Similar to GetNewObjectId but instead of new Oid it generates new
+ * relfilenumber.
+ */
+RelFileNumber
+GetNewRelFileNumber(Oid reltablespace, char relpersistence)
+{
+ RelFileNumber result;
+
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot assign RelFileNumber during recovery");
+
+ if (IsBinaryUpgrade)
+ elog(ERROR, "cannot assign RelFileNumber during binary upgrade");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /* check for the wraparound for the relfilenumber counter */
+ if (unlikely(ShmemVariableCache->nextRelFileNumber > MAX_RELFILENUMBER))
+ elog(ERROR, "relfilenumber is out of bounds");
+
+ /*
+ * If the number of values remaining in the logged range has dropped below
+ * the threshold, log more. Ideally, we could wait until all logged
+ * relfilenumbers have been consumed before logging more. But if we did
+ * that, we would have to flush the new WAL record immediately, because we
+ * must ensure that nextRelFileNumber is always larger than any
+ * relfilenumber already in use on disk, and to maintain that invariant
+ * the record we log must reach disk before any new file is created from
+ * the newly logged range. So instead we always log before consuming all
+ * the relfilenumbers, and we only have to flush the newly logged WAL
+ * record before consuming relfilenumbers from the new range. By the time
+ * that flush is needed, the record has hopefully already been flushed by
+ * some other XLogFlush operation. VAR_RELNUMBER_PER_XLOG is large enough
+ * that this might not slow things down anyway, but it is always better to
+ * avoid an extra XLogFlush when we can.
+ */
+ if (ShmemVariableCache->loggedRelFileNumber -
+ ShmemVariableCache->nextRelFileNumber <=
+ VAR_RELNUMBER_NEW_XLOG_THRESHOLD)
+ {
+ RelFileNumber newlogrelnum;
+
+ /*
+ * The first time through, this immediately flushes the newly logged
+ * WAL record, since nothing has been logged in advance. From then on
+ * we remember the previously logged record pointer and flush up to
+ * that point instead.
+ *
+ * XXX the second time through, this may flush a record that the first
+ * call already flushed; that is logically a no-op, so it is not worth
+ * any extra complexity to avoid.
+ */
+ newlogrelnum = ShmemVariableCache->nextRelFileNumber +
+ VAR_RELNUMBER_PER_XLOG;
+ LogNextRelFileNumber(newlogrelnum,
+ &ShmemVariableCache->loggedRelFileNumberRecPtr);
+
+ ShmemVariableCache->loggedRelFileNumber = newlogrelnum;
+ }
+
+ result = ShmemVariableCache->nextRelFileNumber;
+ (ShmemVariableCache->nextRelFileNumber)++;
+
+ LWLockRelease(RelFileNumberGenLock);
+
+#ifdef USE_ASSERT_CHECKING
+
+ {
+ RelFileLocatorBackend rlocator;
+ char *rpath;
+ BackendId backend;
+
+ switch (relpersistence)
+ {
+ case RELPERSISTENCE_TEMP:
+ backend = BackendIdForTempRelations();
+ break;
+ case RELPERSISTENCE_UNLOGGED:
+ case RELPERSISTENCE_PERMANENT:
+ backend = InvalidBackendId;
+ break;
+ default:
+ elog(ERROR, "invalid relpersistence: %c", relpersistence);
+ return InvalidRelFileNumber; /* placate compiler */
+ }
+
+ /* this logic should match RelationInitPhysicalAddr */
+ rlocator.locator.spcOid =
+ reltablespace ? reltablespace : MyDatabaseTableSpace;
+ rlocator.locator.dbOid = (reltablespace == GLOBALTABLESPACE_OID) ?
+ InvalidOid : MyDatabaseId;
+ rlocator.locator.relNumber = result;
+
+ /*
+ * The relpath will vary based on the backend ID, so we must
+ * initialize that properly here to make sure that any collisions
+ * based on filename are properly detected.
+ */
+ rlocator.backend = backend;
+
+ /* check for existing file of same name. */
+ rpath = relpath(rlocator, MAIN_FORKNUM);
+ Assert(access(rpath, F_OK) != 0);
+ }
+#endif
+
+ return result;
+}
+
+/*
+ * SetNextRelFileNumber
+ *
+ * This may only be called during pg_upgrade; it advances the RelFileNumber
+ * counter to the specified value if the current value is smaller than the
+ * input value.
+ */
+void
+SetNextRelFileNumber(RelFileNumber relnumber)
+{
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot forward RelFileNumber during recovery");
+
+ if (!IsBinaryUpgrade)
+ elog(ERROR, "RelFileNumber can be set only during binary upgrade");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /*
+ * If the previously assigned value of nextRelFileNumber is already higher
+ * than the requested value, there is nothing to do. This is possible
+ * because during upgrade the objects are not created in relfilenumber
+ * order.
+ */
+ if (relnumber <= ShmemVariableCache->nextRelFileNumber)
+ {
+ LWLockRelease(RelFileNumberGenLock);
+ return;
+ }
+
+ /*
+ * If the new relfilenumber to be set is greater than or equal to the
+ * already logged relfilenumber, log again.
+ *
+ * XXX The new 'relnumber' can come from any range, so we cannot plan to
+ * piggyback on a later XLogFlush by logging in advance. That should not
+ * really matter, as this is only called during binary upgrade.
+ */
+ if (relnumber >= ShmemVariableCache->loggedRelFileNumber)
+ {
+ RelFileNumber newlogrelnum;
+
+ newlogrelnum = relnumber + VAR_RELNUMBER_PER_XLOG;
+ LogNextRelFileNumber(newlogrelnum, NULL);
+ ShmemVariableCache->loggedRelFileNumber = newlogrelnum;
+ }
+
+ ShmemVariableCache->nextRelFileNumber = relnumber;
+
+ LWLockRelease(RelFileNumberGenLock);
+}
+
+/*
* StopGeneratingPinnedObjectIds
*
* This is called once during initdb to force the OID counter up to
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 15ab8d9..fa1436b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4543,6 +4543,7 @@ BootStrapXLOG(void)
checkPoint.nextXid =
FullTransactionIdFromEpochAndXid(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstGenbkiObjectId;
+ checkPoint.nextRelFileNumber = FirstNormalRelFileNumber;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
checkPoint.oldestXid = FirstNormalTransactionId;
@@ -4556,7 +4557,10 @@ BootStrapXLOG(void)
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumberRecPtr = InvalidXLogRecPtr;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -5023,7 +5027,9 @@ StartupXLOG(void)
/* initialize shared memory variables from the checkpoint record */
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -6483,6 +6489,18 @@ CreateCheckPoint(int flags)
checkPoint.nextOid += ShmemVariableCache->oidCount;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_SHARED);
+ checkPoint.nextRelFileNumber = ShmemVariableCache->nextRelFileNumber;
+ if (!shutdown)
+ {
+ if (ShmemVariableCache->loggedRelFileNumber < checkPoint.nextRelFileNumber)
+			elog(ERROR, "nextRelFileNumber cannot go backward from " INT64_FORMAT " to " INT64_FORMAT,
+ checkPoint.nextRelFileNumber, ShmemVariableCache->loggedRelFileNumber);
+
+ checkPoint.nextRelFileNumber = ShmemVariableCache->loggedRelFileNumber;
+ }
+ LWLockRelease(RelFileNumberGenLock);
+
MultiXactGetCheckptMulti(shutdown,
&checkPoint.nextMulti,
&checkPoint.nextMultiOffset,
@@ -7361,6 +7379,35 @@ XLogPutNextOid(Oid nextOid)
}
/*
+ * Similar to XLogPutNextOid, but instead of writing a NEXTOID log record it
+ * writes a NEXT_RELFILENUMBER log record. If '*prevrecptr' is a valid
+ * XLogRecPtr, flush the WAL up to that record pointer; otherwise flush up to
+ * the record just logged. Also store the newly logged record pointer in
+ * '*prevrecptr' if prevrecptr is not NULL.
+ */
+void
+LogNextRelFileNumber(RelFileNumber nextrelnumber, XLogRecPtr *prevrecptr)
+{
+ XLogRecPtr recptr;
+
+ XLogBeginInsert();
+ XLogRegisterData((char *) (&nextrelnumber), sizeof(RelFileNumber));
+ recptr = XLogInsert(RM_XLOG_ID, XLOG_NEXT_RELFILENUMBER);
+
+ /*
+ * If a valid prevrecptr is passed then flush that xlog record to disk
+ * otherwise flush the newly logged record.
+ */
+ if ((prevrecptr != NULL) && !XLogRecPtrIsInvalid(*prevrecptr))
+ XLogFlush(*prevrecptr);
+ else
+ XLogFlush(recptr);
+
+ if (prevrecptr != NULL)
+ *prevrecptr = recptr;
+}
+
+/*
* Write an XLOG SWITCH record.
*
* Here we just blindly issue an XLogInsert request for the record.
@@ -7575,6 +7622,16 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
}
+ if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, XLogRecGetData(record), sizeof(RelFileNumber));
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumber = nextRelFileNumber;
+ LWLockRelease(RelFileNumberGenLock);
+ }
else if (info == XLOG_CHECKPOINT_SHUTDOWN)
{
CheckPoint checkPoint;
@@ -7589,6 +7646,10 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
+ LWLockRelease(RelFileNumberGenLock);
MultiXactSetNextMXact(checkPoint.nextMulti,
checkPoint.nextMultiOffset);
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index 87d1421..d27c7fc 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -611,7 +611,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u until %X/%X is replayed, which creates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, which creates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -634,7 +634,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, which truncates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, which truncates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -733,7 +733,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing all prefetch in relation %u/%u/%u until %X/%X is replayed, because the relation does not exist on disk",
+ "suppressing all prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, because the relation does not exist on disk",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -754,7 +754,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, because the relation is too small",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, because the relation is too small",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -793,7 +793,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* truncated beneath our feet?
*/
elog(ERROR,
- "could not prefetch relation %u/%u/%u block %u",
+ "could not prefetch relation %u/%u/" INT64_FORMAT " block %u",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -932,7 +932,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed),
filter->filter_from_block);
@@ -948,7 +948,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed));
#endif
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index e383c21..81d1f27 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2175,14 +2175,14 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
continue;
if (forknum != MAIN_FORKNUM)
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, fork %u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", fork %u, blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
forknum,
blk);
else
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
@@ -2378,7 +2378,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
if (memcmp(replay_image_masked, primary_image_masked, BLCKSZ) != 0)
{
elog(FATAL,
- "inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
+ "inconsistent page found, rel %u/%u/" INT64_FORMAT ", forknum %u, blkno %u",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forknum, blkno);
}
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 0cda225..b4398e1 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -617,17 +617,17 @@ CreateFakeRelcacheEntry(RelFileLocator rlocator)
rel->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
/* We don't know the name of the relation; use relfilenumber instead */
- sprintf(RelationGetRelationName(rel), "%u", rlocator.relNumber);
+ sprintf(RelationGetRelationName(rel), INT64_FORMAT, rlocator.relNumber);
/*
* We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNumber may be
- * different from the relation's OID. It shouldn't really matter though.
- * In recovery, we are running by ourselves and can't have any lock
- * conflicts. While syncing, we already hold AccessExclusiveLock.
+	 * relation. Note that we set relId to FirstNormalObjectId, which is
+	 * completely bogus. It shouldn't really matter though. In recovery,
+ * we are running by ourselves and can't have any lock conflicts. While
+ * syncing, we already hold AccessExclusiveLock.
*/
rel->rd_lockInfo.lockRelId.dbId = rlocator.dbOid;
- rel->rd_lockInfo.lockRelId.relId = rlocator.relNumber;
+ rel->rd_lockInfo.lockRelId.relId = FirstNormalObjectId;
rel->rd_smgr = NULL;
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 6f43870..155400c 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -481,101 +481,6 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
}
/*
- * GetNewRelFileNumber
- * Generate a new relfilenumber that is unique within the
- * database of the given tablespace.
- *
- * If the relfilenumber will also be used as the relation's OID, pass the
- * opened pg_class catalog, and this routine will guarantee that the result
- * is also an unused OID within pg_class. If the result is to be used only
- * as a relfilenumber for an existing relation, pass NULL for pg_class.
- *
- * As with GetNewOidWithIndex(), there is some theoretical risk of a race
- * condition, but it doesn't seem worth worrying about.
- *
- * Note: we don't support using this in bootstrap mode. All relations
- * created by bootstrap have preassigned OIDs, so there's no need.
- */
-RelFileNumber
-GetNewRelFileNumber(Oid reltablespace, Relation pg_class, char relpersistence)
-{
- RelFileLocatorBackend rlocator;
- char *rpath;
- bool collides;
- BackendId backend;
-
- /*
- * If we ever get here during pg_upgrade, there's something wrong; all
- * relfilenumber assignments during a binary-upgrade run should be
- * determined by commands in the dump script.
- */
- Assert(!IsBinaryUpgrade);
-
- switch (relpersistence)
- {
- case RELPERSISTENCE_TEMP:
- backend = BackendIdForTempRelations();
- break;
- case RELPERSISTENCE_UNLOGGED:
- case RELPERSISTENCE_PERMANENT:
- backend = InvalidBackendId;
- break;
- default:
- elog(ERROR, "invalid relpersistence: %c", relpersistence);
- return InvalidRelFileNumber; /* placate compiler */
- }
-
- /* This logic should match RelationInitPhysicalAddr */
- rlocator.locator.spcOid = reltablespace ? reltablespace : MyDatabaseTableSpace;
- rlocator.locator.dbOid =
- (rlocator.locator.spcOid == GLOBALTABLESPACE_OID) ?
- InvalidOid : MyDatabaseId;
-
- /*
- * The relpath will vary based on the backend ID, so we must initialize
- * that properly here to make sure that any collisions based on filename
- * are properly detected.
- */
- rlocator.backend = backend;
-
- do
- {
- CHECK_FOR_INTERRUPTS();
-
- /* Generate the OID */
- if (pg_class)
- rlocator.locator.relNumber = GetNewOidWithIndex(pg_class, ClassOidIndexId,
- Anum_pg_class_oid);
- else
- rlocator.locator.relNumber = GetNewObjectId();
-
- /* Check for existing file of same name */
- rpath = relpath(rlocator, MAIN_FORKNUM);
-
- if (access(rpath, F_OK) == 0)
- {
- /* definite collision */
- collides = true;
- }
- else
- {
- /*
- * Here we have a little bit of a dilemma: if errno is something
- * other than ENOENT, should we declare a collision and loop? In
- * practice it seems best to go ahead regardless of the errno. If
- * there is a colliding file we will get an smgr failure when we
- * attempt to create the new relation file.
- */
- collides = false;
- }
-
- pfree(rpath);
- } while (collides);
-
- return rlocator.locator.relNumber;
-}
-
-/*
* SQL callable interface for GetNewOidWithIndex(). Outside of initdb's
* direct insertions into catalog tables, and recovering from corruption, this
* should rarely be needed.
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 9b03579..c17f60f 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -342,10 +342,18 @@ heap_create(const char *relname,
{
/*
* If relfilenumber is unspecified by the caller then create storage
- * with oid same as relid.
+ * with a relfilenumber equal to the relid if it is a system table;
+ * otherwise allocate a new relfilenumber. For more details, read the
+ * comments atop the FirstNormalRelFileNumber declaration.
*/
if (!RelFileNumberIsValid(relfilenumber))
- relfilenumber = relid;
+ {
+ if (relid < FirstNormalObjectId)
+ relfilenumber = relid;
+ else
+ relfilenumber = GetNewRelFileNumber(reltablespace,
+ relpersistence);
+ }
}
/*
@@ -898,7 +906,7 @@ InsertPgClassTuple(Relation pg_class_desc,
values[Anum_pg_class_reloftype - 1] = ObjectIdGetDatum(rd_rel->reloftype);
values[Anum_pg_class_relowner - 1] = ObjectIdGetDatum(rd_rel->relowner);
values[Anum_pg_class_relam - 1] = ObjectIdGetDatum(rd_rel->relam);
- values[Anum_pg_class_relfilenode - 1] = ObjectIdGetDatum(rd_rel->relfilenode);
+ values[Anum_pg_class_relfilenode - 1] = Int64GetDatum(rd_rel->relfilenode);
values[Anum_pg_class_reltablespace - 1] = ObjectIdGetDatum(rd_rel->reltablespace);
values[Anum_pg_class_relpages - 1] = Int32GetDatum(rd_rel->relpages);
values[Anum_pg_class_reltuples - 1] = Float4GetDatum(rd_rel->reltuples);
@@ -1170,12 +1178,7 @@ heap_create_with_catalog(const char *relname,
if (shared_relation && reltablespace != GLOBALTABLESPACE_OID)
elog(ERROR, "shared relations must be placed in pg_global tablespace");
- /*
- * Allocate an OID for the relation, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the relation, unless we were told what to use. */
if (!OidIsValid(relid))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -1229,8 +1232,8 @@ heap_create_with_catalog(const char *relname,
}
if (!OidIsValid(relid))
- relid = GetNewRelFileNumber(reltablespace, pg_class_desc,
- relpersistence);
+ relid = GetNewOidWithIndex(pg_class_desc, ClassOidIndexId,
+ Anum_pg_class_oid);
}
/*
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index d7192f3..a509c3e 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -898,12 +898,7 @@ index_create(Relation heapRelation,
collationObjectId,
classObjectId);
- /*
- * Allocate an OID for the index, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the index, unless we were told what to use. */
if (!OidIsValid(indexRelationId))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -935,8 +930,8 @@ index_create(Relation heapRelation,
}
else
{
- indexRelationId =
- GetNewRelFileNumber(tableSpaceId, pg_class, relpersistence);
+ indexRelationId = GetNewOidWithIndex(pg_class, ClassOidIndexId,
+ Anum_pg_class_oid);
}
}
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index d708af1..c1bd4f8 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -968,6 +968,10 @@ smgr_redo(XLogReaderState *record)
xl_smgr_create *xlrec = (xl_smgr_create *) XLogRecGetData(record);
SMgrRelation reln;
+ if (xlrec->rlocator.relNumber > ShmemVariableCache->nextRelFileNumber)
+ elog(ERROR, "unexpected relnumber " INT64_FORMAT " that is larger than nextRelFileNumber " INT64_FORMAT,
+ xlrec->rlocator.relNumber, ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
smgrcreate(reln, xlrec->forkNum, true);
}
@@ -981,6 +985,10 @@ smgr_redo(XLogReaderState *record)
int nforks = 0;
bool need_fsm_vacuum = false;
+ if (xlrec->rlocator.relNumber > ShmemVariableCache->nextRelFileNumber)
+ elog(ERROR, "unexpected relnumber " INT64_FORMAT " that is larger than nextRelFileNumber " INT64_FORMAT,
+ xlrec->rlocator.relNumber, ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
/*
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 7fbee0c..430ade2 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14331,10 +14331,14 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
}
/*
- * Relfilenumbers are not unique in databases across tablespaces, so we
- * need to allocate a new one in the new tablespace.
- */
- newrelfilenumber = GetNewRelFileNumber(newTableSpace, NULL,
+ * Generate a new relfilenumber. We cannot reuse the old relfilenumber
+ * because of the possibility that the relation will be moved back to the
+ * original tablespace before the next checkpoint. At that point, the
+ * first segment of the main fork won't have been unlinked yet, and an
+ * attempt to create new relation storage with that same relfilenumber
+ * will fail.
+ */
+ newrelfilenumber = GetNewRelFileNumber(newTableSpace,
rel->rd_rel->relpersistence);
/* Open old and new relation */
diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index 86cf1b3..dc8b9da 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -961,12 +961,12 @@ _read${n}(void)
print $off "\tWRITE_UINT_FIELD($f);\n";
print $rff "\tREAD_UINT_FIELD($f);\n" unless $no_read;
}
- elsif ($t eq 'uint64')
+ elsif ($t eq 'uint64' || $t eq 'RelFileNumber')
{
print $off "\tWRITE_UINT64_FIELD($f);\n";
print $rff "\tREAD_UINT64_FIELD($f);\n" unless $no_read;
}
- elsif ($t eq 'Oid' || $t eq 'RelFileNumber')
+ elsif ($t eq 'Oid')
{
print $off "\tWRITE_OID_FIELD($f);\n";
print $rff "\tREAD_OID_FIELD($f);\n" unless $no_read;
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index c5c6a2b..7029604 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -154,6 +154,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
break;
case XLOG_NOOP:
case XLOG_NEXTOID:
+ case XLOG_NEXT_RELFILENUMBER:
case XLOG_SWITCH:
case XLOG_BACKUP_END:
case XLOG_PARAMETER_CHANGE:
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 88a37fd..99580a4 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -4869,7 +4869,7 @@ DisplayMapping(HTAB *tuplecid_data)
hash_seq_init(&hstat, tuplecid_data);
while ((ent = (ReorderBufferTupleCidEnt *) hash_seq_search(&hstat)) != NULL)
{
- elog(DEBUG3, "mapping: node: %u/%u/%u tid: %u/%u cmin: %u, cmax: %u",
+ elog(DEBUG3, "mapping: node: %u/%u/" INT64_FORMAT " tid: %u/%u cmin: %u, cmax: %u",
ent->key.rlocator.dbOid,
ent->key.rlocator.spcOid,
ent->key.rlocator.relNumber,
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index af4dab7..172225b 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -273,7 +273,7 @@ restart:
BlockNumber blknum;
BufferGetTag(buf, &rlocator, &forknum, &blknum);
- elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
+ elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/" INT64_FORMAT,
blknum, rlocator.spcOid, rlocator.dbOid, rlocator.relNumber);
/* make sure we hold an exclusive lock */
diff --git a/src/backend/storage/lmgr/lwlocknames.txt b/src/backend/storage/lmgr/lwlocknames.txt
index 6c7cf6c..3c5d041 100644
--- a/src/backend/storage/lmgr/lwlocknames.txt
+++ b/src/backend/storage/lmgr/lwlocknames.txt
@@ -53,3 +53,4 @@ XactTruncationLock 44
# 45 was XactTruncationLock until removal of BackendRandomLock
WrapLimitsVacuumLock 46
NotifyQueueTailLock 47
+RelFileNumberGenLock 48
\ No newline at end of file
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 3998296..a88fef2 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -257,6 +257,13 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* next checkpoint, we prevent reassignment of the relfilenumber until it's
* safe, because relfilenumber assignment skips over any existing file.
*
+ * XXX. Although all of this was true when relfilenumbers were 32 bits wide,
+ * they are now 56 bits wide and do not wrap around, so in the future we can
+ * change the code to immediately unlink the first segment of the relation
+ * along with all the others. We still do reuse relfilenumbers when createdb()
+ * is performed using the file-copy method or during movedb(), but the scenario
+ * described above can only happen when creating a new relation.
+ *
* We do not need to go through this dance for temp relations, though, because
* we never make WAL entries for temp rels, and so a temp rel poses no threat
* to the health of a regular rel that has taken over its relfilenumber.
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index c1a5feb..ed46ac3 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -154,7 +154,7 @@ smgropen(RelFileLocator rlocator, BackendId backend)
/* First time through: initialize the hash table */
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileLocatorBackend);
+ ctl.keysize = SizeOfRelFileLocatorBackend;
ctl.entrysize = sizeof(SMgrRelationData);
SMgrRelationHash = hash_create("smgr relation table", 400,
&ctl, HASH_ELEM | HASH_BLOBS);
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 34efa12..2b313aa 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -878,7 +878,7 @@ pg_relation_filenode(PG_FUNCTION_ARGS)
if (!RelFileNumberIsValid(result))
PG_RETURN_NULL();
- PG_RETURN_OID(result);
+ PG_RETURN_INT64(result);
}
/*
@@ -898,9 +898,16 @@ Datum
pg_filenode_relation(PG_FUNCTION_ARGS)
{
Oid reltablespace = PG_GETARG_OID(0);
- RelFileNumber relfilenumber = PG_GETARG_OID(1);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(1);
Oid heaprel;
+ /* check whether the relfilenumber is within a valid range */
+ if (relfilenumber < 0 || relfilenumber > MAX_RELFILENUMBER)
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("relfilenumber " INT64_FORMAT " is out of range",
+ relfilenumber));
+
/* test needed so RelidByRelfilenumber doesn't misbehave */
if (!RelFileNumberIsValid(relfilenumber))
PG_RETURN_NULL();
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 797f5f5..b1d60bd 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -17,6 +17,7 @@
#include "catalog/pg_type.h"
#include "commands/extension.h"
#include "miscadmin.h"
+#include "storage/relfilelocator.h"
#include "utils/array.h"
#include "utils/builtins.h"
@@ -29,6 +30,15 @@ do { \
errmsg("function can only be called when server is in binary upgrade mode"))); \
} while (0)
+#define CHECK_RELFILENUMBER_RANGE(relfilenumber) \
+do { \
+ if ((relfilenumber) < 0 || (relfilenumber) > MAX_RELFILENUMBER) \
+ ereport(ERROR, \
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE), \
+ errmsg("relfilenumber " INT64_FORMAT " is out of range", \
+ (relfilenumber))); \
+} while (0)
+
Datum
binary_upgrade_set_next_pg_tablespace_oid(PG_FUNCTION_ARGS)
{
@@ -98,10 +108,12 @@ binary_upgrade_set_next_heap_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_heap_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_heap_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -120,10 +132,12 @@ binary_upgrade_set_next_index_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_index_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_index_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -142,10 +156,12 @@ binary_upgrade_set_next_toast_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_toast_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_toast_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index bdb771d..a0b7c5b 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -3709,7 +3709,7 @@ RelationSetNewRelfilenumber(Relation relation, char persistence)
/* Allocate a new relfilenumber */
newrelfilenumber = GetNewRelFileNumber(relation->rd_rel->reltablespace,
- NULL, persistence);
+ persistence);
/*
* Get a writable copy of the pg_class tuple for the given relation.
diff --git a/src/backend/utils/cache/relfilenumbermap.c b/src/backend/utils/cache/relfilenumbermap.c
index c4245d5..cbb18f0 100644
--- a/src/backend/utils/cache/relfilenumbermap.c
+++ b/src/backend/utils/cache/relfilenumbermap.c
@@ -32,17 +32,11 @@
static HTAB *RelfilenumberMapHash = NULL;
/* built first time through in InitializeRelfilenumberMap */
-static ScanKeyData relfilenumber_skey[2];
+static ScanKeyData relfilenumber_skey[1];
typedef struct
{
- Oid reltablespace;
- RelFileNumber relfilenumber;
-} RelfilenumberMapKey;
-
-typedef struct
-{
- RelfilenumberMapKey key; /* lookup key - must be first */
+ RelFileNumber relfilenumber; /* lookup key - must be first */
Oid relid; /* pg_class.oid */
} RelfilenumberMapEntry;
@@ -72,7 +66,7 @@ RelfilenumberMapInvalidateCallback(Datum arg, Oid relid)
entry->relid == relid) /* individual flushed relation */
{
if (hash_search(RelfilenumberMapHash,
- (void *) &entry->key,
+ (void *) &entry->relfilenumber,
HASH_REMOVE,
NULL) == NULL)
elog(ERROR, "hash table corrupted");
@@ -88,7 +82,6 @@ static void
InitializeRelfilenumberMap(void)
{
HASHCTL ctl;
- int i;
/* Make sure we've initialized CacheMemoryContext. */
if (CacheMemoryContext == NULL)
@@ -97,25 +90,20 @@ InitializeRelfilenumberMap(void)
/* build skey */
MemSet(&relfilenumber_skey, 0, sizeof(relfilenumber_skey));
- for (i = 0; i < 2; i++)
- {
- fmgr_info_cxt(F_OIDEQ,
- &relfilenumber_skey[i].sk_func,
- CacheMemoryContext);
- relfilenumber_skey[i].sk_strategy = BTEqualStrategyNumber;
- relfilenumber_skey[i].sk_subtype = InvalidOid;
- relfilenumber_skey[i].sk_collation = InvalidOid;
- }
-
- relfilenumber_skey[0].sk_attno = Anum_pg_class_reltablespace;
- relfilenumber_skey[1].sk_attno = Anum_pg_class_relfilenode;
+ fmgr_info_cxt(F_INT8EQ,
+ &relfilenumber_skey[0].sk_func,
+ CacheMemoryContext);
+ relfilenumber_skey[0].sk_strategy = BTEqualStrategyNumber;
+ relfilenumber_skey[0].sk_subtype = InvalidOid;
+ relfilenumber_skey[0].sk_collation = InvalidOid;
+ relfilenumber_skey[0].sk_attno = Anum_pg_class_relfilenode;
/*
* Only create the RelfilenumberMapHash now, so we don't end up partially
* initialized when fmgr_info_cxt() above ERRORs out with an out of memory
* error.
*/
- ctl.keysize = sizeof(RelfilenumberMapKey);
+ ctl.keysize = sizeof(RelFileNumber);
ctl.entrysize = sizeof(RelfilenumberMapEntry);
ctl.hcxt = CacheMemoryContext;
@@ -129,7 +117,7 @@ InitializeRelfilenumberMap(void)
}
/*
- * Map a relation's (tablespace, relfilenumber) to a relation's oid and cache
+ * Map a relation's relfilenumber to a relation's oid and cache
* the result.
*
* Returns InvalidOid if no relation matching the criteria could be found.
@@ -137,26 +125,17 @@ InitializeRelfilenumberMap(void)
Oid
RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
{
- RelfilenumberMapKey key;
RelfilenumberMapEntry *entry;
bool found;
SysScanDesc scandesc;
Relation relation;
HeapTuple ntp;
- ScanKeyData skey[2];
+ ScanKeyData skey[1];
Oid relid;
if (RelfilenumberMapHash == NULL)
InitializeRelfilenumberMap();
- /* pg_class will show 0 when the value is actually MyDatabaseTableSpace */
- if (reltablespace == MyDatabaseTableSpace)
- reltablespace = 0;
-
- MemSet(&key, 0, sizeof(key));
- key.reltablespace = reltablespace;
- key.relfilenumber = relfilenumber;
-
/*
* Check cache and return entry if one is found. Even if no target
* relation can be found later on we store the negative match and return a
@@ -164,7 +143,8 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
* since querying invalid values isn't supposed to be a frequent thing,
* but it's basically free.
*/
- entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_FIND, &found);
+ entry = hash_search(RelfilenumberMapHash, (void *) &relfilenumber,
+ HASH_FIND, &found);
if (found)
return entry->relid;
@@ -195,14 +175,13 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
memcpy(skey, relfilenumber_skey, sizeof(skey));
/* set scan arguments */
- skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
- skey[1].sk_argument = ObjectIdGetDatum(relfilenumber);
+ skey[0].sk_argument = Int64GetDatum(relfilenumber);
scandesc = systable_beginscan(relation,
- ClassTblspcRelfilenodeIndexId,
+ ClassRelfilenodeIndexId,
true,
NULL,
- 2,
+ 1,
skey);
found = false;
@@ -213,11 +192,10 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
if (found)
elog(ERROR,
- "unexpected duplicate for tablespace %u, relfilenumber %u",
- reltablespace, relfilenumber);
+ "unexpected duplicate for relfilenumber " INT64_FORMAT,
+ relfilenumber);
found = true;
- Assert(classform->reltablespace == reltablespace);
Assert(classform->relfilenode == relfilenumber);
relid = classform->oid;
}
@@ -235,7 +213,8 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
* caused cache invalidations to be executed which would have deleted a
* new entry if we had entered it above.
*/
- entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_ENTER, &found);
+ entry = hash_search(RelfilenumberMapHash, (void *) &relfilenumber,
+ HASH_ENTER, &found);
if (found)
elog(ERROR, "corrupted hashtable");
entry->relid = relid;
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index 781f8b8..3c1fef4 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -79,8 +79,8 @@ pg_control_system(PG_FUNCTION_ARGS)
Datum
pg_control_checkpoint(PG_FUNCTION_ARGS)
{
- Datum values[18];
- bool nulls[18];
+ Datum values[19];
+ bool nulls[19];
TupleDesc tupdesc;
HeapTuple htup;
ControlFileData *ControlFile;
@@ -129,6 +129,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
XIDOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 18, "checkpoint_time",
TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 19, "next_relfilenumber",
+ INT8OID, -1, 0);
tupdesc = BlessTupleDesc(tupdesc);
/* Read the control file. */
@@ -202,6 +204,9 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
values[17] = TimestampTzGetDatum(time_t_to_timestamptz(ControlFile->checkPointCopy.time));
nulls[17] = false;
+ values[18] = Int64GetDatum(ControlFile->checkPointCopy.nextRelFileNumber);
+ nulls[18] = false;
+
htup = heap_form_tuple(tupdesc, values, nulls);
PG_RETURN_DATUM(HeapTupleGetDatum(htup));
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index dc20122..30933fd 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -489,9 +489,7 @@ main(int argc, char *argv[])
mode = PG_MODE_ENABLE;
break;
case 'f':
- if (!option_parse_int(optarg, "-f/--filenode", 0,
- INT_MAX,
- NULL))
+ if (!option_parse_relfilenumber(optarg, "-f/--filenode"))
exit(1);
only_filenode = pstrdup(optarg);
break;
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index c390ec5..f727078 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -250,6 +250,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
EpochFromFullTransactionId(ControlFile->checkPointCopy.nextXid),
XidFromFullTransactionId(ControlFile->checkPointCopy.nextXid));
+ printf(_("Latest checkpoint's NextRelFileNumber: " INT64_FORMAT "\n"),
+ ControlFile->checkPointCopy.nextRelFileNumber);
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index f9c51d1..0db33ef 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3142,9 +3142,9 @@ dumpDatabase(Archive *fout)
PQExpBuffer loFrozenQry = createPQExpBuffer();
PQExpBuffer loOutQry = createPQExpBuffer();
int i_relfrozenxid,
- i_relfilenode,
i_oid,
i_relminmxid;
+ RelFileNumber i_relfilenode;
/*
* pg_largeobject
@@ -3170,11 +3170,11 @@ dumpDatabase(Archive *fout)
appendPQExpBufferStr(loOutQry, "\n-- For binary upgrade, preserve values for pg_largeobject and its index\n");
for (int i = 0; i < PQntuples(lo_res); ++i)
appendPQExpBuffer(loOutQry, "UPDATE pg_catalog.pg_class\n"
- "SET relfrozenxid = '%u', relminmxid = '%u', relfilenode = '%u'\n"
+ "SET relfrozenxid = '%u', relminmxid = '%u', relfilenode = " INT64_FORMAT "\n"
"WHERE oid = %u;\n",
atooid(PQgetvalue(lo_res, i, i_relfrozenxid)),
atooid(PQgetvalue(lo_res, i, i_relminmxid)),
- atooid(PQgetvalue(lo_res, i, i_relfilenode)),
+ atorelnumber(PQgetvalue(lo_res, i, i_relfilenode)),
atooid(PQgetvalue(lo_res, i, i_oid)));
ArchiveEntry(fout, nilCatalogId, createDumpId(),
@@ -4853,16 +4853,16 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
relkind = *PQgetvalue(upgrade_res, 0, PQfnumber(upgrade_res, "relkind"));
- relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "relfilenode")));
+ relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "relfilenode")));
toast_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "reltoastrelid")));
- toast_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_relfilenode")));
+ toast_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_relfilenode")));
toast_index_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "indexrelid")));
- toast_index_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_index_relfilenode")));
+ toast_index_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_index_relfilenode")));
appendPQExpBufferStr(upgrade_buffer,
"\n-- For binary upgrade, must preserve pg_class oids and relfilenodes\n");
@@ -4880,7 +4880,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
*/
if (RelFileNumberIsValid(relfilenumber) && relkind != RELKIND_PARTITIONED_TABLE)
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
/*
@@ -4894,7 +4894,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_toast_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_relfilenumber);
/* every toast table has an index */
@@ -4902,7 +4902,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_index_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_index_relfilenumber);
}
@@ -4915,7 +4915,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
pg_class_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
}
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 269ed64..8be5e66 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -538,7 +538,7 @@ isRelDataFile(const char *path)
segNo = 0;
matched = false;
- nmatch = sscanf(path, "global/%u.%u", &rlocator.relNumber, &segNo);
+ nmatch = sscanf(path, "global/" INT64_FORMAT ".%u", &rlocator.relNumber, &segNo);
if (nmatch == 1 || nmatch == 2)
{
rlocator.spcOid = GLOBALTABLESPACE_OID;
@@ -547,7 +547,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "base/%u/%u.%u",
+ nmatch = sscanf(path, "base/%u/" INT64_FORMAT ".%u",
&rlocator.dbOid, &rlocator.relNumber, &segNo);
if (nmatch == 2 || nmatch == 3)
{
@@ -556,7 +556,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/%u.%u",
+ nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/" INT64_FORMAT ".%u",
&rlocator.spcOid, &rlocator.dbOid, &rlocator.relNumber,
&segNo);
if (nmatch == 3 || nmatch == 4)
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index df374ce..e97ba4d 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -399,8 +399,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
i_reloid,
i_indtable,
i_toastheap,
- i_relfilenumber,
i_reltablespace;
+ RelFileNumber i_relfilenumber;
char query[QUERY_ALLOC];
char *last_namespace = NULL,
*last_tablespace = NULL;
@@ -527,7 +527,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
relname = PQgetvalue(res, relnum, i_relname);
curr->relname = pg_strdup(relname);
- curr->relfilenumber = atooid(PQgetvalue(res, relnum, i_relfilenumber));
+ curr->relfilenumber =
+ atorelnumber(PQgetvalue(res, relnum, i_relfilenumber));
curr->tblsp_alloc = false;
/* Is the tablespace oid non-default? */
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index 115faa2..7ab1bcc 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -15,10 +15,8 @@
* oids are the same between old and new clusters. This is important
* because toast oids are stored as toast pointers in user tables.
*
- * While pg_class.oid and pg_class.relfilenode are initially the same in a
- * cluster, they can diverge due to CLUSTER, REINDEX, or VACUUM FULL. We
- * control assignments of pg_class.relfilenode because we want the filenames
- * to match between the old and new cluster.
+ * We control assignments of pg_class.relfilenode because we want the
+ * filenames to match between the old and new cluster.
*
* We control assignment of pg_tablespace.oid because we want the oid to match
* between the old and new cluster.
diff --git a/src/bin/pg_upgrade/relfilenumber.c b/src/bin/pg_upgrade/relfilenumber.c
index c3c7402..c8b10e4 100644
--- a/src/bin/pg_upgrade/relfilenumber.c
+++ b/src/bin/pg_upgrade/relfilenumber.c
@@ -190,14 +190,14 @@ transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_fro
else
snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
- snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
+ snprintf(old_file, sizeof(old_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->old_tablespace,
map->old_tablespace_suffix,
map->db_oid,
map->relfilenumber,
type_suffix,
extent_suffix);
- snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
+ snprintf(new_file, sizeof(new_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->new_tablespace,
map->new_tablespace_suffix,
map->db_oid,
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 6528113..9aca583 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -884,7 +884,7 @@ main(int argc, char **argv)
}
break;
case 'R':
- if (sscanf(optarg, "%u/%u/%u",
+ if (sscanf(optarg, "%u/%u/" INT64_FORMAT,
&config.filter_by_relation.spcOid,
&config.filter_by_relation.dbOid,
&config.filter_by_relation.relNumber) != 3 ||
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 1b6b620..0774e3f 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -149,10 +149,10 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
Assert(dbOid == 0);
Assert(backendId == InvalidBackendId);
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("global/%u_%s",
+ path = psprintf("global/" INT64_FORMAT "_%s",
relNumber, forkNames[forkNumber]);
else
- path = psprintf("global/%u", relNumber);
+ path = psprintf("global/" INT64_FORMAT, relNumber);
}
else if (spcOid == DEFAULTTABLESPACE_OID)
{
@@ -160,21 +160,21 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/%u_%s",
+ path = psprintf("base/%u/" INT64_FORMAT "_%s",
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/%u",
+ path = psprintf("base/%u/" INT64_FORMAT,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/t%d_%u_%s",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT "_%s",
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/t%d_%u",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT,
dbOid, backendId, relNumber);
}
}
@@ -184,24 +184,24 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber);
}
diff --git a/src/fe_utils/option_utils.c b/src/fe_utils/option_utils.c
index abea881..7d49359 100644
--- a/src/fe_utils/option_utils.c
+++ b/src/fe_utils/option_utils.c
@@ -82,3 +82,42 @@ option_parse_int(const char *optarg, const char *optname,
*result = val;
return true;
}
+
+/*
+ * option_parse_relfilenumber
+ *
+ * Parse the relfilenumber value for an option. If parsing is successful,
+ * returns true; if parsing fails, returns false.
+ */
+bool
+option_parse_relfilenumber(const char *optarg, const char *optname)
+{
+ char *endptr;
+ int64 val;
+
+ errno = 0;
+ val = strtoi64(optarg, &endptr, 10);
+
+ /*
+ * Skip any trailing whitespace; if anything but whitespace remains before
+ * the terminating character, fail.
+ */
+ while (*endptr != '\0' && isspace((unsigned char) *endptr))
+ endptr++;
+
+ if (*endptr != '\0')
+ {
+ pg_log_error("invalid value \"%s\" for option %s",
+ optarg, optname);
+ return false;
+ }
+
+ if (val < 0 || val > MAX_RELFILENUMBER)
+ {
+ pg_log_error("%s must be in range " INT64_FORMAT ".." INT64_FORMAT,
+ optname, (RelFileNumber) 0, MAX_RELFILENUMBER);
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 775471d..c62fbc2 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -196,6 +196,21 @@ FullTransactionIdAdvance(FullTransactionId *dest)
#define FirstUnpinnedObjectId 12000
#define FirstNormalObjectId 16384
+/* ----------
+ * RelFileNumber zero is InvalidRelFileNumber.
+ *
+ * For system tables (OID < FirstNormalObjectId), the initial storage is
+ * created with a relfilenumber equal to the table's OID. Later, any
+ * relfilenumber allocated by GetNewRelFileNumber() starts at 100000
+ * (FirstNormalRelFileNumber). Thus, when upgrading from an older cluster,
+ * the storage path of a user table from the old cluster cannot conflict
+ * with the storage path of a system table in the new cluster. The new
+ * cluster must not contain any user tables during the upgrade, so those
+ * need no special handling.
+ * ----------
+ */
+#define FirstNormalRelFileNumber ((RelFileNumber) 100000)
+
/*
* VariableCache is a data structure in shared memory that is used to track
* OID and XID assignment state. For largely historical reasons, there is
@@ -215,6 +230,14 @@ typedef struct VariableCacheData
uint32 oidCount; /* OIDs available before must do XLOG work */
/*
+ * These fields are protected by RelFileNumberGenLock.
+ */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber to assign */
+ RelFileNumber loggedRelFileNumber; /* last logged relfilenumber */
+ XLogRecPtr loggedRelFileNumberRecPtr; /* xlog record pointer w.r.t.
+ * loggedRelFileNumber */
+
+ /*
* These fields are protected by XidGenLock.
*/
FullTransactionId nextXid; /* next XID to assign */
@@ -293,6 +316,9 @@ extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
extern bool ForceTransactionIdLimitUpdate(void);
extern Oid GetNewObjectId(void);
+extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
+ char relpersistence);
+extern void SetNextRelFileNumber(RelFileNumber relnumber);
extern void StopGeneratingPinnedObjectIds(void);
#ifdef USE_ASSERT_CHECKING
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index cd674c3..78660f1 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -234,6 +234,8 @@ extern void CreateCheckPoint(int flags);
extern bool CreateRestartPoint(int flags);
extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN);
extern void XLogPutNextOid(Oid nextOid);
+extern void LogNextRelFileNumber(RelFileNumber nextrelnumber,
+ XLogRecPtr *prevrecptr);
extern XLogRecPtr XLogRestorePoint(const char *rpName);
extern void UpdateFullPageWrites(void);
extern void GetFullPageWriteInfo(XLogRecPtr *RedoRecPtr_p, bool *doPageWrites_p);
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index e1c85f9..b452530 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -38,8 +38,5 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
-extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
- Relation pg_class,
- char relpersistence);
#endif /* CATALOG_H */
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index e1f4eef..cacfc11 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -34,6 +34,13 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* oid */
Oid oid;
+ /* access method; 0 if not a table / index */
+ Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
+
+ /* identifier of physical storage file */
+ /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
+ int64 relfilenode BKI_DEFAULT(0);
+
/* class name */
NameData relname;
@@ -49,13 +56,6 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* class owner */
Oid relowner BKI_DEFAULT(POSTGRES) BKI_LOOKUP(pg_authid);
- /* access method; 0 if not a table / index */
- Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
-
- /* identifier of physical storage file */
- /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
- Oid relfilenode BKI_DEFAULT(0);
-
/* identifier of table space for relation (0 means default for database) */
Oid reltablespace BKI_DEFAULT(0) BKI_LOOKUP_OPT(pg_tablespace);
@@ -154,7 +154,7 @@ typedef FormData_pg_class *Form_pg_class;
DECLARE_UNIQUE_INDEX_PKEY(pg_class_oid_index, 2662, ClassOidIndexId, on pg_class using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_class_relname_nsp_index, 2663, ClassNameNspIndexId, on pg_class using btree(relname name_ops, relnamespace oid_ops));
-DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode oid_ops));
+DECLARE_INDEX(pg_class_relfilenode_index, 3455, ClassRelfilenodeIndexId, on pg_class using btree(relfilenode int8_ops));
#ifdef EXPOSE_TO_CLIENT_CODE
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 06368e2..096222f 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -41,6 +41,7 @@ typedef struct CheckPoint
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
FullTransactionId nextXid; /* next free transaction ID */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
@@ -78,6 +79,7 @@ typedef struct CheckPoint
#define XLOG_FPI 0xB0
/* 0xC0 is used in Postgres 9.5-11 */
#define XLOG_OVERWRITE_CONTRECORD 0xD0
+#define XLOG_NEXT_RELFILENUMBER 0xE0
/*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 2e41f4d..f6a5b49 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7321,11 +7321,11 @@
proname => 'pg_indexes_size', provolatile => 'v', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_indexes_size' },
{ oid => '2999', descr => 'filenode identifier of relation',
- proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'oid',
+ proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_relation_filenode' },
{ oid => '3454', descr => 'relation OID for filenode and tablespace',
proname => 'pg_filenode_relation', provolatile => 's',
- prorettype => 'regclass', proargtypes => 'oid oid',
+ prorettype => 'regclass', proargtypes => 'oid int8',
prosrc => 'pg_filenode_relation' },
{ oid => '3034', descr => 'file path of relation',
proname => 'pg_relation_filepath', provolatile => 's', prorettype => 'text',
@@ -11191,15 +11191,15 @@
prosrc => 'binary_upgrade_set_missing_value' },
{ oid => '4545', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_heap_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_heap_relfilenode' },
{ oid => '4546', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_index_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_index_relfilenode' },
{ oid => '4547', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_toast_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_toast_relfilenode' },
{ oid => '4548', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_pg_tablespace_oid', provolatile => 'v',
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index 785c3f6..3a18f5b 100644
--- a/src/include/common/relpath.h
+++ b/src/include/common/relpath.h
@@ -30,7 +30,7 @@
#define OIDCHARS 10 /* max chars printed by %u */
/* Characters to allow for an RelFileNumber in a relation path */
-#define RELNUMBERCHARS OIDCHARS /* same as OIDCHARS */
+#define RELNUMBERCHARS 20 /* max chars printed by %lu */
/*
* Stuff for fork names.
diff --git a/src/include/fe_utils/option_utils.h b/src/include/fe_utils/option_utils.h
index 03c09fd..2508a61 100644
--- a/src/include/fe_utils/option_utils.h
+++ b/src/include/fe_utils/option_utils.h
@@ -22,5 +22,7 @@ extern void handle_help_version_opts(int argc, char *argv[],
extern bool option_parse_int(const char *optarg, const char *optname,
int min_range, int max_range,
int *result);
+extern bool option_parse_relfilenumber(const char *optarg,
+ const char *optname);
#endif /* OPTION_UTILS_H */
diff --git a/src/include/postgres_ext.h b/src/include/postgres_ext.h
index c9774fa..12b6b80 100644
--- a/src/include/postgres_ext.h
+++ b/src/include/postgres_ext.h
@@ -48,11 +48,17 @@ typedef PG_INT64_TYPE pg_int64;
/*
* RelFileNumber data type identifies the specific relation file name.
+ * RelFileNumber is unique within a cluster.
*/
-typedef Oid RelFileNumber;
-#define InvalidRelFileNumber ((RelFileNumber) InvalidOid)
+typedef pg_int64 RelFileNumber;
+
+#define InvalidRelFileNumber ((RelFileNumber) 0)
#define RelFileNumberIsValid(relnumber) \
((bool) ((relnumber) != InvalidRelFileNumber))
+#define atorelnumber(x) ((RelFileNumber) strtoul((x), NULL, 10))
+
+/* Max value of the relfilenumber; the relfilenumber is 56 bits wide. */
+#define MAX_RELFILENUMBER INT64CONST(0x00FFFFFFFFFFFFFF)
/*
* Identifiers of error message fields. Kept here to keep common
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 406db6b..1301301 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -92,29 +92,73 @@ typedef struct buftag
{
Oid spcOid; /* tablespace oid */
Oid dbOid; /* database oid */
- RelFileNumber relNumber; /* relation file number */
- ForkNumber forkNum; /* fork number */
+
+ /*
+ * Represents relfilenumber and the fork number. The 8 high bits of the
+ * first 32 bit integer represents the fork number and remaining 24 bits
+ * of the first integer and the 32 bits of the second integer represents
+ * relfilenumber that makes relfilenumber 56 bits wide. The reason behind
+ * making it 56 bits wide instead of directly making 64 bits wide is that
+ * if we make it 64 bits wide then the size of the BufferTag will be
+ * increased. And also instead of using single 64 bits integer we are
+ * using 2 32 bits integer in order to avoid the 8 byte alignment padding
+ * for BufferTag structure.
+ */
+ uint32 relForkDetails[2];
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
+/* High relNumber bits in relForkDetails[0] */
+#define BUFTAG_RELNUM_HIGH_BITS 24
+
+/* Low relNumber bits in relForkDetails[1] */
+#define BUFTAG_RELNUM_LOW_BITS 32
+
+/* Mask to fetch high bits of relNumber from relForkDetails[0] */
+#define BUFTAG_RELNUM_HIGH_MASK ((1U << BUFTAG_RELNUM_HIGH_BITS) - 1)
+
+/* Mask to fetch low bits of relNumber from relForkDetails[1] */
+#define BUFTAG_RELNUM_LOW_MASK 0XFFFFFFFF
+
static inline RelFileNumber
BufTagGetRelNumber(const BufferTag *tag)
{
- return tag->relNumber;
+ uint64 relnum;
+
+ relnum = ((uint64) tag->relForkDetails[0]) & BUFTAG_RELNUM_HIGH_MASK;
+ relnum = (relnum << BUFTAG_RELNUM_LOW_BITS) | tag->relForkDetails[1];
+
+ Assert(relnum <= MAX_RELFILENUMBER);
+ return (RelFileNumber) relnum;
}
static inline ForkNumber
BufTagGetForkNum(const BufferTag *tag)
{
- return tag->forkNum;
+ int8 ret;
+
+ StaticAssertStmt(MAX_FORKNUM <= INT8_MAX,
+ "MAX_FORKNUM can't be greater than INT8_MAX");
+
+ ret = (int8) (tag->relForkDetails[0] >> BUFTAG_RELNUM_HIGH_BITS);
+ return (ForkNumber) ret;
}
static inline void
BufTagSetRelForkDetails(BufferTag *tag, RelFileNumber relnumber,
ForkNumber forknum)
{
- tag->relNumber = relnumber;
- tag->forkNum = forknum;
+ uint64 relnum;
+
+ Assert(relnumber <= MAX_RELFILENUMBER);
+ Assert(forknum <= MAX_FORKNUM);
+
+ relnum = relnumber;
+
+ tag->relForkDetails[0] = (relnum >> BUFTAG_RELNUM_LOW_BITS) &
+ BUFTAG_RELNUM_HIGH_MASK;
+ tag->relForkDetails[0] |= (forknum << BUFTAG_RELNUM_HIGH_BITS);
+ tag->relForkDetails[1] = relnum & BUFTAG_RELNUM_LOW_MASK;
}
static inline RelFileLocator
@@ -153,9 +197,9 @@ BufferTagsEqual(const BufferTag *tag1, const BufferTag *tag2)
{
return (tag1->spcOid == tag2->spcOid) &&
(tag1->dbOid == tag2->dbOid) &&
- (tag1->relNumber == tag2->relNumber) &&
- (tag1->blockNum == tag2->blockNum) &&
- (tag1->forkNum == tag2->forkNum);
+ (tag1->relForkDetails[0] == tag2->relForkDetails[0]) &&
+ (tag1->relForkDetails[1] == tag2->relForkDetails[1]) &&
+ (tag1->blockNum == tag2->blockNum);
}
static inline bool
diff --git a/src/include/storage/relfilelocator.h b/src/include/storage/relfilelocator.h
index 10f41f3..ef90464 100644
--- a/src/include/storage/relfilelocator.h
+++ b/src/include/storage/relfilelocator.h
@@ -32,10 +32,11 @@
* Nonzero dbOid values correspond to pg_database.oid.
*
* relNumber identifies the specific relation. relNumber corresponds to
- * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
- * to assign new physical files to relations in some situations).
- * Notice that relNumber is only unique within a database in a particular
- * tablespace.
+ * pg_class.relfilenode. Notice that relNumber values are assigned by
+ * GetNewRelFileNumber(), which will only ever assign the same value once
+ * during the lifetime of a cluster. However, since CREATE DATABASE duplicates
+ * the relfilenumbers of the template database, the values are in practice only
+ * unique within a database, not globally.
*
* Note: spcOid must be GLOBALTABLESPACE_OID if and only if dbOid is
* zero. We support shared relations only in the "global" tablespace.
@@ -75,6 +76,9 @@ typedef struct RelFileLocatorBackend
BackendId backend;
} RelFileLocatorBackend;
+#define SizeOfRelFileLocatorBackend \
+ (offsetof(RelFileLocatorBackend, backend) + sizeof(BackendId))
+
#define RelFileLocatorBackendIsTemp(rlocator) \
((rlocator).backend != InvalidBackendId)
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index d63f4f1..a489ccc 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -2164,9 +2164,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2175,10 +2174,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+---------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | t | own | child 0 index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | t | own | child 1 index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | t | orig | child 0 index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | t | orig | child 1 index
at_partitioned_id_name_key | t | none | parent index
(6 rows)
@@ -2198,9 +2197,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2209,10 +2207,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+--------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | f | own | parent index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | f | own | parent index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | f | new | parent index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | f | new | parent index
at_partitioned_id_name_key | f | none | parent index
(6 rows)
@@ -2560,7 +2558,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/expected/fast_default.out b/src/test/regress/expected/fast_default.out
index 91f2571..0a35f33 100644
--- a/src/test/regress/expected/fast_default.out
+++ b/src/test/regress/expected/fast_default.out
@@ -3,8 +3,8 @@
--
SET search_path = fast_default;
CREATE SCHEMA fast_default;
-CREATE TABLE m(id OID);
-INSERT INTO m VALUES (NULL::OID);
+CREATE TABLE m(id BIGINT);
+INSERT INTO m VALUES (NULL::BIGINT);
CREATE FUNCTION set(tabname name) RETURNS VOID
AS $$
BEGIN
diff --git a/src/test/regress/expected/oidjoins.out b/src/test/regress/expected/oidjoins.out
index 215eb89..af57470 100644
--- a/src/test/regress/expected/oidjoins.out
+++ b/src/test/regress/expected/oidjoins.out
@@ -74,11 +74,11 @@ NOTICE: checking pg_type {typcollation} => pg_collation {oid}
NOTICE: checking pg_attribute {attrelid} => pg_class {oid}
NOTICE: checking pg_attribute {atttypid} => pg_type {oid}
NOTICE: checking pg_attribute {attcollation} => pg_collation {oid}
+NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {relnamespace} => pg_namespace {oid}
NOTICE: checking pg_class {reltype} => pg_type {oid}
NOTICE: checking pg_class {reloftype} => pg_type {oid}
NOTICE: checking pg_class {relowner} => pg_authid {oid}
-NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {reltablespace} => pg_tablespace {oid}
NOTICE: checking pg_class {reltoastrelid} => pg_class {oid}
NOTICE: checking pg_class {relrewrite} => pg_class {oid}
diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql
index e7013f5..c9622f6 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -1478,9 +1478,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1499,9 +1498,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1641,7 +1639,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/sql/fast_default.sql b/src/test/regress/sql/fast_default.sql
index 16a3b7c..819ec40 100644
--- a/src/test/regress/sql/fast_default.sql
+++ b/src/test/regress/sql/fast_default.sql
@@ -4,8 +4,8 @@
SET search_path = fast_default;
CREATE SCHEMA fast_default;
-CREATE TABLE m(id OID);
-INSERT INTO m VALUES (NULL::OID);
+CREATE TABLE m(id BIGINT);
+INSERT INTO m VALUES (NULL::BIGINT);
CREATE FUNCTION set(tabname name) RETURNS VOID
AS $$
--
1.8.3.1
On Thu, Jul 28, 2022 at 7:32 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
Thanks, I have rebased other patches, actually, there is a new 0001
patch now. It seems during renaming relnode related Oid to
RelFileNumber, some of the references were missed and in the last
patch set I kept it as part of main patch 0003, but I think it's
better to keep it separate. So took out those changes and created
0001, but if you think this can be committed as part of 0003 only, then
that's also fine with me.
I committed this in part. I took out the introduction of
RELNUMBERCHARS as I think that should probably be a separate commit,
but added in a comment change that you seem to have overlooked.
I have done some cleanup in 0002 as well, basically, earlier we were
storing the result of the BufTagGetRelFileLocator() in a separate
variable which is not required everywhere. So wherever possible I
have avoided using the intermediate variable.
I'll have a look at this next.
--
Robert Haas
EDB: http://www.enterprisedb.com
Not a full review, just a quick skim of 0003.
On 2022-Jul-28, Dilip Kumar wrote:
+	if (!shutdown)
+	{
+		if (ShmemVariableCache->loggedRelFileNumber < checkPoint.nextRelFileNumber)
+			elog(ERROR, "nextRelFileNumber can not go backward from " INT64_FORMAT "to" INT64_FORMAT,
+				 checkPoint.nextRelFileNumber, ShmemVariableCache->loggedRelFileNumber);
+
+		checkPoint.nextRelFileNumber = ShmemVariableCache->loggedRelFileNumber;
+	}
Please don't do this; rather use %llu and cast to (long long).
Otherwise the string becomes mangled for translation. I think there are
many uses of this sort of pattern in strings, but not all of them are
translatable so maybe we don't care -- for example contrib doesn't have
translations. And the rmgrdesc routines don't translate either, so we
probably don't care about it there; and nothing that uses elog either.
But this one in particular I think should be an ereport, not an elog.
There are several other ereports in various places of the patch also.
@@ -2378,7 +2378,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
 	if (memcmp(replay_image_masked, primary_image_masked, BLCKSZ) != 0)
 	{
 		elog(FATAL,
-			 "inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
+			 "inconsistent page found, rel %u/%u/" INT64_FORMAT ", forknum %u, blkno %u",
 			 rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
 			 forknum, blkno);
Should this one be an ereport? If so, you do need to change it to that
and handle it accordingly.
+	if (xlrec->rlocator.relNumber > ShmemVariableCache->nextRelFileNumber)
+		elog(ERROR, "unexpected relnumber " INT64_FORMAT "that is bigger than nextRelFileNumber " INT64_FORMAT,
+			 xlrec->rlocator.relNumber, ShmemVariableCache->nextRelFileNumber);
You missed one whitespace here after the INT64_FORMAT.
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index c390ec5..f727078 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -250,6 +250,8 @@ main(int argc, char *argv[])
 	printf(_("Latest checkpoint's NextXID:          %u:%u\n"),
 		   EpochFromFullTransactionId(ControlFile->checkPointCopy.nextXid),
 		   XidFromFullTransactionId(ControlFile->checkPointCopy.nextXid));
+	printf(_("Latest checkpoint's NextRelFileNumber: " INT64_FORMAT "\n"),
+		   ControlFile->checkPointCopy.nextRelFileNumber);
This one must definitely be translatable.
 /* Characters to allow for an RelFileNumber in a relation path */
-#define RELNUMBERCHARS	OIDCHARS	/* same as OIDCHARS */
+#define RELNUMBERCHARS	20	/* max chars printed by %lu */
Maybe say %llu here instead.
I do wonder why do we keep relfilenodes limited to decimal digits. Why
not use hex digits? Then we know the limit is 14 chars, as in
0x00FFFFFFFFFFFFFF in the MAX_RELFILENUMBER definition.
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"Thou shalt not follow the NULL pointer, for chaos and madness await
thee at its end." (2nd Commandment for C programmers)
On Thu, Jul 28, 2022 at 11:59 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
I do wonder why do we keep relfilenodes limited to decimal digits. Why
not use hex digits? Then we know the limit is 14 chars, as in
0x00FFFFFFFFFFFFFF in the MAX_RELFILENUMBER definition.
Hmm, but surely we want the error messages to be printed using the
same format that we use for the actual filenames. We could make the
filenames use hex characters too, but I'm not wild about changing
user-visible details like that.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Thu, Jul 28, 2022 at 9:52 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Jul 28, 2022 at 11:59 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
I do wonder why do we keep relfilenodes limited to decimal digits. Why
not use hex digits? Then we know the limit is 14 chars, as in
0x00FFFFFFFFFFFFFF in the MAX_RELFILENUMBER definition.
Hmm, but surely we want the
same format that we use for the actual filenames. We could make the
filenames use hex characters too, but I'm not wild about changing
user-visible details like that.
From a DBA perspective this would be a regression in usability.
JD
--
- Founder - https://commandprompt.com/ - 24x7x365 Postgres since 1997
- Founder and Co-Chair - https://postgresconf.org/
- Founder - https://postgresql.us - United States PostgreSQL
- Public speaker, published author, postgresql expert, and people
believer.
- Host - More than a refresh
<https://commandprompt.com/about/more-than-a-refresh/>: A podcast about
data and the people who wrangle it.
On 2022-Jul-28, Robert Haas wrote:
On Thu, Jul 28, 2022 at 11:59 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
I do wonder why do we keep relfilenodes limited to decimal digits. Why
not use hex digits? Then we know the limit is 14 chars, as in
0x00FFFFFFFFFFFFFF in the MAX_RELFILENUMBER definition.
Hmm, but surely
same format that we use for the actual filenames.
Of course.
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"Most hackers will be perfectly comfortable conceptualizing users as entropy
sources, so let's move on." (Nathaniel Smith)
On Thu, Jul 28, 2022 at 5:02 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
+/* ----------
+ * RelFileNumber zero is InvalidRelFileNumber.
+ *
+ * For the system tables (OID < FirstNormalObjectId) the initial storage
The above comment says that RelFileNumber zero is invalid, which is technically
correct because we don't have any relation file on disk with that number.
But the point is that if someone reads the definition of
CHECK_RELFILENUMBER_RANGE below, he/she might get confused, because as per
that definition relfilenumber zero is valid.
+#define CHECK_RELFILENUMBER_RANGE(relfilenumber) \
+do { \
+ if ((relfilenumber) < 0 || (relfilenumber) > MAX_RELFILENUMBER) \
+ ereport(ERROR, \
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE), \
+ errmsg("relfilenumber " INT64_FORMAT " is out of range",
\
+ (relfilenumber)))); \
+} while (0)
+
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
It seems like the relfilenumber in the above definition represents the
relfilenode value in pg_class, which can hold zero, and a zero there
actually means it's a mapped relation. I think it would be good to provide
some clarity here.
--
With Regards,
Ashutosh Sharma.
On Fri, Jul 29, 2022 at 6:26 PM Ashutosh Sharma <ashu.coek88@gmail.com>
wrote:
On Thu, Jul 28, 2022 at 5:02 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
+/* ----------
+ * RelFileNumber zero is InvalidRelFileNumber.
+ *
+ * For the system tables (OID < FirstNormalObjectId) the initial storage
Above comment says that RelFileNumber zero is invalid which is technically
correct because we don't have any relation file in disk with zero number.
But the point is that if someone reads below definition of
CHECK_RELFILENUMBER_RANGE he/she might get confused because as per this
definition relfilenumber zero is valid.
Please ignore the above comment shared in my previous email. It is a
little over-thinking on my part that generated this comment in my mind.
Sorry for that. Here are the other comments I have:
+/* First we have to remove them from the extension */
+ALTER EXTENSION pg_buffercache DROP VIEW pg_buffercache;
+ALTER EXTENSION pg_buffercache DROP FUNCTION pg_buffercache_pages();
+
+/* Then we can drop them */
+DROP VIEW pg_buffercache;
+DROP FUNCTION pg_buffercache_pages();
+
+/* Now redefine */
+CREATE OR REPLACE FUNCTION pg_buffercache_pages()
+RETURNS SETOF RECORD
+AS 'MODULE_PATHNAME', 'pg_buffercache_pages_v1_4'
+LANGUAGE C PARALLEL SAFE;
+
+CREATE OR REPLACE VIEW pg_buffercache AS
+ SELECT P.* FROM pg_buffercache_pages() AS P
+ (bufferid integer, relfilenode int8, reltablespace oid, reldatabase oid,
+ relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
+ pinning_backends int4);
As we are dropping the function and view I think it would be good if we
*don't* use the "OR REPLACE" keyword when re-defining them.
==
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("relfilenode" INT64_FORMAT " is too
large to be represented as an OID",
+ fctx->record[i].relfilenumber),
+ errhint("Upgrade the extension using ALTER
EXTENSION pg_buffercache UPDATE")));
I think it would be good to recommend users to upgrade to the latest
version instead of just saying upgrade the pg_buffercache using ALTER
EXTENSION ....
==
--- a/contrib/pg_walinspect/sql/pg_walinspect.sql
+++ b/contrib/pg_walinspect/sql/pg_walinspect.sql
@@ -39,10 +39,10 @@ SELECT COUNT(*) >= 0 AS ok FROM
pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl'
\gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname =
'sample_tbl' \gset
Is this change required? The original query is just trying to fetch table
oid not relfilenode and AFAIK we haven't changed anything in table oid.
==
+#define CHECK_RELFILENUMBER_RANGE(relfilenumber) \
+do { \
+ if ((relfilenumber) < 0 || (relfilenumber) > MAX_RELFILENUMBER) \
+ ereport(ERROR, \
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE), \
+ errmsg("relfilenumber " INT64_FORMAT " is out of range",
\
+ (relfilenumber)))); \
+} while (0)
+
I think we can shift this macro to some header file and reuse it at several
places.
==
+ * Generate a new relfilenumber. We cannot reuse the old relfilenumber
+ * because of the possibility that that relation will be moved back to
the
that that relation -> that relation
--
With Regards,
Ashutosh Sharma.
On Wed, Jul 27, 2022 at 6:02 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Wed, Jul 27, 2022 at 3:27 PM vignesh C <vignesh21@gmail.com> wrote:
Thanks for the updated patch. A few comments:
1) The format specifier should be changed from %u to INT64_FORMAT
autoprewarm.c -> apw_load_buffers
...............
if (fscanf(file, "%u,%u,%u,%u,%u\n", &blkinfo[i].database,
&blkinfo[i].tablespace, &blkinfo[i].filenumber,
&forknum, &blkinfo[i].blocknum) != 5)
...............
2) The format specifier should be changed from %u to INT64_FORMAT
autoprewarm.c -> apw_dump_now
...............
ret = fprintf(file, "%u,%u,%u,%u,%u\n",
block_info_array[i].database,
block_info_array[i].tablespace,
block_info_array[i].filenumber,
(uint32) block_info_array[i].forknum,
block_info_array[i].blocknum);
...............
3) should the comment "entry point for old extension version" be on top
of pg_buffercache_pages, as the current version will use
pg_buffercache_pages_v1_4
+
+Datum
+pg_buffercache_pages(PG_FUNCTION_ARGS)
+{
+	return pg_buffercache_pages_internal(fcinfo, OIDOID);
+}
+
+/* entry point for old extension version */
+Datum
+pg_buffercache_pages_v1_4(PG_FUNCTION_ARGS)
+{
+	return pg_buffercache_pages_internal(fcinfo, INT8OID);
+}
4) we could use the new style of ereport by removing the brackets around
errcode:
+		if (fctx->record[i].relfilenumber > OID_MAX)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					 errmsg("relfilenode" INT64_FORMAT " is too large to be represented as an OID",
+							fctx->record[i].relfilenumber),
+					 errhint("Upgrade the extension using ALTER EXTENSION pg_buffercache UPDATE")));
like:
ereport(ERROR,
		errcode(ERRCODE_INVALID_PARAMETER_VALUE),
		errmsg("relfilenode" INT64_FORMAT " is too large to be represented as an OID",
			   fctx->record[i].relfilenumber),
		errhint("Upgrade the extension using ALTER EXTENSION pg_buffercache UPDATE"));
5) Similarly in the below code too:
+	/* check whether the relfilenumber is within a valid range */
+	if ((relfilenumber) < 0 || (relfilenumber) > MAX_RELFILENUMBER)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("relfilenumber " INT64_FORMAT " is out of range",
+						(relfilenumber))));
6) Similarly in the below code too:
+#define CHECK_RELFILENUMBER_RANGE(relfilenumber) \
+do { \
+	if ((relfilenumber) < 0 || (relfilenumber) > MAX_RELFILENUMBER) \
+		ereport(ERROR, \
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE), \
+				 errmsg("relfilenumber " INT64_FORMAT " is out of range", \
+						(relfilenumber)))); \
+} while (0)
+
7) This error code looks similar to CHECK_RELFILENUMBER_RANGE, can this
macro be used here too:
pg_filenode_relation(PG_FUNCTION_ARGS)
{
	Oid			reltablespace = PG_GETARG_OID(0);
-	RelFileNumber relfilenumber = PG_GETARG_OID(1);
+	RelFileNumber relfilenumber = PG_GETARG_INT64(1);
	Oid			heaprel;
+	/* check whether the relfilenumber is within a valid range */
+	if ((relfilenumber) < 0 || (relfilenumber) > MAX_RELFILENUMBER)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("relfilenumber " INT64_FORMAT " is out of range",
+						(relfilenumber))));
8) I felt this include is not required:
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 849a7ce..a2f0d35 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -13,12 +13,16 @@
 #include "postgres.h"
+#include <unistd.h>
+
 #include "access/clog.h"
 #include "access/commit_ts.h"
9) should we change elog to ereport to use the New-style error reporting API
+	/* safety check, we should never get this far in a HS standby */
+	if (RecoveryInProgress())
+		elog(ERROR, "cannot assign RelFileNumber during recovery");
+
+	if (IsBinaryUpgrade)
+		elog(ERROR, "cannot assign RelFileNumber during binary upgrade");
10) Here nextRelFileNumber is protected by RelFileNumberGenLock, the
comment stated OidGenLock. It should be slightly adjusted.
typedef struct VariableCacheData
{
/*
* These fields are protected by OidGenLock.
*/
Oid nextOid; /* next OID to assign */
uint32 oidCount; /* OIDs available before must do XLOG work */
RelFileNumber nextRelFileNumber; /* next relfilenumber to assign */
RelFileNumber loggedRelFileNumber; /* last logged relfilenumber */
XLogRecPtr loggedRelFileNumberRecPtr; /* xlog record pointer w.r.t.
* loggedRelFileNumber */Thanks for the review I have fixed these except,
9) should we change elog to ereport to use the New-style error reporting API
I think this is internal error so if we use ereport we need to give
error code and all and I think for internal that is not necessary?
Ok, Sounds reasonable.
8) I felt this include is not required:
it is using access API so we do need <unistd.h>
Ok, it worked for me because I had not used the assert-enabled build flag
during compilation.
Regards,
Vignesh
On Thu, Jul 28, 2022 at 10:29 AM Robert Haas <robertmhaas@gmail.com> wrote:
I have done some cleanup in 0002 as well, basically, earlier we were
storing the result of the BufTagGetRelFileLocator() in a separate
variable which is not required everywhere. So wherever possible I
have avoided using the intermediate variable.

I'll have a look at this next.
I was taught that when programming in C one should avoid returning a
struct type, as BufTagGetRelFileLocator does. I would have expected it
to return void and take an argument of type RelFileLocator * into
which it writes the results. On the other hand, I was also taught that
one should avoid passing a struct type as an argument, and smgropen()
has been doing that since Tom Lane committed
87bd95638552b8fc1f5f787ce5b862bb6fc2eb80 all the way back in 2004. So
maybe this isn't that relevant any more on modern compilers? Or maybe
for small structs it doesn't matter much? I dunno.
Other than that, I think your 0002 looks fine.
--
Robert Haas
EDB: http://www.enterprisedb.com
On 2022-Jul-29, Robert Haas wrote:
I was taught that when programming in C one should avoid returning a
struct type, as BufTagGetRelFileLocator does.
Doing it like that helps RelFileLocatorSkippingWAL, which takes a bare
RelFileLocator as argument. With this coding you can call one function
with the other function as its argument.
However, with the current definition of relpathbackend() and siblings,
it looks quite disastrous -- BufTagGetRelFileLocator is being called
three times. You could argue that a solution would be to turn those
macros into static inline functions.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"I'm impressed how quickly you are fixing this obscure issue. I came from
MS SQL and it would be hard for me to put into words how much of a better job
you all are doing on [PostgreSQL]."
Steve Midgley, http://archives.postgresql.org/pgsql-sql/2008-08/msg00000.php
On Fri, Jul 29, 2022 at 2:12 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2022-Jul-29, Robert Haas wrote:
I was taught that when programming in C one should avoid returning a
struct type, as BufTagGetRelFileLocator does.

Doing it like that helps RelFileLocatorSkippingWAL, which takes a bare
RelFileLocator as argument. With this coding you can call one function
with the other function as its argument.

However, with the current definition of relpathbackend() and siblings,
it looks quite disastrous -- BufTagGetRelFileLocator is being called
three times. You could argue that a solution would be to turn those
macros into static inline functions.
Yeah, if we think it's OK to pass around structs, then that seems like
the right solution. Otherwise functions that take RelFileLocator
should be changed to take const RelFileLocator * and we should adjust
elsewhere accordingly.
--
Robert Haas
EDB: http://www.enterprisedb.com
On 2022-Jul-29, Robert Haas wrote:
Yeah, if we think it's OK to pass around structs, then that seems like
the right solution. Otherwise functions that take RelFileLocator
should be changed to take const RelFileLocator * and we should adjust
elsewhere accordingly.
We do that in other places. See get_object_address() for another
example. Now, I don't see *why* they do it. I suppose there's
notational convenience; for get_object_address() I think it'd be uglier
with another out argument (it already has *relp). For smgropen() it's
not clear at all that there is any.
For the new function, there's at least a couple of places that the
calling convention makes simpler, so I don't see why you wouldn't use it
that way.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"Use it up, wear it out, make it do, or do without"
On Fri, Jul 29, 2022 at 3:18 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2022-Jul-29, Robert Haas wrote:
Yeah, if we think it's OK to pass around structs, then that seems like
the right solution. Otherwise functions that take RelFileLocator
should be changed to take const RelFileLocator * and we should adjust
elsewhere accordingly.

We do that in other places. See get_object_address() for another
example. Now, I don't see *why* they do it. I suppose there's
notational convenience; for get_object_address() I think it'd be uglier
with another out argument (it already has *relp). For smgropen() it's
not clear at all that there is any.

For the new function, there's at least a couple of places that the
calling convention makes simpler, so I don't see why you wouldn't use it
that way.
All right, perhaps it's fine as Dilip has it, then.
--
Robert Haas
EDB: http://www.enterprisedb.com
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
On 2022-Jul-29, Robert Haas wrote:
Yeah, if we think it's OK to pass around structs, then that seems like
the right solution. Otherwise functions that take RelFileLocator
should be changed to take const RelFileLocator * and we should adjust
elsewhere accordingly.
We do that in other places. See get_object_address() for another
example. Now, I don't see *why* they do it.
If it's a big struct then avoiding copying it is good; but RelFileLocator
isn't that big.
While researching that statement I did happen to notice that no one has
bothered to update the comment immediately above struct RelFileLocator,
and it is something that absolutely does require attention if there
are plans to make RelFileNumber something other than 32 bits.
* Note: various places use RelFileLocator in hashtable keys. Therefore,
* there *must not* be any unused padding bytes in this struct. That
* should be safe as long as all the fields are of type Oid.
*/
typedef struct RelFileLocator
{
Oid spcOid; /* tablespace */
Oid dbOid; /* database */
RelFileNumber relNumber; /* relation */
} RelFileLocator;
regards, tom lane
Robert Haas <robertmhaas@gmail.com> writes:
I was taught that when programming in C one should avoid returning a
struct type, as BufTagGetRelFileLocator does.
FWIW, I think that was invalid pre-ANSI-C, and maybe even in C89.
C99 and later requires it. But it is pass-by-value and you have
to think twice about whether you want the struct to be copied.
regards, tom lane
On Wed, Jul 20, 2022 at 7:27 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
There was also an issue where the user table from the old cluster's
relfilenode could conflict with the system table of the new cluster.
As a solution currently for system table object (while creating
storage first time) we are keeping the low range of relfilenumber,
basically we are using the same relfilenumber as OID so that during
upgrade the normal user table from the old cluster will not conflict
with the system tables in the new cluster. But with this solution
Robert told me (in off list chat) a problem that in future if we want
to make relfilenumber completely unique within a cluster by
implementing the CREATEDB differently then we can not do that as we
have created fixed relfilenodes for the system tables.

I am not sure what exactly we can do to avoid that because even if we
do something to avoid that in the new cluster the old cluster might
be already using the non-unique relfilenode so after upgrading the new
cluster will also get those non-unique relfilenode.
I think this aspect of the patch could use some more discussion.
To recap, the problem is that pg_upgrade mustn't discover that a
relfilenode that is being migrated from the old cluster is being used
for some other table in the new cluster. Since the new cluster should
only contain system tables that we assume have never been rewritten,
they'll all have relfilenodes equal to their OIDs, and thus less than
16384. On the other hand all the user tables from the old cluster will
have relfilenodes greater than 16384, so we're fine. pg_largeobject,
which also gets migrated, is a special case. Since we don't change OID
assignments from version to version, it should have either the same
relfilenode value in the old and new clusters, if never rewritten, or
else the value in the old cluster will be greater than 16384, in which
case no conflict is possible.
But if we just assign all relfilenode values from a central counter,
then we have got trouble. If the new version has more system catalog
tables than the old version, then some value that got used for a user
table in the old version might get used for a system table in the new
version, which is a problem. One idea for fixing this is to have two
RelFileNumber ranges: a system range (small values) and a user range.
System tables get values in the system range initially, and in the
user range when first rewritten. User tables always get values in the
user range. Everything works fine in this scenario except maybe for
pg_largeobject: what if it gets one value from the system range in the
old cluster, and a different value from the system range in the new
cluster, but some other system table in the new cluster gets the value
that pg_largeobject had in the old cluster? Then we've got trouble. It
doesn't help if we assign pg_largeobject a starting relfilenode from
the user range, either: now a relfilenode that needs to end up
containing some user table from the old cluster might find itself
blocked by pg_largeobject in the new cluster.
One solution to all this is to do as Dilip proposes here: for system
relations, keep assigning the OID as the initial relfilenumber.
Actually, we really only need to do this for pg_largeobject; all the
other relfilenumber values could be assigned from a counter, as long
as they're assigned from a range distinct from what we use for user
relations.
But I don't really like that, because I feel like the whole thing
where we start out with relfilenumber=oid is a recipe for hidden bugs.
I believe we'd be better off if we decouple those concepts more
thoroughly. So here's another idea: what if we set the
next-relfilenumber counter for the new cluster to the value from the
old cluster, and then rewrote all the (thus-far-empty) system tables?
Then every system relation in the new cluster has a relfilenode value
greater than any in use in the old cluster, so we can afterwards
migrate over every relfilenode from the old cluster with no risk of
conflicting with anything. Then all the special cases go away. We
don't need system and user ranges for relfilenodes, and
pg_largeobject's not a special case, either. We can assign relfilenode
values to system relations in exactly the same way we do for user
relations: assign a value from the global counter and forget about it.
If this cluster happens to be the "new cluster" for a pg_upgrade
attempt, the procedure described at the beginning of this paragraph
will move everything that might conflict out of the way.
One thing to perhaps not like about this is that it's a little more
expensive: clustering every system table in every database on a new
cluster isn't completely free. Perhaps it's not expensive enough to be
a big problem, though.
Thoughts?
--
Robert Haas
EDB: http://www.enterprisedb.com
On Sat, Jul 30, 2022 at 8:08 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
I was taught that when programming in C one should avoid returning a
struct type, as BufTagGetRelFileLocator does.

FWIW, I think that was invalid pre-ANSI-C, and maybe even in C89.
C99 and later requires it. But it is pass-by-value and you have
to think twice about whether you want the struct to be copied.
C89 had that.
As for what it actually does in a non-inlined function: on all modern
Unix-y systems, 128 bit first arguments and return values are
transferred in register pairs[1]. So if you define a struct that
holds uint32_t, uint32_t, uint64_t and compile a function that takes
one and returns it, you see the struct being transferred directly from
input registers to output registers:
0x0000000000000000 <+0>: mov %rdi,%rax
0x0000000000000003 <+3>: mov %rsi,%rdx
0x0000000000000006 <+6>: ret
Similar on ARM64. There it's an empty function, so it must be using
the same register in and out[2].
The MSVC calling convention is different and doesn't seem to be able
to pass it through registers, so it schleps it out to memory at a
return address[3]. But that's pretty similar to the proposed
alternative anyway, so surely no worse. *shrug* And of course those
"constructor"-like functions are inlined anyway.
[1]: https://en.wikipedia.org/wiki/X86_calling_conventions#System_V_AMD64_ABI
[2]: https://gcc.godbolt.org/z/qfPzhW7YM
[3]: https://gcc.godbolt.org/z/WqvYz6xjs
On Sat, Jul 30, 2022 at 9:11 AM Thomas Munro <thomas.munro@gmail.com> wrote:
on all modern Unix-y systems,
(I meant to write AMD64 there)
On Thu, Jul 28, 2022 at 9:29 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
Not a full review, just a quick skim of 0003.
Thanks for the review
+ if (!shutdown)
+ {
+ if (ShmemVariableCache->loggedRelFileNumber < checkPoint.nextRelFileNumber)
+ elog(ERROR, "nextRelFileNumber can not go backward from " INT64_FORMAT "to" INT64_FORMAT,
+ checkPoint.nextRelFileNumber, ShmemVariableCache->loggedRelFileNumber);
+
+ checkPoint.nextRelFileNumber = ShmemVariableCache->loggedRelFileNumber;
+ }

Please don't do this; rather use %llu and cast to (long long).
Otherwise the string becomes mangled for translation. I think there are
many uses of this sort of pattern in strings, but not all of them are
translatable so maybe we don't care -- for example contrib doesn't have
translations. And the rmgrdesc routines don't translate either, so we
probably don't care about it there; and nothing that uses elog either.
But this one in particular I think should be an ereport, not an elog.
There are several other ereports in various places of the patch also.
Okay, actually I did not clearly understand the logic of when to use
%llu and when to use (U)INT64_FORMAT. They are both used for 64-bit
integers. So do you think it is fine to replace all INT64_FORMAT in
my patch with %llu?
@@ -2378,7 +2378,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
 if (memcmp(replay_image_masked, primary_image_masked, BLCKSZ) != 0)
 {
 elog(FATAL,
- "inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
+ "inconsistent page found, rel %u/%u/" INT64_FORMAT ", forknum %u, blkno %u",
 rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
 forknum, blkno);

Should this one be an ereport, and thus you do need to change it to that
and handle it like that?
Okay, so you mean irrespective of this patch should this be converted
to ereport?
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On 2022-Jul-30, Dilip Kumar wrote:
On Thu, Jul 28, 2022 at 9:29 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
Please don't do this; rather use %llu and cast to (long long).
Otherwise the string becomes mangled for translation.

Okay, actually I did not clearly understand the logic of when to use
%llu and when to use (U)INT64_FORMAT. They are both used for 64-bit
integers. So do you think it is fine to replace all INT64_FORMAT in
my patch with %llu?
The point here is that there are two users of the source code: one is
the compiler, and the other is gettext, which extracts the string for
the translation catalog. The compiler is OK with UINT64_FORMAT, of
course (because the preprocessor deals with it). But gettext is quite
stupid and doesn't understand that UINT64_FORMAT expands to some
specifier, so it truncates the string at the double quote sign just
before; in other words, it just doesn't work. So whenever you have a
string that ends up in a translation catalog, you must not use
UINT64_FORMAT or any other preprocessor macro; it has to be a straight
specifier in the format string.
We have found that the most convenient notation is to use %llu in the
string and cast the argument to (unsigned long long), so our convention
is to use that.
For strings that do not end up in a translation catalog, there's no
reason to use %llu-and-cast; UINT64_FORMAT is okay.
@@ -2378,7 +2378,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
 if (memcmp(replay_image_masked, primary_image_masked, BLCKSZ) != 0)
 {
 elog(FATAL,
- "inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
+ "inconsistent page found, rel %u/%u/" INT64_FORMAT ", forknum %u, blkno %u",
 rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
 forknum, blkno);

Should this one be an ereport, and thus you do need to change it to that
and handle it like that?

Okay, so you mean irrespective of this patch should this be converted
to ereport?
Yes, I think this should be an ereport with errcode(ERRCODE_DATA_CORRUPTED).
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
On Sat, Jul 30, 2022 at 1:35 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
On 2022-Jul-29, Robert Haas wrote:
Yeah, if we think it's OK to pass around structs, then that seems like
the right solution. Otherwise functions that take RelFileLocator
should be changed to take const RelFileLocator * and we should adjust
elsewhere accordingly.

We do that in other places. See get_object_address() for another
example. Now, I don't see *why* they do it.

If it's a big struct then avoiding copying it is good; but RelFileLocator
isn't that big.

While researching that statement I did happen to notice that no one has
bothered to update the comment immediately above struct RelFileLocator,
and it is something that absolutely does require attention if there
are plans to make RelFileNumber something other than 32 bits.
I think we need to update this comment in the patch where we are
making RelFileNumber 64 bits wide. But as such I do not see a problem
in using RelFileLocator directly as key because if we make
RelFileNumber 64 bits then its structure will already be 8 byte
aligned so there should not be any padding. However, if we use some
other structure as key which contain RelFileLocator i.e.
RelFileLocatorBackend then there will be a problem. So for handling
that issue while computing the key size (wherever we have
RelFileLocatorBackend as key) I have avoided the padding bytes in size
by introducing this new macro[1].

[1]:
#define SizeOfRelFileLocatorBackend \
(offsetof(RelFileLocatorBackend, backend) + sizeof(BackendId))
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Fri, Jul 29, 2022 at 10:55 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Jul 28, 2022 at 10:29 AM Robert Haas <robertmhaas@gmail.com> wrote:
I have done some cleanup in 0002 as well, basically, earlier we were
storing the result of the BufTagGetRelFileLocator() in a separate
variable which is not required everywhere. So wherever possible I
have avoided using the intermediate variable.

I'll have a look at this next.
I was taught that when programming in C one should avoid returning a
struct type, as BufTagGetRelFileLocator does. I would have expected it
to return void and take an argument of type RelFileLocator * into
which it writes the results. On the other hand, I was also taught that
one should avoid passing a struct type as an argument, and smgropen()
has been doing that since Tom Lane committed
87bd95638552b8fc1f5f787ce5b862bb6fc2eb80 all the way back in 2004. So
maybe this isn't that relevant any more on modern compilers? Or maybe
for small structs it doesn't matter much? I dunno.

Other than that, I think your 0002 looks fine.
Generally, I try to avoid it, but I see that the current code also does
it this way when the structure is small and returning it directly makes
the other code easier[1]. The reasons I wanted to do it this way are:
a) if we pass it as an argument then I have to use an extra variable,
which makes some code more complicated; it's not a big issue, in fact I
had it that way in a previous version but simplified it in one of the
recent versions. b) If I allocate memory and return a pointer then I
also need to store that address and later free it.
[1]:
static inline ForEachState
for_each_from_setup(const List *lst, int N)
{
ForEachState r = {lst, N};
Assert(N >= 0);
return r;
}
static inline FullTransactionId
FullTransactionIdFromEpochAndXid(uint32 epoch, TransactionId xid)
{
FullTransactionId result;
result.value = ((uint64) epoch) << 32 | xid;
return result;
}
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Fri, Jul 29, 2022 at 8:02 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("relfilenode" INT64_FORMAT " is too large to be represented as an OID",
+ fctx->record[i].relfilenumber),
+ errhint("Upgrade the extension using ALTER EXTENSION pg_buffercache UPDATE")));

I think it would be good to recommend users to upgrade to the latest
version instead of just saying upgrade the pg_buffercache using ALTER
EXTENSION ....
This error would be hit if the relfilenumber is out of the OID range,
which means the user is using a new cluster with an old pg_buffercache
extension. So this errhint is about suggesting to upgrade the
extension.
==
--- a/contrib/pg_walinspect/sql/pg_walinspect.sql
+++ b/contrib/pg_walinspect/sql/pg_walinspect.sql
@@ -39,10 +39,10 @@
 SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
 -- Test for filtering out WAL records of a particular table
 -- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset

Is this change required? The original query is just trying to fetch the
table oid, not the relfilenode, and AFAIK we haven't changed anything in
the table oid.
If you look at the complete test, you will see that sample_tbl_oid is
used for verification in pg_get_wal_records_info(). Earlier it was okay
to use the oid instead of the relfilenode, because the test just creates
a table, does some DML, and verifies the oid in the WAL, and the oid was
the same as the relfilenode; that is no longer true. So we have to check
the relfilenode, which was the actual intention of the test.
+ * Generate a new relfilenumber. We cannot reuse the old relfilenumber
+ * because of the possibility that that relation will be moved back to the

that that relation -> that relation
I think this is a grammatically correct sentence.
I have fixed other comments, and also fixed comments from Alvaro to
use %lld instead of INT64_FORMAT inside the ereport and wherever he
suggested.
I haven't yet changed MAX_RELFILENUMBER to be represented in hex
characters, because then we would have to change the filenames as well;
there is no conclusion yet on whether we want to keep it as it is or
switch to hex. There is also a suggestion to change one of the existing
elogs to an ereport, so I will share a separate patch for that.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachments:
v15-0002-Widen-relfilenumber-from-32-bits-to-56-bits.patchtext/x-patch; charset=UTF-8; name=v15-0002-Widen-relfilenumber-from-32-bits-to-56-bits.patchDownload
From 33ea40ab118a461bceecba4b5a9e82535e76d01d Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Sun, 31 Jul 2022 17:13:55 +0530
Subject: [PATCH v15 2/2] Widen relfilenumber from 32 bits to 56 bits
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Currently relfilenumber is 32 bits wide and that has a risk of wraparound so
the relfilenumber can be reused. And to guard against the relfilenumber reuse
there is some complicated hack which leaves a 0-length tombstone file around
until the next checkpoint. And when we allocate a new relfilenumber
we also need to loop to check the on disk conflict.
As part of this patch we are making the relfilenumber 56 bits wide and there will be
no provision for wraparound. So after this change we will be able to get rid of the
0-length tombstone file and the loop for checking the on-disk conflict of the
relfilenumbers.
The reason behind making it 56 bits wide instead of directly making 64 bits wide is
that if we make it 64 bits wide then the size of the BufferTag will be increased which
will increase the memory usage and that may also impact the performance. So in order
to avoid that, inside the buffer tag, we will use 8 bits for the fork number and 56 bits
for the relfilenumber.
---
contrib/pg_buffercache/Makefile | 4 +-
.../pg_buffercache/pg_buffercache--1.3--1.4.sql | 30 ++++
contrib/pg_buffercache/pg_buffercache.control | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 39 ++++-
contrib/pg_prewarm/autoprewarm.c | 4 +-
contrib/pg_walinspect/expected/pg_walinspect.out | 4 +-
contrib/pg_walinspect/sql/pg_walinspect.sql | 4 +-
contrib/test_decoding/expected/rewrite.out | 2 +-
contrib/test_decoding/sql/rewrite.sql | 2 +-
doc/src/sgml/catalogs.sgml | 2 +-
doc/src/sgml/pgbuffercache.sgml | 2 +-
doc/src/sgml/storage.sgml | 5 +-
src/backend/access/gin/ginxlog.c | 2 +-
src/backend/access/rmgrdesc/gistdesc.c | 2 +-
src/backend/access/rmgrdesc/heapdesc.c | 2 +-
src/backend/access/rmgrdesc/nbtdesc.c | 2 +-
src/backend/access/rmgrdesc/seqdesc.c | 2 +-
src/backend/access/rmgrdesc/xlogdesc.c | 21 ++-
src/backend/access/transam/README | 5 +-
src/backend/access/transam/varsup.c | 183 ++++++++++++++++++++-
src/backend/access/transam/xlog.c | 57 +++++++
src/backend/access/transam/xlogprefetcher.c | 14 +-
src/backend/access/transam/xlogrecovery.c | 6 +-
src/backend/access/transam/xlogutils.c | 12 +-
src/backend/catalog/catalog.c | 95 -----------
src/backend/catalog/heap.c | 25 +--
src/backend/catalog/index.c | 11 +-
src/backend/catalog/storage.c | 8 +
src/backend/commands/tablecmds.c | 12 +-
src/backend/commands/tablespace.c | 2 +-
src/backend/nodes/gen_node_support.pl | 4 +-
src/backend/replication/basebackup.c | 2 +-
src/backend/replication/logical/decode.c | 1 +
src/backend/replication/logical/reorderbuffer.c | 2 +-
src/backend/storage/file/reinit.c | 28 ++--
src/backend/storage/freespace/fsmpage.c | 2 +-
src/backend/storage/lmgr/lwlocknames.txt | 1 +
src/backend/storage/smgr/md.c | 7 +
src/backend/storage/smgr/smgr.c | 2 +-
src/backend/utils/adt/dbsize.c | 7 +-
src/backend/utils/adt/pg_upgrade_support.c | 13 +-
src/backend/utils/cache/relcache.c | 2 +-
src/backend/utils/cache/relfilenumbermap.c | 65 +++-----
src/backend/utils/misc/pg_controldata.c | 9 +-
src/bin/pg_checksums/pg_checksums.c | 4 +-
src/bin/pg_controldata/pg_controldata.c | 2 +
src/bin/pg_dump/pg_dump.c | 28 ++--
src/bin/pg_rewind/filemap.c | 6 +-
src/bin/pg_upgrade/info.c | 5 +-
src/bin/pg_upgrade/pg_upgrade.c | 6 +-
src/bin/pg_upgrade/relfilenumber.c | 4 +-
src/bin/pg_waldump/pg_waldump.c | 2 +-
src/bin/scripts/t/090_reindexdb.pl | 2 +-
src/common/relpath.c | 20 +--
src/fe_utils/option_utils.c | 39 +++++
src/include/access/transam.h | 35 ++++
src/include/access/xlog.h | 2 +
src/include/catalog/catalog.h | 3 -
src/include/catalog/pg_class.h | 16 +-
src/include/catalog/pg_control.h | 2 +
src/include/catalog/pg_proc.dat | 10 +-
src/include/common/relpath.h | 1 +
src/include/fe_utils/option_utils.h | 2 +
src/include/postgres_ext.h | 10 +-
src/include/storage/buf_internals.h | 62 ++++++-
src/include/storage/relfilelocator.h | 12 +-
src/test/regress/expected/alter_table.out | 24 ++-
src/test/regress/expected/fast_default.out | 4 +-
src/test/regress/expected/oidjoins.out | 2 +-
src/test/regress/sql/alter_table.sql | 8 +-
src/test/regress/sql/fast_default.sql | 4 +-
71 files changed, 687 insertions(+), 334 deletions(-)
create mode 100644 contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile
index d74b3e8..4d88eba 100644
--- a/contrib/pg_buffercache/Makefile
+++ b/contrib/pg_buffercache/Makefile
@@ -6,8 +6,8 @@ OBJS = \
pg_buffercache_pages.o
EXTENSION = pg_buffercache
-DATA = pg_buffercache--1.2.sql pg_buffercache--1.2--1.3.sql \
- pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql
+DATA = pg_buffercache--1.0--1.1.sql pg_buffercache--1.1--1.2.sql pg_buffercache--1.2.sql \
+ pg_buffercache--1.2--1.3.sql pg_buffercache--1.3--1.4.sql
PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time"
REGRESS = pg_buffercache
diff --git a/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
new file mode 100644
index 0000000..50956b1
--- /dev/null
+++ b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
@@ -0,0 +1,30 @@
+/* contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.4'" to load this file. \quit
+
+/* First we have to remove them from the extension */
+ALTER EXTENSION pg_buffercache DROP VIEW pg_buffercache;
+ALTER EXTENSION pg_buffercache DROP FUNCTION pg_buffercache_pages();
+
+/* Then we can drop them */
+DROP VIEW pg_buffercache;
+DROP FUNCTION pg_buffercache_pages();
+
+/* Now redefine */
+CREATE FUNCTION pg_buffercache_pages()
+RETURNS SETOF RECORD
+AS 'MODULE_PATHNAME', 'pg_buffercache_pages_v1_4'
+LANGUAGE C PARALLEL SAFE;
+
+CREATE VIEW pg_buffercache AS
+ SELECT P.* FROM pg_buffercache_pages() AS P
+ (bufferid integer, relfilenode int8, reltablespace oid, reldatabase oid,
+ relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
+ pinning_backends int4);
+
+-- Don't want these to be available to public.
+REVOKE ALL ON FUNCTION pg_buffercache_pages() FROM PUBLIC;
+REVOKE ALL ON pg_buffercache FROM PUBLIC;
+GRANT EXECUTE ON FUNCTION pg_buffercache_pages() TO pg_monitor;
+GRANT SELECT ON pg_buffercache TO pg_monitor;
diff --git a/contrib/pg_buffercache/pg_buffercache.control b/contrib/pg_buffercache/pg_buffercache.control
index 8c060ae..a82ae5f 100644
--- a/contrib/pg_buffercache/pg_buffercache.control
+++ b/contrib/pg_buffercache/pg_buffercache.control
@@ -1,5 +1,5 @@
# pg_buffercache extension
comment = 'examine the shared buffer cache'
-default_version = '1.3'
+default_version = '1.4'
module_pathname = '$libdir/pg_buffercache'
relocatable = true
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index c5754ea..912cbd8 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -59,9 +59,10 @@ typedef struct
* relation node/tablespace/database/blocknum and dirty indicator.
*/
PG_FUNCTION_INFO_V1(pg_buffercache_pages);
+PG_FUNCTION_INFO_V1(pg_buffercache_pages_v1_4);
-Datum
-pg_buffercache_pages(PG_FUNCTION_ARGS)
+static Datum
+pg_buffercache_pages_internal(PG_FUNCTION_ARGS, Oid rfn_typid)
{
FuncCallContext *funcctx;
Datum result;
@@ -103,7 +104,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
TupleDescInitEntry(tupledesc, (AttrNumber) 1, "bufferid",
INT4OID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 2, "relfilenode",
- OIDOID, -1, 0);
+ rfn_typid, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 3, "reltablespace",
OIDOID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 4, "reldatabase",
@@ -209,7 +210,24 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
}
else
{
- values[1] = ObjectIdGetDatum(fctx->record[i].relfilenumber);
+ if (rfn_typid == INT8OID)
+ values[1] =
+ Int64GetDatum((int64) fctx->record[i].relfilenumber);
+ else
+ {
+ Assert(rfn_typid == OIDOID);
+
+ if (fctx->record[i].relfilenumber > OID_MAX)
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("relfilenode %lld is too large to be represented as an OID",
+ (long long) fctx->record[i].relfilenumber),
+ errhint("Upgrade the extension using ALTER EXTENSION pg_buffercache UPDATE"));
+
+ values[1] =
+ ObjectIdGetDatum((Oid) fctx->record[i].relfilenumber);
+ }
+
nulls[1] = false;
values[2] = ObjectIdGetDatum(fctx->record[i].reltablespace);
nulls[2] = false;
@@ -237,3 +255,16 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
else
SRF_RETURN_DONE(funcctx);
}
+
+/* entry point for old extension version */
+Datum
+pg_buffercache_pages(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, OIDOID);
+}
+
+Datum
+pg_buffercache_pages_v1_4(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, INT8OID);
+}
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index c8d673a..6bb6da6 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -345,7 +345,7 @@ apw_load_buffers(void)
{
unsigned forknum;
- if (fscanf(file, "%u,%u,%u,%u,%u\n", &blkinfo[i].database,
+ if (fscanf(file, "%u,%u," INT64_FORMAT ",%u,%u\n", &blkinfo[i].database,
&blkinfo[i].tablespace, &blkinfo[i].filenumber,
&forknum, &blkinfo[i].blocknum) != 5)
ereport(ERROR,
@@ -669,7 +669,7 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
{
CHECK_FOR_INTERRUPTS();
- ret = fprintf(file, "%u,%u,%u,%u,%u\n",
+ ret = fprintf(file, "%u,%u," INT64_FORMAT ",%u,%u\n",
block_info_array[i].database,
block_info_array[i].tablespace,
block_info_array[i].filenumber,
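The autoprewarm dump format keeps the same five comma-separated fields, only with a wider third column. A quick round-trip model in Python (the helper names are made up for illustration; the CSV layout is the one in the patch):

```python
def dump_line(database, tablespace, filenumber, forknum, blocknum):
    """Model of the fprintf side: five unsigned fields, comma-separated."""
    return f"{database},{tablespace},{filenumber},{forknum},{blocknum}"

def parse_line(line):
    """Model of the fscanf side: parse the same five fields back."""
    fields = tuple(int(x) for x in line.split(","))
    assert len(fields) == 5
    return fields

# A filenumber beyond the old 32-bit Oid range survives the round trip.
big = 2**40
assert parse_line(dump_line(5, 1663, big, 0, 42)) == (5, 1663, big, 0, 42)
```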
diff --git a/contrib/pg_walinspect/expected/pg_walinspect.out b/contrib/pg_walinspect/expected/pg_walinspect.out
index a1ee743..e9b06ed 100644
--- a/contrib/pg_walinspect/expected/pg_walinspect.out
+++ b/contrib/pg_walinspect/expected/pg_walinspect.out
@@ -54,9 +54,9 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- ===================================================================
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
ok
----
t
diff --git a/contrib/pg_walinspect/sql/pg_walinspect.sql b/contrib/pg_walinspect/sql/pg_walinspect.sql
index 1b265ea..5393834 100644
--- a/contrib/pg_walinspect/sql/pg_walinspect.sql
+++ b/contrib/pg_walinspect/sql/pg_walinspect.sql
@@ -39,10 +39,10 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
-- ===================================================================
-- Test for filtering out WAL records based on resource_manager and
diff --git a/contrib/test_decoding/expected/rewrite.out b/contrib/test_decoding/expected/rewrite.out
index b30999c..8f1aa48 100644
--- a/contrib/test_decoding/expected/rewrite.out
+++ b/contrib/test_decoding/expected/rewrite.out
@@ -106,7 +106,7 @@ VACUUM FULL pg_class;
-- reindexing of important relations / indexes
REINDEX TABLE pg_class;
REINDEX INDEX pg_class_oid_index;
-REINDEX INDEX pg_class_tblspc_relfilenode_index;
+REINDEX INDEX pg_class_relfilenode_index;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (5, 3);
BEGIN;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (6, 4);
diff --git a/contrib/test_decoding/sql/rewrite.sql b/contrib/test_decoding/sql/rewrite.sql
index 62dead3..8983704 100644
--- a/contrib/test_decoding/sql/rewrite.sql
+++ b/contrib/test_decoding/sql/rewrite.sql
@@ -77,7 +77,7 @@ VACUUM FULL pg_class;
-- reindexing of important relations / indexes
REINDEX TABLE pg_class;
REINDEX INDEX pg_class_oid_index;
-REINDEX INDEX pg_class_tblspc_relfilenode_index;
+REINDEX INDEX pg_class_relfilenode_index;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (5, 3);
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index cd2cc37..360793a 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1965,7 +1965,7 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
</para>
<para>
Name of the on-disk file of this relation; zero means this
diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
index a06fd3e..e222265 100644
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@@ -62,7 +62,7 @@
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
(references <link linkend="catalog-pg-class"><structname>pg_class</structname></link>.<structfield>relfilenode</structfield>)
</para>
<para>
diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml
index f4b9f66..55a80fa 100644
--- a/doc/src/sgml/storage.sgml
+++ b/doc/src/sgml/storage.sgml
@@ -217,11 +217,10 @@ with the suffix <literal>_init</literal> (see <xref linkend="storage-init"/>).
<caution>
<para>
-Note that while a table's filenode often matches its OID, this is
-<emphasis>not</emphasis> necessarily the case; some operations, like
+Note that a table's filenode is allocated independently of its OID. Although
+for system catalogs the initial filenode matches the OID, some operations, like
<command>TRUNCATE</command>, <command>REINDEX</command>, <command>CLUSTER</command> and some forms
of <command>ALTER TABLE</command>, can change the filenode while preserving the OID.
-Avoid assuming that filenode and table OID are the same.
Also, for certain system catalogs including <structname>pg_class</structname> itself,
<structname>pg_class</structname>.<structfield>relfilenode</structfield> contains zero. The
actual filenode number of these catalogs is stored in a lower-level data
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 41b9211..b75ad79 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -100,7 +100,7 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
BlockNumber blknum;
BufferGetTag(buffer, &locator, &forknum, &blknum);
- elog(ERROR, "failed to add item to index page in %u/%u/%u",
+ elog(ERROR, "failed to add item to index page in %u/%u/" INT64_FORMAT,
locator.spcOid, locator.dbOid, locator.relNumber);
}
}
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index 7dd3c1d..c699937 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -26,7 +26,7 @@ out_gistxlogPageUpdate(StringInfo buf, gistxlogPageUpdate *xlrec)
static void
out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec)
{
- appendStringInfo(buf, "rel %u/%u/%u; blk %u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; blk %u; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber, xlrec->block,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 923d3bc..f6d278b 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -169,7 +169,7 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
{
xl_heap_new_cid *xlrec = (xl_heap_new_cid *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; tid %u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; tid %u/%u",
xlrec->target_locator.spcOid,
xlrec->target_locator.dbOid,
xlrec->target_locator.relNumber,
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index 4843cd5..70feb2d 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -100,7 +100,7 @@ btree_desc(StringInfo buf, XLogReaderState *record)
{
xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/seqdesc.c b/src/backend/access/rmgrdesc/seqdesc.c
index b3845f9..45c6ee7 100644
--- a/src/backend/access/rmgrdesc/seqdesc.c
+++ b/src/backend/access/rmgrdesc/seqdesc.c
@@ -25,7 +25,7 @@ seq_desc(StringInfo buf, XLogReaderState *record)
xl_seq_rec *xlrec = (xl_seq_rec *) rec;
if (info == XLOG_SEQ_LOG)
- appendStringInfo(buf, "rel %u/%u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT,
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber);
}
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 6fec485..a723790 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -45,8 +45,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
CheckPoint *checkpoint = (CheckPoint *) rec;
appendStringInfo(buf, "redo %X/%X; "
- "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; "
- "oldest xid %u in DB %u; oldest multi %u in DB %u; "
+ "tli %u; prev tli %u; fpw %s; xid %u:%u; relfilenumber " INT64_FORMAT "; oid %u; "
+ "multi %u; offset %u; oldest xid %u in DB %u; oldest multi %u in DB %u; "
"oldest/newest commit timestamp xid: %u/%u; "
"oldest running xid %u; %s",
LSN_FORMAT_ARGS(checkpoint->redo),
@@ -55,6 +55,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->fullPageWrites ? "true" : "false",
EpochFromFullTransactionId(checkpoint->nextXid),
XidFromFullTransactionId(checkpoint->nextXid),
+ checkpoint->nextRelFileNumber,
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
@@ -74,6 +75,13 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
memcpy(&nextOid, rec, sizeof(Oid));
appendStringInfo(buf, "%u", nextOid);
}
+ else if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, rec, sizeof(RelFileNumber));
+ appendStringInfo(buf, INT64_FORMAT, nextRelFileNumber);
+ }
else if (info == XLOG_RESTORE_POINT)
{
xl_restore_point *xlrec = (xl_restore_point *) rec;
@@ -169,6 +177,9 @@ xlog_identify(uint8 info)
case XLOG_NEXTOID:
id = "NEXTOID";
break;
+ case XLOG_NEXT_RELFILENUMBER:
+ id = "NEXT_RELFILENUMBER";
+ break;
case XLOG_SWITCH:
id = "SWITCH";
break;
@@ -237,7 +248,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfoChar(buf, ' ');
appendStringInfo(buf,
- "blkref #%d: rel %u/%u/%u fork %s blk %u",
+ "blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -297,7 +308,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
if (forknum != MAIN_FORKNUM)
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u fork %s blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -306,7 +317,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
else
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
blk);
diff --git a/src/backend/access/transam/README b/src/backend/access/transam/README
index 734c39a..8911671 100644
--- a/src/backend/access/transam/README
+++ b/src/backend/access/transam/README
@@ -692,8 +692,9 @@ by having database restart search for files that don't have any committed
entry in pg_class, but that currently isn't done because of the possibility
of deleting data that is useful for forensic analysis of the crash.
Orphan files are harmless --- at worst they waste a bit of disk space ---
-because we check for on-disk collisions when allocating new relfilenumber
-OIDs. So cleaning up isn't really necessary.
+because the relfilenumber counter is monotonically increasing. The maximum
+value is 2^56-1, and there is no provision for wraparound. Thus, on-disk
+collisions aren't possible.
3. Deleting a table, which requires an unlink() that could fail.
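For reviewers wanting to sanity-check the README's no-wraparound claim, some quick arithmetic under a deliberately generous assumed allocation rate:

```python
MAX_RELFILENUMBER = 2**56 - 1

# Assume an implausibly hot cluster that allocates one million new
# relfilenumbers per second, around the clock.
allocations_per_second = 1_000_000
seconds_per_year = 365 * 24 * 3600
years_to_exhaust = MAX_RELFILENUMBER / (allocations_per_second * seconds_per_year)

# Even then the 56-bit space lasts well over two thousand years, so
# omitting a wraparound mechanism is reasonable.
assert years_to_exhaust > 2000
```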
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 849a7ce..a2f0d35 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -13,12 +13,16 @@
#include "postgres.h"
+#include <unistd.h>
+
#include "access/clog.h"
#include "access/commit_ts.h"
#include "access/subtrans.h"
#include "access/transam.h"
#include "access/xact.h"
#include "access/xlogutils.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_tablespace.h"
#include "commands/dbcommands.h"
#include "miscadmin.h"
#include "postmaster/autovacuum.h"
@@ -30,6 +34,15 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+/* Number of RelFileNumbers to be logged per XLOG write */
+#define VAR_RELNUMBER_PER_XLOG 512
+
+/*
+ * Log more RelFileNumbers once the remaining logged ones fall below this
+ * threshold. The valid range is 0 to VAR_RELNUMBER_PER_XLOG - 1.
+ */
+#define VAR_RELNUMBER_NEW_XLOG_THRESHOLD 256
+
/* pointer to "variable cache" in shared memory (set up by shmem.c) */
VariableCache ShmemVariableCache = NULL;
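The interplay of the two constants above can be modelled in a few lines of Python (a toy model, not server code; Counter and its fields are illustrative stand-ins for the shmem variables):

```python
VAR_RELNUMBER_PER_XLOG = 512
VAR_RELNUMBER_NEW_XLOG_THRESHOLD = 256

class Counter:
    """Toy model of the nextRelFileNumber/loggedRelFileNumber pair.
    WAL writes are counted instead of performed."""
    def __init__(self, start):
        self.next = start
        self.logged = start
        self.wal_records = 0

    def allocate(self):
        # Log a new range before the in-WAL headroom runs out,
        # mirroring the check in GetNewRelFileNumber().
        if self.logged - self.next <= VAR_RELNUMBER_NEW_XLOG_THRESHOLD:
            self.logged = self.next + VAR_RELNUMBER_PER_XLOG
            self.wal_records += 1
        result = self.next
        self.next += 1
        return result

c = Counter(10000)
for _ in range(5000):
    n = c.allocate()
    # Invariant: the logged value always stays ahead of anything handed out.
    assert n < c.logged

# One WAL record covers VAR_RELNUMBER_PER_XLOG - threshold = 256
# allocations after the first, so 5000 allocations need only 20 records.
assert c.wal_records == 20
```

The point of the threshold is visible here: the record covering a range is always written well before the range is consumed, so it can usually be flushed lazily.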
@@ -521,8 +534,7 @@ ForceTransactionIdLimitUpdate(void)
* wide, counter wraparound will occur eventually, and therefore it is unwise
* to assume they are unique unless precautions are taken to make them so.
* Hence, this routine should generally not be used directly. The only direct
- * callers should be GetNewOidWithIndex() and GetNewRelFileNumber() in
- * catalog/catalog.c.
+ * caller should be GetNewOidWithIndex() in catalog/catalog.c.
*/
Oid
GetNewObjectId(void)
@@ -613,6 +625,173 @@ SetNextObjectId(Oid nextOid)
}
/*
+ * GetNewRelFileNumber
+ *
+ * Similar to GetNewObjectId, but generates a new relfilenumber instead
+ * of a new Oid.
+ */
+RelFileNumber
+GetNewRelFileNumber(Oid reltablespace, char relpersistence)
+{
+ RelFileNumber result;
+
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot assign RelFileNumber during recovery");
+
+ if (IsBinaryUpgrade)
+ elog(ERROR, "cannot assign RelFileNumber during binary upgrade");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /* check for the wraparound for the relfilenumber counter */
+ if (unlikely(ShmemVariableCache->nextRelFileNumber > MAX_RELFILENUMBER))
+ elog(ERROR, "relfilenumber is out of bounds");
+
+ /*
+ * If fewer than VAR_RELNUMBER_NEW_XLOG_THRESHOLD logged values remain,
+ * log a new range. Ideally, we could wait until all logged
+ * relfilenumbers have been consumed before logging more, but then we
+ * would have to flush the WAL record immediately, because the
+ * nextRelFileNumber must always be larger than any relfilenumber
+ * already in use on disk, and to maintain that invariant the record
+ * must reach disk before any file from the newly logged range is
+ * created. By logging ahead of consumption instead, we only need to
+ * flush the previously logged record before consuming from the new
+ * range, and by then it has usually already been flushed by some
+ * other XLogFlush operation. VAR_RELNUMBER_PER_XLOG is large enough
+ * that the extra flush should rarely slow things down, but it is
+ * still better to avoid it when we can.
+ */
+ if (ShmemVariableCache->loggedRelFileNumber -
+ ShmemVariableCache->nextRelFileNumber <=
+ VAR_RELNUMBER_NEW_XLOG_THRESHOLD)
+ {
+ RelFileNumber newlogrelnum;
+
+ /*
+ * The first time through, this immediately flushes the newly logged
+ * WAL record, since nothing has been logged in advance. From then
+ * on, we remember the previously logged record pointer and flush up
+ * to that point instead.
+ *
+ * XXX The second call may flush what the first call already flushed,
+ * but that is logically a no-op, so it is not worth adding extra
+ * complexity to avoid it.
+ */
+ newlogrelnum = ShmemVariableCache->nextRelFileNumber +
+ VAR_RELNUMBER_PER_XLOG;
+ LogNextRelFileNumber(newlogrelnum,
+ &ShmemVariableCache->loggedRelFileNumberRecPtr);
+
+ ShmemVariableCache->loggedRelFileNumber = newlogrelnum;
+ }
+
+ result = ShmemVariableCache->nextRelFileNumber;
+ (ShmemVariableCache->nextRelFileNumber)++;
+
+ LWLockRelease(RelFileNumberGenLock);
+
+#ifdef USE_ASSERT_CHECKING
+
+ {
+ RelFileLocatorBackend rlocator;
+ char *rpath;
+ BackendId backend;
+
+ switch (relpersistence)
+ {
+ case RELPERSISTENCE_TEMP:
+ backend = BackendIdForTempRelations();
+ break;
+ case RELPERSISTENCE_UNLOGGED:
+ case RELPERSISTENCE_PERMANENT:
+ backend = InvalidBackendId;
+ break;
+ default:
+ elog(ERROR, "invalid relpersistence: %c", relpersistence);
+ return InvalidRelFileNumber; /* placate compiler */
+ }
+
+ /* this logic should match RelationInitPhysicalAddr */
+ rlocator.locator.spcOid =
+ reltablespace ? reltablespace : MyDatabaseTableSpace;
+ rlocator.locator.dbOid = (reltablespace == GLOBALTABLESPACE_OID) ?
+ InvalidOid : MyDatabaseId;
+ rlocator.locator.relNumber = result;
+
+ /*
+ * The relpath will vary based on the backend ID, so we must
+ * initialize that properly here to make sure that any collisions
+ * based on filename are properly detected.
+ */
+ rlocator.backend = backend;
+
+ /* Check for existing file of same name. */
+ rpath = relpath(rlocator, MAIN_FORKNUM);
+ Assert(access(rpath, F_OK) != 0);
+ }
+#endif
+
+ return result;
+}
+
+/*
+ * SetNextRelFileNumber
+ *
+ * This may only be called during pg_upgrade; it advances the RelFileNumber
+ * counter to the specified value if the current value is smaller than the
+ * input value.
+ */
+void
+SetNextRelFileNumber(RelFileNumber relnumber)
+{
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot forward RelFileNumber during recovery");
+
+ if (!IsBinaryUpgrade)
+ elog(ERROR, "RelFileNumber can be set only during binary upgrade");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /*
+ * If the previously assigned nextRelFileNumber is already higher than
+ * the requested value, there is nothing to do. This can happen
+ * because during upgrade the objects are not created in
+ * relfilenumber order.
+ */
+ if (relnumber <= ShmemVariableCache->nextRelFileNumber)
+ {
+ LWLockRelease(RelFileNumberGenLock);
+ return;
+ }
+
+ /*
+ * If the new relfilenumber to be set is greater than or equal to already
+ * logged relfilenumber then log again.
+ *
+ * XXX The new 'relnumber' can come from any range, so we cannot
+ * arrange to piggyback the XLogFlush by logging in advance. That
+ * should not really matter, as this is only called during binary upgrade.
+ */
+ if (relnumber >= ShmemVariableCache->loggedRelFileNumber)
+ {
+ RelFileNumber newlogrelnum;
+
+ newlogrelnum = relnumber + VAR_RELNUMBER_PER_XLOG;
+ LogNextRelFileNumber(newlogrelnum, NULL);
+ ShmemVariableCache->loggedRelFileNumber = newlogrelnum;
+ }
+
+ ShmemVariableCache->nextRelFileNumber = relnumber;
+
+ LWLockRelease(RelFileNumberGenLock);
+}
+
+/*
* StopGeneratingPinnedObjectIds
*
* This is called once during initdb to force the OID counter up to
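SetNextRelFileNumber()'s forward-only semantics can likewise be sketched as a toy model (illustrative names; it assumes the same VAR_RELNUMBER_PER_XLOG logging rule as GetNewRelFileNumber):

```python
VAR_RELNUMBER_PER_XLOG = 512

class UpgradeCounter:
    """Toy model of SetNextRelFileNumber(): the counter only ever moves
    forward, and a fresh range is logged whenever the target passes the
    value already covered by WAL."""
    def __init__(self, start):
        self.next = start
        self.logged = start

    def set_next(self, relnumber):
        if relnumber <= self.next:
            return  # objects arrive out of relfilenumber order; ignore
        if relnumber >= self.logged:
            self.logged = relnumber + VAR_RELNUMBER_PER_XLOG
        self.next = relnumber

c = UpgradeCounter(100000)
c.set_next(250000)
assert c.next == 250000 and c.logged == 250512
# A lower value must not move the counter backwards.
c.set_next(200000)
assert c.next == 250000
```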
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 15ab8d9..5c7b664 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4543,6 +4543,7 @@ BootStrapXLOG(void)
checkPoint.nextXid =
FullTransactionIdFromEpochAndXid(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstGenbkiObjectId;
+ checkPoint.nextRelFileNumber = FirstNormalRelFileNumber;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
checkPoint.oldestXid = FirstNormalTransactionId;
@@ -4556,7 +4557,10 @@ BootStrapXLOG(void)
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumberRecPtr = InvalidXLogRecPtr;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -5023,7 +5027,9 @@ StartupXLOG(void)
/* initialize shared memory variables from the checkpoint record */
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -6483,6 +6489,14 @@ CreateCheckPoint(int flags)
checkPoint.nextOid += ShmemVariableCache->oidCount;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_SHARED);
+ if (shutdown)
+ checkPoint.nextRelFileNumber = ShmemVariableCache->nextRelFileNumber;
+ else
+ checkPoint.nextRelFileNumber = ShmemVariableCache->loggedRelFileNumber;
+
+ LWLockRelease(RelFileNumberGenLock);
+
MultiXactGetCheckptMulti(shutdown,
&checkPoint.nextMulti,
&checkPoint.nextMultiOffset,
@@ -7361,6 +7375,35 @@ XLogPutNextOid(Oid nextOid)
}
/*
+ * Similar to XLogPutNextOid, but writes a NEXT_RELFILENUMBER log record
+ * instead of a NEXTOID one. If '*prevrecptr' is a valid XLogRecPtr,
+ * flush the WAL up to that record pointer; otherwise flush up to the
+ * record just logged. Also store the newly logged record pointer in
+ * '*prevrecptr' if prevrecptr is not NULL.
+ */
+void
+LogNextRelFileNumber(RelFileNumber nextrelnumber, XLogRecPtr *prevrecptr)
+{
+ XLogRecPtr recptr;
+
+ XLogBeginInsert();
+ XLogRegisterData((char *) (&nextrelnumber), sizeof(RelFileNumber));
+ recptr = XLogInsert(RM_XLOG_ID, XLOG_NEXT_RELFILENUMBER);
+
+ /*
+ * If a valid prevrecptr is passed then flush that xlog record to disk
+ * otherwise flush the newly logged record.
+ */
+ if ((prevrecptr != NULL) && !XLogRecPtrIsInvalid(*prevrecptr))
+ XLogFlush(*prevrecptr);
+ else
+ XLogFlush(recptr);
+
+ if (prevrecptr != NULL)
+ *prevrecptr = recptr;
+}
+
+/*
* Write an XLOG SWITCH record.
*
* Here we just blindly issue an XLogInsert request for the record.
@@ -7575,6 +7618,16 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
}
+ else if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, XLogRecGetData(record), sizeof(RelFileNumber));
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumber = nextRelFileNumber;
+ LWLockRelease(RelFileNumberGenLock);
+ }
else if (info == XLOG_CHECKPOINT_SHUTDOWN)
{
CheckPoint checkPoint;
@@ -7589,6 +7642,10 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
+ LWLockRelease(RelFileNumberGenLock);
MultiXactSetNextMXact(checkPoint.nextMulti,
checkPoint.nextMultiOffset);
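The redo side restores both counters from the logged value, which by construction is higher than any relfilenumber handed out before the crash. A small model of that invariant (hypothetical helper names, not backend code):

```python
def redo_next_relfilenumber(shmem, logged_value):
    """Model of the XLOG_NEXT_RELFILENUMBER redo branch: after replay the
    counter restarts at the logged value, which is always beyond any
    relfilenumber consumed before the crash."""
    shmem["next"] = logged_value
    shmem["logged"] = logged_value

# Before the crash, numbers up to 10499 may have been consumed out of
# the range [10000, 10512) that was logged ahead of time.
logged_before_crash = 10512
used_before_crash = range(10000, 10500)

shmem = {}
redo_next_relfilenumber(shmem, logged_before_crash)

# No allocation after recovery can collide with a pre-crash file, which
# is what lets orphan files stay harmless without collision probing.
assert all(shmem["next"] > n for n in used_before_crash)
```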
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index 87d1421..d27c7fc 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -611,7 +611,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u until %X/%X is replayed, which creates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, which creates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -634,7 +634,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, which truncates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, which truncates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -733,7 +733,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing all prefetch in relation %u/%u/%u until %X/%X is replayed, because the relation does not exist on disk",
+ "suppressing all prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, because the relation does not exist on disk",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -754,7 +754,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, because the relation is too small",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, because the relation is too small",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -793,7 +793,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* truncated beneath our feet?
*/
elog(ERROR,
- "could not prefetch relation %u/%u/%u block %u",
+ "could not prefetch relation %u/%u/" INT64_FORMAT " block %u",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -932,7 +932,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed),
filter->filter_from_block);
@@ -948,7 +948,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed));
#endif
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 27e02fb..d0970a0 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2225,14 +2225,14 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
continue;
if (forknum != MAIN_FORKNUM)
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, fork %u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", fork %u, blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
forknum,
blk);
else
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
@@ -2428,7 +2428,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
if (memcmp(replay_image_masked, primary_image_masked, BLCKSZ) != 0)
{
elog(FATAL,
- "inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
+ "inconsistent page found, rel %u/%u/" INT64_FORMAT ", forknum %u, blkno %u",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forknum, blkno);
}
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 0cda225..b4398e1 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -617,17 +617,17 @@ CreateFakeRelcacheEntry(RelFileLocator rlocator)
rel->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
/* We don't know the name of the relation; use relfilenumber instead */
- sprintf(RelationGetRelationName(rel), "%u", rlocator.relNumber);
+ sprintf(RelationGetRelationName(rel), INT64_FORMAT, rlocator.relNumber);
/*
* We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNumber may be
- * different from the relation's OID. It shouldn't really matter though.
- * In recovery, we are running by ourselves and can't have any lock
- * conflicts. While syncing, we already hold AccessExclusiveLock.
+ * relation. Note that we set relId to FirstNormalObjectId, which is
+ * completely bogus. It shouldn't really matter though. In recovery,
+ * we are running by ourselves and can't have any lock conflicts. While
+ * syncing, we already hold AccessExclusiveLock.
*/
rel->rd_lockInfo.lockRelId.dbId = rlocator.dbOid;
- rel->rd_lockInfo.lockRelId.relId = rlocator.relNumber;
+ rel->rd_lockInfo.lockRelId.relId = FirstNormalObjectId;
rel->rd_smgr = NULL;
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 6f43870..155400c 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -481,101 +481,6 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
}
/*
- * GetNewRelFileNumber
- * Generate a new relfilenumber that is unique within the
- * database of the given tablespace.
- *
- * If the relfilenumber will also be used as the relation's OID, pass the
- * opened pg_class catalog, and this routine will guarantee that the result
- * is also an unused OID within pg_class. If the result is to be used only
- * as a relfilenumber for an existing relation, pass NULL for pg_class.
- *
- * As with GetNewOidWithIndex(), there is some theoretical risk of a race
- * condition, but it doesn't seem worth worrying about.
- *
- * Note: we don't support using this in bootstrap mode. All relations
- * created by bootstrap have preassigned OIDs, so there's no need.
- */
-RelFileNumber
-GetNewRelFileNumber(Oid reltablespace, Relation pg_class, char relpersistence)
-{
- RelFileLocatorBackend rlocator;
- char *rpath;
- bool collides;
- BackendId backend;
-
- /*
- * If we ever get here during pg_upgrade, there's something wrong; all
- * relfilenumber assignments during a binary-upgrade run should be
- * determined by commands in the dump script.
- */
- Assert(!IsBinaryUpgrade);
-
- switch (relpersistence)
- {
- case RELPERSISTENCE_TEMP:
- backend = BackendIdForTempRelations();
- break;
- case RELPERSISTENCE_UNLOGGED:
- case RELPERSISTENCE_PERMANENT:
- backend = InvalidBackendId;
- break;
- default:
- elog(ERROR, "invalid relpersistence: %c", relpersistence);
- return InvalidRelFileNumber; /* placate compiler */
- }
-
- /* This logic should match RelationInitPhysicalAddr */
- rlocator.locator.spcOid = reltablespace ? reltablespace : MyDatabaseTableSpace;
- rlocator.locator.dbOid =
- (rlocator.locator.spcOid == GLOBALTABLESPACE_OID) ?
- InvalidOid : MyDatabaseId;
-
- /*
- * The relpath will vary based on the backend ID, so we must initialize
- * that properly here to make sure that any collisions based on filename
- * are properly detected.
- */
- rlocator.backend = backend;
-
- do
- {
- CHECK_FOR_INTERRUPTS();
-
- /* Generate the OID */
- if (pg_class)
- rlocator.locator.relNumber = GetNewOidWithIndex(pg_class, ClassOidIndexId,
- Anum_pg_class_oid);
- else
- rlocator.locator.relNumber = GetNewObjectId();
-
- /* Check for existing file of same name */
- rpath = relpath(rlocator, MAIN_FORKNUM);
-
- if (access(rpath, F_OK) == 0)
- {
- /* definite collision */
- collides = true;
- }
- else
- {
- /*
- * Here we have a little bit of a dilemma: if errno is something
- * other than ENOENT, should we declare a collision and loop? In
- * practice it seems best to go ahead regardless of the errno. If
- * there is a colliding file we will get an smgr failure when we
- * attempt to create the new relation file.
- */
- collides = false;
- }
-
- pfree(rpath);
- } while (collides);
-
- return rlocator.locator.relNumber;
-}
-
-/*
* SQL callable interface for GetNewOidWithIndex(). Outside of initdb's
* direct insertions into catalog tables, and recovering from corruption, this
* should rarely be needed.
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 9b03579..c17f60f 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -342,10 +342,18 @@ heap_create(const char *relname,
{
/*
* If relfilenumber is unspecified by the caller then create storage
- * with oid same as relid.
+ * with a relfilenumber equal to the relid if it is a system table;
+ * otherwise allocate a new relfilenumber. For more details, read the
+ * comments atop the FirstNormalRelFileNumber declaration.
*/
if (!RelFileNumberIsValid(relfilenumber))
- relfilenumber = relid;
+ {
+ if (relid < FirstNormalObjectId)
+ relfilenumber = relid;
+ else
+ relfilenumber = GetNewRelFileNumber(reltablespace,
+ relpersistence);
+ }
}
/*
@@ -898,7 +906,7 @@ InsertPgClassTuple(Relation pg_class_desc,
values[Anum_pg_class_reloftype - 1] = ObjectIdGetDatum(rd_rel->reloftype);
values[Anum_pg_class_relowner - 1] = ObjectIdGetDatum(rd_rel->relowner);
values[Anum_pg_class_relam - 1] = ObjectIdGetDatum(rd_rel->relam);
- values[Anum_pg_class_relfilenode - 1] = ObjectIdGetDatum(rd_rel->relfilenode);
+ values[Anum_pg_class_relfilenode - 1] = Int64GetDatum(rd_rel->relfilenode);
values[Anum_pg_class_reltablespace - 1] = ObjectIdGetDatum(rd_rel->reltablespace);
values[Anum_pg_class_relpages - 1] = Int32GetDatum(rd_rel->relpages);
values[Anum_pg_class_reltuples - 1] = Float4GetDatum(rd_rel->reltuples);
@@ -1170,12 +1178,7 @@ heap_create_with_catalog(const char *relname,
if (shared_relation && reltablespace != GLOBALTABLESPACE_OID)
elog(ERROR, "shared relations must be placed in pg_global tablespace");
- /*
- * Allocate an OID for the relation, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the relation, unless we were told what to use. */
if (!OidIsValid(relid))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -1229,8 +1232,8 @@ heap_create_with_catalog(const char *relname,
}
if (!OidIsValid(relid))
- relid = GetNewRelFileNumber(reltablespace, pg_class_desc,
- relpersistence);
+ relid = GetNewOidWithIndex(pg_class_desc, ClassOidIndexId,
+ Anum_pg_class_oid);
}
/*
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index d7192f3..a509c3e 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -898,12 +898,7 @@ index_create(Relation heapRelation,
collationObjectId,
classObjectId);
- /*
- * Allocate an OID for the index, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the index, unless we were told what to use. */
if (!OidIsValid(indexRelationId))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -935,8 +930,8 @@ index_create(Relation heapRelation,
}
else
{
- indexRelationId =
- GetNewRelFileNumber(tableSpaceId, pg_class, relpersistence);
+ indexRelationId = GetNewOidWithIndex(pg_class, ClassOidIndexId,
+ Anum_pg_class_oid);
}
}
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index d708af1..080622b 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -968,6 +968,10 @@ smgr_redo(XLogReaderState *record)
xl_smgr_create *xlrec = (xl_smgr_create *) XLogRecGetData(record);
SMgrRelation reln;
+ if (xlrec->rlocator.relNumber > ShmemVariableCache->nextRelFileNumber)
+ elog(ERROR, "unexpected relnumber " INT64_FORMAT " that is bigger than nextRelFileNumber " INT64_FORMAT,
+ xlrec->rlocator.relNumber, ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
smgrcreate(reln, xlrec->forkNum, true);
}
@@ -981,6 +985,10 @@ smgr_redo(XLogReaderState *record)
int nforks = 0;
bool need_fsm_vacuum = false;
+ if (xlrec->rlocator.relNumber > ShmemVariableCache->nextRelFileNumber)
+ elog(ERROR, "unexpected relnumber " INT64_FORMAT " that is bigger than nextRelFileNumber " INT64_FORMAT,
+ xlrec->rlocator.relNumber, ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
/*
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index d22dd44..77457b2 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14340,10 +14340,14 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
}
/*
- * Relfilenumbers are not unique in databases across tablespaces, so we
- * need to allocate a new one in the new tablespace.
- */
- newrelfilenumber = GetNewRelFileNumber(newTableSpace, NULL,
+ * Generate a new relfilenumber. We cannot reuse the old relfilenumber
+ * because of the possibility that the relation will be moved back to the
+ * original tablespace before the next checkpoint. At that point, the
+ * first segment of the main fork won't have been unlinked yet, and an
+ * attempt to create new relation storage with that same relfilenumber
+ * will fail.
+ */
+ newrelfilenumber = GetNewRelFileNumber(newTableSpace,
rel->rd_rel->relpersistence);
/* Open old and new relation */
diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index 570ce3d..4bd66c3 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -268,7 +268,7 @@ CreateTableSpace(CreateTableSpaceStmt *stmt)
* parts.
*/
if (strlen(location) + 1 + strlen(TABLESPACE_VERSION_DIRECTORY) + 1 +
- OIDCHARS + 1 + OIDCHARS + 1 + FORKNAMECHARS + 1 + OIDCHARS > MAXPGPATH)
+ OIDCHARS + 1 + RELNUMBERCHARS + 1 + FORKNAMECHARS + 1 + OIDCHARS > MAXPGPATH)
ereport(ERROR,
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
errmsg("tablespace location \"%s\" is too long",
diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index 86cf1b3..dc8b9da 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -961,12 +961,12 @@ _read${n}(void)
print $off "\tWRITE_UINT_FIELD($f);\n";
print $rff "\tREAD_UINT_FIELD($f);\n" unless $no_read;
}
- elsif ($t eq 'uint64')
+ elsif ($t eq 'uint64' || $t eq 'RelFileNumber')
{
print $off "\tWRITE_UINT64_FIELD($f);\n";
print $rff "\tREAD_UINT64_FIELD($f);\n" unless $no_read;
}
- elsif ($t eq 'Oid' || $t eq 'RelFileNumber')
+ elsif ($t eq 'Oid')
{
print $off "\tWRITE_OID_FIELD($f);\n";
print $rff "\tREAD_OID_FIELD($f);\n" unless $no_read;
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 7f85071..f22d858 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -1230,7 +1230,7 @@ sendDir(bbsink *sink, const char *path, int basepathlen, bool sizeonly,
if (relForkNum != INIT_FORKNUM)
{
char initForkFile[MAXPGPATH];
- char relNumber[OIDCHARS + 1];
+ char relNumber[RELNUMBERCHARS + 1];
/*
* If any other type of fork, check if there is an init fork
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index c5c6a2b..7029604 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -154,6 +154,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
break;
case XLOG_NOOP:
case XLOG_NEXTOID:
+ case XLOG_NEXT_RELFILENUMBER:
case XLOG_SWITCH:
case XLOG_BACKUP_END:
case XLOG_PARAMETER_CHANGE:
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 88a37fd..99580a4 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -4869,7 +4869,7 @@ DisplayMapping(HTAB *tuplecid_data)
hash_seq_init(&hstat, tuplecid_data);
while ((ent = (ReorderBufferTupleCidEnt *) hash_seq_search(&hstat)) != NULL)
{
- elog(DEBUG3, "mapping: node: %u/%u/%u tid: %u/%u cmin: %u, cmax: %u",
+ elog(DEBUG3, "mapping: node: %u/%u/" INT64_FORMAT " tid: %u/%u cmin: %u, cmax: %u",
ent->key.rlocator.dbOid,
ent->key.rlocator.spcOid,
ent->key.rlocator.relNumber,
diff --git a/src/backend/storage/file/reinit.c b/src/backend/storage/file/reinit.c
index 647c458..c3faa68 100644
--- a/src/backend/storage/file/reinit.c
+++ b/src/backend/storage/file/reinit.c
@@ -31,7 +31,7 @@ static void ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname,
typedef struct
{
- Oid reloid; /* hash key */
+ RelFileNumber relnumber; /* hash key */
} unlogged_relation_entry;
/*
@@ -184,10 +184,10 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
* need to be reset. Otherwise, this cleanup operation would be
* O(n^2).
*/
- ctl.keysize = sizeof(Oid);
+ ctl.keysize = sizeof(RelFileNumber);
ctl.entrysize = sizeof(unlogged_relation_entry);
ctl.hcxt = CurrentMemoryContext;
- hash = hash_create("unlogged relation OIDs", 32, &ctl,
+ hash = hash_create("unlogged relation RelFileNumbers", 32, &ctl,
HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
/* Scan the directory. */
@@ -208,10 +208,10 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
continue;
/*
- * Put the OID portion of the name into the hash table, if it
- * isn't already.
+ * Put the RELFILENUMBER portion of the name into the hash table,
+ * if it isn't already.
*/
- ent.reloid = atooid(de->d_name);
+ ent.relnumber = atorelnumber(de->d_name);
(void) hash_search(hash, &ent, HASH_ENTER, NULL);
}
@@ -248,10 +248,10 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
continue;
/*
- * See whether the OID portion of the name shows up in the hash
- * table. If so, nuke it!
+ * See whether the RELFILENUMBER portion of the name shows up in
+ * the hash table. If so, nuke it!
*/
- ent.reloid = atooid(de->d_name);
+ ent.relnumber = atorelnumber(de->d_name);
if (hash_search(hash, &ent, HASH_FIND, NULL))
{
snprintf(rm_path, sizeof(rm_path), "%s/%s",
@@ -286,7 +286,7 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
{
ForkNumber forkNum;
int relnumchars;
- char relnumbuf[OIDCHARS + 1];
+ char relnumbuf[RELNUMBERCHARS + 1];
char srcpath[MAXPGPATH * 2];
char dstpath[MAXPGPATH];
@@ -329,7 +329,7 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
{
ForkNumber forkNum;
int relnumchars;
- char relnumbuf[OIDCHARS + 1];
+ char relnumbuf[RELNUMBERCHARS + 1];
char mainpath[MAXPGPATH];
/* Skip anything that doesn't look like a relation data file. */
@@ -372,8 +372,8 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
* for a non-temporary relation and false otherwise.
*
* NB: If this function returns true, the caller is entitled to assume that
- * *relnumchars has been set to a value no more than OIDCHARS, and thus
- * that a buffer of OIDCHARS+1 characters is sufficient to hold the
+ * *relnumchars has been set to a value no more than RELNUMBERCHARS, and thus
+ * that a buffer of RELNUMBERCHARS+1 characters is sufficient to hold the
* RelFileNumber portion of the filename. This is critical to protect against
* a possible buffer overrun.
*/
@@ -386,7 +386,7 @@ parse_filename_for_nontemp_relation(const char *name, int *relnumchars,
/* Look for a non-empty string of digits (that isn't too long). */
for (pos = 0; isdigit((unsigned char) name[pos]); ++pos)
;
- if (pos == 0 || pos > OIDCHARS)
+ if (pos == 0 || pos > RELNUMBERCHARS)
return false;
*relnumchars = pos;
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index af4dab7..172225b 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -273,7 +273,7 @@ restart:
BlockNumber blknum;
BufferGetTag(buf, &rlocator, &forknum, &blknum);
- elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
+ elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/" INT64_FORMAT,
blknum, rlocator.spcOid, rlocator.dbOid, rlocator.relNumber);
/* make sure we hold an exclusive lock */
diff --git a/src/backend/storage/lmgr/lwlocknames.txt b/src/backend/storage/lmgr/lwlocknames.txt
index 6c7cf6c..3c5d041 100644
--- a/src/backend/storage/lmgr/lwlocknames.txt
+++ b/src/backend/storage/lmgr/lwlocknames.txt
@@ -53,3 +53,4 @@ XactTruncationLock 44
# 45 was XactTruncationLock until removal of BackendRandomLock
WrapLimitsVacuumLock 46
NotifyQueueTailLock 47
+RelFileNumberGenLock 48
\ No newline at end of file
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 3deac49..532bd7f 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -257,6 +257,13 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* next checkpoint, we prevent reassignment of the relfilenumber until it's
* safe, because relfilenumber assignment skips over any existing file.
*
+ * XXX. Although all of this was true when relfilenumbers were 32 bits wide,
+ * they are now 56 bits wide and do not wrap around, so in the future we can
+ * change the code to immediately unlink the first segment of the relation
+ * along with all the others. We still do reuse relfilenumbers when createdb()
+ * is performed using the file-copy method or during movedb(), but the scenario
+ * described above can only happen when creating a new relation.
+ *
* We do not need to go through this dance for temp relations, though, because
* we never make WAL entries for temp rels, and so a temp rel poses no threat
* to the health of a regular rel that has taken over its relfilenumber.
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index c1a5feb..ed46ac3 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -154,7 +154,7 @@ smgropen(RelFileLocator rlocator, BackendId backend)
/* First time through: initialize the hash table */
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileLocatorBackend);
+ ctl.keysize = SizeOfRelFileLocatorBackend;
ctl.entrysize = sizeof(SMgrRelationData);
SMgrRelationHash = hash_create("smgr relation table", 400,
&ctl, HASH_ELEM | HASH_BLOBS);
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 34efa12..9f70f35 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -878,7 +878,7 @@ pg_relation_filenode(PG_FUNCTION_ARGS)
if (!RelFileNumberIsValid(result))
PG_RETURN_NULL();
- PG_RETURN_OID(result);
+ PG_RETURN_INT64(result);
}
/*
@@ -898,9 +898,12 @@ Datum
pg_filenode_relation(PG_FUNCTION_ARGS)
{
Oid reltablespace = PG_GETARG_OID(0);
- RelFileNumber relfilenumber = PG_GETARG_OID(1);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(1);
Oid heaprel;
+ /* check whether the relfilenumber is within a valid range */
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
+
/* test needed so RelidByRelfilenumber doesn't misbehave */
if (!RelFileNumberIsValid(relfilenumber))
PG_RETURN_NULL();
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 797f5f5..fc2faed 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -17,6 +17,7 @@
#include "catalog/pg_type.h"
#include "commands/extension.h"
#include "miscadmin.h"
+#include "storage/relfilelocator.h"
#include "utils/array.h"
#include "utils/builtins.h"
@@ -98,10 +99,12 @@ binary_upgrade_set_next_heap_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_heap_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_heap_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -120,10 +123,12 @@ binary_upgrade_set_next_index_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_index_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_index_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -142,10 +147,12 @@ binary_upgrade_set_next_toast_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_toast_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_toast_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 00dc0f2..6f4e96d 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -3712,7 +3712,7 @@ RelationSetNewRelfilenumber(Relation relation, char persistence)
{
/* Allocate a new relfilenumber */
newrelfilenumber = GetNewRelFileNumber(relation->rd_rel->reltablespace,
- NULL, persistence);
+ persistence);
}
else if (relation->rd_rel->relkind == RELKIND_INDEX)
{
diff --git a/src/backend/utils/cache/relfilenumbermap.c b/src/backend/utils/cache/relfilenumbermap.c
index c4245d5..cbb18f0 100644
--- a/src/backend/utils/cache/relfilenumbermap.c
+++ b/src/backend/utils/cache/relfilenumbermap.c
@@ -32,17 +32,11 @@
static HTAB *RelfilenumberMapHash = NULL;
/* built first time through in InitializeRelfilenumberMap */
-static ScanKeyData relfilenumber_skey[2];
+static ScanKeyData relfilenumber_skey[1];
typedef struct
{
- Oid reltablespace;
- RelFileNumber relfilenumber;
-} RelfilenumberMapKey;
-
-typedef struct
-{
- RelfilenumberMapKey key; /* lookup key - must be first */
+ RelFileNumber relfilenumber; /* lookup key - must be first */
Oid relid; /* pg_class.oid */
} RelfilenumberMapEntry;
@@ -72,7 +66,7 @@ RelfilenumberMapInvalidateCallback(Datum arg, Oid relid)
entry->relid == relid) /* individual flushed relation */
{
if (hash_search(RelfilenumberMapHash,
- (void *) &entry->key,
+ (void *) &entry->relfilenumber,
HASH_REMOVE,
NULL) == NULL)
elog(ERROR, "hash table corrupted");
@@ -88,7 +82,6 @@ static void
InitializeRelfilenumberMap(void)
{
HASHCTL ctl;
- int i;
/* Make sure we've initialized CacheMemoryContext. */
if (CacheMemoryContext == NULL)
@@ -97,25 +90,20 @@ InitializeRelfilenumberMap(void)
/* build skey */
MemSet(&relfilenumber_skey, 0, sizeof(relfilenumber_skey));
- for (i = 0; i < 2; i++)
- {
- fmgr_info_cxt(F_OIDEQ,
- &relfilenumber_skey[i].sk_func,
- CacheMemoryContext);
- relfilenumber_skey[i].sk_strategy = BTEqualStrategyNumber;
- relfilenumber_skey[i].sk_subtype = InvalidOid;
- relfilenumber_skey[i].sk_collation = InvalidOid;
- }
-
- relfilenumber_skey[0].sk_attno = Anum_pg_class_reltablespace;
- relfilenumber_skey[1].sk_attno = Anum_pg_class_relfilenode;
+ fmgr_info_cxt(F_INT8EQ,
+ &relfilenumber_skey[0].sk_func,
+ CacheMemoryContext);
+ relfilenumber_skey[0].sk_strategy = BTEqualStrategyNumber;
+ relfilenumber_skey[0].sk_subtype = InvalidOid;
+ relfilenumber_skey[0].sk_collation = InvalidOid;
+ relfilenumber_skey[0].sk_attno = Anum_pg_class_relfilenode;
/*
* Only create the RelfilenumberMapHash now, so we don't end up partially
* initialized when fmgr_info_cxt() above ERRORs out with an out of memory
* error.
*/
- ctl.keysize = sizeof(RelfilenumberMapKey);
+ ctl.keysize = sizeof(RelFileNumber);
ctl.entrysize = sizeof(RelfilenumberMapEntry);
ctl.hcxt = CacheMemoryContext;
@@ -129,7 +117,7 @@ InitializeRelfilenumberMap(void)
}
/*
- * Map a relation's (tablespace, relfilenumber) to a relation's oid and cache
+ * Map a relation's relfilenumber to a relation's oid and cache
* the result.
*
* Returns InvalidOid if no relation matching the criteria could be found.
@@ -137,26 +125,17 @@ InitializeRelfilenumberMap(void)
Oid
RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
{
- RelfilenumberMapKey key;
RelfilenumberMapEntry *entry;
bool found;
SysScanDesc scandesc;
Relation relation;
HeapTuple ntp;
- ScanKeyData skey[2];
+ ScanKeyData skey[1];
Oid relid;
if (RelfilenumberMapHash == NULL)
InitializeRelfilenumberMap();
- /* pg_class will show 0 when the value is actually MyDatabaseTableSpace */
- if (reltablespace == MyDatabaseTableSpace)
- reltablespace = 0;
-
- MemSet(&key, 0, sizeof(key));
- key.reltablespace = reltablespace;
- key.relfilenumber = relfilenumber;
-
/*
* Check cache and return entry if one is found. Even if no target
* relation can be found later on we store the negative match and return a
@@ -164,7 +143,8 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
* since querying invalid values isn't supposed to be a frequent thing,
* but it's basically free.
*/
- entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_FIND, &found);
+ entry = hash_search(RelfilenumberMapHash, (void *) &relfilenumber,
+ HASH_FIND, &found);
if (found)
return entry->relid;
@@ -195,14 +175,13 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
memcpy(skey, relfilenumber_skey, sizeof(skey));
/* set scan arguments */
- skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
- skey[1].sk_argument = ObjectIdGetDatum(relfilenumber);
+ skey[0].sk_argument = Int64GetDatum(relfilenumber);
scandesc = systable_beginscan(relation,
- ClassTblspcRelfilenodeIndexId,
+ ClassRelfilenodeIndexId,
true,
NULL,
- 2,
+ 1,
skey);
found = false;
@@ -213,11 +192,10 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
if (found)
elog(ERROR,
- "unexpected duplicate for tablespace %u, relfilenumber %u",
- reltablespace, relfilenumber);
+ "unexpected duplicate for relfilenumber " INT64_FORMAT,
+ relfilenumber);
found = true;
- Assert(classform->reltablespace == reltablespace);
Assert(classform->relfilenode == relfilenumber);
relid = classform->oid;
}
@@ -235,7 +213,8 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
* caused cache invalidations to be executed which would have deleted a
* new entry if we had entered it above.
*/
- entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_ENTER, &found);
+ entry = hash_search(RelfilenumberMapHash, (void *) &relfilenumber,
+ HASH_ENTER, &found);
if (found)
elog(ERROR, "corrupted hashtable");
entry->relid = relid;
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index 781f8b8..3c1fef4 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -79,8 +79,8 @@ pg_control_system(PG_FUNCTION_ARGS)
Datum
pg_control_checkpoint(PG_FUNCTION_ARGS)
{
- Datum values[18];
- bool nulls[18];
+ Datum values[19];
+ bool nulls[19];
TupleDesc tupdesc;
HeapTuple htup;
ControlFileData *ControlFile;
@@ -129,6 +129,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
XIDOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 18, "checkpoint_time",
TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 19, "next_relfilenumber",
+ INT8OID, -1, 0);
tupdesc = BlessTupleDesc(tupdesc);
/* Read the control file. */
@@ -202,6 +204,9 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
values[17] = TimestampTzGetDatum(time_t_to_timestamptz(ControlFile->checkPointCopy.time));
nulls[17] = false;
+ values[18] = Int64GetDatum(ControlFile->checkPointCopy.nextRelFileNumber);
+ nulls[18] = false;
+
htup = heap_form_tuple(tupdesc, values, nulls);
PG_RETURN_DATUM(HeapTupleGetDatum(htup));
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index dc20122..30933fd 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -489,9 +489,7 @@ main(int argc, char *argv[])
mode = PG_MODE_ENABLE;
break;
case 'f':
- if (!option_parse_int(optarg, "-f/--filenode", 0,
- INT_MAX,
- NULL))
+ if (!option_parse_relfilenumber(optarg, "-f/--filenode"))
exit(1);
only_filenode = pstrdup(optarg);
break;
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index c390ec5..4b6ff4d 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -250,6 +250,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
EpochFromFullTransactionId(ControlFile->checkPointCopy.nextXid),
XidFromFullTransactionId(ControlFile->checkPointCopy.nextXid));
+ printf(_("Latest checkpoint's NextRelFileNumber:%lld\n"),
+ (long long) ControlFile->checkPointCopy.nextRelFileNumber);
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index da66051..d9ea12b 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3143,9 +3143,9 @@ dumpDatabase(Archive *fout)
PQExpBuffer loOutQry = createPQExpBuffer();
PQExpBuffer loHorizonQry = createPQExpBuffer();
int i_relfrozenxid,
- i_relfilenode,
i_oid,
i_relminmxid;
+ RelFileNumber i_relfilenode;
/*
* pg_largeobject
@@ -3183,15 +3183,15 @@ dumpDatabase(Archive *fout)
atooid(PQgetvalue(lo_res, i, i_oid)));
oid = atooid(PQgetvalue(lo_res, i, i_oid));
- relfilenumber = atooid(PQgetvalue(lo_res, i, i_relfilenode));
+ relfilenumber = atorelnumber(PQgetvalue(lo_res, i, i_relfilenode));
if (oid == LargeObjectRelationId)
appendPQExpBuffer(loOutQry,
- "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
else if (oid == LargeObjectLOidPNIndexId)
appendPQExpBuffer(loOutQry,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
}
@@ -4876,16 +4876,16 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
relkind = *PQgetvalue(upgrade_res, 0, PQfnumber(upgrade_res, "relkind"));
- relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "relfilenode")));
+ relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "relfilenode")));
toast_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "reltoastrelid")));
- toast_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_relfilenode")));
+ toast_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_relfilenode")));
toast_index_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "indexrelid")));
- toast_index_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_index_relfilenode")));
+ toast_index_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_index_relfilenode")));
appendPQExpBufferStr(upgrade_buffer,
"\n-- For binary upgrade, must preserve pg_class oids and relfilenodes\n");
@@ -4903,7 +4903,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
*/
if (RelFileNumberIsValid(relfilenumber) && relkind != RELKIND_PARTITIONED_TABLE)
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
/*
@@ -4917,7 +4917,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_toast_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_relfilenumber);
/* every toast table has an index */
@@ -4925,7 +4925,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_index_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_index_relfilenumber);
}
@@ -4938,7 +4938,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
pg_class_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
}
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 269ed64..8be5e66 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -538,7 +538,7 @@ isRelDataFile(const char *path)
segNo = 0;
matched = false;
- nmatch = sscanf(path, "global/%u.%u", &rlocator.relNumber, &segNo);
+ nmatch = sscanf(path, "global/" INT64_FORMAT ".%u", &rlocator.relNumber, &segNo);
if (nmatch == 1 || nmatch == 2)
{
rlocator.spcOid = GLOBALTABLESPACE_OID;
@@ -547,7 +547,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "base/%u/%u.%u",
+ nmatch = sscanf(path, "base/%u/" INT64_FORMAT ".%u",
&rlocator.dbOid, &rlocator.relNumber, &segNo);
if (nmatch == 2 || nmatch == 3)
{
@@ -556,7 +556,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/%u.%u",
+ nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/" INT64_FORMAT ".%u",
&rlocator.spcOid, &rlocator.dbOid, &rlocator.relNumber,
&segNo);
if (nmatch == 3 || nmatch == 4)
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index df374ce..e97ba4d 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -399,8 +399,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
i_reloid,
i_indtable,
i_toastheap,
- i_relfilenumber,
i_reltablespace;
+ RelFileNumber i_relfilenumber;
char query[QUERY_ALLOC];
char *last_namespace = NULL,
*last_tablespace = NULL;
@@ -527,7 +527,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
relname = PQgetvalue(res, relnum, i_relname);
curr->relname = pg_strdup(relname);
- curr->relfilenumber = atooid(PQgetvalue(res, relnum, i_relfilenumber));
+ curr->relfilenumber =
+ atorelnumber(PQgetvalue(res, relnum, i_relfilenumber));
curr->tblsp_alloc = false;
/* Is the tablespace oid non-default? */
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index 115faa2..7ab1bcc 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -15,10 +15,8 @@
* oids are the same between old and new clusters. This is important
* because toast oids are stored as toast pointers in user tables.
*
- * While pg_class.oid and pg_class.relfilenode are initially the same in a
- * cluster, they can diverge due to CLUSTER, REINDEX, or VACUUM FULL. We
- * control assignments of pg_class.relfilenode because we want the filenames
- * to match between the old and new cluster.
+ * We control assignments of pg_class.relfilenode because we want the
+ * filenames to match between the old and new cluster.
*
* We control assignment of pg_tablespace.oid because we want the oid to match
* between the old and new cluster.
diff --git a/src/bin/pg_upgrade/relfilenumber.c b/src/bin/pg_upgrade/relfilenumber.c
index c3c7402..c8b10e4 100644
--- a/src/bin/pg_upgrade/relfilenumber.c
+++ b/src/bin/pg_upgrade/relfilenumber.c
@@ -190,14 +190,14 @@ transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_fro
else
snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
- snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
+ snprintf(old_file, sizeof(old_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->old_tablespace,
map->old_tablespace_suffix,
map->db_oid,
map->relfilenumber,
type_suffix,
extent_suffix);
- snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
+ snprintf(new_file, sizeof(new_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->new_tablespace,
map->new_tablespace_suffix,
map->db_oid,
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 6528113..9aca583 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -884,7 +884,7 @@ main(int argc, char **argv)
}
break;
case 'R':
- if (sscanf(optarg, "%u/%u/%u",
+ if (sscanf(optarg, "%u/%u/" INT64_FORMAT,
&config.filter_by_relation.spcOid,
&config.filter_by_relation.dbOid,
&config.filter_by_relation.relNumber) != 3 ||
diff --git a/src/bin/scripts/t/090_reindexdb.pl b/src/bin/scripts/t/090_reindexdb.pl
index e706d68..de5cee6 100644
--- a/src/bin/scripts/t/090_reindexdb.pl
+++ b/src/bin/scripts/t/090_reindexdb.pl
@@ -40,7 +40,7 @@ my $toast_index = $node->safe_psql('postgres',
# REINDEX operations. A set of relfilenodes is saved from the catalogs
# and then compared with pg_class.
$node->safe_psql('postgres',
- 'CREATE TABLE index_relfilenodes (parent regclass, indname text, indoid oid, relfilenode oid);'
+ 'CREATE TABLE index_relfilenodes (parent regclass, indname text, indoid oid, relfilenode int8);'
);
# Save the relfilenode of a set of toast indexes, one from the catalog
# pg_constraint and one from the test table.
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 1b6b620..0774e3f 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -149,10 +149,10 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
Assert(dbOid == 0);
Assert(backendId == InvalidBackendId);
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("global/%u_%s",
+ path = psprintf("global/" INT64_FORMAT "_%s",
relNumber, forkNames[forkNumber]);
else
- path = psprintf("global/%u", relNumber);
+ path = psprintf("global/" INT64_FORMAT, relNumber);
}
else if (spcOid == DEFAULTTABLESPACE_OID)
{
@@ -160,21 +160,21 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/%u_%s",
+ path = psprintf("base/%u/" INT64_FORMAT "_%s",
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/%u",
+ path = psprintf("base/%u/" INT64_FORMAT,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/t%d_%u_%s",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT "_%s",
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/t%d_%u",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT,
dbOid, backendId, relNumber);
}
}
@@ -184,24 +184,24 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber);
}
diff --git a/src/fe_utils/option_utils.c b/src/fe_utils/option_utils.c
index abea881..7d49359 100644
--- a/src/fe_utils/option_utils.c
+++ b/src/fe_utils/option_utils.c
@@ -82,3 +82,42 @@ option_parse_int(const char *optarg, const char *optname,
*result = val;
return true;
}
+
+/*
+ * option_parse_relfilenumber
+ *
+ * Parse relfilenumber value for an option. If the parsing is successful,
+ * returns true; if parsing fails, reports an error and returns false.
+ */
+bool
+option_parse_relfilenumber(const char *optarg, const char *optname)
+{
+ char *endptr;
+ int64 val;
+
+ errno = 0;
+ val = strtoi64(optarg, &endptr, 10);
+
+ /*
+ * Skip any trailing whitespace; if anything but whitespace remains before
+ * the terminating character, fail.
+ */
+ while (*endptr != '\0' && isspace((unsigned char) *endptr))
+ endptr++;
+
+ if (*endptr != '\0')
+ {
+ pg_log_error("invalid value \"%s\" for option %s",
+ optarg, optname);
+ return false;
+ }
+
+ if (val < 0 || val > MAX_RELFILENUMBER)
+ {
+ pg_log_error("%s must be in range " INT64_FORMAT ".." INT64_FORMAT,
+ optname, (RelFileNumber) 0, MAX_RELFILENUMBER);
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 775471d..36acd8b 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -196,6 +196,30 @@ FullTransactionIdAdvance(FullTransactionId *dest)
#define FirstUnpinnedObjectId 12000
#define FirstNormalObjectId 16384
+/* ----------
+ * RelFileNumber zero is InvalidRelFileNumber.
+ *
+ * For the system tables (OID < FirstNormalObjectId) the initial storage
+ * will be created with a relfilenumber equal to their oid, and any
+ * relfilenumber allocated later by GetNewRelFileNumber() will start at
+ * 100000.  Thus, when upgrading from an older cluster, the storage path
+ * of a user table from the old cluster cannot conflict with the storage
+ * path of a system table in the new cluster.  (The new cluster must not
+ * contain any user tables while upgrading, so those need no special
+ * handling.)
+ * ----------
+ */
+#define FirstNormalRelFileNumber ((RelFileNumber) 100000)
+
+#define CHECK_RELFILENUMBER_RANGE(relfilenumber) \
+do { \
+ if ((relfilenumber) < 0 || (relfilenumber) > MAX_RELFILENUMBER) \
+ ereport(ERROR, \
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE), \
+ errmsg("relfilenumber %lld is out of range", \
+ (long long) (relfilenumber))); \
+} while (0)
+
/*
* VariableCache is a data structure in shared memory that is used to track
* OID and XID assignment state. For largely historical reasons, there is
@@ -215,6 +239,14 @@ typedef struct VariableCacheData
uint32 oidCount; /* OIDs available before must do XLOG work */
/*
+ * These fields are protected by RelFileNumberGenLock.
+ */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber to assign */
+ RelFileNumber loggedRelFileNumber; /* last logged relfilenumber */
+ XLogRecPtr loggedRelFileNumberRecPtr; /* xlog record pointer w.r.t.
+ * loggedRelFileNumber */
+
+ /*
* These fields are protected by XidGenLock.
*/
FullTransactionId nextXid; /* next XID to assign */
@@ -293,6 +325,9 @@ extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
extern bool ForceTransactionIdLimitUpdate(void);
extern Oid GetNewObjectId(void);
+extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
+ char relpersistence);
+extern void SetNextRelFileNumber(RelFileNumber relnumber);
extern void StopGeneratingPinnedObjectIds(void);
#ifdef USE_ASSERT_CHECKING
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index cd674c3..78660f1 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -234,6 +234,8 @@ extern void CreateCheckPoint(int flags);
extern bool CreateRestartPoint(int flags);
extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN);
extern void XLogPutNextOid(Oid nextOid);
+extern void LogNextRelFileNumber(RelFileNumber nextrelnumber,
+ XLogRecPtr *prevrecptr);
extern XLogRecPtr XLogRestorePoint(const char *rpName);
extern void UpdateFullPageWrites(void);
extern void GetFullPageWriteInfo(XLogRecPtr *RedoRecPtr_p, bool *doPageWrites_p);
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index e1c85f9..b452530 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -38,8 +38,5 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
-extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
- Relation pg_class,
- char relpersistence);
#endif /* CATALOG_H */
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index e1f4eef..cacfc11 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -34,6 +34,13 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* oid */
Oid oid;
+ /* access method; 0 if not a table / index */
+ Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
+
+ /* identifier of physical storage file */
+ /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
+ int64 relfilenode BKI_DEFAULT(0);
+
/* class name */
NameData relname;
@@ -49,13 +56,6 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* class owner */
Oid relowner BKI_DEFAULT(POSTGRES) BKI_LOOKUP(pg_authid);
- /* access method; 0 if not a table / index */
- Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
-
- /* identifier of physical storage file */
- /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
- Oid relfilenode BKI_DEFAULT(0);
-
/* identifier of table space for relation (0 means default for database) */
Oid reltablespace BKI_DEFAULT(0) BKI_LOOKUP_OPT(pg_tablespace);
@@ -154,7 +154,7 @@ typedef FormData_pg_class *Form_pg_class;
DECLARE_UNIQUE_INDEX_PKEY(pg_class_oid_index, 2662, ClassOidIndexId, on pg_class using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_class_relname_nsp_index, 2663, ClassNameNspIndexId, on pg_class using btree(relname name_ops, relnamespace oid_ops));
-DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode oid_ops));
+DECLARE_INDEX(pg_class_relfilenode_index, 3455, ClassRelfilenodeIndexId, on pg_class using btree(relfilenode int8_ops));
#ifdef EXPOSE_TO_CLIENT_CODE
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 06368e2..096222f 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -41,6 +41,7 @@ typedef struct CheckPoint
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
FullTransactionId nextXid; /* next free transaction ID */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
@@ -78,6 +79,7 @@ typedef struct CheckPoint
#define XLOG_FPI 0xB0
/* 0xC0 is used in Postgres 9.5-11 */
#define XLOG_OVERWRITE_CONTRECORD 0xD0
+#define XLOG_NEXT_RELFILENUMBER 0xE0
/*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index be47583..5e88170 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7329,11 +7329,11 @@
proname => 'pg_indexes_size', provolatile => 'v', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_indexes_size' },
{ oid => '2999', descr => 'filenode identifier of relation',
- proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'oid',
+ proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_relation_filenode' },
{ oid => '3454', descr => 'relation OID for filenode and tablespace',
proname => 'pg_filenode_relation', provolatile => 's',
- prorettype => 'regclass', proargtypes => 'oid oid',
+ prorettype => 'regclass', proargtypes => 'oid int8',
prosrc => 'pg_filenode_relation' },
{ oid => '3034', descr => 'file path of relation',
proname => 'pg_relation_filepath', provolatile => 's', prorettype => 'text',
@@ -11199,15 +11199,15 @@
prosrc => 'binary_upgrade_set_missing_value' },
{ oid => '4545', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_heap_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_heap_relfilenode' },
{ oid => '4546', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_index_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_index_relfilenode' },
{ oid => '4547', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_toast_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_toast_relfilenode' },
{ oid => '4548', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_pg_tablespace_oid', provolatile => 'v',
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index 3ab7132..f84e22c 100644
--- a/src/include/common/relpath.h
+++ b/src/include/common/relpath.h
@@ -28,6 +28,7 @@
/* Characters to allow for an OID in a relation path */
#define OIDCHARS 10 /* max chars printed by %u */
+#define RELNUMBERCHARS 20 /* max chars printed by %llu */
/*
* Stuff for fork names.
diff --git a/src/include/fe_utils/option_utils.h b/src/include/fe_utils/option_utils.h
index 03c09fd..2508a61 100644
--- a/src/include/fe_utils/option_utils.h
+++ b/src/include/fe_utils/option_utils.h
@@ -22,5 +22,7 @@ extern void handle_help_version_opts(int argc, char *argv[],
extern bool option_parse_int(const char *optarg, const char *optname,
int min_range, int max_range,
int *result);
+extern bool option_parse_relfilenumber(const char *optarg,
+ const char *optname);
#endif /* OPTION_UTILS_H */
diff --git a/src/include/postgres_ext.h b/src/include/postgres_ext.h
index c9774fa..12b6b80 100644
--- a/src/include/postgres_ext.h
+++ b/src/include/postgres_ext.h
@@ -48,11 +48,17 @@ typedef PG_INT64_TYPE pg_int64;
/*
* RelFileNumber data type identifies the specific relation file name.
+ * RelFileNumber is unique within a cluster.
*/
-typedef Oid RelFileNumber;
-#define InvalidRelFileNumber ((RelFileNumber) InvalidOid)
+typedef pg_int64 RelFileNumber;
+
+#define InvalidRelFileNumber ((RelFileNumber) 0)
#define RelFileNumberIsValid(relnumber) \
((bool) ((relnumber) != InvalidRelFileNumber))
+#define atorelnumber(x) ((RelFileNumber) strtoull((x), NULL, 10))
+
+/* Maximum value of a relfilenumber; relfilenumbers are 56 bits wide. */
+#define MAX_RELFILENUMBER INT64CONST(0x00FFFFFFFFFFFFFF)
/*
* Identifiers of error message fields. Kept here to keep common
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 406db6b..1301301 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -92,29 +92,73 @@ typedef struct buftag
{
Oid spcOid; /* tablespace oid */
Oid dbOid; /* database oid */
- RelFileNumber relNumber; /* relation file number */
- ForkNumber forkNum; /* fork number */
+
+ /*
+ * Represents the relfilenumber and the fork number.  The high 8 bits of
+ * the first 32-bit integer hold the fork number; the remaining 24 bits of
+ * the first integer together with all 32 bits of the second integer hold
+ * the relfilenumber, making it 56 bits wide.  We stop at 56 bits rather
+ * than 64 because a full 64-bit field would enlarge the BufferTag, and we
+ * use two 32-bit integers rather than a single 64-bit one to avoid
+ * introducing 8-byte alignment padding into the structure.
+ */
+ uint32 relForkDetails[2];
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
+/* High relNumber bits in relForkDetails[0] */
+#define BUFTAG_RELNUM_HIGH_BITS 24
+
+/* Low relNumber bits in relForkDetails[1] */
+#define BUFTAG_RELNUM_LOW_BITS 32
+
+/* Mask to fetch high bits of relNumber from relForkDetails[0] */
+#define BUFTAG_RELNUM_HIGH_MASK ((1U << BUFTAG_RELNUM_HIGH_BITS) - 1)
+
+/* Mask to fetch low bits of relNumber from relForkDetails[1] */
+#define BUFTAG_RELNUM_LOW_MASK 0XFFFFFFFF
+
static inline RelFileNumber
BufTagGetRelNumber(const BufferTag *tag)
{
- return tag->relNumber;
+ uint64 relnum;
+
+ relnum = ((uint64) tag->relForkDetails[0]) & BUFTAG_RELNUM_HIGH_MASK;
+ relnum = (relnum << BUFTAG_RELNUM_LOW_BITS) | tag->relForkDetails[1];
+
+ Assert(relnum <= MAX_RELFILENUMBER);
+ return (RelFileNumber) relnum;
}
static inline ForkNumber
BufTagGetForkNum(const BufferTag *tag)
{
- return tag->forkNum;
+ int8 ret;
+
+ StaticAssertStmt(MAX_FORKNUM <= INT8_MAX,
+ "MAX_FORKNUM can't be greater than INT8_MAX");
+
+ ret = (int8) (tag->relForkDetails[0] >> BUFTAG_RELNUM_HIGH_BITS);
+ return (ForkNumber) ret;
}
static inline void
BufTagSetRelForkDetails(BufferTag *tag, RelFileNumber relnumber,
ForkNumber forknum)
{
- tag->relNumber = relnumber;
- tag->forkNum = forknum;
+ uint64 relnum;
+
+ Assert(relnumber <= MAX_RELFILENUMBER);
+ Assert(forknum <= MAX_FORKNUM);
+
+ relnum = relnumber;
+
+ tag->relForkDetails[0] = (relnum >> BUFTAG_RELNUM_LOW_BITS) &
+ BUFTAG_RELNUM_HIGH_MASK;
+ tag->relForkDetails[0] |= (forknum << BUFTAG_RELNUM_HIGH_BITS);
+ tag->relForkDetails[1] = relnum & BUFTAG_RELNUM_LOW_MASK;
}
static inline RelFileLocator
@@ -153,9 +197,9 @@ BufferTagsEqual(const BufferTag *tag1, const BufferTag *tag2)
{
return (tag1->spcOid == tag2->spcOid) &&
(tag1->dbOid == tag2->dbOid) &&
- (tag1->relNumber == tag2->relNumber) &&
- (tag1->blockNum == tag2->blockNum) &&
- (tag1->forkNum == tag2->forkNum);
+ (tag1->relForkDetails[0] == tag2->relForkDetails[0]) &&
+ (tag1->relForkDetails[1] == tag2->relForkDetails[1]) &&
+ (tag1->blockNum == tag2->blockNum);
}
static inline bool
diff --git a/src/include/storage/relfilelocator.h b/src/include/storage/relfilelocator.h
index 10f41f3..ef90464 100644
--- a/src/include/storage/relfilelocator.h
+++ b/src/include/storage/relfilelocator.h
@@ -32,10 +32,11 @@
* Nonzero dbOid values correspond to pg_database.oid.
*
* relNumber identifies the specific relation. relNumber corresponds to
- * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
- * to assign new physical files to relations in some situations).
- * Notice that relNumber is only unique within a database in a particular
- * tablespace.
+ * pg_class.relfilenode. Notice that relNumber values are assigned by
+ * GetNewRelFileNumber(), which never assigns the same value twice during
+ * the lifetime of a cluster. However, since CREATE DATABASE duplicates
+ * the relfilenumbers of the template database, the values are in practice only
+ * unique within a database, not globally.
*
* Note: spcOid must be GLOBALTABLESPACE_OID if and only if dbOid is
* zero. We support shared relations only in the "global" tablespace.
@@ -75,6 +76,9 @@ typedef struct RelFileLocatorBackend
BackendId backend;
} RelFileLocatorBackend;
+#define SizeOfRelFileLocatorBackend \
+ (offsetof(RelFileLocatorBackend, backend) + sizeof(BackendId))
+
#define RelFileLocatorBackendIsTemp(rlocator) \
((rlocator).backend != InvalidBackendId)
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index d63f4f1..a489ccc 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -2164,9 +2164,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2175,10 +2174,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+---------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | t | own | child 0 index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | t | own | child 1 index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | t | orig | child 0 index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | t | orig | child 1 index
at_partitioned_id_name_key | t | none | parent index
(6 rows)
@@ -2198,9 +2197,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2209,10 +2207,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+--------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | f | own | parent index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | f | own | parent index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | f | new | parent index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | f | new | parent index
at_partitioned_id_name_key | f | none | parent index
(6 rows)
@@ -2560,7 +2558,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/expected/fast_default.out b/src/test/regress/expected/fast_default.out
index 91f2571..0a35f33 100644
--- a/src/test/regress/expected/fast_default.out
+++ b/src/test/regress/expected/fast_default.out
@@ -3,8 +3,8 @@
--
SET search_path = fast_default;
CREATE SCHEMA fast_default;
-CREATE TABLE m(id OID);
-INSERT INTO m VALUES (NULL::OID);
+CREATE TABLE m(id BIGINT);
+INSERT INTO m VALUES (NULL::BIGINT);
CREATE FUNCTION set(tabname name) RETURNS VOID
AS $$
BEGIN
diff --git a/src/test/regress/expected/oidjoins.out b/src/test/regress/expected/oidjoins.out
index 215eb89..af57470 100644
--- a/src/test/regress/expected/oidjoins.out
+++ b/src/test/regress/expected/oidjoins.out
@@ -74,11 +74,11 @@ NOTICE: checking pg_type {typcollation} => pg_collation {oid}
NOTICE: checking pg_attribute {attrelid} => pg_class {oid}
NOTICE: checking pg_attribute {atttypid} => pg_type {oid}
NOTICE: checking pg_attribute {attcollation} => pg_collation {oid}
+NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {relnamespace} => pg_namespace {oid}
NOTICE: checking pg_class {reltype} => pg_type {oid}
NOTICE: checking pg_class {reloftype} => pg_type {oid}
NOTICE: checking pg_class {relowner} => pg_authid {oid}
-NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {reltablespace} => pg_tablespace {oid}
NOTICE: checking pg_class {reltoastrelid} => pg_class {oid}
NOTICE: checking pg_class {relrewrite} => pg_class {oid}
diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql
index e7013f5..c9622f6 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -1478,9 +1478,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1499,9 +1498,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1641,7 +1639,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/sql/fast_default.sql b/src/test/regress/sql/fast_default.sql
index 16a3b7c..819ec40 100644
--- a/src/test/regress/sql/fast_default.sql
+++ b/src/test/regress/sql/fast_default.sql
@@ -4,8 +4,8 @@
SET search_path = fast_default;
CREATE SCHEMA fast_default;
-CREATE TABLE m(id OID);
-INSERT INTO m VALUES (NULL::OID);
+CREATE TABLE m(id BIGINT);
+INSERT INTO m VALUES (NULL::BIGINT);
CREATE FUNCTION set(tabname name) RETURNS VOID
AS $$
--
1.8.3.1
v15-0001-Preliminary-refactoring-for-supporting-larger-re.patchtext/x-patch; charset=US-ASCII; name=v15-0001-Preliminary-refactoring-for-supporting-larger-re.patchDownload
From a30108d0bf44a9752418e9c474b0108f19fb744f Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Thu, 28 Jul 2022 11:49:30 +0530
Subject: [PATCH v15 1/2] Preliminary refactoring for supporting larger
relfilenumber
Currently, relfilenumber is of type Oid, so it can wrap around. As part of
the larger patch set we are widening it to 64 bits to avoid wraparound,
which also simplifies a couple of other things, as explained in the next
patches.
This is a preliminary refactoring patch toward that goal: in BufferTag,
instead of keeping a RelFileLocator, we keep the tablespace Oid, database
Oid, and relfilenumber directly, so that once relNumber in RelFileLocator
becomes 64 bits the buffer tag's alignment padding will not change.
---
contrib/pg_buffercache/pg_buffercache_pages.c | 8 +-
contrib/pg_prewarm/autoprewarm.c | 10 ++-
src/backend/storage/buffer/bufmgr.c | 115 +++++++++++++++-----------
src/backend/storage/buffer/localbuf.c | 21 +++--
src/include/storage/buf_internals.h | 64 ++++++++++++--
5 files changed, 145 insertions(+), 73 deletions(-)
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 131bd62..c5754ea 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -153,10 +153,10 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenumber = bufHdr->tag.rlocator.relNumber;
- fctx->record[i].reltablespace = bufHdr->tag.rlocator.spcOid;
- fctx->record[i].reldatabase = bufHdr->tag.rlocator.dbOid;
- fctx->record[i].forknum = bufHdr->tag.forkNum;
+ fctx->record[i].relfilenumber = BufTagGetRelNumber(&bufHdr->tag);
+ fctx->record[i].reltablespace = bufHdr->tag.spcOid;
+ fctx->record[i].reldatabase = bufHdr->tag.dbOid;
+ fctx->record[i].forknum = BufTagGetForkNum(&bufHdr->tag);
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
fctx->record[i].pinning_backends = BUF_STATE_GET_REFCOUNT(buf_state);
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index d9ab39d..c8d673a 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -630,10 +630,12 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
if (buf_state & BM_TAG_VALID &&
((buf_state & BM_PERMANENT) || dump_unlogged))
{
- block_info_array[num_blocks].database = bufHdr->tag.rlocator.dbOid;
- block_info_array[num_blocks].tablespace = bufHdr->tag.rlocator.spcOid;
- block_info_array[num_blocks].filenumber = bufHdr->tag.rlocator.relNumber;
- block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
+ block_info_array[num_blocks].database = bufHdr->tag.dbOid;
+ block_info_array[num_blocks].tablespace = bufHdr->tag.spcOid;
+ block_info_array[num_blocks].filenumber =
+ BufTagGetRelNumber(&bufHdr->tag);
+ block_info_array[num_blocks].forknum =
+ BufTagGetForkNum(&bufHdr->tag);
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
}
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 6b30138..7a75711 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1657,8 +1657,8 @@ ReleaseAndReadBuffer(Buffer buffer,
{
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
- bufHdr->tag.forkNum == forkNum)
+ BufTagMatchesRelFileLocator(&bufHdr->tag, &relation->rd_locator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
LocalRefCount[-buffer - 1]--;
@@ -1668,8 +1668,8 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetBufferDescriptor(buffer - 1);
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, relation->rd_locator) &&
- bufHdr->tag.forkNum == forkNum)
+ BufTagMatchesRelFileLocator(&bufHdr->tag, &relation->rd_locator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
}
@@ -2010,9 +2010,9 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
- item->tsId = bufHdr->tag.rlocator.spcOid;
- item->relNumber = bufHdr->tag.rlocator.relNumber;
- item->forkNum = bufHdr->tag.forkNum;
+ item->tsId = bufHdr->tag.spcOid;
+ item->relNumber = BufTagGetRelNumber(&bufHdr->tag);
+ item->forkNum = BufTagGetForkNum(&bufHdr->tag);
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2718,7 +2718,8 @@ PrintBufferLeakWarning(Buffer buffer)
}
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rlocator, backend, buf->tag.forkNum);
+ path = relpathbackend(BufTagGetRelFileLocator(&buf->tag), backend,
+ BufTagGetForkNum(&buf->tag));
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2797,8 +2798,8 @@ BufferGetTag(Buffer buffer, RelFileLocator *rlocator, ForkNumber *forknum,
bufHdr = GetBufferDescriptor(buffer - 1);
/* pinned, so OK to read tag without spinlock */
- *rlocator = bufHdr->tag.rlocator;
- *forknum = bufHdr->tag.forkNum;
+ *rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
+ *forknum = BufTagGetForkNum(&bufHdr->tag);
*blknum = bufHdr->tag.blockNum;
}
@@ -2848,9 +2849,9 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rlocator, InvalidBackendId);
+ reln = smgropen(BufTagGetRelFileLocator(&buf->tag), InvalidBackendId);
- TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
+ TRACE_POSTGRESQL_BUFFER_FLUSH_START(BufTagGetForkNum(&buf->tag),
buf->tag.blockNum,
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
@@ -2909,7 +2910,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
* bufToWrite is either the shared buffer or a copy, as appropriate.
*/
smgrwrite(reln,
- buf->tag.forkNum,
+ BufTagGetForkNum(&buf->tag),
buf->tag.blockNum,
bufToWrite,
false);
@@ -2930,7 +2931,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
*/
TerminateBufferIO(buf, true, 0);
- TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(buf->tag.forkNum,
+ TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(BufTagGetForkNum(&buf->tag),
buf->tag.blockNum,
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
@@ -3151,15 +3152,15 @@ DropRelationBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
* We could check forkNum and blockNum as well as the rlocator, but
* the incremental win from doing so seems small.
*/
- if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator))
+ if (!BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator.locator))
continue;
buf_state = LockBufHdr(bufHdr);
for (j = 0; j < nforks; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator.locator) &&
- bufHdr->tag.forkNum == forkNum[j] &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator.locator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum[j] &&
bufHdr->tag.blockNum >= firstDelBlock[j])
{
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -3310,7 +3311,7 @@ DropRelationsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
for (j = 0; j < n; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, locators[j]))
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &locators[j]))
{
rlocator = &locators[j];
break;
@@ -3319,7 +3320,10 @@ DropRelationsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
}
else
{
- rlocator = bsearch((const void *) &(bufHdr->tag.rlocator),
+ RelFileLocator locator;
+
+ locator = BufTagGetRelFileLocator(&bufHdr->tag);
+ rlocator = bsearch((const void *) &(locator),
locators, n, sizeof(RelFileLocator),
rlocator_comparator);
}
@@ -3329,7 +3333,7 @@ DropRelationsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, (*rlocator)))
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, rlocator))
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3389,8 +3393,8 @@ FindAndDropRelationBuffers(RelFileLocator rlocator, ForkNumber forkNum,
*/
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
- bufHdr->tag.forkNum == forkNum &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
@@ -3428,11 +3432,11 @@ DropDatabaseBuffers(Oid dbid)
* As in DropRelationBuffers, an unlocked precheck should be
* safe and saves some cycles.
*/
- if (bufHdr->tag.rlocator.dbOid != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rlocator.dbOid == dbid)
+ if (bufHdr->tag.dbOid == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3462,7 +3466,8 @@ PrintBufferDescs(void)
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathbackend(buf->tag.rlocator, InvalidBackendId, buf->tag.forkNum),
+ relpathbackend(BufTagGetRelFileLocator(&buf->tag),
+ InvalidBackendId, BufTagGetForkNum(&buf->tag)),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3487,7 +3492,8 @@ PrintPinnedBufs(void)
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathperm(buf->tag.rlocator, buf->tag.forkNum),
+ relpathperm(BufTagGetRelFileLocator(&buf->tag),
+ BufTagGetForkNum(&buf->tag)),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3526,7 +3532,7 @@ FlushRelationBuffers(Relation rel)
uint32 buf_state;
bufHdr = GetLocalBufferDescriptor(i);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rel->rd_locator) &&
((buf_state = pg_atomic_read_u32(&bufHdr->state)) &
(BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
@@ -3544,7 +3550,7 @@ FlushRelationBuffers(Relation rel)
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
smgrwrite(RelationGetSmgr(rel),
- bufHdr->tag.forkNum,
+ BufTagGetForkNum(&bufHdr->tag),
bufHdr->tag.blockNum,
localpage,
false);
@@ -3573,13 +3579,13 @@ FlushRelationBuffers(Relation rel)
* As in DropRelationBuffers, an unlocked precheck should be
* safe and saves some cycles.
*/
- if (!RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator))
+ if (!BufTagMatchesRelFileLocator(&bufHdr->tag, &rel->rd_locator))
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, rel->rd_locator) &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rel->rd_locator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3653,7 +3659,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
for (j = 0; j < nrels; j++)
{
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, srels[j].rlocator))
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &srels[j].rlocator))
{
srelent = &srels[j];
break;
@@ -3662,7 +3668,10 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
}
else
{
- srelent = bsearch((const void *) &(bufHdr->tag.rlocator),
+ RelFileLocator rlocator;
+
+ rlocator = BufTagGetRelFileLocator(&bufHdr->tag);
+ srelent = bsearch((const void *) &(rlocator),
srels, nrels, sizeof(SMgrSortArray),
rlocator_comparator);
}
@@ -3674,7 +3683,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileLocatorEquals(bufHdr->tag.rlocator, srelent->rlocator) &&
+ if (BufTagMatchesRelFileLocator(&bufHdr->tag, &srelent->rlocator) &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3876,13 +3885,13 @@ FlushDatabaseBuffers(Oid dbid)
* As in DropRelationBuffers, an unlocked precheck should be
* safe and saves some cycles.
*/
- if (bufHdr->tag.rlocator.dbOid != dbid)
+ if (bufHdr->tag.dbOid != dbid)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rlocator.dbOid == dbid &&
+ if (bufHdr->tag.dbOid == dbid &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -4051,7 +4060,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
* See src/backend/storage/page/README for longer discussion.
*/
if (RecoveryInProgress() ||
- RelFileLocatorSkippingWAL(bufHdr->tag.rlocator))
+ RelFileLocatorSkippingWAL(BufTagGetRelFileLocator(&bufHdr->tag)))
return;
/*
@@ -4660,7 +4669,8 @@ AbortBufferIO(void)
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
- path = relpathperm(buf->tag.rlocator, buf->tag.forkNum);
+ path = relpathperm(BufTagGetRelFileLocator(&buf->tag),
+ BufTagGetForkNum(&buf->tag));
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4684,7 +4694,8 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rlocator, bufHdr->tag.forkNum);
+ char *path = relpathperm(BufTagGetRelFileLocator(&bufHdr->tag),
+ BufTagGetForkNum(&bufHdr->tag));
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4702,8 +4713,9 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
- char *path = relpathbackend(bufHdr->tag.rlocator, MyBackendId,
- bufHdr->tag.forkNum);
+ char *path = relpathbackend(BufTagGetRelFileLocator(&bufHdr->tag),
+ MyBackendId,
+ BufTagGetForkNum(&bufHdr->tag));
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4797,15 +4809,20 @@ static inline int
buffertag_comparator(const BufferTag *ba, const BufferTag *bb)
{
int ret;
+ RelFileLocator rlocatora;
+ RelFileLocator rlocatorb;
- ret = rlocator_comparator(&ba->rlocator, &bb->rlocator);
+ rlocatora = BufTagGetRelFileLocator(ba);
+ rlocatorb = BufTagGetRelFileLocator(bb);
+
+ ret = rlocator_comparator(&rlocatora, &rlocatorb);
if (ret != 0)
return ret;
- if (ba->forkNum < bb->forkNum)
+ if (BufTagGetForkNum(ba) < BufTagGetForkNum(bb))
return -1;
- if (ba->forkNum > bb->forkNum)
+ if (BufTagGetForkNum(ba) > BufTagGetForkNum(bb))
return 1;
if (ba->blockNum < bb->blockNum)
@@ -4955,10 +4972,12 @@ IssuePendingWritebacks(WritebackContext *context)
SMgrRelation reln;
int ahead;
BufferTag tag;
+ RelFileLocator currlocator;
Size nblocks = 1;
cur = &context->pending_writebacks[i];
tag = cur->tag;
+ currlocator = BufTagGetRelFileLocator(&tag);
/*
* Peek ahead, into following writeback requests, to see if they can
@@ -4966,11 +4985,13 @@ IssuePendingWritebacks(WritebackContext *context)
*/
for (ahead = 0; i + ahead + 1 < context->nr_pending; ahead++)
{
+
next = &context->pending_writebacks[i + ahead + 1];
/* different file, stop */
- if (!RelFileLocatorEquals(cur->tag.rlocator, next->tag.rlocator) ||
- cur->tag.forkNum != next->tag.forkNum)
+ if (!RelFileLocatorEquals(currlocator,
+ BufTagGetRelFileLocator(&next->tag)) ||
+ BufTagGetForkNum(&cur->tag) != BufTagGetForkNum(&next->tag))
break;
/* ok, block queued twice, skip */
@@ -4988,8 +5009,8 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rlocator, InvalidBackendId);
- smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
+ reln = smgropen(currlocator, InvalidBackendId);
+ smgrwriteback(reln, BufTagGetForkNum(&tag), tag.blockNum, nblocks);
}
context->nr_pending = 0;
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index 014f644..9853007 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -215,13 +215,13 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rlocator, MyBackendId);
+ oreln = smgropen(BufTagGetRelFileLocator(&bufHdr->tag), MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
/* And write... */
smgrwrite(oreln,
- bufHdr->tag.forkNum,
+ BufTagGetForkNum(&bufHdr->tag),
bufHdr->tag.blockNum,
localpage,
false);
@@ -337,16 +337,18 @@ DropRelationLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator) &&
- bufHdr->tag.forkNum == forkNum &&
+ BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator) &&
+ BufTagGetForkNum(&bufHdr->tag) == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rlocator, MyBackendId,
- bufHdr->tag.forkNum),
+ relpathbackend(BufTagGetRelFileLocator(&bufHdr->tag),
+ MyBackendId,
+ BufTagGetForkNum(&bufHdr->tag)),
LocalRefCount[i]);
+
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
hash_search(LocalBufHash, (void *) &bufHdr->tag,
@@ -383,13 +385,14 @@ DropRelationAllLocalBuffers(RelFileLocator rlocator)
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileLocatorEquals(bufHdr->tag.rlocator, rlocator))
+ BufTagMatchesRelFileLocator(&bufHdr->tag, &rlocator))
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rlocator, MyBackendId,
- bufHdr->tag.forkNum),
+ relpathbackend(BufTagGetRelFileLocator(&bufHdr->tag),
+ MyBackendId,
+ BufTagGetForkNum(&bufHdr->tag)),
LocalRefCount[i]);
/* Remove entry from hashtable */
hresult = (LocalBufferLookupEnt *)
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 7246655..406db6b 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,18 +90,51 @@
*/
typedef struct buftag
{
- RelFileLocator rlocator; /* physical relation identifier */
- ForkNumber forkNum;
+ Oid spcOid; /* tablespace oid */
+ Oid dbOid; /* database oid */
+ RelFileNumber relNumber; /* relation file number */
+ ForkNumber forkNum; /* fork number */
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
+static inline RelFileNumber
+BufTagGetRelNumber(const BufferTag *tag)
+{
+ return tag->relNumber;
+}
+
+static inline ForkNumber
+BufTagGetForkNum(const BufferTag *tag)
+{
+ return tag->forkNum;
+}
+
+static inline void
+BufTagSetRelForkDetails(BufferTag *tag, RelFileNumber relnumber,
+ ForkNumber forknum)
+{
+ tag->relNumber = relnumber;
+ tag->forkNum = forknum;
+}
+
+static inline RelFileLocator
+BufTagGetRelFileLocator(const BufferTag *tag)
+{
+ RelFileLocator rlocator;
+
+ rlocator.spcOid = tag->spcOid;
+ rlocator.dbOid = tag->dbOid;
+ rlocator.relNumber = BufTagGetRelNumber(tag);
+
+ return rlocator;
+}
+
static inline void
ClearBufferTag(BufferTag *tag)
{
- tag->rlocator.spcOid = InvalidOid;
- tag->rlocator.dbOid = InvalidOid;
- tag->rlocator.relNumber = InvalidRelFileNumber;
- tag->forkNum = InvalidForkNumber;
+ tag->spcOid = InvalidOid;
+ tag->dbOid = InvalidOid;
+ BufTagSetRelForkDetails(tag, InvalidRelFileNumber, InvalidForkNumber);
tag->blockNum = InvalidBlockNumber;
}
@@ -109,19 +142,32 @@ static inline void
InitBufferTag(BufferTag *tag, const RelFileLocator *rlocator,
ForkNumber forkNum, BlockNumber blockNum)
{
- tag->rlocator = *rlocator;
- tag->forkNum = forkNum;
+ tag->spcOid = rlocator->spcOid;
+ tag->dbOid = rlocator->dbOid;
+ BufTagSetRelForkDetails(tag, rlocator->relNumber, forkNum);
tag->blockNum = blockNum;
}
static inline bool
BufferTagsEqual(const BufferTag *tag1, const BufferTag *tag2)
{
- return RelFileLocatorEquals(tag1->rlocator, tag2->rlocator) &&
+ return (tag1->spcOid == tag2->spcOid) &&
+ (tag1->dbOid == tag2->dbOid) &&
+ (tag1->relNumber == tag2->relNumber) &&
(tag1->blockNum == tag2->blockNum) &&
(tag1->forkNum == tag2->forkNum);
}
+static inline bool
+BufTagMatchesRelFileLocator(const BufferTag *tag,
+ const RelFileLocator *rlocator)
+{
+ return (tag->spcOid == rlocator->spcOid) &&
+ (tag->dbOid == rlocator->dbOid) &&
+ (BufTagGetRelNumber(tag) == rlocator->relNumber);
+}
+
+
/*
* The shared buffer mapping table is partitioned to reduce contention.
* To determine which partition lock a given tag requires, compute the tag's
--
1.8.3.1
On Sat, Jul 30, 2022 at 1:59 AM Robert Haas <robertmhaas@gmail.com> wrote:
One solution to all this is to do as Dilip proposes here: for system
relations, keep assigning the OID as the initial relfilenumber.
Actually, we really only need to do this for pg_largeobject; all the
other relfilenumber values could be assigned from a counter, as long
as they're assigned from a range distinct from what we use for user
relations.

But I don't really like that, because I feel like the whole thing
where we start out with relfilenumber=oid is a recipe for hidden bugs.
I believe we'd be better off if we decouple those concepts more
thoroughly. So here's another idea: what if we set the
next-relfilenumber counter for the new cluster to the value from the
old cluster, and then rewrote all the (thus-far-empty) system tables?
You mean in a new cluster start the next-relfilenumber counter from
the highest relfilenode/Oid value in the old cluster, right? Yeah, if
we start next-relfilenumber after the range of the old cluster then we
can also avoid the logic of SetNextRelFileNumber() during upgrade.
My very initial idea around this was to start the next-relfilenumber
directly from the 4 billion in the new cluster so there can not be any
conflict and we don't even need to identify the highest value of used
relfilenode in the old cluster. In fact we don't need to rewrite the
system table before upgrading, I think. So what do we lose with this?
Just 4 billion relfilenumbers? Does that really matter, given the range
we get with the 56-bit relfilenumber?
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Thu, Aug 4, 2022 at 5:01 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Sat, Jul 30, 2022 at 1:59 AM Robert Haas <robertmhaas@gmail.com> wrote:
One solution to all this is to do as Dilip proposes here: for system
relations, keep assigning the OID as the initial relfilenumber.
Actually, we really only need to do this for pg_largeobject; all the
other relfilenumber values could be assigned from a counter, as long
as they're assigned from a range distinct from what we use for user
relations.

But I don't really like that, because I feel like the whole thing
where we start out with relfilenumber=oid is a recipe for hidden bugs.
I believe we'd be better off if we decouple those concepts more
thoroughly. So here's another idea: what if we set the
next-relfilenumber counter for the new cluster to the value from the
old cluster, and then rewrote all the (thus-far-empty) system tables?

You mean in a new cluster start the next-relfilenumber counter from
the highest relfilenode/Oid value in the old cluster, right? Yeah, if
we start next-relfilenumber after the range of the old cluster then we
can also avoid the logic of SetNextRelFileNumber() during upgrade.

My very initial idea around this was to start the next-relfilenumber
directly from the 4 billion in the new cluster so there can not be any
conflict and we don't even need to identify the highest value of used
relfilenode in the old cluster. In fact we don't need to rewrite the
system table before upgrading I think. So what do we lose with this?
just 4 billion relfilenode? does that really matter provided the range
we get with the 56 bits relfilenumber.
I think even if we start the range from the 4 billion we can not avoid
keeping two separate ranges for system and user tables otherwise the
next upgrade where old and new clusters both have 56 bits
relfilenumber will get conflicting files. And, for the same reason we
still have to call SetNextRelFileNumber() during upgrade.
So the idea is, we will be having 2 ranges for relfilenumbers, system
range will start from 4 billion and user range maybe something around
4.1 (I think we can keep it very small though, just reserve 50k
relfilenumber for system for future expansion and start user range
from there).
So now system tables have no issues and also the user tables from the
old cluster have no issues. But pg_largeobject might get conflict
when both old and new cluster are using 56 bits relfilenumber, because
it is possible that in the new cluster some other system table gets
that relfilenumber which is used by pg_largeobject in the old cluster.
This could be resolved if we allocate pg_largeobject's relfilenumber
from the user range, that means this relfilenumber will always be the
first value from the user range. So now if the old and new cluster
both are using 56bits relfilenumber then pg_largeobject in both
cluster would have got the same relfilenumber and if the old cluster
is using the current 32 bits relfilenode system then the whole range
of the new cluster is completely different than that of the old
cluster.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Fri, Aug 5, 2022 at 3:25 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
I think even if we start the range from the 4 billion we can not avoid
keeping two separate ranges for system and user tables otherwise the
next upgrade where old and new clusters both have 56 bits
relfilenumber will get conflicting files. And, for the same reason we
still have to call SetNextRelFileNumber() during upgrade.
Well, my proposal to move everything from the new cluster up to higher
numbers would address this without requiring two ranges.
So the idea is, we will be having 2 ranges for relfilenumbers, system
range will start from 4 billion and user range maybe something around
4.1 (I think we can keep it very small though, just reserve 50k
relfilenumber for system for future expansion and start user range
from there).
A disadvantage of this is that it basically means all the file names
in new clusters are going to be 10 characters long. That's not a big
disadvantage, but it's not wonderful. File names that are only 5-7
characters long are common today, and easier to remember.
So now system tables have no issues and also the user tables from the
old cluster have no issues. But pg_largeobject might get conflict
when both old and new cluster are using 56 bits relfilenumber, because
it is possible that in the new cluster some other system table gets
that relfilenumber which is used by pg_largeobject in the old cluster.

This could be resolved if we allocate pg_largeobject's relfilenumber
from the user range, that means this relfilenumber will always be the
first value from the user range. So now if the old and new cluster
both are using 56bits relfilenumber then pg_largeobject in both
cluster would have got the same relfilenumber and if the old cluster
is using the current 32 bits relfilenode system then the whole range
of the new cluster is completely different than that of the old
cluster.
I think this can work, but it does rely to some extent on the fact
that there are no other tables which need to be treated like
pg_largeobject. If there were others, they'd need fixed starting
RelFileNumber assignments, or some other trick, like renumbering them
twice in the cluster, first to a known-unused value and then back to
the proper value. You'd have trouble if in the other cluster
pg_largeobject was 4bn+1 and pg_largeobject2 was 4bn+2 and in the new
cluster the reverse, without some hackery.
I do feel like your idea here has some advantages - my proposal
requires rewriting all the catalogs in the new cluster before we do
anything else, and that's going to take some time even though they
should be small. But I also feel like it has some disadvantages: it
seems to rely on complicated reasoning and special cases more than I'd
like.
What do other people think?
--
Robert Haas
EDB: http://www.enterprisedb.com
On Tue, Aug 9, 2022 at 8:51 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Aug 5, 2022 at 3:25 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
I think even if we start the range from the 4 billion we can not avoid
keeping two separate ranges for system and user tables otherwise the
next upgrade where old and new clusters both have 56 bits
relfilenumber will get conflicting files. And, for the same reason we
still have to call SetNextRelFileNumber() during upgrade.

Well, my proposal to move everything from the new cluster up to higher
numbers would address this without requiring two ranges.

So the idea is, we will be having 2 ranges for relfilenumbers, system
range will start from 4 billion and user range maybe something around
4.1 (I think we can keep it very small though, just reserve 50k
relfilenumber for system for future expansion and start user range
from there).

A disadvantage of this is that it basically means all the file names
in new clusters are going to be 10 characters long. That's not a big
disadvantage, but it's not wonderful. File names that are only 5-7
characters long are common today, and easier to remember.
That's correct.
So now system tables have no issues and also the user tables from the
old cluster have no issues. But pg_largeobject might get conflict
when both old and new cluster are using 56 bits relfilenumber, because
it is possible that in the new cluster some other system table gets
that relfilenumber which is used by pg_largeobject in the old cluster.

This could be resolved if we allocate pg_largeobject's relfilenumber
from the user range, that means this relfilenumber will always be the
first value from the user range. So now if the old and new cluster
both are using 56bits relfilenumber then pg_largeobject in both
cluster would have got the same relfilenumber and if the old cluster
is using the current 32 bits relfilenode system then the whole range
of the new cluster is completely different than that of the old
cluster.

I think this can work, but it does rely to some extent on the fact
that there are no other tables which need to be treated like
pg_largeobject. If there were others, they'd need fixed starting
RelFileNumber assignments, or some other trick, like renumbering them
twice in the cluster, first to a known-unused value and then back to
the proper value. You'd have trouble if in the other cluster
pg_largeobject was 4bn+1 and pg_largeobject2 was 4bn+2 and in the new
cluster the reverse, without some hackery.
Agreed, if there are more catalogs like pg_largeobject then it would
require some hacking.
I do feel like your idea here has some advantages - my proposal
requires rewriting all the catalogs in the new cluster before we do
anything else, and that's going to take some time even though they
should be small. But I also feel like it has some disadvantages: it
seems to rely on complicated reasoning and special cases more than I'd
like.
One other advantage of your approach: since we start the
"nextrelfilenumber" after the old cluster's relfilenumber range, we
only need to set it once at the beginning. After that, while upgrading
each object, we don't need to set nextrelfilenumber every time, because
it is already higher than the entire old cluster range. In the other 2
approaches we would have to set the nextrelfilenumber every time we
preserve a relfilenumber during the upgrade.
Other than these two approaches we have a third one (what the patch set
is already doing), where we keep the system relfilenumber range the
same as the Oid. I know you have already pointed out that this might
have some hidden bugs, but one advantage of this approach is that it is
simpler than the above two: it doesn't need to maintain two ranges, and
it also doesn't need to rewrite all system tables in the new cluster.
So I think it would be good if we can get others' opinions on all 3
approaches.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Thu, Aug 11, 2022 at 10:58 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Tue, Aug 9, 2022 at 8:51 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Aug 5, 2022 at 3:25 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
I think even if we start the range from the 4 billion we can not avoid
keeping two separate ranges for system and user tables otherwise the
next upgrade where old and new clusters both have 56 bits
relfilenumber will get conflicting files. And, for the same reason we
still have to call SetNextRelFileNumber() during upgrade.

Well, my proposal to move everything from the new cluster up to higher
numbers would address this without requiring two ranges.

So the idea is, we will be having 2 ranges for relfilenumbers, system
range will start from 4 billion and user range maybe something around
4.1 (I think we can keep it very small though, just reserve 50k
relfilenumber for system for future expansion and start user range
from there).

A disadvantage of this is that it basically means all the file names
in new clusters are going to be 10 characters long. That's not a big
disadvantage, but it's not wonderful. File names that are only 5-7
characters long are common today, and easier to remember.

That's correct.
So now system tables have no issues and also the user tables from the
old cluster have no issues. But pg_largeobject might get conflict
when both old and new cluster are using 56 bits relfilenumber, because
it is possible that in the new cluster some other system table gets
that relfilenumber which is used by pg_largeobject in the old cluster.

This could be resolved if we allocate pg_largeobject's relfilenumber
from the user range, that means this relfilenumber will always be the
first value from the user range. So now if the old and new cluster
both are using 56bits relfilenumber then pg_largeobject in both
cluster would have got the same relfilenumber and if the old cluster
is using the current 32 bits relfilenode system then the whole range
of the new cluster is completely different than that of the old
cluster.

I think this can work, but it does rely to some extent on the fact
that there are no other tables which need to be treated like
pg_largeobject. If there were others, they'd need fixed starting
RelFileNumber assignments, or some other trick, like renumbering them
twice in the cluster, first two a known-unused value and then back to
the proper value. You'd have trouble if in the other cluster
pg_largeobject was 4bn+1 and pg_largeobject2 was 4bn+2 and in the new
cluster the reverse, without some hackery.

Agree, if it has more catalog like pg_largeobject then it would
require some hacking.

I do feel like your idea here has some advantages - my proposal
requires rewriting all the catalogs in the new cluster before we do
anything else, and that's going to take some time even though they
should be small. But I also feel like it has some disadvantages: it
seems to rely on complicated reasoning and special cases more than I'd
like.

One other advantage with your approach is that since we are starting
the "nextrelfilenumber" after the old cluster's relfilenumber range.
So only at the beginning we need to set the "nextrelfilenumber" but
after that while upgrading each object we don't need to set the
nextrelfilenumber every time because that is already higher than the
complete old cluster range. In other 2 approaches we will have to try
to set the nextrelfilenumber everytime we preserve the relfilenumber
during upgrade.
I was also wondering whether we should get the max "relfilenumber" from
the old cluster at the cluster level or per database. If we want it per
database, we can get it with a simple query on pg_class, but even then
we will need to see how to handle the mapped relations if they have
been rewritten. I don't think we can get the max relfilenumber from the
old cluster at the cluster level. Maybe in newer versions we can expose
a function from the server that just returns the NextRelFileNumber, and
that would be the max relfilenumber, but I'm not sure how to do that in
an old version.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Sat, Jul 30, 2022 at 1:59 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Jul 20, 2022 at 7:27 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
There was also an issue where the user table from the old cluster's
relfilenode could conflict with the system table of the new cluster.
As a solution currently for system table object (while creating
storage first time) we are keeping the low range of relfilenumber,
basically we are using the same relfilenumber as OID so that during
upgrade the normal user table from the old cluster will not conflict
with the system tables in the new cluster. But with this solution
Robert told me (in off list chat) a problem that in future if we want
to make relfilenumber completely unique within a cluster by
implementing the CREATEDB differently then we can not do that as we
have created fixed relfilenodes for the system tables.

I am not sure what exactly we can do to avoid that because even if we
do something to avoid that in the new cluster the old cluster might
be already using the non-unique relfilenode so after upgrading the new
cluster will also get those non-unique relfilenode.

I think this aspect of the patch could use some more discussion.
To recap, the problem is that pg_upgrade mustn't discover that a
relfilenode that is being migrated from the old cluster is being used
for some other table in the new cluster. Since the new cluster should
only contain system tables that we assume have never been rewritten,
they'll all have relfilenodes equal to their OIDs, and thus less than
16384. On the other hand all the user tables from the old cluster will
have relfilenodes greater than 16384, so we're fine. pg_largeobject,
which also gets migrated, is a special case. Since we don't change OID
assignments from version to version, it should have either the same
relfilenode value in the old and new clusters, if never rewritten, or
else the value in the old cluster will be greater than 16384, in which
case no conflict is possible.

But if we just assign all relfilenode values from a central counter,
then we have got trouble. If the new version has more system catalog
tables than the old version, then some value that got used for a user
table in the old version might get used for a system table in the new
version, which is a problem. One idea for fixing this is to have two
RelFileNumber ranges: a system range (small values) and a user range.
System tables get values in the system range initially, and in the
user range when first rewritten. User tables always get values in the
user range. Everything works fine in this scenario except maybe for
pg_largeobject: what if it gets one value from the system range in the
old cluster, and a different value from the system range in the new
cluster, but some other system table in the new cluster gets the value
that pg_largeobject had in the old cluster? Then we've got trouble.
To solve that problem, how about rewriting the system table in the new
cluster which has a conflicting relfilenode? I think we can probably
do this conflict checking before processing the tables from the old
cluster.
--
With Regards,
Amit Kapila.
On Mon, Aug 22, 2022 at 1:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
To solve that problem, how about rewriting the system table in the new
cluster which has a conflicting relfilenode? I think we can probably
do this conflict checking before processing the tables from the old
cluster.
I think while rewriting a system table during the upgrade, we need to
ensure that it gets a relfilenumber from the system range; otherwise, if
we allocate it from the user range, there will be a chance of conflict
with the user tables from the old cluster. Another way could be to set
the next-relfilenumber counter for the new cluster to the value from
the old cluster, as mentioned by Robert in his previous email [1].
[1]: /messages/by-id/CA+TgmoYsNiF8JGZ+Kp7Zgcct67Qk++YAp+1ybOQ0qomUayn+7A@mail.gmail.com
--
With Regards,
Amit Kapila.
On Mon, Aug 22, 2022 at 3:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
To solve that problem, how about rewriting the system table in the new
cluster which has a conflicting relfilenode? I think we can probably
do this conflict checking before processing the tables from the old
cluster.
Thanks for chiming in.
Right now, there are two parts to the relfilenumber preservation
system, and this scheme doesn't quite fit into either of them. First,
the dump includes commands to set pg_class.relfilenode in the new
cluster to the same value that it had in the old cluster. The dump
can't include any SQL commands that depend on what's happening in the
new cluster because pg_dump(all) only connects to a single cluster,
which in this case is the old cluster. Second, pg_upgrade itself
copies the files from the old cluster to the new cluster. This doesn't
involve a database connection at all. So there's no part of the
current relfilenode preservation mechanism that can look at the old
AND the new database and decide on some SQL to execute against the new
database.
I thought for a while that we could use the information that's already
gathered by get_rel_infos() to do what you're suggesting here, but it
doesn't quite work, because that function excludes system tables, and
we can't afford to do that here. We'd either need to modify that query
to include system tables - at least for the new cluster - or run a
separate one to gather information about system tables in the new
cluster. Then, we could put all the pg_class.relfilenode values we
found in the new cluster into a hash table, loop over the list of rels
this function found in the old cluster, and for each one, probe into
the hash table. If we find a match, that's a system table that needs
to be moved out of the way before calling create_new_objects(), or
maybe inside that function but before it runs pg_restore.
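The probe described above can be pictured with a short, self-contained toy: build a set of the new cluster's relfilenode values, then check each old-cluster rel against it. This is an illustrative sketch only (a fixed-size open-addressing set, hypothetical names), not actual pg_upgrade code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t RelFileNumber;

#define HASH_SLOTS 1024			/* toy open-addressing hash set */

typedef struct
{
	RelFileNumber slots[HASH_SLOTS];
	bool		used[HASH_SLOTS];
} RelFileNumberSet;

static void
set_init(RelFileNumberSet *set)
{
	memset(set->used, 0, sizeof(set->used));
}

static void
set_add(RelFileNumberSet *set, RelFileNumber n)
{
	size_t		i = n % HASH_SLOTS;

	while (set->used[i] && set->slots[i] != n)
		i = (i + 1) % HASH_SLOTS;
	set->slots[i] = n;
	set->used[i] = true;
}

static bool
set_contains(const RelFileNumberSet *set, RelFileNumber n)
{
	size_t		i = n % HASH_SLOTS;

	while (set->used[i])
	{
		if (set->slots[i] == n)
			return true;
		i = (i + 1) % HASH_SLOTS;
	}
	return false;
}

/*
 * Collect old-cluster relfilenumbers that collide with a relfilenode already
 * present in the new cluster; each hit identifies a new-cluster system table
 * that would need to be moved out of the way before pg_restore runs.
 */
static size_t
find_conflicts(const RelFileNumber *new_rels, size_t nnew,
			   const RelFileNumber *old_rels, size_t nold,
			   RelFileNumber *conflicts)
{
	RelFileNumberSet set;
	size_t		nconflicts = 0;

	set_init(&set);
	for (size_t i = 0; i < nnew; i++)
		set_add(&set, new_rels[i]);
	for (size_t i = 0; i < nold; i++)
		if (set_contains(&set, old_rels[i]))
			conflicts[nconflicts++] = old_rels[i];
	return nconflicts;
}
```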
That doesn't seem too crazy, I think. It's a little bit of new
mechanism, but it doesn't sound horrific. It's got the advantage of
being significantly cheaper than my proposal of moving everything out
of the way unconditionally, and at the same time it retains one of the
key advantages of that proposal - IMV, anyway - which is that we don't
need separate relfilenode ranges for user and system objects any more.
So I guess on balance I kind of like it, but maybe I'm missing
something.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Tue, Aug 23, 2022 at 1:46 AM Robert Haas <robertmhaas@gmail.com> wrote:
That doesn't seem too crazy, I think. It's a little bit of new
mechanism, but it doesn't sound horrific. It's got the advantage of
being significantly cheaper than my proposal of moving everything out
of the way unconditionally, and at the same time it retains one of the
key advantages of that proposal - IMV, anyway - which is that we don't
need separate relfilenode ranges for user and system objects any more.
So I guess on balance I kind of like it, but maybe I'm missing
something.
Okay, so this seems exactly the same as your previous proposal, but
instead of unconditionally rewriting all the system tables we will
rewrite only those that conflict with a user table or pg_largeobject
from the previous cluster. Although it might have additional
implementation complexity on the pg_upgrade side, it seems cheaper
than rewriting everything.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, Aug 23, 2022 at 8:33 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
Okay, so this seems exactly the same as your previous proposal but
instead of unconditionally rewriting all the system tables we will
rewrite only those conflict with the user table or pg_largeobject from
the previous cluster. Although it might have additional
implementation complexity on the pg upgrade side, it seems cheaper
than rewriting everything.
OTOH, if we keep the two separate ranges for the user and system tables
then we don't need all this complex logic of conflict checking. From
the old cluster, we can just remember the relfilenumber of
pg_largeobject, and in the new cluster before trying to restore we can
just query the new cluster's pg_class and find out whether it is used by
any system table, and if so then we can just rewrite that system table.
And I think using two ranges might not be that complicated, because as
soon as we are done with the initdb we can just set NextRelFileNumber
to the first user-range relfilenumber, so I think this could be the
simplest solution. And I think what Amit is suggesting is something
along this line?
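The "jump the counter right after initdb" idea above can be pictured with a tiny sketch. The constant name `FirstNormalRelFileNumber` follows the naming pattern in the thread, but its value and the helper names here are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t RelFileNumber;

/* Assumed cutoff between the system range and the user range. */
#define FirstNormalRelFileNumber ((RelFileNumber) 100000)

static RelFileNumber NextRelFileNumber = 1;

/*
 * Called once at the end of cluster initialization: leave the rest of the
 * system range unused and start handing out user-range numbers.
 */
static void
LeaveBootstrapRelFileNumberRange(void)
{
	if (NextRelFileNumber < FirstNormalRelFileNumber)
		NextRelFileNumber = FirstNormalRelFileNumber;
}

static RelFileNumber
GetNewRelFileNumber(void)
{
	return NextRelFileNumber++;
}
```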
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, Aug 23, 2022 at 11:36 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
OTOH, if we keep the two separate ranges for the user and system table
then we don't need all this complex logic of conflict checking. From
the old cluster, we can just remember the relfilenumber of the
pg_largeobject, and in the new cluster before trying to restore we can
just query the new cluster pg_class and find out whether it is used by
any system table and if so then we can just rewrite that system table.
Before re-write of that system table, I think we need to set
NextRelFileNumber to a number greater than the max relfilenumber from
the old cluster, otherwise, it can later conflict with some user
table.
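The safety step just described (pushing the counter past everything the old cluster used, before any rewrite allocates a new number) amounts to a one-line clamp. A hedged sketch with illustrative names, not actual code:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Before rewriting a conflicting system table in the new cluster, make sure
 * freshly allocated relfilenumbers cannot collide with any value migrated
 * from the old cluster.
 */
static uint64_t
advance_past_old_cluster(uint64_t next_relfilenumber, uint64_t old_cluster_max)
{
	if (next_relfilenumber <= old_cluster_max)
		next_relfilenumber = old_cluster_max + 1;
	return next_relfilenumber;
}
```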
And I think using two ranges might not be that complicated because as
soon as we are done with the initdb we can just set NextRelFileNumber
to the first user range relfilenumber so I think this could be the
simplest solution. And I think what Amit is suggesting is something
on this line?
Yeah, I had thought of checking only pg_largeobject. I think the
advantage of having separate ranges is that we have a somewhat simpler
logic in the upgrade but OTOH the other scheme has the advantage of
having a single allocation scheme. Do we see any other pros/cons of
one over the other?
One more thing we may want to think about is what if there are tables
created by an extension? For example, I think BDR creates some tables
like node_group, conflict_history, etc. Now, I think if such an
extension is created by default, both old and new clusters will have
those tables. Isn't there a chance of relfilenumber conflict in such
cases?
--
With Regards,
Amit Kapila.
On Tue, Aug 23, 2022 at 3:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Before re-write of that system table, I think we need to set
NextRelFileNumber to a number greater than the max relfilenumber from
the old cluster, otherwise, it can later conflict with some user
table.
Yes we will need to do that.
Yeah, I had thought of checking only pg_largeobject. I think the
advantage of having separate ranges is that we have a somewhat simpler
logic in the upgrade but OTOH the other scheme has the advantage of
having a single allocation scheme. Do we see any other pros/cons of
one over the other?
I feel having a separate range is not much different from having a
single allocation scheme; after cluster initialization, we will just
have to set the NextRelFileNumber to something called
FirstNormalRelFileNumber, which looks fine to me.
One more thing we may want to think about is what if there are tables
created by extension? For example, I think BDR creates some tables
like node_group, conflict_history, etc. Now, I think if such an
extension is created by default, both old and new clusters will have
those tables. Isn't there a chance of relfilenumber conflict in such
cases?
Shouldn't they behave as normal user tables? Because before the upgrade
the new cluster cannot have any tables other than system tables, and
tables created by an extension should also be restored the same way
other user tables are.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, Aug 23, 2022 at 2:06 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
OTOH, if we keep the two separate ranges for the user and system table
then we don't need all this complex logic of conflict checking.
True. That's the downside. The question is whether it's worth adding
some complexity to avoid needing separate ranges.
Honestly, if we don't care about having separate ranges, we can do
something even simpler and just make the starting relfilenumber for
system tables same as the OID. Then we don't have to do anything at
all, outside of not changing the OID assigned to pg_largeobject in a
future release. Then as long as pg_upgrade is targeting a new cluster
with completely fresh databases that have not had any system table
rewrites so far, there can't be any conflict.
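The no-conflict argument in this paragraph rests on the two ranges being disjoint: a never-rewritten catalog keeps relfilenode equal to its OID, below FirstNormalObjectId (16384), while everything migrated from the old cluster is at or above that boundary. A small sketch of the invariant (illustrative code, not PostgreSQL's):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* PostgreSQL's first non-reserved OID. */
#define FirstNormalObjectId 16384u

/*
 * A never-rewritten catalog has relfilenode == OID < FirstNormalObjectId;
 * migrated user rels are >= FirstNormalObjectId, so under this scheme the
 * two sets cannot intersect.
 */
static bool
is_catalog_range(uint32_t relfilenode)
{
	return relfilenode < FirstNormalObjectId;
}
```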
And perhaps that is the best solution after all, but while it is
simple in terms of code, I feel it's a bit complicated for human
beings. It's very simple to understand the scheme that Amit proposed:
if there's anything in the new cluster that would conflict, we move it
out of the way. We don't have to assume the new cluster hasn't had any
table rewrites. We don't have to nail down starting relfilenumber
assignments for system tables. We don't have to worry about
relfilenumber or OID assignments changing between releases.
pg_largeobject is not a special case. There are no special ranges of
OIDs or relfilenumbers required. It just straight up works -- all the
time, no matter what, end of story.
The other schemes we're talking about here all require a bunch of
assumptions about stuff like what I just mentioned. We can certainly
do it that way, and maybe it's even for the best. But I feel like it's
a little bit fragile. Maybe some future change gets blocked because it
would break one of the assumptions that the system relies on, or maybe
someone doesn't even realize there's an issue and changes something
that introduces a bug into this system. Or on the other hand maybe
not. But I think there's at least some value in considering whether
adding a little more code might actually make things simpler to reason
about, and whether that might be a good enough reason to do it.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Tue, Aug 23, 2022 at 8:00 PM Robert Haas <robertmhaas@gmail.com> wrote:
And perhaps that is the best solution after all, but while it is
simple in terms of code, I feel it's a bit complicated for human
beings. It's very simple to understand the scheme that Amit proposed:
if there's anything in the new cluster that would conflict, we move it
out of the way. We don't have to assume the new cluster hasn't had any
table rewrites. We don't have to nail down starting relfilenumber
assignments for system tables. We don't have to worry about
relfilenumber or OID assignments changing between releases.
pg_largeobject is not a special case. There are no special ranges of
OIDs or relfilenumbers required. It just straight up works -- all the
time, no matter what, end of story.
This sounds simple to understand. It seems we always create new system
tables in the new cluster before the upgrade, so I think it is safe to
assume there won't be any table rewrite in it. OTOH, if the
relfilenumber allocation scheme is robust to deal with table rewrites
then we probably don't need to worry about this assumption changing in
the future.
--
With Regards,
Amit Kapila.
On Tue, Aug 23, 2022 at 3:28 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
Shouldn't they behave as normal user tables? Because before the upgrade
the new cluster cannot have any tables other than system tables, and
tables created by an extension should also be restored the same way
other user tables are.
Right.
--
With Regards,
Amit Kapila.
On Mon, Aug 1, 2022 at 7:57 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
I have fixed other comments, and also fixed comments from Alvaro to
use %lld instead of INT64_FORMAT inside the ereport and wherever he
suggested.
Notwithstanding the ongoing discussion about the exact approach for
the main patch, it seemed OK to push the preparatory patch you posted
here, so I have now done that.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Tue, Aug 23, 2022 at 8:00 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Aug 23, 2022 at 2:06 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
OTOH, if we keep the two separate ranges for the user and system table
then we don't need all this complex logic of conflict checking.

True. That's the downside. The question is whether it's worth adding
some complexity to avoid needing separate ranges.
Other than the complexity, we will have to check the relfilenumbers of
all the user tables from the old cluster against the hash built on
the new cluster's relfilenumbers; isn't that extra overhead if there are
a lot of user tables? But I think we are already restoring all those
tables in the new cluster, so compared to that it will be very small.
Honestly, if we don't care about having separate ranges, we can do
something even simpler and just make the starting relfilenumber for
system tables same as the OID. Then we don't have to do anything at
all, outside of not changing the OID assigned to pg_largeobject in a
future release. Then as long as pg_upgrade is targeting a new cluster
with completely fresh databases that have not had any system table
rewrites so far, there can't be any conflict.
I think having the OID-based system and having two ranges are not
exactly the same. Because if we have the OID-based relfilenumber
allocation for system tables (initially) and then later allocate from
the NextRelFileNumber counter, it seems like a mix of the old system
(where OID and relfilenumber are tightly connected) and the
new system where NextRelFileNumber is a completely independent counter.
OTOH, having two ranges means logically we are not making it dependent
on the OID; we are just allocating from a central counter, but after
catalog initialization we will leave some gap and start from a new
range. So I don't think this system is hard to explain.
And perhaps that is the best solution after all, but while it is
simple in terms of code, I feel it's a bit complicated for human
beings. It's very simple to understand the scheme that Amit proposed:
if there's anything in the new cluster that would conflict, we move it
out of the way. We don't have to assume the new cluster hasn't had any
table rewrites. We don't have to nail down starting relfilenumber
assignments for system tables. We don't have to worry about
relfilenumber or OID assignments changing between releases.
pg_largeobject is not a special case. There are no special ranges of
OIDs or relfilenumbers required. It just straight up works -- all the
time, no matter what, end of story.
I agree that this system is easy to explain: we just
rewrite anything that conflicts, so it looks more future-proof. Okay, I
will try this solution and post the patch.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Thu, Aug 25, 2022 at 5:26 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
I agree that this system is easy to explain: we just
rewrite anything that conflicts, so it looks more future-proof. Okay, I
will try this solution and post the patch.
While working on this solution I noticed one issue. Basically, the
problem is that during binary upgrade, when we try to rewrite a heap, we
would expect that “binary_upgrade_next_heap_pg_class_oid” and
“binary_upgrade_next_heap_pg_class_relfilenumber” are already set for
creating a new heap. But we are not preserving anything, so we don't
have those values. One option for this problem is that we can first
start the postmaster in non-binary-upgrade mode, perform all conflict
checking and rewriting, and stop the postmaster. Then start the
postmaster again and perform the restore as we are doing now. Although
we will have to start/stop the postmaster one extra time, we have a
solution.
But while thinking about this I started to think that since we are now
completely decoupling the concepts of Oid and relfilenumber,
logically during REWRITE we should only be allocating a new
relfilenumber; we don't really need to allocate a new Oid at all.
Yeah, we can do that: if inside make_new_heap() we pass the
OIDOldHeap to heap_create_with_catalog(), then it will just create new
storage (relfilenumber) but not a new Oid. But the problem is that the
ATRewriteTable() and finish_heap_swap() functions are completely based
on the relation cache. So now if we only create a new relfilenumber
but not a new Oid then we will have to change this infrastructure to
copy at the smgr level.
Thoughts?
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Fri, Aug 26, 2022 at 7:01 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
While working on this solution I noticed one issue. Basically, the
problem is that during binary upgrade when we try to rewrite a heap we
would expect that “binary_upgrade_next_heap_pg_class_oid” and
“binary_upgrade_next_heap_pg_class_relfilenumber” are already set for
creating a new heap. But we are not preserving anything so we don't
have those values. One option to this problem is that we can first
start the postmaster in non-binary upgrade mode perform all conflict
checking and rewrite and stop the postmaster. Then start postmaster
again and perform the restore as we are doing now. Although we will
have to start/stop the postmaster one extra time we have a solution.
Yeah, that seems OK. Or we could add a new function, like
binary_upgrade_allow_relation_oid_and_relfilenode_assignment(bool).
Not sure which way is better.
But while thinking about this I started to think that, since we are now
completely decoupling the concepts of Oid and relfilenumber, then
logically during a REWRITE we should only be allocating a new
relfilenumber; we don't really need to allocate a new Oid at all.
Yeah, we can do that: if, inside make_new_heap(), we pass the
OIDOldHeap to heap_create_with_catalog(), it will just create new
storage (a new relfilenumber) but not a new Oid. But the problem is that the
ATRewriteTable() and finish_heap_swap() functions are completely based
on the relation cache. So if we only create a new relfilenumber
and not a new Oid, we will have to change this infrastructure to
copy at the smgr level.
I think it would be a good idea to continue preserving the OIDs. If
nothing else, it makes debugging way easier, but also, there might be
user-defined regclass columns or something. Note the comments in
check_for_reg_data_type_usage().
--
Robert Haas
EDB: http://www.enterprisedb.com
On Fri, Aug 26, 2022 at 9:33 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Aug 26, 2022 at 7:01 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
While working on this solution I noticed one issue. Basically, the
problem is that during binary upgrade, when we try to rewrite a heap, we
would expect that “binary_upgrade_next_heap_pg_class_oid” and
“binary_upgrade_next_heap_pg_class_relfilenumber” are already set for
creating a new heap. But we are not preserving anything, so we don't
have those values. One option for solving this problem is that we can first
start the postmaster in non-binary upgrade mode, perform all conflict
checking and rewriting, and stop the postmaster. Then start the postmaster
again and perform the restore as we are doing now. Although we will
have to start/stop the postmaster one extra time, we have a solution.
Yeah, that seems OK. Or we could add a new function, like
binary_upgrade_allow_relation_oid_and_relfilenode_assignment(bool).
Not sure which way is better.
I have found one more issue with this approach of rewriting the
conflicting table. Earlier I thought we could do the conflict
checking and rewriting inside create_new_objects() right before the
restore command. But after implementing this (while testing), I
realized that we DROP and CREATE the database while restoring the dump,
which means it will again generate the conflicting system tables. So
theoretically the rewriting should go in between the CREATE DATABASE
and the restore of the objects, but as of now both the CREATE DATABASE
and the restore of other objects are part of a single dump file. I haven't yet
analyzed how feasible it is to generate the dump in two parts, the first
part just to create the database and the second part to restore the rest
of the objects.
Thoughts?
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, Aug 30, 2022 at 8:45 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
I have found one more issue with this approach of rewriting the
conflicting table. Earlier I thought we could do the conflict
checking and rewriting inside create_new_objects() right before the
restore command. But after implementing this (while testing), I
realized that we DROP and CREATE the database while restoring the dump,
which means it will again generate the conflicting system tables. So
theoretically the rewriting should go in between the CREATE DATABASE
and the restore of the objects, but as of now both the CREATE DATABASE
and the restore of other objects are part of a single dump file. I haven't yet
analyzed how feasible it is to generate the dump in two parts, the first
part just to create the database and the second part to restore the rest
of the objects.
Thoughts?
Well, that's very awkward. It doesn't seem like it would be very
difficult to teach pg_upgrade to call pg_restore without --clean and
just do the drop database itself, but that doesn't really help,
because pg_restore will in any event be creating the new database.
That doesn't seem like something we can practically refactor out,
because only pg_dump knows what properties to use when creating the
new database. What we could do is have the dump include a command like
SELECT pg_binary_upgrade_move_things_out_of_the_way(some_arguments_here),
but that doesn't really help very much, because passing the whole list
of relfilenode values from the old database seems pretty certain to be
a bad idea. The whole idea here was that we'd be able to build a hash
table on the new database's system table OIDs, and it seems like
that's not going to work.
We could try to salvage some portion of the idea by making
pg_binary_upgrade_move_things_out_of_the_way() take a more restricted
set of arguments, like the smallest and largest relfilenode values
from the old database, and then we'd just need to move things that
overlap. But that feels pretty hit-or-miss to me as to whether it
actually avoids any work, and
pg_binary_upgrade_move_things_out_of_the_way() might also be annoying
to write. So perhaps we have to go back to the drawing board here.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Tue, Aug 30, 2022 at 9:23 PM Robert Haas <robertmhaas@gmail.com> wrote:
Well, that's very awkward. It doesn't seem like it would be very
difficult to teach pg_upgrade to call pg_restore without --clean and
just do the drop database itself, but that doesn't really help,
because pg_restore will in any event be creating the new database.
That doesn't seem like something we can practically refactor out,
because only pg_dump knows what properties to use when creating the
new database. What we could do is have the dump include a command like
SELECT pg_binary_upgrade_move_things_out_of_the_way(some_arguments_here),
but that doesn't really help very much, because passing the whole list
of relfilenode values from the old database seems pretty certain to be
a bad idea. The whole idea here was that we'd be able to build a hash
table on the new database's system table OIDs, and it seems like
that's not going to work.
Right.
We could try to salvage some portion of the idea by making
pg_binary_upgrade_move_things_out_of_the_way() take a more restricted
set of arguments, like the smallest and largest relfilenode values
from the old database, and then we'd just need to move things that
overlap. But that feels pretty hit-or-miss to me as to whether it
actually avoids any work, and
pg_binary_upgrade_move_things_out_of_the_way() might also be annoying
to write. So perhaps we have to go back to the drawing board here.
So as of now, we have two open options: 1) the current approach,
which the patch follows, of using the Oid as the relfilenode for the system
tables when they are initially created; 2) calling
pg_binary_upgrade_move_things_out_of_the_way(), which would force a rewrite of all
the system tables.
Another idea, though I am not very sure how feasible it is: can we change
the dump such that in binary upgrade mode it will not use template0 as
the template database (in the CREATE DATABASE command) but instead some
new database as the template, e.g. template-XYZ? Later, for conflict
checking, we would create this template-XYZ database on the new cluster
and then perform all the conflict checks (from all the
databases of the old cluster) and rewrite operations on this database.
Later, all the databases would be created using template-XYZ as the
template, and all the rewriting we have done would still be intact.
The problems I can think of are: 1) we would have to change pg_dump
just for binary upgrade; 2) we would have to reserve another database
name, but what if that name is already in use in the old cluster?
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Sat, Sep 3, 2022 at 1:50 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Tue, Aug 30, 2022 at 9:23 PM Robert Haas <robertmhaas@gmail.com> wrote:
Well, that's very awkward. It doesn't seem like it would be very
difficult to teach pg_upgrade to call pg_restore without --clean and
just do the drop database itself, but that doesn't really help,
because pg_restore will in any event be creating the new database.
That doesn't seem like something we can practically refactor out,
because only pg_dump knows what properties to use when creating the
new database. What we could do is have the dump include a command like
SELECT pg_binary_upgrade_move_things_out_of_the_way(some_arguments_here),
but that doesn't really help very much, because passing the whole list
of relfilenode values from the old database seems pretty certain to be
a bad idea. The whole idea here was that we'd be able to build a hash
table on the new database's system table OIDs, and it seems like
that's not going to work.
Right.
We could try to salvage some portion of the idea by making
pg_binary_upgrade_move_things_out_of_the_way() take a more restricted
set of arguments, like the smallest and largest relfilenode values
from the old database, and then we'd just need to move things that
overlap. But that feels pretty hit-or-miss to me as to whether it
actually avoids any work, and
pg_binary_upgrade_move_things_out_of_the_way() might also be annoying
to write. So perhaps we have to go back to the drawing board here.
So as of now, we have two open options: 1) the current approach,
which the patch follows, of using the Oid as the relfilenode for the system
tables when they are initially created; 2) calling
pg_binary_upgrade_move_things_out_of_the_way(), which would force a rewrite of all
the system tables.
Another idea, though I am not very sure how feasible it is: can we change
the dump such that in binary upgrade mode it will not use template0 as
the template database (in the CREATE DATABASE command) but instead some
new database as the template, e.g. template-XYZ? Later, for conflict
checking, we would create this template-XYZ database on the new cluster
and then perform all the conflict checks (from all the
databases of the old cluster) and rewrite operations on this database.
Later, all the databases would be created using template-XYZ as the
template, and all the rewriting we have done would still be intact.
The problems I can think of are: 1) we would have to change pg_dump
just for binary upgrade; 2) we would have to reserve another database
name, but what if that name is already in use in the old cluster?
While we are still thinking on this issue, I have rebased the patch on
the latest head and fixed a couple of minor issues.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachments:
v16-0001-Widen-relfilenumber-from-32-bits-to-56-bits.patch (text/x-patch; charset=UTF-8)
From a645d39c6109269bb74ed8f0627e61ecf9b0535a Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Fri, 26 Aug 2022 10:20:18 +0530
Subject: [PATCH v16] Widen relfilenumber from 32 bits to 56 bits
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Currently the relfilenumber is 32 bits wide, which creates a risk of wraparound,
so a relfilenumber can be reused. To guard against relfilenumber reuse there
is a complicated hack that leaves a 0-length tombstone file around until the
next checkpoint, and when we allocate a new relfilenumber we also need to
loop to check for an on-disk conflict.
As part of this patch we make the relfilenumber 56 bits wide, with no
provision for wraparound. After this change we can get rid of the 0-length
tombstone file and the loop that checks for on-disk relfilenumber conflicts.
The reason for making it 56 bits wide rather than a full 64 bits is that a
64-bit relfilenumber would increase the size of the BufferTag, which would
increase memory usage and might also hurt performance. To avoid that, the
buffer tag uses 8 bits for the fork number and 56 bits for the
relfilenumber.
---
contrib/pg_buffercache/Makefile | 4 +-
.../pg_buffercache/pg_buffercache--1.3--1.4.sql | 30 ++++
contrib/pg_buffercache/pg_buffercache.control | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 39 ++++-
contrib/pg_prewarm/autoprewarm.c | 4 +-
contrib/pg_walinspect/expected/pg_walinspect.out | 4 +-
contrib/pg_walinspect/sql/pg_walinspect.sql | 4 +-
contrib/test_decoding/expected/rewrite.out | 2 +-
contrib/test_decoding/sql/rewrite.sql | 2 +-
doc/src/sgml/catalogs.sgml | 2 +-
doc/src/sgml/pgbuffercache.sgml | 2 +-
doc/src/sgml/storage.sgml | 5 +-
src/backend/access/gin/ginxlog.c | 2 +-
src/backend/access/rmgrdesc/gistdesc.c | 2 +-
src/backend/access/rmgrdesc/heapdesc.c | 2 +-
src/backend/access/rmgrdesc/nbtdesc.c | 2 +-
src/backend/access/rmgrdesc/seqdesc.c | 2 +-
src/backend/access/rmgrdesc/xlogdesc.c | 21 ++-
src/backend/access/transam/README | 5 +-
src/backend/access/transam/varsup.c | 183 ++++++++++++++++++++-
src/backend/access/transam/xlog.c | 57 +++++++
src/backend/access/transam/xlogprefetcher.c | 14 +-
src/backend/access/transam/xlogrecovery.c | 6 +-
src/backend/access/transam/xlogutils.c | 12 +-
src/backend/backup/basebackup.c | 2 +-
src/backend/catalog/catalog.c | 95 -----------
src/backend/catalog/heap.c | 25 +--
src/backend/catalog/index.c | 11 +-
src/backend/catalog/storage.c | 8 +
src/backend/commands/tablecmds.c | 12 +-
src/backend/commands/tablespace.c | 2 +-
src/backend/nodes/gen_node_support.pl | 4 +-
src/backend/replication/logical/decode.c | 1 +
src/backend/replication/logical/reorderbuffer.c | 2 +-
src/backend/storage/file/reinit.c | 28 ++--
src/backend/storage/freespace/fsmpage.c | 2 +-
src/backend/storage/lmgr/lwlocknames.txt | 1 +
src/backend/storage/smgr/md.c | 7 +
src/backend/storage/smgr/smgr.c | 2 +-
src/backend/utils/adt/dbsize.c | 7 +-
src/backend/utils/adt/pg_upgrade_support.c | 13 +-
src/backend/utils/cache/relcache.c | 2 +-
src/backend/utils/cache/relfilenumbermap.c | 65 +++-----
src/backend/utils/misc/pg_controldata.c | 9 +-
src/bin/pg_checksums/pg_checksums.c | 4 +-
src/bin/pg_controldata/pg_controldata.c | 2 +
src/bin/pg_dump/pg_dump.c | 26 +--
src/bin/pg_rewind/filemap.c | 6 +-
src/bin/pg_upgrade/info.c | 3 +-
src/bin/pg_upgrade/pg_upgrade.c | 6 +-
src/bin/pg_upgrade/relfilenumber.c | 4 +-
src/bin/pg_waldump/pg_waldump.c | 2 +-
src/bin/scripts/t/090_reindexdb.pl | 2 +-
src/common/relpath.c | 20 +--
src/fe_utils/option_utils.c | 39 +++++
src/include/access/transam.h | 35 ++++
src/include/access/xlog.h | 2 +
src/include/catalog/catalog.h | 3 -
src/include/catalog/pg_class.h | 16 +-
src/include/catalog/pg_control.h | 2 +
src/include/catalog/pg_proc.dat | 10 +-
src/include/common/relpath.h | 1 +
src/include/fe_utils/option_utils.h | 2 +
src/include/postgres_ext.h | 10 +-
src/include/storage/buf_internals.h | 62 ++++++-
src/include/storage/relfilelocator.h | 12 +-
src/test/regress/expected/alter_table.out | 24 ++-
src/test/regress/expected/fast_default.out | 4 +-
src/test/regress/expected/oidjoins.out | 2 +-
src/test/regress/sql/alter_table.sql | 8 +-
src/test/regress/sql/fast_default.sql | 4 +-
71 files changed, 685 insertions(+), 332 deletions(-)
create mode 100644 contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile
index d74b3e8..4d88eba 100644
--- a/contrib/pg_buffercache/Makefile
+++ b/contrib/pg_buffercache/Makefile
@@ -6,8 +6,8 @@ OBJS = \
pg_buffercache_pages.o
EXTENSION = pg_buffercache
-DATA = pg_buffercache--1.2.sql pg_buffercache--1.2--1.3.sql \
- pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql
+DATA = pg_buffercache--1.0--1.1.sql pg_buffercache--1.1--1.2.sql pg_buffercache--1.2.sql \
+ pg_buffercache--1.2--1.3.sql pg_buffercache--1.3--1.4.sql
PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time"
REGRESS = pg_buffercache
diff --git a/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
new file mode 100644
index 0000000..50956b1
--- /dev/null
+++ b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
@@ -0,0 +1,30 @@
+/* contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.4'" to load this file. \quit
+
+/* First we have to remove them from the extension */
+ALTER EXTENSION pg_buffercache DROP VIEW pg_buffercache;
+ALTER EXTENSION pg_buffercache DROP FUNCTION pg_buffercache_pages();
+
+/* Then we can drop them */
+DROP VIEW pg_buffercache;
+DROP FUNCTION pg_buffercache_pages();
+
+/* Now redefine */
+CREATE FUNCTION pg_buffercache_pages()
+RETURNS SETOF RECORD
+AS 'MODULE_PATHNAME', 'pg_buffercache_pages_v1_4'
+LANGUAGE C PARALLEL SAFE;
+
+CREATE VIEW pg_buffercache AS
+ SELECT P.* FROM pg_buffercache_pages() AS P
+ (bufferid integer, relfilenode int8, reltablespace oid, reldatabase oid,
+ relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
+ pinning_backends int4);
+
+-- Don't want these to be available to public.
+REVOKE ALL ON FUNCTION pg_buffercache_pages() FROM PUBLIC;
+REVOKE ALL ON pg_buffercache FROM PUBLIC;
+GRANT EXECUTE ON FUNCTION pg_buffercache_pages() TO pg_monitor;
+GRANT SELECT ON pg_buffercache TO pg_monitor;
diff --git a/contrib/pg_buffercache/pg_buffercache.control b/contrib/pg_buffercache/pg_buffercache.control
index 8c060ae..a82ae5f 100644
--- a/contrib/pg_buffercache/pg_buffercache.control
+++ b/contrib/pg_buffercache/pg_buffercache.control
@@ -1,5 +1,5 @@
# pg_buffercache extension
comment = 'examine the shared buffer cache'
-default_version = '1.3'
+default_version = '1.4'
module_pathname = '$libdir/pg_buffercache'
relocatable = true
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index c5754ea..912cbd8 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -59,9 +59,10 @@ typedef struct
* relation node/tablespace/database/blocknum and dirty indicator.
*/
PG_FUNCTION_INFO_V1(pg_buffercache_pages);
+PG_FUNCTION_INFO_V1(pg_buffercache_pages_v1_4);
-Datum
-pg_buffercache_pages(PG_FUNCTION_ARGS)
+static Datum
+pg_buffercache_pages_internal(PG_FUNCTION_ARGS, Oid rfn_typid)
{
FuncCallContext *funcctx;
Datum result;
@@ -103,7 +104,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
TupleDescInitEntry(tupledesc, (AttrNumber) 1, "bufferid",
INT4OID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 2, "relfilenode",
- OIDOID, -1, 0);
+ rfn_typid, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 3, "reltablespace",
OIDOID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 4, "reldatabase",
@@ -209,7 +210,24 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
}
else
{
- values[1] = ObjectIdGetDatum(fctx->record[i].relfilenumber);
+ if (rfn_typid == INT8OID)
+ values[1] =
+ Int64GetDatum((int64) fctx->record[i].relfilenumber);
+ else
+ {
+ Assert(rfn_typid == OIDOID);
+
+ if (fctx->record[i].relfilenumber > OID_MAX)
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("relfilenode %lld is too large to be represented as an OID",
+ (long long) fctx->record[i].relfilenumber),
+ errhint("Upgrade the extension using ALTER EXTENSION pg_buffercache UPDATE"));
+
+ values[1] =
+ ObjectIdGetDatum((Oid) fctx->record[i].relfilenumber);
+ }
+
nulls[1] = false;
values[2] = ObjectIdGetDatum(fctx->record[i].reltablespace);
nulls[2] = false;
@@ -237,3 +255,16 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
else
SRF_RETURN_DONE(funcctx);
}
+
+/* entry point for old extension version */
+Datum
+pg_buffercache_pages(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, OIDOID);
+}
+
+Datum
+pg_buffercache_pages_v1_4(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, INT8OID);
+}
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index c8d673a..6bb6da6 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -345,7 +345,7 @@ apw_load_buffers(void)
{
unsigned forknum;
- if (fscanf(file, "%u,%u,%u,%u,%u\n", &blkinfo[i].database,
+ if (fscanf(file, "%u,%u," INT64_FORMAT ",%u,%u\n", &blkinfo[i].database,
&blkinfo[i].tablespace, &blkinfo[i].filenumber,
&forknum, &blkinfo[i].blocknum) != 5)
ereport(ERROR,
@@ -669,7 +669,7 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
{
CHECK_FOR_INTERRUPTS();
- ret = fprintf(file, "%u,%u,%u,%u,%u\n",
+ ret = fprintf(file, "%u,%u," INT64_FORMAT ",%u,%u\n",
block_info_array[i].database,
block_info_array[i].tablespace,
block_info_array[i].filenumber,
diff --git a/contrib/pg_walinspect/expected/pg_walinspect.out b/contrib/pg_walinspect/expected/pg_walinspect.out
index a1ee743..e9b06ed 100644
--- a/contrib/pg_walinspect/expected/pg_walinspect.out
+++ b/contrib/pg_walinspect/expected/pg_walinspect.out
@@ -54,9 +54,9 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- ===================================================================
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
ok
----
t
diff --git a/contrib/pg_walinspect/sql/pg_walinspect.sql b/contrib/pg_walinspect/sql/pg_walinspect.sql
index 1b265ea..5393834 100644
--- a/contrib/pg_walinspect/sql/pg_walinspect.sql
+++ b/contrib/pg_walinspect/sql/pg_walinspect.sql
@@ -39,10 +39,10 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
-- ===================================================================
-- Test for filtering out WAL records based on resource_manager and
diff --git a/contrib/test_decoding/expected/rewrite.out b/contrib/test_decoding/expected/rewrite.out
index b30999c..8f1aa48 100644
--- a/contrib/test_decoding/expected/rewrite.out
+++ b/contrib/test_decoding/expected/rewrite.out
@@ -106,7 +106,7 @@ VACUUM FULL pg_class;
-- reindexing of important relations / indexes
REINDEX TABLE pg_class;
REINDEX INDEX pg_class_oid_index;
-REINDEX INDEX pg_class_tblspc_relfilenode_index;
+REINDEX INDEX pg_class_relfilenode_index;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (5, 3);
BEGIN;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (6, 4);
diff --git a/contrib/test_decoding/sql/rewrite.sql b/contrib/test_decoding/sql/rewrite.sql
index 62dead3..8983704 100644
--- a/contrib/test_decoding/sql/rewrite.sql
+++ b/contrib/test_decoding/sql/rewrite.sql
@@ -77,7 +77,7 @@ VACUUM FULL pg_class;
-- reindexing of important relations / indexes
REINDEX TABLE pg_class;
REINDEX INDEX pg_class_oid_index;
-REINDEX INDEX pg_class_tblspc_relfilenode_index;
+REINDEX INDEX pg_class_relfilenode_index;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (5, 3);
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 00f833d..40d4e9c 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1984,7 +1984,7 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
</para>
<para>
Name of the on-disk file of this relation; zero means this
diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
index a06fd3e..e222265 100644
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@@ -62,7 +62,7 @@
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
(references <link linkend="catalog-pg-class"><structname>pg_class</structname></link>.<structfield>relfilenode</structfield>)
</para>
<para>
diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml
index e5b9f3f..eefba2f 100644
--- a/doc/src/sgml/storage.sgml
+++ b/doc/src/sgml/storage.sgml
@@ -217,11 +217,10 @@ with the suffix <literal>_init</literal> (see <xref linkend="storage-init"/>).
<caution>
<para>
-Note that while a table's filenode often matches its OID, this is
-<emphasis>not</emphasis> necessarily the case; some operations, like
+Note that a table's filenode is entirely distinct from its OID. Although a
+system catalog's initial filenode matches its OID, some operations, like
<command>TRUNCATE</command>, <command>REINDEX</command>, <command>CLUSTER</command> and some forms
of <command>ALTER TABLE</command>, can change the filenode while preserving the OID.
-Avoid assuming that filenode and table OID are the same.
Also, for certain system catalogs including <structname>pg_class</structname> itself,
<structname>pg_class</structname>.<structfield>relfilenode</structfield> contains zero. The
actual filenode number of these catalogs is stored in a lower-level data
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 41b9211..b75ad79 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -100,7 +100,7 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
BlockNumber blknum;
BufferGetTag(buffer, &locator, &forknum, &blknum);
- elog(ERROR, "failed to add item to index page in %u/%u/%u",
+ elog(ERROR, "failed to add item to index page in %u/%u/" INT64_FORMAT,
locator.spcOid, locator.dbOid, locator.relNumber);
}
}
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index 7dd3c1d..c699937 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -26,7 +26,7 @@ out_gistxlogPageUpdate(StringInfo buf, gistxlogPageUpdate *xlrec)
static void
out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec)
{
- appendStringInfo(buf, "rel %u/%u/%u; blk %u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; blk %u; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber, xlrec->block,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 923d3bc..f6d278b 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -169,7 +169,7 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
{
xl_heap_new_cid *xlrec = (xl_heap_new_cid *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; tid %u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; tid %u/%u",
xlrec->target_locator.spcOid,
xlrec->target_locator.dbOid,
xlrec->target_locator.relNumber,
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index 4843cd5..70feb2d 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -100,7 +100,7 @@ btree_desc(StringInfo buf, XLogReaderState *record)
{
xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/seqdesc.c b/src/backend/access/rmgrdesc/seqdesc.c
index b3845f9..45c6ee7 100644
--- a/src/backend/access/rmgrdesc/seqdesc.c
+++ b/src/backend/access/rmgrdesc/seqdesc.c
@@ -25,7 +25,7 @@ seq_desc(StringInfo buf, XLogReaderState *record)
xl_seq_rec *xlrec = (xl_seq_rec *) rec;
if (info == XLOG_SEQ_LOG)
- appendStringInfo(buf, "rel %u/%u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT,
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber);
}
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 6fec485..a723790 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -45,8 +45,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
CheckPoint *checkpoint = (CheckPoint *) rec;
appendStringInfo(buf, "redo %X/%X; "
- "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; "
- "oldest xid %u in DB %u; oldest multi %u in DB %u; "
- "tli %u; prev tli %u; fpw %s; xid %u:%u; relfilenumber " INT64_FORMAT "; oid %u; "
+ "multi %u; offset %u; oldest xid %u in DB %u; oldest multi %u in DB %u; "
"oldest/newest commit timestamp xid: %u/%u; "
"oldest running xid %u; %s",
LSN_FORMAT_ARGS(checkpoint->redo),
@@ -55,6 +55,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->fullPageWrites ? "true" : "false",
EpochFromFullTransactionId(checkpoint->nextXid),
XidFromFullTransactionId(checkpoint->nextXid),
+ checkpoint->nextRelFileNumber,
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
@@ -74,6 +75,13 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
memcpy(&nextOid, rec, sizeof(Oid));
appendStringInfo(buf, "%u", nextOid);
}
+ else if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, rec, sizeof(RelFileNumber));
+ appendStringInfo(buf, INT64_FORMAT, nextRelFileNumber);
+ }
else if (info == XLOG_RESTORE_POINT)
{
xl_restore_point *xlrec = (xl_restore_point *) rec;
@@ -169,6 +177,9 @@ xlog_identify(uint8 info)
case XLOG_NEXTOID:
id = "NEXTOID";
break;
+ case XLOG_NEXT_RELFILENUMBER:
+ id = "NEXT_RELFILENUMBER";
+ break;
case XLOG_SWITCH:
id = "SWITCH";
break;
@@ -237,7 +248,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfoChar(buf, ' ');
appendStringInfo(buf,
- "blkref #%d: rel %u/%u/%u fork %s blk %u",
+ "blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -297,7 +308,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
if (forknum != MAIN_FORKNUM)
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u fork %s blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -306,7 +317,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
else
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
blk);
diff --git a/src/backend/access/transam/README b/src/backend/access/transam/README
index 734c39a..8911671 100644
--- a/src/backend/access/transam/README
+++ b/src/backend/access/transam/README
@@ -692,8 +692,9 @@ by having database restart search for files that don't have any committed
entry in pg_class, but that currently isn't done because of the possibility
of deleting data that is useful for forensic analysis of the crash.
Orphan files are harmless --- at worst they waste a bit of disk space ---
-because we check for on-disk collisions when allocating new relfilenumber
-OIDs. So cleaning up isn't really necessary.
+because the relfilenumber counter is monotonically increasing. The maximum
+value is 2^56-1, and there is no provision for wraparound. Thus, on-disk
+collisions aren't possible.
3. Deleting a table, which requires an unlink() that could fail.
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 849a7ce..a2f0d35 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -13,12 +13,16 @@
#include "postgres.h"
+#include <unistd.h>
+
#include "access/clog.h"
#include "access/commit_ts.h"
#include "access/subtrans.h"
#include "access/transam.h"
#include "access/xact.h"
#include "access/xlogutils.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_tablespace.h"
#include "commands/dbcommands.h"
#include "miscadmin.h"
#include "postmaster/autovacuum.h"
@@ -30,6 +34,15 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+/* Number of RelFileNumbers to be logged per XLOG write */
+#define VAR_RELNUMBER_PER_XLOG 512
+
+/*
+ * Log more RelFileNumbers if the number of remaining logged ones falls
+ * below this threshold. The valid range is 0 to VAR_RELNUMBER_PER_XLOG - 1.
+ */
+#define VAR_RELNUMBER_NEW_XLOG_THRESHOLD 256
+
/* pointer to "variable cache" in shared memory (set up by shmem.c) */
VariableCache ShmemVariableCache = NULL;
@@ -521,8 +534,7 @@ ForceTransactionIdLimitUpdate(void)
* wide, counter wraparound will occur eventually, and therefore it is unwise
* to assume they are unique unless precautions are taken to make them so.
* Hence, this routine should generally not be used directly. The only direct
- * callers should be GetNewOidWithIndex() and GetNewRelFileNumber() in
- * catalog/catalog.c.
+ * caller should be GetNewOidWithIndex() in catalog/catalog.c.
*/
Oid
GetNewObjectId(void)
@@ -613,6 +625,173 @@ SetNextObjectId(Oid nextOid)
}
/*
+ * GetNewRelFileNumber
+ *
+ * Similar to GetNewObjectId, but it generates a new relfilenumber instead
+ * of a new Oid.
+ */
+RelFileNumber
+GetNewRelFileNumber(Oid reltablespace, char relpersistence)
+{
+ RelFileNumber result;
+
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot assign RelFileNumber during recovery");
+
+ if (IsBinaryUpgrade)
+ elog(ERROR, "cannot assign RelFileNumber during binary upgrade");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /* check for the wraparound for the relfilenumber counter */
+ if (unlikely(ShmemVariableCache->nextRelFileNumber > MAX_RELFILENUMBER))
+ elog(ERROR, "relfilenumber is out of bounds");
+
+ /*
+ * If the number of remaining logged relfilenumbers has fallen below the
+ * threshold, log more. Ideally, we could wait until all logged
+ * relfilenumbers have been consumed before logging more, but then we
+ * would have to flush the new WAL record immediately, because the
+ * nextRelFileNumber must always be larger than any relfilenumber already
+ * in use on disk. To maintain that invariant, the record we log must
+ * reach disk before any file is created from the newly logged range.
+ * So, to avoid an immediate flush, we always log before exhausting the
+ * current range, and then we only need to flush the previously logged
+ * record before consuming relfilenumbers from the new range. By the
+ * time that flush is needed, the record has hopefully already reached
+ * disk via some other XLogFlush operation. VAR_RELNUMBER_PER_XLOG is
+ * large enough that this probably wouldn't slow things down anyway, but
+ * it is always better to avoid an extra XLogFlush.
+ */
+ if (ShmemVariableCache->loggedRelFileNumber -
+ ShmemVariableCache->nextRelFileNumber <=
+ VAR_RELNUMBER_NEW_XLOG_THRESHOLD)
+ {
+ RelFileNumber newlogrelnum;
+
+ /*
+ * The first time through, this immediately flushes the newly logged WAL
+ * record, since nothing has been logged in advance. From then on, we
+ * remember the previously logged record pointer and flush up to that
+ * point.
+ *
+ * XXX The second time through, this may flush what the first call
+ * already flushed, but that is logically a no-op, so it isn't worth
+ * adding extra complexity to avoid it.
+ */
+ newlogrelnum = ShmemVariableCache->nextRelFileNumber +
+ VAR_RELNUMBER_PER_XLOG;
+ LogNextRelFileNumber(newlogrelnum,
+ &ShmemVariableCache->loggedRelFileNumberRecPtr);
+
+ ShmemVariableCache->loggedRelFileNumber = newlogrelnum;
+ }
+
+ result = ShmemVariableCache->nextRelFileNumber;
+ (ShmemVariableCache->nextRelFileNumber)++;
+
+ LWLockRelease(RelFileNumberGenLock);
+
+#ifdef USE_ASSERT_CHECKING
+
+ {
+ RelFileLocatorBackend rlocator;
+ char *rpath;
+ BackendId backend;
+
+ switch (relpersistence)
+ {
+ case RELPERSISTENCE_TEMP:
+ backend = BackendIdForTempRelations();
+ break;
+ case RELPERSISTENCE_UNLOGGED:
+ case RELPERSISTENCE_PERMANENT:
+ backend = InvalidBackendId;
+ break;
+ default:
+ elog(ERROR, "invalid relpersistence: %c", relpersistence);
+ return InvalidRelFileNumber; /* placate compiler */
+ }
+
+ /* this logic should match RelationInitPhysicalAddr */
+ rlocator.locator.spcOid =
+ reltablespace ? reltablespace : MyDatabaseTableSpace;
+ rlocator.locator.dbOid = (reltablespace == GLOBALTABLESPACE_OID) ?
+ InvalidOid : MyDatabaseId;
+ rlocator.locator.relNumber = result;
+
+ /*
+ * The relpath will vary based on the backend ID, so we must
+ * initialize that properly here to make sure that any collisions
+ * based on filename are properly detected.
+ */
+ rlocator.backend = backend;
+
+ /* Check for existing file of same name */
+ rpath = relpath(rlocator, MAIN_FORKNUM);
+ Assert(access(rpath, F_OK) != 0);
+ }
+#endif
+
+ return result;
+}
+
+/*
+ * SetNextRelFileNumber
+ *
+ * This may only be called during pg_upgrade; it advances the RelFileNumber
+ * counter to the specified value if the current value is smaller than the
+ * input value.
+ */
+void
+SetNextRelFileNumber(RelFileNumber relnumber)
+{
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot forward RelFileNumber during recovery");
+
+ if (!IsBinaryUpgrade)
+ elog(ERROR, "RelFileNumber can be set only during binary upgrade");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /*
+ * If the previously assigned nextRelFileNumber is already higher than
+ * the requested value, there is nothing to do. This is possible because
+ * during upgrade the objects are not created in relfilenumber order.
+ */
+ if (relnumber <= ShmemVariableCache->nextRelFileNumber)
+ {
+ LWLockRelease(RelFileNumberGenLock);
+ return;
+ }
+
+ /*
+ * If the new relfilenumber is greater than or equal to the already
+ * logged relfilenumber, log again.
+ *
+ * XXX The new 'relnumber' can come from any range, so we cannot piggyback
+ * on the XLogFlush by logging in advance. That shouldn't really matter,
+ * since this is only called during binary upgrade.
+ */
+ if (relnumber >= ShmemVariableCache->loggedRelFileNumber)
+ {
+ RelFileNumber newlogrelnum;
+
+ newlogrelnum = relnumber + VAR_RELNUMBER_PER_XLOG;
+ LogNextRelFileNumber(newlogrelnum, NULL);
+ ShmemVariableCache->loggedRelFileNumber = newlogrelnum;
+ }
+
+ ShmemVariableCache->nextRelFileNumber = relnumber;
+
+ LWLockRelease(RelFileNumberGenLock);
+}
+
+/*
* StopGeneratingPinnedObjectIds
*
* This is called once during initdb to force the OID counter up to
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 7a710e6..48c05d2 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4538,6 +4538,7 @@ BootStrapXLOG(void)
checkPoint.nextXid =
FullTransactionIdFromEpochAndXid(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstGenbkiObjectId;
+ checkPoint.nextRelFileNumber = FirstNormalRelFileNumber;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
checkPoint.oldestXid = FirstNormalTransactionId;
@@ -4551,7 +4552,10 @@ BootStrapXLOG(void)
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumberRecPtr = InvalidXLogRecPtr;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -5017,7 +5021,9 @@ StartupXLOG(void)
/* initialize shared memory variables from the checkpoint record */
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -6483,6 +6489,14 @@ CreateCheckPoint(int flags)
checkPoint.nextOid += ShmemVariableCache->oidCount;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_SHARED);
+ if (shutdown)
+ checkPoint.nextRelFileNumber = ShmemVariableCache->nextRelFileNumber;
+ else
+ checkPoint.nextRelFileNumber = ShmemVariableCache->loggedRelFileNumber;
+
+ LWLockRelease(RelFileNumberGenLock);
+
MultiXactGetCheckptMulti(shutdown,
&checkPoint.nextMulti,
&checkPoint.nextMultiOffset,
@@ -7361,6 +7375,35 @@ XLogPutNextOid(Oid nextOid)
}
/*
+ * Similar to XLogPutNextOid, but writes a NEXT_RELFILENUMBER log record
+ * instead of a NEXTOID one. If '*prevrecptr' is a valid XLogRecPtr, flush
+ * the WAL up to that record pointer; otherwise flush up to the newly logged
+ * record. Also, if 'prevrecptr' is not NULL, store the newly logged record
+ * pointer in '*prevrecptr'.
+ */
+void
+LogNextRelFileNumber(RelFileNumber nextrelnumber, XLogRecPtr *prevrecptr)
+{
+ XLogRecPtr recptr;
+
+ XLogBeginInsert();
+ XLogRegisterData((char *) (&nextrelnumber), sizeof(RelFileNumber));
+ recptr = XLogInsert(RM_XLOG_ID, XLOG_NEXT_RELFILENUMBER);
+
+ /*
+ * If a valid prevrecptr is passed then flush that xlog record to disk
+ * otherwise flush the newly logged record.
+ */
+ if ((prevrecptr != NULL) && !XLogRecPtrIsInvalid(*prevrecptr))
+ XLogFlush(*prevrecptr);
+ else
+ XLogFlush(recptr);
+
+ if (prevrecptr != NULL)
+ *prevrecptr = recptr;
+}
+
+/*
* Write an XLOG SWITCH record.
*
* Here we just blindly issue an XLogInsert request for the record.
@@ -7575,6 +7618,16 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
}
+ if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, XLogRecGetData(record), sizeof(RelFileNumber));
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumber = nextRelFileNumber;
+ LWLockRelease(RelFileNumberGenLock);
+ }
else if (info == XLOG_CHECKPOINT_SHUTDOWN)
{
CheckPoint checkPoint;
@@ -7589,6 +7642,10 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
+ LWLockRelease(RelFileNumberGenLock);
MultiXactSetNextMXact(checkPoint.nextMulti,
checkPoint.nextMultiOffset);
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index 9aa5641..a0c5aca 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -611,7 +611,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u until %X/%X is replayed, which creates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, which creates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -634,7 +634,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, which truncates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, which truncates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -733,7 +733,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing all prefetch in relation %u/%u/%u until %X/%X is replayed, because the relation does not exist on disk",
+ "suppressing all prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, because the relation does not exist on disk",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -754,7 +754,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, because the relation is too small",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, because the relation is too small",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -793,7 +793,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* truncated beneath our feet?
*/
elog(ERROR,
- "could not prefetch relation %u/%u/%u block %u",
+ "could not prefetch relation %u/%u/" INT64_FORMAT " block %u",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -932,7 +932,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed),
filter->filter_from_block);
@@ -948,7 +948,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed));
#endif
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index ae2af5a..3b4da8e 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2225,14 +2225,14 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
continue;
if (forknum != MAIN_FORKNUM)
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, fork %u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", fork %u, blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
forknum,
blk);
else
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
@@ -2428,7 +2428,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
if (memcmp(replay_image_masked, primary_image_masked, BLCKSZ) != 0)
{
elog(FATAL,
- "inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
+ "inconsistent page found, rel %u/%u/" INT64_FORMAT ", forknum %u, blkno %u",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forknum, blkno);
}
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 0cda225..b4398e1 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -617,17 +617,17 @@ CreateFakeRelcacheEntry(RelFileLocator rlocator)
rel->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
/* We don't know the name of the relation; use relfilenumber instead */
- sprintf(RelationGetRelationName(rel), "%u", rlocator.relNumber);
+ sprintf(RelationGetRelationName(rel), INT64_FORMAT, rlocator.relNumber);
/*
* We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNumber may be
- * different from the relation's OID. It shouldn't really matter though.
- * In recovery, we are running by ourselves and can't have any lock
- * conflicts. While syncing, we already hold AccessExclusiveLock.
+ * relation. Note we are setting relId to just FirstNormalObjectId which
+ * is completely bogus. It shouldn't really matter though. In recovery,
+ * we are running by ourselves and can't have any lock conflicts. While
+ * syncing, we already hold AccessExclusiveLock.
*/
rel->rd_lockInfo.lockRelId.dbId = rlocator.dbOid;
- rel->rd_lockInfo.lockRelId.relId = rlocator.relNumber;
+ rel->rd_lockInfo.lockRelId.relId = FirstNormalObjectId;
rel->rd_smgr = NULL;
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 3bf3aa6..07cf34a 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1230,7 +1230,7 @@ sendDir(bbsink *sink, const char *path, int basepathlen, bool sizeonly,
if (relForkNum != INIT_FORKNUM)
{
char initForkFile[MAXPGPATH];
- char relNumber[OIDCHARS + 1];
+ char relNumber[RELNUMBERCHARS + 1];
/*
* If any other type of fork, check if there is an init fork
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 2abd6b0..a9bd8ae 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -483,101 +483,6 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
}
/*
- * GetNewRelFileNumber
- * Generate a new relfilenumber that is unique within the
- * database of the given tablespace.
- *
- * If the relfilenumber will also be used as the relation's OID, pass the
- * opened pg_class catalog, and this routine will guarantee that the result
- * is also an unused OID within pg_class. If the result is to be used only
- * as a relfilenumber for an existing relation, pass NULL for pg_class.
- *
- * As with GetNewOidWithIndex(), there is some theoretical risk of a race
- * condition, but it doesn't seem worth worrying about.
- *
- * Note: we don't support using this in bootstrap mode. All relations
- * created by bootstrap have preassigned OIDs, so there's no need.
- */
-RelFileNumber
-GetNewRelFileNumber(Oid reltablespace, Relation pg_class, char relpersistence)
-{
- RelFileLocatorBackend rlocator;
- char *rpath;
- bool collides;
- BackendId backend;
-
- /*
- * If we ever get here during pg_upgrade, there's something wrong; all
- * relfilenumber assignments during a binary-upgrade run should be
- * determined by commands in the dump script.
- */
- Assert(!IsBinaryUpgrade);
-
- switch (relpersistence)
- {
- case RELPERSISTENCE_TEMP:
- backend = BackendIdForTempRelations();
- break;
- case RELPERSISTENCE_UNLOGGED:
- case RELPERSISTENCE_PERMANENT:
- backend = InvalidBackendId;
- break;
- default:
- elog(ERROR, "invalid relpersistence: %c", relpersistence);
- return InvalidRelFileNumber; /* placate compiler */
- }
-
- /* This logic should match RelationInitPhysicalAddr */
- rlocator.locator.spcOid = reltablespace ? reltablespace : MyDatabaseTableSpace;
- rlocator.locator.dbOid =
- (rlocator.locator.spcOid == GLOBALTABLESPACE_OID) ?
- InvalidOid : MyDatabaseId;
-
- /*
- * The relpath will vary based on the backend ID, so we must initialize
- * that properly here to make sure that any collisions based on filename
- * are properly detected.
- */
- rlocator.backend = backend;
-
- do
- {
- CHECK_FOR_INTERRUPTS();
-
- /* Generate the OID */
- if (pg_class)
- rlocator.locator.relNumber = GetNewOidWithIndex(pg_class, ClassOidIndexId,
- Anum_pg_class_oid);
- else
- rlocator.locator.relNumber = GetNewObjectId();
-
- /* Check for existing file of same name */
- rpath = relpath(rlocator, MAIN_FORKNUM);
-
- if (access(rpath, F_OK) == 0)
- {
- /* definite collision */
- collides = true;
- }
- else
- {
- /*
- * Here we have a little bit of a dilemma: if errno is something
- * other than ENOENT, should we declare a collision and loop? In
- * practice it seems best to go ahead regardless of the errno. If
- * there is a colliding file we will get an smgr failure when we
- * attempt to create the new relation file.
- */
- collides = false;
- }
-
- pfree(rpath);
- } while (collides);
-
- return rlocator.locator.relNumber;
-}
-
-/*
* SQL callable interface for GetNewOidWithIndex(). Outside of initdb's
* direct insertions into catalog tables, and recovering from corruption, this
* should rarely be needed.
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 9b03579..c17f60f 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -342,10 +342,18 @@ heap_create(const char *relname,
{
/*
* If relfilenumber is unspecified by the caller then create storage
- * with oid same as relid.
+ * with relfilenumber same as relid if it is a system table otherwise
+ * allocate a new relfilenumber. For more details read comments atop
+ * FirstNormalRelFileNumber declaration.
*/
if (!RelFileNumberIsValid(relfilenumber))
- relfilenumber = relid;
+ {
+ if (relid < FirstNormalObjectId)
+ relfilenumber = relid;
+ else
+ relfilenumber = GetNewRelFileNumber(reltablespace,
+ relpersistence);
+ }
}
/*
@@ -898,7 +906,7 @@ InsertPgClassTuple(Relation pg_class_desc,
values[Anum_pg_class_reloftype - 1] = ObjectIdGetDatum(rd_rel->reloftype);
values[Anum_pg_class_relowner - 1] = ObjectIdGetDatum(rd_rel->relowner);
values[Anum_pg_class_relam - 1] = ObjectIdGetDatum(rd_rel->relam);
- values[Anum_pg_class_relfilenode - 1] = ObjectIdGetDatum(rd_rel->relfilenode);
+ values[Anum_pg_class_relfilenode - 1] = Int64GetDatum(rd_rel->relfilenode);
values[Anum_pg_class_reltablespace - 1] = ObjectIdGetDatum(rd_rel->reltablespace);
values[Anum_pg_class_relpages - 1] = Int32GetDatum(rd_rel->relpages);
values[Anum_pg_class_reltuples - 1] = Float4GetDatum(rd_rel->reltuples);
@@ -1170,12 +1178,7 @@ heap_create_with_catalog(const char *relname,
if (shared_relation && reltablespace != GLOBALTABLESPACE_OID)
elog(ERROR, "shared relations must be placed in pg_global tablespace");
- /*
- * Allocate an OID for the relation, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the relation, unless we were told what to use. */
if (!OidIsValid(relid))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -1229,8 +1232,8 @@ heap_create_with_catalog(const char *relname,
}
if (!OidIsValid(relid))
- relid = GetNewRelFileNumber(reltablespace, pg_class_desc,
- relpersistence);
+ relid = GetNewOidWithIndex(pg_class_desc, ClassOidIndexId,
+ Anum_pg_class_oid);
}
/*
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index d7192f3..a509c3e 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -898,12 +898,7 @@ index_create(Relation heapRelation,
collationObjectId,
classObjectId);
- /*
- * Allocate an OID for the index, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the index, unless we were told what to use. */
if (!OidIsValid(indexRelationId))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -935,8 +930,8 @@ index_create(Relation heapRelation,
}
else
{
- indexRelationId =
- GetNewRelFileNumber(tableSpaceId, pg_class, relpersistence);
+ indexRelationId = GetNewOidWithIndex(pg_class, ClassOidIndexId,
+ Anum_pg_class_oid);
}
}
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index d708af1..080622b 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -968,6 +968,10 @@ smgr_redo(XLogReaderState *record)
xl_smgr_create *xlrec = (xl_smgr_create *) XLogRecGetData(record);
SMgrRelation reln;
+ if (xlrec->rlocator.relNumber > ShmemVariableCache->nextRelFileNumber)
+ elog(ERROR, "unexpected relnumber " INT64_FORMAT " that is bigger than nextRelFileNumber " INT64_FORMAT,
+ xlrec->rlocator.relNumber, ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
smgrcreate(reln, xlrec->forkNum, true);
}
@@ -981,6 +985,10 @@ smgr_redo(XLogReaderState *record)
int nforks = 0;
bool need_fsm_vacuum = false;
+ if (xlrec->rlocator.relNumber > ShmemVariableCache->nextRelFileNumber)
+ elog(ERROR, "unexpected relnumber " INT64_FORMAT " that is bigger than nextRelFileNumber " INT64_FORMAT,
+ xlrec->rlocator.relNumber, ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
/*
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index dacc989..660b5e4 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14363,10 +14363,14 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
}
/*
- * Relfilenumbers are not unique in databases across tablespaces, so we
- * need to allocate a new one in the new tablespace.
- */
- newrelfilenumber = GetNewRelFileNumber(newTableSpace, NULL,
+ * Generate a new relfilenumber. We cannot reuse the old relfilenumber
+ * because of the possibility that that relation will be moved back to the
+ * original tablespace before the next checkpoint. At that point, the
+ * first segment of the main fork won't have been unlinked yet, and an
+ * attempt to create new relation storage with that same relfilenumber
+ * will fail.
+ */
+ newrelfilenumber = GetNewRelFileNumber(newTableSpace,
rel->rd_rel->relpersistence);
/* Open old and new relation */
diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index f260b48..728653c 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -267,7 +267,7 @@ CreateTableSpace(CreateTableSpaceStmt *stmt)
* parts.
*/
if (strlen(location) + 1 + strlen(TABLESPACE_VERSION_DIRECTORY) + 1 +
- OIDCHARS + 1 + OIDCHARS + 1 + FORKNAMECHARS + 1 + OIDCHARS > MAXPGPATH)
+ OIDCHARS + 1 + RELNUMBERCHARS + 1 + FORKNAMECHARS + 1 + OIDCHARS > MAXPGPATH)
ereport(ERROR,
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
errmsg("tablespace location \"%s\" is too long",
diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index b707a09..f809a02 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -961,12 +961,12 @@ _read${n}(void)
print $off "\tWRITE_UINT_FIELD($f);\n";
print $rff "\tREAD_UINT_FIELD($f);\n" unless $no_read;
}
- elsif ($t eq 'uint64')
+ elsif ($t eq 'uint64' || $t eq 'RelFileNumber')
{
print $off "\tWRITE_UINT64_FIELD($f);\n";
print $rff "\tREAD_UINT64_FIELD($f);\n" unless $no_read;
}
- elsif ($t eq 'Oid' || $t eq 'RelFileNumber')
+ elsif ($t eq 'Oid')
{
print $off "\tWRITE_OID_FIELD($f);\n";
print $rff "\tREAD_OID_FIELD($f);\n" unless $no_read;
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 1667d72..971e243 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -154,6 +154,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
break;
case XLOG_NOOP:
case XLOG_NEXTOID:
+ case XLOG_NEXT_RELFILENUMBER:
case XLOG_SWITCH:
case XLOG_BACKUP_END:
case XLOG_PARAMETER_CHANGE:
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 89cf9f9..859bc29 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -4932,7 +4932,7 @@ DisplayMapping(HTAB *tuplecid_data)
hash_seq_init(&hstat, tuplecid_data);
while ((ent = (ReorderBufferTupleCidEnt *) hash_seq_search(&hstat)) != NULL)
{
- elog(DEBUG3, "mapping: node: %u/%u/%u tid: %u/%u cmin: %u, cmax: %u",
+ elog(DEBUG3, "mapping: node: %u/%u/" INT64_FORMAT " tid: %u/%u cmin: %u, cmax: %u",
ent->key.rlocator.dbOid,
ent->key.rlocator.spcOid,
ent->key.rlocator.relNumber,
diff --git a/src/backend/storage/file/reinit.c b/src/backend/storage/file/reinit.c
index 647c458..c3faa68 100644
--- a/src/backend/storage/file/reinit.c
+++ b/src/backend/storage/file/reinit.c
@@ -31,7 +31,7 @@ static void ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname,
typedef struct
{
- Oid reloid; /* hash key */
+ RelFileNumber relnumber; /* hash key */
} unlogged_relation_entry;
/*
@@ -184,10 +184,10 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
* need to be reset. Otherwise, this cleanup operation would be
* O(n^2).
*/
- ctl.keysize = sizeof(Oid);
+ ctl.keysize = sizeof(RelFileNumber);
ctl.entrysize = sizeof(unlogged_relation_entry);
ctl.hcxt = CurrentMemoryContext;
- hash = hash_create("unlogged relation OIDs", 32, &ctl,
+ hash = hash_create("unlogged relation RelFileNumbers", 32, &ctl,
HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
/* Scan the directory. */
@@ -208,10 +208,10 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
continue;
/*
- * Put the OID portion of the name into the hash table, if it
- * isn't already.
+ * Put the RELFILENUMBER portion of the name into the hash table,
+ * if it isn't already.
*/
- ent.reloid = atooid(de->d_name);
+ ent.relnumber = atorelnumber(de->d_name);
(void) hash_search(hash, &ent, HASH_ENTER, NULL);
}
@@ -248,10 +248,10 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
continue;
/*
- * See whether the OID portion of the name shows up in the hash
- * table. If so, nuke it!
+ * See whether the RELFILENUMBER portion of the name shows up in
+ * the hash table. If so, nuke it!
*/
- ent.reloid = atooid(de->d_name);
+ ent.relnumber = atorelnumber(de->d_name);
if (hash_search(hash, &ent, HASH_FIND, NULL))
{
snprintf(rm_path, sizeof(rm_path), "%s/%s",
@@ -286,7 +286,7 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
{
ForkNumber forkNum;
int relnumchars;
- char relnumbuf[OIDCHARS + 1];
+ char relnumbuf[RELNUMBERCHARS + 1];
char srcpath[MAXPGPATH * 2];
char dstpath[MAXPGPATH];
@@ -329,7 +329,7 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
{
ForkNumber forkNum;
int relnumchars;
- char relnumbuf[OIDCHARS + 1];
+ char relnumbuf[RELNUMBERCHARS + 1];
char mainpath[MAXPGPATH];
/* Skip anything that doesn't look like a relation data file. */
@@ -372,8 +372,8 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
* for a non-temporary relation and false otherwise.
*
* NB: If this function returns true, the caller is entitled to assume that
- * *relnumchars has been set to a value no more than OIDCHARS, and thus
- * that a buffer of OIDCHARS+1 characters is sufficient to hold the
+ * *relnumchars has been set to a value no more than RELNUMBERCHARS, and thus
+ * that a buffer of RELNUMBERCHARS+1 characters is sufficient to hold the
* RelFileNumber portion of the filename. This is critical to protect against
* a possible buffer overrun.
*/
@@ -386,7 +386,7 @@ parse_filename_for_nontemp_relation(const char *name, int *relnumchars,
/* Look for a non-empty string of digits (that isn't too long). */
for (pos = 0; isdigit((unsigned char) name[pos]); ++pos)
;
- if (pos == 0 || pos > OIDCHARS)
+ if (pos == 0 || pos > RELNUMBERCHARS)
return false;
*relnumchars = pos;
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index af4dab7..172225b 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -273,7 +273,7 @@ restart:
BlockNumber blknum;
BufferGetTag(buf, &rlocator, &forknum, &blknum);
- elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
+ elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/" INT64_FORMAT,
blknum, rlocator.spcOid, rlocator.dbOid, rlocator.relNumber);
/* make sure we hold an exclusive lock */
diff --git a/src/backend/storage/lmgr/lwlocknames.txt b/src/backend/storage/lmgr/lwlocknames.txt
index 6c7cf6c..3c5d041 100644
--- a/src/backend/storage/lmgr/lwlocknames.txt
+++ b/src/backend/storage/lmgr/lwlocknames.txt
@@ -53,3 +53,4 @@ XactTruncationLock 44
# 45 was XactTruncationLock until removal of BackendRandomLock
WrapLimitsVacuumLock 46
NotifyQueueTailLock 47
+RelFileNumberGenLock 48
\ No newline at end of file
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 3deac49..532bd7f 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -257,6 +257,13 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* next checkpoint, we prevent reassignment of the relfilenumber until it's
* safe, because relfilenumber assignment skips over any existing file.
*
+ * XXX. Although all of this was true when relfilenumbers were 32 bits wide,
+ * they are now 56 bits wide and do not wrap around, so in the future we can
+ * change the code to immediately unlink the first segment of the relation
+ * along with all the others. We still do reuse relfilenumbers when createdb()
+ * is performed using the file-copy method or during movedb(), but the scenario
+ * described above can only happen when creating a new relation.
+ *
* We do not need to go through this dance for temp relations, though, because
* we never make WAL entries for temp rels, and so a temp rel poses no threat
* to the health of a regular rel that has taken over its relfilenumber.
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index c1a5feb..ed46ac3 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -154,7 +154,7 @@ smgropen(RelFileLocator rlocator, BackendId backend)
/* First time through: initialize the hash table */
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileLocatorBackend);
+ ctl.keysize = SizeOfRelFileLocatorBackend;
ctl.entrysize = sizeof(SMgrRelationData);
SMgrRelationHash = hash_create("smgr relation table", 400,
&ctl, HASH_ELEM | HASH_BLOBS);
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 34efa12..9f70f35 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -878,7 +878,7 @@ pg_relation_filenode(PG_FUNCTION_ARGS)
if (!RelFileNumberIsValid(result))
PG_RETURN_NULL();
- PG_RETURN_OID(result);
+ PG_RETURN_INT64(result);
}
/*
@@ -898,9 +898,12 @@ Datum
pg_filenode_relation(PG_FUNCTION_ARGS)
{
Oid reltablespace = PG_GETARG_OID(0);
- RelFileNumber relfilenumber = PG_GETARG_OID(1);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(1);
Oid heaprel;
+ /* check whether the relfilenumber is within a valid range */
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
+
/* test needed so RelidByRelfilenumber doesn't misbehave */
if (!RelFileNumberIsValid(relfilenumber))
PG_RETURN_NULL();
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 797f5f5..fc2faed 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -17,6 +17,7 @@
#include "catalog/pg_type.h"
#include "commands/extension.h"
#include "miscadmin.h"
+#include "storage/relfilelocator.h"
#include "utils/array.h"
#include "utils/builtins.h"
@@ -98,10 +99,12 @@ binary_upgrade_set_next_heap_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_heap_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_heap_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -120,10 +123,12 @@ binary_upgrade_set_next_index_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_index_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_index_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -142,10 +147,12 @@ binary_upgrade_set_next_toast_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_toast_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_toast_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 00dc0f2..6f4e96d 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -3712,7 +3712,7 @@ RelationSetNewRelfilenumber(Relation relation, char persistence)
{
/* Allocate a new relfilenumber */
newrelfilenumber = GetNewRelFileNumber(relation->rd_rel->reltablespace,
- NULL, persistence);
+ persistence);
}
else if (relation->rd_rel->relkind == RELKIND_INDEX)
{
diff --git a/src/backend/utils/cache/relfilenumbermap.c b/src/backend/utils/cache/relfilenumbermap.c
index c4245d5..cbb18f0 100644
--- a/src/backend/utils/cache/relfilenumbermap.c
+++ b/src/backend/utils/cache/relfilenumbermap.c
@@ -32,17 +32,11 @@
static HTAB *RelfilenumberMapHash = NULL;
/* built first time through in InitializeRelfilenumberMap */
-static ScanKeyData relfilenumber_skey[2];
+static ScanKeyData relfilenumber_skey[1];
typedef struct
{
- Oid reltablespace;
- RelFileNumber relfilenumber;
-} RelfilenumberMapKey;
-
-typedef struct
-{
- RelfilenumberMapKey key; /* lookup key - must be first */
+ RelFileNumber relfilenumber; /* lookup key - must be first */
Oid relid; /* pg_class.oid */
} RelfilenumberMapEntry;
@@ -72,7 +66,7 @@ RelfilenumberMapInvalidateCallback(Datum arg, Oid relid)
entry->relid == relid) /* individual flushed relation */
{
if (hash_search(RelfilenumberMapHash,
- (void *) &entry->key,
+ (void *) &entry->relfilenumber,
HASH_REMOVE,
NULL) == NULL)
elog(ERROR, "hash table corrupted");
@@ -88,7 +82,6 @@ static void
InitializeRelfilenumberMap(void)
{
HASHCTL ctl;
- int i;
/* Make sure we've initialized CacheMemoryContext. */
if (CacheMemoryContext == NULL)
@@ -97,25 +90,20 @@ InitializeRelfilenumberMap(void)
/* build skey */
MemSet(&relfilenumber_skey, 0, sizeof(relfilenumber_skey));
- for (i = 0; i < 2; i++)
- {
- fmgr_info_cxt(F_OIDEQ,
- &relfilenumber_skey[i].sk_func,
- CacheMemoryContext);
- relfilenumber_skey[i].sk_strategy = BTEqualStrategyNumber;
- relfilenumber_skey[i].sk_subtype = InvalidOid;
- relfilenumber_skey[i].sk_collation = InvalidOid;
- }
-
- relfilenumber_skey[0].sk_attno = Anum_pg_class_reltablespace;
- relfilenumber_skey[1].sk_attno = Anum_pg_class_relfilenode;
+ fmgr_info_cxt(F_INT8EQ,
+ &relfilenumber_skey[0].sk_func,
+ CacheMemoryContext);
+ relfilenumber_skey[0].sk_strategy = BTEqualStrategyNumber;
+ relfilenumber_skey[0].sk_subtype = InvalidOid;
+ relfilenumber_skey[0].sk_collation = InvalidOid;
+ relfilenumber_skey[0].sk_attno = Anum_pg_class_relfilenode;
/*
* Only create the RelfilenumberMapHash now, so we don't end up partially
* initialized when fmgr_info_cxt() above ERRORs out with an out of memory
* error.
*/
- ctl.keysize = sizeof(RelfilenumberMapKey);
+ ctl.keysize = sizeof(RelFileNumber);
ctl.entrysize = sizeof(RelfilenumberMapEntry);
ctl.hcxt = CacheMemoryContext;
@@ -129,7 +117,7 @@ InitializeRelfilenumberMap(void)
}
/*
- * Map a relation's (tablespace, relfilenumber) to a relation's oid and cache
+ * Map a relation's relfilenumber to a relation's oid and cache
* the result.
*
* Returns InvalidOid if no relation matching the criteria could be found.
@@ -137,26 +125,17 @@ InitializeRelfilenumberMap(void)
Oid
RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
{
- RelfilenumberMapKey key;
RelfilenumberMapEntry *entry;
bool found;
SysScanDesc scandesc;
Relation relation;
HeapTuple ntp;
- ScanKeyData skey[2];
+ ScanKeyData skey[1];
Oid relid;
if (RelfilenumberMapHash == NULL)
InitializeRelfilenumberMap();
- /* pg_class will show 0 when the value is actually MyDatabaseTableSpace */
- if (reltablespace == MyDatabaseTableSpace)
- reltablespace = 0;
-
- MemSet(&key, 0, sizeof(key));
- key.reltablespace = reltablespace;
- key.relfilenumber = relfilenumber;
-
/*
* Check cache and return entry if one is found. Even if no target
* relation can be found later on we store the negative match and return a
@@ -164,7 +143,8 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
* since querying invalid values isn't supposed to be a frequent thing,
* but it's basically free.
*/
- entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_FIND, &found);
+ entry = hash_search(RelfilenumberMapHash, (void *) &relfilenumber,
+ HASH_FIND, &found);
if (found)
return entry->relid;
@@ -195,14 +175,13 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
memcpy(skey, relfilenumber_skey, sizeof(skey));
/* set scan arguments */
- skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
- skey[1].sk_argument = ObjectIdGetDatum(relfilenumber);
+ skey[0].sk_argument = Int64GetDatum(relfilenumber);
scandesc = systable_beginscan(relation,
- ClassTblspcRelfilenodeIndexId,
+ ClassRelfilenodeIndexId,
true,
NULL,
- 2,
+ 1,
skey);
found = false;
@@ -213,11 +192,10 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
if (found)
elog(ERROR,
- "unexpected duplicate for tablespace %u, relfilenumber %u",
- reltablespace, relfilenumber);
+ "unexpected duplicate for relfilenumber " INT64_FORMAT,
+ relfilenumber);
found = true;
- Assert(classform->reltablespace == reltablespace);
Assert(classform->relfilenode == relfilenumber);
relid = classform->oid;
}
@@ -235,7 +213,8 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
* caused cache invalidations to be executed which would have deleted a
* new entry if we had entered it above.
*/
- entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_ENTER, &found);
+ entry = hash_search(RelfilenumberMapHash, (void *) &relfilenumber,
+ HASH_ENTER, &found);
if (found)
elog(ERROR, "corrupted hashtable");
entry->relid = relid;
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index 781f8b8..3c1fef4 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -79,8 +79,8 @@ pg_control_system(PG_FUNCTION_ARGS)
Datum
pg_control_checkpoint(PG_FUNCTION_ARGS)
{
- Datum values[18];
- bool nulls[18];
+ Datum values[19];
+ bool nulls[19];
TupleDesc tupdesc;
HeapTuple htup;
ControlFileData *ControlFile;
@@ -129,6 +129,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
XIDOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 18, "checkpoint_time",
TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 19, "next_relfilenumber",
+ INT8OID, -1, 0);
tupdesc = BlessTupleDesc(tupdesc);
/* Read the control file. */
@@ -202,6 +204,9 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
values[17] = TimestampTzGetDatum(time_t_to_timestamptz(ControlFile->checkPointCopy.time));
nulls[17] = false;
+ values[18] = Int64GetDatum(ControlFile->checkPointCopy.nextRelFileNumber);
+ nulls[18] = false;
+
htup = heap_form_tuple(tupdesc, values, nulls);
PG_RETURN_DATUM(HeapTupleGetDatum(htup));
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 324ccf7..ddb5ec1 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -485,9 +485,7 @@ main(int argc, char *argv[])
mode = PG_MODE_ENABLE;
break;
case 'f':
- if (!option_parse_int(optarg, "-f/--filenode", 0,
- INT_MAX,
- NULL))
+ if (!option_parse_relfilenumber(optarg, "-f/--filenode"))
exit(1);
only_filenode = pstrdup(optarg);
break;
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index c390ec5..4b6ff4d 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -250,6 +250,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
EpochFromFullTransactionId(ControlFile->checkPointCopy.nextXid),
XidFromFullTransactionId(ControlFile->checkPointCopy.nextXid));
+ printf(_("Latest checkpoint's NextRelFileNumber: %lld\n"),
+ (long long) ControlFile->checkPointCopy.nextRelFileNumber);
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index d25709a..632b7fb 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3183,15 +3183,15 @@ dumpDatabase(Archive *fout)
atooid(PQgetvalue(lo_res, i, ii_oid)));
oid = atooid(PQgetvalue(lo_res, i, ii_oid));
- relfilenumber = atooid(PQgetvalue(lo_res, i, ii_relfilenode));
+ relfilenumber = atorelnumber(PQgetvalue(lo_res, i, ii_relfilenode));
if (oid == LargeObjectRelationId)
appendPQExpBuffer(loOutQry,
- "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
else if (oid == LargeObjectLOidPNIndexId)
appendPQExpBuffer(loOutQry,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
}
@@ -4876,16 +4876,16 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
relkind = *PQgetvalue(upgrade_res, 0, PQfnumber(upgrade_res, "relkind"));
- relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "relfilenode")));
+ relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "relfilenode")));
toast_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "reltoastrelid")));
- toast_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_relfilenode")));
+ toast_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_relfilenode")));
toast_index_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "indexrelid")));
- toast_index_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_index_relfilenode")));
+ toast_index_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_index_relfilenode")));
appendPQExpBufferStr(upgrade_buffer,
"\n-- For binary upgrade, must preserve pg_class oids and relfilenodes\n");
@@ -4903,7 +4903,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
*/
if (RelFileNumberIsValid(relfilenumber) && relkind != RELKIND_PARTITIONED_TABLE)
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
/*
@@ -4917,7 +4917,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_toast_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_relfilenumber);
/* every toast table has an index */
@@ -4925,7 +4925,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_index_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_index_relfilenumber);
}
@@ -4938,7 +4938,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
pg_class_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
}
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 269ed64..8be5e66 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -538,7 +538,7 @@ isRelDataFile(const char *path)
segNo = 0;
matched = false;
- nmatch = sscanf(path, "global/%u.%u", &rlocator.relNumber, &segNo);
+ nmatch = sscanf(path, "global/" INT64_FORMAT ".%u", &rlocator.relNumber, &segNo);
if (nmatch == 1 || nmatch == 2)
{
rlocator.spcOid = GLOBALTABLESPACE_OID;
@@ -547,7 +547,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "base/%u/%u.%u",
+ nmatch = sscanf(path, "base/%u/" INT64_FORMAT ".%u",
&rlocator.dbOid, &rlocator.relNumber, &segNo);
if (nmatch == 2 || nmatch == 3)
{
@@ -556,7 +556,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/%u.%u",
+ nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/" INT64_FORMAT ".%u",
&rlocator.spcOid, &rlocator.dbOid, &rlocator.relNumber,
&segNo);
if (nmatch == 3 || nmatch == 4)
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 53ea348..1d3982f 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -527,7 +527,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
relname = PQgetvalue(res, relnum, i_relname);
curr->relname = pg_strdup(relname);
- curr->relfilenumber = atooid(PQgetvalue(res, relnum, i_relfilenumber));
+ curr->relfilenumber =
+ atorelnumber(PQgetvalue(res, relnum, i_relfilenumber));
curr->tblsp_alloc = false;
/* Is the tablespace oid non-default? */
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index 115faa2..7ab1bcc 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -15,10 +15,8 @@
* oids are the same between old and new clusters. This is important
* because toast oids are stored as toast pointers in user tables.
*
- * While pg_class.oid and pg_class.relfilenode are initially the same in a
- * cluster, they can diverge due to CLUSTER, REINDEX, or VACUUM FULL. We
- * control assignments of pg_class.relfilenode because we want the filenames
- * to match between the old and new cluster.
+ * We control assignments of pg_class.relfilenode because we want the
+ * filenames to match between the old and new cluster.
*
* We control assignment of pg_tablespace.oid because we want the oid to match
* between the old and new cluster.
diff --git a/src/bin/pg_upgrade/relfilenumber.c b/src/bin/pg_upgrade/relfilenumber.c
index c3c7402..c8b10e4 100644
--- a/src/bin/pg_upgrade/relfilenumber.c
+++ b/src/bin/pg_upgrade/relfilenumber.c
@@ -190,14 +190,14 @@ transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_fro
else
snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
- snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
+ snprintf(old_file, sizeof(old_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->old_tablespace,
map->old_tablespace_suffix,
map->db_oid,
map->relfilenumber,
type_suffix,
extent_suffix);
- snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
+ snprintf(new_file, sizeof(new_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->new_tablespace,
map->new_tablespace_suffix,
map->db_oid,
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 6528113..9aca583 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -884,7 +884,7 @@ main(int argc, char **argv)
}
break;
case 'R':
- if (sscanf(optarg, "%u/%u/%u",
+ if (sscanf(optarg, "%u/%u/" INT64_FORMAT,
&config.filter_by_relation.spcOid,
&config.filter_by_relation.dbOid,
&config.filter_by_relation.relNumber) != 3 ||
diff --git a/src/bin/scripts/t/090_reindexdb.pl b/src/bin/scripts/t/090_reindexdb.pl
index e706d68..de5cee6 100644
--- a/src/bin/scripts/t/090_reindexdb.pl
+++ b/src/bin/scripts/t/090_reindexdb.pl
@@ -40,7 +40,7 @@ my $toast_index = $node->safe_psql('postgres',
# REINDEX operations. A set of relfilenodes is saved from the catalogs
# and then compared with pg_class.
$node->safe_psql('postgres',
- 'CREATE TABLE index_relfilenodes (parent regclass, indname text, indoid oid, relfilenode oid);'
+ 'CREATE TABLE index_relfilenodes (parent regclass, indname text, indoid oid, relfilenode int8);'
);
# Save the relfilenode of a set of toast indexes, one from the catalog
# pg_constraint and one from the test table.
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 1b6b620..0774e3f 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -149,10 +149,10 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
Assert(dbOid == 0);
Assert(backendId == InvalidBackendId);
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("global/%u_%s",
+ path = psprintf("global/" INT64_FORMAT "_%s",
relNumber, forkNames[forkNumber]);
else
- path = psprintf("global/%u", relNumber);
+ path = psprintf("global/" INT64_FORMAT, relNumber);
}
else if (spcOid == DEFAULTTABLESPACE_OID)
{
@@ -160,21 +160,21 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/%u_%s",
+ path = psprintf("base/%u/" INT64_FORMAT "_%s",
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/%u",
+ path = psprintf("base/%u/" INT64_FORMAT,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/t%d_%u_%s",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT "_%s",
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/t%d_%u",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT,
dbOid, backendId, relNumber);
}
}
@@ -184,24 +184,24 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber);
}
diff --git a/src/fe_utils/option_utils.c b/src/fe_utils/option_utils.c
index abea881..7d49359 100644
--- a/src/fe_utils/option_utils.c
+++ b/src/fe_utils/option_utils.c
@@ -82,3 +82,42 @@ option_parse_int(const char *optarg, const char *optname,
*result = val;
return true;
}
+
+/*
+ * option_parse_relfilenumber
+ *
+ * Parse a relfilenumber value for an option. If parsing is successful,
+ * returns true; if parsing fails, returns false.
+ */
+bool
+option_parse_relfilenumber(const char *optarg, const char *optname)
+{
+ char *endptr;
+ int64 val;
+
+ errno = 0;
+ val = strtoi64(optarg, &endptr, 10);
+
+ /*
+ * Skip any trailing whitespace; if anything but whitespace remains before
+ * the terminating character, fail.
+ */
+ while (*endptr != '\0' && isspace((unsigned char) *endptr))
+ endptr++;
+
+ if (endptr == optarg || *endptr != '\0')
+ {
+ pg_log_error("invalid value \"%s\" for option %s",
+ optarg, optname);
+ return false;
+ }
+
+ if (val < 0 || val > MAX_RELFILENUMBER)
+ {
+ pg_log_error("%s must be in range " INT64_FORMAT ".." INT64_FORMAT,
+ optname, (RelFileNumber) 0, MAX_RELFILENUMBER);
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 775471d..36acd8b 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -196,6 +196,30 @@ FullTransactionIdAdvance(FullTransactionId *dest)
#define FirstUnpinnedObjectId 12000
#define FirstNormalObjectId 16384
+/* ----------
+ * RelFileNumber zero is InvalidRelFileNumber.
+ *
+ * For system tables (OID < FirstNormalObjectId), the initial storage is
+ * created with a relfilenumber equal to the table's OID. Any relfilenumber
+ * subsequently allocated by GetNewRelFileNumber() starts at 100000. Thus,
+ * when upgrading from an older cluster, the relation storage path for a
+ * user table from the old cluster cannot conflict with the relation
+ * storage path for a system table in the new cluster. In any case, the
+ * new cluster must not contain any user tables while the upgrade is in
+ * progress, so we needn't worry about them.
+ * ----------
+ */
+#define FirstNormalRelFileNumber ((RelFileNumber) 100000)
+
+#define CHECK_RELFILENUMBER_RANGE(relfilenumber) \
+do { \
+ if ((relfilenumber) < 0 || (relfilenumber) > MAX_RELFILENUMBER) \
+ ereport(ERROR, \
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE), \
+ errmsg("relfilenumber %lld is out of range", \
+ (long long) (relfilenumber))); \
+} while (0)
+
/*
* VariableCache is a data structure in shared memory that is used to track
* OID and XID assignment state. For largely historical reasons, there is
@@ -215,6 +239,14 @@ typedef struct VariableCacheData
uint32 oidCount; /* OIDs available before must do XLOG work */
/*
+ * These fields are protected by RelFileNumberGenLock.
+ */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber to assign */
+ RelFileNumber loggedRelFileNumber; /* last logged relfilenumber */
+ XLogRecPtr loggedRelFileNumberRecPtr; /* xlog record pointer w.r.t.
+ * loggedRelFileNumber */
+
+ /*
* These fields are protected by XidGenLock.
*/
FullTransactionId nextXid; /* next XID to assign */
@@ -293,6 +325,9 @@ extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
extern bool ForceTransactionIdLimitUpdate(void);
extern Oid GetNewObjectId(void);
+extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
+ char relpersistence);
+extern void SetNextRelFileNumber(RelFileNumber relnumber);
extern void StopGeneratingPinnedObjectIds(void);
#ifdef USE_ASSERT_CHECKING
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index cd674c3..78660f1 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -234,6 +234,8 @@ extern void CreateCheckPoint(int flags);
extern bool CreateRestartPoint(int flags);
extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN);
extern void XLogPutNextOid(Oid nextOid);
+extern void LogNextRelFileNumber(RelFileNumber nextrelnumber,
+ XLogRecPtr *prevrecptr);
extern XLogRecPtr XLogRestorePoint(const char *rpName);
extern void UpdateFullPageWrites(void);
extern void GetFullPageWriteInfo(XLogRecPtr *RedoRecPtr_p, bool *doPageWrites_p);
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index e1c85f9..b452530 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -38,8 +38,5 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
-extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
- Relation pg_class,
- char relpersistence);
#endif /* CATALOG_H */
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index e1f4eef..cacfc11 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -34,6 +34,13 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* oid */
Oid oid;
+ /* access method; 0 if not a table / index */
+ Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
+
+ /* identifier of physical storage file */
+ /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
+ int64 relfilenode BKI_DEFAULT(0);
+
/* class name */
NameData relname;
@@ -49,13 +56,6 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* class owner */
Oid relowner BKI_DEFAULT(POSTGRES) BKI_LOOKUP(pg_authid);
- /* access method; 0 if not a table / index */
- Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
-
- /* identifier of physical storage file */
- /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
- Oid relfilenode BKI_DEFAULT(0);
-
/* identifier of table space for relation (0 means default for database) */
Oid reltablespace BKI_DEFAULT(0) BKI_LOOKUP_OPT(pg_tablespace);
@@ -154,7 +154,7 @@ typedef FormData_pg_class *Form_pg_class;
DECLARE_UNIQUE_INDEX_PKEY(pg_class_oid_index, 2662, ClassOidIndexId, on pg_class using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_class_relname_nsp_index, 2663, ClassNameNspIndexId, on pg_class using btree(relname name_ops, relnamespace oid_ops));
-DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode oid_ops));
+DECLARE_INDEX(pg_class_relfilenode_index, 3455, ClassRelfilenodeIndexId, on pg_class using btree(relfilenode int8_ops));
#ifdef EXPOSE_TO_CLIENT_CODE
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 06368e2..096222f 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -41,6 +41,7 @@ typedef struct CheckPoint
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
FullTransactionId nextXid; /* next free transaction ID */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
@@ -78,6 +79,7 @@ typedef struct CheckPoint
#define XLOG_FPI 0xB0
/* 0xC0 is used in Postgres 9.5-11 */
#define XLOG_OVERWRITE_CONTRECORD 0xD0
+#define XLOG_NEXT_RELFILENUMBER 0xE0
/*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index a07e737..8b72f8a 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7329,11 +7329,11 @@
proname => 'pg_indexes_size', provolatile => 'v', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_indexes_size' },
{ oid => '2999', descr => 'filenode identifier of relation',
- proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'oid',
+ proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_relation_filenode' },
{ oid => '3454', descr => 'relation OID for filenode and tablespace',
proname => 'pg_filenode_relation', provolatile => 's',
- prorettype => 'regclass', proargtypes => 'oid oid',
+ prorettype => 'regclass', proargtypes => 'oid int8',
prosrc => 'pg_filenode_relation' },
{ oid => '3034', descr => 'file path of relation',
proname => 'pg_relation_filepath', provolatile => 's', prorettype => 'text',
@@ -11125,15 +11125,15 @@
prosrc => 'binary_upgrade_set_missing_value' },
{ oid => '4545', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_heap_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_heap_relfilenode' },
{ oid => '4546', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_index_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_index_relfilenode' },
{ oid => '4547', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_toast_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_toast_relfilenode' },
{ oid => '4548', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_pg_tablespace_oid', provolatile => 'v',
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index 3ab7132..f84e22c 100644
--- a/src/include/common/relpath.h
+++ b/src/include/common/relpath.h
@@ -28,6 +28,7 @@
/* Characters to allow for an OID in a relation path */
#define OIDCHARS 10 /* max chars printed by %u */
+#define RELNUMBERCHARS 20 /* max chars printed by %llu */
/*
* Stuff for fork names.
diff --git a/src/include/fe_utils/option_utils.h b/src/include/fe_utils/option_utils.h
index 03c09fd..2508a61 100644
--- a/src/include/fe_utils/option_utils.h
+++ b/src/include/fe_utils/option_utils.h
@@ -22,5 +22,7 @@ extern void handle_help_version_opts(int argc, char *argv[],
extern bool option_parse_int(const char *optarg, const char *optname,
int min_range, int max_range,
int *result);
+extern bool option_parse_relfilenumber(const char *optarg,
+ const char *optname);
#endif /* OPTION_UTILS_H */
diff --git a/src/include/postgres_ext.h b/src/include/postgres_ext.h
index c9774fa..12b6b80 100644
--- a/src/include/postgres_ext.h
+++ b/src/include/postgres_ext.h
@@ -48,11 +48,17 @@ typedef PG_INT64_TYPE pg_int64;
/*
* RelFileNumber data type identifies the specific relation file name.
+ * RelFileNumber is unique within a cluster.
*/
-typedef Oid RelFileNumber;
-#define InvalidRelFileNumber ((RelFileNumber) InvalidOid)
+typedef pg_int64 RelFileNumber;
+
+#define InvalidRelFileNumber ((RelFileNumber) 0)
#define RelFileNumberIsValid(relnumber) \
((bool) ((relnumber) != InvalidRelFileNumber))
+#define atorelnumber(x) ((RelFileNumber) strtoull((x), NULL, 10))
+
+/* Max value of the relfilenumber; the relfilenumber is 56 bits wide. */
+#define MAX_RELFILENUMBER INT64CONST(0x00FFFFFFFFFFFFFF)
/*
* Identifiers of error message fields. Kept here to keep common
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 406db6b..1301301 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -92,29 +92,73 @@ typedef struct buftag
{
Oid spcOid; /* tablespace oid */
Oid dbOid; /* database oid */
- RelFileNumber relNumber; /* relation file number */
- ForkNumber forkNum; /* fork number */
+
+ /*
+ * Represents the relfilenumber and the fork number. The high 8 bits of
+ * the first 32-bit integer hold the fork number, while the remaining
+ * 24 bits of the first integer and all 32 bits of the second integer
+ * hold the relfilenumber, which makes the relfilenumber 56 bits wide.
+ * The reason for making it 56 bits wide rather than a full 64 bits is
+ * that widening it to 64 bits would increase the size of the BufferTag.
+ * We also use two 32-bit integers instead of a single 64-bit integer
+ * in order to avoid the 8-byte alignment padding that a 64-bit member
+ * would impose on the BufferTag structure.
+ */
+ uint32 relForkDetails[2];
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
+/* High relNumber bits in relForkDetails[0] */
+#define BUFTAG_RELNUM_HIGH_BITS 24
+
+/* Low relNumber bits in relForkDetails[1] */
+#define BUFTAG_RELNUM_LOW_BITS 32
+
+/* Mask to fetch high bits of relNumber from relForkDetails[0] */
+#define BUFTAG_RELNUM_HIGH_MASK ((1U << BUFTAG_RELNUM_HIGH_BITS) - 1)
+
+/* Mask to fetch low bits of relNumber from relForkDetails[1] */
+#define BUFTAG_RELNUM_LOW_MASK 0XFFFFFFFF
+
static inline RelFileNumber
BufTagGetRelNumber(const BufferTag *tag)
{
- return tag->relNumber;
+ uint64 relnum;
+
+ relnum = ((uint64) tag->relForkDetails[0]) & BUFTAG_RELNUM_HIGH_MASK;
+ relnum = (relnum << BUFTAG_RELNUM_LOW_BITS) | tag->relForkDetails[1];
+
+ Assert(relnum <= MAX_RELFILENUMBER);
+ return (RelFileNumber) relnum;
}
static inline ForkNumber
BufTagGetForkNum(const BufferTag *tag)
{
- return tag->forkNum;
+ int8 ret;
+
+ StaticAssertStmt(MAX_FORKNUM <= INT8_MAX,
+ "MAX_FORKNUM can't be greater than INT8_MAX");
+
+ ret = (int8) (tag->relForkDetails[0] >> BUFTAG_RELNUM_HIGH_BITS);
+ return (ForkNumber) ret;
}
static inline void
BufTagSetRelForkDetails(BufferTag *tag, RelFileNumber relnumber,
ForkNumber forknum)
{
- tag->relNumber = relnumber;
- tag->forkNum = forknum;
+ uint64 relnum;
+
+ Assert(relnumber <= MAX_RELFILENUMBER);
+ Assert(forknum <= MAX_FORKNUM);
+
+ relnum = relnumber;
+
+ tag->relForkDetails[0] = (relnum >> BUFTAG_RELNUM_LOW_BITS) &
+ BUFTAG_RELNUM_HIGH_MASK;
+ tag->relForkDetails[0] |= (forknum << BUFTAG_RELNUM_HIGH_BITS);
+ tag->relForkDetails[1] = relnum & BUFTAG_RELNUM_LOW_MASK;
}
static inline RelFileLocator
@@ -153,9 +197,9 @@ BufferTagsEqual(const BufferTag *tag1, const BufferTag *tag2)
{
return (tag1->spcOid == tag2->spcOid) &&
(tag1->dbOid == tag2->dbOid) &&
- (tag1->relNumber == tag2->relNumber) &&
- (tag1->blockNum == tag2->blockNum) &&
- (tag1->forkNum == tag2->forkNum);
+ (tag1->relForkDetails[0] == tag2->relForkDetails[0]) &&
+ (tag1->relForkDetails[1] == tag2->relForkDetails[1]) &&
+ (tag1->blockNum == tag2->blockNum);
}
static inline bool
diff --git a/src/include/storage/relfilelocator.h b/src/include/storage/relfilelocator.h
index 10f41f3..ef90464 100644
--- a/src/include/storage/relfilelocator.h
+++ b/src/include/storage/relfilelocator.h
@@ -32,10 +32,11 @@
* Nonzero dbOid values correspond to pg_database.oid.
*
* relNumber identifies the specific relation. relNumber corresponds to
- * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
- * to assign new physical files to relations in some situations).
- * Notice that relNumber is only unique within a database in a particular
- * tablespace.
+ * pg_class.relfilenode. Notice that relNumber values are assigned by
+ * GetNewRelFileNumber(), which will only ever assign the same value once
+ * during the lifetime of a cluster. However, since CREATE DATABASE duplicates
+ * the relfilenumbers of the template database, the values are in practice only
+ * unique within a database, not globally.
*
* Note: spcOid must be GLOBALTABLESPACE_OID if and only if dbOid is
* zero. We support shared relations only in the "global" tablespace.
@@ -75,6 +76,9 @@ typedef struct RelFileLocatorBackend
BackendId backend;
} RelFileLocatorBackend;
+#define SizeOfRelFileLocatorBackend \
+ (offsetof(RelFileLocatorBackend, backend) + sizeof(BackendId))
+
#define RelFileLocatorBackendIsTemp(rlocator) \
((rlocator).backend != InvalidBackendId)
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index d63f4f1..a489ccc 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -2164,9 +2164,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2175,10 +2174,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+---------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | t | own | child 0 index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | t | own | child 1 index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | t | orig | child 0 index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | t | orig | child 1 index
at_partitioned_id_name_key | t | none | parent index
(6 rows)
@@ -2198,9 +2197,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2209,10 +2207,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+--------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | f | own | parent index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | f | own | parent index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | f | new | parent index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | f | new | parent index
at_partitioned_id_name_key | f | none | parent index
(6 rows)
@@ -2560,7 +2558,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/expected/fast_default.out b/src/test/regress/expected/fast_default.out
index 91f2571..0a35f33 100644
--- a/src/test/regress/expected/fast_default.out
+++ b/src/test/regress/expected/fast_default.out
@@ -3,8 +3,8 @@
--
SET search_path = fast_default;
CREATE SCHEMA fast_default;
-CREATE TABLE m(id OID);
-INSERT INTO m VALUES (NULL::OID);
+CREATE TABLE m(id BIGINT);
+INSERT INTO m VALUES (NULL::BIGINT);
CREATE FUNCTION set(tabname name) RETURNS VOID
AS $$
BEGIN
diff --git a/src/test/regress/expected/oidjoins.out b/src/test/regress/expected/oidjoins.out
index 215eb89..af57470 100644
--- a/src/test/regress/expected/oidjoins.out
+++ b/src/test/regress/expected/oidjoins.out
@@ -74,11 +74,11 @@ NOTICE: checking pg_type {typcollation} => pg_collation {oid}
NOTICE: checking pg_attribute {attrelid} => pg_class {oid}
NOTICE: checking pg_attribute {atttypid} => pg_type {oid}
NOTICE: checking pg_attribute {attcollation} => pg_collation {oid}
+NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {relnamespace} => pg_namespace {oid}
NOTICE: checking pg_class {reltype} => pg_type {oid}
NOTICE: checking pg_class {reloftype} => pg_type {oid}
NOTICE: checking pg_class {relowner} => pg_authid {oid}
-NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {reltablespace} => pg_tablespace {oid}
NOTICE: checking pg_class {reltoastrelid} => pg_class {oid}
NOTICE: checking pg_class {relrewrite} => pg_class {oid}
diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql
index e7013f5..c9622f6 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -1478,9 +1478,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1499,9 +1498,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1641,7 +1639,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/sql/fast_default.sql b/src/test/regress/sql/fast_default.sql
index 16a3b7c..819ec40 100644
--- a/src/test/regress/sql/fast_default.sql
+++ b/src/test/regress/sql/fast_default.sql
@@ -4,8 +4,8 @@
SET search_path = fast_default;
CREATE SCHEMA fast_default;
-CREATE TABLE m(id OID);
-INSERT INTO m VALUES (NULL::OID);
+CREATE TABLE m(id BIGINT);
+INSERT INTO m VALUES (NULL::BIGINT);
CREATE FUNCTION set(tabname name) RETURNS VOID
AS $$
--
1.8.3.1
On Tue, Aug 30, 2022 at 6:15 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Fri, Aug 26, 2022 at 9:33 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Aug 26, 2022 at 7:01 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
While working on this solution I noticed one issue. Basically, the
problem is that during binary upgrade when we try to rewrite a heap we
would expect that “binary_upgrade_next_heap_pg_class_oid” and
“binary_upgrade_next_heap_pg_class_relfilenumber” are already set for
creating a new heap. But we are not preserving anything so we don't
have those values. One option to this problem is that we can first
start the postmaster in non-binary upgrade mode perform all conflict
checking and rewrite and stop the postmaster. Then start postmaster
again and perform the restore as we are doing now. Although we will
have to start/stop the postmaster one extra time, we have a solution.

Yeah, that seems OK. Or we could add a new function,
binary_upgrade_allow_relation_oid_and_relfilenode_assignment(bool).
Not sure which way is better.

I have found one more issue with this approach of rewriting the
conflicting table. Earlier I thought we could do the conflict checking
and rewriting inside create_new_objects(), right before the restore
command. But after implementing (and testing) this, I realized that we
DROP and CREATE the database while restoring the dump, which means it
will again generate the conflicting system tables. So theoretically
the rewriting should go between the CREATE DATABASE and the restore of
the other objects, but as of now both creating the database and
restoring the other objects are part of a single dump file. I haven't
yet analyzed how feasible it is to generate the dump in two parts, the
first part just to create the database and the second part to restore
the rest of the objects.
Isn't this happening because we are passing the "--clean
--create"/"--create" options to pg_restore in create_new_objects()? If
so, then one idea to decouple these steps would be to not use those
options: perform the drop/create separately via commands (for create,
we would need to generate the command the same way we do while
generating the dump in custom format), then rewrite the conflicting
tables, and finally restore the dump.
--
With Regards,
Amit Kapila.
On Sat, Sep 3, 2022 at 5:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I have found one more issue with this approach of rewriting the
conflicting table. Earlier I thought we could do the conflict
checking and rewriting inside create_new_objects() right before the
restore command. But after implementing (while testing) this I
realized that we DROP and CREATE the database while restoring the dump
that means it will again generate the conflicting system tables. So
theoretically the rewriting should go in between the CREATE DATABASE
and restoring the object but as of now both create database and
restoring other objects are part of a single dump file. I haven't yet
analyzed how feasible it is to generate the dump in two parts, first
part just to create the database and in second part restore the rest
of the object.

Isn't this happening because we are passing "--clean
--create"/"--create" options to pg_restore in create_new_objects()? If
so, then I think one idea to decouple would be to not use those
options. Perform drop/create separately via commands (for create, we
need to generate the command as we are generating while generating the
dump in custom format), then rewrite the conflicting tables, and
finally restore the dump.
Hmm, you are right. So I think something like this is possible to do,
I will explore this more. Thanks for the idea.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Sun, Sep 4, 2022 at 9:27 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Sat, Sep 3, 2022 at 5:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Isn't this happening because we are passing "--clean
--create"/"--create" options to pg_restore in create_new_objects()? If
so, then I think one idea to decouple would be to not use those
options. Perform drop/create separately via commands (for create, we
need to generate the command as we are generating while generating the
dump in custom format), then rewrite the conflicting tables, and
finally restore the dump.

Hmm, you are right. So I think something like this is possible to do,
I will explore this more. Thanks for the idea.
I have explored this area more and also tried to come up with a
working prototype. While working on this I realized that we would have
to execute almost all of the code that is generated as part of
dumpDatabase() and dumpACL(), which is basically:
1. UPDATE pg_catalog.pg_database SET datistemplate = false
2. DROP DATABASE
3. CREATE DATABASE with all the database properties like ENCODING,
LOCALE_PROVIDER, LOCALE, LC_COLLATE, LC_CTYPE, ICU_LOCALE,
COLLATION_VERSION, TABLESPACE
4. COMMENT ON DATABASE
5. Logic inside dumpACL()
I feel duplicating logic like this is really error-prone, but I do not
see any clear way to reuse the code, as dumpDatabase() depends heavily
on the Archive handle and on generating the dump file.
So far I have implemented most of this logic, except for a few pieces,
e.g. dumpACL(), comments on the database, etc. Before we go too far in
this direction I wanted to get the opinions of others.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, Sep 6, 2022 at 4:40 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
I have explored this area more and also tried to come up with a
working prototype, so while working on this I realized that we would
have almost to execute all the code which is getting generated as part
of the dumpDatabase() and dumpACL() which is basically,
1. UPDATE pg_catalog.pg_database SET datistemplate = false
2. DROP DATABASE
3. CREATE DATABASE with all the database properties like ENCODING,
LOCALE_PROVIDER, LOCALE, LC_COLLATE, LC_CTYPE, ICU_LOCALE,
COLLATION_VERSION, TABLESPACE
4. COMMENT ON DATABASE
5. Logic inside dumpACL()
I feel duplicating logic like this is really error-prone, but I do not
find any clear way to reuse the code as dumpDatabase() has a high
dependency on the Archive handle and generating the dump file.
Yeah, I don't think this is the way to go at all. The duplicated logic
is likely to get broken, and is also likely to annoy the next person
who has to maintain it.
I suggest that for now we fall back on making the initial
RelFileNumber for a system table equal to pg_class.oid. I don't really
love that system and I think maybe we should change it at some point
in the future, but all the alternatives seem too complicated to cram
them into the current patch.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Tue, Sep 6, 2022 at 11:07 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Sep 6, 2022 at 4:40 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
I have explored this area more and also tried to come up with a
working prototype, so while working on this I realized that we would
have almost to execute all the code which is getting generated as part
of the dumpDatabase() and dumpACL() which is basically,
1. UPDATE pg_catalog.pg_database SET datistemplate = false
2. DROP DATABASE
3. CREATE DATABASE with all the database properties like ENCODING,
LOCALE_PROVIDER, LOCALE, LC_COLLATE, LC_CTYPE, ICU_LOCALE,
COLLATION_VERSION, TABLESPACE
4. COMMENT ON DATABASE
5. Logic inside dumpACL()
I feel duplicating logic like this is really error-prone, but I do not
find any clear way to reuse the code as dumpDatabase() has a high
dependency on the Archive handle and generating the dump file.

Yeah, I don't think this is the way to go at all. The duplicated logic
is likely to get broken, and is also likely to annoy the next person
who has to maintain it.
Right
I suggest that for now we fall back on making the initial
RelFileNumber for a system table equal to pg_class.oid. I don't really
love that system and I think maybe we should change it at some point
in the future, but all the alternatives seem too complicated to cram
them into the current patch.
That makes sense.
On a separate note, while reviewing the latest patch I noticed some
risk of using an unflushed relfilenumber in the GetNewRelFileNumber()
function. Basically, in the current code the flushing logic is tightly
coupled with the logic that logs new relfilenumbers, and that might not
work for all values of VAR_RELNUMBER_NEW_XLOG_THRESHOLD. So the idea
is to keep the flushing logic separate from the logging. I am working
on this and will post the patch soon.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Thu, Sep 8, 2022 at 4:10 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On a separate note, while reviewing the latest patch I see there is some risk of using the unflushed relfilenumber in GetNewRelFileNumber() function. Basically, in the current code, the flushing logic is tightly coupled with the logging new relfilenumber logic and that might not work with all the values of the VAR_RELNUMBER_NEW_XLOG_THRESHOLD. So the idea is we need to keep the flushing logic separate from the logging, I am working on the idea and I will post the patch soon.
I have fixed the issue, so now we track nextRelFileNumber,
loggedRelFileNumber and flushedRelFileNumber. Whenever
nextRelFileNumber comes within VAR_RELNUMBER_NEW_XLOG_THRESHOLD of
loggedRelFileNumber, we log VAR_RELNUMBER_PER_XLOG more
relfilenumbers. And whenever nextRelFileNumber reaches
flushedRelFileNumber, we XLogFlush the WAL up to the last
loggedRelFileNumber record. Ideally flushedRelFileNumber should always
be VAR_RELNUMBER_PER_XLOG behind loggedRelFileNumber, so we could
avoid tracking flushedRelFileNumber separately, but I feel that
keeping track of it explicitly is cleaner and easier to understand.
For more details, refer to the code in GetNewRelFileNumber().
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachments:
v17-0001-Widen-relfilenumber-from-32-bits-to-56-bits.patchtext/x-patch; charset=UTF-8; name=v17-0001-Widen-relfilenumber-from-32-bits-to-56-bits.patchDownload
From d04432467d34ebdc95934ab85409b1fad037c6cd Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Fri, 26 Aug 2022 10:20:18 +0530
Subject: [PATCH v17] Widen relfilenumber from 32 bits to 56 bits
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Currently the relfilenumber is 32 bits wide, which carries a risk of
wraparound that would let a relfilenumber be reused. To guard against
such reuse there is a complicated workaround that leaves a 0-length
tombstone file around until the next checkpoint, and when we allocate
a new relfilenumber we must also loop to check for an on-disk conflict.

This patch makes the relfilenumber 56 bits wide, with no provision for
wraparound. After this change we can get rid of the 0-length tombstone
files and of the loop that checks for on-disk relfilenumber conflicts.

The reason for making it 56 bits wide rather than a full 64 bits is
that a 64-bit relfilenumber would increase the size of the BufferTag,
which would increase memory usage and might also hurt performance. To
avoid that, the buffer tag uses 8 bits for the fork number and 56 bits
for the relfilenumber.
---
contrib/pg_buffercache/Makefile | 4 +-
.../pg_buffercache/pg_buffercache--1.3--1.4.sql | 30 ++++
contrib/pg_buffercache/pg_buffercache.control | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 39 +++-
contrib/pg_prewarm/autoprewarm.c | 4 +-
contrib/pg_walinspect/expected/pg_walinspect.out | 4 +-
contrib/pg_walinspect/sql/pg_walinspect.sql | 4 +-
contrib/test_decoding/expected/rewrite.out | 2 +-
contrib/test_decoding/sql/rewrite.sql | 2 +-
doc/src/sgml/catalogs.sgml | 2 +-
doc/src/sgml/pgbuffercache.sgml | 2 +-
doc/src/sgml/storage.sgml | 5 +-
src/backend/access/gin/ginxlog.c | 2 +-
src/backend/access/rmgrdesc/gistdesc.c | 2 +-
src/backend/access/rmgrdesc/heapdesc.c | 2 +-
src/backend/access/rmgrdesc/nbtdesc.c | 2 +-
src/backend/access/rmgrdesc/seqdesc.c | 2 +-
src/backend/access/rmgrdesc/xlogdesc.c | 21 ++-
src/backend/access/transam/README | 5 +-
src/backend/access/transam/varsup.c | 199 ++++++++++++++++++++-
src/backend/access/transam/xlog.c | 50 ++++++
src/backend/access/transam/xlogprefetcher.c | 14 +-
src/backend/access/transam/xlogrecovery.c | 6 +-
src/backend/access/transam/xlogutils.c | 12 +-
src/backend/backup/basebackup.c | 2 +-
src/backend/catalog/catalog.c | 95 ----------
src/backend/catalog/heap.c | 25 +--
src/backend/catalog/index.c | 11 +-
src/backend/catalog/storage.c | 8 +
src/backend/commands/tablecmds.c | 12 +-
src/backend/commands/tablespace.c | 2 +-
src/backend/nodes/gen_node_support.pl | 4 +-
src/backend/replication/logical/decode.c | 1 +
src/backend/replication/logical/reorderbuffer.c | 2 +-
src/backend/storage/file/reinit.c | 28 +--
src/backend/storage/freespace/fsmpage.c | 2 +-
src/backend/storage/lmgr/lwlocknames.txt | 1 +
src/backend/storage/smgr/md.c | 7 +
src/backend/storage/smgr/smgr.c | 2 +-
src/backend/utils/adt/dbsize.c | 7 +-
src/backend/utils/adt/pg_upgrade_support.c | 13 +-
src/backend/utils/cache/relcache.c | 2 +-
src/backend/utils/cache/relfilenumbermap.c | 65 +++----
src/backend/utils/misc/pg_controldata.c | 9 +-
src/bin/pg_checksums/pg_checksums.c | 4 +-
src/bin/pg_controldata/pg_controldata.c | 2 +
src/bin/pg_dump/pg_dump.c | 26 +--
src/bin/pg_rewind/filemap.c | 6 +-
src/bin/pg_upgrade/info.c | 3 +-
src/bin/pg_upgrade/pg_upgrade.c | 6 +-
src/bin/pg_upgrade/relfilenumber.c | 4 +-
src/bin/pg_waldump/pg_waldump.c | 2 +-
src/bin/scripts/t/090_reindexdb.pl | 2 +-
src/common/relpath.c | 20 +--
src/fe_utils/option_utils.c | 39 ++++
src/include/access/transam.h | 34 ++++
src/include/access/xlog.h | 1 +
src/include/catalog/catalog.h | 3 -
src/include/catalog/pg_class.h | 16 +-
src/include/catalog/pg_control.h | 2 +
src/include/catalog/pg_proc.dat | 10 +-
src/include/common/relpath.h | 1 +
src/include/fe_utils/option_utils.h | 2 +
src/include/postgres_ext.h | 10 +-
src/include/storage/buf_internals.h | 62 ++++++-
src/include/storage/relfilelocator.h | 12 +-
src/test/regress/expected/alter_table.out | 24 ++-
src/test/regress/expected/fast_default.out | 4 +-
src/test/regress/expected/oidjoins.out | 2 +-
src/test/regress/sql/alter_table.sql | 8 +-
src/test/regress/sql/fast_default.sql | 4 +-
71 files changed, 692 insertions(+), 332 deletions(-)
create mode 100644 contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile
index d74b3e8..4d88eba 100644
--- a/contrib/pg_buffercache/Makefile
+++ b/contrib/pg_buffercache/Makefile
@@ -6,8 +6,8 @@ OBJS = \
pg_buffercache_pages.o
EXTENSION = pg_buffercache
-DATA = pg_buffercache--1.2.sql pg_buffercache--1.2--1.3.sql \
- pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql
+DATA = pg_buffercache--1.0--1.1.sql pg_buffercache--1.1--1.2.sql pg_buffercache--1.2.sql \
+ pg_buffercache--1.2--1.3.sql pg_buffercache--1.3--1.4.sql
PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time"
REGRESS = pg_buffercache
diff --git a/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
new file mode 100644
index 0000000..50956b1
--- /dev/null
+++ b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
@@ -0,0 +1,30 @@
+/* contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.4'" to load this file. \quit
+
+/* First we have to remove them from the extension */
+ALTER EXTENSION pg_buffercache DROP VIEW pg_buffercache;
+ALTER EXTENSION pg_buffercache DROP FUNCTION pg_buffercache_pages();
+
+/* Then we can drop them */
+DROP VIEW pg_buffercache;
+DROP FUNCTION pg_buffercache_pages();
+
+/* Now redefine */
+CREATE FUNCTION pg_buffercache_pages()
+RETURNS SETOF RECORD
+AS 'MODULE_PATHNAME', 'pg_buffercache_pages_v1_4'
+LANGUAGE C PARALLEL SAFE;
+
+CREATE VIEW pg_buffercache AS
+ SELECT P.* FROM pg_buffercache_pages() AS P
+ (bufferid integer, relfilenode int8, reltablespace oid, reldatabase oid,
+ relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
+ pinning_backends int4);
+
+-- Don't want these to be available to public.
+REVOKE ALL ON FUNCTION pg_buffercache_pages() FROM PUBLIC;
+REVOKE ALL ON pg_buffercache FROM PUBLIC;
+GRANT EXECUTE ON FUNCTION pg_buffercache_pages() TO pg_monitor;
+GRANT SELECT ON pg_buffercache TO pg_monitor;
diff --git a/contrib/pg_buffercache/pg_buffercache.control b/contrib/pg_buffercache/pg_buffercache.control
index 8c060ae..a82ae5f 100644
--- a/contrib/pg_buffercache/pg_buffercache.control
+++ b/contrib/pg_buffercache/pg_buffercache.control
@@ -1,5 +1,5 @@
# pg_buffercache extension
comment = 'examine the shared buffer cache'
-default_version = '1.3'
+default_version = '1.4'
module_pathname = '$libdir/pg_buffercache'
relocatable = true
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index c5754ea..912cbd8 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -59,9 +59,10 @@ typedef struct
* relation node/tablespace/database/blocknum and dirty indicator.
*/
PG_FUNCTION_INFO_V1(pg_buffercache_pages);
+PG_FUNCTION_INFO_V1(pg_buffercache_pages_v1_4);
-Datum
-pg_buffercache_pages(PG_FUNCTION_ARGS)
+static Datum
+pg_buffercache_pages_internal(PG_FUNCTION_ARGS, Oid rfn_typid)
{
FuncCallContext *funcctx;
Datum result;
@@ -103,7 +104,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
TupleDescInitEntry(tupledesc, (AttrNumber) 1, "bufferid",
INT4OID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 2, "relfilenode",
- OIDOID, -1, 0);
+ rfn_typid, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 3, "reltablespace",
OIDOID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 4, "reldatabase",
@@ -209,7 +210,24 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
}
else
{
- values[1] = ObjectIdGetDatum(fctx->record[i].relfilenumber);
+ if (rfn_typid == INT8OID)
+ values[1] =
+ Int64GetDatum((int64) fctx->record[i].relfilenumber);
+ else
+ {
+ Assert(rfn_typid == OIDOID);
+
+ if (fctx->record[i].relfilenumber > OID_MAX)
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("relfilenode %lld is too large to be represented as an OID",
+ (long long) fctx->record[i].relfilenumber),
+ errhint("Upgrade the extension using ALTER EXTENSION pg_buffercache UPDATE"));
+
+ values[1] =
+ ObjectIdGetDatum((Oid) fctx->record[i].relfilenumber);
+ }
+
nulls[1] = false;
values[2] = ObjectIdGetDatum(fctx->record[i].reltablespace);
nulls[2] = false;
@@ -237,3 +255,16 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
else
SRF_RETURN_DONE(funcctx);
}
+
+/* entry point for old extension version */
+Datum
+pg_buffercache_pages(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, OIDOID);
+}
+
+Datum
+pg_buffercache_pages_v1_4(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, INT8OID);
+}
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index c8d673a..6bb6da6 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -345,7 +345,7 @@ apw_load_buffers(void)
{
unsigned forknum;
- if (fscanf(file, "%u,%u,%u,%u,%u\n", &blkinfo[i].database,
+ if (fscanf(file, "%u,%u," INT64_FORMAT ",%u,%u\n", &blkinfo[i].database,
&blkinfo[i].tablespace, &blkinfo[i].filenumber,
&forknum, &blkinfo[i].blocknum) != 5)
ereport(ERROR,
@@ -669,7 +669,7 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
{
CHECK_FOR_INTERRUPTS();
- ret = fprintf(file, "%u,%u,%u,%u,%u\n",
+ ret = fprintf(file, "%u,%u," INT64_FORMAT ",%u,%u\n",
block_info_array[i].database,
block_info_array[i].tablespace,
block_info_array[i].filenumber,
diff --git a/contrib/pg_walinspect/expected/pg_walinspect.out b/contrib/pg_walinspect/expected/pg_walinspect.out
index a1ee743..e9b06ed 100644
--- a/contrib/pg_walinspect/expected/pg_walinspect.out
+++ b/contrib/pg_walinspect/expected/pg_walinspect.out
@@ -54,9 +54,9 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- ===================================================================
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
ok
----
t
diff --git a/contrib/pg_walinspect/sql/pg_walinspect.sql b/contrib/pg_walinspect/sql/pg_walinspect.sql
index 1b265ea..5393834 100644
--- a/contrib/pg_walinspect/sql/pg_walinspect.sql
+++ b/contrib/pg_walinspect/sql/pg_walinspect.sql
@@ -39,10 +39,10 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
-- ===================================================================
-- Test for filtering out WAL records based on resource_manager and
diff --git a/contrib/test_decoding/expected/rewrite.out b/contrib/test_decoding/expected/rewrite.out
index b30999c..8f1aa48 100644
--- a/contrib/test_decoding/expected/rewrite.out
+++ b/contrib/test_decoding/expected/rewrite.out
@@ -106,7 +106,7 @@ VACUUM FULL pg_class;
-- reindexing of important relations / indexes
REINDEX TABLE pg_class;
REINDEX INDEX pg_class_oid_index;
-REINDEX INDEX pg_class_tblspc_relfilenode_index;
+REINDEX INDEX pg_class_relfilenode_index;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (5, 3);
BEGIN;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (6, 4);
diff --git a/contrib/test_decoding/sql/rewrite.sql b/contrib/test_decoding/sql/rewrite.sql
index 62dead3..8983704 100644
--- a/contrib/test_decoding/sql/rewrite.sql
+++ b/contrib/test_decoding/sql/rewrite.sql
@@ -77,7 +77,7 @@ VACUUM FULL pg_class;
-- reindexing of important relations / indexes
REINDEX TABLE pg_class;
REINDEX INDEX pg_class_oid_index;
-REINDEX INDEX pg_class_tblspc_relfilenode_index;
+REINDEX INDEX pg_class_relfilenode_index;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (5, 3);
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 00f833d..40d4e9c 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1984,7 +1984,7 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
</para>
<para>
Name of the on-disk file of this relation; zero means this
diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
index a06fd3e..e222265 100644
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@@ -62,7 +62,7 @@
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
(references <link linkend="catalog-pg-class"><structname>pg_class</structname></link>.<structfield>relfilenode</structfield>)
</para>
<para>
diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml
index e5b9f3f..eefba2f 100644
--- a/doc/src/sgml/storage.sgml
+++ b/doc/src/sgml/storage.sgml
@@ -217,11 +217,10 @@ with the suffix <literal>_init</literal> (see <xref linkend="storage-init"/>).
<caution>
<para>
-Note that while a table's filenode often matches its OID, this is
-<emphasis>not</emphasis> necessarily the case; some operations, like
+Note that a table's filenode is entirely independent of its OID. Although
+for system catalogs the initial filenode matches the OID, some operations, like
<command>TRUNCATE</command>, <command>REINDEX</command>, <command>CLUSTER</command> and some forms
of <command>ALTER TABLE</command>, can change the filenode while preserving the OID.
-Avoid assuming that filenode and table OID are the same.
Also, for certain system catalogs including <structname>pg_class</structname> itself,
<structname>pg_class</structname>.<structfield>relfilenode</structfield> contains zero. The
actual filenode number of these catalogs is stored in a lower-level data
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 41b9211..b75ad79 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -100,7 +100,7 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
BlockNumber blknum;
BufferGetTag(buffer, &locator, &forknum, &blknum);
- elog(ERROR, "failed to add item to index page in %u/%u/%u",
+ elog(ERROR, "failed to add item to index page in %u/%u/" INT64_FORMAT,
locator.spcOid, locator.dbOid, locator.relNumber);
}
}
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index 7dd3c1d..c699937 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -26,7 +26,7 @@ out_gistxlogPageUpdate(StringInfo buf, gistxlogPageUpdate *xlrec)
static void
out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec)
{
- appendStringInfo(buf, "rel %u/%u/%u; blk %u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; blk %u; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber, xlrec->block,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 923d3bc..f6d278b 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -169,7 +169,7 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
{
xl_heap_new_cid *xlrec = (xl_heap_new_cid *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; tid %u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; tid %u/%u",
xlrec->target_locator.spcOid,
xlrec->target_locator.dbOid,
xlrec->target_locator.relNumber,
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index 4843cd5..70feb2d 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -100,7 +100,7 @@ btree_desc(StringInfo buf, XLogReaderState *record)
{
xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT "; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/seqdesc.c b/src/backend/access/rmgrdesc/seqdesc.c
index b3845f9..45c6ee7 100644
--- a/src/backend/access/rmgrdesc/seqdesc.c
+++ b/src/backend/access/rmgrdesc/seqdesc.c
@@ -25,7 +25,7 @@ seq_desc(StringInfo buf, XLogReaderState *record)
xl_seq_rec *xlrec = (xl_seq_rec *) rec;
if (info == XLOG_SEQ_LOG)
- appendStringInfo(buf, "rel %u/%u/%u",
+ appendStringInfo(buf, "rel %u/%u/" INT64_FORMAT,
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber);
}
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 3fd7185..9cdd1a9 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -45,8 +45,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
CheckPoint *checkpoint = (CheckPoint *) rec;
appendStringInfo(buf, "redo %X/%X; "
- "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; "
- "oldest xid %u in DB %u; oldest multi %u in DB %u; "
+ "tli %u; prev tli %u; fpw %s; xid %u:%u; relfilenumber " INT64_FORMAT "; oid %u; "
+ "multi %u; offset %u; oldest xid %u in DB %u; oldest multi %u in DB %u; "
"oldest/newest commit timestamp xid: %u/%u; "
"oldest running xid %u; %s",
LSN_FORMAT_ARGS(checkpoint->redo),
@@ -55,6 +55,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->fullPageWrites ? "true" : "false",
EpochFromFullTransactionId(checkpoint->nextXid),
XidFromFullTransactionId(checkpoint->nextXid),
+ checkpoint->nextRelFileNumber,
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
@@ -74,6 +75,13 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
memcpy(&nextOid, rec, sizeof(Oid));
appendStringInfo(buf, "%u", nextOid);
}
+ else if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, rec, sizeof(RelFileNumber));
+ appendStringInfo(buf, INT64_FORMAT, nextRelFileNumber);
+ }
else if (info == XLOG_RESTORE_POINT)
{
xl_restore_point *xlrec = (xl_restore_point *) rec;
@@ -169,6 +177,9 @@ xlog_identify(uint8 info)
case XLOG_NEXTOID:
id = "NEXTOID";
break;
+ case XLOG_NEXT_RELFILENUMBER:
+ id = "NEXT_RELFILENUMBER";
+ break;
case XLOG_SWITCH:
id = "SWITCH";
break;
@@ -237,7 +248,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfoChar(buf, ' ');
appendStringInfo(buf,
- "blkref #%d: rel %u/%u/%u fork %s blk %u",
+ "blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -297,7 +308,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
if (forknum != MAIN_FORKNUM)
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u fork %s blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -306,7 +317,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
else
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u blk %u",
+ ", blkref #%d: rel %u/%u/" INT64_FORMAT " blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
blk);
diff --git a/src/backend/access/transam/README b/src/backend/access/transam/README
index 734c39a..8911671 100644
--- a/src/backend/access/transam/README
+++ b/src/backend/access/transam/README
@@ -692,8 +692,9 @@ by having database restart search for files that don't have any committed
entry in pg_class, but that currently isn't done because of the possibility
of deleting data that is useful for forensic analysis of the crash.
Orphan files are harmless --- at worst they waste a bit of disk space ---
-because we check for on-disk collisions when allocating new relfilenumber
-OIDs. So cleaning up isn't really necessary.
+because the relfilenumber counter is monotonically increasing. The maximum
+value is 2^56-1, and there is no provision for wraparound. Thus, on-disk
+collisions aren't possible.
3. Deleting a table, which requires an unlink() that could fail.
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 849a7ce..5b8f220 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -13,12 +13,16 @@
#include "postgres.h"
+#include <unistd.h>
+
#include "access/clog.h"
#include "access/commit_ts.h"
#include "access/subtrans.h"
#include "access/transam.h"
#include "access/xact.h"
#include "access/xlogutils.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_tablespace.h"
#include "commands/dbcommands.h"
#include "miscadmin.h"
#include "postmaster/autovacuum.h"
@@ -30,6 +34,15 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+/* Number of RelFileNumbers to be logged per XLOG write */
+#define VAR_RELNUMBER_PER_XLOG 512
+
+/*
+ * Log more RelFileNumbers when fewer than this many logged values remain
+ * unconsumed. Valid range is 0 to VAR_RELNUMBER_PER_XLOG - 1.
+ */
+#define VAR_RELNUMBER_NEW_XLOG_THRESHOLD 256
+
/* pointer to "variable cache" in shared memory (set up by shmem.c) */
VariableCache ShmemVariableCache = NULL;
@@ -521,8 +534,7 @@ ForceTransactionIdLimitUpdate(void)
* wide, counter wraparound will occur eventually, and therefore it is unwise
* to assume they are unique unless precautions are taken to make them so.
* Hence, this routine should generally not be used directly. The only direct
- * callers should be GetNewOidWithIndex() and GetNewRelFileNumber() in
- * catalog/catalog.c.
+ * caller should be GetNewOidWithIndex() in catalog/catalog.c.
*/
Oid
GetNewObjectId(void)
@@ -613,6 +625,189 @@ SetNextObjectId(Oid nextOid)
}
/*
+ * GetNewRelFileNumber
+ *
+ * Similar to GetNewObjectId, but it generates a new relfilenumber instead
+ * of a new OID.
+ */
+RelFileNumber
+GetNewRelFileNumber(Oid reltablespace, char relpersistence)
+{
+ RelFileNumber result;
+ RelFileNumber nextRelFileNumber,
+ loggedRelFileNumber,
+ flushedRelFileNumber;
+
+ StaticAssertStmt(VAR_RELNUMBER_NEW_XLOG_THRESHOLD < VAR_RELNUMBER_PER_XLOG,
+ "VAR_RELNUMBER_NEW_XLOG_THRESHOLD must be smaller than VAR_RELNUMBER_PER_XLOG");
+
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot assign RelFileNumber during recovery");
+
+ if (IsBinaryUpgrade)
+ elog(ERROR, "cannot assign RelFileNumber during binary upgrade");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ nextRelFileNumber = ShmemVariableCache->nextRelFileNumber;
+ loggedRelFileNumber = ShmemVariableCache->loggedRelFileNumber;
+ flushedRelFileNumber = ShmemVariableCache->flushedRelFileNumber;
+
+ /* check for wraparound of the relfilenumber counter */
+ if (unlikely(nextRelFileNumber > MAX_RELFILENUMBER))
+ elog(ERROR, "relfilenumber is out of bounds");
+
+ /*
+ * If fewer logged relfilenumber values remain than the threshold, log
+ * more. Ideally, we could wait until all logged relfilenumbers have been
+ * consumed before logging more. Nevertheless, if
+ * we do that, we must immediately flush the logged wal record because we
+ * want to ensure that the nextRelFileNumber is always larger than any
+ * relfilenumber already in use on disk. And, to maintain that invariant,
+ * we must make sure that the record we log reaches the disk before any new
+ * files are created with the newly logged range.
+ *
+ * So, to avoid flushing WAL immediately, we always log before all the
+ * relfilenumbers have been consumed, and then we only need to flush the
+ * newly logged record before consuming a relfilenumber from the new
+ * range. By the time we need to flush it, hopefully it has already been
+ * flushed by some other XLogFlush operation.
+ */
+ if (loggedRelFileNumber - nextRelFileNumber <=
+ VAR_RELNUMBER_NEW_XLOG_THRESHOLD)
+ {
+ XLogRecPtr recptr;
+
+ loggedRelFileNumber = loggedRelFileNumber + VAR_RELNUMBER_PER_XLOG;
+ recptr = LogNextRelFileNumber(loggedRelFileNumber);
+ ShmemVariableCache->loggedRelFileNumber = loggedRelFileNumber;
+
+ /* remember for the future flush */
+ ShmemVariableCache->loggedRelFileNumberRecPtr = recptr;
+ }
+
+ /*
+ * If nextRelFileNumber has caught up with the already-flushed
+ * relfilenumber, flush the WAL for the previously logged relfilenumber.
+ */
+ if (nextRelFileNumber >= flushedRelFileNumber)
+ {
+ XLogFlush(ShmemVariableCache->loggedRelFileNumberRecPtr);
+ ShmemVariableCache->flushedRelFileNumber = loggedRelFileNumber;
+ }
+
+ result = ShmemVariableCache->nextRelFileNumber;
+
+ /* we should never be using any relfilenumber outside the flushed range */
+ Assert(result <= ShmemVariableCache->flushedRelFileNumber);
+
+ (ShmemVariableCache->nextRelFileNumber)++;
+
+ LWLockRelease(RelFileNumberGenLock);
+
+#ifdef USE_ASSERT_CHECKING
+
+ {
+ RelFileLocatorBackend rlocator;
+ char *rpath;
+ BackendId backend;
+
+ switch (relpersistence)
+ {
+ case RELPERSISTENCE_TEMP:
+ backend = BackendIdForTempRelations();
+ break;
+ case RELPERSISTENCE_UNLOGGED:
+ case RELPERSISTENCE_PERMANENT:
+ backend = InvalidBackendId;
+ break;
+ default:
+ elog(ERROR, "invalid relpersistence: %c", relpersistence);
+ return InvalidRelFileNumber; /* placate compiler */
+ }
+
+ /* this logic should match RelationInitPhysicalAddr */
+ rlocator.locator.spcOid =
+ reltablespace ? reltablespace : MyDatabaseTableSpace;
+ rlocator.locator.dbOid = (reltablespace == GLOBALTABLESPACE_OID) ?
+ InvalidOid : MyDatabaseId;
+ rlocator.locator.relNumber = result;
+
+ /*
+ * The relpath will vary based on the backend ID, so we must
+ * initialize that properly here to make sure that any collisions
+ * based on filename are properly detected.
+ */
+ rlocator.backend = backend;
+
+ /* check for existing file of same name. */
+ rpath = relpath(rlocator, MAIN_FORKNUM);
+ Assert(access(rpath, F_OK) != 0);
+ }
+#endif
+
+ return result;
+}
+
+/*
+ * SetNextRelFileNumber
+ *
+ * This may only be called during pg_upgrade; it advances the RelFileNumber
+ * counter to the specified value if the current value is smaller than the
+ * input value.
+ */
+void
+SetNextRelFileNumber(RelFileNumber relnumber)
+{
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot forward RelFileNumber during recovery");
+
+ if (!IsBinaryUpgrade)
+ elog(ERROR, "RelFileNumber can be set only during binary upgrade");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /*
+ * If the previously assigned value of nextRelFileNumber is already higher
+ * than the requested value, there is nothing to do. This is possible
+ * because during upgrade the objects are not created in relfilenumber
+ * order.
+ */
+ if (relnumber <= ShmemVariableCache->nextRelFileNumber)
+ {
+ LWLockRelease(RelFileNumberGenLock);
+ return;
+ }
+
+ /*
+ * If the new relfilenumber to be set is greater than or equal to the
+ * already-logged relfilenumber, log again.
+ *
+ * XXX The new 'relnumber' can be from any range, so we cannot plan to
+ * piggyback on a later XLogFlush by logging in advance. That should not
+ * really matter, as this is only called during binary upgrade.
+ */
+ if (relnumber >= ShmemVariableCache->loggedRelFileNumber)
+ {
+ RelFileNumber newlogrelnum;
+
+ newlogrelnum = relnumber + VAR_RELNUMBER_PER_XLOG;
+ XLogFlush(LogNextRelFileNumber(newlogrelnum));
+
+ /* we have flushed whatever we have logged so no pending flush */
+ ShmemVariableCache->loggedRelFileNumber = newlogrelnum;
+ ShmemVariableCache->flushedRelFileNumber = newlogrelnum;
+ ShmemVariableCache->loggedRelFileNumberRecPtr = InvalidXLogRecPtr;
+ }
+
+ ShmemVariableCache->nextRelFileNumber = relnumber;
+
+ LWLockRelease(RelFileNumberGenLock);
+}
+
+/*
* StopGeneratingPinnedObjectIds
*
* This is called once during initdb to force the OID counter up to
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 7a710e6..83cf052 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4538,6 +4538,7 @@ BootStrapXLOG(void)
checkPoint.nextXid =
FullTransactionIdFromEpochAndXid(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstGenbkiObjectId;
+ checkPoint.nextRelFileNumber = FirstNormalRelFileNumber;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
checkPoint.oldestXid = FirstNormalTransactionId;
@@ -4551,7 +4552,11 @@ BootStrapXLOG(void)
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->flushedRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumberRecPtr = InvalidXLogRecPtr;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -5017,7 +5022,10 @@ StartupXLOG(void)
/* initialize shared memory variables from the checkpoint record */
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->flushedRelFileNumber = checkPoint.nextRelFileNumber;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -6483,6 +6491,14 @@ CreateCheckPoint(int flags)
checkPoint.nextOid += ShmemVariableCache->oidCount;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_SHARED);
+ if (shutdown)
+ checkPoint.nextRelFileNumber = ShmemVariableCache->nextRelFileNumber;
+ else
+ checkPoint.nextRelFileNumber = ShmemVariableCache->loggedRelFileNumber;
+
+ LWLockRelease(RelFileNumberGenLock);
+
MultiXactGetCheckptMulti(shutdown,
&checkPoint.nextMulti,
&checkPoint.nextMultiOffset,
@@ -7361,6 +7377,24 @@ XLogPutNextOid(Oid nextOid)
}
/*
+ * Similar to XLogPutNextOid, but it writes a NEXT_RELFILENUMBER log record
+ * instead of a NEXTOID record. It also returns the XLogRecPtr of
+ * the currently logged relfilenumber record, so that the caller can flush it
+ * at the appropriate time.
+ */
+XLogRecPtr
+LogNextRelFileNumber(RelFileNumber nextrelnumber)
+{
+ XLogRecPtr recptr;
+
+ XLogBeginInsert();
+ XLogRegisterData((char *) (&nextrelnumber), sizeof(RelFileNumber));
+ recptr = XLogInsert(RM_XLOG_ID, XLOG_NEXT_RELFILENUMBER);
+
+ return recptr;
+}
+
+/*
* Write an XLOG SWITCH record.
*
* Here we just blindly issue an XLogInsert request for the record.
@@ -7575,6 +7609,17 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
}
+ else if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, XLogRecGetData(record), sizeof(RelFileNumber));
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumber = nextRelFileNumber;
+ ShmemVariableCache->flushedRelFileNumber = nextRelFileNumber;
+ LWLockRelease(RelFileNumberGenLock);
+ }
else if (info == XLOG_CHECKPOINT_SHUTDOWN)
{
CheckPoint checkPoint;
@@ -7589,6 +7634,11 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->flushedRelFileNumber = checkPoint.nextRelFileNumber;
+ LWLockRelease(RelFileNumberGenLock);
MultiXactSetNextMXact(checkPoint.nextMulti,
checkPoint.nextMultiOffset);
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index 9aa5641..a0c5aca 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -611,7 +611,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u until %X/%X is replayed, which creates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, which creates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -634,7 +634,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, which truncates the relation",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, which truncates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -733,7 +733,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing all prefetch in relation %u/%u/%u until %X/%X is replayed, because the relation does not exist on disk",
+ "suppressing all prefetch in relation %u/%u/" INT64_FORMAT " until %X/%X is replayed, because the relation does not exist on disk",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -754,7 +754,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, because the relation is too small",
+ "suppressing prefetch in relation %u/%u/" INT64_FORMAT " from block %u until %X/%X is replayed, because the relation is too small",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -793,7 +793,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* truncated beneath our feet?
*/
elog(ERROR,
- "could not prefetch relation %u/%u/%u block %u",
+ "could not prefetch relation %u/%u/" INT64_FORMAT " block %u",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -932,7 +932,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed),
filter->filter_from_block);
@@ -948,7 +948,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
+ "prefetch of %u/%u/" INT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed));
#endif
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index ae2af5a..3b4da8e 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2225,14 +2225,14 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
continue;
if (forknum != MAIN_FORKNUM)
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, fork %u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", fork %u, blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
forknum,
blk);
else
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" INT64_FORMAT ", blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
@@ -2428,7 +2428,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
if (memcmp(replay_image_masked, primary_image_masked, BLCKSZ) != 0)
{
elog(FATAL,
- "inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
+ "inconsistent page found, rel %u/%u/" INT64_FORMAT ", forknum %u, blkno %u",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forknum, blkno);
}
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 0cda225..b4398e1 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -617,17 +617,17 @@ CreateFakeRelcacheEntry(RelFileLocator rlocator)
rel->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
/* We don't know the name of the relation; use relfilenumber instead */
- sprintf(RelationGetRelationName(rel), "%u", rlocator.relNumber);
+ sprintf(RelationGetRelationName(rel), INT64_FORMAT, rlocator.relNumber);
/*
* We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNumber may be
- * different from the relation's OID. It shouldn't really matter though.
- * In recovery, we are running by ourselves and can't have any lock
- * conflicts. While syncing, we already hold AccessExclusiveLock.
+ * relation. Note that we set relId to FirstNormalObjectId, which is
+ * completely bogus. It shouldn't really matter though. In recovery,
+ * we are running by ourselves and can't have any lock conflicts. While
+ * syncing, we already hold AccessExclusiveLock.
*/
rel->rd_lockInfo.lockRelId.dbId = rlocator.dbOid;
- rel->rd_lockInfo.lockRelId.relId = rlocator.relNumber;
+ rel->rd_lockInfo.lockRelId.relId = FirstNormalObjectId;
rel->rd_smgr = NULL;
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 3bf3aa6..07cf34a 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1230,7 +1230,7 @@ sendDir(bbsink *sink, const char *path, int basepathlen, bool sizeonly,
if (relForkNum != INIT_FORKNUM)
{
char initForkFile[MAXPGPATH];
- char relNumber[OIDCHARS + 1];
+ char relNumber[RELNUMBERCHARS + 1];
/*
* If any other type of fork, check if there is an init fork
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 2abd6b0..a9bd8ae 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -483,101 +483,6 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
}
/*
- * GetNewRelFileNumber
- * Generate a new relfilenumber that is unique within the
- * database of the given tablespace.
- *
- * If the relfilenumber will also be used as the relation's OID, pass the
- * opened pg_class catalog, and this routine will guarantee that the result
- * is also an unused OID within pg_class. If the result is to be used only
- * as a relfilenumber for an existing relation, pass NULL for pg_class.
- *
- * As with GetNewOidWithIndex(), there is some theoretical risk of a race
- * condition, but it doesn't seem worth worrying about.
- *
- * Note: we don't support using this in bootstrap mode. All relations
- * created by bootstrap have preassigned OIDs, so there's no need.
- */
-RelFileNumber
-GetNewRelFileNumber(Oid reltablespace, Relation pg_class, char relpersistence)
-{
- RelFileLocatorBackend rlocator;
- char *rpath;
- bool collides;
- BackendId backend;
-
- /*
- * If we ever get here during pg_upgrade, there's something wrong; all
- * relfilenumber assignments during a binary-upgrade run should be
- * determined by commands in the dump script.
- */
- Assert(!IsBinaryUpgrade);
-
- switch (relpersistence)
- {
- case RELPERSISTENCE_TEMP:
- backend = BackendIdForTempRelations();
- break;
- case RELPERSISTENCE_UNLOGGED:
- case RELPERSISTENCE_PERMANENT:
- backend = InvalidBackendId;
- break;
- default:
- elog(ERROR, "invalid relpersistence: %c", relpersistence);
- return InvalidRelFileNumber; /* placate compiler */
- }
-
- /* This logic should match RelationInitPhysicalAddr */
- rlocator.locator.spcOid = reltablespace ? reltablespace : MyDatabaseTableSpace;
- rlocator.locator.dbOid =
- (rlocator.locator.spcOid == GLOBALTABLESPACE_OID) ?
- InvalidOid : MyDatabaseId;
-
- /*
- * The relpath will vary based on the backend ID, so we must initialize
- * that properly here to make sure that any collisions based on filename
- * are properly detected.
- */
- rlocator.backend = backend;
-
- do
- {
- CHECK_FOR_INTERRUPTS();
-
- /* Generate the OID */
- if (pg_class)
- rlocator.locator.relNumber = GetNewOidWithIndex(pg_class, ClassOidIndexId,
- Anum_pg_class_oid);
- else
- rlocator.locator.relNumber = GetNewObjectId();
-
- /* Check for existing file of same name */
- rpath = relpath(rlocator, MAIN_FORKNUM);
-
- if (access(rpath, F_OK) == 0)
- {
- /* definite collision */
- collides = true;
- }
- else
- {
- /*
- * Here we have a little bit of a dilemma: if errno is something
- * other than ENOENT, should we declare a collision and loop? In
- * practice it seems best to go ahead regardless of the errno. If
- * there is a colliding file we will get an smgr failure when we
- * attempt to create the new relation file.
- */
- collides = false;
- }
-
- pfree(rpath);
- } while (collides);
-
- return rlocator.locator.relNumber;
-}
-
-/*
* SQL callable interface for GetNewOidWithIndex(). Outside of initdb's
* direct insertions into catalog tables, and recovering from corruption, this
* should rarely be needed.
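The GetNewRelFileNumber() removed above had to probe the filesystem with access() because 32-bit, OID-based relfilenumbers could be recycled. With a 56-bit counter that is never reused, allocation reduces to a guarded increment. A minimal sketch with hypothetical names, not the patch's actual code; the real allocator holds RelFileNumberGenLock and WAL-logs the counter (XLOG_NEXT_RELFILENUMBER) before handing out values:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative sketch only: with a never-recycled 56-bit counter there is
 * nothing to probe on disk. All names here are hypothetical stand-ins for
 * the patch's shared-memory counter and locking.
 */
typedef uint64_t RelFileNumber;

#define MAX_RELFILENUMBER ((UINT64_C(1) << 56) - 1)

static RelFileNumber next_relfilenumber = 100000;   /* counter from shmem */

static RelFileNumber
sketch_get_new_relfilenumber(void)
{
    /* The real version runs under RelFileNumberGenLock and logs ahead. */
    assert(next_relfilenumber < MAX_RELFILENUMBER);
    return next_relfilenumber++;
}
```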
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 9b03579..c17f60f 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -342,10 +342,18 @@ heap_create(const char *relname,
{
/*
* If relfilenumber is unspecified by the caller then create storage
- * with oid same as relid.
+ * with the same relfilenumber as relid if it is a system table; otherwise,
+ * allocate a new relfilenumber. For more details, see the comments atop
+ * the FirstNormalRelFileNumber declaration.
*/
if (!RelFileNumberIsValid(relfilenumber))
- relfilenumber = relid;
+ {
+ if (relid < FirstNormalObjectId)
+ relfilenumber = relid;
+ else
+ relfilenumber = GetNewRelFileNumber(reltablespace,
+ relpersistence);
+ }
}
/*
@@ -898,7 +906,7 @@ InsertPgClassTuple(Relation pg_class_desc,
values[Anum_pg_class_reloftype - 1] = ObjectIdGetDatum(rd_rel->reloftype);
values[Anum_pg_class_relowner - 1] = ObjectIdGetDatum(rd_rel->relowner);
values[Anum_pg_class_relam - 1] = ObjectIdGetDatum(rd_rel->relam);
- values[Anum_pg_class_relfilenode - 1] = ObjectIdGetDatum(rd_rel->relfilenode);
+ values[Anum_pg_class_relfilenode - 1] = Int64GetDatum(rd_rel->relfilenode);
values[Anum_pg_class_reltablespace - 1] = ObjectIdGetDatum(rd_rel->reltablespace);
values[Anum_pg_class_relpages - 1] = Int32GetDatum(rd_rel->relpages);
values[Anum_pg_class_reltuples - 1] = Float4GetDatum(rd_rel->reltuples);
@@ -1170,12 +1178,7 @@ heap_create_with_catalog(const char *relname,
if (shared_relation && reltablespace != GLOBALTABLESPACE_OID)
elog(ERROR, "shared relations must be placed in pg_global tablespace");
- /*
- * Allocate an OID for the relation, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the relation, unless we were told what to use. */
if (!OidIsValid(relid))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -1229,8 +1232,8 @@ heap_create_with_catalog(const char *relname,
}
if (!OidIsValid(relid))
- relid = GetNewRelFileNumber(reltablespace, pg_class_desc,
- relpersistence);
+ relid = GetNewOidWithIndex(pg_class_desc, ClassOidIndexId,
+ Anum_pg_class_oid);
}
/*
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index d7192f3..a509c3e 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -898,12 +898,7 @@ index_create(Relation heapRelation,
collationObjectId,
classObjectId);
- /*
- * Allocate an OID for the index, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the index, unless we were told what to use. */
if (!OidIsValid(indexRelationId))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -935,8 +930,8 @@ index_create(Relation heapRelation,
}
else
{
- indexRelationId =
- GetNewRelFileNumber(tableSpaceId, pg_class, relpersistence);
+ indexRelationId = GetNewOidWithIndex(pg_class, ClassOidIndexId,
+ Anum_pg_class_oid);
}
}
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index d708af1..080622b 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -968,6 +968,10 @@ smgr_redo(XLogReaderState *record)
xl_smgr_create *xlrec = (xl_smgr_create *) XLogRecGetData(record);
SMgrRelation reln;
+ if (xlrec->rlocator.relNumber > ShmemVariableCache->nextRelFileNumber)
+ elog(ERROR, "unexpected relnumber " INT64_FORMAT " that is bigger than nextRelFileNumber " INT64_FORMAT,
+ xlrec->rlocator.relNumber, ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
smgrcreate(reln, xlrec->forkNum, true);
}
@@ -981,6 +985,10 @@ smgr_redo(XLogReaderState *record)
int nforks = 0;
bool need_fsm_vacuum = false;
+ if (xlrec->rlocator.relNumber > ShmemVariableCache->nextRelFileNumber)
+ elog(ERROR, "unexpected relnumber " INT64_FORMAT " that is bigger than nextRelFileNumber " INT64_FORMAT,
+ xlrec->rlocator.relNumber, ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
/*
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index dacc989..660b5e4 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14363,10 +14363,14 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
}
/*
- * Relfilenumbers are not unique in databases across tablespaces, so we
- * need to allocate a new one in the new tablespace.
- */
- newrelfilenumber = GetNewRelFileNumber(newTableSpace, NULL,
+ * Generate a new relfilenumber. We cannot reuse the old relfilenumber
+ * because of the possibility that the relation will be moved back to the
+ * original tablespace before the next checkpoint. At that point, the
+ * first segment of the main fork won't have been unlinked yet, and an
+ * attempt to create new relation storage with that same relfilenumber
+ * will fail.
+ */
+ newrelfilenumber = GetNewRelFileNumber(newTableSpace,
rel->rd_rel->relpersistence);
/* Open old and new relation */
diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index f260b48..728653c 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -267,7 +267,7 @@ CreateTableSpace(CreateTableSpaceStmt *stmt)
* parts.
*/
if (strlen(location) + 1 + strlen(TABLESPACE_VERSION_DIRECTORY) + 1 +
- OIDCHARS + 1 + OIDCHARS + 1 + FORKNAMECHARS + 1 + OIDCHARS > MAXPGPATH)
+ OIDCHARS + 1 + RELNUMBERCHARS + 1 + FORKNAMECHARS + 1 + OIDCHARS > MAXPGPATH)
ereport(ERROR,
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
errmsg("tablespace location \"%s\" is too long",
diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index b707a09..f809a02 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -961,12 +961,12 @@ _read${n}(void)
print $off "\tWRITE_UINT_FIELD($f);\n";
print $rff "\tREAD_UINT_FIELD($f);\n" unless $no_read;
}
- elsif ($t eq 'uint64')
+ elsif ($t eq 'uint64' || $t eq 'RelFileNumber')
{
print $off "\tWRITE_UINT64_FIELD($f);\n";
print $rff "\tREAD_UINT64_FIELD($f);\n" unless $no_read;
}
- elsif ($t eq 'Oid' || $t eq 'RelFileNumber')
+ elsif ($t eq 'Oid')
{
print $off "\tWRITE_OID_FIELD($f);\n";
print $rff "\tREAD_OID_FIELD($f);\n" unless $no_read;
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 1667d72..971e243 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -154,6 +154,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
break;
case XLOG_NOOP:
case XLOG_NEXTOID:
+ case XLOG_NEXT_RELFILENUMBER:
case XLOG_SWITCH:
case XLOG_BACKUP_END:
case XLOG_PARAMETER_CHANGE:
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 89cf9f9..859bc29 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -4932,7 +4932,7 @@ DisplayMapping(HTAB *tuplecid_data)
hash_seq_init(&hstat, tuplecid_data);
while ((ent = (ReorderBufferTupleCidEnt *) hash_seq_search(&hstat)) != NULL)
{
- elog(DEBUG3, "mapping: node: %u/%u/%u tid: %u/%u cmin: %u, cmax: %u",
+ elog(DEBUG3, "mapping: node: %u/%u/" INT64_FORMAT " tid: %u/%u cmin: %u, cmax: %u",
ent->key.rlocator.dbOid,
ent->key.rlocator.spcOid,
ent->key.rlocator.relNumber,
diff --git a/src/backend/storage/file/reinit.c b/src/backend/storage/file/reinit.c
index 647c458..c3faa68 100644
--- a/src/backend/storage/file/reinit.c
+++ b/src/backend/storage/file/reinit.c
@@ -31,7 +31,7 @@ static void ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname,
typedef struct
{
- Oid reloid; /* hash key */
+ RelFileNumber relnumber; /* hash key */
} unlogged_relation_entry;
/*
@@ -184,10 +184,10 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
* need to be reset. Otherwise, this cleanup operation would be
* O(n^2).
*/
- ctl.keysize = sizeof(Oid);
+ ctl.keysize = sizeof(RelFileNumber);
ctl.entrysize = sizeof(unlogged_relation_entry);
ctl.hcxt = CurrentMemoryContext;
- hash = hash_create("unlogged relation OIDs", 32, &ctl,
+ hash = hash_create("unlogged relation RelFileNumbers", 32, &ctl,
HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
/* Scan the directory. */
@@ -208,10 +208,10 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
continue;
/*
- * Put the OID portion of the name into the hash table, if it
- * isn't already.
+ * Put the RELFILENUMBER portion of the name into the hash table,
+ * if it isn't already.
*/
- ent.reloid = atooid(de->d_name);
+ ent.relnumber = atorelnumber(de->d_name);
(void) hash_search(hash, &ent, HASH_ENTER, NULL);
}
@@ -248,10 +248,10 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
continue;
/*
- * See whether the OID portion of the name shows up in the hash
- * table. If so, nuke it!
+ * See whether the RELFILENUMBER portion of the name shows up in
+ * the hash table. If so, nuke it!
*/
- ent.reloid = atooid(de->d_name);
+ ent.relnumber = atorelnumber(de->d_name);
if (hash_search(hash, &ent, HASH_FIND, NULL))
{
snprintf(rm_path, sizeof(rm_path), "%s/%s",
@@ -286,7 +286,7 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
{
ForkNumber forkNum;
int relnumchars;
- char relnumbuf[OIDCHARS + 1];
+ char relnumbuf[RELNUMBERCHARS + 1];
char srcpath[MAXPGPATH * 2];
char dstpath[MAXPGPATH];
@@ -329,7 +329,7 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
{
ForkNumber forkNum;
int relnumchars;
- char relnumbuf[OIDCHARS + 1];
+ char relnumbuf[RELNUMBERCHARS + 1];
char mainpath[MAXPGPATH];
/* Skip anything that doesn't look like a relation data file. */
@@ -372,8 +372,8 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
* for a non-temporary relation and false otherwise.
*
* NB: If this function returns true, the caller is entitled to assume that
- * *relnumchars has been set to a value no more than OIDCHARS, and thus
- * that a buffer of OIDCHARS+1 characters is sufficient to hold the
+ * *relnumchars has been set to a value no more than RELNUMBERCHARS, and thus
+ * that a buffer of RELNUMBERCHARS+1 characters is sufficient to hold the
* RelFileNumber portion of the filename. This is critical to protect against
* a possible buffer overrun.
*/
@@ -386,7 +386,7 @@ parse_filename_for_nontemp_relation(const char *name, int *relnumchars,
/* Look for a non-empty string of digits (that isn't too long). */
for (pos = 0; isdigit((unsigned char) name[pos]); ++pos)
;
- if (pos == 0 || pos > OIDCHARS)
+ if (pos == 0 || pos > RELNUMBERCHARS)
return false;
*relnumchars = pos;
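The contract documented above (a non-empty run of at most RELNUMBERCHARS digits, so a RELNUMBERCHARS+1 buffer can never overrun) can be illustrated standalone. The value 20 for RELNUMBERCHARS is an assumption here: enough decimal digits for any unsigned 64-bit value.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>

/*
 * Standalone illustration of the digit scan in
 * parse_filename_for_nontemp_relation(); RELNUMBERCHARS = 20 is assumed.
 */
#define RELNUMBERCHARS 20

static bool
looks_like_relnumber(const char *name, int *relnumchars)
{
    int pos;

    /* Look for a non-empty string of digits (that isn't too long). */
    for (pos = 0; isdigit((unsigned char) name[pos]); ++pos)
        ;
    if (pos == 0 || pos > RELNUMBERCHARS)
        return false;
    *relnumchars = pos;
    return true;
}
```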
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index af4dab7..172225b 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -273,7 +273,7 @@ restart:
BlockNumber blknum;
BufferGetTag(buf, &rlocator, &forknum, &blknum);
- elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
+ elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/" INT64_FORMAT,
blknum, rlocator.spcOid, rlocator.dbOid, rlocator.relNumber);
/* make sure we hold an exclusive lock */
diff --git a/src/backend/storage/lmgr/lwlocknames.txt b/src/backend/storage/lmgr/lwlocknames.txt
index 6c7cf6c..3c5d041 100644
--- a/src/backend/storage/lmgr/lwlocknames.txt
+++ b/src/backend/storage/lmgr/lwlocknames.txt
@@ -53,3 +53,4 @@ XactTruncationLock 44
# 45 was XactTruncationLock until removal of BackendRandomLock
WrapLimitsVacuumLock 46
NotifyQueueTailLock 47
+RelFileNumberGenLock 48
\ No newline at end of file
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 3deac49..532bd7f 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -257,6 +257,13 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
* next checkpoint, we prevent reassignment of the relfilenumber until it's
* safe, because relfilenumber assignment skips over any existing file.
*
+ * XXX. Although all of this was true when relfilenumbers were 32 bits wide,
+ * they are now 56 bits wide and do not wrap around, so in the future we can
+ * change the code to immediately unlink the first segment of the relation
+ * along with all the others. We still do reuse relfilenumbers when createdb()
+ * is performed using the file-copy method or during movedb(), but the scenario
+ * described above can only happen when creating a new relation.
+ *
* We do not need to go through this dance for temp relations, though, because
* we never make WAL entries for temp rels, and so a temp rel poses no threat
* to the health of a regular rel that has taken over its relfilenumber.
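A back-of-the-envelope check supports the "do not wrap around" claim in the XXX comment above: even at a million allocations per second, a 56-bit space lasts over two thousand years. Purely illustrative arithmetic, not code from the patch:

```c
#include <assert.h>
#include <stdint.h>

/* How long until 2^56 relfilenumbers are used up at a given rate? */
static uint64_t
years_to_exhaust_56bit_space(uint64_t allocations_per_second)
{
    uint64_t space = UINT64_C(1) << 56;             /* ~7.2e16 values */
    uint64_t seconds_per_year = UINT64_C(31536000); /* 365 * 24 * 3600 */

    return space / allocations_per_second / seconds_per_year;
}
```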
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index c1a5feb..ed46ac3 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -154,7 +154,7 @@ smgropen(RelFileLocator rlocator, BackendId backend)
/* First time through: initialize the hash table */
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileLocatorBackend);
+ ctl.keysize = SizeOfRelFileLocatorBackend;
ctl.entrysize = sizeof(SMgrRelationData);
SMgrRelationHash = hash_create("smgr relation table", 400,
&ctl, HASH_ELEM | HASH_BLOBS);
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 34efa12..9f70f35 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -878,7 +878,7 @@ pg_relation_filenode(PG_FUNCTION_ARGS)
if (!RelFileNumberIsValid(result))
PG_RETURN_NULL();
- PG_RETURN_OID(result);
+ PG_RETURN_INT64(result);
}
/*
@@ -898,9 +898,12 @@ Datum
pg_filenode_relation(PG_FUNCTION_ARGS)
{
Oid reltablespace = PG_GETARG_OID(0);
- RelFileNumber relfilenumber = PG_GETARG_OID(1);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(1);
Oid heaprel;
+ /* check whether the relfilenumber is within a valid range */
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
+
/* test needed so RelidByRelfilenumber doesn't misbehave */
if (!RelFileNumberIsValid(relfilenumber))
PG_RETURN_NULL();
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 797f5f5..fc2faed 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -17,6 +17,7 @@
#include "catalog/pg_type.h"
#include "commands/extension.h"
#include "miscadmin.h"
+#include "storage/relfilelocator.h"
#include "utils/array.h"
#include "utils/builtins.h"
@@ -98,10 +99,12 @@ binary_upgrade_set_next_heap_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_heap_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_heap_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -120,10 +123,12 @@ binary_upgrade_set_next_index_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_index_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_index_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -142,10 +147,12 @@ binary_upgrade_set_next_toast_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_toast_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_toast_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
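The CHECK_RELFILENUMBER_RANGE calls added above validate the int64 argument before it is accepted; the macro's body is not shown in this excerpt. A hedged sketch of the kind of check it plausibly performs, assuming a valid relfilenumber is non-negative and fits in 56 bits (the real macro would raise an error rather than return a bool, and InvalidRelFileNumber is rejected separately by RelFileNumberIsValid):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed upper bound: relfilenumbers occupy 56 bits. */
#define MAX_RELFILENUMBER ((INT64_C(1) << 56) - 1)

/* Sketch of a range check; 0 (InvalidRelFileNumber) is in range here
 * because validity is checked separately in the callers above. */
static bool
relfilenumber_in_range(int64_t relfilenumber)
{
    return relfilenumber >= 0 && relfilenumber <= MAX_RELFILENUMBER;
}
```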
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 00dc0f2..6f4e96d 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -3712,7 +3712,7 @@ RelationSetNewRelfilenumber(Relation relation, char persistence)
{
/* Allocate a new relfilenumber */
newrelfilenumber = GetNewRelFileNumber(relation->rd_rel->reltablespace,
- NULL, persistence);
+ persistence);
}
else if (relation->rd_rel->relkind == RELKIND_INDEX)
{
diff --git a/src/backend/utils/cache/relfilenumbermap.c b/src/backend/utils/cache/relfilenumbermap.c
index c4245d5..cbb18f0 100644
--- a/src/backend/utils/cache/relfilenumbermap.c
+++ b/src/backend/utils/cache/relfilenumbermap.c
@@ -32,17 +32,11 @@
static HTAB *RelfilenumberMapHash = NULL;
/* built first time through in InitializeRelfilenumberMap */
-static ScanKeyData relfilenumber_skey[2];
+static ScanKeyData relfilenumber_skey[1];
typedef struct
{
- Oid reltablespace;
- RelFileNumber relfilenumber;
-} RelfilenumberMapKey;
-
-typedef struct
-{
- RelfilenumberMapKey key; /* lookup key - must be first */
+ RelFileNumber relfilenumber; /* lookup key - must be first */
Oid relid; /* pg_class.oid */
} RelfilenumberMapEntry;
@@ -72,7 +66,7 @@ RelfilenumberMapInvalidateCallback(Datum arg, Oid relid)
entry->relid == relid) /* individual flushed relation */
{
if (hash_search(RelfilenumberMapHash,
- (void *) &entry->key,
+ (void *) &entry->relfilenumber,
HASH_REMOVE,
NULL) == NULL)
elog(ERROR, "hash table corrupted");
@@ -88,7 +82,6 @@ static void
InitializeRelfilenumberMap(void)
{
HASHCTL ctl;
- int i;
/* Make sure we've initialized CacheMemoryContext. */
if (CacheMemoryContext == NULL)
@@ -97,25 +90,20 @@ InitializeRelfilenumberMap(void)
/* build skey */
MemSet(&relfilenumber_skey, 0, sizeof(relfilenumber_skey));
- for (i = 0; i < 2; i++)
- {
- fmgr_info_cxt(F_OIDEQ,
- &relfilenumber_skey[i].sk_func,
- CacheMemoryContext);
- relfilenumber_skey[i].sk_strategy = BTEqualStrategyNumber;
- relfilenumber_skey[i].sk_subtype = InvalidOid;
- relfilenumber_skey[i].sk_collation = InvalidOid;
- }
-
- relfilenumber_skey[0].sk_attno = Anum_pg_class_reltablespace;
- relfilenumber_skey[1].sk_attno = Anum_pg_class_relfilenode;
+ fmgr_info_cxt(F_INT8EQ,
+ &relfilenumber_skey[0].sk_func,
+ CacheMemoryContext);
+ relfilenumber_skey[0].sk_strategy = BTEqualStrategyNumber;
+ relfilenumber_skey[0].sk_subtype = InvalidOid;
+ relfilenumber_skey[0].sk_collation = InvalidOid;
+ relfilenumber_skey[0].sk_attno = Anum_pg_class_relfilenode;
/*
* Only create the RelfilenumberMapHash now, so we don't end up partially
* initialized when fmgr_info_cxt() above ERRORs out with an out of memory
* error.
*/
- ctl.keysize = sizeof(RelfilenumberMapKey);
+ ctl.keysize = sizeof(RelFileNumber);
ctl.entrysize = sizeof(RelfilenumberMapEntry);
ctl.hcxt = CacheMemoryContext;
@@ -129,7 +117,7 @@ InitializeRelfilenumberMap(void)
}
/*
- * Map a relation's (tablespace, relfilenumber) to a relation's oid and cache
+ * Map a relation's relfilenumber to a relation's oid and cache
* the result.
*
* Returns InvalidOid if no relation matching the criteria could be found.
@@ -137,26 +125,17 @@ InitializeRelfilenumberMap(void)
Oid
RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
{
- RelfilenumberMapKey key;
RelfilenumberMapEntry *entry;
bool found;
SysScanDesc scandesc;
Relation relation;
HeapTuple ntp;
- ScanKeyData skey[2];
+ ScanKeyData skey[1];
Oid relid;
if (RelfilenumberMapHash == NULL)
InitializeRelfilenumberMap();
- /* pg_class will show 0 when the value is actually MyDatabaseTableSpace */
- if (reltablespace == MyDatabaseTableSpace)
- reltablespace = 0;
-
- MemSet(&key, 0, sizeof(key));
- key.reltablespace = reltablespace;
- key.relfilenumber = relfilenumber;
-
/*
* Check cache and return entry if one is found. Even if no target
* relation can be found later on we store the negative match and return a
@@ -164,7 +143,8 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
* since querying invalid values isn't supposed to be a frequent thing,
* but it's basically free.
*/
- entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_FIND, &found);
+ entry = hash_search(RelfilenumberMapHash, (void *) &relfilenumber,
+ HASH_FIND, &found);
if (found)
return entry->relid;
@@ -195,14 +175,13 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
memcpy(skey, relfilenumber_skey, sizeof(skey));
/* set scan arguments */
- skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
- skey[1].sk_argument = ObjectIdGetDatum(relfilenumber);
+ skey[0].sk_argument = Int64GetDatum(relfilenumber);
scandesc = systable_beginscan(relation,
- ClassTblspcRelfilenodeIndexId,
+ ClassRelfilenodeIndexId,
true,
NULL,
- 2,
+ 1,
skey);
found = false;
@@ -213,11 +192,10 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
if (found)
elog(ERROR,
- "unexpected duplicate for tablespace %u, relfilenumber %u",
- reltablespace, relfilenumber);
+ "unexpected duplicate for relfilenumber " INT64_FORMAT,
+ relfilenumber);
found = true;
- Assert(classform->reltablespace == reltablespace);
Assert(classform->relfilenode == relfilenumber);
relid = classform->oid;
}
@@ -235,7 +213,8 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
* caused cache invalidations to be executed which would have deleted a
* new entry if we had entered it above.
*/
- entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_ENTER, &found);
+ entry = hash_search(RelfilenumberMapHash, (void *) &relfilenumber,
+ HASH_ENTER, &found);
if (found)
elog(ERROR, "corrupted hashtable");
entry->relid = relid;
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index 781f8b8..3c1fef4 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -79,8 +79,8 @@ pg_control_system(PG_FUNCTION_ARGS)
Datum
pg_control_checkpoint(PG_FUNCTION_ARGS)
{
- Datum values[18];
- bool nulls[18];
+ Datum values[19];
+ bool nulls[19];
TupleDesc tupdesc;
HeapTuple htup;
ControlFileData *ControlFile;
@@ -129,6 +129,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
XIDOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 18, "checkpoint_time",
TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 19, "next_relfilenumber",
+ INT8OID, -1, 0);
tupdesc = BlessTupleDesc(tupdesc);
/* Read the control file. */
@@ -202,6 +204,9 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
values[17] = TimestampTzGetDatum(time_t_to_timestamptz(ControlFile->checkPointCopy.time));
nulls[17] = false;
+ values[18] = Int64GetDatum(ControlFile->checkPointCopy.nextRelFileNumber);
+ nulls[18] = false;
+
htup = heap_form_tuple(tupdesc, values, nulls);
PG_RETURN_DATUM(HeapTupleGetDatum(htup));
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 324ccf7..ddb5ec1 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -485,9 +485,7 @@ main(int argc, char *argv[])
mode = PG_MODE_ENABLE;
break;
case 'f':
- if (!option_parse_int(optarg, "-f/--filenode", 0,
- INT_MAX,
- NULL))
+ if (!option_parse_relfilenumber(optarg, "-f/--filenode"))
exit(1);
only_filenode = pstrdup(optarg);
break;
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index c390ec5..4b6ff4d 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -250,6 +250,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
EpochFromFullTransactionId(ControlFile->checkPointCopy.nextXid),
XidFromFullTransactionId(ControlFile->checkPointCopy.nextXid));
+ printf(_("Latest checkpoint's NextRelFileNumber: %lld\n"),
+ (long long) ControlFile->checkPointCopy.nextRelFileNumber);
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 67b6d90..ed38515 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3183,15 +3183,15 @@ dumpDatabase(Archive *fout)
atooid(PQgetvalue(lo_res, i, ii_oid)));
oid = atooid(PQgetvalue(lo_res, i, ii_oid));
- relfilenumber = atooid(PQgetvalue(lo_res, i, ii_relfilenode));
+ relfilenumber = atorelnumber(PQgetvalue(lo_res, i, ii_relfilenode));
if (oid == LargeObjectRelationId)
appendPQExpBuffer(loOutQry,
- "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
else if (oid == LargeObjectLOidPNIndexId)
appendPQExpBuffer(loOutQry,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
}
@@ -4876,16 +4876,16 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
relkind = *PQgetvalue(upgrade_res, 0, PQfnumber(upgrade_res, "relkind"));
- relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "relfilenode")));
+ relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "relfilenode")));
toast_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "reltoastrelid")));
- toast_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_relfilenode")));
+ toast_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_relfilenode")));
toast_index_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "indexrelid")));
- toast_index_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_index_relfilenode")));
+ toast_index_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_index_relfilenode")));
appendPQExpBufferStr(upgrade_buffer,
"\n-- For binary upgrade, must preserve pg_class oids and relfilenodes\n");
@@ -4903,7 +4903,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
*/
if (RelFileNumberIsValid(relfilenumber) && relkind != RELKIND_PARTITIONED_TABLE)
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
/*
@@ -4917,7 +4917,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_toast_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_relfilenumber);
/* every toast table has an index */
@@ -4925,7 +4925,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_index_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
toast_index_relfilenumber);
}
@@ -4938,7 +4938,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
pg_class_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" INT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
}
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 269ed64..8be5e66 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -538,7 +538,7 @@ isRelDataFile(const char *path)
segNo = 0;
matched = false;
- nmatch = sscanf(path, "global/%u.%u", &rlocator.relNumber, &segNo);
+ nmatch = sscanf(path, "global/" INT64_FORMAT ".%u", &rlocator.relNumber, &segNo);
if (nmatch == 1 || nmatch == 2)
{
rlocator.spcOid = GLOBALTABLESPACE_OID;
@@ -547,7 +547,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "base/%u/%u.%u",
+ nmatch = sscanf(path, "base/%u/" INT64_FORMAT ".%u",
&rlocator.dbOid, &rlocator.relNumber, &segNo);
if (nmatch == 2 || nmatch == 3)
{
@@ -556,7 +556,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/%u.%u",
+ nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/" INT64_FORMAT ".%u",
&rlocator.spcOid, &rlocator.dbOid, &rlocator.relNumber,
&segNo);
if (nmatch == 3 || nmatch == 4)
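The sscanf changes above widen the relfilenumber conversion to 64 bits; the conversion specifier has to match the field width or the parse silently corrupts adjacent memory. A self-contained sketch of the "base/<dboid>/<relnumber>[.<segno>]" case, using the portable <inttypes.h> scan macros in place of PostgreSQL's INT64_FORMAT macro (an assumption of this sketch, not the patch's code):

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/*
 * Parse "base/<dboid>/<relnumber>[.<segno>]". Returns the number of
 * fields matched (2 for segment 0, 3 when a ".N" suffix is present),
 * mirroring the nmatch logic in isRelDataFile().
 */
static int
parse_base_relfile_path(const char *path, uint32_t *dboid,
                        uint64_t *relnumber, uint32_t *segno)
{
    *segno = 0;                 /* segment 0 carries no ".N" suffix */
    return sscanf(path, "base/%" SCNu32 "/%" SCNu64 ".%" SCNu32,
                  dboid, relnumber, segno);
}
```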
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 53ea348..1d3982f 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -527,7 +527,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
relname = PQgetvalue(res, relnum, i_relname);
curr->relname = pg_strdup(relname);
- curr->relfilenumber = atooid(PQgetvalue(res, relnum, i_relfilenumber));
+ curr->relfilenumber =
+ atorelnumber(PQgetvalue(res, relnum, i_relfilenumber));
curr->tblsp_alloc = false;
/* Is the tablespace oid non-default? */
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index 115faa2..7ab1bcc 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -15,10 +15,8 @@
* oids are the same between old and new clusters. This is important
* because toast oids are stored as toast pointers in user tables.
*
- * While pg_class.oid and pg_class.relfilenode are initially the same in a
- * cluster, they can diverge due to CLUSTER, REINDEX, or VACUUM FULL. We
- * control assignments of pg_class.relfilenode because we want the filenames
- * to match between the old and new cluster.
+ * We control assignments of pg_class.relfilenode because we want the
+ * filenames to match between the old and new cluster.
*
* We control assignment of pg_tablespace.oid because we want the oid to match
* between the old and new cluster.
diff --git a/src/bin/pg_upgrade/relfilenumber.c b/src/bin/pg_upgrade/relfilenumber.c
index c3c7402..c8b10e4 100644
--- a/src/bin/pg_upgrade/relfilenumber.c
+++ b/src/bin/pg_upgrade/relfilenumber.c
@@ -190,14 +190,14 @@ transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_fro
else
snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
- snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
+ snprintf(old_file, sizeof(old_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->old_tablespace,
map->old_tablespace_suffix,
map->db_oid,
map->relfilenumber,
type_suffix,
extent_suffix);
- snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
+ snprintf(new_file, sizeof(new_file), "%s%s/%u/" INT64_FORMAT "%s%s",
map->new_tablespace,
map->new_tablespace_suffix,
map->db_oid,
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 6528113..9aca583 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -884,7 +884,7 @@ main(int argc, char **argv)
}
break;
case 'R':
- if (sscanf(optarg, "%u/%u/%u",
+ if (sscanf(optarg, "%u/%u/" INT64_FORMAT,
&config.filter_by_relation.spcOid,
&config.filter_by_relation.dbOid,
&config.filter_by_relation.relNumber) != 3 ||
diff --git a/src/bin/scripts/t/090_reindexdb.pl b/src/bin/scripts/t/090_reindexdb.pl
index e706d68..de5cee6 100644
--- a/src/bin/scripts/t/090_reindexdb.pl
+++ b/src/bin/scripts/t/090_reindexdb.pl
@@ -40,7 +40,7 @@ my $toast_index = $node->safe_psql('postgres',
# REINDEX operations. A set of relfilenodes is saved from the catalogs
# and then compared with pg_class.
$node->safe_psql('postgres',
- 'CREATE TABLE index_relfilenodes (parent regclass, indname text, indoid oid, relfilenode oid);'
+ 'CREATE TABLE index_relfilenodes (parent regclass, indname text, indoid oid, relfilenode int8);'
);
# Save the relfilenode of a set of toast indexes, one from the catalog
# pg_constraint and one from the test table.
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 1b6b620..0774e3f 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -149,10 +149,10 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
Assert(dbOid == 0);
Assert(backendId == InvalidBackendId);
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("global/%u_%s",
+ path = psprintf("global/" INT64_FORMAT "_%s",
relNumber, forkNames[forkNumber]);
else
- path = psprintf("global/%u", relNumber);
+ path = psprintf("global/" INT64_FORMAT, relNumber);
}
else if (spcOid == DEFAULTTABLESPACE_OID)
{
@@ -160,21 +160,21 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/%u_%s",
+ path = psprintf("base/%u/" INT64_FORMAT "_%s",
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/%u",
+ path = psprintf("base/%u/" INT64_FORMAT,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/t%d_%u_%s",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT "_%s",
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/t%d_%u",
+ path = psprintf("base/%u/t%d_" INT64_FORMAT,
dbOid, backendId, relNumber);
}
}
@@ -184,24 +184,24 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" INT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber);
}
diff --git a/src/fe_utils/option_utils.c b/src/fe_utils/option_utils.c
index abea881..7d49359 100644
--- a/src/fe_utils/option_utils.c
+++ b/src/fe_utils/option_utils.c
@@ -82,3 +82,42 @@ option_parse_int(const char *optarg, const char *optname,
*result = val;
return true;
}
+
+/*
+ * option_parse_relfilenumber
+ *
+ * Parse relfilenumber value for an option. If the parsing is successful,
+ * returns true; if parsing fails, returns false.
+ */
+bool
+option_parse_relfilenumber(const char *optarg, const char *optname)
+{
+ char *endptr;
+ int64 val;
+
+ errno = 0;
+ val = strtoi64(optarg, &endptr, 10);
+
+ /*
+ * Skip any trailing whitespace; if anything but whitespace remains before
+ * the terminating character, fail.
+ */
+ while (*endptr != '\0' && isspace((unsigned char) *endptr))
+ endptr++;
+
+ if (*endptr != '\0')
+ {
+ pg_log_error("invalid value \"%s\" for option %s",
+ optarg, optname);
+ return false;
+ }
+
+ if (val < 0 || val > MAX_RELFILENUMBER)
+ {
+ pg_log_error("%s must be in range " INT64_FORMAT ".." INT64_FORMAT,
+ optname, (RelFileNumber) 0, MAX_RELFILENUMBER);
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 775471d..3dbe7bd 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -196,6 +196,28 @@ FullTransactionIdAdvance(FullTransactionId *dest)
#define FirstUnpinnedObjectId 12000
#define FirstNormalObjectId 16384
+/* ----------
+ * For the system tables (OID < FirstNormalObjectId) the initial storage
+ * will be created with a relfilenumber equal to the table's OID. Later,
+ * the relfilenumber for any new storage is allocated by
+ * GetNewRelFileNumber(), which starts at 100000. Thus, when upgrading from
+ * an older cluster, the relation storage path for a user table from the
+ * old cluster will not conflict with the relation storage path for a
+ * system table from the new cluster. In any case, the new cluster must not
+ * have any user tables while upgrading, so we needn't worry about them.
+ * ----------
+ */
+#define FirstNormalRelFileNumber ((RelFileNumber) 100000)
+
+#define CHECK_RELFILENUMBER_RANGE(relfilenumber) \
+do { \
+ if ((relfilenumber) < 0 || (relfilenumber) > MAX_RELFILENUMBER) \
+ ereport(ERROR, \
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE), \
+ errmsg("relfilenumber %lld is out of range", \
+ (long long) (relfilenumber))); \
+} while (0)
+
/*
* VariableCache is a data structure in shared memory that is used to track
* OID and XID assignment state. For largely historical reasons, there is
@@ -215,6 +237,15 @@ typedef struct VariableCacheData
uint32 oidCount; /* OIDs available before must do XLOG work */
/*
+ * These fields are protected by RelFileNumberGenLock.
+ */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber to assign */
+ RelFileNumber loggedRelFileNumber; /* last logged relfilenumber */
+ RelFileNumber flushedRelFileNumber; /* last flushed relfilenumber */
+ XLogRecPtr loggedRelFileNumberRecPtr; /* xlog record pointer w.r.t.
+ * loggedRelFileNumber */
+
+ /*
* These fields are protected by XidGenLock.
*/
FullTransactionId nextXid; /* next XID to assign */
@@ -293,6 +324,9 @@ extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
extern bool ForceTransactionIdLimitUpdate(void);
extern Oid GetNewObjectId(void);
+extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
+ char relpersistence);
+extern void SetNextRelFileNumber(RelFileNumber relnumber);
extern void StopGeneratingPinnedObjectIds(void);
#ifdef USE_ASSERT_CHECKING
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index cd674c3..a005bdb 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -234,6 +234,7 @@ extern void CreateCheckPoint(int flags);
extern bool CreateRestartPoint(int flags);
extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN);
extern void XLogPutNextOid(Oid nextOid);
+extern XLogRecPtr LogNextRelFileNumber(RelFileNumber nextrelnumber);
extern XLogRecPtr XLogRestorePoint(const char *rpName);
extern void UpdateFullPageWrites(void);
extern void GetFullPageWriteInfo(XLogRecPtr *RedoRecPtr_p, bool *doPageWrites_p);
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index e1c85f9..b452530 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -38,8 +38,5 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
-extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
- Relation pg_class,
- char relpersistence);
#endif /* CATALOG_H */
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index e1f4eef..cacfc11 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -34,6 +34,13 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* oid */
Oid oid;
+ /* access method; 0 if not a table / index */
+ Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
+
+ /* identifier of physical storage file */
+ /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
+ int64 relfilenode BKI_DEFAULT(0);
+
/* class name */
NameData relname;
@@ -49,13 +56,6 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* class owner */
Oid relowner BKI_DEFAULT(POSTGRES) BKI_LOOKUP(pg_authid);
- /* access method; 0 if not a table / index */
- Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
-
- /* identifier of physical storage file */
- /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
- Oid relfilenode BKI_DEFAULT(0);
-
/* identifier of table space for relation (0 means default for database) */
Oid reltablespace BKI_DEFAULT(0) BKI_LOOKUP_OPT(pg_tablespace);
@@ -154,7 +154,7 @@ typedef FormData_pg_class *Form_pg_class;
DECLARE_UNIQUE_INDEX_PKEY(pg_class_oid_index, 2662, ClassOidIndexId, on pg_class using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_class_relname_nsp_index, 2663, ClassNameNspIndexId, on pg_class using btree(relname name_ops, relnamespace oid_ops));
-DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode oid_ops));
+DECLARE_INDEX(pg_class_relfilenode_index, 3455, ClassRelfilenodeIndexId, on pg_class using btree(relfilenode int8_ops));
#ifdef EXPOSE_TO_CLIENT_CODE
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 06368e2..096222f 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -41,6 +41,7 @@ typedef struct CheckPoint
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
FullTransactionId nextXid; /* next free transaction ID */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
@@ -78,6 +79,7 @@ typedef struct CheckPoint
#define XLOG_FPI 0xB0
/* 0xC0 is used in Postgres 9.5-11 */
#define XLOG_OVERWRITE_CONTRECORD 0xD0
+#define XLOG_NEXT_RELFILENUMBER 0xE0
/*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index a07e737..8b72f8a 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7329,11 +7329,11 @@
proname => 'pg_indexes_size', provolatile => 'v', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_indexes_size' },
{ oid => '2999', descr => 'filenode identifier of relation',
- proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'oid',
+ proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_relation_filenode' },
{ oid => '3454', descr => 'relation OID for filenode and tablespace',
proname => 'pg_filenode_relation', provolatile => 's',
- prorettype => 'regclass', proargtypes => 'oid oid',
+ prorettype => 'regclass', proargtypes => 'oid int8',
prosrc => 'pg_filenode_relation' },
{ oid => '3034', descr => 'file path of relation',
proname => 'pg_relation_filepath', provolatile => 's', prorettype => 'text',
@@ -11125,15 +11125,15 @@
prosrc => 'binary_upgrade_set_missing_value' },
{ oid => '4545', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_heap_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_heap_relfilenode' },
{ oid => '4546', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_index_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_index_relfilenode' },
{ oid => '4547', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_toast_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_toast_relfilenode' },
{ oid => '4548', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_pg_tablespace_oid', provolatile => 'v',
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index 3ab7132..f84e22c 100644
--- a/src/include/common/relpath.h
+++ b/src/include/common/relpath.h
@@ -28,6 +28,7 @@
/* Characters to allow for an OID in a relation path */
#define OIDCHARS 10 /* max chars printed by %u */
+#define RELNUMBERCHARS 20 /* max chars printed by %llu */
/*
* Stuff for fork names.
diff --git a/src/include/fe_utils/option_utils.h b/src/include/fe_utils/option_utils.h
index 03c09fd..2508a61 100644
--- a/src/include/fe_utils/option_utils.h
+++ b/src/include/fe_utils/option_utils.h
@@ -22,5 +22,7 @@ extern void handle_help_version_opts(int argc, char *argv[],
extern bool option_parse_int(const char *optarg, const char *optname,
int min_range, int max_range,
int *result);
+extern bool option_parse_relfilenumber(const char *optarg,
+ const char *optname);
#endif /* OPTION_UTILS_H */
diff --git a/src/include/postgres_ext.h b/src/include/postgres_ext.h
index c9774fa..12b6b80 100644
--- a/src/include/postgres_ext.h
+++ b/src/include/postgres_ext.h
@@ -48,11 +48,17 @@ typedef PG_INT64_TYPE pg_int64;
/*
* RelFileNumber data type identifies the specific relation file name.
+ * RelFileNumber is unique within a cluster.
*/
-typedef Oid RelFileNumber;
-#define InvalidRelFileNumber ((RelFileNumber) InvalidOid)
+typedef pg_int64 RelFileNumber;
+
+#define InvalidRelFileNumber ((RelFileNumber) 0)
#define RelFileNumberIsValid(relnumber) \
((bool) ((relnumber) != InvalidRelFileNumber))
+#define atorelnumber(x) ((RelFileNumber) strtoll((x), NULL, 10))
+
+/* Max value of the relfilenumber; relfilenumbers are 56 bits wide. */
+#define MAX_RELFILENUMBER INT64CONST(0x00FFFFFFFFFFFFFF)
/*
* Identifiers of error message fields. Kept here to keep common
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 406db6b..1301301 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -92,29 +92,73 @@ typedef struct buftag
{
Oid spcOid; /* tablespace oid */
Oid dbOid; /* database oid */
- RelFileNumber relNumber; /* relation file number */
- ForkNumber forkNum; /* fork number */
+
+ /*
+ * Represents the relfilenumber and the fork number. The high 8 bits of
+ * the first 32-bit integer hold the fork number; the remaining 24 bits of
+ * the first integer plus all 32 bits of the second integer hold the
+ * relfilenumber, making the relfilenumber 56 bits wide. We use 56 bits
+ * rather than a full 64 because a 64-bit field would increase the size of
+ * the BufferTag; and we use two 32-bit integers rather than a single
+ * 64-bit integer to avoid 8-byte alignment padding in the BufferTag
+ * structure.
+ */
+ uint32 relForkDetails[2];
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
+/* High relNumber bits in relForkDetails[0] */
+#define BUFTAG_RELNUM_HIGH_BITS 24
+
+/* Low relNumber bits in relForkDetails[1] */
+#define BUFTAG_RELNUM_LOW_BITS 32
+
+/* Mask to fetch high bits of relNumber from relForkDetails[0] */
+#define BUFTAG_RELNUM_HIGH_MASK ((1U << BUFTAG_RELNUM_HIGH_BITS) - 1)
+
+/* Mask to fetch low bits of relNumber from relForkDetails[1] */
+#define BUFTAG_RELNUM_LOW_MASK 0XFFFFFFFF
+
static inline RelFileNumber
BufTagGetRelNumber(const BufferTag *tag)
{
- return tag->relNumber;
+ uint64 relnum;
+
+ relnum = ((uint64) tag->relForkDetails[0]) & BUFTAG_RELNUM_HIGH_MASK;
+ relnum = (relnum << BUFTAG_RELNUM_LOW_BITS) | tag->relForkDetails[1];
+
+ Assert(relnum <= MAX_RELFILENUMBER);
+ return (RelFileNumber) relnum;
}
static inline ForkNumber
BufTagGetForkNum(const BufferTag *tag)
{
- return tag->forkNum;
+ int8 ret;
+
+ StaticAssertStmt(MAX_FORKNUM <= INT8_MAX,
+ "MAX_FORKNUM can't be greater than INT8_MAX");
+
+ ret = (int8) (tag->relForkDetails[0] >> BUFTAG_RELNUM_HIGH_BITS);
+ return (ForkNumber) ret;
}
static inline void
BufTagSetRelForkDetails(BufferTag *tag, RelFileNumber relnumber,
ForkNumber forknum)
{
- tag->relNumber = relnumber;
- tag->forkNum = forknum;
+ uint64 relnum;
+
+ Assert(relnumber <= MAX_RELFILENUMBER);
+ Assert(forknum <= MAX_FORKNUM);
+
+ relnum = relnumber;
+
+ tag->relForkDetails[0] = (relnum >> BUFTAG_RELNUM_LOW_BITS) &
+ BUFTAG_RELNUM_HIGH_MASK;
+ tag->relForkDetails[0] |= (forknum << BUFTAG_RELNUM_HIGH_BITS);
+ tag->relForkDetails[1] = relnum & BUFTAG_RELNUM_LOW_MASK;
}
static inline RelFileLocator
@@ -153,9 +197,9 @@ BufferTagsEqual(const BufferTag *tag1, const BufferTag *tag2)
{
return (tag1->spcOid == tag2->spcOid) &&
(tag1->dbOid == tag2->dbOid) &&
- (tag1->relNumber == tag2->relNumber) &&
- (tag1->blockNum == tag2->blockNum) &&
- (tag1->forkNum == tag2->forkNum);
+ (tag1->relForkDetails[0] == tag2->relForkDetails[0]) &&
+ (tag1->relForkDetails[1] == tag2->relForkDetails[1]) &&
+ (tag1->blockNum == tag2->blockNum);
}
static inline bool
diff --git a/src/include/storage/relfilelocator.h b/src/include/storage/relfilelocator.h
index 10f41f3..ef90464 100644
--- a/src/include/storage/relfilelocator.h
+++ b/src/include/storage/relfilelocator.h
@@ -32,10 +32,11 @@
* Nonzero dbOid values correspond to pg_database.oid.
*
* relNumber identifies the specific relation. relNumber corresponds to
- * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
- * to assign new physical files to relations in some situations).
- * Notice that relNumber is only unique within a database in a particular
- * tablespace.
+ * pg_class.relfilenode. Notice that relNumber values are assigned by
+ * GetNewRelFileNumber(), which will only ever assign the same value once
+ * during the lifetime of a cluster. However, since CREATE DATABASE duplicates
+ * the relfilenumbers of the template database, the values are in practice only
+ * unique within a database, not globally.
*
* Note: spcOid must be GLOBALTABLESPACE_OID if and only if dbOid is
* zero. We support shared relations only in the "global" tablespace.
@@ -75,6 +76,9 @@ typedef struct RelFileLocatorBackend
BackendId backend;
} RelFileLocatorBackend;
+#define SizeOfRelFileLocatorBackend \
+ (offsetof(RelFileLocatorBackend, backend) + sizeof(BackendId))
+
#define RelFileLocatorBackendIsTemp(rlocator) \
((rlocator).backend != InvalidBackendId)
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index d63f4f1..a489ccc 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -2164,9 +2164,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2175,10 +2174,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+---------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | t | own | child 0 index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | t | own | child 1 index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | t | orig | child 0 index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | t | orig | child 1 index
at_partitioned_id_name_key | t | none | parent index
(6 rows)
@@ -2198,9 +2197,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2209,10 +2207,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+--------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | f | own | parent index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | f | own | parent index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | f | new | parent index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | f | new | parent index
at_partitioned_id_name_key | f | none | parent index
(6 rows)
@@ -2560,7 +2558,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/expected/fast_default.out b/src/test/regress/expected/fast_default.out
index 91f2571..0a35f33 100644
--- a/src/test/regress/expected/fast_default.out
+++ b/src/test/regress/expected/fast_default.out
@@ -3,8 +3,8 @@
--
SET search_path = fast_default;
CREATE SCHEMA fast_default;
-CREATE TABLE m(id OID);
-INSERT INTO m VALUES (NULL::OID);
+CREATE TABLE m(id BIGINT);
+INSERT INTO m VALUES (NULL::BIGINT);
CREATE FUNCTION set(tabname name) RETURNS VOID
AS $$
BEGIN
diff --git a/src/test/regress/expected/oidjoins.out b/src/test/regress/expected/oidjoins.out
index 215eb89..af57470 100644
--- a/src/test/regress/expected/oidjoins.out
+++ b/src/test/regress/expected/oidjoins.out
@@ -74,11 +74,11 @@ NOTICE: checking pg_type {typcollation} => pg_collation {oid}
NOTICE: checking pg_attribute {attrelid} => pg_class {oid}
NOTICE: checking pg_attribute {atttypid} => pg_type {oid}
NOTICE: checking pg_attribute {attcollation} => pg_collation {oid}
+NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {relnamespace} => pg_namespace {oid}
NOTICE: checking pg_class {reltype} => pg_type {oid}
NOTICE: checking pg_class {reloftype} => pg_type {oid}
NOTICE: checking pg_class {relowner} => pg_authid {oid}
-NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {reltablespace} => pg_tablespace {oid}
NOTICE: checking pg_class {reltoastrelid} => pg_class {oid}
NOTICE: checking pg_class {relrewrite} => pg_class {oid}
diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql
index e7013f5..c9622f6 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -1478,9 +1478,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1499,9 +1498,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1641,7 +1639,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/sql/fast_default.sql b/src/test/regress/sql/fast_default.sql
index 16a3b7c..819ec40 100644
--- a/src/test/regress/sql/fast_default.sql
+++ b/src/test/regress/sql/fast_default.sql
@@ -4,8 +4,8 @@
SET search_path = fast_default;
CREATE SCHEMA fast_default;
-CREATE TABLE m(id OID);
-INSERT INTO m VALUES (NULL::OID);
+CREATE TABLE m(id BIGINT);
+INSERT INTO m VALUES (NULL::BIGINT);
CREATE FUNCTION set(tabname name) RETURNS VOID
AS $$
--
1.8.3.1
On Fri, Sep 9, 2022 at 3:32 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Thu, Sep 8, 2022 at 4:10 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On a separate note, while reviewing the latest patch I see there is some risk of using an unflushed relfilenumber in the GetNewRelFileNumber() function. Basically, in the current code, the flushing logic is tightly coupled with the logic for logging new relfilenumbers, and that might not work with all values of VAR_RELNUMBER_NEW_XLOG_THRESHOLD. So the idea is that we need to keep the flushing logic separate from the logging; I am working on that and will post a patch soon.
I have fixed the issue, so now we track nextRelFileNumber,
loggedRelFileNumber and flushedRelFileNumber. Whenever
nextRelFileNumber comes within VAR_RELNUMBER_NEW_XLOG_THRESHOLD of
loggedRelFileNumber, we log VAR_RELNUMBER_PER_XLOG more
relfilenumbers. And whenever nextRelFileNumber reaches
flushedRelFileNumber, we do an XLogFlush of WAL up to the last
loggedRelFileNumber. Ideally flushedRelFileNumber should always be
VAR_RELNUMBER_PER_XLOG numbers behind loggedRelFileNumber, so we could
avoid tracking flushedRelFileNumber separately, but I feel keeping
track of flushedRelFileNumber is cleaner and easier to understand.
For more details refer to the code in GetNewRelFileNumber().
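The scheme described above can be sketched roughly as follows. This is a simplified single-threaded model, not the patch's actual code: the constant values are illustrative, and the counters stand in for LogNextRelFileNumber() and XLogFlush().

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t RelFileNumber;

/* Illustrative values; the real constants live in the patch. */
#define VAR_RELNUMBER_NEW_XLOG_THRESHOLD 64
#define VAR_RELNUMBER_PER_XLOG 512

/* Simplified stand-ins for the shared-memory state and WAL operations. */
static RelFileNumber nextRelFileNumber = 1;
static RelFileNumber loggedRelFileNumber = 0;
static RelFileNumber flushedRelFileNumber = 0;
static int wal_records_logged = 0;
static int wal_flushes = 0;

/* Sketch of the allocation path described above (not the real function). */
static RelFileNumber
get_new_relfilenumber(void)
{
	/* Running low on pre-logged numbers?  Log another batch. */
	if (loggedRelFileNumber - nextRelFileNumber <= VAR_RELNUMBER_NEW_XLOG_THRESHOLD)
	{
		loggedRelFileNumber += VAR_RELNUMBER_PER_XLOG;
		wal_records_logged++;	/* stands in for LogNextRelFileNumber() */
	}

	/* Never hand out a number whose WAL record might not be on disk. */
	if (nextRelFileNumber >= flushedRelFileNumber)
	{
		wal_flushes++;			/* stands in for XLogFlush() */
		flushedRelFileNumber = loggedRelFileNumber;
	}

	return nextRelFileNumber++;
}
```

The point of the model is the invariant it maintains: every number handed out is below flushedRelFileNumber, which in turn never exceeds loggedRelFileNumber, so a crash can never replay an allocation that wasn't durably logged.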
Here are a few minor suggestions I came across while reading this
patch, might be useful:
+#ifdef USE_ASSERT_CHECKING
+
+ {
Unnecessary blank line after USE_ASSERT_CHECKING.
--
+ return InvalidRelFileNumber; /* placate compiler */
I don't think we need this after the error, on recent branches.
--
+ LWLockAcquire(RelFileNumberGenLock, LW_SHARED);
+ if (shutdown)
+ checkPoint.nextRelFileNumber = ShmemVariableCache->nextRelFileNumber;
+ else
+ checkPoint.nextRelFileNumber = ShmemVariableCache->loggedRelFileNumber;
+
+ LWLockRelease(RelFileNumberGenLock);
This is done for a good reason, I think, but it should have a comment
explaining why checkPoint.nextRelFileNumber is assigned differently in
the two cases, from a crash-recovery perspective.
--
+#define SizeOfRelFileLocatorBackend \
+ (offsetof(RelFileLocatorBackend, backend) + sizeof(BackendId))
Could we append empty parentheses "()" to the macro name, so that it
looks like a function call at the point of use, or else change the
macro name to uppercase?
--
+ if (val < 0 || val > MAX_RELFILENUMBER)
..
if ((relfilenumber) < 0 || (relfilenumber) > MAX_RELFILENUMBER) \
How about adding a macro for this condition as RelFileNumberIsValid()?
We can replace all the checks referring to MAX_RELFILENUMBER with this.
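The shared check being suggested might look something like this. This is only a sketch; the macro name RelFileNumberIsInRange is hypothetical (chosen here to avoid clashing with the existing RelFileNumberIsValid, which tests for InvalidRelFileNumber rather than range):

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t RelFileNumber;

/* 56-bit limit, as defined by the patch in postgres_ext.h. */
#define MAX_RELFILENUMBER ((RelFileNumber) INT64_C(0x00FFFFFFFFFFFFFF))

/*
 * Hypothetical helper along the lines suggested above: one predicate that
 * both CHECK_RELFILENUMBER_RANGE and option_parse_relfilenumber could share
 * instead of each spelling out the comparison against MAX_RELFILENUMBER.
 */
#define RelFileNumberIsInRange(relfilenumber) \
	((relfilenumber) >= 0 && (relfilenumber) <= MAX_RELFILENUMBER)
```

With that, the ereport/pg_log_error sites would differ only in how they report the failure, not in how they detect it.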
Regards,
Amul
On Fri, Sep 9, 2022 at 6:02 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
[ new patch ]
+typedef pg_int64 RelFileNumber;
This seems really random to me. First, why isn't this an unsigned
type? OID is unsigned and I don't see a reason to change to a signed
type. But even if we were going to change to a signed type, why
pg_int64? That is declared like this:
/* Define a signed 64-bit integer type for use in client API declarations. */
typedef PG_INT64_TYPE pg_int64;
Surely this is not a client API declaration....
Note that if we change this a lot of references to INT64_FORMAT will
need to become UINT64_FORMAT.
I think we should use int64 at the SQL level, because we don't have an
unsigned 64-bit SQL type, and a signed 64-bit type can hold 56 bits.
So it would still be Int64GetDatum((int64) rd_rel->relfilenode) or
similar. But internally I think using unsigned is cleaner.
+ * RelFileNumber is unique within a cluster.
Not really, because of CREATE DATABASE. Probably just drop this line.
Or else expand it: we never assign the same RelFileNumber twice within
the lifetime of the same cluster, but there can be multiple relations
with the same RelFileNumber e.g. because CREATE DATABASE duplicates
the RelFileNumber values from the template database. But maybe we
don't need this here, as it's already explained in relfilelocator.h.
+ ret = (int8) (tag->relForkDetails[0] >> BUFTAG_RELNUM_HIGH_BITS);
Why not declare ret as ForkNumber instead of casting twice?
+ uint64 relnum;
+
+ Assert(relnumber <= MAX_RELFILENUMBER);
+ Assert(forknum <= MAX_FORKNUM);
+
+ relnum = relnumber;
Perhaps it'd be better to write uint64 relnum = relnumber instead of
initializing on a separate line.
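For reference, the 56-bit packing under review can be exercised in isolation roughly like this. This is a standalone sketch that mirrors the patch's masks and shifts (the function names here are made up; the real code lives in buf_internals.h as BufTagSetRelForkDetails and friends):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Mirror of the patch's layout: fork number in the top 8 bits of word 0,
 * relfilenumber split across the low 24 bits of word 0 (high part) and all
 * 32 bits of word 1 (low part), for 56 bits total.
 */
#define RELNUM_HIGH_BITS 24
#define RELNUM_LOW_BITS  32
#define RELNUM_HIGH_MASK ((1U << RELNUM_HIGH_BITS) - 1)

static void
pack(uint32_t words[2], uint64_t relnum, int forknum)
{
	words[0] = (uint32_t) ((relnum >> RELNUM_LOW_BITS) & RELNUM_HIGH_MASK);
	words[0] |= (uint32_t) forknum << RELNUM_HIGH_BITS;
	words[1] = (uint32_t) (relnum & 0xFFFFFFFFU);
}

static uint64_t
unpack_relnum(const uint32_t words[2])
{
	uint64_t	relnum = (uint64_t) (words[0] & RELNUM_HIGH_MASK);

	return (relnum << RELNUM_LOW_BITS) | words[1];
}

static int
unpack_forknum(const uint32_t words[2])
{
	/* cast through int8_t so InvalidForkNumber (-1) sign-extends correctly */
	return (int) (int8_t) (words[0] >> RELNUM_HIGH_BITS);
}
```

Round-tripping the extreme values (relfilenumber 0 and 0x00FFFFFFFFFFFFFF, fork numbers -1 through 3) is an easy way to convince oneself the masks don't clobber each other.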
+#define RELNUMBERCHARS 20 /* max chars printed by %llu */
Maybe instead of %llu we should say UINT64_FORMAT (or INT64_FORMAT if
there's some reason to stick with a signed type).
+ elog(ERROR, "relfilenumber is out of bound");
It would have to be "out of bounds", with an "s". But maybe "is too
large" would be better.
+ nextRelFileNumber = ShmemVariableCache->nextRelFileNumber;
+ loggedRelFileNumber = ShmemVariableCache->loggedRelFileNumber;
+ flushedRelFileNumber = ShmemVariableCache->flushedRelFileNumber;
Maybe it would be a good idea to assert that next <= flushed and
flushed <= logged?
+#ifdef USE_ASSERT_CHECKING
+
+ {
+ RelFileLocatorBackend rlocator;
+ char *rpath;
Let's add a comment here, like "Because the RelFileNumber counter only
ever increases and never wraps around, it should be impossible for the
newly-allocated RelFileNumber to already be in use. But, if Asserts
are enabled, double check that there's no main-fork relation file with
the new RelFileNumber already on disk."
+ elog(ERROR, "cannot forward RelFileNumber during recovery");
forward -> set (or advance)
+ if (relnumber >= ShmemVariableCache->loggedRelFileNumber)
It probably doesn't make any difference, but to me it seems better to
test flushedRelFileNumber rather than loggedRelFileNumber here. What
do you think?
/*
* We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNumber may be
- * different from the relation's OID. It shouldn't really matter though.
- * In recovery, we are running by ourselves and can't have any lock
- * conflicts. While syncing, we already hold AccessExclusiveLock.
+ * relation. Note we are setting relId to just FirstNormalObjectId which
+ * is completely bogus. It shouldn't really matter though. In recovery,
+ * we are running by ourselves and can't have any lock conflicts. While
+ * syncing, we already hold AccessExclusiveLock.
*/
rel->rd_lockInfo.lockRelId.dbId = rlocator.dbOid;
- rel->rd_lockInfo.lockRelId.relId = rlocator.relNumber;
+ rel->rd_lockInfo.lockRelId.relId = FirstNormalObjectId;
Boy, this makes me uncomfortable. The existing logic is pretty bogus,
and we're replacing it with some other bogus thing. Do we know whether
anything actually does try to use this for locking?
One notable difference between the existing logic and your change is
that, with the existing logic, we use a bogus value that will differ
from one relation to the next, whereas with this change, it will
always be the same value. Perhaps rel->rd_lockInfo.lockRelId.relId =
(Oid) rlocator.relNumber would be a more natural adaptation?
+#define CHECK_RELFILENUMBER_RANGE(relfilenumber) \
+do { \
+ if ((relfilenumber) < 0 || (relfilenumber) > MAX_RELFILENUMBER) \
+ ereport(ERROR, \
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE), \
+ errmsg("relfilenumber %lld is out of range", \
+ (long long) (relfilenumber))); \
+} while (0)
Here, you take the approach of casting the relfilenumber to long long
and then using %lld. But elsewhere, you use
INT64_FORMAT/UINT64_FORMAT. If we're going to use this technique, we
ought to use it everywhere.
typedef struct
{
- Oid reltablespace;
- RelFileNumber relfilenumber;
-} RelfilenumberMapKey;
-
-typedef struct
-{
- RelfilenumberMapKey key; /* lookup key - must be first */
+ RelFileNumber relfilenumber; /* lookup key - must be first */
Oid relid; /* pg_class.oid */
} RelfilenumberMapEntry;
This feels like a bold change. Are you sure it's safe? i.e. Are you
certain that there's no way that a relfilenumber could repeat within a
database? If we're going to bank on that, we could adapt this more
heavily, e.g. RelidByRelfilenumber() could lose the reltablespace
parameter. I think maybe we should push this change into an 0002 patch
(or later) and have 0001 just do a minimal adaptation for the changed
data type.
Datum
pg_control_checkpoint(PG_FUNCTION_ARGS)
{
- Datum values[18];
- bool nulls[18];
+ Datum values[19];
+ bool nulls[19];
A documentation update is needed.
-Note that while a table's filenode often matches its OID, this is
-<emphasis>not</emphasis> necessarily the case; some operations, like
+Note that table's filenode are completely different than its OID. Although for
+system catalogs initial filenode matches with its OID, but some
operations, like
<command>TRUNCATE</command>, <command>REINDEX</command>,
<command>CLUSTER</command> and some forms
of <command>ALTER TABLE</command>, can change the filenode while
preserving the OID.
-Avoid assuming that filenode and table OID are the same.
Suggest: Note that a table's filenode will normally be different than
the OID. For system tables, the initial filenode will be equal to the
table OID, but it will be different if the table has ever been
subjected to a rewriting operation, such as TRUNCATE, REINDEX,
CLUSTER, or some forms of ALTER TABLE. For user tables, even the
initial filenode will be different than the table OID.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Tue, Sep 20, 2022 at 10:44 PM Robert Haas <robertmhaas@gmail.com> wrote:
Thanks for the review; please see my responses inline for some of the
comments, the rest are all accepted.
On Fri, Sep 9, 2022 at 6:02 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
[ new patch ]
+typedef pg_int64 RelFileNumber;
This seems really random to me. First, why isn't this an unsigned
type? OID is unsigned and I don't see a reason to change to a signed
type. But even if we were going to change to a signed type, why
pg_int64? That is declared like this:

/* Define a signed 64-bit integer type for use in client API declarations. */
typedef PG_INT64_TYPE pg_int64;

Surely this is not a client API declaration...
Note that if we change this a lot of references to INT64_FORMAT will
need to become UINT64_FORMAT.

I think we should use int64 at the SQL level, because we don't have an
unsigned 64-bit SQL type, and a signed 64-bit type can hold 56 bits.
So it would still be Int64GetDatum((int64) rd_rel->relfilenode) or
similar. But internally I think using unsigned is cleaner.
Yeah, you are right, we can make it uint64. With respect to this, we
cannot directly use uint64 because that is declared in c.h, which
cannot be used in postgres_ext.h IIUC. So what are the other options?
Maybe we can typedef RelFileNumber similarly to what c.h does for
uint64, i.e.:

#ifdef HAVE_LONG_INT_64
typedef unsigned long int uint64;
#elif defined(HAVE_LONG_LONG_INT_64)
typedef unsigned long long int uint64;
#endif

And maybe the same for UINT64CONST?

I don't like duplicating this logic, but is there any better
alternative? Can we move the existing definitions from c.h to some
common file (shared by client and server)?
+ if (relnumber >= ShmemVariableCache->loggedRelFileNumber)
It probably doesn't make any difference, but to me it seems better to
test flushedRelFileNumber rather than logRelFileNumber here. What do
you think?
Actually, based on this condition we log more, so it makes more sense
to check against loggedRelFileNumber; but OTOH, technically, without
flushing the log we are not supposed to use the relfilenumber, so it
makes more sense to test flushedRelFileNumber. Since both behave the
same here, I am fine with flushedRelFileNumber.
/*
 * We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNumber may be
- * different from the relation's OID. It shouldn't really matter though.
- * In recovery, we are running by ourselves and can't have any lock
- * conflicts. While syncing, we already hold AccessExclusiveLock.
+ * relation. Note we are setting relId to just FirstNormalObjectId which
+ * is completely bogus. It shouldn't really matter though. In recovery,
+ * we are running by ourselves and can't have any lock conflicts. While
+ * syncing, we already hold AccessExclusiveLock.
 */
rel->rd_lockInfo.lockRelId.dbId = rlocator.dbOid;
- rel->rd_lockInfo.lockRelId.relId = rlocator.relNumber;
+ rel->rd_lockInfo.lockRelId.relId = FirstNormalObjectId;

Boy, this makes me uncomfortable. The existing logic is pretty bogus,
and we're replacing it with some other bogus thing. Do we know whether
anything actually does try to use this for locking?

One notable difference between the existing logic and your change is
that, with the existing logic, we use a bogus value that will differ
from one relation to the next, whereas with this change, it will
always be the same value. Perhaps rel->rd_lockInfo.lockRelId.relId =
(Oid) rlocator.relNumber would be a more natural adaptation?

+#define CHECK_RELFILENUMBER_RANGE(relfilenumber) \
+do { \
+	if ((relfilenumber) < 0 || (relfilenumber) > MAX_RELFILENUMBER) \
+		ereport(ERROR, \
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE), \
+				errmsg("relfilenumber %lld is out of range", \
+					   (long long) (relfilenumber))); \
+} while (0)

Here, you take the approach of casting the relfilenumber to long long
and then using %lld. But elsewhere, you use
INT64_FORMAT/UINT64_FORMAT. If we're going to use this technique, we
ought to use it everywhere.
Based on the discussion [1], it seems we cannot use
INT64_FORMAT/UINT64_FORMAT in ereport messages. But in all other
places I am using INT64_FORMAT/UINT64_FORMAT. Does this make sense?
[1]: /messages/by-id/20220730113922.qd7qmenwcmzyacje@alvherre.pgsql
typedef struct
{
-	Oid			reltablespace;
-	RelFileNumber relfilenumber;
-} RelfilenumberMapKey;
-
-typedef struct
-{
-	RelfilenumberMapKey key;	/* lookup key - must be first */
+	RelFileNumber relfilenumber;	/* lookup key - must be first */
 	Oid			relid;			/* pg_class.oid */
 } RelfilenumberMapEntry;

This feels like a bold change. Are you sure it's safe? i.e. Are you
certain that there's no way that a relfilenumber could repeat within a
database?
IIUC, as of now, CREATE DATABASE is the only operation that can create
duplicate relfilenumbers, but those would be in different databases.
So based on that theory, I think it should be safe.
If we're going to bank on that, we could adapt this more
heavily, e.g. RelidByRelfilenumber() could lose the reltablespace
parameter.
Yeah, we might, although we would need a bool to identify whether it
is a shared relation or not.
I think maybe we should push this change into an 0002 patch
(or later) and have 0001 just do a minimal adaptation for the changed
data type.
Yeah, that makes sense.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Wed, Sep 21, 2022 at 3:39 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
Yeah, you are right, we can make it uint64. With respect to this, we
cannot directly use uint64 because that is declared in c.h, which
cannot be used in postgres_ext.h IIUC. So what are the other options?
Maybe we can typedef RelFileNumber similarly to what c.h does for
uint64, i.e.:

#ifdef HAVE_LONG_INT_64
typedef unsigned long int uint64;
#elif defined(HAVE_LONG_LONG_INT_64)
typedef unsigned long long int uint64;
#endif

I don't like duplicating this logic, but is there any better
alternative? Can we move the existing definitions from c.h to some
common file (shared by client and server)?
Here is the updated patch, which fixes all the agreed-upon comments
except this one, which needs more thought; for now I have used
unsigned long int.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachments:
v18-0002-Don-t-need-tabespace-id-to-uniquely-identify-rel.patchtext/x-patch; charset=US-ASCII; name=v18-0002-Don-t-need-tabespace-id-to-uniquely-identify-rel.patchDownload
From bf1681b0cf1603a94cf64a8bfeb444fbe45120e1 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Thu, 22 Sep 2022 10:48:38 +0530
Subject: [PATCH v18 2/2] Don't need tablespace id to uniquely identify relation
file within database
Now relfilenumber is 56 bits wide and will never wrap around,
so relfilenumbers are always unique within a database. So now we do not need
the tablespace OID to uniquely identify the relation file within a database.
---
contrib/pg_prewarm/autoprewarm.c | 4 +-
contrib/test_decoding/expected/rewrite.out | 2 +-
contrib/test_decoding/sql/rewrite.sql | 2 +-
src/backend/replication/logical/reorderbuffer.c | 5 +-
src/backend/utils/adt/dbsize.c | 3 +-
src/backend/utils/cache/relfilenumbermap.c | 69 +++++++++----------------
src/include/catalog/pg_class.h | 2 +-
src/include/utils/relfilenumbermap.h | 3 +-
8 files changed, 36 insertions(+), 54 deletions(-)
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index 31caf10..b8bbe33 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -31,6 +31,7 @@
#include "access/relation.h"
#include "access/xact.h"
#include "catalog/pg_class.h"
+#include "catalog/pg_tablespace.h"
#include "catalog/pg_type.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -511,7 +512,8 @@ autoprewarm_database_main(Datum main_arg)
Assert(rel == NULL);
StartTransactionCommand();
- reloid = RelidByRelfilenumber(blk->tablespace, blk->filenumber);
+ reloid = RelidByRelfilenumber(blk->filenumber,
+ blk->tablespace == GLOBALTABLESPACE_OID);
if (OidIsValid(reloid))
rel = try_relation_open(reloid, AccessShareLock);
diff --git a/contrib/test_decoding/expected/rewrite.out b/contrib/test_decoding/expected/rewrite.out
index b30999c..8f1aa48 100644
--- a/contrib/test_decoding/expected/rewrite.out
+++ b/contrib/test_decoding/expected/rewrite.out
@@ -106,7 +106,7 @@ VACUUM FULL pg_class;
-- reindexing of important relations / indexes
REINDEX TABLE pg_class;
REINDEX INDEX pg_class_oid_index;
-REINDEX INDEX pg_class_tblspc_relfilenode_index;
+REINDEX INDEX pg_class_relfilenode_index;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (5, 3);
BEGIN;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (6, 4);
diff --git a/contrib/test_decoding/sql/rewrite.sql b/contrib/test_decoding/sql/rewrite.sql
index 62dead3..8983704 100644
--- a/contrib/test_decoding/sql/rewrite.sql
+++ b/contrib/test_decoding/sql/rewrite.sql
@@ -77,7 +77,7 @@ VACUUM FULL pg_class;
-- reindexing of important relations / indexes
REINDEX TABLE pg_class;
REINDEX INDEX pg_class_oid_index;
-REINDEX INDEX pg_class_tblspc_relfilenode_index;
+REINDEX INDEX pg_class_relfilenode_index;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (5, 3);
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index a0f398b..a70a746 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -91,6 +91,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/pg_tablespace.h"
#include "lib/binaryheap.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -2153,8 +2154,8 @@ ReorderBufferProcessTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
case REORDER_BUFFER_CHANGE_DELETE:
Assert(snapshot_now);
- reloid = RelidByRelfilenumber(change->data.tp.rlocator.spcOid,
- change->data.tp.rlocator.relNumber);
+ reloid = RelidByRelfilenumber(change->data.tp.rlocator.relNumber,
+ change->data.tp.rlocator.spcOid == GLOBALTABLESPACE_OID);
/*
* Mapped catalog tuple without data, emitted while
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 9f70f35..4d33032 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -908,7 +908,8 @@ pg_filenode_relation(PG_FUNCTION_ARGS)
if (!RelFileNumberIsValid(relfilenumber))
PG_RETURN_NULL();
- heaprel = RelidByRelfilenumber(reltablespace, relfilenumber);
+ heaprel = RelidByRelfilenumber(relfilenumber,
+ reltablespace== GLOBALTABLESPACE_OID);
if (!OidIsValid(heaprel))
PG_RETURN_NULL();
diff --git a/src/backend/utils/cache/relfilenumbermap.c b/src/backend/utils/cache/relfilenumbermap.c
index 2e0acf9..bef14b0 100644
--- a/src/backend/utils/cache/relfilenumbermap.c
+++ b/src/backend/utils/cache/relfilenumbermap.c
@@ -32,17 +32,11 @@
static HTAB *RelfilenumberMapHash = NULL;
/* built first time through in InitializeRelfilenumberMap */
-static ScanKeyData relfilenumber_skey[2];
+static ScanKeyData relfilenumber_skey[1];
typedef struct
{
- Oid reltablespace;
- RelFileNumber relfilenumber;
-} RelfilenumberMapKey;
-
-typedef struct
-{
- RelfilenumberMapKey key; /* lookup key - must be first */
+ RelFileNumber relfilenumber; /* lookup key - must be first */
Oid relid; /* pg_class.oid */
} RelfilenumberMapEntry;
@@ -72,7 +66,7 @@ RelfilenumberMapInvalidateCallback(Datum arg, Oid relid)
entry->relid == relid) /* individual flushed relation */
{
if (hash_search(RelfilenumberMapHash,
- (void *) &entry->key,
+ (void *) &entry->relfilenumber,
HASH_REMOVE,
NULL) == NULL)
elog(ERROR, "hash table corrupted");
@@ -88,7 +82,6 @@ static void
InitializeRelfilenumberMap(void)
{
HASHCTL ctl;
- int i;
/* Make sure we've initialized CacheMemoryContext. */
if (CacheMemoryContext == NULL)
@@ -97,25 +90,20 @@ InitializeRelfilenumberMap(void)
/* build skey */
MemSet(&relfilenumber_skey, 0, sizeof(relfilenumber_skey));
- for (i = 0; i < 2; i++)
- {
- fmgr_info_cxt(F_OIDEQ,
- &relfilenumber_skey[i].sk_func,
- CacheMemoryContext);
- relfilenumber_skey[i].sk_strategy = BTEqualStrategyNumber;
- relfilenumber_skey[i].sk_subtype = InvalidOid;
- relfilenumber_skey[i].sk_collation = InvalidOid;
- }
-
- relfilenumber_skey[0].sk_attno = Anum_pg_class_reltablespace;
- relfilenumber_skey[1].sk_attno = Anum_pg_class_relfilenode;
+ fmgr_info_cxt(F_INT8EQ,
+ &relfilenumber_skey[0].sk_func,
+ CacheMemoryContext);
+ relfilenumber_skey[0].sk_strategy = BTEqualStrategyNumber;
+ relfilenumber_skey[0].sk_subtype = InvalidOid;
+ relfilenumber_skey[0].sk_collation = InvalidOid;
+ relfilenumber_skey[0].sk_attno = Anum_pg_class_relfilenode;
/*
* Only create the RelfilenumberMapHash now, so we don't end up partially
* initialized when fmgr_info_cxt() above ERRORs out with an out of memory
* error.
*/
- ctl.keysize = sizeof(RelfilenumberMapKey);
+ ctl.keysize = sizeof(RelFileNumber);
ctl.entrysize = sizeof(RelfilenumberMapEntry);
ctl.hcxt = CacheMemoryContext;
@@ -129,34 +117,25 @@ InitializeRelfilenumberMap(void)
}
/*
- * Map a relation's (tablespace, relfilenumber) to a relation's oid and cache
+ * Map a relation's relfilenumber to a relation's oid and cache
* the result.
*
* Returns InvalidOid if no relation matching the criteria could be found.
*/
Oid
-RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
+RelidByRelfilenumber(RelFileNumber relfilenumber, bool is_shared)
{
- RelfilenumberMapKey key;
RelfilenumberMapEntry *entry;
bool found;
SysScanDesc scandesc;
Relation relation;
HeapTuple ntp;
- ScanKeyData skey[2];
+ ScanKeyData skey[1];
Oid relid;
if (RelfilenumberMapHash == NULL)
InitializeRelfilenumberMap();
- /* pg_class will show 0 when the value is actually MyDatabaseTableSpace */
- if (reltablespace == MyDatabaseTableSpace)
- reltablespace = 0;
-
- MemSet(&key, 0, sizeof(key));
- key.reltablespace = reltablespace;
- key.relfilenumber = relfilenumber;
-
/*
* Check cache and return entry if one is found. Even if no target
* relation can be found later on we store the negative match and return a
@@ -164,7 +143,8 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
* since querying invalid values isn't supposed to be a frequent thing,
* but it's basically free.
*/
- entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_FIND, &found);
+ entry = hash_search(RelfilenumberMapHash, (void *) &relfilenumber,
+ HASH_FIND, &found);
if (found)
return entry->relid;
@@ -174,7 +154,7 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
/* initialize empty/negative cache entry before doing the actual lookups */
relid = InvalidOid;
- if (reltablespace == GLOBALTABLESPACE_OID)
+ if (is_shared)
{
/*
* Ok, shared table, check relmapper.
@@ -195,14 +175,13 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
memcpy(skey, relfilenumber_skey, sizeof(skey));
/* set scan arguments */
- skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
- skey[1].sk_argument = Int64GetDatum((int64) relfilenumber);
+ skey[0].sk_argument = Int64GetDatum((int64) relfilenumber);
scandesc = systable_beginscan(relation,
- ClassTblspcRelfilenodeIndexId,
+ ClassRelfilenodeIndexId,
true,
NULL,
- 2,
+ 1,
skey);
found = false;
@@ -213,11 +192,10 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
if (found)
elog(ERROR,
- "unexpected duplicate for tablespace %u, relfilenumber " UINT64_FORMAT,
- reltablespace, relfilenumber);
+ "unexpected duplicate for relfilenumber " UINT64_FORMAT,
+ relfilenumber);
found = true;
- Assert(classform->reltablespace == reltablespace);
Assert(classform->relfilenode == relfilenumber);
relid = classform->oid;
}
@@ -235,7 +213,8 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
* caused cache invalidations to be executed which would have deleted a
* new entry if we had entered it above.
*/
- entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_ENTER, &found);
+ entry = hash_search(RelfilenumberMapHash, (void *) &relfilenumber,
+ HASH_ENTER, &found);
if (found)
elog(ERROR, "corrupted hashtable");
entry->relid = relid;
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index 4768e5e..cacfc11 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -154,7 +154,7 @@ typedef FormData_pg_class *Form_pg_class;
DECLARE_UNIQUE_INDEX_PKEY(pg_class_oid_index, 2662, ClassOidIndexId, on pg_class using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_class_relname_nsp_index, 2663, ClassNameNspIndexId, on pg_class using btree(relname name_ops, relnamespace oid_ops));
-DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode int8_ops));
+DECLARE_INDEX(pg_class_relfilenode_index, 3455, ClassRelfilenodeIndexId, on pg_class using btree(relfilenode int8_ops));
#ifdef EXPOSE_TO_CLIENT_CODE
diff --git a/src/include/utils/relfilenumbermap.h b/src/include/utils/relfilenumbermap.h
index c149a93..ae81232 100644
--- a/src/include/utils/relfilenumbermap.h
+++ b/src/include/utils/relfilenumbermap.h
@@ -13,7 +13,6 @@
#ifndef RELFILENUMBERMAP_H
#define RELFILENUMBERMAP_H
-extern Oid RelidByRelfilenumber(Oid reltablespace,
- RelFileNumber relfilenumber);
+extern Oid RelidByRelfilenumber(RelFileNumber relfilenumber, bool is_shared);
#endif /* RELFILENUMBERMAP_H */
--
1.8.3.1
v18-0001-Widen-relfilenumber-from-32-bits-to-56-bits.patchtext/x-patch; charset=UTF-8; name=v18-0001-Widen-relfilenumber-from-32-bits-to-56-bits.patchDownload
From 950bf687ce67cac6cf41972e6ba28f263dfe6524 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Fri, 26 Aug 2022 10:20:18 +0530
Subject: [PATCH v18 1/2] Widen relfilenumber from 32 bits to 56 bits
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Currently relfilenumber is 32 bits wide and that has a risk of wraparound so
the relfilenumber can be reused. And to guard against the relfilenumber reuse
there is some complicated hack which leaves a 0-length tombstone file around
until the next checkpoint. And when we allocate a new relfilenumber
we also need to loop to check the on disk conflict.
As part of this patch we are making the relfilenumber 56 bits wide and there will be
no provision for wraparound. So after this change we will be able to get rid of the
0-length tombstone file and the loop for checking the on-disk conflict of the
relfilenumbers.
The reason behind making it 56 bits wide instead of directly making 64 bits wide is
that if we make it 64 bits wide then the size of the BufferTag will be increased which
will increase the memory usage and that may also impact the performance. So in order
to avoid that, inside the buffer tag, we will use 8 bits for the fork number and 56 bits
for the relfilenumber.
---
contrib/pg_buffercache/Makefile | 4 +-
.../pg_buffercache/pg_buffercache--1.3--1.4.sql | 30 +++
contrib/pg_buffercache/pg_buffercache.control | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 39 +++-
contrib/pg_prewarm/autoprewarm.c | 4 +-
contrib/pg_walinspect/expected/pg_walinspect.out | 4 +-
contrib/pg_walinspect/sql/pg_walinspect.sql | 4 +-
doc/src/sgml/catalogs.sgml | 2 +-
doc/src/sgml/func.sgml | 5 +
doc/src/sgml/pgbuffercache.sgml | 2 +-
doc/src/sgml/storage.sgml | 11 +-
src/backend/access/gin/ginxlog.c | 2 +-
src/backend/access/rmgrdesc/gistdesc.c | 2 +-
src/backend/access/rmgrdesc/heapdesc.c | 2 +-
src/backend/access/rmgrdesc/nbtdesc.c | 2 +-
src/backend/access/rmgrdesc/seqdesc.c | 2 +-
src/backend/access/rmgrdesc/xlogdesc.c | 21 ++-
src/backend/access/transam/README | 5 +-
src/backend/access/transam/varsup.c | 207 ++++++++++++++++++++-
src/backend/access/transam/xlog.c | 60 ++++++
src/backend/access/transam/xlogprefetcher.c | 14 +-
src/backend/access/transam/xlogrecovery.c | 6 +-
src/backend/access/transam/xlogutils.c | 12 +-
src/backend/backup/basebackup.c | 2 +-
src/backend/catalog/catalog.c | 95 ----------
src/backend/catalog/heap.c | 25 +--
src/backend/catalog/index.c | 11 +-
src/backend/catalog/storage.c | 8 +
src/backend/commands/tablecmds.c | 12 +-
src/backend/commands/tablespace.c | 2 +-
src/backend/nodes/gen_node_support.pl | 4 +-
src/backend/replication/logical/decode.c | 1 +
src/backend/replication/logical/reorderbuffer.c | 2 +-
src/backend/storage/file/reinit.c | 28 +--
src/backend/storage/freespace/fsmpage.c | 2 +-
src/backend/storage/lmgr/lwlocknames.txt | 1 +
src/backend/storage/smgr/md.c | 7 +
src/backend/storage/smgr/smgr.c | 2 +-
src/backend/utils/adt/dbsize.c | 7 +-
src/backend/utils/adt/pg_upgrade_support.c | 13 +-
src/backend/utils/cache/relcache.c | 2 +-
src/backend/utils/cache/relfilenumbermap.c | 4 +-
src/backend/utils/misc/pg_controldata.c | 9 +-
src/bin/pg_checksums/pg_checksums.c | 4 +-
src/bin/pg_controldata/pg_controldata.c | 2 +
src/bin/pg_dump/pg_dump.c | 26 +--
src/bin/pg_rewind/filemap.c | 6 +-
src/bin/pg_upgrade/info.c | 3 +-
src/bin/pg_upgrade/pg_upgrade.c | 6 +-
src/bin/pg_upgrade/relfilenumber.c | 4 +-
src/bin/pg_waldump/pg_waldump.c | 2 +-
src/bin/scripts/t/090_reindexdb.pl | 2 +-
src/common/relpath.c | 20 +-
src/fe_utils/option_utils.c | 39 ++++
src/include/access/transam.h | 34 ++++
src/include/access/xlog.h | 1 +
src/include/catalog/catalog.h | 3 -
src/include/catalog/pg_class.h | 16 +-
src/include/catalog/pg_control.h | 2 +
src/include/catalog/pg_proc.dat | 10 +-
src/include/common/relpath.h | 1 +
src/include/fe_utils/option_utils.h | 2 +
src/include/postgres_ext.h | 25 ++-
src/include/storage/buf_internals.h | 58 +++++-
src/include/storage/relfilelocator.h | 12 +-
src/test/regress/expected/alter_table.out | 24 ++-
src/test/regress/expected/fast_default.out | 4 +-
src/test/regress/expected/oidjoins.out | 2 +-
src/test/regress/sql/alter_table.sql | 8 +-
src/test/regress/sql/fast_default.sql | 4 +-
70 files changed, 701 insertions(+), 298 deletions(-)
create mode 100644 contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile
index d74b3e8..4d88eba 100644
--- a/contrib/pg_buffercache/Makefile
+++ b/contrib/pg_buffercache/Makefile
@@ -6,8 +6,8 @@ OBJS = \
pg_buffercache_pages.o
EXTENSION = pg_buffercache
-DATA = pg_buffercache--1.2.sql pg_buffercache--1.2--1.3.sql \
- pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql
+DATA = pg_buffercache--1.0--1.1.sql pg_buffercache--1.1--1.2.sql pg_buffercache--1.2.sql \
+ pg_buffercache--1.2--1.3.sql pg_buffercache--1.3--1.4.sql
PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time"
REGRESS = pg_buffercache
diff --git a/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
new file mode 100644
index 0000000..50956b1
--- /dev/null
+++ b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
@@ -0,0 +1,30 @@
+/* contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.4'" to load this file. \quit
+
+/* First we have to remove them from the extension */
+ALTER EXTENSION pg_buffercache DROP VIEW pg_buffercache;
+ALTER EXTENSION pg_buffercache DROP FUNCTION pg_buffercache_pages();
+
+/* Then we can drop them */
+DROP VIEW pg_buffercache;
+DROP FUNCTION pg_buffercache_pages();
+
+/* Now redefine */
+CREATE FUNCTION pg_buffercache_pages()
+RETURNS SETOF RECORD
+AS 'MODULE_PATHNAME', 'pg_buffercache_pages_v1_4'
+LANGUAGE C PARALLEL SAFE;
+
+CREATE VIEW pg_buffercache AS
+ SELECT P.* FROM pg_buffercache_pages() AS P
+ (bufferid integer, relfilenode int8, reltablespace oid, reldatabase oid,
+ relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
+ pinning_backends int4);
+
+-- Don't want these to be available to public.
+REVOKE ALL ON FUNCTION pg_buffercache_pages() FROM PUBLIC;
+REVOKE ALL ON pg_buffercache FROM PUBLIC;
+GRANT EXECUTE ON FUNCTION pg_buffercache_pages() TO pg_monitor;
+GRANT SELECT ON pg_buffercache TO pg_monitor;
diff --git a/contrib/pg_buffercache/pg_buffercache.control b/contrib/pg_buffercache/pg_buffercache.control
index 8c060ae..a82ae5f 100644
--- a/contrib/pg_buffercache/pg_buffercache.control
+++ b/contrib/pg_buffercache/pg_buffercache.control
@@ -1,5 +1,5 @@
# pg_buffercache extension
comment = 'examine the shared buffer cache'
-default_version = '1.3'
+default_version = '1.4'
module_pathname = '$libdir/pg_buffercache'
relocatable = true
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index c5754ea..a45f240 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -59,9 +59,10 @@ typedef struct
* relation node/tablespace/database/blocknum and dirty indicator.
*/
PG_FUNCTION_INFO_V1(pg_buffercache_pages);
+PG_FUNCTION_INFO_V1(pg_buffercache_pages_v1_4);
-Datum
-pg_buffercache_pages(PG_FUNCTION_ARGS)
+static Datum
+pg_buffercache_pages_internal(PG_FUNCTION_ARGS, Oid rfn_typid)
{
FuncCallContext *funcctx;
Datum result;
@@ -103,7 +104,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
TupleDescInitEntry(tupledesc, (AttrNumber) 1, "bufferid",
INT4OID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 2, "relfilenode",
- OIDOID, -1, 0);
+ rfn_typid, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 3, "reltablespace",
OIDOID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 4, "reldatabase",
@@ -209,7 +210,24 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
}
else
{
- values[1] = ObjectIdGetDatum(fctx->record[i].relfilenumber);
+ if (rfn_typid == INT8OID)
+ values[1] =
+ Int64GetDatum((int64) fctx->record[i].relfilenumber);
+ else
+ {
+ Assert(rfn_typid == OIDOID);
+
+ if (fctx->record[i].relfilenumber > OID_MAX)
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("relfilenode %llu is too large to be represented as an OID",
+ (unsigned long long) fctx->record[i].relfilenumber),
+ errhint("Upgrade the extension using ALTER EXTENSION pg_buffercache UPDATE"));
+
+ values[1] =
+ ObjectIdGetDatum((Oid) fctx->record[i].relfilenumber);
+ }
+
nulls[1] = false;
values[2] = ObjectIdGetDatum(fctx->record[i].reltablespace);
nulls[2] = false;
@@ -237,3 +255,16 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
else
SRF_RETURN_DONE(funcctx);
}
+
+/* entry point for old extension version */
+Datum
+pg_buffercache_pages(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, OIDOID);
+}
+
+Datum
+pg_buffercache_pages_v1_4(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, INT8OID);
+}
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index c8d673a..31caf10 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -345,7 +345,7 @@ apw_load_buffers(void)
{
unsigned forknum;
- if (fscanf(file, "%u,%u,%u,%u,%u\n", &blkinfo[i].database,
+ if (fscanf(file, "%u,%u," UINT64_FORMAT ",%u,%u\n", &blkinfo[i].database,
&blkinfo[i].tablespace, &blkinfo[i].filenumber,
&forknum, &blkinfo[i].blocknum) != 5)
ereport(ERROR,
@@ -669,7 +669,7 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
{
CHECK_FOR_INTERRUPTS();
- ret = fprintf(file, "%u,%u,%u,%u,%u\n",
+ ret = fprintf(file, "%u,%u," UINT64_FORMAT ",%u,%u\n",
block_info_array[i].database,
block_info_array[i].tablespace,
block_info_array[i].filenumber,
diff --git a/contrib/pg_walinspect/expected/pg_walinspect.out b/contrib/pg_walinspect/expected/pg_walinspect.out
index a1ee743..e9b06ed 100644
--- a/contrib/pg_walinspect/expected/pg_walinspect.out
+++ b/contrib/pg_walinspect/expected/pg_walinspect.out
@@ -54,9 +54,9 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- ===================================================================
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
ok
----
t
diff --git a/contrib/pg_walinspect/sql/pg_walinspect.sql b/contrib/pg_walinspect/sql/pg_walinspect.sql
index 1b265ea..5393834 100644
--- a/contrib/pg_walinspect/sql/pg_walinspect.sql
+++ b/contrib/pg_walinspect/sql/pg_walinspect.sql
@@ -39,10 +39,10 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
-- ===================================================================
-- Test for filtering out WAL records based on resource_manager and
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 00f833d..40d4e9c 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1984,7 +1984,7 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
</para>
<para>
Name of the on-disk file of this relation; zero means this
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index e1fe4fe..e514edb 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25204,6 +25204,11 @@ SELECT collation for ('foo' COLLATE "de_DE");
<entry><type>timestamp with time zone</type></entry>
</row>
+ <row>
+ <entry><structfield>next_relfilenumber</structfield></entry>
+ <entry><type>int8</type></entry>
+ </row>
+
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
index a06fd3e..e222265 100644
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@@ -62,7 +62,7 @@
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
(references <link linkend="catalog-pg-class"><structname>pg_class</structname></link>.<structfield>relfilenode</structfield>)
</para>
<para>
diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml
index e5b9f3f..d9e9b0f 100644
--- a/doc/src/sgml/storage.sgml
+++ b/doc/src/sgml/storage.sgml
@@ -217,11 +217,12 @@ with the suffix <literal>_init</literal> (see <xref linkend="storage-init"/>).
<caution>
<para>
-Note that while a table's filenode often matches its OID, this is
-<emphasis>not</emphasis> necessarily the case; some operations, like
-<command>TRUNCATE</command>, <command>REINDEX</command>, <command>CLUSTER</command> and some forms
-of <command>ALTER TABLE</command>, can change the filenode while preserving the OID.
-Avoid assuming that filenode and table OID are the same.
+Note that a table's filenode will normally be different from the OID. For
+system tables, the initial filenode will be equal to the table OID, but it will
+be different if the table has ever been subjected to a rewriting operation,
+such as <command>TRUNCATE</command>, <command>REINDEX</command>,
+<command>CLUSTER</command> or some forms of <command>ALTER TABLE</command>.
+For user tables, even the initial filenode will be different from the table OID.
Also, for certain system catalogs including <structname>pg_class</structname> itself,
<structname>pg_class</structname>.<structfield>relfilenode</structfield> contains zero. The
actual filenode number of these catalogs is stored in a lower-level data
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 41b9211..bc093f2 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -100,7 +100,7 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
BlockNumber blknum;
BufferGetTag(buffer, &locator, &forknum, &blknum);
- elog(ERROR, "failed to add item to index page in %u/%u/%u",
+ elog(ERROR, "failed to add item to index page in %u/%u/" UINT64_FORMAT,
locator.spcOid, locator.dbOid, locator.relNumber);
}
}
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index 7dd3c1d..d1c8a24 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -26,7 +26,7 @@ out_gistxlogPageUpdate(StringInfo buf, gistxlogPageUpdate *xlrec)
static void
out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec)
{
- appendStringInfo(buf, "rel %u/%u/%u; blk %u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" UINT64_FORMAT "; blk %u; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber, xlrec->block,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 923d3bc..70bd493 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -169,7 +169,7 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
{
xl_heap_new_cid *xlrec = (xl_heap_new_cid *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; tid %u/%u",
+ appendStringInfo(buf, "rel %u/%u/" UINT64_FORMAT "; tid %u/%u",
xlrec->target_locator.spcOid,
xlrec->target_locator.dbOid,
xlrec->target_locator.relNumber,
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index 4843cd5..6192a7b 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -100,7 +100,7 @@ btree_desc(StringInfo buf, XLogReaderState *record)
{
xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" UINT64_FORMAT "; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/seqdesc.c b/src/backend/access/rmgrdesc/seqdesc.c
index b3845f9..df72caf 100644
--- a/src/backend/access/rmgrdesc/seqdesc.c
+++ b/src/backend/access/rmgrdesc/seqdesc.c
@@ -25,7 +25,7 @@ seq_desc(StringInfo buf, XLogReaderState *record)
xl_seq_rec *xlrec = (xl_seq_rec *) rec;
if (info == XLOG_SEQ_LOG)
- appendStringInfo(buf, "rel %u/%u/%u",
+ appendStringInfo(buf, "rel %u/%u/" UINT64_FORMAT,
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber);
}
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 3fd7185..84a826b 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -45,8 +45,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
CheckPoint *checkpoint = (CheckPoint *) rec;
appendStringInfo(buf, "redo %X/%X; "
- "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; "
- "oldest xid %u in DB %u; oldest multi %u in DB %u; "
+ "tli %u; prev tli %u; fpw %s; xid %u:%u; relfilenumber " UINT64_FORMAT "; oid %u; "
+ "multi %u; offset %u; oldest xid %u in DB %u; oldest multi %u in DB %u; "
"oldest/newest commit timestamp xid: %u/%u; "
"oldest running xid %u; %s",
LSN_FORMAT_ARGS(checkpoint->redo),
@@ -55,6 +55,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->fullPageWrites ? "true" : "false",
EpochFromFullTransactionId(checkpoint->nextXid),
XidFromFullTransactionId(checkpoint->nextXid),
+ checkpoint->nextRelFileNumber,
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
@@ -74,6 +75,13 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
memcpy(&nextOid, rec, sizeof(Oid));
appendStringInfo(buf, "%u", nextOid);
}
+ else if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, rec, sizeof(RelFileNumber));
+ appendStringInfo(buf, UINT64_FORMAT, nextRelFileNumber);
+ }
else if (info == XLOG_RESTORE_POINT)
{
xl_restore_point *xlrec = (xl_restore_point *) rec;
@@ -169,6 +177,9 @@ xlog_identify(uint8 info)
case XLOG_NEXTOID:
id = "NEXTOID";
break;
+ case XLOG_NEXT_RELFILENUMBER:
+ id = "NEXT_RELFILENUMBER";
+ break;
case XLOG_SWITCH:
id = "SWITCH";
break;
@@ -237,7 +248,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfoChar(buf, ' ');
appendStringInfo(buf,
- "blkref #%d: rel %u/%u/%u fork %s blk %u",
+ "blkref #%d: rel %u/%u/" UINT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -297,7 +308,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
if (forknum != MAIN_FORKNUM)
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u fork %s blk %u",
+ ", blkref #%d: rel %u/%u/" UINT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -306,7 +317,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
else
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u blk %u",
+ ", blkref #%d: rel %u/%u/" UINT64_FORMAT " blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
blk);
diff --git a/src/backend/access/transam/README b/src/backend/access/transam/README
index 72af656..91c2578 100644
--- a/src/backend/access/transam/README
+++ b/src/backend/access/transam/README
@@ -692,8 +692,9 @@ by having database restart search for files that don't have any committed
entry in pg_class, but that currently isn't done because of the possibility
of deleting data that is useful for forensic analysis of the crash.
Orphan files are harmless --- at worst they waste a bit of disk space ---
-because we check for on-disk collisions when allocating new relfilenumber
-OIDs. So cleaning up isn't really necessary.
+because the relfilenumber counter is monotonically increasing. The maximum
+value is 2^56-1, and there is no provision for wraparound. Thus, on-disk
+collisions aren't possible.
3. Deleting a table, which requires an unlink() that could fail.
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 849a7ce..e8cad59 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -13,12 +13,16 @@
#include "postgres.h"
+#include <unistd.h>
+
#include "access/clog.h"
#include "access/commit_ts.h"
#include "access/subtrans.h"
#include "access/transam.h"
#include "access/xact.h"
#include "access/xlogutils.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_tablespace.h"
#include "commands/dbcommands.h"
#include "miscadmin.h"
#include "postmaster/autovacuum.h"
@@ -30,6 +34,15 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+/* Number of RelFileNumbers to be logged per XLOG write */
+#define VAR_RELNUMBER_PER_XLOG 512
+
+/*
+ * Need to log more if the number of remaining logged RelFileNumbers falls
+ * below this threshold. The valid range is 0 to VAR_RELNUMBER_PER_XLOG - 1.
+ */
+#define VAR_RELNUMBER_NEW_XLOG_THRESHOLD 256
+
/* pointer to "variable cache" in shared memory (set up by shmem.c) */
VariableCache ShmemVariableCache = NULL;
@@ -521,8 +534,7 @@ ForceTransactionIdLimitUpdate(void)
* wide, counter wraparound will occur eventually, and therefore it is unwise
* to assume they are unique unless precautions are taken to make them so.
* Hence, this routine should generally not be used directly. The only direct
- * callers should be GetNewOidWithIndex() and GetNewRelFileNumber() in
- * catalog/catalog.c.
+ * caller should be GetNewOidWithIndex() in catalog/catalog.c.
*/
Oid
GetNewObjectId(void)
@@ -613,6 +625,197 @@ SetNextObjectId(Oid nextOid)
}
/*
+ * GetNewRelFileNumber
+ *
+ * Similar to GetNewObjectId, but it generates a new relfilenumber instead
+ * of a new Oid.
+ */
+RelFileNumber
+GetNewRelFileNumber(Oid reltablespace, char relpersistence)
+{
+ RelFileNumber result;
+ RelFileNumber nextRelFileNumber,
+ loggedRelFileNumber,
+ flushedRelFileNumber;
+
+ StaticAssertStmt(VAR_RELNUMBER_NEW_XLOG_THRESHOLD < VAR_RELNUMBER_PER_XLOG,
+ "VAR_RELNUMBER_NEW_XLOG_THRESHOLD must be smaller than VAR_RELNUMBER_PER_XLOG");
+
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot assign RelFileNumber during recovery");
+
+ if (IsBinaryUpgrade)
+ elog(ERROR, "cannot assign RelFileNumber during binary upgrade");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ nextRelFileNumber = ShmemVariableCache->nextRelFileNumber;
+ loggedRelFileNumber = ShmemVariableCache->loggedRelFileNumber;
+ flushedRelFileNumber = ShmemVariableCache->flushedRelFileNumber;
+
+ Assert(nextRelFileNumber <= flushedRelFileNumber);
+ Assert(flushedRelFileNumber <= loggedRelFileNumber);
+
+ /* check for the wraparound for the relfilenumber counter */
+ if (unlikely(nextRelFileNumber > MAX_RELFILENUMBER))
+ elog(ERROR, "relfilenumber is too large");
+
+ /*
+ * If the number of remaining logged relfilenumbers falls below the
+ * threshold, log more. Ideally, we could wait until all logged
+ * relfilenumbers have been consumed before logging more. Nevertheless, if
+ * we did that, we would have to flush the logged WAL record immediately,
+ * because we must ensure that nextRelFileNumber is always larger than any
+ * relfilenumber already in use on disk. To maintain that invariant, the
+ * record we log has to reach disk before any new files are created within
+ * the newly logged range.
+ *
+ * So, to avoid flushing the WAL immediately, we always log before
+ * consuming all the relfilenumbers, and then we only have to flush the
+ * newly logged WAL record before consuming relfilenumbers from the new
+ * range. By the time we need to flush that record, hopefully it has
+ * already been flushed by some other XLogFlush operation.
+ */
+ if (loggedRelFileNumber - nextRelFileNumber <=
+ VAR_RELNUMBER_NEW_XLOG_THRESHOLD)
+ {
+ XLogRecPtr recptr;
+
+ loggedRelFileNumber = loggedRelFileNumber + VAR_RELNUMBER_PER_XLOG;
+ recptr = LogNextRelFileNumber(loggedRelFileNumber);
+ ShmemVariableCache->loggedRelFileNumber = loggedRelFileNumber;
+
+ /* remember for the future flush */
+ ShmemVariableCache->loggedRelFileNumberRecPtr = recptr;
+ }
+
+ /*
+ * If nextRelFileNumber has caught up with the flushed relfilenumber,
+ * flush the WAL for the previously logged relfilenumber range.
+ */
+ if (nextRelFileNumber >= flushedRelFileNumber)
+ {
+ XLogFlush(ShmemVariableCache->loggedRelFileNumberRecPtr);
+ ShmemVariableCache->flushedRelFileNumber = loggedRelFileNumber;
+ }
+
+ result = ShmemVariableCache->nextRelFileNumber;
+
+ /* we should never be using any relfilenumber outside the flushed range */
+ Assert(result <= ShmemVariableCache->flushedRelFileNumber);
+
+ (ShmemVariableCache->nextRelFileNumber)++;
+
+ LWLockRelease(RelFileNumberGenLock);
+
+ /*
+ * Because the RelFileNumber counter only ever increases and never wraps
+ * around, it should be impossible for the newly-allocated RelFileNumber to
+ * already be in use. But, if Asserts are enabled, double check that
+ * there's no main-fork relation file with the new RelFileNumber already on
+ * disk.
+ */
+#ifdef USE_ASSERT_CHECKING
+ {
+ RelFileLocatorBackend rlocator;
+ char *rpath;
+ BackendId backend;
+
+ switch (relpersistence)
+ {
+ case RELPERSISTENCE_TEMP:
+ backend = BackendIdForTempRelations();
+ break;
+ case RELPERSISTENCE_UNLOGGED:
+ case RELPERSISTENCE_PERMANENT:
+ backend = InvalidBackendId;
+ break;
+ default:
+ elog(ERROR, "invalid relpersistence: %c", relpersistence);
+ }
+
+ /* this logic should match RelationInitPhysicalAddr */
+ rlocator.locator.spcOid =
+ reltablespace ? reltablespace : MyDatabaseTableSpace;
+ rlocator.locator.dbOid = (reltablespace == GLOBALTABLESPACE_OID) ?
+ InvalidOid : MyDatabaseId;
+ rlocator.locator.relNumber = result;
+
+ /*
+ * The relpath will vary based on the backend ID, so we must
+ * initialize that properly here to make sure that any collisions
+ * based on filename are properly detected.
+ */
+ rlocator.backend = backend;
+
+ /* check for existing file of same name. */
+ rpath = relpath(rlocator, MAIN_FORKNUM);
+ Assert(access(rpath, F_OK) != 0);
+ }
+#endif
+
+ return result;
+}
+
+/*
+ * SetNextRelFileNumber
+ *
+ * This may only be called during pg_upgrade; it advances the RelFileNumber
+ * counter to the specified value if the current value is smaller than the
+ * input value.
+ */
+void
+SetNextRelFileNumber(RelFileNumber relnumber)
+{
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot set RelFileNumber during recovery");
+
+ if (!IsBinaryUpgrade)
+ elog(ERROR, "RelFileNumber can be set only during binary upgrade");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /*
+ * If the previously assigned value of nextRelFileNumber is already at
+ * least as high as the input value, there is nothing to do. This is
+ * possible because, during upgrade, objects are not created in
+ * relfilenumber order.
+ */
+ if (relnumber <= ShmemVariableCache->nextRelFileNumber)
+ {
+ LWLockRelease(RelFileNumberGenLock);
+ return;
+ }
+
+ /*
+ * If the new relfilenumber to be set is greater than or equal to the
+ * already flushed relfilenumber, log more and flush the XLOG immediately.
+ *
+ * XXX The new 'relnumber' can come from any range, so we cannot plan to
+ * piggyback on a later XLogFlush by logging in advance. That should not
+ * really matter, though, as this is only called during binary upgrade.
+ */
+ if (relnumber >= ShmemVariableCache->flushedRelFileNumber)
+ {
+ RelFileNumber newlogrelnum;
+
+ newlogrelnum = relnumber + VAR_RELNUMBER_PER_XLOG;
+ XLogFlush(LogNextRelFileNumber(newlogrelnum));
+
+ /* we have flushed whatever we have logged so no pending flush */
+ ShmemVariableCache->loggedRelFileNumber = newlogrelnum;
+ ShmemVariableCache->flushedRelFileNumber = newlogrelnum;
+ ShmemVariableCache->loggedRelFileNumberRecPtr = InvalidXLogRecPtr;
+ }
+
+ ShmemVariableCache->nextRelFileNumber = relnumber;
+
+ LWLockRelease(RelFileNumberGenLock);
+}
+
+/*
* StopGeneratingPinnedObjectIds
*
* This is called once during initdb to force the OID counter up to
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index f32b212..580ef7b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4712,6 +4712,7 @@ BootStrapXLOG(void)
checkPoint.nextXid =
FullTransactionIdFromEpochAndXid(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstGenbkiObjectId;
+ checkPoint.nextRelFileNumber = FirstNormalRelFileNumber;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
checkPoint.oldestXid = FirstNormalTransactionId;
@@ -4725,7 +4726,11 @@ BootStrapXLOG(void)
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->flushedRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumberRecPtr = InvalidXLogRecPtr;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -5191,7 +5196,10 @@ StartupXLOG(void)
/* initialize shared memory variables from the checkpoint record */
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->flushedRelFileNumber = checkPoint.nextRelFileNumber;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -6663,6 +6671,24 @@ CreateCheckPoint(int flags)
checkPoint.nextOid += ShmemVariableCache->oidCount;
LWLockRelease(OidGenLock);
+ /*
+ * If this is a shutdown checkpoint, we can safely start allocating
+ * relfilenumbers from the nextRelFileNumber value after the restart,
+ * because no one else can have used a relfilenumber beyond that value
+ * before the shutdown. OTOH, if it is a normal checkpoint and there is a
+ * crash after this point, we might otherwise end up reusing the same
+ * relfilenumbers after the restart, so we set nextRelFileNumber to the
+ * already-logged relfilenumber, as no one will use a number beyond that
+ * limit without logging again.
+ */
+ LWLockAcquire(RelFileNumberGenLock, LW_SHARED);
+ if (shutdown)
+ checkPoint.nextRelFileNumber = ShmemVariableCache->nextRelFileNumber;
+ else
+ checkPoint.nextRelFileNumber = ShmemVariableCache->loggedRelFileNumber;
+
+ LWLockRelease(RelFileNumberGenLock);
+
MultiXactGetCheckptMulti(shutdown,
&checkPoint.nextMulti,
&checkPoint.nextMultiOffset,
@@ -7541,6 +7567,24 @@ XLogPutNextOid(Oid nextOid)
}
/*
* Similar to XLogPutNextOid, but it writes a NEXT_RELFILENUMBER log record
* instead of a NEXTOID record. It also returns the XLogRecPtr of
+ * the currently logged relfilenumber record, so that the caller can flush it
+ * at the appropriate time.
+ */
+XLogRecPtr
+LogNextRelFileNumber(RelFileNumber nextrelnumber)
+{
+ XLogRecPtr recptr;
+
+ XLogBeginInsert();
+ XLogRegisterData((char *) (&nextrelnumber), sizeof(RelFileNumber));
+ recptr = XLogInsert(RM_XLOG_ID, XLOG_NEXT_RELFILENUMBER);
+
+ return recptr;
+}
+
+/*
* Write an XLOG SWITCH record.
*
* Here we just blindly issue an XLogInsert request for the record.
@@ -7755,6 +7799,17 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
}
+ else if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, XLogRecGetData(record), sizeof(RelFileNumber));
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumber = nextRelFileNumber;
+ ShmemVariableCache->flushedRelFileNumber = nextRelFileNumber;
+ LWLockRelease(RelFileNumberGenLock);
+ }
else if (info == XLOG_CHECKPOINT_SHUTDOWN)
{
CheckPoint checkPoint;
@@ -7769,6 +7824,11 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->flushedRelFileNumber = checkPoint.nextRelFileNumber;
+ LWLockRelease(RelFileNumberGenLock);
MultiXactSetNextMXact(checkPoint.nextMulti,
checkPoint.nextMultiOffset);
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index 3429602..15f6279 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -613,7 +613,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u until %X/%X is replayed, which creates the relation",
+ "suppressing prefetch in relation %u/%u/" UINT64_FORMAT " until %X/%X is replayed, which creates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -636,7 +636,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, which truncates the relation",
+ "suppressing prefetch in relation %u/%u/" UINT64_FORMAT " from block %u until %X/%X is replayed, which truncates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -735,7 +735,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing all prefetch in relation %u/%u/%u until %X/%X is replayed, because the relation does not exist on disk",
+ "suppressing all prefetch in relation %u/%u/" UINT64_FORMAT " until %X/%X is replayed, because the relation does not exist on disk",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -756,7 +756,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, because the relation is too small",
+ "suppressing prefetch in relation %u/%u/" UINT64_FORMAT " from block %u until %X/%X is replayed, because the relation is too small",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -795,7 +795,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* truncated beneath our feet?
*/
elog(ERROR,
- "could not prefetch relation %u/%u/%u block %u",
+ "could not prefetch relation %u/%u/" UINT64_FORMAT " block %u",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -934,7 +934,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
+ "prefetch of %u/%u/" UINT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed),
filter->filter_from_block);
@@ -950,7 +950,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
+ "prefetch of %u/%u/" UINT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed));
#endif
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index b41e682..1026ce5 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2228,14 +2228,14 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
continue;
if (forknum != MAIN_FORKNUM)
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, fork %u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" UINT64_FORMAT ", fork %u, blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
forknum,
blk);
else
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" UINT64_FORMAT ", blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
@@ -2433,7 +2433,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
if (memcmp(replay_image_masked, primary_image_masked, BLCKSZ) != 0)
{
elog(FATAL,
- "inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
+ "inconsistent page found, rel %u/%u/" UINT64_FORMAT ", forknum %u, blkno %u",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forknum, blkno);
}
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 563cba2..717cdde 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -619,17 +619,17 @@ CreateFakeRelcacheEntry(RelFileLocator rlocator)
rel->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
/* We don't know the name of the relation; use relfilenumber instead */
- sprintf(RelationGetRelationName(rel), "%u", rlocator.relNumber);
+ sprintf(RelationGetRelationName(rel), UINT64_FORMAT, rlocator.relNumber);
/*
* We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNumber may be
- * different from the relation's OID. It shouldn't really matter though.
- * In recovery, we are running by ourselves and can't have any lock
- * conflicts. While syncing, we already hold AccessExclusiveLock.
+ * relation. Note that we set relId to FirstNormalObjectId, which is
+ * completely bogus. It shouldn't really matter though. In recovery,
+ * we are running by ourselves and can't have any lock conflicts. While
+ * syncing, we already hold AccessExclusiveLock.
*/
rel->rd_lockInfo.lockRelId.dbId = rlocator.dbOid;
- rel->rd_lockInfo.lockRelId.relId = rlocator.relNumber;
+ rel->rd_lockInfo.lockRelId.relId = FirstNormalObjectId;
rel->rd_smgr = NULL;
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index dd103a8..41d72db 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1231,7 +1231,7 @@ sendDir(bbsink *sink, const char *path, int basepathlen, bool sizeonly,
if (relForkNum != INIT_FORKNUM)
{
char initForkFile[MAXPGPATH];
- char relNumber[OIDCHARS + 1];
+ char relNumber[RELNUMBERCHARS + 1];
/*
* If any other type of fork, check if there is an init fork
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 2abd6b0..a9bd8ae 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -483,101 +483,6 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
}
/*
- * GetNewRelFileNumber
- * Generate a new relfilenumber that is unique within the
- * database of the given tablespace.
- *
- * If the relfilenumber will also be used as the relation's OID, pass the
- * opened pg_class catalog, and this routine will guarantee that the result
- * is also an unused OID within pg_class. If the result is to be used only
- * as a relfilenumber for an existing relation, pass NULL for pg_class.
- *
- * As with GetNewOidWithIndex(), there is some theoretical risk of a race
- * condition, but it doesn't seem worth worrying about.
- *
- * Note: we don't support using this in bootstrap mode. All relations
- * created by bootstrap have preassigned OIDs, so there's no need.
- */
-RelFileNumber
-GetNewRelFileNumber(Oid reltablespace, Relation pg_class, char relpersistence)
-{
- RelFileLocatorBackend rlocator;
- char *rpath;
- bool collides;
- BackendId backend;
-
- /*
- * If we ever get here during pg_upgrade, there's something wrong; all
- * relfilenumber assignments during a binary-upgrade run should be
- * determined by commands in the dump script.
- */
- Assert(!IsBinaryUpgrade);
-
- switch (relpersistence)
- {
- case RELPERSISTENCE_TEMP:
- backend = BackendIdForTempRelations();
- break;
- case RELPERSISTENCE_UNLOGGED:
- case RELPERSISTENCE_PERMANENT:
- backend = InvalidBackendId;
- break;
- default:
- elog(ERROR, "invalid relpersistence: %c", relpersistence);
- return InvalidRelFileNumber; /* placate compiler */
- }
-
- /* This logic should match RelationInitPhysicalAddr */
- rlocator.locator.spcOid = reltablespace ? reltablespace : MyDatabaseTableSpace;
- rlocator.locator.dbOid =
- (rlocator.locator.spcOid == GLOBALTABLESPACE_OID) ?
- InvalidOid : MyDatabaseId;
-
- /*
- * The relpath will vary based on the backend ID, so we must initialize
- * that properly here to make sure that any collisions based on filename
- * are properly detected.
- */
- rlocator.backend = backend;
-
- do
- {
- CHECK_FOR_INTERRUPTS();
-
- /* Generate the OID */
- if (pg_class)
- rlocator.locator.relNumber = GetNewOidWithIndex(pg_class, ClassOidIndexId,
- Anum_pg_class_oid);
- else
- rlocator.locator.relNumber = GetNewObjectId();
-
- /* Check for existing file of same name */
- rpath = relpath(rlocator, MAIN_FORKNUM);
-
- if (access(rpath, F_OK) == 0)
- {
- /* definite collision */
- collides = true;
- }
- else
- {
- /*
- * Here we have a little bit of a dilemma: if errno is something
- * other than ENOENT, should we declare a collision and loop? In
- * practice it seems best to go ahead regardless of the errno. If
- * there is a colliding file we will get an smgr failure when we
- * attempt to create the new relation file.
- */
- collides = false;
- }
-
- pfree(rpath);
- } while (collides);
-
- return rlocator.locator.relNumber;
-}
-
-/*
* SQL callable interface for GetNewOidWithIndex(). Outside of initdb's
* direct insertions into catalog tables, and recovering from corruption, this
* should rarely be needed.
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 9b03579..c17f60f 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -342,10 +342,18 @@ heap_create(const char *relname,
{
/*
* If relfilenumber is unspecified by the caller then create storage
- * with oid same as relid.
+ * with a relfilenumber equal to relid if this is a system table; otherwise,
+ * allocate a new relfilenumber. For more details, see the comments atop
+ * the FirstNormalRelFileNumber declaration.
*/
if (!RelFileNumberIsValid(relfilenumber))
- relfilenumber = relid;
+ {
+ if (relid < FirstNormalObjectId)
+ relfilenumber = relid;
+ else
+ relfilenumber = GetNewRelFileNumber(reltablespace,
+ relpersistence);
+ }
}
/*
@@ -898,7 +906,7 @@ InsertPgClassTuple(Relation pg_class_desc,
values[Anum_pg_class_reloftype - 1] = ObjectIdGetDatum(rd_rel->reloftype);
values[Anum_pg_class_relowner - 1] = ObjectIdGetDatum(rd_rel->relowner);
values[Anum_pg_class_relam - 1] = ObjectIdGetDatum(rd_rel->relam);
- values[Anum_pg_class_relfilenode - 1] = ObjectIdGetDatum(rd_rel->relfilenode);
+ values[Anum_pg_class_relfilenode - 1] = Int64GetDatum(rd_rel->relfilenode);
values[Anum_pg_class_reltablespace - 1] = ObjectIdGetDatum(rd_rel->reltablespace);
values[Anum_pg_class_relpages - 1] = Int32GetDatum(rd_rel->relpages);
values[Anum_pg_class_reltuples - 1] = Float4GetDatum(rd_rel->reltuples);
@@ -1170,12 +1178,7 @@ heap_create_with_catalog(const char *relname,
if (shared_relation && reltablespace != GLOBALTABLESPACE_OID)
elog(ERROR, "shared relations must be placed in pg_global tablespace");
- /*
- * Allocate an OID for the relation, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the relation, unless we were told what to use. */
if (!OidIsValid(relid))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -1229,8 +1232,8 @@ heap_create_with_catalog(const char *relname,
}
if (!OidIsValid(relid))
- relid = GetNewRelFileNumber(reltablespace, pg_class_desc,
- relpersistence);
+ relid = GetNewOidWithIndex(pg_class_desc, ClassOidIndexId,
+ Anum_pg_class_oid);
}
/*
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index d7192f3..a509c3e 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -898,12 +898,7 @@ index_create(Relation heapRelation,
collationObjectId,
classObjectId);
- /*
- * Allocate an OID for the index, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the index, unless we were told what to use. */
if (!OidIsValid(indexRelationId))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -935,8 +930,8 @@ index_create(Relation heapRelation,
}
else
{
- indexRelationId =
- GetNewRelFileNumber(tableSpaceId, pg_class, relpersistence);
+ indexRelationId = GetNewOidWithIndex(pg_class, ClassOidIndexId,
+ Anum_pg_class_oid);
}
}
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index d708af1..021e085 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -968,6 +968,10 @@ smgr_redo(XLogReaderState *record)
xl_smgr_create *xlrec = (xl_smgr_create *) XLogRecGetData(record);
SMgrRelation reln;
+ if (xlrec->rlocator.relNumber > ShmemVariableCache->nextRelFileNumber)
+ elog(ERROR, "unexpected relnumber " UINT64_FORMAT " that is bigger than nextRelFileNumber " UINT64_FORMAT,
+ xlrec->rlocator.relNumber, ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
smgrcreate(reln, xlrec->forkNum, true);
}
@@ -981,6 +985,10 @@ smgr_redo(XLogReaderState *record)
int nforks = 0;
bool need_fsm_vacuum = false;
+ if (xlrec->rlocator.relNumber > ShmemVariableCache->nextRelFileNumber)
+ elog(ERROR, "unexpected relnumber " UINT64_FORMAT " that is bigger than nextRelFileNumber " UINT64_FORMAT,
+ xlrec->rlocator.relNumber, ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
/*
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 7d8a75d..1b8e6d5 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14375,10 +14375,14 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
}
/*
- * Relfilenumbers are not unique in databases across tablespaces, so we
- * need to allocate a new one in the new tablespace.
- */
- newrelfilenumber = GetNewRelFileNumber(newTableSpace, NULL,
+ * Generate a new relfilenumber. We cannot reuse the old relfilenumber
+ * because of the possibility that the relation will be moved back to the
+ * original tablespace before the next checkpoint. At that point, the
+ * first segment of the main fork won't have been unlinked yet, and an
+ * attempt to create new relation storage with that same relfilenumber
+ * will fail.
+ */
+ newrelfilenumber = GetNewRelFileNumber(newTableSpace,
rel->rd_rel->relpersistence);
/* Open old and new relation */
diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index b69ff37..cdd7986 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -267,7 +267,7 @@ CreateTableSpace(CreateTableSpaceStmt *stmt)
* parts.
*/
if (strlen(location) + 1 + strlen(TABLESPACE_VERSION_DIRECTORY) + 1 +
- OIDCHARS + 1 + OIDCHARS + 1 + FORKNAMECHARS + 1 + OIDCHARS > MAXPGPATH)
+ OIDCHARS + 1 + RELNUMBERCHARS + 1 + FORKNAMECHARS + 1 + OIDCHARS > MAXPGPATH)
ereport(ERROR,
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
errmsg("tablespace location \"%s\" is too long",
diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index b707a09..f809a02 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -961,12 +961,12 @@ _read${n}(void)
print $off "\tWRITE_UINT_FIELD($f);\n";
print $rff "\tREAD_UINT_FIELD($f);\n" unless $no_read;
}
- elsif ($t eq 'uint64')
+ elsif ($t eq 'uint64' || $t eq 'RelFileNumber')
{
print $off "\tWRITE_UINT64_FIELD($f);\n";
print $rff "\tREAD_UINT64_FIELD($f);\n" unless $no_read;
}
- elsif ($t eq 'Oid' || $t eq 'RelFileNumber')
+ elsif ($t eq 'Oid')
{
print $off "\tWRITE_OID_FIELD($f);\n";
print $rff "\tREAD_OID_FIELD($f);\n" unless $no_read;
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 4d0bf19..10116fd 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -154,6 +154,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
break;
case XLOG_NOOP:
case XLOG_NEXTOID:
+ case XLOG_NEXT_RELFILENUMBER:
case XLOG_SWITCH:
case XLOG_BACKUP_END:
case XLOG_PARAMETER_CHANGE:
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 03d9c9c..a0f398b 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -4932,7 +4932,7 @@ DisplayMapping(HTAB *tuplecid_data)
hash_seq_init(&hstat, tuplecid_data);
while ((ent = (ReorderBufferTupleCidEnt *) hash_seq_search(&hstat)) != NULL)
{
- elog(DEBUG3, "mapping: node: %u/%u/%u tid: %u/%u cmin: %u, cmax: %u",
+ elog(DEBUG3, "mapping: node: %u/%u/" UINT64_FORMAT " tid: %u/%u cmin: %u, cmax: %u",
ent->key.rlocator.dbOid,
ent->key.rlocator.spcOid,
ent->key.rlocator.relNumber,
diff --git a/src/backend/storage/file/reinit.c b/src/backend/storage/file/reinit.c
index 647c458..c3faa68 100644
--- a/src/backend/storage/file/reinit.c
+++ b/src/backend/storage/file/reinit.c
@@ -31,7 +31,7 @@ static void ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname,
typedef struct
{
- Oid reloid; /* hash key */
+ RelFileNumber relnumber; /* hash key */
} unlogged_relation_entry;
/*
@@ -184,10 +184,10 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
* need to be reset. Otherwise, this cleanup operation would be
* O(n^2).
*/
- ctl.keysize = sizeof(Oid);
+ ctl.keysize = sizeof(RelFileNumber);
ctl.entrysize = sizeof(unlogged_relation_entry);
ctl.hcxt = CurrentMemoryContext;
- hash = hash_create("unlogged relation OIDs", 32, &ctl,
+ hash = hash_create("unlogged relation RelFileNumbers", 32, &ctl,
HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
/* Scan the directory. */
@@ -208,10 +208,10 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
continue;
/*
- * Put the OID portion of the name into the hash table, if it
- * isn't already.
+ * Put the RELFILENUMBER portion of the name into the hash table,
+ * if it isn't already.
*/
- ent.reloid = atooid(de->d_name);
+ ent.relnumber = atorelnumber(de->d_name);
(void) hash_search(hash, &ent, HASH_ENTER, NULL);
}
@@ -248,10 +248,10 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
continue;
/*
- * See whether the OID portion of the name shows up in the hash
- * table. If so, nuke it!
+ * See whether the RELFILENUMBER portion of the name shows up in
+ * the hash table. If so, nuke it!
*/
- ent.reloid = atooid(de->d_name);
+ ent.relnumber = atorelnumber(de->d_name);
if (hash_search(hash, &ent, HASH_FIND, NULL))
{
snprintf(rm_path, sizeof(rm_path), "%s/%s",
@@ -286,7 +286,7 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
{
ForkNumber forkNum;
int relnumchars;
- char relnumbuf[OIDCHARS + 1];
+ char relnumbuf[RELNUMBERCHARS + 1];
char srcpath[MAXPGPATH * 2];
char dstpath[MAXPGPATH];
@@ -329,7 +329,7 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
{
ForkNumber forkNum;
int relnumchars;
- char relnumbuf[OIDCHARS + 1];
+ char relnumbuf[RELNUMBERCHARS + 1];
char mainpath[MAXPGPATH];
/* Skip anything that doesn't look like a relation data file. */
@@ -372,8 +372,8 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
* for a non-temporary relation and false otherwise.
*
* NB: If this function returns true, the caller is entitled to assume that
- * *relnumchars has been set to a value no more than OIDCHARS, and thus
- * that a buffer of OIDCHARS+1 characters is sufficient to hold the
+ * *relnumchars has been set to a value no more than RELNUMBERCHARS, and thus
+ * that a buffer of RELNUMBERCHARS+1 characters is sufficient to hold the
* RelFileNumber portion of the filename. This is critical to protect against
* a possible buffer overrun.
*/
@@ -386,7 +386,7 @@ parse_filename_for_nontemp_relation(const char *name, int *relnumchars,
/* Look for a non-empty string of digits (that isn't too long). */
for (pos = 0; isdigit((unsigned char) name[pos]); ++pos)
;
- if (pos == 0 || pos > OIDCHARS)
+ if (pos == 0 || pos > RELNUMBERCHARS)
return false;
*relnumchars = pos;
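The reinit.c hunks above bound the digit scan by RELNUMBERCHARS instead of OIDCHARS. As a minimal illustrative sketch (not the patch's code), the scan can be reproduced standalone; `RELNUMBERCHARS` is assumed here to be 20, the maximum number of decimal digits in a uint64:

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Assumed value: max decimal digits in a uint64 relfilenumber. */
#define RELNUMBERCHARS 20

/*
 * Hypothetical re-implementation of the digit scan in
 * parse_filename_for_nontemp_relation(): accept a non-empty, not-too-long
 * run of leading digits and report its length, else fail.
 */
static bool
scan_relnumber_prefix(const char *name, int *relnumchars)
{
    int pos;

    /* Look for a non-empty string of digits (that isn't too long). */
    for (pos = 0; isdigit((unsigned char) name[pos]); ++pos)
        ;
    if (pos == 0 || pos > RELNUMBERCHARS)
        return false;
    *relnumchars = pos;
    return true;
}
```

This mirrors why a caller's buffer of RELNUMBERCHARS+1 bytes is always sufficient: the scan refuses any digit run longer than RELNUMBERCHARS.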
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index af4dab7..1210be7 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -273,7 +273,7 @@ restart:
BlockNumber blknum;
BufferGetTag(buf, &rlocator, &forknum, &blknum);
- elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
+ elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/" UINT64_FORMAT,
blknum, rlocator.spcOid, rlocator.dbOid, rlocator.relNumber);
/* make sure we hold an exclusive lock */
diff --git a/src/backend/storage/lmgr/lwlocknames.txt b/src/backend/storage/lmgr/lwlocknames.txt
index 6c7cf6c..3c5d041 100644
--- a/src/backend/storage/lmgr/lwlocknames.txt
+++ b/src/backend/storage/lmgr/lwlocknames.txt
@@ -53,3 +53,4 @@ XactTruncationLock 44
# 45 was XactTruncationLock until removal of BackendRandomLock
WrapLimitsVacuumLock 46
NotifyQueueTailLock 47
+RelFileNumberGenLock 48
\ No newline at end of file
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index a515bb3..bed47f0 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -257,6 +257,13 @@ mdcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo)
* next checkpoint, we prevent reassignment of the relfilenumber until it's
* safe, because relfilenumber assignment skips over any existing file.
*
+ * XXX. Although all of this was true when relfilenumbers were 32 bits wide,
+ * they are now 56 bits wide and do not wrap around, so in the future we can
+ * change the code to immediately unlink the first segment of the relation
+ * along with all the others. We still do reuse relfilenumbers when createdb()
+ * is performed using the file-copy method or during movedb(), but the scenario
+ * described above can only happen when creating a new relation.
+ *
* We do not need to go through this dance for temp relations, though, because
* we never make WAL entries for temp rels, and so a temp rel poses no threat
* to the health of a regular rel that has taken over its relfilenumber.
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index c1a5feb..ed46ac3 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -154,7 +154,7 @@ smgropen(RelFileLocator rlocator, BackendId backend)
/* First time through: initialize the hash table */
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileLocatorBackend);
+ ctl.keysize = SizeOfRelFileLocatorBackend;
ctl.entrysize = sizeof(SMgrRelationData);
SMgrRelationHash = hash_create("smgr relation table", 400,
&ctl, HASH_ELEM | HASH_BLOBS);
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 34efa12..9f70f35 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -878,7 +878,7 @@ pg_relation_filenode(PG_FUNCTION_ARGS)
if (!RelFileNumberIsValid(result))
PG_RETURN_NULL();
- PG_RETURN_OID(result);
+ PG_RETURN_INT64(result);
}
/*
@@ -898,9 +898,12 @@ Datum
pg_filenode_relation(PG_FUNCTION_ARGS)
{
Oid reltablespace = PG_GETARG_OID(0);
- RelFileNumber relfilenumber = PG_GETARG_OID(1);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(1);
Oid heaprel;
+ /* check whether the relfilenumber is within a valid range */
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
+
/* test needed so RelidByRelfilenumber doesn't misbehave */
if (!RelFileNumberIsValid(relfilenumber))
PG_RETURN_NULL();
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 797f5f5..fc2faed 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -17,6 +17,7 @@
#include "catalog/pg_type.h"
#include "commands/extension.h"
#include "miscadmin.h"
+#include "storage/relfilelocator.h"
#include "utils/array.h"
#include "utils/builtins.h"
@@ -98,10 +99,12 @@ binary_upgrade_set_next_heap_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_heap_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_heap_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -120,10 +123,12 @@ binary_upgrade_set_next_index_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_index_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_index_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -142,10 +147,12 @@ binary_upgrade_set_next_toast_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_toast_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_toast_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 00dc0f2..6f4e96d 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -3712,7 +3712,7 @@ RelationSetNewRelfilenumber(Relation relation, char persistence)
{
/* Allocate a new relfilenumber */
newrelfilenumber = GetNewRelFileNumber(relation->rd_rel->reltablespace,
- NULL, persistence);
+ persistence);
}
else if (relation->rd_rel->relkind == RELKIND_INDEX)
{
diff --git a/src/backend/utils/cache/relfilenumbermap.c b/src/backend/utils/cache/relfilenumbermap.c
index c4245d5..2e0acf9 100644
--- a/src/backend/utils/cache/relfilenumbermap.c
+++ b/src/backend/utils/cache/relfilenumbermap.c
@@ -196,7 +196,7 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
/* set scan arguments */
skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
- skey[1].sk_argument = ObjectIdGetDatum(relfilenumber);
+ skey[1].sk_argument = Int64GetDatum((int64) relfilenumber);
scandesc = systable_beginscan(relation,
ClassTblspcRelfilenodeIndexId,
@@ -213,7 +213,7 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
if (found)
elog(ERROR,
- "unexpected duplicate for tablespace %u, relfilenumber %u",
+ "unexpected duplicate for tablespace %u, relfilenumber " UINT64_FORMAT,
reltablespace, relfilenumber);
found = true;
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index 781f8b8..d441cd9 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -79,8 +79,8 @@ pg_control_system(PG_FUNCTION_ARGS)
Datum
pg_control_checkpoint(PG_FUNCTION_ARGS)
{
- Datum values[18];
- bool nulls[18];
+ Datum values[19];
+ bool nulls[19];
TupleDesc tupdesc;
HeapTuple htup;
ControlFileData *ControlFile;
@@ -129,6 +129,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
XIDOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 18, "checkpoint_time",
TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 19, "next_relfilenumber",
+ INT8OID, -1, 0);
tupdesc = BlessTupleDesc(tupdesc);
/* Read the control file. */
@@ -202,6 +204,9 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
values[17] = TimestampTzGetDatum(time_t_to_timestamptz(ControlFile->checkPointCopy.time));
nulls[17] = false;
+ values[18] = Int64GetDatum((int64) ControlFile->checkPointCopy.nextRelFileNumber);
+ nulls[18] = false;
+
htup = heap_form_tuple(tupdesc, values, nulls);
PG_RETURN_DATUM(HeapTupleGetDatum(htup));
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 324ccf7..ddb5ec1 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -485,9 +485,7 @@ main(int argc, char *argv[])
mode = PG_MODE_ENABLE;
break;
case 'f':
- if (!option_parse_int(optarg, "-f/--filenode", 0,
- INT_MAX,
- NULL))
+ if (!option_parse_relfilenumber(optarg, "-f/--filenode"))
exit(1);
only_filenode = pstrdup(optarg);
break;
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index c390ec5..2f0e91f 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -250,6 +250,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
EpochFromFullTransactionId(ControlFile->checkPointCopy.nextXid),
XidFromFullTransactionId(ControlFile->checkPointCopy.nextXid));
+ printf(_("Latest checkpoint's NextRelFileNumber: %llu\n"),

+ (unsigned long long) ControlFile->checkPointCopy.nextRelFileNumber);
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index f8c4cb8..ddea1ab 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3183,15 +3183,15 @@ dumpDatabase(Archive *fout)
atooid(PQgetvalue(lo_res, i, ii_oid)));
oid = atooid(PQgetvalue(lo_res, i, ii_oid));
- relfilenumber = atooid(PQgetvalue(lo_res, i, ii_relfilenode));
+ relfilenumber = atorelnumber(PQgetvalue(lo_res, i, ii_relfilenode));
if (oid == LargeObjectRelationId)
appendPQExpBuffer(loOutQry,
- "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('" UINT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
else if (oid == LargeObjectLOidPNIndexId)
appendPQExpBuffer(loOutQry,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" UINT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
}
@@ -4876,16 +4876,16 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
relkind = *PQgetvalue(upgrade_res, 0, PQfnumber(upgrade_res, "relkind"));
- relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "relfilenode")));
+ relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "relfilenode")));
toast_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "reltoastrelid")));
- toast_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_relfilenode")));
+ toast_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_relfilenode")));
toast_index_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "indexrelid")));
- toast_index_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_index_relfilenode")));
+ toast_index_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_index_relfilenode")));
appendPQExpBufferStr(upgrade_buffer,
"\n-- For binary upgrade, must preserve pg_class oids and relfilenodes\n");
@@ -4903,7 +4903,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
*/
if (RelFileNumberIsValid(relfilenumber) && relkind != RELKIND_PARTITIONED_TABLE)
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('" UINT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
/*
@@ -4917,7 +4917,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_toast_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('" UINT64_FORMAT "'::pg_catalog.int8);\n",
toast_relfilenumber);
/* every toast table has an index */
@@ -4925,7 +4925,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_index_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" UINT64_FORMAT "'::pg_catalog.int8);\n",
toast_index_relfilenumber);
}
@@ -4938,7 +4938,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
pg_class_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" UINT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
}
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 269ed64..197ec0e 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -538,7 +538,7 @@ isRelDataFile(const char *path)
segNo = 0;
matched = false;
- nmatch = sscanf(path, "global/%u.%u", &rlocator.relNumber, &segNo);
+ nmatch = sscanf(path, "global/" UINT64_FORMAT ".%u", &rlocator.relNumber, &segNo);
if (nmatch == 1 || nmatch == 2)
{
rlocator.spcOid = GLOBALTABLESPACE_OID;
@@ -547,7 +547,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "base/%u/%u.%u",
+ nmatch = sscanf(path, "base/%u/" UINT64_FORMAT ".%u",
&rlocator.dbOid, &rlocator.relNumber, &segNo);
if (nmatch == 2 || nmatch == 3)
{
@@ -556,7 +556,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/%u.%u",
+ nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/" UINT64_FORMAT ".%u",
&rlocator.spcOid, &rlocator.dbOid, &rlocator.relNumber,
&segNo);
if (nmatch == 3 || nmatch == 4)
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index f18cf97..0c712a6 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -527,7 +527,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
relname = PQgetvalue(res, relnum, i_relname);
curr->relname = pg_strdup(relname);
- curr->relfilenumber = atooid(PQgetvalue(res, relnum, i_relfilenumber));
+ curr->relfilenumber =
+ atorelnumber(PQgetvalue(res, relnum, i_relfilenumber));
curr->tblsp_alloc = false;
/* Is the tablespace oid non-default? */
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index 115faa2..7ab1bcc 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -15,10 +15,8 @@
* oids are the same between old and new clusters. This is important
* because toast oids are stored as toast pointers in user tables.
*
- * While pg_class.oid and pg_class.relfilenode are initially the same in a
- * cluster, they can diverge due to CLUSTER, REINDEX, or VACUUM FULL. We
- * control assignments of pg_class.relfilenode because we want the filenames
- * to match between the old and new cluster.
+ * We control assignments of pg_class.relfilenode because we want the
+ * filenames to match between the old and new cluster.
*
* We control assignment of pg_tablespace.oid because we want the oid to match
* between the old and new cluster.
diff --git a/src/bin/pg_upgrade/relfilenumber.c b/src/bin/pg_upgrade/relfilenumber.c
index c3f3d6b..529267d 100644
--- a/src/bin/pg_upgrade/relfilenumber.c
+++ b/src/bin/pg_upgrade/relfilenumber.c
@@ -190,14 +190,14 @@ transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_fro
else
snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
- snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
+ snprintf(old_file, sizeof(old_file), "%s%s/%u/" UINT64_FORMAT "%s%s",
map->old_tablespace,
map->old_tablespace_suffix,
map->db_oid,
map->relfilenumber,
type_suffix,
extent_suffix);
- snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
+ snprintf(new_file, sizeof(new_file), "%s%s/%u/" UINT64_FORMAT "%s%s",
map->new_tablespace,
map->new_tablespace_suffix,
map->db_oid,
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 9993378..6fdc7dc 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -884,7 +884,7 @@ main(int argc, char **argv)
}
break;
case 'R':
- if (sscanf(optarg, "%u/%u/%u",
+ if (sscanf(optarg, "%u/%u/" UINT64_FORMAT,
&config.filter_by_relation.spcOid,
&config.filter_by_relation.dbOid,
&config.filter_by_relation.relNumber) != 3 ||
diff --git a/src/bin/scripts/t/090_reindexdb.pl b/src/bin/scripts/t/090_reindexdb.pl
index e706d68..de5cee6 100644
--- a/src/bin/scripts/t/090_reindexdb.pl
+++ b/src/bin/scripts/t/090_reindexdb.pl
@@ -40,7 +40,7 @@ my $toast_index = $node->safe_psql('postgres',
# REINDEX operations. A set of relfilenodes is saved from the catalogs
# and then compared with pg_class.
$node->safe_psql('postgres',
- 'CREATE TABLE index_relfilenodes (parent regclass, indname text, indoid oid, relfilenode oid);'
+ 'CREATE TABLE index_relfilenodes (parent regclass, indname text, indoid oid, relfilenode int8);'
);
# Save the relfilenode of a set of toast indexes, one from the catalog
# pg_constraint and one from the test table.
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 1b6b620..d0d83e5 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -149,10 +149,10 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
Assert(dbOid == 0);
Assert(backendId == InvalidBackendId);
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("global/%u_%s",
+ path = psprintf("global/" UINT64_FORMAT "_%s",
relNumber, forkNames[forkNumber]);
else
- path = psprintf("global/%u", relNumber);
+ path = psprintf("global/" UINT64_FORMAT, relNumber);
}
else if (spcOid == DEFAULTTABLESPACE_OID)
{
@@ -160,21 +160,21 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/%u_%s",
+ path = psprintf("base/%u/" UINT64_FORMAT "_%s",
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/%u",
+ path = psprintf("base/%u/" UINT64_FORMAT,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/t%d_%u_%s",
+ path = psprintf("base/%u/t%d_" UINT64_FORMAT "_%s",
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/t%d_%u",
+ path = psprintf("base/%u/t%d_" UINT64_FORMAT,
dbOid, backendId, relNumber);
}
}
@@ -184,24 +184,24 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/" UINT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/" UINT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" UINT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" UINT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber);
}
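The relpath.c hunks above switch every relNumber conversion from %u to UINT64_FORMAT. A minimal sketch of the resulting path shape for the common "base/" case, with `PRIu64` standing in for PostgreSQL's UINT64_FORMAT and a hypothetical helper name:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical sketch of the "base/<dbOid>/<relNumber>" branch of
 * GetRelationPath() after relNumber is widened to 64 bits; PRIu64 plays
 * the role of UINT64_FORMAT.
 */
static void
base_relation_path(char *buf, size_t len, uint32_t dbOid, uint64_t relNumber)
{
    snprintf(buf, len, "base/%u/%" PRIu64, dbOid, relNumber);
}
```

With 56-bit relfilenumbers the number component can now be up to 17 digits, which is why the tablespace-path length check in tablespace.c switches from OIDCHARS to RELNUMBERCHARS.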
diff --git a/src/fe_utils/option_utils.c b/src/fe_utils/option_utils.c
index abea881..06c501f 100644
--- a/src/fe_utils/option_utils.c
+++ b/src/fe_utils/option_utils.c
@@ -82,3 +82,42 @@ option_parse_int(const char *optarg, const char *optname,
*result = val;
return true;
}
+
+/*
+ * option_parse_relfilenumber
+ *
+ * Parse relfilenumber value for an option. If the parsing is successful,
+ * returns true; if parsing fails, returns false.
+ */
+bool
+option_parse_relfilenumber(const char *optarg, const char *optname)
+{
+ char *endptr;
+ uint64 val;
+
+ errno = 0;
+ val = strtou64(optarg, &endptr, 10);
+
+ /*
+ * Skip any trailing whitespace; if anything but whitespace remains before
+ * the terminating character, fail.
+ */
+ while (*endptr != '\0' && isspace((unsigned char) *endptr))
+ endptr++;
+
+ if (*endptr != '\0')
+ {
+ pg_log_error("invalid value \"%s\" for option %s",
+ optarg, optname);
+ return false;
+ }
+
+ if (val > MAX_RELFILENUMBER)
+ {
+ pg_log_error("%s must be in range " UINT64_FORMAT ".." UINT64_FORMAT,
+ optname, (RelFileNumber) 0, MAX_RELFILENUMBER);
+ return false;
+ }
+
+ return true;
+}
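The option_utils.c addition above is the parse-then-range-check pattern for a 64-bit relfilenumber. A self-contained sketch of the same shape (helper name and the 2^56-1 maximum are assumptions based on the 56-bit width mentioned in the md.c comment, not the patch's exact code):

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed maximum for a 56-bit relfilenumber: 2^56 - 1. */
#define MAX_RELFILENUMBER ((uint64_t) ((UINT64_C(1) << 56) - 1))

/*
 * Hypothetical stand-in for option_parse_relfilenumber(): parse decimal
 * digits, allow trailing whitespace, reject anything else or any value
 * above the maximum.
 */
static bool
parse_relfilenumber(const char *arg, uint64_t *result)
{
    char   *endptr;
    uint64_t val;

    errno = 0;
    val = strtoull(arg, &endptr, 10);
    if (errno != 0 || endptr == arg)
        return false;           /* overflow, or no digits at all */
    while (*endptr != '\0' && isspace((unsigned char) *endptr))
        endptr++;               /* skip trailing whitespace */
    if (*endptr != '\0')
        return false;           /* junk after the number */
    if (val > MAX_RELFILENUMBER)
        return false;           /* out of the 56-bit range */
    *result = val;
    return true;
}
```

Unlike option_parse_int(), the patch's function keeps the string form (pg_checksums pstrdup()s optarg afterwards), so only validation matters here.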
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 775471d..bcd7e62 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -196,6 +196,28 @@ FullTransactionIdAdvance(FullTransactionId *dest)
#define FirstUnpinnedObjectId 12000
#define FirstNormalObjectId 16384
+/* ----------
+ * For the system tables (OID < FirstNormalObjectId) the initial storage will
+ * be created with a relfilenumber equal to the table's Oid.  Later, any new
+ * storage gets its relfilenumber from GetNewRelFileNumber(), which starts
+ * at 100000.  Thus, when upgrading from an older cluster, the relation
+ * storage paths for user tables from the old cluster cannot conflict with
+ * the relation storage paths for system tables in the new cluster.  In any
+ * case, the new cluster must not have any user tables while upgrading, so
+ * we needn't worry about them.
+ * ----------
+ */
+#define FirstNormalRelFileNumber ((RelFileNumber) 100000)
+
+#define CHECK_RELFILENUMBER_RANGE(relfilenumber) \
+do { \
+ if ((relfilenumber) < 0 || (relfilenumber) > MAX_RELFILENUMBER) \
+ ereport(ERROR, \
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE), \
+ errmsg("relfilenumber %llu is out of range", \
+ (unsigned long long) (relfilenumber))); \
+} while (0)
+
/*
* VariableCache is a data structure in shared memory that is used to track
* OID and XID assignment state. For largely historical reasons, there is
@@ -215,6 +237,15 @@ typedef struct VariableCacheData
uint32 oidCount; /* OIDs available before must do XLOG work */
/*
+ * These fields are protected by RelFileNumberGenLock.
+ */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber to assign */
+ RelFileNumber loggedRelFileNumber; /* last logged relfilenumber */
+ RelFileNumber flushedRelFileNumber; /* last flushed relfilenumber */
+ XLogRecPtr loggedRelFileNumberRecPtr; /* xlog record pointer w.r.t.
+ * loggedRelFileNumber */
+
+ /*
* These fields are protected by XidGenLock.
*/
FullTransactionId nextXid; /* next XID to assign */
@@ -293,6 +324,9 @@ extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
extern bool ForceTransactionIdLimitUpdate(void);
extern Oid GetNewObjectId(void);
+extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
+ char relpersistence);
+extern void SetNextRelFileNumber(RelFileNumber relnumber);
extern void StopGeneratingPinnedObjectIds(void);
#ifdef USE_ASSERT_CHECKING
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 3dbfa6b..929a977e 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -235,6 +235,7 @@ extern void CreateCheckPoint(int flags);
extern bool CreateRestartPoint(int flags);
extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN);
extern void XLogPutNextOid(Oid nextOid);
+extern XLogRecPtr LogNextRelFileNumber(RelFileNumber nextrelnumber);
extern XLogRecPtr XLogRestorePoint(const char *rpName);
extern void UpdateFullPageWrites(void);
extern void GetFullPageWriteInfo(XLogRecPtr *RedoRecPtr_p, bool *doPageWrites_p);
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index e1c85f9..b452530 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -38,8 +38,5 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
-extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
- Relation pg_class,
- char relpersistence);
#endif /* CATALOG_H */
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index e1f4eef..4768e5e 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -34,6 +34,13 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* oid */
Oid oid;
+ /* access method; 0 if not a table / index */
+ Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
+
+ /* identifier of physical storage file */
+ /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
+ int64 relfilenode BKI_DEFAULT(0);
+
/* class name */
NameData relname;
@@ -49,13 +56,6 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* class owner */
Oid relowner BKI_DEFAULT(POSTGRES) BKI_LOOKUP(pg_authid);
- /* access method; 0 if not a table / index */
- Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
-
- /* identifier of physical storage file */
- /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
- Oid relfilenode BKI_DEFAULT(0);
-
/* identifier of table space for relation (0 means default for database) */
Oid reltablespace BKI_DEFAULT(0) BKI_LOOKUP_OPT(pg_tablespace);
@@ -154,7 +154,7 @@ typedef FormData_pg_class *Form_pg_class;
DECLARE_UNIQUE_INDEX_PKEY(pg_class_oid_index, 2662, ClassOidIndexId, on pg_class using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_class_relname_nsp_index, 2663, ClassNameNspIndexId, on pg_class using btree(relname name_ops, relnamespace oid_ops));
-DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode oid_ops));
+DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode int8_ops));
#ifdef EXPOSE_TO_CLIENT_CODE
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 06368e2..096222f 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -41,6 +41,7 @@ typedef struct CheckPoint
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
FullTransactionId nextXid; /* next free transaction ID */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
@@ -78,6 +79,7 @@ typedef struct CheckPoint
#define XLOG_FPI 0xB0
/* 0xC0 is used in Postgres 9.5-11 */
#define XLOG_OVERWRITE_CONTRECORD 0xD0
+#define XLOG_NEXT_RELFILENUMBER 0xE0
/*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index a07e737..8b72f8a 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7329,11 +7329,11 @@
proname => 'pg_indexes_size', provolatile => 'v', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_indexes_size' },
{ oid => '2999', descr => 'filenode identifier of relation',
- proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'oid',
+ proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_relation_filenode' },
{ oid => '3454', descr => 'relation OID for filenode and tablespace',
proname => 'pg_filenode_relation', provolatile => 's',
- prorettype => 'regclass', proargtypes => 'oid oid',
+ prorettype => 'regclass', proargtypes => 'oid int8',
prosrc => 'pg_filenode_relation' },
{ oid => '3034', descr => 'file path of relation',
proname => 'pg_relation_filepath', provolatile => 's', prorettype => 'text',
@@ -11125,15 +11125,15 @@
prosrc => 'binary_upgrade_set_missing_value' },
{ oid => '4545', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_heap_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_heap_relfilenode' },
{ oid => '4546', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_index_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_index_relfilenode' },
{ oid => '4547', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_toast_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_toast_relfilenode' },
{ oid => '4548', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_pg_tablespace_oid', provolatile => 'v',
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index 3ab7132..60e9475 100644
--- a/src/include/common/relpath.h
+++ b/src/include/common/relpath.h
@@ -28,6 +28,7 @@
/* Characters to allow for an OID in a relation path */
#define OIDCHARS 10 /* max chars printed by %u */
+#define RELNUMBERCHARS 20 /* max chars printed by UINT64_FORMAT */
/*
* Stuff for fork names.
diff --git a/src/include/fe_utils/option_utils.h b/src/include/fe_utils/option_utils.h
index 03c09fd..2508a61 100644
--- a/src/include/fe_utils/option_utils.h
+++ b/src/include/fe_utils/option_utils.h
@@ -22,5 +22,7 @@ extern void handle_help_version_opts(int argc, char *argv[],
extern bool option_parse_int(const char *optarg, const char *optname,
int min_range, int max_range,
int *result);
+extern bool option_parse_relfilenumber(const char *optarg,
+ const char *optname);
#endif /* OPTION_UTILS_H */
diff --git a/src/include/postgres_ext.h b/src/include/postgres_ext.h
index c9774fa..f73e287 100644
--- a/src/include/postgres_ext.h
+++ b/src/include/postgres_ext.h
@@ -39,21 +39,28 @@ typedef unsigned int Oid;
#define OID_MAX UINT_MAX
/* you will need to include <limits.h> to use the above #define */
-#define atooid(x) ((Oid) strtoul((x), NULL, 10))
-/* the above needs <stdlib.h> */
+/* RelFileNumber data type identifies the specific relation file name */
+typedef unsigned long int RelFileNumber;
+#ifdef __cplusplus
+#define InvalidRelFileNumber (RelFileNumber(0))
+#else
+#define InvalidRelFileNumber ((RelFileNumber) 0)
+#endif
-/* Define a signed 64-bit integer type for use in client API declarations. */
-typedef PG_INT64_TYPE pg_int64;
+/* Max value of the relfilenumber; the relfilenumber is 56 bits wide. */
+#define MAX_RELFILENUMBER UINT64CONST(0x00FFFFFFFFFFFFFF)
-/*
- * RelFileNumber data type identifies the specific relation file name.
- */
-typedef Oid RelFileNumber;
-#define InvalidRelFileNumber ((RelFileNumber) InvalidOid)
#define RelFileNumberIsValid(relnumber) \
((bool) ((relnumber) != InvalidRelFileNumber))
+#define atooid(x) ((Oid) strtoul((x), NULL, 10))
+#define atorelnumber(x) ((RelFileNumber) strtoul((x), NULL, 10))
+/* the above needs <stdlib.h> */
+
+/* Define a signed 64-bit integer type for use in client API declarations. */
+typedef PG_INT64_TYPE pg_int64;
+
/*
* Identifiers of error message fields. Kept here to keep common
* between frontend and backend, and also to export them to libpq
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 406db6b..d48aa6d 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -92,29 +92,69 @@ typedef struct buftag
{
Oid spcOid; /* tablespace oid */
Oid dbOid; /* database oid */
- RelFileNumber relNumber; /* relation file number */
- ForkNumber forkNum; /* fork number */
+
+ /*
+ * Represents the relfilenumber and the fork number.  The 8 high bits of
+ * the first 32-bit integer hold the fork number; the remaining 24 bits of
+ * the first integer and the 32 bits of the second integer hold the
+ * relfilenumber, making it 56 bits wide.  We use 56 bits rather than a
+ * full 64-bit field because that would enlarge BufferTag, and we use two
+ * 32-bit integers rather than a single 64-bit integer to avoid 8-byte
+ * alignment padding in the BufferTag structure.
+ */
+ uint32 relForkDetails[2];
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
+/* High relNumber bits in relForkDetails[0] */
+#define BUFTAG_RELNUM_HIGH_BITS 24
+
+/* Low relNumber bits in relForkDetails[1] */
+#define BUFTAG_RELNUM_LOW_BITS 32
+
+/* Mask to fetch high bits of relNumber from relForkDetails[0] */
+#define BUFTAG_RELNUM_HIGH_MASK ((1U << BUFTAG_RELNUM_HIGH_BITS) - 1)
+
+/* Mask to fetch low bits of relNumber from relForkDetails[1] */
+#define BUFTAG_RELNUM_LOW_MASK 0XFFFFFFFF
+
static inline RelFileNumber
BufTagGetRelNumber(const BufferTag *tag)
{
- return tag->relNumber;
+ uint64 relnum;
+
+ relnum = ((uint64) tag->relForkDetails[0]) & BUFTAG_RELNUM_HIGH_MASK;
+ relnum = (relnum << BUFTAG_RELNUM_LOW_BITS) | tag->relForkDetails[1];
+
+ Assert(relnum <= MAX_RELFILENUMBER);
+ return (RelFileNumber) relnum;
}
static inline ForkNumber
BufTagGetForkNum(const BufferTag *tag)
{
- return tag->forkNum;
+ ForkNumber ret;
+
+ StaticAssertStmt(MAX_FORKNUM <= INT8_MAX,
+ "MAX_FORKNUM can't be greater than INT8_MAX");
+
+ ret = (int8) (tag->relForkDetails[0] >> BUFTAG_RELNUM_HIGH_BITS);
+ return ret;
}
static inline void
BufTagSetRelForkDetails(BufferTag *tag, RelFileNumber relnumber,
ForkNumber forknum)
{
- tag->relNumber = relnumber;
- tag->forkNum = forknum;
+ Assert(relnumber <= MAX_RELFILENUMBER);
+ Assert(forknum <= MAX_FORKNUM);
+
+ tag->relForkDetails[0] = (relnumber >> BUFTAG_RELNUM_LOW_BITS) &
+ BUFTAG_RELNUM_HIGH_MASK;
+ tag->relForkDetails[0] |= (forknum << BUFTAG_RELNUM_HIGH_BITS);
+ tag->relForkDetails[1] = relnumber & BUFTAG_RELNUM_LOW_MASK;
}
static inline RelFileLocator
@@ -153,9 +193,9 @@ BufferTagsEqual(const BufferTag *tag1, const BufferTag *tag2)
{
return (tag1->spcOid == tag2->spcOid) &&
(tag1->dbOid == tag2->dbOid) &&
- (tag1->relNumber == tag2->relNumber) &&
- (tag1->blockNum == tag2->blockNum) &&
- (tag1->forkNum == tag2->forkNum);
+ (tag1->relForkDetails[0] == tag2->relForkDetails[0]) &&
+ (tag1->relForkDetails[1] == tag2->relForkDetails[1]) &&
+ (tag1->blockNum == tag2->blockNum);
}
static inline bool
diff --git a/src/include/storage/relfilelocator.h b/src/include/storage/relfilelocator.h
index 10f41f3..ef90464 100644
--- a/src/include/storage/relfilelocator.h
+++ b/src/include/storage/relfilelocator.h
@@ -32,10 +32,11 @@
* Nonzero dbOid values correspond to pg_database.oid.
*
* relNumber identifies the specific relation. relNumber corresponds to
- * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
- * to assign new physical files to relations in some situations).
- * Notice that relNumber is only unique within a database in a particular
- * tablespace.
+ * pg_class.relfilenode. Notice that relNumber values are assigned by
+ * GetNewRelFileNumber(), which will only ever assign the same value once
+ * during the lifetime of a cluster. However, since CREATE DATABASE duplicates
+ * the relfilenumbers of the template database, the values are in practice only
+ * unique within a database, not globally.
*
* Note: spcOid must be GLOBALTABLESPACE_OID if and only if dbOid is
* zero. We support shared relations only in the "global" tablespace.
@@ -75,6 +76,9 @@ typedef struct RelFileLocatorBackend
BackendId backend;
} RelFileLocatorBackend;
+#define SizeOfRelFileLocatorBackend \
+ (offsetof(RelFileLocatorBackend, backend) + sizeof(BackendId))
+
#define RelFileLocatorBackendIsTemp(rlocator) \
((rlocator).backend != InvalidBackendId)
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index 346f594..86666b8 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -2164,9 +2164,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2175,10 +2174,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+---------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | t | own | child 0 index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | t | own | child 1 index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | t | orig | child 0 index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | t | orig | child 1 index
at_partitioned_id_name_key | t | none | parent index
(6 rows)
@@ -2198,9 +2197,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2209,10 +2207,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+--------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | f | own | parent index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | f | own | parent index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | f | new | parent index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | f | new | parent index
at_partitioned_id_name_key | f | none | parent index
(6 rows)
@@ -2560,7 +2558,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/expected/fast_default.out b/src/test/regress/expected/fast_default.out
index 91f2571..0a35f33 100644
--- a/src/test/regress/expected/fast_default.out
+++ b/src/test/regress/expected/fast_default.out
@@ -3,8 +3,8 @@
--
SET search_path = fast_default;
CREATE SCHEMA fast_default;
-CREATE TABLE m(id OID);
-INSERT INTO m VALUES (NULL::OID);
+CREATE TABLE m(id BIGINT);
+INSERT INTO m VALUES (NULL::BIGINT);
CREATE FUNCTION set(tabname name) RETURNS VOID
AS $$
BEGIN
diff --git a/src/test/regress/expected/oidjoins.out b/src/test/regress/expected/oidjoins.out
index 215eb89..af57470 100644
--- a/src/test/regress/expected/oidjoins.out
+++ b/src/test/regress/expected/oidjoins.out
@@ -74,11 +74,11 @@ NOTICE: checking pg_type {typcollation} => pg_collation {oid}
NOTICE: checking pg_attribute {attrelid} => pg_class {oid}
NOTICE: checking pg_attribute {atttypid} => pg_type {oid}
NOTICE: checking pg_attribute {attcollation} => pg_collation {oid}
+NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {relnamespace} => pg_namespace {oid}
NOTICE: checking pg_class {reltype} => pg_type {oid}
NOTICE: checking pg_class {reloftype} => pg_type {oid}
NOTICE: checking pg_class {relowner} => pg_authid {oid}
-NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {reltablespace} => pg_tablespace {oid}
NOTICE: checking pg_class {reltoastrelid} => pg_class {oid}
NOTICE: checking pg_class {relrewrite} => pg_class {oid}
diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql
index 9f773ae..a67eb5f 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -1478,9 +1478,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1499,9 +1498,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1641,7 +1639,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/sql/fast_default.sql b/src/test/regress/sql/fast_default.sql
index 16a3b7c..819ec40 100644
--- a/src/test/regress/sql/fast_default.sql
+++ b/src/test/regress/sql/fast_default.sql
@@ -4,8 +4,8 @@
SET search_path = fast_default;
CREATE SCHEMA fast_default;
-CREATE TABLE m(id OID);
-INSERT INTO m VALUES (NULL::OID);
+CREATE TABLE m(id BIGINT);
+INSERT INTO m VALUES (NULL::BIGINT);
CREATE FUNCTION set(tabname name) RETURNS VOID
AS $$
--
1.8.3.1
On Tue, Sep 20, 2022 at 7:46 PM Amul Sul <sulamul@gmail.com> wrote:
Thanks for the review
Here are a few minor suggestions I came across while reading this
patch, might be useful:

+#ifdef USE_ASSERT_CHECKING
+
+	{

Unnecessary space after USE_ASSERT_CHECKING.
Changed
+ return InvalidRelFileNumber; /* placate compiler */
I don't think we needed this after the error on the latest branches.
--
Changed
+	LWLockAcquire(RelFileNumberGenLock, LW_SHARED);
+	if (shutdown)
+		checkPoint.nextRelFileNumber = ShmemVariableCache->nextRelFileNumber;
+	else
+		checkPoint.nextRelFileNumber = ShmemVariableCache->loggedRelFileNumber;
+
+	LWLockRelease(RelFileNumberGenLock);

This is done for a good reason, I think, but it should have a comment
describing why the two checkPoint.nextRelFileNumber assignments differ
and what that means for crash recovery.
--
Done
+#define SizeOfRelFileLocatorBackend \
+	(offsetof(RelFileLocatorBackend, backend) + sizeof(BackendId))

Could we append empty parentheses "()" to the macro name, so it looks like
a function call at the point of use, or change the macro name to uppercase?
--
Yeah, we could, but SizeOfXXX macros are common practice throughout the
Postgres code, so I left it as it is.
+	if (val < 0 || val > MAX_RELFILENUMBER)
..
	if ((relfilenumber) < 0 || (relfilenumber) > MAX_RELFILENUMBER) \

How about adding a macro for this condition, say RelFileNumberIsValid()?
We could replace all the checks referring to MAX_RELFILENUMBER with it.
Actually, RelFileNumberIsValid just checks whether the value is
InvalidRelFileNumber, i.e. 0. For the range check we could introduce a
RelFileNumberInValidRange() macro instead, but I am not sure it would be
cleaner than what we have now, so I left it as it is for now.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Wed, Sep 21, 2022 at 6:09 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
Yeah, you are right, we can make it uint64. But we cannot directly use
uint64 here because it is declared in c.h, which cannot be included from
postgres_ext.h, IIUC.
Ugh.
Can we move the existing definitions from
c.h file to some common file (common for client and server)?
Yeah, I think that would be a good idea. Here's a quick patch that
moves them to common/relpath.h, which seems like a possibly-reasonable
choice, though perhaps you or someone else will have a better idea.
Based on the discussion [1], it seems we cannot use
INT64_FORMAT/UINT64_FORMAT inside ereport. But in all other places I am
using INT64_FORMAT/UINT64_FORMAT. Does this make sense?

[1] /messages/by-id/20220730113922.qd7qmenwcmzyacje@alvherre.pgsql
Oh, hmm. So you're saying if the string is not translated then use
(U)INT64_FORMAT but if it is translated then cast? I guess that makes
sense. It feels a bit strange to have the style dependent on the
context like that, but maybe it's fine. I'll reread with that idea in
mind.
If we're going to bank on that, we could adapt this more
heavily, e.g. RelidByRelfilenumber() could lose the reltablespace
parameter.

Yeah, we might, although we would need a bool to identify whether it is a
shared relation or not.
Why?
--
Robert Haas
EDB: http://www.enterprisedb.com
Attachments:
move-relfilenumber-decls-v1.patchapplication/octet-stream; name=move-relfilenumber-decls-v1.patchDownload
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index f8c4cb8d18..bd9b066e4e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -55,6 +55,7 @@
#include "catalog/pg_trigger_d.h"
#include "catalog/pg_type_d.h"
#include "common/connect.h"
+#include "common/relpath.h"
#include "dumputils.h"
#include "fe_utils/option_utils.h"
#include "fe_utils/string_utils.h"
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index e379aa4669..31589b0fdc 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -10,6 +10,7 @@
#include <sys/stat.h>
#include <sys/time.h>
+#include "common/relpath.h"
#include "libpq-fe.h"
/* For now, pg_upgrade does not use common/logging.c; use our own pg_fatal */
diff --git a/src/include/catalog/binary_upgrade.h b/src/include/catalog/binary_upgrade.h
index fd934427ad..b7b7d3be00 100644
--- a/src/include/catalog/binary_upgrade.h
+++ b/src/include/catalog/binary_upgrade.h
@@ -14,6 +14,8 @@
#ifndef BINARY_UPGRADE_H
#define BINARY_UPGRADE_H
+#include "common/relpath.h"
+
extern PGDLLIMPORT Oid binary_upgrade_next_pg_tablespace_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_pg_type_oid;
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index 3ab713247f..4bbd94393c 100644
--- a/src/include/common/relpath.h
+++ b/src/include/common/relpath.h
@@ -19,6 +19,13 @@
*/
#include "catalog/catversion.h" /* pgrminclude ignore */
+/*
+ * RelFileNumber data type identifies the specific relation file name.
+ */
+typedef Oid RelFileNumber;
+#define InvalidRelFileNumber ((RelFileNumber) InvalidOid)
+#define RelFileNumberIsValid(relnumber) \
+ ((bool) ((relnumber) != InvalidRelFileNumber))
/*
* Name of major-version-specific tablespace subdirectories
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index aead2afd6e..633e7671b3 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -22,6 +22,7 @@
#ifndef PARSENODES_H
#define PARSENODES_H
+#include "common/relpath.h"
#include "nodes/bitmapset.h"
#include "nodes/lockoptions.h"
#include "nodes/primnodes.h"
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index dca2a21e7a..21e642a64c 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -16,6 +16,7 @@
#include "access/sdir.h"
#include "access/stratnum.h"
+#include "common/relpath.h"
#include "lib/stringinfo.h"
#include "nodes/bitmapset.h"
#include "nodes/lockoptions.h"
diff --git a/src/include/postgres_ext.h b/src/include/postgres_ext.h
index c9774fa010..240ad4e93b 100644
--- a/src/include/postgres_ext.h
+++ b/src/include/postgres_ext.h
@@ -46,14 +46,6 @@ typedef unsigned int Oid;
/* Define a signed 64-bit integer type for use in client API declarations. */
typedef PG_INT64_TYPE pg_int64;
-/*
- * RelFileNumber data type identifies the specific relation file name.
- */
-typedef Oid RelFileNumber;
-#define InvalidRelFileNumber ((RelFileNumber) InvalidOid)
-#define RelFileNumberIsValid(relnumber) \
- ((bool) ((relnumber) != InvalidRelFileNumber))
-
/*
* Identifiers of error message fields. Kept here to keep common
* between frontend and backend, and also to export them to libpq
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 73106b6fc0..ce57e7ce4e 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -15,6 +15,7 @@
#define RELCACHE_H
#include "access/tupdesc.h"
+#include "common/relpath.h"
#include "nodes/bitmapset.h"
On Mon, Sep 26, 2022 at 9:56 PM Robert Haas <robertmhaas@gmail.com> wrote:
Can we move the existing definitions from
c.h file to some common file (common for client and server)?

Yeah, I think that would be a good idea. Here's a quick patch that
moves them to common/relpath.h, which seems like a possibly-reasonable
choice, though perhaps you or someone else will have a better idea.
Looks fine to me.
Based on the discussion [1], it seems we cannot use
INT64_FORMAT/UINT64_FORMAT inside ereport. But in all other places I am
using INT64_FORMAT/UINT64_FORMAT. Does this make sense?

[1] /messages/by-id/20220730113922.qd7qmenwcmzyacje@alvherre.pgsql
Oh, hmm. So you're saying if the string is not translated then use
(U)INT64_FORMAT but if it is translated then cast?
Right
I guess that makes
sense. It feels a bit strange to have the style dependent on the
context like that, but maybe it's fine. I'll reread with that idea in
mind.
Ok
If we're going to bank on that, we could adapt this more
heavily, e.g. RelidByRelfilenumber() could lose the reltablespace
parameter.

Yeah, we might, although we need a bool to identify whether it is a
shared relation or not.

Why?
Because, if the entry is not in the cache, we need to look into the
relmapper, and for that we need to know whether it is a shared relation
or not. I don't think we can identify that just by looking at the
relfilenumber.
Another open comment which I missed in last reply
 	/*
 	 * We set up the lockRelId in case anything tries to lock the dummy
-	 * relation.  Note that this is fairly bogus since relNumber may be
-	 * different from the relation's OID.  It shouldn't really matter though.
-	 * In recovery, we are running by ourselves and can't have any lock
-	 * conflicts.  While syncing, we already hold AccessExclusiveLock.
+	 * relation.  Note we are setting relId to just FirstNormalObjectId which
+	 * is completely bogus.  It shouldn't really matter though.  In recovery,
+	 * we are running by ourselves and can't have any lock conflicts.  While
+	 * syncing, we already hold AccessExclusiveLock.
 	 */
 	rel->rd_lockInfo.lockRelId.dbId = rlocator.dbOid;
-	rel->rd_lockInfo.lockRelId.relId = rlocator.relNumber;
+	rel->rd_lockInfo.lockRelId.relId = FirstNormalObjectId;

Boy, this makes me uncomfortable. The existing logic is pretty bogus,
and we're replacing it with some other bogus thing. Do we know whether
anything actually does try to use this for locking?
Looking at the code, it seems it is not used for locking. I also tested
by setting a special value for relId in CreateFakeRelcacheEntry() and
validating that the id is never used for locking in SET_LOCKTAG_RELATION.
I ran check-world and could not find any case where we build a lock tag
from a fake relcache entry.
One notable difference between the existing logic and your change is
that, with the existing logic, we use a bogus value that will differ
from one relation to the next, whereas with this change, it will
always be the same value. Perhaps rel->rd_lockInfo.lockRelId.relId =
(Oid) rlocator.relNumber would be a more natural adaptation?
I agree, so changed it this way.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachments:
v19-0003-Don-t-need-tabespace-id-to-uniquely-identify-rel.patchtext/x-patch; charset=US-ASCII; name=v19-0003-Don-t-need-tabespace-id-to-uniquely-identify-rel.patchDownload
From e3bf618ff7ca46c533717320e4aebee32fc3e4ff Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Thu, 22 Sep 2022 10:48:38 +0530
Subject: [PATCH v19 3/3] Don't need tabespace id to uniquely identify relation
file within database
Now the relfilenumber is 56 bits wide and will never wrap around, so
relfilenumbers are always unique within a database. Therefore we no longer
need the tablespace Oid to uniquely identify a relation file within a
database.
---
contrib/pg_prewarm/autoprewarm.c | 4 +-
contrib/test_decoding/expected/rewrite.out | 2 +-
contrib/test_decoding/sql/rewrite.sql | 2 +-
src/backend/replication/logical/reorderbuffer.c | 5 +-
src/backend/utils/adt/dbsize.c | 3 +-
src/backend/utils/cache/relfilenumbermap.c | 69 +++++++++----------------
src/include/catalog/pg_class.h | 2 +-
src/include/utils/relfilenumbermap.h | 3 +-
8 files changed, 36 insertions(+), 54 deletions(-)
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index 31caf10..b8bbe33 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -31,6 +31,7 @@
#include "access/relation.h"
#include "access/xact.h"
#include "catalog/pg_class.h"
+#include "catalog/pg_tablespace.h"
#include "catalog/pg_type.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -511,7 +512,8 @@ autoprewarm_database_main(Datum main_arg)
Assert(rel == NULL);
StartTransactionCommand();
- reloid = RelidByRelfilenumber(blk->tablespace, blk->filenumber);
+ reloid = RelidByRelfilenumber(blk->filenumber,
+ blk->tablespace == GLOBALTABLESPACE_OID);
if (OidIsValid(reloid))
rel = try_relation_open(reloid, AccessShareLock);
diff --git a/contrib/test_decoding/expected/rewrite.out b/contrib/test_decoding/expected/rewrite.out
index b30999c..8f1aa48 100644
--- a/contrib/test_decoding/expected/rewrite.out
+++ b/contrib/test_decoding/expected/rewrite.out
@@ -106,7 +106,7 @@ VACUUM FULL pg_class;
-- reindexing of important relations / indexes
REINDEX TABLE pg_class;
REINDEX INDEX pg_class_oid_index;
-REINDEX INDEX pg_class_tblspc_relfilenode_index;
+REINDEX INDEX pg_class_relfilenode_index;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (5, 3);
BEGIN;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (6, 4);
diff --git a/contrib/test_decoding/sql/rewrite.sql b/contrib/test_decoding/sql/rewrite.sql
index 62dead3..8983704 100644
--- a/contrib/test_decoding/sql/rewrite.sql
+++ b/contrib/test_decoding/sql/rewrite.sql
@@ -77,7 +77,7 @@ VACUUM FULL pg_class;
-- reindexing of important relations / indexes
REINDEX TABLE pg_class;
REINDEX INDEX pg_class_oid_index;
-REINDEX INDEX pg_class_tblspc_relfilenode_index;
+REINDEX INDEX pg_class_relfilenode_index;
INSERT INTO replication_example(somedata, testcolumn1) VALUES (5, 3);
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index a0f398b..e9e0c6b 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -91,6 +91,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/pg_tablespace.h"
#include "lib/binaryheap.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -2153,8 +2154,8 @@ ReorderBufferProcessTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
case REORDER_BUFFER_CHANGE_DELETE:
Assert(snapshot_now);
- reloid = RelidByRelfilenumber(change->data.tp.rlocator.spcOid,
- change->data.tp.rlocator.relNumber);
+ reloid = RelidByRelfilenumber(change->data.tp.rlocator.relNumber,
+ change->data.tp.rlocator.spcOid == GLOBALTABLESPACE_OID);
/*
* Mapped catalog tuple without data, emitted while
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 9f70f35..4d33032 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -908,7 +908,8 @@ pg_filenode_relation(PG_FUNCTION_ARGS)
if (!RelFileNumberIsValid(relfilenumber))
PG_RETURN_NULL();
- heaprel = RelidByRelfilenumber(reltablespace, relfilenumber);
+ heaprel = RelidByRelfilenumber(relfilenumber,
+ reltablespace== GLOBALTABLESPACE_OID);
if (!OidIsValid(heaprel))
PG_RETURN_NULL();
diff --git a/src/backend/utils/cache/relfilenumbermap.c b/src/backend/utils/cache/relfilenumbermap.c
index 2e0acf9..bef14b0 100644
--- a/src/backend/utils/cache/relfilenumbermap.c
+++ b/src/backend/utils/cache/relfilenumbermap.c
@@ -32,17 +32,11 @@
static HTAB *RelfilenumberMapHash = NULL;
/* built first time through in InitializeRelfilenumberMap */
-static ScanKeyData relfilenumber_skey[2];
+static ScanKeyData relfilenumber_skey[1];
typedef struct
{
- Oid reltablespace;
- RelFileNumber relfilenumber;
-} RelfilenumberMapKey;
-
-typedef struct
-{
- RelfilenumberMapKey key; /* lookup key - must be first */
+ RelFileNumber relfilenumber; /* lookup key - must be first */
Oid relid; /* pg_class.oid */
} RelfilenumberMapEntry;
@@ -72,7 +66,7 @@ RelfilenumberMapInvalidateCallback(Datum arg, Oid relid)
entry->relid == relid) /* individual flushed relation */
{
if (hash_search(RelfilenumberMapHash,
- (void *) &entry->key,
+ (void *) &entry->relfilenumber,
HASH_REMOVE,
NULL) == NULL)
elog(ERROR, "hash table corrupted");
@@ -88,7 +82,6 @@ static void
InitializeRelfilenumberMap(void)
{
HASHCTL ctl;
- int i;
/* Make sure we've initialized CacheMemoryContext. */
if (CacheMemoryContext == NULL)
@@ -97,25 +90,20 @@ InitializeRelfilenumberMap(void)
/* build skey */
MemSet(&relfilenumber_skey, 0, sizeof(relfilenumber_skey));
- for (i = 0; i < 2; i++)
- {
- fmgr_info_cxt(F_OIDEQ,
- &relfilenumber_skey[i].sk_func,
- CacheMemoryContext);
- relfilenumber_skey[i].sk_strategy = BTEqualStrategyNumber;
- relfilenumber_skey[i].sk_subtype = InvalidOid;
- relfilenumber_skey[i].sk_collation = InvalidOid;
- }
-
- relfilenumber_skey[0].sk_attno = Anum_pg_class_reltablespace;
- relfilenumber_skey[1].sk_attno = Anum_pg_class_relfilenode;
+ fmgr_info_cxt(F_INT8EQ,
+ &relfilenumber_skey[0].sk_func,
+ CacheMemoryContext);
+ relfilenumber_skey[0].sk_strategy = BTEqualStrategyNumber;
+ relfilenumber_skey[0].sk_subtype = InvalidOid;
+ relfilenumber_skey[0].sk_collation = InvalidOid;
+ relfilenumber_skey[0].sk_attno = Anum_pg_class_relfilenode;
/*
* Only create the RelfilenumberMapHash now, so we don't end up partially
* initialized when fmgr_info_cxt() above ERRORs out with an out of memory
* error.
*/
- ctl.keysize = sizeof(RelfilenumberMapKey);
+ ctl.keysize = sizeof(RelFileNumber);
ctl.entrysize = sizeof(RelfilenumberMapEntry);
ctl.hcxt = CacheMemoryContext;
@@ -129,34 +117,25 @@ InitializeRelfilenumberMap(void)
}
/*
- * Map a relation's (tablespace, relfilenumber) to a relation's oid and cache
+ * Map a relation's relfilenumber to a relation's oid and cache
* the result.
*
* Returns InvalidOid if no relation matching the criteria could be found.
*/
Oid
-RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
+RelidByRelfilenumber(RelFileNumber relfilenumber, bool is_shared)
{
- RelfilenumberMapKey key;
RelfilenumberMapEntry *entry;
bool found;
SysScanDesc scandesc;
Relation relation;
HeapTuple ntp;
- ScanKeyData skey[2];
+ ScanKeyData skey[1];
Oid relid;
if (RelfilenumberMapHash == NULL)
InitializeRelfilenumberMap();
- /* pg_class will show 0 when the value is actually MyDatabaseTableSpace */
- if (reltablespace == MyDatabaseTableSpace)
- reltablespace = 0;
-
- MemSet(&key, 0, sizeof(key));
- key.reltablespace = reltablespace;
- key.relfilenumber = relfilenumber;
-
/*
* Check cache and return entry if one is found. Even if no target
* relation can be found later on we store the negative match and return a
@@ -164,7 +143,8 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
* since querying invalid values isn't supposed to be a frequent thing,
* but it's basically free.
*/
- entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_FIND, &found);
+ entry = hash_search(RelfilenumberMapHash, (void *) &relfilenumber,
+ HASH_FIND, &found);
if (found)
return entry->relid;
@@ -174,7 +154,7 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
/* initialize empty/negative cache entry before doing the actual lookups */
relid = InvalidOid;
- if (reltablespace == GLOBALTABLESPACE_OID)
+ if (is_shared)
{
/*
* Ok, shared table, check relmapper.
@@ -195,14 +175,13 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
memcpy(skey, relfilenumber_skey, sizeof(skey));
/* set scan arguments */
- skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
- skey[1].sk_argument = Int64GetDatum((int64) relfilenumber);
+ skey[0].sk_argument = Int64GetDatum((int64) relfilenumber);
scandesc = systable_beginscan(relation,
- ClassTblspcRelfilenodeIndexId,
+ ClassRelfilenodeIndexId,
true,
NULL,
- 2,
+ 1,
skey);
found = false;
@@ -213,11 +192,10 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
if (found)
elog(ERROR,
- "unexpected duplicate for tablespace %u, relfilenumber " UINT64_FORMAT,
- reltablespace, relfilenumber);
+ "unexpected duplicate for relfilenumber " UINT64_FORMAT,
+ relfilenumber);
found = true;
- Assert(classform->reltablespace == reltablespace);
Assert(classform->relfilenode == relfilenumber);
relid = classform->oid;
}
@@ -235,7 +213,8 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
* caused cache invalidations to be executed which would have deleted a
* new entry if we had entered it above.
*/
- entry = hash_search(RelfilenumberMapHash, (void *) &key, HASH_ENTER, &found);
+ entry = hash_search(RelfilenumberMapHash, (void *) &relfilenumber,
+ HASH_ENTER, &found);
if (found)
elog(ERROR, "corrupted hashtable");
entry->relid = relid;
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index 4768e5e..cacfc11 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -154,7 +154,7 @@ typedef FormData_pg_class *Form_pg_class;
DECLARE_UNIQUE_INDEX_PKEY(pg_class_oid_index, 2662, ClassOidIndexId, on pg_class using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_class_relname_nsp_index, 2663, ClassNameNspIndexId, on pg_class using btree(relname name_ops, relnamespace oid_ops));
-DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode int8_ops));
+DECLARE_INDEX(pg_class_relfilenode_index, 3455, ClassRelfilenodeIndexId, on pg_class using btree(relfilenode int8_ops));
#ifdef EXPOSE_TO_CLIENT_CODE
diff --git a/src/include/utils/relfilenumbermap.h b/src/include/utils/relfilenumbermap.h
index c149a93..ae81232 100644
--- a/src/include/utils/relfilenumbermap.h
+++ b/src/include/utils/relfilenumbermap.h
@@ -13,7 +13,6 @@
#ifndef RELFILENUMBERMAP_H
#define RELFILENUMBERMAP_H
-extern Oid RelidByRelfilenumber(Oid reltablespace,
- RelFileNumber relfilenumber);
+extern Oid RelidByRelfilenumber(RelFileNumber relfilenumber, bool is_shared);
#endif /* RELFILENUMBERMAP_H */
--
1.8.3.1
v19-0001-Refactoring-move-RelFileNumber-to-relpath.h.patchtext/x-patch; charset=US-ASCII; name=v19-0001-Refactoring-move-RelFileNumber-to-relpath.h.patchDownload
From c8507fc9fadf725e078883523034e11b78574713 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Tue, 27 Sep 2022 09:04:22 +0530
Subject: [PATCH v19 1/3] Refactoring - move RelFileNumber to relpath.h
Patch by Robert Haas
---
src/bin/pg_dump/pg_dump.c | 1 +
src/bin/pg_upgrade/pg_upgrade.h | 1 +
src/include/catalog/binary_upgrade.h | 2 ++
src/include/common/relpath.h | 7 +++++++
src/include/nodes/parsenodes.h | 1 +
src/include/nodes/plannodes.h | 1 +
src/include/postgres_ext.h | 8 --------
src/include/utils/relcache.h | 1 +
8 files changed, 14 insertions(+), 8 deletions(-)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index f8c4cb8..bd9b066 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -55,6 +55,7 @@
#include "catalog/pg_trigger_d.h"
#include "catalog/pg_type_d.h"
#include "common/connect.h"
+#include "common/relpath.h"
#include "dumputils.h"
#include "fe_utils/option_utils.h"
#include "fe_utils/string_utils.h"
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index e379aa4..31589b0 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -10,6 +10,7 @@
#include <sys/stat.h>
#include <sys/time.h>
+#include "common/relpath.h"
#include "libpq-fe.h"
/* For now, pg_upgrade does not use common/logging.c; use our own pg_fatal */
diff --git a/src/include/catalog/binary_upgrade.h b/src/include/catalog/binary_upgrade.h
index fd93442..b7b7d3b 100644
--- a/src/include/catalog/binary_upgrade.h
+++ b/src/include/catalog/binary_upgrade.h
@@ -14,6 +14,8 @@
#ifndef BINARY_UPGRADE_H
#define BINARY_UPGRADE_H
+#include "common/relpath.h"
+
extern PGDLLIMPORT Oid binary_upgrade_next_pg_tablespace_oid;
extern PGDLLIMPORT Oid binary_upgrade_next_pg_type_oid;
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index 3ab7132..4bbd943 100644
--- a/src/include/common/relpath.h
+++ b/src/include/common/relpath.h
@@ -19,6 +19,13 @@
*/
#include "catalog/catversion.h" /* pgrminclude ignore */
+/*
+ * RelFileNumber data type identifies the specific relation file name.
+ */
+typedef Oid RelFileNumber;
+#define InvalidRelFileNumber ((RelFileNumber) InvalidOid)
+#define RelFileNumberIsValid(relnumber) \
+ ((bool) ((relnumber) != InvalidRelFileNumber))
/*
* Name of major-version-specific tablespace subdirectories
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index aead2af..633e767 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -22,6 +22,7 @@
#ifndef PARSENODES_H
#define PARSENODES_H
+#include "common/relpath.h"
#include "nodes/bitmapset.h"
#include "nodes/lockoptions.h"
#include "nodes/primnodes.h"
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index dca2a21..21e642a 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -16,6 +16,7 @@
#include "access/sdir.h"
#include "access/stratnum.h"
+#include "common/relpath.h"
#include "lib/stringinfo.h"
#include "nodes/bitmapset.h"
#include "nodes/lockoptions.h"
diff --git a/src/include/postgres_ext.h b/src/include/postgres_ext.h
index c9774fa..240ad4e 100644
--- a/src/include/postgres_ext.h
+++ b/src/include/postgres_ext.h
@@ -47,14 +47,6 @@ typedef unsigned int Oid;
typedef PG_INT64_TYPE pg_int64;
/*
- * RelFileNumber data type identifies the specific relation file name.
- */
-typedef Oid RelFileNumber;
-#define InvalidRelFileNumber ((RelFileNumber) InvalidOid)
-#define RelFileNumberIsValid(relnumber) \
- ((bool) ((relnumber) != InvalidRelFileNumber))
-
-/*
* Identifiers of error message fields. Kept here to keep common
* between frontend and backend, and also to export them to libpq
* applications.
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 73106b6..ce57e7c 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -15,6 +15,7 @@
#define RELCACHE_H
#include "access/tupdesc.h"
+#include "common/relpath.h"
#include "nodes/bitmapset.h"
--
1.8.3.1
v19-0002-Widen-relfilenumber-from-32-bits-to-56-bits.patchtext/x-patch; charset=UTF-8; name=v19-0002-Widen-relfilenumber-from-32-bits-to-56-bits.patchDownload
From a96d5fdd48e1f38f4eda2cf703ccd7a9d7e9c79d Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Tue, 27 Sep 2022 09:07:10 +0530
Subject: [PATCH v19 2/3] Widen relfilenumber from 32 bits to 56 bits
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Currently relfilenumber is 32 bits wide, so it can wrap around and a
relfilenumber can be reused. To guard against reuse there is a complicated
hack that leaves a 0-length tombstone file around until the next checkpoint,
and when we allocate a new relfilenumber we must also loop to check for an
on-disk conflict.
This patch makes the relfilenumber 56 bits wide, with no provision for
wraparound. After this change we can get rid of the 0-length tombstone files
and of the loop that checks for on-disk relfilenumber conflicts.
We use 56 bits rather than going directly to 64 because a full 64-bit
relfilenumber would enlarge the BufferTag, increasing memory usage and
potentially hurting performance. To avoid that, the buffer tag packs 8 bits
of fork number and 56 bits of relfilenumber into a single 64-bit field.
---
contrib/pg_buffercache/Makefile | 4 +-
.../pg_buffercache/pg_buffercache--1.3--1.4.sql | 30 +++
contrib/pg_buffercache/pg_buffercache.control | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 39 +++-
contrib/pg_prewarm/autoprewarm.c | 4 +-
contrib/pg_walinspect/expected/pg_walinspect.out | 4 +-
contrib/pg_walinspect/sql/pg_walinspect.sql | 4 +-
doc/src/sgml/catalogs.sgml | 2 +-
doc/src/sgml/func.sgml | 5 +
doc/src/sgml/pgbuffercache.sgml | 2 +-
doc/src/sgml/storage.sgml | 11 +-
src/backend/access/gin/ginxlog.c | 2 +-
src/backend/access/rmgrdesc/gistdesc.c | 2 +-
src/backend/access/rmgrdesc/heapdesc.c | 2 +-
src/backend/access/rmgrdesc/nbtdesc.c | 2 +-
src/backend/access/rmgrdesc/seqdesc.c | 2 +-
src/backend/access/rmgrdesc/xlogdesc.c | 21 ++-
src/backend/access/transam/README | 5 +-
src/backend/access/transam/varsup.c | 207 ++++++++++++++++++++-
src/backend/access/transam/xlog.c | 60 ++++++
src/backend/access/transam/xlogprefetcher.c | 14 +-
src/backend/access/transam/xlogrecovery.c | 6 +-
src/backend/access/transam/xlogutils.c | 6 +-
src/backend/backup/basebackup.c | 2 +-
src/backend/catalog/catalog.c | 95 ----------
src/backend/catalog/heap.c | 25 +--
src/backend/catalog/index.c | 11 +-
src/backend/catalog/storage.c | 8 +
src/backend/commands/tablecmds.c | 12 +-
src/backend/commands/tablespace.c | 2 +-
src/backend/nodes/gen_node_support.pl | 4 +-
src/backend/replication/logical/decode.c | 1 +
src/backend/replication/logical/reorderbuffer.c | 2 +-
src/backend/storage/file/reinit.c | 28 +--
src/backend/storage/freespace/fsmpage.c | 2 +-
src/backend/storage/lmgr/lwlocknames.txt | 1 +
src/backend/storage/smgr/md.c | 7 +
src/backend/storage/smgr/smgr.c | 2 +-
src/backend/utils/adt/dbsize.c | 7 +-
src/backend/utils/adt/pg_upgrade_support.c | 13 +-
src/backend/utils/cache/relcache.c | 2 +-
src/backend/utils/cache/relfilenumbermap.c | 4 +-
src/backend/utils/misc/pg_controldata.c | 9 +-
src/bin/pg_checksums/pg_checksums.c | 4 +-
src/bin/pg_controldata/pg_controldata.c | 2 +
src/bin/pg_dump/pg_dump.c | 26 +--
src/bin/pg_rewind/filemap.c | 6 +-
src/bin/pg_upgrade/info.c | 3 +-
src/bin/pg_upgrade/pg_upgrade.c | 6 +-
src/bin/pg_upgrade/relfilenumber.c | 4 +-
src/bin/pg_waldump/pg_waldump.c | 2 +-
src/bin/scripts/t/090_reindexdb.pl | 2 +-
src/common/relpath.c | 20 +-
src/fe_utils/option_utils.c | 40 ++++
src/include/access/transam.h | 35 ++++
src/include/access/xlog.h | 1 +
src/include/catalog/catalog.h | 3 -
src/include/catalog/pg_class.h | 16 +-
src/include/catalog/pg_control.h | 2 +
src/include/catalog/pg_proc.dat | 10 +-
src/include/common/relpath.h | 7 +-
src/include/fe_utils/option_utils.h | 2 +
src/include/storage/buf_internals.h | 58 +++++-
src/include/storage/relfilelocator.h | 12 +-
src/test/regress/expected/alter_table.out | 24 ++-
src/test/regress/expected/fast_default.out | 4 +-
src/test/regress/expected/oidjoins.out | 2 +-
src/test/regress/sql/alter_table.sql | 8 +-
src/test/regress/sql/fast_default.sql | 4 +-
69 files changed, 688 insertions(+), 288 deletions(-)
create mode 100644 contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile
index d74b3e8..4d88eba 100644
--- a/contrib/pg_buffercache/Makefile
+++ b/contrib/pg_buffercache/Makefile
@@ -6,8 +6,8 @@ OBJS = \
pg_buffercache_pages.o
EXTENSION = pg_buffercache
-DATA = pg_buffercache--1.2.sql pg_buffercache--1.2--1.3.sql \
- pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql
+DATA = pg_buffercache--1.0--1.1.sql pg_buffercache--1.1--1.2.sql pg_buffercache--1.2.sql \
+ pg_buffercache--1.2--1.3.sql pg_buffercache--1.3--1.4.sql
PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time"
REGRESS = pg_buffercache
diff --git a/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
new file mode 100644
index 0000000..50956b1
--- /dev/null
+++ b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
@@ -0,0 +1,30 @@
+/* contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.4'" to load this file. \quit
+
+/* First we have to remove them from the extension */
+ALTER EXTENSION pg_buffercache DROP VIEW pg_buffercache;
+ALTER EXTENSION pg_buffercache DROP FUNCTION pg_buffercache_pages();
+
+/* Then we can drop them */
+DROP VIEW pg_buffercache;
+DROP FUNCTION pg_buffercache_pages();
+
+/* Now redefine */
+CREATE FUNCTION pg_buffercache_pages()
+RETURNS SETOF RECORD
+AS 'MODULE_PATHNAME', 'pg_buffercache_pages_v1_4'
+LANGUAGE C PARALLEL SAFE;
+
+CREATE VIEW pg_buffercache AS
+ SELECT P.* FROM pg_buffercache_pages() AS P
+ (bufferid integer, relfilenode int8, reltablespace oid, reldatabase oid,
+ relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
+ pinning_backends int4);
+
+-- Don't want these to be available to public.
+REVOKE ALL ON FUNCTION pg_buffercache_pages() FROM PUBLIC;
+REVOKE ALL ON pg_buffercache FROM PUBLIC;
+GRANT EXECUTE ON FUNCTION pg_buffercache_pages() TO pg_monitor;
+GRANT SELECT ON pg_buffercache TO pg_monitor;
diff --git a/contrib/pg_buffercache/pg_buffercache.control b/contrib/pg_buffercache/pg_buffercache.control
index 8c060ae..a82ae5f 100644
--- a/contrib/pg_buffercache/pg_buffercache.control
+++ b/contrib/pg_buffercache/pg_buffercache.control
@@ -1,5 +1,5 @@
# pg_buffercache extension
comment = 'examine the shared buffer cache'
-default_version = '1.3'
+default_version = '1.4'
module_pathname = '$libdir/pg_buffercache'
relocatable = true
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index c5754ea..a45f240 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -59,9 +59,10 @@ typedef struct
* relation node/tablespace/database/blocknum and dirty indicator.
*/
PG_FUNCTION_INFO_V1(pg_buffercache_pages);
+PG_FUNCTION_INFO_V1(pg_buffercache_pages_v1_4);
-Datum
-pg_buffercache_pages(PG_FUNCTION_ARGS)
+static Datum
+pg_buffercache_pages_internal(PG_FUNCTION_ARGS, Oid rfn_typid)
{
FuncCallContext *funcctx;
Datum result;
@@ -103,7 +104,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
TupleDescInitEntry(tupledesc, (AttrNumber) 1, "bufferid",
INT4OID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 2, "relfilenode",
- OIDOID, -1, 0);
+ rfn_typid, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 3, "reltablespace",
OIDOID, -1, 0);
TupleDescInitEntry(tupledesc, (AttrNumber) 4, "reldatabase",
@@ -209,7 +210,24 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
}
else
{
- values[1] = ObjectIdGetDatum(fctx->record[i].relfilenumber);
+ if (rfn_typid == INT8OID)
+ values[1] =
+ Int64GetDatum((int64) fctx->record[i].relfilenumber);
+ else
+ {
+ Assert(rfn_typid == OIDOID);
+
+ if (fctx->record[i].relfilenumber > OID_MAX)
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("relfilenode %llu is too large to be represented as an OID",
+ (unsigned long long) fctx->record[i].relfilenumber),
+ errhint("Upgrade the extension using ALTER EXTENSION pg_buffercache UPDATE"));
+
+ values[1] =
+ ObjectIdGetDatum((Oid) fctx->record[i].relfilenumber);
+ }
+
nulls[1] = false;
values[2] = ObjectIdGetDatum(fctx->record[i].reltablespace);
nulls[2] = false;
@@ -237,3 +255,16 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
else
SRF_RETURN_DONE(funcctx);
}
+
+/* entry point for old extension version */
+Datum
+pg_buffercache_pages(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, OIDOID);
+}
+
+Datum
+pg_buffercache_pages_v1_4(PG_FUNCTION_ARGS)
+{
+ return pg_buffercache_pages_internal(fcinfo, INT8OID);
+}
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index c8d673a..31caf10 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -345,7 +345,7 @@ apw_load_buffers(void)
{
unsigned forknum;
- if (fscanf(file, "%u,%u,%u,%u,%u\n", &blkinfo[i].database,
+ if (fscanf(file, "%u,%u," UINT64_FORMAT ",%u,%u\n", &blkinfo[i].database,
&blkinfo[i].tablespace, &blkinfo[i].filenumber,
&forknum, &blkinfo[i].blocknum) != 5)
ereport(ERROR,
@@ -669,7 +669,7 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
{
CHECK_FOR_INTERRUPTS();
- ret = fprintf(file, "%u,%u,%u,%u,%u\n",
+ ret = fprintf(file, "%u,%u," UINT64_FORMAT ",%u,%u\n",
block_info_array[i].database,
block_info_array[i].tablespace,
block_info_array[i].filenumber,
diff --git a/contrib/pg_walinspect/expected/pg_walinspect.out b/contrib/pg_walinspect/expected/pg_walinspect.out
index a1ee743..e9b06ed 100644
--- a/contrib/pg_walinspect/expected/pg_walinspect.out
+++ b/contrib/pg_walinspect/expected/pg_walinspect.out
@@ -54,9 +54,9 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- ===================================================================
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
ok
----
t
diff --git a/contrib/pg_walinspect/sql/pg_walinspect.sql b/contrib/pg_walinspect/sql/pg_walinspect.sql
index 1b265ea..5393834 100644
--- a/contrib/pg_walinspect/sql/pg_walinspect.sql
+++ b/contrib/pg_walinspect/sql/pg_walinspect.sql
@@ -39,10 +39,10 @@ SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'wal_lsn1');
-- Test for filtering out WAL records of a particular table
-- ===================================================================
-SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset
+SELECT relfilenode AS sample_tbl_relfilenode FROM pg_class WHERE relname = 'sample_tbl' \gset
SELECT COUNT(*) >= 1 AS ok FROM pg_get_wal_records_info(:'wal_lsn1', :'wal_lsn2')
- WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap';
+ WHERE block_ref LIKE concat('%', :'sample_tbl_relfilenode', '%') AND resource_manager = 'Heap';
-- ===================================================================
-- Test for filtering out WAL records based on resource_manager and
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 00f833d..40d4e9c 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1984,7 +1984,7 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
</para>
<para>
Name of the on-disk file of this relation; zero means this
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index e1fe4fe..e514edb 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25204,6 +25204,11 @@ SELECT collation for ('foo' COLLATE "de_DE");
<entry><type>timestamp with time zone</type></entry>
</row>
+ <row>
+ <entry><structfield>next_relfilenumber</structfield></entry>
+ <entry><type>int8</type></entry>
+ </row>
+
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
index a06fd3e..e222265 100644
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@@ -62,7 +62,7 @@
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>relfilenode</structfield> <type>oid</type>
+ <structfield>relfilenode</structfield> <type>int8</type>
(references <link linkend="catalog-pg-class"><structname>pg_class</structname></link>.<structfield>relfilenode</structfield>)
</para>
<para>
diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml
index e5b9f3f..d9e9b0f 100644
--- a/doc/src/sgml/storage.sgml
+++ b/doc/src/sgml/storage.sgml
@@ -217,11 +217,12 @@ with the suffix <literal>_init</literal> (see <xref linkend="storage-init"/>).
<caution>
<para>
-Note that while a table's filenode often matches its OID, this is
-<emphasis>not</emphasis> necessarily the case; some operations, like
-<command>TRUNCATE</command>, <command>REINDEX</command>, <command>CLUSTER</command> and some forms
-of <command>ALTER TABLE</command>, can change the filenode while preserving the OID.
-Avoid assuming that filenode and table OID are the same.
+Note that a table's filenode will normally be different than the OID. For
+system tables, the initial filenode will be equal to the table OID, but it will
+be different if the table has ever been subjected to a rewriting operation,
+such as <command>TRUNCATE</command>, <command>REINDEX</command>,
+<command>CLUSTER</command> or some forms of <command>ALTER TABLE</command>.
+For user tables, even the initial filenode will be different than the table OID.
Also, for certain system catalogs including <structname>pg_class</structname> itself,
<structname>pg_class</structname>.<structfield>relfilenode</structfield> contains zero. The
actual filenode number of these catalogs is stored in a lower-level data
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 41b9211..bc093f2 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -100,7 +100,7 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
BlockNumber blknum;
BufferGetTag(buffer, &locator, &forknum, &blknum);
- elog(ERROR, "failed to add item to index page in %u/%u/%u",
+ elog(ERROR, "failed to add item to index page in %u/%u/" UINT64_FORMAT,
locator.spcOid, locator.dbOid, locator.relNumber);
}
}
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index 7dd3c1d..d1c8a24 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -26,7 +26,7 @@ out_gistxlogPageUpdate(StringInfo buf, gistxlogPageUpdate *xlrec)
static void
out_gistxlogPageReuse(StringInfo buf, gistxlogPageReuse *xlrec)
{
- appendStringInfo(buf, "rel %u/%u/%u; blk %u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" UINT64_FORMAT "; blk %u; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber, xlrec->block,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 923d3bc..70bd493 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -169,7 +169,7 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
{
xl_heap_new_cid *xlrec = (xl_heap_new_cid *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; tid %u/%u",
+ appendStringInfo(buf, "rel %u/%u/" UINT64_FORMAT "; tid %u/%u",
xlrec->target_locator.spcOid,
xlrec->target_locator.dbOid,
xlrec->target_locator.relNumber,
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index 4843cd5..6192a7b 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -100,7 +100,7 @@ btree_desc(StringInfo buf, XLogReaderState *record)
{
xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
- appendStringInfo(buf, "rel %u/%u/%u; latestRemovedXid %u:%u",
+ appendStringInfo(buf, "rel %u/%u/" UINT64_FORMAT "; latestRemovedXid %u:%u",
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber,
EpochFromFullTransactionId(xlrec->latestRemovedFullXid),
diff --git a/src/backend/access/rmgrdesc/seqdesc.c b/src/backend/access/rmgrdesc/seqdesc.c
index b3845f9..df72caf 100644
--- a/src/backend/access/rmgrdesc/seqdesc.c
+++ b/src/backend/access/rmgrdesc/seqdesc.c
@@ -25,7 +25,7 @@ seq_desc(StringInfo buf, XLogReaderState *record)
xl_seq_rec *xlrec = (xl_seq_rec *) rec;
if (info == XLOG_SEQ_LOG)
- appendStringInfo(buf, "rel %u/%u/%u",
+ appendStringInfo(buf, "rel %u/%u/" UINT64_FORMAT,
xlrec->locator.spcOid, xlrec->locator.dbOid,
xlrec->locator.relNumber);
}
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 3fd7185..84a826b 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -45,8 +45,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
CheckPoint *checkpoint = (CheckPoint *) rec;
appendStringInfo(buf, "redo %X/%X; "
- "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; "
- "oldest xid %u in DB %u; oldest multi %u in DB %u; "
+ "tli %u; prev tli %u; fpw %s; xid %u:%u; relfilenumber " UINT64_FORMAT ";oid %u; "
+ "multi %u; offset %u; oldest xid %u in DB %u; oldest multi %u in DB %u; "
"oldest/newest commit timestamp xid: %u/%u; "
"oldest running xid %u; %s",
LSN_FORMAT_ARGS(checkpoint->redo),
@@ -55,6 +55,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->fullPageWrites ? "true" : "false",
EpochFromFullTransactionId(checkpoint->nextXid),
XidFromFullTransactionId(checkpoint->nextXid),
+ checkpoint->nextRelFileNumber,
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
@@ -74,6 +75,13 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
memcpy(&nextOid, rec, sizeof(Oid));
appendStringInfo(buf, "%u", nextOid);
}
+ else if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, rec, sizeof(RelFileNumber));
+ appendStringInfo(buf, UINT64_FORMAT, nextRelFileNumber);
+ }
else if (info == XLOG_RESTORE_POINT)
{
xl_restore_point *xlrec = (xl_restore_point *) rec;
@@ -169,6 +177,9 @@ xlog_identify(uint8 info)
case XLOG_NEXTOID:
id = "NEXTOID";
break;
+ case XLOG_NEXT_RELFILENUMBER:
+ id = "NEXT_RELFILENUMBER";
+ break;
case XLOG_SWITCH:
id = "SWITCH";
break;
@@ -237,7 +248,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
appendStringInfoChar(buf, ' ');
appendStringInfo(buf,
- "blkref #%d: rel %u/%u/%u fork %s blk %u",
+ "blkref #%d: rel %u/%u/" UINT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -297,7 +308,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
if (forknum != MAIN_FORKNUM)
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u fork %s blk %u",
+ ", blkref #%d: rel %u/%u/" UINT64_FORMAT " fork %s blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forkNames[forknum],
@@ -306,7 +317,7 @@ XLogRecGetBlockRefInfo(XLogReaderState *record, bool pretty,
else
{
appendStringInfo(buf,
- ", blkref #%d: rel %u/%u/%u blk %u",
+ ", blkref #%d: rel %u/%u/" UINT64_FORMAT " blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
blk);
diff --git a/src/backend/access/transam/README b/src/backend/access/transam/README
index 72af656..91c2578 100644
--- a/src/backend/access/transam/README
+++ b/src/backend/access/transam/README
@@ -692,8 +692,9 @@ by having database restart search for files that don't have any committed
entry in pg_class, but that currently isn't done because of the possibility
of deleting data that is useful for forensic analysis of the crash.
Orphan files are harmless --- at worst they waste a bit of disk space ---
-because we check for on-disk collisions when allocating new relfilenumber
-OIDs. So cleaning up isn't really necessary.
+because the relfilenumber counter is monotonically increasing. The maximum
+value is 2^56-1, and there is no provision for wraparound. Thus, on-disk
+collisions aren't possible.
3. Deleting a table, which requires an unlink() that could fail.
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 849a7ce..e8cad59 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -13,12 +13,16 @@
#include "postgres.h"
+#include <unistd.h>
+
#include "access/clog.h"
#include "access/commit_ts.h"
#include "access/subtrans.h"
#include "access/transam.h"
#include "access/xact.h"
#include "access/xlogutils.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_tablespace.h"
#include "commands/dbcommands.h"
#include "miscadmin.h"
#include "postmaster/autovacuum.h"
@@ -30,6 +34,15 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+/* Number of RelFileNumbers to be logged per XLOG write */
+#define VAR_RELNUMBER_PER_XLOG 512
+
+/*
+ * Log more RelFileNumbers when the number of remaining logged values drops
+ * below this threshold. The valid range is 0 to VAR_RELNUMBER_PER_XLOG - 1.
+ */
+#define VAR_RELNUMBER_NEW_XLOG_THRESHOLD 256
+
/* pointer to "variable cache" in shared memory (set up by shmem.c) */
VariableCache ShmemVariableCache = NULL;
@@ -521,8 +534,7 @@ ForceTransactionIdLimitUpdate(void)
* wide, counter wraparound will occur eventually, and therefore it is unwise
* to assume they are unique unless precautions are taken to make them so.
* Hence, this routine should generally not be used directly. The only direct
- * callers should be GetNewOidWithIndex() and GetNewRelFileNumber() in
- * catalog/catalog.c.
+ * caller should be GetNewOidWithIndex() in catalog/catalog.c.
*/
Oid
GetNewObjectId(void)
@@ -613,6 +625,197 @@ SetNextObjectId(Oid nextOid)
}
/*
+ * GetNewRelFileNumber
+ *
+ * Similar to GetNewObjectId, but generates a new relfilenumber instead of
+ * a new Oid.
+ */
+RelFileNumber
+GetNewRelFileNumber(Oid reltablespace, char relpersistence)
+{
+ RelFileNumber result;
+ RelFileNumber nextRelFileNumber,
+ loggedRelFileNumber,
+ flushedRelFileNumber;
+
+ StaticAssertStmt(VAR_RELNUMBER_NEW_XLOG_THRESHOLD < VAR_RELNUMBER_PER_XLOG,
+ "VAR_RELNUMBER_NEW_XLOG_THRESHOLD must be smaller than VAR_RELNUMBER_PER_XLOG");
+
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot assign RelFileNumber during recovery");
+
+ if (IsBinaryUpgrade)
+ elog(ERROR, "cannot assign RelFileNumber during binary upgrade");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ nextRelFileNumber = ShmemVariableCache->nextRelFileNumber;
+ loggedRelFileNumber = ShmemVariableCache->loggedRelFileNumber;
+ flushedRelFileNumber = ShmemVariableCache->flushedRelFileNumber;
+
+ Assert(nextRelFileNumber <= flushedRelFileNumber);
+ Assert(flushedRelFileNumber <= loggedRelFileNumber);
+
+ /* check for wraparound of the relfilenumber counter */
+ if (unlikely(nextRelFileNumber > MAX_RELFILENUMBER))
+ elog(ERROR, "relfilenumber is too large");
+
+ /*
+ * If the number of remaining logged relfilenumbers drops below the
+ * threshold, log more. Ideally, we could wait until all logged
+ * relfilenumbers have been consumed before logging more; however, we
+ * would then have to flush the logged WAL record immediately, because
+ * nextRelFileNumber must always be larger than any relfilenumber already
+ * in use on disk. To maintain that invariant, the record we log must
+ * reach disk before any new files are created within the newly logged
+ * range.
+ *
+ * So, to avoid an immediate flush, we always log before consuming all
+ * the logged relfilenumbers, and we only need to flush the new WAL
+ * record before consuming relfilenumbers from the new range. By the
+ * time that flush is needed, it has hopefully already happened as part
+ * of some other XLogFlush operation.
+ */
+ if (loggedRelFileNumber - nextRelFileNumber <=
+ VAR_RELNUMBER_NEW_XLOG_THRESHOLD)
+ {
+ XLogRecPtr recptr;
+
+ loggedRelFileNumber = loggedRelFileNumber + VAR_RELNUMBER_PER_XLOG;
+ recptr = LogNextRelFileNumber(loggedRelFileNumber);
+ ShmemVariableCache->loggedRelFileNumber = loggedRelFileNumber;
+
+ /* remember for the future flush */
+ ShmemVariableCache->loggedRelFileNumberRecPtr = recptr;
+ }
+
+ /*
+ * If nextRelFileNumber has caught up with the flushed relfilenumber,
+ * flush the WAL for the previously logged range.
+ */
+ if (nextRelFileNumber >= flushedRelFileNumber)
+ {
+ XLogFlush(ShmemVariableCache->loggedRelFileNumberRecPtr);
+ ShmemVariableCache->flushedRelFileNumber = loggedRelFileNumber;
+ }
+
+ result = ShmemVariableCache->nextRelFileNumber;
+
+ /* we should never be using any relfilenumber outside the flushed range */
+ Assert(result <= ShmemVariableCache->flushedRelFileNumber);
+
+ (ShmemVariableCache->nextRelFileNumber)++;
+
+ LWLockRelease(RelFileNumberGenLock);
+
+ /*
+ * Because the RelFileNumber counter only ever increases and never wraps
+ * around, it should be impossible for the newly-allocated RelFileNumber to
+ * already be in use. But, if Asserts are enabled, double check that
+ * there's no main-fork relation file with the new RelFileNumber already on
+ * disk.
+ */
+#ifdef USE_ASSERT_CHECKING
+ {
+ RelFileLocatorBackend rlocator;
+ char *rpath;
+ BackendId backend;
+
+ switch (relpersistence)
+ {
+ case RELPERSISTENCE_TEMP:
+ backend = BackendIdForTempRelations();
+ break;
+ case RELPERSISTENCE_UNLOGGED:
+ case RELPERSISTENCE_PERMANENT:
+ backend = InvalidBackendId;
+ break;
+ default:
+ elog(ERROR, "invalid relpersistence: %c", relpersistence);
+ }
+
+ /* this logic should match RelationInitPhysicalAddr */
+ rlocator.locator.spcOid =
+ reltablespace ? reltablespace : MyDatabaseTableSpace;
+ rlocator.locator.dbOid = (reltablespace == GLOBALTABLESPACE_OID) ?
+ InvalidOid : MyDatabaseId;
+ rlocator.locator.relNumber = result;
+
+ /*
+ * The relpath will vary based on the backend ID, so we must
+ * initialize that properly here to make sure that any collisions
+ * based on filename are properly detected.
+ */
+ rlocator.backend = backend;
+
+ /* check for existing file of same name. */
+ rpath = relpath(rlocator, MAIN_FORKNUM);
+ Assert(access(rpath, F_OK) != 0);
+ }
+#endif
+
+ return result;
+}
+
+/*
+ * SetNextRelFileNumber
+ *
+ * This may only be called during pg_upgrade; it advances the RelFileNumber
+ * counter to the specified value if the current value is smaller than the
+ * input value.
+ */
+void
+SetNextRelFileNumber(RelFileNumber relnumber)
+{
+ /* safety check, we should never get this far in a HS standby */
+ if (RecoveryInProgress())
+ elog(ERROR, "cannot set RelFileNumber during recovery");
+
+ if (!IsBinaryUpgrade)
+ elog(ERROR, "RelFileNumber can be set only during binary upgrade");
+
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+
+ /*
+ * If the previously assigned nextRelFileNumber is already higher than
+ * the requested value, there is nothing to do. This is possible because
+ * during an upgrade the objects are not created in relfilenumber order.
+ */
+ if (relnumber <= ShmemVariableCache->nextRelFileNumber)
+ {
+ LWLockRelease(RelFileNumberGenLock);
+ return;
+ }
+
+ /*
+ * If the new relfilenumber is greater than or equal to the already
+ * flushed relfilenumber, log more and flush the WAL immediately.
+ *
+ * XXX The new 'relnumber' can be from any range, so we cannot plan to
+ * piggyback the XLogFlush by logging in advance. It should not really
+ * matter, as this is only called during binary upgrade.
+ */
+ if (relnumber >= ShmemVariableCache->flushedRelFileNumber)
+ {
+ RelFileNumber newlogrelnum;
+
+ newlogrelnum = relnumber + VAR_RELNUMBER_PER_XLOG;
+ XLogFlush(LogNextRelFileNumber(newlogrelnum));
+
+ /* we have flushed whatever we have logged so no pending flush */
+ ShmemVariableCache->loggedRelFileNumber = newlogrelnum;
+ ShmemVariableCache->flushedRelFileNumber = newlogrelnum;
+ ShmemVariableCache->loggedRelFileNumberRecPtr = InvalidXLogRecPtr;
+ }
+
+ ShmemVariableCache->nextRelFileNumber = relnumber;
+
+ LWLockRelease(RelFileNumberGenLock);
+}
+
+/*
* StopGeneratingPinnedObjectIds
*
* This is called once during initdb to force the OID counter up to
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 1dd6df0..dff9b8d 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4712,6 +4712,7 @@ BootStrapXLOG(void)
checkPoint.nextXid =
FullTransactionIdFromEpochAndXid(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstGenbkiObjectId;
+ checkPoint.nextRelFileNumber = FirstNormalRelFileNumber;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
checkPoint.oldestXid = FirstNormalTransactionId;
@@ -4725,7 +4726,11 @@ BootStrapXLOG(void)
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->flushedRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumberRecPtr = InvalidXLogRecPtr;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -5191,7 +5196,10 @@ StartupXLOG(void)
/* initialize shared memory variables from the checkpoint record */
ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
ShmemVariableCache->oidCount = 0;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->flushedRelFileNumber = checkPoint.nextRelFileNumber;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
AdvanceOldestClogXid(checkPoint.oldestXid);
SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
@@ -6663,6 +6671,24 @@ CreateCheckPoint(int flags)
checkPoint.nextOid += ShmemVariableCache->oidCount;
LWLockRelease(OidGenLock);
+ /*
+ * If this is a shutdown checkpoint, we can safely start allocating
+ * relfilenumbers from the nextRelFileNumber value after the restart,
+ * because no one else can have used any relfilenumber beyond it before
+ * the shutdown. OTOH, if it is a normal checkpoint and there is a crash
+ * after this point, we might end up reusing the same relfilenumbers
+ * after the restart, so set nextRelFileNumber to the already-logged
+ * relfilenumber: no one will use a number beyond that limit without
+ * logging again.
+ */
+ LWLockAcquire(RelFileNumberGenLock, LW_SHARED);
+ if (shutdown)
+ checkPoint.nextRelFileNumber = ShmemVariableCache->nextRelFileNumber;
+ else
+ checkPoint.nextRelFileNumber = ShmemVariableCache->loggedRelFileNumber;
+
+ LWLockRelease(RelFileNumberGenLock);
+
MultiXactGetCheckptMulti(shutdown,
&checkPoint.nextMulti,
&checkPoint.nextMultiOffset,
@@ -7541,6 +7567,24 @@ XLogPutNextOid(Oid nextOid)
}
/*
+ * Similar to XLogPutNextOid, but writes a NEXT_RELFILENUMBER log record
+ * instead of a NEXTOID record. It returns the XLogRecPtr of the logged
+ * relfilenumber record so that the caller can flush it at the
+ * appropriate time.
+ */
+XLogRecPtr
+LogNextRelFileNumber(RelFileNumber nextrelnumber)
+{
+ XLogRecPtr recptr;
+
+ XLogBeginInsert();
+ XLogRegisterData((char *) (&nextrelnumber), sizeof(RelFileNumber));
+ recptr = XLogInsert(RM_XLOG_ID, XLOG_NEXT_RELFILENUMBER);
+
+ return recptr;
+}
+
+/*
* Write an XLOG SWITCH record.
*
* Here we just blindly issue an XLogInsert request for the record.
@@ -7755,6 +7799,17 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
}
+ else if (info == XLOG_NEXT_RELFILENUMBER)
+ {
+ RelFileNumber nextRelFileNumber;
+
+ memcpy(&nextRelFileNumber, XLogRecGetData(record), sizeof(RelFileNumber));
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumber = nextRelFileNumber;
+ ShmemVariableCache->flushedRelFileNumber = nextRelFileNumber;
+ LWLockRelease(RelFileNumberGenLock);
+ }
else if (info == XLOG_CHECKPOINT_SHUTDOWN)
{
CheckPoint checkPoint;
@@ -7769,6 +7824,11 @@ xlog_redo(XLogReaderState *record)
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
LWLockRelease(OidGenLock);
+ LWLockAcquire(RelFileNumberGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->loggedRelFileNumber = checkPoint.nextRelFileNumber;
+ ShmemVariableCache->flushedRelFileNumber = checkPoint.nextRelFileNumber;
+ LWLockRelease(RelFileNumberGenLock);
MultiXactSetNextMXact(checkPoint.nextMulti,
checkPoint.nextMultiOffset);
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index 8f5d425..cea38ec 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -613,7 +613,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u until %X/%X is replayed, which creates the relation",
+ "suppressing prefetch in relation %u/%u/" UINT64_FORMAT " until %X/%X is replayed, which creates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -636,7 +636,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, which truncates the relation",
+ "suppressing prefetch in relation %u/%u/" UINT64_FORMAT " from block %u until %X/%X is replayed, which truncates the relation",
xlrec->rlocator.spcOid,
xlrec->rlocator.dbOid,
xlrec->rlocator.relNumber,
@@ -735,7 +735,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing all prefetch in relation %u/%u/%u until %X/%X is replayed, because the relation does not exist on disk",
+ "suppressing all prefetch in relation %u/%u/" UINT64_FORMAT " until %X/%X is replayed, because the relation does not exist on disk",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -756,7 +756,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "suppressing prefetch in relation %u/%u/%u from block %u until %X/%X is replayed, because the relation is too small",
+ "suppressing prefetch in relation %u/%u/" UINT64_FORMAT " from block %u until %X/%X is replayed, because the relation is too small",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -795,7 +795,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
* truncated beneath our feet?
*/
elog(ERROR,
- "could not prefetch relation %u/%u/%u block %u",
+ "could not prefetch relation %u/%u/" UINT64_FORMAT " block %u",
reln->smgr_rlocator.locator.spcOid,
reln->smgr_rlocator.locator.dbOid,
reln->smgr_rlocator.locator.relNumber,
@@ -934,7 +934,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
+ "prefetch of %u/%u/" UINT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (blocks >= %u filtered)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed),
filter->filter_from_block);
@@ -950,7 +950,7 @@ XLogPrefetcherIsFiltered(XLogPrefetcher *prefetcher, RelFileLocator rlocator,
{
#ifdef XLOGPREFETCHER_DEBUG_LEVEL
elog(XLOGPREFETCHER_DEBUG_LEVEL,
- "prefetch of %u/%u/%u block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
+ "prefetch of %u/%u/" UINT64_FORMAT " block %u suppressed; filtering until LSN %X/%X is replayed (whole database)",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber, blockno,
LSN_FORMAT_ARGS(filter->filter_until_replayed));
#endif
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index b41e682..1026ce5 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2228,14 +2228,14 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
continue;
if (forknum != MAIN_FORKNUM)
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, fork %u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" UINT64_FORMAT ", fork %u, blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
forknum,
blk);
else
- appendStringInfo(buf, "; blkref #%d: rel %u/%u/%u, blk %u",
+ appendStringInfo(buf, "; blkref #%d: rel %u/%u/" UINT64_FORMAT ", blk %u",
block_id,
rlocator.spcOid, rlocator.dbOid,
rlocator.relNumber,
@@ -2433,7 +2433,7 @@ verifyBackupPageConsistency(XLogReaderState *record)
if (memcmp(replay_image_masked, primary_image_masked, BLCKSZ) != 0)
{
elog(FATAL,
- "inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
+ "inconsistent page found, rel %u/%u/" UINT64_FORMAT ", forknum %u, blkno %u",
rlocator.spcOid, rlocator.dbOid, rlocator.relNumber,
forknum, blkno);
}
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 563cba2..ffda2c2 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -619,17 +619,17 @@ CreateFakeRelcacheEntry(RelFileLocator rlocator)
rel->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
/* We don't know the name of the relation; use relfilenumber instead */
- sprintf(RelationGetRelationName(rel), "%u", rlocator.relNumber);
+ sprintf(RelationGetRelationName(rel), UINT64_FORMAT, rlocator.relNumber);
/*
* We set up the lockRelId in case anything tries to lock the dummy
- * relation. Note that this is fairly bogus since relNumber may be
+ * relation. Note that this is fairly bogus since relNumber is completely
* different from the relation's OID. It shouldn't really matter though.
* In recovery, we are running by ourselves and can't have any lock
* conflicts. While syncing, we already hold AccessExclusiveLock.
*/
rel->rd_lockInfo.lockRelId.dbId = rlocator.dbOid;
- rel->rd_lockInfo.lockRelId.relId = rlocator.relNumber;
+ rel->rd_lockInfo.lockRelId.relId = (Oid) rlocator.relNumber;
rel->rd_smgr = NULL;
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 411cac9..1434bcd 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1246,7 +1246,7 @@ sendDir(bbsink *sink, const char *path, int basepathlen, bool sizeonly,
if (relForkNum != INIT_FORKNUM)
{
char initForkFile[MAXPGPATH];
- char relNumber[OIDCHARS + 1];
+ char relNumber[RELNUMBERCHARS + 1];
/*
* If any other type of fork, check if there is an init fork
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 2abd6b0..a9bd8ae 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -483,101 +483,6 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
}
/*
- * GetNewRelFileNumber
- * Generate a new relfilenumber that is unique within the
- * database of the given tablespace.
- *
- * If the relfilenumber will also be used as the relation's OID, pass the
- * opened pg_class catalog, and this routine will guarantee that the result
- * is also an unused OID within pg_class. If the result is to be used only
- * as a relfilenumber for an existing relation, pass NULL for pg_class.
- *
- * As with GetNewOidWithIndex(), there is some theoretical risk of a race
- * condition, but it doesn't seem worth worrying about.
- *
- * Note: we don't support using this in bootstrap mode. All relations
- * created by bootstrap have preassigned OIDs, so there's no need.
- */
-RelFileNumber
-GetNewRelFileNumber(Oid reltablespace, Relation pg_class, char relpersistence)
-{
- RelFileLocatorBackend rlocator;
- char *rpath;
- bool collides;
- BackendId backend;
-
- /*
- * If we ever get here during pg_upgrade, there's something wrong; all
- * relfilenumber assignments during a binary-upgrade run should be
- * determined by commands in the dump script.
- */
- Assert(!IsBinaryUpgrade);
-
- switch (relpersistence)
- {
- case RELPERSISTENCE_TEMP:
- backend = BackendIdForTempRelations();
- break;
- case RELPERSISTENCE_UNLOGGED:
- case RELPERSISTENCE_PERMANENT:
- backend = InvalidBackendId;
- break;
- default:
- elog(ERROR, "invalid relpersistence: %c", relpersistence);
- return InvalidRelFileNumber; /* placate compiler */
- }
-
- /* This logic should match RelationInitPhysicalAddr */
- rlocator.locator.spcOid = reltablespace ? reltablespace : MyDatabaseTableSpace;
- rlocator.locator.dbOid =
- (rlocator.locator.spcOid == GLOBALTABLESPACE_OID) ?
- InvalidOid : MyDatabaseId;
-
- /*
- * The relpath will vary based on the backend ID, so we must initialize
- * that properly here to make sure that any collisions based on filename
- * are properly detected.
- */
- rlocator.backend = backend;
-
- do
- {
- CHECK_FOR_INTERRUPTS();
-
- /* Generate the OID */
- if (pg_class)
- rlocator.locator.relNumber = GetNewOidWithIndex(pg_class, ClassOidIndexId,
- Anum_pg_class_oid);
- else
- rlocator.locator.relNumber = GetNewObjectId();
-
- /* Check for existing file of same name */
- rpath = relpath(rlocator, MAIN_FORKNUM);
-
- if (access(rpath, F_OK) == 0)
- {
- /* definite collision */
- collides = true;
- }
- else
- {
- /*
- * Here we have a little bit of a dilemma: if errno is something
- * other than ENOENT, should we declare a collision and loop? In
- * practice it seems best to go ahead regardless of the errno. If
- * there is a colliding file we will get an smgr failure when we
- * attempt to create the new relation file.
- */
- collides = false;
- }
-
- pfree(rpath);
- } while (collides);
-
- return rlocator.locator.relNumber;
-}
-
-/*
* SQL callable interface for GetNewOidWithIndex(). Outside of initdb's
* direct insertions into catalog tables, and recovering from corruption, this
* should rarely be needed.
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 9a80ccd..e12efe7 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -342,10 +342,18 @@ heap_create(const char *relname,
{
/*
* If relfilenumber is unspecified by the caller then create storage
- * with oid same as relid.
+ * with relfilenumber same as relid if it is a system table; otherwise,
+ * allocate a new relfilenumber. For details, see the comments atop the
+ * FirstNormalRelFileNumber declaration.
*/
if (!RelFileNumberIsValid(relfilenumber))
- relfilenumber = relid;
+ {
+ if (relid < FirstNormalObjectId)
+ relfilenumber = relid;
+ else
+ relfilenumber = GetNewRelFileNumber(reltablespace,
+ relpersistence);
+ }
}
/*
@@ -901,7 +909,7 @@ InsertPgClassTuple(Relation pg_class_desc,
values[Anum_pg_class_reloftype - 1] = ObjectIdGetDatum(rd_rel->reloftype);
values[Anum_pg_class_relowner - 1] = ObjectIdGetDatum(rd_rel->relowner);
values[Anum_pg_class_relam - 1] = ObjectIdGetDatum(rd_rel->relam);
- values[Anum_pg_class_relfilenode - 1] = ObjectIdGetDatum(rd_rel->relfilenode);
+ values[Anum_pg_class_relfilenode - 1] = Int64GetDatum(rd_rel->relfilenode);
values[Anum_pg_class_reltablespace - 1] = ObjectIdGetDatum(rd_rel->reltablespace);
values[Anum_pg_class_relpages - 1] = Int32GetDatum(rd_rel->relpages);
values[Anum_pg_class_reltuples - 1] = Float4GetDatum(rd_rel->reltuples);
@@ -1173,12 +1181,7 @@ heap_create_with_catalog(const char *relname,
if (shared_relation && reltablespace != GLOBALTABLESPACE_OID)
elog(ERROR, "shared relations must be placed in pg_global tablespace");
- /*
- * Allocate an OID for the relation, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the relation, unless we were told what to use. */
if (!OidIsValid(relid))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -1232,8 +1235,8 @@ heap_create_with_catalog(const char *relname,
}
if (!OidIsValid(relid))
- relid = GetNewRelFileNumber(reltablespace, pg_class_desc,
- relpersistence);
+ relid = GetNewOidWithIndex(pg_class_desc, ClassOidIndexId,
+ Anum_pg_class_oid);
}
/*
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 61f1d39..1fd40c4 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -898,12 +898,7 @@ index_create(Relation heapRelation,
collationObjectId,
classObjectId);
- /*
- * Allocate an OID for the index, unless we were told what to use.
- *
- * The OID will be the relfilenumber as well, so make sure it doesn't
- * collide with either pg_class OIDs or existing physical files.
- */
+ /* Allocate an OID for the index, unless we were told what to use. */
if (!OidIsValid(indexRelationId))
{
/* Use binary-upgrade override for pg_class.oid and relfilenumber */
@@ -935,8 +930,8 @@ index_create(Relation heapRelation,
}
else
{
- indexRelationId =
- GetNewRelFileNumber(tableSpaceId, pg_class, relpersistence);
+ indexRelationId = GetNewOidWithIndex(pg_class, ClassOidIndexId,
+ Anum_pg_class_oid);
}
}
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index d708af1..021e085 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -968,6 +968,10 @@ smgr_redo(XLogReaderState *record)
xl_smgr_create *xlrec = (xl_smgr_create *) XLogRecGetData(record);
SMgrRelation reln;
+ if (xlrec->rlocator.relNumber > ShmemVariableCache->nextRelFileNumber)
+ elog(ERROR, "unexpected relnumber " UINT64_FORMAT " that is bigger than nextRelFileNumber " UINT64_FORMAT,
+ xlrec->rlocator.relNumber, ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
smgrcreate(reln, xlrec->forkNum, true);
}
@@ -981,6 +985,10 @@ smgr_redo(XLogReaderState *record)
int nforks = 0;
bool need_fsm_vacuum = false;
+ if (xlrec->rlocator.relNumber > ShmemVariableCache->nextRelFileNumber)
+ elog(ERROR, "unexpected relnumber " UINT64_FORMAT " that is bigger than nextRelFileNumber " UINT64_FORMAT,
+ xlrec->rlocator.relNumber, ShmemVariableCache->nextRelFileNumber);
+
reln = smgropen(xlrec->rlocator, InvalidBackendId);
/*
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 7d8a75d..1b8e6d5 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14375,10 +14375,14 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
}
/*
- * Relfilenumbers are not unique in databases across tablespaces, so we
- * need to allocate a new one in the new tablespace.
- */
- newrelfilenumber = GetNewRelFileNumber(newTableSpace, NULL,
+ * Generate a new relfilenumber. We cannot reuse the old relfilenumber
+ * because of the possibility that the relation will be moved back to the
+ * original tablespace before the next checkpoint. At that point, the
+ * first segment of the main fork won't have been unlinked yet, and an
+ * attempt to create new relation storage with that same relfilenumber
+ * will fail.
+ */
+ newrelfilenumber = GetNewRelFileNumber(newTableSpace,
rel->rd_rel->relpersistence);
/* Open old and new relation */
diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index b69ff37..cdd7986 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -267,7 +267,7 @@ CreateTableSpace(CreateTableSpaceStmt *stmt)
* parts.
*/
if (strlen(location) + 1 + strlen(TABLESPACE_VERSION_DIRECTORY) + 1 +
- OIDCHARS + 1 + OIDCHARS + 1 + FORKNAMECHARS + 1 + OIDCHARS > MAXPGPATH)
+ OIDCHARS + 1 + RELNUMBERCHARS + 1 + FORKNAMECHARS + 1 + OIDCHARS > MAXPGPATH)
ereport(ERROR,
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
errmsg("tablespace location \"%s\" is too long",
diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index 81b8c18..f1fa894 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -961,12 +961,12 @@ _read${n}(void)
print $off "\tWRITE_UINT_FIELD($f);\n";
print $rff "\tREAD_UINT_FIELD($f);\n" unless $no_read;
}
- elsif ($t eq 'uint64')
+ elsif ($t eq 'uint64' || $t eq 'RelFileNumber')
{
print $off "\tWRITE_UINT64_FIELD($f);\n";
print $rff "\tREAD_UINT64_FIELD($f);\n" unless $no_read;
}
- elsif ($t eq 'Oid' || $t eq 'RelFileNumber')
+ elsif ($t eq 'Oid')
{
print $off "\tWRITE_OID_FIELD($f);\n";
print $rff "\tREAD_OID_FIELD($f);\n" unless $no_read;
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 2cc0ac9..cdf19a9 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -154,6 +154,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
break;
case XLOG_NOOP:
case XLOG_NEXTOID:
+ case XLOG_NEXT_RELFILENUMBER:
case XLOG_SWITCH:
case XLOG_BACKUP_END:
case XLOG_PARAMETER_CHANGE:
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 03d9c9c..a0f398b 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -4932,7 +4932,7 @@ DisplayMapping(HTAB *tuplecid_data)
hash_seq_init(&hstat, tuplecid_data);
while ((ent = (ReorderBufferTupleCidEnt *) hash_seq_search(&hstat)) != NULL)
{
- elog(DEBUG3, "mapping: node: %u/%u/%u tid: %u/%u cmin: %u, cmax: %u",
+ elog(DEBUG3, "mapping: node: %u/%u/" UINT64_FORMAT " tid: %u/%u cmin: %u, cmax: %u",
ent->key.rlocator.dbOid,
ent->key.rlocator.spcOid,
ent->key.rlocator.relNumber,
diff --git a/src/backend/storage/file/reinit.c b/src/backend/storage/file/reinit.c
index 647c458..c3faa68 100644
--- a/src/backend/storage/file/reinit.c
+++ b/src/backend/storage/file/reinit.c
@@ -31,7 +31,7 @@ static void ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname,
typedef struct
{
- Oid reloid; /* hash key */
+ RelFileNumber relnumber; /* hash key */
} unlogged_relation_entry;
/*
@@ -184,10 +184,10 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
* need to be reset. Otherwise, this cleanup operation would be
* O(n^2).
*/
- ctl.keysize = sizeof(Oid);
+ ctl.keysize = sizeof(RelFileNumber);
ctl.entrysize = sizeof(unlogged_relation_entry);
ctl.hcxt = CurrentMemoryContext;
- hash = hash_create("unlogged relation OIDs", 32, &ctl,
+ hash = hash_create("unlogged relation RelFileNumbers", 32, &ctl,
HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
/* Scan the directory. */
@@ -208,10 +208,10 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
continue;
/*
- * Put the OID portion of the name into the hash table, if it
- * isn't already.
+ * Put the RELFILENUMBER portion of the name into the hash table,
+ * if it isn't already.
*/
- ent.reloid = atooid(de->d_name);
+ ent.relnumber = atorelnumber(de->d_name);
(void) hash_search(hash, &ent, HASH_ENTER, NULL);
}
@@ -248,10 +248,10 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
continue;
/*
- * See whether the OID portion of the name shows up in the hash
- * table. If so, nuke it!
+ * See whether the RELFILENUMBER portion of the name shows up in
+ * the hash table. If so, nuke it!
*/
- ent.reloid = atooid(de->d_name);
+ ent.relnumber = atorelnumber(de->d_name);
if (hash_search(hash, &ent, HASH_FIND, NULL))
{
snprintf(rm_path, sizeof(rm_path), "%s/%s",
@@ -286,7 +286,7 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
{
ForkNumber forkNum;
int relnumchars;
- char relnumbuf[OIDCHARS + 1];
+ char relnumbuf[RELNUMBERCHARS + 1];
char srcpath[MAXPGPATH * 2];
char dstpath[MAXPGPATH];
@@ -329,7 +329,7 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
{
ForkNumber forkNum;
int relnumchars;
- char relnumbuf[OIDCHARS + 1];
+ char relnumbuf[RELNUMBERCHARS + 1];
char mainpath[MAXPGPATH];
/* Skip anything that doesn't look like a relation data file. */
@@ -372,8 +372,8 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op)
* for a non-temporary relation and false otherwise.
*
* NB: If this function returns true, the caller is entitled to assume that
- * *relnumchars has been set to a value no more than OIDCHARS, and thus
- * that a buffer of OIDCHARS+1 characters is sufficient to hold the
+ * *relnumchars has been set to a value no more than RELNUMBERCHARS, and thus
+ * that a buffer of RELNUMBERCHARS+1 characters is sufficient to hold the
* RelFileNumber portion of the filename. This is critical to protect against
* a possible buffer overrun.
*/
@@ -386,7 +386,7 @@ parse_filename_for_nontemp_relation(const char *name, int *relnumchars,
/* Look for a non-empty string of digits (that isn't too long). */
for (pos = 0; isdigit((unsigned char) name[pos]); ++pos)
;
- if (pos == 0 || pos > OIDCHARS)
+ if (pos == 0 || pos > RELNUMBERCHARS)
return false;
*relnumchars = pos;
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index af4dab7..1210be7 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -273,7 +273,7 @@ restart:
BlockNumber blknum;
BufferGetTag(buf, &rlocator, &forknum, &blknum);
- elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
+ elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/" UINT64_FORMAT,
blknum, rlocator.spcOid, rlocator.dbOid, rlocator.relNumber);
/* make sure we hold an exclusive lock */
diff --git a/src/backend/storage/lmgr/lwlocknames.txt b/src/backend/storage/lmgr/lwlocknames.txt
index 6c7cf6c..3c5d041 100644
--- a/src/backend/storage/lmgr/lwlocknames.txt
+++ b/src/backend/storage/lmgr/lwlocknames.txt
@@ -53,3 +53,4 @@ XactTruncationLock 44
# 45 was XactTruncationLock until removal of BackendRandomLock
WrapLimitsVacuumLock 46
NotifyQueueTailLock 47
+RelFileNumberGenLock 48
\ No newline at end of file
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index a515bb3..bed47f0 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -257,6 +257,13 @@ mdcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo)
* next checkpoint, we prevent reassignment of the relfilenumber until it's
* safe, because relfilenumber assignment skips over any existing file.
*
+ * XXX. Although all of this was true when relfilenumbers were 32 bits wide,
+ * they are now 56 bits wide and do not wrap around, so in the future we can
+ * change the code to immediately unlink the first segment of the relation
+ * along with all the others. We still do reuse relfilenumbers when createdb()
+ * is performed using the file-copy method or during movedb(), but the scenario
+ * described above can only happen when creating a new relation.
+ *
* We do not need to go through this dance for temp relations, though, because
* we never make WAL entries for temp rels, and so a temp rel poses no threat
* to the health of a regular rel that has taken over its relfilenumber.
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index c1a5feb..ed46ac3 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -154,7 +154,7 @@ smgropen(RelFileLocator rlocator, BackendId backend)
/* First time through: initialize the hash table */
HASHCTL ctl;
- ctl.keysize = sizeof(RelFileLocatorBackend);
+ ctl.keysize = SizeOfRelFileLocatorBackend;
ctl.entrysize = sizeof(SMgrRelationData);
SMgrRelationHash = hash_create("smgr relation table", 400,
&ctl, HASH_ELEM | HASH_BLOBS);
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 34efa12..9f70f35 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -878,7 +878,7 @@ pg_relation_filenode(PG_FUNCTION_ARGS)
if (!RelFileNumberIsValid(result))
PG_RETURN_NULL();
- PG_RETURN_OID(result);
+ PG_RETURN_INT64(result);
}
/*
@@ -898,9 +898,12 @@ Datum
pg_filenode_relation(PG_FUNCTION_ARGS)
{
Oid reltablespace = PG_GETARG_OID(0);
- RelFileNumber relfilenumber = PG_GETARG_OID(1);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(1);
Oid heaprel;
+ /* check whether the relfilenumber is within a valid range */
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
+
/* test needed so RelidByRelfilenumber doesn't misbehave */
if (!RelFileNumberIsValid(relfilenumber))
PG_RETURN_NULL();
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 797f5f5..fc2faed 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -17,6 +17,7 @@
#include "catalog/pg_type.h"
#include "commands/extension.h"
#include "miscadmin.h"
+#include "storage/relfilelocator.h"
#include "utils/array.h"
#include "utils/builtins.h"
@@ -98,10 +99,12 @@ binary_upgrade_set_next_heap_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_heap_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_heap_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -120,10 +123,12 @@ binary_upgrade_set_next_index_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_index_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_index_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
@@ -142,10 +147,12 @@ binary_upgrade_set_next_toast_pg_class_oid(PG_FUNCTION_ARGS)
Datum
binary_upgrade_set_next_toast_relfilenode(PG_FUNCTION_ARGS)
{
- RelFileNumber relfilenumber = PG_GETARG_OID(0);
+ RelFileNumber relfilenumber = PG_GETARG_INT64(0);
CHECK_IS_BINARY_UPGRADE;
+ CHECK_RELFILENUMBER_RANGE(relfilenumber);
binary_upgrade_next_toast_pg_class_relfilenumber = relfilenumber;
+ SetNextRelFileNumber(relfilenumber + 1);
PG_RETURN_VOID();
}
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 00dc0f2..6f4e96d 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -3712,7 +3712,7 @@ RelationSetNewRelfilenumber(Relation relation, char persistence)
{
/* Allocate a new relfilenumber */
newrelfilenumber = GetNewRelFileNumber(relation->rd_rel->reltablespace,
- NULL, persistence);
+ persistence);
}
else if (relation->rd_rel->relkind == RELKIND_INDEX)
{
diff --git a/src/backend/utils/cache/relfilenumbermap.c b/src/backend/utils/cache/relfilenumbermap.c
index c4245d5..2e0acf9 100644
--- a/src/backend/utils/cache/relfilenumbermap.c
+++ b/src/backend/utils/cache/relfilenumbermap.c
@@ -196,7 +196,7 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
/* set scan arguments */
skey[0].sk_argument = ObjectIdGetDatum(reltablespace);
- skey[1].sk_argument = ObjectIdGetDatum(relfilenumber);
+ skey[1].sk_argument = Int64GetDatum((int64) relfilenumber);
scandesc = systable_beginscan(relation,
ClassTblspcRelfilenodeIndexId,
@@ -213,7 +213,7 @@ RelidByRelfilenumber(Oid reltablespace, RelFileNumber relfilenumber)
if (found)
elog(ERROR,
- "unexpected duplicate for tablespace %u, relfilenumber %u",
+ "unexpected duplicate for tablespace %u, relfilenumber " UINT64_FORMAT,
reltablespace, relfilenumber);
found = true;
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index 781f8b8..d441cd9 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -79,8 +79,8 @@ pg_control_system(PG_FUNCTION_ARGS)
Datum
pg_control_checkpoint(PG_FUNCTION_ARGS)
{
- Datum values[18];
- bool nulls[18];
+ Datum values[19];
+ bool nulls[19];
TupleDesc tupdesc;
HeapTuple htup;
ControlFileData *ControlFile;
@@ -129,6 +129,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
XIDOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 18, "checkpoint_time",
TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 19, "next_relfilenumber",
+ INT8OID, -1, 0);
tupdesc = BlessTupleDesc(tupdesc);
/* Read the control file. */
@@ -202,6 +204,9 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
values[17] = TimestampTzGetDatum(time_t_to_timestamptz(ControlFile->checkPointCopy.time));
nulls[17] = false;
+ values[18] = Int64GetDatum((int64) ControlFile->checkPointCopy.nextRelFileNumber);
+ nulls[18] = false;
+
htup = heap_form_tuple(tupdesc, values, nulls);
PG_RETURN_DATUM(HeapTupleGetDatum(htup));
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 324ccf7..ddb5ec1 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -485,9 +485,7 @@ main(int argc, char *argv[])
mode = PG_MODE_ENABLE;
break;
case 'f':
- if (!option_parse_int(optarg, "-f/--filenode", 0,
- INT_MAX,
- NULL))
+ if (!option_parse_relfilenumber(optarg, "-f/--filenode"))
exit(1);
only_filenode = pstrdup(optarg);
break;
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index c390ec5..2f0e91f 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -250,6 +250,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
EpochFromFullTransactionId(ControlFile->checkPointCopy.nextXid),
XidFromFullTransactionId(ControlFile->checkPointCopy.nextXid));
+ printf(_("Latest checkpoint's NextRelFileNumber: %llu\n"),
+ (unsigned long long) ControlFile->checkPointCopy.nextRelFileNumber);
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index bd9b066..9f78971 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3184,15 +3184,15 @@ dumpDatabase(Archive *fout)
atooid(PQgetvalue(lo_res, i, ii_oid)));
oid = atooid(PQgetvalue(lo_res, i, ii_oid));
- relfilenumber = atooid(PQgetvalue(lo_res, i, ii_relfilenode));
+ relfilenumber = atorelnumber(PQgetvalue(lo_res, i, ii_relfilenode));
if (oid == LargeObjectRelationId)
appendPQExpBuffer(loOutQry,
- "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('" UINT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
else if (oid == LargeObjectLOidPNIndexId)
appendPQExpBuffer(loOutQry,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" UINT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
}
@@ -4877,16 +4877,16 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
relkind = *PQgetvalue(upgrade_res, 0, PQfnumber(upgrade_res, "relkind"));
- relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "relfilenode")));
+ relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "relfilenode")));
toast_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "reltoastrelid")));
- toast_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_relfilenode")));
+ toast_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_relfilenode")));
toast_index_oid = atooid(PQgetvalue(upgrade_res, 0,
PQfnumber(upgrade_res, "indexrelid")));
- toast_index_relfilenumber = atooid(PQgetvalue(upgrade_res, 0,
- PQfnumber(upgrade_res, "toast_index_relfilenode")));
+ toast_index_relfilenumber = atorelnumber(PQgetvalue(upgrade_res, 0,
+ PQfnumber(upgrade_res, "toast_index_relfilenode")));
appendPQExpBufferStr(upgrade_buffer,
"\n-- For binary upgrade, must preserve pg_class oids and relfilenodes\n");
@@ -4904,7 +4904,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
*/
if (RelFileNumberIsValid(relfilenumber) && relkind != RELKIND_PARTITIONED_TABLE)
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('" UINT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
/*
@@ -4918,7 +4918,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_toast_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_toast_relfilenode('" UINT64_FORMAT "'::pg_catalog.int8);\n",
toast_relfilenumber);
/* every toast table has an index */
@@ -4926,7 +4926,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
toast_index_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" UINT64_FORMAT "'::pg_catalog.int8);\n",
toast_index_relfilenumber);
}
@@ -4939,7 +4939,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
"SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
pg_class_oid);
appendPQExpBuffer(upgrade_buffer,
- "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('%u'::pg_catalog.oid);\n",
+ "SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('" UINT64_FORMAT "'::pg_catalog.int8);\n",
relfilenumber);
}
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 269ed64..197ec0e 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -538,7 +538,7 @@ isRelDataFile(const char *path)
segNo = 0;
matched = false;
- nmatch = sscanf(path, "global/%u.%u", &rlocator.relNumber, &segNo);
+ nmatch = sscanf(path, "global/" UINT64_FORMAT ".%u", &rlocator.relNumber, &segNo);
if (nmatch == 1 || nmatch == 2)
{
rlocator.spcOid = GLOBALTABLESPACE_OID;
@@ -547,7 +547,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "base/%u/%u.%u",
+ nmatch = sscanf(path, "base/%u/" UINT64_FORMAT ".%u",
&rlocator.dbOid, &rlocator.relNumber, &segNo);
if (nmatch == 2 || nmatch == 3)
{
@@ -556,7 +556,7 @@ isRelDataFile(const char *path)
}
else
{
- nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/%u.%u",
+ nmatch = sscanf(path, "pg_tblspc/%u/" TABLESPACE_VERSION_DIRECTORY "/%u/" UINT64_FORMAT ".%u",
&rlocator.spcOid, &rlocator.dbOid, &rlocator.relNumber,
&segNo);
if (nmatch == 3 || nmatch == 4)
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index f18cf97..0c712a6 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -527,7 +527,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
relname = PQgetvalue(res, relnum, i_relname);
curr->relname = pg_strdup(relname);
- curr->relfilenumber = atooid(PQgetvalue(res, relnum, i_relfilenumber));
+ curr->relfilenumber =
+ atorelnumber(PQgetvalue(res, relnum, i_relfilenumber));
curr->tblsp_alloc = false;
/* Is the tablespace oid non-default? */
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index 115faa2..7ab1bcc 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -15,10 +15,8 @@
* oids are the same between old and new clusters. This is important
* because toast oids are stored as toast pointers in user tables.
*
- * While pg_class.oid and pg_class.relfilenode are initially the same in a
- * cluster, they can diverge due to CLUSTER, REINDEX, or VACUUM FULL. We
- * control assignments of pg_class.relfilenode because we want the filenames
- * to match between the old and new cluster.
+ * We control assignments of pg_class.relfilenode because we want the
+ * filenames to match between the old and new cluster.
*
* We control assignment of pg_tablespace.oid because we want the oid to match
* between the old and new cluster.
diff --git a/src/bin/pg_upgrade/relfilenumber.c b/src/bin/pg_upgrade/relfilenumber.c
index c3f3d6b..529267d 100644
--- a/src/bin/pg_upgrade/relfilenumber.c
+++ b/src/bin/pg_upgrade/relfilenumber.c
@@ -190,14 +190,14 @@ transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_fro
else
snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
- snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
+ snprintf(old_file, sizeof(old_file), "%s%s/%u/" UINT64_FORMAT "%s%s",
map->old_tablespace,
map->old_tablespace_suffix,
map->db_oid,
map->relfilenumber,
type_suffix,
extent_suffix);
- snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
+ snprintf(new_file, sizeof(new_file), "%s%s/%u/" UINT64_FORMAT "%s%s",
map->new_tablespace,
map->new_tablespace_suffix,
map->db_oid,
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 9993378..6fdc7dc 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -884,7 +884,7 @@ main(int argc, char **argv)
}
break;
case 'R':
- if (sscanf(optarg, "%u/%u/%u",
+ if (sscanf(optarg, "%u/%u/" UINT64_FORMAT,
&config.filter_by_relation.spcOid,
&config.filter_by_relation.dbOid,
&config.filter_by_relation.relNumber) != 3 ||
diff --git a/src/bin/scripts/t/090_reindexdb.pl b/src/bin/scripts/t/090_reindexdb.pl
index e706d68..de5cee6 100644
--- a/src/bin/scripts/t/090_reindexdb.pl
+++ b/src/bin/scripts/t/090_reindexdb.pl
@@ -40,7 +40,7 @@ my $toast_index = $node->safe_psql('postgres',
# REINDEX operations. A set of relfilenodes is saved from the catalogs
# and then compared with pg_class.
$node->safe_psql('postgres',
- 'CREATE TABLE index_relfilenodes (parent regclass, indname text, indoid oid, relfilenode oid);'
+ 'CREATE TABLE index_relfilenodes (parent regclass, indname text, indoid oid, relfilenode int8);'
);
# Save the relfilenode of a set of toast indexes, one from the catalog
# pg_constraint and one from the test table.
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 1b6b620..d0d83e5 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -149,10 +149,10 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
Assert(dbOid == 0);
Assert(backendId == InvalidBackendId);
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("global/%u_%s",
+ path = psprintf("global/" UINT64_FORMAT "_%s",
relNumber, forkNames[forkNumber]);
else
- path = psprintf("global/%u", relNumber);
+ path = psprintf("global/" UINT64_FORMAT, relNumber);
}
else if (spcOid == DEFAULTTABLESPACE_OID)
{
@@ -160,21 +160,21 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/%u_%s",
+ path = psprintf("base/%u/" UINT64_FORMAT "_%s",
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/%u",
+ path = psprintf("base/%u/" UINT64_FORMAT,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("base/%u/t%d_%u_%s",
+ path = psprintf("base/%u/t%d_" UINT64_FORMAT "_%s",
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("base/%u/t%d_%u",
+ path = psprintf("base/%u/t%d_" UINT64_FORMAT,
dbOid, backendId, relNumber);
}
}
@@ -184,24 +184,24 @@ GetRelationPath(Oid dbOid, Oid spcOid, RelFileNumber relNumber,
if (backendId == InvalidBackendId)
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/" UINT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/" UINT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, relNumber);
}
else
{
if (forkNumber != MAIN_FORKNUM)
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u_%s",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" UINT64_FORMAT "_%s",
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber,
forkNames[forkNumber]);
else
- path = psprintf("pg_tblspc/%u/%s/%u/t%d_%u",
+ path = psprintf("pg_tblspc/%u/%s/%u/t%d_" UINT64_FORMAT,
spcOid, TABLESPACE_VERSION_DIRECTORY,
dbOid, backendId, relNumber);
}
diff --git a/src/fe_utils/option_utils.c b/src/fe_utils/option_utils.c
index abea881..50ddf74 100644
--- a/src/fe_utils/option_utils.c
+++ b/src/fe_utils/option_utils.c
@@ -13,6 +13,7 @@
#include "postgres_fe.h"
#include "common/logging.h"
+#include "common/relpath.h"
#include "common/string.h"
#include "fe_utils/option_utils.h"
@@ -82,3 +83,42 @@ option_parse_int(const char *optarg, const char *optname,
*result = val;
return true;
}
+
+/*
+ * option_parse_relfilenumber
+ *
+ * Parse relfilenumber value for an option. If the parsing is successful,
+ * returns true; if parsing fails, returns false.
+ */
+bool
+option_parse_relfilenumber(const char *optarg, const char *optname)
+{
+ char *endptr;
+ uint64 val;
+
+ errno = 0;
+ val = strtou64(optarg, &endptr, 10);
+
+ /*
+ * Skip any trailing whitespace; if anything but whitespace remains before
+ * the terminating character, fail.
+ */
+ while (*endptr != '\0' && isspace((unsigned char) *endptr))
+ endptr++;
+
+ if (*endptr != '\0')
+ {
+ pg_log_error("invalid value \"%s\" for option %s",
+ optarg, optname);
+ return false;
+ }
+
+ if (val > MAX_RELFILENUMBER)
+ {
+ pg_log_error("%s must be in range " UINT64_FORMAT ".." UINT64_FORMAT,
+ optname, (RelFileNumber) 0, MAX_RELFILENUMBER);
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 775471d..06fb2da 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -15,6 +15,7 @@
#define TRANSAM_H
#include "access/xlogdefs.h"
+#include "common/relpath.h"
/* ----------------
@@ -196,6 +197,28 @@ FullTransactionIdAdvance(FullTransactionId *dest)
#define FirstUnpinnedObjectId 12000
#define FirstNormalObjectId 16384
+/* ----------
+ * For the system tables (OID < FirstNormalObjectId), the initial storage is
+ * created with a relfilenumber equal to the table's Oid. Relfilenumbers for
+ * any other storage are allocated by GetNewRelFileNumber(), which starts at
+ * 100000. Thus, when upgrading from an older cluster, the relation storage
+ * paths for user tables from the old cluster cannot conflict with the
+ * relation storage paths for system tables in the new cluster. In any case,
+ * the new cluster must not contain any user tables while upgrading, so we
+ * needn't worry about them.
+ * ----------
+ */
+#define FirstNormalRelFileNumber ((RelFileNumber) 100000)
+
+#define CHECK_RELFILENUMBER_RANGE(relfilenumber) \
+do { \
+ if ((relfilenumber) < 0 || (relfilenumber) > MAX_RELFILENUMBER) \
+ ereport(ERROR, \
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE), \
+ errmsg("relfilenumber %llu is out of range", \
+ (unsigned long long) (relfilenumber))); \
+} while (0)
+
/*
* VariableCache is a data structure in shared memory that is used to track
* OID and XID assignment state. For largely historical reasons, there is
@@ -215,6 +238,15 @@ typedef struct VariableCacheData
uint32 oidCount; /* OIDs available before must do XLOG work */
/*
+ * These fields are protected by RelFileNumberGenLock.
+ */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber to assign */
+ RelFileNumber loggedRelFileNumber; /* last logged relfilenumber */
+ RelFileNumber flushedRelFileNumber; /* last flushed relfilenumber */
+ XLogRecPtr loggedRelFileNumberRecPtr; /* xlog record pointer w.r.t.
+ * loggedRelFileNumber */
+
+ /*
* These fields are protected by XidGenLock.
*/
FullTransactionId nextXid; /* next XID to assign */
@@ -293,6 +325,9 @@ extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
extern bool ForceTransactionIdLimitUpdate(void);
extern Oid GetNewObjectId(void);
+extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
+ char relpersistence);
+extern void SetNextRelFileNumber(RelFileNumber relnumber);
extern void StopGeneratingPinnedObjectIds(void);
#ifdef USE_ASSERT_CHECKING
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index dce2650..5337586 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -236,6 +236,7 @@ extern void CreateCheckPoint(int flags);
extern bool CreateRestartPoint(int flags);
extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN);
extern void XLogPutNextOid(Oid nextOid);
+extern XLogRecPtr LogNextRelFileNumber(RelFileNumber nextrelnumber);
extern XLogRecPtr XLogRestorePoint(const char *rpName);
extern void UpdateFullPageWrites(void);
extern void GetFullPageWriteInfo(XLogRecPtr *RedoRecPtr_p, bool *doPageWrites_p);
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index e1c85f9..b452530 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -38,8 +38,5 @@ extern bool IsPinnedObject(Oid classId, Oid objectId);
extern Oid GetNewOidWithIndex(Relation relation, Oid indexId,
AttrNumber oidcolumn);
-extern RelFileNumber GetNewRelFileNumber(Oid reltablespace,
- Relation pg_class,
- char relpersistence);
#endif /* CATALOG_H */
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index e1f4eef..4768e5e 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -34,6 +34,13 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* oid */
Oid oid;
+ /* access method; 0 if not a table / index */
+ Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
+
+ /* identifier of physical storage file */
+ /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
+ int64 relfilenode BKI_DEFAULT(0);
+
/* class name */
NameData relname;
@@ -49,13 +56,6 @@ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,Relat
/* class owner */
Oid relowner BKI_DEFAULT(POSTGRES) BKI_LOOKUP(pg_authid);
- /* access method; 0 if not a table / index */
- Oid relam BKI_DEFAULT(heap) BKI_LOOKUP_OPT(pg_am);
-
- /* identifier of physical storage file */
- /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
- Oid relfilenode BKI_DEFAULT(0);
-
/* identifier of table space for relation (0 means default for database) */
Oid reltablespace BKI_DEFAULT(0) BKI_LOOKUP_OPT(pg_tablespace);
@@ -154,7 +154,7 @@ typedef FormData_pg_class *Form_pg_class;
DECLARE_UNIQUE_INDEX_PKEY(pg_class_oid_index, 2662, ClassOidIndexId, on pg_class using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_class_relname_nsp_index, 2663, ClassNameNspIndexId, on pg_class using btree(relname name_ops, relnamespace oid_ops));
-DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode oid_ops));
+DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, ClassTblspcRelfilenodeIndexId, on pg_class using btree(reltablespace oid_ops, relfilenode int8_ops));
#ifdef EXPOSE_TO_CLIENT_CODE
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 06368e2..096222f 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -41,6 +41,7 @@ typedef struct CheckPoint
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
FullTransactionId nextXid; /* next free transaction ID */
+ RelFileNumber nextRelFileNumber; /* next relfilenumber */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
@@ -78,6 +79,7 @@ typedef struct CheckPoint
#define XLOG_FPI 0xB0
/* 0xC0 is used in Postgres 9.5-11 */
#define XLOG_OVERWRITE_CONTRECORD 0xD0
+#define XLOG_NEXT_RELFILENUMBER 0xE0
/*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index a07e737..8b72f8a 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7329,11 +7329,11 @@
proname => 'pg_indexes_size', provolatile => 'v', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_indexes_size' },
{ oid => '2999', descr => 'filenode identifier of relation',
- proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'oid',
+ proname => 'pg_relation_filenode', provolatile => 's', prorettype => 'int8',
proargtypes => 'regclass', prosrc => 'pg_relation_filenode' },
{ oid => '3454', descr => 'relation OID for filenode and tablespace',
proname => 'pg_filenode_relation', provolatile => 's',
- prorettype => 'regclass', proargtypes => 'oid oid',
+ prorettype => 'regclass', proargtypes => 'oid int8',
prosrc => 'pg_filenode_relation' },
{ oid => '3034', descr => 'file path of relation',
proname => 'pg_relation_filepath', provolatile => 's', prorettype => 'text',
@@ -11125,15 +11125,15 @@
prosrc => 'binary_upgrade_set_missing_value' },
{ oid => '4545', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_heap_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_heap_relfilenode' },
{ oid => '4546', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_index_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_index_relfilenode' },
{ oid => '4547', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_toast_relfilenode', provolatile => 'v',
- proparallel => 'u', prorettype => 'void', proargtypes => 'oid',
+ proparallel => 'u', prorettype => 'void', proargtypes => 'int8',
prosrc => 'binary_upgrade_set_next_toast_relfilenode' },
{ oid => '4548', descr => 'for use by pg_upgrade',
proname => 'binary_upgrade_set_next_pg_tablespace_oid', provolatile => 'v',
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index 4bbd943..2d3b52f 100644
--- a/src/include/common/relpath.h
+++ b/src/include/common/relpath.h
@@ -22,10 +22,12 @@
/*
* RelFileNumber data type identifies the specific relation file name.
*/
-typedef Oid RelFileNumber;
-#define InvalidRelFileNumber ((RelFileNumber) InvalidOid)
+typedef uint64 RelFileNumber;
+#define InvalidRelFileNumber ((RelFileNumber) 0)
#define RelFileNumberIsValid(relnumber) \
((bool) ((relnumber) != InvalidRelFileNumber))
+#define atorelnumber(x) ((RelFileNumber) strtou64((x), NULL, 10))
+#define MAX_RELFILENUMBER UINT64CONST(0x00FFFFFFFFFFFFFF)
/*
* Name of major-version-specific tablespace subdirectories
@@ -35,6 +37,7 @@ typedef Oid RelFileNumber;
/* Characters to allow for an OID in a relation path */
#define OIDCHARS 10 /* max chars printed by %u */
+#define RELNUMBERCHARS 20 /* max chars printed by UINT64_FORMAT */
/*
* Stuff for fork names.
diff --git a/src/include/fe_utils/option_utils.h b/src/include/fe_utils/option_utils.h
index 03c09fd..2508a61 100644
--- a/src/include/fe_utils/option_utils.h
+++ b/src/include/fe_utils/option_utils.h
@@ -22,5 +22,7 @@ extern void handle_help_version_opts(int argc, char *argv[],
extern bool option_parse_int(const char *optarg, const char *optname,
int min_range, int max_range,
int *result);
+extern bool option_parse_relfilenumber(const char *optarg,
+ const char *optname);
#endif /* OPTION_UTILS_H */
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 406db6b..d48aa6d 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -92,29 +92,69 @@ typedef struct buftag
{
Oid spcOid; /* tablespace oid */
Oid dbOid; /* database oid */
- RelFileNumber relNumber; /* relation file number */
- ForkNumber forkNum; /* fork number */
+
+ /*
+ * Stores the relfilenumber and the fork number. The high 8 bits of
+ * the first 32-bit integer hold the fork number, while the remaining
+ * 24 bits of the first integer and all 32 bits of the second hold the
+ * relfilenumber, making the relfilenumber 56 bits wide. We use 56 bits
+ * rather than 64 because a full 64-bit field would enlarge the
+ * BufferTag, and we use two 32-bit integers instead of a single 64-bit
+ * one to avoid 8-byte alignment padding in the BufferTag structure.
+ */
+ uint32 relForkDetails[2];
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
+/* High relNumber bits in relForkDetails[0] */
+#define BUFTAG_RELNUM_HIGH_BITS 24
+
+/* Low relNumber bits in relForkDetails[1] */
+#define BUFTAG_RELNUM_LOW_BITS 32
+
+/* Mask to fetch high bits of relNumber from relForkDetails[0] */
+#define BUFTAG_RELNUM_HIGH_MASK ((1U << BUFTAG_RELNUM_HIGH_BITS) - 1)
+
+/* Mask to fetch low bits of relNumber from relForkDetails[1] */
+#define BUFTAG_RELNUM_LOW_MASK 0XFFFFFFFF
+
static inline RelFileNumber
BufTagGetRelNumber(const BufferTag *tag)
{
- return tag->relNumber;
+ uint64 relnum;
+
+ relnum = ((uint64) tag->relForkDetails[0]) & BUFTAG_RELNUM_HIGH_MASK;
+ relnum = (relnum << BUFTAG_RELNUM_LOW_BITS) | tag->relForkDetails[1];
+
+ Assert(relnum <= MAX_RELFILENUMBER);
+ return (RelFileNumber) relnum;
}
static inline ForkNumber
BufTagGetForkNum(const BufferTag *tag)
{
- return tag->forkNum;
+ ForkNumber ret;
+
+ StaticAssertStmt(MAX_FORKNUM <= INT8_MAX,
+ "MAX_FORKNUM can't be greater than INT8_MAX");
+
+ ret = (int8) (tag->relForkDetails[0] >> BUFTAG_RELNUM_HIGH_BITS);
+ return ret;
}
static inline void
BufTagSetRelForkDetails(BufferTag *tag, RelFileNumber relnumber,
ForkNumber forknum)
{
- tag->relNumber = relnumber;
- tag->forkNum = forknum;
+ Assert(relnumber <= MAX_RELFILENUMBER);
+ Assert(forknum <= MAX_FORKNUM);
+
+ tag->relForkDetails[0] = (relnumber >> BUFTAG_RELNUM_LOW_BITS) &
+ BUFTAG_RELNUM_HIGH_MASK;
+ tag->relForkDetails[0] |= (forknum << BUFTAG_RELNUM_HIGH_BITS);
+ tag->relForkDetails[1] = relnumber & BUFTAG_RELNUM_LOW_MASK;
}
static inline RelFileLocator
@@ -153,9 +193,9 @@ BufferTagsEqual(const BufferTag *tag1, const BufferTag *tag2)
{
return (tag1->spcOid == tag2->spcOid) &&
(tag1->dbOid == tag2->dbOid) &&
- (tag1->relNumber == tag2->relNumber) &&
- (tag1->blockNum == tag2->blockNum) &&
- (tag1->forkNum == tag2->forkNum);
+ (tag1->relForkDetails[0] == tag2->relForkDetails[0]) &&
+ (tag1->relForkDetails[1] == tag2->relForkDetails[1]) &&
+ (tag1->blockNum == tag2->blockNum);
}
static inline bool
diff --git a/src/include/storage/relfilelocator.h b/src/include/storage/relfilelocator.h
index 10f41f3..ef90464 100644
--- a/src/include/storage/relfilelocator.h
+++ b/src/include/storage/relfilelocator.h
@@ -32,10 +32,11 @@
* Nonzero dbOid values correspond to pg_database.oid.
*
* relNumber identifies the specific relation. relNumber corresponds to
- * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
- * to assign new physical files to relations in some situations).
- * Notice that relNumber is only unique within a database in a particular
- * tablespace.
+ * pg_class.relfilenode. Notice that relNumber values are assigned by
+ * GetNewRelFileNumber(), which will only ever assign the same value once
+ * during the lifetime of a cluster. However, since CREATE DATABASE duplicates
+ * the relfilenumbers of the template database, the values are in practice only
+ * unique within a database, not globally.
*
* Note: spcOid must be GLOBALTABLESPACE_OID if and only if dbOid is
* zero. We support shared relations only in the "global" tablespace.
@@ -75,6 +76,9 @@ typedef struct RelFileLocatorBackend
BackendId backend;
} RelFileLocatorBackend;
+#define SizeOfRelFileLocatorBackend \
+ (offsetof(RelFileLocatorBackend, backend) + sizeof(BackendId))
+
#define RelFileLocatorBackendIsTemp(rlocator) \
((rlocator).backend != InvalidBackendId)
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index 346f594..86666b8 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -2164,9 +2164,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2175,10 +2174,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+---------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | t | own | child 0 index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | t | own | child 1 index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | t | orig | child 0 index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | t | orig | child 1 index
at_partitioned_id_name_key | t | none | parent index
(6 rows)
@@ -2198,9 +2197,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -2209,10 +2207,10 @@ select relname,
relname | orig_oid | storage | desc
------------------------------+----------+---------+--------------
at_partitioned | t | none |
- at_partitioned_0 | t | own |
- at_partitioned_0_id_name_key | f | own | parent index
- at_partitioned_1 | t | own |
- at_partitioned_1_id_name_key | f | own | parent index
+ at_partitioned_0 | t | orig |
+ at_partitioned_0_id_name_key | f | new | parent index
+ at_partitioned_1 | t | orig |
+ at_partitioned_1_id_name_key | f | new | parent index
at_partitioned_id_name_key | f | none | parent index
(6 rows)
@@ -2560,7 +2558,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/expected/fast_default.out b/src/test/regress/expected/fast_default.out
index 91f2571..0a35f33 100644
--- a/src/test/regress/expected/fast_default.out
+++ b/src/test/regress/expected/fast_default.out
@@ -3,8 +3,8 @@
--
SET search_path = fast_default;
CREATE SCHEMA fast_default;
-CREATE TABLE m(id OID);
-INSERT INTO m VALUES (NULL::OID);
+CREATE TABLE m(id BIGINT);
+INSERT INTO m VALUES (NULL::BIGINT);
CREATE FUNCTION set(tabname name) RETURNS VOID
AS $$
BEGIN
diff --git a/src/test/regress/expected/oidjoins.out b/src/test/regress/expected/oidjoins.out
index 215eb89..af57470 100644
--- a/src/test/regress/expected/oidjoins.out
+++ b/src/test/regress/expected/oidjoins.out
@@ -74,11 +74,11 @@ NOTICE: checking pg_type {typcollation} => pg_collation {oid}
NOTICE: checking pg_attribute {attrelid} => pg_class {oid}
NOTICE: checking pg_attribute {atttypid} => pg_type {oid}
NOTICE: checking pg_attribute {attcollation} => pg_collation {oid}
+NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {relnamespace} => pg_namespace {oid}
NOTICE: checking pg_class {reltype} => pg_type {oid}
NOTICE: checking pg_class {reloftype} => pg_type {oid}
NOTICE: checking pg_class {relowner} => pg_authid {oid}
-NOTICE: checking pg_class {relam} => pg_am {oid}
NOTICE: checking pg_class {reltablespace} => pg_tablespace {oid}
NOTICE: checking pg_class {reltoastrelid} => pg_class {oid}
NOTICE: checking pg_class {relrewrite} => pg_class {oid}
diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql
index 9f773ae..a67eb5f 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -1478,9 +1478,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1499,9 +1498,8 @@ select relname,
c.oid = oldoid as orig_oid,
case relfilenode
when 0 then 'none'
- when c.oid then 'own'
when oldfilenode then 'orig'
- else 'OTHER'
+ else 'new'
end as storage,
obj_description(c.oid, 'pg_class') as desc
from pg_class c left join old_oids using (relname)
@@ -1641,7 +1639,7 @@ CREATE FUNCTION check_ddl_rewrite(p_tablename regclass, p_ddl text)
RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
- v_relfilenode oid;
+ v_relfilenode int8;
BEGIN
v_relfilenode := relfilenode FROM pg_class WHERE oid = p_tablename;
diff --git a/src/test/regress/sql/fast_default.sql b/src/test/regress/sql/fast_default.sql
index 16a3b7c..819ec40 100644
--- a/src/test/regress/sql/fast_default.sql
+++ b/src/test/regress/sql/fast_default.sql
@@ -4,8 +4,8 @@
SET search_path = fast_default;
CREATE SCHEMA fast_default;
-CREATE TABLE m(id OID);
-INSERT INTO m VALUES (NULL::OID);
+CREATE TABLE m(id BIGINT);
+INSERT INTO m VALUES (NULL::BIGINT);
CREATE FUNCTION set(tabname name) RETURNS VOID
AS $$
--
1.8.3.1
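The buf_internals.h hunk above packs a 56-bit relfilenumber and an 8-bit fork number into two 32-bit words so the BufferTag does not grow or pick up alignment padding. A standalone sketch of that bit layout, written in Python rather than C for brevity and assuming non-negative fork numbers (the C code additionally casts through int8):

```python
RELNUM_HIGH_BITS = 24                     # high relNumber bits live in word 0
RELNUM_LOW_BITS = 32                      # low relNumber bits live in word 1
RELNUM_HIGH_MASK = (1 << RELNUM_HIGH_BITS) - 1
RELNUM_LOW_MASK = 0xFFFFFFFF
MAX_RELFILENUMBER = 0x00FFFFFFFFFFFFFF    # 56 bits, as in relpath.h

def set_rel_fork(relnumber, forknum):
    """Pack relnumber (<= 56 bits) and forknum into two uint32 words,
    mirroring BufTagSetRelForkDetails()."""
    assert 0 <= relnumber <= MAX_RELFILENUMBER
    assert 0 <= forknum <= 0x7F
    word0 = (relnumber >> RELNUM_LOW_BITS) & RELNUM_HIGH_MASK
    word0 |= forknum << RELNUM_HIGH_BITS
    word1 = relnumber & RELNUM_LOW_MASK
    return word0, word1

def get_rel_number(word0, word1):
    """Mirror of BufTagGetRelNumber(): splice the 24 high bits back onto
    the 32 low bits."""
    return ((word0 & RELNUM_HIGH_MASK) << RELNUM_LOW_BITS) | word1

def get_fork_num(word0, word1):
    """Mirror of BufTagGetForkNum(): fork number sits in the top byte."""
    return word0 >> RELNUM_HIGH_BITS
```

The round trip preserves both fields; for example, packing relfilenumber `0x00ABCDEF12345678` with fork 3 yields words `0x03ABCDEF` and `0x12345678`, from which both values can be recovered.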
On Tue, Sep 27, 2022 at 2:33 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
Looks fine to me.
OK, committed. I also committed the 0002 patch with some wordsmithing,
and I removed a < 0 test on an unsigned value because my compiler
complained about it. 0001 turned out to make headerscheck sad, so I
just pushed a fix for that, too.
I'm not too sure about 0003. I think if we need an is_shared flag
maybe we might as well just pass the tablespace OID. The is_shared
flag seems to just make things a bit complicated for the callers for
no real benefit.
--
Robert Haas
EDB: http://www.enterprisedb.com
Hi Dilip,
I am very happy to see these commits. Here's some belated review for
the tombstone-removal patch.
v7-0004-Don-t-delay-removing-Tombstone-file-until-next.patch
More things you can remove:
* sync_unlinkfiletag in struct SyncOps
* the macro UNLINKS_PER_ABSORB
* global variable pendingUnlinks
This comment after the question mark is obsolete:
* XXX should we CHECK_FOR_INTERRUPTS in this loop?  Escaping with an
* error in the case of SYNC_UNLINK_REQUEST would leave the
* no-longer-used file still present on disk, which would be bad, so
* I'm inclined to assume that the checkpointer will always empty the
* queue soon.
(I think if the answer to the question is now yes, then we should
replace the stupid sleep with a condition variable sleep, but there's
another thread about that somewhere).
In a couple of places in dbcommands.c you could now make this change:
     /*
-     * Force a checkpoint before starting the copy. This will force all dirty
-     * buffers, including those of unlogged tables, out to disk, to ensure
-     * source database is up-to-date on disk for the copy.
-     * FlushDatabaseBuffers() would suffice for that, but we also want to
-     * process any pending unlink requests. Otherwise, if a checkpoint
-     * happened while we're copying files, a file might be deleted just when
-     * we're about to copy it, causing the lstat() call in copydir() to fail
-     * with ENOENT.
+     * Force all dirty buffers, including those of unlogged tables, out to
+     * disk, to ensure source database is up-to-date on disk for the copy.
      */
-    RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_FORCE |
-                      CHECKPOINT_WAIT | CHECKPOINT_FLUSH_ALL);
+    FlushDatabaseBuffers(src_dboid);
More obsolete comments you could change:

     * If we were copying database at block levels then drop pages for the
     * destination database that are in the shared buffer cache. And tell
-->  * checkpointer to forget any pending fsync and unlink requests for files

-->  * Tell checkpointer to forget any pending fsync and unlink requests for
     * files in the database; else the fsyncs will fail at next checkpoint, or
     * worse, it will delete file
In tablespace.c I think you could now make this change:

     if (!destroy_tablespace_directories(tablespaceoid, false))
     {
-        /*
-         * Not all files deleted? However, there can be lingering empty files
-         * in the directories, left behind by for example DROP TABLE, that
-         * have been scheduled for deletion at next checkpoint (see comments
-         * in mdunlink() for details). We could just delete them immediately,
-         * but we can't tell them apart from important data files that we
-         * mustn't delete. So instead, we force a checkpoint which will clean
-         * out any lingering files, and try again.
-         */
-        RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_FORCE | CHECKPOINT_WAIT);
-
+#ifdef WIN32
         /*
          * On Windows, an unlinked file persists in the directory listing
          * until no process retains an open handle for the file. The DDL
@@ -523,6 +513,7 @@ DropTableSpace(DropTableSpaceStmt *stmt)

         /* And now try again. */
         if (!destroy_tablespace_directories(tablespaceoid, false))
+#endif
         {
             /* Still not empty, the files must be important then */
             ereport(ERROR,
On Wed, Sep 28, 2022 at 9:23 AM Thomas Munro <thomas.munro@gmail.com> wrote:
Thanks, Thomas, these all look fine to me. So far we have committed
the patch to make relfilenode 56 bits wide. The tombstone file
removal patch is still pending to be committed, so when I rebase
that patch I will incorporate all these comments.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Wed, Sep 28, 2022 at 9:40 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
Thanks, Thomas, these all look fine to me. So far we have committed
the patch to make relfilenode 56 bits wide. The tombstone file
removal patch is still pending to be committed, so when I rebase
that patch I will incorporate all these comments.
I noticed that your new unlinking algorithm goes like this:
stat("x")
stat("x.1")
stat("x.2")
stat("x.3") -> ENOENT /* now we know how many segments there are */
truncate("x.2")
unlink("x.2")
truncate("x.1")
unlink("x.1")
truncate("x")
unlink("x")
Could you say what problem this solves, and, guessing that it's just
that you want the 0 file to be "in the way" until the other files are
gone (at least while we're running; who knows what'll be left if you
power-cycle), could you do it like this instead?
truncate("x")
truncate("x.1")
truncate("x.2")
truncate("x.3") -> ENOENT /* now we know how many segments there are */
unlink("x.2")
unlink("x.1")
unlink("x")
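The point of either ordering is that segment 0 stays visible until the higher-numbered segments are gone, so nothing can observe (or recycle) a relfilenumber whose later segments still exist. A minimal sketch of the truncate-first variant proposed above, in Python with a hypothetical base file name, where the first missing segment tells us how many there are:

```python
import os

def drop_relation_segments(base):
    """Truncate every segment first, then unlink from highest to lowest,
    so the base file (segment 0) stays 'in the way' until last."""
    # Pass 1: truncate, discovering the number of segments as we go.
    nsegs = 0
    while True:
        path = base if nsegs == 0 else f"{base}.{nsegs}"
        try:
            os.truncate(path, 0)
        except FileNotFoundError:
            break  # first missing segment marks the end
        nsegs += 1
    # Pass 2: unlink in descending segment order, base file last.
    for seg in reversed(range(nsegs)):
        path = base if seg == 0 else f"{base}.{seg}"
        os.unlink(path)
```

This is only an illustration of the ordering, not the md.c code; the real implementation also has to cope with concurrent access and platform quirks such as Windows deferring directory-entry removal.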
Hi!
I'm not following this thread closely, but I've noticed something strange while
attempting to rebase my patch set from the 64XID thread.
As far as I'm aware, this patch set adds "relfilenumber". So, in
pg_control_checkpoint, we have the following changes:
diff --git a/src/backend/utils/misc/pg_controldata.c
b/src/backend/utils/misc/pg_controldata.c
index 781f8b8758..d441cd97e2 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -79,8 +79,8 @@ pg_control_system(PG_FUNCTION_ARGS)
Datum
pg_control_checkpoint(PG_FUNCTION_ARGS)
{
- Datum values[18];
- bool nulls[18];
+ Datum values[19];
+ bool nulls[19];
TupleDesc tupdesc;
HeapTuple htup;
ControlFileData *ControlFile;
@@ -129,6 +129,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
XIDOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 18, "checkpoint_time",
TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 19, "next_relfilenumber",
+ INT8OID, -1, 0);
tupdesc = BlessTupleDesc(tupdesc);
/* Read the control file. */
In other words, we have 19 attributes. But tupdesc here is constructed for
18 elements:
tupdesc = CreateTemplateTupleDesc(18);
Is that normal or not? Again, I'm not following this thread, so if that is
completely OK, I'm sorry about the noise.
--
Best regards,
Maxim Orlov.
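The mismatch reported above is easy to model: TupleDescInitEntry asserts that the attribute number fits within the descriptor's declared size, so initializing a 19th attribute in an 18-slot descriptor trips the check in an assert-enabled build. A toy Python model of that invariant (not the actual PostgreSQL code):

```python
class TupleDesc:
    """Miniature stand-in for a PostgreSQL tuple descriptor."""

    def __init__(self, natts):           # models CreateTemplateTupleDesc(natts)
        self.natts = natts
        self.attrs = [None] * natts

    def init_entry(self, attnum, name):  # models TupleDescInitEntry(...)
        # Mirrors the assertion that attributeNumber <= tupdesc->natts.
        assert 1 <= attnum <= self.natts, "attribute number out of range"
        self.attrs[attnum - 1] = name

desc = TupleDesc(18)
desc.init_entry(18, "checkpoint_time")          # fine: within the 18 slots
try:
    desc.init_entry(19, "next_relfilenumber")   # the reported mismatch
    tripped = False
except AssertionError:
    tripped = True
```

The fix, of course, is simply to size the descriptor for 19 attributes in the first place.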
On Thu, Sep 29, 2022 at 10:50 AM Maxim Orlov <orlovmg@gmail.com> wrote:
In other words, we have 19 attributes. But tupdesc here is constructed for 18 elements:
tupdesc = CreateTemplateTupleDesc(18);
Is that normal or not? Again, I'm not in this thread and if that is completely ok, I'm sorry about the noise.
I think that's a mistake. Thanks for the report.
--
Robert Haas
EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes:
On Thu, Sep 29, 2022 at 10:50 AM Maxim Orlov <orlovmg@gmail.com> wrote:
In other words, we have 19 attributes. But tupdesc here is constructed for 18 elements:
tupdesc = CreateTemplateTupleDesc(18);
I think that's a mistake. Thanks for the report.
The assertions in TupleDescInitEntry would have caught that,
if only utils/misc/pg_controldata.c had more than zero test coverage.
Seems like somebody ought to do something about that.
regards, tom lane
On Thu, Sep 29, 2022 at 02:39:44PM -0400, Tom Lane wrote:
The assertions in TupleDescInitEntry would have caught that,
if only utils/misc/pg_controldata.c had more than zero test coverage.
Seems like somebody ought to do something about that.
While passing by, I have noticed this thread. We don't really care
about the contents returned by these functions, and one simple trick
to check their execution is SELECT FROM. Like in the attached, for
example.
--
Michael
Attachments:
controldata-regression.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index 9f106c2a10..93cba8e76d 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -594,3 +594,22 @@ SELECT * FROM tenk1 a JOIN my_gen_series(1,10) g ON a.unique1 = g;
Index Cond: (unique1 = g.g)
(4 rows)
+--
+-- Test functions for control data
+--
+SELECT FROM pg_control_checkpoint();
+--
+(1 row)
+
+SELECT FROM pg_control_init();
+--
+(1 row)
+
+SELECT FROM pg_control_recovery();
+--
+(1 row)
+
+SELECT FROM pg_control_system();
+--
+(1 row)
+
diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_functions.sql
index 639e9b352c..207d5a5292 100644
--- a/src/test/regress/sql/misc_functions.sql
+++ b/src/test/regress/sql/misc_functions.sql
@@ -223,3 +223,11 @@ SELECT * FROM tenk1 a JOIN my_gen_series(1,1000) g ON a.unique1 = g;
EXPLAIN (COSTS OFF)
SELECT * FROM tenk1 a JOIN my_gen_series(1,10) g ON a.unique1 = g;
+
+--
+-- Test functions for control data
+--
+SELECT FROM pg_control_checkpoint();
+SELECT FROM pg_control_init();
+SELECT FROM pg_control_recovery();
+SELECT FROM pg_control_system();
Michael Paquier <michael@paquier.xyz> writes:
While passing by, I have noticed this thread. We don't really care
about the contents returned by these functions, and one simple trick
to check their execution is SELECT FROM. Like in the attached, for
example.
Hmmm ... I'd tend to do SELECT COUNT(*) FROM. But can't we provide
any actual checks on the sanity of the output? I realize that the
output's far from static, but still ...
regards, tom lane
On Thu, Sep 29, 2022 at 09:23:38PM -0400, Tom Lane wrote:
Hmmm ... I'd tend to do SELECT COUNT(*) FROM. But can't we provide
any actual checks on the sanity of the output? I realize that the
output's far from static, but still ...
Honestly, checking all the fields is not that exciting, but the
maximum I can think of that would be portable enough is something like
the attached. No arithmetic operators for xid limits things a bit,
but at least that's something.
Thoughts?
--
Michael
Attachments:
controldata-regression-2.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index 9f106c2a10..38987e2afc 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -594,3 +594,89 @@ SELECT * FROM tenk1 a JOIN my_gen_series(1,10) g ON a.unique1 = g;
Index Cond: (unique1 = g.g)
(4 rows)
+--
+-- Test functions for control data
+--
+\x
+SELECT checkpoint_lsn > '0/0'::pg_lsn AS checkpoint_lsn,
+ redo_lsn > '0/0'::pg_lsn AS redo_lsn,
+ redo_wal_file IS NOT NULL AS redo_wal_file,
+ timeline_id > 0 AS timeline_id,
+ prev_timeline_id > 0 AS prev_timeline_id,
+ next_xid IS NOT NULL AS next_xid,
+ next_oid > 0 AS next_oid,
+ next_multixact_id != '0'::xid AS next_multixact_id,
+ next_multi_offset IS NOT NULL AS next_multi_offset,
+ oldest_xid != '0'::xid AS oldest_xid,
+ oldest_xid_dbid > 0 AS oldest_xid_dbid,
+ oldest_active_xid != '0'::xid AS oldest_active_xid,
+ oldest_multi_xid != '0'::xid AS oldest_multi_xid,
+ oldest_multi_dbid > 0 AS oldest_multi_dbid,
+ oldest_commit_ts_xid IS NOT NULL AS oldest_commit_ts_xid,
+ newest_commit_ts_xid IS NOT NULL AS newest_commit_ts_xid
+ FROM pg_control_checkpoint();
+-[ RECORD 1 ]--------+--
+checkpoint_lsn | t
+redo_lsn | t
+redo_wal_file | t
+timeline_id | t
+prev_timeline_id | t
+next_xid | t
+next_oid | t
+next_multixact_id | t
+next_multi_offset | t
+oldest_xid | t
+oldest_xid_dbid | t
+oldest_active_xid | t
+oldest_multi_xid | t
+oldest_multi_dbid | t
+oldest_commit_ts_xid | t
+newest_commit_ts_xid | t
+
+SELECT max_data_alignment > 0 AS max_data_alignment,
+ database_block_size > 0 AS database_block_size,
+ blocks_per_segment > 0 AS blocks_per_segment,
+ wal_block_size > 0 AS wal_block_size,
+ max_identifier_length > 0 AS max_identifier_length,
+ max_index_columns > 0 AS max_index_columns,
+ max_toast_chunk_size > 0 AS max_toast_chunk_size,
+ large_object_chunk_size > 0 AS large_object_chunk_size,
+ float8_pass_by_value IS NOT NULL AS float8_pass_by_value,
+ data_page_checksum_version >= 0 AS data_page_checksum_version
+ FROM pg_control_init();
+-[ RECORD 1 ]--------------+--
+max_data_alignment | t
+database_block_size | t
+blocks_per_segment | t
+wal_block_size | t
+max_identifier_length | t
+max_index_columns | t
+max_toast_chunk_size | t
+large_object_chunk_size | t
+float8_pass_by_value | t
+data_page_checksum_version | t
+
+SELECT min_recovery_end_lsn >= '0/0'::pg_lsn AS min_recovery_end_lsn,
+ min_recovery_end_timeline >= 0 AS min_recovery_end_timeline,
+ backup_start_lsn >= '0/0'::pg_lsn AS backup_start_lsn,
+ backup_end_lsn >= '0/0'::pg_lsn AS backup_end_lsn,
+ end_of_backup_record_required IS NOT NULL AS end_of_backup_record_required
+ FROM pg_control_recovery();
+-[ RECORD 1 ]-----------------+--
+min_recovery_end_lsn | t
+min_recovery_end_timeline | t
+backup_start_lsn | t
+backup_end_lsn | t
+end_of_backup_record_required | t
+
+SELECT pg_control_version > 0 AS pg_control_version,
+ catalog_version_no > 0 AS catalog_version_no,
+ system_identifier >= 0 AS system_identifier,
+ pg_control_last_modified <= now() AS pg_control_last_modified
+ FROM pg_control_system();
+-[ RECORD 1 ]------------+--
+pg_control_version | t
+catalog_version_no | t
+system_identifier | t
+pg_control_last_modified | t
+
diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_functions.sql
index 639e9b352c..986e07c3a5 100644
--- a/src/test/regress/sql/misc_functions.sql
+++ b/src/test/regress/sql/misc_functions.sql
@@ -223,3 +223,47 @@ SELECT * FROM tenk1 a JOIN my_gen_series(1,1000) g ON a.unique1 = g;
EXPLAIN (COSTS OFF)
SELECT * FROM tenk1 a JOIN my_gen_series(1,10) g ON a.unique1 = g;
+
+--
+-- Test functions for control data
+--
+\x
+SELECT checkpoint_lsn > '0/0'::pg_lsn AS checkpoint_lsn,
+ redo_lsn > '0/0'::pg_lsn AS redo_lsn,
+ redo_wal_file IS NOT NULL AS redo_wal_file,
+ timeline_id > 0 AS timeline_id,
+ prev_timeline_id > 0 AS prev_timeline_id,
+ next_xid IS NOT NULL AS next_xid,
+ next_oid > 0 AS next_oid,
+ next_multixact_id != '0'::xid AS next_multixact_id,
+ next_multi_offset IS NOT NULL AS next_multi_offset,
+ oldest_xid != '0'::xid AS oldest_xid,
+ oldest_xid_dbid > 0 AS oldest_xid_dbid,
+ oldest_active_xid != '0'::xid AS oldest_active_xid,
+ oldest_multi_xid != '0'::xid AS oldest_multi_xid,
+ oldest_multi_dbid > 0 AS oldest_multi_dbid,
+ oldest_commit_ts_xid IS NOT NULL AS oldest_commit_ts_xid,
+ newest_commit_ts_xid IS NOT NULL AS newest_commit_ts_xid
+ FROM pg_control_checkpoint();
+SELECT max_data_alignment > 0 AS max_data_alignment,
+ database_block_size > 0 AS database_block_size,
+ blocks_per_segment > 0 AS blocks_per_segment,
+ wal_block_size > 0 AS wal_block_size,
+ max_identifier_length > 0 AS max_identifier_length,
+ max_index_columns > 0 AS max_index_columns,
+ max_toast_chunk_size > 0 AS max_toast_chunk_size,
+ large_object_chunk_size > 0 AS large_object_chunk_size,
+ float8_pass_by_value IS NOT NULL AS float8_pass_by_value,
+ data_page_checksum_version >= 0 AS data_page_checksum_version
+ FROM pg_control_init();
+SELECT min_recovery_end_lsn >= '0/0'::pg_lsn AS min_recovery_end_lsn,
+ min_recovery_end_timeline >= 0 AS min_recovery_end_timeline,
+ backup_start_lsn >= '0/0'::pg_lsn AS backup_start_lsn,
+ backup_end_lsn >= '0/0'::pg_lsn AS backup_end_lsn,
+ end_of_backup_record_required IS NOT NULL AS end_of_backup_record_required
+ FROM pg_control_recovery();
+SELECT pg_control_version > 0 AS pg_control_version,
+ catalog_version_no > 0 AS catalog_version_no,
+ system_identifier >= 0 AS system_identifier,
+ pg_control_last_modified <= now() AS pg_control_last_modified
+ FROM pg_control_system();
On Fri, 21 Oct 2022 at 11:31, Michael Paquier <michael@paquier.xyz> wrote:
On Thu, Sep 29, 2022 at 09:23:38PM -0400, Tom Lane wrote:
Hmmm ... I'd tend to do SELECT COUNT(*) FROM. But can't we provide
any actual checks on the sanity of the output? I realize that the
output's far from static, but still ...
Honestly, checking all the fields is not that exciting, but the
maximum I can think of that would be portable enough is something like
the attached. No arithmetic operators for xid limits things a bit,
but at least that's something.
Thoughts?
The patch does not apply on top of HEAD as in [1], please post a rebased patch:
=== Applying patches on top of PostgreSQL commit ID
33ab0a2a527e3af5beee3a98fc07201e555d6e45 ===
=== applying patch ./controldata-regression-2.patch
patching file src/test/regress/expected/misc_functions.out
Hunk #1 succeeded at 642 with fuzz 2 (offset 48 lines).
patching file src/test/regress/sql/misc_functions.sql
Hunk #1 FAILED at 223.
1 out of 1 hunk FAILED -- saving rejects to file
src/test/regress/sql/misc_functions.sql.rej
[1]: http://cfbot.cputube.org/patch_41_3711.log
Regards,
Vignesh
On Wed, Jan 4, 2023 at 5:45 PM vignesh C <vignesh21@gmail.com> wrote:
The patch does not apply on top of HEAD as in [1], please post a rebased patch:
Because of the extra WAL overhead, we are not continuing with the
patch, so I will withdraw it.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com