logical changeset generation v4
Hi everyone,
Here is the newest version of logical changeset generation.
Changes since last time round:
* loads and loads of bugfixes
* crash/restart persistency of in-memory structures in a crash safe manner
* very large transaction support (spooling to disk)
* rebased onto the newest version of xlogreader
Overview of the patches:
Xlogreader (separate patch):
[01]: Centralize Assert* macros into c.h so its common between backend/frontend
[02]: Provide a common malloc wrappers and palloc et al. emulation for frontend'ish environs
[03]: Split out xlog reading into its own module called xlogreader
[04]: Add pg_xlogdump contrib module
Those seem to be ready, barring some infrastructure work around common
backend/frontend code for xlogdump.
Add capability to map from (tablespace, relfilenode) to pg_class.oid:
[05]: Add a new RELFILENODE syscache to fetch a pg_class entry via (reltablespace, relfilenode)
[06]: Add RelationMapFilenodeToOid function to relmapper.c
[07]: Add pg_relation_by_filenode to lookup up a relation by (tablespace, filenode)
Imo those are pretty solid, although there are some doubts about the
correctness of [05], which I think are all fixed in this version:
The fundamental problem of adding a (tablespace, relfilenode) syscache
is that no unique index exists in pg_class over (relfilenode,
reltablespace) because relfilenode is set to '0' (aka InvalidOid) when
the table is either a shared table or a nailed table. This cannot really
be changed as pg_class.relfilenode is not authoritative for those and
possibly cannot even be accessed (different table, early startup). We also
don't want to rely on the null bitmap, so we can't set it to NULL.
The reason why I think it is safe to use the added RELFILENODE syscache
as I have in those patches is that when looking up a (tablespace, filenode)
pair none of those duplicate '0' values will get looked up, as there is
no point in looking up an invalid relfilenode. Instead the shared/nailed
relfilenodes will have to get mapped via RelationMapFilenodeToOid.
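The lookup rule described above can be sketched roughly as follows. This is a toy model, not the code from the patches: the two arrays stand in for pg_class and the relmapper, every `toy_*` name and all OID values are made up for illustration, and asking the relmapper before the syscache is an assumption about ordering.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;
#define InvalidOid ((Oid) 0)
#define toy_lengthof(a) (sizeof(a) / sizeof((a)[0]))

/* Toy pg_class row: shared/nailed relations store relfilenode 0. */
typedef struct ToyClassRow
{
    Oid oid;
    Oid reltablespace;
    Oid relfilenode;
} ToyClassRow;

/* Toy relmapper entry: the authoritative filenode of a mapped relation. */
typedef struct ToyMapEntry
{
    Oid oid;
    Oid filenode;
} ToyMapEntry;

static const ToyClassRow toy_pg_class[] = {
    {1259, 0, 0},            /* mapped: duplicate '0' relfilenode */
    {1262, 1664, 0},         /* mapped: duplicate '0' relfilenode */
    {16384, 0, 16384},
    {16390, 16500, 24576},
};

static const ToyMapEntry toy_relmap[] = {
    {1259, 12000},
    {1262, 12001},
};

/* Stand-in for RelationMapFilenodeToOid: resolves mapped relations only. */
static Oid
toy_relmap_filenode_to_oid(Oid filenode)
{
    for (size_t i = 0; i < toy_lengthof(toy_relmap); i++)
        if (toy_relmap[i].filenode == filenode)
            return toy_relmap[i].oid;
    return InvalidOid;
}

/* Stand-in for the RELFILENODE syscache; never called with filenode 0. */
static Oid
toy_syscache_lookup(Oid tablespace, Oid filenode)
{
    for (size_t i = 0; i < toy_lengthof(toy_pg_class); i++)
        if (toy_pg_class[i].reltablespace == tablespace &&
            toy_pg_class[i].relfilenode == filenode)
            return toy_pg_class[i].oid;
    return InvalidOid;
}

/* Map (tablespace, filenode) to a pg_class OID, or InvalidOid. */
Oid
toy_relation_by_filenode(Oid tablespace, Oid filenode)
{
    Oid oid;

    /* An invalid relfilenode is never looked up, so the '0' rows never match. */
    if (filenode == InvalidOid)
        return InvalidOid;

    /* Shared/nailed relations are only known to the relmapper. */
    oid = toy_relmap_filenode_to_oid(filenode);
    if (oid != InvalidOid)
        return oid;

    return toy_syscache_lookup(tablespace, filenode);
}
```

The point of the sketch is only that the non-unique '0' entries are unreachable through the cache as long as callers never probe it with an invalid relfilenode.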
The alternative here seems to be to invent a separate attoptcache-style
cache, but given that the above syscache is fairly performance critical and
should do invalidations in a sensible manner, that seems to be an
unnecessary amount of code.
Any opinions here?
[08]: wal_decoding: Introduce InvalidCommandId and declare that to be the new maximum for CommandCounterIncrement
It's useful to represent values that are not a valid CommandId. Add such
a representation.
Imo this is straightforward and easy.
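As a rough illustration of the idea (not the patch itself): reserve the highest value of the counter type as the "invalid" marker and refuse to advance the counter onto it. The macro definition and the `toy_command_counter_increment` helper below are assumptions made for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t CommandId;

#define FirstCommandId   ((CommandId) 0)
/* Assumed representation: the all-ones value is reserved as "not a CommandId". */
#define InvalidCommandId (~(CommandId) 0)

/*
 * Hypothetical helper: advance the counter, but never onto the reserved
 * sentinel. Returns false (leaving *current untouched) when the limit is hit.
 */
bool
toy_command_counter_increment(CommandId *current)
{
    CommandId next = *current + 1;

    if (next == InvalidCommandId)
        return false;

    *current = next;
    return true;
}
```

With this layout the largest usable CommandId is `InvalidCommandId - 1`, which is what "the new maximum for CommandCounterIncrement" amounts to in the sketch.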
[09]: Adjust all *Satisfies routines to take a HeapTuple instead of a HeapTupleHeader
For timetravel access to the catalog we need to be able to look up (cmin,
cmax) pairs of catalog rows when we're 'inside' that TX. This patch just
adapts the signature of the *Satisfies routines to expect a HeapTuple
instead of a HeapTupleHeader. The amount of changes for that is fairly
low as the HeapTupleSatisfiesVisibility macro already expected the
former.
It also makes sure the HeapTuple fields are set up in the few places that
didn't already do so.
[10]: wal_decoding: Allow walsender's to connect to a specific database
For logical decoding we need to be able to access the schema of a
database, and for that we need to be connected to a database. Thus allow
recovery connections to connect to a specific database.
This patch currently has the disadvantage that it's not possible anymore
to connect to a database that's actually named "replication", as the
decision whether a connection goes to a database or not is made based
upon dbname != replication.
Better ideas?
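To make the limitation concrete, here is a small sketch of the decision as described above. The function name is hypothetical; the real check lives in the postmaster startup-packet handling shown in the attached patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: does a walsender connection with this "dbname"
 * startup parameter attach to a real database? The literal name
 * "replication" is treated as "no database", which is exactly why a
 * database with that name becomes unreachable for walsenders.
 */
bool
toy_walsender_wants_database(const char *dbname)
{
    if (dbname == NULL || dbname[0] == '\0')
        return false;

    return strcmp(dbname, "replication") != 0;
}
```

Because the comparison is a plain `strcmp`, only the exact lowercase name is special; any other name is treated as a database to connect to.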
[11]: wal_decoding: Add alreadyLocked parameter to GetOldestXminNoLock
Pretty boring preparatory work for being able to nail a certain xid as the
global horizon. I don't think there is much to be said about this
anymore; it has already been discussed somewhat.
[12]: wal_decodign: Log xl_running_xact's at a higher frequency than checkpoints are done
Make the bgwriter emit an xl_running_xacts record every 15s if there is
xlog activity in the system.
Imo this isn't too complicated and already beneficial for HS so it could
be applied separately.
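The trigger condition boils down to "enough time has passed and some WAL has been inserted since the last snapshot". A minimal sketch of that logic, with plain integers standing in for TimestampTz and XLogRecPtr and with hypothetical names throughout:

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_LOG_SNAP_INTERVAL_MS 15000

/* What the background process remembers about the last logged snapshot. */
typedef struct ToySnapLogState
{
    int64_t  last_logged_ms;      /* when we last logged a snapshot */
    uint64_t last_logged_recptr;  /* WAL position right after that record */
} ToySnapLogState;

/* Log only if the interval elapsed AND the insert position has moved. */
bool
toy_should_log_running_xacts(const ToySnapLogState *st,
                             int64_t now_ms, uint64_t insert_recptr)
{
    return now_ms >= st->last_logged_ms + TOY_LOG_SNAP_INTERVAL_MS &&
           insert_recptr != st->last_logged_recptr;
}

/* Remember the snapshot we just logged. */
void
toy_note_snapshot_logged(ToySnapLogState *st,
                         int64_t now_ms, uint64_t recptr)
{
    st->last_logged_ms = now_ms;
    st->last_logged_recptr = recptr;
}
```

Comparing against the remembered record pointer is what keeps an idle system from emitting a record every 15s forever: the snapshot record itself moves the insert position, so the position after logging is what gets stored.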
[13]: copydir: make fsync_fname public
This should probably go to some other file, fd.[ch]? Otherwise it's
pretty trivial.
[14]: wal decoding: Add information about a tables primary key to struct RelationData
Back when discussing the first prototype of BDR, Heikki was concerned about
doing a search for the primary key during heap_delete. I agree that that
isn't really a good idea.
So remember the primary key (or a candidate key) when looking through
the available indexes in RelationGetIndexList().
I don't really like the name rd_primary as it also contains candidate
keys (i.e. indimmediate, indisunique, noexpression, notnull). Better
suggestions?
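The candidate-key criteria listed above could be expressed along these lines. This is only a sketch over a made-up `ToyIndex` struct; preferring a real primary key over the first matching candidate is an assumption, not something stated above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, flattened view of the pg_index properties that matter here. */
typedef struct ToyIndex
{
    unsigned oid;
    bool     is_primary;
    bool     is_unique;            /* indisunique */
    bool     is_immediate;         /* indimmediate */
    bool     has_expressions;      /* index contains expression columns */
    bool     all_columns_not_null; /* every key column is NOT NULL */
} ToyIndex;

/* Candidate key: unique, immediate, no expressions, all columns NOT NULL. */
static bool
toy_is_candidate_key(const ToyIndex *idx)
{
    return idx->is_unique && idx->is_immediate &&
           !idx->has_expressions && idx->all_columns_not_null;
}

/*
 * Walk the index list once: return the primary key if there is one,
 * otherwise the first candidate key, otherwise 0 (no usable key).
 */
unsigned
toy_pick_key_index(const ToyIndex *indexes, size_t nindexes)
{
    unsigned candidate = 0;

    for (size_t i = 0; i < nindexes; i++)
    {
        if (indexes[i].is_primary)
            return indexes[i].oid;
        if (candidate == 0 && toy_is_candidate_key(&indexes[i]))
            candidate = indexes[i].oid;
    }
    return candidate;
}
```

Doing this during the existing pass over the index list is what avoids a separate key search at delete time.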
I don't think there is much to debate here, but there is no
independent benefit of applying it separately.
[15]: wal decoding: Introduce wal decoding via catalog timetravel
The heart of changeset generation.
Built out of several parts:
* snapshot building infrastructure
* transaction reassembly
* shared memory state for replication slots
* new wal_level=logical that catches more data
* (local) output plugin interface
* (external) walsender interface
[16]: wal decoding: Add a simple decoding module in contrib named 'test_decoding'
An example output plugin also used in regression tests
[17]: wal decoding: Introduce pg_receivellog, the pg_receivexlog equivalent for logical changes
An application to receive changes over the walsender/replication=1
interface.
[18]: wal_decoding: Add test_logical_replication extension for easier testing of logical decoding
An extension that allows using logical decoding from SQL. This isn't
really suitable for production, high-performance use, but it's useful for
development and, more importantly, it makes it relatively easy to write
regression tests without new infrastructure.
I am starting to be happy about the state of this!
Open issues & questions:
1) testing infrastructure
2) Determination of replication slots
3) Options for output plugins
4) the general walsender interface
5) Additional catalog tables
1) Currently all the tests are in patch [18], which is a contrib
module. There are two reasons for putting them there:
First the tests need wal_level=logical which isn't the case with the
current regression tests.
Second, I don't think the test_logical_replication functions should live
in core as they shouldn't be used for a production replication scenario
(causes long-running transactions, requires polling), but I have failed
to find a neat way to include a contrib extension in the plain
regression tests.
2) Currently the logical replication infrastructure assigns a 'slot-id'
when a new replica is set up. That slot id isn't really nice
(e.g. "id-321578-3"). It also requires that [18] keeps state in a global
variable to make writing regression tests easy.
I think it would be better to make the user specify those replication
slot ids, but I am not really sure about it.
3) Currently no options can be passed to an output plugin. I am thinking
about making "INIT_LOGICAL_REPLICATION 'plugin'" accept the now widely
used ('option' ['value'], ...) syntax and pass that to the output
plugin's initialization function.
4) Does anybody object to:
-- allocate a permanent replication slot
INIT_LOGICAL_REPLICATION 'plugin' 'slotname' (options);
-- stream data
START_LOGICAL_REPLICATION 'slotname' 'recptr';
-- deallocate a permanent replication slot
FREE_LOGICAL_REPLICATION 'slotname';
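For illustration, a client could assemble the proposed streaming command like this. The helper name is hypothetical, the command syntax is only the proposal above, and no quoting or escaping of the slot name is attempted. The `%X/%X` position format mirrors how the attached patch prints xlog locations in IDENTIFY_SYSTEM.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format the proposed START_LOGICAL_REPLICATION
 * command into buf. Returns what snprintf returns, i.e. the length the
 * full command needs (excluding the terminating NUL).
 */
int
toy_format_start_logical_replication(char *buf, size_t buflen,
                                     const char *slotname, uint64_t recptr)
{
    return snprintf(buf, buflen,
                    "START_LOGICAL_REPLICATION '%s' '%X/%X';",
                    slotname,
                    (unsigned int) (recptr >> 32),  /* high 32 bits */
                    (unsigned int) recptr);         /* low 32 bits */
}
```

A caller would compare the return value against the buffer size to detect truncation before sending the string over the replication connection.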
5) Currently it's only allowed to access catalog tables. It's fairly
trivial to extend this to additional tables if you can accept some
(noticeable but not too big) overhead for modifications on those tables.
I was thinking of making that an option for tables; that would be useful
for the configuration tables of replication solutions.
Further todo:
* don't reread so much WAL after a restart/crash
* better TOAST support, the current one can fail after a DROP TABLE
* only peg a new "catalog xmin" instead of the global xmin
* more docs about the internals
* nicer interface between snapbuild.c, reorderbuffer.c, decode.c and the
outside. There have been improvements vs 3.1 but not enough
* abort too old replication slots
Phew.
The current git tree is at:
git://git.postgresql.org/git/users/andresfreund/postgres.git branch xlog-decoding-rebasing-cf4
http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/xlog-decoding-rebasing-cf4
The xlogreader development happens in the xlogreader_4 branch.
Input?
Greetings,
Andres Freund
PS: Thanks for the input & help so far!
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachments:
0010-wal_decoding-Allow-walsender-s-to-connect-to-a-speci.patch (text/x-patch; charset=us-ascii)
From 12f4329b2c31eee6d2d93e42e0f52c411dab9d8d Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Tue, 13 Nov 2012 12:54:36 +0100
Subject: [PATCH 10/19] wal_decoding: Allow walsender's to connect to a
specific database
Currently the decision whether to connect to a database or not is made by
checking whether the passed "dbname" parameter is "replication". Unfortunately
this makes it impossible to connect to a database named replication...
This is useful for future walsender commands which need database interaction.
---
src/backend/postmaster/postmaster.c | 7 ++++--
.../libpqwalreceiver/libpqwalreceiver.c | 4 ++--
src/backend/replication/walsender.c | 27 ++++++++++++++++++----
src/backend/utils/init/postinit.c | 5 ++++
src/bin/pg_basebackup/pg_basebackup.c | 4 ++--
src/bin/pg_basebackup/pg_receivexlog.c | 4 ++--
src/bin/pg_basebackup/receivelog.c | 4 ++--
7 files changed, 41 insertions(+), 14 deletions(-)
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 15c2320..53a3988 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -1953,10 +1953,13 @@ retry1:
if (strlen(port->user_name) >= NAMEDATALEN)
port->user_name[NAMEDATALEN - 1] = '\0';
- /* Walsender is not related to a particular database */
- if (am_walsender)
+ /* Generic Walsender is not related to a particular database */
+ if (am_walsender && strcmp(port->database_name, "replication") == 0)
port->database_name[0] = '\0';
+ if (am_walsender)
+ elog(WARNING, "connecting to %s", port->database_name);
+
/*
* Done putting stuff in TopMemoryContext.
*/
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 7374489..e46a060 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -130,7 +130,7 @@ libpqrcv_identify_system(TimeLineID *primary_tli)
"the primary server: %s",
PQerrorMessage(streamConn))));
}
- if (PQnfields(res) != 3 || PQntuples(res) != 1)
+ if (PQnfields(res) != 4 || PQntuples(res) != 1)
{
int ntuples = PQntuples(res);
int nfields = PQnfields(res);
@@ -138,7 +138,7 @@ libpqrcv_identify_system(TimeLineID *primary_tli)
PQclear(res);
ereport(ERROR,
(errmsg("invalid response from primary server"),
- errdetail("Expected 1 tuple with 3 fields, got %d tuples with %d fields.",
+ errdetail("Expected 1 tuple with 4 fields, got %d tuples with %d fields.",
ntuples, nfields)));
}
primary_sysid = PQgetvalue(res, 0, 0);
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index ad7d1c9..2284d58 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -46,6 +46,7 @@
#include "access/transam.h"
#include "access/xlog_internal.h"
#include "catalog/pg_type.h"
+#include "commands/dbcommands.h"
#include "funcapi.h"
#include "libpq/libpq.h"
#include "libpq/pqformat.h"
@@ -237,10 +238,12 @@ IdentifySystem(void)
char tli[11];
char xpos[MAXFNAMELEN];
XLogRecPtr logptr;
+ char* dbname = NULL;
/*
- * Reply with a result set with one row, three columns. First col is
- * system ID, second is timeline ID, and third is current xlog location.
+ * Reply with a result set with one row, four columns. First col is system
+ * ID, second is timeline ID, third is current xlog location and the fourth
+ * contains the database name if we are connected to one.
*/
snprintf(sysid, sizeof(sysid), UINT64_FORMAT,
@@ -259,9 +262,14 @@ IdentifySystem(void)
snprintf(xpos, sizeof(xpos), "%X/%X", (uint32) (logptr >> 32), (uint32) logptr);
+ if (MyDatabaseId != InvalidOid)
+ dbname = get_database_name(MyDatabaseId);
+ else
+ dbname = "(none)";
+
/* Send a RowDescription message */
pq_beginmessage(&buf, 'T');
- pq_sendint(&buf, 3, 2); /* 3 fields */
+ pq_sendint(&buf, 4, 2); /* 4 fields */
/* first field */
pq_sendstring(&buf, "systemid"); /* col name */
@@ -289,17 +297,28 @@ IdentifySystem(void)
pq_sendint(&buf, -1, 2);
pq_sendint(&buf, 0, 4);
pq_sendint(&buf, 0, 2);
+
+ /* fourth field */
+ pq_sendstring(&buf, "dbname");
+ pq_sendint(&buf, 0, 4);
+ pq_sendint(&buf, 0, 2);
+ pq_sendint(&buf, TEXTOID, 4);
+ pq_sendint(&buf, -1, 2);
+ pq_sendint(&buf, 0, 4);
+ pq_sendint(&buf, 0, 2);
pq_endmessage(&buf);
/* Send a DataRow message */
pq_beginmessage(&buf, 'D');
- pq_sendint(&buf, 3, 2); /* # of columns */
+ pq_sendint(&buf, 4, 2); /* # of columns */
pq_sendint(&buf, strlen(sysid), 4); /* col1 len */
pq_sendbytes(&buf, (char *) &sysid, strlen(sysid));
pq_sendint(&buf, strlen(tli), 4); /* col2 len */
pq_sendbytes(&buf, (char *) tli, strlen(tli));
pq_sendint(&buf, strlen(xpos), 4); /* col3 len */
pq_sendbytes(&buf, (char *) xpos, strlen(xpos));
+ pq_sendint(&buf, strlen(dbname), 4); /* col4 len */
+ pq_sendbytes(&buf, (char *) dbname, strlen(dbname));
pq_endmessage(&buf);
}
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 7e21cea..2a93cff 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -728,7 +728,12 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
ereport(FATAL,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
errmsg("must be superuser or replication role to start walsender")));
+ }
+ if (am_walsender &&
+ (in_dbname == NULL || in_dbname[0] == '\0') &&
+ dboid == InvalidOid)
+ {
/* process any options passed in the startup packet */
if (MyProcPort != NULL)
process_startup_options(MyProcPort, am_superuser);
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 6631161..9d2fa6d 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -1236,11 +1236,11 @@ BaseBackup(void)
progname, "IDENTIFY_SYSTEM", PQerrorMessage(conn));
disconnect_and_exit(1);
}
- if (PQntuples(res) != 1 || PQnfields(res) != 3)
+ if (PQntuples(res) != 1 || PQnfields(res) != 4)
{
fprintf(stderr,
_("%s: could not identify system: got %d rows and %d fields, expected %d rows and %d fields\n"),
- progname, PQntuples(res), PQnfields(res), 1, 3);
+ progname, PQntuples(res), PQnfields(res), 1, 4);
disconnect_and_exit(1);
}
sysidentifier = pg_strdup(PQgetvalue(res, 0, 0));
diff --git a/src/bin/pg_basebackup/pg_receivexlog.c b/src/bin/pg_basebackup/pg_receivexlog.c
index b9ccb62..a0f3efc 100644
--- a/src/bin/pg_basebackup/pg_receivexlog.c
+++ b/src/bin/pg_basebackup/pg_receivexlog.c
@@ -237,11 +237,11 @@ StreamLog(void)
progname, "IDENTIFY_SYSTEM", PQerrorMessage(conn));
disconnect_and_exit(1);
}
- if (PQntuples(res) != 1 || PQnfields(res) != 3)
+ if (PQntuples(res) != 1 || PQnfields(res) != 4)
{
fprintf(stderr,
_("%s: could not identify system: got %d rows and %d fields, expected %d rows and %d fields\n"),
- progname, PQntuples(res), PQnfields(res), 1, 3);
+ progname, PQntuples(res), PQnfields(res), 1, 4);
disconnect_and_exit(1);
}
timeline = atoi(PQgetvalue(res, 0, 1));
diff --git a/src/bin/pg_basebackup/receivelog.c b/src/bin/pg_basebackup/receivelog.c
index f4f883c..c9cb834 100644
--- a/src/bin/pg_basebackup/receivelog.c
+++ b/src/bin/pg_basebackup/receivelog.c
@@ -355,11 +355,11 @@ ReceiveXlogStream(PGconn *conn, XLogRecPtr startpos, uint32 timeline,
PQclear(res);
return false;
}
- if (PQnfields(res) != 3 || PQntuples(res) != 1)
+ if (PQnfields(res) != 4 || PQntuples(res) != 1)
{
fprintf(stderr,
_("%s: could not identify system: got %d rows and %d fields, expected %d rows and %d fields\n"),
- progname, PQntuples(res), PQnfields(res), 1, 3);
+ progname, PQntuples(res), PQnfields(res), 1, 4);
PQclear(res);
return false;
}
--
1.7.12.289.g0ce9864.dirty
0011-wal_decoding-Add-alreadyLocked-parameter-to-GetOldes.patch (text/x-patch; charset=us-ascii)
From 54cf27b505efcf5aeeb2b78638e88fab04e66b5b Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Thu, 13 Dec 2012 20:47:57 +0100
Subject: [PATCH 11/19] wal_decoding: Add alreadyLocked parameter to
GetOldestXminNoLock
This is useful because it allows computing the current OldestXmin while
already holding the procarray lock, which enables setting one's own xmin
horizon safely.
---
src/backend/access/transam/xlog.c | 4 ++--
src/backend/catalog/index.c | 3 ++-
src/backend/commands/analyze.c | 2 +-
src/backend/commands/vacuum.c | 4 ++--
src/backend/replication/walreceiver.c | 2 +-
src/backend/storage/ipc/procarray.c | 16 ++++++++--------
src/include/storage/procarray.h | 2 +-
7 files changed, 17 insertions(+), 16 deletions(-)
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 310a654..ab7f0e4 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6823,7 +6823,7 @@ CreateCheckPoint(int flags)
* StartupSUBTRANS hasn't been called yet.
*/
if (!RecoveryInProgress())
- TruncateSUBTRANS(GetOldestXmin(true, false));
+ TruncateSUBTRANS(GetOldestXmin(true, false, false));
/* Real work is done, but log and update stats before releasing lock. */
LogCheckpointEnd(false);
@@ -7107,7 +7107,7 @@ CreateRestartPoint(int flags)
* this because StartupSUBTRANS hasn't been called yet.
*/
if (EnableHotStandby)
- TruncateSUBTRANS(GetOldestXmin(true, false));
+ TruncateSUBTRANS(GetOldestXmin(true, false, false));
/* Real work is done, but log and update before releasing lock. */
LogCheckpointEnd(true);
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index a29c106..dbee4d5 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -2196,7 +2196,8 @@ IndexBuildHeapScan(Relation heapRelation,
{
snapshot = SnapshotAny;
/* okay to ignore lazy VACUUMs here */
- OldestXmin = GetOldestXmin(heapRelation->rd_rel->relisshared, true);
+ OldestXmin = GetOldestXmin(heapRelation->rd_rel->relisshared, true,
+ false);
}
scan = heap_beginscan_strat(heapRelation, /* relation */
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index ac16284..f5a6af7 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -1077,7 +1077,7 @@ acquire_sample_rows(Relation onerel, int elevel,
totalblocks = RelationGetNumberOfBlocks(onerel);
/* Need a cutoff xmin for HeapTupleSatisfiesVacuum */
- OldestXmin = GetOldestXmin(onerel->rd_rel->relisshared, true);
+ OldestXmin = GetOldestXmin(onerel->rd_rel->relisshared, true, false);
/* Prepare for sampling block numbers */
BlockSampler_Init(&bs, totalblocks, targrows);
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 2d3170a..158d0dc 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -394,7 +394,7 @@ vacuum_set_xid_limits(int freeze_min_age,
* working on a particular table at any time, and that each vacuum is
* always an independent transaction.
*/
- *oldestXmin = GetOldestXmin(sharedRel, true);
+ *oldestXmin = GetOldestXmin(sharedRel, true, false);
Assert(TransactionIdIsNormal(*oldestXmin));
@@ -686,7 +686,7 @@ vac_update_datfrozenxid(void)
* committed pg_class entries for new tables; see AddNewRelationTuple().
* Se we cannot produce a wrong minimum by starting with this.
*/
- newFrozenXid = GetOldestXmin(true, true);
+ newFrozenXid = GetOldestXmin(true, true, false);
/*
* We must seqscan pg_class to find the minimum Xid, because there is no
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 16cf944..2b77369 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -1089,7 +1089,7 @@ XLogWalRcvSendHSFeedback(void)
* Make the expensive call to get the oldest xmin once we are certain
* everything else has been checked.
*/
- xmin = GetOldestXmin(true, false);
+ xmin = GetOldestXmin(true, false, false);
/*
* Get epoch and adjust if nextXid and oldestXmin are different sides of
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 4308128..f59e792 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -1100,7 +1100,7 @@ TransactionIdIsActive(TransactionId xid)
* GetOldestXmin() move backwards, with no consequences for data integrity.
*/
TransactionId
-GetOldestXmin(bool allDbs, bool ignoreVacuum)
+GetOldestXmin(bool allDbs, bool ignoreVacuum, bool alreadyLocked)
{
ProcArrayStruct *arrayP = procArray;
TransactionId result;
@@ -1109,7 +1109,8 @@ GetOldestXmin(bool allDbs, bool ignoreVacuum)
/* Cannot look for individual databases during recovery */
Assert(allDbs || !RecoveryInProgress());
- LWLockAcquire(ProcArrayLock, LW_SHARED);
+ if (!alreadyLocked)
+ LWLockAcquire(ProcArrayLock, LW_SHARED);
/*
* We initialize the MIN() calculation with latestCompletedXid + 1. This
@@ -1164,7 +1165,8 @@ GetOldestXmin(bool allDbs, bool ignoreVacuum)
*/
TransactionId kaxmin = KnownAssignedXidsGetOldestXmin();
- LWLockRelease(ProcArrayLock);
+ if (!alreadyLocked)
+ LWLockRelease(ProcArrayLock);
if (TransactionIdIsNormal(kaxmin) &&
TransactionIdPrecedes(kaxmin, result))
@@ -1172,10 +1174,8 @@ GetOldestXmin(bool allDbs, bool ignoreVacuum)
}
else
{
- /*
- * No other information needed, so release the lock immediately.
- */
- LWLockRelease(ProcArrayLock);
+ if (!alreadyLocked)
+ LWLockRelease(ProcArrayLock);
/*
* Compute the cutoff XID by subtracting vacuum_defer_cleanup_age,
@@ -1249,7 +1249,7 @@ GetMaxSnapshotSubxidCount(void)
* older than this are known not running any more.
* RecentGlobalXmin: the global xmin (oldest TransactionXmin across all
* running transactions, except those running LAZY VACUUM). This is
- * the same computation done by GetOldestXmin(true, true).
+ * the same computation done by GetOldestXmin(true, true, ...).
*
* Note: this function should probably not be called with an argument that's
* not statically allocated (see xip allocation below).
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index d5fdfea..fe0bad7 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -49,7 +49,7 @@ extern RunningTransactions GetRunningTransactionData(void);
extern bool TransactionIdIsInProgress(TransactionId xid);
extern bool TransactionIdIsActive(TransactionId xid);
-extern TransactionId GetOldestXmin(bool allDbs, bool ignoreVacuum);
+extern TransactionId GetOldestXmin(bool allDbs, bool ignoreVacuum, bool alreadyLocked);
extern TransactionId GetOldestActiveTransactionId(void);
extern VirtualTransactionId *GetVirtualXIDsDelayingChkpt(int *nvxids);
--
1.7.12.289.g0ce9864.dirty
0012-wal_decodign-Log-xl_running_xact-s-at-a-higher-frequ.patch (text/x-patch; charset=us-ascii)
From a3fb76c6a0982e7115dd0909aaccce4572bb7551 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Mon, 10 Dec 2012 12:08:30 +0100
Subject: [PATCH 12/19] wal_decodign: Log xl_running_xact's at a higher
frequency than checkpoints are done
Do so in the background writer which seems to be the best choice as it's
regularly running and shouldn't be busy for too long without getting back into
its main loop.
Also mark xl_standby records as being relevant for async commit so the wal
writer writes them out soonish.
This might also be beneficial for HS as it would make it faster to hit a spot
where no (old) transactions are running anymore.
---
src/backend/postmaster/bgwriter.c | 47 +++++++++++++++++++++++++++++++++++++++
src/backend/storage/ipc/standby.c | 22 +++++++++++++++---
src/include/storage/standby.h | 2 +-
3 files changed, 67 insertions(+), 4 deletions(-)
diff --git a/src/backend/postmaster/bgwriter.c b/src/backend/postmaster/bgwriter.c
index 286ae86..2adb36f 100644
--- a/src/backend/postmaster/bgwriter.c
+++ b/src/backend/postmaster/bgwriter.c
@@ -54,9 +54,11 @@
#include "storage/shmem.h"
#include "storage/smgr.h"
#include "storage/spin.h"
+#include "storage/standby.h"
#include "utils/guc.h"
#include "utils/memutils.h"
#include "utils/resowner.h"
+#include "utils/timestamp.h"
/*
@@ -76,6 +78,10 @@ int BgWriterDelay = 200;
static volatile sig_atomic_t got_SIGHUP = false;
static volatile sig_atomic_t shutdown_requested = false;
+static TimestampTz last_logged_snap_ts;
+static XLogRecPtr last_logged_snap_recptr = InvalidXLogRecPtr;
+static uint32 log_snap_interval_ms = 15000;
+
/* Signal handlers */
static void bg_quickdie(SIGNAL_ARGS);
@@ -142,6 +148,12 @@ BackgroundWriterMain(void)
CurrentResourceOwner = ResourceOwnerCreate(NULL, "Background Writer");
/*
+ * We just started, assume there has been either a shutdown or
+ * end-of-recovery snapshot.
+ */
+ last_logged_snap_ts = GetCurrentTimestamp();
+
+ /*
* Create a memory context that we will do all our work in. We do this so
* that we can reset the context during error recovery and thereby avoid
* possible memory leaks. Formerly this code just ran in
@@ -276,6 +288,41 @@ BackgroundWriterMain(void)
}
/*
+ * Log a new xl_running_xacts every now and then so replication can get
+ * into a consistent state faster and clean up resources more
+ * frequently. The costs of this are relatively low, so doing it 4
+ * times a minute seems fine.
+ *
+ * We assume the interval for writing xl_running_xacts is significantly
+ * bigger than BgWriterDelay, so we don't complicate the overall
+ * timeout handling but just assume we're going to get called often
+ * enough even if hibernation mode is active. It's not that important
+ * that log_snap_interval_ms is met strictly.
+ *
+ * We do this logging in the bgwriter as it's the only process that's run
+ * regularly and returns to its mainloop all the
+ * time. E.g. Checkpointer, when active, is barely ever in its
+ * mainloop.
+ */
+ if (XLogStandbyInfoActive() && !RecoveryInProgress())
+ {
+ TimestampTz timeout = 0;
+ timeout = TimestampTzPlusMilliseconds(last_logged_snap_ts,
+ log_snap_interval_ms);
+
+ /*
+ * only log if enough time has passed and some xlog record has been
+ * inserted.
+ */
+ if (GetCurrentTimestamp() >= timeout &&
+ last_logged_snap_recptr != GetXLogInsertRecPtr())
+ {
+ last_logged_snap_recptr = LogStandbySnapshot();
+ last_logged_snap_ts = GetCurrentTimestamp();
+ }
+ }
+
+ /*
* Sleep until we are signaled or BgWriterDelay has elapsed.
*
* Note: the feedback control loop in BgBufferSync() expects that we
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index a903f12..deb1850 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -42,7 +42,7 @@ static void ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlis
ProcSignalReason reason);
static void ResolveRecoveryConflictWithLock(Oid dbOid, Oid relOid);
static void SendRecoveryConflictWithBufferPin(ProcSignalReason reason);
-static void LogCurrentRunningXacts(RunningTransactions CurrRunningXacts);
+static XLogRecPtr LogCurrentRunningXacts(RunningTransactions CurrRunningXacts);
static void LogAccessExclusiveLocks(int nlocks, xl_standby_lock *locks);
@@ -846,10 +846,13 @@ standby_redo(XLogRecPtr lsn, XLogRecord *record)
* currently running xids, performed by StandbyReleaseOldLocks().
* Zero xids should no longer be possible, but we may be replaying WAL
* from a time when they were possible.
+ *
+ * Returns the RecPtr of the last inserted record.
*/
-void
+XLogRecPtr
LogStandbySnapshot(void)
{
+ XLogRecPtr recptr;
RunningTransactions running;
xl_standby_lock *locks;
int nlocks;
@@ -875,8 +878,11 @@ LogStandbySnapshot(void)
*/
running = GetRunningTransactionData();
- LogCurrentRunningXacts(running);
+ recptr = LogCurrentRunningXacts(running);
+
/* GetRunningTransactionData() acquired XidGenLock, we must release it */
LWLockRelease(XidGenLock);
+
+ return recptr;
}
/*
@@ -887,7 +893,7 @@ LogStandbySnapshot(void)
* is a contiguous chunk of memory and never exists fully until it is
* assembled in WAL.
*/
-static void
+static XLogRecPtr
LogCurrentRunningXacts(RunningTransactions CurrRunningXacts)
{
xl_running_xacts xlrec;
@@ -937,6 +943,16 @@ LogCurrentRunningXacts(RunningTransactions CurrRunningXacts)
CurrRunningXacts->oldestRunningXid,
CurrRunningXacts->latestCompletedXid,
CurrRunningXacts->nextXid);
+
+ /*
+ * Ensure running xact information is synced to disk not too far in the
+ * future, logical standby's need this soon after initialization. We don't
+ * want to stall anything though, so we let the wal writer do it during
+ * normal operation.
+ */
+ XLogSetAsyncXactLSN(recptr);
+
+ return recptr;
}
/*
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index 168c14c..ab84584 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -113,6 +113,6 @@ typedef RunningTransactionsData *RunningTransactions;
extern void LogAccessExclusiveLock(Oid dbOid, Oid relOid);
extern void LogAccessExclusiveLockPrepare(void);
-extern void LogStandbySnapshot(void);
+extern XLogRecPtr LogStandbySnapshot(void);
#endif /* STANDBY_H */
--
1.7.12.289.g0ce9864.dirty
0001-Centralize-Assert-macros-into-c.h-so-its-common-betw.patch (text/x-patch; charset=us-ascii)
From 4cec3fe09d714483e0bc05b53fc20501cffe951c Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Tue, 8 Jan 2013 17:59:10 +0100
Subject: [PATCH 01/19] Centralize Assert* macros into c.h so its common
between backend/frontend
c.h already had parts of the assert support (StaticAssert*) and it's the shared
file between postgres.h and postgres_fe.h. This makes it easier to build
frontend programs which have to do the hack.
---
src/include/c.h | 65 +++++++++++++++++++++++++++++++++++++++++++++++
src/include/postgres.h | 54 ++-------------------------------------
src/include/postgres_fe.h | 12 ---------
3 files changed, 67 insertions(+), 64 deletions(-)
diff --git a/src/include/c.h b/src/include/c.h
index 59af5b5..57664e8 100644
--- a/src/include/c.h
+++ b/src/include/c.h
@@ -694,6 +694,71 @@ typedef NameData *Name;
/*
+ * USE_ASSERT_CHECKING, if defined, turns on all the assertions.
+ * - plai 9/5/90
+ *
+ * It should _NOT_ be defined in releases or in benchmark copies
+ */
+
+/*
+ * Assert() can be used in both frontend and backend code. In frontend code it
+ * just calls the standard assert, if it's available. If use of assertions is
+ * not configured, it does nothing.
+ */
+#ifndef USE_ASSERT_CHECKING
+
+#define Assert(condition)
+#define AssertMacro(condition) ((void)true)
+#define AssertArg(condition)
+#define AssertState(condition)
+
+#elif defined FRONTEND
+
+#include <assert.h>
+#define Assert(p) assert(p)
+#define AssertMacro(p) ((void) assert(p))
+
+#else /* USE_ASSERT_CHECKING && !FRONTEND */
+
+/*
+ * Trap
+ * Generates an exception if the given condition is true.
+ */
+#define Trap(condition, errorType) \
+ do { \
+ if ((assert_enabled) && (condition)) \
+ ExceptionalCondition(CppAsString(condition), (errorType), \
+ __FILE__, __LINE__); \
+ } while (0)
+
+/*
+ * TrapMacro is the same as Trap but it's intended for use in macros:
+ *
+ * #define foo(x) (AssertMacro(x != 0), bar(x))
+ *
+ * Isn't CPP fun?
+ */
+#define TrapMacro(condition, errorType) \
+ ((bool) ((! assert_enabled) || ! (condition) || \
+ (ExceptionalCondition(CppAsString(condition), (errorType), \
+ __FILE__, __LINE__), 0)))
+
+#define Assert(condition) \
+ Trap(!(condition), "FailedAssertion")
+
+#define AssertMacro(condition) \
+ ((void) TrapMacro(!(condition), "FailedAssertion"))
+
+#define AssertArg(condition) \
+ Trap(!(condition), "BadArgument")
+
+#define AssertState(condition) \
+ Trap(!(condition), "BadState")
+
+#endif /* USE_ASSERT_CHECKING && !FRONTEND */
+
+
+/*
* Macros to support compile-time assertion checks.
*
* If the "condition" (a compile-time-constant expression) evaluates to false,
diff --git a/src/include/postgres.h b/src/include/postgres.h
index b6e922f..bbe125a 100644
--- a/src/include/postgres.h
+++ b/src/include/postgres.h
@@ -25,7 +25,7 @@
* ------- ------------------------------------------------
* 1) variable-length datatypes (TOAST support)
* 2) datum type + support macros
- * 3) exception handling definitions
+ * 3) exception handling
*
* NOTES
*
@@ -627,62 +627,12 @@ extern Datum Float8GetDatum(float8 X);
/* ----------------------------------------------------------------
- * Section 3: exception handling definitions
- * Assert, Trap, etc macros
+ * Section 3: exception handling backend support
* ----------------------------------------------------------------
*/
extern PGDLLIMPORT bool assert_enabled;
-/*
- * USE_ASSERT_CHECKING, if defined, turns on all the assertions.
- * - plai 9/5/90
- *
- * It should _NOT_ be defined in releases or in benchmark copies
- */
-
-/*
- * Trap
- * Generates an exception if the given condition is true.
- */
-#define Trap(condition, errorType) \
- do { \
- if ((assert_enabled) && (condition)) \
- ExceptionalCondition(CppAsString(condition), (errorType), \
- __FILE__, __LINE__); \
- } while (0)
-
-/*
- * TrapMacro is the same as Trap but it's intended for use in macros:
- *
- * #define foo(x) (AssertMacro(x != 0), bar(x))
- *
- * Isn't CPP fun?
- */
-#define TrapMacro(condition, errorType) \
- ((bool) ((! assert_enabled) || ! (condition) || \
- (ExceptionalCondition(CppAsString(condition), (errorType), \
- __FILE__, __LINE__), 0)))
-
-#ifndef USE_ASSERT_CHECKING
-#define Assert(condition)
-#define AssertMacro(condition) ((void)true)
-#define AssertArg(condition)
-#define AssertState(condition)
-#else
-#define Assert(condition) \
- Trap(!(condition), "FailedAssertion")
-
-#define AssertMacro(condition) \
- ((void) TrapMacro(!(condition), "FailedAssertion"))
-
-#define AssertArg(condition) \
- Trap(!(condition), "BadArgument")
-
-#define AssertState(condition) \
- Trap(!(condition), "BadState")
-#endif /* USE_ASSERT_CHECKING */
-
extern void ExceptionalCondition(const char *conditionName,
const char *errorType,
const char *fileName, int lineNumber) __attribute__((noreturn));
diff --git a/src/include/postgres_fe.h b/src/include/postgres_fe.h
index af31227..0f35ecc 100644
--- a/src/include/postgres_fe.h
+++ b/src/include/postgres_fe.h
@@ -24,16 +24,4 @@
#include "c.h"
-/*
- * Assert() can be used in both frontend and backend code. In frontend code it
- * just calls the standard assert, if it's available. If use of assertions is
- * not configured, it does nothing.
- */
-#ifdef USE_ASSERT_CHECKING
-#include <assert.h>
-#define Assert(p) assert(p)
-#else
-#define Assert(p)
-#endif
-
#endif /* POSTGRES_FE_H */
--
1.7.12.289.g0ce9864.dirty
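
For readers who want to see the three-way dispatch from patch 01 in isolation, here is a minimal standalone sketch. It is not the c.h code: every DEMO_/Demo/demo_ name is made up for illustration, no PostgreSQL headers are used, and the backend branch calls a stub that counts failures and returns, whereas the real ExceptionalCondition() does not return.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical switches standing in for USE_ASSERT_CHECKING and FRONTEND.
 * Comment out the first to compile assertions away entirely; define the
 * second to route assertions to the C library's assert().
 */
#define DEMO_USE_ASSERT_CHECKING 1
/* #define DEMO_FRONTEND 1 */

static int demo_failed_assertions = 0;

/* Stub for illustration only: reports the failure and keeps running. */
static void
demo_exceptional_condition(const char *condition, const char *error_type,
                           const char *file, int line)
{
    fprintf(stderr, "TRAP: %s(\"%s\", File: \"%s\", Line: %d)\n",
            error_type, condition, file, line);
    demo_failed_assertions++;
}

#ifndef DEMO_USE_ASSERT_CHECKING

/* Assertions disabled: the macro expands to nothing. */
#define DemoAssert(condition)

#elif defined DEMO_FRONTEND

/* Frontend build: fall back to the standard assert(). */
#define DemoAssert(p) assert(p)

#else /* DEMO_USE_ASSERT_CHECKING && !DEMO_FRONTEND */

/* Backend-style build: report through our own handler. */
#define DemoAssert(condition) \
    do { \
        if (!(condition)) \
            demo_exceptional_condition(#condition, "FailedAssertion", \
                                       __FILE__, __LINE__); \
    } while (0)

#endif

/* Example caller: asserts that its argument is non-negative. */
static int
demo_check_range(int value)
{
    DemoAssert(value >= 0);
    return value;
}
```

With the switches as written, calling demo_check_range() with a negative value goes through the stub handler and bumps the counter; with DEMO_FRONTEND defined the same call would abort via assert() instead.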
0004-Add-pg_xlogdump-contrib-module.patch (text/x-patch; charset=us-ascii)
From c414374faf290d6216dc5fb166b800b08b196fd2 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Tue, 8 Jan 2013 18:27:12 +0100
Subject: [PATCH 04/19] Add pg_xlogdump contrib module
Authors: Andres Freund, Heikki Linnakangas
---
contrib/Makefile | 1 +
contrib/pg_xlogdump/Makefile | 37 +++
contrib/pg_xlogdump/compat.c | 58 ++++
contrib/pg_xlogdump/pg_xlogdump.c | 656 ++++++++++++++++++++++++++++++++++++++
contrib/pg_xlogdump/tables.c | 78 +++++
doc/src/sgml/ref/allfiles.sgml | 1 +
doc/src/sgml/ref/pg_xlogdump.sgml | 76 +++++
doc/src/sgml/reference.sgml | 1 +
src/backend/access/transam/rmgr.c | 1 +
src/backend/catalog/catalog.c | 2 +
src/tools/msvc/Mkvcbuild.pm | 16 +-
11 files changed, 926 insertions(+), 1 deletion(-)
create mode 100644 contrib/pg_xlogdump/Makefile
create mode 100644 contrib/pg_xlogdump/compat.c
create mode 100644 contrib/pg_xlogdump/pg_xlogdump.c
create mode 100644 contrib/pg_xlogdump/tables.c
create mode 100644 doc/src/sgml/ref/pg_xlogdump.sgml
diff --git a/contrib/Makefile b/contrib/Makefile
index fcd7c1e..5d290b8 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -39,6 +39,7 @@ SUBDIRS = \
pg_trgm \
pg_upgrade \
pg_upgrade_support \
+ pg_xlogdump \
pgbench \
pgcrypto \
pgrowlocks \
diff --git a/contrib/pg_xlogdump/Makefile b/contrib/pg_xlogdump/Makefile
new file mode 100644
index 0000000..1adef35
--- /dev/null
+++ b/contrib/pg_xlogdump/Makefile
@@ -0,0 +1,37 @@
+# contrib/pg_xlogdump/Makefile
+
+PGFILEDESC = "pg_xlogdump"
+PGAPPICON=win32
+
+PROGRAM = pg_xlogdump
+OBJS = pg_xlogdump.o compat.o tables.o xlogreader.o $(RMGRDESCOBJS) \
+ $(WIN32RES)
+
+# XXX: Perhaps this should be done by a wildcard rule so that you don't need
+# to remember to add new rmgrdesc files to this list.
+RMGRDESCSOURCES = clogdesc.c dbasedesc.c gindesc.c gistdesc.c hashdesc.c \
+ heapdesc.c mxactdesc.c nbtdesc.c relmapdesc.c seqdesc.c smgrdesc.c \
+ spgdesc.c standbydesc.c tblspcdesc.c xactdesc.c xlogdesc.c
+
+RMGRDESCOBJS = $(patsubst %.c,%.o,$(RMGRDESCSOURCES))
+
+EXTRA_CLEAN = $(RMGRDESCSOURCES) xlogreader.c
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/pg_xlogdump
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
+
+override CPPFLAGS := -DFRONTEND $(CPPFLAGS)
+
+xlogreader.c: % : $(top_srcdir)/src/backend/access/transam/%
+ rm -f $@ && $(LN_S) $< .
+
+$(RMGRDESCSOURCES): % : $(top_srcdir)/src/backend/access/rmgrdesc/%
+ rm -f $@ && $(LN_S) $< .
diff --git a/contrib/pg_xlogdump/compat.c b/contrib/pg_xlogdump/compat.c
new file mode 100644
index 0000000..80a83f6
--- /dev/null
+++ b/contrib/pg_xlogdump/compat.c
@@ -0,0 +1,58 @@
+/*-------------------------------------------------------------------------
+ *
+ * compat.c
+ * Reimplementations of various backend functions.
+ *
+ * Portions Copyright (c) 2012, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * contrib/pg_xlogdump/compat.c
+ *
+ * This file contains client-side implementations for various backend
+ * functions that the rm_desc functions in *desc.c files rely on.
+ *
+ *-------------------------------------------------------------------------
+ */
+
+/* ugly hack, same as in e.g pg_controldata */
+#define FRONTEND 1
+#include "postgres.h"
+
+#include "catalog/catalog.h"
+#include "datatype/timestamp.h"
+#include "lib/stringinfo.h"
+#include "storage/relfilenode.h"
+#include "utils/timestamp.h"
+#include "utils/datetime.h"
+
+const char *
+timestamptz_to_str(TimestampTz dt)
+{
+ return "unimplemented-timestamp";
+}
+
+char *
+relpathbackend(RelFileNode rnode, BackendId backend, ForkNumber forknum)
+{
+ return pstrdup("unimplemented-relpathbackend");
+}
+
+/*
+ * Provide a hacked up compat layer for StringInfos so xlog desc functions can
+ * be linked/called.
+ */
+void
+appendStringInfo(StringInfo str, const char *fmt, ...)
+{
+ va_list args;
+
+ va_start(args, fmt);
+ vprintf(fmt, args);
+ va_end(args);
+}
+
+void
+appendStringInfoString(StringInfo str, const char *string)
+{
+ appendStringInfo(str, "%s", string);
+}
diff --git a/contrib/pg_xlogdump/pg_xlogdump.c b/contrib/pg_xlogdump/pg_xlogdump.c
new file mode 100644
index 0000000..e68058f
--- /dev/null
+++ b/contrib/pg_xlogdump/pg_xlogdump.c
@@ -0,0 +1,656 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_xlogdump.c - decode and display WAL
+ *
+ * Copyright (c) 2012, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * contrib/pg_xlogdump/pg_xlogdump.c
+ *-------------------------------------------------------------------------
+ */
+
+/* ugly hack, same as in e.g pg_controldata */
+#define FRONTEND 1
+#include "postgres.h"
+
+#include <unistd.h>
+#include <sys/types.h>
+#include <dirent.h>
+
+#include "access/xlog.h"
+#include "access/xlogreader.h"
+#include "access/rmgr.h"
+#include "access/transam.h"
+
+#include "catalog/catalog.h"
+
+#include "getopt_long.h"
+
+static const char *progname;
+
+typedef struct XLogDumpPrivateData
+{
+ TimeLineID timeline;
+ char *inpath;
+ XLogRecPtr startptr;
+ XLogRecPtr endptr;
+
+ /* display options */
+ bool bkp_details;
+ int stop_after_records;
+ int already_displayed_records;
+
+ /* filter options */
+ int filter_by_rmgr;
+ TransactionId filter_by_xid;
+} XLogDumpPrivateData;
+
+static void fatal_error(const char *fmt, ...)
+__attribute__((format(PG_PRINTF_ATTRIBUTE, 1, 2)));
+
+static void fatal_error(const char *fmt, ...)
+{
+ va_list args;
+ fflush(stdout);
+
+ fprintf(stderr, "%s: fatal_error: ", progname);
+ va_start(args, fmt);
+ vfprintf(stderr, fmt, args);
+ va_end(args);
+ fputc('\n', stderr);
+ exit(EXIT_FAILURE);
+}
+
+/*
+ * Check whether the directory exists and whether we can open it. Leave errno
+ * set so the caller can report the error.
+ */
+static bool
+verify_directory(const char *directory)
+{
+ DIR *dir = opendir(directory);
+ if (dir == NULL)
+ return false;
+ closedir(dir);
+ return true;
+}
+
+static void
+split_path(const char *path, char **dir, char **fname)
+{
+ char *sep;
+
+ /* split filepath into directory & filename */
+ sep = strrchr(path, '/');
+
+ /* directory path */
+ if (sep != NULL)
+ {
+ /* windows doesn't have strndup */
+ *dir = strdup(path);
+ (*dir)[(sep - path) + 1] = '\0';
+ *fname = strdup(sep + 1);
+ }
+ /* local directory */
+ else
+ {
+ *dir = NULL;
+ *fname = strdup(path);
+ }
+}
+
+/*
+ * Try to find the file in several places:
+ * if directory == NULL:
+ * fname
+ * XLOGDIR / fname
+ * $PGDATA / XLOGDIR / fname
+ * else
+ * directory / fname
+ * directory / XLOGDIR / fname
+ *
+ * return a read only fd
+ */
+static int
+fuzzy_open_file(const char *directory, const char *fname)
+{
+ int fd = -1;
+ char fpath[MAXPGPATH];
+
+ if (directory == NULL)
+ {
+ const char* datadir;
+
+ /* fname */
+ fd = open(fname, O_RDONLY | PG_BINARY, 0);
+ if (fd < 0 && errno != ENOENT)
+ return -1;
+ else if (fd >= 0)
+ return fd;
+
+ /* XLOGDIR / fname */
+ snprintf(fpath, MAXPGPATH, "%s/%s",
+ XLOGDIR, fname);
+ fd = open(fpath, O_RDONLY | PG_BINARY, 0);
+ if (fd < 0 && errno != ENOENT)
+ return -1;
+ else if (fd >= 0)
+ return fd;
+
+ datadir = getenv("PGDATA");
+ /* $PGDATA / XLOGDIR / fname */
+ if (datadir != NULL)
+ {
+ snprintf(fpath, MAXPGPATH, "%s/%s/%s",
+ datadir, XLOGDIR, fname);
+ fd = open(fpath, O_RDONLY | PG_BINARY, 0);
+ if (fd < 0 && errno != ENOENT)
+ return -1;
+ else if (fd >= 0)
+ return fd;
+ }
+ }
+ else
+ {
+ /* directory / fname */
+ snprintf(fpath, MAXPGPATH, "%s/%s",
+ directory, fname);
+ fd = open(fpath, O_RDONLY | PG_BINARY, 0);
+ if (fd < 0 && errno != ENOENT)
+ return -1;
+ else if (fd >= 0)
+ return fd;
+
+ /* directory / XLOGDIR / fname */
+ snprintf(fpath, MAXPGPATH, "%s/%s/%s",
+ directory, XLOGDIR, fname);
+ fd = open(fpath, O_RDONLY | PG_BINARY, 0);
+ if (fd < 0 && errno != ENOENT)
+ return -1;
+ else if (fd >= 0)
+ return fd;
+ }
+ return -1;
+}
+
+/* this should probably be put in a general implementation */
+static void
+XLogDumpXLogRead(const char *directory, TimeLineID timeline_id,
+ XLogRecPtr startptr, char *buf, Size count)
+{
+ char *p;
+ XLogRecPtr recptr;
+ Size nbytes;
+
+ static int sendFile = -1;
+ static XLogSegNo sendSegNo = 0;
+ static uint32 sendOff = 0;
+
+ p = buf;
+ recptr = startptr;
+ nbytes = count;
+
+ while (nbytes > 0)
+ {
+ uint32 startoff;
+ int segbytes;
+ int readbytes;
+
+ startoff = recptr % XLogSegSize;
+
+ if (sendFile < 0 || !XLByteInSeg(recptr, sendSegNo))
+ {
+ char fname[MAXFNAMELEN];
+
+ /* Switch to another logfile segment */
+ if (sendFile >= 0)
+ close(sendFile);
+
+ XLByteToSeg(recptr, sendSegNo);
+
+ XLogFileName(fname, timeline_id, sendSegNo);
+
+ sendFile = fuzzy_open_file(directory, fname);
+
+ if (sendFile < 0)
+ fatal_error("could not find file \"%s\": %s",
+ fname, strerror(errno));
+ sendOff = 0;
+ }
+
+ /* Need to seek in the file? */
+ if (sendOff != startoff)
+ {
+ if (lseek(sendFile, (off_t) startoff, SEEK_SET) < 0)
+ {
+ int err = errno;
+ char fname[MAXPGPATH];
+ XLogFileName(fname, timeline_id, sendSegNo);
+
+ fatal_error("could not seek in log segment %s to offset %u: %s",
+ fname, startoff, strerror(err));
+ }
+ sendOff = startoff;
+ }
+
+ /* How many bytes are within this segment? */
+ if (nbytes > (XLogSegSize - startoff))
+ segbytes = XLogSegSize - startoff;
+ else
+ segbytes = nbytes;
+
+ readbytes = read(sendFile, p, segbytes);
+ if (readbytes <= 0)
+ {
+ int err = errno;
+ char fname[MAXPGPATH];
+ XLogFileName(fname, timeline_id, sendSegNo);
+
+ fatal_error("could not read from log segment %s, offset %u, length %d: %s",
+ fname, sendOff, segbytes, strerror(err));
+ }
+
+ /* Update state for read */
+ recptr += readbytes;
+
+ sendOff += readbytes;
+ nbytes -= readbytes;
+ p += readbytes;
+ }
+}
+
+static int
+XLogDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
+ char *readBuff, TimeLineID *curFileTLI)
+{
+ XLogDumpPrivateData *private = state->private_data;
+ int count = XLOG_BLCKSZ;
+
+ if (private->endptr != InvalidXLogRecPtr)
+ {
+ if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
+ count = XLOG_BLCKSZ;
+ else if (targetPagePtr + reqLen <= private->endptr)
+ count = private->endptr - targetPagePtr;
+ else
+ return -1;
+ }
+
+ XLogDumpXLogRead(private->inpath, private->timeline, targetPagePtr,
+ readBuff, count);
+
+ return count;
+}
+
+static void
+XLogDumpDisplayRecord(XLogReaderState *state, XLogRecord *record)
+{
+ XLogDumpPrivateData *config = (XLogDumpPrivateData *) state->private_data;
+ const RmgrData *rmgr = &RmgrTable[record->xl_rmid];
+
+ if (config->filter_by_rmgr != -1 &&
+ config->filter_by_rmgr != record->xl_rmid)
+ return;
+
+ if (TransactionIdIsValid(config->filter_by_xid) &&
+ config->filter_by_xid != record->xl_xid)
+ return;
+
+ config->already_displayed_records++;
+
+ printf("xlog record: rmgr: %-11s, record_len: %6u, tot_len: %6u, tx: %10u, lsn: %X/%08X, prev %X/%08X, bkp: %u%u%u%u, desc:",
+ rmgr->rm_name,
+ record->xl_len, record->xl_tot_len,
+ record->xl_xid,
+ (uint32) (state->ReadRecPtr >> 32), (uint32) state->ReadRecPtr,
+ (uint32) (record->xl_prev >> 32), (uint32) record->xl_prev,
+ !!(XLR_BKP_BLOCK(0) & record->xl_info),
+ !!(XLR_BKP_BLOCK(1) & record->xl_info),
+ !!(XLR_BKP_BLOCK(2) & record->xl_info),
+ !!(XLR_BKP_BLOCK(3) & record->xl_info));
+
+ /* the desc routine will printf the description directly to stdout */
+ rmgr->rm_desc(NULL, record->xl_info, XLogRecGetData(record));
+
+ putchar('\n');
+
+ if (config->bkp_details)
+ {
+ int off;
+ char *blk = (char *) XLogRecGetData(record) + record->xl_len;
+
+ for (off = 0; off < XLR_MAX_BKP_BLOCKS; off++)
+ {
+ BkpBlock bkpb;
+
+ if (!(XLR_BKP_BLOCK(off) & record->xl_info))
+ continue;
+
+ memcpy(&bkpb, blk, sizeof(BkpBlock));
+ blk += sizeof(BkpBlock);
+ blk += BLCKSZ - bkpb.hole_length;
+
+ printf("\tbackup bkp #%u; rel %u/%u/%u; fork: %s; block: %u; hole: offset: %u, length: %u\n",
+ off, bkpb.node.spcNode, bkpb.node.dbNode, bkpb.node.relNode,
+ forkNames[bkpb.fork], bkpb.block, bkpb.hole_offset, bkpb.hole_length);
+ }
+ }
+}
+
+static void
+usage(void)
+{
+ printf("%s: reads and displays postgres transaction logs for debugging.\n\n",
+ progname);
+ printf("Usage:\n");
+ printf(" %s [OPTION] [STARTSEG [ENDSEG]] \n", progname);
+ printf("\nOptions:\n");
+ printf(" -b, --bkp-details output detailed information about backup blocks\n");
+ printf(" -e, --end RECPTR read wal up to RECPTR\n");
+ printf(" -?, --help show this help, then exit\n");
+ printf(" -n, --limit RECORDS only display n records, abort afterwards\n");
+ printf(" -p, --path PATH from where do we want to read? cwd/pg_xlog is the default\n");
+ printf(" -r, --rmgr RMGR only show records generated by the rmgr RMGR\n");
+ printf(" -s, --start RECPTR read wal in directory indicated by -p starting at RECPTR\n");
+ printf(" -t, --timeline TLI which timeline do we want to read, defaults to 1\n");
+ printf(" -V, --version output version information, then exit\n");
+ printf(" -x, --xid XID only show records with transactionid XID\n");
+}
+
+int
+main(int argc, char **argv)
+{
+ uint32 xlogid;
+ uint32 xrecoff;
+ XLogReaderState *xlogreader_state;
+ XLogDumpPrivateData private;
+ XLogRecord *record;
+ XLogRecPtr first_record;
+ char *errormsg;
+
+ static struct option long_options[] = {
+ {"bkp-details", no_argument, NULL, 'b'},
+ {"end", required_argument, NULL, 'e'},
+ {"help", no_argument, NULL, '?'},
+ {"limit", required_argument, NULL, 'n'},
+ {"path", required_argument, NULL, 'p'},
+ {"rmgr", required_argument, NULL, 'r'},
+ {"start", required_argument, NULL, 's'},
+ {"timeline", required_argument, NULL, 't'},
+ {"xid", required_argument, NULL, 'x'},
+ {"version", no_argument, NULL, 'V'},
+ {NULL, 0, NULL, 0}
+ };
+
+ int option;
+ int optindex = 0;
+
+ progname = get_progname(argv[0]);
+
+ memset(&private, 0, sizeof(XLogDumpPrivateData));
+
+ private.timeline = 1;
+ private.bkp_details = false;
+ private.startptr = InvalidXLogRecPtr;
+ private.endptr = InvalidXLogRecPtr;
+ private.stop_after_records = -1;
+ private.already_displayed_records = 0;
+ private.filter_by_rmgr = -1;
+ private.filter_by_xid = InvalidTransactionId;
+
+ if (argc <= 1)
+ {
+ fprintf(stderr, "%s: no arguments specified\n", progname);
+ goto bad_argument;
+ }
+
+ while ((option = getopt_long(argc, argv, "be:?n:p:r:s:t:Vx:",
+ long_options, &optindex)) != -1)
+ {
+ switch (option)
+ {
+ case 'b':
+ private.bkp_details = true;
+ break;
+ case 'e':
+ if (sscanf(optarg, "%X/%X", &xlogid, &xrecoff) != 2)
+ {
+ fprintf(stderr, "%s: could not parse --end %s\n",
+ progname, optarg);
+ goto bad_argument;
+ }
+ private.endptr = (uint64)xlogid << 32 | xrecoff;
+ break;
+ case '?':
+ usage();
+ exit(EXIT_SUCCESS);
+ break;
+ case 'n':
+ if (sscanf(optarg, "%d", &private.stop_after_records) != 1)
+ {
+ fprintf(stderr, "%s: could not parse --limit %s\n",
+ progname, optarg);
+ goto bad_argument;
+ }
+ break;
+ case 'p':
+ private.inpath = strdup(optarg);
+ break;
+ case 'r':
+ {
+ int i;
+ for (i = 0; i <= RM_MAX_ID; i++)
+ {
+ if (strcmp(optarg, RmgrTable[i].rm_name) == 0)
+ {
+ private.filter_by_rmgr = i;
+ break;
+ }
+ }
+
+ if (private.filter_by_rmgr == -1)
+ {
+ fprintf(stderr, "%s: --rmgr %s does not exist\n",
+ progname, optarg);
+ goto bad_argument;
+ }
+ }
+ break;
+ case 's':
+ if (sscanf(optarg, "%X/%X", &xlogid, &xrecoff) != 2)
+ {
+ fprintf(stderr, "%s: could not parse --start %s\n",
+ progname, optarg);
+ goto bad_argument;
+ }
+ else
+ private.startptr = (uint64)xlogid << 32 | xrecoff;
+ break;
+ case 't':
+ if (sscanf(optarg, "%u", &private.timeline) != 1)
+ {
+ fprintf(stderr, "%s: could not parse --timeline %s\n",
+ progname, optarg);
+ goto bad_argument;
+ }
+ break;
+ case 'V':
+ puts("pg_xlogdump (PostgreSQL) " PG_VERSION);
+ exit(EXIT_SUCCESS);
+ break;
+ case 'x':
+ if (sscanf(optarg, "%u", &private.filter_by_xid) != 1)
+ {
+ fprintf(stderr, "%s: could not parse --xid %s as a valid xid\n",
+ progname, optarg);
+ goto bad_argument;
+ }
+ break;
+ default:
+ goto bad_argument;
+ }
+ }
+
+ if ((optind + 2) < argc)
+ {
+ fprintf(stderr,
+ "%s: too many command-line arguments (first is \"%s\")\n",
+ progname, argv[optind + 2]);
+ goto bad_argument;
+ }
+
+ if (private.inpath != NULL)
+ {
+ /* validate that path points to a directory */
+ if (!verify_directory(private.inpath))
+ {
+ fprintf(stderr,
+ "%s: --path %s cannot be opened: %s\n",
+ progname, private.inpath, strerror(errno));
+ goto bad_argument;
+ }
+ }
+
+ /* parse files as start/end boundaries, extract path if not specified */
+ if (optind < argc)
+ {
+ char *directory = NULL;
+ char *fname = NULL;
+ int fd;
+ XLogSegNo segno;
+
+ split_path(argv[optind], &directory, &fname);
+
+ if (private.inpath == NULL && directory != NULL)
+ {
+ private.inpath = directory;
+
+ if (!verify_directory(private.inpath))
+ fatal_error("cannot open directory %s: %s",
+ private.inpath, strerror(errno));
+ }
+
+ fd = fuzzy_open_file(private.inpath, fname);
+ if (fd < 0)
+ fatal_error("could not open file %s", fname);
+ close(fd);
+
+ /* parse position from file */
+ XLogFromFileName(fname, &private.timeline, &segno);
+
+ if (XLogRecPtrIsInvalid(private.startptr))
+ XLogSegNoOffsetToRecPtr(segno, 0, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno))
+ {
+ fprintf(stderr,
+ "%s: --start %X/%X is not inside file \"%s\"\n",
+ progname,
+ (uint32)(private.startptr >> 32),
+ (uint32)private.startptr,
+ fname);
+ goto bad_argument;
+ }
+
+ /* no second file specified, set end position */
+ if (!(optind + 1 < argc) && XLogRecPtrIsInvalid(private.endptr))
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, private.endptr);
+
+ /* parse ENDSEG if passed */
+ if (optind + 1 < argc)
+ {
+ XLogSegNo endsegno;
+
+ /* ignore directory, already have that */
+ split_path(argv[optind + 1], &directory, &fname);
+
+ fd = fuzzy_open_file(private.inpath, fname);
+ if (fd < 0)
+ fatal_error("could not open file %s", fname);
+ close(fd);
+
+ /* parse position from file */
+ XLogFromFileName(fname, &private.timeline, &endsegno);
+
+ if (endsegno < segno)
+ fatal_error("ENDSEG %s is before STARTSEG %s",
+ argv[optind + 1], argv[optind]);
+
+ if (XLogRecPtrIsInvalid(private.endptr))
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.endptr);
+
+ /* set segno to endsegno for check of --end */
+ segno = endsegno;
+ }
+
+
+ if (!XLByteInSeg(private.endptr, segno) &&
+ private.endptr != (segno + 1) * XLogSegSize)
+ {
+ fprintf(stderr,
+ "%s: --end %X/%X is not inside file \"%s\"\n",
+ progname,
+ (uint32)(private.endptr >> 32),
+ (uint32)private.endptr,
+ argv[argc -1]);
+ goto bad_argument;
+ }
+ }
+
+ /* we don't know what to print */
+ if (XLogRecPtrIsInvalid(private.startptr))
+ {
+ fprintf(stderr, "%s: no --start given in range mode.\n", progname);
+ goto bad_argument;
+ }
+
+ /* done with argument parsing, do the actual work */
+
+ /* we have everything we need, start reading */
+ xlogreader_state = XLogReaderAllocate(private.startptr,
+ XLogDumpReadPage,
+ &private);
+
+ /* first find a valid recptr to start from */
+ first_record = XLogFindNextRecord(xlogreader_state, private.startptr);
+
+ if (first_record == InvalidXLogRecPtr)
+ fatal_error("could not find a valid record after %X/%X",
+ (uint32) (private.startptr >> 32),
+ (uint32) private.startptr);
+
+ /*
+ * Display a message that we're skipping data if the start pointer wasn't a
+ * pointer to the start of a record and also wasn't a pointer to the beginning
+ * of a segment (e.g. we were used in file mode).
+ */
+ if (first_record != private.startptr && (private.startptr % XLogSegSize) != 0)
+ printf("first record is after %X/%X, at %X/%X, skipping over %u bytes\n",
+ (uint32) (private.startptr >> 32), (uint32) private.startptr,
+ (uint32) (first_record >> 32), (uint32) first_record,
+ (uint32) (first_record - private.startptr));
+
+ while ((record = XLogReadRecord(xlogreader_state, first_record, &errormsg)))
+ {
+ /* continue after the last record */
+ first_record = InvalidXLogRecPtr;
+ XLogDumpDisplayRecord(xlogreader_state, record);
+
+ /* check whether we printed enough */
+ if (private.stop_after_records > 0 &&
+ private.already_displayed_records >= private.stop_after_records)
+ break;
+ }
+
+ if (errormsg)
+ fatal_error("error in WAL record at %X/%X: %s",
+ (uint32)(xlogreader_state->ReadRecPtr >> 32),
+ (uint32)xlogreader_state->ReadRecPtr,
+ errormsg);
+
+ XLogReaderFree(xlogreader_state);
+
+ return EXIT_SUCCESS;
+bad_argument:
+ fprintf(stderr, "Try \"%s --help\" for more information.\n", progname);
+ return EXIT_FAILURE;
+}
diff --git a/contrib/pg_xlogdump/tables.c b/contrib/pg_xlogdump/tables.c
new file mode 100644
index 0000000..e947e0d
--- /dev/null
+++ b/contrib/pg_xlogdump/tables.c
@@ -0,0 +1,78 @@
+/*-------------------------------------------------------------------------
+ *
+ * tables.c
+ * Support data for pg_xlogdump.c
+ *
+ * Portions Copyright (c) 2012, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * contrib/pg_xlogdump/tables.c
+ *
+ * NOTES
+ *
+ *-------------------------------------------------------------------------
+ */
+
+/*
+ * rmgr.c
+ *
+ * Resource managers definition
+ *
+ * src/backend/access/transam/rmgr.c
+ */
+#include "postgres.h"
+
+#include "access/clog.h"
+#include "access/gin.h"
+#include "access/gist_private.h"
+#include "access/hash.h"
+#include "access/heapam_xlog.h"
+#include "access/multixact.h"
+#include "access/nbtree.h"
+#include "access/spgist.h"
+#include "access/xact.h"
+#include "access/xlog_internal.h"
+#include "catalog/storage_xlog.h"
+#include "commands/dbcommands.h"
+#include "commands/sequence.h"
+#include "commands/tablespace.h"
+#include "storage/standby.h"
+#include "utils/relmapper.h"
+#include "catalog/catalog.h"
+
+/*
+ * Table of fork names.
+ *
+ * needs to be synced with src/backend/catalog/catalog.c
+ */
+const char *forkNames[] = {
+ "main", /* MAIN_FORKNUM */
+ "fsm", /* FSM_FORKNUM */
+ "vm", /* VISIBILITYMAP_FORKNUM */
+ "init" /* INIT_FORKNUM */
+};
+
+/*
+ * RmgrTable linked only to functions available outside of the backend.
+ *
+ * needs to be synced with src/backend/access/transam/rmgr.c
+ */
+const RmgrData RmgrTable[RM_MAX_ID + 1] = {
+ {"XLOG", NULL, xlog_desc, NULL, NULL, NULL},
+ {"Transaction", NULL, xact_desc, NULL, NULL, NULL},
+ {"Storage", NULL, smgr_desc, NULL, NULL, NULL},
+ {"CLOG", NULL, clog_desc, NULL, NULL, NULL},
+ {"Database", NULL, dbase_desc, NULL, NULL, NULL},
+ {"Tablespace", NULL, tblspc_desc, NULL, NULL, NULL},
+ {"MultiXact", NULL, multixact_desc, NULL, NULL, NULL},
+ {"RelMap", NULL, relmap_desc, NULL, NULL, NULL},
+ {"Standby", NULL, standby_desc, NULL, NULL, NULL},
+ {"Heap2", NULL, heap2_desc, NULL, NULL, NULL},
+ {"Heap", NULL, heap_desc, NULL, NULL, NULL},
+ {"Btree", NULL, btree_desc, NULL, NULL, NULL},
+ {"Hash", NULL, hash_desc, NULL, NULL, NULL},
+ {"Gin", NULL, gin_desc, NULL, NULL, NULL},
+ {"Gist", NULL, gist_desc, NULL, NULL, NULL},
+ {"Sequence", NULL, seq_desc, NULL, NULL, NULL},
+ {"SPGist", NULL, spg_desc, NULL, NULL, NULL}
+};
diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml
index df84054..49cb7ac 100644
--- a/doc/src/sgml/ref/allfiles.sgml
+++ b/doc/src/sgml/ref/allfiles.sgml
@@ -178,6 +178,7 @@ Complete list of usable sgml source files in this directory.
<!ENTITY pgReceivexlog SYSTEM "pg_receivexlog.sgml">
<!ENTITY pgResetxlog SYSTEM "pg_resetxlog.sgml">
<!ENTITY pgRestore SYSTEM "pg_restore.sgml">
+<!ENTITY pgXlogdump SYSTEM "pg_xlogdump.sgml">
<!ENTITY postgres SYSTEM "postgres-ref.sgml">
<!ENTITY postmaster SYSTEM "postmaster.sgml">
<!ENTITY psqlRef SYSTEM "psql-ref.sgml">
diff --git a/doc/src/sgml/ref/pg_xlogdump.sgml b/doc/src/sgml/ref/pg_xlogdump.sgml
new file mode 100644
index 0000000..7a27c7b
--- /dev/null
+++ b/doc/src/sgml/ref/pg_xlogdump.sgml
@@ -0,0 +1,76 @@
+<!--
+doc/src/sgml/ref/pg_xlogdump.sgml
+PostgreSQL documentation
+-->
+
+<refentry id="APP-PGXLOGDUMP">
+ <refmeta>
+ <refentrytitle><application>pg_xlogdump</application></refentrytitle>
+ <manvolnum>1</manvolnum>
+ <refmiscinfo>Application</refmiscinfo>
+ </refmeta>
+
+ <refnamediv>
+ <refname>pg_xlogdump</refname>
+ <refpurpose>Display the write-ahead log of a <productname>PostgreSQL</productname> database cluster</refpurpose>
+ </refnamediv>
+
+ <indexterm zone="app-pgxlogdump">
+ <primary>pg_xlogdump</primary>
+ </indexterm>
+
+ <refsynopsisdiv>
+ <cmdsynopsis>
+ <command>pg_xlogdump</command>
+ <arg choice="opt"><option>-b</option></arg>
+ <arg choice="opt"><option>-e</option> <replaceable class="parameter">xlogrecptr</replaceable></arg>
+ <arg choice="opt"><option>-n</option> <replaceable class="parameter">limit</replaceable></arg>
+ <arg choice="opt"><option>-p</option> <replaceable class="parameter">directory</replaceable></arg>
+ <arg choice="opt"><option>-r</option> <replaceable class="parameter">rmgr</replaceable></arg>
+ <arg choice="opt"><option>-s</option> <replaceable class="parameter">xlogrecptr</replaceable></arg>
+ <arg choice="opt"><option>-t</option> <replaceable class="parameter">timelineid</replaceable></arg>
+ <arg choice="opt"><option>-x</option> <replaceable class="parameter">xid</replaceable></arg>
+ </cmdsynopsis>
+ </refsynopsisdiv>
+
+ <refsect1 id="R1-APP-PGXLOGDUMP-1">
+ <title>Description</title>
+ <para>
+ <command>pg_xlogdump</command> displays the write-ahead log (WAL) and is only
+ useful for debugging or educational purposes.
+ </para>
+
+ <para>
+ This utility can only be run by the user who installed the server, because
+ it requires read access to the data directory. It does not perform any
+ modifications.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Options</title>
+
+ <para>
+ The following command-line options control the location and format of the
+ output.
+
+ <variablelist>
+ <varlistentry>
+ <term><option>-p <replaceable class="parameter">directory</replaceable></option></term>
+ <listitem>
+ <para>
+ Directory to find xlog files in.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Notes</title>
+ <para>
+ Can give wrong results when the server is running.
+ </para>
+ </refsect1>
+</refentry>
diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml
index 0872168..fed1fdd 100644
--- a/doc/src/sgml/reference.sgml
+++ b/doc/src/sgml/reference.sgml
@@ -225,6 +225,7 @@
&pgDumpall;
&pgReceivexlog;
&pgRestore;
+ &pgXlogdump;
&psqlRef;
&reindexdb;
&vacuumdb;
diff --git a/src/backend/access/transam/rmgr.c b/src/backend/access/transam/rmgr.c
index cc210a7..4e94af1 100644
--- a/src/backend/access/transam/rmgr.c
+++ b/src/backend/access/transam/rmgr.c
@@ -24,6 +24,7 @@
#include "storage/standby.h"
#include "utils/relmapper.h"
+/* Also update contrib/pg_xlogdump/tables.c if you add something here. */
const RmgrData RmgrTable[RM_MAX_ID + 1] = {
{"XLOG", xlog_redo, xlog_desc, NULL, NULL, NULL},
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 9686486..04e0139 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -52,6 +52,8 @@
* If you add a new entry, remember to update the errhint below, and the
* documentation for pg_relation_size(). Also keep FORKNAMECHARS above
* up-to-date.
+ *
+ * Also update contrib/pg_xlogdump/tables.c if you add something here.
*/
const char *forkNames[] = {
"main", /* MAIN_FORKNUM */
diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index 47af367..7307af5 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -41,7 +41,7 @@ my $contrib_extraincludes =
my $contrib_extrasource = {
'cube' => [ 'cubescan.l', 'cubeparse.y' ],
'seg' => [ 'segscan.l', 'segparse.y' ] };
-my @contrib_excludes = ('pgcrypto', 'intagg', 'sepgsql');
+my @contrib_excludes = ('intagg', 'pgcrypto', 'pg_xlogdump', 'sepgsql');
sub mkvcbuild
{
@@ -411,6 +411,20 @@ sub mkvcbuild
'localtime.c');
$zic->AddReference($libpgport);
+ my $pgxlogdump = $solution->AddProject('pg_xlogdump', 'exe', 'contrib');
+ $pgxlogdump->{name} = 'pg_xlogdump';
+ $pgxlogdump->AddIncludeDir('src\backend');
+ $pgxlogdump->AddFiles('contrib\pg_xlogdump',
+ 'compat.c', 'pg_xlogdump.c', 'tables.c');
+ $pgxlogdump->AddFile('src\backend\access\transam\xlogreader.c');
+ $pgxlogdump->AddFiles('src\backend\access\rmgrdesc',
+ 'clogdesc.c', 'dbasedesc.c', 'gindesc.c', 'gistdesc.c', 'hashdesc.c',
+ 'heapdesc.c', 'mxactdesc.c', 'nbtdesc.c', 'relmapdesc.c', 'seqdesc.c',
+ 'smgrdesc.c', 'spgdesc.c', 'standbydesc.c', 'tblspcdesc.c',
+ 'xactdesc.c', 'xlogdesc.c');
+ $pgxlogdump->AddReference($libpgport);
+ $pgxlogdump->AddDefine('FRONTEND');
+
if ($solution->{options}->{xml})
{
$contrib_extraincludes->{'pgxml'} = [
--
1.7.12.289.g0ce9864.dirty
0005-wal_decoding-Add-a-new-RELFILENODE-syscache-to-fetch.patch (text/x-patch; charset=us-ascii)
From d411e69a0c9c05b7ffadf2d9fe6afa1e025377d5 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Thu, 5 Apr 2012 11:09:59 +0200
Subject: [PATCH 05/19] wal_decoding: Add a new RELFILENODE syscache to fetch
a pg_class entry via (reltablespace, relfilenode)
This cache is theoretically problematic because, formally, indexes used by
syscaches need to be unique, and this one is not. That is "just" because
0/InvalidOid is stored in pg_class.relfilenode for nailed/shared catalog
relations. This syscache will never be queried for InvalidOid relfilenodes,
however, so it seems to be safe even if it bends the rules somewhat.
It might be nicer to add infrastructure to do this properly, such as using a
partial index. It's not clear what the best way to do this is, though, and the
benefit may well not be worth the overhead.
Needs a CATVERSION bump.
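To illustrate the argument above, here is a minimal standalone sketch (not
backend code). The ToyClassRow struct, the toy_rows table and toy_lookup() are
made up for illustration; only the idea, that keys with an InvalidOid
relfilenode are never looked up, comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;
#define InvalidOid ((Oid) 0)

/* Hypothetical stand-in for the pg_class columns involved. */
typedef struct ToyClassRow
{
    Oid oid;
    Oid reltablespace;
    Oid relfilenode;        /* 0 for nailed/shared catalogs */
} ToyClassRow;

/* Illustrative rows: two nailed catalogs share the key (0, 0). */
static const ToyClassRow toy_rows[] = {
    {1259, 0, 0},           /* pg_class, nailed */
    {1249, 0, 0},           /* pg_attribute, nailed */
    {16384, 0, 16390},      /* an ordinary user table */
};

/*
 * Look up (reltablespace, relfilenode). Keys with an InvalidOid relfilenode
 * are rejected up front, so the duplicate zero entries are never searched for
 * and the remaining keys behave as if the index were unique.
 */
static Oid
toy_lookup(const ToyClassRow *rows, size_t nrows,
           Oid reltablespace, Oid relfilenode)
{
    size_t i;

    assert(rows != NULL || nrows == 0);

    if (relfilenode == InvalidOid)
        return InvalidOid;

    for (i = 0; i < nrows; i++)
    {
        if (rows[i].reltablespace == reltablespace &&
            rows[i].relfilenode == relfilenode)
            return rows[i].oid;
    }
    return InvalidOid;
}
```

With the rows above, toy_lookup(toy_rows, 3, 0, 16390) finds the user table,
while a lookup with relfilenode 0 returns InvalidOid without touching the
duplicate entries.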
---
src/backend/utils/cache/syscache.c | 11 +++++++++++
src/include/catalog/indexing.h | 2 ++
src/include/utils/syscache.h | 1 +
3 files changed, 14 insertions(+)
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index bfc3c86..b5fe64f 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -591,6 +591,17 @@ static const struct cachedesc cacheinfo[] = {
},
64
},
+ {RelationRelationId, /* RELFILENODE */
+ ClassTblspcRelfilenodeIndexId,
+ 2,
+ {
+ Anum_pg_class_reltablespace,
+ Anum_pg_class_relfilenode,
+ 0,
+ 0
+ },
+ 1024
+ },
{RelationRelationId, /* RELNAMENSP */
ClassNameNspIndexId,
2,
diff --git a/src/include/catalog/indexing.h b/src/include/catalog/indexing.h
index 6251fb8..2a3cd82 100644
--- a/src/include/catalog/indexing.h
+++ b/src/include/catalog/indexing.h
@@ -106,6 +106,8 @@ DECLARE_UNIQUE_INDEX(pg_class_oid_index, 2662, on pg_class using btree(oid oid_o
#define ClassOidIndexId 2662
DECLARE_UNIQUE_INDEX(pg_class_relname_nsp_index, 2663, on pg_class using btree(relname name_ops, relnamespace oid_ops));
#define ClassNameNspIndexId 2663
+DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, on pg_class using btree(reltablespace oid_ops, relfilenode oid_ops));
+#define ClassTblspcRelfilenodeIndexId 3455
DECLARE_UNIQUE_INDEX(pg_collation_name_enc_nsp_index, 3164, on pg_collation using btree(collname name_ops, collencoding int4_ops, collnamespace oid_ops));
#define CollationNameEncNspIndexId 3164
diff --git a/src/include/utils/syscache.h b/src/include/utils/syscache.h
index d1d8abe..2a14905 100644
--- a/src/include/utils/syscache.h
+++ b/src/include/utils/syscache.h
@@ -75,6 +75,7 @@ enum SysCacheIdentifier
PROCNAMEARGSNSP,
PROCOID,
RANGETYPE,
+ RELFILENODE,
RELNAMENSP,
RELOID,
RULERELNAME,
--
1.7.12.289.g0ce9864.dirty
0006-wal_decoding-Add-RelationMapFilenodeToOid-function-t.patch (text/x-patch; charset=us-ascii)
From fea6b2e45cb2caf8d7a4c19f6031bc24b6e47d3b Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Sun, 16 Sep 2012 23:51:08 +0200
Subject: [PATCH 06/19] wal_decoding: Add RelationMapFilenodeToOid function to
relmapper.c
This function maps (reltablespace, relfilenode) to the table oid and thus acts
as a reverse of RelationMapOidToFilenode.
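The reverse lookup can be sketched in isolation roughly as follows. This is
only an illustration under assumptions: ToyMapFile, toy_search_map() and
toy_filenode_to_oid() are hypothetical simplifications, not relmapper.c's
actual types, and the sample maps are invented. It shows the same "pending
updates first, then the main map" order as the function in this patch, with
the repeated loop factored into a helper.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t Oid;
#define InvalidOid ((Oid) 0)
#define TOY_MAX_MAPPINGS 8

/* Hypothetical, simplified versions of the relmapper structures. */
typedef struct ToyMapping
{
    Oid mapoid;             /* OID of the mapped catalog */
    Oid mapfilenode;        /* its current filenode */
} ToyMapping;

typedef struct ToyMapFile
{
    int32_t    num_mappings;
    ToyMapping mappings[TOY_MAX_MAPPINGS];
} ToyMapFile;

/* Invented sample data: one pending update and two entries in the main map. */
static const ToyMapFile toy_active = {1, {{1262, 20000}}};
static const ToyMapFile toy_main = {2, {{1262, 12000}, {1260, 12001}}};

/* Linear search of one map for a filenode; InvalidOid if absent. */
static Oid
toy_search_map(const ToyMapFile *map, Oid filenode)
{
    int32_t i;

    assert(map->num_mappings <= TOY_MAX_MAPPINGS);

    for (i = 0; i < map->num_mappings; i++)
    {
        if (map->mappings[i].mapfilenode == filenode)
            return map->mappings[i].mapoid;
    }
    return InvalidOid;
}

/* Believe pending updates over the main map, as the patch does. */
static Oid
toy_filenode_to_oid(const ToyMapFile *active_updates,
                    const ToyMapFile *main_map, Oid filenode)
{
    Oid result = toy_search_map(active_updates, filenode);

    if (result != InvalidOid)
        return result;
    return toy_search_map(main_map, filenode);
}
```

A caller would pass the shared or the local pair of maps depending on the
"shared" flag, which is what the two branches in the patch do inline.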
---
src/backend/utils/cache/relmapper.c | 53 +++++++++++++++++++++++++++++++++++++
src/include/utils/relmapper.h | 2 ++
2 files changed, 55 insertions(+)
diff --git a/src/backend/utils/cache/relmapper.c b/src/backend/utils/cache/relmapper.c
index 2c7d9f3..039aa29 100644
--- a/src/backend/utils/cache/relmapper.c
+++ b/src/backend/utils/cache/relmapper.c
@@ -180,6 +180,59 @@ RelationMapOidToFilenode(Oid relationId, bool shared)
return InvalidOid;
}
+/* RelationMapFilenodeToOid
+ *
+ * Do the reverse of the normal direction of mapping done in
+ * RelationMapOidToFilenode.
+ *
+ * This is not supposed to be used during normal running but rather for
+ * information purposes when looking at the filesystem or the xlog.
+ *
+ * Returns InvalidOid if the OID is not known, which can easily happen if the
+ * filenode does not belong to a nailed or shared relation or if it simply
+ * doesn't exist anywhere.
+ */
+Oid
+RelationMapFilenodeToOid(Oid filenode, bool shared)
+{
+ const RelMapFile *map;
+ int32 i;
+
+ /* If there are active updates, believe those over the main maps */
+ if (shared)
+ {
+ map = &active_shared_updates;
+ for (i = 0; i < map->num_mappings; i++)
+ {
+ if (filenode == map->mappings[i].mapfilenode)
+ return map->mappings[i].mapoid;
+ }
+ map = &shared_map;
+ for (i = 0; i < map->num_mappings; i++)
+ {
+ if (filenode == map->mappings[i].mapfilenode)
+ return map->mappings[i].mapoid;
+ }
+ }
+ else
+ {
+ map = &active_local_updates;
+ for (i = 0; i < map->num_mappings; i++)
+ {
+ if (filenode == map->mappings[i].mapfilenode)
+ return map->mappings[i].mapoid;
+ }
+ map = &local_map;
+ for (i = 0; i < map->num_mappings; i++)
+ {
+ if (filenode == map->mappings[i].mapfilenode)
+ return map->mappings[i].mapoid;
+ }
+ }
+
+ return InvalidOid;
+}
+
/*
* RelationMapUpdateMap
*
diff --git a/src/include/utils/relmapper.h b/src/include/utils/relmapper.h
index 8f0b438..071bc98 100644
--- a/src/include/utils/relmapper.h
+++ b/src/include/utils/relmapper.h
@@ -36,6 +36,8 @@ typedef struct xl_relmap_update
extern Oid RelationMapOidToFilenode(Oid relationId, bool shared);
+extern Oid RelationMapFilenodeToOid(Oid filenode, bool shared);
+
extern void RelationMapUpdateMap(Oid relationId, Oid fileNode, bool shared,
bool immediate);
--
1.7.12.289.g0ce9864.dirty
0007-wal-decoding-Add-pg_relation_by_filenode-to-lookup-u.patch (text/x-patch; charset=us-ascii)
From 29c11973ac071493bf0aa8bbaa41e0ac7c8b5ea2 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Sun, 16 Sep 2012 23:53:23 +0200
Subject: [PATCH 07/19] wal decoding: Add pg_relation_by_filenode to look up
a relation by (tablespace, filenode)
This requires the RELFILENODE syscache and the RelationMapFilenodeToOid
function added in the previous two commits.
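The tablespace handling in the new function is easy to get wrong, so here is
a small standalone sketch of just that part. The toy_* helpers are
hypothetical and exist only for illustration; the two OID constants are
redefined locally with the values the backend uses for pg_default and
pg_global.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t Oid;
#define DEFAULTTABLESPACE_OID ((Oid) 1663)   /* pg_default */
#define GLOBALTABLESPACE_OID  ((Oid) 1664)   /* pg_global */

/*
 * Key to use for the pg_class lookup: pg_class stores 0 rather than the
 * default tablespace's OID, so both spellings map to 0.
 */
static Oid
toy_catalog_tablespace(Oid reltablespace)
{
    Oid key = reltablespace;

    if (reltablespace == 0 || reltablespace == DEFAULTTABLESPACE_OID)
        key = 0;

    /* the lookup key never names the default tablespace explicitly */
    assert(key != DEFAULTTABLESPACE_OID);
    return key;
}

/*
 * When the catalog lookup finds nothing, the relation map is consulted:
 * the shared map for the global tablespace, the local map otherwise.
 */
static int
toy_use_shared_map(Oid reltablespace)
{
    return reltablespace == GLOBALTABLESPACE_OID;
}
```

So a caller passing either 0 or 1663 ends up with the same catalog key, and
only filenodes in pg_global are resolved through the shared map.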
---
doc/src/sgml/func.sgml | 23 +++++++++++-
src/backend/utils/adt/dbsize.c | 79 ++++++++++++++++++++++++++++++++++++++++++
src/include/catalog/pg_proc.h | 2 ++
src/include/utils/builtins.h | 1 +
4 files changed, 104 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 35c7f75..091372d 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -15176,7 +15176,7 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
<para>
The functions shown in <xref linkend="functions-admin-dblocation"> assist
- in identifying the specific disk files associated with database objects.
+ in identifying the specific disk files associated with database objects or doing the reverse.
</para>
<indexterm>
@@ -15185,6 +15185,9 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
<indexterm>
<primary>pg_relation_filepath</primary>
</indexterm>
+ <indexterm>
+ <primary>pg_relation_by_filenode</primary>
+ </indexterm>
<table id="functions-admin-dblocation">
<title>Database Object Location Functions</title>
@@ -15213,6 +15216,15 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
File path name of the specified relation
</entry>
</row>
+ <row>
+ <entry>
+ <literal><function>pg_relation_by_filenode(<parameter>tablespace</parameter> <type>oid</type>, <parameter>filenode</parameter> <type>oid</type>)</function></literal>
+ </entry>
+ <entry><type>regclass</type></entry>
+ <entry>
+ Find the associated relation of a filenode
+ </entry>
+ </row>
</tbody>
</tgroup>
</table>
@@ -15236,6 +15248,15 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
the relation.
</para>
+ <para>
+ <function>pg_relation_by_filenode</> is the reverse of
+ <function>pg_relation_filenode</>. Given a <quote>tablespace</> OID and
+ a <quote>filenode</>, it returns the associated relation. The default
+ tablespace for user tables can be specified as 0. See the
+ documentation of <function>pg_relation_filenode</> for an explanation of why
+ this cannot always easily be answered by querying <structname>pg_class</>.
+ </para>
+
</sect2>
<sect2 id="functions-admin-genfile">
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 89ad386..73c886a 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -744,6 +744,85 @@ pg_relation_filenode(PG_FUNCTION_ARGS)
}
/*
+ * Get the relation via (reltablespace, relfilenode)
+ *
+ * This is expected to be used when somebody wants to match an individual file
+ * on the filesystem back to its table. That's not trivially possible via
+ * pg_class because that doesn't contain the relfilenodes of shared and nailed
+ * tables.
+ *
+ * We don't fail but return NULL if we cannot find a mapping.
+ *
+ * Instead of knowing DEFAULTTABLESPACE_OID you can pass 0.
+ */
+Datum
+pg_relation_by_filenode(PG_FUNCTION_ARGS)
+{
+ Oid reltablespace = PG_GETARG_OID(0);
+ Oid relfilenode = PG_GETARG_OID(1);
+ Oid lookup_tablespace = reltablespace;
+ Oid result = InvalidOid;
+ HeapTuple tuple;
+
+ if (reltablespace == 0)
+ reltablespace = DEFAULTTABLESPACE_OID;
+
+ /* pg_class stores 0 instead of DEFAULTTABLESPACE_OID */
+ if (reltablespace == DEFAULTTABLESPACE_OID)
+ lookup_tablespace = 0;
+
+ tuple = SearchSysCache2(RELFILENODE,
+ lookup_tablespace,
+ relfilenode);
+
+ /* found it in the system catalog, so it cannot be a shared/nailed table */
+ if (HeapTupleIsValid(tuple))
+ {
+ result = HeapTupleHeaderGetOid(tuple->t_data);
+ ReleaseSysCache(tuple);
+ }
+ else
+ {
+ if (reltablespace == GLOBALTABLESPACE_OID)
+ {
+ result = RelationMapFilenodeToOid(relfilenode, true);
+ }
+ else
+ {
+ Form_pg_class relform;
+
+ result = RelationMapFilenodeToOid(relfilenode, false);
+
+ if (result != InvalidOid)
+ {
+ /* check that we found the correct relation */
+ tuple = SearchSysCache1(RELOID,
+ result);
+
+ if (!HeapTupleIsValid(tuple))
+ {
+ elog(ERROR, "could not re-find previously looked up relation with OID %u",
+ result);
+ }
+
+ relform = (Form_pg_class) GETSTRUCT(tuple);
+
+ if (relform->reltablespace != reltablespace &&
+ relform->reltablespace != lookup_tablespace)
+ result = InvalidOid;
+
+ ReleaseSysCache(tuple);
+ }
+ }
+ }
+
+ if (!OidIsValid(result))
+ PG_RETURN_NULL();
+ else
+ PG_RETURN_OID(result);
+}
+
+/*
* Get the pathname (relative to $PGDATA) of a relation
*
* See comments for pg_relation_filenode.
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 010605d..d179e49 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -3441,6 +3441,8 @@ DATA(insert OID = 2998 ( pg_indexes_size PGNSP PGUID 12 1 0 0 0 f f f f t f v 1
DESCR("disk space usage for all indexes attached to the specified table");
DATA(insert OID = 2999 ( pg_relation_filenode PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 26 "2205" _null_ _null_ _null_ _null_ pg_relation_filenode _null_ _null_ _null_ ));
DESCR("filenode identifier of relation");
+DATA(insert OID = 3454 ( pg_relation_by_filenode PGNSP PGUID 12 1 0 0 0 f f f f t f s 2 0 2205 "26 26" _null_ _null_ _null_ _null_ pg_relation_by_filenode _null_ _null_ _null_ ));
+DESCR("relation identified by tablespace and filenode");
DATA(insert OID = 3034 ( pg_relation_filepath PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 25 "2205" _null_ _null_ _null_ _null_ pg_relation_filepath _null_ _null_ _null_ ));
DESCR("file path of relation");
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 61d6aef..c5984ad 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -458,6 +458,7 @@ extern Datum pg_table_size(PG_FUNCTION_ARGS);
extern Datum pg_indexes_size(PG_FUNCTION_ARGS);
extern Datum pg_relation_filenode(PG_FUNCTION_ARGS);
extern Datum pg_relation_filepath(PG_FUNCTION_ARGS);
+extern Datum pg_relation_by_filenode(PG_FUNCTION_ARGS);
/* genfile.c */
extern bytea *read_binary_file(const char *filename,
--
1.7.12.289.g0ce9864.dirty
0008-wal_decoding-Introduce-InvalidCommandId-and-declare-.patch (text/x-patch; charset=us-ascii)
From 321a38776fcd10df090f737b722a692b649f969c Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Tue, 13 Nov 2012 12:18:07 +0100
Subject: [PATCH 08/19] wal_decoding: Introduce InvalidCommandId and declare
that to be the new maximum for
CommandCounterIncrement
This is useful to be able to represent a CommandId that's invalid. There was no
such value before.
This decreases the possible number of commands in a transaction by one, which
seems unproblematic. It's also not a problem for pg_upgrade because cmin/cmax are
never looked at outside the context of their own transaction (except for
time-travel access, but that's new anyway).
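As a standalone illustration of the new limit (a sketch only;
toy_command_counter_increment() is a hypothetical simplification of
CommandCounterIncrement() that returns a status instead of raising an error):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t CommandId;

#define FirstCommandId   ((CommandId) 0)
#define InvalidCommandId (~(CommandId)0)    /* all bits set, 2^32-1 */

/*
 * Advance the counter. Returns 1 on success. If the increment would reach
 * InvalidCommandId, the counter is restored and 0 is returned, mirroring
 * the "increment, check, back off" sequence in the patch.
 */
static int
toy_command_counter_increment(CommandId *cid)
{
    assert(cid != NULL);

    *cid += 1;
    if (*cid == InvalidCommandId)
    {
        *cid -= 1;
        return 0;
    }
    return 1;
}
```

The highest usable CommandId is therefore InvalidCommandId - 1, and the
all-ones value stays free to mean "no command id".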
---
src/backend/access/transam/xact.c | 4 ++--
src/include/c.h | 1 +
2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 81d2687..369d2b6 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -745,12 +745,12 @@ CommandCounterIncrement(void)
if (currentCommandIdUsed)
{
currentCommandId += 1;
- if (currentCommandId == FirstCommandId) /* check for overflow */
+ if (currentCommandId == InvalidCommandId)
{
currentCommandId -= 1;
ereport(ERROR,
(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
- errmsg("cannot have more than 2^32-1 commands in a transaction")));
+ errmsg("cannot have more than 2^32-2 commands in a transaction")));
}
currentCommandIdUsed = false;
diff --git a/src/include/c.h b/src/include/c.h
index 57664e8..aba0049 100644
--- a/src/include/c.h
+++ b/src/include/c.h
@@ -367,6 +367,7 @@ typedef uint32 MultiXactOffset;
typedef uint32 CommandId;
#define FirstCommandId ((CommandId) 0)
+#define InvalidCommandId (~(CommandId)0)
/*
* Array indexing support
--
1.7.12.289.g0ce9864.dirty
0009-wal_decoding-Adjust-all-Satisfies-routines-to-take-a.patch (text/x-patch; charset=us-ascii)
From 56f8f82a77b63079068dd5c29726ddfcdfb581c2 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Mon, 12 Nov 2012 13:39:52 +0100
Subject: [PATCH 09/19] wal_decoding: Adjust all *Satisfies routines to take a
HeapTuple instead of a HeapTupleHeader
For the regular satisfies routines this is needed in preparation for logical
decoding. I changed the non-regular ones for consistency as well.
The naming between htup, tuple and similar is rather confused; I could not find
any consistent naming anywhere.
This is preparatory work for the logical decoding feature, which needs to be
able to get to a valid relfilenode when checking the visibility of a
tuple.
---
contrib/pgrowlocks/pgrowlocks.c | 2 +-
src/backend/access/heap/heapam.c | 13 ++++++----
src/backend/access/heap/pruneheap.c | 16 ++++++++++--
src/backend/catalog/index.c | 2 +-
src/backend/commands/analyze.c | 3 ++-
src/backend/commands/cluster.c | 2 +-
src/backend/commands/vacuumlazy.c | 3 ++-
src/backend/storage/lmgr/predicate.c | 2 +-
src/backend/utils/time/tqual.c | 50 +++++++++++++++++++++++++++++-------
src/include/utils/snapshot.h | 4 +--
src/include/utils/tqual.h | 20 +++++++--------
11 files changed, 83 insertions(+), 34 deletions(-)
diff --git a/contrib/pgrowlocks/pgrowlocks.c b/contrib/pgrowlocks/pgrowlocks.c
index 20beed2..8f9db55 100644
--- a/contrib/pgrowlocks/pgrowlocks.c
+++ b/contrib/pgrowlocks/pgrowlocks.c
@@ -120,7 +120,7 @@ pgrowlocks(PG_FUNCTION_ARGS)
/* must hold a buffer lock to call HeapTupleSatisfiesUpdate */
LockBuffer(scan->rs_cbuf, BUFFER_LOCK_SHARE);
- if (HeapTupleSatisfiesUpdate(tuple->t_data,
+ if (HeapTupleSatisfiesUpdate(tuple,
GetCurrentCommandId(false),
scan->rs_cbuf) == HeapTupleBeingUpdated)
{
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index b19d1cf..ba9fd36 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -289,6 +289,7 @@ heapgetpage(HeapScanDesc scan, BlockNumber page)
HeapTupleData loctup;
bool valid;
+ loctup.t_tableOid = RelationGetRelid(scan->rs_rd);
loctup.t_data = (HeapTupleHeader) PageGetItem((Page) dp, lpp);
loctup.t_len = ItemIdGetLength(lpp);
ItemPointerSet(&(loctup.t_self), page, lineoff);
@@ -1603,7 +1604,7 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
heapTuple->t_data = (HeapTupleHeader) PageGetItem(dp, lp);
heapTuple->t_len = ItemIdGetLength(lp);
- heapTuple->t_tableOid = relation->rd_id;
+ heapTuple->t_tableOid = RelationGetRelid(relation);
heapTuple->t_self = *tid;
/*
@@ -1651,7 +1652,7 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
* transactions.
*/
if (all_dead && *all_dead &&
- !HeapTupleIsSurelyDead(heapTuple->t_data, RecentGlobalXmin))
+ !HeapTupleIsSurelyDead(heapTuple, RecentGlobalXmin))
*all_dead = false;
/*
@@ -2447,12 +2448,13 @@ heap_delete(Relation relation, ItemPointer tid,
lp = PageGetItemId(page, ItemPointerGetOffsetNumber(tid));
Assert(ItemIdIsNormal(lp));
+ tp.t_tableOid = RelationGetRelid(relation);
tp.t_data = (HeapTupleHeader) PageGetItem(page, lp);
tp.t_len = ItemIdGetLength(lp);
tp.t_self = *tid;
l1:
- result = HeapTupleSatisfiesUpdate(tp.t_data, cid, buffer);
+ result = HeapTupleSatisfiesUpdate(&tp, cid, buffer);
if (result == HeapTupleInvisible)
{
@@ -2817,6 +2819,7 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
lp = PageGetItemId(page, ItemPointerGetOffsetNumber(otid));
Assert(ItemIdIsNormal(lp));
+ oldtup.t_tableOid = RelationGetRelid(relation);
oldtup.t_data = (HeapTupleHeader) PageGetItem(page, lp);
oldtup.t_len = ItemIdGetLength(lp);
oldtup.t_self = *otid;
@@ -2829,7 +2832,7 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
*/
l2:
- result = HeapTupleSatisfiesUpdate(oldtup.t_data, cid, buffer);
+ result = HeapTupleSatisfiesUpdate(&oldtup, cid, buffer);
if (result == HeapTupleInvisible)
{
@@ -3531,7 +3534,7 @@ heap_lock_tuple(Relation relation, HeapTuple tuple,
tuple->t_tableOid = RelationGetRelid(relation);
l3:
- result = HeapTupleSatisfiesUpdate(tuple->t_data, cid, *buffer);
+ result = HeapTupleSatisfiesUpdate(tuple, cid, *buffer);
if (result == HeapTupleInvisible)
{
diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index 390585b..a0efe48 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -340,6 +340,9 @@ heap_prune_chain(Relation relation, Buffer buffer, OffsetNumber rootoffnum,
OffsetNumber chainitems[MaxHeapTuplesPerPage];
int nchain = 0,
i;
+ HeapTupleData tup;
+
+ tup.t_tableOid = RelationGetRelid(relation);
rootlp = PageGetItemId(dp, rootoffnum);
@@ -349,6 +352,11 @@ heap_prune_chain(Relation relation, Buffer buffer, OffsetNumber rootoffnum,
if (ItemIdIsNormal(rootlp))
{
htup = (HeapTupleHeader) PageGetItem(dp, rootlp);
+
+ tup.t_data = htup;
+ tup.t_len = ItemIdGetLength(rootlp);
+ ItemPointerSet(&(tup.t_self), BufferGetBlockNumber(buffer), rootoffnum);
+
if (HeapTupleHeaderIsHeapOnly(htup))
{
/*
@@ -369,7 +377,7 @@ heap_prune_chain(Relation relation, Buffer buffer, OffsetNumber rootoffnum,
* either here or while following a chain below. Whichever path
* gets there first will mark the tuple unused.
*/
- if (HeapTupleSatisfiesVacuum(htup, OldestXmin, buffer)
+ if (HeapTupleSatisfiesVacuum(&tup, OldestXmin, buffer)
== HEAPTUPLE_DEAD && !HeapTupleHeaderIsHotUpdated(htup))
{
heap_prune_record_unused(prstate, rootoffnum);
@@ -432,6 +440,10 @@ heap_prune_chain(Relation relation, Buffer buffer, OffsetNumber rootoffnum,
Assert(ItemIdIsNormal(lp));
htup = (HeapTupleHeader) PageGetItem(dp, lp);
+ tup.t_data = htup;
+ tup.t_len = ItemIdGetLength(lp);
+ ItemPointerSet(&(tup.t_self), BufferGetBlockNumber(buffer), offnum);
+
/*
* Check the tuple XMIN against prior XMAX, if any
*/
@@ -449,7 +461,7 @@ heap_prune_chain(Relation relation, Buffer buffer, OffsetNumber rootoffnum,
*/
tupdead = recent_dead = false;
- switch (HeapTupleSatisfiesVacuum(htup, OldestXmin, buffer))
+ switch (HeapTupleSatisfiesVacuum(&tup, OldestXmin, buffer))
{
case HEAPTUPLE_DEAD:
tupdead = true;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 5892e44..a29c106 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -2269,7 +2269,7 @@ IndexBuildHeapScan(Relation heapRelation,
*/
LockBuffer(scan->rs_cbuf, BUFFER_LOCK_SHARE);
- switch (HeapTupleSatisfiesVacuum(heapTuple->t_data, OldestXmin,
+ switch (HeapTupleSatisfiesVacuum(heapTuple, OldestXmin,
scan->rs_cbuf))
{
case HEAPTUPLE_DEAD:
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 7a5eb42..ac16284 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -1134,10 +1134,11 @@ acquire_sample_rows(Relation onerel, int elevel,
ItemPointerSet(&targtuple.t_self, targblock, targoffset);
+ targtuple.t_tableOid = RelationGetRelid(onerel);
targtuple.t_data = (HeapTupleHeader) PageGetItem(targpage, itemid);
targtuple.t_len = ItemIdGetLength(itemid);
- switch (HeapTupleSatisfiesVacuum(targtuple.t_data,
+ switch (HeapTupleSatisfiesVacuum(&targtuple,
OldestXmin,
targbuffer))
{
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 238781b..cb1a430 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -931,7 +931,7 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex,
LockBuffer(buf, BUFFER_LOCK_SHARE);
- switch (HeapTupleSatisfiesVacuum(tuple->t_data, OldestXmin, buf))
+ switch (HeapTupleSatisfiesVacuum(tuple, OldestXmin, buf))
{
case HEAPTUPLE_DEAD:
/* Definitely dead */
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 8eda663..62dda43 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -727,12 +727,13 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
Assert(ItemIdIsNormal(itemid));
+ tuple.t_tableOid = RelationGetRelid(onerel);
tuple.t_data = (HeapTupleHeader) PageGetItem(page, itemid);
tuple.t_len = ItemIdGetLength(itemid);
tupgone = false;
- switch (HeapTupleSatisfiesVacuum(tuple.t_data, OldestXmin, buf))
+ switch (HeapTupleSatisfiesVacuum(&tuple, OldestXmin, buf))
{
case HEAPTUPLE_DEAD:
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 90a9e2a..ee34afb 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -3894,7 +3894,7 @@ CheckForSerializableConflictOut(bool visible, Relation relation,
* tuple is visible to us, while HeapTupleSatisfiesVacuum checks what else
* is going on with it.
*/
- htsvResult = HeapTupleSatisfiesVacuum(tuple->t_data, TransactionXmin, buffer);
+ htsvResult = HeapTupleSatisfiesVacuum(tuple, TransactionXmin, buffer);
switch (htsvResult)
{
case HEAPTUPLE_LIVE:
diff --git a/src/backend/utils/time/tqual.c b/src/backend/utils/time/tqual.c
index 51f0afd..2961822 100644
--- a/src/backend/utils/time/tqual.c
+++ b/src/backend/utils/time/tqual.c
@@ -163,8 +163,12 @@ HeapTupleSetHintBits(HeapTupleHeader tuple, Buffer buffer,
* Xmax is not committed))) that has not been committed
*/
bool
-HeapTupleSatisfiesSelf(HeapTupleHeader tuple, Snapshot snapshot, Buffer buffer)
+HeapTupleSatisfiesSelf(HeapTuple htup, Snapshot snapshot, Buffer buffer)
{
+ HeapTupleHeader tuple = htup->t_data;
+ Assert(ItemPointerIsValid(&htup->t_self));
+ Assert(htup->t_tableOid != InvalidOid);
+
if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED))
{
if (tuple->t_infomask & HEAP_XMIN_INVALID)
@@ -326,8 +330,12 @@ HeapTupleSatisfiesSelf(HeapTupleHeader tuple, Snapshot snapshot, Buffer buffer)
*
*/
bool
-HeapTupleSatisfiesNow(HeapTupleHeader tuple, Snapshot snapshot, Buffer buffer)
+HeapTupleSatisfiesNow(HeapTuple htup, Snapshot snapshot, Buffer buffer)
{
+ HeapTupleHeader tuple = htup->t_data;
+ Assert(ItemPointerIsValid(&htup->t_self));
+ Assert(htup->t_tableOid != InvalidOid);
+
if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED))
{
if (tuple->t_infomask & HEAP_XMIN_INVALID)
@@ -471,7 +479,7 @@ HeapTupleSatisfiesNow(HeapTupleHeader tuple, Snapshot snapshot, Buffer buffer)
* Dummy "satisfies" routine: any tuple satisfies SnapshotAny.
*/
bool
-HeapTupleSatisfiesAny(HeapTupleHeader tuple, Snapshot snapshot, Buffer buffer)
+HeapTupleSatisfiesAny(HeapTuple htup, Snapshot snapshot, Buffer buffer)
{
return true;
}
@@ -491,9 +499,13 @@ HeapTupleSatisfiesAny(HeapTupleHeader tuple, Snapshot snapshot, Buffer buffer)
* table.
*/
bool
-HeapTupleSatisfiesToast(HeapTupleHeader tuple, Snapshot snapshot,
+HeapTupleSatisfiesToast(HeapTuple htup, Snapshot snapshot,
Buffer buffer)
{
+ HeapTupleHeader tuple = htup->t_data;
+ Assert(ItemPointerIsValid(&htup->t_self));
+ Assert(htup->t_tableOid != InvalidOid);
+
if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED))
{
if (tuple->t_infomask & HEAP_XMIN_INVALID)
@@ -572,9 +584,13 @@ HeapTupleSatisfiesToast(HeapTupleHeader tuple, Snapshot snapshot,
* distinguish that case must test for it themselves.)
*/
HTSU_Result
-HeapTupleSatisfiesUpdate(HeapTupleHeader tuple, CommandId curcid,
+HeapTupleSatisfiesUpdate(HeapTuple htup, CommandId curcid,
Buffer buffer)
{
+ HeapTupleHeader tuple = htup->t_data;
+ Assert(ItemPointerIsValid(&htup->t_self));
+ Assert(htup->t_tableOid != InvalidOid);
+
if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED))
{
if (tuple->t_infomask & HEAP_XMIN_INVALID)
@@ -739,9 +755,13 @@ HeapTupleSatisfiesUpdate(HeapTupleHeader tuple, CommandId curcid,
* for snapshot->xmax and the tuple's xmax.
*/
bool
-HeapTupleSatisfiesDirty(HeapTupleHeader tuple, Snapshot snapshot,
+HeapTupleSatisfiesDirty(HeapTuple htup, Snapshot snapshot,
Buffer buffer)
{
+ HeapTupleHeader tuple = htup->t_data;
+ Assert(ItemPointerIsValid(&htup->t_self));
+ Assert(htup->t_tableOid != InvalidOid);
+
snapshot->xmin = snapshot->xmax = InvalidTransactionId;
if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED))
@@ -902,9 +922,13 @@ HeapTupleSatisfiesDirty(HeapTupleHeader tuple, Snapshot snapshot,
* can't see it.)
*/
bool
-HeapTupleSatisfiesMVCC(HeapTupleHeader tuple, Snapshot snapshot,
+HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot,
Buffer buffer)
{
+ HeapTupleHeader tuple = htup->t_data;
+ Assert(ItemPointerIsValid(&htup->t_self));
+ Assert(htup->t_tableOid != InvalidOid);
+
if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED))
{
if (tuple->t_infomask & HEAP_XMIN_INVALID)
@@ -1058,9 +1082,13 @@ HeapTupleSatisfiesMVCC(HeapTupleHeader tuple, Snapshot snapshot,
* even if we see that the deleting transaction has committed.
*/
HTSV_Result
-HeapTupleSatisfiesVacuum(HeapTupleHeader tuple, TransactionId OldestXmin,
+HeapTupleSatisfiesVacuum(HeapTuple htup, TransactionId OldestXmin,
Buffer buffer)
{
+ HeapTupleHeader tuple = htup->t_data;
+ Assert(ItemPointerIsValid(&htup->t_self));
+ Assert(htup->t_tableOid != InvalidOid);
+
/*
* Has inserting transaction committed?
*
@@ -1233,8 +1261,12 @@ HeapTupleSatisfiesVacuum(HeapTupleHeader tuple, TransactionId OldestXmin,
* just whether or not the tuple is surely dead).
*/
bool
-HeapTupleIsSurelyDead(HeapTupleHeader tuple, TransactionId OldestXmin)
+HeapTupleIsSurelyDead(HeapTuple htup, TransactionId OldestXmin)
{
+ HeapTupleHeader tuple = htup->t_data;
+ Assert(ItemPointerIsValid(&htup->t_self));
+ Assert(htup->t_tableOid != InvalidOid);
+
/*
* If the inserting transaction is marked invalid, then it aborted, and
* the tuple is definitely dead. If it's marked neither committed nor
diff --git a/src/include/utils/snapshot.h b/src/include/utils/snapshot.h
index e747191..ed3f586 100644
--- a/src/include/utils/snapshot.h
+++ b/src/include/utils/snapshot.h
@@ -27,8 +27,8 @@ typedef struct SnapshotData *Snapshot;
* The specific semantics of a snapshot are encoded by the "satisfies"
* function.
*/
-typedef bool (*SnapshotSatisfiesFunc) (HeapTupleHeader tuple,
- Snapshot snapshot, Buffer buffer);
+typedef bool (*SnapshotSatisfiesFunc) (HeapTuple htup,
+ Snapshot snapshot, Buffer buffer);
typedef struct SnapshotData
{
diff --git a/src/include/utils/tqual.h b/src/include/utils/tqual.h
index 72a8ea4..5309ce3 100644
--- a/src/include/utils/tqual.h
+++ b/src/include/utils/tqual.h
@@ -52,7 +52,7 @@ extern PGDLLIMPORT SnapshotData SnapshotToastData;
* if so, the indicated buffer is marked dirty.
*/
#define HeapTupleSatisfiesVisibility(tuple, snapshot, buffer) \
- ((*(snapshot)->satisfies) ((tuple)->t_data, snapshot, buffer))
+ ((*(snapshot)->satisfies) (tuple, snapshot, buffer))
/* Result codes for HeapTupleSatisfiesVacuum */
typedef enum
@@ -65,25 +65,25 @@ typedef enum
} HTSV_Result;
/* These are the "satisfies" test routines for the various snapshot types */
-extern bool HeapTupleSatisfiesMVCC(HeapTupleHeader tuple,
+extern bool HeapTupleSatisfiesMVCC(HeapTuple htup,
Snapshot snapshot, Buffer buffer);
-extern bool HeapTupleSatisfiesNow(HeapTupleHeader tuple,
+extern bool HeapTupleSatisfiesNow(HeapTuple htup,
Snapshot snapshot, Buffer buffer);
-extern bool HeapTupleSatisfiesSelf(HeapTupleHeader tuple,
+extern bool HeapTupleSatisfiesSelf(HeapTuple htup,
Snapshot snapshot, Buffer buffer);
-extern bool HeapTupleSatisfiesAny(HeapTupleHeader tuple,
+extern bool HeapTupleSatisfiesAny(HeapTuple htup,
Snapshot snapshot, Buffer buffer);
-extern bool HeapTupleSatisfiesToast(HeapTupleHeader tuple,
+extern bool HeapTupleSatisfiesToast(HeapTuple htup,
Snapshot snapshot, Buffer buffer);
-extern bool HeapTupleSatisfiesDirty(HeapTupleHeader tuple,
+extern bool HeapTupleSatisfiesDirty(HeapTuple htup,
Snapshot snapshot, Buffer buffer);
/* Special "satisfies" routines with different APIs */
-extern HTSU_Result HeapTupleSatisfiesUpdate(HeapTupleHeader tuple,
+extern HTSU_Result HeapTupleSatisfiesUpdate(HeapTuple htup,
CommandId curcid, Buffer buffer);
-extern HTSV_Result HeapTupleSatisfiesVacuum(HeapTupleHeader tuple,
+extern HTSV_Result HeapTupleSatisfiesVacuum(HeapTuple htup,
TransactionId OldestXmin, Buffer buffer);
-extern bool HeapTupleIsSurelyDead(HeapTupleHeader tuple,
+extern bool HeapTupleIsSurelyDead(HeapTuple htup,
TransactionId OldestXmin);
extern void HeapTupleSetHintBits(HeapTupleHeader tuple, Buffer buffer,
--
1.7.12.289.g0ce9864.dirty
0013-wal_decoding-copydir-make-fsync_fname-public.patch (text/x-patch; charset=us-ascii)
From f026cb457c41b197dee9dca294d955851a64baf6 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Wed, 9 Jan 2013 17:36:20 +0100
Subject: [PATCH 13/19] wal_decoding: copydir: make fsync_fname public
This probably should be somewhere else; it's a generally useful function, not
really related to copying directories. fd.[ch]?
---
src/backend/storage/file/copydir.c | 5 +----
src/include/storage/copydir.h | 1 +
2 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/src/backend/storage/file/copydir.c b/src/backend/storage/file/copydir.c
index 7f94f50..a86a35d 100644
--- a/src/backend/storage/file/copydir.c
+++ b/src/backend/storage/file/copydir.c
@@ -28,9 +28,6 @@
#include "miscadmin.h"
-static void fsync_fname(char *fname, bool isdir);
-
-
/*
* copydir: copy a directory
*
@@ -216,7 +213,7 @@ copy_file(char *fromfile, char *tofile)
* Try to fsync directories but ignore errors that indicate the OS
* just doesn't allow/require fsyncing directories.
*/
-static void
+void
fsync_fname(char *fname, bool isdir)
{
int fd;
diff --git a/src/include/storage/copydir.h b/src/include/storage/copydir.h
index a087cce..3bccf3b 100644
--- a/src/include/storage/copydir.h
+++ b/src/include/storage/copydir.h
@@ -15,5 +15,6 @@
extern void copydir(char *fromdir, char *todir, bool recurse);
extern void copy_file(char *fromfile, char *tofile);
+extern void fsync_fname(char *fname, bool isdir);
#endif /* COPYDIR_H */
--
1.7.12.289.g0ce9864.dirty
0014-wal-decoding-Add-information-about-a-tables-primary-.patch (text/x-patch; charset=us-ascii)
From e42be6c2b152cc0d1de2db802e20c1e19eceb364 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Mon, 14 Jan 2013 12:16:54 +0100
Subject: [PATCH 14/19] wal decoding: Add information about a table's primary
key to struct RelationData
'rd_primary' now contains the Oid of an index over uniquely identifying
columns. Several types of indexes qualify; they are preferred in this
order:
* Primary Key
* oid index
* the first (OID order) unique, immediate, non-partial and
non-expression index over one or more NOT NULL'ed columns
rd_primary is only set once RelationGetIndexList() has been called.
This is helpful because logical replication frequently needs to look up that
index, on both the sending and the receiving side, and RelationGetIndexList
already gathers all the necessary information.
This could be used to replace tablecmds.c's transformFkeyGetPrimaryKey, but
doing so would change that function's meaning, so it seems to require
additional discussion.
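
The preference order above can be sketched in isolation as follows. This is
a simplified model only: the struct IndexInfoSketch and the function
choose_identity_index are hypothetical names, the booleans stand in for the
pg_index checks done in RelationGetIndexList, and the input array is assumed
to be in OID order already.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Hypothetical, flattened stand-in for the pg_index fields that matter. */
typedef struct IndexInfoSketch
{
    Oid  indexrelid;
    bool isvalid;            /* index is usable */
    bool isunique;           /* unique index */
    bool isimmediate;        /* uniqueness enforced immediately */
    bool ispartial;          /* has a predicate */
    bool isprimary;          /* primary key index */
    bool is_oid_index;       /* single-column index on the oid column */
    bool all_keys_not_null;  /* every key column is NOT NULL */
} IndexInfoSketch;

/*
 * Pick the preferred identifying index: primary key first, then an oid
 * index, then the first remaining candidate.  Returns InvalidOid if no
 * index qualifies.
 */
Oid
choose_identity_index(const IndexInfoSketch *indexes, size_t n)
{
    Oid    pkey = InvalidOid;
    Oid    oididx = InvalidOid;
    Oid    candidate = InvalidOid;
    size_t i;

    assert(indexes != NULL || n == 0);

    for (i = 0; i < n; i++)
    {
        const IndexInfoSketch *idx = &indexes[i];

        /* Only valid, unique, immediate, non-partial indexes qualify. */
        if (!idx->isvalid || !idx->isunique ||
            !idx->isimmediate || idx->ispartial)
            continue;

        if (idx->isprimary)
            pkey = idx->indexrelid;
        else if (idx->is_oid_index)
            oididx = idx->indexrelid;
        else if (candidate == InvalidOid && idx->all_keys_not_null)
            candidate = idx->indexrelid;    /* keep only the first one */
    }

    if (pkey != InvalidOid)
        return pkey;
    if (oididx != InvalidOid)
        return oididx;
    return candidate;
}
```

The sketch records each kind of index separately and decides at the end, so
the result does not depend on where the primary key appears in the scan.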
---
src/backend/utils/cache/relcache.c | 52 +++++++++++++++++++++++++++++++++++---
src/include/utils/rel.h | 12 +++++++++
2 files changed, 61 insertions(+), 3 deletions(-)
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 33fb858..aa110f0 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -3365,7 +3365,9 @@ RelationGetIndexList(Relation relation)
ScanKeyData skey;
HeapTuple htup;
List *result;
- Oid oidIndex;
+ Oid oidIndex = InvalidOid;
+ Oid pkeyIndex = InvalidOid;
+ Oid candidateIndex = InvalidOid;
MemoryContext oldcxt;
/* Quick exit if we already computed the list. */
@@ -3422,17 +3424,61 @@ RelationGetIndexList(Relation relation)
Assert(!isnull);
indclass = (oidvector *) DatumGetPointer(indclassDatum);
+ if (!IndexIsValid(index))
+ continue;
+
/* Check to see if it is a unique, non-partial btree index on OID */
- if (IndexIsValid(index) &&
- index->indnatts == 1 &&
+ if (index->indnatts == 1 &&
index->indisunique && index->indimmediate &&
index->indkey.values[0] == ObjectIdAttributeNumber &&
indclass->values[0] == OID_BTREE_OPS_OID &&
heap_attisnull(htup, Anum_pg_index_indpred))
oidIndex = index->indexrelid;
+
+ if (index->indisunique &&
+ index->indimmediate &&
+ heap_attisnull(htup, Anum_pg_index_indpred))
+ {
+ /* always prefer primary keys */
+ if (index->indisprimary)
+ pkeyIndex = index->indexrelid;
+ else if (!OidIsValid(pkeyIndex)
+ && !OidIsValid(oidIndex)
+ && !OidIsValid(candidateIndex))
+ {
+ int key;
+ bool found = true;
+ for (key = 0; key < index->indnatts; key++)
+ {
+ int16 attno = index->indkey.values[key];
+ Form_pg_attribute attr;
+ /* internal column, like oid */
+ if (attno <= 0)
+ continue;
+
+ attr = relation->rd_att->attrs[attno - 1];
+ if (!attr->attnotnull)
+ {
+ found = false;
+ break;
+ }
+ }
+ if (found)
+ candidateIndex = index->indexrelid;
+ }
+ }
}
systable_endscan(indscan);
+
+ if (OidIsValid(pkeyIndex))
+ relation->rd_primary = pkeyIndex;
+ /* prefer oid indexes over normal candidate ones */
+ else if (OidIsValid(oidIndex))
+ relation->rd_primary = oidIndex;
+ else if (OidIsValid(candidateIndex))
+ relation->rd_primary = candidateIndex;
+
heap_close(indrel, AccessShareLock);
/* Now save a copy of the completed list in the relcache entry. */
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index bde5f17..930f621 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -121,6 +121,18 @@ typedef struct RelationData
TriggerDesc *trigdesc; /* Trigger info, or NULL if rel has none */
/*
+ * The 'best' primary or candidate key that has been found, only set
+ * correctly if RelationGetIndexList has been called/rd_indexvalid > 0.
+ *
+ * Indexes are chosen in the following order:
+ * * Primary Key
+ * * oid index
+ * * the first (OID order) unique, immediate, non-partial and
+ * non-expression index over one or more NOT NULL'ed columns
+ */
+ Oid rd_primary;
+
+ /*
* rd_options is set whenever rd_rel is loaded into the relcache entry.
* Note that you can NOT look into rd_rel for this data. NULL means "use
* defaults".
--
1.7.12.289.g0ce9864.dirty
0015-wal-decoding-Introduce-wal-decoding-via-catalog-time.patch.gz (application/x-patch-gzip)