pg_filedump 9.3: checksums (and a few other fixes)
Attached is a first draft of an update to pg_filedump for 9.3. I know
pg_filedump is a pgfoundry project, but that seems like it's just there
to host the download; so please excuse the slightly off-topic post here
on -hackers.
I made a few changes to support 9.3, which were mostly fixes related to two
things:
* new htup_details.h and changes related to FK concurrency improvements
* XLogRecPtr is now a uint64
And, of course, I added support for checksums. They are always displayed
and calculated, but it only throws an error if you pass "-k". Only the
user knows whether checksums are enabled, because we removed page-level
bits indicating the presence of a checksum.
The patch is a bit ugly: I had to copy some code, and copy the entire
checksum.c file (minus some Asserts, which don't work in an external
program). Suggestions welcome.
Regards,
Jeff Davis
Attachments:
filedump-9.3.diff (text/x-patch; charset=UTF-8)
diff -Nc pg_filedump-9.2.0/checksum.c pg_filedump-9.3.0j/checksum.c
*** pg_filedump-9.2.0/checksum.c 1969-12-31 16:00:00.000000000 -0800
--- pg_filedump-9.3.0j/checksum.c 2013-06-09 21:20:34.036176831 -0700
***************
*** 0 ****
--- 1,157 ----
+ /*-------------------------------------------------------------------------
+ *
+ * checksum.c
+ * Checksum implementation for data pages.
+ *
+ * Portions Copyright (c) 1996-2013, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/storage/page/checksum.c
+ *
+ *-------------------------------------------------------------------------
+ *
+ * Checksum algorithm
+ *
+ * The algorithm used to checksum pages is chosen for very fast calculation.
+ * Workloads where the database working set fits into OS file cache but not
+ * into shared buffers can read in pages at a very fast pace and the checksum
+ * algorithm itself can become the largest bottleneck.
+ *
+ * The checksum algorithm itself is based on the FNV-1a hash (FNV is shorthand
+ * for Fowler/Noll/Vo). The primitive of a plain FNV-1a hash folds in data 1
+ * byte at a time according to the formula:
+ *
+ * hash = (hash ^ value) * FNV_PRIME
+ *
+ * FNV-1a algorithm is described at http://www.isthe.com/chongo/tech/comp/fnv/
+ *
+ * PostgreSQL doesn't use FNV-1a hash directly because it has bad mixing of
+ * high bits - high order bits in input data only affect high order bits in
+ * output data. To resolve this we xor in the value prior to multiplication
+ * shifted right by 17 bits. The number 17 was chosen because it doesn't have
+ * a common denominator with the set bit positions in FNV_PRIME, and
+ * empirically provides the fastest mixing, so that high order bits of the
+ * final iterations quickly avalanche into lower positions. For performance
+ * reasons we choose to combine 4 bytes at a time. The actual hash formula
+ * used as the basis is:
+ *
+ * hash = (hash ^ value) * FNV_PRIME ^ ((hash ^ value) >> 17)
+ *
+ * The main bottleneck in this calculation is the multiplication latency. To
+ * hide the latency and to make use of SIMD parallelism multiple hash values
+ * are calculated in parallel. The page is treated as a 32 column two
+ * dimensional array of 32 bit values. Each column is aggregated separately
+ * into a partial checksum. Each partial checksum uses a different initial
+ * value (offset basis in FNV terminology). The initial values actually used
+ * were chosen randomly, as the values themselves don't matter as much as that
+ * they are different and don't match anything in real data. After initializing
+ * partial checksums each value in the column is aggregated according to the
+ * above formula. Finally two more iterations of the formula are performed with
+ * value 0 to mix the bits of the last value added.
+ *
+ * The partial checksums are then folded together using xor to form a single
+ * 32-bit checksum. The caller can safely reduce the value to 16 bits
+ * using modulo 2^16-1. That will cause a very slight bias towards lower
+ * values but this is not significant for the performance of the
+ * checksum.
+ *
+ * The algorithm choice was based on what instructions are available in SIMD
+ * instruction sets. This meant that a fast and good algorithm needed to use
+ * multiplication as the main mixing operator. The simplest multiplication
+ * based checksum primitive is the one used by FNV. The prime used is chosen
+ * for good dispersion of values. It has no known simple patterns that result
+ * in collisions. Tests of 5-bit differentials of the primitive over 64bit keys
+ * reveals no differentials with 3 or more values out of 100000 random keys
+ * colliding. Avalanche test shows that only high order bits of the last word
+ * have a bias. Tests of 1-4 uncorrelated bit errors, stray 0 and 0xFF bytes,
+ * overwriting page from random position to end with 0 bytes, and overwriting
+ * random segments of page with 0x00, 0xFF and random data all show optimal
+ * 2e-16 false positive rate within margin of error.
+ *
+ * Vectorization of the algorithm requires 32bit x 32bit -> 32bit integer
+ * multiplication instruction. As of 2013 the corresponding instruction is
+ * available on x86 SSE4.1 extensions (pmulld) and ARM NEON (vmul.i32).
+ * Vectorization requires a compiler to do the vectorization for us. For recent
+ * GCC versions the flags -msse4.1 -funroll-loops -ftree-vectorize are enough
+ * to achieve vectorization.
+ *
+ * The optimal amount of parallelism to use depends on CPU specific instruction
+ * latency, SIMD instruction width, throughput and the amount of registers
+ * available to hold intermediate state. Generally, more parallelism is better
+ * up to the point that state doesn't fit in registers and extra load-store
+ * instructions are needed to swap values in/out. The number chosen is a fixed
+ * part of the algorithm because changing the parallelism changes the checksum
+ * result.
+ *
+ * The parallelism number 32 was chosen based on the fact that it is the
+ * largest state that fits into architecturally visible x86 SSE registers while
+ * leaving some free registers for intermediate values. For future processors
+ * with 256bit vector registers this will leave some performance on the table.
+ * When vectorization is not available it might be beneficial to restructure
+ * the computation to calculate a subset of the columns at a time and perform
+ * multiple passes to avoid register spilling. This optimization opportunity
+ * is not used. Current coding also assumes that the compiler has the ability
+ * to unroll the inner loop to avoid loop overhead and minimize register
+ * spilling. For less sophisticated compilers it might be beneficial to manually
+ * unroll the inner loop.
+ */
+ #include "postgres.h"
+
+ #include "storage/checksum.h"
+
+ /* number of checksums to calculate in parallel */
+ #define N_SUMS 32
+ /* prime multiplier of FNV-1a hash */
+ #define FNV_PRIME 16777619
+
+ /*
+ * Base offsets to initialize each of the parallel FNV hashes into a
+ * different initial state.
+ */
+ static const uint32 checksumBaseOffsets[N_SUMS] = {
+ 0x5B1F36E9, 0xB8525960, 0x02AB50AA, 0x1DE66D2A,
+ 0x79FF467A, 0x9BB9F8A3, 0x217E7CD2, 0x83E13D2C,
+ 0xF8D4474F, 0xE39EB970, 0x42C6AE16, 0x993216FA,
+ 0x7B093B5D, 0x98DAFF3C, 0xF718902A, 0x0B1C9CDB,
+ 0xE58F764B, 0x187636BC, 0x5D7B3BB1, 0xE73DE7DE,
+ 0x92BEC979, 0xCCA6C0B2, 0x304A0979, 0x85AA43D4,
+ 0x783125BB, 0x6CA8EAA2, 0xE407EAC6, 0x4B5CFC3E,
+ 0x9FBF8C76, 0x15CA20BE, 0xF2CA9FD3, 0x959BD756
+ };
+
+ /*
+ * Calculate one round of the checksum.
+ */
+ #define CHECKSUM_COMP(checksum, value) do {\
+ uint32 __tmp = (checksum) ^ (value);\
+ (checksum) = __tmp * FNV_PRIME ^ (__tmp >> 17);\
+ } while (0)
+
+ uint32
+ checksum_block(char *data, uint32 size)
+ {
+ uint32 sums[N_SUMS];
+ uint32 (*dataArr)[N_SUMS] = (uint32 (*)[N_SUMS]) data;
+ uint32 result = 0;
+ int i, j;
+
+ /* initialize partial checksums to their corresponding offsets */
+ memcpy(sums, checksumBaseOffsets, sizeof(checksumBaseOffsets));
+
+ /* main checksum calculation */
+ for (i = 0; i < size/sizeof(uint32)/N_SUMS; i++)
+ for (j = 0; j < N_SUMS; j++)
+ CHECKSUM_COMP(sums[j], dataArr[i][j]);
+
+ /* finally add in two rounds of zeroes for additional mixing */
+ for (i = 0; i < 2; i++)
+ for (j = 0; j < N_SUMS; j++)
+ CHECKSUM_COMP(sums[j], 0);
+
+ /* xor fold partial checksums together */
+ for (i = 0; i < N_SUMS; i++)
+ result ^= sums[i];
+
+ return result;
+ }
Common subdirectories: pg_filedump-9.2.0/.deps and pg_filedump-9.3.0j/.deps
diff -Nc pg_filedump-9.2.0/Makefile pg_filedump-9.3.0j/Makefile
*** pg_filedump-9.2.0/Makefile 2012-03-12 09:02:44.000000000 -0700
--- pg_filedump-9.3.0j/Makefile 2013-06-09 21:15:43.908182347 -0700
***************
*** 1,7 ****
# View README.pg_filedump first
# note this must match version macros in pg_filedump.h
! FD_VERSION=9.2.0
CC=gcc
CFLAGS=-g -O -Wall -Wmissing-prototypes -Wmissing-declarations
--- 1,7 ----
# View README.pg_filedump first
# note this must match version macros in pg_filedump.h
! FD_VERSION=9.3.0
CC=gcc
CFLAGS=-g -O -Wall -Wmissing-prototypes -Wmissing-declarations
***************
*** 17,28 ****
all: pg_filedump
! pg_filedump: pg_filedump.o
${CC} ${CFLAGS} -o pg_filedump pg_filedump.o
pg_filedump.o: pg_filedump.c
${CC} ${CFLAGS} -I${PGSQL_INCLUDE_DIR} pg_filedump.c -c
dist:
rm -rf pg_filedump-${FD_VERSION} pg_filedump-${FD_VERSION}.tar.gz
mkdir pg_filedump-${FD_VERSION}
--- 17,31 ----
all: pg_filedump
! pg_filedump: pg_filedump.o checksum.o
${CC} ${CFLAGS} -o pg_filedump pg_filedump.o checksum.o
pg_filedump.o: pg_filedump.c
${CC} ${CFLAGS} -I${PGSQL_INCLUDE_DIR} pg_filedump.c -c
+ checksum.o: checksum.c
+ ${CC} ${CFLAGS} -I${PGSQL_INCLUDE_DIR} checksum.c -c
+
dist:
rm -rf pg_filedump-${FD_VERSION} pg_filedump-${FD_VERSION}.tar.gz
mkdir pg_filedump-${FD_VERSION}
diff -Nc pg_filedump-9.2.0/Makefile.contrib pg_filedump-9.3.0j/Makefile.contrib
*** pg_filedump-9.2.0/Makefile.contrib 2012-03-12 08:52:57.000000000 -0700
--- pg_filedump-9.3.0j/Makefile.contrib 2013-06-09 21:16:17.524181706 -0700
***************
*** 1,5 ****
PROGRAM = pg_filedump
! OBJS = pg_filedump.o
DOCS = README.pg_filedump
--- 1,5 ----
PROGRAM = pg_filedump
! OBJS = pg_filedump.o checksum.o
DOCS = README.pg_filedump
diff -Nc pg_filedump-9.2.0/pg_filedump.c pg_filedump-9.3.0j/pg_filedump.c
*** pg_filedump-9.2.0/pg_filedump.c 2012-03-12 08:58:31.000000000 -0700
--- pg_filedump-9.3.0j/pg_filedump.c 2013-06-09 21:46:21.240147414 -0700
***************
*** 40,51 ****
static void DisplayOptions (unsigned int validOptions);
static unsigned int ConsumeOptions (int numOptions, char **options);
static int GetOptionValue (char *optionString);
! static void FormatBlock ();
static unsigned int GetBlockSize ();
static unsigned int GetSpecialSectionType (Page page);
static bool IsBtreeMetaPage(Page page);
static void CreateDumpFileHeader (int numOptions, char **options);
! static int FormatHeader (Page page);
static void FormatItemBlock (Page page);
static void FormatItem (unsigned int numBytes, unsigned int startIndex,
unsigned int formatAs);
--- 40,51 ----
static void DisplayOptions (unsigned int validOptions);
static unsigned int ConsumeOptions (int numOptions, char **options);
static int GetOptionValue (char *optionString);
! static void FormatBlock (BlockNumber blkno);
static unsigned int GetBlockSize ();
static unsigned int GetSpecialSectionType (Page page);
static bool IsBtreeMetaPage(Page page);
static void CreateDumpFileHeader (int numOptions, char **options);
! static int FormatHeader (Page page, BlockNumber blkno);
static void FormatItemBlock (Page page);
static void FormatItem (unsigned int numBytes, unsigned int startIndex,
unsigned int formatAs);
***************
*** 54,60 ****
static void FormatBinary (unsigned int numBytes, unsigned int startIndex);
static void DumpBinaryBlock ();
static void DumpFileContents ();
!
// Send properly formed usage information to the user.
static void
--- 54,60 ----
static void FormatBinary (unsigned int numBytes, unsigned int startIndex);
static void DumpBinaryBlock ();
static void DumpFileContents ();
! static uint16 PageCalcChecksum16 (Page page, BlockNumber blkno);
// Send properly formed usage information to the user.
static void
***************
*** 288,293 ****
--- 288,298 ----
SET_OPTION (itemOptions, ITEM_DETAIL, 'i');
break;
+ // Verify block checksums
+ case 'k':
+ SET_OPTION (blockOptions, BLOCK_CHECKSUMS, 'k');
+ break;
+
// Interpret items as standard index values
case 'x':
SET_OPTION (itemOptions, ITEM_INDEX, 'x');
***************
*** 522,527 ****
--- 527,561 ----
return false;
}
+ static uint16
+ PageCalcChecksum16(Page page, BlockNumber blkno)
+ {
+ PageHeader phdr = (PageHeader) page;
+ uint16 save_checksum;
+ uint32 checksum;
+
+ /*
+ * Save pd_checksum and set it to zero, so that the checksum calculation
+ * isn't affected by the checksum stored on the page. We do this to
+ * allow optimization of the checksum calculation on the whole block
+ * in one go.
+ */
+ save_checksum = phdr->pd_checksum;
+ phdr->pd_checksum = 0;
+ checksum = checksum_block(page, BLCKSZ);
+ phdr->pd_checksum = save_checksum;
+
+ /* mix in the block number to detect transposed pages */
+ checksum ^= blkno;
+
+ /*
+ * Reduce to a uint16 (to fit in the pd_checksum field) with an offset of
+ * one. That avoids checksums of zero, which seems like a good idea.
+ */
+ return (checksum % 65535) + 1;
+ }
+
+
// Display a header for the dump so we know the file name, the options
// used and the time the dump was taken
static void
***************
*** 555,561 ****
// Dump out a formatted block header for the requested block
static int
! FormatHeader (Page page)
{
int rc = 0;
unsigned int headerBytes;
--- 589,595 ----
// Dump out a formatted block header for the requested block
static int
! FormatHeader (Page page, BlockNumber blkno)
{
int rc = 0;
unsigned int headerBytes;
***************
*** 609,623 ****
" Block: Size %4d Version %4u Upper %4u (0x%04hx)\n"
" LSN: logid %6d recoff 0x%08x Special %4u (0x%04hx)\n"
" Items: %4d Free Space: %4u\n"
! " TLI: 0x%04x Prune XID: 0x%08x Flags: 0x%04x (%s)\n"
" Length (including item array): %u\n\n",
pageOffset, pageHeader->pd_lower, pageHeader->pd_lower,
(int) PageGetPageSize (page), blockVersion,
pageHeader->pd_upper, pageHeader->pd_upper,
! pageLSN.xlogid, pageLSN.xrecoff,
pageHeader->pd_special, pageHeader->pd_special,
maxOffset, pageHeader->pd_upper - pageHeader->pd_lower,
! pageHeader->pd_tli, pageHeader->pd_prune_xid,
pageHeader->pd_flags, flagString,
headerBytes);
--- 643,657 ----
" Block: Size %4d Version %4u Upper %4u (0x%04hx)\n"
" LSN: logid %6d recoff 0x%08x Special %4u (0x%04hx)\n"
" Items: %4d Free Space: %4u\n"
! " Checksum: %05hu Prune XID: 0x%08x Flags: 0x%04x (%s)\n"
" Length (including item array): %u\n\n",
pageOffset, pageHeader->pd_lower, pageHeader->pd_lower,
(int) PageGetPageSize (page), blockVersion,
pageHeader->pd_upper, pageHeader->pd_upper,
! (uint32) (pageLSN >> 32), (uint32) pageLSN,
pageHeader->pd_special, pageHeader->pd_special,
maxOffset, pageHeader->pd_upper - pageHeader->pd_lower,
! pageHeader->pd_checksum, pageHeader->pd_prune_xid,
pageHeader->pd_flags, flagString,
headerBytes);
***************
*** 647,652 ****
--- 681,694 ----
|| (pageHeader->pd_upper < pageHeader->pd_lower)
|| (pageHeader->pd_special > blockSize))
printf (" Error: Invalid header information.\n\n");
+
+ if (blockOptions & BLOCK_CHECKSUMS)
+ {
+ uint16 calc_checksum = PageCalcChecksum16(page, blkno);
+ if (calc_checksum != pageHeader->pd_checksum)
+ printf(" Error: checksum failure: calculated %05hu.\n\n",
+ calc_checksum);
+ }
}
// If we have reached the end of file while interpreting the header, let
***************
*** 933,939 ****
printf (" XMIN: %u XMAX: %u CID|XVAC: %u",
HeapTupleHeaderGetXmin(htup),
! HeapTupleHeaderGetXmax(htup),
HeapTupleHeaderGetRawCommandId(htup));
if (infoMask & HEAP_HASOID)
--- 975,981 ----
printf (" XMIN: %u XMAX: %u CID|XVAC: %u",
HeapTupleHeaderGetXmin(htup),
! HeapTupleHeaderGetRawXmax(htup),
HeapTupleHeaderGetRawCommandId(htup));
if (infoMask & HEAP_HASOID)
***************
*** 958,969 ****
strcat (flagString, "HASEXTERNAL|");
if (infoMask & HEAP_HASOID)
strcat (flagString, "HASOID|");
if (infoMask & HEAP_COMBOCID)
strcat (flagString, "COMBOCID|");
if (infoMask & HEAP_XMAX_EXCL_LOCK)
strcat (flagString, "XMAX_EXCL_LOCK|");
! if (infoMask & HEAP_XMAX_SHARED_LOCK)
! strcat (flagString, "XMAX_SHARED_LOCK|");
if (infoMask & HEAP_XMIN_COMMITTED)
strcat (flagString, "XMIN_COMMITTED|");
if (infoMask & HEAP_XMIN_INVALID)
--- 1000,1015 ----
strcat (flagString, "HASEXTERNAL|");
if (infoMask & HEAP_HASOID)
strcat (flagString, "HASOID|");
+ if (infoMask & HEAP_XMAX_KEYSHR_LOCK)
+ strcat (flagString, "XMAX_KEYSHR_LOCK|");
if (infoMask & HEAP_COMBOCID)
strcat (flagString, "COMBOCID|");
if (infoMask & HEAP_XMAX_EXCL_LOCK)
strcat (flagString, "XMAX_EXCL_LOCK|");
! if (infoMask & HEAP_XMAX_SHR_LOCK)
! strcat (flagString, "XMAX_SHR_LOCK|");
! if (infoMask & HEAP_XMAX_LOCK_ONLY)
! strcat (flagString, "XMAX_LOCK_ONLY|");
if (infoMask & HEAP_XMIN_COMMITTED)
strcat (flagString, "XMIN_COMMITTED|");
if (infoMask & HEAP_XMIN_INVALID)
***************
*** 981,986 ****
--- 1027,1034 ----
if (infoMask & HEAP_MOVED_IN)
strcat (flagString, "MOVED_IN|");
+ if (infoMask2 & HEAP_KEYS_UPDATED)
+ strcat (flagString, "KEYS_UPDATED|");
if (infoMask2 & HEAP_HOT_UPDATED)
strcat (flagString, "HOT_UPDATED|");
if (infoMask2 & HEAP_ONLY_TUPLE)
***************
*** 1204,1210 ****
// For each block, dump out formatted header and content information
static void
! FormatBlock ()
{
Page page = (Page) buffer;
pageOffset = blockSize * currentBlock;
--- 1252,1258 ----
// For each block, dump out formatted header and content information
static void
! FormatBlock (BlockNumber blkno)
{
Page page = (Page) buffer;
pageOffset = blockSize * currentBlock;
***************
*** 1224,1230 ****
int rc;
// Every block contains a header, items and possibly a special
// section. Beware of partial block reads though
! rc = FormatHeader (page);
// If we didn't encounter a partial read in the header, carry on...
if (rc != EOF_ENCOUNTERED)
--- 1272,1278 ----
int rc;
// Every block contains a header, items and possibly a special
// section. Beware of partial block reads though
! rc = FormatHeader (page, blkno);
// If we didn't encounter a partial read in the header, carry on...
if (rc != EOF_ENCOUNTERED)
***************
*** 1340,1354 ****
controlData->system_identifier,
dbState,
ctime (&(cd_time)),
! controlData->checkPoint.xlogid, controlData->checkPoint.xrecoff,
! controlData->prevCheckPoint.xlogid, controlData->prevCheckPoint.xrecoff,
! checkPoint->redo.xlogid, checkPoint->redo.xrecoff,
checkPoint->ThisTimeLineID,
checkPoint->nextXidEpoch, checkPoint->nextXid,
checkPoint->nextOid,
checkPoint->nextMulti, checkPoint->nextMultiOffset,
ctime (&cp_time),
! controlData->minRecoveryPoint.xlogid, controlData->minRecoveryPoint.xrecoff,
controlData->maxAlign,
controlData->floatFormat,
(controlData->floatFormat == FLOATFORMAT_VALUE ?
--- 1388,1402 ----
controlData->system_identifier,
dbState,
ctime (&(cd_time)),
! (uint32) (controlData->checkPoint >> 32), (uint32) controlData->checkPoint,
! (uint32) (controlData->prevCheckPoint >> 32), (uint32) controlData->prevCheckPoint,
! (uint32) (checkPoint->redo >> 32), (uint32) checkPoint->redo,
checkPoint->ThisTimeLineID,
checkPoint->nextXidEpoch, checkPoint->nextXid,
checkPoint->nextOid,
checkPoint->nextMulti, checkPoint->nextMultiOffset,
ctime (&cp_time),
! (uint32) (controlData->minRecoveryPoint >> 32), (uint32) controlData->minRecoveryPoint,
controlData->maxAlign,
controlData->floatFormat,
(controlData->floatFormat == FLOATFORMAT_VALUE ?
***************
*** 1494,1500 ****
contentsToDump = false;
}
else
! FormatBlock ();
}
}
--- 1542,1548 ----
contentsToDump = false;
}
else
! FormatBlock (currentBlock);
}
}
diff -Nc pg_filedump-9.2.0/pg_filedump.h pg_filedump-9.3.0j/pg_filedump.h
*** pg_filedump-9.2.0/pg_filedump.h 2012-03-12 08:58:23.000000000 -0700
--- pg_filedump-9.3.0j/pg_filedump.h 2013-06-09 21:28:26.944167838 -0700
***************
*** 22,29 ****
* Original Author: Patrick Macdonald <patrickm@redhat.com>
*/
! #define FD_VERSION "9.2.0" /* version ID of pg_filedump */
! #define FD_PG_VERSION "PostgreSQL 9.2.x" /* PG version it works with */
#include "postgres.h"
--- 22,29 ----
* Original Author: Patrick Macdonald <patrickm@redhat.com>
*/
! #define FD_VERSION "9.3.0" /* version ID of pg_filedump */
! #define FD_PG_VERSION "PostgreSQL 9.3.x" /* PG version it works with */
#include "postgres.h"
***************
*** 34,44 ****
--- 34,46 ----
#include "access/gist.h"
#include "access/hash.h"
#include "access/htup.h"
+ #include "access/htup_details.h"
#include "access/itup.h"
#include "access/nbtree.h"
#include "access/spgist_private.h"
#include "catalog/pg_control.h"
#include "storage/bufpage.h"
+ #include "storage/checksum.h"
// Options for Block formatting operations
static unsigned int blockOptions = 0;
***************
*** 49,55 ****
BLOCK_FORMAT = 0x00000004, // -f: Formatted dump of blocks / control file
BLOCK_FORCED = 0x00000008, // -S: Block size forced
BLOCK_NO_INTR = 0x00000010, // -d: Dump straight blocks
! BLOCK_RANGE = 0x00000020 // -R: Specific block range to dump
}
blockSwitches;
--- 51,58 ----
BLOCK_FORMAT = 0x00000004, // -f: Formatted dump of blocks / control file
BLOCK_FORCED = 0x00000008, // -S: Block size forced
BLOCK_NO_INTR = 0x00000010, // -d: Dump straight blocks
! BLOCK_RANGE = 0x00000020, // -R: Specific block range to dump
! BLOCK_CHECKSUMS = 0x00000040 // -k: verify block checksums
}
blockSwitches;
Jeff Davis wrote:
--- 1000,1015 ----
  strcat (flagString, "HASEXTERNAL|");
  if (infoMask & HEAP_HASOID)
  strcat (flagString, "HASOID|");
+ if (infoMask & HEAP_XMAX_KEYSHR_LOCK)
+ strcat (flagString, "XMAX_KEYSHR_LOCK|");
  if (infoMask & HEAP_COMBOCID)
  strcat (flagString, "COMBOCID|");
  if (infoMask & HEAP_XMAX_EXCL_LOCK)
  strcat (flagString, "XMAX_EXCL_LOCK|");
! if (infoMask & HEAP_XMAX_SHR_LOCK)
! strcat (flagString, "XMAX_SHR_LOCK|");
! if (infoMask & HEAP_XMAX_LOCK_ONLY)
! strcat (flagString, "XMAX_LOCK_ONLY|");
  if (infoMask & HEAP_XMIN_COMMITTED)
  strcat (flagString, "XMIN_COMMITTED|");
  if (infoMask & HEAP_XMIN_INVALID)
Hm, note that XMAX_SHR_LOCK is two bits, so when that flag is present
you will get the three lock modes displayed with the above code, which is
probably going to be misleading. htup_details.h does this:
/*
* Use these to test whether a particular lock is applied to a tuple
*/
#define HEAP_XMAX_IS_SHR_LOCKED(infomask) \
(((infomask) & HEAP_LOCK_MASK) == HEAP_XMAX_SHR_LOCK)
#define HEAP_XMAX_IS_EXCL_LOCKED(infomask) \
(((infomask) & HEAP_LOCK_MASK) == HEAP_XMAX_EXCL_LOCK)
#define HEAP_XMAX_IS_KEYSHR_LOCKED(infomask) \
(((infomask) & HEAP_LOCK_MASK) == HEAP_XMAX_KEYSHR_LOCK)
Presumably it'd be better to do something similar.
--
Álvaro Herrera            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, 2013-06-10 at 01:28 -0400, Alvaro Herrera wrote:
Hm, note that XMAX_SHR_LOCK is two bits, so when that flag is present
you will get the three lock modes displayed with the above code, which is
probably going to be misleading. htup_details.h does this:
/*
* Use these to test whether a particular lock is applied to a tuple
*/
#define HEAP_XMAX_IS_SHR_LOCKED(infomask) \
(((infomask) & HEAP_LOCK_MASK) == HEAP_XMAX_SHR_LOCK)
#define HEAP_XMAX_IS_EXCL_LOCKED(infomask) \
(((infomask) & HEAP_LOCK_MASK) == HEAP_XMAX_EXCL_LOCK)
#define HEAP_XMAX_IS_KEYSHR_LOCKED(infomask) \
(((infomask) & HEAP_LOCK_MASK) == HEAP_XMAX_KEYSHR_LOCK)
Presumably it'd be better to do something similar.
I was hesitant to do too much interpretation of the bits. Do you think
it would be better to just remove the test for XMAX_SHR_LOCK?
Regards,
Jeff Davis
Jeff Davis wrote:
I was hesitant to do too much interpretation of the bits. Do you think
it would be better to just remove the test for XMAX_SHR_LOCK?
I don't know, but then I'm biased because I know what that specific bit
combination means. I guess someone that doesn't know is going to be
surprised by seeing both lock strength bits together .. but maybe
they're just going to have a look at htup_details.h and instantly
understand what's going on. Not sure how likely that is.
--
Álvaro Herrera            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
Jeff Davis wrote:
I was hesitant to do too much interpretation of the bits. Do you think
it would be better to just remove the test for XMAX_SHR_LOCK?
I don't know, but then I'm biased because I know what that specific bit
combination means. I guess someone that doesn't know is going to be
surprised by seeing both lock strength bits together .. but maybe
they're just going to have a look at htup_details.h and instantly
understand what's going on. Not sure how likely that is.
I think it's all right because there are only 4 combinations of the two
bits and all 4 will be printed sensibly if we do it this way. It would
be bad if pg_filedump could print invalid flag combinations in a
misleading way --- but I don't see such a risk here. So we might as
well go for readability.
The thing I'm not too happy about is having to copy the checksum code
into pg_filedump. We just got rid of the need to do that for the CRC
code, and here it is coming back again. Can't we rearrange the core
checksum code similarly to what we did for the CRC stuff recently,
so that you only need access to a .h file for it?
regards, tom lane
On Mon, 2013-06-10 at 11:38 -0400, Tom Lane wrote:
The thing I'm not too happy about is having to copy the checksum code
into pg_filedump. We just got rid of the need to do that for the CRC
code, and here it is coming back again. Can't we rearrange the core
checksum code similarly to what we did for the CRC stuff recently,
so that you only need access to a .h file for it?
The CRC implementation is entirely in header files. Do you think we need
to go that far, or is it fine to just put it in libpgport and link that
to pg_filedump?
Regards,
Jeff Davis
Jeff Davis wrote:
On Mon, 2013-06-10 at 11:38 -0400, Tom Lane wrote:
The thing I'm not too happy about is having to copy the checksum code
into pg_filedump. We just got rid of the need to do that for the CRC
code, and here it is coming back again. Can't we rearrange the core
checksum code similarly to what we did for the CRC stuff recently,
so that you only need access to a .h file for it?
The CRC implementation is entirely in header files. Do you think we need
to go that far, or is it fine to just put it in libpgport and link that
to pg_filedump?
If a lib is okay, use libpgcommon please, not libpgport. But I think a
.h would be better, because there'd be no need to have a built source
tree to build pg_filedump, only the headers installed.
--
Álvaro Herrera            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
Jeff Davis wrote:
The CRC implementation is entirely in header files. Do you think we need
to go that far, or is it fine to just put it in libpgport and link that
to pg_filedump?
If a lib is okay, use libpgcommon please, not libpgport. But I think a
.h would be better, because there'd be no need to have a built source
tree to build pg_filedump, only the headers installed.
Neither lib would provide a useful solution so far as Red Hat's
packaging is concerned, because those libs are not exposed to other
packages (and never will be, unless we start supplying them as .so's)
regards, tom lane
Jeff Davis <pgsql@j-davis.com> writes:
The patch is a bit ugly: I had to copy some code, and copy the entire
checksum.c file (minus some Asserts, which don't work in an external
program). Suggestions welcome.
What I propose we do about this is reduce backend/storage/page/checksum.c
to something like
#include "postgres.h"
#include "storage/checksum.h"
#include "storage/checksum_impl.h"
moving all the code currently in the file into a new .h file. Then,
any external programs such as pg_filedump can use the checksum code
by including checksum_impl.h. This is essentially the same thing we
did with the CRC support functionality some time ago.
Also, we have the cut-point between checksum.c and bufpage.c at the
wrong place. IMO we should move PageCalcChecksum16 in toto into
checksum.c (or really now into checksum_impl.h), because that and not
just checksum_block() is the functionality that is wanted.
regards, tom lane
On Thu, 2013-06-13 at 20:09 -0400, Tom Lane wrote:
What I propose we do about this is reduce backend/storage/page/checksum.c
to something like

#include "postgres.h"
#include "storage/checksum.h"
#include "storage/checksum_impl.h"

moving all the code currently in the file into a new .h file. Then,
any external programs such as pg_filedump can use the checksum code
by including checksum_impl.h. This is essentially the same thing we
did with the CRC support functionality some time ago.
Thank you for taking care of that. After seeing that it needed to be in
a header file, I was going to try doing it all as macros.
I have a question about the commit though: shouldn't both functions be
static if they are in a .h file? Otherwise, it could lead to naming
conflicts. I suppose it's wrong to include the implementation file
twice, but it still might be confusing if someone tries. Two ideas that
come to mind are:
* make both static and then have a trivial wrapper in checksum.c
* export one or both functions, but use #ifndef CHECKSUM_IMPL_H to
prevent redefinition
Also, we have the cut-point between checksum.c and bufpage.c at the
wrong place. IMO we should move PageCalcChecksum16 in toto into
checksum.c (or really now into checksum_impl.h), because that and not
just checksum_block() is the functionality that is wanted.
Agreed.
Regards,
Jeff Davis
Jeff Davis <pgsql@j-davis.com> writes:
I have a question about the commit though: shouldn't both functions be
static if they are in a .h file? Otherwise, it could lead to naming
conflicts. I suppose it's wrong to include the implementation file
twice, but it still might be confusing if someone tries. Two ideas that
come to mind are:
* make both static and then have a trivial wrapper in checksum.c
* export one or both functions, but use #ifndef CHECKSUM_IMPL_H to
prevent redefinition
Ah, you are right, I forgot the #ifndef CHECKSUM_IMPL_H dance. Will fix
in a bit.
regards, tom lane
On 2013-06-14 11:59:04 -0400, Tom Lane wrote:
Jeff Davis <pgsql@j-davis.com> writes:
I have a question about the commit though: shouldn't both functions be
static if they are in a .h file? Otherwise, it could lead to naming
conflicts. I suppose it's wrong to include the implementation file
twice, but it still might be confusing if someone tries. Two ideas that
come to mind are:
* make both static and then have a trivial wrapper in checksum.c
* export one or both functions, but use #ifndef CHECKSUM_IMPL_H to
prevent redefinition
Ah, you are right, I forgot the #ifndef CHECKSUM_IMPL_H dance. Will fix
in a bit.
That won't help against errors if it's included in two different
files/translation units though. I don't really see a case where it
could validly be included multiple times in one TU.
If anything we should #error in that case, but I am not sure it's worth
bothering.
E.g. in rmgrlist.h we have the following comment:
/* there is deliberately not an #ifndef RMGRLIST_H here */
and I think the reasoning behind that comment applies here as well.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes:
On 2013-06-14 11:59:04 -0400, Tom Lane wrote:
Ah, you are right, I forgot the #ifndef CHECKSUM_IMPL_H dance. Will fix
in a bit.
That won't help against errors if it's included in two different
files/translation units though.
Good point, but there's not any real reason to do that --- only
checksum.h should ever be #include'd in more than one file. Any program
using this stuff is expected to #include checksum_impl.h in exactly one
place. So maybe it's fine as-is.
E.g. in rmgrlist.h we have the following comment:
/* there is deliberately not an #ifndef RMGRLIST_H here */
and I think the reasoning behind that comment applies here as well.
Well, that's a different case: there, and also in kwlist.h, there's an
idea that it could actually be useful to #include the file more than
once, redefining the PG_RMGR() macro each time. There's no such use
case that I can see for checksum_impl.h.
regards, tom lane
On Thu, 2013-06-13 at 20:09 -0400, Tom Lane wrote:
What I propose we do about this is reduce backend/storage/page/checksum.c
to something like:
#include "postgres.h"
#include "storage/checksum.h"
#include "storage/checksum_impl.h"
Attached a new diff for pg_filedump that makes use of the above change.
I'm not sure what the resolution of Alvaro's concern was, so I left the
flag reporting the same as the previous patch.
Regards,
Jeff Davis
Attachments:
filedump-9.3b.diff (text/x-patch)
Common subdirectories: pg_filedump-9.2.0/.deps and pg_filedump-9.3.0j/.deps
diff -Nc pg_filedump-9.2.0/Makefile pg_filedump-9.3.0j/Makefile
*** pg_filedump-9.2.0/Makefile 2012-03-12 09:02:44.000000000 -0700
--- pg_filedump-9.3.0j/Makefile 2013-06-18 09:14:42.442220848 -0700
***************
*** 1,7 ****
# View README.pg_filedump first
# note this must match version macros in pg_filedump.h
! FD_VERSION=9.2.0
CC=gcc
CFLAGS=-g -O -Wall -Wmissing-prototypes -Wmissing-declarations
--- 1,7 ----
# View README.pg_filedump first
# note this must match version macros in pg_filedump.h
! FD_VERSION=9.3.0
CC=gcc
CFLAGS=-g -O -Wall -Wmissing-prototypes -Wmissing-declarations
diff -Nc pg_filedump-9.2.0/pg_filedump.c pg_filedump-9.3.0j/pg_filedump.c
*** pg_filedump-9.2.0/pg_filedump.c 2012-03-12 08:58:31.000000000 -0700
--- pg_filedump-9.3.0j/pg_filedump.c 2013-06-18 09:25:42.438208300 -0700
***************
*** 26,31 ****
--- 26,37 ----
#include "utils/pg_crc_tables.h"
+ // checksum_impl.h uses Assert, which doesn't work outside the server
+ #undef Assert
+ #define Assert(X)
+
+ #include "storage/checksum_impl.h"
+
// Global variables for ease of use mostly
static FILE *fp = NULL; // File to dump or format
static char *fileName = NULL; // File name for display
***************
*** 40,51 ****
static void DisplayOptions (unsigned int validOptions);
static unsigned int ConsumeOptions (int numOptions, char **options);
static int GetOptionValue (char *optionString);
! static void FormatBlock ();
static unsigned int GetBlockSize ();
static unsigned int GetSpecialSectionType (Page page);
static bool IsBtreeMetaPage(Page page);
static void CreateDumpFileHeader (int numOptions, char **options);
! static int FormatHeader (Page page);
static void FormatItemBlock (Page page);
static void FormatItem (unsigned int numBytes, unsigned int startIndex,
unsigned int formatAs);
--- 46,57 ----
static void DisplayOptions (unsigned int validOptions);
static unsigned int ConsumeOptions (int numOptions, char **options);
static int GetOptionValue (char *optionString);
! static void FormatBlock (BlockNumber blkno);
static unsigned int GetBlockSize ();
static unsigned int GetSpecialSectionType (Page page);
static bool IsBtreeMetaPage(Page page);
static void CreateDumpFileHeader (int numOptions, char **options);
! static int FormatHeader (Page page, BlockNumber blkno);
static void FormatItemBlock (Page page);
static void FormatItem (unsigned int numBytes, unsigned int startIndex,
unsigned int formatAs);
***************
*** 288,293 ****
--- 294,304 ----
SET_OPTION (itemOptions, ITEM_DETAIL, 'i');
break;
+ // Verify block checksums
+ case 'k':
+ SET_OPTION (blockOptions, BLOCK_CHECKSUMS, 'k');
+ break;
+
// Interpret items as standard index values
case 'x':
SET_OPTION (itemOptions, ITEM_INDEX, 'x');
***************
*** 555,561 ****
// Dump out a formatted block header for the requested block
static int
! FormatHeader (Page page)
{
int rc = 0;
unsigned int headerBytes;
--- 566,572 ----
// Dump out a formatted block header for the requested block
static int
! FormatHeader (Page page, BlockNumber blkno)
{
int rc = 0;
unsigned int headerBytes;
***************
*** 609,623 ****
" Block: Size %4d Version %4u Upper %4u (0x%04hx)\n"
" LSN: logid %6d recoff 0x%08x Special %4u (0x%04hx)\n"
" Items: %4d Free Space: %4u\n"
! " TLI: 0x%04x Prune XID: 0x%08x Flags: 0x%04x (%s)\n"
" Length (including item array): %u\n\n",
pageOffset, pageHeader->pd_lower, pageHeader->pd_lower,
(int) PageGetPageSize (page), blockVersion,
pageHeader->pd_upper, pageHeader->pd_upper,
! pageLSN.xlogid, pageLSN.xrecoff,
pageHeader->pd_special, pageHeader->pd_special,
maxOffset, pageHeader->pd_upper - pageHeader->pd_lower,
! pageHeader->pd_tli, pageHeader->pd_prune_xid,
pageHeader->pd_flags, flagString,
headerBytes);
--- 620,634 ----
" Block: Size %4d Version %4u Upper %4u (0x%04hx)\n"
" LSN: logid %6d recoff 0x%08x Special %4u (0x%04hx)\n"
" Items: %4d Free Space: %4u\n"
! " Checksum: %05hu Prune XID: 0x%08x Flags: 0x%04x (%s)\n"
" Length (including item array): %u\n\n",
pageOffset, pageHeader->pd_lower, pageHeader->pd_lower,
(int) PageGetPageSize (page), blockVersion,
pageHeader->pd_upper, pageHeader->pd_upper,
! (uint32) (pageLSN >> 32), (uint32) pageLSN,
pageHeader->pd_special, pageHeader->pd_special,
maxOffset, pageHeader->pd_upper - pageHeader->pd_lower,
! pageHeader->pd_checksum, pageHeader->pd_prune_xid,
pageHeader->pd_flags, flagString,
headerBytes);
***************
*** 647,652 ****
--- 658,671 ----
|| (pageHeader->pd_upper < pageHeader->pd_lower)
|| (pageHeader->pd_special > blockSize))
printf (" Error: Invalid header information.\n\n");
+
+ if (blockOptions & BLOCK_CHECKSUMS)
+ {
+ uint16 calc_checksum = pg_checksum_page(page, blkno);
+ if (calc_checksum != pageHeader->pd_checksum)
+ printf(" Error: checksum failure: calculated %05hu.\n\n",
+ calc_checksum);
+ }
}
// If we have reached the end of file while interpreting the header, let
***************
*** 933,939 ****
printf (" XMIN: %u XMAX: %u CID|XVAC: %u",
HeapTupleHeaderGetXmin(htup),
! HeapTupleHeaderGetXmax(htup),
HeapTupleHeaderGetRawCommandId(htup));
if (infoMask & HEAP_HASOID)
--- 952,958 ----
printf (" XMIN: %u XMAX: %u CID|XVAC: %u",
HeapTupleHeaderGetXmin(htup),
! HeapTupleHeaderGetRawXmax(htup),
HeapTupleHeaderGetRawCommandId(htup));
if (infoMask & HEAP_HASOID)
***************
*** 958,969 ****
strcat (flagString, "HASEXTERNAL|");
if (infoMask & HEAP_HASOID)
strcat (flagString, "HASOID|");
if (infoMask & HEAP_COMBOCID)
strcat (flagString, "COMBOCID|");
if (infoMask & HEAP_XMAX_EXCL_LOCK)
strcat (flagString, "XMAX_EXCL_LOCK|");
! if (infoMask & HEAP_XMAX_SHARED_LOCK)
! strcat (flagString, "XMAX_SHARED_LOCK|");
if (infoMask & HEAP_XMIN_COMMITTED)
strcat (flagString, "XMIN_COMMITTED|");
if (infoMask & HEAP_XMIN_INVALID)
--- 977,992 ----
strcat (flagString, "HASEXTERNAL|");
if (infoMask & HEAP_HASOID)
strcat (flagString, "HASOID|");
+ if (infoMask & HEAP_XMAX_KEYSHR_LOCK)
+ strcat (flagString, "XMAX_KEYSHR_LOCK|");
if (infoMask & HEAP_COMBOCID)
strcat (flagString, "COMBOCID|");
if (infoMask & HEAP_XMAX_EXCL_LOCK)
strcat (flagString, "XMAX_EXCL_LOCK|");
! if (infoMask & HEAP_XMAX_SHR_LOCK)
! strcat (flagString, "XMAX_SHR_LOCK|");
! if (infoMask & HEAP_XMAX_LOCK_ONLY)
! strcat (flagString, "XMAX_LOCK_ONLY|");
if (infoMask & HEAP_XMIN_COMMITTED)
strcat (flagString, "XMIN_COMMITTED|");
if (infoMask & HEAP_XMIN_INVALID)
***************
*** 981,986 ****
--- 1004,1011 ----
if (infoMask & HEAP_MOVED_IN)
strcat (flagString, "MOVED_IN|");
+ if (infoMask2 & HEAP_KEYS_UPDATED)
+ strcat (flagString, "KEYS_UPDATED|");
if (infoMask2 & HEAP_HOT_UPDATED)
strcat (flagString, "HOT_UPDATED|");
if (infoMask2 & HEAP_ONLY_TUPLE)
***************
*** 1204,1210 ****
// For each block, dump out formatted header and content information
static void
! FormatBlock ()
{
Page page = (Page) buffer;
pageOffset = blockSize * currentBlock;
--- 1229,1235 ----
// For each block, dump out formatted header and content information
static void
! FormatBlock (BlockNumber blkno)
{
Page page = (Page) buffer;
pageOffset = blockSize * currentBlock;
***************
*** 1224,1230 ****
int rc;
// Every block contains a header, items and possibly a special
// section. Beware of partial block reads though
! rc = FormatHeader (page);
// If we didn't encounter a partial read in the header, carry on...
if (rc != EOF_ENCOUNTERED)
--- 1249,1255 ----
int rc;
// Every block contains a header, items and possibly a special
// section. Beware of partial block reads though
! rc = FormatHeader (page, blkno);
// If we didn't encounter a partial read in the header, carry on...
if (rc != EOF_ENCOUNTERED)
***************
*** 1340,1354 ****
controlData->system_identifier,
dbState,
ctime (&(cd_time)),
! controlData->checkPoint.xlogid, controlData->checkPoint.xrecoff,
! controlData->prevCheckPoint.xlogid, controlData->prevCheckPoint.xrecoff,
! checkPoint->redo.xlogid, checkPoint->redo.xrecoff,
checkPoint->ThisTimeLineID,
checkPoint->nextXidEpoch, checkPoint->nextXid,
checkPoint->nextOid,
checkPoint->nextMulti, checkPoint->nextMultiOffset,
ctime (&cp_time),
! controlData->minRecoveryPoint.xlogid, controlData->minRecoveryPoint.xrecoff,
controlData->maxAlign,
controlData->floatFormat,
(controlData->floatFormat == FLOATFORMAT_VALUE ?
--- 1365,1379 ----
controlData->system_identifier,
dbState,
ctime (&(cd_time)),
! (uint32) (controlData->checkPoint >> 32), (uint32) controlData->checkPoint,
! (uint32) (controlData->prevCheckPoint >> 32), (uint32) controlData->prevCheckPoint,
! (uint32) (checkPoint->redo >> 32), (uint32) checkPoint->redo,
checkPoint->ThisTimeLineID,
checkPoint->nextXidEpoch, checkPoint->nextXid,
checkPoint->nextOid,
checkPoint->nextMulti, checkPoint->nextMultiOffset,
ctime (&cp_time),
! (uint32) (controlData->minRecoveryPoint), (uint32) (controlData->minRecoveryPoint),
controlData->maxAlign,
controlData->floatFormat,
(controlData->floatFormat == FLOATFORMAT_VALUE ?
***************
*** 1494,1500 ****
contentsToDump = false;
}
else
! FormatBlock ();
}
}
--- 1519,1525 ----
contentsToDump = false;
}
else
! FormatBlock (currentBlock);
}
}
diff -Nc pg_filedump-9.2.0/pg_filedump.h pg_filedump-9.3.0j/pg_filedump.h
*** pg_filedump-9.2.0/pg_filedump.h 2012-03-12 08:58:23.000000000 -0700
--- pg_filedump-9.3.0j/pg_filedump.h 2013-06-18 09:10:28.010225685 -0700
***************
*** 22,29 ****
* Original Author: Patrick Macdonald <patrickm@redhat.com>
*/
! #define FD_VERSION "9.2.0" /* version ID of pg_filedump */
! #define FD_PG_VERSION "PostgreSQL 9.2.x" /* PG version it works with */
#include "postgres.h"
--- 22,29 ----
* Original Author: Patrick Macdonald <patrickm@redhat.com>
*/
! #define FD_VERSION "9.3.0" /* version ID of pg_filedump */
! #define FD_PG_VERSION "PostgreSQL 9.3.x" /* PG version it works with */
#include "postgres.h"
***************
*** 34,44 ****
--- 34,46 ----
#include "access/gist.h"
#include "access/hash.h"
#include "access/htup.h"
+ #include "access/htup_details.h"
#include "access/itup.h"
#include "access/nbtree.h"
#include "access/spgist_private.h"
#include "catalog/pg_control.h"
#include "storage/bufpage.h"
+ #include "storage/checksum.h"
// Options for Block formatting operations
static unsigned int blockOptions = 0;
***************
*** 49,55 ****
BLOCK_FORMAT = 0x00000004, // -f: Formatted dump of blocks / control file
BLOCK_FORCED = 0x00000008, // -S: Block size forced
BLOCK_NO_INTR = 0x00000010, // -d: Dump straight blocks
! BLOCK_RANGE = 0x00000020 // -R: Specific block range to dump
}
blockSwitches;
--- 51,58 ----
BLOCK_FORMAT = 0x00000004, // -f: Formatted dump of blocks / control file
BLOCK_FORCED = 0x00000008, // -S: Block size forced
BLOCK_NO_INTR = 0x00000010, // -d: Dump straight blocks
! BLOCK_RANGE = 0x00000020, // -R: Specific block range to dump
! BLOCK_CHECKSUMS = 0x00000040 // -k: verify block checksums
}
blockSwitches;
On Tue, Jun 18, 2013 at 12:42 PM, Jeff Davis <pgsql@j-davis.com> wrote:
Attached a new diff for pg_filedump that makes use of the above change.
I'm not sure what the resolution of Alvaro's concern was, so I left the
flag reporting the same as the previous patch.
This patch is in the current CommitFest, does it still need to be
reviewed? If so, I notice that the version in pgfoundry's CVS is
rather different than the version the patch seems to have been built
against (presumably the pg_filedump-9.2.0.tar.gz release), and
conflicts in several places with cvs tip.
Also, would anyone be willing to convert this repository to git and
post it on github or similar? pgfoundry is becoming increasingly
difficult to use, for instance the 'Browse CVS Repository' link for
pg_filedump and other projects is broken[1] and apparently has been
for months[2], not to mention the general crumminess of using CVS [3].
Josh
[1]: http://pgfoundry.org/scm/browser.php?group_id=1000541
[2]: http://pgfoundry.org/forum/forum.php?thread_id=15554&forum_id=44&group_id=1000013
[3]: Since the pgfoundry cvs server apparently doesn't support the 'ls' command, you might need to know this to know which module name to check out: cvs -d :pserver:anonymous@cvs.pgfoundry.org:/cvsroot/pgfiledump checkout pg_filedump
Josh Kupershmidt <schmiddy@gmail.com> writes:
This patch is in the current CommitFest, does it still need to be
reviewed? If so, I notice that the version in pgfoundry's CVS is
rather different than the version the patch seems to have been built
against (presumably the pg_filedump-9.2.0.tar.gz release), and
conflicts in several places with cvs tip.
Yeah, I pushed some basic 9.3 compatibility fixes into pg_filedump CVS
a few weeks back. If someone could rebase the patch against that,
I'd appreciate it.
Also, would anyone be willing to convert this repository to git and
post it on github or similar?
I know it's time to get off of pgfoundry, but doing so for pg_filedump
is way down the priority list ...
regards, tom lane
On Mon, 2013-06-24 at 20:34 -0400, Josh Kupershmidt wrote:
This patch is in the current CommitFest, does it still need to be
reviewed? If so, I notice that the version in pgfoundry's CVS is
rather different than the version the patch seems to have been built
against (presumably the pg_filedump-9.2.0.tar.gz release), and
conflicts in several places with cvs tip.
Oh, thank you. After browsing the CVS repo failed, I just made the diff
against 9.2.0. I'll rebase it.
Regards,
Jeff Davis
On Mon, 2013-06-24 at 20:34 -0400, Josh Kupershmidt wrote:
This patch is in the current CommitFest, does it still need to be
reviewed? If so, I notice that the version in pgfoundry's CVS is
rather different than the version the patch seems to have been built
against (presumably the pg_filedump-9.2.0.tar.gz release), and
conflicts in several places with cvs tip.
Rebased against CVS tip; attached.
Also, would anyone be willing to convert this repository to git and
post it on github or similar? pgfoundry is becoming increasingly
difficult to use, for instance the 'Browse CVS Repository' link for
pg_filedump and other projects is broken[1] and apparently has been
for months[2], not to mention the general crumminess of using CVS [3].
Eventually, it would be nice to have a more full-featured offline
checker utility. Do we want to try to turn this utility into that, or
make a new one?
Regards,
Jeff Davis
Attachments:
filedump-9.3.checksums.diff (text/x-patch)
Only in pg_filedump.checksums/: .deps
Only in pg_filedump.checksums/: pg_filedump
diff -rc pg_filedump/pg_filedump.c pg_filedump.checksums/pg_filedump.c
*** pg_filedump/pg_filedump.c 2013-06-06 11:33:17.000000000 -0700
--- pg_filedump.checksums/pg_filedump.c 2013-06-26 11:53:45.780117294 -0700
***************
*** 26,31 ****
--- 26,38 ----
#include "utils/pg_crc_tables.h"
+ // checksum_impl.h uses Assert, which doesn't work outside the server
+ #undef Assert
+ #define Assert(X)
+
+ #include "storage/checksum.h"
+ #include "storage/checksum_impl.h"
+
// Global variables for ease of use mostly
static FILE *fp = NULL; // File to dump or format
static char *fileName = NULL; // File name for display
***************
*** 40,51 ****
static void DisplayOptions (unsigned int validOptions);
static unsigned int ConsumeOptions (int numOptions, char **options);
static int GetOptionValue (char *optionString);
! static void FormatBlock ();
static unsigned int GetBlockSize ();
static unsigned int GetSpecialSectionType (Page page);
static bool IsBtreeMetaPage(Page page);
static void CreateDumpFileHeader (int numOptions, char **options);
! static int FormatHeader (Page page);
static void FormatItemBlock (Page page);
static void FormatItem (unsigned int numBytes, unsigned int startIndex,
unsigned int formatAs);
--- 47,58 ----
static void DisplayOptions (unsigned int validOptions);
static unsigned int ConsumeOptions (int numOptions, char **options);
static int GetOptionValue (char *optionString);
! static void FormatBlock (BlockNumber blkno);
static unsigned int GetBlockSize ();
static unsigned int GetSpecialSectionType (Page page);
static bool IsBtreeMetaPage(Page page);
static void CreateDumpFileHeader (int numOptions, char **options);
! static int FormatHeader (Page page, BlockNumber blkno);
static void FormatItemBlock (Page page);
static void FormatItem (unsigned int numBytes, unsigned int startIndex,
unsigned int formatAs);
***************
*** 288,293 ****
--- 295,305 ----
SET_OPTION (itemOptions, ITEM_DETAIL, 'i');
break;
+ // Verify block checksums
+ case 'k':
+ SET_OPTION (blockOptions, BLOCK_CHECKSUMS, 'k');
+ break;
+
// Interpret items as standard index values
case 'x':
SET_OPTION (itemOptions, ITEM_INDEX, 'x');
***************
*** 555,561 ****
// Dump out a formatted block header for the requested block
static int
! FormatHeader (Page page)
{
int rc = 0;
unsigned int headerBytes;
--- 567,573 ----
// Dump out a formatted block header for the requested block
static int
! FormatHeader (Page page, BlockNumber blkno)
{
int rc = 0;
unsigned int headerBytes;
***************
*** 647,652 ****
--- 659,672 ----
|| (pageHeader->pd_upper < pageHeader->pd_lower)
|| (pageHeader->pd_special > blockSize))
printf (" Error: Invalid header information.\n\n");
+
+ if (blockOptions & BLOCK_CHECKSUMS)
+ {
+ uint16 calc_checksum = pg_checksum_page(page, blkno);
+ if (calc_checksum != pageHeader->pd_checksum)
+ printf(" Error: checksum failure: calculated 0x%04x.\n\n",
+ calc_checksum);
+ }
}
// If we have reached the end of file while interpreting the header, let
***************
*** 1208,1214 ****
// For each block, dump out formatted header and content information
static void
! FormatBlock ()
{
Page page = (Page) buffer;
pageOffset = blockSize * currentBlock;
--- 1228,1234 ----
// For each block, dump out formatted header and content information
static void
! FormatBlock (BlockNumber blkno)
{
Page page = (Page) buffer;
pageOffset = blockSize * currentBlock;
***************
*** 1228,1234 ****
int rc;
// Every block contains a header, items and possibly a special
// section. Beware of partial block reads though
! rc = FormatHeader (page);
// If we didn't encounter a partial read in the header, carry on...
if (rc != EOF_ENCOUNTERED)
--- 1248,1254 ----
int rc;
// Every block contains a header, items and possibly a special
// section. Beware of partial block reads though
! rc = FormatHeader (page, blkno);
// If we didn't encounter a partial read in the header, carry on...
if (rc != EOF_ENCOUNTERED)
***************
*** 1498,1504 ****
contentsToDump = false;
}
else
! FormatBlock ();
}
}
--- 1518,1524 ----
contentsToDump = false;
}
else
! FormatBlock (currentBlock);
}
}
Only in pg_filedump.checksums/: pg_filedump.c~
diff -rc pg_filedump/pg_filedump.h pg_filedump.checksums/pg_filedump.h
*** pg_filedump/pg_filedump.h 2013-06-06 11:33:17.000000000 -0700
--- pg_filedump.checksums/pg_filedump.h 2013-06-26 11:49:20.520122340 -0700
***************
*** 50,56 ****
BLOCK_FORMAT = 0x00000004, // -f: Formatted dump of blocks / control file
BLOCK_FORCED = 0x00000008, // -S: Block size forced
BLOCK_NO_INTR = 0x00000010, // -d: Dump straight blocks
! BLOCK_RANGE = 0x00000020 // -R: Specific block range to dump
}
blockSwitches;
--- 50,57 ----
BLOCK_FORMAT = 0x00000004, // -f: Formatted dump of blocks / control file
BLOCK_FORCED = 0x00000008, // -S: Block size forced
BLOCK_NO_INTR = 0x00000010, // -d: Dump straight blocks
! BLOCK_RANGE = 0x00000020, // -R: Specific block range to dump
! BLOCK_CHECKSUMS = 0x00000040 // -k: verify block checksums
}
blockSwitches;
Only in pg_filedump.checksums/: pg_filedump.o
Jeff Davis <pgsql@j-davis.com> writes:
On Mon, 2013-06-24 at 20:34 -0400, Josh Kupershmidt wrote:
This patch is in the current CommitFest, does it still need to be
reviewed? If so, I notice that the version in pgfoundry's CVS is
rather different than the version the patch seems to have been built
against (presumably the pg_filedump-9.2.0.tar.gz release), and
conflicts in several places with cvs tip.
Rebased against CVS tip; attached.
Thanks. I'm feeling pretty overworked these days but will try to push
this into pgfoundry in a timely fashion.
Eventually, it would be nice to have a more full-featured offline
checker utility. Do we want to try to turn this utility into that, or
make a new one?
TBH, I've always been annoyed that pg_filedump is GPL and so there's no
way for us to just ship it in contrib. (That stems from Red Hat
corporate policy of a dozen years ago, but the conflict is real anyway.)
If somebody is sufficiently excited about this topic to do something
that's largely new anyway, I'd be in favor of starting from scratch so
it could be put under the usual Postgres license.
regards, tom lane
On Wed, Jun 26, 2013 at 8:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
TBH, I've always been annoyed that pg_filedump is GPL and so there's no
way for us to just ship it in contrib. (That stems from Red Hat
corporate policy of a dozen years ago, but the conflict is real anyway.)
If somebody is sufficiently excited about this topic to do something
that's largely new anyway, I'd be in favor of starting from scratch so
it could be put under the usual Postgres license.
Heroku are interested in online verification of basebackups (i.e.
using checksums to verify the integrity of heap files as they are
backed up, with a view to relying less and less on logical backups). I
am very glad that you made the page checksums stuff available to
external utilities in commit f04216341dd1cc235e975f93ac806d9d3729a344.
In the last couple of days, I haven't been able to figure out a way to
solve the problem of torn pages in a way that isn't a complete kludge
(with a hopefully-acceptable risk of false positives), so I've been
operating under the assumption that anything I produce here won't be
up to the standards of contrib. I had intended to release whatever
results as an open source project anyway. However, if we can figure
out a way to solve the torn pages problem, or at least produce
something acceptable, I think I'd certainly be able to find the time
to work on a contrib module that is mainly concerned with verifying
basebackups, but also offers some pg_filedump-like functionality.
That's something largely new.
--
Peter Geoghegan
On 2013-06-26 21:18:49 -0700, Peter Geoghegan wrote:
On Wed, Jun 26, 2013 at 8:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
TBH, I've always been annoyed that pg_filedump is GPL and so there's no
way for us to just ship it in contrib. (That stems from Red Hat
corporate policy of a dozen years ago, but the conflict is real anyway.)
If somebody is sufficiently excited about this topic to do something
that's largely new anyway, I'd be in favor of starting from scratch so
it could be put under the usual Postgres license.
Heroku are interested in online verification of basebackups (i.e.
using checksums to verify the integrity of heap files as they are
backed up, with a view to relying less and less on logical backups). I
am very glad that you made the page checksums stuff available to
external utilities in commit f04216341dd1cc235e975f93ac806d9d3729a344.
In the last couple of days, I haven't been able to figure out a way to
solve the problem of torn pages in a way that isn't a complete kludge
(with a hopefully-acceptable risk of false positives), so I've been
operating under the assumption that anything I produce here won't be
up to the standards of contrib.
Why not do this from a function/background worker in the backend where
you can go via the buffer manager to avoid torn pages et al. If you use
a buffer strategy the cache poisoning et al should be controllable.
Greetings,
Andres Freund
On Wed, Jun 26, 2013 at 11:27 PM, Andres Freund <andres@2ndquadrant.com> wrote:
Why not do this from a function/background worker in the backend where
you can go via the buffer manager to avoid torn pages et al. If you use
a buffer strategy the cache poisoning et al should be controllable.
I had considered that, but thought it might be a little bit
aggressive, even with a strategy of BAS_BULKREAD. Maybe the kludge I
have in mind might not end up being that bad in practice, and would
certainly perform better than an approach that used the buffer
manager. But then, going through shared_buffers could be worth the
overhead, if only for the peace of mind of not relying on something
that is subtly broken.
--
Peter Geoghegan
On 2013-06-26 23:42:55 -0700, Peter Geoghegan wrote:
On Wed, Jun 26, 2013 at 11:27 PM, Andres Freund <andres@2ndquadrant.com> wrote:
Why not do this from a function/background worker in the backend where
you can go via the buffer manager to avoid torn pages et al. If you use
a buffer strategy the cache poisoning et al should be controllable.
I had considered that, but thought it might be a little bit
aggressive, even with a strategy of BAS_BULKREAD.
Well, you can influence the pacing yourself, you don't need to rely on
the strategy for that. I'd only use it because of the ringbuffer logic
it has to avoid trashing the cache.
Maybe the kludge I
have in mind might not end up being that bad in practice, and would
certainly perform better than an approach that used the buffer
manager.
What do you have in mind then?
But then, going through shared_buffers could be worth the
overhead, if only for the peace of mind of not relying on something
that is subtly broken.
Spurious alarms quickly lead to people ignoring them, consciously or
not, so trying to take care not to go there sounds like a good idea.
Greetings,
Andres Freund
On Tue, Jun 18, 2013 at 9:42 AM, Jeff Davis <pgsql@j-davis.com> wrote:
I'm not sure what the resolution of Alvaro's concern was, so I left the
flag reporting the same as the previous patch.
Alvaro's concern was that the new flags added (those added by the
foreign key locks patch) do something cute with re-using multiple
other bits in an otherwise nonsensical combination to represent a
distinct state. So as written, the infoMask if statements will result
in spurious reporting of information stored in t_infomask. If you AND
some integer with HEAP_XMAX_SHR_LOCK and get something non-zero,
you'll surely also get a non-zero result with HEAP_LOCK_MASK, because
the latter flag has all the same bits set as the former (plus others,
obviously).
--
Peter Geoghegan
On Thu, Jun 27, 2013 at 12:07 AM, Peter Geoghegan <pg@heroku.com> wrote:
I'm not sure what the resolution of Alvaro's concern was, so I left the
flag reporting the same as the previous patch.
Alvaro's concern was that the new flags added (those added by the
foreign key locks patch) do something cute with re-using multiple
other bits in an otherwise nonsensical combination to represent a
distinct state.
I just realized that it wasn't that you didn't understand the nature
of the problem, but that you weren't sure of a resolution. Sorry for
the noise.
--
Peter Geoghegan
Andres Freund <andres@2ndquadrant.com> writes:
On 2013-06-26 21:18:49 -0700, Peter Geoghegan wrote:
Heroku are interested in online verification of basebackups (i.e.
using checksums to verify the integrity of heap files as they are
backed up, with a view to relying less and less on logical backups).
Why not do this from a function/background worker in the backend where
you can go via the buffer manager to avoid torn pages et al. If you use
a buffer strategy the cache poisoning et al should be controllable.
That would require having a functioning postmaster environment, which
seems like it would make such a tool much less flexible.
regards, tom lane
On 2013-06-27 09:51:07 -0400, Tom Lane wrote:
Andres Freund <andres@2ndquadrant.com> writes:
On 2013-06-26 21:18:49 -0700, Peter Geoghegan wrote:
Heroku are interested in online verification of basebackups (i.e.
using checksums to verify the integrity of heap files as they are
backed up, with a view to relying less and less on logical backups).
Why not do this from a function/background worker in the backend where
you can go via the buffer manager to avoid torn pages et al. If you use
a buffer strategy the cache poisoning et al should be controllable.
That would require having a functioning postmaster environment, which
seems like it would make such a tool much less flexible.
I personally wouldn't trust online backups that aren't proven to be
replayable into a runnable state very much. I have seen too many cases
where that failed.
Maybe the trick is to add a recovery.conf option to make postgres replay
to the first restartpoint and then shutdown. At that point you can be
sure there aren't any torn pages anymore (bugs aside).
In fact that sounds like a rather useful pg_basebackup extension...
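Made concrete, Andres's idea might look something like the following recovery.conf fragment. The stop option is entirely hypothetical (no such parameter exists in 9.3 or any release); it is sketched here only to illustrate the proposal:

```
# standard archive recovery setup
restore_command = 'cp /backup/wal/%f "%p"'

# hypothetical parameter (does not exist): replay WAL until the first
# restartpoint completes -- at which point no torn pages can remain on
# disk (bugs aside) -- then shut the server down cleanly
stop_after_first_restartpoint = on
```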
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, 2013-06-27 at 15:59 +0200, Andres Freund wrote:
Maybe the trick is to add a recovery.conf option to make postgres replay
to the first restartpoint and then shutdown. At that point you can be
sure there aren't any torn pages anymore (bugs aside).
In fact that sounds like a rather useful pg_basebackup extension...
+1
Jeff Davis
Hi,
I have reviewed this patch as a CF reviewer.
(2013/06/27 4:07), Jeff Davis wrote:
On Mon, 2013-06-24 at 20:34 -0400, Josh Kupershmidt wrote:
This patch is in the current CommitFest, does it still need to be
reviewed? If so, I notice that the version in pgfoundry's CVS is
rather different than the version the patch seems to have been built
against (presumably the pg_filedump-9.2.0.tar.gz release), and
conflicts in several places with cvs tip.Rebased against CVS tip; attached.
It looks fine, but I have one question here.
When I run pg_filedump with -k against a database cluster which
does not support checksums, pg_filedump produced a checksum error as
follows. Is this expected or acceptable?
-----------------------------------------------------------------
*******************************************************************
* PostgreSQL File/Block Formatted Dump Utility - Version 9.3.0
*
* File: /tmp/pgsql/data/base/16384/16397
* Options used: -k
*
* Dump created on: Sat Jul 6 10:32:15 2013
*******************************************************************
Block 0 ********************************************************
<Header> -----
Block Offset: 0x00000000 Offsets: Lower 268 (0x010c)
Block: Size 8192 Version 4 Upper 384 (0x0180)
LSN: logid 0 recoff 0x00000000 Special 8192 (0x2000)
Items: 61 Free Space: 116
Checksum: 0x0000 Prune XID: 0x00000000 Flags: 0x0000 ()
Length (including item array): 268
Error: checksum failure: calculated 0xf797.
<Data> ------
Item 1 -- Length: 121 Offset: 8064 (0x1f80) Flags: NORMAL
Item 2 -- Length: 121 Offset: 7936 (0x1f00) Flags: NORMAL
Item 3 -- Length: 121 Offset: 7808 (0x1e80) Flags: NORMAL
-----------------------------------------------------------------
Please check attached script to reproduce it.
Also, I have updated the help message and README.
Please check attached patch.
Regards,
--
Satoshi Nagayasu <snaga@uptime.jp>
Uptime Technologies, LLC. http://www.uptime.jp
Attachments:
filedump-9.3.checksums_v2.diff (text/plain; charset=Shift_JIS) Download
diff --git a/contrib/pg_filedump/README.pg_filedump b/contrib/pg_filedump/README.pg_filedump
index 63d086f..b3050cd 100644
--- a/contrib/pg_filedump/README.pg_filedump
+++ b/contrib/pg_filedump/README.pg_filedump
@@ -2,7 +2,7 @@ pg_filedump - Display formatted contents of a PostgreSQL heap, index,
or control file.
Copyright (c) 2002-2010 Red Hat, Inc.
-Copyright (c) 2011-2012, PostgreSQL Global Development Group
+Copyright (c) 2011-2013, PostgreSQL Global Development Group
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
@@ -59,10 +59,10 @@ not require any manual adjustments of the Makefile.
------------------------------------------------------------------------
Invocation:
-pg_filedump [-abcdfhixy] [-R startblock [endblock]] [-S blocksize] file
+pg_filedump [-abcdfhikxy] [-R startblock [endblock]] [-S blocksize] file
-Defaults are: relative addressing, range of the entire file, block size
- as listed on block 0 in the file
+Defaults are: relative addressing, range of the entire file, block
+ size as listed on block 0 in the file
The following options are valid for heap and index files:
-a Display absolute addresses when formatting (Block header
@@ -74,6 +74,7 @@ The following options are valid for heap and index files:
-f Display formatted block content dump along with interpretation
-h Display this information
-i Display interpreted item details
+ -k Verify block checksums
-R Display specific block ranges within the file (Blocks are
indexed from 0)
[startblock]: block to start at
diff --git a/contrib/pg_filedump/pg_filedump.c b/contrib/pg_filedump/pg_filedump.c
index e5686ea..7d1aa9a 100644
--- a/contrib/pg_filedump/pg_filedump.c
+++ b/contrib/pg_filedump/pg_filedump.c
@@ -26,6 +26,13 @@
#include "utils/pg_crc_tables.h"
+// checksum_impl.h uses Assert, which doesn't work outside the server
+#undef Assert
+#define Assert(X)
+
+#include "storage/checksum.h"
+#include "storage/checksum_impl.h"
+
// Global variables for ease of use mostly
static FILE *fp = NULL; // File to dump or format
static char *fileName = NULL; // File name for display
@@ -40,12 +47,12 @@ static unsigned int blockVersion = 0; // Block version number
static void DisplayOptions (unsigned int validOptions);
static unsigned int ConsumeOptions (int numOptions, char **options);
static int GetOptionValue (char *optionString);
-static void FormatBlock ();
+static void FormatBlock (BlockNumber blkno);
static unsigned int GetBlockSize ();
static unsigned int GetSpecialSectionType (Page page);
static bool IsBtreeMetaPage(Page page);
static void CreateDumpFileHeader (int numOptions, char **options);
-static int FormatHeader (Page page);
+static int FormatHeader (Page page, BlockNumber blkno);
static void FormatItemBlock (Page page);
static void FormatItem (unsigned int numBytes, unsigned int startIndex,
unsigned int formatAs);
@@ -64,11 +71,11 @@ DisplayOptions (unsigned int validOptions)
printf
("\nVersion %s (for %s)"
"\nCopyright (c) 2002-2010 Red Hat, Inc."
- "\nCopyright (c) 2011-2012, PostgreSQL Global Development Group\n",
+ "\nCopyright (c) 2011-2013, PostgreSQL Global Development Group\n",
FD_VERSION, FD_PG_VERSION);
printf
- ("\nUsage: pg_filedump [-abcdfhixy] [-R startblock [endblock]] [-S blocksize] file\n\n"
+ ("\nUsage: pg_filedump [-abcdfhikxy] [-R startblock [endblock]] [-S blocksize] file\n\n"
"Display formatted contents of a PostgreSQL heap/index/control file\n"
"Defaults are: relative addressing, range of the entire file, block\n"
" size as listed on block 0 in the file\n\n"
@@ -82,6 +89,7 @@ DisplayOptions (unsigned int validOptions)
" -f Display formatted block content dump along with interpretation\n"
" -h Display this information\n"
" -i Display interpreted item details\n"
+ " -k Verify block checksums\n"
" -R Display specific block ranges within the file (Blocks are\n"
" indexed from 0)\n" " [startblock]: block to start at\n"
" [endblock]: block to end at\n"
@@ -288,6 +296,11 @@ ConsumeOptions (int numOptions, char **options)
SET_OPTION (itemOptions, ITEM_DETAIL, 'i');
break;
+ // Verify block checksums
+ case 'k':
+ SET_OPTION (blockOptions, BLOCK_CHECKSUMS, 'k');
+ break;
+
// Interpret items as standard index values
case 'x':
SET_OPTION (itemOptions, ITEM_INDEX, 'x');
@@ -555,7 +568,7 @@ CreateDumpFileHeader (int numOptions, char **options)
// Dump out a formatted block header for the requested block
static int
-FormatHeader (Page page)
+FormatHeader (Page page, BlockNumber blkno)
{
int rc = 0;
unsigned int headerBytes;
@@ -647,6 +660,14 @@ FormatHeader (Page page)
|| (pageHeader->pd_upper < pageHeader->pd_lower)
|| (pageHeader->pd_special > blockSize))
printf (" Error: Invalid header information.\n\n");
+
+ if (blockOptions & BLOCK_CHECKSUMS)
+ {
+ uint16 calc_checksum = pg_checksum_page(page, blkno);
+ if (calc_checksum != pageHeader->pd_checksum)
+ printf(" Error: checksum failure: calculated 0x%04x.\n\n",
+ calc_checksum);
+ }
}
// If we have reached the end of file while interpreting the header, let
@@ -1208,7 +1229,7 @@ FormatSpecial ()
// For each block, dump out formatted header and content information
static void
-FormatBlock ()
+FormatBlock (BlockNumber blkno)
{
Page page = (Page) buffer;
pageOffset = blockSize * currentBlock;
@@ -1228,7 +1249,7 @@ FormatBlock ()
int rc;
// Every block contains a header, items and possibly a special
// section. Beware of partial block reads though
- rc = FormatHeader (page);
+ rc = FormatHeader (page, blkno);
// If we didn't encounter a partial read in the header, carry on...
if (rc != EOF_ENCOUNTERED)
@@ -1498,7 +1519,7 @@ DumpFileContents ()
contentsToDump = false;
}
else
- FormatBlock ();
+ FormatBlock (currentBlock);
}
}
diff --git a/contrib/pg_filedump/pg_filedump.h b/contrib/pg_filedump/pg_filedump.h
index e0c61be..1953144 100644
--- a/contrib/pg_filedump/pg_filedump.h
+++ b/contrib/pg_filedump/pg_filedump.h
@@ -50,7 +50,8 @@ typedef enum
BLOCK_FORMAT = 0x00000004, // -f: Formatted dump of blocks / control file
BLOCK_FORCED = 0x00000008, // -S: Block size forced
BLOCK_NO_INTR = 0x00000010, // -d: Dump straight blocks
- BLOCK_RANGE = 0x00000020 // -R: Specific block range to dump
+ BLOCK_RANGE = 0x00000020, // -R: Specific block range to dump
+ BLOCK_CHECKSUMS = 0x00000040 // -k: verify block checksums
}
blockSwitches;
On Sat, 2013-07-06 at 10:30 +0900, Satoshi Nagayasu wrote:
Hi,
It looks fine, but I have one question here.
When I run pg_filedump with -k against a database cluster which
does not support checksums, pg_filedump produced a checksum error as
follows. Is this expected or acceptable?
Thank you for taking a look. That is expected, because there is not a
good way to determine whether the file was created with checksums or
not. So, we rely on the user to supply (or not) the -k flag.
Regards,
Jeff Davis
On Fri, 2013-07-05 at 22:43 -0700, Jeff Davis wrote:
On Sat, 2013-07-06 at 10:30 +0900, Satoshi Nagayasu wrote:
Hi,
It looks fine, but I have one question here.
When I run pg_filedump with -k against a database cluster which
does not support checksums, pg_filedump produced a checksum error as
follows. Is this expected or acceptable?
Thank you for taking a look. That is expected, because there is not a
good way to determine whether the file was created with checksums or
not. So, we rely on the user to supply (or not) the -k flag.
I see this patch is still "waiting on author" in the CF. Is there
something else needed from me, or should we move this to "ready for
committer"?
Regards,
Jeff Davis
On Mon, Jul 8, 2013 at 10:28 AM, Jeff Davis <pgsql@j-davis.com> wrote:
I see this patch is still "waiting on author" in the CF. Is there
something else needed from me, or should we move this to "ready for
committer"?
Well, obviously someone still needs to think through the handling of
the infoMask bits.
Alvaro?
--
Peter Geoghegan
Peter Geoghegan writes:
On Mon, Jul 8, 2013 at 10:28 AM, Jeff Davis <pgsql@j-davis.com> wrote:
I see this patch is still "waiting on author" in the CF. Is there
something else needed from me, or should we move this to "ready for
committer"?
Well, obviously someone still needs to think through the handling of
the infoMask bits.
Well, Tom opined in
/messages/by-id/23249.1370878717@sss.pgh.pa.us that
the current patch is okay. I have a mild opinion that it should instead
print only SHR_LOCK when both bits are set, and one of the others when
only one of them is set. But I don't have a strong opinion about this,
and since Tom disagrees with me, feel free to exercise your own (Jeff's)
judgement.
Tom's the only available committer in this case anyway. [Actually,
pgfoundry.org says it's a zero-people team in the pgfiledump project
right now, but I'm hoping that's just a temporary glitch.]
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Jul 8, 2013 at 11:52 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
Well, Tom opined in
/messages/by-id/23249.1370878717@sss.pgh.pa.us that
the current patch is okay. I have a mild opinion that it should instead
print only SHR_LOCK when both bits are set, and one of the others when
only one of them is set. But I don't have a strong opinion about this,
and since Tom disagrees with me, feel free to exercise your own (Jeff's)
judgement.
I'm inclined to agree with you here, but I suppose it isn't all that important.
--
Peter Geoghegan
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
Well, Tom opined in
/messages/by-id/23249.1370878717@sss.pgh.pa.us that
the current patch is okay. I have a mild opinion that it should instead
print only SHR_LOCK when both bits are set, and one of the others when
only one of them is set. But I don't have a strong opinion about this,
and since Tom disagrees with me, feel free to exercise your own (Jeff's)
judgement.
FWIW, I think that's exactly what I did in the preliminary 9.3 patch
that I committed to pg_filedump a few weeks ago. Could you take a look
at what's there now and see if that's what you meant?
regards, tom lane
On 07/08/2013 04:59 PM, Tom Lane wrote:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
Well, Tom opined in
/messages/by-id/23249.1370878717@sss.pgh.pa.us that
the current patch is okay. I have a mild opinion that it should instead
print only SHR_LOCK when both bits are set, and one of the others when
only one of them is set. But I don't have a strong opinion about this,
and since Tom disagrees with me, feel free to exercise your own (Jeff's)
judgement.
FWIW, I think that's exactly what I did in the preliminary 9.3 patch
that I committed to pg_filedump a few weeks ago. Could you take a look
at what's there now and see if that's what you meant?
So, is this getting committed today, or do we bounce it?
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Josh Berkus <josh@agliodbs.com> writes:
On 07/08/2013 04:59 PM, Tom Lane wrote:
FWIW, I think that's exactly what I did in the preliminary 9.3 patch
that I committed to pg_filedump a few weeks ago. Could you take a look
at what's there now and see if that's what you meant?
So, is this getting committed today, or do we bounce it?
I was hoping for a comment from Alvaro, but wouldn't have gotten to
committing it today in any case. IMO this patch doesn't really belong
in the commitfest queue, since pg_filedump isn't part of the community
distribution.
regards, tom lane
Tom Lane writes:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
Well, Tom opined in
/messages/by-id/23249.1370878717@sss.pgh.pa.us that
the current patch is okay. I have a mild opinion that it should instead
print only SHR_LOCK when both bits are set, and one of the others when
only one of them is set. But I don't have a strong opinion about this,
and since Tom disagrees with me, feel free to exercise your own (Jeff's)
judgement.
FWIW, I think that's exactly what I did in the preliminary 9.3 patch
that I committed to pg_filedump a few weeks ago. Could you take a look
at what's there now and see if that's what you meant?
Here's sample output (-i) from the new code, i.e. this commit:
revision 1.7
date: 2013/06/06 18:33:17; author: tgl; state: Exp; lines: +14 -10
Preliminary updates for Postgres 9.3.
<Data> ------
Item 1 -- Length: 28 Offset: 8160 (0x1fe0) Flags: NORMAL
XMIN: 692 XMAX: 693 CID|XVAC: 0
Block Id: 0 linp Index: 1 Attributes: 1 Size: 24
infomask: 0x0190 (XMAX_KEYSHR_LOCK|XMAX_LOCK_ONLY|XMIN_COMMITTED)
Item 2 -- Length: 28 Offset: 8128 (0x1fc0) Flags: NORMAL
XMIN: 692 XMAX: 694 CID|XVAC: 0
Block Id: 0 linp Index: 2 Attributes: 1 Size: 24
infomask: 0x01d0 (XMAX_KEYSHR_LOCK|XMAX_EXCL_LOCK|XMAX_LOCK_ONLY|XMIN_COMMITTED)
Item 3 -- Length: 28 Offset: 8096 (0x1fa0) Flags: NORMAL
XMIN: 692 XMAX: 695 CID|XVAC: 0
Block Id: 0 linp Index: 3 Attributes: 1 Size: 24
infomask: 0x01c0 (XMAX_EXCL_LOCK|XMAX_LOCK_ONLY|XMIN_COMMITTED)
Item 4 -- Length: 28 Offset: 8064 (0x1f80) Flags: NORMAL
XMIN: 696 XMAX: 697 CID|XVAC: 0
Block Id: 0 linp Index: 4 Attributes: 1 Size: 24
infomask: 0x01c0 (XMAX_EXCL_LOCK|XMAX_LOCK_ONLY|XMIN_COMMITTED|KEYS_UPDATED)
Item 1 has SELECT FOR KEY SHARE
Item 2 has SELECT FOR SHARE
Item 3 has SELECT FOR NO KEY UPDATE
Item 4 has SELECT FOR UPDATE
The one I was talking about is the second case, which prints
KEYSHR_LOCK|EXCL_LOCK to mean that there's a FOR SHARE lock. I have no
problem reading it this way, but I fear that someone unfamiliar with
these bits might be confused. On the other hand, trying to be nice and
interpret these bits (i.e. translate presence of both into something
like SHR_LOCK) might also be confusing, because that bit doesn't really
exist. And one already needs to be careful when interpreting what
KEYS_UPDATED and XMAX_LOCK_ONLY, or the lack thereof, mean.
Perhaps it would be sensible to provide one more output line per tuple,
with interpretation of the flags, so it would tell you whether the tuple
has been locked or updated, and what kind of each it is. I'd propose
something like
status: locked (FOR {KEY SHARE,SHARE,NO KEY UPDATE,UPDATE}) [MultiXact: nnn]
status: [HOT] updated (KEYS UPDATED/KEYS NOT UPDATED) [MultiXact: nnn] To: blk/off
status: deleted [MultiXact: nnn]
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
The one I was talking about is the second case, which prints
KEYSHR_LOCK|EXCL_LOCK to mean that there's a FOR SHARE lock. I have no
problem reading it this way, but I fear that someone unfamiliar with
these bits might be confused. On the other hand, trying to be nice and
interpret these bits (i.e. translate presence of both into something
like SHR_LOCK) might also be confusing, because that bit doesn't really
exist. And one already needs to be careful when interpreting what
KEYS_UPDATED and XMAX_LOCK_ONLY, or the lack thereof, mean.
Perhaps it would be sensible to provide one more output line per tuple,
with interpretation of the flags, so it would tell you whether the tuple
has been locked or updated, and what kind of each it is. I'd propose
something like
status: locked (FOR {KEY SHARE,SHARE,NO KEY UPDATE,UPDATE}) [MultiXact: nnn]
status: [HOT] updated (KEYS UPDATED/KEYS NOT UPDATED) [MultiXact: nnn] To: blk/off
status: deleted [MultiXact: nnn]
Hm. I'm loath to add another output line per tuple, just for space
reasons.
My feeling about this code is that the reason we print the infomask in
hex is so you can see exactly which bits are set if you care, and that
the rest of the line ought to be designed to interpret the bits in as
reader-friendly a way as possible. So I don't buy the notion that we
should just print out a name for each bit that's set. I'd rather
replace individual bit names with items like LOCKED_FOR_KEY_SHARE,
LOCKED_FOR_SHARE, etc in cases where you have to combine multiple
bits to understand the meaning.
regards, tom lane
Tom Lane writes:
My feeling about this code is that the reason we print the infomask in
hex is so you can see exactly which bits are set if you care, and that
the rest of the line ought to be designed to interpret the bits in as
reader-friendly a way as possible. So I don't buy the notion that we
should just print out a name for each bit that's set. I'd rather
replace individual bit names with items like LOCKED_FOR_KEY_SHARE,
LOCKED_FOR_SHARE, etc in cases where you have to combine multiple
bits to understand the meaning.
Okay, that's what I've been saying all along so I cannot but agree. I
haven't reviewed Jeff's patch lately; Jeff, does Tom's suggestion need
some more new code, and if so are you open to doing this work, or shall
I?
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Wed, 2013-07-17 at 13:43 -0400, Alvaro Herrera wrote:
Tom Lane escribió:
My feeling about this code is that the reason we print the infomask in
hex is so you can see exactly which bits are set if you care, and that
the rest of the line ought to be designed to interpret the bits in as
reader-friendly a way as possible. So I don't buy the notion that we
should just print out a name for each bit that's set. I'd rather
replace individual bit names with items like LOCKED_FOR_KEY_SHARE,
LOCKED_FOR_SHARE, etc in cases where you have to combine multiple
bits to understand the meaning.
Okay, that's what I've been saying all along so I cannot but agree. I
haven't reviewed Jeff's patch lately; Jeff, does Tom's suggestion need
some more new code, and if so are you open to doing this work, or shall
I?
At first glance it seems like a pretty trivial change. I'm going on
vacation tomorrow and unfortunately I haven't had a chance to look at
this. Pgfoundry CVS is down, so I can't see whether it's already been
committed or not.
Regards,
Jeff Davis