Re: Review: Patch to compute Max LSN of Data Pages
I think we should expect the provided path to be relative to the current directory,
or perhaps consider it relative to either the data directory or the CWD,
because we normally expect a path given on the command line to be relative to the CWD.
Please find attached a patch that makes the path relative to the CWD and checks that the path is under the data directory.
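To illustrate the kind of check meant here (a minimal sketch, not the code in the attached patch): once both paths are absolute and normalized, a prefix comparison that also requires a separator boundary is enough to tell whether a path is in or below the data directory. The helper name path_is_in_or_below is hypothetical, only '/' is treated as a separator, and the directory is assumed not to be the root directory.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): returns true when "path"
 * is "dir" itself or lies below it.  Both arguments are assumed to be
 * absolute and already normalized (no "." or ".." components and no
 * trailing separator).
 */
static bool
path_is_in_or_below(const char *dir, const char *path)
{
    size_t      dirlen = strlen(dir);

    /* The directory must be a textual prefix of the path ... */
    if (strncmp(dir, path, dirlen) != 0)
        return false;

    /* ... and the prefix must end exactly at a component boundary. */
    return path[dirlen] == '\0' || path[dirlen] == '/';
}
```

The boundary test is what keeps a sibling such as /pg/database from being accepted as lying under /pg/data.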
The only point raised by Alvaro and you that this patch does not yet address is:
Hmm. I think I'd expect that if I give pg_computemaxlsn a number, it
should consider that it is a relfilenode, and so it should get a list of
all segments for all forks of it. So if I give "12345" it should get
12345, 12345.1 and so on, and also 12345_vm 12345_vm.1 and so on.
However, if what I give it is a path, i.e. it contains a slash, I think
it should only consider the specific file mentioned. In that light, I'm
not sure that command line options chosen are the best set.
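One way to read this suggestion, as a rough sketch only: treat an argument containing a slash as a path, and otherwise match directory entries against the relfilenode with an optional fork suffix and an optional segment number. The function names below are hypothetical, and the fork list (_vm, _fsm, _init) is taken from the comment on validateRelfilenodename in the patch; none of this is code from the patch itself.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* An argument containing a slash is treated as a path to one file. */
static bool
arg_is_path(const char *arg)
{
    return strchr(arg, '/') != NULL;
}

/*
 * Does the directory entry "name" belong to "relfilenode"?  Accepts the
 * bare number, an optional fork suffix, and an optional ".N" segment.
 */
static bool
name_belongs_to_relfilenode(const char *relfilenode, const char *name)
{
    static const char *const forks[] = {"_vm", "_fsm", "_init"};
    size_t      len = strlen(relfilenode);
    size_t      i;

    if (len == 0 || strncmp(name, relfilenode, len) != 0)
        return false;
    name += len;

    /* Optional fork suffix. */
    for (i = 0; i < sizeof(forks) / sizeof(forks[0]); i++)
    {
        size_t      flen = strlen(forks[i]);

        if (strncmp(name, forks[i], flen) == 0)
        {
            name += flen;
            break;
        }
    }

    /* Optional segment suffix: '.' followed by one or more digits. */
    if (*name == '.')
    {
        name++;
        if (!isdigit((unsigned char) *name))
            return false;
        while (isdigit((unsigned char) *name))
            name++;
    }

    /* Anything left over means the entry is some other file. */
    return *name == '\0';
}
```

With these two helpers, "12345" would select 12345, 12345.1, 12345_vm, 12345_vm.1 and so on, while "base/1/12345" would select only that one file.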
I am not sure whether we should handle this functionality at all and, if we do, what the best way to expose it to the user would be.
Shall we provide a new option, -r or something similar, for it?
Opinions/Suggestions?
IMHO, such functionality can easily be added in the future.
However, I have no problem implementing it if you are of the opinion that it is basic and should go in with the first version of the feature.
With Regards,
Amit Kapila.
Attachments:
pg_computemaxlsn_v3.patch (application/octet-stream)
*** a/contrib/Makefile
--- b/contrib/Makefile
***************
*** 31,36 **** SUBDIRS = \
--- 31,37 ----
passwordcheck \
pg_archivecleanup \
pg_buffercache \
+ pg_computemaxlsn \
pg_freespacemap \
pg_standby \
pg_stat_statements \
*** /dev/null
--- b/contrib/pg_computemaxlsn/Makefile
***************
*** 0 ****
--- 1,22 ----
+ # contrib/pg_computemaxlsn/Makefile
+
+ PGFILEDESC = "pg_computemaxlsn - a utility to find max LSN from data pages"
+ PGAPPICON = win32
+
+ PROGRAM = pg_computemaxlsn
+ OBJS = pg_computemaxlsn.o $(WIN32RES)
+
+ PG_CPPFLAGS = -I$(srcdir)
+ PG_LIBS = $(libpq_pgport)
+
+
+ ifdef USE_PGXS
+ PG_CONFIG = pg_config
+ PGXS := $(shell $(PG_CONFIG) --pgxs)
+ include $(PGXS)
+ else
+ subdir = contrib/pg_computemaxlsn
+ top_builddir = ../..
+ include $(top_builddir)/src/Makefile.global
+ include $(top_srcdir)/contrib/contrib-global.mk
+ endif
*** /dev/null
--- b/contrib/pg_computemaxlsn/pg_computemaxlsn.c
***************
*** 0 ****
--- 1,834 ----
+ /*-------------------------------------------------------------------------
+ *
+ * pg_computemaxlsn.c
+ * A utility to compute the maximum LSN in data pages
+ *
+ * Portions Copyright (c) 1996-2012, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * contrib/pg_computemaxlsn/pg_computemaxlsn.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+ /*
+ * We have to use postgres.h not postgres_fe.h here, because there's so much
+ * backend-only stuff for reading data files we need. But we need a
+ * frontend-ish environment otherwise. Hence this ugly hack.
+ */
+ #define FRONTEND 1
+
+ #include "postgres.h"
+
+ #include <dirent.h>
+ #include <fcntl.h>
+ #include <locale.h>
+ #include <sys/stat.h>
+ #include <sys/time.h>
+ #include <time.h>
+ #include <unistd.h>
+
+ #include "getopt_long.h"
+
+ #include "access/xlog_internal.h"
+ #include "catalog/catalog.h"
+ #include "storage/bufpage.h"
+ #include "storage/fd.h"
+
+ /* Page header size */
+ #define PAGEHDRSZ (sizeof(PageHeaderData))
+
+ #define validateTablespaceDir(name) ((strlen(name) > 3) && (name[0] == 'P') && (name[1] == 'G') && (name[2] == '_'))
+
+ extern int optind;
+ extern char *optarg;
+ static const char *progname;
+
+ static int FindMaxLSNinFile(char *filename, XLogRecPtr *maxlsn);
+ static int FindMaxLSNinDir(char *path, XLogRecPtr *maxlsn, bool is_fromlink);
+ static int FindMaxLSNinPgData(char *pgdatapath, XLogRecPtr *maxlsn);
+ static void usage(void);
+ static int getLinkPath(struct stat * statbuf, char *path, char *linkpath, int length);
+
+ /*
+ * Removes the parent references
+ * The input *must* have been put through canonicalize_path previously.
+ */
+ static void
+ remove_parent_refernces(char *inpath)
+ {
+ char *path;
+ char *epath;
+ char *edestpath;
+ int len;
+ int i;
+ int pending_strips = 0;
+
+ len = strlen(inpath);
+ path = malloc(len + 2);
+ if (!path)
+ {
+ return;
+ }
+ MemSet(path, 0, len + 2);
+
+ epath = inpath + len;
+ edestpath = path + len;
+
+ for (i = len; i >= 0; i--, epath--)
+ {
+ if (IS_DIR_SEP(*epath))
+ {
+ if ((i > 3) && (*(epath - 1) == '.')
+ && (*(epath - 2) == '.') && IS_DIR_SEP(*(epath - 3)))
+ {
+ /* Matching "/../" - trim the parent directory */
+ if (!pending_strips)
+ pending_strips = 1;
+ pending_strips++;
+ i -= 2;
+ epath -= 2;
+ }
+ else if ((i > 2) && (*(epath - 1) == '.') && IS_DIR_SEP(*(epath - 2)))
+ {
+ /* Matching "/./" - can be skipped */
+ i -= 1;
+ epath -= 1;
+ continue;
+ }
+ else if (pending_strips)
+ {
+ pending_strips--;
+ }
+ }
+
+ if (pending_strips == 0)
+ {
+ *edestpath = *epath;
+ edestpath--;
+ }
+ }
+
+ if (pending_strips)
+ {
+ /* "/../", "/../../" are considered as "/" */
+ if (is_absolute_path(inpath))
+ {
+ *(inpath++) = '/';
+ }
+ else
+ {
+ /* In case of a relative path, prefix with "../" */
+ while (pending_strips--)
+ {
+ strcpy(inpath, "../");
+ inpath += 3;
+ }
+ }
+ }
+
+ strcpy(inpath, edestpath + 1);
+ free(path);
+
+ return;
+ }
+
+ /*
+ * If the given pathname isn't already absolute, make it so, interpreting
+ * it relative to the current working directory.
+ */
+ static char *
+ make_absolute_path(const char *in)
+ {
+ char *result;
+
+ if (is_absolute_path(in))
+ result = strdup(in);
+ else
+ {
+ char cwdbuf[MAXPGPATH];
+
+ if (!getcwd(cwdbuf, sizeof(cwdbuf)))
+ {
+ fprintf(stderr, _("could not get current working directory: %s\n"), strerror(errno));
+ exit(2);
+ }
+
+ result = malloc(strlen(cwdbuf) + strlen(in) + 2);
+ sprintf(result, "%s/%s", cwdbuf, in);
+ }
+
+ canonicalize_path(result);
+ remove_parent_refernces(result);
+ return result;
+ }
+
+ /*
+ * This function validates the given cluster directory - we search for a
+ * small set of subdirectories that we expect to find in a valid data
+ * directory. If any of the subdirectories are missing (or secured against
+ * us) we display an error message and return false.
+ *
+ */
+ static bool
+ check_data_dir(const char *pg_data)
+ {
+ char subDirName[MAXPGPATH];
+ int dnum;
+
+ /* start check with top-most directory */
+ const char *requiredSubdirs[] = {"", "base", "global", "pg_tblspc",
+ "pg_multixact", "pg_subtrans", "pg_clog", "pg_twophase",
+ "pg_xlog"};
+
+ for (dnum = 0; dnum < lengthof(requiredSubdirs); ++dnum)
+ {
+ struct stat statBuf;
+
+ snprintf(subDirName, sizeof(subDirName), "%s%s%s", pg_data,
+ /* Win32 can't stat() a directory with a trailing slash. */
+ *requiredSubdirs[dnum] ? "/" : "",
+ requiredSubdirs[dnum]);
+
+ if (stat(subDirName, &statBuf) != 0)
+ {
+ fprintf(stderr, _("%s: check for \"%s\" failed: %s\n"),
+ progname, subDirName, strerror(errno));
+ return false;
+ }
+ else if (!S_ISDIR(statBuf.st_mode))
+ {
+ fprintf(stderr, _("%s: %s is not a directory.\n"), progname, subDirName);
+ return false;
+ }
+ }
+
+ return true;
+ }
+
+ /*
+ * This function validates the given path is in or below cluster directory or
+ * any one of "base", "global", or "pg_tblspc" subdirectory.
+ */
+ static bool
+ check_path_belongs_to_pgdata(const char *in_pg_data, char *in_path,
+ bool *is_inside_pg_data, bool *path_are_same)
+ {
+ char *usrpath;
+ char *pg_data;
+ int data_path_len;
+
+ pg_data = make_absolute_path(in_pg_data);
+ usrpath = make_absolute_path(in_path);
+
+ data_path_len = strlen(pg_data);
+
+ /* check input usrpath is subdirectory of pg_data usrpath */
+ if (strncmp(pg_data, usrpath, data_path_len) == 0)
+ {
+ if (usrpath[data_path_len] == '\0')
+ {
+ *path_are_same = true;
+ *is_inside_pg_data = true;
+ free(pg_data);
+ free(usrpath);
+ return true;
+ }
+ else if (!IS_DIR_SEP(usrpath[data_path_len]))
+ {
+ free(pg_data);
+ free(usrpath);
+ return false;
+ }
+
+ *is_inside_pg_data = true;
+
+ /* Input usrpath is the same as pg_data but ends with DIR_SEP */
+ if (usrpath[data_path_len + 1] == '\0')
+ {
+ *path_are_same = true;
+ free(pg_data);
+ free(usrpath);
+ return true;
+ }
+
+ /*
+ * Input usrpath is a subdirectory of pg_data, so validate that the
+ * remaining part of usrpath is inside one of "base", "global", or
+ * "pg_tblspc"
+ */
+ if ((path_is_prefix_of_path("base", usrpath + (data_path_len + 1)))
+ || (path_is_prefix_of_path("global", usrpath + (data_path_len + 1)))
+ || (path_is_prefix_of_path("pg_tblspc", usrpath + (data_path_len + 1))))
+ {
+ free(pg_data);
+ free(usrpath);
+ return true;
+ }
+ }
+ free(pg_data);
+ free(usrpath);
+
+ return false;
+ }
+
+ /*
+ * relfilenode name validation.
+ * Format with_ext == true [0-9]+[ \w | _vm | _fsm | _init ][\.][0-9]*
+ * with_ext == false [0-9]+
+ */
+ static bool
+ validateRelfilenodename(char *name, bool with_ext)
+ {
+ int pos = 0;
+
+ while ((name[pos] >= '0') && (name[pos] <= '9'))
+ pos++;
+
+ if (with_ext)
+ {
+ if (name[pos] == '_')
+ {
+ pos++;
+ while ((name[pos] >= 'a') && (name[pos] <= 'z'))
+ pos++;
+ }
+ if (name[pos] == '.')
+ {
+ pos++;
+ while ((name[pos] >= '0') && (name[pos] <= '9'))
+ pos++;
+ }
+ }
+
+ if (name[pos] == 0)
+ return true;
+
+ return false;
+ }
+
+ int
+ main(int argc, char *argv[])
+ {
+ static struct option long_options[] = {
+ {"path", required_argument, NULL, 'p'},
+ {"data-directory", no_argument, NULL, 'P'},
+ {NULL, 0, NULL, 0}
+ };
+
+ int optindex;
+ int c;
+ char *DataDir;
+ int fd;
+ char path[MAXPGPATH];
+ bool print_max_lsn = false;
+ bool print_pgdata_max_lsn = false;
+ char *LsnSearchPath = NULL;
+ XLogRecPtr maxLSN = 0;
+ XLogSegNo logSegNo = 0;
+ int result = 0;
+ bool is_whole_data_dir = false;
+
+ set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_computemaxlsn"));
+
+ progname = get_progname(argv[0]);
+
+ if (argc > 1)
+ {
+ if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
+ {
+ usage();
+ exit(0);
+ }
+ if (strcmp(argv[1], "--version") == 0 || strcmp(argv[1], "-V") == 0)
+ {
+ puts("pg_computemaxlsn (PostgreSQL) " PG_VERSION);
+ exit(0);
+ }
+ }
+
+ while ((c = getopt_long(argc, argv, "p:P", long_options, &optindex)) != -1)
+ {
+ switch (c)
+ {
+ case 'p':
+ if (print_max_lsn)
+ {
+ fprintf(stderr, _("%s: multiple -p options are not supported.\n"), progname);
+ fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
+ exit(1);
+ }
+ print_max_lsn = true;
+ LsnSearchPath = strdup(optarg);
+ if (is_absolute_path(LsnSearchPath))
+ {
+ fprintf(stderr, _("%s: Path \"%s\" should be relative to CWD.\n"),
+ progname, LsnSearchPath);
+ exit(1);
+ }
+ break;
+
+ case 'P':
+ if (print_pgdata_max_lsn)
+ {
+ fprintf(stderr, _("%s: multiple -P options are not supported.\n"), progname);
+ fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
+ exit(1);
+ }
+ print_pgdata_max_lsn = true;
+ break;
+
+ default:
+ fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
+ exit(1);
+ }
+ }
+
+ if (print_max_lsn && print_pgdata_max_lsn)
+ {
+ fprintf(stderr, _("%s: both options -P and -p can not be combined.\n"), progname);
+ exit(1);
+ }
+
+ if (optind == argc)
+ {
+ fprintf(stderr, _("%s: no data directory specified.\n"), progname);
+ fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
+ exit(1);
+ }
+
+ if ((optind + 1) != argc)
+ {
+ fprintf(stderr, _("%s: multiple data directories not supported.\n"), progname);
+ fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
+ exit(1);
+ }
+
+ /*
+ * Don't allow pg_computemaxlsn to be run as root, to avoid overwriting
+ * the ownership of files in the data directory. We need only check for
+ * root -- any other user won't have sufficient permissions to modify
+ * files in the data directory.
+ */
+ #ifndef WIN32
+ if (geteuid() == 0)
+ {
+ fprintf(stderr, _("%s: cannot be executed by \"root\".\n"),
+ progname);
+ fprintf(stderr, _("You must run %s as the PostgreSQL superuser.\n"),
+ progname);
+ exit(1);
+ }
+ #endif
+
+ DataDir = strdup(argv[optind]);
+
+ canonicalize_path(DataDir);
+
+
+ if (!check_data_dir(DataDir))
+ {
+ fprintf(stderr, _("%s: \"%s\" not a valid data directory.\n"),
+ progname, DataDir);
+ exit(1);
+ }
+
+ /*
+ * Check for a postmaster lock file --- if there is one, refuse to
+ * proceed, on grounds we might be interfering with a live installation.
+ */
+ snprintf(path, MAXPGPATH, "%s/postmaster.pid", DataDir);
+
+ if ((fd = open(path, O_RDONLY, 0)) < 0)
+ {
+ if (errno != ENOENT)
+ {
+ fprintf(stderr, _("%s: could not open file \"%s\" for reading: %s\n"),
+ progname, path, strerror(errno));
+ exit(1);
+ }
+ }
+ else
+ {
+ fprintf(stderr, _("%s: lock file \"%s\" exists\n"
+ "Is a server running? If not, delete the lock file and try again.\n"),
+ progname, path);
+ exit(1);
+ }
+
+ if (print_max_lsn)
+ {
+ struct stat fst;
+ bool is_inside_pg_data = false;
+
+ canonicalize_path(LsnSearchPath);
+
+ if (!check_path_belongs_to_pgdata(DataDir, LsnSearchPath, &is_inside_pg_data, &is_whole_data_dir))
+ {
+ if (is_inside_pg_data)
+ {
+ fprintf(stderr, _("%s: Path \"%s\" should be in or below any of \"base\",\"global\" or \"pg_tblspc\" of data directory.\n"),
+ progname, LsnSearchPath);
+ }
+ else
+ {
+ fprintf(stderr, _("%s: Path \"%s\" should be in or below data directory.\n"),
+ progname, LsnSearchPath);
+ }
+ exit(1);
+ }
+
+ if (is_whole_data_dir)
+ {
+ /* Process for whole data directory. */
+ }
+ else if (lstat(LsnSearchPath, &fst) < 0)
+ {
+ if (errno == ENOENT)
+ {
+ fprintf(stderr, _("%s: file or directory \"%s\" does not exist.\n"),
+ progname, LsnSearchPath);
+ }
+ else
+ {
+ fprintf(stderr, _("%s: could not stat file or directory \"%s\": %s\n"),
+ progname, LsnSearchPath, strerror(errno));
+ }
+ exit(1);
+ }
+ else if (getLinkPath(&fst, LsnSearchPath, path, sizeof(path)) > 0)
+ {
+ result = FindMaxLSNinDir(path, &maxLSN, true);
+ }
+ else if (S_ISDIR(fst.st_mode))
+ {
+ result = FindMaxLSNinDir(LsnSearchPath, &maxLSN, false);
+ }
+ else if (S_ISREG(fst.st_mode))
+ {
+ result = FindMaxLSNinFile(LsnSearchPath, &maxLSN);
+ }
+ else
+ {
+ fprintf(stderr, _("%s: skipping special file \"%s\"\n"), progname, LsnSearchPath);
+ }
+ }
+
+ /* By default we need to compute max lsn for database */
+ if ((print_max_lsn == false) || is_whole_data_dir)
+ {
+ result = FindMaxLSNinPgData(DataDir, &maxLSN);
+ }
+
+ if (0 != result)
+ {
+ /* Message already provided, simply exit */
+ exit(1);
+ }
+
+ XLByteToSeg(maxLSN, logSegNo);
+
+ printf("Maximum LSN found is: %X/%X \nWAL segment file name (fileid,seg): %X/%X\n",
+ (uint32)(maxLSN >> 32), (uint32) maxLSN,
+ (uint32) (logSegNo >> 32), (uint32) (logSegNo));
+
+ return 0;
+ }
+
+
+ /*
+ * PageHeaderIsValid: Check page is valid or not
+ */
+ bool
+ PageHeaderIsValid(PageHeader page)
+ {
+ char *pagebytes;
+ int i;
+
+ /* Check normal case */
+ if (PageGetPageSize(page) == BLCKSZ &&
+ PageGetPageLayoutVersion(page) == PG_PAGE_LAYOUT_VERSION &&
+ (page->pd_flags & ~PD_VALID_FLAG_BITS) == 0 &&
+ page->pd_lower >= SizeOfPageHeaderData &&
+ page->pd_lower <= page->pd_upper &&
+ page->pd_upper <= page->pd_special &&
+ page->pd_special <= BLCKSZ &&
+ page->pd_special == MAXALIGN(page->pd_special))
+ return true;
+
+ /*
+ * An all-zeroes page header is also accepted. The result is used only for
+ * logging: even if a page is invalid, the scan continues with the next page.
+ */
+ pagebytes = (char *) page;
+ for (i = 0; i < PAGEHDRSZ; i++)
+ {
+ if (pagebytes[i] != 0)
+ return false;
+ }
+ return true;
+ }
+
+
+ /*
+ * Find the maximum LSN in a single data file (relfilenode file).
+ *
+ */
+ static int
+ FindMaxLSNinFile(char *filename, XLogRecPtr *maxlsn)
+ {
+ XLogRecPtr pagelsn;
+ off_t len,
+ seekpos;
+ uint32 nblocks,
+ blocknum;
+ char buffer[PAGEHDRSZ];
+ int nbytes;
+ int fd;
+
+ if ((fd = open(filename, O_RDONLY | PG_BINARY, 0)) < 0)
+ {
+ /*
+ * If the file does not exist or we can't read it, report an error.
+ */
+ fprintf(stderr, _("%s: could not open file \"%s\" for reading: %s\n"),
+ progname, filename, strerror(errno));
+ return -1;
+ }
+
+ /* Calculate the number of pages in file */
+ len = lseek(fd, 0L, SEEK_END);
+ if (len < 0)
+ {
+ close(fd);
+ fprintf(stderr, _("%s: could not seek in file \"%s\": %s\n"),
+ progname, filename, strerror(errno));
+ return -1;
+ }
+
+ nblocks = (len / BLCKSZ);
+ if (nblocks > RELSEG_SIZE)
+ {
+ /*
+ * The length of one relfilenode file can't be more than RELSEG_SIZE
+ */
+ close(fd);
+ fprintf(stderr, _("%s: file \"%s\" length is more than segment size: %d.\n"),
+ progname, filename, RELSEG_SIZE);
+ return -1;
+ }
+
+ /*
+ * Read only the page header and validate it; if we find an invalid page,
+ * log the details of the page and continue to the next page.
+ */
+ seekpos = 0;
+ for (blocknum = 0; blocknum < nblocks; blocknum++)
+ {
+ len = lseek(fd, seekpos, SEEK_SET);
+ if (len != seekpos)
+ {
+ close(fd);
+ fprintf(stderr, _("%s: could not seek to next page \"%s\": %s\n"),
+ progname, filename, strerror(errno));
+ return -1;
+ }
+
+ nbytes = read(fd, buffer, PAGEHDRSZ);
+ if (nbytes < 0)
+ {
+ close(fd);
+ fprintf(stderr, _("%s: could not read file \"%s\": %s\n"),
+ progname, filename, strerror(errno));
+ return -1;
+ }
+
+ if (PageHeaderIsValid((PageHeader) buffer))
+ {
+ pagelsn = PageGetLSN(buffer);
+ if (XLByteLT(*maxlsn, pagelsn))
+ {
+ *maxlsn = pagelsn;
+ }
+ }
+ else
+ {
+ /*
+ * If page is invalid log the error and continue
+ */
+ fprintf(stderr, _("%s: Invalid page found in file \"%s\" pagid:%d\n"),
+ progname, filename, blocknum);
+ }
+ seekpos += (off_t) BLCKSZ;
+ }
+
+ close(fd);
+ return 0;
+ }
+
+ /*
+ * Read the maximum LSN number in current directory; including sub directories
+ * and links.
+ */
+ static int
+ FindMaxLSNinDir(char *path, XLogRecPtr *maxlsn, bool is_fromlink)
+ {
+ DIR *dir;
+ struct dirent *de;
+ char pathbuf[MAXPGPATH];
+ struct stat statbuf;
+ char linkpath[MAXPGPATH];
+ int result;
+
+ dir = opendir(path);
+ if (NULL == dir)
+ {
+ fprintf(stderr, _("%s: could not open directory \"%s\": %s\n"),
+ progname, path, strerror(errno));
+ return -1;
+ }
+
+ while ((de = readdir(dir)) != NULL)
+ {
+ /* Skip special stuff */
+ if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
+ continue;
+
+ /* Skip temporary files */
+ if (strncmp(de->d_name,
+ PG_TEMP_FILE_PREFIX,
+ strlen(PG_TEMP_FILE_PREFIX)) == 0)
+ continue;
+
+ /*
+ * Skip all the local/global temporary files, and read all the
+ * remaining relfilenode files
+ */
+ if (is_fromlink)
+ {
+ /* If the directory is a link, allow only PG_* paths */
+ if (!validateTablespaceDir(de->d_name))
+ continue;
+ }
+ else if (!validateRelfilenodename(de->d_name, true))
+ continue;
+
+ snprintf(pathbuf, MAXPGPATH, "%s/%s", path, de->d_name);
+
+ if (lstat(pathbuf, &statbuf) != 0)
+ {
+ if (errno != ENOENT)
+ {
+ fprintf(stderr, _("%s: could not stat file or directory \"%s\": %s\n"),
+ progname, pathbuf, strerror(errno));
+ }
+ /* If the file went away while scanning, it's no error. */
+ continue;
+ }
+
+ result = getLinkPath(&statbuf, pathbuf, linkpath, sizeof(linkpath));
+ if (result < 0)
+ {
+ continue;
+ }
+ else if (result > 0)
+ {
+ (void) FindMaxLSNinDir(linkpath, maxlsn, true);
+ }
+ else if (S_ISDIR(statbuf.st_mode))
+ (void) FindMaxLSNinDir(pathbuf, maxlsn, false);
+ else
+ (void) FindMaxLSNinFile(pathbuf, maxlsn);
+ }
+
+ closedir(dir);
+ return 0;
+ }
+
+ /*
+ * Get the link path.
+ * On success, returns the length of the link target path.
+ * Returns zero if the file is not a link.
+ * On failure, returns -1.
+ */
+ static int
+ getLinkPath(struct stat * statbuf, char *path, char *linkpath, int length)
+ {
+ int rllen;
+
+ if (
+ #ifndef WIN32
+ S_ISLNK(statbuf->st_mode)
+ #else
+ pgwin32_is_junction(path)
+ #endif
+ )
+ {
+ #if defined(HAVE_READLINK) || defined(WIN32)
+
+ rllen = readlink(path, linkpath, length);
+ if (rllen < 0)
+ {
+ fprintf(stderr, _("%s: could not read symbolic link \"%s\", so skipping file.\n"),
+ progname, path);
+ return -1;
+ }
+
+ if (rllen >= length)
+ {
+ fprintf(stderr, _("%s: symbolic link \"%s\" target is too long, so skipping file.\n"),
+ progname, path);
+
+ return -1;
+ }
+
+ linkpath[rllen] = '\0';
+
+ return rllen;
+ #else
+ /* tablespaces are not supported on this platform */
+ return -1;
+ #endif /* HAVE_READLINK */
+ }
+
+ return 0;
+ }
+
+
+ /*
+ * Read the maximum LSN number in the DATA directory.
+ */
+ static int
+ FindMaxLSNinPgData(char *pgdatapath, XLogRecPtr *maxlsn)
+ {
+ char pathbuf[MAXPGPATH];
+
+ /* scan all the relfilenodes in data directory */
+ snprintf(pathbuf, MAXPGPATH, "%s/global", pgdatapath);
+ if (0 != FindMaxLSNinDir(pathbuf, maxlsn, false))
+ return -1;
+
+ snprintf(pathbuf, MAXPGPATH, "%s/base", pgdatapath);
+ if (0 != FindMaxLSNinDir(pathbuf, maxlsn, false))
+ return -1;
+
+ snprintf(pathbuf, MAXPGPATH, "%s/pg_tblspc", pgdatapath);
+ if (0 != FindMaxLSNinDir(pathbuf, maxlsn, false))
+ return -1;
+
+ return 0;
+ }
+
+ static void
+ usage(void)
+ {
+ printf(_("%s computes the maximum LSN in PostgreSQL data pages.\n\n"), progname);
+ printf(_("Usage:\n %s [OPTION]... DATADIR\n\n"), progname);
+ printf(_("Options:\n"));
+ printf(_(" -p, --path=RELATIVE_PATH print max LSN from file or directory name\n"));
+ printf(_(" -P, --data-directory print max LSN from whole data directory\n"));
+ printf(_(" -V, --version output version information, then exit\n"));
+ printf(_(" -?, --help show this help, then exit\n"));
+ printf(_("\nReport bugs to <pgsql-bugs@postgresql.org>.\n"));
+ }
*** a/doc/src/sgml/ref/allfiles.sgml
--- b/doc/src/sgml/ref/allfiles.sgml
***************
*** 177,182 **** Complete list of usable sgml source files in this directory.
--- 177,183 ----
<!ENTITY pgDumpall SYSTEM "pg_dumpall.sgml">
<!ENTITY pgReceivexlog SYSTEM "pg_receivexlog.sgml">
<!ENTITY pgResetxlog SYSTEM "pg_resetxlog.sgml">
+ <!ENTITY pgComputemaxlsn SYSTEM "pg_computemaxlsn.sgml">
<!ENTITY pgRestore SYSTEM "pg_restore.sgml">
<!ENTITY postgres SYSTEM "postgres-ref.sgml">
<!ENTITY postmaster SYSTEM "postmaster.sgml">
*** /dev/null
--- b/doc/src/sgml/ref/pg_computemaxlsn.sgml
***************
*** 0 ****
--- 1,79 ----
+ <!--
+ doc/src/sgml/ref/pg_computemaxlsn.sgml
+ PostgreSQL documentation
+ -->
+
+ <refentry id="APP-PGCOMPUTEMAXLSN">
+ <refmeta>
+ <refentrytitle><application>pg_computemaxlsn</application></refentrytitle>
+ <manvolnum>1</manvolnum>
+ <refmiscinfo>Application</refmiscinfo>
+ </refmeta>
+
+ <refnamediv>
+ <refname>pg_computemaxlsn</refname>
+ <refpurpose>computes the maximum LSN in the data pages of a <productname>PostgreSQL</productname> database cluster</refpurpose>
+ </refnamediv>
+
+ <indexterm zone="app-pgcomputemaxlsn">
+ <primary>pg_computemaxlsn</primary>
+ </indexterm>
+
+ <refsynopsisdiv>
+ <cmdsynopsis>
+ <command>pg_computemaxlsn</command>
+ <arg choice="opt"><option>-P</option></arg>
+ <arg choice="opt"><option>-p</option> <replaceable class="parameter">file-name</replaceable> | <replaceable class="parameter">folder-name</replaceable></arg>
+ <arg choice="plain"><replaceable>datadir</replaceable></arg>
+ </cmdsynopsis>
+ </refsynopsisdiv>
+
+ <refsect1 id="R1-APP-PGCOMPUTEMAXLSN-1">
+ <title>Description</title>
+ <para>
+ <command>pg_computemaxlsn</command> computes the maximum LSN from database pages.
+ </para>
+
+ <para>
+ This utility can only be run by the user who installed the server, because
+ it requires read access to the data directory.
+ For safety reasons, you must specify the data directory on the command line.
+ <command>pg_computemaxlsn</command> does not use the environment variable
+ <envar>PGDATA</>.
+ </para>
+
+ <para>
+ The <option>-P</> or <option>--data-directory</> option computes the maximum LSN from all the pages in the data directory.
+ This is the default if no option is provided.
+ </para>
+
+ <para>
+ The <option>-p <replaceable class="parameter">file-name | folder-name</replaceable></option> or
+ <option>--path=<replaceable class="parameter">file-name | folder-name</replaceable></option> option computes the
+ maximum LSN from a specific file or folder. The file or folder must be given as a relative path and must be in or below the data directory.
+ </para>
+
+ <para>
+ The <option>-V</> and <option>--version</> options print
+ the <application>pg_computemaxlsn</application> version and exit. The
+ options <option>-?</> and <option>--help</> show supported arguments,
+ and exit.
+ </para>
+
+ </refsect1>
+
+ <refsect1>
+ <title>Notes</title>
+
+ <para>
+ This command must not be used when the server is
+ running. <command>pg_computemaxlsn</command> will refuse to start up if
+ it finds a server lock file in the data directory. If the
+ server crashed then a lock file might have been left
+ behind; in that case you can remove the lock file to allow
+ <command>pg_computemaxlsn</command> to run. But before you do
+ so, make doubly certain that there is no server process still alive.
+ </para>
+ </refsect1>
+
+ </refentry>
*** a/doc/src/sgml/ref/pg_resetxlog.sgml
--- b/doc/src/sgml/ref/pg_resetxlog.sgml
***************
*** 135,140 **** PostgreSQL documentation
--- 135,150 ----
largest entry in <filename>pg_xlog</>, use <literal>-l 00000001000000320000004B</> or higher.
</para>
+ <para>
+ If <command>pg_resetxlog</command> complains that it cannot determine
+ valid data for <filename>pg_control</>, and the WAL segment files in the
+ directory <filename>pg_xlog</> under the data directory are missing or corrupted,
+ then the utility <command>pg_computemaxlsn</command> can be used to identify a larger WAL segment file from the data files:
+ use the <option>-P</> option to find the maximum LSN in the whole data directory, or
+ <option>-p <filename>file-name | folder-name</></> to find it in a specific file or folder.
+ Once the larger WAL segment file is found, use the <option>-l</> option to set the value.
+ </para>
+
<note>
<para>
<command>pg_resetxlog</command> itself looks at the files in
*** a/doc/src/sgml/reference.sgml
--- b/doc/src/sgml/reference.sgml
***************
*** 248,253 ****
--- 248,254 ----
&pgControldata;
&pgCtl;
&pgResetxlog;
+ &pgComputemaxlsn;
&postgres;
&postmaster;
*** a/src/tools/msvc/Mkvcbuild.pm
--- b/src/tools/msvc/Mkvcbuild.pm
***************
*** 34,40 **** my @contrib_uselibpgport = (
'oid2name', 'pgbench',
'pg_standby', 'pg_archivecleanup',
'pg_test_fsync', 'pg_test_timing',
! 'pg_upgrade', 'vacuumlo');
my $contrib_extralibs = { 'pgbench' => ['wsock32.lib'] };
my $contrib_extraincludes =
{ 'tsearch2' => ['contrib/tsearch2'], 'dblink' => ['src/backend'] };
--- 34,40 ----
'oid2name', 'pgbench',
'pg_standby', 'pg_archivecleanup',
'pg_test_fsync', 'pg_test_timing',
! 'pg_upgrade', 'vacuumlo', 'pg_computemaxlsn');
my $contrib_extralibs = { 'pgbench' => ['wsock32.lib'] };
my $contrib_extraincludes =
{ 'tsearch2' => ['contrib/tsearch2'], 'dblink' => ['src/backend'] };
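For readers who want the gist of FindMaxLSNinFile without reading the whole patch, the scan can be sketched over an in-memory buffer as below. This is a simplified illustration, not the patch code: it assumes 8192-byte blocks, assumes each page starts with its LSN stored as two native-endian 32-bit words (high part first), and skips the page header validation that the patch performs. SKETCH_BLCKSZ and max_lsn_in_pages are made-up names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BLCKSZ 8192      /* assumed block size */

/*
 * Walk a buffer block by block, read the first eight bytes of every
 * complete block as a 64-bit LSN, and return the largest value seen.
 * Returns 0 when the buffer holds no complete block.
 */
static uint64_t
max_lsn_in_pages(const unsigned char *pages, size_t nbytes)
{
    uint64_t    maxlsn = 0;
    size_t      off;

    for (off = 0; off + SKETCH_BLCKSZ <= nbytes; off += SKETCH_BLCKSZ)
    {
        uint32_t    hi;
        uint32_t    lo;
        uint64_t    lsn;

        /* memcpy avoids unaligned reads from the raw buffer. */
        memcpy(&hi, pages + off, sizeof(hi));
        memcpy(&lo, pages + off + sizeof(hi), sizeof(lo));

        lsn = ((uint64_t) hi << 32) | lo;
        if (lsn > maxlsn)
            maxlsn = lsn;
    }

    return maxlsn;
}
```

The patch does the same thing with lseek and read on the file, reading only the page header of each block rather than the whole page.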
Hi Amit
On Mon, Dec 3, 2012 at 6:56 PM, Amit kapila <amit.kapila@huawei.com> wrote:
I think we should expect the provided path to be relative to the current directory,
or perhaps consider it relative to either the data directory or the CWD,
because we normally expect a path given on the command line to be relative to the CWD.
Please find attached a patch that makes the path relative to the CWD and checks that the path is under the data directory.
Works well now. I am wondering, though, why you are disallowing an absolute file path. Any particular reason?
The only point raised by Alvaro and you that this patch does not yet address is:
Hmm. I think I'd expect that if I give pg_computemaxlsn a number, it
should consider that it is a relfilenode, and so it should get a list of
all segments for all forks of it. So if I give "12345" it should get
12345, 12345.1 and so on, and also 12345_vm 12345_vm.1 and so on.
However, if what I give it is a path, i.e. it contains a slash, I think
it should only consider the specific file mentioned. In that light, I'm
not sure that command line options chosen are the best set.
I am not sure whether we should handle this functionality at all and, if we do, what the best way to expose it to the user would be.
Shall we provide a new option, -r or something similar, for it?
Opinions/Suggestions?
IMHO, such functionality can easily be added in the future.
However, I have no problem implementing it if you are of the opinion that it is basic and should go in with the first version of the feature.
I also had a point similar to Alvaro's: allow all the segments of a relation
to be processed for a given relation file name, or add another option to do the
same. But if everybody is fine with leaving it for the future, I do not
have any further concerns with the patch. It is good from my side.
With Regards,
Amit Kapila.
Thanks
Muhammad Usama
Hi Muhammad,
On Friday, December 07, 2012 7:43 PM Muhammad Usama wrote:
Hi Amit
On Mon, Dec 3, 2012 at 6:56 PM, Amit kapila <amit.kapila@huawei.com> wrote:
I think we should expect the provided path to be relative to the current directory,
or perhaps consider it relative to either the data directory or the CWD,
because we normally expect a path given on the command line to be relative to the CWD.
Please find attached a patch that makes the path relative to the CWD and checks that the path is under the data directory.
Works well now.
Thank you for the verification.
I am wondering, though, why you are disallowing an absolute file path. Any particular reason?
The reason for disallowing absolute paths is that they would need to be tested on multiple platforms, and I wanted to keep the scope small.
I thought we could allow absolute paths in the future.
I also had a point similar to Alvaro's: allow all the segments of a relation to be processed for a given relation file name, or add another option to do the same. But if everybody is fine with leaving it for the future, I do not have any further concerns with the patch. It is good from my side.
In my opinion, we can extend the utility in the future for both of the points suggested:
1. allow absolute paths as the file path
2. allow getting the max LSN for all segments of a relation.
If you are also okay with this, we can proceed and let the committer share his opinion as well.
Thank you for reviewing the patch.
With Regards,
Amit Kapila.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Amit kapila <amit.kapila@huawei.com> writes:
On Friday, December 07, 2012 7:43 PM Muhammad Usama wrote:
I am wondering, though, why you are disallowing an absolute file path. Any particular reason?
The reason for disallowing absolute paths is that they would need to be tested on multiple platforms, and I wanted to keep the scope small.
This argument seems to me to be completely nuts. What's wrong with an
absolute path? Wouldn't you have to go out of your way to disallow it?
regards, tom lane
On Saturday, December 08, 2012 9:44 AM Tom Lane wrote:
Amit kapila <amit.kapila@huawei.com> writes:
On Friday, December 07, 2012 7:43 PM Muhammad Usama wrote:
I am wondering, though, why you are disallowing an absolute file path. Any particular reason?
The reason for disallowing absolute paths is that they would need to be tested on multiple platforms, and I wanted to keep the scope small.
This argument seems to me to be completely nuts. What's wrong with an
absolute path? Wouldn't you have to go out of your way to disallow it?
There's nothing wrong with an absolute path. I have updated the patch to work with absolute paths as well.
Updated patch attached with this mail.
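For reference, accepting both forms needs little more than the following. This is only a sketch of the idea behind make_absolute_path in the patch, not the patch code: the function name and buffer size are hypothetical, '/' is the only separator considered, and the result is not canonicalized.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define SKETCH_MAXPATH 4096     /* assumed buffer size for the CWD */

/*
 * Return a malloc'd absolute version of "in": a copy if it is already
 * absolute, otherwise the current working directory joined with it.
 * Returns NULL on allocation or getcwd failure.  Caller frees the result.
 */
static char *
absolute_path_sketch(const char *in)
{
    char        cwd[SKETCH_MAXPATH];
    char       *result;
    size_t      len;

    if (in[0] == '/')
    {
        /* Already absolute: hand back a plain copy. */
        len = strlen(in) + 1;
        result = malloc(len);
        if (result != NULL)
            memcpy(result, in, len);
        return result;
    }

    if (getcwd(cwd, sizeof(cwd)) == NULL)
        return NULL;

    /* Room for cwd, one separator, the input, and the terminator. */
    len = strlen(cwd) + strlen(in) + 2;
    result = malloc(len);
    if (result != NULL)
        snprintf(result, len, "%s/%s", cwd, in);
    return result;
}
```

Once every input goes through a function like this, the data directory check can work on absolute paths only and does not need to care which form the user typed.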
With Regards,
Amit Kapila.
Attachments:
pg_computemaxlsn_v4.patch (application/octet-stream)
*** a/contrib/Makefile
--- b/contrib/Makefile
***************
*** 31,36 **** SUBDIRS = \
--- 31,37 ----
passwordcheck \
pg_archivecleanup \
pg_buffercache \
+ pg_computemaxlsn \
pg_freespacemap \
pg_standby \
pg_stat_statements \
*** /dev/null
--- b/contrib/pg_computemaxlsn/Makefile
***************
*** 0 ****
--- 1,22 ----
+ # contrib/pg_computemaxlsn/Makefile
+
+ PGFILEDESC = "pg_computemaxlsn - a utility to find max LSN from data pages"
+ PGAPPICON = win32
+
+ PROGRAM = pg_computemaxlsn
+ OBJS = pg_computemaxlsn.o $(WIN32RES)
+
+ PG_CPPFLAGS = -I$(srcdir)
+ PG_LIBS = $(libpq_pgport)
+
+
+ ifdef USE_PGXS
+ PG_CONFIG = pg_config
+ PGXS := $(shell $(PG_CONFIG) --pgxs)
+ include $(PGXS)
+ else
+ subdir = contrib/pg_computemaxlsn
+ top_builddir = ../..
+ include $(top_builddir)/src/Makefile.global
+ include $(top_srcdir)/contrib/contrib-global.mk
+ endif
*** /dev/null
--- b/contrib/pg_computemaxlsn/pg_computemaxlsn.c
***************
*** 0 ****
--- 1,828 ----
+ /*-------------------------------------------------------------------------
+ *
+ * pg_computemaxlsn.c
+ * A utility to compute the maximum LSN in data pages
+ *
+ * Portions Copyright (c) 1996-2012, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * contrib/pg_computemaxlsn/pg_computemaxlsn.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+ /*
+ * We have to use postgres.h not postgres_fe.h here, because there's so much
+ * backend-only stuff for reading data files we need. But we need a
+ * frontend-ish environment otherwise. Hence this ugly hack.
+ */
+ #define FRONTEND 1
+
+ #include "postgres.h"
+
+ #include <dirent.h>
+ #include <fcntl.h>
+ #include <locale.h>
+ #include <sys/stat.h>
+ #include <sys/time.h>
+ #include <time.h>
+ #include <unistd.h>
+
+ #include "getopt_long.h"
+
+ #include "access/xlog_internal.h"
+ #include "catalog/catalog.h"
+ #include "storage/bufpage.h"
+ #include "storage/fd.h"
+
+ /* Page header size */
+ #define PAGEHDRSZ (sizeof(PageHeaderData))
+
+ #define validateTablespaceDir(name) ((strlen(name) > 3) && (name[0] == 'P') && (name[1] == 'G') && (name[2] == '_'))
+
+ extern int optind;
+ extern char *optarg;
+ static const char *progname;
+
+ static int FindMaxLSNinFile(char *filename, XLogRecPtr *maxlsn);
+ static int FindMaxLSNinDir(char *path, XLogRecPtr *maxlsn, bool is_fromlink);
+ static int FindMaxLSNinPgData(char *pgdatapath, XLogRecPtr *maxlsn);
+ static void usage(void);
+ static int getLinkPath(struct stat * statbuf, char *path, char *linkpath, int length);
+
+ /*
+ * Removes the parent references
+ * The input *must* have been put through canonicalize_path previously.
+ */
+ static void
+ remove_parent_refernces(char *inpath)
+ {
+ char *path;
+ char *epath;
+ char *edestpath;
+ int len;
+ int i;
+ int pending_strips = 0;
+
+ len = strlen(inpath);
+ path = malloc(len + 2);
+ if (!path)
+ {
+ return;
+ }
+ MemSet(path, 0, len + 2);
+
+ epath = inpath + len;
+ edestpath = path + len;
+
+ for (i = len; i >= 0; i--, epath--)
+ {
+ if (IS_DIR_SEP(*epath))
+ {
+ if ((i > 3) && (*(epath - 1) == '.')
+ && (*(epath - 2) == '.') && IS_DIR_SEP(*(epath - 3)))
+ {
+ /* Matching "/../" - trim the parent directory */
+ if (!pending_strips)
+ pending_strips = 1;
+ pending_strips++;
+ i -= 2;
+ epath -= 2;
+ }
+ else if ((i > 2) && (*(epath - 1) == '.') && IS_DIR_SEP(*(epath - 2)))
+ {
+ /* Matching "/./" - can be skipped */
+ i -= 1;
+ epath -= 1;
+ continue;
+ }
+ else if (pending_strips)
+ {
+ pending_strips--;
+ }
+ }
+
+ if (pending_strips == 0)
+ {
+ *edestpath = *epath;
+ edestpath--;
+ }
+ }
+
+ if (pending_strips)
+ {
+ /* "/../", "/../../" are considered as "/" */
+ if (is_absolute_path(inpath))
+ {
+ *(inpath++) = '/';
+ }
+ else
+ {
+ /* In case of a relative path, prefix with "../" */
+ while (pending_strips--)
+ {
+ strcpy(inpath, "../");
+ inpath += 3;
+ }
+ }
+ }
+
+ strcpy(inpath, edestpath + 1);
+ free(path);
+
+ return;
+ }
+
+ /*
+ * If the given pathname isn't already absolute, make it so, interpreting
+ * it relative to the current working directory.
+ */
+ static char *
+ make_absolute_path(const char *in)
+ {
+ char *result;
+
+ if (is_absolute_path(in))
+ result = strdup(in);
+ else
+ {
+ char cwdbuf[MAXPGPATH];
+
+ if (!getcwd(cwdbuf, sizeof(cwdbuf)))
+ {
+ fprintf(stderr, _("could not get current working directory: %s\n"), strerror(errno));
+ exit(2);
+ }
+
+ result = malloc(strlen(cwdbuf) + strlen(in) + 2);
+ sprintf(result, "%s/%s", cwdbuf, in);
+ }
+
+ canonicalize_path(result);
+ remove_parent_refernces(result);
+ return result;
+ }
+
+ /*
+ * This function validates the given cluster directory - we search for a
+ * small set of subdirectories that we expect to find in a valid data
+ * directory. If any of the subdirectories are missing (or secured against
+ * us) we display an error message and return false.
+ *
+ */
+ static bool
+ check_data_dir(const char *pg_data)
+ {
+ char subDirName[MAXPGPATH];
+ int dnum;
+
+ /* start check with top-most directory */
+ const char *requiredSubdirs[] = {"", "base", "global", "pg_tblspc",
+ "pg_multixact", "pg_subtrans", "pg_clog", "pg_twophase",
+ "pg_xlog"};
+
+ for (dnum = 0; dnum < lengthof(requiredSubdirs); ++dnum)
+ {
+ struct stat statBuf;
+
+ snprintf(subDirName, sizeof(subDirName), "%s%s%s", pg_data,
+ /* Win32 can't stat() a directory with a trailing slash. */
+ *requiredSubdirs[dnum] ? "/" : "",
+ requiredSubdirs[dnum]);
+
+ if (stat(subDirName, &statBuf) != 0)
+ {
+ fprintf(stderr, _("%s: check for \"%s\" failed: %s\n"),
+ progname, subDirName, strerror(errno));
+ return false;
+ }
+ else if (!S_ISDIR(statBuf.st_mode))
+ {
+ fprintf(stderr, _("%s: %s is not a directory.\n"), progname, subDirName);
+ return false;
+ }
+ }
+
+ return true;
+ }
+
+ /*
+ * This function validates that the given path is the cluster directory itself
+ * or lies in or below its "base", "global", or "pg_tblspc" subdirectory.
+ */
+ static bool
+ check_path_belongs_to_pgdata(const char *in_pg_data, char *in_path,
+ bool *is_inside_pg_data, bool *path_are_same)
+ {
+ char *usrpath;
+ char *pg_data;
+ int data_path_len;
+
+ pg_data = make_absolute_path(in_pg_data);
+ usrpath = make_absolute_path(in_path);
+
+ data_path_len = strlen(pg_data);
+
+ /* check whether usrpath is pg_data itself or lies below it */
+ if (strncmp(pg_data, usrpath, data_path_len) == 0)
+ {
+ if (usrpath[data_path_len] == '\0')
+ {
+ *path_are_same = true;
+ *is_inside_pg_data = true;
+ free(pg_data);
+ free(usrpath);
+ return true;
+ }
+ else if (!IS_DIR_SEP(usrpath[data_path_len]))
+ {
+ free(pg_data);
+ free(usrpath);
+ return false;
+ }
+
+ *is_inside_pg_data = true;
+
+ /* Input usrpath is the same as pg_data but ends with DIR_SEP */
+ if (usrpath[data_path_len + 1] == '\0')
+ {
+ *path_are_same = true;
+ free(pg_data);
+ free(usrpath);
+ return true;
+ }
+
+ /*
+ * Input usrpath is a subdirectory of pg_data, so validate the remaining
+ * part of usrpath: it should be inside one of "base", "global", or
+ * "pg_tblspc".
+ */
+ if ((path_is_prefix_of_path("base", usrpath + (data_path_len + 1)))
+ || (path_is_prefix_of_path("global", usrpath + (data_path_len + 1)))
+ || (path_is_prefix_of_path("pg_tblspc", usrpath + (data_path_len + 1))))
+ {
+ free(pg_data);
+ free(usrpath);
+ return true;
+ }
+ }
+ free(pg_data);
+ free(usrpath);
+
+ return false;
+ }
+
+ /*
+ * relfilenode name validation.
+ * Format with_ext == true [0-9]+(_vm|_fsm|_init)?(\.[0-9]+)?
+ * with_ext == false [0-9]+
+ */
+ static bool
+ validateRelfilenodename(char *name, bool with_ext)
+ {
+ int pos = 0;
+
+ while ((name[pos] >= '0') && (name[pos] <= '9'))
+ pos++;
+
+ if (with_ext)
+ {
+ if (name[pos] == '_')
+ {
+ pos++;
+ while ((name[pos] >= 'a') && (name[pos] <= 'z'))
+ pos++;
+ }
+ if (name[pos] == '.')
+ {
+ pos++;
+ while ((name[pos] >= '0') && (name[pos] <= '9'))
+ pos++;
+ }
+ }
+
+ if (name[pos] == 0)
+ return true;
+
+ return false;
+ }
+
+ int
+ main(int argc, char *argv[])
+ {
+ static struct option long_options[] = {
+ {"path", required_argument, NULL, 'p'},
+ {"data-directory", no_argument, NULL, 'P'},
+ {NULL, 0, NULL, 0}
+ };
+
+ int optindex;
+ int c;
+ char *DataDir;
+ int fd;
+ char path[MAXPGPATH];
+ bool print_max_lsn = false;
+ bool print_pgdata_max_lsn = false;
+ char *LsnSearchPath = NULL;
+ XLogRecPtr maxLSN = 0;
+ XLogSegNo logSegNo = 0;
+ int result = 0;
+ bool is_whole_data_dir = false;
+
+ set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_computemaxlsn"));
+
+ progname = get_progname(argv[0]);
+
+ if (argc > 1)
+ {
+ if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
+ {
+ usage();
+ exit(0);
+ }
+ if (strcmp(argv[1], "--version") == 0 || strcmp(argv[1], "-V") == 0)
+ {
+ puts("pg_computemaxlsn (PostgreSQL) " PG_VERSION);
+ exit(0);
+ }
+ }
+
+ while ((c = getopt_long(argc, argv, "p:P", long_options, &optindex)) != -1)
+ {
+ switch (c)
+ {
+ case 'p':
+ if (print_max_lsn)
+ {
+ fprintf(stderr, _("%s: multiple -p options are not supported.\n"), progname);
+ fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
+ exit(1);
+ }
+ print_max_lsn = true;
+ LsnSearchPath = strdup(optarg);
+ break;
+
+ case 'P':
+ if (print_pgdata_max_lsn)
+ {
+ fprintf(stderr, _("%s: multiple -P options are not supported.\n"), progname);
+ fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
+ exit(1);
+ }
+ print_pgdata_max_lsn = true;
+ break;
+
+ default:
+ fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
+ exit(1);
+ }
+ }
+
+ if (print_max_lsn && print_pgdata_max_lsn)
+ {
+ fprintf(stderr, _("%s: options -P and -p cannot be combined.\n"), progname);
+ exit(1);
+ }
+
+ if (optind == argc)
+ {
+ fprintf(stderr, _("%s: no data directory specified.\n"), progname);
+ fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
+ exit(1);
+ }
+
+ if ((optind + 1) != argc)
+ {
+ fprintf(stderr, _("%s: multiple data directories not supported.\n"), progname);
+ fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
+ exit(1);
+ }
+
+ /*
+ * Don't allow pg_computemaxlsn to be run as root, to avoid overwriting
+ * the ownership of files in the data directory. We need only check for
+ * root -- any other user won't have sufficient permissions to modify
+ * files in the data directory.
+ */
+ #ifndef WIN32
+ if (geteuid() == 0)
+ {
+ fprintf(stderr, _("%s: cannot be executed by \"root\".\n"),
+ progname);
+ fprintf(stderr, _("You must run %s as the PostgreSQL superuser.\n"),
+ progname);
+ exit(1);
+ }
+ #endif
+
+ DataDir = strdup(argv[optind]);
+
+ canonicalize_path(DataDir);
+
+
+ if (!check_data_dir(DataDir))
+ {
+ fprintf(stderr, _("%s: \"%s\" is not a valid data directory.\n"),
+ progname, DataDir);
+ exit(1);
+ }
+
+ /*
+ * Check for a postmaster lock file --- if there is one, refuse to
+ * proceed, on grounds we might be interfering with a live installation.
+ */
+ snprintf(path, MAXPGPATH, "%s/postmaster.pid", DataDir);
+
+ if ((fd = open(path, O_RDONLY, 0)) < 0)
+ {
+ if (errno != ENOENT)
+ {
+ fprintf(stderr, _("%s: could not open file \"%s\" for reading: %s\n"),
+ progname, path, strerror(errno));
+ exit(1);
+ }
+ }
+ else
+ {
+ fprintf(stderr, _("%s: lock file \"%s\" exists\n"
+ "Is a server running? If not, delete the lock file and try again.\n"),
+ progname, path);
+ exit(1);
+ }
+
+ if (print_max_lsn)
+ {
+ struct stat fst;
+ bool is_inside_pg_data = false;
+
+ canonicalize_path(LsnSearchPath);
+
+ if (!check_path_belongs_to_pgdata(DataDir, LsnSearchPath, &is_inside_pg_data, &is_whole_data_dir))
+ {
+ if (is_inside_pg_data)
+ {
+ fprintf(stderr, _("%s: Path \"%s\" should be in or below any of \"base\",\"global\" or \"pg_tblspc\" of data directory.\n"),
+ progname, LsnSearchPath);
+ }
+ else
+ {
+ fprintf(stderr, _("%s: Path \"%s\" should be in or below data directory.\n"),
+ progname, LsnSearchPath);
+ }
+ exit(1);
+ }
+
+ if (is_whole_data_dir)
+ {
+ /* Process for whole data directory. */
+ }
+ else if (lstat(LsnSearchPath, &fst) < 0)
+ {
+ if (errno == ENOENT)
+ {
+ fprintf(stderr, _("%s: file or directory \"%s\" does not exist.\n"),
+ progname, LsnSearchPath);
+ }
+ else
+ {
+ fprintf(stderr, _("%s: could not stat file or directory \"%s\": %s\n"),
+ progname, LsnSearchPath, strerror(errno));
+ }
+ exit(1);
+ }
+ else if (getLinkPath(&fst, LsnSearchPath, path, sizeof(path)) > 0)
+ {
+ result = FindMaxLSNinDir(path, &maxLSN, true);
+ }
+ else if (S_ISDIR(fst.st_mode))
+ {
+ result = FindMaxLSNinDir(LsnSearchPath, &maxLSN, false);
+ }
+ else if (S_ISREG(fst.st_mode))
+ {
+ result = FindMaxLSNinFile(LsnSearchPath, &maxLSN);
+ }
+ else
+ {
+ fprintf(stderr, _("%s: skipping special file \"%s\"\n"), progname, LsnSearchPath);
+ }
+ }
+
+ /* By default we need to compute max lsn for database */
+ if ((print_max_lsn == false) || is_whole_data_dir)
+ {
+ result = FindMaxLSNinPgData(DataDir, &maxLSN);
+ }
+
+ if (0 != result)
+ {
+ /* Message already provided, simply exit */
+ exit(1);
+ }
+
+ XLByteToSeg(maxLSN, logSegNo);
+
+ printf("Maximum LSN found is: %X/%X \nWAL segment file name (fileid,seg): %X/%X\n",
+ (uint32)(maxLSN >> 32), (uint32) maxLSN,
+ (uint32) (logSegNo >> 32), (uint32) (logSegNo));
+
+ return 0;
+ }
+
+
+ /*
+ * PageHeaderIsValid: Check whether the page header is valid
+ */
+ bool
+ PageHeaderIsValid(PageHeader page)
+ {
+ char *pagebytes;
+ int i;
+
+ /* Check normal case */
+ if (PageGetPageSize(page) == BLCKSZ &&
+ PageGetPageLayoutVersion(page) == PG_PAGE_LAYOUT_VERSION &&
+ (page->pd_flags & ~PD_VALID_FLAG_BITS) == 0 &&
+ page->pd_lower >= SizeOfPageHeaderData &&
+ page->pd_lower <= page->pd_upper &&
+ page->pd_upper <= page->pd_special &&
+ page->pd_special <= BLCKSZ &&
+ page->pd_special == MAXALIGN(page->pd_special))
+ return true;
+
+ /*
+ * Also accept a page whose header is all zeroes. The result is only used to
+ * log invalid pages; the caller continues with the next page in any case.
+ */
+ pagebytes = (char *) page;
+ for (i = 0; i < PAGEHDRSZ; i++)
+ {
+ if (pagebytes[i] != 0)
+ return false;
+ }
+ return true;
+ }
+
+
+ /*
+ * Find the maximum LSN in one data file (relfilenode file).
+ *
+ */
+ static int
+ FindMaxLSNinFile(char *filename, XLogRecPtr *maxlsn)
+ {
+ XLogRecPtr pagelsn;
+ off_t len,
+ seekpos;
+ uint32 nblocks,
+ blocknum;
+ char buffer[PAGEHDRSZ];
+ int nbytes;
+ int fd;
+
+ if ((fd = open(filename, O_RDONLY | PG_BINARY, 0)) < 0)
+ {
+ /*
+ * If the file does not exist or we can't read it, report an error
+ */
+ fprintf(stderr, _("%s: could not open file \"%s\" for reading: %s\n"),
+ progname, filename, strerror(errno));
+ return -1;
+ }
+
+ /* Calculate the number of pages in file */
+ len = lseek(fd, 0L, SEEK_END);
+ if (len < 0)
+ {
+ close(fd);
+ fprintf(stderr, _("%s: could not seek in file \"%s\": %s\n"),
+ progname, filename, strerror(errno));
+ return -1;
+ }
+
+ nblocks = (len / BLCKSZ);
+ if (nblocks > RELSEG_SIZE)
+ {
+ /*
+ * A single relfilenode segment file can't hold more than RELSEG_SIZE blocks
+ */
+ close(fd);
+ fprintf(stderr, _("%s: file \"%s\" length is more than segment size: %d.\n"),
+ progname, filename, RELSEG_SIZE);
+ return -1;
+ }
+
+ /*
+ * Read only the page header and validate it; if we find an invalid page, log
+ * the details of the page and continue to the next page.
+ */
+ seekpos = 0;
+ for (blocknum = 0; blocknum < nblocks; blocknum++)
+ {
+ len = lseek(fd, seekpos, SEEK_SET);
+ if (len != seekpos)
+ {
+ close(fd);
+ fprintf(stderr, _("%s: could not seek to next page \"%s\": %s\n"),
+ progname, filename, strerror(errno));
+ return -1;
+ }
+
+ nbytes = read(fd, buffer, PAGEHDRSZ);
+ if (nbytes < 0)
+ {
+ close(fd);
+ fprintf(stderr, _("%s: could not read file \"%s\": %s\n"),
+ progname, filename, strerror(errno));
+ return -1;
+ }
+
+ if (PageHeaderIsValid((PageHeader) buffer))
+ {
+ pagelsn = PageGetLSN(buffer);
+ if (XLByteLT(*maxlsn, pagelsn))
+ {
+ *maxlsn = pagelsn;
+ }
+ }
+ else
+ {
+ /*
+ * If page is invalid log the error and continue
+ */
+ fprintf(stderr, _("%s: invalid page found in file \"%s\", block number: %u\n"),
+ progname, filename, blocknum);
+ }
+ seekpos += (off_t) BLCKSZ;
+ }
+
+ close(fd);
+ return 0;
+ }
+
+ /*
+ * Find the maximum LSN in the given directory, including subdirectories
+ * and links.
+ */
+ static int
+ FindMaxLSNinDir(char *path, XLogRecPtr *maxlsn, bool is_fromlink)
+ {
+ DIR *dir;
+ struct dirent *de;
+ char pathbuf[MAXPGPATH];
+ struct stat statbuf;
+ char linkpath[MAXPGPATH];
+ int result;
+
+ dir = opendir(path);
+ if (NULL == dir)
+ {
+ fprintf(stderr, _("%s: could not open directory \"%s\": %s\n"),
+ progname, path, strerror(errno));
+ return -1;
+ }
+
+ while ((de = readdir(dir)) != NULL)
+ {
+ /* Skip special stuff */
+ if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
+ continue;
+
+ /* Skip temporary files */
+ if (strncmp(de->d_name,
+ PG_TEMP_FILE_PREFIX,
+ strlen(PG_TEMP_FILE_PREFIX)) == 0)
+ continue;
+
+ /*
+ * Skip all the local/global temporary files, and read all the
+ * remaining relfilenode files
+ */
+ if (is_fromlink)
+ {
+ /* If the directory was reached through a link, allow only PG_* entries */
+ if (!validateTablespaceDir(de->d_name))
+ continue;
+ }
+ else if (!validateRelfilenodename(de->d_name, true))
+ continue;
+
+ snprintf(pathbuf, MAXPGPATH, "%s/%s", path, de->d_name);
+
+ if (lstat(pathbuf, &statbuf) != 0)
+ {
+ if (errno != ENOENT)
+ {
+ fprintf(stderr, _("%s: could not stat file or directory \"%s\": %s\n"),
+ progname, pathbuf, strerror(errno));
+ }
+ /* If the file went away while scanning, it's no error. */
+ continue;
+ }
+
+ result = getLinkPath(&statbuf, pathbuf, linkpath, sizeof(linkpath));
+ if (result < 0)
+ {
+ continue;
+ }
+ else if (result > 0)
+ {
+ (void) FindMaxLSNinDir(linkpath, maxlsn, true);
+ }
+ else if (S_ISDIR(statbuf.st_mode))
+ (void) FindMaxLSNinDir(pathbuf, maxlsn, false);
+ else
+ (void) FindMaxLSNinFile(pathbuf, maxlsn);
+ }
+
+ closedir(dir);
+ return 0;
+ }
+
+ /*
+ * Get the link path.
+ * On success, returns the length of the link target.
+ * Returns zero if the file is not a link.
+ * On failure, returns -1.
+ */
+ static int
+ getLinkPath(struct stat * statbuf, char *path, char *linkpath, int length)
+ {
+ int rllen;
+
+ if (
+ #ifndef WIN32
+ S_ISLNK(statbuf->st_mode)
+ #else
+ pgwin32_is_junction(path)
+ #endif
+ )
+ {
+ #if defined(HAVE_READLINK) || defined(WIN32)
+
+ rllen = readlink(path, linkpath, length);
+ if (rllen < 0)
+ {
+ fprintf(stderr, _("%s: could not read symbolic link \"%s\", so skipping file.\n"),
+ progname, path);
+ return -1;
+ }
+
+ if (rllen >= length)
+ {
+ fprintf(stderr, _("%s: symbolic link \"%s\" target is too long, so skipping file.\n"),
+ progname, path);
+
+ return -1;
+ }
+
+ linkpath[rllen] = '\0';
+
+ return rllen;
+ #else
+ /* tablespaces are not supported on this platform */
+ return -1;
+ #endif /* HAVE_READLINK */
+ }
+
+ return 0;
+ }
+
+
+ /*
+ * Find the maximum LSN in the data directory.
+ */
+ static int
+ FindMaxLSNinPgData(char *pgdatapath, XLogRecPtr *maxlsn)
+ {
+ char pathbuf[MAXPGPATH];
+
+ /* scan all the relfilenodes in data directory */
+ snprintf(pathbuf, MAXPGPATH, "%s/global", pgdatapath);
+ if (0 != FindMaxLSNinDir(pathbuf, maxlsn, false))
+ return -1;
+
+ snprintf(pathbuf, MAXPGPATH, "%s/base", pgdatapath);
+ if (0 != FindMaxLSNinDir(pathbuf, maxlsn, false))
+ return -1;
+
+ snprintf(pathbuf, MAXPGPATH, "%s/pg_tblspc", pgdatapath);
+ if (0 != FindMaxLSNinDir(pathbuf, maxlsn, false))
+ return -1;
+
+ return 0;
+ }
+
+ static void
+ usage(void)
+ {
+ printf(_("%s computes the maximum LSN in PostgreSQL data pages.\n\n"), progname);
+ printf(_("Usage:\n %s [OPTION]... DATADIR\n\n"), progname);
+ printf(_("Options:\n"));
+ printf(_(" -p, --path=FILE_FOLDER_PATH print max LSN from file or directory name\n"));
+ printf(_(" -P, --data-directory print max LSN from whole data directory\n"));
+ printf(_(" -V, --version output version information, then exit\n"));
+ printf(_(" -?, --help show this help, then exit\n"));
+ printf(_("\nReport bugs to <pgsql-bugs@postgresql.org>.\n"));
+ }
*** a/doc/src/sgml/ref/allfiles.sgml
--- b/doc/src/sgml/ref/allfiles.sgml
***************
*** 177,182 **** Complete list of usable sgml source files in this directory.
--- 177,183 ----
<!ENTITY pgDumpall SYSTEM "pg_dumpall.sgml">
<!ENTITY pgReceivexlog SYSTEM "pg_receivexlog.sgml">
<!ENTITY pgResetxlog SYSTEM "pg_resetxlog.sgml">
+ <!ENTITY pgComputemaxlsn SYSTEM "pg_computemaxlsn.sgml">
<!ENTITY pgRestore SYSTEM "pg_restore.sgml">
<!ENTITY postgres SYSTEM "postgres-ref.sgml">
<!ENTITY postmaster SYSTEM "postmaster.sgml">
*** /dev/null
--- b/doc/src/sgml/ref/pg_computemaxlsn.sgml
***************
*** 0 ****
--- 1,79 ----
+ <!--
+ doc/src/sgml/ref/pg_computemaxlsn.sgml
+ PostgreSQL documentation
+ -->
+
+ <refentry id="APP-PGCOMPUTEMAXLSN">
+ <refmeta>
+ <refentrytitle><application>pg_computemaxlsn</application></refentrytitle>
+ <manvolnum>1</manvolnum>
+ <refmiscinfo>Application</refmiscinfo>
+ </refmeta>
+
+ <refnamediv>
+ <refname>pg_computemaxlsn</refname>
+ <refpurpose>computes the maximum LSN in the data pages of a <productname>PostgreSQL</productname> database cluster</refpurpose>
+ </refnamediv>
+
+ <indexterm zone="app-pgcomputemaxlsn">
+ <primary>pg_computemaxlsn</primary>
+ </indexterm>
+
+ <refsynopsisdiv>
+ <cmdsynopsis>
+ <command>pg_computemaxlsn</command>
+ <arg choice="opt"><option>-P</option></arg>
+ <arg choice="opt"><option>-p</option> <replaceable class="parameter">file-name</replaceable> | <replaceable class="parameter">folder-name</replaceable></arg>
+ <arg choice="plain"><replaceable>datadir</replaceable></arg>
+ </cmdsynopsis>
+ </refsynopsisdiv>
+
+ <refsect1 id="R1-APP-PGCOMPUTEMAXLSN-1">
+ <title>Description</title>
+ <para>
+ <command>pg_computemaxlsn</command> computes the maximum LSN from database pages.
+ </para>
+
+ <para>
+ This utility can only be run by the user who installed the server, because
+ it requires read access to the data directory.
+ For safety reasons, you must specify the data directory on the command line.
+ <command>pg_computemaxlsn</command> does not use the environment variable
+ <envar>PGDATA</>.
+ </para>
+
+ <para>
+ The <option>-P</> or <option>--data-directory</> option computes the maximum LSN from all the pages in the data directory.
+ This is the default if no option is provided.
+ </para>
+
+ <para>
+ The <option>-p <replaceable class="parameter">file-name | folder-name</replaceable></option> or
+ <option>--path=<replaceable class="parameter">file-name | folder-name</replaceable></option> option computes the
+ maximum LSN from a specific file or folder. The file or folder path must be in or below the data directory.
+ </para>
+
+ <para>
+ The <option>-V</> and <option>--version</> options print
+ the <application>pg_computemaxlsn</application> version and exit. The
+ options <option>-?</> and <option>--help</> show supported arguments,
+ and exit.
+ </para>
+
+ </refsect1>
+
+ <refsect1>
+ <title>Notes</title>
+
+ <para>
+ This command must not be used when the server is
+ running. <command>pg_computemaxlsn</command> will refuse to start up if
+ it finds a server lock file in the data directory. If the
+ server crashed then a lock file might have been left
+ behind; in that case you can remove the lock file to allow
+ <command>pg_computemaxlsn</command> to run. But before you do
+ so, make doubly certain that there is no server process still alive.
+ </para>
+ </refsect1>
+
+ </refentry>
*** a/doc/src/sgml/ref/pg_resetxlog.sgml
--- b/doc/src/sgml/ref/pg_resetxlog.sgml
***************
*** 135,140 **** PostgreSQL documentation
--- 135,150 ----
largest entry in <filename>pg_xlog</>, use <literal>-l 00000001000000320000004B</> or higher.
</para>
+ <para>
+ If <command>pg_resetxlog</command> complains that it cannot determine
+ valid data for <filename>pg_control</>, and the WAL segment files in the
+ directory <filename>pg_xlog</> under the data directory are missing or corrupted,
+ you can use the utility <command>pg_computemaxlsn</command> to identify the WAL segment file that contains the maximum LSN found in the data files:
+ use the <option>-P</> option to find the maximum LSN in the whole data directory, or
+ <option>-p <filename>file-name | folder-name</></> to find it in a specific file or folder.
+ Once that WAL segment file is known, use the <option>-l</> option to set the value.
+ </para>
+
<note>
<para>
<command>pg_resetxlog</command> itself looks at the files in
*** a/doc/src/sgml/reference.sgml
--- b/doc/src/sgml/reference.sgml
***************
*** 248,253 ****
--- 248,254 ----
&pgControldata;
&pgCtl;
&pgResetxlog;
+ &pgComputemaxlsn;
&postgres;
&postmaster;
*** a/src/tools/msvc/Mkvcbuild.pm
--- b/src/tools/msvc/Mkvcbuild.pm
***************
*** 34,40 **** my @contrib_uselibpgport = (
'oid2name', 'pgbench',
'pg_standby', 'pg_archivecleanup',
'pg_test_fsync', 'pg_test_timing',
! 'pg_upgrade', 'vacuumlo');
my $contrib_extralibs = { 'pgbench' => ['wsock32.lib'] };
my $contrib_extraincludes =
{ 'tsearch2' => ['contrib/tsearch2'], 'dblink' => ['src/backend'] };
--- 34,40 ----
'oid2name', 'pgbench',
'pg_standby', 'pg_archivecleanup',
'pg_test_fsync', 'pg_test_timing',
! 'pg_upgrade', 'vacuumlo', 'pg_computemaxlsn');
my $contrib_extralibs = { 'pgbench' => ['wsock32.lib'] };
my $contrib_extraincludes =
{ 'tsearch2' => ['contrib/tsearch2'], 'dblink' => ['src/backend'] };