initdb / bootstrap design

Started by Andres Freund almost 4 years ago, 16 messages

andres@anarazel.de

almost 4 years ago

Hi,

[1]: /messages/by-id/20220216012953.6d7bzmsblqou3ru4@alap3.anarazel.de

To me the division of labor between initdb and bootstrap doesn't make much
sense anymore:

initdb reads postgres.bki, replaces a few tokens, starts postgres in bootstrap
mode, and then painstakingly feeds the postgres.bki lines to the server.
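
As a rough illustration of the token-replacement step described here, a minimal sketch in C follows. The name replace_first_token is made up for this note and is not initdb's actual replace_token(); like that function it substitutes only the first occurrence on a line, but it handles a single line rather than an array of lines.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not initdb's replace_token(): return a malloc'd copy
 * of "line" with the first occurrence of "token" replaced by "replacement".
 * Returns NULL if memory allocation fails.  The caller frees the result.
 */
char *
replace_first_token(const char *line, const char *token,
					const char *replacement)
{
	const char *where;
	size_t		linelen;
	size_t		toklen;
	size_t		replen;
	size_t		prefix;
	char	   *result;

	assert(line != NULL && token != NULL && replacement != NULL);

	where = strstr(line, token);
	linelen = strlen(line);
	toklen = strlen(token);
	replen = strlen(replacement);

	/* No match (or an empty token): hand back a plain copy of the line */
	if (where == NULL || toklen == 0)
	{
		result = malloc(linelen + 1);
		if (result != NULL)
			memcpy(result, line, linelen + 1);
		return result;
	}

	prefix = (size_t) (where - line);
	result = malloc(linelen - toklen + replen + 1);
	if (result == NULL)
		return NULL;

	/* text before the token, then the replacement, then the tail plus NUL */
	memcpy(result, line, prefix);
	memcpy(result + prefix, replacement, replen);
	memcpy(result + prefix + replen, where + toklen,
		   linelen - prefix - toklen + 1);
	return result;
}
```

For example, replace_first_token("( NAMEDATALEN )", "NAMEDATALEN", "64") yields "( 64 )". Doing this for every token over every line of the file is the preprocessing work the message is questioning.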

Given that bootstrap mode parsing is a dedicated parser, only invoked from a
single point, what's the point of initdb doing the preprocessing and then
incurring pipe overhead?

Sure, there are a few tokens that we replace in initdb. As it turns out there are
only two rows that are actually variable: the username of the initial
superuser in pg_authid and the pg_database row for template1, where encoding,
lc_collate and lc_ctype vary. The rest are all compile-time constant
replacements we could do as part of genbki.pl.

It seems we could save a good number of context switches by opening
postgres.bki just before boot_yyparse() in BootstrapModeMain() and having the
parser read it. The pg_authid / pg_database rows we could just do via
explicit insertions in BootstrapModeMain(), provided by commandline args?

Similarly, since the introduction of extensions at the latest, the server
knows how to execute SQL from a file. Why don't we just process
information_schema.sql, system_views.sql et al that way?

If we don't need a dedicated "input" mode feeding boot_yyparse() in bootstrap
mode anymore (because bootstrap mode feeds it from postgres.bki directly), we
likely could avoid the restart between bootstrap and single user mode. Afaics
that only really is needed because we need to send SQL after
bootstrap_template1(). That'd likely be a nice speedup, because we don't need
to write the bootstrap contents from shared buffers to the OS just to read
them back in single user mode.

I don't plan to work on this immediately, but I thought it's worth bringing up
anyway.

Greetings,

Andres Freund

John Naylor

john.naylor@enterprisedb.com

almost 4 years ago

In reply to: Andres Freund (#1)

Re: initdb / bootstrap design

On Wed, Feb 16, 2022 at 9:12 AM Andres Freund <andres@anarazel.de> wrote:

To me the division of labor between initdb and bootstrap doesn't make much
sense anymore:

[...]

I don't plan to work on this immediately, but I thought it's worth bringing up
anyway.

Added TODO item "Rationalize division of labor between initdb and bootstrap"

--
John Naylor
EDB: http://www.enterprisedb.com

Peter Eisentraut

peter.eisentraut@enterprisedb.com

almost 4 years ago

In reply to: Andres Freund (#1)

1 attachment(s)

Re: initdb / bootstrap design

On 16.02.22 03:12, Andres Freund wrote:

Sure, there are a few tokens that we replace in initdb. As it turns out there are
only two rows that are actually variable: the username of the initial
superuser in pg_authid and the pg_database row for template1, where encoding,
lc_collate and lc_ctype vary. The rest are all compile-time constant
replacements we could do as part of genbki.pl.

It seems we could save a good number of context switches by opening
postgres.bki just before boot_yyparse() in BootstrapModeMain() and having the
parser read it. The pg_authid / pg_database rows we could just do via
explicit insertions in BootstrapModeMain(), provided by commandline args?

I think we could do the locale setup by updating the pg_database row of
template1 after bootstrap, as in the attached patch. (The order of
proceedings in the surrounding function might need some refinement in a
final patch.) I suspect we could do the treatment of pg_authid similarly.

Attachments:

0001-Simplify-locale-setup-of-template1-in-initdb.patch (text/plain; charset=UTF-8)

From 10143067fb35191aaa53ce2e5c4a20c4601b7528 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Wed, 16 Feb 2022 11:43:10 +0100
Subject: [PATCH] Simplify locale setup of template1 in initdb

---
 src/bin/initdb/initdb.c             | 23 +++++------------------
 src/include/catalog/pg_database.dat |  6 +++---
 2 files changed, 8 insertions(+), 21 deletions(-)

diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 97f15971e2..42e42ca4a4 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -621,15 +621,6 @@ get_id(void)
 	return pg_strdup(username);
 }
 
-static char *
-encodingid_to_string(int enc)
-{
-	char		result[20];
-
-	sprintf(result, "%d", enc);
-	return pg_strdup(result);
-}
-
 /*
  * get the encoding id for a given encoding name
  */
@@ -1396,15 +1387,6 @@ bootstrap_template1(void)
 	bki_lines = replace_token(bki_lines, "POSTGRES",
 							  escape_quotes_bki(username));
 
-	bki_lines = replace_token(bki_lines, "ENCODING",
-							  encodingid_to_string(encodingid));
-
-	bki_lines = replace_token(bki_lines, "LC_COLLATE",
-							  escape_quotes_bki(lc_collate));
-
-	bki_lines = replace_token(bki_lines, "LC_CTYPE",
-							  escape_quotes_bki(lc_ctype));
-
 	/* Also ensure backend isn't confused by this environment var: */
 	unsetenv("PGCLIENTENCODING");
 
@@ -1886,6 +1868,11 @@ make_template0(FILE *cmdfd)
 		NULL
 	};
 
+	PG_CMD_PRINTF("UPDATE pg_database "
+				  "  SET encoding = %d, datcollate = '%s', datctype = '%s' "
+				  "  WHERE datname = 'template1';\n\n",
+				  encodingid, escape_quotes(lc_collate), escape_quotes(lc_ctype));
+
 	for (line = template0_setup; *line; line++)
 		PG_CMD_PUTS(*line);
 }
diff --git a/src/include/catalog/pg_database.dat b/src/include/catalog/pg_database.dat
index e7e42d6023..6bca1ec54b 100644
--- a/src/include/catalog/pg_database.dat
+++ b/src/include/catalog/pg_database.dat
@@ -14,9 +14,9 @@
 
 { oid => '1', oid_symbol => 'TemplateDbOid',
   descr => 'default template for new databases',
-  datname => 'template1', encoding => 'ENCODING', datistemplate => 't',
+  datname => 'template1', encoding => '0', datistemplate => 't',
   datallowconn => 't', datconnlimit => '-1', datfrozenxid => '0',
-  datminmxid => '1', dattablespace => 'pg_default', datcollate => 'LC_COLLATE',
-  datctype => 'LC_CTYPE', datacl => '_null_' },
+  datminmxid => '1', dattablespace => 'pg_default', datcollate => 'C',
+  datctype => 'C', datacl => '_null_' },
 
 ]
-- 
2.35.1

Andres Freund

andres@anarazel.de

almost 4 years ago

In reply to: Peter Eisentraut (#3)

Re: initdb / bootstrap design

Hi,

On 2022-02-16 11:47:31 +0100, Peter Eisentraut wrote:

On 16.02.22 03:12, Andres Freund wrote:

Sure, there are a few tokens that we replace in initdb. As it turns out there are
only two rows that are actually variable: the username of the initial
superuser in pg_authid and the pg_database row for template1, where encoding,
lc_collate and lc_ctype vary. The rest are all compile-time constant
replacements we could do as part of genbki.pl.

It seems we could save a good number of context switches by opening
postgres.bki just before boot_yyparse() in BootstrapModeMain() and having the
parser read it. The pg_authid / pg_database rows we could just do via
explicit insertions in BootstrapModeMain(), provided by commandline args?

I think we could do the locale setup by updating the pg_database row of
template1 after bootstrap, as in the attached patch.

Another solution could be to have bootstrap create template0 instead of
template1. I think for template0 it'd be more accurate to have a hardcoded C
collation and ASCII encoding (which I don't think we actually have?).

I suspect we could do the treatment of pg_authid similarly.

Yea.

Greetings,

Andres Freund

Tom Lane

tgl@sss.pgh.pa.us

almost 4 years ago

In reply to: Andres Freund (#1)

Re: initdb / bootstrap design

Andres Freund <andres@anarazel.de> writes:

Sure, there are a few tokens that we replace in initdb. As it turns out there are
only two rows that are actually variable: the username of the initial
superuser in pg_authid and the pg_database row for template1, where encoding,
lc_collate and lc_ctype vary. The rest are all compile-time constant
replacements we could do as part of genbki.pl.

I remembered the reason why it's done that way: if we replaced those
values during genbki.pl, the contents of postgres.bki would become
architecture-dependent, belying its distribution as a "share" file.
While we don't absolutely have to continue treating postgres.bki
as architecture-independent, I'm skeptical that there's enough win
here to justify a packaging change.

initdb is already plenty fast enough for any plausible production
usage; it's cases like check-world where we wish it were faster.
So I'm thinking what we really ought to pursue is the idea that's
been kicked around more than once of capturing the post-initdb
state of a cluster's files and just doing "cp -a" to duplicate that
later in the test run.

regards, tom lane

Andres Freund

andres@anarazel.de

almost 4 years ago

In reply to: Tom Lane (#5)

Re: initdb / bootstrap design

Hi,

On 2022-02-16 13:24:41 -0500, Tom Lane wrote:

I remembered the reason why it's done that way: if we replaced those
values during genbki.pl, the contents of postgres.bki would become
architecture-dependent, belying its distribution as a "share" file.
While we don't absolutely have to continue treating postgres.bki
as architecture-independent, I'm skeptical that there's enough win
here to justify a packaging change.

Hm. Architecturally I still would like to move it to be processed server
side. I'd like to eventually get rid of single user mode (but keep bootstrap,
at least for longer).

Seems we could make NAMEDATALEN, SIZEOF_POINTER, ALIGNOF_POINTER,
FLOAT8PASSBYVAL stuff that bootparse knows about? And remove the need for
POSTGRES, ENCODING, LC_COLLATE, LC_CTYPE as discussed already?

initdb is already plenty fast enough for any plausible production
usage; it's cases like check-world where we wish it were faster.

It's not just our own usage though. I've seen it take a noticeable amount of time in test
suites of applications using postgres. And that's not really addressable with
the template approach, unless we want to move use of the template database
into initdb itself. I've thought about it, but then we'd need to do a lot more
than if it's just for our own tests.

So I'm thinking what we really ought to pursue is the idea that's
been kicked around more than once of capturing the post-initdb
state of a cluster's files and just doing "cp -a" to duplicate that
later in the test run.

Yea, we should pursue that independently of improving initdb's architecture /
speed. initdb will never be as fast as copying files around.

I kind of got stuck on how to deal with install.pl / vcregress.pl. For make
it's easy enough to create the template during temp-install. But for
the msvc stuff it's less clear when / where to create the template
database. Nearly everyone uses NO_TEMP_INSTALL on windows, because install is
so slow and happens in every test. But right now there's no command to create
the "temp" installation. Probably need something like a 'temp-install' command
for vcregress.pl and then convert the buildfarm to use that.

Greetings,

Andres Freund

Tom Lane

tgl@sss.pgh.pa.us

almost 4 years ago

In reply to: Andres Freund (#6)

Re: initdb / bootstrap design

Andres Freund <andres@anarazel.de> writes:

On 2022-02-16 13:24:41 -0500, Tom Lane wrote:

I remembered the reason why it's done that way: if we replaced those
values during genbki.pl, the contents of postgres.bki would become
architecture-dependent, belying its distribution as a "share" file.

Hm. Architecturally I still would like to move it to be processed server
side. I'd like to eventually get rid of single user mode (but keep bootstrap,
at least for longer).
Seems we could make NAMEDATALEN, SIZEOF_POINTER, ALIGNOF_POINTER,
FLOAT8PASSBYVAL stuff that bootparse knows about? And remove the need for
POSTGRES, ENCODING, LC_COLLATE, LC_CTYPE as discussed already?

Yeah, I have no objection to doing it that way. It should be possible
to do those substitutions on a per-field basis, which'd be cleaner than
what initdb does now ...
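
To make the per-field idea concrete, here is a minimal sketch in C. The function name substitute_field and the DEMO_ values are assumptions made for this note; a real backend would take the values from its own build configuration. The point is that the comparison is against the whole field, so a field that merely contains one of the symbols as a substring is left alone, which a line-wide textual replace cannot guarantee.

```c
#include <assert.h>
#include <string.h>

/* Stand-ins for build-time configuration; these values are assumptions. */
#define DEMO_NAMEDATALEN		"64"
#define DEMO_SIZEOF_POINTER		"8"
#define DEMO_ALIGNOF_POINTER	"d"
#define DEMO_FLOAT8PASSBYVAL	"true"

/*
 * Hypothetical per-field substitution: if the entire field value is one of
 * the known symbols, return its replacement; otherwise return the value
 * unchanged.  Nothing is allocated, so the caller must not free the result.
 */
const char *
substitute_field(const char *value)
{
	assert(value != NULL);

	if (strcmp(value, "NAMEDATALEN") == 0)
		return DEMO_NAMEDATALEN;
	if (strcmp(value, "SIZEOF_POINTER") == 0)
		return DEMO_SIZEOF_POINTER;
	if (strcmp(value, "ALIGNOF_POINTER") == 0)
		return DEMO_ALIGNOF_POINTER;
	if (strcmp(value, "FLOAT8PASSBYVAL") == 0)
		return DEMO_FLOAT8PASSBYVAL;

	/* Not a symbol we know about: pass it through untouched */
	return value;
}
```

With this shape, a field such as "MY_NAMEDATALEN_NOTE" passes through unchanged, while a field that is exactly "NAMEDATALEN" is replaced.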

regards, tom lane

Robert Haas

robertmhaas@gmail.com

almost 4 years ago

In reply to: Andres Freund (#6)

Re: initdb / bootstrap design

On Wed, Feb 16, 2022 at 2:50 PM Andres Freund <andres@anarazel.de> wrote:

initdb is already plenty fast enough for any plausible production
usage; it's cases like check-world where we wish it were faster.

It's not just our own usage though. I've seen it take a noticeable amount of time in test
suites of applications using postgres.

I'd just like to second this point.

I was working on an EDB proprietary software project for a while
which, because of the nature of what it did, ran initdb frequently in
its test suite. And it was unbelievably painful. The test suite just
took forever. Fortunately, it always ran initdb with the same options,
so somebody invented a mechanism for doing one initdb and saving the
results someplace and just copying them every time, and it made a huge
difference. Before that experience, I probably would have agreed with
the idea that there was no need at all for initdb to be any faster
than it is already. But, like, what if we'd been trying to run initdb
with different options for different tests, the way the core code
does? That seems like an entirely plausible thing to want to do, and
then caching becomes a real pain.
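
One way to picture why varying options complicate the caching approach: each distinct option set needs its own saved cluster. The sketch below is only an illustration under that assumption; the function names, the directory naming scheme, and the choice of FNV-1a as the hash are all made up for this note and are not taken from the project described above.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* FNV-1a, 64-bit: a simple non-cryptographic hash, used here for brevity. */
uint64_t
fnv1a64(const char *s)
{
	uint64_t	h = UINT64_C(14695981039346656037);	/* offset basis */

	assert(s != NULL);
	for (; *s != '\0'; s++)
	{
		h ^= (unsigned char) *s;
		h *= UINT64_C(1099511628211);	/* FNV prime */
	}
	return h;
}

/*
 * Hypothetical: build the path of a cached, already-initialized data
 * directory for a given initdb option string.  Identical option strings map
 * to the same path, so the copy can be reused; a different option string
 * gets a different path and needs its own initdb run.
 * Returns 0 on success, -1 if the buffer is too small.
 */
int
template_cache_path(char *buf, size_t buflen,
					const char *cache_root, const char *initdb_options)
{
	int			n;

	assert(buf != NULL && cache_root != NULL && initdb_options != NULL);

	n = snprintf(buf, buflen, "%s/initdb-template-%016" PRIx64,
				 cache_root, fnv1a64(initdb_options));
	return (n >= 0 && (size_t) n < buflen) ? 0 : -1;
}
```

Note that this keys on the literal option string, so two spellings of the same options would be cached separately, and every new combination still pays for a full initdb once.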

--
Robert Haas
EDB: http://www.enterprisedb.com

Andrew Dunstan

andrew@dunslane.net

almost 4 years ago

In reply to: Robert Haas (#8)

Re: initdb / bootstrap design

On 2/17/22 10:36, Robert Haas wrote:

On Wed, Feb 16, 2022 at 2:50 PM Andres Freund <andres@anarazel.de> wrote:

initdb is already plenty fast enough for any plausible production
usage; it's cases like check-world where we wish it were faster.

It's not just our own usage though. I've seen it take a noticeable amount of time in test
suites of applications using postgres.

I'd just like to second this point.

I was working on an EDB proprietary software project for a while
which, because of the nature of what it did, ran initdb frequently in
its test suite. And it was unbelievably painful. The test suite just
took forever. Fortunately, it always ran initdb with the same options,
so somebody invented a mechanism for doing one initdb and saving the
results someplace and just copying them every time, and it made a huge
difference. Before that experience, I probably would have agreed with
the idea that there was no need at all for initdb to be any faster
than it is already. But, like, what if we'd been trying to run initdb
with different options for different tests, the way the core code
does? That seems like an entirely plausible thing to want to do, and
then caching becomes a real pain.

Indeed. When initdb.c was written the testing landscape was very
different both for the community and for projects that used Postgres. So
we need to catch up.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#10

Tom Lane

tgl@sss.pgh.pa.us

almost 4 years ago

In reply to: Andrew Dunstan (#9)

1 attachment(s)

Re: initdb / bootstrap design

Here's an initial patch that gets rid of the need for initdb to
change the contents of postgres.bki before feeding it to the
bootstrap backend. After this, we could look at having the
backend read the file directly.

I don't really detect any speed change from getting rid of initdb's
string manipulations, but TBH I was not expecting any. On my machine,
that was lost in the noise already, according to perf(1).

regards, tom lane

Attachments:

feed-postgres.bki-to-backend-unmodified-1.patch (text/x-diff; charset=us-ascii)

diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index 9fa8fdd4cf..667c829064 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -635,6 +635,8 @@ InsertOneTuple(void)
 
 /* ----------------
  *		InsertOneValue
+ *
+ * Fill the i'th column of the current tuple with the given value.
  * ----------------
  */
 void
@@ -653,6 +655,21 @@ InsertOneValue(char *value, int i)
 
 	elog(DEBUG4, "inserting column %d value \"%s\"", i, value);
 
+	/*
+	 * In order to make the contents of postgres.bki architecture-independent,
+	 * certain values in it are represented symbolically, and we perform the
+	 * necessary replacements here.
+	 */
+	if (strcmp(value, "NAMEDATALEN") == 0)
+		value = CppAsString2(NAMEDATALEN);
+	else if (strcmp(value, "SIZEOF_POINTER") == 0)
+		value = CppAsString2(SIZEOF_VOID_P);
+	else if (strcmp(value, "ALIGNOF_POINTER") == 0)
+		value = (SIZEOF_VOID_P == 4) ? "i" : "d";
+	else if (strcmp(value, "FLOAT8PASSBYVAL") == 0)
+		value = FLOAT8PASSBYVAL ? "true" : "false";
+
+	/* Now convert the value to internal form */
 	typoid = TupleDescAttr(boot_reldesc->rd_att, i)->atttypid;
 
 	boot_get_type_io_data(typoid,
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 97f15971e2..6db9c4f334 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -265,13 +265,13 @@ static void setup_privileges(FILE *cmdfd);
 static void set_info_version(void);
 static void setup_schema(FILE *cmdfd);
 static void load_plpgsql(FILE *cmdfd);
+static void set_remaining_details(FILE *cmdfd);
 static void vacuum_db(FILE *cmdfd);
 static void make_template0(FILE *cmdfd);
 static void make_postgres(FILE *cmdfd);
 static void trapsig(int signum);
 static void check_ok(void);
 static char *escape_quotes(const char *src);
-static char *escape_quotes_bki(const char *src);
 static int	locale_date_order(const char *locale);
 static void check_locale_name(int category, const char *locale,
 							  char **canonname);
@@ -336,32 +336,6 @@ escape_quotes(const char *src)
 	return result;
 }
 
-/*
- * Escape a field value to be inserted into the BKI data.
- * Run the value through escape_quotes (which will be inverted
- * by the backend's DeescapeQuotedString() function), then wrap
- * the value in single quotes, even if that isn't strictly necessary.
- */
-static char *
-escape_quotes_bki(const char *src)
-{
-	char	   *result;
-	char	   *data = escape_quotes(src);
-	char	   *resultp;
-	char	   *datap;
-
-	result = (char *) pg_malloc(strlen(data) + 3);
-	resultp = result;
-	*resultp++ = '\'';
-	for (datap = data; *datap; datap++)
-		*resultp++ = *datap;
-	*resultp++ = '\'';
-	*resultp = '\0';
-
-	free(data);
-	return result;
-}
-
 /*
  * make a copy of the array of lines, with token replaced by replacement
  * the first time it occurs on each line.
@@ -1357,7 +1331,6 @@ bootstrap_template1(void)
 	char	  **line;
 	char	  **bki_lines;
 	char		headerline[MAXPGPATH];
-	char		buf[64];
 
 	printf(_("running bootstrap script ... "));
 	fflush(stdout);
@@ -1379,32 +1352,6 @@ bootstrap_template1(void)
 		exit(1);
 	}
 
-	/* Substitute for various symbols used in the BKI file */
-
-	sprintf(buf, "%d", NAMEDATALEN);
-	bki_lines = replace_token(bki_lines, "NAMEDATALEN", buf);
-
-	sprintf(buf, "%d", (int) sizeof(Pointer));
-	bki_lines = replace_token(bki_lines, "SIZEOF_POINTER", buf);
-
-	bki_lines = replace_token(bki_lines, "ALIGNOF_POINTER",
-							  (sizeof(Pointer) == 4) ? "i" : "d");
-
-	bki_lines = replace_token(bki_lines, "FLOAT8PASSBYVAL",
-							  FLOAT8PASSBYVAL ? "true" : "false");
-
-	bki_lines = replace_token(bki_lines, "POSTGRES",
-							  escape_quotes_bki(username));
-
-	bki_lines = replace_token(bki_lines, "ENCODING",
-							  encodingid_to_string(encodingid));
-
-	bki_lines = replace_token(bki_lines, "LC_COLLATE",
-							  escape_quotes_bki(lc_collate));
-
-	bki_lines = replace_token(bki_lines, "LC_CTYPE",
-							  escape_quotes_bki(lc_ctype));
-
 	/* Also ensure backend isn't confused by this environment var: */
 	unsetenv("PGCLIENTENCODING");
 
@@ -1622,12 +1569,11 @@ setup_collation(FILE *cmdfd)
 static void
 setup_privileges(FILE *cmdfd)
 {
-	char	  **line;
-	char	  **priv_lines;
-	static char *privileges_setup[] = {
+	const char *const *line;
+	static const char *const privileges_setup[] = {
 		"UPDATE pg_class "
 		"  SET relacl = (SELECT array_agg(a.acl) FROM "
-		" (SELECT E'=r/\"$POSTGRES_SUPERUSERNAME\"' as acl "
+		" (SELECT '=r/\"POSTGRES\"' as acl "
 		"  UNION SELECT unnest(pg_catalog.acldefault("
 		"    CASE WHEN relkind = " CppAsString2(RELKIND_SEQUENCE) " THEN 's' "
 		"         ELSE 'r' END::\"char\"," CppAsString2(BOOTSTRAP_SUPERUSERID) "::oid))"
@@ -1759,9 +1705,7 @@ setup_privileges(FILE *cmdfd)
 		NULL
 	};
 
-	priv_lines = replace_token(privileges_setup, "$POSTGRES_SUPERUSERNAME",
-							   escape_quotes(username));
-	for (line = priv_lines; *line != NULL; line++)
+	for (line = privileges_setup; *line != NULL; line++)
 		PG_CMD_PUTS(*line);
 }
 
@@ -1822,6 +1766,48 @@ load_plpgsql(FILE *cmdfd)
 	PG_CMD_PUTS("CREATE EXTENSION plpgsql;\n\n");
 }
 
+/*
+ * Set some remaining details that aren't known when postgres.bki is made.
+ *
+ * Up to now, the bootstrap superuser has been named "POSTGRES".
+ * Replace that with the user-specified name (often "postgres").
+ * Also, insert the desired locale and encoding details in pg_database.
+ *
+ * Note: this must run after setup_privileges(), which expects the superuser
+ * name to still be "POSTGRES".
+ */
+static void
+set_remaining_details(FILE *cmdfd)
+{
+	char	  **line;
+	char	  **detail_lines;
+
+	/*
+	 * Ideally we'd change the superuser name with ALTER USER, but the backend
+	 * will reject that with "session user cannot be renamed", so we must
+	 * cheat.  (In any case, we'd need a function to escape an identifier, not
+	 * a string literal.)  Likewise, we can't change template1's
+	 * locale/encoding without cheating.
+	 */
+	static char *final_details[] = {
+		"UPDATE pg_authid SET rolname = E'SUPERUSER_NAME' WHERE rolname = 'POSTGRES';\n\n",
+		"UPDATE pg_database SET encoding = E'ENCODING', datcollate = E'LC_COLLATE', datctype = E'LC_CTYPE';\n\n",
+		NULL
+	};
+
+	detail_lines = replace_token(final_details, "SUPERUSER_NAME",
+								 escape_quotes(username));
+	detail_lines = replace_token(detail_lines, "ENCODING",
+								 encodingid_to_string(encodingid));
+	detail_lines = replace_token(detail_lines, "LC_COLLATE",
+								 escape_quotes(lc_collate));
+	detail_lines = replace_token(detail_lines, "LC_CTYPE",
+								 escape_quotes(lc_ctype));
+
+	for (line = detail_lines; *line != NULL; line++)
+		PG_CMD_PUTS(*line);
+}
+
 /*
  * clean everything up in template1
  */
@@ -2851,6 +2837,8 @@ initialize_data_directory(void)
 
 	load_plpgsql(cmdfd);
 
+	set_remaining_details(cmdfd);
+
 	vacuum_db(cmdfd);
 
 	make_template0(cmdfd);
diff --git a/src/include/catalog/pg_database.dat b/src/include/catalog/pg_database.dat
index e7e42d6023..c92cdde260 100644
--- a/src/include/catalog/pg_database.dat
+++ b/src/include/catalog/pg_database.dat
@@ -12,11 +12,14 @@
 
 [
 
+# We initialize template1's encoding as PG_SQL_ASCII and its locales as C.
+# initdb will change that during database initialization.
+
 { oid => '1', oid_symbol => 'TemplateDbOid',
   descr => 'default template for new databases',
-  datname => 'template1', encoding => 'ENCODING', datistemplate => 't',
+  datname => 'template1', encoding => '0', datistemplate => 't',
   datallowconn => 't', datconnlimit => '-1', datfrozenxid => '0',
-  datminmxid => '1', dattablespace => 'pg_default', datcollate => 'LC_COLLATE',
-  datctype => 'LC_CTYPE', datacl => '_null_' },
+  datminmxid => '1', dattablespace => 'pg_default', datcollate => 'C',
+  datctype => 'C', datacl => '_null_' },
 
 ]

#11

Andres Freund

andres@anarazel.de

almost 4 years ago

In reply to: Tom Lane (#10)

Re: initdb / bootstrap design

Hi,

On 2022-02-19 18:35:18 -0500, Tom Lane wrote:

Here's an initial patch that gets rid of the need for initdb to
change the contents of postgres.bki before feeding it to the
bootstrap backend. After this, we could look at having the
backend read the file directly.

Cool!

I don't really detect any speed change from getting rid of initdb's
string manipulations, but TBH I was not expecting any. On my machine,
that was lost in the noise already, according to perf(1).

Yea, I'd not expect much either. The slowdown around the string stuff that I
did see was on windows.

I would however expect some, but not huge, speedup by getting rid of the
line-by-line reading/writing of postgres.bki, even without moving the handling
to the backend.

A quick way to prototype moving the handling to the backend would be to
just call postgres with input redirection from postgres.bki...

+	/*
+	 * Ideally we'd change the superuser name with ALTER USER, but the backend
+	 * will reject that with "session user cannot be renamed", so we must
+	 * cheat.  (In any case, we'd need a function to escape an identifier, not
+	 * a string literal.)  Likewise, we can't change template1's
+	 * locale/encoding without cheating.
+	 */
+	static char *final_details[] = {
+		"UPDATE pg_authid SET rolname = E'SUPERUSER_NAME' WHERE rolname = 'POSTGRES';\n\n",
+		"UPDATE pg_database SET encoding = E'ENCODING', datcollate = E'LC_COLLATE', datctype = E'LC_CTYPE';\n\n",
+		NULL
+	};
+
+	detail_lines = replace_token(final_details, "SUPERUSER_NAME",
+								 escape_quotes(username));
+	detail_lines = replace_token(detail_lines, "ENCODING",
+								 encodingid_to_string(encodingid));
+	detail_lines = replace_token(detail_lines, "LC_COLLATE",
+								 escape_quotes(lc_collate));
+	detail_lines = replace_token(detail_lines, "LC_CTYPE",
+								 escape_quotes(lc_ctype));

Hm, wouldn't it be less code to just use printf?
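
For comparison, a sketch of what a printf-style version of the pg_database statement might look like. format_template1_update is a hypothetical name; the statement text follows the UPDATE quoted above but passes the encoding as an integer and uses plain string literals. The sketch does no quoting of its own, so as a stated simplification it rejects values containing a single quote instead of escaping them.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical: format the template1 locale/encoding UPDATE into "buf".
 * Returns 0 on success, -1 if the buffer is too small or if a locale name
 * contains a single quote (this sketch does not implement escaping).
 */
int
format_template1_update(char *buf, size_t buflen, int encodingid,
						const char *collate, const char *ctype)
{
	int			n;

	assert(buf != NULL && collate != NULL && ctype != NULL);

	/* Refuse input that would need escaping rather than emit broken SQL */
	if (strchr(collate, '\'') != NULL || strchr(ctype, '\'') != NULL)
		return -1;

	n = snprintf(buf, buflen,
				 "UPDATE pg_database SET encoding = %d, "
				 "datcollate = '%s', datctype = '%s';\n\n",
				 encodingid, collate, ctype);

	/* snprintf reports the length it wanted; treat truncation as failure */
	return (n >= 0 && (size_t) n < buflen) ? 0 : -1;
}
```

Whether this counts as less code than the replace_token() calls depends on how the escaping is handled, which is the part this sketch leaves out.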

Greetings,

Andres Freund

#12

Tom Lane

tgl@sss.pgh.pa.us

almost 4 years ago

In reply to: Andres Freund (#11)

Re: initdb / bootstrap design

Andres Freund <andres@anarazel.de> writes:

A quick way to prototype moving the handling to the backend would be to
just call postgres with input redirection from postgres.bki...

Hmm. I was thinking of inventing an include-file command in the
BKI language, and making initdb just send an INCLUDE command.
That's arguably overkill for the immediate need, but it looks like it
requires just a few lines of code (flex provides pretty much all of the
infrastructure already), and maybe we'd find another use for it later.

However, redirection does sound like a very easy answer ...

Hm, wouldn't it be less code to just use printf?

Meh --- it'd be different from the way we do it in the rest
of initdb, and it would not be "less code". Maybe it'd run
a shade faster, but I refuse to believe that that'd be
enough to matter.

regards, tom lane

#13

Andres Freund

andres@anarazel.de

almost 4 years ago

In reply to: Tom Lane (#12)

Re: initdb / bootstrap design

Hi,

On February 19, 2022 4:39:38 PM PST, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andres Freund <andres@anarazel.de> writes:

A quick way to prototype moving the handling to the backend would be to
just call postgres with input redirection from postgres.bki...

Hmm. I was thinking of inventing an include-file command in the
BKI language, and making initdb just send an INCLUDE command.
That's arguably overkill for the immediate need, but it looks like it
requires just a few lines of code (flex provides pretty much all of the
infrastructure already), and maybe we'd find another use for it later.

However, redirection does sound like a very easy answer ...

Medium term I'd rather do neither, because I'd like to avoid the restart between bootstrap and the various SQL files. But short term, redirection might be good enough; it does mostly work on Windows, I think ...

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

#14

Tom Lane

tgl@sss.pgh.pa.us

almost 4 years ago

In reply to: Tom Lane (#12)

1 attachment(s)

Re: initdb / bootstrap design

I wrote:

However, redirection does sound like a very easy answer ...

I tried it like that (full patch attached) and the results are
intensely disappointing. On my Mac laptop, the time needed for
50 iterations of initdb drops from 16.8 sec to 16.75 sec.
On my RHEL8 workstation, the change is actually in the wrong
direction, from 18.75s to 18.9s. I conclude that the time
spent on postgres.bki data transfer is so far down in the noise
as to be overwhelmed by irrelevancies. (Which, in fact, is
what perf told me before I started --- but I'd hoped that the
number of system calls would diminish noticeably. Seems not.)
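
The message does not say how the 50-iteration timings were collected. One simple way to take a comparable measurement is sketched below, assuming a POSIX system; time_command is a hypothetical name, and the command string (including whatever creates and removes a fresh data directory for each run) is left to the caller.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdlib.h>
#include <time.h>

/*
 * Hypothetical timing harness: run "command" through the shell "iterations"
 * times and return the total elapsed wall-clock time in seconds.
 * Returns -1.0 if the clock cannot be read or any run exits unsuccessfully.
 */
double
time_command(const char *command, int iterations)
{
	struct timespec start;
	struct timespec end;
	int			i;

	assert(command != NULL && iterations > 0);

	if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
		return -1.0;

	for (i = 0; i < iterations; i++)
	{
		/* system() returns the wait status; nonzero means the run failed */
		if (system(command) != 0)
			return -1.0;
	}

	if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
		return -1.0;

	return (double) (end.tv_sec - start.tv_sec) +
		(double) (end.tv_nsec - start.tv_nsec) / 1e9;
}
```

With differences as small as the ones reported here (a few hundredths of a second over 50 runs), run-to-run variation can easily exceed the effect, so repeated measurements would be needed before reading anything into the sign of the change.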

Not sure that this is worth pursuing any further.

regards, tom lane

Attachments:

make-backend-read-postgres.bki-directly-1.patch (text/x-diff; charset=us-ascii)

diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index 9fa8fdd4cf..667c829064 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -635,6 +635,8 @@ InsertOneTuple(void)
 
 /* ----------------
  *		InsertOneValue
+ *
+ * Fill the i'th column of the current tuple with the given value.
  * ----------------
  */
 void
@@ -653,6 +655,21 @@ InsertOneValue(char *value, int i)
 
 	elog(DEBUG4, "inserting column %d value \"%s\"", i, value);
 
+	/*
+	 * In order to make the contents of postgres.bki architecture-independent,
+	 * certain values in it are represented symbolically, and we perform the
+	 * necessary replacements here.
+	 */
+	if (strcmp(value, "NAMEDATALEN") == 0)
+		value = CppAsString2(NAMEDATALEN);
+	else if (strcmp(value, "SIZEOF_POINTER") == 0)
+		value = CppAsString2(SIZEOF_VOID_P);
+	else if (strcmp(value, "ALIGNOF_POINTER") == 0)
+		value = (SIZEOF_VOID_P == 4) ? "i" : "d";
+	else if (strcmp(value, "FLOAT8PASSBYVAL") == 0)
+		value = FLOAT8PASSBYVAL ? "true" : "false";
+
+	/* Now convert the value to internal form */
 	typoid = TupleDescAttr(boot_reldesc->rd_att, i)->atttypid;
 
 	boot_get_type_io_data(typoid,
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 97f15971e2..9850f342bf 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -265,13 +265,13 @@ static void setup_privileges(FILE *cmdfd);
 static void set_info_version(void);
 static void setup_schema(FILE *cmdfd);
 static void load_plpgsql(FILE *cmdfd);
+static void set_remaining_details(FILE *cmdfd);
 static void vacuum_db(FILE *cmdfd);
 static void make_template0(FILE *cmdfd);
 static void make_postgres(FILE *cmdfd);
 static void trapsig(int signum);
 static void check_ok(void);
 static char *escape_quotes(const char *src);
-static char *escape_quotes_bki(const char *src);
 static int	locale_date_order(const char *locale);
 static void check_locale_name(int category, const char *locale,
 							  char **canonname);
@@ -336,32 +336,6 @@ escape_quotes(const char *src)
 	return result;
 }
 
-/*
- * Escape a field value to be inserted into the BKI data.
- * Run the value through escape_quotes (which will be inverted
- * by the backend's DeescapeQuotedString() function), then wrap
- * the value in single quotes, even if that isn't strictly necessary.
- */
-static char *
-escape_quotes_bki(const char *src)
-{
-	char	   *result;
-	char	   *data = escape_quotes(src);
-	char	   *resultp;
-	char	   *datap;
-
-	result = (char *) pg_malloc(strlen(data) + 3);
-	resultp = result;
-	*resultp++ = '\'';
-	for (datap = data; *datap; datap++)
-		*resultp++ = *datap;
-	*resultp++ = '\'';
-	*resultp = '\0';
-
-	free(data);
-	return result;
-}
-
 /*
  * make a copy of the array of lines, with token replaced by replacement
  * the first time it occurs on each line.
@@ -1354,22 +1328,26 @@ static void
 bootstrap_template1(void)
 {
 	PG_CMD_DECL;
-	char	  **line;
-	char	  **bki_lines;
+	FILE	   *infile;
 	char		headerline[MAXPGPATH];
-	char		buf[64];
+	char		firstline[MAXPGPATH];
 
 	printf(_("running bootstrap script ... "));
 	fflush(stdout);
 
-	bki_lines = readfile(bki_file);
-
 	/* Check that bki file appears to be of the right version */
 
 	snprintf(headerline, sizeof(headerline), "# PostgreSQL %s\n",
 			 PG_MAJORVERSION);
 
-	if (strcmp(headerline, *bki_lines) != 0)
+	if ((infile = fopen(bki_file, "r")) == NULL)
+	{
+		pg_log_error("could not open file \"%s\" for reading: %m", bki_file);
+		exit(1);
+	}
+
+	if (fgets(firstline, sizeof(firstline), infile) == NULL ||
+		strcmp(headerline, firstline) != 0)
 	{
 		pg_log_error("input file \"%s\" does not belong to PostgreSQL %s",
 					 bki_file, PG_VERSION);
@@ -1379,56 +1357,24 @@ bootstrap_template1(void)
 		exit(1);
 	}
 
-	/* Substitute for various symbols used in the BKI file */
-
-	sprintf(buf, "%d", NAMEDATALEN);
-	bki_lines = replace_token(bki_lines, "NAMEDATALEN", buf);
-
-	sprintf(buf, "%d", (int) sizeof(Pointer));
-	bki_lines = replace_token(bki_lines, "SIZEOF_POINTER", buf);
-
-	bki_lines = replace_token(bki_lines, "ALIGNOF_POINTER",
-							  (sizeof(Pointer) == 4) ? "i" : "d");
-
-	bki_lines = replace_token(bki_lines, "FLOAT8PASSBYVAL",
-							  FLOAT8PASSBYVAL ? "true" : "false");
-
-	bki_lines = replace_token(bki_lines, "POSTGRES",
-							  escape_quotes_bki(username));
-
-	bki_lines = replace_token(bki_lines, "ENCODING",
-							  encodingid_to_string(encodingid));
-
-	bki_lines = replace_token(bki_lines, "LC_COLLATE",
-							  escape_quotes_bki(lc_collate));
-
-	bki_lines = replace_token(bki_lines, "LC_CTYPE",
-							  escape_quotes_bki(lc_ctype));
+	fclose(infile);
 
 	/* Also ensure backend isn't confused by this environment var: */
 	unsetenv("PGCLIENTENCODING");
 
 	snprintf(cmd, sizeof(cmd),
-			 "\"%s\" --boot -X %d %s %s %s %s",
+			 "\"%s\" --boot -X %d %s %s %s %s <\"%s\"",
 			 backend_exec,
 			 wal_segment_size_mb * (1024 * 1024),
 			 data_checksums ? "-k" : "",
 			 boot_options, extra_options,
-			 debug ? "-d 5" : "");
-
+			 debug ? "-d 5" : "",
+			 bki_file);
 
 	PG_CMD_OPEN;
-
-	for (line = bki_lines; *line != NULL; line++)
-	{
-		PG_CMD_PUTS(*line);
-		free(*line);
-	}
-
+	/* Nothing to write, since backend reads bki_file directly */
 	PG_CMD_CLOSE;
 
-	free(bki_lines);
-
 	check_ok();
 }
 
@@ -1622,12 +1568,11 @@ setup_collation(FILE *cmdfd)
 static void
 setup_privileges(FILE *cmdfd)
 {
-	char	  **line;
-	char	  **priv_lines;
-	static char *privileges_setup[] = {
+	const char *const *line;
+	static const char *const privileges_setup[] = {
 		"UPDATE pg_class "
 		"  SET relacl = (SELECT array_agg(a.acl) FROM "
-		" (SELECT E'=r/\"$POSTGRES_SUPERUSERNAME\"' as acl "
+		" (SELECT '=r/\"POSTGRES\"' as acl "
 		"  UNION SELECT unnest(pg_catalog.acldefault("
 		"    CASE WHEN relkind = " CppAsString2(RELKIND_SEQUENCE) " THEN 's' "
 		"         ELSE 'r' END::\"char\"," CppAsString2(BOOTSTRAP_SUPERUSERID) "::oid))"
@@ -1759,9 +1704,7 @@ setup_privileges(FILE *cmdfd)
 		NULL
 	};
 
-	priv_lines = replace_token(privileges_setup, "$POSTGRES_SUPERUSERNAME",
-							   escape_quotes(username));
-	for (line = priv_lines; *line != NULL; line++)
+	for (line = privileges_setup; *line != NULL; line++)
 		PG_CMD_PUTS(*line);
 }
 
@@ -1822,6 +1765,48 @@ load_plpgsql(FILE *cmdfd)
 	PG_CMD_PUTS("CREATE EXTENSION plpgsql;\n\n");
 }
 
+/*
+ * Set some remaining details that aren't known when postgres.bki is made.
+ *
+ * Up to now, the bootstrap superuser has been named "POSTGRES".
+ * Replace that with the user-specified name (often "postgres").
+ * Also, insert the desired locale and encoding details in pg_database.
+ *
+ * Note: this must run after setup_privileges(), which expects the superuser
+ * name to still be "POSTGRES".
+ */
+static void
+set_remaining_details(FILE *cmdfd)
+{
+	char	  **line;
+	char	  **detail_lines;
+
+	/*
+	 * Ideally we'd change the superuser name with ALTER USER, but the backend
+	 * will reject that with "session user cannot be renamed", so we must
+	 * cheat.  (In any case, we'd need a function to escape an identifier, not
+	 * a string literal.)  Likewise, we can't change template1's
+	 * locale/encoding without cheating.
+	 */
+	static char *final_details[] = {
+		"UPDATE pg_authid SET rolname = E'SUPERUSER_NAME' WHERE rolname = 'POSTGRES';\n\n",
+		"UPDATE pg_database SET encoding = E'ENCODING', datcollate = E'LC_COLLATE', datctype = E'LC_CTYPE';\n\n",
+		NULL
+	};
+
+	detail_lines = replace_token(final_details, "SUPERUSER_NAME",
+								 escape_quotes(username));
+	detail_lines = replace_token(detail_lines, "ENCODING",
+								 encodingid_to_string(encodingid));
+	detail_lines = replace_token(detail_lines, "LC_COLLATE",
+								 escape_quotes(lc_collate));
+	detail_lines = replace_token(detail_lines, "LC_CTYPE",
+								 escape_quotes(lc_ctype));
+
+	for (line = detail_lines; *line != NULL; line++)
+		PG_CMD_PUTS(*line);
+}
+
 /*
  * clean everything up in template1
  */
@@ -2851,6 +2836,8 @@ initialize_data_directory(void)
 
 	load_plpgsql(cmdfd);
 
+	set_remaining_details(cmdfd);
+
 	vacuum_db(cmdfd);
 
 	make_template0(cmdfd);
diff --git a/src/include/catalog/pg_database.dat b/src/include/catalog/pg_database.dat
index e7e42d6023..87aad30146 100644
--- a/src/include/catalog/pg_database.dat
+++ b/src/include/catalog/pg_database.dat
@@ -12,11 +12,15 @@
 
 [
 
+# We initialize template1's encoding as PG_SQL_ASCII and its locales as C.
+# initdb will change that during database initialization; however, the
+# post-bootstrap initialization session will run with those values.
+
 { oid => '1', oid_symbol => 'TemplateDbOid',
   descr => 'default template for new databases',
-  datname => 'template1', encoding => 'ENCODING', datistemplate => 't',
+  datname => 'template1', encoding => 'PG_SQL_ASCII', datistemplate => 't',
   datallowconn => 't', datconnlimit => '-1', datfrozenxid => '0',
-  datminmxid => '1', dattablespace => 'pg_default', datcollate => 'LC_COLLATE',
-  datctype => 'LC_CTYPE', datacl => '_null_' },
+  datminmxid => '1', dattablespace => 'pg_default', datcollate => 'C',
+  datctype => 'C', datacl => '_null_' },
 
 ]
diff --git a/src/include/catalog/pg_database.h b/src/include/catalog/pg_database.h
index 76adbd4aad..9fd424d58d 100644
--- a/src/include/catalog/pg_database.h
+++ b/src/include/catalog/pg_database.h
@@ -38,7 +38,7 @@ CATALOG(pg_database,1262,DatabaseRelationId) BKI_SHARED_RELATION BKI_ROWTYPE_OID
 	Oid			datdba BKI_DEFAULT(POSTGRES) BKI_LOOKUP(pg_authid);
 
 	/* character encoding */
-	int32		encoding;
+	int32		encoding BKI_LOOKUP(encoding);
 
 	/* allowed as CREATE DATABASE template? */
 	bool		datistemplate;
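
The patch above replaces "read the whole file, then compare the first array element" with "open the file, read one line, compare". A minimal standalone sketch of that check follows. The helper name bki_header_matches and the fixed 256-byte buffers are assumptions for illustration; the patch itself does this inline in bootstrap_template1() with MAXPGPATH-sized buffers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Return true if the first line of "infile" is exactly
 * "# PostgreSQL <major_version>\n".
 *
 * Hypothetical helper, not part of the patch: it only mirrors the
 * fgets() + strcmp() test that the patch adds to bootstrap_template1().
 */
static bool
bki_header_matches(FILE *infile, const char *major_version)
{
	char		headerline[256];
	char		firstline[256];

	assert(infile != NULL && major_version != NULL);

	snprintf(headerline, sizeof(headerline), "# PostgreSQL %s\n",
			 major_version);

	/* An empty or unreadable file can never match. */
	if (fgets(firstline, sizeof(firstline), infile) == NULL)
		return false;

	return strcmp(headerline, firstline) == 0;
}
```

Reading only the first line keeps initdb's share of the work constant regardless of the size of postgres.bki, which matters once the backend reads the file itself.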

#15

Andres Freund

andres@anarazel.de

almost 4 years ago

In reply to: Tom Lane (#14)

5 attachment(s)

Re: initdb / bootstrap design

Hi,

On 2022-02-19 20:46:26 -0500, Tom Lane wrote:

> I tried it like that (full patch attached) and the results are intensely
> disappointing. On my Mac laptop, the time needed for 50 iterations of
> initdb drops from 16.8 sec to 16.75 sec.

Hm. I'd hoped for at least a little bit bigger win. But I think it enables
more, see below:

> Not sure that this is worth pursuing any further.

I experimented with moving all the bootstrapping into --boot mode and got it
working. Albeit definitely with a few hacks (more below).

While I had hoped for a bit more of a win, it's IMO a nice improvement.
Timing 10 runs of initdb -N --wal-segsize 1 in a loop:

HEAD:

assert:
8.06user 1.17system 0:09.25elapsed 99%CPU (0avgtext+0avgdata 91724maxresident)k
0inputs+549280outputs (40major+99824minor)pagefaults 0swaps

opt:
2.89user 0.99system 0:04.81elapsed 80%CPU (0avgtext+0avgdata 88864maxresident)k
0inputs+549280outputs (40major+99792minor)pagefaults 0swaps

default to lz4:

assert:
7.61user 1.03system 0:08.69elapsed 99%CPU (0avgtext+0avgdata 91508maxresident)k
0inputs+546400outputs (42major+99551minor)pagefaults 0swaps

opt:
2.55user 0.94system 0:03.49elapsed 99%CPU (0avgtext+0avgdata 88816maxresident)k
0inputs+546400outputs (40major+99551minor)pagefaults 0swaps

bootstrap replace:

assert:
7.42user 1.00system 0:08.52elapsed 98%CPU (0avgtext+0avgdata 91656maxresident)k
0inputs+546400outputs (40major+97737minor)pagefaults 0swaps

opt:
2.49user 0.98system 0:03.49elapsed 99%CPU (0avgtext+0avgdata 88700maxresident)k
0inputs+546400outputs (40major+97728minor)pagefaults 0swaps

everything in bootstrap:

assert:
6.31user 0.94system 0:07.35elapsed 98%CPU (0avgtext+0avgdata 97812maxresident)k
0inputs+547360outputs (30major+88617minor)pagefaults 0swaps

opt:
2.42user 0.85system 0:03.28elapsed 99%CPU (0avgtext+0avgdata 94572maxresident)k
0inputs+547360outputs (30major+83712minor)pagefaults 0swaps

optimize WAL in bootstrap:

assert:
6.26user 0.96system 0:07.29elapsed 99%CPU (0avgtext+0avgdata 97844maxresident)k
0inputs+547360outputs (30major+88586minor)pagefaults 0swaps

opt:
2.43user 0.80system 0:03.24elapsed 99%CPU (0avgtext+0avgdata 94436maxresident)k
0inputs+547360outputs (30major+83664minor)pagefaults 0swaps

remove isatty in bootstrap:

assert:
6.15user 0.83system 0:06.99elapsed 99%CPU (0avgtext+0avgdata 97832maxresident)k
0inputs+465120outputs (30major+88559minor)pagefaults 0swaps

opt:
2.28user 0.85system 0:03.14elapsed 99%CPU (0avgtext+0avgdata 94604maxresident)k
0inputs+465120outputs (30major+83728minor)pagefaults 0swaps

That's IMO not bad.
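
For reference, the elapsed times above can be turned into relative reductions between HEAD and the last variant. The sketch below only does that arithmetic on the figures quoted in this mail (9.25s to 6.99s for the assert build, 4.81s to 3.14s for the optimized build), which comes out at roughly 24% and 35%. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Percentage reduction in elapsed time going from "before" to "after". */
static double
percent_reduction(double before, double after)
{
	assert(before > 0.0);
	return (before - after) / before * 100.0;
}

/* Print the HEAD vs. final-variant comparison, using the times above. */
static void
print_initdb_summary(void)
{
	printf("assert: %.1f%% less elapsed time\n",
		   percent_reduction(9.25, 6.99));
	printf("opt:    %.1f%% less elapsed time\n",
		   percent_reduction(4.81, 3.14));
}
```

These are single measurements of 10 runs each, so the percentages are indicative only.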

On Windows I see higher gains, which makes sense, because filesystem IO is
slower. FreeBSD as well, but the variance is oddly high, so I might be doing
something wrong.

The main reason I like this, however, isn't the speedup itself, but that after
this, initdb no longer depends on single user mode at all.

About the prototype:

- Most of the bootstrap SQL is executed from bootstrap.c itself. But some
still comes from the client. E.g. password, a few information_schema
details and the database / authid changes.

- To execute the SQL I mostly used extension.c's
read_whole_file()/execute_sql_string(). But VACUUM and CREATE DATABASE require
all the transactional hacks in portal.c etc., so I wrapped
exec_simple_query() for that phase.

Might be better to just call vacuum.c / database.c directly.

- For indexed relcache access to work, the phase of
RelationCacheInitializePhase3() that's initially skipped needs to be
executed. I hacked that up by adding a RelationCacheInitializePhase3b() that
bootstrap.c can call, but that's obviously too ugly to live.

- InvalidateSystemCaches() is needed after bki processing. Otherwise I see a
"row is too big:" error. I haven't investigated that yet.

- I definitely removed some validation that we'd probably want. But that seems
like something to care about later...

- 0004 prevents a fair bit of WAL from being written. While XLogInsert did
some of that, it didn't block FPIs, which obviously are bulky. This reduces
WAL from ~5MB to ~100kB.
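
To illustrate the "read a whole SQL script, then hand it to the executor" step from the list above, here is a standalone sketch of the reading half. It is an assumption-laden stand-in, not the backend code: read_whole_file_sketch is a made-up name, it uses malloc and returns NULL on failure, whereas extension.c's read_whole_file() uses the backend's memory and error machinery.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Read an entire file into a malloc'd, NUL-terminated buffer and store its
 * size in *length.  Returns NULL on any failure.
 *
 * Hypothetical stand-in for the backend's read_whole_file(); the caller
 * would pass the result to something like execute_sql_string().
 */
static char *
read_whole_file_sketch(const char *filename, size_t *length)
{
	FILE	   *file;
	long		size;
	char	   *buf;

	assert(filename != NULL && length != NULL);

	if ((file = fopen(filename, "rb")) == NULL)
		return NULL;

	/* Determine the file size by seeking to the end and back. */
	if (fseek(file, 0, SEEK_END) != 0 ||
		(size = ftell(file)) < 0 ||
		fseek(file, 0, SEEK_SET) != 0)
	{
		fclose(file);
		return NULL;
	}

	buf = malloc((size_t) size + 1);
	if (buf == NULL)
	{
		fclose(file);
		return NULL;
	}

	if (fread(buf, 1, (size_t) size, file) != (size_t) size)
	{
		free(buf);
		fclose(file);
		return NULL;
	}
	fclose(file);

	buf[size] = '\0';
	*length = (size_t) size;
	return buf;
}
```

The caller owns the returned buffer and must free() it. Reading the script in one go avoids feeding it line by line through a pipe, which is the overhead this thread is trying to remove.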

There's quite a bit of further speedup potential:

- One bottleneck, particularly in optimized mode, is the handling of huge node
trees for views. stringToNode() and nodeRead() alone account for > 10%

- Enabling index access sometime during the postgres.bki processing would make
invalidation handling for subsequent indexes faster. Or maybe we can disable
a few more invalidations. Inval processing is >10%

- More than 10% (assert) / 7% (optimized) is spent in
compute_scalar_stats()->qsort_arg(). Something seems off with that to me.

Completely crazy?

Greetings,

Andres Freund

Attachments:

v1-0001-Set-default_toast_compression-lz4-if-available.patch (text/x-diff; charset=us-ascii)

From 45d63168ddeb8bdf3ed29ca150f453ffcd051697 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Sun, 20 Feb 2022 12:20:42 -0800
Subject: [PATCH v1 1/5] Set default_toast_compression=lz4 if available.

Makes initdb faster, generally a good idea, users shouldn't have to bother
with this.

Author: Justin Pryzby <pryzbyj@telsasoft.com>
Discussion: https://postgr.es/m/20220216212952.GH31460@telsasoft.com
---
 src/backend/utils/misc/guc.c | 4 ++++
 src/bin/initdb/initdb.c      | 6 ++++++
 doc/src/sgml/config.sgml     | 4 +++-
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 01f373815e0..f502f9840f5 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -4727,7 +4727,11 @@ static struct config_enum ConfigureNamesEnum[] =
 			NULL
 		},
 		&default_toast_compression,
+#ifdef USE_LZ4
+		TOAST_LZ4_COMPRESSION,
+#else
 		TOAST_PGLZ_COMPRESSION,
+#endif
 		default_toast_compression_options,
 		NULL, NULL, NULL
 	},
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 97f15971e2b..73ccbf63207 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -1185,6 +1185,12 @@ setup_config(void)
 							  "#update_process_title = off");
 #endif
 
+#ifdef USE_LZ4
+	conflines = replace_token(conflines,
+							  "#default_toast_compression = 'pglz'",
+							  "#default_toast_compression = 'lz4'");
+#endif
+
 	/*
 	 * Change password_encryption setting to md5 if md5 was chosen as an
 	 * authentication method, unless scram-sha-256 was also chosen.
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d99bf38e677..97e78506b13 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8536,7 +8536,9 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
         The supported compression methods are <literal>pglz</literal> and
         (if <productname>PostgreSQL</productname> was compiled with
         <option>--with-lz4</option>) <literal>lz4</literal>.
-        The default is <literal>pglz</literal>.
+        The default is <literal>lz4</literal> if available at the time 
+        <productname>PostgreSQL</productname> was compiled, otherwise
+        <literal>pglz</literal>.
        </para>
       </listitem>
      </varlistentry>
-- 
2.34.0

v1-0002-initdb-move-token-replacing-in-postgres.bki-to-ba.patch (text/x-diff; charset=us-ascii)

From f3ea20b09c3e66c3c3e86e729e1920ac26ffd706 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Sat, 19 Feb 2022 17:33:34 -0800
Subject: [PATCH v1 2/5] initdb: move token replacing in postgres.bki to
 backend.

Author: Tom Lane
---
 src/include/catalog/pg_database.dat |   9 ++-
 src/backend/bootstrap/bootstrap.c   |  17 +++++
 src/bin/initdb/initdb.c             | 110 +++++++++++++---------------
 3 files changed, 72 insertions(+), 64 deletions(-)

diff --git a/src/include/catalog/pg_database.dat b/src/include/catalog/pg_database.dat
index e7e42d60234..c92cdde2600 100644
--- a/src/include/catalog/pg_database.dat
+++ b/src/include/catalog/pg_database.dat
@@ -12,11 +12,14 @@
 
 [
 
+# We initialize template1's encoding as PG_SQL_ASCII and its locales as C.
+# initdb will change that during database initialization.
+
 { oid => '1', oid_symbol => 'TemplateDbOid',
   descr => 'default template for new databases',
-  datname => 'template1', encoding => 'ENCODING', datistemplate => 't',
+  datname => 'template1', encoding => '0', datistemplate => 't',
   datallowconn => 't', datconnlimit => '-1', datfrozenxid => '0',
-  datminmxid => '1', dattablespace => 'pg_default', datcollate => 'LC_COLLATE',
-  datctype => 'LC_CTYPE', datacl => '_null_' },
+  datminmxid => '1', dattablespace => 'pg_default', datcollate => 'C',
+  datctype => 'C', datacl => '_null_' },
 
 ]
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index 9fa8fdd4cf3..667c829064d 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -635,6 +635,8 @@ InsertOneTuple(void)
 
 /* ----------------
  *		InsertOneValue
+ *
+ * Fill the i'th column of the current tuple with the given value.
  * ----------------
  */
 void
@@ -653,6 +655,21 @@ InsertOneValue(char *value, int i)
 
 	elog(DEBUG4, "inserting column %d value \"%s\"", i, value);
 
+	/*
+	 * In order to make the contents of postgres.bki architecture-independent,
+	 * certain values in it are represented symbolically, and we perform the
+	 * necessary replacements here.
+	 */
+	if (strcmp(value, "NAMEDATALEN") == 0)
+		value = CppAsString2(NAMEDATALEN);
+	else if (strcmp(value, "SIZEOF_POINTER") == 0)
+		value = CppAsString2(SIZEOF_VOID_P);
+	else if (strcmp(value, "ALIGNOF_POINTER") == 0)
+		value = (SIZEOF_VOID_P == 4) ? "i" : "d";
+	else if (strcmp(value, "FLOAT8PASSBYVAL") == 0)
+		value = FLOAT8PASSBYVAL ? "true" : "false";
+
+	/* Now convert the value to internal form */
 	typoid = TupleDescAttr(boot_reldesc->rd_att, i)->atttypid;
 
 	boot_get_type_io_data(typoid,
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 73ccbf63207..37ac928b2ef 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -265,13 +265,13 @@ static void setup_privileges(FILE *cmdfd);
 static void set_info_version(void);
 static void setup_schema(FILE *cmdfd);
 static void load_plpgsql(FILE *cmdfd);
+static void set_remaining_details(FILE *cmdfd);
 static void vacuum_db(FILE *cmdfd);
 static void make_template0(FILE *cmdfd);
 static void make_postgres(FILE *cmdfd);
 static void trapsig(int signum);
 static void check_ok(void);
 static char *escape_quotes(const char *src);
-static char *escape_quotes_bki(const char *src);
 static int	locale_date_order(const char *locale);
 static void check_locale_name(int category, const char *locale,
 							  char **canonname);
@@ -336,32 +336,6 @@ escape_quotes(const char *src)
 	return result;
 }
 
-/*
- * Escape a field value to be inserted into the BKI data.
- * Run the value through escape_quotes (which will be inverted
- * by the backend's DeescapeQuotedString() function), then wrap
- * the value in single quotes, even if that isn't strictly necessary.
- */
-static char *
-escape_quotes_bki(const char *src)
-{
-	char	   *result;
-	char	   *data = escape_quotes(src);
-	char	   *resultp;
-	char	   *datap;
-
-	result = (char *) pg_malloc(strlen(data) + 3);
-	resultp = result;
-	*resultp++ = '\'';
-	for (datap = data; *datap; datap++)
-		*resultp++ = *datap;
-	*resultp++ = '\'';
-	*resultp = '\0';
-
-	free(data);
-	return result;
-}
-
 /*
  * make a copy of the array of lines, with token replaced by replacement
  * the first time it occurs on each line.
@@ -1363,7 +1337,6 @@ bootstrap_template1(void)
 	char	  **line;
 	char	  **bki_lines;
 	char		headerline[MAXPGPATH];
-	char		buf[64];
 
 	printf(_("running bootstrap script ... "));
 	fflush(stdout);
@@ -1385,32 +1358,6 @@ bootstrap_template1(void)
 		exit(1);
 	}
 
-	/* Substitute for various symbols used in the BKI file */
-
-	sprintf(buf, "%d", NAMEDATALEN);
-	bki_lines = replace_token(bki_lines, "NAMEDATALEN", buf);
-
-	sprintf(buf, "%d", (int) sizeof(Pointer));
-	bki_lines = replace_token(bki_lines, "SIZEOF_POINTER", buf);
-
-	bki_lines = replace_token(bki_lines, "ALIGNOF_POINTER",
-							  (sizeof(Pointer) == 4) ? "i" : "d");
-
-	bki_lines = replace_token(bki_lines, "FLOAT8PASSBYVAL",
-							  FLOAT8PASSBYVAL ? "true" : "false");
-
-	bki_lines = replace_token(bki_lines, "POSTGRES",
-							  escape_quotes_bki(username));
-
-	bki_lines = replace_token(bki_lines, "ENCODING",
-							  encodingid_to_string(encodingid));
-
-	bki_lines = replace_token(bki_lines, "LC_COLLATE",
-							  escape_quotes_bki(lc_collate));
-
-	bki_lines = replace_token(bki_lines, "LC_CTYPE",
-							  escape_quotes_bki(lc_ctype));
-
 	/* Also ensure backend isn't confused by this environment var: */
 	unsetenv("PGCLIENTENCODING");
 
@@ -1628,12 +1575,11 @@ setup_collation(FILE *cmdfd)
 static void
 setup_privileges(FILE *cmdfd)
 {
-	char	  **line;
-	char	  **priv_lines;
-	static char *privileges_setup[] = {
+	const char *const *line;
+	static const char *const privileges_setup[] = {
 		"UPDATE pg_class "
 		"  SET relacl = (SELECT array_agg(a.acl) FROM "
-		" (SELECT E'=r/\"$POSTGRES_SUPERUSERNAME\"' as acl "
+		" (SELECT '=r/\"POSTGRES\"' as acl "
 		"  UNION SELECT unnest(pg_catalog.acldefault("
 		"    CASE WHEN relkind = " CppAsString2(RELKIND_SEQUENCE) " THEN 's' "
 		"         ELSE 'r' END::\"char\"," CppAsString2(BOOTSTRAP_SUPERUSERID) "::oid))"
@@ -1765,9 +1711,7 @@ setup_privileges(FILE *cmdfd)
 		NULL
 	};
 
-	priv_lines = replace_token(privileges_setup, "$POSTGRES_SUPERUSERNAME",
-							   escape_quotes(username));
-	for (line = priv_lines; *line != NULL; line++)
+	for (line = privileges_setup; *line != NULL; line++)
 		PG_CMD_PUTS(*line);
 }
 
@@ -1828,6 +1772,48 @@ load_plpgsql(FILE *cmdfd)
 	PG_CMD_PUTS("CREATE EXTENSION plpgsql;\n\n");
 }
 
+/*
+ * Set some remaining details that aren't known when postgres.bki is made.
+ *
+ * Up to now, the bootstrap superuser has been named "POSTGRES".
+ * Replace that with the user-specified name (often "postgres").
+ * Also, insert the desired locale and encoding details in pg_database.
+ *
+ * Note: this must run after setup_privileges(), which expects the superuser
+ * name to still be "POSTGRES".
+ */
+static void
+set_remaining_details(FILE *cmdfd)
+{
+	char	  **line;
+	char	  **detail_lines;
+
+	/*
+	 * Ideally we'd change the superuser name with ALTER USER, but the backend
+	 * will reject that with "session user cannot be renamed", so we must
+	 * cheat.  (In any case, we'd need a function to escape an identifier, not
+	 * a string literal.)  Likewise, we can't change template1's
+	 * locale/encoding without cheating.
+	 */
+	static char *final_details[] = {
+		"UPDATE pg_authid SET rolname = E'SUPERUSER_NAME' WHERE rolname = 'POSTGRES';\n\n",
+		"UPDATE pg_database SET encoding = E'ENCODING', datcollate = E'LC_COLLATE', datctype = E'LC_CTYPE';\n\n",
+		NULL
+	};
+
+	detail_lines = replace_token(final_details, "SUPERUSER_NAME",
+								 escape_quotes(username));
+	detail_lines = replace_token(detail_lines, "ENCODING",
+								 encodingid_to_string(encodingid));
+	detail_lines = replace_token(detail_lines, "LC_COLLATE",
+								 escape_quotes(lc_collate));
+	detail_lines = replace_token(detail_lines, "LC_CTYPE",
+								 escape_quotes(lc_ctype));
+
+	for (line = detail_lines; *line != NULL; line++)
+		PG_CMD_PUTS(*line);
+}
+
 /*
  * clean everything up in template1
  */
@@ -2857,6 +2843,8 @@ initialize_data_directory(void)
 
 	load_plpgsql(cmdfd);
 
+	set_remaining_details(cmdfd);
+
 	vacuum_db(cmdfd);
 
 	make_template0(cmdfd);
-- 
2.34.0
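
As a standalone illustration of the substitution that 0002 moves into InsertOneValue(), the sketch below maps the symbolic BKI values to build-time constants. The SKETCH_* macros and the function name substitute_bki_symbol are assumptions made so the example compiles on its own; the patch uses NAMEDATALEN, SIZEOF_VOID_P, FLOAT8PASSBYVAL and CppAsString2() from the PostgreSQL headers.

```c
#include <assert.h>
#include <string.h>

/* Assumed build-time values, for illustration only. */
#define SKETCH_NAMEDATALEN		64
#define SKETCH_SIZEOF_VOID_P	8
#define SKETCH_FLOAT8PASSBYVAL	1

/* Two-level stringification, playing the role of CppAsString2(). */
#define SKETCH_STR_(x)	#x
#define SKETCH_STR(x)	SKETCH_STR_(x)

/*
 * Return the replacement for a symbolic BKI value, or the value itself if
 * it is not one of the known symbols.  Hypothetical helper; the patch does
 * this inline on the "value" argument of InsertOneValue().
 */
static const char *
substitute_bki_symbol(const char *value)
{
	assert(value != NULL);

	if (strcmp(value, "NAMEDATALEN") == 0)
		return SKETCH_STR(SKETCH_NAMEDATALEN);
	if (strcmp(value, "SIZEOF_POINTER") == 0)
		return SKETCH_STR(SKETCH_SIZEOF_VOID_P);
	if (strcmp(value, "ALIGNOF_POINTER") == 0)
		return (SKETCH_SIZEOF_VOID_P == 4) ? "i" : "d";
	if (strcmp(value, "FLOAT8PASSBYVAL") == 0)
		return SKETCH_FLOAT8PASSBYVAL ? "true" : "false";

	return value;
}
```

Because the match is on the whole column value rather than on a substring of the line, this differs from initdb's replace_token(), which rewrites the first occurrence of the token anywhere in a line.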

v1-0003-initdb-perform-everything-during-boot-mostly-in-b.patch (text/x-diff; charset=us-ascii)

From ed39cf53788242d6b6990497ae86250dd658d26c Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Sun, 20 Feb 2022 11:03:20 -0800
Subject: [PATCH v1 3/5] initdb: perform everything during --boot, mostly in
 backend.

Author:
Reviewed-By:
Discussion: https://postgr.es/m/
Backpatch:
---
 src/include/bootstrap/bootstrap.h   |   1 +
 src/include/commands/extension.h    |   2 +
 src/include/tcop/tcopprot.h         |   1 +
 src/include/utils/relcache.h        |   1 +
 src/backend/bootstrap/bootscanner.l |   6 +
 src/backend/bootstrap/bootstrap.c   | 306 +++++++++++++++-
 src/backend/catalog/Makefile        |   2 +
 src/backend/catalog/description.sql |  16 +
 src/backend/catalog/privileges.sql  | 154 ++++++++
 src/backend/commands/extension.c    |   5 +-
 src/backend/main/main.c             |   1 +
 src/backend/tcop/postgres.c         |  12 +
 src/backend/utils/cache/relcache.c  |  11 +-
 src/bin/initdb/initdb.c             | 532 ++--------------------------
 14 files changed, 535 insertions(+), 515 deletions(-)
 create mode 100644 src/backend/catalog/description.sql
 create mode 100644 src/backend/catalog/privileges.sql

diff --git a/src/include/bootstrap/bootstrap.h b/src/include/bootstrap/bootstrap.h
index 471414909f3..f94e9339373 100644
--- a/src/include/bootstrap/bootstrap.h
+++ b/src/include/bootstrap/bootstrap.h
@@ -54,6 +54,7 @@ extern void boot_get_type_io_data(Oid typid,
 								  Oid *typinput,
 								  Oid *typoutput);
 
+extern void boot_input(FILE *file);
 extern int	boot_yyparse(void);
 
 extern int	boot_yylex(void);
diff --git a/src/include/commands/extension.h b/src/include/commands/extension.h
index e24e3759f0c..cd1668e5865 100644
--- a/src/include/commands/extension.h
+++ b/src/include/commands/extension.h
@@ -48,6 +48,8 @@ extern ObjectAddress ExecAlterExtensionContentsStmt(AlterExtensionContentsStmt *
 extern Oid	get_extension_oid(const char *extname, bool missing_ok);
 extern char *get_extension_name(Oid ext_oid);
 extern bool extension_file_exists(const char *extensionName);
+extern void execute_sql_string(const char *sql);
+extern char *read_whole_file(const char *filename, int *length);
 
 extern ObjectAddress AlterExtensionNamespace(const char *extensionName, const char *newschema,
 											 Oid *oldschema);
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index 15a11bc3ff1..47c8ff283e4 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -60,6 +60,7 @@ extern PlannedStmt *pg_plan_query(Query *querytree, const char *query_string,
 extern List *pg_plan_queries(List *querytrees, const char *query_string,
 							 int cursorOptions,
 							 ParamListInfo boundParams);
+extern void exec_simple_query_bootstrap(const char *query_string);
 
 extern bool check_max_stack_depth(int *newval, void **extra, GucSource source);
 extern void assign_max_stack_depth(int newval, void *extra);
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 84d6afef19b..e400146f648 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -93,6 +93,7 @@ extern int	errtableconstraint(Relation rel, const char *conname);
 extern void RelationCacheInitialize(void);
 extern void RelationCacheInitializePhase2(void);
 extern void RelationCacheInitializePhase3(void);
+extern void RelationCacheInitializePhase3b(bool needNewCacheFile);
 
 /*
  * Routine to create a relcache entry for an about-to-be-created relation
diff --git a/src/backend/bootstrap/bootscanner.l b/src/backend/bootstrap/bootscanner.l
index 3094ccb93f4..72c3f40a88f 100644
--- a/src/backend/bootstrap/bootscanner.l
+++ b/src/backend/bootstrap/bootscanner.l
@@ -125,3 +125,9 @@ yyerror(const char *message)
 {
 	elog(ERROR, "%s at line %d", message, yyline);
 }
+
+void
+boot_input(FILE *file)
+{
+	yyin = file;
+}
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index 667c829064d..3fd8ed60715 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -2,7 +2,10 @@
  *
  * bootstrap.c
  *	  routines to support running postgres in 'bootstrap' mode
- *	bootstrap mode is used to create the initial template database
+ *
+ * bootstrap mode is used to create the initial template1 database, perform
+ * additional initialization of it via SQL scripts, and then create template0
+ * and postgres from template1.
  *
  * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -26,10 +29,14 @@
 #include "access/xlog_internal.h"
 #include "bootstrap/bootstrap.h"
 #include "catalog/index.h"
+#include "catalog/pg_authid_d.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_type.h"
+#include "commands/extension.h"
 #include "common/link-canary.h"
+#include "common/string.h"
 #include "libpq/pqsignal.h"
+#include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
 #include "pg_getopt.h"
@@ -41,6 +48,7 @@
 #include "tcop/tcopprot.h"
 #include "utils/builtins.h"
 #include "utils/fmgroids.h"
+#include "utils/inval.h"
 #include "utils/memutils.h"
 #include "utils/rel.h"
 #include "utils/relmapper.h"
@@ -55,6 +63,12 @@ static void populate_typ_list(void);
 static Oid	gettype(char *type);
 static void cleanup(void);
 
+static void bootstrap_load_nonbki(const char *share_path);
+static void bootstrap_create_databases(void);
+
+static void exec_sql(const char *share_path, const char *str);
+static void exec_sql_file(const char *share_path, const char *str);
+
 /* ----------------
  *		global variables
  * ----------------
@@ -206,6 +220,7 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
 	char	   *progname = argv[0];
 	int			flag;
 	char	   *userDoption = NULL;
+	char	   *share_path = NULL;
 
 	Assert(!IsUnderPostmaster);
 
@@ -221,7 +236,12 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
 	argv++;
 	argc--;
 
-	while ((flag = getopt(argc, argv, "B:c:d:D:Fkr:X:-:")) != -1)
+	/*
+	 * XXX: -s for share_path is probably a bad choice, it conflicts with a
+	 * normal postgres option. Also, should probably just determine share path
+	 * ourselves.
+	 */
+	while ((flag = getopt(argc, argv, "B:c:d:D:s:Fkr:X:-:")) != -1)
 	{
 		switch (flag)
 		{
@@ -247,6 +267,9 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
 			case 'F':
 				SetConfigOption("fsync", "false", PGC_POSTMASTER, PGC_S_ARGV);
 				break;
+			case 's':
+			    share_path = optarg;
+				break;
 			case 'k':
 				bootstrap_data_checksum_version = PG_DATA_CHECKSUM_VERSION;
 				break;
@@ -338,6 +361,12 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
 		abort();
 	}
 
+	if (share_path == NULL)
+	{
+		write_stderr("%s: -s is required in --boot mode\n", progname);
+		proc_exit(1);
+	}
+
 	/*
 	 * Do backend-like initialization for bootstrap mode
 	 */
@@ -365,11 +394,34 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
 	}
 
 	/*
-	 * Process bootstrap input.
+	 * Process bootstrap file to create initial template1 contents.
 	 */
-	StartTransactionCommand();
-	boot_yyparse();
-	CommitTransactionCommand();
+	{
+		char bootstrap_file[MAXPGPATH];
+		FILE *boot;
+		instr_time start_ts, boot_ts;
+
+		INSTR_TIME_SET_CURRENT(start_ts);
+
+		sprintf(bootstrap_file, "%s/%s", share_path, "postgres.bki");
+
+		boot = fopen(bootstrap_file, "r");
+		if (boot == NULL)
+			elog(ERROR, "could not open bootstrap file \"%s\": %m",
+				 bootstrap_file);
+		boot_input(boot);
+
+		StartTransactionCommand();
+		boot_yyparse();
+		CommitTransactionCommand();
+
+		fclose(boot);
+
+		INSTR_TIME_SET_CURRENT(boot_ts);
+
+		elog(LOG, "boot in %.3f ms",
+			 (INSTR_TIME_GET_DOUBLE(boot_ts) - INSTR_TIME_GET_DOUBLE(start_ts)) * 1000);
+	}
 
 	/*
 	 * We should now know about all mapped relations, so it's okay to write
@@ -377,11 +429,251 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
 	 */
 	RelationMapFinishBootstrap();
 
-	/* Clean up and exit */
+	/* Clean up and exit FIXME */
 	cleanup();
+
+	/*
+	 * Now that the catalog is populated with crucial contents, bring up
+	 * system caches into a fully valid state.
+	 */
+
+	SetProcessingMode(InitProcessing);
+
+	IgnoreSystemIndexes = false;
+
+	StartTransactionCommand();
+	/* Seeing odd "row is too big:" failures without */
+	InvalidateSystemCaches();
+	/* FIXME: speechless-making API */
+	RelationCacheInitializePhase3b(true);
+	CommitTransactionCommand();
+
+	/*
+	 * Load further catalog contents by running a bunch of SQL commands.
+	 */
+	SetProcessingMode(NormalProcessing);
+
+	bootstrap_load_nonbki(share_path);
+
+	bootstrap_create_databases();
+
 	proc_exit(0);
 }
 
+/*
+ * Create template0 and postgres from template1.
+ *
+ * XXX: Several of the statements contain commands that cannot be executed in
+ * a transaction (VACUUM, CREATE DATABASE) and thus require a fairly
+ * complicated dance to maintain correct state. The easiest is to just rely on
+ * exec_simple_query() (via a wrapper) for that. Don't want to do that for
+ * everything else, because it's considerably faster to use exec_sql().
+ */
+static void
+bootstrap_create_databases(void)
+{
+	/*
+	 * pg_upgrade tries to preserve database OIDs across upgrades. It's smart
+	 * enough to drop and recreate a conflicting database with the same name,
+	 * but if the same OID were used for one system-created database in the
+	 * old cluster and a different system-created database in the new cluster,
+	 * it would fail. To avoid that, assign a fixed OID to template0 rather
+	 * than letting the server choose one.
+	 *
+	 * (Note that, while the user could have dropped and recreated these
+	 * objects in the old cluster, the problem scenario only exists if the OID
+	 * that is in use in the old cluster is also used in the new cluster - and
+	 * the new cluster should be the result of a fresh initdb.)
+	 */
+	static const char *const template0_setup[] = {
+		"CREATE DATABASE template0 IS_TEMPLATE = true ALLOW_CONNECTIONS = false OID = "
+		CppAsString2(Template0ObjectId) ";\n",
+
+		/*
+		 * template0 shouldn't have any collation-dependent objects, so unset
+		 * the collation version.  This disables collation version checks when
+		 * making a new database from it.
+		 */
+		"UPDATE pg_database SET datcollversion = NULL WHERE datname = 'template0';\n",
+
+		/*
+		 * While we are here, do set the collation version on template1.
+		 */
+		"UPDATE pg_database SET datcollversion = pg_database_collation_actual_version(oid) WHERE datname = 'template1';\n",
+
+		/*
+		 * Explicitly revoke public create-schema and create-temp-table
+		 * privileges in template1 and template0; else the latter would be on
+		 * by default
+		 */
+		"REVOKE CREATE,TEMPORARY ON DATABASE template1 FROM public;\n",
+		"REVOKE CREATE,TEMPORARY ON DATABASE template0 FROM public;\n",
+
+		"COMMENT ON DATABASE template0 IS 'unmodifiable empty database';\n",
+		NULL
+	};
+
+	/* Assign a fixed OID to postgres, for the same reasons as template0 */
+	static const char *const postgres_setup[] = {
+		"CREATE DATABASE postgres OID = " CppAsString2(PostgresObjectId) ";\n",
+		"COMMENT ON DATABASE postgres IS 'default administrative connection database';\n",
+		NULL
+	};
+	instr_time start_ts, created_ts;
+
+	INSTR_TIME_SET_CURRENT(start_ts);
+
+
+	MessageContext = AllocSetContextCreate(TopMemoryContext,
+										   "MessageContext",
+										   ALLOCSET_DEFAULT_SIZES);
+
+	/*
+	 * clean everything up in template1
+	 */
+	exec_simple_query_bootstrap("VACUUM FREEZE");
+
+	/*
+	 * copy template1 to template0
+	 */
+	for (const char *const *line = template0_setup; *line; line++)
+		exec_simple_query_bootstrap(*line);
+
+	/*
+	 * copy template1 to postgres
+	 */
+	for (const char *const *line = postgres_setup; *line; line++)
+		exec_simple_query_bootstrap(*line);
+
+	/*
+	 * Finally vacuum to clean up dead rows in pg_database
+	 */
+	exec_simple_query_bootstrap("VACUUM pg_database");
+
+	MemoryContextDelete(MessageContext);
+	MessageContext = NULL;
+
+	INSTR_TIME_SET_CURRENT(created_ts);
+
+	elog(LOG, "created template0 and postgres in %.3f ms",
+		 (INSTR_TIME_GET_DOUBLE(created_ts) - INSTR_TIME_GET_DOUBLE(start_ts)) * 1000);
+}
+
+static void
+bootstrap_load_nonbki(const char *share_path)
+{
+	StringInfoData sql;
+
+	initStringInfo(&sql);
+
+	StartTransactionCommand();
+	PushActiveSnapshot(GetTransactionSnapshot());
+
+	exec_sql_file(share_path, "system_constraints.sql");
+	exec_sql_file(share_path, "system_functions.sql");
+
+	/*
+	 * set up the shadow password table
+	 */
+	exec_sql("pg_authid", "REVOKE ALL ON pg_authid FROM public;");
+
+	/*
+	 * Advance the OID counter so that subsequently-created objects aren't
+	 * pinned. Subsequent objects are all droppable at the whim of the DBA.
+	 */
+	StopGeneratingPinnedObjectIds();
+
+	exec_sql_file(share_path, "system_views.sql");
+
+	exec_sql_file(share_path, "description.sql");
+
+	/* populate pg_collation */
+	{
+		/*
+		 * Add an SQL-standard name.  We don't want to pin this, so it doesn't go
+		 * in pg_collation.h.  But add it before reading system collations, so
+		 * that it wins if libc defines a locale named ucs_basic.
+		 */
+		appendStringInfo(&sql,
+						 "INSERT INTO pg_collation (oid, collname, "
+						 "    collnamespace, collowner, "
+						 "    collprovider, collisdeterministic, collencoding, "
+						 "    collcollate, collctype) "
+						 "VALUES ("
+						 "    pg_nextoid('pg_catalog.pg_collation', 'oid', "
+						 "        'pg_catalog.pg_collation_oid_index'), "
+						 "    'ucs_basic', 'pg_catalog'::regnamespace, %u, "
+						 "    '%c', true, %d, 'C', 'C');",
+						 BOOTSTRAP_SUPERUSERID, COLLPROVIDER_LIBC, PG_UTF8);
+		exec_sql("pg_collation", sql.data);
+		resetStringInfo(&sql);
+
+		/* Now import all collations we can find in the operating system */
+		exec_sql("import collations",
+				 "SELECT pg_import_system_collations('pg_catalog');");
+	}
+
+	exec_sql_file(share_path, "snowball_create.sql");
+
+	exec_sql_file(share_path, "privileges.sql");
+
+	exec_sql_file(share_path, "information_schema.sql");
+
+	exec_sql("plpgsql", "CREATE EXTENSION plpgsql;");
+
+	/*
+	 * Process SQL coming from initdb. This includes things like setting
+	 * up passwords, which would be a bit of a pain to move to the backend.
+	 */
+	while (true)
+	{
+		if (!pg_get_line_buf(stdin, &sql))
+			break;
+
+		/* XXX: need a better descriptor than "more" */
+		exec_sql("more", sql.data);
+		resetStringInfo(&sql);
+	}
+
+	/* Run analyze before VACUUM so the statistics are frozen. */
+	exec_sql("analyze", "ANALYZE");
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+}
+
+static void
+exec_sql(const char *name, const char *sql)
+{
+	instr_time start_ts, exec_ts;
+
+	INSTR_TIME_SET_CURRENT(start_ts);
+
+	execute_sql_string(sql);
+
+	INSTR_TIME_SET_CURRENT(exec_ts);
+
+	elog(LOG, "exec %s in %.3f ms",
+		 name,
+		 (INSTR_TIME_GET_DOUBLE(exec_ts) - INSTR_TIME_GET_DOUBLE(start_ts)) * 1000);
+}
+
+static void
+exec_sql_file(const char *share_path, const char *filename)
+{
+	int length;
+	char *str;
+	char filepath[MAXPGPATH];
+
+	snprintf(filepath, sizeof(filepath), "%s/%s", share_path, filename);
+
+	str = read_whole_file(filepath, &length);
+
+	exec_sql(filename, str);
+
+	pfree(str);
+}
+
 
 /* ----------------------------------------------------------------
  *						misc functions
diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile
index eefebb7bb83..d014e52968a 100644
--- a/src/backend/catalog/Makefile
+++ b/src/backend/catalog/Makefile
@@ -125,6 +125,8 @@ install-data: bki-stamp installdirs
 	$(INSTALL_DATA) $(call vpathsearch,system_constraints.sql) '$(DESTDIR)$(datadir)/system_constraints.sql'
 	$(INSTALL_DATA) $(srcdir)/system_functions.sql '$(DESTDIR)$(datadir)/system_functions.sql'
 	$(INSTALL_DATA) $(srcdir)/system_views.sql '$(DESTDIR)$(datadir)/system_views.sql'
+	$(INSTALL_DATA) $(srcdir)/description.sql '$(DESTDIR)$(datadir)/description.sql'
+	$(INSTALL_DATA) $(srcdir)/privileges.sql '$(DESTDIR)$(datadir)/privileges.sql'
 	$(INSTALL_DATA) $(srcdir)/information_schema.sql '$(DESTDIR)$(datadir)/information_schema.sql'
 	$(INSTALL_DATA) $(srcdir)/sql_features.txt '$(DESTDIR)$(datadir)/sql_features.txt'
 
diff --git a/src/backend/catalog/description.sql b/src/backend/catalog/description.sql
new file mode 100644
index 00000000000..b46a3094452
--- /dev/null
+++ b/src/backend/catalog/description.sql
@@ -0,0 +1,16 @@
+/* Create default descriptions for operator implementation functions */
+WITH funcdescs AS (
+    SELECT p.oid as p_oid, o.oid as o_oid, oprname
+    FROM pg_proc p  JOIN pg_operator o ON oprcode = p.oid
+)
+INSERT INTO pg_description
+    SELECT p_oid, 'pg_proc'::regclass, 0,
+        'implementation of ' || oprname || ' operator'
+    FROM funcdescs
+    WHERE NOT EXISTS (
+           SELECT 1 FROM pg_description
+           WHERE objoid = p_oid AND classoid = 'pg_proc'::regclass)
+        AND NOT EXISTS (
+            SELECT 1 FROM pg_description
+            WHERE objoid = o_oid AND classoid = 'pg_operator'::regclass
+                AND description LIKE 'deprecated%');
diff --git a/src/backend/catalog/privileges.sql b/src/backend/catalog/privileges.sql
new file mode 100644
index 00000000000..3843c970df5
--- /dev/null
+++ b/src/backend/catalog/privileges.sql
@@ -0,0 +1,154 @@
+/*
+ * Set up privileges
+ *
+ * We mark most system catalogs as world-readable.  We don't currently have
+ * to touch functions, languages, or databases, because their default
+ * permissions are OK.
+ *
+ * Some objects may require different permissions by default, so we
+ * make sure we don't overwrite privilege sets that have already been
+ * set (NOT NULL).
+ *
+ * Also populate pg_init_privs to save what the privileges are at init
+ * time.  This is used by pg_dump to allow users to change privileges
+ * on catalog objects and to have those privilege changes preserved
+ * across dump/reload and pg_upgrade.
+ *
+ * Note that pg_init_privs is only for per-database objects and therefore
+ * we don't include databases or tablespaces.
+ */
+
+UPDATE pg_class
+  SET relacl = (SELECT array_agg(a.acl) FROM
+ (SELECT '=r/"POSTGRES"' as acl
+  UNION SELECT unnest(pg_catalog.acldefault(
+    CASE WHEN relkind = 'S' THEN 's'
+         ELSE 'r' END::"char", 10::oid)) -- FIXME, inlined BOOTSTRAP_SUPERUSERID
+ ) as a)
+  WHERE relkind IN ('r', 'v', 'm','S')
+  AND relacl IS NULL;
+
+GRANT USAGE ON SCHEMA pg_catalog, public TO PUBLIC;
+REVOKE ALL ON pg_largeobject FROM PUBLIC;
+INSERT INTO pg_init_privs
+  (objoid, classoid, objsubid, initprivs, privtype)
+    SELECT
+        oid,
+        (SELECT oid FROM pg_class WHERE relname = 'pg_class'),
+        0,
+        relacl,
+        'i'
+    FROM
+        pg_class
+    WHERE
+        relacl IS NOT NULL
+        AND relkind IN ('r', 'v', 'm','S');
+
+INSERT INTO pg_init_privs
+  (objoid, classoid, objsubid, initprivs, privtype)
+    SELECT
+        pg_class.oid,
+        (SELECT oid FROM pg_class WHERE relname = 'pg_class'),
+        pg_attribute.attnum,
+        pg_attribute.attacl,
+        'i'
+    FROM
+        pg_class
+        JOIN pg_attribute ON (pg_class.oid = pg_attribute.attrelid)
+    WHERE
+        pg_attribute.attacl IS NOT NULL
+        AND pg_class.relkind IN ('r', 'v', 'm', 'S');
+
+INSERT INTO pg_init_privs
+  (objoid, classoid, objsubid, initprivs, privtype)
+    SELECT
+        oid,
+        (SELECT oid FROM pg_class WHERE relname = 'pg_proc'),
+        0,
+        proacl,
+        'i'
+    FROM
+        pg_proc
+    WHERE
+        proacl IS NOT NULL;
+
+INSERT INTO pg_init_privs
+  (objoid, classoid, objsubid, initprivs, privtype)
+    SELECT
+        oid,
+        (SELECT oid FROM pg_class WHERE relname = 'pg_type'),
+        0,
+        typacl,
+        'i'
+    FROM
+        pg_type
+    WHERE
+        typacl IS NOT NULL;
+
+INSERT INTO pg_init_privs
+  (objoid, classoid, objsubid, initprivs, privtype)
+    SELECT
+        oid,
+        (SELECT oid FROM pg_class WHERE relname = 'pg_language'),
+        0,
+        lanacl,
+        'i'
+    FROM
+        pg_language
+    WHERE
+        lanacl IS NOT NULL;
+
+INSERT INTO pg_init_privs
+  (objoid, classoid, objsubid, initprivs, privtype)
+    SELECT
+        oid,
+        (SELECT oid FROM pg_class WHERE
+         relname = 'pg_largeobject_metadata'),
+        0,
+        lomacl,
+        'i'
+    FROM
+        pg_largeobject_metadata
+    WHERE
+        lomacl IS NOT NULL;
+
+INSERT INTO pg_init_privs
+  (objoid, classoid, objsubid, initprivs, privtype)
+    SELECT
+        oid,
+        (SELECT oid FROM pg_class WHERE relname = 'pg_namespace'),
+        0,
+        nspacl,
+        'i'
+    FROM
+        pg_namespace
+    WHERE
+        nspacl IS NOT NULL;
+
+INSERT INTO pg_init_privs
+  (objoid, classoid, objsubid, initprivs, privtype)
+    SELECT
+        oid,
+        (SELECT oid FROM pg_class WHERE
+         relname = 'pg_foreign_data_wrapper'),
+        0,
+        fdwacl,
+        'i'
+    FROM
+        pg_foreign_data_wrapper
+    WHERE
+        fdwacl IS NOT NULL;
+
+INSERT INTO pg_init_privs
+  (objoid, classoid, objsubid, initprivs, privtype)
+    SELECT
+        oid,
+        (SELECT oid FROM pg_class
+         WHERE relname = 'pg_foreign_server'),
+        0,
+        srvacl,
+        'i'
+    FROM
+        pg_foreign_server
+    WHERE
+        srvacl IS NOT NULL;
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 0e04304cb09..c46e607bccb 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -127,7 +127,6 @@ static void ApplyExtensionUpdates(Oid extensionOid,
 								  char *origSchemaName,
 								  bool cascade,
 								  bool is_create);
-static char *read_whole_file(const char *filename, int *length);
 
 
 /*
@@ -716,7 +715,7 @@ read_extension_script_file(const ExtensionControlFile *control,
  * on printing the whole string as errcontext in case of any error, and that
  * could be very long.
  */
-static void
+void
 execute_sql_string(const char *sql)
 {
 	List	   *raw_parsetree_list;
@@ -3429,7 +3428,7 @@ ExecAlterExtensionContentsStmt(AlterExtensionContentsStmt *stmt,
  * The file contents are returned as a single palloc'd chunk. For convenience
  * of the callers, an extra \0 byte is added to the end.
  */
-static char *
+char *
 read_whole_file(const char *filename, int *length)
 {
 	char	   *buf;
diff --git a/src/backend/main/main.c b/src/backend/main/main.c
index 3d67ce9dcea..3b28f9c57d0 100644
--- a/src/backend/main/main.c
+++ b/src/backend/main/main.c
@@ -394,6 +394,7 @@ help(const char *progname)
 	printf(_("  --check            selects check mode (must be first argument)\n"));
 	printf(_("  DBNAME             database name (mandatory argument in bootstrapping mode)\n"));
 	printf(_("  -r FILENAME        send stdout and stderr to given file\n"));
+	printf(_("  -s SHAREPATH       path to share directory\n"));
 
 	printf(_("\nPlease read the documentation for the complete list of run-time\n"
 			 "configuration settings and how to set them on the command line or in\n"
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 3c7d08209f3..1e48645a748 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1314,6 +1314,18 @@ exec_simple_query(const char *query_string)
 	debug_query_string = NULL;
 }
 
+/* just for bootstrap */
+void
+exec_simple_query_bootstrap(const char *query_string)
+{
+	MemoryContextSwitchTo(MessageContext);
+	SetCurrentStatementStartTimestamp();
+
+	exec_simple_query(query_string);
+
+	MemoryContextReset(MessageContext);
+}
+
 /*
  * exec_parse_message
  *
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 2707fed12f4..6b4b14819c1 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -3949,8 +3949,6 @@ RelationCacheInitializePhase2(void)
 void
 RelationCacheInitializePhase3(void)
 {
-	HASH_SEQ_STATUS status;
-	RelIdCacheEnt *idhentry;
 	MemoryContext oldcxt;
 	bool		needNewCacheFile = !criticalSharedRelcachesBuilt;
 
@@ -3992,6 +3990,15 @@ RelationCacheInitializePhase3(void)
 	if (IsBootstrapProcessingMode())
 		return;
 
+	RelationCacheInitializePhase3b(needNewCacheFile);
+}
+
+void
+RelationCacheInitializePhase3b(bool needNewCacheFile)
+{
+	HASH_SEQ_STATUS status;
+	RelIdCacheEnt *idhentry;
+
 	/*
 	 * If we didn't get the critical system indexes loaded into relcache, do
 	 * so now.  These are critical because the catcache and/or opclass cache
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 37ac928b2ef..bff2e5cb407 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -76,6 +76,7 @@
 #include "getopt_long.h"
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
+#include "portability/instr_time.h"
 
 
 /* Ideally this would be in a .h file, but it hardly seems worth the trouble */
@@ -158,12 +159,7 @@ static char *bki_file;
 static char *hba_file;
 static char *ident_file;
 static char *conf_file;
-static char *dictionary_file;
-static char *info_schema_file;
 static char *features_file;
-static char *system_constraints_file;
-static char *system_functions_file;
-static char *system_views_file;
 static bool success = false;
 static bool made_new_pgdata = false;
 static bool found_existing_pgdata = false;
@@ -201,8 +197,7 @@ static bool authwarning = false;
  * but here it is more convenient to pass it as an environment variable
  * (no quoting to worry about).
  */
-static const char *boot_options = "-F -c log_checkpoints=false";
-static const char *backend_options = "--single -F -O -j -c search_path=pg_catalog -c exit_on_error=true -c log_checkpoints=false";
+static const char *boot_options = " -F -c allow_system_table_mods=true -c search_path=pg_catalog -c exit_on_error=true";
 
 /* Additional switches to pass to backend (either boot or standalone) */
 static char *extra_options = "";
@@ -254,21 +249,10 @@ static void write_version_file(const char *extrapath);
 static void set_null_conf(void);
 static void test_config_settings(void);
 static void setup_config(void);
-static void bootstrap_template1(void);
-static void setup_auth(FILE *cmdfd);
 static void get_su_pwd(void);
-static void setup_depend(FILE *cmdfd);
-static void setup_run_file(FILE *cmdfd, const char *filename);
-static void setup_description(FILE *cmdfd);
-static void setup_collation(FILE *cmdfd);
-static void setup_privileges(FILE *cmdfd);
 static void set_info_version(void);
 static void setup_schema(FILE *cmdfd);
-static void load_plpgsql(FILE *cmdfd);
 static void set_remaining_details(FILE *cmdfd);
-static void vacuum_db(FILE *cmdfd);
-static void make_template0(FILE *cmdfd);
-static void make_postgres(FILE *cmdfd);
 static void trapsig(int signum);
 static void check_ok(void);
 static char *escape_quotes(const char *src);
@@ -1326,89 +1310,6 @@ setup_config(void)
 	check_ok();
 }
 
-
-/*
- * run the BKI script in bootstrap mode to create template1
- */
-static void
-bootstrap_template1(void)
-{
-	PG_CMD_DECL;
-	char	  **line;
-	char	  **bki_lines;
-	char		headerline[MAXPGPATH];
-
-	printf(_("running bootstrap script ... "));
-	fflush(stdout);
-
-	bki_lines = readfile(bki_file);
-
-	/* Check that bki file appears to be of the right version */
-
-	snprintf(headerline, sizeof(headerline), "# PostgreSQL %s\n",
-			 PG_MAJORVERSION);
-
-	if (strcmp(headerline, *bki_lines) != 0)
-	{
-		pg_log_error("input file \"%s\" does not belong to PostgreSQL %s",
-					 bki_file, PG_VERSION);
-		fprintf(stderr,
-				_("Check your installation or specify the correct path "
-				  "using the option -L.\n"));
-		exit(1);
-	}
-
-	/* Also ensure backend isn't confused by this environment var: */
-	unsetenv("PGCLIENTENCODING");
-
-	snprintf(cmd, sizeof(cmd),
-			 "\"%s\" --boot -X %d %s %s %s %s",
-			 backend_exec,
-			 wal_segment_size_mb * (1024 * 1024),
-			 data_checksums ? "-k" : "",
-			 boot_options, extra_options,
-			 debug ? "-d 5" : "");
-
-
-	PG_CMD_OPEN;
-
-	for (line = bki_lines; *line != NULL; line++)
-	{
-		PG_CMD_PUTS(*line);
-		free(*line);
-	}
-
-	PG_CMD_CLOSE;
-
-	free(bki_lines);
-
-	check_ok();
-}
-
-/*
- * set up the shadow password table
- */
-static void
-setup_auth(FILE *cmdfd)
-{
-	const char *const *line;
-	static const char *const pg_authid_setup[] = {
-		/*
-		 * The authid table shouldn't be readable except through views, to
-		 * ensure passwords are not publicly visible.
-		 */
-		"REVOKE ALL ON pg_authid FROM public;\n\n",
-		NULL
-	};
-
-	for (line = pg_authid_setup; *line != NULL; line++)
-		PG_CMD_PUTS(*line);
-
-	if (superuser_password)
-		PG_CMD_PRINTF("ALTER USER \"%s\" WITH PASSWORD E'%s';\n\n",
-					  username, escape_quotes(superuser_password));
-}
-
 /*
  * get the superuser password if required
  */
@@ -1472,249 +1373,6 @@ get_su_pwd(void)
 	superuser_password = pwd1;
 }
 
-/*
- * set up pg_depend
- */
-static void
-setup_depend(FILE *cmdfd)
-{
-	const char *const *line;
-	static const char *const pg_depend_setup[] = {
-		/*
-		 * Advance the OID counter so that subsequently-created objects aren't
-		 * pinned.
-		 */
-		"SELECT pg_stop_making_pinned_objects();\n\n",
-		NULL
-	};
-
-	for (line = pg_depend_setup; *line != NULL; line++)
-		PG_CMD_PUTS(*line);
-}
-
-/*
- * Run external file
- */
-static void
-setup_run_file(FILE *cmdfd, const char *filename)
-{
-	char	  **lines;
-
-	lines = readfile(filename);
-
-	for (char **line = lines; *line != NULL; line++)
-	{
-		PG_CMD_PUTS(*line);
-		free(*line);
-	}
-
-	PG_CMD_PUTS("\n\n");
-
-	free(lines);
-}
-
-/*
- * fill in extra description data
- */
-static void
-setup_description(FILE *cmdfd)
-{
-	/* Create default descriptions for operator implementation functions */
-	PG_CMD_PUTS("WITH funcdescs AS ( "
-				"SELECT p.oid as p_oid, o.oid as o_oid, oprname "
-				"FROM pg_proc p JOIN pg_operator o ON oprcode = p.oid ) "
-				"INSERT INTO pg_description "
-				"  SELECT p_oid, 'pg_proc'::regclass, 0, "
-				"    'implementation of ' || oprname || ' operator' "
-				"  FROM funcdescs "
-				"  WHERE NOT EXISTS (SELECT 1 FROM pg_description "
-				"   WHERE objoid = p_oid AND classoid = 'pg_proc'::regclass) "
-				"  AND NOT EXISTS (SELECT 1 FROM pg_description "
-				"   WHERE objoid = o_oid AND classoid = 'pg_operator'::regclass"
-				"         AND description LIKE 'deprecated%');\n\n");
-}
-
-/*
- * populate pg_collation
- */
-static void
-setup_collation(FILE *cmdfd)
-{
-	/*
-	 * Add an SQL-standard name.  We don't want to pin this, so it doesn't go
-	 * in pg_collation.h.  But add it before reading system collations, so
-	 * that it wins if libc defines a locale named ucs_basic.
-	 */
-	PG_CMD_PRINTF("INSERT INTO pg_collation (oid, collname, collnamespace, collowner, collprovider, collisdeterministic, collencoding, collcollate, collctype)"
-				  "VALUES (pg_nextoid('pg_catalog.pg_collation', 'oid', 'pg_catalog.pg_collation_oid_index'), 'ucs_basic', 'pg_catalog'::regnamespace, %u, '%c', true, %d, 'C', 'C');\n\n",
-				  BOOTSTRAP_SUPERUSERID, COLLPROVIDER_LIBC, PG_UTF8);
-
-	/* Now import all collations we can find in the operating system */
-	PG_CMD_PUTS("SELECT pg_import_system_collations('pg_catalog');\n\n");
-}
-
-/*
- * Set up privileges
- *
- * We mark most system catalogs as world-readable.  We don't currently have
- * to touch functions, languages, or databases, because their default
- * permissions are OK.
- *
- * Some objects may require different permissions by default, so we
- * make sure we don't overwrite privilege sets that have already been
- * set (NOT NULL).
- *
- * Also populate pg_init_privs to save what the privileges are at init
- * time.  This is used by pg_dump to allow users to change privileges
- * on catalog objects and to have those privilege changes preserved
- * across dump/reload and pg_upgrade.
- *
- * Note that pg_init_privs is only for per-database objects and therefore
- * we don't include databases or tablespaces.
- */
-static void
-setup_privileges(FILE *cmdfd)
-{
-	const char *const *line;
-	static const char *const privileges_setup[] = {
-		"UPDATE pg_class "
-		"  SET relacl = (SELECT array_agg(a.acl) FROM "
-		" (SELECT '=r/\"POSTGRES\"' as acl "
-		"  UNION SELECT unnest(pg_catalog.acldefault("
-		"    CASE WHEN relkind = " CppAsString2(RELKIND_SEQUENCE) " THEN 's' "
-		"         ELSE 'r' END::\"char\"," CppAsString2(BOOTSTRAP_SUPERUSERID) "::oid))"
-		" ) as a) "
-		"  WHERE relkind IN (" CppAsString2(RELKIND_RELATION) ", "
-		CppAsString2(RELKIND_VIEW) ", " CppAsString2(RELKIND_MATVIEW) ", "
-		CppAsString2(RELKIND_SEQUENCE) ")"
-		"  AND relacl IS NULL;\n\n",
-		"GRANT USAGE ON SCHEMA pg_catalog, public TO PUBLIC;\n\n",
-		"REVOKE ALL ON pg_largeobject FROM PUBLIC;\n\n",
-		"INSERT INTO pg_init_privs "
-		"  (objoid, classoid, objsubid, initprivs, privtype)"
-		"    SELECT"
-		"        oid,"
-		"        (SELECT oid FROM pg_class WHERE relname = 'pg_class'),"
-		"        0,"
-		"        relacl,"
-		"        'i'"
-		"    FROM"
-		"        pg_class"
-		"    WHERE"
-		"        relacl IS NOT NULL"
-		"        AND relkind IN (" CppAsString2(RELKIND_RELATION) ", "
-		CppAsString2(RELKIND_VIEW) ", " CppAsString2(RELKIND_MATVIEW) ", "
-		CppAsString2(RELKIND_SEQUENCE) ");\n\n",
-		"INSERT INTO pg_init_privs "
-		"  (objoid, classoid, objsubid, initprivs, privtype)"
-		"    SELECT"
-		"        pg_class.oid,"
-		"        (SELECT oid FROM pg_class WHERE relname = 'pg_class'),"
-		"        pg_attribute.attnum,"
-		"        pg_attribute.attacl,"
-		"        'i'"
-		"    FROM"
-		"        pg_class"
-		"        JOIN pg_attribute ON (pg_class.oid = pg_attribute.attrelid)"
-		"    WHERE"
-		"        pg_attribute.attacl IS NOT NULL"
-		"        AND pg_class.relkind IN (" CppAsString2(RELKIND_RELATION) ", "
-		CppAsString2(RELKIND_VIEW) ", " CppAsString2(RELKIND_MATVIEW) ", "
-		CppAsString2(RELKIND_SEQUENCE) ");\n\n",
-		"INSERT INTO pg_init_privs "
-		"  (objoid, classoid, objsubid, initprivs, privtype)"
-		"    SELECT"
-		"        oid,"
-		"        (SELECT oid FROM pg_class WHERE relname = 'pg_proc'),"
-		"        0,"
-		"        proacl,"
-		"        'i'"
-		"    FROM"
-		"        pg_proc"
-		"    WHERE"
-		"        proacl IS NOT NULL;\n\n",
-		"INSERT INTO pg_init_privs "
-		"  (objoid, classoid, objsubid, initprivs, privtype)"
-		"    SELECT"
-		"        oid,"
-		"        (SELECT oid FROM pg_class WHERE relname = 'pg_type'),"
-		"        0,"
-		"        typacl,"
-		"        'i'"
-		"    FROM"
-		"        pg_type"
-		"    WHERE"
-		"        typacl IS NOT NULL;\n\n",
-		"INSERT INTO pg_init_privs "
-		"  (objoid, classoid, objsubid, initprivs, privtype)"
-		"    SELECT"
-		"        oid,"
-		"        (SELECT oid FROM pg_class WHERE relname = 'pg_language'),"
-		"        0,"
-		"        lanacl,"
-		"        'i'"
-		"    FROM"
-		"        pg_language"
-		"    WHERE"
-		"        lanacl IS NOT NULL;\n\n",
-		"INSERT INTO pg_init_privs "
-		"  (objoid, classoid, objsubid, initprivs, privtype)"
-		"    SELECT"
-		"        oid,"
-		"        (SELECT oid FROM pg_class WHERE "
-		"         relname = 'pg_largeobject_metadata'),"
-		"        0,"
-		"        lomacl,"
-		"        'i'"
-		"    FROM"
-		"        pg_largeobject_metadata"
-		"    WHERE"
-		"        lomacl IS NOT NULL;\n\n",
-		"INSERT INTO pg_init_privs "
-		"  (objoid, classoid, objsubid, initprivs, privtype)"
-		"    SELECT"
-		"        oid,"
-		"        (SELECT oid FROM pg_class WHERE relname = 'pg_namespace'),"
-		"        0,"
-		"        nspacl,"
-		"        'i'"
-		"    FROM"
-		"        pg_namespace"
-		"    WHERE"
-		"        nspacl IS NOT NULL;\n\n",
-		"INSERT INTO pg_init_privs "
-		"  (objoid, classoid, objsubid, initprivs, privtype)"
-		"    SELECT"
-		"        oid,"
-		"        (SELECT oid FROM pg_class WHERE "
-		"         relname = 'pg_foreign_data_wrapper'),"
-		"        0,"
-		"        fdwacl,"
-		"        'i'"
-		"    FROM"
-		"        pg_foreign_data_wrapper"
-		"    WHERE"
-		"        fdwacl IS NOT NULL;\n\n",
-		"INSERT INTO pg_init_privs "
-		"  (objoid, classoid, objsubid, initprivs, privtype)"
-		"    SELECT"
-		"        oid,"
-		"        (SELECT oid FROM pg_class "
-		"         WHERE relname = 'pg_foreign_server'),"
-		"        0,"
-		"        srvacl,"
-		"        'i'"
-		"    FROM"
-		"        pg_foreign_server"
-		"    WHERE"
-		"        srvacl IS NOT NULL;\n\n",
-		NULL
-	};
-
-	for (line = privileges_setup; *line != NULL; line++)
-		PG_CMD_PUTS(*line);
-}
-
 /*
  * extract the strange version of version required for information schema
  * (09.08.0007abc)
@@ -1749,29 +1407,18 @@ set_info_version(void)
 static void
 setup_schema(FILE *cmdfd)
 {
-	setup_run_file(cmdfd, info_schema_file);
-
 	PG_CMD_PRINTF("UPDATE information_schema.sql_implementation_info "
 				  "  SET character_value = '%s' "
-				  "  WHERE implementation_info_name = 'DBMS VERSION';\n\n",
+				  "  WHERE implementation_info_name = 'DBMS VERSION';\n",
 				  infoversion);
 
 	PG_CMD_PRINTF("COPY information_schema.sql_features "
 				  "  (feature_id, feature_name, sub_feature_id, "
 				  "  sub_feature_name, is_supported, comments) "
-				  " FROM E'%s';\n\n",
+				  " FROM E'%s';\n",
 				  escape_quotes(features_file));
 }
 
-/*
- * load PL/pgSQL server-side language
- */
-static void
-load_plpgsql(FILE *cmdfd)
-{
-	PG_CMD_PUTS("CREATE EXTENSION plpgsql;\n\n");
-}
-
 /*
  * Set some remaining details that aren't known when postgres.bki is made.
  *
@@ -1796,8 +1443,8 @@ set_remaining_details(FILE *cmdfd)
 	 * locale/encoding without cheating.
 	 */
 	static char *final_details[] = {
-		"UPDATE pg_authid SET rolname = E'SUPERUSER_NAME' WHERE rolname = 'POSTGRES';\n\n",
-		"UPDATE pg_database SET encoding = E'ENCODING', datcollate = E'LC_COLLATE', datctype = E'LC_CTYPE';\n\n",
+		"UPDATE pg_authid SET rolname = E'SUPERUSER_NAME' WHERE rolname = 'POSTGRES';\n",
+		"UPDATE pg_database SET encoding = E'ENCODING', datcollate = E'LC_COLLATE', datctype = E'LC_CTYPE';\n",
 		NULL
 	};
 
@@ -1814,93 +1461,6 @@ set_remaining_details(FILE *cmdfd)
 		PG_CMD_PUTS(*line);
 }
 
-/*
- * clean everything up in template1
- */
-static void
-vacuum_db(FILE *cmdfd)
-{
-	/* Run analyze before VACUUM so the statistics are frozen. */
-	PG_CMD_PUTS("ANALYZE;\n\nVACUUM FREEZE;\n\n");
-}
-
-/*
- * copy template1 to template0
- */
-static void
-make_template0(FILE *cmdfd)
-{
-	const char *const *line;
-
-	/*
-	 * pg_upgrade tries to preserve database OIDs across upgrades. It's smart
-	 * enough to drop and recreate a conflicting database with the same name,
-	 * but if the same OID were used for one system-created database in the
-	 * old cluster and a different system-created database in the new cluster,
-	 * it would fail. To avoid that, assign a fixed OID to template0 rather
-	 * than letting the server choose one.
-	 *
-	 * (Note that, while the user could have dropped and recreated these
-	 * objects in the old cluster, the problem scenario only exists if the OID
-	 * that is in use in the old cluster is also used in the new cluster - and
-	 * the new cluster should be the result of a fresh initdb.)
-	 */
-	static const char *const template0_setup[] = {
-		"CREATE DATABASE template0 IS_TEMPLATE = true ALLOW_CONNECTIONS = false OID = "
-		CppAsString2(Template0ObjectId) ";\n\n",
-
-		/*
-		 * template0 shouldn't have any collation-dependent objects, so unset
-		 * the collation version.  This disables collation version checks when
-		 * making a new database from it.
-		 */
-		"UPDATE pg_database SET datcollversion = NULL WHERE datname = 'template0';\n\n",
-
-		/*
-		 * While we are here, do set the collation version on template1.
-		 */
-		"UPDATE pg_database SET datcollversion = pg_database_collation_actual_version(oid) WHERE datname = 'template1';\n\n",
-
-		/*
-		 * Explicitly revoke public create-schema and create-temp-table
-		 * privileges in template1 and template0; else the latter would be on
-		 * by default
-		 */
-		"REVOKE CREATE,TEMPORARY ON DATABASE template1 FROM public;\n\n",
-		"REVOKE CREATE,TEMPORARY ON DATABASE template0 FROM public;\n\n",
-
-		"COMMENT ON DATABASE template0 IS 'unmodifiable empty database';\n\n",
-
-		/*
-		 * Finally vacuum to clean up dead rows in pg_database
-		 */
-		"VACUUM pg_database;\n\n",
-		NULL
-	};
-
-	for (line = template0_setup; *line; line++)
-		PG_CMD_PUTS(*line);
-}
-
-/*
- * copy template1 to postgres
- */
-static void
-make_postgres(FILE *cmdfd)
-{
-	const char *const *line;
-
-	/* Assign a fixed OID to postgres, for the same reasons as template0 */
-	static const char *const postgres_setup[] = {
-		"CREATE DATABASE postgres OID = " CppAsString2(PostgresObjectId) ";\n\n",
-		"COMMENT ON DATABASE postgres IS 'default administrative connection database';\n\n",
-		NULL
-	};
-
-	for (line = postgres_setup; *line; line++)
-		PG_CMD_PUTS(*line);
-}
-
 /*
  * signal handler in case we are interrupted.
  *
@@ -2446,16 +2006,10 @@ setup_locale_encoding(void)
 void
 setup_data_file_paths(void)
 {
-	set_input(&bki_file, "postgres.bki");
 	set_input(&hba_file, "pg_hba.conf.sample");
 	set_input(&ident_file, "pg_ident.conf.sample");
 	set_input(&conf_file, "postgresql.conf.sample");
-	set_input(&dictionary_file, "snowball_create.sql");
-	set_input(&info_schema_file, "information_schema.sql");
 	set_input(&features_file, "sql_features.txt");
-	set_input(&system_constraints_file, "system_constraints.sql");
-	set_input(&system_functions_file, "system_functions.sql");
-	set_input(&system_views_file, "system_views.sql");
 
 	if (show_setting || debug)
 	{
@@ -2474,16 +2028,9 @@ setup_data_file_paths(void)
 			exit(0);
 	}
 
-	check_input(bki_file);
 	check_input(hba_file);
 	check_input(ident_file);
 	check_input(conf_file);
-	check_input(dictionary_file);
-	check_input(info_schema_file);
-	check_input(features_file);
-	check_input(system_constraints_file);
-	check_input(system_functions_file);
-	check_input(system_views_file);
 }
 
 
@@ -2744,6 +2291,7 @@ initialize_data_directory(void)
 {
 	PG_CMD_DECL;
 	int			i;
+	instr_time	last_ts, cur_ts;
 
 	setup_signals();
 
@@ -2788,68 +2336,46 @@ initialize_data_directory(void)
 	write_version_file(NULL);
 
 	/* Select suitable configuration settings */
+	INSTR_TIME_SET_CURRENT(last_ts);
 	set_null_conf();
 	test_config_settings();
+	INSTR_TIME_SET_CURRENT(cur_ts);
+	fprintf(stderr, "config determination in %.3f ms\n",
+			(INSTR_TIME_GET_DOUBLE(cur_ts) - INSTR_TIME_GET_DOUBLE(last_ts)) * 1000);
 
 	/* Now create all the text config files */
 	setup_config();
 
-	/* Bootstrap template1 */
-	bootstrap_template1();
-
 	/*
 	 * Make the per-database PG_VERSION for template1 only after init'ing it
+	 *
+	 * FIXME: move to server
 	 */
 	write_version_file("base/1");
 
-	/*
-	 * Create the stuff we don't need to use bootstrap mode for, using a
-	 * backend running in simple standalone mode.
-	 */
-	fputs(_("performing post-bootstrap initialization ... "), stdout);
-	fflush(stdout);
-
 	snprintf(cmd, sizeof(cmd),
-			 "\"%s\" %s %s template1 >%s",
-			 backend_exec, backend_options, extra_options,
+			 "\"%s\" --boot -s \"%s\" -X %d %s %s %s %s >%s 2>&1",
+			 backend_exec,
+			 share_path,
+			 wal_segment_size_mb * (1024 * 1024),
+			 data_checksums ? "-k" : "",
+			 boot_options, extra_options,
+			 debug ? "-d 5" : "",
 			 DEVNULL);
 
+	/* Also ensure backend isn't confused by this environment var: */
+	unsetenv("PGCLIENTENCODING");
+
+	printf(_("running database bootstrap ... "));
+
 	PG_CMD_OPEN;
 
-	setup_auth(cmdfd);
-
-	setup_run_file(cmdfd, system_constraints_file);
-
-	setup_run_file(cmdfd, system_functions_file);
-
-	setup_depend(cmdfd);
-
-	/*
-	 * Note that no objects created after setup_depend() will be "pinned".
-	 * They are all droppable at the whim of the DBA.
-	 */
-
-	setup_run_file(cmdfd, system_views_file);
-
-	setup_description(cmdfd);
-
-	setup_collation(cmdfd);
-
-	setup_run_file(cmdfd, dictionary_file);
-
-	setup_privileges(cmdfd);
-
 	setup_schema(cmdfd);
-
-	load_plpgsql(cmdfd);
-
 	set_remaining_details(cmdfd);
 
-	vacuum_db(cmdfd);
-
-	make_template0(cmdfd);
-
-	make_postgres(cmdfd);
+	if (superuser_password)
+		PG_CMD_PRINTF("ALTER USER \"%s\" WITH PASSWORD E'%s';\n",
+					  username, escape_quotes(superuser_password));
 
 	PG_CMD_CLOSE;
 
@@ -3178,7 +2704,7 @@ main(int argc, char *argv[])
 	else
 		printf(_("\nSync to disk skipped.\nThe data directory might become corrupt if the operating system crashes.\n"));
 
-	if (authwarning)
+	if (authwarning && false)
 	{
 		printf("\n");
 		pg_log_warning("enabling \"trust\" authentication for local connections");
-- 
2.34.0

Attachment: v1-0004-initdb-Optimize-WAL-writing-during-initdb.patch (text/x-diff; charset=us-ascii)

From 3d97d3aaafb4a6e7ea8d283e858c0200b6a2a1cc Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Sun, 20 Feb 2022 12:17:03 -0800
Subject: [PATCH v1 4/5] initdb: Optimize WAL writing during initdb.

Author:
Reviewed-By:
Discussion: https://postgr.es/m/
Backpatch:
---
 src/include/miscadmin.h                 |  3 +++
 src/backend/access/transam/xloginsert.c | 18 ++++++++++++------
 src/backend/bootstrap/bootstrap.c       |  2 +-
 src/backend/catalog/heap.c              |  4 +++-
 src/backend/commands/dbcommands.c       | 16 +++++++++++++---
 5 files changed, 32 insertions(+), 11 deletions(-)

diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 0abc3ad5405..9af1f46b3b2 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -396,6 +396,7 @@ extern bool superuser_arg(Oid roleid);	/* given user is superuser */
 typedef enum ProcessingMode
 {
 	BootstrapProcessing,		/* bootstrap creation of template database */
+	LateBootstrapProcessing,	/* XXX bootstrap initializing more stuff */
 	InitProcessing,				/* initializing system */
 	NormalProcessing			/* normal processing */
 } ProcessingMode;
@@ -403,6 +404,7 @@ typedef enum ProcessingMode
 extern ProcessingMode Mode;
 
 #define IsBootstrapProcessingMode() (Mode == BootstrapProcessing)
+#define IsLateBootstrapProcessingMode() (Mode == LateBootstrapProcessing)
 #define IsInitProcessingMode()		(Mode == InitProcessing)
 #define IsNormalProcessingMode()	(Mode == NormalProcessing)
 
@@ -411,6 +413,7 @@ extern ProcessingMode Mode;
 #define SetProcessingMode(mode) \
 	do { \
 		AssertArg((mode) == BootstrapProcessing || \
+				  (mode) == LateBootstrapProcessing || \
 				  (mode) == InitProcessing || \
 				  (mode) == NormalProcessing); \
 		Mode = (mode); \
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index c260310c4c8..cd4316b108e 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -446,14 +446,20 @@ XLogInsert(RmgrId rmid, uint8 info)
 	TRACE_POSTGRESQL_WAL_INSERT(rmid, info);
 
 	/*
-	 * In bootstrap mode, we don't actually log anything but XLOG resources;
-	 * return a phony record pointer.
+	 * In bootstrap mode, we don't actually log anything but shutdown
+	 * checkpoint records; return a phony record pointer.
 	 */
-	if (IsBootstrapProcessingMode() && rmid != RM_XLOG_ID)
+	if (IsBootstrapProcessingMode() || IsLateBootstrapProcessingMode())
 	{
-		XLogResetInsertion();
-		EndPos = SizeOfXLogLongPHD; /* start of 1st chkpt record */
-		return EndPos;
+		uint8		rectype = info & ~XLR_INFO_MASK;
+
+		if (rmid != RM_XLOG_ID ||
+			rectype != XLOG_CHECKPOINT_SHUTDOWN)
+		{
+			XLogResetInsertion();
+			EndPos = SizeOfXLogLongPHD; /* start of 1st chkpt record */
+			return EndPos;
+		}
 	}
 
 	do
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index 3fd8ed60715..6230ba17685 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -451,7 +451,7 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
 	/*
 	 * Load further catalog contents by running a bunch of SQL commands.
 	 */
-	SetProcessingMode(NormalProcessing);
+	SetProcessingMode(LateBootstrapProcessing);
 
 	bootstrap_load_nonbki(share_path);
 
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 7e99de88b34..486f0e5ea1f 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -1138,7 +1138,9 @@ heap_create_with_catalog(const char *relname,
 	/*
 	 * sanity checks
 	 */
-	Assert(IsNormalProcessingMode() || IsBootstrapProcessingMode());
+	Assert(IsNormalProcessingMode() ||
+		   IsBootstrapProcessingMode() ||
+		   IsLateBootstrapProcessingMode());
 
 	/*
 	 * Validate proposed tupdesc for the desired relkind.  If
diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index c37e3c9a9a4..1720bad1d07 100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -676,9 +676,15 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
 	 * happened while we're copying files, a file might be deleted just when
 	 * we're about to copy it, causing the lstat() call in copydir() to fail
 	 * with ENOENT.
+	 *
+	 * In bootstrap mode FlushDatabaseBuffers() suffices because there are no
+	 * pending unlink requests.
 	 */
-	RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_FORCE | CHECKPOINT_WAIT
-					  | CHECKPOINT_FLUSH_ALL);
+	if (IsBootstrapProcessingMode() || IsLateBootstrapProcessingMode())
+		FlushDatabaseBuffers(src_dboid);
+	else
+		RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_FORCE | CHECKPOINT_WAIT
+						  | CHECKPOINT_FLUSH_ALL);
 
 	/*
 	 * Once we start copying subdirectories, we need to be able to clean 'em
@@ -782,8 +788,12 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
 		 *
 		 * Perhaps if we ever implement CREATE DATABASE in a less cheesy way,
 		 * we can avoid this.
+		 *
+		 * We do not need this checkpoint in bootstrap mode - if we fail, the
+		 * cluster won't be valid anyway.
 		 */
-		RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_FORCE | CHECKPOINT_WAIT);
+		if (!IsBootstrapProcessingMode() && !IsLateBootstrapProcessingMode())
+			RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_FORCE | CHECKPOINT_WAIT);
 
 		/*
 		 * Close pg_database, but keep lock till commit.
-- 
2.34.0
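
For readers skimming the 0004 patch, the new condition in XLogInsert() can be modeled in isolation. The following is only a sketch under stated assumptions: the SKETCH_* constants and the function wal_record_skipped() are local stand-ins invented for illustration (nothing here includes the server headers), and the resource manager id 10 is just an arbitrary non-XLOG value. It shows the decision the hunk appears to make: in either bootstrap mode every record is dropped except a shutdown checkpoint from the XLOG resource manager.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the enum extended by the patch. */
typedef enum ProcessingMode
{
	BootstrapProcessing,
	LateBootstrapProcessing,
	InitProcessing,
	NormalProcessing
} ProcessingMode;

/* Hypothetical constants for this sketch only, not the server's headers. */
#define SKETCH_RM_XLOG_ID				0
#define SKETCH_RM_OTHER_ID				10		/* any non-XLOG rmgr */
#define SKETCH_XLR_INFO_MASK			0x0F	/* low bits reserved */
#define SKETCH_XLOG_CHECKPOINT_SHUTDOWN	0x00
#define SKETCH_XLOG_CHECKPOINT_ONLINE	0x10

/*
 * Returns true if the record would be dropped (a phony record pointer
 * returned) rather than written to WAL.
 */
bool
wal_record_skipped(ProcessingMode mode, uint8_t rmid, uint8_t info)
{
	/* Keep only the record-type bits, as the patch does with rectype. */
	uint8_t		rectype = (uint8_t) (info & ~SKETCH_XLR_INFO_MASK);

	if (mode != BootstrapProcessing && mode != LateBootstrapProcessing)
		return false;

	return rmid != SKETCH_RM_XLOG_ID ||
		rectype != SKETCH_XLOG_CHECKPOINT_SHUTDOWN;
}

/* Small self-check exercising the cases discussed above. */
void
wal_skip_self_check(void)
{
	assert(wal_record_skipped(BootstrapProcessing, SKETCH_RM_OTHER_ID, 0x00));
	assert(!wal_record_skipped(LateBootstrapProcessing, SKETCH_RM_XLOG_ID,
							   SKETCH_XLOG_CHECKPOINT_SHUTDOWN));
	assert(!wal_record_skipped(NormalProcessing, SKETCH_RM_OTHER_ID, 0x00));
}
```

Note that, compared with the old condition, other XLOG records (for example an online checkpoint) are now skipped in bootstrap mode as well.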

Attachment: v1-0005-initdb-call-isatty-only-once-in-bootparse.y.patch (text/x-diff; charset=us-ascii)

From defeeee9b04fca7f376a7a62fcdfa0fbd46cccef Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Sun, 20 Feb 2022 13:39:40 -0800
Subject: [PATCH v1 5/5] initdb: call isatty() only once in bootparse.y.

Causes a not insignificant number of syscalls...
---
 src/backend/bootstrap/bootparse.y | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/backend/bootstrap/bootparse.y b/src/backend/bootstrap/bootparse.y
index 142433f63f3..dff86f71583 100644
--- a/src/backend/bootstrap/bootparse.y
+++ b/src/backend/bootstrap/bootparse.y
@@ -62,11 +62,17 @@ do_start(void)
 static void
 do_end(void)
 {
+	static int isatty_cached = -1;
+
 	/* Reclaim memory allocated while processing this line */
 	MemoryContextSwitchTo(CurTransactionContext);
 	MemoryContextReset(per_line_ctx);
 	CHECK_FOR_INTERRUPTS();		/* allow SIGINT to kill bootstrap run */
-	if (isatty(0))
+
+	if (isatty_cached == -1)
+		isatty_cached = isatty(0);
+
+	if (isatty_cached)
 	{
 		printf("bootstrap> ");
 		fflush(stdout);
-- 
2.34.0
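
The caching idea in 0005 can be tried outside the server. This is a minimal sketch, assuming a POSIX system with <unistd.h>; the names stdin_is_tty_cached() and isatty_calls are hypothetical and exist only to make the effect observable. It shows that after the first call the syscall is not repeated, no matter how many "lines" are processed.

```c
#include <assert.h>
#include <stdbool.h>
#include <unistd.h>

/* Instrumentation for the sketch: counts real isatty() calls. */
int			isatty_calls = 0;

/*
 * Report whether stdin is a terminal, asking the OS only once.
 * -1 means "not yet determined", mirroring the patch's isatty_cached.
 */
bool
stdin_is_tty_cached(void)
{
	static int	cached = -1;

	if (cached == -1)
	{
		cached = isatty(0) ? 1 : 0;
		isatty_calls++;
	}

	return cached != 0;
}

/* Simulate processing many input lines, as do_end() would be called. */
void
process_lines(int nlines)
{
	bool		first = stdin_is_tty_cached();

	for (int i = 0; i < nlines; i++)
		assert(stdin_is_tty_cached() == first);
}
```

The trade-off is that the cached answer never changes for the life of the process, which seems fine here since file descriptor 0 is not expected to be replaced during a bootstrap run.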

#16

Peter Eisentraut

peter.eisentraut@enterprisedb.com

almost 4 years ago

In reply to: Tom Lane (#12)

Re: initdb / bootstrap design

On 20.02.22 01:39, Tom Lane wrote:

>> Hm, wouldn't it be less code to just use printf?
>
> Meh --- it'd be different from the way we do it in the rest
> of initdb, and it would not be "less code". Maybe it'd run
> a shade faster, but I refuse to believe that that'd be
> enough to matter.

There is a PG_CMD_PRINTF() that is used for that purpose.