Re: Bootstrap DATA is a pita

Started by Caleb Welton, about 10 years ago, 33 messages
#1 Caleb Welton
cwelton@pivotal.io

Hello Hackers,

Reviving an old thread on simplifying the bootstrap process.

I'm a developer from the GPDB / HAWQ side of the world where we did some
work a while back to enable catalog definition via SQL files and we have
found it valuable from a dev perspective. The mechanism currently in those
products is a bit... convoluted: SQL is processed in Perl to create the
existing DATA statements, which are then processed as they are today in
Postgres. I wouldn't suggest this route, but having worked with both the
DATA mechanism and the SQL-based one, I've certainly found SQL to be a more
convenient way of interacting with the catalog.

I'd propose:
- Keep enough of the existing bootstrap mechanism functional to get a
small, tidy core: essentially you need enough of pg_type, pg_proc, pg_class,
and pg_attribute to support the 25 types used by catalog tables. Most
everything else can be moved into SQL processing, like how system_views.sql
is handled today.

The above was largely proposed back in March and rejected based on
concerns that

1. initdb would be slower.
2. It would introduce too much special purpose bootstrap cruft into the
code.
3. Editing SQL commands is not comfortable in bulk

On 1.

I have a prototype that handles about 1000 functions (all the functions in
pg_proc.h that are not used by other catalog tables, e.g. pg_type,
pg_language, pg_range, pg_aggregate, window functions, pg_ts_parser, etc).

All of initdb can be processed in 1.53s. This compares to 1.37s with the
current bootstrap approach. So yes, this is slower, but not 'noticeably
slower' - I certainly didn't notice the 0.16s until I saw the concern and
then timed it.

On 2.

So far the amount of cruft has been:
- Enabling adding functions with specific OIDs when creating functions.
1 line changes in pg_aggregate.c, proclang.c, typecmds.c
about a dozen lines of code in functioncmds.c
3 lines changed in pg_proc.c
- Update the fmgr_internal_validator for builtin functions while the
catalog is mutable
3 lines changed in pg_proc.c
- Update how the builtin function cache is built
Some significant work in fmgr.c that honestly still needs cleanup
before it would be ready to propose as a patch that would be worthy of
committing.
- Update how builtin functions are resolved outside of bootstrap
Minor updates to dynloader for lookup of symbols within the current
executable, so far I've only done darwin.c for my prototype, this would
need to be extended to the other ports.
- Initialization of the builtin cache
2 line change in postinit.c
- Addition of a stage in initdb to process the sql directives similar in
scope to the processing of system_views.sql.

No changes needed in the parser, planner, etc. My assessment is that this
worry is not a major concern in practice with the right implementation.

On 3.

Having worked with both SQL and bki DATA directives, I have personally found
that the convenience of SQL outweighs the pain. In many cases changes, such
as adding a new column to pg_proc, have minimal impact on the SQL
representation, and the changes that are needed are often simple to implement.
E.g. accounting for COST only needs to be done for the functions that need
something other than the default value. This however is somewhat
subjective.

On the Pros side:

a. Debugging bootstrap is extremely painful; debugging once initdb has
gotten to 'postgres --single' is way easier.

b. It is easier to introduce minor issues with DATA directives than it is
when using the SQL processing used for all other user objects.

Example: currently in Postgres all builtin functions default to COST 1,
and all SQL functions default to COST 100. However, the following SQL
functions included in bootstrap are inexplicably initialized with a COST of
1:
age(timestamp with time zone)
age(timestamp without time zone)
bit_length(bytea)
bit_length(text)
bit_length(bit)
date_part(text, abstime)
date_part(text, reltime)
date_part(text, date)
... and 26 other examples
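
Discrepancies like this are easy to spot once the catalog is queryable; a
sketch of such a check (using only standard pg_proc/pg_language columns):

```sql
-- List SQL-language functions whose cost was left at the internal-function
-- default (1) instead of the SQL-function default (100).
SELECT p.oid::regprocedure AS func, p.procost
  FROM pg_proc p
  JOIN pg_language l ON l.oid = p.prolang
 WHERE l.lanname = 'sql'
   AND p.procost = 1
 ORDER BY p.oid;
```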

c. SQL files are significantly less of a PITA (subjective opinion, but I
can say this from a perspective of experience working with both DATA
directives and SQL driven catalog definition).

If people have interest I can share my patch so far if that helps address
concerns, but if there is not interest then I'll probably leave my
prototype where it is rather than investing more effort in the proof of
concept.

Thanks,
Caleb

On Sat, Mar 7, 2015 at 5:20 PM, <pgsql-hackers-owner@postgresql.org> wrote:

Date: Sat, 7 Mar 2015 23:46:54 +0100
From: Andres Freund <andres@2ndquadrant.com>
To: Jim Nasby <Jim.Nasby@BlueTreble.com>
Cc: Stephen Frost <sfrost@snowman.net>, Robert Haas <robertmhaas@gmail.com

,

Tom Lane <tgl@sss.pgh.pa.us>, Peter Eisentraut <peter_e@gmx.net>,
Josh Berkus <josh@agliodbs.com>,
"pgsql-hackers@postgresql.org" <pgsql-hackers@postgresql.org>
Subject: Re: Bootstrap DATA is a pita
Message-ID: <20150307224654.GC12213@awork2.anarazel.de>

On 2015-03-07 16:43:15 -0600, Jim Nasby wrote:

Semi-related... if we put some special handling in some places for
bootstrap mode, couldn't most catalog objects be created using SQL, once
we got pg_class, pg_attributes and pg_type created? That would
theoretically allow us to drive much more of initdb with plain SQL
(possibly created via pg_dump).

Several people have now made that suggestion, but I *seriously* doubt
that we actually want to go there. The overhead of executing SQL
commands in comparison to the bki stuff is really rather
noticeable. Doing the majority of the large number of insertions via SQL
will make initdb noticeably slower. And it's already annoyingly
slow. Besides make install it's probably the thing I wait most for
during development.

That's besides the fact that SQL commands aren't actually that
comfortably editable in bulk.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#2 Caleb Welton
cwelton@pivotal.io
In reply to: Caleb Welton (#1)

I'm happy working these ideas forward if there is interest.

Basic design proposal is:
- keep a minimal amount of bootstrap to avoid intrusive changes to core
components
- Add capabilities of creating objects with specific OIDs via DDL during
initdb
- Update the caching/resolution mechanism for builtin functions to be
more dynamic.
- Move as much of bootstrap as possible into SQL files and create catalog
via DDL

Feedback appreciated.

I can provide a sample patch if there is interest, about ~500 lines of
combined diff for the needed infrastructure to support the above, not
including the modifications to pg_proc.h that would follow.

Thanks,
Caleb

On Thu, Dec 10, 2015 at 11:47 AM, Caleb Welton wrote:


#3 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Caleb Welton (#2)

Caleb Welton wrote:

I'm happy working these ideas forward if there is interest.

Basic design proposal is:
- keep a minimal amount of bootstrap to avoid intrusive changes to core
components
- Add capabilities of creating objects with specific OIDs via DDL during
initdb
- Update the caching/resolution mechanism for builtin functions to be
more dynamic.
- Move as much of bootstrap as possible into SQL files and create catalog
via DDL

I think the point we got stuck last time at was deciding on a good
format for the data coming from the DATA lines. One of the objections
raised for formats such as JSON is that it's trivial for "git merge" (or
similar tools) to make a mistake because object-end/object-start lines
are all identical. And as for the SQL-format version, the objection was
that it's hard to modify the lines en-masse when modifying the catalog
definition (new column, etc). Ideally we would like a format that can
be bulk-edited without too much trouble.

A SQL file would presumably not have the merge issue, but mass-editing
would be a pain.

Crazy idea: we could just have a CSV file which can be loaded into a
table for mass changes using regular DDL commands, then dumped back from
there into the file. We already know how to do these things, using
\copy etc. Since CSV uses one line per entry, there would be no merge
problems either (or rather: all merge problems would become conflicts,
which is what we want.)
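
Concretely, the round trip described above might look something like this in
psql (the file name and scratch table here are hypothetical, and the UPDATE
is just a placeholder for whatever mass edit is needed):

```sql
-- 1. Load the master copy into a scratch table.
CREATE TEMP TABLE pg_proc_data (LIKE pg_proc);
\copy pg_proc_data FROM 'pg_proc.csv' WITH (FORMAT csv, HEADER true)

-- 2. Mass-edit with ordinary SQL.
UPDATE pg_proc_data SET procost = 100
 WHERE prolang = (SELECT oid FROM pg_language WHERE lanname = 'sql');

-- 3. Dump it back out in a stable order.
\copy (SELECT * FROM pg_proc_data ORDER BY oid) TO 'pg_proc.csv' WITH (FORMAT csv, HEADER true)
```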

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4 Caleb Welton
cwelton@pivotal.io
In reply to: Alvaro Herrera (#3)

Makes sense.

During my own prototyping what I did was generate the SQL statements via SQL querying the existing catalog. Way easier than hand writing 1000+ function definitions, and not difficult to modify for future changes, as affirmed by how easy it was to adapt my existing SQL to account for some of the newer features in master.

The biggest challenge was establishing a sort order that ensures both a unique ordering and that the dependencies needed for SQL functions have been processed before trying to define them. This affects about 4 of 1000 functions based on a natural OID ordering.

On Dec 11, 2015, at 11:43 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:


#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#3)

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

Crazy idea: we could just have a CSV file which can be loaded into a
table for mass changes using regular DDL commands, then dumped back from
there into the file. We already know how to do these things, using
\copy etc. Since CSV uses one line per entry, there would be no merge
problems either (or rather: all merge problems would become conflicts,
which is what we want.)

That's an interesting proposal. It would mean that the catalog files
stay at more or less their current semantic level (direct representations
of bootstrap catalog contents), but it does sound like a more attractive
way to perform complex edits than writing Emacs macros ;-).

You could actually do that the hard way right now, with a bit of script
to convert between DATA lines and CSV format. But if we anticipate that
becoming the standard approach, it would definitely make sense to migrate
the master copies into CSV or traditional COPY format, and teach BKI mode
to read that (or, perhaps, leave bootstrap.c alone and modify the code
that produces the .bki file).
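
The "bit of script" really is small; here is a rough Python sketch of the
DATA-to-CSV direction (the tokenizer and its quoting rules are my
assumption, not bki's exact grammar):

```python
import csv
import io
import re

def data_line_to_row(line):
    """Split a bootstrap DATA line into [oid, field, field, ...].

    The oid slot is '' when the line carries no explicit OID.
    Assumption: a double-quoted string counts as a single field and
    the quotes are stripped; this covers the common cases but not
    every quoting rule the bki format supports.
    """
    m = re.match(
        r'DATA\(insert\s+(?:OID\s*=\s*(\d+)\s+)?\(\s*(.*?)\s*\)\s*\);?\s*$',
        line)
    if not m:
        raise ValueError('not a DATA line: %r' % line)
    oid, body = m.group(1) or '', m.group(2)
    fields = [tok.strip('"') for tok in re.findall(r'"[^"]*"|\S+', body)]
    return [oid] + fields

def rows_to_csv(rows):
    """Render parsed rows as CSV, one catalog entry per line."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    return buf.getvalue()
```

The reverse direction is symmetrical: read the CSV back with csv.reader and
re-emit `DATA(insert ...)` lines.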

This is somewhat orthogonal to the question of whether we want to do
things like converting noncritical operator-class definitions into
regular CREATE OPERATOR CLASS syntax. There's almost certainly going
to be some hard core of catalog entries that aren't amenable to that,
and will still need to be loaded from data files of some sort.

regards, tom lane


#6 Mark Dilger
hornschnorter@gmail.com
In reply to: Tom Lane (#5)

On Dec 11, 2015, at 1:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

Crazy idea: we could just have a CSV file which can be loaded into a
table for mass changes using regular DDL commands, then dumped back from
there into the file. We already know how to do these things, using
\copy etc. Since CSV uses one line per entry, there would be no merge
problems either (or rather: all merge problems would become conflicts,
which is what we want.)

That's an interesting proposal. It would mean that the catalog files
stay at more or less their current semantic level (direct representations
of bootstrap catalog contents), but it does sound like a more attractive
way to perform complex edits than writing Emacs macros ;-).

I would be happy to work on this, if there is much chance of the community
accepting a patch. Do you think replacing the numeric Oids for functions,
operators, opclasses and such in the source files with their names would
be ok, with the SQL converting those to Oids in the output? My eyes have
gotten tired more than once trying to read header files in src/include/catalog
looking for mistakes in what largely amounts to a big table of numbers.

For example, in pg_amop.h:

/* default operators int2 */
DATA(insert ( 1976 21 21 1 s 95 403 0 ));
DATA(insert ( 1976 21 21 2 s 522 403 0 ));
DATA(insert ( 1976 21 21 3 s 94 403 0 ));
DATA(insert ( 1976 21 21 4 s 524 403 0 ));
DATA(insert ( 1976 21 21 5 s 520 403 0 ));

Would become something like:

amopfamily amoplefttype amoprighttype amopstrategy amoppurpose amopopr amopmethod amopsortfamily
integer_ops int2 int2 1 search "<" btree 0
integer_ops int2 int2 2 search "<=" btree 0
integer_ops int2 int2 3 search "=" btree 0
integer_ops int2 int2 4 search ">=" btree 0
integer_ops int2 int2 5 search ">" btree 0

Note that I prefer to use tabs and a header line, as the tabstop can be set to
line them up nicely, and the header line lets you see which column is
which, and what it is for. CSV is always harder for me to use that way, though
maybe that is just a matter of which editor I use (vim).

And yes, I'd need to allow the HEADER option for copying tab-delimited
files, since it is currently only allowed for CSV, I believe.

mark


#7 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Mark Dilger (#6)

Mark Dilger <hornschnorter@gmail.com> writes:

On Dec 11, 2015, at 1:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
That's an interesting proposal. It would mean that the catalog files
stay at more or less their current semantic level (direct representations
of bootstrap catalog contents), but it does sound like a more attractive
way to perform complex edits than writing Emacs macros ;-).

I would be happy to work on this, if there is much chance of the community
accepting a patch. Do you think replacing the numeric Oids for functions,
operators, opclasses and such in the source files with their names would
be ok, with the SQL converting those to Oids in the output?

Huh? Those files are the definition of that mapping, no? Isn't what
you're proposing circular?

regards, tom lane


#8 Caleb Welton
cwelton@pivotal.io
In reply to: Caleb Welton (#4)
4 attachment(s)

I took a look at a few of the most recent bulk edit cases for pg_proc.h:

There were two this year:
* The addition of proparallel [1]
* The addition of protransform [2]

And prior to that the most recent seems to be from 2012:
* The addition of proleakproof [3]

Quick TL;DR: the changes needed for these are super simple to reflect when
generating SQL for CREATE FUNCTION statements.

Attached is the SQL that would generate function definitions prior to
proleakproof and the diffs that would be required after adding support for
proleakproof, protransform and proparallel.

Each of the diffs indicates the changes that would be needed after the new
column is added, the question of how to populate default values for the new
columns is beyond the scope that can easily be expressed in general terms
and depends entirely on what the nature of the new column is.

Note: Currently I have focused on the 'pure' functions, e.g. not the
drivers of type serialization, language validation, operators, or other
object types. I would want to deal with each of those while handling the
conversion for each object type in turn. Additional modifications would
likely be needed for other types of functions.

[1]: https://github.com/postgres/postgres/commit/7aea8e4f2daa4b39ca9d1309a0c4aadb0f7ed81b
[2]: https://github.com/postgres/postgres/commit/8f9fe6edce358f7904e0db119416b4d1080a83aa
[3]: https://github.com/postgres/postgres/commit/cd30728fb2ed7c367d545fc14ab850b5fa2a4850

On Fri, Dec 11, 2015 at 12:55 PM, Caleb Welton <cwelton@pivotal.io> wrote:


Attachments:

parallel.diff (text/plain; charset=US-ASCII)
*** gen_protransform.sql	2015-12-11 14:36:25.000000000 -0800
--- gen_proparallel.sql	2015-12-11 14:35:41.000000000 -0800
***************
*** 14,19 ****
--- 14,23 ----
                        ELSE '' END 
    || CASE WHEN proisstrict THEN ' STRICT' ELSE '' END 
    || CASE WHEN proleakproof THEN ' LEAKPROOF' ELSE '' END
+   || CASE proparallel WHEN 's' THEN ' PARALLEL SAFE'
+                       WHEN 'r' THEN ' PARALLEL RESTRICTED'
+                       WHEN 'u' THEN '' -- PARALLEL UNSAFE is DEFAULT
+                       ELSE '' END
    || CASE WHEN (procost != 1   and lanname = 'internal') OR
                 (procost != 100 and lanname = 'sql')
            THEN ' COST ' 
gen_start.sql (application/octet-stream)
transform.diff (text/plain; charset=US-ASCII)
*** gen_leakproof.sql	2015-12-11 14:36:09.000000000 -0800
--- gen_protransform.sql	2015-12-11 14:36:25.000000000 -0800
***************
*** 72,77 ****
      AND prorettype != 'anyenum'::regtype   /* Enum is special */
      AND 'anyenum'::regtype not in (SELECT unnest(proargtypes))
      AND not proiswindow                    /* Window Functions */
  order by p.prolang, p.oid
  ;
- 
--- 72,78 ----
      AND prorettype != 'anyenum'::regtype   /* Enum is special */
      AND 'anyenum'::regtype not in (SELECT unnest(proargtypes))
      AND not proiswindow                    /* Window Functions */
+     AND protransform = 0                   /* Transforms */
+     AND p.oid not in (SELECT protransform from pg_proc)
  order by p.prolang, p.oid
  ;
leakproof.diff (text/plain; charset=US-ASCII)
*** gen_start.sql	2015-12-11 14:35:53.000000000 -0800
--- gen_leakproof.sql	2015-12-11 14:36:09.000000000 -0800
***************
*** 13,18 ****
--- 13,19 ----
                        WHEN 'v' THEN '' -- VOLATILE is DEFAULT
                        ELSE '' END 
    || CASE WHEN proisstrict THEN ' STRICT' ELSE '' END 
+   || CASE WHEN proleakproof THEN ' LEAKPROOF' ELSE '' END
    || CASE WHEN (procost != 1   and lanname = 'internal') OR
                 (procost != 100 and lanname = 'sql')
            THEN ' COST ' 
#9 Mark Dilger
hornschnorter@gmail.com
In reply to: Tom Lane (#7)

On Dec 11, 2015, at 2:40 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Mark Dilger <hornschnorter@gmail.com> writes:

On Dec 11, 2015, at 1:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
That's an interesting proposal. It would mean that the catalog files
stay at more or less their current semantic level (direct representations
of bootstrap catalog contents), but it does sound like a more attractive
way to perform complex edits than writing Emacs macros ;-).

I would be happy to work on this, if there is much chance of the community
accepting a patch. Do you think replacing the numeric Oids for functions,
operators, opclasses and such in the source files with their names would
be ok, with the SQL converting those to Oids in the output?

Huh? Those files are the definition of that mapping, no? Isn't what
you're proposing circular?

No, there are far more references to Oids than there are definitions of them.

For example, the line in pg_operator.h:

DATA(insert OID = 15 ( "=" PGNSP PGUID b t t 23 20 16 416 36 int48eq eqsel eqjoinsel ));

defines 15 as the oid for the equals operator for (int8,int4) returning bool, but the
fact that 23 is the Oid for int4, 20 is the Oid for int8, and 16 is the Oid for bool
is already defined elsewhere (in pg_type.h) and need not be duplicated here.

I'm just proposing that we don't keep specifying things by number everywhere.
Once you've established the Oid for something (operator, type, function) you
should use the name everywhere else.
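
For what it's worth, the name-to-OID resolution this implies already exists
at the SQL level in the reg* alias types, so an SQL-driven pipeline gets it
for free; for example (the OIDs here are the ones cited above):

```sql
-- Names resolve to the same OIDs the DATA lines spell out by hand.
SELECT 'int4'::regtype::oid;                     -- 23
SELECT 'int8'::regtype::oid;                     -- 20
SELECT 'int48eq(int4,int8)'::regprocedure::oid;  -- the pg_proc entry behind "="
```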

mark


#10 Caleb Welton
cwelton@pivotal.io
In reply to: Tom Lane (#5)

The current semantic level is pretty low level, somewhat cumbersome, and
requires filling in values that, most of the time, the system already has a
pretty good idea how to default.

Compare:

CREATE FUNCTION lo_export(oid, text) RETURNS integer LANGUAGE internal
STRICT AS 'lo_export' WITH (OID=765);
DATA(insert OID = 765 ( lo_export PGNSP PGUID 12 1 0 0 0 f f f f t f v u
2 0 23 "26 25" _null_ _null_ _null_ _null_ _null_ lo_export _null_ _null_
_null_ ));

In the first one someone has indicated:
1. a function name,
2. two parameter type names
3. a return type
4. a language
5. null handling
6. a symbol
7. an oid

In the second case 30 separate items have been indicated, and yet both of
them will generate identical end results within the catalog.

The former is also largely immune to needing modification when the catalog
structure changes:
- adding proleakproof? No change needed, default value is correct
- adding protransform? No change needed, not relevant
- adding proparallel? No change needed, default value is correct
- adding procost? No change needed, default value is correct

On Fri, Dec 11, 2015 at 1:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:


#11 Caleb Welton
cwelton@pivotal.io
In reply to: Mark Dilger (#9)

Yes, that alone without any other changes would be a marked improvement and
could be implemented in many places; pg_operator is a good example.

... but there is some circularity especially with respect to type
definitions and the functions that define those types. If you changed the
definition of prorettype into a regtype then bootstrap would try to look up
the type before the pg_type entry exists and throw a fit. That's handled
in SQL via shell types. If we wanted bootstrap to be able to handle this
then we'd have to make two passes of pg_type, the first to create the
shells and the second to handle populating the serialization functions.
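A toy sketch of that two-pass idea (all data and field names here are made up
for illustration; the real generator would be Perl operating on the catalog
sources, not Python):

```python
# Hypothetical two-pass resolution of the pg_type/pg_proc circularity,
# mirroring what shell types do in SQL. Data is illustrative only.

types = {"bool": {"oid": 16, "typinput": "boolin"}}      # pg_type source rows
procs = {"boolin": {"oid": 1242, "prorettype": "bool"}}  # pg_proc source rows

# Pass 1: emit "shell" entries so every type OID is known before any lookup.
type_oids = {name: row["oid"] for name, row in types.items()}
proc_oids = {name: row["oid"] for name, row in procs.items()}

# Pass 2: function return types can now be resolved by name...
resolved_procs = {
    name: {"oid": row["oid"], "prorettype": type_oids[row["prorettype"]]}
    for name, row in procs.items()
}
# ...and the type's I/O functions can be filled in afterwards.
resolved_types = {
    name: {"oid": row["oid"], "typinput": proc_oids[row["typinput"]]}
    for name, row in types.items()
}
```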

Unfortunately types and functions tend to be the more volatile areas of the
catalog so this particular circularity is particularly vexing.

On Fri, Dec 11, 2015 at 2:53 PM, Mark Dilger <hornschnorter@gmail.com>
wrote:

On Dec 11, 2015, at 2:40 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Mark Dilger <hornschnorter@gmail.com> writes:

On Dec 11, 2015, at 1:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
That's an interesting proposal. It would mean that the catalog files
stay at more or less their current semantic level (direct representations
of bootstrap catalog contents), but it does sound like a more attractive
way to perform complex edits than writing Emacs macros ;-).

I would be happy to work on this, if there is much chance of the community
accepting a patch. Do you think replacing the numeric Oids for functions,
operators, opclasses and such in the source files with their names would
be ok, with the SQL converting those to Oids in the output?

Huh? Those files are the definition of that mapping, no? Isn't what
you're proposing circular?

No, there are far more references to Oids than there are definitions of
them.

For example, the line in pg_operator.h:

DATA(insert OID = 15 ( "=" PGNSP PGUID b t t 23 20 16 416 36
int48eq eqsel eqjoinsel ));

defines 15 as the oid for the equals operator for (int8,int4) returning
bool, but the
fact that 23 is the Oid for int4, 20 is the Oid for int8, and 16 is the
Oid for bool
is already defined elsewhere (in pg_type.h) and need not be duplicated
here.

I'm just proposing that we don't keep specifying things by number
everywhere.
Once you've established the Oid for something (operator, type, function)
you
should use the name everywhere else.

mark

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Mark Dilger (#9)

Mark Dilger <hornschnorter@gmail.com> writes:

On Dec 11, 2015, at 2:40 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Huh? Those files are the definition of that mapping, no? Isn't what
you're proposing circular?

No, there are far more references to Oids than there are definitions of them.

Well, you're still not being very clear, but I *think* what you're
proposing is to put a lot more smarts into the script that converts
the master source files into .bki format. That is, we might have
"=(int8,int4)" in an entry in the master source file for pg_amop, but
the script would look up that entry using the source data for pg_type
and pg_operator, and then emit a simple numeric OID into the .bki file.
(Presumably, it would know to do this because we'd redefine the
pg_amop.amopopr column as of regoperator type not plain OID.)

Yeah, that could work, though I'd be a bit concerned about the complexity
and speed of the script. Still, one doesn't usually rebuild postgres.bki
many times a day, so speed might not be a big problem.

This seems more or less orthogonal to the question of whether to get rid
of the DATA() lines in favor of a COPY-friendly data format. I'd suggest
treating those as separate patches.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Caleb Welton (#11)

Caleb Welton <cwelton@pivotal.io> writes:

... but there is some circularity especially with respect to type
definitions and the functions that define those types. If you changed the
definition of prorettype into a regtype then bootstrap would try to look up
the type before the pg_type entry exists and throw a fit. That's handled
in SQL via shell types. If we wanted bootstrap to be able to handle this
then we'd have to make two passes of pg_type, the first to create the
shells and the second to handle populating the serialization functions.

I think what Mark is proposing is to do the lookups while preparing the
.bki file, which would eliminate the circularity ... at the cost of having
to, essentially, reimplement regprocedure_in and friends in Perl.

If we push hard on doing the other thing that you're proposing, which is
to take as much as possible out of the pure bootstrap-data phase, then
maybe it wouldn't be worth the work to do that. Not sure.

On the other hand, I'm not very much in love with the thought of having
two different notations for "core" and "not so core" built-in function
creation. There's something to be said for keeping all the data in one
format. If we push on making the .bki creation script smarter, then in
addition to the name lookup facilities Mark envisions, we could have
things like default column values. That would take us a long way toward
the same ease-of-use as full SQL definitions. We'd still be lacking
some error checks that the SQL commands could perform; but we've
traditionally used sanity checks in the regression tests to do
cross-checking that covers more or less those same bases.

regards, tom lane

#14Mark Dilger
hornschnorter@gmail.com
In reply to: Tom Lane (#12)

On Dec 11, 2015, at 3:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Mark Dilger <hornschnorter@gmail.com> writes:

On Dec 11, 2015, at 2:40 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Huh? Those files are the definition of that mapping, no? Isn't what
you're proposing circular?

No, there are far more references to Oids than there are definitions of them.

Well, you're still not being very clear, but I *think* what you're
proposing is to put a lot more smarts into the script that converts
the master source files into .bki format. That is, we might have
"=(int8,int4)" in an entry in the master source file for pg_amop, but
the script would look up that entry using the source data for pg_type
and pg_operator, and then emit a simple numeric OID into the .bki file.
(Presumably, it would know to do this because we'd redefine the
pg_amop.amopopr column as of regoperator type not plain OID.)

Yeah, that could work, though I'd be a bit concerned about the complexity
and speed of the script. Still, one doesn't usually rebuild postgres.bki
many times a day, so speed might not be a big problem.

I am proposing that each of the catalog headers that currently has DATA
lines instead have a COPY loadable file that contains the same information.
So, for pg_type.h, there would be a pg_type.dat file. All the DATA lines
would be pulled out of pg_type.h and a corresponding tab delimited row
would be written to pg_type.dat. Henceforth, if you cloned the git repository,
you'd find no DATA lines in pg_type.h, but would find a pg_type.dat file
in the src/include/catalog directory. Likewise for the other header files.

There would be some script, SQL or perl or whatever, that would convert
these .dat files into the .bki file.

Now, if we know that pg_type.dat will be processed before pg_proc.dat,
we can replace all the Oids representing datatypes in pg_proc.dat with the
names for those types, given that we already have a name <=> oid
mapping for types.

Likewise, if we know that pg_proc.dat will be processed before pg_operator.dat,
we can specify both functions and datatypes by name rather than by Oid
in that file, making it much easier to read. By the time pg_operator.dat is
read, pg_type.dat and pg_proc.dat will already have been read and processed,
so there shouldn't be ambiguity.

By the time pg_amop.dat is processed, the operators, procs, datatypes,
opfamilies and so forth would already be known. The example I gave
upthread would be easy to parse:

amopfamily amoplefttype amoprighttype amopstrategy amoppurpose amopopr amopmethod amopsortfamily
integer_ops int2 int2 1 search "<" btree 0
integer_ops int2 int2 2 search "<=" btree 0
integer_ops int2 int2 3 search "=" btree 0
integer_ops int2 int2 4 search ">=" btree 0
integer_ops int2 int2 5 search ">" btree 0

And if I came along and defined a new datatype, int384, I could add rows to
this file much more easily, as:

amopfamily amoplefttype amoprighttype amopstrategy amoppurpose amopopr amopmethod amopsortfamily
integer_ops int384 int384 1 search "<" btree 0
integer_ops int384 int384 2 search "<=" btree 0
integer_ops int384 int384 3 search "=" btree 0
integer_ops int384 int384 4 search ">=" btree 0
integer_ops int384 int384 5 search ">" btree 0

I don't see how this creates all that much complication, and I clearly see
how it makes files like pg_operator.{h,dat} and pg_amop.{h,dat} easier to read.
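As a rough illustration of the substitution step (made-up rows and maps,
Python standing in for whatever the .bki generator would actually use --
not the real genbki machinery):

```python
# Build name -> OID maps from already-processed catalogs, then rewrite
# name references in a later catalog's rows into numeric OIDs for the
# .bki file. All data here is illustrative.

pg_type = {"int2": 21, "int4": 23, "int8": 20, "bool": 16}
pg_proc = {"int48eq": 852}

# A pg_operator-style row using names instead of raw OIDs:
row = {"oprname": "=", "oprleft": "int8", "oprright": "int4",
       "oprresult": "bool", "oprcode": "int48eq"}

def resolve(row, type_map, proc_map):
    """Replace name references with OIDs, as the generator would."""
    out = dict(row)
    for col in ("oprleft", "oprright", "oprresult"):
        out[col] = type_map[out[col]]
    out["oprcode"] = proc_map[out["oprcode"]]
    return out

bki_row = resolve(row, pg_type, pg_proc)
# bki_row now carries plain numeric OIDs: 20, 23, 16, 852
```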

mark

#15Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#13)

On 2015-12-11 18:12:16 -0500, Tom Lane wrote:

I think what Mark is proposing is to do the lookups while preparing the
.bki file, which would eliminate the circularity ... at the cost of having
to, essentially, reimplement regprocedure_in and friends in Perl.

FWIW, I did that when this came up last. Rather interesting, because it
leads to rather noticeable speedups - currently initdb spends a
significant amount of its time doing regproc lookups. Especially
interesting because at that stage we're largely not using indexes yet, IIRC.

Andres

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Mark Dilger (#14)

Mark Dilger <hornschnorter@gmail.com> writes:

Now, if we know that pg_type.dat will be processed before pg_proc.dat,
we can replace all the Oids representing datatypes in pg_proc.dat with the
names for those types, given that we already have a name <=> oid
mapping for types.

I don't think this is quite as simple as you paint it. How can you
process pg_type.dat first, when it contains pg_proc references? Doing
pg_proc first is no better, because it contains pg_type references.

I believe it's soluble, but it's going to take something more like
loading up all the data at once and then doing lookups as we write
out the .bki entries for each catalog. Fortunately, the volume of
bootstrap data is small enough that that won't be a problem on any
machine capable of running modern Postgres ...

regards, tom lane

#17Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#16)

On 2015-12-11 19:26:38 -0500, Tom Lane wrote:

I believe it's soluble, but it's going to take something more like
loading up all the data at once and then doing lookups as we write
out the .bki entries for each catalog. Fortunately, the volume of
bootstrap data is small enough that that won't be a problem on any
machine capable of running modern Postgres ...

I think that's exactly the right approach. Just building a few perl
hashes worked well enough, in my prototype of that.

If additionally a few more plain oid fields are converted into reg*
types, the source data fields are easier to understand and the catalogs
get much nicer to query...

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#17)

Andres Freund <andres@anarazel.de> writes:

On 2015-12-11 19:26:38 -0500, Tom Lane wrote:

I believe it's soluble, but it's going to take something more like
loading up all the data at once and then doing lookups as we write
out the .bki entries for each catalog. Fortunately, the volume of
bootstrap data is small enough that that won't be a problem on any
machine capable of running modern Postgres ...

I think that's exactly the right approach. Just building a few perl
hashes worked well enough, in my prototype of that.

Right. I would draw Mark's attention to src/backend/catalog/Catalog.pm
and the things that use that. Presumably all that would have be
rewritten, but the existing code would be a useful starting point
perhaps.

regards, tom lane

#19Mark Dilger
hornschnorter@gmail.com
In reply to: Caleb Welton (#10)

On Dec 11, 2015, at 2:54 PM, Caleb Welton <cwelton@pivotal.io> wrote:

The current semantic level is pretty low level, somewhat cumbersome, and requires filling in values that, most of the time, the system already has a pretty good idea how to default.

Compare:
CREATE FUNCTION lo_export(oid, text) RETURNS integer LANGUAGE internal STRICT AS 'lo_export' WITH (OID=765);

DATA(insert OID = 765 ( lo_export PGNSP PGUID 12 1 0 0 0 f f f f t f v u 2 0 23 "26 25" _null_ _null_ _null_ _null_ _null_ lo_export _null_ _null_ _null_ ));

I would like to hear more about this idea. Are you proposing that we use something
like the above CREATE FUNCTION format to express what is currently being expressed
with DATA statements? That is an interesting idea, though I don't know what exactly
that would look like. If you want to forward this idea, I'd be eager to hear your thoughts.
If not, I'll try to make progress with my idea of tab delimited files and such (or really,
Alvaro's idea of csv files that I only slightly corrupted).

mark

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Mark Dilger (#19)

Mark Dilger <hornschnorter@gmail.com> writes:

On Dec 11, 2015, at 2:54 PM, Caleb Welton <cwelton@pivotal.io> wrote:
Compare:
CREATE FUNCTION lo_export(oid, text) RETURNS integer LANGUAGE internal STRICT AS 'lo_export' WITH (OID=765);

DATA(insert OID = 765 ( lo_export PGNSP PGUID 12 1 0 0 0 f f f f t f v u 2 0 23 "26 25" _null_ _null_ _null_ _null_ _null_ lo_export _null_ _null_ _null_ ));

I would like to hear more about this idea. Are you proposing that we use something
like the above CREATE FUNCTION format to express what is currently being expressed
with DATA statements?

Yes, that sort of idea has been kicked around some already, see the
archives.

That is an interesting idea, though I don't know what exactly
that would look like. If you want to forward this idea, I'd be eager to hear your thoughts.
If not, I'll try to make progress with my idea of tab delimited files and such (or really,
Alvaro's idea of csv files that I only slightly corrupted).

Personally I would like to see both approaches explored. Installing as
much as we can via SQL commands is attractive for a number of reasons;
but there is going to be an irreducible minimum amount of stuff that
has to be inserted by something close to the current bootstrapping
process. (And I'm not convinced that that "minimum amount" is going
to be very small...) So it's not impossible that we'd end up accepting
*both* types of patches, one to do more in the post-bootstrap SQL world
and one to make the bootstrap data notation less cumbersome. In any
case it would be useful to push both approaches forward some more before
we make any decisions between them.

BTW, there's another thing I'd like to see improved in this area, which is
a problem already but will get a lot worse if we push more work into the
post-bootstrap phase of initdb. That is that the post-bootstrap phase is
both inefficient and impossible to debug. If you've ever had a failure
there, you'll have seen that the backend spits out an entire SQL script
and says there's an error in it somewhere; that's because it gets the
whole per-stage script as one submission. (Try introducing a syntax error
somewhere in information_schema.sql, and you'll see what I mean.)
Breaking the stage scripts down further would help, but that is
unattractive because each one requires a fresh backend startup/shutdown,
including a full checkpoint. I'd like to see things rejiggered so that
there's only one post-bootstrap standalone backend session that performs
all the steps, but initdb feeds it just one SQL command at a time so that
errors are better localized. That should both speed up initdb noticeably
and make debugging easier.

regards, tom lane

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#20)
1 attachment(s)
Using a single standalone-backend run in initdb (was Re: Bootstrap DATA is a pita)

I wrote:

BTW, there's another thing I'd like to see improved in this area, which is
a problem already but will get a lot worse if we push more work into the
post-bootstrap phase of initdb. That is that the post-bootstrap phase is
both inefficient and impossible to debug. If you've ever had a failure
there, you'll have seen that the backend spits out an entire SQL script
and says there's an error in it somewhere; that's because it gets the
whole per-stage script as one submission. (Try introducing a syntax error
somewhere in information_schema.sql, and you'll see what I mean.)
Breaking the stage scripts down further would help, but that is
unattractive because each one requires a fresh backend startup/shutdown,
including a full checkpoint. I'd like to see things rejiggered so that
there's only one post-bootstrap standalone backend session that performs
all the steps, but initdb feeds it just one SQL command at a time so that
errors are better localized. That should both speed up initdb noticeably
and make debugging easier.

I thought this sounded like a nice lazy-Saturday project, so I started
poking at it, and attached is a WIP patch. The core issue that has to
be dealt with is that standalone-backend mode currently has just two
rules for deciding when to stop collecting input and execute the command
buffer, and they both suck:

1. By default, execute after every newline. (Actually, you can quote
a newline with a backslash, but we don't use that ability anywhere.)

2. With -j, slurp the entire input until EOF, and execute it as one
giant multicommand string.

We're doing #2 to handle information_schema.sql and the other large
SQL scripts that initdb runs, which is why the response to an error in
those scripts is so yucky.

After some experimentation, I came up with the idea of executing any
time that a semicolon followed by two newlines is seen. This nicely
breaks up input like information_schema.sql. There are probably some
residual places where more than one command is executed in a single
string, but we could fix that with some more newlines. Obviously,
this rule is capable of being fooled if you have a newline followed by
a blank line in a comment or quoted literal --- but it turns out that
no such case exists anywhere in initdb's data.
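In rough Python terms, the proposed boundary rule amounts to the sketch
below (the actual patch does this incrementally in C inside
InteractiveBackend; and as noted above, a blank line inside a comment or
literal right after a semicolon would fool it):

```python
# Sketch of the proposed rule: a semicolon followed by two newlines
# ends a command. Splits a whole script rather than reading incrementally.

def split_commands(script: str) -> list[str]:
    """Split on ';\n\n', keeping the semicolon with each command."""
    parts = script.split(";\n\n")
    cmds = [p + ";" for p in parts[:-1] if p.strip()]
    if parts[-1].strip():
        cmds.append(parts[-1])
    return cmds

script = ("DELETE FROM pg_depend;\n\n"
          "VACUUM pg_depend;\n\n"
          "CREATE VIEW v AS\n  SELECT 1;\n")
cmds = split_commands(script)
# The multi-line CREATE VIEW stays intact: its internal newline is not
# preceded by a semicolon-plus-blank-line.
```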

I'm not particularly wedded to this rule. In principle we could go so
far as to import psql's code that parses commands and figures out which
semicolons are command terminators --- but that is a pretty large chunk
of code, and I think it'd really be overkill considering that initdb
deals only with fixed input scripts. But if anyone has another simple
rule for breaking SQL into commands, we can certainly discuss
alternatives.

Anyway, the attached patch tweaks postgres.c to follow that rule instead
of slurp-to-EOF when -j is given. I doubt that being non-backwards-
compatible is a problem here; in fact, I'm tempted to rip out the -j
switch altogether and just have standalone mode always parse input the
same way. Does anyone know of people using standalone mode other than
for initdb?

The other part of the patch modifies initdb to do all its post-bootstrap
steps using a single standalone backend session. I had to remove the
code that currently prints out progress markers for individual phases
of that processing, so that now you get output that looks like

creating directory /home/postgres/testversion/data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
creating template1 database in /home/postgres/testversion/data/base/1 ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

Since initdb is just printing to the backend in an open-loop fashion,
it doesn't really know whether each command succeeds; in fact it is
usually pretty far ahead of the backend, until it does pclose() which
waits for the subprocess to exit. So we can't readily keep the old
progress markers. I don't think we'll miss them though. The whole
"post-bootstrap initialization" step only takes a second or so on my
main dev machine, so breaking down the progress more finely isn't that
useful anymore.

I had to change ;\n to ;\n\n in a few places in initdb's internal
scripts, to ensure that VACUUM commands were executed by themselves
(otherwise you get "VACUUM can't run in a transaction block" type
failures). I wasn't very thorough about that though, pending a
decision on exactly what the new command-boundary rule will be.

The upshot of these changes is that initdb runs about 10% faster overall
(more in -N mode), which is a useful savings. Also, the response to a
syntax error in information_schema.sql now looks like this:

creating template1 database in /home/postgres/testversion/data/base/1 ... ok
performing post-bootstrap initialization ... FATAL: column "routzine_schema" does not exist at character 243
HINT: Perhaps you meant to reference the column "routine_privileges.routine_schema".
STATEMENT:
/*
* 5.42
* ROLE_ROUTINE_GRANTS view
*/

CREATE VIEW role_routine_grants AS
SELECT grantor,
grantee,
specific_catalog,
specific_schema,
specific_name,
routine_catalog,
routzine_schema,
routine_name,
privilege_type,
is_grantable
FROM routine_privileges
WHERE grantor IN (SELECT role_name FROM enabled_roles)
OR grantee IN (SELECT role_name FROM enabled_roles);

child process exited with exit code 1
initdb: removing data directory "/home/postgres/testversion/data"

which is a *huge* improvement over what you got before.

What remains to be done before this'd be committable is updating
the documentation around the -j switch (or removing it altogether),
and going through initdb and its input scripts to ensure there's
a double newline after every nontrivial SQL command. I don't see
much point in finishing that detail work until we have consensus on
whether to use this or another command-boundary rule.

Thoughts?

regards, tom lane

Attachments:

initdb-improvement.patch (text/x-diff, charset=us-ascii)
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 1dc2eb0..1cb0cd9 100644
*** a/src/backend/tcop/postgres.c
--- b/src/backend/tcop/postgres.c
*************** static int
*** 219,226 ****
  InteractiveBackend(StringInfo inBuf)
  {
  	int			c;				/* character read from getc() */
- 	bool		end = false;	/* end-of-input flag */
- 	bool		backslashSeen = false;	/* have we seen a \ ? */
  
  	/*
  	 * display a prompt and obtain input from the user
--- 219,224 ----
*************** InteractiveBackend(StringInfo inBuf)
*** 230,284 ****
  
  	resetStringInfo(inBuf);
  
! 	if (UseNewLine)
  	{
! 		/*
! 		 * if we are using \n as a delimiter, then read characters until the
! 		 * \n.
! 		 */
! 		while ((c = interactive_getc()) != EOF)
  		{
! 			if (c == '\n')
  			{
! 				if (backslashSeen)
  				{
  					/* discard backslash from inBuf */
  					inBuf->data[--inBuf->len] = '\0';
- 					backslashSeen = false;
  					continue;
  				}
  				else
  				{
! 					/* keep the newline character */
  					appendStringInfoChar(inBuf, '\n');
  					break;
  				}
  			}
- 			else if (c == '\\')
- 				backslashSeen = true;
  			else
! 				backslashSeen = false;
! 
! 			appendStringInfoChar(inBuf, (char) c);
  		}
  
! 		if (c == EOF)
! 			end = true;
! 	}
! 	else
! 	{
! 		/*
! 		 * otherwise read characters until EOF.
! 		 */
! 		while ((c = interactive_getc()) != EOF)
! 			appendStringInfoChar(inBuf, (char) c);
! 
! 		/* No input before EOF signal means time to quit. */
! 		if (inBuf->len == 0)
! 			end = true;
  	}
  
! 	if (end)
  		return EOF;
  
  	/*
--- 228,282 ----
  
  	resetStringInfo(inBuf);
  
! 	/*
! 	 * Read characters until EOF or the appropriate delimiter is seen.
! 	 */
! 	while ((c = interactive_getc()) != EOF)
  	{
! 		if (c == '\n')
  		{
! 			if (UseNewLine)
  			{
! 				/*
! 				 * In newline mode, newline is delimiter unless preceded by
! 				 * backslash.
! 				 */
! 				if (inBuf->len > 0 &&
! 					inBuf->data[inBuf->len - 1] == '\\')
  				{
  					/* discard backslash from inBuf */
  					inBuf->data[--inBuf->len] = '\0';
  					continue;
  				}
  				else
  				{
! 					/* keep the newline character, but end the command */
  					appendStringInfoChar(inBuf, '\n');
  					break;
  				}
  			}
  			else
! 			{
! 				/*
! 				 * In -j mode, semicolon followed by two newlines ends the
! 				 * command; otherwise treat newline as regular character.
! 				 */
! 				if (inBuf->len > 1 &&
! 					inBuf->data[inBuf->len - 1] == '\n' &&
! 					inBuf->data[inBuf->len - 2] == ';')
! 				{
! 					/* might as well drop the second newline */
! 					break;
! 				}
! 			}
  		}
  
! 		/* Not newline, or newline treated as regular character */
! 		appendStringInfoChar(inBuf, (char) c);
  	}
  
! 	/* No input before EOF signal means time to quit. */
! 	if (c == EOF && inBuf->len == 0)
  		return EOF;
  
  	/*
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index feeff9e..b056639 100644
*** a/src/bin/initdb/initdb.c
--- b/src/bin/initdb/initdb.c
*************** static void set_null_conf(void);
*** 244,264 ****
  static void test_config_settings(void);
  static void setup_config(void);
  static void bootstrap_template1(void);
! static void setup_auth(void);
! static void get_set_pwd(void);
! static void setup_depend(void);
! static void setup_sysviews(void);
! static void setup_description(void);
! static void setup_collation(void);
! static void setup_conversion(void);
! static void setup_dictionary(void);
! static void setup_privileges(void);
  static void set_info_version(void);
! static void setup_schema(void);
! static void load_plpgsql(void);
! static void vacuum_db(void);
! static void make_template0(void);
! static void make_postgres(void);
  static void fsync_pgdata(void);
  static void trapsig(int signum);
  static void check_ok(void);
--- 244,264 ----
  static void test_config_settings(void);
  static void setup_config(void);
  static void bootstrap_template1(void);
! static void setup_auth(FILE *cmdfd);
! static void get_set_pwd(FILE *cmdfd);
! static void setup_depend(FILE *cmdfd);
! static void setup_sysviews(FILE *cmdfd);
! static void setup_description(FILE *cmdfd);
! static void setup_collation(FILE *cmdfd);
! static void setup_conversion(FILE *cmdfd);
! static void setup_dictionary(FILE *cmdfd);
! static void setup_privileges(FILE *cmdfd);
  static void set_info_version(void);
! static void setup_schema(FILE *cmdfd);
! static void load_plpgsql(FILE *cmdfd);
! static void vacuum_db(FILE *cmdfd);
! static void make_template0(FILE *cmdfd);
! static void make_postgres(FILE *cmdfd);
  static void fsync_pgdata(void);
  static void trapsig(int signum);
  static void check_ok(void);
*************** bootstrap_template1(void)
*** 1545,1553 ****
   * set up the shadow password table
   */
  static void
! setup_auth(void)
  {
- 	PG_CMD_DECL;
  	const char **line;
  	static const char *pg_authid_setup[] = {
  		/*
--- 1545,1552 ----
   * set up the shadow password table
   */
  static void
! setup_auth(FILE *cmdfd)
  {
  	const char **line;
  	static const char *pg_authid_setup[] = {
  		/*
*************** setup_auth(void)
*** 1558,1589 ****
  		NULL
  	};
  
- 	fputs(_("initializing pg_authid ... "), stdout);
- 	fflush(stdout);
- 
- 	snprintf(cmd, sizeof(cmd),
- 			 "\"%s\" %s template1 >%s",
- 			 backend_exec, backend_options,
- 			 DEVNULL);
- 
- 	PG_CMD_OPEN;
- 
  	for (line = pg_authid_setup; *line != NULL; line++)
  		PG_CMD_PUTS(*line);
- 
- 	PG_CMD_CLOSE;
- 
- 	check_ok();
  }
  
  /*
   * get the superuser password if required, and call postgres to set it
   */
  static void
! get_set_pwd(void)
  {
- 	PG_CMD_DECL;
- 
  	char	   *pwd1,
  			   *pwd2;
  
--- 1557,1572 ----
  		NULL
  	};
  
  	for (line = pg_authid_setup; *line != NULL; line++)
  		PG_CMD_PUTS(*line);
  }
  
  /*
   * get the superuser password if required, and call postgres to set it
   */
  static void
! get_set_pwd(FILE *cmdfd)
  {
  	char	   *pwd1,
  			   *pwd2;
  
*************** get_set_pwd(void)
*** 1640,1673 ****
  		pwd1 = pg_strdup(pwdbuf);
  
  	}
- 	printf(_("setting password ... "));
- 	fflush(stdout);
- 
- 	snprintf(cmd, sizeof(cmd),
- 			 "\"%s\" %s template1 >%s",
- 			 backend_exec, backend_options,
- 			 DEVNULL);
- 
- 	PG_CMD_OPEN;
  
  	PG_CMD_PRINTF2("ALTER USER \"%s\" WITH PASSWORD E'%s';\n",
  				   username, escape_quotes(pwd1));
  
- 	/* MM: pwd1 is no longer needed, freeing it */
  	free(pwd1);
- 
- 	PG_CMD_CLOSE;
- 
- 	check_ok();
  }
  
  /*
   * set up pg_depend
   */
  static void
! setup_depend(void)
  {
- 	PG_CMD_DECL;
  	const char **line;
  	static const char *pg_depend_setup[] = {
  		/*
--- 1623,1641 ----
  		pwd1 = pg_strdup(pwdbuf);
  
  	}
  
  	PG_CMD_PRINTF2("ALTER USER \"%s\" WITH PASSWORD E'%s';\n",
  				   username, escape_quotes(pwd1));
  
  	free(pwd1);
  }
  
  /*
   * set up pg_depend
   */
  static void
! setup_depend(FILE *cmdfd)
  {
  	const char **line;
  	static const char *pg_depend_setup[] = {
  		/*
*************** setup_depend(void)
*** 1684,1693 ****
  		 * First delete any already-made entries; PINs override all else, and
  		 * must be the only entries for their objects.
  		 */
! 		"DELETE FROM pg_depend;\n",
! 		"VACUUM pg_depend;\n",
! 		"DELETE FROM pg_shdepend;\n",
! 		"VACUUM pg_shdepend;\n",
  
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
  		" FROM pg_class;\n",
--- 1652,1661 ----
  		 * First delete any already-made entries; PINs override all else, and
  		 * must be the only entries for their objects.
  		 */
! 		"DELETE FROM pg_depend;\n\n",
! 		"VACUUM pg_depend;\n\n",
! 		"DELETE FROM pg_shdepend;\n\n",
! 		"VACUUM pg_shdepend;\n\n",
  
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
  		" FROM pg_class;\n",
*************** setup_depend(void)
*** 1740,1819 ****
  		NULL
  	};
  
- 	fputs(_("initializing dependencies ... "), stdout);
- 	fflush(stdout);
- 
- 	snprintf(cmd, sizeof(cmd),
- 			 "\"%s\" %s template1 >%s",
- 			 backend_exec, backend_options,
- 			 DEVNULL);
- 
- 	PG_CMD_OPEN;
- 
  	for (line = pg_depend_setup; *line != NULL; line++)
  		PG_CMD_PUTS(*line);
- 
- 	PG_CMD_CLOSE;
- 
- 	check_ok();
  }
  
  /*
   * set up system views
   */
  static void
! setup_sysviews(void)
  {
- 	PG_CMD_DECL;
  	char	  **line;
  	char	  **sysviews_setup;
  
- 	fputs(_("creating system views ... "), stdout);
- 	fflush(stdout);
- 
  	sysviews_setup = readfile(system_views_file);
  
- 	/*
- 	 * We use -j here to avoid backslashing stuff in system_views.sql
- 	 */
- 	snprintf(cmd, sizeof(cmd),
- 			 "\"%s\" %s -j template1 >%s",
- 			 backend_exec, backend_options,
- 			 DEVNULL);
- 
- 	PG_CMD_OPEN;
- 
  	for (line = sysviews_setup; *line != NULL; line++)
  	{
  		PG_CMD_PUTS(*line);
  		free(*line);
  	}
  
- 	PG_CMD_CLOSE;
- 
  	free(sysviews_setup);
- 
- 	check_ok();
  }
  
  /*
   * load description data
   */
  static void
! setup_description(void)
  {
- 	PG_CMD_DECL;
- 
- 	fputs(_("loading system objects' descriptions ... "), stdout);
- 	fflush(stdout);
- 
- 	snprintf(cmd, sizeof(cmd),
- 			 "\"%s\" %s template1 >%s",
- 			 backend_exec, backend_options,
- 			 DEVNULL);
- 
- 	PG_CMD_OPEN;
- 
  	PG_CMD_PUTS("CREATE TEMP TABLE tmp_pg_description ( "
  				"	objoid oid, "
  				"	classname name, "
--- 1708,1743 ----
  		NULL
  	};
  
  	for (line = pg_depend_setup; *line != NULL; line++)
  		PG_CMD_PUTS(*line);
  }
  
  /*
   * set up system views
   */
  static void
! setup_sysviews(FILE *cmdfd)
  {
  	char	  **line;
  	char	  **sysviews_setup;
  
  	sysviews_setup = readfile(system_views_file);
  
  	for (line = sysviews_setup; *line != NULL; line++)
  	{
  		PG_CMD_PUTS(*line);
  		free(*line);
  	}
  
  	free(sysviews_setup);
  }
  
  /*
   * load description data
   */
  static void
! setup_description(FILE *cmdfd)
  {
  	PG_CMD_PUTS("CREATE TEMP TABLE tmp_pg_description ( "
  				"	objoid oid, "
  				"	classname name, "
*************** setup_description(void)
*** 1853,1862 ****
  				"  WHERE opdesc NOT LIKE 'deprecated%' AND "
  				"  NOT EXISTS (SELECT 1 FROM pg_description "
  		  "    WHERE objoid = p_oid AND classoid = 'pg_proc'::regclass);\n");
- 
- 	PG_CMD_CLOSE;
- 
- 	check_ok();
  }
  
  #ifdef HAVE_LOCALE_T
--- 1777,1782 ----
*************** normalize_locale_name(char *new, const c
*** 1899,1905 ****
   * populate pg_collation
   */
  static void
! setup_collation(void)
  {
  #if defined(HAVE_LOCALE_T) && !defined(WIN32)
  	int			i;
--- 1819,1825 ----
   * populate pg_collation
   */
  static void
! setup_collation(FILE *cmdfd)
  {
  #if defined(HAVE_LOCALE_T) && !defined(WIN32)
  	int			i;
*************** setup_collation(void)
*** 1907,1930 ****
  	char		localebuf[NAMEDATALEN]; /* we assume ASCII so this is fine */
  	int			count = 0;
  
- 	PG_CMD_DECL;
- #endif
- 
- 	fputs(_("creating collations ... "), stdout);
- 	fflush(stdout);
- 
- #if defined(HAVE_LOCALE_T) && !defined(WIN32)
- 	snprintf(cmd, sizeof(cmd),
- 			 "\"%s\" %s template1 >%s",
- 			 backend_exec, backend_options,
- 			 DEVNULL);
- 
  	locale_a_handle = popen_check("locale -a", "r");
  	if (!locale_a_handle)
  		return;					/* complaint already printed */
  
- 	PG_CMD_OPEN;
- 
  	PG_CMD_PUTS("CREATE TEMP TABLE tmp_pg_collation ( "
  				"	collname name, "
  				"	locale name, "
--- 1827,1836 ----
*************** setup_collation(void)
*** 2032,2048 ****
  	   "  ORDER BY collname, encoding, (collname = locale) DESC, locale;\n");
  
  	pclose(locale_a_handle);
- 	PG_CMD_CLOSE;
  
- 	check_ok();
  	if (count == 0 && !debug)
  	{
  		printf(_("No usable system locales were found.\n"));
  		printf(_("Use the option \"--debug\" to see details.\n"));
  	}
- #else							/* not HAVE_LOCALE_T && not WIN32 */
- 	printf(_("not supported on this platform\n"));
- 	fflush(stdout);
  #endif   /* not HAVE_LOCALE_T  && not WIN32 */
  }
  
--- 1938,1949 ----
*************** setup_collation(void)
*** 2050,2071 ****
   * load conversion functions
   */
  static void
! setup_conversion(void)
  {
- 	PG_CMD_DECL;
  	char	  **line;
  	char	  **conv_lines;
  
- 	fputs(_("creating conversions ... "), stdout);
- 	fflush(stdout);
- 
- 	snprintf(cmd, sizeof(cmd),
- 			 "\"%s\" %s template1 >%s",
- 			 backend_exec, backend_options,
- 			 DEVNULL);
- 
- 	PG_CMD_OPEN;
- 
  	conv_lines = readfile(conversion_file);
  	for (line = conv_lines; *line != NULL; line++)
  	{
--- 1951,1961 ----
   * load conversion functions
   */
  static void
! setup_conversion(FILE *cmdfd)
  {
  	char	  **line;
  	char	  **conv_lines;
  
  	conv_lines = readfile(conversion_file);
  	for (line = conv_lines; *line != NULL; line++)
  	{
*************** setup_conversion(void)
*** 2075,2109 ****
  	}
  
  	free(conv_lines);
- 
- 	PG_CMD_CLOSE;
- 
- 	check_ok();
  }
  
  /*
   * load extra dictionaries (Snowball stemmers)
   */
  static void
! setup_dictionary(void)
  {
- 	PG_CMD_DECL;
  	char	  **line;
  	char	  **conv_lines;
  
- 	fputs(_("creating dictionaries ... "), stdout);
- 	fflush(stdout);
- 
- 	/*
- 	 * We use -j here to avoid backslashing stuff
- 	 */
- 	snprintf(cmd, sizeof(cmd),
- 			 "\"%s\" %s -j template1 >%s",
- 			 backend_exec, backend_options,
- 			 DEVNULL);
- 
- 	PG_CMD_OPEN;
- 
  	conv_lines = readfile(dictionary_file);
  	for (line = conv_lines; *line != NULL; line++)
  	{
--- 1965,1981 ----
  	}
  
  	free(conv_lines);
  }
  
  /*
   * load extra dictionaries (Snowball stemmers)
   */
  static void
! setup_dictionary(FILE *cmdfd)
  {
  	char	  **line;
  	char	  **conv_lines;
  
  	conv_lines = readfile(dictionary_file);
  	for (line = conv_lines; *line != NULL; line++)
  	{
*************** setup_dictionary(void)
*** 2112,2121 ****
  	}
  
  	free(conv_lines);
- 
- 	PG_CMD_CLOSE;
- 
- 	check_ok();
  }
  
  /*
--- 1984,1989 ----
*************** setup_dictionary(void)
*** 2130,2138 ****
   * set (NOT NULL).
   */
  static void
! setup_privileges(void)
  {
- 	PG_CMD_DECL;
  	char	  **line;
  	char	  **priv_lines;
  	static char *privileges_setup[] = {
--- 1998,2005 ----
   * set (NOT NULL).
   */
  static void
! setup_privileges(FILE *cmdfd)
  {
  	char	  **line;
  	char	  **priv_lines;
  	static char *privileges_setup[] = {
*************** setup_privileges(void)
*** 2145,2168 ****
  		NULL
  	};
  
- 	fputs(_("setting privileges on built-in objects ... "), stdout);
- 	fflush(stdout);
- 
- 	snprintf(cmd, sizeof(cmd),
- 			 "\"%s\" %s template1 >%s",
- 			 backend_exec, backend_options,
- 			 DEVNULL);
- 
- 	PG_CMD_OPEN;
- 
  	priv_lines = replace_token(privileges_setup, "$POSTGRES_SUPERUSERNAME",
  							   escape_quotes(username));
  	for (line = priv_lines; *line != NULL; line++)
  		PG_CMD_PUTS(*line);
- 
- 	PG_CMD_CLOSE;
- 
- 	check_ok();
  }
  
  /*
--- 2012,2021 ----
*************** set_info_version(void)
*** 2197,2223 ****
   * load info schema and populate from features file
   */
  static void
! setup_schema(void)
  {
- 	PG_CMD_DECL;
  	char	  **line;
  	char	  **lines;
  
- 	fputs(_("creating information schema ... "), stdout);
- 	fflush(stdout);
- 
  	lines = readfile(info_schema_file);
  
- 	/*
- 	 * We use -j here to avoid backslashing stuff in information_schema.sql
- 	 */
- 	snprintf(cmd, sizeof(cmd),
- 			 "\"%s\" %s -j template1 >%s",
- 			 backend_exec, backend_options,
- 			 DEVNULL);
- 
- 	PG_CMD_OPEN;
- 
  	for (line = lines; *line != NULL; line++)
  	{
  		PG_CMD_PUTS(*line);
--- 2050,2062 ----
   * load info schema and populate from features file
   */
  static void
! setup_schema(FILE *cmdfd)
  {
  	char	  **line;
  	char	  **lines;
  
  	lines = readfile(info_schema_file);
  
  	for (line = lines; *line != NULL; line++)
  	{
  		PG_CMD_PUTS(*line);
*************** setup_schema(void)
*** 2226,2240 ****
  
  	free(lines);
  
- 	PG_CMD_CLOSE;
- 
- 	snprintf(cmd, sizeof(cmd),
- 			 "\"%s\" %s template1 >%s",
- 			 backend_exec, backend_options,
- 			 DEVNULL);
- 
- 	PG_CMD_OPEN;
- 
  	PG_CMD_PRINTF1("UPDATE information_schema.sql_implementation_info "
  				   "  SET character_value = '%s' "
  				   "  WHERE implementation_info_name = 'DBMS VERSION';\n",
--- 2065,2070 ----
*************** setup_schema(void)
*** 2245,2390 ****
  				   "  sub_feature_name, is_supported, comments) "
  				   " FROM E'%s';\n",
  				   escape_quotes(features_file));
- 
- 	PG_CMD_CLOSE;
- 
- 	check_ok();
  }
  
  /*
   * load PL/pgsql server-side language
   */
  static void
! load_plpgsql(void)
  {
- 	PG_CMD_DECL;
- 
- 	fputs(_("loading PL/pgSQL server-side language ... "), stdout);
- 	fflush(stdout);
- 
- 	snprintf(cmd, sizeof(cmd),
- 			 "\"%s\" %s template1 >%s",
- 			 backend_exec, backend_options,
- 			 DEVNULL);
- 
- 	PG_CMD_OPEN;
- 
  	PG_CMD_PUTS("CREATE EXTENSION plpgsql;\n");
- 
- 	PG_CMD_CLOSE;
- 
- 	check_ok();
  }
  
  /*
   * clean everything up in template1
   */
  static void
! vacuum_db(void)
  {
- 	PG_CMD_DECL;
- 
- 	fputs(_("vacuuming database template1 ... "), stdout);
- 	fflush(stdout);
- 
- 	snprintf(cmd, sizeof(cmd),
- 			 "\"%s\" %s template1 >%s",
- 			 backend_exec, backend_options,
- 			 DEVNULL);
- 
- 	PG_CMD_OPEN;
- 
  	/* Run analyze before VACUUM so the statistics are frozen. */
! 	PG_CMD_PUTS("ANALYZE;\nVACUUM FREEZE;\n");
! 
! 	PG_CMD_CLOSE;
! 
! 	check_ok();
  }
  
  /*
   * copy template1 to template0
   */
  static void
! make_template0(void)
  {
- 	PG_CMD_DECL;
  	const char **line;
  	static const char *template0_setup[] = {
! 		"CREATE DATABASE template0 IS_TEMPLATE = true ALLOW_CONNECTIONS = false;\n",
  
  		/*
  		 * We use the OID of template0 to determine lastsysoid
  		 */
  		"UPDATE pg_database SET datlastsysoid = "
  		"    (SELECT oid FROM pg_database "
! 		"    WHERE datname = 'template0');\n",
  
  		/*
  		 * Explicitly revoke public create-schema and create-temp-table
  		 * privileges in template1 and template0; else the latter would be on
  		 * by default
  		 */
! 		"REVOKE CREATE,TEMPORARY ON DATABASE template1 FROM public;\n",
! 		"REVOKE CREATE,TEMPORARY ON DATABASE template0 FROM public;\n",
  
! 		"COMMENT ON DATABASE template0 IS 'unmodifiable empty database';\n",
  
  		/*
  		 * Finally vacuum to clean up dead rows in pg_database
  		 */
! 		"VACUUM FULL pg_database;\n",
  		NULL
  	};
  
- 	fputs(_("copying template1 to template0 ... "), stdout);
- 	fflush(stdout);
- 
- 	snprintf(cmd, sizeof(cmd),
- 			 "\"%s\" %s template1 >%s",
- 			 backend_exec, backend_options,
- 			 DEVNULL);
- 
- 	PG_CMD_OPEN;
- 
  	for (line = template0_setup; *line; line++)
  		PG_CMD_PUTS(*line);
- 
- 	PG_CMD_CLOSE;
- 
- 	check_ok();
  }
  
  /*
   * copy template1 to postgres
   */
  static void
! make_postgres(void)
  {
- 	PG_CMD_DECL;
  	const char **line;
  	static const char *postgres_setup[] = {
! 		"CREATE DATABASE postgres;\n",
! 		"COMMENT ON DATABASE postgres IS 'default administrative connection database';\n",
  		NULL
  	};
  
- 	fputs(_("copying template1 to postgres ... "), stdout);
- 	fflush(stdout);
- 
- 	snprintf(cmd, sizeof(cmd),
- 			 "\"%s\" %s template1 >%s",
- 			 backend_exec, backend_options,
- 			 DEVNULL);
- 
- 	PG_CMD_OPEN;
- 
  	for (line = postgres_setup; *line; line++)
  		PG_CMD_PUTS(*line);
- 
- 	PG_CMD_CLOSE;
- 
- 	check_ok();
  }
  
  /*
--- 2075,2154 ----
  				   "  sub_feature_name, is_supported, comments) "
  				   " FROM E'%s';\n",
  				   escape_quotes(features_file));
  }
  
  /*
   * load PL/pgsql server-side language
   */
  static void
! load_plpgsql(FILE *cmdfd)
  {
  	PG_CMD_PUTS("CREATE EXTENSION plpgsql;\n");
  }
  
  /*
   * clean everything up in template1
   */
  static void
! vacuum_db(FILE *cmdfd)
  {
  	/* Run analyze before VACUUM so the statistics are frozen. */
! 	PG_CMD_PUTS("ANALYZE;\n\nVACUUM FREEZE;\n\n");
  }
  
  /*
   * copy template1 to template0
   */
  static void
! make_template0(FILE *cmdfd)
  {
  	const char **line;
  	static const char *template0_setup[] = {
! 		"CREATE DATABASE template0 IS_TEMPLATE = true ALLOW_CONNECTIONS = false;\n\n",
  
  		/*
  		 * We use the OID of template0 to determine lastsysoid
  		 */
  		"UPDATE pg_database SET datlastsysoid = "
  		"    (SELECT oid FROM pg_database "
! 		"    WHERE datname = 'template0');\n\n",
  
  		/*
  		 * Explicitly revoke public create-schema and create-temp-table
  		 * privileges in template1 and template0; else the latter would be on
  		 * by default
  		 */
! 		"REVOKE CREATE,TEMPORARY ON DATABASE template1 FROM public;\n\n",
! 		"REVOKE CREATE,TEMPORARY ON DATABASE template0 FROM public;\n\n",
  
! 		"COMMENT ON DATABASE template0 IS 'unmodifiable empty database';\n\n",
  
  		/*
  		 * Finally vacuum to clean up dead rows in pg_database
  		 */
! 		"VACUUM FULL pg_database;\n\n",
  		NULL
  	};
  
  	for (line = template0_setup; *line; line++)
  		PG_CMD_PUTS(*line);
  }
  
  /*
   * copy template1 to postgres
   */
  static void
! make_postgres(FILE *cmdfd)
  {
  	const char **line;
  	static const char *postgres_setup[] = {
! 		"CREATE DATABASE postgres;\n\n",
! 		"COMMENT ON DATABASE postgres IS 'default administrative connection database';\n\n",
  		NULL
  	};
  
  	for (line = postgres_setup; *line; line++)
  		PG_CMD_PUTS(*line);
  }
  
  /*
*************** warn_on_mount_point(int error)
*** 3303,3308 ****
--- 3067,3073 ----
  void
  initialize_data_directory(void)
  {
+ 	PG_CMD_DECL;
  	int			i;
  
  	setup_signals();
*************** initialize_data_directory(void)
*** 3343,3377 ****
  	 */
  	write_version_file("base/1");
  
! 	/* Create the stuff we don't need to use bootstrap mode for */
  
! 	setup_auth();
  	if (pwprompt || pwfilename)
! 		get_set_pwd();
  
! 	setup_depend();
  
! 	setup_sysviews();
  
! 	setup_description();
  
! 	setup_collation();
  
! 	setup_conversion();
  
! 	setup_dictionary();
  
! 	setup_privileges();
  
! 	setup_schema();
  
! 	load_plpgsql();
  
! 	vacuum_db();
  
! 	make_template0();
  
! 	make_postgres();
  }
  
  
--- 3108,3158 ----
  	 */
  	write_version_file("base/1");
  
! 	/*
! 	 * Create the stuff we don't need to use bootstrap mode for, using a
! 	 * backend running in simple standalone mode.
! 	 */
! 	fputs(_("performing post-bootstrap initialization ... "), stdout);
! 	fflush(stdout);
  
! 	snprintf(cmd, sizeof(cmd),
! 			 "\"%s\" %s -j template1 >%s",
! 			 backend_exec, backend_options,
! 			 DEVNULL);
! 
! 	PG_CMD_OPEN;
! 
! 	setup_auth(cmdfd);
  	if (pwprompt || pwfilename)
! 		get_set_pwd(cmdfd);
  
! 	setup_depend(cmdfd);
  
! 	setup_sysviews(cmdfd);
  
! 	setup_description(cmdfd);
  
! 	setup_collation(cmdfd);
  
! 	setup_conversion(cmdfd);
  
! 	setup_dictionary(cmdfd);
  
! 	setup_privileges(cmdfd);
  
! 	setup_schema(cmdfd);
  
! 	load_plpgsql(cmdfd);
  
! 	vacuum_db(cmdfd);
  
! 	make_template0(cmdfd);
  
! 	make_postgres(cmdfd);
! 
! 	PG_CMD_CLOSE;
! 
! 	check_ok();
  }
  
  
#22Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#20)

On 2015-12-12 13:28:28 -0500, Tom Lane wrote:

BTW, there's another thing I'd like to see improved in this area, which is
a problem already but will get a lot worse if we push more work into the
post-bootstrap phase of initdb. That is that the post-bootstrap phase is
both inefficient and impossible to debug. If you've ever had a failure
there, you'll have seen that the backend spits out an entire SQL script
and says there's an error in it somewhere; that's because it gets the
whole per-stage script as one submission.

Seen that more than once :(

Breaking the stage scripts down further would help, but that is
unattractive because each one requires a fresh backend startup/shutdown,
including a full checkpoint. I'd like to see things rejiggered so that
there's only one post-bootstrap standalone backend session that performs
all the steps, but initdb feeds it just one SQL command at a time so that
errors are better localized. That should both speed up initdb noticeably
and make debugging easier.

One way to do that would be to not use the single user mode for that
stage. Afair, at that point we could actually just start the cluster
"normally", with the socket pointing somewhere locally (ugh, some path
length issues afoot, maybe allow relative directories? And, uh,
windows), and use psql to do the splitting and everything for us.

Your approach probably has some significant performance benefits, because
it essentially does pipelining. So while my proposal is aesthetically
attractive, I'm afraid it would lead to worse performance. It's probably
actually DOA :(

Andres

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#23Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#21)
Re: Using a single standalone-backend run in initdb (was Re: Bootstrap DATA is a pita)

On 2015-12-12 17:31:49 -0500, Tom Lane wrote:

I thought this sounded like a nice lazy-Saturday project, so I started
poking at it, and attached is a WIP patch.

Not bad, not bad at all.

After some experimentation, I came up with the idea of executing any
time that a semicolon followed by two newlines is seen. ...
but it turns out that no such case exists anywhere in initdb's data.

Not pretty, but hardly any worse than the current situation.

I'm not particularly wedded to this rule. In principle we could go so
far as to import psql's code that parses commands and figures out which
semicolons are command terminators --- but that is a pretty large chunk
of code, and I think it'd really be overkill considering that initdb
deals only with fixed input scripts.

Having that code abstracted somewhere wouldn't be bad though; extension
scripts have a somewhat similar problem.

Anyway, the attached patch tweaks postgres.c to follow that rule instead
of slurp-to-EOF when -j is given. I doubt that being non-backwards-
compatible is a problem here; in fact, I'm tempted to rip out the -j
switch altogether and just have standalone mode always parse input the
same way.

No objection here.

Does anyone know of people using standalone mode other than
for initdb?

Unfortunately yes. There are docker instances around that configure users
and everything using it.

The other part of the patch modifies initdb to do all its post-bootstrap
steps using a single standalone backend session. I had to remove the
code that currently prints out progress markers for individual phases
of that processing, so that now you get output that looks like

That's cool too. Besides processing the .bki files (and there largely
reg*_in), the many restarts are the most expensive part of initdb.

Greetings,

Andres Freund


#24Joe Conway
mail@joeconway.com
In reply to: Tom Lane (#21)
Re: Using a single standalone-backend run in initdb (was Re: Bootstrap DATA is a pita)

On 12/12/2015 02:31 PM, Tom Lane wrote:

I'm not particularly wedded to this rule. In principle we could go so
far as to import psql's code that parses commands and figures out which
semicolons are command terminators --- but that is a pretty large chunk
of code, and I think it'd really be overkill considering that initdb
deals only with fixed input scripts. But if anyone has another simple
rule for breaking SQL into commands, we can certainly discuss
alternatives.

Possibly inadequate, but I wrote a get_one_query() function to grab one
statement at a time from a possibly multi-statement string and it isn't
all that many lines of code:

https://github.com/jconway/pgsynck/blob/master/pgsynck.c

Anyway, the attached patch tweaks postgres.c to follow that rule instead
of slurp-to-EOF when -j is given. I doubt that being non-backwards-
compatible is a problem here; in fact, I'm tempted to rip out the -j
switch altogether and just have standalone mode always parse input the
same way. Does anyone know of people using standalone mode other than
for initdb?

sepgsql uses it for installation, but it does not appear to use -j.
I'm not sure why it is required, but at some point I'd like to dig into that.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development

#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#23)
Re: Using a single standalone-backend run in initdb (was Re: Bootstrap DATA is a pita)

Andres Freund <andres@anarazel.de> writes:

On 2015-12-12 17:31:49 -0500, Tom Lane wrote:

Does anyone know of people using standalone mode other than
for initdb?

Unfortunately yes. There are docker instances around that configure users
and everything using it.

Hm, that means that we *do* have to worry about backwards compatibility.
We might be able to get away with changing the behavior of -j mode anyway,
though, since this proposal mostly only changes when execution happens
and not what is valid input for -j mode. (It would probably break some
apps if we took away the switch, since right now, you do not need a
semicolon to terminate commands in the regular standalone mode.)
Failing that, we could define a new switch, I guess.

That's cool too. Besides processing the .bki files (and there largely
reg*_in), the many restarts are the most expensive part of initdb.

Right. The proposal we were discussing upthread would move all the reg*
lookups into creation of the .bki file, basically, which would improve
that part of things quite a bit. (BTW, if we are concerned about initdb
speed, that might be a reason not to be too eager to shift processing
from bootstrap to non-bootstrap mode. Other than the reg* issue,
bootstrap is certainly a far faster way to put rows into the catalogs
than individual SQL commands could be.)

regards, tom lane


#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joe Conway (#24)
Re: Using a single standalone-backend run in initdb (was Re: Bootstrap DATA is a pita)

Joe Conway <mail@joeconway.com> writes:

On 12/12/2015 02:31 PM, Tom Lane wrote:

I'm not particularly wedded to this rule. In principle we could go so
far as to import psql's code that parses commands and figures out which
semicolons are command terminators --- but that is a pretty large chunk
of code, and I think it'd really be overkill considering that initdb
deals only with fixed input scripts. But if anyone has another simple
rule for breaking SQL into commands, we can certainly discuss
alternatives.

Possibly inadequate, but I wrote a get_one_query() function to grab one
statement at a time from a possibly multi-statement string and it isn't
all that many lines of code:
https://github.com/jconway/pgsynck/blob/master/pgsynck.c

Hmm. Doesn't look like that handles semicolons embedded in CREATE RULE;
for that you'd have to track parenthesis nesting as well. (It's arguable
that we won't ever need that case during initdb, but I'd just as soon not
wire in such an assumption.) In general, though, I'd rather not try to
teach InteractiveBackend() such a large amount about SQL syntax.

With a rule like "break at ;\n\n" it's possible to ensure that command
breaks occur only where wanted, though in corner cases you might have to
format your input oddly. (For instance, if you needed that in a SQL
literal, you might resort to E';\n\n' or use the standard's rules about
concatenated string literals.) If you get it wrong the consequences
aren't too disastrous: you'll get an unterminated-input syntax error,
or in the other direction multiple commands will get run together for
execution, which most of the time isn't a big issue.
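[Editorial illustration: a minimal sketch of the proposed command-break rule, using a hypothetical helper (count_commands is not part of the actual patch, which does this scanning inside postgres.c's InteractiveBackend):]

```c
#include <stddef.h>
#include <string.h>

/*
 * Count complete commands under the proposed rule: a command ends at a
 * semicolon immediately followed by two newlines.  Toy helper for
 * illustration only.
 */
static int
count_commands(const char *buf)
{
	int			n = 0;
	const char *p = buf;

	while ((p = strstr(p, ";\n\n")) != NULL)
	{
		n++;
		p += 3;				/* resume scanning past the terminator */
	}
	return n;
}
```

Note how a script that separates statements with single newlines (the current convention) would be slurped as one command under this rule, which is why the initdb patch doubles the newlines in its string arrays.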

Does anyone know of people using standalone mode other than
for initdb?

sepgsql uses it for installation, but it does not appear to use -j.
I'm not sure why it is required, but at some point I'd like to dig into that.

It might be easier than starting a full postmaster and having to figure
out a secure place for the socket etc. I'm prepared to back off the
proposal about changing the default behavior of standalone mode; that
leaves us with a choice between changing -j's behavior and inventing
a new switch.

regards, tom lane


#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#23)
Re: Using a single standalone-backend run in initdb (was Re: Bootstrap DATA is a pita)

Andres Freund <andres@anarazel.de> writes:

That's cool too. Besides processing the .bki files (and there largely
reg*_in), the many restarts are the most expensive part of initdb.

BTW, in case anyone is doubting it, I did a little bit of "perf" tracing
and confirmed Andres' comment here: more than 50% of the runtime of the
bootstrap phase is eaten by the pg_proc seqscans performed by regprocin.
There's nothing else amounting to more than a few percent, so basically
nothing else in bootstrap is worth optimizing before we fix that.

regards, tom lane


#28Mark Dilger
hornschnorter@gmail.com
In reply to: Tom Lane (#26)
Re: Using a single standalone-backend run in initdb (was Re: Bootstrap DATA is a pita)

On Dec 12, 2015, at 3:42 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Joe Conway <mail@joeconway.com> writes:

On 12/12/2015 02:31 PM, Tom Lane wrote:

I'm not particularly wedded to this rule. In principle we could go so
far as to import psql's code that parses commands and figures out which
semicolons are command terminators --- but that is a pretty large chunk
of code, and I think it'd really be overkill considering that initdb
deals only with fixed input scripts. But if anyone has another simple
rule for breaking SQL into commands, we can certainly discuss
alternatives.

Possibly inadequate, but I wrote a get_one_query() function to grab one
statement at a time from a possibly multi-statement string and it isn't
all that many lines of code:
https://github.com/jconway/pgsynck/blob/master/pgsynck.c

Hmm. Doesn't look like that handles semicolons embedded in CREATE RULE;
for that you'd have to track parenthesis nesting as well. (It's arguable
that we won't ever need that case during initdb, but I'd just as soon not
wire in such an assumption.) In general, though, I'd rather not try to
teach InteractiveBackend() such a large amount about SQL syntax.

I use CREATE RULE within startup files in the fork that I maintain. I have
lots of them, totaling perhaps 50k lines of rule code. I don't think any of that
code would have a problem with the double-newline separation you propose,
which seems a more elegant solution to me. Admittedly, the double-newline
separation would need to be documented at the top of each sql file, otherwise
it would be quite surprising to those unfamiliar with it.

You mentioned upthread that introducing a syntax error in one of these files
results in a not-so-helpful error message that dumps the contents of the
entire file. I can confirm that happens, and it is hardly useful.

mark


#29Tom Lane
tgl@sss.pgh.pa.us
In reply to: Mark Dilger (#28)
Re: Using a single standalone-backend run in initdb (was Re: Bootstrap DATA is a pita)

Mark Dilger <hornschnorter@gmail.com> writes:

On Dec 12, 2015, at 3:42 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
... In general, though, I'd rather not try to
teach InteractiveBackend() such a large amount about SQL syntax.

I use CREATE RULE within startup files in the fork that I maintain. I have
lots of them, totaling perhaps 50k lines of rule code. I don't think any of that
code would have a problem with the double-newline separation you propose,
which seems a more elegant solution to me.

Yeah? Just for proof-of-concept, could you run your startup files with
the postgres.c patch as proposed, and see whether you get any failures?

Admittedly, the double-newline
separation would need to be documented at the top of each sql file, otherwise
it would be quite surprising to those unfamiliar with it.

Agreed, that wouldn't be a bad thing.

I thought of a positive argument not to do the "fully right" thing by
means of implementing the exactly-right command boundary rules. Suppose
that you mess up in information_schema.sql or another large input file
by introducing an extra left parenthesis in some query. What would happen
if InteractiveBackend() were cognizant of the paren-matching rule is that
it would slurp everything till the end-of-file and then produce a syntax
error message quoting all that text; not much better than what happens
today. With a command break rule like semi-newline-newline, there'll be
a limited horizon as to how much text gets swallowed before you get the
error message.

Note that this is different from the situation with a fully interactive
input processor like psql: if you're typing the same thing in psql,
you'll realize as soon as it doesn't execute the command when-expected
that something is wrong. You won't type another thousand lines of input
before looking closely at what you typed already.

I'm still not quite sold on semi-newline-newline as being the best
possible command boundary rule here; but I do think that "fully correct"
boundary rules are less attractive than they might sound.

regards, tom lane


#30Craig Ringer
craig@2ndquadrant.com
In reply to: Tom Lane (#21)
Re: Using a single standalone-backend run in initdb (was Re: Bootstrap DATA is a pita)

On 13 December 2015 at 06:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm not particularly wedded to this rule. In principle we could go so
far as to import psql's code that parses commands and figures out which
semicolons are command terminators --- but that is a pretty large chunk
of code, and I think it'd really be overkill considering that initdb
deals only with fixed input scripts.

Shouldn't that be a bison/flex job anyway, rather than hand-coded? Or a
simple(ish) state machine?

Dollar-quoted strings are probably the only quite ugly bit due to their
arbitrary delimiters. So I thought I'd sketch out how it'd look as a state
machine. At which point I remembered that we allow $ in identifiers too. So
the state machine would have to bother with unquoted identifiers. Of course
$ marks parameters, so it has to keep track of if it's reading a parameter.
At which point you have half an SQL parser.

This strikes me as a really good reason for making it re-usable, because
it's horrid to write code that handles statement splitting in the
PostgreSQL dialect.

Optional handling of psql \commands would be required, but that'd make it
easier for PgAdmin to support psql backslash commands, so there'd be a win
there too.

I figured I'd sketch it out for kicks. Comment: yuck.

States would be at least:

SQL_TEXT
SQL_UNQUOTED_IDENTIFIER_OR_KEYWORD
NUMBER
QUOTED_IDENTIFIER
QUOTED_IDENTIFIER_QUOTE
SQL_TEXT_DOLLAR
DOLLAR_STRING_START_DELIM
DOLLAR_STRING
DOLLAR_STRING_DOLLAR
DOLLAR_STRING_END_DELIM
STANDARD_STRING
STANDARD_STRING_QUOTE
SQL_TEXT_E
ESCAPE_STRING
ESCAPE_STRING_ESCAPE

Transitions

SQL_TEXT => { SQL_TEXT, SQL_UNQUOTED_IDENTIFIER_OR_KEYWORD, NUMBER,
QUOTED_IDENTIFIER, SQL_TEXT_DOLLAR, STANDARD_STRING, SQL_TEXT_E,
ESCAPE_STRING }

SQL_TEXT_E => { SQL_TEXT, ESCAPE_STRING, SQL_UNQUOTED_IDENTIFIER_OR_KEYWORD }

SQL_TEXT_DOLLAR => { SQL_TEXT, NUMBER, DOLLAR_STRING_START_DELIM }

QUOTED_IDENTIFIER => { QUOTED_IDENTIFIER, QUOTED_IDENTIFIER_QUOTE }

QUOTED_IDENTIFIER_QUOTE => { SQL_TEXT, QUOTED_IDENTIFIER }

DOLLAR_STRING_START_DELIM => { DOLLAR_STRING_START_DELIM, DOLLAR_STRING }

DOLLAR_STRING => { DOLLAR_STRING, DOLLAR_STRING_DOLLAR }

DOLLAR_STRING_END_DELIM => { DOLLAR_STRING_END_DELIM, SQL_TEXT,
DOLLAR_STRING }

STANDARD_STRING => { STANDARD_STRING, STANDARD_STRING_QUOTE }

STANDARD_STRING_QUOTE => { SQL_TEXT, STANDARD_STRING }

ESCAPE_STRING => { ESCAPE_STRING, ESCAPE_STRING_ESCAPE }

ESCAPE_STRING_ESCAPE => { SQL_TEXT, ESCAPE_STRING }

NUMBER consumes sequential digits and period chars and returns to SQL_TEXT
at any non-digit. (That way it can handle Pg's lazy parser quirks like
SELECT 123"f" being legal, and equivalent to SELECT 123 AS "f").

SQL_UNQUOTED_IDENTIFIER_OR_KEYWORD is needed because a $ within an
identifier is part of the identifier so it can't just be consumed as
SQL_TEXT.

For dollar strings, when a $ is found while reading SQL text (not an
identifier/keyword), enter SQL_TEXT_DOLLAR. What comes next must be a
parameter or the start of a dollar string. If the next char is a digit then
it's a parameter, so switch to NUMBER, since dollar-quoted string delims
follow identifier rules and unquoted identifiers can't start with a number.
Otherwise switch to DOLLAR_STRING_START_DELIM and consume until a $ is
found, or until something illegal in an identifier is found. Of course it
could also be a lone $, which is bogus syntax but as far as we're concerned
still just SQL_TEXT. Really, this is just looking for a dollar-quote start
and doesn't care what it finds if it isn't a valid dollar-quote start.

If a valid dollar-quote delim is found switch to DOLLAR_STRING and read
until we find the matching delim using a similar process, entering
DOLLAR_STRING_DOLLAR, looking for param vs end delim, etc. When a full
delimiter is read compare to the start delimiter and switch back to
SQL_TEXT mode if it matches, otherwise remain in DOLLAR_STRING.

If an invalid dollar string delim was found switch back to SQL_TEXT (since
it wasn't a valid beginning of a dollar string) and continue.

For QUOTED_IDENTIFIER_QUOTE and STANDARD_STRING_QUOTE, we found a " or '
while reading a quoted identifier or standard string and transitioned into
that state. If the next char doubles the quote we return to reading the
string; otherwise we return to the SQL_TEXT state, since the identifier
or literal has ended.

Similarly with ESCAPE_STRING_ESCAPE: having found an escape, consume the
next char even if it's a quote, then return to parsing the string.

All this ignores psql backslash commands.

Have I missed anything really obvious? Does it seem useful to have more
re-usable statement splitting code? Is there any sane justification for
doing anything but extracting what psql does verbatim while carefully
changing nothing?

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#31Mark Dilger
hornschnorter@gmail.com
In reply to: Tom Lane (#29)
Re: Using a single standalone-backend run in initdb (was Re: Bootstrap DATA is a pita)

On Dec 12, 2015, at 9:40 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Mark Dilger <hornschnorter@gmail.com> writes:

On Dec 12, 2015, at 3:42 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
... In general, though, I'd rather not try to
teach InteractiveBackend() such a large amount about SQL syntax.

I use CREATE RULE within startup files in the fork that I maintain. I have
lots of them, totaling perhaps 50k lines of rule code. I don't think any of that
code would have a problem with the double-newline separation you propose,
which seems a more elegant solution to me.

Yeah? Just for proof-of-concept, could you run your startup files with
the postgres.c patch as proposed, and see whether you get any failures?

Given all the changes I've made to initdb.c in my fork, that patch
of yours doesn't apply.

mark

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#32Tom Lane
tgl@sss.pgh.pa.us
In reply to: Craig Ringer (#30)
Re: Using a single standalone-backend run in initdb (was Re: Bootstrap DATA is a pita)

Craig Ringer <craig@2ndquadrant.com> writes:

On 13 December 2015 at 06:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm not particularly wedded to this rule. In principle we could go so
far as to import psql's code that parses commands and figures out which
semicolons are command terminators --- but that is a pretty large chunk
of code, and I think it'd really be overkill considering that initdb
deals only with fixed input scripts.

Shouldn't that be a bison/flex job anyway, rather than hand-coded?

It is, if you're speaking of how psql does it.

I thought about trying to get the backend's existing lexer to do it,
but that code will want to throw an error if it sees unterminated
input (such as an incomplete slash-star comment). I'm not sure that
it'd be a good thing to try to make that lexer serve two masters.

I'm also getting less and less enthused about trying to share code with
psql. In the first place, the backend has no interest in recognizing
psql backslash-commands, nor does it need to deal with some of the weird
constraints psql has like having to handle non-backend-safe encodings.
In the second, while it's reasonable for psql to deal with CREATE RULE
syntax by counting parentheses, there's a good argument that that is
not the behavior we want for noninteractive situations such as reading
information_schema.sql. We won't, for example, have anything
corresponding to psql's changing input prompt to help debug problems.
In the third place, it's just difficult and ugly to write code that
will work in both backend and frontend contexts. We've done it,
certainly, but not for any problem as involved as this would be.

regards, tom lane

#33Tom Lane
tgl@sss.pgh.pa.us
In reply to: Mark Dilger (#28)
2 attachment(s)
Re: Using a single standalone-backend run in initdb (was Re: Bootstrap DATA is a pita)

Mark Dilger <hornschnorter@gmail.com> writes:

I use CREATE RULE within startup files in the fork that I maintain. I have
lots of them, totaling perhaps 50k lines of rule code. I don't think any of that
code would have a problem with the double-newline separation you propose,
which seems a more elegant solution to me. Admittedly, the double-newline
separation would need to be documented at the top of each sql file, otherwise
it would be quite surprising to those unfamiliar with it.

You mentioned upthread that introducing a syntax error in one of these files
results in a not-so-helpful error message that dumps the contents of the
entire file. I can confirm that happens, and is hardly useful.

Not having heard any ideas that sounded better than semi-newline-newline,
I went ahead and finished up this patch on that basis. Attached are two
patch files; the first one redefines the behavior of -j, and the second
one modifies initdb to use only one standalone-backend run. I present
them this way to emphasize that the -j change doesn't break much of
anything: initdb still works if you apply only the first patch. And
I didn't change anything in the initdb input files, except for adding
the comment documentation Mark suggests above.

In passing in the first patch, I got rid of the TCOP_DONTUSENEWLINE
#define, which could not have been used by anyone in a very long time
because it would break initdb. I then realized that
src/include/tcop/tcopdebug.h is completely dead code, because the
other debugging symbol it claims to specify got ripped out long ago.
And to add insult to injury, that file is included noplace. I imagine
Bruce's pgrminclude tool got rid of the inclusion that must once have
existed in postgres.c, after observing that postgres.c still compiled
without it :-(. (That tool really requires more adult supervision
than it has gotten.) So anyway, this patch removes tcopdebug.h entirely.

The second patch consists of removing extra backend starts/stops
and converting all of initdb's code to run in -j mode, rather than the
mishmash of -j and not-j behavior that was there before. I changed
all the semicolon-newlines in initdb's command strings to
semicolon-newline-newlines. As mentioned before, only a small number of
those changes *had* to be made to get it to work, namely the ones around
VACUUM and CREATE DATABASE statements, but I felt that for consistency
and error localization all of them should be changed. I also failed to
resist the temptation to const-ify some of the arrays more thoroughly.

Barring objections I'll push this into HEAD soon.

regards, tom lane

Attachments:

change-j-switch-behavior.patch (text/x-diff; charset=us-ascii)
diff --git a/doc/src/sgml/ref/postgres-ref.sgml b/doc/src/sgml/ref/postgres-ref.sgml
index e2e9909..d60d4ff 100644
*** a/doc/src/sgml/ref/postgres-ref.sgml
--- b/doc/src/sgml/ref/postgres-ref.sgml
*************** PostgreSQL documentation
*** 529,535 ****
      </indexterm>
  
      <para>
!      The following options only apply to the single-user mode.
      </para>
  
      <variablelist>
--- 529,537 ----
      </indexterm>
  
      <para>
!      The following options only apply to the single-user mode
!      (see <xref linkend="app-postgres-single-user"
!      endterm="app-postgres-single-user-title">).
      </para>
  
      <variablelist>
*************** PostgreSQL documentation
*** 558,564 ****
        <term><option>-E</option></term>
        <listitem>
         <para>
!         Echo all commands.
         </para>
        </listitem>
       </varlistentry>
--- 560,566 ----
        <term><option>-E</option></term>
        <listitem>
         <para>
!         Echo all commands to standard output before executing them.
         </para>
        </listitem>
       </varlistentry>
*************** PostgreSQL documentation
*** 567,573 ****
        <term><option>-j</option></term>
        <listitem>
         <para>
!         Disables use of newline as a statement delimiter.
         </para>
        </listitem>
       </varlistentry>
--- 569,576 ----
        <term><option>-j</option></term>
        <listitem>
         <para>
!         Use semicolon followed by two newlines, rather than just newline,
!         as the command entry terminator.
         </para>
        </listitem>
       </varlistentry>
*************** PostgreSQL documentation
*** 760,767 ****
    </para>
   </refsect1>
  
!  <refsect1>
!   <title>Usage</title>
  
     <para>
      To start a single-user mode server, use a command like
--- 763,770 ----
    </para>
   </refsect1>
  
!  <refsect1 id="app-postgres-single-user">
!   <title id="app-postgres-single-user-title">Single-User Mode</title>
  
     <para>
      To start a single-user mode server, use a command like
*************** PostgreSQL documentation
*** 778,807 ****
      entry terminator; there is no intelligence about semicolons,
      as there is in <application>psql</>.  To continue a command
      across multiple lines, you must type backslash just before each
!     newline except the last one.
     </para>
  
     <para>
!     But if you use the <option>-j</> command line switch, then newline does
!     not terminate command entry.  In this case, the server will read the standard input
!     until the end-of-file (<acronym>EOF</>) marker, then
!     process the input as a single command string.  Backslash-newline is not
!     treated specially in this case.
     </para>
  
     <para>
      To quit the session, type <acronym>EOF</acronym>
      (<keycombo action="simul"><keycap>Control</><keycap>D</></>, usually).
!     If you've
!     used <option>-j</>, two consecutive <acronym>EOF</>s are needed to exit.
     </para>
  
     <para>
      Note that the single-user mode server does not provide sophisticated
      line-editing features (no command history, for example).
!     Single-User mode also does not do any background processing, like
!     automatic checkpoints.
! 
     </para>
   </refsect1>
  
--- 781,820 ----
      entry terminator; there is no intelligence about semicolons,
      as there is in <application>psql</>.  To continue a command
      across multiple lines, you must type backslash just before each
!     newline except the last one.  The backslash and adjacent newline are
!     both dropped from the input command.  Note that this will happen even
!     when within a string literal or comment.
     </para>
  
     <para>
!     But if you use the <option>-j</> command line switch, a single newline
!     does not terminate command entry; instead, the sequence
!     semicolon-newline-newline does.  That is, type a semicolon immediately
!     followed by a completely empty line.  Backslash-newline is not
!     treated specially in this mode.  Again, there is no intelligence about
!     such a sequence appearing within a string literal or comment.
!    </para>
! 
!    <para>
!     In either input mode, if you type a semicolon that is not just before or
!     part of a command entry terminator, it is considered a command separator.
!     When you do type a command entry terminator, the multiple statements
!     you've entered will be executed as a single transaction.
     </para>
  
     <para>
      To quit the session, type <acronym>EOF</acronym>
      (<keycombo action="simul"><keycap>Control</><keycap>D</></>, usually).
!     If you've entered any text since the last command entry terminator,
!     then <acronym>EOF</acronym> will be taken as a command entry terminator,
!     and another <acronym>EOF</> will be needed to exit.
     </para>
  
     <para>
      Note that the single-user mode server does not provide sophisticated
      line-editing features (no command history, for example).
!     Single-user mode also does not do any background processing, such as
!     automatic checkpoints or replication.
     </para>
   </refsect1>
  
diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql
index 6e1b241..2239859 100644
*** a/src/backend/catalog/information_schema.sql
--- b/src/backend/catalog/information_schema.sql
***************
*** 5,10 ****
--- 5,18 ----
   * Copyright (c) 2003-2015, PostgreSQL Global Development Group
   *
   * src/backend/catalog/information_schema.sql
+  *
+  * Note: this file is read in single-user -j mode, which means that the
+  * command terminator is semicolon-newline-newline; whenever the backend
+  * sees that, it stops and executes what it's got.  If you write a lot of
+  * statements without empty lines between, they'll all get quoted to you
+  * in any error message about one of them, so don't do that.  Also, you
+  * cannot write a semicolon immediately followed by an empty line in a
+  * string literal (including a function body!) or a multiline comment.
   */
  
  /*
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index ccc030f..536c805 100644
*** a/src/backend/catalog/system_views.sql
--- b/src/backend/catalog/system_views.sql
***************
*** 4,9 ****
--- 4,17 ----
   * Copyright (c) 1996-2015, PostgreSQL Global Development Group
   *
   * src/backend/catalog/system_views.sql
+  *
+  * Note: this file is read in single-user -j mode, which means that the
+  * command terminator is semicolon-newline-newline; whenever the backend
+  * sees that, it stops and executes what it's got.  If you write a lot of
+  * statements without empty lines between, they'll all get quoted to you
+  * in any error message about one of them, so don't do that.  Also, you
+  * cannot write a semicolon immediately followed by an empty line in a
+  * string literal (including a function body!) or a multiline comment.
   */
  
  CREATE VIEW pg_roles AS
diff --git a/src/backend/snowball/snowball.sql.in b/src/backend/snowball/snowball.sql.in
index 2f68393..68363f1 100644
*** a/src/backend/snowball/snowball.sql.in
--- b/src/backend/snowball/snowball.sql.in
***************
*** 1,6 ****
! -- src/backend/snowball/snowball.sql.in$
  
- -- text search configuration for _LANGNAME_ language
  CREATE TEXT SEARCH DICTIONARY _DICTNAME_
  	(TEMPLATE = snowball, Language = _LANGNAME_ _STOPWORDS_);
  
--- 1,22 ----
! /*
!  * text search configuration for _LANGNAME_ language
!  *
!  * Copyright (c) 2007-2015, PostgreSQL Global Development Group
!  *
!  * src/backend/snowball/snowball.sql.in
!  *
!  * _LANGNAME_ and certain other macros are replaced for each language;
!  * see the Makefile for details.
!  *
!  * Note: this file is read in single-user -j mode, which means that the
!  * command terminator is semicolon-newline-newline; whenever the backend
!  * sees that, it stops and executes what it's got.  If you write a lot of
!  * statements without empty lines between, they'll all get quoted to you
!  * in any error message about one of them, so don't do that.  Also, you
!  * cannot write a semicolon immediately followed by an empty line in a
!  * string literal (including a function body!) or a multiline comment.
!  */
  
  CREATE TEXT SEARCH DICTIONARY _DICTNAME_
  	(TEMPLATE = snowball, Language = _LANGNAME_ _STOPWORDS_);
  
diff --git a/src/backend/snowball/snowball_func.sql.in b/src/backend/snowball/snowball_func.sql.in
index e7d4510..debb0e0 100644
*** a/src/backend/snowball/snowball_func.sql.in
--- b/src/backend/snowball/snowball_func.sql.in
***************
*** 1,4 ****
! -- src/backend/snowball/snowball_func.sql.in$
  
  SET search_path = pg_catalog;
  
--- 1,21 ----
! /*
!  * Create underlying C functions for Snowball stemmers
!  *
!  * Copyright (c) 2007-2015, PostgreSQL Global Development Group
!  *
!  * src/backend/snowball/snowball_func.sql.in
!  *
!  * This file is combined with multiple instances of snowball.sql.in to
!  * build snowball_create.sql, which is executed during initdb.
!  *
!  * Note: this file is read in single-user -j mode, which means that the
!  * command terminator is semicolon-newline-newline; whenever the backend
!  * sees that, it stops and executes what it's got.  If you write a lot of
!  * statements without empty lines between, they'll all get quoted to you
!  * in any error message about one of them, so don't do that.  Also, you
!  * cannot write a semicolon immediately followed by an empty line in a
!  * string literal (including a function body!) or a multiline comment.
!  */
  
  SET search_path = pg_catalog;
  
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 1dc2eb0..4ae2548 100644
*** a/src/backend/tcop/postgres.c
--- b/src/backend/tcop/postgres.c
*************** static CachedPlanSource *unnamed_stmt_ps
*** 157,174 ****
  
  /* assorted command-line switches */
  static const char *userDoption = NULL;	/* -D switch */
- 
  static bool EchoQuery = false;	/* -E switch */
! 
! /*
!  * people who want to use EOF should #define DONTUSENEWLINE in
!  * tcop/tcopdebug.h
!  */
! #ifndef TCOP_DONTUSENEWLINE
! static int	UseNewLine = 1;		/* Use newlines query delimiters (the default) */
! #else
! static int	UseNewLine = 0;		/* Use EOF as query delimiters */
! #endif   /* TCOP_DONTUSENEWLINE */
  
  /* whether or not, and why, we were canceled by conflict with recovery */
  static bool RecoveryConflictPending = false;
--- 157,164 ----
  
  /* assorted command-line switches */
  static const char *userDoption = NULL;	/* -D switch */
  static bool EchoQuery = false;	/* -E switch */
! static bool UseSemiNewlineNewline = false;		/* -j switch */
  
  /* whether or not, and why, we were canceled by conflict with recovery */
  static bool RecoveryConflictPending = false;
*************** static int
*** 219,226 ****
  InteractiveBackend(StringInfo inBuf)
  {
  	int			c;				/* character read from getc() */
- 	bool		end = false;	/* end-of-input flag */
- 	bool		backslashSeen = false;	/* have we seen a \ ? */
  
  	/*
  	 * display a prompt and obtain input from the user
--- 209,214 ----
*************** InteractiveBackend(StringInfo inBuf)
*** 230,284 ****
  
  	resetStringInfo(inBuf);
  
! 	if (UseNewLine)
  	{
! 		/*
! 		 * if we are using \n as a delimiter, then read characters until the
! 		 * \n.
! 		 */
! 		while ((c = interactive_getc()) != EOF)
  		{
! 			if (c == '\n')
  			{
! 				if (backslashSeen)
  				{
  					/* discard backslash from inBuf */
  					inBuf->data[--inBuf->len] = '\0';
! 					backslashSeen = false;
  					continue;
  				}
  				else
  				{
! 					/* keep the newline character */
  					appendStringInfoChar(inBuf, '\n');
  					break;
  				}
  			}
- 			else if (c == '\\')
- 				backslashSeen = true;
- 			else
- 				backslashSeen = false;
- 
- 			appendStringInfoChar(inBuf, (char) c);
  		}
  
! 		if (c == EOF)
! 			end = true;
! 	}
! 	else
! 	{
! 		/*
! 		 * otherwise read characters until EOF.
! 		 */
! 		while ((c = interactive_getc()) != EOF)
! 			appendStringInfoChar(inBuf, (char) c);
! 
! 		/* No input before EOF signal means time to quit. */
! 		if (inBuf->len == 0)
! 			end = true;
  	}
  
! 	if (end)
  		return EOF;
  
  	/*
--- 218,273 ----
  
  	resetStringInfo(inBuf);
  
! 	/*
! 	 * Read characters until EOF or the appropriate delimiter is seen.
! 	 */
! 	while ((c = interactive_getc()) != EOF)
  	{
! 		if (c == '\n')
  		{
! 			if (UseSemiNewlineNewline)
  			{
! 				/*
! 				 * In -j mode, semicolon followed by two newlines ends the
! 				 * command; otherwise treat newline as regular character.
! 				 */
! 				if (inBuf->len > 1 &&
! 					inBuf->data[inBuf->len - 1] == '\n' &&
! 					inBuf->data[inBuf->len - 2] == ';')
! 				{
! 					/* might as well drop the second newline */
! 					break;
! 				}
! 			}
! 			else
! 			{
! 				/*
! 				 * In plain mode, newline ends the command unless preceded by
! 				 * backslash.
! 				 */
! 				if (inBuf->len > 0 &&
! 					inBuf->data[inBuf->len - 1] == '\\')
  				{
  					/* discard backslash from inBuf */
  					inBuf->data[--inBuf->len] = '\0';
! 					/* discard newline too */
  					continue;
  				}
  				else
  				{
! 					/* keep the newline character, but end the command */
  					appendStringInfoChar(inBuf, '\n');
  					break;
  				}
  			}
  		}
  
! 		/* Not newline, or newline treated as regular character */
! 		appendStringInfoChar(inBuf, (char) c);
  	}
  
! 	/* No input before EOF signal means time to quit. */
! 	if (c == EOF && inBuf->len == 0)
  		return EOF;
  
  	/*
*************** process_postgres_switches(int argc, char
*** 3391,3397 ****
  
  			case 'j':
  				if (secure)
! 					UseNewLine = 0;
  				break;
  
  			case 'k':
--- 3380,3386 ----
  
  			case 'j':
  				if (secure)
! 					UseSemiNewlineNewline = true;
  				break;
  
  			case 'k':
diff --git a/src/include/tcop/tcopdebug.h b/src/include/tcop/tcopdebug.h
index d7145ce..e69de29 100644
*** a/src/include/tcop/tcopdebug.h
--- b/src/include/tcop/tcopdebug.h
***************
*** 1,44 ****
- /*-------------------------------------------------------------------------
-  *
-  * tcopdebug.h
-  *	  #defines governing debugging behaviour in the traffic cop
-  *
-  *
-  * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
-  * Portions Copyright (c) 1994, Regents of the University of California
-  *
-  * src/include/tcop/tcopdebug.h
-  *
-  *-------------------------------------------------------------------------
-  */
- #ifndef TCOPDEBUG_H
- #define TCOPDEBUG_H
- 
- /* ----------------------------------------------------------------
-  *		debugging defines.
-  *
-  *		If you want certain debugging behaviour, then #define
-  *		the variable to 1, else #undef it. -cim 10/26/89
-  * ----------------------------------------------------------------
-  */
- 
- /* ----------------
-  *		TCOP_SHOWSTATS controls whether or not buffer and
-  *		access method statistics are shown for each query.  -cim 2/9/89
-  * ----------------
-  */
- #undef TCOP_SHOWSTATS
- 
- /* ----------------
-  *		TCOP_DONTUSENEWLINE controls the default setting of
-  *		the UseNewLine variable in postgres.c
-  * ----------------
-  */
- #undef TCOP_DONTUSENEWLINE
- 
- /* ----------------------------------------------------------------
-  *		#defines controlled by above definitions
-  * ----------------------------------------------------------------
-  */
- 
- #endif   /* TCOPDEBUG_H */
--- 0 ----
initdb-with-fewer-backends.patch (text/x-diff; charset=us-ascii)
diff --git a/src/backend/utils/mb/conversion_procs/Makefile b/src/backend/utils/mb/conversion_procs/Makefile
index 8481721..cad5948 100644
*** a/src/backend/utils/mb/conversion_procs/Makefile
--- b/src/backend/utils/mb/conversion_procs/Makefile
*************** $(SQLSCRIPT): Makefile
*** 181,186 ****
--- 181,187 ----
  		echo "DROP CONVERSION pg_catalog.$$name;"; \
  		echo "CREATE DEFAULT CONVERSION pg_catalog.$$name FOR '$$se' TO '$$de' FROM $$func;"; \
  	        echo "COMMENT ON CONVERSION pg_catalog.$$name IS 'conversion for $$se to $$de';"; \
+ 		echo; \
  	done > $@
  
  $(REGRESSION_SCRIPT): Makefile
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index feeff9e..31b2588 100644
*** a/src/bin/initdb/initdb.c
--- b/src/bin/initdb/initdb.c
***************
*** 79,85 ****
  /* Ideally this would be in a .h file, but it hardly seems worth the trouble */
  extern const char *select_default_timezone(const char *share_path);
  
! static const char *auth_methods_host[] = {"trust", "reject", "md5", "password", "ident", "radius",
  #ifdef ENABLE_GSS
  	"gss",
  #endif
--- 79,86 ----
  /* Ideally this would be in a .h file, but it hardly seems worth the trouble */
  extern const char *select_default_timezone(const char *share_path);
  
! static const char *const auth_methods_host[] = {
! 	"trust", "reject", "md5", "password", "ident", "radius",
  #ifdef ENABLE_GSS
  	"gss",
  #endif
*************** static const char *auth_methods_host[] =
*** 95,109 ****
  #ifdef USE_SSL
  	"cert",
  #endif
! NULL};
! static const char *auth_methods_local[] = {"trust", "reject", "md5", "password", "peer", "radius",
  #ifdef USE_PAM
  	"pam", "pam ",
  #endif
  #ifdef USE_LDAP
  	"ldap",
  #endif
! NULL};
  
  /*
   * these values are passed in by makefile defines
--- 96,113 ----
  #ifdef USE_SSL
  	"cert",
  #endif
! 	NULL
! };
! static const char *const auth_methods_local[] = {
! 	"trust", "reject", "md5", "password", "peer", "radius",
  #ifdef USE_PAM
  	"pam", "pam ",
  #endif
  #ifdef USE_LDAP
  	"ldap",
  #endif
! 	NULL
! };
  
  /*
   * these values are passed in by makefile defines
*************** static char *authwarning = NULL;
*** 185,193 ****
   * (no quoting to worry about).
   */
  static const char *boot_options = "-F";
! static const char *backend_options = "--single -F -O -c search_path=pg_catalog -c exit_on_error=true";
  
! static const char *subdirs[] = {
  	"global",
  	"pg_xlog",
  	"pg_xlog/archive_status",
--- 189,197 ----
   * (no quoting to worry about).
   */
  static const char *boot_options = "-F";
! static const char *backend_options = "--single -F -O -j -c search_path=pg_catalog -c exit_on_error=true";
  
! static const char *const subdirs[] = {
  	"global",
  	"pg_xlog",
  	"pg_xlog/archive_status",
*************** static void set_null_conf(void);
*** 244,264 ****
  static void test_config_settings(void);
  static void setup_config(void);
  static void bootstrap_template1(void);
! static void setup_auth(void);
! static void get_set_pwd(void);
! static void setup_depend(void);
! static void setup_sysviews(void);
! static void setup_description(void);
! static void setup_collation(void);
! static void setup_conversion(void);
! static void setup_dictionary(void);
! static void setup_privileges(void);
  static void set_info_version(void);
! static void setup_schema(void);
! static void load_plpgsql(void);
! static void vacuum_db(void);
! static void make_template0(void);
! static void make_postgres(void);
  static void fsync_pgdata(void);
  static void trapsig(int signum);
  static void check_ok(void);
--- 248,268 ----
  static void test_config_settings(void);
  static void setup_config(void);
  static void bootstrap_template1(void);
! static void setup_auth(FILE *cmdfd);
! static void get_set_pwd(FILE *cmdfd);
! static void setup_depend(FILE *cmdfd);
! static void setup_sysviews(FILE *cmdfd);
! static void setup_description(FILE *cmdfd);
! static void setup_collation(FILE *cmdfd);
! static void setup_conversion(FILE *cmdfd);
! static void setup_dictionary(FILE *cmdfd);
! static void setup_privileges(FILE *cmdfd);
  static void set_info_version(void);
! static void setup_schema(FILE *cmdfd);
! static void load_plpgsql(FILE *cmdfd);
! static void vacuum_db(FILE *cmdfd);
! static void make_template0(FILE *cmdfd);
! static void make_postgres(FILE *cmdfd);
  static void fsync_pgdata(void);
  static void trapsig(int signum);
  static void check_ok(void);
*************** bootstrap_template1(void)
*** 1545,1589 ****
   * set up the shadow password table
   */
  static void
! setup_auth(void)
  {
! 	PG_CMD_DECL;
! 	const char **line;
! 	static const char *pg_authid_setup[] = {
  		/*
  		 * The authid table shouldn't be readable except through views, to
  		 * ensure passwords are not publicly visible.
  		 */
! 		"REVOKE ALL on pg_authid FROM public;\n",
  		NULL
  	};
  
- 	fputs(_("initializing pg_authid ... "), stdout);
- 	fflush(stdout);
- 
- 	snprintf(cmd, sizeof(cmd),
- 			 "\"%s\" %s template1 >%s",
- 			 backend_exec, backend_options,
- 			 DEVNULL);
- 
- 	PG_CMD_OPEN;
- 
  	for (line = pg_authid_setup; *line != NULL; line++)
  		PG_CMD_PUTS(*line);
- 
- 	PG_CMD_CLOSE;
- 
- 	check_ok();
  }
  
  /*
   * get the superuser password if required, and call postgres to set it
   */
  static void
! get_set_pwd(void)
  {
- 	PG_CMD_DECL;
- 
  	char	   *pwd1,
  			   *pwd2;
  
--- 1549,1576 ----
   * set up the shadow password table
   */
  static void
! setup_auth(FILE *cmdfd)
  {
! 	const char *const * line;
! 	static const char *const pg_authid_setup[] = {
  		/*
  		 * The authid table shouldn't be readable except through views, to
  		 * ensure passwords are not publicly visible.
  		 */
! 		"REVOKE ALL on pg_authid FROM public;\n\n",
  		NULL
  	};
  
  	for (line = pg_authid_setup; *line != NULL; line++)
  		PG_CMD_PUTS(*line);
  }
  
  /*
   * get the superuser password if required, and call postgres to set it
   */
  static void
! get_set_pwd(FILE *cmdfd)
  {
  	char	   *pwd1,
  			   *pwd2;
  
*************** get_set_pwd(void)
*** 1640,1675 ****
  		pwd1 = pg_strdup(pwdbuf);
  
  	}
- 	printf(_("setting password ... "));
- 	fflush(stdout);
  
! 	snprintf(cmd, sizeof(cmd),
! 			 "\"%s\" %s template1 >%s",
! 			 backend_exec, backend_options,
! 			 DEVNULL);
! 
! 	PG_CMD_OPEN;
! 
! 	PG_CMD_PRINTF2("ALTER USER \"%s\" WITH PASSWORD E'%s';\n",
  				   username, escape_quotes(pwd1));
  
- 	/* MM: pwd1 is no longer needed, freeing it */
  	free(pwd1);
- 
- 	PG_CMD_CLOSE;
- 
- 	check_ok();
  }
  
  /*
   * set up pg_depend
   */
  static void
! setup_depend(void)
  {
! 	PG_CMD_DECL;
! 	const char **line;
! 	static const char *pg_depend_setup[] = {
  		/*
  		 * Make PIN entries in pg_depend for all objects made so far in the
  		 * tables that the dependency code handles.  This is overkill (the
--- 1627,1647 ----
  		pwd1 = pg_strdup(pwdbuf);
  
  	}
  
! 	PG_CMD_PRINTF2("ALTER USER \"%s\" WITH PASSWORD E'%s';\n\n",
  				   username, escape_quotes(pwd1));
  
  	free(pwd1);
  }
  
  /*
   * set up pg_depend
   */
  static void
! setup_depend(FILE *cmdfd)
  {
! 	const char *const * line;
! 	static const char *const pg_depend_setup[] = {
  		/*
  		 * Make PIN entries in pg_depend for all objects made so far in the
  		 * tables that the dependency code handles.  This is overkill (the
*************** setup_depend(void)
*** 1684,1845 ****
  		 * First delete any already-made entries; PINs override all else, and
  		 * must be the only entries for their objects.
  		 */
! 		"DELETE FROM pg_depend;\n",
! 		"VACUUM pg_depend;\n",
! 		"DELETE FROM pg_shdepend;\n",
! 		"VACUUM pg_shdepend;\n",
  
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_class;\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_proc;\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_type;\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_cast;\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_constraint;\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_attrdef;\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_language;\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_operator;\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_opclass;\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_opfamily;\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_amop;\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_amproc;\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_rewrite;\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_trigger;\n",
  
  		/*
  		 * restriction here to avoid pinning the public namespace
  		 */
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
  		" FROM pg_namespace "
! 		"    WHERE nspname LIKE 'pg%';\n",
  
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_ts_parser;\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_ts_dict;\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_ts_template;\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_ts_config;\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_collation;\n",
  		"INSERT INTO pg_shdepend SELECT 0,0,0,0, tableoid,oid, 'p' "
! 		" FROM pg_authid;\n",
  		NULL
  	};
  
- 	fputs(_("initializing dependencies ... "), stdout);
- 	fflush(stdout);
- 
- 	snprintf(cmd, sizeof(cmd),
- 			 "\"%s\" %s template1 >%s",
- 			 backend_exec, backend_options,
- 			 DEVNULL);
- 
- 	PG_CMD_OPEN;
- 
  	for (line = pg_depend_setup; *line != NULL; line++)
  		PG_CMD_PUTS(*line);
- 
- 	PG_CMD_CLOSE;
- 
- 	check_ok();
  }
  
  /*
   * set up system views
   */
  static void
! setup_sysviews(void)
  {
- 	PG_CMD_DECL;
  	char	  **line;
  	char	  **sysviews_setup;
  
- 	fputs(_("creating system views ... "), stdout);
- 	fflush(stdout);
- 
  	sysviews_setup = readfile(system_views_file);
  
- 	/*
- 	 * We use -j here to avoid backslashing stuff in system_views.sql
- 	 */
- 	snprintf(cmd, sizeof(cmd),
- 			 "\"%s\" %s -j template1 >%s",
- 			 backend_exec, backend_options,
- 			 DEVNULL);
- 
- 	PG_CMD_OPEN;
- 
  	for (line = sysviews_setup; *line != NULL; line++)
  	{
  		PG_CMD_PUTS(*line);
  		free(*line);
  	}
  
- 	PG_CMD_CLOSE;
- 
  	free(sysviews_setup);
- 
- 	check_ok();
  }
  
  /*
   * load description data
   */
  static void
! setup_description(void)
  {
- 	PG_CMD_DECL;
- 
- 	fputs(_("loading system objects' descriptions ... "), stdout);
- 	fflush(stdout);
- 
- 	snprintf(cmd, sizeof(cmd),
- 			 "\"%s\" %s template1 >%s",
- 			 backend_exec, backend_options,
- 			 DEVNULL);
- 
- 	PG_CMD_OPEN;
- 
  	PG_CMD_PUTS("CREATE TEMP TABLE tmp_pg_description ( "
  				"	objoid oid, "
  				"	classname name, "
  				"	objsubid int4, "
! 				"	description text) WITHOUT OIDS;\n");
  
! 	PG_CMD_PRINTF1("COPY tmp_pg_description FROM E'%s';\n",
  				   escape_quotes(desc_file));
  
  	PG_CMD_PUTS("INSERT INTO pg_description "
  				" SELECT t.objoid, c.oid, t.objsubid, t.description "
  				"  FROM tmp_pg_description t, pg_class c "
! 				"    WHERE c.relname = t.classname;\n");
  
  	PG_CMD_PUTS("CREATE TEMP TABLE tmp_pg_shdescription ( "
  				" objoid oid, "
  				" classname name, "
! 				" description text) WITHOUT OIDS;\n");
  
! 	PG_CMD_PRINTF1("COPY tmp_pg_shdescription FROM E'%s';\n",
  				   escape_quotes(shdesc_file));
  
  	PG_CMD_PUTS("INSERT INTO pg_shdescription "
  				" SELECT t.objoid, c.oid, t.description "
  				"  FROM tmp_pg_shdescription t, pg_class c "
! 				"   WHERE c.relname = t.classname;\n");
  
  	/* Create default descriptions for operator implementation functions */
  	PG_CMD_PUTS("WITH funcdescs AS ( "
--- 1656,1773 ----
  		 * First delete any already-made entries; PINs override all else, and
  		 * must be the only entries for their objects.
  		 */
! 		"DELETE FROM pg_depend;\n\n",
! 		"VACUUM pg_depend;\n\n",
! 		"DELETE FROM pg_shdepend;\n\n",
! 		"VACUUM pg_shdepend;\n\n",
  
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_class;\n\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_proc;\n\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_type;\n\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_cast;\n\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_constraint;\n\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_attrdef;\n\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_language;\n\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_operator;\n\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_opclass;\n\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_opfamily;\n\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_amop;\n\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_amproc;\n\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_rewrite;\n\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_trigger;\n\n",
  
  		/*
  		 * restriction here to avoid pinning the public namespace
  		 */
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
  		" FROM pg_namespace "
! 		"    WHERE nspname LIKE 'pg%';\n\n",
  
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_ts_parser;\n\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_ts_dict;\n\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_ts_template;\n\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_ts_config;\n\n",
  		"INSERT INTO pg_depend SELECT 0,0,0, tableoid,oid,0, 'p' "
! 		" FROM pg_collation;\n\n",
  		"INSERT INTO pg_shdepend SELECT 0,0,0,0, tableoid,oid, 'p' "
! 		" FROM pg_authid;\n\n",
  		NULL
  	};
  
  	for (line = pg_depend_setup; *line != NULL; line++)
  		PG_CMD_PUTS(*line);
  }
  
  /*
   * set up system views
   */
  static void
! setup_sysviews(FILE *cmdfd)
  {
  	char	  **line;
  	char	  **sysviews_setup;
  
  	sysviews_setup = readfile(system_views_file);
  
  	for (line = sysviews_setup; *line != NULL; line++)
  	{
  		PG_CMD_PUTS(*line);
  		free(*line);
  	}
  
  	free(sysviews_setup);
  }
  
  /*
   * load description data
   */
  static void
! setup_description(FILE *cmdfd)
  {
  	PG_CMD_PUTS("CREATE TEMP TABLE tmp_pg_description ( "
  				"	objoid oid, "
  				"	classname name, "
  				"	objsubid int4, "
! 				"	description text) WITHOUT OIDS;\n\n");
  
! 	PG_CMD_PRINTF1("COPY tmp_pg_description FROM E'%s';\n\n",
  				   escape_quotes(desc_file));
  
  	PG_CMD_PUTS("INSERT INTO pg_description "
  				" SELECT t.objoid, c.oid, t.objsubid, t.description "
  				"  FROM tmp_pg_description t, pg_class c "
! 				"    WHERE c.relname = t.classname;\n\n");
  
  	PG_CMD_PUTS("CREATE TEMP TABLE tmp_pg_shdescription ( "
  				" objoid oid, "
  				" classname name, "
! 				" description text) WITHOUT OIDS;\n\n");
  
! 	PG_CMD_PRINTF1("COPY tmp_pg_shdescription FROM E'%s';\n\n",
  				   escape_quotes(shdesc_file));
  
  	PG_CMD_PUTS("INSERT INTO pg_shdescription "
  				" SELECT t.objoid, c.oid, t.description "
  				"  FROM tmp_pg_shdescription t, pg_class c "
! 				"   WHERE c.relname = t.classname;\n\n");
  
  	/* Create default descriptions for operator implementation functions */
  	PG_CMD_PUTS("WITH funcdescs AS ( "
*************** setup_description(void)
*** 1852,1862 ****
  				"  FROM funcdescs "
  				"  WHERE opdesc NOT LIKE 'deprecated%' AND "
  				"  NOT EXISTS (SELECT 1 FROM pg_description "
! 		  "    WHERE objoid = p_oid AND classoid = 'pg_proc'::regclass);\n");
! 
! 	PG_CMD_CLOSE;
! 
! 	check_ok();
  }
  
  #ifdef HAVE_LOCALE_T
--- 1780,1786 ----
  				"  FROM funcdescs "
  				"  WHERE opdesc NOT LIKE 'deprecated%' AND "
  				"  NOT EXISTS (SELECT 1 FROM pg_description "
! 		"    WHERE objoid = p_oid AND classoid = 'pg_proc'::regclass);\n\n");
  }
  
  #ifdef HAVE_LOCALE_T
*************** normalize_locale_name(char *new, const c
*** 1899,1905 ****
   * populate pg_collation
   */
  static void
! setup_collation(void)
  {
  #if defined(HAVE_LOCALE_T) && !defined(WIN32)
  	int			i;
--- 1823,1829 ----
   * populate pg_collation
   */
  static void
! setup_collation(FILE *cmdfd)
  {
  #if defined(HAVE_LOCALE_T) && !defined(WIN32)
  	int			i;
*************** setup_collation(void)
*** 1907,1934 ****
  	char		localebuf[NAMEDATALEN]; /* we assume ASCII so this is fine */
  	int			count = 0;
  
- 	PG_CMD_DECL;
- #endif
- 
- 	fputs(_("creating collations ... "), stdout);
- 	fflush(stdout);
- 
- #if defined(HAVE_LOCALE_T) && !defined(WIN32)
- 	snprintf(cmd, sizeof(cmd),
- 			 "\"%s\" %s template1 >%s",
- 			 backend_exec, backend_options,
- 			 DEVNULL);
- 
  	locale_a_handle = popen_check("locale -a", "r");
  	if (!locale_a_handle)
  		return;					/* complaint already printed */
  
- 	PG_CMD_OPEN;
- 
  	PG_CMD_PUTS("CREATE TEMP TABLE tmp_pg_collation ( "
  				"	collname name, "
  				"	locale name, "
! 				"	encoding int) WITHOUT OIDS;\n");
  
  	while (fgets(localebuf, sizeof(localebuf), locale_a_handle))
  	{
--- 1831,1844 ----
  	char		localebuf[NAMEDATALEN]; /* we assume ASCII so this is fine */
  	int			count = 0;
  
  	locale_a_handle = popen_check("locale -a", "r");
  	if (!locale_a_handle)
  		return;					/* complaint already printed */
  
  	PG_CMD_PUTS("CREATE TEMP TABLE tmp_pg_collation ( "
  				"	collname name, "
  				"	locale name, "
! 				"	encoding int) WITHOUT OIDS;\n\n");
  
  	while (fgets(localebuf, sizeof(localebuf), locale_a_handle))
  	{
*************** setup_collation(void)
*** 1988,1994 ****
  
  		quoted_locale = escape_quotes(localebuf);
  
! 		PG_CMD_PRINTF3("INSERT INTO tmp_pg_collation VALUES (E'%s', E'%s', %d);\n",
  					   quoted_locale, quoted_locale, enc);
  
  		/*
--- 1898,1904 ----
  
  		quoted_locale = escape_quotes(localebuf);
  
! 		PG_CMD_PRINTF3("INSERT INTO tmp_pg_collation VALUES (E'%s', E'%s', %d);\n\n",
  					   quoted_locale, quoted_locale, enc);
  
  		/*
*************** setup_collation(void)
*** 2000,2006 ****
  		{
  			char	   *quoted_alias = escape_quotes(alias);
  
! 			PG_CMD_PRINTF3("INSERT INTO tmp_pg_collation VALUES (E'%s', E'%s', %d);\n",
  						   quoted_alias, quoted_locale, enc);
  			free(quoted_alias);
  		}
--- 1910,1916 ----
  		{
  			char	   *quoted_alias = escape_quotes(alias);
  
! 			PG_CMD_PRINTF3("INSERT INTO tmp_pg_collation VALUES (E'%s', E'%s', %d);\n\n",
  						   quoted_alias, quoted_locale, enc);
  			free(quoted_alias);
  		}
*************** setup_collation(void)
*** 2008,2014 ****
  	}
  
  	/* Add an SQL-standard name */
! 	PG_CMD_PRINTF1("INSERT INTO tmp_pg_collation VALUES ('ucs_basic', 'C', %d);\n", PG_UTF8);
  
  	/*
  	 * When copying collations to the final location, eliminate aliases that
--- 1918,1924 ----
  	}
  
  	/* Add an SQL-standard name */
! 	PG_CMD_PRINTF1("INSERT INTO tmp_pg_collation VALUES ('ucs_basic', 'C', %d);\n\n", PG_UTF8);
  
  	/*
  	 * When copying collations to the final location, eliminate aliases that
*************** setup_collation(void)
*** 2029,2048 ****
  				"   encoding, locale, locale "
  				"  FROM tmp_pg_collation"
  				"  WHERE NOT EXISTS (SELECT 1 FROM pg_collation WHERE collname = tmp_pg_collation.collname)"
! 	   "  ORDER BY collname, encoding, (collname = locale) DESC, locale;\n");
  
  	pclose(locale_a_handle);
- 	PG_CMD_CLOSE;
  
- 	check_ok();
  	if (count == 0 && !debug)
  	{
  		printf(_("No usable system locales were found.\n"));
  		printf(_("Use the option \"--debug\" to see details.\n"));
  	}
- #else							/* not HAVE_LOCALE_T && not WIN32 */
- 	printf(_("not supported on this platform\n"));
- 	fflush(stdout);
  #endif   /* not HAVE_LOCALE_T  && not WIN32 */
  }
  
--- 1939,1953 ----
  				"   encoding, locale, locale "
  				"  FROM tmp_pg_collation"
  				"  WHERE NOT EXISTS (SELECT 1 FROM pg_collation WHERE collname = tmp_pg_collation.collname)"
! 	 "  ORDER BY collname, encoding, (collname = locale) DESC, locale;\n\n");
  
  	pclose(locale_a_handle);
  
  	if (count == 0 && !debug)
  	{
  		printf(_("No usable system locales were found.\n"));
  		printf(_("Use the option \"--debug\" to see details.\n"));
  	}
  #endif   /* not HAVE_LOCALE_T  && not WIN32 */
  }
  
*************** setup_collation(void)
*** 2050,2071 ****
   * load conversion functions
   */
  static void
! setup_conversion(void)
  {
- 	PG_CMD_DECL;
  	char	  **line;
  	char	  **conv_lines;
  
- 	fputs(_("creating conversions ... "), stdout);
- 	fflush(stdout);
- 
- 	snprintf(cmd, sizeof(cmd),
- 			 "\"%s\" %s template1 >%s",
- 			 backend_exec, backend_options,
- 			 DEVNULL);
- 
- 	PG_CMD_OPEN;
- 
  	conv_lines = readfile(conversion_file);
  	for (line = conv_lines; *line != NULL; line++)
  	{
--- 1955,1965 ----
   * load conversion functions
   */
  static void
! setup_conversion(FILE *cmdfd)
  {
  	char	  **line;
  	char	  **conv_lines;
  
  	conv_lines = readfile(conversion_file);
  	for (line = conv_lines; *line != NULL; line++)
  	{
*************** setup_conversion(void)
*** 2075,2109 ****
  	}
  
  	free(conv_lines);
- 
- 	PG_CMD_CLOSE;
- 
- 	check_ok();
  }
  
  /*
   * load extra dictionaries (Snowball stemmers)
   */
  static void
! setup_dictionary(void)
  {
- 	PG_CMD_DECL;
  	char	  **line;
  	char	  **conv_lines;
  
- 	fputs(_("creating dictionaries ... "), stdout);
- 	fflush(stdout);
- 
- 	/*
- 	 * We use -j here to avoid backslashing stuff
- 	 */
- 	snprintf(cmd, sizeof(cmd),
- 			 "\"%s\" %s -j template1 >%s",
- 			 backend_exec, backend_options,
- 			 DEVNULL);
- 
- 	PG_CMD_OPEN;
- 
  	conv_lines = readfile(dictionary_file);
  	for (line = conv_lines; *line != NULL; line++)
  	{
--- 1969,1985 ----
  	}
  
  	free(conv_lines);
  }
  
  /*
   * load extra dictionaries (Snowball stemmers)
   */
  static void
! setup_dictionary(FILE *cmdfd)
  {
  	char	  **line;
  	char	  **conv_lines;
  
  	conv_lines = readfile(dictionary_file);
  	for (line = conv_lines; *line != NULL; line++)
  	{
*************** setup_dictionary(void)
*** 2112,2121 ****
  	}
  
  	free(conv_lines);
- 
- 	PG_CMD_CLOSE;
- 
- 	check_ok();
  }
  
  /*
--- 1988,1993 ----
*************** setup_dictionary(void)
*** 2130,2168 ****
   * set (NOT NULL).
   */
  static void
! setup_privileges(void)
  {
- 	PG_CMD_DECL;
  	char	  **line;
  	char	  **priv_lines;
  	static char *privileges_setup[] = {
  		"UPDATE pg_class "
  		"  SET relacl = E'{\"=r/\\\\\"$POSTGRES_SUPERUSERNAME\\\\\"\"}' "
! 		"  WHERE relkind IN ('r', 'v', 'm', 'S') AND relacl IS NULL;\n",
! 		"GRANT USAGE ON SCHEMA pg_catalog TO PUBLIC;\n",
! 		"GRANT CREATE, USAGE ON SCHEMA public TO PUBLIC;\n",
! 		"REVOKE ALL ON pg_largeobject FROM PUBLIC;\n",
  		NULL
  	};
  
- 	fputs(_("setting privileges on built-in objects ... "), stdout);
- 	fflush(stdout);
- 
- 	snprintf(cmd, sizeof(cmd),
- 			 "\"%s\" %s template1 >%s",
- 			 backend_exec, backend_options,
- 			 DEVNULL);
- 
- 	PG_CMD_OPEN;
- 
  	priv_lines = replace_token(privileges_setup, "$POSTGRES_SUPERUSERNAME",
  							   escape_quotes(username));
  	for (line = priv_lines; *line != NULL; line++)
  		PG_CMD_PUTS(*line);
- 
- 	PG_CMD_CLOSE;
- 
- 	check_ok();
  }
  
  /*
--- 2002,2025 ----
   * set (NOT NULL).
   */
  static void
! setup_privileges(FILE *cmdfd)
  {
  	char	  **line;
  	char	  **priv_lines;
  	static char *privileges_setup[] = {
  		"UPDATE pg_class "
  		"  SET relacl = E'{\"=r/\\\\\"$POSTGRES_SUPERUSERNAME\\\\\"\"}' "
! 		"  WHERE relkind IN ('r', 'v', 'm', 'S') AND relacl IS NULL;\n\n",
! 		"GRANT USAGE ON SCHEMA pg_catalog TO PUBLIC;\n\n",
! 		"GRANT CREATE, USAGE ON SCHEMA public TO PUBLIC;\n\n",
! 		"REVOKE ALL ON pg_largeobject FROM PUBLIC;\n\n",
  		NULL
  	};
  
  	priv_lines = replace_token(privileges_setup, "$POSTGRES_SUPERUSERNAME",
  							   escape_quotes(username));
  	for (line = priv_lines; *line != NULL; line++)
  		PG_CMD_PUTS(*line);
  }
  
  /*
*************** set_info_version(void)
*** 2197,2223 ****
   * load info schema and populate from features file
   */
  static void
! setup_schema(void)
  {
- 	PG_CMD_DECL;
  	char	  **line;
  	char	  **lines;
  
- 	fputs(_("creating information schema ... "), stdout);
- 	fflush(stdout);
- 
  	lines = readfile(info_schema_file);
  
- 	/*
- 	 * We use -j here to avoid backslashing stuff in information_schema.sql
- 	 */
- 	snprintf(cmd, sizeof(cmd),
- 			 "\"%s\" %s -j template1 >%s",
- 			 backend_exec, backend_options,
- 			 DEVNULL);
- 
- 	PG_CMD_OPEN;
- 
  	for (line = lines; *line != NULL; line++)
  	{
  		PG_CMD_PUTS(*line);
--- 2054,2066 ----
   * load info schema and populate from features file
   */
  static void
! setup_schema(FILE *cmdfd)
  {
  	char	  **line;
  	char	  **lines;
  
  	lines = readfile(info_schema_file);
  
  	for (line = lines; *line != NULL; line++)
  	{
  		PG_CMD_PUTS(*line);
*************** setup_schema(void)
*** 2226,2390 ****
  
  	free(lines);
  
- 	PG_CMD_CLOSE;
- 
- 	snprintf(cmd, sizeof(cmd),
- 			 "\"%s\" %s template1 >%s",
- 			 backend_exec, backend_options,
- 			 DEVNULL);
- 
- 	PG_CMD_OPEN;
- 
  	PG_CMD_PRINTF1("UPDATE information_schema.sql_implementation_info "
  				   "  SET character_value = '%s' "
! 				   "  WHERE implementation_info_name = 'DBMS VERSION';\n",
  				   infoversion);
  
  	PG_CMD_PRINTF1("COPY information_schema.sql_features "
  				   "  (feature_id, feature_name, sub_feature_id, "
  				   "  sub_feature_name, is_supported, comments) "
! 				   " FROM E'%s';\n",
  				   escape_quotes(features_file));
- 
- 	PG_CMD_CLOSE;
- 
- 	check_ok();
  }
  
  /*
   * load PL/pgsql server-side language
   */
  static void
! load_plpgsql(void)
  {
! 	PG_CMD_DECL;
! 
! 	fputs(_("loading PL/pgSQL server-side language ... "), stdout);
! 	fflush(stdout);
! 
! 	snprintf(cmd, sizeof(cmd),
! 			 "\"%s\" %s template1 >%s",
! 			 backend_exec, backend_options,
! 			 DEVNULL);
! 
! 	PG_CMD_OPEN;
! 
! 	PG_CMD_PUTS("CREATE EXTENSION plpgsql;\n");
! 
! 	PG_CMD_CLOSE;
! 
! 	check_ok();
  }
  
  /*
   * clean everything up in template1
   */
  static void
! vacuum_db(void)
  {
- 	PG_CMD_DECL;
- 
- 	fputs(_("vacuuming database template1 ... "), stdout);
- 	fflush(stdout);
- 
- 	snprintf(cmd, sizeof(cmd),
- 			 "\"%s\" %s template1 >%s",
- 			 backend_exec, backend_options,
- 			 DEVNULL);
- 
- 	PG_CMD_OPEN;
- 
  	/* Run analyze before VACUUM so the statistics are frozen. */
! 	PG_CMD_PUTS("ANALYZE;\nVACUUM FREEZE;\n");
! 
! 	PG_CMD_CLOSE;
! 
! 	check_ok();
  }
  
  /*
   * copy template1 to template0
   */
  static void
! make_template0(void)
  {
! 	PG_CMD_DECL;
! 	const char **line;
! 	static const char *template0_setup[] = {
! 		"CREATE DATABASE template0 IS_TEMPLATE = true ALLOW_CONNECTIONS = false;\n",
  
  		/*
  		 * We use the OID of template0 to determine lastsysoid
  		 */
  		"UPDATE pg_database SET datlastsysoid = "
  		"    (SELECT oid FROM pg_database "
! 		"    WHERE datname = 'template0');\n",
  
  		/*
  		 * Explicitly revoke public create-schema and create-temp-table
  		 * privileges in template1 and template0; else the latter would be on
  		 * by default
  		 */
! 		"REVOKE CREATE,TEMPORARY ON DATABASE template1 FROM public;\n",
! 		"REVOKE CREATE,TEMPORARY ON DATABASE template0 FROM public;\n",
  
! 		"COMMENT ON DATABASE template0 IS 'unmodifiable empty database';\n",
  
  		/*
  		 * Finally vacuum to clean up dead rows in pg_database
  		 */
! 		"VACUUM FULL pg_database;\n",
  		NULL
  	};
  
- 	fputs(_("copying template1 to template0 ... "), stdout);
- 	fflush(stdout);
- 
- 	snprintf(cmd, sizeof(cmd),
- 			 "\"%s\" %s template1 >%s",
- 			 backend_exec, backend_options,
- 			 DEVNULL);
- 
- 	PG_CMD_OPEN;
- 
  	for (line = template0_setup; *line; line++)
  		PG_CMD_PUTS(*line);
- 
- 	PG_CMD_CLOSE;
- 
- 	check_ok();
  }
  
  /*
   * copy template1 to postgres
   */
  static void
! make_postgres(void)
  {
! 	PG_CMD_DECL;
! 	const char **line;
! 	static const char *postgres_setup[] = {
! 		"CREATE DATABASE postgres;\n",
! 		"COMMENT ON DATABASE postgres IS 'default administrative connection database';\n",
  		NULL
  	};
  
- 	fputs(_("copying template1 to postgres ... "), stdout);
- 	fflush(stdout);
- 
- 	snprintf(cmd, sizeof(cmd),
- 			 "\"%s\" %s template1 >%s",
- 			 backend_exec, backend_options,
- 			 DEVNULL);
- 
- 	PG_CMD_OPEN;
- 
  	for (line = postgres_setup; *line; line++)
  		PG_CMD_PUTS(*line);
- 
- 	PG_CMD_CLOSE;
- 
- 	check_ok();
  }
  
  /*
--- 2069,2158 ----
  
  	free(lines);
  
  	PG_CMD_PRINTF1("UPDATE information_schema.sql_implementation_info "
  				   "  SET character_value = '%s' "
! 				   "  WHERE implementation_info_name = 'DBMS VERSION';\n\n",
  				   infoversion);
  
  	PG_CMD_PRINTF1("COPY information_schema.sql_features "
  				   "  (feature_id, feature_name, sub_feature_id, "
  				   "  sub_feature_name, is_supported, comments) "
! 				   " FROM E'%s';\n\n",
  				   escape_quotes(features_file));
  }
  
  /*
   * load PL/pgsql server-side language
   */
  static void
! load_plpgsql(FILE *cmdfd)
  {
! 	PG_CMD_PUTS("CREATE EXTENSION plpgsql;\n\n");
  }
  
  /*
   * clean everything up in template1
   */
  static void
! vacuum_db(FILE *cmdfd)
  {
  	/* Run analyze before VACUUM so the statistics are frozen. */
! 	PG_CMD_PUTS("ANALYZE;\n\nVACUUM FREEZE;\n\n");
  }
  
  /*
   * copy template1 to template0
   */
  static void
! make_template0(FILE *cmdfd)
  {
! 	const char *const * line;
! 	static const char *const template0_setup[] = {
! 		"CREATE DATABASE template0 IS_TEMPLATE = true ALLOW_CONNECTIONS = false;\n\n",
  
  		/*
  		 * We use the OID of template0 to determine lastsysoid
  		 */
  		"UPDATE pg_database SET datlastsysoid = "
  		"    (SELECT oid FROM pg_database "
! 		"    WHERE datname = 'template0');\n\n",
  
  		/*
  		 * Explicitly revoke public create-schema and create-temp-table
  		 * privileges in template1 and template0; else the latter would be on
  		 * by default
  		 */
! 		"REVOKE CREATE,TEMPORARY ON DATABASE template1 FROM public;\n\n",
! 		"REVOKE CREATE,TEMPORARY ON DATABASE template0 FROM public;\n\n",
  
! 		"COMMENT ON DATABASE template0 IS 'unmodifiable empty database';\n\n",
  
  		/*
  		 * Finally vacuum to clean up dead rows in pg_database
  		 */
! 		"VACUUM FULL pg_database;\n\n",
  		NULL
  	};
  
  	for (line = template0_setup; *line; line++)
  		PG_CMD_PUTS(*line);
  }
  
  /*
   * copy template1 to postgres
   */
  static void
! make_postgres(FILE *cmdfd)
  {
! 	const char *const * line;
! 	static const char *const postgres_setup[] = {
! 		"CREATE DATABASE postgres;\n\n",
! 		"COMMENT ON DATABASE postgres IS 'default administrative connection database';\n\n",
  		NULL
  	};
  
  	for (line = postgres_setup; *line; line++)
  		PG_CMD_PUTS(*line);
  }
  
  /*
*************** check_authmethod_unspecified(const char 
*** 2794,2802 ****
  }
  
  static void
! check_authmethod_valid(const char *authmethod, const char **valid_methods, const char *conntype)
  {
! 	const char **p;
  
  	for (p = valid_methods; *p; p++)
  	{
--- 2562,2570 ----
  }
  
  static void
! check_authmethod_valid(const char *authmethod, const char *const * valid_methods, const char *conntype)
  {
! 	const char *const * p;
  
  	for (p = valid_methods; *p; p++)
  	{
*************** warn_on_mount_point(int error)
*** 3303,3308 ****
--- 3071,3077 ----
  void
  initialize_data_directory(void)
  {
+ 	PG_CMD_DECL;
  	int			i;
  
  	setup_signals();
*************** initialize_data_directory(void)
*** 3343,3377 ****
  	 */
  	write_version_file("base/1");
  
! 	/* Create the stuff we don't need to use bootstrap mode for */
  
! 	setup_auth();
  	if (pwprompt || pwfilename)
! 		get_set_pwd();
  
! 	setup_depend();
  
! 	setup_sysviews();
  
! 	setup_description();
  
! 	setup_collation();
  
! 	setup_conversion();
  
! 	setup_dictionary();
  
! 	setup_privileges();
  
! 	setup_schema();
  
! 	load_plpgsql();
  
! 	vacuum_db();
  
! 	make_template0();
  
! 	make_postgres();
  }
  
  
--- 3112,3162 ----
  	 */
  	write_version_file("base/1");
  
! 	/*
! 	 * Create the stuff we don't need to use bootstrap mode for, using a
! 	 * backend running in simple standalone mode.
! 	 */
! 	fputs(_("performing post-bootstrap initialization ... "), stdout);
! 	fflush(stdout);
  
! 	snprintf(cmd, sizeof(cmd),
! 			 "\"%s\" %s template1 >%s",
! 			 backend_exec, backend_options,
! 			 DEVNULL);
! 
! 	PG_CMD_OPEN;
! 
! 	setup_auth(cmdfd);
  	if (pwprompt || pwfilename)
! 		get_set_pwd(cmdfd);
  
! 	setup_depend(cmdfd);
  
! 	setup_sysviews(cmdfd);
  
! 	setup_description(cmdfd);
  
! 	setup_collation(cmdfd);
  
! 	setup_conversion(cmdfd);
  
! 	setup_dictionary(cmdfd);
  
! 	setup_privileges(cmdfd);
  
! 	setup_schema(cmdfd);
  
! 	load_plpgsql(cmdfd);
  
! 	vacuum_db(cmdfd);
  
! 	make_template0(cmdfd);
  
! 	make_postgres(cmdfd);
! 
! 	PG_CMD_CLOSE;
! 
! 	check_ok();
  }
  
  
diff --git a/src/tools/msvc/Install.pm b/src/tools/msvc/Install.pm
index f955725..40e06f6 100644
*** a/src/tools/msvc/Install.pm
--- b/src/tools/msvc/Install.pm
*************** sub GenerateConversionScript
*** 365,371 ****
  		$sql .=
  "CREATE DEFAULT CONVERSION pg_catalog.$name FOR '$se' TO '$de' FROM $func;\n";
  		$sql .=
! "COMMENT ON CONVERSION pg_catalog.$name IS 'conversion for $se to $de';\n";
  	}
  	open($F, ">$target/share/conversion_create.sql")
  	  || die "Could not write to conversion_create.sql\n";
--- 365,371 ----
  		$sql .=
  "CREATE DEFAULT CONVERSION pg_catalog.$name FOR '$se' TO '$de' FROM $func;\n";
  		$sql .=
! "COMMENT ON CONVERSION pg_catalog.$name IS 'conversion for $se to $de';\n\n";
  	}
  	open($F, ">$target/share/conversion_create.sql")
  	  || die "Could not write to conversion_create.sql\n";