Parsing config files in a directory
Per discussion at the developer meeting back in Ottawa, attached is an
initial patch that implements reading a directory of configuration
files instead of just one. The idea being that something like a tuning
tool, or pgadmin, for example can drop and modify files in this
directory instead of modifying the main config file (which can be very
hard to machine-parse). The idea is the same as other software like
apache that parses multiple files.
Files are parsed in alphabetical order so it's predictable, and you
can make sure some files override others etc.
Comments, before I go do the final polishing? :-)
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Attachments:
guc_dir.patch (text/x-diff; charset=US-ASCII; +148 -11)
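A minimal sketch of the load order described in this message, not the patch itself (which is C code inside the server). It assumes a deliberately naive `name = value` parser, and the names `parse_conf` and `load_settings` are hypothetical. The main file is read first, then every file in the directory in sorted order, so later files override earlier ones.

```python
import os


def parse_conf(path):
    """Parse 'name = value' lines. Naive on purpose: it ignores quoting
    rules, escapes and include directives that the real grammar supports."""
    settings = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.split("#", 1)[0].strip()  # drop comments
            if "=" not in line:
                continue
            name, _, value = line.partition("=")
            settings[name.strip()] = value.strip().strip("'")
    return settings


def load_settings(main_file, conf_dir):
    """Read the main file, then each file in conf_dir in sorted order.
    A setting seen later replaces one seen earlier."""
    settings = parse_conf(main_file)
    if os.path.isdir(conf_dir):
        # sorted() orders by code point, which is only an approximation
        # of "alphabetical" for non-ASCII file names.
        for entry in sorted(os.listdir(conf_dir)):
            path = os.path.join(conf_dir, entry)
            if os.path.isfile(path):
                settings.update(parse_conf(path))
    return settings
```

With an empty directory the result is the same as parsing the main file alone, which matches the "nothing changes by default" behaviour discussed later in the thread.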
On 24 Oct 2009, at 14:41, Magnus Hagander wrote:
Per discussion at the developer meeting back in Ottawa, attached is an
initial patch that implements reading a directory of configuration
files instead of just one. The idea being that something like a tuning
tool, or pgadmin, for example can drop and modify files in this
directory instead of modifying the main config file (which can be very
hard to machine-parse). The idea is the same as other software like
apache that parses multiple files.
Files are parsed in alphabetical order so it's predictable, and you
can make sure some files override others etc.
Comments, before I go do the final polishing? :-)
I don't know what the discussion topics were, since I was not there.
But the primary question is: can't that be achieved with simple includes
in postgresql.conf?
On Sat, 2009-10-24 at 15:41 +0200, Magnus Hagander wrote:
Per discussion at the developer meeting back in Ottawa, attached is an
initial patch that implements reading a directory of configuration
files instead of just one. The idea being that something like a tuning
tool, or pgadmin, for example can drop and modify files in this
directory instead of modifying the main config file (which can be very
hard to machine-parse). The idea is the same as other software like
apache that parses multiple files.
Files are parsed in alphabetical order so it's predictable, and you
can make sure some files override others etc.
Comments, before I go do the final polishing? :-)
I really don't like this at all. It seems like too much change. The
whole world knows about postgresql.conf, let's not change that.
I'm happy with the new feature, however, so is there a way to do this?
Could we have a new directive in postgresql.conf that allows you to
specify an includedirectory? Like an include directive but for a whole
directory rather than just a file. Users would then also be able to
specify more than one directory, if required. This way we would allow
people to have the multi-conf file feature but without changing existing
ways of working. By default, we would have one entry at the bottom of
postgresql.conf which would point to pg_conf, a new directory that
starts off empty. So by default, nothing has changed, yet the new
feature is allowed.
--
Simon Riggs www.2ndQuadrant.com
2009/10/24 Grzegorz Jaskiewicz <gj@pointblue.com.pl>:
On 24 Oct 2009, at 14:41, Magnus Hagander wrote:
Per discussion at the developer meeting back in Ottawa, attached is an
initial patch that implements reading a directory of configuration
files instead of just one. The idea being that something like a tuning
tool, or pgadmin, for example can drop and modify files in this
directory instead of modifying the main config file (which can be very
hard to machine-parse). The idea is the same as other software like
apache that parses multiple files.
Files are parsed in alphabetical order so it's predictable, and you
can make sure some files override others etc.
Comments, before I go do the final polishing? :-)
I don't know what the discussion topics were, since I was not there.
But the primary question is: can't that be achieved with simple includes
in postgresql.conf?
You could, but that would take away the main point - which is that the
utilities would once again have to edit and parse postgresql.conf,
which is *very* hard to do reliably given that it's a free-format
file.
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
2009/10/24 Simon Riggs <simon@2ndquadrant.com>:
On Sat, 2009-10-24 at 15:41 +0200, Magnus Hagander wrote:
Per discussion at the developer meeting back in Ottawa, attached is an
initial patch that implements reading a directory of configuration
files instead of just one. The idea being that something like a tuning
tool, or pgadmin, for example can drop and modify files in this
directory instead of modifying the main config file (which can be very
hard to machine-parse). The idea is the same as other software like
apache that parses multiple files.
Files are parsed in alphabetical order so it's predictable, and you
can make sure some files override others etc.
Comments, before I go do the final polishing? :-)
I really don't like this at all. It seems like too much change. The
whole world knows about postgresql.conf, let's not change that.
We're not. It will still be there, just like before. We're just adding
one more way to do it.
I'm happy with the new feature, however, so is there a way to do this?
Could we have a new directive in postgresql.conf that allows you to
specify an includedirectory? Like an include directive but for a whole
directory rather than just a file.
We could do it that way, but that would make the change bigger, not smaller :-P
Users would then also be able to
specify more than one directory, if required. This way we would allow
people to have the multi-conf file feature but without changing existing
ways of working. By default, we would have one entry at the bottom of
postgresql.conf which would point to pg_conf, a new directory that
starts off empty. So by default, nothing has changed, yet the new
feature is allowed.
Did you look at the patch? That's basically what it does now, except
it doesn't add a parameter in postgresql.conf. If you leave the
pg_config directory empty, it will just parse the postgresql.conf file
just like before, and that's it. Only if you put something in the
pg_config directory will it load it, and only *after* it has loaded
the main configuration file.
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On Sat, 2009-10-24 at 18:34 +0200, Magnus Hagander wrote:
Did you look at the patch?
I did, yes. But no docs with it. It would be good to see the proposal,
not just the patch (or a reference to the written proposal from
earlier).
That's basically what it does now, except
it doesn't add a parameter in postgresql.conf. If you leave the
pg_config directory empty, it will just parse the postgresql.conf file
just like before, and that's it. Only if you put something in the
pg_config directory will it load it, and only *after* it has loaded
the main configuration file.
OK, I didn't pick up on that. So now I like it, apart from one thing: I
would prefer to have explicit control over that (via the directive I
mentioned or otherwise), rather than just doing that implicitly. Doing
things implicitly will make it even harder for most people to trace
which parameters will get picked up and stupid mistakes will be made on
production servers.
--
Simon Riggs www.2ndQuadrant.com
Magnus Hagander wrote:
I'm happy with the new feature, however, so is there a way to do this?
Could we have a new directive in postgresql.conf that allows you to
specify an includedirectory? Like an include directive but for a whole
directory rather than just a file.
We could do it that way, but that would make the change bigger, not smaller :-P
If we're going to do this at all, ISTM the location should be
configurable, just like other file locations are.
Users would then also be able to
specify more than one directory, if required. This way we would allow
people to have the multi-conf file feature but without changing existing
ways of working. By default, we would have one entry at the bottom of
postgresql.conf which would point to pg_conf, a new directory that
starts off empty. So by default, nothing has changed, yet the new
feature is allowed.
Did you look at the patch? That's basically what it does now, except
it doesn't add a parameter in postgresql.conf. If you leave the
pg_config directory empty, it will just parse the postgresql.conf file
just like before, and that's it. Only if you put something in the
pg_config directory will it load it, and only *after* it has loaded
the main configuration file.
What bothers me some is that it sounds like a bit of a footgun. A
postgres cluster is a shared resource, and we surely don't want
applications meddling with the shared config. This seems quite different
from, say, an application dropping a file in /etc/cron.d.
I don't have strong feelings on it, but I do have some niggling worries.
cheers
andrew
2009/10/24 Simon Riggs <simon@2ndquadrant.com>:
On Sat, 2009-10-24 at 18:34 +0200, Magnus Hagander wrote:
Did you look at the patch?
I did, yes. But no docs with it. It would be good to see the proposal,
not just the patch (or a reference to the written proposal from
earlier).
We discussed it at the developer meeting - I believe you were there by phone.
http://wiki.postgresql.org/wiki/PgCon_2009_Developer_Meeting
See the section on auto tuning, including an action list.
That's basically what it does now, except
it doesn't add a parameter in postgresql.conf. If you leave the
pg_config directory empty, it will just parse the postgresql.conf file
just like before, and that's it. Only if you put something in the
pg_config directory will it load it, and only *after* it has loaded
the main configuration file.
OK, I didn't pick up on that. So now I like it, apart from one thing: I
would prefer to have explicit control over that (via the directive I
mentioned or otherwise), rather than just doing that implicitly. Doing
things implicitly will make it even harder for most people to trace
which parameters will get picked up and stupid mistakes will be made on
production servers.
So you're suggesting an includedir parameter, right? And one such
parameter at the bottom of the default postgresql.conf file, with the
ability to remove it and/or add others, correct?
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On Sat, 2009-10-24 at 12:52 -0400, Andrew Dunstan wrote:
What bothers me some is that it sounds like a bit of a footgun. A
postgres cluster is a shared resource, and we surely don't want
applications meddling with the shared config. This seems quite different
from, say, an application dropping a file in /etc/cron.d.
I don't have strong feelings on it, but I do have some niggling worries.
Yes, that describes my mild unease about this a little better.
--
Simon Riggs www.2ndQuadrant.com
On Sat, 2009-10-24 at 18:57 +0200, Magnus Hagander wrote:
That's basically what it does now, except
it doesn't add a parameter in postgresql.conf. If you leave the
pg_config directory empty, it will just parse the postgresql.conf file
just like before, and that's it. Only if you put something in the
pg_config directory will it load it, and only *after* it has loaded
the main configuration file.
OK, I didn't pick up on that. So now I like it, apart from one thing: I
would prefer to have explicit control over that (via the directive I
mentioned or otherwise), rather than just doing that implicitly. Doing
things implicitly will make it even harder for most people to trace
which parameters will get picked up and stupid mistakes will be made on
production servers.
So you're suggesting an includedir parameter, right? And one such
parameter at the bottom of the default postgresql.conf file, with the
ability to remove it and/or add others, correct?
Yes, please.
2009/10/24 Simon Riggs <simon@2ndquadrant.com>:
On Sat, 2009-10-24 at 18:34 +0200, Magnus Hagander wrote:
Did you look at the patch?
I did, yes. But no docs with it. It would be good to see the proposal,
not just the patch (or a reference to the written proposal from
earlier).
We discussed it at the developer meeting - I believe you were there by phone.
http://wiki.postgresql.org/wiki/PgCon_2009_Developer_Meeting
See the section on auto tuning, including an action list.
I wasn't there for that part of the meeting, and the notes from the
meeting aren't a design proposal, just a note to do a feature. The notes
do put an action on someone to "work out details on hackers". Not doing
design before coding is a general bugbear of mine - no need to pick on
you in particular with this and I'm sorry I brought that up.
--
Simon Riggs www.2ndQuadrant.com
On Sat, 24 Oct 2009, Andrew Dunstan wrote:
If we're going to do this at all, ISTM the location should be configurable,
just like other file locations are.
...
What bothers me some is that it sounds like a bit of a footgun. A postgres
cluster is a shared resource, and we surely don't want applications meddling
with the shared config. This seems quite different from, say, an application
dropping a file in /etc/cron.d.
It's hard to satisfy both these at once. If you want to make it secured
against random application changes, the logical default place to put the
files is in the database directory itself. If apps are allowed to write
there, they can surely cause more mayhem than just overwriting the config
files. If instead you allow the files to go anywhere, that's more
flexible, but then requires you to make sure that alternate location is
similarly secured.
Regardless, the UI I was hoping for was to make the default
postgresql.conf file end with a line like this:
directory 'conf'
Which makes the inclusion explicit for people used to navigating the
current style: look for a "conf" directory under PGDATA, those have the
modifications you're looking for. That's also easy for distributors to
patch, so you might even have the RHEL build ship with something like:
directory '/etc/sysconfig/postgresql/config'
I don't think it's asking too much of tool authors that, given a PGDATA,
they parse the postgresql.conf file in there just well enough to figure
out where the directories of additional config files are. There's already
code for that level of config file manipulation in initdb; I was planning
to refactor that into a library I can include for the built-in pgtune
I've been planning.
--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
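A minimal sketch of the "parse just well enough" idea above, assuming the hypothetical `directory '...'` directive syntax from this message (it is a proposal here, not an existing feature) and a hypothetical helper name `find_conf_dirs`. Relative paths are resolved against PGDATA, as the message suggests.

```python
import os
import re

# Matches an uncommented line such as:  directory 'conf'
# The directive name and quoting style are taken from the proposal above.
DIRECTIVE = re.compile(r"^\s*directory\s+'([^']*)'\s*(?:#.*)?$")


def find_conf_dirs(pgdata, conf_name="postgresql.conf"):
    """Return the directories named by 'directory' lines, in file order.
    Every other line in the file is ignored rather than parsed."""
    dirs = []
    with open(os.path.join(pgdata, conf_name), encoding="utf-8") as fh:
        for line in fh:
            match = DIRECTIVE.match(line)
            if match:
                # os.path.join keeps an absolute path unchanged.
                dirs.append(os.path.join(pgdata, match.group(1)))
    return dirs
```

A tool using this never has to understand the rest of the free-format file; it only needs to recognize one line shape.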
On Sat, Oct 24, 2009 at 9:41 AM, Magnus Hagander <magnus@hagander.net> wrote:
Per discussion at the developer meeting back in Ottawa, attached is an
initial patch that implements reading a directory of configuration
files instead of just one. The idea being that something like a tuning
tool, or pgadmin, for example can drop and modify files in this
directory instead of modifying the main config file (which can be very
hard to machine-parse).
The solution to the problem mentioned parenthetically here is
$PGDATA/postgresql.conf
And no matter how much anyone cares to protest, that is the ONLY real
solution to the problem of postgresql.conf being hard to parse until
we have software that can pass the Turing test. At the aforementioned
developer meeting, or anyway sometime at PGcon, there was some
discussion of the following variant, which would also work:
echo "# 'man postgresql.conf' for information about the contents of this file" > $PGDATA/postgresql.conf
Supporting an include-directory seems harmless to me, and even
potentially useful. But the only way to solve the problem of
machine-parsing the config file is to remove the instructions (which
can only EVER be parsed by human beings) and put them somewhere else.
To reiterate, I have no problem with the proposal (I have not examined
the code), but I respectfully submit that it's not solving the problem
you think it's solving.
...Robert
phone answering, please forgive style...
--
dim
On 24 Oct 2009, at 20:10, Robert Haas <robertmhaas@gmail.com> wrote:
$PGDATA/postgresql.conf
I think the multiple files should go in $PGDATA/postgresql.conf.d
But the only way to solve the problem of
machine-parsing the config file is to remove the instructions (which
can only EVER be parsed by human beings) and put them somewhere else.
Not true. The problem we're trying to solve, methinks, is to be able
to provide a kind of SET PERMANENT feature.
The easiest alternative for providing this given current conf file
content style is to offer one file per GUC, where the first non
comment line is supposed to contain the option value.
Now if you first load postgresql.conf then files in postgresql.conf.d,
you did not change current behavior for $EDITOR people and made it
easy to have SET PERSISTENT + free form comment.
Regards,
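A minimal sketch of the one-file-per-GUC layout suggested above, under stated assumptions: the file name is the setting name, the first non-blank, non-comment line is the value, and the helper names `read_guc_dir` and `set_persistent` are hypothetical. It is meant only to show why this layout is easy to rewrite by machine while keeping free-form comments.

```python
import os


def read_guc_dir(conf_dir):
    """Map each file name to the first non-comment, non-blank line in it."""
    settings = {}
    for name in sorted(os.listdir(conf_dir)):
        path = os.path.join(conf_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path, encoding="utf-8") as fh:
            for line in fh:
                text = line.strip()
                if text and not text.startswith("#"):
                    settings[name] = text
                    break
    return settings


def set_persistent(conf_dir, name, value):
    """Replace only the value line of one setting; comments are untouched."""
    path = os.path.join(conf_dir, name)
    lines = []
    if os.path.exists(path):
        with open(path, encoding="utf-8") as fh:
            lines = fh.read().splitlines()
    for i, line in enumerate(lines):
        text = line.strip()
        if text and not text.startswith("#"):
            lines[i] = value
            break
    else:
        # No value line yet (new or comment-only file): append one.
        lines.append(value)
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(lines) + "\n")
```

The rewrite never has to guess which comment belongs to which setting, because each file holds exactly one setting.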
On Oct 24, 2009, at 4:01 PM, Dimitri Fontaine <dfontaine@hi-media.com>
wrote:
On 24 Oct 2009, at 20:10, Robert Haas <robertmhaas@gmail.com> wrote:
$PGDATA/postgresql.conf
I think the multiple files should go in $PGDATA/postgresql.conf.d
But the only way to solve the problem of
machine-parsing the config file is to remove the instructions (which
can only EVER be parsed by human beings) and put them somewhere else.
Not true. The problem we're trying to solve, methinks, is to be able
to provide a kind of SET PERMANENT feature.
The easiest alternative for providing this given current conf file
content style is to offer one file per GUC, where the first non
comment line is supposed to contain the option value.
Now if you first load postgresql.conf then files in postgresql.conf.d,
you did not change current behavior for $EDITOR people and made it
easy to have SET PERSISTENT + free form comment.
I guess that would solve the problem of knowing which comment is
associated with which setting, but it won't prevent SET PERSISTENT
from falsifying a comment like "the previous value of this setting was
4MB, but I changed it on YYYY-MM-DD". Maybe we don't care about that,
though.
...Robert
Robert Haas <robertmhaas@gmail.com> writes:
Supporting an include-directory seems harmless to me, and even
potentially useful. But the only way to solve the problem of
machine-parsing the config file is to remove the instructions (which
can only EVER be parsed by human beings) and put them somewhere else.
Uh, that is complete nonsense. The instructions are not a problem.
The ability to have comments at all might be thought to be a problem,
but removing it isn't going to be an acceptable solution.
regards, tom lane
On Sat, 24 Oct 2009, Robert Haas wrote:
But the only way to solve the problem of machine-parsing the config file
is to remove the instructions (which can only EVER be parsed by human
beings) and put them somewhere else.
Ah, back to this again already. Your suggestion presumes there is someone
who can successfully force a decision to deprecate the existing format.
There is no such person, and thinking you have to win that battle is one
of the many traps one can fall into here and never escape from.
The roadmap here looks something like this:
1) Support a standard for include directories
2) Update tools to use them rather than ever modifying the primary
postgresql.conf
3) Once one of those matures, bundle a standard tool with the database
that does most of what people want for initial configuration. That only
has to worry about writing to the new include directory format rather than
trying to parse existing postgresql.conf files and write them back out
again at all.
4) Once the tool has proven itself, and people have become more
comfortable with using the newer config format, allow the option of
generating a shorter example file rather than the current large one.
5) Completely deprecate the giant config file *only* if the new approach
becomes so wildly successful that fans of editing the giant file admit
defeat at the hands of the new approach. This is completely optional and
possibly not ever possible.
If we get bogged down presuming (5) is the first useful step here, which
seems to be what you're suggesting, no progress will ever get made here.
To reiterate, I have no problem with the proposal (I have not examined
the code), but I respectfully submit that it's not solving the problem
you think it's solving.
As the message introducing it says, the goal this aims at is making life
easier for tool builders. I think you're extrapolating beyond its
intended scope in your evaluation of what problem it's aiming to solve.
--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
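Step 2 of the roadmap above, a tool that writes only to the include directory and never touches the primary postgresql.conf, might look roughly like the sketch below. The function name `write_tool_settings`, the numeric file-name prefix and the generated header line are all assumptions for illustration. Single quotes in values are doubled, which is the quoting form postgresql.conf accepts.

```python
import os
import tempfile


def write_tool_settings(conf_dir, file_name, settings):
    """Write one tool-owned file into conf_dir, replacing it atomically.
    The primary postgresql.conf is never opened."""
    os.makedirs(conf_dir, exist_ok=True)
    body = "".join(
        "{} = '{}'\n".format(name, str(value).replace("'", "''"))
        for name, value in sorted(settings.items())
    )
    # Write to a temporary file in the same directory, then rename it over
    # the target so a reader never sees a half-written file.
    fd, tmp_path = tempfile.mkstemp(dir=conf_dir, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write("# Generated file; manual edits may be overwritten.\n")
            fh.write(body)
        os.replace(tmp_path, os.path.join(conf_dir, file_name))
    except Exception:
        os.unlink(tmp_path)
        raise
```

For example, `write_tool_settings(conf_dir, "50-pgtune.conf", {"work_mem": "64MB"})` would leave a single generated file in the directory. Whether the server should skip dot-prefixed temporary files while scanning the directory is a design question this sketch does not settle.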
On Sat, Oct 24, 2009 at 7:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
Supporting an include-directory seems harmless to me, and even
potentially useful. But the only way to solve the problem of
machine-parsing the config file is to remove the instructions (which
can only EVER be parsed by human beings) and put them somewhere else.
Uh, that is complete nonsense. The instructions are not a problem.
The ability to have comments at all might be thought to be a problem,
but removing it isn't going to be an acceptable solution.
If you're saying that the instructions are no more of a problem than
any other comment, then I guess I agree. But the comments in the
default file which we ship are mostly instructions, hence my original
statement.
I don't believe that the *ability* to have comments is the problem.
It wouldn't hurt anything to ship a file with a general comment block
at the top, with whatever content someone wants to put there. What
makes it impossible to machine-edit this file is that there is a
comment for every single setting, and that the "right place" to insert
a value for any particular setting is (at least in the default
configuration) marked by a comment which can be interpreted by humans
and not by a computer.
Shipping a mostly-empty file with a pointer to a man page that
included all of the instructions now contained in the file would make
things immensely easier for people who want to write programs to tune
the configuration, because it would transform the task of writing a
program that rewrites the configuration file from Turing-complete to
very easy. For the same reason, it would also allow us to support SET
PERSISTENT. On the flip side, as long as we leave it the way it is,
we can't do those things. We can argue about whether that's a good
trade-off, but that IS the trade-off.
...Robert
Robert Haas <robertmhaas@gmail.com> writes:
I don't believe that the *ability* to have comments is the problem.
It wouldn't hurt anything to ship a file with a general comment block
at the top, with whatever content someone wants to put there. What
makes it impossible to machine-edit this file is that there is a
comment for every single setting, and that the "right place" to insert
a value for any particular setting is (at least in the default
configuration) marked by a comment which can be interpreted by humans
and not by a computer.
Right, but your mistake is in supposing that that statement has
something to do with the instructions. What it has to do with is
a style of usage that the instructions happen to exemplify --- but
getting rid of the instructions wouldn't make people change their
usage habits.
I concur with Greg Smith's nearby comments that the way to go at this
is a stepwise process. It is only *after* there is a workable tool
that is a clear improvement on manual editing that you will have any
chance of getting people to move away from manual editing, or even
getting them to entertain any change proposals that make manual editing
less pleasant.
regards, tom lane
Magnus Hagander <magnus@hagander.net> writes:
2009/10/24 Simon Riggs <simon@2ndquadrant.com>:
Could we have a new directive in postgresql.conf that allows you to
specify an includedirectory? Like an include directive but for a whole
directory rather than just a file.
We could do it that way, but that would make the change bigger, not smaller :-P
I think we should have an explicit include-directory directive, and the
reason I think so is that it makes it fairly easy for the user to
control the relative precedence of the manual settings (presumed to
still be kept in postgresql.conf) and the automatic settings (presumed
to be in files in the directory). Manual settings before the include
are overridable, those after are not.
Did you look at the patch? That's basically what it does now, except
it doesn't add a parameter in postgresql.conf. If you leave the
pg_config directory empty, it will just parse the postgresql.conf file
just like before, and that's it. Only if you put something in the
pg_config directory will it load it, and only *after* it has loaded
the main configuration file.
That last is a deal-breaker for me; I do not want a hard wired
presumption that manual settings should be overridden.
regards, tom lane
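The precedence argument above can be sketched as follows. This is an illustration only: the directive name `includedir` is the one floated earlier in the thread, its syntax here is assumed, and `load_with_includedir` is a hypothetical helper with a naive line parser. Because the directory is expanded at the point where the directive appears, settings before it can be overridden by the included files and settings after it cannot.

```python
import os


def parse_line(line):
    """Return (name, value) for a 'name = value' line, else None.
    Naive on purpose: no quoting or escape handling."""
    line = line.split("#", 1)[0].strip()
    if "=" not in line:
        return None
    name, _, value = line.partition("=")
    return name.strip(), value.strip().strip("'")


def load_with_includedir(main_file):
    """Process the main file top to bottom, expanding each hypothetical
    "includedir 'path'" line in place; the last assignment seen wins."""
    settings = {}
    base = os.path.dirname(os.path.abspath(main_file))
    with open(main_file, encoding="utf-8") as fh:
        for raw in fh:
            stripped = raw.split("#", 1)[0].strip()
            if stripped.startswith("includedir "):
                directory = stripped[len("includedir "):].strip().strip("'")
                directory = os.path.join(base, directory)
                for entry in sorted(os.listdir(directory)):
                    path = os.path.join(directory, entry)
                    if not os.path.isfile(path):
                        continue
                    with open(path, encoding="utf-8") as inc:
                        for inc_line in inc:
                            parsed = parse_line(inc_line)
                            if parsed:
                                settings[parsed[0]] = parsed[1]
                continue
            parsed = parse_line(stripped)
            if parsed:
                settings[parsed[0]] = parsed[1]
    return settings
```

Moving the directive line up or down in the main file is then the only knob the user needs to decide which manual settings a tool may override.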
On Sat, Oct 24, 2009 at 9:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
I don't believe that the *ability* to have comments is the problem.
It wouldn't hurt anything to ship a file with a general comment block
at the top, with whatever content someone wants to put there. What
makes it impossible to machine-edit this file is that there is a
comment for every single setting, and that the "right place" to insert
a value for any particular setting is (at least in the default
configuration) marked by a comment which can be interpreted by humans
and not by a computer.
Right, but your mistake is in supposing that that statement has
something to do with the instructions. What it has to do with is
a style of usage that the instructions happen to exemplify --- but
getting rid of the instructions wouldn't make people change their
usage habits.
I don't really understand this. What usage habits do we need to
change? The problem is that people expect that the setting for some
GUC is going to be near the comment block that describes that GUC. If
the comment blocks describing those GUCs were gone, then that
expectation would be removed.
It's true that someone might take an empty default file and do
something like this:
# ok, so now i'm going to set work_mem to a ridiculous value
work_mem = '1GB'
And they might be surprised if some automated-config-file rewriter
ended up shuffling the setting to some location in the file that was
no longer close to the comment. But most people probably won't add
things like that if they're not already present, and even if they do
there probably won't be that many of them, and they'll get it sorted
out.
I concur with Greg Smith's nearby comments that the way to go at this
is a stepwise process. It is only *after* there is a workable tool
that is a clear improvement on manual editing that you will have any
chance of getting people to move away from manual editing, or even
getting them to entertain any change proposals that make manual editing
less pleasant.
Well, there are certainly config-tuning tools already. Just since I
started reading this list, there is pgtune; and I'm sure there are
others I don't know about. Coming up with the settings is the easy
part; getting them into the file is the hard part. There have also
been several requests for a SQL command that updates postgresql.conf,
which would not be very hard to write if we made it so that such a
command needn't care about the comments beyond preserving them, but
which is completely unworkable as things stand. I think it's
completely backwards to suppose that nobody has written the tools but
when they do we'll consider adjusting the file; rather, people have
done the best they can already given that the file is very difficult
to work with, and when/if it stops being so difficult, they'll likely
do more.
I realize that the current file format is an old and familiar friend;
it is for me, too. But I think it's standing in the way of progress.
Being able to type a SQL command to update postgresql.conf would be
more substantially convenient than logging in as root, using su to
become postgres, changing to the correct directory, starting up vi,
finding the right setting, editing it, quitting out, and requesting a
reload. And I deal with 1 PostgreSQL instance at a time, not tens or
hundreds or thousands.
...Robert
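To make the claim above concrete, here is a rough sketch of how small a rewriter can be when it does not need to care about comments beyond preserving them. The function name `set_setting` is hypothetical and the matching is deliberately simple: an existing uncommented assignment is replaced in place, otherwise the setting is appended, and every other line is passed through untouched. A trailing comment on the replaced line is lost, which is one of the surprises the message mentions.

```python
import re


def set_setting(text, name, value):
    """Return config text with `name` set to `value`.
    Comment lines (including commented-out defaults) are preserved."""
    pattern = re.compile(r"^\s*" + re.escape(name) + r"\s*=")
    new_line = "{} = '{}'".format(name, value)
    out = []
    replaced = False
    for line in text.splitlines():
        if pattern.match(line):
            # Keep one assignment only; later duplicates would otherwise
            # override the new value, since the last one read wins.
            if not replaced:
                out.append(new_line)
                replaced = True
            continue
        out.append(line)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"
```

What this cannot do is decide where a human would have wanted a new value to go in a file full of per-setting comment blocks, which is the hard part described above.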