Parsing of pg_hba.conf and authentication inconsistencies
The way pg_hba.conf is set up to be loaded/parsed now, we have:
postmaster: open and tokenize file (load_hba(), tokenize_file()).
backend: parse lines (parse_hba) and check for matches
(check_hba/hba_getauthmethod)
This means that the code in the postmaster is very simple, and it's
shared with the caching of the role and ident files.
It also means that we don't catch any syntax errors in the hba file
until a client connects. For example, if I misspell "local" on the first
line of the file (or just leave a bogus character on a line by mistake),
no client will be able to connect. But I can still do a pg_ctl reload
loading the broken file into the backend, thus making it impossible for
anybody to connect to the database. (when there's a broken line, we
won't continue to look at further lines in the file, obviously)
Is there any actual gain by not doing the parsing in the postmaster,
other than the fact that it's slightly less shared code with the other
two files? If not, then I'd like to move that parsing there for above
reasons.
I've also noticed that authentication methods error out in different
ways when they are not supported. For example, if I try to use Kerberos
without having it compiled in, I get an error when a client tries to
connect (because we compile in stub functions for the authentication
that just throw an error). But if I use pam, I get an "missing or
erroneous pg_hba.conf file" error (because we #ifdef out the entire
option all over the place). I'd like to make these consistent - but
which one of them do people prefer?
//Magnus
"Magnus Hagander" <magnus@hagander.net> writes:
I've also noticed that authentication methods error out in different
ways when they are not supported. For example, if I try to use Kerberos
without having it compiled in, I get an error when a client tries to
connect (because we compile in stub functions for the authentication
that just throw an error). But if I use pam, I get an "missing or
erroneous pg_hba.conf file" error (because we #ifdef out the entire
option all over the place). I'd like to make these consistent - but
which one of them do people prefer?
Generally I prefer having stub functions which error out because it means
other code doesn't need lots of ifdef's around the call sites.
However it would be nice to throw an error or at least a warning when parsing
the file instead of pretending everything's ok. Perhaps authentication methods
should have a function to check whether the method is supported which is
called when the file is parsed.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!
Magnus Hagander <magnus@hagander.net> writes:
Is there any actual gain by not doing the parsing in the postmaster,
Define "parsing". There's quite a lot of possible errors in pg_hba
that it would be totally unreasonable for the postmaster to detect.
We could catch some simple problems at file load time, perhaps,
but those usually aren't the ones that cause trouble for people.
On the whole, I am against putting any more functionality into the
main postmaster process than absolutely has to be there. Every
new function you put onto it is another potential source of
service-outage-inducing bugs.
I've also noticed that authentication methods error out in different
ways when they are not supported.
Yeah, that's something that should be made more consistent.
Idle thought: maybe what would really make sense here is a "lint"
for PG config files, which you'd run as a standalone program and
which would look for not only clear errors but questionable things
to warn about. For instance it might notice multiple pg_hba.conf
entries for the same IP addresses, check whether an LDAP server
can be connected to, check that all user/group/database names
used in the file actually exist, etc. These are things that we'd
certainly not put into any load- or reload-time tests.
regards, tom lane
Tom Lane wrote:
Idle thought: maybe what would really make sense here is a "lint"
for PG config files, which you'd run as a standalone program and
which would look for not only clear errors but questionable things
to warn about. For instance it might notice multiple pg_hba.conf
entries for the same IP addresses, check whether an LDAP server
can be connected to, check that all user/group/database names
used in the file actually exist, etc. These are things that we'd
certainly not put into any load- or reload-time tests.
I like this idea.
postgres --check-hba-file /path/to/hba.conf
postgres --check-conf-file /path/to/postgresql.conf
(I think it's better to reuse the same postmaster executable, because
that way it's easier to have the same parsing routines.)
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Magnus,
However it would be nice to throw an error or at least a warning when parsing
the file instead of pretending everything's ok. Perhaps authentication methods
should have a function to check whether the method is supported which is
called when the file is parsed.
The good way to solve this would be to have independant command line
utilities which check pg_hba.conf, pg_ident.conf and postgresql.conf for
errors. Then DBAs could run a check *before* restarting the server.
--Josh
Tom Lane wrote:
Magnus Hagander <magnus@hagander.net> writes:
Is there any actual gain by not doing the parsing in the postmaster,
Define "parsing". There's quite a lot of possible errors in pg_hba
that it would be totally unreasonable for the postmaster to detect.
Parsing as in turning into a struct with clearly defined parts. Like
what type it is (host/local/hostssl), CIDR mask, auth method and parameters.
We could catch some simple problems at file load time, perhaps,
but those usually aren't the ones that cause trouble for people.
It would catch things like typos, invalid CIDR address/mask and
specifying an auth method that doesn't exist. This is the far most
common errors I've seen - which ones are you referring to?
On the whole, I am against putting any more functionality into the
main postmaster process than absolutely has to be there. Every
new function you put onto it is another potential source of
service-outage-inducing bugs.
True.
But as a counterexample, we have a whole lot of code in there to do the
same for GUC. Which can even call user code (custom variables), no? Are
you also proposing we should look at getting rid of that?
I've also noticed that authentication methods error out in different
ways when they are not supported.Yeah, that's something that should be made more consistent.
Idle thought: maybe what would really make sense here is a "lint"
for PG config files, which you'd run as a standalone program and
which would look for not only clear errors but questionable things
to warn about. For instance it might notice multiple pg_hba.conf
entries for the same IP addresses, check whether an LDAP server
can be connected to, check that all user/group/database names
used in the file actually exist, etc. These are things that we'd
certainly not put into any load- or reload-time tests.
That would also be a valuable tool, but IMHO for a slightly different
purpose. To me that sounds more in the line of the tool to "tune/suggest
certain postgresql.conf parameters" that has been discussed earlier.
It would have to be implemented as a SQL callable function or so in
order to make it usable for people doing remote admin, but that could
certainly be done.
It would still leave a fairly large hole open for anybody editing the
config file and just HUPing the postmaster (which a whole lot of people
do, since they're used to doing that to their daemon processes)
//Magnus
Josh Berkus wrote:
Magnus,
However it would be nice to throw an error or at least a warning when
parsing
the file instead of pretending everything's ok. Perhaps authentication
methods
should have a function to check whether the method is supported which is
called when the file is parsed.The good way to solve this would be to have independant command line
utilities which check pg_hba.conf, pg_ident.conf and postgresql.conf for
errors. Then DBAs could run a check *before* restarting the server.
While clearly useful, it'd still leave the fairly large foot-gun that is
editing the hba file and HUPing things which can leave you with a
completely un-connectable database because of a small typo. The
difference in the "could run" vs "must run, thus runs automatically" part...
//Magnus
Alvaro Herrera <alvherre@commandprompt.com> writes:
Tom Lane wrote:
Idle thought: maybe what would really make sense here is a "lint"
for PG config files,
I like this idea.
postgres --check-hba-file /path/to/hba.conf
postgres --check-conf-file /path/to/postgresql.conf
(I think it's better to reuse the same postmaster executable, because
that way it's easier to have the same parsing routines.)
I'd go for just
postgres --check-config -D $PGDATA
(In a reload scenario, you'd edit the files in-place and then do this
before issuing SIGHUP.)
Reasons:
1. Easier to use: one command gets you all the checks, and you can't
accidentally forget to check the one config file that's gonna give
you problems.
2. An in-place check is the only way to be sure that, for instance,
relative-path 'include' directives are okay.
3. To implement the suggested check on role/database validity, the code
is going to need to pull in the flat-file copies of pg_database etc.
(Or try to contact a live postmaster, but that won't work when you don't
have one.) So it needs access to $PGDATA in any case.
4. There might be usefulness in cross-checking different config files,
so examining a single one out of context just seems like the wrong
mindset.
regards, tom lane
Magnus Hagander <magnus@hagander.net> writes:
The good way to solve this would be to have independant command line
utilities which check pg_hba.conf, pg_ident.conf and postgresql.conf for
errors. Then DBAs could run a check *before* restarting the server.
While clearly useful, it'd still leave the fairly large foot-gun that is
editing the hba file and HUPing things which can leave you with a
completely un-connectable database because of a small typo.
That will *always* be possible, just because software is finite and
human foolishness is not ;-).
Now, we could ameliorate it a bit given a "postgres --check-config"
mode by having pg_ctl automatically run that mode before any start,
restart, or reload command, and then refusing to proceed if the check
detects any indubitable errors. On the other hand, that would leave
us with the scenario where the checking code warns about stuff that it
can't be sure is wrong, but then we go ahead and install the borked
config anyway. (Nobody is going to put up with code that refuses
to install config settings that aren't 100% clean, unless the checks
are so weak that they miss a lot of possibly-useful warnings.)
Seems a lot better to me to just train people to run the check-config
code by hand before pulling the trigger to load the settings for real.
regards, tom lane
Alvaro Herrera wrote:
Tom Lane wrote:
Idle thought: maybe what would really make sense here is a "lint"
for PG config files, which you'd run as a standalone program and
which would look for not only clear errors but questionable things
to warn about. For instance it might notice multiple pg_hba.conf
entries for the same IP addresses, check whether an LDAP server
can be connected to, check that all user/group/database names
used in the file actually exist, etc. These are things that we'd
certainly not put into any load- or reload-time tests.I like this idea.
postgres --check-hba-file /path/to/hba.conf
postgres --check-conf-file /path/to/postgresql.conf(I think it's better to reuse the same postmaster executable, because
that way it's easier to have the same parsing routines.)
Change that to pg_ctl and you have a deal :)
Joshua D. Drake
Tom Lane wrote:
Alvaro Herrera <alvherre@commandprompt.com> writes:
Tom Lane wrote:
Idle thought: maybe what would really make sense here is a "lint"
for PG config files,I like this idea.
postgres --check-hba-file /path/to/hba.conf
postgres --check-conf-file /path/to/postgresql.conf(I think it's better to reuse the same postmaster executable, because
that way it's easier to have the same parsing routines.)I'd go for just
postgres --check-config -D $PGDATA
(In a reload scenario, you'd edit the files in-place and then do this
before issuing SIGHUP.)regards, tom lane
Doesn't it seem reasonable that it should be pg_ctl? You should never
run postgres directly unless it is for DR.
Joshua D. Drake
Tom,
Seems a lot better to me to just train people to run the check-config
code by hand before pulling the trigger to load the settings for real.
Or just have pg_ctl --check-first as an option. Cautious DBAs would use
that. In 2-3 versions, we can make --check-first the default, and
require the option --dont-check for unattended restarts.
--Josh
Josh Berkus wrote:
Tom,
Seems a lot better to me to just train people to run the check-config
code by hand before pulling the trigger to load the settings for real.Or just have pg_ctl --check-first as an option. Cautious DBAs would use
that. In 2-3 versions, we can make --check-first the default, and
require the option --dont-check for unattended restarts.
If the config file is known to be broken, why should you *ever* allow it
to try to restart it?
//Magnus
Joshua D. Drake wrote:
Tom Lane wrote:
I'd go for just
postgres --check-config -D $PGDATA
(In a reload scenario, you'd edit the files in-place and then do this
before issuing SIGHUP.)
Sounds good.
Doesn't it seem reasonable that it should be pg_ctl? You should never
run postgres directly unless it is for DR.
What on earth is DR?
The problem with pg_ctl is that it's indirectly calling postgres, and it
doesn't have a lot of a way to know what happened after calling it;
consider the mess we have with pg_ctl -w.
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera wrote:
Joshua D. Drake wrote:
Tom Lane wrote:
Doesn't it seem reasonable that it should be pg_ctl? You should never
run postgres directly unless it is for DR.What on earth is DR?
Disaster Recovery.
The problem with pg_ctl is that it's indirectly calling postgres, and it
doesn't have a lot of a way to know what happened after calling it;
consider the mess we have with pg_ctl -w.
True enough but perhaps that is a problem in itself. IMO, we should be
encouraging people to never touch the postgres binary. If that means
pg_ctl becomes a lot smarter, then we have to consider that as well.
Comparatively if I do a apachectl configtest it tells me if it is
correct. Now I assume it is actually calling httpd (I haven't checked)
but the point is, I am not calling httpd.
Sincerely,
Joshua D. Drake
Joshua D. Drake wrote:
Alvaro Herrera wrote:
The problem with pg_ctl is that it's indirectly calling postgres, and it
doesn't have a lot of a way to know what happened after calling it;
consider the mess we have with pg_ctl -w.True enough but perhaps that is a problem in itself. IMO, we should be
encouraging people to never touch the postgres binary. If that means
pg_ctl becomes a lot smarter, then we have to consider that as well.
That's a great idea, but I don't think we should weigh down this idea of
config checking with that responsability. We could discuss solutions to
this problem, but I think it's going to be quite a lot harder than you
seem to think.
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Saturday 02 August 2008 13:36:40 Magnus Hagander wrote:
Josh Berkus wrote:
Tom,
Seems a lot better to me to just train people to run the check-config
code by hand before pulling the trigger to load the settings for real.Or just have pg_ctl --check-first as an option. Cautious DBAs would use
that. In 2-3 versions, we can make --check-first the default, and
require the option --dont-check for unattended restarts.If the config file is known to be broken, why should you *ever* allow it
to try to restart it?
Certainly there isn't any reason to allow a reload of a file that is just
going to break things when the first connection happens. For that matter,
why would we ever not want to parse it at HUP time rather than connect time?
--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
"Joshua D. Drake" <jd@commandprompt.com> writes:
True enough but perhaps that is a problem in itself. IMO, we should be
encouraging people to never touch the postgres binary.
I don't buy that at all. pg_ctl is useful for some people and not so
useful for others; in particular, from the perspective of a system
startup script it tends to get in the way more than it helps.
regards, tom lane
Robert Treat <xzilla@users.sourceforge.net> writes:
Certainly there isn't any reason to allow a reload of a file that is just
going to break things when the first connection happens. For that matter,
why would we ever not want to parse it at HUP time rather than connect time?
Two or three reasons why not were already mentioned upthread, but for
the stubborn, here's another one: are you volunteering to write the code
that backs out the config-file reload after the checks have determined
it was bad? Given the amount of pain we suffered trying to make GUC do
something similar, any sane person would run screaming from the
prospect.
regards, tom lane
Tom Lane wrote:
Robert Treat <xzilla@users.sourceforge.net> writes:
Certainly there isn't any reason to allow a reload of a file that is just
going to break things when the first connection happens. For that matter,
why would we ever not want to parse it at HUP time rather than connect time?Two or three reasons why not were already mentioned upthread, but for
the stubborn, here's another one: are you volunteering to write the code
that backs out the config-file reload after the checks have determined
it was bad? Given the amount of pain we suffered trying to make GUC do
something similar, any sane person would run screaming from the
prospect.
For pg_hba.conf, I don't see that as a very big problem, really. It
doesn't (and shouldn't) modify any "external" variables, so it should be
as simple as parsing the new file into a completely separate
list-of-structs and only if it's all correct switch the main pointer
(and free the old struct).
Yes, I still think we should do the "simple parsing" step at HUP time.
That doesn't mean that it wouldn't be a good idea to have one of these
check-config options that can look for conflicting options *as well*, of
course. But I'm getting the feeling I'm on the losing side of the debate
here...
//Magnus