proposal: a validator for configuration files
Hi,
I'd like to add an ability to validate the corectness of PostgreSQL
configuration files, i.e. postgresql.conf, pg_hba.conf, pg_ident.conf without
actually applying them. The idea is that it would be a command-line option to
postgres for a stand-alone case, or a user-callable function when postmaster
is running.
Per the former discussion of a validator for PostgreSQL configuration files
(see http://archives.postgresql.org/pgsql-hackers/2008-08/msg00048.php),
here's an implementation proposal. The development plan consists of 2 parts.
The first one is to add new code that would allow running the checks in both a
stand-alone process, when postmaster is not running, and as a function call
from a backend postgres process. Initially the code would simply loads
configuration files, without performing any of the validation checks. The
second part consists of adding specific checks.
I think most of the code related to this feature should be put into the
auxiliary process. The rationale is that we already have a stand-alone
CheckerProcess, which nowadays only parses postgresql.conf, runs BaseInit and
exists. We can easily load pg_hba.conf and pg_ident.conf and run all the
necessary checks there. Moreover, the same process can be used when checks are
launched from a built-in function. In that case, it would save the postgres
backend from reloading postgresql.conf, pg_hba.conf and pg_ident.conf
internally and restoring the previous configuration options when the function
exists. Below is a more detailed description of implementation steps:
1.1. Stand-alone process (postmaster is not running):
- Add a new option (--check-config) to main.c. Run AuxiliaryProcessMain with
auxType= CheckerProcess if this option is specified.
- In CheckerModeMain load pg_hba.conf, pg_ident.conf
1.2. Built-in function, called from a backend process.
- Add a new built-in function
- Add LastCheckState shared flag to denote whether the check has been
successful or failed
- Add a new PMSignalReason
- From the function call SendPostmasterSignal with the reason added on the
previous step.
- In the postmaster add a check for this reason in sigusr1_handler, spawning
a checker process. Set LastCheckStatus to InProgress.
- Store current configuration options in AuxillaryProcessMain before reloading
configuration files to avoid checking for options that haven't changed.
- In AuxillaryProcessMain, modify SelectConfigFiles invocation to call it
if IsUnderPostmaster = true if auxType process type is CheckerProcess.
- In the CheckerModeMain, set LastCheckState to either success or failure at the
end of all checks.
- The built-in function would wait until LastCheckState changes its value from
InProgress to either Succcess or Failure, or until some predefined timeout
expires, in which case it would emit an error.
2. Add following configuration checks:
postgresql.conf:
Check that:
1. postgres can bind to listen address:port
2. unix_socket_directory has proper permissions (n/a on Windows).
3. unix_socket_group group exists on a target system (n/a Windows).
4. unix_socket_permissions are valid (write permission is not revoked from
the owner, or a group, if group is different from the user).
5. server.crt and server.key exist in the data directory if ssl is
set to true (and server is compiled with openssl)
6. krb_server_keyfile is readable by the server (if set)
7. server is not requesting more shared memory than it's available in the system.
8. shared_preload_libraries and local_preload_libraries exist.
9. synchronous_standby_names are not empty and max_wal_senders
are > 0 if synchronous_replication = true
10. all enable_* query planner parameters are on.
11. seq_page_cost <= random_page_cost, and cpu_ costs are lower than page_ costs.
12. effective_cache_size is less than the amount of physical memory available.
13. geqo is turned on
14. default_statistics_target is >= 100
15. logging collector is enabled if log destination is stderr
16. log directory either exists and writable or that the parent
directory allows creating subdris
17. track counts is on if autovacuum is enabled
18. stats_temp_directory is writeable
19. default tablespace exists (if set)
20. temp_tablespace exists (if set)
21. statement_timeout is 0 (disabled)
22. dynamic_library_path exists
23. sql_inheritance is off
24. zero_damaged_pages is off
pg_hba.conf:
Check that:
1. the server is compiled with ssl if hostssl is specified
2. ssl is not required if hostnossl is specified
- Add a Warning value to LastCheckState for cases when configuration
files were loaded successfully, but one or more validation checks have failed.
Note that these are basic checks don't require the checker process to have
access to the database, Also, with this implementation, a client won't receive
an exact error message in case validation is unsuccessful, however, it would
receive a status (success, failure, warnings), and an exact error message
would be available in the server's log. It's possible to address these
shortcomings in the future.
Ideas, suggestions are welcome.
--
Alexey Klyukin
The PostgreSQL Company - Command Prompt, Inc.
Hi!
On Wed, Mar 30, 2011 at 8:40 AM, Alexey Klyukin <alexk@commandprompt.com> wrote:
I'd like to add an ability to validate the corectness of PostgreSQL
configuration files, i.e. postgresql.conf, pg_hba.conf, pg_ident.conf without
actually applying them. The idea is that it would be a command-line option to
postgres for a stand-alone case, or a user-callable function when postmaster
is running.Per the former discussion of a validator for PostgreSQL configuration files
(see http://archives.postgresql.org/pgsql-hackers/2008-08/msg00048.php),
here's an implementation proposal.
I did a little bit of work on this, and we discussed it here:
http://archives.postgresql.org/pgsql-hackers/2009-03/msg00345.php
http://archives.postgresql.org/pgsql-hackers/2009-03/msg01142.php
Probably there's a bit of bitrot in there.
The development plan consists of 2 parts.
The first one is to add new code that would allow running the checks in both a
stand-alone process, when postmaster is not running, and as a function call
from a backend postgres process. Initially the code would simply loads
configuration files, without performing any of the validation checks. The
second part consists of adding specific checks.
Cool! Mine was only going to work if the system started up or was reloaded.
I think most of the code related to this feature should be put into the
auxiliary process. The rationale is that we already have a stand-alone
CheckerProcess, which nowadays only parses postgresql.conf, runs BaseInit and
exists. We can easily load pg_hba.conf and pg_ident.conf and run all the
necessary checks there. Moreover, the same process can be used when checks are
launched from a built-in function. In that case, it would save the postgres
backend from reloading postgresql.conf, pg_hba.conf and pg_ident.conf
internally and restoring the previous configuration options when the function
exists. Below is a more detailed description of implementation steps:1.1. Stand-alone process (postmaster is not running):
- Add a new option (--check-config) to main.c. Run AuxiliaryProcessMain with
auxType= CheckerProcess if this option is specified.
- In CheckerModeMain load pg_hba.conf, pg_ident.conf1.2. Built-in function, called from a backend process.
- Add a new built-in function
- Add LastCheckState shared flag to denote whether the check has been
successful or failed
- Add a new PMSignalReason
- From the function call SendPostmasterSignal with the reason added on the
previous step.
- In the postmaster add a check for this reason in sigusr1_handler, spawning
a checker process. Set LastCheckStatus to InProgress.
- Store current configuration options in AuxillaryProcessMain before reloading
configuration files to avoid checking for options that haven't changed.
- In AuxillaryProcessMain, modify SelectConfigFiles invocation to call it
if IsUnderPostmaster = true if auxType process type is CheckerProcess.
- In the CheckerModeMain, set LastCheckState to either success or failure at the
end of all checks.
- The built-in function would wait until LastCheckState changes its value from
InProgress to either Succcess or Failure, or until some predefined timeout
expires, in which case it would emit an error.2. Add following configuration checks:
postgresql.conf:
Check that:
1. postgres can bind to listen address:port
2. unix_socket_directory has proper permissions (n/a on Windows).
3. unix_socket_group group exists on a target system (n/a Windows).
4. unix_socket_permissions are valid (write permission is not revoked from
the owner, or a group, if group is different from the user).
5. server.crt and server.key exist in the data directory if ssl is
set to true (and server is compiled with openssl)
6. krb_server_keyfile is readable by the server (if set)
7. server is not requesting more shared memory than it's available in the system.
8. shared_preload_libraries and local_preload_libraries exist.
9. synchronous_standby_names are not empty and max_wal_senders
are > 0 if synchronous_replication = true
10. all enable_* query planner parameters are on.
11. seq_page_cost <= random_page_cost, and cpu_ costs are lower than page_ costs.
12. effective_cache_size is less than the amount of physical memory available.
13. geqo is turned on
14. default_statistics_target is >= 100
15. logging collector is enabled if log destination is stderr
16. log directory either exists and writable or that the parent
directory allows creating subdris
17. track counts is on if autovacuum is enabled
18. stats_temp_directory is writeable
19. default tablespace exists (if set)
20. temp_tablespace exists (if set)
21. statement_timeout is 0 (disabled)
22. dynamic_library_path exists
23. sql_inheritance is off
24. zero_damaged_pages is off
Some of this seems not like things that postgres itself should be
doing, but more something that an external validation program (like
what Greg Smith maintains) would do.
pg_hba.conf:
Check that:
1. the server is compiled with ssl if hostssl is specified
2. ssl is not required if hostnossl is specified- Add a Warning value to LastCheckState for cases when configuration
files were loaded successfully, but one or more validation checks have failed.Note that these are basic checks don't require the checker process to have
access to the database, Also, with this implementation, a client won't receive
an exact error message in case validation is unsuccessful, however, it would
receive a status (success, failure, warnings), and an exact error message
would be available in the server's log. It's possible to address these
shortcomings in the future.Ideas, suggestions are welcome.
As I said above, some of what you've suggested seems more like a
non-postgres core thing.. maybe an extension? Or maybe offer the
option to read
My idea was to just check that settings were *valid* not that they met
some other, more subjective criteria.
-selena
Hi Selena,
On Mar 30, 2011, at 11:42 PM, Selena Deckelmann wrote:
Hi!
On Wed, Mar 30, 2011 at 8:40 AM, Alexey Klyukin <alexk@commandprompt.com> wrote:
I did a little bit of work on this, and we discussed it here:
http://archives.postgresql.org/pgsql-hackers/2009-03/msg00345.php
http://archives.postgresql.org/pgsql-hackers/2009-03/msg01142.phpProbably there's a bit of bitrot in there.
Cool, I was not aware of your work in this direction. I've updated your patch
to apply to the latest HEAD, implementing Tom Lane's suggestions (attached). I
think I'll implement the other part (reporting all invalid parameters, as
opposed to only the first one) tomorrow.
The development plan consists of 2 parts.
The first one is to add new code that would allow running the checks in both a
stand-alone process, when postmaster is not running, and as a function call
from a backend postgres process. Initially the code would simply loads
configuration files, without performing any of the validation checks. The
second part consists of adding specific checks.Cool! Mine was only going to work if the system started up or was reloaded.
Well, I think a stand-alone check is an easy part :)
As I said above, some of what you've suggested seems more like a
non-postgres core thing.. maybe an extension? Or maybe offer the
option to read
Well, initially I'm going to start with just a check that configuration files
are valid, and add other checks afterwards. I think it makes sense for them
to be optional.
My idea was to just check that settings were *valid* not that they met
some other, more subjective criteria.
Well, my definition of valid configuration is not only the one that server
is able to parse and load, but also to actually apply (i.e. can bind to
a listen_address or read SSL certificate files). I agree that's not always
necessary (i.e. when checking configuration on a different server than
the one it should be applied to), so we can add a flag to turn them off.
--
Alexey Klyukin
The PostgreSQL Company - Command Prompt, Inc.
Attachments:
parser_continue_on_errors.diffapplication/octet-stream; name=parser_continue_on_errors.diffDownload+42-32
On Apr 1, 2011, at 12:08 AM, Alexey Klyukin wrote:
Hi Selena,
On Mar 30, 2011, at 11:42 PM, Selena Deckelmann wrote:
Hi!
On Wed, Mar 30, 2011 at 8:40 AM, Alexey Klyukin <alexk@commandprompt.com> wrote:
I did a little bit of work on this, and we discussed it here:
http://archives.postgresql.org/pgsql-hackers/2009-03/msg00345.php
http://archives.postgresql.org/pgsql-hackers/2009-03/msg01142.phpProbably there's a bit of bitrot in there.
Cool, I was not aware of your work in this direction. I've updated your patch
to apply to the latest HEAD, implementing Tom Lane's suggestions (attached). I
think I'll implement the other part (reporting all invalid parameters, as
opposed to only the first one) tomorrow.
Here's the update of Selena's patch, which also shows all errors in
configuration parameters (as well as parser errors) during reload.
When I talked to Alvaro the other day he suggested starting with a stand-alone
process, which would load the postgresql.conf in a postmaster context, load
other configuration files and do some of the checks I've mentioned in my
initial proposal (particularly it would check that system's shared memory
limit is high enough by trying to allocate a new shared memory segment).
Afterwards, I'd like to implement checks from a user-callable function, and
not all checks would be available from it.
--
Alexey Klyukin
The PostgreSQL Company - Command Prompt, Inc.
Attachments:
guc-file.diffapplication/octet-stream; name=guc-file.diffDownload+57-48
On Mon, Apr 4, 2011 at 2:03 PM, Alexey Klyukin <alexk@commandprompt.com> wrote:
Here's the update of Selena's patch, which also shows all errors in
configuration parameters (as well as parser errors) during reload.
You should add this here:
https://commitfest.postgresql.org/action/commitfest_view/open
On a quick glance, this patch appears to contain some superfluous
hunks where you changed whitespace or variable names. You might want
to remove those and repost before adding to the CF app. Also, some
submission notes would be very helpful - when you send in the revised
version, detail in the email the exact purpose of the changes so that
someone can review the patch without having to read this thread and
all preceding threads in their entirety.
Thanks,
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hi,
On Apr 14, 2011, at 9:50 PM, Robert Haas wrote:
On Mon, Apr 4, 2011 at 2:03 PM, Alexey Klyukin <alexk@commandprompt.com> wrote:
Here's the update of Selena's patch, which also shows all errors in
configuration parameters (as well as parser errors) during reload.You should add this here:
https://commitfest.postgresql.org/action/commitfest_view/open
On a quick glance, this patch appears to contain some superfluous
hunks where you changed whitespace or variable names. You might want
to remove those and repost before adding to the CF app. Also, some
submission notes would be very helpful - when you send in the revised
version, detail in the email the exact purpose of the changes so that
someone can review the patch without having to read this thread and
all preceding threads in their entirety.
Thank you for the feedback, I've updated the patch, attached is a new version.
I'll add it to the commitfest after posting this message.
The patch forces the parser to report all errors (max 100) from the
ProcessConfigFile/ParseConfigFp. Currently, only the first parse error or an
invalid directive is reported. Reporting all of them is crucial to automatic
validation of postgres config files.
This patch is based on the one submitted earlier by Selena Deckelmann:
http://archives.postgresql.org/pgsql-hackers/2009-03/msg00345.php
It incorporates suggestions by Tom Lane for avoiding excessive bloat in logs
in case there is a junk instead of postgresql.conf.
http://archives.postgresql.org/pgsql-hackers/2009-03/msg01142.php
Regards,
--
Alexey Klyukin
The PostgreSQL Company - Command Prompt, Inc.
Attachments:
parser_continue_on_errors.diffapplication/octet-stream; name=parser_continue_on_errors.diffDownload+46-43
Hi
On May14, 2011, at 00:49 , Alexey Klyukin wrote:
The patch forces the parser to report all errors (max 100) from the
ProcessConfigFile/ParseConfigFp. Currently, only the first parse error or an
invalid directive is reported. Reporting all of them is crucial to automatic
validation of postgres config files.This patch is based on the one submitted earlier by Selena Deckelmann:
http://archives.postgresql.org/pgsql-hackers/2009-03/msg00345.phpIt incorporates suggestions by Tom Lane for avoiding excessive bloat in logs
in case there is a junk instead of postgresql.conf.
http://archives.postgresql.org/pgsql-hackers/2009-03/msg01142.php
Here's my review of your patch.
The patch is in context diff format and applies cleanly to HEAD. It doesn't
contain superfluous whitespace changes any more is and quite readable.
First, the behaviour.
The first problem I ran into when I tried to test this is that it *only*
reports multiple errors during config file reload on SIHUP, not during
postmaster startup. I guess it's been done that way because we
ereport(ERROR,..) not ereport(LOG,...) during postmaster startup, so it's
not immediatly clear how to report multiple errors. But that proplem
seems solvable. What if you ereport(LOG,..)ed the individual errors during
postmaster startup, and then emitted an ereport(ERROR) at the end if
errors occurred? The ERROR could either repeat the first error that was
encountered, or simply say "config file contains errors".
Now to the code.
I see that you basically replaced "goto cleanup..." in both ParseConfigFp()
and ProcessConfigFile() with "++errorcount", and arranged for ParseConfigFp()
to return false, and for ProcessConfigFile() to skip the GUC updates if
"errorcount > 0". The actual value of errorcount is never inspected. The value
is also wrong in the case of include files containing more than error, since
ParseConfigFp() simply increments errorcount by one for each failed
ParseConfigFile() of an included file.
I thus suggest that you replace "errorcount" with a boolean "erroroccurred".
You've also introduced a bug in ParseConfigFp(). Previously, if an included
file contained an error, it did "goto cleanup_exit()" and thus didn't
ereport() on it's own. With your patch applied it ereport()s a parse error
at the location of the "include" directive, which seems wrong.
Finally, I believe that ParseConfigFp() should make at least some effort to
resync after hitting a parser error. I suggest that you simply fast-forward
to the next GUC_EOL token after reporting a parser error.
best regards,
Florian Pflug
Florian,
On Jun 16, 2011, at 2:34 PM, Florian Pflug wrote:
Hi
On May14, 2011, at 00:49 , Alexey Klyukin wrote:
The patch forces the parser to report all errors (max 100) from the
ProcessConfigFile/ParseConfigFp. Currently, only the first parse error or an
invalid directive is reported. Reporting all of them is crucial to automatic
validation of postgres config files.This patch is based on the one submitted earlier by Selena Deckelmann:
http://archives.postgresql.org/pgsql-hackers/2009-03/msg00345.phpIt incorporates suggestions by Tom Lane for avoiding excessive bloat in logs
in case there is a junk instead of postgresql.conf.
http://archives.postgresql.org/pgsql-hackers/2009-03/msg01142.phpHere's my review of your patch.
The patch is in context diff format and applies cleanly to HEAD. It doesn't
contain superfluous whitespace changes any more is and quite readable.First, the behaviour.
The first problem I ran into when I tried to test this is that it *only*
reports multiple errors during config file reload on SIHUP, not during
postmaster startup. I guess it's been done that way because we
ereport(ERROR,..) not ereport(LOG,...) during postmaster startup, so it's
not immediatly clear how to report multiple errors. But that proplem
seems solvable. What if you ereport(LOG,..)ed the individual errors during
postmaster startup, and then emitted an ereport(ERROR) at the end if
errors occurred? The ERROR could either repeat the first error that was
encountered, or simply say "config file contains errors".
Makes sense. One problem I came across is that set_config_option from guc.c
sets the elevel by itself. I've changed it to emit LOG errors on PGC_S_FILE
source, apparently all the callers of this function with this source are from
guc-file.l, so hopefully I won't break anything with this change.
Now to the code.
I see that you basically replaced "goto cleanup..." in both ParseConfigFp()
and ProcessConfigFile() with "++errorcount", and arranged for ParseConfigFp()
to return false, and for ProcessConfigFile() to skip the GUC updates if
"errorcount > 0". The actual value of errorcount is never inspected. The value
is also wrong in the case of include files containing more than error, since
ParseConfigFp() simply increments errorcount by one for each failed
ParseConfigFile() of an included file.I thus suggest that you replace "errorcount" with a boolean "erroroccurred".
I can actually pass errorcount down from the ParseConfigFp() to report the total
number of errors (w/ the include files) at the end of ProcessConfigFile if there is
any interest in having the number of errors reported to a user. If not - I'll change
it to boolean.
You've also introduced a bug in ParseConfigFp(). Previously, if an included
file contained an error, it did "goto cleanup_exit()" and thus didn't
ereport() on it's own. With your patch applied it ereport()s a parse error
at the location of the "include" directive, which seems wrong.
Right, I noticed that I skipped switching the buffers and restoring the Lineno
as well. I'll fix it in the next revision.
Finally, I believe that ParseConfigFp() should make at least some effort to
resync after hitting a parser error. I suggest that you simply fast-forward
to the next GUC_EOL token after reporting a parser error.
Makes sense.
Thank you for the review.
I'll post an updated patch, addressing you concerns, shortly.
Alexey.
--
Command Prompt, Inc. http://www.CommandPrompt.com
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Jun16, 2011, at 17:23 , Alexey Klyukin wrote:
On Jun 16, 2011, at 2:34 PM, Florian Pflug wrote:
The first problem I ran into when I tried to test this is that it *only*
reports multiple errors during config file reload on SIHUP, not during
postmaster startup. I guess it's been done that way because we
ereport(ERROR,..) not ereport(LOG,...) during postmaster startup, so it's
not immediatly clear how to report multiple errors. But that proplem
seems solvable. What if you ereport(LOG,..)ed the individual errors during
postmaster startup, and then emitted an ereport(ERROR) at the end if
errors occurred? The ERROR could either repeat the first error that was
encountered, or simply say "config file contains errors".Makes sense. One problem I came across is that set_config_option from guc.c
sets the elevel by itself. I've changed it to emit LOG errors on PGC_S_FILE
source, apparently all the callers of this function with this source are from
guc-file.l, so hopefully I won't break anything with this change.
Hm, wouldn't a test for "context == PGC_POSTMASTER" be more appropriate?
I see that you basically replaced "goto cleanup..." in both ParseConfigFp()
and ProcessConfigFile() with "++errorcount", and arranged for ParseConfigFp()
to return false, and for ProcessConfigFile() to skip the GUC updates if
"errorcount > 0". The actual value of errorcount is never inspected. The value
is also wrong in the case of include files containing more than error, since
ParseConfigFp() simply increments errorcount by one for each failed
ParseConfigFile() of an included file.I thus suggest that you replace "errorcount" with a boolean "erroroccurred".
I can actually pass errorcount down from the ParseConfigFp() to report the total
number of errors (w/ the include files) at the end of ProcessConfigFile if there is
any interest in having the number of errors reported to a user. If not - I'll change
it to boolean.
Nah, just use a boolean, unless you have concrete plans to actually use the errorcount
for something other than test a la "errorcount > 0".
best regards,
Florian Pflug
On Jun 16, 2011, at 6:49 PM, Florian Pflug wrote:
On Jun16, 2011, at 17:23 , Alexey Klyukin wrote:
On Jun 16, 2011, at 2:34 PM, Florian Pflug wrote:
The first problem I ran into when I tried to test this is that it *only*
reports multiple errors during config file reload on SIHUP, not during
postmaster startup. I guess it's been done that way because we
ereport(ERROR,..) not ereport(LOG,...) during postmaster startup, so it's
not immediatly clear how to report multiple errors. But that proplem
seems solvable. What if you ereport(LOG,..)ed the individual errors during
postmaster startup, and then emitted an ereport(ERROR) at the end if
errors occurred? The ERROR could either repeat the first error that was
encountered, or simply say "config file contains errors".Makes sense. One problem I came across is that set_config_option from guc.c
sets the elevel by itself. I've changed it to emit LOG errors on PGC_S_FILE
source, apparently all the callers of this function with this source are from
guc-file.l, so hopefully I won't break anything with this change.Hm, wouldn't a test for "context == PGC_POSTMASTER" be more appropriate?
In such a case the errors caused by command-line arguments won't stop the postmaster.
PGC_S_FILE seems to handle this correctly. I'm not sure whether it is appropriate to use
there though.
I see that you basically replaced "goto cleanup..." in both ParseConfigFp()
and ProcessConfigFile() with "++errorcount", and arranged for ParseConfigFp()
to return false, and for ProcessConfigFile() to skip the GUC updates if
"errorcount > 0". The actual value of errorcount is never inspected. The value
is also wrong in the case of include files containing more than error, since
ParseConfigFp() simply increments errorcount by one for each failed
ParseConfigFile() of an included file.I thus suggest that you replace "errorcount" with a boolean "erroroccurred".
I can actually pass errorcount down from the ParseConfigFp() to report the total
number of errors (w/ the include files) at the end of ProcessConfigFile if there is
any interest in having the number of errors reported to a user. If not - I'll change
it to boolean.Nah, just use a boolean, unless you have concrete plans to actually use the errorcount
for something other than test a la "errorcount > 0".
I just recalled a reason for counting the total number of errors. There is a condition that
checks that the total number of errors is less than 100 and bails out if it's more than that
(100 is arbitrary). The reason is to avoid bloating the logs w/ something totally unrelated
to postgresql.conf. That was suggested by Tom Lane here:
http://archives.postgresql.org/pgsql-hackers/2009-03/msg01142.php
Thank you,
Alexey.
--
Command Prompt, Inc. http://www.CommandPrompt.com
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Jun16, 2011, at 18:46 , Alexey Klyukin wrote:
On Jun 16, 2011, at 6:49 PM, Florian Pflug wrote:
Hm, wouldn't a test for "context == PGC_POSTMASTER" be more appropriate?
In such a case the errors caused by command-line arguments won't stop the postmaster.
PGC_S_FILE seems to handle this correctly. I'm not sure whether it is appropriate to use
there though.
Ah, yeah, you're right. PGC_S_FILE sounds fine, then. I guess this means you can
drop the check for "context == PGC_SIGHUP" though, because surely the source must
be PGC_S_DEFAULT or PGC_S_FILE if context == PGC_SIGHUP, right? So the check would
become
if (source == PGC_S_FILE || source == PGC_S_DEFAULT)
where it now says
if (context == PGC_SIGHUP || source == PGC_S_DEFAULT)
I see that you basically replaced "goto cleanup..." in both ParseConfigFp()
and ProcessConfigFile() with "++errorcount", and arranged for ParseConfigFp()
to return false, and for ProcessConfigFile() to skip the GUC updates if
"errorcount > 0". The actual value of errorcount is never inspected. The value
is also wrong in the case of include files containing more than error, since
ParseConfigFp() simply increments errorcount by one for each failed
ParseConfigFile() of an included file.I thus suggest that you replace "errorcount" with a boolean "erroroccurred".
I can actually pass errorcount down from the ParseConfigFp() to report the total
number of errors (w/ the include files) at the end of ProcessConfigFile if there is
any interest in having the number of errors reported to a user. If not - I'll change
it to boolean.Nah, just use a boolean, unless you have concrete plans to actually use the errorcount
for something other than test a la "errorcount > 0".I just recalled a reason for counting the total number of errors. There is a condition that
checks that the total number of errors is less than 100 and bails out if it's more than that
(100 is arbitrary). The reason is to avoid bloating the logs w/ something totally unrelated
to postgresql.conf. That was suggested by Tom Lane here:
http://archives.postgresql.org/pgsql-hackers/2009-03/msg01142.php
Ah, right, I missed that. Guess it'll have to stay a counter, then. Still, I don't think it's
worth the effort to make the count correct in case of included files, so I'd say just add
a comment explaining that the count isn't totally accurate.
best regards,
Florian Pflug
On Jun 16, 2011, at 8:01 PM, Florian Pflug wrote:
On Jun16, 2011, at 18:46 , Alexey Klyukin wrote:
On Jun 16, 2011, at 6:49 PM, Florian Pflug wrote:
Hm, wouldn't a test for "context == PGC_POSTMASTER" be more appropriate?
In such a case the errors caused by command-line arguments won't stop the postmaster.
PGC_S_FILE seems to handle this correctly. I'm not sure whether it is appropriate to use
there though.Ah, yeah, you're right. PGC_S_FILE sounds fine, then. I guess this means you can
drop the check for "context == PGC_SIGHUP" though, because surely the source must
be PGC_S_DEFAULT or PGC_S_FILE if context == PGC_SIGHUP, right? So the check would
become
if (source == PGC_S_FILE || source == PGC_S_DEFAULT)
where it now says
if (context == PGC_SIGHUP || source == PGC_S_DEFAULT)
Yes, AFAIK PGC_SIGHUP is redundant, thank you for the suggestion.
I just recalled a reason for counting the total number of errors. There is a condition that
checks that the total number of errors is less than 100 and bails out if it's more than that
(100 is arbitrary). The reason is to avoid bloating the logs w/ something totally unrelated
to postgresql.conf. That was suggested by Tom Lane here:
http://archives.postgresql.org/pgsql-hackers/2009-03/msg01142.phpAh, right, I missed that. Guess it'll have to stay a counter, then. Still, I don't think it's
worth the effort to make the count correct in case of included files, so I'd say just add
a comment explaining that the count isn't totally accurate.
Well, while thinking about this I decided to leave the counter for the ParseConfigFp, but
drop it in ProcessConfigFile. The case we are protecting against is a single file full of junk.
It's unlikely that this junk would contain include directives with valid file paths, neither it's
likely to find a file with a correct syntax, but full of invalid directives.
Thank you,
Alexey.
--
Command Prompt, Inc. http://www.CommandPrompt.com
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Jun16, 2011, at 20:14 , Alexey Klyukin wrote:
On Jun 16, 2011, at 8:01 PM, Florian Pflug wrote:
On Jun16, 2011, at 18:46 , Alexey Klyukin wrote:
I just recalled a reason for counting the total number of errors. There is a condition that
checks that the total number of errors is less than 100 and bails out if it's more than that
(100 is arbitrary). The reason is to avoid bloating the logs w/ something totally unrelated
to postgresql.conf. That was suggested by Tom Lane here:
http://archives.postgresql.org/pgsql-hackers/2009-03/msg01142.phpAh, right, I missed that. Guess it'll have to stay a counter, then. Still, I don't think it's
worth the effort to make the count correct in case of included files, so I'd say just add
a comment explaining that the count isn't totally accurate.Well, while thinking about this I decided to leave the counter for the ParseConfigFp, but
drop it in ProcessConfigFile. The case we are protecting against is a single file full of junk.
It's unlikely that this junk would contain include directives with valid file paths, neither it's
likely to find a file with a correct syntax, but full of invalid directives.
Sounds good.
best regards,
Florian Pflug
Hi,
On Jun 16, 2011, at 9:18 PM, Florian Pflug wrote:
On Jun16, 2011, at 20:14 , Alexey Klyukin wrote:
Well, while thinking about this I decided to leave the counter for the ParseConfigFp, but
drop it in ProcessConfigFile. The case we are protecting against is a single file full of junk.
It's unlikely that this junk would contain include directives with valid file paths, neither it's
likely to find a file with a correct syntax, but full of invalid directives.Sounds good.
Attached is the v2 of the patch to show all parse errors at postgresql.conf.
Changes (per review and suggestions from Florian):
- do not stop on the first error during postmaster's start.
- fix errors in processing files from include directives.
- show only a single syntax error per line, i.e. fast forward to the EOL after coming across the first one.
- additional comments/error messages, code cleanup
Questions:
- Should we add a comment for the changes in guc.c? I think the existing ones are still valid, but they might be harder go grasp, given that we've removed PGC_SIGHUP from the condition.
- The error message that we emit when the parsing is unsuccessful, will it cause incompatibility w/ 3rd party tools, which may, in theory, show only one error message (would it better to show the first error instead, as proposed by Florian?).
I'd appreciate your comments and suggestions.
Thank you,
Alexey.
--
Command Prompt, Inc. http://www.CommandPrompt.com
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Attachments:
parser_continue_on_errors_v2.diffapplication/octet-stream; name=parser_continue_on_errors_v2.diffDownload+70-47
On Jun16, 2011, at 22:34 , Alexey Klyukin wrote:
Attached is the v2 of the patch to show all parse errors at postgresql.conf.
Changes (per review and suggestions from Florian):- do not stop on the first error during postmaster's start.
- fix errors in processing files from include directives.
- show only a single syntax error per line, i.e. fast forward to the EOL after coming across the first one.
- additional comments/error messages, code cleanup
Looks mostly good.
I found one issue which I missed earlier. As it stands, the behaviour
of ProcessConfigFile() changes subtly when IsUnderPostmaster because of
the early abort on errors. Now, in theory the outcome should still be
the same, since we don't apply any settings if there's an error in one
of them. But still, there's a risk that this code isn't 100% waterproof,
and then we'd end up with different settings in the postmaster compared
to the backends. The benefit seems also quite small - since the backends
emit their messages at DEBUG2, you usually won't see the difference
anyway.
The elevel setting at the start of ProcessConfigFile() also seemed
needlessly complex, since we cannot have IsUnderPostmaster and
context == PGCS_POSTMASTER at the same time.
I figured it'd be harder to explain the fixes I have in mind than
simply do them and let the code speak. Attached you'll find an updated
version of your v2 patch (called v2a) as well as an incremental patch
against your v2 (called v2a_delta).
The main changes are the removal of the early aborts when
IsUnderPostmaster and the simplification of the elevel setting
logic in ProcessConfigFile().
The updated patch also adds a few comments. If you agree with my changes,
feel free to mark this patch "Ready for Committer".
best regards,
Florian Pflug
Florian,
On Jun 18, 2011, at 5:40 PM, Florian Pflug wrote:
On Jun16, 2011, at 22:34 , Alexey Klyukin wrote:
Attached is the v2 of the patch to show all parse errors at postgresql.conf.
Changes (per review and suggestions from Florian):- do not stop on the first error during postmaster's start.
- fix errors in processing files from include directives.
- show only a single syntax error per line, i.e. fast forward to the EOL after coming across the first one.
- additional comments/error messages, code cleanupLooks mostly good.
I found one issue which I missed earlier. As it stands, the behaviour
of ProcessConfigFile() changes subtly when IsUnderPostmaster because of
the early abort on errors. Now, in theory the outcome should still be
the same, since we don't apply any settings if there's an error in one
of them. But still, there's a risk that this code isn't 100% waterproof,
and then we'd end up with different settings in the postmaster compared
to the backends. The benefit seems also quite small - since the backends
emit their messages at DEBUG2, you usually won't see the difference
anyway.
I don't think it has changed at all. Previously, we did goto cleanup_list (or
cleanup_exit in ParseConfigFp) right after the first error, no matter whether
that was a postmaster or its child. What I did in my patch is removing the
goto for the postmaster's case. It was my intention to exit after the initial
error for the postmaster's child, to avoid complaining about all errors both
in the postmaster and in the normal backend (imagine seeing 100 errors from
the postmaster and the same 100 from each of the backends if your log level is
DEBUG2). I think the postmaster's child case won't cause any problems, since
we do exactly what we used to do before.
The elevel setting at the start of ProcessConfigFile() also seemed
needlessly complex, since we cannot have IsUnderPostmaster and
context == PGCS_POSTMASTER at the same time.
Agreed.
I figured it'd be harder to explain the fixes I have in mind than
simply do them and let the code speak. Attached you'll find an updated
version of your v2 patch (called v2a) as well as an incremental patch
against your v2 (called v2a_delta).The main changes are the removal of the early aborts when
IsUnderPostmaster and the simplification of the elevel setting
logic in ProcessConfigFile().
Attached is the new patch and the delta. It includes some of the changes from
your patch, while leaving the early abort stuff for postmaster's children.
Thank you,
Alexey.
--
Command Prompt, Inc. http://www.CommandPrompt.com
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Jun20, 2011, at 17:02 , Alexey Klyukin wrote:
On Jun 18, 2011, at 5:40 PM, Florian Pflug wrote:
On Jun16, 2011, at 22:34 , Alexey Klyukin wrote:
Attached is the v2 of the patch to show all parse errors at postgresql.conf.
Changes (per review and suggestions from Florian):- do not stop on the first error during postmaster's start.
- fix errors in processing files from include directives.
- show only a single syntax error per line, i.e. fast forward to the EOL after coming across the first one.
- additional comments/error messages, code cleanupLooks mostly good.
I found one issue which I missed earlier. As it stands, the behaviour
of ProcessConfigFile() changes subtly when IsUnderPostmaster because of
the early abort on errors. Now, in theory the outcome should still be
the same, since we don't apply any settings if there's an error in one
of them. But still, there's a risk that this code isn't 100% waterproof,
and then we'd end up with different settings in the postmaster compared
to the backends. The benefit seems also quite small - since the backends
emit their messages at DEBUG2, you usually won't see the difference
anyway.I don't think it has changed at all. Previously, we did goto cleanup_list (or
cleanup_exit in ParseConfigFp) right after the first error, no matter whether
that was a postmaster or its child. What I did in my patch is removing the
goto for the postmaster's case. It was my intention to exit after the initial
error for the postmaster's child, to avoid complaining about all errors both
in the postmaster and in the normal backend (imagine seeing 100 errors from
the postmaster and the same 100 from each of the backends if your log level is
DEBUG2). I think the postmaster's child case won't cause any problems, since
we do exactly what we used to do before.
Hm, I think you miss-understood what I was trying to say, probably because I
explained it badly. Let me try again.
I fully agree that there *shouldn't* be any difference in behaviour, because
it *shouldn't* matter whether we abort early or not - we won't have applied
any of the settings anway.
But.
The code the actually implements the "check settings first, apply later" logic
isn't easy to read. Now, assume that this code has a bug. Then, with your
patch applied, we might end up with the postmaster applying a setting (because
it didn't abort early) but the backend ignoring it (because they did abort early).
This is obviously bad. Depending on the setting, the consequences may range
from slightly confusing behaviour to outright crashes I guess...
Now, the risk of that happening might be very small. But on the other hand,
the benefit is pretty small also - you get a little less output for log level
DEBUG2, that's it. A level that people probably don't use for the production
databases anyway. This convinced me that the risk/benefit ratio isn't high enough
to warrant the early abort.
Another benefit of removing the check is that it reduces code complexity. Maybe
not when measured in line counts, but it's one less outside factor that changes
ProcessConfigFiles()'s behaviour and thus one thing less you need to think when
you modify that part again in the future. Again, it's a small benefit, but IMHO
it still outweights the benefit.
Having said that, this is my personal opinion and whoever will eventually
commit this may very will assess the cost/benefit ratio differently. So, if
after this more detailed explanations of my reasoning, you still feel that
it makes sense to keep the early abort, then feel free to mark the
patch "Ready for Committer" nevertheless.
best regards,
Florian Pflug
Florian Pflug <fgp@phlo.org> writes:
The code the actually implements the "check settings first, apply later" logic
isn't easy to read. Now, assume that this code has a bug. Then, with your
patch applied, we might end up with the postmaster applying a setting (because
it didn't abort early) but the backend ignoring it (because they did abort early).
This is obviously bad. Depending on the setting, the consequences may range
from slightly confusing behaviour to outright crashes I guess...
This is already known to happen: there are cases where the postmaster
and a backend can come to different conclusions about whether a setting
is valid (eg, because it depends on database encoding). Whether that's
a bug or not isn't completely clear, but if this patch is critically
dependent on the situation never happening, I don't think we can accept
it.
regards, tom lane
On Jun20, 2011, at 18:16 , Tom Lane wrote:
Florian Pflug <fgp@phlo.org> writes:
The code the actually implements the "check settings first, apply later" logic
isn't easy to read. Now, assume that this code has a bug. Then, with your
patch applied, we might end up with the postmaster applying a setting (because
it didn't abort early) but the backend ignoring it (because they did abort early).
This is obviously bad. Depending on the setting, the consequences may range
from slightly confusing behaviour to outright crashes I guess...This is already known to happen: there are cases where the postmaster
and a backend can come to different conclusions about whether a setting
is valid (eg, because it depends on database encoding). Whether that's
a bug or not isn't completely clear, but if this patch is critically
dependent on the situation never happening, I don't think we can accept
it.
Does that mean that some backends might currently choose to ignore an
updated config file wholesale on SIGUP (because some settings are invalid)
while others happily apply it? Meaning that they'll afterwards disagree
even on modified settings which *would* be valid for both backends?
Or do these kinds of setting rejections happen late enough to fall out
of the all-or-nothing logic in ProcessConfigFile?
Anyway, the patch *doesn't* depend on all backends's setting being in sync.
The issue we were discussion was whether it's OK to add another small risk
of them getting out of sync by aborting early on errors in backends but
not in the postmaster. I was arguing against that, while Alexey was in favour
of it, on the grounds that it reduces log traffic (but only at DEBUG2 or
beyond).
best regards,
Florian Pflug
Florian Pflug <fgp@phlo.org> writes:
On Jun20, 2011, at 18:16 , Tom Lane wrote:
This is already known to happen: there are cases where the postmaster
and a backend can come to different conclusions about whether a setting
is valid (eg, because it depends on database encoding). Whether that's
a bug or not isn't completely clear, but if this patch is critically
dependent on the situation never happening, I don't think we can accept
it.
Does that mean that some backends might currently choose to ignore an
updated config file wholesale on SIGUP (because some settings are invalid)
while others happily apply it? Meaning that they'll afterwards disagree
even on modified settings which *would* be valid for both backends?
Yes. I complained about that before:
http://archives.postgresql.org/pgsql-hackers/2011-04/msg00330.php
but we didn't come to any consensus about fixing it. This patch might
be a good vehicle for revisiting the issue, though.
regards, tom lane