Thread: proposal: a validator for configuration files
Hi, I'd like to add an ability to validate the corectness of PostgreSQL configuration files, i.e. postgresql.conf, pg_hba.conf, pg_ident.conf without actually applying them. The idea is that it would be a command-line option to postgres for a stand-alone case, or a user-callable function when postmaster is running. Per the former discussion of a validator for PostgreSQL configuration files (see http://archives.postgresql.org/pgsql-hackers/2008-08/msg00048.php), here's an implementation proposal. The development plan consists of 2 parts. The first one is to add new code that would allow running the checks in both a stand-alone process, when postmaster is not running, and as a function call from a backend postgres process. Initially the code would simply loads configuration files, without performing any of the validation checks. The second part consists of adding specific checks. I think most of the code related to this feature should be put into the auxiliary process. The rationale is that we already have a stand-alone CheckerProcess, which nowadays only parses postgresql.conf, runs BaseInit and exists. We can easily load pg_hba.conf and pg_ident.conf and run all the necessary checks there. Moreover, the same process can be used when checks are launched from a built-in function. In that case, it would save the postgres backend from reloading postgresql.conf, pg_hba.conf and pg_ident.conf internally and restoring the previous configuration options when the function exists. Below is a more detailed description of implementation steps: 1.1. Stand-alone process (postmaster is not running): - Add a new option (--check-config) to main.c. Run AuxiliaryProcessMain with auxType= CheckerProcess if this option is specified. - In CheckerModeMain load pg_hba.conf, pg_ident.conf 1.2. Built-in function, called from a backend process. - Add a new built-in function - Add LastCheckState shared flag to denote whether the check has been successful or failed - Add a new PMSignalReason - From the function call SendPostmasterSignal with the reason added on the previous step. - In the postmaster add a check for this reason in sigusr1_handler, spawning a checker process. Set LastCheckStatus to InProgress. - Store current configuration options in AuxillaryProcessMain before reloading configuration files to avoid checking foroptions that haven't changed. - In AuxillaryProcessMain, modify SelectConfigFiles invocation to call it if IsUnderPostmaster = true if auxType processtype is CheckerProcess. - In the CheckerModeMain, set LastCheckState to either success or failure at the end of all checks. - The built-in function would wait until LastCheckState changes its value from InProgress to either Succcess or Failure,or until some predefined timeout expires, in which case it would emit an error. 2. Add following configuration checks: postgresql.conf: Check that: 1. postgres can bind to listen address:port 2. unix_socket_directory has proper permissions (n/a on Windows).3. unix_socket_group group exists on a target system (n/a Windows). 4. unix_socket_permissions are valid (writepermission is not revoked from the owner, or a group, if group is different from the user). 5. server.crt andserver.key exist in the data directory if ssl is set to true (and server is compiled with openssl) 6. krb_server_keyfileis readable by the server (if set) 7. server is not requesting more shared memory than it's availablein the system. 8. shared_preload_libraries and local_preload_libraries exist. 9. synchronous_standby_names are notempty and max_wal_senders are > 0 if synchronous_replication = true 10. all enable_* query planner parameters areon. 11. seq_page_cost <= random_page_cost, and cpu_ costs are lower than page_ costs. 12. effective_cache_size is lessthan the amount of physical memory available. 13. geqo is turned on 14. default_statistics_target is >= 100 15. loggingcollector is enabled if log destination is stderr 16. log directory either exists and writable or that the parent directory allows creating subdris 17. track counts is on if autovacuum is enabled 18. stats_temp_directory iswriteable 19. default tablespace exists (if set) 20. temp_tablespace exists (if set) 21. statement_timeout is 0 (disabled)22. dynamic_library_path exists 23. sql_inheritance is off 24. zero_damaged_pages is off pg_hba.conf: Check that: 1. the server is compiled with ssl if hostssl is specified 2. ssl is not required if hostnossl is specified - Add a Warning value to LastCheckState for cases when configuration files were loaded successfully, but one or more validation checks have failed. Note that these are basic checks don't require the checker process to have access to the database, Also, with this implementation, a client won't receive an exact error message in case validation is unsuccessful, however, it would receive a status (success, failure, warnings), and an exact error message would be available in the server's log. It's possible to address these shortcomings in the future. Ideas, suggestions are welcome. -- Alexey Klyukin The PostgreSQL Company - Command Prompt, Inc.
Hi! On Wed, Mar 30, 2011 at 8:40 AM, Alexey Klyukin <alexk@commandprompt.com> wrote: > I'd like to add an ability to validate the corectness of PostgreSQL > configuration files, i.e. postgresql.conf, pg_hba.conf, pg_ident.conf without > actually applying them. The idea is that it would be a command-line option to > postgres for a stand-alone case, or a user-callable function when postmaster > is running. > > Per the former discussion of a validator for PostgreSQL configuration files > (see http://archives.postgresql.org/pgsql-hackers/2008-08/msg00048.php), > here's an implementation proposal. I did a little bit of work on this, and we discussed it here: http://archives.postgresql.org/pgsql-hackers/2009-03/msg00345.php http://archives.postgresql.org/pgsql-hackers/2009-03/msg01142.php Probably there's a bit of bitrot in there. > The development plan consists of 2 parts. > The first one is to add new code that would allow running the checks in both a > stand-alone process, when postmaster is not running, and as a function call > from a backend postgres process. Initially the code would simply loads > configuration files, without performing any of the validation checks. The > second part consists of adding specific checks. Cool! Mine was only going to work if the system started up or was reloaded. > I think most of the code related to this feature should be put into the > auxiliary process. The rationale is that we already have a stand-alone > CheckerProcess, which nowadays only parses postgresql.conf, runs BaseInit and > exists. We can easily load pg_hba.conf and pg_ident.conf and run all the > necessary checks there. Moreover, the same process can be used when checks are > launched from a built-in function. In that case, it would save the postgres > backend from reloading postgresql.conf, pg_hba.conf and pg_ident.conf > internally and restoring the previous configuration options when the function > exists. Below is a more detailed description of implementation steps: > > 1.1. Stand-alone process (postmaster is not running): > > - Add a new option (--check-config) to main.c. Run AuxiliaryProcessMain with > auxType= CheckerProcess if this option is specified. > - In CheckerModeMain load pg_hba.conf, pg_ident.conf > > > 1.2. Built-in function, called from a backend process. > - Add a new built-in function > - Add LastCheckState shared flag to denote whether the check has been > successful or failed > - Add a new PMSignalReason > - From the function call SendPostmasterSignal with the reason added on the > previous step. > - In the postmaster add a check for this reason in sigusr1_handler, spawning > a checker process. Set LastCheckStatus to InProgress. > - Store current configuration options in AuxillaryProcessMain before reloading > configuration files to avoid checking for options that haven't changed. > - In AuxillaryProcessMain, modify SelectConfigFiles invocation to call it > if IsUnderPostmaster = true if auxType process type is CheckerProcess. > - In the CheckerModeMain, set LastCheckState to either success or failure at the > end of all checks. > - The built-in function would wait until LastCheckState changes its value from > InProgress to either Succcess or Failure, or until some predefined timeout > expires, in which case it would emit an error. > > 2. Add following configuration checks: > > postgresql.conf: > > Check that: > 1. postgres can bind to listen address:port > 2. unix_socket_directory has proper permissions (n/a on Windows). > 3. unix_socket_group group exists on a target system (n/a Windows). > 4. unix_socket_permissions are valid (write permission is not revoked from > the owner, or a group, if group is different from the user). > 5. server.crt and server.key exist in the data directory if ssl is > set to true (and server is compiled with openssl) > 6. krb_server_keyfile is readable by the server (if set) > 7. server is not requesting more shared memory than it's available in the system. > 8. shared_preload_libraries and local_preload_libraries exist. > 9. synchronous_standby_names are not empty and max_wal_senders > are > 0 if synchronous_replication = true > 10. all enable_* query planner parameters are on. > 11. seq_page_cost <= random_page_cost, and cpu_ costs are lower than page_ costs. > 12. effective_cache_size is less than the amount of physical memory available. > 13. geqo is turned on > 14. default_statistics_target is >= 100 > 15. logging collector is enabled if log destination is stderr > 16. log directory either exists and writable or that the parent > directory allows creating subdris > 17. track counts is on if autovacuum is enabled > 18. stats_temp_directory is writeable > 19. default tablespace exists (if set) > 20. temp_tablespace exists (if set) > 21. statement_timeout is 0 (disabled) > 22. dynamic_library_path exists > 23. sql_inheritance is off > 24. zero_damaged_pages is off Some of this seems not like things that postgres itself should be doing, but more something that an external validation program (like what Greg Smith maintains) would do. > pg_hba.conf: > Check that: > 1. the server is compiled with ssl if hostssl is specified > 2. ssl is not required if hostnossl is specified > > - Add a Warning value to LastCheckState for cases when configuration > files were loaded successfully, but one or more validation checks have failed. > > Note that these are basic checks don't require the checker process to have > access to the database, Also, with this implementation, a client won't receive > an exact error message in case validation is unsuccessful, however, it would > receive a status (success, failure, warnings), and an exact error message > would be available in the server's log. It's possible to address these > shortcomings in the future. > > Ideas, suggestions are welcome. As I said above, some of what you've suggested seems more like a non-postgres core thing.. maybe an extension? Or maybe offer the option to read My idea was to just check that settings were *valid* not that they met some other, more subjective criteria. -selena -- http://chesnok.com
Hi Selena, On Mar 30, 2011, at 11:42 PM, Selena Deckelmann wrote: > Hi! > > On Wed, Mar 30, 2011 at 8:40 AM, Alexey Klyukin <alexk@commandprompt.com> wrote: > > > I did a little bit of work on this, and we discussed it here: > > http://archives.postgresql.org/pgsql-hackers/2009-03/msg00345.php > http://archives.postgresql.org/pgsql-hackers/2009-03/msg01142.php > > Probably there's a bit of bitrot in there. Cool, I was not aware of your work in this direction. I've updated your patch to apply to the latest HEAD, implementing Tom Lane's suggestions (attached). I think I'll implement the other part (reporting all invalid parameters, as opposed to only the first one) tomorrow. > >> The development plan consists of 2 parts. >> The first one is to add new code that would allow running the checks in both a >> stand-alone process, when postmaster is not running, and as a function call >> from a backend postgres process. Initially the code would simply loads >> configuration files, without performing any of the validation checks. The >> second part consists of adding specific checks. > > Cool! Mine was only going to work if the system started up or was reloaded. Well, I think a stand-alone check is an easy part :) > > As I said above, some of what you've suggested seems more like a > non-postgres core thing.. maybe an extension? Or maybe offer the > option to read Well, initially I'm going to start with just a check that configuration files are valid, and add other checks afterwards. I think it makes sense for them to be optional. > > My idea was to just check that settings were *valid* not that they met > some other, more subjective criteria. Well, my definition of valid configuration is not only the one that server is able to parse and load, but also to actually apply (i.e. can bind to a listen_address or read SSL certificate files). I agree that's not always necessary (i.e. when checking configuration on a different server than the one it should be applied to), so we can add a flag to turn them off. -- Alexey Klyukin The PostgreSQL Company - Command Prompt, Inc.
Attachment
On Apr 1, 2011, at 12:08 AM, Alexey Klyukin wrote: > Hi Selena, > > On Mar 30, 2011, at 11:42 PM, Selena Deckelmann wrote: > >> Hi! >> >> On Wed, Mar 30, 2011 at 8:40 AM, Alexey Klyukin <alexk@commandprompt.com> wrote: >> >> >> I did a little bit of work on this, and we discussed it here: >> >> http://archives.postgresql.org/pgsql-hackers/2009-03/msg00345.php >> http://archives.postgresql.org/pgsql-hackers/2009-03/msg01142.php >> >> Probably there's a bit of bitrot in there. > > Cool, I was not aware of your work in this direction. I've updated your patch > to apply to the latest HEAD, implementing Tom Lane's suggestions (attached). I > think I'll implement the other part (reporting all invalid parameters, as > opposed to only the first one) tomorrow. Here's the update of Selena's patch, which also shows all errors in configuration parameters (as well as parser errors) during reload. When I talked to Alvaro the other day he suggested starting with a stand-alone process, which would load the postgresql.conf in a postmaster context, load other configuration files and do some of the checks I've mentioned in my initial proposal (particularly it would check that system's shared memory limit is high enough by trying to allocate a new shared memory segment). Afterwards, I'd like to implement checks from a user-callable function, and not all checks would be available from it. -- Alexey Klyukin The PostgreSQL Company - Command Prompt, Inc.
Attachment
On Mon, Apr 4, 2011 at 2:03 PM, Alexey Klyukin <alexk@commandprompt.com> wrote: > Here's the update of Selena's patch, which also shows all errors in > configuration parameters (as well as parser errors) during reload. You should add this here: https://commitfest.postgresql.org/action/commitfest_view/open On a quick glance, this patch appears to contain some superfluous hunks where you changed whitespace or variable names. You might want to remove those and repost before adding to the CF app. Also, some submission notes would be very helpful - when you send in the revised version, detail in the email the exact purpose of the changes so that someone can review the patch without having to read this thread and all preceding threads in their entirety. Thanks, -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Hi, On Apr 14, 2011, at 9:50 PM, Robert Haas wrote: > On Mon, Apr 4, 2011 at 2:03 PM, Alexey Klyukin <alexk@commandprompt.com> wrote: >> Here's the update of Selena's patch, which also shows all errors in >> configuration parameters (as well as parser errors) during reload. > > You should add this here: > > https://commitfest.postgresql.org/action/commitfest_view/open > > On a quick glance, this patch appears to contain some superfluous > hunks where you changed whitespace or variable names. You might want > to remove those and repost before adding to the CF app. Also, some > submission notes would be very helpful - when you send in the revised > version, detail in the email the exact purpose of the changes so that > someone can review the patch without having to read this thread and > all preceding threads in their entirety. Thank you for the feedback, I've updated the patch, attached is a new version. I'll add it to the commitfest after posting this message. The patch forces the parser to report all errors (max 100) from the ProcessConfigFile/ParseConfigFp. Currently, only the first parse error or an invalid directive is reported. Reporting all of them is crucial to automatic validation of postgres config files. This patch is based on the one submitted earlier by Selena Deckelmann: http://archives.postgresql.org/pgsql-hackers/2009-03/msg00345.php It incorporates suggestions by Tom Lane for avoiding excessive bloat in logs in case there is a junk instead of postgresql.conf. http://archives.postgresql.org/pgsql-hackers/2009-03/msg01142.php Regards, -- Alexey Klyukin The PostgreSQL Company - Command Prompt, Inc.
Attachment
Hi On May14, 2011, at 00:49 , Alexey Klyukin wrote: > The patch forces the parser to report all errors (max 100) from the > ProcessConfigFile/ParseConfigFp. Currently, only the first parse error or an > invalid directive is reported. Reporting all of them is crucial to automatic > validation of postgres config files. > > This patch is based on the one submitted earlier by Selena Deckelmann: > http://archives.postgresql.org/pgsql-hackers/2009-03/msg00345.php > > It incorporates suggestions by Tom Lane for avoiding excessive bloat in logs > in case there is a junk instead of postgresql.conf. > http://archives.postgresql.org/pgsql-hackers/2009-03/msg01142.php Here's my review of your patch. The patch is in context diff format and applies cleanly to HEAD. It doesn't contain superfluous whitespace changes any more is and quite readable. First, the behaviour. The first problem I ran into when I tried to test this is that it *only* reports multiple errors during config file reload on SIHUP, not during postmaster startup. I guess it's been done that way because we ereport(ERROR,..) not ereport(LOG,...) during postmaster startup, so it's not immediatly clear how to report multiple errors. But that proplem seems solvable. What if you ereport(LOG,..)ed the individual errors during postmaster startup, and then emitted an ereport(ERROR) at the end if errors occurred? The ERROR could either repeat the first error that was encountered, or simply say "config file contains errors". Now to the code. I see that you basically replaced "goto cleanup..." in both ParseConfigFp() and ProcessConfigFile() with "++errorcount", and arranged for ParseConfigFp() to return false, and for ProcessConfigFile() to skip the GUC updates if "errorcount > 0". The actual value of errorcount is never inspected. The value is also wrong in the case of include files containing more than error, since ParseConfigFp() simply increments errorcount by one for each failed ParseConfigFile() of an included file. I thus suggest that you replace "errorcount" with a boolean "erroroccurred". You've also introduced a bug in ParseConfigFp(). Previously, if an included file contained an error, it did "goto cleanup_exit()" and thus didn't ereport() on it's own. With your patch applied it ereport()s a parse error at the location of the "include" directive, which seems wrong. Finally, I believe that ParseConfigFp() should make at least some effort to resync after hitting a parser error. I suggest that you simply fast-forward to the next GUC_EOL token after reporting a parser error. best regards, Florian Pflug
Florian, On Jun 16, 2011, at 2:34 PM, Florian Pflug wrote: > Hi > > On May14, 2011, at 00:49 , Alexey Klyukin wrote: >> The patch forces the parser to report all errors (max 100) from the >> ProcessConfigFile/ParseConfigFp. Currently, only the first parse error or an >> invalid directive is reported. Reporting all of them is crucial to automatic >> validation of postgres config files. >> >> This patch is based on the one submitted earlier by Selena Deckelmann: >> http://archives.postgresql.org/pgsql-hackers/2009-03/msg00345.php >> >> It incorporates suggestions by Tom Lane for avoiding excessive bloat in logs >> in case there is a junk instead of postgresql.conf. >> http://archives.postgresql.org/pgsql-hackers/2009-03/msg01142.php > > Here's my review of your patch. > > The patch is in context diff format and applies cleanly to HEAD. It doesn't > contain superfluous whitespace changes any more is and quite readable. > > First, the behaviour. > > The first problem I ran into when I tried to test this is that it *only* > reports multiple errors during config file reload on SIHUP, not during > postmaster startup. I guess it's been done that way because we > ereport(ERROR,..) not ereport(LOG,...) during postmaster startup, so it's > not immediatly clear how to report multiple errors. But that proplem > seems solvable. What if you ereport(LOG,..)ed the individual errors during > postmaster startup, and then emitted an ereport(ERROR) at the end if > errors occurred? The ERROR could either repeat the first error that was > encountered, or simply say "config file contains errors". Makes sense. One problem I came across is that set_config_option from guc.c sets the elevel by itself. I've changed it to emit LOG errors on PGC_S_FILE source, apparently all the callers of this function with this source are from guc-file.l, so hopefully I won't break anything with this change. > > Now to the code. > > I see that you basically replaced "goto cleanup..." in both ParseConfigFp() > and ProcessConfigFile() with "++errorcount", and arranged for ParseConfigFp() > to return false, and for ProcessConfigFile() to skip the GUC updates if > "errorcount > 0". The actual value of errorcount is never inspected. The value > is also wrong in the case of include files containing more than error, since > ParseConfigFp() simply increments errorcount by one for each failed > ParseConfigFile() of an included file. > > I thus suggest that you replace "errorcount" with a boolean "erroroccurred". I can actually pass errorcount down from the ParseConfigFp() to report the total number of errors (w/ the include files) at the end of ProcessConfigFile if there is any interest in having the number of errors reported to a user. If not - I'll change it to boolean. > > You've also introduced a bug in ParseConfigFp(). Previously, if an included > file contained an error, it did "goto cleanup_exit()" and thus didn't > ereport() on it's own. With your patch applied it ereport()s a parse error > at the location of the "include" directive, which seems wrong. Right, I noticed that I skipped switching the buffers and restoring the Lineno as well. I'll fix it in the next revision. > > Finally, I believe that ParseConfigFp() should make at least some effort to > resync after hitting a parser error. I suggest that you simply fast-forward > to the next GUC_EOL token after reporting a parser error. Makes sense. Thank you for the review. I'll post an updated patch, addressing you concerns, shortly. Alexey. -- Command Prompt, Inc. http://www.CommandPrompt.com PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Jun16, 2011, at 17:23 , Alexey Klyukin wrote: > On Jun 16, 2011, at 2:34 PM, Florian Pflug wrote: >> The first problem I ran into when I tried to test this is that it *only* >> reports multiple errors during config file reload on SIHUP, not during >> postmaster startup. I guess it's been done that way because we >> ereport(ERROR,..) not ereport(LOG,...) during postmaster startup, so it's >> not immediatly clear how to report multiple errors. But that proplem >> seems solvable. What if you ereport(LOG,..)ed the individual errors during >> postmaster startup, and then emitted an ereport(ERROR) at the end if >> errors occurred? The ERROR could either repeat the first error that was >> encountered, or simply say "config file contains errors". > > Makes sense. One problem I came across is that set_config_option from guc.c > sets the elevel by itself. I've changed it to emit LOG errors on PGC_S_FILE > source, apparently all the callers of this function with this source are from > guc-file.l, so hopefully I won't break anything with this change. Hm, wouldn't a test for "context == PGC_POSTMASTER" be more appropriate? >> I see that you basically replaced "goto cleanup..." in both ParseConfigFp() >> and ProcessConfigFile() with "++errorcount", and arranged for ParseConfigFp() >> to return false, and for ProcessConfigFile() to skip the GUC updates if >> "errorcount > 0". The actual value of errorcount is never inspected. The value >> is also wrong in the case of include files containing more than error, since >> ParseConfigFp() simply increments errorcount by one for each failed >> ParseConfigFile() of an included file. >> >> I thus suggest that you replace "errorcount" with a boolean "erroroccurred". > > I can actually pass errorcount down from the ParseConfigFp() to report the total > number of errors (w/ the include files) at the end of ProcessConfigFile if there is > any interest in having the number of errors reported to a user. If not - I'll change > it to boolean. Nah, just use a boolean, unless you have concrete plans to actually use the errorcount for something other than test a la "errorcount > 0". best regards, Florian Pflug
On Jun 16, 2011, at 6:49 PM, Florian Pflug wrote: > On Jun16, 2011, at 17:23 , Alexey Klyukin wrote: >> On Jun 16, 2011, at 2:34 PM, Florian Pflug wrote: >>> The first problem I ran into when I tried to test this is that it *only* >>> reports multiple errors during config file reload on SIHUP, not during >>> postmaster startup. I guess it's been done that way because we >>> ereport(ERROR,..) not ereport(LOG,...) during postmaster startup, so it's >>> not immediatly clear how to report multiple errors. But that proplem >>> seems solvable. What if you ereport(LOG,..)ed the individual errors during >>> postmaster startup, and then emitted an ereport(ERROR) at the end if >>> errors occurred? The ERROR could either repeat the first error that was >>> encountered, or simply say "config file contains errors". >> >> Makes sense. One problem I came across is that set_config_option from guc.c >> sets the elevel by itself. I've changed it to emit LOG errors on PGC_S_FILE >> source, apparently all the callers of this function with this source are from >> guc-file.l, so hopefully I won't break anything with this change. > > Hm, wouldn't a test for "context == PGC_POSTMASTER" be more appropriate? In such a case the errors caused by command-line arguments won't stop the postmaster. PGC_S_FILE seems to handle this correctly. I'm not sure whether it is appropriate to use there though. > >>> I see that you basically replaced "goto cleanup..." in both ParseConfigFp() >>> and ProcessConfigFile() with "++errorcount", and arranged for ParseConfigFp() >>> to return false, and for ProcessConfigFile() to skip the GUC updates if >>> "errorcount > 0". The actual value of errorcount is never inspected. The value >>> is also wrong in the case of include files containing more than error, since >>> ParseConfigFp() simply increments errorcount by one for each failed >>> ParseConfigFile() of an included file. >>> >>> I thus suggest that you replace "errorcount" with a boolean "erroroccurred". >> >> I can actually pass errorcount down from the ParseConfigFp() to report the total >> number of errors (w/ the include files) at the end of ProcessConfigFile if there is >> any interest in having the number of errors reported to a user. If not - I'll change >> it to boolean. > > Nah, just use a boolean, unless you have concrete plans to actually use the errorcount > for something other than test a la "errorcount > 0". I just recalled a reason for counting the total number of errors. There is a condition that checks that the total number of errors is less than 100 and bails out if it's more than that (100 is arbitrary). The reason is to avoid bloating the logs w/ something totally unrelated to postgresql.conf. That was suggested by Tom Lane here: http://archives.postgresql.org/pgsql-hackers/2009-03/msg01142.php Thank you, Alexey. -- Command Prompt, Inc. http://www.CommandPrompt.com PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Jun16, 2011, at 18:46 , Alexey Klyukin wrote: > On Jun 16, 2011, at 6:49 PM, Florian Pflug wrote: >> Hm, wouldn't a test for "context == PGC_POSTMASTER" be more appropriate? > > In such a case the errors caused by command-line arguments won't stop the postmaster. > PGC_S_FILE seems to handle this correctly. I'm not sure whether it is appropriate to use > there though. Ah, yeah, you're right. PGC_S_FILE sounds fine, then. I guess this means you can drop the check for "context == PGC_SIGHUP" though, because surely the source must be PGC_S_DEFAULT or PGC_S_FILE if context == PGC_SIGHUP, right? So the check would become if (source == PGC_S_FILE || source == PGC_S_DEFAULT) where it now says if (context == PGC_SIGHUP || source == PGC_S_DEFAULT) >>>> I see that you basically replaced "goto cleanup..." in both ParseConfigFp() >>>> and ProcessConfigFile() with "++errorcount", and arranged for ParseConfigFp() >>>> to return false, and for ProcessConfigFile() to skip the GUC updates if >>>> "errorcount > 0". The actual value of errorcount is never inspected. The value >>>> is also wrong in the case of include files containing more than error, since >>>> ParseConfigFp() simply increments errorcount by one for each failed >>>> ParseConfigFile() of an included file. >>>> >>>> I thus suggest that you replace "errorcount" with a boolean "erroroccurred". >>> >>> I can actually pass errorcount down from the ParseConfigFp() to report the total >>> number of errors (w/ the include files) at the end of ProcessConfigFile if there is >>> any interest in having the number of errors reported to a user. If not - I'll change >>> it to boolean. >> >> Nah, just use a boolean, unless you have concrete plans to actually use the errorcount >> for something other than test a la "errorcount > 0". > > I just recalled a reason for counting the total number of errors. There is a condition that > checks that the total number of errors is less than 100 and bails out if it's more than that > (100 is arbitrary). The reason is to avoid bloating the logs w/ something totally unrelated > to postgresql.conf. That was suggested by Tom Lane here: > http://archives.postgresql.org/pgsql-hackers/2009-03/msg01142.php Ah, right, I missed that. Guess it'll have to stay a counter, then. Still, I don't think it's worth the effort to make the count correct in case of included files, so I'd say just add a comment explaining that the count isn't totally accurate. best regards, Florian Pflug
On Jun 16, 2011, at 8:01 PM, Florian Pflug wrote: > On Jun16, 2011, at 18:46 , Alexey Klyukin wrote: >> On Jun 16, 2011, at 6:49 PM, Florian Pflug wrote: >>> Hm, wouldn't a test for "context == PGC_POSTMASTER" be more appropriate? >> >> In such a case the errors caused by command-line arguments won't stop the postmaster. >> PGC_S_FILE seems to handle this correctly. I'm not sure whether it is appropriate to use >> there though. > > Ah, yeah, you're right. PGC_S_FILE sounds fine, then. I guess this means you can > drop the check for "context == PGC_SIGHUP" though, because surely the source must > be PGC_S_DEFAULT or PGC_S_FILE if context == PGC_SIGHUP, right? So the check would > become > if (source == PGC_S_FILE || source == PGC_S_DEFAULT) > where it now says > if (context == PGC_SIGHUP || source == PGC_S_DEFAULT) Yes, AFAIK PGC_SIGHUP is redundant, thank you for the suggestion. >> >> I just recalled a reason for counting the total number of errors. There is a condition that >> checks that the total number of errors is less than 100 and bails out if it's more than that >> (100 is arbitrary). The reason is to avoid bloating the logs w/ something totally unrelated >> to postgresql.conf. That was suggested by Tom Lane here: >> http://archives.postgresql.org/pgsql-hackers/2009-03/msg01142.php > > Ah, right, I missed that. Guess it'll have to stay a counter, then. Still, I don't think it's > worth the effort to make the count correct in case of included files, so I'd say just add > a comment explaining that the count isn't totally accurate. Well, while thinking about this I decided to leave the counter for the ParseConfigFp, but drop it in ProcessConfigFile. The case we are protecting against is a single file full of junk. It's unlikely that this junk would contain include directives with valid file paths, neither it's likely to find a file with a correct syntax, but full of invalid directives. Thank you, Alexey. -- Command Prompt, Inc. http://www.CommandPrompt.com PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Jun16, 2011, at 20:14 , Alexey Klyukin wrote: > On Jun 16, 2011, at 8:01 PM, Florian Pflug wrote: >> On Jun16, 2011, at 18:46 , Alexey Klyukin wrote: >>> I just recalled a reason for counting the total number of errors. There is a condition that >>> checks that the total number of errors is less than 100 and bails out if it's more than that >>> (100 is arbitrary). The reason is to avoid bloating the logs w/ something totally unrelated >>> to postgresql.conf. That was suggested by Tom Lane here: >>> http://archives.postgresql.org/pgsql-hackers/2009-03/msg01142.php >> >> Ah, right, I missed that. Guess it'll have to stay a counter, then. Still, I don't think it's >> worth the effort to make the count correct in case of included files, so I'd say just add >> a comment explaining that the count isn't totally accurate. > > Well, while thinking about this I decided to leave the counter for the ParseConfigFp, but > drop it in ProcessConfigFile. The case we are protecting against is a single file full of junk. > It's unlikely that this junk would contain include directives with valid file paths, neither it's > likely to find a file with a correct syntax, but full of invalid directives. Sounds good. best regards, Florian Pflug
Hi, On Jun 16, 2011, at 9:18 PM, Florian Pflug wrote: > On Jun16, 2011, at 20:14 , Alexey Klyukin wrote: >> >> Well, while thinking about this I decided to leave the counter for the ParseConfigFp, but >> drop it in ProcessConfigFile. The case we are protecting against is a single file full of junk. >> It's unlikely that this junk would contain include directives with valid file paths, neither it's >> likely to find a file with a correct syntax, but full of invalid directives. > > Sounds good. Attached is the v2 of the patch to show all parse errors at postgresql.conf. Changes (per review and suggestions from Florian): - do not stop on the first error during postmaster's start. - fix errors in processing files from include directives. - show only a single syntax error per line, i.e. fast forward to the EOL after coming across the first one. - additional comments/error messages, code cleanup Questions: - Should we add a comment for the changes in guc.c? I think the existing ones are still valid, but they might be harder gograsp, given that we've removed PGC_SIGHUP from the condition. - The error message that we emit when the parsing is unsuccessful, will it cause incompatibility w/ 3rd party tools, whichmay, in theory, show only one error message (would it better to show the first error instead, as proposed by Florian?). I'd appreciate your comments and suggestions. Thank you, Alexey. -- Command Prompt, Inc. http://www.CommandPrompt.com PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Attachment
On Jun16, 2011, at 22:34 , Alexey Klyukin wrote: > Attached is the v2 of the patch to show all parse errors at postgresql.conf. > Changes (per review and suggestions from Florian): > > - do not stop on the first error during postmaster's start. > - fix errors in processing files from include directives. > - show only a single syntax error per line, i.e. fast forward to the EOL after coming across the first one. > - additional comments/error messages, code cleanup Looks mostly good. I found one issue which I missed earlier. As it stands, the behaviour of ProcessConfigFile() changes subtly when IsUnderPostmaster because of the early abort on errors. Now, in theory the outcome should still be the same, since we don't apply any settings if there's an error in one of them. But still, there's a risk that this code isn't 100% waterproof, and then we'd end up with different settings in the postmaster compared to the backends. The benefit seems also quite small - since the backends emit their messages at DEBUG2, you usually won't see the difference anyway. The elevel setting at the start of ProcessConfigFile() also seemed needlessly complex, since we cannot have IsUnderPostmaster and context == PGCS_POSTMASTER at the same time. I figured it'd be harder to explain the fixes I have in mind than simply do them and let the code speak. Attached you'll find an updated version of your v2 patch (called v2a) as well as an incremental patch against your v2 (called v2a_delta). The main changes are the removal of the early aborts when IsUnderPostmaster and the simplification of the elevel setting logic in ProcessConfigFile(). The updated patch also adds a few comments. If you agree with my changes, feel free to mark this patch "Ready for Committer". best regards, Florian Pflug
Attachment
Florian, On Jun 18, 2011, at 5:40 PM, Florian Pflug wrote: > On Jun16, 2011, at 22:34 , Alexey Klyukin wrote: >> Attached is the v2 of the patch to show all parse errors at postgresql.conf. >> Changes (per review and suggestions from Florian): >> >> - do not stop on the first error during postmaster's start. >> - fix errors in processing files from include directives. >> - show only a single syntax error per line, i.e. fast forward to the EOL after coming across the first one. >> - additional comments/error messages, code cleanup > > Looks mostly good. > > I found one issue which I missed earlier. As it stands, the behaviour > of ProcessConfigFile() changes subtly when IsUnderPostmaster because of > the early abort on errors. Now, in theory the outcome should still be > the same, since we don't apply any settings if there's an error in one > of them. But still, there's a risk that this code isn't 100% waterproof, > and then we'd end up with different settings in the postmaster compared > to the backends. The benefit seems also quite small - since the backends > emit their messages at DEBUG2, you usually won't see the difference > anyway. I don't think it has changed at all. Previously, we did goto cleanup_list (or cleanup_exit in ParseConfigFp) right after the first error, no matter whether that was a postmaster or its child. What I did in my patch is removing the goto for the postmaster's case. It was my intention to exit after the initial error for the postmaster's child, to avoid complaining about all errors both in the postmaster and in the normal backend (imagine seeing 100 errors from the postmaster and the same 100 from each of the backends if your log level is DEBUG2). I think the postmaster's child case won't cause any problems, since we do exactly what we used to do before. > > The elevel setting at the start of ProcessConfigFile() also seemed > needlessly complex, since we cannot have IsUnderPostmaster and > context == PGCS_POSTMASTER at the same time. Agreed. > > I figured it'd be harder to explain the fixes I have in mind than > simply do them and let the code speak. Attached you'll find an updated > version of your v2 patch (called v2a) as well as an incremental patch > against your v2 (called v2a_delta). > > The main changes are the removal of the early aborts when > IsUnderPostmaster and the simplification of the elevel setting > logic in ProcessConfigFile(). Attached is the new patch and the delta. It includes some of the changes from your patch, while leaving the early abort stuff for postmaster's children. Thank you, Alexey. -- Command Prompt, Inc. http://www.CommandPrompt.com PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Attachment
On Jun20, 2011, at 17:02 , Alexey Klyukin wrote: > On Jun 18, 2011, at 5:40 PM, Florian Pflug wrote: >> On Jun16, 2011, at 22:34 , Alexey Klyukin wrote: >>> Attached is the v2 of the patch to show all parse errors at postgresql.conf. >>> Changes (per review and suggestions from Florian): >>> >>> - do not stop on the first error during postmaster's start. >>> - fix errors in processing files from include directives. >>> - show only a single syntax error per line, i.e. fast forward to the EOL after coming across the first one. >>> - additional comments/error messages, code cleanup >> >> Looks mostly good. >> >> I found one issue which I missed earlier. As it stands, the behaviour >> of ProcessConfigFile() changes subtly when IsUnderPostmaster because of >> the early abort on errors. Now, in theory the outcome should still be >> the same, since we don't apply any settings if there's an error in one >> of them. But still, there's a risk that this code isn't 100% waterproof, >> and then we'd end up with different settings in the postmaster compared >> to the backends. The benefit seems also quite small - since the backends >> emit their messages at DEBUG2, you usually won't see the difference >> anyway. > > I don't think it has changed at all. Previously, we did goto cleanup_list (or > cleanup_exit in ParseConfigFp) right after the first error, no matter whether > that was a postmaster or its child. What I did in my patch is removing the > goto for the postmaster's case. It was my intention to exit after the initial > error for the postmaster's child, to avoid complaining about all errors both > in the postmaster and in the normal backend (imagine seeing 100 errors from > the postmaster and the same 100 from each of the backends if your log level is > DEBUG2). I think the postmaster's child case won't cause any problems, since > we do exactly what we used to do before. Hm, I think you miss-understood what I was trying to say, probably because I explained it badly. Let me try again. I fully agree that there *shouldn't* be any difference in behaviour, because it *shouldn't* matter whether we abort early or not - we won't have applied any of the settings anway. But. The code the actually implements the "check settings first, apply later" logic isn't easy to read. Now, assume that this code has a bug. Then, with your patch applied, we might end up with the postmaster applying a setting (because it didn't abort early) but the backend ignoring it (because they did abort early). This is obviously bad. Depending on the setting, the consequences may range from slightly confusing behaviour to outright crashes I guess... Now, the risk of that happening might be very small. But on the other hand, the benefit is pretty small also - you get a little less output for log level DEBUG2, that's it. A level that people probably don't use for the production databases anyway. This convinced me that the risk/benefit ratio isn't high enough to warrant the early abort. Another benefit of removing the check is that it reduces code complexity. Maybe not when measured in line counts, but it's one less outside factor that changes ProcessConfigFiles()'s behaviour and thus one thing less you need to think when you modify that part again in the future. Again, it's a small benefit, but IMHO it still outweights the benefit. Having said that, this is my personal opinion and whoever will eventually commit this may very will assess the cost/benefit ratio differently. So, if after this more detailed explanations of my reasoning, you still feel that it makes sense to keep the early abort, then feel free to mark the patch "Ready for Committer" nevertheless. best regards, Florian Pflug
Florian Pflug <fgp@phlo.org> writes: > The code the actually implements the "check settings first, apply later" logic > isn't easy to read. Now, assume that this code has a bug. Then, with your > patch applied, we might end up with the postmaster applying a setting (because > it didn't abort early) but the backend ignoring it (because they did abort early). > This is obviously bad. Depending on the setting, the consequences may range > from slightly confusing behaviour to outright crashes I guess... This is already known to happen: there are cases where the postmaster and a backend can come to different conclusions about whether a setting is valid (eg, because it depends on database encoding). Whether that's a bug or not isn't completely clear, but if this patch is critically dependent on the situation never happening, I don't think we can accept it. regards, tom lane
On Jun20, 2011, at 18:16 , Tom Lane wrote: > Florian Pflug <fgp@phlo.org> writes: >> The code the actually implements the "check settings first, apply later" logic >> isn't easy to read. Now, assume that this code has a bug. Then, with your >> patch applied, we might end up with the postmaster applying a setting (because >> it didn't abort early) but the backend ignoring it (because they did abort early). >> This is obviously bad. Depending on the setting, the consequences may range >> from slightly confusing behaviour to outright crashes I guess... > > This is already known to happen: there are cases where the postmaster > and a backend can come to different conclusions about whether a setting > is valid (eg, because it depends on database encoding). Whether that's > a bug or not isn't completely clear, but if this patch is critically > dependent on the situation never happening, I don't think we can accept > it. Does that mean that some backends might currently choose to ignore an updated config file wholesale on SIGUP (because some settings are invalid) while others happily apply it? Meaning that they'll afterwards disagree even on modified settings which *would* be valid for both backends? Or do these kinds of setting rejections happen late enough to fall out of the all-or-nothing logic in ProcessConfigFile? Anyway, the patch *doesn't* depend on all backends's setting being in sync. The issue we were discussion was whether it's OK to add another small risk of them getting out of sync by aborting early on errors in backends but not in the postmaster. I was arguing against that, while Alexey was in favour of it, on the grounds that it reduces log traffic (but only at DEBUG2 or beyond). best regards, Florian Pflug
Florian Pflug <fgp@phlo.org> writes: > On Jun20, 2011, at 18:16 , Tom Lane wrote: >> This is already known to happen: there are cases where the postmaster >> and a backend can come to different conclusions about whether a setting >> is valid (eg, because it depends on database encoding). Whether that's >> a bug or not isn't completely clear, but if this patch is critically >> dependent on the situation never happening, I don't think we can accept >> it. > Does that mean that some backends might currently choose to ignore an > updated config file wholesale on SIGUP (because some settings are invalid) > while others happily apply it? Meaning that they'll afterwards disagree > even on modified settings which *would* be valid for both backends? Yes. I complained about that before: http://archives.postgresql.org/pgsql-hackers/2011-04/msg00330.php but we didn't come to any consensus about fixing it. This patch might be a good vehicle for revisiting the issue, though. regards, tom lane
On Jun 20, 2011, at 6:22 PM, Florian Pflug wrote: > On Jun20, 2011, at 17:02 , Alexey Klyukin wrote: >> >> I don't think it has changed at all. Previously, we did goto cleanup_list (or >> cleanup_exit in ParseConfigFp) right after the first error, no matter whether >> that was a postmaster or its child. What I did in my patch is removing the >> goto for the postmaster's case. It was my intention to exit after the initial >> error for the postmaster's child, to avoid complaining about all errors both >> in the postmaster and in the normal backend (imagine seeing 100 errors from >> the postmaster and the same 100 from each of the backends if your log level is >> DEBUG2). I think the postmaster's child case won't cause any problems, since >> we do exactly what we used to do before. > > Hm, I think you miss-understood what I was trying to say, probably because I > explained it badly. Let me try again. > > I fully agree that there *shouldn't* be any difference in behaviour, because > it *shouldn't* matter whether we abort early or not - we won't have applied > any of the settings anway. > > But. > > The code the actually implements the "check settings first, apply later" logic > isn't easy to read. Now, assume that this code has a bug. Then, with your > patch applied, we might end up with the postmaster applying a setting (because > it didn't abort early) but the backend ignoring it (because they did abort early). > This is obviously bad. Depending on the setting, the consequences may range > from slightly confusing behaviour to outright crashes I guess... > > Now, the risk of that happening might be very small. But on the other hand, > the benefit is pretty small also - you get a little less output for log level > DEBUG2, that's it. A level that people probably don't use for the production > databases anyway. This convinced me that the risk/benefit ratio isn't high enough > to warrant the early abort. > > Another benefit of removing the check is that it reduces code complexity. Maybe > not when measured in line counts, but it's one less outside factor that changes > ProcessConfigFiles()'s behaviour and thus one thing less you need to think when > you modify that part again in the future. Again, it's a small benefit, but IMHO > it still outweights the benefit. While I agree that making the code potentially less bug prone is a good idea, I don't agree with the statement that since DEBUG2 output gets rarely turned on we can make it less usable, hoping that nobody would use in production. > > Having said that, this is my personal opinion and whoever will eventually > commit this may very will assess the cost/benefit ratio differently. So, if > after this more detailed explanations of my reasoning, you still feel that > it makes sense to keep the early abort, then feel free to mark the > patch "Ready for Committer" nevertheless. I'd say that this issue is a judgement call. Depending on a point of view, both arguments are valid. I've marked this patch as 'Ready for Committer' w/o removing the early abort stuff, but if there will be more people willing to remove them - I'll do that. Thank you, Alexey. -- Command Prompt, Inc. http://www.CommandPrompt.com PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Excerpts from Alexey Kluykin's message of mar jun 21 07:43:02 -0400 2011: > > Another benefit of removing the check is that it reduces code complexity. Maybe > > not when measured in line counts, but it's one less outside factor that changes > > ProcessConfigFiles()'s behaviour and thus one thing less you need to think when > > you modify that part again in the future. Again, it's a small benefit, but IMHO > > it still outweights the benefit. > > While I agree that making the code potentially less bug prone is a good idea, > I don't agree with the statement that since DEBUG2 output gets rarely turned > on we can make it less usable, hoping that nobody would use in production. I would have sided with Florian on this issue, but Tom commented otherwise somewhere, and I think his rationale makes sense, so I left the patch as Alexey had it. I also tweaked the patch so that it really stops processing after 100 errors, say if you include file A which has 50 errors and then file B which has another 50 errors; instead of looking for 98 more errors, we just give up at that point. It makes more sense to me anyway, and it doesn't seem to add any code complexity. Also, if you have already seen 98 errors, it doesn't make sense to let another file contain 100 errors more, so with this version, that 98 is passed down so that only 2 more errors are allowed. I had to touch the other callers of ParseConfigFile but it is clean enough. I also made the code barf when bailing out of one of those loops. One strange thing here is that you could get two such messages; say if a file has 100 parse errors and there are also valid lines that contain bogus settings (foo = bar). I don't find this to be too problematic, and I think fixing it would be excessively annoying. For example, a bogus run would end like this: 95 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" line 4, near end of line 96 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" line 41, near end of line 97 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" line 104, near end of line 98 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" line 156, near end of line 99 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" line 208, near end of line 100 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" line 260, near end of line 101 LOG: too many errors found, stopped processing file "/pgsql/install/HEAD/data/postgresql.conf" 102 LOG: unrecognized configuration parameter "plperl.err" 103 LOG: unrecognized configuration parameter "this1" 104 LOG: too many errors found, stopped processing file "/pgsql/install/HEAD/data/postgresql.conf" 105 FATAL: errors detected while parsing configuration files One thing I don't like is that those "unrecognized configuration parameter" messages don't say what file those parameters were found in. That's material for a different patch however. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Attachment
On Jul14, 2011, at 01:38 , Alvaro Herrera wrote: > One strange thing here is that you could get two such messages; say if a > file has 100 parse errors and there are also valid lines that contain > bogus settings (foo = bar). I don't find this to be too problematic, > and I think fixing it would be excessively annoying. > > For example, a bogus run would end like this: > > 95 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" line 4, near end of line > 96 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" line 41, near end of line > 97 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" line 104, near end of line > 98 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" line 156, near end of line > 99 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" line 208, near end of line > 100 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" line 260, near end of line > 101 LOG: too many errors found, stopped processing file "/pgsql/install/HEAD/data/postgresql.conf" > 102 LOG: unrecognized configuration parameter "plperl.err" > 103 LOG: unrecognized configuration parameter "this1" > 104 LOG: too many errors found, stopped processing file "/pgsql/install/HEAD/data/postgresql.conf" > 105 FATAL: errors detected while parsing configuration files How about changing ParseConfigFile to say "too many *syntax* error found" instead? It'd be more precise, and we wouldn't emit exactly the same message twice. Do you want me to take a closer look at your modified version of the patch before you commit, or did you post it more as a "FYI, this is how it's going to look like"? best regards, Florian Pflug
Excerpts from Florian Pflug's message of mié jul 13 20:12:28 -0400 2011: > On Jul14, 2011, at 01:38 , Alvaro Herrera wrote: > > One strange thing here is that you could get two such messages; say if a > > file has 100 parse errors and there are also valid lines that contain > > bogus settings (foo = bar). I don't find this to be too problematic, > > and I think fixing it would be excessively annoying. > > > > For example, a bogus run would end like this: > > > > 95 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" line 4, near end of line > > 96 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" line 41, near end of line > > 97 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" line 104, near end of line > > 98 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" line 156, near end of line > > 99 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" line 208, near end of line > > 100 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" line 260, near end of line > > 101 LOG: too many errors found, stopped processing file "/pgsql/install/HEAD/data/postgresql.conf" > > 102 LOG: unrecognized configuration parameter "plperl.err" > > 103 LOG: unrecognized configuration parameter "this1" > > 104 LOG: too many errors found, stopped processing file "/pgsql/install/HEAD/data/postgresql.conf" > > 105 FATAL: errors detected while parsing configuration files > > How about changing ParseConfigFile to say "too many *syntax* error found" > instead? It'd be more precise, and we wouldn't emit exactly the > same message twice. Yeah, I thought about doing it that way but refrained because it'd be one more string to translate. That's a poor reason, I admit :-) I'll change it. > Do you want me to take a closer look at your modified version of the > patch before you commit, or did you post it more as a "FYI, this is > how it's going to look like"? I know I'd feel more comfortable if you (and Alexey, and Selena) gave it another look :-) -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Jul 14, 2011, at 4:38 AM, Alvaro Herrera wrote: > Excerpts from Florian Pflug's message of mié jul 13 20:12:28 -0400 2011: >> On Jul14, 2011, at 01:38 , Alvaro Herrera wrote: >>> One strange thing here is that you could get two such messages; say if a >>> file has 100 parse errors and there are also valid lines that contain >>> bogus settings (foo = bar). I don't find this to be too problematic, >>> and I think fixing it would be excessively annoying. >>> >>> For example, a bogus run would end like this: >>> >>> 95 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" line 4, near end of line >>> 96 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" line 41, near end of line >>> 97 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" line 104, near end of line >>> 98 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" line 156, near end of line >>> 99 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" line 208, near end of line >>> 100 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" line 260, near end of line >>> 101 LOG: too many errors found, stopped processing file "/pgsql/install/HEAD/data/postgresql.conf" >>> 102 LOG: unrecognized configuration parameter "plperl.err" >>> 103 LOG: unrecognized configuration parameter "this1" >>> 104 LOG: too many errors found, stopped processing file "/pgsql/install/HEAD/data/postgresql.conf" >>> 105 FATAL: errors detected while parsing configuration files >> >> How about changing ParseConfigFile to say "too many *syntax* error found" >> instead? It'd be more precise, and we wouldn't emit exactly the >> same message twice. > > Yeah, I thought about doing it that way but refrained because it'd be > one more string to translate. That's a poor reason, I admit :-) I'll > change it. This is happening because a check for total number of errors so far is happening only after coming across at least one non-recognizedconfiguration option. What about adding one more check right after ParseConfigFile, so we can bail out earlywhen overwhelmed with syntax errors? This would save a line in translation :). > >> Do you want me to take a closer look at your modified version of the >> patch before you commit, or did you post it more as a "FYI, this is >> how it's going to look like"? > > I know I'd feel more comfortable if you (and Alexey, and Selena) gave it > another look :-) I have checked it here and don't see any more problems with it. -- Command Prompt, Inc. http://www.CommandPrompt.com PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Excerpts from Alexey Kluykin's message of jue jul 14 09:18:15 -0400 2011: > > On Jul 14, 2011, at 4:38 AM, Alvaro Herrera wrote: > > >> On Jul14, 2011, at 01:38 , Alvaro Herrera wrote: > This is happening because a check for total number of errors so far is happening only after coming across at least onenon-recognized configuration option. What about adding one more check right after ParseConfigFile, so we can bail outearly when overwhelmed with syntax errors? This would save a line in translation :). Actually I think it would make sense to do it completely the other way around: reset the number of errors to zero before starting to apply the values. The rationale is that all the settings that made it past the tokenisation are completely different lines for which the errors were reported earlier. > > I know I'd feel more comfortable if you (and Alexey, and Selena) gave it > > another look :-) > > I have checked it here and don't see any more problems with it. Thanks. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > Excerpts from Alexey Kluykin's message of jue jul 14 09:18:15 -0400 2011: >> This is happening because a check for total number of errors so far >> is happening only after coming across at least one non-recognized >> configuration option. What about adding one more check right after >> ParseConfigFile, so we can bail out early when overwhelmed with >> syntax errors? This would save a line in translation :). > Actually I think it would make sense to do it completely the other way > around: reset the number of errors to zero before starting to apply the > values. The rationale is that all the settings that made it past the > tokenisation are completely different lines for which the errors were > reported earlier. I'd like to re-open the discussion from here http://archives.postgresql.org/pgsql-hackers/2011-06/msg01741.php http://archives.postgresql.org/pgsql-hackers/2011-04/msg00330.php specifically about this concern: >>> Does that mean that some backends might currently choose to ignore an >>> updated config file wholesale on SIGUP (because some settings are invalid) >>> while others happily apply it? Meaning that they'll afterwards disagree >>> even on modified settings which *would* be valid for both backends? I complained about exactly that point back in April, but at the time people (quite reasonably) didn't want to touch the behavior. Now that we are entering a new development cycle, it's time to reconsider. Particularly so if we're considering a patch that touches the behavior already. I think that it might be sensible to have the following behavior: 1. Parse the file, where "parse" means collect all the name = value pairs. Bail out if we find any syntax errors at that level of detail. (With this patch, we could report some or all of the syntax errors first.) 2. Tentatively apply the new custom_variable_classes setting if any. 3. Check to see whether all the "name"s are valid. If not, report the ones that aren't and bail out. 4. Apply each "value". If some of them aren't valid, report that, but continue, and apply all the ones that are valid. We can expect that the postmaster and all backends will agree on the results of steps 1 through 3. They might differ as to the validity of individual values in step 4 (as per my example of a setting that depends on database_encoding), but we should never end up with a situation where a globally correct value is not globally applied. The original argument for the current behavior was to avoid applying settings from a thoroughly munged config file, but I think that the checks involved in steps 1-3 would be sufficient to reject files that had major problems. It's possible that step 1 is really sufficient to cover the issue, in which case we could drop the separate step-3 pass and just treat invalid GUC names as a reason to ignore the particular line rather than the whole file. That would make things simpler and faster, and maybe less surprising too. Comments? regards, tom lane
I wrote: > I think that it might be sensible to have the following behavior: > 1. Parse the file, where "parse" means collect all the name = value > pairs. Bail out if we find any syntax errors at that level of detail. > (With this patch, we could report some or all of the syntax errors > first.) > 2. Tentatively apply the new custom_variable_classes setting if any. > 3. Check to see whether all the "name"s are valid. If not, report > the ones that aren't and bail out. > 4. Apply each "value". If some of them aren't valid, report that, > but continue, and apply all the ones that are valid. > We can expect that the postmaster and all backends will agree on the > results of steps 1 through 3. They might differ as to the validity > of individual values in step 4 (as per my example of a setting that > depends on database_encoding), but we should never end up with a > situation where a globally correct value is not globally applied. I thought some more about this, and it occurred to me that it's not that hard to foresee a situation where different backends might have different opinions about the results of step 3, ie, different ideas about the set of valid GUC names. This could arise as a result of some of them having a particular extension module loaded and others not. Right now, whether or not you have say plpgsql loaded will not affect your ability to do "SET plpgsql.junk = foobar" --- as long as "plpgsql" is listed in custom_variable_classes, we'll accept the command and create a placeholder variable for plpgsql.junk. But it seems perfectly plausible that we might someday try to tighten that up so that once a module has done EmitWarningsOnPlaceholders("plpgsql"), we'll no longer allow creation of new placeholders named plpgsql.something. If we did that, we could no longer assume that all backends agree on the set of legal GUC variable names. So that seems like an argument --- not terribly strong, but still an argument --- for doing what I suggested next: > The original argument for the current behavior was to avoid applying > settings from a thoroughly munged config file, but I think that the > checks involved in steps 1-3 would be sufficient to reject files that > had major problems. It's possible that step 1 is really sufficient to > cover the issue, in which case we could drop the separate step-3 pass > and just treat invalid GUC names as a reason to ignore the particular > line rather than the whole file. That would make things simpler and > faster, and maybe less surprising too. IOW, I'm now pretty well convinced that so long as the configuration file is syntactically valid, we should go ahead and attempt to apply each name = value setting individually, without allowing the invalidity of any one name or value to prevent others from being applied. regards, tom lane
On Jul16, 2011, at 20:55 , Tom Lane wrote: >> The original argument for the current behavior was to avoid applying >> settings from a thoroughly munged config file, but I think that the >> checks involved in steps 1-3 would be sufficient to reject files that >> had major problems. It's possible that step 1 is really sufficient to >> cover the issue, in which case we could drop the separate step-3 pass >> and just treat invalid GUC names as a reason to ignore the particular >> line rather than the whole file. That would make things simpler and >> faster, and maybe less surprising too. > > IOW, I'm now pretty well convinced that so long as the configuration > file is syntactically valid, we should go ahead and attempt to apply > each name = value setting individually, without allowing the invalidity > of any one name or value to prevent others from being applied. One benefit of this would be that it'd make the logic of ProcessConfigFile and its interaction with set_config_option() much simpler, and the behaviour more predictable. Given that it took me a while to work through the interactions of these two functions, I all for that. On the downside, the current behaviour prevents problems if someone changes two interrelated GUCs, but makes a mistake at one of them. For example, someone might drastically lower bgwriter_delay but might botch the matching adjustment of bgwriter_lru_maxpages. Not sure if that out-weights the benefits, but I thought I'd bring it up. best regards, Florian Pflug
Florian Pflug <fgp@phlo.org> writes: > On the downside, the current behaviour prevents problems if someone changes > two interrelated GUCs, but makes a mistake at one of them. For example, > someone might drastically lower bgwriter_delay but might botch the matching > adjustment of bgwriter_lru_maxpages. That's a fair point, but the current behavior only saves you if the botch is such that the new value is detectably invalid, as opposed to say just a factor of 100 off from what you meant. Not sure that that's all that helpful. regards, tom lane
On Jul16, 2011, at 21:23 , Tom Lane wrote: > Florian Pflug <fgp@phlo.org> writes: >> On the downside, the current behaviour prevents problems if someone changes >> two interrelated GUCs, but makes a mistake at one of them. For example, >> someone might drastically lower bgwriter_delay but might botch the matching >> adjustment of bgwriter_lru_maxpages. > > That's a fair point, but the current behavior only saves you if the > botch is such that the new value is detectably invalid, as opposed to > say just a factor of 100 off from what you meant. Not sure that that's > all that helpful. True. And a forgotten zero or wrong unit probably is even more likely than a totally botched value. So +1 from me. Btw, if we touch that, I think we should think about providing some way to detect when a backend fails to apply a value. Showing the precise option that caused the trouble is probably hard, but could we add a flag to PGPROC that gets set once a backend fails to apply some setting on SIGUP? If we showed the state of such a flag in pg_stat_activity, it'd give an admin a quick way of verifying that all is well after a SIGUP. We might also want to record the timestamp of the last processed file so that backends which haven't yet processed the SIHUP can also be detected. best regards, Florian Pflug
Florian Pflug <fgp@phlo.org> writes: > Btw, if we touch that, I think we should think about providing some way > to detect when a backend fails to apply a value. Hm, maybe, but keep in mind that there are valid reasons for a backend to ignore a postgresql.conf setting --- in particular, it might have a local override from some flavor of SET command. So I don't think we'd want the flag to have the semantics of "this backend is actually *using* the value"; and yet, if that's not what it means, people could still be confused. There might be some implementation gotchas as well. I'm not sure offhand how thoroughly the GUC code checks a value that is being overridden. regards, tom lane
On Jul16, 2011, at 22:55 , Tom Lane wrote: > Florian Pflug <fgp@phlo.org> writes: >> Btw, if we touch that, I think we should think about providing some way >> to detect when a backend fails to apply a value. > > Hm, maybe, but keep in mind that there are valid reasons for a backend > to ignore a postgresql.conf setting --- in particular, it might have a > local override from some flavor of SET command. So I don't think we'd > want the flag to have the semantics of "this backend is actually *using* > the value"; Yeah, the flag would simply indicate whether a particular backend encountered an error during config file reload or not. Actually being able to inspect other backend's GUCs would be nice, but is way beyond the scope of this of course. > and yet, if that's not what it means, people could still be > confused. Hm, if it's called "cfgfile_valid" or a prettier version thereof I think the risk is small. > There might be some implementation gotchas as well. I'm not > sure offhand how thoroughly the GUC code checks a value that is being > overridden. If it doesn't, then what happens when the overriding scope ends, and the value reverts (or attempts to revert) to its default? best regards, Florian Pflug
On Fri, Jul 15, 2011 at 8:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > 2. Tentatively apply the new custom_variable_classes setting if any. Is there any way that we could get *rid* of custom_variable_classes? The idea of using a GUC to define the set of valid GUCs seems intrinsically problematic. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Fri, Jul 15, 2011 at 8:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> 2. Tentatively apply the new custom_variable_classes setting if any. > Is there any way that we could get *rid* of custom_variable_classes? > The idea of using a GUC to define the set of valid GUCs seems > intrinsically problematic. Well, we could just drop it and say you can set any dotted-name GUC you feel like. The only reason to have it is to have some modicum of error checking ... but I'm not sure why we should bother if there's no checking on the second half of the name. Not sure if that's going too far in the laissez-faire direction, though. I can definitely imagine people complaining "I set plpqsql.variable_conflict in postgresql.conf and it didn't do anything, how come?" regards, tom lane
On Sat, Jul 16, 2011 at 10:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Fri, Jul 15, 2011 at 8:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> 2. Tentatively apply the new custom_variable_classes setting if any. > >> Is there any way that we could get *rid* of custom_variable_classes? >> The idea of using a GUC to define the set of valid GUCs seems >> intrinsically problematic. > > Well, we could just drop it and say you can set any dotted-name GUC > you feel like. The only reason to have it is to have some modicum of > error checking ... but I'm not sure why we should bother if there's no > checking on the second half of the name. Not sure if that's going too > far in the laissez-faire direction, though. I can definitely imagine > people complaining "I set plpqsql.variable_conflict in postgresql.conf > and it didn't do anything, how come?" I'm not sure that's really making anything any better. Maybe I'm misunderstanding, but if that's going to be a problem, then presumably this will create the same problem: custom_variable_classes='plpgsql' plpgsql.variable_conflict='on' ...and the fact that we've made them set an extra GUC to shoot themselves in the foot hardly seems like an improvement. I was more thinking along the lines of having loadable modules register custom variable classes at load time, through some sort of C API provided for that purpose, rather than having the user declare a list that may or may not match what the .so files really care about. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Sat, Jul 16, 2011 at 10:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Robert Haas <robertmhaas@gmail.com> writes: >>> Is there any way that we could get *rid* of custom_variable_classes? >> Well, we could just drop it and say you can set any dotted-name GUC >> you feel like. > ...and the fact that we've made them set an extra GUC to shoot > themselves in the foot hardly seems like an improvement. I was more > thinking along the lines of having loadable modules register custom > variable classes at load time, through some sort of C API provided for > that purpose, rather than having the user declare a list that may or > may not match what the .so files really care about. Well, we *do* have a C API for that, of a sort. The problem is, what do you do in processes that have not loaded the relevant extension? (And no, I don't like the answer of "let's force the postmaster to load every extension we want to set any parameters for".) I agree custom_variable_classes is conceptually messy, but it's a reasonably lightweight compromise that gives some error checking without requiring a lot of possibly-irrelevant extensions to be loaded into every postgres process. regards, tom lane
On Sun, Jul 17, 2011 at 12:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Sat, Jul 16, 2011 at 10:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Robert Haas <robertmhaas@gmail.com> writes: >>>> Is there any way that we could get *rid* of custom_variable_classes? > >>> Well, we could just drop it and say you can set any dotted-name GUC >>> you feel like. > >> ...and the fact that we've made them set an extra GUC to shoot >> themselves in the foot hardly seems like an improvement. I was more >> thinking along the lines of having loadable modules register custom >> variable classes at load time, through some sort of C API provided for >> that purpose, rather than having the user declare a list that may or >> may not match what the .so files really care about. > > Well, we *do* have a C API for that, of a sort. The problem is, what do > you do in processes that have not loaded the relevant extension? (And > no, I don't like the answer of "let's force the postmaster to load every > extension we want to set any parameters for".) > > I agree custom_variable_classes is conceptually messy, but it's a > reasonably lightweight compromise that gives some error checking without > requiring a lot of possibly-irrelevant extensions to be loaded into > every postgres process. Hmm. Maybe what we need is a mechanism that allows the configuration to be associated a loadable module, and whenever that module is loaded, we also load the associated configuration settings. This is probably terribly syntax, but something like: ALTER LOAD 'plpgsql' SET plpgsql.variable_conflict = 'whatever'; AFAICS, that would remove the need to set variables in postgresql.conf that can't be validated, and so we could just disallow it. OTOH, maybe that's more trouble than can be justified by the size of the problem. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Jul18, 2011, at 01:29 , Robert Haas wrote: > Hmm. Maybe what we need is a mechanism that allows the configuration > to be associated a loadable module, and whenever that module is > loaded, we also load the associated configuration settings. This is > probably terribly syntax, but something like: > > ALTER LOAD 'plpgsql' SET plpgsql.variable_conflict = 'whatever'; A variant of this would be to allow extensions (in the CREATE EXTENSION sense) to declare custom GUCs in their control file. Then we'd only need to load those files, which seems better than loading a shared library. We do expect people to wrap their loadable modules in extensions anyway nowadays, do we? best regards, Florian Pflug
2011/7/18 Florian Pflug <fgp@phlo.org>: > On Jul18, 2011, at 01:29 , Robert Haas wrote: >> Hmm. Maybe what we need is a mechanism that allows the configuration >> to be associated a loadable module, and whenever that module is >> loaded, we also load the associated configuration settings. This is >> probably terribly syntax, but something like: >> >> ALTER LOAD 'plpgsql' SET plpgsql.variable_conflict = 'whatever'; > > A variant of this would be to allow extensions (in the CREATE EXTENSION > sense) to declare custom GUCs in their control file. Then we'd only > need to load those files, which seems better than loading a shared > library. +1 Pavel > > We do expect people to wrap their loadable modules in extensions > anyway nowadays, do we? > > best regards, > Florian Pflug > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers >
Robert Haas <robertmhaas@gmail.com> writes: > On Sun, Jul 17, 2011 at 12:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I agree custom_variable_classes is conceptually messy, but it's a >> reasonably lightweight compromise that gives some error checking without >> requiring a lot of possibly-irrelevant extensions to be loaded into >> every postgres process. > Hmm. Maybe what we need is a mechanism that allows the configuration > to be associated a loadable module, and whenever that module is > loaded, we also load the associated configuration settings. This is > probably terribly syntax, but something like: > ALTER LOAD 'plpgsql' SET plpgsql.variable_conflict = 'whatever'; > AFAICS, that would remove the need to set variables in postgresql.conf > that can't be validated, and so we could just disallow it. No, that only fixes things for the case of setting a variable in the control file. It isn't useful for ALTER ROLE/DATABASE SET. And it still has the problem of forcing every process to load every extension, and as written it would also require the postmaster to read catalogs. > OTOH, maybe that's more trouble than can be justified by the size of > the problem. Yeah. regards, tom lane
Tom, Florian, >> On the downside, the current behaviour prevents problems if someone changes >> two interrelated GUCs, but makes a mistake at one of them. For example, >> someone might drastically lower bgwriter_delay but might botch the matching >> adjustment of bgwriter_lru_maxpages. > > That's a fair point, but the current behavior only saves you if the > botch is such that the new value is detectably invalid, as opposed to > say just a factor of 100 off from what you meant. Not sure that that's > all that helpful. Hmmm. As someone who often deploys pg.conf changes as part of a production code rollout, I actually like the "atomic" nature of updating postgresql.conf -- that is, all your changes succeed, or they all fail. If we add this feature, I'd want there to be an option which allows getting the current all-or-none behavior. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On sön, 2011-07-17 at 00:59 -0400, Tom Lane wrote: > Well, we *do* have a C API for that, of a sort. The problem is, what > do you do in processes that have not loaded the relevant extension? Those processes that have the extension loaded check the parameter settings in their namespace, those that don't ignore them.
Peter Eisentraut <peter_e@gmx.net> writes: > On sön, 2011-07-17 at 00:59 -0400, Tom Lane wrote: >> Well, we *do* have a C API for that, of a sort. The problem is, what >> do you do in processes that have not loaded the relevant extension? > Those processes that have the extension loaded check the parameter > settings in their namespace, those that don't ignore them. Then you don't have any meaningful reporting of whether you have entered valid values --- particularly not with the policy that only the postmaster makes logfile entries about bad values. It'd work but I don't think it's tremendously user-friendly. regards, tom lane
Josh Berkus <josh@agliodbs.com> writes: > Hmmm. As someone who often deploys pg.conf changes as part of a > production code rollout, I actually like the "atomic" nature of updating > postgresql.conf -- that is, all your changes succeed, or they all fail. If we actually *had* that, I'd agree with you. The problem is that it appears that we have such a behavior, but it fails to work that way in corner cases. My proposal is aimed at making the corner cases less corner-y, by adopting a uniform rule that each backend adopts all the changes it can. regards, tom lane
Florian Pflug <fgp@phlo.org> writes: > A variant of this would be to allow extensions (in the CREATE EXTENSION > sense) to declare custom GUCs in their control file. Then we'd only > need to load those files, which seems better than loading a shared > library. The original patch for the extensions had that feature. It had to be removed, though. The control file value has to be stored in the catalogs and now only backends can look that up once connected to a database. This part worked well. http://git.postgresql.org/gitweb/?p=postgresql-extension.git;a=commit;h=480fa10f4ec7b495cf4f434e6f44a50b94487326 http://git.postgresql.org/gitweb/?p=postgresql-extension.git;a=commit;h=98d802aa1ee12c77c5270c7bc9ed7479360f35b9 IIRC the problem is that now, the postmaster is not seeing cvc at all, and you can have there some custom parameters to set shared memory segments, which only postmaster will take care of. An example of that is the pg_stat_statements contrib. I would love that we find a way around that, btw. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Jul 16, 2011, at 9:55 PM, Tom Lane wrote: > I wrote: >> I think that it might be sensible to have the following behavior: > >> 1. Parse the file, where "parse" means collect all the name = value >> pairs. Bail out if we find any syntax errors at that level of detail. >> (With this patch, we could report some or all of the syntax errors >> first.) > >> 2. Tentatively apply the new custom_variable_classes setting if any. > >> 3. Check to see whether all the "name"s are valid. If not, report >> the ones that aren't and bail out. > >> 4. Apply each "value". If some of them aren't valid, report that, >> but continue, and apply all the ones that are valid. > >> We can expect that the postmaster and all backends will agree on the >> results of steps 1 through 3. They might differ as to the validity >> of individual values in step 4 (as per my example of a setting that >> depends on database_encoding), but we should never end up with a >> situation where a globally correct value is not globally applied. > Attached is my first attempt to implement your plan. Basically, I've reshuffled pieces of the ProcessConfigFile on top of my previous patch, dropped verification calls of set_config_option and moved the check for custom_variable_class existence right inside the loop that assigns new values to GUC variables. I'd think that removal of custom_variable_classes or setting it from the extensions could be a separate patch. I appreciate your comments and suggestions. > I thought some more about this, and it occurred to me that it's not that > hard to foresee a situation where different backends might have > different opinions about the results of step 3, ie, different ideas > about the set of valid GUC names. This could arise as a result of some > of them having a particular extension module loaded and others not. > > Right now, whether or not you have say plpgsql loaded will not affect > your ability to do "SET plpgsql.junk = foobar" --- as long as "plpgsql" > is listed in custom_variable_classes, we'll accept the command and > create a placeholder variable for plpgsql.junk. But it seems perfectly > plausible that we might someday try to tighten that up so that once a > module has done EmitWarningsOnPlaceholders("plpgsql"), we'll no longer > allow creation of new placeholders named plpgsql.something. If we did > that, we could no longer assume that all backends agree on the set of > legal GUC variable names. > > So that seems like an argument --- not terribly strong, but still an > argument --- for doing what I suggested next: > >> The original argument for the current behavior was to avoid applying >> settings from a thoroughly munged config file, but I think that the >> checks involved in steps 1-3 would be sufficient to reject files that >> had major problems. It's possible that step 1 is really sufficient to >> cover the issue, in which case we could drop the separate step-3 pass >> and just treat invalid GUC names as a reason to ignore the particular >> line rather than the whole file. That would make things simpler and >> faster, and maybe less surprising too. > > IOW, I'm now pretty well convinced that so long as the configuration > file is syntactically valid, we should go ahead and attempt to apply > each name = value setting individually, without allowing the invalidity > of any one name or value to prevent others from being applied. -- Command Prompt, Inc. http://www.CommandPrompt.com PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Attachment
Hi Alexey, I was taking a quick look at this patch, and have a question for ya. I have a default config from initdb, there is a new setting at the end but its commented out. root@storm: /db/pg92 # /etc/rc.d/postgresql start Starting PostgreSQL: root@storm: /db/pg92 # more serverlog LOG: database system was shut down at 2011-09-06 22:30:17 CDT LOG: database system is ready to accept connections LOG: autovacuum launcher started root@storm: /db/pg92 # /etc/rc.d/postgresql reload Reload PostgreSQL: No directory, logging in with HOME=/ root@storm: /db/pg92 # more serverlog LOG: database system was shut down at 2011-09-06 22:30:17 CDT LOG: database system is ready to accept connections LOG: autovacuum launcher started LOG: received SIGHUP, reloading configuration files WARNING: errors detected while parsing configuration files WARNING: errors detected while parsing configuration files WARNING: errors detected while parsing configuration files WARNING: errors detected while parsing configuration files WARNING: errors detected while parsing configuration files I didn't edit the config, it was fine at startup, so why does reload upset it so? Also, what line is the warning for? If I edit postgresql.conf and just add bob at the last line, then reload: LOG: received SIGHUP, reloading configuration files LOG: syntax error in file "/db/pg92/postgresql.conf" line 570, near end of line FATAL: errors detected while parsing configuration files Where did the other warnings go? Its right though, line 570 is bad. It also seems to have killed the server. I have notgotten through the history of messages regarding this patch, but is it supposed to kill the server if there is a syntaxerror in the config file? -Andy
Andy Colson <andy@squeakycode.net> writes: > Where did the other warnings go? Its right though, line 570 is bad. It also seems to have killed the server. I havenot gotten through the history of messages regarding this patch, but is it supposed to kill the server if there is asyntax error in the config file? The historical behavior is that a configuration file error detected during postmaster startup should prevent the server from starting, but an error detected during reload should only result in a LOG message and the reload not occurring. I don't believe anyone will accept a patch that causes the server to quit on a failed reload. There has however been some debate about the exact extent of ignoring bad values during reload --- currently the theory is "ignore the whole file if anything is wrong", but there's some support for applying all non-bad values as long as the overall file syntax is okay. regards, tom lane
Hello, On Sep 7, 2011, at 5:00 PM, Tom Lane wrote: > Andy Colson <andy@squeakycode.net> writes: >> Where did the other warnings go? Its right though, line 570 is bad. It also seems to have killed the server. I havenot gotten through the history of messages regarding this patch, but is it supposed to kill the server if there is asyntax error in the config file? > > The historical behavior is that a configuration file error detected > during postmaster startup should prevent the server from starting, but > an error detected during reload should only result in a LOG message and > the reload not occurring. I don't believe anyone will accept a patch > that causes the server to quit on a failed reload. There has however > been some debate about the exact extent of ignoring bad values during > reload --- currently the theory is "ignore the whole file if anything is > wrong", but there's some support for applying all non-bad values as long > as the overall file syntax is okay. This patch actually aims to do the latter, applying all good values and reporting the bad ones. It removes the calls to set_config_optionwith changeVal = false from ProcessConfigFile, trying to apply every option at once and incrementing thewarnings counter if set_config_option returns false. After some investigation I've discovered that set_config_option returnsfalse even if a variable value is unchanged, but is applied in the wrong GucContext. The particular case is tryingto reload the configuration file variables during SIGHUP: if, say, shared_buffers are commented out, the call to set_config_optionwith changeVal = true will return false due to this code: > if (prohibitValueChange) > { > if (*conf->variable != newval) > ereport(elevel, > (errcode(ERRCODE_CANT_CHANGE_RUNTIME_PARAM), > errmsg("parameter \"%s\" cannot be changed without restarting the server", > name))); > return false; > } > Due to the above, set_config_option returns false for unchanged PGC_POSTMASTER variables during SIGHUP, so it's impossibleto distinguish between valid and non valid values and report the latter ones with a single call of this functionper var. What do you think about changing the code above to return true if the variable is actually unchanged? This explains the WARNINGs emitted during reload even for a pristine configuration file right after the installation. I'mlooking into why the server gets killed during reload if there is a parse error, it might be as well related to the problemabove. -- Alexey Klyukin http://www.commandprompt.com The PostgreSQL Company – Command Prompt, Inc.
Hi Andy, On Sep 7, 2011, at 6:40 AM, Andy Colson wrote: > Hi Alexey, I was taking a quick look at this patch, and have a question for ya. > ... > Where did the other warnings go? Its right though, line 570 is bad. It also seems to have killed the server. I havenot gotten through the history of messages regarding this patch, but is it supposed to kill the server if there is asyntax error in the config file? Thank you for looking at this patch. v4 was more a "what if" concept that took a lot of time for somebody to look at. Therewere a lot of problems with it, but I think I've nailed down most of them. Attached is v5. It should fix both problems you've experienced with v4. As with the current code, the startup process getsinterrupted on any error detected in the configuration file. Unlike the current code, the patch tries to report all ofthem before bailing out. The behavior during configuration reload has changed significantly. Instead of ignoring all changesafter the first error, the code reports the problematic value and continues. It only skips applying new values completelyon syntax errors and invalid configuration option names. In no cases should it bring the server down during reload. One problem I'm not sure how to address is the fact that we require 2 calls of set_config_option for each option, one toverify the new value and another to actually apply it. Normally, this function returns true for a valid value and falseif it is unable to set the new value for whatever reason (either the value is invalid, or the value cannot be changedin the context of the caller). However, calling it once per value in the 'apply' mode during reload produces falsefor every unchanged value that can only be changed during startup (i.e. shared_buffers, or max_connections). If we ignoreits return value, we'll lose the ability to detect whether invalid values were encountered during the reload and reportthis fact at the end of the function. I think the function should be changed, as described in my previous email (http://archives.postgresql.org/message-id/97A66029-9D3E-43AF-95AA-46FE1B447447@commandprompt.com)and I'd like to hear otheropinions on that. Meanwhile, due to 2 calls to set_config_option, it currently reports all invalid values twice. Ifothers will be opposed to changing the set_config_option, I'll fix this by removing the first, verification call and final'errors were detected' warning to avoid 'false positives' on that (i.e. the WARNING you saw with the previous versionfor the valid .conf). I'd appreciate your further comments and suggestions. Thank you. -- Alexey Klyukin http://www.commandprompt.com The PostgreSQL Company – Command Prompt, Inc.
Attachment
On 09/10/2011 11:39 AM, Alexey Klyukin wrote: > Hi Andy, > > On Sep 7, 2011, at 6:40 AM, Andy Colson wrote: > >> Hi Alexey, I was taking a quick look at this patch, and have a question for ya. >> > ... > >> Where did the other warnings go? Its right though, line 570 is bad. It also seems to have killed the server. I havenot gotten through the history of messages regarding this patch, but is it supposed to kill the server if there is asyntax error in the config file? > > > Thank you for looking at this patch. v4 was more a "what if" concept that took a lot of time for somebody to look at. Therewere a lot of problems with it, but I think I've nailed down most of them. > > Attached is v5. It should fix both problems you've experienced with v4. As with the current code, the startup process getsinterrupted on any error detected in the configuration file. Unlike the current code, the patch tries to report all ofthem before bailing out. The behavior during configuration reload has changed significantly. Instead of ignoring all changesafter the first error, the code reports the problematic value and continues. It only skips applying new values completelyon syntax errors and invalid configuration option names. In no cases should it bring the server down during reload. > > One problem I'm not sure how to address is the fact that we require 2 calls of set_config_option for each option, one toverify the new value and another to actually apply it. Normally, this function returns true for a valid value and falseif it is unable to set the new value for whatever reason (either the value is invalid, or the value cannot be changedin the context of the caller). However, calling it once per value in the 'apply' mode during reload produces falsefor every unchanged value that can only be changed during startup (i.e. shared_buffers, or max_connections). If we ignoreits return value, we'll lose the ability to detect whether invalid values were encountered during the reload and reportthis fact at the end of the function. I think the function should be changed, as described in my previous email (http://archives.postgresql.org/message-id/97A66029-9D3E-43AF-95AA-46FE1B447447@commandprompt.com)and I'd like to hear otheropinions on that. Meanwhile, due t o 2 calls to set_config_option, it currently reports all invalid values twice. If others will be opposed to changing theset_config_option, I'll fix this by removing the first, verification call and final 'errors were detected' warning toavoid 'false positives' on that (i.e. the WARNING you saw with the previous version for the valid .conf). > > I'd appreciate your further comments and suggestions. > > Thank you. > > -- > Alexey Klyukin http://www.commandprompt.com > The PostgreSQL Company – Command Prompt, Inc. > After a quick two minute test, this patch seems to work well. It does just what I think it should. I'll add it to the commitfestpage for ya. -Andy
On ons, 2011-09-07 at 10:00 -0400, Tom Lane wrote: > There has however > been some debate about the exact extent of ignoring bad values during > reload --- currently the theory is "ignore the whole file if anything is > wrong", but there's some support for applying all non-bad values as long > as the overall file syntax is okay. That could be a problem if you have some values that depend on each other, and then you change both of them, but because of an error only one gets applied. I think ACID(-like) changes is a feature, also on this level.
On Sep 12, 2011, at 10:24 PM, Peter Eisentraut wrote: > On ons, 2011-09-07 at 10:00 -0400, Tom Lane wrote: >> There has however >> been some debate about the exact extent of ignoring bad values during >> reload --- currently the theory is "ignore the whole file if anything is >> wrong", but there's some support for applying all non-bad values as long >> as the overall file syntax is okay. > > That could be a problem if you have some values that depend on each > other, and then you change both of them, but because of an error only > one gets applied. I think ACID(-like) changes is a feature, also on > this level. > I think exactly this argument has already been discussed earlier in this thread: http://archives.postgresql.org/message-id/21310D95-EB8D-4B15-A8BC-0F05505C6A34@phlo.org -- Alexey Klyukin http://www.commandprompt.com The PostgreSQL Company – Command Prompt, Inc.
Alexey Klyukin <alexk@commandprompt.com> writes: > Attached is v5. It should fix both problems you've experienced with v4. I've applied this patch after some additional hacking. > One problem I'm not sure how to address is the fact that we require 2 > calls of set_config_option for each option, one to verify the new > value and another to actually apply it. Normally, this function > returns true for a valid value and false if it is unable to set the > new value for whatever reason (either the value is invalid, or the > value cannot be changed in the context of the caller). However, > calling it once per value in the 'apply' mode during reload produces > false for every unchanged value that can only be changed during > startup (i.e. shared_buffers, or max_connections). I thought that the most reasonable solution here was to extend set_config_option to provide a three-valued result: applied, error, or not-applied-for-some-non-error-reason. So I did that, and then ripped out the first set of calls to set_config_option. That should make things a bit faster, as well as eliminating a set of corner cases where the "checking" call succeeds but then the real apply step fails anyway. I also got rid of the changes to ParseConfigFile's API. I thought those were messy and unnecessary, because there's no good reason not to define the parse-error limit as being so many errors per file. It's really only there to prevent the "config file contains War and Peace" syndrome anyway. regards, tom lane