Thread: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Amir Rohan
Date:
Hi, Related SO question from '13: https://stackoverflow.com/questions/16333319/how-to-syntax-check-postgresql-config-files Peter Eisentraut answered: > There is no way to do this that is similar to apache2ctl. If you reload > the configuration files and there is a syntax error, the PostgreSQL > server will complain in the log and refuse to load the new file. > <...> > (Of course, this won't guard you against writing semantically wrong > things, but apache2ctl won't do that either.) So, at least one person in the history of the universe (besides me) has wanted a tool that does this. In addition to a simple syntax check, there's a bunch of "config wisdom" tidbits I've encountering, which is scattered through talks, commit messages, and mailing list discussion, and documentation notes (chapter 17, paragraph 12). These could be collected into a tool that: - Checks your configuration's syntax - Checks for semantically legal values ('read committed' not 'read_committed' ) - Warns of unrecognized keys ("'hot_standby' is not a valid GUC in v9.1"). - Is version-aware. - Serves as an "executable" form of past experience. - Describes the config problem in a structured way (as an expression, for example) - Includes a friendly, semi-verbose, description of the problem. - Describes ways to fix the problem, *right there*. - Is independent from server (but reuses the same parser), particularly any life-cycle commands such as restart. Users who adopt the tool will also seem more likely to contribute back coverage for any new pitfalls they fall into, which can yield feedback for developers and drive improvements in documentation. Those inclined can use the tool in their ops repo' CI. Two talk videos have been particularly valuable sources for example of configuration that can bite you: "...Lag - What's Wrong with My Slave?" (Samantha Billington, 2015) https://youtu.be/If22EwtCFKc "PostgreSQL Replication Tutorial"(Josh berkus,2015 ) https://youtu.be/GobQw9LMEaw What I've collected so far: - Quoting rules for recovery.conf and postgresql.conf are different - 'primary_conninfo' is visible to unprivileged users, so including a password is a security risk. - streaming replication + read slave should consider turning on "hot_standby_feedback". - replication_timeout < wal_receiver_status_interval is bad. - setting max_wal_senders too low so no streaming replication block backup with pg_basebackup - max_wal_senders without wal_level - recently on "weekly news": Fujii Masao pushed: Document that max_worker_processes must be high enough in standby. The setting values of some parameters including max_worker_processesmust be equal to or higher than the values on the master. However, previously max_worker_processes wasnot listed as such parameter in the document. So this commit adds it to that list. Back-patch to 9.4 where max_worker_processeswas added. http://git.postgresql.org/pg/commitdiff/1ea5ce5c5f204918b8a9fa6eaa8f3f1374aa8aec - turning on replication with default max_wal_senders =0 - FATAL: WAL archival (archive_mode=on) requires wal_level "archive", "hot_standby", or "logical" There must be more, please mention any other checks you feel should be included. Note: The server _will_ explicitly complain about the last two but a bad "wal_level" for example is only checked once the server is already down: "LOG: parameter "wal_level" cannot be changed without restarting the server" Implementation: without getting too far ahead, I feel rules should be stored as independent files in a drop directory, in some lightweight, structured format (JSON, YAML, custom textual format), with some mandatory fields (version, severity, description, solution(s)). This makes it easy for people to add new checks without making any oblique language demands (Perl isn't as popular as it used to be) Comments? Amir
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Tom Lane
Date:
Amir Rohan <amir.rohan@zoho.com> writes: > Comments? ISTM that all of the "functional" parts of this are superseded by pg_file_settings; or at least, if they aren't, you need to provide a rationale that doesn't consist only of pointing to pre-9.5 discussions. The "advice" parts of it maybe are still useful, but a tool that's just meant for that would look quite a bit different. Maybe what you're really after is to update pgtune. Also, as a general comment, the sketch you provided seems to require porting everything the server knows about GUC file syntax, file locations, variable names and values into some other representation that apparently is not C, nor Perl either. I think that's likely to be impossible to keep accurate or up to date. Just as a thought experiment, ask yourself how you'd validate the TimeZone setting in external code, bearing in mind that anytime you don't give exactly the same answer the server would, your tool has failed to be helpful. regards, tom lane
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Amir Rohan
Date:
On 10/08/2015 02:38 PM, Tom Lane wrote: > Amir Rohan <amir.rohan@zoho.com> writes: >> Comments? > > ISTM that all of the "functional" parts of this are superseded by > pg_file_settings; To use the new pg_file_settings view you need: 1( 9.5+ 2( a running server The feature is describes as follows in a97e0c3354ace5d74 > The information returned includes the configuration file name, line > number in that file, sequence number indicating when the parameter is > loaded )useful to see if it is later masked by another definition of > the same parameter(, parameter name, and what it is set to at that > point. This information is updated on reload of the server. None of which has much overlap in purpose with what I was suggesting, so I don't see why you think it actually makes it redundant. > or at least, if they aren't, you need to provide a > rationale that doesn't consist only of pointing to pre-9.5 discussions. The original rationale already does AFAICT. > The "advice" parts of it maybe are still useful, but a tool that's just > meant for that would look quite a bit different. Maybe what you're really > after is to update pgtune. > In the sense that pgtune processes .conf files and helps the user choose better settings, yes there's commonality. But in that it looks at 5 GUCs or so and applies canned formulas to silently write out a new .conf without explanation, it's not the same tool. > Also, as a general comment, the sketch you provided seems to require > porting everything the server knows about > GUC file syntax, yes, a parser, possibly sharing code with the server. > file locations, specified via command line option, yes. > variable names and values misc/guc.c is eminently parseable in that sense. For some types, validation is trivial. For others, we do the best we can and improve incrementally. > into some other representation that apparently > is not C, nor Perl either. Well, it's data, it probably should have that anyway. You raise an interesting issue that having the schema for the configuration described in a more congenial format than C code would have made implementing this easier, and may actually be worthwhile in its own right. Aside on parsing, when starting a brand new configuration file was suggested in CAOG9ApHYCPmTypAAwfD3_V7sVOkbnECFivmRc1AxhB40ZBSwNQ@mail.gmail.com/ only so json could be used for configuring that feature, I wrote a parser that extends postgresql.conf to accept JSON objects as values as a toy exercise. It's not rocket science. So, None of the above seem like a major hurdle to me. > I think that's likely to be impossible to keep accurate or up to date. If significant rot sets in a few releases down the line, there may be more false positives. But then again if users found it useful so developers cared about keeping it up to data, it wouldn't get that way. Again, previous note about DRY and lack of explicit schema for GUCs except in code form. Also, this depends crucially on the GUC churn rate, which admittedly you are far better qualified to pronounce on than me. > Just as a thought experiment, > ask yourself how you'd validate the TimeZone setting in external code, bearing in mind that > anytime you don't give exactly the same answer the server would, your tool > has failed to be helpful. A tool may be extremely useful for a hundred checks, and poor in a few others. That would make it imperfect, not useless. If TimeZone turns out to be an extremely challenging GUC to validate, I'd unceremoniously skip the semantic check for it. Kind Regards, Amir
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Robert Haas
Date:
On Thu, Oct 8, 2015 at 7:07 AM, Amir Rohan <amir.rohan@zoho.com> wrote: > In addition to a simple syntax check, there's a bunch of "config wisdom" > tidbits I've encountering, which is scattered through talks, commit > messages, and mailing list discussion, and documentation notes > (chapter 17, paragraph 12). These could be collected into a tool that: > > - Checks your configuration's syntax > - Checks for semantically legal values ('read committed' not > 'read_committed' ) > - Warns of unrecognized keys ("'hot_standby' is not a valid GUC in v9.1"). > - Is version-aware. > - Serves as an "executable" form of past experience. > - Describes the config problem in a structured way (as an expression, > for example) > - Includes a friendly, semi-verbose, description of the problem. > - Describes ways to fix the problem, *right there*. > - Is independent from server (but reuses the same parser), particularly > any life-cycle commands such as restart. Sounds reasonable. I don't know whether or not we would accept this into core, but I can certainly see it being a worthwhile effort. I'd expect to spend a lot of time figuring out which rules you want to enforce. > - Quoting rules for recovery.conf and postgresql.conf are different I believe this is no longer true. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Robert Haas
Date:
On Thu, Oct 8, 2015 at 9:06 AM, Amir Rohan <amir.rohan@zoho.com> wrote: > On 10/08/2015 02:38 PM, Tom Lane wrote: >> Amir Rohan <amir.rohan@zoho.com> writes: >>> Comments? >> >> ISTM that all of the "functional" parts of this are superseded by >> pg_file_settings; > > To use the new pg_file_settings view you need: > 1( 9.5+ > 2( a running server > > The feature is describes as follows in a97e0c3354ace5d74 > >> The information returned includes the configuration file name, line >> number in that file, sequence number indicating when the parameter is >> loaded )useful to see if it is later masked by another definition of >> the same parameter(, parameter name, and what it is set to at that >> point. This information is updated on reload of the server. > > None of which has much overlap in purpose with what I was suggesting, so > I don't see why you think it actually makes it redundant. I think Tom's point is that if you want to see whether the file reloads without errors, you can use this view for that. That would catch stuff like wal_level=lgocial and default_transaction_isolation='read_committed'. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Amir Rohan
Date:
On 10/09/2015 09:55 PM, Robert Haas wrote: > On Thu, Oct 8, 2015 at 9:06 AM, Amir Rohan <amir.rohan@zoho.com> wrote: >> On 10/08/2015 02:38 PM, Tom Lane wrote: >>> Amir Rohan <amir.rohan@zoho.com> writes: >>>> Comments? >>> >>> ISTM that all of the "functional" parts of this are superseded by >>> pg_file_settings; >> >> To use the new pg_file_settings view you need: >> 1( 9.5+ >> 2( a running server >> >> The feature is describes as follows in a97e0c3354ace5d74 >> >>> The information returned includes the configuration file name, line >>> number in that file, sequence number indicating when the parameter is >>> loaded )useful to see if it is later masked by another definition of >>> the same parameter(, parameter name, and what it is set to at that >>> point. This information is updated on reload of the server. >> >> None of which has much overlap in purpose with what I was suggesting, so >> I don't see why you think it actually makes it redundant. > > I think Tom's point is that if you want to see whether the file > reloads without errors, you can use this view for that. That would > catch stuff like wal_level=lgocial and > default_transaction_isolation='read_committed'. > It does catch bad syntax, but in most cases all you get is "The setting could not be applied". that's not great for enums or a float instead of an int. I guess a future version will fix that (or not). You need a running server to run a check. You need to monkey with said server's configuration in place to run a check. You must be on 9.5+. The checking mechanism isn't extensible. Certainly not as easily as dropping a new rule file somewhere. It doesn't check (AFAICT) for bad combinations of values, for example it will tell you that you can't change `wal_archive` without restart (without showing source location btw, bug?), but not that you better set `wal_level` *before* you restart. It doesn't do any semantic checks. It won't warn you about things that are not actually an error, just a bad idea. Forgive me if any of the above betrays an ignorance of some of pg_file_settings's finer points. I have only read the documentation and tried it in a shell. There was also concern about the prohibitive maintenance burden such a tool would impose. ISTM all you actually need is a JSON file generated from the output of `select * from pg_settings` to make the really tedious bit completely automatic. The semantic checks are by their nature "best effort", and it's my impression (and only that) that they are more stable, in the sense that bad ideas remain so. That's why I disagree with tom's suggestion that pg_file_settings obviates the need for the tool I proposed. Which isn't at all to say it doesn't solve another very well. Amir
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Robert Haas
Date:
On Fri, Oct 9, 2015 at 4:38 PM, Amir Rohan <amir.rohan@zoho.com> wrote: > It does catch bad syntax, but in most cases all you get is > "The setting could not be applied". that's not great for enums > or a float instead of an int. I guess a future version will fix that > (or not). I expect we would consider patches to improve the error messages if you (or someone else) wanted to propose such. But you don't have to want to do that. > You need a running server to run a check. You need to monkey > with said server's configuration in place to run a check. You must be on > 9.5+. The checking mechanism isn't extensible. Certainly not as easily > as dropping a new rule file somewhere. It doesn't check (AFAICT) for bad > combinations of values, for example it will tell you that you can't > change `wal_archive` without restart (without showing source location > btw, bug?), but not that you better set `wal_level` *before* you > restart. It doesn't do any semantic checks. It won't warn you > about things that are not actually an error, just a bad idea. So, I'm not saying that a config checker has no value. In fact, I already said the opposite. You seem to be jumping all over me here when all I was trying to do is explain what I think Tom was getting at. I *do* think that pg_file_settings is a helpful feature that is certainly related to what you are trying to do, but I don't think that it means that a config checker is useless. Fair? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Amir Rohan
Date:
On 10/13/2015 02:02 AM, Robert Haas wrote: > On Fri, Oct 9, 2015 at 4:38 PM, Amir Rohan <amir.rohan@zoho.com> wrote: >> It does catch bad syntax, but in most cases all you get is >> "The setting could not be applied". that's not great for enums >> or a float instead of an int. I guess a future version will fix that >> (or not). > > I expect we would consider patches to improve the error messages if > you (or someone else) wanted to propose such. But you don't have to > want to do that. > >> You need a running server to run a check. You need to monkey >> with said server's configuration in place to run a check. You must be on >> 9.5+. The checking mechanism isn't extensible. Certainly not as easily >> as dropping a new rule file somewhere. It doesn't check (AFAICT) for bad >> combinations of values, for example it will tell you that you can't >> change `wal_archive` without restart (without showing source location >> btw, bug?), but not that you better set `wal_level` *before* you >> restart. It doesn't do any semantic checks. It won't warn you >> about things that are not actually an error, just a bad idea. > > So, I'm not saying that a config checker has no value. In fact, I > already said the opposite. You seem to be jumping all over me here > when all I was trying to do is explain what I think Tom was getting > at. I *do* think that pg_file_settings is a helpful feature that is > certainly related to what you are trying to do, but I don't think that > it means that a config checker is useless. Fair? > That wasn't my intention. Perhaps I'm overreacting to a long-standing "Tom Lane's bucket of cold water" tradition. I'm new here. I understand your point and I was only reiterating what in particular makes the conf checker distinctly useful IMO, and what it could provide that pg_settings doesn't. I've looked at parts of the pg_settings implementation and indeed some of that code (and legwork) could be reused so the mundane parts of writing this will be less hassle. I might have missed that if Tom and you hadn't pointed that out. So, Fair, and thanks. Amir
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Robert Haas
Date:
On Mon, Oct 12, 2015 at 8:01 PM, Amir Rohan <amir.rohan@zoho.com> wrote: > That wasn't my intention. Perhaps I'm overreacting to a long-standing > "Tom Lane's bucket of cold water" tradition. I'm new here. > I understand your point and I was only reiterating what in particular > makes the conf checker distinctly useful IMO, and what it could > provide that pg_settings doesn't. > > I've looked at parts of the pg_settings implementation and indeed some > of that code (and legwork) could be reused so the mundane parts > of writing this will be less hassle. I might have missed that if Tom and > you hadn't pointed that out. > > So, Fair, and thanks. No worries. I'm not upset with you, and I see where you're coming from. But I since I'm trying to be helpful I thought I would check whether it's working. Sounds like yes, which is nice. It would be spiffy if we could use the same config-file parser from outside postgres itself, but it seems hard. I assume the core lexer and parser could be adapted to work from libcommon with some non-enormous amount of effort, but check-functions are can and do assume that they are running in a backend environment; one would lose a lot of sanity-checking if those couldn't be executed, and checking GUCs created by loadable modules seems impossible. Still, even a partial move toward making this code accessible in frontend code would be welcome from where I sit. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Amir Rohan
Date:
On 10/13/2015 09:20 PM, Robert Haas wrote: > On Mon, Oct 12, 2015 at 8:01 PM, Amir Rohan <amir.rohan@zoho.com> wrote: >> That wasn't my intention. Perhaps I'm overreacting to a long-standing >> "Tom Lane's bucket of cold water" tradition. I'm new here. >> I understand your point and I was only reiterating what in particular >> makes the conf checker distinctly useful IMO, and what it could >> provide that pg_settings doesn't. >> >> I've looked at parts of the pg_settings implementation and indeed some >> of that code (and legwork) could be reused so the mundane parts >> of writing this will be less hassle. I might have missed that if Tom and >> you hadn't pointed that out. >> >> So, Fair, and thanks. > > No worries. I'm not upset with you, and I see where you're coming > from. But I since I'm trying to be helpful I thought I would check > whether it's working. Sounds like yes, which is nice. > > It would be spiffy if we could use the same config-file parser from > outside postgres itself, but it seems hard. I assume the core lexer > and parser could be adapted to work from libcommon with some > non-enormous amount of effort, but check-functions are can and do > assume that they are running in a backend environment; one would lose > a lot of sanity-checking if those couldn't be executed, I've been considering that. Reusing the parser would ensure no errors are introduces by having a different implementation, but on the other hand involving the pg build in installation what's intended as a lightweight, independent tool would hurt. Because it's dubious whether this will end up in core, I'd like "pip install pg_confcheck" to be all that is required. Also, anything short of building the tool in tree with an unmodified parser amounts to forking the parser code and maintaining it along with upstream, per version, etc'. I'm not eager to do that. > and checking GUCs created by loadable modules seems impossible. Still, even a > partial move toward making this code accessible in frontend code would > be welcome from where I sit. > Dumping the output of the pg_settings view into json has already provided all the type/enum information needed to type-check values and flag unrecognized GUCs. I don't really see that as involving a heroic amount of work, the language seems extremely simple. There aren't that many types, type-checking them isn't that much work, and once that's written the same checks should apply to all GUCs registered with the server, assuming the pg_settings view is exhaustive in that sense. Any modules outside the main server can provide their own data in a similar format, if it comes to that. I doubt whether such a tool has that much appeal, especially if it isn't bundled with the server. So, I'll think about this some more, and write up a description of how I think it's going to look for some feedback. Then I'll code something. Regards, Amir
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Alvaro Herrera
Date:
Amir Rohan wrote: > I've been considering that. Reusing the parser would ensure no errors > are introduces by having a different implementation, but on the other > hand involving the pg build in installation what's intended as a > lightweight, independent tool would hurt. > Because it's dubious whether this will end up in core, I'd like > "pip install pg_confcheck" to be all that is required. Maybe just compile a single file in a separate FRONTEND environment? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Amir Rohan
Date:
On 10/14/2015 12:14 AM, Alvaro Herrera wrote: > Amir Rohan wrote: > >> I've been considering that. Reusing the parser would ensure no errors >> are introduces by having a different implementation, but on the other >> hand involving the pg build in installation what's intended as a >> lightweight, independent tool would hurt. >> Because it's dubious whether this will end up in core, I'd like >> "pip install pg_confcheck" to be all that is required. > > Maybe just compile a single file in a separate FRONTEND environment? > You mean refactoring the postgres like rhass means? could you elaborate? I believe most people get pg as provided by their distro or PaaS, and not by compiling it. Involving a pg build environment is asking a whole lot of someone who has pg all installed and who just wants to lint their conf. It's also legitimate for someone to be configuring postgres without having a clue how to compile it. If this was in core, this wouldn't be an issue because then packages would also include the tool. Note I'm not actually pushing for that, it's that that has implications on what can be assumed as available.
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Alvaro Herrera
Date:
Amir Rohan wrote: > On 10/14/2015 12:14 AM, Alvaro Herrera wrote: > > Amir Rohan wrote: > > > >> I've been considering that. Reusing the parser would ensure no errors > >> are introduces by having a different implementation, but on the other > >> hand involving the pg build in installation what's intended as a > >> lightweight, independent tool would hurt. > >> Because it's dubious whether this will end up in core, I'd like > >> "pip install pg_confcheck" to be all that is required. > > > > Maybe just compile a single file in a separate FRONTEND environment? > > You mean refactoring the postgres like rhass means? could you elaborate? > > I believe most people get pg as provided by their distro or PaaS, > and not by compiling it. I mean the utility would be built by using a file from the backend source, just like pg_xlogdump does. We have several such cases. I don't think this is impossible to do outside core, either. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Andres Freund
Date:
On October 13, 2015 11:14:19 PM GMT+02:00, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: >Amir Rohan wrote: > >> I've been considering that. Reusing the parser would ensure no errors >> are introduces by having a different implementation, but on the other >> hand involving the pg build in installation what's intended as a >> lightweight, independent tool would hurt. >> Because it's dubious whether this will end up in core, I'd like >> "pip install pg_confcheck" to be all that is required. > >Maybe just compile a single file in a separate FRONTEND environment? Maybe I'm missing something here - but doesn't the posted binary do nearly all of this for you already? There's the optionto display the value of a config option, and that checks the validity of the configuration. Might need to add an optionto validate hba.conf et al as well and allow to display all values. That should be pretty simple? --- Please excuse brevity and formatting - I am writing this on my mobile phone.
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Amir Rohan
Date:
On 10/14/2015 01:12 AM, Alvaro Herrera wrote: > Amir Rohan wrote: >> On 10/14/2015 12:14 AM, Alvaro Herrera wrote: >>> Amir Rohan wrote: >>> >>>> I've been considering that. Reusing the parser would ensure no errors >>>> are introduces by having a different implementation, but on the other >>>> hand involving the pg build in installation what's intended as a >>>> lightweight, independent tool would hurt. >>>> Because it's dubious whether this will end up in core, I'd like >>>> "pip install pg_confcheck" to be all that is required. >>> >>> Maybe just compile a single file in a separate FRONTEND environment? >> >> You mean refactoring the postgres like rhass means? could you elaborate? >> >> I believe most people get pg as provided by their distro or PaaS, >> and not by compiling it. > > I mean the utility would be built by using a file from the backend > source, just like pg_xlogdump does. We have several such cases. > I don't think this is impossible to do outside core, either. > I've considered "vendoring", but it seems like enough code surgery be involved to make this very dubious "reuse". The language is simple enough that writing a parser from scratch isn't a big deal hard, and there doesn't seem much room for divergent parsing either. So, the only question is whether reusing the existing parser will brings along some highly useful functionality beyond an AST and a battle-tested validator for bools, etc'. I'm not ruling anything out yet, though. Amir
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Amir Rohan
Date:
On 10/14/2015 01:16 AM, Andres Freund wrote: > On October 13, 2015 11:14:19 PM GMT+02:00, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: >> Amir Rohan wrote: >> >>> I've been considering that. Reusing the parser would ensure no errors >>> are introduces by having a different implementation, but on the other >>> hand involving the pg build in installation what's intended as a >>> lightweight, independent tool would hurt. >>> Because it's dubious whether this will end up in core, I'd like >>> "pip install pg_confcheck" to be all that is required. >> >> Maybe just compile a single file in a separate FRONTEND environment? > > Maybe I'm missing something here - but doesn't the posted binary do nearly all of this for you already? There's the optionto display the value of a config option, and that checks the validity of the configuration. Might need to add an optionto validate hba.conf et al as well and allow to display all values. That should be pretty simple? > Andres, please see upthread for quite a bit on what it doesn't do, and why having it in the server is both an advantages and a shortcoming.
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Andres Freund
Date:
On 2015-10-14 01:54:46 +0300, Amir Rohan wrote: > Andres, please see upthread for quite a bit on what it doesn't do, and > why having it in the server is both an advantages and a shortcoming. As far as I have skimmed the thread it's only talking about shortcoming in case it requires a running server. Which -C doesn't. I don't think there's any fundamental difference between some external binary parsing & validating the config file and the postgres binary doing it. There *is* the question of the utility being able to to process options from multiple major releases, but I don't think that's a particularly worthwhile goal here. Greetings, Andres Freund
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Amir Rohan
Date:
On 10/14/2015 01:30 PM, Andres Freund wrote: > On 2015-10-14 01:54:46 +0300, Amir Rohan wrote: >> Andres, please see upthread for quite a bit on what it doesn't do, and >> why having it in the server is both an advantages and a shortcoming. > > As far as I have skimmed the thread it's only talking about shortcoming > in case it requires a running server. Which -C doesn't. > That's helpful, no one has suggested -C yet. This was new to me. Here's what the usage says: Usage: postgres [OPTION]... Options: -C NAME print value of run-time parameter, then exit ``` P1. The fact that -C will warn about bad values when you query for an unrelated name (as it requires you to do) is cunningly and successfully elided. I expected to have to check every one individually. How about a no-argument version of "-C"? P2. Next, that it will print the value of a "run-time" parameter, without an actual process is another nice surprise, which I wouldn't have guessed. The error messages are all one could wish for: LOG: invalid value for parameter "wal_level": "archive2" HINT: Available values: minimal, archive, hot_standby, logical. LOG: parameter "fsync" requires a Boolean value *and* it prints everything it finds wrong, not stopping at the first one. very nice. it does fail the "dependent options" test: $ postgres -C "archive_mode" on $ postgres -C wal_level minimal no errors, great, let's try it: $ pg_ctl restart FATAL: WAL archival cannot be enabled when wal_level is "minimal" > I don't think there's any fundamental difference between some external > binary parsing & validating the config file and the postgres binary > doing it. -C does a much better job then pg_file_settings for this task, I agree. Still, it doesn't answer the already mentioned points of: 1) You need a server (binary, not a running server) to check a config (reasonable), with a version matching what you'll run on (again, reasonable) 2) It doesn't catch more complicated problems like dependent options. 3) It doesn't catch bad ideas, only errors. 4) You can't easily extend the checks performed, without forking postgres or going through the (lengthy, rigorous) cf process. 5) Because it checks syntax only, you don't get the benefits of having an official place for the community to easily contribute rules that warn you against config pitfalls, so that all users benefits. See my OP for real-world examples and links about this problem. > There *is* the question of the utility being able to to > process options from multiple major releases, but I don't think that's a > particularly worthwhile goal here. > For one thing, I think it's been made clear that either few people know about -C or few use it as part of their usual workflow. That in itself is an issue. Any bloggers reading this? Next, you may not agree semantic checks/advice is useful. I only agree it's easy to dismiss until it saves your (i.e. a user's) ass that first time. I would *highly* recommend the talk mentioned in the OP: "...Lag - What's Wrong with My Slave?" (Samantha Billington, 2015) https://youtu.be/If22EwtCFKc Not just for the concrete examples (*you* know those by heart), but to really hear the frustration in a real production user's voice when they deploy pg in production, hit major operational difficulties, spend lots of time and effort trying to root-cause and finally realize they simply needed to "turn on FOO". Most of these problem can be caught very easily with a conf linter. Amir
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Andres Freund
Date:
On 2015-10-14 16:17:55 +0300, Amir Rohan wrote: > it does fail the "dependent options" test: > $ postgres -C "archive_mode" > on > $ postgres -C wal_level > minimal Yea, because that's currently evaluated outside the config mechanism. It'd imo would be good to change that independent of this thread. > 5) Because it checks syntax only, you don't get the benefits of having > an official place for the community to easily contribute rules that > warn you against config pitfalls, so that all users benefits. > See my OP for real-world examples and links about this problem. I don't think we as a community want to do that without review mechanisms in place, and I personally don't think we want to add separate processes for this. Greetings, Andres Freund
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Amir Rohan
Date:
On 10/14/2015 04:24 PM, Andres Freund wrote: > On 2015-10-14 16:17:55 +0300, Amir Rohan wrote: >> it does fail the "dependent options" test: >> $ postgres -C "archive_mode" >> on >> $ postgres -C wal_level >> minimal > > Yea, because that's currently evaluated outside the config > mechanism. It'd imo would be good to change that independent of this > thread. > I agree. >> 5) Because it checks syntax only, you don't get the benefits of having >> an official place for the community to easily contribute rules that >> warn you against config pitfalls, so that all users benefits. >> See my OP for real-world examples and links about this problem. > > I don't think we as a community want to do that without review > mechanisms in place, and I personally don't think we want to add > separate processes for this. > That's what "contribute" means in my book. I'm getting mixed signals about what the "community" wants. I certainly think if adding rules involves modifying the postgres server code, that is far too high a bar and no one will. I'm not sure what you mean by "separate process". My original pitch was to have this live in core or contrib, and no one wanted it. If you don't want it in core, but people thinks its a good idea to have (with review), what would you suggest? Amir
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Andres Freund
Date:
On 2015-10-14 16:50:41 +0300, Amir Rohan wrote: > > I don't think we as a community want to do that without review > > mechanisms in place, and I personally don't think we want to add > > separate processes for this. > > > > That's what "contribute" means in my book. Then your argument about the CF process doesn't seem to make sense.
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Amir Rohan
Date:
On 10/14/2015 05:35 PM, Andres Freund wrote: > On 2015-10-14 16:50:41 +0300, Amir Rohan wrote: >>> I don't think we as a community want to do that without review >>> mechanisms in place, and I personally don't think we want to add >>> separate processes for this. >>> >> >> That's what "contribute" means in my book. > > Then your argument about the CF process doesn't seem to make sense. > Why? I ask again, what do you mean by "separate process"? either it's in core (and follows its processes) or it isn't. But you can't say you don't want it in core but that you also don't want it to follow a "separate process". I think you're simply saying you're -1 on the whole idea. ok. Regards, Amir
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Andres Freund
Date:
On 2015-10-14 17:46:25 +0300, Amir Rohan wrote: > On 10/14/2015 05:35 PM, Andres Freund wrote: > > Then your argument about the CF process doesn't seem to make sense. > Why? I ask again, what do you mean by "separate process"? Not going through the CF and normal release process. > either it's in core (and follows its processes) or it isn't. But you > can't say you don't want it in core but that you also don't > want it to follow a "separate process". Oh for crying out loud. You write: > 4) You can't easily extend the checks performed, without forking > postgres or going through the (lengthy, rigorous) cf process. and > > I don't think we as a community want to do that without review > > mechanisms in place, and I personally don't think we want to add > > separate processes for this. > That's what "contribute" means in my book. I don't see how those two statements don't conflict.
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Amir Rohan
Date:
On 10/14/2015 05:55 PM, Andres Freund wrote: > On 2015-10-14 17:46:25 +0300, Amir Rohan wrote: >> On 10/14/2015 05:35 PM, Andres Freund wrote: >>> Then your argument about the CF process doesn't seem to make sense. > >> Why? I ask again, what do you mean by "separate process"? > > Not going through the CF and normal release process. > >> either it's in core (and follows its processes) or it isn't. But you >> can't say you don't want it in core but that you also don't >> want it to follow a "separate process". > > Oh for crying out loud. You write: > Andres, I'm not here looking for ways to quibble with you. So, please "assume good faith". >> 4) You can't easily extend the checks performed, without forking >> postgres or going through the (lengthy, rigorous) cf process. > > and > >>> I don't think we as a community want to do that without review >>> mechanisms in place, and I personally don't think we want to add >>> separate processes for this. > >> That's what "contribute" means in my book. > > I don't see how those two statements don't conflict. > Right. I was saying that "contribute" always implies review before acceptance, responding to the first half of your sentence. The second half assumes it makes sense to discuss "review process" as a separate issue from inclusion in core. It does not make sense, and I said so. If you have a bone to pick with my comment about CF review being lengthy, the points was that an independent tool can move more quickly to accept submissions because: 1. there's less at stake 2. if it's written in a higher level language, enhancements are easier. Those don't hold when adding another check involves changes to the `postgres` binary. Fair? Amir
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
David Fetter
Date:
On Wed, Oct 14, 2015 at 01:52:21AM +0300, Amir Rohan wrote: > On 10/14/2015 01:12 AM, Alvaro Herrera wrote: > > Amir Rohan wrote: > >> On 10/14/2015 12:14 AM, Alvaro Herrera wrote: > >>> Amir Rohan wrote: > >>> > >>>> I've been considering that. Reusing the parser would ensure no errors > >>>> are introduces by having a different implementation, but on the other > >>>> hand involving the pg build in installation what's intended as a > >>>> lightweight, independent tool would hurt. > >>>> Because it's dubious whether this will end up in core, I'd like > >>>> "pip install pg_confcheck" to be all that is required. > >>> > >>> Maybe just compile a single file in a separate FRONTEND environment? > >> > >> You mean refactoring the postgres like rhass means? could you elaborate? > >> > >> I believe most people get pg as provided by their distro or PaaS, > >> and not by compiling it. > > > > I mean the utility would be built by using a file from the backend > > source, just like pg_xlogdump does. We have several such cases. > > I don't think this is impossible to do outside core, either. > > I've considered "vendoring", but it seems like enough code surgery > be involved to make this very dubious "reuse". The language is simple > enough that writing a parser from scratch isn't a big deal hard, and > there doesn't seem much room for divergent parsing either. Such room as there is seems worth eliminating if possible. There's even a formal name for this issue, which attackers can use, although the implications as a source of subtle bugs in the absence of an attacker seem more pertinent right now. https://www.google.com/?q=parse%20tree%20differential%20attack > So, the only question is whether reusing the existing parser will > brings along some highly useful functionality beyond an AST and > a battle-tested validator for bools, etc'. I'm not ruling anything > out yet, though. I suspect that having a single source parser, however painful now, will pay large dividends later. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Amir Rohan
Date:
On 10/14/2015 07:41 PM, David Fetter wrote: >> On Wed, Oct 14, 2015 at 01:52:21AM +0300, Amir Rohan wrote: >> >> I've considered "vendoring", but it seems like enough code surgery >> be involved to make this very dubious "reuse". The language is simple >> enough that writing a parser from scratch isn't a big deal hard, and >> there doesn't seem much room for divergent parsing either. > > Such room as there is seems worth eliminating if possible. IMO, not If it means the tool will thus dodge minor potential for harmless bugs, but becomes far more difficult to install in the process. > There's even a formal name for this issue, which attackers can use, although > the implications as a source of subtle bugs in the absence of an > attacker seem more pertinent right now. > > https://www.google.com/?q=parse%20tree%20differential%20attack > Interesting, but the security implications of a "parser attack" or indeed different parsing behavior in some corner cases, are roughly nil in this case. >> So, the only question is whether reusing the existing parser will >> brings along some highly useful functionality beyond an AST and >> a battle-tested validator for bools, etc'. I'm not ruling anything >> out yet, though. > > I suspect that having a single source parser, however painful now, > will pay large dividends later. > I don't think anything is more important for a productivity tool then being easily installed and easy to use. It's a fact of life that all software will have bugs. If they matter, you fix them. Still, if there are more concrete benefits to adapting the parser, such as batteries-included features I'm simply unaware of, I'd love to hear about it. Thanks, Amir
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Alvaro Herrera
Date:
Amir Rohan wrote: > it does fail the "dependent options" test: > $ postgres -C "archive_mode" > on > $ postgres -C wal_level > minimal > > no errors, great, let's try it: > $ pg_ctl restart > > FATAL: WAL archival cannot be enabled when wal_level is "minimal" This complaint could be fixed we had a --check-config that runs the check hook for every variable, I think. I think that closes all the syntax checks you want; then your tool only needs to worry about semantic checks, and can obtain the values by repeated application of -C <param> (or, more conveniently, have a new -C mode that takes no args and prints the names and values for all parameters). -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Andres Freund
Date:
On October 14, 2015 7:45:53 PM GMT+02:00, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: >Amir Rohan wrote: > >> it does fail the "dependent options" test: >> $ postgres -C "archive_mode" >> on >> $ postgres -C wal_level >> minimal >> >> no errors, great, let's try it: >> $ pg_ctl restart >> >> FATAL: WAL archival cannot be enabled when wal_level is "minimal" > >This complaint could be fixed we had a --check-config that runs the >check hook for every variable, I think. The problem is that this, and some others, invariant is checked outside the GUC framework. Which we should probably change,which IIRC will require some new infrastructure. Andres --- Please excuse brevity and formatting - I am writing this on my mobile phone.
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Amir Rohan
Date:
IOn 10/14/2015 08:45 PM, Alvaro Herrera wrote: > Amir Rohan wrote: > >> it does fail the "dependent options" test: >> $ postgres -C "archive_mode" >> on >> $ postgres -C wal_level >> minimal >> >> no errors, great, let's try it: >> $ pg_ctl restart >> >> FATAL: WAL archival cannot be enabled when wal_level is "minimal" > > This complaint could be fixed we had a --check-config that runs the > check hook for every variable, I think. I think that closes all the > syntax checks you want; then your tool only needs to worry about > semantic checks, and can obtain the values by repeated application of -C > <param> (or, more conveniently, have a new -C mode that takes no args > and prints the names and values for all parameters). > That sounds reasonable and convenient. It's important to have one tool that covers everything, so I'd have to call "-C" and parse the errors, but that seems possible. If there was a flag which produced something more like the output of the pg_settings view in a structured format, as well as the errors, that would be ideal. And possibly useful for other tool building. Would such a thing be contrary to the pg spirit? Amir
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Robert Haas
Date:
On Wed, Oct 14, 2015 at 6:30 AM, Andres Freund <andres@anarazel.de> wrote: > On 2015-10-14 01:54:46 +0300, Amir Rohan wrote: >> Andres, please see upthread for quite a bit on what it doesn't do, and >> why having it in the server is both an advantages and a shortcoming. > > As far as I have skimmed the thread it's only talking about shortcoming > in case it requires a running server. Which -C doesn't. > > I don't think there's any fundamental difference between some external > binary parsing & validating the config file and the postgres binary > doing it. There *is* the question of the utility being able to to > process options from multiple major releases, but I don't think that's a > particularly worthwhile goal here. But if you want to write something like pgtune - in addition to that particular thing, EDB has several tools that do this kind of stuff - then it's either got to be part of the server, which is not viable unless you're (ahem) prepared to maintain a fork of the server in perpetuity - or it's got to be somewhere else, in which case you've got to write your own lexer and parser for our config-file format. Now fortunately that format isn't crazy complicated, but this wheel has been reinvented multiple times, and will be again. Making it possible for people to use ours rather than rolling their own would be a good thing, IMHO. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
From
Peter Eisentraut
Date:
On 10/14/15 1:50 PM, Andres Freund wrote: > On October 14, 2015 7:45:53 PM GMT+02:00, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: >> Amir Rohan wrote: >> >>> it does fail the "dependent options" test: >>> $ postgres -C "archive_mode" >>> on >>> $ postgres -C wal_level >>> minimal >>> >>> no errors, great, let's try it: >>> $ pg_ctl restart >>> >>> FATAL: WAL archival cannot be enabled when wal_level is "minimal" >> >> This complaint could be fixed we had a --check-config that runs the >> check hook for every variable, I think. I think that would be widely useful and fairly uncontroversial. > The problem is that this, and some others, invariant is checked outside the GUC framework. Which we should probably change,which IIRC will require some new infrastructure. In the extreme, this problem is not solvable (halting problem). If we had a dry-run checking functionality, there would probably be more incentive to normalize many of the common dependency cases into a declarative system.