Thread: wal_buffers = -1 and SIGHUP
This might be nitpicking (or i'm currently missing something), but i recognized that setting wal_buffers = -1 always triggers the following on reload, even if nothing to wal_buffers had changed: $ pg_ctl reload LOG: received SIGHUP, reloading configuration files LOG: parameter "wal_buffers" cannot be changed without restarting the server This only happens when you have wal_buffers set to -1. -- Thanks Bernd
On Thu, Mar 31, 2011 at 8:38 AM, Bernd Helmle <mailings@oopsware.de> wrote: > This might be nitpicking (or i'm currently missing something), but i > recognized that setting wal_buffers = -1 always triggers the following on > reload, even if nothing to wal_buffers had changed: > > $ pg_ctl reload > LOG: received SIGHUP, reloading configuration files > LOG: parameter "wal_buffers" cannot be changed without restarting the > server > > This only happens when you have wal_buffers set to -1. This is a bug. The root cause is that, on startup, we do this: if (xbuffers != XLOGbuffers) { snprintf(buf, sizeof(buf), "%d", xbuffers); SetConfigOption("wal_buffers",buf, PGC_POSTMASTER, PGC_S_OVERRIDE); } I had intended to commit Greg's patch with a show hook and an assign hook and the calculated value stored separately from the GUC variable, which I believe would avoid this is a problem, but Tom thought this way was better. Unfortunately, my knowledge of the GUC machinery is insufficient to think of a way to avoid it, other than the one Tom already rejected. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Thu, Mar 31, 2011 at 8:38 AM, Bernd Helmle <mailings@oopsware.de> wrote: >> This might be nitpicking (or i'm currently missing something), but i >> recognized that setting wal_buffers = -1 always triggers the following on >> reload, even if nothing to wal_buffers had changed: >> >> $ pg_ctl reload >> LOG: �received SIGHUP, reloading configuration files >> LOG: �parameter "wal_buffers" cannot be changed without restarting the >> server >> >> This only happens when you have wal_buffers set to -1. > This is a bug. The root cause is that, on startup, we do this: > if (xbuffers != XLOGbuffers) > { > snprintf(buf, sizeof(buf), "%d", xbuffers); > SetConfigOption("wal_buffers", buf, PGC_POSTMASTER, > PGC_S_OVERRIDE); > } > I had intended to commit Greg's patch with a show hook and an assign > hook and the calculated value stored separately from the GUC variable, > which I believe would avoid this is a problem, but Tom thought this > way was better. Unfortunately, my knowledge of the GUC machinery is > insufficient to think of a way to avoid it, other than the one Tom > already rejected. Mph ... I think this shows the error of my thinking :-(. We at least need an assign hook here. Will fix, since it was my oversight to begin with. regards, tom lane
I wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> I had intended to commit Greg's patch with a show hook and an assign >> hook and the calculated value stored separately from the GUC variable, >> which I believe would avoid this is a problem, but Tom thought this >> way was better. Unfortunately, my knowledge of the GUC machinery is >> insufficient to think of a way to avoid it, other than the one Tom >> already rejected. > Mph ... I think this shows the error of my thinking :-(. We at least > need an assign hook here. Will fix, since it was my oversight to begin > with. After thinking about this for awhile, I think the fundamental problem is in the is_newvalue_equal() function, which as its comment states is pretty cheesy: /** Attempt (badly) to detect if a proposed new GUC setting is the same* as the current value.** XXX this does not reallywork because it doesn't account for the* effects of canonicalization of string values by assign_hooks.*/ If you hold your head at a funny angle you can see replacement of "-1" with a suitable default value as a form of canonicalization, so the behavior Bernd complained of is exactly what the comment is talking about. We've not had complaints previously because is_newvalue_equal() is only used for PGC_POSTMASTER variables, and few of those have assign hooks that do any canonicalization. But it is possible to trigger the problem with unix_socket_directory, for example: try setting it to something like '/tmp/bogus/..', and you'll see that pg_ctl reload triggers a log message: LOG: parameter "unix_socket_directory" cannot be changed without restarting the server Robert had suggested fixing this by kluging up wal_buffers' assign and show hooks, but IIRC he never actually got that to work; I doubt it's possible to make it work in the EXEC_BACKEND case without some additional hacks to get the state to be properly replicated into child processes. Even if we could make it work, wal_buffers is not likely to be the last variable that we'll want to allow auto-tuning of. So I think we ought to address the underlying problem instead of trying to work around it in the variable-specific code for wal_buffers. IMO the real problem is essentially that GUC assign hooks have two functions, checking and canonicalization of the value-to-be-stored versus executing secondary actions when an assignment is made; and there's no way to get at just the first one. So we cannot canonicalize the value first and then see if it's equal to the current setting. I think the only real fix is to change the API for assign hooks. This is a bit annoying but it's not like we haven't ever done that before. I'm thinking about splitting assign hooks into two functions, along the lines of bool check_hook (datatype *new_value, GucSource source)bool assign_hook (datatype new_value, GucSource source) check_hook would validate the new value, and possibly change it (hence the pass-by-reference parameter). assign_hook would only be responsible for executing secondary actions needed when an assignment is done. The "doit" flag can go away since we'd not call the assign_hook at all unless we actually want the assignment to occur. I think most of the existing uses might only need one or the other of these hooks, but haven't gone through them yet. It might be appropriate to change the signatures some more while we're at it, in particular pass the desired elog message level explicitly instead of making hooks infer it from the source parameter. It would probably take less than a day to flesh out this idea and make it happen, but it does seem like a rather large change for late alpha. If people don't want to do this now, I suggest that we just live with the problem for 9.1. It's purely cosmetic, and easy enough to work around (just don't uncomment the default value). regards, tom lane
On Sun, Apr 3, 2011 at 1:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > IMO the real problem is essentially that GUC assign hooks have two > functions, checking and canonicalization of the value-to-be-stored > versus executing secondary actions when an assignment is made; and > there's no way to get at just the first one. Yes, I think that's right. A related point is that the API for assign hooks is not consistent across all data types: string assign hooks can return a replacement value, whereas everyone else can only succeed or fail. > It would probably take less than a day to flesh out this idea and make > it happen, but it does seem like a rather large change for late alpha. > If people don't want to do this now, I suggest that we just live with > the problem for 9.1. It's purely cosmetic, and easy enough to work > around (just don't uncomment the default value). I think it's a pretty ugly wart, so I'm inclined to say go ahead and fix it. I'm not sure what alpha is for if it's not cleaning up the detritus of all the stuff we've committed in the last 9 months. AIUI, what we're trying to avoid is committing new stuff that may require additional cleanup, not cleaning up the stuff we already did commit. Once we get to beta I'll be less enthusiastic about making changes like this, but I think most of the testing that will get done is still in front of us. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Sun, Apr 3, 2011 at 1:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> IMO the real problem is essentially that GUC assign hooks have two >> functions, checking and canonicalization of the value-to-be-stored >> versus executing secondary actions when an assignment is made; and >> there's no way to get at just the first one. > Yes, I think that's right. A related point is that the API for assign > hooks is not consistent across all data types: string assign hooks can > return a replacement value, whereas everyone else can only succeed or > fail. Right. In the original design we only foresaw the need to canonicalize string values, so that's why it's like that. This change will make it more consistent. regards, tom lane
I wrote: > IMO the real problem is essentially that GUC assign hooks have two > functions, checking and canonicalization of the value-to-be-stored > versus executing secondary actions when an assignment is made; and > there's no way to get at just the first one. So we cannot canonicalize > the value first and then see if it's equal to the current setting. > I think the only real fix is to change the API for assign hooks. This > is a bit annoying but it's not like we haven't ever done that before. > I'm thinking about splitting assign hooks into two functions, along the > lines of > bool check_hook (datatype *new_value, GucSource source) > bool assign_hook (datatype new_value, GucSource source) After perusing the existing assign_hook functions, I have some further thoughts about this proposal. * Many of the existing assign hooks do a nontrivial amount of computation (such as catalog lookups) to derive internal values from the presented string; examples include assign_temp_tablespaces and assign_timezone. A naive implementation of the above design would require the assign_hook to repeat this computation after the check_hook already did it, which of course is undesirable. * Assign hooks that do catalog lookups need special pushups to avoid having to do such lookups while restoring a previous GUC setting during transaction abort (since lookups can't safely be performed in an aborted transaction). Up to now we've used ad-hoc techniques for each such variable, as seen for instance in assign_session_authorization. The usual idea is to modify the original string to store additional data, which then requires a custom show_hook to ensure only the original part of the string is shown to users. The messiest aspect of this is that it must be possible to reliably tell a modified string from original user input. I think that we can avoid the first problem and clean up the second problem if we do this: 1. Code guc.c so that the check_hook is only applied to original user input. When restoring a previous setting during abort (which necessarily will have been passed through the check_hook at an earlier time), only call the assign_hook. 2. Guarantee that the string pointer passed to the assign_hook is exactly what was returned by the check_hook, ie, guc.c is not allowed to duplicate or copy that string. Given these rules, a check_hook and assign_hook could cooperate to store additional data in what guc.c thinks is just a pointer to a string value, ie, there can be more data after the terminating \0. The assign_hook could work off just this additional data without ever doing a catalog lookup. No special show_hook is needed. Of course this only works for string GUC variables, but I'm finding it hard to visualize a case where a bool, int, float, or enum GUC could need a catalog lookup to interpret. We could possibly legislate that all of these are separately malloc'd to allow the same type of trick to be applied across the board, but I think that's overkill. We can just tell people they must use a string GUC if they need hidden data. This technique would need documentation of course, but at least it *would* be documented rather than ad-hoc for each variable. Another variant would be to allow the check_hook to pass back a separate "void *" value that could be passed on to the assign_hook, containing any necessary derived data. This is logically a bit cleaner, and would work for all types of GUC variables; but it would make things messier in guc.c since there would be an additional value to pass around. I'm not convinced it's worth that, but could be talked into it if anyone feels strongly about it. Another thing I was reminded of while perusing the code is the comment for GUC_complaint_elevel: * At some point it'd be nice to replace this with a mechanism that allows* the custom message to become the DETAIL line ofguc.c's generic message. The reason we invented GUC_complaint_elevel in the first place was to avoid a change in the signatures of assign hooks. If we are making such a change, now would be the time to fix it, because we're never gonna fix it otherwise. I see a few ways we could do it: 1. Add a "char **errdetail" parameter to assign_hooks, which guc.c would initialize to NULL before call. If the hook stores a non-null pointer there, guc.c would include that string as errdetail. This is the least effort from guc.c's viewpoint, but will be relatively painful to use from the hook's standpoint, because it'd generally have to palloc some space, or maybe even set up a StringInfo buffer to contain the generated message. 2. Add a "char *errdetail" parameter to assign_hooks, which points at a local-variable buffer in the calling routine, of a well-known size (think GUC_ERRDETAIL_BUFSIZE macro in guc.h). Then hooks do something likesnprintf(errdetail, GUC_ERRDETAIL_BUFSIZE, _("format"), ...); to return an error detail string. 3. Create a function in guc.c for hooks to call, along the lines ofvoid GUC_assign_errdetail(const char *format, ...) The simplest implementation of this would rely on the assumption that assign-hook-calling isn't re-entrant, so it could just store the formatted string in a static variable. That seems a bit ugly, but at least the ugliness would be hidden in a well-defined place, where it could be fixed locally if the assumption ever breaks down. At this point I'm leaning to #3 but wondered if anyone would see it as either overkill or too ugly. With any of these variants, we could forget about my previous suggestion of adding an explicit elevel parameter to assign hook calls, since the hooks would no longer need to know the elog level to use anyway. Comments? regards, tom lane
On Mon, Apr 4, 2011 at 2:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I wrote: >> IMO the real problem is essentially that GUC assign hooks have two >> functions, checking and canonicalization of the value-to-be-stored >> versus executing secondary actions when an assignment is made; and >> there's no way to get at just the first one. So we cannot canonicalize >> the value first and then see if it's equal to the current setting. >> I think the only real fix is to change the API for assign hooks. This >> is a bit annoying but it's not like we haven't ever done that before. >> I'm thinking about splitting assign hooks into two functions, along the >> lines of > >> bool check_hook (datatype *new_value, GucSource source) >> bool assign_hook (datatype new_value, GucSource source) > > After perusing the existing assign_hook functions, I have some further > thoughts about this proposal. > > * Many of the existing assign hooks do a nontrivial amount of > computation (such as catalog lookups) to derive internal values from the > presented string; examples include assign_temp_tablespaces and > assign_timezone. A naive implementation of the above design would > require the assign_hook to repeat this computation after the check_hook > already did it, which of course is undesirable. > > * Assign hooks that do catalog lookups need special pushups to avoid > having to do such lookups while restoring a previous GUC setting during > transaction abort (since lookups can't safely be performed in an aborted > transaction). Up to now we've used ad-hoc techniques for each such > variable, as seen for instance in assign_session_authorization. The > usual idea is to modify the original string to store additional data, > which then requires a custom show_hook to ensure only the original part > of the string is shown to users. The messiest aspect of this is that > it must be possible to reliably tell a modified string from original > user input. > > I think that we can avoid the first problem and clean up the second > problem if we do this: > > 1. Code guc.c so that the check_hook is only applied to original user > input. When restoring a previous setting during abort (which > necessarily will have been passed through the check_hook at an earlier > time), only call the assign_hook. > > 2. Guarantee that the string pointer passed to the assign_hook is > exactly what was returned by the check_hook, ie, guc.c is not allowed > to duplicate or copy that string. > > Given these rules, a check_hook and assign_hook could cooperate to store > additional data in what guc.c thinks is just a pointer to a string > value, ie, there can be more data after the terminating \0. The > assign_hook could work off just this additional data without ever doing > a catalog lookup. No special show_hook is needed. The only thing this proposal has to recommend it is that the current coding is even worse. > Of course this only works for string GUC variables, but I'm finding it > hard to visualize a case where a bool, int, float, or enum GUC could > need a catalog lookup to interpret. We could possibly legislate that > all of these are separately malloc'd to allow the same type of trick to > be applied across the board, but I think that's overkill. We can just > tell people they must use a string GUC if they need hidden data. > > This technique would need documentation of course, but at least it > *would* be documented rather than ad-hoc for each variable. > > Another variant would be to allow the check_hook to pass back a separate > "void *" value that could be passed on to the assign_hook, containing > any necessary derived data. This is logically a bit cleaner, and would > work for all types of GUC variables; but it would make things messier in > guc.c since there would be an additional value to pass around. I'm not > convinced it's worth that, but could be talked into it if anyone feels > strongly about it. > > Another thing I was reminded of while perusing the code is the comment > for GUC_complaint_elevel: > > * At some point it'd be nice to replace this with a mechanism that allows > * the custom message to become the DETAIL line of guc.c's generic message. > > The reason we invented GUC_complaint_elevel in the first place was to > avoid a change in the signatures of assign hooks. If we are making such > a change, now would be the time to fix it, because we're never gonna fix > it otherwise. I see a few ways we could do it: > > 1. Add a "char **errdetail" parameter to assign_hooks, which guc.c would > initialize to NULL before call. If the hook stores a non-null pointer > there, guc.c would include that string as errdetail. This is the least > effort from guc.c's viewpoint, but will be relatively painful to use > from the hook's standpoint, because it'd generally have to palloc some > space, or maybe even set up a StringInfo buffer to contain the generated > message. > > 2. Add a "char *errdetail" parameter to assign_hooks, which points at > a local-variable buffer in the calling routine, of a well-known size > (think GUC_ERRDETAIL_BUFSIZE macro in guc.h). Then hooks do something > like > snprintf(errdetail, GUC_ERRDETAIL_BUFSIZE, _("format"), ...); > to return an error detail string. > > 3. Create a function in guc.c for hooks to call, along the lines of > void GUC_assign_errdetail(const char *format, ...) > The simplest implementation of this would rely on the assumption that > assign-hook-calling isn't re-entrant, so it could just store the > formatted string in a static variable. That seems a bit ugly, but at > least the ugliness would be hidden in a well-defined place, where it > could be fixed locally if the assumption ever breaks down. > > At this point I'm leaning to #3 but wondered if anyone would see it > as either overkill or too ugly. With any of these variants, we could > forget about my previous suggestion of adding an explicit elevel > parameter to assign hook calls, since the hooks would no longer need > to know the elog level to use anyway. Definitely +1 for #3. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, Apr 4, 2011 at 2:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Given these rules, a check_hook and assign_hook could cooperate to store >> additional data in what guc.c thinks is just a pointer to a string >> value, ie, there can be more data after the terminating \0. �The >> assign_hook could work off just this additional data without ever doing >> a catalog lookup. �No special show_hook is needed. > The only thing this proposal has to recommend it is that the current > coding is even worse. Well, if you don't like that, do you like this one? >> Another variant would be to allow the check_hook to pass back a separate >> "void *" value that could be passed on to the assign_hook, containing >> any necessary derived data. �This is logically a bit cleaner, and would >> work for all types of GUC variables; but it would make things messier in >> guc.c since there would be an additional value to pass around. �I'm not >> convinced it's worth that, but could be talked into it if anyone feels >> strongly about it. If not, what do you suggest? regards, tom lane
On Mon, Apr 4, 2011 at 2:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Mon, Apr 4, 2011 at 2:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Given these rules, a check_hook and assign_hook could cooperate to store >>> additional data in what guc.c thinks is just a pointer to a string >>> value, ie, there can be more data after the terminating \0. The >>> assign_hook could work off just this additional data without ever doing >>> a catalog lookup. No special show_hook is needed. > >> The only thing this proposal has to recommend it is that the current >> coding is even worse. > > Well, if you don't like that, do you like this one? To be clear, it's certainly an improvement over what we have now. >>> Another variant would be to allow the check_hook to pass back a separate >>> "void *" value that could be passed on to the assign_hook, containing >>> any necessary derived data. This is logically a bit cleaner, and would >>> work for all types of GUC variables; but it would make things messier in >>> guc.c since there would be an additional value to pass around. I'm not >>> convinced it's worth that, but could be talked into it if anyone feels >>> strongly about it. I haven't really got the mental energy to think through all of this right now in detail, but I think that might be better. I think there's more kludgery here than we're going to fix in one pass, so as long as we're making improvements, I'm happy. Is there any case for using a Datum rather than a void * so people can pack a short quantity in directly without allocating memory, or are we expecting this to always be (say) a struct pointer? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, Apr 4, 2011 at 2:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Another variant would be to allow the check_hook to pass back a separate >> "void *" value that could be passed on to the assign_hook, containing >> any necessary derived data. �This is logically a bit cleaner, and would >> work for all types of GUC variables; but it would make things messier in >> guc.c since there would be an additional value to pass around. �I'm not >> convinced it's worth that, but could be talked into it if anyone feels >> strongly about it. > I haven't really got the mental energy to think through all of this > right now in detail, but I think that might be better. I think > there's more kludgery here than we're going to fix in one pass, so as > long as we're making improvements, I'm happy. Is there any case for > using a Datum rather than a void * so people can pack a short quantity > in directly without allocating memory, or are we expecting this to > always be (say) a struct pointer? Well, I was intending to insist that the void* parameter point to a single malloc'd block, so that guc.c could release it when the value was no longer of interest by doing free(). If we don't say that, then we are going to need a "free" hook for those objects, which is surely way more notational overhead than is likely to be repaid for the occasional cases where a single OID or whatever would be sufficient info. regards, tom lane
On Mon, Apr 4, 2011 at 3:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Mon, Apr 4, 2011 at 2:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Another variant would be to allow the check_hook to pass back a separate >>> "void *" value that could be passed on to the assign_hook, containing >>> any necessary derived data. This is logically a bit cleaner, and would >>> work for all types of GUC variables; but it would make things messier in >>> guc.c since there would be an additional value to pass around. I'm not >>> convinced it's worth that, but could be talked into it if anyone feels >>> strongly about it. > >> I haven't really got the mental energy to think through all of this >> right now in detail, but I think that might be better. I think >> there's more kludgery here than we're going to fix in one pass, so as >> long as we're making improvements, I'm happy. Is there any case for >> using a Datum rather than a void * so people can pack a short quantity >> in directly without allocating memory, or are we expecting this to >> always be (say) a struct pointer? > > Well, I was intending to insist that the void* parameter point to a > single malloc'd block, so that guc.c could release it when the value was > no longer of interest by doing free(). If we don't say that, then we > are going to need a "free" hook for those objects, which is surely way > more notational overhead than is likely to be repaid for the occasional > cases where a single OID or whatever would be sufficient info. OK. Please comment the crap out of whatever you do, or maybe even add a README. This stuff is just a bit arcane, and guideposts help a lot. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > OK. Please comment the crap out of whatever you do, or maybe even add > a README. This stuff is just a bit arcane, and guideposts help a lot. We already have a README for that ;-). PFA, a patch to src/backend/utils/misc/README describing the proposed revised API. If nobody has any objections, I'll get on with making this happen. regards, tom lane diff --git a/src/backend/utils/misc/README b/src/backend/utils/misc/README index 881862a30b182a94cd71c6ed1a98dd6e6d5e51e0..221f595f8b72eb1cf48d318beb4aa8d3cb2b3411 100644 *** a/src/backend/utils/misc/README --- b/src/backend/utils/misc/README *************** determining which setting is used. *** 12,68 **** Per-Variable Hooks ------------------ ! Each variable known to GUC can optionally have an assign_hook and/or ! a show_hook to provide customized behavior. Assign hooks are used to ! perform validity checking on variable values (above and beyond what ! GUC can do). They are also used to update any derived state that needs ! to change when a GUC variable is set. Show hooks are used to modify ! the default SHOW display for a variable. If an assign_hook is provided, it points to a function of the signature ! bool assign_hook(newvalue, bool doit, GucSource source) ! where the type of "newvalue" matches the kind of variable. This function ! is called immediately before actually setting the variable's value (so it ! can look at the actual variable to determine the old value). If the ! function returns "true" then the assignment is completed; if it returns ! "false" then newvalue is considered invalid and the assignment is not ! performed. If "doit" is false then the function should simply check ! validity of newvalue and not change any derived state. The "source" parameter ! indicates where the new value came from. If it is >= PGC_S_INTERACTIVE, ! then we are performing an interactive assignment (e.g., a SET command), and ! ereport(ERROR) is safe to do. But when source < PGC_S_INTERACTIVE, we are ! reading a non-interactive option source, such as postgresql.conf. In this ! case the assign_hook should *not* ereport but should just return false if it ! doesn't like the newvalue. ! If an assign_hook returns false then guc.c will report a generic "invalid ! value for option FOO" error message. If you feel the need to provide a more ! specific error message, ereport() it using "GUC_complaint_elevel(source)" ! as the error level. Note that this might return either ERROR or a lower level ! such as LOG, so the ereport call might or might not return. If it does ! return, return false out of the assign_hook. ! For string variables, the signature for assign hooks is a bit different: ! const char *assign_hook(const char *newvalue, ! bool doit, ! GucSource source) ! The meanings of the parameters are the same as for the other types of GUC ! variables, but the return value is handled differently: ! NULL --- assignment fails (like returning false for other datatypes) ! newvalue --- assignment succeeds, assign the newvalue as-is ! malloc'd (not palloc'd!!!) string --- assign that value instead ! The third choice is allowed in case the assign_hook wants to return a ! "canonical" version of the new value. For example, the assign_hook for ! datestyle always returns a string that includes both output and input ! datestyle options, although the input might have specified only one. - Note that a string variable's assign_hook will NEVER be called with a NULL - value for newvalue, since there would be no way to distinguish success - and failure returns. If the boot_val or reset_val for a string variable - is NULL, it will just be assigned without calling the assign_hook. - Therefore, a NULL boot_val should never be used in combination with an - assign_hook that has side-effects, as the side-effects wouldn't happen - during a RESET that re-institutes the boot-time setting. If a show_hook is provided, it points to a function of the signature const char *show_hook(void) --- 12,108 ---- Per-Variable Hooks ------------------ ! Each variable known to GUC can optionally have a check_hook, an ! assign_hook, and/or a show_hook to provide customized behavior. ! Check hooks are used to perform validity checking on variable values ! (above and beyond what GUC can do), to compute derived settings when ! nontrivial work is needed to do that, and optionally to "canonicalize" ! user-supplied values. Assign hooks are used to update any derived state ! that needs to change when a GUC variable is set. Show hooks are used to ! modify the default SHOW display for a variable. ! ! ! If a check_hook is provided, it points to a function of the signature ! bool check_hook(datatype *newvalue, void **extra, GucSource source) ! The "newvalue" argument is of type bool *, int *, double *, or char ** ! for bool, int/enum, float, or string variables respectively. The check ! function should validate the proposed new value, and return true if it is ! OK or false if not. The function can optionally do a few other things: ! ! * When rejecting a bad proposed value, it may be useful to append some ! additional information to the generic "invalid value for option FOO" ! complaint that guc.c will emit. To do that, call ! void GUC_check_errdetail(const char *format, ...) ! where the format string and additional arguments follow the rules for ! errdetail() arguments. The resulting string will be emitted as the ! DETAIL line of guc.c's error report, so it should follow the message style ! guidelines for DETAIL messages. In general, check_hooks should avoid ! throwing errors directly if possible, though this may be difficult to ! avoid for some corner cases such as out-of-memory. ! ! * Since the newvalue is pass-by-reference, the function can modify it. ! This might be used for example to canonicalize the spelling of a string ! value, round off a buffer size to the nearest supported value, or replace ! a special value such as "-1" with a computed default value. If the ! function wishes to replace a string value, it must malloc (not palloc) ! the replacement value, and be sure to free() the previous value. ! ! * Derived information, such as the role OID represented by a user name, ! can be stored for use by the assign hook. To do this, malloc (not palloc) ! storage space for the information, and return its address at *extra. ! guc.c will automatically free() this space when the associated GUC setting ! is no longer of interest. *extra is initialized to NULL before call, so ! it can be ignored if not needed. ! ! The "source" argument indicates the source of the proposed new value, ! If it is >= PGC_S_INTERACTIVE, then we are performing an interactive ! assignment (e.g., a SET command). But when source < PGC_S_INTERACTIVE, ! we are reading a non-interactive option source, such as postgresql.conf. ! This is sometimes needed to determine whether a setting should be ! allowed. The check_hook might also look at the current actual value of ! the variable to determine what is allowed. ! ! Note that check hooks are sometimes called just to validate a value, ! without any intention of actually changing the setting. Therefore the ! check hook must *not* take any action based on the assumption that an ! assignment will occur. ! If an assign_hook is provided, it points to a function of the signature ! bool assign_hook(datatype newvalue, void *extra) ! where the type of "newvalue" matches the kind of variable, and "extra" ! is the derived-information pointer returned by the check_hook (always ! NULL if there is no check_hook). This function is called immediately ! before actually setting the variable's value (so it can look at the actual ! variable to determine the old value, for example to avoid doing work when ! the value isn't really changing). If the function returns "true" then ! the assignment is completed; if it returns "false" then the assignment ! fails and the variable value is not changed. ! IMPORTANT: assign_hooks should never fail except under the most dire ! circumstances (such as out-of-memory), since a failure may for example ! result in a GUC setting not being rolled back during transaction abort. ! In general, try to do anything that could conceivably fail in a check_hook ! instead, and pass along the results in an "extra" struct, so that the ! assign hook has little to do beyond copying the data to someplace. This ! applies particularly to catalog lookups: any required lookups should be ! done in the check_hook, since the assign_hook may be executed during ! transaction rollback when lookups will be unsafe. ! Note that check_hooks are sometimes called outside any transaction, too. ! This happens when processing the wired-in "bootstrap" value, values coming ! from the postmaster command line or environment, or values coming from ! postgresql.conf. Therefore, any catalog lookups done in a check_hook ! should be guarded with an IsTransactionState() test, and there must be a ! fallback path to allow derived values to be computed during the first ! subsequent use of the GUC setting within a transaction. Typically, ! catalog values computed by the check_hook and installed by the assign_hook ! are used only for the remainder of the transaction in which the new ! setting is made. Each subsequent transaction looks up the values afresh ! on first use. This arrangement is useful to prevent use of stale catalog ! values, independently of the problem of needing to check GUC values ! outside a transaction. If a show_hook is provided, it points to a function of the signature const char *show_hook(void) *************** currently a different interactive value *** 202,226 **** The assign_hook and show_hook routines work only with the actual variable, and are not directly aware of the additional values maintained by GUC. - This is not a problem for normal usage, since we can assign first to the - actual variable and then (if that succeeds) to the additional values as - needed. However, for SIGHUP rereads we may not want to assign to the - actual variable. Our procedure in that case is to call the assign_hook - with doit = false so that the value is validated, but no derived state is - changed. ! String Memory Handling ! ---------------------- ! String option values are allocated with strdup, not with the ! pstrdup/palloc mechanisms. We would need to keep them in a permanent ! context anyway, and strdup gives us more control over handling out-of-memory failures. We allow a string variable's actual value, reset_val, boot_val, and stacked values to point at the same storage. This makes it slightly harder to free space (we must test whether a value to be freed isn't equal to any of the other pointers in the GUC entry or associated stack items). The main ! advantage is that we never need to strdup during transaction commit/abort, so cannot cause an out-of-memory failure there. --- 242,264 ---- The assign_hook and show_hook routines work only with the actual variable, and are not directly aware of the additional values maintained by GUC. ! GUC Memory Handling ! ------------------- ! String option values are allocated with malloc/strdup, not with the ! palloc/pstrdup mechanisms. We would need to keep them in a permanent ! context anyway, and malloc gives us more control over handling out-of-memory failures. We allow a string variable's actual value, reset_val, boot_val, and stacked values to point at the same storage. This makes it slightly harder to free space (we must test whether a value to be freed isn't equal to any of the other pointers in the GUC entry or associated stack items). The main ! advantage is that we never need to malloc during transaction commit/abort, so cannot cause an out-of-memory failure there. + + "Extra" structs returned by check_hook routines are managed in the same + way as string values. Note that we support "extra" structs for all types + of GUC variables, although they are mainly useful with strings.
On Mon, Apr 4, 2011 at 4:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> OK. Please comment the crap out of whatever you do, or maybe even add >> a README. This stuff is just a bit arcane, and guideposts help a lot. > > We already have a README for that ;-). PFA, a patch to > src/backend/utils/misc/README describing the proposed revised API. > If nobody has any objections, I'll get on with making this happen. Looks reasonable to me. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
I wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> OK. Please comment the crap out of whatever you do, or maybe even add >> a README. This stuff is just a bit arcane, and guideposts help a lot. > We already have a README for that ;-). PFA, a patch to > src/backend/utils/misc/README describing the proposed revised API. > If nobody has any objections, I'll get on with making this happen. Attached is a WIP patch for this. It turned out to be a lot more work than I thought, because there are a lot more assign hooks than I'd remembered, and they all needed to be looked at. But I'm pretty happy with the results: things seem noticeably cleaner, and there are way fewer strange hacks like having magical things happen when timezone is set to "UNKNOWN". And it does fix Bernd's original complaint :-) The patch is not complete yet; I need to refactor some code in timezone abbreviation support and encoding-conversion support so that the assign hooks don't have to duplicate work (and risk of failure) from the check hooks. Also I've not touched contrib or the optional PLs yet. My original idea of just providing GUC_check_errdetail() to check hooks proved inadequate: there were several places where useful hints were being offered, and some where it seemed appropriate to override GUC's primary message string too. So now there are also GUC_check_errmsg() and GUC_check_errhint(). A few places were reporting SQLSTATE codes different from ERRCODE_INVALID_PARAMETER_VALUE; in particular the routines associated with transaction properties sometimes reported ERRCODE_ACTIVE_SQL_TRANSACTION instead. As the patch stands it doesn't maintain that behavior, but just reports ERRCODE_INVALID_PARAMETER_VALUE in all check-hook failure cases. We could preserve the old errcodes with another static variable and another function GUC_check_errcode(), but I'm not sure if it's worth the trouble. Thoughts? Also, I'm very seriously considering dropping the provision that says assign hooks can return a bool success flag, but instead having them return void and just saying they're not supposed to fail. The only ones that can ever return false at the moment are the ones for timezone abbrev and encoding, and as stated above that's only for lack of refactoring of their infrastructure. I think allowing failure just complicates matters for guc.c without much redeeming social benefit. Any comments? I'm hoping to push this through to commit today or tomorrow, so I can get back to fooling with collations. regards, tom lane