Thread: Redacting information from logs
Logs are important to diagnose problems or monitor operations, but logs can contain sensitive information which is often unnecessary for these purposes. Redacting the sensitive information would enable easier access and simpler integration with analysis tools without compromising the sensitive information. The challenge is that nobody wants to classify all of the log messages; and even if someone did that today, there would be never-ending work in the future to try to maintain that classification. My proposal is: * redact every '%s' in an ereport by having a special mode for snprintf.c (this is possible because we now own snprintf) * generate both redacted and unredacted messages (if redaction is enabled) * choose which destinations (stderr, eventlog, syslog, csvlog) get redacted or plain messages * emit_log_hook always has both redacted and plain messages available * allow specifying a custom redaction function, e.g. a function that hashes the string rather than completely redacting it I think '%s' in a log message is a pretty close match to the kind of information that might be sensitive. All data goes through type output functions (e.g. the conflicting datum for a unique constraint violation message), and most other things that a user might type would go through %s. A lot of other information useful in logs, like LSNs, %m's, PIDs, etc. would be preserved. All object names would be redacted, but that's not as bad as it sounds: (a) You can specify a custom redaction function that hashes rather than completely redacts. That allows you to see if different messages refer to the same object, and also map back to suspected objects if you really need to. (b) The unredacted object names are still a part of ErrorData, so you can do something interesting with emit_log_hook. (c) You still might have the unredacted logs in a more protected place, and can access them when you really need to. A weakness of this proposal is that it could be confusing to use ereport() in combination with snprintf(). If using snprintf to build the format string, nothing would be redacted, so you'd have to be careful not to expand any %s that might be sensitive. If using snprintf to build up an argument, the entire argument would be redacted. The first case should not be common, because good coding generally avoids non-constant format strings. The second case is just over-redaction, which is not necessarily bad. One annoying case would be if some of the arguments to ereport() are used for things like the right number of commas or tabs -- redacting those would just make the message look horrible. I didn't find such cases but I'm pretty sure they exist. Another annoying case is time, which is useful for debugging, but formatted with %s so it gets redacted (I did find plenty of these cases). But I don't see a better solution. Right now, it's a pain to treat log files as sensitive things when there are so many ways they can help with smooth operations and so many tools available to analyze them. This proposal seems like a practical solution to enable better use of log files while protecting potentially-sensitive information. Attached is a WIP patch. Regards, Jeff Davis
Attachment
On Tue, Jul 30, 2019 at 11:54:55AM -0700, Jeff Davis wrote: >Logs are important to diagnose problems or monitor operations, but logs >can contain sensitive information which is often unnecessary for these >purposes. Redacting the sensitive information would enable easier >access and simpler integration with analysis tools without compromising >the sensitive information. > OK, that's a worthwhile goal. I assume by "sensitive data" you mean user data, right? >The challenge is that nobody wants to classify all of the log messages; >and even if someone did that today, there would be never-ending work in >the future to try to maintain that classification. > >My proposal is: > > * redact every '%s' in an ereport by having a special mode for >snprintf.c (this is possible because we now own snprintf) > * generate both redacted and unredacted messages (if redaction is >enabled) > * choose which destinations (stderr, eventlog, syslog, csvlog) get >redacted or plain messages > * emit_log_hook always has both redacted and plain messages available > * allow specifying a custom redaction function, e.g. a function that >hashes the string rather than completely redacting it > >I think '%s' in a log message is a pretty close match to the kind of >information that might be sensitive. All data goes through type output >functions (e.g. the conflicting datum for a unique constraint violation >message), and most other things that a user might type would go through >%s. A lot of other information useful in logs, like LSNs, %m's, PIDs, >etc. would be preserved. > IMHO the crucial part here is 'might be sensitive'. How often is that actually true? My guess is 99% of places using %s are not sensitive at all, and are used for things like filenames, table/attribute names, and so on. And redacting those parts will make the logs essentially useless, because we'll get things like this: ERROR: column "******" does not exist at character 10 ERROR: division by zero CONTEXT: SQL function "******" during inlining I'm not sure those are the logs I'd like to see on a production system while investigating an issue. >All object names would be redacted, but that's not as bad as it sounds: > (a) You can specify a custom redaction function that hashes rather >than completely redacts. That allows you to see if different messages >refer to the same object, and also map back to suspected objects if you >really need to. > (b) The unredacted object names are still a part of ErrorData, so you >can do something interesting with emit_log_hook. Isn't hashing essentially an information leak, i.e. somewhat undesirable for sensitive data? > (c) You still might have the unredacted logs in a more protected >place, and can access them when you really need to. > The question is whether that's actually an acceptable solution for deployments that do handle sensitive data ... >A weakness of this proposal is that it could be confusing to use >ereport() in combination with snprintf(). If using snprintf to build >the format string, nothing would be redacted, so you'd have to be >careful not to expand any %s that might be sensitive. If using snprintf >to build up an argument, the entire argument would be redacted. The >first case should not be common, because good coding generally avoids >non-constant format strings. The second case is just over-redaction, >which is not necessarily bad. > >One annoying case would be if some of the arguments to ereport() are >used for things like the right number of commas or tabs -- redacting >those would just make the message look horrible. I didn't find such >cases but I'm pretty sure they exist. Another annoying case is time, >which is useful for debugging, but formatted with %s so it gets >redacted (I did find plenty of these cases). > >But I don't see a better solution. Right now, it's a pain to treat log >files as sensitive things when there are so many ways they can help >with smooth operations and so many tools available to analyze them. >This proposal seems like a practical solution to enable better use of >log files while protecting potentially-sensitive information. > Hmm. I wonder how difficult would it be to actually go through the ereport calls and classify those that can leak sensitive data, and then do redaction only for those. That's about the only alternative approach I can think of. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, On 2019-07-30 11:54:55 -0700, Jeff Davis wrote: > My proposal is: > > * redact every '%s' in an ereport by having a special mode for > snprintf.c (this is possible because we now own snprintf) I'm extremely doubtful this is a sane approach. We use snprintf for a heck of a lot of things. The likelihood of this having unintended consequences seems high (consider an error being thrown while trying to report another error message and such). Nor do I think that snprintf.c is a good layer to perform redaction - it's too low level. It's used for both frontend/backend. It's used for both non-error and error purposes. I also don't think you're actually going to get that far with it - there's plenty places where we concatenate error messages without using *printf, but e.g. appendStringInfoString(). > But I don't see a better solution. Right now, it's a pain to treat log > files as sensitive things when there are so many ways they can help > with smooth operations and so many tools available to analyze them. > This proposal seems like a practical solution to enable better use of > log files while protecting potentially-sensitive information. I don't really see a low-effort way either. But I'm fairly certain that this will cause at least many problems as it'll help solve. I think incrementally moving to messages where portions of information are separated out (e.g. the things we'd inline with %s) is, although a lengthy process, the better approach. It'll make richer output formats possible, it'll allow for proper redaction, etc. I.e. something very roughly like ereport(ERROR, errmsg_rich("string with %{named}s references to %{parameter}s"), errparam("named", somevar), errparam("parameter", othervar, .redact = CONTEXT)); Which would allow us to add annotate whether a specific parameter needs to be redacted for certain contexts. I'd probably add a errredact(bool) to annotate whether a message needs to be redacted, mostly so we can easily flag a lot of current messages as OK. When not present, I'd redact the entire message when errmsg() is being used, and redact nothing if errmsg_rich() is used, and none of the parameters flag an error. That'd then also allow us to reference parameters that clients / exception handlers may not see, e.g. the arguments to leakproof functions. Which currently makes a lot of issues harder to debug, because we don't get the values for e.g. overflows, input syntax errors etc. Allowing errparam()s to be specified that are not used in the error messages, we can provide more detail to errors for people using richer log outputs. I'd assume we'd fairly quickly have logfmt/json logging target/format. Greetings, Andres Freund
On 7/30/19 2:54 PM, Jeff Davis wrote: > Logs are important to diagnose problems or monitor operations, but logs > can contain sensitive information which is often unnecessary for these > purposes. Redacting the sensitive information would enable easier > access and simpler integration with analysis tools without compromising > the sensitive information. This seems like a thread that could contain a link to this other thread: https://www.postgresql.org/message-id/1055919c-98ea-56d2-9ea7-f8a7c72e16b4%40anastigmatix.net Implementing ideas are different, but motivating concerns the same. Regards, -Chap
Andres Freund <andres@anarazel.de> writes: > On 2019-07-30 11:54:55 -0700, Jeff Davis wrote: >> My proposal is: >> * redact every '%s' in an ereport by having a special mode for >> snprintf.c (this is possible because we now own snprintf) > I'm extremely doubtful this is a sane approach. Yeah, it's really hard to believe that messing with snprintf isn't going to have a lot of unintended consequences. If we want to do something about this, we're going to have to bite the bullet and go around to annotate all those arguments with redaction-worthiness info. It'd be a lot of work, but as long as we can make sure to do it incrementally, we could get there. (It's not like ereport itself has been there forever ...) > I.e. something very roughly like > ereport(ERROR, > errmsg_rich("string with %{named}s references to %{parameter}s"), > errparam("named", somevar), > errparam("parameter", othervar, .redact = CONTEXT)); I'm not terribly attracted by that specific approach, though. With this, we'd get to maintain *two* variants of snprintf, and I think the one with annotation knowledge would have significant performance problems. (Something I'm sensitive to, on account of certain persons beating me over the head about snprintf.c's performance.) It'd be an amazing pain in the rear for translators, too, on account of their tools not understanding this format-string language. It seems to me that it'd be sufficient to do the annotation by inserting wrapper functions, like the errparam() you suggest above. If we just had errparam() choosing whether to return "..." instead of its argument string, we'd have what we need, without messing with the format language. regards, tom lane
Hi, On 2019-08-03 19:14:13 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > I.e. something very roughly like > > > ereport(ERROR, > > errmsg_rich("string with %{named}s references to %{parameter}s"), > > errparam("named", somevar), > > errparam("parameter", othervar, .redact = CONTEXT)); > > I'm not terribly attracted by that specific approach, though. With > this, we'd get to maintain *two* variants of snprintf, and I think > the one with annotation knowledge would have significant performance > problems. (Something I'm sensitive to, on account of certain persons > beating me over the head about snprintf.c's performance.) I don't think that the performance issues of *printf in general - which we use in a lot of hot paths like output functions - are the same as when we log messages/throw errors. My gut feeling that at least >= DEBUG1 the difference would be hard to measure. And given how we e.g. build up the log_line_prefix, we could probably also recoup a lot potentially lost performance, around logging. We would probably just allow both positional references and named references. I.e. for smaller messages we can just use the current format, and just for the very long ones with many parameters we'd normally use a positional reference - but still have the name associated with the parameter. Even if we don't want to allow reference parameters by name in the format string, I think it's worthwhile. See below. It's really kind of a shame that we constantly re-parse the format strings. With something like modern C++ it'd be fairly easy to not do that (by using compile time evaluated constexprs when possible) - but I wonder if we couldn't also manage to do that within plain C, at the cost of a branch and some macro magic. I'm imagining something like #define pg_printf(fmt, ...) \ do { \ if ( __builtin_constant_p(fmt)) \ { \ static processed_fmt processed_fmt_ = {.format = fmt}; \ if (unlikely(!processed_fmt_.parsed)) \ preprocess_format_string(&processed_fmt_) \ pg_printf_processed(&processed_fmt_, __VA_ARGS__); \ } \ else \ pg_printf_unprocessed(fmt, __VA_ARGS__); \ } while (0) \ having to use a static variable for this purpose sucks somewhat, but I can't think of something in C that'd let us avoid that. > It'd be an amazing pain in the rear for translators, too, on account > of their tools not understanding this format-string language. The limitations of that world are starting to be really frustrating. > It seems to me that it'd be sufficient to do the annotation by > inserting wrapper functions, like the errparam() you suggest above. > If we just had errparam() choosing whether to return "..." instead of > its argument string, we'd have what we need, without messing with > the format language. I agree that if redaction were the only reason, that the expanded format strings wouldn't be worth it. And perhaps they still aren't. I think we've quite repeatedly had requests for a log format that can be parsed reasonably, and annotated with additional information that's too verbose for the main message. It's a pain to have to parse quite complex strings in production, just to extract the value of the actual information in the message. As an *easy* example of something I wanted to do in the past: Plot the total wait time of backends that had to wait for locks longer than deadlock_timeout. In that case, because we saw the system failing with too much waiting, but didn't really know when that started become problematic, and what the neormal amount of waiting is So we need to parse ereport(LOG, (errmsg("process %d acquired %s on %s after %ld.%03d ms", MyProcPid, modename, buf.data, msecs, usecs))); into something usable. First problem is of course that the log_line_prefix is variable, and that we'll likely need information from it to hunt down the source of locks. Parsing the above with regexes isn't too hard - but then we also need to deal with the fact that the message can be translated. Oh, and finding the associated statement requires looking forward in the log stream by a hard to predict amount, and requires knowledge about log verbosity was set to. In the past I had also wanted to parse the log to understand how much work manual and auto-vacuum an installation was doing, and whether that's perhaps correlated to problems. But for that one need a parser that can handle this mess: https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/access/heap/vacuumlazy.c;h=a3c4a1df3b4b28953ab1dfcd43f3d566eecac555;hb=HEAD#l396 which is obviously much harder. If we instead had the relevant variables separated out in a named fashion, we could - like many other tools these days - have a richer log target, that'd first have typical log message as a human readable string, but then also all the interpolated values separately. E.g. in theaforementioned simpler case it could be something like ts="2019-08-04 10:19:21.286 PDT" pid=32416 vxid=12/9 sev=LOG \ msg="Prozess 15027 erlangte AccessShareLock-Sperre auf Relation 1259 der Datenbank 13416 nach 29808,865 ms" \ fmt="process %d acquired %s on %s after %ld.%03d ms" \ p:modename="AccessShareLock-Sperre auf Relation 1259" \ p:msec=18343 p:usec=173 \ p:locktag=190345/134343/0/0/relation/default \ c:statement="LOCK TABLE pg_class;" \ l:func=ProcSleep l:file=proc.c l:line=1495 So, the prefix fields are just included as k=v, the evaluated message as one field, the *untranslated* format as another. All the parameters are included prefixed with p:, context information prefixed with c:, and location with l:. Note how e.g. p:modename is a pre-translated string, which is why p:locktag was added as an intentionall untranslated paramter that's not included in the message. This isn't perfect, but still a heck of a lot easier to parse than what we have now. All of the data is one block rather than in consecutive log lines that may or may not belong together, it's key=value in a way that can be parsed, the untranslated format string is available, so tools can actually analyze logs independent of the current locale. This isn't a proposal to adopt precisely the above log format - that's just something I came up with while writing this email - but to have enough information available to actually produce something like it when emitting logs. If we have enough information to pass to the logging hook, we don't even need to define how all of this is going to look like exactly (although I'd probably argue that a logfmt or json target ought to be in core). The cool part - and people from other projects/company projects might be laughing about me describing it as cool in this day and age - is that once you have a format that can readily be parsed it's pretty simple to write tools that process a logfile and extract just the information you want. Greetings, Andres Freund
On Sat, 2019-08-03 at 19:14 -0400, Tom Lane wrote: > It seems to me that it'd be sufficient to do the annotation by > inserting wrapper functions, like the errparam() you suggest above. > If we just had errparam() choosing whether to return "..." instead of > its argument string, we'd have what we need, without messing with > the format language. I'm having trouble getting the ergonomics to work out here so that it can generate both a redacted and an unredacted message. If errparam() is a normal argument to errmsg(), then errparam() will be evaluated first. Will it return the redacted version, the unredacted version, or a special type that holds both? If I try to use macros to force multiple evaluation (to get one redacted and one unredacted string), then it seems like that would happen for all arguments (not just errparam arguments), which would be bad. Suggestions? Regards, Jeff Davis
Hi, On 2019-08-05 13:37:50 -0700, Jeff Davis wrote: > On Sat, 2019-08-03 at 19:14 -0400, Tom Lane wrote: > > It seems to me that it'd be sufficient to do the annotation by > > inserting wrapper functions, like the errparam() you suggest above. > > If we just had errparam() choosing whether to return "..." instead of > > its argument string, we'd have what we need, without messing with > > the format language. > > I'm having trouble getting the ergonomics to work out here so that it > can generate both a redacted and an unredacted message. > > If errparam() is a normal argument to errmsg(), then errparam() will be > evaluated first. Will it return the redacted version, the unredacted > version, or a special type that holds both? I was thinking that it'd just store a struct ErrParam, which'd reference the passed value and metadata like the name (for structured log output) and redaction category. The bigger problem I see is handling the different types of arguments - but perhaps the answer there would be to just make the macro typesafe? Or have one for scalar values and one for pointer types? We could even allocate the necessary information for this on the stack, with some macro trickery. Not sure if it's worth it, but ... Doing something like this does however require not directly using plain sprintf. I'm personally not bothered by that, but I'd not be surprised if e.g. Tom saw that differently. > If I try to use macros to force multiple evaluation (to get one > redacted and one unredacted string), then it seems like that would > happen for all arguments (not just errparam arguments), which would be > bad. Yea, multiple evaluation clearly is a no-go. Greetings, Andres Freund
On Sun, 2019-08-04 at 11:17 -0700, Andres Freund wrote: > I'm imagining something like > > #define pg_printf(fmt, ...) \ > do { \ > if ( __builtin_constant_p(fmt)) \ > { \ > static processed_fmt processed_fmt_ = {.format = fmt}; \ > if (unlikely(!processed_fmt_.parsed)) \ > preprocess_format_string(&processed_fmt_) \ > pg_printf_processed(&processed_fmt_, __VA_ARGS__); \ > } \ > else \ > pg_printf_unprocessed(fmt, __VA_ARGS__); \ > } while (0) \ What would you do in the preprocessing exactly? Create a list of indexes into the string where the format codes are? > I think we've quite repeatedly had requests for a log format that can > be > parsed reasonably, and annotated with additional information that's > too > verbose for the main message. It's a pain to have to parse quite > complex strings in production, just to extract the value of the > actual > information in the message. > > ts="2019-08-04 10:19:21.286 PDT" pid=32416 vxid=12/9 sev=LOG \ > msg="Prozess 15027 erlangte AccessShareLock-Sperre auf Relation > 1259 der Datenbank 13416 nach 29808,865 ms" \ > fmt="process %d acquired %s on %s after %ld.%03d ms" \ > p:modename="AccessShareLock-Sperre auf Relation 1259" \ > p:msec=18343 p:usec=173 \ > p:locktag=190345/134343/0/0/relation/default \ > c:statement="LOCK TABLE pg_class;" \ > l:func=ProcSleep l:file=proc.c l:line=1495 > ... > If we have enough information to pass to the logging hook, we don't > even > need to define how all of this is going to look like exactly > (although > I'd probably argue that a logfmt or json target ought to be in core). I think I see where you are going with this now: it is almost orthogonal to your new-style format strings ( %{foo}s ), but not quite. You're suggesting that we save all of the arguments, along with some annotation, in the ErrorData struct, and then let emit_log_hook sort it out (perhaps by constructing some JSON message, perhaps translating the message_id, etc.). I like the idea, but still playing with the ergonomics a bit, and how it interacts with various message parts (msg, hint, detail, etc.). If we had the name-based format strings, then the message parts could all share a common set of parameters; but with the current positional format strings, I think each message part would need its own set of arguments. Positional: ereport(ERROR, (errcode(ERRCODE_UNIQUE_VIOLATION), errmsg_params("duplicate key value violates unique constraint \"%s\"", errparam("constraintname", MSGDEFAULT, RelationGetRelationName(rel)), errdetail_params("Key %s already exists.", errparam("key", MSGUSERDATA, key_desc))) ); Named: ereport(ERROR, (errcode(ERRCODE_UNIQUE_VIOLATION), errmsg_rich("duplicate key value violates unique constraint \"%{constraintname}s\""), errdetail_rich("Key %{key}s already exists."), errparam("key", MSGUSERDATA, key_desc)) errparam("constraintname", MSGDEFAULT, RelationGetRelationName(rel))) ); I think either one needs some ergonomic improvements, but it seems like we are going in the right direction. Maybe we can make the parameters common to different message parts by using an integer index to reference the parameter, like: ereport(ERROR, (errcode(ERRCODE_UNIQUE_VIOLATION), errmsg_rich("duplicate key value violates unique constraint \"%s\"", 1 /* second errparam */), errdetail_rich("Key %s already exists.", 0 /* first errparam */), errparam("key", MSGUSERDATA, key_desc)) errparam("constraintname", MSGDEFAULT, RelationGetRelationName(rel))) ); Not quite ideal, but might get us closer. Regards, Jeff Davis
On Mon, 2019-08-05 at 14:10 -0700, Andres Freund wrote: > I was thinking that it'd just store a struct ErrParam, which'd > reference > the passed value and metadata like the name (for structured log > output) and redaction category. The bigger problem I see is handling > the different types of arguments - but perhaps the answer there would > be > to just make the macro typesafe? Or have one for scalar values and > one > for pointer types? Losing the compile-time checks for compatibility between format codes and arguments would be a shame. Are you saying there's a potential solution for that? > Doing something like this does however require not directly using > plain > sprintf. I'm personally not bothered by that, but I'd not be > surprised > if e.g. Tom saw that differently. It may be possible to still use sprintf if we translate the ErrParams into plain values first. Regards, Jeff Davis
Hi, On 2019-08-05 14:26:44 -0700, Jeff Davis wrote: > On Sun, 2019-08-04 at 11:17 -0700, Andres Freund wrote: > > I'm imagining something like > > > > #define pg_printf(fmt, ...) \ > > do { \ > > if ( __builtin_constant_p(fmt)) \ > > { \ > > static processed_fmt processed_fmt_ = {.format = fmt}; \ > > if (unlikely(!processed_fmt_.parsed)) \ > > preprocess_format_string(&processed_fmt_) \ > > pg_printf_processed(&processed_fmt_, __VA_ARGS__); \ > > } \ > > else \ > > pg_printf_unprocessed(fmt, __VA_ARGS__); \ > > } while (0) \ > > What would you do in the preprocessing exactly? Create a list of > indexes into the string where the format codes are? Yea, basically. If you look at snprint.c's dopr(), there's a fair bit of parsing going on that is going to stay the same from call to call in nearly all cases (and the cases where not we possibly ought to fix). And things like the $ processing for argument order, or having named arguments as I suggest, make that more pronounced. > > If we have enough information to pass to the logging hook, we don't > > even > > need to define how all of this is going to look like exactly > > (although > > I'd probably argue that a logfmt or json target ought to be in core). > > I think I see where you are going with this now: it is almost > orthogonal to your new-style format strings ( %{foo}s ), but not quite. > > You're suggesting that we save all of the arguments, along with some > annotation, in the ErrorData struct, and then let emit_log_hook sort it > out (perhaps by constructing some JSON message, perhaps translating the > message_id, etc.). Right. > I like the idea, but still playing with the ergonomics a bit, and how > it interacts with various message parts (msg, hint, detail, etc.). If > we had the name-based format strings, then the message parts could all > share a common set of parameters; but with the current positional > format strings, I think each message part would need its own set of > arguments. Right, I think that's a good part of where I was coming from. > Maybe we can make the parameters common to different message parts by > using an integer index to reference the parameter, like: > > ereport(ERROR, > (errcode(ERRCODE_UNIQUE_VIOLATION), > errmsg_rich("duplicate key value violates unique constraint > \"%s\"", 1 /* second errparam */), > errdetail_rich("Key %s already exists.", 0 /* first > errparam */), > errparam("key", MSGUSERDATA, key_desc)) > errparam("constraintname", MSGDEFAULT, > RelationGetRelationName(rel))) > ); > > Not quite ideal, but might get us closer. If we insist that errmsg_rich/errdetail_rich may not have parameters, then they can just use the same set of arguments, without any of this, at the cost of sometimes more complicated % syntax (i.e. %1$d to refer to the first argument). I think the probable loss of gcc format warnings would be the biggest issue with this whole proposal, and potential translator trouble the biggest impediment for named parameters. Greetings, Andres Freund
Hi, On 2019-08-05 14:32:36 -0700, Jeff Davis wrote: > On Mon, 2019-08-05 at 14:10 -0700, Andres Freund wrote: > > I was thinking that it'd just store a struct ErrParam, which'd > > reference > > the passed value and metadata like the name (for structured log > > output) and redaction category. The bigger problem I see is handling > > the different types of arguments - but perhaps the answer there would > > be > > to just make the macro typesafe? Or have one for scalar values and > > one > > for pointer types? > > Losing the compile-time checks for compatibility between format codes > and arguments would be a shame. Are you saying there's a potential > solution for that? Yea, I'd just written that in another reply to yours. I did actually think about this some recently: https://www.postgresql.org/message-id/20190730181845.jyyk4selyohagwnf%40alap3.anarazel.de Not sure if any of that is really applicable. Once more I really am regretting that PG is C rather than C++. > > Doing something like this does however require not directly using > > plain > > sprintf. I'm personally not bothered by that, but I'd not be > > surprised > > if e.g. Tom saw that differently. > > It may be possible to still use sprintf if we translate the ErrParams > into plain values first. It's pretty hard to call those functions with a runtime variable number of arguments. va_list as an interface is not great... Greetings, Andres Freund
On Mon, 2019-08-05 at 14:44 -0700, Andres Freund wrote: > at the cost of sometimes more complicated % syntax (i.e. %1$d to > refer > to the first argument). > > I think the probable loss of gcc format warnings would be the biggest > issue with this whole proposal, and potential translator trouble the > biggest impediment for named parameters. I'd be OK with '%1$d' syntax. That leaves type safety as the main problem. Your solution involving _Generic is interesting -- I didn't even know that existed. I don't think it needs to be supported on all compilers, as long as we are getting errors from somewhere. They would be runtime errors instead of compile-time errors, though. Regards, Jeff Davis