Thread: TODO list
2 things. I submitted a patch for this 5 months ago, which is still waiting to be merged (hope it hasn't bitrotted in the meantime): . Allow log lines to include session-level information, like database and user If nobody is working on this I am prepared to look at it: . Allow logging of only data definition(DDL), or DDL and modification statements cheers andrew
I wrote: > > > If nobody is working on this I am prepared to look at it: > > . Allow logging of only data definition(DDL), or DDL and modification > statements > Here are some options: 1. change the type of "log_statement" option from boolean to string, with allowed values of "all, mod, ddl, none" with default "none". 2. same as 1. but make boolean true values synonyms for "all" and boolean false values synonyms for "none". 3. keep "log_statement" option as now and add a new option "log_statement_level" with the same options as 1. but default to "all", which will have no effect unless "log_statement" is true. Also, I assume "modification statements" means insert/update/delete, or are we talking about DDL mods (like "alter table")? Finally, what about functions that have side effects? It would be nice to be able to detect the statements to be logged at the syntactic level, but it strikes me that that might not be possible. cheers andrew
Andrew Dunstan wrote: > > 2 things. > > I submitted a patch for this 5 months ago, which is still waiting to be > merged (hope it hasn't bitrotted in the meantime): > > . Allow log lines to include session-level information, like database > and user > > If nobody is working on this I am prepared to look at it: > > . Allow logging of only data definition(DDL), or DDL and modification > statements Yes, sorry I haven't gotten back to that, and yes the DDL idea is a good one. For the log idea, I think we need to get a way to merge all the per-line info into one setup, so pid, timestamp, user, etc would all be configurable using your setup. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian said: > Andrew Dunstan wrote: >> >> 2 things. >> >> I submitted a patch for this 5 months ago, which is still waiting to be merged (hope it hasn't bitrotted in the meantime): >> >> . Allow log lines to include session-level information, like database and user >> >> If nobody is working on this I am prepared to look at it: >> >> . Allow logging of only data definition(DDL), or DDL and modification statements > > Yes, sorry I haven't gotten back to that, and yes the DDL idea is a good one. > > For the log idea, I think we need to get a way to merge all the > per-line info into one setup, so pid, timestamp, user, etc would all be configurable using your setup. > I thought we had thrashed this out back in August. Certainly the only thing I recall seeing after I submitted the patch was some stylistic criticism from Neil, which I addressed in a revised patch. Anyway, it is in principle doable. That's partly why I adopted a printf style format string. There are some wrinkles, though: . interaction with syslog pid/timestamp logging . making sure the info is available when you need to log it - I had to rearrange a few thing to avoid getting SEGVs, IIRC. Also, the session duration logging part of the patch is orthogonal to the issue. cheers andrew
Andrew Dunstan wrote: > Bruce Momjian said: > > Andrew Dunstan wrote: > >> > >> 2 things. > >> > >> I submitted a patch for this 5 months ago, which is still waiting to > be merged (hope it hasn't bitrotted in the meantime): > >> > >> . Allow log lines to include session-level information, like database > and user > >> > >> If nobody is working on this I am prepared to look at it: > >> > >> . Allow logging of only data definition(DDL), or DDL and modification > statements > > > > Yes, sorry I haven't gotten back to that, and yes the DDL idea is a good > one. > > > > For the log idea, I think we need to get a way to merge all the > > per-line info into one setup, so pid, timestamp, user, etc would all be > configurable using your setup. > > > > I thought we had thrashed this out back in August. Certainly the only > thing I recall seeing after I submitted the patch was some stylistic > criticism from Neil, which I addressed in a revised patch. > > Anyway, it is in principle doable. That's partly why I adopted a printf > style format string. There are some wrinkles, though: > . interaction with syslog pid/timestamp logging Yes. If you use syslog, just don't ask for pid/timestamp and let syslog do it. Of course, right now we are able to send non-pid/timestamp to syslog _and_ send pid/timestamp to a log file, but that seems like a rare operation that doesn't justify keeping the various log parameters separate. Also, I would like to see some kind of session identifier that is more unique than pid, which wraps around. Ideally we could have 10{pid}, then then the pid wraps around, 20{pid), or something like that. > . making sure the info is available when you need to log it - I had to > rearrange a few thing to avoid getting SEGVs, IIRC. Of course some messages, like postmaster status messages, don't have some of these fields, like username or host. Is that going to cause problems for tools that read our log files? > Also, the session duration logging part of the patch is orthogonal to the > issue. You mean query duration? Yes, it is orthoginal. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian wrote: >Andrew Dunstan wrote: > > >>I thought we had thrashed this out back in August. Certainly the only >>thing I recall seeing after I submitted the patch was some stylistic >>criticism from Neil, which I addressed in a revised patch. >> >>Anyway, it is in principle doable. That's partly why I adopted a printf >>style format string. There are some wrinkles, though: >>. interaction with syslog pid/timestamp logging >> >> > >Yes. If you use syslog, just don't ask for pid/timestamp and let syslog >do it. Of course, right now we are able to send non-pid/timestamp to >syslog _and_ send pid/timestamp to a log file, but that seems like a >rare operation that doesn't justify keeping the various log parameters >separate. > I'm OK with that as long as it is the consensus view. It does seem a little odd to remove functionality (however rare) for the sake of configuration neatness, though. > >Also, I would like to see some kind of session identifier that is more >unique than pid, which wraps around. Ideally we could have 10{pid}, >then then the pid wraps around, 20{pid), or something like that. > This requires some thought. ISTM it wouldn't buy you much unless you made it persistent across server restarts, and possibly not even then. I don't see it as a reason to hold up these features, though. If someone wants to tackle this it could be plugged in to the loginfo feature very easily as an extra escape sequence. > > > >>. making sure the info is available when you need to log it - I had to >>rearrange a few thing to avoid getting SEGVs, IIRC. >> >> > >Of course some messages, like postmaster status messages, don't have >some of these fields, like username or host. Is that going to cause >problems for tools that read our log files? > If users want a non-empty info string they will have to teach the tools to handle it anyway. The point was, however, that rolling up PID and timestamp into the printf-style format will require some significant work, because we wouldn't want to lose that info if the user/db weren't known, whereas the patch currently suppresses all log-info output if these are not present (i.e. when MyProcPort == NULL). > > > >>Also, the session duration logging part of the patch is orthogonal to the >>issue. >> >> > >You mean query duration? Yes, it is orthoginal. > > No, I meant the logging of the end of a session, including its duration, which was also in the patch. cheers andrew
On Tue, 6 Jan 2004, Andrew Dunstan wrote: > >Also, I would like to see some kind of session identifier that is more > >unique than pid, which wraps around. Ideally we could have 10{pid}, > >then then the pid wraps around, 20{pid), or something like that. > > This requires some thought. ISTM it wouldn't buy you much unless you > made it persistent across server restarts, and possibly not even then. And on OpenBSD (though no other platforms that I know of) the PID is a random number, so there is no "wrapping" to begin with. Jon
Am Tuesday 06 January 2004 21:30 schrieb Jon Jensen: > On Tue, 6 Jan 2004, Andrew Dunstan wrote: > > >Also, I would like to see some kind of session identifier that is more > > >unique than pid, which wraps around. Ideally we could have 10{pid}, > > >then then the pid wraps around, 20{pid), or something like that. > > > > This requires some thought. ISTM it wouldn't buy you much unless you > > made it persistent across server restarts, and possibly not even then. > > And on OpenBSD (though no other platforms that I know of) the PID is a > random number, so there is no "wrapping" to begin with. Linux >= 2.6 has random pids too.
Andrew Dunstan wrote: > > > Bruce Momjian wrote: > > >Andrew Dunstan wrote: > > > > > >>I thought we had thrashed this out back in August. Certainly the only > >>thing I recall seeing after I submitted the patch was some stylistic > >>criticism from Neil, which I addressed in a revised patch. > >> > >>Anyway, it is in principle doable. That's partly why I adopted a printf > >>style format string. There are some wrinkles, though: > >>. interaction with syslog pid/timestamp logging > >> > >> > > > >Yes. If you use syslog, just don't ask for pid/timestamp and let syslog > >do it. Of course, right now we are able to send non-pid/timestamp to > >syslog _and_ send pid/timestamp to a log file, but that seems like a > >rare operation that doesn't justify keeping the various log parameters > >separate. > > > > > > I'm OK with that as long as it is the consensus view. It does seem a > little odd to remove functionality (however rare) for the sake of > configuration neatness, though. Yes, agreed. Let's explore it. The rare functionality we would be removing is for:o User uses syslogo User wants to log postmaster output to a file tooo User wants timestamp info in thepostmaster output file And the downside is that they get duplicate timestamps in syslog. Now, if we don't merge timestamp into your new per-line log string, we end up with a rather illogical and inflexible output format, only to allow the rare case listed above. Clearly this isn't a 100% clear decision, but it seems to lean in the direction of removing the functionality with the goal of providing a more logical logging API to the users. > >Also, I would like to see some kind of session identifier that is more > >unique than pid, which wraps around. Ideally we could have 10{pid}, > >then then the pid wraps around, 20{pid), or something like that. > > > > > This requires some thought. ISTM it wouldn't buy you much unless you > made it persistent across server restarts, and possibly not even then. I > don't see it as a reason to hold up these features, though. If someone > wants to tackle this it could be plugged in to the loginfo feature very > easily as an extra escape sequence. Yes, that was my idea --- let's get this in and we can then add a session id, and your point about restarts is a good one. > >>. making sure the info is available when you need to log it - I had to > >>rearrange a few thing to avoid getting SEGVs, IIRC. > >> > >> > > > >Of course some messages, like postmaster status messages, don't have > >some of these fields, like username or host. Is that going to cause > >problems for tools that read our log files? > > > > > If users want a non-empty info string they will have to teach the tools > to handle it anyway. The point was, however, that rolling up PID and > timestamp into the printf-style format will require some significant > work, because we wouldn't want to lose that info if the user/db weren't > known, whereas the patch currently suppresses all log-info output if > these are not present (i.e. when MyProcPort == NULL). Oh, good point. That would suggest that maybe we are better off leaving pid and timestamp as separate options and _not_ merge them into your new string. I am starting to think having your string print only session-specific information is the best way. I wonder if we should rename your GUC log_session_line or something like that. > >>Also, the session duration logging part of the patch is orthogonal to the > >>issue. > > > >You mean query duration? Yes, it is orthoginal. > > No, I meant the logging of the end of a session, including its duration, > which was also in the patch. Oh, I missed that one. It seems like a nice addition. I see you added it when they ask for log_connections. Good idea. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Jon Jensen wrote: >On Tue, 6 Jan 2004, Andrew Dunstan wrote: > > > >>>Also, I would like to see some kind of session identifier that is more >>>unique than pid, which wraps around. Ideally we could have 10{pid}, >>>then then the pid wraps around, 20{pid), or something like that. >>> >>> >>This requires some thought. ISTM it wouldn't buy you much unless you >>made it persistent across server restarts, and possibly not even then. >> >> > >And on OpenBSD (though no other platforms that I know of) the PID is a >random number, so there is no "wrapping" to begin with. > > > OK, so a sessionid based on prefix+pid won't work portably. If we *really* want to do it, a cluster-wide sequence generator would probably be the way to go, but I suspect that with the ability to log session termination explicitly (which I have already provided) much of the supposed extra utility would disappear anyway. cheers andrew
Bruce Momjian wrote: >Andrew Dunstan wrote: > > >>Bruce Momjian wrote: >> >> >> >>>Andrew Dunstan wrote: >>> >>> >>> >>> >>>>I thought we had thrashed this out back in August. Certainly the only >>>>thing I recall seeing after I submitted the patch was some stylistic >>>>criticism from Neil, which I addressed in a revised patch. >>>> >>>>Anyway, it is in principle doable. That's partly why I adopted a printf >>>>style format string. There are some wrinkles, though: >>>>. interaction with syslog pid/timestamp logging >>>> >>>> >>>> >>>> >>>Yes. If you use syslog, just don't ask for pid/timestamp and let syslog >>>do it. Of course, right now we are able to send non-pid/timestamp to >>>syslog _and_ send pid/timestamp to a log file, but that seems like a >>>rare operation that doesn't justify keeping the various log parameters >>>separate. >>> >>> >>> >> >>I'm OK with that as long as it is the consensus view. It does seem a >>little odd to remove functionality (however rare) for the sake of >>configuration neatness, though. >> >> > >Yes, agreed. Let's explore it. The rare functionality we would be >removing is for: > > o User uses syslog > o User wants to log postmaster output to a file too > o User wants timestamp info in the postmaster output file > >And the downside is that they get duplicate timestamps in syslog. > >Now, if we don't merge timestamp into your new per-line log string, we >end up with a rather illogical and inflexible output format, only to >allow the rare case listed above. > >Clearly this isn't a 100% clear decision, but it seems to lean in the >direction of removing the functionality with the goal of providing a >more logical logging API to the users. > > > >>>Also, I would like to see some kind of session identifier that is more >>>unique than pid, which wraps around. Ideally we could have 10{pid}, >>>then then the pid wraps around, 20{pid), or something like that. >>> >>> >>> >>This requires some thought. ISTM it wouldn't buy you much unless you >>made it persistent across server restarts, and possibly not even then. I >>don't see it as a reason to hold up these features, though. If someone >>wants to tackle this it could be plugged in to the loginfo feature very >>easily as an extra escape sequence. >> >> > >Yes, that was my idea --- let's get this in and we can then add a >session id, and your point about restarts is a good one. > > > >>>>. making sure the info is available when you need to log it - I had to >>>>rearrange a few thing to avoid getting SEGVs, IIRC. >>>> >>>> >>>> >>>> >>>Of course some messages, like postmaster status messages, don't have >>>some of these fields, like username or host. Is that going to cause >>>problems for tools that read our log files? >>> >>> >>> >>If users want a non-empty info string they will have to teach the tools >>to handle it anyway. The point was, however, that rolling up PID and >>timestamp into the printf-style format will require some significant >>work, because we wouldn't want to lose that info if the user/db weren't >>known, whereas the patch currently suppresses all log-info output if >>these are not present (i.e. when MyProcPort == NULL). >> >> > >Oh, good point. That would suggest that maybe we are better off leaving >pid and timestamp as separate options and _not_ merge them into your new >string. I am starting to think having your string print only >session-specific information is the best way. > >I wonder if we should rename your GUC log_session_line or something like >that. > > > >>>>Also, the session duration logging part of the patch is orthogonal to the >>>>issue. >>>> >>>> >>>You mean query duration? Yes, it is orthoginal. >>> >>> >>No, I meant the logging of the end of a session, including its duration, >>which was also in the patch. >> >> > >Oh, I missed that one. It seems like a nice addition. I see you added >it when they ask for log_connections. Good idea. > I think you are looking at the original patch, not the revised patch, which is here: http://candle.pha.pa.us/mhonarc/patches2/msg00091.html and provides a separate GUC var called "log_session_end" - Neil wanted these not to be combined, IIRC. I am agnostic as to the names of the variables. cheers andrew
Andrew Dunstan wrote: > I wrote: > > > > > > > If nobody is working on this I am prepared to look at it: > > > > . Allow logging of only data definition(DDL), or DDL and modification > > statements > > > > > > Here are some options: > > 1. change the type of "log_statement" option from boolean to string, > with allowed values of "all, mod, ddl, none" with default "none". > 2. same as 1. but make boolean true values synonyms for "all" and > boolean false values synonyms for "none". > 3. keep "log_statement" option as now and add a new option > "log_statement_level" with the same options as 1. but default to "all", > which will have no effect unless "log_statement" is true. I like 1. > Also, I assume "modification statements" means insert/update/delete, or Yes. > are we talking about DDL mods (like "alter table")? Alter is DDL. > Finally, what about functions that have side effects? It would be nice > to be able to detect the statements to be logged at the syntactic level, > but it strikes me that that might not be possible. Not possible. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian wrote: >Andrew Dunstan wrote: > > wow. that was nearly 3 months ago ... >>I wrote: >> >> >> >>>If nobody is working on this I am prepared to look at it: >>> >>>. Allow logging of only data definition(DDL), or DDL and modification >>>statements >>> >>> >>> >> >>Here are some options: >> >>1. change the type of "log_statement" option from boolean to string, >>with allowed values of "all, mod, ddl, none" with default "none". >>2. same as 1. but make boolean true values synonyms for "all" and >>boolean false values synonyms for "none". >>3. keep "log_statement" option as now and add a new option >>"log_statement_level" with the same options as 1. but default to "all", >>which will have no effect unless "log_statement" is true. >> >> > >I like 1. > > > >>Also, I assume "modification statements" means insert/update/delete, or >> >> > >Yes. > > > >>are we talking about DDL mods (like "alter table")? >> >> > >Alter is DDL. > > > >>Finally, what about functions that have side effects? It would be nice >>to be able to detect the statements to be logged at the syntactic level, >>but it strikes me that that might not be possible. >> >> > >Not possible. > > > Subsequent discussion suggested we should add "syntax-errors" to the allowed values (and I would favor making it the default). The problem is this - in order to make the decision about whether or not to log, we need to have parsed the statement (unless the level is set to "all"). My simple approach, which would mean that the entire patch would amount to around 100 lines, maybe, plus docco, would have the (I think) trivial side effect of reversing the order in which a logged statement and the corresponding parse error log entry appeared. You objected to that effect, so I stopped work on it. Now I can think of a couple of different approaches which would not have this effect: a. embed a time machine in postgres so we can make a decision based on information we do not yet have, or b. find some spot where we can trap the parse error log message before it is emitted and then first log the statement. That spot is probably somewhere in src/backend/utils/error/elog.c, but I am not quite sure where. I have rejected as ugly and unmaintainable monkeying with the parser itself to produce the desired effect, and I regret that idea a is beyond my humble abilities :-) cheers andrew
Andrew Dunstan wrote: > Bruce Momjian wrote: > > >Andrew Dunstan wrote: > > > > > > > wow. that was nearly 3 months ago ... Oh, I remember why I kept this email now. I am going to try to code this. > Subsequent discussion suggested we should add "syntax-errors" to the > allowed values (and I would favor making it the default). We already have log_min_error_statement. Seems that is what should be used if someone wants only syntax errors. > The problem is this - in order to make the decision about whether or not > to log, we need to have parsed the statement (unless the level is set > to "all"). My simple approach, which would mean that the entire patch > would amount to around 100 lines, maybe, plus docco, would have the (I > think) trivial side effect of reversing the order in which a logged > statement and the corresponding parse error log entry appeared. You > objected to that effect, so I stopped work on it. > > Now I can think of a couple of different approaches which would not have > this effect: > a. embed a time machine in postgres so we can make a decision based on > information we do not yet have, or > b. find some spot where we can trap the parse error log message before > it is emitted and then first log the statement. That spot is probably > somewhere in src/backend/utils/error/elog.c, but I am not quite sure where. > > I have rejected as ugly and unmaintainable monkeying with the parser > itself to produce the desired effect, and I regret that idea a is beyond > my humble abilities :-) I will start on this now. Thanks. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073