Thread: syslog_line_prefix
Here is a patch to add a GUC parameter "syslog_line_prefix". It adds prefixes to syslog and eventlog. We still have "log_line_prefix", which will be used only for stderr logs. We have a tip in the documentation that log_line_prefix is not required for syslog, but we'd better have independent settings if we set log_destination to 'stderr, syslog'. http://developer.postgresql.org/pgdocs/postgres/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-WHAT | Tip: Syslog produces its own time stamp and process ID | information, so you probably do not want to use those escapes | if you are logging to syslog. Regards, --- ITAGAKI Takahiro NTT Open Source Software Center
On Mon, Sep 14, 2009 at 02:43, Itagaki Takahiro <itagaki.takahiro@oss.ntt.co.jp> wrote: > Here is a patch to add a GUC parameter "syslog_line_prefix". > It adds prefixes to syslog and eventlog. We still have > "log_line_prefix", that will be used only for stderr logs. > > We have a tip that log_line_prefix is not required for syslog > in the documentation, but we'd better to have independent settings > if we set log_destination to 'stderr, syslog'. > > http://developer.postgresql.org/pgdocs/postgres/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-WHAT > | Tip: Syslog produces its own time stamp and process ID > | information, so you probably do not want to use those escapes > | if you are logging to syslog. I'm not sure I like this as a GUC. We're going to end up with a lot of different GUCs, and every time we add a new log destination (admittedly not often, of course), that increases even further. And GUCs really don't provide the level of flexibility you'd really like to have. I've been thinking (long-term) in the direction of a separate config file, since that could contain an arbitrary number of lines, with "rules" on them (somewhat like pg_hba.conf maybe). You'd do the matching on things like error level and destination, and then specify a bunch of flags. Or potentially do it on error level and contents, and filter which destinations get it. Forcing it into the GUC framework seems like a limiting long-term strategy. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Magnus Hagander wrote: > I'm not sure I like this as a GUC. We're going to end up with a lot of > different GUCs, and everytime we add a new log destination (admittedly > not often, of course), that increases even further. And GUCs really > don't provide the level of flexibility you'd really like to have. I've > been thinking (long-term) in the direction of a separate config file, > since that could contain an arbitrary number of lines, with "rules" on > them (somewhat like pg_hba.conf maybe). I tend to agree with this idea, but I'm not sure about rejecting the current patch because of it. FWIW one of the things that this "rules of logging config" system should support is configuring each type of server process differently, for example set min_log_level to debug2 for autovacuum only, etc. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Tue, Sep 15, 2009 at 2:18 PM, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Magnus Hagander wrote: > >> I'm not sure I like this as a GUC. We're going to end up with a lot of >> different GUCs, and everytime we add a new log destination (admittedly >> not often, of course), that increases even further. And GUCs really >> don't provide the level of flexibility you'd really like to have. I've >> been thinking (long-term) in the direction of a separate config file, >> since that could contain an arbitrary number of lines, with "rules" on >> them (somewhat like pg_hba.conf maybe). > > I tend to agree with this idea, but I'm not sure about rejecting the > current patch because of it. I'm picking up this patch to review for this CommitFest. I agree that the idea of this patch is good. It's pretty silly for us to give people advice that they should not log time stamps and pids to syslog, but then provide them no way of actually implementing that behavior when logging to multiple destinations. On the other hand, I don't think this is the right way to do it. The patch proposes the following mapping of logging destinations to GUCs: stderr -> log_line_prefix (same as now) csvlog -> not applicable (same as now) syslog -> syslog_line_prefix eventlog -> syslog_line_prefix That's not exactly mnemonic; I think we'd want {stderr,syslog,eventlog}_log_line_prefix if anything. But that seems like too many GUCs already - for anyone logging to a single destination (which I would think by far the most common case), it's just extra work. So I'm inclined to say that we should reject this patch for now and see what other ideas come down the pipe. ...Robert
Robert Haas escribió: > On the other hand, I don't think this is the right way to do it. The > patch proposes the following mapping of logging destinations to GUCs: > > stderr -> log_line_prefix (same as now) > csvlog -> not applicable (same as now) > syslog -> syslog_line_prefix > eventlog -> syslog_line_prefix > > That's not exactly mnemonic; I think we'd want > {stderr,syslog,eventlog}_log_line_prefix if anything. So let's have a (ugh) fourth GUC that keeps the current name log_line_prefix and is the default value for all the other vars. So today's config would continue to work identically, and people wanting more configurable behavior could get it by simply setting one or more of the new vars. The only problem is what would we do when we implement Magnus' idea. Are we close to that? -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Fri, Sep 25, 2009 at 12:13 PM, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Robert Haas escribió: > >> On the other hand, I don't think this is the right way to do it. The >> patch proposes the following mapping of logging destinations to GUCs: >> >> stderr -> log_line_prefix (same as now) >> csvlog -> not applicable (same as now) >> syslog -> syslog_line_prefix >> eventlog -> syslog_line_prefix >> >> That's not exactly mnemonic; I think we'd want >> {stderr,syslog,eventlog}_log_line_prefix if anything. > > So let's have a (ugh) fourth GUC that keeps the current name > log_line_prefix and is the default value for all the other vars. > So today's config would continue to work identically, and people wanting > more configurable behavior could get it by simply setting one or more of > the new vars. That might be workable, if there's a reasonable way to make the default for one GUC depend on the value of another GUC. But that doesn't make it a good idea. > The only problem is what would we do when we implement Magnus' idea. > Are we close to that? Unless there's some code out there that hasn't been posted, I don't think so. I don't think we even have a complete design, which would be a good thing to have in trying to compare this proposal vs. that one. ...Robert
On Mon, 2009-09-14 at 09:43 +0900, Itagaki Takahiro wrote: > Here is a patch to add a GUC parameter "syslog_line_prefix". > It adds prefixes to syslog and eventlog. We still have > "log_line_prefix", that will be used only for stderr logs. > > We have a tip that log_line_prefix is not required for syslog > in the documentation, but we'd better to have independent setttings > if we set log_destination to 'stderr, syslog'. IMO we should just make log_line_prefix work with syslog/eventlog too. > > http://developer.postgresql.org/pgdocs/postgres/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-WHAT > | Tip: Syslog produces its own time stamp and process ID > | information, so you probably do not want to use those escapes > | if you are logging to syslog. > > Regards, > --- > ITAGAKI Takahiro > NTT Open Source Software Center -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564 Consulting, Training, Support, Custom Development, Engineering If the world pushes look it in the eye and GRR. Then push back harder. - Salamander
"Joshua D. Drake" <jd@commandprompt.com> writes: > On Mon, 2009-09-14 at 09:43 +0900, Itagaki Takahiro wrote: >> We have a tip that log_line_prefix is not required for syslog >> in the documentation, but we'd better to have independent setttings >> if we set log_destination to 'stderr, syslog'. > IMO we should just make log_line_prefix work with syslog/eventlog too. It *does* work with syslog. You missed the point, which is that because syslog sticks on timestamp and PID information of its own accord, you'd typically want a different prefix setting for syslog than for stderr. However, I don't think I actually believe the premise of this patch, which is that sending log information to both stderr and syslog is a useful thing to do --- so useful that it's worth greatly complicating the elog stuff to support it a trifle better. Given the amount of whining we hear about the overhead of logging, who is going to want duplicate output? And especially, who is going to want elog.c to do twice as much work to format the log output differently for the two destinations? regards, tom lane
On Fri, Sep 25, 2009 at 21:19, Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Joshua D. Drake" <jd@commandprompt.com> writes: >> On Mon, 2009-09-14 at 09:43 +0900, Itagaki Takahiro wrote: >>> We have a tip that log_line_prefix is not required for syslog >>> in the documentation, but we'd better to have independent settings >>> if we set log_destination to 'stderr, syslog'. > >> IMO we should just make log_line_prefix work with syslog/eventlog too. > > It *does* work with syslog. You missed the point, which is that because > syslog sticks on timestamp and PID information of its own accord, you'd > typically want a different prefix setting for syslog than for stderr. > > However, I don't think I actually believe the premise of this patch, > which is that sending log information to both stderr and syslog is > a useful thing to do --- so useful that it's worth greatly complicating > the elog stuff to support it a trifle better. Given the amount of > whining we hear about the overhead of logging, who is going to want > duplicate output? And especially, who is going to want elog.c to do > twice as much work to format the log output differently for the two > destinations? I am :-) I definitely want both text and CSV output - which I can't have today. I would even more like to have some things sent to CSV and some things sent to text. Other than if you're logging all your queries (or over <n> time, where <n> is very small), I've never seen a system with performance issues from logging. I'm sure others may have, but not me. Is there really any log output other than the query-logging-for-performance-analysis that is likely to cause any real load on the system? If not, perhaps we need to break out that part to a separate codepath instead, and optimize that one for speed, while optimizing the other paths for flexibility? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Magnus Hagander wrote: > I definitely want both text and CSV output - which I can't have today. > > Sure you can. What makes you think you can't? cheers andrew
On Fri, Sep 25, 2009 at 4:06 PM, Magnus Hagander <magnus@hagander.net> wrote: > On Fri, Sep 25, 2009 at 21:19, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> "Joshua D. Drake" <jd@commandprompt.com> writes: >>> On Mon, 2009-09-14 at 09:43 +0900, Itagaki Takahiro wrote: >>>> We have a tip that log_line_prefix is not required for syslog >>>> in the documentation, but we'd better to have independent settings >>>> if we set log_destination to 'stderr, syslog'. >> >>> IMO we should just make log_line_prefix work with syslog/eventlog too. >> >> It *does* work with syslog. You missed the point, which is that because >> syslog sticks on timestamp and PID information of its own accord, you'd >> typically want a different prefix setting for syslog than for stderr. >> >> However, I don't think I actually believe the premise of this patch, >> which is that sending log information to both stderr and syslog is >> a useful thing to do --- so useful that it's worth greatly complicating >> the elog stuff to support it a trifle better. Given the amount of >> whining we hear about the overhead of logging, who is going to want >> duplicate output? And especially, who is going to want elog.c to do >> twice as much work to format the log output differently for the two >> destinations? > > I am :-) > > I definitely want both text and CSV output - which I can't have today. Huh? I thought I had that exact configuration working yesterday. > I would even more like to have some things send to CSV and some things > sent to text. This patch won't help, then. > Other than if you're logging all your queries (or over <n> time, where > <n> is very small), I've never seen a system with performance issues > from logging. I'm sure others may have, but not me. > > Is there really any log output other than the > query-logging-for-performance-analysis that is likely to cause any > real load on the system? If not, perhaps we need to break out that > part to a separate codepath instead, and optimize that one for speed, > while optimizing the other paths for flexibility? Not sure, but I doubt it's that easy. ...Robert
On Fri, Sep 25, 2009 at 22:17, Andrew Dunstan <andrew@dunslane.net> wrote: > > > Magnus Hagander wrote: >> >> I definitely want both text and CSV output - which I can't have today. >> >> > > Sure you can. What makes you think you can't? How do I do that? When I enable csv logging, it changes the log format to csv, and my plaintext logs don't end up in the logs anymore. Note that I'm not talking about syslog, I'm talking about the logging that goes through the logging collector, and is dealt with by PostgreSQL. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Fri, Sep 25, 2009 at 22:18, Robert Haas <robertmhaas@gmail.com> wrote: > On Fri, Sep 25, 2009 at 4:06 PM, Magnus Hagander <magnus@hagander.net> wrote: >> Other than if you're logging all your queries (or over <n> time, where >> <n> is very small), I've never seen a system with performance issues >> from logging. I'm sure others may have, but not me. >> >> Is there really any log output other than the >> query-logging-for-performance-analysis that is likely to cause any >> real load on the system? If not, perhaps we need to break out that >> part to a separate codepath instead, and optimize that one for speed, >> while optimizing the other paths for flexibility? > > Not sure, but I doubt it's that easy. If we are talking about the "log query duration" or "log queries longer than <n>" that's a single location in the code. It can't be that hard... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Magnus Hagander wrote: > On Fri, Sep 25, 2009 at 22:17, Andrew Dunstan <andrew@dunslane.net> wrote: > >> Magnus Hagander wrote: >> >>> I definitely want both text and CSV output - which I can't have today. >>> >>> >>> >> Sure you can. What makes you think you can't? >> > > How do i do that? When I enable csv logging, it changes the log format > to csv, and my plaintext logs don't end up in the logs anymore. > > Note that I'm not talking about syslog, I'm talking about the logging > that goes through the logging collector, and is dealt with by > PostgreSQL. > > log_destination = 'stderr, csvlog' IIRC cheers andrew
On Fri, Sep 25, 2009 at 03:19:36PM -0400, Tom Lane wrote: > However, I don't think I actually believe the premise of this patch, > which is that sending log information to both stderr and syslog is > a useful thing to do Actually the thing I want is to be able to send some stuff to syslog, and some to a file, and other stuff to another file. This patch doesn't do all that, but lays the necessary groundwork. For instance, the various applications that parse query output all require specific log line formats and various other configurations to be able to understand our logs, and even then still have problems dealing with multi-line entries. This is a pain. Such applications should be able to have their own machine-readable logs, like csvlog. Unfortunately csvlogs are almost entirely unreadable by humans, so I'd also like to have a human readable log somewhere. These two logs need not necessarily contain the same information. Loads of people seem to want to be able to have separate per-database log files, which something like this could also allow. -- Joshua Tolley / eggyknap End Point Corporation http://www.endpoint.com
On Fri, Sep 25, 2009 at 4:33 PM, Magnus Hagander <magnus@hagander.net> wrote: > On Fri, Sep 25, 2009 at 22:18, Robert Haas <robertmhaas@gmail.com> wrote: >> On Fri, Sep 25, 2009 at 4:06 PM, Magnus Hagander <magnus@hagander.net> wrote: >>> Other than if you're logging all your queries (or over <n> time, where >>> <n> is very small), I've never seen a system with performance issues >>> from logging. I'm sure others may have, but not me. >>> >>> Is there really any log output other than the >>> query-logging-for-performance-analysis that is likely to cause any >>> real load on the system? If not, perhaps we need to break out that >>> part to a separate codepath instead, and optimize that one for speed, >>> while optimizing the other paths for flexibility? >> >> Not sure, but I doubt it's that easy. > > If we are talking about the "log query duration" or "log queries > longer than <n>" that's a single location in the code. It can't be > that hard... I'm not sure that's really the only thing that can cause a logging bottleneck. I would be kinda surprised. ...Robert
On Fri, Sep 25, 2009 at 4:58 PM, Joshua Tolley <eggyknap@gmail.com> wrote: > On Fri, Sep 25, 2009 at 03:19:36PM -0400, Tom Lane wrote: >> However, I don't think I actually believe the premise of this patch, >> which is that sending log information to both stderr and syslog is >> a useful thing to do > > Actually the thing I want is to be able to send some stuff to syslog, and some > to a file, and other stuff to another file. This patch doesn't do all that, > but lays the necessary groundwork. I don't think it does anything of the sort. Getting to that point by adding GUCs is quickly going to produce obviously unacceptable numbers of GUCs. Or if it isn't, then I'd like to hear the whole design laid out. I think Magnus's idea of a separate config file is much more likely to be successful than what we have here, but that too will require some design that hasn't been done yet. ...Robert
On Fri, Sep 25, 2009 at 22:57, Andrew Dunstan <andrew@dunslane.net> wrote: > > Magnus Hagander wrote: >> >> On Fri, Sep 25, 2009 at 22:17, Andrew Dunstan <andrew@dunslane.net> wrote: >> >>> >>> Magnus Hagander wrote: >>> >>>> >>>> I definitely want both text and CSV output - which I can't have today. >>>> >>> >>> Sure you can. What makes you think you can't? >>> >> >> How do I do that? When I enable csv logging, it changes the log format >> to csv, and my plaintext logs don't end up in the logs anymore. >> >> Note that I'm not talking about syslog, I'm talking about the logging >> that goes through the logging collector, and is dealt with by >> PostgreSQL. >> >> > > log_destination = 'stderr, csvlog' Clearly that works. I wonder why that didn't work when I last tried it :S /me wipes the egg off. (it's still weird that it's called stderr when it's a logfile, but that's a different story) Without looking deeply at the code, does it also properly route these famous "messages from third party libraries" to both files? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Fri, Sep 25, 2009 at 04:18:08PM -0400, Robert Haas wrote: > > I would even more like to have some things send to CSV and some things > > sent to text. > > This patch won't help, then. No, it won't, but as said before, it lays the groundwork, namely letting the syslogger know things about the log messages it gets (rather than just having an opaque string), and route messages various places, accordingly. -- Joshua Tolley / eggyknap End Point Corporation http://www.endpoint.com
On Fri, Sep 25, 2009 at 5:01 PM, Magnus Hagander <magnus@hagander.net> wrote: > On Fri, Sep 25, 2009 at 22:57, Andrew Dunstan <andrew@dunslane.net> wrote: >> >> Magnus Hagander wrote: >>> >>> On Fri, Sep 25, 2009 at 22:17, Andrew Dunstan <andrew@dunslane.net> wrote: >>> >>>> >>>> Magnus Hagander wrote: >>>> >>>>> >>>>> I definitely want both text and CSV output - which I can't have today. >>>>> >>>> >>>> Sure you can. What makes you think you can't? >>>> >>> >>> How do I do that? When I enable csv logging, it changes the log format >>> to csv, and my plaintext logs don't end up in the logs anymore. >>> >>> Note that I'm not talking about syslog, I'm talking about the logging >>> that goes through the logging collector, and is dealt with by >>> PostgreSQL. >>> >>> >> >> log_destination = 'stderr, csvlog' > > Clearly that works. I wonder why that didn't work when I last tried it :S > > /me wipes the egg off. > > (it's still weird that it's called stderr when it's a logfile, but > that's a different story) > > > Without looking deeply at the code, does it also properly route these > famous "messages from third party libraries" to both files? Rather than looking at the code, I'd suggest testing it. But I bet it does. The whole point of redirecting stderr specifically (rather than logging to some other random fd we keep around) is to catch those random messages, and it would be pretty silly to catch them and then not bother processing them properly. For whatever it's worth, I think your (as perceived by me) scorn regarding this issue is off base. I have this problem all the time, and not just in C, but also in Perl, shell scripts, etc. I can't tell you how many times I've tried to write code to ensure that ALL error messages get logged to some database, file, sent as an email message, etc. and inevitably something happens that my clever plan fails to catch, and the darn thing fails without alerting me. You have to have a collector process to make failure detection robust, and it has to capture stderr. Period, full stop. ...Robert
On Fri, Sep 25, 2009 at 05:04:45PM -0400, Robert Haas wrote: > On Fri, Sep 25, 2009 at 4:58 PM, Joshua Tolley <eggyknap@gmail.com> wrote: > > On Fri, Sep 25, 2009 at 03:19:36PM -0400, Tom Lane wrote: > >> However, I don't think I actually believe the premise of this patch, > >> which is that sending log information to both stderr and syslog is > >> a useful thing to do > > > > Actually the thing I want is to be able to send some stuff to syslog, and some > > to a file, and other stuff to another file. This patch doesn't do all that, > > but lays the necessary groundwork. > > I don't think it does anything of the sort. Getting to that point by > adding GUCs is quickly going to produce obviously unacceptable numbers > of GUCs. Or if it isn't, then I'd like to hear the whole designed > laid out. I think Magnus's idea of a separate config file is much > more likely to be succesful than what we have here, but that too will > require some design that hasn't been done yet. This doesn't approach the issue of how precisely you'd configure a more complex logging scheme, because clearly that will be complex. The whole purpose here is to let the syslogger know stuff about the log messages it gets, so it can act on them intelligently. -- Joshua Tolley / eggyknap End Point Corporation http://www.endpoint.com
On Fri, Sep 25, 2009 at 5:27 PM, Joshua Tolley <eggyknap@gmail.com> wrote: > On Fri, Sep 25, 2009 at 04:18:08PM -0400, Robert Haas wrote: >> > I would even more like to have some things send to CSV and some things >> > sent to text. >> >> This patch won't help, then. > > No, it won't, but as said before, it lays the groundwork, namely letting the > syslogger know things about the log messages it gets (rather than just having > an opaque string), and route messages various places, accordingly. Unless I'm missing something, all this patch does is add a GUC called syslog_line_prefix. Maybe you mean the logging patch? ...Robert
On Fri, Sep 25, 2009 at 05:04:45PM -0400, Robert Haas wrote: > On Fri, Sep 25, 2009 at 4:58 PM, Joshua Tolley <eggyknap@gmail.com> wrote: > > On Fri, Sep 25, 2009 at 03:19:36PM -0400, Tom Lane wrote: > >> However, I don't think I actually believe the premise of this patch, > >> which is that sending log information to both stderr and syslog is > >> a useful thing to do > > > > Actually the thing I want is to be able to send some stuff to syslog, and some > > to a file, and other stuff to another file. This patch doesn't do all that, > > but lays the necessary groundwork. > > I don't think it does anything of the sort. Getting to that point by > adding GUCs is quickly going to produce obviously unacceptable numbers > of GUCs. Or if it isn't, then I'd like to hear the whole designed > laid out. I think Magnus's idea of a separate config file is much > more likely to be succesful than what we have here, but that too will > require some design that hasn't been done yet. > > ...Robert Having just sent two messages to the discussion about the wrong patch, I'll apologize, and shut up now :) -- Joshua Tolley / eggyknap End Point Corporation http://www.endpoint.com
Joshua Tolley <eggyknap@gmail.com> writes: > Having just sent two messages to the discussion about the wrong patch, I'll > apologize, and shut up now :) No need to apologize --- this really is, and should be, all one conversation. I think the main problem I've got with applying either patch is that I don't believe we have consensus on the direction the logging code should go. Without that, it's a bad idea to accept incremental patches, even if they're arguably harmless by themselves. regards, tom lane
Tom Lane wrote: > Joshua Tolley <eggyknap@gmail.com> writes: > >> Having just sent two messages to the discussion about the wrong patch, I'll >> apologize, and shut up now :) >> > > No need to apologize --- this really is, and should be, all one > conversation. I think the main problem I've got with applying either > patch is that I don't believe we have consensus on the direction the > logging code should go. Without that, it's a bad idea to accept > incremental patches, even if they're arguably harmless by themselves. > Agreed. The discussion does have an element of /déjà vu,/ too. The whole idea behind log_line_prefix was to allow people to make easier and better log splitting decisions after the fact. Like you I'm wary of adding too much extra processing into the elog code. cheers andrew
On Fri, Sep 25, 2009 at 6:12 PM, Andrew Dunstan <andrew@dunslane.net> wrote: > Tom Lane wrote: >> Joshua Tolley <eggyknap@gmail.com> writes: >>> Having just sent two messages to the discussion about the wrong patch, >>> I'll >>> apologize, and shut up now :) >> >> No need to apologize --- this really is, and should be, all one >> conversation. I think the main problem I've got with applying either >> patch is that I don't believe we have consensus on the direction the >> logging code should go. Without that, it's a bad idea to accept >> incremental patches, even if they're arguably harmless by themselves. > > Agreed. The discussion does have an element of /déjà vu,/ too. The whole > idea behind log_line_prefix was to allow people to make easier and better > log splitting decisions after the fact. > > Like you I'm wary of adding too much extra processing into the elog code. I think we have consensus that this patch isn't clearly moving us in the right direction, and might be moving us in the wrong direction, so I am going to mark it as Rejected. I also agree with Tom's comments that we don't have consensus on where this should go. I think it would help a lot if someone put together a design document (perhaps on the wiki) and tried to enumerate at a high level the logging requirements that aren't being satisfied by the current system. Then we could have a conversation about the right way to address them. By writing the code first, I think we're putting the cart before the horse. ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > I also agree with Tom's comments that we don't have consensus on where > this should go. I think it would help a lot if someone put together a > design document (perhaps on the wiki) and tried to enumerate at a high > level the logging requirements that aren't being satisfied by the > current system. Then we could have a conversation about the right way > to address them. By writing the code first, I think we're putting the > cart before the horse. +1 ... that seems like a much more sensible way to proceed than submitting patches without prior discussion. regards, tom lane
On Fri, 2009-09-25 at 14:58 -0600, Joshua Tolley wrote: > Actually the thing I want is to be able to send some stuff to syslog, > and some > to a file, and other stuff to another file. This patch doesn't do all > that, > but lays the necessary groundwork. Then why not send everything to syslog and have syslog filter it to the places you want to? That is what syslog is for, after all.
Peter Eisentraut <peter_e@gmx.net> writes: > On Fri, 2009-09-25 at 14:58 -0600, Joshua Tolley wrote: >> Actually the thing I want is to be able to send some stuff to syslog, >> and some to a file, and other stuff to another file. This patch >> doesn't do all that, but lays the necessary groundwork. > Then why not send everything to syslog and have syslog filter it to the > places you want to? That is what syslog is for, after all. We send all syslog output with the same identifier/priority/facility, so there's not a lot of hope of getting syslog to do any useful filtering (at least not with the versions of syslog I'm familiar with). Possibly it'd be worth trying to make that more configurable, but that would require a lot of the same infrastructure and complexity we're arguing about now. And it'd still be no help to Windows users. regards, tom lane
On Sun, 2009-09-27 at 16:15 -0400, Tom Lane wrote: > Peter Eisentraut <peter_e@gmx.net> writes: > > Then why not send everything to syslog and have syslog filter it to the > > places you want to? That is what syslog is for, after all. > > We send all syslog output with the same identifier/priority/facility, > so there's not a lot of hope of getting syslog to do any useful > filtering (at least not with the versions of syslog I'm familiar with). Time to upgrade then. ;-) For example, the default syslog in Fedora has been rsyslog since Fedora 8, and that one can do a lot more than just filter by identifier/priority/facility. syslog-ng is another popular example for a more featureful syslog.
On Sun, Sep 27, 2009 at 4:54 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > On Sun, 2009-09-27 at 16:15 -0400, Tom Lane wrote: >> Peter Eisentraut <peter_e@gmx.net> writes: >> > Then why not send everything to syslog and have syslog filter it to the >> > places you want to? That is what syslog is for, after all. >> >> We send all syslog output with the same identifier/priority/facility, >> so there's not a lot of hope of getting syslog to do any useful >> filtering (at least not with the versions of syslog I'm familiar with). > > Time to upgrade then. ;-) For example, the default syslog in Fedora has > been rsyslog since Fedora 8, and that one can do a lot more than just > filter by identifier/priority/facility. syslog-ng is another popular > example for a more featureful syslog. Presumably csvlog would be good for these sorts of things too, no? The whole point is it's machine-readable. ...Robert
On Sun, Sep 27, 2009 at 23:03, Robert Haas <robertmhaas@gmail.com> wrote: > On Sun, Sep 27, 2009 at 4:54 PM, Peter Eisentraut <peter_e@gmx.net> wrote: >> On Sun, 2009-09-27 at 16:15 -0400, Tom Lane wrote: >>> Peter Eisentraut <peter_e@gmx.net> writes: >>> > Then why not send everything to syslog and have syslog filter it to the >>> > places you want to? That is what syslog is for, after all. >>> >>> We send all syslog output with the same identifier/priority/facility, >>> so there's not a lot of hope of getting syslog to do any useful >>> filtering (at least not with the versions of syslog I'm familiar with). >> >> Time to upgrade then. ;-) For example, the default syslog in Fedora has >> been rsyslog since Fedora 8, and that one can do a lot more than just >> filter by identifier/priority/facility. syslog-ng is another popular >> example for a more featureful syslog. > > Presumably csvlog would be good for these sorts of things too, no? > The whole point is it's machine-readable. If there was a way to pipe the csv log through an external program, that would take care of much of the problem. And I guess if you make that program responsible for *everything* it would work - you just set your logging level to log a lot of data, and let the external process deal with it. If we implemented the ability to have a different logging level for different destinations you could keep text logging for other things, or you could just delegate all that to the external process as well. That would basically turn the syslogger into a process that reads from the input and sends the data out to an external process. But it could then implement things like automatic restart of the external process in case of crash etc, in perhaps a much easier way than the postmaster can do for the syslogger itself. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Mon, Sep 28, 2009 at 5:22 AM, Magnus Hagander <magnus@hagander.net> wrote: > On Sun, Sep 27, 2009 at 23:03, Robert Haas <robertmhaas@gmail.com> wrote: >> On Sun, Sep 27, 2009 at 4:54 PM, Peter Eisentraut <peter_e@gmx.net> wrote: >>> On Sun, 2009-09-27 at 16:15 -0400, Tom Lane wrote: >>>> Peter Eisentraut <peter_e@gmx.net> writes: >>>> > Then why not send everything to syslog and have syslog filter it to the >>>> > places you want to? That is what syslog is for, after all. >>>> >>>> We send all syslog output with the same identifier/priority/facility, >>>> so there's not a lot of hope of getting syslog to do any useful >>>> filtering (at least not with the versions of syslog I'm familiar with). >>> >>> Time to upgrade then. ;-) For example, the default syslog in Fedora has >>> been rsyslog since Fedora 8, and that one can do a lot more than just >>> filter by identifier/priority/facility. syslog-ng is another popular >>> example for a more featureful syslog. >> >> Presumably csvlog would be good for these sorts of things too, no? >> The whole point is it's machine-readable. > > If there was a way to pipe the csv log through an external program, > that would take care of much of the problem. tail -f is probably a bit too fragile for this purpose, but I think it would be possible to design a utility that would do this. The idea would be to maintain a state file that would list the most recent CSV log file read and the byte offset of the first byte following the last line processed. On every iteration, we just open up the last file read and read beginning at the designated offset through end of file. Then we check if a newer file is available and, if so, we begin reading that file. When we're done reading, we update the state file.
There is the problem of what happens if we read a partial last line of a file being written, but that seems surmountable: if the last line read does not end in a newline, and no newer file is present, then don't include that partial line in the output, and record the offset of the beginning of that line in the state file. I'm not sure if this will work on Windows, but it should be OK on anything UNIX-ish. > And I guess if you make that program responsible for *everything* it > would work - you just set your logging level to log very much data, > and let the external process deal with it. If we implemented the > ability to have a different logging level for different destinations > you could keep text logging for other things, or you could just > delegate all that to the external process as well. That would > basically turn the syslogger into a process that reads from the input > and sends the data out to an external process. But it could then > implement things like automatic restart of the external process in > case of crash etc, in perhaps a much easier way than the postmaster > can do for the syslogger itself. The problem with having the syslogger send the data directly to an external process is that the external process might be unable to process the data as fast as syslogger is sending it. I'm not sure exactly what will happen in that case, but it will definitely be bad. I think what will likely happen is that the entire database cluster will end up waiting on write(2) calls to various places and processing will grind to a halt. I think it's better to spool the log messages to files, and then let the external utility read the files. The external utility can still fall behind, but even if it does the cluster will continue running. ...Robert
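[Editor's sketch] The state-file reader Robert describes above might look roughly like the following. This is a minimal sketch, not anything proposed as a patch in the thread: the function name `read_new_lines`, the JSON state format, and the assumption that log file names end in `.csv` and sort chronologically are all hypothetical. It also treats every newline as a record boundary, which real csvlog output does not guarantee, since quoted fields can contain newlines.

```python
import json
import os


def read_new_lines(log_dir, state_path):
    """Return complete lines appended since the last call, then save progress."""
    # The state file records the last file read and the offset just past
    # the last complete line processed in that file.
    try:
        with open(state_path) as f:
            state = json.load(f)
    except FileNotFoundError:
        state = {"file": None, "offset": 0}

    # Assumption: file names sort in creation order (e.g. timestamped names).
    files = sorted(n for n in os.listdir(log_dir) if n.endswith(".csv"))
    if state["file"] is not None:
        files = [n for n in files if n >= state["file"]]

    lines = []
    for i, name in enumerate(files):
        offset = state["offset"] if name == state["file"] else 0
        with open(os.path.join(log_dir, name), "rb") as f:
            f.seek(offset)
            data = f.read()
        if i == len(files) - 1:
            # No newer file exists, so this one may still be growing:
            # hold back a trailing partial line until it is finished.
            data = data[: data.rfind(b"\n") + 1]
        lines.extend(data.splitlines())
        state = {"file": name, "offset": offset + len(data)}

    # Replace the state file atomically so a crash cannot leave it half-written.
    tmp = state_path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, state_path)
    return lines
```

Calling `read_new_lines` repeatedly (for example from a polling loop) yields each complete line once; a partial line at the end of the newest file is returned only after its newline arrives or a newer file appears.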
2009/9/28 Robert Haas <robertmhaas@gmail.com>: > On Mon, Sep 28, 2009 at 5:22 AM, Magnus Hagander <magnus@hagander.net> wrote: >> On Sun, Sep 27, 2009 at 23:03, Robert Haas <robertmhaas@gmail.com> wrote: >>> On Sun, Sep 27, 2009 at 4:54 PM, Peter Eisentraut <peter_e@gmx.net> wrote: >>>> On Sun, 2009-09-27 at 16:15 -0400, Tom Lane wrote: >>>>> Peter Eisentraut <peter_e@gmx.net> writes: >>>>> > Then why not send everything to syslog and have syslog filter it to the >>>>> > places you want to? That is what syslog is for, after all. >>>>> >>>>> We send all syslog output with the same identifier/priority/facility, >>>>> so there's not a lot of hope of getting syslog to do any useful >>>>> filtering (at least not with the versions of syslog I'm familiar with). >>>> >>>> Time to upgrade then. ;-) For example, the default syslog in Fedora has >>>> been rsyslog since Fedora 8, and that one can do a lot more than just >>>> filter by identifier/priority/facility. syslog-ng is another popular >>>> example for a more featureful syslog. >>> >>> Presumably csvlog would be good for these sorts of things too, no? >>> The whole point is it's machine-readable. >> >> If there was a way to pipe the csv log through an external program, >> that would take care of much of the problem. > > tail -f is probably a bit too fragile for this purpose, but I think it > would be possible to design a utility that would do this. The idea > would be to maintain a state file that would list the most recent CSV > log file read and the byte offset of the first byte following the last > line processed. On every iteration, we just open up the last file > read and read beginning at the designated offset through end of file. > Then we check if a newer file is available and, if so, we begin > reading that file. When we're done reading, we update the state file.
That would mean we have to write everything to the file, though, so it would be rather bad for the case where you want to log "just a little" but are "delegating" the decision to the external process. And it would create double the I/O on disk for the logfile (once to the csv log, once processed by the external process). > There is the problem of what happens if we read a partial last line of > a file being written, but that seems surmountable: if the last line > read does not end in a newline, and no newer file is present, then > don't include that partial line in the output, and record the offset > of the beginning of that line in the state file. > > I'm not sure if this will work on Windows, but it should be OK on > anything UNIX-ish. Well, we'd have to deal with sharing violations and the like, but we just need to make sure that the syslogger opens the file with the proper sharing flags, which I think it already does, actually. >> And I guess if you make that program responsible for *everything* it >> would work - you just set your logging level to log very much data, >> and let the external process deal with it. If we implemented the >> ability to have a different logging level for different destinations >> you could keep text logging for other things, or you could just >> delegate all that to the external process as well. That would >> basically turn the syslogger into a process that reads from the input >> and sends the data out to an external process. But it could then >> implement things like automatic restart of the external process in >> case of crash etc, in perhaps a much easier way than the postmaster >> can do for the syslogger itself. > > The problem with having the syslogger send the data directly to an > external process is that the external process might be unable to > process the data as fast as syslogger is sending it. I'm not sure > exactly what will happen in that case, but it will definitely be bad.
> I think what will likely happen is that the entire database cluster > will end up waiting on write(2) calls to various places and processing > will grind to a halt. We'll have the same issue if we have the syslogger write it to disk, won't we? In fact, sending it to the external process might even be faster, depending on how much processing is done and what can be thrown away at that step, since it could decrease the disk I/O needed in favor of CPU work. > I think it's better to spool the log messages to files, and then let > the external utility read the files. The external utility can still > fall behind, but even if it does the cluster will continue running. The difficulty there is to make it "live enough". But I guess if it implements the same method as tail -f, it would do that - the only issue then would be the fact that this would require much more I/O on disk. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Mon, Sep 28, 2009 at 6:51 AM, Magnus Hagander <magnus@hagander.net> wrote: >> I think it's better to spool the log messages to files, and then let >> the external utility read the files. The external utility can still >> fall behind, but even if it does the cluster will continue running. > > The difficulty there is to make it "live enough". But I guess if it > implements the same method as tail -f, it would do that - the only > issue then would be the fact that this would require much more I/O on > disk. The I/O issue is a tricky one. If that's an issue, then maybe a pipe or socket is a better fit. But if the pipe fills up, then the logging collector needs to begin spooling the messages to disk so that the whole system doesn't pile up on the external log analyzer. Figuring out the right design here is a bit tricky. ...Robert
[ please trim the quoted material a bit, folks ] Magnus Hagander <magnus@hagander.net> writes: > 2009/9/28 Robert Haas <robertmhaas@gmail.com>: >> The problem with having the syslogger send the data directly to an >> external process is that the external process might be unable to >> process the data as fast as syslogger is sending it. I'm not sure >> exactly what will happen in that case, but it will definitely be bad. This is the same issue already raised with respect to syslog versus syslogger, ie, some people would rather lose log data than have the backends block waiting for it to be written. > That would mean we have to write everything to the file, though, so it > would be rather bad for the case where you want to log "just a little" > but are "delegating" the decision to the external process. And it > would create double the I/O on disk for the logfile (once to the csv > log, once processed by the external process). Robert's design could be made to work without that, if you dump the original log output into a ramdisk and let the external process write whatever it chooses to real disk. If you have a system crash you lose any as-yet-unprocessed log output, but hopefully there usually wouldn't be much. regards, tom lane
Tom Lane escribió: > [ please trim the quoted material a bit, folks ] > > Magnus Hagander <magnus@hagander.net> writes: > > 2009/9/28 Robert Haas <robertmhaas@gmail.com>: > >> The problem with having the syslogger send the data directly to an > >> external process is that the external process might be unable to > >> process the data as fast as syslogger is sending it. I'm not sure > >> exactly what will happen in that case, but it will definitely be bad. > > This is the same issue already raised with respect to syslog versus > syslogger, ie, some people would rather lose log data than have the > backends block waiting for it to be written. That could be made configurable; i.e. let the user choose whether to lose messages or to make everybody wait. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Mon, Sep 28, 2009 at 1:07 PM, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Tom Lane escribió: >> [ please trim the quoted material a bit, folks ] >> >> Magnus Hagander <magnus@hagander.net> writes: >> > 2009/9/28 Robert Haas <robertmhaas@gmail.com>: >> >> The problem with having the syslogger send the data directly to an >> >> external process is that the external process might be unable to >> >> process the data as fast as syslogger is sending it. I'm not sure >> >> exactly what will happen in that case, but it will definitely be bad. >> >> This is the same issue already raised with respect to syslog versus >> syslogger, ie, some people would rather lose log data than have the >> backends block waiting for it to be written. > > That could be made configurable; i.e. let the user choose whether to > lose messages or to make everybody wait. I think the behavior I was proposing was neither "drop" nor "wait", but "buffer". Not sure how people feel about that. ...Robert
Alvaro Herrera <alvherre@commandprompt.com> writes: > Tom Lane escribió: >> This is the same issue already raised with respect to syslog versus >> syslogger, ie, some people would rather lose log data than have the >> backends block waiting for it to be written. > That could be made configurable; i.e. let the user choose whether to > lose messages or to make everybody wait. Hmm, I guess I missed where you proposed an implementation that would support that? regards, tom lane
Robert Haas escribió: > On Mon, Sep 28, 2009 at 1:07 PM, Alvaro Herrera > <alvherre@commandprompt.com> wrote: > > Tom Lane escribió: > >> [ please trim the quoted material a bit, folks ] > >> > >> Magnus Hagander <magnus@hagander.net> writes: > >> > 2009/9/28 Robert Haas <robertmhaas@gmail.com>: > >> >> The problem with having the syslogger send the data directly to an > >> >> external process is that the external process might be unable to > >> >> process the data as fast as syslogger is sending it. I'm not sure > >> >> exactly what will happen in that case, but it will definitely be bad. > >> > >> This is the same issue already raised with respect to syslog versus > >> syslogger, ie, some people would rather lose log data than have the > >> backends block waiting for it to be written. > > > > That could be made configurable; i.e. let the user choose whether to > > lose messages or to make everybody wait. > > I think the behavior I was proposing was neither "drop" nor "wait", > but "buffer". Not sure how people feel about that. Given an arbitrary increase in log rate during an arbitrary length of time, any buffer you keep will be filled. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Mon, Sep 28, 2009 at 2:13 PM, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Robert Haas escribió: >> On Mon, Sep 28, 2009 at 1:07 PM, Alvaro Herrera >> <alvherre@commandprompt.com> wrote: >> > Tom Lane escribió: >> >> [ please trim the quoted material a bit, folks ] >> >> >> >> Magnus Hagander <magnus@hagander.net> writes: >> >> > 2009/9/28 Robert Haas <robertmhaas@gmail.com>: >> >> >> The problem with having the syslogger send the data directly to an >> >> >> external process is that the external process might be unable to >> >> >> process the data as fast as syslogger is sending it. I'm not sure >> >> >> exactly what will happen in that case, but it will definitely be bad. >> >> >> >> This is the same issue already raised with respect to syslog versus >> >> syslogger, ie, some people would rather lose log data than have the >> >> backends block waiting for it to be written. >> > >> > That could be made configurable; i.e. let the user choose whether to >> > lose messages or to make everybody wait. >> >> I think the behavior I was proposing was neither "drop" nor "wait", >> but "buffer". Not sure how people feel about that. > > Given an arbitrary increase in log rate during an arbitrary length of > time, any buffer you keep will be filled. True. But the activity might be bursty. ...Robert
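[Editor's sketch] The exchange above (any buffer fills under a sustained overload, but can absorb bursts) can be illustrated with a little arithmetic. The function below is a toy model under assumed numbers, not a measurement of anything: `simulate`, its parameters, and the example rates are all hypothetical.

```python
def simulate(arrivals, drain_per_tick, capacity):
    """Return how many messages did not fit in a bounded buffer.

    arrivals       -- messages produced in each time tick
    drain_per_tick -- messages the consumer can remove per tick
    capacity       -- maximum number of messages the buffer holds
    """
    queued = 0
    overflow = 0
    for produced in arrivals:
        queued += produced
        if queued > capacity:
            # These messages must be dropped, or the producers must wait.
            overflow += queued - capacity
            queued = capacity
        queued = max(0, queued - drain_per_tick)
    return overflow


# Bursty load: the average rate (12.5 per tick) is below the drain rate (15).
bursty = simulate([50, 0, 0, 0, 50, 0, 0, 0], drain_per_tick=15, capacity=60)

# Sustained load: 50 per tick against a drain rate of 15.
sustained = simulate([50] * 8, drain_per_tick=15, capacity=60)
```

With these made-up numbers the bursty case never overflows, while the sustained case overflows on every tick after the first, so buffering only postpones the "drop or wait" choice when the consumer is slower on average.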
Tom Lane escribió: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > Tom Lane escribió: > >> This is the same issue already raised with respect to syslog versus > >> syslogger, ie, some people would rather lose log data than have the > >> backends block waiting for it to be written. > > > That could be made configurable; i.e. let the user choose whether to > > lose messages or to make everybody wait. > > Hmm, I guess I missed where you proposed an implementation that would > support that? syslog uses a nonblocking file descriptor without a retry loop to implement their logic. I see no reason we couldn't do that ourselves. Mixing it with regular blocking code could turn out to be nontrivial, but I don't think it's impossible. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
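[Editor's sketch] A write with "lose messages rather than wait" semantics, in the spirit of what Alvaro describes, might look like this. It is a minimal sketch assuming a POSIX pipe; `try_log` is a hypothetical helper, not a PostgreSQL or syslog function.

```python
import os


def try_log(fd, message):
    """Attempt one write to a non-blocking fd; never wait and never retry.

    Returns True if the whole message was written, False if it was
    dropped or only partly written.
    """
    try:
        written = os.write(fd, message)
    except BlockingIOError:
        # The pipe or socket is full: drop the message instead of blocking.
        return False
    # A short write leaves a truncated message in the stream.
    return written == len(message)


# Usage: put the write end of a pipe into non-blocking mode first.
read_end, write_end = os.pipe()
os.set_blocking(write_end, False)
try_log(write_end, b"LOG:  example message\n")
```

The short-write case is the weak point: the caller learns that the message was truncated but cannot take back the bytes already written.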
Alvaro Herrera wrote: > Tom Lane escribió: > >> Alvaro Herrera <alvherre@commandprompt.com> writes: >> >>> Tom Lane escribió: >>> >>>> This is the same issue already raised with respect to syslog versus >>>> syslogger, ie, some people would rather lose log data than have the >>>> backends block waiting for it to be written. >>>> >>> That could be made configurable; i.e. let the user choose whether to >>> lose messages or to make everybody wait. >>> >> Hmm, I guess I missed where you proposed an implementation that would >> support that? >> > > syslog uses a nonblocking file descriptor without a retry loop to > implement their logic. I see no reason we couldn't do that ourselves. > Mixing it with regular blocking code could turn out to be nontrivial, > but I don't think it's impossible. > > Well, for CSV logs it's a complete non-starter. We go to quite a deal of trouble to ensure we don't miss messages, because if we do the CSVs will be hopelessly corrupted. Frankly, if you're generating so much log output that blocking is going to become an issue you should probably just be using syslog on Unix anyway. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > Alvaro Herrera wrote: >> syslog uses a nonblocking file descriptor without a retry loop to >> implement their logic. I see no reason we couldn't do that ourselves. >> Mixing it with regular blocking code could turn out to be nontrivial, >> but I don't think it's impossible. > Well, for CSV logs it's a complete non-starter. We go to quite a deal of > trouble to ensure we don't miss messages, because if we do the CSVs will > be hopelessly corrupted. You could make it work if write() had all-or-nothing semantics, so that you could write or discard a whole logical message at once. But I don't believe that's guaranteed for any interesting cases. regards, tom lane
Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > >> Alvaro Herrera wrote: >> >>> syslog uses a nonblocking file descriptor without a retry loop to >>> implement their logic. I see no reason we couldn't do that ourselves. >>> Mixing it with regular blocking code could turn out to be nontrivial, >>> but I don't think it's impossible. >>> > > >> Well, for CSV logs it's a complete non-starter. We go to quite a deal of >> trouble to ensure we don't miss messages, because if we do the CSVs will >> be hopelessly corrupted. >> > > You could make it work if write() had all-or-nothing semantics, so that > you could write or discard a whole logical message at once. But I don't > believe that's guaranteed for any interesting cases. > > > Right. That's part of why we had to invent the chunking protocol between the backends and the syslogger, IIRC. cheers andrew
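[Editor's sketch] The general idea behind a chunking protocol of the kind Andrew mentions can be sketched as follows. This is an illustration only and not PostgreSQL's actual wire format: the header layout, `MAX_CHUNK`, `split_message`, and `Reassembler` are all hypothetical. The one fact relied on is that POSIX makes pipe writes of at most PIPE_BUF bytes atomic, and PIPE_BUF is at least 512.

```python
import struct

# Hypothetical header: sender id (4 bytes), payload length (2), is-last flag (1).
HEADER = struct.Struct("!IHB")
MAX_CHUNK = 512  # assumed <= PIPE_BUF, so each chunk is written atomically
MAX_PAYLOAD = MAX_CHUNK - HEADER.size


def split_message(sender, message):
    """Split one logical message into self-describing chunks."""
    parts = [message[i:i + MAX_PAYLOAD]
             for i in range(0, len(message), MAX_PAYLOAD)] or [b""]
    return [
        HEADER.pack(sender, len(part), int(i == len(parts) - 1)) + part
        for i, part in enumerate(parts)
    ]


class Reassembler:
    """Rebuild whole messages from chunks interleaved by several senders."""

    def __init__(self):
        self.buffer = b""   # bytes read from the pipe but not yet parsed
        self.partial = {}   # sender id -> payload collected so far

    def feed(self, data):
        """Consume raw pipe bytes; return completed (sender, message) pairs."""
        self.buffer += data
        done = []
        while len(self.buffer) >= HEADER.size:
            sender, length, is_last = HEADER.unpack_from(self.buffer)
            end = HEADER.size + length
            if len(self.buffer) < end:
                break  # wait for the rest of this chunk
            payload = self.buffer[HEADER.size:end]
            self.buffer = self.buffer[end:]
            so_far = self.partial.pop(sender, b"") + payload
            if is_last:
                done.append((sender, so_far))
            else:
                self.partial[sender] = so_far
        return done
```

Because every chunk carries its sender and a last-chunk flag, the reader can keep long messages from different writers apart even when their chunks interleave on the shared pipe.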