Thread: WIP - syslogger infrastructure changes
I've been working on a restructure of some of the syslogger infrastructure with Josh Tolley. Attached is the WIP patch that we have so far.

First, the patch removes the logging_collector parameter and basically assumes that logging_collector is always on. With that, it's no longer necessary to restart your server just to reconfigure the logging, and it also takes away a confusing parameter (really, "log_destination=stderr, logging_collector=on" is *not* a logical way to say "log to file"). Instead, it adds a log_destination of "file" that is the standard way to log to file. Given that the syslogger is now always started, those who actually *want* logging to stderr (which I claim is a low number of people, but that's a different story) will have it go through the syslogger and then to the stderr of the syslogger. If the syslogger is not used at all (say, with a syslog log target that goes directly from the backend), it'll just sit around doing nothing, and the cost should be very close to zero.

Second, data transfer from the backends is now in the form of structured data, and the actual logging message is put together in the syslogger (today, it's put together in the backend and just sent as a string to the syslogger). Yes, this means that we will likely send more data than what's eventually going to be logged, since not all fields end up in the log (except with CVS logging, I guess). But you usually don't send *that* much data in the log. It *does* allow us to properly specify multiple log destinations, which we only "almost" can now. For example, it allows a regular logfile and csvlog at the same time. And since stderr actually means stderr, you can combine stderr and logfile.

By putting more of the available data in the syslogger, this is also intended to make it easier to make the syslogger functionality more advanced in the future. This includes the kind of work that Itagaki Takahiro submitted a patch around recently, and things like being able to write different things to different logfiles.
As the name says, it's still a WIP, and the code contains both several FIXME places and some other unfinished parts so far, but we're posting it to get comments on the general idea and the method for doing it.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Hi Magnus,

On Mon, Sep 14, 2009 at 9:41 PM, Magnus Hagander <magnus@hagander.net> wrote:
> First, the patch removes the logging_collector parameter and basically
> assumes that logging_collector is always on.

Alvaro committed this a few days ago:
http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/src/sgml/config.sgml?r1=1.225&r2=1.226

Any consequence?

> Second, data transfer from the backends is now in the form of
> structured data, and the actual logging message is put together in the
> syslogger (today,it's put together in the backend and just sent as a
> string to the syslogger). Yes, this means that we will likely send
> more data than what's eventually going to be logged, since all fields
> don't go out (except with CVS logging, I guess). But you usually don't
> send *that* much data in the log.

I don't know if it will make a real difference, but some of us log quite a lot of queries.

--
Guillaume
On Mon, Sep 14, 2009 at 21:56, Guillaume Smet <guillaume.smet@gmail.com> wrote:
> Hi Magnus,
>
> On Mon, Sep 14, 2009 at 9:41 PM, Magnus Hagander <magnus@hagander.net> wrote:
>> First, the patch removes the logging_collector parameter and basically
>> assumes that logging_collector is always on.
>
> Alvaro commited this a few days ago:
> http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/src/sgml/config.sgml?r1=1.225&r2=1.226
>
> Any consequence?

At this point, not really. If you log to syslog, it still goes directly to syslog, just like before, without passing through the logging collector. That is something worth considering in the future, though.

>> Second, data transfer from the backends is now in the form of
>> structured data, and the actual logging message is put together in the
>> syslogger (today,it's put together in the backend and just sent as a
>> string to the syslogger). Yes, this means that we will likely send
>> more data than what's eventually going to be logged, since all fields
>> don't go out (except with CVS logging, I guess). But you usually don't
>> send *that* much data in the log.
>
> I don't know if it will make a real difference but some of us log
> quite a lot of queries.

Yeah, one of the main reasons is to be able to do that using CSV (not CVS, sorry about that typo) to stick it into logging parsers, and then have the regular logfile available for DBA reading as well.

Anyway, we'll have to do some performance testing as well, obviously - but I haven't done any of that yet.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Magnus Hagander <magnus@hagander.net> writes:
> First, the patch removes the logging_collector parameter and basically
> assumes that logging_collector is always on.

I don't find that to be a good idea, and you certainly have not made a case why we should change it. I can't see any reason why pushing functionality out of backends and downstream to the syslogger process is an improvement. What it's more likely to do is create a processing bottleneck and a single point of failure.

> ... Given that the syslogger is now
> always started, those that actually *want* logging to stderr (which I
> claim is a low number of people, but that's a different story) will
> have it go through the syslogger and then to the stderr of syslogger.

That design doesn't work because there is then *no* way to recover from a syslogger crash. You no longer have access to the original stderr file once the postmaster has redirected stderr to syslogger. We can live with that so long as syslogger's stderr output isn't very interesting, but not if it's going to be the main log channel.

			regards, tom lane
Magnus Hagander <magnus@hagander.net> wrote:
> Second, data transfer from the backends is now in the form of
> structured data, and the actual logging message is put together in the
> syslogger

I'd like to have an opposite approach -- per-backend log files. The syslogger appending logs to only one file will be a performance bottleneck when we want to write a lot of logs. Per-backend log files like "postgres.<backend-id>.log" would be a help then.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
On 15 sep 2009, at 07.21, Itagaki Takahiro <itagaki.takahiro@oss.ntt.co.jp> wrote:
> Magnus Hagander <magnus@hagander.net> wrote:
>
>> Second, data transfer from the backends is now in the form of
>> structured data, and the actual logging message is put together in
>> the syslogger
>
> I'd like to have an opposite approach -- per-backend log files.
> The syslogger appending logs to only one file will be a performance
> bottleneck when we want to write a lot of logs. Per-backend log files
> like "postgres.<backend-id>.log" would be a help then.

That would make the log more or less completely useless for any of the situations I've been in. I can see each backend writing it, certainly, but keeping it in separate files makes it useless without post-processing, which in most cases means useless for day-to-day work.

/Magnus
Magnus Hagander <magnus@hagander.net> wrote:
> On 15 sep 2009, at 07.21, Itagaki Takahiro <itagaki.takahiro@oss.ntt.co.jp> wrote:
>> I'd like to have an opposite approach -- per-backend log files.
>
> I can see each backend writing it, certainly, but keeping it in
> separate files makes it useless without post-processing, which in most
> cases means useless for day-to-day work.

Sure. There should be a trade-off between performance and usability, and that is the reason I submitted the per-destination/per-category log filter. Log messages for "day-to-day work" should be written to a single file (either text or syslog), but SQL queries used only for database auditing are acceptable even if written to separate files.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
On Mon, Sep 14, 2009 at 4:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> First, the patch removes the logging_collector parameter and basically
>> assumes that logging_collector is always on.
>
> I don't find that to be a good idea, and you certainly have not made
> a case why we should change it. I can't see any reason why pushing
> functionality out of backends and downstream to the syslogger process
> is an improvement. What it's more likely to do is create a processing
> bottleneck and a single point of failure.

Hmm. I think the justification was supposed to be this part here:

$ With that, it's no longer necessary to restart your server just to
$ reconfigure the logging, and it also takes away a confusing parameter
$ (really "log_destination=stderr, logging_collector=on" is *not* a logical
$ way to say "log to file"). Instead, it adds a log_destination of "file" that
$ is the standard log to file."

Do we have any positive or negative experience with logging_collector as a performance bottleneck? Are there people running with logging_collector=off to avert disaster?

>> ... Given that the syslogger is now
>> always started, those that actually *want* logging to stderr (which I
>> claim is a low number of people, but that's a different story) will
>> have it go through the syslogger and then to the stderr of syslogger.
>
> That design doesn't work because there is then *no* way to recover from
> a syslogger crash. You no longer have access to the original stderr
> file once the postmaster has redirected stderr to syslogger. We can
> live with that so long as syslogger's stderr output isn't very
> interesting, but not if it's going to be the main log channel.

I haven't read the patch yet, but this seems like it might be surmountable by having the postmaster keep both file descriptors around, and letting the children make appropriate calls to dup2() and close() just after forking.

...Robert
On Fri, Sep 25, 2009 at 05:18, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Sep 14, 2009 at 4:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Magnus Hagander <magnus@hagander.net> writes:
>>> First, the patch removes the logging_collector parameter and basically
>>> assumes that logging_collector is always on.
>>
>> I don't find that to be a good idea, and you certainly have not made
>> a case why we should change it. I can't see any reason why pushing
>> functionality out of backends and downstream to the syslogger process
>> is an improvement. What it's more likely to do is create a processing
>> bottleneck and a single point of failure.
>
> Hmm. I think the justification was supposed to be this part here:
>
> $ With that, it's no longer necessary to restart your server just to
> $ reconfigure the logging, and it also takes away a confusing parameter
> $ (really "log_destination=stderr, logging_collector=on" is *not* a logical
> $ way to say "log to file"). Instead, it adds a log_destination of "file" that
> $ is the standard log to file."
>
> Do we have any positive or negative experience with logging_collector
> as a performance bottleneck? Are there people running with
> logging_collector=off to avert disaster?

I've never heard of that, so I'd be very interested in hearing if somebody did.

Actually, it's not starting the logging collector that's the issue. It's the redirection.

So we could always start the logging collector, but not necessarily redirect stderr. For those who then set logging to "file", it gets passed to the logging collector. Those who set it to "stderr" just have to deal with stderr somehow (pg_ctl -l for example). It's still going to require a restart to change stderr, but that will only hit those people who are actually having performance issues from it (and want to switch to using stderr instead of the logging collector).

Then we just send the data to the syslogger over a separate pipe. If we keep the "endpoint" of this pipe in the postmaster, a new syslogger could just "pick that up" after a crash, no? Data that goes on stderr will still go to stderr and not be captured in this case, but normally that would be no data at all, and if it did happen it'd be caught by wherever you point pg_ctl -l at (which is the same place the early startup messages go, before the logger is running).

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On Fri, Sep 25, 2009 at 4:13 PM, Magnus Hagander <magnus@hagander.net> wrote: > On Fri, Sep 25, 2009 at 05:18, Robert Haas <robertmhaas@gmail.com> wrote: >> On Mon, Sep 14, 2009 at 4:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Magnus Hagander <magnus@hagander.net> writes: >>>> First, the patch removes the logging_collector parameter and basically >>>> assumes that logging_collector is always on. >>> >>> I don't find that to be a good idea, and you certainly have not made >>> a case why we should change it. I can't see any reason why pushing >>> functionality out of backends and downstream to the syslogger process >>> is an improvement. What it's more likely to do is create a processing >>> bottleneck and a single point of failure. >> >> Hmm. I think the justification was supposed to be this part here: >> >> $ With that, it's no longer necessary to restart your server just to >> $ reconfigure the logging, and it also takes away a confusing parameter >> $ (really "log_destination=stderr, logging_collector=on" is *not* a logical >> $ way to say "log to file"). Instead, it adds a log_destination of "file" that >> $ is the standard log to file." >> >> Do we have any positive or negative experience with logging_collector >> as a performance bottleneck? Are there people running with >> logging_collector=off to avert disaster? > > I've never heard of that, so I'd be very interested in hearing if somebody did. > > Actually, it's not starting the logging collector that's the issue. > It's the redirection. > > So we could always start the logging collector, but not necessarily > redirect stderr. For those who then set logging to "file", it gets > passed to the logging collector. Those who set it to "stderr" just > have to deal with stderr somehow (pg_ctl -l for example). 
> It's still
> going to require a restart to change stderr, but that will only hit
> those people who are actually having performance issues from it (and
> want to switch to using stderr instead of logging collector).
>
> Then we just send the data to the syslogger over a separate pipe. If
> we keep the "endpoint" of this pipe in the postmaster, a new syslogger
> could just "pick that up" after a crash, no? Data that goes on stderr
> will still go to stderr and not be captured in this case, but normally
> that would be no data at all, and if it did happen it'd be caught by
> wherever you point pg_ctl -l at (which is the same place the early
> startup messages goes, before the logger is running).

That doesn't sound horribly robust. I thought part of the point of the design of the logging collector was that if random libraries linked into the backend emit messages on stderr, we'll still capture them and send them on to the appropriate place, even though we don't really know what they are. Having such messages go off to some probably-much-less-carefully-monitored logfile somewhere doesn't seem good.

Could we handle this by allowing the setting to be changed while the system is up, but just having it take effect for any NEW backends? As long as we keep the original stderr around in the postmaster, we ought to be able to change how new backends are set up after the fact. Anyone who really needs to make sure the setting gets changed, but without shutting down the system, will just have to make sure to kill off all their old backends after making the change.

...Robert
On Fri, Sep 25, 2009 at 22:26, Robert Haas <robertmhaas@gmail.com> wrote: > On Fri, Sep 25, 2009 at 4:13 PM, Magnus Hagander <magnus@hagander.net> wrote: >> On Fri, Sep 25, 2009 at 05:18, Robert Haas <robertmhaas@gmail.com> wrote: >>> On Mon, Sep 14, 2009 at 4:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>>> Magnus Hagander <magnus@hagander.net> writes: >>>>> First, the patch removes the logging_collector parameter and basically >>>>> assumes that logging_collector is always on. >>>> >>>> I don't find that to be a good idea, and you certainly have not made >>>> a case why we should change it. I can't see any reason why pushing >>>> functionality out of backends and downstream to the syslogger process >>>> is an improvement. What it's more likely to do is create a processing >>>> bottleneck and a single point of failure. >>> >>> Hmm. I think the justification was supposed to be this part here: >>> >>> $ With that, it's no longer necessary to restart your server just to >>> $ reconfigure the logging, and it also takes away a confusing parameter >>> $ (really "log_destination=stderr, logging_collector=on" is *not* a logical >>> $ way to say "log to file"). Instead, it adds a log_destination of "file" that >>> $ is the standard log to file." >>> >>> Do we have any positive or negative experience with logging_collector >>> as a performance bottleneck? Are there people running with >>> logging_collector=off to avert disaster? >> >> I've never heard of that, so I'd be very interested in hearing if somebody did. >> >> Actually, it's not starting the logging collector that's the issue. >> It's the redirection. >> >> So we could always start the logging collector, but not necessarily >> redirect stderr. For those who then set logging to "file", it gets >> passed to the logging collector. Those who set it to "stderr" just >> have to deal with stderr somehow (pg_ctl -l for example). 
>> It's still
>> going to require a restart to change stderr, but that will only hit
>> those people who are actually having performance issues from it (and
>> want to switch to using stderr instead of logging collector).
>>
>> Then we just send the data to the syslogger over a separate pipe. If
>> we keep the "endpoint" of this pipe in the postmaster, a new syslogger
>> could just "pick that up" after a crash, no? Data that goes on stderr
>> will still go to stderr and not be captured in this case, but normally
>> that would be no data at all, and if it did happen it'd be caught by
>> wherever you point pg_ctl -l at (which is the same place the early
>> startup messages goes, before the logger is running).
>
> That doesn't sound horribly robust. I thought part of the point of
> the design of the logging collector was that if random libraries
> linked into the backend emit messages on stderr, we'll still capture
> them and send them on to the appropriate place, even though we don't
> really know what they are. Having such messages go off to some
> probably-much-less-carefully-monitored logfile somewhere doesn't seem
> good.

Well, from how I read the first complaint here, running with logging_collector on is simply broken from the crash perspective, no? I mean, don't we have the same problem *today*? It's just that logging_collector isn't on by default (in the source distribution, that is - on our most installed platform it is..)

Also, how often does it actually happen that "random libraries linked into the backend emit messages on stderr"? Is this really something that we need to pay so much attention to, rather than just let them go on stderr and have them be redirected?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Robert Haas wrote:
> $ With that, it's no longer necessary to restart your server just to
> $ reconfigure the logging, and it also takes away a confusing parameter
> $ (really "log_destination=stderr, logging_collector=on" is *not* a logical
> $ way to say "log to file"). Instead, it adds a log_destination of "file" that
> $ is the standard log to file."
>
> Do we have any positive or negative experience with logging_collector
> as a performance bottleneck? Are there people running with
> logging_collector=off to avert disaster?

The collector can be a bottleneck, yes, but for this to happen it needs to be in active use. So if the collector is always running but the messages are being sent to syslog, no harm is done.

(A customer of ours found out the hard way that the collector can block backends when things are tight. They have since switched to syslog.)

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Magnus Hagander <magnus@hagander.net> writes: > Well, from how I read the first complaint here, running with > logging_collector on is simply broken from the crash perspective, no? > I mean, don't we have the same problem *today*, No, we don't. But we would if we accept the proposed patch. regards, tom lane
[ argh, hit send a bit too fast ]

Magnus Hagander <magnus@hagander.net> writes:
> Also, how often does it actually happen that "random libraries linked
> into the backend emit messages on stderr"? Is this really something
> that we need to pay so much attention to, rather than just let them go
> on stderr and have them be redirected?

Yes, this is an ABSOLUTE deal-breaker. To name two reasons:

* glibc reports various malloc-arena-corruption problems on stderr.

* On a lot of systems you will not find out anything useful about dynamic-linker problems if you don't capture stderr.

			regards, tom lane
On Fri, Sep 25, 2009 at 4:29 PM, Magnus Hagander <magnus@hagander.net> wrote: > On Fri, Sep 25, 2009 at 22:26, Robert Haas <robertmhaas@gmail.com> wrote: >> On Fri, Sep 25, 2009 at 4:13 PM, Magnus Hagander <magnus@hagander.net> wrote: >>> On Fri, Sep 25, 2009 at 05:18, Robert Haas <robertmhaas@gmail.com> wrote: >>>> On Mon, Sep 14, 2009 at 4:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>>>> Magnus Hagander <magnus@hagander.net> writes: >>>>>> First, the patch removes the logging_collector parameter and basically >>>>>> assumes that logging_collector is always on. >>>>> >>>>> I don't find that to be a good idea, and you certainly have not made >>>>> a case why we should change it. I can't see any reason why pushing >>>>> functionality out of backends and downstream to the syslogger process >>>>> is an improvement. What it's more likely to do is create a processing >>>>> bottleneck and a single point of failure. >>>> >>>> Hmm. I think the justification was supposed to be this part here: >>>> >>>> $ With that, it's no longer necessary to restart your server just to >>>> $ reconfigure the logging, and it also takes away a confusing parameter >>>> $ (really "log_destination=stderr, logging_collector=on" is *not* a logical >>>> $ way to say "log to file"). Instead, it adds a log_destination of "file" that >>>> $ is the standard log to file." >>>> >>>> Do we have any positive or negative experience with logging_collector >>>> as a performance bottleneck? Are there people running with >>>> logging_collector=off to avert disaster? >>> >>> I've never heard of that, so I'd be very interested in hearing if somebody did. >>> >>> Actually, it's not starting the logging collector that's the issue. >>> It's the redirection. >>> >>> So we could always start the logging collector, but not necessarily >>> redirect stderr. For those who then set logging to "file", it gets >>> passed to the logging collector. 
Those who set it to "stderr" just >>> have to deal with stderr somehow (pg_ctl -l for example). It's still >>> going to require a restart to change stderr, but that will only hit >>> those people who are actually having performance issues from it (and >>> want to switch to using stderr instead of logging collector). >>> >>> Then we just send the data to the syslogger over a separate pipe. If >>> we keep the "endpoint" of this pipe in the postmaster, a new syslogger >>> could just "pick that up" after a crash, no? Data that goes on stderr >>> will still go to stderr and not be captured in this case, but normally >>> that would be no data at all, and if it did happen it'd be caught by >>> wherever you point pg_ctl -l at (which is the same place the early >>> startup messages goes, before the logger is running). >> >> That doesn't sound horribly robust. I thought part of the point of >> the design of the logging collector was that if random libraries >> linked into the backend emit messages on stderr, we'll still capture >> them and send them on to the appropriate place, even though we don't >> really know what they are. Having such messages go off to some >> probably-much-less-carefully-monitored logfile somewhere doesn't seem >> good. > > Well, from how I read the first complaint here, running with > logging_collector on is simply broken from the crash perspective, no? > I mean, don't we have the same problem *today*, it's just that > logging_collector isn't on by default (in the source distribution that > is - on our most installed platform it is..) AIUI the problem is that when logging_collector is on, we throw away the original stderr. That's OK as long as you never try to switch back to it. > Also, how often does it actually happen that "random libraries linked > into the backend emit messages on stderr"? Is this really something > that we need to pay so much attention to, rather than just let them go > on stderr and have them be redirected? I think so. 
But I'm willing to be told I'm full of it. ...Robert
On Fri, Sep 25, 2009 at 5:22 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> AIUI the problem is that when logging_collector is on, we throw away
> the original stderr. That's OK as long as you never try to switch
> back to it.

BTW, this seems like it could be fixed with some appropriate file descriptor management in postmaster. No?

Anyway, marking this Rejected for now, as that seems to be the consensus.

...Robert
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Sep 25, 2009 at 5:22 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> AIUI the problem is that when logging_collector is on, we throw away
>> the original stderr. That's OK as long as you never try to switch
>> back to it.
>
> BTW, this seems like it could be fixed with some appropriate file
> descriptor management in postmaster. No?

Yeah, probably. In the current system design it didn't seem to be necessary because collecting the collector process's own stderr output is usually not all that critical. If we had the postmaster dup its original stderr and the collector dup it back, that would be a more complete but more complex solution. (dup2 works on Windows, no?)

			regards, tom lane
On Sat, Sep 26, 2009 at 17:43, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Fri, Sep 25, 2009 at 5:22 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> AIUI the problem is that when logging_collector is on, we throw away
>>> the original stderr. That's OK as long as you never try to switch
>>> back to it.
>
>> BTW, this seems like it could be fixed with some appropriate file
>> descriptor management in postmaster. No?
>
> Yeah, probably. In the current system design it didn't seem to be
> necessary because collecting the collector process's own stderr output
> is usually not all that critical. If we had the postmaster dup its
> original stderr and the collector dup it back, that would be a more
> complete but more complex solution. (dup2 works on Windows, no?)

Yeah, it should. Not sure how that works across process boundaries, though; we do some special fiddling for any handles that need to be inherited.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On Sat, Sep 26, 2009 at 11:43:46AM -0400, Tom Lane wrote:
> complete but more complex solution. (dup2 works on Windows, no?)

Unless I'm misreading syslogger.c, dup2() gets called on every platform.

I've started the wiki page in question:
http://wiki.postgresql.org/wiki/Logging_Brainstorm

It doesn't have anything linking to it right now, which might be a bad thing.

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com
Joshua Tolley wrote: > On Sat, Sep 26, 2009 at 11:43:46AM -0400, Tom Lane wrote: > >> complete but more complex solution. (dup2 works on Windows, no?) >> > > Unless I'm misreading syslogger.c, dup2() gets called on every platform. > Yes. Windows supports dup2(). See for example <http://msdn.microsoft.com/en-us/library/8syseb29(VS.80).aspx> cheers andrew
On Sat, 2009-09-26 at 15:35 -0600, Joshua Tolley wrote:
> On Sat, Sep 26, 2009 at 11:43:46AM -0400, Tom Lane wrote:
>> complete but more complex solution. (dup2 works on Windows, no?)
>
> Unless I'm misreading syslogger.c, dup2() gets called on every platform.
>
> I've started the wiki page in question:
> http://wiki.postgresql.org/wiki/Logging_Brainstorm

Most of the items listed there you can already do with a sufficiently non-ancient syslog implementation. Has anyone ever actually tested rsyslog and syslog-ng for performance and robustness with PostgreSQL? Half the ideas about on-disk queuing and checkpointing and so on that have been mentioned recently seem to come straight from their documentation.