Thread: Configuration of statistical views
Hi,

OK, all the frequently called functions of the pgstat stuff are
macros (will commit that later today).

Now about the per database configuration. The thing is that I don't
know if it is worth doing it in too much detail. #ifdef'ing out the
functionality, I get the following wallclock runtimes for the
regression test on a 500MHz P-III:

    Backend does nothing:                            1:03
    Backend sends per table scan and block IO:       1:05
    Backend sends per table info plus querystring:   1:10

If somebody wants to see an application's querystring (at least the
first 512 bytes) just in case something goes wrong and the client
hangs, he'd have to run querystring reporting all the time either
way.

So I can see value in a per database default in pg_database plus
the ability to switch it on/off via statement to analyze single
commands.

What do others think?

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
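
For reference, the three wallclock figures above can be turned into relative overhead. This is only a small arithmetic sketch in Python; the function names are made up for illustration, and the only inputs are the timings quoted in the message.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'M:SS' wallclock string into seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def overhead_percent(baseline: str, measured: str) -> float:
    """Relative slowdown of `measured` against `baseline`, in percent."""
    base = to_seconds(baseline)
    return 100.0 * (to_seconds(measured) - base) / base


# Timings as reported for the regression test on the 500MHz P-III.
RUNTIMES = {
    "backend does nothing": "1:03",
    "per table scan and block IO": "1:05",
    "per table info plus querystring": "1:10",
}

if __name__ == "__main__":
    baseline = RUNTIMES["backend does nothing"]
    for label, runtime in RUNTIMES.items():
        print(f"{label}: {overhead_percent(baseline, runtime):.1f}% overhead")
```

By this measure the table and block IO counters cost roughly 3%, and adding the querystring brings the total to roughly 11% on this particular test run.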
Jan Wieck <JanWieck@Yahoo.com> writes:
> So I can see value in a per database default in pg_database
> plus the ability to switch it on/off via statement to analyze
> single commands.

Do you even need a per-database default? Why not an installation-wide
default in postgresql.conf plus on/off commands? The great advantage
of doing it that way is that it's simply a GUC variable or three, and
you don't need to expend any work on developing infrastructure. So
I'd recommend doing it that way to get started, even if you later decide
that something more complex is warranted.

regards, tom lane
> If somebody wants to see an application's querystring (at
> least the first 512 bytes) just in case something goes wrong
> and the client hangs, he'd have to run querystring reporting
> all the time either way.

Agreed. That should be on all the time.

> So I can see value in a per database default in pg_database
> plus the ability to switch it on/off via statement to analyze
> single commands.

Sounds fine. You may be able to just do a GUC/SET option and not do a
per-database field. GUC doesn't do per-database, and having a database
flag and GUC would be confusing. Let's roll with just GUC/SET and see
how it goes.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Bruce Momjian wrote:
> > If somebody wants to see an application's querystring (at
> > least the first 512 bytes) just in case something goes wrong
> > and the client hangs, he'd have to run querystring reporting
> > all the time either way.
>
> Agreed. That should be on all the time.
>
> > So I can see value in a per database default in pg_database
> > plus the ability to switch it on/off via statement to analyze
> > single commands.
>
> Sounds fine. You may be able to just do a GUC/SET option and not do a
> per-database field. GUC doesn't do per-database, and having a database
> flag and GUC would be confusing. Let's roll with just GUC/SET and see
> how it goes.

No per backend on/off statement - is that what you mean?
That'd be easiest to get started.

Jan
> Bruce Momjian wrote:
> > > If somebody wants to see an application's querystring (at
> > > least the first 512 bytes) just in case something goes wrong
> > > and the client hangs, he'd have to run querystring reporting
> > > all the time either way.
> >
> > Agreed. That should be on all the time.
> >
> > > So I can see value in a per database default in pg_database
> > > plus the ability to switch it on/off via statement to analyze
> > > single commands.
> >
> > Sounds fine. You may be able to just do a GUC/SET option and not do a
> > per-database field. GUC doesn't do per-database, and having a database
> > flag and GUC would be confusing. Let's roll with just GUC/SET and see
> > how it goes.
>
> No per backend on/off statement - is that what you mean?
> That'd be easiest to get started.

GUC as the default, and SET for per-backend. I am liking GUC more and
more.

--
  Bruce Momjian
Tom Lane wrote:
> Jan Wieck <JanWieck@Yahoo.com> writes:
> > So I can see value in a per database default in pg_database
> > plus the ability to switch it on/off via statement to analyze
> > single commands.
>
> Do you even need a per-database default? Why not an installation-wide
> default in postgresql.conf plus on/off commands? The great advantage
> of doing it that way is that it's simply a GUC variable or three, and
> you don't need to expend any work on developing infrastructure. So
> I'd recommend doing it that way to get started, even if you later decide
> that something more complex is warranted.

Personally, I can live with no options at all, because I think that
amount of performance loss is worth it for being able to look at a
query just in case. You know, if it's a config option, it tends to
always be off when the errors happen.

Jan
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> If somebody wants to see an application's querystring (at
>> least the first 512 bytes) just in case something goes wrong
>> and the client hangs, he'd have to run querystring reporting
>> all the time either way.

> Agreed. That should be on all the time.

"On by default", sure. "On all the time", I'm not sold on.

But anyway, we seem to be converging on the conclusion that setting
up a GUC variable will do fine, at least until there is definite
evidence that it won't.

Probably there need to be at least 2 variables: (a) a PGC_POSTMASTER
variable that controls whether the stats collector is even started,
and (b) PGC_USERSET variable(s) that enable a particular backend to
send particular kinds of data to the collector. Note that, for example,
backend start/stop events probably need to be reported whenever the
postmaster variable is set, even if all the USERSET variables are off.

regards, tom lane
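
The two-level scheme described above can be modelled in a few lines. This is a minimal sketch in Python, not PostgreSQL code: the class, the field names, the message kinds and the default values are all hypothetical and only illustrate the gating logic (one startup-only switch for the collector, plus per-backend switches for individual kinds of data, with start/stop events bypassing the per-backend switches).

```python
from dataclasses import dataclass


@dataclass
class StatsSettings:
    # Hypothetical stand-in for the PGC_POSTMASTER variable: fixed at
    # postmaster start, decides whether a collector exists at all.
    collector_started: bool = True
    # Hypothetical stand-ins for the per-backend variables; the defaults
    # here are arbitrary choices for the example.
    send_querystring: bool = True
    send_table_stats: bool = False


def should_send(settings: StatsSettings, kind: str) -> bool:
    """Decide whether a backend should send a message of the given kind."""
    if not settings.collector_started:
        # No collector running: nothing is sent, whatever the backend wants.
        return False
    if kind in ("backend_start", "backend_stop"):
        # Lifecycle events are reported whenever the collector is enabled,
        # even if every per-backend switch is off.
        return True
    if kind == "querystring":
        return settings.send_querystring
    if kind == "table_stats":
        return settings.send_table_stats
    return False  # unknown message kinds are never sent


if __name__ == "__main__":
    quiet = StatsSettings(send_querystring=False, send_table_stats=False)
    for kind in ("backend_start", "querystring", "table_stats"):
        print(kind, should_send(quiet, kind))
```

Putting the collector check first means a backend can never send to a collector that was not started, no matter how its own switches are set.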
Bruce Momjian wrote:
> > Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > >> If somebody wants to see an application's querystring (at
> > >> least the first 512 bytes) just in case something goes wrong
> > >> and the client hangs, he'd have to run querystring reporting
> > >> all the time either way.
> >
> > > Agreed. That should be on all the time.
> >
> > "On by default", sure. "On all the time", I'm not sold on.
>
> So we will have GUC for stats and query string. Fine. Set query
> string on by default and stats off by default. Good.
>
> > But anyway, we seem to be converging on the conclusion that setting
> > up a GUC variable will do fine, at least until there is definite
> > evidence that it won't.
> >
> > Probably there need to be at least 2 variables: (a) a PGC_POSTMASTER
> > variable that controls whether the stats collector is even started,
> > and (b) PGC_USERSET variable(s) that enable a particular backend to
> > send particular kinds of data to the collector. Note that, for example,
> > backend start/stop events probably need to be reported whenever the
> > postmaster variable is set, even if all the USERSET variables are off.
>
> And another one to control whether the daemon is even running. OK.

Forcing the other two to stay off if no daemon is present.

Jan
Jan Wieck <JanWieck@yahoo.com> writes:
>> backend start/stop events probably need to be reported whenever the
>> postmaster variable is set, even if all the USERSET variables are off.

> I don't consider backend start/stop messages to be critical,
> although we get some complaints already about connection
> slowness - well, this is somewhere in the microseconds. And
> it'd be a little messy because the start message is sent by
> the backend while the stop message is sent by the postmaster.
> So where exactly to put it?

This is exactly why I think they should be sent unconditionally.
It doesn't matter if a particular backend turns its reporting on and
off while it runs (I hope), but I'd think the stats collector would
get confused if it saw, say, a start and no stop message for a
particular backend.

OTOH, given that we need to treat the transmission channel as
unreliable, it would be a bad idea anyway if the stats collector got
seriously confused by not seeing the start or the stop message.

regards, tom lane
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> If somebody wants to see an application's querystring (at
> >> least the first 512 bytes) just in case something goes wrong
> >> and the client hangs, he'd have to run querystring reporting
> >> all the time either way.
>
> > Agreed. That should be on all the time.
>
> "On by default", sure. "On all the time", I'm not sold on.

So we will have GUC for stats and query string. Fine. Set query
string on by default and stats off by default. Good.

> But anyway, we seem to be converging on the conclusion that setting
> up a GUC variable will do fine, at least until there is definite
> evidence that it won't.
>
> Probably there need to be at least 2 variables: (a) a PGC_POSTMASTER
> variable that controls whether the stats collector is even started,
> and (b) PGC_USERSET variable(s) that enable a particular backend to
> send particular kinds of data to the collector. Note that, for example,
> backend start/stop events probably need to be reported whenever the
> postmaster variable is set, even if all the USERSET variables are off.

And another one to control whether the daemon is even running. OK.

--
  Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> If somebody wants to see an application's querystring (at
> >> least the first 512 bytes) just in case something goes wrong
> >> and the client hangs, he'd have to run querystring reporting
> >> all the time either way.
>
> > Agreed. That should be on all the time.
>
> "On by default", sure. "On all the time", I'm not sold on.
>
> But anyway, we seem to be converging on the conclusion that setting
> up a GUC variable will do fine, at least until there is definite
> evidence that it won't.

Up to now, only three fulltime PG developers have spoken up. Maybe
someone else would like to comment on it too and hasn't had the
time yet. Let's be a little patient.

> Probably there need to be at least 2 variables: (a) a PGC_POSTMASTER
> variable that controls whether the stats collector is even started,
> and (b) PGC_USERSET variable(s) that enable a particular backend to
> send particular kinds of data to the collector. Note that, for example,
> backend start/stop events probably need to be reported whenever the
> postmaster variable is set, even if all the USERSET variables are off.

I don't consider backend start/stop messages to be critical,
although we get some complaints already about connection
slowness - well, this is somewhere in the microseconds. And
it'd be a little messy because the start message is sent by
the backend while the stop message is sent by the postmaster.
So where exactly to put it?

Jan
Tom Lane wrote:
> Jan Wieck <JanWieck@yahoo.com> writes:
> >> backend start/stop events probably need to be reported whenever the
> >> postmaster variable is set, even if all the USERSET variables are off.
>
> > I don't consider backend start/stop messages to be critical,
> > although we get some complaints already about connection
> > slowness - well, this is somewhere in the microseconds. And
> > it'd be a little messy because the start message is sent by
> > the backend while the stop message is sent by the postmaster.
> > So where exactly to put it?
>
> This is exactly why I think they should be sent unconditionally.
> It doesn't matter if a particular backend turns its reporting on and
> off while it runs (I hope), but I'd think the stats collector would
> get confused if it saw, say, a start and no stop message for a
> particular backend.
>
> OTOH, given that we need to treat the transmission channel as
> unreliable, it would be a bad idea anyway if the stats collector got
> seriously confused by not seeing the start or the stop message.

Hmmm - that's a good point. Right now, the collector is totally
lax on all of that. Missing start packet - no problem, we create
the backend slot on the fly. Missing stats packet - well, the
counters aren't 100% correct, so be it.

But OTOH a missing stop causes it to remember the dead backend
for the postmaster's lifetime, unless a PID wraparound fixes it
someday. Maybe it should periodically (every 10 minutes or even
longer) check with a zero-kill if all the backends it knows about
are really alive.

Jan
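
The zero-kill idea can be illustrated outside the collector. The following is a rough sketch in Python for POSIX systems only, not the collector's actual code: the slot table is modelled as a plain dict keyed by PID, and both function names are hypothetical. Sending signal 0 performs the existence and permission checks without delivering a signal.

```python
import os


def backend_is_alive(pid: int) -> bool:
    """Probe a PID with signal 0 (POSIX only); no signal is delivered."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # ESRCH: no such process
    except PermissionError:
        return True   # EPERM: the process exists but belongs to someone else
    return True


def prune_dead_backends(slots: dict, is_alive=backend_is_alive) -> list:
    """Drop slots whose PID no longer exists and return the removed PIDs.

    `slots` is an assumed stand-in for the collector's per-backend table.
    """
    dead = [pid for pid in slots if not is_alive(pid)]
    for pid in dead:
        del slots[pid]
    return dead


if __name__ == "__main__":
    slots = {os.getpid(): "this process"}
    print("removed:", prune_dead_backends(slots))
```

A caller would run `prune_dead_backends` on a timer, for example every 10 minutes as suggested above. The probe cannot tell a surviving backend from an unrelated process that has reused the PID, so a slot can still outlive its backend in that case.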
Tom Lane writes:
> Probably there need to be at least 2 variables: (a) a PGC_POSTMASTER
> variable that controls whether the stats collector is even started,
> and (b) PGC_USERSET variable(s) that enable a particular backend to
> send particular kinds of data to the collector. Note that, for example,
> backend start/stop events probably need to be reported whenever the
> postmaster variable is set, even if all the USERSET variables are off.

I'm not familiar with the kinds of statistics that are supposed to be
gathered here, but I suppose their usefulness would be greatly increased
if they were gathered across all data/actions, not only the ones that the
users turned them on for. So I think ordinary users have no business
controlling these settings.

--
Peter Eisentraut   peter_e@gmx.net   http://funkturm.homeip.net/~peter
Peter Eisentraut <peter_e@gmx.net> writes:
> I'm not familiar with the kinds of statistics that are supposed to be
> gathered here, but I suppose their usefulness would be greatly increased
> if they were gathered across all data/actions, not only the ones that the
> users turned them on for. So I think ordinary users have no business
> controlling these settings.

Okay, the per-backend GUC variables should be SUSET instead of USERSET.
I don't have a problem with that ...

regards, tom lane