Thread: procpid?
Can someone explain why pg_stat_activity has a column named procpid and not simply pid? 'pid' is what pg_locks uses, and 'procpid' is redundant (proc-process-id). A mistake? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Bruce Momjian <bruce@momjian.us> writes: > Can someone explain why pg_stat_activity has a column named procpid and > not simply pid? 'pid' is what pg_locks uses, and 'procpid' is redundant > (proc-process-id). A mistake? Mistake or not, it's about half a dozen releases too late to change it. regards, tom lane
On Thu, Jun 9, 2011 at 11:54 AM, Bruce Momjian <bruce@momjian.us> wrote: > Can someone explain why pg_stat_activity has a column named procpid and > not simply pid? 'pid' is what pg_locks uses, and 'procpid' is redundant > (proc-process-id). A mistake? Well, we refer to the slots that backends use as "procs" (really PGPROC), so I'm guessing that this was intended to mean "the pid associated with the proc". It might not be the greatest name but I can't see changing it now. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas wrote: > On Thu, Jun 9, 2011 at 11:54 AM, Bruce Momjian <bruce@momjian.us> wrote: > > Can someone explain why pg_stat_activity has a column named procpid and > > not simply pid? 'pid' is what pg_locks uses, and 'procpid' is redundant > > (proc-process-id). A mistake? > > Well, we refer to the slots that backends use as "procs" (really > PGPROC), so I'm guessing that this was intended to mean "the pid > associated with the proc". It might not be the greatest name but I > can't see changing it now. Agreed. Just pointing out this mistake slipped through. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Jun 9, 2011, at 11:29 AM, Robert Haas wrote: > On Thu, Jun 9, 2011 at 11:54 AM, Bruce Momjian <bruce@momjian.us> wrote: >> Can someone explain why pg_stat_activity has a column named procpid and >> not simply pid? 'pid' is what pg_locks uses, and 'procpid' is redundant >> (proc-process-id). A mistake? > > Well, we refer to the slots that backends use as "procs" (really > PGPROC), so I'm guessing that this was intended to mean "the pid > associated with the proc". It might not be the greatest name but I > can't see changing it now. It's damn annoying... enough so that I'd personally be in favor of creating a pid column that has the same data so we can deprecate procpid and eventually remove it... -- Jim C. Nasby, Database Architect jim@nasby.net 512.569.9461 (cell) http://jim.nasby.net
On Sat, Jun 11, 2011 at 12:02 AM, Jim Nasby <jim@nasby.net> wrote: > > It's damn annoying... enough so that I'd personally be in favor of creating a pid column that has the same data so we can deprecate > procpid and eventually remove it... well, if we start changing badly picked names we will have a *lot* of work to do... starting with the project's name ;) -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte 24x7 y capacitación
On 6/11/2011 1:02 AM, Jaime Casanova wrote: > On Sat, Jun 11, 2011 at 12:02 AM, Jim Nasby<jim@nasby.net> wrote: >> It's damn annoying... enough so that I'd personally be in favor of creating a pid column that has the same data so we can deprecate >> procpid and eventually remove it... > well, if we start changing badly picked names we will have a *lot* > of work to do... starting with the project's name ;) There is a difference between a project name and something that directly affects usability. +1 on fixing this. IMO, we don't create a new pid column, we just fix the problem. If we do it for 9.2, we have 18 months to communicate the change. Joshua D. Drake
Joshua D. Drake wrote: > On 6/11/2011 1:02 AM, Jaime Casanova wrote: > > On Sat, Jun 11, 2011 at 12:02 AM, Jim Nasby<jim@nasby.net> wrote: > >> It's damn annoying... enough so that I'd personally be in favor of creating a pid column that has the same data so we can deprecate > >> procpid and eventually remove it... > > well, if we start changing badly picked names we will have a *lot* > > of work to do... starting with the project's name ;) > > There is a difference between a project name and something that directly > affects usability. +1 on fixing this. IMO, we don't create a new pid > column, we just fix the problem. If we do it for 9.2, we have 18 months > to communicate the change. Uh, I am the first one I remember complaining about this so I don't see why we should break compatibility for such a low-level problem. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On 6/11/2011 1:23 PM, Bruce Momjian wrote: > >> There is a difference between a project name and something that directly >> affects usability. +1 on fixing this. IMO, we don't create a new pid >> column, we just fix the problem. If we do it for 9.2, we have 18 months >> to communicate the change. > Uh, I am the first one I remember complaining about this so I don't see > why we should break compatibility for such a low-level problem. > Because it is a very real problem with an easy fix. We have 18 months to publicize that fix. I mean really? This is a no-brainer. JD
On Sat, Jun 11, 2011 at 9:15 PM, Joshua D. Drake <jd@commandprompt.com> wrote: > On 6/11/2011 1:23 PM, Bruce Momjian wrote: >> >>> There is a difference between a project name and something that directly >>> affects usability. +1 on fixing this. IMO, we don't create a new pid >>> column, we just fix the problem. If we do it for 9.2, we have 18 months >>> to communicate the change. >> >> Uh, I am the first one I remember complaining about this so I don't see >> why we should break compatibility for such a low-level problem. > > Because it is a very real problem with an easy fix. We have 18 months to > publicize that fix. I mean really? This is a no-brainer. I really don't see what the big deal with calling it the process PID rather than just the PID is. Changing something like this forces pgAdmin and every other application out there that is built to work with PG to make a code change to keep working with PG. That seems like pushing a lot of unnecessary work on other people for what is basically a minor cosmetic issue. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
2011/6/12 Robert Haas <robertmhaas@gmail.com>: > On Sat, Jun 11, 2011 at 9:15 PM, Joshua D. Drake <jd@commandprompt.com> wrote: >> On 6/11/2011 1:23 PM, Bruce Momjian wrote: >>> >>>> There is a difference between a project name and something that directly >>>> affects usability. +1 on fixing this. IMO, we don't create a new pid >>>> column, we just fix the problem. If we do it for 9.2, we have 18 months >>>> to communicate the change. >>> >>> Uh, I am the first one I remember complaining about this so I don't see >>> why we should break compatibility for such a low-level problem. >> >> Because it is a very real problem with an easy fix. We have 18 months to >> publicize that fix. I mean really? This is a no-brainer. > > I really don't see what the big deal with calling it the process PID > rather than just the PID is. Changing something like this forces > pgAdmin and every other application out there that is built to work > with PG to make a code change to keep working with PG. That seems > like pushing a lot of unnecessary work on other people for what is > basically a minor cosmetic issue. I agree. This is at least a use-case for something^Wfeature like 'create synonym', allowing smooth end-user's application upgrade on schema update. I am not claiming that we need that, it just seems a good usecase for column alias/synonym. -- Cédric Villemain 2ndQuadrant http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support
On Sat, Jun 11, 2011 at 9:56 PM, Cédric Villemain <cedric.villemain.debian@gmail.com> wrote: > 2011/6/12 Robert Haas <robertmhaas@gmail.com>: >> On Sat, Jun 11, 2011 at 9:15 PM, Joshua D. Drake <jd@commandprompt.com> wrote: >>> On 6/11/2011 1:23 PM, Bruce Momjian wrote: >>>> >>>>> There is a difference between a project name and something that directly >>>>> affects usability. +1 on fixing this. IMO, we don't create a new pid >>>>> column, we just fix the problem. If we do it for 9.2, we have 18 months >>>>> to communicate the change. >>>> >>>> Uh, I am the first one I remember complaining about this so I don't see >>>> why we should break compatibility for such a low-level problem. >>> >>> Because it is a very real problem with an easy fix. We have 18 months to >>> publicize that fix. I mean really? This is a no-brainer. >> >> I really don't see what the big deal with calling it the process PID >> rather than just the PID is. Changing something like this forces >> pgAdmin and every other application out there that is built to work >> with PG to make a code change to keep working with PG. That seems >> like pushing a lot of unnecessary work on other people for what is >> basically a minor cosmetic issue. > > I agree. > This is at least a use-case for something^Wfeature like 'create > synonym', allowing smooth end-user's application upgrade on schema > update. I am not claiming that we need that, it just seems a good > usecase for column alias/synonym. I had the same thought. I'm not sure that this particular example would be worthwhile even if we had a column synonym facility. But at least if we were bent on changing it we could do it without breaking things. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On lör, 2011-06-11 at 16:23 -0400, Bruce Momjian wrote: > Uh, I am the first one I remember complaining about this so I don't > see why we should break compatibility for such a low-level problem. I complain about it every day to the wall. :)
Peter Eisentraut <peter_e@gmx.net> writes: > On lör, 2011-06-11 at 16:23 -0400, Bruce Momjian wrote: >> Uh, I am the first one I remember complaining about this so I don't >> see why we should break compatibility for such a low-level problem. > > I complain about it every day to the wall. :) +1 ! -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Jun 11, 2011, at 9:36 PM, Robert Haas wrote: >> This is at least a use-case for something^Wfeature like 'create >> synonym', allowing smooth end-user's application upgrade on schema >> update. I am not claiming that we need that, it just seems a good >> usecase for column alias/synonym. > > I had the same thought. I'm not sure that this particular example > would be worthwhile even if we had a column synonym facility. But at > least if we were bent on changing it we could do it without breaking > things. A synonym feature would definitely be useful for cases like this. We have a poorly named database at work; it's been that way for years and the only reason it's never been cleaned up is because it would require simultaneously changing config settings in dozens of places on hundreds of machines (many of which are user machines, which makes performing the change very difficult). As annoying as dealing with the oddball name is (there's a number of pieces of code that have to special-case it), it would be even more painful to fix the problem. If we had database name synonyms we could create a synonym and migrate everything over time... and in the meantime, code could stop special-casing it. -- Jim C. Nasby, Database Architect jim@nasby.net 512.569.9461 (cell) http://jim.nasby.net
On Mon, Jun 13, 2011 at 11:20 AM, Jim Nasby <jim@nasby.net> wrote: > On Jun 11, 2011, at 9:36 PM, Robert Haas wrote: >>> This is at least a use-case for something^Wfeature like 'create >>> synonym', allowing smooth end-user's application upgrade on schema >>> update. I am not claiming that we need that, it just seems a good >>> usecase for column alias/synonym. >> >> I had the same thought. I'm not sure that this particular example >> would be worthwhile even if we had a column synonym facility. But at >> least if we were bent on changing it we could do it without breaking >> things. > > A synonym feature would definitely be useful for cases like this. We have a poorly named database at work; it's been that way for years and the only reason it's never been cleaned up is because it would require simultaneously changing config settings in dozens of places on hundreds of machines (many of which are user machines, which makes performing the change very difficult). As annoying as dealing with the oddball name is (there's a number of pieces of code that have to special-case it), it would be even more painful to fix the problem. If we had database name synonyms we could create a synonym and migrate everything over time... and in the meantime, code could stop special-casing it. That's probably the best explanation of why synonyms would be useful I believe I've yet heard. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Jun 13, 2011, at 10:22 AM, Robert Haas wrote: >> A synonym feature would definitely be useful for cases like this. We have a poorly named database at work; it's been that way for years and the only reason it's never been cleaned up is because it would require simultaneously changing config settings in dozens of places on hundreds of machines (many of which are user machines, which makes performing the change very difficult). As annoying as dealing with the oddball name is (there's a number of pieces of code that have to special-case it), it would be even more painful to fix the problem. If we had database name synonyms we could create a synonym and migrate everything over time... and in the meantime, code could stop special-casing it. > > That's probably the best explanation of why synonyms would be useful I > believe I've yet heard. FWIW, I've asked Command Prompt to look into creating database name synonyms for us, but perhaps there are other synonyms that would make sense? I can't really think of any other cases where you care about name and don't have a way to work around it (ie: column and tables can be done with views; you can grant a role to another role; you can create a wrapper function). -- Jim C. Nasby, Database Architect jim@nasby.net 512.569.9461 (cell) http://jim.nasby.net
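The view workaround Jim mentions for column names can be sketched roughly as below. This is only an illustration, not something proposed in the thread: the view name stat_activity_compat is hypothetical, and the query assumes a release in which pg_stat_activity still calls the column procpid.

```sql
-- Hypothetical compatibility view (the name is an assumption, not part of PostgreSQL).
-- It exposes the backend PID under the name "pid" while keeping every original
-- column, including procpid, so existing queries against the columns still work.
CREATE VIEW stat_activity_compat AS
SELECT a.procpid AS pid,
       a.*
FROM pg_catalog.pg_stat_activity AS a;

-- Tools that prefer the pg_locks-style name can then select it directly.
SELECT pid, usename, current_query
FROM stat_activity_compat;
```

A view like this lives in a user schema and has to be created in every database that needs it, which is part of why it is a workaround rather than a fix for the catalog itself.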
On Sun, Jun 12, 2011 at 2:23 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Sat, Jun 11, 2011 at 9:15 PM, Joshua D. Drake <jd@commandprompt.com> wrote: >> On 6/11/2011 1:23 PM, Bruce Momjian wrote: >>> >>>> There is a difference between a project name and something that directly >>>> affects usability. +1 on fixing this. IMO, we don't create a new pid >>>> column, we just fix the problem. If we do it for 9.2, we have 18 months >>>> to communicate the change. >>> >>> Uh, I am the first one I remember complaining about this so I don't see >>> why we should break compatibility for such a low-level problem. >> >> Because it is a very real problem with an easy fix. We have 18 months to >> publicize that fix. I mean really? This is a no-brainer. > > I really don't see what the big deal with calling it the process PID > rather than just the PID is. Changing something like this forces > pgAdmin and every other application out there that is built to work > with PG to make a code change to keep working with PG. That seems > like pushing a lot of unnecessary work on other people for what is > basically a minor cosmetic issue. +1 If we were going to make changes like this, I'd suggest we save them up in a big bag for when we change major version number. Everybody in the world thinks that PostgreSQL v8 is compatible across all versions (8.0, 8.1, 8.2, 8.3, 8.4), and it will be same with v9. That way we would still have forward progress, but in more sensible sized steps. Otherwise we just break the code annually for all the people that support us. If we had a more stable environment for tools vendors, maybe people wouldn't need to be manually typing procpid anyway... -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Jun 13, 2011 at 11:56 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > +1 > > If we were going to make changes like this, I'd suggest we save them > up in a big bag for when we change major version number. Everybody in > the world thinks that PostgreSQL v8 is compatible across all versions > (8.0, 8.1, 8.2, 8.3, 8.4), and it will be same with v9. That way we > would still have forward progress, but in more sensible sized steps. > Otherwise we just break the code annually for all the people that > support us. If we had a more stable environment for tools vendors, > maybe people wouldn't need to be manually typing procpid anyway... Amen. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Simon Riggs wrote: > On Sun, Jun 12, 2011 at 2:23 AM, Robert Haas <robertmhaas@gmail.com> wrote: > > On Sat, Jun 11, 2011 at 9:15 PM, Joshua D. Drake <jd@commandprompt.com> wrote: > >> On 6/11/2011 1:23 PM, Bruce Momjian wrote: > >>> > >>>> There is a difference between a project name and something that directly > >>>> affects usability. +1 on fixing this. IMO, we don't create a new pid > >>>> column, we just fix the problem. If we do it for 9.2, we have 18 months > >>>> to communicate the change. > >>> > >>> Uh, I am the first one I remember complaining about this so I don't see > >>> why we should break compatibility for such a low-level problem. > >> > >> Because it is a very real problem with an easy fix. We have 18 months to > >> publicize that fix. I mean really? This is a no-brainer. > > > > I really don't see what the big deal with calling it the process PID > > rather than just the PID is. Changing something like this forces > > pgAdmin and every other application out there that is built to work > > with PG to make a code change to keep working with PG. That seems > > like pushing a lot of unnecessary work on other people for what is > > basically a minor cosmetic issue. > > +1 > > If we were going to make changes like this, I'd suggest we save them > up in a big bag for when we change major version number. Everybody in > the world thinks that PostgreSQL v8 is compatible across all versions > (8.0, 8.1, 8.2, 8.3, 8.4), and it will be same with v9. That way we > would still have forward progress, but in more sensible sized steps. > Otherwise we just break the code annually for all the people that > support us. If we had a more stable environment for tools vendors, > maybe people wouldn't need to be manually typing procpid anyway... Agreed. I did add a C comment that this was misnamed so when we are in that code we will see it.
I did reorder the pg_stat_activity columns in 9.0 for sanity, and no one complained, but renaming is more disruptive than reordering. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Jun 13, 2011, at 10:56 AM, Simon Riggs wrote: > If we were going to make changes like this, I'd suggest we save them > up in a big bag for when we change major version number. Everybody in > the world thinks that PostgreSQL v8 is compatible across all versions > (8.0, 8.1, 8.2, 8.3, 8.4), and it will be same with v9. That way we > would still have forward progress, but in more sensible sized steps. > Otherwise we just break the code annually for all the people that > support us. If we had a more stable environment for tools vendors, > maybe people wouldn't need to be manually typing procpid anyway... Wouldn't it be better still to have both the new and old columns available for a while? That would produce the minimum amount of disruption to tools, etc. The only downside is some potential confusion, but that would just serve to drive people to the documentation to see why there were two fields, where they would find out one was deprecated. -- Jim C. Nasby, Database Architect jim@nasby.net 512.569.9461 (cell) http://jim.nasby.net
Jim Nasby wrote: > On Jun 13, 2011, at 10:56 AM, Simon Riggs wrote: > > If we were going to make changes like this, I'd suggest we save them > > up in a big bag for when we change major version number. Everybody in > > the world thinks that PostgreSQL v8 is compatible across all versions > > (8.0, 8.1, 8.2, 8.3, 8.4), and it will be same with v9. That way we > > would still have forward progress, but in more sensible sized steps. > > Otherwise we just break the code annually for all the people that > > support us. If we had a more stable environment for tools vendors, > > maybe people wouldn't need to be manually typing procpid anyway... > > Wouldn't it be better still to have both the new and old columns > available for a while? That would produce the minimum amount of > disruption to tools, etc. The only downside is some potential confusion, > but that would just serve to drive people to the documentation to see > why there were two fields, where they would find out one was deprecated. Well, someone doing SELECT *, which is probably 90% of the users, is going to be pretty confused by duplicate columns, asking, "What is the difference?" For those people this would make things worse than they are now. I would say 90% of users are doing SELECT *, and 10% are joining to other tables or displaying specific columns. We want to help that 10% without making that 90% confused. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On 06/14/2011 11:44 AM, Jim Nasby wrote: > Wouldn't it be better still to have both the new and old columns > available for a while? That would produce the minimum amount of > disruption to tools, etc. Doing this presumes the existence of a large number of tools where the author is unlikely to be keeping up with PostgreSQL development. I don't believe that theorized set of users actually exists. There are people who use pg_stat_activity simply, and there are tool authors who are heavily involved enough that they will see a change here coming far enough in advance to adopt it without disruption. If there's a large base of "casual" tool authors, who wrote something using pg_stat_activity once and will never update it again, I don't know where they are. Anyway, I want a larger change to pg_stat_activity than this one, and I would just roll fixing this column name into that more disruptive and positive change. Right now the biggest problem with this view is that you have to parse the text of the query to figure out what state the connection is in. This is silly; there should be boolean values exposed for "idle" and "in transaction". I want to be able to write things like this: SELECT idle,in_trans,count(*) FROM pg_stat_activity GROUP BY idle,in_trans; SELECT min(backend_start) FROM pg_stat_activity WHERE idle; Right now the standard approach to this is to turn current_query into a derived state value using CASE statements. It's quite unfriendly, and a bigger problem than this procpid mismatch. Fix that whole mess at once, and now you've got something useful enough to justify breaking tools. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
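For readers who have not seen it, the CASE idiom Greg refers to tends to look something like the sketch below. It assumes the marker strings '<IDLE>' and '<IDLE> in transaction' that current_query reports in releases of this era; the output column name "state" is purely illustrative.

```sql
-- Derive a connection state by parsing current_query, then count backends per state.
-- Assumes idle backends report the literal markers '<IDLE>' and
-- '<IDLE> in transaction'; anything else is treated as an active query.
SELECT CASE
           WHEN current_query = '<IDLE>' THEN 'idle'
           WHEN current_query = '<IDLE> in transaction' THEN 'idle in transaction'
           ELSE 'active'
       END AS state,
       count(*) AS backends
FROM pg_catalog.pg_stat_activity
GROUP BY 1
ORDER BY 1;
```

One caveat with this kind of string matching: a role that is not allowed to see another user's query text gets a placeholder instead, so those rows would fall into the 'active' bucket here. That is the sort of fragility dedicated state columns would avoid.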
On Tue, Jun 14, 2011 at 12:25 PM, Greg Smith <greg@2ndquadrant.com> wrote: > > Anyway, I want a larger change to pg_stat_activity than this one Well, Simon recommended having a big bag of changes that justify breaking tools... and you have presented one good item for that bag... Maybe we should start a wiki page for this and put there all the changes we want to see before we break anything? For example, a change I want to see is in csvlog: I want a duration field there, because tools like pgfouine, pgsi and others parse the message field for a "duration" string, which is only useful if the message is in English, which non-English DBAs won't have -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte 24x7 y capacitación
On Tue, Jun 14, 2011 at 1:43 PM, Jaime Casanova <jaime@2ndquadrant.com> wrote: > On Tue, Jun 14, 2011 at 12:25 PM, Greg Smith <greg@2ndquadrant.com> wrote: >> >> Anyway, I want a larger change to pg_stat_activity than this one > > Well, Simon recommended having a big bag of changes that justify breaking > tools... and you have presented one good item for that bag... > Maybe we should start a wiki page for this and put there all the > changes we want to see before we break anything? > > For example, a change I want to see is in csvlog: I want a duration > field there, because tools like pgfouine, pgsi and others parse the > message field for a "duration" string, which is only useful if the > message is in English, which non-English DBAs won't have There are real problems with the idea of having one release where we break everything that we want to break - mostly from a process standpoint. We aren't always good at being organized and disciplined, and coming up with a multi-year plan to break everything all at once in 2014 for release in 2015 may be difficult, because it requires a consensus on release management to hold together for years, and sometimes we can't even manage "days". But I don't think it's a bad idea to try. So +1 for creating a list of things that we think we might like to break at some point. It might be worth trying to do this in the context of the Todo list - come up with some special badge or flag that we can put on items that require a compatibility break, so that we can scan for them there easily. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Greg Smith <greg@2ndQuadrant.com> wrote: > Doing this presumes the existence of a large number of tools where > the author is unlikely to be keeping up with PostgreSQL > development. I don't believe that theorized set of users actually > exists. There could be a number of queries used for monitoring or administration which will be affected. Just on our Wiki pages we have some queries available for copy/paste which would need multiple versions while both column names were in supported versions of the software: http://wiki.postgresql.org/wiki/Lock_dependency_information http://wiki.postgresql.org/wiki/Lock_Monitoring http://wiki.postgresql.org/wiki/Backend_killer_function I agree that these are manageable, but not necessarily trivial. (You should see how long it can take to get them to install new monitoring software to our centralized system here.) I think that's consistent with the "save up our breaking changes to do them all at once" approach. -Kevin
On 06/14/2011 02:20 PM, Kevin Grittner wrote: > Just on our Wiki pages we have some queries available for copy/paste > which would need multiple > versions while both column names were in supported versions of the > software: > > http://wiki.postgresql.org/wiki/Lock_dependency_information > http://wiki.postgresql.org/wiki/Lock_Monitoring > http://wiki.postgresql.org/wiki/Backend_killer_function > ...and most of these would actually be simplified if they could just JOIN on pid instead of needing this common idiom: join pg_catalog.pg_stat_activity ka on kl.pid = ka.procpid Yes, there are a lot of these floating around. I'd bet that in an hour of research I could find 95% of them though, and make sure they were all updated in advance of the release. (I already did most of this search as part of stealing every good idea I could find in this area for my book) > I think that's consistent with the "save up our breaking changes to do them all at > once" approach. > I don't actually buy into this whole idea at all. We already have this big wall at 8.3 because changes made in that release are too big for people on the earlier side to upgrade past. I'd rather see a series of smaller changes in each release, even if they are disruptive, so that no one version turns into a frustrating hurdle seen as impossible to clear. This adjustment is a perfect candidate for putting into 9.2 to me, because I'd rather reduce max(breakage) across releases than intentionally aim at increasing it by bundling them into larger clumps. For me, the litmus test is whether the change provides enough improvement that it outweighs the disruption when the user runs into it. This is why I suggested a specific, useful, and commonly requested (to me at least) change to pg_stat_activity go along with this. If people discover their existing pg_stat_activity tools break, presumably they're going to look at the view again to see what changed.
When they do that, I don't want the reaction to be "why was this random change made?" I want it to be "look, there are useful new fields in here; let me see if I can use them too here". That's how you make people tolerate disruption in upgrades. If they see a clear improvement in the same spot when forced to fix around it, the experience is much more pleasant if they get something new out of it too. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
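To make the join idiom Greg quotes concrete, here is a minimal sketch of a lock-listing query in that style. It is not taken from the wiki pages cited above; the column selection is an arbitrary choice for illustration, and the commented-out variant is hypothetical, valid only if the column were renamed to pid as discussed in this thread.

```sql
-- Current idiom: pg_locks calls the column pid, pg_stat_activity calls it procpid,
-- so the join condition has to spell out both names.
SELECT l.pid,
       l.locktype,
       l.mode,
       l.granted,
       a.usename,
       a.current_query
FROM pg_catalog.pg_locks AS l
JOIN pg_catalog.pg_stat_activity AS a ON l.pid = a.procpid;

-- Hypothetical simplification if both views exposed the column as pid:
--   SELECT pid, l.locktype, l.mode, l.granted, a.usename
--   FROM pg_catalog.pg_locks AS l
--   JOIN pg_catalog.pg_stat_activity AS a USING (pid);
```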
Excerpts from Bruce Momjian's message of mar jun 14 12:59:15 -0400 2011: > Well, someone doing SELECT *, which is probably 90% of the users, is > going to be pretty confused by duplicate columns, asking, "What is the > difference?" For those people this would make things worse than they > are now. > > I would say 90% of users are doing SELECT *, and 10% are joining to > other tables or displaying specific columns. We want to help that 10% > without making that 90% confused. I think if you had column synonyms, you would get only a single one when doing "select *". The other name would still be accepted in a query that explicitly asked for it. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On tis, 2011-06-14 at 13:50 -0400, Robert Haas wrote: > There are real problems with the idea of having one release where we > break everything that we want to break - mostly from a process > standpoint. We aren't always good at being organized and disciplined, > and coming up with a multi-year plan to break everything all at once > in 2014 for release in 2015 may be difficult, because it requires a > consensus on release management to hold together for years, and > sometimes we can't even manage "days". I have had this fantasy of a break-everything release for a long time as well, but frankly, experience from other projects such as Python 3, Perl 6, KDE 4, Samba 4, add-yours-here, indicates that such things might not work out so well. OK, some of those were rewrites as well as interface changes, but the effect visible to the end user is mostly the same.
Peter Eisentraut wrote: > On tis, 2011-06-14 at 13:50 -0400, Robert Haas wrote: > > There are real problems with the idea of having one release where we > > break everything that we want to break - mostly from a process > > standpoint. We aren't always good at being organized and disciplined, > > and coming up with a multi-year plan to break everything all at once > > in 2014 for release in 2015 may be difficult, because it requires a > > consensus on release management to hold together for years, and > > sometimes we can't even manage "days". > > I have had this fantasy of a break-everything release for a long time as > well, but frankly, experience from other projects such as Python 3, Perl > 6, KDE 4, Samba 4, add-yours-here, indicates that such things might not > work out so well. > > OK, some of those were rewrites as well as interface changes, but the > effect visible to the end user is mostly the same. Funny you mentioned Perl 6 because I just blogged about that: http://momjian.us/main/blogs/pgblog/2011.html#June_14_2011 -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Peter Eisentraut <peter_e@gmx.net> writes: > On tis, 2011-06-14 at 13:50 -0400, Robert Haas wrote: >> There are real problems with the idea of having one release where we >> break everything that we want to break - mostly from a process >> standpoint. We aren't always good at being organized and disciplined, >> and coming up with a multi-year plan to break everything all at once >> in 2014 for release in 2015 may be difficult, because it requires a >> consensus on release management to hold together for years, and >> sometimes we can't even manage "days". > I have had this fantasy of a break-everything release for a long time as > well, but frankly, experience from other projects such as Python 3, Perl > 6, KDE 4, Samba 4, add-yours-here, indicates that such things might not > work out so well. > OK, some of those were rewrites as well as interface changes, but the > effect visible to the end user is mostly the same. Good point. I think the case that has actually been discussed is the idea of saving up binary-compatibility breaks (on-disk format changes). That seems sensible. It doesn't create a bigger problem for users, since a dump/reload is a dump/reload no matter how many individual format changes happened underneath. But we should be wary of applying that approach to application-visible incompatibilities. As far as Greg's proposal is concerned, I don't see how a proposed addition of two columns would justify renaming an existing column. Additions should not break any sanely-implemented application, but renamings certainly will. regards, tom lane
On 06/14/2011 06:00 PM, Tom Lane wrote: > As far as Greg's proposal is concerned, I don't see how a proposed > addition of two columns would justify renaming an existing column. > Additions should not break any sanely-implemented application, but > renamings certainly will. > It's not so much justification as something that makes the inevitable complaints easier to stomach, in terms of not leaving a really bad taste in the user's mouth. My thinking is that if we're going to mess with pg_stat_activity in a way that breaks something, I'd like to see it completely refactored for better usability in the process. If code breaks and the resulting investigation by the admin highlights something new, that offsets some of the bad user experience resulting from the breakage. Also, I haven't fully worked out whether it makes sense to really change what current_query means if the idle/transaction component of it gets moved to another column. Would it be better to set current_query to null if you are idle, rather than the way it's currently overloaded with text in that case? I don't like the way this view works at all, but I'm not sure of the best way to change it. Just changing procpid wouldn't be the only thing on the list though. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
Greg Smith wrote: > On 06/14/2011 06:00 PM, Tom Lane wrote: > > As far as Greg's proposal is concerned, I don't see how a proposed > > addition of two columns would justify renaming an existing column. > > Additions should not break any sanely-implemented application, but > > renamings certainly will. > > > > It's not so much justification as something that makes the inevitable > complaints easier to stomach, in terms of not leaving a really bad taste > in the user's mouth. My thinking is that if we're going to mess with > pg_stat_activity in a way that breaks something, I'd like to see it > completely refactored for better usability in the process. If code > breaks and the resulting investigation by the admin highlights something > new, that offsets some of the bad user experience resulting from the > breakage. > > Also, I haven't fully worked whether it makes sense to really change > what current_query means if the idle/transaction component of it gets > moved to another column. Would it be better to set current_query to > null if you are idle, rather than the way it's currently overloaded with > text in that case? I don't like the way this view works at all, but I'm > not sure the best way to change it. Just changing procpid wouldn't be > the only thing on the list though. Agreed on moving '<IDLE>' and '<IDLE> in transaction' into separate fields. If I had thought of it I would have done it that way years ago. (At least I think it was me.) Using angle brackets to put magic values in that field was clearly wrong. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Wed, Jun 15, 2011 at 2:50 AM, Bruce Momjian <bruce@momjian.us> wrote: > Agreed on moving '<IDLE>' and '<IDLE> in transaction' into separate > fields. If I had thought of it I would have done it that way years ago. > (At least I think it was me.) Using angle brackets to put magic values > in that field was clearly wrong. I think of these as just placeholders in the SQL text field for cases where there's no SQL text available. But they do clearly indicate a need for columns with this information. For what it's worth, Oracle provides a whole list of states the transaction can be in: it can be waiting for client traffic, waiting on i/o, waiting on a lock, etc. Separately, whether the session is in a transaction might need to become slightly richer than a boolean now that we have snapshot management. You can be in a transaction but not have any snapshots, or be in the traditional state where you have at least one snapshot. And if we do autonomous transactions the field might have to be much, much richer again. -- greg
-----BEGIN PGP SIGNED MESSAGE----- Hash: RIPEMD160 > For me, the litmus test is whether the change provides enough > improvement that it outweighs the disruption when the user runs into > it. For the procpid that started all of this, the clear answer is no. I'm surprised people seriously considered making this change. It's a historical accident: document and move on. And if we are going to talk about changing misnamed things, I've got a whole bunch of others I could throw at you (such as abbreviation rules: blks_read on the one extreme, and autovacuum_analyze_scale_factor on the other) :) > This is why I suggested a specific, useful, and commonly requested > (to me at least) change to pg_stat_activity go along with this. +1. The procpid change is silly, but fixing the current_query field would be very useful. You don't know how many times my fingers have typed "WHERE current_query <> '<IDLE>'" - -- Greg Sabino Mullane greg@turnstep.com End Point Corporation http://www.endpoint.com/ PGP Key: 0x14964AC8 201106142300 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -----BEGIN PGP SIGNATURE----- iEYEAREDAAYFAk34IRoACgkQvJuQZxSWSsi0dgCgi37mrLYbD6G3dS99GPbSFhHW EjYAniZNpRUXxYmhBHfb1k1LsMSoOHE7 =61nA -----END PGP SIGNATURE-----
On Tue, Jun 14, 2011 at 11:04 PM, Greg Sabino Mullane <greg@turnstep.com> wrote: >> For me, the litmus test is whether the change provides enough >> improvement that it outweighs the disruption when the user runs into >> it. > > For the procpid that started all of this, the clear answer is no. I'm > surprised people seriously considered making this change. It's a > historical accident: document and move on. I agree with you on this one... >> This is why I suggested a specific, useful, and commonly requested >> (to me at least) change to pg_stat_activity go along with this. > > +1. The procpid change is silly, but fixing the current_query field > would be very useful. You don't know how many times my fingers > have typed "WHERE current_query <> '<IDLE>'" ...but I'm not even excited about this. *Maybe* it's worth adding another column, but the problem with the existing system is *entirely* cosmetic. The string chosen here is unconfusable with an actual query, so we are talking here, as with the procpid -> pid proposal, ONLY about saving a few keystrokes when writing queries. That is a pretty thin justification for a compatibility break IMV. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Here's the sort of thing every person who writes a monitoring tool involving pg_stat_activity goes through:
1) Hurray! I know how to see what the database is doing now! Let me try counting all the connections so I can finally figure out what to set [max_connections | work_mem | other] to.
2) Wait, some of these can be "<IDLE>". That's not documented. I'll have to special case them because they don't really matter for my computation.
3) Seriously, there's another state for idle in a transaction? Just how many of these special values are there? [There's actually one more surprise after this]
The whole thing is enormously frustrating, and it's an advocacy problem--it contributes to people just starting to become serious about using PostgreSQL lowering their opinion of its suitability for their business. If this is what's included for activity monitoring, and it's this terrible, it suggests people must not have very high requirements for that. And what you end up with to make it better is not just another few keystrokes. Here, as a common example I re-use a lot, is a decoder inspired by Munin's connection count monitoring graph:
SELECT
  waiting,
  CASE WHEN current_query='<IDLE>' THEN true ELSE false END AS idle,
  CASE WHEN current_query='<IDLE> in transaction' THEN true ELSE false END AS idletransaction,
  CASE WHEN current_query='<insufficient privilege>' THEN false ELSE true END AS visible,
  CASE WHEN NOT waiting AND current_query NOT IN ('<IDLE>', '<IDLE> in transaction', '<insufficient privilege>') THEN true ELSE false END AS active,
  procpid, current_query
FROM pg_stat_activity
WHERE procpid != pg_backend_pid();
What percentage of people do you think get this right? Now, what does that number go to if these states were all obviously exposed booleans? As far as I'm concerned, this design is fundamentally flawed as currently delivered, so the concept of "breaking" it doesn't really make sense. 
The fact that you can only figure all this decoding magic out through extensive trial and error, or reading the source code to [the database | another monitoring tool], is crazy. It's a much bigger problem than the fact that the pid column is misnamed, and way up on my list of things I'm just really tired of doing. Yes, we could just document all these mystery states to help, but they'd still be terrible. This is a database; let's expose the data in a way that it's easy to slice yourself using a database query. And if we're going to fix that--which unfortunately will be breaking it relative to those already using the current format--I figure why not bundle the procpid fix into that while we're at it. It's even possible to argue that breaking that small thing will draw useful attention to the improvements in other parts of the view. Having your monitoring query break after a version upgrade is no fun. But if investigating why reveals new stuff you didn't notice in the release notes, the changes become more discoverable, albeit in a somewhat perverse way. Putting on my stability hat instead of my "make it right" one, maybe this really makes sense to expose as a view with a whole new name. Make this new one pg_activity (there's no stats here anyway), keep the old one around as pg_stat_activity for a few releases until everyone has converted to the new one. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
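To illustrate the point about exposed booleans, here is a minimal sketch of what a connection-count query looks like once the magic strings are decoded in one place. It assumes the 2011-era column names (procpid, waiting, current_query) and uses a VALUES list as a stand-in for pg_stat_activity so that it runs on its own; the pids and query strings are invented for illustration.

```sql
-- Mock rows standing in for pg_stat_activity as discussed in this thread.
-- All pids and query texts below are made up.
WITH activity(procpid, waiting, current_query) AS (
    VALUES (101, false, '<IDLE>'),
           (102, false, '<IDLE> in transaction'),
           (103, false, '<insufficient privilege>'),
           (104, true,  'UPDATE accounts SET balance = 0'),
           (105, false, 'SELECT 1')
),
-- Decode the overloaded text column into booleans exactly once.
decoded AS (
    SELECT procpid,
           waiting,
           current_query = '<IDLE>'                    AS idle,
           current_query = '<IDLE> in transaction'     AS idle_transaction,
           current_query <> '<insufficient privilege>' AS visible
    FROM activity
)
-- With booleans available, counting sessions per state is a plain aggregate.
SELECT count(*)                   AS total,
       sum(idle::int)             AS idle,
       sum(idle_transaction::int) AS idle_in_transaction,
       sum((NOT visible)::int)    AS not_visible,
       sum((visible AND NOT idle AND NOT idle_transaction AND NOT waiting)::int) AS active
FROM decoded;
```

With the five mock rows above, each of the four state counts comes out as 1 and the total as 5; the waiting UPDATE is deliberately not counted as active, matching the decoder query in the message.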
Following this whole conversation gives the impression that the topic is going to get lost in a nirvana of personal preferences. Most suggestions of change for its own sake are unlikely to get past the verdict of "not justifying a compatibility break". I wonder whether the actual point really is compatibility. On closer look this is more about a change in the paradigm of "system tables". It seems those were previously crafted with a human "reader" in mind more than a programmatic user. What is being requested sounds more like splitting access to system information into a level that is more appropriate for programmatic use (with all those basic properties being explicit) and a level more suited to being read. E.g. I much prefer reading an "<IDLE> in transaction" at a quick glance over having to search a column and tell a "t" from an "f" to find out whether there is a transaction pending or not. So maybe we need a (new) set of "tables/views" that provide detailed information designed for programmatic use as a basic interface layer, and then reconstruct the existing "tables/views" on top of those. That would allow all required changes to coexist without breaking compatibility. It also provides an easier basis for later "extensions" to such information. Anybody sticking with the existing interface will not suffer incompatibility, while anybody in need of more details and "better" information may switch over to the new basic layer. (And I doubt adding that "extra" level will cause problems performance-wise...) Rainer On 15.06.2011 06:19, Robert Haas wrote: > On Tue, Jun 14, 2011 at 11:04 PM, Greg Sabino Mullane <greg@turnstep.com> wrote: >>> For me, the litmus test is whether the change provides enough >>> improvement that it outweighs the disruption when the user runs into >>> it. >> For the procpid that started all of this, the clear answer is no. I'm >> surprised people seriously considered making this change. 
It's a >> historical accident: document and move on. > I agree with you on this one... > >>> This is why I suggested a specific, useful, and commonly requested >>> (to me at least) change to pg_stat_activity go along with this. >> +1. The procpid change is silly, but fixing the current_query field >> would be very useful. You don't know how many times my fingers >> have typed "WHERE current_query <> '<IDLE>'" > ...but I'm not even excited about this. *Maybe* it's worth adding > another column, but the problem with the existing system is *entirely* > cosmetic. The string chosen here is unconfusable with an actual > query, so we are talking here, as with the procpid -> pid proposal, > ONLY about saving a few keystrokes when writing queries. That is a > pretty thin justification for a compatibility break IMV. >
On Wed, Jun 15, 2011 at 3:34 AM, Greg Smith <greg@2ndquadrant.com> wrote: > The whole thing is enormously frustrating, and it's an advocacy problem--it > contributes to people just starting to become serious about using PostgreSQL > lowering their opinion of its suitability for their business. If this is > what's included for activity monitoring, and it's this terrible, it suggest > people must not have very high requirements for that. Well, if we're going to start complaining about the lack of proper activity monitoring, the problems that you're talking about are just the tip of the iceberg. Don't even get me started. > Putting on my stability hat instead of my "make it right" one, maybe this > really makes sense to expose as a view with a whole new name. Make this new > one pg_activity (there's no stats here anyway), keep the old one around as > pg_stat_activity for a few releases until everyone has converted to the new > one. Now, that's a suggestion I could very possibly get behind. Though the fact that it would leave us with pg_activity / pg_stat_replication seems less than ideal. Maybe pg_activity isn't the best name either... bikeshedding time! -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Jun 14, 2011 at 9:50 PM, Bruce Momjian <bruce@momjian.us> wrote:
> Greg Smith wrote:
> > On 06/14/2011 06:00 PM, Tom Lane wrote:
> > > As far as Greg's proposal is concerned, I don't see how a proposed
> > > addition of two columns would justify renaming an existing column.
> > > Additions should not break any sanely-implemented application, but
> > > renamings certainly will.
> > >
> >
> > It's not so much justification as something that makes the inevitable
> > complaints easier to stomach, in terms of not leaving a really bad taste
> > in the user's mouth. My thinking is that if we're going to mess with
> > pg_stat_activity in a way that breaks something, I'd like to see it
> > completely refactored for better usability in the process. If code
> > breaks and the resulting investigation by the admin highlights something
> > new, that offsets some of the bad user experience resulting from the
> > breakage.
> >
> > Also, I haven't fully worked whether it makes sense to really change
> > what current_query means if the idle/transaction component of it gets
> > moved to another column. Would it be better to set current_query to
> > null if you are idle, rather than the way it's currently overloaded with
> > text in that case? I don't like the way this view works at all, but I'm
> > not sure the best way to change it. Just changing procpid wouldn't be
> > the only thing on the list though.
>
> Agreed on moving '<IDLE>' and '<IDLE> in transaction' into separate
> fields. If I had thought of it I would have done it that way years ago.
> (At least I think it was me.) Using angle brackets to put magic values
> in that field was clearly wrong.
FWIW, I wrote a monitoring query around it like this (the requirement was to not expose the current_query contents).
SELECT datname, procpid, usename, backend_start, xact_start, query_start,
       waiting AS is_waiting,
       current_query = $$<IDLE>$$ AS is_idle,
       current_query = $$<IDLE> in transaction$$ AS is_idle_in_transaction,
       current_query ILIKE $$VACUUM%$$ AS is_vacuum,
       client_port IS NULL AND (current_query LIKE $$autovacuum:%$$ OR current_query LIKE $$VACUUM%$$) AS is_autovacuum,
       now() AS capture_time
FROM pg_catalog.pg_stat_activity
The tricky part was to determine how long a connection has been in the state that it currently is in. Since the various *_start columns are changed only as needed, I had to use the following expression to calculate that.
(capture_time - COALESCE(query_start, xact_start, backend_start))::interval
query_start is changed every time the current_query value changes, but it is NULL if the backend has just started. Similarly, xact_start changes whenever the backend goes into or comes out of a transaction, but it is NULL when the backend has just started. backend_start is never NULL, so we can fall back on that when nothing else is available (i.e., when the backend has just started).
If we split is_idle and is_idle_in_transaction out into separate fields, then we would also need to expose when the backend entered that state, unless we promise to keep true the assumptions that were made when writing the above query (which is not as straightforward as one would expect).
--
Gurjeet Singh
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
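The COALESCE fallback described in this message can be tried in isolation. The following is a small sketch under stated assumptions: the three timestamp columns carry the names used in the message, the rows are invented mock data in a VALUES list rather than real pg_stat_activity output, and a fixed timestamp stands in for the capture_time that the original query takes from now().

```sql
-- Mock sessions; every pid and timestamp here is invented for illustration.
WITH activity(procpid, backend_start, xact_start, query_start) AS (
    VALUES
      -- backend has just started: no transaction or query timestamps yet
      (201, '2011-06-15 09:00:00+00'::timestamptz, NULL::timestamptz, NULL::timestamptz),
      -- idle outside a transaction: only the last query start is known
      (202, '2011-06-15 08:00:00+00'::timestamptz, NULL::timestamptz, '2011-06-15 08:30:00+00'::timestamptz),
      -- running a query inside a transaction: all three are set
      (203, '2011-06-15 07:00:00+00'::timestamptz, '2011-06-15 08:45:00+00'::timestamptz, '2011-06-15 08:50:00+00'::timestamptz)
)
SELECT procpid,
       -- fixed stand-in for capture_time, so the result is reproducible
       '2011-06-15 09:05:00+00'::timestamptz
         - COALESCE(query_start, xact_start, backend_start) AS time_in_state
FROM activity
ORDER BY procpid;
-- 201 -> 00:05:00 (falls back to backend_start)
-- 202 -> 00:35:00 (uses query_start)
-- 203 -> 00:15:00 (uses query_start, ignoring the earlier xact_start)
```

Row 203 shows the assumption the message warns about: the expression measures time since the last query started, which is only "time in the current state" if query_start really is updated on every state change.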
On Wed, Jun 15, 2011 at 8:47 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Jun 15, 2011 at 3:34 AM, Greg Smith <greg@2ndquadrant.com> wrote:
> > The whole thing is enormously frustrating, and it's an advocacy problem--it
> > contributes to people just starting to become serious about using PostgreSQL
> > lowering their opinion of its suitability for their business. If this is
> > what's included for activity monitoring, and it's this terrible, it suggest
> > people must not have very high requirements for that.
>
> Well, if we're going to start complaining about the lack of proper
> activity monitoring, the problems that you're talking about are just
> the tip of the iceberg. Don't even get me started.
>
> > Putting on my stability hat instead of my "make it right" one, maybe this
> > really makes sense to expose as a view with a whole new name. Make this new
> > one pg_activity (there's no stats here anyway), keep the old one around as
> > pg_stat_activity for a few releases until everyone has converted to the new
> > one.
>
> Now, that's a suggestion I could very possibly get behind. Though the
> fact that it would leave us with pg_activity / pg_stat_replication
> seems less than ideal. Maybe pg_activity isn't the best name
> either... bikeshedding time!
Why not expose this new information as functions instead of a new view, like we do for pg_is_in_replication(). People can use whatever alias they want in the queries they write.
SELECT get_current_query(pid), is_idle(pid), is_idle_in_transaction(pid), transaction_start_time(pid), .... FROM (select procpid as pid FROM pg_stat_activity);
Then pg_activity (or whatever we name it later) would also be a view on top of these functions.
Gurjeet Singh
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
On 06/14/2011 08:04 PM, Greg Sabino Mullane wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: RIPEMD160 > > >> For me, the litmus test is whether the change provides enough >> improvement that it outweighs the disruption when the user runs into >> it. > > For the procpid that started all of this, the clear answer is no. I'm > surprised people seriously considered making this change. It's a > historical accident: document and move on. It is a bug in consistency, the table pg_locks uses "pid" where pg_stat_activity uses "procpid". That is a bug and all bugs are accidents. We take a lot of care in fixing bugs. This isn't just about a few characters in a query, it is about consistency and providing an overall more sane user experience. Frankly I don't care if we use procpid or pid but it should be one or the other not both. Joshua D. Drake -- Command Prompt, Inc. - http://www.commandprompt.com/ PostgreSQL Support, Training, Professional Services and Development The PostgreSQL Conference - http://www.postgresqlconference.org/ @cmdpromptinc - @postgresconf - 509-416-6579
On Wed, Jun 15, 2011 at 9:44 AM, Gurjeet Singh <singh.gurjeet@gmail.com> wrote: > Why not expose this new information as functions instead of a new view, like > we do for pg_is_in_replication(). People can use whatever alias they want in > the queries they write. > > SELECT get_current_query(pid), is_idle(pid), is_idle_in_transaction(pid), > transaction_start_time(pid), .... FROM (select procpid as pid FROM > pg_stat_activity); > > Then pg_activity (or whatever we name it later) would also be a view on top > of these functions. Well, that would probably be a lot slower, and wouldn't necessarily deliver as consistent a snapshot of system activity. It's better to have one set-returning function that dumps out all the data in a single pass. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
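For comparison with the function-per-column style being discussed: the functions named in the proposal (get_current_query, is_idle and so on) are hypothetical, but PostgreSQL's statistics documentation already lists a family of per-backend accessor functions that work along similar lines. The sketch below uses three of them and is meant only to show the calling pattern, not as a recommendation.

```sql
-- One row per backend, built from per-backend accessor functions.
-- Each function is called separately for every backend id, which is the
-- per-call overhead (and potential inconsistency) raised in the reply above.
SELECT pg_stat_get_backend_pid(s.backendid)            AS pid,
       pg_stat_get_backend_activity(s.backendid)       AS current_query,
       pg_stat_get_backend_activity_start(s.backendid) AS query_start
FROM (SELECT pg_stat_get_backend_idset() AS backendid) AS s;
```

Note that the subquery in FROM carries an alias (AS s); the query sketched in the quoted proposal omits one, which PostgreSQL releases of that era reject.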
On Wed, Jun 15, 2011 at 10:31 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Jun 15, 2011 at 9:44 AM, Gurjeet Singh <singh.gurjeet@gmail.com> wrote:
> > Why not expose this new information as functions instead of a new view, like
> > we do for pg_is_in_replication(). People can use whatever alias they want in
> > the queries they write.
> >
> > SELECT get_current_query(pid), is_idle(pid), is_idle_in_transaction(pid),
> > transaction_start_time(pid), .... FROM (select procpid as pid FROM
> > pg_stat_activity);
> >
> > Then pg_activity (or whatever we name it later) would also be a view on top
> > of these functions.
>
> Well, that would probably be a lot slower, and wouldn't necessarily
> deliver as consistent a snapshot of system activity. It's better to
> have one set-returning function that dumps out all the data in a
> single pass.
I wanted to address the consistency issue in the previous mail, but then decided to leave that for later.
We can provide consistency the same way pg_locks does: take a snapshot on the first request within a transaction, and reuse that snapshot for subsequent calls. In this case we might want to go a bit finer grained by providing a snapshot for every query.
--
Gurjeet Singh
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Gurjeet Singh <singh.gurjeet@gmail.com> writes: > On Wed, Jun 15, 2011 at 10:31 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> Well, that would probably be a lot slower, and wouldn't necessarily >> deliver as consistent a snapshot of system activity. It's better to >> have one set-returning function that dumps out all the data in a >> single pass. > I wanted to address consistency issue in the previous mail, but then wanted > that to be left for later. > We can provide consistency the same way pg_locks provides; take a snapshot > on first request within a transaction, and reuse that snapshot for > subsequent calls. In this case we might want to go a bit finer grained by > providing a snapshot for every query. Quite honestly, the implementation mechanism used by the other statistics views is enormous overkill. I agree with Robert that I'm not eager to duplicate that for the activity view, when a simple SRF can get the job done. regards, tom lane
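The snapshot-per-transaction behaviour under discussion can be observed from a client session. This is a rough sketch based on my reading of the monitoring documentation, which describes session activity as being collected on first access within a transaction and then reused until the transaction ends or pg_stat_clear_snapshot() is called; treat the comments as that understanding rather than a guarantee for every release.

```sql
BEGIN;
-- First access in this transaction: the activity snapshot is taken here.
SELECT count(*) AS sessions_seen FROM pg_stat_activity;
-- Same transaction: this reads the cached snapshot, even if other
-- sessions have connected or disconnected in the meantime.
SELECT count(*) AS sessions_seen FROM pg_stat_activity;
-- Discard the cached snapshot so the next access collects fresh data.
SELECT pg_stat_clear_snapshot();
SELECT count(*) AS sessions_seen FROM pg_stat_activity;
COMMIT;
```

A monitoring tool that polls inside one long transaction would therefore keep seeing stale rows unless it clears the snapshot or uses a new transaction per poll.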
Excerpts from Robert Haas's message of mié jun 15 08:47:58 -0400 2011: > On Wed, Jun 15, 2011 at 3:34 AM, Greg Smith <greg@2ndquadrant.com> wrote: > > Putting on my stability hat instead of my "make it right" one, maybe this > > really makes sense to expose as a view with a whole new name. Make this new > > one pg_activity (there's no stats here anyway), keep the old one around as > > pg_stat_activity for a few releases until everyone has converted to the new > > one. > > Now, that's a suggestion I could very possibly get behind. Though the > fact that it would leave us with pg_activity / pg_stat_replication > seems less than ideal. Maybe pg_activity isn't the best name > either... bikeshedding time! pg_sessions? -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > Excerpts from Robert Haas's message of mié jun 15 08:47:58 -0400 2011: >> Now, that's a suggestion I could very possibly get behind. Though the >> fact that it would leave us with pg_activity / pg_stat_replication >> seems less than ideal. Maybe pg_activity isn't the best name >> either... bikeshedding time! > pg_sessions? Yeah. Or pg_stat_sessions if you want to keep it looking like it's part of the pg_stat_ family. (I'm not sure if we do, since it's really a completely independent facility. OTOH, if we don't name it that way, we're kind of bound to move the documentation into the System Views chapter, whereas it'd be better to keep it where it is.) regards, tom lane
On Wed, Jun 15, 2011 at 12:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: >> Excerpts from Robert Haas's message of mié jun 15 08:47:58 -0400 2011: >>> Now, that's a suggestion I could very possibly get behind. Though the >>> fact that it would leave us with pg_activity / pg_stat_replication >>> seems less than ideal. Maybe pg_activity isn't the best name >>> either... bikeshedding time! > >> pg_sessions? > > Yeah. Or pg_stat_sessions if you want to keep it looking like it's part > of the pg_stat_ family. (I'm not sure if we do, since it's really a > completely independent facility. OTOH, if we don't name it that way, > we're kind of bound to move the documentation into the System Views > chapter, whereas it'd be better to keep it where it is.) I've always found the fact that the system views are documented in two different places to be somewhat confusing. It doesn't help that the documentation for the statistics views is quite a bit less detailed. At any rate, I like "sessions". That's what it is, after all. But I will note that we had better be darn sure to make all the changes we want to make in one go, because I dowanna have to create pg_sessions2 (or pg_tessions?) in a year or three. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
-----BEGIN PGP SIGNED MESSAGE----- Hash: RIPEMD160 > At any rate, I like "sessions". That's what it is, after all. But I > will note that we had better be darn sure to make all the changes we > want to make in one go, because I dowanna have to create pg_sessions2 > (or pg_tessions?) in a year or three. Or perhaps pg_connections. Yes, +1 to making things fully backwards compatible by keeping pg_stat_activity around but making a better designed and better named table (view/SRF/whatever). Sounds like perhaps a wiki page to start documenting some of our monitoring shortcomings? Might as well fix as much as we can in one swoop. - -- Greg Sabino Mullane greg@turnstep.com End Point Corporation http://www.endpoint.com/ PGP Key: 0x14964AC8 201106151246 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -----BEGIN PGP SIGNATURE----- iEYEAREDAAYFAk344ioACgkQvJuQZxSWSshy9wCgnrj4lQkaomsgS55yq9KI0HBl P2UAoI62Tkt9/U62l0Bxv/KfQUUlL/NF =aaTL -----END PGP SIGNATURE-----
--On 15. June 2011 16:47:55 +0000 Greg Sabino Mullane <greg@turnstep.com> wrote: > Or perhaps pg_connections. Yes, +1 to making things fully backwards > compatible by keeping pg_stat_activity around but making a better > designed and better named table (view/SRF/whatever). I thought about that too when reading the thread the first time, but "pg_stat_sessions" sounds better. Our documentation also primarily refers to a database connection as a "session", I think. -- Thanks Bernd
On 06/15/2011 04:13 AM, Rainer Pruy wrote: > I much prefer reading an "<IDLE> in transaction" on a quick glance > over having to search a column and recognize a "t" from an "f" > to find out whether there is a transaction pending or not. > This is a fair observation. If we provide a second view here that reorganizes the data toward something more appropriate for monitoring systems to process it, you may be right that the result will be a step backwards for making it human-readable. They may end up being similar, co-existing views aimed at different uses, rather than one clearly replacing the other one day. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
Since the CF is upon us and discussion is settling, let's see if I can wrap this bikeshedding up into a more concrete proposal that someone can return to later. The ideas floating around have gelled into:
-Add a new pg_stat_sessions function that is implemented similarly to pg_stat_activity. For efficiency and simplicity's sake, internally this will use the same sort of SRF UI that pg_stat_get_activity does inside src/backend/utils/adt/pgstatfuncs.c. There will need to be some refactoring here to reduce code duplication between that and the new function (which will presumably be named pg_stat_get_sessions).
-The process ID field here will be named "pid" to match other system views, rather than the current "procpid".
-State information such as whether the session is idle, idle in a transaction, or has a query visible to this backend will be presented as booleans similar to the current waiting field. A possible additional state to expose is the concept of "active", which ends up being derived using logic like "visible && !idle && !idle_transaction && !waiting" in some monitoring systems.
-A case could be made for making some of these state fields null, instead of true or false, in situations where the session is not visible. If you don't have rights to see the connection activity, setting idle, idle_transaction, and active all to null may be the right thing to do. More future bikeshedding is likely on this part, once an initial patch is ready for testing. I'd want to get some specific tests against the common monitoring goals of tools like check_postgres and the Munin plug-in to see which implementation makes more sense for them as input on that.
-It is still useful to set current_query to descriptive text in the cases where the transaction is <IDLE> etc. That text is not ambiguous with a real query, it is useful for a human-readable view, and it improves the potential for pg_stat_sessions to fully replace a deprecated pg_stat_activity (instead of just co-existing with it). That the query text is overloaded with this information seems agreed to be a good thing; it's just that filtering on the state information there should not require parsing it. The additional booleans will handle that. If idle sessions can be filtered using "WHERE NOT idle", whether the current_query for them reads "<IDLE>" or is null won't matter to typical monitoring use. Given no strong preference there, using "<IDLE>" is both familiar and more human readable.
I'll go add this as a TODO now. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
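To make the shape of this proposal concrete, here is a minimal sketch of what such a view might look like. It is an illustration only, not a patch: it runs under SQLite through Python's sqlite3 module so that it is self-contained, the pg_stat_activity table is a three-column stand-in, and the view definition is an assumption built from the bullet points above. Visibility is left out, so "active" here is only "!idle && !idle_transaction && !waiting".

```python
import sqlite3

# Stand-in for pg_stat_activity, run under SQLite so the sketch is
# self-contained. Only the columns needed for the illustration exist.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pg_stat_activity (
    procpid       INTEGER,
    current_query TEXT,
    waiting       INTEGER   -- 0/1 stand-in for a boolean
);
INSERT INTO pg_stat_activity VALUES
    (101, '<IDLE>', 0),
    (102, '<IDLE> in transaction', 0),
    (103, 'SELECT 1', 0),
    (104, 'UPDATE t SET x = 1', 1);

-- Hypothetical pg_stat_sessions: "pid" instead of "procpid", with the
-- state pulled out of the overloaded query text into boolean columns.
CREATE VIEW pg_stat_sessions AS
SELECT procpid AS pid,
       current_query,
       waiting,
       current_query = '<IDLE>' AS idle,
       current_query = '<IDLE> in transaction' AS idle_transaction,
       (current_query <> '<IDLE>'
        AND current_query <> '<IDLE> in transaction'
        AND NOT waiting) AS active
FROM pg_stat_activity;
""")


def pids(where):
    """Return the pids of the sessions matching a WHERE clause."""
    sql = f"SELECT pid FROM pg_stat_sessions WHERE {where} ORDER BY pid"
    return [row[0] for row in conn.execute(sql)]


not_idle = pids("NOT idle")   # filter without parsing current_query
active = pids("active")
print(not_idle, active)
```

The monitoring query never has to look at current_query, which can keep its familiar "<IDLE>" text for human readers.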
On 06/15/2011 12:41 PM, Robert Haas wrote: > But I will note that we had better be darn sure to make all the changes we > want to make in one go, because I dowanna have to create pg_sessions2 > (or pg_tessions?) in a year or three. > I just added a new section to the TODO to start collecting up some of these related ideas into one place: http://wiki.postgresql.org/wiki/Todo#Monitoring so we might try to get as many as possible all in one go. The other item on there related to pg_stat_activity that might impact this design was adding a column for tracking progress of commands like CREATE INDEX and VACUUM (I updated to note CLUSTER falls into that category too). While query progress will always be a hard problem, adding a field to store some sort of progress indicator might be useful even if it only worked on these two initially. Anyway, topic for another time. The only other item related to this view on the TODO was "Have pg_stat_activity display query strings in the correct client encoding". That might be worthwhile to bundle into this rework, but it doesn't seem something that impacts the UI such that it must be considered early. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
>> Or perhaps pg_connections. Yes, +1 to making things fully backwards >> compatible by keeping pg_stat_activity around but making a better >> designed and better named table (view/SRF/whatever). > I thought about that too when reading the thread the first time, but > "pg_stat_sessions" sounds better. Our documentation also primarily refers to a > database connection as a "session", i think. No, this is clearly connections, not sessions. At least based on the items in the postgresql.conf file, especially max_connections (probably one of the items most closely associated with pg_stat_activity) -- Greg Sabino Mullane greg@turnstep.com End Point Corporation http://www.endpoint.com/ PGP Key: 0x14964AC8 201106161132 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
Greg Smith <greg@2ndQuadrant.com> writes: > The only other item related to this view on the TODO was "Have > pg_stat_activity display query strings in the correct client encoding". > That might be worthwhile to bundle into this rework, but it doesn't seem > something that impacts the UI such that it must be considered early. That entry is garbled to the point of uselessness anyway, as client encoding has got exactly zip to do with it. The point is that another backend's entry could be in a different *server* encoding, and what do you do if there's no equivalent character in your encoding? regards, tom lane
--On 16 June 2011 15:33:35 +0000 Greg Sabino Mullane <greg@turnstep.com> wrote: > No, this is clearly connections, not sessions. At least based on the items > in the postgresql.conf file, especially max_connections (probably one of the > items most closely associated with pg_stat_activity) Well, but it doesn't show database connection(s) only, it also shows what actions are currently performed through the various connections on the databases and state information about them. I'm not a native English speaker, but I have the feeling that "sessions" is better suited for this kind of interactive monitoring. I believe Oracle also has a v$session view to query various information about what's going on. -- Thanks Bernd
Tom Lane <tgl@sss.pgh.pa.us> wrote: > The point is that another backend's entry could be in a different > *server* encoding, and what do you do if there's no equivalent > character in your encoding? My first thought was that it was just a matter of picking a character to represent the "unprintable" characters. My second thought was that if you don't understand the encoding scheme, you're not even going to know where the character boundaries are. :-( -Kevin
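Both problems are easy to reproduce outside the server. The following is a small sketch using Python's built-in codecs, not anything from the PostgreSQL source; EUC-JP and Latin-1 are chosen purely as example encodings, and the sample string is an assumption.

```python
# Query text as another backend might hold it under a different server
# encoding (EUC-JP is only an example).
text = "日本語"
raw = text.encode("euc_jp")

# Tom's point: the reader's encoding may have no equivalent character.
# Latin-1 cannot represent these, so each one degrades to "?".
lossy = text.encode("latin-1", errors="replace")

# Kevin's point: reading the bytes under the wrong scheme loses the
# character boundaries, so three characters come out looking like six.
misread = raw.decode("latin-1")

print(len(text), len(raw), lossy, len(misread))
```

Substituting a placeholder only works in the first case, where the reader can still tell where each character starts and ends.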
Excerpts from Greg Sabino Mullane's message of Thu Jun 16 15:33:35 UTC 2011: > >> Or perhaps pg_connections. Yes, +1 to making things fully backwards > >> compatible by keeping pg_stat_activity around but making a better > >> designed and better named table (view/SRF/whatever). > > > I thought about that too when reading the thread the first time, but > > "pg_stat_sessions" sounds better. Our documentation also primarily refers to a > > database connection as a "session", i think. > > No, this is clearly connections, not sessions. At least based on the items > in the postgresql.conf file, especially max_connections (probably one of the > items most closely associated with pg_stat_activity) That doesn't include autovacuum, though, whereas the new view would. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Greg Smith wrote: > -It is still useful to set current_query to descriptive text in the > cases where the transaction is <IDLE> etc. That text is not ambiguous > with a real query, it is useful for a human-readable view, and it > improves the potential for pg_stat_sessions to fully replace a > deprecated pg_stat_activity (instead of just co-existing with it). That > the query text is overloaded with this information seems agreed to be a > good thing; it's just that filtering on the state information there > should not require parsing it. The additional booleans will handle > that. If idle sessions can be filtered using "WHERE NOT idle", whether > the current_query for them reads "<IDLE>" or is null won't matter to > typical monitoring use. Given no strong preference there, using > "<IDLE>" is both familiar and more human readable. Uh, if we are going to do that, why not just add the boolean columns to the existing view? Clearly renaming procpid isn't worth creating another view. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On 06/16/2011 05:27 PM, Bruce Momjian wrote: > Greg Smith wrote: > >> -It is still useful to set current_query to descriptive text in the >> cases where the transaction is<IDLE> etc. >> > Uh, if we are going to do that, why not just add the boolean columns to > the existing view? Clearly renaming procpid isn't worth creating > another view. > I'm not completely set on this either way; that's why I suggested a study that digs into typical monitoring system queries would be useful. Even the current view is pushing the limits for how much you can put into something that intends to be human-readable though. Adding a new pile of columns to it has some downsides there. I hadn't ever tried to write down everything I'd like to see changed here until this week, so there may be further column churn that justifies a new view too. I think the whole idea needs to get chewed on a bit more. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
On Fri, Jun 17, 2011 at 06:39, Greg Smith <greg@2ndquadrant.com> wrote: > On 06/16/2011 05:27 PM, Bruce Momjian wrote: >> >> Greg Smith wrote: >> >>> >>> -It is still useful to set current_query to descriptive text in the >>> cases where the transaction is<IDLE> etc. >>> >> >> Uh, if we are going to do that, why not just add the boolean columns to >> the existing view? Clearly renaming procpid isn't worth creating >> another view. >> > > I'm not completely set on this either way; that's why I suggested a study > that digs into typical monitoring system queries would be useful. Even the > current view is pushing the limits for how much you can put into something > that intends to be human-readable though. Adding a new pile of columns to > it has some downsides there. Is it intended for human-readable? And for human readable without specifying which part you want? It's already way too wide to fit in most terminals - and has been for years. You need to use \x unless you specify the fields. And if you want a "simpler version", why not just add all the columns to the existing one we need, and then create a regular VIEW over it that shows just the most common columns? But I still think you're going to find a hard time making even that narrow enough to be easily consumable - but you could certainly remove things like usesysid and datid which are mainly useful only for JOINing to other stuff. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Jun 16, 2011, at 9:31 AM, Greg Smith wrote: > -A case could be made for making some of these state fields null, instead true or false, in situations where the session is not visible. If you don't have rights to see the connection activity, setting idle, idle_transaction, and active all to null may be the right thing to do. More future bikeshedding is likely on this part, once an initial patch is ready for testing. I'd want to get some specific tests against the common monitoring goals of tools like check_postgres and the Munin plug-in to see which implementation makes more sense for them as input on that. ISTM this should be driven by what data we actually expose. If we're willing to expose actual information for idle, idle_transaction and waiting for backends that you don't have permission to see the query for, then we should expose the actual information (I personally think this would be useful). OTOH, if we are not willing to expose that information, then we should certainly set those fields to null instead of some default value. -- Jim C. Nasby, Database Architect jim@nasby.net 512.569.9461 (cell) http://jim.nasby.net
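One practical consequence of the null option is worth spelling out, since it affects the monitoring queries mentioned above: under SQL's three-valued logic, a filter such as "WHERE NOT idle" silently drops sessions whose state is null. The sketch below shows this using SQLite through Python's sqlite3 module as a stand-in; the sessions table and its contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sessions (pid INTEGER, idle INTEGER);  -- hypothetical
INSERT INTO sessions VALUES
    (201, 1),      -- idle
    (202, 0),      -- running a query
    (203, NULL);   -- not visible to this user, state unknown
""")


def pids(where):
    """Return the pids of the sessions matching a WHERE clause."""
    sql = f"SELECT pid FROM sessions WHERE {where} ORDER BY pid"
    return [row[0] for row in conn.execute(sql)]


idle = pids("idle")
not_idle = pids("NOT idle")            # NOT NULL is NULL: 203 is dropped
unknown = pids("idle IS NULL")
not_known_idle = pids("COALESCE(idle, 0) = 0")
print(idle, not_idle, unknown, not_known_idle)
```

With nulls, a tool that counts "idle" plus "NOT idle" sessions will come up short of the total unless it also handles the unknown case, which is one of the things tests against check_postgres-style queries would surface.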