Thread: procpid?

From
Bruce Momjian
Date:
Can someone explain why pg_stat_activity has a column named procpid and
not simply pid?  'pid' is what pg_locks uses, and 'procpid' is redundant
(proc-process-id).  A mistake?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: procpid?

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Can someone explain why pg_stat_activity has a column named procpid and
> not simply pid?  'pid' is what pg_locks uses, and 'procpid' is redundant
> (proc-process-id).  A mistake?

Mistake or not, it's about half a dozen releases too late to change it.
        regards, tom lane


Re: procpid?

From
Robert Haas
Date:
On Thu, Jun 9, 2011 at 11:54 AM, Bruce Momjian <bruce@momjian.us> wrote:
> Can someone explain why pg_stat_activity has a column named procpid and
> not simply pid?  'pid' is what pg_locks uses, and 'procpid' is redundant
> (proc-process-id).  A mistake?

Well, we refer to the slots that backends use as "procs" (really
PGPROC), so I'm guessing that this was intended to mean "the pid
associated with the proc".  It might not be the greatest name but I
can't see changing it now.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: procpid?

From
Bruce Momjian
Date:
Robert Haas wrote:
> On Thu, Jun 9, 2011 at 11:54 AM, Bruce Momjian <bruce@momjian.us> wrote:
> > Can someone explain why pg_stat_activity has a column named procpid and
> > not simply pid?  'pid' is what pg_locks uses, and 'procpid' is redundant
> > (proc-process-id).  A mistake?
> 
> Well, we refer to the slots that backends use as "procs" (really
> PGPROC), so I'm guessing that this was intended to mean "the pid
> associated with the proc".  It might not be the greatest name but I
> can't see changing it now.

Agreed.  Just pointing out this mistake slipped through.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: procpid?

From
Jim Nasby
Date:
On Jun 9, 2011, at 11:29 AM, Robert Haas wrote:
> On Thu, Jun 9, 2011 at 11:54 AM, Bruce Momjian <bruce@momjian.us> wrote:
>> Can someone explain why pg_stat_activity has a column named procpid and
>> not simply pid?  'pid' is what pg_locks uses, and 'procpid' is redundant
>> (proc-process-id).  A mistake?
>
> Well, we refer to the slots that backends use as "procs" (really
> PGPROC), so I'm guessing that this was intended to mean "the pid
> associated with the proc".  It might not be the greatest name but I
> can't see changing it now.

It's damn annoying... enough so that I'd personally be in favor of
creating a pid column that has the same data so we can deprecate
procpid and eventually remove it...
--
Jim C. Nasby, Database Architect                   jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net




Re: procpid?

From
Jaime Casanova
Date:
On Sat, Jun 11, 2011 at 12:02 AM, Jim Nasby <jim@nasby.net> wrote:
>
> It's damn annoying... enough so that I'd personally be in favor of
> creating a pid column that has the same data so we can deprecate
> procpid and eventually remove it...

Well, if we start changing badly picked names we will have a *lot*
of work to do... starting with the project's name ;)

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación


Re: procpid?

From
"Joshua D. Drake"
Date:
On 6/11/2011 1:02 AM, Jaime Casanova wrote:
> On Sat, Jun 11, 2011 at 12:02 AM, Jim Nasby<jim@nasby.net>  wrote:
>> It's damn annoying... enough so that I'd personally be in favor of
>> creating a pid column that has the same data so we can deprecate
>> procpid and eventually remove it...
> Well, if we start changing badly picked names we will have a *lot*
> of work to do... starting with the project's name ;)

There is a difference between a project name and something that directly 
affects usability. +1 on fixing this. IMO, we don't create a new pid 
column, we just fix the problem. If we do it for 9.2, we have 18 months 
to communicate the change.

Joshua D. Drake




Re: procpid?

From
Bruce Momjian
Date:
Joshua D. Drake wrote:
> On 6/11/2011 1:02 AM, Jaime Casanova wrote:
> > On Sat, Jun 11, 2011 at 12:02 AM, Jim Nasby<jim@nasby.net>  wrote:
> >> It's damn annoying... enough so that I'd personally be in favor of
> >> creating a pid column that has the same data so we can deprecate
> >> procpid and eventually remove it...
> > Well, if we start changing badly picked names we will have a *lot*
> > of work to do... starting with the project's name ;)
> 
> There is a difference between a project name and something that directly 
> affects usability. +1 on fixing this. IMO, we don't create a new pid 
> column, we just fix the problem. If we do it for 9.2, we have 18 months 
> to communicate the change.

Uh, I am the first one I remember complaining about this so I don't see
why we should break compatibility for such a low-level problem.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: procpid?

From
"Joshua D. Drake"
Date:
On 6/11/2011 1:23 PM, Bruce Momjian wrote:
>
>> There is a difference between a project name and something that directly
>> affects usability. +1 on fixing this. IMO, we don't create a new pid
>> column, we just fix the problem. If we do it for 9.2, we have 18 months
>> to communicate the change.
> Uh, I am the first one I remember complaining about this so I don't see
> why we should break compatibility for such a low-level problem.
>

Because it is a very real problem with an easy fix. We have 18 months to 
publicize that fix. I mean really? This is a no-brainer.

JD



Re: procpid?

From
Robert Haas
Date:
On Sat, Jun 11, 2011 at 9:15 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
> On 6/11/2011 1:23 PM, Bruce Momjian wrote:
>>
>>> There is a difference between a project name and something that directly
>>> affects usability. +1 on fixing this. IMO, we don't create a new pid
>>> column, we just fix the problem. If we do it for 9.2, we have 18 months
>>> to communicate the change.
>>
>> Uh, I am the first one I remember complaining about this so I don't see
>> why we should break compatibility for such a low-level problem.
>
> Because it is a very real problem with an easy fix. We have 18 months to
> publicize that fix. I mean really? This is a no-brainer.

I really don't see what the big deal with calling it the process PID
rather than just the PID is.  Changing something like this forces
pgAdmin and every other application out there that is built to work
with PG to make a code change to keep working with PG.  That seems
like pushing a lot of unnecessary work on other people for what is
basically a minor cosmetic issue.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: procpid?

From
Cédric Villemain
Date:
2011/6/12 Robert Haas <robertmhaas@gmail.com>:
> On Sat, Jun 11, 2011 at 9:15 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
>> On 6/11/2011 1:23 PM, Bruce Momjian wrote:
>>>
>>>> There is a difference between a project name and something that directly
>>>> affects usability. +1 on fixing this. IMO, we don't create a new pid
>>>> column, we just fix the problem. If we do it for 9.2, we have 18 months
>>>> to communicate the change.
>>>
>>> Uh, I am the first one I remember complaining about this so I don't see
>>> why we should break compatibility for such a low-level problem.
>>
>> Because it is a very real problem with an easy fix. We have 18 months to
>> publicize that fix. I mean really? This is a no-brainer.
>
> I really don't see what the big deal with calling it the process PID
> rather than just the PID is.  Changing something like this forces
> pgAdmin and every other application out there that is built to work
> with PG to make a code change to keep working with PG.  That seems
> like pushing a lot of unnecessary work on other people for what is
> basically a minor cosmetic issue.

I agree.
This is at least a use-case for something^Wfeature like 'create
synonym', allowing smooth end-user's application upgrade on schema
update. I am not claiming that we need that, it just seems a good
usecase for column alias/synonym.




--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support


Re: procpid?

From
Robert Haas
Date:
On Sat, Jun 11, 2011 at 9:56 PM, Cédric Villemain
<cedric.villemain.debian@gmail.com> wrote:
> 2011/6/12 Robert Haas <robertmhaas@gmail.com>:
>> On Sat, Jun 11, 2011 at 9:15 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
>>> On 6/11/2011 1:23 PM, Bruce Momjian wrote:
>>>>
>>>>> There is a difference between a project name and something that directly
>>>>> affects usability. +1 on fixing this. IMO, we don't create a new pid
>>>>> column, we just fix the problem. If we do it for 9.2, we have 18 months
>>>>> to communicate the change.
>>>>
>>>> Uh, I am the first one I remember complaining about this so I don't see
>>>> why we should break compatibility for such a low-level problem.
>>>
>>> Because it is a very real problem with an easy fix. We have 18 months to
>>> publicize that fix. I mean really? This is a no-brainer.
>>
>> I really don't see what the big deal with calling it the process PID
>> rather than just the PID is.  Changing something like this forces
>> pgAdmin and every other application out there that is built to work
>> with PG to make a code change to keep working with PG.  That seems
>> like pushing a lot of unnecessary work on other people for what is
>> basically a minor cosmetic issue.
>
> I agree.
> This is at least a use-case for something^Wfeature like 'create
> synonym', allowing smooth end-user's application upgrade on schema
> update. I am not claiming that we need that, it just seems a good
> usecase for column alias/synonym.

I had the same thought.  I'm not sure that this particular example
would be worthwhile even if we had a column synonym facility.  But at
least if we were bent on changing it we could do it without breaking
things.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: procpid?

From
Peter Eisentraut
Date:
On lör, 2011-06-11 at 16:23 -0400, Bruce Momjian wrote:
> Uh, I am the first one I remember complaining about this so I don't
> see why we should break compatibility for such a low-level problem. 

I complain about it every day to the wall. :)



Re: procpid?

From
Dimitri Fontaine
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> On lör, 2011-06-11 at 16:23 -0400, Bruce Momjian wrote:
>> Uh, I am the first one I remember complaining about this so I don't
>> see why we should break compatibility for such a low-level problem.
>
> I complain about it every day to the wall. :)

+1 !
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: procpid?

From
Jim Nasby
Date:
On Jun 11, 2011, at 9:36 PM, Robert Haas wrote:
>> This is at least a use-case for something^Wfeature like 'create
>> synonym', allowing smooth end-user's application upgrade on schema
>> update. I am not claiming that we need that, it just seems a good
>> usecase for column alias/synonym.
>
> I had the same thought.  I'm not sure that this particular example
> would be worthwhile even if we had a column synonym facility.  But at
> least if we were bent on changing it we could do it without breaking
> things.

A synonym feature would definitely be useful for cases like this. We
have a poorly named database at work; it's been that way for years and
the only reason it's never been cleaned up is because it would require
simultaneously changing config settings in dozens of places on hundreds
of machines (many of which are user machines, which makes performing the
change very difficult). As annoying as dealing with the oddball name is
(there's a number of pieces of code that have to special case it), it
would be even more painful to fix the problem. If we had database name
synonyms we could create a synonym and migrate everything over time...
and in the meantime, code could stop special-casing it.
--
Jim C. Nasby, Database Architect                   jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net




Re: procpid?

From
Robert Haas
Date:
On Mon, Jun 13, 2011 at 11:20 AM, Jim Nasby <jim@nasby.net> wrote:
> On Jun 11, 2011, at 9:36 PM, Robert Haas wrote:
>>> This is at least a use-case for something^Wfeature like 'create
>>> synonym', allowing smooth end-user's application upgrade on schema
>>> update. I am not claiming that we need that, it just seems a good
>>> usecase for column alias/synonym.
>>
>> I had the same thought.  I'm not sure that this particular example
>> would be worthwhile even if we had a column synonym facility.  But at
>> least if we were bent on changing it we could do it without breaking
>> things.
>
> A synonym feature would definitely be useful for cases like this. We
> have a poorly named database at work; it's been that way for years and
> the only reason it's never been cleaned up is because it would require
> simultaneously changing config settings in dozens of places on hundreds
> of machines (many of which are user machines, which makes performing the
> change very difficult). As annoying as dealing with the oddball name is
> (there's a number of pieces of code that have to special case it), it
> would be even more painful to fix the problem. If we had database name
> synonyms we could create a synonym and migrate everything over time...
> and in the meantime, code could stop special-casing it.

That's probably the best explanation of why synonyms would be useful I
believe I've yet heard.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: procpid?

From
Jim Nasby
Date:
On Jun 13, 2011, at 10:22 AM, Robert Haas wrote:
>> A synonym feature would definitely be useful for cases like this. We
>> have a poorly named database at work; it's been that way for years and
>> the only reason it's never been cleaned up is because it would require
>> simultaneously changing config settings in dozens of places on hundreds
>> of machines (many of which are user machines, which makes performing the
>> change very difficult). As annoying as dealing with the oddball name is
>> (there's a number of pieces of code that have to special case it), it
>> would be even more painful to fix the problem. If we had database name
>> synonyms we could create a synonym and migrate everything over time...
>> and in the meantime, code could stop special-casing it.
>
> That's probably the best explanation of why synonyms would be useful I
> believe I've yet heard.

FWIW, I've asked Command Prompt to look into creating database name
synonyms for us, but perhaps there are other synonyms that would make
sense? I can't really think of any other cases where you care about name
and don't have a way to work around it (i.e., columns and tables can be
done with views; you can grant a role to another role; you can create a
wrapper function).
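
As a rough sketch of the view workaround for columns, assuming a
pg_stat_activity that still exposes procpid and has no pid column, a
wrapper view could publish the same value under both names. The view
name stat_activity_compat is hypothetical and not part of PostgreSQL:

```sql
-- Hypothetical compatibility view: exposes procpid under the name pid too.
-- Assumes pg_stat_activity has a procpid column and no pid column yet.
CREATE VIEW stat_activity_compat AS
SELECT a.*,
       a.procpid AS pid
FROM pg_catalog.pg_stat_activity AS a;

-- Callers can then use either spelling:
SELECT pid, usename FROM stat_activity_compat;
```

Note that a.* is expanded when the view is created, so the view would
need to be recreated if the underlying column list changed.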
--
Jim C. Nasby, Database Architect                   jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net




Re: procpid?

From
Simon Riggs
Date:
On Sun, Jun 12, 2011 at 2:23 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sat, Jun 11, 2011 at 9:15 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
>> On 6/11/2011 1:23 PM, Bruce Momjian wrote:
>>>
>>>> There is a difference between a project name and something that directly
>>>> affects usability. +1 on fixing this. IMO, we don't create a new pid
>>>> column, we just fix the problem. If we do it for 9.2, we have 18 months
>>>> to communicate the change.
>>>
>>> Uh, I am the first one I remember complaining about this so I don't see
>>> why we should break compatibility for such a low-level problem.
>>
>> Because it is a very real problem with an easy fix. We have 18 months to
>> publicize that fix. I mean really? This is a no-brainer.
>
> I really don't see what the big deal with calling it the process PID
> rather than just the PID is.  Changing something like this forces
> pgAdmin and every other application out there that is built to work
> with PG to make a code change to keep working with PG.  That seems
> like pushing a lot of unnecessary work on other people for what is
> basically a minor cosmetic issue.

+1

If we were going to make changes like this, I'd suggest we save them
up in a big bag for when we change major version number. Everybody in
the world thinks that PostgreSQL v8 is compatible across all versions
(8.0, 8.1, 8.2, 8.3, 8.4), and it will be same with v9. That way we
would still have forward progress, but in more sensible sized steps.
Otherwise we just break the code annually for all the people that
support us. If we had a more stable environment for tools vendors,
maybe people wouldn't need to be manually typing procpid anyway...

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: procpid?

From
Robert Haas
Date:
On Mon, Jun 13, 2011 at 11:56 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> +1
>
> If we were going to make changes like this, I'd suggest we save them
> up in a big bag for when we change major version number. Everybody in
> the world thinks that PostgreSQL v8 is compatible across all versions
> (8.0, 8.1, 8.2, 8.3, 8.4), and it will be same with v9. That way we
> would still have forward progress, but in more sensible sized steps.
> Otherwise we just break the code annually for all the people that
> support us. If we had a more stable environment for tools vendors,
> maybe people wouldn't need to be manually typing procpid anyway...

Amen.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: procpid?

From
Bruce Momjian
Date:
Simon Riggs wrote:
> On Sun, Jun 12, 2011 at 2:23 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> > On Sat, Jun 11, 2011 at 9:15 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
> >> On 6/11/2011 1:23 PM, Bruce Momjian wrote:
> >>>
> >>>> There is a difference between a project name and something that directly
> >>>> affects usability. +1 on fixing this. IMO, we don't create a new pid
> >>>> column, we just fix the problem. If we do it for 9.2, we have 18 months
> >>>> to communicate the change.
> >>>
> >>> Uh, I am the first one I remember complaining about this so I don't see
> >>> why we should break compatibility for such a low-level problem.
> >>
> >> Because it is a very real problem with an easy fix. We have 18 months to
> >> publicize that fix. I mean really? This is a no-brainer.
> >
> > I really don't see what the big deal with calling it the process PID
> > > rather than just the PID is.  Changing something like this forces
> > > pgAdmin and every other application out there that is built to work
> > > with PG to make a code change to keep working with PG.  That seems
> > like pushing a lot of unnecessary work on other people for what is
> > basically a minor cosmetic issue.
> 
> +1
> 
> If we were going to make changes like this, I'd suggest we save them
> up in a big bag for when we change major version number. Everybody in
> the world thinks that PostgreSQL v8 is compatible across all versions
> (8.0, 8.1, 8.2, 8.3, 8.4), and it will be same with v9. That way we
> would still have forward progress, but in more sensible sized steps.
> Otherwise we just break the code annually for all the people that
> support us. If we had a more stable environment for tools vendors,
> maybe people wouldn't need to be manually typing procpid anyway...

Agreed.  I did add a C comment that this was misnamed so when we are in
that code we will see it.  I did reorder the pg_stat_activity columns in
9.0 for sanity, and no one complained, but renaming is more disruptive
than reordering.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: procpid?

From
Jim Nasby
Date:
On Jun 13, 2011, at 10:56 AM, Simon Riggs wrote:
> If we were going to make changes like this, I'd suggest we save them
> up in a big bag for when we change major version number. Everybody in
> the world thinks that PostgreSQL v8 is compatible across all versions
> (8.0, 8.1, 8.2, 8.3, 8.4), and it will be same with v9. That way we
> would still have forward progress, but in more sensible sized steps.
> Otherwise we just break the code annually for all the people that
> support us. If we had a more stable environment for tools vendors,
> maybe people wouldn't need to be manually typing procpid anyway...

Wouldn't it be better still to have both the new and old columns
available for a while? That would produce the minimum amount of
disruption to tools, etc. The only downside is some potential confusion,
but that would just serve to drive people to the documentation to see
why there were two fields, where they would find out one was deprecated.
--
Jim C. Nasby, Database Architect                   jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net




Re: procpid?

From
Bruce Momjian
Date:
Jim Nasby wrote:
> On Jun 13, 2011, at 10:56 AM, Simon Riggs wrote:
> > If we were going to make changes like this, I'd suggest we save them
> > up in a big bag for when we change major version number. Everybody in
> > the world thinks that PostgreSQL v8 is compatible across all versions
> > (8.0, 8.1, 8.2, 8.3, 8.4), and it will be same with v9. That way we
> > would still have forward progress, but in more sensible sized steps.
> > Otherwise we just break the code annually for all the people that
> > support us. If we had a more stable environment for tools vendors,
> > maybe people wouldn't need to be manually typing procpid anyway...
> 
> Wouldn't it be better still to have both the new and old columns
> available for a while? That would produce the minimum amount of
> disruption to tools, etc. The only downside is some potential confusion,
> but that would just serve to drive people to the documentation to see
> why there were two fields, where they would find out one was deprecated.

Well, someone doing SELECT *, which is probably 90% of the users, are
going to be pretty confused by duplicate columns, asking, "What is the
difference"?  For those people this would make things worse than they
are now.

I would say 90% of users are doing SELECT *, and 10% are joining to
other tables or displaying specific columns.  We want to help that 10%
without making that 90% confused.

-- Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: procpid?

From
Greg Smith
Date:
On 06/14/2011 11:44 AM, Jim Nasby wrote:
> Wouldn't it be better still to have both the new and old columns 
> available for a while? That would produce the minimum amount of 
> disruption to tools, etc.

Doing this presumes the existence of a large number of tools where the 
author is unlikely to be keeping up with PostgreSQL development.  I 
don't believe that theorized set of users actually exists.  There are 
people who use pg_stat_activity simply, and there are tool authors who 
are heavily involved enough that they will see a change here coming far 
enough in advance to adopt it without disruption.  If there's a large 
base of "casual" tool authors, who wrote something using 
pg_stat_activity once and will never update it again, I don't know where 
they are.

Anyway, I want a larger change to pg_stat_activity than this one, and I 
would just roll fixing this column name into that more disruptive and 
positive change.  Right now the biggest problem with this view is that 
you have to parse the text of the query to figure out what state the 
connection is in.  This is silly; there should be boolean values exposed 
for "idle" and "in transaction".  I want to be able to write things like 
this:

SELECT idle,in_trans,count(*) FROM pg_stat_activity GROUP BY idle,in_trans;
SELECT min(backend_start) FROM pg_stat_activity WHERE idle;

Right now the standard approach to this is to turn current_query into a 
derived state value using CASE statements.  It's quite unfriendly, and a 
bigger problem than this procpid mismatch.  Fix that whole mess at once, 
and now you've got something useful enough to justify breaking tools.
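
For reference, the CASE-based workaround described above might look
roughly like this. It is only a sketch: it assumes the convention in
which current_query holds the literal strings '<IDLE>' and
'<IDLE> in transaction' for idle backends, and the state labels are
illustrative.

```sql
-- Sketch: derive a connection state by parsing current_query.
-- Anything that is not one of the two idle markers is counted as active.
SELECT CASE
         WHEN current_query = '<IDLE>' THEN 'idle'
         WHEN current_query = '<IDLE> in transaction' THEN 'idle in transaction'
         ELSE 'active'
       END AS state,
       count(*) AS connections
FROM pg_catalog.pg_stat_activity
GROUP BY 1
ORDER BY 2 DESC;
```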

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us




Re: procpid?

From
Jaime Casanova
Date:
On Tue, Jun 14, 2011 at 12:25 PM, Greg Smith <greg@2ndquadrant.com> wrote:
>
> Anyway, I want a larger change to pg_stat_activity than this one

Well, Simon recommended having a big bag of changes that justify
breaking tools... and you have presented one good item for that bag...
Maybe we should start a wiki page for this and put there all the
changes we want to see before we break anything?

For example, a change I want to see is in csvlog: I want a duration
field there, because tools like pgfouine, pgsi and others parse the
message field for a "duration" string, which is only useful if the
message is in English, which non-English DBAs won't have.

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación


Re: procpid?

From
Robert Haas
Date:
On Tue, Jun 14, 2011 at 1:43 PM, Jaime Casanova <jaime@2ndquadrant.com> wrote:
> On Tue, Jun 14, 2011 at 12:25 PM, Greg Smith <greg@2ndquadrant.com> wrote:
>>
>> Anyway, I want a larger change to pg_stat_activity than this one
>
> Well, Simon recommended having a big bag of changes that justify
> breaking tools... and you have presented one good item for that bag...
> Maybe we should start a wiki page for this and put there all the
> changes we want to see before we break anything?
>
> For example, a change I want to see is in csvlog: I want a duration
> field there, because tools like pgfouine, pgsi and others parse the
> message field for a "duration" string, which is only useful if the
> message is in English, which non-English DBAs won't have.

There are real problems with the idea of having one release where we
break everything that we want to break - mostly from a process
standpoint.  We aren't always good at being organized and disciplined,
and coming up with a multi-year plan to break everything all at once
in 2014 for release in 2015 may be difficult, because it requires a
consensus on release management to hold together for years, and
sometimes we can't even manage "days".

But I don't think it's a bad idea to try.  So +1 for creating a list
of things that we think we might like to break at some point.  It
might be worth trying to do this in the context of the Todo list -
come up with some special badge or flag that we can put on items that
require a compatibility break, so that we can scan for them there
easily.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: procpid?

From
"Kevin Grittner"
Date:
Greg Smith <greg@2ndQuadrant.com> wrote:
> Doing this presumes the existence of a large number of tools where
> the author is unlikely to be keeping up with PostgreSQL
> development.  I don't believe that theorized set of users actually
> exists.

There could be a number of queries used for monitoring or
administration which will be affected.  Just on our Wiki pages we
have some queries available for copy/paste which would need multiple
versions while both column names were in supported versions of the
software:

http://wiki.postgresql.org/wiki/Lock_dependency_information
http://wiki.postgresql.org/wiki/Lock_Monitoring
http://wiki.postgresql.org/wiki/Backend_killer_function

I agree that these are manageable, but not necessarily trivial.
(You should see how long it can take to get them to install new
monitoring software to our centralized system here.)  I think that's
consistent with the "save up our breaking changes to do them all at
once" approach.

-Kevin


Re: procpid?

From
Greg Smith
Date:
On 06/14/2011 02:20 PM, Kevin Grittner wrote:
> Just on our Wiki pages we have some queries available for copy/paste 
> which would need multiple
> versions while both column names were in supported versions of the
> software:
>
> http://wiki.postgresql.org/wiki/Lock_dependency_information
> http://wiki.postgresql.org/wiki/Lock_Monitoring
> http://wiki.postgresql.org/wiki/Backend_killer_function
>    

...and most of these would actually be simplified if they could just 
JOIN on pid instead of needing this common idiom:

    join pg_catalog.pg_stat_activity ka on kl.pid = ka.procpid

Yes, there are a lot of these floating around.  I'd bet that in an hour 
of research I could find 95% of them though, and make sure they were all 
updated in advance of the release.  (I already did most of this search 
as part of stealing every good idea I could find in this area for my book)
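
To make the idiom concrete, here is a minimal sketch of a lock-monitoring
query of the kind found on those wiki pages. It is not copied from any of
them; the aliases and selected columns are illustrative, and it assumes
the procpid and current_query column names discussed in this thread.

```sql
-- Sketch: list backends waiting on a lock, with what they are running.
-- pg_locks calls the backend PID "pid"; pg_stat_activity calls it "procpid",
-- so the join condition has to name both columns explicitly.
SELECT l.pid,
       l.locktype,
       l.mode,
       a.usename,
       a.current_query
FROM pg_catalog.pg_locks AS l
JOIN pg_catalog.pg_stat_activity AS a
  ON l.pid = a.procpid
WHERE NOT l.granted;
```

If both views used the name pid, the join condition could presumably be
shortened to USING (pid).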

> I think that's consistent with the "save up our breaking changes to do them all at
> once" approach.
>    

I don't actually buy into this whole idea at all.  We already have this 
big wall at 8.3 because changes made in that release are too big for 
people on the earlier side to upgrade past.  I'd rather see a series of 
smaller changes in each release, even if they are disruptive, so that no 
one version turns into a frustrating hurdle seen as impossible to 
clear.  This adjustment is a perfect candidate for putting into 9.2 to 
me, because I'd rather reduce max(breakage) across releases than 
intentionally aim at increasing it by bundling them into larger clumps.

For me, the litmus test is whether the change provides enough 
improvement that it outweighs the disruption when the user runs into 
it.  This is why I suggested a specific, useful, and commonly requested 
(to me at least) change to pg_stat_activity go along with this.  If 
people discover their existing pg_stat_activity tools break, presumably 
they're going to look at the view again to see what changed.  When they 
do that, I don't want the reaction to be "why was this random change 
made?"  I want it to be "look, there are useful new fields in here; let 
me see if I can use them too here".  That's how you make people tolerate 
disruption in upgrades.  If they see a clear improvement in the same 
spot when forced to fix around it, the experience is much more pleasant 
if they get something new out of it too.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us




Re: procpid?

From
Alvaro Herrera
Date:
Excerpts from Bruce Momjian's message of mar jun 14 12:59:15 -0400 2011:

> Well, someone doing SELECT *, which is probably 90% of the users, are
> going to be pretty confused by duplicate columns, asking, "What is the
> difference"?  For those people this would make things worse than they
> are now.
> 
> I would say 90% of users are doing SELECT *, and 10% are joining to
> other tables or displaying specific columns.  We want to help that 10%
> without making that 90% confused.

I think if you had column synonyms, you would get only a single one when
doing "select *".  The other name would still be accepted in a query
that explicitly asked for it.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: procpid?

From
Peter Eisentraut
Date:
On Tue, 2011-06-14 at 13:50 -0400, Robert Haas wrote:
> There are real problems with the idea of having one release where we
> break everything that we want to break - mostly from a process
> standpoint.  We aren't always good at being organized and disciplined,
> and coming up with a multi-year plan to break everything all at once
> in 2014 for release in 2015 may be difficult, because it requires a
> consensus on release management to hold together for years, and
> sometimes we can't even manage "days".

I have had this fantasy of a break-everything release for a long time as
well, but frankly, experience from other projects such as Python 3, Perl
6, KDE 4, Samba 4, add-yours-here, indicates that such things might not
work out so well.

OK, some of those were rewrites as well as interface changes, but the
effect visible to the end user is mostly the same.




Re: procpid?

From
Bruce Momjian
Date:
Peter Eisentraut wrote:
> On tis, 2011-06-14 at 13:50 -0400, Robert Haas wrote:
> > There are real problems with the idea of having one release where we
> > break everything that we want to break - mostly from a process
> > standpoint.  We aren't always good at being organized and disciplined,
> > and coming up with a multi-year plan to break everything all at once
> > in 2014 for release in 2015 may be difficult, because it requires a
> > consensus on release management to hold together for years, and
> > sometimes we can't even manage "days".
> 
> I have had this fantasy of a break-everything release for a long time as
> well, but frankly, experience from other projects such as Python 3, Perl
> 6, KDE 4, Samba 4, add-yours-here, indicates that such things might not
> work out so well.
> 
> OK, some of those were rewrites as well as interface changes, but the
> effect visible to the end user is mostly the same.

Funny you mentioned Perl 6 because I just blogged about that:
http://momjian.us/main/blogs/pgblog/2011.html#June_14_2011

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: procpid?

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> On tis, 2011-06-14 at 13:50 -0400, Robert Haas wrote:
>> There are real problems with the idea of having one release where we
>> break everything that we want to break - mostly from a process
>> standpoint.  We aren't always good at being organized and disciplined,
>> and coming up with a multi-year plan to break everything all at once
>> in 2014 for release in 2015 may be difficult, because it requires a
>> consensus on release management to hold together for years, and
>> sometimes we can't even manage "days".

> I have had this fantasy of a break-everything release for a long time as
> well, but frankly, experience from other projects such as Python 3, Perl
> 6, KDE 4, Samba 4, add-yours-here, indicates that such things might not
> work out so well.

> OK, some of those were rewrites as well as interface changes, but the
> effect visible to the end user is mostly the same.

Good point.  I think the case that has actually been discussed is the
idea of saving up binary-compatibility breaks (on-disk format changes).
That seems sensible.  It doesn't create a bigger problem for users,
since a dump/reload is a dump/reload no matter how many individual
format changes happened underneath.  But we should be wary of applying
that approach to application-visible incompatibilities.

As far as Greg's proposal is concerned, I don't see how a proposed
addition of two columns would justify renaming an existing column.
Additions should not break any sanely-implemented application, but
renamings certainly will.
        regards, tom lane


Re: procpid?

From
Greg Smith
Date:
On 06/14/2011 06:00 PM, Tom Lane wrote:
> As far as Greg's proposal is concerned, I don't see how a proposed
> addition of two columns would justify renaming an existing column.
> Additions should not break any sanely-implemented application, but
> renamings certainly will.
>    

It's not so much justification as something that makes the inevitable 
complaints easier to stomach, in terms of not leaving a really bad taste 
in the user's mouth.  My thinking is that if we're going to mess with 
pg_stat_activity in a way that breaks something, I'd like to see it 
completely refactored for better usability in the process.  If code 
breaks and the resulting investigation by the admin highlights something 
new, that offsets some of the bad user experience resulting from the 
breakage.

Also, I haven't fully worked out whether it makes sense to really change 
what current_query means if the idle/transaction component of it gets 
moved to another column.  Would it be better to set current_query to 
null if you are idle, rather than the way it's currently overloaded with 
text in that case?  I don't like the way this view works at all, but I'm 
not sure the best way to change it.  Just changing procpid wouldn't be 
the only thing on the list though.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us




Re: procpid?

From
Bruce Momjian
Date:
Greg Smith wrote:
> On 06/14/2011 06:00 PM, Tom Lane wrote:
> > As far as Greg's proposal is concerned, I don't see how a proposed
> > addition of two columns would justify renaming an existing column.
> > Additions should not break any sanely-implemented application, but
> > renamings certainly will.
> >    
> 
> It's not so much justification as something that makes the inevitable 
> complaints easier to stomach, in terms of not leaving a really bad taste 
> in the user's mouth.  My thinking is that if we're going to mess with 
> pg_stat_activity in a way that breaks something, I'd like to see it 
> completely refactored for better usability in the process.  If code 
> breaks and the resulting investigation by the admin highlights something 
> new, that offsets some of the bad user experience resulting from the 
> breakage.
> 
> Also, I haven't fully worked whether it makes sense to really change 
> what current_query means if the idle/transaction component of it gets 
> moved to another column.  Would it be better to set current_query to 
> null if you are idle, rather than the way it's currently overloaded with 
> text in that case?  I don't like the way this view works at all, but I'm 
> not sure the best way to change it.  Just changing procpid wouldn't be 
> the only thing on the list though.

Agreed on moving '<IDLE>' and '<IDLE> in transaction' into separate
fields.  If I had thought of it I would have done it that way years ago.
(At least I think it was me.)  Using angle brackets to put magic values
in that field was clearly wrong.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: procpid?

From
Greg Stark
Date:
On Wed, Jun 15, 2011 at 2:50 AM, Bruce Momjian <bruce@momjian.us> wrote:
> Agreed on moving '<IDLE>' and '<IDLE> in transaction' into separate
> fields.  If I had thought of it I would have done it that way years ago.
> (At least I think it was me.)  Using angle brackets to put magic values
> in that field was clearly wrong.

I think of these as just placeholders in the SQL text field for cases
where there's no SQL text available.

But they do clearly indicate a need for columns with this information.
For what it's worth Oracle provides a whole list of states the
transaction can be in, it can be waiting for client traffic, waiting
on i/o, waiting on a lock, etc.

Separately whether the session is in a transaction might need to
become slightly richer than a boolean now that we have snapshot
management. You can be in a transaction but not have any snapshots or
be in the traditional state where you have at least one snapshot. And
if we do autonomous transactions the field might have to be much, much
richer again.

--
greg


Re: procpid?

From
"Greg Sabino Mullane"
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160


> For me, the litmus test is whether the change provides enough 
> improvement that it outweighs the disruption when the user runs into 
> it.

For the procpid that started all of this, the clear answer is no. I'm 
surprised people seriously considered making this change. It's a 
historical accident: document and move on. And if we are going to 
talk about changing misnamed things, I've got a whole bunch of others 
I could throw at you (such as abbreviation rules: blks_read on the 
one extreme, and autovacuum_analyze_scale_factor on the other) :)

> This is why I suggested a specific, useful, and commonly requested 
> (to me at least) change to pg_stat_activity go along with this.

+1. The procpid change is silly, but fixing the current_query field 
would be very useful. You don't know how many times my fingers 
have typed "WHERE current_query <> '<IDLE>'"

- -- 
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201106142300
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAk34IRoACgkQvJuQZxSWSsi0dgCgi37mrLYbD6G3dS99GPbSFhHW
EjYAniZNpRUXxYmhBHfb1k1LsMSoOHE7
=61nA
-----END PGP SIGNATURE-----




Re: procpid?

From
Robert Haas
Date:
On Tue, Jun 14, 2011 at 11:04 PM, Greg Sabino Mullane <greg@turnstep.com> wrote:
>> For me, the litmus test is whether the change provides enough
>> improvement that it outweighs the disruption when the user runs into
>> it.
>
> For the procpid that started all of this, the clear answer is no. I'm
> surprised people seriously considered making this change. It's a
> historical accident: document and move on.

I agree with you on this one...

>> This is why I suggested a specific, useful, and commonly requested
>> (to me at least) change to pg_stat_activity go along with this.
>
> +1. The procpid change is silly, but fixing the current_query field
> would be very useful. You don't know how many times my fingers
> have typed "WHERE current_query <> '<IDLE>'"

...but I'm not even excited about this.  *Maybe* it's worth adding
another column, but the problem with the existing system is *entirely*
cosmetic.  The string chosen here is unconfusable with an actual
query, so we are talking here, as with the procpid -> pid proposal,
ONLY about saving a few keystrokes when writing queries.  That is a
pretty thin justification for a compatibility break IMV.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: procpid?

From
Greg Smith
Date:
Here's the sort of thing every person who writes a monitoring tool 
involving pg_stat_activity goes through:

1) Hurray!  I know how to see what the database is doing now!  Let me 
try counting all the connections so I can finally figure out what to set 
[max_connections | work_mem | other] to.
2) Wait, some of these can be "<IDLE>".  That's not documented.  I'll 
have to special case them because they don't really matter for my 
computation.
3) Seriously, there's another state for idle in a transaction?  Just how 
many of these special values are there?  [There's actually one more 
surprise after this]

The whole thing is enormously frustrating, and it's an advocacy 
problem--it contributes to people just starting to become serious about 
using PostgreSQL lowering their opinion of its suitability for their 
business.  If this is what's included for activity monitoring, and it's 
this terrible, it suggests people must not have very high requirements 
for that.

And what you end up with to make it better is not just another few 
keystrokes.  Here, as a common example I re-use a lot, is a decoder 
inspired by Munin's connection count monitoring graph:

SELECT
    waiting,
    CASE WHEN current_query = '<IDLE>' THEN true ELSE false END AS idle,
    CASE WHEN current_query = '<IDLE> in transaction' THEN true ELSE false END AS idletransaction,
    CASE WHEN current_query = '<insufficient privilege>' THEN false ELSE true END AS visible,
    CASE WHEN NOT waiting
              AND current_query NOT IN ('<IDLE>', '<IDLE> in transaction', '<insufficient privilege>')
         THEN true ELSE false END AS active,
    procpid,
    current_query
FROM pg_stat_activity
WHERE procpid != pg_backend_pid();

What percentage of people do you think get this right?  Now, what does 
that number go to if these states were all obviously exposed booleans?  
As far as I'm concerned, this design is fundamentally flawed as currently 
delivered, so the concept of "breaking" it doesn't really make sense.
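
As a rough illustration of the "obviously exposed booleans" idea, the sketch below runs against made-up sample rows rather than a real view. The relation name sessions and the columns idle, idle_in_transaction and visible are assumptions for the example; nothing in this thread has settled on them.

```sql
-- Hypothetical sample rows standing in for a view that exposes each state
-- as its own boolean column (relation and column names are illustrative only).
WITH sessions(pid, waiting, idle, idle_in_transaction, visible) AS (
    VALUES
        (101, false, true,  false, true),   -- idle
        (102, false, false, true,  true),   -- idle in transaction
        (103, true,  false, false, true),   -- running, waiting on a lock
        (104, false, false, false, true),   -- running
        (105, false, false, false, false)   -- query text hidden from this user
)
SELECT
    count(*)                                             AS total_count,
    sum(CASE WHEN idle THEN 1 ELSE 0 END)                AS idle_count,
    sum(CASE WHEN idle_in_transaction THEN 1 ELSE 0 END) AS idle_in_transaction_count,
    sum(CASE WHEN waiting THEN 1 ELSE 0 END)             AS waiting_count,
    -- "active" mirrors the decoder above: visible, not waiting, not idle
    sum(CASE WHEN visible AND NOT waiting AND NOT idle
              AND NOT idle_in_transaction THEN 1 ELSE 0 END) AS active_count
FROM sessions;
```

With the states exposed this way, a connection-count query needs no knowledge of the magic strings in current_query.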

The fact that you can only figure all this decoding magic out through 
extensive trial and error, or reading the source code to [the database | 
another monitoring tool], is crazy.  It's a much bigger problem than the 
fact that the pid column is misnamed, and way up on my list of things 
I'm just really tired of doing.  Yes, we could just document all these 
mystery states to help, but they'd still be terrible.

This is a database; let's expose the data in a way that it's easy to 
slice yourself using a database query.  And if we're going to fix 
that--which unfortunately will be breaking it relative to those already 
using the current format--I figure why not bundle the procpid fix into 
that while we're at it.  It's even possible to argue that breaking that 
small thing will draw useful attention to the improvements in other 
parts of the view.  Having your monitoring query break after a version 
upgrade is no fun.  But if investigating why reveals new stuff you 
didn't notice in the release notes, the changes become more 
discoverable, albeit in a somewhat perverse way.

Putting on my stability hat instead of my "make it right" one, maybe 
this really makes sense to expose as a view with a whole new name.  Make 
this new one pg_activity (there's no stats here anyway), keep the old 
one around as pg_stat_activity for a few releases until everyone has 
converted to the new one.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us




Re: procpid?

From
Rainer Pruy
Date:
Following this whole conversation gives the impression that the topic is
going to get lost in a nirvana of personal preferences.

Most suggestions for change, taken on their own, are unlikely to clear the
bar of "justifying a compatibility break".
I wonder whether the actual point really is compatibility.
On closer look, this is more about a change in the paradigm of "system tables".

It seems those were previously crafted with a human "reader" in mind
rather than a programmatic user. What is being requested sounds more like
splitting access to system information into one level that is more
appropriate for programmatic use (with all the basic properties being
explicit) and another level better suited to being read.
E.g. I much prefer reading an "<IDLE> in transaction" at a quick glance
over having to search a column and tell a "t" from an "f"
to find out whether there is a transaction pending or not.

So maybe we need a (new) set of "tables/views" that provide detailed
information designed for programmatic use as a basic interface layer,
and then reconstruct the existing "tables/views" on top of those.

That would allow all required changes to coexist without breaking
compatibility.
It also provides an easier basis for later "extensions" to such
information.

Anybody sticking with the existing interface will not suffer any
incompatibility, while anybody in need of more details and "better"
information may switch over to the new basic layer.

(And I doubt adding that "extra" level will cause problems performance
wise...)
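
A minimal sketch of that layering, using inline sample rows so it can be run anywhere. The names session_base and session_readable and the columns idle and in_transaction are hypothetical, chosen only to show a readable layer being rebuilt from an explicit one.

```sql
-- Hypothetical "programmatic" layer: every property is an explicit column.
WITH session_base(pid, idle, in_transaction, query_text) AS (
    VALUES
        (101, true,  false, CAST(NULL AS text)),
        (102, true,  true,  CAST(NULL AS text)),
        (103, false, true,  'UPDATE accounts SET balance = 0')
),
-- "Human-readable" layer rebuilt on top of the base layer: it folds the
-- flags back into the single text column that the existing view shows.
session_readable AS (
    SELECT pid,
           CASE
               WHEN idle AND in_transaction THEN '<IDLE> in transaction'
               WHEN idle                    THEN '<IDLE>'
               ELSE query_text
           END AS current_query
    FROM session_base
)
SELECT pid, current_query
FROM session_readable
ORDER BY pid;
```

Tools would query the base layer directly, while someone glancing at the readable layer still sees the familiar text.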

Rainer

On 15.06.2011 06:19, Robert Haas wrote:
> On Tue, Jun 14, 2011 at 11:04 PM, Greg Sabino Mullane <greg@turnstep.com> wrote:
>>> For me, the litmus test is whether the change provides enough
>>> improvement that it outweighs the disruption when the user runs into
>>> it.
>> For the procpid that started all of this, the clear answer is no. I'm
>> surprised people seriously considered making this change. It's a
>> historical accident: document and move on.
> I agree with you on this one...
>
>>> This is why I suggested a specific, useful, and commonly requested
>>> (to me at least) change to pg_stat_activity go along with this.
>> +1. The procpid change is silly, but fixing the current_query field
>> would be very useful. You don't know how many times my fingers
>> have typed "WHERE current_query <> '<IDLE>'"
> ...but I'm not even excited about this.  *Maybe* it's worth adding
> another column, but the problem with the existing system is *entirely*
> cosmetic.  The string chosen here is unconfusable with an actual
> query, so we are talking here, as with the procpid -> pid proposal,
> ONLY about saving a few keystrokes when writing queries.  That is a
> pretty thin justification for a compatibility break IMV.
>


Re: procpid?

From
Robert Haas
Date:
On Wed, Jun 15, 2011 at 3:34 AM, Greg Smith <greg@2ndquadrant.com> wrote:
> The whole thing is enormously frustrating, and it's an advocacy problem--it
> contributes to people just starting to become serious about using PostgreSQL
> lowering their opinion of its suitability for their business.  If this is
> what's included for activity monitoring, and it's this terrible, it suggest
> people must not have very high requirements for that.

Well, if we're going to start complaining about the lack of proper
activity monitoring, the problems that you're talking about are just
the tip of the iceberg.  Don't even get me started.

> Putting on my stability hat instead of my "make it right" one, maybe this
> really makes sense to expose as a view with a whole new name.  Make this new
> one pg_activity (there's no stats here anyway), keep the old one around as
> pg_stat_activity for a few releases until everyone has converted to the new
> one.

Now, that's a suggestion I could very possibly get behind.  Though the
fact that it would leave us with pg_activity / pg_stat_replication
seems less than ideal.  Maybe pg_activity isn't the best name
either... bikeshedding time!

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: procpid?

From
Gurjeet Singh
Date:
On Tue, Jun 14, 2011 at 9:50 PM, Bruce Momjian <bruce@momjian.us> wrote:
> Greg Smith wrote:
> > On 06/14/2011 06:00 PM, Tom Lane wrote:
> > > As far as Greg's proposal is concerned, I don't see how a proposed
> > > addition of two columns would justify renaming an existing column.
> > > Additions should not break any sanely-implemented application, but
> > > renamings certainly will.
> > >
> >
> > It's not so much justification as something that makes the inevitable
> > complaints easier to stomach, in terms of not leaving a really bad taste
> > in the user's mouth.  My thinking is that if we're going to mess with
> > pg_stat_activity in a way that breaks something, I'd like to see it
> > completely refactored for better usability in the process.  If code
> > breaks and the resulting investigation by the admin highlights something
> > new, that offsets some of the bad user experience resulting from the
> > breakage.
> >
> > Also, I haven't fully worked whether it makes sense to really change
> > what current_query means if the idle/transaction component of it gets
> > moved to another column.  Would it be better to set current_query to
> > null if you are idle, rather than the way it's currently overloaded with
> > text in that case?  I don't like the way this view works at all, but I'm
> > not sure the best way to change it.  Just changing procpid wouldn't be
> > the only thing on the list though.
>
> Agreed on moving '<IDLE>' and '<IDLE> in transaction' into separate
> fields.  If I had thought of it I would have done it that way years ago.
> (At least I think it was me.)  Using angle brackets to put magic values
> in that field was clearly wrong.

FWIW, I wrote a monitoring query around it like this (the requirement was to not expose the current_query contents).

SELECT datname, procpid, usename, backend_start, xact_start, query_start,
    waiting AS is_waiting, current_query = $$<IDLE>$$ AS is_idle,
    current_query = $$<IDLE> in transaction$$ AS is_idle_in_transaction,
    current_query ilike $$VACUUM%$$ as is_vacuum,
    client_port IS NULL AND (current_query like $$autovacuum:%$$ OR current_query like $$VACUUM%$$) as is_autovacuum,
    now() AS capture_time
FROM pg_catalog.pg_stat_activity

The tricky part was to determine how long a connection has been in the state that it currently is in. Since the various *_start columns are changed only as needed, I had to use the following expression to calculate that.

(capture_time - COALESCE(query_start, xact_start, backend_start))::interval

query_start is changed every time the current_query value changes, but it is NULL if the backend has just started. Similarly, xact_start changes whenever the backend goes into or comes out of a transaction, but it is NULL when the backend has just started. backend_start is never NULL, so we can fall back on that when nothing else is available (i.e. when the backend has just started).

If we separated is_idle and is_idle_in_transaction into separate fields, then we would also need to somehow expose when the backend got into that state, unless we promise to keep true the assumptions that were made when writing the above query (which is not as straightforward as one would expect).
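
The fallback order in that expression can be tried out on sample data. This is only a sketch: the rows are invented, capture_time is a fixed literal rather than now(), and plain timestamp is used instead of timestamptz so that the result does not depend on the session time zone.

```sql
-- Invented rows that exercise each branch of the COALESCE fallback.
WITH activity(procpid, backend_start, xact_start, query_start) AS (
    VALUES
        -- backend has just started: only backend_start is set
        (201, TIMESTAMP '2011-06-15 10:00:00',
              CAST(NULL AS timestamp),
              CAST(NULL AS timestamp)),
        -- made-up row with no query_start, so xact_start is used
        (202, TIMESTAMP '2011-06-15 09:00:00',
              TIMESTAMP '2011-06-15 09:30:00',
              CAST(NULL AS timestamp)),
        -- all three set: query_start wins
        (203, TIMESTAMP '2011-06-15 09:00:00',
              TIMESTAMP '2011-06-15 09:30:00',
              TIMESTAMP '2011-06-15 09:45:00')
),
snapshot AS (
    -- Fixed stand-in for now() AS capture_time in the monitoring query.
    SELECT a.*, TIMESTAMP '2011-06-15 10:05:00' AS capture_time
    FROM activity a
)
SELECT procpid,
       capture_time - COALESCE(query_start, xact_start, backend_start) AS time_in_state
FROM snapshot
ORDER BY procpid;
-- time_in_state: 201 -> 00:05:00, 202 -> 00:35:00, 203 -> 00:20:00
```

The result is only as good as the assumption that the most recently changed *_start column marks the beginning of the current state, which is exactly the promise in question.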

--
Gurjeet Singh
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Re: procpid?

From
Gurjeet Singh
Date:
On Wed, Jun 15, 2011 at 8:47 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Jun 15, 2011 at 3:34 AM, Greg Smith <greg@2ndquadrant.com> wrote:
> > The whole thing is enormously frustrating, and it's an advocacy problem--it
> > contributes to people just starting to become serious about using PostgreSQL
> > lowering their opinion of its suitability for their business.  If this is
> > what's included for activity monitoring, and it's this terrible, it suggest
> > people must not have very high requirements for that.
>
> Well, if we're going to start complaining about the lack of proper
> activity monitoring, the problems that you're talking about are just
> the tip of the iceberg.  Don't even get me started.
>
> > Putting on my stability hat instead of my "make it right" one, maybe this
> > really makes sense to expose as a view with a whole new name.  Make this new
> > one pg_activity (there's no stats here anyway), keep the old one around as
> > pg_stat_activity for a few releases until everyone has converted to the new
> > one.
>
> Now, that's a suggestion I could very possibly get behind.  Though the
> fact that it would leave us with pg_activity / pg_stat_replication
> seems less than ideal.  Maybe pg_activity isn't the best name
> either... bikeshedding time!

Why not expose this new information as functions instead of a new view, like we do for pg_is_in_recovery()? People can use whatever alias they want in the queries they write.

SELECT get_current_query(pid), is_idle(pid), is_idle_in_transaction(pid), transaction_start_time(pid), .... FROM (select procpid as pid FROM pg_stat_activity) AS s;

Then pg_activity (or whatever we name it later) would also be a view on top of these functions.

--
Gurjeet Singh
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Re: procpid?

From
"Joshua D. Drake"
Date:
On 06/14/2011 08:04 PM, Greg Sabino Mullane wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: RIPEMD160
>
>
>> For me, the litmus test is whether the change provides enough
>> improvement that it outweighs the disruption when the user runs into
>> it.
>
> For the procpid that started all of this, the clear answer is no. I'm
> surprised people seriously considered making this change. It's a
> historical accident: document and move on.

It is a bug in consistency: the table pg_locks uses "pid" where 
pg_stat_activity uses "procpid". That is a bug and all bugs are 
accidents. We take a lot of care in fixing bugs.

This isn't just about a few characters in a query, it is about 
consistency and providing an overall more sane user experience. Frankly 
I don't care if we use procpid or pid but it should be one or the other 
not both.

Joshua D. Drake

-- 
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
The PostgreSQL Conference - http://www.postgresqlconference.org/
@cmdpromptinc - @postgresconf - 509-416-6579


Re: procpid?

From
Robert Haas
Date:
On Wed, Jun 15, 2011 at 9:44 AM, Gurjeet Singh <singh.gurjeet@gmail.com> wrote:
> Why not expose this new information as functions instead of a new view, like
> we do for pg_is_in_replication(). People can use whatever alias they want in
> the queries they write.
>
> SELECT get_current_query(pid), is_idle(pid), is_idle_in_transaction(pid),
> transaction_start_time(pid), .... FROM (select procpid as pid FROM
> pg_stat_activity);
>
> Then pg_activity (or whatever we name it later) would also be a view on top
> of these functions.

Well, that would probably be a lot slower, and wouldn't necessarily
deliver as consistent a snapshot of system activity.  It's better to
have one set-returning function that dumps out all the data in a
single pass.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: procpid?

From
Gurjeet Singh
Date:
On Wed, Jun 15, 2011 at 10:31 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Jun 15, 2011 at 9:44 AM, Gurjeet Singh <singh.gurjeet@gmail.com> wrote:
> > Why not expose this new information as functions instead of a new view, like
> > we do for pg_is_in_replication(). People can use whatever alias they want in
> > the queries they write.
> >
> > SELECT get_current_query(pid), is_idle(pid), is_idle_in_transaction(pid),
> > transaction_start_time(pid), .... FROM (select procpid as pid FROM
> > pg_stat_activity);
> >
> > Then pg_activity (or whatever we name it later) would also be a view on top
> > of these functions.
>
> Well, that would probably be a lot slower, and wouldn't necessarily
> deliver as consistent a snapshot of system activity.  It's better to
> have one set-returning function that dumps out all the data in a
> single pass.

I wanted to address the consistency issue in the previous mail, but then decided to leave that for later.

We can provide consistency the same way pg_locks does: take a snapshot on the first request within a transaction, and reuse that snapshot for subsequent calls. In this case we might want to go a bit finer grained by providing a snapshot for every query.

--
Gurjeet Singh
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Re: procpid?

From
Tom Lane
Date:
Gurjeet Singh <singh.gurjeet@gmail.com> writes:
> On Wed, Jun 15, 2011 at 10:31 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Well, that would probably be a lot slower, and wouldn't necessarily
>> deliver as consistent a snapshot of system activity.  It's better to
>> have one set-returning function that dumps out all the data in a
>> single pass.

> I wanted to address consistency issue in the previous mail, but then wanted
> that to be left for later.

> We can provide consistency the same way pg_locks provides; take a snapshot
> on first request within a transaction, and reuse that snapshot for
> subsequent calls. In this case we might want to go a bit finer grained by
> providing a snapshot for every query.

Quite honestly, the implementation mechanism used by the other
statistics views is enormous overkill.  I agree with Robert that I'm not
eager to duplicate that for the activity view, when a simple SRF can get
the job done.
        regards, tom lane


Re: procpid?

From
Alvaro Herrera
Date:
Excerpts from Robert Haas's message of Wed Jun 15 08:47:58 -0400 2011:
> On Wed, Jun 15, 2011 at 3:34 AM, Greg Smith <greg@2ndquadrant.com> wrote:

> > Putting on my stability hat instead of my "make it right" one, maybe this
> > really makes sense to expose as a view with a whole new name.  Make this new
> > one pg_activity (there's no stats here anyway), keep the old one around as
> > pg_stat_activity for a few releases until everyone has converted to the new
> > one.
> 
> Now, that's a suggestion I could very possibly get behind.  Though the
> fact that it would leave us with pg_activity / pg_stat_replication
> seems less than ideal.  Maybe pg_activity isn't the best name
> either... bikeshedding time!

pg_sessions?

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: procpid?

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Excerpts from Robert Haas's message of mié jun 15 08:47:58 -0400 2011:
>> Now, that's a suggestion I could very possibly get behind.  Though the
>> fact that it would leave us with pg_activity / pg_stat_replication
>> seems less than ideal.  Maybe pg_activity isn't the best name
>> either... bikeshedding time!

> pg_sessions?

Yeah.  Or pg_stat_sessions if you want to keep it looking like it's part
of the pg_stat_ family.  (I'm not sure if we do, since it's really a
completely independent facility.  OTOH, if we don't name it that way,
we're kind of bound to move the documentation into the System Views
chapter, whereas it'd be better to keep it where it is.)
        regards, tom lane


Re: procpid?

From
Robert Haas
Date:
On Wed, Jun 15, 2011 at 12:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> Excerpts from Robert Haas's message of mié jun 15 08:47:58 -0400 2011:
>>> Now, that's a suggestion I could very possibly get behind.  Though the
>>> fact that it would leave us with pg_activity / pg_stat_replication
>>> seems less than ideal.  Maybe pg_activity isn't the best name
>>> either... bikeshedding time!
>
>> pg_sessions?
>
> Yeah.  Or pg_stat_sessions if you want to keep it looking like it's part
> of the pg_stat_ family.  (I'm not sure if we do, since it's really a
> completely independent facility.  OTOH, if we don't name it that way,
> we're kind of bound to move the documentation into the System Views
> chapter, whereas it'd be better to keep it where it is.)

I've always found the fact that the system views are documented in two
different places to be somewhat confusing.  It doesn't help that the
documentation for the statistics views is quite a bit less detailed.

At any rate, I like "sessions".  That's what it is, after all.  But I
will note that we had better be darn sure to make all the changes we
want to make in one go, because I dowanna have to create pg_sessions2
(or pg_tessions?) in a year or three.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: procpid?

From
"Greg Sabino Mullane"
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160


> At any rate, I like "sessions".  That's what it is, after all.  But I
> will note that we had better be darn sure to make all the changes we
> want to make in one go, because I dowanna have to create pg_sessions2
> (or pg_tessions?) in a year or three.

Or perhaps pg_connections. Yes, +1 to making things fully backwards 
compatible by keeping pg_stat_activity around but making a better 
designed and better named table (view/SRF/whatever).

Sounds like perhaps a wiki page to start documenting some of our 
monitoring shortcomings? Might as well fix as much as we can in one 
swoop.


- -- 
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201106151246
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAk344ioACgkQvJuQZxSWSshy9wCgnrj4lQkaomsgS55yq9KI0HBl
P2UAoI62Tkt9/U62l0Bxv/KfQUUlL/NF
=aaTL
-----END PGP SIGNATURE-----




Re: procpid?

From
Bernd Helmle
Date:

--On June 15, 2011 16:47:55 +0000 Greg Sabino Mullane <greg@turnstep.com> wrote:

> Or perhaps pg_connections. Yes, +1 to making things fully backwards
> compatible by keeping pg_stat_activity around but making a better
> designed and better named table (view/SRF/whatever).

I thought about that too when reading the thread the first time, but 
"pg_stat_sessions" sounds better. Our documentation also primarily refers to a 
database connection as a "session", I think.

-- 
Thanks
Bernd


Re: procpid?

From
Greg Smith
Date:
On 06/15/2011 04:13 AM, Rainer Pruy wrote:
> I much prefer reading an "<IDLE>  in transaction" on a quick glance
> over having to search a column and recognize a "t" from an "f"
> to find out whether there is a transaction pending or not.
>    

This is a fair observation.  If we provide a second view here that 
reorganizes the data toward something more appropriate for monitoring 
systems to process it, you may be right that the result will be a step 
backwards for making it human-readable.  They may end up being similar, 
co-existing views aimed at different uses, rather than one clearly 
replacing the other one day.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us




Re: procpid?

From
Greg Smith
Date:
Since the CF is upon us and discussion is settling, let's see if I can 
wrap this bikeshedding up into a more concrete proposal that someone can 
return to later.  The ideas floating around have gelled into:

-Add a new pg_stat_sessions function that is implemented similarly to 
pg_stat_activity.  For efficiency and simplicity's sake, internally this 
will use the same sort of SRF UI that pg_stat_get_activity does inside 
src/backend/utils/adt/pgstatfuncs.c.  There will need to be some 
refactoring here to reduce code duplication between that and the new 
function (which will presumably be named pg_stat_get_sessions).

-The process ID field here will be named "pid" to match other system 
views, rather than the current "procpid"

-State information such as whether the session is idle, idle in a 
transaction, or has a query visible to this backend will be presented as 
booleans similar to the current waiting field.  A possible additional 
state to expose is the concept of "active", which ends up being derived 
using logic like "visible && !idle && !idle_transaction && !waiting" in 
some monitoring systems.

-A case could be made for making some of these state fields null, 
instead of true or false, in situations where the session is not visible.  
If you don't have rights to see the connection activity, setting idle, 
idle_transaction, and active all to null may be the right thing to do.  
More future bikeshedding is likely on this part, once an initial patch 
is ready for testing.  I'd want to get some specific tests against the 
common monitoring goals of tools like check_postgres and the Munin 
plug-in to see which implementation makes more sense for them as input 
on that.

-It is still useful to set current_query to descriptive text in the 
cases where the transaction is <IDLE> etc.  That text is not ambiguous 
with a real query, it is useful for a human-readable view, and it 
improves the potential for pg_stat_sessions to fully replace a 
deprecated pg_stat_activity (instead of just co-existing with it).  That 
the query text is overloaded with this information seems agreed to be a 
good thing; it's just that filtering on the state information there 
should not require parsing it.  The additional booleans will handle 
that.  If idle sessions can be filtered using "WHERE NOT idle", whether 
the current_query for them reads "<IDLE>" or is null won't matter to 
typical monitoring use.  Given no strong preference there, using 
"<IDLE>" is both familiar and more human readable.
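
To make the filtering idea above concrete, here is a minimal sketch in Python rather than SQL.  The SessionRow class, its field names, and the sample rows are all hypothetical; nothing here reflects an actual patch.  It assumes "idle" and "idle_transaction" are separate, mutually exclusive booleans, which is only one possible reading of the proposal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionRow:
    """Hypothetical row of the proposed pg_stat_sessions view."""
    pid: int
    visible: bool                 # can this backend see the session's query?
    idle: bool
    idle_transaction: bool
    waiting: bool
    current_query: Optional[str]  # still carries "<IDLE>" for human readers


def is_active(row: SessionRow) -> bool:
    # Derived state: visible && !idle && !idle_transaction && !waiting
    return (row.visible and not row.idle
            and not row.idle_transaction and not row.waiting)


# Made-up sample data, for illustration only.
rows = [
    SessionRow(101, True, True, False, False, "<IDLE>"),
    SessionRow(102, True, False, True, False, "<IDLE> in transaction"),
    SessionRow(103, True, False, False, False, "SELECT 1"),
    SessionRow(104, True, False, False, True, "UPDATE t SET x = 1"),
]

# Equivalent of "WHERE NOT idle": filter on the boolean, never parse the text.
not_idle = [r.pid for r in rows if not r.idle]
active = [r.pid for r in rows if is_active(r)]
```

The point of the sketch is that current_query can stay human-readable while monitoring code never has to inspect it.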

I'll go add this as a TODO now.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us




Re: procpid?

From
Greg Smith
Date:
On 06/15/2011 12:41 PM, Robert Haas wrote:
> But I will note that we had better be darn sure to make all the changes we
> want to make in one go, because I dowanna have to create pg_sessions2
> (or pg_tessions?) in a year or three.
>    

I just added a new section to the TODO to start collecting up some of 
these related ideas into one place:  
http://wiki.postgresql.org/wiki/Todo#Monitoring so we might try to get 
as many as possible all in one go.

The other item on there related to pg_stat_activity that might impact 
this design was adding a column for tracking progress of commands like 
CREATE INDEX and VACUUM (I updated to note CLUSTER falls into that 
category too).  While query progress will always be a hard problem, 
adding a field to store some sort of progress indicator might be useful 
even if it only worked on these two initially.  Anyway, topic for 
another time.

The only other item related to this view on the TODO was "Have 
pg_stat_activity display query strings in the correct client encoding".  
That might be worthwhile to bundle into this rework, but it doesn't seem 
something that impacts the UI such that it must be considered early.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us




Re: procpid?

From
"Greg Sabino Mullane"
Date:

>> Or perhaps pg_connections. Yes, +1 to making things fully backwards
>> compatible by keeping pg_stat_activity around but making a better
>> designed and better named table (view/SRF/whatever).

> I thought about that too when reading the thread the first time, but 
> "pg_stat_sessions" sounds better. Our documentation also primarily refers to a 
> database connection as a "session", i think.

No, this is clearly connections, not sessions. At least based on the items 
in the postgresql.conf file, especially max_connections (probably one of the 
items most closely associated with pg_stat_activity)

-- 
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201106161132
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8




Re: procpid?

From
Tom Lane
Date:
Greg Smith <greg@2ndQuadrant.com> writes:
> The only other item related to this view on the TODO was "Have 
> pg_stat_activity display query strings in the correct client encoding".  
> That might be worthwhile to bundle into this rework, but it doesn't seem 
> something that impacts the UI such that it must be considered early.

That entry is garbled to the point of uselessness anyway, as client
encoding has got exactly zip to do with it.  The point is that another
backend's entry could be in a different *server* encoding, and what do
you do if there's no equivalent character in your encoding?
        regards, tom lane


Re: procpid?

From
Bernd Helmle
Date:

--On 16. Juni 2011 15:33:35 +0000 Greg Sabino Mullane <greg@turnstep.com> wrote:

> No, this is clearly connections, not sessions. At least based on the items
> in the postgresql.conf file, especially max_connections (probably one of the
> items most closely associated with pg_stat_activity)

Well, but it doesn't show database connection(s) only; it also shows what 
actions are currently performed through the various connections on the 
databases and state information about them. I'm not a native English speaker, 
but I have the feeling that "sessions" is better suited for this kind of 
interactive monitoring. I believe Oracle also has a v$session view to query 
various information about what's going on.

-- 
Thanks
Bernd


Re: procpid?

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The point is that another backend's entry could be in a different
> *server* encoding, and what do you do if there's no equivalent
> character in your encoding?
My first thought was that it was just a matter of picking a
character to represent the "unprintable" characters.  My second
thought was that if you don't understand the encoding scheme, you're
not even going to know where the character boundaries are.  :-(
-Kevin
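
A small Python sketch of the boundary problem described above.  The sample string and the choice of UTF-8 versus LATIN1 are assumptions picked purely for illustration; they say nothing about how the backend stores or converts query strings.

```python
# The same text as two backends with different server encodings might store it.
text = "naïve café"
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("latin-1")

# Reading UTF-8 bytes as LATIN1 never raises an error, but every two-byte
# sequence is split into two unrelated characters: the reader has no idea
# where the character boundaries are.
misread = utf8_bytes.decode("latin-1")

# Reading LATIN1 bytes as UTF-8 fails outright unless invalid bytes are
# swapped for a placeholder character (U+FFFD here).
replaced = latin1_bytes.decode("utf-8", errors="replace")

print(repr(misread))
print(repr(replaced))
```

Substituting a placeholder only works in the second direction because the reader can at least detect that the bytes are invalid; in the first direction the damage is silent.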


Re: procpid?

From
Alvaro Herrera
Date:
Excerpts from Greg Sabino Mullane's message of jue jun 16 15:33:35 UTC 2011:
> 
> >> Or perhaps pg_connections. Yes, +1 to making things fully backwards
> >> compatible by keeping pg_stat_activity around but making a better
> >> designed and better named table (view/SRF/whatever).
> 
> > I thought about that too when reading the thread the first time, but 
> > "pg_stat_sessions" sounds better. Our documentation also primarily refers to a 
> > database connection as a "session", i think.
> 
> No, this is clearly connections, not sessions. At least based on the items 
> in the postgresql.conf file, especially max_connections (probably one of the 
> items most closely associated with pg_stat_activity)

That doesn't include autovacuum, though, whereas the new view would.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: procpid?

From
Bruce Momjian
Date:
Greg Smith wrote:
> -It is still useful to set current_query to descriptive text in the 
> cases where the transaction is <IDLE> etc.  That text is not ambiguous 
> with a real query, it is useful for a human-readable view, and it 
> improves the potential for pg_stat_sessions to fully replace a 
> deprecated pg_stat_activity (instead of just co-existing with it).  That 
> the query text is overloaded with this information seems agreed to be a 
> good thing; it's just that filtering on the state information there 
> should not require parsing it.  The additional booleans will handle 
> that.  If idle sessions can be filtered using "WHERE NOT idle", whether 
> the current_query for them reads "<IDLE>" or is null won't matter to 
> typical monitoring use.  Given no strong preference there, using 
> "<IDLE>" is both familiar and more human readable.

Uh, if we are going to do that, why not just add the boolean columns to
the existing view?  Clearly renaming procpid isn't worth creating
another view.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: procpid?

From
Greg Smith
Date:
On 06/16/2011 05:27 PM, Bruce Momjian wrote:
> Greg Smith wrote:
>    
>> -It is still useful to set current_query to descriptive text in the
>> cases where the transaction is <IDLE> etc.
>>      
> Uh, if we are going to do that, why not just add the boolean columns to
> the existing view?  Clearly renaming procpid isn't worth creating
> another view.
>    

I'm not completely set on this either way; that's why I suggested a 
study that digs into typical monitoring system queries would be useful.  
Even the current view is pushing the limits for how much you can put 
into something that intends to be human-readable though.  Adding a new 
pile of columns to it has some downsides there.

I hadn't ever tried to write down everything I'd like to see changed 
here until this week, so there may be further column churn that 
justifies a new view too.  I think the whole idea needs to get chewed on 
a bit more.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us




Re: procpid?

From
Magnus Hagander
Date:
On Fri, Jun 17, 2011 at 06:39, Greg Smith <greg@2ndquadrant.com> wrote:
> On 06/16/2011 05:27 PM, Bruce Momjian wrote:
>>
>> Greg Smith wrote:
>>
>>>
>>> -It is still useful to set current_query to descriptive text in the
>>> cases where the transaction is <IDLE> etc.
>>>
>>
>> Uh, if we are going to do that, why not just add the boolean columns to
>> the existing view?  Clearly renaming procpid isn't worth creating
>> another view.
>>
>
> I'm not completely set on this either way; that's why I suggested a study
> that digs into typical monitoring system queries would be useful.  Even the
> current view is pushing the limits for how much you can put into something
> that intends to be human-readable though.  Adding a new pile of columns to
> it has some downsides there.

Is it intended to be human-readable? And human-readable without
specifying which part you want? It's already way too wide to fit in
most terminals - and has been for years. You need to use \x unless you
specify the fields.

And if you want a "simpler version", why not just add all the columns
we need to the existing one, and then create a regular VIEW over it
that shows just the most common columns? But I still think you're
going to have a hard time making even that narrow enough to be easily
consumable - but you could certainly remove things like usesysid and
datid which are mainly useful only for JOINing to other stuff.


--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: procpid?

From
Jim Nasby
Date:
On Jun 16, 2011, at 9:31 AM, Greg Smith wrote:
> -A case could be made for making some of these state fields null,
> instead true or false, in situations where the session is not visible.
> If you don't have rights to see the connection activity, setting idle,
> idle_transaction, and active all to null may be the right thing to do.
> More future bikeshedding is likely on this part, once an initial patch
> is ready for testing.  I'd want to get some specific tests against the
> common monitoring goals of tools like check_postgres and the Munin
> plug-in to see which implementation makes more sense for them as input
> on that.

ISTM this should be driven by what data we actually expose. If we're
willing to expose actual information for idle, idle_transaction and
waiting for backends that you don't have permission to see the query
for, then we should expose the actual information (I personally think
this would be useful).

OTOH, if we are not willing to expose that information, then we should
certainly set those fields to null instead of some default value.
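
A rough illustration of why null and a default value behave differently for monitoring queries, sketched in Python.  The rows and the sql_not helper are hypothetical; None stands in for SQL NULL on a session whose state the viewer is not allowed to see.

```python
from typing import Optional

# Hypothetical rows: (pid, idle).  idle is None when the viewer lacks
# permission to see that session's state.
rows = [(201, True), (202, False), (203, None)]


def sql_not(value: Optional[bool]) -> Optional[bool]:
    """SQL three-valued NOT: NOT NULL is NULL."""
    return None if value is None else not value


# WHERE keeps a row only when the condition is true, so NULL rows drop out
# of both "WHERE idle" and "WHERE NOT idle".
where_idle = [pid for pid, idle in rows if idle is True]
where_not_idle = [pid for pid, idle in rows if sql_not(idle) is True]

# The hidden sessions have to be asked for explicitly ("WHERE idle IS NULL").
hidden = [pid for pid, idle in rows if idle is None]
```

With a default of false instead of null, session 203 would silently be counted as not idle, which is the outcome the null approach avoids.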
--
Jim C. Nasby, Database Architect                   jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net