On Mon, 27 Mar 2006, Magnus Hagander wrote:
>>> It's automatically reproduced every 10 minutes. There are two
>>> possibilities:
>>
>> 'k, I'm seeing processes running since 2pm Sunday ... so
>> "every 10 minutes", what exactly is happening?
>>
>> # ps aux | grep btlaunchmany | grep Sun02 | wc -l
>> 36
>
> Yeah. That's a reasonably normal amount of timing.
> What happens is that the script detects a lot of btlaunchmany that are
> suddenly gone. Either they are still there, and not showing up, or they
> are dead. It could well be that they've crashed.
>
> BTW, I've disabled the cronjob now since it sends an email to the slaves
> list every 10 minutes :-) So right now, nothing happens.
>
>
>>> 2) They are no longer reported in a way that Proc::ProcessTable can
>>> read. See the code at
>>> http://gborg.postgresql.org/cgi-bin/cvsweb.cgi/portal/tools/bt/updatetorrents.pl?rev=1.8;cvsroot=pgweb
>>> for what we're trying to do. It is *possible* this is done because the
>>> processes are swapped out, though I *think* I tested the code and it
>>> handled that.
>>
>> 'k, first ... "processes are swapped out" ... unless
>> something really odd is happening, I've never heard of a
>> process swapped out that is removed from the process table
>> ... at least not under FreeBSD ...
>
> Nope, that would be really weird. But it could well be a bug in the perl
> library Proc::ProcessTable as well.
Or, it could be not listing 'non-reaped (ie. Zombie)' processes ... but,
the ps listing doesn't seem to be marking them as Zombie ...
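
One rough way to check would be to tally the STAT column for the matching
processes. This is only a sketch: count_states is a made-up helper name, it
assumes "PID STAT COMMAND" input such as "ps -axo pid,stat,command" produces,
and it assumes the first letter of STAT is the run state (Z for a zombie).

```shell
# Hypothetical helper: count processes whose ps line mentions a name,
# grouped by the first letter of the STAT column.
# Reads "PID STAT COMMAND..." lines on stdin.
count_states() {
    # $1 is the name to look for, e.g. btlaunchmany
    awk -v name="$1" '
        index($0, name) {
            # First letter of STAT is the run state; Z marks a zombie.
            state = substr($2, 1, 1)
            count[state]++
        }
        END {
            for (s in count) printf "%s %d\n", s, count[s]
        }
    ' | sort
}

# Usage (not run here):
#   ps -axo pid,stat,command | count_states btlaunchmany
```

Two caveats: the awk process itself carries the name on its own command line,
so a live run may count one extra entry, and on some systems a zombie's
command column may no longer show the original name, in which case matching
by name would miss it.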
>> If you can be online tomorrow afternoon ... ? that way, when
>> I bring it back up, you are able to look over it for any issues ...
>
> Afternoon your time or mine ;-)
I *think* we are 4 hours off ... always makes it fun to schedule, I don't
want to make it too late for you, but not too early for me :)
How about 6pm-ish your time, which I think is around 2pm mine?
> Oh. Could *that* be where the problem comes from? I thought the ()
> indicated they were swapped out, I think it does on linux :-)
It could be ... only place I've ever noticed this is on a system where the
child processes weren't being properly killed off though ...
----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org Yahoo!: yscrappy ICQ: 7615664