Thread: Server instrumentation for 8.1

Server instrumentation for 8.1

From
Andreas Pflug
Date:
There's still a lengthy discussion going on whether it's a good idea to 
add a forth way to read pgsql's schema (pg_* tables, pg_* views, 
information_schema, did I miss one?), but I'd like to see helper 
functions for issues *not* covered in the core package.

- dbsize has been in contrib for a long time, though it appears to me as 
quite a basic functionality to find out about storage needs.

- The superuser only generic file functions in the admin package have 
been posted for 8.0, but where (more or less ) silently dropped. These 
functions allow pgadmin to display the server logs, as well as editing 
pg_hba.conf and postgresql.conf without console access to 
whatever-pgsql-is-running-on.  I'd like to see this at least as contrib 
module (the functions are probably safer than pl_sh).

Both these modules are bundled with the pgsql win32 installer, and are 
installed by default. Both are supported by (at least) pgAdmin.


- There was a pg_kill_backend function in pre-8.0, but it was dropped 
because "it's too dangerous". Incidentially, I've been in the situation 
more than once where I needed to kill a backend process that was running 
wild; alternatively, I'd have to shutdown the whole server. I had to do 
this on the linux console with kill -9 (fortunately I did have access), 
or using the win32 task manager (same). This appears even more error 
prone to me than to point to the malicious process and kill it (through 
pgadmin/pg_kill_backend)

- We don't have a profiling facility to tap an individual backend for a 
limited period to find out what the client is doing there, so we need to 
use log_statement for this (I'd like to work on profiling, but I didn't 
find the time so far). Consequently, we have to deal with long logfiles, 
containing much stuff we don't need. In the past, I found it to be very 
helpful if a fresh logfile could be used (on a private installation, 
stop/start server), that's why my logfile process implementation did 
include a logfile rotation trigger functionality. Tom didn't need it, so 
he dropped it. I'd opt for re-adding it again.

Yes yes I know, all of these can be done by a local administrator with 
console access and an editor and cmd line tools, but there are indeed 
people that do *not* have console access, or like to use decent tools....

Regards,
Andreas


Re: Server instrumentation for 8.1

From
"Dave Page"
Date:

> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Andreas Pflug
> Sent: 11 May 2005 17:44
> To: pgsql-hackers@postgresql.org
> Subject: [HACKERS] Server instrumentation for 8.1
>
> There's still a lengthy discussion going on whether it's a
> good idea to
> add a forth way to read pgsql's schema (pg_* tables, pg_* views,
> information_schema, did I miss one?), but I'd like to see helper
> functions for issues *not* covered in the core package.

I was going to write pretty much the same message - thanks for saving me
the time!

> - dbsize has been in contrib for a long time, though it
> appears to me as
> quite a basic functionality to find out about storage needs.

Agreed.

> - The superuser only generic file functions in the admin package have
> been posted for 8.0, but where (more or less ) silently
> dropped. These
> functions allow pgadmin to display the server logs, as well
> as editing
> pg_hba.conf and postgresql.conf without console access to
> whatever-pgsql-is-running-on.  I'd like to see this at least
> as contrib
> module (the functions are probably safer than pl_sh).
>
> Both these modules are bundled with the pgsql win32
> installer, and are
> installed by default. Both are supported by (at least) pgAdmin.

I would like to see these as permanent additions to the server. They are
useful functions that allow functionality to be included in interfaces
like pgAdmin that any user coming from MS SQL or other DBMSs would
probably expect to find. For anyone wanting to take a look, the module
can be found in our shiny new Subversion repo at
http://svn.pgadmin.org/cgi-bin/viewcvs.cgi/trunk/pgadmin3/xtra/admin/

>
> - There was a pg_kill_backend function in pre-8.0, but it was dropped
> because "it's too dangerous". Incidentially, I've been in the
> situation
> more than once where I needed to kill a backend process that
> was running
> wild; alternatively, I'd have to shutdown the whole server. I
> had to do
> this on the linux console with kill -9 (fortunately I did
> have access),
> or using the win32 task manager (same). This appears even more error
> prone to me than to point to the malicious process and kill
> it (through
> pgadmin/pg_kill_backend)

This is also essential functionality, though only if it can be made safe
imo.

> - We don't have a profiling facility to tap an individual
> backend for a
> limited period to find out what the client is doing there, so
> we need to
> use log_statement for this (I'd like to work on profiling,
> but I didn't
> find the time so far). Consequently, we have to deal with
> long logfiles,
> containing much stuff we don't need. In the past, I found it
> to be very
> helpful if a fresh logfile could be used (on a private installation,
> stop/start server), that's why my logfile process implementation did
> include a logfile rotation trigger functionality. Tom didn't
> need it, so
> he dropped it. I'd opt for re-adding it again.

Yes, I ran into exactly this problem this morning as it happens when
tracking down an obscure bug in some code that couldn't easily be
debugged.

Now I know you're all thinking 'oh yeah, obviously the pgAdmin team are
putting on a united front', but honestly, I knew nothing about Andreas'
email until I saw it, and he knew nothing of my intention to write one!
:-)

Regards, Dave.


Re: Server instrumentation for 8.1

From
Andrew Sullivan
Date:
On Wed, May 11, 2005 at 04:44:21PM +0000, Andreas Pflug wrote:
> 
> Yes yes I know, all of these can be done by a local administrator with 
> console access and an editor and cmd line tools, but there are indeed 
> people that do *not* have console access, or like to use decent tools....

Is there a reason they couldn't be bundled into a separate package,
and either put in contrib/ or (my preference) put on gforge or
whatever?  The less-is-more approach in the default source seems to
me to be a good thing.  I'm not convinced that packaged systems
should ship that way -- maybe these should be included in desktop
systems -- but enabled-by-default for many of these things seems to
me to be too dangerous.

A

-- 
Andrew Sullivan  | ajs@crankycanuck.ca
I remember when computers were frustrating because they *did* exactly what 
you told them to.  That actually seems sort of quaint now.    --J.D. Baldwin


Re: Server instrumentation for 8.1

From
"Jim C. Nasby"
Date:
On Wed, May 11, 2005 at 04:44:21PM +0000, Andreas Pflug wrote:
> There's still a lengthy discussion going on whether it's a good idea to 
> add a forth way to read pgsql's schema (pg_* tables, pg_* views, 
> information_schema, did I miss one?), but I'd like to see helper 
> functions for issues *not* covered in the core package.
> 
> - dbsize has been in contrib for a long time, though it appears to me as 
> quite a basic functionality to find out about storage needs.

FWIW, I believe the new system views cover all the dbsize cases.
-- 
Jim C. Nasby, Database Consultant               decibel@decibel.org 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"


Re: Server instrumentation for 8.1

From
Josh Berkus
Date:
Andreas,

I think you bring up some good points, but I also think that each package you 
propose needs to be dealt with individually.

> - dbsize has been in contrib for a long time, though it appears to me as
> quite a basic functionality to find out about storage needs.

Although not needed so much if the new system views are approved; we have a 
view that calculates database size.  Not that multiple options isn't cool.

> - The superuser only generic file functions in the admin package have
> been posted for 8.0, but where (more or less ) silently dropped. These
> functions allow pgadmin to display the server logs, as well as editing
> pg_hba.conf and postgresql.conf without console access to
> whatever-pgsql-is-running-on.  I'd like to see this at least as contrib
> module (the functions are probably safer than pl_sh).

Heck, I didn't even see these.  I was going to write some in pgperl for my own 
use.  These seem potentially very dangerous though, so we wouldn't want them 
installed by default.

> - There was a pg_kill_backend function in pre-8.0, but it was dropped
> because "it's too dangerous". Incidentially, I've been in the situation
> more than once where I needed to kill a backend process that was running
> wild; alternatively, I'd have to shutdown the whole server. I had to do
> this on the linux console with kill -9 (fortunately I did have access),
> or using the win32 task manager (same). This appears even more error
> prone to me than to point to the malicious process and kill it (through
> pgadmin/pg_kill_backend)

Certainly.  But this was dropped because Tom couldn't get the bugs out (as I 
recall) and make it "safe" to use, even for the superuser.  You could take a 
stab at fixing it.   Also, if this were an "enable-at-build-time" option, it 
would also help defuse the security argument, since it wouldn't necessarily 
be installed.

> - We don't have a profiling facility to tap an individual backend for a
> limited period to find out what the client is doing there, so we need to
> use log_statement for this (I'd like to work on profiling, but I didn't
> find the time so far). Consequently, we have to deal with long logfiles,
> containing much stuff we don't need. In the past, I found it to be very
> helpful if a fresh logfile could be used (on a private installation,
> stop/start server), that's why my logfile process implementation did
> include a logfile rotation trigger functionality. Tom didn't need it, so
> he dropped it. I'd opt for re-adding it again.

+1

> Yes yes I know, all of these can be done by a local administrator with
> console access and an editor and cmd line tools, but there are indeed
> people that do *not* have console access, or like to use decent tools....

To support Andrew's assertion, automated server room administration tools 
(like Hyperic and Embarcadero) could really use the above tools.  

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco


Re: Server instrumentation for 8.1

From
Andrew - Supernews
Date:
On 2005-05-11, "Jim C. Nasby" <decibel@decibel.org> wrote:
> On Wed, May 11, 2005 at 04:44:21PM +0000, Andreas Pflug wrote:
>> There's still a lengthy discussion going on whether it's a good idea to 
>> add a forth way to read pgsql's schema (pg_* tables, pg_* views, 
>> information_schema, did I miss one?), but I'd like to see helper 
>> functions for issues *not* covered in the core package.
>> 
>> - dbsize has been in contrib for a long time, though it appears to me as 
>> quite a basic functionality to find out about storage needs.
>
> FWIW, I believe the new system views cover all the dbsize cases.

dbsize looks at the actual size of files on disk; newsysviews does not,
it shows estimated sizes as taken from relpages. newsysviews doesn't
allow you to see size info for databases you're not connected to.

However, newsysviews knows about toast tables, and gives a breakdown of
table size by main storage, toast and index.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services


Re: Server instrumentation for 8.1

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
>> - The superuser only generic file functions in the admin package have
>> been posted for 8.0, but where (more or less ) silently dropped. These
>> functions allow pgadmin to display the server logs, as well as editing
>> pg_hba.conf and postgresql.conf without console access to
>> whatever-pgsql-is-running-on.  I'd like to see this at least as contrib
>> module (the functions are probably safer than pl_sh).

> Heck, I didn't even see these.  I was going to write some in pgperl for my own 
> use.  These seem potentially very dangerous though, so we wouldn't want them 
> installed by default.

My recollection is that they weren't "silently dropped", they were
explicitly rejected after much discussion because of security worries
(and possibly other concerns, I don't recall for sure anymore).
        regards, tom lane


Re: Server instrumentation for 8.1

From
"Dave Page"
Date:

> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of
> Andrew Sullivan
> Sent: 11 May 2005 21:04
> To: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Server instrumentation for 8.1
>
> On Wed, May 11, 2005 at 04:44:21PM +0000, Andreas Pflug wrote:
> >
> > Yes yes I know, all of these can be done by a local
> administrator with
> > console access and an editor and cmd line tools, but there
> are indeed
> > people that do *not* have console access, or like to use
> decent tools....
>
> Is there a reason they couldn't be bundled into a separate package,
> and either put in contrib/ or (my preference) put on gforge or
> whatever?  The less-is-more approach in the default source seems to
> me to be a good thing.  I'm not convinced that packaged systems
> should ship that way -- maybe these should be included in desktop
> systems -- but enabled-by-default for many of these things seems to
> me to be too dangerous.

These functions are all superuser only where required, and are used to
provide support for features expected as standard by many people.
Keeping them in separate packages is a) inconvenient and b) could have a
negative effect on our reputation, because people will say things like
"yes, PostgreSQL can do XYZ as SQL or Oracle does, but you need to
install this contrib module, this set of functions from www.pgadmin.org
just to view your logfiles and see how big your database is...".

IMO, the functions we are talking about are in a different league from
things like PLs - it makes sense to download them as required as they
are entirely application specific. It doesn't make sense to require
additional components to edit the server config files and view logs etc
if your preferred interface just happens to be a GUI.

Regards, Dave


Re: Server instrumentation for 8.1

From
Andreas Pflug
Date:
Josh Berkus wrote:

>>- dbsize has been in contrib for a long time, though it appears to me as
>>quite a basic functionality to find out about storage needs.
> 
> 
> Although not needed so much if the new system views are approved; we have a 
> view that calculates database size. 

First, as some other msg states the views will estimate the sizes, 
dbsize uses actual file sizes. Second, in contrast to CKL, I would *not* 
use these fancy new system views, because they mean yet another 
dependency for pgAdmin.


> Not that multiple options isn't cool.

What do you mean? Those views aren't an option for me.


> 
>>- The superuser only generic file functions in the admin package have
>>been posted for 8.0, but where (more or less ) silently dropped. These
>>functions allow pgadmin to display the server logs, as well as editing
>>pg_hba.conf and postgresql.conf without console access to
>>whatever-pgsql-is-running-on.  I'd like to see this at least as contrib
>>module (the functions are probably safer than pl_sh).
> 
> 
> Heck, I didn't even see these.  I was going to write some in pgperl for my own 
> use.  These seem potentially very dangerous though, so we wouldn't want them 
> installed by default.

Not more dangerous than "drop table pg_class". The file functions are 
restricted to the cluster directory.


>>- There was a pg_kill_backend function in pre-8.0, but it was dropped
>>because "it's too dangerous". Incidentially, I've been in the situation
>>more than once where I needed to kill a backend process that was running
>>wild; alternatively, I'd have to shutdown the whole server. I had to do
>>this on the linux console with kill -9 (fortunately I did have access),
>>or using the win32 task manager (same). This appears even more error
>>prone to me than to point to the malicious process and kill it (through
>>pgadmin/pg_kill_backend)
> 
> 
> Certainly.  But this was dropped because Tom couldn't get the bugs out (as I 
> recall) and make it "safe" to use, even for the superuser. 

Hm, don't remember that, AFAIR it was removed after a discussion between 
Tom and Dave that kill -9 would stress a code path that still isn't 
elaborated too much. But in the situations mentioned above, I took the 
risk and would like a function for that.


Regards,
Andreas


Re: Server instrumentation for 8.1

From
Andreas Pflug
Date:
Tom Lane wrote:
> Josh Berkus <josh@agliodbs.com> writes:
> 
>>>- The superuser only generic file functions in the admin package have
>>>been posted for 8.0, but where (more or less ) silently dropped. These
>>>functions allow pgadmin to display the server logs, as well as editing
>>>pg_hba.conf and postgresql.conf without console access to
>>>whatever-pgsql-is-running-on.  I'd like to see this at least as contrib
>>>module (the functions are probably safer than pl_sh).
> 
> 
>>Heck, I didn't even see these.  I was going to write some in pgperl for my own 
>>use.  These seem potentially very dangerous though, so we wouldn't want them 
>>installed by default.
> 
> 
> My recollection is that they weren't "silently dropped", they were
> explicitly rejected after much discussion because of security worries

These functions were thoroughly discussed between Bruce, you and me, and 
I fixed all issues that came up until a point where Bruce would agree 
the functions to be reasonably safe. But it condensed down to something 
like "Tom doesn't like it" at the end. The functions are not only 
superuser only, but also restricted to the db cluster directory, making 
them no more dangerous than "drop table pg_class".

Regards,
Andreas


Re: Server instrumentation for 8.1

From
Andreas Pflug
Date:
Andrew - Supernews wrote:

> 
> dbsize looks at the actual size of files on disk; newsysviews does not,
> it shows estimated sizes as taken from relpages.

Which shows *net* size only, not actual size because non-vacuumed rows 
are not covered. It is correct after a vacuum full only.

> newsysviews doesn't
> allow you to see size info for databases you're not connected to.
> 
> However, newsysviews knows about toast tables, and gives a breakdown of
> table size by main storage, toast and index.

pg_relation_size allows checking of file sizes for these as well, and 
pgAdmin will show toast and index sizes.

Regards,
Andreas




Re: Server instrumentation for 8.1

From
"Magnus Hagander"
Date:
> >>- There was a pg_kill_backend function in pre-8.0, but it
> was dropped
> >>because "it's too dangerous". Incidentially, I've been in
> the situation
> >>more than once where I needed to kill a backend process
> that was running
> >>wild; alternatively, I'd have to shutdown the whole server.
> I had to do
> >>this on the linux console with kill -9 (fortunately I did
> have access),
> >>or using the win32 task manager (same). This appears even more error
> >>prone to me than to point to the malicious process and kill
> it (through
> >>pgadmin/pg_kill_backend)
> >
> >
> > Certainly.  But this was dropped because Tom couldn't get
> the bugs out (as I
> > recall) and make it "safe" to use, even for the superuser.
>
> Hm, don't remember that, AFAIR it was removed after a
> discussion between
> Tom and Dave that kill -9 would stress a code path that still isn't
> elaborated too much. But in the situations mentioned above, I
> took the
> risk and would like a function for that.

Not kill -9. Kill -9 is "safe" because it causes a complete restart of
the postmaster (it's the same as a backend crash, really). Kill -INT is
also safe, because it does a simlpe query cancel.

kill -TERM is the unsafe one. It is also the one most people use.

//Magnus


Re: Server instrumentation for 8.1

From
Andreas Pflug
Date:
Magnus Hagander wrote:


> 
> 
> Not kill -9. Kill -9 is "safe" because it causes a complete restart of
> the postmaster (it's the same as a backend crash, really). Kill -INT is
> also safe, because it does a simlpe query cancel. 

I don't recall exactly; AFAIR this was discussed between Dave and Tom. 
Actually, *both* should be accessible (maybe with some restriction 
making it difficult to call it from an sql cmd line)

Regards,
Andreas


Re: Server instrumentation for 8.1

From
Andrew - Supernews
Date:
On 2005-05-12, Andreas Pflug <pgadmin@pse-consulting.de> wrote:
>> These seem potentially very dangerous though, so we wouldn't want them 
>> installed by default.
>
> Not more dangerous than "drop table pg_class".

Have you ever tried that?

test=# drop table pg_class;
ERROR:  permission denied: "pg_class" is a system catalog

That said, there are several functions built-in already that allow the
superuser to write files (COPY TO and lo_export being the main ones).

>>>- There was a pg_kill_backend function in pre-8.0, but it was dropped
>>>because "it's too dangerous". Incidentially, I've been in the situation
>>>more than once where I needed to kill a backend process that was running
>>>wild; alternatively, I'd have to shutdown the whole server. I had to do
>>>this on the linux console with kill -9 (fortunately I did have access),
>>>or using the win32 task manager (same). This appears even more error
>>>prone to me than to point to the malicious process and kill it (through
>>>pgadmin/pg_kill_backend)
>> 
>> Certainly.  But this was dropped because Tom couldn't get the bugs out
>> (as I recall) and make it "safe" to use, even for the superuser. 
>
> Hm, don't remember that, AFAIR it was removed after a discussion between 
> Tom and Dave that kill -9 would stress a code path that still isn't 
> elaborated too much. But in the situations mentioned above, I took the 
> risk and would like a function for that.

There seems to be some confusion here - kill -9 on a single backend is
unlikely to ever be safe, since the backend could be in the middle of
updates to shared memory at the time, and while it is certainly possible
to code it so that cleanup can still be done in this case, the current
code certainly does not try. (I've done it for a database I wrote myself
in the past: what's needed to make it feasible is that all shared memory
updates, without exception, must either be atomic replacements of values
of type volatile sig_atomic_t or must be updates to data structures that
are already known to be in-use by the specific backend and which can be
cleaned up even if they are in an internally corrupt state. This is
unlikely to be either possible or desirable for postgres.)

What currently happens is that backends respond to kill -15 (_NOT_ -9)
by cleaning up and exiting. This code path is used for implementing the
stop -mfast option, which means that as it currently exists, the cleanup
only has to be good enough to let other backends get out of critical
sections and complete their own rollback-and-exit safely.

(I've never tried it, but I'd expect the postmaster to respond to a
kill -9 of an individual backend by doing a panic shutdown and restart
anyway, so there's no advantage to doing that rather than doing, say,
restart -mfast or -mimmediate. I _have_ on very rare occasions done
kill -15 on individual backends and got away with it, usually when those
backends were idle anyway, but I'd never recommend it as a routine
technique without a lot more confidence in the correctness of the code
in question.)

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services


Re: Server instrumentation for 8.1

From
"Magnus Hagander"
Date:
> > Not kill -9. Kill -9 is "safe" because it causes a complete
> restart of
> > the postmaster (it's the same as a backend crash, really).
> Kill -INT
> > is also safe, because it does a simlpe query cancel.
>
> I don't recall exactly; AFAIR this was discussed between Dave
> and Tom.
> Actually, *both* should be accessible (maybe with some
> restriction making it difficult to call it from an sql cmd line)

Actually, I take the blame for starging that discussion - it started
when I craeted the pg_cancel_backend / pg_kill_backend questions.

Now, I *want* these functions. But I want them to be safe :-) (I want
kill -TERM on the commandline to be safe as well!)

//Magnus


Re: Server instrumentation for 8.1

From
Andrew - Supernews
Date:
On 2005-05-12, Andreas Pflug <pgadmin@pse-consulting.de> wrote:
> Andrew - Supernews wrote:
>> dbsize looks at the actual size of files on disk; newsysviews does not,
>> it shows estimated sizes as taken from relpages.
>
> Which shows *net* size only, not actual size because non-vacuumed rows 
> are not covered. It is correct after a vacuum full only.

nonsense

relpages is updated from the value of RelationGetNumberOfBlocks(rel) which
is definitive (it gets the value from smgr which gets it from the physical
file sizes); the only inaccuracy is that it is correct only as of the time
that the pg_class row was last updated (as done by any VACUUM, any ANALYZE,
any CLUSTER or any CREATE INDEX on the table, at minimum).

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services


Re: Server instrumentation for 8.1

From
Alvaro Herrera
Date:
On Thu, May 12, 2005 at 10:55:22AM +0200, Magnus Hagander wrote:

> Not kill -9. Kill -9 is "safe" because it causes a complete restart of
> the postmaster (it's the same as a backend crash, really). Kill -INT is
> also safe, because it does a simlpe query cancel. 

Hmm, would it be possible to use another signal for "cancel the current
query and enter a state where no queries are accepted"?  That would mean
driving the backend to a known idle state, allowing further action to
end the connection.  From the client perspective, the connection is no
longer useful anyway, so he can only disconnect.

-- 
Alvaro Herrera (<alvherre[a]surnet.cl>)
"In a specialized industrial society, it would be a disaster
to have kids running around loose." (Paul Graham)


Re: Server instrumentation for 8.1

From
"Magnus Hagander"
Date:
> > Not kill -9. Kill -9 is "safe" because it causes a complete
> restart of
> > the postmaster (it's the same as a backend crash, really).
> Kill -INT
> > is also safe, because it does a simlpe query cancel.
>
> Hmm, would it be possible to use another signal for "cancel
> the current query and enter a state where no queries are
> accepted"?  That would mean driving the backend to a known
> idle state, allowing further action to end the connection.
> From the client perspective, the connection is no longer
> useful anyway, so he can only disconnect.

AFAIK, there are no signals left to use...

Another thought I had along that line was use a different signal to
simply do a "query cancel" and set a global flag that is more or less
"get out when you're done with query cancel". Then if that flag is set,
just close the connection and proceed as if the client dropped the
connection - that has to be a well tested codepath. But it comes down to
the fact that we're out of signals.

//Magnus


Re: Server instrumentation for 8.1

From
Tom Lane
Date:
Andrew - Supernews <andrew+nonews@supernews.com> writes:
> What currently happens is that backends respond to kill -15 (_NOT_ -9)
> by cleaning up and exiting. This code path is used for implementing the
> stop -mfast option, which means that as it currently exists, the cleanup
> only has to be good enough to let other backends get out of critical
> sections and complete their own rollback-and-exit safely.

Exactly.  In theory it probably works fine to allow one backend to exit
via kill -TERM, but it cannot be claimed that that behavior has been
tested to any significant extent --- "fast" shutdown is not stressing it
in the same way.

I think this is largely a question of someone doing a significant amount
of stress testing: gun live server processes with "kill -TERM" in an
active system, and keep an eye out for resource leaks, held locks, and
so on.  It would be more convincing if the processes getting zapped are
executing a wide variety of SQL, too --- I'd not feel very confident
given only tests of killing, say, pgbench threads.
        regards, tom lane


Re: Server instrumentation for 8.1

From
Tom Lane
Date:
"Magnus Hagander" <mha@sollentuna.net> writes:
> Another thought I had along that line was use a different signal to
> simply do a "query cancel" and set a global flag that is more or less
> "get out when you're done with query cancel". Then if that flag is set,
> just close the connection and proceed as if the client dropped the
> connection - that has to be a well tested codepath.

This is pretty much exactly what kill -TERM does today, and the point is
that the code path has only been extensively tested in the context of
database-wide shutdown.  No one can honestly say that they are sure
there are no resource leaks, locks left unreleased, stuff like that.
That kind of problem wouldn't be visible after a shutdown, but it will
become visible if backends are killed individually with -TERM.

Now in theory there are no bugs and this'll work fine.  What disturbs me
is the lack of testing by anyone who knows what to look for ...
        regards, tom lane


Re: Server instrumentation for 8.1

From
Josh Berkus
Date:
Andreas,

> First, as some other msg states the views will estimate the sizes,
> dbsize uses actual file sizes. Second, in contrast to CKL, I would *not*
> use these fancy new system views, because they mean yet another
> dependency for pgAdmin.

<grin> I like that.  You're in favor of including the server tools because 
you, personally, use them.  You're against including the new system views 
because you, personally, won't use them.   Do I misunderstand you, or are you 
failing to put things in a broader context?

-- 
Josh Berkus
Aglio Database Solutions
San Francisco


Re: Server instrumentation for 8.1

From
Robert Treat
Date:
On Thursday 12 May 2005 10:24, Tom Lane wrote:
> Andrew - Supernews <andrew+nonews@supernews.com> writes:
> > What currently happens is that backends respond to kill -15 (_NOT_ -9)
> > by cleaning up and exiting. This code path is used for implementing the
> > stop -mfast option, which means that as it currently exists, the cleanup
> > only has to be good enough to let other backends get out of critical
> > sections and complete their own rollback-and-exit safely.
>
> Exactly.  In theory it probably works fine to allow one backend to exit
> via kill -TERM, but it cannot be claimed that that behavior has been
> tested to any significant extent --- "fast" shutdown is not stressing it
> in the same way.
>
> I think this is largely a question of someone doing a significant amount
> of stress testing: gun live server processes with "kill -TERM" in an
> active system, and keep an eye out for resource leaks, held locks, and
> so on.  It would be more convincing if the processes getting zapped are
> executing a wide variety of SQL, too --- I'd not feel very confident
> given only tests of killing, say, pgbench threads.
>

Cause I know you wont be satisfied with anecdotal evidence, I thought I would 
just say that I have done kill's on specific backends in a high load OLTP 
process, with 1000+ active connections, for years and not had a problem with 
it yet.   

Not that I wouldn't like to see some specific, thorough testing on the matter, 
but I'm perfectly comfortable with the previously provided function.

-- 
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL


Re: Server instrumentation for 8.1

From
Robert Treat
Date:
On Thursday 12 May 2005 13:04, Josh Berkus wrote:
> Andreas,
>
> > First, as some other msg states the views will estimate the sizes,
> > dbsize uses actual file sizes. Second, in contrast to CKL, I would *not*
> > use these fancy new system views, because they mean yet another
> > dependency for pgAdmin.
>
> <grin> I like that.  You're in favor of including the server tools because
> you, personally, use them.  You're against including the new system views
> because you, personally, won't use them.   Do I misunderstand you, or are
> you failing to put things in a broader context?

I think your mischarectorizing his position.  It's more like because he can 
see a valid use case with the server tools, but doesn't see a valid use case 
for the newsysviews.  Inciedentally I don't even know that he is really 
against the newsysviews, but given that one of the use cases was admin tools, 
and he has some experience in that area, he felt the need to debunk that 
reasoning.  Personally I agree with that portion... if were putting them in 
for admin tools, we don't need them. 

-- 
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL


Re: Server instrumentation for 8.1

From
"Dave Page"
Date:

> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Josh Berkus
> Sent: 12 May 2005 18:04
> To: Andreas Pflug
> Cc: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Server instrumentation for 8.1
>
> Andreas,
>
> > First, as some other msg states the views will estimate the sizes,
> > dbsize uses actual file sizes. Second, in contrast to CKL,
> I would *not*
> > use these fancy new system views, because they mean yet another
> > dependency for pgAdmin.
>
> <grin> I like that.  You're in favor of including the server
> tools because
> you, personally, use them.  You're against including the new
> system views
> because you, personally, won't use them.   Do I misunderstand
> you, or are you
> failing to put things in a broader context?

Saying he won't use them is not quite the same as arguing against their
inclusion for others who might.

Regards, Dave.


Re: Server instrumentation for 8.1

From
Andreas Pflug
Date:
Andrew - Supernews wrote:
> On 2005-05-12, Andreas Pflug <pgadmin@pse-consulting.de> wrote:

> 
> relpages is updated from the value of RelationGetNumberOfBlocks(rel) which
> is definitive (it gets the value from smgr which gets it from the physical
> file sizes); the only inaccuracy is that it is correct only as of the time
> that the pg_class row was last updated (as done by any VACUUM, any ANALYZE,
> any CLUSTER or any CREATE INDEX on the table, at minimum).

"accurate unless vacuum not running".
... which renders it quite useless to find out about *real* disk usage. 
Interesting for the query planner, not for the admin to know when a 
vacuum full might be necessary.

Regards,
Andreas


Re: Server instrumentation for 8.1

From
"Jim C. Nasby"
Date:
On Thu, May 12, 2005 at 10:39:14AM -0400, Tom Lane wrote:
> "Magnus Hagander" <mha@sollentuna.net> writes:
> > Another thought I had along that line was use a different signal to
> > simply do a "query cancel" and set a global flag that is more or less
> > "get out when you're done with query cancel". Then if that flag is set,
> > just close the connection and proceed as if the client dropped the
> > connection - that has to be a well tested codepath.
> 
> This is pretty much exactly what kill -TERM does today, and the point is
> that the code path has only been extensively tested in the context of
> database-wide shutdown.  No one can honestly say that they are sure
> there are no resource leaks, locks left unreleased, stuff like that.
> That kind of problem wouldn't be visible after a shutdown, but it will
> become visible if backends are killed individually with -TERM.
> 
> Now in theory there are no bugs and this'll work fine.  What disturbs me
> is the lack of testing by anyone who knows what to look for ...

Would a script/program that starts connections, runs a query, and then
kills the backend repeatedly suffice?
-- 
Jim C. Nasby, Database Consultant               decibel@decibel.org 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"


Re: Server instrumentation for 8.1

From
Andrew Dunstan
Date:

Jim C. Nasby wrote:

>On Thu, May 12, 2005 at 10:39:14AM -0400, Tom Lane wrote:
>  
>
>>"Magnus Hagander" <mha@sollentuna.net> writes:
>>    
>>
>>>Another thought I had along that line was use a different signal to
>>>simply do a "query cancel" and set a global flag that is more or less
>>>"get out when you're done with query cancel". Then if that flag is set,
>>>just close the connection and proceed as if the client dropped the
>>>connection - that has to be a well tested codepath.
>>>      
>>>
>>This is pretty much exactly what kill -TERM does today, and the point is
>>that the code path has only been extensively tested in the context of
>>database-wide shutdown.  No one can honestly say that they are sure
>>there are no resource leaks, locks left unreleased, stuff like that.
>>That kind of problem wouldn't be visible after a shutdown, but it will
>>become visible if backends are killed individually with -TERM.
>>
>>Now in theory there are no bugs and this'll work fine.  What disturbs me
>>is the lack of testing by anyone who knows what to look for ...
>>    
>>
>
>Would a script/program that starts connections, runs a query, and then
>kills the backend repeatedly suffice?
>  
>

Incidentally, if there are serious worries about it, testing would be a 
*really* good thing ... it's more or less officially sanctioned, since 
TERM is on the list of signals supported by pg_ctl's kill mode.

cheers

andrew


Re: Server instrumentation for 8.1

From
Bruce Momjian
Date:
Tom Lane wrote:
> "Magnus Hagander" <mha@sollentuna.net> writes:
> > Another thought I had along that line was use a different signal to
> > simply do a "query cancel" and set a global flag that is more or less
> > "get out when you're done with query cancel". Then if that flag is set,
> > just close the connection and proceed as if the client dropped the
> > connection - that has to be a well tested codepath.
> 
> This is pretty much exactly what kill -TERM does today, and the point is
> that the code path has only been extensively tested in the context of
> database-wide shutdown.  No one can honestly say that they are sure
> there are no resource leaks, locks left unreleased, stuff like that.
> That kind of problem wouldn't be visible after a shutdown, but it will
> become visible if backends are killed individually with -TERM.
> 
> Now in theory there are no bugs and this'll work fine.  What disturbs me
> is the lack of testing by anyone who knows what to look for ...

Right now the way we do cancel is to catch a signal from the postmaster,
set a flag, then check it later at a safe point to decide if we should
cancel the query.  It seems any code that would allow backends to exit
is going to have to use the same logic for safety.  I don't see how
stress testing is going to ever be sure to catch all problems.  Can't we
have a signal that does a query cancel, does the normal cancel cleanup,
then exits rather than asking for another query?  Is that what is
already being talked about?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Server instrumentation for 8.1

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Can't we
> have a signal that does a query cancel, does the normal cancel cleanup,
> then exits rather than asking for another query?

That *is* what we have.

I give up trying to explain myself, since it's obvious that I'm not
getting through to anyone.  Commit the darn thing.  I take no
responsibility for it and will not investigate any problems.
        regards, tom lane


Re: Server instrumentation for 8.1

From
Rod Taylor
Date:
> > Exactly.  In theory it probably works fine to allow one backend to exit
> > via kill -TERM, but it cannot be claimed that that behavior has been
> > tested to any significant extent --- "fast" shutdown is not stressing it
> > in the same way.
> >
> > I think this is largely a question of someone doing a significant amount
> > of stress testing: gun live server processes with "kill -TERM" in an
> > active system, and keep an eye out for resource leaks, held locks, and
> > so on.  It would be more convincing if the processes getting zapped are
> > executing a wide variety of SQL, too --- I'd not feel very confident
> > given only tests of killing, say, pgbench threads.
> >
> 
> Cause I know you wont be satisfied with anecdotal evidence, I thought I would 
> just say that I have done kill's on specific backends in a high load OLTP 
> process, with 1000+ active connections, for years and not had a problem with 
> it yet.   
> 
> Not that I wouldn't like to see some specific, thorough testing on the matter, 
> but I'm perfectly comfortable with the previously provided function.

I've also used it regularly for a few years with 100 active connections
in order to get rid of processes which were doing things they shouldn't
be, and have run into problems.

It seems about one out of every 20 kills of something holding a heavy
lock (VACUUM, ALTER TABLE, etc.) will result in a lock table corruption
being reported within the next few hours, although the pg_locks view
doesn't show anything interesting, nor do the locks appear to persist as
other processes can use the structures.

-- 



Re: Server instrumentation for 8.1

From
Bruce Momjian
Date:
Well, that's clear evidence that the only way we are going to be able to
SIGTERM a backend is it does a query cancel first, then terminates.  I
don't think anything else is going to work cleanly.

---------------------------------------------------------------------------

Rod Taylor wrote:
> > > Exactly.  In theory it probably works fine to allow one backend to exit
> > > via kill -TERM, but it cannot be claimed that that behavior has been
> > > tested to any significant extent --- "fast" shutdown is not stressing it
> > > in the same way.
> > >
> > > I think this is largely a question of someone doing a significant amount
> > > of stress testing: gun live server processes with "kill -TERM" in an
> > > active system, and keep an eye out for resource leaks, held locks, and
> > > so on.  It would be more convincing if the processes getting zapped are
> > > executing a wide variety of SQL, too --- I'd not feel very confident
> > > given only tests of killing, say, pgbench threads.
> > >
> > 
> > Cause I know you wont be satisfied with anecdotal evidence, I thought I would 
> > just say that I have done kill's on specific backends in a high load OLTP 
> > process, with 1000+ active connections, for years and not had a problem with 
> > it yet.   
> > 
> > Not that I wouldn't like to see some specific, thorough testing on the matter, 
> > but I'm perfectly comfortable with the previously provided function.
> 
> I've also used it regularly for a few years with 100 active connections
> in order to get rid of processes which were doing things they shouldn't
> be, and have run into problems.
> 
> It seems about one out of every 20 kills of something holding a heavy
> lock (VACUUM, ALTER TABLE, etc.) will result in a lock table corruption
> being reported within the next few hours, although the pg_locks view
> doesn't show anything interesting, nor do the locks appear to persist as
> other processes can use the structures.
> 
> -- 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073