Thread: fsutil ideas

fsutil ideas

From
"Kevin Grittner"
Date:
As part of integrating PostgreSQL into our production environment, we're
working on monitoring software, to provide the same kinds of status
reporting and alerts we have implemented for our outgoing commercial
database product.  One of the things we show on our "big board" is
impending failure conditions, so that they can be resolved before actual
failure occurs.  One key component of this is free space monitoring.
Since PostgreSQL uses available free disk space from the operating
system, we needed methods to query this through the database.  (We
looked at getting it directly from the OS, but there are many reasons it
makes more sense to go through functions in the database to get it.)

Peter Brant, a consultant working with us, has written code which is
working for this under both Linux and Windows.  I've been trying to
package it up in PostgreSQL standard format for submission to the
community, but I'm not sure how to handle a few things.  For Linux, he
used statvfs.  There doesn't seem to be anything in mingw to support
these functions, so he used methods declared in windows.h, which compile
fine under mingw gcc and run fine on our Windows systems.  My concern is
that he has about 100 lines of code which differ between the Linux and
Windows versions, which exceeds anything I've seen in current PostgreSQL
#if statements in my code browsing.  (Not that I've looked at every
line, of course.)

So, my questions:

(1)  Did I miss something regarding mingw support for statvfs?

(2)  If not, is it acceptable for a source file to contain that much
#if code for Windows?

I should probably also ask a tertiary question.  His implementation
reports space in 1K increments though int values.  This effectively
limits the maximum space which will be handled to 2 TB, which I'm told
would be enough to cover our organization for the next ten years, but I
suspect that the PostgreSQL community would prefer to see this handled a
different way.

(3)  What data type should be used to report total space and free space
for each volume?

Thanks in advance for your responses.

-Kevin

P.S.  To make it a bit more concrete, an example of the usage of these
methods, against a local instance on my workstation, is attached.




Attachment

Re: fsutil ideas

From
Andrew Dunstan
Date:

Kevin Grittner wrote:

>(2)  If not, is it acceptable for a source file to contain that much
>#if code for Windows?
>
>  
>

Large chunks of platform-specific code usually go in src/port

100 lines for a WIN32 piece would not be out of order there.

If that proves difficult, let us see the mods and we can advise better.

cheers

andrew


Re: fsutil ideas

From
Mark Kirkwood
Date:
Kevin Grittner wrote:

> So, my questions:
> 
> (1)  Did I miss something regarding mingw support for statvfs?
> 
> (2)  If not, is it acceptable for a source file to contain that much
> #if code for Windows?
> 
> I should probably also ask a tertiary question.  His implementation
> reports space in 1K increments though int values.  This effectively
> limits the maximum space which will be handled to 2 TB, which I'm told
> would be enough to cover our organization for the next ten years, but I
> suspect that the PostgreSQL community would prefer to see this handled a
> different way.
> 
> (3)  What data type should be used to report total space and free space
> for each volume?
> 
> Thanks in advance for your responses.
> 

No really answering your questions, but (if you haven't already) you 
might want to consider creating a view that extracts all the interesting 
stuff from the file_system() function - so that potential users can just 
'\d' it to determine what it provides.

Also, not sure if its really important, but the man entry from statvfs 
on FreeBSD is a little disconcerting:

<quote>
SYNOPSIS     #include <sys/statvfs.h>
     int     statvfs(const char * restrict path, struct statvfs * restrict buf);
     int     fstatvfs(int fd, struct statvfs *buf);

DESCRIPTION     The statvfs() and fstatvfs() functions fill the structure pointed 
to by     buf with garbage.  This garbage will occasionally bear resemblance to     file system statistics, but
portableapplications must not depend on     this.  Applications must pass a pathname or file descriptor which 
 
refers     to a file on the file system in which they are interested.
     The statvfs structure contains the following members:

</quote>

I haven't tried out any code that uses it, so not sure if the 'garbage' 
comment is valid.

regards

Mark


Re: fsutil ideas

From
Neil Conway
Date:
Kevin Grittner wrote:
> Peter Brant, a consultant working with us, has written code which is
> working for this under both Linux and Windows.  [...] For Linux, he
> used statvfs.

statvfs(2) is standardized, but doesn't seem portable: it isn't 
available on OSX 10.3, NetBSD 2.0 or OpenBSD, for example. Perhaps you 
can fallback to statfs(2) when it is available?

This seems an area where providing consistent cross-platform behavior 
might be difficult. Do we actually need this functionality inside the 
DBMS in the first place?

> (2)  If not, is it acceptable for a source file to contain that much
> #if code for Windows?

Let's see the code first...

> (3)  What data type should be used to report total space and free space
> for each volume?

int64?

-Neil



Re: fsutil ideas

From
Tom Lane
Date:
Neil Conway <neilc@samurai.com> writes:
> Do we actually need this functionality inside the 
> DBMS in the first place?

I think that is the $64 question.  My immediate instinct is "no".
See the knock-down-drag-out fights we had last summer about whether
to expose any filesystem access in built-in standard functions at all.
There will be what the Supreme Court would call "strict scrutiny"
concerning the need for this, possible security risks, etc.
        regards, tom lane


Re: fsutil ideas

From
"Jim C. Nasby"
Date:
On Thu, Feb 23, 2006 at 11:32:05PM -0500, Tom Lane wrote:
> Neil Conway <neilc@samurai.com> writes:
> > Do we actually need this functionality inside the 
> > DBMS in the first place?
> 
> I think that is the $64 question.  My immediate instinct is "no".
> See the knock-down-drag-out fights we had last summer about whether
> to expose any filesystem access in built-in standard functions at all.
> There will be what the Supreme Court would call "strict scrutiny"
> concerning the need for this, possible security risks, etc.

Isn't this something that could be accomplished entirely within a
function? I suppose it might have to be an untrusted language, but that
still seems cleaner than putting it in the backend. Plus, ISTM that
something like perl is more likely to have a cross-platform means of
accomplishing this.
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461


Re: fsutil ideas

From
"Peter Brant"
Date:
The code in question is written as C extension functions.  I think we
were thinking it might be something for contrib (although perhaps that
would be too much of an official blessing too?)

Pete

>>> "Jim C. Nasby" <jnasby@pervasive.com> 02/24/06 8:04 am >>>
Isn't this something that could be accomplished entirely within a
function? I suppose it might have to be an untrusted language, but
that
still seems cleaner than putting it in the backend. Plus, ISTM that
something like perl is more likely to have a cross-platform means of
accomplishing this.
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com 
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461


Re: fsutil ideas

From
"Kevin Grittner"
Date:
>>> On Thu, Feb 23, 2006 at  8:43 pm, in message
<43FE72DB.8070701@samurai.com>,
Neil Conway <neilc@samurai.com> wrote: 
> Kevin Grittner wrote:
>> Peter Brant, a consultant working with us, has written code which
is
>> working for this under both Linux and Windows.  [...] For Linux, he
>> used statvfs.
> 
> statvfs(2) is standardized, but doesn't seem portable: it isn't 
> available on OSX 10.3, NetBSD 2.0 or OpenBSD, for example. Perhaps
you 
> can fallback to statfs(2) when it is available?
> 
> This seems an area where providing consistent cross- platform
behavior 
> might be difficult. Do we actually need this functionality inside the

> DBMS in the first place?

It sounds like we should probably just shelve the idea of sharing this
code.  It is very useful to us, since it allows a single database
connection to monitor the health of the database and detect impending
failure.  Having our monitoring software need to know the operating
system of each database, with security to log into the OS of each
machine, would be far worse from a security standpoint, more fragile,
and more work than adding these functions to the database.  But no
response so far has indicated any interest in it from any other quarter,
and it sounds like the portability issues are more than we want to deal
with here.

In response to a few other posts -- it is implemented with three
functions, which I was bundling to target the contrib level of
submission.  One function returns a list of volumes (i.e., drive letters
in Windows, mount points in Linux), one returns available space for a
given volume, and one returns total space for a given volume.

Since we have what we need to get our work done, and the community at
large doesn't seem interested, I'll shelve the idea of submitting
anything.

Thanks for the responses.

-Kevin



Re: fsutil ideas

From
Rod Taylor
Date:
> > This seems an area where providing consistent cross- platform
> behavior 
> > might be difficult. Do we actually need this functionality inside the
> 
> > DBMS in the first place?
> 
> It sounds like we should probably just shelve the idea of sharing this
> code.  It is very useful to us, since it allows a single database
> connection to monitor the health of the database and detect impending
> failure.  Having our monitoring software need to know the operating
> system of each database, with security to log into the OS of each
> machine, would be far worse from a security standpoint, more fragile,
> and more work than adding these functions to the database.  But no

Why don't you have a process on the machine update a set of values in
the database that you then read from remote?

You don't need to know the free diskspace in real time. A 2 minute old
value is probably just as good.


-- 



Re: fsutil ideas

From
Andrew Dunstan
Date:
Kevin Grittner wrote:

>Since we have what we need to get our work done, and the community at
>large doesn't seem interested, I'll shelve the idea of submitting
>anything.
>
>  
>

I think you have misinterpreted. By all means share the code. Put it on 
your website or start a pgfoundry project. Even if it doesn't go into 
the core code some people might very well be interested to use it.

cheers

andrew


Re: fsutil ideas

From
"Kevin Grittner"
Date:
>>> On Fri, Feb 24, 2006 at  9:34 am, in message
<1140795253.5092.122.camel@home>,
Rod Taylor <pg@rbt.ca> wrote: 
> 
> You don't need to know the free diskspace in real time. A 2 minute
old
> value is probably just as good.

Not really, this sort of monitoring has kept us from crashing under our
old database product when a poorly written query starts filling
available space by populating a temporary table.  A green light turns
orange (indicating impending failure) on a big board which monitors the
health of about 200 servers and about 3000 workstations.  A support
person drills down through the "LED" on a version of the "big board" in
their browser, which shows the supporting detail.  A runaway query can
be identified and killed, protecting the rest of the users.  A two
minute delay could cause an embarrassing interruption in service for
hundreds of our users.

Your suggestion would work for the gradual "outgrowing" of space, but
we need to cover both.

-Kevin






Re: fsutil ideas

From
Rod Taylor
Date:
On Fri, 2006-02-24 at 09:48 -0600, Kevin Grittner wrote:
> >>> On Fri, Feb 24, 2006 at  9:34 am, in message
> <1140795253.5092.122.camel@home>,
> Rod Taylor <pg@rbt.ca> wrote: 
> > 
> > You don't need to know the free diskspace in real time. A 2 minute
> old
> > value is probably just as good.
> 
> Not really, this sort of monitoring has kept us from crashing under our
> old database product when a poorly written query starts filling
> available space by populating a temporary table.  A green light turns

I see. It is annoying that you cannot easily (takes a patch to PG
sources) segregate users temporary workspaces into per-user tablespaces
with filesystem quotas.

PostgreSQL seems to deal with out of diskspace situations pretty well
when it impacts a tablespace (global stuff like WAL or subtransactions
have issues -- but they grow slowly) as far as only interrupting service
for the individual actions that ran out.

You may wish to look at funding toggles that can configure the maximum
memory usage and maximum temporary diskspace (different tablespaces with
filesystem quotas) on a per user basis similar to the statement_timeout
limitations in place today.


I'm curious as to how you monitor for total transaction time length to
ensure that vacuum is able to do its thing, particularly when the
transaction is active (not IDLE).

-- 



Re: fsutil ideas

From
"Kevin Grittner"
Date:
>>> On Fri, Feb 24, 2006 at 10:57 am, in message
<1140800266.5092.144.camel@home>,
Rod Taylor <pg@rbt.ca> wrote: 
> 
> PostgreSQL seems to deal with out of diskspace situations pretty
well
> when it impacts a tablespace (global stuff like WAL or
subtransactions
> have issues --  but they grow slowly) as far as only interrupting
service
> for the individual actions that ran out.

We haven't used tablespace features yet, as 3 of the 4 databases
running PostgreSQL so far are on Windows.  We have run out of space a
couple times, and it seems like it handles it well in terms of not
corrupting the database, and resuming OK once some space is freed.  The
messages are not that clear -- some sort of generic I/O write error, as
I recall, instead of "out of disk space" being clearly stated.

> You may wish to look at funding toggles that can configure the
maximum
> memory usage and maximum temporary diskspace (different tablespaces
with
> filesystem quotas) on a per user basis similar to the
statement_timeout
> limitations in place today.

That wouldn't help because the vast majority of the work is done
through a middle tier which uses a connection pool shared by all users. 
It does take some human review and judgment to ensure that a query which
is running long and/or using a lot of temp table space is really a
problem as opposed to one of our larger legitimate processes.

> I'm curious as to how you monitor for total transaction time length
to
> ensure that vacuum is able to do its thing, particularly when the
> transaction is active (not IDLE).

We run a database vacuum nightly and review it the next day. 
(Something will need to be done to automate this with summaries and
exception lists when we get more than a few databases on PostgreSQL.  We
can't have a person reviewing 100 of these every day.)  We've not had
any nightly vacuum fail to finish.  They did start running a tad long
until we did some aggressive maintenance at one point.

Our autovacuum is configured with fairly aggressive parameters,
compared to the default; but, even so, only a few small tables with high
update rates normally reach the thresholds.  I haven't noticed the
autovacuum getting held up on these.

-Kevin




Re: fsutil ideas

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> We haven't used tablespace features yet, as 3 of the 4 databases
> running PostgreSQL so far are on Windows.  We have run out of space a
> couple times, and it seems like it handles it well in terms of not
> corrupting the database, and resuming OK once some space is freed.  The
> messages are not that clear -- some sort of generic I/O write error, as
> I recall, instead of "out of disk space" being clearly stated.

Hm, this may be an issue of poor translation of Windows error codes;
AFAIR PG is pretty good these days about reporting the correct error
message on Unix.  Can you dig up the exact message you got, or
deliberately reproduce the problem to get a fresh copy?
        regards, tom lane


Re: fsutil ideas

From
Rod Taylor
Date:
> > I'm curious as to how you monitor for total transaction time length
> to
> > ensure that vacuum is able to do its thing, particularly when the
> > transaction is active (not IDLE).
> 
> We run a database vacuum nightly and review it the next day. 

Ahh.. different issues again I guess.

I have a few structures that need to be vacuumed every 10 minutes or
less. If a transaction runs longer than 20 minutes (thus effectively
disabling vacuum for 20 minutes) I run into pretty big problems.

I watch for table bloat but I haven't figured out a nice way of tracking
down the postgresql process with the oldest transaction running short of
patching PostgreSQL to report the XID for a connection in
pg_stat_activity.

-- 



Re: fsutil ideas

From
Tom Lane
Date:
Rod Taylor <pg@rbt.ca> writes:
> I watch for table bloat but I haven't figured out a nice way of tracking
> down the postgresql process with the oldest transaction running short of
> patching PostgreSQL to report the XID for a connection in
> pg_stat_activity.

I don't think you need a patch --- try joining with pg_locks.
        regards, tom lane


Re: fsutil ideas

From
Rod Taylor
Date:
On Fri, 2006-02-24 at 12:48 -0500, Tom Lane wrote:
> Rod Taylor <pg@rbt.ca> writes:
> > I watch for table bloat but I haven't figured out a nice way of tracking
> > down the postgresql process with the oldest transaction running short of
> > patching PostgreSQL to report the XID for a connection in
> > pg_stat_activity.
> 
> I don't think you need a patch --- try joining with pg_locks.

Ugh.. pg_locks is the first think I look at to see if something is
blocked that shouldn't be. Don't know why I didn't think of using it for
the XID value.

Thanks.




Re: fsutil ideas

From
Csaba Nagy
Date:
On Fri, 2006-02-24 at 19:12, Rod Taylor wrote:
> On Fri, 2006-02-24 at 12:48 -0500, Tom Lane wrote:
> > Rod Taylor <pg@rbt.ca> writes:
> > > I watch for table bloat but I haven't figured out a nice way of tracking
> > > down the postgresql process with the oldest transaction running short of
> > > patching PostgreSQL to report the XID for a connection in
> > > pg_stat_activity.

But I'm afraid that a long running transaction with many short queries
will not even show up in pg_stat_activity. So that's not a completely
reliable way of catching long running transactions... but it's true that
most of the time a long running query is the problem, and that is
catchable.

Cheers,
Csaba.




Re: fsutil ideas

From
Rod Taylor
Date:
On Fri, 2006-02-24 at 19:20 +0100, Csaba Nagy wrote:
> On Fri, 2006-02-24 at 19:12, Rod Taylor wrote:
> > On Fri, 2006-02-24 at 12:48 -0500, Tom Lane wrote:
> > > Rod Taylor <pg@rbt.ca> writes:
> > > > I watch for table bloat but I haven't figured out a nice way of tracking
> > > > down the postgresql process with the oldest transaction running short of
> > > > patching PostgreSQL to report the XID for a connection in
> > > > pg_stat_activity.
> 
> But I'm afraid that a long running transaction with many short queries
> will not even show up in pg_stat_activity. So that's not a completely
> reliable way of catching long running transactions... but it's true that
> most of the time a long running query is the problem, and that is
> catchable.

The specific query may not show up but the process should appear in one
state or another.

That said, pg_locks would still show low XID (compared to the rest)
exists and that would probably be the culprit.
-- 



Re: fsutil ideas

From
Hannu Krosing
Date:
Ühel kenal päeval, R, 2006-02-24 kell 19:20, kirjutas Csaba Nagy:
> On Fri, 2006-02-24 at 19:12, Rod Taylor wrote:
> > On Fri, 2006-02-24 at 12:48 -0500, Tom Lane wrote:
> > > Rod Taylor <pg@rbt.ca> writes:
> > > > I watch for table bloat but I haven't figured out a nice way of tracking
> > > > down the postgresql process with the oldest transaction running short of
> > > > patching PostgreSQL to report the XID for a connection in
> > > > pg_stat_activity.
> 
> But I'm afraid that a long running transaction with many short queries
> will not even show up in pg_stat_activity. 

It will show as "<IDLE> in transaction"

The harder part would be knowing how long the queri has been running in
wallclock time, not in transactions.

---------------
Hannu




Re: fsutil ideas

From
"Jim C. Nasby"
Date:
Oh, so does it actually involve any server modifications? Or can it just
go into pgfoundry?
On Fri, Feb 24, 2006 at 08:25:03AM -0600, Peter Brant wrote:
> The code in question is written as C extension functions.  I think we
> were thinking it might be something for contrib (although perhaps that
> would be too much of an official blessing too?)
> 
> Pete
> 
> >>> "Jim C. Nasby" <jnasby@pervasive.com> 02/24/06 8:04 am >>>
> Isn't this something that could be accomplished entirely within a
> function? I suppose it might have to be an untrusted language, but
> that
> still seems cleaner than putting it in the backend. Plus, ISTM that
> something like perl is more likely to have a cross-platform means of
> accomplishing this.
> -- 
> Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com 
> Pervasive Software      http://pervasive.com    work: 512-231-6117
> vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>        choose an index scan if your joining column's datatypes do not
>        match
> 

-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461


Re: fsutil ideas

From
"Kevin Grittner"
Date:
>>> On Fri, Feb 24, 2006 at  5:00 pm, in message
<20060224230030.GM2068@pervasive.com>, "Jim C. Nasby"
<jnasby@pervasive.com>
wrote: 
> Oh, so does it actually involve any server modifications? Or can it
just
> go into pgfoundry?

No server modifications.  I've got it bundled up as though it were
going to be under contrib, and I've applied for a pgfoundry project for
it.  Waiting for approval.

I've not dealt with pgfoundry before, so I don't know how much
modification, if any, is needed to set that up.  It's working for us now
by building in standard contrib fashion (from contrib/fsutil).  The
files:

-rw-r--r--  1 kgrittn users  318 Feb 24 11:39 Makefile
-rw-r--r--  1 kgrittn users  751 Feb 24 11:44 README.fsutil
-rw-r--r--  1 kgrittn users 7857 Feb 24 14:25 fsutil.c
-rw-r--r--  1 kgrittn users 2149 Feb 24 14:25 fsutil.h
-rw-r--r--  1 kgrittn users  440 Feb 24 12:54 fsutil.sql.in

Any pointers on setting up the project, or a URL to a page on that
topic, would be welcome.  I haven't stumbled across it yet; although,
once I know about it I'm sure it will seem obvious.

-Kevin




Re: fsutil ideas

From
Alvaro Herrera
Date:
Kevin Grittner wrote:

> I've not dealt with pgfoundry before, so I don't know how much
> modification, if any, is needed to set that up.  It's working for us now
> by building in standard contrib fashion (from contrib/fsutil).  The
> files:

Having it build with PGXS would be a definite plus for ease of
installation.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: fsutil ideas

From
"Kevin Grittner"
Date:
>>> On Fri, Feb 24, 2006 at  5:25 pm, in message
<20060224232500.GG9060@surnet.cl>,
Alvaro Herrera <alvherre@commandprompt.com> wrote: 
> 
> Having it build with PGXS would be a definite plus for ease of
> installation.

It does.