Thread: fsutil ideas
As part of integrating PostgreSQL into our production environment, we're working on monitoring software, to provide the same kinds of status reporting and alerts we have implemented for our outgoing commercial database product. One of the things we show on our "big board" is impending failure conditions, so that they can be resolved before actual failure occurs. One key component of this is free space monitoring. Since PostgreSQL uses available free disk space from the operating system, we needed methods to query this through the database. (We looked at getting it directly from the OS, but there are many reasons it makes more sense to go through functions in the database to get it.) Peter Brant, a consultant working with us, has written code which is working for this under both Linux and Windows. I've been trying to package it up in PostgreSQL standard format for submission to the community, but I'm not sure how to handle a few things. For Linux, he used statvfs. There doesn't seem to be anything in mingw to support these functions, so he used methods declared in windows.h, which compile fine under mingw gcc and run fine on our Windows systems. My concern is that he has about 100 lines of code which differ between the Linux and Windows versions, which exceeds anything I've seen in current PostgreSQL #if statements in my code browsing. (Not that I've looked at every line, of course.) So, my questions: (1) Did I miss something regarding mingw support for statvfs? (2) If not, is it acceptable for a source file to contain that much #if code for Windows? I should probably also ask a tertiary question. His implementation reports space in 1K increments though int values. This effectively limits the maximum space which will be handled to 2 TB, which I'm told would be enough to cover our organization for the next ten years, but I suspect that the PostgreSQL community would prefer to see this handled a different way. (3) What data type should be used to report total space and free space for each volume? Thanks in advance for your responses. -Kevin P.S. To make it a bit more concrete, an example of the usage of these methods, against a local instance on my workstation, is attached.
Attachment
Kevin Grittner wrote: >(2) If not, is it acceptable for a source file to contain that much >#if code for Windows? > > > Large chunks of platform-specific code usually go in src/port 100 lines for a WIN32 piece would not be out of order there. If that proves difficult, let us see the mods and we can advise better. cheers andrew
Kevin Grittner wrote: > So, my questions: > > (1) Did I miss something regarding mingw support for statvfs? > > (2) If not, is it acceptable for a source file to contain that much > #if code for Windows? > > I should probably also ask a tertiary question. His implementation > reports space in 1K increments though int values. This effectively > limits the maximum space which will be handled to 2 TB, which I'm told > would be enough to cover our organization for the next ten years, but I > suspect that the PostgreSQL community would prefer to see this handled a > different way. > > (3) What data type should be used to report total space and free space > for each volume? > > Thanks in advance for your responses. > No really answering your questions, but (if you haven't already) you might want to consider creating a view that extracts all the interesting stuff from the file_system() function - so that potential users can just '\d' it to determine what it provides. Also, not sure if its really important, but the man entry from statvfs on FreeBSD is a little disconcerting: <quote> SYNOPSIS #include <sys/statvfs.h> int statvfs(const char * restrict path, struct statvfs * restrict buf); int fstatvfs(int fd, struct statvfs *buf); DESCRIPTION The statvfs() and fstatvfs() functions fill the structure pointed to by buf with garbage. This garbage will occasionally bear resemblance to file system statistics, but portableapplications must not depend on this. Applications must pass a pathname or file descriptor which refers to a file on the file system in which they are interested. The statvfs structure contains the following members: </quote> I haven't tried out any code that uses it, so not sure if the 'garbage' comment is valid. regards Mark
Kevin Grittner wrote: > Peter Brant, a consultant working with us, has written code which is > working for this under both Linux and Windows. [...] For Linux, he > used statvfs. statvfs(2) is standardized, but doesn't seem portable: it isn't available on OSX 10.3, NetBSD 2.0 or OpenBSD, for example. Perhaps you can fallback to statfs(2) when it is available? This seems an area where providing consistent cross-platform behavior might be difficult. Do we actually need this functionality inside the DBMS in the first place? > (2) If not, is it acceptable for a source file to contain that much > #if code for Windows? Let's see the code first... > (3) What data type should be used to report total space and free space > for each volume? int64? -Neil
Neil Conway <neilc@samurai.com> writes: > Do we actually need this functionality inside the > DBMS in the first place? I think that is the $64 question. My immediate instinct is "no". See the knock-down-drag-out fights we had last summer about whether to expose any filesystem access in built-in standard functions at all. There will be what the Supreme Court would call "strict scrutiny" concerning the need for this, possible security risks, etc. regards, tom lane
On Thu, Feb 23, 2006 at 11:32:05PM -0500, Tom Lane wrote: > Neil Conway <neilc@samurai.com> writes: > > Do we actually need this functionality inside the > > DBMS in the first place? > > I think that is the $64 question. My immediate instinct is "no". > See the knock-down-drag-out fights we had last summer about whether > to expose any filesystem access in built-in standard functions at all. > There will be what the Supreme Court would call "strict scrutiny" > concerning the need for this, possible security risks, etc. Isn't this something that could be accomplished entirely within a function? I suppose it might have to be an untrusted language, but that still seems cleaner than putting it in the backend. Plus, ISTM that something like perl is more likely to have a cross-platform means of accomplishing this. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
The code in question is written as C extension functions. I think we were thinking it might be something for contrib (although perhaps that would be too much of an official blessing too?) Pete >>> "Jim C. Nasby" <jnasby@pervasive.com> 02/24/06 8:04 am >>> Isn't this something that could be accomplished entirely within a function? I suppose it might have to be an untrusted language, but that still seems cleaner than putting it in the backend. Plus, ISTM that something like perl is more likely to have a cross-platform means of accomplishing this. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
>>> On Thu, Feb 23, 2006 at 8:43 pm, in message <43FE72DB.8070701@samurai.com>, Neil Conway <neilc@samurai.com> wrote: > Kevin Grittner wrote: >> Peter Brant, a consultant working with us, has written code which is >> working for this under both Linux and Windows. [...] For Linux, he >> used statvfs. > > statvfs(2) is standardized, but doesn't seem portable: it isn't > available on OSX 10.3, NetBSD 2.0 or OpenBSD, for example. Perhaps you > can fallback to statfs(2) when it is available? > > This seems an area where providing consistent cross- platform behavior > might be difficult. Do we actually need this functionality inside the > DBMS in the first place? It sounds like we should probably just shelve the idea of sharing this code. It is very useful to us, since it allows a single database connection to monitor the health of the database and detect impending failure. Having our monitoring software need to know the operating system of each database, with security to log into the OS of each machine, would be far worse from a security standpoint, more fragile, and more work than adding these functions to the database. But no response so far has indicated any interest in it from any other quarter, and it sounds like the portability issues are more than we want to deal with here. In response to a few other posts -- it is implemented with three functions, which I was bundling to target the contrib level of submission. One function returns a list of volumes (i.e., drive letters in Windows, mount points in Linux), one returns available space for a given volume, and one returns total space for a given volume. Since we have what we need to get our work done, and the community at large doesn't seem interested, I'll shelve the idea of submitting anything. Thanks for the responses. -Kevin
> > This seems an area where providing consistent cross- platform > behavior > > might be difficult. Do we actually need this functionality inside the > > > DBMS in the first place? > > It sounds like we should probably just shelve the idea of sharing this > code. It is very useful to us, since it allows a single database > connection to monitor the health of the database and detect impending > failure. Having our monitoring software need to know the operating > system of each database, with security to log into the OS of each > machine, would be far worse from a security standpoint, more fragile, > and more work than adding these functions to the database. But no Why don't you have a process on the machine update a set of values in the database that you then read from remote? You don't need to know the free diskspace in real time. A 2 minute old value is probably just as good. --
Kevin Grittner wrote: >Since we have what we need to get our work done, and the community at >large doesn't seem interested, I'll shelve the idea of submitting >anything. > > > I think you have misinterpreted. By all means share the code. Put it on your website or start a pgfoundry project. Even if it doesn't go into the core code some people might very well be interested to use it. cheers andrew
>>> On Fri, Feb 24, 2006 at 9:34 am, in message <1140795253.5092.122.camel@home>, Rod Taylor <pg@rbt.ca> wrote: > > You don't need to know the free diskspace in real time. A 2 minute old > value is probably just as good. Not really, this sort of monitoring has kept us from crashing under our old database product when a poorly written query starts filling available space by populating a temporary table. A green light turns orange (indicating impending failure) on a big board which monitors the health of about 200 servers and about 3000 workstations. A support person drills down through the "LED" on a version of the "big board" in their browser, which shows the supporting detail. A runaway query can be identified and killed, protecting the rest of the users. A two minute delay could cause an embarrassing interruption in service for hundreds of our users. Your suggestion would work for the gradual "outgrowing" of space, but we need to cover both. -Kevin
On Fri, 2006-02-24 at 09:48 -0600, Kevin Grittner wrote: > >>> On Fri, Feb 24, 2006 at 9:34 am, in message > <1140795253.5092.122.camel@home>, > Rod Taylor <pg@rbt.ca> wrote: > > > > You don't need to know the free diskspace in real time. A 2 minute > old > > value is probably just as good. > > Not really, this sort of monitoring has kept us from crashing under our > old database product when a poorly written query starts filling > available space by populating a temporary table. A green light turns I see. It is annoying that you cannot easily (takes a patch to PG sources) segregate users temporary workspaces into per-user tablespaces with filesystem quotas. PostgreSQL seems to deal with out of diskspace situations pretty well when it impacts a tablespace (global stuff like WAL or subtransactions have issues -- but they grow slowly) as far as only interrupting service for the individual actions that ran out. You may wish to look at funding toggles that can configure the maximum memory usage and maximum temporary diskspace (different tablespaces with filesystem quotas) on a per user basis similar to the statement_timeout limitations in place today. I'm curious as to how you monitor for total transaction time length to ensure that vacuum is able to do its thing, particularly when the transaction is active (not IDLE). --
>>> On Fri, Feb 24, 2006 at 10:57 am, in message <1140800266.5092.144.camel@home>, Rod Taylor <pg@rbt.ca> wrote: > > PostgreSQL seems to deal with out of diskspace situations pretty well > when it impacts a tablespace (global stuff like WAL or subtransactions > have issues -- but they grow slowly) as far as only interrupting service > for the individual actions that ran out. We haven't used tablespace features yet, as 3 of the 4 databases running PostgreSQL so far are on Windows. We have run out of space a couple times, and it seems like it handles it well in terms of not corrupting the database, and resuming OK once some space is freed. The messages are not that clear -- some sort of generic I/O write error, as I recall, instead of "out of disk space" being clearly stated. > You may wish to look at funding toggles that can configure the maximum > memory usage and maximum temporary diskspace (different tablespaces with > filesystem quotas) on a per user basis similar to the statement_timeout > limitations in place today. That wouldn't help because the vast majority of the work is done through a middle tier which uses a connection pool shared by all users. It does take some human review and judgment to ensure that a query which is running long and/or using a lot of temp table space is really a problem as opposed to one of our larger legitimate processes. > I'm curious as to how you monitor for total transaction time length to > ensure that vacuum is able to do its thing, particularly when the > transaction is active (not IDLE). We run a database vacuum nightly and review it the next day. (Something will need to be done to automate this with summaries and exception lists when we get more than a few databases on PostgreSQL. We can't have a person reviewing 100 of these every day.) We've not had any nightly vacuum fail to finish. They did start running a tad long until we did some aggressive maintenance at one point. Our autovacuum is configured with fairly aggressive parameters, compared to the default; but, even so, only a few small tables with high update rates normally reach the thresholds. I haven't noticed the autovacuum getting held up on these. -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > We haven't used tablespace features yet, as 3 of the 4 databases > running PostgreSQL so far are on Windows. We have run out of space a > couple times, and it seems like it handles it well in terms of not > corrupting the database, and resuming OK once some space is freed. The > messages are not that clear -- some sort of generic I/O write error, as > I recall, instead of "out of disk space" being clearly stated. Hm, this may be an issue of poor translation of Windows error codes; AFAIR PG is pretty good these days about reporting the correct error message on Unix. Can you dig up the exact message you got, or deliberately reproduce the problem to get a fresh copy? regards, tom lane
> > I'm curious as to how you monitor for total transaction time length > to > > ensure that vacuum is able to do its thing, particularly when the > > transaction is active (not IDLE). > > We run a database vacuum nightly and review it the next day. Ahh.. different issues again I guess. I have a few structures that need to be vacuumed every 10 minutes or less. If a transaction runs longer than 20 minutes (thus effectively disabling vacuum for 20 minutes) I run into pretty big problems. I watch for table bloat but I haven't figured out a nice way of tracking down the postgresql process with the oldest transaction running short of patching PostgreSQL to report the XID for a connection in pg_stat_activity. --
Rod Taylor <pg@rbt.ca> writes: > I watch for table bloat but I haven't figured out a nice way of tracking > down the postgresql process with the oldest transaction running short of > patching PostgreSQL to report the XID for a connection in > pg_stat_activity. I don't think you need a patch --- try joining with pg_locks. regards, tom lane
On Fri, 2006-02-24 at 12:48 -0500, Tom Lane wrote: > Rod Taylor <pg@rbt.ca> writes: > > I watch for table bloat but I haven't figured out a nice way of tracking > > down the postgresql process with the oldest transaction running short of > > patching PostgreSQL to report the XID for a connection in > > pg_stat_activity. > > I don't think you need a patch --- try joining with pg_locks. Ugh.. pg_locks is the first think I look at to see if something is blocked that shouldn't be. Don't know why I didn't think of using it for the XID value. Thanks.
On Fri, 2006-02-24 at 19:12, Rod Taylor wrote: > On Fri, 2006-02-24 at 12:48 -0500, Tom Lane wrote: > > Rod Taylor <pg@rbt.ca> writes: > > > I watch for table bloat but I haven't figured out a nice way of tracking > > > down the postgresql process with the oldest transaction running short of > > > patching PostgreSQL to report the XID for a connection in > > > pg_stat_activity. But I'm afraid that a long running transaction with many short queries will not even show up in pg_stat_activity. So that's not a completely reliable way of catching long running transactions... but it's true that most of the time a long running query is the problem, and that is catchable. Cheers, Csaba.
On Fri, 2006-02-24 at 19:20 +0100, Csaba Nagy wrote: > On Fri, 2006-02-24 at 19:12, Rod Taylor wrote: > > On Fri, 2006-02-24 at 12:48 -0500, Tom Lane wrote: > > > Rod Taylor <pg@rbt.ca> writes: > > > > I watch for table bloat but I haven't figured out a nice way of tracking > > > > down the postgresql process with the oldest transaction running short of > > > > patching PostgreSQL to report the XID for a connection in > > > > pg_stat_activity. > > But I'm afraid that a long running transaction with many short queries > will not even show up in pg_stat_activity. So that's not a completely > reliable way of catching long running transactions... but it's true that > most of the time a long running query is the problem, and that is > catchable. The specific query may not show up but the process should appear in one state or another. That said, pg_locks would still show low XID (compared to the rest) exists and that would probably be the culprit. --
Ühel kenal päeval, R, 2006-02-24 kell 19:20, kirjutas Csaba Nagy: > On Fri, 2006-02-24 at 19:12, Rod Taylor wrote: > > On Fri, 2006-02-24 at 12:48 -0500, Tom Lane wrote: > > > Rod Taylor <pg@rbt.ca> writes: > > > > I watch for table bloat but I haven't figured out a nice way of tracking > > > > down the postgresql process with the oldest transaction running short of > > > > patching PostgreSQL to report the XID for a connection in > > > > pg_stat_activity. > > But I'm afraid that a long running transaction with many short queries > will not even show up in pg_stat_activity. It will show as "<IDLE> in transaction" The harder part would be knowing how long the queri has been running in wallclock time, not in transactions. --------------- Hannu
Oh, so does it actually involve any server modifications? Or can it just go into pgfoundry? On Fri, Feb 24, 2006 at 08:25:03AM -0600, Peter Brant wrote: > The code in question is written as C extension functions. I think we > were thinking it might be something for contrib (although perhaps that > would be too much of an official blessing too?) > > Pete > > >>> "Jim C. Nasby" <jnasby@pervasive.com> 02/24/06 8:04 am >>> > Isn't this something that could be accomplished entirely within a > function? I suppose it might have to be an untrusted language, but > that > still seems cleaner than putting it in the backend. Plus, ISTM that > something like perl is more likely to have a cross-platform means of > accomplishing this. > -- > Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com > Pervasive Software http://pervasive.com work: 512-231-6117 > vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 > > ---------------------------(end of broadcast)--------------------------- > TIP 9: In versions below 8.0, the planner will ignore your desire to > choose an index scan if your joining column's datatypes do not > match > -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
>>> On Fri, Feb 24, 2006 at 5:00 pm, in message <20060224230030.GM2068@pervasive.com>, "Jim C. Nasby" <jnasby@pervasive.com> wrote: > Oh, so does it actually involve any server modifications? Or can it just > go into pgfoundry? No server modifications. I've got it bundled up as though it were going to be under contrib, and I've applied for a pgfoundry project for it. Waiting for approval. I've not dealt with pgfoundry before, so I don't know how much modification, if any, is needed to set that up. It's working for us now by building in standard contrib fashion (from contrib/fsutil). The files: -rw-r--r-- 1 kgrittn users 318 Feb 24 11:39 Makefile -rw-r--r-- 1 kgrittn users 751 Feb 24 11:44 README.fsutil -rw-r--r-- 1 kgrittn users 7857 Feb 24 14:25 fsutil.c -rw-r--r-- 1 kgrittn users 2149 Feb 24 14:25 fsutil.h -rw-r--r-- 1 kgrittn users 440 Feb 24 12:54 fsutil.sql.in Any pointers on setting up the project, or a URL to a page on that topic, would be welcome. I haven't stumbled across it yet; although, once I know about it I'm sure it will seem obvious. -Kevin
Kevin Grittner wrote: > I've not dealt with pgfoundry before, so I don't know how much > modification, if any, is needed to set that up. It's working for us now > by building in standard contrib fashion (from contrib/fsutil). The > files: Having it build with PGXS would be a definite plus for ease of installation. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
>>> On Fri, Feb 24, 2006 at 5:25 pm, in message <20060224232500.GG9060@surnet.cl>, Alvaro Herrera <alvherre@commandprompt.com> wrote: > > Having it build with PGXS would be a definite plus for ease of > installation. It does.