Thread: Answering my own question
I found my answer on http://www.argudo.org/postgresql/soft-tuning-10.html (excellent site, btw). What I need to do is prefix my SELECT statements for this table with the statement:

SET enable_seqscan='false';

as in:

SET enable_seqscan='false'; SELECT * FROM pol_xref_d WHERE policy_no=1200079 ORDER BY sequence;

This change makes the response sub-second, a great improvement over the 11+ seconds we're currently experiencing. I'd love any explanation of why the query plan was going sequential versus indexed.

Gerry
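For anyone checking the same thing, EXPLAIN shows which plan the optimizer has chosen, so you can confirm the workaround really does switch the query onto the index. A minimal sketch, reusing the table and values from Gerry's query:

EXPLAIN SELECT * FROM pol_xref_d WHERE policy_no=1200079 ORDER BY sequence;
-- if this reports "Seq Scan on pol_xref_d", try the session-level workaround:
SET enable_seqscan='false';
EXPLAIN SELECT * FROM pol_xref_d WHERE policy_no=1200079 ORDER BY sequence;
RESET enable_seqscan;  -- put the planner back to its normal behaviour afterwards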
Now THAT's REALLY fascinating. We upgraded from 6.5 to 7.1 about 4-5 months ago, and only got this problem after that. I leave "ANALYZE" out of VACUUM due to the large volume of (I thought needless) output that gets piped to my cron_log.

Does anyone have a feel for how "necessary" VACUUM ANALYZE is over "VACUUM"?

Gerry

ghaverla@freenet.edmonton.ab.ca wrote on 15/05/2002 01:50 PM (Subject: Re: [NOVICE] Answering my own question):

On Wed, 15 May 2002 gerry.smit@lombard.ca wrote:
> I found, on http://www.argudo.org/postgresql/soft-tuning-10.html my
> answer. (Excellent site btw) What I need to do is prefix my SELECT
> statements for this table with the statement:
>
> SET enable_seqscan='false';

Someone else mentioned ANALYZE (or is it ANALYSE?). Anyway, I seem to remember reading someplace that with newer versions of PostgreSQL, if you used to VACUUM on a regular basis, you were supposed to change over to VACUUM ANALYSE (ANALYZE?) some (all?) of the time. My memory is fuzzy about this, so perhaps the docs will have something on this.

Gord

Gordon Haverland, B.Sc. M.Eng.            President
Matter Realisations                       http://www.materialisations.com/
101 9504 182 St. NW, Edmonton, AB, CA T5T 3A7
780/481-8019    780/993-1274 (cell)
ghaverla @ freenet.edmonton.ab.ca
On Wed, 15 May 2002 gerry.smit@lombard.ca wrote:
> Now THAT's REALLY fascinating. We upgraded from 6.5 to 7.1 about 4-5
> months ago, and only got this problem after that. I leave "ANALYZE" out of
> VACUUM due to the large volume of (I thought needless) output that gets
> piped to my cron_log.

Display of activities is controllable through settings. It is also somewhat silly not to run analyze just because it displays too much data in a log.

> Does anyone have a feel for how "necessary" VACUUM ANALYZE is over "VACUUM"?
> Gerry

Like many other things in life, the answer is: it depends.

Vacuum is about being able to reuse space which was used by rows that have been deleted. Vacuum Full is about actually recovering the disk space used by deleted rows. Analyze is about re-computing the statistics that the planner uses to determine the best strategies.

The answer to your question will be directly related to how much and how often new data is inserted in your database. If you NEVER change your data or make inserts then you NEVER have to run either vacuum or analyze. If you have a very heavily loaded environment with hundreds/thousands of transactions per minute, then an hourly vacuum analyze would probably help. Why don't you tell us more about your particular scenario?

For instance, I do ONE big load every morning from a production FoxPro system to my reporting PostgreSQL system. I do one vacuum analyze just after the load, then I compute some stats and populate some files, then another vacuum analyze. At 10pm I do a vacuum full. This works great for me. Before, I was doing one vacuum analyze after the stats computations and it was horrible. Something which takes less than an hour to compute when I have done a vacuum analyze took about 4 hours without one, and that was just ONE of the things that needed to be done. The first vacuum analyze takes about 50 minutes and the second takes about 8 minutes, a lot less than the 4+ hours saved by having run vacuum analyze before doing the stats computations.
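To make the distinction concrete, a minimal sketch of the commands being discussed, using the table name from earlier in the thread. VACUUM FULL is a 7.2 addition (as noted later in the thread), and if memory serves the standalone ANALYZE form is as well, so on 7.1 the combined VACUUM ANALYZE is the one to use:

VACUUM pol_xref_d;           -- mark space from deleted rows as reusable inside the table's files
VACUUM ANALYZE pol_xref_d;   -- same, plus refresh the statistics the planner relies on
VACUUM FULL pol_xref_d;      -- 7.2+: also shrink the files and return space to the OS (takes an exclusive lock)
ANALYZE pol_xref_d;          -- 7.2+: update statistics only, no vacuuming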
Well, having copies of stats from the server.log obscure timing info in the VACUUM log doesn't help. And the stats are fairly unreadable, at least to this novice. :-)

VACUUM FULL is new to me, and I've been wondering about how to recycle disk space. THANKS.

VACUUM ANALYZE, producing stats for the QUERY PLANner, is also something I wasn't aware of.

Here's the scenario. We have about 8 tables or so, in 4 regional databases.

- Two of the tables are updated via on-line transactions from our users. Probably a couple of hundred records a day, tops.
- Records eligible for archiving are "SELECT INTO"d and "DELETE"d about once a quarter, typically 1000 or so.
- The rest of the tables are updated from the mainframe batch cycle, at about 3:00 am local time.
- These records NEVER get updated, just more added each day.
- The problem tables have about 800,000 records in them.
- The daily table gets 3000-5000 new records a day.
- Once a month, records more than 12 months old are purged, no archive required.
- The non-daily table is DROPd and rebuilt monthly. Newly generated records are loaded into it from batch processes that take about two days, at the start of each month.
- Thus the "high water" mark is highest just before month-end.
- Daily volumes are slowly increasing over time, so the "high water" mark is also going higher.

VACUUM is run nightly, at about 7:00 pm. This is to make the system ready for UNIX back-ups, at about 9:30 pm. Also, in the past, the VACUUM has taken hours, and seems to lock the database (not just the table). Thus we're reluctant to run VACUUM at 5:00 am, when daily batch finishes. The primary reason for the hours of VACUUM was the non-daily table, above, back when we DELETEd most records and then reloaded over two days. Deleting 780,000 records of an 800,000 row table creates an IMMENSELY long VACUUM. Hence our DROP and rebuild.

I suspect that what I should do is:
1) continue nightly VACUUMs at about 7:00 pm
2) add a weekly VACUUM FULL ANALYZE to be run Saturday evening, ensuring it's ready for Sunday FULL disk back-ups
3) run a VACUUM FULL ANALYZE tonight as a one-shot
4) consider adding the statement "SET enable_seqscan='false';" to the TWO locations in our on-line system where the problem SELECTs are coded. However, during my testing, SET returns the output "SET VARIABLE", which is being interpreted as the first line of output from the accompanying SELECT statement. Any ideas on how to make the SET statement "quiet"?

Step 4 may not be necessary if steps 1-3 do the job, btw.

Oh, we're a 16/5 operation, with some people dialing in or coming in on weekends. 24/7 is NOT a requirement.

Gerry Smit, Toronto, Ontario
gerry.smit@lombard.ca writes:
> Now THAT's REALLY fascinating. We upgraded from 6.5 to 7.1 about
> 4-5 months ago, and only got this problem after that. I leave
> "ANALYZE" out of VACUUM due to the large volume of (I thought
> needless) output that gets piped to my cron_log.
>
> Does anyone have a feel for how "necessary" VACUUM ANALYZE is over
> "VACUUM"?

The analyze part records statistical information about your data. Without it the planner doesn't have the information it needs to choose a better plan.

BTW, if you are still on 7.1 you should upgrade. 7.2 is smarter about the statistical information it records and is able to make better decisions.

Regards, Manuel.

ps. yes: mx -> México
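One easy place to see part of what gets recorded, as a rough sketch (table name taken from this thread): the table-level row and page counts live in pg_class, while the per-column distribution detail that ANALYZE gathers lands in the pg_statistic system catalog.

SELECT relname, relpages, reltuples
FROM pg_class
WHERE relname = 'pol_xref_d';
-- Re-run this after a VACUUM ANALYZE and the page/row estimates should be up to
-- date; the planner bases its seq-scan vs. index-scan choice on figures like these.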
Hi all,

I'm embarrassed to even ask this, but this being a novice list, here it goes. What exactly is Vacuum? Is there an equivalent of this in MS SQL Server?

Thanks,

Balazs Wellisch
>Now THAT's REALLY fascinating. We upgraded from 6.5 to 7.1 about 4-5
>months ago, and only got this problem after that. I leave "ANALYZE" out of
>VACUUM due to the large volume of (I thought needless) output that gets
>piped to my cron_log.
>
>Does anyone have a feel for how "necessary" VACUUM ANALYZE is over "VACUUM"?

As you noticed, pretty necessary :)

A vacuum analyze updates the statistics used when the planner decides on what action to take (whether it's a sequential scan, using an index, etc). A plain vacuum removes dead tuples from the system, not much else.

-----------------
Chris Smith
http://www.squiz.net/
As in another post, VACUUM ANALYZE has fixed my index / optimizer / query plan problem.

A follow-up question I have is about VACUUM FULL.

There's no mention of "FULL" as an argument to VACUUM in the 7.1.3 doc/html on-line reference manual. Was FULL added in a later release?

Gerry
Well, the VACUUM ANALYZE took 4 hours last night, but it seems to have fixed the problem. EXPLAIN SELECT ... and SELECT from both tables are now using the indices and get sub-second response, even with 800,000 rows of data. Now to see whether I make VACUUM ANALYZE a nightly thing, or weekly. 4 hours is long, but it's been a LONG time since the last one (like never).

btw, I was confusing VERBOSE with ANALYZE (well, this IS the NOVICE list, right?). VERBOSE definitely makes the cron_log output too long to read. ANALYZE produces nothing in the cron log, unless VERBOSE is given as well.

Thanks for all the help so far folks, it's been bang on.

Gerry
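For anyone else who mixed the two up, the keywords combine independently; a quick sketch, again using the table name from this thread:

VACUUM pol_xref_d;                  -- reclaim dead-tuple space, quietly
VACUUM VERBOSE pol_xref_d;          -- same, but print per-table progress detail
VACUUM ANALYZE pol_xref_d;          -- reclaim space and refresh planner statistics, quietly
VACUUM VERBOSE ANALYZE pol_xref_d;  -- the statistics update plus the chatty output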
> As in another post, VACUUM ANALYZE has fixed my index / optimizer / query plan
> problem.
>
> A follow-up question I have is about VACUUM FULL.
>
> There's no mention of "FULL" as an argument to VACUUM in the 7.1.3
> doc/html on-line reference manual. Was FULL added in a later release?

They changed vacuum in 7.[something]. Among other things (which I am unaware of), vacuum no longer frees physical disk space when cleaning up tables. Instead, it frees space in the file for pg to reuse. VACUUM FULL was then added for those who wanted vacuum to behave as it did before.

One nice thing about the new behavior is that it no longer needs to lock the tables during vacuum. Doing a vacuum full will lock the table, though.

-Adam
Well, this is only my second day on the list, but I don't think "NOVICE" could ever have a question to be embarrassed about.

VACUUM does 3 things (at least in 7.2):

- VACUUM (alone) takes rows marked for deletion, and frees up the space to be re-used by later INSERTs or UPDATEs.
- I'm guessing, but it sure looks like DELETE doesn't physically delete, it logically deletes.
- Similarly, UPDATE doesn't seem to physically update in place, but logically updates by marking the old record as "deletable" and INSERTing (in effect) a new record with your changes.
- All of which leaves a lot of logically deleted, but physically still present, records in your file.

- VACUUM FULL apparently goes one step further than VACUUM, and actually frees up the disk space for the operating system.
- FULL is NOT an option to VACUUM in 7.1.3, so I'm guessing, based on comments from other emails.
- In our shop we're running Postgres 7.1.3 on a Solaris 2.(6?7?8) E3500. After a VACUUM, the unix files in pgsql/data/base are no smaller.
- Eventually these get so big that it's worthwhile to pg_dump, DROP, and reload a volatile table. At which point the unix files are smaller, and response time is improved.
- Presumably in Postgres 7.2+, FULL was added as an option to VACUUM to do this inherently. In IBM mainframe speak, it's somewhat analogous to doing a "freespace" and "de-frag".

- VACUUM ANALYZE, as I've learned so well these past two days, provides stats on the remaining rows in your tables, allowing the QUERY PLAN OPTIMIZER to best determine whether or not to use any indices, and if so, which ones.

- VACUUM VERBOSE just dumps the output of the VACUUM to the calling script or command line. A copy of said output is in your server.log file in any case. Interestingly enough, VACUUM VERBOSE ANALYZE doesn't get you any stats from the analyzer, just the line "Analyzing...." I was hoping for cryptic clues, things like "commonly recurring key, index useless" type messages. Or "too few rows, index useless".

Gerry Smit, Toronto, Canada.
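Gerry's guess about logical deletes is right, and it is easy to watch in a scratch database. A minimal sketch (the throwaway table here is purely hypothetical, and the VACUUM FULL line assumes 7.2 or later):

CREATE TABLE scratch (id integer);
INSERT INTO scratch VALUES (1);
UPDATE scratch SET id = 2;   -- the old row version stays behind as a dead tuple
DELETE FROM scratch;         -- the new version is only marked deleted, not removed
VACUUM scratch;              -- dead tuples become reusable space inside the table's file
VACUUM FULL scratch;         -- 7.2+: the file itself is shrunk and space handed back to the OS
VACUUM ANALYZE scratch;      -- also refreshes the statistics the planner uses
DROP TABLE scratch;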
VACUUM FULL is NOT available in 7.1.3, so it must be later.

Gerry
On Thu, 16 May 2002 gerry.smit@lombard.ca wrote:
> There's no mention of "FULL" as an argument to VACUUM in the 7.1.3
> doc/html on-line reference manual. Was FULL added in a later release?

Yes. I also recommend you upgrade to 7.2 when you can... actually it is now 7.2.1. :-)
On Wed, 15 May 2002 gerry.smit@lombard.ca wrote:
> Here's the scenario. We have about 8 tables or so, in 4 regional databases.

Why do you have these "regional" databases? Couldn't they all be in one database? Do you have a WAN?

> - Two of the tables are updated via on-line transactions from our users.
> Probably a couple of hundred records a day, tops.
> - Records eligible for archiving are "SELECT INTO"d and "DELETE"d
> about once a quarter, typically 1000 or so.

These records are so few that I don't even see why you bother archiving them.

> - The rest of the tables are updated from the mainframe batch cycle, at
> about 3:00 am local time.
> - These records NEVER get updated, just more added each day.

Good.

> - The problem tables have about 800,000 records in them.
> - The daily table gets 3000-5000 new records a day.

Not really much data added.

> - The non-daily table is DROPd and rebuilt monthly.

Why do you drop it? Why not truncate it instead? In particular, if you ever decide to use views, dropping the tables will be a problem unless you recreate the views too.

> VACUUM is run nightly, at about 7:00 pm.

Just plain vacuum? You may want to do a vacuum full.

> This is to make the system ready
> for UNIX back-ups, at about 9:30 pm.

What do you mean by Unix backups? You should NOT be backing up the database from the OS. You should do a pg_dump and then back up that file only.

> Also, in the past, the VACUUM has taken hours, and seems to lock the
> database (not just the table). Thus we're reluctant to run VACUUM at
> 5:00 am, when daily batch finishes.

That is no longer an issue with 7.2 and higher. Run your vacuum analyze right after your loads and before you start doing any large computations. If those computations create additional files or modify a substantial number of records, do yet another vacuum analyze.

> The primary reason for the hours of VACUUM was the non-daily table, above,
> back when we DELETEd most records and then reloaded over two days.
> Deleting 780,000 records of an 800,000 row table creates an IMMENSELY long
> VACUUM.

Vacuum analyze is fairly quick.

> Hence our DROP and rebuild.

You can truncate instead. Near instantaneous, and it doesn't leave any records behind.

> I suspect that what I should do is:
> 1) continue nightly VACUUMs at about 7:00 pm

I think you should do a vacuum full.

> 2) add a weekly VACUUM FULL ANALYZE to be run Saturday evening, ensuring
> it's ready for Sunday FULL disk back-ups

Again, you should not be backing up the database files, but the output of pg_dump.

> Oh, we're a 16/5 operation, with some people dialing in or coming in on
> weekends. 24/7 is NOT a requirement.

Good. It would help you a lot if you do nightly vacuum fulls, especially if space is an issue and/or you have many queries which do sequential scans.
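For reference, the truncate suggestion boils down to something like the sketch below (pol_xref_b is the month-end table named in the next message, and TRUNCATE is available in 7.1 as well, as Gerry notes further down):

TRUNCATE TABLE pol_xref_b;
-- empties the table almost instantly, leaves no dead rows behind to vacuum,
-- and keeps the table definition and its indexes in place.
-- Compare with the current month-end approach:
-- DROP TABLE pol_xref_b;
-- which discards the definition too, so the indexes (and any views on the
-- table) would have to be recreated before the reload.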
Replies embedded. btw, VACUUM ANALYZE has fixed this problem. For those of you looking for solutions, we might be getting into "interesting background" vs "pgsql problem". As a newcomer to this list, should I be taking this reply off-line?

Gerry

Francisco Reyes <lists@natserv.com> wrote on 17/05/2002 11:30 AM:

On Wed, 15 May 2002 gerry.smit@lombard.ca wrote:
> Here's the scenario. We have about 8 tables or so, in 4 regional databases.

Why do you have these "regional" databases? Couldn't they all be in one database? Do you have a WAN?

Gerry: "Regional" is a holdover from when these were on Sun workstations in offices. Now a single server in Toronto, with 4 DBs, keeps the tables small, controls access, and we're REALLY new to SQL. Plus the data is "regional" in scope. Political requirements at the time of the server merge were also involved. A WAN carries traffic to the regional users. We're toying with ONE national DB, but the index speed problem in Central's table (the origin of this thread) kept us thinking response time gets worse exponentially. Thus inflicting 800,000 useless records on Atlantic slows their response for no good reason. We will DEFINITELY be re-thinking that now that we've got "ANALYZE" and this bug figured out. Speed is WAY UP these past two days.

> - Two of the tables are updated via on-line transactions from our users.
> Probably a couple of hundred records a day, tops.
> - Records eligible for archiving are "SELECT INTO"d and "DELETE"d
> about once a quarter, typically 1000 or so.

These records are so few that I don't even see why you bother archiving them.

Gerry: They build up over time, and a novice's understanding is that "fewer records means a faster table". Plus, occasionally, we delete 20-30 thousand records as an entire brokerage cancels off.

> - The rest of the tables are updated from the mainframe batch cycle, at
> about 3:00 am local time.
> - These records NEVER get updated, just more added each day.

Good.

> - The problem tables have about 800,000 records in them.
> - The daily table gets 3000-5000 new records a day.

Not really much data added.

> - The non-daily table is DROPd and rebuilt monthly.

Why do you drop it? Why not truncate it instead? In particular, if you ever decide to use views, dropping the tables will be a problem unless you recreate the views too.

Gerry: The month-end process that updates the second large table (pol_xref_b) gets updates of 20-90 thousand records at a time. In the 1st implementation (running on PostgreSQL 6.5), we issued a DELETE for the older versions of these records, then loaded the new ones. By weekend's completion, you're looking at 780,000 deleted records of 800,000, and then loading in another 780,000 (and growing with time) records. The VACUUM Sunday night would run 24 hours, locking the database Monday morning. So the 2nd implementation (ready the next month) was to DROP the table, rebuild it, and then just let it load. The 20,000 lost records were for "old" clients, and not considered a primary worry, certainly less so than locking the database during office hours.

Gerry: What is "truncate"?

Gerry: Not what are "views", but how do we use them, and install them? That's a DB 101 question, and rhetorical. PostgreSQL for us was a freeware start at SQL.
As we learn things, and need things, we'll either find out how Postgres does them, or jump to a commercial DB where we can hire contractors for turn-key programming (including some knowledge transfer of WHY they built whatever they might design). Room for a plug here: anyone know of commercial Postgres support?

> VACUUM is run nightly, at about 7:00 pm.

Just plain vacuum? You may want to do a vacuum full.

Gerry: Not available in 7.1.3. If all it does is free up disk space, not critical at this time. And the team has more things to do. Fixing something that isn't "broken" goes down in priority. (Sigh.)

> This is to make the system ready
> for UNIX back-ups, at about 9:30 pm.

What do you mean by Unix backups? You should NOT be backing up the database from the OS. You should do a pg_dump and then back up that file only.

Gerry: The current back-up / restore plan is, should something happen that the RAID-5 can't handle, we'd restore the entire $PGDIR. Yes, it might have changed during back-up, but we'll take that chance. Interesting that 7.1 changed the unix file names from table names to meaningless numbers, thus hiding and preventing the ability to individually restore damaged tables. From what I've seen, there's enough other data in other files that this is an ABSOLUTE NO NO.

Gerry: Unix back-ups are generic, full on Sunday, incremental other nights. Postgres resides in a UFS. Implementing a pg_dump cron-script makes sense, given the above comments (yours and mine).

> Also, in the past, the VACUUM has taken hours, and seems to lock the
> database (not just the table). Thus we're reluctant to run VACUUM at
> 5:00 am, when daily batch finishes.

That is no longer an issue with 7.2 and higher. Run your vacuum analyze right after your loads and before you start doing any large computations. If those computations create additional files or modify a substantial number of records, do yet another vacuum analyze.

Gerry: That puts 7.2.1 on the "mid-burner" for us at this point. Better than "back-burner", like too many other things. Also, our first VACUUM ANALYZE, run Wednesday night, took about 5 hours for all DBs. Thursday night's VACUUM ANALYZE took the usual half hour or so. So your point of VACUUM ANALYZE following the 5:00 am batch updates, prior to on-line usage, makes sense, and would help end-users. Thanks.

> The primary reason for the hours of VACUUM was the non-daily table, above,
> back when we DELETEd most records and then reloaded over two days.
> Deleting 780,000 records of an 800,000 row table creates an IMMENSELY long
> VACUUM.

Vacuum analyze is fairly quick.

Gerry: Noted, and now seen, even in 7.1.3! :-)

> Hence our DROP and rebuild.

You can truncate instead. Near instantaneous, and it doesn't leave any records behind.

Gerry: I just read about TRUNCATE in the 7.1.3 on-line manual. I'll definitely try that in lieu of DROP next month-end. The worry is that the VACUUM of the emptied table may take a long time. The DROP and rebuild PREVENTS this, as there is NOTHING to VACUUM, just a new, empty table. Something has to "free up" those rows, and if it's VACUUM, then I'm looking at 24 hours again. Any comments? Or is that my Postgres 6.5 experience and not relevant to PostgreSQL 7.x?

Gerry: btw, what do people LIKE to refer to Postgres as in writing? Postgres, postgres, PostgreSQL (one s or two?), pgsql?

> I suspect that what I should do is:
> 1) continue nightly VACUUMs at about 7:00 pm

I think you should do a vacuum full.
Gerry: Agreed, but not available until we upgrade to 7.2.1.

> 2) add a weekly VACUUM FULL ANALYZE to be run Saturday evening, ensuring
> it's ready for Sunday FULL disk back-ups

Again, you should not be backing up the database files, but the output of pg_dump.

Gerry: Thanks. Such a cron-script has been on my mind for a while. RAID-5 is our primary protection; restore is our secondary. Clearly, I can't count on tape restore of $PGDIR/data/base, so I'll work on said script when I have time (ouch, sigh).

> Oh, we're a 16/5 operation, with some people dialing in or coming in on
> weekends. 24/7 is NOT a requirement.

Good. It would help you a lot if you do nightly vacuum fulls, especially if space is an issue and/or you have many queries which do sequential scans.

Gerry: We CERTAINLY have MANY MANY queries; do they create space problems? Or are we referring to the optimizer's poor choice of sequential vs index scans (now solved since I implemented VACUUM ANALYZE)? Disk space, whilst cheap, is NOT unlimited. So if the QUERIES are wasting space, then a VACUUM FULL would be useful. Poor man's version: pg_dump, pg_ctl stop, dropdb "x", createdb "x", pg_ctl start, reload, VACUUM ANALYZE. There's a weekend I'd rather be camping.

Gerry Smit, Toronto, Canada
(Go Leafs Go!) (For gawd's sake, go! Begone!)