Thread: Re: [ SOLVED ] select count(*) very slow on an already
Bill, if you had alot of updates and deletions and wanted to optimize your table, can you just issue the cluster command. Will the cluster command rewrite the table without the obsolete data that a vacuum flags or do you need to issue a vacuum first? Dan. -----Original Message----- From: Bill Moran [mailto:wmoran@potentialtech.com] Sent: Thursday, April 15, 2004 2:49 PM To: Rajesh Kumar Mallah Cc: Postgres Performance Subject: Re: [PERFORM] [ SOLVED ] select count(*) very slow on an already Rajesh Kumar Mallah wrote: > Bill Moran wrote: > >> Rajesh Kumar Mallah wrote: >> >>> Hi, >>> >>> The problem was solved by reloading the Table. >>> the query now takes only 3 seconds. But that is >>> not a solution. >> >> If dropping/recreating the table improves things, then we can reasonably >> assume that the table is pretty active with updates/inserts. Correct? > > Yes the table results from an import process and under goes lots > of inserts and updates , but thats before the vacuum full operation. > the table is not accessed during vacuum. What i want to know is > is there any wat to automate the dumping and reload of a table > individually. will the below be safe and effective: The CLUSTER command I described is one way of doing this. It essentially automates the task of copying the table, dropping the old one, and recreating it. >> If the data gets too fragmented, a vacuum may not be enough. Also, read >> up on the recommendations _against_ vacuum full (recommending only using >> vacuum on databases) With full, vacuum condenses the database, which may >> actually hurt performance. A regular vacuum just fixes things up, and >> may leave unused space lying around. However, this should apparently >> achieve a balance between usage and vacuum. See the docs, they are much >> better at describing this than I am. >> > i understand simultaneous vacuum and usage detoriates performance mostly. > but this case is different. Just want to make sure we're on the same page here. I'm not talking about vacuuming simultaneous with anything. I'm simply saying that "vacuum full" isn't always the best choice. You should probably only be doing "vacuum". The reason and details for this are in the admin docs. -- Bill Moran Potential Technologies http://www.potentialtech.com ---------------------------(end of broadcast)--------------------------- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly
Shea,Dan [CIS] wrote: > Bill, if you had alot of updates and deletions and wanted to optimize your > table, can you just issue the cluster command. > Will the cluster command rewrite the table without the obsolete data that a > vacuum flags or do you need to issue a vacuum first? From the reference docs: "During the cluster operation, a temporary copy of the table is created that contains the table data in the index order. Temporary copies of each index on the table are created as well. Therefore, you need free space on disk at least equal to the sum of the table size and the index sizes. "CLUSTER preserves GRANT, inheritance, index, foreign key, and other ancillary information about the table. "Because the optimizer records statistics about the ordering of tables, it is advisable to run ANALYZE on the newly clustered table. Otherwise, the optimizer may make poor choices of query plans." The primary reason CLUSTER exists is to allow you to physically reorder a table based on a key. This should provide a performance improvement if data with the same key is accessed all at once. (i.e. if you do "SELECT * FROM table WHERE key=5" and it returns 100 rows, those 100 rows are guaranteed to be all on the same part of the disk after CLUSTER, thus a performance improvement should result.) Updates and inserts will add data in the next available space in a table with no regard for any keys, and _may_ require running all over the disk to retrieve the data in the previous example query. I doubt if CLUSTER is an end-all optimization tool. The specific reason I suggested it was because the original poster was asking for an easier way to drop/recreate a table (as prior experimentation had shown this to improve performance) I can't think of anything easier than "CLUSTER <tablename> ON <keyname>" Since CLUSTER recreates the table, it implicitly removes the dead tuples. However, it's going to be a LOT slower than vacuum, so if dead tuples are the main problem, vacuum is still the way to go. -- Bill Moran Potential Technologies http://www.potentialtech.com