Thread: updating all records of a table
Hi:
I have to update all the records of a table. I'm worried about what the table will look like in terms of fragmentation when this is finished. Is there some sort of table healing/reorg/rebuild measure I should take if I want the resulting table to operate at optimal efficiency? What about indexes, should I drop/recreate those?
(I remember the bad-ole days with Oracle where table defragging and index rebuilding was something we had to do)
Thanks for any help !
On Thu, 2011-03-03 at 20:03 -0700, Gauthier, Dave wrote: > Hi: > > I have to update all the records of a table. I'm worried about what > the table will look like in terms of fragmentation when this is > finished. Is there some sort of table healing/reorg/rebuild measure I > should take if I want the resulting table to operate at optimal > efficiency? What about indexes, should I drop/recreate those? Well it depends on the size of table but yes it is going to create a lot of dead space. A cluster or reindex of the table will solve this for you. JD -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579 Consulting, Training, Support, Custom Development, Engineering http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt
On Thu, Mar 03, 2011 at 08:03:59PM -0700, Gauthier, Dave wrote: > Hi: > > I have to update all the records of a table. I'm worried about what the table will look like in terms of fragmentationwhen this is finished. Is there some sort of table healing/reorg/rebuild measure I should take if I want theresulting table to operate at optimal efficiency? What about indexes, should I drop/recreate those? Is it really important that it happen in one transaction? In the past when I've had to do this on large numbers of rows, I always tried to do it in batches. You can run vacuums in between groups, so that the table doesn't get too bloated. Otherwise, yeah, you're better off to do some of the cleanup Joshua suggested. A -- Andrew Sullivan ajs@crankycanuck.ca
On Mar 4, 2011, at 5:17 PM, Andrew Sullivan wrote: > On Thu, Mar 03, 2011 at 08:03:59PM -0700, Gauthier, Dave wrote: >> Hi: >> >> I have to update all the records of a table. I'm worried about what the table will look like in terms of fragmentationwhen this is finished. Is there some sort of table healing/reorg/rebuild measure I should take if I want theresulting table to operate at optimal efficiency? What about indexes, should I drop/recreate those? > > Is it really important that it happen in one transaction? > > In the past when I've had to do this on large numbers of rows, I > always tried to do it in batches. You can run vacuums in between > groups, so that the table doesn't get too bloated. > > Otherwise, yeah, you're better off to do some of the cleanup Joshua > suggested. > > A +1 If UPDATE is for all rows, then 1. CTAS with change value in SELECT 2. Rename the tables. -- This will give zero Bloats. Thanks & Regards, Vibhor Kumar EnterpriseDB Corporation The Enterprise PostgreSQL Company vibhor.kumar@enterprisedb.com Blog:http://vibhork.blogspot.com
On 03/04/2011 04:54 AM, Vibhor Kumar wrote: > > On Mar 4, 2011, at 5:17 PM, Andrew Sullivan wrote: > >> On Thu, Mar 03, 2011 at 08:03:59PM -0700, Gauthier, Dave wrote: >>> Hi: >>> >>> I have to update all the records of a table. I'm worried about what the table will look like in terms of fragmentationwhen this is finished. Is there some sort of table healing/reorg/rebuild measure I should take if I want theresulting table to operate at optimal efficiency? What about indexes, should I drop/recreate those? >> >> Is it really important that it happen in one transaction? >> >> In the past when I've had to do this on large numbers of rows, I >> always tried to do it in batches. You can run vacuums in between >> groups, so that the table doesn't get too bloated. >> >> Otherwise, yeah, you're better off to do some of the cleanup Joshua >> suggested. >> >> A > +1 > > If UPDATE is for all rows, then > 1. CTAS with change value in SELECT > 2. Rename the tables. -- This will give zero Bloats. > Elegant, but of course, this doubles the disk space consumed. Not always tenable.
I like the "cluster" and "reindex" ideas. The table is not that big and I do have the disk space. This table will alsogrow over time, so if the table ends up taking more space in the end, that's OK, it'll get used. The DB will also beunavailable to the users while this is happening, so I won't have to be contending with interactive users. Thanks for all the suggestions! -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Rob Sargent Sent: Friday, March 04, 2011 10:21 AM To: pgsql-general@postgresql.org Subject: Re: [GENERAL] updating all records of a table On 03/04/2011 04:54 AM, Vibhor Kumar wrote: > > On Mar 4, 2011, at 5:17 PM, Andrew Sullivan wrote: > >> On Thu, Mar 03, 2011 at 08:03:59PM -0700, Gauthier, Dave wrote: >>> Hi: >>> >>> I have to update all the records of a table. I'm worried about what the table will look like in terms of fragmentationwhen this is finished. Is there some sort of table healing/reorg/rebuild measure I should take if I want theresulting table to operate at optimal efficiency? What about indexes, should I drop/recreate those? >> >> Is it really important that it happen in one transaction? >> >> In the past when I've had to do this on large numbers of rows, I >> always tried to do it in batches. You can run vacuums in between >> groups, so that the table doesn't get too bloated. >> >> Otherwise, yeah, you're better off to do some of the cleanup Joshua >> suggested. >> >> A > +1 > > If UPDATE is for all rows, then > 1. CTAS with change value in SELECT > 2. Rename the tables. -- This will give zero Bloats. > Elegant, but of course, this doubles the disk space consumed. Not always tenable. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
robjsargent@gmail.com (Rob Sargent) writes: > On 03/04/2011 04:54 AM, Vibhor Kumar wrote: >> >> On Mar 4, 2011, at 5:17 PM, Andrew Sullivan wrote: >> >>> On Thu, Mar 03, 2011 at 08:03:59PM -0700, Gauthier, Dave wrote: >>>> Hi: >>>> >>>> I have to update all the records of a table. I'm worried about >>>> what the table will look like in terms of fragmentation when this >>>> is finished. Is there some sort of table healing/reorg/rebuild >>>> measure I should take if I want the resulting table to operate at >>>> optimal efficiency? What about indexes, should I drop/recreate >>>> those? >>> >>> Is it really important that it happen in one transaction? >>> >>> In the past when I've had to do this on large numbers of rows, I >>> always tried to do it in batches. You can run vacuums in between >>> groups, so that the table doesn't get too bloated. >>> >>> Otherwise, yeah, you're better off to do some of the cleanup Joshua >>> suggested. >>> >>> A >> +1 >> >> If UPDATE is for all rows, then >> 1. CTAS with change value in SELECT >> 2. Rename the tables. -- This will give zero Bloats. >> > > Elegant, but of course, this doubles the disk space consumed. Not > always tenable. ... But if you needed to do it in one Swell Foop, there really wasn't any other choice. The only way *not* to double (or more) space consumption is to do incremental updates, vacuuming around each increment. -- select 'cbbrowne' || '@' || 'linuxdatabases.info'; http://www3.sympatico.ca/cbbrowne/lisp.html We are MICROS~1. You will be assimilated. Resistance is futile. (Attributed to B.G., Gill Bates)
This has been a great thread! I am missing something because I do not know what CTAS is. WOuld someone please help me understand. ray
On Sat, Mar 05, 2011 at 07:38:23AM -0800, ray wrote: > This has been a great thread! I am missing something because I do not > know what CTAS is. WOuld someone please help me understand. Create Table As Select. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patriotism is when love of your own people comes first; nationalism, > when hate for people other than your own comes first. > - Charles de Gaulle
Attachment
- fillfactor (integer)
The fillfactor for a table is a percentage between 10 and 100. 100 (complete packing) is the default. When a smaller fillfactor is specified, INSERT operations pack table pages only to the indicated percentage; the remaining space on each page is reserved for updating rows on that page. This gives UPDATE a chance to place the updated copy of a row on the same page as the original, which is more efficient than placing it on a different page. For a table whose entries are never updated, complete packing is the best choice, but in heavily updated tables smaller fillfactors are appropriate. This parameter cannot be set for TOAST tables.--
"Patriotism is the conviction that your country is superior to all others because you were born in it." -- George Bernard Shaw