Thread: updating all records of a table

updating all records of a table

From
"Gauthier, Dave"
Date:

Hi:

 

I have to update all the records of a table.  I'm worried about what the table will look like in terms of fragmentation when this is finished.  Is there some sort of table healing/reorg/rebuild measure I should take if I want the resulting table to operate at optimal efficiency?  What about indexes, should I drop/recreate those?

 

(I remember the bad-ole days with Oracle where table defragging and index rebuilding was something we had to do)

 

Thanks  for any help !

Re: updating all records of a table

From
"Joshua D. Drake"
Date:
On Thu, 2011-03-03 at 20:03 -0700, Gauthier, Dave wrote:
> Hi:
>
> I have to update all the records of a table.  I'm worried about what
> the table will look like in terms of fragmentation when this is
> finished.  Is there some sort of table healing/reorg/rebuild measure I
> should take if I want the resulting table to operate at optimal
> efficiency?  What about indexes, should I drop/recreate those?

Well it depends on the size of table but yes it is going to create a lot
of dead space. A cluster or reindex of the table will solve this for
you.

JD

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt


Re: updating all records of a table

From
Andrew Sullivan
Date:
On Thu, Mar 03, 2011 at 08:03:59PM -0700, Gauthier, Dave wrote:
> Hi:
>
> I have to update all the records of a table.  I'm worried about what the table will look like in terms of
fragmentationwhen this is finished.  Is there some sort of table healing/reorg/rebuild measure I should take if I want
theresulting table to operate at optimal efficiency?  What about indexes, should I drop/recreate those? 

Is it really important that it happen in one transaction?

In the past when I've had to do this on large numbers of rows, I
always tried to do it in batches.  You can run vacuums in between
groups, so that the table doesn't get too bloated.

Otherwise, yeah, you're better off to do some of the cleanup Joshua
suggested.

A

--
Andrew Sullivan
ajs@crankycanuck.ca

Re: updating all records of a table

From
Vibhor Kumar
Date:
On Mar 4, 2011, at 5:17 PM, Andrew Sullivan wrote:

> On Thu, Mar 03, 2011 at 08:03:59PM -0700, Gauthier, Dave wrote:
>> Hi:
>>
>> I have to update all the records of a table.  I'm worried about what the table will look like in terms of
fragmentationwhen this is finished.  Is there some sort of table healing/reorg/rebuild measure I should take if I want
theresulting table to operate at optimal efficiency?  What about indexes, should I drop/recreate those? 
>
> Is it really important that it happen in one transaction?
>
> In the past when I've had to do this on large numbers of rows, I
> always tried to do it in batches.  You can run vacuums in between
> groups, so that the table doesn't get too bloated.
>
> Otherwise, yeah, you're better off to do some of the cleanup Joshua
> suggested.
>
> A
+1

If UPDATE is for all rows, then
1. CTAS with change value in SELECT
2. Rename the tables. -- This will give zero Bloats.


Thanks & Regards,
Vibhor Kumar
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
vibhor.kumar@enterprisedb.com
Blog:http://vibhork.blogspot.com


Re: updating all records of a table

From
Rob Sargent
Date:

On 03/04/2011 04:54 AM, Vibhor Kumar wrote:
>
> On Mar 4, 2011, at 5:17 PM, Andrew Sullivan wrote:
>
>> On Thu, Mar 03, 2011 at 08:03:59PM -0700, Gauthier, Dave wrote:
>>> Hi:
>>>
>>> I have to update all the records of a table.  I'm worried about what the table will look like in terms of
fragmentationwhen this is finished.  Is there some sort of table healing/reorg/rebuild measure I should take if I want
theresulting table to operate at optimal efficiency?  What about indexes, should I drop/recreate those? 
>>
>> Is it really important that it happen in one transaction?
>>
>> In the past when I've had to do this on large numbers of rows, I
>> always tried to do it in batches.  You can run vacuums in between
>> groups, so that the table doesn't get too bloated.
>>
>> Otherwise, yeah, you're better off to do some of the cleanup Joshua
>> suggested.
>>
>> A
> +1
>
> If UPDATE is for all rows, then
> 1. CTAS with change value in SELECT
> 2. Rename the tables. -- This will give zero Bloats.
>

Elegant, but of course, this doubles the disk space consumed.  Not
always tenable.

Re: updating all records of a table

From
"Gauthier, Dave"
Date:
I like the "cluster" and "reindex" ideas.  The table is not that big and I do have the disk space.  This table will
alsogrow over time, so if the table ends up taking more space in the end, that's OK, it'll get used.  The DB will also
beunavailable to the users while this is happening, so I won't have to be contending with interactive users. 

Thanks for all the suggestions!

-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Rob Sargent
Sent: Friday, March 04, 2011 10:21 AM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] updating all records of a table



On 03/04/2011 04:54 AM, Vibhor Kumar wrote:
>
> On Mar 4, 2011, at 5:17 PM, Andrew Sullivan wrote:
>
>> On Thu, Mar 03, 2011 at 08:03:59PM -0700, Gauthier, Dave wrote:
>>> Hi:
>>>
>>> I have to update all the records of a table.  I'm worried about what the table will look like in terms of
fragmentationwhen this is finished.  Is there some sort of table healing/reorg/rebuild measure I should take if I want
theresulting table to operate at optimal efficiency?  What about indexes, should I drop/recreate those? 
>>
>> Is it really important that it happen in one transaction?
>>
>> In the past when I've had to do this on large numbers of rows, I
>> always tried to do it in batches.  You can run vacuums in between
>> groups, so that the table doesn't get too bloated.
>>
>> Otherwise, yeah, you're better off to do some of the cleanup Joshua
>> suggested.
>>
>> A
> +1
>
> If UPDATE is for all rows, then
> 1. CTAS with change value in SELECT
> 2. Rename the tables. -- This will give zero Bloats.
>

Elegant, but of course, this doubles the disk space consumed.  Not
always tenable.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: updating all records of a table

From
Chris Browne
Date:
robjsargent@gmail.com (Rob Sargent) writes:
> On 03/04/2011 04:54 AM, Vibhor Kumar wrote:
>>
>> On Mar 4, 2011, at 5:17 PM, Andrew Sullivan wrote:
>>
>>> On Thu, Mar 03, 2011 at 08:03:59PM -0700, Gauthier, Dave wrote:
>>>> Hi:
>>>>
>>>> I have to update all the records of a table.  I'm worried about
>>>> what the table will look like in terms of fragmentation when this
>>>> is finished.  Is there some sort of table healing/reorg/rebuild
>>>> measure I should take if I want the resulting table to operate at
>>>> optimal efficiency?  What about indexes, should I drop/recreate
>>>> those?
>>>
>>> Is it really important that it happen in one transaction?
>>>
>>> In the past when I've had to do this on large numbers of rows, I
>>> always tried to do it in batches.  You can run vacuums in between
>>> groups, so that the table doesn't get too bloated.
>>>
>>> Otherwise, yeah, you're better off to do some of the cleanup Joshua
>>> suggested.
>>>
>>> A
>> +1
>>
>> If UPDATE is for all rows, then
>> 1. CTAS with change value in SELECT
>> 2. Rename the tables. -- This will give zero Bloats.
>>
>
> Elegant, but of course, this doubles the disk space consumed.  Not
> always tenable.

... But if you needed to do it in one Swell Foop, there really wasn't
any other choice.

The only way *not* to double (or more) space consumption is to do
incremental updates, vacuuming around each increment.
--
select 'cbbrowne' || '@' || 'linuxdatabases.info';
http://www3.sympatico.ca/cbbrowne/lisp.html
We  are MICROS~1.   You will  be assimilated.   Resistance  is futile.
(Attributed to B.G., Gill Bates)

Re: updating all records of a table

From
ray
Date:
This has been a great thread!  I am missing something because I do not
know what CTAS is.  WOuld someone please help me understand.

ray

Re: updating all records of a table

From
Martijn van Oosterhout
Date:
On Sat, Mar 05, 2011 at 07:38:23AM -0800, ray wrote:
> This has been a great thread!  I am missing something because I do not
> know what CTAS is.  WOuld someone please help me understand.

Create Table As Select.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first.
>                                       - Charles de Gaulle

Attachment

Re: updating all records of a table

From
Willy-Bas Loos
Date:
you might consider lowering the fillfactor a bit. It will consume more space, but it will make updates and inserts faster.

fillfactor (integer)

The fillfactor for a table is a percentage between 10 and 100. 100 (complete packing) is the default. When a smaller fillfactor is specified, INSERT operations pack table pages only to the indicated percentage; the remaining space on each page is reserved for updating rows on that page. This gives UPDATE a chance to place the updated copy of a row on the same page as the original, which is more efficient than placing it on a different page. For a table whose entries are never updated, complete packing is the best choice, but in heavily updated tables smaller fillfactors are appropriate. This parameter cannot be set for TOAST tables.--
"Patriotism is the conviction that your country is superior to all others because you were born in it." -- George Bernard Shaw