On 1/6/20 10:36 AM, Israel Brewster wrote:
> Thanks to a change in historical data, I have a need to update a large
> number of records (around 50 million). The update itself is
> straightforward, as I can just issue an "UPDATE table_name SET
> changed_field=new_value();" (yes, new_value is the result of a stored
> procedure, if that makes a difference) command via psql, and it should
> work. However, due to the large number of records this command will
> obviously take a while, and if anything goes wrong during the update
> (one bad value in row 45 million, lost connection, etc), all the work
> that has been done already will be lost due to the transactional nature
> of such commands (unless I am missing something).
>
> Given that each row update is completely independent of any other row, I
> have the following questions:
>
> 1) Is there any way to set the command such that each row change is
> committed as it is calculated?
Pretty sure:
UPDATE table_name SET changed_field=new_value();
is a single statement, and therefore a single transaction, so it is all
or nothing. If you want incremental commits you will need to run the
update row by row or, more practically, in batches.
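
As a rough sketch of the batch approach, assuming the table has an
integer primary key (called "id" here; adjust to your schema) and you
are on PostgreSQL 11 or later, where procedures may COMMIT between
batches:

CREATE PROCEDURE update_in_batches(batch_size int DEFAULT 10000)
LANGUAGE plpgsql
AS $$
DECLARE
    last_id bigint := 0;
    max_id  bigint;
BEGIN
    SELECT max(id) INTO max_id FROM table_name;
    WHILE last_id < max_id LOOP
        UPDATE table_name
           SET changed_field = new_value()
         WHERE id > last_id AND id <= last_id + batch_size;
        last_id := last_id + batch_size;
        COMMIT;  -- each batch is its own transaction
    END LOOP;
END;
$$;

CALL update_in_batches();

If a batch fails you only lose that batch and can restart from the last
committed id. On versions before 11 the same idea works with the loop
driven from a client script (e.g. psql in a shell loop), one
transaction per batch.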
> 2) Is there some way to run this command in parallel in order to better
> utilize multiple processor cores, other than manually breaking the data
> into chunks and running a separate psql/update process for each chunk?
> Honestly, manual parallelizing wouldn’t be too bad (there are a number
> of logical segregations I can apply), I’m just wondering if there is a
> more automatic option.
This is a good time to ask: what Postgres version?
I am still working out the recent parallel query additions. Not sure
whether they apply to UPDATE or not.
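
If you do end up splitting it by hand, a rough sketch, reusing the
hypothetical "id" key from above (the database name and slice count are
placeholders; each psql process gets its own connection and its own
transaction):

for i in 0 1 2 3; do
    psql -d mydb -c "UPDATE table_name SET changed_field = new_value() WHERE id % 4 = $i;" &
done
wait

This can be combined with the batching procedure above by giving each
worker its own contiguous id range instead of a modulus, which also
lets an index on id be used for each slice.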
> ---
> Israel Brewster
> Software Engineer
> Alaska Volcano Observatory
> Geophysical Institute - UAF
> 2156 Koyukuk Drive
> Fairbanks AK 99775-7320
> Work: 907-474-5172
> cell: 907-328-9145
>
--
Adrian Klaver
adrian.klaver@aklaver.com