Re: The fastest way to update thousands of rows in moderately sized table - Mailing list pgsql-general

From Adrian Klaver
Subject Re: The fastest way to update thousands of rows in moderately sized table
Msg-id 55B154DB.30305@aklaver.com
In response to The fastest way to update thousands of rows in moderately sized table  (twoflower <standa.kurik@gmail.com>)
Responses Re: The fastest way to update thousands of rows in moderately sized table
List pgsql-general
On 07/23/2015 01:17 PM, twoflower wrote:
> Hello, I have a table with 30 million records in which I need to update
> a single column for a couple of thousand rows, say 10,000. The new
> column value is identical for all matching rows. Doing
>
> update "TRANSLATION" set fk_assignment = null where fk_job = 1000;
>
> takes 45 seconds. I understand that an UPDATE is basically an INSERT
> followed by a DELETE, but I was hoping I could do better than that. I
> found a suggestion to use a temporary table to speed things up, so now
> I have this:
>
> create unlogged table "temp_table" as
> select id, fk_assignment
> from "TRANSLATION"
> where fk_job = 1000;
>
> update "temp_table" set fk_assignment = null;
>
> update "TRANSLATION" _target
> set fk_assignment = _source.fk_assignment
> from "temp_table" _source
> where _target.id = _source.id;
>
> drop table "temp_table";

Have you tried wrapping the above in a BEGIN/COMMIT block?

So:

BEGIN;

create unlogged table "temp_table" as
select id, fk_assignment
from "TRANSLATION"
where fk_job = 1000;

update "temp_table" set fk_assignment = null;

update "TRANSLATION" _target
set fk_assignment = _source.fk_assignment
from "temp_table" _source
where _target.id = _source.id;

drop table "temp_table";

COMMIT;
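
Also, since the new value is NULL for every matching row, the temp table
may not be buying you anything. It would be worth timing the direct
single-statement form inside the same transaction for comparison (just a
sketch of the same update, nothing new in it):

BEGIN;

update "TRANSLATION"
set fk_assignment = null
where fk_job = 1000;

COMMIT;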

>
> This got me to about 37 seconds. Still pretty slow. TRANSLATION has an
> index and a foreign key constraint on fk_assignment. Removing the
> constraint brought very little benefit. Removing the index is probably
> out of the question, as these kinds of operations are very frequent and
> the table itself is used heavily, including the index. Execution plan:
>
> Update on "TRANSLATION" _target  (cost=0.56..116987.76 rows=13983 width=405) (actual time=43262.266..43262.266 rows=0 loops=1)
>   ->  Nested Loop  (cost=0.56..116987.76 rows=13983 width=405) (actual time=0.566..146.084 rows=8920 loops=1)
>         ->  Seq Scan on temp_segs _source  (cost=0.00..218.83 rows=13983 width=22) (actual time=0.457..13.994 rows=8920 loops=1)
>         ->  Index Scan using "TRANSLATION_pkey" on "TRANSLATION" _target  (cost=0.56..8.34 rows=1 width=391) (actual time=0.009..0.011 rows=1 loops=8920)
>               Index Cond: (id = _source.id)
>
> Planning time: 1.167 ms
> Execution time: 43262.577 ms
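
One thing the plan shows: the nested loop itself finishes in about 146
ms, so nearly all of the 43 seconds is spent applying the changes, that
is, writing the new row versions, maintaining the indexes and checking
the foreign key. Running the update with buffer accounting turned on
should show how much I/O that involves, something like:

explain (analyze, buffers)
update "TRANSLATION" _target
set fk_assignment = _source.fk_assignment
from "temp_table" _source
where _target.id = _source.id;

(Wrap it in BEGIN/ROLLBACK if you don't want the test run to stick,
since EXPLAIN ANALYZE really executes the update.)
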
> Is there anything else worth trying? Are these numbers to be expected,
> in your experience?
>
> I have Postgres 9.4, the database is on SSD.
>
> Thank you very much for any suggestions.
>
> Standa
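
Also, are there any triggers on "TRANSLATION"? A per-row trigger would
fire once for each of the ~9000 updated rows and could easily account
for time like this. Something like the query below (against pg_trigger,
skipping the internal FK triggers) will list them, or just use
\d "TRANSLATION" in psql:

select tgname
from pg_trigger
where tgrelid = '"TRANSLATION"'::regclass
and not tgisinternal;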


--
Adrian Klaver
adrian.klaver@aklaver.com

