Re: Long Running Update - Mailing list pgsql-performance

From Kevin Grittner
Subject Re: Long Running Update
Date
Msg-id 4E034E78020000250003EB37@gw.wicourts.gov
Whole thread Raw
In response to Long Running Update  (Harry Mantheakis <harry.mantheakis@riskcontrollimited.com>)
Responses Re: Long Running Update
List pgsql-performance
Harry Mantheakis <harry.mantheakis@riskcontrollimited.com> wrote:

> UPDATE
>    table_A
> SET
> (
>    field_1
> , field_2
> ) = (
> table_B.field_1
> , table_B.field_2
> )
> FROM
> table_B
> WHERE
> table_B.id = table_A.id
> ;

I would have just done:

  SET field_1 = table_B.field_1, field_2 = table_B.field_2

instead of using row value constructors.  That might be slowing
things down a bit.

> I tested (the logic of) this statement with a very small sample,
> and it worked correctly.

Always a good sign.  :-)

> The statement has been running for 18+ hours so far.

> My question is: can I reasonably expect a statement like this to
> complete with such a large data-set, even if it takes several
> days?

If it's not leaking memory, I expect that it will complete.

To get some sense of what it's doing, you could log on to another
connection and EXPLAIN the statement.  (NOTE: Be careful *not* to
use EXPLAIN ANALYZE.)

Another thing to consider if you run something like this again is
that an UPDATE is an awful lot like an INSERT combined with a
DELETE.  The way PostgreSQL MVCC works, the old version of each row
must remain until the updating transaction completes.  If you were
to divide this update into a series of updates by key range, the new
versions of the rows from later updates could re-use the space
previously occupied by the old version of rows from earlier updates.
For similar reasons, you might want to add something like this to
your WHERE clause, to prevent unnecessary updates:

  AND (table_B.field_1 IS DISTINCT FROM table_A.field_1
    OR table_B.field_2 IS DISTINCT FROM table_A.field_2);

-Kevin

pgsql-performance by date:

Previous
From: Claudio Freire
Date:
Subject: Re: Long Running Update
Next
From: Tripura
Date:
Subject: Re: Improve the Postgres Query performance