Re: Long Running Update - Mailing list pgsql-performance

From Harry Mantheakis
Subject Re: Long Running Update
Msg-id 4E047210.4020400@riskcontrollimited.com
In response to Re: Long Running Update  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Responses Re: Long Running Update
Re: Long Running Update
List pgsql-performance
Thank you Kevin.

 > SET field_1 = table_B.field_1, field_2 = table_B.field_2

I will try that next time, if I have to.

 > add something like this to your WHERE clause,
 > to prevent unnecessary updates:
 >
 >   AND (table_B.field_1 IS DISTINCT FROM table_A.field_1
 >    OR table_B.field_2 IS DISTINCT FROM table_A.field_2);

Thank you for that explanation - I will keep it in mind in future. (In
this case, the two fields being updated are both known to be empty -
hence, distinct - in the target table.)
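For reference, the full statement with Kevin's guard clause would look
something like this (table and column names as used in this thread; the
join on id is taken from the Hash Cond in the plan below):

```sql
UPDATE table_A
SET    field_1 = table_B.field_1,
       field_2 = table_B.field_2
FROM   table_B
WHERE  table_A.id = table_B.id
  -- skip rows that would not actually change; IS DISTINCT FROM
  -- treats NULLs as comparable, unlike plain <>
  AND (table_B.field_1 IS DISTINCT FROM table_A.field_1
   OR  table_B.field_2 IS DISTINCT FROM table_A.field_2);
```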

 > EXPLAIN the statement

Here is the EXPLAIN result:

----------------------------------------------------------------------
                              QUERY PLAN
----------------------------------------------------------------------
 Hash Join  (cost=2589312.08..16596998.47 rows=74558048 width=63)
   Hash Cond: (table_A.id = table_B.id)
   ->  Seq Scan on table_A  (cost=0.00..1941825.05 rows=95612705 width=47)
   ->  Hash  (cost=1220472.48..1220472.48 rows=74558048 width=20)
         ->  Seq Scan on table_B  (cost=0.00..1220472.48 rows=74558048 width=20)
----------------------------------------------------------------------

The documentation says the 'cost' numbers are measured, by convention,
in units of 'disk page fetches'.

Do you, by any chance, have any notion of how many disk page fetches can
be processed per second in practice - at least a rough idea?

IOW how do I convert - guesstimate! - these numbers into (plausible)
time values?
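As a very rough back-of-the-envelope sketch (assuming the default cost
settings seq_page_cost = 1.0 and cpu_tuple_cost = 0.01): the table_A
scan cost of 1,941,825 decomposes as roughly 985,698 page units plus
95,612,705 rows x 0.01 CPU units, i.e. about 985,698 x 8 kB = ~7.5 GB
of sequential reads for that table alone. The planner's stored page and
row estimates can be checked directly:

```sql
-- relpages / reltuples are the planner's stored estimates,
-- refreshed by ANALYZE; pg_relation_size gives the actual size
SELECT relname, relpages, reltuples,
       pg_size_pretty(pg_relation_size(oid)) AS on_disk
FROM   pg_class
WHERE  relname IN ('table_A', 'table_B');
```

Dividing the byte totals by a plausible sequential-read rate for your
disks then gives a (very approximate) lower bound on the scan time.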

Kind regards

Harry Mantheakis
London, UK

