Re: Slow update - Mailing list pgsql-general

From Marc Mamin
Subject Re: Slow update
Date
Msg-id C4DAC901169B624F933534A26ED7DF311D5368@JENMAIL01.ad.intershop.net
Whole thread Raw
In response to Re: Slow update  (Herouth Maoz <herouth@unicell.co.il>)
List pgsql-general
> I don't see why it would
 
This may reduce I/O activity and reduce the vacuum activity on this table.
 
Here a small example:

insert into update_test select * from generate_series (1,100000)
vacuum full verbose update_test
-> INFO:  "update_test": found 0 removable, 100000 nonremovable row versions in 393 pages
 
--now update one row:
-- or in you case, only the rows that would get modified (my query proposal)
 
update update_test set a=1 where a=1;
 
vacuum full verbose update_test
-> INFO:  "update_test": found 1 removable, 100000 nonremovable row versions in 393 pages
 
--update all rows
-- or in your case, all rows that match your update query
update update_test set a=a
vacuum full verbose update_test
 
-> INFO:  "update_test": found 100000 removable, 100000 nonremovable row versions in 785 pages
 

Adding elements in the where clause will slow down the "recheck" operations, but your indexes will probably be used as in your query.
 
While limiting the number of rows being updated, you will reduce I/O activity and reduce the need of vacuuming your table...
This approach may be superfluous if the extra conditions do not reduce the number of updated rows significantly...
 
 
 
cheers,
 
Marc
 


From: Herouth Maoz [mailto:herouth@unicell.co.il]
Sent: Wednesday, January 21, 2009 12:50 PM
To: Marc Mamin
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Slow update

Marc Mamin wrote:

Hello,
 
- did you vacuum your tables recently ?
 
- What I miss in your query is a check for the rows that do not need to be udated:
 
AND NOT (service = b.service
               AND status = b.status
              AND has_notification = gateway_id NOT IN (4,101,102)
              AND operator = COALESCE( b.actual_target_network_id,  b.requested_target_network_id   )
 
 
depending on the fraction of rows that are already up to date, the might fasten your process quite a lot...
I don't see why it would. As far as I know, the high saving in update time is done by using the indices. All the other conditions that are not on indices are all checked using a sequential scan on the rows that were brought from the index, so adding more conditions wouldn't make this a lot faster - maybe even slower because more comparisons are made.

In any case, the logic of the database is that the records that have delivered = 0 are always a subset of the records that are changed in this query, so querying on delivered=0 - which is an indexed query - actually make the above redundant.

Thanks for your response,
Herouth

pgsql-general by date:

Previous
From: Igor Katson
Date:
Subject: Re: [Plproxy-users] A complex plproxy query
Next
From: Kent Tong
Date:
Subject: Re: how to specify the locale that psql uses