Re: [GENERAL] Index impact on update? - Mailing list pgsql-general

From Tomas Vondra
Subject Re: [GENERAL] Index impact on update?
Date
Msg-id dfbdbe34-9037-48cc-1623-ab7fdc8fcf29@2ndquadrant.com
Whole thread Raw
In response to [GENERAL] Index impact on update?  (Israel Brewster <israel@ravnalaska.net>)
List pgsql-general
On 01/04/2017 05:59 PM, Israel Brewster wrote:
> Short version:
> Do indexes impact the speed of an UPDATE, even when the indexed columns
> aren't changing?
>

They shouldn't, as long as the updated tuple can be updated on the same
page (8kB chunk of data). In that case we can do a HOT update for the
row, without updating the index(es).

But as you're updating the whole table, that would require about 50% of
all pages to be free, which is unlikely to be true. So perhaps some
updates can proceed without touching indexes, but most can't.

> Details:
> I have a table containing geographical data (Latitude, longitude, and
> elevation) with 406,833,705 records. The Latitude and Longitude columns
> are indexed. In order to better utilize the data, I've been looking into
> PostGIS, and decided I wanted to add a "Location" column with PostGIS
> type "GEOGRAPHY(point)". I then tried to populate it from the existing
> latitude/longitude data using the following query:
>
> UPDATE data SET
> location=ST_GeographyFromText('SRID=4326;POINT('||lng::text||'
> '||lat::text||')');
>
> I expected this update to take quite a while, since it has 406 million
> rows to update, but at this point it's been over 19 hours since I
> started the query, and it still hasn't completed.
>
> I'm wondering if the presence of the indexes could be slowing things
> down even though the indexed columns aren't being updated? Would I be
> better off canceling the update query, dropping the indexes, and trying
> again? Or is more likely that the update query is "almost" done, and it
> would be better to just let it run it's course? Or is there an even
> better option, such as perhaps exporting the data, adding the additional
> column in a text editor, and re-importing the data with a COPY command?
>

As explained above, it's likely that such full-table update has to
modify the indexes anyway, making it much more expensive. Without
additional information it's however impossible to confirm that's what's
causing the long update in this case - there may be other bits slowing
it down - e.g. foreign keys checks, triggers.

CREATE TABLE AS SELECT would not pay any of those costs, of course.
Also, if you're running with wal_level=minimal, it would not have to
write the changes into WAL, while the regular UPDATE has to do that.


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


pgsql-general by date:

Previous
From: DrakoRod
Date:
Subject: [GENERAL] Re: could not load library "$libdir/sslutils": in pg_upgradeprocess
Next
From: Vitaly Burovoy
Date:
Subject: Re: [GENERAL] [BUGS] Postgresql query HAVING do not work