Home > mailing lists

Thread: Index not being used for UPDATE?

Index not being used for UPDATE?

From

Phoenix Kiula

Date:

27 April 2011, 13:16:16

Hi.

Just want to check why, in an UPDATE sql, the JOIN condition is not making use of the index?

In both tables being joined, the column in question is in fact the primary key!

Table structure and query below. All I want is to take values from a smaller "accesscount" table and update from it the values in the TABLE1 table, which is a larger table.

The query plan shows sequential scan of both the tables. Why is this and how can I work around it?

Thanks!

Table "public.TABLE1"

QUERY PLAN
------------------------------------------------------------------------------------------
Update (cost=22985.69..1088981.66 rows=613453 width=173)
-> Hash Join (cost=22985.69..1088981.66 rows=613453 width=173)
Hash Cond: ((TABLE1.alias)::text = (v.alias)::text)
-> Seq Scan on TABLE1 (cost=0.00..410625.10 rows=12029410 width=159)
-> Hash (cost=11722.53..11722.53 rows=613453 width=21)
-> Seq Scan on accesscount v (cost=0.00..11722.53 rows=613453 width=21)
(6 rows)

Time: 0.848 ms

Re: Index not being used for UPDATE?

From

Tom Lane

Date:

27 April 2011, 14:50:26

Phoenix Kiula <phoenix.kiula@gmail.com> writes:
> Just want to check why, in an UPDATE sql, the JOIN condition is not making
> use of the index?
> [ whole-table update done with a hash join ]

That's a perfectly good plan.  Indexes typically help only when you want
to process just part of a table.

If you don't believe it, you can try forcing a different plan to be
chosen (see enable_hashjoin et al), but more than likely it'll be
slower.

            regards, tom lane

Re: Index not being used for UPDATE?

From

Andy Colson

Date:

27 April 2011, 14:52:17

On 4/27/2011 11:15 AM, Phoenix Kiula wrote:
> Hi.
>
> Just want to check why, in an UPDATE sql, the JOIN condition is not
> making use of the index?
>
> In both tables being joined, the column in question is in fact the
> primary key!
>
> Table structure and query below. All I want is to take values from a
> smaller "accesscount" table and update from it the values in the TABLE1
> table, which is a larger table.
>
> The query plan shows sequential scan of both the tables. Why is this and
> how can I work around it?
>
> Thanks!
>
>
> *
> Table "public.TABLE1"*
>
>
>   Column    |            Type             |            Modifiers
> --------------+-----------------------------+---------------------------------
> alias        | character varying(35)       | not null
> som          | text                        | not null
> user_id      | character varying(30)       | not null
> modify_date  | timestamp without time zone | default now()
> volatility   | character varying(32)       |
> acount       | integer                     |
> Indexes:
> "idx_TABLE1_pkey" PRIMARY KEY, btree (alias)
> "idx_TABLE1_userid" btree (user_id) CLUSTER
>
>
>
>
> *Table "public.accesscount"
> *
>     Column    |         Type          | Modifiers
> --------------+-----------------------+-----------
> alias        | character varying(35) | not null
> acount       | integer               |
> Indexes:
> "idx_9" PRIMARY KEY, btree (alias)
>
>
>
>
> *=# explain
> *update TABLE1
>     set acount = v.acount
> from accesscount v
> where TABLE1.alias = v.alias
> ;
> *
> *
> *                                       QUERY PLAN
> *------------------------------------------------------------------------------------------
> Update  (cost=22985.69..1088981.66 rows=613453 width=173)
>    ->  Hash Join  (cost=22985.69..1088981.66 rows=613453 width=173)
>          Hash Cond: ((TABLE1.alias)::text = (v.alias)::text)
>          ->  Seq Scan on TABLE1  (cost=0.00..410625.10 rows=12029410
> width=159)
>          ->  Hash  (cost=11722.53..11722.53 rows=613453 width=21)
>                ->  Seq Scan on accesscount v  (cost=0.00..11722.53
> rows=613453 width=21)
> (6 rows)
>
> Time: 0.848 ms
>
>

Looks to me like it loaded the entire accesscount table into an in
memory hash, then it scanned table1 to update each row.  Because
accessCount is small, it was faster to read all of it at once.  If the
table grows, at some point (and with the help of work_mem?), I assume PG
will switch to looking up rows, which is gonna be slower.

-Andy