Re: Optimizer failure on update w/integer column - Mailing list pgsql-general

From nolan@celery.tssi.com
Subject Re: Optimizer failure on update w/integer column
Date
Msg-id 20030615234626.25325.qmail@celery.tssi.com
Whole thread Raw
In response to Re: Optimizer failure on update w/integer column  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Optimizer failure on update w/integer column  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
> That seems excessive for a 700K-row update, but I doubt that the query
> plan is the problem.  I'm betting there's a lot of per-update overhead
> due to triggers or foreign keys.  What have you got in that line?  Do
> you have indexes on both sides of any foreign-key relationships that
> missing_ids participates in?

There are no triggers or foreign key relationships.  There are unique
indexes on mtranseq on both tables, and there is just the one index on
missing_ids.  There is another index on the memtran table on the
mtranmemid and mtranseq columns.

This gets stranger and stranger.  I moved the missing_id's table over to
a faster computer so I could do some timings without it taking all night.
(That increased the size of the memtran table, but the general behavior
is similar.  Most postgres tasks on this computer run 5-10 times faster
than on the other one.)

Here's the revised plan:

explain update missing_ids
set mtransts = a.mtransts,
mtranmemtp = a.mtranmemtp
from memtran as a
where a.mtranmemid = missing_ids.mtranmemid
and a.mtranseq = missing_ids.mtranseq

 Hash Join  (cost=60271.38..88098.75 rows=1 width=48)
   Hash Cond: ("outer".mtranseq = "inner".mtranseq)
   Join Filter: ("inner".mtranmemid = "outer".mtranmemid)
   ->  Seq Scan on missing_ids  (cost=0.00..6289.91 rows=351891 width=22)
   ->  Hash  (cost=27070.30..27070.30 rows=941530 width=26)
         ->  Seq Scan on memtran a  (cost=0.00..27070.30 rows=941530 width=26)

The first time I ran it, it took 318 seconds on this machine.

That's MUCH better than on the other machine, but here's where things
get a bit weird.

If I drop the index on missing_ids completely, it runs much faster, 35
seconds the first time, 38 seconds when I ran it a second time.

I then recreated the index on missing_ids(memtranseq), the execution time
slowed down to 48 seconds the first time I reran the update, and it took
262 seconds when I ran the update again.  Subsequent passes got progressivly
slower: 371 seconds, then 764 seconds.

I dropped the index again, here are consecutive running times for the
query:

54 seconds, 45 seconds, 42 seconds, 43 seconds, 43 seconds, 45 seconds.
(I am the only user on the system this afternoon.)
--
Mike Nolan

pgsql-general by date:

Previous
From: Oleg Bartunov
Date:
Subject: Re: [HACKERS] UTF8 and KOI8 mini-howto
Next
From: Tom Lane
Date:
Subject: Re: Optimizer failure on update w/integer column