On Thursday 04 December 2003 22:13, Ivar Zarans wrote:
> On Thu, Dec 04, 2003 at 08:23:36PM +0000, Richard Huxton wrote:
> > Ah - it's probably not the update but the IN. You can rewrite it using
> > PG's non-standard FROM:
> >
> > UPDATE t1 SET status='done' FROM t_tmp WHERE t1.rec_id = t_tmp.rec_id;
>
> This was one *very useful* hint! Using this method i got my processing
> time of 24000 records down to around 3 minutes 10 seconds. Comparing
> with initial 1 hour 20 minutes and then 16 minutes, this is impressive
> improvement!
Be aware, this is specific to PG - I'm not aware of this construction working
on any other DB. Three minutes still doesn't sound brilliant, but that could
be tuning issues.
> > Now that doesn't explain why the update is taking so long. One fifth of a
> > second is extremely slow. Are you certain that the index is being used?
>
> I posted results of "EXPLAIN" in my previous message. Meanwhile i tried
> to update just one record, using "psql". Also tried out "EXPLAIN
> ANALYZE". This way i did not see any big delay - total runtime for one
> update was around 1 msec.
Yep - the explain looked fine. If you run EXPLAIN ANALYSE it will give you
timings too (actual timings will be slightly less than reported ones since PG
won't be timing/reporting).
> I am confused - has slowness of UPDATE something to do with Python and
> PyPgSQL, since "psql" seems to have no delay whatsoever? Or is this
> related to using two cursors, one for select results and other for
> update? Even if this is related to Python or cursors, how am i getting
> so big speed improvement only by using different query?
Hmm - you didn't mention cursors. If this was a problem with PyPgSQL in
general I suspect we'd know about it by now. It could however be some
cursor-related issue. In general, you're probably better off trying to do
updates/inserts as a single statement and letting PG manage things rather
than processing one row at a time.
If you've got the time, try putting together a small test-script with some
dummy data and see if it's reproducible. I'm sure the other Python users
would be interested in seeing where the problem is.
--
Richard Huxton
Archonet Ltd