Re: Slow update - Mailing list pgsql-general

From Herouth Maoz
Subject Re: Slow update
Date
Msg-id 4997E4FA.4000600@unicell.co.il
Whole thread Raw
In response to Re: Slow update  (Alban Hertroys <dalroi@solfertje.student.utwente.nl>)
List pgsql-general
Alban Hertroys wrote:
> On Feb 9, 2009, at 2:07 PM, Grzegorz Jaśkiewicz wrote:
>
>> On Mon, Feb 9, 2009 at 12:50 PM, Herouth Maoz <herouth@unicell.co.il>
>> wrote:
>>> I hope someone can clue me in based on the results of explain analyze.
>>
>> Did you have a chance to run vmstat on it, and post it here ? Maybe -
>> if db resides on the same disc with everything else, something
>> (ab)uses that much io, and it has to wait.
>> Also, I don't know - but personaly I didn't like the line in explain:
>>
>> ->  Bitmap Index Scan on billing_msisdn_sme_reference
>> (cost=0.00..24.70 rows=389 width=0) (actual time=2
>> 1.418..21.418 rows=252 loops=151332)
>>        Index Cond: ((b.msisdn)::text =
>> (("substring"((rb.msisdn)::text, 1, 0) || '972'::text) || "substrin
>> g"((rb.msisdn)::text, 2)))
>>
>> But the cost is next to none, so that's not it.
>
>
> Actually, it's inside a nested loop and if I read correctly it gets
> looped over 151332 times. That means it takes 151332 * (21.418 -
> 1.418) = 3026640 ms, which is almost 12% of the total time.
>
> The biggie seems to be the bitmap heap scan on rb though. The row
> estimates for that one are way off (estimated 549 rows vs actual 151332).
>
> Alban Hertroys
>
To be quite honest, I'm not sure exactly how to read this plan, and what
the row values mean. The issue here is that sometimes the query works in
reasonable time, and sometimes it takes half a day. Of course, this may
be because the data size is different, but I don't know which part of
the plan tells me that.

What do rows vs. loops signify? How can the estimate be so far off if
I'm running analyze on all the tables right after I make any big updates
to them?

I find it hard to believe that the problem is with the complex
comparison caused by the different formats of the fields in rb and in
billing. This should add a constant multiplier to the time it takes to
run the query, but not cause the query to run one time in 5 minutes, and
the next day in 12 hours!

Thanks for the assistance.

Herouth

Here is the plan collected tonight, which took a reasonable amount of
time (5 minutes). It seems to be a totally different plan, isn't it?:


QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=1157750.08..1167132.31 rows=1 width=210) (actual
time=238247.983..239980.264 rows=111676 loops=1)
   Merge Cond: ((rb.reference = b.user_reference) AND
((rb.sms_user)::text = (b.user_id)::text) AND
(((("substring"((rb.msisdn)::text, 1, 0) || '972'::text) ||
"substring"((rb.msisdn)::text, 2))) = (b.msisdn)::text))
   ->  Sort  (cost=31137.76..31141.31 rows=1423 width=198) (actual
time=117858.431..117932.544 rows=111676 loops=1)
         Sort Key: rb.reference, rb.sms_user,
((("substring"((rb.msisdn)::text, 1, 0) || '972'::text) ||
"substring"((rb.msisdn)::text, 2)))
         Sort Method:  quicksort  Memory: 30941kB
         ->  Bitmap Heap Scan on rb  (cost=26968.95..31063.23 rows=1423
width=198) (actual time=113615.187..116935.502 rows=111676 loops=1)
               Recheck Cond: ((delivered = 0) AND (time_stamp >=
'2009-02-12 00:00:00'::timestamp without time zone))
               Filter: ((NOT mo_billed) AND (system_id <> 6))
               ->  BitmapAnd  (cost=26968.95..26968.95 rows=1423
width=0) (actual time=113454.761..113454.761 rows=0 loops=1)
                     ->  Bitmap Index Scan on rb_delivered_ind
(cost=0.00..2522.46 rows=69896 width=0) (actual time=9358.397..9358.397
rows=150651 loops=1)
                           Index Cond: (delivered = 0)
                     ->  Bitmap Index Scan on rb_timestamp_ind
(cost=0.00..24445.53 rows=213475 width=0) (actual
time=104091.620..104091.620 rows=303308 loops=1)
                           Index Cond: (time_stamp >= '2009-02-12
00:00:00'::timestamp without time zone)
   ->  Sort  (cost=1117952.26..1120779.49 rows=1130889 width=50) (actual
time=119485.709..120263.045 rows=756135 loops=1)
         Sort Key: b.user_reference, b.user_id, b.msisdn
         Sort Method:  external sort  Disk: 60976kB
         ->  Bitmap Heap Scan on billing b  (cost=36754.98..1004246.88
rows=1130889 width=50) (actual time=24409.448..101034.765 rows=896474
loops=1)
               Recheck Cond: (time_arrived >= '2009-02-12
00:00:00'::timestamp without time zone)
               ->  Bitmap Index Scan on billing_time_arrived
(cost=0.00..36472.26 rows=1130889 width=0) (actual
time=23936.245..23936.245 rows=1166881 loops=1)
                     Index Cond: (time_arrived >= '2009-02-12
00:00:00'::timestamp without time zone)
 Total runtime: 307958.152 ms
(21 rows)





pgsql-general by date:

Previous
From: Sim Zacks
Date:
Subject: Re: password for postgres
Next
From: Gerhard Heift
Date:
Subject: Re: Array in nested query