Slow update - Mailing list pgsql-general

From Herouth Maoz
Subject Slow update
Date
Msg-id 4976EB1D.7080509@unicell.co.il
Whole thread Raw
Responses Re: Slow update  ("Marc Mamin" <M.Mamin@intershop.de>)
Re: Slow update  (Filip Rembiałkowski <plk.zuber@gmail.com>)
List pgsql-general
Hello.

I have a daily process that synchronizes our reports database from our production databases. In the past few days, it happened a couple of times that an update query took around 7-8 hours to complete, which seems a bit excessive. This is the query:

    UPDATE rb
    SET service = b.service,
    status = b.status,
    has_notification = gateway_id NOT IN (4,101,102),
    operator = COALESCE(
        b.actual_target_network_id,
        b.requested_target_network_id
    )
    FROM sms.billing b
    WHERE b.time_arrived >= :date_start
    AND   rb.time_stamp >= :date_start
    AND   rb.delivered = 0
    AND   rb.sms_user = b.user_id
    AND   rb.reference = b.user_reference
    AND   OVERLAY( rb.msisdn placing '972' from 1 for 1 ) = b.msisdn
    AND   NOT mo_billed
    AND   system_id <> 6  -- Exclude Corporate, as it aleady has service/status
    ;

The variable ":date_start" is set to a date 3 days ago.

I ran explain for this query and it gave me this:


--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=21567.12..854759.82 rows=1 width=210)
   Join Filter: (((rb.sms_user)::text = (b.user_id)::text) AND (rb.reference = b.user_reference))
   ->  Bitmap Heap Scan on rb  (cost=21546.02..23946.16 rows=819 width=198)
         Recheck Cond: ((delivered = 0) AND (time_stamp >= '2009-01-18 00:00:00'::timestamp without time zone) AND (time_stamp < '2009-01-21 00:00:00'::timestamp without time zone))
         Filter: ((NOT mo_billed) AND (system_id <> 6))
         ->  BitmapAnd  (cost=21546.02..21546.02 rows=819 width=0)
               ->  Bitmap Index Scan on rb_delivered_ind  (cost=0.00..1419.99 rows=45768 width=0)
                     Index Cond: (delivered = 0)
               ->  Bitmap Index Scan on rb_timestamp_ind  (cost=0.00..20125.37 rows=188994 width=0)
                     Index Cond: ((time_stamp >= '2009-01-18 00:00:00'::timestamp without time zone) AND (time_stamp < '2009-01-21 00:00:00'::timestamp without time zone))
   ->  Bitmap Heap Scan on billing b  (cost=21.10..1004.77 rows=351 width=49)
         Recheck Cond: ((b.msisdn)::text = (("substring"((rb.msisdn)::text, 1, 0) || '972'::text) || "substring"((rb.msisdn)::text, 2)))
         Filter: ((b.time_arrived >= '2009-01-18 00:00:00'::timestamp without time zone) AND (b.time_arrived < '2009-01-21 00:00:00'::timestamp without time zone))
         ->  Bitmap Index Scan on billing_msisdn_sme_reference  (cost=0.00..21.10 rows=351 width=0)
               Index Cond: ((b.msisdn)::text = (("substring"((rb.msisdn)::text, 1, 0) || '972'::text) || "substring"((rb.msisdn)::text, 2)))

I'm not an expert on reading plans, but it seems to me that it uses indices on both tables that participate in this query, so it shouldn't take such a long time.

The number of records in the table rb for the past three days is 386833. On the sms.billing table it seems to select the index on the msisdn and sme_reference fields and use it partially (only using the msisdn field). Looking at that table, the frequency of each value in the msisdn field is at most 17678 for the current data, where mostly it's a couple of thousands. How can this take so long?

Thanks,
Herouth

pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: Custom type, operators and operator class not sorting/indexing correctly
Next
From: Andreas Wenk
Date:
Subject: Re: pg_restore: [archiver] entry ID -825110830 out of range -- perhaps a corrupt TOC