Thread: CPU Intensive query

CPU Intensive query

From
Abu Mushayeed
Date:
I have an interesting problem. I have the following query that ran ok on Monday and Tuesday and it has been running ok since I have been at this job. I have seen it to be IO intensive, but since Wednesday it has become CPU intensive. Database wise fresh data has been put into the tables, vacuumed & analyzed, no other parameter has been modified.
 
Wednesday it ran over 24 hours and it did not finish and all this time it pegged a CPU between 95-99%. Yesterday the same story. I do not understand what could have caused it to behave like this suddenly. I am hoping somebody can point me to do research in the right direction.
 
The query is as follows and it's explain plan is also attached:
 
set enable_nestloop = off;
INSERT INTO linkshare.macys_ls_daily_shipped
SELECT
 ddw.intr_xref,
 cdm.cdm_get_linkshare_id_safe(ddw.intr_xref, 10),
 to_char(cdm.cdm_utc_convert(cdm.cdm_get_linkshare_timestamp(ddw.intr_xref, 10), -5), 'YYYY-MM-DD/HH24:MI:SS'),
 to_char(cdm.cdm_utc_convert(to_char(sales.order_date, 'YYYY-MM-DD HH24:MI:SS')::timestamp without time zone, -5), 'YYYY-MM-DD/HH24:MI:SS') ,
 ddw.item_upc,
 sum(abs(ddw.itm_qty)),
 sum((ddw.tran_itm_total * 100::numeric)::integer),
 'USD', '', '', '',
 ddw.item_desc
FROM
 cdm.cdm_ddw_tran_item_grouped ddw
JOIN
 cdm.cdm_sitesales sales ON ddw.intr_xref::text = sales.order_number::text
WHERE
 ddw.cal_date > (CURRENT_DATE - 7) AND ddw.cal_date < CURRENT_DATE
AND
 ddw.intr_xref IS NOT NULL
AND  trim(cdm.cdm_get_linkshare_id_safe(ddw.intr_xref, 10)) <> ''
AND  cdm.cdm_utc_convert(cdm.cdm_get_linkshare_timestamp(ddw.intr_xref, 10), -5)::text::date >= (CURRENT_DATE - 52)
AND  sales.order_date >= (CURRENT_DATE - 52)
AND  (tran_typ_id = 'S'::bpchar)
AND  btrim(item_group::text) <> 'EGC'::text
AND  btrim(item_group::text) <> 'VGC'::text
GROUP BY
 ddw.intr_xref,
 cdm.cdm_get_linkshare_id_safe(ddw.intr_xref, 10),
 to_char(cdm.cdm_utc_convert(cdm.cdm_get_linkshare_timestamp(ddw.intr_xref, 10), -5), 'YYYY-MM-DD/HH24:MI:SS'),
 to_char(cdm.cdm_utc_convert(to_char(sales.order_date, 'YYYY-MM-DD HH24:MI:SS')::timestamp without time zone, -5), 'YYYY-MM-DD/HH24:MI:SS'),
 ddw.item_upc,
 8, 9, 10, 11,
 ddw.item_desc;
 
 
HashAggregate  (cost=152555.97..152567.32 rows=267 width=162)
  ->  Hash Join  (cost=139308.18..152547.96 rows=267 width=162)
        Hash Cond: (("outer".intr_xref)::text = ("inner".order_number)::text)
        ->  GroupAggregate  (cost=106793.14..109222.13 rows=4319 width=189)
              ->  Sort  (cost=106793.14..106901.09 rows=43182 width=189)
                    Sort Key: cdm_ddw_tran_item.appl_xref, cdm_ddw_tran_item.intr_xref, cdm_ddw_tran_item.tran_typ_id, cdm_ddw_tran_item.cal_date, cdm_ddw_tran_item.cal_time, cdm_ddw_tran_item.tran_itm_total, cdm_ddw_tran_item.tran_tot_amt, cdm_ddw_tran_item.fill_store_div, cdm_ddw_tran_item.itm_price, cdm_ddw_tran_item.item_id, cdm_ddw_tran_item.item_upc, cdm_ddw_tran_item.item_pid, cdm_ddw_tran_item.item_desc, cdm_ddw_tran_item.nrf_color_name, cdm_ddw_tran_item.nrf_size_name, cdm_ddw_tran_item.dept_id, c
                    ->  Index Scan using cdm_ddw_tranp_item_cal_date on cdm_ddw_tran_item  (cost=0.01..103468.52 rows=43182 width=189)
                          Index Cond: ((cal_date > (('now'::text)::date - 7)) AND (cal_date < ('now'::text)::date))
                          Filter: ((intr_xref IS NOT NULL) AND (btrim(cdm.cdm_get_linkshare_id_safe(intr_xref, 10)) <> ''::text) AND (((cdm.cdm_utc_convert(cdm.cdm_get_linkshare_timestamp(intr_xref, 10), -5))::text)::date >= (('now'::text)::date - 52)) AND (tran_typ_id = 'S'::bpchar) AND (btrim((item_group)::text) <> 'EGC'::text) AND (btrim((item_group)::text) <> 'VGC'::text))
        ->  Hash  (cost=31409.92..31409.92 rows=442050 width=20)
              ->  Index Scan using cdm_sitesales_order_date on cdm_sitesales sales  (cost=0.00..31409.92 rows=442050 width=20)
                    Index Cond: (order_date >= (('now'::text)::date - 52))
 


Bored stiff? Loosen up...
Download and play hundreds of games for free on Yahoo! Games.

Re: CPU Intensive query

From
"Steinar H. Gunderson"
Date:
On Fri, May 18, 2007 at 09:02:52AM -0700, Abu Mushayeed wrote:
> I have an interesting problem. I have the following query that ran ok on
> Monday and Tuesday and it has been running ok since I have been at this
> job. I have seen it to be IO intensive, but since Wednesday it has become
> CPU intensive. Database wise fresh data has been put into the tables,
> vacuumed & analyzed, no other parameter has been modified.

What Postgres version is this?

>   The query is as follows and it's explain plan is also attached:

Normally EXPLAIN ANALYZE data would be much better than EXPLAIN, but if the
query indeed does not finish, it's not going to help much.

>   set enable_nestloop = off;

What's the rationale for this?

>   HashAggregate  (cost=152555.97..152567.32 rows=267 width=162)

152000 disk page fetches is a bit, but it shouldn't take 24 hours. There's
probably misestimation involved at some point here. Does it really return 267
rows, or many more?

/* Steinar */
--
Homepage: http://www.sesse.net/

Re: CPU Intensive query

From
Abu Mushayeed
Date:
What Postgres version is this?
 
8.1.3

> set enable_nestloop = off;

What's the rationale for this?
To eliminate nested loop. It does a nested loop betwwen to very large table(millions of rows).
 
> HashAggregate (cost=152555.97..152567.32 rows=267 width=162)

152000 disk page fetches is a bit, but it shouldn't take 24 hours. There's
probably misestimation involved at some point here. Does it really return 267
rows, or many more?
It returns finally about 19-20 thousand rows.


"Steinar H. Gunderson" <sgunderson@bigfoot.com> wrote:
On Fri, May 18, 2007 at 09:02:52AM -0700, Abu Mushayeed wrote:
> I have an interesting problem. I have the following query that ran ok on
> Monday and Tuesday and it has been running ok since I have been at this
> job. I have seen it to be IO intensive, but since Wednesday it has become
> CPU intensive. Database wise fresh data has been put into the tables,
> vacuumed & analyzed, no other parameter has been modified.

What Postgres version is this?

> The query is as follows and it's explain plan is also attached:

Normally EXPLAIN ANALYZE data would be much better than EXPLAIN, but if the
query indeed does not finish, it's not going to help much.

> set enable_nestloop = off;

What's the rationale for this?

> HashAggregate (cost=152555.97..152567.32 rows=267 width=162)

152000 disk page fetches is a bit, but it shouldn't take 24 hours. There's
probably misestimation involved at some point here. Does it really return 267
rows, or many more?

/* Steinar */
--
Homepage: http://www.sesse.net/

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster


8:00? 8:25? 8:40? Find a flick in no time
with theYahoo! Search movie showtime shortcut.

Re: CPU Intensive query

From
Tom Lane
Date:
Abu Mushayeed <abumushayeed@yahoo.com> writes:
>   The query is as follows and it's explain plan is also attached:

The query itself seems to be a simple join over not too many rows, so
I don't see how it could be taking 24 hours.  What I suspect is you're
incurring lots and lots of invocations of those user-written functions
and one of them has suddenly decided to get very slow.

            regards, tom lane

Re: CPU Intensive query

From
Abu Mushayeed
Date:
> one of them has suddenly decided to get very slow
 
Is there a way to predict when the system will do this? Also, why would it suddenly go from IO intensive to CPU intensive.
 
Also, this query ran today and it already finished. Today it was IO intensive.
 
Please provide me some direction to this problem.
 
Thanks.

Tom Lane <tgl@sss.pgh.pa.us> wrote:
Abu Mushayeed writes:
> The query is as follows and it's explain plan is also attached:

The query itself seems to be a simple join over not too many rows, so
I don't see how it could be taking 24 hours. What I suspect is you're
incurring lots and lots of invocations of those user-written functions
and one of them has suddenly decided to get very slow.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Now that's room service! Choose from over 150,000 hotels
in 45,000 destinations on Yahoo! Travel
to find your fit.

Re: CPU Intensive query

From
"Steinar H. Gunderson"
Date:
On Fri, May 18, 2007 at 02:37:27PM -0700, Abu Mushayeed wrote:
>>> set enable_nestloop = off;
>> What's the rationale for this?
> To eliminate nested loop. It does a nested loop betwwen to very large
> table(millions of rows).

If the planner chooses a nested loop, it is because it believes it is the
most efficient solution. I'd turn it back on and try to figure out why the
planner was wrong. Note that a nested loop with an index scan on one or both
sides can easily be as efficient as anything.

Did you ANALYZE your tables recently? If the joins are really between
millions of rows and the planner thinks it's a couple thousands, the stats
sound rather off...

>>> HashAggregate (cost=152555.97..152567.32 rows=267 width=162)
>> 152000 disk page fetches is a bit, but it shouldn't take 24 hours. There's
>> probably misestimation involved at some point here. Does it really return 267
>> rows, or many more?
> It returns finally about 19-20 thousand rows.

So the planner is off by a factor of at least a hundred. That's a good
first-level explanation for why it's slow, at least...

If you can, please provide EXPLAIN ANALYZE output for your query (after
running ANALYZE on all your tables, if you haven't already); even though
it will take some time, it usually makes this kind of performance debugging
much easier.

/* Steinar */
--
Homepage: http://www.sesse.net/

Re: CPU Intensive query

From
"Steinar H. Gunderson"
Date:
On Sat, May 19, 2007 at 12:32:33AM +0200, Steinar H. Gunderson wrote:
> Did you ANALYZE your tables recently? If the joins are really between
> millions of rows and the planner thinks it's a couple thousands, the stats
> sound rather off...

Sorry, I forgot your first e-mail where you said you had both vacuumed and
analyzed recently. The estimates are still off, though -- the WHERE query
might be difficult to estimate properly. (I'm not sure how Tom arrived on
his conclusion of expensive user-defined functions, but given the usual
precisions of his guesses, I'd check that too...)

/* Steinar */
--
Homepage: http://www.sesse.net/

Re: CPU Intensive query

From
Andrew Sullivan
Date:
On Fri, May 18, 2007 at 03:26:08PM -0700, Abu Mushayeed wrote:

>   Also, this query ran today and it already finished. Today it was
>   IO intensive.

Are you entirely sure that it's not a coincidence, and something
_else_ in the system is causing the CPU issues?

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
The plural of anecdote is not data.
        --Roger Brinner