Thread: Inconsistent query performance

Inconsistent query performance

From
Ramsey Gurley
Date:
Hi all,

I'm having issues with slow queries using postgres, and I'm finding some of the issues difficult to reproduce. My application logs slow queries for me, but often, when I go back to run explain analyze on the query it is very fast. I assume this is due to some sort of caching. Here is an example:

sqlLogging _valuesForSQL ran 116509ms with sql string:

select t1.transaction_id, t1.campaign_definition_id, t1.campaign_detail_number, t2.name as campaign_name,
       t1.communication_type, t1.delivery_error, t1.call_result
into temporary activity_transactions_temp
from customer.customer_transactions t1, campaign.campaign_definitions t2
where t1.customer_id = 2577
  and t1.timestamp >= '04/08/2013' and t1.timestamp < '04/09/2013'
  and t1.campaign_definition_id = 23
  and t1.campaign_definition_id = t2.campaign_definition_id
order by campaign_name, communication_type;

But if I run an explain analyze on that I get

"Sort  (cost=18.08..18.08 rows=1 width=89) (actual time=767.051..767.061 rows=17 loops=1)"
"  Sort Key: t2.name, t1.communication_type"
"  Sort Method:  quicksort  Memory: 26kB"
"  ->  Nested Loop  (cost=0.00..18.07 rows=1 width=89) (actual time=758.421..766.881 rows=17 loops=1)"
"        ->  Index Scan using timestamp_only_ndx on customer_transactions t1  (cost=0.00..9.79 rows=1 width=69) (actual time=79.771..88.119 rows=17 loops=1)"
"              Index Cond: (("timestamp" >= '2013-04-08 00:00:00'::timestamp without time zone) AND ("timestamp" < '2013-04-09 00:00:00'::timestamp without time zone))"
"              Filter: ((customer_id = 2577) AND (campaign_definition_id = 23))"
"        ->  Index Scan using campaign_definitions_campaign_definition_id_key on campaign_definitions t2  (cost=0.00..8.27 rows=1 width=24) (actual time=39.922..39.923 rows=1 loops=17)"
"              Index Cond: (t2.campaign_definition_id = 23)"
"Total runtime: 770.830 ms"

Is there anything that can tell me why this query took 117 seconds, and what, if anything, can be done to fix it?

Here's another, much simpler one:

expression took 20147 ms:

SELECT t0.appointment_id, t0.customer_id, t0.event_date, t0.patient_id, t0.transaction_id FROM customer.customer_transactions_detail t0 WHERE (t0.patient_id = 7441 AND t0.customer_id = 2965)

"Index Scan using customer_id_patient_id_idx on customer_transactions_detail t0  (cost=0.00..10.22 rows=1 width=24) (actual time=35.952..99.487 rows=14 loops=1)"
"  Index Cond: ((customer_id = 2965) AND (patient_id = 7441))"
"Total runtime: 99.537 ms"

So it took 20 seconds at 12:18pm today, but now it takes ~100ms.

Thanks,

Ramsey

Re: Inconsistent query performance

From
Ramsey Gurley
Date:
I forgot to mention: I'm currently using postgres 8.3.x.

On Apr 8, 2013, at 3:08 PM, Ramsey Gurley wrote:

> Hi all,
>
> I'm having issues with slow queries using postgres, and I'm finding some of the issues difficult to reproduce. My application logs slow queries for me, but often, when I go back to run explain analyze on the query it is very fast. I assume this is due to some sort of caching. Here is an example:
>
> sqlLogging _valuesForSQL ran 116509ms with sql string:
>
> select t1.transaction_id, t1.campaign_definition_id, t1.campaign_detail_number, t2.name as campaign_name,
>        t1.communication_type, t1.delivery_error, t1.call_result
> into temporary activity_transactions_temp
> from customer.customer_transactions t1, campaign.campaign_definitions t2
> where t1.customer_id = 2577
>   and t1.timestamp >= '04/08/2013' and t1.timestamp < '04/09/2013'
>   and t1.campaign_definition_id = 23
>   and t1.campaign_definition_id = t2.campaign_definition_id
> order by campaign_name, communication_type;
>
> But if I run an explain analyze on that I get
>
> "Sort  (cost=18.08..18.08 rows=1 width=89) (actual time=767.051..767.061 rows=17 loops=1)"
> "  Sort Key: t2.name, t1.communication_type"
> "  Sort Method:  quicksort  Memory: 26kB"
> "  ->  Nested Loop  (cost=0.00..18.07 rows=1 width=89) (actual time=758.421..766.881 rows=17 loops=1)"
> "        ->  Index Scan using timestamp_only_ndx on customer_transactions t1  (cost=0.00..9.79 rows=1 width=69) (actual time=79.771..88.119 rows=17 loops=1)"
> "              Index Cond: (("timestamp" >= '2013-04-08 00:00:00'::timestamp without time zone) AND ("timestamp" < '2013-04-09 00:00:00'::timestamp without time zone))"
> "              Filter: ((customer_id = 2577) AND (campaign_definition_id = 23))"
> "        ->  Index Scan using campaign_definitions_campaign_definition_id_key on campaign_definitions t2  (cost=0.00..8.27 rows=1 width=24) (actual time=39.922..39.923 rows=1 loops=17)"
> "              Index Cond: (t2.campaign_definition_id = 23)"
> "Total runtime: 770.830 ms"
>
> Is there anything that can tell me why this query took 117 seconds, and what, if anything, can be done to fix it?
>
> Here's another, much simpler one:
>
> expression took 20147 ms:
>
> SELECT t0.appointment_id, t0.customer_id, t0.event_date, t0.patient_id, t0.transaction_id FROM customer.customer_transactions_detail t0 WHERE (t0.patient_id = 7441 AND t0.customer_id = 2965)
>
> "Index Scan using customer_id_patient_id_idx on customer_transactions_detail t0  (cost=0.00..10.22 rows=1 width=24) (actual time=35.952..99.487 rows=14 loops=1)"
> "  Index Cond: ((customer_id = 2965) AND (patient_id = 7441))"
> "Total runtime: 99.537 ms"
>
> So it took 20 seconds at 12:18pm today, but now it takes ~100ms.
>
> Thanks,
>
> Ramsey



Re: Inconsistent query performance

From
Kevin Grittner
Date:
Ramsey Gurley <rgurley@smarthealth.com> wrote:

> I'm having issues with slow queries using postgres, and I'm
> finding some of the issues difficult to reproduce. My application
> logs slow queries for me, but often, when I go back to run explain
> analyze on the query it is very fast. I assume this is due to some
> sort of caching.

> expression took 20147 ms:
>
> SELECT t0.appointment_id, t0.customer_id, t0.event_date,
>        t0.patient_id, t0.transaction_id
>   FROM customer.customer_transactions_detail t0
>   WHERE (t0.patient_id = 7441 AND t0.customer_id = 2965)
>
> "Index Scan using customer_id_patient_id_idx on customer_transactions_detail t0
>    (cost=0.00..10.22 rows=1 width=24)
>    (actual time=35.952..99.487 rows=14 loops=1)"
> "  Index Cond: ((customer_id = 2965) AND (patient_id = 7441))"
> "Total runtime: 99.537 ms"
>
> So it took 20 seconds at 12:18pm today, but now it takes ~100ms.

Well, a faster run time when you run a query again is often due to
caching, but there are other things which can come into play.  There
could be blocking.  There could be a glut of disk writes at
checkpoint time which holds up all other disk I/O.
You could have had a badly bloated index when the query was run the
first time, and a VACUUM command or autovacuum cleaned things up
before your explain analyze.
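
If checkpoints are a suspect, the pg_stat_bgwriter view (present in
8.3) shows how often they happen and whether they are being forced
early.  A minimal sketch; the interesting part is how the counters
change between two samples taken some minutes apart:

```sql
-- checkpoints_timed: checkpoints triggered by checkpoint_timeout expiring
-- checkpoints_req:   checkpoints forced because checkpoint_segments filled up
SELECT checkpoints_timed, checkpoints_req, buffers_checkpoint
FROM pg_stat_bgwriter;
```

A high ratio of requested to timed checkpoints usually means
checkpoint_segments is too small for the write load.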

> I'm currently using postgres 8.3.x

That was much more vulnerable to the write glut problem than
versions which are still in support.  It's hard to even suggest
what steps to take next without knowing the OS, your hardware, or
your configuration.  Please read these two pages:

http://www.postgresql.org/support/versioning/

http://wiki.postgresql.org/wiki/SlowQueryQuestions

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Inconsistent query performance

From
Jeff Janes
Date:
On Monday, April 8, 2013, Ramsey Gurley wrote:
Hi all,

I'm having issues with slow queries using postgres, and I'm finding some of the issues difficult to reproduce. My application logs slow queries for me, but often, when I go back to run explain analyze on the query it is very fast. I assume this is due to some sort of caching. Here is an example:

sqlLogging _valuesForSQL ran 116509ms with sql string:

I think that your IO subsystem is getting congested on occasion, and that you have a "selection bias".  You are seeing in your log the occasions on which it is running slow, but not seeing the occasions it is running fast.  When you run it manually, you see a result similar to the (invisible, but frequent) times that it did not run slow. 

You should monitor with sar, or iostat, or vmstat, or similar.

Cheers,

Jeff

Re: Inconsistent query performance

From
Ramsey Gurley
Date:
On Apr 8, 2013, at 7:09 PM, Kevin Grittner wrote:

> Ramsey Gurley <rgurley@smarthealth.com> wrote:
>
>> I'm having issues with slow queries using postgres, and I'm
>> finding some of the issues difficult to reproduce. My application
>> logs slow queries for me, but often, when I go back to run explain
>> analyze on the query it is very fast. I assume this is due to some
>> sort of caching.
>
>> expression took 20147 ms:
>>
>> SELECT t0.appointment_id, t0.customer_id, t0.event_date,
>>         t0.patient_id, t0.transaction_id
>>    FROM customer.customer_transactions_detail t0
>>    WHERE (t0.patient_id = 7441 AND t0.customer_id = 2965)
>>
>> "Index Scan using customer_id_patient_id_idx on customer_transactions_detail t0
>>     (cost=0.00..10.22 rows=1 width=24)
>>     (actual time=35.952..99.487 rows=14 loops=1)"
>> "  Index Cond: ((customer_id = 2965) AND (patient_id = 7441))"
>> "Total runtime: 99.537 ms"
>>
>> So it took 20 seconds at 12:18pm today, but now it takes ~100ms.
>
> Well, a faster run time when you run a query again is often due to
> caching, but there are other things which can come into play.  There
> could be blocking.  There could be a glut of disk writes at
> checkpoint time which holds up all other disk I/O.

Is there a way to make checkpoints happen more frequently so that large ones don't cause two-minute delays?
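
(For reference, the relevant postgresql.conf knobs on 8.3 appear to be these; the values below are illustrative, not recommendations:)

```
checkpoint_segments = 16            # WAL segments between forced checkpoints
checkpoint_timeout = 5min           # maximum time between checkpoints
checkpoint_completion_target = 0.9  # spread checkpoint writes over more of the interval
```

Making checkpoints more frequent increases total write volume, so raising checkpoint_completion_target to smooth each checkpoint out is usually the first thing to try.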

> You could have had a badly bloated index when the query was run the
> first time, and a VACUUM command or autovacuum cleaned things up
> before your explain analyze.

I did run a vacuum on the entire database the day before. I don't know if I have auto-vacuuming set up.
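
(One quick way to check, assuming a psql session on that server; a sketch:)

```sql
-- 'on' means the autovacuum daemon is enabled; in 8.3 it also
-- requires track_counts = on to do anything.
SHOW autovacuum;
SELECT name, setting FROM pg_settings WHERE name LIKE 'autovacuum%';
```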

>
>> I'm currently using postgres 8.3.x
>
> That was much more vulnerable to the write glut problem than
> versions which are still in support.

I'm in the process of upgrading to 9.2.x. I'm also moving the database to a dedicated machine with more RAM available to the database. So maybe a less crufty setup in another week or two.

> It's hard to even suggest
> what steps to take next without knowing the OS, your hardware, or
> your configuration.  Please read these two pages:
>
> http://www.postgresql.org/support/versioning/
>
> http://wiki.postgresql.org/wiki/SlowQueryQuestions

Thanks for the suggestions Kevin :)

>
> --
> Kevin Grittner
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company





Re: Inconsistent query performance

From
Ramsey Gurley
Date:

On Apr 8, 2013, at 8:46 PM, Jeff Janes wrote:

On Monday, April 8, 2013, Ramsey Gurley wrote:
Hi all,

I'm having issues with slow queries using postgres, and I'm finding some of the issues difficult to reproduce. My application logs slow queries for me, but often, when I go back to run explain analyze on the query it is very fast. I assume this is due to some sort of caching. Here is an example:

sqlLogging _valuesForSQL ran 116509ms with sql string:

I think that your IO subsystem is getting congested on occasion, and that you have a "selection bias".  You are seeing in your log the occasions on which it is running slow, but not seeing the occasions it is running fast.  When you run it manually, you see a result similar to the (invisible, but frequent) times that it did not run slow. 

I think you're right about this. It seems to run fast most of the time. It's those times when it takes two minutes to respond to a customer that I'd like to minimize or eliminate.

But I think caching has a bit to do with it too. I can go back to the logs from last week and run a slow query and it will frequently be equally slow again. Not always, but enough that I think something else is wrong with my setup.


You should monitor with sar, or iostat, or vmstat, or similar.

I'll look into that. Thanks Jeff :)


Cheers,

Jeff