Re: Query Performance Problem - Mailing list pgsql-performance

From Felipe Santos
Subject Re: Query Performance Problem
Date
Msg-id CAPYcRiUp5Epwy7MwU_7COCKqtWcfcPSmw7KHPPhRScSG=tutTA@mail.gmail.com
Whole thread Raw
In response to Query Performance Problem  (<john@jpm-cola.com>)
List pgsql-performance


2014-10-21 10:57 GMT-02:00 <john@jpm-cola.com>:


Hi all,

I'm experimenting with table partitioning though inheritance. I'm testing a query as follows:

explain (analyze, buffers)
select response.id
from claim.response
where response.account_id = 4766
and response.expire_timestamp is null
and response.create_timestamp >= DATE '2014-08-01'
order by create_timestamp;

The response table looks like this:
"account_id";"integer"
"file_type_id";"integer"
"receiver_inbound_detail_id";"integer"
"processing_status_id";"integer"
"processing";"boolean"
"expire_timestamp";"timestamp without time zone"
"last_mod_timestamp";"timestamp without time zone"
"create_timestamp";"timestamp without time zone"
"response_trace_nbr";"character varying"
"posted_timestamp";"timestamp without time zone"
"need_to_post";"boolean"
"response_message";"text"
"worked";"boolean"
"response_status_id";"integer"
"response_type_id";"integer"
"outbound_claim_detail_id";"bigint"
"id";"bigint"

Here are some rowcounts:

SELECT count(*) from claim_response.response_201408;
  count
---------
 4585746
(1 row)

Time: 7271.054 ms
SELECT count(*) from claim_response.response_201409;
  count
---------
 3523370
(1 row)

Time: 4341.116 ms
SELECT count(*) from claim_response.response_201410;
 count
-------
   154
(1 row)

Time: 0.258 ms

The entire table has 225,665,512 rows. I read that a partitioning rule of thumb is that benefits of partitioning occur starting around 100 million rows.

SELECT count(*) from claim.response;
   count
-----------
 225665512
(1 row)

Time: 685064.637 ms


The partitioning is on the create_timestamp field.

The server is Red Hat Enterprise Linux Server release 6.2 (Santiago) on a VM machine - 8 GB RAM with 2 CPUs:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                2
On-line CPU(s) list:   0,1
Thread(s) per core:    1
Core(s) per socket:    2
CPU socket(s):         1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 44
Stepping:              2
CPU MHz:               2660.000
BogoMIPS:              5320.00
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              12288K
NUMA node0 CPU(s):     0,1



2 users,  load average: 0.00, 0.12, 0.37


Please see the following for the explain analysis :


I'm trying to understand why I'm getting the yellow, orange, and red on the inclusive, and the yellow on the exclusive. (referring to the explain.depesz.com/s/I3SL page.)
I'm relatively new to PostgreSQL, but I've been an Oracle DBA for some time. I suspect the I/O may be dragging but I don't know how to dig that information out from here. Please point out anything else you can decipher from this. 

Thanks,

John


Hi John,

Dont know about the colors, but the Stats tab looks fine. You've got yourself 5 Index Scans, which are a very fast way to dig data.

 I noticed you've also cast your filter field "(create_timestamp >= '2014-08-01'::date)". As far as I know, Postgresql doesn't need this kind of explicit conversion. You would be fine with just "(create_timestamp >= '2014-08-01')".

Regards,

Felipe

pgsql-performance by date:

Previous
From:
Date:
Subject: Query Performance Problem
Next
From: Merlin Moncure
Date:
Subject: Re: Query with large number of joins