Thread: postgres performance

postgres performance

From
chidamparam muthusamy
Date:
hi,
Registered with PostgreSQL Help Forum to identify and resolve the Postgres DB performance issue, received suggestions but could not improve the speed/response time. Please help.

Details:
Postgres Version 9.3.1
Server configuration:
Processor: 2 x Intel Quad core E5620 @ 2.40GHz
RAM: 16 GB

Postgres configuration:
Effective cache size = 10 GB
shared Buffer = 1250 MB
random page cost = 4

Table size = 60 GB
Number of records = 44 million 
Carried out Vacuum Analyze after inserting new records and also after creating Index,
6 months data, every month around 10 GB will get added. Expecting good performance with 3 years data.
DB Will be used for Reporting/Read, will not be used for transaction. Daily records will be inserted through bulk insertion every day.

Table schema:
                    Table "public.detailed_report"
            Column             |            Type            | Modifiers 
-------------------------------+----------------------------+-----------
 group_id                      | character varying(50)      | not null
 client                        | character varying(50)      | 
 gateway                       | character varying(50)      | 
 call_id                       | character varying(120)     | not null
 parent_call_id                | character varying(120)     | 
 start_time                    | timestamp with time zone   | not null
 connect_time                  | timestamp with time zone   | 
 end_time                      | timestamp with time zone   | 
 duration                      | integer                    | 
 source                        | character varying(50)      | 
 source_alias                  | character varying(50)      | 
 dest_in_number                | character varying(50)      | 
 dest_out_number               | character varying(50)      | 
 bp_code_pay                   | character varying[]        | 
 billed_duration_pay           | integer[]                  | 
 rate_pay                      | character varying[]        | 
 rate_effective_date_pay       | timestamp with time zone[] | 
 type_value_pay                | character varying[]        | 
 slab_time_pay                 | character varying[]        | 
 pin_pay                       | bigint[]                   | 
 amount_pay                    | double precision[]         | 
 adjusted_pin_pay              | bigint[]                   | 
 adjusted_amount_pay           | double precision[]         | 
 call_amount_pay               | double precision           | 
 country_code_pay              | character varying[]        | 
 country_desc_pay              | character varying[]        | 
 master_country_code           | character varying(15)      | 
 master_country_desc           | character varying(100)     | 
 bp_code_recv                  | character varying[]        | 
 billed_duration_recv          | integer[]                  | 
 rate_recv                     | character varying[]        | 
 rate_effective_date_recv      | timestamp with time zone[] | 
 type_value_recv               | character varying[]        | 
 slab_time_recv                | character varying[]        | 
 pin_recv                      | bigint[]                   | 
 amount_recv                   | double precision[]         | 
 adjusted_pin_recv             | bigint[]                   | 
 adjusted_amount_recv          | double precision[]         | 
 call_amount_recv              | double precision           | 
 country_code_recv             | character varying[]        | 
 country_desc_recv             | character varying[]        | 
 subscriber_type               | character varying(50)      | 
 pdd                           | smallint                   | 
 disconnect_reason             | character varying(200)     | 
 source_ip                     | character varying(20)      | 
 dest_ip                       | character varying(20)      | 
 caller_hop                    | character varying(20)      | 
 callee_hop                    | character varying(20)      | 
 caller_received_from_hop      | character varying(20)      | 
 callee_sent_to_hop            | character varying(20)      | 
 caller_media_ip_port          | character varying(25)      | 
 callee_media_ip_port          | character varying(25)      | 
 caller_original_media_ip_port | character varying(25)      | 
 callee_original_media_ip_port | character varying(25)      | 
 switch_ip                     | character varying(20)      | 
 call_shop_amount_paid         | boolean                    | 
 version                       | character varying          | 
 call_duration_pay             | integer                    | 
 call_duration_recv            | integer                    | 
 audio_codec                   | character varying(5)       | 
 video_codec                   | character varying(5)       | 
 shadow_amount_recv            | double precision           | 
 shadow_amount_pay             | double precision           | 
 pulse_applied_recv            | character varying(50)      | 
 pulse_applied_pay             | character varying(50)      | 

Index, multi column, 3 columns, matches exactly with query where condition
"endtime_groupid_client_tsidx_detail_report" btree (end_time DESC, group_id, client), tablespace "indexspace" which exactly matches with 'where' condition,
" WHERE end_time>='2013-05-01 00:00' and end_time<'2013-07-01 00:00' and group_id='admin' and client ='CHOICE' GROUP by client, gateway;" 
Index on a separate tablespace on another hard disk.

Query:
EXPLAIN (analyze, buffers) SELECT text(client) as client, text(gateway) as gateway,count(*)::bigint as total_calls, (avg(duration)/1000.0) ::numeric(10,2) as acd, (avg(pdd)) ::numeric(10,2) as pdd, sum(call_duration_recv)/1000.0 as duration_recv, sum(call_duration_pay)/1000.0 as duration_pay, sum(call_amount_recv) as call_amount_recv, sum(call_amount_pay) as call_amount_
pay FROM detailed_report WHERE end_time>='2013-05-01 00:00' and end_time<'2013-07-01 00:00' and group_id='admin' and client ='CHOICE' GROUP by client, gateway ORDER BY call_amount_recv DESC; 
QUERY PLAN 
------------------------------------------------------
Sort (cost=3422863.06..3422868.69 rows=2254 width=44) (actual time=137852.474..137852.474 rows=5 loops=1)
Sort Key: (sum(call_amount_recv))
Sort Method: quicksort Memory: 25kB
Buffers: shared read=2491664
-> HashAggregate (cost=3422664.28..3422737.53 rows=2254 width=44) (actual time=137852.402..137852.454 rows=5 loops=1)
Buffers: shared read=2491664
-> Bitmap Heap Scan on detailed_report (cost=644828.11..3399506.87 rows=1029218 width=44) (actual time=4499.558..125443.122 rows=5248227 loops=1)
Recheck Cond: ((end_time >= '2013-05-01 00:00:00+00'::timestamp with time zone) AND (end_time < '2013-07-01 00:00:00+00'::timestamp with time zone) AND ((group_id)::text = 'adm
in'::text) AND ((client)::text = 'CHOICE'::text))
Buffers: shared read=2491664
-> Bitmap Index Scan on endtime_groupid_client_tsidx_detail_report (cost=0.00..644570.81 rows=1029218 width=0) (actual time=3418.754..3418.754 rows=5248227 loops=1)
Index Cond: ((end_time >= '2013-05-01 00:00:00+00'::timestamp with time zone) AND (end_time < '2013-07-01 00:00:00+00'::timestamp with time zone) AND ((group_id)::text = 
'admin'::text) AND ((client)::text = 'CHOICE'::text))
Buffers: shared read=95055
Total runtime: 137868.946 ms
(13 rows)

Checked by removing ORDER BY, but no improvement.

By increasing random_page_cost to 25, the query gets executed sequentially,Seq Scan on detailed_report, time taken is comparatively better than Indexed scan. But I am not preferring because when the data grows the sequential scan performance will come down.

carried out 3 more set of tests:
1. Index on columns
2. multiple column index, with 2 columns
3. multiple column index, with three columns

Test Case 1:
************
indexes :
1)client
2)group_id
3)gateway
4)end_time

"det_rep_pkey" PRIMARY KEY, btree (group_id, call_id, start_time)
"client_detailed_report_idx" btree (client), tablespace "indexspace"
"end_time_detailed_report_idx" btree (end_time), tablespace "indexspace"
"gateway_detailed_report_idx" btree (gateway), tablespace "indexspace"
"group_id_detailed_report_idx" btree (group_id), tablespace "indexspace"

testdb=# EXPLAIN (analyze,buffers,verbose)SELECT text(client) as client, text(gateway) as gateway,count(*)::bigint as total_calls, (avg(duration)/1000.0) ::numeric(10,2) as acd, (avg(pdd)) ::numeric(10,2) as pdd, sum(call_duration_recv)/1000.0 as duration_recv, sum(call_duration_pay)/1000.0 as duration_pay, sum(call_amount_recv) as call_amount_recv, sum(call_amount_pay) as call_amount_pay FROM detailed_report WHERE end_time>='2013-05-01 00:00' and end_time<'2013-07-01 00:00' and group_id='admin' and client ='CHOICE' GROUP by client, gateway ORDER BY call_amount_recv DESC; 
QUERY PLAN 

Sort (cost=3510106.93..3510112.25 rows=2127 width=44) (actual time=148557.599..148557.599 rows=5 loops=1)
Output: ((client)::text), ((gateway)::text), (count(*)), (((avg(duration) / 1000.0))::numeric(10,2)), ((avg(pdd))::numeric(10,2)), (((sum(call_duration_recv))::numeric / 1000.0)), (((sum(c
all_duration_pay))::numeric / 1000.0)), (sum(call_amount_recv)), (sum(call_amount_pay)), client, gateway
Sort Key: (sum(detailed_report.call_amount_recv))
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=69 read=2505035
-> HashAggregate (cost=3509920.24..3509989.37 rows=2127 width=44) (actual time=148557.556..148557.581 rows=5 loops=1)
Output: (client)::text, (gateway)::text, count(*), ((avg(duration) / 1000.0))::numeric(10,2), (avg(pdd))::numeric(10,2), ((sum(call_duration_recv))::numeric / 1000.0), ((sum(call_dur
ation_pay))::numeric / 1000.0), sum(call_amount_recv), sum(call_amount_pay), client, gateway
Buffers: shared hit=69 read=2505035
-> Bitmap Heap Scan on public.detailed_report (cost=832774.93..3487872.62 rows=979894 width=44) (actual time=14257.148..135355.676 rows=5248227 loops=1)
Output: group_id, client, gateway, call_id, parent_call_id, start_time, connect_time, end_time, duration, source, source_alias, dest_in_number, dest_out_number, bp_code_pay, bi
lled_duration_pay, rate_pay, rate_effective_date_pay, type_value_pay, slab_time_pay, pin_pay, amount_pay, adjusted_pin_pay, adjusted_amount_pay, call_amount_pay, country_code_pay, country_des
c_pay, master_country_code, master_country_desc, bp_code_recv, billed_duration_recv, rate_recv, rate_effective_date_recv, type_value_recv, slab_time_recv, pin_recv, amount_recv, adjusted_pin_
recv, adjusted_amount_recv, call_amount_recv, country_code_recv, country_desc_recv, subscriber_type, pdd, disconnect_reason, source_ip, dest_ip, caller_hop, callee_hop, caller_received_from_h
op, callee_sent_to_hop, caller_media_ip_port, callee_media_ip_port, caller_original_media_ip_port, callee_original_media_ip_port, switch_ip, call_shop_amount_paid, version, call_duration_pay,
call_duration_recv, audio_codec, video_codec, shadow_amount_recv, shadow_amount_pay, pulse_applied_recv, pulse_applied_pay
Recheck Cond: (((detailed_report.client)::text = 'CHOICE'::text) AND ((detailed_report.group_id)::text = 'admin'::text) AND (detailed_report.end_time >= '2013-05-01 00:00:00+00
'::timestamp with time zone) AND (detailed_report.end_time < '2013-07-01 00:00:00+00'::timestamp with time zone))
Buffers: shared hit=69 read=2505035
-> BitmapAnd (cost=832774.93..832774.93 rows=979894 width=0) (actual time=13007.643..13007.643 rows=0 loops=1)
Buffers: shared read=108495
-> Bitmap Index Scan on client_detailed_report_idx (cost=0.00..172876.66 rows=7862413 width=0) (actual time=2546.204..2546.204 rows=7840766 loops=1)
Index Cond: ((detailed_report.client)::text = 'CHOICE'::text)
Buffers: shared read=21427
-> Bitmap Index Scan on group_id_detailed_report_idx (cost=0.00..307105.20 rows=14971818 width=0) (actual time=4265.728..4265.728 rows=14945965 loops=1)
Index Cond: ((detailed_report.group_id)::text = 'admin'::text)
Buffers: shared read=40840
-> Bitmap Index Scan on end_time_detailed_report_idx (cost=0.00..352057.65 rows=16790108 width=0) (actual time=3489.106..3489.106 rows=16917795 loops=1)
Index Cond: ((detailed_report.end_time >= '2013-05-01 00:00:00+00'::timestamp with time zone) AND (detailed_report.end_time < '2013-07-01 00:00:00+00'::timestamp wi
th time zone))
Buffers: shared read=46228
Total runtime: 148558.070 ms
(24 rows)



Test Case 2:
************
Indexes :
1)client
2)group_id
3)gateway
4)end_time
5)client,group_id

"det_rep_pkey" PRIMARY KEY, btree (group_id, call_id, start_time)
"client_detailed_report_idx" btree (client), tablespace "indexspace"
"clientgroupid_detailed_report_idx" btree (client, group_id), tablespace "indexspace"
"end_time_detailed_report_idx" btree (end_time), tablespace "indexspace"
"gateway_detailed_report_idx" btree (gateway), tablespace "indexspace"
"group_id_detailed_report_idx" btree (group_id), tablespace "indexspace"


testdb=# EXPLAIN (analyze,buffers,verbose)SELECT text(client) as client, text(gateway) as gateway,count(*)::bigint as total_calls, (avg(duration)/1000.0) ::numeric(10,2) as acd, (avg(pdd)) ::numeric(10,2) as pdd, sum(call_duration_recv)/1000.0 as duration_recv, sum(call_duration_pay)/1000.0 as duration_pay, sum(call_amount_recv) as call_amount_recv, sum(call_amount_pay) as call_amount_pay FROM detailed_report WHERE end_time>='2013-05-01 00:00' and end_time<'2013-07-01 00:00' and group_id='admin' and client ='CHOICE' GROUP by client, gateway ORDER BY call_amount_recv DESC;

QUERY PLAN 
Sort (cost=3172381.37..3172387.11 rows=2297 width=44) (actual time=132725.901..132725.901 rows=5 loops=1)
Output: ((client)::text), ((gateway)::text), (count(*)), (((avg(duration) / 1000.0))::numeric(10,2)), ((avg(pdd))::numeric(10,2)), (((sum(call_duration_recv))::numeric / 1000.0)), (((sum(c
all_duration_pay))::numeric / 1000.0)), (sum(call_amount_recv)), (sum(call_amount_pay)), client, gateway
Sort Key: (sum(detailed_report.call_amount_recv))
Sort Method: quicksort Memory: 25kB
Buffers: shared read=2472883
-> HashAggregate (cost=3172178.48..3172253.13 rows=2297 width=44) (actual time=132725.861..132725.881 rows=5 loops=1)
Output: (client)::text, (gateway)::text, count(*), ((avg(duration) / 1000.0))::numeric(10,2), (avg(pdd))::numeric(10,2), ((sum(call_duration_recv))::numeric / 1000.0), ((sum(call_dur
ation_pay))::numeric / 1000.0), sum(call_amount_recv), sum(call_amount_pay), client, gateway
Buffers: shared read=2472883
-> Bitmap Heap Scan on public.detailed_report (cost=434121.21..3149462.57 rows=1009596 width=44) (actual time=8257.581..120311.450 rows=5248227 loops=1)
Output: group_id, client, gateway, call_id, parent_call_id, start_time, connect_time, end_time, duration, source, source_alias, dest_in_number, dest_out_number, bp_code_pay, bi
lled_duration_pay, rate_pay, rate_effective_date_pay, type_value_pay, slab_time_pay, pin_pay, amount_pay, adjusted_pin_pay, adjusted_amount_pay, call_amount_pay, country_code_pay, country_des
c_pay, master_country_code, master_country_desc, bp_code_recv, billed_duration_recv, rate_recv, rate_effective_date_recv, type_value_recv, slab_time_recv, pin_recv, amount_recv, adjusted_pin_
recv, adjusted_amount_recv, call_amount_recv, country_code_recv, country_desc_recv, subscriber_type, pdd, disconnect_reason, source_ip, dest_ip, caller_hop, callee_hop, caller_received_from_h
op, callee_sent_to_hop, caller_media_ip_port, callee_media_ip_port, caller_original_media_ip_port, callee_original_media_ip_port, switch_ip, call_shop_amount_paid, version, call_duration_pay,
call_duration_recv, audio_codec, video_codec, shadow_amount_recv, shadow_amount_pay, pulse_applied_recv, pulse_applied_pay
Recheck Cond: (((detailed_report.client)::text = 'CHOICE'::text) AND ((detailed_report.group_id)::text = 'admin'::text) AND (detailed_report.end_time >= '2013-05-01 00:00:00+00
'::timestamp with time zone) AND (detailed_report.end_time < '2013-07-01 00:00:00+00'::timestamp with time zone))
Buffers: shared read=2472883
-> BitmapAnd (cost=434121.21..434121.21 rows=1009596 width=0) (actual time=7101.419..7101.419 rows=0 loops=1)
Buffers: shared read=76274
-> Bitmap Index Scan on clientgroupid_detailed_report_idx (cost=0.00..74766.52 rows=2649396 width=0) (actual time=3066.346..3066.346 rows=7840766 loops=1)
Index Cond: (((detailed_report.client)::text = 'CHOICE'::text) AND ((detailed_report.group_id)::text = 'admin'::text))
Buffers: shared read=30046
-> Bitmap Index Scan on end_time_detailed_report_idx (cost=0.00..358849.64 rows=17114107 width=0) (actual time=2969.577..2969.577 rows=16917795 loops=1)
Index Cond: ((detailed_report.end_time >= '2013-05-01 00:00:00+00'::timestamp with time zone) AND (detailed_report.end_time < '2013-07-01 00:00:00+00'::timestamp wi
th time zone))
Buffers: shared read=46228
Total runtime: 132726.073 ms
(21 rows)



Test Case 3:
************
Indexes:
Index :
1)client
2)group_id
3)gateway
4)end_time
5)client,group_id
6)client,group_id,end_time

"det_rep_pkey" PRIMARY KEY, btree (group_id, call_id, start_time)
"client_detailed_report_idx" btree (client), tablespace "indexspace"
"clientgroupid_detailed_report_idx" btree (client, group_id), tablespace "indexspace"
"clientgroupidendtime_detailed_report_idx" btree (client, group_id, end_time), tablespace "indexspace"
"end_time_detailed_report_idx" btree (end_time), tablespace "indexspace"
"gateway_detailed_report_idx" btree (gateway), tablespace "indexspace"
"group_id_detailed_report_idx" btree (group_id), tablespace "indexspace"


testdb=# EXPLAIN (analyze, verbose) SELECT text(client) as client, text(gateway) as gateway,count(*)::bigint as total_calls, (avg(duration)/1000.0) ::numeric(10,2) as acd, (avg(pdd)) ::numeric(10,2) as pdd, sum(call_duration_recv)/1000.0 as duration_recv, sum(call_duration_pay)/1000.0 as duration_pay, sum(call_amount_recv) as call_amount_recv, sum(call_amount_pay) as call_amount_
pay FROM detailed_report WHERE end_time>='2013-05-01 00:00' and end_time<'2013-07-01 00:00' and group_id='admin' and client ='CHOICE' GROUP by client, gateway ORDER BY call_amount_recv DESC; 

QUERY PLAN 

Sort (cost=2725603.99..2725609.46 rows=2188 width=44) (actual time=137713.264..137713.265 rows=5 loops=1)
Output: ((client)::text), ((gateway)::text), (count(*)), (((avg(duration) / 1000.0))::numeric(10,2)), ((avg(pdd))::numeric(10,2)), (((sum(call_duration_recv))::numeric / 1000.0)), (((sum(c
all_duration_pay))::numeric / 1000.0)), (sum(call_amount_recv)), (sum(call_amount_pay)), client, gateway
Sort Key: (sum(detailed_report.call_amount_recv))
Sort Method: quicksort Memory: 25kB
-> HashAggregate (cost=2725411.50..2725482.61 rows=2188 width=44) (actual time=137713.192..137713.215 rows=5 loops=1)
Output: (client)::text, (gateway)::text, count(*), ((avg(duration) / 1000.0))::numeric(10,2), (avg(pdd))::numeric(10,2), ((sum(call_duration_recv))::numeric / 1000.0), ((sum(call_dur
ation_pay))::numeric / 1000.0), sum(call_amount_recv), sum(call_amount_pay), client, gateway
-> Bitmap Heap Scan on public.detailed_report (cost=37356.61..2703244.88 rows=985183 width=44) (actual time=3925.850..124647.660 rows=5248227 loops=1)
Output: group_id, client, gateway, call_id, parent_call_id, start_time, connect_time, end_time, duration, source, source_alias, dest_in_number, dest_out_number, bp_code_pay, bi
lled_duration_pay, rate_pay, rate_effective_date_pay, type_value_pay, slab_time_pay, pin_pay, amount_pay, adjusted_pin_pay, adjusted_amount_pay, call_amount_pay, country_code_pay, country_des
c_pay, master_country_code, master_country_desc, bp_code_recv, billed_duration_recv, rate_recv, rate_effective_date_recv, type_value_recv, slab_time_recv, pin_recv, amount_recv, adjusted_pin_
recv, adjusted_amount_recv, call_amount_recv, country_code_recv, country_desc_recv, subscriber_type, pdd, disconnect_reason, source_ip, dest_ip, caller_hop, callee_hop, caller_received_from_h
op, callee_sent_to_hop, caller_media_ip_port, callee_media_ip_port, caller_original_media_ip_port, callee_original_media_ip_port, switch_ip, call_shop_amount_paid, version, call_duration_pay,
call_duration_recv, audio_codec, video_codec, shadow_amount_recv, shadow_amount_pay, pulse_applied_recv, pulse_applied_pay
Recheck Cond: (((detailed_report.client)::text = 'CHOICE'::text) AND ((detailed_report.group_id)::text = 'admin'::text) AND (detailed_report.end_time >= '2013-05-01 00:00:00+00
'::timestamp with time zone) AND (detailed_report.end_time < '2013-07-01 00:00:00+00'::timestamp with time zone))
-> Bitmap Index Scan on clientgroupidendtime_detailed_report_idx (cost=0.00..37110.31 rows=985183 width=0) (actual time=2820.150..2820.150 rows=5248227 loops=1)
Index Cond: (((detailed_report.client)::text = 'CHOICE'::text) AND ((detailed_report.group_id)::text = 'admin'::text) AND (detailed_report.end_time >= '2013-05-01 00:00:0
0+00'::timestamp with time zone) AND (detailed_report.end_time < '2013-07-01 00:00:00+00'::timestamp with time zone))
Total runtime: 137728.029 ms
(12 rows)

Tried by creating partial Index on group_id column for the value 'admin' and also end_time column for one month range.

With all the above experiment, could not reduce the response time, please suggest.



Re: postgres performance

From
Alan Hodgson
Date:
On Friday, December 06, 2013 11:06:58 PM chidamparam muthusamy wrote:
> hi,
> Registered with PostgreSQL Help Forum to identify and resolve the Postgres
> DB performance issue, received suggestions but could not improve the
> speed/response time. Please help.
>
> Details:
> Postgres Version 9.3.1
> Server configuration:
> Processor: 2 x Intel Quad core E5620 @ 2.40GHz
> RAM: 16 GB
>
> Postgres configuration:
> Effective cache size = 10 GB
> shared Buffer = 1250 MB
> random page cost = 4
>
> Table size = 60 GB
> Number of records = 44 million
> Carried out Vacuum Analyze after inserting new records and also after
> creating Index,
> 6 months data, every month around 10 GB will get added. Expecting good
> performance with 3 years data.
> DB Will be used for Reporting/Read, will not be used for transaction. Daily
> records will be inserted through bulk insertion every day.

Suggestions:

Partition by month.

Add many more disks, in RAID-10.
 or move to SSD.

Add a lot more RAM.



Re: postgres performance

From
Richard Huxton
Date:
On 06/12/13 17:36, chidamparam muthusamy wrote:

I rather think Alan is right - you either want a lot more RAM or faster
disks. Have a look at your first query...

> Query:
> EXPLAIN (analyze, buffers) SELECT text(client) as client, text(gateway)
> as gateway,count(*)::bigint as total_calls, (avg(duration)/1000.0)
> ::numeric(10,2) as acd, (avg(pdd)) ::numeric(10,2) as pdd,
> sum(call_duration_recv)/1000.0 as duration_recv,
> sum(call_duration_pay)/1000.0 as duration_pay, sum(call_amount_recv) as
> call_amount_recv, sum(call_amount_pay) as call_amount_
> pay FROM detailed_report WHERE end_time>='2013-05-01 00:00' and
> end_time<'2013-07-01 00:00' and group_id='admin' and client ='CHOICE'
> GROUP by client, gateway ORDER BY call_amount_recv DESC;

> QUERY PLAN
> ------------------------------------------------------
> Sort (cost=3422863.06..3422868.69 rows=2254 width=44) (actual
> time=137852.474..137852.474 rows=5 loops=1)
> Sort Key: (sum(call_amount_recv))
> Sort Method: quicksort Memory: 25kB
> Buffers: shared read=2491664

> -> HashAggregate (cost=3422664.28..3422737.53 rows=2254 width=44)
> (actual time=137852.402..137852.454 rows=5 loops=1)
> Buffers: shared read=2491664

> -> Bitmap Heap Scan on detailed_report (cost=644828.11..3399506.87
> rows=1029218 width=44) (actual time=4499.558..125443.122 rows=5248227
> loops=1)
> Recheck Cond: ((end_time >= '2013-05-01 00:00:00+00'::timestamp with
> time zone) AND (end_time < '2013-07-01 00:00:00+00'::timestamp with time
> zone) AND ((group_id)::text = 'adm
> in'::text) AND ((client)::text = 'CHOICE'::text))
> Buffers: shared read=2491664

> -> Bitmap Index Scan on endtime_groupid_client_tsidx_detail_report
> (cost=0.00..644570.81 rows=1029218 width=0) (actual
> time=3418.754..3418.754 rows=5248227 loops=1)
> Index Cond: ((end_time >= '2013-05-01 00:00:00+00'::timestamp with time
> zone) AND (end_time < '2013-07-01 00:00:00+00'::timestamp with time
> zone) AND ((group_id)::text =
> 'admin'::text) AND ((client)::text = 'CHOICE'::text))
> Buffers: shared read=95055

> Total runtime: *137868.946 ms*
> (13 rows)

The index is being used, but most of your time is going on the "Bitmap
Heap Scan". You're processing 5.2 million rows in about 120 seconds -
that's about 43 rows per millisecond - not too bad. It's not getting any
cache hits though, it's having to read all the blocks. Looking at the
number of blocks, that's ~2.5 million at 8KB each or about 20GB. You
just don't have the RAM to cache that.

If you have lots of similar reporting queries to run, you might get away
with dropping the index and letting them run in parallel. Each
individual query would be slow but they should be smart enough to share
each other's sequential scans - the disks would basically be looping
through you data continuously.

--
   Richard Huxton
   Archonet Ltd


Re: postgres performance

From
Tomas Vondra
Date:
On 6.12.2013 18:36, chidamparam muthusamy wrote:
> hi,
> Registered with PostgreSQL Help Forum to identify and resolve the
> Postgres DB performance issue, received suggestions but could not
> improve the speed/response time. Please help.
>
> Details:
> Postgres Version 9.3.1
> Server configuration:
> Processor: 2 x Intel Quad core E5620 @ 2.40GHz
> RAM: 16 GB
>
> Postgres configuration:
> Effective cache size = 10 GB
> shared Buffer = 1250 MB
> random page cost = 4
>
> Table size = 60 GB
> Number of records = 44 million
> Carried out Vacuum Analyze after inserting new records and also after
> creating Index,
> 6 months data, every month around 10 GB will get added. Expecting good
> performance with 3 years data.

So, what is good performance? What times do you need to achieve for the
queries you've posted?

It's difficult to read the explain plans wrapped in the message, so I've
pasted some of them into explain.depesz.com:

  http://explain.depesz.com/s/9SH
  http://explain.depesz.com/s/hFp

The estimates are very accurate, and as Richard Huxton pointed out, the
dominating part is the bitmap heap scan. I assume this is because or
reading the data from disk. Can you check iostat/vmstat while the
queries are running? Are you CPU or I/O bound? I'd guess the latter.

In that case, adding more RAM / more powerful I/O is probably the
easiest way to improve the performance. And a partitioning (but that
depends on the queries, as it may improve some and hurt others).

regards
Tomas


Re: postgres performance

From
chidamparam muthusamy
Date:
hi,
thank you so much for the input.
Can you please clarify the following points:
1. Output of BitmapAnd = 303660 rows
 ->  BitmapAnd  (cost=539314.51..539314.51 rows=303660 width=0) (actual time=9083.085..9083.085 rows=0 loops=1)
                     ->  Bitmap Index Scan on groupid_index  (cost=0.00..164070.62 rows=7998674 width=0) (actual time=2303.788..2303.788 rows=7840766 loops=1)
                           Index Cond: ((detailed_report.group_id)::text = 'CHOICE'::text)
                     ->  Bitmap Index Scan on client_index  (cost=0.00..175870.62 rows=7998674 width=0) (actual time=2879.691..2879.691 rows=7840113 loops=1)
                           Index Cond: ((detailed_report.client)::text = 'ChoiceFone'::text)
                     ->  Bitmap Index Scan on partial_endtime_index  (cost=0.00..199145.02 rows=9573259 width=0) (actual time=1754.044..1754.044 rows=9476589 loops=1)
                           Index Cond: ((detailed_report.end_time >= '2013-05-01 00:00:00+00'::timestamp with time zone) AND (detailed_report.end_time < '2013-06-01 00:00:00+00'::timestamp wi
th time zone))

2.  In the Next outer node Bitmap Heap Scan, estimated rows = 303660 and actual rows = 2958392, why huge difference ? How to bring it down. 
Bitmap Heap Scan on public.detailed_report  (cost=539314.51..1544589.52 rows=303660 width=44) (actual time=9619.913..51757.911 rows=2958392 loops=1)

3. what is the cause for Recheck, is it possible to reduce the time taken for Recheck ?  
Recheck Cond: (((detailed_report.group_id)::text = 'CHOICE'::text) AND ((detailed_report.client)::text = 'ChoiceFone'::text) AND (detailed_report.end_time >= '2013-05-01 00:00:
00+00'::timestamp with time zone) AND (detailed_report.end_time < '2013-06-01 00:00:00+00'::timestamp with time zone))

thanks


On Sat, Dec 7, 2013 at 12:07 AM, Richard Huxton <dev@archonet.com> wrote:
On 06/12/13 17:36, chidamparam muthusamy wrote:

I rather think Alan is right - you either want a lot more RAM or faster disks. Have a look at your first query...


Query:
EXPLAIN (analyze, buffers) SELECT text(client) as client, text(gateway)
as gateway,count(*)::bigint as total_calls, (avg(duration)/1000.0)
::numeric(10,2) as acd, (avg(pdd)) ::numeric(10,2) as pdd,
sum(call_duration_recv)/1000.0 as duration_recv,
sum(call_duration_pay)/1000.0 as duration_pay, sum(call_amount_recv) as
call_amount_recv, sum(call_amount_pay) as call_amount_
pay FROM detailed_report WHERE end_time>='2013-05-01 00:00' and
end_time<'2013-07-01 00:00' and group_id='admin' and client ='CHOICE'
GROUP by client, gateway ORDER BY call_amount_recv DESC;

QUERY PLAN
------------------------------------------------------
Sort (cost=3422863.06..3422868.69 rows=2254 width=44) (actual
time=137852.474..137852.474 rows=5 loops=1)
Sort Key: (sum(call_amount_recv))
Sort Method: quicksort Memory: 25kB
Buffers: shared read=2491664

-> HashAggregate (cost=3422664.28..3422737.53 rows=2254 width=44)
(actual time=137852.402..137852.454 rows=5 loops=1)
Buffers: shared read=2491664

-> Bitmap Heap Scan on detailed_report (cost=644828.11..3399506.87
rows=1029218 width=44) (actual time=4499.558..125443.122 rows=5248227
loops=1)
Recheck Cond: ((end_time >= '2013-05-01 00:00:00+00'::timestamp with
time zone) AND (end_time < '2013-07-01 00:00:00+00'::timestamp with time
zone) AND ((group_id)::text = 'adm
in'::text) AND ((client)::text = 'CHOICE'::text))
Buffers: shared read=2491664

-> Bitmap Index Scan on endtime_groupid_client_tsidx_detail_report
(cost=0.00..644570.81 rows=1029218 width=0) (actual
time=3418.754..3418.754 rows=5248227 loops=1)
Index Cond: ((end_time >= '2013-05-01 00:00:00+00'::timestamp with time
zone) AND (end_time < '2013-07-01 00:00:00+00'::timestamp with time
zone) AND ((group_id)::text =
'admin'::text) AND ((client)::text = 'CHOICE'::text))
Buffers: shared read=95055

Total runtime: *137868.946 ms*
(13 rows)

The index is being used, but most of your time is going on the "Bitmap Heap Scan". You're processing 5.2 million rows in about 120 seconds - that's about 43 rows per millisecond - not too bad. It's not getting any cache hits though, it's having to read all the blocks. Looking at the number of blocks, that's ~2.5 million at 8KB each or about 20GB. You just don't have the RAM to cache that.

If you have lots of similar reporting queries to run, you might get away with dropping the index and letting them run in parallel. Each individual query would be slow but they should be smart enough to share each other's sequential scans - the disks would basically be looping through you data continuously.

--
  Richard Huxton
  Archonet Ltd

Re: postgres performance

From
desmodemone
Date:



2013/12/7 chidamparam muthusamy <mchidamparam@gmail.com>
hi,
thank you so much for the input.
Can you please clarify the following points:
1. Output of BitmapAnd = 303660 rows
 ->  BitmapAnd  (cost=539314.51..539314.51 rows=303660 width=0) (actual time=9083.085..9083.085 rows=0 loops=1)
                     ->  Bitmap Index Scan on groupid_index  (cost=0.00..164070.62 rows=7998674 width=0) (actual time=2303.788..2303.788 rows=7840766 loops=1)
                           Index Cond: ((detailed_report.group_id)::text = 'CHOICE'::text)
                     ->  Bitmap Index Scan on client_index  (cost=0.00..175870.62 rows=7998674 width=0) (actual time=2879.691..2879.691 rows=7840113 loops=1)
                           Index Cond: ((detailed_report.client)::text = 'ChoiceFone'::text)
                     ->  Bitmap Index Scan on partial_endtime_index  (cost=0.00..199145.02 rows=9573259 width=0) (actual time=1754.044..1754.044 rows=9476589 loops=1)
                           Index Cond: ((detailed_report.end_time >= '2013-05-01 00:00:00+00'::timestamp with time zone) AND (detailed_report.end_time < '2013-06-01 00:00:00+00'::timestamp wi
th time zone))

2.  In the Next outer node Bitmap Heap Scan, estimated rows = 303660 and actual rows = 2958392, why huge difference ? How to bring it down. 
Bitmap Heap Scan on public.detailed_report  (cost=539314.51..1544589.52 rows=303660 width=44) (actual time=9619.913..51757.911 rows=2958392 loops=1)

3. what is the cause for Recheck, is it possible to reduce the time taken for Recheck ?  
Recheck Cond: (((detailed_report.group_id)::text = 'CHOICE'::text) AND ((detailed_report.client)::text = 'ChoiceFone'::text) AND (detailed_report.end_time >= '2013-05-01 00:00:
00+00'::timestamp with time zone) AND (detailed_report.end_time < '2013-06-01 00:00:00+00'::timestamp with time zone))

thanks


On Sat, Dec 7, 2013 at 12:07 AM, Richard Huxton <dev@archonet.com> wrote:
On 06/12/13 17:36, chidamparam muthusamy wrote:

I rather think Alan is right - you either want a lot more RAM or faster disks. Have a look at your first query...


Query:
EXPLAIN (analyze, buffers) SELECT text(client) as client, text(gateway)
as gateway,count(*)::bigint as total_calls, (avg(duration)/1000.0)
::numeric(10,2) as acd, (avg(pdd)) ::numeric(10,2) as pdd,
sum(call_duration_recv)/1000.0 as duration_recv,
sum(call_duration_pay)/1000.0 as duration_pay, sum(call_amount_recv) as
call_amount_recv, sum(call_amount_pay) as call_amount_
pay FROM detailed_report WHERE end_time>='2013-05-01 00:00' and
end_time<'2013-07-01 00:00' and group_id='admin' and client ='CHOICE'
GROUP by client, gateway ORDER BY call_amount_recv DESC;

QUERY PLAN
------------------------------------------------------
Sort (cost=3422863.06..3422868.69 rows=2254 width=44) (actual
time=137852.474..137852.474 rows=5 loops=1)
Sort Key: (sum(call_amount_recv))
Sort Method: quicksort Memory: 25kB
Buffers: shared read=2491664

-> HashAggregate (cost=3422664.28..3422737.53 rows=2254 width=44)
(actual time=137852.402..137852.454 rows=5 loops=1)
Buffers: shared read=2491664

-> Bitmap Heap Scan on detailed_report (cost=644828.11..3399506.87
rows=1029218 width=44) (actual time=4499.558..125443.122 rows=5248227
loops=1)
Recheck Cond: ((end_time >= '2013-05-01 00:00:00+00'::timestamp with
time zone) AND (end_time < '2013-07-01 00:00:00+00'::timestamp with time
zone) AND ((group_id)::text = 'adm
in'::text) AND ((client)::text = 'CHOICE'::text))
Buffers: shared read=2491664

-> Bitmap Index Scan on endtime_groupid_client_tsidx_detail_report
(cost=0.00..644570.81 rows=1029218 width=0) (actual
time=3418.754..3418.754 rows=5248227 loops=1)
Index Cond: ((end_time >= '2013-05-01 00:00:00+00'::timestamp with time
zone) AND (end_time < '2013-07-01 00:00:00+00'::timestamp with time
zone) AND ((group_id)::text =
'admin'::text) AND ((client)::text = 'CHOICE'::text))
Buffers: shared read=95055

Total runtime: *137868.946 ms*
(13 rows)

The index is being used, but most of your time is going on the "Bitmap Heap Scan". You're processing 5.2 million rows in about 120 seconds - that's about 43 rows per millisecond - not too bad. It's not getting any cache hits though, it's having to read all the blocks. Looking at the number of blocks, that's ~2.5 million at 8KB each or about 20GB. You just don't have the RAM to cache that.

If you have lots of similar reporting queries to run, you might get away with dropping the index and letting them run in parallel. Each individual query would be slow but they should be smart enough to share each other's sequential scans - the disks would basically be looping through you data continuously.

--
  Richard Huxton
  Archonet Ltd


Hi,
       about point 3, if I remembr correctly, the problem is that the module that create the bitmap index could choose between not lossy or lossy. The problem is  correlated to the max number of  tuples inside  a block ( 256 for 8kb block) , so if you not have enought work_memory , the module switches  to the lossy storage (that use only 1 bit for a disk page) and so your backend process have to do the recheck condition on the tuples read from table.

You could try to increase work_mem (better) to avoid the module  switches from not lossy bitmap to lossy bitmap, or try  to disable the enable_bitmapscan   (set enable_bitmapscan=off)  to see if you could gain something.

Abount point 1 , it's doing a bitwise and  operation  between the bitmap indexes  so it use both 3 bitmap indexes to apply the predicates of the query.

About point 2 it depends on statistics, it's possible you are not analyzing enough rows of the table, by the way the important thing is that your plans are table and "good".

Moreover it will be interesting to know what type of storage and filesystem you are using, are you monitoring the latency of your storage ?

Did you try the effective_io_concurrency to speed up bitmap heap scan ?  see here 


 
Bye

Mat