Home > mailing lists

Re: CPU Consuming query. Sequential scan despite indexing. - Mailing list pgsql-performance

From	Laurenz Albe
Subject	Re: CPU Consuming query. Sequential scan despite indexing.
Date	October 16, 2020 11:36:10
Msg-id	eb1854e1a1b7893870363b0e3ac80b594887d440.camel@cybertec.at Whole thread Raw
In response to	CPU Consuming query. Sequential scan despite indexing. (aditya desai <admad123@gmail.com>)
Responses	Re: CPU Consuming query. Sequential scan despite indexing. (aditya desai <admad123@gmail.com>)
List	pgsql-performance

Tree view

On Thu, 2020-10-15 at 20:34 +0530, aditya desai wrote:
> Below query always shows up on top in the CPU matrix. Also despite having indexes it does sequential scans
> (probably because WHERE condition satisfies almost all of the data from table). This query
> runs on the default landing page in application and needs to fetch records in less that 100 ms
>  without consuming too much CPU.
> 
>  Any opinions? Table is very huge and due to referential identity and business requirements we could not
>  implement partitioning as well.
> 
> There is index on (countrycode,facilitycode,jobstartdatetime)
> 
> explain (analyze,buffers) with JobCount as ( select jobstatuscode,count(1) stat_count from job j
>  where 1=1 and j.countrycode = 'TH'
> and j.facilitycode in
('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
>  and ((j.jobstartdatetime  between '2020-08-01 00:00:00' and '2020-09-30 00:00:00' ) or j.jobstartdatetime IS NULL )
groupby j.jobstatuscode)
 
>  select js.jobstatuscode,COALESCE(stat_count,0) stat_count from JobCount jc right outer join jobstatus js on
jc.jobstatuscode=js.jobstatuscode;
> 
>                           QUERY PLAN
> 
>  Hash Right Join  (cost=98845.93..98846.10 rows=10 width=12) (actual time=1314.809..1314.849 rows=10 loops=1)
>                              ->  Parallel Seq Scan on job j  (cost=0.00..96837.93 rows=200963 width=4) (actual
time=13.010..1144.434rows=163200 loops=3)
 
>                                    Filter: (((countrycode)::text = 'TH'::text) AND (((jobstartdatetime >= '2020-08-01
00:00:00'::timestampwithout time zone) AND (jobst
 
> artdatetime <= '2020-09-30 00:00:00'::timestamp without time zone)) OR (jobstartdatetime IS NULL)) AND
((facilitycode)::text= ANY ('{THNPM1,THPRK1,THCNT1,THSPN1,THKRI1
 
> ,THPKN1,THSBI1,THUTG1,THLRI1,THSRI1,THSUR1,THSKM1}'::text[])))
>                                    Rows Removed by Filter: 449035
>                                    Buffers: shared hit=60086 read=11834
>                                    I/O Timings: read=59.194
> 

You should rewrite the subquery as a UNION to avoid the OR:

  ... WHERE j.countrycode = 'TH'
        and j.facilitycode in
('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
        and j.jobstartdatetime between '2020-08-01 00:00:00' and '2020-09-30 00:00:00'

and

  ... WHERE j.countrycode = 'TH'
        and j.facilitycode in
('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
        and j.jobstartdatetime IS NULL

These indexes could speed up the resulting query:

  CREATE INDEX ON job (countrycode, facilitycode);
  CREATE INDEX ON job (countrycode, jobstartdatetime);
  CREATE INDEX ON job (countrycode, facilitycode) WHERE jobstartdaytime IS NULL;

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com

pgsql-performance by date:

From: Michael Lewis
Date: 15 October 2020, 19:26:35
Subject: Re: CPU Consuming query. Sequential scan despite indexing.

From: Michael Lewis
Date: 19 October 2020, 19:20:12
Subject: Re: CPU Consuming query. Sequential scan despite indexing.

Re: CPU Consuming query. Sequential scan despite indexing. - Mailing list pgsql-performance

Previous

Next