CPU Consuming query. Sequential scan despite indexing. - Mailing list pgsql-performance

From aditya desai
Subject CPU Consuming query. Sequential scan despite indexing.
Date
Msg-id CAN0SRDFYi2+C9+LsV9mzmmLjjBt74g_0VR6d8uEaMV2fTnSYxA@mail.gmail.com
Whole thread Raw
Responses Re: CPU Consuming query. Sequential scan despite indexing.  (Michael Lewis <mlewis@entrata.com>)
Re: CPU Consuming query. Sequential scan despite indexing.  (Laurenz Albe <laurenz.albe@cybertec.at>)
List pgsql-performance
Hi,
Below query always shows up on top in the CPU matrix. Also despite having indexes it does sequential scans(probably because WHERE condition satisfies almost all of the data from table). This query runs on the default landing page in application and needs to fetch records in less that 100 ms without consuming too much CPU.

 Any opinions? Table is very huge and due to referential identity and business requirements we could not implement partitioning as well.

There is index on (countrycode,facilitycode,jobstartdatetime)

explain (analyze,buffers) with JobCount as ( select jobstatuscode,count(1) stat_count from job j where 1=1 and j.countrycode = 'TH'   and j.facilitycode in ('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1') and ((j.jobstartdatetime  between '2020-08-01 00:00:00' and '2020-09-30 00:00:00' ) or j.jobstartdatetime IS NULL )  group by j.jobstatuscode)
 select js.jobstatuscode,COALESCE(stat_count,0) stat_count from JobCount jc right outer join jobstatus js on jc.jobstatuscode=js.jobstatuscode;

                          QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------
 Hash Right Join  (cost=98845.93..98846.10 rows=10 width=12) (actual time=1314.809..1314.849 rows=10 loops=1)
   Hash Cond: ((jc.jobstatuscode)::text = (js.jobstatuscode)::text)
   Buffers: shared hit=21314 read=3231
   I/O Timings: read=19.867
   CTE jobcount
     ->  Finalize GroupAggregate  (cost=98842.93..98844.71 rows=7 width=12) (actual time=1314.780..1314.802 rows=6 loops=1)
           Group Key: j.jobstatuscode
           Buffers: shared hit=21313 read=3231
           I/O Timings: read=19.867
           ->  Gather Merge  (cost=98842.93..98844.57 rows=14 width=12) (actual time=1314.766..1314.857 rows=18 loops=1)
                 Workers Planned: 2
                 Workers Launched: 2
                 Buffers: shared hit=60102 read=11834
                 I/O Timings: read=59.194
                 ->  Sort  (cost=97842.91..97842.93 rows=7 width=12) (actual time=1305.044..1305.047 rows=6 loops=3)
                       Sort Key: j.jobstatuscode
                       Sort Method: quicksort  Memory: 25kB
                       Worker 0:  Sort Method: quicksort  Memory: 25kB
                       Worker 1:  Sort Method: quicksort  Memory: 25kB
                       Buffers: shared hit=60102 read=11834
                       I/O Timings: read=59.194
                       ->  Partial HashAggregate  (cost=97842.74..97842.81 rows=7 width=12) (actual time=1305.010..1305.013 rows=6 loops=3)
                             Group Key: j.jobstatuscode
                             Buffers: shared hit=60086 read=11834
                             I/O Timings: read=59.194
                             ->  Parallel Seq Scan on job j  (cost=0.00..96837.93 rows=200963 width=4) (actual time=13.010..1144.434 rows=163200 loops=3)
                                   Filter: (((countrycode)::text = 'TH'::text) AND (((jobstartdatetime >= '2020-08-01 00:00:00'::timestamp without time zone) AND (jobst
artdatetime <= '2020-09-30 00:00:00'::timestamp without time zone)) OR (jobstartdatetime IS NULL)) AND ((facilitycode)::text = ANY ('{THNPM1,THPRK1,THCNT1,THSPN1,THKRI1
,THPKN1,THSBI1,THUTG1,THLRI1,THSRI1,THSUR1,THSKM1}'::text[])))
                                   Rows Removed by Filter: 449035
                                   Buffers: shared hit=60086 read=11834
                                   I/O Timings: read=59.194
   ->  CTE Scan on jobcount jc  (cost=0.00..0.14 rows=7 width=24) (actual time=1314.784..1314.811 rows=6 loops=1)
         Buffers: shared hit=21313 read=3231
         I/O Timings: read=19.867
   ->  Hash  (cost=1.10..1.10 rows=10 width=4) (actual time=0.014..0.015 rows=10 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         Buffers: shared hit=1
         ->  Seq Scan on jobstatus js  (cost=0.00..1.10 rows=10 width=4) (actual time=0.005..0.008 rows=10 loops=1)
               Buffers: shared hit=1
 Planning Time: 0.949 ms
 Execution Time: 1314.993 ms
(40 rows)

Regards,
Aditya.

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Poor Performance running Django unit tests after upgrading from 10.6
Next
From: Michael Lewis
Date:
Subject: Re: CPU Consuming query. Sequential scan despite indexing.