Re: Re: How to improve the performance of my SQL query? - Mailing list pgsql-general

From David Rowley
Subject Re: Re: How to improve the performance of my SQL query?
Date
Msg-id CAApHDvrYBKK_5DC2gPgG=H9_3jqoryXBUydNmD6ftFN0kcyuyA@mail.gmail.com
Whole thread Raw
In response to Re:Re: How to improve the performance of my SQL query?  (gzh <gzhcoder@126.com>)
List pgsql-general
On Thu, 20 Jul 2023 at 23:36, gzh <gzhcoder@126.com> wrote:
>
>
> Thank you very much for taking the time to reply to my question.
>
>
> Sorry, I provided incorrect information.
>
> The index also does not work in the following query statement.
>
>
> > select COUNT(ET_CD)
> > from TBL_SHA
> > WHERE MS_CD = '009'
> > AND ETRYS = '000001'
>
> QUERY PLAN
> Limit  (cost=2419643.47..2419643.48 rows=1 width=8) (actual time=128667.439..128668.250 rows=1 loops=1)
>   ->  Finalize Aggregate  (cost=2419643.47..2419643.48 rows=1 width=8) (actual time=128667.437..128668.246 rows=1
loops=1)
>         ->  Gather  (cost=2419643.25..2419643.46 rows=2 width=8) (actual time=128664.108..128668.233 rows=3 loops=1)
>               Workers Planned: 2
>               Workers Launched: 2
>               ->  Partial Aggregate  (cost=2418643.25..2418643.26 rows=1 width=8) (actual time=128655.256..128655.258
rows=1loops=3)
 
>                     ->  Parallel Seq Scan on TBL_SHA  (cost=0.00..2415548.85 rows=1237762 width=9) (actual
time=75357.455..128531.615rows=1066667 loops=3)
 
>                           Filter: ((MS_CD = '009'::bpchar) AND (ETRYS = '000001'::bpchar))
>                           Rows Removed by Filter: 11833442
> Planning Time: 0.118 ms
> Execution Time: 128668.290 ms
>
> The TBL_SHA table has another index, as shown below.
>
> CREATE INDEX index_search_02 ON mdb.TBL_SHA USING btree (ET_CD, ETRYS)
> CREATE INDEX index_search_03 ON mdb.TBL_SHA USING btree (MS_CD, ET_DAY, BK_CD, FR_CD, RM_CD)
>
> When I take the following query statement, the result is returned quickly.
> Why does index_search_01 always not work?

The method to access the table is decided by the query planner based
on costs.  The costs are driven off the row estimates which are driven
from table statistics.  If the table statistics, for example say that
99% of rows have MS_CD = '009', then scanning an index on MS_CD is
unlikely to be a good idea as that would likely require random access
to the heap.  It's likely better to perform a table scan and then just
filter out the 1% of rows that don't match.

Try executing the query after having done:

SET enable_seqscan TO off;

What plan does it use now?

Is that plan faster or slower than the seq scan plan?

David



pgsql-general by date:

Previous
From: Stephen Frost
Date:
Subject: Re: Difference in the tablespace folders on primary and secondary nodes
Next
From: David Rowley
Date:
Subject: Re: How to improve the performance of my SQL query?