Re: Improving Performance of Query ~ Filter by A, Sort by B

From: Tom Lane
Subject: Re: Improving Performance of Query ~ Filter by A, Sort by B
Date: ,
Msg-id: 23355.1531365018@sss.pgh.pa.us
In response to: Re: Improving Performance of Query ~ Filter by A, Sort by B  (Lincoln Swaine-Moore)
List: pgsql-performance


Lincoln Swaine-Moore writes:
> Here's the result (I turned off the timeout and got it to finish):
> ...

I think the core of the problem here is bad rowcount estimation.  We can't
tell from your output how many rows really match

> WHERE "a"."parent_id" IN (
>     49188,14816,14758,8402
> )

but the planner is guessing there are 5823 of them.  In the case with
only three IN items, we have

>          ->  Index Scan using a_parent_id_idx1 on a_partition1 a (cost=0.43..4888.37 rows=4367 width=12) (actual time=5.581..36.270 rows=50 loops=1)
>                Index Cond: (parent_id = ANY ('{19948,21436,41220}'::integer[]))

so the planner thinks there are 4367 matching rows but there are only 50.
Anytime you've got a factor-of-100 estimation error, you're going to be
really lucky if you get a decent plan.

I suggest increasing the statistics target for the parent_id column
in hopes of getting better estimates for the number of matches.

            regards, tom lane
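
A minimal sketch of the suggestion above, not taken from the thread itself. The table name a_partition1 and the column parent_id come from the quoted plan. The target value 1000 is only an illustrative assumption (the stock default_statistics_target is 100), and the same change would presumably be needed on each of the other partitions.

```sql
-- Raise the per-column statistics target so ANALYZE samples more rows
-- and keeps a longer most-common-values list for parent_id.
-- 1000 is an assumed, illustrative value; tune it for your data.
ALTER TABLE a_partition1 ALTER COLUMN parent_id SET STATISTICS 1000;

-- The new target has no effect until statistics are gathered again.
ANALYZE a_partition1;

-- Inspect what the planner now knows about the column.
SELECT n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'a_partition1'
  AND attname = 'parent_id';
```

After this, re-running EXPLAIN ANALYZE on the original query shows whether the estimated rows= figure for the index scan has moved closer to the actual one.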


