Re: Improving Performance of Query ~ Filter by A, Sort by B - Mailing list pgsql-performance

From Tom Lane
Subject Re: Improving Performance of Query ~ Filter by A, Sort by B
Date
Msg-id 23355.1531365018@sss.pgh.pa.us
In response to Re: Improving Performance of Query ~ Filter by A, Sort by B  (Lincoln Swaine-Moore <lswainemoore@gmail.com>)
List pgsql-performance
Lincoln Swaine-Moore <lswainemoore@gmail.com> writes:
> Here's the result (I turned off the timeout and got it to finish):
> ...

I think the core of the problem here is bad rowcount estimation.  We can't
tell from your output how many rows really match

> WHERE "a"."parent_id" IN (
>     49188,14816,14758,8402
> )

but the planner is guessing there are 5823 of them.  In the case with
only three IN items, we have

>          ->  Index Scan using a_parent_id_idx1 on a_partition1 a (cost=0.43..4888.37 rows=4367 width=12) (actual time=5.581..36.270 rows=50 loops=1)
>                Index Cond: (parent_id = ANY ('{19948,21436,41220}'::integer[]))

so the planner thinks there are 4367 matching rows but there are only 50.
Anytime you've got a factor-of-100 estimation error, you're going to be
really lucky if you get a decent plan.
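
To see how far off the 5823-row guess is for the four-item list, one option
is simply to count the matches.  This is only a sketch: it assumes the table
behind the alias "a" is itself named a, which the posted output does not
confirm.

```sql
-- Assumption: the table aliased as "a" in the query is named a.
-- Counts the rows that really match the four-item IN list, for
-- comparison with the planner's estimate of 5823.
SELECT count(*)
FROM a
WHERE parent_id IN (49188, 14816, 14758, 8402);
```

If the count comes back nearer 50 than 5000, the four-item case has the
same estimation problem as the three-item one.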

I suggest increasing the statistics target for the parent_id column
in hopes of getting better estimates for the number of matches.
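
Something along these lines should do it.  Treat it as a sketch: the table
name a is assumed (only the alias and the partition a_partition1 appear in
the plan), and 1000 is just an illustrative value; the default target is
100.

```sql
-- Assumption: the parent table is named a; a_partition1 is the
-- partition named in the posted plan.
-- Raise the per-column statistics target (illustrative value).
-- Without ONLY, ALTER TABLE also applies to child tables.
ALTER TABLE a ALTER COLUMN parent_id SET STATISTICS 1000;

-- The new target has no effect until statistics are regathered.
ANALYZE a;
ANALYZE a_partition1;

-- Inspect what the planner now knows about the column.
SELECT tablename, n_distinct, most_common_vals
FROM pg_stats
WHERE attname = 'parent_id';
```

Afterwards, rerun EXPLAIN ANALYZE and compare the estimated and actual row
counts on the index scan again.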

            regards, tom lane

