Re: enable_XXX options - Mailing list pgsql-performance

From Tom Lane
Subject Re: enable_XXX options
Date
Msg-id 15285.1112376470@sss.pgh.pa.us
Whole thread Raw
In response to Re: enable_XXX options  ("Peterson, Bjorn" <Bjorn.Peterson@pearson.com>)
List pgsql-performance
"Peterson, Bjorn" <Bjorn.Peterson@pearson.com> writes:
>> That's what it's there for ... but it would be useful to look into why
>> the planner gets it so wrong without that hint.  Could we see EXPLAIN
>> ANALYZE both ways?

> Below is my query and the output of EXPLAIN - I was not able to run EXPLAIN
> ANALYZE, as the query never completes unless we turn enable_nestloop off:

Well, when the point is to find out why the planner's estimates don't
match reality, it's difficult to learn anything by looking only at the
estimates and not at reality.

Given what you say about the table sizes, the planner's preferred plan
looks somewhat reasonable.  I think the weak spot is the assumption that
this index check will be fast:

>                           ->  Index Scan using "Attendance_pkey" on
> "Attendance" a  (cost=0.00..4.37 rows=1 width=14)
>                                 Index Cond: ((a.attendance_date <= now())
> AND (a.attendance_type = 1) AND ("outer".course_id = a.time_slot) AND
> (a.user_id = "outer".user_id))

and the reason this seems like a weak spot is that the plan implies that
you made attendance_date be the first column in the index.  At least for
this query, it'd be far better for attendance_date to be the last
column, so that the info for any one user_id is bunched together in the
index.  For that matter I'd bet that attendance_type shouldn't be the
highest part of the key either --- either course_id or user_id should
probably be the leading key, depending on what sorts of queries you do.
It wouldn't matter for this query, but you should look to see if you
have other queries that select on only one of the two.

If you have both equalities and inequalities in an index condition, you
always want the equalities to be on the higher-order keys.  Otherwise
the scan will involve wasted scanning over index entries that match
only some of the conditions.  (Think about the ordering of a multicolumn
index to see why this is so.)  In this particular case I think the thing
will be scanning almost the whole index every time :-(

            regards, tom lane

pgsql-performance by date:

Previous
From: "Peterson, Bjorn"
Date:
Subject: Re: enable_XXX options
Next
From: "Marc G. Fournier"
Date:
Subject: Sustained inserts per sec ... ?