Re: Reasons for choosing one execution plan overanother? - Mailing list pgsql-performance

From Mikkel Lauritsen
Subject Re: Reasons for choosing one execution plan overanother?
Date
Msg-id bf5b44d979fb06ba3eda79042d19dbea@localhost
Whole thread Raw
Responses Re: Reasons for choosing one execution plan overanother?
List pgsql-performance
Hi all,

On Wed, 11 Sep 2013 18:55:38 +0200, Giuseppe Broccolo
<giuseppe.broccolo@2ndquadrant.it> wrote:
> Il 11/09/2013 13:16, Mikkel Lauritsen ha scritto:
> > Hi all,
> >
> > I have a number of Postgres 9.2.4 databases with the same schema but
> > with slightly different contents, running on small servers that are
> > basically alike (8-16 GB ram).
>
> I think that your answer can be found in your statement "slightly
> different contents".

Yup, that's what I've been thinking myself so far - it definitely doesn't
look as if I'm hitting a bug, and I've been through the schema so I feel
reasonably sure that I'm not missing an index.

In the example from my original mail the i and a tables are identical in
the two databases. The slow plan is chosen when the x and e tables contain
3.2M and 6.2M rows, the fast plan has 12.8M and 17M rows.

So - does anybody with enough insight in the planner know if it sounds
likely that it would choose the given plans in these two cases, or if
it's more likely that I have a tuning problem that leads to bad planning?

And if the different plans are to be expected, is there any way I can hint
at the planner to make it choose the fast plan in both cases?

FWIW if I do a very simple query like

SELECT e.id FROM e INNER JOIN x USING (id) WHERE e.h_id = 'foo';

on the two databases I also end up with two different plans (included
below). Here the execution time difference is much less pronounced (note
that the "fast" execution is on inferior hardware and with a much larger
result), but the way the join is executed is the same as in the initial
larger plans. Setting seq_page_cost and random_page_cost to the same
value makes the planner choose the fast plan in both cases, but unfortu-
nately that has no effect on my initial problem :-/

Best regards & thanks,
  Mikkel Lauritsen

---

Fast plan:

 Nested Loop  (cost=0.00..24523.00 rows=1138 width=39) (actual
time=2.546..33.858 rows=1192 loops=1)
   Buffers: shared hit=8991
   ->  Index Scan using e_h_id_idx on e  (cost=0.00..6171.55 rows=1525
width=39) (actual time=0.053..1.211 rows=1857 loops=1)
         Index Cond: (healthtrack_id =
'-95674114670403931535179954575983492851'::text)
         Buffers: shared hit=350
   ->  Index Only Scan using x_pkey on x  (cost=0.00..12.02 rows=1
width=39) (actual time=0.017..0.017 rows=1 loops=1857)
         Index Cond: (id = e.id)
         Heap Fetches: 1192
         Buffers: shared hit=8641
 Total runtime: 34.065 ms


Slow plan:

 Nested Loop  (cost=22.25..7020.66 rows=277 width=39) (actual
time=0.298..13.996 rows=228 loops=1)
   Buffers: shared hit=3173
   ->  Bitmap Heap Scan on e  (cost=22.25..2093.50 rows=537 width=39)
(actual time=0.219..0.628 rows=697 loops=1)
         Recheck Cond: (healthtrack_id = 'foo'::text)
         Buffers: shared hit=152
         ->  Bitmap Index Scan on e_h_id_idx  (cost=0.00..22.12 rows=537
width=0) (actual time=0.188..0.188 rows=697 loops=1)
               Index Cond: (h_id = 'foo'::text)
               Buffers: shared hit=9
   ->  Index Only Scan using x_pkey on x  (cost=0.00..9.17 rows=1
width=39) (actual time=0.018..0.018 rows=0 loops=697)
         Index Cond: (id = e.id)
         Heap Fetches: 228
         Buffers: shared hit=3021
 Total runtime: 14.070 ms


pgsql-performance by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: Performance bug in prepared statement binding in 9.2?
Next
From: Gavin Flower
Date:
Subject: Re: Reasons for choosing one execution plan over another?