Thread: Why is it choosing a different plan?

Why is it choosing a different plan?

From
Anthony Presley
Date:
Hi all, I'm having some confusion with the 7.4 query planner.

I have two identical queries, whereby the passed (varchar) parameter
appears to be the deciding factor between a sequential or an index scan.


IE, This query:

explain SELECT DISTINCT (a1.ENTRY_ID) AS retrieved FROM OS_CURRENTSTEP
AS a1 , OS_CURRENTSTEP AS a2  WHERE a1.ENTRY_ID = a1.ENTRY_ID AND
a1.ENTRY_ID = a2.ENTRY_ID AND ( a1.OWNER =  'p1'  AND a2.STEP_ID =
1  );
NOTICE:  QUERY PLAN:

Unique  (cost=1175.88..1175.88 rows=1 width=16)
  ->  Sort  (cost=1175.88..1175.88 rows=1 width=16)
        ->  Nested Loop  (cost=0.00..1175.87 rows=1 width=16)
              ->  Index Scan using idx_9 on os_currentstep a1
(cost=0.00..1172.45 rows=1 width=8)
              ->  Index Scan using idx_8 on os_currentstep a2
(cost=0.00..3.41 rows=1 width=8)

However, this query:

explain SELECT DISTINCT (a1.ENTRY_ID) AS retrieved FROM OS_CURRENTSTEP
AS a1 , OS_CURRENTSTEP AS a2  WHERE a1.ENTRY_ID = a1.ENTRY_ID AND
a1.ENTRY_ID = a2.ENTRY_ID AND ( a1.OWNER =  'GIL'  AND a2.STEP_ID =
1  );
NOTICE:  QUERY PLAN:

Unique  (cost=3110.22..3110.22 rows=1 width=16)
  ->  Sort  (cost=3110.22..3110.22 rows=1 width=16)
        ->  Nested Loop  (cost=0.00..3110.21 rows=1 width=16)
              ->  Seq Scan on os_currentstep a1  (cost=0.00..3106.78
rows=1 width=8)
              ->  Index Scan using idx_8 on os_currentstep a2
(cost=0.00..3.41 rows=1 width=8)


Thoughts about why changing OWNER from 'p1' to 'GIL' would go from an
Index Scan to a Sequential?

[There is an index on os_currentstep, and it was vacuum analyze'd
recently.]

Running version 7.4 (working on upgrading to 8.0 soon).  Thanks!

--
Anthony


Re: Why is it choosing a different plan?

From
Anthony Presley
Date:
I thought this was related to the TYPE (ie, I could cast it using
something like: attr1=1::int8).  However, I tried a few more values, and
the query planner is confusing me.

With these values, in the owner, I get a Seq Scan:
    'GIL', '1122', '2305'

With these values, in the owner, I get an Index Scan:
    'p1', 'p2', '2300', '8088', 'CHANGEINVENTION'

The os_currentstep table has about 119,700 rows in it -- and I can't do
too much to actually change the query, since it's coming from something
of a 'black box' application.

Thoughts?

--
Anthony

On Fri, 2006-09-22 at 17:59 -0500, Anthony Presley wrote:
> Hi all, I'm having some confusion with the 7.4 query planner.
>
> I have two identical queries, whereby the passed (varchar) parameter
> appears to be the deciding factor between a sequential or an index scan.
>
>
> IE, This query:
>
> explain SELECT DISTINCT (a1.ENTRY_ID) AS retrieved FROM OS_CURRENTSTEP
> AS a1 , OS_CURRENTSTEP AS a2  WHERE a1.ENTRY_ID = a1.ENTRY_ID AND
> a1.ENTRY_ID = a2.ENTRY_ID AND ( a1.OWNER =  'p1'  AND a2.STEP_ID =
> 1  );
> NOTICE:  QUERY PLAN:
>
> Unique  (cost=1175.88..1175.88 rows=1 width=16)
>   ->  Sort  (cost=1175.88..1175.88 rows=1 width=16)
>         ->  Nested Loop  (cost=0.00..1175.87 rows=1 width=16)
>               ->  Index Scan using idx_9 on os_currentstep a1
> (cost=0.00..1172.45 rows=1 width=8)
>               ->  Index Scan using idx_8 on os_currentstep a2
> (cost=0.00..3.41 rows=1 width=8)
>
> However, this query:
>
> explain SELECT DISTINCT (a1.ENTRY_ID) AS retrieved FROM OS_CURRENTSTEP
> AS a1 , OS_CURRENTSTEP AS a2  WHERE a1.ENTRY_ID = a1.ENTRY_ID AND
> a1.ENTRY_ID = a2.ENTRY_ID AND ( a1.OWNER =  'GIL'  AND a2.STEP_ID =
> 1  );
> NOTICE:  QUERY PLAN:
>
> Unique  (cost=3110.22..3110.22 rows=1 width=16)
>   ->  Sort  (cost=3110.22..3110.22 rows=1 width=16)
>         ->  Nested Loop  (cost=0.00..3110.21 rows=1 width=16)
>               ->  Seq Scan on os_currentstep a1  (cost=0.00..3106.78
> rows=1 width=8)
>               ->  Index Scan using idx_8 on os_currentstep a2
> (cost=0.00..3.41 rows=1 width=8)
>
>
> Thoughts about why changing OWNER from 'p1' to 'GIL' would go from an
> Index Scan to a Sequential?
>
> [There is an index on os_currentstep, and it was vacuum analyze'd
> recently.]
>
> Running version 7.4 (working on upgrading to 8.0 soon).  Thanks!
>
> --
> Anthony
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
>                http://archives.postgresql.org


Re: Why is it choosing a different plan?

From
Anthony Presley
Date:
Doh!

Bad kharma.  I apologize.  Too late, and not enuf caffeine.  I posted
here because this query is taking 2+ minutes on a production machine,
and under 4 seconds on a development machine.

For posterity sakes .... the seq scan is because of the distribution of
those values.  GIL is in about 1/2 of the records.  The others are very
common.  Cheaper to do a Sequential than to do an index.  The other
values are present in only a few spotted cases (1 to 3000), and the
index is better.

Also helps when the production machine has all of its indexes in place
to actually do the reading.

Sorry to be a bother!

--
Anthony

On Fri, 2006-09-22 at 18:58 -0500, Anthony Presley wrote:
> I thought this was related to the TYPE (ie, I could cast it using
> something like: attr1=1::int8).  However, I tried a few more values, and
> the query planner is confusing me.
>
> With these values, in the owner, I get a Seq Scan:
>  'GIL', '1122', '2305'
>
> With these values, in the owner, I get an Index Scan:
>  'p1', 'p2', '2300', '8088', 'CHANGEINVENTION'
>
> The os_currentstep table has about 119,700 rows in it -- and I can't do
> too much to actually change the query, since it's coming from something
> of a 'black box' application.
>
> Thoughts?
>
> --
> Anthony
>
> On Fri, 2006-09-22 at 17:59 -0500, Anthony Presley wrote:
> > Hi all, I'm having some confusion with the 7.4 query planner.
> >
> > I have two identical queries, whereby the passed (varchar) parameter
> > appears to be the deciding factor between a sequential or an index scan.
> >
> >
> > IE, This query:
> >
> > explain SELECT DISTINCT (a1.ENTRY_ID) AS retrieved FROM OS_CURRENTSTEP
> > AS a1 , OS_CURRENTSTEP AS a2  WHERE a1.ENTRY_ID = a1.ENTRY_ID AND
> > a1.ENTRY_ID = a2.ENTRY_ID AND ( a1.OWNER =  'p1'  AND a2.STEP_ID =
> > 1  );
> > NOTICE:  QUERY PLAN:
> >
> > Unique  (cost=1175.88..1175.88 rows=1 width=16)
> >   ->  Sort  (cost=1175.88..1175.88 rows=1 width=16)
> >         ->  Nested Loop  (cost=0.00..1175.87 rows=1 width=16)
> >               ->  Index Scan using idx_9 on os_currentstep a1
> > (cost=0.00..1172.45 rows=1 width=8)
> >               ->  Index Scan using idx_8 on os_currentstep a2
> > (cost=0.00..3.41 rows=1 width=8)
> >
> > However, this query:
> >
> > explain SELECT DISTINCT (a1.ENTRY_ID) AS retrieved FROM OS_CURRENTSTEP
> > AS a1 , OS_CURRENTSTEP AS a2  WHERE a1.ENTRY_ID = a1.ENTRY_ID AND
> > a1.ENTRY_ID = a2.ENTRY_ID AND ( a1.OWNER =  'GIL'  AND a2.STEP_ID =
> > 1  );
> > NOTICE:  QUERY PLAN:
> >
> > Unique  (cost=3110.22..3110.22 rows=1 width=16)
> >   ->  Sort  (cost=3110.22..3110.22 rows=1 width=16)
> >         ->  Nested Loop  (cost=0.00..3110.21 rows=1 width=16)
> >               ->  Seq Scan on os_currentstep a1  (cost=0.00..3106.78
> > rows=1 width=8)
> >               ->  Index Scan using idx_8 on os_currentstep a2
> > (cost=0.00..3.41 rows=1 width=8)
> >
> >
> > Thoughts about why changing OWNER from 'p1' to 'GIL' would go from an
> > Index Scan to a Sequential?
> >
> > [There is an index on os_currentstep, and it was vacuum analyze'd
> > recently.]
> >
> > Running version 7.4 (working on upgrading to 8.0 soon).  Thanks!
> >
> > --
> > Anthony
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 4: Have you searched our list archives?
> >
> >                http://archives.postgresql.org
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster