Thread: 7.4.7: strange planner decision

7.4.7: strange planner decision

From

Roman Neuhauser

Date:

13 July 2005, 08:12:30

Why does the planner want to crawl the table that has 5M rows instead of the one
with 176k rows? Both tables are freshly vacuum-full-analyzed.

7.4.7 on i686-redhat-linux-gnu, compiled by GCC gcc (GCC) 3.4.3 20041212 (Red Hat 3.4.3-9.EL4)

callrec32=# explain select fd.base from fix.dups fd join fix.files ff using (base);
                                 QUERY PLAN
----------------------------------------------------------------------------
 Hash Join  (cost=5340.00..292675.06 rows=176161 width=44)
   Hash Cond: (("outer".base)::text = ("inner".base)::text)
   ->  Seq Scan on files ff  (cost=0.00..117301.58 rows=5278458 width=41)
   ->  Hash  (cost=3436.60..3436.60 rows=176160 width=44)
         ->  Seq Scan on dups fd  (cost=0.00..3436.60 rows=176160 width=44)
(5 rows)

callrec32=# \d fix.files
              Table "fix.files"
 Column |          Type          | Modifiers
--------+------------------------+-----------
 dir    | character varying(255) |
 base   | character varying(255) |
Indexes:
    "base_storename_idx" btree (base, ((((dir)::text || '/'::text) || (base)::text)))
    "ff_storename_idx" btree (((((dir)::text || '/'::text) || (base)::text)))

callrec32=# \d fix.dups
              Table "fix.dups"
 Column |          Type          | Modifiers
--------+------------------------+-----------
 base   | character varying(255) |
Indexes:
    "dups_base_key" unique, btree (base)

callrec32=# select count(*) from fix.files; select count(*) from fix.dups;
  count
---------
 5278458
(1 row)

 count
--------
 176160
(1 row)


--
How many Vietnam vets does it take to screw in a light bulb?
You don't know, man.  You don't KNOW.
Cause you weren't THERE.             http://bash.org/?255991

Re: 7.4.7: strange planner decision

From

Richard Huxton

Date:

13 July 2005, 09:00:44

Roman Neuhauser wrote:
> Why does the planner want to crawl the table that has 5M rows instead of the one
> with 176k rows? Both tables are freshly vacuum-full-analyzed.

Because you don't have an index on "base" for the files table.

> callrec32=# \d fix.files
>               Table "fix.files"
>  Column |          Type          | Modifiers
> --------+------------------------+-----------
>  dir    | character varying(255) |
>  base   | character varying(255) |
> Indexes:
>     "base_storename_idx" btree (base, ((((dir)::text || '/'::text) || (base)::text)))
>     "ff_storename_idx" btree (((((dir)::text || '/'::text) || (base)::text)))

A couple of indexes, but none simple on "base", so it can't be used for
the join.

--
   Richard Huxton
   Archonet Ltd

Array as parameter for plpgsql function

From

David Pratt

Date:

13 July 2005, 09:00:49

How does one pass an array as a parameter to a plpgsql function?  I
have tried this the following. I can't seem to get a select statement
to work without syntax problems and no examples in Postgres book to
help with this :(  This is just a test so please ignore the fact it is
a simple function.

CREATE FUNCTION create_record_test(text[][]) RETURNS int4 AS '

    DECLARE

    test_array ALIAS FOR $1;      -- alias for input array

    BEGIN

        return array_upper(test_array,1)

    END;
' LANGUAGE 'plpgsql';

SELECT create_record_test(ARRAY[ARRAY['A','test one'],ARRAY['B','test
two']]) AS output;

Re: 7.4.7: strange planner decision

From

Roman Neuhauser

Date:

13 July 2005, 09:48:37

# dev@archonet.com / 2005-07-13 12:57:31 +0100:
> Roman Neuhauser wrote:
> >Why does the planner want to crawl the table that has 5M rows instead of
> >the one
> >with 176k rows? Both tables are freshly vacuum-full-analyzed.
>
> Because you don't have an index on "base" for the files table.

    I added one, ran vacuum full analyze fix.files, and:

    callrec32=# \d fix.files
                  Table "fix.files"
     Column |          Type          | Modifiers
    --------+------------------------+-----------
     dir    | character varying(255) |
     base   | character varying(255) |
    Indexes:
        "base_storename_idx" btree (base, ((((dir)::text || '/'::text) || (base)::text)))
        "ff_baseonly_idx" btree (base)
        "ff_storename_idx" btree (((((dir)::text || '/'::text) || (base)::text)))

    callrec32=# explain select fd.base from fix.dups fd join fix.files ff using (base);
                                     QUERY PLAN
    ----------------------------------------------------------------------------
     Hash Join  (cost=5340.00..292675.06 rows=176161 width=44)
       Hash Cond: (("outer".base)::text = ("inner".base)::text)
       ->  Seq Scan on files ff  (cost=0.00..117301.58 rows=5278458 width=41)
       ->  Hash  (cost=3436.60..3436.60 rows=176160 width=44)
             ->  Seq Scan on dups fd  (cost=0.00..3436.60 rows=176160 width=44)
    (5 rows)

    Which is exactly what I expected. Using left prefix of a multicolumn
    index normally works just fine, thank you.

    http://www.postgresql.org/docs/7.4/static/indexes-multicolumn.html:

    The query planner can use a multicolumn index for queries that involve
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    the leftmost column in the index definition plus any number of columns
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    listed to the right of it, without a gap. For example, an index on (a,
    b, c) can be used in queries involving all of a, b, and c, or in queries
    involving both a and b, or in queries involving only a

--
How many Vietnam vets does it take to screw in a light bulb?
You don't know, man.  You don't KNOW.
Cause you weren't THERE.             http://bash.org/?255991

Re: 7.4.7: strange planner decision

From

Richard Huxton

Date:

13 July 2005, 10:19:00

Roman Neuhauser wrote:
>>Because you don't have an index on "base" for the files table.
>
>
>     I added one, ran vacuum full analyze fix.files, and:
>
>     callrec32=# \d fix.files
>                   Table "fix.files"
>      Column |          Type          | Modifiers
>     --------+------------------------+-----------
>      dir    | character varying(255) |
>      base   | character varying(255) |
>     Indexes:
>         "base_storename_idx" btree (base, ((((dir)::text || '/'::text) || (base)::text)))
>         "ff_baseonly_idx" btree (base)
>         "ff_storename_idx" btree (((((dir)::text || '/'::text) || (base)::text)))
>
>     callrec32=# explain select fd.base from fix.dups fd join fix.files ff using (base);
>                                      QUERY PLAN
>     ----------------------------------------------------------------------------
>      Hash Join  (cost=5340.00..292675.06 rows=176161 width=44)
>        Hash Cond: (("outer".base)::text = ("inner".base)::text)
>        ->  Seq Scan on files ff  (cost=0.00..117301.58 rows=5278458 width=41)
>        ->  Hash  (cost=3436.60..3436.60 rows=176160 width=44)
>              ->  Seq Scan on dups fd  (cost=0.00..3436.60 rows=176160 width=44)
>     (5 rows)
>
>     Which is exactly what I expected. Using left prefix of a multicolumn
>     index normally works just fine, thank you.

Couldn't figure out what you meant here - had to go back and re-read
your index definitions. Sorry - missed the (base, ...) on the front of
base_storename_idx.

What happens to the plan if you SET enable_seqscan=false; first? It's
presumably getting the row-estimate right, so unless there's terrible
correlation on "base" in the files table I can only assume it's getting
the cost estimates horribly wrong.

--
   Richard Huxton
   Archonet Ltd

Re: 7.4.7: strange planner decision

From

Roman Neuhauser

Date:

13 July 2005, 10:58:18

# dev@archonet.com / 2005-07-13 14:09:34 +0100:
> Roman Neuhauser wrote:
> >    callrec32=# \d fix.files
> >                  Table "fix.files"
> >     Column |          Type          | Modifiers
> >    --------+------------------------+-----------
> >     dir    | character varying(255) |
> >     base   | character varying(255) |
> >    Indexes:
> >        "base_storename_idx" btree (base, ((((dir)::text || '/'::text) ||
> >        (base)::text)))
> >        "ff_baseonly_idx" btree (base)
> >        "ff_storename_idx" btree (((((dir)::text || '/'::text) ||
> >        (base)::text)))
> >
> >    callrec32=# explain select fd.base from fix.dups fd join fix.files ff
> >    using (base);
> >                                     QUERY PLAN
> >    ----------------------------------------------------------------------------
> >     Hash Join  (cost=5340.00..292675.06 rows=176161 width=44)
> >       Hash Cond: (("outer".base)::text = ("inner".base)::text)
> >       ->  Seq Scan on files ff  (cost=0.00..117301.58 rows=5278458
> >       width=41)
> >       ->  Hash  (cost=3436.60..3436.60 rows=176160 width=44)
> >             ->  Seq Scan on dups fd  (cost=0.00..3436.60 rows=176160
> >             width=44)
> >    (5 rows)

> What happens to the plan if you SET enable_seqscan=false; first? It's
> presumably getting the row-estimate right, so unless there's terrible
> correlation on "base" in the files table I can only assume it's getting
> the cost estimates horribly wrong.

callrec32=# SET enable_seqscan=false;
SET
callrec32=# explain select fd.base from fix.dups fd join fix.files ff using (base);
                                         QUERY PLAN
--------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..1066990.93 rows=176161 width=44)
   ->  Index Scan using dups_base_key on dups fd  (cost=0.00..5570.86 rows=176160 width=44)
   ->  Index Scan using ff_baseonly_idx on files ff  (cost=0.00..6.01 rows=1 width=41)
         Index Cond: (("outer".base)::text = (ff.base)::text)
(4 rows)

--
How many Vietnam vets does it take to screw in a light bulb?
You don't know, man.  You don't KNOW.
Cause you weren't THERE.             http://bash.org/?255991

Re: Array as parameter for plpgsql function

From

Tom Lane

Date:

13 July 2005, 11:08:09

David Pratt <fairwinds@eastlink.ca> writes:
> CREATE FUNCTION create_record_test(text[][]) RETURNS int4 AS '
>     DECLARE
>     test_array ALIAS FOR $1;      -- alias for input array
>     BEGIN
>         return array_upper(test_array,1)
>     END;
> ' LANGUAGE 'plpgsql';

> SELECT create_record_test(ARRAY[ARRAY['A','test one'],ARRAY['B','test
> two']]) AS output;

Works fine for me, once I add the semicolon you forgot:

        return array_upper(test_array,1);

            regards, tom lane

Re: 7.4.7: strange planner decision

From

Roman Neuhauser

Date:

13 July 2005, 11:14:28

# neuhauser@sigpipe.cz / 2005-07-13 15:58:09 +0200:
> # dev@archonet.com / 2005-07-13 14:09:34 +0100:
> > Roman Neuhauser wrote:
> > >    callrec32=# \d fix.files
> > >                  Table "fix.files"
> > >     Column |          Type          | Modifiers
> > >    --------+------------------------+-----------
> > >     dir    | character varying(255) |
> > >     base   | character varying(255) |
> > >    Indexes:
> > >        "base_storename_idx" btree (base, ((((dir)::text || '/'::text) ||
> > >        (base)::text)))
> > >        "ff_baseonly_idx" btree (base)
> > >        "ff_storename_idx" btree (((((dir)::text || '/'::text) ||
> > >        (base)::text)))
> > >
> > >    callrec32=# explain select fd.base from fix.dups fd join fix.files ff
> > >    using (base);
> > >                                     QUERY PLAN
> > >    ----------------------------------------------------------------------------
> > >     Hash Join  (cost=5340.00..292675.06 rows=176161 width=44)
> > >       Hash Cond: (("outer".base)::text = ("inner".base)::text)
> > >       ->  Seq Scan on files ff  (cost=0.00..117301.58 rows=5278458
> > >       width=41)
> > >       ->  Hash  (cost=3436.60..3436.60 rows=176160 width=44)
> > >             ->  Seq Scan on dups fd  (cost=0.00..3436.60 rows=176160
> > >             width=44)
> > >    (5 rows)
>
> > What happens to the plan if you SET enable_seqscan=false; first? It's
> > presumably getting the row-estimate right, so unless there's terrible
> > correlation on "base" in the files table I can only assume it's getting
> > the cost estimates horribly wrong.
>
> callrec32=# SET enable_seqscan=false;
> SET
> callrec32=# explain select fd.base from fix.dups fd join fix.files ff using (base);
>                                          QUERY PLAN
> --------------------------------------------------------------------------------------------
>  Nested Loop  (cost=0.00..1066990.93 rows=176161 width=44)
>    ->  Index Scan using dups_base_key on dups fd  (cost=0.00..5570.86 rows=176160 width=44)
>    ->  Index Scan using ff_baseonly_idx on files ff  (cost=0.00..6.01 rows=1 width=41)
>          Index Cond: (("outer".base)::text = (ff.base)::text)
> (4 rows)

    BTW, this query or its equivalent is what I'm really after:

    callrec32=# explain select c1.storename from fix.dups fd join calls
    c1 on (fd.base = basename(c1.storename) and c1.iscouple = '1') where
    not exists (select 1 from fix.files ff where c1.storename = ff.dir
    || '/' || ff.base);
                                                   QUERY PLAN
    --------------------------------------------------------------------------------------------------------
     Hash Join  (cost=7474.26..23127970.91 rows=2354719 width=60)
       Hash Cond: ((basename("outer".storename))::text = ("inner".base)::text)
       ->  Index Scan using calls2_iscouple_idx on calls c1  (cost=0.00..22982439.69 rows=2354719 width=60)
             Filter: ((iscouple = 1::smallint) AND (NOT (subplan)))
             SubPlan
               ->  Index Scan using ff_storename_idx on files ff  (cost=0.00..88570.16 rows=26393 width=0)
                     Index Cond: (($0)::text = (((dir)::text || '/'::text) || (base)::text))
       ->  Hash  (cost=5570.86..5570.86 rows=176160 width=44)
             ->  Index Scan using dups_base_key on dups fd  (cost=0.00..5570.86 rows=176160 width=44)
    (9 rows)

    calls has (among others):

    "calls2_basename_storename_idx" btree (basename(storename), storename) WHERE (iscouple = (1)::smallint)
    "calls2_storename_idx" btree (storename) WHERE (iscouple = (1)::smallint)
    "calls2_iscouple_idx" btree (id) WHERE (iscouple = (1)::smallint)

    WHy does it use the calls2_iscouple_idx index when calls.id isn't
    used anywhere in the query? I would guess that calls2_storename_idx
    would actually be more useful.

--
How many Vietnam vets does it take to screw in a light bulb?
You don't know, man.  You don't KNOW.
Cause you weren't THERE.             http://bash.org/?255991

Re: 7.4.7: strange planner decision

From

Tom Lane

Date:

13 July 2005, 11:15:29

Richard Huxton <dev@archonet.com> writes:
> What happens to the plan if you SET enable_seqscan=false; first? It's
> presumably getting the row-estimate right, so unless there's terrible
> correlation on "base" in the files table I can only assume it's getting
> the cost estimates horribly wrong.

I think you'll find that the results suck ;-).  It looks to me that the
planner is making exactly the right choice here.  The only plausible
alternative is a nestloop with inner indexscan on "files", which would
imply 176160 separate index probes into "files", which is unlikely to
win compared to one seqscan.  (In the aggregate the index probes are
likely to end up touching every page of "files" anyway --- you would
need a much larger files table before this stopped being true.)

If you want to compare the nestloop plan to test this theory, turning
off enable_hashjoin and (if necessary) enable_mergejoin would be the
better way to get it.  But let's see EXPLAIN ANALYZE results for both
cases, not just EXPLAIN.

            regards, tom lane

Re: 7.4.7: strange planner decision

From

Richard Huxton

Date:

13 July 2005, 11:21:30

Roman Neuhauser wrote:
> # dev@archonet.com / 2005-07-13 14:09:34 +0100:
>
>>Roman Neuhauser wrote:
>>
>>>   callrec32=# \d fix.files
>>>                 Table "fix.files"
>>>    Column |          Type          | Modifiers
>>>   --------+------------------------+-----------
>>>    dir    | character varying(255) |
>>>    base   | character varying(255) |
>>>   Indexes:
>>>       "base_storename_idx" btree (base, ((((dir)::text || '/'::text) ||
>>>       (base)::text)))
>>>       "ff_baseonly_idx" btree (base)
>>>       "ff_storename_idx" btree (((((dir)::text || '/'::text) ||
>>>       (base)::text)))
>>>
>>>   callrec32=# explain select fd.base from fix.dups fd join fix.files ff
>>>   using (base);
>>>                                    QUERY PLAN
>>>   ----------------------------------------------------------------------------
>>>    Hash Join  (cost=5340.00..292675.06 rows=176161 width=44)
>>>      Hash Cond: (("outer".base)::text = ("inner".base)::text)
>>>      ->  Seq Scan on files ff  (cost=0.00..117301.58 rows=5278458
>>>      width=41)
>>>      ->  Hash  (cost=3436.60..3436.60 rows=176160 width=44)
>>>            ->  Seq Scan on dups fd  (cost=0.00..3436.60 rows=176160
>>>            width=44)
>>>   (5 rows)
>
>
>>What happens to the plan if you SET enable_seqscan=false; first? It's
>>presumably getting the row-estimate right, so unless there's terrible
>>correlation on "base" in the files table I can only assume it's getting
>>the cost estimates horribly wrong.
>
>
> callrec32=# SET enable_seqscan=false;
> SET
> callrec32=# explain select fd.base from fix.dups fd join fix.files ff using (base);
>                                          QUERY PLAN
> --------------------------------------------------------------------------------------------
>  Nested Loop  (cost=0.00..1066990.93 rows=176161 width=44)
>    ->  Index Scan using dups_base_key on dups fd  (cost=0.00..5570.86 rows=176160 width=44)
>    ->  Index Scan using ff_baseonly_idx on files ff  (cost=0.00..6.01 rows=1 width=41)
>          Index Cond: (("outer".base)::text = (ff.base)::text)

OK - so it thinks the cost of this plan will be about 1 million, whereas
the old plan was 290 thousand. The question is - why?
What are your planner settings? Ch 16.4.4.2 here
http://www.postgresql.org/docs/7.4/static/runtime-config.html#RUNTIME-CONFIG-QUERY

I'm guessing something to do with cpu_index_tuple_cost or random_page_cost.
--
   Richard Huxton
   Archonet Ltd

Re: Array as parameter for plpgsql function

From

David Pratt

Date:

13 July 2005, 12:09:27

argh!!!  It was telling me I had an error in select statement. Thanks
Tom!

Regards
David

On Wednesday, July 13, 2005, at 11:08 AM, Tom Lane wrote:

> David Pratt <fairwinds@eastlink.ca> writes:
>> CREATE FUNCTION create_record_test(text[][]) RETURNS int4 AS '
>>     DECLARE
>>     test_array ALIAS FOR $1;      -- alias for input array
>>     BEGIN
>>         return array_upper(test_array,1)
>>     END;
>> ' LANGUAGE 'plpgsql';
>
>> SELECT create_record_test(ARRAY[ARRAY['A','test one'],ARRAY['B','test
>> two']]) AS output;
>
> Works fine for me, once I add the semicolon you forgot:
>
>         return array_upper(test_array,1);
>
>             regards, tom lane
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org so that
> your
>        message can get through to the mailing list cleanly
>