Re: Query planner plans very inefficient plans - Mailing list pgsql-performance

From Sean Chittenden
Subject Re: Query planner plans very inefficient plans
Date
Msg-id 20030630211336.GF96753@perrin.int.nxad.com
Whole thread Raw
In response to Query planner plans very inefficient plans  ("Robert Wille" <a2om6sy02@sneakemail.com>)
List pgsql-performance
>    I have somewhere around 3M rows in the image table, and 37K rows in the
>    ancestry table. The following is representative of some of the common
>    queries I issue:
>
>    select * from image natural join ancestry where ancestorid=1000000 and
>    (state & 7::bigint) = 0::bigint;
>
>    When I ask postgres to EXPLAIN it, I get the following:
>
>    Merge Join  (cost=81858.22..81900.60 rows=124 width=49)
>      ->  Sort  (cost=81693.15..81693.15 rows=16288 width=41)
>            ->  Seq Scan on image  (cost=0.00..80279.17 rows=16288 width=41)
>      ->  Sort  (cost=165.06..165.06 rows=45 width=8)
>            ->  Index Scan using ancestry_ancestorid_key on ancestry
>    (cost=0.00..163.83 rows=45 width=8)
>
>    It appears to me that the query executes as follows:
>
>    1. Scan every row in the image table to find those where (state &
>    7::bigint) = 0::bigint
>    2. Sort the results
>    3. Use an index on ancestry to find rows where ancestorid=1000000
>    4. Sort the results
>    5. Join the two

FWIW, I use INTs as bit vectors for options in various applications
and have run into this in a few cases.  In the database, I only care
about a few bits in the options INT, so what I did was create a
function for each of the bits that I care about and then a function
index.  Between the two, I've managed to solve my performance
problems.

CREATE FUNCTION app_option_foo_is_set(INT)
    RETURNS BOOL
    IMMUTABLE
    AS '
BEGIN
    IF $1 & 7::INT THEN
        RETURN TRUE;
    ELSE
        RETURN FALSE;
    END IF;
END;
' LANGUAGE 'plpgsql';
CREATE INDEX app_option_foo_fidx ON app_option_tbl (app_option_foo_is_set(options));
VACUUM ANALYZE;

Just make sure that you set your function to be IMMUTABLE. -sc


PS It'd be slick if PostgreSQL would collapse adjacent booleans into a
   bit in a byte: it'd save some apps a chunk of space.  32 options ==
   32 bytes with the type BOOL, but if adjacent BOOLs were collapsed,
   it'd only be 4 bytes on disk and maybe some page header data.

--
Sean Chittenden

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Query planner plans very inefficient plans
Next
From: Toby Sargeant
Date:
Subject: excessive disk access during query