Thread: Query planner produces very inefficient plans

Query planner produces very inefficient plans

From
"Robert Wille"
Date:
I have a number of very common queries for which the optimizer chooses a very inefficient plan. I vacuum hourly. I'm wondering what I can do to make these queries faster.
 
Here are the relevant tables:
 
create table image(
    imageid integer not null,     /* The image's ID */
    containerid integer not null,     /* The container that owns it */
    name varchar(120) not null,     /* Its name */
    state bigint not null default 0,    /* Its state */
    primary key (imageid),
    unique (containerid, name)     /* All images in a container must be uniquely named */
);
 
create table ancestry(
    containerid integer not null,     /* The container that has an ancestor */
    ancestorid integer not null,     /* The ancestor of the container */
    unique (containerid, ancestorid),
    unique (ancestorid, containerid)
);
 
I have somewhere around 3M rows in the image table, and 37K rows in the ancestry table. The following is representative of some of the common queries I issue:
 
select * from image natural join ancestry where ancestorid=1000000 and (state & 7::bigint) = 0::bigint;
 
When I ask postgres to EXPLAIN it, I get the following:
 
Merge Join  (cost=81858.22..81900.60 rows=124 width=49)
  ->  Sort  (cost=81693.15..81693.15 rows=16288 width=41)
        ->  Seq Scan on image  (cost=0.00..80279.17 rows=16288 width=41)
  ->  Sort  (cost=165.06..165.06 rows=45 width=8)
        ->  Index Scan using ancestry_ancestorid_key on ancestry  (cost=0.00..163.83 rows=45 width=8)
 
It appears to me that the query executes as follows:
 
1. Scan every row in the image table to find those where (state & 7::bigint) = 0::bigint
2. Sort the results
3. Use an index on ancestry to find rows where ancestorid=1000000
4. Sort the results
5. Join the two
 
It seems to me that if this query is going to return a small percentage of the rows (which is the common case), it could be executed much faster by performing the join first (all columns involved in the join are indexed) and then applying the (state & 7::bigint) = 0::bigint constraint to the result.
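 
One quick experiment to test that hunch (a diagnostic sketch, not part of the original message) is to discourage sequential scans for the current session and re-run EXPLAIN; comparing the two cost estimates shows how the planner prices the index-driven join:
 
SET enable_seqscan = off;
EXPLAIN select * from image natural join ancestry
    where ancestorid=1000000 and (state & 7::bigint) = 0::bigint;
SET enable_seqscan = on;    -- restore the default afterwards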
 
Similarly, when I update, I get the following:
 
explain update image set state=0 from ancestry where ancestorid=1000000 and ancestry.containerid=image.containerid and (state & 7::bigint) = 0::bigint;
 
NOTICE:  QUERY PLAN:
 
Merge Join  (cost=81841.92..81884.30 rows=124 width=43)
  ->  Sort  (cost=81676.74..81676.74 rows=16288 width=39)
        ->  Seq Scan on image  (cost=0.00..80279.17 rows=16288 width=39)
  ->  Sort  (cost=165.19..165.19 rows=45 width=4)
        ->  Index Scan using ancestry_ancestorid_key on ancestry  (cost=0.00..163.95 rows=45 width=4)
 
How can I avoid the sequential scan of the entire image table (i.e. how can I get it to perform the join first)?
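 
One possible rewrite (a sketch, not part of the original message) is to filter ancestry in a subquery first, so that its small result set can drive index lookups into image via the index on (containerid, name):
 
select i.*
    from (select containerid from ancestry where ancestorid = 1000000) a
    join image i using (containerid)
    where (i.state & 7::bigint) = 0::bigint;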

Thanks in advance.
 
Robert Wille
 

Re: Query planner produces very inefficient plans

From
Tom Lane
Date:
"Robert Wille" <a2om6sy02@sneakemail.com> writes:
> select * from image natural join ancestry where ancestorid=1000000 and
> (state & 7::bigint) = 0::bigint;

The planner is not going to have any statistics that allow it to predict
the number of rows satisfying that &-condition, and so it's unsurprising
if its off-the-cuff guess has little to do with reality.

I'd recommend skipping any cute tricks with bit-packing, and storing the
state (and any other value you query frequently) in its own column, so
that the query looks like

select * from image natural join ancestry where ancestorid=1000000 and
state = 0;

ANALYZE should be able to do a reasonable job with a column that has 8
or fewer distinct values ...
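 
A minimal sketch of that restructuring (the state_flags column and index
names here are illustrative, not from the thread):
 
-- Hypothetical: expose the three queried bits as an ordinary column so
-- ANALYZE can gather per-value statistics on it.
ALTER TABLE image ADD COLUMN state_flags integer NOT NULL DEFAULT 0;
UPDATE image SET state_flags = (state & 7::bigint)::integer;
CREATE INDEX image_state_flags_idx ON image (state_flags);
ANALYZE image;
 
-- The earlier query then becomes:
select * from image natural join ancestry
    where ancestorid=1000000 and state_flags = 0;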

            regards, tom lane

Re: Query planner produces very inefficient plans

From
Sean Chittenden
Date:
>    I have somewhere around 3M rows in the image table, and 37K rows in the
>    ancestry table. The following is representative of some of the common
>    queries I issue:
>
>    select * from image natural join ancestry where ancestorid=1000000 and
>    (state & 7::bigint) = 0::bigint;
>
>    When I ask postgres to EXPLAIN it, I get the following:
>
>    Merge Join  (cost=81858.22..81900.60 rows=124 width=49)
>      ->  Sort  (cost=81693.15..81693.15 rows=16288 width=41)
>            ->  Seq Scan on image  (cost=0.00..80279.17 rows=16288 width=41)
>      ->  Sort  (cost=165.06..165.06 rows=45 width=8)
>            ->  Index Scan using ancestry_ancestorid_key on ancestry
>    (cost=0.00..163.83 rows=45 width=8)
>
>    It appears to me that the query executes as follows:
>
>    1. Scan every row in the image table to find those where (state &
>    7::bigint) = 0::bigint
>    2. Sort the results
>    3. Use an index on ancestry to find rows where ancestorid=1000000
>    4. Sort the results
>    5. Join the two

FWIW, I use INTs as bit vectors for options in various applications
and have run into this in a few cases.  In the database, I only care
about a few bits in the options INT, so what I did was create a
function for each of the bits that I care about, plus a functional
index on each.  Between the two, I've managed to solve my performance
problems.

CREATE FUNCTION app_option_foo_is_set(INT)
    RETURNS BOOL
    IMMUTABLE
    AS '
BEGIN
    -- IF needs a boolean expression, so compare the masked bits to zero
    IF ($1 & 7::INT) <> 0 THEN
        RETURN TRUE;
    ELSE
        RETURN FALSE;
    END IF;
END;
' LANGUAGE plpgsql;

CREATE INDEX app_option_foo_fidx ON app_option_tbl (app_option_foo_is_set(options));
VACUUM ANALYZE;

Just make sure that you set your function to be IMMUTABLE. -sc
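 
Note that for the functional index to be considered, the query has to
invoke the same expression that was indexed (table and column names as
in Sean's example):
 
-- The WHERE clause repeats the indexed expression verbatim:
SELECT * FROM app_option_tbl WHERE app_option_foo_is_set(options);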


PS It'd be slick if PostgreSQL would collapse adjacent booleans into a
   bit in a byte: it'd save some apps a chunk of space.  32 options ==
   32 bytes with the type BOOL, but if adjacent BOOLs were collapsed,
   it'd only be 4 bytes on disk and maybe some page header data.

--
Sean Chittenden