Re: Inefficient queryplan for query with intersectable - Mailing list pgsql-performance

From Arjen van der Meijden
Subject Re: Inefficient queryplan for query with intersectable
Date
Msg-id 4310456F.8040303@tweakers.net
In response to Re: Inefficient queryplan for query with intersectable  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Inefficient queryplan for query with intersectable  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance

On 27-8-2005 0:56, Tom Lane wrote:
> Arjen van der Meijden <acmmailing@tweakers.net> writes:
>
>>As said, it chooses sequential scans or "the wrong index plans" over a
>>perfectly good plan that is just not selected when the parameters are
>>"too well tuned" or sequential scanning of the table is allowed.
>
>
> I think some part of the problem comes from using inconsistent
> datatypes.  For instance, it seems very odd that the thing is not
> using a hash or something to handle
>
>  t_0.Cat2 IN (SELECT 545 UNION SELECT ID FROM cat WHERE ParentID = 545)
>
> seeing that it correctly guesses there are only going to be about 8 rows
> in the union.  Part of the reason is that cat2 is smallint, whereas the
> output of the union must be at least int, maybe wider depending on the
> datatype of cat.id (which you did not show us); so the comparison isn't
> hashable.  Even a smallint vs int comparison would be mergejoinable,
> though, so I'm really wondering what cat.id is.

cat.id is a smallint. I replaced that subquery with these two:

t_0.Cat2 IN (SELECT '545'::smallint UNION SELECT ID FROM cat WHERE ParentID = '545'::smallint)

t_0.Cat2 IN (SELECT '545' UNION SELECT ID FROM cat WHERE ParentID = '545')

But apparently there is a bug in the EXPLAIN mechanism of the 8.1devel
build I'm using (a nightly I downloaded somewhere in the morning (CEST)
of 25 August), since it returned:
ERROR:  bogus varno: 9

So I can't see whether the plan changed; the execution times didn't
change much. I also replaced the subselect with its literal outcome
(i.e. IN ('545', '546', ...)), but that didn't make much difference in
the execution time either. The plan did change, of course: it used a
BitmapOr of 8 Bitmap Index Scans over pwprodukten.

By the way, as far as I know, this is the only datatype mismatch in the
query.

> Another big part of the problem comes from poor result size estimation.
> I'm not sure you can eliminate that entirely given the multiple
> conditions on different columns (which'd require cross-column statistics
> to really do well, which we do not have).  But you could avoid
> constructs like
>
>     WHERE ... t_1.recordtimestamp >=
>       (SELECT max_date - 60 FROM last_dates WHERE table_name = 'pricetracker')
>
> The planner is basically going to throw up its hands and make a default
> guess on the selectivity of this; it's not smart enough to decide that
> the sub-select probably represents a constant.  What I'd do with this
> is to define a function marked STABLE for the sub-select result, perhaps
> something like
[...]
> need.)  Then write the query like
>
>     WHERE ... t_1.recordtimestamp >= get_last_date('pricetracker', 60)
>
> In this formulation the planner will be able to make a reasonable guess
> about how many rows will match ... at least if your statistics are up
> to date ...

I tried such a function and also tried replacing the sub-select with its
fixed outcome. Although that yields a considerably more accurate
estimate of the rows returned, it doesn't seem to change the basic plan
much. It does make the sub-query itself use another index (the one on
recordtimestamp alone, rather than the combined index on leverancierid
and recordtimestamp).
With that changed subquery it estimates about 4173 rows against 4405
actual rows.
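
For reference, the function was along these lines (a sketch
reconstructing Tom's suggestion; I'm assuming max_date is a date column
here, so the exact definition may have differed slightly):

CREATE OR REPLACE FUNCTION get_last_date(text, integer) RETURNS date AS $$
  -- Look up the stored max_date for the given table name and subtract
  -- the given number of days. Marked STABLE so the planner can treat
  -- the result as a constant when estimating row counts.
  SELECT max_date - $2 FROM last_dates WHERE table_name = $1;
$$ LANGUAGE sql STABLE;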

Actually, with either the adjusted or the original query it favors the
hash join over a nested loop, but the rest of the plan (for the
subqueries) seems to be exactly the same.

Here is the first part of the explain analyze when it can use any trick
it wants:

  Hash Join  (cost=7367.43..186630.19 rows=132426 width=12) (actual time=191.726..11072.025 rows=58065 loops=1)
    Hash Cond: ("outer".produktid = "inner".id)
    ->  Seq Scan on pwprijs chart_2  (cost=0.00..137491.07 rows=7692207 width=16) (actual time=0.018..6267.744 rows=7692207 loops=1)
    ->  Hash  (cost=7366.02..7366.02 rows=565 width=4) (actual time=123.265..123.265 rows=103 loops=1)
          ->  SetOp Intersect  (cost=7332.10..7360.37 rows=565 width=4) (actual time=115.760..123.192 rows=103 loops=1)
[snip]

And here is the first (and last) part when I disable hash joins or seq
scans:

  Nested Loop  (cost=7334.92..517159.39 rows=132426 width=12) (actual time=111.905..512.575 rows=58065 loops=1)
    ->  SetOp Intersect  (cost=7332.10..7360.37 rows=565 width=4) (actual time=111.588..120.035 rows=103 loops=1)
[snip]
    ->  Bitmap Heap Scan on pwprijs chart_2  (cost=2.82..895.85 rows=234 width=16) (actual time=0.344..2.149 rows=564 loops=103)
          Recheck Cond: (chart_2.produktid = "outer".id)
          ->  Bitmap Index Scan on pwprijs_produktid_idx  (cost=0.00..2.82 rows=234 width=0) (actual time=0.189..0.189 rows=564 loops=103)
                Index Cond: (chart_2.produktid = "outer".id)
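
(For completeness: the second plan was produced by turning off the
relevant planner toggles before re-running the EXPLAIN ANALYZE, i.e.:

SET enable_hashjoin = off;  -- planner will avoid hash joins
SET enable_seqscan = off;   -- planner will avoid sequential scans

and resetting them afterwards.)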

Is a nested loop normally so much (3x) more costly than a hash join? Or
is it just this query that gets estimated wrongly?

Best regards,

Arjen
