Re: merge semi join cost calculation error - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: merge semi join cost calculation error
Date
Msg-id CAFj8pRCCnAijokK6Qx8p+2RwtiXMqd8-yasykvUbXNMC47rkew@mail.gmail.com
Whole thread Raw
In response to Re: merge semi join cost calculation error  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers


po 8. 10. 2018 v 17:00 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> The user sent a plan:

> QUERY PLAN
> Merge Semi Join  (cost=82.97..580.24 rows=580 width=8) (actual
> time=0.503..9557.396 rows=721 loops=1)
>   Merge Cond: (tips.users_id = follows.users_id_to)
>   ->  Index Scan using tips_idx_users_id01 on tips  (cost=0.43..8378397.19
> rows=2491358 width=16) (actual time=0.009..9231.585 rows=2353914 loops=1)
>   ->  Sort  (cost=1.77..1.82 rows=22 width=8) (actual time=0.052..0.089
> rows=28 loops=1)
>         Sort Key: follows.users_id_to
>         Sort Method: quicksort  Memory: 26kB
>         ->  Seq Scan on follows  (cost=0.00..1.27 rows=22 width=8) (actual
> time=0.013..0.020 rows=28 loops=1)
>               Filter: (users_id_from = 1)

> He has PostgreSQL 10.5. I cannot to understand to too low total cost of Merge
> Semi Join because subnode has very high cost 8378397.

The planner seems to be supposing that the merge will stop far short of
scanning the entire LHS table, presumably as a result of thinking that
the maximum value of follows.users_id_to is much less than the maximum
value of tips.users_id.  Given the actual rowcounts, that's seemingly
not true, which suggests out-of-date stats for one table or the other.

good tip - the table follows was too small for autovacuum, and it was terrible effect. I didn't know about this optimization.

Thank you

Pavel

                        regards, tom lane

pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: executor relation handling
Next
From: Amit Langote
Date:
Subject: Re: executor relation handling