I have a case that I though was an example of this issue,
and that this patch would correct. I applied this patch
to an 8.0.3 source distribution, but it didn't seem to
solve my problem.
In a nutshell, I have a LIMIT query where the planner
seems to favor a merge join over a nested loop. I've
simplified the query as much as possible:
itvtrackdata3=> \d tableA
Table "public.tableA"
Column | Type | Modifiers
--------+----------+-----------
foo | bigint | not null
bar | smallint | not null
bap | bigint | not null
bip | bigint | not null
bom | bigint | not null
Indexes:
"idx_tableA_bip" btree (bip) WHERE (bip =
9000000000000000000::bigint)
"idx_tableA_foo" btree (foo)
itvtrackdata3=> \d tableB
Table "tableB"
Column | Type | Modifiers
---------+----------+-----------
bim | bigint | not null
bif | smallint | not null
baf | smallint | not null
bof | smallint | not null
buf | smallint | not null
foo | bigint | not null
Indexes:
"idx_tableB_bim" btree ("bim", foo)
itvtrackdata3=> set default_statistics_target to 1000;
SET
Time: 0.448 ms
itvtrackdata3=> analyze tableA;
ANALYZE
Time: 4237.151 ms
itvtrackdata3=> analyze tableB;
ANALYZE
Time: 46672.939 ms
itvtrackdata3=> explain analyze SELECT * FROM tableB NATURAL JOIN tableA
WHERE bim>=72555896091359 AND bim<72555935412959 AND bim=bap ORDER BY
bim ASC LIMIT 1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=149626.57..252987.71 rows=1 width=50) (actual
time=5684.013..5684.013 rows=1 loops=1)
-> Merge Join (cost=149626.57..252987.71 rows=1 width=50) (actual
time=5684.012..5684.012 rows=1 loops=1)
Merge Cond: (("outer"."bim" = "inner"."bap") AND ("outer".foo =
"inner".foo))
-> Index Scan using idx_tableB_bim on tableB
(cost=0.00..97391.22 rows=55672 width=24) (actual time=0.017..0.059
rows=29 loops=1)
Index Cond: (("bim" >= 72555896091359::bigint) AND ("bim"
< 72555935412959::bigint))
-> Sort (cost=149626.57..151523.94 rows=758948 width=34)
(actual time=5099.300..5442.825 rows=560856 loops=1)
Sort Key: tableA."bap", tableA.foo
-> Seq Scan on tableA (cost=0.00..47351.48 rows=758948
width=34) (actual time=0.021..1645.204 rows=758948 loops=1)
Total runtime: 5706.655 ms
(9 rows)
Time: 5729.984 ms
itvtrackdata3=> set enable_mergejoin to false;
SET
Time: 0.373 ms
itvtrackdata3=> explain analyze SELECT * FROM tableB NATURAL JOIN tableA
WHERE bim>=72555896091359 AND bim<72555935412959 AND bim=bap ORDER BY
bim ASC LIMIT 1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..432619.68 rows=1 width=50) (actual
time=11.149..11.150 rows=1 loops=1)
-> Nested Loop (cost=0.00..432619.68 rows=1 width=50) (actual
time=11.148..11.148 rows=1 loops=1)
Join Filter: ("outer"."bim" = "inner"."bap")
-> Index Scan using idx_tableB_bim on tableB
(cost=0.00..97391.22 rows=55672 width=24) (actual time=0.017..0.062
rows=29 loops=1)
Index Cond: (("bim" >= 72555896091359::bigint) AND ("bim"
< 72555935412959::bigint))
-> Index Scan using idx_tableA_foo on tableA (cost=0.00..6.01
rows=1 width=34) (actual time=0.007..0.379 rows=1 loops=29)
Index Cond: ("outer".foo = tableA.foo)
Total runtime: 11.215 ms
(8 rows)
Time: 32.007 ms
Have I just flubbed the patch, or is there something else
going on here?
Thanks,
--Ian
On Fri, 2005-07-22 at 12:20, Tom Lane wrote:
> I wrote:
> > Dawid Kuroczko <qnex42@gmail.com> writes:
> >> qnex=# EXPLAIN SELECT * FROM log NATURAL JOIN useragents LIMIT 1;
>
> >> Limit (cost=15912.20..15912.31 rows=1 width=272)
> >> -> Hash Join (cost=15912.20..5328368.96 rows=47044336 width=272)
>
> >> If I set enable_hashjoin=false:
>
> >> qnex=# EXPLAIN ANALYZE SELECT * FROM log NATURAL LEFT JOIN useragents LIMIT 1;
>
> >> Limit (cost=0.00..3.07 rows=1 width=272) (actual time=74.214..74.216
> >> rows=1 loops=1)
> >> -> Nested Loop Left Join (cost=0.00..144295895.01 rows=47044336
> >> width=272) (actual time=74.204..74.204 rows=1 loops=1)
>
> > This is quite strange. The nestloop plan definitely should be preferred
> > in the context of the LIMIT, considering that it has far lower estimated
> > cost. And it is preferred in simple tests for me.
>
> After a suitable period of contemplating my navel, I figured out
> what is going on here: the total costs involved are large enough that
> the still-fairly-high startup cost of the hash is disregarded by
> compare_fuzzy_path_costs(), and so the nestloop is discarded as not
> having any significant potential advantage in startup time.
>
> I think that this refutes the original scheme of using the same fuzz
> factor for both startup and total cost comparisons, and therefore
> propose the attached patch.
>
> Comments?
>
> regards, tom lane
>
> *** src/backend/optimizer/util/pathnode.c.orig Fri Jul 15 13:09:25 2005
> --- src/backend/optimizer/util/pathnode.c Fri Jul 22 12:08:25 2005
> ***************
> *** 98,157 ****
> static int
> compare_fuzzy_path_costs(Path *path1, Path *path2, CostSelector criterion)
> {
> - Cost fuzz;
> -
> /*
> ! * The fuzz factor is set at one percent of the smaller total_cost,
> ! * but not less than 0.01 cost units (just in case total cost is
> ! * zero).
> *
> * XXX does this percentage need to be user-configurable?
> */
> - fuzz = Min(path1->total_cost, path2->total_cost) * 0.01;
> - fuzz = Max(fuzz, 0.01);
> -
> if (criterion == STARTUP_COST)
> {
> ! if (Abs(path1->startup_cost - path2->startup_cost) > fuzz)
> ! {
> ! if (path1->startup_cost < path2->startup_cost)
> ! return -1;
> ! else
> ! return +1;
> ! }
>
> /*
> * If paths have the same startup cost (not at all unlikely),
> * order them by total cost.
> */
> ! if (Abs(path1->total_cost - path2->total_cost) > fuzz)
> ! {
> ! if (path1->total_cost < path2->total_cost)
> ! return -1;
> ! else
> ! return +1;
> ! }
> }
> else
> {
> ! if (Abs(path1->total_cost - path2->total_cost) > fuzz)
> ! {
> ! if (path1->total_cost < path2->total_cost)
> ! return -1;
> ! else
> ! return +1;
> ! }
>
> /*
> * If paths have the same total cost, order them by startup cost.
> */
> ! if (Abs(path1->startup_cost - path2->startup_cost) > fuzz)
> ! {
> ! if (path1->startup_cost < path2->startup_cost)
> ! return -1;
> ! else
> ! return +1;
> ! }
> }
> return 0;
> }
> --- 98,138 ----
> static int
> compare_fuzzy_path_costs(Path *path1, Path *path2, CostSelector criterion)
> {
> /*
> ! * We use a fuzz factor of 1% of the smaller cost.
> *
> * XXX does this percentage need to be user-configurable?
> */
> if (criterion == STARTUP_COST)
> {
> ! if (path1->startup_cost > path2->startup_cost * 1.01)
> ! return +1;
> ! if (path2->startup_cost > path1->startup_cost * 1.01)
> ! return -1;
>
> /*
> * If paths have the same startup cost (not at all unlikely),
> * order them by total cost.
> */
> ! if (path1->total_cost > path2->total_cost * 1.01)
> ! return +1;
> ! if (path2->total_cost > path1->total_cost * 1.01)
> ! return -1;
> }
> else
> {
> ! if (path1->total_cost > path2->total_cost * 1.01)
> ! return +1;
> ! if (path2->total_cost > path1->total_cost * 1.01)
> ! return -1;
>
> /*
> * If paths have the same total cost, order them by startup cost.
> */
> ! if (path1->startup_cost > path2->startup_cost * 1.01)
> ! return +1;
> ! if (path2->startup_cost > path1->startup_cost * 1.01)
> ! return -1;
> }
> return 0;
> }
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faq