Re: make add_paths_to_append_rel aware of startup cost - Mailing list pgsql-hackers
From: David Rowley
Subject: Re: make add_paths_to_append_rel aware of startup cost
Msg-id: CAApHDvogv5MBw6JV82Yb6gAa0mp64HQtX--TSK_wZOn9x2fP4Q@mail.gmail.com
In response to: Re: make add_paths_to_append_rel aware of startup cost (Andy Fan <zhihuifan1213@163.com>)
List: pgsql-hackers
On Thu, 15 Feb 2024 at 21:42, Andy Fan <zhihuifan1213@163.com> wrote:
> I found the both plans have the same cost, I can't get the accurate
> cause of this after some hours research, but it is pretty similar with
> 7516056c584e3, so I uses a similar strategy to stable it. is it
> acceptable?

It's pretty hard to say. I can only guess why this test would be flapping like this. I see it's happened before on mylodon, so probably not a cosmic ray.

It's not like add_path() chooses a random path when the costs are the same, so I wondered if something similar to what led to f03a9ca4 is going on here. In particular, see [1].

On master, I set a breakpoint in try_nestloop_path() to break on "outerrel->relid==1 && innerrel->relid==2". I see the total Nested Loop cost comes out the same with the join order reversed. Which is:

  ->  Nested Loop  (cost=0.00..1500915.00 rows=10000 width=4)

Doing the same with your patch applied, I get:

  ->  Nested Loop  (cost=0.00..600925.00 rows=4000 width=4)

and forcing the join order to swap with the debugger, I see:

  ->  Nested Loop  (cost=0.00..600940.00 rows=4000 width=4)

So there's a difference now, but it's quite small. If this is a problem like the one we had in [1], then since tenk1 and tenk2 have 345 pages (on my machine), if relpages is down 1 or 2 pages, we'll likely get more of a costing difference than 600925 vs 600940.

If I use:

explain select t1.unique1 from tenk1 t1
  inner join tenk2 t2 on t1.tenthous = t2.tenthous and t2.thousand = 0
  union all (values(1)) limit 1;

I get:

  ->  Nested Loop  (cost=0.00..2415.03 rows=10 width=4)

and with the join order reversed, I get:

  ->  Nested Loop  (cost=0.00..2440.00 rows=10 width=4)

I'd be happier using this one since, percentage-wise, the cost difference is much larger. I don't quite have the will to go through proving what the actual problem is here. I think [1] already proved that the relpages problem can (or could) happen.
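(A quick back-of-the-envelope check, not part of the original mail: the point about the adjusted query being more stable can be made concrete by comparing the relative cost gap between the two join orders, using the cost totals quoted above. The `rel_gap` helper is made up for illustration.)

```python
# Relative cost gap between the two join orders, using the plan totals above.
def rel_gap(cheaper, pricier):
    """Fractional cost difference between two plan totals."""
    return (pricier - cheaper) / cheaper

# Original query, with the patch applied: 600925 vs 600940.
small = rel_gap(600925.0, 600940.0)

# Query with the extra "t2.thousand = 0" qual: 2415.03 vs 2440.00.
large = rel_gap(2415.03, 2440.00)

print(f"original query:  {small:.5%}")
print(f"adjusted query:  {large:.5%}")
```

A perturbation of a page or two in relpages could easily swamp a ~0.0025% gap, but is far less likely to overcome a ~1% one, which is presumably why the larger gap makes the test outcome stable.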
I checked that the t2.thousand = 0 query still tests the cheap startup paths in add_paths_to_append_rel(), and it does. If I flip startup_subpaths_valid to false in the debugger, the plan flips to:

                                     QUERY PLAN
-----------------------------------------------------------------------------------
 Limit  (cost=470.12..514.00 rows=1 width=4)
   ->  Append  (cost=470.12..952.79 rows=11 width=4)
         ->  Hash Join  (cost=470.12..952.73 rows=10 width=4)
               Hash Cond: (t1.tenthous = t2.tenthous)
               ->  Seq Scan on tenk1 t1  (cost=0.00..445.00 rows=10000 width=8)
               ->  Hash  (cost=470.00..470.00 rows=10 width=4)
                     ->  Seq Scan on tenk2 t2  (cost=0.00..470.00 rows=10 width=4)
                           Filter: (thousand = 0)
         ->  Result  (cost=0.00..0.01 rows=1 width=4)

So, if nobody has any better ideas, I'm just going to push the " and t2.thousand = 0" adjustment.

David

[1] https://www.postgresql.org/message-id/4174.1563239552%40sss.pgh.pa.us
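(A hypothetical worked example, not part of the original mail: the Limit node's 514.00 total in the plan above can be reconstructed by assuming the planner charges a LIMIT the child's startup cost plus the fetched fraction of its run cost. The variable names below are invented for illustration.)

```python
# Rough reconstruction of the Limit node's total cost from the Append's
# costs in the plan above, assuming a LIMIT pays the child's startup cost
# plus the fetched fraction of its run cost.
append_startup, append_total, append_rows = 470.12, 952.79, 11
limit_rows = 1

run_cost = append_total - append_startup
limit_total = append_startup + (limit_rows / append_rows) * run_cost

print(f"Limit total cost ~= {limit_total:.2f}")
```

That comes out at roughly 514.00, matching the plan, which suggests the high 470.12 startup cost of the Hash Join is exactly what the cheap-startup Append paths are there to beat under LIMIT 1.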