Re: On disable_cost - Mailing list pgsql-hackers
From | David Rowley |
---|---|
Subject | Re: On disable_cost |
Date | |
Msg-id | CAApHDvpsTVnG8ZBv-szXdJa9imSJegsKOqawm+QQR6TGVpg4bQ@mail.gmail.com Whole thread Raw |
In response to | Re: On disable_cost (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
On Sun, 5 May 2024 at 04:57, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > David Rowley <dgrowleyml@gmail.com> writes: > > That doesn't get you the benefits of fewer CPU cycles, but where did > > that come from as a motive to change this? There's no shortage of > > other ways to make the planner faster if that's an issue. > > The concern was to not *add* CPU cycles in order to make this area > better. But I do tend to agree that we've exhausted all the other > options. It really looks to me that Robert was talking about not generating paths for disabled path types. He did write "just to save CPU cycles" in the paragraph I quoted. I think we should concern ourselves with adding overhead to add_path() *only* when we actually see a patch which slows it down in a way that we can measure. I find it hard to imagine that adding a single comparison for every Path is measurable. Each of these paths has been palloced and costed, both of which are significantly more expensive than adding another comparison to compare_path_costs_fuzzily(). I'm only willing for benchmarks on an actual patch to prove me wrong on that. Nothing else. add_path() has become a rat's nest of conditions over the years and those seem to have made it without concerns about performance. > BTW, I looked through costsize.c just now to see exactly what we are > using disable_cost for, and it seemed like a majority of the cases are > just wrong. Where possible, we should implement a plan-type-disable > flag by not generating the associated Path in the first place, not by > applying disable_cost to it. But it looks like a lot of people have > erroneously copied the wrong logic. I would say that only these plan > types should use the disable_cost method: > > seqscan > nestloop join > sort I think this oversimplifies the situation. I only spent 30 seconds looking and I saw cases where this would cause issues. If enable_hashagg is false, we could fail to produce some plans where the type is sortable but not hashable. There's also an issue with nested loops being unable to FULL OUTER JOIN. However, I do agree that there are some in there that are adding disable_cost that should be done by just not creating the Path. enable_gathermerge is one. enable_bitmapscan is probably another. I understand you only talked about the cases adding disable_cost in costsize.c. But just as a reminder, there are other things we need to be careful not to break. For example, enable_indexonlyscan=false should defer to still making an index scan. Nobody who disables enable_indexonlyscan without disabling enable_indexscan wants queries that are eligible to use IOS to use seq scan instead. They'd still want Index Scan to be considered, otherwise they'd have disabled enable_indexscan. David
pgsql-hackers by date: