Thread: On disable_cost
Hi,
Postgres has a global variable `disable_cost`. It is set the value 1.0e10.
This value will be added to the cost of path if related GUC is set off. For example,
if enable_nestloop is set off, when planner trys to add nestloop join path, it continues
to add such path but with a huge cost `disable_cost`.
But 1.0e10 may not be large enough. I encounter this issue in Greenplum(based on postgres).
Heikki tolds me that someone also encountered the same issue on Postgres. So I send it here to
have a discussion.
My issue: I did some spikes and tests on TPCDS 1TB Bytes data. For query 104, it generates
nestloop join even with enable_nestloop set off. And the final plan's total cost is very huge (about 1e24). But If I enlarge the disable_cost to 1e30, then, planner will generate hash join.
So I guess that disable_cost is not large enough for huge amount of data.
It is tricky to set disable_cost a huge number. Can we come up with better solution?
The following thoughts are from Heikki:
Aside from not having a large enough disable cost, there's also the fact that the high cost might affect the rest of the plan, if we have to use a plan type that's disabled. For example, if a table doesn't have any indexes, but enable_seqscan is off, we might put the unavoidable Seq Scan on different side of a join than we we would with enable_seqscan=on, because of the high cost estimate.
I think a more robust way to disable forbidden plan types would be to handle the disabling in add_path(). Instead of having a high disable cost on the Path itself, the comparison add_path() would always consider disabled paths as more expensive than others, regardless of the cost.
Any thoughts or ideas on the problem? Thanks!
Best Regards,
Zhenghua Lyu
On Fri, Nov 1, 2019 at 7:42 PM Zhenghua Lyu <zlv@pivotal.io> wrote: > It is tricky to set disable_cost a huge number. Can we come up with better solution? What happens if you use DBL_MAX?
Em sex, 1 de nov de 2019 às 03:42, Zhenghua Lyu <zlv@pivotal.io> escreveu: > > My issue: I did some spikes and tests on TPCDS 1TB Bytes data. For query 104, it generates > nestloop join even with enable_nestloop set off. And the final plan's total cost is very huge (about 1e24). But If I enlargethe disable_cost to 1e30, then, planner will generate hash join. > > So I guess that disable_cost is not large enough for huge amount of data. > > It is tricky to set disable_cost a huge number. Can we come up with better solution? > Isn't it a case for a GUC disable_cost? As Thomas suggested, DBL_MAX upper limit should be sufficient. > The following thoughts are from Heikki: >> >> Aside from not having a large enough disable cost, there's also the fact that the high cost might affect the restof the plan, if we have to use a plan type that's disabled. For example, if a table doesn't have any indexes, but enable_seqscanis off, we might put the unavoidable Seq Scan on different side of a join than we we would with enable_seqscan=on,because of the high cost estimate. > > >> >> I think a more robust way to disable forbidden plan types would be to handle the disabling in add_path(). Instead of havinga high disable cost on the Path itself, the comparison add_path() would always consider disabled paths as more expensivethan others, regardless of the cost. > I'm afraid it is not as cheap as using diable_cost as a node cost. Are you proposing to add a new boolean variable in Path struct to handle those cases in compare_path_costs_fuzzily? -- Euler Taveira Timbira - http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
Hi, On 2019-11-01 19:58:04 +1300, Thomas Munro wrote: > On Fri, Nov 1, 2019 at 7:42 PM Zhenghua Lyu <zlv@pivotal.io> wrote: > > It is tricky to set disable_cost a huge number. Can we come up with better solution? > > What happens if you use DBL_MAX? That seems like a bad idea - we add the cost multiple times. And we still want to compare plans that potentially involve that cost, if there's no other way to plan the query. - Andres
On Fri, Nov 1, 2019 at 12:00 PM Andres Freund <andres@anarazel.de> wrote: > That seems like a bad idea - we add the cost multiple times. And we > still want to compare plans that potentially involve that cost, if > there's no other way to plan the query. Yeah. I kind of wonder if we shouldn't instead (a) skip adding paths that use methods which are disabled and then (b) if we don't end up with any paths for that reloptinfo, try again, ignoring disabling GUCs. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
re: coping with adding disable_cost more than once Another option would be to have a 2-part Cost structure. If disable_cost is ever added to the Cost, then you set a flag recording this. If any plans exist that have no disable_costs added to them, then the planner chooses the minimum cost among those, otherwise you choose the minimum cost path. ----- Jim Finnerty, AWS, Amazon Aurora PostgreSQL -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Hi, On 2019-11-01 12:22:06 -0400, Robert Haas wrote: > On Fri, Nov 1, 2019 at 12:00 PM Andres Freund <andres@anarazel.de> wrote: > > That seems like a bad idea - we add the cost multiple times. And we > > still want to compare plans that potentially involve that cost, if > > there's no other way to plan the query. > > Yeah. I kind of wonder if we shouldn't instead (a) skip adding paths > that use methods which are disabled and then (b) if we don't end up > with any paths for that reloptinfo, try again, ignoring disabling > GUCs. Hm. That seems complicated. Is it clear that we'd always notice that we have no plan early enough to know which paths to reconsider? I think there's cases where that'd only happen a few levels up. As a first step I'd be inclined to "just" adjust disable_cost up to something like 1.0e12. Unfortunately much higher and and we're getting into the area where the loss of precision starts to be significant enough that I'm not sure that we're always careful enough to perform math in the right order (e.g. 1.0e16 + 1 being 1.0e16, and 1e+20 + 1000 being 1e+20). I've seen queries with costs above 1e10 where that costing wasn't insane. And then, in a larger patch, go for something like Heikki's proposal quoted by Zhenghua Lyu upthread, where we treat 'forbidden' as a separate factor in comparisons of path costs, rather than fudging the cost upwards. But there's some care to be taken to make sure we don't regress performance too much due to the additional logic in compare_path_cost et al. I'd also be curious to see if there's some other problem with cost calculation here - some of the quoted final costs seem high enough to be suspicious. I'd be curious to see a plan... Greetings, Andres Freund
On Fri, Nov 1, 2019 at 12:43 PM Andres Freund <andres@anarazel.de> wrote: > Hm. That seems complicated. Is it clear that we'd always notice that we > have no plan early enough to know which paths to reconsider? I think > there's cases where that'd only happen a few levels up. Yeah, there could be problems of that kind. I think if a baserel has no paths, then we know right away that we've got a problem, but for joinrels it might be more complicated. > As a first step I'd be inclined to "just" adjust disable_cost up to > something like 1.0e12. Unfortunately much higher and and we're getting > into the area where the loss of precision starts to be significant > enough that I'm not sure that we're always careful enough to perform > math in the right order (e.g. 1.0e16 + 1 being 1.0e16, and 1e+20 + 1000 > being 1e+20). I've seen queries with costs above 1e10 where that costing > wasn't insane. We've done that before and we can do it again. But we're going to need to have something better eventually, I think, not just keep kicking the can down the road. Another point to consider here is that in some cases we could really just skip generating certain paths altogether. We already do this for hash joins: if we're planning a join and enable_hashjoin is disabled, we just don't generate hash joins paths at all, except for full joins, where there might be no other legal method. As this example shows, this cannot be applied in all cases, but maybe we could do it more widely than we do today. I'm not sure how beneficial that technique would be, though, because it doesn't seem like it's quite enough to solve this problem by itself. Yet another approach would be to divide the cost into two parts, a "cost" component and a "violations" component. If two paths are compared, the one with fewer violations always wins; if it's a tie, they compare on cost. A path's violation count is the total of its children, plus one for itself if it does something that's disabled. This would be more principled than the current approach, but maybe it's too costly. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Nov 01, 2019 at 09:30:52AM -0700, Jim Finnerty wrote: >re: coping with adding disable_cost more than once > >Another option would be to have a 2-part Cost structure. If disable_cost is >ever added to the Cost, then you set a flag recording this. If any plans >exist that have no disable_costs added to them, then the planner chooses the >minimum cost among those, otherwise you choose the minimum cost path. > Yeah, I agree having is_disabled flag, and treat all paths with 'true' as more expensive than paths with 'false' (and when both paths have the same value then actually compare the cost) is probably the way forward. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-11-01 12:56:30 -0400, Robert Haas wrote: > On Fri, Nov 1, 2019 at 12:43 PM Andres Freund <andres@anarazel.de> wrote: > > As a first step I'd be inclined to "just" adjust disable_cost up to > > something like 1.0e12. Unfortunately much higher and and we're getting > > into the area where the loss of precision starts to be significant > > enough that I'm not sure that we're always careful enough to perform > > math in the right order (e.g. 1.0e16 + 1 being 1.0e16, and 1e+20 + 1000 > > being 1e+20). I've seen queries with costs above 1e10 where that costing > > wasn't insane. > > We've done that before and we can do it again. But we're going to need > to have something better eventually, I think, not just keep kicking > the can down the road. Yea, that's why I continued on to describe what we should do afterwards ;) > Yet another approach would be to divide the cost into two parts, a > "cost" component and a "violations" component. If two paths are > compared, the one with fewer violations always wins; if it's a tie, > they compare on cost. A path's violation count is the total of its > children, plus one for itself if it does something that's disabled. > This would be more principled than the current approach, but maybe > it's too costly. Namely go for something like this. I think we probably get away with the additional comparison, especially if we were to store the violations as an integer and did it like if (unlikely(path1->nviolations != path2->nviolations)) or such - that ought to be very well predicted in nearly all cases. I wonder how much we'd need to reformulate compare_path_costs/compare_path_costs_fuzzily to allow the compiler to auto-vectorize. Might not be worth caring... Greetings, Andres Freund
Zhenghua Lyu <zlv@pivotal.io> writes: >> I think a more robust way to disable forbidden plan types would be to >> handle the disabling in add_path(). Instead of having a high disable cost >> on the Path itself, the comparison add_path() would always consider >> disabled paths as more expensive than others, regardless of the cost. Getting rid of disable_cost would be a nice thing to do, but I would rather not do it by adding still more complexity to add_path(), not to mention having to bloat Paths with a separate "disabled" marker. The idea that I've been thinking about is to not generate disabled Paths in the first place, thus not only fixing the problem but saving some cycles. While this seems easy enough for "optional" paths, we have to reserve the ability to generate certain path types regardless, if there's no other way to implement the query. This is a bit of a stumbling block :-(. At the base relation level, we could do something like generating seqscan last, and only if no other path has been successfully generated. But I'm not sure how to scale that up to joins. In particular, imagine that we consider joining A to B, and find that the only way is a nestloop, so we generate a nestloop join despite that being nominally disabled. The next join level would then see that as an available path, and it might decide that ((A nestjoin B) join C) is the cheapest choice, even though there might have been a way to do, say, ((A join C) join B) with no use of nestloops. Users would find this surprising. Maybe the only way to do this is a separate number-of-uses-of- disabled-plan-types cost figure in Paths, but I still don't want to go there. The number of cases where disable_cost's shortcomings really matter is too small to justify that, IMHO. regards, tom lane
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: > On Fri, Nov 01, 2019 at 09:30:52AM -0700, Jim Finnerty wrote: >> re: coping with adding disable_cost more than once >> >> Another option would be to have a 2-part Cost structure. If disable_cost is >> ever added to the Cost, then you set a flag recording this. If any plans >> exist that have no disable_costs added to them, then the planner chooses the >> minimum cost among those, otherwise you choose the minimum cost path. > Yeah, I agree having is_disabled flag, and treat all paths with 'true' > as more expensive than paths with 'false' (and when both paths have the > same value then actually compare the cost) is probably the way forward. It would have to be a count, not a boolean --- for example, you want to prefer a path that uses one disabled SeqScan over a path that uses two. I'm with Andres in being pretty worried about the extra burden imposed on add_path comparisons. regards, tom lane
As a proof of concept, I hacked around a bit today to re-purpose one of the bits of the Cost structure to mean "is_disabled" so that we can distinguish 'disabled' from 'non-disabled' paths without making the Cost structure any bigger. In fact, it's still a valid double. The obvious choice would have been to re-purpose the sign bit, but I've had occasion to exploit negative costs before so for this POC I used the high-order bit of the fractional bits of the double. (see Wikipedia for double precision floating point for the layout). The idea is to set a special bit when disable_cost is added to a cost. Dedicating multiple bits instead of just 1 would be easily done, but as it is we can accumulate many disable_costs without overflowing, so just comparing the cost suffices. The patch is not fully debugged and fails on a couple of tests in the serial test suite. It seems to fail on Cartesian products, and maybe in one other non-CP case. I wasn't able to debug it before the day came to an end. In one place the core code subtracts off the disable_cost. I left the "disabled" bit set in this case, which might be wrong. I don't see an option to attach the patch as an attachment, so here is the patch inline (it is based on PG11). The more interesting part is in a small number of lines in costsize.c. Other changes just add functions that assign a disable_cost and set the bit, or that compare costs such that a non-disabled cost always compares less than a disabled cost. ------------------ diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c index 4e86458672..3718639330 100644 --- a/src/backend/optimizer/path/costsize.c +++ b/src/backend/optimizer/path/costsize.c @@ -123,6 +123,8 @@ double parallel_setup_cost = DEFAULT_PARALLEL_SETUP_COST; int effective_cache_size = DEFAULT_EFFECTIVE_CACHE_SIZE; Cost disable_cost = 1.0e10; +uint64 disabled_mask = 0x8000000000000; +#define IS_DISABLED(cost) (((uint64) cost) & disabled_mask) int max_parallel_workers_per_gather = 2; @@ -205,6 +207,53 @@ clamp_row_est(double nrows) return nrows; } +Cost +add_cost(Cost cost, Cost delta_cost) +{ + uint64 mask = (delta_cost == disable_cost) ? disabled_mask : 0; + Cost max_cost = disabled_mask - disable_cost; + + if (cost + delta_cost < max_cost) + return ((Cost) ((uint64)(cost + delta_cost) | mask)); + else + return ((Cost) ((uint64)(max_cost) | mask)); +} + +bool +is_lower_cost(Cost cost1, Cost cost2) +{ + if ((uint64)cost1 & disabled_mask && !((uint64)cost2 & disabled_mask)) + return false; + + if (!((uint64)cost1 & disabled_mask) && (uint64)cost2 & disabled_mask) + return true; + + return (cost1 < cost2); +} + +bool +is_greater_cost(Cost cost1, Cost cost2) +{ + if ((uint64)cost1 & disabled_mask && !((uint64)cost2 & disabled_mask)) + return true; + + if (!((uint64)cost1 & disabled_mask) && (uint64)cost2 & disabled_mask) + return false; + + return (cost1 > cost2); +} + +bool +is_geq_cost(Cost cost1, Cost cost2) +{ + if ((uint64)cost1 & disabled_mask && !((uint64)cost2 & disabled_mask)) + return true; + + if (!((uint64)cost1 & disabled_mask) && (uint64)cost2 & disabled_mask) + return false; + + return (cost1 >= cost2); +} /* * cost_seqscan @@ -235,7 +284,7 @@ cost_seqscan(Path *path, PlannerInfo *root, path->rows = baserel->rows; if (!enable_seqscan) - startup_cost += disable_cost; + startup_cost = add_cost(startup_cost, disable_cost); /* fetch estimated page cost for tablespace containing table */ get_tablespace_page_costs(baserel->reltablespace, @@ -424,7 +473,7 @@ cost_gather_merge(GatherMergePath *path, PlannerInfo *root, path->path.rows = rel->rows; if (!enable_gathermerge) - startup_cost += disable_cost; + startup_cost = add_cost(startup_cost, disable_cost); /* * Add one to the number of workers to account for the leader. This might @@ -538,7 +587,7 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count, } if (!enable_indexscan) - startup_cost += disable_cost; + startup_cost = add_cost(startup_cost, disable_cost); /* we don't need to check enable_indexonlyscan; indxpath.c does that */ /* @@ -976,7 +1025,7 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel, path->rows = baserel->rows; if (!enable_bitmapscan) - startup_cost += disable_cost; + startup_cost = add_cost(startup_cost, disable_cost); pages_fetched = compute_bitmap_pages(root, baserel, bitmapqual, loop_count, &indexTotalCost, @@ -1242,10 +1291,10 @@ cost_tidscan(Path *path, PlannerInfo *root, if (isCurrentOf) { Assert(baserel->baserestrictcost.startup >= disable_cost); - startup_cost -= disable_cost; + startup_cost -= disable_cost; /* but do not un-set the disabled mark */ } else if (!enable_tidscan) - startup_cost += disable_cost; + startup_cost = add_cost(startup_cost, disable_cost); /* * The TID qual expressions will be computed once, any other baserestrict @@ -1676,7 +1725,7 @@ cost_sort(Path *path, PlannerInfo *root, long sort_mem_bytes = sort_mem * 1024L; if (!enable_sort) - startup_cost += disable_cost; + startup_cost = add_cost(startup_cost, disable_cost); path->rows = tuples; @@ -2121,8 +2170,8 @@ cost_agg(Path *path, PlannerInfo *root, total_cost = input_total_cost; if (aggstrategy == AGG_MIXED && !enable_hashagg) { - startup_cost += disable_cost; - total_cost += disable_cost; + startup_cost = add_cost(startup_cost, disable_cost); + total_cost = add_cost(total_cost, disable_cost); } /* calcs phrased this way to match HASHED case, see note above */ total_cost += aggcosts->transCost.startup; @@ -2137,7 +2186,7 @@ cost_agg(Path *path, PlannerInfo *root, /* must be AGG_HASHED */ startup_cost = input_total_cost; if (!enable_hashagg) - startup_cost += disable_cost; + startup_cost = add_cost(startup_cost, disable_cost); startup_cost += aggcosts->transCost.startup; startup_cost += aggcosts->transCost.per_tuple * input_tuples; startup_cost += (cpu_operator_cost * numGroupCols) * input_tuples; @@ -2436,7 +2485,7 @@ final_cost_nestloop(PlannerInfo *root, NestPath *path, * disabled, which doesn't seem like the way to bet. */ if (!enable_nestloop) - startup_cost += disable_cost; + startup_cost = add_cost(startup_cost, disable_cost); /* cost of inner-relation source data (we already dealt with outer rel) */ @@ -2882,7 +2931,7 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path, * disabled, which doesn't seem like the way to bet. */ if (!enable_mergejoin) - startup_cost += disable_cost; + startup_cost = add_cost(startup_cost, disable_cost); /* * Compute cost of the mergequals and qpquals (other restriction clauses) @@ -3312,7 +3361,7 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path, * disabled, which doesn't seem like the way to bet. */ if (!enable_hashjoin) - startup_cost += disable_cost; + startup_cost = add_cost(startup_cost, disable_cost); /* mark the path with estimated # of batches */ path->num_batches = numbatches; @@ -3410,7 +3459,7 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path, if (relation_byte_size(clamp_row_est(inner_path_rows * innermcvfreq), inner_path->pathtarget->width) > (work_mem * 1024L)) - startup_cost += disable_cost; + startup_cost = add_cost(startup_cost, disable_cost); /* * Compute cost of the hashquals and qpquals (other restriction clauses) @@ -3930,7 +3979,7 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context) else if (IsA(node, CurrentOfExpr)) { /* Report high cost to prevent selection of anything but TID scan */ - context->total.startup += disable_cost; + context->total.startup = add_cost(context->total.startup, disable_cost); } else if (IsA(node, SubLink)) { diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c index 4736d84a83..fd746a06bc 100644 --- a/src/backend/optimizer/util/pathnode.c +++ b/src/backend/optimizer/util/pathnode.c @@ -72,33 +72,33 @@ compare_path_costs(Path *path1, Path *path2, CostSelector criterion) { if (criterion == STARTUP_COST) { - if (path1->startup_cost < path2->startup_cost) + if (is_lower_cost(path1->startup_cost, path2->startup_cost)) return -1; - if (path1->startup_cost > path2->startup_cost) + if (is_greater_cost(path1->startup_cost, path2->startup_cost)) return +1; /* * If paths have the same startup cost (not at all unlikely), order * them by total cost. */ - if (path1->total_cost < path2->total_cost) + if (is_lower_cost(path1->total_cost, path2->total_cost)) return -1; - if (path1->total_cost > path2->total_cost) + if (is_greater_cost(path1->total_cost, path2->total_cost)) return +1; } else { - if (path1->total_cost < path2->total_cost) + if (is_lower_cost(path1->total_cost, path2->total_cost)) return -1; - if (path1->total_cost > path2->total_cost) + if (is_greater_cost(path1->total_cost, path2->total_cost)) return +1; /* * If paths have the same total cost, order them by startup cost. */ - if (path1->startup_cost < path2->startup_cost) + if (is_lower_cost(path1->startup_cost, path2->startup_cost)) return -1; - if (path1->startup_cost > path2->startup_cost) + if (is_greater_cost(path1->startup_cost, path2->startup_cost)) return +1; } return 0; @@ -126,9 +126,9 @@ compare_fractional_path_costs(Path *path1, Path *path2, fraction * (path1->total_cost - path1->startup_cost); cost2 = path2->startup_cost + fraction * (path2->total_cost - path2->startup_cost); - if (cost1 < cost2) + if (is_lower_cost(cost1, cost2)) return -1; - if (cost1 > cost2) + if (is_greater_cost(cost1, cost2)) return +1; return 0; } @@ -172,11 +172,11 @@ compare_path_costs_fuzzily(Path *path1, Path *path2, double fuzz_factor) * Check total cost first since it's more likely to be different; many * paths have zero startup cost. */ - if (path1->total_cost > path2->total_cost * fuzz_factor) + if (is_greater_cost(path1->total_cost, path2->total_cost * fuzz_factor)) { /* path1 fuzzily worse on total cost */ if (CONSIDER_PATH_STARTUP_COST(path1) && - path2->startup_cost > path1->startup_cost * fuzz_factor) + is_greater_cost(path2->startup_cost, path1->startup_cost * fuzz_factor)) { /* ... but path2 fuzzily worse on startup, so DIFFERENT */ return COSTS_DIFFERENT; @@ -184,11 +184,11 @@ compare_path_costs_fuzzily(Path *path1, Path *path2, double fuzz_factor) /* else path2 dominates */ return COSTS_BETTER2; } - if (path2->total_cost > path1->total_cost * fuzz_factor) + if (is_greater_cost(path2->total_cost, path1->total_cost * fuzz_factor)) { /* path2 fuzzily worse on total cost */ if (CONSIDER_PATH_STARTUP_COST(path2) && - path1->startup_cost > path2->startup_cost * fuzz_factor) + is_greater_cost(path1->startup_cost, path2->startup_cost * fuzz_factor)) { /* ... but path1 fuzzily worse on startup, so DIFFERENT */ return COSTS_DIFFERENT; @@ -197,12 +197,12 @@ compare_path_costs_fuzzily(Path *path1, Path *path2, double fuzz_factor) return COSTS_BETTER1; } /* fuzzily the same on total cost ... */ - if (path1->startup_cost > path2->startup_cost * fuzz_factor) + if (is_greater_cost(path1->startup_cost, path2->startup_cost * fuzz_factor)) { /* ... but path1 fuzzily worse on startup, so path2 wins */ return COSTS_BETTER2; } - if (path2->startup_cost > path1->startup_cost * fuzz_factor) + if (is_greater_cost(path2->startup_cost, path1->startup_cost * fuzz_factor)) { /* ... but path2 fuzzily worse on startup, so path1 wins */ return COSTS_BETTER1; @@ -605,7 +605,7 @@ add_path(RelOptInfo *parent_rel, Path *new_path) else { /* new belongs after this old path if it has cost >= old's */ - if (new_path->total_cost >= old_path->total_cost) + if (is_geq_cost(new_path->total_cost, old_path->total_cost)) insert_after = p1; /* p1_prev advances */ p1_prev = p1; @@ -681,7 +681,7 @@ add_path_precheck(RelOptInfo *parent_rel, * * Cost comparisons here should match compare_path_costs_fuzzily. */ - if (total_cost > old_path->total_cost * STD_FUZZ_FACTOR) + if (is_greater_cost(total_cost, old_path->total_cost * STD_FUZZ_FACTOR)) { /* new path can win on startup cost only if consider_startup */ if (startup_cost > old_path->startup_cost * STD_FUZZ_FACTOR || @@ -796,14 +796,14 @@ add_partial_path(RelOptInfo *parent_rel, Path *new_path) /* Unless pathkeys are incompable, keep just one of the two paths. */ if (keyscmp != PATHKEYS_DIFFERENT) { - if (new_path->total_cost > old_path->total_cost * STD_FUZZ_FACTOR) + if (is_greater_cost(new_path->total_cost, old_path->total_cost * STD_FUZZ_FACTOR)) { /* New path costs more; keep it only if pathkeys are better. */ if (keyscmp != PATHKEYS_BETTER1) accept_new = false; } - else if (old_path->total_cost > new_path->total_cost - * STD_FUZZ_FACTOR) + else if (is_greater_cost(old_path->total_cost, new_path->total_cost + * STD_FUZZ_FACTOR)) { /* Old path costs more; keep it only if pathkeys are better. */ if (keyscmp != PATHKEYS_BETTER2) @@ -819,7 +819,7 @@ add_partial_path(RelOptInfo *parent_rel, Path *new_path) /* Costs are about the same, old path has better pathkeys. */ accept_new = false; } - else if (old_path->total_cost > new_path->total_cost * 1.0000000001) + else if (is_greater_cost(old_path->total_cost, new_path->total_cost * 1.0000000001)) { /* Pathkeys are the same, and the old path costs more. */ remove_old = true; @@ -847,7 +847,7 @@ add_partial_path(RelOptInfo *parent_rel, Path *new_path) else { /* new belongs after this old path if it has cost >= old's */ - if (new_path->total_cost >= old_path->total_cost) + if (is_geq_cost(new_path->total_cost, old_path->total_cost)) insert_after = p1; /* p1_prev advances */ p1_prev = p1; @@ -913,10 +913,10 @@ add_partial_path_precheck(RelOptInfo *parent_rel, Cost total_cost, keyscmp = compare_pathkeys(pathkeys, old_path->pathkeys); if (keyscmp != PATHKEYS_DIFFERENT) { - if (total_cost > old_path->total_cost * STD_FUZZ_FACTOR && + if (is_greater_cost(total_cost, old_path->total_cost * STD_FUZZ_FACTOR) && keyscmp != PATHKEYS_BETTER1) return false; - if (old_path->total_cost > total_cost * STD_FUZZ_FACTOR && + if (is_greater_cost(old_path->total_cost, total_cost * STD_FUZZ_FACTOR) && keyscmp != PATHKEYS_BETTER2) return true; } @@ -1697,7 +1697,7 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath, if (sjinfo->semi_can_btree && sjinfo->semi_can_hash) { - if (agg_path.total_cost < sort_path.total_cost) + if (is_lower_cost(agg_path.total_cost, sort_path.total_cost)) pathnode->umethod = UNIQUE_PATH_HASH; else pathnode->umethod = UNIQUE_PATH_SORT; diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c index 78f3b99a76..c261a9d790 100644 --- a/src/backend/utils/cache/relcache.c +++ b/src/backend/utils/cache/relcache.c @@ -5076,8 +5076,8 @@ IsProjectionFunctionalIndex(Relation index) * when values differ because the expression is recalculated when * inserting a new index entry for the changed value. */ - if ((index_expr_cost.startup + index_expr_cost.per_tuple) > - HEURISTIC_MAX_HOT_RECHECK_EXPR_COST) + if (is_greater_cost((index_expr_cost.startup + index_expr_cost.per_tuple), + HEURISTIC_MAX_HOT_RECHECK_EXPR_COST)) is_projection = false; tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(RelationGetRelid(index))); diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h index 9159f2bab1..c01d08eae5 100644 --- a/src/include/optimizer/cost.h +++ b/src/include/optimizer/cost.h @@ -251,6 +251,12 @@ extern PathTarget *set_pathtarget_cost_width(PlannerInfo *root, PathTarget *targ extern double compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual, int loop_count, Cost *cost, double *tuple); +extern Cost add_cost(Cost cost, Cost delta_cost); +extern bool is_lower_cost(Cost cost1, Cost cost2); +extern bool is_greater_cost(Cost cost1, Cost cost2); +extern bool is_geq_cost(Cost cost1, Cost cost2); + + /* * prototypes for clausesel.c * routines to compute clause selectivities ----- Jim Finnerty, AWS, Amazon Aurora PostgreSQL -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
On Tue, 2019-12-10 at 15:50 -0700, Jim Finnerty wrote: > As a proof of concept, I hacked around a bit today to re-purpose one of the > bits of the Cost structure to mean "is_disabled" so that we can distinguish > 'disabled' from 'non-disabled' paths without making the Cost structure any > bigger. In fact, it's still a valid double. The obvious choice would have > been to re-purpose the sign bit, but I've had occasion to exploit negative > costs before so for this POC I used the high-order bit of the fractional > bits of the double. (see Wikipedia for double precision floating point for > the layout). > > The idea is to set a special bit when disable_cost is added to a cost. > Dedicating multiple bits instead of just 1 would be easily done, but as it > is we can accumulate many disable_costs without overflowing, so just > comparing the cost suffices. Doesn't that rely on a specific implementation of double precision (IEEE)? I thought that we don't want to limit ourselves to platforms with IEEE floats. Yours, Laurenz Albe
On Wed, 11 Dec 2019 at 01:24, Laurenz Albe <laurenz.albe@cybertec.at> wrote: > > On Tue, 2019-12-10 at 15:50 -0700, Jim Finnerty wrote: > > As a proof of concept, I hacked around a bit today to re-purpose one of the > > bits of the Cost structure to mean "is_disabled" so that we can distinguish > > 'disabled' from 'non-disabled' paths without making the Cost structure any > > bigger. In fact, it's still a valid double. The obvious choice would have > > been to re-purpose the sign bit, but I've had occasion to exploit negative > > costs before so for this POC I used the high-order bit of the fractional > > bits of the double. (see Wikipedia for double precision floating point for > > the layout). > > > > The idea is to set a special bit when disable_cost is added to a cost. > > Dedicating multiple bits instead of just 1 would be easily done, but as it > > is we can accumulate many disable_costs without overflowing, so just > > comparing the cost suffices. > > Doesn't that rely on a specific implementation of double precision (IEEE)? > I thought that we don't want to limit ourselves to platforms with IEEE floats. We could always implement it again in another format.... However, I wouldn't have expected to be bit twiddling. I would have expected to use standard functions like ldexp to do this. In fact I think if you use the high bit of the exponent you could do it entirely using ldexp and regular double comparisons (with fabs). Ie, to set the bit you set cost = ldexp(cost, __DBL_MAX_EXP__/2). And to check for the bit being set you compare ilogb(cost, __DBL_MAX_EXP__/2). Hm. that doesn't handle if the cost is already < 1 in which case I guess you would have to set it to 1 first. Or reserve the two high bits of the cost so you can represent disabled values that had negative exponents before being disabled. I wonder if it wouldn't be a lot cleaner and more flexible to just go with a plain float for Cost and use the other 32 bits for counters and bitmasks and still be ahead of the game. A double can store 2^1024 but a float 2^128 which still feels like it should be more than enough to store the kinds of costs plans have without the disabled costs. 2^128 milliseconds is still 10^28 years which is an awfully expensive query.... -- greg
On Wed, Dec 11, 2019 at 7:24 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote: > Doesn't that rely on a specific implementation of double precision (IEEE)? > I thought that we don't want to limit ourselves to platforms with IEEE floats. Just by the way, you might want to read the second last paragraph of the commit message for 02ddd499. The dream is over, we're never going to run on Vax.
Thomas Munro <thomas.munro@gmail.com> writes: > On Wed, Dec 11, 2019 at 7:24 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote: >> Doesn't that rely on a specific implementation of double precision (IEEE)? >> I thought that we don't want to limit ourselves to platforms with IEEE floats. > Just by the way, you might want to read the second last paragraph of > the commit message for 02ddd499. The dream is over, we're never going > to run on Vax. Still, the proposed hack is doubling down on IEEE dependency in a way that I quite dislike, in that (a) it doesn't just read float values but generates new ones (and assumes that the hardware/libc will react in a predictable way to them), (b) in a part of the code that has no damn business having close dependencies on float format, and (c) for a gain far smaller than what we got from the Ryu code. We have had prior discussions about whether 02ddd499 justifies adding more IEEE dependencies elsewhere. I don't think it does. IEEE 754 is not the last word that will ever be said on floating-point arithmetic, any more than x86_64 is the last CPU architecture that anyone will ever care about. We should keep our dependencies on it well circumscribed. regards, tom lane
I think this would be ready to abstract away behind a few functions that could always be replaced by something else later...
However on further thought I really think just using a 32-bit float and 32 bits of other bitmaps or counters would be a better approach.
On Sun., Dec. 15, 2019, 14:54 Tom Lane, <tgl@sss.pgh.pa.us> wrote:
Thomas Munro <thomas.munro@gmail.com> writes:
> On Wed, Dec 11, 2019 at 7:24 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>> Doesn't that rely on a specific implementation of double precision (IEEE)?
>> I thought that we don't want to limit ourselves to platforms with IEEE floats.
> Just by the way, you might want to read the second last paragraph of
> the commit message for 02ddd499. The dream is over, we're never going
> to run on Vax.
Still, the proposed hack is doubling down on IEEE dependency in a way
that I quite dislike, in that (a) it doesn't just read float values
but generates new ones (and assumes that the hardware/libc will react in
a predictable way to them), (b) in a part of the code that has no damn
business having close dependencies on float format, and (c) for a gain
far smaller than what we got from the Ryu code.
We have had prior discussions about whether 02ddd499 justifies adding
more IEEE dependencies elsewhere. I don't think it does. IEEE 754
is not the last word that will ever be said on floating-point arithmetic,
any more than x86_64 is the last CPU architecture that anyone will ever
care about. We should keep our dependencies on it well circumscribed.
regards, tom lane
Hi hackers,
I have write an initial patch to retire the
disable_cost
GUC, which labeled a flag on the Path struct instead of adding up a big cost which is hard to estimate. Though it involved in tons of plan changes in regression tests, I have tested on some simple test cases such as eagerly generate a two-stage agg plans and it worked. Could someone help to review?regards,
Jian
From: Euler Taveira <euler@timbira.com.br>
Sent: Friday, November 1, 2019 22:48
To: Zhenghua Lyu <zlyu@vmware.com>
Cc: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Subject: Re: On disable_cost
Sent: Friday, November 1, 2019 22:48
To: Zhenghua Lyu <zlyu@vmware.com>
Cc: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Subject: Re: On disable_cost
!! External Email
Em sex, 1 de nov de 2019 às 03:42, Zhenghua Lyu <zlv@pivotal.io> escreveu:
>
> My issue: I did some spikes and tests on TPCDS 1TB Bytes data. For query 104, it generates
> nestloop join even with enable_nestloop set off. And the final plan's total cost is very huge (about 1e24). But If I enlarge the disable_cost to 1e30, then, planner will generate hash join.
>
> So I guess that disable_cost is not large enough for huge amount of data.
>
> It is tricky to set disable_cost a huge number. Can we come up with better solution?
>
Isn't it a case for a GUC disable_cost? As Thomas suggested, DBL_MAX
upper limit should be sufficient.
> The following thoughts are from Heikki:
>>
>> Aside from not having a large enough disable cost, there's also the fact that the high cost might affect the rest of the plan, if we have to use a plan type that's disabled. For example, if a table doesn't have any indexes, but enable_seqscan is off, we might put the unavoidable Seq Scan on different side of a join than we we would with enable_seqscan=on, because of the high cost estimate.
>
>
>>
>> I think a more robust way to disable forbidden plan types would be to handle the disabling in add_path(). Instead of having a high disable cost on the Path itself, the comparison add_path() would always consider disabled paths as more expensive than others, regardless of the cost.
>
I'm afraid it is not as cheap as using diable_cost as a node cost. Are
you proposing to add a new boolean variable in Path struct to handle
those cases in compare_path_costs_fuzzily?
--
Euler Taveira Timbira -
https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.timbira.com.br%2F&data=05%7C01%7Cgjian%40vmware.com%7C12a30b2852dd4651667608db9401d056%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C638266507757076648%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=v54JhsW8FX4mSmjgt2yP59t7xtv1mZvC%2BBhtKrfp%2FBY%3D&reserved=0
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
!! External Email: This email originated from outside of the organization. Do not click links or open attachments unless you recognize the sender.
Em sex, 1 de nov de 2019 às 03:42, Zhenghua Lyu <zlv@pivotal.io> escreveu:
>
> My issue: I did some spikes and tests on TPCDS 1TB Bytes data. For query 104, it generates
> nestloop join even with enable_nestloop set off. And the final plan's total cost is very huge (about 1e24). But If I enlarge the disable_cost to 1e30, then, planner will generate hash join.
>
> So I guess that disable_cost is not large enough for huge amount of data.
>
> It is tricky to set disable_cost a huge number. Can we come up with better solution?
>
Isn't it a case for a GUC disable_cost? As Thomas suggested, DBL_MAX
upper limit should be sufficient.
> The following thoughts are from Heikki:
>>
>> Aside from not having a large enough disable cost, there's also the fact that the high cost might affect the rest of the plan, if we have to use a plan type that's disabled. For example, if a table doesn't have any indexes, but enable_seqscan is off, we might put the unavoidable Seq Scan on different side of a join than we we would with enable_seqscan=on, because of the high cost estimate.
>
>
>>
>> I think a more robust way to disable forbidden plan types would be to handle the disabling in add_path(). Instead of having a high disable cost on the Path itself, the comparison add_path() would always consider disabled paths as more expensive than others, regardless of the cost.
>
I'm afraid it is not as cheap as using diable_cost as a node cost. Are
you proposing to add a new boolean variable in Path struct to handle
those cases in compare_path_costs_fuzzily?
--
Euler Taveira Timbira -
https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.timbira.com.br%2F&data=05%7C01%7Cgjian%40vmware.com%7C12a30b2852dd4651667608db9401d056%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C638266507757076648%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=v54JhsW8FX4mSmjgt2yP59t7xtv1mZvC%2BBhtKrfp%2FBY%3D&reserved=0
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
!! External Email: This email originated from outside of the organization. Do not click links or open attachments unless you recognize the sender.
Attachment
On Thu, Aug 3, 2023 at 5:22 AM Jian Guo <gjian@vmware.com> wrote: > I have write an initial patch to retire the disable_cost GUC, which labeled a flag on the Path struct instead of addingup a big cost which is hard to estimate. Though it involved in tons of plan changes in regression tests, I have testedon some simple test cases such as eagerly generate a two-stage agg plans and it worked. Could someone help to review? I took a look at this patch today. I believe that overall this may well be an approach worth pursuing. However, more work is going to be needed. Here are some comments: 1. You stated that it changes lots of plans in the regression tests, but you haven't provided any sort of analysis of why those plans changed. I'm kind of surprised that there would be "tons" of plan changes. You (or someone) should look into why that's happening. 2. The change to compare_path_costs_fuzzily() seems incorrect to me. When path1->is_disabled && path2->is_disabled, costs should be compared just as they are when neither path is disabled. Instead, the patch treats any two such paths as having equal cost. That seems catastrophically bad. Maybe it accounts for some of those plan changes, although that would only be true if those plans were created while using some disabling GUC. 3. Instead of adding is_disabled at the end of the Path structure, I suggest adding it between param_info and parallel_aware. I think if you do that, the space used by the new field will use up padding bytes that are currently included in the struct, instead of making it longer. 4. A critical issue for any patch of this type is performance. This concern was raised earlier on this thread, but your path doesn't address it. There's no performance analysis or benchmarking included in your email. One idea that I have is to write the cost-comparison test like this: if (unlikely(path1->is_disabled || path2->is_disabled)) { if (!path1->is_disabled) return COSTS_BETTER1; if (!path2->is_disabled) return COSTS_BETTER2; /* if both disabled, fall through */ } I'm not sure that would be enough to prevent the patch from adding noticeably to the cost of path comparison, but maybe it would help. 5. The patch changes only compare_path_costs_fuzzily() but I wonder whether compare_path_costs() and compare_fractional_path_costs() need similar surgery. Whether they do or don't, there should likely be some comments explaining the situation. 6. In fact, the patch changes no comments at all, anywhere. I'm not sure how many comment changes a patch like this needs to make, but the answer definitely isn't "none". 7. The patch doesn't actually remove disable_cost. I guess it should. 8. When you submit a patch, it's a good idea to also add it on commitfest.postgresql.org. It doesn't look like that was done in this case. -- Robert Haas EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes: > On Thu, Aug 3, 2023 at 5:22 AM Jian Guo <gjian@vmware.com> wrote: >> I have write an initial patch to retire the disable_cost GUC, which labeled a flag on the Path struct instead of addingup a big cost which is hard to estimate. Though it involved in tons of plan changes in regression tests, I have testedon some simple test cases such as eagerly generate a two-stage agg plans and it worked. Could someone help to review? > I took a look at this patch today. I believe that overall this may > well be an approach worth pursuing. However, more work is going to be > needed. Here are some comments: > 1. You stated that it changes lots of plans in the regression tests, > but you haven't provided any sort of analysis of why those plans > changed. I'm kind of surprised that there would be "tons" of plan > changes. You (or someone) should look into why that's happening. I've not read the patch, but given this description I would expect there to be *zero* regression changes --- I don't think we have any test cases that depend on disable_cost being finite. If there's more than zero changes, that very likely indicates a bug in the patch. Even if there are places where the output legitimately changes, you need to justify each one and make sure that you didn't invalidate the intent of that test case. BTW, having written that paragraph, I wonder if we couldn't get the same end result with a nearly one-line change that consists of making disable_cost be IEEE infinity. Years ago we didn't want to rely on IEEE float semantics in this area, but nowadays I don't see why we shouldn't. regards, tom lane
On Tue, Mar 12, 2024 at 1:32 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > BTW, having written that paragraph, I wonder if we couldn't get > the same end result with a nearly one-line change that consists of > making disable_cost be IEEE infinity. Years ago we didn't want > to rely on IEEE float semantics in this area, but nowadays I don't > see why we shouldn't. I don't think so, because I think that what will happen in that case is that we'll pick a completely random plan if we can't pick a plan that avoids incurring disable_cost. Every plan that contains one disabled node anywhere in the plan tree will look like it has exactly the same cost as any other such plan. IMHO, this is actually one of the problems with disable_cost as it works today. I think the semantics that we want are: if it's possible to pick a plan where nothing is disabled, then pick the cheapest such plan; if not, pick the cheapest plan overall. But treating disable_cost doesn't really do that. It does the first part -- picking the cheapest plan where nothing is disabled -- but it doesn't do the second part, because once you add disable_cost into the cost of some particular plan node, it screws up the rest of the planning, because the cost estimates for the disabled nodes have no bearing in reality. Fast-start plans, for example, will look insanely good compared to what would be the case in normal planning (and we lean too much toward fast-start plans even normally). (I don't think we should care how MANY disabled nodes appear in a plan, particularly. This is a more arguable point. Is a plan with 1 disabled node and 10% more cost better or worse than a plan with 2 disabled nodes and 10% less cost? I'd argue that counting the number of disabled nodes isn't particularly meaningful.) -- Robert Haas EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes: > On Tue, Mar 12, 2024 at 1:32 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> BTW, having written that paragraph, I wonder if we couldn't get >> the same end result with a nearly one-line change that consists of >> making disable_cost be IEEE infinity. > I don't think so, because I think that what will happen in that case > is that we'll pick a completely random plan if we can't pick a plan > that avoids incurring disable_cost. Every plan that contains one > disabled node anywhere in the plan tree will look like it has exactly > the same cost as any other such plan. Good point. > IMHO, this is actually one of the problems with disable_cost as it > works today. Yeah. I keep thinking that the right solution is to not generate disabled paths in the first place if there are any other ways to produce the same relation. That has obvious order-of-operations problems though, and I've not been able to make it work. regards, tom lane
On Tue, Mar 12, 2024 at 3:36 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Yeah. I keep thinking that the right solution is to not generate > disabled paths in the first place if there are any other ways to > produce the same relation. That has obvious order-of-operations > problems though, and I've not been able to make it work. I've expressed the same view in the past. It would be nice not to waste planner effort on paths that we're just going to throw away, but I'm not entirely sure what you mean by "obvious order-of-operations problems." To me, it seems like what we'd need is to be able to restart the whole planner process if we run out of steam before we get done. For example, suppose we're planning a 2-way join where index and index-only scans are disabled, sorts are disabled, and nested loops and hash joins are disabled. There's no problem generating just the non-disabled scan types at the baserel level, but when we reach the join, we're going to find that the only non-disabled join type is a merge join, and we're also going to find that we have no paths that provide pre-sorted input, so we need to sort, which we're also not allowed to do. If we could give up at that point and restart planning, disabling all of the plan-choice constraints and now creating all paths for each RelOptInfo, then everything would, I believe, be just fine. We'd end up needing neither disable_cost nor the mechanism proposed by this patch. But in the absence of that, we need some way to privilege the non-disabled paths over the disabled ones -- and I'd prefer to have something more principled than disable_cost, if we can work out the details. -- Robert Haas EDB: http://www.enterprisedb.com
On Wed, 13 Mar 2024 at 08:55, Robert Haas <robertmhaas@gmail.com> wrote: > But in the absence of that, we need some way to privilege the > non-disabled paths over the disabled ones -- and I'd prefer to have > something more principled than disable_cost, if we can work out the > details. The primary place I see issues with disabled_cost is caused by STD_FUZZ_FACTOR. When you add 1.0e10 to a couple of modestly costly paths, it makes them appear fuzzily the same in cases where one could be significantly cheaper than the other. If we were to bump up the disable_cost it would make this problem worse. I think we do still need some way to pick the cheapest disabled path when there are no other options. One way would be to set fuzz_factor to 1.0 when either of the paths costs in compare_path_costs_fuzzily() is >= disable_cost. clamp_row_est() does cap row estimates at MAXIMUM_ROWCOUNT (1e100), so I think there is some value of disable_cost that could almost certainly ensure we don't reach it because the path is truly expensive rather than disabled. So maybe the fix could be to set disable_cost to something like 1.0e110 and adjust compare_path_costs_fuzzily to not apply the fuzz_factor for paths >= disable_cost. However, I wonder if that risks the costs going infinite after a couple of cartesian joins. David
David Rowley <dgrowleyml@gmail.com> writes: > So maybe the fix could be to set disable_cost to something like > 1.0e110 and adjust compare_path_costs_fuzzily to not apply the > fuzz_factor for paths >= disable_cost. However, I wonder if that > risks the costs going infinite after a couple of cartesian joins. Perhaps. It still does nothing for Robert's point that once we're forced into using a "disabled" plan type, it'd be better if the disabled-ness didn't skew subsequent planning choices. On the whole I agree that getting rid of disable_cost entirely would be the way to go, if we can replace that with a separate boolean without driving up the cost of add_path too much. regards, tom lane
On Tue, Mar 12, 2024 at 4:55 PM David Rowley <dgrowleyml@gmail.com> wrote: > The primary place I see issues with disabled_cost is caused by > STD_FUZZ_FACTOR. When you add 1.0e10 to a couple of modestly costly > paths, it makes them appear fuzzily the same in cases where one could > be significantly cheaper than the other. If we were to bump up the > disable_cost it would make this problem worse. Hmm, good point. > So maybe the fix could be to set disable_cost to something like > 1.0e110 and adjust compare_path_costs_fuzzily to not apply the > fuzz_factor for paths >= disable_cost. However, I wonder if that > risks the costs going infinite after a couple of cartesian joins. Yeah, I think the disabled flag is a better answer if we can make it work. No matter what value we pick for disable_cost, it's bound to skew the planning sometimes. -- Robert Haas EDB: http://www.enterprisedb.com
On Tue, Mar 12, 2024 at 1:32 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > 1. You stated that it changes lots of plans in the regression tests, > > but you haven't provided any sort of analysis of why those plans > > changed. I'm kind of surprised that there would be "tons" of plan > > changes. You (or someone) should look into why that's happening. > > I've not read the patch, but given this description I would expect > there to be *zero* regression changes --- I don't think we have any > test cases that depend on disable_cost being finite. If there's more > than zero changes, that very likely indicates a bug in the patch. > Even if there are places where the output legitimately changes, you > need to justify each one and make sure that you didn't invalidate the > intent of that test case. I spent some more time poking at this patch. It's missing a ton of important stuff and is wrong in a whole bunch of really serious ways, and I'm not going to try to mention all of them in this email. But I do want to talk about some of the more interesting realizations that came to me as I was working my way through this. One of the things I realized relatively early is that the patch does nothing to propagate disable_cost upward through the plan tree. That means that if you have a choice between, say, Sort-over-Append-over-SeqScan and MergeAppend-over-IndexScan, as we do in the regression tests, disabling IndexScan doesn't change the plan with the patch applied, as it does in master. That's because only the IndexScan node ends up flagged as disabled. Once we start stacking other plan nodes on top of disabled plan nodes, the resultant plans are at no disadvantage compared to plans containing no disabled nodes. The IndexScan plan survives initially, despite being disabled, because it's got a sort order. That lets us use it to build a MergeAppend path, and that MergeAppend path is not disabled, and compares favorably on cost. After straining my brain over various plan changes for a long time, and hacking on the code somewhat, I realized that just propagating the Boolean upward is insufficient to set things right. That's basically because I was being dumb when I said this: > I don't think we should care how MANY disabled nodes appear in a > plan, particularly. Suppose we try to plan a Nested Loop with SeqScan disabled, but there's no alternative to a SeqScan for the outer side of the join. If we suppose an upward-propagating Boolean, every path for the join is disabled because every path for the outer side is disabled. That means that we have no reason to avoid paths where the inner side also uses a disabled path. When we loop over the inner rel's pathlist looking for ways to build a path for the join, we may find some disabled paths there, and the join paths we build from those paths are disabled, but so are the join paths where we use a non-disabled path on the inner side. So those paths are just competing with each other on cost, and the path type that is supposedly disabled on the outer side of the join ends up not really being very disabled at all. More precisely, if disabling a plan type causes paths to be discarded completely before the join paths are constructed, then they actually do get removed from consideration. But if those paths make it into inner rel's path list, even way out towards the end, then paths derived from them can jump to the front of the joinrel's path list. The same kind of problem happens with Append or MergeAppend nodes. The regression tests expect that we'll avoid disabled plan types whenever possible even if we can't avoid them completely; for instance, the matest0 table intentionally omits an index on one child table. Disabling sequential scans is expected to disable them for all of the other child tables even though for that particular child table there is no other option. But that behavior is hard to achieve if every path for the parent rel is "equally disabled". You either want the path that uses only the one required SeqScan to be not-disabled even though one of its children is disabled ... or you want the disabled flag to be more than a Boolean. And while there's probably more than one way to make it work, the easiest thing seems to be to just have a disabled-counter in every node that gets initialized to the total disabled-counter values of all of its children, and then you add 1 if that node is itself doing something that is disabled, i.e. the exact opposite of what I said in the quote above. I haven't done enough work to know whether that would match the current behavior, let alone whether it would have acceptable performance, and I'm not at all convinced that's the right direction anyway. Actually, at the moment, I don't have a very good idea at all what the right direction is. I do have a feeling that this patch is probably not going in the right direction, but I might be wrong about that, too. -- Robert Haas EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes: > One of the things I realized relatively early is that the patch does > nothing to propagate disable_cost upward through the plan tree. > ... > After straining my brain over various plan changes for a long time, > and hacking on the code somewhat, I realized that just propagating the > Boolean upward is insufficient to set things right. That's basically > because I was being dumb when I said this: >> I don't think we should care how MANY disabled nodes appear in a >> plan, particularly. Very interesting, thanks for the summary. So the fact that disable_cost is additive across plan nodes is actually a pretty important property of the current setup. I think this is closely related to one argument you made against my upthread idea of using IEEE Infinity for disable_cost: that'd mask whether more than one of the sub-plans had been disabled. > ... And while there's probably more than one way > to make it work, the easiest thing seems to be to just have a > disabled-counter in every node that gets initialized to the total > disabled-counter values of all of its children, and then you add 1 if > that node is itself doing something that is disabled, i.e. the exact > opposite of what I said in the quote above. Yeah, that seems like the next thing to try if anyone plans to pursue this further. That'd essentially do what we're doing now except that disable_cost is its own "order of infinity", entirely separate from normal costs. regards, tom lane
On Mon, Apr 1, 2024 at 5:00 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Very interesting, thanks for the summary. So the fact that > disable_cost is additive across plan nodes is actually a pretty > important property of the current setup. I think this is closely > related to one argument you made against my upthread idea of using > IEEE Infinity for disable_cost: that'd mask whether more than one > of the sub-plans had been disabled. Yes, exactly. I just hadn't quite put the pieces together. > Yeah, that seems like the next thing to try if anyone plans to pursue > this further. That'd essentially do what we're doing now except that > disable_cost is its own "order of infinity", entirely separate from > normal costs. Right. I think that's actually what I had in mind in the last paragraph of http://postgr.es/m/CA+TgmoY+Ltw7B=1FSFSN4yHcu2roWrz-ijBovj-99LZU=9h1dA@mail.gmail.com but that was a while ago and I'd lost track of why it actually mattered. But I also have questions about whether that's really the right approach. I think the approach of just not generating paths we don't want in the first place merits more consideration. We do that in some cases already, but not in others, and I'm not clear why. Like, if index-scans, index-only scans, sorts, nested loops, and hash joins are disabled, something is going to have to give, because the only remaining join type is a merge join yet we've ruled out every possible way of getting the day into some order, but I'm not sure whether there's some reason that we need exactly the behavior that we have right now rather than anything else. Maybe it would be OK to just insist on at least one unparameterized, non-partial path at the baserel level, and then if that ends up forcing us to ignore the join-type restrictions higher up, so be it. Or maybe that's not OK and after I try that out I'll end up writing another email about how I was a bit clueless about all of this. I don't know. But I feel like it merits more investigation, because I'm having trouble shaking the theory that what we've got right now is pretty arbitrary. And also ... looking at the regression tests, and also thinking about the kinds of problems that I think people run into in real deployments, I can't help feeling like we've somehow got this whole thing backwards. enable_wunk imagines that you want to plan as normal except with one particular plan type excluded from consideration. And maybe that makes sense if the point of the enable_wunk GUC is that the planner feature might be buggy and you might therefore want to turn it off to protect yourself, or if the planner feature might be expensive and you might want to turn it off to save cycles. But surely that's not the case with something like enable_seqscan or enable_indexscan. What I think we're mostly doing in the regression tests is shutting off every relevant type of plan except one. I theorize that what we actually want to do is tell the planner what we do want to happen, rather than what we don't want to happen, but we've got this weird set of GUCs that do the opposite of that and we're super-attached to them because they've existed forever. I don't really have a concrete proposal here, but I wonder if what we're actually talking about here is spending time and energy polishing a mechanism that nobody likes in the first place. -- Robert Haas EDB: http://www.enterprisedb.com
On Mon, Apr 1, 2024 at 7:54 PM Robert Haas <robertmhaas@gmail.com> wrote:
What I think we're mostly doing in the regression tests is shutting
off every relevant type of plan except one. I theorize that what we
actually want to do is tell the planner what we do want to happen,
rather than what we don't want to happen, but we've got this weird set
of GUCs that do the opposite of that and we're super-attached to them
because they've existed forever.
So rather than listing all the things we don't want to happen, we need a way to force (nay, highly encourage) a particular solution. As our costing is a based on positive numbers, what if we did something like this in costsize.c?
Cost disable_cost = 1.0e10;
Cost promotion_cost = 1.0e10; // or higher or lower, depending on how strongly we want to "beat" disable_costs effects.
...
if (!enable_seqscan)
startup_cost += disable_cost;
startup_cost += disable_cost;
else if (promote_seqscan)
startup_cost -= promotion_cost; // or replace "promote" with "encourage"?
Greg
On Tue, Apr 2, 2024 at 10:04 AM Greg Sabino Mullane <htamfids@gmail.com> wrote: > So rather than listing all the things we don't want to happen, we need a way to force (nay, highly encourage) a particularsolution. As our costing is a based on positive numbers, what if we did something like this in costsize.c? > > Cost disable_cost = 1.0e10; > Cost promotion_cost = 1.0e10; // or higher or lower, depending on how strongly we want to "beat" disable_costseffects. > ... > > if (!enable_seqscan) > startup_cost += disable_cost; > else if (promote_seqscan) > startup_cost -= promotion_cost; // or replace "promote" with "encourage"? I'm pretty sure negative costs are going to create a variety of unpleasant planning artifacts. The large positive costs do, too, which is where this whole discussion started. If I disable (or promote) some particular plan, I want the rest of the plan tree to come out looking as much as possible like what would have happened if the same alternative had won organically on cost. I think the only reason we're driving this off of costing today is that making add_path() more complicated is unappealing, mostly on performance grounds, and if you add disabled-ness (or promoted-ness) as a separate axis of value then add_path() has to know about that on top of everything else. I think the goal here is to come up with a more principled alternative that isn't just based on whacking large numbers into the cost and hoping something good happens ... but it is a whole lot easier to be unhappy with the status quo than it is to come up with something that's actually better. I am planning to spend some more time thinking about it, though. -- Robert Haas EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes: > On Tue, Apr 2, 2024 at 10:04 AM Greg Sabino Mullane <htamfids@gmail.com> wrote: >> if (!enable_seqscan) >> startup_cost += disable_cost; >> else if (promote_seqscan) >> startup_cost -= promotion_cost; // or replace "promote" with "encourage"? > I'm pretty sure negative costs are going to create a variety of > unpleasant planning artifacts. Indeed. It might be okay to have negative values for disabled-ness if we treat disabled-ness as a "separate order of infinity", but I suspect that it'd behave poorly when there are both disabled and promoted sub-paths in a tree, for pretty much the same reasons you explained just upthread. > I think the only reason we're > driving this off of costing today is that making add_path() more > complicated is unappealing, mostly on performance grounds, and if you > add disabled-ness (or promoted-ness) as a separate axis of value then > add_path() has to know about that on top of everything else. It doesn't seem to me that it's a separate axis of value, just a higher-order component of the cost metric. Nonetheless, adding even a few instructions to add_path comparisons sounds expensive. Maybe it'd be fine, but we'd need to do some performance testing. regards, tom lane
On Tue, Apr 2, 2024 at 11:54 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I'm pretty sure negative costs are going to create a variety of > > unpleasant planning artifacts. > > Indeed. It might be okay to have negative values for disabled-ness > if we treat disabled-ness as a "separate order of infinity", but > I suspect that it'd behave poorly when there are both disabled and > promoted sub-paths in a tree, for pretty much the same reasons you > explained just upthread. Hmm, can you explain further? I think essentially you'd be maximizing #(promoted notes)-#(disabled nodes), but I have no real idea whether that behavior will be exactly what people want or extremely unintuitive or something in the middle. It seems like it should be fine if there's only promoting or only disabling or if we can respect both the promoting and the disabling, assuming we even want to have both, but I'm suspicious that it will be weird somehow in other cases. I can't say exactly in what way, though. Do you have more insight? > > I think the only reason we're > > driving this off of costing today is that making add_path() more > > complicated is unappealing, mostly on performance grounds, and if you > > add disabled-ness (or promoted-ness) as a separate axis of value then > > add_path() has to know about that on top of everything else. > > It doesn't seem to me that it's a separate axis of value, just a > higher-order component of the cost metric. Nonetheless, adding even > a few instructions to add_path comparisons sounds expensive. Maybe > it'd be fine, but we'd need to do some performance testing. Hmm, yeah. I'm not sure how much difference there is between these things in practice. I didn't run down everything that was happening, but I think what I did was equivalent to making it a higher-order component of the cost metric, and it seemed like an awful lot of paths were surviving anyway, e.g. index scans survived enable_indexscan=false because they had a sort order, and I think sequential scans were surviving enable_seqscan=false too, perhaps because they had no startup cost. At any rate there's no question that add_path() is hot. -- Robert Haas EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes: > On Tue, Apr 2, 2024 at 11:54 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I suspect that it'd behave poorly when there are both disabled and >> promoted sub-paths in a tree, for pretty much the same reasons you >> explained just upthread. > Hmm, can you explain further? I think essentially you'd be maximizing > #(promoted notes)-#(disabled nodes), but I have no real idea whether > that behavior will be exactly what people want or extremely > unintuitive or something in the middle. It seems like it should be > fine if there's only promoting or only disabling or if we can respect > both the promoting and the disabling, assuming we even want to have > both, but I'm suspicious that it will be weird somehow in other cases. > I can't say exactly in what way, though. Do you have more insight? Not really. But if you had, say, a join of a promoted path to a disabled path, that would be treated as on-par with a join of two regular paths, which seems like it'd lead to odd choices. Maybe it'd be fine, but my gut says it'd likely not act nicely. As you say, it's a lot easier to believe that only-promoted or only-disabled situations would behave sanely. regards, tom lane
On Tue, Apr 2, 2024 at 12:58 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Not really. But if you had, say, a join of a promoted path to a > disabled path, that would be treated as on-par with a join of two > regular paths, which seems like it'd lead to odd choices. Maybe > it'd be fine, but my gut says it'd likely not act nicely. As you > say, it's a lot easier to believe that only-promoted or only-disabled > situations would behave sanely. Makes sense. I wanted to further explore the idea of just not generating plans of types that are currently disabled. I looked into doing this for enable_indexscan and enable_indexonlyscan. As a first step, I investigated how those settings work now, and was horrified. I don't know whether I just wasn't paying attention back when the original index-only scan work was done -- I remember discussing enable_indexonlyscan with you at the time -- or whether it got changed subsequently. Anyway, the current behavior is: [A] enable_indexscan=false adds disable_cost to the cost of every Index Scan path *and also* every Index-Only Scan path. So disabling index-scans also in effect discourages the use of index-only scans, which would make sense if we didn't have a separate setting called enable_indexonlyscan, but we do. Given that, I think this is completely and utterly wrong. [b] enable_indexonlyscan=false causes index-only scan paths not to be generated at all, but instead, we generate index-scan paths to do the same thing that we would not have generated otherwise. This is weird because it means that disabling one plan type causes us to consider additional plans of another type, which seems like a thing that a user might not expect. It's more defensible than [A], though, because you could argue that we only omit the index scan path as an optimization, on the theory that it will always lose to the index-only scan path, and thus if the index-only scan path is not generated, there's a point to generating the index scan path after all, so we should. However, it seems unlikely to me that someone reading the one line of documentation that we have about this parameter would be able to guess that it works this way. Here's an example of how the current system behaves: robert.haas=# explain select count(*) from pgbench_accounts; QUERY PLAN ----------------------------------------------------------------------------------------------------------------- Aggregate (cost=2854.29..2854.30 rows=1 width=8) -> Index Only Scan using pgbench_accounts_pkey on pgbench_accounts (cost=0.29..2604.29 rows=100000 width=0) (2 rows) robert.haas=# set enable_indexscan=false; SET robert.haas=# explain select count(*) from pgbench_accounts; QUERY PLAN ------------------------------------------------------------------------------ Aggregate (cost=2890.00..2890.01 rows=1 width=8) -> Seq Scan on pgbench_accounts (cost=0.00..2640.00 rows=100000 width=0) (2 rows) robert.haas=# set enable_seqscan=false; SET robert.haas=# set enable_bitmapscan=false; SET robert.haas=# explain select count(*) from pgbench_accounts; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=10000002854.29..10000002854.30 rows=1 width=8) -> Index Only Scan using pgbench_accounts_pkey on pgbench_accounts (cost=10000000000.29..10000002604.29 rows=100000 width=0) (2 rows) robert.haas=# set enable_indexonlyscan=false; SET robert.haas=# explain select count(*) from pgbench_accounts; QUERY PLAN ----------------------------------------------------------------------------------------------- Aggregate (cost=10000002890.00..10000002890.01 rows=1 width=8) -> Seq Scan on pgbench_accounts (cost=10000000000.00..10000002640.00 rows=100000 width=0) (2 rows) The first time we run the query, it picks an index-only scan because it's the cheapest. When index scans are disabled, the query now picks a sequential scan, even though it wasn't use an index-only scan, because the index scan that it was using is perceived to have become very expensive. When we then shut off sequential scans and bitmap-only scans, it switches back to an index-only scan, because setting enable_indexscan=false didn't completely disable index-only scans, but just made them expensive. But now everything looks expensive, so we go back to the same plan we had initially, except with the cost increased by a bazillion. Finally, when we disable index-only scans, that removes that plan from the pool, so now we pick the second-cheapest plan overall, which in this case is a sequential scan. So just to see what would happen, I wrote a patch to make enable_indexscan and enable_indexonlyscan do exactly what they say on the tin: when you set one of them to false, paths of that type are not generated, and nothing else changes. I found that there are a surprisingly large number of regression tests that rely on the current behavior, so I took a crack at fixing them to achieve their goals (or what I believed their goals to be) in other ways. The resulting patch is attached for your (or anyone's) possible edification. Just to be clear, I have no immediate plans to press forward with trying to get something committed here. It seems pretty clear to me that we should fix [A] in some way, but maybe not in the way I did it here. It's also pretty clear to me that the fact that enable_indexscan and enable_indexonlyscan work completely differently from each other is surprising at best, wrong at worst, but here again, what this patch does about that is not above reproach. I think it may make sense to dig through the behavior of some of the remaining enable_* GUCs before settling on a final strategy here, but I thought that the finds above were interesting enough and bizarre enough that it made sense to drop an email now and see what people think of all this before going further. -- Robert Haas EDB: http://www.enterprisedb.com
Attachment
On Wed, Apr 3, 2024 at 3:21 PM Robert Haas <robertmhaas@gmail.com> wrote:
It's also pretty clear to me that the fact that enable_indexscan
and enable_indexonlyscan work completely differently from each other
is surprising at best, wrong at worst, but here again, what this patch
does about that is not above reproach.
Yes, that is wrong, surely there is a reason we have two vars. Thanks for digging into this: if nothing else, the code will be better for this discussion, even if we do nothing for now with disable_cost.
Cheers,
Greg
On Thu, 4 Apr 2024 at 08:21, Robert Haas <robertmhaas@gmail.com> wrote: > I wanted to further explore the idea of just not generating plans of > types that are currently disabled. I looked into doing this for > enable_indexscan and enable_indexonlyscan. As a first step, I > investigated how those settings work now, and was horrified. I don't > know whether I just wasn't paying attention back when the original > index-only scan work was done -- I remember discussing > enable_indexonlyscan with you at the time -- or whether it got changed > subsequently. Anyway, the current behavior is: > > [A] enable_indexscan=false adds disable_cost to the cost of every > Index Scan path *and also* every Index-Only Scan path. So disabling > index-scans also in effect discourages the use of index-only scans, > which would make sense if we didn't have a separate setting called > enable_indexonlyscan, but we do. Given that, I think this is > completely and utterly wrong. > > [b] enable_indexonlyscan=false causes index-only scan paths not to be > generated at all, but instead, we generate index-scan paths to do the > same thing that we would not have generated otherwise. FWIW, I think changing this is a bad idea and I don't think the behaviour that's in your patch is useful. With your patch, if I SET enable_indexonlyscan=false, any index that *can* support an IOS for my query will now not be considered at all! With your patch applied, I see: -- default enable_* GUC values. postgres=# explain select oid from pg_class order by oid; QUERY PLAN ------------------------------------------------------------------------------------------- Index Only Scan using pg_class_oid_index on pg_class (cost=0.27..22.50 rows=415 width=4) (1 row) postgres=# set enable_indexonlyscan=0; -- no index scan? SET postgres=# explain select oid from pg_class order by oid; QUERY PLAN ----------------------------------------------------------------- Sort (cost=36.20..37.23 rows=415 width=4) Sort Key: oid -> Seq Scan on pg_class (cost=0.00..18.15 rows=415 width=4) (3 rows) postgres=# set enable_seqscan=0; -- still no index scan! SET postgres=# explain select oid from pg_class order by oid; QUERY PLAN ------------------------------------------------------------------------------------ Sort (cost=10000000036.20..10000000037.23 rows=415 width=4) Sort Key: oid -> Seq Scan on pg_class (cost=10000000000.00..10000000018.15 rows=415 width=4) (3 rows) postgres=# explain select oid from pg_class order by oid,relname; -- now an index scan?! QUERY PLAN --------------------------------------------------------------------------------------------- Incremental Sort (cost=0.43..79.50 rows=415 width=68) Sort Key: oid, relname Presorted Key: oid -> Index Scan using pg_class_oid_index on pg_class (cost=0.27..60.82 rows=415 width=68) (4 rows) I don't think this is good as pg_class_oid_index effectively won't be used as long as the particular query could use that index with an index *only* scan. You can see above that as soon as I adjust the query slightly so that IOS isn't possible, the index can be used again. I think an Index Scan would have been a much better option for the 2nd query than the seq scan and sort. I think if I do SET enable_indexonlyscan=0; the index should still be used with an Index Scan and it definitely shouldn't result in Index Scan also being disabled if that index happens to contain all the columns required to support an IOS. FWIW, I'm fine with the current behaviour. It looks like we've assumed that, when possible, IOS are always superior to Index Scan, so there's no point in generating an Index Scan path when we can generate an IOS path. I think this makes sense. For that not to be true, checking the all visible flag would have to be more costly than visiting the heap. Perhaps that could be true if the visibility map page had to come from disk and the heap page was cached and the disk was slow, but I don't think that scenario is worthy of considering both Index scan and IOS path types when IOS is possible. We've no way to accurately cost that anyway. This all seems similar to enable_sort vs enable_incremental_sort. For a while, we did consider both an incremental sort and a sort when an incremental sort was possible, but it seemed to me that an incremental sort would always be better when it was possible, so I changed that in 4a29eabd1. I've not seen anyone complain. I made it so that when an incremental sort is possible but is disabled, we do a sort instead. That seems fairly similar to how master handles enable_indexonlyscan=false. In short, I don't find it strange that disabling one node type results in considering another type that we'd otherwise not consider in cases where we assume that the disabled node type is always superior and should always be used when it is possible. I do agree that adding disable_cost to IOS when enable_indexscan=0 is a bit weird. We don't penalise incremental sorts when sorts are disabled, so aligning those might make sense. David
On Thu, 4 Apr 2024 at 10:15, David Rowley <dgrowleyml@gmail.com> wrote: > In short, I don't find it strange that disabling one node type results > in considering another type that we'd otherwise not consider in cases > where we assume that the disabled node type is always superior and > should always be used when it is possible. In addition to what I said earlier, I think the current enable_indexonlyscan is implemented in a way that has the planner do what it did before IOS was added. I think that goal makes sense with any patch that make the planner try something new. We want to have some method to get the previous behaviour for the cases where the planner makes a dumb choice or to avoid some bug in the new feature. I think using that logic, the current scenario with enable_indexscan and enable_indexonlyscan makes complete sense. I mean, including enable_indexscan=0 adding disable_cost to IOS Paths. David
On Wed, Apr 3, 2024 at 11:09 PM David Rowley <dgrowleyml@gmail.com> wrote: > On Thu, 4 Apr 2024 at 10:15, David Rowley <dgrowleyml@gmail.com> wrote: > > In short, I don't find it strange that disabling one node type results > > in considering another type that we'd otherwise not consider in cases > > where we assume that the disabled node type is always superior and > > should always be used when it is possible. > > In addition to what I said earlier, I think the current > enable_indexonlyscan is implemented in a way that has the planner do > what it did before IOS was added. I think that goal makes sense with > any patch that make the planner try something new. We want to have > some method to get the previous behaviour for the cases where the > planner makes a dumb choice or to avoid some bug in the new feature. I see the logic of this, and I agree that the resulting behavior might be more intuitive than what I posted before. I'll do some experiments. > I think using that logic, the current scenario with enable_indexscan > and enable_indexonlyscan makes complete sense. I mean, including > enable_indexscan=0 adding disable_cost to IOS Paths. This, for me, is a bridge too far. I don't think there's a real argument that "what the planner did before IOS was added" was add disable_cost to the cost of index-only scan paths. There was no such path type. Independently of that argument, I also think the behavior of a setting needs to be something that a user can understand. Right now, the documentation says: Enables or disables the query planner's use of index-scan plan types. The default is on. Enables or disables the query planner's use of index-only-scan plan types (see Section 11.9). The default is on. I do not think that a user can be expected to guess from these descriptions that the first one also affects index-only scans, or that the two GUCs disable their respective plan types in completely different ways. Granted, the latter inconsistency affects a whole bunch of these settings, not just this one, but still. -- Robert Haas EDB: http://www.enterprisedb.com
On Sat, Nov 2, 2019 at 10:57 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > The idea that I've been thinking about is to not generate disabled > Paths in the first place, thus not only fixing the problem but saving > some cycles. While this seems easy enough for "optional" paths, > we have to reserve the ability to generate certain path types regardless, > if there's no other way to implement the query. This is a bit of a > stumbling block :-(. At the base relation level, we could do something > like generating seqscan last, and only if no other path has been > successfully generated. Continuing my investigation into this rather old thread, I did a rather primitive implementation of this idea, for baserels only, and discovered that it caused a small number of planner failures running the regression tests. Here is a slightly simplified example: CREATE TABLE strtest (n text, t text); CREATE INDEX strtest_n_idx ON strtest (n); SET enable_seqscan=false; EXPLAIN SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.n >= s2.n; With the patch, I get: ERROR: could not devise a query plan for the given query The problem here is that it's perfectly possible to generate a valid path for s1 -- and likewise for s2, since it's the same underlying relation -- while respecting the enable_seqscan=false constraint. However, all such paths are parameterized by the other of the two relations, which means that if we do that, we can't plan the join, because we need an unparameterized path for at least one of the two sides in order to build a nested loop join, which is the only way to satisfy the parameterization on the other side. Now, you could try to fix this by deciding that planning for a baserel hasn't really succeeded unless we got at least one *unparameterized* path for that baserel. I haven't tried that, but I presume that if you do, it fixes the above example, because now there will be a last-ditch sequential scan on both sides and so this example will behave as expected. But if you do that, then in other cases, that sequential scan is going to get picked even when it isn't strictly necessary to do so, just because some plan that uses it looks better on cost. Presumably that problem can in turn be fixed by deciding that we also need to keep disable_cost around (or the separate disable-counter idea that we were discussing recently in another branch of this thread), but that's arguably missing the point of this exercise. Another idea is to remove the ERROR mentioned above from set_cheapest() and just allow planning to continue even if some relations end up with no paths. (This would necessitate finding and fixing any code that could be confused by a pathless relation.) Then, if you get to the top of the plan tree and you have no paths there, redo the join search discarding the constraints (or maybe just some of the constraints, e.g. allow sequential scans and nested loops, or something). Conceptually, I like this idea a lot, but I think there are a few problems. One is that I'm not quite sure how to find all the code that would need to be adjusted to make it work, though the header comment for standard_join_search() seems like it's got some helpful tips. A second is that it's another version of the disable_cost = infinity problem: once you find that you can't generate a path while enforcing all of the restrictions, you just disregard the restrictions completely, instead of discarding them only to the extent necessary. I have a feeling that's not going to be very appealing. Now, I suppose it might be that even if we can't remove disable_cost, something along these lines is still worth doing, just to save CPU cycles. You could for example try planning with only non-disabled stuff and then do it over again with everything if that doesn't work out, still keeping disable_cost around so that you avoid disabled nodes where you can. But I'm kind of hoping that I'm missing something and there's some approach that could both kill disable_cost and save some cycles at the same time. If (any of) you have an idea, I'd love to hear it! -- Robert Haas EDB: http://www.enterprisedb.com
On Sat, 4 May 2024 at 08:34, Robert Haas <robertmhaas@gmail.com> wrote: > Another idea is to remove the ERROR mentioned above from > set_cheapest() and just allow planning to continue even if some > relations end up with no paths. (This would necessitate finding and > fixing any code that could be confused by a pathless relation.) Then, > if you get to the top of the plan tree and you have no paths there, > redo the join search discarding the constraints (or maybe just some of > the constraints, e.g. allow sequential scans and nested loops, or > something). I don't think you'd need to wait longer than where we do set_cheapest and find no paths to find out that there's going to be a problem. I don't think redoing planning is going to be easy or even useful. I mean what do you change when you replan? You can't just do enable_seqscan and enable_nestloop as if there's no index to provide sorted input and the plan requires some sort, then you still can't produce a plan. Adding enable_sort to the list does not give me much confidence we'll never fail to produce a plan either. It just seems impossible to know which of the disabled ones caused the RelOptInfo to have no paths. Also, you might end up enabling one that caused the planner to do something different than it would do today. For example, a Path that today incurs 2x disable_cost vs a Path that only receives 1x disable_cost might do something different if you just went and enabled a bunch of enable* GUCs before replanning. > Now, I suppose it might be that even if we can't remove disable_cost, > something along these lines is still worth doing, just to save CPU > cycles. You could for example try planning with only non-disabled > stuff and then do it over again with everything if that doesn't work > out, still keeping disable_cost around so that you avoid disabled > nodes where you can. But I'm kind of hoping that I'm missing something > and there's some approach that could both kill disable_cost and save > some cycles at the same time. If (any of) you have an idea, I'd love > to hear it! I think the int Path.disabledness idea is worth coding up to try it. I imagine that a Path will incur the maximum of its subpath's disabledness's then add_path() just needs to prefer lower-valued disabledness Paths. That doesn't get you the benefits of fewer CPU cycles, but where did that come from as a motive to change this? There's no shortage of other ways to make the planner faster if that's an issue. David
David Rowley <dgrowleyml@gmail.com> writes: > I don't think you'd need to wait longer than where we do set_cheapest > and find no paths to find out that there's going to be a problem. At a base relation, yes, but that doesn't work for joins: it may be that a particular join cannot be formed, yet other join sequences will work. We have that all the time from outer-join ordering restrictions, never mind enable_xxxjoin flags. So I'm not sure that we can usefully declare early failure for joins. > I think the int Path.disabledness idea is worth coding up to try it. > I imagine that a Path will incur the maximum of its subpath's > disabledness's then add_path() just needs to prefer lower-valued > disabledness Paths. I would think sum not maximum, but that's a detail. > That doesn't get you the benefits of fewer CPU cycles, but where did > that come from as a motive to change this? There's no shortage of > other ways to make the planner faster if that's an issue. The concern was to not *add* CPU cycles in order to make this area better. But I do tend to agree that we've exhausted all the other options. BTW, I looked through costsize.c just now to see exactly what we are using disable_cost for, and it seemed like a majority of the cases are just wrong. Where possible, we should implement a plan-type-disable flag by not generating the associated Path in the first place, not by applying disable_cost to it. But it looks like a lot of people have erroneously copied the wrong logic. I would say that only these plan types should use the disable_cost method: seqscan nestloop join sort as those are the only ones where we risk not being able to make a plan at all for lack of other alternatives. There is also some weirdness around needing to force use of tidscan if we have WHERE CURRENT OF. But perhaps a different hack could be used for that. We also have this for hashjoin: * If the bucket holding the inner MCV would exceed hash_mem, we don't * want to hash unless there is really no other alternative, so apply * disable_cost. I'm content to leave that be, if we can't remove disable_cost entirely. What I'm wondering at this point is whether we need to trouble with implementing the separate-disabledness-count method, if we trim back the number of places using disable_cost to the absolute minimum. regards, tom lane
On Sun, 5 May 2024 at 04:57, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > David Rowley <dgrowleyml@gmail.com> writes: > > That doesn't get you the benefits of fewer CPU cycles, but where did > > that come from as a motive to change this? There's no shortage of > > other ways to make the planner faster if that's an issue. > > The concern was to not *add* CPU cycles in order to make this area > better. But I do tend to agree that we've exhausted all the other > options. It really looks to me that Robert was talking about not generating paths for disabled path types. He did write "just to save CPU cycles" in the paragraph I quoted. I think we should concern ourselves with adding overhead to add_path() *only* when we actually see a patch which slows it down in a way that we can measure. I find it hard to imagine that adding a single comparison for every Path is measurable. Each of these paths has been palloced and costed, both of which are significantly more expensive than adding another comparison to compare_path_costs_fuzzily(). I'm only willing for benchmarks on an actual patch to prove me wrong on that. Nothing else. add_path() has become a rat's nest of conditions over the years and those seem to have made it without concerns about performance. > BTW, I looked through costsize.c just now to see exactly what we are > using disable_cost for, and it seemed like a majority of the cases are > just wrong. Where possible, we should implement a plan-type-disable > flag by not generating the associated Path in the first place, not by > applying disable_cost to it. But it looks like a lot of people have > erroneously copied the wrong logic. I would say that only these plan > types should use the disable_cost method: > > seqscan > nestloop join > sort I think this oversimplifies the situation. I only spent 30 seconds looking and I saw cases where this would cause issues. If enable_hashagg is false, we could fail to produce some plans where the type is sortable but not hashable. There's also an issue with nested loops being unable to FULL OUTER JOIN. However, I do agree that there are some in there that are adding disable_cost that should be done by just not creating the Path. enable_gathermerge is one. enable_bitmapscan is probably another. I understand you only talked about the cases adding disable_cost in costsize.c. But just as a reminder, there are other things we need to be careful not to break. For example, enable_indexonlyscan=false should defer to still making an index scan. Nobody who disables enable_indexonlyscan without disabling enable_indexscan wants queries that are eligible to use IOS to use seq scan instead. They'd still want Index Scan to be considered, otherwise they'd have disabled enable_indexscan. David
On Sat, May 4, 2024 at 9:16 AM David Rowley <dgrowleyml@gmail.com> wrote: > I don't think you'd need to wait longer than where we do set_cheapest > and find no paths to find out that there's going to be a problem. I'm confused by this response, because I thought that the main point of my previous email was explaining why that's not true. I showed an example where you do find paths at set_cheapest() time and yet are unable to complete planning. > I don't think redoing planning is going to be easy or even useful. I > mean what do you change when you replan? You can't just do > enable_seqscan and enable_nestloop as if there's no index to provide > sorted input and the plan requires some sort, then you still can't > produce a plan. Adding enable_sort to the list does not give me much > confidence we'll never fail to produce a plan either. It just seems > impossible to know which of the disabled ones caused the RelOptInfo to > have no paths. Also, you might end up enabling one that caused the > planner to do something different than it would do today. For > example, a Path that today incurs 2x disable_cost vs a Path that only > receives 1x disable_cost might do something different if you just went > and enabled a bunch of enable* GUCs before replanning. I agree that there are problems here, both in terms of implementation complexity and also in terms of what behavior you actually get, but I do not think that a proposal which changes some current behavior should be considered dead on arrival. Whatever new behavior we might want to implement needs to make sense, and there need to be good reasons for making whatever changes are contemplated, but I don't think we should take the position that it has to be identical to current. > I think the int Path.disabledness idea is worth coding up to try it. > I imagine that a Path will incur the maximum of its subpath's > disabledness's then add_path() just needs to prefer lower-valued > disabledness Paths. It definitely needs to be sum, not max. Otherwise you can't get the matest example from the regression tests right, where one child lacks the ability to comply with the GUC setting. > That doesn't get you the benefits of fewer CPU cycles, but where did > that come from as a motive to change this? There's no shortage of > other ways to make the planner faster if that's an issue. Well, I don't agree with that at all. If there are lots of ways to make the planner faster, we should definitely do a bunch of that stuff, because "will slow down the planner too much" has been a leading cause of proposed planner patches being rejected for as long as I've been involved with the project. My belief was that we were rather short of good ideas in that area, actually. But even if it's true that we have lots of other ways to speed up the planner, that doesn't mean that it wouldn't be good to do it here, too. Stepping back a bit, my current view of this area is: disable_cost is highly imperfect both as an idea and as implemented in PostgreSQL. Although I'm discovering that the current implementation gets more things right than I had realized, it also sometimes gets things wrong. The original poster gave an example of that, and there are others. Furthermore, the current implementation has some weird inconsistencies. Therefore, I would like something better. Better, to me, could mean any combination of (a) superior behavior, (b) superior performance, and (c) simpler, more elegant code. In a perfect world, we'd be able to come up with something that wins in all three of those areas, but I'm not seeing a way to achieve that, so I'm trying to figure out what is achievable. And because we need to reach consensus on whatever is to be done, I'm sharing raw research results rather than just dropping a completed patch. I don't think it's at all easy to understand what the realistic possibilities are in this area; certainly it isn't for me. At some point I'm hoping that there will be a patch (or a bunch of patches) that we can all agree are an improvement over now and the best we can reasonably do, but I don't yet know what the shape of those will be, because I'm still trying to understand (and document on-list) what all the problems are. -- Robert Haas EDB: http://www.enterprisedb.com
On Sat, May 4, 2024 at 12:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > There is also some weirdness around needing to force use of tidscan > if we have WHERE CURRENT OF. But perhaps a different hack could be > used for that. Yeah, figuring out what to do about this was the trickiest part of the experimental patch that I wrote last week. The idea of the current code is that cost_qual_eval_walker charges disable_cost for CurrentOfExpr, but cost_tidscan then subtracts disable_cost if tidquals contains a CurrentOfExpr, so that we effectively disable everything except TID scan paths and, I think, also any TID scan paths that don't use the CurrentOfExpr as a qual. I'm not entirely sure whether the last can happen, but I imagine that it might be possible if the cursor refers to a query that itself contains some other kind of TID qual. It's not very clear that this mechanism is actually 100% reliable, because we know it's possible in general for the costs of two paths to be different by more than disable_cost. Maybe that's not possible in this specific context, though: I'm not sure. The approach I took for my experimental patch was pretty grotty, and probably not quite complete, but basically I defined the case where we currently subtract out disable_cost as a "forced TID-scan". I passed around a Boolean called forcedTidScan which gets set to true if we discover that some plan is a forced TID-scan path, and then we discard any other paths and then only add other forced TID-scan paths after that point. There can be more than one, because of parameterization. But I think that the right thing to do is probably to pull some of the logic up out of create_tidscan_paths() and decide ONCE whether we're in a forced TID-scan situation or not. If we are, then set_plain_rel_pathlist() should arrange to create only forced TID-scan paths; otherwise, it should proceed as it does now. Maybe if I try to do that I'll find problems, but the current approach seems backwards to me, like going to a restaurant and ordering one of everything on the menu, then cancelling all of the orders except the stuff you actually want. -- Robert Haas EDB: http://www.enterprisedb.com
On Mon, May 6, 2024 at 9:39 AM Robert Haas <robertmhaas@gmail.com> wrote: > It's not very clear that this mechanism is actually 100% reliable, It isn't. Here's a test case. As a non-superuser, do this: create table foo (a int, b text, primary key (a)); insert into foo values (1, 'Apple'); alter table foo enable row level security; alter table foo force row level security; create policy p1 on foo as permissive using (ctid in ('(0,1)', '(0,2)')); begin; declare c cursor for select * from foo; fetch from c; explain update foo set b = 'Manzana' where current of c; update foo set b = 'Manzana' where current of c; The explain produces this output: Update on foo (cost=10000000000.00..10000000008.02 rows=0 width=0) -> Tid Scan on foo (cost=10000000000.00..10000000008.02 rows=1 width=38) TID Cond: (ctid = ANY ('{"(0,1)","(0,2)"}'::tid[])) Filter: CURRENT OF c Unless I'm quite confused, the point of the code is to force CurrentOfExpr to be a TID Cond, and it normally succeeds in doing so, because WHERE CURRENT OF cursor_name has to be the one and only WHERE condition for a normal UPDATE. I tried various cases involving views and CTEs and got nowhere. But then I wrote a patch to make the regression tests fail if a baserel's restrictinfo list contains a CurrentOfExpr and also some other qual, and a couple of row-level security tests failed (and nothing else). Which then allowed me to construct the example above, where there are two possible TID quals and the logic in tidpath.c latches onto the wrong one. The actual UPDATE fails like this: ERROR: WHERE CURRENT OF is not supported for this table type ...because ExecEvalCurrentOfExpr supposes that the only way it can be reached is for an FDW without the necessary support, but actually in this case it's planner error that gets us here. Fortunately, there's no real reason for anyone to ever do something like this, or at least I can't see one, so the fact that it doesn't work probably doesn't really matter that much. And you can argue that the only problem here is that the costing hack just didn't get updated for RLS and now needs to be a bit more clever. But I think it'd be better to find a way of making it less hacky. With the way the code is structured right now, the chances of anyone understanding that RLS might have an impact on its correctness were just about nil, IMHO. -- Robert Haas EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, May 6, 2024 at 9:39 AM Robert Haas <robertmhaas@gmail.com> wrote: >> It's not very clear that this mechanism is actually 100% reliable, > It isn't. Here's a test case. Very interesting. > ... Which then allowed me to > construct the example above, where there are two possible TID quals > and the logic in tidpath.c latches onto the wrong one. Hmm. Without having traced through it, I'm betting that the CurrentOfExpr qual is rejected as a tidqual because it's not considered leakproof. It's not obvious to me why we couldn't consider it as leakproof, though. If we don't want to do that in general, then we need some kind of hack in TidQualFromRestrictInfo to accept CurrentOfExpr quals anyway. In general I think you're right that something less rickety than the disable_cost hack would be a good idea to ensure the desired TidPath gets chosen, but this problem is not the fault of that. We're not making the TidPath with the correct contents in the first place. regards, tom lane
I wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> ... Which then allowed me to >> construct the example above, where there are two possible TID quals >> and the logic in tidpath.c latches onto the wrong one. > Hmm. Without having traced through it, I'm betting that the > CurrentOfExpr qual is rejected as a tidqual because it's not > considered leakproof. Nah, I'm wrong: we do treat it as leakproof, and the comment about that in contain_leaked_vars_walker shows that the interaction with RLS quals *was* thought about. What wasn't thought about was the possibility of RLS quals that themselves could be usable as tidquals, which breaks this assumption in TidQualFromRestrictInfoList: * Stop as soon as we find any usable CTID condition. In theory we * could get CTID equality conditions from different AND'ed clauses, * in which case we could try to pick the most efficient one. In * practice, such usage seems very unlikely, so we don't bother; we * just exit as soon as we find the first candidate. The executor doesn't seem to be prepared to cope with multiple AND'ed TID clauses (only OR'ed ones). So we need to fix this at least to the extent of looking for a CurrentOfExpr qual, and preferring that over anything else. I'm also now wondering about this assumption in the executor: /* CurrentOfExpr could never appear OR'd with something else */ Assert(list_length(tidstate->tss_tidexprs) == 1 || !tidstate->tss_isCurrentOf); It still seems OK, because anything that might come in from RLS quals would be AND'ed not OR'ed with the CurrentOfExpr. > In general I think you're right that something less rickety than > the disable_cost hack would be a good idea to ensure the desired > TidPath gets chosen, but this problem is not the fault of that. > We're not making the TidPath with the correct contents in the first > place. Still true. regards, tom lane
On Mon, May 6, 2024 at 3:26 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Nah, I'm wrong: we do treat it as leakproof, and the comment about > that in contain_leaked_vars_walker shows that the interaction with > RLS quals *was* thought about. What wasn't thought about was the > possibility of RLS quals that themselves could be usable as tidquals, > which breaks this assumption in TidQualFromRestrictInfoList: > > * Stop as soon as we find any usable CTID condition. In theory we > * could get CTID equality conditions from different AND'ed clauses, > * in which case we could try to pick the most efficient one. In > * practice, such usage seems very unlikely, so we don't bother; we > * just exit as soon as we find the first candidate. Right. I had noticed this but didn't spell it out. > The executor doesn't seem to be prepared to cope with multiple AND'ed > TID clauses (only OR'ed ones). So we need to fix this at least to the > extent of looking for a CurrentOfExpr qual, and preferring that over > anything else. > > I'm also now wondering about this assumption in the executor: > > /* CurrentOfExpr could never appear OR'd with something else */ > Assert(list_length(tidstate->tss_tidexprs) == 1 || > !tidstate->tss_isCurrentOf); > > It still seems OK, because anything that might come in from RLS quals > would be AND'ed not OR'ed with the CurrentOfExpr. This stuff I had not noticed. > > In general I think you're right that something less rickety than > > the disable_cost hack would be a good idea to ensure the desired > > TidPath gets chosen, but this problem is not the fault of that. > > We're not making the TidPath with the correct contents in the first > > place. > > Still true. I'll look into this, unless you want to do it. Incidentally, another thing I just noticed is that IsCurrentOfClause()'s test for (node->cvarno == rel->relid) is possibly dead code. At least, there are no examples in our test suite where it fails to hold. Which seems like it makes sense, because if it didn't, then how did the clause end up in baserestrictinfo? Maybe this is worth keeping as defensive coding, or maybe it should be changed to an Assert or something. -- Robert Haas EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes: > I'll look into this, unless you want to do it. I have a draft patch already. Need to add a test case. > Incidentally, another thing I just noticed is that > IsCurrentOfClause()'s test for (node->cvarno == rel->relid) is > possibly dead code. At least, there are no examples in our test suite > where it fails to hold. Which seems like it makes sense, because if it > didn't, then how did the clause end up in baserestrictinfo? Maybe this > is worth keeping as defensive coding, or maybe it should be changed to > an Assert or something. I wouldn't remove it, but maybe an Assert is good enough. The tests on Vars' varno should be equally pointless no? regards, tom lane
On Mon, May 6, 2024 at 8:27 AM Robert Haas <robertmhaas@gmail.com> wrote: > Stepping back a bit, my current view of this area is: disable_cost is > highly imperfect both as an idea and as implemented in PostgreSQL. > Although I'm discovering that the current implementation gets more > things right than I had realized, it also sometimes gets things wrong. > The original poster gave an example of that, and there are others. > Furthermore, the current implementation has some weird > inconsistencies. Therefore, I would like something better. FWIW I always found those weird inconsistencies to be annoying at best, and confusing at worst. I speak as somebody that uses disable_cost a lot. I certainly wouldn't ask anybody to make it a priority for that reason alone -- it's not *that* bad. I've given my opinion on this because it's already under discussion. -- Peter Geoghegan
On Mon, May 6, 2024 at 4:30 PM Peter Geoghegan <pg@bowt.ie> wrote: > FWIW I always found those weird inconsistencies to be annoying at > best, and confusing at worst. I speak as somebody that uses > disable_cost a lot. > > I certainly wouldn't ask anybody to make it a priority for that reason > alone -- it's not *that* bad. I've given my opinion on this because > it's already under discussion. Thanks, it's good to have other perspectives. Here are some patches for discussion. 0001 gets rid of disable_cost as a mechanism for forcing a TID scan plan to be chosen when CurrentOfExpr is present. Instead, it arranges to generate only the valid path when that case occurs, and skip everything else. I think this is a good cleanup, and it doesn't seem totally impossible that it actually prevents a failure in some extreme case. 0002 cleans up the behavior of enable_indexscan and enable_indexonlyscan. Currently, setting enable_indexscan=false adds disable_cost to both the cost of index scans and the cost of index-only scans. I think that's indefensible and, in fact, a bug, although I believe David Rowley disagrees. With this patch, we simply don't generate index scans if enable_indexscan=false, and we don't generate index-only scans if enable_indexonlyscan=false, which seems a lot more consistent to me. However, I did revise one major thing from the patch I posted before, per feedback from David Rowley and also per my own observations: in this version, if enable_indexscan=true and enable_indexonlyscan=false, we'll generate index-scan paths for any cases where, with both set to true, we would have only generated index-only scan paths. That change makes the behavior of this patch a lot more comprehensible and intuitive: the only regression test changes are places where somebody expected that they could disable both index scans and index-only scans by setting enable_indexscan=false. 0003 and 0004 extend the approach of "just don't generate the disabled path" to bitmap scans and gather merge, respectively. I think these are more debatable, mostly because it's not clear how far we can really take this approach. Neither breaks any test cases, and 0003 is closely related to the work done in 0002, which seems like a point in its favor. 0004 was simply the only other case where it was obvious to me that this kind of approach made sense. In my view, it makes most sense to use this kind of approach for planner behaviors that seem like they're sort of optional: like if you don't use gather merge, you can still use gather, and if you don't use index scans, you can still use sequential scans. With all these patches applied, the remaining cases where we rely on disable_cost are: sequential scans sorts hash aggregation all 3 join types hash joins where a bucket holding the inner MCV would exceed hash_mem Sequential scans are clearly a last-ditch method. I find it a bit hard to decide whether hashing or sorting is the default, especially giving the asymmetry between enable_sort - presumptively anywhere - and enable_hashagg - specific to aggregation. As for the join types, it's tempting to consider nested-loop the default type -- it's the only way to satisfy parameterizations, for instance -- but the fact that it's the only method that can't do a full join undermines that position in my book. But, I don't want to pretend like I have all the answers here, either; I'm just sharing some thoughts. -- Robert Haas EDB: http://www.enterprisedb.com