Thread: [HACKERS] [PATCH] Overestimated filter cost and its mitigation
Hi hackers,
Currently, the cost of a filter with multiple clauses is estimated by
summing up the estimated cost of each clause. As long as a filter
consists of simple clauses and its cost is fairly small, this works
fine. However, when there are some heavy clauses (like subqueries or
user-defined functions) and most tuples are filtered out by other
simple clauses, the total cost is likely to be overestimated. The
attached patch mitigates this overestimation by using selectivity
estimates of subsets of the clauses in a filter.
Suppose that there are three qual clauses A, B, and C in a scan node.
Current PostgreSQL estimates the per-tuple cost of the filter to be:

cost(A) + cost(B) + cost(C)

The basic idea of the attached patch is to weight each clause's cost by
the selectivity of the clauses evaluated before it:

cost(A) + clauselist_selectivity({A}) * cost(B) + clauselist_selectivity({A, B}) * cost(C)
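To illustrate, here is a minimal C sketch of such a weighted cost
computation (illustration only, not the patch text itself; the function
name weighted_filter_cost is invented, while cost_qual_eval_node() and
clauselist_selectivity() are the existing planner routines):

/*
 * Illustrative sketch of the weighted per-tuple filter cost.  The input
 * list is assumed to hold (RestrictInfo) quals already sorted by
 * order_qual_clauses().
 */
static Cost
weighted_filter_cost(PlannerInfo *root, List *ordered_quals)
{
    Cost        total = 0.0;
    List       *evaluated = NIL;    /* quals assumed already applied */
    ListCell   *lc;

    foreach(lc, ordered_quals)
    {
        Node       *qual = (Node *) lfirst(lc);
        QualCost    qcost;
        Selectivity sel = 1.0;

        cost_qual_eval_node(&qcost, qual, root);

        /* weight = fraction of tuples surviving all earlier quals */
        if (evaluated != NIL)
            sel = clauselist_selectivity(root, evaluated, 0,
                                         JOIN_INNER, NULL);

        total += sel * qcost.per_tuple;
        evaluated = lappend(evaluated, qual);
    }
    return total;
}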
We first noticed this overestimation during performance analysis of one
of our real applications. To reproduce the situation, we created a
similar query with pgbench data.
The database was prepared with the following commands:
pgbench --scale=1000 -i pgb5
pgbench -c 10 -t 100000 pgb5
and the query is:
select e.tid,
e.aid,
e.bid,
e.mtime
from pgbench_history e
where e.tid between 1 and 1000
and (select count(*)
from pgbench_history f
where f.mtime < e.mtime
and f.bid = e.bid
group by f.bid) > 100
and e.aid between 1 and 100000;
== Query plan with current upstream ==
Seq Scan on pgbench_history e  (cost=0.00..21393523889.00 rows=28 width=20) (actual time=2391.683..21775.191 rows=85 loops=1)
  Filter: ((tid >= 1) AND (tid <= 1000) AND (aid >= 1) AND (aid <= 100000) AND ((SubPlan 1) > 100))
  Rows Removed by Filter: 999915
  SubPlan 1
    ->  GroupAggregate  (cost=0.00..21393.50 rows=283 width=12) (actual time=229.050..229.050 rows=1 loops=94)
          Group Key: f.bid
          ->  Seq Scan on pgbench_history f  (cost=0.00..21389.00 rows=333 width=4) (actual time=5.036..228.851 rows=529 loops=94)
                Filter: ((mtime < e.mtime) AND (bid = e.bid))
                Rows Removed by Filter: 999471
Most of the total cost of 21,393,523,889 comes from:

(cost of SubPlan 1) * (# of rows in pgbench_history) = 21,393.50 * 1,000,000 = 21,393,500,000

but in the actual run, SubPlan 1 was executed only 94 times, so the
total cost was overestimated by a factor of more than 10,000.
== Query plan with patched upstream ==
Seq Scan on pgbench_history e  (cost=0.00..1687169.88 rows=28 width=20) (actual time=2388.802..21721.498 rows=85 loops=1)
  Filter: ((tid >= 1) AND (tid <= 1000) AND (aid >= 1) AND (aid <= 100000) AND ((SubPlan 1) > 100))
  Rows Removed by Filter: 999915
  SubPlan 1
    ->  GroupAggregate  (cost=0.00..19726.83 rows=283 width=12) (actual time=228.507..228.507 rows=1 loops=94)
          Group Key: f.bid
          ->  Seq Scan on pgbench_history f  (cost=0.00..19722.33 rows=333 width=4) (actual time=5.378..228.316 rows=529 loops=94)
                Filter: ((mtime < e.mtime) AND (bid = e.bid))
                Rows Removed by Filter: 999471
With the patched version, the total cost is greatly decreased. In cost
estimation, only 84.4 tuples of pgbench_history are expected to be
evaluated against SubPlan 1. This is reasonably close to the actual
count of 94, and the total cost seems much more reasonable than with
current upstream.
Any thoughts?
Regards,
Yuto Hayamizu
On Mon, Sep 11, 2017 at 04:43:46PM +0900, Yuto Hayamizu wrote:
> Hi hackers,
>
> Currently, the cost of a filter with multiple clauses is estimated by
> summing up the estimated cost of each clause. As long as a filter
> consists of simple clauses and its cost is fairly small, this works
> fine. However, when there are some heavy clauses (like subqueries or
> user-defined functions) and most tuples are filtered out by other
> simple clauses, the total cost is likely to be overestimated. The
> attached patch mitigates this overestimation by using selectivity
> estimates of subsets of the clauses in a filter.

I've taken the liberty of adding this to the upcoming commitfest.

Best,
David.

--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
On Mon, Sep 11, 2017 at 7:43 PM, Yuto Hayamizu <y.hayamizu@gmail.com> wrote:
> Suppose that there are three qual clauses A, B, and C in a scan node.
> Current PostgreSQL estimates the per-tuple cost of the filter to be:
>
> cost(A) + cost(B) + cost(C)
>
> The basic idea of the attached patch is:
>
> cost(A) + clauselist_selectivity({A}) * cost(B) +
> clauselist_selectivity({A, B}) * cost(C)

I am no planner expert and I haven't tested or studied the patch in
detail, but here is some feedback for what it's worth.

This idea seems to make intuitive sense. I see that you use
order_qual_clauses() to know what order they'll run in, so I'm
wondering if there is any reason we shouldn't do it up front and keep
it during path building, instead of running it again at plan creation
time. Is there some way it could ever produce a different result at
the two times?

Why not also apply this logic to the qpquals of joins, foreign scans,
and subplans? That is, instead of replacing cost_qual_eval with this
code for baserels, I wonder if we should teach cost_qual_eval how to do
this so those other users could also benefit (having perhaps already
ordered the qual clauses earlier).

This is one of those patches that risks having an impact on many query
plans. Yikes. Of all the regression test queries, only updatable_views
complained though, and that involves leakproof functions. I see that
those get some kind of special treatment in order_qual_clauses().

+ ordered_clauses = order_qual_clauses(root, rel->baserestrictinfo);
+ clause_list = ordered_clauses;

Is clause_list necessary? Can't you just use ordered_clauses for the
outer and inner loops?

+ List *clause_list_for_sel = NULL;

The convention is to use NIL for empty lists (a nod to the Lisp machine
they prototyped this project on).

+ /* Make a temporary clause list for selectivity calcuation */

s/calcuation/calculation/

--
Thomas Munro
http://www.enterprisedb.com
On Mon, Nov 6, 2017 at 10:01 AM, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
>
> This idea seems to make intuitive sense. I see that you use
> order_qual_clauses() to know what order they'll run in, so I'm
> wondering if there is any reason we shouldn't do it up front and keep
> it during path building, instead of running it again at plan creation
> time. Is there some way it could ever produce a different result at
> the two times?

IIRC, the only thing that changes between plan-time quals and
execution-time quals is constant folding of constant parameters. But I
don't think we change the selectivity estimates when that's done. At
the same time, I don't think we should make a lot of effort to make
sure that the order used during estimation is the same as the order at
execution; we are estimating anyway. There can always be some
difference between what's estimated and what actually happens.

> Why not also apply this logic to the qpquals of joins, foreign scans,
> and subplans? That is, instead of replacing cost_qual_eval with this
> code for baserels, I wonder if we should teach cost_qual_eval how to
> do this so those other users could also benefit (having perhaps
> already ordered the qual clauses earlier).

+1.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
On Mon, Nov 6, 2017 at 5:19 AM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:
> IIRC, the only thing that changes between plan-time quals and
> execution-time quals is constant folding of constant parameters. But I
> don't think we change the selectivity estimates when that's done. At
> the same time, I don't think we should make a lot of effort to make
> sure that the order used during estimation is the same as the order at
> execution; we are estimating anyway. There can always be some
> difference between what's estimated and what actually happens.

I don't think that's a good justification for allowing the order to
vary. It's true that, at execution time, we may find row counts and
selectivities to be different than what we predicted, but that is a
case of the real data being different from our idealized data -- which
is difficult to avoid in general. Allowing our actual decisions to be
different from the decisions we thought we would make seems like
programmer sloppiness. It would also be very confusing if the planner
uses one ordering to estimate the cost and then a different ordering at
execution time and in the EXPLAIN ANALYZE output. How could anybody
understand what had happened in such a case?

I think it would be a good idea, as Thomas says, to order the qual
clauses at an earlier stage and then remember our decision. However, we
have to think about whether that's going to increase planning time in a
noticeable way. I wonder why we currently postpone this until such a
late phase of planning.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> I think it would be a good idea, as Thomas says, to order the qual
> clauses at an earlier stage and then remember our decision. However,
> we have to think about whether that's going to increase planning time
> in a noticeable way. I wonder why we currently postpone this until
> such a late phase of planning.

Because (1) up to now there's been no need to consider the qual
ordering till later, and (2) re-doing that sort for each path seemed
unduly expensive. If we're to try to estimate whether later quals will
be reached, then sure, the ordering becomes important. I'm still
concerned about (2) though. If there were a way to pre-sort the quals
once for all paths, the problem goes away, but I don't think that works
--- indexscans may segregate some quals, and in join cases different
paths will actually have different sets of quals they need to check
depending on the join order.

So the bottom line here is how much is it going to cost us to add this
additional refinement in cost estimation, and is it worth it given our
extremely poor level of accuracy in expression cost estimation. Color
me dubious.

regards, tom lane
On Wed, Nov 8, 2017 at 3:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I think it would be a good idea, as Thomas says, to order the qual
>> clauses at an earlier stage and then remember our decision. However,
>> we have to think about whether that's going to increase planning time
>> in a noticeable way. I wonder why we currently postpone this until
>> such a late phase of planning.
>
> Because (1) up to now there's been no need to consider the qual
> ordering till later, and (2) re-doing that sort for each path seemed
> unduly expensive. If we're to try to estimate whether later quals will
> be reached, then sure, the ordering becomes important. I'm still
> concerned about (2) though. If there were a way to pre-sort the quals
> once for all paths, the problem goes away, but I don't think that
> works --- indexscans may segregate some quals, and in join cases
> different paths will actually have different sets of quals they need
> to check depending on the join order.

Looking at order_qual_clauses(), we can say that a set of quals q1 ...
qn is ordered the same irrespective of the larger set of clauses it is
a subset of. E.g. if {q1 .. qn} is a subset of Q (ordered as Qo) and
also of Q' (ordered as Q'o), the order in which the quals appear in Qo
and Q'o is the same. So, even if different paths segregate the clauses
into different sets, within each set the order is the same. FWIW, we
could order all clauses in the largest set once and use that order
every time. Albeit we will have to remember the order somewhere OR make
the separator routine retain the order of the larger set, which I guess
is true about all separator functions.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
On Thu, Nov 9, 2017 at 12:33 PM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:
> Looking at order_qual_clauses(), we can say that a set of quals q1 ...
> qn is ordered the same irrespective of the larger set of clauses it is
> a subset of. E.g. if {q1 .. qn} is a subset of Q (ordered as Qo) and
> also of Q' (ordered as Q'o), the order in which the quals appear in Qo
> and Q'o is the same. So, even if different paths segregate the clauses
> into different sets, within each set the order is the same. FWIW, we
> could order all clauses in the largest set once and use that order
> every time. Albeit we will have to remember the order somewhere OR
> make the separator routine retain the order of the larger set, which I
> guess is true about all separator functions.

I am not sure what to think about this patch, so I have moved it to the
next CF. The patch still applies. Hayamizu-san, it would be nice as
well if you could review others' patches: one patch reviewed for one
patch submitted, of equal difficulty. You should also get a community
account so that it is possible to add your name as an author of this
patch.

--
Michael
Greetings,

* Michael Paquier (michael.paquier@gmail.com) wrote:
> On Thu, Nov 9, 2017 at 12:33 PM, Ashutosh Bapat
> <ashutosh.bapat@enterprisedb.com> wrote:
> > Looking at order_qual_clauses(), we can say that a set of quals q1
> > ... qn is ordered the same irrespective of the larger set of clauses
> > it is a subset of. E.g. if {q1 .. qn} is a subset of Q (ordered as
> > Qo) and also of Q' (ordered as Q'o), the order in which the quals
> > appear in Qo and Q'o is the same. So, even if different paths
> > segregate the clauses into different sets, within each set the order
> > is the same. FWIW, we could order all clauses in the largest set
> > once and use that order every time. Albeit we will have to remember
> > the order somewhere OR make the separator routine retain the order
> > of the larger set, which I guess is true about all separator
> > functions.
>
> I am not sure what to think about this patch, so I have moved it to
> the next CF. The patch still applies. Hayamizu-san, it would be nice
> as well if you could review others' patches: one patch reviewed for
> one patch submitted, of equal difficulty. You should also get a
> community account so that it is possible to add your name as an author
> of this patch.

While this patch does still apply, it doesn't pass the 'make check'
regression tests, and there appears to be concern raised up-thread
about the performance impact.

Yuto Hayamizu, it looks like some good questions were raised which need
to be addressed, in addition to making sure that the patch at least
passes 'make check', so I'm leaving this as waiting-for-author. If you
believe the patch is no longer viable, we can change it to 'returned
with feedback'; otherwise, an updated patch and some responses to the
questions raised would be great, as we're already a week into this
commitfest and this thread has been dormant since near the start of the
last CF.

Thanks!

Stephen
On Sun, Jan 7, 2018 at 8:25 AM, Stephen Frost <sfrost@snowman.net> wrote:
>
> Greetings,
>
> * Michael Paquier (michael.paquier@gmail.com) wrote:
> > On Thu, Nov 9, 2017 at 12:33 PM, Ashutosh Bapat
> > <ashutosh.bapat@enterprisedb.com> wrote:
> > > Looking at order_qual_clauses(), we can say that a set of quals q1
> > > ... qn is ordered the same irrespective of the larger set of
> > > clauses it is a subset of. E.g. if {q1 .. qn} is a subset of Q
> > > (ordered as Qo) and also of Q' (ordered as Q'o), the order in
> > > which the quals appear in Qo and Q'o is the same. So, even if
> > > different paths segregate the clauses into different sets, within
> > > each set the order is the same. FWIW, we could order all clauses
> > > in the largest set once and use that order every time. Albeit we
> > > will have to remember the order somewhere OR make the separator
> > > routine retain the order of the larger set, which I guess is true
> > > about all separator functions.
> >
> > I am not sure what to think about this patch, so I have moved it to
> > the next CF. The patch still applies. Hayamizu-san, it would be nice
> > as well if you could review others' patches: one patch reviewed for
> > one patch submitted, of equal difficulty. You should also get a
> > community account so that it is possible to add your name as an
> > author of this patch.
>
> While this patch does still apply, it doesn't pass the 'make check'
> regression tests, and there appears to be concern raised up-thread
> about the performance impact.
>
> Yuto Hayamizu, it looks like some good questions were raised which
> need to be addressed, in addition to making sure that the patch at
> least passes 'make check', so I'm leaving this as waiting-for-author.
> If you believe the patch is no longer viable, we can change it to
> 'returned with feedback'; otherwise, an updated patch and some
> responses to the questions raised would be great, as we're already a
> week into this commitfest and this thread has been dormant since near
> the start of the last CF.
>
> Thanks!
>
> Stephen

Thank you for your kind guidance, and sorry for the late reply.

The attached patch fixes the regression tests and now passes 'make
check'. Since this patch changes cost estimation in
set_baserel_size_estimates, the chosen query plans changed for some
tests, so I've updated their expected EXPLAIN results.

I'm going to answer the raised questions by replying to each message
soon.

regards,
Yuto Hayamizu
On Wed, Nov 8, 2017 at 6:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Because (1) up to now there's been no need to consider the qual
> ordering till later, and (2) re-doing that sort for each path seemed
> unduly expensive. If we're to try to estimate whether later quals will
> be reached, then sure, the ordering becomes important. I'm still
> concerned about (2) though.

This patch changes 'set_baserel_size_estimates', which is called only
once for each range table entry, so the sorting does not happen for
each path and its negative impact on optimizer performance is
negligible.

While sorting quals does not cost much, I noticed another performance
problem with this patch. The patched 'set_baserel_size_estimates' is
O(N^2) (N is the number of quals) and can take a long time when a query
has a large qual list. After sorting quals, it loops over quals q_1,
q_2, ..., q_N to estimate the "weighted" cost of each qual q_k. In each
iteration, in order to calculate the weight of q_k, it calls
clauselist_selectivity({q_1, q_2, ..., q_k}), which itself loops k
times.

I've checked the actual negative impact on optimization time with an
artificial query:

SELECT count(*) FROM pgbench_accounts WHERE (randomly generated N quals);

N=50: 0.5966 msec (original), 0.938 msec (patched)
N=100: 1.1992 msec (original), 1.5396 msec (patched)
N=200: 1.9167 msec (original), 4.6676 msec (patched)
N=400: 2.7583 msec (original), 12.0786 msec (patched)
N=800: 5.2919 msec (original), 38.0228 msec (patched)
N=1600: 9.3428 msec (original), 167.737 msec (patched)

From this result, the patched version clearly behaves as O(N^2). A
hundred quals in a query seems an unusually large number, but there may
exist some real-world queries with hundreds or thousands of quals, so I
feel this patch needs improvement to handle large N.

My idea for improving this patch is to pick a threshold N_limit; for
q_1 ... q_{N_limit}, do the same weighted cost estimation as in the
current version of the patch. For q_{N_limit+1} onwards, stop calling
clauselist_selectivity for calculating the weight and reuse the result
of clauselist_selectivity({q_1, q_2, ..., q_{N_limit}}) (a sketch
follows below). For example, if N_limit=100, the additional overhead is
only sub-milliseconds per range table entry, and the cost estimate is
surely better than with the current PostgreSQL implementation.

Any thoughts?
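For concreteness, the capped variant of the earlier sketch might look
like this (again illustration only, not the posted patch;
QUAL_COST_SEL_LIMIT is a hypothetical constant):

/*
 * Sketch of the capped weighting: recompute the running selectivity
 * only for the first QUAL_COST_SEL_LIMIT quals; beyond that, reuse the
 * last computed value, so the extra work is bounded regardless of N.
 */
#define QUAL_COST_SEL_LIMIT 100     /* hypothetical threshold */

static Cost
weighted_filter_cost_capped(PlannerInfo *root, List *ordered_quals)
{
    Cost        total = 0.0;
    Selectivity sel = 1.0;          /* selectivity of quals seen so far */
    List       *evaluated = NIL;
    ListCell   *lc;

    foreach(lc, ordered_quals)
    {
        Node       *qual = (Node *) lfirst(lc);
        QualCost    qcost;

        cost_qual_eval_node(&qcost, qual, root);
        total += sel * qcost.per_tuple; /* weight by earlier quals */

        evaluated = lappend(evaluated, qual);
        if (list_length(evaluated) <= QUAL_COST_SEL_LIMIT)
            sel = clauselist_selectivity(root, evaluated, 0,
                                         JOIN_INNER, NULL);
        /* else: keep the N_limit-qual selectivity for remaining quals */
    }
    return total;
}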
On Thu, Nov 9, 2017 at 12:33 PM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:
> clauses into different sets, within each set the order is the same.
> FWIW, we could order all clauses in the largest set once and use that
> order every time. Albeit we will have to remember the order somewhere
> OR make the separator routine retain the order of the larger set,
> which I guess is true about all separator functions.

For this patch, sorting of a qual list happens only once for each range
table entry, not for each path. So there is no need to cache sorted
qual lists as far as I know.

----
regards,
Yuto Hayamizu
On Fri, Jan 19, 2018 at 5:07 PM, Yuto Hayamizu <y.hayamizu@gmail.com> wrote:
> My idea for improving this patch is to pick a threshold N_limit; for
> q_1 ... q_{N_limit}, do the same weighted cost estimation as in the
> current version of the patch. For q_{N_limit+1} onwards, stop calling
> clauselist_selectivity for calculating the weight and reuse the result
> of clauselist_selectivity({q_1, q_2, ..., q_{N_limit}}). For example,
> if N_limit=100, the additional overhead is only sub-milliseconds per
> range table entry, and the cost estimate is surely better than with
> the current PostgreSQL implementation.

The attached patch implements the improvement idea above. With this
patch, the performance degradation of the test query with many quals
was <1%. An example test query is attached.

regards,
----
Yuto Hayamizu
On Mon, Jan 29, 2018 at 10:42 AM, Yuto Hayamizu <y.hayamizu@gmail.com> wrote:
> On Fri, Jan 19, 2018 at 5:07 PM, Yuto Hayamizu <y.hayamizu@gmail.com> wrote:
>> My idea for improving this patch is to pick a threshold N_limit; for
>> q_1 ... q_{N_limit}, do the same weighted cost estimation as in the
>> current version of the patch. For q_{N_limit+1} onwards, stop calling
>> clauselist_selectivity for calculating the weight and reuse the
>> result of clauselist_selectivity({q_1, q_2, ..., q_{N_limit}}). For
>> example, if N_limit=100, the additional overhead is only
>> sub-milliseconds per range table entry, and the cost estimate is
>> surely better than with the current PostgreSQL implementation.
>
> The attached patch implements the improvement idea above. With this
> patch, the performance degradation of the test query with many quals
> was <1%. An example test query is attached.

I looked at the patch and the idea sounds reasonable to me. But I still
have doubts about the usefulness of the patch. What this patch does is
produce more accurate costs for a scan of a single relation. That's
good, but it does so for all the paths created for that relation. So
the more accurate estimate doesn't help us choose one method of
scanning the relation over another. It also does not affect the costs
of different join paths, sort paths, etc. IOW, I cannot imagine a query
which would be planned differently than with upstream when we apply the
patch, with the new plan being more efficient than the previous one. I
might be missing something, though. Can you please provide such a
query?

In the patch, clauselist_selectivity() gets called repeatedly for every
new qual added to the clause list. Instead, if we try to combine the
cost/row estimation with order_qual_clauses() or
clauselist_selectivity(), we might be able to do what the current patch
does in O(n). clauselist_selectivity() calculates the combined
selectivity of all the given quals in O(n). We should do something
similar to that in this case.

As suggested upthread by Tom, it will be more useful to have this
feature work with not just scans but joins as well.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
On 26 July 2018 at 07:12, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:
> In the patch, clauselist_selectivity() gets called repeatedly for
> every new qual added to the clause list. Instead, if we try to combine
> the cost/row estimation with order_qual_clauses() or
> clauselist_selectivity(), we might be able to do what the current
> patch does in O(n). clauselist_selectivity() calculates the combined
> selectivity of all the given quals in O(n). We should do something
> similar to that in this case.

Duplicating the logic of clauselist_selectivity() seems like quite a
lot of effort to go to just for improved filter cost estimates. Note
also that clauselist_selectivity() might get a good deal more complex
with multivariate stats.

Perhaps a reasonable simplification would be to just treat the clauses
as independent when estimating the filter cost, and so use the
following as a "good enough" estimate for the filter cost:

cost(A) + sel(A) * cost(B) + sel(A) * sel(B) * cost(C) + ...

That would probably be an improvement over what we currently have. It
would be O(n) to compute, and it would probably use the cached
selectivity estimates for the clauses.

Note also that with this simplified expression for the filter cost, it
would be possible to improve order_qual_clauses() to take the clause
selectivities into account when sorting the clauses, and minimise the
cost of the filter by evaluating more selective clauses sooner, if
they're not too expensive. I believe that amounts to sorting by
cost/(1-sel) rather than just cost, except for clauses with sel==1,
which it makes sense to move to the end and just sort by cost.

Regards,
Dean
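For concreteness, a sketch of the O(n) computation Dean describes
(illustrative only; the function name is invented, while
clause_selectivity() is the existing routine, which caches its
per-clause result on the RestrictInfo node, so the repeated calls would
be cheap):

/*
 * Sketch of the simplified filter cost, treating clauses as independent:
 *   cost(A) + sel(A)*cost(B) + sel(A)*sel(B)*cost(C) + ...
 * Runs in O(n) since no prefix-list selectivities are recomputed.
 */
static Cost
independent_filter_cost(PlannerInfo *root, List *ordered_quals)
{
    Cost        total = 0.0;
    Selectivity running = 1.0;  /* product of earlier clause selectivities */
    ListCell   *lc;

    foreach(lc, ordered_quals)
    {
        Node       *qual = (Node *) lfirst(lc);
        QualCost    qcost;

        cost_qual_eval_node(&qcost, qual, root);
        total += running * qcost.per_tuple;

        /* independence assumption: multiply in this clause's selectivity */
        running *= clause_selectivity(root, qual, 0, JOIN_INNER, NULL);
    }
    return total;
}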
On Thu, Jul 26, 2018 at 10:30 PM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> On 26 July 2018 at 07:12, Ashutosh Bapat
> <ashutosh.bapat@enterprisedb.com> wrote:
>> In the patch, clauselist_selectivity() gets called repeatedly for
>> every new qual added to the clause list. Instead, if we try to
>> combine the cost/row estimation with order_qual_clauses() or
>> clauselist_selectivity(), we might be able to do what the current
>> patch does in O(n). clauselist_selectivity() calculates the combined
>> selectivity of all the given quals in O(n). We should do something
>> similar to that in this case.
>
> Duplicating the logic of clauselist_selectivity() seems like quite a
> lot of effort to go to just for improved filter cost estimates. Note
> also that clauselist_selectivity() might get a good deal more complex
> with multivariate stats.

I am not suggesting duplicating the code in clauselist_selectivity().
Instead, I am suggesting that we incorporate costing into that
function. I don't know how feasible that is.

> Perhaps a reasonable simplification would be to just treat the clauses
> as independent when estimating the filter cost, and so use the
> following as a "good enough" estimate for the filter cost:
>
> cost(A) + sel(A) * cost(B) + sel(A) * sel(B) * cost(C) + ...
>
> That would probably be an improvement over what we currently have. It
> would be O(n) to compute, and it would probably use the cached
> selectivity estimates for the clauses.

That looks like a good trade-off. But looking at the following comment
in clause_selectivity(), I doubt this will work in all cases:

/*
 * If possible, cache the result of the selectivity calculation for
 * the clause. We can cache if varRelid is zero or the clause
 * contains only vars of that relid --- otherwise varRelid will affect
 * the result, so mustn't cache. Outer join quals might be examined
 * with either their join's actual jointype or JOIN_INNER, so we need
 * two cache variables to remember both cases. Note: we assume the
 * result won't change if we are switching the input relations or
 * considering a unique-ified case, so we only need one cache variable
 * for all non-JOIN_INNER cases.
 */

> Note also that with this simplified expression for the filter cost, it
> would be possible to improve order_qual_clauses() to take the clause
> selectivities into account when sorting the clauses, and minimise the
> cost of the filter by evaluating more selective clauses sooner, if
> they're not too expensive. I believe that amounts to sorting by
> cost/(1-sel) rather than just cost, except for clauses with sel==1,
> which it makes sense to move to the end and just sort by cost.

+1, if we could do that.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
On Fri, Jul 27, 2018 at 02:19:27PM +0530, Ashutosh Bapat wrote:
> [ removal of latest arguments ]
> +1, if we could do that.

The patch seems to have gotten stuck, so I am marking it as returned
with feedback because of inactivity.

--
Michael