Re: Patch for removng unused targets - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: Patch for removng unused targets |
Date | |
Msg-id | 6543.1375470829@sss.pgh.pa.us Whole thread Raw |
In response to | Re: Patch for removng unused targets ("Etsuro Fujita" <fujita.etsuro@lab.ntt.co.jp>) |
Responses |
Re: Patch for removng unused targets
|
List | pgsql-hackers |
"Etsuro Fujita" <fujita.etsuro@lab.ntt.co.jp> writes: > Thank you for the adjustments and comments! In addition to adding comments to > the function, I've improved the code in the function a little bit. Please find > attached an updated version of the patch. I started looking at this patch (finally). I'm not terribly satisfied with it, because it addresses only a very small part of what we really need to do in this area, and I'm afraid we might end up throwing it away in toto once we make the larger changes needed. I carped about this a bit back in <15642.1354650764@sss.pgh.pa.us>, but to try to fill in some background, consider a query like select expensive_function(x) from t; where, since the DBA is smart, t has an index on expensive_function(x). Ideally we'd just scan the index and return values out of it, without recomputing the expensive_function(). The planner is physically able to produce such a plan, but only in very limited cases, and an even bigger problem is that its cost accounting doesn't recognize the potential savings from not evaluating expensive_function(x); therefore, even if it can generate the right plan, it might discard it in favor of a plan that doesn't use the index. This patch has got that same problem: it makes a useful improvement in the finished plan if that plan is of the right form, but it does nothing to push the planner to produce that form in the first place. Basically these problems stem from the assumption that we can treat all scan/join paths as producing the same "flat" tlist (containing only Vars) and only worry about tlist evaluation at the top level. I think the fix will have to involve recognizing that certain paths can produce some expressions more cheaply than others can, and explicitly including those expressions in the returned tlists in such cases. That's going to be a pretty invasive change. (Of course, the executor already works that way, but the planner has never included such considerations at the Path stage.) Now, the connection to the patch at hand is that if the query is select x,y,z from t order by expensive_function(x); this patch will successfully suppress calculation of the expensive function, *if* we were lucky enough to make the right choice of plan without considering the cost of the function. It's perfectly capable of making the wrong choice though. This will lead to bug reports about "the planner chooses a dumb plan, even though it knows the right plan is cheaper when I force it to choose that one". I think it's possible to revise the patch so that we do take the cost savings into account, at least at the point in grouping_planner where it chooses between the cheapest_path and the sorted_path returned by query_planner. (I'm not sure if there are cases where query_planner would discard the best choice at an earlier stage, but that seems possible in join queries.) But this won't do anything for cases where the expensive function appears in the SELECT list. So as I said, I'm worried that this will be mostly bogus once we address the larger problem. With the larger fix in place, the expensive_function value could come out of the indexscan, and then the resjunk expression would be nothing more than a Var referencing it, and hence hardly worth suppressing. Having said all that, there is one situation where this type of approach might still be useful even after such a fix, and that's KNNGist-style queries: select a,b,c from t order by col <-> constant limit 10; In a KNNGist search, there's no provision for the index AM to return the actual value of the ORDER BY expression, and in fact it's theoretically possible that that value is never even explicitly computed inside the index AM. So we couldn't suppress the useless evaluation of <-> by dint of requiring the physical scan to return that value as a Var. Reading between the lines of the original submission at <CAPpHfdtG5qoHoD+w=Tz3wC3fZ=b8i21=V5xandBFM=DTo-Yg=Q@mail.gmail.com>, I gather that it's the KNNGist-style case that worries you, so maybe it's worth applying this type of patch anyway. I'd want to rejigger it to be aware of the cost implications though, at least for grouping_planner's choices. Comments? regards, tom lane
pgsql-hackers by date:
Previous
From: Stephen FrostDate:
Subject: Re: Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: Proposal for Allow postgresql.conf values to be changed via SQL [review])
Next
From: Tom LaneDate:
Subject: Re: Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: Proposal for Allow postgresql.conf values to be changed via SQL [review])