Re: Inheritance planner CPU and memory usage change since 9.3.2 - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Inheritance planner CPU and memory usage change since 9.3.2 |
Date | |
Msg-id | CA+TgmoZEQs2wHbhz-5AXF7A2Pk+SUajy=7LSmVBwBPS8JBTVPQ@mail.gmail.com Whole thread Raw |
In response to | Re: Inheritance planner CPU and memory usage change since 9.3.2 (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Inheritance planner CPU and memory usage change since 9.3.2
(Tom Lane <tgl@sss.pgh.pa.us>)
|
List | pgsql-hackers |
On Thu, Jun 18, 2015 at 3:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Dean Rasheed <dean.a.rasheed@gmail.com> writes: >> On 18 June 2015 at 14:48, Robert Haas <robertmhaas@gmail.com> wrote: >>> I feel I might be missing a trick here. It seems unlikely to me that >>> we actually need the entire append_rel_list for every subquery; and we >>> almost certainly don't need to modify every element of the >>> append_rel_list for every subquery. Even the ones that no >>> ChangeVarNodes() call mutates still get deep-copied. > >> Yeah, you could probably pre-compute the indexes of the RTEs that need >> to copied, outside of the big loop, and store them in a bitmapset. >> Then, instead of copying the entire list of rowmarks/append_rel_infos >> each time, you could just copy the ones that referred to those RTE >> indexes (and only if the bitmapset was non-empty, which is the >> equivalent of your second optimisation). However, for AppendRelInfos, >> ChangeVarNodes() descends into the Vars in the translated_vars list, >> so short-cutting the copying of the AppendRelInfo isn't obviously >> safe. But, looking more closely, does ChangeVarNodes actually need to >> examine translated_vars (the fall-through case) when child_relid isn't >> the old rt_index? If not, that could be a big saving in cases like >> this. > > I'm a bit surprised that duplicating the append_rel_list is a noticeable > performance problem. It ought to be far smaller than the Query tree that > we've always duplicated in this loop --- in particular, it's really a > subset of what we have in the RTE list, no? Well, append_rel_list has an AppendRelInfo for every RTE and that contains a List (translated_vars) which in turn contains a Var node for every column. I'm not sure how that compares to the RTE itself. I think it's the cost of copying the translated_vars list that is really the problem here - you can have 200 or 500 columns in a table, so that's a Var and a ListCell for each one. In the problem cases, the number of AppendRelInfo elements is a small percentage of the number of Var nodes under them. That having been said, I don't know how the size of all that compares to the size of the Query. But I think the Query must be smaller, because arranging to discard the AppendRelInfo and its translated_vars list, and the per-subroot rowMarks list, after every call to grouping_planner stops the memory blowup. The whole translated_vars representation seems needless inefficient. For a subquery, you probably need something like that. But for an inheritance child, you just need to swap the varno and maybe remap some varattnos. Indexing into a C array to grab the varattno seems like it would be a heck of a lot more efficient than calling list_nth(). It might be worth making AppendRelInfo support a choice of representations so that we can use something more optimized for the simple case while not losing the full generality when we need it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: