Re: JIT compilation per plan node - Mailing list pgsql-hackers

From: Tomas Vondra
Subject: Re: JIT compilation per plan node
Date:
Msg-id: 05b3dc64-8d30-477a-96e4-afa348ac0782@enterprisedb.com
In response to: JIT compilation per plan node (Melih Mutlu <m.melihmutlu@gmail.com>)
Responses: Re: JIT compilation per plan node
List: pgsql-hackers
Hi Melih,

On 1/2/24 20:50, Melih Mutlu wrote:
> Hi hackers,
>
> After discussing this with David offlist, I decided to reinitiate this
> discussion that has already been raised and discussed several times in the
> past. [1] [2]
>
> Currently, if JIT is enabled, the decision for JIT compilation is purely
> tied to the total cost of the query. The number of expressions to be JIT
> compiled is not taken into consideration, however the time spent JITing
> also depends on that number. This may cause the cost of JITing to become
> too much that it hurts rather than improves anything.
> An example case would be that you have many partitions and run a query that
> touches one, or only a few, of those partitions. If there is no partition
> pruning done in planning time, all 1000 partitions are JIT compiled while
> most will not even be executed.
>
> Proposed patch (based on the patch from [1]) simply changes consideration
> of JIT from plan level to per-plan-node level. Instead of depending on the
> total cost, we decide whether to perform JIT on a node or not by
> considering only that node's cost. This allows us to only JIT compile plan
> nodes with high costs.
>
> Here is a small test case to see the issue and the benefit of the patch:
>
> CREATE TABLE listp(a int, b int) PARTITION BY LIST(a);
> SELECT 'CREATE TABLE listp'|| x || ' PARTITION OF listp FOR VALUES IN
> ('||x||');' FROM generate_Series(1,1000) x; \gexec
> INSERT INTO listp SELECT 1,x FROM generate_series(1,10000000) x;
>
> EXPLAIN (VERBOSE, ANALYZE) SELECT COUNT(*) FROM listp WHERE b < 0;
>
> master jit=off:
> Planning Time: 25.113 ms
> Execution Time: 315.896 ms
>
> master jit=on:
> Planning Time: 24.664 ms
> JIT:
> Functions: 9008
> Options: Inlining false, Optimization false, Expressions true, Deforming
> true
> Timing: Generation 290.705 ms (Deform 108.274 ms), Inlining 0.000 ms,
> Optimization 165.991 ms, Emission 3232.775 ms, Total 3689.472 ms
> Execution Time: 1612.817 ms
>
> patch jit=on:
> Planning Time: 24.055 ms
> JIT:
> Functions: 17
> Options: Inlining false, Optimization false, Expressions true, Deforming
> true
> Timing: Generation 1.463 ms (Deform 0.232 ms), Inlining 0.000 ms,
> Optimization 0.766 ms, Emission 11.609 ms, Total 13.837 ms
> Execution Time: 299.721 ms
>

Thanks for the updated / refreshed patch.

I think one of the main challenges this patch faces is that there are a
couple of old threads with previous attempts, and the current thread simply
builds on top of them without explaining things fully. But people either
don't realize that, or don't have time to read the old threads just in case,
so they can't follow some of the decisions :-(

I think it'd be good to maybe try to explain some of the problems and
solutions more thoroughly, or at least point to the relevant places in the
old threads ...

>
> A bit more on what this patch does:
> - It introduces a new context to keep track of the number of estimated
> calls and if JIT is decided for each node that the context applies.

AFAIK this is an attempt to deal with passing the necessary information
while constructing the plan, which David originally tried [1] doing by
passing est_calls during create_plan ...

I doubt CreatePlanContext is a great way to achieve this. For one, it breaks
the long-standing custom that PlannerInfo is the first parameter, usually
followed by RelOptInfo, etc. CreatePlanContext is added to some functions
(but not all), which makes it ... unpredictable.

FWIW it's not clear to me if/how this solves the problem with early
create_plan() calls for subplans. Or is it still the same?

Wouldn't it be simpler to just build the plan as we do now, and then have an
expression_tree_walker that walks the complete plan top-down, inspects the
nodes, enables JIT where appropriate and so on? That can have arbitrary
context, no problem with that. Considering we decide JIT pretty late anyway
(long after costing and other stuff that might affect the plan selection),
the result should be exactly the same, without the extensive createplan.c
disruption ...

(usual caveat: I haven't tried, maybe there's something that means this
can't work)
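Something along these lines is what I have in mind - a very rough, untested
sketch, hand-rolled recursion rather than an actual *_tree_walker. The "jit"
flag on Plan is made up for illustration (the patch presumably adds some
field like it); total_cost, plan_rows, lefttree, righttree and the
jit_above_cost GUC are the existing ones. Node types with extra children
(Append, MergeAppend, SubqueryScan, subplans, ...) are not handled here and
would need the same recursion:

    static void
    jit_decide_walker(Plan *plan, double est_calls)
    {
        if (plan == NULL)
            return;

        /* per-node decision: cost of one execution times expected executions */
        plan->jit = (plan->total_cost * est_calls > jit_above_cost);

        if (IsA(plan, NestLoop))
        {
            /* the inner side gets rescanned roughly once per outer row */
            jit_decide_walker(plan->lefttree, est_calls);
            jit_decide_walker(plan->righttree,
                              est_calls * Max(plan->lefttree->plan_rows, 1.0));
        }
        else
        {
            jit_decide_walker(plan->lefttree, est_calls);
            jit_decide_walker(plan->righttree, est_calls);
        }
    }

It could be called on the finished plan tree (e.g. at the end of planning,
once the PlannedStmt is assembled) with est_calls = 1.0 for the root.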
> - The number of estimated calls are especially useful where a node is
> expected to be rescanned, such as Gather, Gather Merge, Memoize and Nested
> Loop. Knowing the estimated number of calls for a node allows us to rely on
> total cost multiplied by the estimated calls instead of only total cost for
> the node.

Not sure I follow. Why would these nodes (Gather, Gather Merge, ...) be more
likely to be rescanned than other nodes?

> - For each node, the planner considers if the node should be JITed. If the
> cost of the node * the number of estimated calls is greater than
> jit_above_cost, it's decided to be JIT compiled. Note that this changes
> the meaning of jit_above_cost, it's now a threshold for a single plan node
> and not the whole query. Additionally, this change in JIT consideration is
> only for JIT compilations. Inlining and optimizations continue to be for
> the whole query and based on the overall cost of the query.

It's not clear to me why JIT compilation is decided for each node, while the
inlining/optimization is decided for the plan as a whole. I'm not familiar
with the JIT stuff, so maybe it's obvious to others ...

> - EXPLAIN shows estimated number of "loops" and whether JIT is true or not
> for the node. For text format, JIT=true/false information is shown only if
> it's VERBOSE. (no reason to not show this info even if not VERBOSE. Showing
> for only VERBOSE just requires less changes in tests, so I did this for
> simplicity at the moment).
>

typo in sgml docs: ovarall

>
> There are also some things that I'm not sure of:
> - What are other places where a node is likely to be rescanned, thus we
> need to take estimated calls into account properly? Maybe recursive CTEs?

Why would it matter if a node is more/less likely to be rescanned? Either
the node is rescanned in the plan or not, and we have nloops to know how
many rescans to expect.

> - This change can make jit_above_cost mean something different. Should we
> rename it or introduce a new GUC? If it'll be kept as it is now, then it
> would probably be better to change its default value at least.

You mean it changes from a per-query to a per-node threshold? In general I
think we should not repurpose GUCs (because people are likely to copy and
keep the old config, not realizing it now works differently). But I'm not
sure this change is different enough to be an issue. And it's in the
opposite direction from what usually causes problems (i.e. it would disable
JIT in cases where it was enabled before).

> - What can we do for inlining and optimization? AFAIU performing those per
> node may be very costly and not make that much sense. But I'm not sure about
> how to handle those operations.

Not sure. I don't think I understand JIT details enough to have a good
opinion on this.

> - What about parallel queries? Total cost of the node is divided by the
> number of workers, which can seem like the cost reduced quite a bit. The
> patch amplifies the cost by the number of workers (by setting estimated
> calls to the number of workers) to make it more likely to perform JIT for
> Gather/Gather Merge nodes. OTOH JIT compilations are performed per worker
> and this can make workers decide JIT compile when it's not really needed.
>

Using the number of workers as "number of calls" seems wrong to me. Why
shouldn't each worker make the JIT decision on its own, as if it was the
only worker running (but seeing only its fraction of the data)? Kinda as if
there were multiple independent backends running "small" queries?
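To put made-up numbers on that (with jit_above_cost at its default of
100000): say the parallel node under the Gather is costed at 80000 per
worker and there are 3 workers.

    patch (est_calls = nworkers):  80000 * 3 = 240000 > 100000
                                   => every worker JIT-compiles, so the
                                      compilation overhead is paid 3 times

    per-worker decision:           80000 < 100000
                                   => no worker compiles, same as a single
                                      backend running a query of that size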
regards

[1] https://www.postgresql.org/message-id/CAApHDvoq5VhV%3D2euyjgBN2bC8Bds9Dtr0bG7R%3DreeefJWKJRXA%40mail.gmail.com

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company