Re: How to retain lesser paths at add_path()? - Mailing list pgsql-hackers
From: Nils Dijk
Subject: Re: How to retain lesser paths at add_path()?
Date:
Msg-id: CAECUgYygh8GT=yduiRX6vr5yVUt65=Aq29C-kQ+DMUjsCpL=Cw@mail.gmail.com
In response to: Re: How to retain lesser paths at add_path()? (Jaime Casanova <jcasanov@systemguards.com.ec>)
Responses: Re: How to retain lesser paths at add_path()?
List: pgsql-hackers
Hi,

While researching the viability of changing Citus' planner to take more advantage of the Postgres planner, I ran into the same limitations in add_path as described on this thread. For Citus we are investigating the possibility of introducing Path nodes that describe the operation of transferring tuples over the network. Due to the pruning rules in add_path we currently lose many paths that offer great opportunities for later optimization.

Having looked at the patches provided, I don't think that adding a new List to the Path structure where we retain 'lesser' paths is the right approach. First of all, this would require extension developers to interpret all these paths, where many would dominate others on cost, sorting, parameterization, etc. Instead I like the approach suggested by both Robert Haas and Tom Lane:

> have some kind of figure of merit or other special marking for paths
> that will have some possible special advantage in later planning
> steps.

This is well in line with how the current logic in add_path works, where cost, sorting, parameterization, row count, and parallel safety are the dimensions on which paths are compared. IMHO extensions should be able to add dimensions of their own interest to this list of current dimensions. The thoughts Tom Lane expressed earlier struck a chord with me:

> [ thinks a bit... ] Maybe that could be improved if we can express
> the result as a bitmask, defined in such a way that OR'ing (or maybe
> AND'ing? haven't worked it out) the results from different comparisons
> does the right thing.

Attached you will find 3 patches that implement a way for extensions to introduce 'a figure of merit' by means of path comparisons.

- The first patch refactors the decision logic out of the for-loop into its own function, to make it easier to reason about what is being compared. This refactor also changes the direction of some if-statements to provide clear early decision points.
- The second patch rewrites the original if/else/switch tree into 5 logical OR operations on a bitmask expressing the comparison between paths, together with early escapes once we know the paths are different. To keep the original logic there is a slight deviation from simply OR'ing 5 comparisons: after comparing cost, sorting, and parameterization we only continue OR'ing rows and parallelism if the paths are leaning either to path1 (new_path) or path2 (old_path). If the first 3 comparisons result in the paths being equal, the original code prefers parallel paths over paths with fewer rows.
- The third and last patch builds on the logical OR operations introduced in the middle patch. After the first three dimensions Postgres compares paths on, we allow extensions to compare paths in the same manner. Their result gets merged into the compound comparison.

To arrive at the three patches above I decomposed the original behaviour into 3 possible decisions add_path can take per iteration in the loop. It either REJECTS the new path, DOMINATES the old path, or CONTINUES with the next path in the list. The decision for any of the 3 actions is based on 6 different input parameters:

- cost (standard fuzz factor)
- sort order / pathkeys
- parameterization / bitmap subset of required outer relids
- row count
- parallel safety
- cost (smaller fuzz factor)

To verify that the refactor in the second patch makes the correct decisions, I modelled both implementations in a scripting language and passed in all possible comparisons for the six dimensions above. With the change of behaviour after the first 3 dimensions I arrived at an exact one-to-one mapping of the decisions being made before and after patch 2.

Throughout this thread many participants have emphasized performance, and I want to acknowledge their stance. Due to the nature of the current planner, the code in add_path gets invoked on an exponential scale as the number of joins increases and extra paths are retained.
Especially since my proposed hook is called inside the loop where paths are compared, it gets invoked ever more often compared to the patches proposed earlier on this thread. To reason about the performance impact of a patch in this area I think we need to reach a mutual understanding and agreement on how to evaluate the performance. Given that many if not most installations will run without any hook installed in this area, we should aim for minimal to no measurable overhead when no hook is installed. This is also why I made sure, via modelling of the decision logic in a scripting language, that the behaviour of which paths are retained does not change with these patches when no hook is installed.

When an extension needs to add a dimension to the comparison of paths, the impact of this patch is twofold:

- A dynamic function gets invoked to compare every path to every other path. Both the dynamic function call and the logic comparing the paths introduce extra time spent in add_path.
- A hook in this area will by definition always retain more paths. This is simply the nature of how the comparison of paths in this area works.

Both of the points above mean that extensions requiring this hook to retain more paths will have to think carefully about the work they do, and about how many dimensions paths end up being compared on. As Tomas Vondra pointed out earlier in this thread:

> I think the assumption is that the savings from
> building better plans far outweight that extra cost.

For Citus this translates into less network traffic, by having more opportunities to push down expensive and row-reducing operations like joins, groupings, etc. to the nodes hosting the data. For PG-Strom this translates into amortising the DMA transfer cost between host and GPU. Due to the gravity of data we always prefer extra planning time for complex queries over transferring many tuples.
Frankly, our planner already introduces much overhead today, and we expect to reduce that overhead by being able to hook into Postgres in areas like this.

To understand the impact these patches have when no hook is installed, I performed the following two comparisons:

- Ran a benchmark on a patched and a non-patched version of Postgres 14.1. I used HammerDB for this as we have tooling around it to quickly run these. Given that execution time dominates such a benchmark, I don't think it gives deep insights; at least it shows that both versions perform in the same range.
- Analysed the machine code on Ubuntu 20.04 x86_64. The machine code is very much alike; the patched version has slightly fewer jumps and slightly fewer instructions. My fear was that the excessive breaking into small functions and the switches translating enums into each other would introduce many layers of indirection. Instead, all code gets inlined when compiled with the same configuration as the 14.1 release.

I don't think the above two are enough to conclude this patch introduces no overhead when the hook is empty. I invite everyone with an interest in this area to perform their own measurements and report their findings.

Best,
-- Nils
Citus Data / Microsoft