On Fri, Jul 7, 2023 at 2:19 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Jul 5, 2023 at 8:21 PM John Naylor <john.naylor@enterprisedb.com> wrote:
> > Well, it's going to be a bit of a mess until I can demonstrate it working (and working well) with bitmap heap scan. Fixing that now is just going to create conflicts. I do have a couple of small older patches lying around that were quick experiments -- I think at least some of them should give a performance boost in loading speed, but I haven't had time to test them. Would you like to take a look?
>
> Yes, I can experiment with these patches in the meantime.

Okay, here it is in v36. 0001-0006 are the same as in v35.
0007 removes a wasted computation that was introduced when refactoring node growth. 0008 just makes 0011 nicer. Neither is worth testing on its own, but it's better to be tidy.
0009 is an experiment to get rid of slow memmoves in node4, addressing a long-standing inefficiency. It looks a bit tricky, but I think it's actually straightforward after drawing out the cases with pen and paper. It works if the fanout is either 4 or 5, so we have some wiggle room. This may give a noticeable boost if the input is reversed or random.
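
To illustrate the general idea only (this is not the code in 0009, and the names Node4, NODE4_FANOUT, and node4_insert are made up for the example), here is a minimal sketch of inserting into a small sorted node by shifting elements one at a time instead of calling memmove. With a fixed fanout of 4 the loop has at most three iterations, which a compiler can unroll:

```c
#include <stdbool.h>
#include <stdint.h>

#define NODE4_FANOUT 4			/* hypothetical name, assumed fanout */

/* Hypothetical stand-in for a radix tree node4: sorted keys plus children. */
typedef struct Node4
{
	uint8_t		count;
	uint8_t		chunks[NODE4_FANOUT];
	void	   *children[NODE4_FANOUT];
} Node4;

/*
 * Insert chunk/child while keeping chunks[] sorted. Returns false if the
 * node is full. The caller is assumed to have checked that the chunk is
 * not already present.
 */
static bool
node4_insert(Node4 *node, uint8_t chunk, void *child)
{
	int			i;

	if (node->count >= NODE4_FANOUT)
		return false;

	/* Walk from the end, moving each larger entry one slot to the right. */
	for (i = node->count; i > 0 && node->chunks[i - 1] > chunk; i--)
	{
		node->chunks[i] = node->chunks[i - 1];
		node->children[i] = node->children[i - 1];
	}

	node->chunks[i] = chunk;
	node->children[i] = child;
	node->count++;
	return true;
}
```

Reversed input is the worst case for a shift-based insert, since every existing entry has to move; random input moves about half of them on average.
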
0010 allows RT_EXTEND_DOWN to reduce function calls, so it should help with sparse trees.
0011 reduces function calls when growing the smaller nodes. Not sure about this one -- possibly worth it for node4 only?
If these help, it'll show up more easily in smaller inputs. Large inputs tend to be more dominated by RAM latency.
--
John Naylor
EDB: http://www.enterprisedb.com