On Thu, 4 Apr 2024 at 13:08, Melih Mutlu <m.melihmutlu@gmail.com> wrote:
> I changed internal_flush() to an inline function, results look better this way.
It seems you also change internal_flush_buffer to be inline (but only
in the actual function definition, not declaration at the top). I
don't think inlining internal_flush_buffer should be necessary to
avoid the perf regressions, i.e. internal_flush is adding extra
indirection compared to master and is only a single line, so that one
makes sense to inline.
Other than that the code looks good to me.
The new results look great.
One thing that is quite interesting about these results is that
increasing the buffer size results in even better performance (by
quite a bit). I don't think we can easily choose a perfect number, as
it seems to be a trade-off between memory usage and perf. But allowing
people to configure it through a GUC like in your second patchset
would be quite useful I think, especially because larger buffers could
be configured for connections that would benefit most for it (e.g.
replication connections or big COPYs).
I think your add-pq_send_buffer_size-GUC.patch is essentially what we
would need there but it would need some extra changes to actually be
merge-able:
1. needs docs
2. rename PQ_SEND_BUFFER_SIZE (at least make it not UPPER_CASE, but
maybe also remove the PQ_ prefix)
3. It's marked as PGC_USERSET, but there's no logic to grow/shrink it
after initial allocation