Re: Why does backend send buffer size hardcoded at 8KB? - Mailing list pgsql-general

From Andres Freund
Subject Re: Why does backend send buffer size hardcoded at 8KB?
Msg-id 20190727232825.f3iwbdcto52rztig@alap3.anarazel.de
In response to Re: Why does backend send buffer size hardcoded at 8KB?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Hi,

On 2019-07-27 19:10:22 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > Additionally we perhaps ought to just not use the send buffer when
> > internal_putbytes() is called with more data than can fit in the
> > buffer. We should fill it with as much data as fits in it (so the
> > pending data like the message header, or smaller previous messages, are
> > flushed out in the largest size), and then just call secure_write()
> > directly on the rest. It's not free to memcpy all that data around, when
> > we already have a buffer.
> 
> Maybe, but how often does a single putbytes call transfer more than
> 16K?

I don't think it's that rare. COPY produces entire rows and sends them
at once, as does printtup, and walsender can send pretty large chunks.
With several columns after text conversion I think it's pretty easy to
exceed 16k, not even taking large toasted columns into account.
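
To make the proposal concrete, here is a minimal sketch of the
"top up the buffer, flush, then write the rest directly" idea. It is
not the actual internal_putbytes() code: SketchSendBuf,
sketch_write_all(), sketch_flush() and sketch_putbytes() are
hypothetical names, plain write() stands in for secure_write(), and the
8192-byte size is simply taken from the subject line. A remainder
smaller than one bufferload is still copied into the buffer rather than
sent on its own.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

#define SKETCH_BUF_SIZE 8192    /* assumption: the 8KB from the subject line */

/* Hypothetical send buffer; not PostgreSQL's actual data structure. */
typedef struct SketchSendBuf
{
    int     fd;                 /* destination file descriptor */
    size_t  used;               /* bytes currently pending in data[] */
    char    data[SKETCH_BUF_SIZE];
} SketchSendBuf;

/* Write all of p[0..len), retrying on EINTR and short writes. */
static int
sketch_write_all(int fd, const char *p, size_t len)
{
    while (len > 0)
    {
        ssize_t n = write(fd, p, len);

        if (n < 0)
        {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t) n;
    }
    return 0;
}

/* Send whatever is pending in the buffer and mark it empty. */
static int
sketch_flush(SketchSendBuf *b)
{
    int rc = sketch_write_all(b->fd, b->data, b->used);

    b->used = 0;
    return rc;
}

static int
sketch_putbytes(SketchSendBuf *b, const char *s, size_t len)
{
    size_t room;

    assert(b != NULL && b->used <= SKETCH_BUF_SIZE);
    room = SKETCH_BUF_SIZE - b->used;

    if (len <= room)
    {
        /* Fits: just buffer it, as before. */
        memcpy(b->data + b->used, s, len);
        b->used += len;
        return 0;
    }

    /* Top up the buffer so pending data goes out in one full-size write. */
    memcpy(b->data + b->used, s, room);
    b->used = SKETCH_BUF_SIZE;
    s += room;
    len -= room;
    if (sketch_flush(b) < 0)
        return -1;

    /* At least a full bufferload left: skip the memcpy, write directly. */
    if (len >= SKETCH_BUF_SIZE)
        return sketch_write_all(b->fd, s, len);

    /* Fractional remainder: keep it buffered for the next flush. */
    memcpy(b->data, s, len);
    b->used = len;
    return 0;
}
```
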


> (If you fill the existing buffer, but don't have a full bufferload
> left to transfer, I doubt you want to shove the fractional bufferload
> directly to the kernel.)  Perhaps this added complexity will pay for
> itself, but I don't think we should just assume that.

Yeah, I'm not certain either. One way to deal with the partially filled
buffer issue would be to use sendmsg() with two iovecs (one pointing to
the filled buffer, one to the actual data). I wonder if that would be
worthwhile in more scenarios, to avoid unnecessarily copying memory
around.
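
For illustration, a minimal sketch of such a two-iovec sendmsg() call.
sketch_send_two() is a hypothetical helper, not existing PostgreSQL
code, and it operates on a plain socket; as written it would bypass any
TLS layer, so it only illustrates the unencrypted case.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Send the pending buffer contents and the caller's data with a single
 * system call, without first copying the data into the buffer.
 * Returns the number of bytes sent (possibly fewer than requested),
 * or -1 on error.
 */
static ssize_t
sketch_send_two(int sock,
                const void *buf, size_t buflen,
                const void *data, size_t datalen)
{
    struct iovec  iov[2];
    struct msghdr msg;
    ssize_t       n;

    assert(buf != NULL && data != NULL);

    /* First iovec: what is already sitting in the send buffer. */
    iov[0].iov_base = (void *) buf;
    iov[0].iov_len = buflen;
    /* Second iovec: the caller's data, sent in place. */
    iov[1].iov_base = (void *) data;
    iov[1].iov_len = datalen;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = iov;
    msg.msg_iovlen = 2;

    do
    {
        n = sendmsg(sock, &msg, 0);
    } while (n < 0 && errno == EINTR);

    return n;
}
```

A real caller would still have to handle short sends, which can end in
the middle of either iovec.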


> > While the receive side is statically allocated, I don't think it ends up
> > in the process image as-is - as the contents aren't initialized, it ends
> > up in .bss.
> 
> Right, but then we pay for COW when a child process first touches it,
> no?  Maybe the kernel is smart about pages that started as BSS, but
> I wouldn't bet on it.

Well, they won't exist as pages at that point, because postmaster won't
have used the send buffer to a meaningful degree. And I think the same
is true for malloc'd blocks larger than 4k/the page size. I think there
could be a benefit if we started with a pretty small malloc'd buffer
and only grew it as needed.
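
A minimal sketch of the "start small, grow as needed" idea, under
assumed sizes. SketchGrowBuf, sketch_reserve(), sketch_append() and the
1KB/64KB limits are hypothetical and chosen only for illustration; they
are not PostgreSQL's actual buffer management.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_INITIAL_SIZE 1024            /* assumed starting size */
#define SKETCH_MAX_SIZE     (64 * 1024)     /* assumed upper bound */

/* Hypothetical growable send buffer; zero-initialize before first use. */
typedef struct SketchGrowBuf
{
    char   *data;
    size_t  size;               /* bytes allocated */
    size_t  used;               /* bytes pending */
} SketchGrowBuf;

/*
 * Make room for `extra` more bytes, doubling the allocation as needed.
 * Returns 0 on success, or -1 if that would exceed SKETCH_MAX_SIZE or
 * the allocation fails (the caller would then flush instead).
 */
static int
sketch_reserve(SketchGrowBuf *b, size_t extra)
{
    size_t  need;
    size_t  newsize;
    char   *p;

    assert(b != NULL && b->used <= b->size);

    if (extra > SKETCH_MAX_SIZE - b->used)
        return -1;
    need = b->used + extra;
    if (need <= b->size)
        return 0;

    newsize = b->size ? b->size : SKETCH_INITIAL_SIZE;
    while (newsize < need)
        newsize *= 2;

    p = realloc(b->data, newsize);
    if (p == NULL)
        return -1;
    b->data = p;
    b->size = newsize;
    return 0;
}

/* Append len bytes, growing the buffer first if necessary. */
static int
sketch_append(SketchGrowBuf *b, const char *s, size_t len)
{
    if (sketch_reserve(b, len) < 0)
        return -1;
    memcpy(b->data + b->used, s, len);
    b->used += len;
    return 0;
}
```

With this shape, a backend that only ever sends small messages never
allocates more than the initial size.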

Greetings,

Andres Freund


