Why does the backend send buffer use exactly 8KB?
(https://github.com/postgres/postgres/blob/249d64999615802752940e017ee5166e726bc7cd/src/backend/libpq/pqcomm.c#L134)
I ran into this question while trying to measure the speed of reading data.
The bottleneck was the read syscall: with strace I found that in most cases
read returns 8192 bytes (https://pastebin.com/LU10BdBJ), and with tcpdump we
can confirm that the network packets are 8192 bytes as well
(https://pastebin.com/FD8abbiA).
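The effect is easy to reproduce without a running server. Below is a minimal
sketch (my own illustration, not Postgres code) in which a writer flushes data
in 8 KB chunks over a local socketpair, standing in for the backend/client
connection, and the reader records the sizes its reads actually return:

```python
# Sketch: observe how a stream socket delivers data that the peer writes in
# 8 KB chunks, mimicking the backend's fixed-size send-buffer flushes.
# Assumption: a local socketpair stands in for the real client/server link.
import socket
import threading

CHUNK = 8192           # same size as the backend's hardcoded send buffer
N_CHUNKS = 16

server, client = socket.socketpair()

def writer():
    payload = b"x" * CHUNK
    for _ in range(N_CHUNKS):
        server.sendall(payload)    # one flush of the 8 KB buffer
    server.close()

t = threading.Thread(target=writer)
t.start()

read_sizes = []
while True:
    data = client.recv(65536)      # ask for far more than one chunk
    if not data:
        break
    read_sizes.append(len(data))
t.join()
client.close()

print(f"reads: {len(read_sizes)}, total bytes: {sum(read_sizes)}")
```

The kernel may coalesce or split chunks, so the individual read sizes vary,
but with strace on a real connection the 8192-byte pattern dominates.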
So, with a well-tuned networking stack, the limit is 8KB. The reason is
the hardcoded size of the Postgres send buffer.
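The mechanism can be sketched as a plain buffered writer: bytes accumulate in
a fixed 8 KB buffer, and each flush becomes one write on the socket, so the
buffer size caps the chunk sizes the client's read can observe. This is an
illustrative model of the pattern in pqcomm.c, not the actual Postgres code;
the names below are made up for the example:

```python
# Sketch of a fixed-size send buffer: data accumulates in an 8 KB buffer and
# is written out one full buffer at a time, so no single write (and hence no
# single read on the other end) can exceed the buffer size.
PQ_SEND_BUFFER_SIZE = 8192  # the hardcoded size in pqcomm.c

class SendBuffer:
    def __init__(self, sink):
        self.buf = bytearray()
        self.sink = sink            # callable standing in for the socket write

    def putbytes(self, data: bytes):
        data = memoryview(data)
        while data:
            room = PQ_SEND_BUFFER_SIZE - len(self.buf)
            self.buf += data[:room]
            data = data[room:]
            if len(self.buf) == PQ_SEND_BUFFER_SIZE:
                self.flush()        # buffer full: one write of exactly 8 KB

    def flush(self):
        if self.buf:
            self.sink(bytes(self.buf))
            self.buf.clear()

writes = []
sb = SendBuffer(writes.append)
sb.putbytes(b"a" * 20000)           # a payload larger than the buffer
sb.flush()                          # drain the partial remainder
print([len(w) for w in writes])     # → [8192, 8192, 3616]
```

A 20000-byte payload thus leaves the buffer as two full 8192-byte writes plus
a 3616-byte remainder, which matches the read sizes seen under strace.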
I found a discussion where Tom Lane says the reason for this limit is
the size of pipe buffers on Unix machines:
https://www.postgresql.org/message-id/9426.1388761242%40sss.pgh.pa.us
> Traditionally, at least, that was the size of pipe buffers in Unix
> machines, so in principle this is the most optimal chunk size for
> sending data across a Unix socket. I have no idea though if that's
> still true in kernels in common use today. For TCP communication it
> might be marginally better to find out the MTU size and use that; but
> it's unclear that it's worth the trouble, or indeed that we can
> know the end-to-end MTU size with any reliability.
Does it make sense to make this parameter configurable?
--
Artemiy Ryabinkov
getlag(at)ya(dot)ru