On Sun, Dec 20, 2020 at 8:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> One minor thought is that in
>
> + struct iovec iov[Min(IOV_MAX, 1024)]; /* cap stack space */
>
> it seems like pretty much every use of IOV_MAX would want some
> similar cap. Should we centralize that idea with, say,
>
> #define PG_IOV_MAX Min(IOV_MAX, 1024)
>
> ? Or will the plausible cap vary across uses?

Hmm. For the real intended user of this, namely worker processes that
simulate AIO when native AIO isn't available, higher level code will
limit the iov count to much smaller numbers anyway. It wants to try
to stay under typical device limits for vectored I/O, because split
requests would confound attempts to model and limit queue depth and
control latency. In Andres's AIO prototype he currently has a macro
PGAIO_MAX_COMBINE set to 16 (meaning approximately 16 data block or
WAL reads/writes = 128KB worth of scatter/gather per I/O request). I
guess it should really be Min(IOV_MAX, <something>), but I don't
currently have an opinion on the <something>, except that it should
surely be closer to 16 than 1024 (for example,
/sys/block/nvme0n1/queue/max_segments is 33 here). I mention all this
to explain that I don't think the code in patch 0002 is going to turn
out to be very typical: it's trying to minimise system calls by
staying under an API limit (though I cap it for allocation sanity),
whereas more typical code probably wants to stay under a device limit,
so I don't immediately have another use for e.g. PG_IOV_MAX.
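
To make the distinction concrete, here is a rough sketch of the two kinds of cap side by side. Every SKETCH_* and sketch_* name is made up for illustration and appears in neither the patch nor the AIO prototype; the 16 simply mirrors the PGAIO_MAX_COMBINE value mentioned above, and 8192 assumes the default block size. The fallback for IOV_MAX is an assumption for platforms where <limits.h> doesn't expose it.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumption: fall back to the POSIX minimum if IOV_MAX isn't exposed. */
#ifndef IOV_MAX
#define IOV_MAX 16
#endif

/* Hypothetical helper macro, standing in for PostgreSQL's Min(). */
#define SKETCH_MIN(a, b) ((a) < (b) ? (a) : (b))

/* Assumption: default 8KB block size. */
#define SKETCH_BLCKSZ 8192

/* API-limit style cap: as many iovecs per call as allowed, bounded for stack sanity. */
#define SKETCH_API_IOV_CAP SKETCH_MIN(IOV_MAX, 1024)

/* Device-limit style cap: stay near typical device segment limits. */
#define SKETCH_DEVICE_IOV_CAP SKETCH_MIN(IOV_MAX, 16)

/* Number of vectored I/O calls needed to cover nblocks with at most cap iovecs each. */
static inline size_t
sketch_syscalls_needed(size_t nblocks, size_t cap)
{
    assert(cap > 0);
    return (nblocks + cap - 1) / cap;
}

/* Bytes of scatter/gather carried by one request using cap block-sized iovecs. */
static inline size_t
sketch_bytes_per_request(size_t cap)
{
    return cap * SKETCH_BLCKSZ;
}
```

With a cap of 16, sketch_bytes_per_request() gives the 128KB per request mentioned above, while the API-style cap only bounds how many iovecs go into a single system call.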