On 2018-01-12 17:43:00 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2018-01-12 17:24:54 -0500, Tom Lane wrote:
> >> Andres Freund <andres@anarazel.de> writes:
> >>> Right. I wonder if it'd be reasonable to move that to a page's header
> >>> instead of individual records? To avoid torn page issues we'd have to
> >>> reduce the page size to a sector size, but I'm not sure that's that bad?
>
> >> Giving up a dozen or two bytes out of every 512 sounds like quite an
> >> overhead.
>
> > It's not nothing, that's true. But if it avoids 8 bytes in every record,
> > that'd probably be at least as much in most use cases.
>
> Fair point. I don't have a very good handle on what "typical" WAL record
> sizes are, but we might be fine with that --- some quick counting on the
> fingers says we'd break even with an average record size of ~160 bytes,
> and be ahead below that.
This is far from a definitive answer, but here's some data:
pgbench -i -s 100 -q:
Type          N  (%)  Record size       (%)   FPI size       (%)  Combined size     (%)
----          -  ---  -----------       ---   --------       ---  -------------     ---
Total    308958        1077269060  [84.19%]  202269468  [15.81%]     1279538528  [100%]
So here records are really large, which makes sense, given it's
largely initialization of data. With wal_compression that'd probably look
different, but records would still commonly span multiple pages.
pgbench -M prepared -c 16 -j 16 -T 100
Type          N  (%)  Record size        (%)  FPI size      (%)  Combined size     (%)
----          -  ---  -----------        ---  --------      ---  -------------     ---
Total  14228881         947824170  [100.00%]      8192  [0.00%]      947832362  [100%]
Here we're at about 66 bytes per record on average...
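
As a rough sketch of the arithmetic, the averages follow directly from the
"Total" rows above, and the break-even point can be estimated with a simple
model. The 24-byte page header ("a dozen or two bytes") and the break-even
formula itself are assumptions for illustration, not numbers established in
this thread:

```python
# Back-of-the-envelope check of the numbers discussed above.
# PAGE_HEADER_BYTES and the break-even model are assumptions, not
# measured values; the totals are copied from the two stats tables.

SECTOR_SIZE = 512          # proposed WAL page size (one sector)
PER_RECORD_SAVING = 8      # bytes removed from every record
PAGE_HEADER_BYTES = 24     # assumed: "a dozen or two bytes" per page


def avg_record_size(total_record_bytes: int, n_records: int) -> float:
    """Average record size in bytes, excluding full-page images."""
    return total_record_bytes / n_records


def break_even_record_size(header_bytes: int) -> float:
    """Record size at which the per-page header cost equals the saving.

    Assumed model: each page carries (SECTOR_SIZE - header_bytes) payload
    bytes, so a record of S bytes pays S * header / (SECTOR_SIZE - header)
    bytes of header overhead. Setting that equal to PER_RECORD_SAVING and
    solving for S gives the expression below.
    """
    payload = SECTOR_SIZE - header_bytes
    return PER_RECORD_SAVING * payload / header_bytes


init_avg = avg_record_size(1077269060, 308958)    # pgbench -i -s 100 -q
run_avg = avg_record_size(947824170, 14228881)    # pgbench -M prepared ...
break_even = break_even_record_size(PAGE_HEADER_BYTES)

print(f"init run average record size:    {init_avg:.1f} bytes")
print(f"pgbench run average record size: {run_avg:.1f} bytes")
print(f"break-even record size:          {break_even:.1f} bytes")
```

Under these assumptions the averages come out at roughly 3487 and 66.6
bytes per record, and the break-even point for a 24-byte header lands near
163 bytes, in the same ballpark as the ~160 bytes quoted above. It ignores
page-crossing overhead entirely.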
> We'd need to investigate the page-crossing overhead carefully though.
agreed.
Greetings,
Andres Freund