Re: Replication identifiers, take 4 - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: Replication identifiers, take 4 |
Date | |
Msg-id | 552AC14E.7080308@iki.fi |
In response to | Re: Replication identifiers, take 4 (Petr Jelinek <petr@2ndquadrant.com>) |
Responses | Re: Replication identifiers, take 4 |
List | pgsql-hackers |
On 04/12/2015 02:56 AM, Petr Jelinek wrote:
> On 10/04/15 18:03, Andres Freund wrote:
>> On 2015-04-07 17:08:16 +0200, Andres Freund wrote:
>>> I'm starting benchmarks now.
>>
>> What I'm benchmarking here is the WAL overhead, since that's what we're
>> debating.
>>
>> The test setup I used was a pgbench scale 10 instance. I've run with
>> full_page_write=off to have more reproducible results. This of course
>> over-emphasizes the overhead a bit. But for a long checkpoint interval
>> and a memory resident working set it's not that unrealistic.
>>
>> I ran 50k transactions in a signle b
>> baseline:
>> - 20445024
>> - 20437128
>> - 20436864
>> - avg: 20439672
>> extern 2byte identifiers:
>> - 23318368
>> - 23148648
>> - 23128016
>> - avg: 23198344
>> - avg overhead: 13.5%
>> padding 2byte identifiers:
>> - 21160408
>> - 21319720
>> - 21164280
>> - avg: 21214802
>> - avg overhead: 3.8%
>> extern 4byte identifiers:
>> - 23514216
>> - 23540128
>> - 23523080
>> - avg: 23525808
>> - avg overhead: 15.1%
>>
>> To me that shows pretty clearly that a) reusing the padding is
>> worthwhile b) even without that using 2byte instead of 4 byte
>> identifiers is beneficial.
>>
>
> My opinion is that 10% of WAL size difference is quite high price to pay
> so that we can keep the padding for some other, yet unknown feature that
> hasn't come up in several years, which would need those 2 bytes.
>
> But if we are willing to pay it then we can really go all the way and
> just use Oids...

This needs to be weighed against removing the padding bytes altogether. See attached. That would reduce the WAL size further when you don't need replication IDs. It's very straightforward, but we need to do some performance/scalability testing to make sure that using memcpy instead of a straight 32-bit assignment doesn't hurt performance, since it happens in very performance-critical paths.

I'm surprised there's such a big difference between the "extern" and "padding" versions above.
At a quick approximation, storing the ID as a separate "fragment", along with XLogRecordDataHeaderShort and XLogRecordDataHeaderLong, should add one byte of overhead plus the ID itself. So that would be 3 extra bytes for 2-byte identifiers, or 5 bytes for 4-byte identifiers. Does that mean that the average record length is only about 30 bytes? That's what it seems like, if the "extern 2 byte identifiers" version added about 10% of overhead compared to the "padding 2 byte identifiers" version. That doesn't sound right; 30 bytes is very little. Perhaps the size of the records created by pgbench happens to cross an 8-byte alignment boundary at that point, making a big difference. In another workload, there might be no difference at all, due to alignment.

Also, you don't need to tag every record type with the replication ID. All indexam records can skip it, for starters, since logical decoding doesn't care about them. That should remove a fair amount of bloat.

- Heikki
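
The back-of-the-envelope estimate above can be checked with a few lines of arithmetic. This is only a rough sketch, not a measurement: it assumes every record grows by the same number of bytes and that record sizes are rounded up to 8 bytes. The function names and the record lengths 53 and 56 are hypothetical, chosen only to show the alignment effect. The WAL byte counts are the averages quoted earlier in the thread.

```python
# Average WAL bytes for 50k pgbench transactions, as quoted in the thread.
BASELINE_AVG = 20439672
EXTERN_2BYTE_AVG = 23198344
PADDING_2BYTE_AVG = 21214802


def implied_record_length(extra_bytes, overhead_fraction):
    """Average record length implied if each record grew by extra_bytes
    and total WAL volume grew by overhead_fraction (assumption: uniform
    growth across all records)."""
    return extra_bytes / overhead_fraction


def maxalign(length, alignment=8):
    """Round length up to the next multiple of alignment (assumed 8)."""
    return (length + alignment - 1) // alignment * alignment


def aligned_growth(length, extra_bytes, alignment=8):
    """Growth in aligned size when a record of the given length gains
    extra_bytes. The result is either 0 or a multiple of alignment."""
    return maxalign(length + extra_bytes, alignment) - maxalign(length, alignment)


if __name__ == "__main__":
    # "extern" versus "padding" for 2-byte identifiers: roughly 10%.
    overhead = (EXTERN_2BYTE_AVG - PADDING_2BYTE_AVG) / PADDING_2BYTE_AVG
    print("extern vs padding overhead:", overhead)

    # 3 extra bytes (1 header byte + 2-byte ID) at that overhead.
    print("implied average record length:", implied_record_length(3, overhead))

    # Hypothetical record lengths: the same 3 bytes cost 8 aligned bytes
    # for one record and nothing for another.
    print("growth for a 56-byte record:", aligned_growth(56, 3))
    print("growth for a 53-byte record:", aligned_growth(53, 3))
```

If the alignment assumption holds, the cost of the same 3 bytes depends entirely on where the record length falls relative to an 8-byte boundary, which is one way a workload could show 10% overhead while another shows none.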