Re: Replication identifiers, take 4 - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Replication identifiers, take 4
Msg-id 552AC14E.7080308@iki.fi
In response to Re: Replication identifiers, take 4  (Petr Jelinek <petr@2ndquadrant.com>)
Responses Re: Replication identifiers, take 4
List pgsql-hackers
On 04/12/2015 02:56 AM, Petr Jelinek wrote:
> On 10/04/15 18:03, Andres Freund wrote:
>> On 2015-04-07 17:08:16 +0200, Andres Freund wrote:
>>> I'm starting benchmarks now.
>>
>> What I'm benchmarking here is the WAL overhead, since that's what we're
>> debating.
>>
>> The test setup I used was a pgbench scale 10 instance. I've run with
>> full_page_writes=off to have more reproducible results. This of course
>> over-emphasizes the overhead a bit. But for a long checkpoint interval
>> and a memory resident working set it's not that unrealistic.
>>
>> I ran 50k transactions in a single b
>> baseline:
>> - 20445024
>> - 20437128
>> - 20436864
>> - avg: 20439672
>> extern 2byte identifiers:
>> - 23318368
>> - 23148648
>> - 23128016
>> - avg: 23198344
>> - avg overhead: 13.5%
>> padding 2byte identifiers:
>> - 21160408
>> - 21319720
>> - 21164280
>> - avg: 21214802
>> - avg overhead: 3.8%
>> extern 4byte identifiers:
>> - 23514216
>> - 23540128
>> - 23523080
>> - avg: 23525808
>> - avg overhead: 15.1%
>>
>> To me that shows pretty clearly that a) reusing the padding is
>> worthwhile b) even without that using 2byte instead of 4 byte
>> identifiers is beneficial.
>>
>
> My opinion is that a 10% WAL size difference is quite a high price to pay
> so that we can keep the padding for some other, yet unknown feature that
> hasn't come up in several years, which would need those 2 bytes.
>
> But if we are willing to pay it then we can really go all the way and
> just use Oids...

This needs to be weighed against removing the padding bytes altogether.
See attached. That would reduce the WAL size further when you don't need
replication IDs. It's very straightforward, but we need to do some
performance/scalability testing to make sure that using memcpy instead
of a straight 32-bit assignment doesn't hurt performance, since it
happens in very performance-critical paths.
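
To illustrate what dropping the padding would mean for the record header, here is a minimal sketch using Python's struct module. The field list (xl_tot_len, xl_xid, xl_prev, xl_info, xl_rmid, xl_crc) is an assumption about the XLogRecord layout and is not taken from the attached patch; the sizes are what one would expect on typical platforms.

```python
import struct

# Assumed header fields (illustrative, not from the patch):
# uint32 xl_tot_len, uint32 xl_xid, uint64 xl_prev,
# uint8 xl_info, uint8 xl_rmid, uint32 xl_crc
FIELDS = "IIQBBI"

# "@" applies native alignment, so padding is inserted before xl_crc.
padded_size = struct.calcsize("@" + FIELDS)
# "=" uses standard sizes with no alignment padding at all.
packed_size = struct.calcsize("=" + FIELDS)

# Offset of xl_crc once the padding is gone: it follows the two uint8 fields.
packed_crc_offset = struct.calcsize("=IIQBB")
crc_is_4byte_aligned = (packed_crc_offset % 4 == 0)

print("padded header size:", padded_size)
print("packed header size:", packed_size)
print("bytes saved per record:", padded_size - packed_size)
print("xl_crc offset when packed:", packed_crc_offset,
      "aligned:", crc_is_4byte_aligned)
```

In the packed layout the 32-bit field no longer sits on a 4-byte boundary, which is why it would have to be read and written with memcpy rather than a plain assignment.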

I'm surprised there's such a big difference between the "extern" and
"padding" versions above. At a quick approximation, storing the ID as a
separate "fragment", along with XLogRecordDataHeaderShort and
XLogRecordDataHeaderLong, should add one byte of overhead plus the ID
itself. So that would be 3 extra bytes for 2-byte identifiers, or 5
bytes for 4-byte identifiers. Does that mean that the average record
length is only about 30 bytes? That's what it seems like, if the
"extern 2 byte identifiers" version added about 10% of overhead compared
to the "padding 2 byte identifiers" version. That doesn't sound right; 30
bytes is very little. Perhaps the sizes of the records created by pgbench
happen to cross an 8-byte alignment boundary at that point, making a big
difference. In another workload, there might be no difference at all,
due to alignment.
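
As a rough check of the arithmetic above, the following sketch recomputes the implied average record length from the quoted benchmark averages, and shows how rounding up to an 8-byte boundary can turn 3 extra bytes into either 8 bytes or nothing. The example record lengths (32 and 29 bytes) and the helper names are hypothetical, chosen only to illustrate the alignment effect.

```python
# Average WAL bytes taken from the benchmark quoted above.
baseline = 20439672
padding_2byte = 21214802
extern_2byte = 23198344

# Estimated cost of an external 2-byte ID: 1 byte of header plus the ID itself.
EXTRA_BYTES = 3

# Relative growth of "extern" over "padding", and the average record
# length that growth would imply if every record grew by EXTRA_BYTES.
relative_growth = (extern_2byte - padding_2byte) / padding_2byte
implied_record_len = EXTRA_BYTES / relative_growth

def maxalign(length, boundary=8):
    """Round length up to the next multiple of boundary."""
    return (length + boundary - 1) // boundary * boundary

def aligned_growth(record_len, extra=EXTRA_BYTES):
    """On-disk growth after alignment when a record gains `extra` bytes."""
    return maxalign(record_len + extra) - maxalign(record_len)

print(f"extern vs padding growth: {relative_growth:.1%}")
print(f"implied average record length: {implied_record_len:.1f} bytes")
# Hypothetical record lengths on either side of an alignment boundary.
print("32-byte record grows by", aligned_growth(32), "bytes")
print("29-byte record grows by", aligned_growth(29), "bytes")
```

A record that already ends on a boundary pays a full 8 bytes for the extra 3, while one with at least 3 bytes of slack pays nothing, so the measured overhead depends heavily on the record sizes a given workload produces.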

Also, you don't need to tag every record type with the replication ID.
All indexam records can skip it, for starters, since logical decoding
doesn't care about them. That should remove a fair amount of bloat.

- Heikki

Attachment
