Thread: Merge compact/non compact commits, make aborts dynamically sized

Merge compact/non compact commits, make aborts dynamically sized

From
Andres Freund
Date:
Hi,

Right now wal_level=logical implies that the compact commit record
format isn't used and similarly 2pc commits also include the non compact
format of commits.

In the course of the 'replication identifier' patch submitted to the
current commitfest I added more information to the non compact commit
record if a xinfo flag was present. I optionally adding data is a better
course than the rigorous split between compact/non compact commits.

In the attached patch I've merged compact/noncompact code, made aborts
use similar logic to avoid including useless bytes and used both for the
2pc equivalents.

To avoid using more space in the compact case the 'xinfo' field
indicating the presence of further data is only included when a byte in
the xl_info flag is set (similar to what heap rmgr does). That means
that transactions without subtransactions and other things are four
bytes smaller than before, but ones with a subtransaction but no other
complications 4 byte larger. database info, nsubxacts, nrels, nmsgs et
al are also only included if required. I think that's a overall win,
even disregarding wal_level = logical.

When compact commits were discussed in
http://archives.postgresql.org/message-id/235610.92468.qm%40web29004.mail.ird.yahoo.com
such a scheme was discussed, but then discarded. I think that was a
mistake; the information in commit records is likely to be growing and
having to decide between the compact/non compact form instead on a more
granular level makes decisions harder. The information about two forms
of commits already has creeped into too many places...

I generally like how the patch looks, the relevant code actually looks
clearer to me afterwards - especially RecordTransactionCommit() being
noticeably shorter, decode.c having to care about less and that
twophase.c knows a less about xact.c type records seems good.

There's one bit that I'm not so sure about though: To avoid duplication
I've added Parse(Commit/Abort)Record(), but unfortunately that has to be
available both in front and backend code - so it's currently living in
xactdesc.c. I think we can live with that, but it's certainly not
pretty.

I'm not going to rebase the replication identifier patch over this for
now, seems premature until some feedback is in.

Comments?

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: Merge compact/non compact commits, make aborts dynamically sized

From
Heikki Linnakangas
Date:
On 02/20/2015 05:21 PM, Andres Freund wrote:
> In the attached patch I've merged compact/noncompact code, made aborts
> use similar logic to avoid including useless bytes and used both for the
> 2pc equivalents.

+1 for this approach in general.

> To avoid using more space in the compact case the 'xinfo' field
> indicating the presence of further data is only included when a byte in
> the xl_info flag is set (similar to what heap rmgr does). That means
> that transactions without subtransactions and other things are four
> bytes smaller than before, but ones with a subtransaction but no other
> complications 4 byte larger. database info, nsubxacts, nrels, nmsgs et
> al are also only included if required. I think that's a overall win,
> even disregarding wal_level = logical.

xinfo is a bit underdocumented now. The comment in xl_xact_commit should 
mention that it's a uint32. But actually, why is it 32 bits wide? Only 6 
bits are used, so it would in a single byte. I guess that's because of 
alignment: now that it's 32 bits wide, that ensures that all the other 
structs are 4 byte aligned. That should be documented. Alternatively, 
store them unaligned to save the few bytes and use memcpy.

> There's one bit that I'm not so sure about though: To avoid duplication
> I've added Parse(Commit/Abort)Record(), but unfortunately that has to be
> available both in front and backend code - so it's currently living in
> xactdesc.c. I think we can live with that, but it's certainly not
> pretty.

Yeah, that's ugly. Why does frontend code need that? The old format 
isn't exactly trivial for frontend code to decode either.

Regarding XactEmitCommitRecord and XactEmitAbortRecord, I wonder if you 
could pass an xl_xact_parsed/abort_commit struct to them, instead of the 
individual fields? You could then also avoid the static variables inside 
it, passing pointers to that struct to XLogRegisterData() instead.

- Heikki




Re: Merge compact/non compact commits, make aborts dynamically sized

From
Simon Riggs
Date:
On 24 February 2015 at 18:51, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
> On 02/20/2015 05:21 PM, Andres Freund wrote:
>>
>> In the attached patch I've merged compact/noncompact code, made aborts
>> use similar logic to avoid including useless bytes and used both for the
>> 2pc equivalents.
>
>
> +1 for this approach in general.

+1 also

The compact commit record idea wasn't useful enough to stick with,
especially in the light of logical rep, possible future extensions to
the commit record and the prevalence of subtransactions.

XactEmitCommitRecord and XactEmitAbortRecord are somewhat hard to
understand. More comments or clearer design please.

The record format itself needs more explanation than I see in the
patch. Might be worth documenting when and how those fields would be
populated also (my bad).

It might be useful to have the optional field descriptions be named
with a _opt suffix to make it clearer.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, RemoteDBA, Training &
Services



Re: Merge compact/non compact commits, make aborts dynamically sized

From
Andres Freund
Date:
Hi,

On 2015-02-24 20:51:42 +0200, Heikki Linnakangas wrote:
> On 02/20/2015 05:21 PM, Andres Freund wrote:
> >To avoid using more space in the compact case the 'xinfo' field
> >indicating the presence of further data is only included when a byte in
> >the xl_info flag is set (similar to what heap rmgr does). That means
> >that transactions without subtransactions and other things are four
> >bytes smaller than before, but ones with a subtransaction but no other
> >complications 4 byte larger. database info, nsubxacts, nrels, nmsgs et
> >al are also only included if required. I think that's a overall win,
> >even disregarding wal_level = logical.
> 
> xinfo is a bit underdocumented now.

Yea, the whole thing needs a bit more comments. I really wanted some
feedback before sinking even more time into this...

> The comment in xl_xact_commit should mention that it's a uint32. But
> actually, why is it 32 bits wide?

The first reason is that it's essentially just the old 'xinfo' field
moved to a different place ;). I think I'll add a xl_xact_flags or so,
and document it there.

> Only 6 bits are used, so it would in a single byte. I guess that's
> because of alignment: now that it's 32 bits wide, that ensures that
> all the other structs are 4 byte aligned. That should be
> documented. Alternatively, store them unaligned to save the few bytes
> and use memcpy.

But yes, I didn't think about changing it because of alignment. Given
that commit and abort records can be fairly large, it seems better to
waste three bytes in a few scenarios (not the minimal one!) than
allocate another large chunk of memory for the copied data.


Even in the current code there's some rather underdocumented assumptions
about alignment; namely that RelFileNode+, TransactionId+ always result
in a acceptable alignment for the data after them. That happens to be
the case, but is worth a comment and/or a MAXALIGN.

> >There's one bit that I'm not so sure about though: To avoid duplication
> >I've added Parse(Commit/Abort)Record(), but unfortunately that has to be
> >available both in front and backend code - so it's currently living in
> >xactdesc.c. I think we can live with that, but it's certainly not
> >pretty.
> 
> Yeah, that's ugly. Why does frontend code need that? The old format
> isn't exactly trivial for frontend code to decode either.

pg_xlogdump outputs subxacts and such; I don't forsee other
usages. Sure, we could copy the code around, but I think that's worse
than having it in xactdesc.c. Needs a comment explaining why it's there
if I haven't added one already.

> Regarding XactEmitCommitRecord and XactEmitAbortRecord, I wonder if you
> could pass an xl_xact_parsed/abort_commit struct to them, instead of the
> individual fields? You could then also avoid the static variables inside it,
> passing pointers to that struct to XLogRegisterData() instead.

Hm, that's an idea. And rename it to xaxt_commit/abort_data?

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: Merge compact/non compact commits, make aborts dynamically sized

From
Michael Paquier
Date:
On Wed, Feb 25, 2015 at 8:10 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2015-02-24 20:51:42 +0200, Heikki Linnakangas wrote:
>> On 02/20/2015 05:21 PM, Andres Freund wrote:
>> >There's one bit that I'm not so sure about though: To avoid duplication
>> >I've added Parse(Commit/Abort)Record(), but unfortunately that has to be
>> >available both in front and backend code - so it's currently living in
>> >xactdesc.c. I think we can live with that, but it's certainly not
>> >pretty.
>>
>> Yeah, that's ugly. Why does frontend code need that? The old format
>> isn't exactly trivial for frontend code to decode either.
>
> pg_xlogdump outputs subxacts and such; I don't forsee other
> usages. Sure, we could copy the code around, but I think that's worse
> than having it in xactdesc.c. Needs a comment explaining why it's there
> if I haven't added one already.

FWIW, I think they would live better in frontend code for client applications.

That's a nice patch. +1 for merging them. Here are a couple of comments:

+/* Parse the WAL format of a xact abort into a easier to understand format. */
+void
+ParseCommitRecord(uint8 info, xl_xact_commit *xlrec,
xl_xact_parsed_commit *parsed)
I think that you mean here of "an xact commit", not abort.

+ * Emit, but don't insert, a abort record.
s/a abort/an abort/
XactEmitAbortRecord has some problems with tabs replaced by 4 spaces
at a couple of places.

+/* free opcode 0x70 */
+
+#define XLOG_XACT_OPMASK                       0x70
There is a contradiction here, 0x70 is not free.

Regards,
-- 
Michael



Re: Merge compact/non compact commits, make aborts dynamically sized

From
Andres Freund
Date:
On 2015-02-27 16:26:08 +0900, Michael Paquier wrote:
> On Wed, Feb 25, 2015 at 8:10 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> > On 2015-02-24 20:51:42 +0200, Heikki Linnakangas wrote:
> >> On 02/20/2015 05:21 PM, Andres Freund wrote:
> >> >There's one bit that I'm not so sure about though: To avoid duplication
> >> >I've added Parse(Commit/Abort)Record(), but unfortunately that has to be
> >> >available both in front and backend code - so it's currently living in
> >> >xactdesc.c. I think we can live with that, but it's certainly not
> >> >pretty.
> >>
> >> Yeah, that's ugly. Why does frontend code need that? The old format
> >> isn't exactly trivial for frontend code to decode either.
> >
> > pg_xlogdump outputs subxacts and such; I don't forsee other
> > usages. Sure, we could copy the code around, but I think that's worse
> > than having it in xactdesc.c. Needs a comment explaining why it's there
> > if I haven't added one already.
> 
> FWIW, I think they would live better in frontend code for client applications.

What do you mean with that? You mean you'd rather see a copy in
pg_xlogdump somewhere? How would you trigger that being used instead of
the normal description routine?


> +/* free opcode 0x70 */
> +
> +#define XLOG_XACT_OPMASK                       0x70
> There is a contradiction here, 0x70 is not free.

The above is a mask, not an opcode - so I don't think it's a
contraction. I'll expand on the commentary on the next version.

Thanks for having a look!

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: Merge compact/non compact commits, make aborts dynamically sized

From
Michael Paquier
Date:
On Fri, Feb 27, 2015 at 6:49 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2015-02-27 16:26:08 +0900, Michael Paquier wrote:
>> On Wed, Feb 25, 2015 at 8:10 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> > On 2015-02-24 20:51:42 +0200, Heikki Linnakangas wrote:
>> >> On 02/20/2015 05:21 PM, Andres Freund wrote:
>> >> >There's one bit that I'm not so sure about though: To avoid duplication
>> >> >I've added Parse(Commit/Abort)Record(), but unfortunately that has to be
>> >> >available both in front and backend code - so it's currently living in
>> >> >xactdesc.c. I think we can live with that, but it's certainly not
>> >> >pretty.
>> >>
>> >> Yeah, that's ugly. Why does frontend code need that? The old format
>> >> isn't exactly trivial for frontend code to decode either.
>> >
>> > pg_xlogdump outputs subxacts and such; I don't forsee other
>> > usages. Sure, we could copy the code around, but I think that's worse
>> > than having it in xactdesc.c. Needs a comment explaining why it's there
>> > if I haven't added one already.
>>
>> FWIW, I think they would live better in frontend code for client applications.
>
> What do you mean with that? You mean you'd rather see a copy in
> pg_xlogdump somewhere? How would you trigger that being used instead of
> the normal description routine?

No, no. I meant that it is good the way your patch does it in
xactdesc.c, where both frontend and backend can reach it.
-- 
Michael



Re: Merge compact/non compact commits, make aborts dynamically sized

From
Robert Haas
Date:
On Fri, Feb 27, 2015 at 7:10 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> No, no. I meant that it is good the way your patch does it in
> xactdesc.c, where both frontend and backend can reach it.

Agreed, that seems much better than duplicating it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Merge compact/non compact commits, make aborts dynamically sized

From
Andres Freund
Date:
On 2015-02-25 12:10:42 +0100, Andres Freund wrote:
> On 2015-02-24 20:51:42 +0200, Heikki Linnakangas wrote:
> > Regarding XactEmitCommitRecord and XactEmitAbortRecord, I wonder if you
> > could pass an xl_xact_parsed/abort_commit struct to them, instead of the
> > individual fields? You could then also avoid the static variables inside it,
> > passing pointers to that struct to XLogRegisterData() instead.
> 
> Hm, that's an idea. And rename it to xaxt_commit/abort_data?

So, I just tried this, and it doesn't really seem to come out as a net
positive. More code at the callsites and more complex code in the *Emit*
routines. It's impossible to use [FLEXIBLE_ARRAY_MEMBER] employing
datatypes in xact_commit_data while emitting because there obviously are
several chunks needing it. And avoid using it would make for a slightly
less clear format.

So unless you feel strongly about it, don't think so, I'll keep the
statics, even if they're not particularly pretty.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: Merge compact/non compact commits, make aborts dynamically sized

From
Heikki Linnakangas
Date:
On 03/02/2015 06:51 PM, Andres Freund wrote:
> On 2015-02-25 12:10:42 +0100, Andres Freund wrote:
>> On 2015-02-24 20:51:42 +0200, Heikki Linnakangas wrote:
>>> Regarding XactEmitCommitRecord and XactEmitAbortRecord, I wonder if you
>>> could pass an xl_xact_parsed/abort_commit struct to them, instead of the
>>> individual fields? You could then also avoid the static variables inside it,
>>> passing pointers to that struct to XLogRegisterData() instead.
>>
>> Hm, that's an idea. And rename it to xaxt_commit/abort_data?
>
> So, I just tried this, and it doesn't really seem to come out as a net
> positive. More code at the callsites and more complex code in the *Emit*
> routines. It's impossible to use [FLEXIBLE_ARRAY_MEMBER] employing
> datatypes in xact_commit_data while emitting because there obviously are
> several chunks needing it. And avoid using it would make for a slightly
> less clear format.
>
> So unless you feel strongly about it, don't think so, I'll keep the
> statics, even if they're not particularly pretty.

Come to think of it, it would be cleaner anyway to move the 
XLogBeginInsert() and XLogInsert() calls inside XactEmitCommitRecord. 
Then those structs don't need to be static either.

- Heikki




Re: Merge compact/non compact commits, make aborts dynamically sized

From
Andres Freund
Date:
On 2015-03-02 19:11:15 +0200, Heikki Linnakangas wrote:
> Come to think of it, it would be cleaner anyway to move the
> XLogBeginInsert() and XLogInsert() calls inside XactEmitCommitRecord. Then
> those structs don't need to be static either.

That was my first thought as well - but it doesn't easily work because
of the way commit/abort records are embedded into twophase.c.  I
couldn't come with a simple way to change that, and anythign non simple
imo defeats the purpose.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: Merge compact/non compact commits, make aborts dynamically sized

From
Heikki Linnakangas
Date:
On 03/02/2015 07:14 PM, Andres Freund wrote:
> On 2015-03-02 19:11:15 +0200, Heikki Linnakangas wrote:
>> Come to think of it, it would be cleaner anyway to move the
>> XLogBeginInsert() and XLogInsert() calls inside XactEmitCommitRecord. Then
>> those structs don't need to be static either.
>
> That was my first thought as well - but it doesn't easily work because
> of the way commit/abort records are embedded into twophase.c.  I
> couldn't come with a simple way to change that, and anythign non simple
> imo defeats the purpose.

Pass the prepared XID as yet another argument to XactEmitCommitRecord, 
and have XactEmitCommitRecord emit the xl_xact_commit_prepared part of 
the record too. It might even make sense to handle the prepared XID like 
all the other optional fields and add an xinfo flag for it.

- Heikki



Re: Merge compact/non compact commits, make aborts dynamically sized

From
Andres Freund
Date:
On 2015-03-02 19:23:56 +0200, Heikki Linnakangas wrote:
> On 03/02/2015 07:14 PM, Andres Freund wrote:
> >On 2015-03-02 19:11:15 +0200, Heikki Linnakangas wrote:
> >>Come to think of it, it would be cleaner anyway to move the
> >>XLogBeginInsert() and XLogInsert() calls inside XactEmitCommitRecord. Then
> >>those structs don't need to be static either.
> >
> >That was my first thought as well - but it doesn't easily work because
> >of the way commit/abort records are embedded into twophase.c.  I
> >couldn't come with a simple way to change that, and anythign non simple
> >imo defeats the purpose.
> 
> Pass the prepared XID as yet another argument to XactEmitCommitRecord, and
> have XactEmitCommitRecord emit the xl_xact_commit_prepared part of the
> record too. It might even make sense to handle the prepared XID like all the
> other optional fields and add an xinfo flag for it.

That's what I mean with "non simple". Not a fan of teaching xact.c even
more about twophase's dealings than it already knows. We'd have to
either teach XactEmit* to insert different types of records pased a
parameter or switch in xact_redo() based on the availability of the
separate twophase xid via xinfo. Doesn't strike me as an improvement at
all.

I'd rather live with some statics in an isolated location - it's not as
if we're ever going to allow two concurrent xlog inserts to happen at
the same time.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: Merge compact/non compact commits, make aborts dynamically sized

From
Andres Freund
Date:
On 2015-03-02 18:45:18 +0100, Andres Freund wrote:
> On 2015-03-02 19:23:56 +0200, Heikki Linnakangas wrote:
> > Pass the prepared XID as yet another argument to XactEmitCommitRecord, and
> > have XactEmitCommitRecord emit the xl_xact_commit_prepared part of the
> > record too. It might even make sense to handle the prepared XID like all the
> > other optional fields and add an xinfo flag for it.
>
> That's what I mean with "non simple". Not a fan of teaching xact.c even
> more about twophase's dealings than it already knows.

I made an effort to show how horrible it would look like. Turns out it's
not that bad ;). Far from pretty, especially in xlog.c, but
acceptable. I think treating them more similar actually might open the
road for reducing the difference further at some point.

The attached patch does pretty much what you suggested above and tries
to address all the point previously made. I plan to push this fairly
soon; unless somebody has further input.

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: Merge compact/non compact commits, make aborts dynamically sized

From
Andres Freund
Date:
On 2015-03-15 16:30:16 +0100, Andres Freund wrote:
> The attached patch does pretty much what you suggested above and tries
> to address all the point previously made. I plan to push this fairly
> soon; unless somebody has further input.

Pushed, after fixing two typos in decode.c I introduced while "cleaning
up". This might not yet be perfect, but there seems little point in
waiting further.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services