Re: 32/64-bit transaction IDs? - Mailing list pgsql-general

From Ed L.
Subject Re: 32/64-bit transaction IDs?
Date
Msg-id 200303221257.01205.pgsql@bluepolka.net
In response to Re: 32/64-bit transaction IDs?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: 32/64-bit transaction IDs?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
On Saturday March 22 2003 12:00, Tom Lane wrote:
> "Ed L." <pgsql@bluepolka.net> writes:
> > On Saturday March 22 2003 11:29, Tom Lane wrote:
> >> I think this last part is wrong.  It shouldn't be using the xid as
> >> part of the ordering, only the sequence value.
> >
> > Why not?  How would you replay them on the slave in the same
> > transaction groupings and order that occurred on the master?
>
> Why not is easy:
>
>     begin xact 1;
>     update tuple X;
>     ...
>                     begin xact 2;
>                     update tuple Y;
>                     commit;
>     ...
>     update tuple Y;
>     commit;
>
> (Note that this is only possible in read-committed mode, else xact 1
> wouldn't be allowed to update tuple Y like this.)  Here, you must
> replay xact 1's update of Y after xact 2's update of Y, else you'll
> get the wrong final state on the slave.  On the other hand, it really
> does not matter whether you replay the tuple X update before or after
> you replay xact 2, because xact 2 didn't touch tuple X.
>
> If the existing DBmirror code is sorting as you say, then it will fail
> in this scenario --- unless it always manages to execute a propagation
> step in between the commits of xacts 2 and 1, which doesn't seem very
> trustworthy.

Well, I'm not absolutely certain, but I think this problem may indeed exist
in dbmirror.  If I'm reading it correctly, dbmirror basically has the
following:

    create table xact_queue (xid int, seqid serial, ...);
    create table tuple_queue (seqid int, data, ...);

The dbmirror trigger does this:

    myXid = GetCurrentTransactionId();
    insert into xact_queue (xid, seqid) values (myXid, nextval('seqid_seq'));
    insert into tuple_queue (seqid, data, ...) values (currval('seqid_seq'), ...);

The slave then grabs all queued xids in order of the max seqid within each
transaction.  Essentially,

    SELECT xid, MAX(seqid)
    FROM xact_queue
    GROUP BY xid
    ORDER BY MAX(seqid);

In your scenario it would order them xact1, then xact2, since xact 1's
update of Y would have the max seqid.  For each xact, it replays the tuples
for that xact in seqid order.

    SELECT t.seqid, t.data, ...
    FROM tuple_queue t, xact_queue x
    WHERE t.seqid = x.seqid
        AND x.xid = $XID
    ORDER BY t.seqid;

So the actual replay order would be

    xact1: update X
    xact1: update Y
    xact2: update Y

leading to slave inconsistency.
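
To make the difference concrete, here is a minimal sketch (not dbmirror code) that replays the three updates from the scenario above in two ways: grouped by transaction in the order listed above, and strictly by seqid with no grouping. The log layout, the values, and the function names are assumptions made up for illustration.

```python
# Hypothetical change log for the scenario above: (seqid, xid, key, value).
# seqid reflects the order in which the trigger queued each update.
LOG = [
    (1, 1, "X", "x-from-xact1"),
    (2, 2, "Y", "y-from-xact2"),
    (3, 1, "Y", "y-from-xact1"),
]


def replay(entries):
    """Apply updates in the given order and return the final table state."""
    state = {}
    for _seqid, _xid, key, value in entries:
        state[key] = value
    return state


def replay_in_sequence_order(log):
    """Ignore transaction grouping; apply strictly by seqid."""
    return replay(sorted(log, key=lambda e: e[0]))


def replay_grouped(log, xid_order):
    """Replay whole transactions in xid_order, each one in seqid order."""
    ordered = []
    for xid in xid_order:
        ordered.extend(sorted((e for e in log if e[1] == xid), key=lambda e: e[0]))
    return replay(ordered)


# On the master, xact 1's update of Y is the last write to Y.
by_sequence = replay_in_sequence_order(LOG)
# Grouped replay in the order xact1, then xact2, as listed above.
grouped = replay_grouped(LOG, xid_order=[1, 2])

print("sequence order:", by_sequence)
print("grouped order: ", grouped)
```

In this sketch the sequence-order replay ends with xact 1's value for Y, matching the master, while the grouped replay ends with xact 2's value.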

> What I'm envisioning is that you should just send updates in the order
> of their insertion sequence numbers and *not* try to force them into
> transactional grouping. ...

Very good.  Makes perfect sense to me now.  That also apparently obviates
the need for 64-bit transaction IDs, since the sequence can be a BIGINT.
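
As a rough back-of-the-envelope check on counter width only, the sketch below compares how long a 32-bit counter and a BIGINT sequence would last at a fixed consumption rate. The rate of 10,000 values per second is an arbitrary assumption, not a measurement, and the helper name is hypothetical.

```python
XID_VALUES = 2 ** 32          # distinct values of a 32-bit counter
BIGINT_MAX = 2 ** 63 - 1      # largest value of a signed 64-bit sequence
RATE_PER_SECOND = 10_000      # assumed consumption rate (hypothetical)
SECONDS_PER_YEAR = 365 * 24 * 3600


def years_until_exhausted(count, per_second):
    """Years needed to consume `count` values at `per_second` values/sec."""
    return count / per_second / SECONDS_PER_YEAR


years_32bit = years_until_exhausted(XID_VALUES, RATE_PER_SECOND)
years_bigint = years_until_exhausted(BIGINT_MAX, RATE_PER_SECOND)

print(f"32-bit counter: {years_32bit:.4f} years")
print(f"BIGINT sequence: {years_bigint:.3e} years")
```

At that assumed rate the 32-bit range is used up in well under a year, while the BIGINT range lasts for millions of years.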

Thanks,
Ed

