Hello hackers,
While working on two-phase commit related issues, I found a few things in the two-phase code that could be optimized.
1. The current implementation decouples PREPARE and COMMIT/ABORT PREPARE a lot. This is flexible, but if
PREPARE and COMMIT/ABORT PREPARE mostly happen on the same backend, we could use a cache to
speed things up, e.g.
a. FinishPreparedTransaction()->LockGXact(gid, user)
       for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
           find the gxact that matches gid
For this we can cache the gxact during PREPARE and use it as a fast path, i.e. if the cached gxact
matches gid we do not need to walk through the gxact array. By the way, if the gxact array is large, the
linear walk is a separate performance issue (use a shared-memory hash table if needed?).
b. FinishPreparedTransaction() reads the PREPARE information from either the state file (written during checkpoint)
or the WAL. We could cache the content during PREPARE, i.e. in EndPrepare(), so that
FinishPreparedTransaction() can avoid reading the state file or the WAL.
It is possible that some databases based on Postgres two-phase commit might not want the cache, e.g. if the PREPARE
backend is always different from the COMMIT/ABORT PREPARE backend (I do not know of any database
designed like this though), but gxact caching has almost no overhead, and for b we could use an #ifdef to guard the
code that copies the PREPARE WAL data.
Both optimizations are small and easy to implement. I've verified them on Greenplum database (based on Postgres 12).
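To make (a) concrete, here is a minimal standalone sketch of the fast path. It is not the Greenplum patch and not PostgreSQL code: MockGXact, mock_prepare(), mock_lock_gxact(), cached_gxact and MAX_PREP_XACTS are hypothetical names I made up for illustration, and the array only stands in for TwoPhaseState. Locking, the user/database checks and the "gxact is valid" check are left out on purpose.

```c
#include <stddef.h>
#include <string.h>

#define GIDSIZE 200          /* assumed to match GIDSIZE in the PostgreSQL sources */
#define MAX_PREP_XACTS 8     /* hypothetical capacity, small for illustration */

/* Simplified stand-in for a gxact entry; only the gid is modeled. */
typedef struct MockGXact
{
    char gid[GIDSIZE];
} MockGXact;

/* Simplified stand-in for the shared gxact array. */
static MockGXact prep_xacts[MAX_PREP_XACTS];
static int num_prep_xacts = 0;

/* Hypothetical backend-local cache, filled in at PREPARE time. */
static MockGXact *cached_gxact = NULL;

/* Counts array entries inspected, so the fast path is observable. */
static int slow_path_probes = 0;

/* PREPARE side: register the gxact and remember it in the local cache. */
static MockGXact *
mock_prepare(const char *gid)
{
    MockGXact *gxact;

    if (num_prep_xacts >= MAX_PREP_XACTS || strlen(gid) >= GIDSIZE)
        return NULL;
    gxact = &prep_xacts[num_prep_xacts++];
    strcpy(gxact->gid, gid);
    cached_gxact = gxact;
    return gxact;
}

/* COMMIT/ABORT side: try the cached entry first, then walk the array. */
static MockGXact *
mock_lock_gxact(const char *gid)
{
    int i;

    if (cached_gxact != NULL && strcmp(cached_gxact->gid, gid) == 0)
        return cached_gxact;            /* fast path: no array walk */

    for (i = 0; i < num_prep_xacts; i++)
    {
        slow_path_probes++;
        if (strcmp(prep_xacts[i].gid, gid) == 0)
            return &prep_xacts[i];
    }
    return NULL;
}
```

In a real patch the cached pointer would also have to be reset whenever the gxact slot is released or reused, otherwise the fast path could return a stale entry.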
2. WAL content duplication between PREPARE and COMMIT/ABORT PREPARE
See the COMMIT PREPARE function call below. Those hdr->* fields already exist in the PREPARE WAL record, so we do
not need them in the COMMIT PREPARE WAL record as well. During recovery, we could load this information (for both
COMMIT and ABORT) into memory, and in COMMIT/ABORT PREPARE redo we use the corresponding data.
    RecordTransactionCommitPrepared(xid,
                                    hdr->nsubxacts, children,
                                    hdr->ncommitrels, commitrels,
                                    hdr->ninvalmsgs, invalmsgs,
                                    hdr->initfileinval, gid);
One drawback is that this might involve a non-trivial change.
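As a rough illustration of the recovery side, the sketch below keeps the PREPARE header data in memory keyed by xid, so that COMMIT/ABORT PREPARE redo could look it up instead of carrying its own copy. Everything here is hypothetical (PrepareInfo, remember_prepare(), lookup_prepare(), forget_prepare(), the fixed-size table); only the counters are modeled, while the real thing would also need the children, rels and invalidation message arrays.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TRACKED 8   /* hypothetical capacity, small for illustration */

/* Hypothetical in-memory copy of the fields named after hdr->* above. */
typedef struct PrepareInfo
{
    bool     in_use;
    uint32_t xid;
    int      nsubxacts;
    int      ncommitrels;
    int      ninvalmsgs;
    bool     initfileinval;
} PrepareInfo;

static PrepareInfo tracked[MAX_TRACKED];

/* Would be called from PREPARE redo: remember the header data. */
static bool
remember_prepare(const PrepareInfo *info)
{
    int i;

    for (i = 0; i < MAX_TRACKED; i++)
    {
        if (!tracked[i].in_use)
        {
            tracked[i] = *info;
            tracked[i].in_use = true;
            return true;
        }
    }
    return false;               /* table full */
}

/* Would be called from COMMIT/ABORT PREPARE redo: fetch the data by xid. */
static const PrepareInfo *
lookup_prepare(uint32_t xid)
{
    int i;

    for (i = 0; i < MAX_TRACKED; i++)
    {
        if (tracked[i].in_use && tracked[i].xid == xid)
            return &tracked[i];
    }
    return NULL;
}

/* Drop the entry once COMMIT/ABORT PREPARE redo has consumed it. */
static void
forget_prepare(uint32_t xid)
{
    int i;

    for (i = 0; i < MAX_TRACKED; i++)
    {
        if (tracked[i].in_use && tracked[i].xid == xid)
            tracked[i].in_use = false;
    }
}
```

One thing such a design would probably have to handle: a PREPARE record that is older than the point where redo starts is not replayed, so its data would have to come from the state file instead.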
3. About gid: currently gid is defined as a char[]. I'm wondering if we should define an opaque type and let
databases implement their own gid types using callbacks. Typically, if I want to use a 64-bit distributed xid as gid,
the current code is not that performance and storage friendly (e.g. it still needs strcmp to find the gxact in
LockGXact()).
We could provide char[] as the default implementation. gid is not widely used, so the change seems to
be small (interfaces for copy, comparison, conversion from string to the internal gid type for the PREPARE statement,
etc.).
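To show what I mean by callbacks, here is a small standalone sketch. The GidOps struct and all function names in it are hypothetical, nothing like this exists in PostgreSQL today. It provides the default char[] implementation plus a 64-bit one where comparison is a fixed-size memcmp instead of strcmp.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GIDSIZE 200     /* assumed to match GIDSIZE in the PostgreSQL sources */

/* Hypothetical callback table describing one gid representation. */
typedef struct GidOps
{
    size_t size;                                      /* bytes of storage per gid */
    bool (*from_string)(const char *str, void *out);  /* parse PREPARE's gid text */
    bool (*equal)(const void *a, const void *b);
    void (*copy)(void *dst, const void *src);
} GidOps;

/* Default implementation: gid stays a NUL-terminated char[GIDSIZE]. */
static bool
str_gid_from_string(const char *str, void *out)
{
    if (strlen(str) >= GIDSIZE)
        return false;
    strcpy((char *) out, str);
    return true;
}

static bool
str_gid_equal(const void *a, const void *b)
{
    return strcmp((const char *) a, (const char *) b) == 0;
}

static void
str_gid_copy(void *dst, const void *src)
{
    strcpy((char *) dst, (const char *) src);
}

static const GidOps string_gid_ops = {
    GIDSIZE, str_gid_from_string, str_gid_equal, str_gid_copy
};

/* Alternative implementation: gid is a 64-bit distributed xid. */
static bool
u64_gid_from_string(const char *str, void *out)
{
    char *end;
    unsigned long long v;
    uint64_t gid;

    if (*str < '0' || *str > '9')
        return false;           /* reject empty input, signs and whitespace */
    errno = 0;
    v = strtoull(str, &end, 10);
    if (errno != 0 || *end != '\0')
        return false;           /* overflow or trailing garbage */
    gid = (uint64_t) v;
    memcpy(out, &gid, sizeof(gid));
    return true;
}

static bool
u64_gid_equal(const void *a, const void *b)
{
    return memcmp(a, b, sizeof(uint64_t)) == 0;
}

static void
u64_gid_copy(void *dst, const void *src)
{
    memcpy(dst, src, sizeof(uint64_t));
}

static const GidOps u64_gid_ops = {
    sizeof(uint64_t), u64_gid_from_string, u64_gid_equal, u64_gid_copy
};
```

With something like this, LockGXact would call ops->equal() rather than strcmp, and the gxact entry would only need ops->size bytes for the gid.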
Any thoughts?
Regards,
Paul