Dear Hackers,
I have spent some time implementing the patch set, and I think it is time to
share it on -hackers.
Patches 0001-0004 are largely unchanged; only some refactoring was done.
0004 now has a basic test for dependency tracking.
The remaining patches enhance the parallel apply feature. 0006, 0007, and 0008
contain tests.
0005 was copied from [1]. The patch is needed for applying prepared
transactions correctly. If you have any comments on it, please post them
at [1].
0006 contains changes for supporting two-phase transactions in parallel.
Parallel workers can be assigned when the BEGIN_PREPARE message comes, and
released after the PREPARE message. As with normal non-streamed transactions,
prepared transactions are marked as parallelized when the leader dispatches a
PREPARE message to the parallel workers, and they are removed when the parallel
worker finishes preparing. This lets subsequent transactions hold off
committing until the parallel worker finishes the preparation.
As with streamed transactions, COMMIT/ROLLBACK PREPARED messages are handled by
the leader worker. At that time, the leader waits for the most recently
launched transaction to finish.
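To illustrate the bookkeeping described for 0006, here is a toy Python model.
It is not code from the patch: the class and method names are hypothetical, and
it only mirrors the mark-on-PREPARE / remove-on-finish behavior and the
leader's wait before COMMIT/ROLLBACK PREPARED.

```python
class LeaderState:
    """Toy model of the leader's bookkeeping; all names are hypothetical."""

    def __init__(self):
        self.parallelized = set()  # XIDs handed to parallel workers, not yet finished
        self.last_launched = None  # most recently dispatched XID

    def dispatch_prepare(self, xid):
        # The leader forwards PREPARE to a parallel worker:
        # mark the transaction as parallelized.
        self.parallelized.add(xid)
        self.last_launched = xid

    def worker_finished_prepare(self, xid):
        # The parallel worker finished preparing: drop the entry.
        self.parallelized.discard(xid)

    def must_wait_for(self, xid):
        # A later transaction that depends on xid may not commit
        # while xid is still being prepared.
        return xid in self.parallelized

    def ready_for_commit_prepared(self):
        # COMMIT/ROLLBACK PREPARED is handled by the leader, which first
        # waits for the most recently launched transaction to finish.
        return (self.last_launched is None
                or self.last_launched not in self.parallelized)
```

In this model, must_wait_for(xid) stays true from dispatch_prepare(xid) until
worker_finished_prepare(xid), which is the window in which later transactions
have to hold off.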
0007 contains changes to track dependencies for streamed transactions.
In streaming=on mode, dependency tracking and waiting are performed while changes
are applied. The leader does nothing while serializing changes.
In the case of streaming=parallel mode, we must track and wait based on
dependencies. Basically, non-streamed transactions do not have to wait for
streamed transactions because the leader worker always waits for them to be
applied. In contrast, streamed transactions must wait for the most recently
dispatched non-streamed transaction. Based on that, streamed transactions won't
be marked as parallelized, and the XID of the streamed transaction won't be set
for the replica identity hash entry. This means no parallel workers would wait
for the streamed transactions. Other than that, dependency tracking is done the
same way as in the non-streaming case.
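The asymmetry between streamed and non-streamed transactions in
streaming=parallel mode can be sketched with a toy Python model. This is not
the patch's code: the names are hypothetical, the replica identity hash is
modeled as a plain dict, and a non-streamed transaction is treated as
"dispatched" as soon as one of its changes is tracked, which is a
simplification.

```python
class DependencyTracker:
    """Toy model of RI-based dependency tracking; all names are hypothetical."""

    def __init__(self):
        # (relid, replica identity key) -> XID of the last non-streamed writer
        self.ri_hash = {}
        self.last_nonstreamed = None

    def track(self, xid, relid, ri_key, streamed):
        """Return the set of XIDs that xid must wait for because of this change."""
        deps = set()
        prev = self.ri_hash.get((relid, ri_key))
        if prev is not None and prev != xid:
            deps.add(prev)
        if streamed:
            # Streamed transactions wait for the most recently dispatched
            # non-streamed transaction and are NOT recorded in the hash,
            # so no parallel worker ever waits for them.
            if self.last_nonstreamed is not None:
                deps.add(self.last_nonstreamed)
        else:
            self.ri_hash[(relid, ri_key)] = xid
            self.last_nonstreamed = xid
        return deps
```

For example, after non-streamed XIDs 10 and 11 touch the same key, a streamed
XID 20 touching a different key still depends on 11, but leaves no hash entry
behind.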
0008 contains changes to track dependencies based on subscriber-local indexes.
This extends the RI hash table so that values can also be stored based on local
indexes. The leader gathers the list of indexes defined on a table the first
time dependency checking is done for that table in a transaction.
The detection mechanism is mostly the same as in the RI case.
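As a rough illustration of the local-index extension, here is a toy Python
model. It is only a sketch under assumptions: the names are hypothetical, the
"catalog" dict stands in for whatever lookup the leader really performs, and
the hash key layout (relid, index name, key values) is my own choice for the
example, not necessarily what the patch uses.

```python
class LocalIndexTracker:
    """Toy model of index-based dependency tracking; all names are hypothetical."""

    def __init__(self, catalog):
        # catalog: relid -> {index name: tuple of column names}
        self.catalog = catalog
        self.hash = {}          # (relid, index name, key values) -> XID of last writer
        self.seen_in_txn = {}   # relid -> index definitions gathered in this transaction
        self.catalog_lookups = 0

    def begin(self):
        # Index information is gathered again for each transaction.
        self.seen_in_txn.clear()

    def _indexes(self, relid):
        # Gather the index list only on the first dependency check
        # for this table in the current transaction.
        if relid not in self.seen_in_txn:
            self.catalog_lookups += 1
            self.seen_in_txn[relid] = self.catalog.get(relid, {})
        return self.seen_in_txn[relid]

    def track(self, xid, relid, row):
        """Return the XIDs that xid must wait for because of this row."""
        deps = set()
        for name, cols in self._indexes(relid).items():
            key = (relid, name, tuple(row[c] for c in cols))
            prev = self.hash.get(key)
            if prev is not None and prev != xid:
                deps.add(prev)
            self.hash[key] = xid
        return deps
```

Two changes to the same table within one transaction cause a single lookup,
and a later transaction writing the same index key value picks up a dependency
on the earlier writer.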
What do you think?
[1]:
https://www.postgresql.org/message-id/TY4PR01MB169078771FB31B395AB496A6B94B4A%40TY4PR01MB16907.jpnprd01.prod.outlook.com
[2]:
https://www.postgresql.org/message-id/OS0PR01MB5716D43CB68DB8FFE73BF65D942AA%40OS0PR01MB5716.jpnprd01.prod.outlook.com
Best regards,
Hayato Kuroda
FUJITSU LIMITED