[PATCH] add concurrent_abort callback for output plugin - Mailing list pgsql-hackers

From Markus Wanner
Subject [PATCH] add concurrent_abort callback for output plugin
Date
Msg-id f82133c6-6055-b400-7922-97dae9f2b50b@enterprisedb.com
Responses Re: [PATCH] add concurrent_abort callback for output plugin  (Andres Freund <andres@anarazel.de>)
Re: [PATCH] add concurrent_abort callback for output plugin  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
Hi,

here is another tidbit from our experience with using logical decoding. 
The attached patch adds a callback that notifies the output plugin of a 
concurrent abort.  Below, I describe the problem in more detail and how 
this additional callback solves it.

Streamed transactions as well as two-phase commit transactions may get 
decoded before they finish.  By the time the begin_cb is invoked and the 
first changes are delivered to the output plugin, it is not necessarily 
known whether the transaction will commit or abort.

This leads to the possibility of the transaction being aborted 
concurrently with logical decoding.  In that case, the decoder is likely 
to error out on a catalog scan that conflicts with the abort of the 
transaction.  The reorderbuffer sports a PG_CATCH block to clean up. 
However, it does not currently inform the output plugin.  From the 
plugin's point of view, the transaction is left dangling until another 
one comes along, or until the final ROLLBACK or ROLLBACK PREPARED record 
is decoded from WAL.  Therefore, what the output plugin might see in 
this case is:

* filter_prepare_cb (txn A)
* begin_prepare_cb  (txn A)
* apply_change      (txn A)
* apply_change      (txn A)
* apply_change      (txn A)
* begin_cb          (txn B)

In other words, in this example only the begin_cb of the following 
transaction implicitly tells the output plugin that txn A could not be 
fully decoded.  And there is no upper time bound on when that may 
happen.  (It could also be another filter_prepare_cb, if the subsequent 
transaction happens to be a two-phase transaction as well.  Or an 
explicit rollback_prepared_cb or stream_abort, if there is no other 
transaction in between.)
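To make this concrete, here is a small self-contained C simulation of the callback stream sketched above.  This is not PostgreSQL code: all callback names are stand-ins, and the concurrent_abort callback itself is the hypothetical addition.  It shows that without the new callback, the plugin only learns of txn A's fate once some later callback for txn B arrives:

```c
#include <assert.h>

/* Simulated output plugin state. */
static int current_txn;     /* txn the plugin believes is in progress */
static int ncalls;          /* callbacks delivered so far */
static int learned_at_call; /* callback count at which the plugin
                             * learned the decoded txn had failed */

static void filter_prepare_cb(int txn) { ncalls++; (void) txn; }
static void begin_prepare_cb(int txn)  { ncalls++; current_txn = txn; }
static void apply_change(int txn)      { ncalls++; assert(txn == current_txn); }

/* The proposed callback: invoked from the reorderbuffer's PG_CATCH block
 * as soon as the catalog scan fails due to the concurrent abort.  (Not
 * counted in ncalls, so learned_at_call reflects how far the original
 * stream got before the plugin knew.) */
static void concurrent_abort_cb(int txn)
{
    assert(txn == current_txn);
    learned_at_call = ncalls;
    current_txn = 0;
}

static void begin_cb(int txn)
{
    ncalls++;
    /* Without the new callback, only here can the plugin deduce that
     * the previous txn was never fully decoded. */
    if (current_txn != 0)
        learned_at_call = ncalls;
    current_txn = txn;
}

/* Replay the sequence from above: txn A (id 1) aborts mid-decode, then
 * txn B (id 2) comes along.  Returns after how many callbacks of the
 * original stream the plugin learned about A's fate. */
static int run_scenario(int with_concurrent_abort)
{
    current_txn = ncalls = learned_at_call = 0;
    filter_prepare_cb(1);
    begin_prepare_cb(1);
    apply_change(1);
    apply_change(1);
    apply_change(1);
    if (with_concurrent_abort)
        concurrent_abort_cb(1); /* notified immediately */
    begin_cb(2);                /* otherwise: not until txn B shows up */
    return learned_at_call;
}
```

With the callback in place, the plugin learns about the abort right after the last change of txn A (call 5 in this simulation) rather than only when some unrelated begin_cb arrives, which may be arbitrarily much later.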

An alternative and arguably cleaner approach for streamed transactions 
may be to directly invoke stream_abort.  However, the lsn argument 
passed could not be that of the abort record, as that is not yet known 
at the time of the concurrent abort.  Plus, this seems like a bad fit 
for two-phase commit transactions.

Again, this callback is especially important for output plugins that 
invoke further actions on downstream nodes which delay the COMMIT 
PREPARED of a transaction upstream, e.g. until it is prepared on other 
nodes.  Up until now, the output plugin has had no way to learn about a 
concurrent abort of the currently decoded (2PC or streamed) transaction 
(short of continuously polling the transaction status).

I also think it generally improves the API by allowing the output plugin 
to rely on such a callback, rather than having to implicitly deduce this 
from other callbacks.
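For concreteness, the callback could take a shape along the following lines, mirroring the style of the existing LogicalDecode*CB typedefs.  The name and exact parameters here are my illustration, not necessarily what the attached patch uses, and the stub types merely stand in for PostgreSQL's own declarations in the server headers:

```c
#include <stdint.h>

/* Stand-ins for PostgreSQL's own types, just to keep this sketch
 * self-contained; the real declarations live in the server headers. */
typedef uint64_t XLogRecPtr;
struct LogicalDecodingContext;
typedef struct ReorderBufferTXN ReorderBufferTXN;

/* Hypothetical callback type: invoked from the reorderbuffer's cleanup
 * path when the currently decoded (2PC or streamed) transaction is
 * detected to have been aborted concurrently.  The lsn of the abort
 * record is not known at this point, so the decoder would have to pass
 * the current decoding position (or an invalid lsn) instead. */
typedef void (*LogicalDecodeConcurrentAbortCB) (struct LogicalDecodingContext *ctx,
                                                ReorderBufferTXN *txn,
                                                XLogRecPtr lsn);

/* Example no-op implementation an output plugin could register. */
static void
noop_concurrent_abort(struct LogicalDecodingContext *ctx,
                      ReorderBufferTXN *txn, XLogRecPtr lsn)
{
    (void) ctx;
    (void) txn;
    (void) lsn;
}
```

A plugin that coordinates 2PC across nodes would, in its implementation, release whatever downstream resources it acquired for the dangling transaction instead of waiting for an unrelated callback.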

Thoughts or comments?  If this is agreed on, I can look into adding 
tests (concurrent aborts are not currently covered, it seems).

Regards

Markus

