Thread: Logical decoding fast-forward and slot advance
Hi,

Attached is a patch which adds the ability to do fast-forwarding while decoding. That means WAL is consumed as fast as possible and changes are not given to the output plugin for sending. The implementation is less invasive than I originally thought it would be. Most of it is just an additional filter condition in places where we would normally filter out changes because we don't yet have a full snapshot.

This is useful for multiple things. It enables us to do replication slot advance for both physical and logical slots, something that Magnus took a stab at some time ago, but it does not seem like it went anywhere (this is useful for replication tooling). This patch adds the SQL-visible pg_replication_slot_advance() function for that use case.

It also makes the second phase (after we have reached SNAPBUILD_FULL_SNAPSHOT) of replication slot creation faster, especially when there are big transactions, as the reorder buffer does not have to deal with data changes and does not have to spill to disk.

And finally it will be useful for developing failover support of slots.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Hi!
Excellent!
Just a quick note on my stab at it: it was physical only, which is more limited than this, of course. My plan was to clean it up based on feedback in the next couple of days for the Jan CF, but with this submission in, I think the effort is better directed there. I'll keep mine around as a backup in case something shows up with yours that keeps it from making it in time, since mine is simpler.
/Magnus
On 31 December 2017 at 10:44, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote:

> Attached is patch which adds ability to do fast-forwarding while
> decoding. That means wal is consumed as fast as possible and changes are
> not given to output plugin for sending. The implementation is less
> invasive than I originally thought it would be. Most of it is just
> additional filter condition in places where we would normally filter out
> changes because we don't yet have full snapshot.

Looks good.

The precise definition of "slot advance" or "fast forward" isn't documented in the patch.

If we advance past everything, why is there not just one test in LogicalDecodingProcessRecord() to say if (ctx->fast_forward)? Why put it in multiple decoding subroutines?

If ctx->fast_forward is set it might throw off other opps, so it would be useful to see some Asserts elsewhere to make sure we understand and avoid breakage.

In pg_replication_slot_advance() the moveto variable is set to PG_GETARG_LSN(1) and then unconditionally overwritten before it is used for anything. Why?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 08/01/18 08:02, Simon Riggs wrote:
> On 31 December 2017 at 10:44, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote:
>
>> Attached is patch which adds ability to do fast-forwarding while
>> decoding. That means wal is consumed as fast as possible and changes are
>> not given to output plugin for sending. The implementation is less
>> invasive than I originally thought it would be. Most of it is just
>> additional filter condition in places where we would normally filter out
>> changes because we don't yet have full snapshot.
>
> Looks good.
>

Thanks.

> The precise definition of "slot advance" or "fast forward" isn't
> documented in the patch. If we advance past everything, why is there
> not just one test in LogicalDecodingProcessRecord() to say if
> (ctx->fast_forward)? Why put it in multiple decoding subroutines?
>

Because we still need to track transactions (otherwise the slot's restart position will not move forward) and mark transactions which made DDL changes so that historical snapshots are built. Otherwise, if we moved the slot forward and then started real decoding from that position, we'd have a wrong view of the catalogs.

We'd have to write a different version of LogicalDecodingProcessRecord() and duplicate some of the code from the Decode* functions there, which seems like it would be harder to maintain. I might be inclined to go that way if the current approach meant adding a new branch to every Decode* function, but since there is already a branch for filtering actions during the initial snapshot build, I think it's better to just extend that.

> If ctx->fast_forward is set it might throw off other opps, so it would
> be useful to see some Asserts elsewhere to make sure we understand and
> avoid breakage

Hmm, I think the only places where this can be an issue and can also be checked using an Assert are the cb wrappers in logical.c which call the output plugin (the output plugin is not supposed to be called when fast-forwarding), so I added an assert to each of them.
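To make the point about extending the existing filter branch concrete, here is a minimal standalone sketch. It is not the patch code: DecodingCtx, SnapState, decode_change() and change_cb_wrapper() are hypothetical stand-ins invented for illustration, and only the fast_forward flag name comes from the discussion above. It shows transaction tracking still happening while data changes are skipped, plus an assert in the callback wrapper.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins; these are NOT the PostgreSQL structs. */
typedef enum { SNAP_START, SNAP_FULL_SNAPSHOT, SNAP_CONSISTENT } SnapState;

typedef struct DecodingCtx
{
    bool      fast_forward;    /* consume WAL without emitting changes */
    SnapState snap_state;      /* how far snapshot building has progressed */
    int       xacts_tracked;   /* bookkeeping that must always happen */
    int       changes_emitted; /* number of calls into the output plugin */
} DecodingCtx;

/* Wrapper around the output plugin; it must never run while fast-forwarding. */
static void
change_cb_wrapper(DecodingCtx *ctx)
{
    assert(!ctx->fast_forward);
    ctx->changes_emitted++;
}

/* Decode one data-change record. */
static void
decode_change(DecodingCtx *ctx)
{
    /* Transaction tracking happens regardless of fast-forward mode. */
    ctx->xacts_tracked++;

    /*
     * The branch that already skips changes before a full snapshot exists
     * is extended with the fast_forward condition.
     */
    if (ctx->snap_state < SNAP_FULL_SNAPSHOT || ctx->fast_forward)
        return;

    change_cb_wrapper(ctx);
}
```

With fast_forward set, repeated calls to decode_change() keep incrementing xacts_tracked while changes_emitted stays at zero, so the wrapper's assert is never reached.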
> In pg_replication_slot_advance() the moveto variable is set to
> PG_GETARG_LSN(1) and then unconditionally overwritten before it is
> used for anything. Why?
>

Eh, there is a missing Min; it should be used for clamping, not done unconditionally. Fixed and added a regression test for this.

Updated version attached.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
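For readers following the moveto exchange, the clamping can be sketched roughly as below. This is an illustration under assumptions, not the patch: clamp_moveto() is a hypothetical helper, XLogRecPtr and Min are redefined locally so the snippet stands alone, and the upper bound is assumed here to be the current end of WAL, which the thread does not spell out.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins so the sketch compiles outside the PostgreSQL tree. */
typedef uint64_t XLogRecPtr;
#define Min(x, y) ((x) < (y) ? (x) : (y))

/*
 * Hypothetical helper: keep the caller-supplied target LSN, but never let
 * it exceed the upper bound (assumed to be the current end of WAL).
 * The bug discussed above corresponds to returning current_end
 * unconditionally and ignoring the requested value.
 */
static XLogRecPtr
clamp_moveto(XLogRecPtr requested, XLogRecPtr current_end)
{
    return Min(requested, current_end);
}
```

A request below the bound is returned unchanged, while a request beyond it is reduced to the bound.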
Attachment
On 14 January 2018 at 23:15, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote:

> Updated version attached.

Applied, thanks

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services