Thread: Logical archiving

Logical archiving

From

Andrey Borodin

Date:

04 December 2020, 07:33:44

Hi all

I was discussing problems of CDC with scientific community and they asked this simple question: "So you have efficient
WALarchive on a very cheap storage, why don't you have a logical archive too?" 
This seems like a wild idea. But really, we have a super expensive NVMe drives for OLTP workload. And use this devices
tostore buffer for data to be dumped into MapReduce\YT analytical system. 
If OLAP cannot consume data fast enough - we are out of space due to repl slot.
If we have a WAL HA switchover - OLAP has a hole in the stream and have to resync data from the scratch.

If we could just run archive command ```archive-tool wal-push 0000000900000F2C000000E1.logical``` with contents of
logicalreplication - this would be super cool for OLAP. I'd prefer even avoid writing 0000000900000F2C000000E1.logical
todisk, i.e. push data on stdio or something like that. 

What do you think?

Best regards, Andrey Borodin.

Re: Logical archiving

From

Euler Taveira

Date:

04 December 2020, 17:14:50

On Fri, 4 Dec 2020 at 04:33, Andrey Borodin <x4mmm@yandex-team.ru> wrote:

I was discussing problems of CDC with scientific community and they asked this simple question: "So you have efficient WAL archive on a very cheap storage, why don't you have a logical archive too?"

WAL archive doesn't process data; it just copies from one location into another one. However, "logical archive" must process data.

If we could just run archive command ```archive-tool wal-push 0000000900000F2C000000E1.logical``` with contents of logical replication - this would be super cool for OLAP. I'd prefer even avoid writing 0000000900000F2C000000E1.logical to disk, i.e. push data on stdio or something like that.

The most time consuming process is logical decoding, mainly due to long running transactions. In order to minimize your issue, we should improve the logical decoding mechanism. There was a discussion about allowing logical decoding on the replica that would probably help your use case a lot.

Euler Taveira http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Logical archiving

From

Andrey Borodin

Date:

04 December 2020, 17:36:25

Hi Euler!

Thanks for your response.

> 4 дек. 2020 г., в 22:14, Euler Taveira <euler.taveira@2ndquadrant.com> написал(а):
>
> On Fri, 4 Dec 2020 at 04:33, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
>
> I was discussing problems of CDC with scientific community and they asked this simple question: "So you have
efficientWAL archive on a very cheap storage, why don't you have a logical archive too?" 
>
> WAL archive doesn't process data; it just copies from one location into another one. However, "logical archive" must
processdata. 
WAL archiving processes data: it does compression, encryption and digesting. Only minimal impractical setup will copy
dataas is. However I agree, that all processing is done outside postgres. 

> If we could just run archive command ```archive-tool wal-push 0000000900000F2C000000E1.logical``` with contents of
logicalreplication - this would be super cool for OLAP. I'd prefer even avoid writing 0000000900000F2C000000E1.logical
todisk, i.e. push data on stdio or something like that. 
>
> The most time consuming process is logical decoding, mainly due to long running transactions.
Currently I do not experience problem of high CPU utilisation.

> In order to minimize your issue, we should improve the logical decoding mechanism.
No, the issue I'm facing comes from the fact that corner cases of failover are not solved properly for logical
replication.Timelines, partial segments, archiving along with streaming, starting from arbitrary LSN (within available
WAL),rewind, named restore points, cascade replication etc etc. All these nice things are there for WAL and are missing
forLR. I'm just trying to find shortest path through this to make CDC(changed data capture) work. 

> There was a discussion about allowing logical decoding on the replica that would probably help your use case a lot.
I will look there more closely, thanks! But it's only part of a solution.


Best regards, Andrey Borodin.

Re: Logical archiving

From

Euler Taveira

Date:

04 December 2020, 18:28:20

On Fri, 4 Dec 2020 at 14:36, Andrey Borodin <x4mmm@yandex-team.ru> wrote:

>
> The most time consuming process is logical decoding, mainly due to long running transactions.
Currently I do not experience problem of high CPU utilisation.

I'm wondering why the LSN isn't moving fast enough for your use case.

> In order to minimize your issue, we should improve the logical decoding mechanism.
No, the issue I'm facing comes from the fact that corner cases of failover are not solved properly for logical replication. Timelines, partial segments, archiving along with streaming, starting from arbitrary LSN (within available WAL), rewind, named restore points, cascade replication etc etc. All these nice things are there for WAL and are missing for LR. I'm just trying to find shortest path through this to make CDC(changed data capture) work.

Craig started a thread a few days ago [1] that described some of these issues and possible solutions [2]. The lack of HA with logical replication reduces the number of solutions that could possibly use this technology. Some of the facilities such as logical replication slots and replication origin on failover-candidate subscribers should encourage users to adopt such solutions.

[1] https://www.postgresql.org/message-id/CAGRY4nx0-ZVnFJV5749QCqwmqBMkjQpcFkYY56a9U6Vf%2Bf7-7Q%40mail.gmail.com

[2] https://wiki.postgresql.org/wiki/Logical_replication_and_physical_standby_failover

Euler Taveira http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Logical archiving

From

Craig Ringer

Date:

07 December 2020, 03:05:12

Reply follows inline. I addressed your last point first, so it's out of order.

On Fri, 4 Dec 2020 at 15:33, Andrey Borodin <x4mmm@yandex-team.ru> wrote

> If OLAP cannot consume data fast enough - we are out of space due to repl slot.

There is a much simpler solution to this than logical PITR.

What we should be doing is teaching xlogreader how to invoke the restore_command to fetch archived WALs for decoding.

Replication slots already have a WAL retention limit, but right now when that limit is reached the slot is invalidated and becomes useless, it's effectively dropped. Instead, if WAL archiving is enabled, we should leave the slot as valid. If a consumer of the slot needs WAL that no longer exists in pg_wal, we should have the walsender invoke the restore_command to read the missing WAL segment, decode it, and remove it again.

This would not be a technically difficult patch, and it's IMO one of the more important ones for improving logical replication.

> I was discussing problems of CDC with scientific community and they asked this simple question: "So you have efficient WAL archive on a very cheap storage, why don't you have a logical archive too?"

I've done work in this area, as has Petr (CC'd).

In short, logical archiving and PITR is very much desirable, but we're not nearly ready for it yet and we're missing a lot of the foundations needed to make it really useful.

IMO the strongest pre-requisite is that we need integrated DDL capture and replication in Pg. While this could be implemented in the publisher/subscriber logic for logical replication, it would make much more sense (IMO) to make it easier to feed DDL events into any logical replication output plugin.

pglogical3 (the closed one) has quite comprehensive DDL replication support. Doing it is not simple though - there are plenty of complexities:

* Reliably identifying the target objects and mapping them to replication set memberships for DML-replication
* Capturing, replicating and managing the search_path and other DDL execution context (DateStyle and much more) reliably

Each statement type needs specific logic to indicate whether it needs DDL replication (and often filter functions since we have lots of sub-types where some need replication and some don't)
Handling DDL affecting global objects in pg_global correctly, like those affecting roles, grants, database security labels etc. There's no one right answer for this, it depends on the deployment and requires the user to cooperate.
Correct handling of transactions that mix DDL and DML (mostly only an issue for multimaster).
Identifying statements that target a mix of replicated and non-replicated objects and handling them appropriately, including for CASCADEs
Gracefully handling DDL statements that mix TEMPORARY and persistent targets. We can do this ok for DROPs but it still requires care. Anything else gets messier.
Lack of hooks into table rewrite operations and the extremely clumsy and inefficient way logical decoding currently exposes decoding of the temp-table data during decoding of rewrites means handling table-rewriting DDL is difficult and impractical to do correctly. In pglogical we punt on it entirely and refuse to permit DDL that would rewrite a table except where we can prove it's reliant only on immutable inputs so we can discard the upstream rewrite and rely on statement replication.
As a consequence of the above, reliably determining whether a given statement will cause a table rewrite.
Handling re-entrant ProcessUtility_hook calls for ALTER TABLE etc.
Handling TRUNCATE's pseudo-DDL pseudo-DML halfway state, doing something sensible for truncate cascade.
Probably more I've forgotten

If we don't handle these, then any logical change-log archives will become largely useless as soon as there's any schema change.

So we kind of have to solve DDL replication first IMO.

Some consideration is also required for metadata management. Right now relation and type metadata has session-lifetime, but you'd want to be able to discard old logical change-stream archives and have the later ones still be usable. So we'd need to define some kind of restartpoint where we repeat the metadata, or we'd have to support externalizing the metadata so it can be retained when the main change archives get aged out.

We'd also need to separate the existing apply worker into a "receiver" and "apply/writer" part, so the wire-protocol handling isn't tightly coupled with the actual change apply code, in order to make it possible to actually consume those archives and apply them to the database. In pglogical3 we did that by splitting them into two processes, connected by a shm_mq. Originally the process split was optional and you could run a combined receiver/writer process without the shm_mq if you wanted, but we quickly found it difficult to reliably handle locking issues etc that way so the writers all moved out-of-process.

That was done mainly to make it possible to support parallelism in logical decoding apply. But we also have the intention of supporting an alternative reader process that can ingest "logical archives" and send them to the writer to apply them, as if they'd been received from the on-wire stream. That's not implemented at this time though. It'd be useful for a number of things:

* PITR-style logical replay and recovery

* Ability to pre-decode a txn once on the upstream then send the buffered protocol-stream to multiple subscribers, saving on logical decoding and reorder buffering overheads and write-multiplication costs

* ability to ingest change-streams generated by non-postgres sources so we could support streaming foreign-data ingestion, streaming OLAP and data warehousing, etc

To make logical PITR more useful we'd also want to be a bit more tolerant of schema divergence, though that's not overly hard to do:

fill defaults for downstream columns if no value is present for the column in the upstream row and the downstream column is nullable or has a default (I think built-in logical rep does this one already)
ignore values for columns in upstream data if the downstream table lacks the column and the upstream value is null
optionally allow apply to be configured to ignore non-null data in upstream columns that're missing on downstream
optionally allow apply to be configured to drop rows on the floor if the downstream table is missing
policies for handling data conflicts like duplicate PKs

and we'd probably want ways to filter the apply data-stream to apply changes for only a subset of tables, rows, etc at least in a later version.

None of this is insurmountable. Most or all of the DDL replication support and divergence-tolerance stuff is already done in production deployments using pglogical3 and bdr3.

While I can't share the code, I am happy to share the experience I have gained from my part in working on these things. As you've probably recently seen with the wiki article I wrote on physical/logical failover interop.

You're free to take information like this and use it in wiki articles too.

Right now I won't be able to launch into writing big patches for these things, but I'll do my best to share what I can and review things.

> This seems like a wild idea. But really, we have a super expensive NVMe drives for OLTP workload. And use this devices to store buffer for data to be dumped into MapReduce\YT analytical system.

It's not a wild idea at all, as noted above.

In pglogical3 we already support streaming decoded WAL data to alternative writer downstreams including RabbitMQ and Kafka via writer plugins.

Re: Logical archiving

From

Craig Ringer

Date:

07 December 2020, 03:05:35

Actually CC'd Petr this time.

On Mon, 7 Dec 2020 at 11:05, Craig Ringer <craig.ringer@enterprisedb.com> wrote:

Reply follows inline. I addressed your last point first, so it's out of order.

On Fri, 4 Dec 2020 at 15:33, Andrey Borodin <x4mmm@yandex-team.ru> wrote

> If OLAP cannot consume data fast enough - we are out of space due to repl slot.

There is a much simpler solution to this than logical PITR.

What we should be doing is teaching xlogreader how to invoke the restore_command to fetch archived WALs for decoding.

Replication slots already have a WAL retention limit, but right now when that limit is reached the slot is invalidated and becomes useless, it's effectively dropped. Instead, if WAL archiving is enabled, we should leave the slot as valid. If a consumer of the slot needs WAL that no longer exists in pg_wal, we should have the walsender invoke the restore_command to read the missing WAL segment, decode it, and remove it again.

This would not be a technically difficult patch, and it's IMO one of the more important ones for improving logical replication.

> I was discussing problems of CDC with scientific community and they asked this simple question: "So you have efficient WAL archive on a very cheap storage, why don't you have a logical archive too?"

I've done work in this area, as has Petr (CC'd).

In short, logical archiving and PITR is very much desirable, but we're not nearly ready for it yet and we're missing a lot of the foundations needed to make it really useful.

IMO the strongest pre-requisite is that we need integrated DDL capture and replication in Pg. While this could be implemented in the publisher/subscriber logic for logical replication, it would make much more sense (IMO) to make it easier to feed DDL events into any logical replication output plugin.

pglogical3 (the closed one) has quite comprehensive DDL replication support. Doing it is not simple though - there are plenty of complexities:

* Reliably identifying the target objects and mapping them to replication set memberships for DML-replication
* Capturing, replicating and managing the search_path and other DDL execution context (DateStyle and much more) reliably
Each statement type needs specific logic to indicate whether it needs DDL replication (and often filter functions since we have lots of sub-types where some need replication and some don't)
Handling DDL affecting global objects in pg_global correctly, like those affecting roles, grants, database security labels etc. There's no one right answer for this, it depends on the deployment and requires the user to cooperate.
Correct handling of transactions that mix DDL and DML (mostly only an issue for multimaster).
Identifying statements that target a mix of replicated and non-replicated objects and handling them appropriately, including for CASCADEs
Gracefully handling DDL statements that mix TEMPORARY and persistent targets. We can do this ok for DROPs but it still requires care. Anything else gets messier.
Lack of hooks into table rewrite operations and the extremely clumsy and inefficient way logical decoding currently exposes decoding of the temp-table data during decoding of rewrites means handling table-rewriting DDL is difficult and impractical to do correctly. In pglogical we punt on it entirely and refuse to permit DDL that would rewrite a table except where we can prove it's reliant only on immutable inputs so we can discard the upstream rewrite and rely on statement replication.
As a consequence of the above, reliably determining whether a given statement will cause a table rewrite.
Handling re-entrant ProcessUtility_hook calls for ALTER TABLE etc.
Handling TRUNCATE's pseudo-DDL pseudo-DML halfway state, doing something sensible for truncate cascade.
Probably more I've forgotten

If we don't handle these, then any logical change-log archives will become largely useless as soon as there's any schema change.

So we kind of have to solve DDL replication first IMO.

Some consideration is also required for metadata management. Right now relation and type metadata has session-lifetime, but you'd want to be able to discard old logical change-stream archives and have the later ones still be usable. So we'd need to define some kind of restartpoint where we repeat the metadata, or we'd have to support externalizing the metadata so it can be retained when the main change archives get aged out.

We'd also need to separate the existing apply worker into a "receiver" and "apply/writer" part, so the wire-protocol handling isn't tightly coupled with the actual change apply code, in order to make it possible to actually consume those archives and apply them to the database. In pglogical3 we did that by splitting them into two processes, connected by a shm_mq. Originally the process split was optional and you could run a combined receiver/writer process without the shm_mq if you wanted, but we quickly found it difficult to reliably handle locking issues etc that way so the writers all moved out-of-process.

That was done mainly to make it possible to support parallelism in logical decoding apply. But we also have the intention of supporting an alternative reader process that can ingest "logical archives" and send them to the writer to apply them, as if they'd been received from the on-wire stream. That's not implemented at this time though. It'd be useful for a number of things:

* PITR-style logical replay and recovery
* Ability to pre-decode a txn once on the upstream then send the buffered protocol-stream to multiple subscribers, saving on logical decoding and reorder buffering overheads and write-multiplication costs
* ability to ingest change-streams generated by non-postgres sources so we could support streaming foreign-data ingestion, streaming OLAP and data warehousing, etc

To make logical PITR more useful we'd also want to be a bit more tolerant of schema divergence, though that's not overly hard to do:
fill defaults for downstream columns if no value is present for the column in the upstream row and the downstream column is nullable or has a default (I think built-in logical rep does this one already)
ignore values for columns in upstream data if the downstream table lacks the column and the upstream value is null
optionally allow apply to be configured to ignore non-null data in upstream columns that're missing on downstream
optionally allow apply to be configured to drop rows on the floor if the downstream table is missing
policies for handling data conflicts like duplicate PKs
and we'd probably want ways to filter the apply data-stream to apply changes for only a subset of tables, rows, etc at least in a later version.

None of this is insurmountable. Most or all of the DDL replication support and divergence-tolerance stuff is already done in production deployments using pglogical3 and bdr3.

While I can't share the code, I am happy to share the experience I have gained from my part in working on these things. As you've probably recently seen with the wiki article I wrote on physical/logical failover interop.

You're free to take information like this and use it in wiki articles too.

Right now I won't be able to launch into writing big patches for these things, but I'll do my best to share what I can and review things.

> This seems like a wild idea. But really, we have a super expensive NVMe drives for OLTP workload. And use this devices to store buffer for data to be dumped into MapReduce\YT analytical system.

It's not a wild idea at all, as noted above.

In pglogical3 we already support streaming decoded WAL data to alternative writer downstreams including RabbitMQ and Kafka via writer plugins.

Re: Logical archiving

From

Amit Kapila

Date:

07 December 2020, 09:00:17

On Mon, Dec 7, 2020 at 8:35 AM Craig Ringer
<craig.ringer@enterprisedb.com> wrote:
>
> Reply follows inline. I addressed your last point first, so it's out of order.
>
> On Fri, 4 Dec 2020 at 15:33, Andrey Borodin <x4mmm@yandex-team.ru> wrote
>
> We'd also need to separate the existing apply worker into a "receiver" and "apply/writer" part, so the wire-protocol
handlingisn't tightly coupled with the actual change apply code, in order to make it possible to actually consume those
archivesand apply them to the database. In pglogical3 we did that by splitting them into two processes, connected by a
shm_mq.Originally the process split was optional and you could run a combined receiver/writer process without the
shm_mqif you wanted, but we quickly found it difficult to reliably handle locking issues etc that way so the writers
allmoved out-of-process. 
>
> That was done mainly to make it possible to support parallelism in logical decoding apply. But we also have the
intentionof supporting an alternative reader process that can ingest "logical archives" and send them to the writer to
applythem, as if they'd been received from the on-wire stream. That's not implemented at this time though. It'd be
usefulfor a number of things: 
>
> * PITR-style logical replay and recovery
> * Ability to pre-decode a txn once on the upstream then send the buffered protocol-stream to multiple subscribers,
savingon logical decoding and reorder buffering overheads and write-multiplication costs 
>

I think doing parallel apply and ability to decode a txn once are
really good improvements independent of all the work you listed.
Thanks for sharing your knowledge.

--
With Regards,
Amit Kapila.

Re: Logical archiving

From

Andrey Borodin

Date:

08 December 2020, 09:54:55

Thanks Craig!
Probably, I should better ask in your nearby thread about logical replication, it just seemed to me that logical
archivingis somewhat small independent piece of functionality... 

> 7 дек. 2020 г., в 08:05, Craig Ringer <craig.ringer@enterprisedb.com> написал(а):
>
> Reply follows inline. I addressed your last point first, so it's out of order.
>
> On Fri, 4 Dec 2020 at 15:33, Andrey Borodin <x4mmm@yandex-team.ru> wrote
>
> > If OLAP cannot consume data fast enough - we are out of space due to repl slot.
>
> There is a much simpler solution to this than logical PITR.
>
> What we should be doing is teaching xlogreader how to invoke the restore_command to fetch archived WALs for decoding.

> Replication slots already have a WAL retention limit, but right now when that limit is reached the slot is
invalidatedand becomes useless, it's effectively dropped. Instead, if WAL archiving is enabled, we should leave the
slotas valid. If a consumer of the slot needs WAL that no longer exists in pg_wal, we should have the walsender invoke
therestore_command to read the missing WAL segment, decode it, and remove it again. 
>
> This would not be a technically difficult patch, and it's IMO one of the more important ones for improving logical
replication.
Currently we have restore_command in regular config, not in recovery.conf, so, probably, it should not be a very big
dealto implement this. 
>
> > I was discussing problems of CDC with scientific community and they asked this simple question: "So you have
efficientWAL archive on a very cheap storage, why don't you have a logical archive too?" 
>
> I've done work in this area, as has Petr (CC'd).
>
> In short, logical archiving and PITR is very much desirable, but we're not nearly ready for it yet and we're missing
alot of the foundations needed to make it really useful. 
>
> IMO the strongest pre-requisite is that we need integrated DDL capture and replication in Pg. While this could be
implementedin the publisher/subscriber logic for logical replication, it would make much more sense (IMO) to make it
easierto feed DDL events into any logical replication output plugin. 
>
> pglogical3 (the closed one) has quite comprehensive DDL replication support. Doing it is not simple though - there
areplenty of complexities: 
>
> * Reliably identifying the target objects and mapping them to replication set memberships for DML-replication
> * Capturing, replicating and managing the search_path and other DDL execution context (DateStyle and much more)
reliably
>     • Each statement type needs specific logic to indicate whether it needs DDL replication (and often filter
functionssince we have lots of sub-types where some need replication and some don't) 
>     • Handling DDL affecting global objects in pg_global correctly, like those affecting roles, grants, database
securitylabels etc. There's no one right answer for this, it depends on the deployment and requires the user to
cooperate.
>     • Correct handling of transactions that mix DDL and DML (mostly only an issue for multimaster).
>     • Identifying statements that target a mix of replicated and non-replicated objects and handling them
appropriately,including for CASCADEs 
>     • Gracefully handling DDL statements that mix TEMPORARY and persistent targets. We can do this ok for DROPs but
itstill requires care. Anything else gets messier. 
>     • Lack of hooks into table rewrite operations and the extremely clumsy and inefficient way logical decoding
currentlyexposes decoding of the temp-table data during decoding of rewrites means handling table-rewriting DDL is
difficultand impractical to do correctly. In pglogical we punt on it entirely and refuse to permit DDL that would
rewritea table except where we can prove it's reliant only on immutable inputs so we can discard the upstream rewrite
andrely on statement replication. 
>     • As a consequence of the above, reliably determining whether a given statement will cause a table rewrite.
>     • Handling re-entrant ProcessUtility_hook calls for ALTER TABLE etc.
>     • Handling TRUNCATE's pseudo-DDL pseudo-DML halfway state, doing something sensible for truncate cascade.
>     • Probably more I've forgotten
>
> If we don't handle these, then any logical change-log archives will become largely useless as soon as there's any
schemachange. 
>
> So we kind of have to solve DDL replication first IMO.
>
> Some consideration is also required for metadata management. Right now relation and type metadata has
session-lifetime,but you'd want to be able to discard old logical change-stream archives and have the later ones still
beusable. So we'd need to define some kind of restartpoint where we repeat the metadata, or we'd have to support
externalizingthe metadata so it can be retained when the main change archives get aged out. 
>
> We'd also need to separate the existing apply worker into a "receiver" and "apply/writer" part, so the wire-protocol
handlingisn't tightly coupled with the actual change apply code, in order to make it possible to actually consume those
archivesand apply them to the database. In pglogical3 we did that by splitting them into two processes, connected by a
shm_mq.Originally the process split was optional and you could run a combined receiver/writer process without the
shm_mqif you wanted, but we quickly found it difficult to reliably handle locking issues etc that way so the writers
allmoved out-of-process. 
>
> That was done mainly to make it possible to support parallelism in logical decoding apply. But we also have the
intentionof supporting an alternative reader process that can ingest "logical archives" and send them to the writer to
applythem, as if they'd been received from the on-wire stream. That's not implemented at this time though. It'd be
usefulfor a number of things: 
>
> * PITR-style logical replay and recovery
> * Ability to pre-decode a txn once on the upstream then send the buffered protocol-stream to multiple subscribers,
savingon logical decoding and reorder buffering overheads and write-multiplication costs 
> * ability to ingest change-streams generated by non-postgres sources so we could support streaming foreign-data
ingestion,streaming OLAP and data warehousing, etc 
>
> To make logical PITR more useful we'd also want to be a bit more tolerant of schema divergence, though that's not
overlyhard to do: 
>     • fill defaults for downstream columns if no value is present for the column in the upstream row and the
downstreamcolumn is nullable or has a default (I think built-in logical rep does this one already) 
>     • ignore values for columns in upstream data if the downstream table lacks the column and the upstream value is
null
>     • optionally allow apply to be configured to ignore non-null data in upstream columns that're missing on
downstream
>     • optionally allow apply to be configured to drop rows on the floor if the downstream table is missing
>     • policies for handling data conflicts like duplicate PKs
> and we'd probably want ways to filter the apply data-stream to apply changes for only a subset of tables, rows, etc
atleast in a later version. 
>
> None of this is insurmountable. Most or all of the DDL replication support and divergence-tolerance stuff is already
donein production deployments using pglogical3 and bdr3. 
I really like this wording for "divergence-tolerance" stuff, it captures problems I want to solve. I believe it's
somewhatorthogonal to other issues. 

>
> While I can't share the code, I am happy to share the experience I have gained from my part in working on these
things.As you've probably recently seen with the wiki article I wrote on physical/logical failover interop. 
>
> You're free to take information like this and use it in wiki articles too.
>
> Right now I won't be able to launch into writing big patches for these things, but I'll do my best to share what I
canand review things. 
>
> > This seems like a wild idea. But really, we have a super expensive NVMe drives for OLTP workload. And use this
devicesto store buffer for data to be dumped into MapReduce\YT analytical system. 
>
> It's not a wild idea at all, as noted above.
>
> In pglogical3 we already support streaming decoded WAL data to alternative writer downstreams including RabbitMQ and
Kafkavia writer plugins. 
Yes, Yandex.Cloud Transfer Manger supports it too. But it has to be resynced after physical failover. And internal
installationof YC have mandatory drills: few times in a month one datacenter is disconnected and failover happens for
thousandsa DBS. 

Thank you for your input. Probably, I'll put some efforts into loading missing WAL as a first step towards bright
future:) 

Best regards, Andrey Borodin.

Re: Logical archiving

From

Craig Ringer

Date:

09 December 2020, 01:30:39

On Tue, 8 Dec 2020 at 17:54, Andrey Borodin <x4mmm@yandex-team.ru> wrote:

> In pglogical3 we already support streaming decoded WAL data to alternative writer downstreams including RabbitMQ and Kafka via writer plugins.
Yes, Yandex.Cloud Transfer Manger supports it too. But it has to be resynced after physical failover. And internal installation of YC have mandatory drills: few times in a month one datacenter is disconnected and failover happens for thousands a DBS.

You'll want to look at https://wiki.postgresql.org/wiki/Logical_replication_and_physical_standby_failover#All-logical-replication_HA then.