Logical replication for async service communication? - Mailing list pgsql-general

From	Sean Huber
Subject	Logical replication for async service communication?
Date	May 15, 2020 07:33:32
Msg-id	CAM8f5Mi1Ftj+48PZxN1AbM-P=4YMLENY5zRaPwTbmbkFwCsTkA@mail.gmail.com
List	pgsql-general

Has anyone attempted to use logical replication with table partitioning for async service communication?

Proof of concept: https://gist.github.com/shuber/8e53d42d0de40e90edaf4fb182b59dfc

Each service would commit messages to its own database along with the rest of its data, with the same transactional guarantees. Those messages are then replicated in near real time (with all of logical replication's features and guarantees) to the receiving service's database, where its workers (e.g. que-rb, SKIP LOCKED polling, etc.) are waiting. The workers respond by inserting messages into their own database, and those are replicated back.
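To make the flow above concrete, here is a rough sketch of the setup statements each side might run. It is only an illustration under stated assumptions: the `messages` table, its columns, the one-partition-per-sender layout, and every publication and subscription name are hypothetical and not taken from the gist. The Python helpers only build SQL strings and never connect to a database.

```python
# Sketch only: all table, publication, and subscription names below are
# hypothetical and are not taken from the linked gist.

# Shared parent table, list-partitioned by the sending service (assumed layout).
MESSAGES_TABLE = """
CREATE TABLE IF NOT EXISTS messages (
    sender     text        NOT NULL,
    id         bigserial,
    type       text        NOT NULL,
    payload    jsonb       NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now(),
    PRIMARY KEY (sender, id)
) PARTITION BY LIST (sender)
""".strip()


def partition_sql(sender: str) -> str:
    """Leaf partition that holds the messages written by one sender."""
    return (
        f"CREATE TABLE IF NOT EXISTS messages_{sender} "
        f"PARTITION OF messages FOR VALUES IN ('{sender}')"
    )


def publisher_sql(service: str) -> list[str]:
    """Statements for the sending service's database."""
    return [
        MESSAGES_TABLE,
        partition_sql(service),
        # Publish inserts only, so cleanup on the sender is not replicated.
        f"CREATE PUBLICATION {service}_outgoing "
        f"FOR TABLE messages_{service} WITH (publish = 'insert')",
    ]


def subscriber_sql(receiver: str, sender: str, conninfo: str) -> list[str]:
    """Statements for the receiving service's database."""
    return [
        MESSAGES_TABLE,
        # The target table must exist under the same name as on the sender.
        partition_sql(sender),
        # The receiver is part of the name because the replication slot on the
        # sender defaults to the subscription name.
        f"CREATE SUBSCRIPTION {receiver}_from_{sender} "
        f"CONNECTION '{conninfo}' PUBLICATION {sender}_outgoing",
    ]


if __name__ == "__main__":
    statements = publisher_sql("billing") + subscriber_sql(
        "shipping", "billing", "host=billing-db dbname=billing"
    )
    for stmt in statements:
        print(stmt + ";\n")
```

Publishing the leaf partition rather than the partitioned parent is deliberate in this sketch, since older PostgreSQL releases only replicate between ordinary tables and leaf partitions.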

Throw in a trigger to automatically acknowledge, clean up, and notify on messages and I think we've got something that resembles a queue? That same trigger could also match incoming messages against a "routes" table (based on message type, certain JSON schemas in the payload, etc.) and write matches to the que-rb jobs table instead, for some kind of distributed/replicated work queue hybrid?
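The route matching could look roughly like the following in-memory model. It is a sketch of the matching logic only, not a trigger: the `Route` fields, the message shape, and the job class names are all hypothetical, and "certain JSON schemas" is reduced here to a check that required payload keys are present.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """One row of a hypothetical "routes" table."""
    message_type: str
    job_class: str
    required_keys: frozenset = frozenset()


def match_routes(message: dict, routes: list) -> list:
    """Return the job classes whose route matches the incoming message.

    A route matches when the message type is equal and every required
    key is present in the payload (a stand-in for real schema checks).
    """
    payload = message.get("payload", {})
    return [
        route.job_class
        for route in routes
        if route.message_type == message["type"]
        and route.required_keys.issubset(payload)
    ]


ROUTES = [
    Route("invoice.paid", "SendReceiptJob", frozenset({"invoice_id"})),
    Route("invoice.paid", "UpdateLedgerJob", frozenset({"invoice_id", "amount"})),
    Route("user.created", "WelcomeEmailJob"),
]

if __name__ == "__main__":
    incoming = {"type": "invoice.paid", "payload": {"invoice_id": 7}}
    # Only the first route matches; the second also requires "amount".
    print(match_routes(incoming, ROUTES))
```

If this logic ends up in a trigger on the receiving database, keep in mind that triggers fire for rows applied by logical replication only when they are enabled with `ALTER TABLE ... ENABLE REPLICA TRIGGER` or `ENABLE ALWAYS TRIGGER`.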

My motivations for this line of thinking were mostly based around high availability and isolating service downtime/failures from each other. Our PostgreSQL databases are the most critical pieces of infrastructure for all of our services - if a service's database is down then we don't want that service to even attempt to do work. On the other hand, we don't want a service's downtime to prevent it from receiving (queued) messages from other services, which it can resume consuming (once, in order) when it's back up.

We're exploring other message queues but keep getting drawn back to PostgreSQL because we can get the same transactional guarantees with our messages/jobs as the rest of our data. Even the act of enqueuing a job or sending a message to another service is something that must be committed and can be rolled back like everything else.

For our potential use case specifically, we're not dealing with high levels of realtime traffic etc - we're not even close to 1k jobs/messages per second.

I'm looking to poke holes in this concept before sinking any more time into exploring the idea. Any feedback/warnings/concerns would be much appreciated, thanks for your time!

Sean Huber
