Thread: Using PostgreSQL for service discovery and health-check
Hi. I'm requesting advice on something I'm about to start on.
In a normally 2-tier system, where "desktop" native apps connect directly to PostgreSQL to access data, some things must be mediated via a middle-tier service (i.e. a 3-tier architecture). That service is HTTP-based, thus clients (the "desktop" native apps) must know the URL (host+port+path) of a server providing the requested service.
Since clients must already have access to PostgreSQL to operate, I want to add a table in PostgreSQL for services. On startup, server(s) (probably using a random port) would register in that table, and deregister on shutdown. Also, since crashes are a fact of life, each server would regularly update the DB with a "heartbeat" at some frequency (e.g. 1s?), so clients would select (or see) only registered services with a "fresh enough" heartbeat timestamp.
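For concreteness, here is a minimal sketch of what I have in mind; all table and column names are illustrative, not a fixed design:

```sql
-- Hypothetical registry table (gen_random_uuid() is core in PG 13+).
CREATE TABLE service_registry (
    id         uuid PRIMARY KEY DEFAULT gen_random_uuid(),
    service    text NOT NULL,        -- logical service name
    host       text NOT NULL,
    port       integer NOT NULL,     -- host-assigned random port
    path       text NOT NULL DEFAULT '/',
    heartbeat  timestamptz NOT NULL DEFAULT now()
);

-- On server startup:
INSERT INTO service_registry (service, host, port, path)
VALUES ('doc-render', 'app01.example.com', 49152, '/api/v1');

-- Periodic heartbeat (e.g. every second):
UPDATE service_registry SET heartbeat = now() WHERE id = $1;

-- On clean shutdown:
DELETE FROM service_registry WHERE id = $1;

-- Client-side lookup: only "fresh enough" entries.
SELECT host, port, path
FROM service_registry
WHERE service = 'doc-render'
  AND heartbeat > now() - interval '5 seconds';
```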
That's basically my plan. Now come the questions:
1) will updating a row every second (for example) create issues?
2) if yes to #1 above, what would be good mitigation tactics? Use a different table for service vs heartbeat? A special kind of table? Or one configured in a particular way?
3) if a service crashes, it won't remove its row(s), obviously. What kind of mechanism exists to "reap" "zombie" services?
4) Related to #3 above, I think built-in "cron"-like services are only available via extensions, not in PostgreSQL proper. Why? Seems like such an essential service.
5) Which cron-like extension to use? Especially since we run both on-prem but also in managed-PostgreSQL on the cloud?
I'd appreciate community input. Thanks, --DD
PS: Note that there could be more than 1 server registered, for the same service, possibly on the same machine, for redundancy. But I think that's mostly orthogonal to my questions above.
On Thu, Feb 9, 2023 at 8:30 AM Dominique Devienne <ddevienne@gmail.com> wrote:
> That's basically my plan. Now come the questions:
> 1) will updating a row every second (for example) create issues?
> 2) if yes to #1 above, what would be good mitigation tactics? Use a different table for service vs heartbeat? Special kind of table? or configured in a particular way?
> 3) if a service crashes, it won't remove its row(s), obviously. What kind of mechanism exists to "reap" "zombie" services?
> 4) Related to #3 above, I think built-in "cron"-like services are only available via extensions, not in PostgreSQL proper. Why? Seems like such an essential service.
> 5) Which cron-like extension to use? Especially since we run both on-prem but also in managed-PostgreSQL on the cloud?
You can probably get good mileage from CREATE UNLOGGED TABLE.
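For reference, that's just a keyword on the table definition. An unlogged table skips WAL, which makes frequent heartbeat updates much cheaper, at the cost of being truncated on crash recovery and not being replicated:

```sql
-- Unlogged: no WAL, so cheap frequent updates;
-- truncated after a crash, not replicated to standbys.
CREATE UNLOGGED TABLE service_heartbeat (
    service_id uuid PRIMARY KEY,
    heartbeat  timestamptz NOT NULL DEFAULT now()
);
```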
I don't have any particular suggestion for PostgreSQL cron extensions, but I'm fully on board with the current division of responsibilities here. Such a user-space application isn't something the core developers should be worrying about, nor does such a project really want to be tied to the release cadence the server is restricted to.
David J.
On Thu, Feb 9, 2023 at 4:46 PM David G. Johnston <david.g.johnston@gmail.com> wrote:
> On Thu, Feb 9, 2023 at 8:30 AM Dominique Devienne <ddevienne@gmail.com> wrote:
>> That's basically my plan. Now come the questions: [...]
> You can probably get good mileage from CREATE UNLOGGED TABLE.
Thanks. Although I guess the fact it's not replicated to standby servers could be a problem?
That's not something we test now, and it's also something the on-prem DBA (i.e. a client of ours)
might want to set up on his/her own, on top of our client/server architecture, I guess.
I have no experience with standbys (replication) and HA in PostgreSQL.
Would having the main service table be a regular one, and the service_heartbeat an unlogged one, be replication friendly?
I.e. if failover to the standby happens, the service table is still there and populated, while the service_heartbeat is empty,
but then the services would start populating it "transparently", no? I.e. would using 2 tables instead of 1 be a better design?
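A sketch of that split, with illustrative names: the durable, replicated table holds the registration, the unlogged one holds only the churn-heavy timestamp, and the heartbeat is an upsert so a surviving server repopulates the empty table after failover without any special handling:

```sql
-- Durable, replicated registration.
CREATE TABLE service (
    id    uuid PRIMARY KEY DEFAULT gen_random_uuid(),
    name  text NOT NULL,
    host  text NOT NULL,
    port  integer NOT NULL
);

-- Unlogged, update-heavy heartbeat; empty after crash/failover.
-- (An unlogged table may reference a permanent one, not vice versa.)
CREATE UNLOGGED TABLE service_heartbeat (
    service_id uuid PRIMARY KEY REFERENCES service(id) ON DELETE CASCADE,
    heartbeat  timestamptz NOT NULL DEFAULT now()
);

-- Heartbeat as an upsert: re-creates the row "transparently"
-- if failover (or a crash) emptied the unlogged table.
INSERT INTO service_heartbeat (service_id, heartbeat)
VALUES ($1, now())
ON CONFLICT (service_id) DO UPDATE SET heartbeat = now();
```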
On 2/9/23 07:30, Dominique Devienne wrote:
> In a normally 2-tier system, where "desktop" native apps connect
> directly to PostgreSQL to access data, some things must be mediated via
> a middle-tier service (i.e. 3-tier architecture). That service is HTTP
> based thus clients (the "desktop" native apps) must know the URL
> (host+port+path) of a server providing the requested service.

Is there more than one server providing the same service?

> Since clients must already have access to PostgreSQL to operate, I
> wanted to add a table in PostgreSQL for services, and server(s) on
> startup (probably using a random port) would register in that table, and
> deregister on shutdown. Also, since crashes are a fact of life, the
> server would regularly update the DB with a "heartbeat" on some
> frequency (e.g. 1s?), so clients would select (or see) only registered
> services with a "fresh enough" heartbeat timestamp.

Would it not be easier to not have random ports, and just attempt connections to the servers either:
1) In the client, with a reattempt on a different port on failure.
2) From the Postgres server, updating the table to have the currently-up servers.

> That's basically my plan. Now come the questions: [...]

--
Adrian Klaver
adrian.klaver@aklaver.com
On Thu, Feb 9, 2023 at 5:05 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> On 2/9/23 07:30, Dominique Devienne wrote:
>> In a normally 2-tier system, where "desktop" native apps connect
>> directly to PostgreSQL to access data, some things must be mediated via
>> a middle-tier service (i.e. 3-tier architecture). That service is HTTP
>> based thus clients (the "desktop" native apps) must know the URL
>> (host+port+path) of a server providing the requested service.
> Is there more than one server providing the same service?
Yes, that was basically my PS. The client can connect to any one, randomly.
We need at least one of course, but there could be more than 1, yes.
>> Since clients must already have access to PostgreSQL to operate, I
>> wanted to add a table in PostgreSQL for services, and server(s) on
>> startup (probably using a random port) would register in that table, and
>> deregister on shutdown. Also, since crashes are a fact of life, the
>> server would regularly update the DB with a "heartbeat" on some
>> frequency (e.g. 1s?), so clients would select (or see) only registered
>> services with a "fresh enough" heartbeat timestamp.
> Would it not be easier to not have random ports, and just attempt
> connections to the servers either:
> 1) In the client, with a reattempt on a different port on failure.
> 2) From the Postgres server, updating the table to have the currently-up servers.
I'm sorry, but I'm not following. Can you perhaps rephrase?
Regarding ports: once you have registration of services, it just seems easier to me to NOT have a fixed port,
and to let the host assign any port to the HTTP server. Those servers are not user-facing directly; from the client
side, it calls an API, and the lookup of the service and the connection to the HTTP server are an implementation
detail, so the port used doesn't matter. In-DB registration of (HTTP) servers makes the whole URL an implementation detail.
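To make that lookup concrete, the client-side API could pick any fresh server at random along these lines (a single illustrative registry table with a heartbeat column; the 5-second freshness window is an assumption, to be tuned against the heartbeat frequency):

```sql
-- Pick one random, recently-heartbeating server for a given service.
SELECT host, port, path
FROM service_registry
WHERE service = $1
  AND heartbeat > now() - interval '5 seconds'
ORDER BY random()
LIMIT 1;
```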
On 2/9/23 08:16, Dominique Devienne wrote:
> Regarding ports, once you have registration of services, just seems
> easier to me to NOT have a fixed port, and let the host assign any port
> to the HTTP server. [...] In-DB registration of (HTTP) servers makes
> the whole URL an implementation detail.

The flip side of that is that with known ports it would be easier to have a process, on the Postgres machine or in the database, that checks the ports on a regular basis, and as part of that process marks any non-responding ports as inactive. That would solve the zombie problem.

--
Adrian Klaver
adrian.klaver@aklaver.com
On Thu, Feb 9, 2023 at 5:51 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> On 2/9/23 08:16, Dominique Devienne wrote:
> The flip side of that is that with known ports it would be easier to
> have a process on the Postgres machine or in the database that checks
> the ports on a regular basis, and as part of that process marks any
> non-responding ports as inactive. That would solve the zombie problem.
That's one possibility. But the "reaper" process could just as well scan the service table,
and probe those too. So again, I'm not sure what the fixed-port approach gains me, besides
perhaps the reaper not having to connect to PostgreSQL itself. I'm OK with connecting.
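Whatever schedules that reaper (OS cron, a cron-like extension, or the process itself on a timer), the reaping step reduces to one statement; the table/column names are illustrative and the staleness window is a tuning knob that should be several multiples of the heartbeat interval:

```sql
-- Reap zombies: registrations whose heartbeat has gone stale.
DELETE FROM service_registry
WHERE heartbeat < now() - interval '30 seconds';
```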
Thanks for your input. Always good to have one's arguments challenged by experts.
On 2/9/23 09:40, Dominique Devienne wrote:
> That's one possibility. But the "reaper" process could just as well
> scan the service table, and probe those too. So again, I'm not sure
> what the fixed-port approach gains me, besides perhaps the reaper not
> having to connect to PostgreSQL itself. I'm OK with connecting.

What is the reaper process?

--
Adrian Klaver
adrian.klaver@aklaver.com
On 2/9/23 09:40, Dominique Devienne wrote:
> So again, I'm not sure what the fixed-port approach gains me, besides
> perhaps the reaper not having to connect to PostgreSQL itself. [...]

As to fixed ports and pulling vs. services pushing, there is a security side. Not sure who controls the external services, but there is the chance that someone, knowing they exist, could inject their own version of a service/server. Random ports make that easier, as you would not know what is canonical. With the pull process you have a (presumably) verified list of servers and the ports they listen on.

--
Adrian Klaver
adrian.klaver@aklaver.com