Home > mailing lists

Streaming Replication clusters and load balancing - Mailing list pgsql-hackers

From	James Sewell
Subject	Streaming Replication clusters and load balancing
Date	September 21, 2015 17:14:07
Msg-id	CANkGpBu1hTEijzYRTo-RG02k=wNPJi7bvT2kyR_YxR3jFoaMDg@mail.gmail.com Whole thread
Responses	Re: Streaming Replication clusters and load balancing
List	pgsql-hackers

Tree view

Hello all,

I have recently been working with PostgreSQL and HAProxy to provide seamless load balancing to a group of database servers. This on it's own isn't a hard thing: I have an implementation finished and am now thinking about the best way to bring it to a production ready state which could be used by others, and used in load-balancers other than HAProxy with minimal config changes.

My initial requirements were:

Given a group of PostgreSQL servers check each x seconds and:

Allow read/write access only to the master server (via IPA / portA)
- Disallow access if there are multiple master servers

Allow read access to all servers (via IPB / portB) as long as the following holds:
- They are attached to the current master server via streaming replication (or they are the current master server)
- They can currently contact the master server (safest option, disallow all access when master-less)
- They are in the same timeline as the master server (do I need this check?)
- The master server reports that they have less than x bytes lag

HAProxy can talk to PostgreSQL for a health check via TCP or PSQL (connection check only). Neither of these allow the logic above - therefore this logic has to be hosted outside of HAProxy. This might change in the future if HAProxy gets the ability to send SQL statements (like an F5 can).

Today the best way to provide this information to HAProxy (and many other load balancers, application frameworks, proxies, monitoring solutions) is via HTTP, where HTTP 200 is pass the check and HTTP 503 is fail the check (and don't load balance to this node). In my case I have a script which accepts HTTP requests to /read to check if this node is available for read only and /write which checks if this node is available for read/writes.

The options as I see them are:

Implement a script / small app which connects to PostgreSQL and executes these checks
- Implemented and proven today at many independent sites
- Should it run on HAProxy server or PSQL server?
- Integrated HTTP server or x.inetd script?
- Platform independence?
- What if it dies?

Implement a custom PostgreSQL BGworker which provides this information over HTTP
- No outside of PostgreSQL config needed
- No reliance on another daemon / interface being up
- libmicrohttpd or similar should help with platform independence
- Security / acceptance by community?
- Only newer versions of PostgreSQL

Spend the time working on getting SQL checks into HAProxy
- What about other platforms which only support HTTP?

I think all of the options would benefit from a PSQL extension which does the following:

Encapsulates the check logic (easier to upgrade, manipulate)
Stores historic check data for a number of hours / days / months
Stores defaults (override via HTTP could be possible for things like lag)

Does anyone else have any thoughts on this topic?

Eventually many cool features could flow out of this kind of work:

Integration with High Availability products - I have this working with EnterpriseDB EFM now.
- Locate the current master using the HA product
  - more than one master doesn't cause loss of service as long as HA state is sane
- Locate all clustered standby servers using the HA product
  - if a standby is removed from the HA cluster, it is removed from load balancing
  - if a standby is not part of the cluster, it is removed from load balancing (even if it is part of streaming replication)

HTTP replication status requests which facilitate dynamically managing HAProxy (or other) PostgreSQL server pools
- Add a node to streaming replication, it automatically shows up in the pool and starts being checked to see if it can service reads
- Great for cloud scale out

Allocation of additional load balancer groups based on some criteria (?), for example
- read/write (as above)
- read only (as above)
- data warehouse (reporting reads only)
- DR (replica with no reads or writes - until it becomes a master)

Keen to hear comments.

Cheers,

James Sewell,
Solutions Architect
______________________________________

Level 2, 50 Queen St, Melbourne VIC 3000

P (+61) 3 8370 8000 W www.lisasoft.com F (+61) 3 8370 8099

The contents of this email are confidential and may be subject to legal or professional privilege and copyright. No representation is made that this email is free of viruses or other defects. If you have received this communication in error, you may not copy or distribute any part of it or otherwise disclose its contents to anyone. Please advise the sender of your incorrect receipt of this correspondence.

pgsql-hackers by date:

From: Jeevan Chalke
Date: 21 September 2015, 16:37:46
Subject: TEXT vs VARCHAR join qual push down diffrence, bug or expected?

From: Marcin Mańk
Date: 21 September 2015, 17:15:34
Subject: Re: [patch] Proposal for \rotate in psql

Streaming Replication clusters and load balancing - Mailing list pgsql-hackers

Previous

Next