Re: Proposal: "Causal reads" mode for load balancing reads without stale data - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Proposal: "Causal reads" mode for load balancing reads without stale data
Date
Msg-id CANP8+jKUhV9V7tqgQkBB+NkcJdu2-yD10FfXaTXUVxZd=e+PNw@mail.gmail.com
In response to Re: Proposal: "Causal reads" mode for load balancing reads without stale data  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses Re: Proposal: "Causal reads" mode for load balancing reads without stale data  (Thomas Munro <thomas.munro@enterprisedb.com>)
Re: Proposal: "Causal reads" mode for load balancing reads without stale data  (Thomas Munro <thomas.munro@enterprisedb.com>)
List pgsql-hackers
On 11 November 2015 at 09:22, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
 
> 1.  Reader waits with exposed LSNs, as Heikki suggests.  This is what BerkeleyDB does in "read-your-writes" mode.  It means that application developers have the responsibility for correctly identifying transactions with causal dependencies and dealing with LSNs (or whatever equivalent tokens), potentially even passing them to other processes where the transactions are causally dependent but run by multiple communicating clients (for example, communicating microservices).  This makes it difficult to retrofit load balancing to pre-existing applications and (like anything involving concurrency) difficult to reason about as applications grow in size and complexity.  It is efficient if done correctly, but it is a tax on application complexity.

Agreed. This works if you have a single transaction connected through a pool that does statement-level load balancing, so it works in both session and transaction mode.

I was in favour of a scheme like this myself, earlier, but have more thoughts now.

We must also consider the need for serialization across sessions or transactions.

In transaction pooling mode, an application could get assigned a different session, so a token would be much harder to pass around.
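
As an illustration of what option (1) asks of the application, here is a minimal sketch of token handling, not code from the patch. It assumes the writer captures an LSN after commit (for example from pg_current_xlog_location() on the primary) and hands that text to the reader, which polls the standby's replay position (for example pg_last_xlog_replay_location()). The names parse_lsn, wait_for_token and fetch_replay_lsn are hypothetical; fetch_replay_lsn stands in for whatever query the client would run.

```python
import time
from typing import Callable


def parse_lsn(text: str) -> int:
    """Convert a textual LSN such as '16/B374D848' to a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wait_for_token(fetch_replay_lsn: Callable[[], str], token: str,
                   timeout: float = 1.0, poll_interval: float = 0.01) -> bool:
    """Poll until the standby has replayed up to `token`, or give up.

    fetch_replay_lsn is a hypothetical callable returning the standby's
    current replay LSN as text. Returns True once the standby has caught
    up, False if the timeout expires first.
    """
    target = parse_lsn(token)
    deadline = time.monotonic() + timeout
    while True:
        if parse_lsn(fetch_replay_lsn()) >= target:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

The token is just a string, which is what makes it awkward under transaction pooling: every causally dependent client has to receive it and present it before reading.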

> 2.  Reader waits for a conservatively chosen LSN.  This is roughly what MySQL derivatives do in their "causal_reads = on" and "wsrep_sync_wait = 1" modes.  Read transactions would start off by finding the current end of WAL on the primary, since that must be later than any commit that already completed, and then waiting for that to apply locally.  That means every read transaction waits for a complete replication lag period, potentially unnecessarily.  This is a tax on readers with unnecessary waiting.

This tries to make it easier for users by forcing all users to experience a causality delay. Given that the whole purpose of multi-node load balancing is performance, referencing the master again simply defeats any performance gain, so you couldn't ever use it for all sessions. It could be a USERSET parameter, so it could be turned off in most cases that didn't need it.  But it's easier to use than (1).
 
Though this should be implemented in the pooler.
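
To make the cost of option (2) concrete, here is a rough sketch of what a pooler-side implementation might look like. It is an assumption-laden model, not an existing pooler feature: LSNs are plain integers, and primary_end_of_wal, standby_replayed and run_query are hypothetical callables standing in for the real round trips.

```python
import time
from typing import Callable


def conservative_read(primary_end_of_wal: Callable[[], int],
                      standby_replayed: Callable[[], int],
                      run_query: Callable[[], object],
                      timeout: float = 1.0,
                      poll_interval: float = 0.01) -> object:
    """Run a read on a standby only after it has replayed the primary's
    current end of WAL (hypothetical pooler logic)."""
    # Every read pays one round trip to the primary, needed or not.
    target = primary_end_of_wal()
    deadline = time.monotonic() + timeout
    # Then it waits out whatever replication lag exists right now.
    while standby_replayed() < target:
        if time.monotonic() >= deadline:
            raise TimeoutError("standby did not catch up in time")
        time.sleep(poll_interval)
    return run_query()
```

The first line of the function body is the objection: the primary is consulted on every read, even when the reader has no causal dependency on any recent write.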

> 3.  Writer waits, as proposed.  In this model, there is no tax on readers (they have zero overhead, aside from the added complexity of dealing with the possibility of transactions being rejected when a standby falls behind and is dropped from 'available' status; but database clients must already deal with certain types of rare rejected queries/failures such as deadlocks, serialization failures, server restarts etc).  This is a tax on writers.

This would seem to require that all readers must first check with the master as to which standbys are now considered available, so it looks like (2).

The alternative is that we simply send readers to any standby and allow the pool to work out separately whether the standby is still available, which mostly works, but it doesn't handle sporadic slowdowns on particular standbys very well (if at all).
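
The weakness of letting the pool track availability separately can be shown with a small model. This is only a sketch under assumed names: Pool, health_check, route and the max_lag threshold are hypothetical and do not correspond to any particular pooler.

```python
from dataclasses import dataclass, field


@dataclass
class Pool:
    """Hypothetical pool that routes reads using its last health snapshot."""
    max_lag: float                      # seconds of replay lag tolerated
    available: list = field(default_factory=list)

    def health_check(self, observed_lag: dict) -> None:
        # Refresh the snapshot: keep standbys whose lag is within bounds.
        self.available = sorted(
            name for name, lag in observed_lag.items() if lag <= self.max_lag
        )

    def route(self, key: int) -> str:
        # Routing consults only the snapshot, never the standby itself.
        if not self.available:
            raise RuntimeError("no standby available")
        return self.available[key % len(self.available)]
```

Between two health checks, a standby that has just slowed down is still in the snapshot, so reads keep being routed to it and can see stale data until the next check removes it.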

I think we need to look at whether this does actually give us anything, or whether we are missing the underlying Heisenberg reality.

More later.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
