Re: Hot standby and b-tree killed items - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Hot standby and b-tree killed items
Date
Msg-id 4958AA59.6050506@enterprisedb.com
In response to Re: Hot standby and b-tree killed items  ("marcin mank" <marcin.mank@gmail.com>)
List pgsql-hackers
marcin mank wrote:
>> Perhaps we should listen to the people that have said they don't want
>> queries cancelled, even if the alternative is inconsistent answers.

I don't like that much. PostgreSQL has traditionally tried very hard to 
avoid that. It's hard to tell what kind of inconsistencies you'd get, as it'd 
depend on what plan is created, when a vacuum happens to run on master etc.

> I think an alternative to that would be "if the wal backlog is too
> big, let current queries finish and let incoming queries wait till the
> backlog gets smaller".

Yeah, that makes sense too.

Many approaches have been proposed, and they all have different 
tradeoffs and therefore fit different use cases. I'm not sure which ones 
are/will be included in the patch. We don't need them all in 8.4; the one 
or two simplest ones will do just fine, and we can extend later.

Let me summarize. Whenever a WAL record conflicts with a 
query-in-progress, we can:

1. kill the query,
2. wait for the query to finish, or
3. let the query proceed, producing invalid results.

There are some combinations of those as well. Your proposal is a 
variation of 2 that avoids the problem of WAL application falling behind 
indefinitely. There's also the max_standby_delay option in the patch, to 
wait a while, and then kill the query.
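
To make the "wait a while, then kill" combination concrete, here is a minimal 
sketch in Python. It is not the patch's code: the function name, the Action 
enum and the use of seconds are assumptions made for illustration. It only 
shows how a single delay setting can cover option 1 (delay of zero), option 2 
(infinite delay) and everything in between.

```python
from enum import Enum


class Action(Enum):
    WAIT = "wait"      # WAL replay pauses and lets the query keep running
    CANCEL = "cancel"  # WAL replay kills the conflicting query


def resolve_conflict(waited_seconds: float, max_standby_delay: float) -> Action:
    """Decide what replay does with a query that conflicts with a WAL record.

    Hypothetical helper, not from the patch. Assumed semantics:
      max_standby_delay == 0            -> always cancel at once (option 1)
      max_standby_delay == float("inf") -> always wait (option 2)
      anything in between               -> wait that long, then cancel
    """
    if waited_seconds < max_standby_delay:
        return Action.WAIT
    return Action.CANCEL
```

For example, resolve_conflict(5, 30) asks replay to keep waiting, while 
resolve_conflict(30, 30) cancels the query.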

There are some additional optimizations that can be made to make those 
options less painful. Instead of killing all queries that might be 
affected by a vacuum record, only kill them when they actually hit a 
block that was vacuumed (Simon's idea of a latestRemovedLSN field in the 
page header).
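
A rough sketch of how such a deferred kill could work, again in Python. This 
is one possible reading of the idea, not Simon's design: the Page and Query 
classes, the conflict_lsn bookkeeping and the exact comparison are all 
assumptions. Replay stamps each cleaned page with the LSN of the cleanup 
record and remembers, per query, the first record that conflicted with it; 
the query is cancelled only if it later reads a page cleaned at or after that 
point.

```python
from dataclasses import dataclass
from typing import List, Optional


class QueryCancelled(Exception):
    """Raised when a query touches a block whose cleanup conflicts with it."""


@dataclass
class Page:
    # Hypothetical page-header field: LSN of the latest cleanup record
    # that removed tuples from this page (0 = never cleaned).
    latest_removed_lsn: int = 0


@dataclass
class Query:
    # LSN of the first cleanup record that conflicted with this query's
    # snapshot; None while nothing has conflicted yet. (Assumed bookkeeping.)
    conflict_lsn: Optional[int] = None


def replay_cleanup(page: Page, record_lsn: int, conflicting: List[Query]) -> None:
    """Apply a cleanup record without killing anything up front."""
    page.latest_removed_lsn = max(page.latest_removed_lsn, record_lsn)
    for query in conflicting:
        if query.conflict_lsn is None:
            query.conflict_lsn = record_lsn


def read_page(query: Query, page: Page) -> None:
    """Cancel the query only if this page was cleaned at or after the
    first record that conflicted with it."""
    if query.conflict_lsn is not None and page.latest_removed_lsn >= query.conflict_lsn:
        raise QueryCancelled("page cleaned at LSN %d" % page.latest_removed_lsn)
```

A query that never visits a vacuumed block runs to completion under this 
scheme, even though a conflicting record was replayed while it was running.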

Another line of attack is to avoid getting into the situation in the 
first place, by affecting behavior on the master. If the standby has an 
online connection to the master (per the synch rep patch), it can tell 
master what the slave's OldestXmin is, and master can take that into 
account and not remove tuples still needed by the slave. That's not good 
from a high availability point of view: you don't want a hung query in the 
slave to cause a long-running-transaction situation in the master. But 
for other use cases it would be fine. Or we can just add a constant # of 
transactions to OldestXmin in master, to get some breathing room in the 
server.
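
Both master-side variants boil down to how the master computes its cleanup 
horizon. A minimal Python sketch follows, with hypothetical names and 
parameters. It reads "add a constant # of transactions" as adding slack, that 
is, keeping tuples around for that many more transactions, which moves the 
horizon back. XID wraparound is ignored to keep the arithmetic plain.

```python
from typing import Optional


def effective_oldest_xmin(master_oldest_xmin: int,
                          standby_oldest_xmin: Optional[int] = None,
                          defer_xacts: int = 0) -> int:
    """Cleanup horizon the master would use when removing dead tuples.

    Hypothetical helper for illustration only:
      standby_oldest_xmin -- OldestXmin reported by a connected standby,
                             or None if there is no feedback
      defer_xacts         -- constant number of transactions of slack
    XIDs are treated as plain integers; wraparound is not handled.
    """
    horizon = master_oldest_xmin
    if standby_oldest_xmin is not None:
        # Never remove tuples that a standby query might still need.
        horizon = min(horizon, standby_oldest_xmin)
    # Hold cleanup back by a fixed number of transactions.
    return horizon - defer_xacts
```

With feedback, effective_oldest_xmin(1000, 900) gives 900, so a hung standby 
query pins the master's horizon; with only a constant, 
effective_oldest_xmin(1000, defer_xacts=50) gives 950 regardless of what the 
standby is doing.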

The bottom line is that we have enough options to make everyone happy. 
Some understanding of the issue is required to tune it properly, 
however, so documentation is important.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

