Re: Hot Standby query cancellation and Streaming Replication integration - Mailing list pgsql-hackers

From Greg Smith
Subject Re: Hot Standby query cancellation and Streaming Replication integration
Date
Msg-id 4B885FC7.5050209@2ndquadrant.com
Whole thread Raw
In response to Re: Hot Standby query cancellation and Streaming Replication integration  (Bruce Momjian <bruce@momjian.us>)
Responses Re: Hot Standby query cancellation and Streaming Replication integration  (Bruce Momjian <bruce@momjian.us>)
Re: Hot Standby query cancellation and Streaming Replication integration  (Greg Stark <gsstark@mit.edu>)
List pgsql-hackers
Bruce Momjian wrote:
> Doesn't the system already adjust the delay based on the length of slave
> transactions, e.g. max_standby_delay.  It seems there is no need for a
> user switch --- just max_standby_delay really high.
>   

The first issue is that you're basically saying "I don't care about high 
availability anymore" when you increase max_standby_delay to a high 
value.  Want to offload an 8 hour long batch report every day to the 
standby?  You can do it with max_standby_delay=8 hours.  But the day 
your master crashes 7 hours into that, you're in for a long wait before 
your standby is available while it replays all the queued up segments.  
Your 'hot standby' has actually turned into the old form of 'cold 
standby' just when you need it to be responsive.

This is also the reason why the whole "pause recovery" idea is a 
fruitless path to wander down.  The whole point of this feature is that 
people have a secondary server available for high-availability, *first 
and foremost*, but they'd like it to do something more interesting that 
leave it idle all the time.  The idea that you can hold off on applying 
standby updates for long enough to run seriously long reports is 
completely at odds with the idea of high-availability.

The second major problem is that the day the report actually takes 8.1 
hours instead, because somebody else ran another report that slowed you 
down a little, you're screwed if that's something you depend on being 
available--it just got canceled only *after* wasting 8 hours of 
reporting resource time.

max_standby_delay is IMHO only useful for allowing non-real-time web-app 
style uses of HS (think "Facebook status updates"), where you say "I'm 
OK giving people slightly out of date info sometimes if it lets me split 
the query load over two systems".  Set max_standby_delay to a few 
seconds or maybe a minute, enough time to service a typical user query, 
make your app tolerate the occasional failed query and only send big 
ones to the master, and you've just scaled up all the small ones.  
Distributed queries with "eventual consistency" on all nodes is where 
many of the web app designs are going, and this feature is a reasonable 
match for that use case.

-- 
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us



pgsql-hackers by date:

Previous
From: Gokulakannan Somasundaram
Date:
Subject: Re: A thought on Index Organized Tables
Next
From: Tom Lane
Date:
Subject: Re: Avoiding bad prepared-statement plans.