Re: Hot Standby query cancellation and Streaming Replication integration - Mailing list pgsql-hackers

From Greg Smith
Subject Re: Hot Standby query cancellation and Streaming Replication integration
Date
Msg-id 4B8886EE.8010407@2ndquadrant.com
In response to Re: Hot Standby query cancellation and Streaming Replication integration  (Bruce Momjian <bruce@momjian.us>)
Responses Re: Hot Standby query cancellation and Streaming Replication integration
List pgsql-hackers
Bruce Momjian wrote:
> Well, I think the choice is either you delay vacuum on the master for 8
> hours or pile up 8 hours of WAL files on the slave, and delay
> application, and make recovery much slower.  It is not clear to me which
> option a user would prefer because the bloat on the master might be
> permanent.

But if you're running the 8 hour report on the master right now, aren't 
you already exposed to a similar pile of bloat issues while it's going?  
If I have the choice between "sometimes queries will get canceled" vs. 
"sometimes the master will experience the same long-running transaction 
bloat issues as in earlier versions even if the query runs on the 
standby", I feel like leaning toward the latter at least leads to a 
problem people are used to. 

This falls into the principle of least astonishment category to me.  
Testing the final design for how transactions get canceled here led me 
to some really unexpected situations, and the downside for a mistake is 
"your query is lost".  Had I instead discovered that long-running 
transactions on the standby can sometimes ripple back to cause a 
maintenance slowdown on the master, that would not have been great.  But 
it would not have been so surprising, and it would not have resulted in 
lost query results. 

I think people will expect that their queries cancel because of things 
like DDL changes.  And the existing knobs allow inserting some slack for 
things like locks taking a little bit of time to acquire sometimes.  
What I don't think people will see coming is that a routine update on an 
unrelated table is going to kill a query whose result they might have 
been waiting hours for, just because that update crossed an autovacuum 
threshold for the other table and triggered a dead row cleanup.

-- 
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us


