Re: Hot standby, dropping a tablespace - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Hot standby, dropping a tablespace
Date
Msg-id 1232900135.2327.1459.camel@ebony.2ndQuadrant
In response to Re: Hot standby, dropping a tablespace  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: Hot standby, dropping a tablespace  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List pgsql-hackers
On Sun, 2009-01-25 at 11:28 +0200, Heikki Linnakangas wrote:

> >> You then call 
> >> ResolveRecoveryConflictWithVirtualXIDs to kill such transactions, and 
> >> try removing the directory again. But 
> >> ResolveRecoveryConflictWithVirtualXIDs doesn't wait for the target 
> >> transaction to die anymore (or at least it shouldn't, as we discussed 
> >> earlier), so that doesn't work AFAICS.
> > 
> > The FATAL errors inflicted should be fairly quick to take effect, so
> > waiting should not be a problem. We can make waiting for FATAL errors
> > the standard response.
> 
> The call in tablespace replay uses ERROR, not FATAL. You don't need to 
> kill the whole session, just the current transaction.

Yeh, yeh, it already does that, FATAL was just my typo. My bad.

> It seems more and more to me that the FATAL and ERROR cases are really 
> quite different. Looking at the callers, there's four different needs 
> for GetConflictingVirtualXIDs+ResolveRecoveryConflictWithVirtualXIDs:

I think the deferred case should be handled by a separate function,
since it needs further work and that work is best done in a function
of its own.

> 1. Kill all connections to a given database. Used when replaying DROP 
> DATABASE.
> 
> 2. Kill all connections by given user. Hmm, not used for anything, 
> actually. Should remove the roleId argument from GetConflictingVirtualXIDs.

No, because we still need to add code to kill connected users when we
drop a role.

> 3. Kill all transactions using given tablespace as temp tablespace, DROP 
> TABLESPACE.
> 
> 4. Mark all transactions that still see a given XID as running for 
> termination if they try to access a buffer with conflicting LSN (VACUUM, 
> btree-deletes).
> 
> All callers call GetConflictingVirtualXIDs first, and then 
> ResolveRecoveryConflictWithVirtualXIDs. That's a bit cumbersome; none of 
> the callers do anything else with the virtualxid array they get from 
> GetConflictingVirtualXIDs than pass it on to 
> ResolveRecoveryConflictWithVirtualXIDs. I'm thinking of an interface 
> consisting of three functions, replacing the current 
> GetConflictingVirtualXIDs and ResolveRecoveryConflictWithVirtualXIDs 
> functions:

I'm not sure I see any benefit in doing that.
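
For reference, the two-step calling pattern described in the quoted text
can be sketched roughly as below. This is a simplified, self-contained
illustration and not the actual patch code: the Sketch* types, the
sketch_* functions and SKETCH_MAX_BACKENDS are hypothetical stand-ins for
the real backend structures and for GetConflictingVirtualXIDs and
ResolveRecoveryConflictWithVirtualXIDs, whose real signatures are not
shown in this thread. It models only the ERROR-style case (cancel the
transaction, keep the session) and does not wait for the target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_BACKENDS 64   /* hypothetical limit, for the sketch only */

/* Hypothetical stand-in for a virtual transaction id. */
typedef struct
{
    int      backend_id;
    unsigned local_xid;
} SketchVXid;

/* Hypothetical stand-in for per-backend state. */
typedef struct
{
    SketchVXid vxid;
    unsigned   database_id;
    bool       in_transaction;
    bool       cancelled;
} SketchBackend;

/*
 * Step 1: collect the vxids of backends that have an open transaction in
 * the given database (0 means any database). Returns the number found.
 */
size_t
sketch_get_conflicting_vxids(const SketchBackend *backends, size_t n,
                             unsigned database_id, SketchVXid *out)
{
    size_t found = 0;

    assert(backends != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
    {
        if (!backends[i].in_transaction)
            continue;
        if (database_id != 0 && backends[i].database_id != database_id)
            continue;
        out[found++] = backends[i].vxid;
    }
    return found;
}

/*
 * Step 2: cancel the transaction of every listed vxid. Only the
 * transaction is ended, not the session. Returns the number cancelled.
 */
size_t
sketch_resolve_conflicts(SketchBackend *backends, size_t n,
                         const SketchVXid *vxids, size_t nvxids)
{
    size_t cancelled = 0;

    for (size_t v = 0; v < nvxids; v++)
    {
        for (size_t i = 0; i < n; i++)
        {
            if (backends[i].in_transaction &&
                backends[i].vxid.backend_id == vxids[v].backend_id &&
                backends[i].vxid.local_xid == vxids[v].local_xid)
            {
                backends[i].in_transaction = false;
                backends[i].cancelled = true;
                cancelled++;
            }
        }
    }
    return cancelled;
}

/*
 * What every caller does today: fetch the array, then pass it straight
 * on without using it for anything else.
 */
size_t
sketch_cancel_database_transactions(SketchBackend *backends, size_t n,
                                    unsigned database_id)
{
    SketchVXid vxids[SKETCH_MAX_BACKENDS];
    size_t     found;

    assert(n <= SKETCH_MAX_BACKENDS);
    found = sketch_get_conflicting_vxids(backends, n, database_id, vxids);
    return sketch_resolve_conflicts(backends, n, vxids, found);
}
```

The wrapper at the end shows why the intermediate array looks redundant
to callers; whether folding the two steps together is worth the churn is
the open question here.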

The two big changes we made earlier were worthwhile, but I'm concerned
about how much refactoring you want to do. It introduces new bugs each
time, and currently they take time to isolate and fix - much of that work
is not very visible, but it's a big effort. If I were happy that we
had a perfect working solution and we were not under time pressure then
I'd say go for it as much as you like. I'd prefer it if we could get
everything correct before we put all the code in the right cupboards. I
know tidy-up-as-you-go is a good policy, but I'd encourage you to do a
first pass looking for potential problems before we do that. Or maybe
we're there already, not sure.

-- Simon Riggs           www.2ndQuadrant.com
   PostgreSQL Training, Services and Support


