Re: Changeset Extraction v7.1 - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Changeset Extraction v7.1
Date
Msg-id 20140124174916.GV7182@awork2.anarazel.de
Whole thread Raw
In response to Re: Changeset Extraction v7.1  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Changeset Extraction v7.1
List pgsql-hackers
On 2014-01-24 12:10:50 -0500, Robert Haas wrote:
> > Unfortunately not. Inside the walsender there's currently no
> > LWLockReleaseAll() for ERRORs since commands aren't run inside a
> > transaction command...
> >
> > But maybe I should have fixed this by adding the release to
> > WalSndErrorCleanup() instead? That'd still leave the problematic case
> > that currently we try to delete a replication slot inside a CATCH when
> > we fail while initializing the rest of logical replication... But I
> > guess adding it would be a good idea independent of that.
> 
> +1.  I think that if we can't rely on error handling to clean up the
> same things everywhere, it's gonna be a mess.  People won't be able to
> keep track of which error cleanup is engaged in which code paths, and
> screw-ups will result when old code paths are called from new call
> sites.

Ok, I'll additionally add it there.

> > We could also do a StartTransactionCommand() but I'd rather not, that
> > currently prevents code in that vicinity from doing anything it
> > shouldn't via various Assert()s in existing code.
> 
> Like what?  I mean, I'm OK with having a partial error-handling
> environment if that's all we need, but I think it's a bad plan to the
> extent that the code here needs to be aware of error-handling
> differences versus expectations elsewhere in our code.

Catalog lookups, building a snapshot, xid assignment, whatnot. All that
shouldn't happen in the locations creating/dropping a slot.
I think we should at some point separate parts of the error handling out
of xact.c. Currently its repeated slightly differently over logs of
places (check e.g. the callsites for LWLockReleaseAll), that's not
robust. But that's a project for another day.

Note that the actual decoding *does* happen inside a TransactionCommand,
as it'd be failure prone to copy all the cleanup logic. And we need to
have most of the normal cleanup code.

> I don't really see why it's simpler that way.  It's clearer what the
> point of the lock is if you only hold it for the operations that need
> to be protected by that lock.

> > In all other cases where we modify *_xmin we're only increasing it which
> > doesn't need a lock (HS feedback never has taken one, and
> > GetSnapshotData() modifies ->xmin while holding a shared lock), the only
> > potential danger is a slight delay in increasing the overall value.

> Right.  We might want to comment such places.

> > Don't we already have cases of that? I seem to remember so. If you
> > prefer having them there, I am certainly fine with doing that. This way
> > they aren't allocated if slots are disabled but it's just two
> > TransactionIds.
> 
> Let's go for it, unless we think of a reason not to.

ok on those counts.

> >> It's pretty evident that what's currently patch #4 (only peg the xmin
> >> horizon for catalog tables during logical decoding) needs to become
> >> patch #1, because it doesn't make any sense to apply this before we do
> >> that.
> >
> > Well, the slot code and the the slot support for streaming rep are
> > independent from and don't use it. So they easily can come before it.
> 
> But this code is riddled with places where you track a catalog xmin
> and a data xmin separately.  The only point of doing it that way is to
> support a division that hasn't been made yet.

If you think it will make stuff more manageable I can try separating all
lines dealing with catalog_xmin into another patch as long as data_xmin
doesn't have to be renamed.
That said, I don't really think it's a big problem that the division
hasn't been made, essentially the meaning is different, even if we don't
take advantage of it yet. data_xmin is there for streaming replication
where we need to prevent all removal, catalog_xmin is there for
changeset extraction.

> I have zero confidence that it's OK to treat fsync() as an operation
> that won't fail.  Linux documents EIO as a plausible error return, for
> example.  (And really, how could it not?)

But quite fundamentally having a the most basic persistency building
block fail is something you can't really handle safely. Note that
issue_xlog_fsync() has always (and I wager, will always) treated that as
a PANIC.
I don't recall many complaints about that for WAL. All of the ones I
found in a quick search were like "oh, the fs invalidated my fd because
of corruption". And few.

> >> Calling a slot "old" or "new" looks liable to cause problems.  Maybe
> >> change those names to contain a character not allowed in a slot name,
> >> if we're going to keep doing it that way.
> > I wondered about making them plain files as well but given the need for
> > a directory independent from this I don't really see the advantage,
> > we'll need to handle them anyway during cleanup.
> 
> Yeah, sure, but if it makes for fewer in-between states, it might be worth it.

I don't think it'd make anything simpler with the new version of the
code. Agreed?

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



pgsql-hackers by date:

Previous
From: Josh Berkus
Date:
Subject: Re: Why do we let autovacuum give up?
Next
From: Tom Lane
Date:
Subject: Re: Change authentication error message (patch)