Re: logical changeset generation v6.2 - Mailing list pgsql-hackers

From Robert Haas
Subject Re: logical changeset generation v6.2
Date
Msg-id CA+TgmoZtu0UcygHCg=+cz9T3g4TzmmYzN1LJr3TeKWWOMXxfMQ@mail.gmail.com
In response to Re: logical changeset generation v6.2  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: logical changeset generation v6.2  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On Fri, Oct 25, 2013 at 7:57 AM, Andres Freund <andres@2ndquadrant.com> wrote:
>> However, I'm leery about the idea of using a relation fork for this.
>> I'm not sure whether that's what you had in mind, but it gives me the
>> willies.  First, it adds distributed overhead to the system, as
>> previously discussed; and second, I think the accounting may be kind
>> of tricky, especially in the face of multiple rewrites.  I'd be more
>> inclined to find a separate place to store the mappings.  Note that,
>> AFAICS, there's no real need for the mapping file to be
>> block-structured, and I believe they'll be written first (with no
>> readers) and subsequently only read (with no further writes) and
>> eventually deleted.
>
> I was thinking of storing it along other data used during logical
> decoding and let decoding's cleanup clean up that data as well. All the
> information for that should be there.

That seems OK.

> There's one snag I can currently see, namely that we actually need to
> prevent a formerly dropped relfilenode from being reused. Not
> entirely sure what the best way to do that is.

I'm not sure in detail, but it seems to me that this is all part of the
same picture.  If you're tracking changed relfilenodes, you'd better
track dropped ones as well.  Completely aside from this issue, what
keeps a relation from being dropped before we've decoded all of the
changes made to its data before the point at which it was dropped?  (I
hope the answer isn't "nothing".)

>> One possible objection to this is that it would preclude decoding on a
>> standby, which seems like a likely enough thing to want to do.  So
>> maybe it's best to WAL-log the changes to the mapping file so that the
>> standby can reconstruct it if needed.
>
> The mapping file probably can be one big wal record, so it should be
> easy enough to do.

It might be better to batch it, because if you rewrite a big relation,
and the record is really big, everyone else will be frozen out of
inserting WAL for as long as that colossal record is being written and
synced.  If it's inserted in reasonably-sized chunks, the rest of the
system won't be starved as badly.
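
To make the batching idea concrete, here is a minimal sketch in Python
(not PostgreSQL code).  It assumes the mapping is a sequence of
old-to-new tuple location pairs; emit_record, max_entries_per_record,
and the entry layout are all hypothetical names chosen for
illustration, and a real implementation would be written in C against
the WAL insertion machinery.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

# Hypothetical entry layout: ((old_block, old_offset), (new_block, new_offset)).
MappingEntry = Tuple[Tuple[int, int], Tuple[int, int]]


def chunk_mappings(
    mappings: Sequence[MappingEntry], max_entries_per_record: int
) -> Iterator[List[MappingEntry]]:
    """Yield consecutive slices of at most max_entries_per_record entries."""
    if max_entries_per_record <= 0:
        raise ValueError("max_entries_per_record must be positive")
    for start in range(0, len(mappings), max_entries_per_record):
        yield list(mappings[start:start + max_entries_per_record])


def log_rewrite_mappings(
    mappings: Sequence[MappingEntry],
    emit_record: Callable[[List[MappingEntry]], None],
    max_entries_per_record: int = 1024,
) -> int:
    """Emit the mapping as several bounded records instead of one huge one.

    emit_record stands in for whatever would insert a single WAL record
    (an assumption for this sketch).  Because each call covers a bounded
    number of entries, other writers get a chance to insert their own
    records between chunks.  Returns the number of records emitted.
    """
    count = 0
    for chunk in chunk_mappings(mappings, max_entries_per_record):
        emit_record(chunk)
        count += 1
    return count


if __name__ == "__main__":
    emitted: List[List[MappingEntry]] = []
    entries = [((blk, 1), (blk + 1, 1)) for blk in range(2500)]
    n = log_rewrite_mappings(entries, emitted.append, max_entries_per_record=1024)
    print(n, [len(r) for r in emitted])
```

The chunk size here is arbitrary; the point is only that the time any
single record insertion can take is bounded by the chunk size rather
than by the size of the rewritten relation.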

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


