Re: Adding REPACK [concurrently] - Mailing list pgsql-hackers

From Antonin Houska
Subject Re: Adding REPACK [concurrently]
Date
Msg-id 82004.1775466224@localhost
Whole thread Raw
In response to Re: Adding REPACK [concurrently]  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List pgsql-hackers
Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

> So I've been trying to understand the "Introduce an option to make
> logical replication database specific." patch and I have to confess I
> just cannot.
> 
> As far as I can read, the point is that if we reach
> SnapBuildProcessRunningXacts() when db_specific is true (which means
> standby_decode is called in an output plugin that has set
> need_shared_catalogs to false), _and_ we've not reached consistent state
> yet, then we'll call LogStandbySnapshot with our DB oid to emit a new
> xl_running_xacts message.

Right.

> So the WAL-decoding process emits WAL.  I don't know if in normal
> conditions logical decoding processes emit WAL.  If this is exceptional,
> I think we should add a comment.

Emitting WAL in logical decoding is not exceptional: SnapBuildWaitSnapshot()
already calls LogStandbySnapshot(), in order to get to the next phase.

> Now, this additional WAL message will be processed by all other
> processes decoding WAL.  Perhaps it will ignored by most of them.

Right. That's one thing that I realized late yesterday, after having posted
the latest version of the patch. In SnapBuildProcessRunningXacts(), we need

    if (OidIsValid(running->dbid))
        return;

rather than

    if (db_specific)
        return;

because other backends can also generate their database-specific records.

> But
> most importantly, it will also reach back to ourselves, at which point
> we can hopefully use it to see that we might have reached consistent
> state within our database.  Then we know our snapshot is ready to be
> used.
> 
> Is this correct?

Yes.

> I think the reason it's safe to skip a lot of the processing caused by
> this additional process, is that xl_running_xacts messages are also
> emitted in other places in a non-database specific manner.  So all the
> other placecs that are emitting that message continue to exist and
> cause logical-decoders operate in the same way as before.

Yes. I'm still thinking though if, after having used the database-specific
record to reach CONSITENT, we sould enforce one cluster-wide record, so that
the cleanup (in "our backend") takes place sooner rather than later. Not sure
about that.

> I think we should sprinkle lots of comments in several places about
> this.  For example, I propose that standby_redo() should have something
> like
> 
>   * If 'dbid' is valid, only gather transactions running in that database.
> + * Such records should not be the only ones emitted, because this has
> + * potentially dangerous side-effects which makes some places ignore them:
> + *
> + * 1. SnapBuildProcessRunningXacts will skip computing the xmin and restart
> + * point from its input record if the record's xmin is older that the
> + * snapbuilder's current xmin; this should normally be fine because that
> + * information will be updated from other xl_running_xacts records.
> + * 2. standby_redo will likewise skip processing such a record
>   *
> (are there other things that should be mentioned?)

I added something like that, but - due to the reference to
SnapBuildProcessRunningXacts() - less verbose about snapbuild.c.

> Also, LogStandbySnapshot() should have a comment explaining that passing
> a valid dboid is a weird corner case which is to be used with care, and
> that functions X Y and Z are going to ignore snapshots carrying a valid
> dbid.

One more check added in standby_decode() (and mentioned in that function in
the comment).

> Why do we call SnapBuildFindSnapshot() to do this, instead of doing it
> directly in SnapBuildProcessRunningXacts?  Seems like it would be more
> straightforward.

Yes, fixed.


One more problem I found when testing w/o background worker
(contrib/test_decoding) was that accessSharedCatalogsInDecoding was not set
back to true. Both AllocateSnapshotBuilder() FreeSnapshotBuilder() do that
now.

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com


Attachment

pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: Introduce XID age based replication slot invalidation
Next
From: Tomas Vondra
Date:
Subject: Re: EXPLAIN: showing ReadStream / prefetch stats