Re: Changeset Extraction v7.7 - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Changeset Extraction v7.7
Date
Msg-id 20140224231605.GO6718@awork2.anarazel.de
Whole thread Raw
In response to Re: Changeset Extraction v7.7  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Changeset Extraction v7.7  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hi,

On 2014-02-24 17:06:53 -0500, Robert Haas wrote:
> I still think pg_create_logical_replication_slot should be in slotfuncs.c.

Ok, I don't feel too strongly, so I can change it. I wanted to keep
logical/ stuff out of slotfuncs.c, but there's not really a strong
reason for that.

> I don't think the completely-unsecured directory operations in
> test_decoding_regsupport.c are acceptable.  Tom fought tooth and nail
> to make sure that similar capabilities in adminpack carried meaningful
> security restrictions.

I actually thought they'd be too ugly to live and we'd remove them
pre-commit.

There's no security problem though afaics, since they aren't actually
created, and you need to be superuser to create C functions.

>         /*
> +        * Check whether there are, possibly unconnected, logical
> slots that refer
> +        * to the to-be-dropped database. The database lock we are holding
> +        * prevents the creation of new slots using the database.
> +        */
> +       if (ReplicationSlotsCountDBSlots(db_id, &nslots, &nslots_active))
> +               ereport(ERROR,
> +                               (errcode(ERRCODE_OBJECT_IN_USE),
> +                                errmsg("database \"%s\" is used in a
> logical decoding slot",
> +                                               dbname),
> +                                errdetail("There are %d slot(s), %d
> of them active",
> +                                                  nslots, nslots_active)));
> 
> What are you going to do when we get around to supporting this on a
> standby?  Whatever the answer is, maybe add a TODO comment.

I think it should actually mostly work out, anybody actively connected
to a slot will be kicked of (normal HS mechanisms)... But the slot would
currently live on which obviously isn't nice.

Will add TODO.

> +       /*
> +        * GetRunningTransactionData() acquired ProcArrayLock, we must release
> +        * it. We can do that before inserting the WAL record because
> +        * ProcArrayApplyRecoveryInfo can recheck the commit status using the
> +        * clog. If we're doing logical replication we can't do that though, so
> +        * hold the lock for a moment longer.
> +        */
> +       if (wal_level < WAL_LEVEL_LOGICAL)
> +               LWLockRelease(ProcArrayLock);
> +
>         recptr = LogCurrentRunningXacts(running);
> 
> +       /* Release lock if we kept it longer ... */
> +       if (wal_level >= WAL_LEVEL_LOGICAL)
> +               LWLockRelease(ProcArrayLock);
> +

> This seems unfortunate.  The comment should clearly explain why it's necessary.

There's another (existing) comment ontop of the function giving a bit
more context, but I'll expand.

I'd actually prefer to remove that special case alltogether, I don't
have much trust in those codepaths for HS... But that's not an argument
I want to fight out right nwo.

> -       heap_page_prune_opt(scan->rs_rd, buffer, RecentGlobalXmin);
> +       if (IsSystemRelation(scan->rs_rd)
> +               || RelationIsAccessibleInLogicalDecoding(scan->rs_rd))
> +               heap_page_prune_opt(scan->rs_rd, buffer, RecentGlobalXmin);
> +       else
> +               heap_page_prune_opt(scan->rs_rd, buffer, RecentGlobalDataXmin);
> 
> Instead of changing the callers of heap_page_prune_opt() in this way,
> I think it might be better to change heap_page_prune_opt() to take
> only the first two of its current three parameters; everybody's just
> passing RecentGlobalXmin right now anyway.

Sounds like a plan.

> -               if (nrels > 0 || nmsgs > 0 || RelcacheInitFileInval ||
> forceSyncCommit)
> +               if (nrels > 0 || nmsgs > 0 || RelcacheInitFileInval ||
> forceSyncCommit ||
> +                       XLogLogicalInfoActive())
> 
> Mmph.  Is this really necessary?  If so, why?  The comments could elucidate.

We could get rid of it by (optionally) adding information about the
database oid to compact commit, but that'd increase the size of the
record.

> +               bool fail_softly = slot->data.persistency == RS_EPHEMERAL;
> 
> This should be contingent on whether we're being called in the error
> pathway, not the slot type.  I think you should pass a bool.

Why? I had it that way at first, but for persistent slots this won't be
called in error pathways as we won't drop there.

> There are a bunch of places where you're testing IsSystemRelation() ||
> RelationIsAccessibleInLogicalDecoding().  Maybe we need a macro
> encapsulating that test, with a name chose to explain the point of it.

Sounds like a idea.

>  It seems to be indicating, roughly, whether the relation should
> participate in RecentGlobalXmin or RecentGlobalDataXmin.  But is there
> any point at all of separating those when !XLogLogicalInfoActive()?
> The test expands to:
> 
> IsSystemRelation() || (XLogLogicalInfoActive() &&
> RelationNeedsWAL(relation) && (IsCatalogRelation(relation) ||
> RelationIsUsedAsCatalogTable(relation)))
> 
> So basically this is all tables created in pg_catalog during initdb
> plus all TOAST tables in the system.  If wal_level=logical, then we
> also include tables marked with the reloption user_catalog_table=true,
> unless they're unlogged.  This all seems a bit complex.  Why not this:
> 
> IsSystemRelation() || || RelationIsUsedAsCatalogTable(relation)

Because that'd possibly retain too much when !XLogLogicalInfoActive(),
there's no need to look for RelationIsUsedAsCatalogTable() for those. We
could decide not to care?

> And why not this?
> 
> IsCatalogRelation() || || RelationIsUsedAsCatalogTable(relation)
> 
> i.e. is it really necessary to include all TOAST tables, or does it
> suffice to include TOAST tables of system catalogs?

The latter would suffice.

> I bet you're
> going to tell me that we don't know which TOAST tables pertain to
> user-catalog tables, and thus must include them all.  Ugh.

Not sure offhand, but if that's an issue, it needs to be fixed when
setting the option. I dimly remember thinking about it, and convincing
myself it's not an issue.

> + * GetOldestSafeDecodingTransactionId -- lowest xid not affected by vacuum
> 
> It seems to me that this is the lowest XID known not to have been
> pruned, whether by vacuum or otherwise.

Hm, yes, mentioning pruning make sense.

> +       /* ----
> +        * This is a bit tricky: We need to determine a safe xmin
> horizon to start
> +        * decoding from, to avoid starting from a running xacts
> record referring
> +        * to xids whose rows have been vacuumed or pruned
> +        * already. GetOldestSafeDecodingTransactionId() returns such
> a value, but
> +        * without further interlock it's return value might
> immediately be out of
> +        * date.
> +        *
> +        * So we have to acquire the ProcArrayLock to prevent computation of new
> +        * xmin horizons by other backends, get the safe decoding xid,
> and inform
> +        * the slot machinery about the new limit. Once that's done the
> +        * ProcArrayLock can be be released as the slot machinery now is
> +        * protecting against vacuum.
> +        * ----
> +        */
> 
> I can't claim to be very excited about this.

Because of the already_locked parameters, or any wider concerns?

> I'm assuming you've
> spent a lot of time thinking about ways to avoid this and utterly
> failed to come up with any reasonable alternative, but let me take a
> shot.  Suppose we take ProcArrayLock in exclusive mode and compute the
> oldest running XID, install it as our xmin, and then release
> ProcArrayLock.  At that point, nobody else can compute an oldest-xmin
> value that precedes that value, so we can take our time installing
> that value as the slot's xmin, without needing to hold a lock
> meanwhile.

I actually had it that way for a while, but what if the backend already
has a xmin set? Then we need to reason about whether the xmin is newer,
restore it afterwards and such. That doesn't seem nice.

Since the time holding the lock isn't long (we're just iterating over
the slots), I am not too worried?

Thanks for the review! Will address ASAP.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: Another possible corruption bug in 9.3.2 or possibly a known MultiXact problem?
Next
From: Haribabu Kommi
Date:
Subject: Re: contrib/cache_scan (Re: What's needed for cache-only table scan?)