Re: logical changeset generation v6.8 - Mailing list pgsql-hackers

From Robert Haas
Subject Re: logical changeset generation v6.8
Date
Msg-id CA+TgmoYtm4-dv6uK6gisJs9GwLuW3HvyDsUwcvioR9Ywo9Cbgw@mail.gmail.com
Whole thread Raw
In response to Re: logical changeset generation v6.8  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: logical changeset generation v6.8
List pgsql-hackers
On Wed, Dec 4, 2013 at 10:55 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> [ updated logical decoding patches ]

Regarding patch #4, introduce wal decoding via catalog timetravel,
which seems to be the bulk of what's not committed at this point...

- I think this needs SGML documentation, same kind of thing we have
for background workers, except probably significantly more.  A design
document with ASCII art in a different patch does not constitute
usable documentation.  I think it would fit best under section VII,
internals, perhaps entitled "Writing a Logical Decoding Plugin".
Looking at it, I rather wonder if the "Background Worker Processes"
ought to be there as well, rather than under section V, server
programming.

+                       /* setup the redirected t_self for the benefit
of logical decoding */
...
+                       /* reset to original, non-redirected, tid */

Clear as mud.

+ * rrow to disk but only do so in batches when we've collected several of them

Typo.

+ * position before those records back. Independently from WAL logging,

"before those records back"?

+ * position before those records back. Independently from WAL logging,
+ * everytime a rewrite is finished all generated mapping files are directly

I would delete "Independently from WAL logging" from this sentence.
And "everytime" is two words.

+ * file. That leaves the tail end that has not yet been fsync()ed to disk open
...
+ * fsynced.

Pick a spelling and stick with it.  My suggestion is "flushed to
disk", not actually using fsync per se at all.

+ * TransactionDidCommit() check. But we want to support physical replication
+ * for availability and to support logical decoding on the standbys.

What is physical replication for availability?

+ * necessary. If we detect that we don't need to log anything we'll prevent
+ * any further action by logical_*rewrite*

Sentences should end with a period, and the reason for the asterisks
is not clear.

+       logical_xmin =
+               ((volatile LogicalDecodingCtlData*) LogicalDecodingCtl)->xmin;

Ugly.

+                                        errmsg("failed to write to
logical remapping file: %m")));

Message style.

+                       ereport(ERROR,
+                                       (errcode_for_file_access(),
+                                        errmsg("incomplete write to
logical remapping file, wrote %d of %u",
+                                                       written, len)));

Message style.  I suggest treating a short write as ENOSPC; there is
precedent elsewhere.

I don't think there's much point in including "remapping" in all of
the error messages.  It adds burden for translators and users won't
know what a remapping file is anyway.

+               /*
+                * We intentionally violate the usual WAL coding
practices here and
+                * write to the file *first*. This way an eventual
checkpoint will
+                * sync any part of the file that's not guaranteed to
be recovered by
+                * the XLogInsert(). We deal with the potential
corruption in the tail
+                * of the file by truncating it to the last safe point
during WAL
+                * replay and by checking whether the xid performing
the mapping has
+                * committed.
+                */

Don't have two different comments explaining this.  Either replace
this one with a reference to the other one, or get rid of the other
one and just keep this one.  I vote for the latter.

I don't see a clear explanation anywhere of what the
rs_logical_mappings hash is actually doing.  This is badly needed.
This code basically presupposes that you know what it's try to
accomplish, and even though I sort of do, it leaves a lot to be
desired in terms of clarity.

+       /* nothing to do if we're not working on a catalog table */
+       if (!state->rs_logical_rewrite)
+               return;

Comment doesn't accurately describe code.

+       /* use *GetUpdateXid to correctly deal with multixacts */
+       xmax = HeapTupleHeaderGetUpdateXid(new_tuple->t_data);

I don't feel enlightened by that comment.

+       if (!TransactionIdIsNormal(xmax))
+       {
+               /*
+                * no xmax is set, can't have any permanent ones, so
this check is
+                * sufficient
+                */
+       }
+       else if (HEAP_XMAX_IS_LOCKED_ONLY(new_tuple->t_data->t_infomask))
+       {
+               /* only locked, we don't care */
+       }
+       else if (!TransactionIdPrecedes(xmax, cutoff))
+       {
+               /* tuple has been deleted recently, log */
+               do_log_xmax = true;
+       }

Obfuscated.  Rewrite without empty blocks.

+       /*
+        * Write out buffer everytime we've too many in-memory entries.
+        */
+       if (state->rs_num_rewrite_mappings >= 1000 /* arbitrary number */)
+               logical_heap_rewrite_flush_mappings(state);

What happens if the number of items per hash table entry is small but
the number of entries is large?

+               /* XXX: should we warn about such files? */

Nah.

+                                                errmsg("Could not
fsync logical remapping file \"%s\": %m",

Capitalization.

+ *             Decodes WAL records fed from xlogreader.h read into an
reorderbuffer
+ *             while simultaneously letting snapbuild.c build an appropriate
+ *             snapshots to decode those.

This comment doesn't seem to have very good grammar, and it's just a
wee bit less explanation than is warranted.

+ * Take every XLogReadRecord()ed record and perform the actions required to

I'm generally not that fond of using function names as verbs.

+                * Rmgrs irrelevant for changeset extraction, they
describe stuff not
+                * represented in logical decoding. Add new rmgrs in
rmgrlist.h's
+                * order.

The following resource managers are irrelevant for changeset
extraction, because they describe...

+               case RM_NEXT_ID:
+               default:
+                       elog(ERROR, "unexpected RM_NEXT_ID rmgr_id");

Message doesn't match code.

+                       /* XXX: we could replay the transaction and
prepare it as well. */

Should we do that?

+                                * Abort all transactions that we keep
track of that are older

Come on.  You're not aborting anything; you're throwing away state
because you know it did abort.  The function naming here could maybe
use some work, too.  ReorderBufferDiscardXID()?

+                                * for doing so since, in contrast to
shutdown or end of
+                                * recover checkpoints, we have
sufficient knowledge to deal

recovery, not recover

+                        * XXX: There doesn't seem to be a usecase for decoding

Why XXX?

+               case XLOG_HEAP_INPLACE:
+                       /*
+                        * Cannot be important for our purposes, not
part of transactions.
+                        */
+                       if (!TransactionIdIsValid(xid))
+                               break;
+
+                       SnapBuildProcessChange(builder, xid, buf->origptr);
+                       /* heap_inplace is only done in catalog
modifying txns */
+
ReorderBufferXidSetCatalogChanges(ctx->reorder, xid, buf->origptr);
+                       break;

It is not clear to me why we care about some instances of this and not others.

+ * logical.c
+ *
+ *        Logical decoding shared memory management
...
+ * logical decoding on-disk data structures.

So, apparently it's more than just shared memory management.

+       /*
+        * don't overwrite if we already have a newer xmin. This can
+        * happen if we restart decoding in a slot.
+        */
+       if (TransactionIdPrecedesOrEquals(xmin, MyLogicalDecodingSlot->xmin))
+       {
+       }
+       /*
+        * If the client has already confirmed up to this lsn, we directly
+        * can mark this as accepted. This can happen if we restart
+        * decoding in a slot.
+        */
+       else if (lsn <= MyLogicalDecodingSlot->confirmed_flush)

Try to avoid empty blocks.  And we don't normally put comments between
the closing brace of the if and the else clause.

+               elog(DEBUG1, "got new xmin %u at %X/%X", xmin,
+                        (uint32) (lsn >> 32), (uint32) lsn);
+       }
+       SpinLockRelease(&MyLogicalDecodingSlot->mutex);

Don't elog() while holding a spinlock.

+XLogRecPtr ComputeLogicalRestartLSN(void)

Formatting.

+       if (wal_level < WAL_LEVEL_LOGICAL)
+               ereport(ERROR,
+
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+                                errmsg("logical decoding requires
wal_level=logical")));
+
+       if (MyDatabaseId == InvalidOid)
+               ereport(ERROR,
+
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+                                errmsg("logical decoding requires to
be connected to a database")));
+
+       if (max_logical_slots == 0)
+               ereport(ERROR,
+
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+                                (errmsg("logical decoding requires
needs max_logical_slots > 0"))));
+

Message style, times three.

+                                errmsg("There already is a logical
slot named \"%s\"", name)));

And again.

All right, I'm giving up for now.  These patches need a LOT of work on
comments and documentation to be committable, even if the underlying
architecture is basically sound.  There's a lot of stuff that has no
comment at all, and a lot of the comments that do exist are basically
recapitulating what the code says (or in some cases, not what the code
says) rather than explaining what the purpose of all of this stuff is
at a conceptual level.  The comment at the header of reorderbuffer.c,
for example, is well-written and cogent, but there's a lot of places
where similar detail is needed but lacking.  I realize that it isn't
project policy for every function to have a header comment but at the
very least I think it'd be worth asking, for each one, why it doesn't
need one, and/or what information could be provided in such a comment
to most effectively inform the reader.

+        * We free separately allocated data by entirely scrapping oure personal

Spelling.


+        * clog. If we're doing logical replication we can't do that though, so
+        * hold the lock for a moment longer.

...because why?

I'm still unhappy that we're introducing logical decoding slots but no
analogue for physical replication.  If we had the latter, would it
share meaningful amounts of structure with this?

+                * noncompatible way, but those are prevented both on catalog
+                * tables and on user tables declared as additional catalog
+                * tables.

Really?

My eyes are starting to glaze over, so really stopping here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: Completing PL support for Event Triggers
Next
From: Robert Haas
Date:
Subject: Re: [COMMITTERS] pgsql: Add a new reloption, user_catalog_table.