Re: logical changeset generation v6.6 - Mailing list pgsql-hackers

From Robert Haas
Subject Re: logical changeset generation v6.6
Date
Msg-id CA+TgmobdsR6W4mYMZniu92BzXyLHWJ_xsM5DpZk_1s8f05o9LQ@mail.gmail.com
Whole thread Raw
In response to Re: logical changeset generation v6.6  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: logical changeset generation v6.6  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On Mon, Nov 11, 2013 at 12:00 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> [ updated patch-set ]

I'm pretty happy with what's now patch #1, f/k/a known as patch #3,
and probably somewhere else in the set before that.  At any rate, I
refer to 0001-wal_decoding-Add-wal_level-logical-and-log-data-requ.patch.gz.

I think the documentation still needs a bit of work.  It's not
particularly clean to just change all the places that refer to the
need to set wal_level to archive (or hot_standby) level to instead
refer to archive (or hot_standby, logical).  If we're going to do it
that way, I think we definitely need a connecting word between
hot_standby and logical, specifically "or".  But I'm wondering if
would be better to instead change those places to say archive (or any
higher setting).

You've actually changed the meaning of this section (and not in a good way):
        be set at server start. <varname>wal_level</> must be set
-        to <literal>archive</> or <literal>hot_standby</> to allow
-        connections from standby servers.
+        to <literal>archive</>, <literal>hot_standby</> or <literal>logical</>
+        to allow connections from standby servers.

I think that the previous text meant that you needed archive - or, if
you want to allow connections, hot_standby.  The new text loses that
nuance.

I'm tempted to think that we're better off calling this "logical
decoding" rather than "logical replication".  At least, we should
standardize on one or the other.  If we go with "decoding", then fix
these:

+                * For logical replication, we need the tuple even if
we're doing a
+/* Do we need to WAL-log information required only for Hot Standby
and logical replication? */
+/* Do we need to WAL-log information required only for logical replication? */
(and we should go back and correct the instance already added to the
ALTER TABLE documentation)

Is there any special reason why RelationIsLogicallyLogged(), which is
basically a three-pronged test, does one of those tests in a macro and
defers the other two to a function?  Why not just put it all in the
macro?

I did some performance testing on the previous iteration of this
patch, just my usual set of 30-minute pgbench runs.  I tried it with
wal_level=hot_standby and wal_level=logical.  32-clients, scale factor
300, shared_buffers = 8GB, maintenance_work_mem = 4GB,
synchronous_commit = off, checkpoint_segments = 300,
checkpoint_timeout = 15min, checkpoint_completion_target = 0.9.  The
results came out like this:

hot_standby tps = 15070.229005 (including connections establishing)
hot_standby tps = 14769.905054 (including connections establishing)
hot_standby tps = 15119.350014 (including connections establishing)
logical tps = 14713.523689 (including connections establishing)
logical tps = 14799.242855 (including connections establishing)
logical tps = 14557.538038 (including connections establishing)

The runs were interleaved, but I've shown them here grouped by the
wal_level in use.  If you compare the median values, there's about a
1% regression there with wal_level=logical, but that might not even be
significant - and if it is, well, that's why this feature has an off
switch.

-        * than its parent.  Musn't recurse here, or we might get a
stack overflow
+        * than its parent.  May not recurse here, or we might get a
stack overflow

You don't need this change; it doesn't change the meaning.

+        * with fewer than PGPROC_MAX_CACHED_SUBXIDS. Note that it is fine
+        * if didLogXid isn't set for a transaction even though it appears
+        * in a wal record, we'll just superfluously log something.

It'd be good to rewrite this comment to explain under what
circumstances that can happen, or why it can't happen but that it
would be OK if it did.

I think we'd better separate the changes to catalog.c from the rest of
this.  Those are changing semantics in a significant way that needs to
be separately called out.  In particular, a user-created table in
pg_catalog will be able to have indexes defined on it, will be able to
be truncated, will be allowed to have triggers, etc.  I think that's
OK, but it shouldn't be a by-blow of the rest of this patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Kohei KaiGai
Date:
Subject: Re: What's needed for cache-only table scan?
Next
From: J Smith
Date:
Subject: Re: Errors on missing pg_subtrans/ files with 9.3