On Mon, Oct 14, 2013 at 5:07 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> So, see the attached benchmark script. I've always done both a
> disk-bound and a memory-bound (using eatmydata, preventing fsyncs) run.
>
> * unpatched run, wal_level = hot_standby, eatmydata
> * unpatched run, wal_level = hot_standby
>
> * patched run, wal_level = hot_standby, eatmydata
> * patched run, wal_level = hot_standby
>
> * patched run, wal_level = logical, eatmydata
> * patched run, wal_level = logical
>
> Based on those results, there's no difference above noise for
> wal_level=hot_standby, with or without the patch. With wal_level=logical
> there's a measurable increase in wal traffic (~12-17%), but no
> performance decrease above noise.
>
> From my POV that's ok, those are really crazy catalog workloads.

Any increase in WAL traffic will translate into a performance hit once
the I/O channel becomes saturated, but I agree those numbers don't
sound terrible for that fairly brutal test case. Actually, I was more
concerned about the hit on non-catalog workloads. pgbench isn't a
good test because the key column is so narrow; but suppose we have a
table like (a text, b integer, c text) where (a, c) is the primary key
and those strings are typically pretty long - say just short enough
that we can still index the column. It'd be worth testing both
workloads where the primary key doesn't change (so the only overhead
is figuring out that we need not log it) and those where it does
(where we're double-logging most of the tuple). I assume the latter
has to produce a significant hit to WAL volume, and I don't think
there's much we can do about that; but the former had better be nearly
free.
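
For concreteness, here is a rough sketch of the two workloads, written as
a small Python helper that only builds the SQL text. The table name
wide_pk, the KEY_LEN value, and the function names are all assumptions
made up for illustration; the key length in particular would need tuning
to sit just under whatever the index will accept.

```python
import random
import string

# Assumed key length; pick something just short enough to stay indexable.
KEY_LEN = 1000


def make_key(rng, length=KEY_LEN):
    """Return a random lowercase string to use as a long key value."""
    return "".join(rng.choices(string.ascii_lowercase, k=length))


def schema_sql(table="wide_pk"):
    """DDL for the (a text, b integer, c text) table with PK (a, c)."""
    return (
        f"CREATE TABLE {table} "
        "(a text, b integer, c text, PRIMARY KEY (a, c));"
    )


def update_non_key_sql(table, a, c):
    """Workload 1: the primary key is unchanged, only b is updated."""
    return f"UPDATE {table} SET b = b + 1 WHERE a = '{a}' AND c = '{c}';"


def update_key_sql(table, a, c, new_a):
    """Workload 2: the update changes part of the primary key."""
    return f"UPDATE {table} SET a = '{new_a}' WHERE a = '{a}' AND c = '{c}';"


if __name__ == "__main__":
    rng = random.Random(42)
    a, c, new_a = make_key(rng), make_key(rng), make_key(rng)
    print(schema_sql())
    print(update_non_key_sql("wide_pk", a, c))
    print(update_key_sql("wide_pk", a, c, new_a))
```

Running each kind of UPDATE in a loop under wal_level = hot_standby and
wal_level = logical, and comparing the WAL generated, should show whether
the unchanged-key case really is close to free.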
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company