Re: Chronic performance issue with Replication Failover and FSM. - Mailing list pgsql-hackers

From Daniel Farina
Subject Re: Chronic performance issue with Replication Failover and FSM.
Msg-id CAAZKuFZ7rDAjMZayCbhnqUhsX-SxRYaypfAq56MHvJrbRQAhcg@mail.gmail.com
In response to Chronic performance issue with Replication Failover and FSM.  (Josh Berkus <josh@agliodbs.com>)
List pgsql-hackers
On Tue, Mar 13, 2012 at 4:53 PM, Josh Berkus <josh@agliodbs.com> wrote:
> All,
>
> I've discovered a built-in performance issue with replication failover
> at one site, which I couldn't find searching the archives.  I don't
> really see what we can do to fix it, so I'm posting it here in case
> others might have clever ideas.
>
> 1. The Free Space Map is not replicated between servers.
>
> 2. Thus, when we fail over to a replica, it starts with a blank FSM.
>
> 3. I believe the replica also starts with zero counters for autovacuum.
>
> 4. On a high-UPDATE workload, this means that the replica assumes tables
> have no free space until it starts to build a new FSM or autovacuum
> kicks in on some of the tables, much later on.
>
> 5. If your hosting is such that you fail over a lot (such as on AWS),
> then this causes cumulative table bloat which can only be cured by a
> VACUUM FULL.
>
> I can't see any way around this which wouldn't also bog down
> replication.  Clever ideas, anyone?

Would it bog it down by "much"?

(1 byte per 8kB page) * 2TB = 250MB.  Even if you doubled or tripled it
for pointer-overhead reasons it's pretty minor, whereas VACUUM traffic
is already pretty intense.  Still, it's clearly...work.

--
fdr

