BUG #8192: On very large tables the concurrent update with vacuum lag the hot_standby replica - Mailing list pgsql-bugs

From federico@brandwatch.com
Subject BUG #8192: On very large tables the concurrent update with vacuum lag the hot_standby replica
Date
Msg-id E1Ui5wj-0007tI-Cl@wrigleys.postgresql.org
Whole thread Raw
Responses Re: BUG #8192: On very large tables the concurrent update with vacuum lag the hot_standby replica  (Simon Riggs <simon@2ndQuadrant.com>)
Re: BUG #8192: On very large tables the concurrent update with vacuum lag the hot_standby replica  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      8192
Logged by:          Federico Campoli
Email address:      federico@brandwatch.com
PostgreSQL version: 9.2.4
Operating system:   Debian 6.0
Description:        =


/*

Description:

It seems on very large tables the concurrent update with vacuum (or
autovacuum), =

when the slave is in hot standby mode, generates long loops in read on a
single wal segment during the recovery process.

This have two nasty effects.
A massive read IO peak and the replay lag increasing as the recovery process
hangs for long periods on a pointless loop.

PostgreSQL version: PostgreSQL 9.2.4 on x86_64-unknown-linux-gnu, compiled
by gcc-4.4.real (Debian 4.4.5-8) 4.4.5, 64-bit

Steps to reproduce the error:
setup an hot standby server.
the error occurs with streaming replication enabled and disabled

*/
SET client_min_messages=3D'debug2';
SET trace_sort=3D'on';


--create a new empty table
DROP TABLE IF EXISTS t_vacuum;
CREATE table t_vacuum
    (
       i_id_row integer,
       ts_time  timestamp with time zone default now()
    )
;
--disable the autovacuum, we need to run it manually
ALTER TABLE t_vacuum set (autovacuum_enabled=3D'off');

--insert into the table enough data to have multiple file nodes
INSERT INTO t_vacuum
    (
        i_id_row
    )
    SELECT * FROM generate_series(1,40000000)
;


/*
start iotop on the hot standby monitoring the postgres:startup process and
eventually the wal receiver
for example iotop -p 31293
wait for all the wal files generated during the update to be applied =

this does not apply on live servers of course but is useful to trap only the
IO activity generated by vacuum

If the streaming replication is on, this select will tell when the slave has
finished.

In any case is useful to check the replication lag during the vacuum.
SELECT =

    pg_size_pretty(pg_xlog_location_diff(pg_current_xlog_location(),
replay_location))
FROM =

    pg_stat_replication
;

*/

--in a new session and start an huge table update
UPDATE t_vacuum set ts_time=3Dnow() WHERE i_id_row<20000000;

--then vacuum the table =

VACUUM VERBOSE t_vacuum;

--at some point the startup process will stuck recovering one single wal
file and
--the DISK READ column will show a huge IO for a while. =

--if you monitor the wal receiver also you will notice stream will continue
without problems
--sooner or later the startup process will resume the recovery but meanwhile
the DISK READ is not zero
--the slave will accumulate replication lag =

pgsql-bugs by date:

Previous
From: vishnu.singh@sunarctechnologies.com
Date:
Subject: BUG #8190: Issue with slony-I replication on postgres master and slave database
Next
From: warren_brodie@hotmail.com
Date:
Subject: BUG #8195: Postgre SQL Database displayed as log in even after running a hide command on terminal