pgsql: Use streaming read for VACUUM cleanup of GIN - Mailing list pgsql-committers

From Michael Paquier
Subject pgsql: Use streaming read for VACUUM cleanup of GIN
Msg-id E1w0W6b-003YRp-25@gemulon.postgresql.org
List pgsql-committers
Use streaming read for VACUUM cleanup of GIN

This commit replaces the synchronous ReadBufferExtended() loop done in
ginvacuumcleanup() with the streaming read equivalent, to improve I/O
efficiency during GIN index vacuum cleanup operations.
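
As a rough illustration of the difference between the two access
patterns, here is a self-contained toy model in C.  It is not the
committed code and does not use PostgreSQL's read stream API: ToyStream,
toy_stream_next(), LOOKAHEAD and the two scan functions are hypothetical
names invented for this sketch, and the "I/O" is only counted, never
performed.  The idea it tries to show is that a callback hands out block
numbers ahead of the consumer, so several pages can be requested in one
batch instead of one blocking read per page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)
#define LOOKAHEAD 4             /* hypothetical look-ahead distance */

/* Callback state: a half-open range of blocks still to hand out. */
typedef struct BlockRange
{
    BlockNumber current;
    BlockNumber last_exclusive;
} BlockRange;

/* Return the next block to read, or InvalidBlockNumber when done. */
static BlockNumber
next_block_cb(void *private_data)
{
    BlockRange *range = private_data;

    if (range->current < range->last_exclusive)
        return range->current++;
    return InvalidBlockNumber;
}

/* Toy stream: a small ring buffer of blocks requested ahead of use. */
typedef struct ToyStream
{
    BlockNumber (*cb) (void *);
    void       *cb_private;
    BlockNumber queue[LOOKAHEAD];
    int         head;
    int         count;
    int         io_batches;     /* number of batched requests issued */
    bool        exhausted;
} ToyStream;

/* Ask the callback for up to LOOKAHEAD blocks, counted as one batch. */
static void
toy_stream_fill(ToyStream *s)
{
    int         started = 0;

    while (!s->exhausted && s->count < LOOKAHEAD)
    {
        BlockNumber blkno = s->cb(s->cb_private);

        if (blkno == InvalidBlockNumber)
        {
            s->exhausted = true;
            break;
        }
        s->queue[(s->head + s->count) % LOOKAHEAD] = blkno;
        s->count++;
        started++;
    }
    assert(s->count <= LOOKAHEAD);
    if (started > 0)
        s->io_batches++;
}

/* Hand the next block to the consumer, refilling when the queue is empty. */
static BlockNumber
toy_stream_next(ToyStream *s)
{
    BlockNumber blkno;

    if (s->count == 0)
        toy_stream_fill(s);
    if (s->count == 0)
        return InvalidBlockNumber;
    blkno = s->queue[s->head];
    s->head = (s->head + 1) % LOOKAHEAD;
    s->count--;
    return blkno;
}

/* Synchronous pattern: one blocking request per page in [first, nblocks). */
unsigned
scan_synchronous(BlockNumber first, BlockNumber nblocks, int *io_requests)
{
    unsigned    visited = 0;

    *io_requests = 0;
    for (BlockNumber blkno = first; blkno < nblocks; blkno++)
    {
        (*io_requests)++;
        visited++;
    }
    return visited;
}

/* Streaming pattern: same pages, same order, but requested in batches. */
unsigned
scan_streamed(BlockNumber first, BlockNumber nblocks, int *io_batches)
{
    BlockRange  range = {first, nblocks};
    ToyStream   stream = {next_block_cb, &range, {0}, 0, 0, 0, false};
    BlockNumber expected = first;
    unsigned    visited = 0;

    while (toy_stream_next(&stream) == expected && expected < nblocks)
    {
        expected++;
        visited++;
    }
    *io_batches = stream.io_batches;
    return visited;
}
```

Under these assumptions, scanning blocks 1 to 10 visits the same ten
pages either way, but the synchronous loop counts ten requests while the
toy stream counts three batches.  How much of that turns into a runtime
gain in a real system depends on storage latency and on how the actual
read stream implementation combines and schedules its I/O.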

With dm_delay to emulate some latency and debug_io_direct=data to force
synchronous writes and ensure that the read path is exercised, the
author measured a 5x improvement in runtime, along with a substantial
reduction in the I/O statistics.  I have reproduced similar numbers in
similar tests, with the improvement growing as more tuples and more
pages are manipulated.

Author: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/CABPTF7VrqfbcDXqGrdLQ2xaQ=K0RzExNuw6U_GGqzSJu32wfdQ@mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/6c228755add8f0714677440d53a160f9ed332902

Modified Files
--------------
src/backend/access/gin/ginvacuum.c | 28 ++++++++++++++++++++++++++--
1 file changed, 26 insertions(+), 2 deletions(-)

