Rapid disk usage spikes when updating large tables with GIN indexes - Mailing list pgsql-general

From Jonathan Marks
Subject Rapid disk usage spikes when updating large tables with GIN indexes
Date
Msg-id F16EF890-B574-487B-973F-4D8C13366DAB@gmail.com
Whole thread Raw
Responses Re: Rapid disk usage spikes when updating large tables with GIN indexes  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Hello!

We have a mid-sized database on RDS running 10.1 (32 cores, 240 GB RAM, 5TB total disk space, 20k PIOPS) with several
large(100GB+, tens of millions of rows) tables that use GIN indexes for full-text search. We at times need to index
verylarge (hundreds of pages) documents and as a result our tables have a mix of small (tens of tokens) to very large
(hundredsof thousands of tokens near to the tsvector 1MB limit). All our GIN indexes have fastupdate turned off — we
foundthat turning fastupdate on led to significant blocking and that we get better average performance with it turned
off.We’ve put a lot of effort into tuning our database over the last several years to the point where we have
acceptableread and write performance for these tables.  

One recurring, and predictable, issue that we have experienced regularly for multiple years is that inserting or
updatingrows in any table with GIN indexes results in extremely large drops in free disk space — i.e. inserting 10k
rowswith a total size of 10GB can result in the temporary loss of several hundred gigabytes of free disk space over 2-3
hours(and it could be more — we try to keep a 10-15% buffer of free disk space so that often represents almost all
availabledisk space). Once we stop the operation, free disk space rapidly recovers, which makes us believe that this
occursdue to logs, or some kind of temporary table. Our work_mem and maintenance_work_mem settings are pretty large
(12GBand 62GB, respectively). The database’s size on disk scarcely budges during this process. 

Unfortunately, we’re on RDS, so we’re unable to ssh directly into the instance to see what files are so large, and none
ofthe logs we can see (nor the wal logs) are large enough to explain this process. Any suggestions about where to look
tosee the cause of this problem (or about any settings we can tune or changes we could make to stop it) would be
greatlyappreciated. 

Thank you!

pgsql-general by date:

Previous
From: Christoph Moench-Tegeder
Date:
Subject: Re: Recommended way to copy database files on Windows OS (to performfile system level backup)
Next
From: Andreas Kretschmer
Date:
Subject: Re: Recommended way to copy database files on Windows OS (to perform file system level backup)