Home > mailing lists

[PERFORM] Slow vacuum of GIST indexes, because of random reads on PostgreSQL 9.6 - Mailing list pgsql-performance

From	Marcin Barczynski
Subject	[PERFORM] Slow vacuum of GIST indexes, because of random reads on PostgreSQL 9.6
Date	September 6, 2017 14:57:25
Msg-id	CAOhG4wem20oFCyZhGW-WyqJcno45z2trPnvsDrDUcPbOytFycg@mail.gmail.com Whole thread Raw
List	pgsql-performance

Tree view

I am using a GIST index on timestamp range, because it supports 'contains' operator ('@>'). Unfortunately, in large scale (billions of rows, index size: almost 800 GB) vacuuming the index takes an order of magnitude longer than btrees (days/weeks instead of hours).

According to the code, during vacuum gist index is traversed in a logical order which translates into random disk acceses (function gistbulkdelete in gistvacuum.c). Btree indexes are vacuummed in physical order (function btvacuumscan in nbtree.c).

As a workaround, I'm planning to replace all uses of 'contains' with the following function:

CREATE OR REPLACE FUNCTION tstzrange_contains(

range tstzrange,

ts timestamptz)

RETURNS bool AS

SELECT (ts >= lower(range) AND (lower_inc(range) OR ts > lower(range)))

AND (ts <= upper(range) AND (upper_inc(range) OR ts < upper(range)))

$$ LANGUAGE SQL IMMUTABLE;

and create btree indexes on lower and upper bound:

CREATE INDEX my_table_time_range_lower_idx ON my_table (lower(time_range));

CREATE INDEX my_table_time_range_upper_idx ON my_table (upper(time_range));

Is it the best approach?

Best regards,

Marcin Barczynski

pgsql-performance by date:

From: Soni M
Date: 06 September 2017, 11:12:26
Subject: [PERFORM] OS cache management

From: Gerardo Herzig
Date: 06 September 2017, 17:13:34
Subject: Re: [PERFORM] OS cache management

[PERFORM] Slow vacuum of GIST indexes, because of random reads on PostgreSQL 9.6 - Mailing list pgsql-performance

Previous

Next