Re: [HACKERS] GUC for cleanup indexes threshold. - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: [HACKERS] GUC for cleanup indexes threshold.
Date
Msg-id CAPpHfdteaDcBscVxzSViEV4CXLU5==9Ocz81CoFSsNWmv-pZpw@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] GUC for cleanup indexes threshold.  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Responses Re: [HACKERS] GUC for cleanup indexes threshold.  (Masahiko Sawada <sawada.mshk@gmail.com>)
Re: [HACKERS] GUC for cleanup indexes threshold.  (Darafei "Komяpa" Praliaskouski <me@komzpa.net>)
List pgsql-hackers
Hi!

I'd like to propose a revised patch based on various ideas upthread.

This patch works as following.

1) B-tree meta page is extended with 2 additional parameters:
 * btm_oldest_btpo_xact – oldest btpo_xact among of deleted pages,
 * btm_last_cleanup_num_heap_tuples – number of heap tuples during last cleanup scan.

2) These parameters are reset during btbulkdelete() and set during btvacuumcleanup().

3) Index scans during second and subsequent btvacuumcleanup() happen only if
   btm_oldest_btpo_xact is older than RecentGlobalXmin
  OR num_heap_tuples >= btm_last_cleanup_num_heap_tuples(1 + vacuum_cleanup_index_scale_factor).

In other words btvacuumcleanup() scans the index only if there are recyclable pages,
or index statistics is stalled (inserted more than vacuum_cleanup_index_scale_factor
since last index statistics collection).

4) vacuum_cleanup_index_scale_factor can be set either by GUC or reloption.
Default value is 0.1.  So, by default cleanup scan is triggered after increasing of
table size by 10%.

5) Since new fields are added to the metapage, BTREE_VERSION is bumped.
In order to support pg_upgrade, read of previous metapage version is supported.
On metapage rewrite, it's upgraded to the new version.

So, since we don't skip scan of recyclable pages, there is no risk of xid wraparound.
Risk of stalled statistics is also small, because vacuum_cleanup_index_scale_factor
default value is quite low.  User can increase vacuum_cleanup_index_scale_factor
on his own risk and have less load of B-tree cleanup scan bought by more gap in
index statistics.

Some simple benchmark shows the effect.

Before patch.

# insert into t select i from generate_series(1,100000000) i;
# create index t_i_idx on t(i);
# vacuum t;
VACUUM
Time: 15639,822 ms (00:15,640)
# insert into t select i from generate_series(1,1000) i;
INSERT 0 1000
Time: 6,195 ms
# vacuum t;
VACUUM
Time: 1012,794 ms (00:01,013)
# insert into t select i from generate_series(1,1000) i;
INSERT 0 1000
Time: 5,276 ms
# vacuum t;
VACUUM
Time: 1013,254 ms (00:01,013)

After patch.

# insert into t select i from generate_series(1,100000000) i;
# create index t_i_idx on t(i);
# vacuum t;
VACUUM
Time: 15689,450 ms (00:15,689)
# insert into t select i from generate_series(1,1000) i;
INSERT 0 1000
Time: 5,585 ms
# vacuum t;
VACUUM
Time: 50,777 ms
# insert into t select i from generate_series(1,1000) i;
INSERT 0 1000
Time: 5,641 ms
# vacuum t;
VACUUM
Time: 46,997 ms

Thus, vacuum time for append-only table drops from 1000 ms to 50 ms (in about 20X).

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment

pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: [HACKERS] MERGE SQL Statement for PG11
Next
From: Michael Paquier
Date:
Subject: Re: using worker_spi as pattern