[WIP] [B-Tree] Retail IndexTuple deletion - Mailing list pgsql-hackers

From Andrey V. Lepikhov
Subject [WIP] [B-Tree] Retail IndexTuple deletion
Date
Msg-id 425db134-8bba-005c-b59d-56e50de3b41e@postgrespro.ru
Whole thread Raw
Responses Re: [WIP] [B-Tree] Retail IndexTuple deletion
Re: [WIP] [B-Tree] Retail IndexTuple deletion
List pgsql-hackers
Hi,
I have written a code for quick indextuple deletion from an relation by 
heap tuple TID. The code relate to "Retail IndexTuple deletion" 
enhancement of btree index on postgresql wiki [1].
Briefly, it includes three steps:
1. Key generation for index tuple searching.
2. Index relation search for tuple with the heap tuple TID.
3. Deletion of the tuple from the index relation.

Now, index relation cleaning performs by vacuum which scan all index 
relation for dead entries sequentially, tuple-by-tuple. This simplistic 
and safe method can be significantly surpasses in the cases, than a 
number of dead entries is not large by retail deletions which uses index 
scans for search dead entries. Also, it can be used by distributed 
systems for reduce cost of a global index vacuum.

Patch '0001-retail-indextuple-deletion' introduce new function 
amtargetdelete() in access method interface. Patch 
'0002-quick-vacuum-strategy' implements this function for an alternative 
strategy of lazy index vacuum, called 'Quick Vacuum'.

The code demands hold DEAD tuple storage until scan key will be created. 
In this version I add 'target_index_deletion_factor' option. If it more 
than 0, heap_page_prune() uses ItemIdMarkDead() instead of 
ItemIdSetDead() function for set DEAD flag and hold the tuple storage.
Next step is developing background worker which will collect pairs (tid, 
scankey) of DEAD tuples from heap_page_prune() function.

Here are test description and some execution time measurements results 
showing the benefit of this patches:

Test:
-----
create table test_range(id serial primary key, value integer);
insert into test_range (value) select random()*1e7/10^N from 
generate_series(1, 1e7);
DELETE FROM test_range WHERE value=1;
VACUUM test_range;

Results:
--------

| n | t1, s  | t2, s  | speedup |
|---|---------------------------|
| 0 | 0.00003| 0.4476 | 1748.4  |
| 1 | 0.00006| 0.5367 | 855.99  |
| 2 | 0.0004 | 0.9804 | 233.99  |
| 3 | 0.0048 | 1.6493 | 34.576  |
| 4 | 0.5600 | 2.4854 | 4.4382  |
| 5 | 3.3300 | 3.9555 | 1.2012  |
| 6 | 17.700 | 5.6000 | 0.3164  |
|---|---------------------------|
in the table, t1 - measured execution time of lazy_vacuum_index() 
function by Quick-Vacuum strategy; t2 - measured execution time of 
lazy_vacuum_index() function by Lazy-Vacuum strategy;

Note, guaranteed allowable time of index scans (used for quick deletion) 
will be achieved by storing equal-key index tuples in physical TID order 
[2] with patch [3].

[1] 
https://wiki.postgresql.org/wiki/Key_normalization#Retail_IndexTuple_deletion
[2] 

https://wiki.postgresql.org/wiki/Key_normalization#Making_all_items_in_the_index_unique_by_treating_heap_TID_as_an_implicit_last_attribute
[3] 
https://www.postgresql.org/message-id/CAH2-WzkVb0Kom%3DR%2B88fDFb%3DJSxZMFvbHVC6Mn9LJ2n%3DX%3DkS-Uw%40mail.gmail.com

--
Andrey Lepikhov
Postgres Professional:
https://postgrespro.com
The Russian Postgres Company

Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: [bug fix] Cascaded standby cannot start after a clean shutdown
Next
From: Michael Paquier
Date:
Subject: Re: Fix some error handling for read() and errno