Optimize hash index bulk-deletion with streaming read
This commit refactors hashbulkdelete() to use streaming reads, improving
the efficiency of the operation by prefetching upcoming buckets while
the current bucket is being processed. Some specific changes are
required to keep the cleanup work consistent with the data fed to the
read stream callback: when the cached metapage is refreshed so that the
next set of buckets can be processed, the stream is reset and the data
fed to the callback has to be updated. The reset needs to happen in
the two code paths where _hash_getcachedmetap() is called.
The author has seen better performance numbers than I have on this one
(with tweaks similar to 6c228755add8). For both of us the numbers are
good enough, in terms of I/O and runtime, to make this change worth
doing.
Author: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/CABPTF7VrqfbcDXqGrdLQ2xaQ=K0RzExNuw6U_GGqzSJu32wfdQ@mail.gmail.com
Branch
------
master
Details
-------
https://git.postgresql.org/pg/commitdiff/bfa3c4f106b1fb858ead1c8f05332f09d34f664a
Modified Files
--------------
src/backend/access/hash/hash.c | 80 ++++++++++++++++++++++++++++++++++++++--
src/tools/pgindent/typedefs.list | 1 +
2 files changed, 78 insertions(+), 3 deletions(-)