This patch makes the following changes to GiST:
- refactor the code to keep a pin on the buffer currently being examined
by a GiST index scan. This avoids the need to invoke ReadBuffer() for
each tuple produced by the scan, which should improve performance.
- add support for index tuple killing (once the above refactoring was
done, this was easy). This should ensure that GiST performance degrades
more gracefully in the presence of expired heap tuples.
- remove gistscancache(): per discussion on -hackers, this is useless
- fold gistfirst() into gistnext(): there was a bunch of code duplicated
here for no good reason
- rename some structure fields to be more sensible
- add some comments to gistget.c
- rename IndexUpdateStats() to IndexCloseAndUpdateStats(), per
suggestion from Tom
- various other cleanups and improvements
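
To illustrate the first item above, here is a toy Python model of the
difference between looking up the page once per tuple and keeping it
pinned for the whole page. This is not the GiST code: BufferManager,
read_buffer() and both scan functions are hypothetical names made up for
the sketch, and it only counts page lookups.

```python
class BufferManager:
    """Hypothetical stand-in for the buffer manager; counts page lookups."""

    def __init__(self, pages):
        self.pages = pages      # block number -> list of index tuples
        self.read_calls = 0

    def read_buffer(self, blkno):
        # Plays the role of ReadBuffer(): find the page and pin it.
        self.read_calls += 1
        return self.pages[blkno]


def scan_lookup_per_tuple(bufmgr, blkno):
    """Look the page up again for every tuple the scan returns."""
    results = []
    offset = 0
    while True:
        page = bufmgr.read_buffer(blkno)    # one lookup per call
        if offset >= len(page):
            break                           # page exhausted
        results.append(page[offset])
        offset += 1
    return results


def scan_keep_pin(bufmgr, blkno):
    """Look the page up once and keep it while returning its tuples."""
    page = bufmgr.read_buffer(blkno)        # single lookup for the page
    return list(page)


if __name__ == "__main__":
    pages = {0: ["t1", "t2", "t3"]}
    old, new = BufferManager(pages), BufferManager(pages)
    scan_lookup_per_tuple(old, 0)
    scan_keep_pin(new, 0)
    print("lookups per tuple:", old.read_calls)
    print("lookups with pin kept:", new.read_calls)
```

Both functions return the same tuples; only the number of lookups
differs, and the gap grows with the number of matching tuples per page.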

Performance results:

I ran contrib/rtree_gist/bench/bench.pl to test the effect of these
changes. "$NUM" in create_test.pl was set to 400,000, and all tests were
run with "-b 40". The machine is a 2-processor 2.8 GHz Xeon running Linux
2.6.9, with a single 10k RPM SCSI disk and 1GB of RAM. I ran the test a
few times to warm up the cache first.

Results without patch:

total: 18.55 sec; number: 40; for one: 0.464 sec; found 1 docs
total: 18.44 sec; number: 40; for one: 0.461 sec; found 1 docs
total: 18.45 sec; number: 40; for one: 0.461 sec; found 1 docs
total: 18.43 sec; number: 40; for one: 0.461 sec; found 1 docs

Results with patch:

total: 16.97 sec; number: 40; for one: 0.424 sec; found 1 docs
total: 16.91 sec; number: 40; for one: 0.423 sec; found 1 docs
total: 16.96 sec; number: 40; for one: 0.424 sec; found 1 docs
total: 16.94 sec; number: 40; for one: 0.424 sec; found 1 docs
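
For a quick summary of the two sets of runs, the "total" figures can be
averaged and compared. The short Python snippet below does just that; the
numbers are copied from the output above, and the function name is only
illustrative. It works out to a reduction of roughly 8% in total time.

```python
from statistics import mean

# "total" seconds from the four runs reported above
without_patch = [18.55, 18.44, 18.45, 18.43]
with_patch = [16.97, 16.91, 16.96, 16.94]


def relative_improvement(before, after):
    """Fractional reduction in mean total time going from before to after."""
    b, a = mean(before), mean(after)
    return (b - a) / b


improvement = relative_improvement(without_patch, with_patch)

if __name__ == "__main__":
    print(f"mean without patch: {mean(without_patch):.3f} sec")
    print(f"mean with patch:    {mean(with_patch):.3f} sec")
    print(f"reduction in total time: {improvement:.1%}")
```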

I've attached a gzip'd patch against current sources; it includes the
previous changes made to GiST for memory management and related cleanups.
(Sorry, I would include an incremental diff, but interdiff(1) seems to
be misbehaving...)

-Neil