Thread: btbulkdelete
On -performance we have been discussing a configuration where a bulk
delete run takes almost a day (and this is not due to crappy hardware or
apparent misconfiguration).  Unless I misinterpreted the numbers,
btbulkdelete() processes 85 index pages per second, while lazy vacuum is
able to clean up 620 heap pages per second.

Is there a special reason for scanning the leaf pages in *logical*
order, i.e. by following the opaque->btpo_next links?  Now that FSM
covers free btree index pages this access pattern might be highly
nonsequential.

I'd expect the following scheme to be faster:

    for blknum = 1 to nblocks {
        read block blknum;
        if (block is a leaf) {
            process it;
        }
    }

As there is no free lunch this has the downside that it pollutes the
cache with unneeded inner nodes and free pages.  OTOH there are far
fewer inner pages than leaf pages (even a balanced binary tree has more
leaves than inner nodes), and if free pages become a problem it's time
to re-index.

Did I miss something else?

Servus
 Manfred
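The difference between the two access patterns can be illustrated with a small toy model. This is only a sketch under stated assumptions: the `Page` class, the three page kinds, and the "jump" metric are hypothetical stand-ins, not PostgreSQL data structures, and the model ignores locking and concurrent index scans entirely.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Page:
    """Hypothetical stand-in for an index page (not a PostgreSQL structure)."""
    kind: str                       # "leaf", "inner", or "free"
    next_blk: Optional[int] = None  # right-sibling block number, leaves only


def scan_logical(pages: List[Page], first_leaf: int) -> List[int]:
    """Visit leaves by following sibling links, as btpo_next would."""
    order = []
    blk: Optional[int] = first_leaf
    while blk is not None:
        order.append(blk)
        blk = pages[blk].next_blk
    return order


def scan_physical(pages: List[Page]) -> List[int]:
    """Read every block in block-number order; return the leaves processed."""
    return [blk for blk, page in enumerate(pages) if page.kind == "leaf"]


def count_jumps(read_order: List[int]) -> int:
    """Count reads that do not land on the block right after the previous one."""
    return sum(1 for a, b in zip(read_order, read_order[1:]) if b != a + 1)


# A six-block index whose logical leaf order is 1 -> 5 -> 4 -> 2.
pages = [
    Page("inner"),
    Page("leaf", next_blk=5),
    Page("leaf", next_blk=None),
    Page("free"),
    Page("leaf", next_blk=2),
    Page("leaf", next_blk=4),
]

logical = scan_logical(pages, first_leaf=1)
physical_leaves = scan_physical(pages)
physical_reads = list(range(len(pages)))  # the physical scan reads all blocks

print(logical, count_jumps(logical))
print(physical_leaves, count_jumps(physical_reads))
```

Both scans process the same set of leaves; the physical scan reads two extra blocks (the inner page and the free page) but never leaves sequential order, which is the trade-off described in the message above.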
On Sun, 2004-04-25 at 22:34, Manfred Koizar wrote:
> On -performance we have been discussing a configuration where a bulk
> delete run takes almost a day (and this is not due to crappy hardware or
> apparent misconfiguration).  Unless I misinterpreted the numbers,
> btbulkdelete() processes 85 index pages per second, while lazy vacuum is
> able to clean up 620 heap pages per second.
>
> Is there a special reason for scanning the leaf pages in *logical*
> order, i.e. by following the opaque->btpo_next links?  Now that FSM
> covers free btree index pages this access pattern might be highly
> nonsequential.

I had considered implementing a mode where the index doesn't keep trying
to reuse space that was freed by earlier deletes.  For many situations
where you are processing bulk inserts and bulk deletes, reusing space
via the FSM ends up weaving the logical sequence into a very unsorted
physical sequence.

i.e. my thinking was about a way to keep logical looking more like
physical, in certain situations.

Best Regards, Simon Riggs
On Mon, Apr 26, 2004 at 02:29:58PM +0100, Simon Riggs wrote:
> On Sun, 2004-04-25 at 22:34, Manfred Koizar wrote:
> > Is there a special reason for scanning the leaf pages in *logical*
> > order, i.e. by following the opaque->btpo_next links?  Now that FSM
> > covers free btree index pages this access pattern might be highly
> > nonsequential.
>
> I had considered implementing a mode where the index doesn't keep trying
> to reuse space that was freed by earlier deletes.  For many situations
> where you are processing bulk inserts and bulk deletes, reusing space
> via the FSM ends up weaving the logical sequence into a very unsorted
> physical sequence.
>
> i.e. my thinking was about a way to keep logical looking more like
> physical, in certain situations.

See this:

@inproceedings{DBLP:conf/sigmod/ZouS96,
  author    = {Chendong Zou and Betty Salzberg},
  editor    = {H. V. Jagadish and Inderpal Singh Mumick},
  title     = {On-line Reorganization of Sparsely-populated B+trees},
  booktitle = {Proceedings of the 1996 ACM SIGMOD International
               Conference on Management of Data, Montreal, Quebec,
               Canada, June 4-6, 1996},
  publisher = {ACM Press},
  year      = {1996},
  pages     = {115-124},
  bibsource = {DBLP, \url{http://dblp.uni-trier.de}}
}

Maybe it can be useful.  When I tried to implement it, there was no
free-pages code, so first I had to do that (Tom Lane beat me to it
though).  Then I had to choose a different project.  Maybe now it can
be done.

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
One man's impedance mismatch is another man's layer of abstraction.
(Lincoln Yeoh)
On Mon, 26 Apr 2004 14:29:58 +0100, Simon Riggs <simon@2ndquadrant.com>
wrote:
>> Now that FSM
>> covers free btree index pages this access pattern might be highly
>> nonsequential.
>
>I had considered implementing a mode where the index doesn't keep trying
>to reuse space that was freed by earlier deletes.

Or maybe an FSM function a la "Give me a free page near this one"?

Servus
 Manfred
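As a rough illustration of what a "give me a free page near this one" lookup could look like, here is a minimal sketch. It assumes the free block numbers are available as a sorted list; the function name `nearest_free_page` and that representation are hypothetical and do not correspond to PostgreSQL's actual FSM interface.

```python
from bisect import bisect_left
from typing import List, Optional


def nearest_free_page(free_blocks: List[int], target: int) -> Optional[int]:
    """Return the free block number closest to `target`, or None if none exist.

    `free_blocks` must be sorted in ascending order (an assumption of this
    sketch). Ties are broken in favour of the lower block number.
    """
    if not free_blocks:
        return None
    i = bisect_left(free_blocks, target)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = free_blocks[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda blk: (abs(blk - target), blk))


free_blocks = [3, 10, 42, 57]
print(nearest_free_page(free_blocks, 40))
print(nearest_free_page(free_blocks, 6))
```

The binary search keeps the lookup logarithmic in the number of free pages. Whether such a lookup is cheap in a real free space map depends on how that map is organised, which this sketch does not model.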
On Mon, 2004-04-26 at 17:24, Manfred Koizar wrote:
> On Mon, 26 Apr 2004 14:29:58 +0100, Simon Riggs <simon@2ndquadrant.com>
> wrote:
> >> Now that FSM
> >> covers free btree index pages this access pattern might be highly
> >> nonsequential.
> >
> >I had considered implementing a mode where the index doesn't keep trying
> >to reuse space that was freed by earlier deletes.
>
> Or maybe an FSM function a la "Give me a free page near this one"?
>

I think your statement of the requirement is better, but I suspect it is
more complex to implement.

Overall, my feeling about the index code is:
- it's based upon the earlier Lehman-Yao coding and we know better than
  that now (various literature)
- the b-tree code is written with the assumption that the
  inserts/deletes are more or less randomly distributed and balanced,
  as is the case with TPC-B
- I would prefer a mode optimised for the case of large table inserts
  (the HISTORY table in TPC-B, or many of the tables in TPC-H), so that
  inserts on the leading edge of the index go faster and bulk deletes go
  faster, but we take the chance that space is not reclaimed effectively
  by random deletes.

Best Regards, Simon Riggs
Manfred Koizar <mkoi-pg@aon.at> writes:
> Is there a special reason for scanning the leaf pages in *logical*
> order, i.e. by following the opaque->btpo_next links?

Yes.  Read the README file concerning interlocking between indexscans
and deletions.

			regards, tom lane
On Wed, 28 Apr 2004 00:08:48 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Is there a special reason for scanning the leaf pages in *logical*
>> order, i.e. by following the opaque->btpo_next links?
>
>Yes. [..] interlocking between indexscans and deletions.

Thanks for refreshing my memory.  This was discussed two years ago, and
I even participated in that discussion :-(

Servus
 Manfred