Re: Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2) - Mailing list pgsql-hackers

From Andrew Gierth
Subject Re: Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2)
Date
Msg-id 87a8ghom9k.fsf@news-spur.riddles.org.uk
Whole thread Raw
In response to Re: Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2)  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2)  (Greg Stark <stark@mit.edu>)
Re: Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2)  (Andrew Borodin <borodin@octonica.com>)
List pgsql-hackers
>>>>> "Jeff" == Jeff Janes <jeff.janes@gmail.com> writes:
Jeff> But shouldn't that still leave us with a 75% full index, ratherJeff> than slightly over 50% full?

Average is usually about 67%-70%. (For capacity estimation I always
assume 66% for a non-sequentially-filled btree.)
Jeff> The leaf pages start at 50%, grow to 100%, then split back toJeff> 50%, then grow back to 100%.  So the average
shouldbe about 75%.
 

No, because as the pages split, they fill more slowly (because there are
now more pages). So on average in a large randomly filled index, pages
spend more time nearer 50% full than 100% full. This is easy to
demonstrate by creating a table with an indexed float8 column and adding
batches of random() values to it, checking with pgstatindex at intervals -
the average leaf density will rarely exceed 70%.

However, worst case conditions can give lower leaf densities; obviously
the worst case is if the data is loaded in an order and quantity that
just happens to leave every leaf page recently split.

-- 
Andrew (irc:RhodiumToad)



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Surprising behaviour of \set AUTOCOMMIT ON
Next
From: Robert Haas
Date:
Subject: Re: Parallel tuplesort, partitioning, merging, and the future