Thread: Performance of PostgreSQL B+-tree algorithm
I spent some time last week staring at the code for the PostgreSQL
B+-tree implementation. What I hoped to find, and was not immediately
able to determine, was the Knuth order for the PostgreSQL B+-tree
implementation. It is entirely possible that I simply got lost in the
wrong C file.

My goal is to make an informed assertion about the performance of a
PostgreSQL B+-tree index as the quantity of records in our database
grows more or less unbounded.

To use a common reference, wikipedia states the following:

    Bayer & McCreight (1972), Comer (1979), and others define the
    order of B-tree as the minimum number of keys in a non-root node.
    Folk & Zoellick (1992) points out that terminology is ambiguous
    because the maximum number of keys is not clear. An order 3 B-tree
    might hold a maximum of 6 keys or a maximum of 7 keys. (Knuth
    1998, p. 483) avoids the problem by defining the order to be
    maximum number of children (which is one more than the maximum
    number of keys).

    http://en.wikipedia.org/wiki/B-tree

I would be happy to refer to an academic publication if it contains a
clear analysis of the PostgreSQL B+-tree implementation.

Thanks much,

--Kyle
Kyle Lanclos <lanclos@ucolick.org> writes:
> I spent some time last week staring at the code for the PostgreSQL
> B+-tree implementation. What I hoped to find, and was not immediately
> able to determine, was the Knuth order for the PostgreSQL B+-tree
> implementation. It is entirely possible that I simply got lost in the
> wrong C file.

> My goal is to make an informed assertion about the performance of
> a PostgreSQL B+-tree index as the quantity of records in our database
> grows more or less unbounded.

> To use a common reference, wikipedia states the following:

>     Bayer & McCreight (1972), Comer (1979), and
>     others define the order of B-tree as the
>     minimum number of keys in a non-root node.
>     Folk & Zoellick (1992) points out that terminology
>     is ambiguous because the maximum number of keys
>     is not clear. An order 3 B-tree might hold a
>     maximum of 6 keys or a maximum of 7 keys.
>     (Knuth 1998, p. 483) avoids the problem by defining
>     the order to be maximum number of children (which
>     is one more than the maximum number of keys).

Well, that would depend on the data type being indexed, which you did
not specify; and if it's a variable-length type then it's really hard
to give a concrete answer. For integer or int8 keys the answer is
typically about 400, though, depending on whether you're talking about
a 32- or 64-bit platform.

Basically it's 4 bytes for line pointer, plus 8 bytes for index tuple
header, plus maxalign'ed size of the index key, divided into page size
(less a couple dozen bytes for page header).

You could increase the result by building with a page size of more
than the default 8K, though I've seen no recent experiments suggesting
that doing so is likely to be a win.

			regards, tom lane
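The per-page arithmetic above can be sketched as a small back-of-the-envelope calculation. This is only an estimate, assuming an 8K page, MAXALIGN of 8 (a typical 64-bit build), and a flat 24 bytes for the page header ("a couple dozen"), not the exact PostgreSQL header layout:

```python
# Rough sketch of the fanout estimate described above.
# Assumed constants, not exact PostgreSQL values:
PAGE_SIZE = 8192      # default 8K page
PAGE_HEADER = 24      # "a couple dozen bytes" of page overhead
LINE_POINTER = 4      # bytes per line pointer
TUPLE_HEADER = 8      # bytes per index tuple header
MAXALIGN = 8          # typical alignment on a 64-bit platform

def maxalign(n, align=MAXALIGN):
    """Round n up to the next multiple of the platform alignment."""
    return (n + align - 1) // align * align

def entries_per_page(key_size):
    """Estimated index entries per page for a fixed-size key."""
    per_entry = LINE_POINTER + TUPLE_HEADER + maxalign(key_size)
    return (PAGE_SIZE - PAGE_HEADER) // per_entry

print(entries_per_page(4))   # 4-byte integer key, maxaligned to 8
print(entries_per_page(8))   # int8 key: same result after alignment
```

Both calls land around 400 entries per page, matching the figure quoted above; note that after maxalign, int4 and int8 keys cost the same.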
Tom Lane wrote:
> Well, that would depend on the data type being indexed, which you did
> not specify; and if it's a variable-length type then it's really hard
> to give a concrete answer.

Thanks for the quick reply; I did not appreciate that the Knuth order
would vary according to the data being indexed.

In my specific case, I have an index on (text, double). There are
individual indexes on (text) and (double) that are of some interest,
but the main interest is the two-column index.

The text column in question typically does not contain values longer
than ten characters.

> Basically it's 4 bytes for line pointer, plus 8 bytes for index tuple
> header, plus maxalign'ed size of the index key, divided into page size
> (less a couple dozen bytes for page header).

So, it is the size of the index key that varies depending on the
column type?

> You could increase the result by building with a page size of more
> than the default 8K, though I've seen no recent experiments suggesting
> that doing so is likely to be a win.

I'm thinking it would have to be a very large increase in page size
for it to have an impact. I'm guessing you would also pay a fixed cost
(log (Knuth order)) to traverse a leaf node once you get there. One
can probably produce graphs that show how many records one needs in a
database table before the page size increase starts to make sense.

--Kyle
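The intuition about page size and record count can be made concrete by comparing tree depth (root-to-leaf page reads) at different page sizes. This is a hypothetical illustration, assuming a fixed 20-byte entry (an integer key on a 64-bit build) and the same rough 24-byte page header as before:

```python
import math

# Assumed constants for illustration only:
ENTRY_SIZE = 20    # 4 line pointer + 8 tuple header + 8 maxaligned key
PAGE_HEADER = 24   # rough page overhead

def fanout(page_size):
    """Estimated entries (children) per page at a given page size."""
    return (page_size - PAGE_HEADER) // ENTRY_SIZE

def tree_depth(n_records, page_size):
    """Levels needed to index n_records, ignoring fill-factor slack."""
    return max(1, math.ceil(math.log(n_records, fanout(page_size))))

for n in (10**6, 10**9, 10**12):
    print(n, tree_depth(n, 8192), tree_depth(n, 32768))
```

Under these assumptions, quadrupling the page size from 8K to 32K saves roughly one tree level at each scale, which is consistent with the suspicion that only a large increase (or a very large table) makes the change pay off, since each page read then scans more entries.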
Kyle Lanclos <lanclos@ucolick.org> writes:
> Tom Lane wrote:
>> Well, that would depend on the data type being indexed, which you did
>> not specify; and if it's a variable-length type then it's really hard
>> to give a concrete answer.

> In my specific case, I have an index on (text, double). There are
> individual indexes on (text) and (double) that are of some interest,
> but the main interest is the two-column index.

> The text column in question typically does not contain values longer
> than ten characters.

>> Basically it's 4 bytes for line pointer, plus 8 bytes for index tuple
>> header, plus maxalign'ed size of the index key, divided into page size
>> (less a couple dozen bytes for page header).

> So, it is the size of the index key that varies depending on the
> column type?

Yeah. You could probably safely assume that the text column occupies
at most 16 bytes (and because of the alignment requirement for the
double, it's unlikely to be much less either). So that gives
4+8+16+8 = 36 bytes per index entry for this case, so you could expect
to fit at least 220 or so entries per index page.

BTW, I'm unsure that that's a representative number in practice. The
traditional wisdom for btree indexes on changing data is that the fill
factor averages only around 2/3rds, which would mean you'd really find
maybe 150 or so entries on a typical index page. Not sure if the model
you're planning to use accounts for index slack space separately or
not.

			regards, tom lane
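The arithmetic for the (text, double) case can be checked directly. Again a sketch, assuming the same 8K page, rough 24-byte header, and the 16-byte text estimate given above:

```python
# Verifying the per-entry and per-page figures for the (text, double) index.
# Assumed constants, not exact PostgreSQL internals:
PAGE_SIZE = 8192
PAGE_HEADER = 24           # rough page overhead

# line pointer + index tuple header + maxaligned text + double
entry = 4 + 8 + 16 + 8

full_page = (PAGE_SIZE - PAGE_HEADER) // entry   # packed-page estimate
typical = full_page * 2 // 3                     # ~2/3 fill under churn

print(entry, full_page, typical)
```

This gives 36 bytes per entry and a bit over 220 entries on a packed page, dropping to roughly 150 once the traditional 2/3 fill factor for indexes on changing data is applied, in line with the numbers quoted above.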