Thread: Red-black tree for GIN
Hi there, attached is a patch, which contains implementation of a red-black tree, a self-balanced implementation of binary tree. The main goal of this patch is to improve creation time of GIN index in the corner cases. While creation, GIN collects data in memory in binary tree until it reach some limit and then it flush tree to disk. Some data could produces unbalanced binary tree, for example, sorted data, so the tree degenerates to the list with O(N^2) processing time (see thread http://archives.postgresql.org/pgsql-performance/2009-03/msg00340.php) ), which cause very slow index creation. Tom has fixed that by limiting depth of tree (committed to 8.3 and 8.4), but we found it's not enough and propose to use red-black tree, which is very good for skewed data and has almost the same performance for unsorted data, see http://www.sai.msu.su/~megera/wiki/2009-07-27 and http://www.sai.msu.su/~megera/wiki/2009-04-03 for more information. Implementation of red-black tree has several currently unused methods, but they will be used in next patches. -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Attachment
2009/11/23 Teodor Sigaev <teodor@sigaev.ru>: > Hi there, > > attached is a patch, which contains implementation of a red-black > tree, a self-balanced implementation of binary tree. The main goal of > this patch is to improve creation time of GIN index in the corner cases. > While creation, GIN collects data in memory in binary tree until it > reach some limit and then it flush tree to disk. Some data could > produces unbalanced binary tree, for example, sorted data, so the tree > degenerates to the list with O(N^2) processing time (see thread > http://archives.postgresql.org/pgsql-performance/2009-03/msg00340.php) > ), which cause very slow index creation. Tom has fixed that by limiting > depth of tree (committed to 8.3 and 8.4), but we found it's not enough > and propose to use red-black tree, which is very good for skewed data and > has almost the same performance for unsorted data, see > http://www.sai.msu.su/~megera/wiki/2009-07-27 and > http://www.sai.msu.su/~megera/wiki/2009-04-03 for more information. > > Implementation of red-black tree has several currently unused methods, > but they will be used in next patches. I did a quick read-through of this, and one question that immediately occurred to me is that rbtree.c says that it is adopted from http://algolist.manual.ru/ds/rbtree.php. But I'm not sure what license that code is under, so I'm not sure whether it's OK for us to use it. My other question is as related to performance. Can you provide a test case that shows the performance improvement with this patch? ...Robert
Robert Haas escribió: > I did a quick read-through of this, and one question that immediately > occurred to me is that rbtree.c says that it is adopted from > http://algolist.manual.ru/ds/rbtree.php. But I'm not sure what > license that code is under, so I'm not sure whether it's OK for us to > use it. This code comes from Thomas Niemann's "Sorting and Searching Algorithms: A Cookbook", http://www.cs.auckland.ac.nz/software/AlgAnim/niemann/s_man.htm specifically http://www.cs.auckland.ac.nz/software/AlgAnim/niemann/s_rbt.htm The code is in the public domain; that web page says"Source code, when part of a software project, may be usedfreely withoutreference to the author." -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Mon, Jan 4, 2010 at 8:12 PM, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Robert Haas escribió: >> I did a quick read-through of this, and one question that immediately >> occurred to me is that rbtree.c says that it is adopted from >> http://algolist.manual.ru/ds/rbtree.php. But I'm not sure what >> license that code is under, so I'm not sure whether it's OK for us to >> use it. > > This code comes from Thomas Niemann's "Sorting and Searching Algorithms: > A Cookbook", > http://www.cs.auckland.ac.nz/software/AlgAnim/niemann/s_man.htm > > specifically > http://www.cs.auckland.ac.nz/software/AlgAnim/niemann/s_rbt.htm > > The code is in the public domain; that web page says > "Source code, when part of a software project, may be used > freely without reference to the author." That is excellent. Perhaps we should document that information in the code comments where the URL is currently mentioned. ...Robert
On Thu, Dec 31, 2009 at 4:19 PM, Robert Haas <robertmhaas@gmail.com> wrote: > My other question is as related to performance. Can you provide a > test case that shows the performance improvement with this patch? So, we still don't have a test case for this patch. During the November CommitFest, Greg Smith griped a bit about the lack of a reproducible performance benchmark for the XLogInsert patch: http://archives.postgresql.org/pgsql-hackers/2009-12/msg00816.php ...and I would say the same logic applies to this patch, maybe even moreso. Tom has already applied a partial workaround for this problem, and I'm feeling like it won't be trivial to figure out what to measure to see the remaining issue and measure how much this new implementation helps. The coding pattern that this patch uses also merits some discussion. Basically, rbtree.c is a generic implementation of red-black trees - from a textbook - which ginbulk.c then uses for GIN. One possible advantage of this implementation is that it might make it possible for us to use the rbtree.c logic in other places, if we have other data structures that need similar treatment. But I'm not sure if that's the way we want to go. The other alternative is to drop the generalized implementation and incorporate the logic directly into ginbulk.c. I really don't know which is better, but I'd like to hear some other opinions... ...Robert
On Mon, Jan 11, 2010 at 2:42 AM, Robert Haas <robertmhaas@gmail.com> wrote: > The coding pattern that this patch uses also merits some discussion. > Basically, rbtree.c is a generic implementation of red-black trees - > from a textbook - which ginbulk.c then uses for GIN. One possible > advantage of this implementation is that it might make it possible for > us to use the rbtree.c logic in other places, if we have other data > structures that need similar treatment. But I'm not sure if that's > the way we want to go. The other alternative is to drop the > generalized implementation and incorporate the logic directly into > ginbulk.c. I really don't know which is better, but I'd like to hear > some other opinions... > So, code reuse is not the only advantage of abstraction. It's also just plain easier to understand and test code written with clear abstract interfaces. The way you describe it someone with no knowledge could look at rbtree.c and see if it's done correctly and maybe improve it. And someone reading ginbulk only has to understand the operations red-black trees support and no understand how they're implemented to follow the code. Is there any advantage of integrating the code with ginbulk.c? Does it let us get away with finer grained locks or do any tricks doing gin-specific changes while we're in the middle of rebalancing or anything like that? -- greg
Robert, we have benchmark for rbtree http://www.sai.msu.su/~megera/wiki/2009-07-27 rbtree, actually, fix corner cases with ordered input, with little overhead. As you may see from knngist patch, rbtree used in gist code, so, please, leave rbtree code as is. Oleg On Sun, 10 Jan 2010, Robert Haas wrote: > On Thu, Dec 31, 2009 at 4:19 PM, Robert Haas <robertmhaas@gmail.com> wrote: > > My other question is as related to performance. =A0Can you provide a > > test case that shows the performance improvement with this patch? > > So, we still don't have a test case for this patch. During the > November CommitFest, Greg Smith griped a bit about the lack of a > reproducible performance benchmark for the XLogInsert patch: > > http://archives.postgresql.org/pgsql-hackers/2009-12/msg00816.php > > ...and I would say the same logic applies to this patch, maybe even > moreso. Tom has already applied a partial workaround for this > problem, and I'm feeling like it won't be trivial to figure out what > to measure to see the remaining issue and measure how much this new > implementation helps. > > The coding pattern that this patch uses also merits some discussion. > Basically, rbtree.c is a generic implementation of red-black trees - > from a textbook - which ginbulk.c then uses for GIN. One possible > advantage of this implementation is that it might make it possible for > us to use the rbtree.c logic in other places, if we have other data > structures that need similar treatment. But I'm not sure if that's > the way we want to go. The other alternative is to drop the > generalized implementation and incorporate the logic directly into > ginbulk.c. I really don't know which is better, but I'd like to hear > some other opinions... > > ...Robert > > --=20 > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > Regards, Oleg _____________________________________________________________ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
> ...and I would say the same logic applies to this patch, maybe even > moreso. Tom has already applied a partial workaround for this > problem, and I'm feeling like it won't be trivial to figure out what That partial workaround is not work for some corner cases: http://www.sai.msu.su/~megera/wiki/2009-07-27 (8.4 already has that hask!) > The coding pattern that this patch uses also merits some discussion. > Basically, rbtree.c is a generic implementation of red-black trees - > from a textbook - which ginbulk.c then uses for GIN. One possible > advantage of this implementation is that it might make it possible for > us to use the rbtree.c logic in other places, if we have other data > structures that need similar treatment. But I'm not sure if that's > the way we want to go. The other alternative is to drop the > generalized implementation and incorporate the logic directly into > ginbulk.c. I really don't know which is better, but I'd like to hear > some other opinions... knngist uses that implementation of rb-tree. One more candidate is a ts_stat which now uses unbalanced binary tree. -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
2010/1/11 Teodor Sigaev <teodor@sigaev.ru>: > knngist uses that implementation of rb-tree. One more candidate is a ts_stat > which now uses unbalanced binary tree. Ah, OK. That's great if we can reuse that code in 2 or 3 places. ...Robert
On Mon, Jan 11, 2010 at 1:18 PM, Robert Haas <robertmhaas@gmail.com> wrote: > 2010/1/11 Teodor Sigaev <teodor@sigaev.ru>: >> knngist uses that implementation of rb-tree. One more candidate is a ts_stat >> which now uses unbalanced binary tree. > > Ah, OK. That's great if we can reuse that code in 2 or 3 places. Some preliminary thoughts on this patch: 1. I think rb_free_recursive is missing a pfree(). 2. We already have a definition of NIL in the PG source base. I think this one needs to be named something else. RBNIL, maybe. 3. This code could really use some more comments, and maybe some of the variable names could be better chosen, too. It's far from obvious what is going on here. I studied rbtrees in college and I still remember more or less how they work, but, boy, this is hard to follow.The names of the iterator states are truly horrible,at least IMO. 4. It would be nice if you could do a better job conforming to project indentation style. ...Robert
> 1. I think rb_free_recursive is missing a pfree(). Fixed, thank you > > 2. We already have a definition of NIL in the PG source base. I think > this one needs to be named something else. RBNIL, maybe. fixed > > 3. This code could really use some more comments, and maybe some of > the variable names could be better chosen, too. It's far from obvious > what is going on here. I studied rbtrees in college and I still > remember more or less how they work, but, boy, this is hard to follow. > The names of the iterator states are truly horrible, at least IMO. I changed names and slightly improve comments. > > 4. It would be nice if you could do a better job conforming to project > indentation style. Did pgindent. -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Attachment
2010/1/24 Teodor Sigaev <teodor@sigaev.ru>: >> 3. This code could really use some more comments, and maybe some of >> the variable names could be better chosen, too. It's far from obvious >> what is going on here. I studied rbtrees in college and I still >> remember more or less how they work, but, boy, this is hard to follow. >> The names of the iterator states are truly horrible, at least IMO. > > I changed names and slightly improve comments. Would it be OK if I went through here and hacked on the comments and sent this back to you? ...Robert
> Would it be OK if I went through here and hacked on the comments and > sent this back to you? Feel free to edit the patch. -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
2010/1/24 Teodor Sigaev <teodor@sigaev.ru>: >> Would it be OK if I went through here and hacked on the comments and >> sent this back to you? > > Feel free to edit the patch. Here's an edited version, which I've now reviewed more fully. Some more substantive review comments: 1. I'm pretty satisfied that the rbtree code is generally sane, although I wonder if we should think about putting it in access/common rather than utils/misc. I'm not sure that I have a sufficiently clearly defined notion of what each subdirectory is for to draw a definitive conclusion on this point; hopefully someone else will weigh in. 2. I'm a little concerned about the performance implications of this patch. Looking at http://www.sai.msu.su/~megera/wiki/2009-04-03, it's clear that the patch is a huge win in some cases. But I'm also surprised by how much performance is lost in some of the other cases that you tested. I suspect, on balance, that it's probably still a good idea to put this in, but I wonder if you've profiled this at all to see where the extra time is going and/or explored possible ways of squashing that overhead down a little more. 3. In ginInsertEntry(), the "else" branch of the if statement really looks like magic when you first read it. I wonder if it would make sense to pull the contents of EAAllocate() directly into this function, since there's only one call site anyhow. There are a lot of whitespace changes in the version I'm attaching compared to your last one - your pg_indent run didn't use a typedef list that included the new typedefs, so some of the spacing came out kind of wacky. I also fleshed out the comments a bit, deleted one comment that was no longer true, and changed some of the function names in rbtree.c to make them more consistent, but the changes are all pretty trivial. I still have not done any performance testing of my own on this code, and it probably needs that. If anyone else has time to step up and help with that, it would be much appreciated. It would be useful to have some plain old benchmarks as well as some profiling data as mentioned above. ...Robert