Re: new module contrib/tree for 7.2 ? - Mailing list pgsql-hackers
From | Don Baccus |
---|---|
Subject | Re: new module contrib/tree for 7.2 ? |
Date | |
Msg-id | 3C518DE6.8080905@pacifier.com Whole thread Raw |
In response to | new module contrib/tree for 7.2 ? (Oleg Bartunov <oleg@sai.msu.su>) |
List | pgsql-hackers |
Hannu Krosing wrote: > Oleg Bartunov wrote: > >>Hi, >> >>Is't late to submit new contrib module for 7.2 ? >>It's almost ready and tested, we have to write README.tree. >> >>New module creates new data types 'tree', 'treequery' for tree-like data and >>provides indexed access methods using Btree or GiST. >>Node represents as a path to the root, so record like '3.4' means >>4-th child of 3-rd child of the root node. >>Due-to bit-representation there is a limitation to the number of >>children - 64 and defined in compile time (valid values - 8,16,32,64). >> > > How hard it would be to automatically expand it so that 65th node will > make it flow over to next bitfield and the whole level will then > accommodate 64*64 = 4096 child nodes ? Hmmm...the OpenACS 4 tree implementation uses BIT VARYING for the key. The first 128 immediate children of a node have a one-byte key (leading 0 and 7 bits for the key), the next 2 billionish immediate children have four-byte keys (leading 1 and 31 bits for the key). After that you get no more immediate children but two billion plus seems sufficient! Many trees have no nodes with more than 128 immediate children, of course, so these always have keys with one byte per level. Being able to handle essentially "infinitely flat" trees without intervention is awfully nice, though. Using "between" and a simple set of functions you can get btree-indexed access to all subtrees. Getting the set of parents for a node is a bit more tricky - we use a recursive SQL function to build the set of parent keys. This is almost clean using CREATE OR REPLACE in PG 7.2 - you first define the function with a dummy body, then do the C OR R with the real body. The C OR R allows you to recursively reference the function since it was previously defined with a dummy body ... If you want to grab the subtree and spit out the rows in order (as in grabbing rows to display a folder/file style hierarchy) a simple "order by" on the key works, and there's a "treelevel()" function that can be used to drive indention. This has been well-tested on a handful of OpenACS 4/PG production websites and gives fast, indexed, tree operations. I had looked at doing something like Oleg's describing but fell upon BIT VARYING and the fact that byte-aligned/byte-multiple lengthed bit strings are handled efficiently internally (having source to your RDBMS is so nice at times). The advantage of this is that no non-standard type, operators for indexing, etc need to be added to PG. There are just some PL/pgSQL functions and triggers on the tree table (to build the keys on insert/update). When I find some time I thought I'd bundle this up and make it available separately but at the moment you have to grab the OpenACS 4 tarball. Hannu or anyone else, if you're curious shoot me some private e-mail and I'll tell you where to grab the pieces. -- Don Baccus Portland, OR http://donb.photo.net, http://birdnotes.net, http://openacs.org
pgsql-hackers by date: