Thread: Trees in SQL
I hope this isn't an overly broad topic that ends up diverging into graph theory, but I have a tree structure of identical items (analogous to a filesystem directory tree) that I need to store in Postgres. The "obvious" design is to give the table that will represent these objects a field identifying its "parent" that is a relation to the same table. However, this seems to make many common SQL queries rather difficult. What sort of strategies are best for storing tree structures in a relational database, and how would one structure SQL queries to find, say, "all of the children anywhere under this node", or to represent the condition "if this node is a child at any depth under this other node"? Are there good strategies for preventing cycles? I'd appreciate any insights anyone can give. Thanks. Greg Brauer greg@wildbrain.com
I guess this page answers many of my questions... http://www.intelligententerprise.com/001020/celko1_1.shtml Anybody have comments on working with nested sets? Greg
If you're a book buyer, the one called "Joe Celko's SQL for Smarties: Advanced SQL Programming" has a collection of tree/graph tricks. About 50 pages are devoted to these three sections: Adjacency List Model of Trees Nested Set Model of Trees in SQL Graphs in SQL Book distributors were trying to sell a preview edition of a book by David Rozenshtein, et al called "Tree & Graph Processing in SQL" -- but something fell through and the book apparently didn't come out. Of course, it's more fun to not read a book and to try and implement a few toy tree/graph things yourself. douglas Gregory Brauer wrote: > I hope this isn't an overly broad topic that ends up diverging into graph > theory, but I have a tree structure of identical items (analogous to a > filesystem directory tree) that I need to store in Postgres. The > "obvious" design is to give the table that will represent these objects > a field identifying its "parent" that is a relation to the same table. > However, this seems to make many common SQL queries rather difficult. > > What sort of strategies are best for storing tree structures in a > relational database, and how would one structure SQL queries to find, > say, "all of the children anywhere under this node", or to represent > the condition "if this node is a child at any depth under this other > node"? Are there good strategies for preventing cycles? > > I'd appreciate any insights anyone can give. > > Thanks. > > Greg Brauer > greg@wildbrain.com > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html
On Fri, May 24, 2002 at 11:34:41AM -0700, Gregory Brauer wrote: > > I guess this page answers many of my questions... > > http://www.intelligententerprise.com/001020/celko1_1.shtml > > Anybody have comments on working with nested sets? Note that Joe Celko concludes this article by saying "Although this procedure works, you might want to use a language that has arrays in it, instead of trying to stick to pure SQL." The operations required to maintain this type of structure are expensive. You might like to take a look at: http://cddb.sai.msu.su/~megera/postgres/gist/ I haven't had a chance to try this out yet, but it looks promising. As an added bonus, if you're Russian, the documentation is in Russian. Unfortunately, if you're not Russian, the documentation is in Russian. Otherwise, for the time being, I'd recomend procedural code, rather than pure SQL. -- Ron Peterson -o) 87 Taylor Street /\\ Granby, MA 01033 _\_v https://www.yellowbank.com/ ----
Check mailing list archive and try http://www.sai.msu.su/~megera/postgres/gist/tree/ You'd need tree.tar.gz and test data to play with (dmoz catalog) Unfortunately there is no doc in english :-( Our approach is extremely fast for searching. Oleg On Fri, 24 May 2002, Gregory Brauer wrote: > > I hope this isn't an overly broad topic that ends up diverging into graph > theory, but I have a tree structure of identical items (analogous to a > filesystem directory tree) that I need to store in Postgres. The > "obvious" design is to give the table that will represent these objects > a field identifying its "parent" that is a relation to the same table. > However, this seems to make many common SQL queries rather difficult. > > What sort of strategies are best for storing tree structures in a > relational database, and how would one structure SQL queries to find, > say, "all of the children anywhere under this node", or to represent > the condition "if this node is a child at any depth under this other > node"? Are there good strategies for preventing cycles? > > I'd appreciate any insights anyone can give. > > Thanks. > > Greg Brauer > greg@wildbrain.com > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html > Regards, Oleg _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
Try here maybe: http://cddb.sai.msu.su/~megera/postgres/gist/ Look at their new 'tree' module... Chris ----- Original Message ----- From: "Gregory Brauer" <greg@wildbrain.com> To: <pgsql-sql@postgresql.org> Sent: Friday, May 24, 2002 11:08 AM Subject: [SQL] Trees in SQL > > I hope this isn't an overly broad topic that ends up diverging into graph > theory, but I have a tree structure of identical items (analogous to a > filesystem directory tree) that I need to store in Postgres. The > "obvious" design is to give the table that will represent these objects > a field identifying its "parent" that is a relation to the same table. > However, this seems to make many common SQL queries rather difficult. > > What sort of strategies are best for storing tree structures in a > relational database, and how would one structure SQL queries to find, > say, "all of the children anywhere under this node", or to represent > the condition "if this node is a child at any depth under this other > node"? Are there good strategies for preventing cycles? > > I'd appreciate any insights anyone can give. > > Thanks. > > Greg Brauer > greg@wildbrain.com > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html >
On Fri, 24 May 2002, Gregory Brauer wrote: I have exactly the same situation. Some work (argued to have better performance than Celko's implementation) has been done by Miguel Sofer. (see http://www.utdt.edu/~mig/trees.tar.gz) One fast solution would be using pgsql arrays and the contrib/array package. However Oleg's and Theodor's contrib/tree solution looks the most attractive. I will ask my wife today to do a translation of the doc, (hoping russian and yugoslavian are alike :) > > I hope this isn't an overly broad topic that ends up diverging into graph > theory, but I have a tree structure of identical items (analogous to a > filesystem directory tree) that I need to store in Postgres. The > "obvious" design is to give the table that will represent these objects > a field identifying its "parent" that is a relation to the same table. > However, this seems to make many common SQL queries rather difficult. > > What sort of strategies are best for storing tree structures in a > relational database, and how would one structure SQL queries to find, > say, "all of the children anywhere under this node", or to represent > the condition "if this node is a child at any depth under this other > node"? Are there good strategies for preventing cycles? > > I'd appreciate any insights anyone can give. > > Thanks. > > Greg Brauer > greg@wildbrain.com > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html > -- Achilleus Mantzios S/W Engineer IT dept Dynacom Tankers Mngmt tel: +30-10-8981112 fax: +30-10-8981877 email: achill@matrix.gatewaynet.com mantzios@softlab.ece.ntua.gr
Hey Please post the translation back here !! ;-) lot many others like would require it despo. regds mallah. On Monday 27 May 2002 05:30 pm, Achilleus Mantzios wrote: > On Fri, 24 May 2002, Gregory Brauer wrote: > > I have exactly the same situation. > > Some work (argued to have better performance than Celko's implementation) > has been done by Miguel Sofer. (see http://www.utdt.edu/~mig/trees.tar.gz) > > One fast solution would be using pgsql arrays and the contrib/array > package. > > However Oleg's and Theodor's contrib/tree solution looks the most > attractive. > > I will ask my wife today to do a translation of the doc, > (hoping russian and yugoslavian are alike :) > > > I hope this isn't an overly broad topic that ends up diverging into graph > > theory, but I have a tree structure of identical items (analogous to a > > filesystem directory tree) that I need to store in Postgres. The > > "obvious" design is to give the table that will represent these objects > > a field identifying its "parent" that is a relation to the same table. > > However, this seems to make many common SQL queries rather difficult. > > > > What sort of strategies are best for storing tree structures in a > > relational database, and how would one structure SQL queries to find, > > say, "all of the children anywhere under this node", or to represent > > the condition "if this node is a child at any depth under this other > > node"? Are there good strategies for preventing cycles? > > > > I'd appreciate any insights anyone can give. > > > > Thanks. > > > > Greg Brauer > > greg@wildbrain.com > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 5: Have you checked our extensive FAQ? > > > > http://www.postgresql.org/users-lounge/docs/faq.html -- Rajesh Kumar Mallah, Project Manager (Development) Infocom Network Limited, New Delhi phone: +91(11)6152172 (221) (L) ,9811255597 (M) Visit http://www.trade-india.com , India's Leading B2B eMarketplace.
> Some work (argued to have better performance than Celko's implementation) > has been done by Miguel Sofer. (see http://www.utdt.edu/~mig/trees.tar.gz) > > One fast solution would be using pgsql arrays and the contrib/array > package. > > However Oleg's and Theodor's contrib/tree solution looks the most > attractive. > > I will ask my wife today to do a translation of the doc, > (hoping russian and yugoslavian are alike :) Well if she can - please post the docs back to Oleg & Teodor and the list here! Thanks, Chris
Greg, > What sort of strategies are best for storing tree structures in a > relational database, and how would one structure SQL queries to find, > say, "all of the children anywhere under this node", or to represent > the condition "if this node is a child at any depth under this other > node"? Are there good strategies for preventing cycles? Joe Celko covers this extensively in his book "SQL for Smarties." You should definitely buy it and read the two chapters on tree structures before going any further. I will be writing an article for Techdocs within the month on implementing Celko's "Linear Nested Sets Model" of trees in PostgreSQL. If you can't wait, buy the book, e-mail me, and I'll send you my notes. -- -Josh Berkus
I have been playing with the tree package, and so far it seems fast. But i have some questions: How can we find "maximum" direct child of a node? e.g. The table has as: dynatree=# SELECT node from achtree ;node -------121.11.1.11.2 (5 rows) dynatree=# Now we find the children of '1' of level 2: dynatree=# SELECT node from achtree where node <* '1.*.0';node ------1.11.2 (2 rows) dynatree=# Now if we want to insert another level 2 child of '1', how can we get the current maximum value of level 2 children of node '1'?? (in this case 2), so that we can insert '1.3' to the table??? Also, i want to ask if somebody has extended postgresql's jdbc implemantation to include support for bitrees,entrees. One maybe slow solution for both of the above would be to be able to cast e.g. entree to text, so that a) taking the lexicographically maximum bitree value would give the maximum current child, and b) java intergration would be trivial through java.lang.String. Thanx
On Mon, 27 May 2002, Christopher Kings-Lynne wrote: > > Some work (argued to have better performance than Celko's implementation) > > has been done by Miguel Sofer. (see http://www.utdt.edu/~mig/trees.tar.gz) > > > > One fast solution would be using pgsql arrays and the contrib/array > > package. > > > > However Oleg's and Theodor's contrib/tree solution looks the most > > attractive. > > > > I will ask my wife today to do a translation of the doc, > > (hoping russian and yugoslavian are alike :) > > Well if she can - please post the docs back to Oleg & Teodor and the list > here! The problem is that phrases like "the mask of the tree" dont make much sense to her. We've bought a greek-russian dictionary, and in 1-2 weeks i think i will have it. > > Thanks, > > Chris > > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html >
On Wed, 29 May 2002 achill@matrix.gatewaynet.com wrote: > > I have been playing with the tree package, and so far > it seems fast. > > But i have some questions: > > How can we find "maximum" direct child of a node? > > e.g. > The table has as: > > dynatree=# SELECT node from achtree ; > node > ------- > 1 > 2 > 1.1 > 1.1.1 > 1.2 > (5 rows) > > dynatree=# > > Now we find the children of '1' of level 2: > > dynatree=# SELECT node from achtree where node <* '1.*.0'; > node > ------ > 1.1 > 1.2 > (2 rows) > > dynatree=# > > Now if we want to insert another level 2 child of '1', > how can we get the current maximum value of level 2 children of node '1'?? > (in this case 2), so that we can insert '1.3' to the table??? Just use functions entree_next(entree), bitree_next(bitree) -return next free node. > > Also, i want to ask if somebody has extended postgresql's jdbc > implemantation to include support for bitrees,entrees. > > One maybe slow solution for both of the above would be to > be able to cast e.g. entree to text, so that > > a) taking the lexicographically maximum bitree value would give the > maximum current child, and > b) java intergration would be trivial through java.lang.String. > > Thanx > > > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly > Regards, Oleg _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
On Wed, 29 May 2002, Oleg Bartunov wrote: > > How can we find "maximum" direct child of a node? > > > > dynatree=# > > > > Now we find the children of '1' of level 2: > > > > dynatree=# SELECT node from achtree where node <* '1.*.0'; > > node > > ------ > > 1.1 > > 1.2 > > (2 rows) > > > > dynatree=# > > > > Now if we want to insert another level 2 child of '1', > > how can we get the current maximum value of level 2 children of node '1'?? > > (in this case 2), so that we can insert '1.3' to the table??? > > > Just use functions entree_next(entree), bitree_next(bitree) - > return next free node. Ok, entree_next(1.2) is fine, but how can we know this particular 1.2 ^ ?? Think of a sequence for example that increments itself. Now, if we have a unique index on a entree column, how can we guarantie the next one will be the previous max child + 1?? > > > > > > > > Also, i want to ask if somebody has extended postgresql's jdbc > > implemantation to include support for bitrees,entrees. > > > > One maybe slow solution for both of the above would be to > > be able to cast e.g. entree to text, so that > > > > a) taking the lexicographically maximum bitree value would give the > > maximum current child, and > > b) java intergration would be trivial through java.lang.String. > > > > Thanx > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 3: if posting/reading through Usenet, please send an appropriate > > subscribe-nomail command to majordomo@postgresql.org so that your > > message can get through to the mailing list cleanly > > > > Regards, > Oleg > _____________________________________________________________ > Oleg Bartunov, sci.researcher, hostmaster of AstroNet, > Sternberg Astronomical Institute, Moscow University (Russia) > Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ > phone: +007(095)939-16-83, +007(095)939-23-83 > -- Achilleus Mantzios S/W Engineer IT dept Dynacom Tankers Mngmt tel: +30-10-8981112 fax: +30-10-8981877 email: achill@matrix.gatewaynet.com mantzios@softlab.ece.ntua.gr
Also Oleg, i would like to ask you for the casting problem. It seems that entree doesn't cast but only to entree!! (The same applies to bitrees). How can someone interface entree's from outside sql,pg/plsql?? (C functions returning text for example, java etc). Your help is really valuable. -- Achilleus Mantzios S/W Engineer IT dept Dynacom Tankers Mngmt tel: +30-10-8981112 fax: +30-10-8981877 email: achill@matrix.gatewaynet.com mantzios@softlab.ece.ntua.gr
On Wed, 29 May 2002, Achilleus Mantzios wrote: > On Wed, 29 May 2002, Oleg Bartunov wrote: > > > > How can we find "maximum" direct child of a node? > > > > > > dynatree=# > > > > > > Now we find the children of '1' of level 2: > > > > > > dynatree=# SELECT node from achtree where node <* '1.*.0'; > > > node > > > ------ > > > 1.1 > > > 1.2 > > > (2 rows) > > > > > > dynatree=# > > > > > > Now if we want to insert another level 2 child of '1', > > > how can we get the current maximum value of level 2 children of node '1'?? > > > (in this case 2), so that we can insert '1.3' to the table??? > > > > > > Just use functions entree_next(entree), bitree_next(bitree) - > > return next free node. > > Ok, entree_next(1.2) is fine, but how can we know this particular > 1.2 > ^ > ?? tree=# select * from tree where node <* '1.*.0';node ------1.11.2 (2 rows) tree=# select entree_next(node) from tree where node <* '1.*.0' order by node desc limit 1;entree_next -------------1.3 (1 row) > > Think of a sequence for example that increments itself. > > Now, if we have a unique index on a entree column, how can we guarantie > the next one will be the previous max child + 1?? > > > > > > > > > > > > > > > Also, i want to ask if somebody has extended postgresql's jdbc > > > implemantation to include support for bitrees,entrees. > > > > > > One maybe slow solution for both of the above would be to > > > be able to cast e.g. entree to text, so that > > > > > > a) taking the lexicographically maximum bitree value would give the > > > maximum current child, and > > > b) java intergration would be trivial through java.lang.String. > > > > > > Thanx > > > > > > > > > ---------------------------(end of broadcast)--------------------------- > > > TIP 3: if posting/reading through Usenet, please send an appropriate > > > subscribe-nomail command to majordomo@postgresql.org so that your > > > message can get through to the mailing list cleanly > > > > > > > Regards, > > Oleg > > _____________________________________________________________ > > Oleg Bartunov, sci.researcher, hostmaster of AstroNet, > > Sternberg Astronomical Institute, Moscow University (Russia) > > Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ > > phone: +007(095)939-16-83, +007(095)939-23-83 > > > > Regards, Oleg _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
Hi i found a rather strange behaviour in intarray. When i dont use indices or use the intbig version it works as expected. When i use gist__intbig_ops index, i returns something like "contained". To illustrate: (it is on the test__int table provided in the package). treetest=# DROP INDEX text_idx; DROP treetest=# select * from test__int where a= '{1}';a --- (0 rows) treetest=# CREATE INDEX text_idx on test__int using gist ( a gist__intbig_ops ); CREATE treetest=# select * from test__int where a= '{1}';a --- (0 rows) treetest=# DROP INDEX text_idx; DROP treetest=# CREATE INDEX text_idx on test__int using gist ( a gist__int_ops ); CREATE treetest=# select * from test__int where a= '{1}' limit 5; a -----------------{11,56,1}{34,1,39,16}{41,1,87,40,60}{41,64,10,1}{60,88,95,1} (5 rows) treetest=# When the gist__int_ops index is created the result is different -- Achilleus Mantzios S/W Engineer IT dept Dynacom Tankers Mngmt tel: +30-10-8981112 fax: +30-10-8981877 email: achill@matrix.gatewaynet.com mantzios@softlab.ece.ntua.gr
There is probably a bug in gist__int_ops ( 'contained' ops used instead of '=') We'll submit fix later Oleg On Fri, 31 May 2002, Achilleus Mantzios wrote: > > Hi i found a rather strange behaviour in intarray. > > When i dont use indices or use the intbig version it works as expected. > When i use gist__intbig_ops index, i returns something like "contained". > To illustrate: (it is on the test__int table provided in the package). > > treetest=# DROP INDEX text_idx; > DROP > treetest=# select * from test__int where a= '{1}'; > a > --- > (0 rows) > > treetest=# CREATE INDEX text_idx on test__int using gist ( a > gist__intbig_ops ); > CREATE > treetest=# select * from test__int where a= '{1}'; > a > --- > (0 rows) > > treetest=# DROP INDEX text_idx; > DROP > treetest=# CREATE INDEX text_idx on test__int using gist ( a gist__int_ops > ); > CREATE > treetest=# select * from test__int where a= '{1}' limit 5; > a > ----------------- > {11,56,1} > {34,1,39,16} > {41,1,87,40,60} > {41,64,10,1} > {60,88,95,1} > (5 rows) > > treetest=# > > When the gist__int_ops index is created the result is different > > Regards, Oleg _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
Warning: Unable to connect to PostgreSQL server: connectDBStart() -- connect() failed: Connection refused Is the postmaster running (with -i) at 'db.postgresql.org' and accepting connections on TCP/IP port 5432? in /usr/local/www/www/idocs/opendb.php on line 3 Unable to access database -- Achilleus Mantzios S/W Engineer IT dept Dynacom Tankers Mngmt tel: +30-10-8981112 fax: +30-10-8981877 email: achill@matrix.gatewaynet.com mantzios@softlab.ece.ntua.gr
already fixed ... thanks .. On Fri, 31 May 2002, Achilleus Mantzios wrote: > > Warning: Unable to connect to PostgreSQL server: connectDBStart() -- > connect() failed: Connection refused Is the postmaster running (with -i) > at 'db.postgresql.org' and accepting connections on TCP/IP port 5432? in > /usr/local/www/www/idocs/opendb.php on line 3 > Unable to access database > > > > -- > Achilleus Mantzios > S/W Engineer > IT dept > Dynacom Tankers Mngmt > tel: +30-10-8981112 > fax: +30-10-8981877 > email: achill@matrix.gatewaynet.com > mantzios@softlab.ece.ntua.gr > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html >