Thread: Hierarchal data
I realize this is a classic problem, but I'm a NOVICE after all. I want to represent hierarchal topics (just like dmoz.org). I've seen two ways to represent the data. Both are described at http://www.sitepoint.com/article/1105/1 And in another article by Joe Celko about using Modified Preorder Trees. I'm leaning toward using the simpler "adjacency list model" where each node (topic) in the tree just lists its parent. create table topic ( topic_id serial PRIMARY KEY, name varchar(64), parent_id int -- possible to use "REFERENCES topic" but allow NULL? ) The problem becomes then how to find the path from a given node to the root node. I'm working with perl and currently what I'm doing is a recursive call to the database. That's going to be slow if I have to look up many of those. My question is this: is there a way to get Postgresql to do this recursive query for me? My other question is how to get from a topics path to a topic node id. That is, can someone suggest a way to find the topic id if you have a path like: /top/Computers/Software/Operating_Systems/Open_Source/ Thanks, -- Bill Moseley moseley@hank.org
You have a fundamental problem if you want to go from high-level to low-level if you only store the parent_id (from low-level to high-level). [in a booming voice]: Feed your head. Good luck. Ignoring databases for a moment, if you want to go both ways, you need a doubly-linked-list. Back in the day, of course, we would try and minimize storage and use a tricky XOR scheme. In any case, throw an example on paper and see why your scheme will not work. You need a better reference than SITEPOINT for what you want to do... What they say does not apply to you. Cheers, douglas Bill Moseley wrote: >I realize this is a classic problem, but I'm a NOVICE after all. > >I want to represent hierarchal topics (just like dmoz.org). I've seen >two ways to represent the data. Both are described at > > http://www.sitepoint.com/article/1105/1 > >And in another article by Joe Celko about using Modified Preorder Trees. > >I'm leaning toward using the simpler "adjacency list model" where each >node (topic) in the tree just lists its parent. > > create table topic ( > topic_id serial PRIMARY KEY, > name varchar(64), > parent_id int -- possible to use "REFERENCES topic" but allow NULL? > ) > > >The problem becomes then how to find the path from a given node to the >root node. I'm working with perl and currently what I'm doing is a >recursive call to the database. That's going to be slow if I have to >look up many of those. >
On Fri, Jan 23, 2004 at 12:32:49AM -0600, Douglas Trainor wrote: > You have a fundamental problem if you want to go from high-level > to low-level if you only store the parent_id (from low-level to > high-level). > [in a booming voice]: Feed your head. Good luck. Well, I think you can do it, but it's really ugly. Keeping with the theme, say you have a path like: /California/History/Sixties Then something like: SELECT t0.topic_id FROM topic t0, topic t1, topic t2 WHERE t0.name LIKE 'Sixties' AND t0.parent = t1.topic_id AND t1.name LIKE 'History' AND t1.parent = t2.topic_id AND t2.name LIKE 'California' and t2.parent = 0; -- top level > In any case, throw an example on paper and see why your scheme > will not work. You need a better reference than SITEPOINT for > what you want to do... What they say does not apply to you. I'm not sure I follow. You are talking about this link, right? > > http://www.sitepoint.com/article/1105/1 It's PHP, but in the section "The Path to a Node" they show the recursive method of finding the path from a node to the root. That's 1/2 of what I need, although I'm wondering if I can get Postgresql to do the recursion for me. So, I'm not clear why you say it will not work. In Oracle someone suggested: select stuff from node start with id = $node_id connect by prior parentId = id; I had a better reference -- an article by Joe Celko linked on the bottom of that sitepoint article. But that article now requires registration. Both of those articles are recommending the preorder tree method, but I'm trying to figure out if I can use the method above, but some way that's more efficient. The tree won't change very often so I'm thinking of just doing the node to path conversions once and cache those mappings. -- Bill Moseley moseley@hank.org
I didn't receive much feedback from this post. Would psql-general be a better list to post this question? Or is there a better place to ask a general database design question? Thanks, On Thu, Jan 22, 2004 at 05:28:09PM -0800, Bill Moseley wrote: > I realize this is a classic problem, but I'm a NOVICE after all. > > I want to represent hierarchal topics (just like dmoz.org). I've seen > two ways to represent the data. Both are described at > > http://www.sitepoint.com/article/1105/1 > > And in another article by Joe Celko about using Modified Preorder Trees. > > I'm leaning toward using the simpler "adjacency list model" where each > node (topic) in the tree just lists its parent. > > create table topic ( > topic_id serial PRIMARY KEY, > name varchar(64), > parent_id int -- possible to use "REFERENCES topic" but allow NULL? > ) > > > The problem becomes then how to find the path from a given node to the > root node. I'm working with perl and currently what I'm doing is a > recursive call to the database. That's going to be slow if I have to > look up many of those. > > My question is this: is there a way to get Postgresql to do this recursive > query for me? > > > My other question is how to get from a topics path to a topic node id. That is, > can someone suggest a way to find the topic id if you have a path like: > > /top/Computers/Software/Operating_Systems/Open_Source/ > > > > > > > Thanks, > > > > > -- > Bill Moseley > moseley@hank.org > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faqs/FAQ.html > -- Bill Moseley moseley@hank.org
Bill Moseley wrote: > I didn't receive much feedback from this post. Would psql-general be a > better list to post this question? Or is there a better place to ask a > general database design question? > Try searching the mail archives for pgsql-general and maybe pgsql-sql first. This topic has been discussed in great depth more than once, and there are many expressed opinions on the "best" way to tackle it. One solution you could look at is connectby() in contrib/tablefunc. Another is contrib/ltree. And as I said, many others discussed in the archives. HTH, Joe
On Sun, Jan 25, 2004 at 08:54:06AM -0800, Joe Conway wrote: > Bill Moseley wrote: > >I didn't receive much feedback from this post. Would psql-general be a > >better list to post this question? Or is there a better place to ask a > >general database design question? > > > > Try searching the mail archives for pgsql-general and maybe pgsql-sql > first. This topic has been discussed in great depth more than once, and > there are many expressed opinions on the "best" way to tackle it. That's what I assumed, and I had tried searching the archives at: http://archives.postgresql.org/search.php without much luck. I tried things like "directory", "hierarchy", "dmoz", "recursive" so I think I'm not being creative enough in my searches. Do you by chance remember any terms from those threads that would be good for searching? > One solution you could look at is connectby() in contrib/tablefunc. > Another is contrib/ltree. And as I said, many others discussed in the > archives. Thanks, I'm looking at those now. And http://www.sai.msu.su/~megera/postgres/gist/ltree/ looks helpful. I can see it will take some time to grok. Thanks very much for your help, -- Bill Moseley moseley@hank.org
Hi Bill You can do exactly what you want using plpgsql (which is what I assume you mean by 'using posgtgres' and recursion. I use the adjacent key id to track all my projects and tasks. Here is my example of drilling from node to root: ------------- CREATE OR REPLACE FUNCTION public.fx_root_job(int4) RETURNS int4 AS ' declare x int4; returnvalue int4; begin raise notice \'looking for root of %\',$1; select into x id_parent_ from job where id_ = $1; /* The top of the tree is when the parent field is 0 or a job is its own parent*/ if x = $1 or x = 0 or x isnull then returnvalue = $1; else returnvalue = fx_root_job( x ); end if; return returnvalue; end;' LANGUAGE 'plpgsql' VOLATILE; -------- I was going to use Joe Celko's id, but I'd already started - but _next_ time. I was so inspired by the article, I bought the book it came from. So pissed off I didn't think of it my self. Hope this is similar enough to what your after to be helpful Glenn On Fri, 2004-01-23 at 12:28, Bill Moseley wrote: > I realize this is a classic problem, but I'm a NOVICE after all. > > I want to represent hierarchal topics (just like dmoz.org). I've seen > two ways to represent the data. Both are described at > > http://www.sitepoint.com/article/1105/1 > > And in another article by Joe Celko about using Modified Preorder Trees. > > I'm leaning toward using the simpler "adjacency list model" where each > node (topic) in the tree just lists its parent. > > create table topic ( > topic_id serial PRIMARY KEY, > name varchar(64), > parent_id int -- possible to use "REFERENCES topic" but allow NULL? > ) > > > The problem becomes then how to find the path from a given node to the > root node. I'm working with perl and currently what I'm doing is a > recursive call to the database. That's going to be slow if I have to > look up many of those. > > My question is this: is there a way to get Postgresql to do this recursive > query for me? > > > My other question is how to get from a topics path to a topic node id. That is, > can someone suggest a way to find the topic id if you have a path like: > > /top/Computers/Software/Operating_Systems/Open_Source/ > > > > > > > Thanks,