Thread: Processing Tables containing tree-like data

Processing Tables containing tree-like data

From

psql-novice@netzach.co.il

Date:

29 May 2007, 14:02:13

Hi,

I have a table which looks like this:

id    info    parentid
0    <God>    0
1    Adam    0
2    Cain    1
3    Abel    1
4    Seth    1
5    Enosh    4
....



I am looking for a fast and efficient way of finding ALL the descendents
of any particular node, to unlimited depth.

Is there a standard database trick for doing this efficiently ? Writing
a recursive function would be extremely inefficient for repeated
queries.

Thanks,


Netzach

Re: Processing Tables containing tree-like data

From

Richard Broersma Jr

Date:

29 May 2007, 14:39:53

--- psql-novice@netzach.co.il wrote:

>
> Hi,
>
> I have a table which looks like this:
>
> id    info    parentid
> 0    <God>    0
> 1    Adam    0
> 2    Cain    1
> 3    Abel    1
> 4    Seth    1
> 5    Enosh    4
> ....
>

This model is known as the Adjacency list tree model. there are two other with different strengths
and weaknesses. They are Path Enumeration tree model, and the nested set tree model.

Path Enumeration and nested set are better suited for this kind of query than the Adjacently list
model.  The simplest and best preforming queries are probably developed using the nested set
model, but this poor choice if you are frequently adding new people to your table.

The Path enumeration model doesn't have an insertion problem, however, this model looks very
de-normalized.  I would imagine that it queries can be slow since it relies on LIKE predicates to
find all of the descendants of a particular node.

On a side note, I believe that the PostgreSQL developers are working on adding a standard SQL
operation that will handle this for your.  Perhaps we could expect to see it in the next couple of
years.

path-enumeration
http://www.onlamp.com/pub/a/onlamp/2004/08/05/hierarchical_sql.html

nested-set
http://www.dbmsmag.com/9604d06.html

Regards,
Richard Broersma Jr.

Re: Processing Tables containing tree-like data

From

Michael Glaesemann

Date:

29 May 2007, 14:49:00

On May 29, 2007, at 11:58 , psql-novice@netzach.co.il wrote:

> I have a table which looks like this:
>
> id    info    parentid
> 0    <God>    0
> 1    Adam    0
> 2    Cain    1
> 3    Abel    1
> 4    Seth    1
> 5    Enosh    4
> ....
>
>
>
> I am looking for a fast and efficient way of finding ALL the
> descendents
> of any particular node, to unlimited depth.
>
> Is there a standard database trick for doing this efficiently ?
> Writing
> a recursive function would be extremely inefficient for repeated
> queries.

What you've got there is often called an adjacency list. Check out
the connect_by function in the tablefunc contrib module (contrib/
tablefunc). You might also want to google for SQL nested sets to find
an alternative method of doing handling trees.

Michael Glaesemann
grzm seespotcode net

Re: Processing Tables containing tree-like data

From

"Burak Seydioglu"

Date:

31 May 2007, 19:47:40

What is the frequency that this data is updated?

If the data is static (or if you can get away with running a cron job
every now and then), you can write a recursive pl/pgslq function to
get level information for each node on the tree and assign a specific
"incremental" node_id for each record. Due to the nature of the
recursive function, a node_id is assigned to the children of a
specific node instead of its siblings. You should end up with data as
illustrated below.

id      info    parent_id level node_id
1       Name1   Null    1       1
2       Name2   1       2       2
3       Name3   2       3       3
4      Name4    3       4       4
5      Name5    4       5       5
6      Name5    1       2       6
7      Name6    6       3       7
8      Name7    1       2       8

Then you can simply retrieve the children of node (N) on level (L)
with a single statement.

SELECT * FROM table WHERE node_id > N AND node_id < (SELECT node_id
FROM table WHERE level = L AND node_id > N ORDER BY node_id LIMIT 1
OFFSET 0);

Refrain from using MIN() as performance suffers unbelievably.

Hope this helps,

Burak

On 5/29/07, psql-novice@netzach.co.il <psql-novice@netzach.co.il> wrote:
>
> Hi,
>
> I have a table which looks like this:
>
> id      info    parentid
> 0       <God>   0
> 1       Adam    0
> 2       Cain    1
> 3       Abel    1
> 4       Seth    1
> 5       Enosh   4
> ....
>
>
>
> I am looking for a fast and efficient way of finding ALL the descendents
> of any particular node, to unlimited depth.
>
> Is there a standard database trick for doing this efficiently ? Writing
> a recursive function would be extremely inefficient for repeated
> queries.
>
> Thanks,
>
>
> Netzach
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>        choose an index scan if your joining column's datatypes do not
>        match
>

Re: Processing Tables containing tree-like data

From

"Burak Seydioglu"

Date:

31 May 2007, 20:02:29

Actually, you can even use this approach for frequently updated data.
Here is the pseudo-code for a pl/pgsql function that adds a new node
to the tree.

* SELECT MAX(node_id) FROM table WHERE parent_id = node_parent
* UPDATE table SET node_id=node_id+1 WHERE node_id > max_node_id
* INSERT new node with node_id = max_node_id+1 and parent_id = node_parent

Inserting new nodes into the chain would be more complex, but not impossible.

Burak







On 5/31/07, Burak Seydioglu <buraks78@gmail.com> wrote:
> What is the frequency that this data is updated?
>
> If the data is static (or if you can get away with running a cron job
> every now and then), you can write a recursive pl/pgslq function to
> get level information for each node on the tree and assign a specific
> "incremental" node_id for each record. Due to the nature of the
> recursive function, a node_id is assigned to the children of a
> specific node instead of its siblings. You should end up with data as
> illustrated below.
>
> id      info    parent_id level node_id
> 1       Name1   Null    1       1
> 2       Name2   1       2       2
> 3       Name3   2       3       3
> 4      Name4    3       4       4
> 5      Name5    4       5       5
> 6      Name5    1       2       6
> 7      Name6    6       3       7
> 8      Name7    1       2       8
>
> Then you can simply retrieve the children of node (N) on level (L)
> with a single statement.
>
> SELECT * FROM table WHERE node_id > N AND node_id < (SELECT node_id
> FROM table WHERE level = L AND node_id > N ORDER BY node_id LIMIT 1
> OFFSET 0);
>
> Refrain from using MIN() as performance suffers unbelievably.
>
> Hope this helps,
>
> Burak
>
>
> On 5/29/07, psql-novice@netzach.co.il <psql-novice@netzach.co.il> wrote:
> >
> > Hi,
> >
> > I have a table which looks like this:
> >
> > id      info    parentid
> > 0       <God>   0
> > 1       Adam    0
> > 2       Cain    1
> > 3       Abel    1
> > 4       Seth    1
> > 5       Enosh   4
> > ....
> >
> >
> >
> > I am looking for a fast and efficient way of finding ALL the descendents
> > of any particular node, to unlimited depth.
> >
> > Is there a standard database trick for doing this efficiently ? Writing
> > a recursive function would be extremely inefficient for repeated
> > queries.
> >
> > Thanks,
> >
> >
> > Netzach
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 9: In versions below 8.0, the planner will ignore your desire to
> >        choose an index scan if your joining column's datatypes do not
> >        match
> >
>