Hello,
I have a huge table (100 million rows) of relations between nodes by id in a Postgresql 11 server. Like this:
CREATE TABLE relations (
pid INTEGER NOT NULL,
cid INTEGER NOT NULL,
)
This table has parent-child relations references between nodes by id. Like:
pid -> cid
n1 -> n2
n1 -> n3
n1 -> n4
n2 -> n21
n2 -> n22
n2 -> n23
n22 -> n221
n22 -> n222
I would like to get a list of all the nodes being children (direct or indirect) of any other node.
Example. The children of:
1) n3: [] (n3 has not children)
2) n22: [n221, n222] (n22 has 2 children: n221 and n222)
3) n1: [n2, n21, n22, n23, n221, n222] (n1 has 6 children including indirect children).
this pseudo SQL:
SELECT *
FROM relations
WHERE has_parent(myId)
It can be solved with a recursive function or stored procedure. But that requires several passes. Is it possible to solve it in one pass? Perhaps using some low-level function or join or some index expression or auxiliary columns?
It is OK to create an index or similar using recursive expressions. However, the SELECT expressions should be solved in one pass because of speed.
Pablo
I could not find away to handle the branches but this is more complete.
with recursive descendants (last, children) as
(select c.c, array[null::int] from kids c where not exists (select 1 from kids p where c.c = p.p)
union all
select k.p, array[k.c] || l.children from kids k, descendants l where k.c = l.last)
select last, children from descendants where children[1] is not null order by last
last | children
------+-----------------
1 | {2,22,222,NULL}
1 | {4,NULL}
1 | {2,21,NULL}
1 | {2,23,NULL}
1 | {2,22,221,NULL}
1 | {3,NULL}
2 | {22,221,NULL}
2 | {22,222,NULL}
2 | {21,NULL}
2 | {23,NULL}
22 | {221,NULL}
22 | {222,NULL}
(12 rows)
I’ll throw in the towel now