Re: connectby questions - Mailing list pgsql-sql

From Dan Langille
Subject Re: connectby questions
Date
Msg-id 20021122075318.J3909-100000@m20.unixathome.org
Whole thread Raw
In response to connectby questions  (Dan Langille <dan@langille.org>)
List pgsql-sql
On Fri, 22 Nov 2002, Dan Langille wrote:

> Here is the actual data for the above nodes:
>
> freshports=# select * from element where id in (104503, 104504, 104505);
>    id   |     name     | parent_id | directory_file_flag | status
> --------+--------------+-----------+---------------------+--------
>  104503 | multimedia   |     77344 | D                   | A
>  104504 | Makefile     |    104503 | F                   | A
>  104505 | chapter.sgml |    104503 | F                   | A
> (3 rows)
>
> What I would like to include in the output is all of the above fields.

DOH!  Use a join stupid!

freshports=# select t.*, tt.name
freshports-# from connectby('simple_element', 'id', 'parent_id', '104503',
0, '/')
freshports-#         as t(id int, parent_id int, level int, branch text),
element tt
freshports-# where t.id = tt.id;  id   | parent_id | level |    branch     |     name
--------+-----------+-------+---------------+--------------104503 |           |     0 | 104503        |
multimedia104504|    104503 |     1 | 104503/104504 | Makefile104505 |    104503 |     1 | 104503/104505 |
chapter.sgml
(3 rows)


Ok, that works.  But I have two issues:

1 - speed:  That join takes 7 seconds:

freshports=# explain analyze
freshports-# select t.*, tt.name
freshports-# from connectby('simple_element', 'id', 'parent_id', '104503',
0, '/
')
freshports-#         as t(id int, parent_id int, level int, branch text),
elemen
t tt
freshports-# where t.id = tt.id;                                                                 QUERY
PLAN

--------------------------------------------------------------------------------
--------------------------------------------------------------Merge Join  (cost=62.33..3050.43 rows=1000 width=60)
(actual
time=7420.23..7421
.03 rows=3 loops=1)  Merge Cond: ("outer".id = "inner".id)  ->  Index Scan using element_pkey on element tt
(cost=0.00..2708.97
rows=104
649 width=16) (actual time=1.69..5933.32 rows=104505 loops=1)  ->  Sort  (cost=62.33..64.83 rows=1000 width=44)
(actual
time=10.84..10.87 ro
ws=3 loops=1)        Sort Key: t.id        ->  Function Scan on connectby t  (cost=0.00..12.50 rows=1000
width=44)(actual time=10.12..10.17 rows=3 loops=1)Total runtime: 7421.78 msec
(7 rows)

freshports=#

2 - What I really want in the output is the branch defined by the name
fields, not by the id fields (e.g. instead of 104503/104504, show
multimedia/Makefile.



For what its worth, I did populate my test database with the full pathname
field, maintained by triggers.  However, the initial population of that
data took 160 minutes... Luckily, the tiggers are there as constraints (of
a sort) rather than actually used to cascade changes.  In practice, nodes
do not get renamed in my application.

Cheers



pgsql-sql by date:

Previous
From: Thrasher
Date:
Subject: Re: Date trunc in UTC
Next
From: "praveen vejandla"
Date:
Subject: calculating interval