Thread: lost on self joins

lost on self joins

From
"Matthew Nuzum"
Date:
Sometimes recursion makes my head spin...

Imagine that I have a database that holds the structure of my
filesystem.  There is a table called files that contains every piece of
info you would ever want to know about a file, including a unique ID
called fileid.| files
========
x| fileid| filename| ...

Then, you have a table called folders which looks like:| folders
==========
x| folderid| parentid (relates to folders.folderid)| foldername

Finaly, a table to allow a many to many join called files_folders| files_folders
================
x| ffid| folderid (fk to folders.folderid)| fileid (fk to files.fileid)

Now, I'd like to create a view that shows everything in files, as well
as the complete path to the file.  However because I don't know how many
levels deep the file is nested, I'm not sure how to get that complete
path.  Here is conceptually what should come out:
| files_view
==============
x| fileid| filename| ...| full_path

Something that won't work is: 
SELECT files.*, folders.foldername, folders2.foldername 
FROM files, folders, folders folders2, files_folders ff
WHERE files.fileid = ff.fileid  AND ff.folderid = folders.folderid  AND folders.parentid;

The problem is that files that are not in a folder won't show up, and if
a folder is more than two levels deep it will only show the two highest
levels.

Can anyone suggest a way for me to get the information I need?  I'm very
content to use a simple pl/pgsql function, however I don't know how I'd
use recursion there.

Thanks,

--
Matthew Nuzum
www.bearfruit.org
cobalt@bearfruit.org





Re: lost on self joins

From
Tomasz Myrta
Date:
Matthew Nuzum wrote:

>Sometimes recursion makes my head spin...
>
>Imagine that I have a database that holds the structure of my
>filesystem.  There is a table called files that contains every piece of
>info you would ever want to know about a file, including a unique ID
>called fileid.
> | files
>========
>x| fileid
> | filename
> | ...
>
>Then, you have a table called folders which looks like:
> | folders
>==========
>x| folderid
> | parentid (relates to folders.folderid)
> | foldername
>
>Finaly, a table to allow a many to many join called files_folders
> | files_folders
>================
>x| ffid
> | folderid (fk to folders.folderid)
> | fileid (fk to files.fileid)

Strange. Do you need this table? Can one file exist in several directories?
If not, you can just add "folderid" field into table files.

>Now, I'd like to create a view that shows everything in files, as well
>as the complete path to the file.  However because I don't know how many
>levels deep the file is nested, I'm not sure how to get that complete
>path.  Here is conceptually what should come out:
>
> | files_view
>==============
>x| fileid
> | filename
> | ...
> | full_path
>
>Something that won't work is: 
>SELECT files.*, folders.foldername, folders2.foldername 
>FROM files, folders, folders folders2, files_folders ff
>WHERE files.fileid = ff.fileid 
>  AND ff.folderid = folders.folderid 
>  AND folders.parentid;
>
>The problem is that files that are not in a folder won't show up, and if
>a folder is more than two levels deep it will only show the two highest
>levels.
>
>Can anyone suggest a way for me to get the information I need?  I'm very
>content to use a simple pl/pgsql function, however I don't know how I'd
>use recursion there.

What would you say about this:

create or replace function parent_dir(varchar,integer) returns varchar as '
DECLARE curr_name    ALIAS for $1; curr_id    ALIAS for $2; par_name    varchar; par_id    integer;
begin select into par_name,par_id foldername,parentid from folders where folderid=curr_id; if not found or par_name is
nullthen   --finish   return curr_name; else   --find upper folder   return parent_dir(par_name || ''/'' ||
curr_name,par_id);end if;
 
end;
' LANGUAGE 'plpgsql';

Using:
select parent_dir('',folderid) as fullpath...;
or 
select parent_dir(filename,folderid) as fullfilename...;

Your query would look like this:
SELECT files.*,
parent_dir('',folderid) as fullfoldername
FROM files f join files_folders ff using (fileid);

Regards,
Tomasz Myrta



Re: lost on self joins

From
"Matthew Nuzum"
Date:
> -----Original Message-----
> From: Tomasz Myrta [mailto:jasiek@klaster.net]
> Sent: Wednesday, January 15, 2003 3:00 PM
> To: Matthew Nuzum
> Cc: pgsql-sql@postgresql.org
> Subject: Re: [SQL] lost on self joins
> >Finaly, a table to allow a many to many join called files_folders
> > | files_folders
> >================
> >x| ffid
> > | folderid (fk to folders.folderid)
> > | fileid (fk to files.fileid)
> 
> Strange. Do you need this table? Can one file exist in several
> directories?
> If not, you can just add "folderid" field into table files.

Good point.  No, it can't exist in multiple folders, so I guess it's
overkill to do a many to many here.  Thanks for the moment of clarity.

> What would you say about this:
> 
> create or replace function parent_dir(varchar,integer) returns varchar
as
> '
> DECLARE
>   curr_name    ALIAS for $1;
>   curr_id    ALIAS for $2;
>   par_name    varchar;
>   par_id    integer;
> begin
>   select into par_name,par_id foldername,parentid from folders where
> folderid=curr_id;
>   if not found or par_name is null then
>     --finish
>     return curr_name;
>   else
>     --find upper folder
>     return parent_dir(par_name || ''/'' || curr_name,par_id);
>   end if;
> end;
> ' LANGUAGE 'plpgsql';
> 
> Using:
> select parent_dir('',folderid) as fullpath...;
> or
> select parent_dir(filename,folderid) as fullfilename...;
> 
> Your query would look like this:
> SELECT files.*,
> parent_dir('',folderid) as fullfoldername
> FROM files f join files_folders ff using (fileid);

Well, I guess I would say Thanks!  You make it look so easy.

As I was planning this in my mind, I didn't have the equivalent of your
first parameter to parent_dir, but now I see it's necessary to get the
full path when you recurse up the folder hierarchy.

Thanks a lot for your response.

Matthew Nuzum
www.bearfruit.org
cobalt@bearfruit.org



Re: lost on self joins

From
Bruno Wolff III
Date:
On Wed, Jan 15, 2003 at 13:29:47 -0500, Matthew Nuzum <cobalt@bearfruit.org> wrote:
> 
> Can anyone suggest a way for me to get the information I need?  I'm very
> content to use a simple pl/pgsql function, however I don't know how I'd
> use recursion there.

Look at the tablefunc contrib package. I believe it provides the ability
to do what you are trying to do. You might need to have 7.3 to have it
available.


Re: lost on self joins

From
"Ross J. Reedstrom"
Date:
On Wed, Jan 15, 2003 at 03:19:38PM -0500, Matthew Nuzum wrote:
> > -----Original Message-----
> > From: Tomasz Myrta [mailto:jasiek@klaster.net]
> > Sent: Wednesday, January 15, 2003 3:00 PM
> > To: Matthew Nuzum
> > Cc: pgsql-sql@postgresql.org
> > Subject: Re: [SQL] lost on self joins
> > >Finaly, a table to allow a many to many join called files_folders
> > > | files_folders
> > >================
> > >x| ffid
> > > | folderid (fk to folders.folderid)
> > > | fileid (fk to files.fileid)
> > 
> > Strange. Do you need this table? Can one file exist in several
> > directories?
> > If not, you can just add "folderid" field into table files.
> 
> Good point.  No, it can't exist in multiple folders, so I guess it's
> overkill to do a many to many here.  Thanks for the moment of clarity.

Unless you're attempting to accurately map Unix filesystem sematics, where
the exact same file _can_ be in more than one place in the filesystem
(hard links).  It's all about the inode. One of the wierder bits of unix
that you don't often see used in common occurances.

Ross


Re: lost on self joins

From
Tomasz Myrta
Date:
Ross J. Reedstrom wrote:

>>>>Finaly, a table to allow a many to many join called files_folders
>>>>| files_folders
>>>>================
>>>>x| ffid
>>>>| folderid (fk to folders.folderid)
>>>>| fileid (fk to files.fileid)
>>>
>>>Strange. Do you need this table? Can one file exist in several
>>>directories?
>>>If not, you can just add "folderid" field into table files.
>>
>>Good point.  No, it can't exist in multiple folders, so I guess it's
>>overkill to do a many to many here.  Thanks for the moment of clarity.
>
>
>Unless you're attempting to accurately map Unix filesystem sematics, where
>the exact same file _can_ be in more than one place in the filesystem
>(hard links).  It's all about the inode. One of the wierder bits of unix
>that you don't often see used in common occurances.
>
>Ross

If we are talking about Unix filesystems - this solution doesn't 
let you change filename when using hard (symbolic) links.
Anyway I wish I could use symbolic link on windows machine 
the same like on linux one...

I think more important for Matthew would be protection against circular 
join which causes query to hang up.

Tomasz