Thread: lost on self joins
Sometimes recursion makes my head spin... Imagine that I have a database that holds the structure of my filesystem. There is a table called files that contains every piece of info you would ever want to know about a file, including a unique ID called fileid.| files ======== x| fileid| filename| ... Then, you have a table called folders which looks like:| folders ========== x| folderid| parentid (relates to folders.folderid)| foldername Finaly, a table to allow a many to many join called files_folders| files_folders ================ x| ffid| folderid (fk to folders.folderid)| fileid (fk to files.fileid) Now, I'd like to create a view that shows everything in files, as well as the complete path to the file. However because I don't know how many levels deep the file is nested, I'm not sure how to get that complete path. Here is conceptually what should come out: | files_view ============== x| fileid| filename| ...| full_path Something that won't work is: SELECT files.*, folders.foldername, folders2.foldername FROM files, folders, folders folders2, files_folders ff WHERE files.fileid = ff.fileid AND ff.folderid = folders.folderid AND folders.parentid; The problem is that files that are not in a folder won't show up, and if a folder is more than two levels deep it will only show the two highest levels. Can anyone suggest a way for me to get the information I need? I'm very content to use a simple pl/pgsql function, however I don't know how I'd use recursion there. Thanks, -- Matthew Nuzum www.bearfruit.org cobalt@bearfruit.org
Matthew Nuzum wrote: >Sometimes recursion makes my head spin... > >Imagine that I have a database that holds the structure of my >filesystem. There is a table called files that contains every piece of >info you would ever want to know about a file, including a unique ID >called fileid. > | files >======== >x| fileid > | filename > | ... > >Then, you have a table called folders which looks like: > | folders >========== >x| folderid > | parentid (relates to folders.folderid) > | foldername > >Finaly, a table to allow a many to many join called files_folders > | files_folders >================ >x| ffid > | folderid (fk to folders.folderid) > | fileid (fk to files.fileid) Strange. Do you need this table? Can one file exist in several directories? If not, you can just add "folderid" field into table files. >Now, I'd like to create a view that shows everything in files, as well >as the complete path to the file. However because I don't know how many >levels deep the file is nested, I'm not sure how to get that complete >path. Here is conceptually what should come out: > > | files_view >============== >x| fileid > | filename > | ... > | full_path > >Something that won't work is: >SELECT files.*, folders.foldername, folders2.foldername >FROM files, folders, folders folders2, files_folders ff >WHERE files.fileid = ff.fileid > AND ff.folderid = folders.folderid > AND folders.parentid; > >The problem is that files that are not in a folder won't show up, and if >a folder is more than two levels deep it will only show the two highest >levels. > >Can anyone suggest a way for me to get the information I need? I'm very >content to use a simple pl/pgsql function, however I don't know how I'd >use recursion there. What would you say about this: create or replace function parent_dir(varchar,integer) returns varchar as ' DECLARE curr_name ALIAS for $1; curr_id ALIAS for $2; par_name varchar; par_id integer; begin select into par_name,par_id foldername,parentid from folders where folderid=curr_id; if not found or par_name is nullthen --finish return curr_name; else --find upper folder return parent_dir(par_name || ''/'' || curr_name,par_id);end if; end; ' LANGUAGE 'plpgsql'; Using: select parent_dir('',folderid) as fullpath...; or select parent_dir(filename,folderid) as fullfilename...; Your query would look like this: SELECT files.*, parent_dir('',folderid) as fullfoldername FROM files f join files_folders ff using (fileid); Regards, Tomasz Myrta
> -----Original Message----- > From: Tomasz Myrta [mailto:jasiek@klaster.net] > Sent: Wednesday, January 15, 2003 3:00 PM > To: Matthew Nuzum > Cc: pgsql-sql@postgresql.org > Subject: Re: [SQL] lost on self joins > >Finaly, a table to allow a many to many join called files_folders > > | files_folders > >================ > >x| ffid > > | folderid (fk to folders.folderid) > > | fileid (fk to files.fileid) > > Strange. Do you need this table? Can one file exist in several > directories? > If not, you can just add "folderid" field into table files. Good point. No, it can't exist in multiple folders, so I guess it's overkill to do a many to many here. Thanks for the moment of clarity. > What would you say about this: > > create or replace function parent_dir(varchar,integer) returns varchar as > ' > DECLARE > curr_name ALIAS for $1; > curr_id ALIAS for $2; > par_name varchar; > par_id integer; > begin > select into par_name,par_id foldername,parentid from folders where > folderid=curr_id; > if not found or par_name is null then > --finish > return curr_name; > else > --find upper folder > return parent_dir(par_name || ''/'' || curr_name,par_id); > end if; > end; > ' LANGUAGE 'plpgsql'; > > Using: > select parent_dir('',folderid) as fullpath...; > or > select parent_dir(filename,folderid) as fullfilename...; > > Your query would look like this: > SELECT files.*, > parent_dir('',folderid) as fullfoldername > FROM files f join files_folders ff using (fileid); Well, I guess I would say Thanks! You make it look so easy. As I was planning this in my mind, I didn't have the equivalent of your first parameter to parent_dir, but now I see it's necessary to get the full path when you recurse up the folder hierarchy. Thanks a lot for your response. Matthew Nuzum www.bearfruit.org cobalt@bearfruit.org
On Wed, Jan 15, 2003 at 13:29:47 -0500, Matthew Nuzum <cobalt@bearfruit.org> wrote: > > Can anyone suggest a way for me to get the information I need? I'm very > content to use a simple pl/pgsql function, however I don't know how I'd > use recursion there. Look at the tablefunc contrib package. I believe it provides the ability to do what you are trying to do. You might need to have 7.3 to have it available.
On Wed, Jan 15, 2003 at 03:19:38PM -0500, Matthew Nuzum wrote: > > -----Original Message----- > > From: Tomasz Myrta [mailto:jasiek@klaster.net] > > Sent: Wednesday, January 15, 2003 3:00 PM > > To: Matthew Nuzum > > Cc: pgsql-sql@postgresql.org > > Subject: Re: [SQL] lost on self joins > > >Finaly, a table to allow a many to many join called files_folders > > > | files_folders > > >================ > > >x| ffid > > > | folderid (fk to folders.folderid) > > > | fileid (fk to files.fileid) > > > > Strange. Do you need this table? Can one file exist in several > > directories? > > If not, you can just add "folderid" field into table files. > > Good point. No, it can't exist in multiple folders, so I guess it's > overkill to do a many to many here. Thanks for the moment of clarity. Unless you're attempting to accurately map Unix filesystem sematics, where the exact same file _can_ be in more than one place in the filesystem (hard links). It's all about the inode. One of the wierder bits of unix that you don't often see used in common occurances. Ross
Ross J. Reedstrom wrote: >>>>Finaly, a table to allow a many to many join called files_folders >>>>| files_folders >>>>================ >>>>x| ffid >>>>| folderid (fk to folders.folderid) >>>>| fileid (fk to files.fileid) >>> >>>Strange. Do you need this table? Can one file exist in several >>>directories? >>>If not, you can just add "folderid" field into table files. >> >>Good point. No, it can't exist in multiple folders, so I guess it's >>overkill to do a many to many here. Thanks for the moment of clarity. > > >Unless you're attempting to accurately map Unix filesystem sematics, where >the exact same file _can_ be in more than one place in the filesystem >(hard links). It's all about the inode. One of the wierder bits of unix >that you don't often see used in common occurances. > >Ross If we are talking about Unix filesystems - this solution doesn't let you change filename when using hard (symbolic) links. Anyway I wish I could use symbolic link on windows machine the same like on linux one... I think more important for Matthew would be protection against circular join which causes query to hang up. Tomasz