Thread: Tree structure
Hi I'm trying to determine the best way to represent a simple tree structure (like a file/dir tree or a uri path). I guess that's done a zillion times before; I just don't seem to be able to find the right solution. I have one special request, that I'd like to find all 'shorter' paths, i.e. given 'a/b/c/d' it'll find a a/b a/b/c - but not b a/c b/a There are a number of options to test. 1. As strings There's no dedicated function (@>) WHERE clause should read something like 'a/b/c/d' LIKE column || '%', which is both ugly and (I guess) non indexable Perhaps regex indexes would work, but not efficient and not optimal 2. As array of strings My favorite, would be elegant. A GIN index on the whole array would make for fast performance Alas @> treats the arrays as a set, not an array WHERE col @> 'a/b/c/d' would find all of the above rows, including a, a/c, b/a, etc. 3. ltree contrib The only option that actually works and uses index @> works as I want it to. But the single segments can only be alphanumeric and underscore ltree only supports GIST there's a length limit. The last option is the one I'm using right now, but I hope that somebody can point out where I'm wrong with regards to the other options, or tell me that there is a better solution somewhere I didn't look.
> 1. As strings > There's no dedicated function (@>) > WHERE clause should read something like 'a/b/c/d' LIKE column || '%', > which is both ugly and (I guess) non indexable > Perhaps regex indexes would work, but not efficient and not optimal > > 2. As array of strings > My favorite, would be elegant. A GIN index on the whole array would make > for fast performance > Alas @> treats the arrays as a set, not an array > WHERE col @> 'a/b/c/d' would find all of the above rows, including a, a/c, > b/a, etc. > > 3. ltree contrib > The only option that actually works and uses index > @> works as I want it to. > But the single segments can only be alphanumeric and underscore > ltree only supports GIST > there's a length limit. 4. Using a recursive common table expression (CTE). http://www.postgresql.org/docs/9.2/static/queries-with.html -- If you can't see the forest for the trees, Cut the trees and you'll see there is no forest.
Kaare Rasmussen <kaare@jasonic.dk> writes: > Hi > > I'm trying to determine the best way to represent a simple tree > structure (like a file/dir tree or a uri path). I guess that's done a > zillion times before; I just don't seem to be able to find the right > solution. I have one special request, that I'd like to find all > shorter' paths, i.e. given 'a/b/c/d' it'll find > > a > a/b > a/b/c > - but not > b > a/c > b/a If I understand you correctly, you want a prefix match, and sure there's a PostgreSQL extension for that: CREATE EXTENSION prefix; CREATE TABLE t1 ( id serial NOT NULL, p prefix_range NOT NULL, PRIMARY KEY (id) ); CREATE INDEX pp ON t1 USING gist(p); INSERT INTO t1 (p) VALUES ('a'), ('b'), ('a/c'), ('a/b'), ('b/a'), ('a/b/c'); EXPLAIN ANALYZE SELECT id, p FROM t1 WHERE p @> 'a/b/c/d' ;
Hi Alban > 4. Using a recursive common table expression (CTE). > http://www.postgresql.org/docs/9.2/static/queries-with.html Yes, you're right. In fact that's what I'm testing a way to replace, as I'm not confident in the performance in all situations. My fault entirely; I should have told so from the start.
BE carefull you have a number of limitation with recursive cte (I'm thinking of update and so.)
You can work around with plpgsql but it might be painfull.
You forgot a solution :
if you need powerfull graph features,
use postgres as a database and a SPARQL speaking frontend.
It may be a bit of overkill ;-)
2013/9/23 Kaare Rasmussen <kaare@jasonic.dk>
Hi AlbanYes, you're right. In fact that's what I'm testing a way to replace, as I'm not confident in the performance in all situations. My fault entirely; I should have told so from the start.4. Using a recursive common table expression (CTE). http://www.postgresql.org/docs/9.2/static/queries-with.html
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Sun, Sep 22, 2013 at 9:48 PM, Kaare Rasmussen <kaare@jasonic.dk> wrote:
-- Hi AlbanYes, you're right. In fact that's what I'm testing a way to replace, as I'm not confident in the performance in all situations. My fault entirely; I should have told so from the start.4. Using a recursive common table expression (CTE). http://www.postgresql.org/docs/9.2/static/queries-with.html
It might be helpful for you to discuss what sorts of concerns you have and how they fit into the specifics of your data. Trees are an area where different uses may have different recommended solutions. I gave my thoughts on performance on trees above. There are a few really bad areas I can think of. For example, if you had a ten-layer deep scan where each scan pulled around 10% or so of the table, you might be looking at 10 sequential scans and a fair bit of CPU time. If the result set was very large, you might see things written to disk. There are a number of gotchas.
This being said, *usually* I find that recursive CTE's are one of the better solutions out there for trees and I think they will perform better in more situations than many of the other solutions.
Best Wishes,
Chris Travers
Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor lock-in.
On Fri, Sep 20, 2013 at 6:29 AM, Kaare Rasmussen <kaare@jasonic.dk> wrote: > Hi > > I'm trying to determine the best way to represent a simple tree structure > (like a file/dir tree or a uri path). I guess that's done a zillion times > before; I just don't seem to be able to find the right solution. I have one > special request, that I'd like to find all 'shorter' paths, i.e. given > 'a/b/c/d' it'll find > > a > a/b > a/b/c > - but not > b > a/c > b/a > > There are a number of options to test. > > 1. As strings > There's no dedicated function (@>) > WHERE clause should read something like 'a/b/c/d' LIKE column || '%', > which is both ugly and (I guess) non indexable > Perhaps regex indexes would work, but not efficient and not optimal > > 2. As array of strings > My favorite, would be elegant. A GIN index on the whole array would make > for fast performance > Alas @> treats the arrays as a set, not an array > WHERE col @> 'a/b/c/d' would find all of the above rows, including a, a/c, > b/a, etc. > > 3. ltree contrib > The only option that actually works and uses index > @> works as I want it to. > But the single segments can only be alphanumeric and underscore > ltree only supports GIST > there's a length limit. > > The last option is the one I'm using right now, but I hope that somebody can > point out where I'm wrong with regards to the other options, or tell me that > there is a better solution somewhere I didn't look. All materialized path approaches will face index limit so be warned. For my money, I'd be using ltree if you need complex indexed searching or some TEXT+btree for simple searching (LIKE '/foo/bar%') on the 'full path' side. maview type approaches are faster to search but slower to update than a recursive type structure where you use parent/child relationships + recursive CTE to query. merlin
Sorry, got tangled up in this thing called 'real life'. > If I understand you correctly, you want a prefix match, and sure there's > a PostgreSQL extension for that: OK, that seems to do the job, thanks a lot. The only small quibble is that it's an extension. I'm quite surprised there seem to be no way in core to treat an array as an array. Using @> treats it as a set, AFAICT.
On Thu, Oct 10, 2013 at 1:00 AM, Kaare Rasmussen <kaare@jasonic.dk> wrote: > Sorry, got tangled up in this thing called 'real life'. > > >> If I understand you correctly, you want a prefix match, and sure there's >> a PostgreSQL extension for that: > > > OK, that seems to do the job, thanks a lot. The only small quibble is that > it's an extension. > > I'm quite surprised there seem to be no way in core to treat an array as an > array. Using @> treats it as a set, AFAICT. can you elaborate on that? merlin
Hi Merlin > On Thu, Oct 10, 2013 at 1:00 AM, Kaare Rasmussen <kaare@jasonic.dk> wrote: >> I'm quite surprised there seem to be no way in core to treat an array as an >> array. Using @> treats it as a set, AFAICT. > can you elaborate on that? > > merlin To me, an array is a vector (or a vector of vectors). So I'm looking for an operator where ARRAY[1,4,3] doesn't contain ARRAY[3,1] and ARRAY[2,7] isn't contained by ARRAY[1,7,4,2,6] (but ARRAY[1,7,4] is) IOW order matters to me, but not to the array operators mentioned in http://www.postgresql.org/docs/9.3/static/functions-array.html. Note that index support is important.
Hey sorry if my answer is stupid,
but there is an extension for array, even if it is limited to int (but int could be indexes of row)
It provides essential function, although lacking some (I re-implemented union of array with disjoint result).
I think this extension uses indexes
I think this extension uses indexes
Cheers,
Rémi-C
2013/10/10 Kaare Rasmussen <kaare@jasonic.dk>
Hi MerlinOn Thu, Oct 10, 2013 at 1:00 AM, Kaare Rasmussen <kaare@jasonic.dk> wrote:I'm quite surprised there seem to be no way in core to treat an array as ancan you elaborate on that?
array. Using @> treats it as a set, AFAICT.
merlin
To me, an array is a vector (or a vector of vectors). So I'm looking for an operator where
ARRAY[1,4,3] doesn't contain ARRAY[3,1] and
ARRAY[2,7] isn't contained by ARRAY[1,7,4,2,6] (but ARRAY[1,7,4] is)
IOW order matters to me, but not to the array operators mentioned in http://www.postgresql.org/docs/9.3/static/functions-array.html. Note that index support is important.
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Hi Rémi > Hey sorry if my answer is stupid, > but there is an extension for array, even if it is limited to int (but > int could be indexes of row) > It's named http://www.postgresql.org/docs/9.3/static/intarray.html > It provides essential function, although lacking some (I > re-implemented union of array with disjoint result). > I think this extension uses indexes > Thanks for your answer. If there were a similar strarray, it would be top nice :-)