Re: ltree + gist index performance degrades significantly over a night - Mailing list pgsql-general

From CG
Subject Re: ltree + gist index performance degrades significantly over a night
Date
Msg-id 20060227182720.48012.qmail@web32502.mail.mud.yahoo.com
Whole thread Raw
In response to Re: ltree + gist index performance degrades significantly over a night  (Bernhard Weisshuhn <bkw@weisshuhn.de>)
Responses Re: ltree + gist index performance degrades significantly over a night  (Bernhard Weisshuhn <bkw@weisshuhn.de>)
List pgsql-general
That would do the job, wouldn't it? :)

I don't think it's a naive question at all. Its quite a good question, and the
solution you suggest is a good option to have, and would probably work better
than the single-vector ltree index for simple substring matching. In my case,
the ltree+gist index table actually contains more pages than the table of data
itself. I'd need to see if the space required for the varchar+btree tables are
comparible, better, or worse than the ltree+gist tables with regards to size.
Now that I think about it, building substrings out of ltree nodes would be
incredible overkill comapred to the effetiveness of the varchar+btree.

The extra advantages of ltree are the ability to match and extract nodes in a
path based not only on contents but also proximity, and aggregate on those
characteristics.

In my case this might be good for serial numbers where each digit or grouping
of digits have special values which would be used to aggregate on.

The ltree method was suggested to me a while back when I was frustrated with
the performance of "like '%something%'" ...



--- Bernhard Weisshuhn <bkw@weisshuhn.de> wrote:

> On Mon, Feb 27, 2006 at 09:14:40AM -0800, CG <cgg007@yahoo.com> wrote:
>
> > I could probably get even better performance out of the table, at the cost
> of a
> > significant increase in table and index size, by chopping up the columns
> into
> > smaller chunks.
> >
> > "Hello World" would yield
> >
> > 'h.e.l.l.o.w.o.r.l.d'
> > 'e.l.l.o.w.o.r.l.d'
> > 'l.l.o.w.o.r.l.d'
> > 'l.o.w.o.r.l.d'
> > 'o.w.o.r.l.d'
> > 'w.o.r.l.d'
> > 'o.r.l.d'
> > 'r.l.d'
> >
> > and using a wildcard search "search_vector ~ 'o.r.l.*'" would jump right to
> the
> > vectors which start with "o.r.l" ...
>
> But with this approch you'd be fine with a normal varchar_ops btree index
> for textfields and searching using "like 'world%'", wouldn't you?
> Or is the ltree approch more efficient?
>
> I'm not trying to be smart-assed, it's a naive question, since I'm
> looking for an efficient substring search solution in postgresql myself.
>
> regards,
> bkw
>
>


__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

pgsql-general by date:

Previous
From: Emi Lu
Date:
Subject: Re: A question about Vacuum analyze
Next
From: Bruno Wolff III
Date:
Subject: Re: Wish: remove ancient constructs from Postgres