Re: Trees: integer[] outperformed by ltree - Mailing list pgsql-performance

From Merlin Moncure
Subject Re: Trees: integer[] outperformed by ltree
Date
Msg-id CAHyXU0yoV4LWxvYwsY4rd6X3y2Sdk5sJdZa39HcOx97FY06Kuw@mail.gmail.com
Whole thread Raw
In response to Re: Trees: integer[] outperformed by ltree  (Jan Walter <john@commontongue.com>)
Responses Re: Trees: integer[] outperformed by ltree
List pgsql-performance
On Tue, Nov 5, 2013 at 3:52 PM, Jan Walter <john@commontongue.com> wrote:
>
> On 5.11.2013 20:51, Merlin Moncure wrote:
>
> On Tue, Nov 5, 2013 at 11:30 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>
> On Tue, Nov 5, 2013 at 6:25 AM, Jan Walter <john@commontongue.com> wrote:
>
> Hi,
>
> I am in a need of a very robust (esp. fast in read, non-blocking in update)
> tree structure storage (95% trees are of depth <4, current max. is 12). We
> have 10k-100k trees now, millions in the future.
> I made many tests, benchmarks of usual operations, and after all,
> materialized path ('1.5.3' path notation) seems most promising.
>
> materialized path approaches tend to be ideal if the tree remains
> relatively static and is not too deep.  The downside with matpath is
> that if a you have to move a node around in the tree, then all the
> child elements paths' have to be expensively updated.  I bring this up
> as it relates to your 'non-blocking in update' requirement: in matpath
> an update to parent can update an unbounded number of records.
>
>
> Thanks for your remark.
> Materialized path is still better in updates than nested sets we are using currently.
> Although adjacency lists with recursive CTEs were initially my favorite substitution (smallest lock space for node
relocation),whey are completely failing in e.g. order by path (depth) task (150s vs. 31ms via integer[]), and twice
slowerin simple descendant-based tasks. 
> I am yet considering it (if I moved e.g. ordering to application server level), and still need to rewrite couple of
moresophisticated scenarios to CTEs to be absolutely sure if it fails; anyway MP seems more promising. 
> I also tend to have the tree structure completely independent on other data belonging to nodes.
>
> Or did you have any other model in your mind?
>
>
> hm, why do you need gist/gin for the int[] formulation?  what are your
> lookup requirements?  with int[] you can typically do contains with
> simple btree.
>
>
> I do not think so, i.e. neither my tests nor http://www.postgresql.org/docs/current/static/indexes-types.html showed
itworks for <@ or @>. Using it for <= operator puts btree index to query plan in some (not all) scenarios. Still it
needsto be accompanied with <@, and the performance goes in more complex scenarios down. 
>
> 'Start with' I was mentioning, would be ideal.

This is trivial especially if you don't have to deal with null edge
case.  If you wan to search btree'd array column for all elements
starting with [1,2,3],

SELECT * FROM foo
  WHERE
    id >= array[1,2,3]
    AND id < array[1,2,4];

merlin


pgsql-performance by date:

Previous
From: Jan Walter
Date:
Subject: Re: Trees: integer[] outperformed by ltree
Next
From: Michael Paquier
Date:
Subject: Re: postgresql recommendation memory