Re: Making a tree with "millions and millions" of dynamic - Mailing list pgsql-general

From Oleg Bartunov
Subject Re: Making a tree with "millions and millions" of dynamic
Date
Msg-id Pine.GSO.4.58.0312051203090.27913@ra.sai.msu.su
Whole thread Raw
In response to Making a tree with "millions and millions" of dynamic nodes  (Christian Fowler <google@NOSPAM.gravesweeper.com>)
Responses Re: Making a tree with "millions and millions" of dynamic  (Lynn.Tilby@asu.edu)
List pgsql-general
Christian,

you may try our contrib/ltree module
see http://www.sai.msu.su/~megera/postgres/gist/ltree
With tuned postgresql and reasonable hardware you might get what you're
looking for.

    Oleg

On Tue, 2 Dec 2003, Christian Fowler wrote:

>
>
> I have a VERY LARGE pile of geographic data that I am importing into a database (db of choice is postgres,
> though may hop to oracle if necessary). The data is strictly hierarchical - each node has one, and only one
> parent. The depth should not exceed 6 or 7 levels. The initial import will have about 6 million leaves, and
> 3 million branches. I would expect the leaves to grow significantly, in number easily tripling. However, the
> branches will likely stay very constant in number, but I expect there locations to shift around somewhat
> (affecting up to thousand of children).
>
> For selection, it is crucial for me to get:
>
> 1. path generation speed
> 2. immediate sibling speed
> 3. children count speed
>
>
> I have implemented a first run using Joe Celko's Nested Sets (w/ a mod to store tree level for speed). The
> update propigation issue is the achilles heel of this approach for me. I have read Vadim Tropashko Nested
> Interval concept ( http://www.dbazine.com/tropashko4.html )  , and my brain is painfully stretched enough to
> get the general idea. I have a feeling i will hit the arithmetic issues others of reported.
>
> So it seems Materialized Path is my only option, however I am concerned about LIKE performance for the right
> hand side of the tree, where the path is 8digits x 6 levels = 48 chars. Should I be concerned? I need
> split-second real-time performance, and can't imagine it will be as fast the Nested Set arithmatic approach.
> I can flatten out the import to insure the upper tree has the smallest numbers, however, it will save at
> most 8 chars on the path.
>
> I've been googling through USENET archives watching the big debates of years gone by. I've grazed much
> knowledge, but now am curious if anyone has any thoughts/advice/war stories about a data set this large.
>
> (and if any fellow postgres fans have some reassuring words about real-time performance of a 20 million row
> tree, i'd love to hear ;-)
>
>
>

    Regards,
        Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

pgsql-general by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: Groups vs. Roles
Next
From: Thierry Missimilly
Date:
Subject: Re: What is WAL used for?