Re: redundant fields in table for "performance optimizations" - Mailing list pgsql-general

From Ondrej Ivanič
Subject Re: redundant fields in table for "performance optimizations"
Date
Msg-id CAM6mieLvwRBmL6RdE9LggOMhq94aQDQzwY87Boj5yAB0t-_2Mg@mail.gmail.com
Whole thread Raw
In response to redundant fields in table for "performance optimizations"  (Menelaos PerdikeasSemantix <mperdikeas.semantix@gmail.com>)
List pgsql-general
Hi,

On 22 August 2012 07:07, Menelaos PerdikeasSemantix
<mperdikeas.semantix@gmail.com> wrote:
> Let's say you have a father-child (or master-detail if you wish) hierarchy
> of tables of not just 2 levels, but, say, 5 levels.
> E.g. tables A, B, C, D and E organized in successive 1-to-N relationships:
>
> A ----1-to-N----->  B
> B ----1-to-N-----> C
> C ----1-to-N-----> D
> D ----1-to-N-----> E
>
> with appropriate foreign keys:
>
> * from E to D
> * from D to C
> * from C to B
> * from B to A
>
> This is normalized so far. Now assume that it is the case than in some
> queries on table E you also need to report a field that only exists on table
> A. This will mean a JOIN between five tables: E, D, C, B and A. Some
> questions follow:
>
> [1] assuming tables having a number of rows in the order of 100,000, after
> how many levels of depth would you feel justified to depart from the
> normalized schema and introduce some redundancy to speed up the queries?
>
> [3] do you feel this is a legitimate concern in a modern PostgreSQL database
> running on high end (200,000 USD) hardware and serving no more than 1000
> concurrent users with table sizes at the lowest (more detailed) level of the
> hierarchy in the order of a few tens of millions of rows at the most and
> dropping by a factor of 20 for each level up ?

I would ask different question(s): how "static" that tree structure is
and what kind of queries do you want to run:
- father-child: easy to understand; add new node; change leaf node;
hard to run some count(*) queries; and get hierarchy (CTEs are help
full)
- nested sets: pailful to move nodes around (even add new node); easy
to get tree subsets; ...

Anyway, I've found this summary:
http://stackoverflow.com/questions/4048151/what-are-the-options-for-storing-hierarchical-data-in-a-relational-database
when I was googling for Joe Celko's Trees and Hierarchies book.

--
Ondrej Ivanic
(ondrej.ivanic@gmail.com)


pgsql-general by date:

Previous
From: Steve Crawford
Date:
Subject: Re: Problems with timestamp with time zone and old dates?
Next
From: Michael Clark
Date:
Subject: Re: Problems with timestamp with time zone and old dates?