Re: Offending My Tender Sensibilities -OR- OLTP on a Star Schema - Mailing list pgsql-general

From Rob Sargent
Subject Re: Offending My Tender Sensibilities -OR- OLTP on a Star Schema
Date
Msg-id 531640D1.4070205@gmail.com
Whole thread Raw
In response to Re: Offending My Tender Sensibilities -OR- OLTP on a Star Schema  (Merlin Moncure <mmoncure@gmail.com>)
Responses Re: Offending My Tender Sensibilities -OR- OLTP on a Star Schema  (Merlin Moncure <mmoncure@gmail.com>)
List pgsql-general
On 03/04/2014 01:40 PM, Merlin Moncure wrote:
On Tue, Mar 4, 2014 at 2:15 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
Roy Anderson <roy.anderson@gmail.com> wrote:

We have an OLTP database and no data warehouse. We are currently
planning out a build for a data warehouse however (possibly using
Hadoop). "X" is recommending that we convert our current,
normalized OLTP database into a flattened Star Schema.
I'm not going to repeat good advice you've already gotten in other
answers, but I will point out that complex reporting off of
normalized data is often much faster if you have been able to use
natural keys, even if you need to go to multi-column primary keys
to do so.  One of the biggest down-sides of synthetic primary keys
(where, for example, you might have a single-column PK column
called "id" in every table) is that forces one particular route to
"navigate" the tables.  With natural keys a complex query often
finds intriguing plans to give the results you ask for using plans
you might never have thought of, and which can be orders of
magnitude faster than the plans which would be possible if the
joins are all done using synthetic keys.
If we ever happen to meet, you just bought yourself a steak dinner
with this email.  Natural key database design has to my great
displeasure become something of a lost art.  Data modeling and
performance expectations have really suffered as a consequence of that
knowledge gap.  Now, natural keys have issues also -- update
performance on the key in particular -- so you have to be nimble and
adjust the model as appropriate to the task at hand.

merlin


Do you make a distinction between a key and an index?  I'm not picking up on design-by-natural-key and what that entails. Especially the notion that the natural key of a given item might be mutable. What stops it from colliding with the next item? (I have not had the pleasure of working in a domain where natural keys are obvious if they existed at all. "What's in a name", after all. )

pgsql-general by date:

Previous
From: David Johnston
Date:
Subject: Re: SQL question on chunking aggregates
Next
From: Venkata Balaji Nagothi
Date:
Subject: Re: log_statement per table