Re: Compare rows - Mailing list pgsql-performance

From Christopher Browne
Subject Re: Compare rows
Date
Msg-id m3fzi35fkd.fsf@wolfe.cbbrowne.com
Whole thread Raw
In response to Compare rows  (Greg Spiegelberg <gspiegelberg@cranel.com>)
Responses Re: Compare rows  (Josh Berkus <josh@agliodbs.com>)
Re: Compare rows  (Greg Spiegelberg <gspiegelberg@cranel.com>)
List pgsql-performance
In an attempt to throw the authorities off his trail, josh@agliodbs.com (Josh Berkus) transmitted:
> child table
> id    value_type    value
> 3    uptime        0.3
> 3    speed        11.2
> 3    memory        37
> 3    tty            6
> 7    uptime        1.1
> 7    memory        15
> 9    uptime        0.1
> 9    memory        94
> 9    tty            2
>
> As you can see, the NULLs are not stored, making this system much more
> efficient on storage space.

Wow, that takes me back to a paper I have been looking for for
_years_.

Some time in the late '80s, probably '88 or '89, there was a paper
presented in Communications of the ACM that proposed using this sort
of "hypernormalized" schema as a way of having _really_ narrow schemas
that would be exceedingly expressive.  They illustrated an example of
an address table that could hold full addresses with a schema with
only about half a dozen columns, the idea being that you'd have
several rows linked together.

The methodology was _heavy_ on metadata, though not so much so that
there were no columns left over for "real" data.

The entertaining claim was that they felt they could model the
complexities of the operations of any sort of company using not more
than 50 tables.  It seemed somewhat interesting, at the time; it truly
resonated as Really Interesting when I saw SAP R/3, with its bloat of
1500-odd tables.

(I seem to remember the authors being Boston-based, and they indicated
that they had implemented this "on VMS," which would more than likely
imply RDB; somehow I doubt that'll be the set of detail that makes
someone remember it...)

The need to do a lot of joins would likely hurt performance somewhat,
as well as the way that it greatly increases the number of rows.
Although you could always split it into several tables, one for each
"value_type", and UNION them into a view...
--
wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','acm.org').
http://cbbrowne.com/info/unix.html
You shouldn't anthropomorphize computers; they don't like it.

pgsql-performance by date:

Previous
From: Christopher Kings-Lynne
Date:
Subject: Re: [HACKERS] Sun performance - Major discovery!
Next
From: Bruce Momjian
Date:
Subject: Re: [HACKERS] Sun performance - Major discovery!