Re: How to modify ENUM datatypes? - Mailing list pgsql-general

From Jeff Davis
Subject Re: How to modify ENUM datatypes?
Date
Msg-id 1208902296.14025.39.camel@dogma.ljc.laika.com
Whole thread Raw
In response to Re: How to modify ENUM datatypes?  ("D. Dante Lorenso" <dante@lorenso.com>)
Responses Re: How to modify ENUM datatypes?  ("Merlin Moncure" <mmoncure@gmail.com>)
List pgsql-general
On Tue, 2008-04-22 at 16:34 -0500, D. Dante Lorenso wrote:
> I see this might be a
> problem with storage since you will need to store the TEXT value for
> every row in the 'mystuff' table instead of just storing the reference
> to the lookup table as an INTEGER.  Over millions of rows, perhaps this
> would become a concern?

It does use additional storage to store the full text value, rather than
a fixed-size integer. However, the difference is not much when the
average string length is short.

If you store an integer reference instead, joins are not necessarily
expensive. If the number of distinct values is small (which is the
normal use case for ENUM), I would expect the joins to be quite cheap.
Beware of running into bad plans however, or making the optimizer work
too hard (if you have a lot of other joins, too).

I don't think the performance concerns are major, but worth considering
if you have millions of rows.

> What is the general consensus by the community about this approach?  Is
> this de-normalization frowned upon, or is there a performance advantage
> here that warrants the usage?

This is not de-normalization, at all. Normalization is a formal process,
and if this were de-normalization, you could find a specific rule that
is violated by this approach.

Look here:
http://en.wikipedia.org/wiki/Third_normal_form

If you go to higher normal forms, you will not find any violations
there, either. There is nothing about normalization that requires the
use of surrogate keys.

The approach suggested by Scott Marlowe is normalized as well as being
quite natural and simple. I think often this is overlooked as being "too
simple", but it's a quite good design in many cases.

Regards,
    Jeff Davis


pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: Rapidly decaying performance repopulating a large table
Next
From: "David Wilson"
Date:
Subject: Re: Rapidly decaying performance repopulating a large table