Home > mailing lists

Re: Enums patch v2 - Mailing list pgsql-patches

From	Tom Dunstan
Subject	Re: Enums patch v2
Date	December 20, 2006 00:22:50
Msg-id	45889076.5020507@tomd.cc Whole thread Raw
In response to	Re: Enums patch v2 (Heikki Linnakangas <heikki@enterprisedb.com>)
List	pgsql-patches

Tree view

Heikki Linnakangas wrote:
> I'm sorry I missed the original discussions, but I have to ask: Why do
> we want enums in core? The only potential advantage I can see over using
> a look-up table and FK references is performance.

Well, there are a few things. Sometimes its tidiness, sometimes
integrity... I've seen more than one system with hundreds of these
things, and they've either gone down the table-per-enum solution, with
LOTS of extra tables whose values never change, or the EAV solution,
with one or two globally referenced tables to which everything in your
system has as a FK, and an integrity check in a trigger if you're very
lucky. Yuck on both accounts. Enums hit a sweet spot in the middle and
provide data integrity and performance for non-changing values.

> 1. What's the point of having comparison operators for enums? For most
> use cases, there's no natural ordering of enum values.

Well, there are a number of cases where ordering IS important, and
indeed, enums provide a way to do it easily where many of the
alternative solutions do not. It's one of the key benefits.

> 2. The comparison routine compares oids, right? If the oids wrap around
> when the enum values are created, the ordering isn't what the user expects.

As has been pointed out by others quicker on the draw than me, I do sort
the OIDs at enum creation time, for exactly this reason.

> 3. 4 bytes per value is wasteful if you're storing simple status codes
> etc. Especially if you're doing this for performance, let's do no harm
> by wasting space. One byte seems enough for the typical use cases. I'd
> even argue that having a high upper limit on the number of enum values
> encourages misuse of the feature.

I'd really love to have these fit into a 1 or 2 byte value on disk, but
AFAIK there's simply no way to do it currently in postgresql. If we ever
move to a solution where on-disk representation is noticeably different
from in-memory representation, then it might be doable. If that does
happen, we might benefit from other improvements such as being able to
order columns in a tuple on disk so as to minimize alignment padding,
not having to store a composite type's oid, etc. Until that happens,
though, if it ever does, this is probably the tightest on-disk
representation we're likely to get, unless we're happy to impose some
pretty severe restrictions, like 8 bits per enum, and only 256 enums in
total (for a 2 byte total). I was already shot down trying to make
similar restrictions when I first brought it up. :) The OID solution
seems to offend the least. :)

We did discuss this somewhat earlier, and I'm happy to take alternative
suggestions, but AFAIK this is about as good as it's going to get right now.

Cheers

Tom

pgsql-patches by date:

From: Bruce Momjian
Date: 19 December 2006, 16:00:51
Subject: Re: Load distributed checkpoint patch

From: Tom Dunstan
Date: 20 December 2006, 00:39:50
Subject: Re: [HACKERS] Enums patch v2

Re: Enums patch v2 - Mailing list pgsql-patches

Previous

Next