Re: Varchar vs foreign key vs enumerator - table and index size - Mailing list pgsql-performance

From Łukasz Walkowski
Subject Re: Varchar vs foreign key vs enumerator - table and index size
Date
Msg-id A698EB68-A22F-427A-9A7D-0C251B03A7EC@homplex.pl
Whole thread Raw
In response to Re: Varchar vs foreign key vs enumerator - table and index size  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Varchar vs foreign key vs enumerator - table and index size  (Craig James <cjames@emolecules.com>)
List pgsql-performance
Tom,

> If you're starting to be concerned about space, it's definitely time to
> get away from this choice.  Depending on what locale you're using,
> comparing varchar values can be quite an expensive operation, too.

I don't like wasting space and processing power even if more work is required to achieve this. We use pl_PL.UTF-8 as
ourlocale. 

> I think the main "pro" of this approach is that it doesn't use any
> nonstandard SQL features, so you preserve your options to move to some
> other database in the future.  The main "con" is that you'd be buying into
> fairly significant rewriting of your application code, since just about
> every query involving these columns would have to become a join.

Well, I don't really think I will move from Postgresql anytime soon. It's just the best database for me. Rewriting code
isone of the things I'm doing right now but before I touch database, I want to be sure that the choices I made are
good.

> FWIW, I'd be inclined to just use integer not smallint.  The space savings
> from smallint is frequently illusory because of alignment considerations
> --- for instance, an index on a single smallint column will *not* be any
> smaller than one on a single int column.  And smallint has some minor
> usage annoyances because it's a second-class citizen in the type promotion
> hierarchy --- you may find yourself needing explicit casts to smallint
> here and there.

Ok, thats important information. Thank you.

>
> Space-wise this is going to be equivalent to the integer-foreign-key
> solution.  It's much nicer from a notational standpoint, though, because
> you don't need joins --- it's likely that you'd need few if any
> application code changes to go this route.  (But I'd advise doing some
> testing to verify that before you take it as a given.)
>
> You're right though that enums are not a good option if you expect
> frequent changes in the pool of allowed values.  I guess the question
> is how often does that happen, in your application?  Adding a new value
> from time to time isn't much of a problem unless you want to get picky
> about how it sorts relative to existing values.  But you can't ever delete
> an individual enum value, and we don't support renaming them either.
> (Though if you're desperate, I believe a manual UPDATE on the pg_enum
> catalog would work for that.)
>
> Another thing to think about is whether you have auxiliary data about each
> value that might usefully be stored as additional columns in the small
> tables.  The enum approach doesn't directly handle that, though I suppose
> you could still create small separate tables that use an enum column as
> primary key.
>
>             regards, tom lane

So, I'll go for enumerators for device type, eventtype and eventsource as those columns are quite stable. For browser
andoperating system I'll do external tables. 

Thank you - any additional tips are welcome.

Reagards,
Lukasz Walkowski

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Varchar vs foreign key vs enumerator - table and index size
Next
From: Craig James
Date:
Subject: Re: Varchar vs foreign key vs enumerator - table and index size