Re: Varchar vs foreign key vs enumerator - table and index size - Mailing list pgsql-performance

From Vitalii Tymchyshyn
Subject Re: Varchar vs foreign key vs enumerator - table and index size
Date
Msg-id CABWW-d16BtaUYwQTmtzyf-1R4QVgxwnOt_Di5rk5_x-Y7+KGkw@mail.gmail.com
Whole thread Raw
In response to Re: Varchar vs foreign key vs enumerator - table and index size  (Andrew Dunstan <andrew@dunslane.net>)
List pgsql-performance
Well, in older version of Hibernate it was a little tricky to handle Postgresql Enums. Dunno if it's out of the box now.
Also adding new value is an explicit operation (much like with lookup table). I've had quite a complex code with second connection opening to support lookup table filling without flooding original transaction with additional locks that could lead to deadlocks.
BTW: Does adding new value to enum adds some locks? Can a check if value exists and adding new value be done in atomic fashion without grabbing some global lock?
P.S. As  I see, it can be a topic for good article for, say, dzone. The problem can be quite tricky in MVCC database and choice must be done wisely.

Best regards, Vitalii Tymchyshyn


2013/9/2 Andrew Dunstan <andrew@dunslane.net>

On 09/02/2013 05:53 AM, Łukasz Walkowski wrote:
On 1 wrz 2013, at 05:10, Vitalii Tymchyshyn <tivv00@gmail.com> wrote:

Well, there are some more options:
a) Store int keys and do mapping in the application (e.g. with java enums). This can save you a join, that is especially useful if you are going to do paged output with limit/offset scenario. Optimizer sometimes produce suboptimal plans for join in offset/limit queries.
b) Store small varchar values as keys (up to "char" type if you really want to save space) and do user display mapping in application. It's different from (a) since it's harder to mess with the mapping and values are still more or less readable with simple select. But it can be less efficient than (a).
c) Do mixed approach with mapping table, loaded on start into application memory. This would be an optimization in case you get into optimizer troubles.

Best regards, Vitalii Tymchyshyn
I'd like to leave database in readable form because before I add some new queries and rest endpoints to the application, I test them as ad-hoc queries using command line. So variant a) isn't good for me. Variant b) is worth trying and c) is easy to code, but I still prefer having all this data in database independent of application logic.



I think the possible use of Postgres enums has been too easily written off in this thread. Looking at the original problem description they look like quite a good fit, despite the OP's skepticism. What exactly is wanted that can't be done with database enums? You can add new values to the type very simply.  You can change the values of existing labels in the type slightly less simply, but still without any great difficulty. Things that are hard to do include removing labels in the set and changing the sort order, because those things would require processing tables where the type is used, unlike the simple things. But neither of these is required for typical use cases. For most uses of this kind they are very efficient both in storage and processing.

cheers

andrew



--
Best regards,
 Vitalii Tymchyshyn

pgsql-performance by date:

Previous
From: Roberto Grandi
Date:
Subject: COPY TO and VACUUM
Next
From: Kevin Grittner
Date:
Subject: Re: planner parameters