A space-efficient, user-friendly way to store categorical data - Mailing list pgsql-hackers

From Andrew Kane
Subject A space-efficient, user-friendly way to store categorical data
Date
Msg-id CACDdp+b0=o_jsoLnmq=5eL3mmpcxxYH1AZoqg-yz9tSP1+rVyA@mail.gmail.com
Responses Re: A space-efficient, user-friendly way to store categorical data  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: A space-efficient, user-friendly way to store categorical data  (Mark Dilger <hornschnorter@gmail.com>)
List pgsql-hackers
Hi,

I'm hoping to get feedback on an idea for a new data type that stores text values efficiently while keeping reads and writes user-friendly. Suppose you want to store categorical data, such as each user's current city. The list of cities is long, and many users share the same city. Some options are:

- Use a text column - simple, but stores the full string in every row
- Use an enum column - saves space, but labels must be set ahead of time
- Create another table for cities (normalize) - saves space, but complicates reads and writes
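To make the cost of the normalized option concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than PostgreSQL, so the SQL dialect differs slightly (for example, `INSERT OR IGNORE` is SQLite syntax). The table names, column names, and helper functions are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database with a lookup table for cities (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cities (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE users (
        id      INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        city_id INTEGER NOT NULL REFERENCES cities(id)
    );
""")

def add_user(conn, name, city):
    # Write path: ensure the city row exists, look up its id,
    # and only then insert the user.
    conn.execute("INSERT OR IGNORE INTO cities (name) VALUES (?)", (city,))
    (city_id,) = conn.execute(
        "SELECT id FROM cities WHERE name = ?", (city,)
    ).fetchone()
    conn.execute(
        "INSERT INTO users (name, city_id) VALUES (?, ?)", (name, city_id)
    )

def user_cities(conn):
    # Read path: a join is needed to turn city_id back into a label.
    return conn.execute(
        "SELECT u.name, c.name FROM users u "
        "JOIN cities c ON c.id = u.city_id ORDER BY u.id"
    ).fetchall()

add_user(conn, "alice", "Paris")
add_user(conn, "bob", "Paris")
add_user(conn, "carol", "Tokyo")
```

Each distinct city is stored once, but every write needs an extra insert-or-lookup step and every read needs a join.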

A better option could be a new "dynamic enum" type, which would have storage requirements similar to an enum's, but instead of labels being declared ahead of time, they would be added as data is inserted.
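The behavior described here can be sketched in a few lines of Python. This is only an illustration of the concept, not a proposed implementation: the `DynamicEnum` class and its `encode` and `decode` methods are hypothetical names, and a real type would have to handle concurrency, persistence, and catalog storage, none of which is modeled here.

```python
class DynamicEnum:
    """Hypothetical sketch: each label gets a small integer code on first use."""

    def __init__(self):
        self._code_by_label = {}   # label -> code, used on writes
        self._label_by_code = []   # code -> label, used on reads

    def encode(self, label):
        # Write path: reuse the existing code, or assign the next one.
        code = self._code_by_label.get(label)
        if code is None:
            code = len(self._label_by_code)
            self._code_by_label[label] = code
            self._label_by_code.append(label)
        return code

    def decode(self, code):
        # Read path: map the stored code back to its label.
        return self._label_by_code[code]


cities = DynamicEnum()

# The "column" stores only the integer codes.
stored = [cities.encode(c) for c in ["Paris", "Tokyo", "Paris", "Lima", "Tokyo"]]

# Reading gives back the original text without an explicit join.
labels = [cities.decode(code) for code in stored]
```

The caller reads and writes plain text, while each distinct label is kept only once in the mapping.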

It'd be great to hear what others think of this (or if I'm missing something). Another direction could be to deduplicate values for TOAST-able data types.

Thanks,
Andrew
