Thread: extensible enum types

extensible enum types

From

Andrew Dunstan

Date:

18 June 2010, 12:51:09

A recent discussion with a developer convinced me that enums would be 
much more useful if new values could be added to the label list easily 
without changing the stored values. Given the current representation of 
enums as a globally unique oid, I think this is just about impossible if 
the new label is to go somewhere other than on the end of the list 
(assuming that's where the next oid would get allocated).

So I have been thinking about a new type family which for the sake of 
argument I will call varenum (please hold the bikeshedding till later). 
Staying with 4 bytes as the stored value, the idea is that we partition 
it into two 16 bit pieces. The high-order piece is the varenum type 
identifier. The low-order piece would uniquely identify the label within 
the set for that varenum. Initial values would be allocated like this: 
calculate the space between values p as 2**16 / (16 + number_of_labels). 
Then set the first value at  8 * p, then next at 9* p and so on. This is 
designed to allow more space to add labels at the beginning and end of 
the list, where this is more likely. Adding a label would be a matter of 
finding the labels adjacent to the position where we want to add the new 
label, and placing it half way between them, possibly with special rules 
for the list ends. If we want to add the label between two labels having 
values n and n+1 the addition would fail.

All this would mean a) we can't have more than 65536 varenum types in a 
system, and no varenum type could have more than 65536 values (and 
possibly less, depending on how they are added). In practice I think 
these are quite reasonable restrictions, and 99.99% of users would never 
come close to bumping up against them.

Given all this, we could then allow things like:
   ALTER TYPE varenumtype ADD 'newlabel' [ BEFORE | AFTER 'existinglabel' ]


I haven't looked at how we'd set this up in the catalog, but I assume 
that by analogy with pg_enum it should be fairly straightforward.

We could actually allow something like the above for existing enum types 
without the optional clause, which would just add the label wherever the 
next oid happened to fall, which would commonly be at the end of the 
list. That would handle the common case where the application doesn't 
care about the label ordering. That should be a much simpler change and 
is probably worth doing first.

There was some discussion of this here: 
<http://archives.postgresql.org/message-id/20080425182718.GD5888@alvh.no-ip.org> 
But I'm fairly reluctant to do things which will require table rewrites. 
I'm also less excited about the idea of removing values - that is 
something I don't ever recall being asked about, but I have often been 
asked about adding labels. For people prepared to rewrite tables, there 
is currently a workaround: create a new enum type, alter the table to 
use the new enum type, drop the old enum type, rename the new enum type 
to the old enum type.

Thoughts?

cheers

andrew

Re: extensible enum types

From

Robert Haas

Date:

18 June 2010, 13:07:56

On Fri, Jun 18, 2010 at 11:50 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
> Then set the
> first value at  8 * p, then next at 9* p and so on. This is designed to
> allow more space to add labels at the beginning and end of the list, where
> this is more likely. Adding a label would be a matter of finding the labels
> adjacent to the position where we want to add the new label, and placing it
> half way between them, possibly with special rules for the list ends. If we
> want to add the label between two labels having values n and n+1 the
> addition would fail.

I like the idea of being able to modify enums on the fly, but I'm
skeptical of an implementation that won't always work.  Maybe it's
still better than what we have now, but it seems grotty.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: extensible enum types

From

Andrew Dunstan

Date:

18 June 2010, 13:34:45


Robert Haas wrote:
> On Fri, Jun 18, 2010 at 11:50 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
>   
>> Then set the
>> first value at  8 * p, then next at 9* p and so on. This is designed to
>> allow more space to add labels at the beginning and end of the list, where
>> this is more likely. Adding a label would be a matter of finding the labels
>> adjacent to the position where we want to add the new label, and placing it
>> half way between them, possibly with special rules for the list ends. If we
>> want to add the label between two labels having values n and n+1 the
>> addition would fail.
>>     
>
> I like the idea of being able to modify enums on the fly, but I'm
> skeptical of an implementation that won't always work.  Maybe it's
> still better than what we have now, but it seems grotty.
>
>   

I'd be perfectly happy to hear a reasonable alternative. Assuming we use 
some integer representation, given two labels represented by n and n+1, 
we can't add a label between them without rewriting the tables that use 
the type, whether it's my representation scheme or some other. Maybe we 
could have a FORCE option which would rewrite if necessary.

cheers

andrew

Re: extensible enum types

From

"David E. Wheeler"

Date:

18 June 2010, 13:35:55

On Jun 18, 2010, at 9:07 AM, Robert Haas wrote:

>> Then set the
>> first value at  8 * p, then next at 9* p and so on. This is designed to
>> allow more space to add labels at the beginning and end of the list, where
>> this is more likely. Adding a label would be a matter of finding the labels
>> adjacent to the position where we want to add the new label, and placing it
>> half way between them, possibly with special rules for the list ends. If we
>> want to add the label between two labels having values n and n+1 the
>> addition would fail.
>
> I like the idea of being able to modify enums on the fly, but I'm
> skeptical of an implementation that won't always work.  Maybe it's
> still better than what we have now, but it seems grotty.

Yes, other than that I fully endorse the idea. What's the likelihood of a failure? And would the position of the new
label(represented by its internal number) be predictive? IOW, would updating the same varenumtype in two databases (or
ontwo servers) yield the same underlying positional value? 

Best,

David

Re: extensible enum types

From

"David E. Wheeler"

Date:

18 June 2010, 13:36:32

On Jun 18, 2010, at 9:34 AM, Andrew Dunstan wrote:

> I'd be perfectly happy to hear a reasonable alternative. Assuming we use some integer representation, given two
labelsrepresented by n and n+1, we can't add a label between them without rewriting the tables that use the type,
whetherit's my representation scheme or some other. Maybe we could have a FORCE option which would rewrite if
necessary.

People would likely always use it.

Why not use a decimal number?

Best,

David

Re: extensible enum types

From

"Joshua D. Drake"

Date:

18 June 2010, 13:43:43

On Fri, 2010-06-18 at 12:34 -0400, Andrew Dunstan wrote:
>
> Robert Haas wrote:
> > On Fri, Jun 18, 2010 at 11:50 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
> >
> >> Then set the
> >> first value at  8 * p, then next at 9* p and so on. This is designed to
> >> allow more space to add labels at the beginning and end of the list, where
> >> this is more likely. Adding a label would be a matter of finding the labels
> >> adjacent to the position where we want to add the new label, and placing it
> >> half way between them, possibly with special rules for the list ends. If we
> >> want to add the label between two labels having values n and n+1 the
> >> addition would fail.
> >>
> >
> > I like the idea of being able to modify enums on the fly, but I'm
> > skeptical of an implementation that won't always work.  Maybe it's
> > still better than what we have now, but it seems grotty.
> >
> >
>
> I'd be perfectly happy to hear a reasonable alternative. Assuming we use
> some integer representation, given two labels represented by n and n+1,
> we can't add a label between them without rewriting the tables that use
> the type, whether it's my representation scheme or some other. Maybe we
> could have a FORCE option which would rewrite if necessary.

I apologize as this thread has already moved past the initial proposal.
I had mail problems so I didn't see any part of the thread in my client
until now.

My standard response to enums is, don't use them. Specifically because
you can't modify them. When I talk to developers and they see we have
enums they get excited and then I tell them you can't modify them and
they get very downtrodden. I tell them just to use a look up table.

Anyway, Andrew if you can find a reasonable way to make them modifiable,
I am all for it :)

On that note and yes this would be weird but have we considered some
kind of wrapper around just a lookup table? I mean what if there was a
system table that listed :

nspname, relation, column, value

the "type" enum would know to look in that table for valid values

Then we just add various syntactical sugar to manage the values in the
table via that specific enum:

ALTER TABLE foo ALTER COLUMN bar VALUES (bar,baz,bing,foo)

ALTER TABLE would know its an enum and would perform proper operations
on the system table.

Sincerely,

Joshua D. Drake

>
> cheers
>
> andrew
>

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering

Re: extensible enum types

From

Tom Lane

Date:

18 June 2010, 13:46:12

Andrew Dunstan <andrew@dunslane.net> writes:
> Robert Haas wrote:
>> I like the idea of being able to modify enums on the fly, but I'm
>> skeptical of an implementation that won't always work.  Maybe it's
>> still better than what we have now, but it seems grotty.

> I'd be perfectly happy to hear a reasonable alternative.

Insert a sort order column into pg_enum, and rearrange the values in
that whenever the user wants to add a new value in a particular place.
You give up cheap comparisons in exchange for flexibility.  I think lots
of people would accept that tradeoff, especially if they could make it
per-datatype.

One point here is that you'd have to restrict the rearrangements so that
the relative sort order of existing values never changes, else you break
(for example) indexes on columns of that type.
        regards, tom lane

Re: extensible enum types

From

Andrew Dunstan

Date:

18 June 2010, 13:56:23

David E. Wheeler wrote:
> What's the likelihood of a failure? 

Constructing a failure case would be simple. In practice, I suspect it 
would be very low.

> And would the position of the new label (represented by its internal number) be predictive? IOW, would updating the
samevarenumtype in two databases (or on two servers) yield the same underlying positional value?

>
>
>   

The algorithm I outlined is deterministic, so the same sequence of 
operations on the type would yield the same set of values on the 
low-order 16 bits. But that doesn't mean they would have the same high 
order 16 bits. That would depend on the history of the system. But more 
importantly, why do you care? the stored value is an implementation 
artefact that should be totally invisible to users. There would be no 
guarantee of the same underlying values on dump/reload either, just as 
there is not now for enums.

cheers

andrew

Re: extensible enum types

From

Andrew Dunstan

Date:

18 June 2010, 13:59:20


David E. Wheeler wrote:
> On Jun 18, 2010, at 9:34 AM, Andrew Dunstan wrote:
>
>   
>> I'd be perfectly happy to hear a reasonable alternative. Assuming we use some integer representation, given two
labelsrepresented by n and n+1, we can't add a label between them without rewriting the tables that use the type,
whetherit's my representation scheme or some other. Maybe we could have a FORCE option which would rewrite if
necessary.
>>     
>
> People would likely always use it.
>
> Why not use a decimal number?
>
>
>   

You are just bumping up the storage cost. Part of the attraction of 
enums is their efficiency.

cheers

andrew

Re: extensible enum types

From

Andrew Dunstan

Date:

18 June 2010, 14:17:47

Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>   
>> Robert Haas wrote:
>>     
>>> I like the idea of being able to modify enums on the fly, but I'm
>>> skeptical of an implementation that won't always work.  Maybe it's
>>> still better than what we have now, but it seems grotty.
>>>       
>
>   
>> I'd be perfectly happy to hear a reasonable alternative.
>>     
>
> Insert a sort order column into pg_enum, and rearrange the values in
> that whenever the user wants to add a new value in a particular place.
> You give up cheap comparisons in exchange for flexibility.  I think lots
> of people would accept that tradeoff, especially if they could make it
> per-datatype.
>
> One point here is that you'd have to restrict the rearrangements so that
> the relative sort order of existing values never changes, else you break
> (for example) indexes on columns of that type.
>
>             
>   

Hmm. Yes, that could work. The assumption in my proposal was that 
existing values would not be reordered anyway.

But I'm not happy about giving up cheap comparison. And how would it be 
per data-type? That part isn't clear to me. Would we mark a given enum 
type as having its oids in order? It would also be sensible to quantify 
how much more expensive comparisons would become. If the sort order data 
were kept in the syscache the extra cost might get  very small.

What I actually like most about this suggestion is that we would be able 
to apply it cleanly to existing enum types without inventing anything 
much new.

cheers

cheers

Re: extensible enum types

From

Robert Haas

Date:

18 June 2010, 14:18:22

On Fri, Jun 18, 2010 at 12:59 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
> You are just bumping up the storage cost. Part of the attraction of enums is
> their efficiency.

What's efficient about them?  Aren't we using 4 bytes to store a value
that will nearly always fit in 2, if not 1?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: extensible enum types

From

Andrew Dunstan

Date:

18 June 2010, 14:59:19


Robert Haas wrote:
> On Fri, Jun 18, 2010 at 12:59 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>   
>> You are just bumping up the storage cost. Part of the attraction of enums is
>> their efficiency.
>>     
>
> What's efficient about them?  Aren't we using 4 bytes to store a value
> that will nearly always fit in 2, if not 1?
>
>   
This was debated when we implemented enums. As between 1,2 and 4 there 
is often not much to choose, as alignment padding makes it pretty much 
the same. But any of them are more efficient than storing a numeric 
value or the label itself.

Anyway, it might well be moot.

cheers

andrew

Re: extensible enum types

From

Robert Haas

Date:

18 June 2010, 15:06:18

On Fri, Jun 18, 2010 at 1:59 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
> This was debated when we implemented enums. As between 1,2 and 4 there is
> often not much to choose, as alignment padding makes it pretty much the
> same.  But any of them are more efficient than storing a numeric value or the
> label itself.

I was assuming the alternative was an integer, rather than a
numeric...  but yeah, a numeric or the label itself would definitely
be larger.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: extensible enum types

From

Greg Stark

Date:

18 June 2010, 15:28:35

On Fri, Jun 18, 2010 at 6:17 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>
>
> Tom Lane wrote:
>>
>> Insert a sort order column into pg_enum, and rearrange the values in
>> that whenever the user wants to add a new value in a particular place.

+1 I was going to say exactly the same thing.

>> You give up cheap comparisons in exchange for flexibility.  I think lots
>> of people would accept that tradeoff, especially if they could make it
>> per-datatype.
> Hmm. Yes, that could work. The assumption in my proposal was that existing
> values would not be reordered anyway.
>
> But I'm not happy about giving up cheap comparison. And how would it be per
> data-type? That part isn't clear to me. Would we mark a given enum type as
> having its oids in order? It would also be sensible to quantify how much
> more expensive comparisons would become. If the sort order data were kept in
> the syscache the extra cost might get  very small.

I think you would need a syscache or something like it. My first
instinct was to load the whole enum value->sort order mapping into a
hash table the first time you're asked to compare two values in a
given type. Then your comparison operator amounts to "look
up/initialize hash table for this enum type, look up both sort orders
in hash table, return comparison". You might need something like a
syscache for the hash tables so that you don't keep the hash tables
around forever.

Using a syscache for the individual sort values would be slower to
initially load if you're sorting a list since you would be doing a lot
of retail lookups of individual values. But then perhaps it's a cheap
way to protect against very large enums which using a hash table per
enum type would be fragile against.

--
greg

Re: extensible enum types

From

Tom Lane

Date:

18 June 2010, 16:19:36

Andrew Dunstan <andrew@dunslane.net> writes:
> Tom Lane wrote:
>> Insert a sort order column into pg_enum, and rearrange the values in
>> that whenever the user wants to add a new value in a particular place.
>> You give up cheap comparisons in exchange for flexibility.  I think lots
>> of people would accept that tradeoff, especially if they could make it
>> per-datatype.

> But I'm not happy about giving up cheap comparison.

I don't think it would be all that bad.  We could teach typcache.c to
cache the ordering data for any type that's in active use.  It'd
certainly be a lot more expensive than OID comparison, but perhaps not
worse than NUMERIC comparisons.

> And how would it be per data-type?

Well, there'd be two kinds of enums, just as you were saying before.
I'm not sure how we'd expose that to users exactly, or whether there
could be provisions for switching a type's behavior after creation.
        regards, tom lane

Re: extensible enum types

From

Joseph Adams

Date:

18 June 2010, 19:18:05

On Fri, Jun 18, 2010 at 1:59 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>
>
> Robert Haas wrote:
>>
>> On Fri, Jun 18, 2010 at 12:59 PM, Andrew Dunstan <andrew@dunslane.net>
>> wrote:
>>
>>>
>>> You are just bumping up the storage cost. Part of the attraction of enums
>>> is
>>> their efficiency.
>>>
>>
>> What's efficient about them?  Aren't we using 4 bytes to store a value
>> that will nearly always fit in 2, if not 1?
>>
>>
>
> This was debated when we implemented enums. As between 1,2 and 4 there is
> often not much to choose, as alignment padding makes it pretty much the
> same. But any of them are more efficient than storing a numeric value or the
> label itself.
>
> Anyway, it might well be moot.
>
> cheers
>
> andrew

Something I don't understand in all this is: why can't the type of an
enum be determined statically rather than stored in every single
value?  For instance, if we have:

CREATE TYPE number AS ENUM ('one', 'two', 'three');
CREATE TYPE color AS ENUM ('red', 'green', 'blue');

PostgreSQL won't allow a comparison between two different enum types, e.g.:

> SELECT 'one'::number = 'red'::color;
ERROR:  operator does not exist: number = color

However, when we say:

SELECT 'one'::number = 'one'::number

Couldn't enum_eq just use get_fn_expr_argtype to determine the type of
enum input rather than rely on it being stored in the value (either
implicitly via OID or explicitly as a word half)?

Also, I can't seem to find the original debates from when enums were
implemented.  Does anyone have a link to that thread in the archives?
Thanks.

Joey Adams

Re: extensible enum types

From

Andrew Dunstan

Date:

18 June 2010, 19:33:23


Joseph Adams wrote:
> Also, I can't seem to find the original debates from when enums were
> implemented.  Does anyone have a link to that thread in the archives?
> Thanks.
>   

Start here 
<http://archives.postgresql.org/pgsql-hackers/2006-08/msg00979.php>


cheers

andrew

Re: extensible enum types

From

Alvaro Herrera

Date:

18 June 2010, 19:36:23

Excerpts from Joseph Adams's message of vie jun 18 18:17:50 -0400 2010:

> Couldn't enum_eq just use get_fn_expr_argtype to determine the type of
> enum input rather than rely on it being stored in the value (either
> implicitly via OID or explicitly as a word half)?
> 
> Also, I can't seem to find the original debates from when enums were
> implemented.  Does anyone have a link to that thread in the archives?

Probably around here
http://archives.postgresql.org/pgsql-patches/2006-12/msg00099.php

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: extensible enum types

From

Tom Lane

Date:

18 June 2010, 19:54:01

Joseph Adams <joeyadams3.14159@gmail.com> writes:
> Couldn't enum_eq just use get_fn_expr_argtype to determine the type of
> enum input rather than rely on it being stored in the value

No.  Support functions have to work in many contexts where there is no
"side channel" such as get_fn_expr_argtype.  What's more, it's very
difficult to provide a side channel without creating security holes.
We used to think it was OK for output functions to rely on a type OID
passed separately from the actual value, but that's insecure:
http://archives.postgresql.org/pgsql-hackers/2005-04/msg00998.php
        regards, tom lane

Re: extensible enum types

From

"Joshua D. Drake"

Date:

18 June 2010, 20:43:23

On Fri, 2010-06-18 at 12:34 -0400, Andrew Dunstan wrote:
> 
> Robert Haas wrote:
> > On Fri, Jun 18, 2010 at 11:50 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
> >   
> >> Then set the
> >> first value at  8 * p, then next at 9* p and so on. This is designed to
> >> allow more space to add labels at the beginning and end of the list, where
> >> this is more likely. Adding a label would be a matter of finding the labels
> >> adjacent to the position where we want to add the new label, and placing it
> >> half way between them, possibly with special rules for the list ends. If we
> >> want to add the label between two labels having values n and n+1 the
> >> addition would fail.
> >>     
> >
> > I like the idea of being able to modify enums on the fly, but I'm
> > skeptical of an implementation that won't always work.  Maybe it's
> > still better than what we have now, but it seems grotty.
> >
> >   
> 
> I'd be perfectly happy to hear a reasonable alternative. Assuming we use 
> some integer representation, given two labels represented by n and n+1, 
> we can't add a label between them without rewriting the tables that use 
> the type, whether it's my representation scheme or some other. Maybe we 
> could have a FORCE option which would rewrite if necessary.

I apologize as this thread has already moved past the initial proposal.
I had mail problems so I didn't see any part of the thread in my client
until now.

My standard response to enums is, don't use them. Specifically because
you can't modify them. When I talk to developers and they see we have
enums they get excited and then I tell them you can't modify them and
they get very downtrodden. I tell them just to use a look up table.

Anyway, Andrew if you can find a reasonable way to make them modifiable,
I am all for it :)

On that note and yes this would be weird but have we considered some
kind of wrapper around just a lookup table? I mean what if there was a
system table that listed :

nspname, relation, column, value

the "type" enum would know to look in that table for valid values

Then we just add various syntactical sugar to manage the values in the
table via that specific enum:

ALTER TABLE foo ALTER COLUMN bar VALUES (bar,baz,bing,foo)

ALTER TABLE would know its an enum and would perform proper operations
on the system table.

Sincerely,

Joshua D. Drake

> 
> cheers
> 
> andrew
> 

-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering

Re: extensible enum types

From

Andrew Dunstan

Date:

18 June 2010, 20:59:56

Tom Lane wrote:
>> And how would it be per data-type?
>>     
>
> Well, there'd be two kinds of enums, just as you were saying before.
> I'm not sure how we'd expose that to users exactly, or whether there
> could be provisions for switching a type's behavior after creation.
>
>             
>   

I'd be a whole lot happier if we didn't have to do that. I've been 
trying to wrack my brain for some clever way to avoid it (so far without 
success).

cheers

andrew

Re: extensible enum types

From

Andrew Dunstan

Date:

19 June 2010, 15:59:09


Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>   
>> Tom Lane wrote:
>>     
>>> Insert a sort order column into pg_enum, and rearrange the values in
>>> that whenever the user wants to add a new value in a particular place.
>>> You give up cheap comparisons in exchange for flexibility.  I think lots
>>> of people would accept that tradeoff, especially if they could make it
>>> per-datatype.
>>>       
>
>   
>> But I'm not happy about giving up cheap comparison.
>>     
>
> I don't think it would be all that bad.  We could teach typcache.c to
> cache the ordering data for any type that's in active use.  It'd
> certainly be a lot more expensive than OID comparison, but perhaps not
> worse than NUMERIC comparisons.
>
>   
>   

Another thought: could we add a column to pg_type with a flag that's 
true if the oids are in sort order? Then the comparison routines could 
just look that up in the type cache and if it's true (as it often will 
be) just return the oid comparison.

cheers

andrew

Re: extensible enum types

From

Tom Lane

Date:

19 June 2010, 16:06:40

Andrew Dunstan <andrew@dunslane.net> writes:
> Another thought: could we add a column to pg_type with a flag that's 
> true if the oids are in sort order? Then the comparison routines could 
> just look that up in the type cache and if it's true (as it often will 
> be) just return the oid comparison.

Well, having to do a cache lookup already makes it a couple orders of
magnitude more expensive than an OID comparison.  However, it's hard to
say how much that matters in terms of total application performance.
We really could do with a bit of performance testing here ...
        regards, tom lane

Re: extensible enum types

From

Gurjeet Singh

Date:

19 June 2010, 17:14:57

On Fri, Jun 18, 2010 at 12:59 PM, Andrew Dunstan <andrew@dunslane.net> wrote:

David E. Wheeler wrote:
On Jun 18, 2010, at 9:34 AM, Andrew Dunstan wrote:

I'd be perfectly happy to hear a reasonable alternative. Assuming we use some integer representation, given two labels represented by n and n+1, we can't add a label between them without rewriting the tables that use the type, whether it's my representation scheme or some other. Maybe we could have a FORCE option which would rewrite if necessary.

People would likely always use it.

Why not use a decimal number?

You are just bumping up the storage cost. Part of the attraction of enums is their efficiency.

Probably it'd be the same as the decimal suggestion above, but we can use floating-point data type.

It will allow injection of a new label at any stage.

CREATE leads to

Label1 -> 1.0
Label2 -> 2.0
Label3 -> 3.0

ALTER ... ADD Label4 AFTER Label2; leads to
Label1 -> 1.0
Label2 -> 2.0
Label4 -> 2.5
Label3 -> 3.0

ALTER ... ADD Label5 AFTER Label2; leads to
Label1 -> 1.0
Label2 -> 2.0
Label5 -> 2.25
Label4 -> 2.5
Label3 -> 3.0

Since floating-point implementation is architecture dependent, the ALTER command should check that the injected value does not equate to any value around it (eg. comparisons of (2.5 == 2) and (2.25 == 2.5) should not yield 0); and if it does, then throw an error and ask the user to use the rewrite-the-table version of the command.

And since it is still 32 bit, and comparisons done by machine, performance should be acceptably close to current integer comparisons, and much faster that the cache lookups etc. being proposed.

This is very similar to Andrew's original suggestion of splitting 32 bits into 16+16, but managed by the machine hence no complicated comparison algos needed on our part. Also, since this is all transparent to the SQL interface, our dump-reload cycle or Slony replication, etc. should not be affected either.

Regards,

--
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh.gurjeet@{ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device

Re: extensible enum types

From

Andrew Dunstan

Date:

19 June 2010, 17:55:19

Gurjeet Singh wrote:
>
>
> This is very similar to Andrew's original suggestion of splitting 32 
> bits into 16+16, but managed by the machine hence no complicated 
> comparison algos needed on our part. Also, since this is all 
> transparent to the SQL interface, our dump-reload cycle or Slony 
> replication, etc. should not be affected either.
>
>

It would break the on-disk representation, though. That's not something 
we want to do any more if it can possibly be avoided. We want to keep 
pg_upgrade working.

cheers

andrew

Re: extensible enum types

From

Merlin Moncure

Date:

19 June 2010, 18:11:52

On Sat, Jun 19, 2010 at 4:55 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
> Gurjeet Singh wrote:
>>
>>
>> This is very similar to Andrew's original suggestion of splitting 32 bits
>> into 16+16, but managed by the machine hence no complicated comparison algos
>> needed on our part. Also, since this is all transparent to the SQL
>> interface, our dump-reload cycle or Slony replication, etc. should not be
>> affected either.
>>
>>
>
> It would break the on-disk representation, though. That's not something we
> want to do any more if it can possibly be avoided. We want to keep
> pg_upgrade working.

I was partial to your original idea -- i thought it was quite clever
actually.  enums are a performance side of a tradeoff already so I
think any improvement for them should be looked at through that lens.

16 bits is IMO enough to pick a reasonable bucket size that gives you
enough play to handle the vast majority of cases that are appropriate
for enums.  your workaround in the rare case you actually hit the
limitations (most of these would fall under the 'oops, i used the
wrong tool' category) seems perfectly ok imo.

merlin

Re: extensible enum types

From

Simon Riggs

Date:

19 June 2010, 21:19:35

On Fri, 2010-06-18 at 11:50 -0400, Andrew Dunstan wrote:

> Thoughts?

enum types exist as an optimisation-by-avoidance of referential
integrity.

We're a relational database, so IMHO we should spend time performance
tuning RI. 

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Development, 24x7 Support, Training and Services

Re: extensible enum types

From

Andrew Dunstan

Date:

19 June 2010, 23:16:54

Simon Riggs wrote:
> On Fri, 2010-06-18 at 11:50 -0400, Andrew Dunstan wrote:
>
>   
>> Thoughts?
>>     
>
> enum types exist as an optimisation-by-avoidance of referential
> integrity.
>
> We're a relational database, so IMHO we should spend time performance
> tuning RI. 
>
>   

I don't accept your initial assertion at all. But in any case, these are 
not mutually exclusive. Your work tuning RI will not obstruct mine in 
making enums more useful, nor vice versa.

cheers

andrew

Re: extensible enum types

From

Peter Geoghegan

Date:

19 June 2010, 23:42:58

>> Thoughts?
>
> enum types exist as an optimisation-by-avoidance of referential
> integrity.
>
> We're a relational database, so IMHO we should spend time performance
> tuning RI.

I take the view that they exist as a way of representing enumerations
of application/domain values - if it's hard coded in the application,
it's hard coded in the database by using an enum. This is why it isn't
that big a problem that they cannot be amended - they ought to be very
static, immutable values in the first place. I still think it would be
handy to be able to append new values though, and not have to ALTER
COLUMN USING to a new enum type. Besides, using enums in place of
lookup tables doesn't really make much sense as an optimisation.

It's very cool to be able to write queries like SELECT * FROM payments
WHERE payment_type = 'cash', rather than having to recall time and
again what the PK of cash is within your lookup table.

-- 
Regards,
Peter Geoghegan

Re: extensible enum types

From

"Joshua D. Drake"

Date:

20 June 2010, 00:51:14

On Sun, 2010-06-20 at 03:42 +0100, Peter Geoghegan wrote:
> >> Thoughts?

> It's very cool to be able to write queries like SELECT * FROM payments
> WHERE payment_type = 'cash', rather than having to recall time and
> again what the PK of cash is within your lookup table.

Ahem. That is what a natural key is for :)

Joshua D. Drake


>
> --
> Regards,
> Peter Geoghegan
>

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering

Re: extensible enum types

From

"Joshua D. Drake"

Date:

20 June 2010, 07:50:50

On Sun, 2010-06-20 at 03:42 +0100, Peter Geoghegan wrote:
> >> Thoughts?

> It's very cool to be able to write queries like SELECT * FROM payments
> WHERE payment_type = 'cash', rather than having to recall time and
> again what the PK of cash is within your lookup table.

Ahem. That is what a natural key is for :) 

Joshua D. Drake


> 
> -- 
> Regards,
> Peter Geoghegan
> 

-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering

Re: extensible enum types

From

Peter Geoghegan

Date:

20 June 2010, 12:02:48

> Ahem. That is what a natural key is for :)

Well, they have their own drawbacks that don't make them particularly
appealing to use with lookup tables to ape enums. How many lookup
tables have you seen in the wild with a natural key?

People sometimes represent things like US states as enums. This is
probably a mistake, because you cannot control or predict if there'll
be a new US state, unlikely though that me be. You *can* control, for
example, what types of payment your application can deal with, and
you'll probably have to hardcode differences in dealing with each
inside your application, which makes enums a good choice. In my
earlier example, in addition to 'cash', there is a value for
payment_type of 'credit_card' . There is a separate column in the
payments table that references a credit_cards table, because credit
cards are considered transitory. A check constraint enforces that
credit_cards_id is null or not null as appropriate.

I don't like the idea of having values in a table that aren't so much
data as an integral part of your application/database. I think it's
wrong-headed. That's why I am not in favour of your enums as a lookup
table wrapper suggestion.

-- 
Regards,
Peter Geoghegan

Re: extensible enum types

From

"Kevin Grittner"

Date:

20 June 2010, 12:24:48

Peter Geoghegan  wrote:
> How many lookup tables have you seen in the wild with a natural
> key?
Me?  Personally?  A few hundred.

> People sometimes represent things like US states as enums. This is
> probably a mistake, because you cannot control or predict if
> there'll be a new US state, unlikely though that me be.
More importantly, you're likely to need to associate properties with
the state.  Sales tax info, maybe a sales manager, etc.  A state
table can be a handy place to store things like that.
> I don't like the idea of having values in a table that aren't so
> much data as an integral part of your application/database.
Yep, exactly why natural keys should be used when possible.
-Kevin

Re: extensible enum types

From

Peter Geoghegan

Date:

20 June 2010, 13:32:15

>> People sometimes represent things like US states as enums. This is
>> probably a mistake, because you cannot control or predict if
>> there'll be a new US state, unlikely though that me be.
>
> More importantly, you're likely to need to associate properties with
> the state.  Sales tax info, maybe a sales manager, etc.  A state
> table can be a handy place to store things like that.

That's probably true, but if there was any question of needing to
associate such values with US states, it ought to be perfectly obvious
to everyone that enums are totally inappropriate. If that wasn't the
case, then their use is only highly questionable, at least IMHO. What
you're describing isn't really a lookup table as I understand the
term. It's just a table. Lookup tables typically have things in them
like the various possible states of another table's tuples. In my
experience, lookup tables generally have two columns, an integer PK
and a description/state.

>> I don't like the idea of having values in a table that aren't so
>> much data as an integral part of your application/database.
>
> Yep, exactly why natural keys should be used when possible.

The "not having to remember lookup value PK" point I made was very
much ancillary to my main point. Ideally, if you restore a schema-only
dump of your database, you shouldn't be missing anything that is
schema. Things like the possible states of a table's tuples are often
schema, not data, and should be treated as such.

--
Regards,
Peter Geoghegan

Re: extensible enum types

From

Andrew Dunstan

Date:

21 June 2010, 11:43:56

Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>   
>> Another thought: could we add a column to pg_type with a flag that's 
>> true if the oids are in sort order? Then the comparison routines could 
>> just look that up in the type cache and if it's true (as it often will 
>> be) just return the oid comparison.
>>     
>
> Well, having to do a cache lookup already makes it a couple orders of
> magnitude more expensive than an OID comparison.  However, it's hard to
> say how much that matters in terms of total application performance.
> We really could do with a bit of performance testing here ...
>
>             
>   

I have done some. The performance hit is fairly horrible. Adding cache 
lookups for the enum rows to the comarison routines made a REINDEX on a 
1m row table where the index is on an enum column (the enum has 500 
randomly ordered labels) jump from around 10s to around 70s. I think 
that probably rules out doing anything like this for the existing enum  
types. I think the most we can reasonably do there is to allow adding a 
label to the end of the enum list. I'm fairly resistant to doing 
something which will have a major performance impact, as I know there 
are users who are relying on enums for performce reasons. I'm also 
fairly resistant to doing things which will require table rewriting.

So the question then is: do we want to allow lots of flexibility for 
positioning new labels with significant degradation in comparison 
performace for a new enum variant, or have a new variant with some 
restrictions which probably won't impact most users but would have 
equivalent performance to the current enum family, or do nothing?

cheers

andrew

Re: extensible enum types

From

Tom Lane

Date:

21 June 2010, 12:06:22

Andrew Dunstan <andrew@dunslane.net> writes:
> Tom Lane wrote:
>> Well, having to do a cache lookup already makes it a couple orders of
>> magnitude more expensive than an OID comparison.  However, it's hard to
>> say how much that matters in terms of total application performance.
>> We really could do with a bit of performance testing here ...

> I have done some. The performance hit is fairly horrible. Adding cache 
> lookups for the enum rows to the comarison routines made a REINDEX on a 
> 1m row table where the index is on an enum column (the enum has 500 
> randomly ordered labels) jump from around 10s to around 70s.

Hmmm... that's bad, but I bet it's still less than the cost of comparing
NUMERICs.  Also, did you make any attempt to avoid repetitive cache
lookups by storing a pointer in fn_extra (cf array comparisons)?
        regards, tom lane

Re: extensible enum types

From

Andrew Dunstan

Date:

21 June 2010, 13:28:27


Tom Lane wrote:
>> Adding cache 
>> lookups for the enum rows to the comarison routines made a REINDEX on a 
>> 1m row table where the index is on an enum column (the enum has 500 
>> randomly ordered labels) jump from around 10s to around 70s.
>>     
>
> Hmmm... that's bad, but I bet it's still less than the cost of comparing
> NUMERICs.  Also, did you make any attempt to avoid repetitive cache
> lookups by storing a pointer in fn_extra (cf array comparisons)?
>
>             
>   

No. Will work on that. Thanks.

cheers

andrew

Re: extensible enum types

From

"Kevin Grittner"

Date:

21 June 2010, 14:04:56

Peter Geoghegan <peter.geoghegan86@gmail.com> wrote:
> In my experience, lookup tables generally have two columns, an
> integer PK and a description/state.
Eek.  If that's what you consider a lookup table, I wouldn't
advocate their use for anything.  Ever.  Period.
-Kevin

Re: extensible enum types

From

Simon Riggs

Date:

21 June 2010, 14:16:29

On Mon, 2010-06-21 at 12:04 -0500, Kevin Grittner wrote:
> Peter Geoghegan <peter.geoghegan86@gmail.com> wrote:
>  
> > In my experience, lookup tables generally have two columns, an
> > integer PK and a description/state.
>  
> Eek.  If that's what you consider a lookup table, I wouldn't
> advocate their use for anything.  Ever.  Period.

Do you mean you don't use relational modelling, or do you mean you would
never implement your physical database that way because of the
performance impact of RI on PostgreSQL? Or?

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Development, 24x7 Support, Training and Services

Re: extensible enum types

From

Andrew Dunstan

Date:

21 June 2010, 23:39:01


Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>
>> Tom Lane wrote:
>>
>>> Well, having to do a cache lookup already makes it a couple orders of
>>> magnitude more expensive than an OID comparison.  However, it's hard to
>>> say how much that matters in terms of total application performance.
>>> We really could do with a bit of performance testing here ...
>>>
>
>
>> I have done some. The performance hit is fairly horrible. Adding cache
>> lookups for the enum rows to the comarison routines made a REINDEX on a
>> 1m row table where the index is on an enum column (the enum has 500
>> regards, tom lane
>>
>> randomly ordered labels) jump from around 10s to around 70s.
>>
>
> Hmmm... that's bad, but I bet it's still less than the cost of comparing
> NUMERICs.  Also, did you make any attempt to avoid repetitive cache
> lookups by storing a pointer in fn_extra (cf array comparisons)?
>
>
>

OK, making a bit of progress. Attached is a sort of proof of concept
patch that does that. It stores a bsearchable list of {enum, sort_order}
pairs in fn_extra, along with a flag that indicates if the oids are  in
fact ordered. This flag, which would be maintained in and populated from
pg_type, would allow avoidance of any significant performance penalty in
such cases by relying on straight Oid comparison. We'd probably need to
keep a count of labels in pg_type too so we could size the cache
appropriately. This approach just about buys the best of both worlds.
The execution time for the test mentioned above is down from around 70s
to around 20s. I think for a worst case that's not too bad, especially
when it is completely avoided unless we have perturbed the sort order.

If anyone wants to play along, my test set is available at
<http://developer.postgresql.org/~adunstan/enumtest.dmp> It's about 8.5Mb.

cheers

andrew
Index: enum.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/adt/enum.c,v
retrieving revision 1.11
diff -c -r1.11 enum.c
*** enum.c    26 Feb 2010 02:01:08 -0000    1.11
--- enum.c    22 Jun 2010 02:16:48 -0000
***************
*** 14,19 ****
--- 14,20 ----
  #include "postgres.h"

  #include "catalog/pg_enum.h"
+ #include "catalog/pg_type.h"
  #include "fmgr.h"
  #include "utils/array.h"
  #include "utils/builtins.h"
***************
*** 22,27 ****
--- 23,52 ----
  #include "libpq/pqformat.h"
  #include "miscadmin.h"

+ typedef struct
+ {
+     Oid      enum_oid;
+     uint32   sort_order;
+ } enum_sort;
+
+ typedef struct
+ {
+     bool      oids_are_sorted;
+     int       sort_list_length;
+     enum_sort sort_order_list[1024];
+ } enum_sort_cache;
+
+
+ static int
+ enum_sort_cmp(void * es1, void * es2)
+ {
+     enum_sort *p1, *p2;
+     p1 = (enum_sort *)es1;
+     p2 = (enum_sort *)es2;
+     return p1->enum_oid - p2->enum_oid;
+ }
+
+

  static ArrayType *enum_range_internal(Oid enumtypoid, Oid lower, Oid upper);
  static int    enum_elem_cmp(const void *left, const void *right);
***************
*** 155,167 ****

  /* Comparison functions and related */

  Datum
  enum_lt(PG_FUNCTION_ARGS)
  {
      Oid            a = PG_GETARG_OID(0);
      Oid            b = PG_GETARG_OID(1);

!     PG_RETURN_BOOL(a < b);
  }

  Datum
--- 180,283 ----

  /* Comparison functions and related */

+ static inline int
+ enum_ccmp(Oid arg1, Oid arg2, FunctionCallInfo fcinfo)
+ {
+
+     enum_sort_cache * mycache;
+     enum_sort *es1, *es2;
+     int sort1, sort2;
+     bool added = false;
+     HeapTuple    tup;
+     Form_pg_enum en;
+     Oid typeoid;
+     Form_pg_type typ;
+
+     if (arg1 == arg2)
+         return 0;
+
+     mycache = (enum_sort_cache *) fcinfo->flinfo->fn_extra;
+     if (mycache == NULL )
+     {
+         fcinfo->flinfo->fn_extra = MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,
+                                                       sizeof(enum_sort_cache));
+         mycache = (enum_sort_cache *) fcinfo->flinfo->fn_extra;
+         mycache->sort_list_length = 1;
+         tup = SearchSysCache1(ENUMOID, ObjectIdGetDatum(arg1));
+         en = (Form_pg_enum) GETSTRUCT(tup);
+         mycache->sort_order_list[0].enum_oid = arg1;
+         mycache->sort_order_list[0].sort_order = arg1;
+         typeoid = en->enumtypid;
+         ReleaseSysCache(tup);
+         tup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(typeoid));
+         typ = (Form_pg_type)  GETSTRUCT(tup);
+         if (typ->typtype != 'e')
+             elog(ERROR,"wrong type for oid %u",typeoid);
+         /* XXX TODO fill in oids_are_sorted property from type tuple here */
+         mycache->oids_are_sorted = false;
+         ReleaseSysCache(tup);
+     }
+
+     if (mycache->oids_are_sorted)
+         return arg1 - arg2;
+
+     es1 = bsearch(&arg1,mycache->sort_order_list,mycache->sort_list_length,
+                   sizeof(enum_sort),enum_sort_cmp);
+     es2 = bsearch(&arg2,mycache->sort_order_list,mycache->sort_list_length,
+                   sizeof(enum_sort),enum_sort_cmp);
+
+     if (es1 == NULL)
+     {
+
+         tup = SearchSysCache1(ENUMOID, ObjectIdGetDatum(arg1));
+         en = (Form_pg_enum) GETSTRUCT(tup);
+         mycache->sort_order_list[mycache->sort_list_length].enum_oid = arg1;
+         sort1 = arg1; /* XXX should be en->enumsort */
+         mycache->sort_order_list[mycache->sort_list_length].sort_order =
+             sort1;
+         ReleaseSysCache(tup);
+         mycache->sort_list_length ++;
+         added = true;
+     }
+     else
+     {
+         /* already in cache */
+         sort1 = es1->sort_order;
+     }
+
+     if (es2 == NULL)
+     {
+
+         tup = SearchSysCache1(ENUMOID, ObjectIdGetDatum(arg2));
+         en = (Form_pg_enum) GETSTRUCT(tup);
+         sort2 = arg2; /* XXX should be en->enumsort */
+         mycache->sort_order_list[mycache->sort_list_length].enum_oid = arg2;
+         mycache->sort_order_list[mycache->sort_list_length].sort_order =
+             sort2;
+         ReleaseSysCache(tup);
+         mycache->sort_list_length ++;
+         added = true;
+     }
+     else
+     {
+         /* already in cache */
+         sort2 = es2->sort_order;
+     }
+
+     if (added)
+         qsort(mycache->sort_order_list,mycache->sort_list_length,
+               sizeof(enum_sort),enum_sort_cmp);
+
+     return sort1 - sort2;
+ }
+
  Datum
  enum_lt(PG_FUNCTION_ARGS)
  {
      Oid            a = PG_GETARG_OID(0);
      Oid            b = PG_GETARG_OID(1);

!     PG_RETURN_BOOL(enum_ccmp(a,b,fcinfo) < 0);
  }

  Datum
***************
*** 170,176 ****
      Oid            a = PG_GETARG_OID(0);
      Oid            b = PG_GETARG_OID(1);

!     PG_RETURN_BOOL(a <= b);
  }

  Datum
--- 286,292 ----
      Oid            a = PG_GETARG_OID(0);
      Oid            b = PG_GETARG_OID(1);

!     PG_RETURN_BOOL(enum_ccmp(a,b,fcinfo) <= 0);
  }

  Datum
***************
*** 197,203 ****
      Oid            a = PG_GETARG_OID(0);
      Oid            b = PG_GETARG_OID(1);

!     PG_RETURN_BOOL(a >= b);
  }

  Datum
--- 313,319 ----
      Oid            a = PG_GETARG_OID(0);
      Oid            b = PG_GETARG_OID(1);

!     PG_RETURN_BOOL(enum_ccmp(a,b,fcinfo) >= 0);
  }

  Datum
***************
*** 206,212 ****
      Oid            a = PG_GETARG_OID(0);
      Oid            b = PG_GETARG_OID(1);

!     PG_RETURN_BOOL(a > b);
  }

  Datum
--- 322,328 ----
      Oid            a = PG_GETARG_OID(0);
      Oid            b = PG_GETARG_OID(1);

!     PG_RETURN_BOOL(enum_ccmp(a,b,fcinfo) > 0);
  }

  Datum
***************
*** 233,242 ****
      Oid            a = PG_GETARG_OID(0);
      Oid            b = PG_GETARG_OID(1);

!     if (a > b)
!         PG_RETURN_INT32(1);
!     else if (a == b)
          PG_RETURN_INT32(0);
      else
          PG_RETURN_INT32(-1);
  }
--- 349,358 ----
      Oid            a = PG_GETARG_OID(0);
      Oid            b = PG_GETARG_OID(1);

!     if (a == b)
          PG_RETURN_INT32(0);
+     else if (enum_ccmp(a,b,fcinfo) > 0)
+         PG_RETURN_INT32(1);
      else
          PG_RETURN_INT32(-1);
  }

Re: extensible enum types

From

Bruce Momjian

Date:

23 June 2010, 19:00:21

Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
> > Robert Haas wrote:
> >> I like the idea of being able to modify enums on the fly, but I'm
> >> skeptical of an implementation that won't always work.  Maybe it's
> >> still better than what we have now, but it seems grotty.
> 
> > I'd be perfectly happy to hear a reasonable alternative.
> 
> Insert a sort order column into pg_enum, and rearrange the values in
> that whenever the user wants to add a new value in a particular place.
> You give up cheap comparisons in exchange for flexibility.  I think lots
> of people would accept that tradeoff, especially if they could make it
> per-datatype.
> 
> One point here is that you'd have to restrict the rearrangements so that
> the relative sort order of existing values never changes, else you break
> (for example) indexes on columns of that type.

Sorry to be commenting late, but don't most people want to add to the
end or beginning of the enum list, rather than in the middle, and can't
we support that already?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + None of us is going to be here forever. +

Re: extensible enum types

From

Tom Lane

Date:

23 June 2010, 19:03:55

Bruce Momjian <bruce@momjian.us> writes:
> Sorry to be commenting late, but don't most people want to add to the
> end or beginning of the enum list, rather than in the middle, and can't
> we support that already?

We could allow adding a value, but we couldn't guarantee where it would
appear in the type's sort ordering.  Depending on the current OID
counter it would usually show up either at the end or the beginning.
I think the general feeling is that this is too implementation-dependent
to be acceptable.
        regards, tom lane

Re: extensible enum types

From

Bruce Momjian

Date:

23 June 2010, 19:10:05

Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Sorry to be commenting late, but don't most people want to add to the
> > end or beginning of the enum list, rather than in the middle, and can't
> > we support that already?
> 
> We could allow adding a value, but we couldn't guarantee where it would
> appear in the type's sort ordering.  Depending on the current OID
> counter it would usually show up either at the end or the beginning.
> I think the general feeling is that this is too implementation-dependent
> to be acceptable.

Well, we don't need the enum value to map into the entire oid range. 
Can't we just add one to the top-most value and see if there is a
conflict?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + None of us is going to be here forever. +

Re: extensible enum types

From

Tom Lane

Date:

23 June 2010, 19:12:44

Bruce Momjian <bruce@momjian.us> writes:
> Well, we don't need the enum value to map into the entire oid range. 
> Can't we just add one to the top-most value and see if there is a
> conflict?

If you don't use the OID counter to generate the new value, you're going
to have problems with race conditions.  There's also that small chance
that there is no free value before 2^32.

The bottom line here is not wanting a feature that "usually" works but
fails once in awhile on the basis of conditions the user can't control.
        regards, tom lane

Re: extensible enum types

From

Andrew Dunstan

Date:

23 June 2010, 20:30:42

Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
>   
>> Well, we don't need the enum value to map into the entire oid range. 
>> Can't we just add one to the top-most value and see if there is a
>> conflict?
>>     
>
> If you don't use the OID counter to generate the new value, you're going
> to have problems with race conditions.  There's also that small chance
> that there is no free value before 2^32.
>
> The bottom line here is not wanting a feature that "usually" works but
> fails once in awhile on the basis of conditions the user can't control.
>
>   

Yeah, what I'm now hoping to be able to do, following good suggestions 
from Tom, is to provide a way to get virtually no degradation in bulk 
comparison performance in the common case where any additions have been 
made at the end of the list with no oid wraparound, and acceptable 
performance otherwise, but provide for an ability to add new values at 
arbitrary places in the ordering, with no limit.

If we can do that why would we want less?

cheers

andrew