Thread: Re: [COMMITTERS] pgsql: Update: < * Allow adding enumerated values to an existing

Andrew Dunstan wrote:
>
> Bruce Momjian wrote:
>> Log Message:
>> -----------
>> Update:
>>
>> < * Allow adding enumerated values to an existing enumerated data
>>   
>>> * Allow adding/removing enumerated values to an existing enumerated data
>
> Where did this come from? Adding values anywhere except on the end of  
> the enumeration list will be fraught with danger, as will removing them.  
> In essence, either operation would entail rewriting every table that  
> used the type. Anything else carries a major risk of corruption. That  
> seems like a pretty bad idea.

We already support rewriting tables ... (albeit only one at a time, I
admit.  Doing it for more than one can cause deadlocks).

Still, if the user wants to pay the cost, why should we prohibit it?

Perhaps we should add a pg_depend entries on tables using the type (or
we have them already), and disallow modifying it unless there are no
users of it.  So the user can create a new enum with the options he
wants, then rewrite his tables one by one, then drop the original.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sat, Apr 26, 2008 at 4:27 AM, Alvaro Herrera  wrote:
> Andrew Dunstan wrote:
>  >
>  > Bruce Momjian wrote:
>  >> Log Message:
>  >> -----------
>  >> Update:
>  >>
>  >>   >>
>  >>> * Allow adding/removing enumerated values to an existing enumerated data
>  >
>  > Where did this come from? Adding values anywhere except on the end of
>  > the enumeration list will be fraught with danger, as will removing them.
>  > In essence, either operation would entail rewriting every table that
>  > used the type. Anything else carries a major risk of corruption. That
>  > seems like a pretty bad idea.
>
>  We already support rewriting tables ... (albeit only one at a time, I
>  admit.  Doing it for more than one can cause deadlocks).
>
>  Still, if the user wants to pay the cost, why should we prohibit it?
>

I agree with Alvaro's sentiment here, but it  does seem likely that
adding an value to the end of an enum list is much lower-hanging fruit
than add/remove.

Has anyone had a close look at how hard it would be allow just the
"add to the end" capability?

Cheers,
BJ
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: http://getfiregpg.org

iD8DBQFIEiWO5YBsbHkuyV0RAo5pAKDMQ7aAbJJyIe74c+PacXVXg5chXACdEnv3
sFiNsSf193/C9HpW5UVhYWs=
=jOzI
-----END PGP SIGNATURE-----


"Brendan Jurd" <direvus@gmail.com> writes:
> Has anyone had a close look at how hard it would be allow just the
> "add to the end" capability?

The problem is you can't guarantee anything about the ordering of the
new value relative to the old ones.  The OID it's assigned might be
after them, or before them if the OID counter has wrapped around,
or (with much smaller probability) between two existing ones.

This is something we consciously gave up when we selected the current
ENUM implementation.
        regards, tom lane


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sat, Apr 26, 2008 at 6:02 AM, Tom Lane  wrote:
> "Brendan Jurd"  writes:
>  > Has anyone had a close look at how hard it would be allow just the
>  > "add to the end" capability?
>
>  The problem is you can't guarantee anything about the ordering of the
>  new value relative to the old ones.  The OID it's assigned might be
>  after them, or before them if the OID counter has wrapped around,
>  or (with much smaller probability) between two existing ones.
>

I see.  So to add a value on to the end with guaranteed ordering, you
would have to rebuild the enum from scratch anyway.

Then there's no technical difference at all in implementing "add to
the end" and "add/remove".

Thanks for the answer.

Cheers,
BJ
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: http://getfiregpg.org

iD8DBQFIEjrO5YBsbHkuyV0RAogmAKC3u8wogvrUNSfFUx/PbKyS6U2/DgCgjDSu
z+VNazq7LiRdiU4oUvKL0jc=
=obXq
-----END PGP SIGNATURE-----


On Sat, Apr 26, 2008 at 12:10 AM, Brendan Jurd <direvus@gmail.com> wrote:
>  Has anyone had a close look at how hard it would be allow just the
>  "add to the end" capability?

If the OIDs haven't wrapped around since the enum was created, it's
trivial. If they have, well someone with more OID-fu than me will have
to explain what we'd have to do to guarantee getting a new OID higher
than a certain value. And if your highest enum oid happens to be
4^32-1, you're in trouble :).

I wonder if it's worth revisiting the decision to save enums on disk
as oids. The very first idea that I had was to have an enum value as
the combination of both an enum id and the ordinal value. We would
presumably make both say 16bits so we could still be be passed by
value. This would restrict us to 2^16 enum types per database and 2^16
values per type, but if anyone is getting within an order of magnitude
of either limit I'd be very interested in seeing what they're doing.

The nice thing about the above approach is that we could space out the
ordinal values so as to allow people to insert a fair number of extra
values before forcing a rewrite of the table. The only thing we really
couldn't handle that way would be reordering - we'd need an extra
layer of indirection for that, which would have some performance
penalties. None of the standard operators for enums require a syscache
lookup currently, only I/O does,

Cheers

Tom


Brendan Jurd escribió:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> On Sat, Apr 26, 2008 at 6:02 AM, Tom Lane  wrote:
> > "Brendan Jurd"  writes:
> >  > Has anyone had a close look at how hard it would be allow just the
> >  > "add to the end" capability?
> >
> >  The problem is you can't guarantee anything about the ordering of the
> >  new value relative to the old ones.  The OID it's assigned might be
> >  after them, or before them if the OID counter has wrapped around,
> >  or (with much smaller probability) between two existing ones.
> 
> I see.  So to add a value on to the end with guaranteed ordering, you
> would have to rebuild the enum from scratch anyway.

The other alternative is to make the system generate the new OIDs in
such a way that the ordering is preserved.  This, of course, has a lot
of problems of its own.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


On Fri, Apr 25, 2008 at 11:57 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
>  We already support rewriting tables ... (albeit only one at a time, I
>  admit.  Doing it for more than one can cause deadlocks).
>
>  Still, if the user wants to pay the cost, why should we prohibit it?

One scenario I'm not happy about is this: the friendly db admin has
happily added an extra value to the end before and the operation has
been a snap - no rewriting required. But this time either a) oid
wraparound has occurred, b) she's inserted one or c) she's reordered
them. Bam - we start rewriting the entire database. That's not the
kind of surprise I like giving people, and the current situation of
either don't allow updates at all, or the alternative to surprises of
always rewrite everything seem pretty deficient. And I don't want to
only allow updates if they won't cause a rewrite, it's
nondeterministic.

I've already suggested some alternatives in the reply to Brendan that
would solve some of this, but I suppose another gross-seeming way to
stop surprise rewrites would be to never do one unless given a FORCE
REWRITE clause on the ALTER statement or something like that, and fail
if a rewrite is required not specified.

>  So the user can create a new enum with the options he
>  wants, then rewrite his tables one by one, then drop the original.

They can pretty much do this now, they just need to define an implicit
cast I think.

Cheers

Tom


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sat, Apr 26, 2008 at 6:19 AM, Tom Dunstan  wrote:
>  I wonder if it's worth revisiting the decision to save enums on disk
>  as oids. The very first idea that I had was to have an enum value as
>  the combination of both an enum id and the ordinal value.

That's very intuitive.

>  The nice thing about the above approach is that we could space out the
>  ordinal values so as to allow people to insert a fair number of extra
>  values before forcing a rewrite of the table. The only thing we really
>  couldn't handle that way would be reordering - we'd need an extra
>  layer of indirection for that, which would have some performance
>  penalties. None of the standard operators for enums require a syscache
>  lookup currently, only I/O does,
>

Wouldn't you still be able to do a reorder with a table rewrite?

Cheers,
BJ
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: http://getfiregpg.org

iD8DBQFIEkBC5YBsbHkuyV0RAggsAJ0btEoQTTwiakLPRg/sixcXFCroLwCgleri
Q/JebP+AWErEl+w2+QHL16o=
=YUGG
-----END PGP SIGNATURE-----


"Tom Dunstan" <pgsql@tomd.cc> writes:
> I wonder if it's worth revisiting the decision to save enums on disk
> as oids. The very first idea that I had was to have an enum value as
> the combination of both an enum id and the ordinal value. We would
> presumably make both say 16bits so we could still be be passed by
> value. This would restrict us to 2^16 enum types per database and 2^16
> values per type, but if anyone is getting within an order of magnitude
> of either limit I'd be very interested in seeing what they're doing.

I seem to remember that we discussed that and rejected it, but I don't
remember the reasoning...
        regards, tom lane


"Tom Dunstan" <pgsql@tomd.cc> writes:
> One scenario I'm not happy about is this: the friendly db admin has
> happily added an extra value to the end before and the operation has
> been a snap - no rewriting required. But this time either a) oid
> wraparound has occurred, b) she's inserted one or c) she's reordered
> them. Bam - we start rewriting the entire database. That's not the
> kind of surprise I like giving people, and the current situation of
> either don't allow updates at all, or the alternative to surprises of
> always rewrite everything seem pretty deficient. And I don't want to
> only allow updates if they won't cause a rewrite, it's
> nondeterministic.

If we take OIDs out of the picture it wouldn't be nondeterministic.

I think with something like your 16bit/16bit design, and say ten free
codes between each original assignment, it'd be okay to not support the
rewriting stuff at all.  The frequency with which people would hit the
restriction would be so low it wouldn't be worth supporting the code for
it, especially since we couldn't do it any more efficiently than a
manual ALTER COLUMN TYPE replacement would.
        regards, tom lane


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sat, Apr 26, 2008 at 6:33 AM, Tom Dunstan  wrote:
>  One scenario I'm not happy about is this: the friendly db admin has
>  happily added an extra value to the end before and the operation has
>  been a snap - no rewriting required. But this time either a) oid
>  wraparound has occurred, b) she's inserted one or c) she's reordered
>  them. Bam - we start rewriting the entire database.

As long as the documentation is candid about this, I don't think it's
a show-stopper.  e.g.:
   N.B.  Rearranging an ENUM will usually be a simple operation, but
in $CERTAIN_CASES may require a rewrite of tables using the ENUM,
which is time consuming and locks the table against writing ...

You'd probably also want a "NOTICE: Change to ENUM will require
rewriting of tables." to be emitted when this happens.

>
>  I've already suggested some alternatives in the reply to Brendan that
>  would solve some of this, but I suppose another gross-seeming way to
>  stop surprise rewrites would be to never do one unless given a FORCE
>  REWRITE clause on the ALTER statement or something like that, and fail
>  if a rewrite is required not specified.
>

That would be okay too, but I think I'd prefer proceeding with the
rewrite after emitting a NOTICE.  If the db admin decides not to go
ahead, or wait to do it after hours, she can always hit ^C, right?

Cheers,
BJ
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: http://getfiregpg.org

iD8DBQFIEkO+5YBsbHkuyV0RAttIAJ9TNhNDN8SAsfyAR5MY9lppPyeWSQCfYOSs
kG25F0V44QqTZ4HMAWXL5JI=
=tG5q
-----END PGP SIGNATURE-----


"Brendan Jurd" <direvus@gmail.com> writes:
> On Sat, Apr 26, 2008 at 6:33 AM, Tom Dunstan  wrote:
>> I've already suggested some alternatives in the reply to Brendan that
>> would solve some of this, but I suppose another gross-seeming way to
>> stop surprise rewrites would be to never do one unless given a FORCE
>> REWRITE clause on the ALTER statement or something like that, and fail
>> if a rewrite is required not specified.

> That would be okay too, but I think I'd prefer proceeding with the
> rewrite after emitting a NOTICE.  If the db admin decides not to go
> ahead, or wait to do it after hours, she can always hit ^C, right?

The more I think about it, the less I think that we want to support such
a feature at all.  Consider that it'd require taking a fairly strong
lock (surely at least locking out other writers) on every table using
the enum, in who-knows-what order.  The odds of completing without
deadlock seem to be right about nil.
        regards, tom lane


On Sat, Apr 26, 2008 at 2:07 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>  > The very first idea that I had was to have an enum value as
>  > the combination of both an enum id and the ordinal value.
>
>  I seem to remember that we discussed that and rejected it, but I don't
>  remember the reasoning...

I don't think there was any terribly strong objection. IIRC I
originally proposed trying to fit everything into 2 bytes, you
objected to that as "unnecessary bit-shaving" and proposed 8 bytes, I
didn't want to give up pass-by-value, plus my initial pg_enum design
was rather denormalized - the current solution was a compromise that
fixed that and kept everyone happy. :) But we didn't really consider
updates too carefully. Maybe it was just a bit too cute a solution.

So two alternative proposals, both with a 2 byte "enum id" and a 2 byte "value":

1 - We space the values out as evenly as we can across the 65000ish
range and allow people to delete, insert and append, but not reorder.
If they do the above gratuitously we might have to do a rewrite, but
they'll have to get fairly busy to do it. Rewrite would be required
for reorderings.

2- We totally give up the idea of storing a value on disk that is
directly comparable (other than equality), and simply number from zero
up, using that number to index into an array (or use as syscache key
or whatever) containing the real ordering information. We can then
reorder or do any other operations to our heart's content.

I'm actually favouring option 2 - I think it can be done in such a way
as to not be much of an overhead compared to the status quo, and you
know that if we don't implement proper reordering now, someone will
ask for it, and we'll be having this discussion at a similar time
after 8.4 goes out.

I'm happy to work on a patch for this if it meets general approval.

Cheers

Tom


Oops, sorry for the crossed emails, slight delay in my main being received.

On Sat, Apr 26, 2008 at 2:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>  I think with something like your 16bit/16bit design, and say ten free
>  codes between each original assignment, it'd be okay to not support the
>  rewriting stuff at all.

Yeah. I'm more concerned about someone wanting reordering, hence the
second option in my other mail.

Cheers

Tom


"Tom Dunstan" <pgsql@tomd.cc> writes:
> 1 - We space the values out as evenly as we can across the 65000ish
> range and allow people to delete, insert and append, but not reorder.
> If they do the above gratuitously we might have to do a rewrite, but
> they'll have to get fairly busy to do it. Rewrite would be required
> for reorderings.

> 2- We totally give up the idea of storing a value on disk that is
> directly comparable (other than equality), and simply number from zero
> up, using that number to index into an array (or use as syscache key
> or whatever) containing the real ordering information. We can then
> reorder or do any other operations to our heart's content.

> I'm actually favouring option 2 -

I'm not ... it strikes me that it will add implementation complexity and
runtime overhead for a feature that two days ago we didn't think we
needed at all, and IMHO one we still shouldn't be thinking to expend a
lot of work on.

I like #1 with no rewrite support.  That strikes me as covering 99%
of the requirements with 10% of the work.

Further, as already noted, if you do have to rewrite then a series of
manual ALTER COLUMN TYPE operations would probably be a *better* answer
than a monolithic implementation, because of the locking problems
involved in doing it in one transaction.  (Oh, and don't forget the disk
space problem: double the disk space for every table involved,
simultaneously.)
        regards, tom lane

PS: no, I do *not* want to hear any proposals for ALTER TYPE
CONCURRENTLY ;-)



Tom Dunstan wrote:
> So two alternative proposals, both with a 2 byte "enum id" and a 2 byte "value":
>
> 1 - We space the values out as evenly as we can across the 65000ish
> range and allow people to delete, insert and append, but not reorder.
> If they do the above gratuitously we might have to do a rewrite, but
> they'll have to get fairly busy to do it. Rewrite would be required
> for reorderings.
>   

Or else we just error out in such cases. As Tom Lane suggests, rewriting 
has some nasty deadlock possibilities.

You always have the option of creating a new enum type and moving each 
affected column to that type.

> 2- We totally give up the idea of storing a value on disk that is
> directly comparable (other than equality), and simply number from zero
> up, using that number to index into an array (or use as syscache key
> or whatever) containing the real ordering information. We can then
> reorder or do any other operations to our heart's content.
>
> I'm actually favouring option 2 - I think it can be done in such a way
> as to not be much of an overhead compared to the status quo, and you
> know that if we don't implement proper reordering now, someone will
> ask for it, and we'll be having this discussion at a similar time
> after 8.4 goes out.
>
>
>   

Being able simply to order by the oid value is fast. That's one of the 
current benefits. So I think we'd need some benchmarking to show that 
this wouldn't slow things down.

cheers


andrew


On Sat, Apr 26, 2008 at 2:51 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>  I'm not ... it strikes me that it will add implementation complexity and
>  runtime overhead for a feature that two days ago we didn't think we
>  needed at all, and IMHO one we still shouldn't be thinking to expend a
>  lot of work on.

Well, I *did* think it was necessary, I just hadn't spent the effort
in coming up with a solution. And on the effort side, I'm not going to
be hacking the optimizer any day soon. :)

>  I like #1 with no rewrite support.  That strikes me as covering 99%
>  of the requirements with 10% of the work.

>  Further, as already noted, if you do have to rewrite then a series of
>  manual ALTER COLUMN TYPE operations would probably be a *better* answer
>  than a monolithic implementation, because of the locking problems
>  involved in doing it in one transaction.

I don't understand this if it's calling option 2 the monolithic
implementation. I was intending that the values be permanent tokens if
you like, so that ZERO rewriting would be required for any types of
modification. So I don't see where locking comes in. I don't want
rewriting either.

Hmm, maybe I haven't explained it properly. Here's an example:

CREATE TYPE thing AS ENUM('vegetable', 'animal');

Hypothetical new pg_enum:enum_id | value | order | label
---------+-------+-------+-----------      0 |     0 |     0 | vegetable      0 |     1 |     1 | animal


ALTER TYPE thing AS ENUM('animal', 'mineral', 'vegetable');

pg_enum:enum_id | value | order | label
---------+-------+-------+-----------      0 |     0 |     2 | vegetable      0 |     1 |     0 | animal      0 |     2
|    1 | mineral
 

So we store the 'value' column on disk, and it never changes. The
downside is that we now have to look up the order when we call a
non-equality operator, but I reckon we can pretty efficiently cache
that, so the lookup is just a couple of array index operations. The
upside is that we can reorder, and we'll never run out of values
"in-between" existing ones.

Anyway, sorry if all of the above *was* clear and I just misunderstood
the comment. If there's consensus to go with option 1 I'll pursue that
path. It's much less of a change to go from option 1 to option 2 than
the current code to either of them anyway, so doing some benchmarking
of both options shouldn't be hard if I want to.

Cheers

Tom


"Tom Dunstan" <pgsql@tomd.cc> writes:
> On Sat, Apr 26, 2008 at 2:51 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Further, as already noted, if you do have to rewrite then a series of
>> manual ALTER COLUMN TYPE operations would probably be a *better* answer
>> than a monolithic implementation, because of the locking problems
>> involved in doing it in one transaction.

> I don't understand this if it's calling option 2 the monolithic
> implementation.

No, I was imagining an option-1 implementation trying to support
rewriting of all affected tables in a single "monolithic" command.
        regards, tom lane


Andrew Dunstan escribió:

> Tom Dunstan wrote:
>> So two alternative proposals, both with a 2 byte "enum id" and a 2 byte "value":
>>
>> 1 - We space the values out as evenly as we can across the 65000ish
>> range and allow people to delete, insert and append, but not reorder.
>> If they do the above gratuitously we might have to do a rewrite, but
>> they'll have to get fairly busy to do it. Rewrite would be required
>> for reorderings.
>
> Or else we just error out in such cases. As Tom Lane suggests, rewriting  
> has some nasty deadlock possibilities.
>
> You always have the option of creating a new enum type and moving each  
> affected column to that type.

Another alternative would be internally creating a different temporary
enum, rewriting the tables one by one each on its own transaction, and
finish by dropping the original enum and renaming the temporary one.
This solves the deadlock problem.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Alvaro Herrera wrote:
> Andrew Dunstan escribió:
>
>   
>> Tom Dunstan wrote:
>>     
>>> So two alternative proposals, both with a 2 byte "enum id" and a 2 byte "value":
>>>
>>> 1 - We space the values out as evenly as we can across the 65000ish
>>> range and allow people to delete, insert and append, but not reorder.
>>> If they do the above gratuitously we might have to do a rewrite, but
>>> they'll have to get fairly busy to do it. Rewrite would be required
>>> for reorderings.
>>>       
>> Or else we just error out in such cases. As Tom Lane suggests, rewriting  
>> has some nasty deadlock possibilities.
>>
>> You always have the option of creating a new enum type and moving each  
>> affected column to that type.
>>     
>
> Another alternative would be internally creating a different temporary
> enum, rewriting the tables one by one each on its own transaction, and
> finish by dropping the original enum and renaming the temporary one.
> This solves the deadlock problem.
>
>   

What happens when someone tries to join two of the tables, one that has 
been converted and one that hasn't? You might not have deadlock, but you 
won't have type integrity either, ISTM.

cheers

andrew


Andrew Dunstan <andrew@dunslane.net> writes:
> Alvaro Herrera wrote:
>> Another alternative would be internally creating a different temporary
>> enum, rewriting the tables one by one each on its own transaction, and
>> finish by dropping the original enum and renaming the temporary one.
>> This solves the deadlock problem.

> What happens when someone tries to join two of the tables, one that has 
> been converted and one that hasn't? You might not have deadlock, but you 
> won't have type integrity either, ISTM.

Not to mention the mess you'll be left with if the process fails after
converting some of the tables.
        regards, tom lane


Re: Re: [COMMITTERS] pgsql: Update: < * Allow adding enumerated values to an existing

From
"Zeugswetter Andreas OSB SD"
Date:
> I don't understand this if it's calling option 2 the monolithic
> implementation. I was intending that the values be permanent tokens if
> you like, so that ZERO rewriting would be required for any types of
> modification. So I don't see where locking comes in. I don't want
> rewriting either.

I think you are not considering existing btree indexes here
(for the reordering case) ?

So +1 on a solution that has naturally sorting keys (e.g. your 1).

Andreas


On Mon, Apr 28, 2008 at 2:24 PM, Zeugswetter Andreas OSB SD
<Andreas.Zeugswetter@s-itsolutions.at> wrote:
>  I think you are not considering existing btree indexes here
>  (for the reordering case) ?

You're quite right, I had not considered existing indexes. There's no
easy way to deal with that other than rebuilding them. :(

I *still* think someone with a big table would prefer to drop/create
their indexes rather than go through a nasty ALTER COLUMN which would
have the seemingly much worse outcome of rebuilding their whole table
AND any indexes. But whatever - I'll implement option 1 and submit it,
as a marked improvement over the status quo. If I can make option 2
work fast enough I'll consider submitting it as a feature improvement
thereafter, but given the general consensus for option 1 I'm not
pushing option 2 much any more.

Cheers

Tom