Thread: XX000: enum value 117721 not found in cache for enum enumcrash

XX000: enum value 117721 not found in cache for enum enumcrash

From
Andres Freund
Date:
Hi,

Currently its possible to cause transactions to fail with ALTER ENUM ADD 
AFTER/BEFORE:

psql 1:

CREATE TYPE enumcrash AS ENUM('a', 'b');
CREATE FUNCTION randenum() RETURNS enumcrash LANGUAGE sql AS $$SELECT * FROM 
unnest(enum_range('a'::enumcrash)) ORDER BY random() LIMIT 1$$;
CREATE TABLE enumcrash_table(id serial primary key, val enumcrash);
CREATE INDEX enumcrash_table__val__id ON enumcrash_table (val, id);
INSERT INTO enumcrash_table (val) SELECT randenum() FROM generate_series(1, 
10000);INSERT 0 10000

psql 2:
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
INSERT INTO enumcrash_table (val) SELECT randenum() FROM generate_series(1, 
10000);

psql 1:
ALTER TYPE enumcrash ADD VALUE 'a_1' AFTER 'a';
INSERT INTO enumcrash_table (val) SELECT randenum() FROM generate_series(1, 
10000);

psql 2:
INSERT INTO enumcrash_table (val) SELECT randenum() FROM generate_series(1, 
10000);
ERROR:  XX000: enum value 117745 not found in cache for enum enumcrash
LOCATION:  compare_values_of_enum, typcache.c:957

This is not surprising. psql 2's backend finds rows in the index with enum 
values that are not visible in its mvcc snapshot. That mvcc snapshot is needed 
because a_1 is an enum value with an uneven numbered oid because its inserted 
after the initial creation.

Do we consider that something that needs to be fixed or just something to 
document? I can't think of a non-intrusive fix right now.

Greetings,

Andres
-- 
Andres Freund        http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


Re: XX000: enum value 117721 not found in cache for enum enumcrash

From
Robert Haas
Date:
On Sat, Jun 30, 2012 at 5:51 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> Currently its possible to cause transactions to fail with ALTER ENUM ADD
> AFTER/BEFORE:
>
> psql 1:
>
> CREATE TYPE enumcrash AS ENUM('a', 'b');
> CREATE FUNCTION randenum() RETURNS enumcrash LANGUAGE sql AS $$SELECT * FROM
> unnest(enum_range('a'::enumcrash)) ORDER BY random() LIMIT 1$$;
> CREATE TABLE enumcrash_table(id serial primary key, val enumcrash);
> CREATE INDEX enumcrash_table__val__id ON enumcrash_table (val, id);
> INSERT INTO enumcrash_table (val) SELECT randenum() FROM generate_series(1,
> 10000);INSERT 0 10000
>
> psql 2:
> BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
> INSERT INTO enumcrash_table (val) SELECT randenum() FROM generate_series(1,
> 10000);
>
> psql 1:
> ALTER TYPE enumcrash ADD VALUE 'a_1' AFTER 'a';
> INSERT INTO enumcrash_table (val) SELECT randenum() FROM generate_series(1,
> 10000);
>
> psql 2:
> INSERT INTO enumcrash_table (val) SELECT randenum() FROM generate_series(1,
> 10000);
> ERROR:  XX000: enum value 117745 not found in cache for enum enumcrash
> LOCATION:  compare_values_of_enum, typcache.c:957
>
> This is not surprising. psql 2's backend finds rows in the index with enum
> values that are not visible in its mvcc snapshot. That mvcc snapshot is needed
> because a_1 is an enum value with an uneven numbered oid because its inserted
> after the initial creation.

I think the problem is that load_enum_cache_data() uses
GetTransactionSnapshot() rather than GetLatestSnapshot().

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: XX000: enum value 117721 not found in cache for enum enumcrash

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Sat, Jun 30, 2012 at 5:51 AM, Andres Freund <andres@2ndquadrant.com> wrote:
>> This is not surprising. psql 2's backend finds rows in the index with enum
>> values that are not visible in its mvcc snapshot.

> I think the problem is that load_enum_cache_data() uses
> GetTransactionSnapshot() rather than GetLatestSnapshot().

That would only make the race condition window smaller (ie, hard
to reproduce manually like this, but not gone).
        regards, tom lane


Re: XX000: enum value 117721 not found in cache for enum enumcrash

From
Tom Lane
Date:
I wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I think the problem is that load_enum_cache_data() uses
>> GetTransactionSnapshot() rather than GetLatestSnapshot().

> That would only make the race condition window smaller (ie, hard
> to reproduce manually like this, but not gone).

No, wait, we made ALTER TYPE ADD VALUE PreventTransactionChain so that
uncommitted enum OIDs could never get into tables or indexes.  So I
think you're right, forcing a new snapshot to be used would fix this.

However, I'm a bit worried by the "if (!FirstSnapshotSet)" restriction
in GetLatestSnapshot.  Are we sure that enum comparisons could never
happen without a snapshot already being set?  What's the point of
throwing an error there anyway, as opposed to letting it redirect to
GetTransactionSnapshot?
        regards, tom lane


Re: XX000: enum value 117721 not found in cache for enum enumcrash

From
Robert Haas
Date:
On Jul 1, 2012, at 4:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> However, I'm a bit worried by the "if (!FirstSnapshotSet)" restriction
> in GetLatestSnapshot.  Are we sure that enum comparisons could never
> happen without a snapshot already being set?  What's the point of
> throwing an error there anyway, as opposed to letting it redirect to
> GetTransactionSnapshot?

I don't know whether it should set the transaction snapshot or just r


Re: XX000: enum value 117721 not found in cache for enum enumcrash

From
Robert Haas
Date:
On Jul 2, 2012, at 12:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Jul 1, 2012, at 4:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> However, I'm a bit worried by the "if (!FirstSnapshotSet)" restriction
>> in GetLatestSnapshot.  Are we sure that enum comparisons could never
>> happen without a snapshot already being set?  What's the point of
>> throwing an error there anyway, as opposed to letting it redirect to
>> GetTransactionSnapshot?
>
> I don't know whether it should set the transaction snapshot or just r

Argh, sorry.

...or just return a current snapshot, and I also don't know whether it needs to be changed because of this; but I agree
withchanging it. Erroring out always seemed kind of pointless to me... 

...Robert

Re: XX000: enum value 117721 not found in cache for enum enumcrash

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Jul 2, 2012, at 12:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Jul 1, 2012, at 4:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> However, I'm a bit worried by the "if (!FirstSnapshotSet)" restriction
>>> in GetLatestSnapshot.

>> I don't know whether it should set the transaction snapshot or just r
> Argh, sorry.
> ...or just return a current snapshot, and I also don't know whether it needs to be changed because of this; but I
agreewith changing it. Erroring out always seemed kind of pointless to me...
 

I think it was coded like that because the sole original use was in
ri_triggers.c, in which it would have been a red flag if no transaction
snapshot already existed.  However, the restriction clearly doesn't fit
with GetLatestSnapshot's API spec, so it seems to me to be sensible
to change it (as opposed to, say, inventing a new snapshot function
with a subtly different API spec).

As for creating an MVCC snapshot without causing a transaction snapshot
to be established, no thanks --- that would create a path of control
that exists nowhere today and has gotten no testing at all.  I suspect
that it might actively break some assumptions in snapshot management.
        regards, tom lane


Re: XX000: enum value 117721 not found in cache for enum enumcrash

From
Alvaro Herrera
Date:
Excerpts from Tom Lane's message of lun jul 02 00:24:06 -0400 2012:
> Robert Haas <robertmhaas@gmail.com> writes:
> > On Jul 2, 2012, at 12:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> >> On Jul 1, 2012, at 4:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>> However, I'm a bit worried by the "if (!FirstSnapshotSet)" restriction
> >>> in GetLatestSnapshot.
>
> >> I don't know whether it should set the transaction snapshot or just r
> > Argh, sorry.
> > ...or just return a current snapshot, and I also don't know whether it needs to be changed because of this; but I
agreewith changing it. Erroring out always seemed kind of pointless to me... 
>
> I think it was coded like that because the sole original use was in
> ri_triggers.c, in which it would have been a red flag if no transaction
> snapshot already existed.

Yeah, that's correct.  I guess I could have made the check an Assert()
instead of elog().

> However, the restriction clearly doesn't fit
> with GetLatestSnapshot's API spec, so it seems to me to be sensible
> to change it (as opposed to, say, inventing a new snapshot function
> with a subtly different API spec).

Sounds sensible.

> As for creating an MVCC snapshot without causing a transaction snapshot
> to be established, no thanks --- that would create a path of control
> that exists nowhere today and has gotten no testing at all.  I suspect
> that it might actively break some assumptions in snapshot management.

I think it should work fine as long as the snapshot is registered, but
as you say it is untested.  Maybe it'd have interactions with the way
snapshots connect with resource owners.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support