Thread: XX000: enum value 117721 not found in cache for enum enumcrash
Hi, Currently its possible to cause transactions to fail with ALTER ENUM ADD AFTER/BEFORE: psql 1: CREATE TYPE enumcrash AS ENUM('a', 'b'); CREATE FUNCTION randenum() RETURNS enumcrash LANGUAGE sql AS $$SELECT * FROM unnest(enum_range('a'::enumcrash)) ORDER BY random() LIMIT 1$$; CREATE TABLE enumcrash_table(id serial primary key, val enumcrash); CREATE INDEX enumcrash_table__val__id ON enumcrash_table (val, id); INSERT INTO enumcrash_table (val) SELECT randenum() FROM generate_series(1, 10000);INSERT 0 10000 psql 2: BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ; INSERT INTO enumcrash_table (val) SELECT randenum() FROM generate_series(1, 10000); psql 1: ALTER TYPE enumcrash ADD VALUE 'a_1' AFTER 'a'; INSERT INTO enumcrash_table (val) SELECT randenum() FROM generate_series(1, 10000); psql 2: INSERT INTO enumcrash_table (val) SELECT randenum() FROM generate_series(1, 10000); ERROR: XX000: enum value 117745 not found in cache for enum enumcrash LOCATION: compare_values_of_enum, typcache.c:957 This is not surprising. psql 2's backend finds rows in the index with enum values that are not visible in its mvcc snapshot. That mvcc snapshot is needed because a_1 is an enum value with an uneven numbered oid because its inserted after the initial creation. Do we consider that something that needs to be fixed or just something to document? I can't think of a non-intrusive fix right now. Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Sat, Jun 30, 2012 at 5:51 AM, Andres Freund <andres@2ndquadrant.com> wrote: > Currently its possible to cause transactions to fail with ALTER ENUM ADD > AFTER/BEFORE: > > psql 1: > > CREATE TYPE enumcrash AS ENUM('a', 'b'); > CREATE FUNCTION randenum() RETURNS enumcrash LANGUAGE sql AS $$SELECT * FROM > unnest(enum_range('a'::enumcrash)) ORDER BY random() LIMIT 1$$; > CREATE TABLE enumcrash_table(id serial primary key, val enumcrash); > CREATE INDEX enumcrash_table__val__id ON enumcrash_table (val, id); > INSERT INTO enumcrash_table (val) SELECT randenum() FROM generate_series(1, > 10000);INSERT 0 10000 > > psql 2: > BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ; > INSERT INTO enumcrash_table (val) SELECT randenum() FROM generate_series(1, > 10000); > > psql 1: > ALTER TYPE enumcrash ADD VALUE 'a_1' AFTER 'a'; > INSERT INTO enumcrash_table (val) SELECT randenum() FROM generate_series(1, > 10000); > > psql 2: > INSERT INTO enumcrash_table (val) SELECT randenum() FROM generate_series(1, > 10000); > ERROR: XX000: enum value 117745 not found in cache for enum enumcrash > LOCATION: compare_values_of_enum, typcache.c:957 > > This is not surprising. psql 2's backend finds rows in the index with enum > values that are not visible in its mvcc snapshot. That mvcc snapshot is needed > because a_1 is an enum value with an uneven numbered oid because its inserted > after the initial creation. I think the problem is that load_enum_cache_data() uses GetTransactionSnapshot() rather than GetLatestSnapshot(). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Sat, Jun 30, 2012 at 5:51 AM, Andres Freund <andres@2ndquadrant.com> wrote: >> This is not surprising. psql 2's backend finds rows in the index with enum >> values that are not visible in its mvcc snapshot. > I think the problem is that load_enum_cache_data() uses > GetTransactionSnapshot() rather than GetLatestSnapshot(). That would only make the race condition window smaller (ie, hard to reproduce manually like this, but not gone). regards, tom lane
I wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> I think the problem is that load_enum_cache_data() uses >> GetTransactionSnapshot() rather than GetLatestSnapshot(). > That would only make the race condition window smaller (ie, hard > to reproduce manually like this, but not gone). No, wait, we made ALTER TYPE ADD VALUE PreventTransactionChain so that uncommitted enum OIDs could never get into tables or indexes. So I think you're right, forcing a new snapshot to be used would fix this. However, I'm a bit worried by the "if (!FirstSnapshotSet)" restriction in GetLatestSnapshot. Are we sure that enum comparisons could never happen without a snapshot already being set? What's the point of throwing an error there anyway, as opposed to letting it redirect to GetTransactionSnapshot? regards, tom lane
On Jul 1, 2012, at 4:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > However, I'm a bit worried by the "if (!FirstSnapshotSet)" restriction > in GetLatestSnapshot. Are we sure that enum comparisons could never > happen without a snapshot already being set? What's the point of > throwing an error there anyway, as opposed to letting it redirect to > GetTransactionSnapshot? I don't know whether it should set the transaction snapshot or just r
On Jul 2, 2012, at 12:04 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Jul 1, 2012, at 4:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> However, I'm a bit worried by the "if (!FirstSnapshotSet)" restriction >> in GetLatestSnapshot. Are we sure that enum comparisons could never >> happen without a snapshot already being set? What's the point of >> throwing an error there anyway, as opposed to letting it redirect to >> GetTransactionSnapshot? > > I don't know whether it should set the transaction snapshot or just r Argh, sorry. ...or just return a current snapshot, and I also don't know whether it needs to be changed because of this; but I agree withchanging it. Erroring out always seemed kind of pointless to me... ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > On Jul 2, 2012, at 12:04 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Jul 1, 2012, at 4:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> However, I'm a bit worried by the "if (!FirstSnapshotSet)" restriction >>> in GetLatestSnapshot. >> I don't know whether it should set the transaction snapshot or just r > Argh, sorry. > ...or just return a current snapshot, and I also don't know whether it needs to be changed because of this; but I agreewith changing it. Erroring out always seemed kind of pointless to me... I think it was coded like that because the sole original use was in ri_triggers.c, in which it would have been a red flag if no transaction snapshot already existed. However, the restriction clearly doesn't fit with GetLatestSnapshot's API spec, so it seems to me to be sensible to change it (as opposed to, say, inventing a new snapshot function with a subtly different API spec). As for creating an MVCC snapshot without causing a transaction snapshot to be established, no thanks --- that would create a path of control that exists nowhere today and has gotten no testing at all. I suspect that it might actively break some assumptions in snapshot management. regards, tom lane
Excerpts from Tom Lane's message of lun jul 02 00:24:06 -0400 2012: > Robert Haas <robertmhaas@gmail.com> writes: > > On Jul 2, 2012, at 12:04 AM, Robert Haas <robertmhaas@gmail.com> wrote: > >> On Jul 1, 2012, at 4:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >>> However, I'm a bit worried by the "if (!FirstSnapshotSet)" restriction > >>> in GetLatestSnapshot. > > >> I don't know whether it should set the transaction snapshot or just r > > Argh, sorry. > > ...or just return a current snapshot, and I also don't know whether it needs to be changed because of this; but I agreewith changing it. Erroring out always seemed kind of pointless to me... > > I think it was coded like that because the sole original use was in > ri_triggers.c, in which it would have been a red flag if no transaction > snapshot already existed. Yeah, that's correct. I guess I could have made the check an Assert() instead of elog(). > However, the restriction clearly doesn't fit > with GetLatestSnapshot's API spec, so it seems to me to be sensible > to change it (as opposed to, say, inventing a new snapshot function > with a subtly different API spec). Sounds sensible. > As for creating an MVCC snapshot without causing a transaction snapshot > to be established, no thanks --- that would create a path of control > that exists nowhere today and has gotten no testing at all. I suspect > that it might actively break some assumptions in snapshot management. I think it should work fine as long as the snapshot is registered, but as you say it is untested. Maybe it'd have interactions with the way snapshots connect with resource owners. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support