Thread: [GENERAL] Invalidation of cached plans for stored procedures ?
Hi On our production database, we had a small hiccup when playing a SQL migration. We write them to be as smooth as possible, trying not to interfere with running services, but this time, we got a batch of failures. I diagnosed the situation, and found out that we were hit by a known "feature" of plpgsql. Here is, basically, how to produce the problem : 0) setup You need a table, alterning type-incompatible fields (that's easier to reproduce the issue), and a trigger using these fields. For instance : CREATE TABLE demo_function (id serial, number integer, status text, number2 integer); CREATE OR REPLACE FUNCTION demo_function_serialize_trigger() RETURNS trigger LANGUAGE plpgsql AS $function$ BEGIN INSERT INTO demo_function_target SELECT json_build_object( 'number', NEW.number, 'status', NEW.status, 'number2', NEW.number2); RETURN NEW; END; $function$; CREATE TRIGGER demo_function_trg AFTER INSERT OR UPDATE ON demo_function FOR EACH ROW EXECUTE PROCEDURE demo_function_serialize_trigger(); 1) background session Imagine your web-worker, with its persistant SQL connection, doing this kind of query : INSERT INTO demo_function(number, status, number2) VALUES (1, 'todo', 2); This will cache the plan for the stored procedure in that session. 2) alter... In another session, let's do this : CREATE TYPE demo_status AS ENUM ('todo', 'doing', 'done'); ALTER TABLE demo_function ADD COLUMN status_enum demo_status; UPDATE demo_function SET status_enum = status::demo_status; ALTER TABLE demo_function DROP COLUMN status; ALTER TABLE demo_function RENAME COLUMN status_enum TO status; (It should of course be a bit more complicated, with triggers and so on to maintain the new column, split update to prevent locking too many rows, but let's focus on the issue here) 3) back to the background... INSERT INTO demo_function(number, status, number2) VALUES (2, 'todo', 3); ==> This will crash with the following error : type of parameter 15 (demo_status) does not match that when preparing the plan (text) And that's a simple one, we could have something far uglier. I found a workaround using event triggers to rewrite every function when an alter occurs on such a table, but this seems… odd to me. I don't think we are doing anything very complicated here, so I'm surprised that nothing has been done yet to fix that issue. Did I miss something obvious, or should I report that as a bug and start digging PostgreSQL code ? Thanks
Attachment
On 06/21/2017 04:28 AM, Pierre Ducroquet wrote: > Hi > > On our production database, we had a small hiccup when playing a SQL > migration. > We write them to be as smooth as possible, trying not to interfere with > running services, but this time, we got a batch of failures. > I diagnosed the situation, and found out that we were hit by a known "feature" > of plpgsql. > Here is, basically, how to produce the problem : > > 0) setup > > You need a table, alterning type-incompatible fields (that's easier to > reproduce the issue), and a trigger using these fields. > For instance : > > CREATE TABLE demo_function (id serial, number integer, status text, number2 > integer); > CREATE OR REPLACE FUNCTION demo_function_serialize_trigger() > RETURNS trigger > LANGUAGE plpgsql > AS $function$ > BEGIN > INSERT INTO demo_function_target > SELECT json_build_object( > 'number', NEW.number, > 'status', NEW.status, > 'number2', NEW.number2); > RETURN NEW; > END; > $function$; > CREATE TRIGGER demo_function_trg AFTER INSERT OR UPDATE ON demo_function FOR > EACH ROW EXECUTE PROCEDURE demo_function_serialize_trigger(); > > > > 1) background session > > Imagine your web-worker, with its persistant SQL connection, doing this kind > of query : > INSERT INTO demo_function(number, status, number2) VALUES (1, 'todo', 2); > > This will cache the plan for the stored procedure in that session. > > > 2) alter... > > In another session, let's do this : > > CREATE TYPE demo_status AS ENUM ('todo', 'doing', 'done'); > ALTER TABLE demo_function ADD COLUMN status_enum demo_status; > UPDATE demo_function SET status_enum = status::demo_status; > ALTER TABLE demo_function DROP COLUMN status; > ALTER TABLE demo_function RENAME COLUMN status_enum TO status; > > (It should of course be a bit more complicated, with triggers and so on to > maintain the new column, split update to prevent locking too many rows, but > let's focus on the issue here) > > > 3) back to the background... > > INSERT INTO demo_function(number, status, number2) VALUES (2, 'todo', 3); > > ==> This will crash with the following error : > type of parameter 15 (demo_status) does not match that when preparing the plan > (text) > > > And that's a simple one, we could have something far uglier. > > > I found a workaround using event triggers to rewrite every function when an > alter occurs on such a table, but this seems… odd to me. I don't think we are > doing anything very complicated here, so I'm surprised that nothing has been > done yet to fix that issue. https://www.postgresql.org/docs/9.6/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN > Did I miss something obvious, or should I report that as a bug and start > digging PostgreSQL code ? https://www.postgresql.org/docs/9.6/static/plpgsql-implementation.html#PLPGSQL-PLAN-CACHING > > > Thanks > -- Adrian Klaver adrian.klaver@aklaver.com
On 06/21/2017 06:06 AM, Pierre Ducroquet wrote: Please reply to list also. Ccing list. > On Wednesday, June 21, 2017 5:49:29 AM CEST you wrote: >> On 06/21/2017 04:28 AM, Pierre Ducroquet wrote: >>> Hi >>> >>> >>> I found a workaround using event triggers to rewrite every function when >>> an >>> alter occurs on such a table, but this seems… odd to me. I don't think we >>> are doing anything very complicated here, so I'm surprised that nothing >>> has been done yet to fix that issue. >> >> https://www.postgresql.org/docs/9.6/static/plpgsql-statements.html#PLPGSQL-S >> TATEMENTS-EXECUTING-DYN >>> Did I miss something obvious, or should I report that as a bug and start >>> digging PostgreSQL code ? >> >> https://www.postgresql.org/docs/9.6/static/plpgsql-implementation.html#PLPGS >> QL-PLAN-CACHING >>> Thanks > > Hi Adrian > > I know this caching behavior is well-known and documented, and a disturbed way > of writing triggers will fix it (prefixing every statement with EXECUTE would > make for even uglier procedures), but still it seems counter-intuitive and > could deserve being called a bug, or a serious restriction. > I was very surprised to be hit by this so easily. Also, even after a backend > was hit by a failure, it kept running the wrong plan until I forced a new > function version. Hence my questioning, is there more than just «plpgsql cache > plans» in that issue… > If there is nothing else, then I'll start digging the code… I don't know, which is why I am returning this to list. > > Pierre > -- Adrian Klaver adrian.klaver@aklaver.com