Hi
On our production database, we had a small hiccup when playing a SQL
migration.
We write them to be as smooth as possible, trying not to interfere with
running services, but this time, we got a batch of failures.
I diagnosed the situation, and found out that we were hit by a known "feature"
of plpgsql.
Here is, basically, how to produce the problem :
0) setup
You need a table, alterning type-incompatible fields (that's easier to
reproduce the issue), and a trigger using these fields.
For instance :
CREATE TABLE demo_function (id serial, number integer, status text, number2
integer);
CREATE OR REPLACE FUNCTION demo_function_serialize_trigger()
RETURNS trigger
LANGUAGE plpgsql
AS $function$
BEGIN
INSERT INTO demo_function_target
SELECT json_build_object(
'number', NEW.number,
'status', NEW.status,
'number2', NEW.number2);
RETURN NEW;
END;
$function$;
CREATE TRIGGER demo_function_trg AFTER INSERT OR UPDATE ON demo_function FOR
EACH ROW EXECUTE PROCEDURE demo_function_serialize_trigger();
1) background session
Imagine your web-worker, with its persistant SQL connection, doing this kind
of query :
INSERT INTO demo_function(number, status, number2) VALUES (1, 'todo', 2);
This will cache the plan for the stored procedure in that session.
2) alter...
In another session, let's do this :
CREATE TYPE demo_status AS ENUM ('todo', 'doing', 'done');
ALTER TABLE demo_function ADD COLUMN status_enum demo_status;
UPDATE demo_function SET status_enum = status::demo_status;
ALTER TABLE demo_function DROP COLUMN status;
ALTER TABLE demo_function RENAME COLUMN status_enum TO status;
(It should of course be a bit more complicated, with triggers and so on to
maintain the new column, split update to prevent locking too many rows, but
let's focus on the issue here)
3) back to the background...
INSERT INTO demo_function(number, status, number2) VALUES (2, 'todo', 3);
==> This will crash with the following error :
type of parameter 15 (demo_status) does not match that when preparing the plan
(text)
And that's a simple one, we could have something far uglier.
I found a workaround using event triggers to rewrite every function when an
alter occurs on such a table, but this seems… odd to me. I don't think we are
doing anything very complicated here, so I'm surprised that nothing has been
done yet to fix that issue.
Did I miss something obvious, or should I report that as a bug and start
digging PostgreSQL code ?
Thanks