As part of our monitoring system we have a process that, on every incoming
event, inserts a row into a dummy table. The INSERT is intercepted by a BEFORE
INSERT trigger that checks whether the event matches an existing incident; if
it does, the trigger updates the incident's data, otherwise it creates a new incident.
There was a major malfunction in our network, and the system generated about
300 events per second. We found that during this period our monitoring
contained obsolete data, because the trigger bottlenecked the whole system. We
would like to know if there is a simple way to EXPLAIN the trigger function, so
we could optimize it.
We are using PostgreSQL 9.4.3. All processes are single-threaded, and there
were no problems caused by concurrency anomalies (we checked by temporarily
switching to SERIALIZABLE: there were no failing transactions, and the data was
still obsolete).
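For reference, the SERIALIZABLE check amounts to temporarily forcing the isolation level for the monitoring sessions, e.g.:
SET default_transaction_isolation = 'serializable';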
A simplified version looks like this:
CREATE TABLE dummytable (id int, data text, ...);
CREATE TABLE incidents (id int, data text, lastupdate timestamp, ...);
The monitoring process does:
INSERT INTO dummytable VALUES (id, data);
The trigger function body does:
BEGIN
    SELECT id INTO vid FROM incidents
    WHERE field1 = NEW.field1
      AND (field2 = NEW.field2 OR field3 = NEW.field3);
    IF vid IS NULL THEN
        -- no matching incident yet: create one
        INSERT INTO incidents (id, data, lastupdate)
        VALUES (NEW.id, NEW.data, NOW());
    ELSE
        -- event belongs to an existing incident: refresh it
        UPDATE incidents SET lastupdate = NOW() WHERE id = vid;
    END IF;
    RETURN NULL;
END;
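For completeness, that body sits in a PL/pgSQL function attached to the dummy table roughly like this (the function and trigger names here are made up for this post):
CREATE FUNCTION handle_event() RETURNS trigger AS $$
DECLARE
    vid int;
BEGIN
    SELECT id INTO vid FROM incidents
    WHERE field1 = NEW.field1
      AND (field2 = NEW.field2 OR field3 = NEW.field3);
    IF vid IS NULL THEN
        INSERT INTO incidents (id, data, lastupdate)
        VALUES (NEW.id, NEW.data, NOW());
    ELSE
        UPDATE incidents SET lastupdate = NOW() WHERE id = vid;
    END IF;
    RETURN NULL;  -- returning NULL skips the actual INSERT into dummytable
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER dummytable_bi
    BEFORE INSERT ON dummytable
    FOR EACH ROW EXECUTE PROCEDURE handle_event();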
The trigger took about 12 ms per event, as we found out by running
EXPLAIN ANALYZE INSERT INTO dummytable VALUES (id, data);
(the trigger time is reported at the end of that output), so it was clearly the bottleneck.
I looked at various Internet resources and found that PL/pgSQL functions behave
like prepared statements in that the execution plans of their queries are
computed once and then reused. We suspected the first SELECT was stuck with a
suboptimal cached plan, so we changed it to dynamic SQL, which is re-planned on
every execution:
EXECUTE 'SELECT id FROM incidents WHERE field1 = $1 AND (field2 = $2 OR field3 = $3)'
    INTO vid USING NEW.field1, NEW.field2, NEW.field3;
We found this fix by trial and error, because we couldn't find any way to EXPLAIN the
trigger function. Is there any way to do it? This time we were lucky to hit the
right solution quickly, but in a future case we could waste many hours looking
basically everywhere.
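(What we can do, of course, is PREPARE the same query by hand and EXPLAIN that, but that only approximates whatever plan the trigger function actually cached; the parameter types below are guesses:)
PREPARE probe (int, int, int) AS
    SELECT id FROM incidents
    WHERE field1 = $1 AND (field2 = $2 OR field3 = $3);
EXPLAIN EXECUTE probe(1, 2, 3);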
Notes:
* The SQL is simplified, because I am not really interested in a solution to
this exact case. I want to "learn to fish", not be given a fish.
* We could skip the trigger and do the work in the application instead, but we
are worried about the cost of the extra round trips (a sketch of a
single-statement alternative follows these notes).
* We are currently thinking about doing the INSERTs in parallel, and we are
aware of the potential problems with concurrent transactions.
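For the second note, here is a rough sketch of what a single-round-trip, application-side version could look like on 9.4 using a data-modifying CTE (column names are the ones used in the trigger above; the race between the two branches is ignored, and $1..$5 would be bound by the application):
WITH upd AS (
    UPDATE incidents
       SET lastupdate = NOW()
     WHERE field1 = $1 AND (field2 = $2 OR field3 = $3)
    RETURNING id
)
INSERT INTO incidents (id, data, lastupdate)
SELECT $4, $5, NOW()
WHERE NOT EXISTS (SELECT 1 FROM upd);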