Well, so far with commit_delay=0, no problems. I will of course report back if something happens, but I believe the problem may indeed be solved or masked by that setting.
Rough description of our setup, or how to reproduce:
* Time-series data in a table, say "measurements": size 3-4 TB, about 1000 inserts/second
* The measurements table also has an insert trigger that inserts each row into measurements_a as well (for daily export purposes)
Just the above would cause a stuck query after a few days.
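For anyone trying to reproduce, the setup so far might look roughly like the sketch below. The column names and types, the function name and the trigger name are made up for illustration, and whether our trigger is BEFORE or AFTER is not shown here; only the two table names and the insert-into-measurements_a behaviour come from the description above.

```sql
-- Hypothetical schema: columns and types are assumptions, not our real ones.
CREATE TABLE measurements (
    ts     timestamptz      NOT NULL,
    sensor integer          NOT NULL,
    value  double precision
);

-- Same column layout as measurements; holds rows waiting for the daily export.
CREATE TABLE measurements_a (LIKE measurements);

-- Copies every newly inserted row into measurements_a.
CREATE FUNCTION copy_to_measurements_a() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    INSERT INTO measurements_a VALUES (NEW.*);
    RETURN NULL;  -- return value is ignored for AFTER row triggers
END;
$$;

-- EXECUTE PROCEDURE is accepted by old and new releases alike;
-- PostgreSQL 11+ also allows EXECUTE FUNCTION.
CREATE TRIGGER measurements_copy
AFTER INSERT ON measurements
FOR EACH ROW EXECUTE PROCEDURE copy_to_measurements_a();
```

With something like this in place, a steady stream of single-row inserts into measurements approximates the load described above.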
Now, for exporting, we run the following CTE query (measurements_b is an empty table; measurements_a holds about 5 GB):
* WITH d_rows AS (DELETE FROM measurement_events_a RETURNING * ) INSERT INTO measurement_events_b SELECT * FROM d_rows;
The above caused the problem to appear every time, after 10-20 minutes.
Regards,
-Spiros