On 2026-03-18 03:01, Merlin Moncure wrote:
I've been maintaining an airflow style orchestrator in pl/pgsql, and it's revealed a performance issue I just can't solve. There is a table, task, which may normally contain billions of rows, but only a tiny portion is interesting for specific reasons—a common pattern in task-type systems.
...
I'm wondering if there are other tricks that might apply here, for example, multi column index statistics...curious if anyone has thoughts on that.
Any suggestions?
merlin
Hello. I think planner doesn't have information about distribution of async.task_execution_state(task) unless it's part of any full index. I would try to give that with extended statistics (postgresql 14+):
create statistics (mcv) task_task_execution_state_stat on ((async.task_execution_state(task))) from async.task;
analyze async.task;
If that won't help - please show distribution from pg_stats_ext view for extended statistic above.
--
Alexey Ermakov