Home > mailing lists

Re: postgres chooses objectively wrong index - Mailing list pgsql-performance

From	Merlin Moncure
Subject	Re: postgres chooses objectively wrong index
Date	March 18 21:38:07
Msg-id	CAHyXU0zhdKYDvt_iON5kVSa4AXa=p_vLP1JMs3u0FFA29rP-aQ@mail.gmail.com Whole thread Raw
In response to	Re: postgres chooses objectively wrong index (Alexey Ermakov <alexius.work@gmail.com>)
Responses	Re: postgres chooses objectively wrong index Re: postgres chooses objectively wrong index
List	pgsql-performance

Tree view

On Tue, Mar 17, 2026 at 11:27 PM Alexey Ermakov <alexius.work@gmail.com> wrote:

On 2026-03-18 04:52, Merlin Moncure wrote:
Hello. I think planner doesn't have information about distribution of async.task_execution_state(task) unless it's part of any full index. I would try to give that with extended statistics (postgresql 14+):
create statistics (mcv) task_task_execution_state_stat on ((async.task_execution_state(task))) from async.task;
analyze async.task;
If that won't help - please show distribution from pg_stats_ext view for extended statistic above.
This unfortunately fails, probably because the table type includes system columns (despite not using them).

orchestrator_service_user@orchestrator=> create statistics task_stats (mcv) on (async.task_execution_state(task)) from async.task;
ERROR: statistics creation on system columns is not supported

This would require some refactoring to fix.
Interesting... In that case functional index should help (as it also makes statistic for the planner):
create index concurrently on task_task_execution_state_idx async.task using btree ((async.task_execution_state(task)));
analyze async.task;

This can't help performance, as the index...

CREATE INDEX ON async.task(concurrency_pool, priority, entered)
WHERE async.task_execution_state(task) = 'READY';

...is very precisely configured to provide exactly what's needed; I need tasks for that exact pool in that exact order if and only if ready. The partial predicate is designed to keep the index nice and small since only a small portion of tasks would be eligible at any specific time.

@Tom Lane I'm pretty sure you were following me, but my abstraction earlier was a bit off;

Simplified, the query is:
SELECT * FROM foo WHERE a=? AND b=K ORDER BY c, d LIMIT N;
CREATE INDEX ON foo(a,b,c) WHERE b=K;

Should have been:

SELECT * FROM foo WHERE a=? AND d=K ORDER BY b, c LIMIT N;

CREATE INDEX ON foo(a,b,c) WHERE d=K;

Point being, the index match in on (=, order, order). If a contains any less than 100% of the total records, and N is small relative to table size, this ought to be the ideal index for just about any case, the exact match on partial qual is just gravy.

I think the planner is not giving enough bonus for an exact match versus an inexact match on partial index mathcing, (A=A should be better than A IN(A,B,C)), and it's unclear why the planner things bitmap heap + sort is outperforming a raw read off the index base on marginal estimated row counts. Lowering random_page_cost definitely biases the plan I like, but it skews both estimates.

@Alexey Ermakov

If you're interested in more context, see:

pgasync

pgflow

graph example

merlin

pgsql-performance by date:

From: Alexey Ermakov
Date: 18 March, 08:27:26
Subject: Re: postgres chooses objectively wrong index

From: Alexey Ermakov
Date: 18 March, 23:12:13
Subject: Re: postgres chooses objectively wrong index

Re: postgres chooses objectively wrong index - Mailing list pgsql-performance

Previous

Next