Re: Partitioned tables and SELECT ... ORDER BY ... LIMIT - Mailing list pgsql-performance

From Дмитрий Шалашов
Subject Re: Partitioned tables and SELECT ... ORDER BY ... LIMIT
Date
Msg-id CAKPeCUFT8m3jtfAnnfTeZ4vCi2-CS5=d22CAf8iRS0qXso6oSw@mail.gmail.com
Whole thread Raw
In response to Re: Partitioned tables and SELECT ... ORDER BY ... LIMIT  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-performance
Hi Jeff,

Thanks for clarifications!

In my case yes, it's just few blocks, but different ones every time I change user_id value in my WHERE clause. When I change user_id - buffers are no longer "shared hit" in EXPLAIN. This is a bit more worrying.

But if there is no easy fix - well, OK.


Best regards,
Dmitriy Shalashov

2014-10-16 21:04 GMT+04:00 Jeff Janes <jeff.janes@gmail.com>:
On Thu, Oct 16, 2014 at 5:35 AM, Дмитрий Шалашов <skaurus@gmail.com> wrote:
Hi,

lets imagine that we have some table, partitioned by timestamp field, and we query it with SELECT with ordering by that field (DESC for example), with some modest limit.
Lets further say that required amount of rows is found in the first table that query encounters (say, latest one).
I am just wondering, why nevertheless PostgreSQL does read couple of buffers from each of the older tables?

The planner only does partition pruning statically, not dynamically.The LIMIT has to be implemented dynamically--it cannot prove absolutely that the "first" partition will have enough rows, so it cannot eliminate the others.

The "Merge Append" does a priority queue merge, and so needs to read the "first" row (according to the ORDER BY) from each partition in order to seed the priority queue.  I guess what it could be made to do in the case where there are suitable check constraints on a partition, is seed the priority queue with a dummy value constructed from the constraint.  If the merge never gets far enough to draw upon that dummy value, then that whole plan node never needs to get started up.

In your case that would save very little, as reading a few blocks for each partition is not much of a burden.  Especially as it the same few blocks every time, so they should be well cached.  There may be other case where this would be more helpful.  But it isn't clear to me how the planner could build such a feature into its cost estimates, and the whole thing would be a rather complex and esoteric optimization to make for uncertain gain.

Cheers,

Jeff

pgsql-performance by date:

Previous
From: Felipe Santos
Date:
Subject: Re: Partitioned tables and SELECT ... ORDER BY ... LIMIT
Next
From: Emi Lu
Date:
Subject: CopyManager(In/out) vs. delete/insert directly