Re: Reducing planning time on tables with many indexes - Mailing list pgsql-hackers
From: David Geier
Subject: Re: Reducing planning time on tables with many indexes
Msg-id: CAPsAnrm-E9r4VrUuLKACor8r7DEiL8vd9V-PL2+ND5_0OyjR2A@mail.gmail.com
In response to: Re: Reducing planning time on tables with many indexes (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
Hi Tom,
On Wed, Jul 27, 2022 at 6:39 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
David Geier <geidav.pg@gmail.com> writes:
> We tracked down the root cause of this slowdown to lock contention in
> 'get_relation_info()'. The index lock of every single index of every single
> table used in that query is acquired. We attempted a fix by pre-filtering
> out all indexes that anyways cannot be used with a certain query, without
> taking the index locks (credits to Luc Vlaming for idea and
> implementation). The patch does so by caching the columns present in every
> index, inside 'struct Relation', similarly to 'rd_indexlist'.
I wonder how much thought you gave to the costs imposed by that extra
cache space. We have a lot of users who moan about relcache bloat
already.
The current representation could be compacted (e.g. by storing the column indexes as 16-bit integers instead of using a Bitmapset). That should make the additional space negligible compared to the current size of RelationData. On top of that, we could reorder the members of RelationData to save padding bytes: currently there is a lot of interleaving of members of different sizes. Even taking cache locality into consideration, it looks like a fair amount could be saved, probably enough to account for the additional space needed to store the index column data.
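To illustrate what I mean by a compact representation, here is a minimal sketch (the struct and function names are mine, not from the patch): attribute numbers fit comfortably in 16-bit integers (cf. AttrNumber), and INDEX_MAX_KEYS is 32, so a small fixed array suffices and membership tests need no index lock:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical compact per-index column list: instead of a Bitmapset,
 * store attribute numbers as 16-bit integers. PostgreSQL limits index
 * keys to INDEX_MAX_KEYS (32 by default), so a fixed array works. */
typedef struct CompactIndexCols
{
    uint16_t ncols;        /* number of key columns in the index */
    int16_t  attnums[32];  /* attribute numbers of the key columns */
} CompactIndexCols;

/* Check whether the index covers a given attribute, without touching
 * the index relation itself (and hence without its lock). */
static int
index_contains_attr(const CompactIndexCols *idx, int16_t attno)
{
    for (uint16_t i = 0; i < idx->ncols; i++)
        if (idx->attnums[i] == attno)
            return 1;
    return 0;
}
```

With 2 + 64 bytes per index this is small next to the rest of RelationData, and a linear scan over at most 32 entries is cheap.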
But more to the point, I do not buy the assumption that
an index's set of columns is a good filter for which indexes are of
interest. A trivial counterexample from the regression database is
regression=# explain select count(*) from tenk1;
                                         QUERY PLAN
--------------------------------------------------------------------------------------------
 Aggregate  (cost=219.28..219.29 rows=1 width=8)
   ->  Index Only Scan using tenk1_hundred on tenk1  (cost=0.29..194.28 rows=10000 width=0)
(2 rows)
For queries without index conditions, the current version of the patch simply discards all indexes but the one with the fewest columns. This is case (3) in s64_IsUnnecessaryIndex(). It is an over-simplification aimed at picking the index that yields the least I/O. For single-column indexes that works, but it can fall short for multi-column indexes (e.g. an [INT, TEXT] index and an [INT, INT] index both have 2 columns, but the latter would be better suited when there's no other index and we only want to read the first integer column). What we should do here instead is discard indexes based on storage size.
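Concretely, the storage-size heuristic could look like the following sketch (names are mine; in the planner the size would come from pg_class.relpages): among indexes none of whose columns are needed for quals or ordering, keep only the smallest one, since any of them can satisfy a count(*)-style Index Only Scan and the smallest yields the least I/O.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical candidate-index descriptor for this sketch. */
typedef struct IndexCand
{
    const char *name;
    uint32_t    relpages;  /* on-disk size in blocks, as in pg_class.relpages */
} IndexCand;

/* Among interchangeable candidates, keep the one with the smallest
 * on-disk footprint; returns NULL if there are no candidates. */
static const IndexCand *
pick_smallest_index(const IndexCand *cands, int ncands)
{
    const IndexCand *best = NULL;

    for (int i = 0; i < ncands; i++)
        if (best == NULL || cands[i].relpages < best->relpages)
            best = &cands[i];
    return best;
}
```

In Tom's counterexample above, tenk1_hundred would survive this filter as long as it is the smallest index on tenk1, which is exactly why size is a better tie-breaker than column count.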
It looks to me like the patch also makes unwarranted assumptions about
being able to discard all but the smallest index having a given set
of columns. This would, for example, possibly lead to dropping the
index that has the most useful sort order, or that has the operator
class needed to support a specific WHERE clause.
Why would that be? If we keep all indexes that contain columns used by the query, we also keep the indexes that provide a certain sort order or operator class.
In short, I'm not sure I buy this concept at all. I think it might
be more useful to attack the locking overhead more directly. I kind
of wonder why we need per-index locks at all during planning ---
I think that we already interpret AccessShareLock on the parent table
as being sufficient to block schema changes on existing indexes.
As I said in my reply to your other mail, there are huge savings also in the serial case, where lock contention is not an issue. It seems like pre-filtering saves work down the road. I'll do some perf runs to track down where exactly the savings come from. One source I can think of is only having to consider a subset of all indexes during path creation.
Unfortunately, as things stand today, the planner needs more than the
right to look at the indexes' schemas, because it makes physical accesses
to btree indexes to find out their tree height (and I think there are some
comparable behaviors in other AMs). I've never particularly cared for
that implementation, and would be glad to rip out that behavior if we can
find another way. Maybe we could persuade VACUUM or ANALYZE to store that
info in the index's pg_index row, or some such, and then the planner
could use it with no lock?
That's another interesting approach, but I would love to save the planner cycles for the sequential case.
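For illustration, the lock-free lookup you describe could look roughly like this sketch (the cached field and names are hypothetical; today the height comes from a physical metapage read via _bt_getrootheight()):

```c
#include <assert.h>

/* Sketch: prefer a tree height stored by VACUUM/ANALYZE (e.g. in the
 * index's pg_index row), and fall back to the physical metapage probe
 * only when the cached value is missing (sentinel -1 here). */
static int
get_btree_height(int cached_height, int (*physical_probe)(void))
{
    if (cached_height >= 0)
        return cached_height;  /* no index lock or buffer access needed */
    return physical_probe();   /* today's behavior: read the metapage */
}

/* Stand-in for the physical probe, for demonstration only. */
static int
sample_metapage_probe(void)
{
    return 3;  /* pretend the metapage reports a 3-level tree */
}
```

The staleness that comes with a cached value seems acceptable, since the height only feeds cost estimation anyway.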
--
David Geier
(ServiceNow)