Re: Partial index creation always scans the entire table - Mailing list pgsql-performance

From Tom Lane
Subject Re: Partial index creation always scans the entire table
Date
Msg-id 22504.1581901740@sss.pgh.pa.us
Whole thread Raw
In response to RE: Partial index creation always scans the entire table  ("Mike Sofen" <msofen@runbox.com>)
List pgsql-performance
"Mike Sofen" <msofen@runbox.com> writes:
>> From: Tom Lane <tgl@sss.pgh.pa.us>   Sent: Sunday, February 16, 2020 7:30 AM
>>> I've seen people try to do this before.  I wonder if the manual page about
>>> partial indexes should explicitly say "don't do that".

> Yes please (seriously).  The utter beauty of Postgres is the flexibility and
> power that its evolutionary path has allowed/created.  The tragic danger is
> that the beauty is fairly easy to misapply/misuse.  Caveats in the
> documentation would be very beneficial to both seasoned practitioners and
> newcomers - it is quite challenging to keep up with everything Postgres and
> the documentation is where most of us turn for guidance.

OK, so how about something like this added to section 11.8
(no pretty markup as yet):

Example 11.4.  Do Not use Partial Indexes as a Substitute for Partitioning

You might be tempted to create a large set of non-overlapping partial
indexes, for example

    CREATE INDEX mytable_cat_1 ON mytable (data) WHERE category = 1;
    CREATE INDEX mytable_cat_2 ON mytable (data) WHERE category = 2;
    CREATE INDEX mytable_cat_3 ON mytable (data) WHERE category = 3;
    ...

This is a bad idea!  Almost certainly, you'll be better off with a single
non-partial index, declared like

    CREATE INDEX mytable_cat_data ON mytable (category, data);

(Put the category column first, for the reasons described in section 11.3
Multicolumn Indexes.)  While a search in this larger index might have to
descend through a couple more tree levels than a search in a smaller
index, that's almost certainly going to be cheaper than the planner effort
needed to select the appropriate one of the partial indexes.  The core of
the problem is that the system does not understand the relationship among
the partial indexes, and will laboriously test each one to see if it's
applicable to the current query.

If your table is large enough that a single index really is a bad idea,
you should look into using partitioning instead (section whatever-it-is).
With that mechanism, the system does understand that the tables and
indexes are non-overlapping, so much better performance is possible.

            regards, tom lane



pgsql-performance by date:

Previous
From: Justin Pryzby
Date:
Subject: Re: Partial index creation always scans the entire table
Next
From: Laurenz Albe
Date:
Subject: Re: SubtransControlLock and performance problems