Re: Partial index creation always scans the entire table - Mailing list pgsql-performance

From: Tom Lane
Subject: Re: Partial index creation always scans the entire table
Msg-id: 23936.1581870943@sss.pgh.pa.us
In response to: Re: Partial index creation always scans the entire table (Justin Pryzby <pryzby@telsasoft.com>)
List: pgsql-performance

Justin Pryzby <pryzby@telsasoft.com> writes:
> I was reminded of reading this, but I think it's a pretty different case.
> https://heap.io/blog/engineering/running-10-million-postgresql-indexes-in-production

Yeah, the critical paragraph in that is

    This isn’t as scary as it sounds for a two main reasons. First, we
    shard all of our data by customer. Each table in our database holds
    only one customer’s data, so each table has a only a few thousand
    indexes at most. Second, these events are relatively rare. The most
    common defined events make up only a few percent of a customer’s raw
    events, and most are much more rare. This means that we perform
    relatively little I/O maintaining this schema, because most incoming
    events match no event definitions and therefore don’t need to be
    written to any of the indexes. Similarly, the indexes don’t take up
    much space on disk.

A set of partial indexes that cover a small part of the total data
can be sensible.  If you're trying to cover most/all of the data,
you're doing it wrong --- basically, you're reinventing partitioning
using the wrong tools.

            regards, tom lane
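
To make the distinction above concrete, here is a minimal sketch in PostgreSQL SQL. It is an illustration only and is not taken from the thread: the table and index names (events, events_signup_idx, events_part and its partitions) and the event_type values are hypothetical, and the partitioned variant assumes PostgreSQL 11 or later for the DEFAULT partition and for index creation on the partitioned parent.

```sql
-- Hypothetical schema for illustration; none of these names come from the thread.
CREATE TABLE events (
    id          bigint      NOT NULL,
    event_type  text        NOT NULL,
    created_at  timestamptz NOT NULL,
    payload     jsonb
);

-- Reasonable use: a partial index over a rare subset of rows.
-- Rows that do not satisfy the predicate are never written to this index,
-- so most inserts cost no extra index maintenance.
CREATE INDEX events_signup_idx
    ON events (created_at)
    WHERE event_type = 'signup';

-- Anti-pattern: one partial index per event_type value, so that the set of
-- indexes covers most or all rows. Every insert then updates some index,
-- and each CREATE INDEX still scans the whole table (the thread's subject).
-- CREATE INDEX events_type_a_idx ON events (created_at) WHERE event_type = 'a';
-- CREATE INDEX events_type_b_idx ON events (created_at) WHERE event_type = 'b';

-- If the goal is to split all of the data by event_type, declarative
-- partitioning expresses that directly.
CREATE TABLE events_part (
    id          bigint      NOT NULL,
    event_type  text        NOT NULL,
    created_at  timestamptz NOT NULL,
    payload     jsonb
) PARTITION BY LIST (event_type);

CREATE TABLE events_part_a PARTITION OF events_part FOR VALUES IN ('a');
CREATE TABLE events_part_b PARTITION OF events_part FOR VALUES IN ('b');
-- Catch-all for values without their own partition (PostgreSQL 11+).
CREATE TABLE events_part_other PARTITION OF events_part DEFAULT;

-- On PostgreSQL 11+, an index created on the parent is created on each
-- partition, and building it on one partition reads only that partition.
CREATE INDEX events_part_created_at_idx ON events_part (created_at);
```

Whether partitioning is the better fit depends on the workload; the sketch only shows the two shapes being contrasted.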


