Thread: Understanding BRIN index performance

Understanding BRIN index performance

From

Ivan Voras

Date:

03 October 2016, 09:01:38

Hi,

I have a table of around 20 G, more than 220 million records, and I'm running this query on it:

explain analyze SELECT MAX(id) - (SELECT id FROM expl_transactions WHERE dateAdded < (now() - INTERVAL '10 MINUTES') ORDER BY dateAdded DESC LIMIT 1) FROM expl_transactions;

"id" is SERIAL, "dateAdded" is timestamp without timezone

The "dateAdded" field also has a "default now()" applied to it some time after its creation, and a fair amount of null values in the records (which I don't think matters for this query, but maybe I'm wrong).

My first idea is to create a default BRIN index on dateAdded since the above query is not run frequently. To my surprise, the planner refused to use the index and used sequential scan instead. When I forced sequential scanning off, I got this:

https://explain.depesz.com/s/W8oo

The query was executing for 40+ seconds. It seems like the "index scan" on it returns nearly 9% of the table, 25 mil rows. Since the data in dateAdded actually is sequential and fairly selective (having now() as the default over a long period of time), this surprises me.

With a normal btree index, of course, it runs fine:

https://explain.depesz.com/s/TB5

Any ideas?

Re: Understanding BRIN index performance

From

"Madusudanan.B.N"

Date:

03 October 2016, 09:26:41

I don't think a BRIN index would help in either case.

BRIN just marks each page with a max and min boundaries which are helpful in where clauses and has nothing to do with ordering.

For the first operation i.e Max a btree index would do an index scan backward which is just an index lookup in reverse and for order by it can use the index as well since a btree index is ordered by default.

That is the reason why it switches to a sequential scan since there is no way for a BRIN index to be used in the case of a max / order by.

On Mon, Oct 3, 2016 at 2:30 PM, Ivan Voras <ivoras@gmail.com> wrote:

Hi,

I have a table of around 20 G, more than 220 million records, and I'm running this query on it:

explain analyze SELECT MAX(id) - (SELECT id FROM expl_transactions WHERE dateAdded < (now() - INTERVAL '10 MINUTES') ORDER BY dateAdded DESC LIMIT 1) FROM expl_transactions;

"id" is SERIAL, "dateAdded" is timestamp without timezone

The "dateAdded" field also has a "default now()" applied to it some time after its creation, and a fair amount of null values in the records (which I don't think matters for this query, but maybe I'm wrong).

My first idea is to create a default BRIN index on dateAdded since the above query is not run frequently. To my surprise, the planner refused to use the index and used sequential scan instead. When I forced sequential scanning off, I got this:

https://explain.depesz.com/s/W8oo

The query was executing for 40+ seconds. It seems like the "index scan" on it returns nearly 9% of the table, 25 mil rows. Since the data in dateAdded actually is sequential and fairly selective (having now() as the default over a long period of time), this surprises me.

With a normal btree index, of course, it runs fine:

https://explain.depesz.com/s/TB5

Any ideas?

Regards,
Madusudanan.B.N

Re: Understanding BRIN index performance

From

Simon Riggs

Date:

03 October 2016, 09:40:46

On 3 October 2016 at 10:00, Ivan Voras <ivoras@gmail.com> wrote:
> Hi,
>
> I have a table of around 20 G, more than 220 million records, and I'm
> running this query on it:
>
> explain analyze SELECT MAX(id) - (SELECT id FROM expl_transactions WHERE
> dateAdded < (now() - INTERVAL '10 MINUTES') ORDER BY dateAdded DESC LIMIT 1)
> FROM expl_transactions;
>
> "id" is SERIAL, "dateAdded" is timestamp without timezone
>
> The "dateAdded" field also has a "default now()" applied to it some time
> after its creation, and a fair amount of null values in the records (which I
> don't think matters for this query, but maybe I'm wrong).
>
> My first idea is to create a default BRIN index on dateAdded since the above
> query is not run frequently. To my surprise, the planner refused to use the
> index and used sequential scan instead. When I forced sequential scanning
> off, I got this:
>
> https://explain.depesz.com/s/W8oo
>
> The query was executing for 40+ seconds. It seems like the "index scan" on
> it returns nearly 9% of the table, 25 mil rows. Since the data in dateAdded
> actually is sequential and fairly selective (having now() as the default
> over a long period of time), this surprises me.
>
> With a normal btree index, of course, it runs fine:
>
> https://explain.depesz.com/s/TB5

Btree retains ordering, BRIN does not.

We've discussed optimizing the sort based upon BRIN metadata, but
that's not implemented yet.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Understanding BRIN index performance

From

Ivan Voras

Date:

03 October 2016, 09:59:34

On 3 October 2016 at 11:40, Simon Riggs <simon@2ndquadrant.com> wrote:

On 3 October 2016 at 10:00, Ivan Voras <ivoras@gmail.com> wrote:

> My first idea is to create a default BRIN index on dateAdded since the above
> query is not run frequently. To my surprise, the planner refused to use the
> index and used sequential scan instead. When I forced sequential scanning
> off, I got this:
>
> https://explain.depesz.com/s/W8oo
>
> The query was executing for 40+ seconds. It seems like the "index scan" on
> it returns nearly 9% of the table, 25 mil rows. Since the data in dateAdded
> actually is sequential and fairly selective (having now() as the default
> over a long period of time), this surprises me.
>
> With a normal btree index, of course, it runs fine:
>
> https://explain.depesz.com/s/TB5

Btree retains ordering, BRIN does not.

We've discussed optimizing the sort based upon BRIN metadata, but
that's not implemented yet.

I get that, my question was more about why the index scan returned 25 mil rows, when the pages are sequentially filled by timestamps? In my understading of BRIN, it should have returned a small number of pages which would have been filtered (and sorted) for the exact data, right?

Re: Understanding BRIN index performance

From

Simon Riggs

Date:

03 October 2016, 10:06:00

On 3 October 2016 at 10:58, Ivan Voras <ivoras@gmail.com> wrote:

> I get that, my question was more about why the index scan returned 25 mil
> rows, when the pages are sequentially filled by timestamps? In my
> understading of BRIN, it should have returned a small number of pages which
> would have been filtered (and sorted) for the exact data, right?

That could be most simply explained if the distribution of your data
is not what you think it is.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Understanding BRIN index performance

From

Ivan Voras

Date:

03 October 2016, 11:30:14

On 3 October 2016 at 12:05, Simon Riggs <simon@2ndquadrant.com> wrote:

On 3 October 2016 at 10:58, Ivan Voras <ivoras@gmail.com> wrote:

> I get that, my question was more about why the index scan returned 25 mil
> rows, when the pages are sequentially filled by timestamps? In my
> understading of BRIN, it should have returned a small number of pages which
> would have been filtered (and sorted) for the exact data, right?

That could be most simply explained if the distribution of your data
is not what you think it is.

Something doesn't add up.

I've clustered the table, then created a BRIN index, and the number of rows resulting from the index scan dropped only very slightly.

Hmmm, looking at your original reply about the metadata, and my query, did you mean something like this:

SELECT id FROM expl_transactions WHERE dateAdded < (now() - INTERVAL '10 MINUTES') ORDER BY dateAdded DESC LIMIT 1

To solve this with a BRIN index, the index records (range pairs?) themselves would need to be ordered, to be able to perform the "ORDER by ... DESC" operation with the index, and then sort it and take the single record from this operation, and there is currently no such data being recorded?