Thread: 9.5 alpha: some small comments on BRIN and btree_gin

9.5 alpha: some small comments on BRIN and btree_gin

From
Marc Mamin
Date:
Hello,

First: KUDO !!!
The release notes are extremely promising in regard to performance improvements :-)


I've made some (dirty) tests with BRIN and btree_gin.
(on a smalll Windows laptop ...)

just a few remarks:


- btree_gin deserve a better description than that:
 "However, they are useful for GIN testing and as a base for developing other GIN operator classes."   I came to
similartimes between btree and gin for indexes on "category" columns (ca 20 to 200 distinct values)  For me, gin
clearlywins here thanks to the index size difference.    You should really consider moving btree_gin to core ...    


- btree_gin on integers doesn't cope well with BETWEEN. Seems to always lead to a full index scan I think I understand
why,but maybe this is worth a comment in the doc to underline the difference to btree. 
SELECT * from tgin_1 WHERE cat_id between 1 and 2:   http://explain.depesz.com/s/fmqn SELECT * from tgin_1 WHERE cat_id
IN(1,2):   http://explain.depesz.com/s/bYg 

- BRIN cost: I've made a silly test, where all distinct values exist in all BRIN page ranges:    INSERT into tbrin_1
(cat_id,....) SELECT s%20, ... FROM generate_series(1,3000000 )s;CREATE INDEX cat_brin_1 on tbrin_1 using BRIN
(cat_id)with(pages_per_range=64);SELECT * from tbrin_1 WHERE cat_id=10;    http://explain.depesz.com/s/9YQR 
  There seems to be no "fence" against useless BRIN indexes that would allow a fallback on a table scan.  But the time
overheadremind small :)     
best regards,

Marc Mamin




Re: 9.5 alpha: some small comments on BRIN and btree_gin

From
Jeff Janes
Date:
On Mon, Jul 6, 2015 at 12:20 AM, Marc Mamin <M.Mamin@intershop.de> wrote:
Hello,

First: KUDO !!!
The release notes are extremely promising in regard to performance improvements :-)


I've made some (dirty) tests with BRIN and btree_gin.
(on a smalll Windows laptop ...)

just a few remarks:


- btree_gin deserve a better description than that:

  "However, they are useful for GIN testing and as a base for developing other GIN operator classes."

   I came to similar times between btree and gin for indexes on "category" columns (ca 20 to 200 distinct values)
   For me, gin clearly wins here thanks to the index size difference.

I reached the same conclusion for things with higher distinct values, but still several copies of each distinct value. except I don't think we should change the description until the BETWEEN issue is resolved.  That is a pretty serious limitation, I think.

Or at least, if you we invite people to use it for this purpose, we would have to warn them that it is not suitable for range queries.  It wouldn't be so bad if it merely didn't support them well, but as things are now it positively pulls the planner away from better options, because it looks falsely attractive.  

I've looked into doing myself, but I'm afraid it is beyond me.

Cheers,

Jeff

Re: 9.5 alpha: some small comments on BRIN and btree_gin

From
Josh Berkus
Date:
On 07/06/2015 12:20 AM, Marc Mamin wrote:
>    There seems to be no "fence" against useless BRIN indexes that would allow a fallback on a table scan.
>    But the time overhead remind small :)

When have we ever stopped users from creating useless indexes?  For one
thing, just because the index isn't useful *now* doesn't mean it won't
be in the future.

Now, it would be useful to have a brin_index_effectiveness() function so
that DBAs could check for themselves whether they should dump indexes.
However, I don't see needing that for 9.5.

Are there usage stats in pg_stat_user_indexes for BRIN?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: 9.5 alpha: some small comments on BRIN and btree_gin

From
Michael Paquier
Date:
On Tue, Jul 7, 2015 at 9:04 AM, Josh Berkus <josh@agliodbs.com> wrote:
> Are there usage stats in pg_stat_user_indexes for BRIN?

Yes, they are here.
-- 
Michael



Re: 9.5 alpha: some small comments on BRIN and btree_gin

From
Alvaro Herrera
Date:
Marc Mamin wrote:

> - BRIN cost: I've made a silly test, where all distinct values exist in all BRIN page ranges:
>         
>     INSERT into tbrin_1 (cat_id, ....) SELECT s%20, ... FROM generate_series(1,3000000 )s;
>     CREATE INDEX cat_brin_1 on tbrin_1 using BRIN (cat_id)with (pages_per_range=64);
>     SELECT * from tbrin_1 WHERE cat_id=10; 
>        http://explain.depesz.com/s/9YQR
> 
>    There seems to be no "fence" against useless BRIN indexes that would allow a fallback on a table scan.
>    But the time overhead remind small :)

Hmm, I guess the costing function for brin could stand some improvement.
Clearly we're not covering all bases.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: 9.5 alpha: some small comments on BRIN and btree_gin

From
Marc Mamin
Date:

> -----Original Message-----
> From: Josh Berkus [mailto:josh@agliodbs.com]
> Sent: Dienstag, 7. Juli 2015 02:04
> 
> On 07/06/2015 12:20 AM, Marc Mamin wrote:
> >    There seems to be no "fence" against useless BRIN indexes that
> would allow a fallback on a table scan.
> >    But the time overhead remind small :)
> 
> When have we ever stopped users from creating useless indexes?  

Sure, but on the other hand, they are so small and quick to build 
that they seem to be a good alternative when other index types are too costly, 
even if theses indexes can't deal well with all data ranges passed as query condition.

Hence it would be fine if the planner could reject these indexes in the bad cases.

I don't mean this is something I'm counting on, but it could be a good idea to mention this limitation in the doc.

regards,

Marc Mamin


> For one
> thing, just because the index isn't useful *now* doesn't mean it won't
> be in the future.
> 
> Now, it would be useful to have a brin_index_effectiveness() function
> so that DBAs could check for themselves whether they should dump
> indexes.
> However, I don't see needing that for 9.5.
> 
> Are there usage stats in pg_stat_user_indexes for BRIN?
> 
> --
> Josh Berkus
> PostgreSQL Experts Inc.
> http://pgexperts.com

Re: 9.5 alpha: some small comments on BRIN and btree_gin

From
Josh Berkus
Date:
On 07/07/2015 06:28 AM, Marc Mamin wrote:
> Sure, but on the other hand, they are so small and quick to build 
> that they seem to be a good alternative when other index types are too costly, 
> even if theses indexes can't deal well with all data ranges passed as query condition.
> 
> Hence it would be fine if the planner could reject these indexes in the bad cases.

Oh, sorry!  I didn't realize that the planner was using the BRIN index
even when it was useless; your email wasn't clear.

The problem here is that the usefulness of BRIN indexes as a cost
calculation should take correlation into account, heavily.  Can we do
that?  Is correlation even part of the index costing method now?  How
accurate are our correlation estimates?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com