Re: Gist indexing performance with cidr types - Mailing list pgsql-performance

From Henrik Thostrup Jensen
Subject Re: Gist indexing performance with cidr types
Date
Msg-id alpine.DEB.2.11.1508281513050.14499@pyrite
Whole thread Raw
In response to Re: Gist indexing performance with cidr types  (Emre Hasegeli <emre@hasegeli.com>)
List pgsql-performance
Hi

On Thu, 27 Aug 2015, Emre Hasegeli wrote:

> I think the slowdown is not related with the key your searched for,
> but the organisation of the index.  We have a simple structure for
> the index keys.  Basically, common bits of the child nodes are stored
> on the parent node.  It leads to not efficient indexes, where there
> are too much values with the same prefix.  I couldn't quite understand
> why it performs so bad, though.

I can see the issue. Unfortunately IP space tends to be fragmented in some
ranges, and very sparse in other.

It is unfortunate that something to index IP prefixes doesn't handle BGP
and IRR data very well (the only largish "real" datasets with IP prefixes
I can think of).


> You might have better luck with ip4r extension [1] or creating an index
> using the range types like this:
[snip]

Using the range type index:

  Nested Loop  (cost=0.42..603902.92 rows=8396377 width=26) (actual time=0.514..662.500 rows=8047 loops=1)
    ->  Seq Scan on hmm  (cost=0.00..11.32 rows=732 width=7) (actual time=0.015..0.119 rows=732 loops=1)
    ->  Index Scan using routes_cidr_to_range_idx on routes  (cost=0.42..595.58 rows=22941 width=19) (actual
time=0.262..0.903rows=11 loops=732) 
          Index Cond: (inetrange(set_masklen((route)::inet, 0), set_masklen(broadcast((route)::inet), 0)) &&
inetrange(set_masklen((hmm.route)::inet,0), set_masklen(broadcast((hmm.route)::inet), 0))) 
  Planning time: 0.211 ms
  Execution time: 662.769 ms
(6 rows)

Boom. This is actually usefull.

It does take 70 seconds for the biggst network though. The index is also
rather large:

  public | routes_cidr_to_range_idx | index | htj   | routes  | 158 MB |

Table is 119MB data. The gist index was 99 MB.


     Best regards, Henrik

  Henrik Thostrup Jensen <htj at nordu.net>
  Software Developer, NORDUnet




pgsql-performance by date:

Previous
From: Emre Hasegeli
Date:
Subject: Re: Gist indexing performance with cidr types
Next
From: "挨踢人"
Date:
Subject: is there any way we can push join predicate into inner table