Thread: GiST indexing performance with cidr types
Hi,

I'm trying to get a query to run fast enough for interactive use. I've gotten some speed-up, but I'm still not there. It is for a tool called IRRExplorer (http://irrexplorer.nlnog.net/) that correlates IP routes between Internet Route Registries and real-world routing information. We landed on PostgreSQL largely due to its GiST indexing of the cidr type.

* Preliminaries

The query is about getting data from this table:

irrexplorer=> \d routes
   Table "public.routes"
  Column   |  Type   | Modifiers
-----------+---------+-----------
 route     | cidr    | not null
 asn       | bigint  | not null
 source_id | integer | not null
Indexes:
    "unique_entry" UNIQUE CONSTRAINT, btree (route, asn, source_id)
    "route_gist" gist (route inet_ops)
Check constraints:
    "positive_asn" CHECK (asn > 0)
Foreign-key constraints:
    "routes_source_id_fkey" FOREIGN KEY (source_id) REFERENCES sources(id)

Complete DDL: https://github.com/job/irrexplorer/blob/master/data/schema.sql

Data set: 125 MB on disk, 2.3 million rows. Running stock PostgreSQL 9.4.4 on Ubuntu 14.10 (hosted on VMware on OS X).

I have done VACUUM and ANALYZE, checked memory settings, and tried increasing work_mem, but with no success. The issue seems CPU bound (100% CPU load during the query).

* Query

When a user inputs an AS number, a simple match on the asn column will return the relevant routes. However, the interesting thing to display is conflicting/rogue routes. This means matching routes with the && operator to find all covered/covering routes. It looks something like this:

SELECT rv.route, rv.asn, rv.source
FROM routes_view rv
LEFT OUTER JOIN routes_view r ON (rv.route && r.route)
WHERE rv.route && r.route AND r.asn = %s

While this is fairly fast if the initial set of routes is relatively small (<100) - it runs in a second or so - it takes quite a while (30+ seconds) if the number of routes matching the asn is large (>1000).

Explain analyze link: http://explain.depesz.com/s/dHqo

I am not terribly good at reading the output, but it seems most of the time is actually spent on the bitmap scan for the GiST index. Is there another type of indexing that would behave better here?

Since there are often identical routes in the initial set of routes (from the AS number matching), I tried reducing the initial set of matching routes:

SELECT rv.route, rv.asn, rv.source
FROM (SELECT DISTINCT route FROM routes_view WHERE asn = %s) r
LEFT OUTER JOIN routes_view rv ON (r.route && rv.route)
ORDER BY rv.route;

This often cuts the set of initial matching routes by 25-50%, and cuts a similar amount of time from the query.

Explain analyze link: http://explain.depesz.com/s/kf13

I tried further reducing the routes to the minimal covering set:

WITH distinct_routes AS (SELECT DISTINCT route FROM routes_view WHERE asn = %s)
SELECT route FROM distinct_routes
EXCEPT
SELECT r1.route
FROM distinct_routes r1
INNER JOIN distinct_routes r2 ON (r1.route << r2.route);

But this typically only yields a 10-20% reduction of the initial route set, and adds query complexity (putting the above in a CTE/WITH seems to make the query significantly slower for some reason).

The main issue seems to be the bitmap scan on the GiST index. Is there a better way to approach this style of query?

Best regards, Henrik
> I'm trying to get a query to run fast enough for interactive use. I've gotten
> some speed-up, but still not there. It is for a tool called IRRExplorer
> (http://irrexplorer.nlnog.net/) that correlates IP routes between Internet
> Route Registries and real-world routing information. We landed on PostgreSQL
> largely due to indexing of the cidr type with gist indexing.

It is nice to hear about someone making use of the feature.

> * Query
>
> When a user inputs an AS number, a simple match on the asn column will return
> the relevant routes. However, the interesting thing to display is
> conflicting/rogue routes. This means matching routes with the && operator to
> find all covered/covering routes. It looks something like this:
>
> SELECT rv.route, rv.asn, rv.source
> FROM routes_view rv
> LEFT OUTER JOIN routes_view r ON (rv.route && r.route)
> WHERE rv.route && r.route AND r.asn = %s

Why don't you just use INNER JOIN like this:

SELECT rv.route, rv.asn, rv.source
FROM routes_view rv
JOIN routes_view r ON rv.route && r.route
WHERE r.asn = %s

> While this is fairly fast if the initial set of routes is relatively small
> (<100) - it runs in a second or so - it takes quite a while (30+ seconds) if
> the number of routes matching the asn is large (>1000).
>
> Explain analyze link: http://explain.depesz.com/s/dHqo
>
> I am not terribly good at reading the output, but it seems most of the time
> is actually spent on the bitmap scan for the GiST index. Is there another
> type of indexing that would behave better here?

An index on the "asn" column would probably help the outer side, but more time seems to be consumed on the inner side. A plain index scan would probably be faster for it. You can test this by setting enable_bitmapscan to false (a sketch follows at the end of this mail).

The problem with bitmap index scan is selectivity estimation. The planner estimates that a lot more rows would match the condition, so it chooses a bitmap index scan. The selectivity estimation functions for inet on PostgreSQL 9.4 just return constants, so this is expected. We developed better ones for 9.5. PostgreSQL 9.5 also supports index-only scans with GiST, which can be even better than a plain index scan.

Can you try 9.5 to see if they help?
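For example, a minimal sketch (%s stands for the AS number as in your mails; the index assumes routes_view reads from the routes table):

CREATE INDEX ON routes (asn);   -- for the outer side
SET enable_bitmapscan = false;  -- force a plain index scan on the inner side
EXPLAIN (ANALYZE, BUFFERS)
SELECT rv.route, rv.asn, rv.source
FROM routes_view rv
JOIN routes_view r ON rv.route && r.route
WHERE r.asn = %s;
RESET enable_bitmapscan;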
Hi, thanks for the reply.

On Tue, 25 Aug 2015, Emre Hasegeli wrote:

>> I'm trying to get a query to run fast enough for interactive use. I've gotten
>> some speed-up, but still not there. It is for a tool called IRRExplorer
>> (http://irrexplorer.nlnog.net/) that correlates IP routes between Internet
>> Route Registries and real-world routing information.
>> We landed on PostgreSQL largely due to indexing of the cidr type with
>> gist indexing.
>
> It is nice to hear about someone making use of the feature.

Thanks to whoever made it. It is probably a niche feature though.

>> SELECT rv.route, rv.asn, rv.source
>> FROM routes_view rv
>> LEFT OUTER JOIN routes_view r ON (rv.route && r.route)
>> WHERE rv.route && r.route AND r.asn = %s
>
> Why don't you just use INNER JOIN like this:
>
> SELECT rv.route, rv.asn, rv.source
> FROM routes_view rv
> JOIN routes_view r ON rv.route && r.route
> WHERE r.asn = %s

I probably have a habit of thinking in outer joins. The inner join turns out to be slightly slower though (but faster in planning), and it looks like it depends on a dice roll by the planner (it does a bitmap heap scan for the inner join, and an index scan for the left outer join).

>> I am not terribly good at reading the output, but it seems most of the time
>> is actually spent on the bitmap scan for the GiST index. Is there another
>> type of indexing that would behave better here?
>
> An index on the "asn" column would probably help the outer side,

"select route from routes where asn = %s" takes .15-.2 seconds on my laptop, so that isn't where the time is spent here.

> but more time seems to be consumed on the inner side. A plain index
> scan would probably be faster for it. You can test this by setting
> enable_bitmapscan to false.

This actually makes it slower for the inner join (31s -> 56s). The left outer join stays around the same.

> The problem with bitmap index scan is selectivity estimation. The
> planner estimates that a lot more rows would match the condition, so it
> chooses a bitmap index scan. The selectivity estimation functions for inet
> on PostgreSQL 9.4 just return constants, so this is expected. We
> developed better ones for 9.5. PostgreSQL 9.5 also supports index-only
> scans with GiST, which can be even better than a plain index scan.

OK, that is interesting.

> Can you try 9.5 to see if they help?

I'll try installing it and report back.

Best regards, Henrik

Henrik Thostrup Jensen <htj at nordu.net>
Software Developer, NORDUnet
On Wed, 26 Aug 2015, Henrik Thostrup Jensen wrote:

>> Can you try 9.5 to see if they help?
>
> I'll try installing it and report back.

I upgraded to 9.5 (easier than expected) and ran VACUUM ANALYZE. The query planner now chooses an index scan for both the outer and inner join. This seems to cut off roughly a second or so (31s -> 30s, and 17s -> 16s when using distinct on the initial route set).

Query:

EXPLAIN (ANALYZE, BUFFERS)
SELECT rv.route, rv.asn, rv.source
FROM (SELECT DISTINCT route FROM routes_view WHERE asn = %s) r
INNER JOIN routes_view rv ON (r.route && rv.route)
ORDER BY rv.route;

Explain analyze: http://explain.depesz.com/s/L7kZ

9.5 also seems to fix the case where using CTE/WITH was actually slower.

The fastest I can currently do is this, which finds the minimal set of covering routes before joining:

SET enable_bitmapscan = false;
EXPLAIN ANALYZE
WITH distinct_routes AS (SELECT DISTINCT route FROM routes_view WHERE asn = %s),
     minimal_routes AS (SELECT route FROM distinct_routes
                        EXCEPT
                        SELECT r1.route
                        FROM distinct_routes r1
                        INNER JOIN distinct_routes r2 ON (r1.route << r2.route))
SELECT rv.route, rv.asn, rv.source
FROM routes_view rv
JOIN minimal_routes ON (rv.route <<= minimal_routes.route);

Explain analyze: http://explain.depesz.com/s/Plx4

Without the setting, the query planner chooses a bitmap index scan for this query, which adds around .5 seconds to the query time, so it isn't that bad a decision.

Unfortunately it still takes 15 seconds for my test case (a big network, but still a factor 10 from the biggest). Are the coverage operations just that expensive?

Best regards, Henrik
> Are the coverage operations just that expensive?

They shouldn't be. A similar query to yours runs in 0.5 seconds on my laptop:

># create table inner_side as select i, ((random() * 255.5)::int::text || '.' || (random() * 255.5)::int::text || '.' || (random() * 255.5)::int::text || '.' || (random() * 255.5)::int::text || '/' || (random() * 16 + 9)::int::text)::inet::cidr as network from generate_series(1, 2300000) as i;
> SELECT 2300000
>
># create table outer_side as select i, ((random() * 255.5)::int::text || '.' || (random() * 255.5)::int::text || '.' || (random() * 255.5)::int::text || '.' || (random() * 255.5)::int::text || '/' || (random() * 16 + 9)::int::text)::inet::cidr as network from generate_series(1, 732) as i;
> SELECT 732
>
># create index on inner_side using gist(network inet_ops);
> CREATE INDEX
>
># analyze;
> ANALYZE
>
># explain analyze select * from outer_side join inner_side on outer_side.network && inner_side.network;
>                                                           QUERY PLAN
> ----------
>  Nested Loop  (cost=0.41..563927.27 rows=137310 width=22) (actual time=0.115..474.103 rows=561272 loops=1)
>    ->  Seq Scan on outer_side  (cost=0.00..11.32 rows=732 width=11) (actual time=0.011..0.096 rows=732 loops=1)
>    ->  Index Scan using inner_side_network_idx on inner_side  (cost=0.41..540.38 rows=23000 width=11) (actual time=0.031..0.553 rows=767 loops=732)
>          Index Cond: ((outer_side.network)::inet && (network)::inet)
>  Planning time: 0.830 ms
>  Execution time: 505.641 ms
> (6 rows)

Maybe something we haven't expected about your dataset causes a performance regression on the index. Did you see anything relevant in the server logs at index creation time?
On Wed, 26 Aug 2015, Emre Hasegeli wrote:

>> Are the coverage operations just that expensive?
>
> They shouldn't be. A similar query to yours runs in 0.5 seconds on my laptop:

[snip]

I get the same from your test case.

> Maybe something we haven't expected about your dataset causes a
> performance regression on the index. Did you see anything relevant in
> the server logs at index creation time?

I tried dropping and re-creating the index. The only log entry was for the drop statement.

The distribution of the data is not uniform like the data set you produced, though I find it hard to believe that it would affect this as much:

select masklen(route), count(*) from routes group by masklen(route);

 masklen |  count
---------+---------
       8 |      47
       9 |      30
      10 |      84
      11 |     225
      12 |     580
      13 |    1163
      14 |    2401
      15 |    4530
      16 |   32253
      17 |   20350
      18 |   35583
      19 |   76307
      20 |  111913
      21 |  132807
      22 |  229578
      23 |  286986
      24 | 1149793

The rest are rather small, though with bumps at /32 and /48 (typical IPv6 prefix lengths). Real-world address space is very fragmented, whereas some of it is unused. Then there is the mixed IPv6 and IPv4 data that might factor in.

I tried the approach from your benchmark, to make a more isolated test case:

irrexplorer=> SELECT DISTINCT route INTO hmm FROM routes_view WHERE asn = 2914;
SELECT 732
irrexplorer=> explain analyze select routes.route from routes join hmm on routes.route && hmm.route;
                                                                QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.41..511914.27 rows=2558 width=7) (actual time=8.096..17209.778 rows=8127 loops=1)
   ->  Seq Scan on hmm  (cost=0.00..11.32 rows=732 width=7) (actual time=0.010..0.609 rows=732 loops=1)
   ->  Index Only Scan using route_gist on routes  (cost=0.41..470.32 rows=22900 width=7) (actual time=4.823..23.502 rows=11 loops=732)
         Index Cond: (route && (hmm.route)::inet)
         Heap Fetches: 0
 Planning time: 0.971 ms
 Execution time: 17210.627 ms
(7 rows)

The only difference in the query plan is that the above uses an index-only scan, whereas your test case used an index scan (it did for me as well). I tried without index-only scan:

irrexplorer=> set enable_indexonlyscan = false;
SET

                                                             QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.41..571654.27 rows=2558 width=7) (actual time=6.406..15899.791 rows=8127 loops=1)
   ->  Seq Scan on hmm  (cost=0.00..11.32 rows=732 width=7) (actual time=0.011..0.615 rows=732 loops=1)
   ->  Index Scan using route_gist on routes  (cost=0.41..551.93 rows=22900 width=7) (actual time=4.490..21.712 rows=11 loops=732)
         Index Cond: ((route)::inet && (hmm.route)::inet)
 Planning time: 0.505 ms
 Execution time: 15900.669 ms
(6 rows)

Slightly faster, but nothing significant. Something seems wonky.

Best regards, Henrik
> Then there is the mixed IPv6 and IPv4 data that might factor in.

It shouldn't be the problem. The index should separate them at the top level.

> I tried the approach from your benchmark, to make a more isolated test
> case:

Can you try to isolate it even more with something like this:

select * from routes where route && 'a.b.c.d/e';

It would be easier to debug if we can reproduce the performance regression like this. It would also be helpful to check where the time is spent. Maybe "perf" on Linux would help, though I haven't used it before (a rough sketch below).
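I haven't used it myself, but from the manpage something along these lines should show where the CPU time goes (an untested sketch; <backend_pid> is the backend to profile):

-- in the psql session that will run the slow query:
SELECT pg_backend_pid();

# then, in a shell on the server, while the query runs:
sudo perf top -p <backend_pid>

# or record a profile and inspect it afterwards:
sudo perf record -p <backend_pid> -- sleep 30
sudo perf report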
On Wed, Aug 26, 2015 at 4:29 AM, Henrik Thostrup Jensen <htj@nordu.net> wrote:
> On Wed, 26 Aug 2015, Emre Hasegeli wrote:
> [snip]
>>> Are the coverage operations just that expensive?
>>
>> They shouldn't be. A similar query to yours runs in 0.5 seconds on my laptop:
>
> I get the same from your test case.
>
>> Maybe something we haven't expected about your dataset causes a
>> performance regression on the index. Did you see anything relevant in
>> the server logs at index creation time?
>
> I tried dropping and re-creating the index. The only log entry was for the
> drop statement.
>
> The distribution of the data is not uniform like the data set you produced,
> though I find it hard to believe that it would affect this as much:
>
> select masklen(route), count(*) from routes group by masklen(route);
Any chance you can share the actual underlying data? I noticed it wasn't on github, but is that because it is proprietary, or just because you don't think it is interesting?
> irrexplorer=> explain analyze select routes.route from routes join hmm on routes.route && hmm.route;
>                                                                 QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------------------------------
>  Nested Loop  (cost=0.41..511914.27 rows=2558 width=7) (actual time=8.096..17209.778 rows=8127 loops=1)
>    ->  Seq Scan on hmm  (cost=0.00..11.32 rows=732 width=7) (actual time=0.010..0.609 rows=732 loops=1)
>    ->  Index Only Scan using route_gist on routes  (cost=0.41..470.32 rows=22900 width=7) (actual time=4.823..23.502 rows=11 loops=732)
>          Index Cond: (route && (hmm.route)::inet)
>          Heap Fetches: 0
>  Planning time: 0.971 ms
>  Execution time: 17210.627 ms
> (7 rows)
If you loop over the 732 rows yourself, issuing the simple query against each retrieved constant value:
explain (analyze,buffers) select routes.route from routes where route && $1
Does each one take about the same amount of time, or are there some outlier values which take much more time than the others?
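Something like this should do it server-side (a rough sketch; it assumes the "hmm" table of distinct routes from your earlier mail):

DO $$
DECLARE
    r  record;
    t0 timestamptz;
BEGIN
    FOR r IN SELECT route FROM hmm LOOP
        t0 := clock_timestamp();
        -- the overlap probe, one constant value at a time
        PERFORM count(*) FROM routes WHERE route && r.route;
        RAISE NOTICE 'route=% time=%', r.route, clock_timestamp() - t0;
    END LOOP;
END
$$;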
Cheers,
Jeff
On Wed, 26 Aug 2015, Emre Hasegeli wrote:

> Can you try to isolate it even more with something like this:

I tried some different bisection approaches:

-- base query (time ~19 seconds)
EXPLAIN (ANALYZE, BUFFERS)
SELECT rv.route, rv.asn, rv.source
FROM (SELECT DISTINCT route FROM routes_view WHERE asn = 2914 AND [ stuff here ]) r
JOIN routes_view rv ON (r.route && rv.route);

SELECT DISTINCT route FROM routes_view WHERE asn = 2914;  -> 732 rows, 0.2 seconds

masklen(route) <= 20;  -> 356 rows, join time 9.2 seconds
masklen(route) > 20;   -> 376 rows, join time 9.1 seconds
family(route) = 6      -> 22 rows, join time 0.2 seconds
family(route) = 4      -> 710 rows, join time 18.1 seconds
route <= '154.0.0.0'   -> 362 rows, join time 9.2 seconds
route > '154.0.0.0'    -> 370 rows, join time 9.5 seconds

Nothing really interesting here though.

> select * from routes where route && 'a.b.c.d/e';
>
> It would be easier to debug if we can reproduce the performance
> regression like this. It would also be helpful to check where the
> time is spent. Maybe "perf" on Linux would help, though I haven't
> used it before.

I haven't used it before either (but it seems like a nice tool). Output while running the query:

Samples: 99K of event 'cpu-clock', Event count (approx.): 11396624870
 14.09%  postgres      [.] inet_gist_consistent
 10.77%  postgres      [.] 0x00000000000c05f7
 10.46%  postgres      [.] FunctionCall5Coll
  5.68%  postgres      [.] gistdentryinit
  5.57%  postgres      [.] 0x00000000000c05c4
  4.62%  postgres      [.] FunctionCall1Coll
  4.52%  postgres      [.] MemoryContextReset
  4.25%  postgres      [.] bitncmp
  3.32%  libc-2.19.so  [.] __memcmp_sse4_1
  2.44%  postgres      [.] 0x00000000000c08f9
  2.37%  postgres      [.] 0x00000000000c0907
  2.27%  postgres      [.] 0x00000000000c0682
  2.12%  postgres      [.] pg_detoast_datum_packed
  1.86%  postgres      [.] hash_search_with_hash_value
  1.40%  postgres      [.] inet_gist_decompress
  1.09%  postgres      [.] 0x00000000000c067e
  1.03%  postgres      [.] 0x00000000000c047e
  0.77%  postgres      [.] 0x00000000002f0e57
  0.75%  postgres      [.] gistcheckpage

This seemed to stay relatively consistent throughout the query.

Best regards, Henrik

Henrik Thostrup Jensen <htj at nordu.net>
Software Developer, NORDUnet
On Wed, 26 Aug 2015, Jeff Janes wrote:

> Any chance you can share the actual underlying data?

Sure. I added a snapshot to the repo:
https://github.com/job/irrexplorer/blob/master/data/irrexplorer_dump.sql.gz?raw=true

> I noticed it wasn't on github, but is that because it is proprietary, or
> just because you don't think it is interesting?

I hoped it wouldn't be this complicated :-). BGP and IRR data is (mostly) public, but it changes constantly, so there is little sense in putting it in the repo, as it is not the authoritative source (we have a script to bootstrap with instead).

> If you loop over the 732 rows yourself, issuing the simple query against each retrieved constant value:
>
> explain (analyze,buffers) select routes.route from routes where route && $1
>
> Does each one take about the same amount of time, or are there some outlier values which take much more time than the others?

I wrote a small script to try this out. It queries for each route 20 times to try to suppress the worst noise. I've sorted the results by time and put them here:
https://gist.github.com/htj/1817883f92a9cb17a4f8 (ignore the NTP timing issue causing a negative value)

Some observations:

- v6 is faster than v4, which is expected.
- The slowest prefixes all seem to start with the bits '11'. However, it is only by a factor of 1.5x, which is not really significant.

Best regards, Henrik

Henrik Thostrup Jensen <htj at nordu.net>
Software Developer, NORDUnet
> Nothing really interesting here though.

I think the slowdown is not related to the key you searched for, but to the organisation of the index. We have a simple structure for the index keys: common bits of the child nodes are stored on the parent node. This leads to inefficient indexes when there are too many values with the same prefix. I couldn't quite understand why it performs so badly, though.

You might have better luck with the ip4r extension [1], or with an index using range types like this (see the sketch of the rewritten join below):

># create type inetrange as range (subtype = inet);
> CREATE TYPE
>
># create function cidr_to_range(cidr) returns inetrange language sql as 'select inetrange(set_masklen($1::inet, 0), set_masklen(broadcast($1), 0))';
> CREATE FUNCTION
>
># create index on routes using gist ((cidr_to_range(route)));
> CREATE INDEX
>
># explain analyze select * from routes where cidr_to_range(route) && cidr_to_range('160.75/16');
>                                    QUERY PLAN
> ----------
>  Bitmap Heap Scan on routes  (cost=864.50..18591.45 rows=21173 width=19) (actual time=7.249..7.258 rows=7 loops=1)
>    Recheck Cond: (inetrange(set_masklen((route)::inet, 0), set_masklen(broadcast((route)::inet), 0)) && '[160.75.0.0/0,160.75.255.255/0)'::inetrange)
>    Heap Blocks: exact=3
>    ->  Bitmap Index Scan on routes_cidr_to_range_idx  (cost=0.00..859.21 rows=21173 width=0) (actual time=7.242..7.242 rows=7 loops=1)
>          Index Cond: (inetrange(set_masklen((route)::inet, 0), set_masklen(broadcast((route)::inet), 0)) && '[160.75.0.0/0,160.75.255.255/0)'::inetrange)
>  Planning time: 1.456 ms
>  Execution time: 7.346 ms
> (7 rows)

I have examined the index regarding the performance problem:

* It splits pages by IP family [2] a lot of times, but deleting the IPv6 rows from the table doesn't make it faster.
* It doesn't fail and do a 50-50 split [3] as I expected.
* The previously posted version [4] of it works roughly twice as fast, but that is still too slow.

[1] https://github.com/RhodiumToad/ip4r
[2] network_gist.c:705
[3] network_gist.c:754
[4] CAE2gYzzioHNxdZXyWz0waruJuw7wKpEJ-2xPTihjd6Rv8YJF_w@mail.gmail.com
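Applied to your join, the range index would be used roughly like this (a sketch; "hmm" is the table of distinct routes for one AS from upthread):

SELECT routes.route
FROM routes
JOIN hmm ON cidr_to_range(routes.route) && cidr_to_range(hmm.route);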
Hi,

On Thu, 27 Aug 2015, Emre Hasegeli wrote:

> I think the slowdown is not related to the key you searched for,
> but to the organisation of the index. We have a simple structure for
> the index keys: common bits of the child nodes are stored on the parent
> node. This leads to inefficient indexes when there are too many values
> with the same prefix. I couldn't quite understand why it performs so
> badly, though.

I can see the issue. Unfortunately IP space tends to be fragmented in some ranges and very sparse in others. It is unfortunate that something designed to index IP prefixes doesn't handle BGP and IRR data very well (the only largish "real" datasets with IP prefixes I can think of).

> You might have better luck with the ip4r extension [1], or with an index
> using range types like this:

[snip]

Using the range type index:

 Nested Loop  (cost=0.42..603902.92 rows=8396377 width=26) (actual time=0.514..662.500 rows=8047 loops=1)
   ->  Seq Scan on hmm  (cost=0.00..11.32 rows=732 width=7) (actual time=0.015..0.119 rows=732 loops=1)
   ->  Index Scan using routes_cidr_to_range_idx on routes  (cost=0.42..595.58 rows=22941 width=19) (actual time=0.262..0.903 rows=11 loops=732)
         Index Cond: (inetrange(set_masklen((route)::inet, 0), set_masklen(broadcast((route)::inet), 0)) && inetrange(set_masklen((hmm.route)::inet, 0), set_masklen(broadcast((hmm.route)::inet), 0)))
 Planning time: 0.211 ms
 Execution time: 662.769 ms
(6 rows)

Boom. This is actually useful. It does take 70 seconds for the biggest network though.

The index is also rather large:

 public | routes_cidr_to_range_idx | index | htj | routes | 158 MB |

The table is 119 MB of data; the GiST index was 99 MB.

Best regards, Henrik

Henrik Thostrup Jensen <htj at nordu.net>
Software Developer, NORDUnet