Broken selectivity with special inet operators - Mailing list pgsql-bugs

From Josh Berkus
Subject Broken selectivity with special inet operators
Date
Msg-id 4E7A4923.1020900@agliodbs.com
Whole thread Raw
Responses Re: Broken selectivity with special inet operators  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
Summary: special inet operators ( << >> <<= =>> ) are
    up to 1000000X off in estimating rowcounts
Type: performance
Severity: normal
Tested on: 9.1.0
Description:

We've been noticing that row estimates for queries which use the =>> and
<<= operators for inet data were way, way off.  We finally narrowed the
problem down to a simple test:

===========
USING <<= :
===========

explain analyze
SELECT count(*)
FROM partition1 lh
WHERE lh.ip <<= '1.2.3'::cidr;


QUERY PLAN
.....
   ->  Index Scan using partition1_ip on partition1 lh
    (cost=0.00..10.21 rows=6956732 width=0)
    (actual time=0.016..0.016 rows=0 loops=1)
         Index Cond: ((ip >= '1.2.3.0/24'::inet) AND (ip <=
'1.2.3.255'::inet))
         Filter: (ip <<= '1.2.3.0/24'::inet)
.....

explain analyze
SELECT count(*)
FROM partition2 WHERE 1=1 AND ip <<= '87.178.193.0/24'::inet;

               QUERY PLAN
 Aggregate  (cost=18296.78..18296.79 rows=1 width=0) (actual
time=0.037..0.038 rows=1 loops=1)
   ->  Index Scan using partition2_ip on partition2  (cost=0.00..38.36
rows=7303365 width=0) (actual ti
me=0.022..0.031 rows=5 loops=1)
         Index Cond: ((ip >= '87.178.193.0/24'::inet) AND (ip <=
'87.178.193.255'::inet))
         Filter: (ip <<= '87.178.193.0/24'::inet)
 Total runtime: 0.107 ms

============
USING < > :
============

explain analyze
SELECT count(*)
FROM partition1 lh
WHERE lh.ip >= '1.2.3.0/24'::inet and lh.ip <= '1.2.3.255'::inet;

QUERY PLAN
....
   ->  Index Scan using partition1_ip on partition1 lh
    (cost=0.00..10.22 rows=1 width=0)
    (actual time=0.016..0.016 rows=0 loops=1)
         Index Cond: ((ip >= '1.2.3.0/24'::inet) AND (ip <=
'1.2.3.255'::inet))
....

explain analyze
SELECT count(*)
FROM partition2 WHERE 1=1 AND ip > '87.178.193.0'::inet and ip <=
'87.178.193.255'::inet;

            QUERY PLAN

 Aggregate  (cost=26.34..26.35 rows=1 width=0) (actual time=0.033..0.033
rows=1 loops=1)
   ->  Index Scan using partition2_ip on partition2  (cost=0.00..26.33
rows=5 width=0) (actual time=0.0
19..0.029 rows=5 loops=1)
         Index Cond: ((ip > '87.178.193.0'::inet) AND (ip <=
'87.178.193.255'::inet))
 Total runtime: 0.097 ms


====
Note that the mis-estimate of rows returned in each case is almost
exactly 50% of the total rows in the table.  That would suggest that
match_special_index_operator is failing, and not recognizing the <<=
operator for estimation purposes and just going with a default estimate
of 0.5.

I've tried to locate the cause of this problem, but the code involved is
rather convoluted and crusty, and I can't follow the logic.  Help?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

pgsql-bugs by date:

Previous
From: Euler Taveira de Oliveira
Date:
Subject: Re: Timezone issues with Postrres
Next
From: Tom Lane
Date:
Subject: Re: Timezone issues with Postrres