I'm trying to join two tables on an inet column, where one of the
columns may contain a subnet rather than a single host. Somehow the
operation isn't completing quite fast enough, even though neither table
is very large:
table | rows
--------------------+--------
clients | 115472
clients_commercial | 11670
First attempt, cancelled after running for half an hour:
SELECT
c.address AS address,
cc.address AS network
FROM
clients c
JOIN clients_commercial cc ON (c.address <<= cc.address)
;
Nested Loop
(cost=189.00..27359887.76 rows=607947200 width=22)
Join Filter: ("outer".address <<= "inner".address)
-> Seq Scan on clients c
(cost=0.00..2074.76 rows=102176 width=11)
-> Materialize
(cost=189.00..308.00 rows=11900 width=11)
-> Seq Scan on clients_commercial cc
(cost=0.00..189.00 rows=11900 width=11)
Second attempt, completes within 10 min:
SELECT
c.address AS address,
cc.address AS network
FROM
clients c,
clients_commercial cc
WHERE
c.commercial IS NULL
AND c.address <<= cc.address
;
Nested Loop
(cost=189.00..139084.01 rows=3040450 width=22)
Join Filter: ("outer".address <<= "inner".address)
-> Seq Scan on clients c
(cost=0.00..2074.76 rows=511 width=11)
Filter: (commercial IS NULL)
-> Materialize
(cost=189.00..308.00 rows=11900 width=11)
-> Seq Scan on clients_commercial cc
(cost=0.00..189.00 rows=11900 width=11)
Third attempt; provided some indexes, which unfortunately don't get
used, making the query twice as slow as the previous one:
SELECT
c.address AS address,
cc.address AS network
FROM
clients c,
clients_commercial cc
WHERE
c.commercial IS NULL
AND set_masklen(c.address, masklen(cc.address)) = cc.address
;
CREATE INDEX clients_commercial_masklen_idx
ON clients_commercial((masklen(address)));
CREATE INDEX clients_32_idx
ON clients((set_masklen(address, 32)));
CREATE INDEX clients_24_idx
ON clients((set_masklen(address, 24)));
CREATE INDEX clients_16_idx
ON clients((set_masklen(address, 16)));
Nested Loop
(cost=189.00..169488.51 rows=479 width=22)
Join Filter: (set_masklen("outer".address, masklen("inner".address))
= "inner".address)
-> Seq Scan on clients c
(cost=0.00..2074.76 rows=511 width=11)
Filter: (commercial IS NULL)
-> Materialize
(cost=189.00..308.00 rows=11900 width=11)
-> Seq Scan on clients_commercial cc
(cost=0.00..189.00 rows=11900 width=11)
Anything else I could try? BTREE indexes don't seem to work with the <<=
operator; is this not possible in principal, or simply something that
has not been implmented yet?