Thread: JOINing based on whether an IP address is contained within a CIDR range?

JOINing based on whether an IP address is contained within a CIDR range?

From
"Jamie Tufnell"
Date:
Hi,<br /><br />I am storing a log of HTTP requests in a database table (including IP address):<br /><br />http_log:
id(PK),path, time, ip<br /><br />I have another table that contains CIDR ranges and names for them:<br /><br
/>network_names:id(PK), cidr, name <br /><br />Some example data for both tables:<br /><br />network_names:<br />1, <a
href="http://192.168.0.0/24">192.168.0.0/24</a>,'Engineering'<br />2, <a
href="http://192.168.1.0/24">192.168.1.0/24</a>,'Media'<br /> 3, <a href="http://192.168.2.0/24">192.168.2.0/24</a>,
'Engineering'<br/>4, <a href="http://192.168.3.0/24">192.168.3.0/24</a>, 'Accounting'<br />5, <a
href="http://192.168.4.0/24">192.168.4.0/24</a>,'Engineering' <br />6, <a href="http://10.0.0.0/8">10.0.0.0/8</a>,
'Engineering'<br/><br />http_log:<br />1, '/index.html', 110000001, <a
href="http://192.168.0.47/32">192.168.0.47/32</a><br/> 2, '/index.html', 110000023, <a
href="http://200.1.2.3/32">200.1.2.3/32</a><br/> 3, '/index.html', 110000059, <a
href="http://1.2.3.4/32">1.2.3.4/32</a><br/> 4, '/index.html', 110000232, <a
href="http://192.168.2.1/32">192.168.2.1/32</a><br/> 5, '/index.html', 113919102, <a
href="http://192.168.1.39/32">192.168.1.39/32</a><br/> 6, '/index.html', 129101293, <a
href="http://10.2.2.4/32">10.2.2.4/32</a><br/> 7, '/index.html', 132828282, <a
href="http://192.168.4.2/32">192.168.4.2/32</a><br/><br />Now, in trying to produce a report on this data, I've come up
againstan interesting (to me at least!) problem..<br /><br />I basically want the same output as in http_log, but
substitutingthe IP with the network name where available, i.e:<br /><br />1, '/index.html', 110000001, Engineering<br
/>2, '/index.html', 110000023, <a href="http://200.1.2.3/32">200.1.2.3/32</a><br /> 3, '/index.html', 110000059, <a
href="http://1.2.3.4/32">1.2.3.4/32</a><br/> 4, '/index.html', 110000232, Engineering<br /> 5, '/index.html',
113919102,Media<br /> 6, '/index.html', 129101293, Engineering<br /> 7, '/index.html', 132828282, Engineering<br /><br
/>I'mwondering what the best way of doing this is (considering that http_log could have >100000 rows)  Is it
possibleto somehow JOIN using the <<= and >>= network operators?  Or would I have to iterate the
network_namestable manually with LOOP (or something) on every row of the http_log? <br /><br />If anyone can share some
advice,that would be great!<br /><br />Thanks,<br />JST<br /> 

Re: JOINing based on whether an IP address is contained within a CIDR range?

From
Erik Jones
Date:
On Oct 25, 2007, at 1:22 PM, Jamie Tufnell wrote:

> Hi,
>
> I am storing a log of HTTP requests in a database table (including
> IP address):
>
> http_log: id(PK), path, time, ip
>
> I have another table that contains CIDR ranges and names for them:
>
> network_names: id(PK), cidr, name
>
> Some example data for both tables:
>
> network_names:
> 1, 192.168.0.0/24, 'Engineering'
> 2, 192.168.1.0/24, 'Media'
> 3, 192.168.2.0/24, 'Engineering'
> 4, 192.168.3.0/24, 'Accounting'
> 5, 192.168.4.0/24, 'Engineering'
> 6, 10.0.0.0/8, 'Engineering'
>
> http_log:
> 1, '/index.html', 110000001, 192.168.0.47/32
> 2, '/index.html', 110000023, 200.1.2.3/32
> 3, '/index.html', 110000059, 1.2.3.4/32
> 4, '/index.html', 110000232, 192.168.2.1/32
> 5, '/index.html', 113919102, 192.168.1.39/32
> 6, '/index.html', 129101293, 10.2.2.4/32
> 7, '/index.html', 132828282, 192.168.4.2/32
>
> Now, in trying to produce a report on this data, I've come up
> against an interesting (to me at least!) problem..
>
> I basically want the same output as in http_log, but substituting
> the IP with the network name where available, i.e:
>
> 1, '/index.html', 110000001, Engineering
> 2, '/index.html', 110000023, 200.1.2.3/32
> 3, '/index.html', 110000059, 1.2.3.4/32
> 4, '/index.html', 110000232, Engineering
> 5, '/index.html', 113919102, Media
> 6, '/index.html', 129101293, Engineering
> 7, '/index.html', 132828282, Engineering
>
> I'm wondering what the best way of doing this is (considering that
> http_log could have >100000 rows)  Is it possible to somehow JOIN
> using the <<= and >>= network operators?  Or would I have to
> iterate the network_names table manually with LOOP (or something)
> on every row of the http_log?
>
> If anyone can share some advice, that would be great!

Check out: http://www.postgresql.org/docs/8.2/interactive/functions-
net.html

Erik Jones

Software Developer | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com




"Jamie Tufnell" <diesql@googlemail.com> writes:
> Is it possible to somehow JOIN using the <<= and
> = network operators?

Sure, why not?

You probably won't get a plan that's smarter than a nestloop, but I
imagine you'll have few enough rows in network_names that it won't
matter.

One potential problem is if there could be overlapping entries in
network_names --- the join would give you multiple rows for any
log entry that matches more than one network_names entry.  You
could fix that using a subselect with LIMIT 1, as long as you
weren't too picky about which name got reported.
        regards, tom lane


Re: JOINing based on whether an IP address is contained within a CIDR range?

From
Harald Fuchs
Date:
In article <b0a4f3350710251122y10648d4id322f12a81001c07@mail.gmail.com>,
"Jamie Tufnell" <diesql@googlemail.com> writes:

> Hi,
> I am storing a log of HTTP requests in a database table (including IP address):

> http_log: id(PK), path, time, ip

> I have another table that contains CIDR ranges and names for them:

> network_names: id(PK), cidr, name

> Some example data for both tables:

> network_names:
> 1, 192.168.0.0/24, 'Engineering'
> 2, 192.168.1.0/24, 'Media'
> 3, 192.168.2.0/24, 'Engineering'
> 4, 192.168.3.0/24, 'Accounting'
> 5, 192.168.4.0/24, 'Engineering'
> 6, 10.0.0.0/8, 'Engineering'

> http_log:
> 1, '/index.html', 110000001, 192.168.0.47/32
> 2, '/index.html', 110000023, 200.1.2.3/32
> 3, '/index.html', 110000059, 1.2.3.4/32
> 4, '/index.html', 110000232, 192.168.2.1/32
> 5, '/index.html', 113919102, 192.168.1.39/32
> 6, '/index.html', 129101293, 10.2.2.4/32
> 7, '/index.html', 132828282, 192.168.4.2/32

> Now, in trying to produce a report on this data, I've come up against an
> interesting (to me at least!) problem..

> I basically want the same output as in http_log, but substituting the IP with
> the network name where available, i.e:

> 1, '/index.html', 110000001, Engineering
> 2, '/index.html', 110000023, 200.1.2.3/32
> 3, '/index.html', 110000059, 1.2.3.4/32
> 4, '/index.html', 110000232, Engineering
> 5, '/index.html', 113919102, Media
> 6, '/index.html', 129101293, Engineering
> 7, '/index.html', 132828282, Engineering

> I'm wondering what the best way of doing this is (considering that http_log
> could have >100000 rows)  Is it possible to somehow JOIN using the <<= and >>=
> network operators?

There are PostgreSQL builtin functions for that, but I think they are
unable to use indexes. I use http://pgfoundry.org/projects/ip4r/
and I think it's the best thing since the invention of sliced bread ;-)



Re: JOINing based on whether an IP address is contained within a CIDR range?

From
"Jonah H. Harris"
Date:
On 10/26/07, Harald Fuchs <hf0217x@protecting.net> wrote:
> There are PostgreSQL builtin functions for that, but I think they are
> unable to use indexes. I use http://pgfoundry.org/projects/ip4r/
> and I think it's the best thing since the invention of sliced bread ;-)

Yes:

ip4r('ip') <<= ip4r(cidr('cidr'))

-- 
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation                | fax: 732.331.1301
499 Thornall Street, 2nd Floor          | jonah.harris@enterprisedb.com
Edison, NJ 08837                        | http://www.enterprisedb.com/