Thread: JOINing based on whether an IP address is contained within a CIDR range?
Hi,<br /><br />I am storing a log of HTTP requests in a database table (including IP address):<br /><br />http_log: id(PK),path, time, ip<br /><br />I have another table that contains CIDR ranges and names for them:<br /><br />network_names:id(PK), cidr, name <br /><br />Some example data for both tables:<br /><br />network_names:<br />1, <a href="http://192.168.0.0/24">192.168.0.0/24</a>,'Engineering'<br />2, <a href="http://192.168.1.0/24">192.168.1.0/24</a>,'Media'<br /> 3, <a href="http://192.168.2.0/24">192.168.2.0/24</a>, 'Engineering'<br/>4, <a href="http://192.168.3.0/24">192.168.3.0/24</a>, 'Accounting'<br />5, <a href="http://192.168.4.0/24">192.168.4.0/24</a>,'Engineering' <br />6, <a href="http://10.0.0.0/8">10.0.0.0/8</a>, 'Engineering'<br/><br />http_log:<br />1, '/index.html', 110000001, <a href="http://192.168.0.47/32">192.168.0.47/32</a><br/> 2, '/index.html', 110000023, <a href="http://200.1.2.3/32">200.1.2.3/32</a><br/> 3, '/index.html', 110000059, <a href="http://1.2.3.4/32">1.2.3.4/32</a><br/> 4, '/index.html', 110000232, <a href="http://192.168.2.1/32">192.168.2.1/32</a><br/> 5, '/index.html', 113919102, <a href="http://192.168.1.39/32">192.168.1.39/32</a><br/> 6, '/index.html', 129101293, <a href="http://10.2.2.4/32">10.2.2.4/32</a><br/> 7, '/index.html', 132828282, <a href="http://192.168.4.2/32">192.168.4.2/32</a><br/><br />Now, in trying to produce a report on this data, I've come up againstan interesting (to me at least!) problem..<br /><br />I basically want the same output as in http_log, but substitutingthe IP with the network name where available, i.e:<br /><br />1, '/index.html', 110000001, Engineering<br />2, '/index.html', 110000023, <a href="http://200.1.2.3/32">200.1.2.3/32</a><br /> 3, '/index.html', 110000059, <a href="http://1.2.3.4/32">1.2.3.4/32</a><br/> 4, '/index.html', 110000232, Engineering<br /> 5, '/index.html', 113919102,Media<br /> 6, '/index.html', 129101293, Engineering<br /> 7, '/index.html', 132828282, Engineering<br /><br />I'mwondering what the best way of doing this is (considering that http_log could have >100000 rows) Is it possibleto somehow JOIN using the <<= and >>= network operators? Or would I have to iterate the network_namestable manually with LOOP (or something) on every row of the http_log? <br /><br />If anyone can share some advice,that would be great!<br /><br />Thanks,<br />JST<br />
On Oct 25, 2007, at 1:22 PM, Jamie Tufnell wrote: > Hi, > > I am storing a log of HTTP requests in a database table (including > IP address): > > http_log: id(PK), path, time, ip > > I have another table that contains CIDR ranges and names for them: > > network_names: id(PK), cidr, name > > Some example data for both tables: > > network_names: > 1, 192.168.0.0/24, 'Engineering' > 2, 192.168.1.0/24, 'Media' > 3, 192.168.2.0/24, 'Engineering' > 4, 192.168.3.0/24, 'Accounting' > 5, 192.168.4.0/24, 'Engineering' > 6, 10.0.0.0/8, 'Engineering' > > http_log: > 1, '/index.html', 110000001, 192.168.0.47/32 > 2, '/index.html', 110000023, 200.1.2.3/32 > 3, '/index.html', 110000059, 1.2.3.4/32 > 4, '/index.html', 110000232, 192.168.2.1/32 > 5, '/index.html', 113919102, 192.168.1.39/32 > 6, '/index.html', 129101293, 10.2.2.4/32 > 7, '/index.html', 132828282, 192.168.4.2/32 > > Now, in trying to produce a report on this data, I've come up > against an interesting (to me at least!) problem.. > > I basically want the same output as in http_log, but substituting > the IP with the network name where available, i.e: > > 1, '/index.html', 110000001, Engineering > 2, '/index.html', 110000023, 200.1.2.3/32 > 3, '/index.html', 110000059, 1.2.3.4/32 > 4, '/index.html', 110000232, Engineering > 5, '/index.html', 113919102, Media > 6, '/index.html', 129101293, Engineering > 7, '/index.html', 132828282, Engineering > > I'm wondering what the best way of doing this is (considering that > http_log could have >100000 rows) Is it possible to somehow JOIN > using the <<= and >>= network operators? Or would I have to > iterate the network_names table manually with LOOP (or something) > on every row of the http_log? > > If anyone can share some advice, that would be great! Check out: http://www.postgresql.org/docs/8.2/interactive/functions- net.html Erik Jones Software Developer | Emma® erik@myemma.com 800.595.4401 or 615.292.5888 615.292.0777 (fax) Emma helps organizations everywhere communicate & market in style. Visit us online at http://www.myemma.com
"Jamie Tufnell" <diesql@googlemail.com> writes: > Is it possible to somehow JOIN using the <<= and > = network operators? Sure, why not? You probably won't get a plan that's smarter than a nestloop, but I imagine you'll have few enough rows in network_names that it won't matter. One potential problem is if there could be overlapping entries in network_names --- the join would give you multiple rows for any log entry that matches more than one network_names entry. You could fix that using a subselect with LIMIT 1, as long as you weren't too picky about which name got reported. regards, tom lane
Re: JOINing based on whether an IP address is contained within a CIDR range?
From
Harald Fuchs
Date:
In article <b0a4f3350710251122y10648d4id322f12a81001c07@mail.gmail.com>, "Jamie Tufnell" <diesql@googlemail.com> writes: > Hi, > I am storing a log of HTTP requests in a database table (including IP address): > http_log: id(PK), path, time, ip > I have another table that contains CIDR ranges and names for them: > network_names: id(PK), cidr, name > Some example data for both tables: > network_names: > 1, 192.168.0.0/24, 'Engineering' > 2, 192.168.1.0/24, 'Media' > 3, 192.168.2.0/24, 'Engineering' > 4, 192.168.3.0/24, 'Accounting' > 5, 192.168.4.0/24, 'Engineering' > 6, 10.0.0.0/8, 'Engineering' > http_log: > 1, '/index.html', 110000001, 192.168.0.47/32 > 2, '/index.html', 110000023, 200.1.2.3/32 > 3, '/index.html', 110000059, 1.2.3.4/32 > 4, '/index.html', 110000232, 192.168.2.1/32 > 5, '/index.html', 113919102, 192.168.1.39/32 > 6, '/index.html', 129101293, 10.2.2.4/32 > 7, '/index.html', 132828282, 192.168.4.2/32 > Now, in trying to produce a report on this data, I've come up against an > interesting (to me at least!) problem.. > I basically want the same output as in http_log, but substituting the IP with > the network name where available, i.e: > 1, '/index.html', 110000001, Engineering > 2, '/index.html', 110000023, 200.1.2.3/32 > 3, '/index.html', 110000059, 1.2.3.4/32 > 4, '/index.html', 110000232, Engineering > 5, '/index.html', 113919102, Media > 6, '/index.html', 129101293, Engineering > 7, '/index.html', 132828282, Engineering > I'm wondering what the best way of doing this is (considering that http_log > could have >100000 rows) Is it possible to somehow JOIN using the <<= and >>= > network operators? There are PostgreSQL builtin functions for that, but I think they are unable to use indexes. I use http://pgfoundry.org/projects/ip4r/ and I think it's the best thing since the invention of sliced bread ;-)
Re: JOINing based on whether an IP address is contained within a CIDR range?
From
"Jonah H. Harris"
Date:
On 10/26/07, Harald Fuchs <hf0217x@protecting.net> wrote: > There are PostgreSQL builtin functions for that, but I think they are > unable to use indexes. I use http://pgfoundry.org/projects/ip4r/ > and I think it's the best thing since the invention of sliced bread ;-) Yes: ip4r('ip') <<= ip4r(cidr('cidr')) -- Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324 EnterpriseDB Corporation | fax: 732.331.1301 499 Thornall Street, 2nd Floor | jonah.harris@enterprisedb.com Edison, NJ 08837 | http://www.enterprisedb.com/