Re: Updating cidr column with network operator - Mailing list pgsql-sql

From Axel Rau
Subject Re: Updating cidr column with network operator
Date
Msg-id f10bbbc30411862e1691fc7a905a8e5e@Chaos1.DE
Whole thread Raw
In response to Re: Updating cidr column with network operator  (Michael Fuhr <mike@fuhr.org>)
Responses Re: Updating cidr column with network operator
List pgsql-sql
Am 26.09.2005 um 02:05 schrieb Michael Fuhr:

> On Fri, Sep 23, 2005 at 09:19:25PM +0200, Axel Rau wrote:
>> Am 23.09.2005 um 19:32 schrieb Michael Fuhr:
>>> On Fri, Sep 23, 2005 at 06:31:17PM +0200, Axel Rau wrote:
>>>> Networks change during time, being diveded or aggregated or you just
>>>> enter wrong data during insert.
>>>
>>> Have you considered using a CHECK constraint and/or a trigger to
>>> ensure that the network in the network column contains the address
>>> in the id column?  If you have and rejected the idea, what were the
>>> reasons?
>>
>> I'm sure this would be the cleanest solution but remember networks
>> change.
>
> Yes, which is why it's a good idea to automatically propogate those
> changes to tables that maintain redundant data.
I would not call it redundant but normalized, because network has some
attributes, common to all addresses in the net, 1st of all the netmask.
>   If that data isn't
> reliable then there's little point in maintaining it.
Reliability is a big issue in my application, because it's some kind of 
"data mining"
of internet structures (networks, Autonomous Systems and abuse 
addresses).
Whois data is seldom correct, so I changed recently to use the internet 
routing
table for the most important network data.
>
>> This constraind would have to update all details (addresses) of a
>> 10/8 being splitted in a 10/9 and a 10.128/9. If this can be done
>> with pg, it is above my current knowledge level. (But feel free to
>> send a suggestion).
>
> See the documentation for PL/pgSQL and triggers.  You could write
> a trigger function to automatically update the address table whenever
> the network table changes.  Or, since the foreign keys already
> cascade on update, you could have a trigger on the address table
> that checks whether the new network contains the IP address, and
> if it doesn't then it looks up the correct network.
I will try this, but be sure I will come back with questions. (-;).
>
>> The other point is performance. Inserting new addresses is a realtime
>> job while correcting network changes is a daily maintenance job.
>
> Triggers on update shouldn't affect insert performance, and since
> you already have a foreign key constraint to slow inserts down,
> adding a CHECK constraint should have negligible impact.
>
> The need for regular corrections is a sign that perhaps the design
> could be improved.  This is one reason to avoid maintaining redundant
> data if possible: you have to take additional steps to ensure that
> it remains consistent.
I agree.
>
>>> This update also might not give the results you want if more than
>>> one network matches.
>>
>> This is not possible, because the pk of network is the net cidr.
>
> Yes, it is possible, because the update's join condition isn't
> equality but rather containment.  If the network table contains
> 10.1.0.0/16 and 10.1.0.0/24, then the IP address 10.1.0.1 would
> match both.
You mean, unique does not imply none-overlapping for data-type
network? Oh, I didn't know that.

Who is responsible for this func spec? This is completly contra-
real-world-experience. Can this be re-considered for a future release?

I do understand now the background of your arguments.
First what I have to do, is to fix that network table to forbid 
overlapps.
I expect that I will find overlapps already in the table, because I have
not yet written the maintenance code to deleting/reorganizing nets (-;).

Deleting involves scanning a 300 MB flat file and looking which row
in network has no longer an entry in the flat file. I did try this in 
pg in
the 1st place but could not keep up updating 9 million rows 3 times
in 4 hours on a 2x 900MHz 2GB Powermac G4. I currently have no
idea how to solve that (Possibly worth another thread).

>  If your application prevents 10.1.0.0/16 and 10.1.0.0/24
> from both being in the network table then *that's* the reason
> multiple matches aren't possible, but it's not because of the
> primary key.
>
> -- 
> Michael Fuhr

Thank you for taking the time to explain this,
Axel

Axel Rau, Frankfurt, Germany                           +49-69-951418-0



pgsql-sql by date:

Previous
From: Michael Fuhr
Date:
Subject: Re: Updating cidr column with network operator
Next
From: Chris Browne
Date:
Subject: Re: Possible to delete record from all tables at the same time?