Thread: Updating cidr column with network operator
Hi SQLers, I have a fk from address to network and try to update the foreign key column to point at the network, "it belongs to": CREATE TABLE network ( id cidr PRIMARY KEY , -- 'PK, ,IPv4/6 Network address' ) CREATE TABLE address ( id inet PRIMARY KEY , -- 'PK of IPv4/6 host address' network cidr NOT NULL -- 'FK to Network table' REFERENCES network ON DELETE CASCADE ON UPDATE CASCADE ) I tried (using the WHERE clause to eliminate the addresses were no corresponding net exists): UPDATE addressSET network = (SELECT N.id WHERE A.id << N.id)FROM address A, network NWHERE A.id << N.id; But this ended up with all network columns pointing at the same net (-:). Any help would be appreciated. Axel Axel Rau, Frankfurt, Germany +49-69-951418-0
Axel Rau wrote: > Hi SQLers, > > I have a fk from address to network and try to update the foreign key > column to point at the network, "it belongs to": > > CREATE TABLE network ( > id cidr PRIMARY KEY , -- 'PK, ,IPv4/6 Network address' > ) > > CREATE TABLE address ( > id inet PRIMARY KEY , -- 'PK of IPv4/6 host address' > network cidr NOT NULL -- 'FK to Network table' > REFERENCES network ON DELETE CASCADE ON UPDATE > CASCADE > ) > > I tried (using the WHERE clause to eliminate the addresses were no > corresponding net exists): But you can't insert a row in address w/o a valid network.id? That's what the fk ensures. Perhaps you could elaborate more? Are you trying to *put* on the fk and you currently have bad data? > UPDATE address > SET network = (SELECT N.id WHERE A.id << N.id) > FROM address A, network N > WHERE A.id << N.id; > This also makes no sense. For starters, << is "bitwise shift left" ... > But this ended up with all network columns pointing at the same net (-:). > Any help would be appreciated. > > Axel > Axel Rau, Frankfurt, Germany +49-69-951418-0 > > > ---------------------------(end of broadcast)--------------------------- > TIP 9: In versions below 8.0, the planner will ignore your desire to > choose an index scan if your joining column's datatypes do not > match -- Daryl "We want great men who, when fortune frowns, will not be discouraged." -- Colonel Henry Knox, 1776
Thank you for responding, Daryl, Am 22.09.2005 um 16:45 schrieb Daryl Richter: > Axel Rau wrote: >> Hi SQLers, >> I have a fk from address to network and try to update the foreign key >> column to point at the network, "it belongs to": >> CREATE TABLE network ( >> id cidr PRIMARY KEY , -- 'PK, ,IPv4/6 Network address' >> ) >> CREATE TABLE address ( >> id inet PRIMARY KEY , -- 'PK of IPv4/6 host >> address' >> network cidr NOT NULL -- 'FK to Network table' >> REFERENCES network ON DELETE CASCADE ON UPDATE >> CASCADE >> ) >> I tried (using the WHERE clause to eliminate the addresses were no >> corresponding net exists): > > But you can't insert a row in address w/o a valid network.id? That's > what the fk ensures. > > Perhaps you could elaborate more? Are you trying to *put* on the fk > and you currently have bad data? The fk requires a corresponding row in network. But my update tries to reference the right network, that one where the ip address belongs to. > >> UPDATE address >> SET network = (SELECT N.id WHERE A.id << N.id) >> FROM address A, network N >> WHERE A.id << N.id; > > This also makes no sense. For starters, << is "bitwise shift left" ... I'm using 8.0.3 and there are some new operators related to inet and cidr data types. On page 157, I found "<<" as address/network "is contained in" network. Finding the net where an address belongs to works as:SELECT id FROM network WHERE inet '$p_ipSource' << id; Axel Axel Rau, Frankfurt, Germany +49-69-951418-0
Axel Rau wrote: > Thank you for responding, Daryl, > > Am 22.09.2005 um 16:45 schrieb Daryl Richter: > >> Axel Rau wrote: >> >>> Hi SQLers, >>> I have a fk from address to network and try to update the foreign key >>> column to point at the network, "it belongs to": >>> CREATE TABLE network ( >>> id cidr PRIMARY KEY , -- 'PK, ,IPv4/6 Network address' >>> ) >>> CREATE TABLE address ( >>> id inet PRIMARY KEY , -- 'PK of IPv4/6 host address' >>> network cidr NOT NULL -- 'FK to Network table' >>> REFERENCES network ON DELETE CASCADE ON UPDATE >>> CASCADE >>> ) >>> I tried (using the WHERE clause to eliminate the addresses were no >>> corresponding net exists): >> >> >> But you can't insert a row in address w/o a valid network.id? That's >> what the fk ensures. >> >> Perhaps you could elaborate more? Are you trying to *put* on the fk >> and you currently have bad data? > > The fk requires a corresponding row in network. But my update tries to > reference the right network, that one where the ip address belongs to. I'm still not understanding what you're trying to do, perhaps its a language issue. :) Let me try again. I built your schema and inserted some rows: insert into network( id ) values( '10.1' ); insert into address( id, network ) values( '10.1.0.1', '10.1' ); insert into address( id, network ) values( '10.1.0.2','10.1' ); insert into address( id, network ) values( '10.1.0.3', '10.1' ); I then select from network: id ----------- 10.1.0.0/16 and from address: id network -------- ----------- 10.1.0.1 10.1.0.0/16 10.1.0.2 10.1.0.0/16 10.1.0.3 10.1.0.0/16 Why do you now want to update address.network? They are already pointing to the right network, aren't they? I think if you provide some sample data we can figure this out. > >> >>> UPDATE address >>> SET network = (SELECT N.id WHERE A.id << N.id) >>> FROM address A, network N >>> WHERE A.id << N.id; >> >> >> This also makes no sense. For starters, << is "bitwise shift left" ... > > I'm using 8.0.3 and there are some new operators related to inet and > cidr data types. > On page 157, I found "<<" as address/network "is contained in" network. > > Finding the net where an address belongs to works as: > SELECT id FROM network WHERE inet '$p_ipSource' << id; > Ahh, ok. see above. > Axel > > Axel Rau, Frankfurt, Germany +49-69-951418-0 > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: don't forget to increase your free space map settings -- Daryl
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Am 22.09.2005 um 22:26 schrieb Daryl Richter: > Axel Rau wrote: >> Thank you for responding, Daryl, >> Am 22.09.2005 um 16:45 schrieb Daryl Richter: >>> Axel Rau wrote: >>> >>>> Hi SQLers, >>>> I have a fk from address to network and try to update the foreign >>>> key column to point at the network, "it belongs to": >>>> CREATE TABLE network ( >>>> id cidr PRIMARY KEY , -- 'PK, ,IPv4/6 Network >>>> address' >>>> ) >>>> CREATE TABLE address ( >>>> id inet PRIMARY KEY , -- 'PK of IPv4/6 host >>>> address' >>>> network cidr NOT NULL -- 'FK to Network table' >>>> REFERENCES network ON DELETE CASCADE ON >>>> UPDATE CASCADE >>>> ) >>>> I tried (using the WHERE clause to eliminate the addresses were no >>>> corresponding net exists): >>> >>> >>> But you can't insert a row in address w/o a valid network.id? >>> That's what the fk ensures. >>> >>> Perhaps you could elaborate more? Are you trying to *put* on the fk >>> and you currently have bad data? >> The fk requires a corresponding row in network. But my update tries >> to reference the right network, that one where the ip address belongs >> to. > > I'm still not understanding what you're trying to do, perhaps its a > language issue. :) Let me try again. > > I built your schema and inserted some rows: > > insert into network( id ) values( '10.1' ); > > insert into address( id, network ) values( '10.1.0.1', '10.1' ); > insert into address( id, network ) values( '10.1.0.2', '10.1' ); > insert into address( id, network ) values( '10.1.0.3', '10.1' ); > > I then select from network: > > id > ----------- > 10.1.0.0/16 > > and from address: > > id network > -------- ----------- > 10.1.0.1 10.1.0.0/16 > 10.1.0.2 10.1.0.0/16 > 10.1.0.3 10.1.0.0/16 > > Why do you now want to update address.network? They are already > pointing to the right network, aren't they? Networks change during time, being diveded or aggregated or you just enter wrong data during insert. With the UPDATE below, I want to correct the addresses to again point at the right net. While writing this, I learn that because of the pk in network, UPDATEs will be difficult to accomplish (you may need a temporary net to park all addresses of a network to be divided, make the change in network and use the UPDATE below to adjust A.network. I use net '0.0.0.0/32' as 'UNKNOWN' net or for parking. > > I think if you provide some sample data we can figure this out. Yes, this a goof idea. Playing with small tables let you find quickly the right query. Lets start over with a slightly bigger collection of data: insert into network( id ) values( '10.1/16' ); insert into network( id ) values( '10.2/16' ); insert into network(id ) values( '10.3/16' ); insert into address( id, network ) values( '10.1.0.1', '10.1/16' ); insert into address( id, network ) values( '10.1.0.2','10.1/16' ); insert into address( id, network ) values( '10.1.0.3', '10.1/16' ); insert into address( id,network ) values( '10.1.0.4', '10.2/16' ); insert into address( id, network ) values( '10.1.0.5', '10.2/16' ); insertinto address( id, network ) values( '10.1.0.6', '10.3/16' ); insert into address( id, network ) values( '10.200.0.6', '10.3/16' ); -- address not in network insert into address( id, network ) values( '10.200.0.7', '10.3/16' ); -- address not in network During the UPDATE I must avoid NULLs in A.network. This SELECTSELECT A.id, A.network, N.id FROM address A, network N WHERE EXISTS (SELECT A.id WHERE A.id << N.id); does it right: id | network | id - ----------+-------------+------------- 10.1.0.1 | 10.1.0.0/16 | 10.1.0.0/16 10.1.0.2 | 10.1.0.0/16 | 10.1.0.0/16 10.1.0.3| 10.1.0.0/16 | 10.1.0.0/16 10.1.0.4 | 10.2.0.0/16 | 10.1.0.0/16 10.1.0.5 | 10.2.0.0/16 | 10.1.0.0/16 10.1.0.6 |10.3.0.0/16 | 10.1.0.0/16 NOT the UPDATE (meanwhile improved from below): UPDATE address SET network = (SELECT N.id WHERE A.id << N.id) FROM address A, network N, network N1 WHERE EXISTS(SELECT A.id WHERE A.id << N1.id); I get: 'ERROR: null value in column "network" violates not-null constraint'. So my problem reduces to some (basic) SQL question: "How do I avoid the NULLs with EXISTS () or IN () in the WHERE clause of an UPDATE.?" Apparently in this situation the WHERE acts not as an inner join. The problem seems to be the N.id in the SET expression, as shows by this SELECT:SELECT A.id, A.network, (SELECT N.id WHERE A.id << N.id) FROM address A, network N, network N1 WHERE EXISTS(SELECT A.id WHERE A.id << N1.id); which gives: id | network | ?column? - ----------+-------------+------------- 10.1.0.1 | 10.1.0.0/16 | 10.1.0.0/16 10.1.0.1 | 10.1.0.0/16 | 10.1.0.1 | 10.1.0.0/16| 10.1.0.2 | 10.1.0.0/16 | 10.1.0.0/16 10.1.0.2 | 10.1.0.0/16 | 10.1.0.2 | 10.1.0.0/16 | 10.1.0.3 | 10.1.0.0/16| 10.1.0.0/16 10.1.0.3 | 10.1.0.0/16 | 10.1.0.3 | 10.1.0.0/16 | 10.1.0.4 | 10.2.0.0/16 | 10.1.0.0/16 10.1.0.4| 10.2.0.0/16 | 10.1.0.4 | 10.2.0.0/16 | 10.1.0.5 | 10.2.0.0/16 | 10.1.0.0/16 10.1.0.5 | 10.2.0.0/16 | 10.1.0.5 |10.2.0.0/16 | 10.1.0.6 | 10.3.0.0/16 | 10.1.0.0/16 10.1.0.6 | 10.3.0.0/16 | 10.1.0.6 | 10.3.0.0/16 | So I'm looking forward to receive advice for suppressing the NULLs. >>> >>>> UPDATE address >>>> SET network = (SELECT N.id WHERE A.id << N.id) >>>> FROM address A, network N >>>> WHERE A.id << N.id; >>> >>> >>> This also makes no sense. For starters, << is "bitwise shift left" >>> ... >> I'm using 8.0.3 and there are some new operators related to inet and >> cidr data types. >> On page 157, I found "<<" as address/network "is contained in" >> network. >> Finding the net where an address belongs to works as: >> SELECT id FROM network WHERE inet '$p_ipSource' << id; > > Ahh, ok. see above. > Axel Axel Rau, Frankfurt, Germany +49-69-951418-0 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (Darwin) iQEVAwUBQzQt28Fz9+6bacTRAQI5mggAnZoX8wc3v4G/rZAfcr7BAuknSf3ENXBH F6fzpZphXPTqBOr45wtdPxCJPaT8bgx6TfgmyRBTx6Btxmz7iAKnDou7p41cTvSZ gllxr/BmmyzahxonPRqSI/EH10B5BqWdk+4YzwPVyElMLLdvlcq6+yz1jYv17V01 tju81zYt7gj1qwtwpqfJZy27UoO3JYRLsWsDQbXmiplH4IW2qIgIpvuya9Ewd9Eg G58cyet+Usp6QMXTq/vAPH85k7n+XtZ4RC4AAz0sm56CnJkuULeNDYkQTA74MpCC k2WN2R/booWQKtfptjD7ahWnjDI6II2RePLKF6yZP1+vKpdaWirTXA== =KNmK -----END PGP SIGNATURE-----
On Fri, Sep 23, 2005 at 06:31:17PM +0200, Axel Rau wrote: > Networks change during time, being diveded or aggregated or you just > enter wrong data during insert. Have you considered using a CHECK constraint and/or a trigger to ensure that the network in the network column contains the address in the id column? If you have and rejected the idea, what were the reasons? > With the UPDATE below, I want to correct the addresses to again point > at the right net. Does the following statement do what you want? It shouldn't touch the records with no matching network -- what do you want to happen in those cases? This update also might not give the results you want if more than one network matches. UPDATE address SET network = n.id FROM network n WHERE address.id << n.id; > While writing this, I learn that because of the pk in network, > UPDATEs will be difficult to accomplish (you may need a temporary > net to park all addresses of a network to be divided, make the > change in network and use the UPDATE below to adjust A.network. I > use net '0.0.0.0/32' as 'UNKNOWN' net or for parking. I'm not sure I understand what you're saying, but if you're concerned about foreign key violations then making the foreign key constraints deferrable and deferring them during certain updates might remove the need for a temporary "parking" network. -- Michael Fuhr
Axel Rau wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > > Am 22.09.2005 um 22:26 schrieb Daryl Richter: > >> Axel Rau wrote: >> >>> Thank you for responding, Daryl, >>> Am 22.09.2005 um 16:45 schrieb Daryl Richter: >>> >>>> Axel Rau wrote: >>>> >>>>> Hi SQLers, >>>>> I have a fk from address to network and try to update the foreign >>>>> key column to point at the network, "it belongs to": >>>>> CREATE TABLE network ( >>>>> id cidr PRIMARY KEY , -- 'PK, ,IPv4/6 Network address' >>>>> ) >>>>> CREATE TABLE address ( >>>>> id inet PRIMARY KEY , -- 'PK of IPv4/6 host >>>>> address' >>>>> network cidr NOT NULL -- 'FK to Network table' >>>>> REFERENCES network ON DELETE CASCADE ON >>>>> UPDATE CASCADE >>>>> ) >>>>> I tried (using the WHERE clause to eliminate the addresses were no >>>>> corresponding net exists): >>>> [snip] > > Networks change during time, being diveded or aggregated or you just > enter wrong data during insert. > With the UPDATE below, I want to correct the addresses to again point at > the right net. While writing this, > I learn that because of the pk in network, UPDATEs will be difficult to > accomplish (you may need a temporary net > to park all addresses of a network to be divided, make the change in > network and use the UPDATE below to > adjust A.network. I use net '0.0.0.0/32' as 'UNKNOWN' net or for parking. > >> >> I think if you provide some sample data we can figure this out. > > Yes, this a goof idea. Playing with small tables let you find quickly > the right query. Lets start over with a slightly bigger > collection of data: > > insert into network( id ) values( '10.1/16' ); > insert into network( id ) values( '10.2/16' ); > insert into network( id ) values( '10.3/16' ); > > insert into address( id, network ) values( '10.1.0.1', '10.1/16' ); > insert into address( id, network ) values( '10.1.0.2', '10.1/16' ); > insert into address( id, network ) values( '10.1.0.3', '10.1/16' ); > insert into address( id, network ) values( '10.1.0.4', '10.2/16' ); > insert into address( id, network ) values( '10.1.0.5', '10.2/16' ); > insert into address( id, network ) values( '10.1.0.6', '10.3/16' ); > > insert into address( id, network ) values( '10.200.0.6', '10.3/16' > ); -- address not in network > insert into address( id, network ) values( '10.200.0.7', '10.3/16' > ); -- address not in network > But those are bad inserts, right? I think that I now see what the problem is --> Why do you have a network table at all? It's redundant. If you just insert the ids into your address table, don't the new PostgreSQL operators give you all the information you need? So, for example, if I inserted the data above and then want to answer the question, "What are all my 16-bit subnets?" select distinct network( set_masklen( id, 16 ) ) from address; network ------------- 10.1.0.0/16 10.2.0.0/16 10.200.0.0/16 [rest snipped] -- Daryl
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Am 23.09.2005 um 19:32 schrieb Michael Fuhr: > On Fri, Sep 23, 2005 at 06:31:17PM +0200, Axel Rau wrote: >> Networks change during time, being diveded or aggregated or you just >> enter wrong data during insert. > > Have you considered using a CHECK constraint and/or a trigger to > ensure that the network in the network column contains the address > in the id column? If you have and rejected the idea, what were the > reasons? I'm sure this would be the cleanest solution but remember networks change. This constraind would have to update all details (addresses) of a 10/8 being splitted in a 10/9 and a 10.128/9. If this can be done with pg, it is above my current knowledge level. (But feel free to send a suggestion). The other point is performance. Inserting new addresses is a realtime job while correcting network changes is a daily maintenance job. > >> With the UPDATE below, I want to correct the addresses to again point >> at the right net. > > Does the following statement do what you want? Yes. Thank you. This was the 1st answer I'm looking for. Just too simple. > It shouldn't touch > the records with no matching network -- what do you want to happen > in those cases? They will be updated to reference '0.0.0.0/32' (the UNKNOWN net). Will this work: UPDATE address SET network = '0.0.0.0/32' WHERE NOT EXISTS (SELECT address.id << network.id); ? > This update also might not give the results you > want if more than one network matches. This is not possible, because the pk of network is the net cidr. > > UPDATE address SET network = n.id FROM network n WHERE address.id << > n.id; > >> While writing this, I learn that because of the pk in network, >> UPDATEs will be difficult to accomplish (you may need a temporary >> net to park all addresses of a network to be divided, make the >> change in network and use the UPDATE below to adjust A.network. I >> use net '0.0.0.0/32' as 'UNKNOWN' net or for parking. > > I'm not sure I understand what you're saying, but if you're concerned > about foreign key violations then making the foreign key constraints > deferrable and deferring them during certain updates might remove > the need for a temporary "parking" network. I see. But I have to sort in the orphaned addresses anyway if I delete a net for splitting or aggregating or even the net may be abandoned so its children will have to be moved to the UNKNOWN net. Axel Axel Rau, Frankfurt, Germany +49-69-951418-0 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (Darwin) iQEVAwUBQzRVS8Fz9+6bacTRAQKIlwgAn6G8mXkT+vODCW+n9/dUmOB/NYOJVfZL T7/oiYpSVWz1ApcIbcQii+RvhpEZXvgpHif8i5Nd0yeV2347PKwflttGSiWJxVPt mVUrYxjfIjAKmYhbOP25aHK/AGqgjgQRrCOosz3Kbzr5OY4kpNhF67oosGDpIVq+ DcC7nx6+QoHkFByBqL7xTlHDNBS98baVCeGDTIeaJOFEsU1u6t+29ORHloicBo6n 3QZz2qLTMVNzcX/mfS6BqV4POOMSza9zMyRApTwM5lwM+HBAOXvMJ0INiGA0hLE0 o+kVa0I0JTBD4RByxt9c66qFtFN5Y6oZFonm+pBA6nRliBIpt2/8ZA== =fhnD -----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Am 23.09.2005 um 20:43 schrieb Daryl Richter: [snip] > I think that I now see what the problem is --> Why do you have a > network table at all? It's redundant. There are more attributes and fks, I didn't mention to simplify. The schema is events -> addresses -> autnums -> organisations. Axel Axel Rau, Frankfurt, Germany +49-69-951418-0 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (Darwin) iQEVAwUBQzRWjsFz9+6bacTRAQIT4Af/fYbPJtgi9sDcWTm84hbtPl9BUS6Rt0Wo 728oZO7/iEXNSCoaZCqUymK2sKzsE5XV1EZL7xeBCoDZmj+i/3LS0cEAq3Q4corU HEXBvRQ1HcEbvaRioefo5pk6iKvgBxbz9guKawcPwgHJMdvAFSlJpDB1lXOkZgTJ P/8goloEygGFFTEEsquP+4Us0sdBDNe9g0TLapNwI8A9VQ/10tqKF29hEa+tUUun DjkdWAWndAiZvz5zoTfyBTFFKNu9e4dHYNSRMtmz16d5oMYwhIePuHT4lPw03hBW 0FOq4CnkF5YUIrC5XoEta6J1Zxf7FGY1ba+xHBUZjviKCtjtmPS34A== =XBqf -----END PGP SIGNATURE-----
On Fri, Sep 23, 2005 at 09:19:25PM +0200, Axel Rau wrote: > Am 23.09.2005 um 19:32 schrieb Michael Fuhr: > > On Fri, Sep 23, 2005 at 06:31:17PM +0200, Axel Rau wrote: > > > Networks change during time, being diveded or aggregated or you just > > > enter wrong data during insert. > > > > Have you considered using a CHECK constraint and/or a trigger to > > ensure that the network in the network column contains the address > > in the id column? If you have and rejected the idea, what were the > > reasons? > > I'm sure this would be the cleanest solution but remember networks > change. Yes, which is why it's a good idea to automatically propogate those changes to tables that maintain redundant data. If that data isn't reliable then there's little point in maintaining it. > This constraind would have to update all details (addresses) of a > 10/8 being splitted in a 10/9 and a 10.128/9. If this can be done > with pg, it is above my current knowledge level. (But feel free to > send a suggestion). See the documentation for PL/pgSQL and triggers. You could write a trigger function to automatically update the address table whenever the network table changes. Or, since the foreign keys already cascade on update, you could have a trigger on the address table that checks whether the new network contains the IP address, and if it doesn't then it looks up the correct network. > The other point is performance. Inserting new addresses is a realtime > job while correcting network changes is a daily maintenance job. Triggers on update shouldn't affect insert performance, and since you already have a foreign key constraint to slow inserts down, adding a CHECK constraint should have negligible impact. The need for regular corrections is a sign that perhaps the design could be improved. This is one reason to avoid maintaining redundant data if possible: you have to take additional steps to ensure that it remains consistent. > > This update also might not give the results you want if more than > > one network matches. > > This is not possible, because the pk of network is the net cidr. Yes, it is possible, because the update's join condition isn't equality but rather containment. If the network table contains 10.1.0.0/16 and 10.1.0.0/24, then the IP address 10.1.0.1 would match both. If your application prevents 10.1.0.0/16 and 10.1.0.0/24 from both being in the network table then *that's* the reason multiple matches aren't possible, but it's not because of the primary key. -- Michael Fuhr
Am 26.09.2005 um 02:05 schrieb Michael Fuhr: > On Fri, Sep 23, 2005 at 09:19:25PM +0200, Axel Rau wrote: >> Am 23.09.2005 um 19:32 schrieb Michael Fuhr: >>> On Fri, Sep 23, 2005 at 06:31:17PM +0200, Axel Rau wrote: >>>> Networks change during time, being diveded or aggregated or you just >>>> enter wrong data during insert. >>> >>> Have you considered using a CHECK constraint and/or a trigger to >>> ensure that the network in the network column contains the address >>> in the id column? If you have and rejected the idea, what were the >>> reasons? >> >> I'm sure this would be the cleanest solution but remember networks >> change. > > Yes, which is why it's a good idea to automatically propogate those > changes to tables that maintain redundant data. I would not call it redundant but normalized, because network has some attributes, common to all addresses in the net, 1st of all the netmask. > If that data isn't > reliable then there's little point in maintaining it. Reliability is a big issue in my application, because it's some kind of "data mining" of internet structures (networks, Autonomous Systems and abuse addresses). Whois data is seldom correct, so I changed recently to use the internet routing table for the most important network data. > >> This constraind would have to update all details (addresses) of a >> 10/8 being splitted in a 10/9 and a 10.128/9. If this can be done >> with pg, it is above my current knowledge level. (But feel free to >> send a suggestion). > > See the documentation for PL/pgSQL and triggers. You could write > a trigger function to automatically update the address table whenever > the network table changes. Or, since the foreign keys already > cascade on update, you could have a trigger on the address table > that checks whether the new network contains the IP address, and > if it doesn't then it looks up the correct network. I will try this, but be sure I will come back with questions. (-;). > >> The other point is performance. Inserting new addresses is a realtime >> job while correcting network changes is a daily maintenance job. > > Triggers on update shouldn't affect insert performance, and since > you already have a foreign key constraint to slow inserts down, > adding a CHECK constraint should have negligible impact. > > The need for regular corrections is a sign that perhaps the design > could be improved. This is one reason to avoid maintaining redundant > data if possible: you have to take additional steps to ensure that > it remains consistent. I agree. > >>> This update also might not give the results you want if more than >>> one network matches. >> >> This is not possible, because the pk of network is the net cidr. > > Yes, it is possible, because the update's join condition isn't > equality but rather containment. If the network table contains > 10.1.0.0/16 and 10.1.0.0/24, then the IP address 10.1.0.1 would > match both. You mean, unique does not imply none-overlapping for data-type network? Oh, I didn't know that. Who is responsible for this func spec? This is completly contra- real-world-experience. Can this be re-considered for a future release? I do understand now the background of your arguments. First what I have to do, is to fix that network table to forbid overlapps. I expect that I will find overlapps already in the table, because I have not yet written the maintenance code to deleting/reorganizing nets (-;). Deleting involves scanning a 300 MB flat file and looking which row in network has no longer an entry in the flat file. I did try this in pg in the 1st place but could not keep up updating 9 million rows 3 times in 4 hours on a 2x 900MHz 2GB Powermac G4. I currently have no idea how to solve that (Possibly worth another thread). > If your application prevents 10.1.0.0/16 and 10.1.0.0/24 > from both being in the network table then *that's* the reason > multiple matches aren't possible, but it's not because of the > primary key. > > -- > Michael Fuhr Thank you for taking the time to explain this, Axel Axel Rau, Frankfurt, Germany +49-69-951418-0
On Mon, Sep 26, 2005 at 12:34:59PM +0200, Axel Rau wrote: > Am 26.09.2005 um 02:05 schrieb Michael Fuhr: > > On Fri, Sep 23, 2005 at 09:19:25PM +0200, Axel Rau wrote: > > > I'm sure this would be the cleanest solution but remember networks > > > change. > > > > Yes, which is why it's a good idea to automatically propogate those > > changes to tables that maintain redundant data. > > I would not call it redundant but normalized, because network has some > attributes, common to all addresses in the net, 1st of all the netmask. An attribute is redundant if it repeats a fact that can be learned without it. If one table contains IP addresses and another contains networks, then you can associate IP addresses and networks with a join of the two tables; indeed, this is how the "fix the network column" update works. Having a network column in the address table simply repeats what could be learned through the join. > > > > This update also might not give the results you want if more than > > > > one network matches. > > > > > > This is not possible, because the pk of network is the net cidr. > > > > Yes, it is possible, because the update's join condition isn't > > equality but rather containment. If the network table contains > > 10.1.0.0/16 and 10.1.0.0/24, then the IP address 10.1.0.1 would > > match both. > > You mean, unique does not imply none-overlapping for data-type > network? Oh, I didn't know that. > > Who is responsible for this func spec? This is completly contra- > real-world-experience. Can this be re-considered for a future release? This isn't "completely contra-real-world-experience" -- it's just contrary to your particular use case. The networks 10.1.0.0/16 and 10.1.0.0/24 are in fact different, and in some applications having both in a table's primary key column would be perfectly legitimate. For example, a table might store network administration information, where the administrator for 10.1.0.0/16 as a whole is Group X, and the administrator for 10.1.0.0/24 in particular is Group Y. -- Michael Fuhr
Michael Fuhr wrote:> On Mon, Sep 26, 2005 at 12:34:59PM +0200, Axel Rau wrote:>>>Am 26.09.2005 um 02:05 schrieb Michael Fuhr:>>>>>OnFri, Sep 23, 2005 at 09:19:25PM +0200, Axel Rau wrote:>>>>>>>I'm sure this would be the cleanest solution butremember networks>>>>change.>>>>>>Yes, which is why it's a good idea to automatically propogate those>>>changes to tablesthat maintain redundant data.>>>>I would not call it redundant but normalized, because network has some>>attributes,common to all addresses in the net, 1st of all the netmask.>>> An attribute is redundant if it repeats afact that can be learned> without it. If one table contains IP addresses and another contains> networks, then you can associateIP addresses and networks with a> join of the two tables; indeed, this is how the "fix the network> column" updateworks. Having a network column in the address table> simply repeats what could be learned through the join.>> I agree with Michael here. I think the fundamental problem with your schema is that it is possible to have contradictory data between the network and address table, always a bad situation. I would replace network.id with a serial type value and make the cidr a separate column, for example: CREATE TABLE network ( id int not null PRIMARY KEY, address cidr not null, attr1 varchar(10) null ); CREATE TABLE address ( id inet PRIMARY KEY, network int NOT NULL REFERENCES network ); insert into network( id, address, attr1 ) values( 1, '10.1', 'a' ); insert into network( id, address, attr1 ) values( 2, '10.2', 'b' ); go insert into address( id, network ) values( '10.1.0.1', 1 ); insert into address( id, network ) values( '10.1.0.2', 1 ); insert into address( id, network ) values( '10.1.0.3', 1 ); -- OOPS! insert into address( id, network ) values( '10.2.0.4', 1 ); go -- This will "correct" the any addresses put in the wrong network update address set network = ( select id from network where address >> address.id ) [additional network stuff snipped] -- Daryl "We want great men who, when fortune frowns, will not be discouraged." -- Colonel Henry Knox, 1776
Am 27.09.2005 um 16:02 schrieb Daryl Richter: > > An attribute is redundant if it repeats a fact that can be learned > > without it. If one table contains IP addresses and another contains > > networks, then you can associate IP addresses and networks with a > > join of the two tables; indeed, this is how the "fix the network > > column" update works. Having a network column in the address table > > simply repeats what could be learned through the join. > > > > > > I agree with Michael here. I think the fundamental problem with your > schema is that it is possible to have contradictory data between the > network and address table, always a bad situation. > > I would replace network.id with a serial type value and make the cidr > a separate column, for example: > > CREATE TABLE network ( > id int not null PRIMARY KEY, > address cidr not null, > attr1 varchar(10) null > ); > > CREATE TABLE address ( > id inet PRIMARY KEY, > network int NOT NULL > REFERENCES network > ); I agree with Michael too, but I understand him differently: What he says is: "Get rid of the redundancy", which means to me: "remove the fk from address to network completly". The attribute "network" is not realy needed because we can always join address.id << network.id This reduces the necessary logic to keep things consistent. I still can have my cascaded delete in network, have to do it with a trigger. I'm currently looking at performance issues. Introducing a synthetic pk in network does not really make things easier. Instead I introduced an insert/update trigger which prevents from overlaps in network (which is not as a matter of course for cidr columns, I have learnt): CREATE OR REPLACE FUNCTION dd.ids_network_update_insert_check() RETURNS TRIGGER AS $$ BEGIN -- check if new net overlapps with existing one PERFORM N.id FROM network N WHERE NEW.id << N.id OR NEW.id >> N.id; IF FOUND THEN RAISE EXCEPTION '?Attempt to insert overlapping network %', NEW.id; RETURN NULL; END IF; RETURN NEW; END; $$ LANGUAGE 'plpgsql'; Axel Axel Rau, Frankfurt, Germany +49-69-951418-0
Axel.Rau@Chaos1.DE wrote: > > Am 27.09.2005 um 16:02 schrieb Daryl Richter: > > > >>>An attribute is redundant if it repeats a fact that can be learned >>>without it. If one table contains IP addresses and another contains >>>networks, then you can associate IP addresses and networks with a >>>join of the two tables; indeed, this is how the "fix the network >>>column" update works. Having a network column in the address table >>>simply repeats what could be learned through the join. >>> >>> >> >>I agree with Michael here. I think the fundamental problem with your >>schema is that it is possible to have contradictory data between the >>network and address table, always a bad situation. >> >>I would replace network.id with a serial type value and make the cidr a >>separate column, for example: >> >>CREATE TABLE network ( >> id int not null PRIMARY KEY, >> address cidr not null, >> attr1 varchar(10) null >>); >> >>CREATE TABLE address ( >> id inet PRIMARY KEY, >> network int NOT NULL >> REFERENCES network >>); >> >> > > I agree with Michael too, but I understand him differently: What he says is: > > "Get rid of the redundancy", > which means to me: > "remove the fk from address to network completly". > The attribute "network" is not realy needed because we can always join > address.id << network.id > This reduces the necessary logic to keep things consistent. I still can have > > my cascaded delete in network, have to do it with a trigger. > I'm currently looking at performance issues. > > Introducing a synthetic pk in network does not really make things easier. > Instead I introduced an insert/update trigger which prevents from overlaps > in network (which is not as a matter of course for cidr columns, I have > learnt): > Ok, I guess, but isn't tit true now that you can insert a new address row which doesn't belong to any valid network?? > CREATE OR REPLACE FUNCTION dd.ids_network_update_insert_check() RETURNS > TRIGGER AS $$ > BEGIN -- check if new net overlapps with existing one > PERFORM N.id FROM network N WHERE NEW.id << N.id OR NEW.id >> N.id; > IF FOUND THEN > RAISE EXCEPTION '?Attempt to insert overlapping network %', > NEW.id; > RETURN NULL; > END IF; > RETURN NEW; > END; > $$ LANGUAGE 'plpgsql'; > > Axel > > Axel Rau, Frankfurt, Germany +49-69-951418-0 -- Daryl "We want great men who, when fortune frowns, will not be discouraged." -- Colonel Henry Knox, 1776
Am 27.09.2005 um 17:02 schrieb Daryl Richter: > > Ok, I guess, but isn't tit true now that you can insert a new address > row which doesn't belong to any valid network?? Yes, I can. But in the earlier approach, the fk pointed at a special row in network ("UNKNOWN"), which maks no big difference. It's the nature of my ids/abuse application, that there are always faked source addresses, which may not belong to any routed net. Axel Axel Rau, Frankfurt, Germany +49-69-951418-0
Axel.Rau@Chaos1.DE wrote: > > Am 27.09.2005 um 16:02 schrieb Daryl Richter: > > > >>>An attribute is redundant if it repeats a fact that can be learned >>>without it. If one table contains IP addresses and another contains >>>networks, then you can associate IP addresses and networks with a >>>join of the two tables; indeed, this is how the "fix the network >>>column" update works. Having a network column in the address table >>>simply repeats what could be learned through the join. >>> >>> >> >>I agree with Michael here. I think the fundamental problem with your >>schema is that it is possible to have contradictory data between the >>network and address table, always a bad situation. >> >>I would replace network.id with a serial type value and make the cidr a >>separate column, for example: >> >>CREATE TABLE network ( >> id int not null PRIMARY KEY, >> address cidr not null, >> attr1 varchar(10) null >>); >> >>CREATE TABLE address ( >> id inet PRIMARY KEY, >> network int NOT NULL >> REFERENCES network >>); >> >> > > I agree with Michael too, but I understand him differently: What he says is: > > "Get rid of the redundancy", > which means to me: > "remove the fk from address to network completly". > The attribute "network" is not realy needed because we can always join > address.id << network.id > This reduces the necessary logic to keep things consistent. I still can have > > my cascaded delete in network, have to do it with a trigger. > I'm currently looking at performance issues. > > Introducing a synthetic pk in network does not really make things easier. > Instead I introduced an insert/update trigger which prevents from overlaps > in network (which is not as a matter of course for cidr columns, I have > learnt): > Ok, I guess, but isn't tit true now that you can insert a new address row which doesn't belong to any valid network?? > CREATE OR REPLACE FUNCTION dd.ids_network_update_insert_check() RETURNS > TRIGGER AS $$ > BEGIN -- check if new net overlapps with existing one > PERFORM N.id FROM network N WHERE NEW.id << N.id OR NEW.id >> N.id; > IF FOUND THEN > RAISE EXCEPTION '?Attempt to insert overlapping network %', > NEW.id; > RETURN NULL; > END IF; > RETURN NEW; > END; > $$ LANGUAGE 'plpgsql'; > > Axel > > Axel Rau, Frankfurt, Germany +49-69-951418-0 -- Daryl "We want great men who, when fortune frowns, will not be discouraged." -- Colonel Henry Knox, 1776
Am 27.09.2005 um 17:02 schrieb Daryl Richter: > > Ok, I guess, but isn't tit true now that you can insert a new address > row which doesn't belong to any valid network?? Yes, I can. But in the earlier approach, the fk pointed at a special row in network ("UNKNOWN"), which maks no big difference. It's the nature of my ids/abuse application, that there are always faked source addresses, which may not belong to any routed net. Axel Axel Rau, Frankfurt, Germany +49-69-951418-0