Thread: is a 'pairwise' possible / feasible in SQL?
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi, I have a table of the form aid cid - ---- ----- 1 123 2 456 3 667 3 879 3 123 4 878 4 456 4 123 5 999 5 667 5 879 My goal is to identify for each pair of cid values, the number of times they have the same aid Thus for example I would have pair count - ---- ----- 123 & 456 1 667 & 879 2 ... I currently do this by using a Python script to do a pairwise lookup, as select count(aid) where cid = 123 and cid = 456; but I was wondering whether I could construct a single SQL statement to do this. Any pointers would be appreciated, Thanks, - ------------------------------------------------------------------- Rajarshi Guha <rguha@indiana.edu> GPG Fingerprint: D070 5427 CC5B 7938 929C DD13 66A1 922C 51E7 9E84 - ------------------------------------------------------------------- All great discoveries are made by mistake. -- Young -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.8 (Darwin) iEYEARECAAYFAkiXRYUACgkQZqGSLFHnnoTJJQCgtvromGcYfQVGsekGFQJU6vTo oHgAnjpfKSkZR0MqBjdE6WFGO8SBr2WH =zZJk -----END PGP SIGNATURE-----
On Mon, Aug 4, 2008 at 2:08 PM, Rajarshi Guha <rguha@indiana.edu> wrote: > select count(aid) where cid = 123 and cid = 456; > > but I was wondering whether I could construct a single SQL statement to do > this. > > Any pointers would be appreciated, Typed into gmail, so may need some tweaking, but something to the effect of: select count(*) from table a inner join table b on a.aid=b.bid group by a.cid,b.cid; should do the trick, I'd think... -- - David T. Wilson david.t.wilson@gmail.com
On Mon, Aug 4, 2008 at 1:02 PM, David Wilson <david.t.wilson@gmail.com> wrote: > On Mon, Aug 4, 2008 at 2:08 PM, Rajarshi Guha <rguha@indiana.edu> wrote: > >> select count(aid) where cid = 123 and cid = 456; >> >> but I was wondering whether I could construct a single SQL statement to do >> this. >> >> Any pointers would be appreciated, > > Typed into gmail, so may need some tweaking, but something to the effect of: > > select count(*) from table a inner join table b on a.aid=b.bid group > by a.cid,b.cid; > > should do the trick, I'd think... But then you need remove the dups where you got: 667 999 2 999 667 2
On 2:08 pm 08/04/08 Rajarshi Guha <rguha@indiana.edu> wrote: > pair count > - ---- ----- > 123 & 456 1 > 667 & 879 2 create temp table aic_cid ( id smallint, cid smallint ); insert into aic_cid values (1,123); insert into aic_cid values (2,456); insert into aic_cid values (3,667); insert into aic_cid values (3,879); insert into aic_cid values (3,123); insert into aic_cid values (4,878); insert into aic_cid values (4,456); insert into aic_cid values (4,123); insert into aic_cid values (5,999); insert into aic_cid values (5,667); insert into aic_cid values (5,879); select a.cid as ac, b.cid as bc, count(*) from aic_cid a left outer join aic_cid b on a.cid <>b.cid and a.id = b.id where b.cid is not null group by a.cid, b.cid order by a.cid; ac | bc | count -----+-----+------- 123 | 456 | 1 123 | 667 | 1 123 | 878 | 1 123 | 879 | 1 456 | 123 | 1 456 | 878 | 1 667 | 123 | 1 667 | 879 | 2 667 | 999 | 1 878 | 123 | 1 878 | 456 | 1 879 | 123 | 1 879 | 667 | 2 879 | 999 | 1 999 | 667 | 1 999 | 879 | 1 Is that what you are looking for?
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Aug 4, 2008, at 4:55 PM, Francisco Reyes wrote: > On 2:08 pm 08/04/08 Rajarshi Guha <rguha@indiana.edu> wrote: >> pair count >> - ---- ----- >> 123 & 456 1 >> 667 & 879 2 > <snip> > select a.cid as ac, b.cid as bc, count(*) from aic_cid a left > outer join > aic_cid b on a.cid <>b.cid and a.id = b.id where b.cid is not null > group by > a.cid, b.cid order by a.cid; > ac | bc | count > -----+-----+------- > 123 | 456 | 1 > 123 | 667 | 1 > 123 | 878 | 1 > 123 | 879 | 1 > 456 | 123 | 1 > 456 | 878 | 1 > 667 | 123 | 1 > 667 | 879 | 2 > 667 | 999 | 1 > 878 | 123 | 1 > 878 | 456 | 1 > 879 | 123 | 1 > 879 | 667 | 2 > 879 | 999 | 1 > 999 | 667 | 1 > 999 | 879 | 1 > > Is that what you are looking for? Thanks a lot - this is very close. Ideally, I'd want unique pairs, so the row 879 | 999 | 1 is the same as 999 | 879 | 1 Can these duplicates be avoided? - ------------------------------------------------------------------- Rajarshi Guha <rguha@indiana.edu> GPG Fingerprint: D070 5427 CC5B 7938 929C DD13 66A1 922C 51E7 9E84 - ------------------------------------------------------------------- How I wish I were what I was when I wished I were what I am. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.8 (Darwin) iEYEARECAAYFAkiXbe8ACgkQZqGSLFHnnoRXPACeMcPqXG4QIf308ufnAHev9hlG EEoAoLzU5tmL1ipiUIp69N9mOvnsfrES =JOg1 -----END PGP SIGNATURE-----
On mán, 2008-08-04 at 17:00 -0400, Rajarshi Guha wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > > On Aug 4, 2008, at 4:55 PM, Francisco Reyes wrote: > > On 2:08 pm 08/04/08 Rajarshi Guha <rguha@indiana.edu> wrote: > >> pair count > >> - ---- ----- > >> 123 & 456 1 > >> 667 & 879 2 > > > <snip> > > > select a.cid as ac, b.cid as bc, count(*) from aic_cid a left > > outer join > > aic_cid b on a.cid <>b.cid and a.id = b.id where b.cid is not null > > group by > > a.cid, b.cid order by a.cid; > > ac | bc | count > > -----+-----+------- > > 123 | 456 | 1 > > 123 | 667 | 1 > > ... > > Is that what you are looking for? > > Thanks a lot - this is very close. Ideally, I'd want unique pairs, so > the row > > 879 | 999 | 1 > > is the same as > > 999 | 879 | 1 > > Can these duplicates be avoided? just add a ac<bc condition: select a.cid as ac, b.cid as bc, count(*) from aic_cid a left outer join aic_cid b on a.cid <> b.cid and a.id = b.id where b.cid is not null AND a.cid < b.cid group by a.cid, b.cid order by a.cid; gnari
Rajarshi Guha wrote > > On Aug 4, 2008, at 4:55 PM, Francisco Reyes wrote: >> On 2:08 pm 08/04/08 Rajarshi Guha <rguha@indiana.edu> wrote: >>> pair count >>> - ---- ----- >>> 123 & 456 1 >>> 667 & 879 2 > > <snip> > >> select a.cid as ac, b.cid as bc, count(*) from aic_cid a left outer join >> aic_cid b on a.cid <>b.cid and a.id = b.id where b.cid is not null >> group by >> a.cid, b.cid order by a.cid; >> ac | bc | count >> -----+-----+------- >> 123 | 456 | 1 >> 123 | 667 | 1 >> 123 | 878 | 1 >> 123 | 879 | 1 >> 456 | 123 | 1 >> 456 | 878 | 1 >> 667 | 123 | 1 >> 667 | 879 | 2 >> 667 | 999 | 1 >> 878 | 123 | 1 >> 878 | 456 | 1 >> 879 | 123 | 1 >> 879 | 667 | 2 >> 879 | 999 | 1 >> 999 | 667 | 1 >> 999 | 879 | 1 > >> Is that what you are looking for? > > Thanks a lot - this is very close. Ideally, I'd want unique pairs, so > the row > > 879 | 999 | 1 > > is the same as > > 999 | 879 | 1 > > Can these duplicates be avoided? Depends on values and other distinguishing attributes.... For the given example - assuming pairing of a given cid with itself is not to be expected: add a "and a.cid < b.cid" to the query.... Rainer > > ------------------------------------------------------------------- > Rajarshi Guha <rguha@indiana.edu> > GPG Fingerprint: D070 5427 CC5B 7938 929C DD13 66A1 922C 51E7 9E84 > ------------------------------------------------------------------- > How I wish I were what I was when I wished I were what I am. > >
On Mon, Aug 04, 2008 at 05:00:31PM -0400, Rajarshi Guha wrote: > On Aug 4, 2008, at 4:55 PM, Francisco Reyes wrote: > > select a.cid as ac, b.cid as bc, count(*) from aic_cid a left > >outer join > >aic_cid b on a.cid <>b.cid and a.id = b.id where b.cid is not null > >group by > >a.cid, b.cid order by a.cid; > > > >Is that what you are looking for? > > Thanks a lot - this is very close. Ideally, I'd want unique pairs You just need to change the "a.cid <> b.cid" equality to something non-symmetric, i.e. "a.cid < b.cid". I'm also not sure why an outer join is being used. I've rewritten it to: SELECT a.cid AS ac, b.cid AS bc, count(*) FROM aic_cid a, aic_cid b WHERE a.id = b.id AND a.cid < b.cid GROUP BY a.cid, b.cid ORDER BY a.cid, b.cid; and seem to get similar results. Sam