Thread: is a 'pairwise' possible / feasible in SQL?

is a 'pairwise' possible / feasible in SQL?

From
Rajarshi Guha
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi, I have a table of the form

aid     cid
- ----   -----
1      123
2      456
3      667
3      879
3      123
4      878
4      456
4      123
5      999
5      667
5      879

My goal is to identify for each pair of cid values, the number of
times they have the same aid

Thus for example I would have

pair        count
- ----        -----
123 & 456   1
667 & 879   2
...

I currently do this by using a Python script to do a pairwise lookup, as

select count(aid) where cid = 123 and cid = 456;

but I was wondering whether I could construct a single SQL statement
to do this.

Any pointers would be appreciated,

Thanks,
- -------------------------------------------------------------------
Rajarshi Guha  <rguha@indiana.edu>
GPG Fingerprint: D070 5427 CC5B 7938 929C  DD13 66A1 922C 51E7 9E84
- -------------------------------------------------------------------
All great discoveries are made by mistake.
    -- Young


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (Darwin)

iEYEARECAAYFAkiXRYUACgkQZqGSLFHnnoTJJQCgtvromGcYfQVGsekGFQJU6vTo
oHgAnjpfKSkZR0MqBjdE6WFGO8SBr2WH
=zZJk
-----END PGP SIGNATURE-----

Re: is a 'pairwise' possible / feasible in SQL?

From
"David Wilson"
Date:
On Mon, Aug 4, 2008 at 2:08 PM, Rajarshi Guha <rguha@indiana.edu> wrote:

> select count(aid) where cid = 123 and cid = 456;
>
> but I was wondering whether I could construct a single SQL statement to do
> this.
>
> Any pointers would be appreciated,

Typed into gmail, so may need some tweaking, but something to the effect of:

select count(*) from table a inner join table b on a.aid=b.bid group
by a.cid,b.cid;

should do the trick, I'd think...

--
- David T. Wilson
david.t.wilson@gmail.com

Re: is a 'pairwise' possible / feasible in SQL?

From
"Scott Marlowe"
Date:
On Mon, Aug 4, 2008 at 1:02 PM, David Wilson <david.t.wilson@gmail.com> wrote:
> On Mon, Aug 4, 2008 at 2:08 PM, Rajarshi Guha <rguha@indiana.edu> wrote:
>
>> select count(aid) where cid = 123 and cid = 456;
>>
>> but I was wondering whether I could construct a single SQL statement to do
>> this.
>>
>> Any pointers would be appreciated,
>
> Typed into gmail, so may need some tweaking, but something to the effect of:
>
> select count(*) from table a inner join table b on a.aid=b.bid group
> by a.cid,b.cid;
>
> should do the trick, I'd think...

But then you need remove the dups where you got:

667 999 2
999 667 2

Re: is a 'pairwise' possible / feasible in SQL?

From
"Francisco Reyes"
Date:
On 2:08 pm 08/04/08 Rajarshi Guha <rguha@indiana.edu> wrote:
> pair        count
> - ----        -----
> 123 & 456   1
> 667 & 879   2

create temp table aic_cid
(
id smallint,
cid smallint

);

insert into aic_cid values (1,123);
insert into aic_cid values (2,456);
insert into aic_cid values (3,667);
insert into aic_cid values (3,879);
insert into aic_cid values (3,123);
insert into aic_cid values (4,878);
insert into aic_cid values (4,456);
insert into aic_cid values (4,123);
insert into aic_cid values (5,999);
insert into aic_cid values (5,667);
insert into aic_cid values (5,879);


 select a.cid as ac, b.cid as bc, count(*) from aic_cid a left outer join
aic_cid b on a.cid <>b.cid and a.id = b.id where b.cid is not null group by
a.cid, b.cid order by a.cid;
 ac  | bc  | count
-----+-----+-------
 123 | 456 |     1
 123 | 667 |     1
 123 | 878 |     1
 123 | 879 |     1
 456 | 123 |     1
 456 | 878 |     1
 667 | 123 |     1
 667 | 879 |     2
 667 | 999 |     1
 878 | 123 |     1
 878 | 456 |     1
 879 | 123 |     1
 879 | 667 |     2
 879 | 999 |     1
 999 | 667 |     1
 999 | 879 |     1

Is that what you are looking for?


Re: is a 'pairwise' possible / feasible in SQL?

From
Rajarshi Guha
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Aug 4, 2008, at 4:55 PM, Francisco Reyes wrote:
> On 2:08 pm 08/04/08 Rajarshi Guha <rguha@indiana.edu> wrote:
>> pair        count
>> - ----        -----
>> 123 & 456   1
>> 667 & 879   2
>
<snip>

>  select a.cid as ac, b.cid as bc, count(*) from aic_cid a left
> outer join
> aic_cid b on a.cid <>b.cid and a.id = b.id where b.cid is not null
> group by
> a.cid, b.cid order by a.cid;
>  ac  | bc  | count
> -----+-----+-------
>  123 | 456 |     1
>  123 | 667 |     1
>  123 | 878 |     1
>  123 | 879 |     1
>  456 | 123 |     1
>  456 | 878 |     1
>  667 | 123 |     1
>  667 | 879 |     2
>  667 | 999 |     1
>  878 | 123 |     1
>  878 | 456 |     1
>  879 | 123 |     1
>  879 | 667 |     2
>  879 | 999 |     1
>  999 | 667 |     1
>  999 | 879 |     1
>
> Is that what you are looking for?

Thanks a lot - this is very close. Ideally, I'd want unique pairs, so
the row

879 | 999 |     1

is the same as

999 | 879 |     1

Can these duplicates be avoided?

- -------------------------------------------------------------------
Rajarshi Guha  <rguha@indiana.edu>
GPG Fingerprint: D070 5427 CC5B 7938 929C  DD13 66A1 922C 51E7 9E84
- -------------------------------------------------------------------
How I wish I were what I was when I wished I were what I am.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (Darwin)

iEYEARECAAYFAkiXbe8ACgkQZqGSLFHnnoRXPACeMcPqXG4QIf308ufnAHev9hlG
EEoAoLzU5tmL1ipiUIp69N9mOvnsfrES
=JOg1
-----END PGP SIGNATURE-----

Re: is a 'pairwise' possible / feasible in SQL?

From
Ragnar
Date:
On mán, 2008-08-04 at 17:00 -0400, Rajarshi Guha wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
>
> On Aug 4, 2008, at 4:55 PM, Francisco Reyes wrote:
> > On 2:08 pm 08/04/08 Rajarshi Guha <rguha@indiana.edu> wrote:
> >> pair        count
> >> - ----        -----
> >> 123 & 456   1
> >> 667 & 879   2
> >
> <snip>
>
> >  select a.cid as ac, b.cid as bc, count(*) from aic_cid a left
> > outer join
> > aic_cid b on a.cid <>b.cid and a.id = b.id where b.cid is not null
> > group by
> > a.cid, b.cid order by a.cid;
> >  ac  | bc  | count
> > -----+-----+-------
> >  123 | 456 |     1
> >  123 | 667 |     1
> > ...
> > Is that what you are looking for?
>
> Thanks a lot - this is very close. Ideally, I'd want unique pairs, so
> the row
>
> 879 | 999 |     1
>
> is the same as
>
> 999 | 879 |     1
>
> Can these duplicates be avoided?

just add a ac<bc condition:
   select a.cid as ac, b.cid as bc, count(*)
   from aic_cid a left outer join aic_cid b
           on a.cid <> b.cid and a.id = b.id
   where b.cid is not null AND  a.cid < b.cid
   group by a.cid, b.cid
   order by a.cid;

gnari



Re: is a 'pairwise' possible / feasible in SQL?

From
Rainer Pruy
Date:

Rajarshi Guha wrote
>
> On Aug 4, 2008, at 4:55 PM, Francisco Reyes wrote:
>> On 2:08 pm 08/04/08 Rajarshi Guha <rguha@indiana.edu> wrote:
>>> pair        count
>>> - ----        -----
>>> 123 & 456   1
>>> 667 & 879   2
>
> <snip>
>
>>  select a.cid as ac, b.cid as bc, count(*) from aic_cid a left outer join
>> aic_cid b on a.cid <>b.cid and a.id = b.id where b.cid is not null
>> group by
>> a.cid, b.cid order by a.cid;
>>  ac  | bc  | count
>> -----+-----+-------
>>  123 | 456 |     1
>>  123 | 667 |     1
>>  123 | 878 |     1
>>  123 | 879 |     1
>>  456 | 123 |     1
>>  456 | 878 |     1
>>  667 | 123 |     1
>>  667 | 879 |     2
>>  667 | 999 |     1
>>  878 | 123 |     1
>>  878 | 456 |     1
>>  879 | 123 |     1
>>  879 | 667 |     2
>>  879 | 999 |     1
>>  999 | 667 |     1
>>  999 | 879 |     1
>
>> Is that what you are looking for?
>
> Thanks a lot - this is very close. Ideally, I'd want unique pairs, so
> the row
>
> 879 | 999 |     1
>
> is the same as
>
> 999 | 879 |     1
>
> Can these duplicates be avoided?

Depends on values and other distinguishing attributes....
For the given example - assuming pairing of a given cid with itself is not to be expected:
add a "and a.cid < b.cid" to the query....

Rainer
>
> -------------------------------------------------------------------
> Rajarshi Guha  <rguha@indiana.edu>
> GPG Fingerprint: D070 5427 CC5B 7938 929C  DD13 66A1 922C 51E7 9E84
> -------------------------------------------------------------------
> How I wish I were what I was when I wished I were what I am.
>
>

Re: is a 'pairwise' possible / feasible in SQL?

From
Sam Mason
Date:
On Mon, Aug 04, 2008 at 05:00:31PM -0400, Rajarshi Guha wrote:
> On Aug 4, 2008, at 4:55 PM, Francisco Reyes wrote:
> > select a.cid as ac, b.cid as bc, count(*) from aic_cid a left
> >outer join
> >aic_cid b on a.cid <>b.cid and a.id = b.id where b.cid is not null
> >group by
> >a.cid, b.cid order by a.cid;
> >
> >Is that what you are looking for?
>
> Thanks a lot - this is very close. Ideally, I'd want unique pairs

You just need to change the "a.cid <> b.cid" equality to something
non-symmetric, i.e. "a.cid < b.cid".  I'm also not sure why an outer
join is being used.  I've rewritten it to:

  SELECT a.cid AS ac, b.cid AS bc, count(*)
  FROM aic_cid a, aic_cid b
  WHERE a.id = b.id AND a.cid < b.cid
  GROUP BY a.cid, b.cid
  ORDER BY a.cid, b.cid;

and seem to get similar results.


  Sam