is a 'pairwise' possible / feasible in SQL?
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi, I have a table of the form
aid cid
- ---- -----
1 123
2 456
3 667
3 879
3 123
4 878
4 456
4 123
5 999
5 667
5 879
My goal is to identify for each pair of cid values, the number of
times they have the same aid
Thus for example I would have
pair count
- ---- -----
123 & 456 1
667 & 879 2
...
I currently do this by using a Python script to do a pairwise lookup, as
select count(aid) where cid = 123 and cid = 456;
but I was wondering whether I could construct a single SQL statement
to do this.
Any pointers would be appreciated,
Thanks,
- -------------------------------------------------------------------
Rajarshi Guha <rguha@indiana.edu>
GPG Fingerprint: D070 5427 CC5B 7938 929C DD13 66A1 922C 51E7 9E84
- -------------------------------------------------------------------
All great discoveries are made by mistake.
-- Young
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (Darwin)
iEYEARECAAYFAkiXRYUACgkQZqGSLFHnnoTJJQCgtvromGcYfQVGsekGFQJU6vTo
oHgAnjpfKSkZR0MqBjdE6WFGO8SBr2WH
=zZJk
-----END PGP SIGNATURE-----
On Mon, Aug 4, 2008 at 2:08 PM, Rajarshi Guha <rguha@indiana.edu> wrote:
select count(aid) where cid = 123 and cid = 456;
but I was wondering whether I could construct a single SQL statement to do
this.Any pointers would be appreciated,
Typed into gmail, so may need some tweaking, but something to the effect of:
select count(*) from table a inner join table b on a.aid=b.bid group
by a.cid,b.cid;
should do the trick, I'd think...
--
- David T. Wilson
david.t.wilson@gmail.com
On Mon, Aug 4, 2008 at 1:02 PM, David Wilson <david.t.wilson@gmail.com> wrote:
On Mon, Aug 4, 2008 at 2:08 PM, Rajarshi Guha <rguha@indiana.edu> wrote:
select count(aid) where cid = 123 and cid = 456;
but I was wondering whether I could construct a single SQL statement to do
this.Any pointers would be appreciated,
Typed into gmail, so may need some tweaking, but something to the effect of:
select count(*) from table a inner join table b on a.aid=b.bid group
by a.cid,b.cid;should do the trick, I'd think...
But then you need remove the dups where you got:
667 999 2
999 667 2
On 2:08 pm 08/04/08 Rajarshi Guha <rguha@indiana.edu> wrote:
pair count
- ---- -----
123 & 456 1
667 & 879 2
create temp table aic_cid
(
id smallint,
cid smallint
);
insert into aic_cid values (1,123);
insert into aic_cid values (2,456);
insert into aic_cid values (3,667);
insert into aic_cid values (3,879);
insert into aic_cid values (3,123);
insert into aic_cid values (4,878);
insert into aic_cid values (4,456);
insert into aic_cid values (4,123);
insert into aic_cid values (5,999);
insert into aic_cid values (5,667);
insert into aic_cid values (5,879);
select a.cid as ac, b.cid as bc, count(*) from aic_cid a left outer join
aic_cid b on a.cid <>b.cid and a.id = b.id where b.cid is not null group by
a.cid, b.cid order by a.cid;
ac | bc | count
-----+-----+-------
123 | 456 | 1
123 | 667 | 1
123 | 878 | 1
123 | 879 | 1
456 | 123 | 1
456 | 878 | 1
667 | 123 | 1
667 | 879 | 2
667 | 999 | 1
878 | 123 | 1
878 | 456 | 1
879 | 123 | 1
879 | 667 | 2
879 | 999 | 1
999 | 667 | 1
999 | 879 | 1
Is that what you are looking for?
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On Aug 4, 2008, at 4:55 PM, Francisco Reyes wrote:
On 2:08 pm 08/04/08 Rajarshi Guha <rguha@indiana.edu> wrote:
pair count
- ---- -----
123 & 456 1
667 & 879 2
<snip>
select a.cid as ac, b.cid as bc, count(*) from aic_cid a left
outer join
aic_cid b on a.cid <>b.cid and a.id = b.id where b.cid is not null
group by
a.cid, b.cid order by a.cid;
ac | bc | count
-----+-----+-------
123 | 456 | 1
123 | 667 | 1
123 | 878 | 1
123 | 879 | 1
456 | 123 | 1
456 | 878 | 1
667 | 123 | 1
667 | 879 | 2
667 | 999 | 1
878 | 123 | 1
878 | 456 | 1
879 | 123 | 1
879 | 667 | 2
879 | 999 | 1
999 | 667 | 1
999 | 879 | 1Is that what you are looking for?
Thanks a lot - this is very close. Ideally, I'd want unique pairs, so
the row
879 | 999 | 1
is the same as
999 | 879 | 1
Can these duplicates be avoided?
- -------------------------------------------------------------------
Rajarshi Guha <rguha@indiana.edu>
GPG Fingerprint: D070 5427 CC5B 7938 929C DD13 66A1 922C 51E7 9E84
- -------------------------------------------------------------------
How I wish I were what I was when I wished I were what I am.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (Darwin)
iEYEARECAAYFAkiXbe8ACgkQZqGSLFHnnoRXPACeMcPqXG4QIf308ufnAHev9hlG
EEoAoLzU5tmL1ipiUIp69N9mOvnsfrES
=JOg1
-----END PGP SIGNATURE-----
Rajarshi Guha wrote
On Aug 4, 2008, at 4:55 PM, Francisco Reyes wrote:
On 2:08 pm 08/04/08 Rajarshi Guha <rguha@indiana.edu> wrote:
pair count
- ---- -----
123 & 456 1
667 & 879 2<snip>
select a.cid as ac, b.cid as bc, count(*) from aic_cid a left outer join
aic_cid b on a.cid <>b.cid and a.id = b.id where b.cid is not null
group by
a.cid, b.cid order by a.cid;
ac | bc | count
-----+-----+-------
123 | 456 | 1
123 | 667 | 1
123 | 878 | 1
123 | 879 | 1
456 | 123 | 1
456 | 878 | 1
667 | 123 | 1
667 | 879 | 2
667 | 999 | 1
878 | 123 | 1
878 | 456 | 1
879 | 123 | 1
879 | 667 | 2
879 | 999 | 1
999 | 667 | 1
999 | 879 | 1Is that what you are looking for?
Thanks a lot - this is very close. Ideally, I'd want unique pairs, so
the row879 | 999 | 1
is the same as
999 | 879 | 1
Can these duplicates be avoided?
Depends on values and other distinguishing attributes....
For the given example - assuming pairing of a given cid with itself is not to be expected:
add a "and a.cid < b.cid" to the query....
Rainer
Show quoted text
-------------------------------------------------------------------
Rajarshi Guha <rguha@indiana.edu>
GPG Fingerprint: D070 5427 CC5B 7938 929C DD13 66A1 922C 51E7 9E84
-------------------------------------------------------------------
How I wish I were what I was when I wished I were what I am.
On m�n, 2008-08-04 at 17:00 -0400, Rajarshi Guha wrote:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1On Aug 4, 2008, at 4:55 PM, Francisco Reyes wrote:
On 2:08 pm 08/04/08 Rajarshi Guha <rguha@indiana.edu> wrote:
pair count
- ---- -----
123 & 456 1
667 & 879 2<snip>
select a.cid as ac, b.cid as bc, count(*) from aic_cid a left
outer join
aic_cid b on a.cid <>b.cid and a.id = b.id where b.cid is not null
group by
a.cid, b.cid order by a.cid;
ac | bc | count
-----+-----+-------
123 | 456 | 1
123 | 667 | 1
...
Is that what you are looking for?Thanks a lot - this is very close. Ideally, I'd want unique pairs, so
the row879 | 999 | 1
is the same as
999 | 879 | 1
Can these duplicates be avoided?
just add a ac<bc condition:
select a.cid as ac, b.cid as bc, count(*)
from aic_cid a left outer join aic_cid b
on a.cid <> b.cid and a.id = b.id
where b.cid is not null AND a.cid < b.cid
group by a.cid, b.cid
order by a.cid;
gnari
On Mon, Aug 04, 2008 at 05:00:31PM -0400, Rajarshi Guha wrote:
On Aug 4, 2008, at 4:55 PM, Francisco Reyes wrote:
select a.cid as ac, b.cid as bc, count(*) from aic_cid a left
outer join
aic_cid b on a.cid <>b.cid and a.id = b.id where b.cid is not null
group by
a.cid, b.cid order by a.cid;Is that what you are looking for?
Thanks a lot - this is very close. Ideally, I'd want unique pairs
You just need to change the "a.cid <> b.cid" equality to something
non-symmetric, i.e. "a.cid < b.cid". I'm also not sure why an outer
join is being used. I've rewritten it to:
SELECT a.cid AS ac, b.cid AS bc, count(*)
FROM aic_cid a, aic_cid b
WHERE a.id = b.id AND a.cid < b.cid
GROUP BY a.cid, b.cid
ORDER BY a.cid, b.cid;
and seem to get similar results.
Sam