On Tue, 7 May 2002 11:03:50 -0400
Sean McCorkle <mccorkle@avenger.bio.bnl.gov> wrote:
> but I (and my colleagues) would much rather see this, which draws
> attention to the duplicates (or multiples) in the left column.
>
> tag gb_id pos descrip
>
> ACTATTTTTAGAGACCC NM_032685.1 307 hypothetical protein MGC13005 (MGC13005),
> AGAAAAAAAAAAAAAAA NM_019110.2 27 hypothetical protein P1 p373c6 (P1P373C6)
> NM_019110.2 27 hypothetical protein P1 p373c6 (P1P373C6)
> AGCCACCACGCCTGGTC NM_003693.1 260 acetyl LDL receptor; SREC=scavenger
> AGCCACCGCGCCCGGCC NM_007081.1 486 RAB, member of RAS oncogene family-like 2B
> NM_013412.1 486 RAB, member of RAS oncogene family-like 2A
> AGCCACCGCGCCTGGCC NM_000651.2 229 complement component (3b/4b) receptor 1,
> NM_000573.2 229 complement component (3b/4b) receptor 1,
> ATCAAAAAAAAAAAAAA NM_079421.1 25 cyclin-dependent kinase inhibitor 2D
How about this method of appending sequences as unique indices?
First:
create temp sequence dna_rownum1;
create temp sequence dna_rownum2;
Secand:
SELECT setval('dna_rownum1', 1, false); -- (1)
SELECT setval('dna_rownum2', 1, false); -- (2)
SELECT (CASE WHEN t1.idx = t3.idx THEN t1.tag ELSE NULL END) AS tag, t1.gb_id, t1.pos,
t1.descrip FROM (SELECT *, nextval('dna_rownum1') AS idx FROM dna ORDER BY idx
) AS t1, (SELECT t2.tag, MIN(t2.idx) AS idx FROM (SELECT tag,
nextval('dna_rownum2')AS idx FROM dna ORDER BY idx) AS t2 GROUP BY
t2.tag ) AS t3WHERE t1.tag = t3.tagORDER BY t1.tag, t1.idx; -- (3)
Note: (1) and (2) need to be executed at the same time, but (3) doesn't. And CREATE TEMP SEQUENCE is practicable
in7.2 or later.
Regards,
Masaru Sugawara