Thread: Aggregate question

Aggregate question

From
Sean Davis
Date:
I'm sorry for the simple question, but....  I have a table (description
given below) that I want to do something like:

select oligo,target_id,max(2*matches-mismatch) as "score" from hit
natural join oligo where oligo like 'H200000%' group by
oligo,target_id;


    oligo    |            target_id            | score
------------+---------------------------------+-------
  H200000001 | ENST00000286479                 |   138
  H200000001 | gi|4557782|ref|NM_000015.1|     |   138
  H200000005 | ENST00000206765                 |   138
  H200000005 | gi|4507474|ref|NM_000359.1|     |   138
  H200000006 | ENST00000262093                 |   138
  H200000006 | gi|4557592|ref|NM_000140.1|     |   138
  H200000007 | ENST00000287225                 |   138
  H200000007 | ENST00000327775                 |    96  <---I don't want
this
  H200000007 | gi|4504012|ref|NM_000170.1|     |   138
  H200000008 | ENST00000278888                 |   138
  H200000008 | gi|23397640|ref|NM_000139.2|    |   138
  H200000010 | ENST00000309399                 |   138
  H200000010 | gi|6806892|ref|NM_000595.2|     |   138

But, what I actually want is only those target_ids that reach the max
score like:


    oligo    |            target_id            | score
------------+---------------------------------+-------
  H200000001 | ENST00000286479                 |   138
  H200000001 | gi|4557782|ref|NM_000015.1|     |   138
  H200000005 | ENST00000206765                 |   138
  H200000005 | gi|4507474|ref|NM_000359.1|     |   138
  H200000006 | ENST00000262093                 |   138
  H200000006 | gi|4557592|ref|NM_000140.1|     |   138
  H200000007 | ENST00000287225                 |   138
  H200000007 | gi|4504012|ref|NM_000170.1|     |   138
  H200000008 | ENST00000278888                 |   138
  H200000008 | gi|23397640|ref|NM_000139.2|    |   138
  H200000010 | ENST00000309399                 |   138
  H200000010 | gi|6806892|ref|NM_000595.2|     |   138

I just can't seem to quite get it.

Thanks,
Sean





Table description:

                             Table "public.hit"
    Column    |     Type     |                  Modifiers
-------------+--------------
+----------------------------------------------
  hit_id      | integer      | not null default
nextval('hit_pk_seq'::text)
  analysis_id | integer      |
  matches     | integer      |
  mismatch    | integer      |
  repmatch    | integer      |
  n           | integer      |
  q_gap_count | integer      |
  q_gap_bases | integer      |
  t_gap_count | integer      |
  t_gap_bases | integer      |
  strand      | character(1) |
  oligo_id    | integer      |
  qsize       | integer      |
  qstart      | integer      |
  qend        | integer      |
  target_id   | text         |
  tsize       | integer      |
  tstart      | integer      |
  tend        | integer      |
  block_count | integer      |
  block_sizes | text         |
  qstarts     | text         |
  tstarts     | text         |
Indexes:
     "hit_pkey" primary key, btree (hit_id)
     "analysis_id" btree (analysis_id)
     "hit2" btree (oligo_id)
     "hit_analysis_id" btree (analysis_id)
     "oligo_id" btree (oligo_id, analysis_id, target_id, tstart, tend)



Re: Aggregate question

From
Oliver Elphick
Date:
On Fri, 2004-08-27 at 11:54, Sean Davis wrote:
> I'm sorry for the simple question, but....  I have a table (description
> given below) that I want to do something like:
>
> select oligo,target_id,max(2*matches-mismatch) as "score" from hit
> natural join oligo where oligo like 'H200000%' group by
> oligo,target_id;

...
> But, what I actually want is only those target_ids that reach the max
> score like:

The way to restrict the GROUP BY output is with a HAVING clause.  I
think it should go something like this:

select oligo,target_id,max(2*matches-mismatch) as "score" from hit
natural join oligo where oligo like 'H200000%' group by oligo,target_id
HAVING max(2*matches-mismatch) = (SELECT max(2*matches-mismatch) from
hit);

I don't think you can use score as a column name in the having clause,
which is why I have repeated the original definition of that column.

--
Oliver Elphick                                          olly@lfix.co.uk
Isle of Wight                              http://www.lfix.co.uk/oliver
GPG: 1024D/A54310EA  92C8 39E7 280E 3631 3F0E  1EC0 5664 7A2F A543 10EA
                 ========================================
     "Preach the word; be instant in season, out of season;
      reprove, rebuke, exhort with all longsuffering and
      doctrine."          II Timothy 4:2