Thread: complex queries

complex queries

From
Kumar S
Date:
Hi Sean and group members,

with all your help i could upload a part of my dataset
successfully into my database.

I am queying my database. currently I am querying two
tables for data.

There are two tables processed_data and
sequence_annotation.


My first query (which is successful):

db=> select proc_exprs, proc_pval from processed_data,
sequence_annotation
where processed_data.seq_anno_id =
sequence_annotation.seq_anno_id
and sequence_annotation.seq_gene_symbol = 'FN1';
 proc_exprs | proc_pval
------------+-----------
    6.72867 |  0.141216
    5.01667 |  0.212055
    4.94133 |  0.216501
   0.214667 |  0.624944
  0.0846667 |  0.525513
    7.19867 |  0.128558
    11.7807 |  0.053066
    11.5487 | 0.0545418
    7.57867 | 0.0984488
       0.33 |  0.678788
  0.0546667 |  0.483358
    11.7227 | 0.0534283
      4.806 |  0.221026
    2.86333 |  0.434024
    3.60933 |  0.324803
       0.43 |  0.742388
      0.018 |  0.427739
      3.222 |  0.375615
(18 rows)


Query 2: A little complicated one (successful):

db=> select proc_exprs, proc_pval, seq_anno_id from
processed_data
 where proc_pval < 0.004;
 proc_exprs | proc_pval  | seq_anno_id
------------+------------+-------------
    107.803 | 0.00382485 |       50237
    106.607 | 0.00387695 |       51903
    158.191 | 0.00240939 |       69829
    170.103 | 0.00220903 |       70008
    137.591 | 0.00284852 |       70009
    217.086 | 0.00165253 |       70214
    186.021 | 0.00198545 |       70223
    213.575 | 0.00168479 |       70558
    121.545 | 0.00330776 |       71717
    221.715 | 0.00161171 |       71757
    248.226 |  0.0014101 |       75187
    111.929 | 0.00365456 |       80492
      138.5 |  0.0022871 |       70008
    90.9887 | 0.00380519 |       70009

Query 3: A complicated one( unsuccessful):

bob=> select proc_exprs, proc_pval, seq_gene_symbol
from processed_data, sequence_annotation
where proc_pval < 0.05 AND
processed_data.seq_anno_id =
sequence_annotation.seq_gene_symbol;


 proc_exprs | proc_pval | seq_gene_symbol
------------+-----------+-----------------
(0 rows)


seq_gene_symbol is in sequence_annotation table and
its primary key is value of seq_anno_id.


Why I am getting 0 row. Whats wrong with my query.

Could you help me please.

Thank you.

K


My tables:
Processed_data table:
   Column    |          Type          |
              Modifiers
-------------+------------------------+---------------------------------------------------------------------
 proc_id     | integer                | not null
default
nextval('public.processed_data_proc_id_seq'::text)
 exp_id      | integer                |
 seq_anno_id | integer                |
 cel_id      | integer                |
 proc_symbol | character varying(100) |
 proc_exprs  | real                   |
 proc_pval   | real                   |
Indexes:
    "processed_data_pkey" primary key, btree (proc_id)
Foreign-key constraints:
    "$2" FOREIGN KEY (exp_id) REFERENCES
experiment(exp_id)
    "$3" FOREIGN KEY (seq_anno_id) REFERENCES
sequence_annotation(seq_anno_id)
    "$4" FOREIGN KEY (cel_id) REFERENCES cel(cel_id)



Sequence_annotation table:
       Column         |           Type           |
                             Modifiers

-----------------------+--------------------------+------------------------------------------------------------------------------
 seq_anno_id           | integer                  |
not null default
nextval('public.sequence_annotation_seq_anno_id_seq'::text)
 seq_probeset_id       | character varying(15000) |
 seq_source            | character varying(15000) |
 seq_transcript_id     | character varying(15000) |
 seq_target_desc       | character varying(15000) |
 seq_rep_pub_id        | character varying(15000) |
 seq_unigene_id        | character varying(15000) |
 seq_alignment         | character varying(15000) |
 seq_gene_title        | character varying(15000) |
 seq_gene_symbol       | character varying(15000) |
 seq_chr_location      | character varying(15000) |
 seq_ensembl           | character varying(15000) |
 seq_locus_link        | character varying(15000) |
 seq_swissport         | character varying(15000) |
 seq_ec                | character varying(15000) |
 seq_omim              | character varying(15000) |
 seq_refseq_prot_id    | character varying(15000) |
 seq_refseq_trans_id   | character varying(15000) |
 seq_flybase           | character varying(15000) |
 seq_agi               | character varying(15000) |
 seq_wormbase          | character varying(15000) |
 seq_mgi_name          | character varying(15000) |
 seq_rgd_name          | character varying(15000) |
 seq_sgd_acc           | character varying(15000) |
 seq_go_bp             | character varying(15000) |
 seq_go_cc             | character varying(15000) |
 seq_go_mf             | character varying(15000) |
 seq_pathway           | character varying(15000) |
 seq_prot_family       | character varying(15000) |
 seq_prot_domain       | character varying(15000) |
 seq_interpro_id       | character varying(15000) |
 seq_tm                | character varying(15000) |
 seq_qtl               | character varying(15000) |
 seq_ann_trasc_cluster | character varying(15000) |
 seq_transc_assign     | character varying(15000) |
 seq_anno_notes        | character varying(15000) |
 genechip_id           | integer                  |
Indexes:
    "sequence_annotation_pkey" primary key, btree
(seq_anno_id)
Foreign-key constraints:
    "$1" FOREIGN KEY (genechip_id) REFERENCES
genechip(genechip_id)



__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

Re: complex queries

From
Sean Davis
Date:
On Mar 11, 2005, at 11:45 AM, Kumar S wrote:

> bob=> select proc_exprs, proc_pval, seq_gene_symbol
> from processed_data, sequence_annotation
> where proc_pval < 0.05 AND
> processed_data.seq_anno_id =
> sequence_annotation.seq_gene_symbol;
>
>
>  proc_exprs | proc_pval | seq_gene_symbol
> ------------+-----------+-----------------
> (0 rows)
>
>
> seq_gene_symbol is in sequence_annotation table and
> its primary key is value of seq_anno_id.
>

You probably need to spend some time reading about queries in an SQL
book (many of which are online), but in this particular case, it
doesn't look like seq_gene_symbol is going to equal seq_anno_id.  It
looks like you want:

processed_data.seq_anno_id=sequence_annotation.seq_anno_id

Off-topic, but have you thought about looking at using BASE or CHADO to
keep your microarray data?

Sean


Re: complex queries

From
"Todd Lewis"
Date:
It's very hard to tell exactly which table your columns are coming
from in your SQL. It makes it cleaner and more readable if you alias
your table.

select
 pd.proc_exprs
, pd.proc_pval
, sa.seq_gene_symbol

from
processed_data pd   -- pd is and alias name for processed_data
, sequence_annotation sa

where
pd.seq_anno_id = sa.seq_gene_symbol
AND pd.proc_pval < 0.05;

don't depend upon the query to recognize that the column you are
stating is unique between all the tables selected. Plus using a short
alias that makes sense makes your query more easily readable.

As you start making more complex queries these tips will help.
1. use alias table names in your query.
2. when you have multple tables define the relations at the beginning
of the where clause. Then define your constraint.
3. Try do define your relationships starting with the larger tables.
This speeds up processing of the query.

You can take some real performance hits when your query uses 12
tables.

> Hi Sean and group members,
>
> with all your help i could upload a part of my dataset
> successfully into my database.
>
> I am queying my database. currently I am querying two
> tables for data.
>
> There are two tables processed_data and
> sequence_annotation.
>
>
> My first query (which is successful):
>
> db=> select proc_exprs, proc_pval from processed_data,
> sequence_annotation
> where processed_data.seq_anno_id =
> sequence_annotation.seq_anno_id
> and sequence_annotation.seq_gene_symbol = 'FN1';
>  proc_exprs | proc_pval
> ------------+-----------
>     6.72867 |  0.141216
>     5.01667 |  0.212055
>     4.94133 |  0.216501
>    0.214667 |  0.624944
>   0.0846667 |  0.525513
>     7.19867 |  0.128558
>     11.7807 |  0.053066
>     11.5487 | 0.0545418
>     7.57867 | 0.0984488
>        0.33 |  0.678788
>   0.0546667 |  0.483358
>     11.7227 | 0.0534283
>       4.806 |  0.221026
>     2.86333 |  0.434024
>     3.60933 |  0.324803
>        0.43 |  0.742388
>       0.018 |  0.427739
>       3.222 |  0.375615
> (18 rows)
>
>
> Query 2: A little complicated one (successful):
>
> db=> select proc_exprs, proc_pval, seq_anno_id from
> processed_data
>  where proc_pval < 0.004;
>  proc_exprs | proc_pval  | seq_anno_id
> ------------+------------+-------------
>     107.803 | 0.00382485 |       50237
>     106.607 | 0.00387695 |       51903
>     158.191 | 0.00240939 |       69829
>     170.103 | 0.00220903 |       70008
>     137.591 | 0.00284852 |       70009
>     217.086 | 0.00165253 |       70214
>     186.021 | 0.00198545 |       70223
>     213.575 | 0.00168479 |       70558
>     121.545 | 0.00330776 |       71717
>     221.715 | 0.00161171 |       71757
>     248.226 |  0.0014101 |       75187
>     111.929 | 0.00365456 |       80492
>       138.5 |  0.0022871 |       70008
>     90.9887 | 0.00380519 |       70009
>
> Query 3: A complicated one( unsuccessful):
>
> bob=> select proc_exprs, proc_pval, seq_gene_symbol
> from processed_data, sequence_annotation
> where proc_pval < 0.05 AND
> processed_data.seq_anno_id =
> sequence_annotation.seq_gene_symbol;
>
>
>  proc_exprs | proc_pval | seq_gene_symbol
> ------------+-----------+-----------------
> (0 rows)
>
>
> seq_gene_symbol is in sequence_annotation table and
> its primary key is value of seq_anno_id.
>
>
> Why I am getting 0 row. Whats wrong with my query.
>
> Could you help me please.
>
> Thank you.
>
> K
>
>
> My tables:
> Processed_data table:
>    Column    |          Type          |
>               Modifiers
> -------------+------------------------+---------------------------------------------------------------------
>  proc_id     | integer                | not null
> default
> nextval('public.processed_data_proc_id_seq'::text)
>  exp_id      | integer                |
>  seq_anno_id | integer                |
>  cel_id      | integer                |
>  proc_symbol | character varying(100) |
>  proc_exprs  | real                   |
>  proc_pval   | real                   |
> Indexes:
>     "processed_data_pkey" primary key, btree (proc_id)
> Foreign-key constraints:
>     "$2" FOREIGN KEY (exp_id) REFERENCES
> experiment(exp_id)
>     "$3" FOREIGN KEY (seq_anno_id) REFERENCES
> sequence_annotation(seq_anno_id)
>     "$4" FOREIGN KEY (cel_id) REFERENCES cel(cel_id)
>
>
>
> Sequence_annotation table:
>        Column         |           Type           |
>                              Modifiers
>
-----------------------+--------------------------+------------------------------------------------------------------------------
>  seq_anno_id           | integer                  |
> not null default
> nextval('public.sequence_annotation_seq_anno_id_seq'::text)
>  seq_probeset_id       | character varying(15000) |
>  seq_source            | character varying(15000) |
>  seq_transcript_id     | character varying(15000) |
>  seq_target_desc       | character varying(15000) |
>  seq_rep_pub_id        | character varying(15000) |
>  seq_unigene_id        | character varying(15000) |
>  seq_alignment         | character varying(15000) |
>  seq_gene_title        | character varying(15000) |
>  seq_gene_symbol       | character varying(15000) |
>  seq_chr_location      | character varying(15000) |
>  seq_ensembl           | character varying(15000) |
>  seq_locus_link        | character varying(15000) |
>  seq_swissport         | character varying(15000) |
>  seq_ec                | character varying(15000) |
>  seq_omim              | character varying(15000) |
>  seq_refseq_prot_id    | character varying(15000) |
>  seq_refseq_trans_id   | character varying(15000) |
>  seq_flybase           | character varying(15000) |
>  seq_agi               | character varying(15000) |
>  seq_wormbase          | character varying(15000) |
>  seq_mgi_name          | character varying(15000) |
>  seq_rgd_name          | character varying(15000) |
>  seq_sgd_acc           | character varying(15000) |
>  seq_go_bp             | character varying(15000) |
>  seq_go_cc             | character varying(15000) |
>  seq_go_mf             | character varying(15000) |
>  seq_pathway           | character varying(15000) |
>  seq_prot_family       | character varying(15000) |
>  seq_prot_domain       | character varying(15000) |
>  seq_interpro_id       | character varying(15000) |
>  seq_tm                | character varying(15000) |
>  seq_qtl               | character varying(15000) |
>  seq_ann_trasc_cluster | character varying(15000) |
>  seq_transc_assign     | character varying(15000) |
>  seq_anno_notes        | character varying(15000) |
>  genechip_id           | integer                  |
> Indexes:
>     "sequence_annotation_pkey" primary key, btree
> (seq_anno_id)
> Foreign-key constraints:
>     "$1" FOREIGN KEY (genechip_id) REFERENCES
> genechip(genechip_id)
>
>
>
> __________________________________________________
> Do You Yahoo!?
> Tired of spam?  Yahoo! Mail has the best spam protection around
> http://mail.yahoo.com
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)