Thread: Problems Formulating a SELECT

Problems Formulating a SELECT

From
Charles Hauser
Date:
I am trying to formulate a SELECT and could use some suggestions.

From the TABLES below I would like to find ALL contigs which contain the
same clones except that one (or more) has read='y' and the other(s) have
read='x'.  
Or stated another way: find all contigs composed of (at least) both (x
and y) reads from the same clone. 

For example:
In the data below, the contig '20020630.488.1'(contig_id:13805) is
composed of 2 clones (clone_id='12018' and '12019') which are
894027G09.x and 894027G09.y, respectively.



Example:

TABLE clone     'A'    'B'

clone_id     12018    12019    
project     894    894
plate         27    27
row         G    G    
col         9    9
read         x    y

Table clone_contig:
clone_id    contig_id12018        1380512019        13805

TABLE contig:

contig_id     13805
assembly    20020630
ace        488
ver         1






CREATE TABLE clone (
clone_id SERIAL PRIMARY KEY,
project INTEGER REFERENCES library(project) NOT NULL,
plate INTEGER NOT NULL,
row CHAR(1) NOT NULL,
col INTEGER NOT NULL,
read CHAR(1) NOT NULL,
ver INTEGER NOT NULL,
seq TEXT NOT NULL,
L INTEGER NOT NULL,
Qvals TEXT NOT NULL,
TL INTEGER NOT NULL,
MQAT INTEGER NOT NULL,          
Qstart INTEGER NOT NULL,
Qend INTEGER NOT NULL,
gb_id INTEGER REFERENCES gb(gb_id) NULL,
unigene BOOLEAN NULL,                
UNIQUE (project,plate,row,col,read,ver)
);


CREATE TABLE contig (
contig_id SERIAL PRIMARY KEY,
assembly DATE NOT NULL,
ace INTEGER NOT NULL,
ver INTEGER NOT NULL,
length INTEGER NOT NULL,
seq TEXT NOT NULL,
UNIQUE (assembly,ace,ver)
);


CREATE TABLE clone_contig(    
clone_id INTEGER REFERENCES clone(clone_id) ON DELETE CASCADE,
contig_id INTEGER REFERENCES contig(contig_id) ON DELETE CASCADE,
UNIQUE(clone_id,contig_id)
);



regards,

Charles




Re: Problems Formulating a SELECT

From
Richard Huxton
Date:
On Monday 07 Oct 2002 8:24 pm, Charles Hauser wrote:
> I am trying to formulate a SELECT and could use some suggestions.
>
> From the TABLES below I would like to find ALL contigs which contain the
> same clones except that one (or more) has read='y' and the other(s) have
> read='x'.
> Or stated another way: find all contigs composed of (at least) both (x
> and y) reads from the same clone.
>
> For example:
>
>     In the data below, the contig '20020630.488.1'(contig_id:13805) is
> composed of 2 clones (clone_id='12018' and '12019') which are
> 894027G09.x and 894027G09.y, respectively.
>
> TABLE clone     'A'    'B'
>
> clone_id     12018    12019
> project     894    894
> plate         27    27
> row         G    G
> col         9    9
> read         x    y
>
> Table clone_contig:
>
>     clone_id    contig_id
>     12018        13805
>     12019        13805

How about something like:

CREATE VIEW contig_x_vw AS SELECT contig_id FROM clone_config, clone WHERE
clone_config.clone_id = clone.clone_id AND read='x';
CREATE VIEW contig_y_vw AS [same but for y]

SELECT config_id FROM contig_x_vw, contig_y_vw WHERE contig_x_vw.clone_id =
contig_y_vw.clone.id;

You don't need the views, but they make the example easier.

- Richard Huxton


Re: Problems Formulating a SELECT

From
Charles Hauser
Date:
Richard,

Thanks, a followup.

I believe this will not work (novice, so take w/ grain of salt).

I tried the following:

chlamy_est=> CREATE VIEW contig_x_vw AS SELECT contig_id FROM clone_contig, clone WHERE 
chlamy_est-> clone_contig.clone_id = clone.clone_id AND read='x';
CREATE
chlamy_est=> CREATE VIEW contig_y_vw AS SELECT contig_id FROM clone_contig, clone WHERE 
chlamy_est-> clone_contig.clone_id = clone.clone_id AND read='y';
CREATE
chlamy_est=> SELECT contig_id FROM contig_x_vw, contig_y_vw WHERE contig_x_vw.clone_id = 
chlamy_est-> contig_y_vw.clone.id;
ERROR:  Column reference "contig_id" is ambiguous

If I understand the logic, you SELECT:all contig_ids where where clone.read = 'x' (VIEW contig_x_vw)all contig_ids
wherewhere clone.read = 'y' (VIEW contig_y_vw)find isect of these two wehere their clone_ids are same
 


However, their clone_ids will never be the same as in the example.
cloneA.project=cloneB.project 
cloneA.plate=cloneB.plate 
cloneA.row=cloneB.row 
cloneA.col=cloneB.col 


TABLE clone      'A'    'B'

clone_id        12018   12019   
project         894     894
plate           27      27
row             G       G       
col             9       9
read            x       y


??

Charles

> >
> >     In the data below, the contig '20020630.488.1'(contig_id:13805) is
> > composed of 2 clones (clone_id='12018' and '12019') which are
> > 894027G09.x and 894027G09.y, respectively.
> >
> > TABLE clone     'A'    'B'
> >
> > clone_id     12018    12019
> > project     894    894
> > plate         27    27
> > row         G    G
> > col         9    9
> > read         x    y
> >
> > Table clone_contig:
> >
> >     clone_id    contig_id
> >     12018        13805
> >     12019        13805
> 
> How about something like:
> 
> CREATE VIEW contig_x_vw AS SELECT contig_id FROM clone_config, clone WHERE 
> clone_config.clone_id = clone.clone_id AND read='x';
> CREATE VIEW contig_y_vw AS [same but for y]
> 
> SELECT config_id FROM contig_x_vw, contig_y_vw WHERE contig_x_vw.clone_id = 
> contig_y_vw.clone.id;
> 
> You don't need the views, but they make the example easier.
> 
> - Richard Huxton
> 
> 




Re: Problems Formulating a SELECT

From
Richard Huxton
Date:
On Tuesday 08 Oct 2002 4:04 pm, Charles Hauser wrote:
> Richard,
>
> Thanks, a followup.
>
> I believe this will not work (novice, so take w/ grain of salt).
>
> I tried the following:
>
> chlamy_est=> CREATE VIEW contig_x_vw AS SELECT contig_id FROM clone_contig,
> clone WHERE chlamy_est-> clone_contig.clone_id = clone.clone_id AND
> read='x';
> CREATE
> chlamy_est=> CREATE VIEW contig_y_vw AS SELECT contig_id FROM clone_contig,
> clone WHERE chlamy_est-> clone_contig.clone_id = clone.clone_id AND
> read='y';
> CREATE
> chlamy_est=> SELECT contig_id FROM contig_x_vw, contig_y_vw WHERE
> contig_x_vw.clone_id = chlamy_est-> contig_y_vw.clone.id;
> ERROR:  Column reference "contig_id" is ambiguous

My bad, I didn't test it - for the last one you'll want

SELECT contig_x_vw.contig_id FROM contig_x_vw ...

It doesn't matter which id you use (since you want the value in each) but you
will need to tell PG which one you want.

> If I understand the logic, you SELECT:
>     all contig_ids where where clone.read = 'x' (VIEW contig_x_vw)
>     all contig_ids where where clone.read = 'y' (VIEW contig_y_vw)
>     find isect of these two wehere their clone_ids are same
>
>
> However, their clone_ids will never be the same as in the example.

It's checking the "contig_id"s are the same (from clone_contig) - if that
isn't what's wanted you can check the columns below match. Otherwise, you're
quite right, that's what it's doing.

> cloneA.project=cloneB.project
> cloneA.plate=cloneB.plate
> cloneA.row=cloneB.row
> cloneA.col=cloneB.col
>
>
> TABLE clone      'A'    'B'
>
> clone_id        12018   12019
> project         894     894
> plate           27      27
> row             G       G
> col             9       9
> read            x       y

Does that get you any further? If not, I'm back at my desk tomorrow 9am GMT

- Richard Huxton


Re: Problems Formulating a SELECT

From
Charles Hauser
Date:
Richard,

Thanks again.


> My bad, I didn't test it - for the last one you'll want
> 
> SELECT contig_x_vw.contig_id FROM contig_x_vw ...
> 
> It doesn't matter which id you use (since you want the value in each) but you 
> will need to tell PG which one you want.


No problem, after I sent the email I caught the source of the ' "contig_id" is ambiguous' ERROR.

> 
> It's checking the "contig_id"s are the same (from clone_contig) - ....


A contig_id match is what I wanted, but also a requirement that
clone.project,plate,col & row also match for cloneA and cloneB.  I added
what I think gives me the correct match params (see below) and a quick
survey of the results looks to be ok. 

Is there a method to remove duplicate results?  For instance the query
below in part yields :

chlamy_est-> ;contig_id 
-----------    27170    27173    27173    27179    27179    27179    27179    27179

The repetitive occurrence of the same contig_id (27179) occurs as a
result of multiple pairs of clones matching the criteria for a given
contig_id.  So for contig_id 27179 there are 5 pairs which match:
 cloneA          cloneB    contig_id
894018D05.x1    <->    894018D05.y1    27179    
894080H12.x1    <->    894080H12.y1    27179
894094E04.x2    <->    894094E04.y2    27179
894095H03.x1    <->    894095H03.y2    27179
963037B05.x2    <->    963037B05.y1    27179






CREATE VIEW contig_x_vw AS SELECT contig_id,project,plate,row,col FROM clone_contig, clone WHERE 
clone_contig.clone_id = clone.clone_id AND read='x';

CREATE VIEW contig_y_vw AS SELECT contig_id,project,plate,row,col FROM clone_contig, clone WHERE 
clone_contig.clone_id = clone.clone_id AND read='y';


SELECT contig_x_vw.contig_id FROM contig_x_vw, contig_y_vw 
WHERE contig_x_vw.project = contig_y_vw.project ANDcontig_x_vw.plate = contig_y_vw.plate ANDcontig_x_vw.col =
contig_y_vw.colANDcontig_x_vw.row = contig_y_vw.row ANDcontig_x_vw.contig_id = contig_y_vw.contig_id
 
;

> > cloneA.project=cloneB.project
> > cloneA.plate=cloneB.plate
> > cloneA.row=cloneB.row
> > cloneA.col=cloneB.col
> >
> >
> > TABLE clone      'A'    'B'
> >
> > clone_id        12018   12019
> > project         894     894
> > plate           27      27
> > row             G       G
> > col             9       9
> > read            x       y
> 
> Does that get you any further? If not, I'm back at my desk tomorrow 9am GMT
> 
> - Richard Huxton
> 
> 




Re: Problems Formulating a SELECT

From
Richard Huxton
Date:
On Tuesday 08 Oct 2002 7:19 pm, Charles Hauser wrote:
> Richard,
>
[snip]
> Is there a method to remove duplicate results?  For instance the query
> below in part yields :

SELECT DISTINCT ... is what you're after. I'd do it in the views so the join
has less rows to compare against.

> chlamy_est-> ;
>  contig_id
> -----------
>      27170
>      27173
>      27173
>      27179
>      27179
>      27179
>      27179
>      27179

--  Richard Huxton