Re: Non-Unique intems - Mailing list pgsql-novice

From Sean Davis
Subject Re: Non-Unique intems
Date
Msg-id 4429E8E6.1090508@mail.nih.gov
Whole thread Raw
In response to Re: Non-Unique intems  (Srinivas Iyyer <srini_iyyer_bio@yahoo.com>)
Responses Re: Non-Unique intems  ("Daniel T. Staal" <DStaal@usa.net>)
List pgsql-novice
Srinivas Iyyer wrote:
> Hi Sean and group,
>  thank you for your help. It worked.
> However, I guess I stepped on a land mine of unique
> and non-unique items.
>
> Here is the problem:
> (Example data) I have table A:
>
> seq_id   seq_name
> 123        ITAM3
> 234        ITAR
>
>
> Table B:
>
> spot_id  seq_id   image_name
> --------------------------------------------
> 849343    123       IMAGE: 12335
> 1348238   234       IMAGE: 12335
>
>
>
>
> Table C:
>
> exp_id |  spot_id  | spot_value
> -------|-----------|-----------
>
> Data to insert into Table C
> IMAGE: 12335   98.03344
>
>
>
>
>
>
> Here the log of query:
>
> arraydb=# SELECT spotanno_id from spotanno
> arraydb-# where spotanno_imageid = 'IMAGE:755402';
>  spotanno_id
> -------------
>       849343
>      1348238
> (2 rows)
>
> arraydb=# select * from spotanno where spotanno_id =
> 849343;
>  spotanno_id | seq_id | spotanno_imageid
> -------------+--------+------------------
>       849343 |  75343 | IMAGE:755402
> (1 row)
>
> arraydb=# select * from spotanno where spotanno_id =
> 1348238;
>  spotanno_id | seq_id | spotanno_imageid
> -------------+--------+------------------
>      1348238 |  50475 | IMAGE:755402
> (1 row)
>
> arraydb=# select * from seqdump where seq_id = 50475;
>  seq_id |  seq_acc  | seq_name
> --------+-----------+----------
>   50475 | NM_005501 | ITGA3
> (1 row)
>
> arraydb=# select * from seqdump where seq_id = 75343;
>  seq_id |  seq_acc  | seq_name
> --------+-----------+----------
>   75343 | NM_002204 | ITGA3
> (1 row)
>
>
> An instance of row of the data file that to be
> uploaded:
>
>
> IMAGE:755402    0.299781845119261
> 12.3638881597060
>
>
>
> The question:
> when I have a non-unique item (viz. IMAGE:755402 )
> what is the approach one should generally take.
>
> Do you have any suggestions/solution. Please help me.
>
> Thanks again.
>
> -sri

Sri,

Unfortunately, the biological data that you are working with has
one-to-many and many-to-many relationships.  While one would like to
believe that there should not be such relationships, there are.
Therefore, you need to store the data in a manner that respects those
manifold relationships.  In other words, store the data in a table with
whatever is the primary key (in this case, it looks like an IMAGE ID)
and store the annotation separately, allowing for a one-to-many
relationship between IMAGE ID and gene.  There is no way around this and
to try to eliminate these "non-unique" situations in this particular
case won't be possible; instead, you have to understand where the data
are coming from and design your database to match, not the other way
around.

Sean

pgsql-novice by date:

Previous
From: Michael Talbot-Wilson
Date:
Subject: Re: plpgsql questions
Next
From: Greg Sabino Mullane
Date:
Subject: Re: Bytea and perl