Time consuming process ...3 million records please help - Mailing list pgsql-novice

From Srinivas Iyyer
Subject Time consuming process ...3 million records please help
Date
Msg-id 20060328005950.9895.qmail@web38102.mail.mud.yahoo.com
Whole thread Raw
In response to Does a connection support multiple transactions.  (johnf <jfabiani@yolo.com>)
Responses Re: Time consuming process ...3 million records please help  (Sean Davis <sdavis2@mail.nih.gov>)
List pgsql-novice
Dear group,
  I have two tables and a temp table where I uploaded
data using \copy. This temp table has "3,348,107"
lines of data.

I wrote a plpgsql function to read each record in
temp_table, take the firt col. data (temp_refseq_id)
and get corresponding 'seq_id' from table B and insert
into table A.

I started this process 8 hrs back. It has been running
for last 8 hrs and yet it is not finished. the reason
i did this temp table thing was to speedup process by
writing a server side function.  I still did not get
to win over the time issue here. If this is a
continuous problem I will have to search for another
db system since my files from now on are huge and has
over mil records. I am betting a lot of time in this
case.

Could any one help writing a faster function.

thanks
looking forward to hear from people.

Temp_table:

 temp_refseq_id |  temp_imageid  | temp_genbankacc
----------------+----------------+-----------------
 NM_003604      | IMAGE:1099538  | AA594716
 NM_003604      | IMAGE:853047   | AA668250
 NM_001008860   | IMAGE:3640970  | BC011775
 NM_001008860   | IMAGE:3640970  | BE737509
 NM_001008860   | IMAGE:6040319  | BU079001
 NM_001008860   | IMAGE:6040319  | BU078725
 NM_001008860   | IMAGE:3451448  | BC000957
 NM_001008860   | IMAGE:3451448  | BE539334
 NM_001008860   | IMAGE:4794135  | BG708105
 NM_001008860   | IMAGE:5214087  | BI911674

Table A : (I want to upload data from temp to here)

 spota_id | seq_id | spota_imageid | spota_genbacc
----------+--------+---------------+--------
     23   |  54525 | IMAGE:1099538 | AA594716




Table B : This table is seqdump table where seq_id is
a FK in Table B

 seq_id |  seq_acc    | seq_name
------------------------------
  54519 | NM_152918   | EMR2
  54520 | NM_001008860| CGGBP1
  54521 | NM_020040   | TUBB4Q
  54522 | NM_017525   | CDC42BPG
  54523 | NM_032158   | WBSCR20C
  54524 | NM_004047   | ATP6V0B
  54525 | NM_003604   | PLCB3


Function:

CREATE FUNCTION tab_update() RETURNS integer AS '
DECLARE
        referrer_keys RECORD;
BEGIN
        FOR referrer_keys IN SELECT * from
temp_spotanno LOOP
                INSERT INTO spotanno(seq_id,
spotanno_imageid,spotanno_genbankacc) values((SELECT
seq_id from seqdump where seq_acc =
referrer_keys.temp_refseq_id),referrer_keys.temp_imageid,referrer_keys.temp_genbankacc);
        END LOOP;
return 0;

END;
' LANGUAGE plpgsql;


Thanks
Sri

__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

pgsql-novice by date:

Previous
From: johnf
Date:
Subject: Does a connection support multiple transactions.
Next
From: Sean Davis
Date:
Subject: Re: Time consuming process ...3 million records please help