I found several post about INSERT/UPDATE performance in this group, but actually it was not really what I am searching an answer for...
I have a simple reference table WORD_COUNTS that contains the count of words that appear in a word array storage in another table.
CREATE TABLE WORD_COUNTS ( word text NOT NULL, count integer, CONSTRAINT PK_WORD_COUNTS PRIMARY KEY (word) ) WITHOUT OIDS;
Is there any reason why count is not not null? (That should siplify your code by removing the coalesce)
insert is more efficient than update because update is always a delete followed by an insert.
Oh and group by is nearly always quicker than distinct and can always? be rewritten as such. I'm not 100% sure why its different but it is.
Peter.
I have some PL/pgSQL code in a stored procedure like
FOR r IN select id, array_of_words from word_storage LOOP begin -- insert the missing words insert into WORD_COUNTS ( word, count ) ( select word, 0 from ( select distinct (r.array_of_words) [s.index] as d_word from generate_series(1, array_upper( r.array_of_words, 1 ) ) as s(index) ) as distinct_words where word not in ( select d_word from WORD_COUNTS ) ); -- update the counts update WORD_COUNTS set count = COALESCE( count, 0 ) + 1 where word in ( select distinct ( r.array_of_words)[s.index] as word from generate_series(1, array_upper( r.array_of_words, 1) ) as s(index) ); exception when others then error_count := error_count + 1; end; record_count := record_count + 1; END LOOP;
This code runs extremely slowly. It takes about 10 minutes to process 10000 records and the word storage has more then 2 million records to be processed.
Does anybody have a know-how about populating of such a reference tables and what can be optimized in this situation.
Maybe the generate_series() procedure to unnest the array is the place where I loose the performance?
Are the set update/inserts more effitient, then single inserts/updates run in smaller loops?
Thanks for your help,
Valentine Gogichashvili
---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly