Re: update 600000 rows - Mailing list pgsql-performance

From	andrew@pillette.com
Subject	Re: update 600000 rows
Date	December 16, 2007 01:30:41
Msg-id	200712160521.lBG5Llf25219@pillette.com
In response to	update 600000 rows (okparanoid@free.fr)
Responses	Re: update 600000 rows
List	pgsql-performance

Loïc Marteau <okparanoid@free.fr> wrote ..
> Steve Crawford wrote:
> > If this
> > is correct, I'd first investigate simply loading the csv data into a
> > temporary table, creating appropriate indexes, and running a single
> > query to update your other table.

My experience is that this is MUCH faster. My predecessor in my current position was doing an update from a CSV file
line by line with Perl. That is one reason he is my predecessor. Performance did not justify continuing his contract.
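
A minimal sketch of the staging-table approach quoted above, offered as an illustration rather than as the code anyone in this thread actually ran. It uses Python's built-in sqlite3 module as a stand-in so that it runs anywhere; on PostgreSQL the load step would normally be COPY and the update would normally be written as UPDATE ... FROM. The table names (items, staging), the columns (id, price), and the sample CSV are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical CSV feed: ids 1 and 2 exist in production, id 3 does not.
CSV_DATA = "id,price\n1,10.5\n2,20.0\n3,30.25\n"


def bulk_update(conn, csv_text):
    """Load the CSV into a temp table, then update production in one statement."""
    cur = conn.cursor()
    # The primary key doubles as the index used by the join in the UPDATE.
    cur.execute("CREATE TEMP TABLE staging (id INTEGER PRIMARY KEY, price REAL)")
    rows = [
        (int(r["id"]), float(r["price"]))
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    cur.executemany("INSERT INTO staging (id, price) VALUES (?, ?)", rows)
    # One set-based UPDATE instead of one UPDATE per CSV line.
    cur.execute(
        """
        UPDATE items
           SET price = (SELECT s.price FROM staging s WHERE s.id = items.id)
         WHERE EXISTS (SELECT 1 FROM staging s WHERE s.id = items.id)
        """
    )
    updated = cur.rowcount
    cur.execute("DROP TABLE staging")
    conn.commit()
    return updated


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, price REAL)")
conn.executemany("INSERT INTO items VALUES (?, ?)", [(1, 1.0), (2, 2.0)])
updated_count = bulk_update(conn, CSV_DATA)
```

The point of the design is that the database sees a single UPDATE joined against an indexed table, rather than 600000 separate statements each paying its own round trip and planning cost.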

> I can try this. The problem is that I have to do an insert if the
> update doesn't affect any rows (the rows don't exist yet). The number
> of rows affected by the insert is small compared to the number of updated
> rows (it was 0 when I tested my script). I can do it with a temporary table:
> update all the possible rows, and then insert the rows that are in the
> temporary table and not in the production table with a 'not in'
> statement. Is this a correct way?

That's what I did at first, but later I found better performance with a TRIGGER on the permanent table that deletes the
target of an UPDATE, if any, before the UPDATE. That's what PG does anyway, and now I can do the entire UPDATE in one
command.
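
For reference, the two-statement variant described in the quoted question (update every matching row, then insert the staging rows that production lacks) might look roughly like the sketch below. It does not show the trigger variant, whose exact definition is not given in this message. As an assumption for portability it runs on Python's built-in sqlite3 module rather than PostgreSQL, and it uses NOT EXISTS in place of the 'not in' from the question; the two give the same result when the key column is never NULL. The table and column names (items, staging, id, price) are hypothetical.

```python
import sqlite3


def upsert_from_staging(conn):
    """Update rows present in both tables, then insert rows only in staging."""
    cur = conn.cursor()
    # Step 1: update every production row that has a match in staging.
    cur.execute(
        """
        UPDATE items
           SET price = (SELECT s.price FROM staging s WHERE s.id = items.id)
         WHERE EXISTS (SELECT 1 FROM staging s WHERE s.id = items.id)
        """
    )
    updated = cur.rowcount
    # Step 2: insert the staging rows that production does not have yet.
    cur.execute(
        """
        INSERT INTO items (id, price)
        SELECT s.id, s.price
          FROM staging s
         WHERE NOT EXISTS (SELECT 1 FROM items i WHERE i.id = s.id)
        """
    )
    inserted = cur.rowcount
    conn.commit()
    return updated, inserted


conn2 = sqlite3.connect(":memory:")
conn2.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, price REAL)")
conn2.execute("CREATE TABLE staging (id INTEGER PRIMARY KEY, price REAL)")
conn2.executemany("INSERT INTO items VALUES (?, ?)", [(1, 1.0), (2, 2.0)])
conn2.executemany(
    "INSERT INTO staging VALUES (?, ?)", [(1, 10.5), (2, 20.0), (3, 30.25)]
)
counts = upsert_from_staging(conn2)
```

Running the update first keeps the insert from touching rows that were just updated, since by then the only staging rows without a match are the genuinely new ones.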
