Re: What's the fastest way to do this? - Mailing list pgsql-general

From Andrew Gould
Subject Re: What's the fastest way to do this?
Date
Msg-id 20011108221345.92109.qmail@web13409.mail.yahoo.com
Whole thread Raw
In response to What's the fastest way to do this?  (Orion <o2@trustcommerce.com>)
List pgsql-general
Are you updating directly from the flat file?  Or are
you copying the new data into an indexed table that is
emptied and vacuumed after daily updates, and
updating/inserting from there?

Andrew Gould

--- Orion <o2@trustcommerce.com> wrote:
>
> I have several really big tables that have columns
> uniquely identified by
> single or multiple rows.  [ I have about 25 tables,
> 10k to 500k rows
> per table ]
>
> Each day I get a flat file of updates.  I have no
> way of knowing which
> lines in the file are new records and which are
> updates for existing
> records.
>
> I need a way to insert the new ones and update the
> old ones.   I have
> a couple of ideas but none of them seem fast enough
> ( I will soon
> be getting updates faster than I can feed them into
> the database ).
>
> I am running postgres 7.1.3-1PDGD.
>
> Here are my ideas...
>
>
---------------------------------------------------------
>
> 1)
>
> select count(*) from table where id = X;
> (if the count == 1)
> update table set a = 1, b = 2, c = 3 where id = X;
>
> 2)
>
> insert into table (id, a, b, c) values (X, 1, 2, 3);
> (if there is an exception b/c of duplicate keys)
> update table set a = 1, b = 2, c = 3 where id = X;
>
> 3)
>
> delete from table where id = X;
> (if the record is new nothing happens)
> insert into table (id, a, b, c) values (X, 1, 2, 3);
>
>
----------------------------------------------------------
>
> right now I am doing (3) but it is very slow - I
> am getting about 8 records/second through and
> yes my id column is indexed.  The other problem
> with (3) is it requires more vacuuming.
>
> I cant imagine that (1) would be any faster.
>
> It would seem that (2) would be the fastest but some
> of my tables key on multiple columns
> (where id1 = X and id2 = Y) and thus cant use a
> primary
> key to raise an exception.
>
> I cant imagine I am the first person to try this...
> What's the standard/best/fastest way to do this.
>
> Thanks in advance for any advice!
>
>         Orion
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster


__________________________________________________
Do You Yahoo!?
Find a job, post your resume.
http://careers.yahoo.com

pgsql-general by date:

Previous
From: "Peter Darley"
Date:
Subject: Re: pg_dump and DEFAULT column values
Next
From: Stephan Szabo
Date:
Subject: Re: Performance: Sum()