On Thu, 21 Nov 2002, Josh Berkus wrote:
> Doing several large updates in a single transaction can lower performance if
> the number of updates is sufficient to affect index usability and a VACUUM is
> really needed between them. For example, a series of large data
> transformation statements on a single table or set of related tables should
> have VACCUUM statements between them, thus preventing you from putting them
> in a single transaction.
>
> Example, the series:
> 1. INSERT 10,000 ROWS INTO table_a;
> 2. UPDATE 100,000 ROWS IN table_a WHERE table_b;
> 3. UPDATE 100,000 ROWS IN table_c WHERE table_a;
>
> WIll almost certainly need a VACUUM or even VACUUM FULL table_a after 2),
> requiring you to split the update series into 2 transactions. Otherwise, the
> "where table_a" condition in step 3) will be extremely slow.
Very good point. One that points out the different mind set one needs
when dealing with pgsql.
> > > It can be dangerous though ... in the event of a power outage, for
> > > example, your database could be corrupted and difficult to recover. So
> > > ... "at your own risk".
> >
> > No, the database will not be corrupted, at least not in my experience.
> > however, you MAY lose data from transactions that you thought were
> > committed. I think Tom posted something about this a few days back.
>
> Hmmm ... have you done this? I'd like the performance gain, but I don't want
> to risk my data integrity. I've seen some awful things in databases (such as
> duplicate primary keys) from yanking a power cord repeatedly.
I have, with killall -9 postmaster, on several occasions during testing
under heavy parallel load. I've never had 7.2.x fail because of this.