Re: Savepoints in transactions for speed? - Mailing list pgsql-performance

From: Mike Blackwell
Subject: Re: Savepoints in transactions for speed?
Date:
Msg-id: CANPAkgv=bfTetDjJUF2irugq98RqaGiCz1=GXxOoDTXkY-bwFw@mail.gmail.com
In response to: Re: Savepoints in transactions for speed? (Claudio Freire <klaussfreire@gmail.com>)
List: pgsql-performance

On Wed, Nov 28, 2012 at 9:48 PM, Claudio Freire <klaussfreire@gmail.com> wrote:

On Wed, Nov 28, 2012 at 8:28 PM, Jeff Davis <pgsql@j-davis.com> wrote:
>
> The main problem with a long-running delete or update transaction is
> that the dead tuples (deleted tuples or the old version of an updated
> tuple) can't be removed until the transaction finishes. That can cause
> temporary "bloat", but 1.5M records shouldn't be noticeable.

Not really that fast if you have indices (and who doesn't have a PK or two?).

I've never been able to update 2M rows in one transaction in a
reasonable time (read: less than several hours) without dropping the
indices. Doing it in batches is way faster if you can't drop the
indices, and if you can leverage HOT updates.
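
A minimal sketch of that batched approach (the table name, column
names, and batch size here are all hypothetical):

UPDATE big_table
SET    status = 'processed'
WHERE  id IN (SELECT id
              FROM big_table
              WHERE status = 'pending'
              LIMIT 10000);
-- Repeat until 0 rows are updated. Small batches let autovacuum reclaim
-- dead tuples between rounds, and HOT updates are only possible when no
-- indexed column is modified (here, 'status' is assumed to be unindexed).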

What I'm trying at this point is the following (table, column, and
index names below are placeholders):

BEGIN;
DROP INDEX my_table_uniq;  -- only one unique index exists
DELETE FROM my_table;
COPY my_table FROM STDIN;
COMMIT;
-- CREATE INDEX CONCURRENTLY cannot run inside a transaction block
CREATE UNIQUE INDEX CONCURRENTLY my_table_uniq ON my_table (id);

Do I understand correctly that DROP INDEX and CREATE INDEX are not transactional, and thus the index will disappear immediately for other transactions?  Am I better off in that case moving the DROP INDEX outside the transaction?
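
That is, something like this instead (same placeholder names as above):

DROP INDEX my_table_uniq;  -- auto-commits when run outside a transaction
BEGIN;
DELETE FROM my_table;
COPY my_table FROM STDIN;
COMMIT;
CREATE UNIQUE INDEX CONCURRENTLY my_table_uniq ON my_table (id);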

The access pattern for the table is such that I can afford the occasional stray query hitting it without an index during the reload.  It's been pretty quick using the above approach.

Mike
