On Oct 25, 2007, at 10:30 AM, Jean-David Beyer wrote:
> I have just changed around some programs that ran too slowly (too much
> time in io-wait), and they sped up greatly. This was not unexpected, but
> I wonder about the limitations.
>
> By transaction, I mean a single INSERT or a few related INSERTs.
>
> What I used to do is roughly like this:
>
> for each file {
>     for each record {
>         BEGIN WORK;
>         INSERT stuff in table(s);
>         if error {
>             ROLLBACK WORK;
>         }
>         else {
>             COMMIT WORK;
>         }
>     }
> }
>
> The speedup was the obvious one:
>
> for each file {
>     BEGIN WORK;
>     for each record {
>         INSERT stuff in table(s);
>     }
>     if error {
>         ROLLBACK WORK;
>     }
>     else {
>         COMMIT WORK;
>     }
> }
>
> This means, of course, that the things I think of as transactions have
> been bunched into a much smaller number of what PostgreSQL thinks of as
> large transactions, since there is only one per file rather than one per
> record. Now if a file has several thousand records, this seems to work
> out just great.
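
The per-file batching pattern above can be sketched concretely. This is a
minimal illustration using SQLite from Python's standard library so it runs
self-contained; the `stuff` table and `load_file` helper are made-up names
for illustration, but the same shape applies to PostgreSQL through any
client library:

```python
import sqlite3

def load_file(conn, records):
    """Insert every record from one file inside a single transaction."""
    cur = conn.cursor()
    cur.execute("BEGIN")
    try:
        for rec in records:
            cur.execute("INSERT INTO stuff (val) VALUES (?)", (rec,))
    except Exception:
        cur.execute("ROLLBACK")   # lose the whole file's worth on any error
        raise
    else:
        cur.execute("COMMIT")     # one COMMIT per file, not one per record

# isolation_level=None puts sqlite3 in autocommit mode, so the explicit
# BEGIN/COMMIT/ROLLBACK above are the only transaction boundaries.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE stuff (val TEXT)")
load_file(conn, ["a", "b", "c"])
print(conn.execute("SELECT COUNT(*) FROM stuff").fetchone()[0])  # prints 3
```

The point of the `else` branch is the same as in the pseudocode: commit work
exactly once per file, so per-record fsync overhead disappears.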
>
> But what is the limitation on such a thing? In this case, I am just
> populating the database and there are no other users at such a time. I
> am willing to lose the whole insert of a file if something goes wrong --
> I would fix whatever went wrong and start over anyway.
>
> But at some point, disk IO would have to be done. Is this just a
> function of how big /pgsql/data/postgresql.conf's shared_buffers is set
> to? Or does it have to do with wal_buffers and checkpoint_segments?
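
Those are the usual knobs for a bulk load on the 8.x series. A hedged
postgresql.conf sketch follows; the values are illustrative only, not
recommendations, since the right numbers depend on the machine and should
be benchmarked:

```
# postgresql.conf -- illustrative values only; tune per machine
shared_buffers = 128MB        # shared buffer cache
wal_buffers = 1MB             # WAL buffered in memory before flush at commit
checkpoint_segments = 32      # spread checkpoints out during the load
```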
You're reading data from a file and generating inserts? Can you not
use COPY? That would give the best performance.
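
A sketch of what that might look like, with hypothetical table and file
names; COPY loads the whole file in one command instead of one INSERT per
row:

```sql
-- Server-side COPY: the backend reads the file directly. The default
-- text format expects one tab-delimited row per line.
COPY stuff FROM '/path/to/datafile';

-- Client-side equivalent from psql (the file is read on the client):
-- \copy stuff from '/path/to/datafile'
```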
Erik Jones
Software Developer | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)
Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com