Re: improving write performance for logging application - Mailing list pgsql-performance

From: Steve Eckmann
Subject: Re: improving write performance for logging application
Date:
Msg-id: 43BBD4EC.5020207@computer.org
In response to: Re: improving write performance for logging application (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: improving write performance for logging application (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-performance

Tom Lane wrote:
> Steve Eckmann <eckmann@computer.org> writes:
>> We also found that we could improve MySQL performance significantly
>> using MySQL's "INSERT" command extension allowing multiple value-list
>> tuples in a single command; the rate for MyISAM tables improved to
>> about 2600 objects/second. PostgreSQL doesn't support that language
>> extension. Using the COPY command instead of INSERT might help, but
>> since rows are being generated on the fly, I don't see how to use COPY
>> without running a separate process that reads rows from the
>> application and uses COPY to write to the database.
>
> Can you conveniently alter your application to batch INSERT commands
> into transactions?  Ie
>
>     BEGIN;
>     INSERT ...;
>     ... maybe 100 or so inserts ...
>     COMMIT;
>     BEGIN;
>     ... lather, rinse, repeat ...
>
> This cuts down the transactional overhead quite a bit.  A downside is
> that you lose multiple rows if any INSERT fails, but then the same would
> be true of multiple VALUES lists per INSERT.
>
>             regards, tom lane
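
(For concreteness, a minimal libpq sketch of the batching Tom describes might look like the following. The array of INSERT strings, the batch size, and the error handling are illustrative, not from the thread.)

    /* Sketch of the batching pattern above, using libpq.  If one
     * INSERT fails, the rest of the transaction is aborted until the
     * final COMMIT (which then acts as a ROLLBACK). */
    #include <stdio.h>
    #include <libpq-fe.h>

    static void insert_batch(PGconn *conn, const char *inserts[], int n)
    {
        PGresult *res = PQexec(conn, "BEGIN");
        PQclear(res);

        for (int i = 0; i < n; i++)         /* e.g. n around 100 */
        {
            res = PQexec(conn, inserts[i]); /* one complete INSERT statement */
            if (PQresultStatus(res) != PGRES_COMMAND_OK)
                fprintf(stderr, "INSERT failed: %s", PQerrorMessage(conn));
            PQclear(res);
        }

        res = PQexec(conn, "COMMIT");       /* one commit per batch, not per row */
        PQclear(res);
    }
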
Thanks for the suggestion, Tom. Yes, I think I could do that. But I thought what I was doing now was effectively the same, because the PostgreSQL 8.0.0 documentation says (section 27.3.1): "It is allowed to include multiple SQL commands (separated by semicolons) in the command string. Multiple queries sent in a single PQexec call are processed in a single transaction...."

Our simulation application has nearly 400 event types, each of which is a C++ class for which we have a corresponding database table. So every thousand events or so I issue one PQexec() call for each event type that has unlogged instances, sending INSERT commands for all instances. For example:

    PQexec(dbConn, "INSERT INTO FlyingObjectState VALUES (...); INSERT INTO FlyingObjectState VALUES (...); ...");

My thought was that this would be a good compromise between minimizing transactions (one per event class per buffering interval instead of one per event) and minimizing disk seeking (since all queries in a single transaction insert rows into the same table). Am I overlooking something here? One thing I haven't tried is increasing the buffering interval from 1000 events to, say, 10,000. It turns out that 1000 is a good number for Versant, the object database system we're replacing, and for MySQL, so I assumed it would be a good number for PostgreSQL, too.
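
(Incidentally, on the COPY question quoted above: libpq does let the generating process drive COPY directly through PQputCopyData/PQputCopyEnd, so no separate reader process would be needed. A rough sketch, with row formatting and error handling simplified; the table name reuses the example above.)

    /* Rough sketch: streaming rows to COPY from the same process via
     * libpq's copy interface.  Each entry in lines[] is assumed to be
     * one tab-separated row terminated by '\n'. */
    #include <stdio.h>
    #include <string.h>
    #include <libpq-fe.h>

    static int copy_rows(PGconn *conn, const char *lines[], int n)
    {
        PGresult *res = PQexec(conn, "COPY FlyingObjectState FROM STDIN");
        if (PQresultStatus(res) != PGRES_COPY_IN)
        {
            fprintf(stderr, "COPY failed: %s", PQerrorMessage(conn));
            PQclear(res);
            return -1;
        }
        PQclear(res);

        for (int i = 0; i < n; i++)
            if (PQputCopyData(conn, lines[i], (int) strlen(lines[i])) != 1)
                return -1;

        if (PQputCopyEnd(conn, NULL) != 1)
            return -1;

        res = PQgetResult(conn);        /* result of the COPY itself */
        int ok = (PQresultStatus(res) == PGRES_COMMAND_OK) ? 0 : -1;
        PQclear(res);
        return ok;
    }
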

Regards,  Steve
