Thread: Bulk INSERT performance in 7.4.1

Bulk INSERT performance in 7.4.1

From
Florian Weimer
Date:
After an upgrade to 7.4.1 (from 7.3) we see a severe performance
regression in bulk INSERTs.

This is apparently caused by constant checkpointing (every 10 to 20
seconds).  I've already increased the number of checkpoint segments to
32, but currently, there are just 10 or 11 files in the pg_xlog
directory.  With 7.3, we had configured checkpoint_segements at 16, and
there were 33 pg_xlog files.  Checkpoints happened every couple of
minutes.

How can I reduce the checkpoint frequency?

(I'd like to try that first because it's the most obvious anomaly.
Maybe we can look at the involved table later.)

--
Current mail filters: many dial-up/DSL/cable modem hosts, and the
following domains: atlas.cz, bigpond.com, freenet.de, hotmail.com,
libero.it, netscape.net, postino.it, tiscali.co.uk, tiscali.cz,
tiscali.it, voila.fr, wanadoo.fr, yahoo.com.

Re: Bulk INSERT performance in 7.4.1

From
Florian Weimer
Date:
Florian Weimer wrote:

> After an upgrade to 7.4.1 (from 7.3) we see a severe performance
> regression in bulk INSERTs.

In turns out that we were running the default configuration, and not the
tuned one in /etc/postgresql. *blush*

After increasing the number of checkpoint segments and the shared-memory
buffers, performance is back to the expected levels.  It might even be a
bit faster.

--
Current mail filters: many dial-up/DSL/cable modem hosts, and the
following domains: atlas.cz, bigpond.com, freenet.de, hotmail.com,
libero.it, netscape.net, postino.it, tiscali.co.uk, tiscali.cz,
tiscali.it, voila.fr, wanadoo.fr, yahoo.com.

Re: Bulk INSERT performance in 7.4.1

From
Vivek Khera
Date:
>>>>> "FW" == Florian Weimer <fw@deneb.enyo.de> writes:

FW> After increasing the number of checkpoint segments and the shared-memory
FW> buffers, performance is back to the expected levels.  It might even be a
FW> bit faster.

If you've got the time, could you try also doing the full bulk insert
test with the checkpoint log files on another physical disk?  See if
that's any faster.

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.                Khera Communications, Inc.
Internet: khera@kciLink.com       Rockville, MD  +1-301-869-4449 x806
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/

Re: Bulk INSERT performance in 7.4.1

From
Vivek Khera
Date:
On Mar 3, 2004, at 4:37 PM, Greg Spiegelberg wrote:

> Would turning autocommit off help?
>

doubtful, since the bulk insert is all one transaction.


Re: Bulk INSERT performance in 7.4.1

From
Greg Spiegelberg
Date:
Would turning autocommit off help?


Vivek Khera wrote:
>>>>>>"FW" == Florian Weimer <fw@deneb.enyo.de> writes:
>
>
> FW> After increasing the number of checkpoint segments and the shared-memory
> FW> buffers, performance is back to the expected levels.  It might even be a
> FW> bit faster.
>
> If you've got the time, could you try also doing the full bulk insert
> test with the checkpoint log files on another physical disk?  See if
> that's any faster.
>


--
Greg Spiegelberg
  Sr. Product Development Engineer
  Cranel, Incorporated.
  Phone: 614.318.4314
  Fax:   614.431.8388
  Email: gspiegelberg@Cranel.com
Cranel. Technology. Integrity. Focus.



Re: Bulk INSERT performance in 7.4.1

From
Florian Weimer
Date:
Vivek Khera wrote:

> If you've got the time, could you try also doing the full bulk insert
> test with the checkpoint log files on another physical disk?  See if
> that's any faster.

We have been doing that for a few weeks, but the performance
improvements are less than what we expected.  There is hardly any disk
activity on the log RAID, even during checkpointing.

After I activated the tuned configuration, we are again mostly CPU-bound
(it seems that updating all four indices is quite expensive).  The
bulk INSERT process runs single-threaded right now, and if we switched
to multiple processes for that, we could reach some 1,500 INSERTs per
second, I believe.  This is more than sufficient for us; our real-time
data collector is tuned to emit about 150 records per second, on the
average.  (There is an on-disk queue to compensate temporary problems,
such as spikes in the data rate and database updates gone awry.)

--
Current mail filters: many dial-up/DSL/cable modem hosts, and the
following domains: atlas.cz, bigpond.com, freenet.de, hotmail.com,
libero.it, netscape.net, postino.it, tiscali.co.uk, tiscali.cz,
tiscali.it, voila.fr, wanadoo.fr, yahoo.com.