Re: MVCC for massively parallel inserts - Mailing list pgsql-general

From Vivek Khera
Subject Re: MVCC for massively parallel inserts
Msg-id x7r7yayuvq.fsf@yertle.int.kciLink.com
In response to MVCC for massively parallel inserts  (Steven D.Arnold <stevena@neosynapse.net>)
List pgsql-general
>>>>> "GS" == Greg Stark <gsstark@mit.edu> writes:

GS> I would agree and if you really need the I/O bandwidth you can go
GS> to much larger stripe sets than even this. The documentation I've
GS> seen before suggested there were benefits up to stripe sets as
GS> large as twelve disks across. That would be 24 drives if you're
GS> also doing mirroring.

I did a bunch of testing with a 14-disk SCSI array.  I found that RAID5
performed better than both RAID10 and RAID50.

GS> Ideally separating WAL, index, and heap files is good, but you
GS> would have to experiment to see which works out fastest for a
GS> given number of drives.

I found that putting the WAL on its own array (in my case a mirror on
the other RAID controller channel) helped quite a bit.  I don't think
it is easy to split off index files to alternate locations with Postgres.
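
For anyone wanting to try the WAL-on-its-own-array setup: the approach
described in the PostgreSQL documentation of this era is to shut down the
postmaster, move the pg_xlog directory to the other disk, and leave a
symbolic link in its place. Below is a minimal sketch of that move in
Python. The function name and the paths in the usage note are hypothetical,
and the sketch assumes the server is already stopped (it does not check).

```python
import os
import shutil


def relocate_wal(data_dir, new_wal_dir, wal_name="pg_xlog"):
    """Move the WAL directory out of data_dir and symlink it back.

    Assumption: the postmaster is shut down before this is called.
    Returns the path of the symlink left inside data_dir.
    """
    old_wal = os.path.join(data_dir, wal_name)
    new_wal_dir = os.path.abspath(new_wal_dir)

    if os.path.islink(old_wal):
        raise RuntimeError(f"{old_wal} is already a symlink")
    if not os.path.isdir(old_wal):
        raise FileNotFoundError(old_wal)
    if os.path.exists(new_wal_dir):
        # Refuse to overwrite or nest inside an existing directory.
        raise FileExistsError(new_wal_dir)

    # Moves the directory; copies and deletes if it crosses filesystems.
    shutil.move(old_wal, new_wal_dir)
    # Leave a link at the old location so the server still finds its WAL.
    os.symlink(new_wal_dir, old_wal)
    return old_wal
```

A call would look something like
relocate_wal("/var/lib/pgsql/data", "/mnt/wal_mirror/pg_xlog"), with both
paths standing in for whatever your layout actually is. Ownership and
permissions on the new location still have to match what the server expects.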

Increasing the number of checkpoint segments was one of the biggest
improvements I observed for mass-insert performance (as tested while
doing a restore on a multi-million row database).

The combination of having the WAL on a separate disk, and letting the
WAL grow quite large (by allowing more checkpoint segments), has been
very good for my performance and also for reducing disk bandwidth
requirements.
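
To get a feel for how large the WAL can grow: the documentation for
releases that have the checkpoint_segments setting says there will normally
be no more than 2 * checkpoint_segments + 1 segment files, each 16 MB by
default. A rough sketch of that arithmetic follows. The values 30 and 100
are only illustrative, not settings from the tests above, and the bound is
a "normally not more than" figure rather than a hard limit.

```python
SEGMENT_MB = 16  # default size of one WAL segment file


def max_wal_segments(checkpoint_segments):
    """Usual upper bound on the number of WAL segment files."""
    return 2 * checkpoint_segments + 1


def max_wal_mb(checkpoint_segments):
    """Approximate disk space (MB) the WAL directory may occupy."""
    return max_wal_segments(checkpoint_segments) * SEGMENT_MB


# 3 is the shipped default; 30 and 100 are illustrative larger settings.
for n in (3, 30, 100):
    print(f"checkpoint_segments={n}: up to {max_wal_segments(n)} files, "
          f"about {max_wal_mb(n)} MB")
```

With the default of 3 this comes to 7 files (112 MB); at 30 it is 61 files
(976 MB), which is worth keeping in mind when sizing the WAL disk.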

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.                Khera Communications, Inc.
Internet: khera@kciLink.com       Rockville, MD  +1-301-869-4449 x806
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/
