Thread: Maximum Possible Insert Performance?

Maximum Possible Insert Performance?

From: William Yu
My situation is this. We have a semi-production server where we
pre-process data and then upload the finished data to our production
servers. We need the fastest possible write performance. Having the DB
go corrupt due to power loss/OS crash is acceptable because we can
always restore from last night and re-run everything that was done since
then.

I already have fsync off. Short of buying more hardware -- which I will
probably do anyways once I figure out whether I need more CPU, memory or
disk -- what else can I do to max out the speed? Operation mix is about
50% select, 40% insert, 10% update.


Re: Maximum Possible Insert Performance?

From: Tom Lane
William Yu <wyu@talisys.com> writes:
> [ we don't care about data integrity ]
> I already have fsync off. Short of buying more hardware -- which I will
> probably do anyways once I figure out whether I need more CPU, memory or
> disk -- what else can I do to max out the speed? Operation mix is about
> 50% select, 40% insert, 10% update.

Batch operations so you commit more than one insert per transaction.
(With fsync off, this isn't such a killer consideration as it would be
with fsync on, but the per-transaction overhead is still nontrivial.)
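A minimal sketch of that batching idea (the table and column names here are hypothetical):

```sql
-- One transaction per batch instead of one per row; the per-transaction
-- overhead is paid once for the whole batch.
BEGIN;
INSERT INTO staging_orders (id, amount) VALUES (1, 10.00);
INSERT INTO staging_orders (id, amount) VALUES (2, 12.50);
-- ... hundreds or thousands more rows ...
COMMIT;
```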

Get rid of as many integrity constraints as you feel can reasonably be
postponed to the final upload.  FK checks are particularly painful.

Eliminate indexes where possible.
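Both suggestions together, as a sketch (constraint and index names are made up for illustration): drop what you can before the bulk load and put it back afterwards, so integrity is checked once in bulk rather than once per row.

```sql
-- Before the bulk load (hypothetical names):
ALTER TABLE staging_orders DROP CONSTRAINT staging_orders_customer_fk;
DROP INDEX staging_orders_amount_idx;

-- ... run the inserts / COPY here ...

-- After the load, restore the constraint and index in one pass:
ALTER TABLE staging_orders
    ADD CONSTRAINT staging_orders_customer_fk
    FOREIGN KEY (customer_id) REFERENCES customers (id);
CREATE INDEX staging_orders_amount_idx ON staging_orders (amount);
```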

Also (I hate to say this, but...) you should consider using Some Other
Database.  "I don't care about data integrity, only speed" sounds like
a good fit to MySQL ...

            regards, tom lane

Re: Maximum Possible Insert Performance?

From: Josh Berkus
William,

> I already have fsync off. Short of buying more hardware -- which I will
> probably do anyways once I figure out whether I need more CPU, memory or
> disk -- what else can I do to max out the speed? Operation mix is about
> 50% select, 40% insert, 10% update.

Disk.   Multi-channel RAID is where it's at, and/or RAID with a great write
cache enabled.  For really fast updates, I'd suggest 6-disk or even 8-disk
RAID 1+0.

As soon as you have gobs of extra disk space, jack your checkpoint_segments way
up -- enough for a couple of gigs of WAL.

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: Maximum Possible Insert Performance?

From: Joe Conway
William Yu wrote:
> My situation is this. We have a semi-production server where we
> pre-process data and then upload the finished data to our production
> servers. We need the fastest possible write performance. Having the DB
> go corrupt due to power loss/OS crash is acceptable because we can
> always restore from last night and re-run everything that was done since
> then.

If you can, use COPY -- it is far faster than INSERT.

See:
http://www.postgresql.org/docs/current/static/sql-copy.html
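For instance (the table, columns, and file path are assumptions), loading a tab-separated file in a single statement:

```sql
-- Server-side COPY: the file must be readable by the backend process.
COPY staging_orders (id, customer_id, amount)
    FROM '/tmp/orders.tsv';

-- Or from the client side via psql, which streams the file over the
-- connection instead of reading it on the server:
--   \copy staging_orders FROM 'orders.tsv'
```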

HTH,

Joe



Re: Maximum Possible Insert Performance?

From: Shridhar Daithankar
William Yu wrote:

> My situation is this. We have a semi-production server where we
> pre-process data and then upload the finished data to our production
> servers. We need the fastest possible write performance. Having the DB
> go corrupt due to power loss/OS crash is acceptable because we can
> always restore from last night and re-run everything that was done since
> then.
>
> I already have fsync off. Short of buying more hardware -- which I will
> probably do anyways once I figure out whether I need more CPU, memory or
> disk -- what else can I do to max out the speed? Operation mix is about
> 50% select, 40% insert, 10% update.

Mount the WAL on a RAM disk. The WAL is the most heavily hit area under heavy
updates/inserts. If you speed that up, things should be quite a bit faster.

Untested advice, though. Given that you can afford a crash, I would say it is
worth a try.

  Shridhar


Re: Maximum Possible Insert Performance?

From: Bill Moran
William Yu wrote:
> My situation is this. We have a semi-production server where we
> pre-process data and then upload the finished data to our production
> servers. We need the fastest possible write performance. Having the DB
> go corrupt due to power loss/OS crash is acceptable because we can
> always restore from last night and re-run everything that was done since
> then.
>
> I already have fsync off. Short of buying more hardware -- which I will
> probably do anyways once I figure out whether I need more CPU, memory or
> disk -- what else can I do to max out the speed? Operation mix is about
> 50% select, 40% insert, 10% update.

In line with what Tom Lane said, you may want to look at the various
in-memory databases available (I'm not familiar with any one to recommend,
though).  If you can fit the whole database in RAM, that would work
great; if not, you may be able to split the DB up and put the most
heavily used tables in the memory database.

I have also seen a number of tutorials on how to put a database on a
RAM disk.  This helps, but it's still not as fast as a database server
that's designed to keep all its data in RAM.

--
Bill Moran
Potential Technologies
http://www.potentialtech.com


Re: Maximum Possible Insert Performance?

From: William Yu
This is an intriguing thought which leads me to think about a similar
solution for even a production server and that's a solid state drive for
just the WAL. What's the max disk space the WAL would ever take up?
There's quite a few 512MB/1GB/2GB solid state drives available now in
the ~$200-$500 range and if you never hit those limits...

When my current job batch is done, I'll save a copy of the dir and give
the WAL on ramdrive a test. And perhaps even buy a SanDisk at the local
store and run that through the hopper.


Shridhar Daithankar wrote:
>
> Mount the WAL on a RAM disk. The WAL is the most heavily hit area under
> heavy updates/inserts. If you speed that up, things should be quite a bit faster.


Re: Maximum Possible Insert Performance?

From: Josh Berkus
William,

> When my current job batch is done, I'll save a copy of the dir and give
> the WAL on ramdrive a test. And perhaps even buy a SanDisk at the local
> store and run that through the hopper.

We'll be interested in the results.   The SanDisk won't be much of a
performance test; last I checked, their access speed was about 1/2 that of a
fast SCSI drive.   But it could be a feasibility test for the more expensive
RAMdrive approach.

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: Maximum Possible Insert Performance?

From: William Yu
Josh Berkus wrote:
> William,
>
>
>>When my current job batch is done, I'll save a copy of the dir and give
>>the WAL on ramdrive a test. And perhaps even buy a SanDisk at the local
>>store and run that through the hopper.
>
>
> We'll be interested in the results.   The SanDisk won't be much of a
> performance test; last I checked, their access speed was about 1/2 that of a
> fast SCSI drive.   But it could be a feasibility test for the more expensive
> RAMdrive approach.
>


The SanDisks do seem a bit pokey at 16MBps. On the other hand, you could
get 4 of these suckers, put them in a mega-RAID-0 stripe for 64MBps. You
shouldn't need to do mirroring with a solid state drive.

Time to Google up some more solid state drive vendors.


Re: Maximum Possible Insert Performance?

From: Josh Berkus
William,

> The SanDisks do seem a bit pokey at 16MBps. On the other hand, you could
> get 4 of these suckers, put them in a mega-RAID-0 stripe for 64MBps. You
> shouldn't need to do mirroring with a solid state drive.

I wouldn't count on RAID 0 improving the speed of SanDisks much.  How are you
connecting to them?  USB?   USB doesn't support fast parallel data access.

Now, if it turns out that 256MB ramdisks are less than 1/5 the cost of 1GB
ramdisks, then that's worth considering.

You're right, though, mirroring a solid state drive is pretty pointless; if
power fails, both mirrors are dead.

As I said before, though, we're all very interested in this test.  Using a
ramdisk for WAL has been discussed on this list numerous times but not
attempted by anyone who published their results.

All that aside, though, I think you should also experiment with the Background
Writer patch recently discussed on Hackers, as it may give you a performance
boost as well.
--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: Maximum Possible Insert Performance?

From: William Yu
Josh Berkus wrote:
> William,
>
>
>>The SanDisks do seem a bit pokey at 16MBps. On the other hand, you could
>>get 4 of these suckers, put them in a mega-RAID-0 stripe for 64MBps. You
>>shouldn't need to do mirroring with a solid state drive.
>
>
> I wouldn't count on RAID 0 improving the speed of SanDisks much.  How are you
> connecting to them?  USB?   USB doesn't support fast parallel data access.

You can get ATA SanDisks up to 2GB. Another vendor I checked out --
BitMicro -- has solid state drives for SATA, SCSI and Fibre Channel. I
definitely would not use USB SSDs -- USB performance would be too pokey
to be useful.

> Now, if it turns out that 256MB ramdisks are less than 1/5 the cost of 1GB
> ramdisks, then that's worth considering.

Looks like they're linear with size. SanDisk Flashdrive 1GB is about
$1000 while 256MB is $250.

> You're right, though, mirroring a solid state drive is pretty pointless; if
> power fails, both mirrors are dead.

Actually no. Solid state memory is non-volatile. They retain data even
without power.


Re: Maximum Possible Insert Performance?

From: Greg Stark
William Yu <wyu@talisys.com> writes:

> > You're right, though, mirroring a solid state drive is pretty pointless; if
> > power fails, both mirrors are dead.
>
> Actually no. Solid state memory is non-volatile. They retain data even without
> power.

Note that flash RAM only has a finite number of write cycles before it fails.

On the other hand, that might not be so bad for WAL, which writes sequentially;
you can easily calculate how close you are to the maximum. For things like
heap storage or swap it's awful, as you can get hot spots that get written to
thousands of times before the rest of the space is used.

--
greg

Re: Maximum Possible Insert Performance?

From: Shridhar Daithankar
William Yu wrote:

> This is an intriguing thought which leads me to think about a similar
> solution for even a production server and that's a solid state drive for
> just the WAL. What's the max disk space the WAL would ever take up?
> There's quite a few 512MB/1GB/2GB solid state drives available now in
> the ~$200-$500 range and if you never hit those limits...

The maximum number of WAL segments at any time is 2*(number of checkpoint
segments)+1, IIRC.

So if you have 3 checkpoint segments, you cannot have more than 7 WAL segments
at any time. Give or take 1.

Correct me if I am wrong.

  Shridhar


Re: Maximum Possible Insert Performance?

From: Tom Lane
Shridhar Daithankar <shridhar_daithankar@persistent.co.in> writes:
> William Yu wrote:
>> This is an intriguing thought which leads me to think about a similar
>> solution for even a production server and that's a solid state drive for
>> just the WAL. What's the max disk space the WAL would ever take up?

> The maximum number of WAL segments at any time is 2*(number of checkpoint
> segments)+1, IIRC.
> So if you have 3 checkpoint segments, you cannot have more than 7 WAL
> segments at any time. Give or take 1.

I don't believe that's a *hard* limit.  The system tries to schedule
checkpoints often enough to prevent WAL from getting bigger than that,
but if you had a sufficiently big spike in update activity, it's at
least theoretically possible that more than checkpoint_segments segments
could be filled before the concurrently running checkpoint finishes and
releases some old segments.

The odds of this being a real problem are small, especially if you don't
try to fit on an undersized SSD by reducing checkpoint_segments.  I'd
think that a 512MB SSD would be plenty of space for ordinary update load
levels ...
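The sizing arithmetic behind that, assuming the default 16MB WAL segment size (the checkpoint_segments value below is illustrative, not a recommendation):

```sql
-- Expected steady-state pg_xlog footprint ~ (2 * checkpoint_segments + 1) * 16MB.
-- With checkpoint_segments = 32: (2*32 + 1) * 16MB = 1040MB,
-- which fits a 2GB SSD with headroom for checkpoint spikes.
SHOW checkpoint_segments;
```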

            regards, tom lane

Re: Maximum Possible Insert Performance?

From: William Yu
Josh Berkus wrote:
> William,
>
>
>>When my current job batch is done, I'll save a copy of the dir and give
>>the WAL on ramdrive a test. And perhaps even buy a SanDisk at the local
>>store and run that through the hopper.
>
>
> We'll be interested in the results.   The SanDisk won't be much of a
> performance test; last I checked, their access speed was about 1/2 that of a
> fast SCSI drive.   But it could be a feasibility test for the more expensive
> RAMdrive approach.

Some initial numbers. I simulated a CPU increase by underclocking the
processors. Most of the time, performance does not scale linearly with
clock speed but since I also underclocked the FSB and memory bandwidth
with the CPU, it's nearly an exact match.

1.15GHz    6.14
1.53GHz    6.97 +33% CPU = +13.5% performance

I then simulated adding a heapload of extra memory by running my job a
second time. Unfortunately, to keep my 25GB DB mostly cached in memory,
the word heapload is too accurate.

Run 1      6.97
Run 2      7.99 +14%

I popped in an extra IDE hard drive to store the WAL files and that
boosted the numbers by a little. From looking at iostat, the ratio
looked like 300K/s WAL for 1MB/s data.

WAL+Data on same disk    6.97
WAL+Data separated       7.26 +4%

I then tried to put the WAL directory onto a ramdisk. I turned off
swapping, created a tmpfs mount point and copied the pg_xlog directory
over. Everything looked fine as far as I could tell, but Postgres just
panic'd with a "file permissions" error. Anybody have thoughts as to why
tmpfs would not work?


Re: Maximum Possible Insert Performance?

From: Tom Lane
William Yu <wyu@talisys.com> writes:
> I then tried to put the WAL directory onto a ramdisk. I turned off
> swapping, created a tmpfs mount point and copied the pg_xlog directory
> over. Everything looked fine as far as I could tell, but Postgres just
> panic'd with a "file permissions" error. Anybody have thoughts as to why
> tmpfs would not work?

I'd say you got the file or directory ownership or permissions wrong.

            regards, tom lane

Re: Maximum Possible Insert Performance?

From: William Yu
Tom Lane wrote:
> William Yu <wyu@talisys.com> writes:
>
>>I then tried to put the WAL directory onto a ramdisk. I turned off
>>swapping, created a tmpfs mount point and copied the pg_xlog directory
>>over. Everything looked fine as far as I could tell, but Postgres just
>>panic'd with a "file permissions" error. Anybody have thoughts as to why
>>tmpfs would not work?
>
>
> I'd say you got the file or directory ownership or permissions wrong.

I did a mv instead of a cp which duplicates ownership & permissions exactly.


Re: Maximum Possible Insert Performance?

From: Dror Matalon
But the permissions of the base ramdisk might be wrong. I'd su to the
user that you run postgres as (probably postgres), then make sure that
user can go to the directory where the log and the database files are
and can see the files.

On Wed, Nov 26, 2003 at 10:03:47AM -0800, William Yu wrote:
> Tom Lane wrote:
> >William Yu <wyu@talisys.com> writes:
> >
> >>I then tried to put the WAL directory onto a ramdisk. I turned off
> >>swapping, created a tmpfs mount point and copied the pg_xlog directory
> >>over. Everything looked fine as far as I could tell, but Postgres just
> >>panic'd with a "file permissions" error. Anybody have thoughts as to why
> >>tmpfs would not work?
> >
> >
> >I'd say you got the file or directory ownership or permissions wrong.
>
> I did a mv instead of a cp which duplicates ownership & permissions exactly.

--
Dror Matalon
Zapatec Inc
1700 MLK Way
Berkeley, CA 94709
http://www.fastbuzz.com
http://www.zapatec.com

Re: Maximum Possible Insert Performance?

From: Suchandra Thapa
On Mon, 2003-11-24 at 19:16, Greg Stark wrote:
> William Yu <wyu@talisys.com> writes:
>
> > > You're right, though, mirroring a solid state drive is pretty pointless; if
> > > power fails, both mirrors are dead.
> >
> > Actually no. Solid state memory is non-volatile. They retain data even without
> > power.
>
> Note that flash RAM only has a finite number of write cycles before it fails.
>
> On the other hand, that might not be so bad for WAL, which writes sequentially;
> you can easily calculate how close you are to the maximum. For things like
> heap storage or swap it's awful, as you can get hot spots that get written to
> thousands of times before the rest of the space is used.

I could be wrong, but I was under the impression that most of the newer
flash disks tend to spread writes out over the drive so that hot spots
are minimized.

--
Suchandra Thapa <ssthapa@netzero.com>
