Re: Fusion-io ioDrive - Mailing list pgsql-performance

From Merlin Moncure
Subject Re: Fusion-io ioDrive
Msg-id b42b73150807070623n14787d19g1cc416b88735438f@mail.gmail.com
In response to Re: Fusion-io ioDrive  ("Jonah H. Harris" <jonah.harris@gmail.com>)
Responses Re: Fusion-io ioDrive
List pgsql-performance
On Wed, Jul 2, 2008 at 7:41 AM, Jonah H. Harris <jonah.harris@gmail.com> wrote:
> On Tue, Jul 1, 2008 at 8:18 PM, Jeffrey Baker <jwbaker@gmail.com> wrote:
>> Basically the ioDrive is smoking the RAID.  The only real problem with
>> this benchmark is that the machine became CPU-limited rather quickly.
>
> That's traditionally the problem with everything being in memory.
> Unless the database algorithms are designed to exploit L1/L2 cache and
> RAM, which is not the case for a disk-based DBMS, you generally lose
> some concurrency due to the additional CPU overhead of playing only
> with memory.  This is generally acceptable if you're going to trade
> off higher concurrency for faster service times.  And, it isn't only
> evidenced in single systems where a disk-based DBMS is 100% cached,
> but also in most shared-memory clustering architectures.
>
> In most cases, when you're waiting on disk I/O, you can generally
> support higher concurrency because the OS can utilize the CPU's free
> cycles (during the wait) to handle other users.  In short, sometimes,
> disk I/O is a good thing; it just depends on what you need.

I have a lot of problems with your statements.  First of all, we are
not really talking about 'RAM' storage...I think your comments would
be more on point if we were talking about mounting database storage
directly from the server memory, for example.  Server memory and cpu are
involved only to the extent that the o/s is using them for caching and
filesystem things, and inside the device driver.

Also, your comments seem to indicate that having a slower device leads
to higher concurrency because it allows the process to yield and do
other things.  This is IMO simply false.  With faster storage cpu
loads will increase but only because the overall system throughput
increases and cpu/memory 'work' increases in terms of overall system
activity.  Presumably as storage approaches speeds of main system
memory the algorithms for dealing with it will become simpler (not
having to go through acrobatics to try to make everything
sequential) and thus faster.
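
A toy model makes the point about cpu load concrete.  This is only a
sketch under stated assumptions, not a measurement of the ioDrive or of
PostgreSQL: one cpu, a fixed number of clients issuing requests back to
back, and each request needing a fixed slice of cpu time plus a fixed
storage wait.  The function name closed_system and all of the numbers
are made up for illustration.

```python
def closed_system(clients, cpu_ms, io_ms):
    """Asymptotic throughput bound for a closed system with one cpu.

    Assumption: every request needs cpu_ms of cpu time and io_ms of
    storage wait, and clients have no think time between requests.
    Returns (requests per second, cpu utilization between 0 and 1).
    """
    per_client_limit = clients / (cpu_ms + io_ms)  # requests per ms
    cpu_limit = 1.0 / cpu_ms                       # requests per ms
    throughput = min(per_client_limit, cpu_limit)
    return throughput * 1000.0, throughput * cpu_ms


# Hypothetical numbers: same cpu cost per request, only the storage wait differs.
for label, io_ms in [("slow storage", 8.0), ("fast storage", 0.1)]:
    rps, util = closed_system(clients=4, cpu_ms=0.5, io_ms=io_ms)
    print(f"{label}: {rps:.0f} req/s, cpu {util:.0%}")
```

In this model the fast device shows a much busier cpu, but only because
it completes several times as many requests per second with the same
number of clients.  The slow device leaves the cpu idle without serving
anyone extra.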

I also find the remarks about software 'optimizing' for strict hardware
assumptions (L1+L2 cache) a little suspicious.  In some old programs I
remember keeping a giant C 'union' of critical structures that was
exactly 8k to fit in the 486 cpu cache.  In modern terms I think that
type of programming (sans some specialized environments) is usually
counter-productive...I think PostgreSQL's approach of deferring as
much work as possible to the o/s is a great approach.

merlin
