Re: ZFS prefetch considered evil? - Mailing list pgsql-general

From Alban Hertroys
Subject Re: ZFS prefetch considered evil?
Msg-id E4BE2D78-DE8F-4D3A-AA6C-FD9C4D15E2C7@solfertje.student.utwente.nl
In response to Re: ZFS prefetch considered evil?  (Yaroslav Tykhiy <yar@barnet.com.au>)
Responses Re: ZFS prefetch considered evil?
List pgsql-general
On Jul 9, 2009, at 3:53 AM, Yaroslav Tykhiy wrote:

> On 08/07/2009, at 8:39 PM, Alban Hertroys wrote:
>
>> On Jul 8, 2009, at 2:50 AM, Yaroslav Tykhiy wrote:
>> IIRC prefetch tries to keep data (disk blocks?) in memory that it
>> fetched recently.
>
> What you described is just a disk cache.  And a trivial
> implementation of prefetch would work as follows:  An application or
> other file/disk consumer asks the provider (driver, kernel,
> whatever) to read, say, 2 disk blocks worth of data.  The provider
> thinks, "I know you are short-sighted; I bet you are going to ask
> for more contiguous blocks very soon," so it schedules a disk read
> for many more contiguous blocks than requested and caches them in
> RAM.  For bulk data applications such as file serving this trick
> works as a charm.  But other applications do truly random access and
> they never come back after the prefetched blocks; in this case both
> disk bandwidth and cache space are wasted.  An advanced
> implementation can try to distinguish sequential and random access
> patterns, but in reality it appears to be a challenging task.
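
To make the trade-off described above concrete, here is a toy simulation
of such a trivial prefetcher. It is only a sketch: the function name,
the unbounded cache, and the fixed read-ahead of 7 blocks are
assumptions made for illustration and say nothing about how ZFS
actually implements prefetch.

```python
import random


def simulate(requests, readahead):
    """Replay block requests through a naive read-ahead cache.

    Returns (disk_reads, blocks_read, wasted), where `wasted` counts
    blocks that were prefetched but never requested afterwards.
    """
    cache = set()    # blocks held in RAM (unbounded, for simplicity)
    unused = set()   # prefetched blocks nobody has asked for yet
    disk_reads = 0
    blocks_read = 0
    for block in requests:
        if block in cache:
            unused.discard(block)  # a prefetched block paid off
            continue
        # Cache miss: one disk read for the block plus `readahead` more.
        disk_reads += 1
        for b in range(block, block + 1 + readahead):
            if b not in cache:
                cache.add(b)
                blocks_read += 1
                if b != block:
                    unused.add(b)
    return disk_reads, blocks_read, len(unused)


sequential = list(range(1000))
rng = random.Random(0)
scattered = [rng.randrange(10_000_000) for _ in range(1000)]

seq_result = simulate(sequential, readahead=7)
rnd_result = simulate(scattered, readahead=7)
print("sequential:", seq_result)
print("random:    ", rnd_result)
```

With sequential requests every prefetched block ends up being used and
the number of separate disk reads drops by the read-ahead factor. With
scattered requests almost every prefetched block is wasted, which costs
both bandwidth and cache space.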

Ah yes, thanks for the correction. I now remember reading about that
before. Makes the name 'prefetch' that much more fitting, doesn't it?

And as you say, it's not that useful a feature with random access
(I hadn't thought about that). In fact, I can imagine that it might
delay moving the disk heads to the next desired (random) position,
because the FS is still reading data that it isn't going to need
(except in some lucky cases), unless it manages to detect the
randomness of the access pattern. Of course you can't predict
randomness from the read requests seen so far, since you don't know
which requests are still to come. You can, however, assume the access
is random if historic requests turned out to be random, but then
you'd want to know for which area of the FS that is the case.
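
A rough sketch of that idea: keep a short history per region of the
disk and only prefetch where recent requests were mostly sequential.
The class name, region size, window length and threshold are all
hypothetical choices for illustration. This is not how ZFS decides
whether to prefetch.

```python
from collections import defaultdict, deque


class RegionHistory:
    """Track, per disk region, how sequential recent requests were."""

    def __init__(self, region_size=1024, window=32, threshold=0.5):
        self.region_size = region_size  # blocks per region (assumed)
        self.threshold = threshold      # min. fraction of sequential hits
        self.last_block = {}            # region -> last block requested
        self.history = defaultdict(lambda: deque(maxlen=window))

    def record(self, block):
        """Note a request and whether it directly followed the previous one."""
        region = block // self.region_size
        prev = self.last_block.get(region)
        if prev is not None:
            self.history[region].append(block == prev + 1)
        self.last_block[region] = block

    def should_prefetch(self, block):
        """Prefetch only if this region's recent history looks sequential."""
        seen = self.history.get(block // self.region_size)
        if not seen:
            return True  # no history yet: fall back to prefetching
        return sum(seen) / len(seen) >= self.threshold


tracker = RegionHistory()
for b in range(100):                       # sequential scan in region 0
    tracker.record(b)
for offset in (7, 900, 13, 512, 300, 44):  # scattered reads in region 5
    tracker.record(5 * 1024 + offset)

print(tracker.should_prefetch(100))        # region 0
print(tracker.should_prefetch(5 * 1024))   # region 5
```

The history only says something about the past, so a region that
switches from random to sequential access keeps being treated as random
until enough new requests have pushed the old ones out of the window.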

I don't know how you partitioned your zpools, but to me it seems like
it'd be preferable to have the PostgreSQL tablespaces (and possibly
other data that's likely to be accessed randomly) in a separate zpool
from the rest of the system so you can restrict disabling prefetch to
just that file-system. You probably already did that...

It could be interesting to see how clustering the relevant tables
would affect prefetch performance; I'd expect disk access to be
less random that way. It's probably still better to disable prefetch
though.

>> ZFS uses quite a bit of memory, so if you distributed all your
>> memory to be used by just postgres and disk cache then you didn't
>> leave enough space for the prefetch data and _something_ will be
>> moved to swap.
>
> I hope you know that FreeBSD is exceptionally good at distributing
> available memory between its consumers.  That said, useless prefetch
> indeed puts extra pressure on disk cache and results in unnecessary
> cache evictions, thus making things even worse.  It is true that ZFS
> is memory hungry and so rather sensitive to non-optimal memory use
> patterns.  Useless prefetch wastes memory that could be used to
> speed up other ZFS operations.

Yes, I do know that; it's one of the reasons I prefer it over other
OSs. The key phrase here was 'available memory' though, under the
assumption that something was hitting swap. But apparently that wasn't
the case.

>> You'll probably want to ask about this on the FreeBSD mailing lists
>> as well, they'll know much better than I do ;)
>
> Are you a local FreeBSD expert? ;-)  Jokes apart, I don't think this
> topic has to do with FreeBSD as such; it is mostly about making the
> advanced technologies of Postgresql and ZFS go well together.  Even
> ZFS developers admit that in database related applications
> exceptions from general ZFS practices and rules may be called for.

I wouldn't call myself an expert; I just use it on a few systems at
home and am more a user than an administrator. I do read the
stable/current mailing lists though (since 2004 according to my mail
client) and keep an eye on (among others) the ZFS discussions, as I
feel tempted to change my gmirrors into zpools some day. It certainly
looks like an interesting FS, very flexible and reliable.

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.

