Re: Parallel Seq Scan - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: Parallel Seq Scan
Date
Msg-id 20150130211633.GR3854@tamriel.snowman.net
In response to Re: Parallel Seq Scan  (Daniel Bausch <bausch@dvs.tu-darmstadt.de>)
List pgsql-hackers
Daniel,

* Daniel Bausch (bausch@dvs.tu-darmstadt.de) wrote:
> I researched this topic a long time ago.  One notable fact is that
> active prefetching disables automatic readahead prefetching (by the
> Linux kernel), which can occur in larger granularities than 8K.
> Automatic readahead prefetching occurs when consecutive addresses are
> read, which may happen in a seqscan but also by "accident" in an
> indexscan in correlated cases.

That strikes me as a pretty good point to consider.

> My conclusion was to NOT prefetch seqscans, because the OS does well
> enough without advice.  Prefetching indexscan heap accesses is very
> valuable, though, but you need to detect the accidental sequential
> accesses so as not to hurt performance in correlated cases.

Seems like we might be able to do that.  It's not that different from
what we do in the bitmap scan case; we'd just look at the bitmap and
see if there are long runs of 1's.
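
To make the idea concrete, here is a rough sketch of that kind of run
detection.  It is not PostgreSQL code: the function name, the
SEQ_RUN_MIN threshold and the plain array of block numbers are all
made up for illustration.  It walks the heap block numbers in visit
order and marks for prefetch only those blocks that are not part of a
long run of consecutive blocks, leaving the long runs to kernel
readahead.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical threshold: runs of at least this many consecutive
 * blocks are assumed to be handled well by kernel readahead. */
#define SEQ_RUN_MIN 4

/*
 * blocks[]        heap block numbers in the order they will be visited
 * want_prefetch[] output; true where an explicit prefetch looks useful
 *
 * Returns the number of blocks marked for prefetch.
 */
size_t
mark_prefetch_targets(const uint32_t *blocks, size_t n, bool *want_prefetch)
{
    size_t marked = 0;
    size_t i = 0;

    while (i < n)
    {
        /* Extend the run while each block directly follows the previous one. */
        size_t run_end = i + 1;

        while (run_end < n && blocks[run_end] == blocks[run_end - 1] + 1)
            run_end++;

        /* Long runs look sequential to the OS, so do not advise on them. */
        bool long_run = (run_end - i) >= SEQ_RUN_MIN;

        for (size_t j = i; j < run_end; j++)
        {
            want_prefetch[j] = !long_run;
            if (!long_run)
                marked++;
        }

        i = run_end;
    }

    return marked;
}
```

The marked blocks are the ones that could then be handed to something
like posix_fadvise() with POSIX_FADV_WILLNEED.  Whether 4 is a sensible
threshold is a guess and would need measuring.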

> In general, my advice is not to focus only on HDDs with their single
> spindle.  A single SATA SSD scales up to 32 (31 on Linux) requests in
> parallel (without RAID or anything else).  The difference in
> throughput is extreme for this type of storage device.  While single
> spinning HDDs can only gain up to ~20% by NCQ, SATA SSDs can easily gain
> up to 700%.

I definitely agree with the idea that we should be looking at SSD-based
systems, but I don't know if anyone happens to have easy access to server
gear with SSDs.  I've got an SSD in my laptop, but that's not really the
same thing.

Thanks!
    Stephen