
Re: Huge Data sets, simple queries - Mailing list pgsql-performance

From	Jeffrey W. Baker
Subject	Re: Huge Data sets, simple queries
Date	February 1, 2006 04:25:16
Msg-id	1138782313.14732.1.camel@noodles
In response to	Re: Huge Data sets, simple queries ("Luke Lonergan" <llonergan@greenplum.com>)
Responses	Re: Huge Data sets, simple queries Re: Huge Data sets, simple queries
List	pgsql-performance


On Tue, 2006-01-31 at 21:53 -0800, Luke Lonergan wrote:
> Jeffrey,
>
> On 1/31/06 8:09 PM, "Jeffrey W. Baker" <jwbaker@acm.org> wrote:
> >> ... Prove it.
> > I think I've proved my point.  Software RAID1 read balancing provides
> > 0%, 300%, 100%, and 100% speedup on 1, 2, 4, and 8 threads,
> > respectively.  In the presence of random I/O, the results are even
> > better.
> > Anyone who thinks they have a single-threaded workload has not yet
> > encountered the autovacuum daemon.
>
> Good data - interesting case.  I presume from your results that you had to
> make the I/Os non-overlapping (the "skip" option to dd) in order to get the
> concurrent access to work.  Why the particular choice of offset - 3.2GB in
> this case?

No particular reason.  8k x 100000 is what the last guy used upthread.
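
For anyone repeating the test, here is a minimal sketch of how the non-overlapping reads could be laid out, assuming each reader gets its own 8k x 100000 region. The device path `/dev/md0` and the helper names are hypothetical and not taken from this thread; only the standard dd operands `if`, `of`, `bs`, `count`, and `skip` are used.

```python
BLOCK_SIZE = 8 * 1024        # 8k blocks, as in "8k x 100000"
BLOCKS_PER_READER = 100_000  # blocks read by each concurrent dd

def skip_offsets(readers):
    """Return (skip_in_blocks, start_byte) per reader so the regions never overlap."""
    return [
        (i * BLOCKS_PER_READER, i * BLOCKS_PER_READER * BLOCK_SIZE)
        for i in range(readers)
    ]

def dd_commands(device, readers):
    """Build one dd command line per reader; 'device' is a placeholder path."""
    return [
        f"dd if={device} of=/dev/null bs=8k count={BLOCKS_PER_READER} skip={skip}"
        for skip, _ in skip_offsets(readers)
    ]

if __name__ == "__main__":
    # /dev/md0 is an assumed name for the RAID1 array under test.
    for cmd in dd_commands("/dev/md0", 4):
        print(cmd)
```

Each region is 8192 x 100000 = 819,200,000 bytes, so consecutive readers start about 0.8GB apart.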
>
> So - the bandwidth doubles in specific circumstances under concurrent
> workloads - not relevant to "Huge Data sets, simple queries", but possibly
> helpful for certain kinds of OLTP applications.

Ah, but someday Pg will be able to concurrently read from two
datastreams to complete a single query.  And that day will be glorious
and fine, and you'll want as much disk concurrency as you can get your
hands on.

-jwb
