Re: getting the most of out multi-core systems for repeated complex SELECT statements - Mailing list pgsql-performance

From Andy Colson
Subject Re: getting the most of out multi-core systems for repeated complex SELECT statements
Date
Msg-id 4D4B7E64.4030203@squeakycode.net
In response to Re: getting the most of out multi-core systems for repeated complex SELECT statements  (Greg Smith <greg@2ndquadrant.com>)
Responses Re: getting the most of out multi-core systems for repeated complex SELECT statements  (Scott Marlowe <scott.marlowe@gmail.com>)
Re: getting the most of out multi-core systems for repeated complex SELECT statements  (Greg Smith <greg@2ndquadrant.com>)
List pgsql-performance
On 02/03/2011 10:00 PM, Greg Smith wrote:
> Andy Colson wrote:
>> CPUs won't get faster, but HDs and SSDs will. To have one database connection, which runs one query, run fast,
>> it's going to need multi-core support.
>
> My point was that situations where people need to run one query on one database connection that aren't in fact
> limited by disk I/O are far less common than people think. My troublesome database servers aren't ones with a
> single CPU at its max but wishing there were more workers, they're the ones that have >25% waiting for I/O. And
> even that crowd is still a subset, distinct from people who don't care about the speed of any one core, they need
> lots of connections to go at once.
>

Yes, I agree... for today.  If you gaze 5 years ahead... double the core count (but not the speed), double the I/O
rate.  What do you see?


>> My point is, there must be levels of threading, yes? If a backend has data to sort, has it collected, nothing
>> locked, what would it hurt to use multi-core sorting?
>
> Optimizer nodes don't run that way. The executor "pulls" rows out of the top of the node tree, which then pulls
> from its children, etc. If you just blindly ran off and executed every individual node to completion in parallel,
> that's not always going to be faster--could be a lot slower, if the original query never even needed to execute
> portions of the tree.
>
> When you start dealing with all of the types of nodes that are out there it gets very messy in a hurry.
> Decomposing the nodes of the query tree into steps that can be executed in parallel usefully is the hard problem
> hiding behind the simple idea of "use all the cores!"
>
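
To make the "pull" behaviour quoted above concrete, here is a rough sketch using Python generators. It is only an illustration, not how the PostgreSQL executor is written; the node names (seq_scan, filter_node, limit_node) and the row counter are made up for the example. It shows why running every node to completion could waste work: the limit at the top stops pulling after three rows, so the scan never touches most of its input.

```python
from itertools import islice


def seq_scan(rows, stats):
    """Hypothetical leaf node: hands out rows one at a time, only when asked."""
    for row in rows:
        stats["scanned"] += 1  # count how many rows were actually read
        yield row


def filter_node(child, predicate):
    """Hypothetical filter node: pulls from its child until a row matches."""
    for row in child:
        if predicate(row):
            yield row


def limit_node(child, n):
    """Hypothetical limit node: stops pulling from its child after n rows."""
    return islice(child, n)


def run_plan():
    stats = {"scanned": 0}
    scan = seq_scan(range(1_000_000), stats)
    plan = limit_node(filter_node(scan, lambda r: r % 2 == 0), 3)
    result = list(plan)  # the consumer "pulls" from the top of the tree
    return result, stats["scanned"]


if __name__ == "__main__":
    rows, scanned = run_plan()
    print(rows, scanned)
```

Only the first few of the million input rows are ever read, because nothing below the limit runs unless the node above asks for another row.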


What if... the nodes were run in separate threads, and interconnected via queues?  A node would not have to run to
completion either.  A queue could be set up with a maximum number of items.  When a node has added 5 out of 5 items
it would go to sleep.  Its parent node, on removing one of the items, could wake it up.
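
A minimal sketch of that idea, assuming Python's standard queue and threading modules stand in for whatever a real backend would use. The node functions, the DONE marker, and run_pipeline are hypothetical names for illustration. A queue.Queue created with maxsize=5 gives the behaviour described: put() blocks (the node sleeps) once 5 items are waiting, and a get() by the parent frees a slot and wakes it.

```python
import queue
import threading

DONE = object()  # hypothetical end-of-stream marker passed through the queues


def scan_node(out_q, rows):
    """Child node: produces rows, sleeping whenever its output queue is full."""
    for row in rows:
        out_q.put(row)  # blocks while the queue already holds max_items rows
    out_q.put(DONE)


def filter_node(in_q, out_q, predicate):
    """Parent node: each get() frees a slot, which can wake a blocked child."""
    while True:
        row = in_q.get()
        if row is DONE:
            out_q.put(DONE)
            return
        if predicate(row):
            out_q.put(row)


def run_pipeline(rows, predicate, max_items=5):
    q1 = queue.Queue(maxsize=max_items)  # scan -> filter
    q2 = queue.Queue(maxsize=max_items)  # filter -> consumer
    threads = [
        threading.Thread(target=scan_node, args=(q1, rows)),
        threading.Thread(target=filter_node, args=(q1, q2, predicate)),
    ]
    for t in threads:
        t.start()
    results = []
    while True:
        row = q2.get()
        if row is DONE:
            break
        results.append(row)
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_pipeline(range(100), lambda r: r % 10 == 0))
```

In CPython the global interpreter lock keeps these threads from running Python code truly in parallel, so this only shows the sleep and wake-up behaviour of bounded queues, not a speedup. It also does not handle a parent that stops early (a LIMIT, say), which is part of the messiness raised above.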

-Andy
