Home > mailing lists

Re: Horizontal scalability/sharding - Mailing list pgsql-hackers

From	Tomas Vondra
Subject	Re: Horizontal scalability/sharding
Date	September 1, 2015 21:04:28
Msg-id	55E5E8A2.5070500@2ndquadrant.com Whole thread Raw
In response to	Re: Horizontal scalability/sharding (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: Horizontal scalability/sharding (Robert Haas <robertmhaas@gmail.com>)
List	pgsql-hackers

Tree view

Hi,

On 09/01/2015 07:17 PM, Robert Haas wrote:
> On Tue, Sep 1, 2015 at 1:06 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> You're assuming that our primary bottleneck for writes is IO. It's
>> not at present for most users, and it certainly won't be in the
>> future. You need to move your thinking on systems resources into
>> the 21st century, instead of solving the resource problems from 15
>> years ago.
>
> Your experience doesn't match mine.  I find that it's frequently
> impossible to get the system to use all of the available CPU
> capacity, either because you're bottlenecked on locks or because you
> are bottlenecked on the  I/O subsystem, and with the locking
> improvements in newer versions, the former is becoming less and less
> common. Amit's recent work on scalability demonstrates this trend: he
> goes looking for lock bottlenecks, and finds problems that only occur
> at 128+ concurrent connections running full tilt.  The patches show
> limited benefit - a few percentage points - at lesser concurrency
> levels.  Either there are other locking bottlenecks that limit
> performance at lower client counts but which mysteriously disappear
> as concurrency increases, which I would find surprising, or the limit
> is somewhere else.  I haven't seen any convincing evidence that the
> I/O subsystem is the bottleneck, but I'm having a hard time figuring
> out what else it could be.

Memory bandwidth, for example. It's quite difficult to spot, because the 
intuition is that memory is fast, but thanks to improvements in storage 
(and stagnation in RAM bandwidth), this is becoming a significant issue.

Process-management overhead is another thing we tend to ignore, but once 
you get to many processes all willing to work at the same time, you need 
to account for that.

Of course, this applies differently to different sharding use cases. For 
example analytics workloads have serious issues with memory bandwidth, 
but not so much with process management overhead (because the number of 
connections is usually about number of cores). Use cases with many 
clients (in web-scale use cases) tends to run into both (all the 
processes also have to share all the caches, killing them).

I don't know if sharding can help solving (or at least improve) these 
issues. And if sharding in general can, I don't know if it still holds 
for FDW-based solution.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

pgsql-hackers by date:

From: Magnus Hagander
Date: 01 September 2015, 20:53:05
Subject: Re: Proposal: Implement failover on libpq connect level.

From: Robert Haas
Date: 01 September 2015, 21:07:25
Subject: Re: Proposal: Implement failover on libpq connect level.

Re: Horizontal scalability/sharding - Mailing list pgsql-hackers

Previous

Next