Thread: Partial index-based load balancing

Partial index-based load balancing

From: Fabio Ugo Venchiarutti
Greetings


I'm working for a startup and our core DB is growing rather fast.

Our target scale is large enough that we expect some of our core tables'
indexes to grow bigger than the memory on any single node over the next
couple of years (our current intended design involves conventional
stream-replication-based, write-on-one-read-from-many load balancing).

We don't fancy the idea of using inheritance-based partitioning, due to
the maintenance overhead and our reliance on validation
constraints/triggers.

My proposal is instead to create a number of partial indexes covering
predefined ranges of client IDs, then use a connection-level routing
mechanism that uses the range the relevant client's data belongs to in
order to address the right node and match the right partial index.

The idea is to have any given read-only node hold just one of the
partial indexes in its cache and never fetch index pages from its
secondary storage. Scaling would then just be a matter of increasing the
partitioning density.
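
For concreteness, this is roughly what I have in mind (the table, column
and range boundaries below are made up for illustration):

    -- One partial index per predefined client ID range. Every index exists
    -- on every streaming replica, but each read-only node only ever gets
    -- routed queries for "its" range, so only that index should stay hot.
    CREATE INDEX orders_clients_00_idx ON orders (client_id)
        WHERE client_id >= 0       AND client_id < 1000000;
    CREATE INDEX orders_clients_01_idx ON orders (client_id)
        WHERE client_id >= 1000000 AND client_id < 2000000;
    -- ...and so on; the connection router sends a session for client N to
    -- the node assigned to N's range.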


I assume I'm not the first to come up with this strategy, and that
there's already a name for it; if so, what is it?


Is it a valid setup or am I missing some key aspect of how index
partitioning is meant to work?


TIA


Best regards


Fabio Ugo Venchiarutti


Re: Partial index-based load balancing

From: Andy Colson
On 3/31/2015 1:58 AM, Fabio Ugo Venchiarutti wrote:
> [...]
>
> My proposal is instead to create a number of partial indexes covering
> predefined ranges of client IDs, then use a connection-level routing
> mechanism that uses the range the relevant client's data belongs to in
> order to address the right node and match the right partial index.
>
> The idea is to have any given read-only node hold just one of the
> partial indexes in its cache and never fetch index pages from its
> secondary storage. Scaling would then just be a matter of increasing
> the partitioning density.
>
> [...]

Have you timed it? It sounds like it should work OK, but just to play
devil's advocate, it sounds a little overcomplex.

Here's my thinking: have one index, have lots of read-only nodes, and
do the connection-level routing. I think the PG caching and OS caching
will be smart enough to cache the parts of the index that are hot on
each node. You are also less likely to confuse the planner.
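
To sketch what I mean (the table and index names are made up):

    -- One plain index, identical on every replica; routing still happens at
    -- the connection level, so each node mostly touches its own clients'
    -- slice of the index.
    CREATE INDEX orders_client_idx ON orders (client_id);
    -- Any query filtering on client_id can use this index, whereas a partial
    -- index is only considered when the planner can prove that the query's
    -- WHERE clause implies the index predicate.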

For example, I have one DB, with web hit logging, with a timestamp
index. It's upward of 25 GB, but I only ever query the last 24 hours to
get IP/session/hit counts. That part of the index/DB is cached very well
and responds very quickly. If I ever hit data that's many days old, it
slows down hitting the HD.

I don't know for sure which method would be faster, though. The only
real difference between the two methods is the number of indexes. I'd
bet there is no speed difference between them. PG will figure out what
data is hot and cache it.
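
If you do time it, EXPLAIN with the BUFFERS option will show how much
each variant actually reads from disk versus finds in shared buffers
(the table and column names are made up):

    -- "shared hit" = page found in PG's buffer cache,
    -- "read"       = page had to be fetched from the OS/disk
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT count(*) FROM orders WHERE client_id = 12345;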

-Andy


Re: Partial index-based load balancing

From: Fabio Ugo Venchiarutti

On 01/04/15 06:12, Andy Colson wrote:
> [...]
>
> Have you timed it? It sounds like it should work OK, but just to play
> devil's advocate, it sounds a little overcomplex.
>
> Here's my thinking: have one index, have lots of read-only nodes, and
> do the connection-level routing. I think the PG caching and OS caching
> will be smart enough to cache the parts of the index that are hot on
> each node. You are also less likely to confuse the planner.
>
> [...]
>
> I don't know for sure which method would be faster, though. The only
> real difference between the two methods is the number of indexes. I'd
> bet there is no speed difference between them. PG will figure out what
> data is hot and cache it.
>
> [...]


I had always assumed that an index, or at least the root level of a
B-tree one, had to be completely cached in order to guarantee fast
lookups, but your reasoning seems sound: after the first-level
resolution, only certain branches will be walked on any given slave.

I'll try both approaches and update you with my findings if you're
interested (it won't happen for at least a few weeks due to management
bottlenecks).
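
When I do, I'll probably also check how much of each index is actually
resident in shared_buffers, with something along these lines (this
assumes the pg_buffercache contrib extension; the index name is made up):

    CREATE EXTENSION IF NOT EXISTS pg_buffercache;

    -- Pages of a given index currently sitting in shared_buffers,
    -- assuming the default 8 kB block size.
    SELECT count(*)                                 AS buffered_pages,
           pg_size_pretty(count(*) * 8192)          AS buffered,
           pg_size_pretty(pg_relation_size('orders_clients_00_idx')) AS total
    FROM pg_buffercache b
    WHERE b.relfilenode = pg_relation_filenode('orders_clients_00_idx')
      AND b.reldatabase = (SELECT oid FROM pg_database
                           WHERE datname = current_database());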

Many thanks


F



Re: Partial index-based load balancing

From: Andy Colson
On 3/31/2015 3:28 PM, Fabio Ugo Venchiarutti wrote:
> [...]
>
> I had always assumed that an index, or at least the root level of a
> B-tree one, had to be completely cached in order to guarantee fast
> lookups, but your reasoning seems sound: after the first-level
> resolution, only certain branches will be walked on any given slave.
>
> I'll try both approaches and update you with my findings if you're
> interested (it won't happen for at least a few weeks due to management
> bottlenecks).
>
> [...]

Yes, interested. Everyone loves a good benchmark. And if your solution
(with multiple partial indexes, which I've never seen discussed on this
list before) is faster, then I propose we name it after you.  :-)

-Andy