Thread: larger shared buffers slows down cluster

larger shared buffers slows down cluster

From

Andrew Dunstan

Date:

22 August 2012, 23:48:40

This problem has been reported by a client.

Consider the following very small table test case:
   create table bar as select a,b,c,d,e from generate_series(1,2) a,
   generate_series(3,4) b, generate_series( 5,6) c,
   generate_series(7,8) d, generate_series(9,10) e;
   create index bar_a on bar(a);
   create index bar_b on bar(b);
   create index bar_c on bar(c);
   create index bar_d on bar(d);
   create index bar_e on bar(e);
   create unique index bar_abcde on bar(a,b,c,d,e);
 


Now running:
   cluster bar using bar_abcde;


appears to be very sensitive to the shared_buffers setting. On an Amazon
very large memory instance (64GB) running PostgreSQL 9.1.4, I observed the
following timings:

    Shared Buffers   Time
      48Gb           2058ms
       8Gb            372ms
       1gb             67ms


Is this expected behaviour? If so, is there a good explanation? I'm not 
sure what other operations might be affected this way.

cheers

andrew

Re: larger shared buffers slows down cluster

From

Tom Lane

Date:

23 August 2012, 00:00:59

Andrew Dunstan <andrew@dunslane.net> writes:
> Now running:
>     cluster bar using bar_abcde;
> appears to be very sensitive to the shared buffers setting. In an amazon 
> very large memory instance (64GB) and PostgreSQL 9.1.4, I observed the 
> following timings:

>      Shared Buffers   Time
>        48Gb           2058ms
>         8Gb            372ms
>         1gb             67ms

DropRelFileNodeBuffers, perhaps?  See recent commits to reduce the cost
of that for large shared_buffers, notably
e8d029a30b5a5fb74b848a8697b1dfa3f66d9697 and
ece01aae479227d9836294b287d872c5a6146a11.

        regards, tom lane

Re: larger shared buffers slows down cluster

From

Jeff Janes

Date:

23 August 2012, 00:19:32

On Wed, Aug 22, 2012 at 1:48 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>
> This problem has been reported by a client.
>
> Consider the following very small table test case:
>
>    create table bar as select a,b,c,d,e from generate_series(1,2) a,
>    generate_series(3,4) b, generate_series( 5,6) c,
>    generate_series(7,8) d, generate_series(9,10) e;
>    create index bar_a on bar(a);
>    create index bar_b on bar(b);
>    create index bar_c on bar(c);
>    create index bar_d on bar(d);
>    create index bar_e on bar(e);
>    create unique index bar_abcde on bar(a,b,c,d,e);
>
>
> Now running:
>
>    cluster bar using bar_abcde;
>
>
> appears to be very sensitive to the shared buffers setting. In an amazon
> very large memory instance (64GB) and PostgreSQL 9.1.4, I observed the
> following timings:
>
>
>     Shared Buffers   Time
>       48Gb           2058ms
>        8Gb            372ms
>        1gb             67ms
>
>
> Is this expected behaviour?

Yeah.  Clustering the table means that all the indexes and the old
version of the table all get dropped, and each time something is
dropped the entire buffer pool is scoured to remove the old buffers.

In my hands, this is about 10 times faster in 9.2 than in 9.1.4 at 8GB,
because the scouring is now done once per object, not once per fork.
Also, the check is done without an initial spinlock.
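
As a rough back-of-the-envelope illustration of why the cost tracks the shared buffers setting: if every drop walks every buffer header, the work is roughly (number of buffers) x (number of scans). The sketch below is only a model, not PostgreSQL code. It assumes the default 8kB block size, 7 dropped relations (the old table plus its six indexes) and 4 forks per relation; those counts are assumptions for illustration, not figures measured in this thread.

```python
BLOCK_SIZE = 8 * 1024  # assumption: default PostgreSQL block size (8kB)
GB = 1024 ** 3


def buffer_count(shared_buffers_bytes, block_size=BLOCK_SIZE):
    """Number of buffer headers that one full scan of the pool visits."""
    return shared_buffers_bytes // block_size


def headers_visited(shared_buffers_bytes, relations, scans_per_relation):
    """Total headers examined when every scan walks the whole pool."""
    return buffer_count(shared_buffers_bytes) * relations * scans_per_relation


RELATIONS = 7  # assumption: old heap + 6 indexes from the test case
FORKS = 4      # assumption: one full scan per fork in the per-fork model

for size_gb in (1, 8, 48):
    size = size_gb * GB
    per_fork = headers_visited(size, RELATIONS, FORKS)
    per_object = headers_visited(size, RELATIONS, 1)
    print(f"{size_gb:>2}GB: {buffer_count(size):>8} buffers, "
          f"{per_fork:>10} visits (per fork), "
          f"{per_object:>9} visits (per object)")
```

In this model the number of visits grows linearly with the pool size regardless of how small the table is, which matches the direction of the timings reported above, though it says nothing about the constant cost of each visit.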

It perhaps could be improved further by only scouring the pool once,
at the end of the transaction, with a hash of all objects to be
dropped.
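
To make the idea concrete, here is a toy sketch of the difference between scanning the pool once per dropped object and scanning it once against a hash of everything to be dropped. The `Buffer` tuple, the relation names and the pool layout are hypothetical stand-ins; this is not how the PostgreSQL buffer manager is actually structured, only a model of the scan counts.

```python
from collections import namedtuple

# Hypothetical, simplified stand-in for a buffer header: which relation the
# cached page belongs to, and whether the slot currently holds a valid page.
Buffer = namedtuple("Buffer", ["rel", "valid"])


def make_pool(n_buffers, n_rels):
    """Build a pool whose pages are spread round-robin over n_rels relations."""
    return [Buffer(rel=f"rel{i % n_rels}", valid=True) for i in range(n_buffers)]


def drop_one_at_a_time(pool, dropped_rels):
    """Scan the whole pool once for each dropped relation."""
    visits = 0
    for rel in dropped_rels:
        for i, buf in enumerate(pool):
            visits += 1
            if buf.valid and buf.rel == rel:
                pool[i] = Buffer(rel=None, valid=False)
    return visits


def drop_in_one_pass(pool, dropped_rels):
    """Scan the pool once, testing each buffer against a hash set."""
    doomed = set(dropped_rels)
    visits = 0
    for i, buf in enumerate(pool):
        visits += 1
        if buf.valid and buf.rel in doomed:
            pool[i] = Buffer(rel=None, valid=False)
    return visits


dropped = [f"rel{i}" for i in range(7)]  # assumption: 7 relations to drop
pool_a = make_pool(10_000, 20)
pool_b = make_pool(10_000, 20)
visits_a = drop_one_at_a_time(pool_a, dropped)
visits_b = drop_in_one_pass(pool_b, dropped)
print(visits_a, visits_b)
```

Both functions leave the pool in the same state; the single pass simply trades the repeated scans for one set lookup per buffer.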

> If so, is there a good explanation? I'm not sure
> what other operations might be affected this way.

drop, truncate, reindex, vacuum full.  What else causes a table to be
re-written?

Cheers,

Jeff

Re: larger shared buffers slows down cluster

From

Andrew Dunstan

Date:

23 August 2012, 03:50:20

On 08/22/2012 05:19 PM, Jeff Janes wrote:
> On Wed, Aug 22, 2012 at 1:48 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> This problem has been reported by a client.
>>
>> Consider the following very small table test case:
>>
>>     create table bar as select a,b,c,d,e from generate_series(1,2) a,
>>     generate_series(3,4) b, generate_series( 5,6) c,
>>     generate_series(7,8) d, generate_series(9,10) e;
>>     create index bar_a on bar(a);
>>     create index bar_b on bar(b);
>>     create index bar_c on bar(c);
>>     create index bar_d on bar(d);
>>     create index bar_e on bar(e);
>>     create unique index bar_abcde on bar(a,b,c,d,e);
>>
>>
>> Now running:
>>
>>     cluster bar using bar_abcde;
>>
>>
>> appears to be very sensitive to the shared buffers setting. In an amazon
>> very large memory instance (64GB) and PostgreSQL 9.1.4, I observed the
>> following timings:
>>
>>
>>      Shared Buffers   Time
>>        48Gb           2058ms
>>         8Gb            372ms
>>         1gb             67ms
>>
>>
>> Is this expected behaviour?
> Yeah.  Clustering the table means that all the indexes and the old
> version of the table all get dropped, and each time something is
> dropped the entire buffer pool is scoured to remove the old buffers.
>
> In my hands, this is about 10 times better in 9.2 than 9.1.4, at 8GB.
> Because now the scouring is done once per object, not once per fork.
> Also, the check is done without an initial spinlock.
>
> It perhaps could be improved further by only scouring the pool once,
> at the end of the transaction, with a hash of all objects to be
> dropped.
>
>> If so, is there a good explanation? I'm not sure
>> what other operations might be affected this way.
> drop, truncate, reindex, vacuum full.  What else causes a table to be
> re-written?


OK, thanks for the info.

cheers

andrew

Re: larger shared buffers slows down cluster

From

Andrew Dunstan

Date:

23 August 2012, 04:44:25

On 08/22/2012 05:19 PM, Jeff Janes wrote:


>>
>>      Shared Buffers   Time
>>        48Gb           2058ms
>>         8Gb            372ms
>>         1gb             67ms
>>
>>
>> Is this expected behaviour?
> Yeah.  Clustering the table means that all the indexes and the old
> version of the table all get dropped, and each time something is
> dropped the entire buffer pool is scoured to remove the old buffers.
>
> In my hands, this is about 10 times better in 9.2 than 9.1.4, at 8GB.
> Because now the scouring is done once per object, not once per fork.
> Also, the check is done without an initial spinlock.
>
> It perhaps could be improved further by only scouring the pool once,
> at the end of the transaction, with a hash of all objects to be
> dropped.
>
>


FYI, I have rerun the tests on Amazon with 9.2 beta. The improvement I
saw ranged from a factor of roughly 2 (with 1GB of shared memory) to 6
(with 48GB).

cheers

andrew