Thread: larger shared buffers slows down cluster
This problem has been reported by a client.

Consider the following very small table test case:

    create table bar as select a,b,c,d,e from generate_series(1,2) a,
        generate_series(3,4) b, generate_series(5,6) c,
        generate_series(7,8) d, generate_series(9,10) e;
    create index bar_a on bar(a);
    create index bar_b on bar(b);
    create index bar_c on bar(c);
    create index bar_d on bar(d);
    create index bar_e on bar(e);
    create unique index bar_abcde on bar(a,b,c,d,e);

Now running:

    cluster bar using bar_abcde;

appears to be very sensitive to the shared buffers setting. In an Amazon
very large memory instance (64GB) and PostgreSQL 9.1.4, I observed the
following timings:

    Shared Buffers    Time
    48GB              2058ms
    8GB               372ms
    1GB               67ms

Is this expected behaviour? If so, is there a good explanation? I'm not sure
what other operations might be affected this way.

cheers

andrew
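As a rough check on how the reported timings scale, the sketch below converts each shared buffers setting into a buffer count and a per-GB cost. It assumes the default 8kB block size, which the post does not state; the function names are made up for illustration.

```python
# Assumption: PostgreSQL's default block size of 8kB; the post does not say
# whether the cluster in question was built with a different BLCKSZ.
BLOCK_SIZE = 8192

# shared buffers in GB -> CLUSTER time in ms, copied from the table above
TIMINGS_MS = {48: 2058, 8: 372, 1: 67}


def buffers_for(gb: int) -> int:
    """Number of block-sized buffers in a pool of the given size."""
    return gb * 1024**3 // BLOCK_SIZE


def ms_per_gb(gb: int, ms: float) -> float:
    """Elapsed time divided by pool size."""
    return ms / gb


if __name__ == "__main__":
    for gb, ms in TIMINGS_MS.items():
        print(f"{gb:>2}GB  {buffers_for(gb):>9} buffers  {ms_per_gb(gb, ms):.1f} ms/GB")
```

The per-GB figures for 8GB and 48GB come out close to each other, which would be consistent with a cost that grows roughly in proportion to the size of the pool.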
Andrew Dunstan <andrew@dunslane.net> writes:
> Now running:
>     cluster bar using bar_abcde;
> appears to be very sensitive to the shared buffers setting. In an Amazon
> very large memory instance (64GB) and PostgreSQL 9.1.4, I observed the
> following timings:
>     Shared Buffers    Time
>     48GB              2058ms
>     8GB               372ms
>     1GB               67ms

DropRelFileNodeBuffers, perhaps? See recent commits to reduce the cost
of that for large shared_buffers, notably
e8d029a30b5a5fb74b848a8697b1dfa3f66d9697 and
ece01aae479227d9836294b287d872c5a6146a11

regards, tom lane
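To illustrate why a drop could cost time proportional to shared_buffers even for a tiny table, here is a toy model of a linear scan over buffer headers. It is only a sketch and not the PostgreSQL implementation: `BufferPool`, `drop_fork_buffers` and the `(relation, fork)` tag layout are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical tag: (relation id, fork number) for the page a buffer holds,
# or None when the buffer is free.
Tag = Optional[Tuple[int, int]]


@dataclass
class BufferPool:
    tags: List[Tag]
    headers_examined: int = 0  # running count of buffer headers looked at

    def drop_fork_buffers(self, rel: int, fork: int) -> int:
        """Invalidate every buffer holding a page of (rel, fork).

        Every header is examined, so the work is proportional to the pool
        size rather than to the size of the relation being dropped.
        """
        dropped = 0
        for i, tag in enumerate(self.tags):
            self.headers_examined += 1
            if tag == (rel, fork):
                self.tags[i] = None
                dropped += 1
        return dropped


if __name__ == "__main__":
    pool = BufferPool(tags=[None] * 1000)
    pool.tags[3] = (1, 0)
    pool.tags[500] = (1, 0)
    print(pool.drop_fork_buffers(1, 0), pool.headers_examined)
```

In this model, dropping two cached pages from a 1000-buffer pool still examines all 1000 headers.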
On Wed, Aug 22, 2012 at 1:48 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>
> This problem has been reported by a client.
>
> Consider the following very small table test case:
>
>     create table bar as select a,b,c,d,e from generate_series(1,2) a,
>         generate_series(3,4) b, generate_series(5,6) c,
>         generate_series(7,8) d, generate_series(9,10) e;
>     create index bar_a on bar(a);
>     create index bar_b on bar(b);
>     create index bar_c on bar(c);
>     create index bar_d on bar(d);
>     create index bar_e on bar(e);
>     create unique index bar_abcde on bar(a,b,c,d,e);
>
> Now running:
>
>     cluster bar using bar_abcde;
>
> appears to be very sensitive to the shared buffers setting. In an Amazon
> very large memory instance (64GB) and PostgreSQL 9.1.4, I observed the
> following timings:
>
>     Shared Buffers    Time
>     48GB              2058ms
>     8GB               372ms
>     1GB               67ms
>
> Is this expected behaviour?

Yeah. Clustering the table means that all the indexes and the old
version of the table all get dropped, and each time something is
dropped the entire buffer pool is scoured to remove the old buffers.

In my hands, this is about 10 times better in 9.2 than 9.1.4, at 8GB.
Because now the scouring is done once per object, not once per fork.
Also, the check is done without an initial spinlock.

It perhaps could be improved further by only scouring the pool once,
at the end of the transaction, with a hash of all objects to be
dropped.

> If so, is there a good explanation? I'm not sure
> what other operations might be affected this way.

drop, truncate, reindex, vacuum full. What else causes a table to be
re-written?

Cheers,

Jeff
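The three scanning strategies described above (once per fork, once per object, and once per transaction with a hash of doomed objects) can be compared in a toy model that counts how many buffer headers each one examines. This is a sketch under stated assumptions, not PostgreSQL code: the function names are hypothetical, and the figure of three forks per object is an assumption made only for the demo.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical tag: (relation id, fork number), or None for a free buffer.
Tag = Optional[Tuple[int, int]]


def make_pool(n_buffers: int, rels: Sequence[int], forks: Sequence[int]) -> List[Tag]:
    """Toy pool: mostly free buffers plus one cached page per (relation, fork)."""
    tags: List[Tag] = [None] * n_buffers
    slot = 0
    for rel in rels:
        for fork in forks:
            tags[slot] = (rel, fork)
            slot += 1
    return tags


def drop_per_fork(tags: List[Tag], rels: Sequence[int], forks: Sequence[int]) -> int:
    """One full pool scan for every (relation, fork) pair."""
    examined = 0
    for rel in rels:
        for fork in forks:
            for i, tag in enumerate(tags):
                examined += 1
                if tag == (rel, fork):
                    tags[i] = None
    return examined


def drop_per_object(tags: List[Tag], rels: Sequence[int]) -> int:
    """One full pool scan per relation, covering all of its forks at once."""
    examined = 0
    for rel in rels:
        for i, tag in enumerate(tags):
            examined += 1
            if tag is not None and tag[0] == rel:
                tags[i] = None
    return examined


def drop_batched(tags: List[Tag], rels: Sequence[int]) -> int:
    """A single pool scan, checking each tag against a hash set of doomed relations."""
    doomed = set(rels)
    examined = 0
    for i, tag in enumerate(tags):
        examined += 1
        if tag is not None and tag[0] in doomed:
            tags[i] = None
    return examined


if __name__ == "__main__":
    rels = list(range(7))  # the old table plus its six indexes, as in the test case
    forks = [0, 1, 2]      # assumption: three forks per object
    print(drop_per_fork(make_pool(1000, rels, forks), rels, forks))
    print(drop_per_object(make_pool(1000, rels, forks), rels))
    print(drop_batched(make_pool(1000, rels, forks), rels))
```

All three variants leave the pool in the same state; only the number of headers examined differs. The model ignores locking, so it says nothing about the spinlock change mentioned above.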
On 08/22/2012 05:19 PM, Jeff Janes wrote:
> On Wed, Aug 22, 2012 at 1:48 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> This problem has been reported by a client.
>>
>> Consider the following very small table test case:
>>
>>     create table bar as select a,b,c,d,e from generate_series(1,2) a,
>>         generate_series(3,4) b, generate_series(5,6) c,
>>         generate_series(7,8) d, generate_series(9,10) e;
>>     create index bar_a on bar(a);
>>     create index bar_b on bar(b);
>>     create index bar_c on bar(c);
>>     create index bar_d on bar(d);
>>     create index bar_e on bar(e);
>>     create unique index bar_abcde on bar(a,b,c,d,e);
>>
>> Now running:
>>
>>     cluster bar using bar_abcde;
>>
>> appears to be very sensitive to the shared buffers setting. In an Amazon
>> very large memory instance (64GB) and PostgreSQL 9.1.4, I observed the
>> following timings:
>>
>>     Shared Buffers    Time
>>     48GB              2058ms
>>     8GB               372ms
>>     1GB               67ms
>>
>> Is this expected behaviour?
> Yeah. Clustering the table means that all the indexes and the old
> version of the table all get dropped, and each time something is
> dropped the entire buffer pool is scoured to remove the old buffers.
>
> In my hands, this is about 10 times better in 9.2 than 9.1.4, at 8GB.
> Because now the scouring is done once per object, not once per fork.
> Also, the check is done without an initial spinlock.
>
> It perhaps could be improved further by only scouring the pool once,
> at the end of the transaction, with a hash of all objects to be
> dropped.
>
>> If so, is there a good explanation? I'm not sure
>> what other operations might be affected this way.
> drop, truncate, reindex, vacuum full. What else causes a table to be
> re-written?

OK, thanks for the info.

cheers

andrew
On 08/22/2012 05:19 PM, Jeff Janes wrote:
>>
>>     Shared Buffers    Time
>>     48GB              2058ms
>>     8GB               372ms
>>     1GB               67ms
>>
>> Is this expected behaviour?
> Yeah. Clustering the table means that all the indexes and the old
> version of the table all get dropped, and each time something is
> dropped the entire buffer pool is scoured to remove the old buffers.
>
> In my hands, this is about 10 times better in 9.2 than 9.1.4, at 8GB.
> Because now the scouring is done once per object, not once per fork.
> Also, the check is done without an initial spinlock.
>
> It perhaps could be improved further by only scouring the pool once,
> at the end of the transaction, with a hash of all objects to be
> dropped.

FYI, I have rerun the tests on Amazon with 9.2 BETA - the improvement I
saw ranged from a factor of roughly 2 (with 1GB of shared memory) to 6
(with 48GB).

cheers

andrew