Re: GIN pending list pages not recycled promptly (was Re: GIN improvements part 1: additional information) - Mailing list pgsql-hackers
From | Amit Langote |
---|---|
Subject | Re: GIN pending list pages not recycled promptly (was Re: GIN improvements part 1: additional information) |
Date | |
Msg-id | CA+HiwqFm7D2Vrpw4byLERS4306Btd4E6v6yq_MuKyiV1OxZs1g@mail.gmail.com Whole thread Raw |
In response to | Re: GIN pending list pages not recycled promptly (was Re: GIN improvements part 1: additional information) (Amit Langote <amitlangote09@gmail.com>) |
List | pgsql-hackers |
On Thu, Jun 19, 2014 at 2:09 PM, Amit Langote <amitlangote09@gmail.com> wrote: > On Wed, Jan 22, 2014 at 9:12 PM, Heikki Linnakangas > <hlinnakangas@vmware.com> wrote: >> >> I think we need to add the deleted pages to the FSM more aggressively. >> >> I tried simply adding calls to RecordFreeIndexPage, after the list pages >> have been marked as deleted, but unfortunately that didn't help. The problem >> is that the FSM is organized into a three-level tree, and >> RecordFreeIndexPage only updates the bottom level. The upper levels are not >> updated until the FSM is vacuumed, so the pages are still not visible to >> GetFreeIndexPage calls until next vacuum. The simplest fix would be to add a >> call to IndexFreeSpaceMapVacuum after flushing the pending list, per >> attached patch. I'm slightly worried about the performance impact of the >> IndexFreeSpaceMapVacuum() call. It scans the whole FSM of the index, which >> isn't exactly free. So perhaps we should teach RecordFreeIndexPage to update >> the upper levels of the FSM in a retail-fashion instead. >> > > I wonder if you pursued this further? > > You recently added a number of TODO items related to GIN index; is it > worth adding this to the list? > In fact, I forgot to mention that I tried your patch and it seems to be working for the particular example I am working with using pg_bigm (bigram) (indexed data not so realistic maybe). postgres=# CREATE TABLE test(contents text); CREATE TABLE postgres=# create or replace function rnd_str(length integer) returns text as $$ declare chars text[] := '{0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y,Z,a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z, あ, い, う, え, お, か, き, く, け, こ, さ, し, す, せ, そ, た, ち, つ, て, と, な, に, ぬ, ね, の, は, ひ, ふ, へ, ほ, ま, み, む, め, も, や, ゆ, よ, ら, り, る, れ, ろ, わ, ゐ, ゑ, を}'; result text := ''; i integer := 0; arr_len integer; begin chars := array_append(chars, ' '); arr_len := array_length(chars, 1); if length < 0 then raise exception 'Given lengthcannot be less than 0'; end if; for i in 1..length loop result := result || chars[1+random()*(arr_len - 1)]; endloop; return result; end; $$ language plpgsql; CREATE FUNCTION -- HEAD postgres=# INSERT INTO test SELECT rnd_str((random() * 500)::int) FROM generate_series(1, 100000); INSERT 0 100000 Time: 76296.641 ms postgres=# SELECT pg_size_pretty(pg_table_size('test'));pg_size_pretty ----------------49 MB (1 row) postgres=# CREATE INDEX test_bigm_idx ON test USING GIN (contents gin_bigm_ops); CREATE INDEX Time: 50517.912 ms postgres=# SELECT pg_size_pretty(pg_relation_size('test_bigm_idx'));pg_size_pretty ----------------118 MB (1 row) postgres=# TRUNCATE test; TRUNCATE TABLE postgres=# INSERT INTO test SELECT rnd_str((random() * 500)::int) FROM generate_series(1, 100000); INSERT 0 100000 Time: 233369.366 ms postgres=# SELECT pg_size_pretty(pg_relation_size('test_bigm_idx'));pg_size_pretty ----------------747 MB (1 row) -- Whereas, patched (gin-recycle-deleted-list-pages-1.patch) HEAD postgres=# INSERT INTO test SELECT rnd_str((random() * 500)::int) FROM generate_series(1, 100000); INSERT 0 100000 Time: 32808.779 ms postgres=# CREATE INDEX test_bigm_idx ON test USING GIN (contents gin_bigm_ops); CREATE INDEX Time: 24490.945 ms postgres=# SELECT pg_size_pretty(pg_relation_size('test_bigm_idx'));pg_size_pretty ----------------118 MB (1 row) postgres=# TRUNCATE test; TRUNCATE TABLE postgres=# INSERT INTO test SELECT rnd_str((random() * 500)::int) FROM generate_series(1, 100000); INSERT 0 100000 Time: 153878.163 ms postgres=# SELECT pg_size_pretty(pg_relation_size('test_bigm_idx'));pg_size_pretty ----------------119 MB (1 row) That sure looks good. -- Amit
pgsql-hackers by date: