Re: Speedup of relation deletes during recovery - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Speedup of relation deletes during recovery
Date
Msg-id 20180419022802.GH957@paquier.xyz
Whole thread Raw
In response to Re: Speedup of relation deletes during recovery  (Fujii Masao <masao.fujii@gmail.com>)
List pgsql-hackers
On Thu, Apr 19, 2018 at 01:52:26AM +0900, Fujii Masao wrote:
> No. But after my colleague truncated more than one hundred tables on
> the server with shared_buffers = 300GB, the recovery could not finish
> even after 10 minutes since the startup of the recovery. So I had to
> shutdown the server immediately, set shared_buffers to very small
> temporarily and start the server to cause the recovery to finish soon.

Hm, OK.  Here are simple functions to create and drop many relations in
a single transaction:
create or replace function create_tables(numtabs int)
returns void as $$
declare query_string text;
begin
  for i in 1..numtabs loop
    query_string := 'create table tab_' || i::text || ' (a int);';
    execute query_string;
  end loop;
end;
$$ language plpgsql;

create or replace function drop_tables(numtabs int)
returns void as $$
declare query_string text;
begin
  for i in 1..numtabs loop
    query_string := 'drop table tab_' || i::text;
    execute query_string;
  end loop;
end;
$$ language plpgsql;

With a server running 8GB of shared buffers (you likely need to increase
max_locks_per_transaction) and 10k relations created and dropped, it
took 50 seconds to finish redo of roughly 4 segments:
2018-04-19 11:17:15 JST [7472]: [14-1] db=,user=,app=,client= LOG:  redo
starts at 0/15DF2E8
2018-04-19 11:18:05 JST [7472]: [15-1] db=,user=,app=,client= LOG:
invalid record length at 0/4E46160: wanted 24, got 0
2018-04-19 11:18:05 JST [7472]: [16-1] db=,user=,app=,client= LOG:  redo
done at 0/4A7C6E

Then with your patch it took... Barely 1 second.
2018-04-19 11:20:33 JST [11690]: [14-1] db=,user=,app=,client= LOG:
redo starts at 0/15DF2E8
2018-04-19 11:20:34 JST [11690]: [15-1] db=,user=,app=,client= LOG:
invalid record length at 0/4E299B0: wanted 24, got 0
2018-04-19 11:20:34 JST [11690]: [16-1] db=,user=,app=,client= LOG:
redo done at 0/4E29978

Looking at profiles with perf I can also see that 95% of the time is
spent in DropRelFileNodesAllBuffers which is where the startup process
spends most of its CPU.  So HEAD is booh.  And your patch is excellent
in this context.
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Adding an LWLockHeldByMe()-like function that reports if anybuffer content lock is held
Next
From: Masahiko Sawada
Date:
Subject: Re: VM map freeze corruption