Re: Recovery performance of standby for multiple concurrent truncates on large tables - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: Recovery performance of standby for multiple concurrent truncates on large tables
Date	July 10, 2018 10:14:45
Msg-id	20180710071445.zjeyoqeuzvxw7azd@alap3.anarazel.de
In response to	Recovery performance of standby for multiple concurrent truncates on large tables ("Jamison, Kirk" <k.jamison@jp.fujitsu.com>)
Responses	RE: Recovery performance of standby for multiple concurrent truncates on large tables
List	pgsql-hackers

Hi,

On 2018-07-10 07:05:12 +0000, Jamison, Kirk wrote:
> Hello hackers,
> 
> Recently, the problem of improving the performance of multiple drop/truncate tables in a single transaction with large
> shared_buffers (as shown below) was solved by commit b416691.
 
>               BEGIN;
>               truncate tbl001;
>               ...
>               truncate tbl050;
>               COMMIT;
> 
> However, we have a customer that needs to execute multiple concurrent TRUNCATEs (40~50) on different large tables (as
> shown below) in a shorter amount of time. This one is not covered by the previous commit's improvement.
 
>               BEGIN;
>               truncate tbl001;
>               COMMIT;
>               ...
>               BEGIN;
>               truncate tbl050;
>               COMMIT;
> 
> [Problem]
> Currently, when the standby recovers the WAL of TRUNCATE/DROP TABLE, it leads to separate scans of the whole shared
> buffer in sequence to check whether or not the table to be deleted is cached in the shared buffer. Moreover, if the size
> of shared_buffers is large (e.g. 300GB) and the primary server fails during the replay, it would take a long while for
> the standby to complete recovery.
 

If you do so sequentially on the primary, I'm not clear as to why you
think the issue is bigger in recovery?


> [Idea]
> In the current implementation, the replay of each TRUNCATE/DROP TABLE scans the whole shared buffer.
> One approach (though the idea is not really developed yet) is to improve the recovery by delaying the shared buffer scan
> and invalidation (DropRelFileNodeBuffers) and to put it after the next checkpoint (after failover completion). The
> replay of TRUNCATE/DROP TABLE would just make the checkpointer process remember which relations should be invalidated in the
> shared buffer during the subsequent checkpoint. The checkpointer then scans the shared buffer only once to invalidate the
> buffers of relations that were dropped or truncated.
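
To make the cost argument in the quoted idea concrete, here is a minimal toy model in Python. It is not PostgreSQL code: `Buffer`, `make_pool`, `drop_rel_buffers` and `drop_many_rel_buffers` are hypothetical names invented for this sketch, and the buffer pool is just a list. It only illustrates why N separate drops cost N full passes over the pool, while a batched invalidation costs one.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Buffer:
    rel: Optional[str] = None  # relation cached in this slot; None means the slot is free
    block: int = 0


def make_pool(n_buffers: int, n_rels: int) -> List[Buffer]:
    """Build a toy pool whose slots are spread evenly over n_rels relations."""
    return [Buffer(rel=f"tbl{i % n_rels + 1:03d}", block=i) for i in range(n_buffers)]


def drop_rel_buffers(pool: List[Buffer], rel: str) -> int:
    """Invalidate one relation: a full pass over the pool. Returns slots visited."""
    visited = 0
    for buf in pool:
        visited += 1
        if buf.rel == rel:
            buf.rel = None
    return visited


def drop_many_rel_buffers(pool: List[Buffer], rels: Set[str]) -> int:
    """Invalidate a batch of relations in a single pass. Returns slots visited."""
    visited = 0
    for buf in pool:
        visited += 1
        if buf.rel in rels:
            buf.rel = None
    return visited
```

With 50 relations, calling `drop_rel_buffers` once per relation visits 50 times as many slots as one call to `drop_many_rel_buffers` with all 50 names. The sketch says nothing about correctness between the drop and the deferred scan.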
 

I think you'd run into a lot of very hairy details with this
approach. Consider what happens if client processes need fresh buffers
and need to write out a victim buffer. You'll need to know that the
relevant buffer is actually invalid. Thus the knowledge about the
"delayed" drops would need to be in shared buffers and scanned on every
dirty buffer writeout.
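
A minimal sketch of that concern, again as a toy Python model rather than PostgreSQL code. `PendingDrops` and `write_out_victim` are hypothetical names, and the `written` list stands in for storage. The point it illustrates is that once invalidation is deferred, every dirty victim writeout has to consult the shared set of delayed drops before touching storage.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class Buffer:
    rel: str
    block: int
    dirty: bool = True


class PendingDrops:
    """Hypothetical shared record of relations whose invalidation was deferred."""

    def __init__(self) -> None:
        self._rels: Set[str] = set()
        self.lookups = 0  # counts how often writeouts had to consult the set

    def add(self, rel: str) -> None:
        self._rels.add(rel)

    def contains(self, rel: str) -> bool:
        self.lookups += 1
        return rel in self._rels


def write_out_victim(buf: Buffer, pending: PendingDrops,
                     written: List[Tuple[str, int]]) -> bool:
    """Evict a victim buffer; return True only if it was written to storage."""
    if not buf.dirty:
        return False
    if pending.contains(buf.rel):
        # The relation was dropped/truncated: the page is stale and must not be written.
        buf.dirty = False
        return False
    written.append((buf.rel, buf.block))
    buf.dirty = False
    return True
```

Note that `lookups` grows with every dirty eviction, whether or not any drop is pending for that relation. That per-writeout check is the overhead described above.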


> However, this is still a rough idea, so I am not sure if it’s feasible. I would like to know if the community has
> advice or other alternative solutions on how to work around this.
 
> Any insights, advice, feedback?

I personally think we should rather just work towards an ordered buffer
mapping implementation.
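
One way to read "ordered buffer mapping", sketched as a toy in Python: if the mapping is ordered by (relation, block), all buffers of one relation are adjacent, so a drop can locate them by range lookup instead of scanning every buffer. `OrderedBufferMap` is a hypothetical name, integer relation ids are an assumption of the sketch, and the sorted list is for illustration only (a real implementation would need a concurrent ordered structure such as a tree).

```python
import bisect
from typing import Dict, List, Tuple

Key = Tuple[int, int]  # (relation id, block number); both assumed to be non-negative ints


class OrderedBufferMap:
    """Toy mapping from (rel_id, block_no) to buffer slot, kept in key order."""

    def __init__(self) -> None:
        self._keys: List[Key] = []      # sorted list of keys
        self._slots: Dict[Key, int] = {}  # key -> buffer slot index

    def __len__(self) -> int:
        return len(self._keys)

    def insert(self, rel_id: int, block_no: int, slot: int) -> None:
        key = (rel_id, block_no)
        if key not in self._slots:
            bisect.insort(self._keys, key)
        self._slots[key] = slot

    def drop_relation(self, rel_id: int) -> List[int]:
        """Remove every entry of rel_id and return the freed slots, in block order."""
        lo = bisect.bisect_left(self._keys, (rel_id, 0))
        hi = bisect.bisect_left(self._keys, (rel_id + 1, 0))
        dropped = [self._slots.pop(k) for k in self._keys[lo:hi]]
        del self._keys[lo:hi]
        return dropped
```

Here the work done by `drop_relation` depends on how many of that relation's blocks are cached, plus the lookup, rather than on the total size of the pool.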

Greetings,

Andres Freund
