Recovery performance of DROP DATABASE with many tablespaces

From: Fujii Masao
Hi,

My colleague ran into a problem where WAL replay took a long time
on a standby with large shared_buffers after he dropped a database
that used many tablespaces. As far as I can tell from the code, this
happens because DROP DATABASE generates as many XLOG_DBASE_DROP WAL
records as the number of tablespaces used by the database being
dropped, and WAL replay of each XLOG_DBASE_DROP record causes a full
scan of shared_buffers. That is, during recovery, DROP DATABASE scans
shared_buffers as many times as there are tablespaces.

Since the first scan, caused by the first XLOG_DBASE_DROP record,
invalidates all the pages in shared_buffers that belong to the database
being dropped, the subsequent scans triggered by the later records are
basically useless. So I'd like to change the code to avoid those
unnecessary scans and reduce the recovery time of DROP DATABASE.

Generally the recovery performance of DROP DATABASE is not critical
for most users. But unfortunately my colleague's project may sometimes
need to drop a database that uses multiple tablespaces, for various
reasons. So, if the fix is not too complicated, I think it's worth
applying.

The straightforward approach to avoiding such unnecessary scans is to
change DROP DATABASE so that it generates only one XLOG_DBASE_DROP
record, registering the information for all the tablespaces in it.
WAL replay of that single XLOG_DBASE_DROP record then scans
shared_buffers once and deletes all the tablespaces. A POC patch is
attached.

Thoughts?

Regards,

-- 
Fujii Masao


Re: Recovery performance of DROP DATABASE with many tablespaces

From: Ashwin Agrawal

On Mon, Jun 4, 2018 at 9:46 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
Generally the recovery performance of DROP DATABASE is not critical
for many users. But unfortunately my colleague's project might need to
sometimes drop the database using multiple tablespaces, for some reasons.
So, if the fix is not so complicated, I think that it's worth applying that.

Agreed; in isolation the need for this improvement may not be felt, but any improvement to the single serialized replay process is definitely helpful.


The straight approach to avoid such unnecessary scans is to change
DROP DATABASE so that it generates only one XLOG_DBASE_DROP record,
and register the information of all the tablespace into it. Then, WAL replay
of XLOG_DBASE_DROP record scans shared_buffers once and deletes
all tablespaces. POC patch is attached.

Also, irrespective of the performance improvement, it looks cleaner to just have a single xlog record for the operation.

RE: Recovery performance of DROP DATABASE with many tablespaces

From: "Jamison, Kirk"
Hi, Fujii-san

I came across this post and I got interested in it,
 so I tried to apply/test the patch but I am not sure if I did it correctly.
I set up master-slave sync, 200GB shared_buffers, max_locks_per_transaction = 20000,
and 1 DB with 500 table partitions spread evenly across 5 tablespaces.

After dropping the DB, with or without the patch,
there was no difference in recovery performance,
so maybe I made a mistake somewhere. But anyway, here are the results.

======WITHOUT PATCH=======
[200GB shared buffers]
DROPDB only (skipped DROP TABLE and DROP TABLESPACE)
2018/07/04_13:35:00.161
dropdb
2018/07/04_13:35:05.591        5.591 sec

[200GB shared_buffers]
DROPDB (including DROP TABLE and DROP TABLESPACE)
real    3m19.717s
user    0m0.001s
sys     0m0.001s

======WITH PATCH=======
[200GB shared_buffers]
DROPDB only (skipped DROP TABLE and DROP TABLESPACE)
2018/07/04_14:19:47.128
dropdb
2018/07/04_14:19:53.177        6.049 sec

[200GB shared_buffers]
DROPDB (included the DROP TABLE and DROP TABLESPACE commands)
real    3m51.834s
user    0m0.001s
sys     0m0.002s

Just in case, do you also have performance test numbers/cases
showing the recovery performance improvement when dropping a database that contains multiple tablespaces?

Regards,
Kirk Jamison

Re: Recovery performance of DROP DATABASE with many tablespaces

From: Fujii Masao
On Wed, Jul 4, 2018 at 4:47 PM, Jamison, Kirk <k.jamison@jp.fujitsu.com> wrote:
> Hi, Fujii-san
>
> I came across this post and I got interested in it,
>  so I tried to apply/test the patch but I am not sure if I did it correctly.
> I set-up master-slave sync, 200GB shared_buffers, 20000 max_locks_per_transaction,
> 1 DB with 500 table partitions shared evenly across 5 tablespaces.
>
> After dropping the db, with or without patch,
> there were no difference in recovery performance when dropping database,
> so maybe I made a mistake somewhere. But anyway, here's the results.
>
> ======WITHOUT PATCH=======
> [200GB shared buffers]
> DROPDB only (skipped DROP TABLE and DROP TABLESPACE)
> 2018/07/04_13:35:00.161
> dropdb
> 2018/07/04_13:35:05.591         5.591 sec
>
> [200GB shared_buffers]
> DROPDB (including DROP TABLE and DROP TABLESPACE)
> real    3m19.717s
> user    0m0.001s
> sys     0m0.001s
>
> ======WITH PATCH=======
> [200GB shared_buffers]
> DROPDB only (skipped DROP TABLE and DROP TABLESPACE)
> 2018/07/04_14:19:47.128
> dropdb
> 2018/07/04_14:19:53.177         6.049 sec
>
> [200GB shared_buffers]
> DROPDB (included the DROP TABLE and DROP TABLESPACE commands)
> real    3m51.834s
> user    0m0.001s
> sys     0m0.002s
>
> Just in case, do you also have some performance test numbers/case
> to show the recovery perf improvement when dropping database that contain multiple tablespaces?

Thanks for testing!

TBH, I have no numbers from such a test.
One question about your test: how did you measure the *recovery time*
of DROP DATABASE? Since it's *recovery* performance, it's basically
not easy to measure.

Regards,

-- 
Fujii Masao


Re: Recovery performance of DROP DATABASE with many tablespaces

From: Simon Riggs
On 4 June 2018 at 17:46, Fujii Masao <masao.fujii@gmail.com> wrote:
> Hi,
>
> My colleague encountered the problem that WAL replay took a long time
> in the standby with large shared_buffers when he dropped the database
> using many tablespaces. As far as I read the code, this happens because
> DROP DATABASE generates as many XLOG_DBASE_DROP WAL records as
> the number of tablespaces that the database to drop uses,
> and then WAL replay of one XLOG_DBASE_DROP record causes full scan of
> shared_buffers. That is, DROP DATABASE causes the scans of shared_buffers
> as many times as the number of the tablespaces during recovery.
>
> Since the first scan caused by the first XLOG_DBASE_DROP record invalidates
> all the pages related to the database to drop, in shared_buffers,
> the subsequent scans by the subsequent records seem basically useless.
> So I'd like to change the code so that we can avoid such subsequent
> unnecessary scans, to reduce the recovery time of DROP DATABASE.

+1

> Generally the recovery performance of DROP DATABASE is not critical
> for many users. But unfortunately my colleague's project might need to
> sometimes drop the database using multiple tablespaces, for some reasons.
> So, if the fix is not so complicated, I think that it's worth applying that.

Agreed

> The straight approach to avoid such unnecessary scans is to change
> DROP DATABASE so that it generates only one XLOG_DBASE_DROP record,
> and register the information of all the tablespace into it. Then, WAL replay
> of XLOG_DBASE_DROP record scans shared_buffers once and deletes
> all tablespaces. POC patch is attached.

Seems clear on a read of the patch, but I've not tested it.

Please replace tablespace_num with ntablespaces so it's clearer and
consistent with other WAL records.

Cheers

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Recovery performance of DROP DATABASE with many tablespaces

From: Michael Paquier
On Thu, Jul 05, 2018 at 01:42:20AM +0900, Fujii Masao wrote:
> TBH, I have no numbers measured by the test.
> One question about your test is; how did you measure the *recovery time*
> of DROP DATABASE? Since it's *recovery* performance, basically it's not easy
> to measure that.

It would be simple to measure the time it takes to replay this single
DROP DATABASE record by putting in two gettimeofday() calls or some such
and taking the time difference.  There are many methods that you
could use here, and I suppose that with a shared_buffers setting of a
couple of GB you would see a measurable difference
with a dozen tablespaces or so.  You could also take a base backup
after creating all the tablespaces, connect the standby and then drop
the database on the primary to see the actual time it takes.  Your patch
looks logically correct to me because DropDatabaseBuffers is a
*bottleneck* with large shared_buffers, and it would be nice to see
numbers.
--
Michael


Re: Recovery performance of DROP DATABASE with many tablespaces

From: Michael Paquier
On Tue, Jul 10, 2018 at 03:04:05PM +0900, Michael Paquier wrote:
> It would be simple to measure the time it takes to replay this single
> DROP DATABASE record by putting two gettimeofday() calls or such things
> and then take the time difference.  There are many methods that you
> could use here, and I suppose that with a shared buffer setting of a
> couple of GBs of shared buffers you would see a measurable difference
> with a dozen of tablespaces or so.  You could also take a base backup
> after creating all the tablespaces, connect the standby and then drop
> the database on the primary to see the actual time it takes.  Your patch
> looks logically correct to me because DropDatabaseBuffers is a
> *bottleneck* with large shared_buffers, and it would be nice to see
> numbers.

This was a couple of months ago, and nothing has happened since with the
patch waiting on author, so the patch is marked as returned with
feedback.
--
Michael


Re: Recovery performance of DROP DATABASE with many tablespaces

From: Fujii Masao
On Thu, Jul 5, 2018 at 5:15 PM Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On 4 June 2018 at 17:46, Fujii Masao <masao.fujii@gmail.com> wrote:
> > Hi,
> >
> > My colleague encountered the problem that WAL replay took a long time
> > in the standby with large shared_buffers when he dropped the database
> > using many tablespaces. As far as I read the code, this happens because
> > DROP DATABASE generates as many XLOG_DBASE_DROP WAL records as
> > the number of tablespaces that the database to drop uses,
> > and then WAL replay of one XLOG_DBASE_DROP record causes full scan of
> > shared_buffers. That is, DROP DATABASE causes the scans of shared_buffers
> > as many times as the number of the tablespaces during recovery.
> >
> > Since the first scan caused by the first XLOG_DBASE_DROP record invalidates
> > all the pages related to the database to drop, in shared_buffers,
> > the subsequent scans by the subsequent records seem basically useless.
> > So I'd like to change the code so that we can avoid such subsequent
> > unnecessary scans, to reduce the recovery time of DROP DATABASE.
>
> +1
>
> > Generally the recovery performance of DROP DATABASE is not critical
> > for many users. But unfortunately my colleague's project might need to
> > sometimes drop the database using multiple tablespaces, for some reasons.
> > So, if the fix is not so complicated, I think that it's worth applying that.
>
> Agreed
>
> > The straight approach to avoid such unnecessary scans is to change
> > DROP DATABASE so that it generates only one XLOG_DBASE_DROP record,
> > and register the information of all the tablespace into it. Then, WAL replay
> > of XLOG_DBASE_DROP record scans shared_buffers once and deletes
> > all tablespaces. POC patch is attached.
>
> Seems clear on read of patch, but not tested it.
>
> Please replace tablespace_num with ntablespaces so its clearer and
> consistent with other other WAL records

Thanks for the review! I changed the patch that way. Also I rebased it
on master.
Attached is the latest version of the patch.

I will add this patch to next CommitFest.

Regards,

-- 
Fujii Masao


Re: Recovery performance of DROP DATABASE with many tablespaces

From: Fujii Masao
On Tue, Jul 10, 2018 at 3:04 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Thu, Jul 05, 2018 at 01:42:20AM +0900, Fujii Masao wrote:
> > TBH, I have no numbers measured by the test.
> > One question about your test is; how did you measure the *recovery time*
> > of DROP DATABASE? Since it's *recovery* performance, basically it's not easy
> > to measure that.
>
> It would be simple to measure the time it takes to replay this single
> DROP DATABASE record by putting two gettimeofday() calls or such things
> and then take the time difference.  There are many methods that you
> could use here, and I suppose that with a shared buffer setting of a
> couple of GBs of shared buffers you would see a measurable difference
> with a dozen of tablespaces or so.  You could also take a base backup
> after creating all the tablespaces, connect the standby and then drop
> the database on the primary to see the actual time it takes.  Your patch
> looks logically correct to me because DropDatabaseBuffers is a
> *bottleneck* with large shared_buffers, and it would be nice to see
> numbers.

Thanks for the comment!

I measured how long it takes to replay DROP DATABASE with 1000 tablespaces,
on master and the patched version. shared_buffers was set to 16GB.

[master]
It took 8 seconds to replay DROP DATABASE with 1000 tablespaces,
as follows. In this case, 16GB shared_buffers was fully scanned 1000 times.

    2019-10-02 16:50:14 JST LOG:  redo starts at 0/2000028
    2019-10-02 16:50:22 JST LOG:  redo done at 0/300A298

[patched]
It took less than 1 second to replay DROP DATABASE with 1000 tablespaces,
as follows. In this case, 16GB shared_buffers was scanned only one time.

    2019-10-02 16:47:03 JST LOG:  redo starts at 0/2000028
    2019-10-02 16:47:03 JST LOG:  redo done at 0/3001588

Regards,

-- 
Fujii Masao



RE: Recovery performance of DROP DATABASE with many tablespaces

From: "k.jamison@fujitsu.com"
On Wed, Oct. 2, 2019 5:40 PM, Fujii Masao wrote:
> On Tue, Jul 10, 2018 at 3:04 PM Michael Paquier <michael@paquier.xyz> wrote:
> >
> > On Thu, Jul 05, 2018 at 01:42:20AM +0900, Fujii Masao wrote:
> > > TBH, I have no numbers measured by the test.
> > > One question about your test is; how did you measure the *recovery
> > > time* of DROP DATABASE? Since it's *recovery* performance, basically
> > > it's not easy to measure that.
> >
> > It would be simple to measure the time it takes to replay this single
> > DROP DATABASE record by putting two gettimeofday() calls or such
> > things and then take the time difference.  There are many methods that
> > you could use here, and I suppose that with a shared buffer setting of
> > a couple of GBs of shared buffers you would see a measurable
> > difference with a dozen of tablespaces or so.  You could also take a
> > base backup after creating all the tablespaces, connect the standby
> > and then drop the database on the primary to see the actual time it
> > takes.  Your patch looks logically correct to me because
> > DropDatabaseBuffers is a
> > *bottleneck* with large shared_buffers, and it would be nice to see
> > numbers.
> 
> Thanks for the comment!
> 
> I measured how long it takes to replay DROP DATABASE with 1000 tablespaces,
> in master and patched version. shared_buffers was set to 16GB.
> 
> [master]
> It took 8 seconds to replay DROP DATABASE with 1000 tablespaces, as follows.
> In this case, 16GB shared_buffers was fully scanned 1000 times.
> 
>     2019-10-02 16:50:14 JST LOG:  redo starts at 0/2000028
>     2019-10-02 16:50:22 JST LOG:  redo done at 0/300A298
> 
> [patched]
> It took less than 1 second to replay DROP DATABASE with 1000 tablespaces,
> as follows. In this case, 16GB shared_buffers was scanned only one time.
> 
>     2019-10-02 16:47:03 JST LOG:  redo starts at 0/2000028
>     2019-10-02 16:47:03 JST LOG:  redo done at 0/3001588
> 

Hi Fujii-san,

It's been a while, so I checked the patch once again.
It's fairly straightforward, and I saw no problems or bugs in the code.

> [patched]
> It took less than 1 second to replay DROP DATABASE with 1000 tablespaces,
The results are good.
I want to replicate the performance to confirm the results as well.
Could you share how you measured the recovery replay?
Did you actually execute a failover?

Regards,
Kirk Jamison

Re: Recovery performance of DROP DATABASE with many tablespaces

From: Fujii Masao
On Wed, Nov 13, 2019 at 3:57 PM k.jamison@fujitsu.com
<k.jamison@fujitsu.com> wrote:
>
> On Wed, Oct. 2, 2019 5:40 PM, Fujii Masao wrote:
> > On Tue, Jul 10, 2018 at 3:04 PM Michael Paquier <michael@paquier.xyz> wrote:
> > >
> > > On Thu, Jul 05, 2018 at 01:42:20AM +0900, Fujii Masao wrote:
> > > > TBH, I have no numbers measured by the test.
> > > > One question about your test is; how did you measure the *recovery
> > > > time* of DROP DATABASE? Since it's *recovery* performance, basically
> > > > it's not easy to measure that.
> > >
> > > It would be simple to measure the time it takes to replay this single
> > > DROP DATABASE record by putting two gettimeofday() calls or such
> > > things and then take the time difference.  There are many methods that
> > > you could use here, and I suppose that with a shared buffer setting of
> > > a couple of GBs of shared buffers you would see a measurable
> > > difference with a dozen of tablespaces or so.  You could also take a
> > > base backup after creating all the tablespaces, connect the standby
> > > and then drop the database on the primary to see the actual time it
> > > takes.  Your patch looks logically correct to me because
> > > DropDatabaseBuffers is a
> > > *bottleneck* with large shared_buffers, and it would be nice to see
> > > numbers.
> >
> > Thanks for the comment!
> >
> > I measured how long it takes to replay DROP DATABASE with 1000 tablespaces,
> > in master and patched version. shared_buffers was set to 16GB.
> >
> > [master]
> > It took 8 seconds to replay DROP DATABASE with 1000 tablespaces, as follows.
> > In this case, 16GB shared_buffers was fully scanned 1000 times.
> >
> >     2019-10-02 16:50:14 JST LOG:  redo starts at 0/2000028
> >     2019-10-02 16:50:22 JST LOG:  redo done at 0/300A298
> >
> > [patched]
> > It took less than 1 second to replay DROP DATABASE with 1000 tablespaces,
> > as follows. In this case, 16GB shared_buffers was scanned only one time.
> >
> >     2019-10-02 16:47:03 JST LOG:  redo starts at 0/2000028
> >     2019-10-02 16:47:03 JST LOG:  redo done at 0/3001588
> >
>
> Hi Fujii-san,
>
> It's been a while, so I checked the patch once again.
> It's fairly straightforward and I saw no problems nor bug in the code.

Thanks for the review!

> > [patched]
> > It took less than 1 second to replay DROP DATABASE with 1000 tablespaces,
> The results are good.
> I want to replicate the performance to confirm the results as well.
> Could you share how you measured the recovery replay?

I forgot the actual steps that I used for the measurement,
but I think they were something like the following:

1. create database "hoge"
2. create 1,000 tablespaces
3. create 1,000 tables in the database "hoge",
    each placed in a different tablespace.
4. take a base backup
5. drop database "hoge"
6. shut down the server in immediate mode
7. start an archive recovery from the backup taken at #4
8. measure how long it takes to apply the DROP DATABASE record by
    checking the timestamps at REDO start and REDO end.

I think I performed the above steps on both master and
the patched version.

> Did you actually execute a failover?

No.

Regards,

-- 
Fujii Masao



RE: Recovery performance of DROP DATABASE with many tablespaces

From: "k.jamison@fujitsu.com"
On Wed, Nov 13, 2019 5:34PM (GMT+9), Fujii Masao wrote:
> On Wed, Nov 13, 2019 at 3:57 PM k.jamison@fujitsu.com <k.jamison@fujitsu.com>
> wrote:
> >
> > On Wed, Oct. 2, 2019 5:40 PM, Fujii Masao wrote:
> > > On Tue, Jul 10, 2018 at 3:04 PM Michael Paquier <michael@paquier.xyz>
> wrote:
> > > >
> > > > On Thu, Jul 05, 2018 at 01:42:20AM +0900, Fujii Masao wrote:
> > > > > TBH, I have no numbers measured by the test.
> > > > > One question about your test is; how did you measure the
> > > > > *recovery
> > > > > time* of DROP DATABASE? Since it's *recovery* performance,
> > > > > basically it's not easy to measure that.
> > > >
> > > > It would be simple to measure the time it takes to replay this
> > > > single DROP DATABASE record by putting two gettimeofday() calls or
> > > > such things and then take the time difference.  There are many
> > > > methods that you could use here, and I suppose that with a shared
> > > > buffer setting of a couple of GBs of shared buffers you would see
> > > > a measurable difference with a dozen of tablespaces or so.  You
> > > > could also take a base backup after creating all the tablespaces,
> > > > connect the standby and then drop the database on the primary to
> > > > see the actual time it takes.  Your patch looks logically correct
> > > > to me because DropDatabaseBuffers is a
> > > > *bottleneck* with large shared_buffers, and it would be nice to
> > > > see numbers.
> > >
> > > Thanks for the comment!
> > >
> > > I measured how long it takes to replay DROP DATABASE with 1000
> > > tablespaces, in master and patched version. shared_buffers was set to
> 16GB.
> > >
> > > [master]
> > > It took 8 seconds to replay DROP DATABASE with 1000 tablespaces, as follows.
> > > In this case, 16GB shared_buffers was fully scanned 1000 times.
> > >
> > >     2019-10-02 16:50:14 JST LOG:  redo starts at 0/2000028
> > >     2019-10-02 16:50:22 JST LOG:  redo done at 0/300A298
> > >
> > > [patched]
> > > It took less than 1 second to replay DROP DATABASE with 1000
> > > tablespaces, as follows. In this case, 16GB shared_buffers was scanned
> only one time.
> > >
> > >     2019-10-02 16:47:03 JST LOG:  redo starts at 0/2000028
> > >     2019-10-02 16:47:03 JST LOG:  redo done at 0/3001588
> > >
> >
> > Hi Fujii-san,
> >
> > It's been a while, so I checked the patch once again.
> > It's fairly straightforward and I saw no problems nor bug in the code.
> 
> Thanks for the review!
> 
> > > [patched]
> > > It took less than 1 second to replay DROP DATABASE with 1000
> > > tablespaces,
> > The results are good.
> > I want to replicate the performance to confirm the results as well.
> > Could you share how you measured the recovery replay?
> 
> I forgot the actual steps that I used for the measurement.
> But I think they are something like
> 
> 1. create database "hoge"
> 2. create 1,000 tablespaces
> 3. create 1,000 tables on the database "hoge".
>     each table should be placed in different tablespace.
> 4. take a base backup
> 5. drop database "hoge"
> 6. shutdown the server with immediate mode
> 7. start an archive recovery from the backup taken at #4
> 8. measure how long it takes to apply DROP DATABASE record by
>     checking the timestamp at REDO start and REDO end.
> 
> I think that I performed the above steps on the master and the patched version.
> 
> > Did you actually execute a failover?
> 
> No.

I'm sorry for the late reply, and thanks for the guide above.
I replicated the same recovery test above on a standalone server
and have confirmed with the logs that the patch made the recovery faster.

[MASTER/UNPATCHED] ~10 seconds
2019-11-19 15:25:23.891 JST [23042] LOG:  redo starts at 0/180006A0
...
2019-11-19 15:25:34.492 JST [23042] LOG:  redo done at 0/1800A478

[PATCHED]  ~less than 1 sec
2019-11-19 15:31:59.415 JST [17625] LOG:  redo starts at 0/40005B8
...
2019-11-19 15:32:00.159 JST [17625] CONTEXT:  WAL redo at 0/4000668 for Database/DROP: dir 1663/16384
16385/16384 ... (further details omitted)
...
2019-11-19 15:32:00.159 JST [17625] LOG:  redo done at 0/4001638

I believe there are no problems, so I am marking this patch now
as "Ready for Committer".

Regards,
Kirk Jamison

Re: Recovery performance of DROP DATABASE with many tablespaces

From: Fujii Masao
On Tue, Nov 19, 2019 at 3:39 PM k.jamison@fujitsu.com
<k.jamison@fujitsu.com> wrote:
>
> On Wed, Nov 13, 2019 5:34PM (GMT+9), Fujii Masao wrote:
> > On Wed, Nov 13, 2019 at 3:57 PM k.jamison@fujitsu.com <k.jamison@fujitsu.com>
> > wrote:
> > >
> > > On Wed, Oct. 2, 2019 5:40 PM, Fujii Masao wrote:
> > > > On Tue, Jul 10, 2018 at 3:04 PM Michael Paquier <michael@paquier.xyz>
> > wrote:
> > > > >
> > > > > On Thu, Jul 05, 2018 at 01:42:20AM +0900, Fujii Masao wrote:
> > > > > > TBH, I have no numbers measured by the test.
> > > > > > One question about your test is; how did you measure the
> > > > > > *recovery
> > > > > > time* of DROP DATABASE? Since it's *recovery* performance,
> > > > > > basically it's not easy to measure that.
> > > > >
> > > > > It would be simple to measure the time it takes to replay this
> > > > > single DROP DATABASE record by putting two gettimeofday() calls or
> > > > > such things and then take the time difference.  There are many
> > > > > methods that you could use here, and I suppose that with a shared
> > > > > buffer setting of a couple of GBs of shared buffers you would see
> > > > > a measurable difference with a dozen of tablespaces or so.  You
> > > > > could also take a base backup after creating all the tablespaces,
> > > > > connect the standby and then drop the database on the primary to
> > > > > see the actual time it takes.  Your patch looks logically correct
> > > > > to me because DropDatabaseBuffers is a
> > > > > *bottleneck* with large shared_buffers, and it would be nice to
> > > > > see numbers.
> > > >
> > > > Thanks for the comment!
> > > >
> > > > I measured how long it takes to replay DROP DATABASE with 1000
> > > > tablespaces, in master and patched version. shared_buffers was set to
> > 16GB.
> > > >
> > > > [master]
> > > > It took 8 seconds to replay DROP DATABASE with 1000 tablespaces, as follows.
> > > > In this case, 16GB shared_buffers was fully scanned 1000 times.
> > > >
> > > >     2019-10-02 16:50:14 JST LOG:  redo starts at 0/2000028
> > > >     2019-10-02 16:50:22 JST LOG:  redo done at 0/300A298
> > > >
> > > > [patched]
> > > > It took less than 1 second to replay DROP DATABASE with 1000
> > > > tablespaces, as follows. In this case, 16GB shared_buffers was scanned
> > only one time.
> > > >
> > > >     2019-10-02 16:47:03 JST LOG:  redo starts at 0/2000028
> > > >     2019-10-02 16:47:03 JST LOG:  redo done at 0/3001588
> > > >
> > >
> > > Hi Fujii-san,
> > >
> > > It's been a while, so I checked the patch once again.
> > > It's fairly straightforward and I saw no problems nor bug in the code.
> >
> > Thanks for the review!
> >
> > > > [patched]
> > > > It took less than 1 second to replay DROP DATABASE with 1000
> > > > tablespaces,
> > > The results are good.
> > > I want to replicate the performance to confirm the results as well.
> > > Could you share how you measured the recovery replay?
> >
> > I forgot the actual steps that I used for the measurement.
> > But I think they are something like
> >
> > 1. create database "hoge"
> > 2. create 1,000 tablespaces
> > 3. create 1,000 tables on the database "hoge".
> >     each table should be placed in different tablespace.
> > 4. take a base backup
> > 5. drop database "hoge"
> > 6. shutdown the server with immediate mode
> > 7. start an archive recovery from the backup taken at #4
> > 8. measure how long it takes to apply DROP DATABASE record by
> >     checking the timestamp at REDO start and REDO end.
> >
> > I think that I performed the above steps on the master and the patched version.
> >
> > > Did you actually execute a failover?
> >
> > No.
>
> I'm sorry for the late reply, and thanks for the guide above.
> I replicated the same recovery test above on a standalone server
> and have confirmed with the logs that the patch made the recovery faster.
>
> [MASTER/UNPATCHED] ~10 seconds
> 2019-11-19 15:25:23.891 JST [23042] LOG:  redo starts at 0/180006A0
> ...
> 2019-11-19 15:25:34.492 JST [23042] LOG:  redo done at 0/1800A478
>
> [PATCHED]  ~less than 1 sec
> 2019-11-19 15:31:59.415 JST [17625] LOG:  redo starts at 0/40005B8
> ...
> 2019-11-19 15:32:00.159 JST [17625] CONTEXT:  WAL redo at 0/4000668 for Database/DROP: dir 1663/16384
> 16385/16384 ... (further details omitted)
> ...
> 2019-11-19 15:32:00.159 JST [17625] LOG:  redo done at 0/4001638
>
> I believe there are no problems, so I am marking this patch now
> as "Ready for Committer".

Thanks for the review! Committed.

Regards,

-- 
Fujii Masao