Re: PATCH: optimized DROP of multiple tables within a transaction - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: PATCH: optimized DROP of multiple tables within a transaction
Date
Msg-id f1adeff6990d863d76d191237b57a416@fuzzy.cz
Whole thread Raw
In response to Re: PATCH: optimized DROP of multiple tables within a transaction  (Shigeru HANADA <shigeru.hanada@gmail.com>)
Responses Re: PATCH: optimized DROP of multiple tables within a transaction  (花田 茂 <shigeru.hanada@gmail.com>)
Re: PATCH: optimized DROP of multiple tables within a transaction  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Re: PATCH: optimized DROP of multiple tables within a transaction  (Tomas Vondra <tv@fuzzy.cz>)
List pgsql-hackers
Hi,

thanks for the review. I'll look into that in ~2 weeks, once the 
pgconf.eu
is over. A few comments in the text below.

Dne 17.10.2012 12:34, Shigeru HANADA napsal:
> Performance test
> ================
> I tested 1000 tables case (each is copy of pgbench_branches with 
> 100000
> rows) on 1GB shared_buffers server.  Please note that I tested on
> MacBook air, i.e. storage is not HDD but SSD.  Here is the test 
> procedure:
>
> 1) loop 1000 times
>  1-1) create copy table of pgbench_accounts as accounts$i
>  1-2) load 100000 rows
>  1-3) add primary key
>  1-4) select all rows to cache pages in shared buffer
> 2) BEGIN
> 3) loop 1000 times
>  3-1) DROP TABLE accounts$i
> 4) COMMIT

I don't think the 'load rows' and 'select all rows' is really 
necessary.
And AFAIK sequential scans use small circular buffer not to pollute 
shared
buffers, so I'd guess the pages are not cached in shared buffers 
anyway.
Have you verified that, e.g. by pg_buffercache?

> The numbers below are average of 5 trials.
>
> ---------+----------+-------------
>   Build  |   DROP * |   COMMIT
> ---------+----------+-------------
>  Master  | 0.239 ms | 1220.421 ms
>  Patched | 0.240 ms |  432.209 ms
> ---------+----------+-------------
> * time elapsed for one DROP TABLE statement
>
> IIUC the patch's target is improving COMMIT performance by avoiding
> repeated buffer search loop, so this results show that the patch
> obtained its goal.
>
> Coding review
> =============
> I have some comments about coding.
>
> * Some cosmetic changes are necessary.
> * Variable j in DropRelFileNodeAllBuffersList seems redundant.
> * RelFileNodeBackendIsTemp() macro is provided for checking whether 
> the
> relation is local, so using it would be better.
>
> Please see attached patch for changes above.
>
> * As Robert commented, this patch adds DropRelFileNodeAllBuffersList 
> by
> copying code from DropRelFileNodeAllBuffers.  Please refactor it to
> avoid code duplication.
> * In smgrDoPendingDeletes, you free srels explicitly.  Can't we leave
> them to memory context stuff?  Even it is required, at least pfree 
> must
> be called in the case nrels == 0 too.
> * In smgrDoPendingDeletes, the buffer srels is expanded in every
> iteration.  This seems a bit inefficient.  How about doubling the
> capacity when used it up?  This requires additional variable, but
> reduces repalloc call considerably.
> * Just an idea, but if we can remove entries for local relations from
> rnodes array before buffer loop in DropRelFileNodeAllBuffersList,
> following bsearch might be more efficient, though dropping many
> temporary tables might be rare.

Yes, I plan to do all of this.

>> Our system creates a lot of "working tables" (even 100.000) and we 
>> need
>> to perform garbage collection (dropping obsolete tables) regularly. 
>> This
>> often took ~ 1 hour, because we're using big AWS instances with lots 
>> of
>> RAM (which tends to be slower than RAM on bare hw). After applying 
>> this
>> patch and dropping tables in groups of 100, the gc runs in less than 
>> 4
>> minutes (i.e. a 15x speed-up).
>
> Hm, my environment seems very different from yours.  Could you show 
> the
> setting of shared_buffers in your environment?  I'd like to make my 
> test
> environment as similar as possible to yours.

We're using m2.4xlarge instances (70G of RAM) with 10GB shared buffers.

Tomas



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: How to avoid base backup in automated failover
Next
From: Amit Kapila
Date:
Subject: Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown