Re: drop/truncate table sucks for large values of shared buffers - Mailing list pgsql-hackers

From Gurjeet Singh
Subject Re: drop/truncate table sucks for large values of shared buffers
Date
Msg-id CABwTF4UCG=kKK+u1WJRbetHP7hdvdPcc_o2wSmOo9TUJxY2pig@mail.gmail.com
In response to drop/truncate table sucks for large values of shared buffers  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: drop/truncate table sucks for large values of shared buffers  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Fri, Jun 26, 2015 at 9:45 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
Some time back, on one of the PostgreSQL blogs [1], there was a
discussion about the performance of drop/truncate table for
large values of shared_buffers, and it seems that as the value
of shared_buffers increases, the performance of drop/truncate
table becomes worse.  I think those are not often-used operations,
so it never became a priority to look into improving them.

I have looked into it and found that the main reason for this
behaviour is that those operations traverse the whole of
shared_buffers, and it seems to me that we don't need that,
especially for not-so-big tables.  We can optimize that path
by looking up the buffer mapping table for the pages that exist in
shared_buffers, for the case when the table size is less than some
threshold (say 25%) of shared buffers.
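
The idea above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the attached patch: every name here (ToyBuffer, toy_map, toy_drop_relation and so on) is hypothetical, a direct-address array stands in for the buffer mapping hash table, and NBUFFERS stands in for shared_buffers. Each drop function returns the amount of work it did, so the two paths can be compared.

```c
#include <assert.h>
#include <stdbool.h>

#define NBUFFERS   1024   /* toy stand-in for shared_buffers, in pages */
#define MAX_RELS   8      /* hypothetical limits of this toy model */
#define MAX_BLOCKS 512

typedef struct
{
    int  rel;
    int  block;
    bool valid;
} ToyBuffer;

static ToyBuffer toy_pool[NBUFFERS];

/* Direct-address stand-in for the buffer mapping table:
 * holds buffer index + 1, so 0 means "not cached". */
static int toy_map[MAX_RELS][MAX_BLOCKS];

/* Cache (rel, block) in the first free buffer; returns its index or -1. */
static int
toy_cache_block(int rel, int block)
{
    assert(rel >= 0 && rel < MAX_RELS && block >= 0 && block < MAX_BLOCKS);
    if (toy_map[rel][block] != 0)
        return toy_map[rel][block] - 1;     /* already cached */
    for (int i = 0; i < NBUFFERS; i++)
    {
        if (!toy_pool[i].valid)
        {
            toy_pool[i] = (ToyBuffer){rel, block, true};
            toy_map[rel][block] = i + 1;
            return i;
        }
    }
    return -1;                              /* pool is full */
}

static void
toy_invalidate(int i)
{
    toy_map[toy_pool[i].rel][toy_pool[i].block] = 0;
    toy_pool[i].valid = false;
}

/* Full scan: visits every buffer. Returns the number of buffers visited. */
static int
toy_drop_full_scan(int rel)
{
    for (int i = 0; i < NBUFFERS; i++)
        if (toy_pool[i].valid && toy_pool[i].rel == rel)
            toy_invalidate(i);
    return NBUFFERS;
}

/* Lookup path: one mapping probe per block. Returns the number of probes. */
static int
toy_drop_by_lookup(int rel, int nblocks)
{
    for (int b = 0; b < nblocks; b++)
    {
        int slot = toy_map[rel][b];

        if (slot != 0)
            toy_invalidate(slot - 1);
    }
    return nblocks;
}

/* Use the lookup path only below 25% of the pool, else scan everything. */
static int
toy_drop_relation(int rel, int nblocks)
{
    assert(rel >= 0 && rel < MAX_RELS && nblocks >= 0 && nblocks <= MAX_BLOCKS);
    if (nblocks < NBUFFERS / 4)
        return toy_drop_by_lookup(rel, nblocks);
    return toy_drop_full_scan(rel);
}

/* Count the buffers currently holding pages of rel. */
static int
toy_count_cached(int rel)
{
    int n = 0;

    for (int i = 0; i < NBUFFERS; i++)
        if (toy_pool[i].valid && toy_pool[i].rel == rel)
            n++;
    return n;
}
```

In this model, dropping a 10-block relation costs 10 probes, while a 300-block relation (above the 25% cutoff of 256) falls back to visiting all 1024 buffers.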

The attached patch implements the above idea, and I found that
performance doesn't dip much with the patch even with large values
of shared_buffers.  I have also attached the script and SQL file used
to take performance data.

+1 for the effort to improve this.

With your technique added, there are 3 possible ways the search can happen: a) scan NBuffers and scan the list of relations, b) scan NBuffers and bsearch the list of relations, and c) scan the list of relations and then invalidate the blocks of each fork from shared buffers. Would it be worth finding one technique that can serve decently from low-end shared_buffers to high-end?
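
To make the difference between (a) and (b) concrete, here is a rough standalone sketch; option (c) is the per-relation lookup that the patch adds. The names (count_droppable, rel_in_list_linear, rel_in_list_bsearch) are hypothetical, and plain ints stand in for relation identifiers. Both variants walk every buffer; they differ only in how each buffer's relation is matched against the drop list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Three-way comparison of two ints, usable by qsort() and bsearch(). */
static int
cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);
}

/* (a) linear scan of the drop list: O(nrels) per buffer */
static bool
rel_in_list_linear(int rel, const int *rels, size_t nrels)
{
    for (size_t i = 0; i < nrels; i++)
        if (rels[i] == rel)
            return true;
    return false;
}

/* (b) binary search of a sorted drop list: O(log nrels) per buffer */
static bool
rel_in_list_bsearch(int rel, const int *sorted_rels, size_t nrels)
{
    return bsearch(&rel, sorted_rels, nrels, sizeof(int), cmp_int) != NULL;
}

/*
 * Walk every buffer and count those whose relation is in the drop list.
 * With use_bsearch set, the list is sorted in place first.
 */
static size_t
count_droppable(const int *buf_rels, size_t nbuffers,
                int *rels, size_t nrels, bool use_bsearch)
{
    size_t hits = 0;

    assert(buf_rels != NULL && rels != NULL);
    if (use_bsearch)
        qsort(rels, nrels, sizeof(int), cmp_int);

    for (size_t i = 0; i < nbuffers; i++)
    {
        bool match = use_bsearch
            ? rel_in_list_bsearch(buf_rels[i], rels, nrels)
            : rel_in_list_linear(buf_rels[i], rels, nrels);

        if (match)
            hits++;
    }
    return hits;
}
```

Either way the cost grows with NBuffers, which is why (a) and (b) both suffer at large shared_buffers, whereas (c) grows with the size of the relations being dropped.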

On patch:

There are multiple naming styles being used in DropForkSpecificBuffers(); my_name and myName. Given this is a new function, it'd help to be consistent.

s/blk_count/blockNum/

s/new//, e.g. newTag, because there's no corresponding tag/oldTag variable in the function.

s/blocksToDel/blocksToDrop/. BTW, we never pass anything other than the total number of blocks in the fork, so we may as well call it just numBlocks.

s/traverse_buf_freelist/scan_shared_buffers/, because when it is true, we scan the whole shared_buffers.

s/rel_count/rel_num/

Reduce indentation/tab in header-comments of DropForkSpecificBuffers(). But I see there's precedent in neighboring functions, so this may be okay.

Doing pfree() of num_blocks, num_fsm_blocks and num_vm_blocks in one place (instead of two, at different indentation levels) would help readability.
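
As an illustration of the single-cleanup shape, here is a standalone sketch. It is not the patch's code: malloc()/free() stand in for palloc()/pfree(), the function name toy_total_blocks is hypothetical, and the fork sizes are placeholders. Only the three array names come from the patch.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Sum the block counts of all forks for nrels relations.
 * Returns the total, or -1 if an allocation failed.
 */
static long
toy_total_blocks(const int *main_blocks, int nrels)
{
    long total = -1;
    int *num_blocks;
    int *num_fsm_blocks;
    int *num_vm_blocks;

    assert(main_blocks != NULL && nrels > 0);

    num_blocks = malloc((size_t) nrels * sizeof(int));
    num_fsm_blocks = malloc((size_t) nrels * sizeof(int));
    num_vm_blocks = malloc((size_t) nrels * sizeof(int));

    if (num_blocks && num_fsm_blocks && num_vm_blocks)
    {
        total = 0;
        for (int i = 0; i < nrels; i++)
        {
            num_blocks[i] = main_blocks[i];
            num_fsm_blocks[i] = 1;      /* placeholder fork size */
            num_vm_blocks[i] = 1;       /* placeholder fork size */
            total += num_blocks[i] + num_fsm_blocks[i] + num_vm_blocks[i];
        }
    }

    /* Single cleanup point for every path; free(NULL) is a no-op. */
    free(num_blocks);
    free(num_fsm_blocks);
    free(num_vm_blocks);
    return total;
}
```

With all three frees at one indentation level at the end, every path through the function releases the arrays exactly once.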

Best regards,
-- 
