Re: LP_DEAD hinting and not holding on to a buffer pin on leaf page(Was: [HACKERS] [WIP] Zipfian distribution in pgbench) - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: LP_DEAD hinting and not holding on to a buffer pin on leaf page(Was: [HACKERS] [WIP] Zipfian distribution in pgbench)
Date
Msg-id CAA4eK1LemtSQdx4_P_v4P5qtZ0deDDC-kq8Q+q=2RiozvJzDnw@mail.gmail.com
Whole thread Raw
In response to LP_DEAD hinting and not holding on to a buffer pin on leaf page (Was:[HACKERS] [WIP] Zipfian distribution in pgbench)  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: LP_DEAD hinting and not holding on to a buffer pin on leaf page(Was: [HACKERS] [WIP] Zipfian distribution in pgbench)
List pgsql-hackers
On Wed, Jul 26, 2017 at 3:32 AM, Peter Geoghegan <pg@bowt.ie> wrote:
> On Fri, Jul 14, 2017 at 5:06 PM, Peter Geoghegan <pg@bowt.ie> wrote:
>> I think that what this probably comes down to, more than anything
>> else, is that you have leftmost hot/bloated leaf pages like this:
>>
>>
>>           idx          | level | l_item | blkno | btpo_prev |
>> btpo_next | btpo_flags | type | live_items | dead_items |
>> avg_item_size | page_size | free_size |         highkey
>>
-----------------------+-------+--------+-------+-----------+-----------+------------+------+------------+------------+---------------+-----------+-----------+-------------------------
>>  ...
>>  pgbench_accounts_pkey |     0 |      1 |     1 |         0 |
>> 2751 |         65 | l    |        100 |         41 |            16 |
>>    8192 |      5328 | 11 00 00 00 00 00 00 00
>>  pgbench_accounts_pkey |     0 |      2 |  2751 |         1 |
>> 2746 |         65 | l    |         48 |         90 |            16 |
>>    8192 |      5388 | 32 00 00 00 00 00 00 00
>> ...
>>
>> The high key for the leftmost shows that only values below 0x11 belong
>> on the first page. This is about 16 or 17 possible distinct values,
>> and yet the page has 100 live items, and 41 dead items; in total,
>> there is room for 367 items. It's only slightly better with other
>> nearby pages. This is especially bad because once the keyspace gets
>> split up this finely, it's *impossible* to reverse it -- it's more or
>> less a permanent problem, at least until a REINDEX.
>
> I've been thinking about this a lot, because this really does look
> like a pathological case to me. I think that this workload is very
> sensitive to how effective kill_prior_tuples/LP_DEAD hinting is. Or at
> least, I can imagine that mechanism doing a lot better than it
> actually manages to do here. I wonder if it's possible that commit
> 2ed5b87f9, which let MVCC snapshots not hold on to a pin on leaf
> pages, should have considered workloads like this.
>

Isn't it possible to confirm if the problem is due to commit
2ed5b87f9?  Basically, if we have unlogged tables, then it won't
release the pin.  So if the commit in question is the culprit, then
the same workload should not lead to bloat.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: [HACKERS] PL_stashcache, or, what's our minimum Perl version?
Next
From: Ashutosh Bapat
Date:
Subject: Re: [HACKERS] Adding support for Default partition in partitioning