Re: Opportunistically pruning page before update - Mailing list pgsql-hackers
From | James Coleman |
---|---|
Subject | Re: Opportunistically pruning page before update |
Date | |
Msg-id | CAAaqYe_f-a0agsgvq3YWybWh5Y3qqT7x+t1rVSBTgx+2sfwZjg@mail.gmail.com Whole thread Raw |
In response to | Re: Opportunistically pruning page before update (James Coleman <jtc331@gmail.com>) |
List | pgsql-hackers |
On Fri, Jan 26, 2024 at 8:33 PM James Coleman <jtc331@gmail.com> wrote: > > On Tue, Jan 23, 2024 at 2:46 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Tue, Jan 23, 2024 at 7:18 AM James Coleman <jtc331@gmail.com> wrote: > > > > > > On Mon, Jan 22, 2024 at 8:21 PM James Coleman <jtc331@gmail.com> wrote: > > > > > > > > See rebased patch attached. > > > > > > I just realized I left a change in during the rebase that wasn't necessary. > > > > > > v4 attached. > > > > I have noticed that you are performing the opportunistic pruning after > > we decided that the updated tuple can not fit in the current page and > > then we are performing the pruning on the new target page. Why don't > > we first perform the pruning on the existing page of the tuple itself? > > Or this is already being done before this patch? I could not find > > such existing pruning so got this question because such pruning can > > convert many non-hot updates to the HOT update right? > > First off I noticed that I accidentally sent a different version of > the patch I'd originally worked on. Here's the one from the proper > branch. It's still similar, but I want to make sure the right one is > being reviewed. > > I'm working on a demo case for updates (to go along with the insert > case I sent earlier) to test out your question, and I'll reply when I > have that. All right, getting all this loaded back into my head, as you noted earlier the patch currently implements points 1 and 2 of my list of possible improvements: > 1. The most trivial case where this is useful is INSERT: we have a > target page, and it may have dead tuples, so trying to prune may > result in us being able to use the target page rather than getting a > new page. > 2. The next most trivial case is where UPDATE (potentially after > failing to find space for a HOT tuple on the source tuple's page); > much like the INSERT case our backend's target page may benefit from > pruning. What you're describing above would be implementing (at least part of) point 3: > 3. A more complex UPDATE case occurs when we check the tuple's page > for space in order to insert a HOT tuple and fail to find enough > space. While we've already opportunistically pruned the page on > initial read of the tuple, in complex queries this might be some time > in the past, so it may be worth attempting again. > ... If we try to design a simple test case for updates (like my insert test case above) we might end up with something like: drop table if exists foo; create table foo(pk serial primary key, t text); insert into foo(t) select repeat('a', 250) from generate_series(1, 27); select pg_relation_size('foo'); delete from foo where pk = 1; update foo set t = repeat('b', 250) where pk = 2; select pg_relation_size('foo'); But that actually works as expected on master, because we call heap_page_prune_opt from heapam_index_fetch_tuple as part of the index scan that drives the update query. I was theorizing that if there are concurrent writes to the page we might being able to trigger the need to re-prune a page in the for loop in heap_update(), and I tried to both regular pgbench and a custom pgbench script with inserts/deletes/updates (including some artificial delays). What I concluded what this isn't isn't likely to be fruitful: we need the buffer to be local to our backend (no other pins) to be able to clean it, but since we've already pruned it on read, we need to have had another backend modify the page (and dropped its pin!) between our read and our write. If someone believes there's a scenario that would demonstrate otherwise, I would of course be interested to hear any ideas, but at this point I think it's probably worth focusing on the first two cases this patch already addresses. Regards, James Coleman
pgsql-hackers by date: