Re: making update/delete of inheritance trees scale better - Mailing list pgsql-hackers

From Amit Langote
Subject Re: making update/delete of inheritance trees scale better
Date
Msg-id CA+HiwqFhz8OOMrNJD4kxiT=39UGQ=HojEtKSKOnL7+hA+0PTRQ@mail.gmail.com
Whole thread Raw
In response to Re: making update/delete of inheritance trees scale better  (Heikki Linnakangas <hlinnaka@iki.fi>)
List pgsql-hackers
On Sat, Oct 31, 2020 at 7:26 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 31/10/2020 00:12, Tom Lane wrote:
> > Heikki Linnakangas <hlinnaka@iki.fi> writes:
> >> .... But if you do:
> >
> >> postgres=# explain verbose update tab set a = 1, b = 2;
> >>                                      QUERY PLAN
> >> ---------------------------------------------------------------------------------
> >>    Update on public.tab  (cost=0.00..269603.27 rows=0 width=0)
> >>      ->  Seq Scan on public.tab  (cost=0.00..269603.27 rows=10028327
> >> width=14)
> >>            Output: 1, 2, ctid
> >
> >> The Modify Table will still fetch the old tuple, but in this case, it's
> >> not really necessary, because both columns are overwritten.
> >
> > Ah, that I believe.  Not sure it's a common enough case to spend cycles
> > looking for, though.
> >
> > In any case, we still have to access the old tuple, don't we?
> > To lock it and update its t_ctid, whether or not we have use for
> > its user columns.  Maybe there's some gain from not having to
> > deconstruct the tuple, but it doesn't seem like it'd be much.

With the patched, the old tuple fetched by ModifyTable node will not
be deconstructed in this case, because all the values needed to form
the new tuple will be obtained from the plan's output tuple, so there
is no need to read the user columns from the old tuple.  Given that,
it indeed sounds a bit wasteful to have read the tuple as Heikki
points out, but again, that's in a rare case.

> Yeah, you need to access the old tuple to update its t_ctid, but
> accessing it twice is still more expensive than accessing it once. Maybe
> you could optimize it somewhat by keeping the buffer pinned or
> something.

The buffer containing the old tuple is already pinned first when
ExecModifyTable() fetches the tuple to form the new tuple, and then
when, in this example, heap_update() fetches it to update the old
tuple contents.

> Or push the responsibility down to the table AM, passing the
> AM only the modified columns, and let the AM figure out how to deal with
> the columns that were not modified, hoping that it can do something smart.

That sounds interesting, but maybe a sizable project on its own?

Thanks a lot for taking a look at this, BTW.

--
Amit Langote
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: hash_array_extended() needs to pass down collation
Next
From: David Rowley
Date:
Subject: Re: Hybrid Hash/Nested Loop joins and caching results from subplans