Re: Avoid overhead open-close indexes (catalog updates) - Mailing list pgsql-hackers

From Ranier Vilela
Subject Re: Avoid overhead open-close indexes (catalog updates)
Date
Msg-id CAEudQAqgBCXO13jj-ykB0ygTC3RFNSaNjr59W1OhEXr5fggoww@mail.gmail.com
In response to Re: Avoid overhead open-close indexes (catalog updates)  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: Avoid overhead open-close indexes (catalog updates)
List pgsql-hackers
On Wed, Aug 31, 2022 at 22:12, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
At Wed, 31 Aug 2022 08:16:55 -0300, Ranier Vilela <ranier.vf@gmail.com> wrote in
> Hi,
>
> The commit
> https://github.com/postgres/postgres/commit/b17ff07aa3eb142d2cde2ea00e4a4e8f63686f96
> Introduced the CopyStatistics function.
>
> To do the work, CopyStatistics uses a less efficient function
> to update/insert tuples in the system catalogs.
>
> The comment at indexing.c says:
> "Avoid using it for multiple tuples, since opening the indexes
>  * and building the index info structures is moderately expensive.
>  * (Use CatalogTupleInsertWithInfo in such cases.)"
>
> So, inspired by the comment, I changed CatalogTupleInsert/CatalogTupleUpdate
> in a few places to the more efficient functions
> CatalogTupleInsertWithInfo/CatalogTupleUpdateWithInfo.
>
> Quick tests show a small performance gain.
Hi,
Thanks for taking a look at this.
 

Considering that the whole operation usually takes far longer, I'm not
sure whether that amount of performance gain is useful, but I like the
change as a matter of tidiness and as an example for later code.
Yeah, this serves as an example for future code.
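
To make the pattern concrete, here is a small toy model of it. This is not PostgreSQL code: open_indexes, close_indexes, insert_one and insert_with_info are hypothetical stand-ins for CatalogOpenIndexes, CatalogCloseIndexes, CatalogTupleInsert and CatalogTupleInsertWithInfo, and the counter only models how often the index-open step runs.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical stand-ins, not the PostgreSQL API. */
typedef struct IndexState { int is_open; } IndexState;

static int open_calls = 0;  /* how many times the "expensive" open ran */

static void open_indexes(IndexState *st)  { open_calls++; st->is_open = 1; }
static void close_indexes(IndexState *st) { st->is_open = 0; }

/* Models CatalogTupleInsertWithInfo: the caller supplies open index state. */
static void insert_with_info(IndexState *st, int tuple)
{
    assert(st->is_open);
    (void) tuple;           /* the real work is irrelevant to the model */
}

/* Models CatalogTupleInsert: opens and closes the indexes on every call. */
static void insert_one(int tuple)
{
    IndexState st = {0};

    open_indexes(&st);
    insert_with_info(&st, tuple);
    close_indexes(&st);
}

/* Per-tuple open/close; returns the number of opens performed. */
static int copy_naive(const int *tuples, size_t n)
{
    int before = open_calls;

    for (size_t i = 0; i < n; i++)
        insert_one(tuples[i]);
    return open_calls - before;
}

/* Open once, reuse the state for every tuple, close once. */
static int copy_with_info(const int *tuples, size_t n)
{
    int before = open_calls;
    IndexState st = {0};

    open_indexes(&st);
    for (size_t i = 0; i < n; i++)
        insert_with_info(&st, tuples[i]);
    close_indexes(&st);
    return open_calls - before;
}
```

In this model the naive loop opens the indexes once per tuple, while the WithInfo-style loop opens them exactly once, even when the loop body never runs.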


> There are other places that this could be useful,
> but a careful analysis is necessary.

What kind of concern do you have in mind?
Code bloat.
Three more lines are required per call (CatalogTupleInsert/CatalogTupleUpdate).
However, not all code paths are reachable.
The ideal typical case would be CopyStatistics, I think.
With none or at least one filter in the tuples loop.
The cost of calling CatalogOpenIndexes unconditionally should be considered.
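
One way to picture that trade-off is to defer the open until the first tuple actually passes the filter. The sketch below is only a toy model and is not part of the patch: every name in it (LazyIndexState, lazy_open, lazy_insert, copy_filtered_lazy, keep_even, keep_none) is hypothetical, and the counter merely models how often the index-open step would run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not the PostgreSQL API. */
typedef struct LazyIndexState { bool is_open; } LazyIndexState;

static int lazy_open_calls = 0;  /* counts the "expensive" opens */

static void lazy_open(LazyIndexState *st)  { lazy_open_calls++; st->is_open = true; }
static void lazy_close(LazyIndexState *st) { st->is_open = false; }

static void lazy_insert(LazyIndexState *st, int tuple)
{
    assert(st->is_open);
    (void) tuple;
}

/*
 * Insert only the tuples accepted by keep(). The indexes are opened on the
 * first accepted tuple, so a loop that filters everything out pays nothing.
 * Returns the number of opens performed (0 or 1).
 */
static int copy_filtered_lazy(const int *tuples, size_t n, bool (*keep)(int))
{
    int before = lazy_open_calls;
    LazyIndexState st = { false };

    for (size_t i = 0; i < n; i++)
    {
        if (!keep(tuples[i]))
            continue;
        if (!st.is_open)
            lazy_open(&st);
        lazy_insert(&st, tuples[i]);
    }
    if (st.is_open)
        lazy_close(&st);
    return lazy_open_calls - before;
}

static bool keep_even(int v) { return v % 2 == 0; }
static bool keep_none(int v) { (void) v; return false; }
```

The lazy variant avoids the unconditional open, but it adds yet more lines per call site, which is exactly the code bloat concern.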
 

By the way, there is another similar function,
CatalogTupleMultiInsertWithInfo(), which would be more time-efficient
(but not space-efficient) and is used in InsertPgAttributeTuples. I
don't see a clear criterion for choosing between the two, though.

I don't think CatalogTupleMultiInsertWithInfo would be useful in the cases reported here.
I think the cost of building the slots would be prohibitive, and it would add unnecessary complexity.
 
I think the overhead of opening the catalog indexes is significant when
no other time-consuming tasks are involved in the whole operation.
In that sense, in terms of performance, storeOperators and
storeProcedures (called under DefineOpClass) might benefit more
from this, if we disregard how rarely the command is used.

Yeah, storeOperators and storeProcedures are good candidates.
Let's wait for the patch to be accepted and committed; then we can try changing them.

I will create a CF entry.

regards,
Ranier Vilela
