Re: Is it worth to optimize VACUUM/ANALYZE by combining duplicate rel instances into single rel instance? - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: Is it worth to optimize VACUUM/ANALYZE by combining duplicate rel instances into single rel instance?
Date
Msg-id 20210421.113249.1564589097234983099.horikyota.ntt@gmail.com
In response to Re: Is it worth to optimize VACUUM/ANALYZE by combining duplicate rel instances into single rel instance?  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses Re: Is it worth to optimize VACUUM/ANALYZE by combining duplicate rel instances into single rel instance?  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
List pgsql-hackers
At Wed, 21 Apr 2021 07:34:40 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in 
> On Sat, Apr 10, 2021 at 8:03 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> writes:
> > > I'm reading the code for vacuum/analyze and it looks like currently we
> > > call vacuum_rel/analyze_rel for each relation specified. Which means
> > > that if a relation is specified more than once, then we simply
> > > vacuum/analyze it that many times. Do we gain any advantage by
> > > vacuuming/analyzing a relation back-to-back within a single command? I
> > > strongly feel no. I'm thinking we could do a simple optimization here,
> >
> > This really is not something to expend cycles and code complexity on.
> > If the user wrote the same table more than once, that's their choice.
> 
> Thanks! I think we could avoid extra processing costs for cases like
> VACUUM/ANALYZE foo, foo; when no explicit columns are specified. The
> avoided costs can be lock acquire, relation open, vacuum/analyze,
> relation close, starting new xact command, command counter increment
> in case of analyze etc. This can be done with a simple patch like the
> attached. When explicit columns are specified along with relations
> i.e. VACUUM/ANALYZE foo(c1), foo(c2); we don't want to do the extra
> complex processing to optimize the cases when c1 = c2.
> 
> Note that the TRUNCATE command currently skips processing repeated
> relations (see ExecuteTruncate). For example, in TRUNCATE foo, foo; and
> TRUNCATE foo, ONLY foo, foo; only the first instance of relation foo
> is processed, and the other instances (along with any options
> specified for them) are ignored.
> 
> Thoughts?

Although I don't strongly oppose adding that check, the check in
TRUNCATE is natural and required. The relation list is used afterwards
anyway, and we cannot truncate the same relation twice or more, since a
relation that is in use cannot be truncated (truncation is itself one
form of use). In short, TRUNCATE does not run the check just for the
check's own sake.

On the other hand, the patch creates a relation list just for this
purpose, which is not needed to run VACUUM/ANALYZE, and VACUUM/ANALYZE
works fine with duplicates among the target relations.
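
To make the cost being discussed concrete, here is a rough,
standalone sketch of a "keep the first instance, ignore later ones"
pass over a list of relation identifiers. It is not the attached patch
and not PostgreSQL code: RelId is a stand-in for Oid, dedup_rel_ids is
a hypothetical name, and a plain array is assumed instead of a List.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for PostgreSQL's Oid type; illustrative only. */
typedef unsigned int RelId;

/*
 * Compact ids[0..n) in place so that each id appears once, keeping the
 * first occurrence and the original order.  Returns the new length.
 * Quadratic, which is acceptable for the handful of relations a single
 * command usually names.
 */
size_t
dedup_rel_ids(RelId *ids, size_t n)
{
    size_t      kept = 0;

    assert(ids != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        bool        seen = false;

        /* Look for ids[i] among the entries already kept. */
        for (size_t j = 0; j < kept; j++)
        {
            if (ids[j] == ids[i])
            {
                seen = true;
                break;
            }
        }

        /* First time we see this id: keep it. */
        if (!seen)
            ids[kept++] = ids[i];
    }

    return kept;
}
```

In TRUNCATE a list like this is needed for the later processing
anyway; for VACUUM/ANALYZE it would exist only to support the check.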

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


