Re: [HACKERS] Block level parallel vacuum - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [HACKERS] Block level parallel vacuum
Date
Msg-id CAA4eK1Jnu3Y04_T1BegyK=xhSFpp3E6mxPNAz1TnU5pL0Rg73Q@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Block level parallel vacuum  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: [HACKERS] Block level parallel vacuum
Re: [HACKERS] Block level parallel vacuum
List pgsql-hackers
On Sun, Oct 27, 2019 at 12:52 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Oct 25, 2019 at 9:19 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> >
> I haven't yet read the new set of the patch.  But, I have noticed one
> thing.  That we are getting the size of the statistics using the AM
> routine.  But, we are copying those statistics from local memory to
> the shared memory directly using the memcpy.   Wouldn't it be a good
> idea to have an AM specific routine to get it copied from the local
> memory to the shared memory?  I am not sure it is worth it or not but
> my thought behind this point is that it will give AM to have local
> stats in any form ( like they can store a pointer in that ) but they
> can serialize that while copying to shared stats.  And, later when
> shared stats are passed back to the Am then it can deserialize in its
> local form and use it.
>

You have a point, but after changing the gist index, we don't have any
current usage for indexes that need something like that. So, on one
side there is some value in having an API to copy the stats, but on
the other side without having clear usage of an API, it might not be
good to expose a new API for the same.   I think we can expose such an
API in the future if there is a need for the same.  Do you or anyone
know of any external IndexAM that has such a need?

Few minor comments while glancing through the latest patchset.

1. I think you can merge 0001*, 0002*, 0003* patch into one patch as
all three expose new variable/function from IndexAmRoutine.

2.
+prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes)
+{
+ char *p = (char *) GetSharedIndStats(lvshared);
+ int vac_work_mem = IsAutoVacuumWorkerProcess() &&
+ autovacuum_work_mem != -1 ?
+ autovacuum_work_mem : maintenance_work_mem;

I think this function won't be called from AutoVacuumWorkerProcess at
least not as of now, so isn't it a better idea to have an Assert for
it?

3.
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)

This function is for performing a parallel operation on the index, so
why to start with heap?  It is better to name it as
index_parallel_vacuum_main or simply parallel_vacuum_main.

4.
/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,17 +280,12 @@ typedef struct LVRelStats
  BlockNumber pages_removed;
  double tuples_deleted;
  BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
- /* List of TIDs of tuples we intend to delete */
- /* NB: this list is ordered by TID address */
- int num_dead_tuples; /* current # of entries */
- int max_dead_tuples; /* # slots allocated in array */
- ItemPointer dead_tuples; /* array of ItemPointerData */
+ LVDeadTuples *dead_tuples;
  int num_index_scans;
  TransactionId latestRemovedXid;
  bool lock_waiter_detected;
 } LVRelStats;

-
 /* A few variables that don't seem worth passing around as parameters */
 static int elevel = -1;

It seems like a spurious line removal.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Soumyadeep Chakraborty
Date:
Subject: Re: WIP: expression evaluation improvements
Next
From: Michael Paquier
Date:
Subject: Re: v12.0: interrupt reindex CONCURRENTLY: ccold: ERROR: could notfind tuple for parent of relation ...