Thread: a verbose option for autovacuum
Hi all
I was surprised to see that there's no way to get `VACUUM VERBOSE`-like output from autovacuum. Is there any interest in enabling this?
Additionally, is there any interest in exposing more vacuum options to be run by autovac? Right now it runs FREEZE and ANALYZE, which leaves the VERBOSE, SKIP_LOCKED, INDEX_CLEANUP, and TRUNCATE unconfigurable. I skipped FULL in that list because I'm assuming no one would ever want autovac to run VACUUM FULL.
Tommy
Tommy Li <tommy@coffeemeetsbagel.com> writes: > I was surprised to see that there's no way to get `VACUUM VERBOSE`-like > output from autovacuum. Is there any interest in enabling this? Seems like that would very soon feel like log spam. What would be the use case for having this on? If you want one-off results you could run VACUUM manually. > Additionally, is there any interest in exposing more vacuum options to be > run by autovac? Right now it runs FREEZE and ANALYZE, which leaves the > VERBOSE, SKIP_LOCKED, INDEX_CLEANUP, and TRUNCATE unconfigurable. To the extent that any of these make sense in autovacuum, I'd say they ought to be managed automatically. I don't see a strong argument for users configuring this. (See also nearby thread about allowing index AMs to control some of this.) regards, tom lane
Hey Tom
> use case for having this on? If you want one-off results you could
> run VACUUM manually.
In my case I have a fairly large, fairly frequently updated table with a large number of indexes where autovacuum's runtime can fluctuate between 12 and 24 hours. If I want to investigate why autovacuum today is running many hours longer than it did last week, the only information I have to go off is from pg_stat_progress_vacuum, which reports only progress based on the number of blocks completed across _all_ indexes.
VACUUM VERBOSE's output is nice because it reports runtime per index, which would allow me to see if a specific index has bloated more than usual.
I also have autovacuum throttled much more aggressively than manual vacuums, so information from a one-off manual VACUUM isn't comparable.
As for log spam, I'm not sure it's a problem as long as the verbose option is disabled by default.
Tommy
On Fri, Jan 22, 2021 at 2:33 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Tommy Li <tommy@coffeemeetsbagel.com> writes:
> I was surprised to see that there's no way to get `VACUUM VERBOSE`-like
> output from autovacuum. Is there any interest in enabling this?
Seems like that would very soon feel like log spam. What would be the
use case for having this on? If you want one-off results you could
run VACUUM manually.
> Additionally, is there any interest in exposing more vacuum options to be
> run by autovac? Right now it runs FREEZE and ANALYZE, which leaves the
> VERBOSE, SKIP_LOCKED, INDEX_CLEANUP, and TRUNCATE unconfigurable.
To the extent that any of these make sense in autovacuum, I'd say they
ought to be managed automatically. I don't see a strong argument for
users configuring this. (See also nearby thread about allowing index
AMs to control some of this.)
regards, tom lane
Greetings, On Fri, Jan 22, 2021 at 2:33 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Tommy Li <tommy@coffeemeetsbagel.com> writes: > > Additionally, is there any interest in exposing more vacuum options to be > > run by autovac? Right now it runs FREEZE and ANALYZE, which leaves the > > VERBOSE, SKIP_LOCKED, INDEX_CLEANUP, and TRUNCATE unconfigurable. > > To the extent that any of these make sense in autovacuum, I'd say they > ought to be managed automatically. I don't see a strong argument for > users configuring this. (See also nearby thread about allowing index > AMs to control some of this.) I agree that it'd be nice to figure out some way to have these managed automatically, but it's probably useful to point out to Tommy that you can set vacuum options on a table level which autovacuum should respect, such as vacuum_index_cleanup and vacuum_truncate. For skip locked, autovacuum already will automatically release it's attempt to acquire a lock if someone backs up behind it for too long. Until we get something automatic though, I could see being able to set TRUNCATE, in particular, to be off globally as useful when running a system with replicas that might end up having queries which block WAL replay. If no one is stepping up to build some way to handle that automatically then I'd be in support of making it something that an administrator can configure (to avoid having to always remember to do it for each table created...). * Tommy Li (tommy@coffeemeetsbagel.com) wrote: > > Seems like that would very soon feel like log spam. What would be the > > use case for having this on? If you want one-off results you could > > run VACUUM manually. > > In my case I have a fairly large, fairly frequently updated table with a > large number of indexes where autovacuum's runtime can fluctuate between 12 > and 24 hours. If I want to investigate why autovacuum today is running many > hours longer than it did last week, the only information I have to go off > is from pg_stat_progress_vacuum, which reports only progress based on the > number of blocks completed across _all_ indexes. > > VACUUM VERBOSE's output is nice because it reports runtime per index, which > would allow me to see if a specific index has bloated more than usual. > > I also have autovacuum throttled much more aggressively than manual > vacuums, so information from a one-off manual VACUUM isn't comparable. I tend to agree that this is pretty useful information to have included when trying to figure out what autovacuum's doing. > As for log spam, I'm not sure it's a problem as long as the verbose option > is disabled by default. While this would be in-line with our existing dismal logging defaults, I'd be happier with a whole bunch more logging enabled by default, including this, so that we don't have to tell everyone who deploys PG to go enable this very sensible logging. Arguments of 'log spam' really fall down when you're on the receiving end of practically empty PG logs and trying to figure out what's going on. Further, log analysis tools exist to aggregate this data and bring it up to a higher level for administrators. Still, I'd much rather have the option, even if disabled by default, than not have it at all. Thanks, Stephen
Attachment
Hi Stephen
> ... can set vacuum options on a table level which autovacuum should respect,
> such as vacuum_index_cleanup and vacuum_truncate. For skip locked,
> autovacuum already will automatically release it's attempt to acquire a
> lock if someone backs up behind it for too long.
> such as vacuum_index_cleanup and vacuum_truncate. For skip locked,
> autovacuum already will automatically release it's attempt to acquire a
> lock if someone backs up behind it for too long.
This is good information, I wasn't aware that autovacuum respected those settings.
In that case I'd like to focus specifically on the verbose aspect.
My first thought was a new boolean configuration called "autovacuum_verbose".
I'd want it to behave similarly to autovacuum_vacuum_cost_limit in that it can be
set globally or on a per-table basis.
Thoughts?
Tommy
On Tue, Jan 26, 2021 at 2:46 AM Tommy Li <tommy@coffeemeetsbagel.com> wrote: > > Hi Stephen > > > ... can set vacuum options on a table level which autovacuum should respect, > > such as vacuum_index_cleanup and vacuum_truncate. For skip locked, > > autovacuum already will automatically release it's attempt to acquire a > > lock if someone backs up behind it for too long. > > This is good information, I wasn't aware that autovacuum respected those settings. > In that case I'd like to focus specifically on the verbose aspect. > > My first thought was a new boolean configuration called "autovacuum_verbose". > I'd want it to behave similarly to autovacuum_vacuum_cost_limit in that it can be > set globally or on a per-table basis. I agree to have autovacuum log more information, especially index vacuums. Currently, the log related to index vacuum is only the number of index scans. I think it would be helpful if the log has more details about each index vacuum. But I'm not sure that neither always logging that nor having set the parameter per-table basis is a good idea. In the former case, it could be log spam for example in the case of anti-wraparound vacuums that vacuums on all tables (and their indexes) in the database. If we set it per-table basis, it’s useful when the user already knows which tables are likely to take a long time for autovacuum but won’t work when the users want to check the autovacuum details for tables that autovacuum could take a long time for. Given that we already have log_autovacuum_min_duration, I think this verbose logging should work together with that. I’d prefer to enable the verbose logging by default for the same reason Stephen mentioned. Or maybe we can have a parameter to control verbosity, say log_autovaucum_verbosity. Regarding when to log, we can have autovacuum emit index vacuum log after each lazy_vacuum/cleanup_index() end like VACUUM VERBOSE does, but I’m not sure how it could work together with log_autovacuum_min_duration. So one idea could be to have autovacuum emit a log for each index vacuum statistics along with the current autovacuum logs at the end of lazy vacuum if the execution time exceeds log_autovacuum_min_duration. A downside would be one autovacuum log could be very long if the table has many indexes, and we still don’t know how much time taken for each index vacuum. But you can see if a specific index has bloated more than usual. And for the latter part, we would be able to add the statistics of execution time for each vacuum phase to the log as a further improvement. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Hi Masahiko
> If we set
> it per-table basis, it’s useful when the user already knows which
> tables are likely to take a long time for autovacuum
> it per-table basis, it’s useful when the user already knows which
> tables are likely to take a long time for autovacuum
I would assume that's the default case, most apps I've seen are designed around a small
number of large tables that take up most of the maintenance effort
> after each lazy_vacuum/cleanup_index() end like VACUUM VERBOSE does,
> but I’m not sure how it could work together with
> log_autovacuum_min_duration.
I do like having this rolled into the existing configuration. This might be an absurd idea, but
what if the autovacuum process accumulates the per-index vacuum information until that
threshold is reached, and then spits out the logs all at once? And after the min duration is
passed, it just logs the rest of the index vacuum information as they occur. That way the
information is more likely to be available to investigate an abnormally long running vacuum
while it's still happening.
Tommy
On Fri, Jan 29, 2021, at 4:35 AM, Masahiko Sawada wrote:
I agree to have autovacuum log more information, especially indexvacuums. Currently, the log related to index vacuum is only the numberof index scans. I think it would be helpful if the log has moredetails about each index vacuum.
+1 for this feature. Sometimes this analysis is useful to confirm your theory;
without data, it is just a wild guess.
But I'm not sure that neither always logging that nor having set theparameter per-table basis is a good idea. In the former case, it couldbe log spam for example in the case of anti-wraparound vacuums thatvacuums on all tables (and their indexes) in the database. If we setit per-table basis, it’s useful when the user already knows whichtables are likely to take a long time for autovacuum but won’t workwhen the users want to check the autovacuum details for tables thatautovacuum could take a long time for.
I prefer a per-table parameter since it allows us a fine-grained tuning. It
covers the cases you provided above. You can disable it at all and only enable
it in critical tables or enable it and disable it for known-to-be-spam tables.
Given that we already have log_autovacuum_min_duration, I think thisverbose logging should work together with that. I’d prefer to enablethe verbose logging by default for the same reason Stephen mentioned.Or maybe we can have a parameter to control verbosity, saylog_autovaucum_verbosity.
IMO this new parameter is just an option to inject VERBOSE into VACUUM command.
Since there is already a parameter to avoid spam autovacuum messages, this
feature shouldn't hijack log_autovacuum_min_duration behavior. If the
autovacuum command execution time runs less than l_a_m_d, the output should be
discarded.
I don't have a strong opinion about this parameter name but I think your
suggestion (log_autovaccum_verbosity) is easier to guess what this parameter is
for.
On Wed, Mar 17, 2021 at 08:50:26AM -0300, Euler Taveira wrote: > Since commit 5f8727f5a6, this patch doesn't apply anymore. Fortunately, it is > just a small hunk. I reviewed this patch and it looks good to me. There is just > a small issue (double space after 'if') that I fixed in the attached version. No major objections to what you are proposing here. > /* and log the action if appropriate */ > - if (IsAutoVacuumWorkerProcess() && params->log_min_duration >= 0) > + if (IsAutoVacuumWorkerProcess()) > { > - TimestampTz endtime = GetCurrentTimestamp(); > + TimestampTz endtime = 0; > + int i; > > - if (params->log_min_duration == 0 || > - TimestampDifferenceExceeds(starttime, endtime, > - params->log_min_duration)) > + if (params->log_min_duration >= 0) > + endtime = GetCurrentTimestamp(); > + > + if (endtime > 0 && > + (params->log_min_duration == 0 || > + TimestampDifferenceExceeds(starttime, endtime, Why is there any need to actually change this part? If I am following the patch correctly, the reason why you are doing things this way is to free the set of N statistics all the time for autovacuum. However, we could make that much simpler, and your patch is already half-way through that by adding the stats of the N indexes to LVRelStats. Here is the idea: - Allocation of the N items for IndexBulkDeleteResult at the beginning of heap_vacuum_rel(). It seems to me that we are going to need the N index names within LVRelStats, to be able to still call vac_close_indexes() *before* logging the stats. - No need to pass down indstats for the two callers of lazy_vacuum_all_indexes(). - Clean up of vacrelstats once heap_vacuum_rel() is done with it. -- Michael
Attachment
On Thu, Mar 18, 2021 at 3:41 PM Michael Paquier <michael@paquier.xyz> wrote: > > On Wed, Mar 17, 2021 at 08:50:26AM -0300, Euler Taveira wrote: > > Since commit 5f8727f5a6, this patch doesn't apply anymore. Fortunately, it is > > just a small hunk. I reviewed this patch and it looks good to me. There is just > > a small issue (double space after 'if') that I fixed in the attached version. > > No major objections to what you are proposing here. > > > /* and log the action if appropriate */ > > - if (IsAutoVacuumWorkerProcess() && params->log_min_duration >= 0) > > + if (IsAutoVacuumWorkerProcess()) > > { > > - TimestampTz endtime = GetCurrentTimestamp(); > > + TimestampTz endtime = 0; > > + int i; > > > > - if (params->log_min_duration == 0 || > > - TimestampDifferenceExceeds(starttime, endtime, > > - params->log_min_duration)) > > + if (params->log_min_duration >= 0) > > + endtime = GetCurrentTimestamp(); > > + > > + if (endtime > 0 && > > + (params->log_min_duration == 0 || > > + TimestampDifferenceExceeds(starttime, endtime, > > Why is there any need to actually change this part? If I am following > the patch correctly, the reason why you are doing things this way is > to free the set of N statistics all the time for autovacuum. However, > we could make that much simpler, and your patch is already half-way > through that by adding the stats of the N indexes to LVRelStats. Here > is the idea: > - Allocation of the N items for IndexBulkDeleteResult at the beginning > of heap_vacuum_rel(). It seems to me that we are going to need the N > index names within LVRelStats, to be able to still call > vac_close_indexes() *before* logging the stats. > - No need to pass down indstats for the two callers of > lazy_vacuum_all_indexes(). > - Clean up of vacrelstats once heap_vacuum_rel() is done with it. Okay, I've updated the patch accordingly. If we add IndexBulkDeleteResult to LVRelStats I think we can remove IndexBulkDeleteResult function argument also from some other functions such as lazy_parallel_vacuum_indexes() and vacuum_indexes_leader(). Please review the attached patch. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Attachment
On Thu, Mar 18, 2021 at 9:13 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Thu, Mar 18, 2021 at 3:41 PM Michael Paquier <michael@paquier.xyz> wrote: > > > > On Wed, Mar 17, 2021 at 08:50:26AM -0300, Euler Taveira wrote: > > > Since commit 5f8727f5a6, this patch doesn't apply anymore. Fortunately, it is > > > just a small hunk. I reviewed this patch and it looks good to me. There is just > > > a small issue (double space after 'if') that I fixed in the attached version. > > > > No major objections to what you are proposing here. > > > > > /* and log the action if appropriate */ > > > - if (IsAutoVacuumWorkerProcess() && params->log_min_duration >= 0) > > > + if (IsAutoVacuumWorkerProcess()) > > > { > > > - TimestampTz endtime = GetCurrentTimestamp(); > > > + TimestampTz endtime = 0; > > > + int i; > > > > > > - if (params->log_min_duration == 0 || > > > - TimestampDifferenceExceeds(starttime, endtime, > > > - params->log_min_duration)) > > > + if (params->log_min_duration >= 0) > > > + endtime = GetCurrentTimestamp(); > > > + > > > + if (endtime > 0 && > > > + (params->log_min_duration == 0 || > > > + TimestampDifferenceExceeds(starttime, endtime, > > > > Why is there any need to actually change this part? If I am following > > the patch correctly, the reason why you are doing things this way is > > to free the set of N statistics all the time for autovacuum. However, > > we could make that much simpler, and your patch is already half-way > > through that by adding the stats of the N indexes to LVRelStats. Here > > is the idea: > > - Allocation of the N items for IndexBulkDeleteResult at the beginning > > of heap_vacuum_rel(). It seems to me that we are going to need the N > > index names within LVRelStats, to be able to still call > > vac_close_indexes() *before* logging the stats. > > - No need to pass down indstats for the two callers of > > lazy_vacuum_all_indexes(). > > - Clean up of vacrelstats once heap_vacuum_rel() is done with it. > > Okay, I've updated the patch accordingly. If we add > IndexBulkDeleteResult to LVRelStats I think we can remove > IndexBulkDeleteResult function argument also from some other functions > such as lazy_parallel_vacuum_indexes() and vacuum_indexes_leader(). > Please review the attached patch. Sorry, I attached the wrong version patch. So attached the right one. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Attachment
On Thu, Mar 18, 2021 at 5:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > Okay, I've updated the patch accordingly. If we add > IndexBulkDeleteResult to LVRelStats I think we can remove > IndexBulkDeleteResult function argument also from some other functions > such as lazy_parallel_vacuum_indexes() and vacuum_indexes_leader(). > Please review the attached patch. That seems much clearer. Were you going to take care of this, Michael? -- Peter Geoghegan
On Thu, Mar 18, 2021 at 09:38:05AM -0700, Peter Geoghegan wrote: > Were you going to take care of this, Michael? Yes, I was waiting for Sawada-san to send an updated version, which he just did. -- Michael
Attachment
On Thu, Mar 18, 2021 at 2:08 PM Michael Paquier <michael@paquier.xyz> wrote: > Yes, I was waiting for Sawada-san to send an updated version, which he > just did. Great. This really seems worth having. I was hoping that somebody else could pick this one up. -- Peter Geoghegan
On Thu, Mar 18, 2021 at 11:30:46PM +0900, Masahiko Sawada wrote: > Sorry, I attached the wrong version patch. So attached the right one. Thanks. I have been hacking aon that, and I think that we could do more in terms of integration of the index stats into LVRelStats to help with debugging issues, mainly, but also to open the door at allowing autovacuum to use the parallel path in the future. Hence, for consistency, I think that we should change the following things in LVRelStats: - Add the number of indexes. It looks rather unusual to not track down the number of indexes directly in the structure anyway, as the stats array gets added there. - Add all the index names, for parallel and non-parallel mode. - Replace the index name in the error callback by an index number, pointing back to its location in indstats and indnames. As lazy_vacuum_index() requires the index number to be set internally to it, this means that we need to pass it down across vacuum_indexes_leader(), lazy_parallel_vacuum_indexes(), but that seems like an acceptable compromise to me for now. I think that it would be good to tighten a bit more the relationship between the index stats in the DSM for the parallel case and the ones in local memory, but what we have here looks enough to me so we could figure out that as a future step. Sawada-san, what do you think? Attached is the patch I have finished with. -- Michael
Attachment
On Fri, Mar 19, 2021 at 3:14 PM Michael Paquier <michael@paquier.xyz> wrote: > > On Thu, Mar 18, 2021 at 11:30:46PM +0900, Masahiko Sawada wrote: > > Sorry, I attached the wrong version patch. So attached the right one. > > Thanks. I have been hacking aon that, and I think that we could do > more in terms of integration of the index stats into LVRelStats to > help with debugging issues, mainly, but also to open the door at > allowing autovacuum to use the parallel path in the future. Thank you for updating the patch! > Hence, > for consistency, I think that we should change the following things in > LVRelStats: > - Add the number of indexes. It looks rather unusual to not track > down the number of indexes directly in the structure anyway, as the > stats array gets added there. > - Add all the index names, for parallel and non-parallel mode. Agreed with those two changes. > - Replace the index name in the error callback by an index number, > pointing back to its location in indstats and indnames. I like this idea but I'm not sure the approach that the patch took improved the code. Please read the below my concern. > > As lazy_vacuum_index() requires the index number to be set internally > to it, this means that we need to pass it down across > vacuum_indexes_leader(), lazy_parallel_vacuum_indexes(), but that > seems like an acceptable compromise to me for now. I think that it > would be good to tighten a bit more the relationship between the index > stats in the DSM for the parallel case and the ones in local memory, > but what we have here looks enough to me so we could figure out that > as a future step. > > Sawada-san, what do you think? Attached is the patch I have finished > with. With this idea, vacuum_one_index() will become: static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats, LVDeadTuples *dead_tuples, double reltuples, LVRelStats *vacrelstats, int indnum) and the caller calls this function as follow: for (idx = 0; idx < nindexes; idx++) lazy_vacuum_index(Irel[idx], &(vacrelstats->indstats[idx]), vacrelstats->dead_tuples, vacrelstats->old_live_tuples, vacrelstats, idx); It's not bad but it seems redundant a bit to me. We pass the idx in spite of passing also Irel[idx] and &(vacrelstats->indstats[idx]). I think your first idea that is done in v4 patch (saving index names at the beginning of heap_vacuum_rel() for autovacuum logging purpose only) and the idea of deferring to close indexes until the end of heap_vacuum_rel() so that we can refer index name at autovacuum logging are more simple. BTW the patch led to a crash in my environment. The problem is here: static void -vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats, +vacuum_one_index(Relation indrel, LVShared *lvshared, LVSharedIndStats *shared_indstats, - LVDeadTuples *dead_tuples, LVRelStats *vacrelstats) + LVDeadTuples *dead_tuples, LVRelStats *vacrelstats, + int indnum) { IndexBulkDeleteResult *bulkdelete_res = NULL; + IndexBulkDeleteResult *stats; We need to initialize *stats with NULL here. And while looking at the change of vacuum_one_index() I found another problem: @@ -2349,17 +2400,20 @@ vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats, * Update the pointer to the corresponding bulk-deletion result if * someone has already updated it. */ - if (shared_indstats->updated && *stats == NULL) - *stats = bulkdelete_res; + if (shared_indstats->updated) + stats = bulkdelete_res; } + else + stats = vacrelstats->indstats[indnum]; /* Do vacuum or cleanup of the index */ if (lvshared->for_cleanup) - lazy_cleanup_index(indrel, stats, lvshared->reltuples, - lvshared->estimated_count, vacrelstats); + lazy_cleanup_index(indrel, &stats, lvshared->reltuples, + lvshared->estimated_count, vacrelstats, + indnum); else - lazy_vacuum_index(indrel, stats, dead_tuples, - lvshared->reltuples, vacelstats); + lazy_vacuum_index(indrel, &stats, dead_tuples, + lvshared->reltuples, vacrelstats, indnum); /* * Copy the index bulk-deletion result returned from ambulkdelete and If shared_indstats is NULL (e.g., we do " stats = vacrelstats->indstats[indnum];"), vacrelstats->indstats[indnum] is not updated since we pass &stats. I think we should pass &(vacrelstats->indstats[indnum]) instead in this case. Previously, we update the element of the pointer array of index statistics to the pointer pointing to either the local memory or DSM. But with the above change, we do that only when the index statistics are in the local memory. In other words, vacrelstats->indstats[i] is never updated if the corresponding index supports parallel indexes. I think this is not relevant with the change that we'd like to do here (i.e., passing indnum down). Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
On Sat, Mar 20, 2021 at 01:06:51PM +0900, Masahiko Sawada wrote: > It's not bad but it seems redundant a bit to me. We pass the idx in > spite of passing also Irel[idx] and &(vacrelstats->indstats[idx]). I > think your first idea that is done in v4 patch (saving index names at > the beginning of heap_vacuum_rel() for autovacuum logging purpose > only) and the idea of deferring to close indexes until the end of > heap_vacuum_rel() so that we can refer index name at autovacuum > logging are more simple. Okay. > We need to initialize *stats with NULL here. Right. I am wondering why I did not get any complain here. > If shared_indstats is NULL (e.g., we do " stats = > vacrelstats->indstats[indnum];"), vacrelstats->indstats[indnum] is not > updated since we pass &stats. I think we should pass > &(vacrelstats->indstats[indnum]) instead in this case. If we get rid completely of this idea around indnum, that I don't disagree with so let's keep just indname, you mean to keep the second argument IndexBulkDeleteResult of vacuum_one_index() and pass down &(vacrelstats->indstats[indnum]) as argument. No objections from me to just do that. > Previously, we update the element of the pointer array of index > statistics to the pointer pointing to either the local memory or DSM. > But with the above change, we do that only when the index statistics > are in the local memory. In other words, vacrelstats->indstats[i] is > never updated if the corresponding index supports parallel indexes. I > think this is not relevant with the change that we'd like to do here > (i.e., passing indnum down). Yeah, that looks like just some over-engineering design on my side. Would you like to update the patch with what you think is most adapted? -- Michael
Attachment
On Sat, Mar 20, 2021 at 3:40 PM Michael Paquier <michael@paquier.xyz> wrote: > > On Sat, Mar 20, 2021 at 01:06:51PM +0900, Masahiko Sawada wrote: > > It's not bad but it seems redundant a bit to me. We pass the idx in > > spite of passing also Irel[idx] and &(vacrelstats->indstats[idx]). I > > think your first idea that is done in v4 patch (saving index names at > > the beginning of heap_vacuum_rel() for autovacuum logging purpose > > only) and the idea of deferring to close indexes until the end of > > heap_vacuum_rel() so that we can refer index name at autovacuum > > logging are more simple. > > Okay. > > > We need to initialize *stats with NULL here. > > Right. I am wondering why I did not get any complain here. > > > If shared_indstats is NULL (e.g., we do " stats = > > vacrelstats->indstats[indnum];"), vacrelstats->indstats[indnum] is not > > updated since we pass &stats. I think we should pass > > &(vacrelstats->indstats[indnum]) instead in this case. > > If we get rid completely of this idea around indnum, that I don't > disagree with so let's keep just indname, you mean to keep the second > argument IndexBulkDeleteResult of vacuum_one_index() and pass down > &(vacrelstats->indstats[indnum]) as argument. No objections from me > to just do that. > > > Previously, we update the element of the pointer array of index > > statistics to the pointer pointing to either the local memory or DSM. > > But with the above change, we do that only when the index statistics > > are in the local memory. In other words, vacrelstats->indstats[i] is > > never updated if the corresponding index supports parallel indexes. I > > think this is not relevant with the change that we'd like to do here > > (i.e., passing indnum down). > > Yeah, that looks like just some over-engineering design on my side. > Would you like to update the patch with what you think is most > adapted? I've updated the patch. I saved the index names at the beginning of heap_vacuum_rel() for autovacuum logging, and add indstats and nindexes to LVRelStats. Some functions still have nindexes as a function argument but it seems to make sense since it corresponds the list of index relations (*Irel). Please review the patch. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Attachment
On Mon, Mar 22, 2021 at 12:17:37PM +0900, Masahiko Sawada wrote: > I've updated the patch. I saved the index names at the beginning of > heap_vacuum_rel() for autovacuum logging, and add indstats and > nindexes to LVRelStats. Some functions still have nindexes as a > function argument but it seems to make sense since it corresponds the > list of index relations (*Irel). Please review the patch. Going back to that, the structure of the static APIs in this file make the whole logic a bit hard to follow, but the whole set of changes you have done here makes sense. It took me a moment to recall and understand why it is safe to free *stats at the end of vacuum_one_index() and if the index stats array actually pointed to the DSM segment correctly within the shared stats. I think that there is more consolidation possible within LVRelStats, but let's leave that for another day, if there is any need for such a move. To keep it short. Sold. -- Michael
Attachment
On Tue, Mar 23, 2021 at 1:31 PM Michael Paquier <michael@paquier.xyz> wrote: > > On Mon, Mar 22, 2021 at 12:17:37PM +0900, Masahiko Sawada wrote: > > I've updated the patch. I saved the index names at the beginning of > > heap_vacuum_rel() for autovacuum logging, and add indstats and > > nindexes to LVRelStats. Some functions still have nindexes as a > > function argument but it seems to make sense since it corresponds the > > list of index relations (*Irel). Please review the patch. > > Going back to that, the structure of the static APIs in this file make > the whole logic a bit hard to follow, but the whole set of changes you > have done here makes sense. It took me a moment to recall and > understand why it is safe to free *stats at the end of > vacuum_one_index() and if the index stats array actually pointed to > the DSM segment correctly within the shared stats. > > I think that there is more consolidation possible within LVRelStats, > but let's leave that for another day, if there is any need for such a > move. While studying your patch (v5-index_stat_log.patch) I found we can polish the parallel vacuum code in some places. I'll try it another day. > > To keep it short. Sold. Thank you! Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/