Re: should check interrupts in BuildRelationExtStatistics ? - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: should check interrupts in BuildRelationExtStatistics ?
Date
Msg-id 20220603152837.GO29853@telsasoft.com
Whole thread Raw
In response to Re: should check interrupts in BuildRelationExtStatistics ?  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: should check interrupts in BuildRelationExtStatistics ?
List pgsql-hackers
On Mon, May 09, 2022 at 09:11:37AM -0400, Robert Haas wrote:
> On Sun, May 8, 2022 at 11:36 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Michael Paquier <michael@paquier.xyz> writes:
> > > How long can the backend remain unresponsive?  I don't think that
> > > anybody would object to the addition of some CHECK_FOR_INTERRUPTS() in
> > > areas where it would be efficient to make the shutdown quicker, but
> > > we need to think carefully about the places where we'd want to add
> > > these.
> >
> > CHECK_FOR_INTERRUPTS is really quite cheap, just a test-and-branch.
> > I wouldn't put it in a *very* tight loop, but one test per row
> > processed while gathering stats is unlikely to be a problem.
> 
> +1. If we're finding things stalling that would be fixed by adding
> CHECK_FOR_INTERRUPTS(), we should generally just add it. In the
> unlikely event that this causes a performance problem, we can try to
> figure out some other solution, but not responding to interrupts isn't
> the right way to economize.

Reproduce the problem for ndistinct and dependencies like:

DROP TABLE t; CREATE TABLE t AS SELECT i A,1+i B,2+i C,3+i D,4+i E,5+i F, now() AS ts FROM generate_series(1.0,
99999.0)i;VACUUM t;
 
DROP STATISTICS stxx; CREATE STATISTICS stxx (ndistinct) ON mod(a,14),mod(b,15),mod(c,16),mod(d,17),mod(e,18),mod(f,19)
FROMt;
 
ANALYZE VERBOSE t;

Maybe this should actually call vacuum_delay_point(), like
compute_scalar_stats().  For MCV, there seems to be no issue, since those
functions are being called (but only for expressional stats).  But maybe I've
just failed to make a large enough, non-expressional MCV list for the problem
to be apparent.

The patch is WIP, but whatever we end up with should probably be backpatched at
least to v14, where expressional indexes were introduced, since they're likely
to have more columns, and are slower to compute.

diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index e8f71567b4e..e5538dcd4e1 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -19,6 +19,7 @@
 #include "catalog/pg_statistic_ext.h"
 #include "catalog/pg_statistic_ext_data.h"
 #include "lib/stringinfo.h"
+#include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
 #include "nodes/nodes.h"
 #include "nodes/pathnodes.h"
@@ -383,6 +384,8 @@ statext_dependencies_build(StatsBuildData *data)
             MVDependency *d;
             MemoryContext oldcxt;
 
+            CHECK_FOR_INTERRUPTS();
+
             /* release memory used by dependency degree calculation */
             oldcxt = MemoryContextSwitchTo(cxt);
 
diff --git a/src/backend/statistics/mcv.c b/src/backend/statistics/mcv.c
index dd67b19b6fa..9db1d0325cd 100644
--- a/src/backend/statistics/mcv.c
+++ b/src/backend/statistics/mcv.c
@@ -269,6 +269,8 @@ statext_mcv_build(StatsBuildData *data, double totalrows, int stattarget)
         SortItem  **freqs;
         int           *nfreqs;
 
+        // CHECK_FOR_INTERRUPTS();
+
         /* used to search values */
         tmp = (MultiSortSupport) palloc(offsetof(MultiSortSupportData, ssup)
                                         + sizeof(SortSupportData));
diff --git a/src/backend/statistics/mvdistinct.c b/src/backend/statistics/mvdistinct.c
index 6ade5eff78c..3b739ab7ca0 100644
--- a/src/backend/statistics/mvdistinct.c
+++ b/src/backend/statistics/mvdistinct.c
@@ -29,6 +29,7 @@
 #include "catalog/pg_statistic_ext.h"
 #include "catalog/pg_statistic_ext_data.h"
 #include "lib/stringinfo.h"
+#include "miscadmin.h"
 #include "statistics/extended_stats_internal.h"
 #include "statistics/statistics.h"
 #include "utils/fmgrprotos.h"
@@ -114,6 +115,8 @@ statext_ndistinct_build(double totalrows, StatsBuildData *data)
             MVNDistinctItem *item = &result->items[itemcnt];
             int            j;
 
+            CHECK_FOR_INTERRUPTS();
+
             item->attributes = palloc(sizeof(AttrNumber) * k);
             item->nattributes = k;
 



pgsql-hackers by date:

Previous
From: Nathan Bossart
Date:
Subject: Re: Proposal: adding a better description in psql command about large objects
Next
From: Tom Lane
Date:
Subject: Re: [PATCH] fix doc example of bit-reversed MAC address