Re: Planning performance problem (67626.278ms) - Mailing list pgsql-performance

From David Rowley
Subject Re: Planning performance problem (67626.278ms)
Date
Msg-id CAApHDvq9o1J45is6w-3Gv1UgtuDQ4eQP4UhFPQSnQmEH_EijvQ@mail.gmail.com
Whole thread Raw
In response to Re: Planning performance problem (67626.278ms)  (Jeremy Schneider <schnjere@amazon.com>)
Responses Re: Planning performance problem (67626.278ms)
List pgsql-performance
On Thu, 22 Apr 2021 at 00:03, Jeremy Schneider <schnjere@amazon.com> wrote:
>
> Two years later, I still remember this. And today I just confirmed
> someone hitting this on open source PG13.

The only thing that changed about get_actual_variable_range() is that
it now uses a SnapshotNonVacuumable snapshot. Previously a
long-running transaction could have caused vacuum to be unable to
remove tuples which could have caused get_actual_variable_range() to
be slow if it had to skip the unvacuumable tuples.

That's now changed as the SnapshotNonVacuumable will see any tuples
required by that long-running transaction and use that to determine
the range instead of skipping over it.

Anyone with a large number of tuples that vacuum can remove that are
at either end of the range on a column that is indexed by a btree
index could still have issues.  Vacuuming more often might be a good
thing to consider.  With the original report on this thread there were
more dead tuples in the table than live tuples.  Disabling auto-vacuum
or tuning it so it waits that long is likely a bad idea.

FWIW, here's a simple test case that shows the problem in current master.

create table a (a int primary key) with (autovacuum_enabled = off);
insert into a select x from generate_series(1,10000000) x;
analyze a;
delete from a;
\timing on
explain select * from a where a < 10000000;
                         QUERY PLAN
------------------------------------------------------------
 Seq Scan on a  (cost=0.00..169247.71 rows=9998977 width=4)
   Filter: (a < 10000000)
(2 rows)


Time: 9062.600 ms (00:09.063)

vacuum a;
explain select * from a where a < 10000000;
                   QUERY PLAN
-------------------------------------------------
 Seq Scan on a  (cost=0.00..0.00 rows=1 width=4)
   Filter: (a < 10000000)
(2 rows)

Time: 2.665 ms

Notice that it became faster again after I did a vacuum.

David



pgsql-performance by date:

Previous
From: Pavan Pusuluri
Date:
Subject: Re: OLEDB for PostgreSQL
Next
From: Tom Lane
Date:
Subject: Re: Planning performance problem (67626.278ms)