Re: [HACKERS] More stats about skipped vacuums - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [HACKERS] More stats about skipped vacuums
Date
Msg-id CA+TgmoYZTHCFZzPrz8tG9TQknAd=MsacTy7aT2ttia=dRB2eGQ@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] More stats about skipped vacuums  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] More stats about skipped vacuums  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sat, Nov 25, 2017 at 10:34 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> If we could get rid of the copy-to-a-temporary-file technology for
> transferring the stats collector's data to backends, then this problem
> would probably vanish or at least get a lot less severe.  But that seems
> like a nontrivial project.  With the infrastructure we have today, we
> could probably keep the stats tables in a DSM segment; but how would
> a backend get a consistent snapshot of them?

I suppose the obvious approach is to have a big lock around the
statistics data proper; this could be taken in shared mode to take a
snapshot or in exclusive mode to update statistics.  In addition,
create one or more queues where statistics messages can be enqueued in
lieu of updating the main statistics data directly.  If that doesn't
perform well enough, you could keep two copies of the statistics, A
and B.  At any given time, one copy is quiescent and the other copy is
being updated.  Periodically, at a time when we know that nobody is
taking a snapshot of the statistics, they reverse roles.

Of course, the other obvious question is whether we really need a
consistent snapshot, because that's bound to be pretty expensive even
if you eliminate the I/O cost.  Taking a consistent snapshot across
all 100,000 tables in the database even if we're only ever going to
access 5 of those tables doesn't seem like a good or scalable design.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: [HACKERS] [POC] Faster processing at Gather node
Next
From: Mark Dilger
Date:
Subject: Re: [HACKERS] PATCH: multivariate histograms and MCV lists