
From: Alvaro Herrera
Subject: Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system
Date: 2013-02-18
Msg-id: 20130218155043.GB4739@alvh.no-ip.org
In response to: Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system  (Tomas Vondra <tv@fuzzy.cz>)
Responses: Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system  (Tomas Vondra <tv@fuzzy.cz>)
           Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system  (Jeff Janes <jeff.janes@gmail.com>)
List: pgsql-hackers
Tomas Vondra wrote:

> So, here's v10 of the patch (based on the v9+v9a), that implements the
> approach described above.
>
> It turned out to be much easier than I expected (basically just a
> rewrite of the pgstat_read_db_statsfile_timestamp() function).

Thanks.  I'm giving this another look now.  I think the new code means
we no longer need the first_write logic; just let the collector idle
until we get the first request.  (If for some reason we considered that
we should really be publishing initial stats as early as possible, we
could just do a write_statsfiles(allDbs) call before entering the main
loop.  But I don't see any reason to do this.  If you do, please speak
up.)
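
To make that concrete, here's a self-contained sketch of the loop shape
I mean (this is not the actual pgstat.c code; wait_for_message() and
write_statsfiles() are simplified stand-ins for the collector's real
message handling):

#include <stdbool.h>
#include <stdio.h>

typedef enum { MSG_INQUIRY, MSG_SHUTDOWN } MsgType;

/* Stub: pretend two inquiries arrive, then a shutdown request. */
static MsgType
wait_for_message(void)
{
    static int n = 0;

    return (n++ < 2) ? MSG_INQUIRY : MSG_SHUTDOWN;
}

/* Stub: stands in for dumping the stats files to disk. */
static void
write_statsfiles(bool allDbs)
{
    printf("writing stats files (%s)\n",
           allDbs ? "all DBs" : "requested DBs only");
}

int
main(void)
{
    /*
     * No first_write flag: the collector simply idles until the first
     * inquiry arrives.  If publishing initial stats eagerly were ever
     * wanted, a single write_statsfiles(true) call here, before the
     * loop, would cover it.
     */
    for (;;)
    {
        MsgType msg = wait_for_message();

        if (msg == MSG_SHUTDOWN)
            break;
        write_statsfiles(false);    /* an inquiry arrived; satisfy it */
    }
    write_statsfiles(true);         /* final write of everything at exit */
    return 0;
}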

Also, it seems to me that the new pgstat_db_requested() logic is
slightly bogus (in the "inefficient" sense, not the "incorrect" sense):
we should be comparing the timestamp of the request vs. what's already
on disk instead of blindly returning true if the list is nonempty.  If
the request is older than the file, we don't need to write anything and
can discard the request.  For example, suppose that backend A sends a
request for a DB; we write the file.  If then quickly backend B also
requests stats for the same DB, with the current logic we'd go write the
file, but perhaps backend B would be fine with the file we already
wrote.
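
To make the comparison concrete, here's a self-contained sketch (the
type and struct names are simplified stand-ins, not the patch's actual
definitions) of the check I have in mind:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef int64_t TimestampTz;        /* stand-in for PostgreSQL's type */

typedef struct DBWriteRequest
{
    int         databaseid;         /* which database's stats were asked for */
    TimestampTz request_time;       /* when the backend sent the inquiry */
} DBWriteRequest;

/*
 * Return true only if some pending request for this database is newer
 * than the stats file already on disk; a request older than the file
 * is already satisfied and can simply be discarded.
 */
static bool
db_needs_write(const DBWriteRequest *reqs, int nreqs,
               int databaseid, TimestampTz file_timestamp)
{
    for (int i = 0; i < nreqs; i++)
    {
        if (reqs[i].databaseid == databaseid &&
            reqs[i].request_time > file_timestamp)
            return true;            /* the on-disk file is too old */
    }
    return false;                   /* every request is covered by the file */
}

int
main(void)
{
    /* backend A asked at t=100, backend B at t=205 */
    DBWriteRequest reqs[] = {{1, 100}, {1, 205}};

    /* file written at t=210: both requests are already satisfied */
    printf("file at t=210: %s\n",
           db_needs_write(reqs, 2, 1, 210) ? "write" : "skip");
    /* file written at t=150: backend B's request still forces a write */
    printf("file at t=150: %s\n",
           db_needs_write(reqs, 2, 1, 150) ? "write" : "skip");
    return 0;
}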

Another point is that I think there's a better way to handle nonexistent
files, instead of having to read the global file and all the DB records
to find the one we want: just try to read the database file, and only
if that fails read the global file and compare the timestamp.  (This
means there may be two timestamps for each DB, one in the global file
and one in the DB-specific file; I don't think that's a problem.)  The
point is to avoid having to read the global file whenever possible.
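
Sketched out, the fallback would look something like this (a
self-contained illustration; the file names and on-disk layout here are
made up, not the patch's actual format):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef int64_t TimestampTz;        /* stand-in for PostgreSQL's type */

/* Cheap path: read the timestamp straight from the per-database file. */
static bool
read_db_file_timestamp(const char *path, TimestampTz *ts)
{
    FILE   *fp = fopen(path, "rb");
    bool    ok;

    if (fp == NULL)
        return false;               /* no per-DB file; caller falls back */
    ok = (fread(ts, sizeof(*ts), 1, fp) == 1);
    fclose(fp);
    return ok;
}

/* Stub: in real code this scans the global file's per-DB records. */
static bool
read_global_file_timestamp(const char *path, int databaseid, TimestampTz *ts)
{
    (void) path;
    (void) databaseid;
    (void) ts;
    return false;
}

static bool
get_db_stats_timestamp(int databaseid, TimestampTz *ts)
{
    char    path[64];

    snprintf(path, sizeof(path), "pg_stat_tmp/db_%d.stat", databaseid);

    /* Try the DB-specific file first... */
    if (read_db_file_timestamp(path, ts))
        return true;

    /* ...and touch the global file only if that failed. */
    return read_global_file_timestamp("pg_stat_tmp/global.stat",
                                      databaseid, ts);
}

int
main(void)
{
    TimestampTz ts;

    if (get_db_stats_timestamp(1, &ts))
        printf("timestamp for DB 1: %lld\n", (long long) ts);
    else
        printf("no readable stats for DB 1\n");
    return 0;
}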

So here's v11.  I intend to commit this shortly.  (I wanted to get it
out before lunch, but I introduced a silly bug that took me a bit to
fix.)

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
