Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system
Date
Msg-id 510F1B1E.8080307@fuzzy.cz
Whole thread Raw
In response to Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
On 1.2.2013 17:19, Alvaro Herrera wrote:
> Tomas Vondra wrote:
> 
>> diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
>> index be3adf1..4ec485e 100644
>> --- a/src/backend/postmaster/pgstat.c
>> +++ b/src/backend/postmaster/pgstat.c
>> @@ -64,10 +64,14 @@
>>  
>>  /* ----------
>>   * Paths for the statistics files (relative to installation's $PGDATA).
>> + * Permanent and temprorary, global and per-database files.
> 
> Note typo in the line above.
> 
>> -#define PGSTAT_STAT_PERMANENT_FILENAME        "global/pgstat.stat"
>> -#define PGSTAT_STAT_PERMANENT_TMPFILE        "global/pgstat.tmp"
>> +#define PGSTAT_STAT_PERMANENT_DIRECTORY        "stat"
>> +#define PGSTAT_STAT_PERMANENT_FILENAME        "stat/global.stat"
>> +#define PGSTAT_STAT_PERMANENT_TMPFILE        "stat/global.tmp"
>> +#define PGSTAT_STAT_PERMANENT_DB_FILENAME    "stat/%d.stat"
>> +#define PGSTAT_STAT_PERMANENT_DB_TMPFILE    "stat/%d.tmp"
> 
>> +char       *pgstat_stat_directory = NULL;
>>  char       *pgstat_stat_filename = NULL;
>>  char       *pgstat_stat_tmpname = NULL;
>> +char       *pgstat_stat_db_filename = NULL;
>> +char       *pgstat_stat_db_tmpname = NULL;
> 
> I don't like the quoted parts very much; it seems awkward to have the
> snprintf patterns in one place and have them be used in very distant

I don't see that as particularly awkward, but that's a matter of taste.
I still see that as a bunch of constants that are sprintf patterns at
the same time.

> places.  Is there a way to improve that?  Also, if I understand clearly,
> the pgstat_stat_db_filename value needs to be an snprintf pattern too,
> right?  What if it doesn't contain the required % specifier?

Ummmm, yes - it needs to be a pattern too, but the user specifies the
directory (stats_temp_directory) and this is used to derive all the
other values - see assign_pgstat_temp_directory() in guc.c.

> Also, if you can filter this through pgindent, that would be best.  Make
> sure to add DBWriteRequest to src/tools/pgindent/typedefs_list.

Will do.

> 
>> +        /*
>> +         * There's no request for this DB yet, so lets create it (allocate a
>> +         * space for it, set the values).
>> +         */
>> +        if (last_statrequests == NULL)
>> +            last_statrequests = palloc(sizeof(DBWriteRequest));
>> +        else
>> +            last_statrequests = repalloc(last_statrequests,
>> +                                (num_statrequests + 1)*sizeof(DBWriteRequest));
>> +        
>> +        last_statrequests[num_statrequests].databaseid = msg->databaseid;
>> +        last_statrequests[num_statrequests].request_time = msg->clock_time;
>> +        num_statrequests += 1;
> 
> Having to repalloc this array each time seems wrong.  Would a list
> instead of an array help?  see ilist.c/h; I vote for a dlist because you
> can easily delete elements from the middle of it, if required (I think
> you'd need that.)

Thanks. I'm not very familiar with the list interface, so I've used
plain array. But yes, there are better ways than doing repalloc all the
time.

> 
>> +        char db_statfile[strlen(pgstat_stat_db_filename) + 11];
>> +        snprintf(db_statfile, strlen(pgstat_stat_db_filename) + 11,
>> +                 pgstat_stat_filename, dbentry->databaseid);
> 
> This pattern seems rather frequent.  Can we use a macro or similar here?
> Encapsulating the "11" better would be good.  Magic numbers are evil.

Yes, this needs to be cleaned / improved.

>> diff --git a/src/include/pgstat.h b/src/include/pgstat.h
>> index 613c1c2..b3467d2 100644
>> --- a/src/include/pgstat.h
>> +++ b/src/include/pgstat.h
>> @@ -205,6 +205,7 @@ typedef struct PgStat_MsgInquiry
>>      PgStat_MsgHdr m_hdr;
>>      TimestampTz clock_time;        /* observed local clock time */
>>      TimestampTz cutoff_time;    /* minimum acceptable file timestamp */
>> +    Oid            databaseid;        /* requested DB (InvalidOid => all DBs) */
>>  } PgStat_MsgInquiry;
> 
> Do we need to support the case that somebody requests stuff from the
> "shared" DB?  IIRC that's what InvalidOid means in pgstat ...

Frankly, I don't know, but I guess we do because it was in the original
code, and there are such inquiries right after the database starts
(that's why I had to add pgstat_write_db_dummyfile).

Tomas




pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: json api WIP patch
Next
From: Noah Misch
Date:
Subject: Re: lazy_vacuum_heap()'s removal of HEAPTUPLE_DEAD tuples