Re: shared-memory based stats collector - Mailing list pgsql-hackers
From | Antonin Houska |
---|---|
Subject | Re: shared-memory based stats collector |
Date | |
Msg-id | 11936.1537430127@linux-at0a Whole thread Raw |
In response to | Re: shared-memory based stats collector (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>) |
Responses |
Re: shared-memory based stats collector
|
List | pgsql-hackers |
I've spent some time reviewing this version. Design ------ 1. Even with your patch the stats collector still uses an UDP socket to receive data. Now that the shared memory API is there, shouldn't the messages be sent via shared memory queue? [1] That would increase the reliability of message delivery. I can actually imagine backends inserting data into the shared hash tables themselves, but that might make them wait if the same entries are accessed by another backend. It should be much cheaper just to insert message into the queue and let the collector process it. In future version the collector can launch parallel workers so that writes by backends do not get blocked due to full queue. 2. I think the access to the shared hash tables introduces more contention than necessary. For example, pgstat_recv_tabstat() retrieves "dbentry" and leaves the containing hash table partition locked *exclusively* even if it changes only the containing table entries, while changes of the containing dbentry are done. It appears that the shared hash tables are only modified by the stats collector. The unnecessary use of the exclusive lock might be a bigger issue in the future if the stats collector will use parallel workers. Monitoring functions and autovacuum are affected by the locking now. (I see that the it's not trivial to get just-created entry locked in shared mode: it may need a loop in which we release the exclusive lock and acquire the shared lock unless the entry was already removed.) 3. Data in both shared_archiverStats and shared_globalStats is mostly accessed w/o locking. Is that ok? I'd expect the StatsLock to be used for these. Coding ------ * git apply v4-0003-dshash-based-stats-collector.patch needed manual resolution of one conflict. * pgstat_quickdie_handler() appears to be the only "quickdie handler" that calls on_exit_reset(), although the comments are almost copy & pasted from such a handler of other processes. Can you please explain what's specific about pgstat.c? * the variable name "area" would be sufficient if it was local to some function, otherwise I think the name is too generic. * likewise db_stats is too generic for a global variable. How about "snapshot_db_stats_local"? * backend_get_db_entry() passes 0 for handle to snapshot_statentry(). How about DSM_HANDLE_INVALID ? * I only see one call of snapshot_statentry_all() and it receives 0 for handle. Thus the argument can be removed and the function does not have to attach / detach to / from the shared hash table. * backend_snapshot_global_stats() switches to TopMemoryContext before it calls pgstat_attach_shared_stats(), but the latter takes care of the context itself. * pgstat_attach_shared_stats() - header comment should explain what the return value means. * reset_dbentry_counters() does more than just resetting the counters. Name like initialize_dbentry() would be more descriptive. * typos: ** backend_snapshot_global_stats(): "liftime" -> "lifetime" ** snapshot_statentry(): "entriy" -> "entry" ** backend_get_func_etnry(): "onshot" -> "oneshot" ** snapshot_statentry_all(): "Returns a local hash contains ..." -> "Returns a local hash containing ..." [1] https://www.postgresql.org/message-id/20180711000605.sqjik3vqe5opqz33@alap3.anarazel.de -- Antonin Houska Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de, http://www.cybertec.at
pgsql-hackers by date: