Add Postmaster health monitoring statistics - Mailing list pgsql-hackers

From Alaa Attya
Subject Add Postmaster health monitoring statistics
Date
Msg-id CAB_VXgs_b-hbCZbTgWWHp7Bro-e2+QnftbNv+q9bV6=Hgz30Vw@mail.gmail.com
List pgsql-hackers
Hello, this is a feature request that I would like to discuss with you.

Problem:

While PostgreSQL's postmaster manages all child processes, there's limited visibility into its health metrics and process management statistics. This information could be valuable for monitoring and debugging purposes.


Metrics that we can track

- The number of successful and failed process spawns
- Child process lifecycle statistics
- Time taken for process initialization (initialization time for new connections)
- Memory usage patterns
- Connection handling metrics


Suggested approach


Data structure 

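```
typedef struct PmStatData
{
	pg_atomic_uint64 successful_spawns;		/* child processes started successfully */
	pg_atomic_uint64 failed_spawns;			/* spawn attempts that failed */
	pg_atomic_uint64 child_crashes;			/* child processes that crashed */
	pg_atomic_uint64 total_connect_time;	/* cumulative connection setup time */
	pg_atomic_uint64 num_connections;		/* connections handled so far */
	TimestampTz start_time;					/* when statistics collection began */
	/* Add more metrics as needed */
} PmStatData;
```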

1. We can initialize the statistics in "PostmasterMain".
2. Whenever we spawn a new child process, we can update the metrics; this could be done in the function "StartChildProcess".
3. We can then persist the results to a new table, which could be called "pg_postmaster_stats".


This idea came to mind when I wanted to trace how memory usage grows as the postmaster forks new processes.
It would also give users historical data about postmaster spawns,
and help customers learn more about the performance of new connection handling.


Let me know if it makes sense to the Postgres community.
--
Regards,
