Re: Function to track shmem reinit time - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: Function to track shmem reinit time
Msg-id CAPpHfdsUNrKezRww5Sak2Dd_u9ZV-iPHLRLVRtA9fHfR4K5phw@mail.gmail.com
In response to Re: Function to track shmem reinit time  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sat, Feb 22, 2020 at 8:01 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> > From my point of view criticism of this patch was addressed by
> > argument, that pg_shmem_init_time() allows to calculate the server
> > uptime [1].  This is very basic information, which is reasonable to
> > get without log files parsing.  It's more than year since [1] is
> > unanswered.  So, I'm going to push this if no objections.
>
> I'm still going to object to it, on the grounds that

OK!

> (1) it's exposing an implementation detail that clients should not be
> concerned with, and that we might change in future.  The name isn't
> even well chosen --- if the argument is that it's useful to monitor
> server uptime, why isn't the name "pg_server_uptime"?

Choosing a more user-friendly name sounds like a good idea to me.
However, pg_server_uptime() sounds like it should return an interval.
I wouldn't like this function to calculate a diff between timestamps;
I would rather delegate that to the user side.  What about
pg_server_up_since()?
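
To illustrate what delegating the diff to the user side could look like,
here is a minimal client-side sketch.  It assumes the function returns a
timezone-aware timestamp; the helper name uptime() and the sample values
are hypothetical and not part of the patch.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def uptime(up_since: datetime, now: Optional[datetime] = None) -> timedelta:
    """Compute uptime on the client from a reported 'up since' timestamp.

    up_since stands in for the value a function such as the proposed
    pg_server_up_since() might return (an assumption for illustration).
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - up_since


# Hypothetical sample values, only to show the arithmetic.
reported = datetime(2020, 2, 22, 12, 0, tzinfo=timezone.utc)
checked_at = datetime(2020, 2, 22, 20, 1, tzinfo=timezone.utc)
print(uptime(reported, checked_at))  # 8:01:00
```

Returning a timestamp keeps the server-side function trivial, and the
client can still derive an interval whenever it needs one.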

> (2) if your server is crashing often enough that postmaster start
> time isn't an adequate substitute, you have way worse problems than
> whether you can measure it.

Well, it is enough for the server to crash once per postmaster run to
make this measure absolutely inadequate.  There are different reasons
for a server crash, and not all of them are actual bugs.  The OOM killer
can cause a crash, and it doesn't seem feasible to exclude this cause
completely.
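
A small sketch of why a single crash makes the postmaster start time
misleading.  The helper up_since() and all timestamps below are
hypothetical; it only assumes that each crash recovery reinitializes
shared memory while the postmaster itself keeps running.

```python
from datetime import datetime, timezone
from typing import Iterable


def up_since(postmaster_start: datetime,
             reinit_times: Iterable[datetime]) -> datetime:
    """Return the moment the server last became available.

    That is the latest shared-memory reinitialization if there was one,
    otherwise the postmaster start time.
    """
    return max([postmaster_start, *reinit_times])


# Hypothetical timeline: one OOM-induced restart during a long postmaster run.
postmaster_start = datetime(2020, 1, 1, tzinfo=timezone.utc)
oom_restart = datetime(2020, 2, 20, tzinfo=timezone.utc)
now = datetime(2020, 2, 22, tzinfo=timezone.utc)

naive_uptime = now - postmaster_start                          # 52 days
real_uptime = now - up_since(postmaster_start, [oom_restart])  # 2 days
print(naive_uptime.days, real_uptime.days)  # 52 2
```

With no crashes the two values coincide, so the measure only differs in
exactly the cases monitoring cares about.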

> I'd rather see us put effort into
> fixing whatever the underlying problem is.

This is a tradeoff between monitoring a problem and fixing it.  We had
a similar case with lwlock wait monitoring.  Ideally, our code should
never get stuck on an lwlock, but that's not feasible.  So, we got
lwlock wait monitoring for problem diagnosis.  I think we're now
discussing a similar issue.  Ideally, postgres should never crash, but
that's too hard to achieve.  We can, however, easily get better
monitoring of server crashes, and that's useful.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


