Re: [BUGS] BUG #14714: long running sessions from remote instance seems to hang some times - Mailing list pgsql-bugs

From Josef Machytka
Subject Re: [BUGS] BUG #14714: long running sessions from remote instance seems to hang some times
Date
Msg-id CAGvVEFu4boL+zuZ07f25_apy=-MTdBNc3a9oAm=bV_Sww8fcwA@mail.gmail.com
In response to Re: [BUGS] BUG #14714: long running sessions from remote instance seems to hang some times  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
When there are no queries running on the database, I see the number of context switches range from ~900 to ~1500.

I checked our monitoring, which is based on node_exporter, Prometheus and Grafana. When this "drowsiness" of sessions happens, I see the number of context switches range from ~10 000 to ~25 000, with a few peaks of ~28 000.

But I also see similar or even slightly higher numbers in other cases of high load on the database, and those did not cause any "drowsy" sessions. In fact, monitoring does not show any significant difference between cases with "drowsy" sessions and cases without them. If you have any advice about what else to monitor, I will add it.
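
For anyone who wants to cross-check the context switch rate outside of Grafana, here is a minimal sketch in Python. It assumes a Linux host, where /proc/stat contains a "ctxt" line holding the cumulative number of context switches since boot. The function names (parse_ctxt, cs_rate, read_ctxt, sample) are made up for this illustration and are not part of any tool mentioned in this thread.

```python
import time


def parse_ctxt(proc_stat_text: str) -> int:
    """Return the cumulative context-switch count from /proc/stat contents."""
    for line in proc_stat_text.splitlines():
        fields = line.split()
        # The line of interest looks like: "ctxt 123456789"
        if len(fields) >= 2 and fields[0] == "ctxt":
            return int(fields[1])
    raise ValueError("no 'ctxt' line found")


def cs_rate(prev_count: int, curr_count: int, elapsed_s: float) -> float:
    """Context switches per second between two cumulative readings."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return (curr_count - prev_count) / elapsed_s


def read_ctxt(path: str = "/proc/stat") -> int:
    """Read the current cumulative count (assumes a Linux /proc filesystem)."""
    with open(path) as f:
        return parse_ctxt(f.read())


def sample(interval_s: float = 1.0, samples: int = 5) -> None:
    """Print the context-switch rate once per interval."""
    prev, prev_t = read_ctxt(), time.monotonic()
    for _ in range(samples):
        time.sleep(interval_s)
        curr, now = read_ctxt(), time.monotonic()
        print(f"cs/s: {cs_rate(prev, curr, now - prev_t):.0f}")
        prev, prev_t = curr, now
```

Calling sample() on the database host prints one rate per second, which should be roughly comparable to the cs column of "vmstat 1".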

On 23 June 2017 at 16:14, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Josef Machytka <josef.machytka@gmail.com> writes:
> sorry, here is file with sample - basically several group by aggregations
> over different partitioned tables have been locked in this "drowsy" state

The amount of time being spent in the kernel is strikingly high.  I wonder
if you're seeing some variant of the old "context swap storm" problem.
Try watching the output of "vmstat 1" for awhile to see if the cs rate
is high.

                        regards, tom lane
