Re: Background Processes and reporting - Mailing list pgsql-hackers

From: Andres Freund
Subject: Re: Background Processes and reporting
Msg-id: 20160314204206.brdp35lympce3uv6@alap3.anarazel.de
In response to: Re: Background Processes and reporting (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers

Hi,

On 2016-03-14 16:16:43 -0400, Robert Haas wrote:
> > I have already shown [0, 1] the overhead of measuring timings on Linux on
> > a representative workload. AFAIK, these tests were the only ones that showed
> > any numbers. All other statements about terrible performance have been and
> > remain unconfirmed.
>
> Of course, those numbers are substantial regressions which would
> likely make it impractical to turn this on on a heavily-loaded
> production system.

A lot of people operating production systems are fine with trading a <=
10% impact for more insight into the system, especially if that
configuration can be changed without a restart.  I know a lot of systems
that use pg_stat_statements, track_io_timing = on, etc., just to get
that. In fact there are people running perf more or less continuously in
production environments, just to get more insight.

I think it's important to expose as much information as possible without
performance overhead, so it can be enabled by default. But I don't think
it makes sense to reject features just because they cannot be enabled by
default, *if* we have tried to make them cheap enough beforehand.


> > Ok, doing it in short steps seems to be a good plan. Any objections against
> > giving people an ability to turn some feature (i.e. notorious measuring
> > timings) even if it makes some performance degradation? Of course, it should
> > be turned off by default.
>
> I am not totally opposed to that, but I think a feature that causes a
> 10% performance hit when you turn it on will be mostly useless.  The
> people who need it won't be able to risk turning it on.

That's not my experience.


> > If anything, I’m not from PostgresPro and I’m not «accusing you». But to be
> > honest current committed implementation has been tested exactly on one
> > machine with two workloads. And I think, it is somehow unfair to demand more
> > from others. Although it doesn’t mean that testing on exactly one machine
> > with only one OS is enough, of course. I suppose, you should ask the authors
> > to test it on some representative hardware and workload but if authors don’t
> > have them, it would be nice to help them with that.
>
> I'm not necessarily opposed to that, but this thread has a lot more
> heat than light

Indeed.


>, and some of the other threads on this topic have had
> the same problem. There seems to be tremendous resistance to the idea
> that recording timestamps is going to be expensive even though there
> are many previous threads on pgsql-hackers about many different
> features showing that this is true.  Somehow, I've got to justify a
> position which has been taken by many people many times before on this
> very same mailing list.  That strikes me as 100% backwards.

Agreed; I find that pretty baffling. It is especially weird that
pointing out problems like timestamp overhead generates a remarkable
amount of hostility.
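
For a rough feel for the cost being argued about, here is a minimal
sketch that times back-to-back clock reads on a POSIX system. The names
elapsed_ns and avg_clock_read_ns are hypothetical helpers, not
PostgreSQL code, and CLOCK_MONOTONIC is an assumption about which clock
is of interest. It measures only the clock read itself, not the
end-to-end effect on a workload, so it is no substitute for the
benchmarks discussed in this thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: nanoseconds elapsed between two timespec values. */
static int64_t
elapsed_ns(const struct timespec *start, const struct timespec *end)
{
    return (int64_t) (end->tv_sec - start->tv_sec) * 1000000000LL
        + (int64_t) (end->tv_nsec - start->tv_nsec);
}

/*
 * Hypothetical helper: average cost, in nanoseconds, of one
 * clock_gettime(CLOCK_MONOTONIC) call, measured over `iterations`
 * back-to-back calls.  The loop overhead is included in the result.
 */
double
avg_clock_read_ns(long iterations)
{
    struct timespec start;
    struct timespec end;
    struct timespec scratch;
    long        i;

    assert(iterations > 0);

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < iterations; i++)
        clock_gettime(CLOCK_MONOTONIC, &scratch);
    clock_gettime(CLOCK_MONOTONIC, &end);

    return (double) elapsed_ns(&start, &end) / (double) iterations;
}
```

Calling avg_clock_read_ns(1000000) from a small main() and printing the
result gives a per-call figure that varies a lot with the clock source
and virtualization. PostgreSQL ships pg_test_timing for a more careful
version of the same kind of measurement.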


> > Also it would be really interesting to hear your opinion about the initial
> > Andres’s question. Any thoughts about changing current committed
> > implementation?
>
> I'm a little vague on specifically what Andres has in mind.

That makes two of us.


> I tend to think that there's not much point in allowing
> pg_stat_get_progress_info('checkpointer') because we can just have a
> dedicated view for that sort of thing, cf. pg_stat_bgwriter, which
> seems better.

But that infrastructure isn't really suitable for exposing quickly
changing counters imo. And given that we now have a relatively generic
framework, it seems like a pain to add a custom implementation just for
the checkpointer. Also, using custom infrastructure means it's not
extensible to custom bgworkers, which doesn't seem like a good
idea. E.g. it'd be very neat to show the progress of a logical
replication catchup process that way, no?


> Exposing the wait events from background processes
> might be worth doing, but I don't think we want to add a bunch of
> dummy lines to pg_stat_activity.

Why are those dummy lines? It's activity in the cluster. We already show
autovacuum workers in there. And walsenders too, if you query the
underlying function instead of pg_stat_activity (the view hides them due
to a join to pg_database).

Andres


