Thread: Observability in Postgres
So I've been dealing a lot with building and maintaining dashboards for (fleets of) Postgres servers. And it's a pain. I have a few strongly held ideas about where the pain points are and what the right ways to tackle them are. Some of which are going to be controversial I think...

The state of the art is pg_exporter, which is a separate client that connects to the database and runs SQL queries to extract database metrics. The pain points I've seen are:

1) The exporter isn't reliable when things are failing. If your clients can't connect the exporter also can't connect, leading to data gaps in your metrics for precisely the time windows where you need data. This can happen due to connection exhaustion, xid wraparound, or even something as simple as someone taking an exclusive lock on something used in the SQL queries.

2) SQL connections are tied to specific databases within a cluster, making it hard to get data for all your databases if you have more than one. The exporter needs to reconnect to each database.

3) The exporter needs to listen on a different port from the postmaster, making it necessary to write software to manage the mapping from server port to exporter port, and that's left to the end-user as it varies from site to site.

4) The queries are customizable (the built-in ones don't exhaustively export postgres's metrics). As a result there's no standard dashboard that will work on any site out of the box. Moreover issue (3) also makes it impossible to implement one that works properly.

5) Data needs to be marshaled from shared memory into SQL, then read by the client and re-emitted in the metric format. The double processing requires writing SQL queries very carefully to avoid losing fidelity for things like LSN positions, xids, etc. Moreover the latency of gathering data from multiple SQL queries results in metrics that are often out of sync with each other, making them hard to interpret.

All this said, I think we should have a component in Postgres that reads from the stats data directly and outputs metrics in standard metrics format directly. This would probably take the form of a background worker with a few tricky bits. This would mean there would be a standard official set of metrics available that a standard dashboard could rely on to be present at any site, and it would stay reliable even if the SQL layer isn't functioning due to lack of connections or xid wraparound or locking issues.

The elephant in the room is that issue (3) requires a bit of sleight of hand. Ideally I would want it to be listening on the same ports as the database. That means having the postmaster recognize metric requests and hand them to the metrics background worker instead of a backend. I'm not sure people are going to be ok with that....

For now my approach is to implement a background worker that listens on a new port and is basically its own small web server with shared memory access. This ignores issue (3) and my hope is that when we have some experience with this approach we'll see how reliable it is and how comfortable we are with the kind of hacking in postmaster it would take to fix it. Fwiw I do think this is an important issue and not one that we can ignore indefinitely.

There is another elephant in the room (it's a big room) which is that this all makes sense for stats data. It doesn't make much sense for data that currently lives in pg_class, pg_index, etc. In other words I'm mostly solving (2) by ignoring it and concentrating on stats data. I haven't settled on a good solution for that data.
I vaguely lean towards saying that the volatile metrics in those tables should really live in stats or at least be mirrored there. That makes a clean definition of what Postgres thinks a metric is and what it thinks catalog data is. But I'm not sure that will really work in practice. In particular I think it's likely we'll need to get catalog data from every database anyways, for example to label things like tables with better labels than oids.

This work is being funded by Aiven, which is really interested in improving observability and integration between Postgres and other open source cloud software.

--
greg
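[For illustration, a minimal sketch of how such a metrics background worker could be registered through the existing bgworker API, with shared-memory access but no database connection. The module name "pg_metrics" and the entry point "metrics_worker_main" (which would open the listen socket and serve the stats data) are hypothetical, not an actual patch:]

/*
 * Hypothetical sketch: register a metrics background worker from an
 * extension's _PG_init().  "pg_metrics" / "metrics_worker_main" are
 * made-up names for illustration only.
 */
#include "postgres.h"
#include "fmgr.h"
#include "postmaster/bgworker.h"

PG_MODULE_MAGIC;

void _PG_init(void);

void
_PG_init(void)
{
    BackgroundWorker worker;

    memset(&worker, 0, sizeof(worker));
    /* shared memory access is enough; no database connection needed */
    worker.bgw_flags = BGWORKER_SHMEM_ACCESS;
    worker.bgw_start_time = BgWorkerStart_ConsistentState;
    worker.bgw_restart_time = 10;       /* restart after 10s if it dies */
    snprintf(worker.bgw_library_name, sizeof(worker.bgw_library_name),
             "pg_metrics");
    snprintf(worker.bgw_function_name, sizeof(worker.bgw_function_name),
             "metrics_worker_main");
    snprintf(worker.bgw_name, sizeof(worker.bgw_name),
             "metrics exporter worker");
    snprintf(worker.bgw_type, sizeof(worker.bgw_type),
             "metrics exporter");

    RegisterBackgroundWorker(&worker);
}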
On Mon, Feb 14, 2022 at 10:15 Greg Stark <stark@mit.edu> wrote:
> ....
> For now my approach is to implement a background worker that listens
> on a new port and is basically its own small web server with shared
> memory access
This reminds me of bg_mon (included in Spilo, the docker image used by the Zalando operator for Postgres):
Hi Greg,
On Mon, 14 Feb 2022 at 20:16, Greg Stark <stark@mit.edu> wrote:
<snip>
I agree with pretty much everything above, bar a couple of points:
- Does it really matter if metrics are exposed on a separate port from the postmaster? I actually think doing that is a good thing as it allows use of alternative listen addresses and firewalling rules; you could then confine the monitoring traffic to a management VLAN for example.
- I strongly dislike the idea of building this around the prometheus exporter format. Whilst that is certainly a useful format if you're using prom (as many do), it does have limitations and quirks that would make it painful for other systems to use; for example, the need to encode non-numeric data into labels rather than the metrics themselves (e.g. server version strings or LSNs). I would much prefer to see a common format such as JSON used by default, and perhaps offer a hook to allow alternate formatters to replace that. The prometheus format is also pretty inefficient, as you have to repeat all the key data (labels) for each individual metric.
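[As a rough illustration of the trade-off being discussed — metric names and labels here are made up, loosely modelled on what exporters emit today, not a proposed standard set. In the Prometheus/OpenMetrics text format every sample repeats its labels, and non-numeric values like a version string or an LSN have to be smuggled in as labels on a dummy "1" sample:]

# TYPE pg_stat_database_xact_commit counter
pg_stat_database_xact_commit{server="db1:5432",datname="app"} 412311
pg_stat_database_xact_rollback{server="db1:5432",datname="app"} 17
# version and LSN ride along as labels on info-style metrics with value 1
pg_info{server="db1:5432",version="14.2"} 1
pg_wal_lsn_info{server="db1:5432",lsn="0/3000148"} 1

[whereas a JSON shape can carry the shared labels once and keep non-numeric values as actual values:]

{"server": "db1:5432", "version": "14.2", "wal_lsn": "0/3000148",
 "databases": {"app": {"xact_commit": 412311, "xact_rollback": 17}}}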
On 2/15/22 07:30, Dave Page wrote:
> On Mon, 14 Feb 2022 at 20:16, Greg Stark <stark@mit.edu> wrote:
<snip>
> I agree with pretty much everything above, bar a couple of points:
>
> - Does it really matter if metrics are exposed on a separate port from
> the postmaster? I actually think doing that is a good thing as it allows
> use of alternative listen addresses and firewalling rules; you could
> then confine the monitoring traffic to a management VLAN for example.
>
> - I strongly dislike the idea of building this around the
> prometheus exporter format. Whilst that is certainly a useful format if
> you're using prom (as many do), it does have limitations and quirks that
> would make it painful for other systems to use; for example, the need to
> encode non-numeric data into labels rather than the metrics themselves
> (e.g. server version strings or LSNs). I would much prefer to see a
> common format such as JSON used by default, and perhaps offer a hook to
> allow alternate formatters to replace that. The prometheus format is
> also pretty inefficient, as you have to repeat all the key data (labels)
> for each individual metric.

+1 to Dave's points

Joe

---
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development
On Tue, Feb 15, 2022 at 1:30 PM Dave Page <dpage@pgadmin.org> wrote:
>
> Hi Greg,
>
> On Mon, 14 Feb 2022 at 20:16, Greg Stark <stark@mit.edu> wrote:
<snip>
>
> I agree with pretty much everything above, bar a couple of points:
>
> - Does it really matter if metrics are exposed on a separate port from
> the postmaster? I actually think doing that is a good thing as it allows
> use of alternative listen addresses and firewalling rules; you could
> then confine the monitoring traffic to a management VLAN for example.

+1. I think it would be much better to keep it on a separate port.

Doesn't even have to be to the point of VLANs or whatever. You just want your firewall rules to be able to know what data it's talking about.

Another part missing in the proposal is how to deal with authentication. That'll be an even harder problem if it sits on the same port but speaks a different protocol. How would it work with pg_hba etc?

> - I strongly dislike the idea of building this around the
> prometheus exporter format. Whilst that is certainly a useful format if
> you're using prom (as many do), it does have limitations and quirks that
> would make it painful for other systems to use; for example, the need to
> encode non-numeric data into labels rather than the metrics themselves
> (e.g. server version strings or LSNs). I would much prefer to see a
> common format such as JSON used by default, and perhaps offer a hook to
> allow alternate formatters to replace that. The prometheus format is
> also pretty inefficient, as you have to repeat all the key data (labels)
> for each individual metric.

There's good and bad with it. The big "good" with it is that it's an open standard (openmetrics). I think supporting that would be a very good idea. But it would also be good to have a different, "richer", format available. Whether it'd be worth it to go the full "postgresql way" and make it pluggable is questionable, but I would suggest at least having both openmetrics and a native/richer one, and not just the latter. Being able to just point your existing monitoring system at a postgres instance (with auth configured) and have things just show up is in itself a large value. (Then either pluggable or hooks beyond that, but having both those as native)

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
On Tue, 15 Feb 2022 at 16:43, Magnus Hagander <magnus@hagander.net> wrote:
>
> On Tue, Feb 15, 2022 at 1:30 PM Dave Page <dpage@pgadmin.org> wrote:
> >
> > - Does it really matter if metrics are exposed on a separate port from
> > the postmaster? I actually think doing that is a good thing as it allows
> > use of alternative listen addresses and firewalling rules; you could
> > then confine the monitoring traffic to a management VLAN for example.
>
> +1. I think it would be much better to keep it on a separate port.
>
> Doesn't even have to be to the point of VLANs or whatever. You just
> want your firewall rules to be able to know what data it's talking
> about.

I would definitely want that to be an option that could be configured. If you're deploying a server to be accessible as a public service and configuring firewall rules etc then sure you probably want to be very explicit about what is listening where.

But when you're deploying databases automatically in a clustered type environment you really want a service to deploy on a given port and have the monitoring associated with that port as well. If you deploy five databases you don't want to have to deal with five other ports for monitoring and then have to maintain a database of which monitoring ports are associated with which service ports.... It's definitely doable -- that's what people do today -- but it's a pain and it's fragile and it's different at each site which makes it impossible for dashboards to work out of the box.

> Another part missing in the proposal is how to deal with
> authentication. That'll be an even harder problem if it sits on the
> same port but speaks a different protocol. How would it work with
> pg_hba etc?

Wouldn't it make it easier to work with pg_hba? If incoming connections are coming through pg_hba then postmaster gets to accept or refuse the connection based on the host and TLS information. If it's listening on a separate port then unless that logic is duplicated it'll be stuck in a parallel world with different security rules.

I'm not actually sure how to make this work. There's a feature in Unix where a file descriptor can be passed over from one process to another over a socket but that's gotta be a portability pain. And starting a new worker for each incoming connection would be a different pain.

So right now I'm kind of guessing this might be just a hook in postmaster that we can experiment with in the module. The hook would just return a flag to postmaster saying the connection was handled.

> There's good and bad with it. The big "good" with it is that it's an
> open standard (openmetrics). I think supporting that would be a very
> good idea. But it would also be good to have a different, "richer",
> format available. Whether it'd be worth it to go the full "postgresql
> way" and make it pluggable is questionable, but I would suggest at
> least having both openmetrics and a native/richer one, and not just
> the latter. Being able to just point your existing monitoring system
> at a postgres instance (with auth configured) and have things just
> show up is in itself a large value. (Then either pluggable or hooks
> beyond that, but having both those as native)

Ideally I would want to provide OpenMetrics data that doesn't break compatibility with OpenTelemetry -- which I'm still not 100% sure I understand but I gather that means following certain conventions about metadata. But those standards only have quantitative metrics, no rich structured data.
I assume the idea is that that kind of rich structured data belongs in some other system. But I definitely see people squeezing it into metrics. For things like replication topology for example.... I would love to have a

Personally I feel similarly about the inefficiency but I think the feeling is that compression makes it irrelevant. I suspect there's a fair amount of burnout over predecessors like SNMP that went to a lot of trouble to be efficient and implementations were always buggy and impenetrable as a result. (The predecessor in Google had some features that made it slightly more efficient too but also made it more complex. It seems intentional that they didn't carry those over too)

Fwiw one constant source of pain is the insistence on putting everything into floating point numbers. They have 53 bits of precision and that leaves us not quite being able to represent an LSN or 64-bit xid for example.

--
greg
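[For reference, the Unix file-descriptor passing mentioned above works roughly like this — a minimal generic SCM_RIGHTS sketch, not Postgres code, and the portability concern raised above applies to it:]

/*
 * Send an already-accepted socket from one process (e.g. a postmaster-like
 * parent) to another over a Unix-domain socket.  The receiving side reads
 * the descriptor back out of a matching recvmsg().
 */
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

int
send_fd(int channel, int fd_to_pass)
{
    struct msghdr msg;
    struct cmsghdr *cmsg;
    struct iovec iov;
    char cbuf[CMSG_SPACE(sizeof(int))];
    char byte = 'F';            /* must send at least one data byte */

    memset(&msg, 0, sizeof(msg));
    memset(cbuf, 0, sizeof(cbuf));
    iov.iov_base = &byte;
    iov.iov_len = 1;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = cbuf;
    msg.msg_controllen = sizeof(cbuf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;       /* ancillary data is an open fd */
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd_to_pass, sizeof(int));

    return sendmsg(channel, &msg, 0) < 0 ? -1 : 0;
}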
On Tue, Feb 15, 2022 at 11:24 PM Greg Stark <stark@mit.edu> wrote:
>
> On Tue, 15 Feb 2022 at 16:43, Magnus Hagander <magnus@hagander.net> wrote:
<snip>
>
> I would definitely want that to be an option that could be configured.
> If you're deploying a server to be accessible as a public service and
> configuring firewall rules etc then sure you probably want to be very
> explicit about what is listening where.
>
> But when you're deploying databases automatically in a clustered type
> environment you really want a service to deploy on a given port and
> have the monitoring associated with that port as well. If you deploy
> five databases you don't want to have to deal with five other ports
> for monitoring and then have to maintain a database of which
> monitoring ports are associated with which service ports.... It's
> definitely doable -- that's what people do today -- but it's a pain
> and it's fragile and it's different at each site which makes it
> impossible for dashboards to work out of the box.

I really don't see the problem with having the monitoring on a different port.

I *do* see the problem with having a different monitoring port for each database in a cluster, if that's what you're saying. Definitely.

But if it's 5432 for the database and 8432 for the monitoring for example, I'd see that as an improvement. And if you're deploying a larger cluster you're auto-configuring these things anyway so to have your environment always set "monitoring port = database port + 3000" for example should be trivial.

> > Another part missing in the proposal is how to deal with
> > authentication. That'll be an even harder problem if it sits on the
> > same port but speaks a different protocol. How would it work with
> > pg_hba etc?
>
> Wouldn't it make it easier to work with pg_hba? If incoming
> connections are coming through pg_hba then postmaster gets to accept
> or refuse the connection based on the host and TLS information. If
> it's listening on a separate port then unless that logic is duplicated
> it'll be stuck in a parallel world with different security rules.

I guess you could map it against yet another virtual database, like we do with streaming replication.

> I'm not actually sure how to make this work. There's a feature in Unix
> where a file descriptor can be passed over from one process to another
> over a socket but that's gotta be a portability pain. And starting a
> new worker for each incoming connection would be a different pain.
>
> So right now I'm kind of guessing this might be just a hook in
> postmaster that we can experiment with in the module. The hook would
> just return a flag to postmaster saying the connection was handled.

If it was as easy as username/password you could just have a comm channel between postmaster and a bgworker for example. But you also have to implement things like GSSAPI authentication.

But I think you'll run into a different problem much earlier.
Pretty much everything out there is going to want to speak http(s). How are you going to terminate that, especially https, on the same port as a PostgreSQL connection? PostgreSQL will have to reply with its initial negotiating byte before anything else is done, including the TLS negotiation, and that will kill anything http. And if your metrics endpoint isn't going to speak http, you've given up the ability for the "plug and play" setup.

> > There's good and bad with it. The big "good" with it is that it's an
> > open standard (openmetrics). I think supporting that would be a very
> > good idea. But it would also be good to have a different, "richer",
> > format available. Whether it'd be worth it to go the full "postgresql
> > way" and make it pluggable is questionable, but I would suggest at
> > least having both openmetrics and a native/richer one, and not just
> > the latter. Being able to just point your existing monitoring system
> > at a postgres instance (with auth configured) and have things just
> > show up is in itself a large value. (Then either pluggable or hooks
> > beyond that, but having both those as native)
>
> Ideally I would want to provide OpenMetrics data that doesn't break
> compatibility with OpenTelemetry -- which I'm still not 100% sure I
> understand but I gather that means following certain conventions about
> metadata. But those standards only have quantitative metrics, no rich
> structured data.

Yeah. That's why I think the reasonable thing is to provide both.

> I assume the idea is that that kind of rich structured data belongs in
> some other system. But I definitely see people squeezing it into
> metrics. For things like replication topology for example.... I would
> love to have a

.... love to have a completed sentence there? :)

> Personally I feel similarly about the inefficiency but I think the
> feeling is that compression makes it irrelevant. I suspect there's a
> fair amount of burnout over predecessors like SNMP that went to a lot
> of trouble to be efficient and implementations were always buggy and
> impenetrable as a result. (The predecessor in Google had some features
> that made it slightly more efficient too but also made it more
> complex. It seems intentional that they didn't carry those over too)

Given the amount of metrics that people pull in through prometheus or similar today, I think most have just accepted the overhead. You can of course serve it over protobufs to make it more efficient, but yes the basic part is still there -- but overall the market has spoken and said they accept that one.

> Fwiw one constant source of pain is the insistence on putting
> everything into floating point numbers. They have 53 bits of precision
> and that leaves us not quite being able to represent an LSN or 64-bit
> xid for example.

Yeah, it's clearly designed around certain types of metrics only...

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
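[To make the floating-point point concrete: an IEEE double has a 53-bit significand, so a 64-bit LSN or xid stops round-tripping once it exceeds 2^53. A tiny self-contained demonstration, with a made-up large LSN value:]

#include <stdio.h>
#include <stdint.h>

int
main(void)
{
    uint64_t lsn = (UINT64_C(1) << 60) + 1;     /* hypothetical large LSN */
    double metric = (double) lsn;               /* what a float-only format stores */

    printf("original   : %llu\n", (unsigned long long) lsn);
    printf("via double : %llu\n", (unsigned long long) metric);
    /* prints two different numbers: the low-order bits were rounded away */
    return 0;
}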
I am curious what you mean by standard metrics format? I am all for standards-based approaches, but what are those in the case of DBs? For environments where O11y matters a lot, I think the challenge lies in mapping specific query executions back to system characteristics. I am just thinking aloud as a newbie to this community.
On Mon, Feb 14, 2022, 12:16 PM Greg Stark <stark@mit.edu> wrote:
<snip>
On Tue, 15 Feb 2022 at 22:48, Stephan Doliov <stephan.doliov@gmail.com> wrote:
>
> I am curious what you mean by standard metrics format? I am all for
> standards-based approaches, but what are those in the case of DBs? For
> environments where O11y matters a lot, I think the challenge lies in
> mapping specific query executions back to system characteristics. I am
> just thinking aloud as a newbie to this community.

I was about to reply simply that the standard the open source world is coalescing around is OpenMetrics. This is basically a codification of the prometheus format with some extra features.

But on further thought I think what you're asking is whether there are standard database metrics and standard names for them. A lot of this work has already been done with pg_exporter but it is worth looking at other database software and seeing if there are opportunities to standardize metric naming across databases.

I don't really expect a lot of deep examples of that though. As soon as you dig under the surface the differences quickly add up, so you'll have different labels with different semantics. But sure, it would be nice if you could do simple high level queries like "number of active connections" using the same metrics on mysql, postgres, and ... whatever.

--
greg
On Tue, 15 Feb 2022 at 17:37, Magnus Hagander <magnus@hagander.net> wrote:
>
> On Tue, Feb 15, 2022 at 11:24 PM Greg Stark <stark@mit.edu> wrote:
> >
> > On Tue, 15 Feb 2022 at 16:43, Magnus Hagander <magnus@hagander.net> wrote:
>
> I really don't see the problem with having the monitoring on a different port.
>
> I *do* see the problem with having a different monitoring port for
> each database in a cluster, if that's what you're saying. Definitely.

No, I'm talking about each cluster. Like, say you deploy software that has an embedded postgres database in it and doesn't know about your custom port mapping scheme. Or you are trying to pack as many as you can on your servers and they're dynamically allocating ports when the jobs start.

> But if it's 5432 for the database and 8432 for the monitoring for
> example, I'd see that as an improvement. And if you're deploying a
> larger cluster you're auto-configuring these things anyway so to have
> your environment always set "monitoring port = database port + 3000"
> for example should be trivial.

It's definitely doable -- it's what people do today -- but it would be better if people didn't have to do this. In particular the thing that really bothers me is that it's one of the reasons you can't write dashboards (or alerting rules or recording rules) that work out of the box. Your custom +3000 rule is not something that service discovery tools or dashboards are going to know about.

And when you try to use the metrics for anything further you run into issues. Like, if you have metrics from clients about connection errors -- they'll have labels for the database connection address. Or if you want to use OS metrics for the network traffic -- same thing. Or if you want to use metrics about replication from replicas...

> But I think you'll run into a different problem much earlier. Pretty
> much everything out there is going to want to speak http(s). How are
> you going to terminate that, especially https, on the same port as a
> PostgreSQL connection? PostgreSQL will have to reply with its initial
> negotiating byte before anything else is done, including the TLS
> negotiation, and that will kill anything http.

Yeah this is a serious problem. I think there are other even more compelling reasons someone else was already looking at this so I'm kind of hoping it solves itself :)

> > I assume the idea is that that kind of rich structured data belongs in
> > some other system. But I definitely see people squeezing it into
> > metrics. For things like replication topology for example.... I would
> > love to have a
>
> .... love to have a completed sentence there? :)

Oops :) I think I was going to say something like: I would love to have a system like this but I don't know of one. I mean there are plenty of tools that could be used to build this but nothing that does it for you.

--
greg
Hi,

On Mon, Feb 14, 2022 at 03:15:14PM -0500, Greg Stark wrote:
>
> [...]
>
> 2) SQL connections are tied to specific databases within a cluster.
> Making it hard to get data for all your databases if you have more
> than one. The exporter needs to reconnect to each database.
>
> 3) The exporter needs to listen on a different port from the
> postmaster. Making it necessary to write software to manage the
> mapping from server port to exporter port and that's left to the
> end-user as it varies from site to site.
>
> 4) The queries are customizable (the built-in ones don't exhaustively
> export postgres's metrics). As a result there's no standard
> dashboard that will work on any site out of the box. Moreover issue
> (3) also makes it impossible to implement one that works properly.
>
> [...]
>
> All this said, I think we should have a component in Postgres that
> reads from the stats data directly and outputs metrics in standard
> metrics format directly. This would probably take the form of a
> background worker with a few tricky bits.

But having a background worker for that will bring its own set of (new) problems. I never really had a problem with (3), and even if we fixed that users will still have to rely on mapping for other products they monitor, so I don't see that as a really big issue.

Also I don't think that having such a component directly embedded in postgres is a good idea, as it means it would be tied to major version releases. I don't think anyone will like to hear "sorry you need to upgrade to a new postgres major version to monitor X even if the data is available in the catalogs". It also means that you will now maybe have different standard metric definitions depending on the major version, which seems to contradict (4).

> There is another elephant in the room (it's a big room) which is that
> this all makes sense for stats data. It doesn't make much sense for
> data that currently lives in pg_class, pg_index, etc. In other words
> I'm mostly solving (2) by ignoring it and concentrating on stats data.
>
> I haven't settled on a good solution for that data. I vaguely lean
> towards saying that the volatile metrics in those tables should really
> live in stats or at least be mirrored there. That makes a clean
> definition of what Postgres thinks a metric is and what it thinks
> catalog data is. But I'm not sure that will really work in practice.
> In particular I think it's likely we'll need to get catalog data from
> every database anyways, for example to label things like tables with
> better labels than oids.

I also don't think that sending those data in stats is going to work, which makes me quite worried about spending a lot of effort on a solution that has problematic limitations for something as useful as database-specific metrics.
On Wed, 2022-02-16 at 02:10 -0500, Greg Stark wrote:
> On Tue, 15 Feb 2022 at 17:37, Magnus Hagander <magnus@hagander.net> wrote:
> > But I think you'll run into a different problem much earlier. Pretty
> > much everything out there is going to want to speak http(s). How are
> > you going to terminate that, especially https, on the same port as a
> > PostgreSQL connection? PostgreSQL will have to reply with its initial
> > negotiating byte before anything else is done, including the TLS
> > negotiation, and that will kill anything http.
>
> Yeah this is a serious problem. I think there are other even more
> compelling reasons someone else was already looking at this so I'm
> kind of hoping it solves itself :)

Yeah, this seems like a shoo-in with implicit TLS support and ALPN. So hopefully we can help that piece solve itself. :)

That said, I feel like I should probably advise against forwarding HTTP through Postgres. With implicit TLS you should be able to run a reverse proxy out front, which could check the ALPN and redirect traffic to the bgworker port as needed. (I don't think you have to terminate TLS in order to do this -- so channel bindings et al should be unaffected -- but I don't have experience with that.)

So Postgres wouldn't have to touch HTTP traffic at all, and the bgworker extension can upgrade its HTTP stack completely independently. (And if you don't want to share ports, you don't have to deploy the proxy at all.)

--Jacob
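[For what it's worth, the proxy setup described above could look roughly like this with nginx's stream/ssl_preread module. Everything here is an assumption for illustration: it presumes a Postgres that accepts implicit TLS and advertises a "postgresql" ALPN token, and the ports and addresses are made up:]

stream {
    # Route on the ALPN protocols offered in the TLS ClientHello,
    # without terminating TLS (so channel binding is unaffected).
    map $ssl_preread_alpn_protocols $backend {
        ~\bpostgresql\b   127.0.0.1:6432;   # database traffic -> Postgres
        default           127.0.0.1:8432;   # http(s) -> metrics bgworker
    }

    server {
        listen 5432;
        ssl_preread on;
        proxy_pass $backend;
    }
}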
Hi,

On Wed, Feb 16, 2022 at 12:48:11AM -0500, Greg Stark wrote:
> But on further thought I think what you're asking is whether there are
> standard database metrics and standard names for them. A lot of this
> work has already been done with pg_exporter but it is worth looking at
> other database software and seeing if there are opportunities to
> standardize metric naming across databases.

Can you clarify what exactly you mean with "pg_exporter"? I think you mentioned it upthread as well.

https://github.com/Vonng/pg_exporter (90 GH stars, never heard of it)?

https://github.com/prometheus-community/postgres_exporter (1.6k GH stars)?

Something else?


Michael

--
Michael Banck
Teamleiter PostgreSQL-Team
Projektleiter
Tel.: +49 2166 9901-171
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Geoff Richardson, Peter Lilley

Our handling of personal data is subject to the following provisions: https://www.credativ.de/datenschutz