Thread: Monitoring PostgreSQL connections using cricket and pg_stat_activity

Monitoring PostgreSQL connections using cricket and pg_stat_activity

From
Tony Wasson
Date:
I was unable to find examples of connection usage graphing and
trending. I really wanted this and after a bunch of googling, I just
made something up myself. It isn't fancy, but it seems to be working
fine for my purposes. This uses cricket to collect, store and graph
postgresql connection information. Here's what it currently graphs:

Active Connections  - COUNT(*) from pg_stat_activity

% of Connections Used (SHOW max_connections / COUNT(*) from pg_stat_activity)

Non Idle Connections - all the connections that weren't marked as <IDLE>

All Idle Connections - pg_stat_activity shows these connections as <IDLE>

Long Idle Connections - A connection that has been <IDLE> for more
than 5 minutes

Resting Connections - connections that have no query information. I
suspect these are connections starting up or shutting down, but feel
free to clue me in.

I thought about graphing the number of SELECTs/INSERTS/UPDATEs/DELETEs
currently running. If anyone is interested, I think it would be easy
to handle.

Hope you find this useful,
Tony

Attachment

Re: Monitoring PostgreSQL connections using cricket and

From
Steve Crawford
Date:
> Resting Connections - connections that have no query information. I
> suspect these are connections starting up or shutting down, but feel
> free to clue me in.

OK, this appears to be version-dependent but it can mean that
stats_query_string is false or that the user you are connecting as has
no permission to see the query of the other user.

> I thought about graphing the number of SELECTs/INSERTS/UPDATEs/DELETEs
> currently running. If anyone is interested, I think it would be easy
> to handle.

Each time you run your script you will only have a snapshot at that
instant. It might provide minimally useful information to someone who is
clear about what they are seeing but that snapshot could show a
connection as idle even though it is handling hundreds of small queries
per minute or a connection as running a query even though it sits idle
in a connection pool nearly all the time.

It certainly won't tell you the server-load (you didn't claim it would,
of course). A single huge or badly-written query can hose a server while
piles of quick queries will hardly load it at all.

A snapshot showing a large number of running queries may even be fine if
they are all backed up waiting for a few-second-long table-lock to be
released.

For finding potential problems you should consider looking for "idle in
transaction" queries - especially any that are aging as they can
indicate that something has failed to commit or rollback a transaction.
This can be especially bad on a pool-connection. Until the transaction
is closed, locks can remain in place and start to cause all sorts of
trouble.

Cheers,
Steve

Re: Monitoring PostgreSQL connections using cricket and

From
Tony Wasson
Date:
On 1/5/06, Steve Crawford <scrawford@pinpointresearch.com> wrote:
> > Resting Connections - connections that have no query information. I
> > suspect these are connections starting up or shutting down, but feel
> > free to clue me in.
>
> OK, this appears to be version-dependent but it can mean that
> stats_query_string is false or that the user you are connecting as has
> no permission to see the query of the other user.

Ok, thanks. I forgot to mention that you do have to run this as
superuser to "see" the query information.

> > I thought about graphing the number of SELECTs/INSERTS/UPDATEs/DELETEs
> > currently running. If anyone is interested, I think it would be easy
> > to handle.
>
> Each time you run your script you will only have a snapshot at that
> instant. It might provide minimally useful information to someone who is
> clear about what they are seeing but that snapshot could show a
> connection as idle even though it is handling hundreds of small queries
> per minute or a connection as running a query even though it sits idle
> in a connection pool nearly all the time.

Right, this is certainly far from comprehensive. I basically needed to
watch connection utilization so that was my primary focus. I have been
able to spot a few cases where pooling wasn't doing what we wanted or
expecting using this kind of information.

> It certainly won't tell you the server-load (you didn't claim it would,
> of course). A single huge or badly-written query can hose a server while
> piles of quick queries will hardly load it at all.

I'd suggest using this in conjuction with the system monitoring tools
available over SNMP and in cricket contrib. That can get you load,
memory, cpu, processes, packets and bandwidth. I think that get's you
a lot of what you need to keep a server happy.

> A snapshot showing a large number of running queries may even be fine if
> they are all backed up waiting for a few-second-long table-lock to be
> released.
>
> For finding potential problems you should consider looking for "idle in
> transaction" queries - especially any that are aging as they can
> indicate that something has failed to commit or rollback a transaction.
> This can be especially bad on a pool-connection. Until the transaction
> is closed, locks can remain in place and start to cause all sorts of
> trouble.

Yeah, locks can be interesting to troubleshoot. I ended up with a view
combining pg_stat_activity joined with pg_locks to see what queries
are causing what locks and how long they have been running.

On a longer term, it would be useful to NOT need to run pqa to see a
profile of what % of my queries are SELECT/INSERT/UPDATE/DELETE. Then
as things change, I'd drill down with the details pqa provides to see
"what changed". PQA rocks, but it is an annoyance to need to flick on
logging and run that through pqa.