Timing events WIP v1 - Mailing list pgsql-hackers

From: Greg Smith
Subject: Timing events WIP v1
Msg-id: 50A4BC4E.4030007@2ndQuadrant.com
List: pgsql-hackers
Attached is a first WIP saving Timing Events via a new hook, grabbed by
a new pg_timing_events contrib module.  This only implements a small
portion of the RFP spec I presented earlier this month, and is by no
means finished code looking for a regular review.  This is just enough
framework to do something useful while looking for parts that I suspect
have tricky details.

Right now this saves just a single line of text and inserts the hook
into the checkpoint code, so output looks like this:

$ psql -c "select * from pg_timing_events" -x
-[ RECORD 1 ]-
seq   | 0
event | check/restartpoint complete: wrote 0 buffers (0.0%); 0
transaction log file(s) added, 0 removed, 0 recycled; write=0.000 s,
sync=0.000 s, total=0.001 s; sync files=0, longest=0.000 s, average=0.000 s

Extending this to save the key/value set and most of the other data I
mentioned before is pretty straightforward.  There is a fair amount of
debris in here from the pg_stat_statements code this started from.
I've left a lot of that in, commented out, because it gives examples of
grabbing things like the user and database that I'll need soon.
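
The part of that scaffolding I'll be reusing is small; here's a minimal
sketch of the user/database capture, where the struct and function
names are hypothetical placeholders rather than anything in the patch:

#include "postgres.h"
#include "miscadmin.h"              /* GetUserId() and MyDatabaseId */

/* Hypothetical per-event metadata, echoing what pg_stat_statements keys on */
typedef struct TimingEventInfo
{
    Oid         userid;             /* user OID when the event fired */
    Oid         dbid;               /* database OID when the event fired */
} TimingEventInfo;

static void
fill_event_info(TimingEventInfo *info)
{
    info->userid = GetUserId();
    info->dbid = MyDatabaseId;
}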

There is, though, a modest list of things I've done or am facing soon
that involve less obvious choices, and I would appreciate feedback on
these in particular:

-This will eventually aggregate data from clients running queries, the
checkpointer process, and who knows what else in the future.  The most
challenging part of this so far is deciding how to handle the memory
management side.  I've tried just dumping everything into
TopMemoryContext, and the fact that it isn't working as hoped is the
biggest known bug right now.  I haven't decided whether to continue
doing that and just resolve the bug; whether there's an alternate
context that makes more sense for a contrib module like this; or
whether I should fix the size of the data and allocate everything at
load time.  The size of the data from
log_lock_waits in particular will be hard to estimate in advance.
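
To make the alternate-context option concrete, here's a minimal sketch
assuming a dedicated child of TopMemoryContext; the context name and
helpers are mine, not the patch's:

#include "postgres.h"
#include "utils/memutils.h"

static MemoryContext timingEventsContext = NULL;

/*
 * A child of TopMemoryContext dedicated to this module, so allocations
 * survive across queries but can be inspected or reset as a unit.
 * (This is per-backend memory only; aggregating events from other
 * processes such as the checkpointer would need shared memory instead.)
 */
static void
timing_events_init_context(void)
{
    if (timingEventsContext == NULL)
        timingEventsContext = AllocSetContextCreate(TopMemoryContext,
                                                    "pg_timing_events",
                                                    ALLOCSET_DEFAULT_MINSIZE,
                                                    ALLOCSET_DEFAULT_INITSIZE,
                                                    ALLOCSET_DEFAULT_MAXSIZE);
}

/* Copy an event string into the module's context, not the caller's */
static char *
save_event_text(const char *msg)
{
    return MemoryContextStrdup(timingEventsContext, msg);
}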

-The event queue goes into a simple array accessed in circular fashion.
After realizing that output needed to navigate in the opposite order of
element addition, I ended up just putting all the queue navigation code
directly into the add/output routines.  I'd be happy to use a more
formal Postgres type here instead--I looked at SHM_QUEUE for
example--but I haven't found anything yet that makes this any simpler.
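
For reference, the navigation logic amounts to something like this
minimal sketch; the capacity and names are hypothetical, not the
patch's actual code:

#include "postgres.h"

#define EVENT_QUEUE_SIZE 100        /* hypothetical fixed capacity */

typedef struct TimingEvent
{
    uint64      seq;                /* monotonically increasing */
    char        text[256];          /* event description, truncated */
} TimingEvent;

static TimingEvent events[EVENT_QUEUE_SIZE];
static uint64 next_seq = 0;         /* doubles as total insert count */

/* Add an event, overwriting the oldest slot once the array wraps */
static void
event_add(const char *msg)
{
    TimingEvent *e = &events[next_seq % EVENT_QUEUE_SIZE];

    e->seq = next_seq++;
    strlcpy(e->text, msg, sizeof(e->text));
}

/* Walk the queue newest-first, which is the order output wants */
static void
event_output(void (*emit) (const TimingEvent *))
{
    uint64      shown = Min(next_seq, (uint64) EVENT_QUEUE_SIZE);
    uint64      i;

    for (i = 0; i < shown; i++)
        emit(&events[(next_seq - 1 - i) % EVENT_QUEUE_SIZE]);
}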

-I modeled the hook here on the logging one that went into 9.2.  It's
defined in its own include file and gets initialized by the logging
system.  There's no strong justification for putting it there; it was
just similar to the existing hook, and that seemed reasonable.  This is
in some respects an alternate form of logging, after all.  The data
sent by the hook itself needs to be more verbose in order to handle the
full spec; right now I'm just asking about placement.
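
For concreteness, the shape follows the emit_log_hook pattern that went
in with 9.2; a minimal sketch, with every name here being hypothetical:

/* In the new include file */
typedef void (*timing_event_hook_type) (const char *event_text);

extern PGDLLIMPORT timing_event_hook_type timing_event_hook;

/* At the call site, e.g. where the checkpoint code logs completion */
if (timing_event_hook)
    (*timing_event_hook) (msg);

/* In the contrib module, chaining to any hook already installed */
static timing_event_hook_type prev_timing_event_hook = NULL;

void
_PG_init(void)
{
    prev_timing_event_hook = timing_event_hook;
    timing_event_hook = pgte_store_event;
}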

-I'm growing increasingly worried about allowing concurrent reads of
this data (which might be large and take a while to return to the
client) without blocking insertions.  The way pg_stat_statements was
split up to improve its concurrency doesn't convert naturally to this
data.  The serial number field in here is part of one idea I had.  I
might grab the header with an exclusive lock for only a moment and look
up the serial number of the last record I should display.  Then I could
drop to a share lock while looping over the elements, stopping if I
find one with a serial number over that.
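
Spelled out as a minimal sketch, assuming an LWLock guarding a shared
header and entry array (all names and layout hypothetical):

#include "postgres.h"
#include "storage/lwlock.h"

#define PGTE_MAX_EVENTS 100         /* hypothetical fixed capacity */

typedef struct pgteEntry
{
    uint64      seq;                /* 0 means the slot was never written */
    char        text[256];
} pgteEntry;

typedef struct pgteSharedState
{
    LWLockId    lock;               /* guards everything below; assigned
                                     * via LWLockAssign() at shmem init */
    uint64      last_seq;           /* newest serial number handed out */
    pgteEntry   entries[PGTE_MAX_EVENTS];
} pgteSharedState;

static void
pgte_scan_events(pgteSharedState *s, void (*emit) (const pgteEntry *))
{
    uint64      stop_seq;
    int         i;

    /* Hold the exclusive lock only long enough to snapshot the cutoff */
    LWLockAcquire(s->lock, LW_EXCLUSIVE);
    stop_seq = s->last_seq;
    LWLockRelease(s->lock);

    /*
     * The long scan runs under a share lock so readers never block each
     * other.  Anything inserted after the snapshot carries a serial
     * number above stop_seq and gets skipped, keeping the result
     * consistent even though the lock was dropped in between.
     */
    LWLockAcquire(s->lock, LW_SHARED);
    for (i = 0; i < PGTE_MAX_EVENTS; i++)
    {
        const pgteEntry *e = &s->entries[i];

        if (e->seq == 0 || e->seq > stop_seq)
            continue;               /* never written, or too new */
        emit(e);
    }
    LWLockRelease(s->lock);
}

The emit callback would have to copy each entry somewhere local, so
that nothing holds the lock while results actually stream back to the
client.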

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

