From Greg Smith
Subject Re: Timing events WIP v1
Date
Msg-id 50F46C94.5030906@2ndQuadrant.com
In response to Re: Timing events WIP v1  (Peter Geoghegan <peter@2ndquadrant.com>)
List pgsql-hackers
On 1/14/13 11:19 AM, Peter Geoghegan wrote:
> I noticed that when !log_checkpoints, control never reaches the site
> where the hook is called, and thus the checkpoint info is not stored.
> Is that the intended behaviour of the patch?

I was aware of that and considered it a defensible situation--that 
turning off checkpoint logging takes that data out everywhere.  But it 
was not necessarily the right thing to do.
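
Roughly the shape of it, with invented names (checkpoint_timing_hook, 
CheckpointTimingData, ReportCheckpointDone) standing in for whatever 
the patch actually uses:

#include "postgres.h"
#include "access/xlog.h"        /* for log_checkpoints */

/*
 * Illustrative sketch only: these names are assumptions, not the
 * patch's definitions.
 */
typedef struct CheckpointTimingData
{
    int         buffers_written;
    double      write_secs;
    double      sync_secs;
} CheckpointTimingData;

typedef void (*checkpoint_timing_hook_type) (const CheckpointTimingData *data);
checkpoint_timing_hook_type checkpoint_timing_hook = NULL;

static void
ReportCheckpointDone(const CheckpointTimingData *data)
{
    if (log_checkpoints)
    {
        /* existing server log output happens here ... */

        /* current behavior: hook only fires when log_checkpoints is on */
        if (checkpoint_timing_hook)
            checkpoint_timing_hook(data);
    }

    /*
     * The arguably more correct alternative: move the hook call out of
     * the branch above, so disabling log_checkpoints only silences the
     * log line, not the timing event.
     */
}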

> Currently, explain.c
> manages to generate JSON representations of plans with very little
> fuss, and without using any of the JSON datatype stuff. Doing this
> with hstore is just way too controversial, and not particularly worth
> it, IMHO.

I wasn't optimistic after seeing the number of bugs that scurried out 
when the hstore rock was turned over for this job.  On the 
implementation side, the next round of code I've been playing with here 
has struggled with the problem of rendering to strings earlier than I'd 
like.  I'd like to delay that as long as possible; certainly not do it 
during storage, and preferably only do it when someone asks for the 
timing event.
> With a "where event_type = x" in the query predicate, the> JSON datums would have predictable, consistent structure,
facilitating>machine reading and aggregation.
 

Filtering on a range of timestamps or on the serial number field is the 
main thing that I imagined, as something that should limit results 
before even producing tuples.  For the expected and important case where 
someone wants "all timing events after #123" after persisting #123 to 
disk, I'd like that to be efficient.  All of the fields I'd want to see 
filtering on are part of the fixed set of columns every entry will have.

To summarize, your suggestion is to build an in-memory structure capable 
of holding the timing event data.  The Datum approach will be used to 
cope with different types of events having different underlying types. 
The output format for queries against this data set will be JSON, 
rendered directly from the structure similarly to how EXPLAIN (FORMAT 
JSON) outputs query trees.  The columns every row contains, like a 
serial number, timestamp, and pid, can be filtered on by something 
operating at the query executor level.  Doing something useful with the 
more generic, "dynamic schema" parts will likely require parsing the 
JSON output.
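
A rough sketch of what that storage structure might look like--every 
name below is invented for illustration, not taken from the patch:

#include "postgres.h"
#include "utils/timestamp.h"    /* TimestampTz */

/* Assumed names throughout; not the patch's actual definitions. */
typedef struct TimingEventField
{
    const char *name;           /* key in the "dynamic schema" part */
    Oid         typid;          /* value's type, kept for later rendering */
    Datum       value;          /* stored un-rendered; stringified as JSON
                                 * only when someone asks for the event */
    bool        isnull;
} TimingEventField;

typedef struct TimingEvent
{
    uint64      serial;         /* fixed columns shared by every event, */
    TimestampTz event_time;     /* which are also the ones executor-level */
    int         pid;            /* filtering would target first */
    int         event_type;
    int         nfields;
    TimingEventField *fields;   /* event-type-specific payload */
} TimingEvent;

Rendering to JSON would then only touch the Datum values when someone 
reads the event, which keeps the expensive string work out of the 
storage path.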

Those are all great ideas I think people could live with.

It looks to me like the hook definition itself would need the entire 
data structure defined, and known to work, before its API could be 
nailed down.  I was hoping that we might get a hook for diverting this 
data committed into 9.3 even if the full extension to expose it wasn't 
nailed down.  That was based on similarity to the generic logging hook 
that went into 9.2.  This new implementation idea reminds me more of the 
query length decoration needed for normalized pg_stat_statements 
though--something that wasn't easy to just extract out from the consumer 
at all.
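
To sketch why, again with invented names building on the structure 
above: the hook's one argument would be that structure, so its layout 
has to be settled before the signature can be.

/* Invented names; nothing here is the patch's actual API. */

/* Core side: the argument type is the TimingEvent sketched earlier. */
typedef void (*timing_event_hook_type) (TimingEvent *event);
extern PGDLLIMPORT timing_event_hook_type timing_event_hook;

/* Extension side: the usual chain-and-call installation pattern. */
static timing_event_hook_type prev_timing_event_hook = NULL;

static void
my_timing_collector(TimingEvent *event)
{
    /* stash the event (shared memory, local queue, ...) for later reads */

    if (prev_timing_event_hook)
        prev_timing_event_hook(event);
}

void
_PG_init(void)
{
    prev_timing_event_hook = timing_event_hook;
    timing_event_hook = my_timing_collector;
}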

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com


