Re: IDLE in transaction introspection - Mailing list pgsql-hackers

From Scott Mead
Subject Re: IDLE in transaction introspection
Date
Msg-id CAKq0gv+758cX7iRwJGwKSx-VmN4x46NXtBfdCSH9mrUBF1DVzQ@mail.gmail.com
Whole thread Raw
In response to Re: IDLE in transaction introspection  (Magnus Hagander <magnus@hagander.net>)
Responses Re: IDLE in transaction introspection
List pgsql-hackers


On Tue, Nov 1, 2011 at 1:20 PM, Magnus Hagander <magnus@hagander.net> wrote:
On Tue, Nov 1, 2011 at 18:11, Scott Mead <scottm@openscg.com> wrote:
>
> On Tue, Nov 1, 2011 at 10:40 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> Robert Haas <robertmhaas@gmail.com> writes:
>> > On Tue, Nov 1, 2011 at 9:52 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> >> That would cost twice as much shared memory for query strings, and
>> >> twice
>> >> as much time to update the strings, for what seems pretty marginal
>> >> value.  I'm for just redefining the query field as "current or last
>> >> query".
>>
>> > Not really.  You could just store it once in shared memory, and put
>> > the complexity in the view definition.
>>
>> I understood the proposal to be "store the previous query in addition
>> to the current-query-if-any".  If that's not what was meant, then my
>> objection was incorrect.  However, like you, I'm pretty dubious of
>> having two mostly-redundant fields in the view definition, just because
>> of window width issues.
>
> The biggest reason I dislike the multi-field approach is because it limits
> us to only the [single] previous_query in the system with all the overhead
> we talked about  (memory, window width and messing with system catalogs in
> general).  That's actually why I implemented it the way I did, just by
> appending the last query on the end of the string when it's <IDLE> in
> transaction.

Well, by appending it in that field, you require the end
user/monitoring app to parse out the string anyway, so you're not
exactly making life easier on the consumer of the information..


>> Marti wrote:
>>
>> I'd very much like to see a more generic solution: a runtime query log
>> facility that can be queried in any way you want. pg_stat_statements
>> comes close, but is limited too due to its (arbitrary, I find)
>> deduplication -- you can't query for "10 last statements from process
>> N" since it has no notion of processes, just users and databases.
>
> This is what I'd really like to see (just haven't had time as it is a much
> bigger project).  The next question my devs ask is "what were the last five
> queries that ran"... "can you show me an overview of an entire transaction"
> etc...
>   That being said, having the previous_query available feels like it fixes
> about 80% of the *problem*; transaction profiling, or looking back 10 / 15 /
> 20 queries is [immensely] useful, but I find that the bigger need is the
> ability to short-circuit dba / dev back-n-forth by just saying "Your app
> refused to commit/rollback after running query XYZ".

This would be great, but as you say, it's a different project.

Perhaps something could be made along the line of each backend keeping
it's *own* set of old queries, and then making it available to a
specific function ("pg_get_last_queries_for_backend(nnn)")

Yeah, I was kind of thinking this too, I just feel dirty adding a (n * track_activity_query_size) overhead to shared memory for tracking that if we're already concerned about adding just a 'previous_query' string.  It's easy enough to either hard-code or set a limit on 'n', but, if I were to do that, is it something that would be accepted? (my ability to code not-withstanding :-)  
 
- since
this is not the type of information you'd ask for often, it would be
ok if getting this data took longer.. And you seldom want "give me the
last 5 queries for each backend at once".

<thinking-aloud>
 I'm more concerned with the overhead of managing that array every single time that a backend updates its status than I am on retrieval.  I guess if the array were small enough and the the logic clean, it could end up having about the same overhead as what I'm doing now (obviously, I'm still gobbling more memory). 
</thinking-aloud>

Doing that would allow us to get away from concerns about breaking monitoring tools and the like.

--
 Scott Mead


 


>> Robert Wrote:
>> Yeah.  Otherwise, people who are parsing the hard-coded strings <idle>
>> and <idle in transaction> are likely to get confused.
>
> I would be interested ( and frankly very surprised ) to find out if many
> monitoring tools are actually parsing that field.  Most that I see just dump
> whatever is in current_query to the user.  I would imaging that, so long as
> the server obeyed pgstat_track_activity_size most tools would behave nicely.

Really? I'd assume every single monitoring tool that graphs how many
active connections you have (vs idle vs idle in transaction) would do
just this.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

pgsql-hackers by date:

Previous
From: Martijn van Oosterhout
Date:
Subject: Re: Compiler branch prediction hints (was: So, is COUNT(*) fast now?)
Next
From: Magnus Hagander
Date:
Subject: Re: IDLE in transaction introspection