Re: PATCH: tracking temp files in pg_stat_database - Mailing list pgsql-hackers

From Magnus Hagander
Subject Re: PATCH: tracking temp files in pg_stat_database
Date
Msg-id CABUevExxSuCrU7pwdHigmyeo4gF-gYk0X5bsMWSA9uaXjKP97Q@mail.gmail.com
Whole thread Raw
In response to Re: PATCH: tracking temp files in pg_stat_database  ("Tomas Vondra" <tv@fuzzy.cz>)
Responses Re: PATCH: tracking temp files in pg_stat_database
List pgsql-hackers
On Sat, Jan 21, 2012 at 20:12, Tomas Vondra <tv@fuzzy.cz> wrote:
> On 21 Leden 2012, 18:13, Magnus Hagander wrote:
>> On Sat, Jan 21, 2012 at 18:02, Tomas Vondra <tv@fuzzy.cz> wrote:
>> Though I just looked at the tabstat code again, and we already split
>> that message up at regular intervals. Which would make it quite weird
>> to have global counters in it as well.
>>
>> But instead of there, perhaps we need a general "non table, but more
>> than one type of data" message sent out at the same time. There is
>> other stuff in the queue for it.
>>
>> I'm not convinced either way - I'm not against the original way in
>> your patch either. I just wanted to throw the idea out there, and was
>> hoping somebody else would have an opinion too :-)
>
> Let's make that in a separate patch. It seems like a good thing to do, but
> I don't think it should be mixed with this patch.

Yeah, I'm not sure we even want to do that yet, when we go down this
road. We might eventually, of course.


>>> I do like the idea of not sending the message for each temp file,
>>> though.
>>> One thing that worries me are long running transactions (think about a
>>> batch process that runs for several hours within a single transaction).
>>> By
>>> sending the data only at the commit, such transactions would not be
>>> accounted properly. So I'd suggest sending the message either at commit
>>
>> By that argument they are *already* not accounted properly, because
>> the "number of rows" and those counters are wrong. By sending the temp
>> file data in the middle of the transaction, you won't be able to
>> correlate those numbers with the temp file usage.
>>
>> I'm not saying the other usecase isn't more common, but whichever way
>> you do it, it's going to get inconsistent with *something*.
>
> The question is whether the patch should be modified to send the message
> only at commit (i.e. just like tabstats) or if it can remain the way it is
> now. And maybe we could change the way all those messages are sent (e.g.
> to the regular submits I've proposed in the previous post) to make them
> more consistent.
>
> I'm not asking for 100% consistency. What I don't like is a batch process
> consuming resources but not reflected in the stats until the final commit.
> With a huge spike in the stats right after that, although the activity was
> actually spread over several hours.

Yeah, having thought about it some more, I think sending them when
they happen is a better idea. (It's obviously still only going to send
them when the file is closed, but that's as close as we can reasonably
get).

I've cleaned up the patch a bit and applied it - thanks!

Other than cosmetic, I changed the text in the description around a
bit (sol if that's worse now, that's purely my fault) and added docs
to the section about pg_stat_database that you'd forgotten.

Also, I'd like to point you in the direction of the
src/include/catalog/find_unused_oids and duplicate_oids scripts. The
oids you'd picked conflicted with the pg_extension catalog from a year
ago. Easy enough to fix, and there are often conflicts with more
recent oids that the committer has to deal with anyway, but cleaning
those up before submission always helps a little bit. For the next one
:-)

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


pgsql-hackers by date:

Previous
From: Abhijit Menon-Sen
Date:
Subject: Re: psql NUL record and field separator
Next
From: "Tomas Vondra"
Date:
Subject: Re: PATCH: tracking temp files in pg_stat_database