Re: [GENERAL] autoanalyze criteria - Mailing list pgsql-hackers

From Mark Kirkwood
Subject Re: [GENERAL] autoanalyze criteria
Date
Msg-id 5192D7D2.8020605@catalyst.net.nz
Whole thread Raw
In response to Re: [GENERAL] autoanalyze criteria  (Mark Kirkwood <mark.kirkwood@catalyst.net.nz>)
Responses Re: [GENERAL] autoanalyze criteria  (Magnus Hagander <magnus@hagander.net>)
List pgsql-hackers
On 24/02/13 10:51, Mark Kirkwood wrote:
> On 24/02/13 10:12, Stefan Andreatta wrote:
>>
>> On 02/23/2013 09:30 PM, Jeff Janes wrote:
>>> Moved discussion from General To Hackers.
>>>
>>> On Sat, Feb 23, 2013 at 10:41 AM, Stefan Andreatta
>>> <s.andreatta@synedra.com <mailto:s.andreatta@synedra.com>> wrote:
>>>
>>>
>>>     On 02/23/2013 05:10 PM, Jeff Janes wrote:
>>>>
>>>>     Sorry, I got tunnel vision about the how the threshold was
>>>>     computed, and forgot about the thing it was compared to.  There
>>>>     is a "secret" data point in the stats collector
>>>>     called changes_since_analyze.  This is not exposed in the
>>>>     pg_stat_user_tables.  But I think it should be as I often have
>>>>     wanted to see it.
>>>>
>>>>
>>>
>>>     Sounds like a very good idea to me - any way I could help to make
>>>     such a thing happen?
>>>
>>>
>>>
>>> It should be fairly easy to implement because the other columns are
>>> already there to show you the way, and if you want to try your hand at
>>> hacking pgsql it would be a good introduction to doing so.
>>>
>>> Look at each instance in the code of n_dead_dup and
>>> pg_stat_get_dead_tuples, and those are the places where
>>> changes_since_analyze also need to be addressed, in an analogous
>>> manner (assuming it is isn't already there.)
>>>
>>> git grep 'n_dead_tup'
>>>
>>> It looks like we would need to add an SQL function to retrieve the
>>> data, then incorporate that function into the view definitions that
>>> make up the pg_stat_user_tables etc. views. and of course update the
>>> regression test and the documentation.
>>>
>>> Other than implementing it, we would need to convince other hackers
>>> that this is desirable to have.  I'm not sure how hard that would be.
>>> I've looked in the archives to see if this idea was already considered
>>> but rejected, but I don't see any indication that it was previously
>>> considered.
>>>
>>> (http://www.postgresql.org/message-id/4823.1262132964@sss.pgh.pa.us).
>>>
>>> Cheers,
>>>
>>> Jeff
>>
>> Not being a developer, I am afraid, I will not be going to implement it
>> myself - nor would anybody wish so ;-)
>>
>> I also searched the archives, but the closest I found is a discussion on
>> the Admin List starting here:
>> http://www.postgresql.org/message-id/626919622.7634700.1351695913466.JavaMail.root@alaloop.com
>>
>>
>> On the other hand, there is quite a lot of discussion about making
>> autoanalyze more (or less) aggressive - which seems a difficult task to
>> me, when you cannot even check what's triggering your autoanalyze.
>>
>> Anybody else interested?
>>
>
> I was asked about this exact thing the other day - it would be very
> nice to have the information visible. I may take a look at doing it
> (I've done some hacking on the stats system previously). However don't
> let that put anyone else off - as I'll have to find the time to start :-)
>
>
>

I happened to be looking at the whole autovacuum/analyze setup in
another context - which reminded me about volunteering to take a look at
a patch for adding changes_since_analyze. So with probably impeccably
poor timing (smack in the middle of 9.3 beta), here is a patch that does
that (so it is probably an early 9.4 addition).

I've called the column "n_changes_since_analyze" - I can sense that
there might be discussion about how to maybe shorten that :-) , and
added a doc line for the view + updated the regression test expected input.

Regards

Mark


Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Parallel Sort
Next
From: Amit Langote
Date:
Subject: Update description in sysv_shmem.c