Thread: since when has pg_stat_user_indexes.idx_scan been counting?

since when has pg_stat_user_indexes.idx_scan been counting?

From
Willy-Bas Loos
Date:
Hi,

We have some indexes that don't seem to be used at all.
I'd like to know since when they have not been used.
That is, the time when postgres started counting to reach the number that is in pg_stat_user_indexes.idx_scan

Is there a way to retrieve that from the database ?

Cheers,

WBL

--
"Patriotism is the conviction that your country is superior to all others because you were born in it." -- George Bernard Shaw

Re: since when has pg_stat_user_indexes.idx_scan been counting?

From
Tomas Vondra
Date:
Dne 12.5.2011 17:39, Willy-Bas Loos napsal(a):
> Hi,
>
> We have some indexes that don't seem to be used at all.
> I'd like to know since when they have not been used.
> That is, the time when postgres started counting to reach the number
> that is in pg_stat_user_indexes.idx_scan
>
> Is there a way to retrieve that from the database ?

Well, not really :-( You could call pg_postmaster_start_time() to get
the start time, but that has two major drawbacks

(1) The stats may be actually collected for much longer, because restart
does not reset them.

(2) If someone called pg_stat_reset(), the stats are lost but the start
time remains the same.

So there really is no reliable way to do detect this.

In 9.1 this is not true - there's a timestamp for each database (and
global stats) to keep track of the last reset.

regards
Tomas

Re: since when has pg_stat_user_indexes.idx_scan been counting?

From
Tomas Vondra
Date:
Dne 12.5.2011 17:39, Willy-Bas Loos napsal(a):
> Hi,
>
> We have some indexes that don't seem to be used at all.
> I'd like to know since when they have not been used.
> That is, the time when postgres started counting to reach the number
> that is in pg_stat_user_indexes.idx_scan
>
> Is there a way to retrieve that from the database ?

BTW it's really really tricky to remove indexes once they're created.
What if the index is created for a single batch process that runs once a
year to close the fiscal year etc?

So be very careful about this.

Tomas

Re: [ADMIN] since when has pg_stat_user_indexes.idx_scan been counting?

From
Willy-Bas Loos
Date:
Then, are the index scans counted in a memory variable and written at analyze time?

On Thu, May 12, 2011 at 8:22 PM, raghu ram <raghuchennuru@gmail.com> wrote:

"Analyze" activity will update the statistics of each catalog table.
--Raghu Ram




--
"Patriotism is the conviction that your country is superior to all others because you were born in it." -- George Bernard Shaw

Re: since when has pg_stat_user_indexes.idx_scan been counting?

From
Greg Smith
Date:
Tomas Vondra wrote:
> BTW it's really really tricky to remove indexes once they're created.
> What if the index is created for a single batch process that runs once a
> year to close the fiscal year etc?
>

True in theory.  Reports that are executing something big at the end of
the year fall into three categories:

1) They touch a whole lot of the data for the year first.  In this case,
sequential scan is likely regardless.

2) They access data similarly to regular queries, using the same indexes.

3) They have some very specific data only they touch that is retrieved
with an index.

You're saying to watch out for (3); I think that's not usually the case,
but that's a fair thing to warn about.  Even in that case, though, it
may still be worth dropping the index.  Year-end processes are not
usually very sensitive to whether they take a little or a long time to
execute.  But you will be paying to maintain the index every day while
it is there.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books


Re: since when has pg_stat_user_indexes.idx_scan been counting?

From
Tom Lane
Date:
Greg Smith <greg@2ndquadrant.com> writes:
> You're saying to watch out for (3); I think that's not usually the case,
> but that's a fair thing to warn about.  Even in that case, though, it
> may still be worth dropping the index.  Year-end processes are not
> usually very sensitive to whether they take a little or a long time to
> execute.  But you will be paying to maintain the index every day while
> it is there.

Yeah.  Another idea worth considering is to have the year-end processing
build the index it wants, use it, drop it.  It seems unlikely that it's
worth maintaining an index year-round for such infrequent usage.

            regards, tom lane



On Thu, May 12, 2011 at 9:09 PM, Willy-Bas Loos <willybas@gmail.com> wrote:
Hi,

We have some indexes that don't seem to be used at all.
I'd like to know since when they have not been used.
That is, the time when postgres started counting to reach the number that is in pg_stat_user_indexes.idx_scan

Is there a way to retrieve that from the database ?


"Analyze" activity will update the statistics of each catalog table.

pg_postmaster_start_time --> Retrieves the Postmaster [ PostgreSQL Instance] start time

postgres=# select pg_postmaster_start_time();

--Raghu Ram

Re: [ADMIN] since when has pg_stat_user_indexes.idx_scan been counting?

From
Tomas Vondra
Date:
Dne 12.5.2011 22:03, Willy-Bas Loos napsal(a):
> Then, are the index scans counted in a memory variable and written at
> analyze time?

No, I believe raghu mixed two things - stats used by the planner and
stats about access to the data (how many tuples were read using an
index, etc.)

Stats for the planner are stored in pg_class/pg_statistic/pg_stats
catalogs and are updated by ANALYZE (either manual or automatic). This
is what raghu refered to, but these stats are completely useless when
looking for unused indexes.

Stats about access to the data (index/seq scans, cache hit ratio etc.)
are stored in pg_stat_* and pg_statio_* catalogs, and are updated after
running each query. AFAIK it's not a synchronous process, but when a
backend finishes a query, it sends the stats to the postmaster (and
postmaster updates the catalogs).

Tomas

Tomas Vondra wrote:
> Stats about access to the data (index/seq scans, cache hit ratio etc.)
> are stored in pg_stat_* and pg_statio_* catalogs, and are updated after
> running each query. AFAIK it's not a synchronous process, but when a
> backend finishes a query, it sends the stats to the postmaster (and
> postmaster updates the catalogs).
>

Description in the docs goes over this in a little more detail
http://www.postgresql.org/docs/current/static/monitoring-stats.html :

"The statistics collector communicates with the backends needing
information (including autovacuum) through temporary files. These files
are stored in the pg_stat_tmp subdirectory...When using the statistics
to monitor current activity, it is important to realize that the
information does not update instantaneously. Each individual server
process transmits new statistical counts to the collector just before
going idle; so a query or transaction still in progress does not affect
the displayed totals. Also, the collector itself emits a new report at
most once per PGSTAT_STAT_INTERVAL milliseconds (500 unless altered
while building the server). So the displayed information lags behind
actual activity. However, current-query information collected by
track_activities is always up-to-date."

It's not synchronous at all.  The clients create a temporary file for
the statistics collector and move on.  The actual statistics don't get
updated until the statistics collector decides enough time has passed to
bother, which defaults to at most every 500ms.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books


Re: [ADMIN] since when has pg_stat_user_indexes.idx_scan been counting?

From
tv@fuzzy.cz
Date:
> It's not synchronous at all.  The clients create a temporary file for
> the statistics collector and move on.  The actual statistics don't get
> updated until the statistics collector decides enough time has passed to
> bother, which defaults to at most every 500ms.

Really? I thought the clients send the updates using a socket, at least
that's what I see in backend/postmaster/pgstat.c (e.g. in
pgstat_send_bgwriter where the data are sent, and in PgstatCollectorMain
where it's read from the socket and applied).

But no matter how exactly this works, this kind of stats has nothing to do
with ANALYZe - it's asynchronously updated every time you run a query.

regards
Tomas