Re: Seems like there is an issue with reltuples showing twice thenumber of rows - Mailing list pgsql-general

From Tomas Vondra
Subject Re: Seems like there is an issue with reltuples showing twice thenumber of rows
Date
Msg-id 7d986cf7-157e-b8b3-0b01-27e96797af71@2ndquadrant.com
Whole thread Raw
In response to Seems like there is an issue with reltuples showing twice the numberof rows  (Ranjith Ramachandra <ranjith@mammoth.io>)
Responses Re: Seems like there is an issue with reltuples showing twice the number of rows
List pgsql-general
On 04/03/2018 11:14 AM, Ranjith Ramachandra wrote:
> I am relying on reltuples on my web app to get fast row counts.
> 
> This was recommended by this article to get fast approx row
> counts: https://wiki.postgresql.org/wiki/Count_estimate
> 
> 
> However for some table I am getting twice as many values when I try to
> do this. I did some more research and came up with this query.
> 
> select reltuples, n_live_tup, n_dead_tup
>               from pg_stat_user_tables join pg_class using (relname)
>              where relname =
> 'main_csv_0f247511f5c247c8813ba3cec90c6ae1_yf34fbb38d';
> 
> 
> it returns
> 
> 
>  reltuples  | n_live_tup | n_dead_tup
> -------------+------------+------------
> 2.7209e+06 |    1360448 |    1360448
> 
> 
> If I run analyze main_csv_0f247511f5c247c8813ba3cec90c6ae1_yf34fbb38d
> 
> and I run the same query again,
> 
>   reltuples  | n_live_tup | n_dead_tup
> -------------+------------+------------
>  1.36045e+06 |    1360448 |    1360448
> 
> But after some time the value goes back to being double the value. This
> is causing me a lot of problems since this inaccuracy does not make any
> sense to me.
> 

There was a difference between VACUUM and ANALYZE in handling recently
dead rows (essentially deleted rows that can't be removed yet), causing
similar changes to reltuples. Essentially if you do VACUUM and ANALYZE,
it may set reltuples to rather different estimates. That is fixed now
and should be in the next minor release.

Hard to say if this is a case of that, but judging by the number of dead
tuples chances are it is.

It's probably better to use n_live_tup instead, though. I'd say that's
closer to the "live tuples" definition.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


pgsql-general by date:

Previous
From: Vitaliy Garnashevich
Date:
Subject: Re: Autovacuum Problem
Next
From: Adrian Klaver
Date:
Subject: Re: How to install pgTAP on cenos machine