Thread: Seems like there is an issue with reltuples showing twice the number of rows


From: Ranjith Ramachandra
Date:
I am relying on reltuples on my web app to get fast row counts.

This was recommended by this article to get fast approx row counts: https://wiki.postgresql.org/wiki/Count_estimate
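For reference, the estimate query from that wiki page looks roughly like this (the table name here is just a placeholder):

```sql
-- Fast approximate row count from the planner's statistics;
-- reltuples is an estimate, refreshed by VACUUM and ANALYZE.
SELECT reltuples::bigint AS estimate
  FROM pg_class
 WHERE relname = 'mytable';
```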


However, for some tables I am getting roughly twice the expected count when I do this. I did some more research and came up with this query:

select reltuples, n_live_tup, n_dead_tup
              from pg_stat_user_tables join pg_class using (relname)
             where relname = 'main_csv_0f247511f5c247c8813ba3cec90c6ae1_yf34fbb38d';


it returns


  reltuples  | n_live_tup | n_dead_tup
-------------+------------+------------
  2.7209e+06 |    1360448 |    1360448


If I run analyze main_csv_0f247511f5c247c8813ba3cec90c6ae1_yf34fbb38d

and I run the same query again,

  reltuples  | n_live_tup | n_dead_tup
-------------+------------+------------
 1.36045e+06 |    1360448 |    1360448

But after some time the value goes back to roughly double. This is causing me a lot of problems, since the inaccuracy does not make sense to me.

Any help would be appreciated.

FYI, I also asked the same question on Stack Overflow, since I am new to the postgres mailing lists.


--
~Ranjith 
On 04/03/2018 11:14 AM, Ranjith Ramachandra wrote:
> I am relying on reltuples on my web app to get fast row counts.
> 
> This was recommended by this article to get fast approx row
> counts: https://wiki.postgresql.org/wiki/Count_estimate
> 
> 
> However for some table I am getting twice as many values when I try to
> do this. I did some more research and came up with this query.
> 
> select reltuples, n_live_tup, n_dead_tup
>               from pg_stat_user_tables join pg_class using (relname)
>              where relname =
> 'main_csv_0f247511f5c247c8813ba3cec90c6ae1_yf34fbb38d';
> 
> 
> it returns
> 
> 
>   reltuples  | n_live_tup | n_dead_tup
> -------------+------------+------------
>   2.7209e+06 |    1360448 |    1360448
> 
> 
> If I run analyze main_csv_0f247511f5c247c8813ba3cec90c6ae1_yf34fbb38d
> 
> and I run the same query again,
> 
>   reltuples  | n_live_tup | n_dead_tup
> -------------+------------+------------
>  1.36045e+06 |    1360448 |    1360448
> 
> But after some time the value goes back to being double the value. This
> is causing me a lot of problems since this inaccuracy does not make any
> sense to me.
> 

There was a difference between VACUUM and ANALYZE in handling recently
dead rows (essentially deleted rows that can't be removed yet), causing
similar changes to reltuples. Essentially if you do VACUUM and ANALYZE,
it may set reltuples to rather different estimates. That is fixed now
and should be in the next minor release.

Hard to say if this is a case of that, but judging by the number of dead
tuples chances are it is.

It's probably better to use n_live_tup instead, though. I'd say that's
closer to the "live tuples" definition.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> On 04/03/2018 11:14 AM, Ranjith Ramachandra wrote:
>> it returns
>>   reltuples  | n_live_tup | n_dead_tup
>> -------------+------------+------------
>>   2.7209e+06 |    1360448 |    1360448
>> 
>> If I run analyze main_csv_0f247511f5c247c8813ba3cec90c6ae1_yf34fbb38d
>> and I run the same query again,
>>   reltuples  | n_live_tup | n_dead_tup
>> -------------+------------+------------
>>  1.36045e+06 |    1360448 |    1360448
>> 
>> But after some time the value goes back to being double the value.

> There was a difference between VACUUM and ANALYZE in handling recently
> dead rows (essentially deleted rows that can't be removed yet), causing
> similar changes to reltuples. Essentially if you do VACUUM and ANALYZE,
> it may set reltuples to rather different estimates. That is fixed now
> and should be in the next minor release.

No, I think this is the *other* thing we fixed recently: VACUUM thinks
it should set reltuples to total tuples (live + dead) whereas ANALYZE
counts only live tuples.  We did not risk back-patching that.

The question I'd ask about this case is why is there persistently 100%
bloat?  Those dead tuples should've gotten reclaimed by autovacuum.
Perhaps an open prepared transaction, or some such?
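One way to check for blockers like these, assuming access to the cluster, is to query pg_prepared_xacts and pg_stat_activity; a sketch:

```sql
-- Any open prepared transactions holding back cleanup?
SELECT gid, prepared, owner, database
  FROM pg_prepared_xacts;

-- Any long-running transactions that could pin old row versions?
SELECT pid, state, xact_start, query
  FROM pg_stat_activity
 WHERE xact_start IS NOT NULL
 ORDER BY xact_start;
```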

> It's probably better to use n_live_tup instead, though. I'd say that's
> closer to the "live tuples" definition.

Yeah, you might be better off looking at that, particularly since it
updates on-the-fly not just after a vacuum or analyze.
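A query along those lines (table name illustrative):

```sql
-- n_live_tup comes from the statistics collector and is
-- maintained continuously, unlike reltuples.
SELECT n_live_tup
  FROM pg_stat_user_tables
 WHERE relname = 'mytable';
```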

            regards, tom lane


Re: Seems like there is an issue with reltuples showing twice the number of rows

From: Ranjith Ramachandra
Date:
List,

OP here. Thank you for replying. This confirms my suspicion that it might have to do with VACUUM/ANALYZE.

Some debug info:

1. Loaded a CSV to fill the table with data.
2. Performed VACUUM ANALYZE on this table after uploading.
3. I do not see any reason for dead rows, because I have not updated data in this table. But I may not understand dead rows correctly.
4. I can reproduce this problem on multiple machines with PostgreSQL 9.6.8 installed.


I cannot wait for the next minor release, since my prod would get updated before then. My current strategy is to use n_live_tup; on my local machine it seems to work fine.

Thinking about it, I could even develop another mechanism for keeping track of row counts: a background process that periodically does a hard count with count(*).
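As a sketch of that fallback (the side table and its name are hypothetical):

```sql
-- Hypothetical side table holding periodically refreshed hard counts.
CREATE TABLE IF NOT EXISTS row_counts (
    table_name text PRIMARY KEY,
    row_count  bigint NOT NULL,
    counted_at timestamptz NOT NULL DEFAULT now()
);

-- Refresh run periodically by a background job (exact, but slow).
INSERT INTO row_counts (table_name, row_count, counted_at)
SELECT 'mytable', count(*), now() FROM mytable
ON CONFLICT (table_name)
DO UPDATE SET row_count  = EXCLUDED.row_count,
              counted_at = EXCLUDED.counted_at;
```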

But I am happy to provide any other debug info if needed. I will reply within 24 hours max.

This is what I had found earlier before I contacted the list. Relevant?




On Tue 3 Apr, 2018, 19:49 Tom Lane, <tgl@sss.pgh.pa.us> wrote: