Re: Query planner / Analyse statistics bad estimate rows=1 with maximum statistics 10000 on PostgreSQL 10.2 - Mailing list pgsql-general

From Ron
Subject Re: Query planner / Analyse statistics bad estimate rows=1 with maximum statistics 10000 on PostgreSQL 10.2
Msg-id c0ac4d7a-62d2-eea1-5818-4f9a45cbe453@gmail.com
In response to Re: Query planner / Analyse statistics bad estimate rows=1 with maximum statistics 10000 on PostgreSQL 10.2  (Mark <mwchambers@gmail.com>)
List pgsql-general
Try using SERIALIZABLE transactions instead of the default READ COMMITTED.
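For example, either per transaction or as a session default (a minimal sketch):

BEGIN ISOLATION LEVEL SERIALIZABLE;
-- ... work that needs a consistent snapshot ...
COMMIT;

-- or, for the whole session:
SET default_transaction_isolation = 'serializable';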

On 1/2/19 9:28 AM, Mark wrote:
Hi Ron, 

Yes, my process will commit the transaction (or roll it back) eventually. The problem is the window in which
one transaction has deleted all the rows (but not yet committed) and an ANALYSE has run.

The deleted rows won't make it into the sample even though the transaction has NOT been committed.

During this window I get bad row estimates for that ID, because the sample then contains no rows for it. You can see this in the example below with two
database connections (labelled 1 and 2).

I would have expected the DELETE to have no effect until it was committed.

connection 1=> create table test_table(school_id integer, note text); 
CREATE TABLE
connection 1=> insert into test_table(school_id, note) SELECT g.id,md5(random()::text) FROM generate_series(1,2) as g(id);
INSERT 0 2
connection 1=> insert into test_table(school_id, note) SELECT g.id,md5(random()::text) FROM generate_series(1,2) as g(id);
INSERT 0 2
connection 1=> insert into test_table(school_id, note) SELECT g.id,md5(random()::text) FROM generate_series(1,2) as g(id);
INSERT 0 2
connection 1=> select * from test_table ;
 school_id |               note               
-----------+----------------------------------
         1 | 0e08cf3990a04f0e943584517c564d31
         2 | 96bf83ae5f8eb9342e8408b1ac25cb14
         1 | f8fd943012edfe42a03a421df660bc33
         2 | cd5d7ff0abca61f18857df9b21d234e0
         1 | 60d731f430cb68c7285ddbcd9186eaa0
         2 | 635e6c9cf305147ad8684213f0a9299c
(6 rows)

connection 1=> analyse verbose test_table ;
INFO:  analyzing "public.test_table"
INFO:  "test_table": scanned 1 of 1 pages, containing 6 live rows and 0 dead rows; 6 rows in sample, 6 estimated total rows
ANALYZE
connection 1=> explain analyse select * from test_table where school_id = 2 ;
                                             QUERY PLAN                                              
-----------------------------------------------------------------------------------------------------
 Seq Scan on test_table  (cost=0.00..1.07 rows=3 width=37) (actual time=0.011..0.015 rows=3 loops=1)
   Filter: (school_id = 2)
   Rows Removed by Filter: 3
 Planning time: 0.164 ms
 Execution time: 0.043 ms
(5 rows)

connection 2=> BEGIN ;
BEGIN
connection 2=> delete from test_table where school_id = 2 ;
DELETE 3
connection 2=> /* This connection is now idle */


connection 1=> analyse verbose test_table ;
INFO:  analyzing "public.test_table"
INFO:  "test_table": scanned 1 of 1 pages, containing 6 live rows and 0 dead rows; 3 rows in sample, 6 estimated total rows
ANALYZE
connection 1=> explain analyse select * from test_table where school_id = 2 ;
                                             QUERY PLAN                                              
-----------------------------------------------------------------------------------------------------
 Seq Scan on test_table  (cost=0.00..1.07 rows=1 width=37) (actual time=0.009..0.014 rows=3 loops=1)
   Filter: (school_id = 2)
   Rows Removed by Filter: 3
 Planning time: 0.095 ms
 Execution time: 0.039 ms
(5 rows)
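While connection 2 sits like that, it shows up as "idle in transaction" in pg_stat_activity; a quick way to spot such sessions (a sketch, using the column names as they exist in PostgreSQL 10):

SELECT pid, usename, state, xact_start, query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
ORDER BY xact_start;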


On Wed, 2 Jan 2019 at 14:04 Ron <ronljohnsonjr@gmail.com> wrote:
But Jeff said "left open indefinitely without either committing or rolling back".  Your process is committing the transaction.


On 1/2/19 6:15 AM, Mark wrote:
Hi Jeff, 

Thanks for your help. That is exactly what is happening.

I have a long-running job which deletes all of the common_student table and then repopulates it. It takes a long time to load all the other data and commit the transaction. I didn't think the DELETE inside the transaction would have any effect until it was committed or rolled back.

I will have to rewrite the application so it updates the existing rows rather than deleting them all and re-inserting.
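For example, an upsert would avoid the delete window entirely (a sketch only; it assumes a unique key on (school_id, student_id), and the column names are illustrative):

INSERT INTO common_student (school_id, student_id, note)
VALUES (36, 12345, 'some note')
ON CONFLICT (school_id, student_id)
DO UPDATE SET note = EXCLUDED.note;  -- update in place instead of delete-all-then-insert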

Thanks again for helping me understand what's happening here. 

Proof:

db=> explain analyze select * from common_student where school_id = 36;
                                                              QUERY PLAN                                                              
--------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on common_student  (cost=88.30..3846.49 rows=1533 width=384) (actual time=4.852..7.065 rows=1388 loops=1)
   Recheck Cond: (school_id = 36)
   Heap Blocks: exact=67
   ->  Bitmap Index Scan on idx_common_student_sid  (cost=0.00..87.91 rows=1533 width=0) (actual time=4.817..4.817 rows=1388 loops=1)
         Index Cond: (school_id = 36)
 Planning time: 0.097 ms
 Execution time: 8.084 ms
(7 rows)

db=> /* At this point I have started a long running transaction that deletes all of common_student for school_id 36  */ ;

db=> analyse verbose common_student(school_id); 
INFO:  analyzing "public.common_student"
INFO:  "common_student": scanned 7322 of 7322 pages, containing 65431 live rows and 8060 dead rows; 56818 rows in sample, 65431 estimated total rows
ANALYZE
db=> explain analyze select * from common_student where school_id = 36;
                                                                 QUERY PLAN                                                                 
--------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using idx_common_student_sid on common_student  (cost=0.41..8.43 rows=1 width=384) (actual time=0.017..1.498 rows=1388 loops=1)
   Index Cond: (school_id = 36)
 Planning time: 0.098 ms
 Execution time: 2.583 ms
(4 rows)

db=> /* At this point I have killed the long running transaction that deletes all of common_student for school_id 36  */ ;
db=> vacuum analyze common_student; 
VACUUM
db=> explain analyze select * from common_student where school_id = 36;
                                                              QUERY PLAN                                                              
--------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on common_student  (cost=79.17..3357.79 rows=1388 width=383) (actual time=0.088..1.302 rows=1388 loops=1)
   Recheck Cond: (school_id = 36)
   Heap Blocks: exact=67
   ->  Bitmap Index Scan on idx_common_student_sid  (cost=0.00..78.83 rows=1388 width=0) (actual time=0.077..0.077 rows=1388 loops=1)
         Index Cond: (school_id = 36)
 Planning time: 0.327 ms
 Execution time: 2.311 ms
(7 rows)


On Sun, 23 Dec 2018 at 02:57 Jeff Janes <jeff.janes@gmail.com> wrote:

- Does the analyse output below mean that it only scanned 51538 of 65463 rows in the table? Is school_id 36 just being missed in the sample? (This happens when the analyse is repeated.)

Is there a transaction which had deleted all of school_id=36, and then was just left open indefinitely without either committing or rolling back?

That would explain it, and I don't know of anything else that could.  The deleted but not committed tuples are still live, but don't get sampled.
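If such an orphaned session turns up (e.g. in pg_stat_activity), it can be ended, rolling back its open transaction, with something like:

SELECT pg_terminate_backend(12345);  -- 12345 being the pid of the idle-in-transaction session (illustrative)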

Cheers,

Jeff

--
Angular momentum makes the world go 'round.
