Thread: Wrong rows count in EXPLAIN

Wrong rows count in EXPLAIN

From
Пантюшин Александр Иванович
Date:

When I create a new table, and then I evaluate the execution of the SELECT query, I see a strange rows count in EXPLAIN
CREATE TABLE test1(f INTEGER PRIMARY KEY NOT NULL);
ANALYZE test1;
EXPLAIN SELECT * FROM test1;
                       QUERY PLAN
---------------------------------------------------------
 Seq Scan on test1  (cost=0.00..35.50 rows=2550 width=4)
(1 row)

Table is empty but rows=2550. Seem like it was calculated from some default values.
Is this normal behavior or a bug? Can it lead to a poor choice of the plan of a query in general?

Re: Wrong rows count in EXPLAIN

From
Andrey Borodin
Date:
Hi!

> 26 апр. 2022 г., в 13:45, Пантюшин Александр Иванович <AI.Pantyushin@gaz-is.ru> написал(а):
>
> When I create a new table, and then I evaluate the execution of the SELECT query, I see a strange rows count in
EXPLAIN
> CREATE TABLE test1(f INTEGER PRIMARY KEY NOT NULL);
> ANALYZE test1;
> EXPLAIN SELECT * FROM test1;
>                        QUERY PLAN
> ---------------------------------------------------------
>  Seq Scan on test1  (cost=0.00..35.50 rows=2550 width=4)
> (1 row)
>
> Table is empty but rows=2550. Seem like it was calculated from some default values.
> Is this normal behavior or a bug? Can it lead to a poor choice of the plan of a query in general?

Which Postgres version do you use?

I observe:
postgres=# CREATE TABLE test1(f INTEGER PRIMARY KEY NOT NULL);
CREATE TABLE
postgres=# ANALYZE test1;
ANALYZE
postgres=# EXPLAIN SELECT * FROM test1;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Index Only Scan using test1_pkey on test1  (cost=0.12..8.14 rows=1 width=4)
(1 row)

postgres=# select version();
                                                       version
----------------------------------------------------------------------------------------------------------------------
 PostgreSQL 15devel on x86_64-apple-darwin19.6.0, compiled by Apple clang version 11.0.3 (clang-1103.0.32.62), 64-bit
(1 row)

Without "ANALYZE test1;" table_block_relation_estimate_size() assumes relation size is 10 blocks.

Best regards, Andrey Borodin.


Re: Wrong rows count in EXPLAIN

From
Пантюшин Александр Иванович
Date:

Hi,
>Which Postgres version do you use?
I checked this on PG 11
postgres=# select version();
                                                                     version
-------------------------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 11.5 on x86_64-w64-mingw32, compiled by x86_64-w64-mingw32-gcc.exe (x86_64-win32-seh-rev0, Built by MinGW-W64 project) 8.1.0, 64-bit
(1 row)

and on PG 13
postgres=# select version();
                                               version                                               
-----------------------------------------------------------------------------------------------------
 PostgreSQL 13.5 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 8.4.0-1ubuntu1~18.04) 8.4.0, 64-bit
(1 row)

Both versions shows same strange rows counting in EXPLAINE


От: Andrey Borodin <x4mmm@yandex-team.ru>
Отправлено: 27 апреля 2022 г. 13:08:22
Кому: Пантюшин Александр Иванович
Копия: pgsql-hackers@postgresql.org; Тарасов Георгий Витальевич
Тема: Re: Wrong rows count in EXPLAIN
 
Hi!

> 26 апр. 2022 г., в 13:45, Пантюшин Александр Иванович <AI.Pantyushin@gaz-is.ru> написал(а):
>
> When I create a new table, and then I evaluate the execution of the SELECT query, I see a strange rows count in EXPLAIN
> CREATE TABLE test1(f INTEGER PRIMARY KEY NOT NULL);
> ANALYZE test1;
> EXPLAIN SELECT * FROM test1;
>                        QUERY PLAN
> ---------------------------------------------------------
>  Seq Scan on test1  (cost=0.00..35.50 rows=2550 width=4)
> (1 row)
>
> Table is empty but rows=2550. Seem like it was calculated from some default values.
> Is this normal behavior or a bug? Can it lead to a poor choice of the plan of a query in general?

Which Postgres version do you use?

I observe:
postgres=# CREATE TABLE test1(f INTEGER PRIMARY KEY NOT NULL);
CREATE TABLE
postgres=# ANALYZE test1;
ANALYZE
postgres=# EXPLAIN SELECT * FROM test1;
                                 QUERY PLAN                                 
-----------------------------------------------------------------------------
 Index Only Scan using test1_pkey on test1  (cost=0.12..8.14 rows=1 width=4)
(1 row)

postgres=# select version();
                                                       version                                                       
----------------------------------------------------------------------------------------------------------------------
 PostgreSQL 15devel on x86_64-apple-darwin19.6.0, compiled by Apple clang version 11.0.3 (clang-1103.0.32.62), 64-bit
(1 row)

Without "ANALYZE test1;" table_block_relation_estimate_size() assumes relation size is 10 blocks.

Best regards, Andrey Borodin.

Re: Wrong rows count in EXPLAIN

From
Andrey Borodin
Date:

> 27 апр. 2022 г., в 15:17, Пантюшин Александр Иванович <AI.Pantyushin@gaz-is.ru> написал(а):
>
> Hi,
> >Which Postgres version do you use?
> I checked this on PG 11
> ...

> and on PG 13

Yes, I think before 3d351d91 it was impossible to distinguish between actually empty and never analyzed table.
But now it is working just as you would expect. There's an interesting relevant discussion linked to the commit
message.

Best regards, Andrey Borodin.

[0] https://github.com/postgres/postgres/commit/3d351d916b20534f973eda760cde17d96545d4c4




Re: Wrong rows count in EXPLAIN

From
David Rowley
Date:
On Wed, 27 Apr 2022 at 21:08, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
> Which Postgres version do you use?

3d351d91 changed things so we could tell the difference between a
relation which was analyzed and is empty vs a relation that's never
been analyzed. That's why you're not seeing the same behaviour as the
OP.

Tom's commit message [1] also touches on the "safety measure". Here
he's referring to the 2550 estimate, or more accurately, 10 pages
filled with tuples of that width.  This is intended so that newly
created tables that quickly subsequently are loaded with data then
queried before auto-analyze gets a chance to run are not assumed to be
empty.  The problem, if we assumed these non-analyzed tables were
empty, would be that the planner would likely choose plans containing
nodes like Seq Scans and non-parameterized Nested Loops rather than
maybe Index Scans and Merge or Hash joins. The 10-page thing is aimed
to try and avoid the planner from making that mistake.  Generally, the
planner underestimating the number of rows causes worse problems than
when it overestimates the row counts. So 10 seems much better than 0.

David

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=3d351d916b20534f973eda760cde17d96545d4c4



Re: Wrong rows count in EXPLAIN

From
Tom Lane
Date:
=?koi8-r?B?8MHO1MDbyc4g4czFy9PBzsTSIOnXwc7P18ne?= <AI.Pantyushin@gaz-is.ru> writes:
> When I create a new table, and then I evaluate the execution of the SELECT query, I see a strange rows count in
EXPLAIN
> CREATE TABLE test1(f INTEGER PRIMARY KEY NOT NULL);
> ANALYZE test1;
> EXPLAIN SELECT * FROM test1;
>                        QUERY PLAN
> ---------------------------------------------------------
>  Seq Scan on test1  (cost=0.00..35.50 rows=2550 width=4)
> (1 row)

> Table is empty but rows=2550.

This is intentional, arising from the planner's unwillingness to
assume that a table is empty.  It assumes that such a table actually
contains (from memory) 10 pages, and then backs into a rowcount
estimate from that depending on the data-type-dependent width of
the table rows.

Without this provision, we'd produce very bad plans for cases
where a newly-populated table hasn't been analyzed yet.

            regards, tom lane



Re: Wrong rows count in EXPLAIN

From
Bruce Momjian
Date:
On Wed, Apr 27, 2022 at 09:44:21AM -0400, Tom Lane wrote:
> =?koi8-r?B?8MHO1MDbyc4g4czFy9PBzsTSIOnXwc7P18ne?= <AI.Pantyushin@gaz-is.ru> writes:
> > When I create a new table, and then I evaluate the execution of the SELECT query, I see a strange rows count in
EXPLAIN
> > CREATE TABLE test1(f INTEGER PRIMARY KEY NOT NULL);
> > ANALYZE test1;
> > EXPLAIN SELECT * FROM test1;
> >                        QUERY PLAN
> > ---------------------------------------------------------
> >  Seq Scan on test1  (cost=0.00..35.50 rows=2550 width=4)
> > (1 row)
> 
> > Table is empty but rows=2550.
> 
> This is intentional, arising from the planner's unwillingness to
> assume that a table is empty.  It assumes that such a table actually
> contains (from memory) 10 pages, and then backs into a rowcount
> estimate from that depending on the data-type-dependent width of
> the table rows.
> 
> Without this provision, we'd produce very bad plans for cases
> where a newly-populated table hasn't been analyzed yet.

We could have a noice mode that warns when a table without statistics is
used.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson