Thread: Optimizer showing wrong rows in plan
Hi All,
Example on optimizer
===============
postgres=# create table test(id int);
CREATE TABLE
postgres=# insert into test VALUES (1);
INSERT 0 1
postgres=# select * from test;
id
----
1
(1 row)
postgres=# explain select * from test;
QUERY PLAN
--------------------------------------------------------
Seq Scan on test (cost=0.00..34.00 rows=2400 width=4)
(1 row)
In the above example, the optimizer reports that many rows even though there is only one row in the table. If I analyze the table, I get one row.
postgres=# ANALYZE test;
ANALYZE
postgres=# explain select * from test;
QUERY PLAN
----------------------------------------------------
Seq Scan on test (cost=0.00..1.01 rows=1 width=4)
(1 row)
My question is: what is it reporting as rows when there are no such rows? One more thing: if I don't run ANALYZE and run EXPLAIN three or more times, the catalogs seem to get updated automatically and the plan shows the correct row count of 1.
Q2. Does EXPLAIN update the catalogs automatically?
Regards
Raghavendra
2010/3/28 Tadipathri Raghu <traghu.dba@gmail.com>
> postgres=# explain select * from test;
> Seq Scan on test (cost=0.00..34.00 rows=2400 width=4)
> In the above example, the optimizer reports that many rows even though there is only one row in the table. If I analyze the table, I get one row.
No, the optimizer is not retrieving anything. It just assumes that there are 2400 rows because that is the number of rows recorded in the statistics for this table. The optimizer tries to find the best plan for executing the query, taking into account all the information it can find about this table (including the row counts in the statistics).
> postgres=# ANALYZE test;
> postgres=# explain select * from test;
> Seq Scan on test (cost=0.00..1.01 rows=1 width=4)
> My question is: what is it reporting as rows when there are no such rows? One more thing: if I don't run ANALYZE and run EXPLAIN three or more times, the catalogs seem to get updated automatically and the plan shows the correct row count of 1.
ANALYZE updated the statistics for this table, so the planner now knows that there is just one row. Autovacuum may also be running in the background and updating the row counts automatically (how it behaves depends on the database settings).
> Q2. Does EXPLAIN update the catalogs automatically?
No, EXPLAIN doesn't update the table's statistics.
regards
Szymon Guz
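A note on the cost figures in the two plans above: for a plain sequential scan with no filter, the estimated total cost is roughly pages * seq_page_cost + rows * cpu_tuple_cost. The sketch below uses Python purely for the arithmetic. It assumes the default settings (seq_page_cost = 1.0, cpu_tuple_cost = 0.01) and, for the first plan, a page count of 10, which is not shown anywhere in this thread and is inferred from the 34.00 figure. The function name is made up for illustration.

```python
# Assumed planner settings: these are the stock defaults, not values
# reported by anyone in the thread.
SEQ_PAGE_COST = 1.0
CPU_TUPLE_COST = 0.01


def seq_scan_cost(pages, rows):
    """Estimated total cost of an unfiltered sequential scan:
    one page fetch per page plus a small CPU charge per row."""
    return round(pages * SEQ_PAGE_COST + rows * CPU_TUPLE_COST, 2)


# Before ANALYZE: rows=2400 comes from the plan, pages=10 is an assumption.
print(seq_scan_cost(pages=10, rows=2400))

# After ANALYZE: one page holding one row.
print(seq_scan_cost(pages=1, rows=1))
```

Under those assumptions the two calls give 34.0 and 1.01, which lines up with cost=0.00..34.00 and cost=0.00..1.01 in the plans.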
Hi Guz,
Thank you for the prompt reply.
> No, the optimizer is not retrieving anything. It just assumes that there are 2400 rows because that is the number of rows recorded in the statistics for this table.
So what is it assuming here as rows (2400)? Could you explain this?
Regards
Raghavendra
2010/3/28 Tadipathri Raghu <traghu.dba@gmail.com>
> So what is it assuming here as rows (2400)? Could you explain this?
It is assuming that there are 2400 rows in this table. Probably you've deleted some rows from the table leaving just one.
regards
Szymon Guz
Hi Guz,
> It is assuming that there are 2400 rows in this table. Probably you've deleted some rows from the table leaving just one.
Frankly speaking, it's a newly created table with no other operations on it, as you have seen in the example. Then how come it shows that many rows when there is only one in it?
I would be thankful for a proper explanation of this.
Regards
Raghavendra
Hi All,
I want to shed some more light on this with a bit more analysis:
1. In my example I created a table with one INT column (which occupies 4 bytes).
2. Initially it occupies one page of space in the file, that is, 8 kB.
So, is it assuming that this many rows may fit in that page? Please clarify this for me.
Regards
Raghavendra
On Sun, Mar 28, 2010 at 2:06 PM, Gary Doades <gpd@gpdnet.co.uk> wrote:
On 28/03/2010 8:33 AM, Tadipathri Raghu wrote:
> Frankly speaking its a newly created table without any operation on it as you have seen the example. Then how come it showing those many rows where we have only one in it.
It's not *showing* any rows at all, it's *guessing* 2400 rows because you've never analyzed the table. Without any statistics at all, postgres will use some form of in-built guess for a table that produces reasonable plans under average conditions. As you've already seen, once you analyze the table, the guess gets much better and therefore would give you a more appropriate plan.
Regards,
Gary.
On 28 Mar 2010, at 11:07, Tadipathri Raghu wrote:
> 1. In my example I have created a table with one column as INT( which occupies 4 bytes)
> 2. Initially it occupies one page of space on the file that is (8kb).
> So, here is it assuming these many rows may fit in this page. Clarify me on this Please.
See these chapters in the manual: http://www.postgresql.org/docs/8.4/interactive/storage.html
The minimum size of a file depends on the block size, by default 8kb: http://www.postgresql.org/docs/8.4/interactive/install-procedure.html
Regards,
Frank
On 28/03/2010 10:07 AM, Tadipathri Raghu wrote:
> Hi All,
> I want to give some more light on this by analysing more like this
> 1. In my example I have created a table with one column as INT( which occupies 4 bytes)
> 2. Initially it occupies one page of space on the file that is (8kb).
> So, here is it assuming these many rows may fit in this page. Clarify me on this Please.
Like I said, it's just a guess. With no statistics all postgres can do is guess, or in this case use the in-built default for a newly created table. It could guess 1 or it could guess 10,000,000. What it does is produce a reasonable guess in the absence of any other information.
You should read the postgres documentation for further information about statistics and how the optimizer uses them.
Regards,
Gary.
Tadipathri Raghu <traghu.dba@gmail.com> writes:
> Frankly speaking its a newly created table without any operation on it as
> you have seen the example. Then how come it showing those many rows where we
> have only one in it.
Yes. This is intentional: the size estimates for a never-yet-analyzed table are *not* zero. This is because people frequently create and load up a table and then immediately query it without an explicit ANALYZE. The quality of the plans you'd get at that point (especially for joins) would be spectacularly bad if the default assumption were that the table was very small.
regards, tom lane
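For anyone wondering where a figure like 2400 could come from, here is one way to reproduce it, again using Python only for the arithmetic. Everything numeric in it is an assumption based on my recollection of the planner of that era, not something stated in this thread: a floor of 10 pages for a table that has never been analyzed, and an assumed per-row width of the column data (4 bytes for an int) plus 24 bytes of row header plus 6 bytes for an item pointer. Other PostgreSQL versions use different constants and give a different default estimate, so treat this as a sketch, not as the planner's actual code.

```python
BLOCK_SIZE = 8192   # default page size in bytes
PAGE_HEADER = 24    # assumed page header size in bytes
MIN_PAGES = 10      # assumed minimum page count for a never-analyzed table


def default_row_estimate(data_width, row_header=24, item_pointer=6):
    """Guess a row count from the column width alone, with no statistics.

    row_header and item_pointer are assumed per-row overheads in bytes.
    """
    row_width = data_width + row_header + item_pointer
    # Integer division: only whole rows fit on a page.
    rows_per_page = (BLOCK_SIZE - PAGE_HEADER) // row_width
    return MIN_PAGES * rows_per_page


# A single int column is 4 bytes wide.
print(default_row_estimate(4))
```

With these assumed constants the result is 2400 (240 rows per page times 10 pages), matching the estimate in the first plan. The point is that the guess is derived from the column widths and a minimum table size, not from the actual contents of the table.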
On 03/28/2010 05:27 PM, Tom Lane wrote:
> This is intentional: the size estimates for a never-yet-analyzed
> table are *not* zero. This is because people frequently create and load
> up a table and then immediately query it without an explicit ANALYZE.
Does the creation of an index also populate statistics?
Thanks,
Jeremy
Jeremy Harris <jgh@wizmail.org> writes:
> On 03/28/2010 05:27 PM, Tom Lane wrote:
>> This is intentional: the size estimates for a never-yet-analyzed
>> table are *not* zero. This is because people frequently create and load
>> up a table and then immediately query it without an explicit ANALYZE.
> Does the creation of an index also populate statistics?
IIRC, it will set the relpages/reltuples counts (though not any more-complex statistics); but only if the table is found to not be completely empty. Again, this is a behavior designed with common usage patterns in mind, to not set relpages/reltuples to zero on a table that's likely to get populated shortly.
regards, tom lane
Hi Tom,
Thanks for the update.
> IIRC, it will set the relpages/reltuples counts (though not any
> more-complex statistics); but only if the table is found to not be
> completely empty. Again, this is a behavior designed with common
> usage patterns in mind, to not set relpages/reltuples to zero on a
> table that's likely to get populated shortly.
Harris asked whether creating an index updates the statistics. Yes, creating an index does update them, so it seems index creation examines the table in the backend and updates the counts too, either before or after building the index.
Example
======
postgres=# create table test(id int);
CREATE TABLE
postgres=# insert into test VALUES (1);
INSERT 0 1
postgres=# select relname,reltuples,relpages from pg_class where relname='test';
relname | reltuples | relpages
---------+-----------+----------
test | 0 | 0
(1 row)
postgres=# create INDEX itest on test (id);
CREATE INDEX
postgres=# select relname,reltuples,relpages from pg_class where relname='test';
relname | reltuples | relpages
---------+-----------+----------
test | 1 | 1
(1 row)
Adding one more thing to this thread
==========================
As per the documentation, one page is 8 kB, and a table with a single int column needs 4 bytes per row. If I insert 2000 rows, I would expect them to fit in one page, since that is only about 8 kB of data, but the table grows far beyond that. The example below uses the previous example table test with one column.
postgres=# insert into test VALUES (generate_series(2,2000));
INSERT 0 1999
postgres=# \dt+
List of relations
Schema | Name | Type | Owner | Size | Description
----------+------+-------+----------+-------+-------------
edbstore | test | table | postgres | 64 kB |
(1 row)
postgres=# select count(*) from test ;
count
-------
2000
(1 row)
Why is it extending to so many pages when the data should fit in one? Is there a particular reason for this paging behavior?
Thanks to all in advance
Regards
Raghavendra
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
On Mon, 29 Mar 2010, Tadipathri Raghu wrote:
> As per the documentation, one page is 8kb, when i create a table with int as
> one column its 4 bytes. If i insert 2000 rows, it should be in one page only
> as its 8kb, but its extending vastly as expected. Example shown below,
> taking the previous example table test with one column.
There is more to a row than just the single int column. The space used by a row will include a column start marker (data length), transaction ids, hint bits, an oid, a description of the types of the columns, and finally your data columns. That takes a bit more space.
Matthew
--
If you let your happiness depend upon how somebody else feels about you, now you have to control how somebody else feels about you. -- Abraham Hicks
Hi Matthew,
Thank you for the information.
Once again, I would like to thank each and every one in this thread for their support.
Regards
Raghavendra
See http://www.postgresql.org/docs/current/static/storage-page-layout.html for everything that takes up the space. Short version:
Per-block overhead is > 24 bytes
Per-row overhead is 23 bytes + some alignment loss + the null bitmap if you have nullable columns
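To put numbers on that for the 2000-row example, here is a rough back-of-the-envelope calculation in Python. The 24-byte page header and 23-byte row header come from the figures above. The 4-byte per-row line pointer and the alignment boundaries (4 or 8 bytes, depending on the platform) are assumptions I am adding, and the helper names are made up for illustration.

```python
import math

BLOCK_SIZE = 8192    # default page size in bytes
PAGE_HEADER = 24     # per-block overhead from the short version above
ROW_HEADER = 23      # per-row header from the short version above
LINE_POINTER = 4     # assumed per-row entry in the page's item array


def align(n, boundary):
    """Round n up to the next multiple of boundary."""
    return -(-n // boundary) * boundary


def rows_per_page(data_bytes, maxalign):
    """Whole rows that fit on one page for a given alignment boundary."""
    header = align(ROW_HEADER, maxalign)
    row_size = align(header + data_bytes, maxalign)
    return (BLOCK_SIZE - PAGE_HEADER) // (row_size + LINE_POINTER)


def pages_needed(rows, data_bytes, maxalign):
    return math.ceil(rows / rows_per_page(data_bytes, maxalign))


# 2000 rows of a single 4-byte int, for 4-byte and 8-byte alignment.
for maxalign in (4, 8):
    pages = pages_needed(2000, 4, maxalign)
    print(maxalign, rows_per_page(4, maxalign), pages, pages * BLOCK_SIZE // 1024, "kB")
```

With 4-byte alignment this works out to 255 rows per page and 8 pages (64 kB), which is consistent with the 64 kB that \dt+ reported earlier in the thread. With 8-byte alignment it would be 226 rows per page and 9 pages (72 kB). Either way, a 4-byte int ends up costing well over 30 bytes per row on disk.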