Thread: Can you spot the difference?
postgres@moshe=>devmain:ises=# explain analyze SELECT count(*) FROM tb_order_location ol JOIN tb_line_item li on li.order_location = ol.order_location WHERE li.tracking_number = '10137378459';
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=671.48..671.49 rows=1 width=0) (actual time=0.272..0.272 rows=1 loops=1)
-> Nested Loop (cost=0.00..671.34 rows=54 width=0) (actual time=0.124..0.265 rows=16 loops=1)
-> Index Scan using tb_line_item_tracking_number_key on tb_line_item li (cost=0.00..219.17 rows=54 width=4) (actual time=0.087..0.161 rows=16 loops=1)
Index Cond: ((tracking_number)::text = '10137378459'::text)
-> Index Scan using tb_order_location_pkey on tb_order_location ol (cost=0.00..8.36 rows=1 width=4) (actual time=0.005..0.005 rows=1 loops=16)
Index Cond: (order_location = li.order_location)
Total runtime: 0.343 ms
(7 rows)postgres@moshe=>devmain:ises_coelacanth=# explain analyze SELECT count(*) FROM tb_order_location ol JOIN tb_line_item li on li.order_location = ol.order_location WHERE li.tracking_number = '10137378459';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=50467.40..50467.41 rows=1 width=0) (actual time=333.490..333.491 rows=1 loops=1)
-> Hash Join (cost=26551.11..50447.62 rows=7915 width=0) (actual time=332.045..333.481 rows=16 loops=1)
Hash Cond: (li.order_location = ol.order_location)
-> Bitmap Heap Scan on tb_line_item li (cost=177.82..20715.03 rows=7915 width=4) (actual time=0.128..0.209 rows=16 loops=1)
Recheck Cond: ((tracking_number)::text = '10137378459'::text)
-> Bitmap Index Scan on tb_line_item_tracking_number_key (cost=0.00..175.84 rows=7915 width=0) (actual time=0.108..0.108 rows=16 loops=1)
Index Cond: ((tracking_number)::text = '10137378459'::text)
-> Hash (cost=13190.24..13190.24 rows=803524 width=4) (actual time=324.114..324.114 rows=803553 loops=1)
Buckets: 4096 Batches: 32 Memory Usage: 887kB
-> Seq Scan on tb_order_location ol (cost=0.00..13190.24 rows=803524 width=4) (actual time=0.024..144.581 rows=803553 loops=1)
Total runtime: 333.766 ms
(11 rows)
(...)postgres@moshe=>devmain:ises=# \d tb_line_item
Table "public.tb_line_item"
Column | Type | Modifiers
----------------------------+-----------------------------+-------------------------------------------------------
line_item | integer | not null default nextval('sq_pk_line_item'::regclass)
order_location | integer | not null
(...)
tracking_number | character varying(512) |
(...)Indexes:
"tb_line_item_pkey" PRIMARY KEY, btree (line_item)
"tb_line_item_order_catalog_article_key" UNIQUE CONSTRAINT, btree (order_catalog_article, order_location, project, creator)
"tb_line_item_order_vendor_article_key" UNIQUE CONSTRAINT, btree (order_vendor_article, order_location, project, creator)
"idx_line_item_canceled" btree (canceled)
"ix_line_item_project" btree (project)
"ix_line_item_reset" btree (reset)
"tb_line_item_order_location_key" btree (order_location)
"tb_line_item_tracking_number_key" btree (tracking_number)
Check constraints:
"chk_order_vendor_article_or_order_catalog_article" CHECK (order_vendor_article IS NULL AND order_catalog_article IS NOT NULL OR order_vendor_article IS NOT NULL AND order_catalog_article IS NULL)
"tb_line_item_check" CHECK (
CASE
WHEN executed IS NOT NULL AND canceled IS NOT NULL THEN false
ELSE true
END)
"tb_line_item_quantity_backordered_check" CHECK (quantity_backordered >= 0::numeric)
"tb_line_item_quantity_ordered_check" CHECK (quantity_ordered <> 0::numeric)
"tb_line_item_unit_price_check" CHECK (unit_price >= 0::numeric)
Foreign-key constraints:
(...)
"tb_line_item_order_location_fkey" FOREIGN KEY (order_location) REFERENCES tb_order_location(order_location)
(...)
Referenced by:
TABLE "tb_shipment" CONSTRAINT "tb_shipment_line_item_fkey" FOREIGN KEY (line_item) REFERENCES tb_line_item(line_item)
Triggers:
(...)
postgres@moshe=>devmain:ises=# \d tb_order_location
Table "public.tb_order_location"
Column | Type | Modifiers
-----------------------------+-----------------------------+------------------------------------------------------------
order_location | integer | not null default nextval('sq_pk_order_location'::regclass)
orderid | integer | not null
(...)
Indexes:
"tb_order_location_pkey" PRIMARY KEY, btree (order_location)
"tb_order_location_orderid_key" UNIQUE CONSTRAINT, btree (orderid, location)
"tb_order_location_location_key" btree (location)
Foreign-key constraints:
"tb_order_location_location_fkey" FOREIGN KEY (location) REFERENCES tb_location(location)
"tb_order_location_orderid_fkey" FOREIGN KEY (orderid) REFERENCES tb_order(orderid)
Referenced by:
TABLE "tb_line_item" CONSTRAINT "tb_line_item_order_location_fkey" FOREIGN KEY (order_location) REFERENCES tb_order_location(order_location)
Triggers:
(...)
postgres@moshe=>devmain:ises_coelacanth=# \d tb_line_item(...)Table "public.tb_line_item"
Column | Type | Modifiers
----------------------------+-----------------------------+-------------------------------------------------------
line_item | integer | not null default nextval('sq_pk_line_item'::regclass)
order_location | integer | not null
(...)
tracking_number | character varying(512) |
(...)Indexes:
"tb_line_item_pkey" PRIMARY KEY, btree (line_item)
"tb_line_item_order_catalog_article_key" UNIQUE CONSTRAINT, btree (order_article_location, order_location, project, creator)
"idx_line_item_canceled" btree (canceled)
"idx_line_item_executed" btree (executed)
"ix_line_item_project" btree (project)
"ix_line_item_reset" btree (reset)
"tb_line_item_order_location_key" btree (order_location)
"tb_line_item_tracking_number_key" btree (tracking_number)
Check constraints:
"tb_line_item_check" CHECK (
CASE
WHEN executed IS NOT NULL AND canceled IS NOT NULL THEN false
ELSE true
END)
"tb_line_item_quantity_backordered_check" CHECK (quantity_backordered >= 0::numeric)
"tb_line_item_quantity_ordered_check" CHECK (quantity_ordered <> 0::numeric)
"tb_line_item_unit_price_check" CHECK (unit_price >= 0::numeric)
Foreign-key constraints:
(...)
"tb_line_item_order_location_fkey" FOREIGN KEY (order_location) REFERENCES tb_order_location(order_location)
(...)
Referenced by:
TABLE "tb_shipment" CONSTRAINT "tb_shipment_line_item_fkey" FOREIGN KEY (line_item) REFERENCES tb_line_item(line_item)
Triggers:
(...)
postgres@moshe=>devmain:ises_coelacanth=# \d tb_order_location
Table "public.tb_order_location"
Column | Type | Modifiers
-----------------------------+-----------------------------+------------------------------------------------------------
order_location | integer | not null default nextval('sq_pk_order_location'::regclass)
orderid | integer | not null
(...)
Indexes:
"tb_order_location_pkey" PRIMARY KEY, btree (order_location)
"tb_order_location_orderid_key" UNIQUE CONSTRAINT, btree (orderid, location)
"tb_order_location_location_key" btree (location)
Foreign-key constraints:
"tb_order_location_location_fkey" FOREIGN KEY (location) REFERENCES tb_location(location)
"tb_order_location_orderid_fkey" FOREIGN KEY (orderid) REFERENCES tb_order(orderid)
Referenced by:
TABLE "tb_line_item" CONSTRAINT "tb_line_item_order_location_fkey" FOREIGN KEY (order_location) REFERENCES tb_order_location(order_location)
Triggers:
(...)
Nead Werx, Inc. | Manager of Systems Engineering
moshe@neadwerx.com | www.neadwerx.com
On 04/16/2013 12:07 PM, Moshe Jacobson wrote: > Hi PostgreSQL friends, > > I have two databases in the same cluster that are almost identical. One > is a copy of the other as we are developing some new features in the copy. > > My problem is that the exact same simple query performs great in the > original database ("ises") and dismally in the copy database > ("ises_coelacanth"). The problem is that in ises, it uses an index scan, > but in ises_coelacanth it uses a sequential scan: The difference is that Postgres is coming to alternate conclusions as to what plan to use. Given that the copy is causing the 'problem', the question to ask is; did you run ANALYZE on the table once the data was copied in? > > -- > Moshe Jacobson > Nead Werx, Inc. | Manager of Systems Engineering > 2323 Cumberland Parkway, Suite 201 | Atlanta, GA 30339 > moshe@neadwerx.com <mailto:moshe@neadwerx.com> | www.neadwerx.com > <http://www.neadwerx.com/> > > "Quality is not an act, it is a habit." -- Aristotle -- Adrian Klaver adrian.klaver@gmail.com
Moshe Jacobson <moshe@neadwerx.com> writes: > My problem is that the exact same simple query performs great in the > original database ("ises") and dismally in the copy database > ("ises_coelacanth"). The problem is that in ises, it uses an index scan, > but in ises_coelacanth it uses a sequential scan: The rowcount estimates are much further away from reality in the second database. Either you forgot to run ANALYZE at all, or the stats target settings are different (and lower) in the second DB. regards, tom lane
On 04/16/2013 12:07 PM, Moshe Jacobson wrote:The difference is that Postgres is coming to alternate conclusions as to what plan to use. Given that the copy is causing the 'problem', the question to ask is; did you run ANALYZE on the table once the data was copied in?Hi PostgreSQL friends,
I have two databases in the same cluster that are almost identical. One
is a copy of the other as we are developing some new features in the copy.
My problem is that the exact same simple query performs great in the
original database ("ises") and dismally in the copy database
("ises_coelacanth"). The problem is that in ises, it uses an index scan,
but in ises_coelacanth it uses a sequential scan:moshe@neadwerx.com <mailto:moshe@neadwerx.com> | www.neadwerx.com
--
Moshe Jacobson
Nead Werx, Inc. | Manager of Systems Engineering
2323 Cumberland Parkway, Suite 201 | Atlanta, GA 30339
<http://www.neadwerx.com/>
"Quality is not an act, it is a habit." -- Aristotle
--
Adrian Klaver
adrian.klaver@gmail.com
Nead Werx, Inc. | Manager of Systems Engineering
moshe@neadwerx.com | www.neadwerx.com
Statistics on your “original” and “copy” databases must be different.
For the same condition (li.tracking_number = '10137378459') optimizer expects to find 7915 rows in tb_line_item table on the “copy” database while only 54 rows on the “original” database.
Also, the other table (tb_order_location) could have bad statistics as well.
That could cause different execution plans.
Regards,
Igor Neyman
From: Moshe Jacobson [mailto:moshe@neadwerx.com]
Sent: Tuesday, April 16, 2013 3:07 PM
To: pgsql-general
Subject: Can you spot the difference?
Hi PostgreSQL friends,
I have two databases in the same cluster that are almost identical. One is a copy of the other as we are developing some new features in the copy.
My problem is that the exact same simple query performs great in the original database ("ises") and dismally in the copy database ("ises_coelacanth"). The problem is that in ises, it uses an index scan, but in ises_coelacanth it uses a sequential scan:
postgres@moshe=>devmain:ises=# explain analyze SELECT count(*) FROM tb_order_location ol JOIN tb_line_item li on li.order_location = ol.order_location WHERE li.tracking_number = '10137378459';
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=671.48..671.49 rows=1 width=0) (actual time=0.272..0.272 rows=1 loops=1)
-> Nested Loop (cost=0.00..671.34 rows=54 width=0) (actual time=0.124..0.265 rows=16 loops=1)
-> Index Scan using tb_line_item_tracking_number_key on tb_line_item li (cost=0.00..219.17 rows=54 width=4) (actual time=0.087..0.161 rows=16 loops=1)
Index Cond: ((tracking_number)::text = '10137378459'::text)
-> Index Scan using tb_order_location_pkey on tb_order_location ol (cost=0.00..8.36 rows=1 width=4) (actual time=0.005..0.005 rows=1 loops=16)
Index Cond: (order_location = li.order_location)
Total runtime: 0.343 ms
(7 rows)
postgres@moshe=>devmain:ises_coelacanth=# explain analyze SELECT count(*) FROM tb_order_location ol JOIN tb_line_item li on li.order_location = ol.order_location WHERE li.tracking_number = '10137378459';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=50467.40..50467.41 rows=1 width=0) (actual time=333.490..333.491 rows=1 loops=1)
-> Hash Join (cost=26551.11..50447.62 rows=7915 width=0) (actual time=332.045..333.481 rows=16 loops=1)
Hash Cond: (li.order_location = ol.order_location)
-> Bitmap Heap Scan on tb_line_item li (cost=177.82..20715.03 rows=7915 width=4) (actual time=0.128..0.209 rows=16 loops=1)
Recheck Cond: ((tracking_number)::text = '10137378459'::text)
-> Bitmap Index Scan on tb_line_item_tracking_number_key (cost=0.00..175.84 rows=7915 width=0) (actual time=0.108..0.108 rows=16 loops=1)
Index Cond: ((tracking_number)::text = '10137378459'::text)
-> Hash (cost=13190.24..13190.24 rows=803524 width=4) (actual time=324.114..324.114 rows=803553 loops=1)
Buckets: 4096 Batches: 32 Memory Usage: 887kB
-> Seq Scan on tb_order_location ol (cost=0.00..13190.24 rows=803524 width=4) (actual time=0.024..144.581 rows=803553 loops=1)
Total runtime: 333.766 ms
(11 rows)
Both of these queries return 16 rows, as you can see.
Below I've included the information on each of these tables. They have the same indexes and are identical for the purposes of this query.
Can you help me figure out what is going on here?? Thank you!
postgres@moshe=>devmain:ises=# \d tb_line_item
Table "public.tb_line_item"
Column | Type | Modifiers
----------------------------+-----------------------------+-------------------------------------------------------
line_item | integer | not null default nextval('sq_pk_line_item'::regclass)(...)
order_location | integer | not null
(...)
tracking_number | character varying(512) |
(...)Indexes:
"tb_line_item_pkey" PRIMARY KEY, btree (line_item)
"tb_line_item_order_catalog_article_key" UNIQUE CONSTRAINT, btree (order_catalog_article, order_location, project, creator)
"tb_line_item_order_vendor_article_key" UNIQUE CONSTRAINT, btree (order_vendor_article, order_location, project, creator)
"idx_line_item_canceled" btree (canceled)
"ix_line_item_project" btree (project)
"ix_line_item_reset" btree (reset)
"tb_line_item_order_location_key" btree (order_location)
"tb_line_item_tracking_number_key" btree (tracking_number)
Check constraints:
"chk_order_vendor_article_or_order_catalog_article" CHECK (order_vendor_article IS NULL AND order_catalog_article IS NOT NULL OR order_vendor_article IS NOT NULL AND order_catalog_article IS NULL)
"tb_line_item_check" CHECK (
CASE
WHEN executed IS NOT NULL AND canceled IS NOT NULL THEN false
ELSE true
END)
"tb_line_item_quantity_backordered_check" CHECK (quantity_backordered >= 0::numeric)
"tb_line_item_quantity_ordered_check" CHECK (quantity_ordered <> 0::numeric)
"tb_line_item_unit_price_check" CHECK (unit_price >= 0::numeric)
Foreign-key constraints:
(...)
"tb_line_item_order_location_fkey" FOREIGN KEY (order_location) REFERENCES tb_order_location(order_location)
(...)
Referenced by:
TABLE "tb_shipment" CONSTRAINT "tb_shipment_line_item_fkey" FOREIGN KEY (line_item) REFERENCES tb_line_item(line_item)
Triggers:
(...)
postgres@moshe=>devmain:ises=# \d tb_order_location
Table "public.tb_order_location"
Column | Type | Modifiers
-----------------------------+-----------------------------+------------------------------------------------------------
order_location | integer | not null default nextval('sq_pk_order_location'::regclass)
orderid | integer | not null
(...)
Indexes:
"tb_order_location_pkey" PRIMARY KEY, btree (order_location)
"tb_order_location_orderid_key" UNIQUE CONSTRAINT, btree (orderid, location)
"tb_order_location_location_key" btree (location)
Foreign-key constraints:
"tb_order_location_location_fkey" FOREIGN KEY (location) REFERENCES tb_location(location)
"tb_order_location_orderid_fkey" FOREIGN KEY (orderid) REFERENCES tb_order(orderid)
Referenced by:
TABLE "tb_line_item" CONSTRAINT "tb_line_item_order_location_fkey" FOREIGN KEY (order_location) REFERENCES tb_order_location(order_location)
Triggers:
(...)
Here is the \d for tb_order_location and tb_line_item on ises_coelacanth:
postgres@moshe=>devmain:ises_coelacanth=# \d tb_line_item
Table "public.tb_line_item"
Column | Type | Modifiers
----------------------------+-----------------------------+-------------------------------------------------------
line_item | integer | not null default nextval('sq_pk_line_item'::regclass)(...)
order_location | integer | not null
(...)
tracking_number | character varying(512) |
(...)Indexes:
"tb_line_item_pkey" PRIMARY KEY, btree (line_item)
"tb_line_item_order_catalog_article_key" UNIQUE CONSTRAINT, btree (order_article_location, order_location, project, creator)
"idx_line_item_canceled" btree (canceled)
"idx_line_item_executed" btree (executed)
"ix_line_item_project" btree (project)
"ix_line_item_reset" btree (reset)
"tb_line_item_order_location_key" btree (order_location)
"tb_line_item_tracking_number_key" btree (tracking_number)
Check constraints:
"tb_line_item_check" CHECK (
CASE
WHEN executed IS NOT NULL AND canceled IS NOT NULL THEN false
ELSE true
END)
"tb_line_item_quantity_backordered_check" CHECK (quantity_backordered >= 0::numeric)
"tb_line_item_quantity_ordered_check" CHECK (quantity_ordered <> 0::numeric)
"tb_line_item_unit_price_check" CHECK (unit_price >= 0::numeric)
Foreign-key constraints:
(...)
"tb_line_item_order_location_fkey" FOREIGN KEY (order_location) REFERENCES tb_order_location(order_location)
(...)
Referenced by:
TABLE "tb_shipment" CONSTRAINT "tb_shipment_line_item_fkey" FOREIGN KEY (line_item) REFERENCES tb_line_item(line_item)
Triggers:
(...)
postgres@moshe=>devmain:ises_coelacanth=# \d tb_order_location
Table "public.tb_order_location"
Column | Type | Modifiers
-----------------------------+-----------------------------+------------------------------------------------------------
order_location | integer | not null default nextval('sq_pk_order_location'::regclass)
orderid | integer | not null
(...)
Indexes:
"tb_order_location_pkey" PRIMARY KEY, btree (order_location)
"tb_order_location_orderid_key" UNIQUE CONSTRAINT, btree (orderid, location)
"tb_order_location_location_key" btree (location)
Foreign-key constraints:
"tb_order_location_location_fkey" FOREIGN KEY (location) REFERENCES tb_location(location)
"tb_order_location_orderid_fkey" FOREIGN KEY (orderid) REFERENCES tb_order(orderid)
Referenced by:
TABLE "tb_line_item" CONSTRAINT "tb_line_item_order_location_fkey" FOREIGN KEY (order_location) REFERENCES tb_order_location(order_location)
Triggers:
(...)
--
Moshe Jacobson
Nead Werx, Inc. | Manager of Systems Engineering
2323 Cumberland Parkway, Suite 201 | Atlanta, GA 30339
moshe@neadwerx.com | www.neadwerx.com
"Quality is not an act, it is a habit." -- Aristotle
Given that the copy is causing the 'problem', the question to ask is; did you run ANALYZE on the table once the data was copied in?
I did not -- I expected the autovacuum daemon to do so. Why did it not?
Nead Werx, Inc. | Manager of Systems Engineering
moshe@neadwerx.com | www.neadwerx.com
On 04/16/2013 01:55 PM, Moshe Jacobson wrote: > On Tue, Apr 16, 2013 at 3:29 PM, Adrian Klaver <adrian.klaver@gmail.com > <mailto:adrian.klaver@gmail.com>> wrote: > > Given that the copy is causing the 'problem', the question to ask > is; did you run ANALYZE on the table once the data was copied in? > > > I did not -- I expected the autovacuum daemon to do so. Why did it not? > The database was created & restored days ago, and the autovacuum daemon > is running with default settings. http://www.postgresql.org/docs/9.2/static/routine-vacuuming.html " The autovacuum daemon, if enabled, will automatically issue ANALYZE commands whenever the content of a table has changed sufficiently. However, administrators might prefer to rely on manually-scheduled ANALYZE operations, particularly if it is known that update activity on a table will not affect the statistics of "interesting" columns. The daemon schedules ANALYZE strictly as a function of the number of rows inserted or updated; it has no knowledge of whether that will lead to meaningful statistical changes. " So at a guess there has not been enough churn on the table. > > Thanks. > > -- > Moshe Jacobson > Nead Werx, Inc. | Manager of Systems Engineering > 2323 Cumberland Parkway, Suite 201 | Atlanta, GA 30339 > moshe@neadwerx.com <mailto:moshe@neadwerx.com> | www.neadwerx.com > <http://www.neadwerx.com/> > > "Quality is not an act, it is a habit." -- Aristotle -- Adrian Klaver adrian.klaver@gmail.com
"
The autovacuum daemon, if enabled, will automatically issue ANALYZE commands whenever the content of a table has changed sufficiently. However, administrators might prefer to rely on manually-scheduled ANALYZE operations, particularly if it is known that update activity on a table will not affect the statistics of "interesting" columns. The daemon schedules ANALYZE strictly as a function of the number of rows inserted or updated; it has no knowledge of whether that will lead to meaningful statistical changes.
"
So at a guess there has not been enough churn on the table.
So pg_restore's COPY would not trigger the ANALYZE? That seems wrong.
Nead Werx, Inc. | Manager of Systems Engineering
moshe@neadwerx.com | www.neadwerx.com
On 04/17/2013 07:49 AM, Moshe Jacobson wrote: > > On Tue, Apr 16, 2013 at 7:29 PM, Adrian Klaver <adrian.klaver@gmail.com > <mailto:adrian.klaver@gmail.com>> wrote: > > " > The autovacuum daemon, if enabled, will automatically issue ANALYZE > commands whenever the content of a table has changed sufficiently. > However, administrators might prefer to rely on manually-scheduled > ANALYZE operations, particularly if it is known that update activity > on a table will not affect the statistics of "interesting" columns. > The daemon schedules ANALYZE strictly as a function of the number of > rows inserted or updated; it has no knowledge of whether that will > lead to meaningful statistical changes. > " > > So at a guess there has not been enough churn on the table. > > > So pg_restore's COPY would not trigger the ANALYZE? That seems wrong. Well the argument most often heard is that the command has a cost and it left to the discretion of the user as to when to incur that cost. For instance I, and others, often use COPY to transfer data from some external data source to a holding table, from which the data is then manipulated/transferred to one or more tables. I have not interest in having the holding table ANALYZEd as I am going to hit all the rows anyway. Generally what people do in your situation is include a manual ANALYZE in a script that is part of or follows the COPY. > > > -- > Moshe Jacobson > Nead Werx, Inc. | Manager of Systems Engineering > 2323 Cumberland Parkway, Suite 201 | Atlanta, GA 30339 > moshe@neadwerx.com <mailto:moshe@neadwerx.com> | www.neadwerx.com > <http://www.neadwerx.com/> > > "Quality is not an act, it is a habit." -- Aristotle -- Adrian Klaver adrian.klaver@gmail.com