Thread: [PERFORM] Performance Issue -- "Materialize"

[PERFORM] Performance Issue -- "Materialize"

From
anand086
Date:
I am a Postgres Newbie and trying to learn :) We have a scenario wherein, one of the SQL with different input value for import_num showing different execution plan. As an example, with import_num = '4520440' the execution plan shows Nested Loop and is taking ~12secs. With import_num = '4520460' execution plan showed using "Materialize" and never completed. After I set enable_material to off, the execution plan is changed using Hash Semi Join and completes in less than 3 secs. SELECT count(*) FROM test_tab WHERE login IN (SELECT DISTINCT login FROM test_tab WHERE import_num = '4520440' AND login IS NOT NULL EXCEPT SELECT DISTINCT login FROM test_tab WHERE import_num = '0' AND login IS NOT NULL) AND import_num = '4520440'; +--------+ | count | +--------+ | 746982 | +--------+ (1 row) Time: 12054.274 ms
+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
|                                                                        QUERY PLAN                                                                         |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
| Aggregate  (cost=351405.08..351405.09 rows=1 width=8)                                                                                                     |
|   ->  Nested Loop  (cost=349846.23..350366.17 rows=415562 width=0)                                                                                        |
|         ->  HashAggregate  (cost=349845.67..349847.67 rows=200 width=96)                                                                                  |
|               Group Key: ("ANY_subquery".login)::text                                                                                                     |
|               ->  Subquery Scan on "ANY_subquery"  (cost=340828.23..348557.47 rows=515282 width=96)                                                       |
|                     ->  SetOp Except  (cost=340828.23..343404.65 rows=515282 width=100)                                                                   |
|                           ->  Sort  (cost=340828.23..342116.44 rows=515283 width=100)                                                                     |
|                                 Sort Key: "*SELECT* 1".login                                                                                              |
|                                 ->  Append  (cost=0.56..275836.74 rows=515283 width=100)                                                                  |
|                                       ->  Subquery Scan on "*SELECT* 1"  (cost=0.56..275834.70 rows=515282 width=12)                                      |
|                                             ->  Unique  (cost=0.56..270681.88 rows=515282 width=8)                                                        |
|                                                   ->  Index Only Scan using ui_nkey_test_tab on test_tab test_tab_1  (cost=0.56..268604.07 rows=831125 width=8) |
|                                                         Index Cond: ((import_num = '4520440'::numeric) AND (login IS NOT NULL))                           |
|                                       ->  Subquery Scan on "*SELECT* 2"  (cost=0.56..2.04 rows=1 width=12)                                                |
|                                             ->  Unique  (cost=0.56..2.03 rows=1 width=8)                                                                  |
|                                                   ->  Index Only Scan using ui_nkey_test_tab on test_tab test_tab_2  (cost=0.56..2.03 rows=1 width=8)           |
|                                                         Index Cond: ((import_num = '0'::numeric) AND (login IS NOT NULL))                                 |
|         ->  Index Only Scan using ui_nkey_test_tab on test_tab  (cost=0.56..2.58 rows=1 width=8)                                                              |
|               Index Cond: ((import_num = '4520440'::numeric) AND (login = ("ANY_subquery".login)::text))                                                  |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
(19 rows)
SELECT count(*) FROM test_tab WHERE import_num = '4520460' and login IN (SELECT DISTINCT login FROM test_tab WHERE import_num = '4520460' AND login IS NOT NULL EXCEPT SELECT DISTINCT login FROM test_tab WHERE import_num = '0' AND login IS NOT NULL); The SQL was never completing and had the below SQL execution plan --
+-------------------------------------------------------------------------------------------------------------------------------------------+
|                                                                QUERY PLAN                                                                 |
+-------------------------------------------------------------------------------------------------------------------------------------------+
| Aggregate  (cost=6.14..6.15 rows=1 width=8)                                                                                               |
|   ->  Nested Loop Semi Join  (cost=1.12..6.13 rows=1 width=0)                                                                             |
|         Join Filter: ((test_tab.login)::text = ("ANY_subquery".login)::text)                                                                |
|         ->  Index Only Scan using ui_nkey_test_tab on test_tab  (cost=0.56..2.02 rows=1 width=8)                                              |
|               Index Cond: (import_num = '4520460'::numeric)                                                                               |
|         ->  Materialize  (cost=0.56..4.10 rows=1 width=96)                                                                                |
|               ->  Subquery Scan on "ANY_subquery"  (cost=0.56..4.09 rows=1 width=96)                                                      |
|                     ->  HashSetOp Except  (cost=0.56..4.08 rows=1 width=100)                                                              |
|                           ->  Append  (cost=0.56..4.08 rows=2 width=100)                                                                  |
|                                 ->  Subquery Scan on "*SELECT* 1"  (cost=0.56..2.04 rows=1 width=12)                                      |
|                                       ->  Unique  (cost=0.56..2.03 rows=1 width=8)                                                        |
|                                             ->  Index Only Scan using ui_nkey_test_tab on test_tab test_tab_1  (cost=0.56..2.03 rows=1 width=8) |
|                                                   Index Cond: ((import_num = '4520460'::numeric) AND (login IS NOT NULL))                 |
|                                 ->  Subquery Scan on "*SELECT* 2"  (cost=0.56..2.04 rows=1 width=12)                                      |
|                                       ->  Unique  (cost=0.56..2.03 rows=1 width=8)                                                        |
|                                             ->  Index Only Scan using ui_nkey_test_tab on test_tab test_tab_2  (cost=0.56..2.03 rows=1 width=8) |
|                                                   Index Cond: ((import_num = '0'::numeric) AND (login IS NOT NULL))                       |
+-------------------------------------------------------------------------------------------------------------------------------------------+
(17 rows)
############################################# # After I set enable_material to off; ############################################# SELECT count(*) FROM test_tab WHERE import_num = '4520460' and login IN (SELECT DISTINCT login FROM test_tab WHERE import_num = '4520460' AND login IS NOT NULL EXCEPT SELECT DISTINCT login FROM test_tab WHERE import_num = '0' AND login IS NOT NULL); +--------+ | count | +--------+ | 762599 | +--------+ (1 row) Time: 2116.889 ms
+-------------------------------------------------------------------------------------------------------------------------------------------+
|                                                                QUERY PLAN                                                                 |
+-------------------------------------------------------------------------------------------------------------------------------------------+
| Aggregate  (cost=6.13..6.14 rows=1 width=8)                                                                                               |
|   ->  Hash Semi Join  (cost=4.67..6.13 rows=1 width=0)                                                                                    |
|         Hash Cond: ((test_tab.login)::text = ("ANY_subquery".login)::text)                                                                  |
|         ->  Index Only Scan using ui_nkey_test_tab on test_tab  (cost=0.56..2.02 rows=1 width=8)                                              |
|               Index Cond: (import_num = '4520460'::numeric)                                                                               |
|         ->  Hash  (cost=4.09..4.09 rows=1 width=96)                                                                                       |
|               ->  Subquery Scan on "ANY_subquery"  (cost=0.56..4.09 rows=1 width=96)                                                      |
|                     ->  HashSetOp Except  (cost=0.56..4.08 rows=1 width=100)                                                              |
|                           ->  Append  (cost=0.56..4.08 rows=2 width=100)                                                                  |
|                                 ->  Subquery Scan on "*SELECT* 1"  (cost=0.56..2.04 rows=1 width=12)                                      |
|                                       ->  Unique  (cost=0.56..2.03 rows=1 width=8)                                                        |
|                                             ->  Index Only Scan using ui_nkey_test_tab on test_tab test_tab_1  (cost=0.56..2.03 rows=1 width=8) |
|                                                   Index Cond: ((import_num = '4520460'::numeric) AND (login IS NOT NULL))                 |
|                                 ->  Subquery Scan on "*SELECT* 2"  (cost=0.56..2.04 rows=1 width=12)                                      |
|                                       ->  Unique  (cost=0.56..2.03 rows=1 width=8)                                                        |
|                                             ->  Index Only Scan using ui_nkey_test_tab on test_tab test_tab_2  (cost=0.56..2.03 rows=1 width=8) |
|                                                   Index Cond: ((import_num = '0'::numeric) AND (login IS NOT NULL))                       |
+-------------------------------------------------------------------------------------------------------------------------------------------+
(17 rows)
Looking at the row count for import_num select import_num, count(*) from test_tab group by import_num order by 2; +------------+--------+ | import_num | count | +------------+--------+ | 4520440 | 746982 | | 4520460 | 762599 | +------------+--------+ (37 rows) With different value of import_num we are having different execution plan. Is there a way to force the same Hash semi Join plan to sql with import_num 4520440, currently doing nested loop. I tried /*+HashJoin(a1 ANY_subquery)*/ but the sql execution plan doesn't change. SELECT /*+HashJoin(a1 ANY_subquery)*/ count(*) FROM test_tab a1 WHERE import_num = '4520440' and login IN (SELECT DISTINCT login FROM test_tab a2 WHERE import_num = '4520440' AND login IS NOT NULL EXCEPT SELECT DISTINCT login FROM test_tab a3 WHERE import_num = '0' AND login IS NOT NULL); Regards, Anand

View this message in context: Performance Issue -- "Materialize"
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

Re: [PERFORM] Performance Issue -- "Materialize"

From
anand086
Date:
Any thoughts on this?



--
View this message in context: http://www.postgresql-archive.org/Performance-Issue-Materialize-tp5979128p5979481.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


Re: [PERFORM] Performance Issue -- "Materialize"

From
Justin Pryzby
Date:
On Sat, Aug 19, 2017 at 10:37:56AM -0700, anand086 wrote:
>
+-----------------------------------------------------------------------------------------------------------------------------------------------------------+|
                                                                      
> QUERY PLAN
>
|+-----------------------------------------------------------------------------------------------------------------------------------------------------------+|
> Aggregate  (cost=351405.08..351405.09 rows=1 width=8)
                                   

Would you send explain ANALYZE and not just explain ?

Justin


Re: [PERFORM] Performance Issue -- "Materialize"

From
Carlos Augusto Machado
Date:

Do you have an index on login column ?

If not, try creating an index and taking off those DISTICTs.

Em seg, 21 de ago de 2017 às 15:33, Justin Pryzby <pryzby@telsasoft.com> escreveu:
On Sat, Aug 19, 2017 at 10:37:56AM -0700, anand086 wrote:
> +-----------------------------------------------------------------------------------------------------------------------------------------------------------+|
> QUERY PLAN
> |+-----------------------------------------------------------------------------------------------------------------------------------------------------------+|
> Aggregate  (cost=351405.08..351405.09 rows=1 width=8)

Would you send explain ANALYZE and not just explain ?

Justin


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Performance Issue -- "Materialize"

From
Carlos Augusto Machado
Date:

I think you query is a bit confusing and have many subqueries, so I tried to simplify

If you cant´t have more import_num = 0 to the same login, try this

SELECT count(*)
FROM test_tab tab1
LEFT JOIN test_tab tab2
    ON tab1.login = tab2.login AND tab2.import_num = '0'
WHERE 
   tab2.login IS NULL AND
   import_num = '4520440'

otherwise try this

SELECT count(*)
FROM test_tab tab1
LEFT JOIN (
   SELECT DISTINCT login FROM test_tab WHERE import_num = '0'
) tab2
    ON tab1.login = tab2.login
WHERE 
   tab2.login IS NULL AND
   import_num = '4520440'


Em seg, 21 de ago de 2017 às 15:47, Carlos Augusto Machado <caugustom@gmail.com> escreveu:

Do you have an index on login column ?

If not, try creating an index and taking off those DISTICTs.

Em seg, 21 de ago de 2017 às 15:33, Justin Pryzby <pryzby@telsasoft.com> escreveu:
On Sat, Aug 19, 2017 at 10:37:56AM -0700, anand086 wrote:
> +-----------------------------------------------------------------------------------------------------------------------------------------------------------+|
> QUERY PLAN
> |+-----------------------------------------------------------------------------------------------------------------------------------------------------------+|
> Aggregate  (cost=351405.08..351405.09 rows=1 width=8)

Would you send explain ANALYZE and not just explain ?

Justin


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Performance Issue -- "Materialize"

From
Jeff Janes
Date:
On Sat, Aug 19, 2017 at 10:37 AM, anand086 <anand086@gmail.com> wrote:

Your email is very hard to read, the formatting and line wrapping is heavily mangled.  You might want to attach the plans as files attachments instead of or in addition to putting the in the body.

 
 -> Index Only Scan using ui_nkey_test_tab on test_tab test_tab_1 (cost=0.56..2.03 rows=1 width=8) |
         Index Cond: ((import_num = '4520460'::numeric) AND (login IS NOT NULL)) 

It looks like the statistics for your table are desperately out of date, as a later query showed there are 762599 rows (unless login is null for all of them) but the above is estimating there is only one.   When was the table last analyzed?  

Cheers,

Jeff

On Sat, Aug 19, 2017 at 10:37 AM, anand086 <anand086@gmail.com> wrote:
I am a Postgres Newbie and trying to learn :) We have a scenario wherein, one of the SQL with different input value for import_num showing different execution plan. As an example, with import_num = '4520440' the execution plan shows Nested Loop and is taking ~12secs. With import_num = '4520460' execution plan showed using "Materialize" and never completed. After I set enable_material to off, the execution plan is changed using Hash Semi Join and completes in less than 3 secs. SELECT count(*) FROM test_tab WHERE login IN (SELECT DISTINCT login FROM test_tab WHERE import_num = '4520440' AND login IS NOT NULL EXCEPT SELECT DISTINCT login FROM test_tab WHERE import_num = '0' AND login IS NOT NULL) AND import_num = '4520440'; +--------+ | count | +--------+ | 746982 | +--------+ (1 row) Time: 12054.274 ms
+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
|                                                                        QUERY PLAN                                                                         |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
| Aggregate  (cost=351405.08..351405.09 rows=1 width=8)                                                                                                     |
|   ->  Nested Loop  (cost=349846.23..350366.17 rows=415562 width=0)                                                                                        |
|         ->  HashAggregate  (cost=349845.67..349847.67 rows=200 width=96)                                                                                  |
|               Group Key: ("ANY_subquery".login)::text                                                                                                     |
|               ->  Subquery Scan on "ANY_subquery"  (cost=340828.23..348557.47 rows=515282 width=96)                                                       |
|                     ->  SetOp Except  (cost=340828.23..343404.65 rows=515282 width=100)                                                                   |
|                           ->  Sort  (cost=340828.23..342116.44 rows=515283 width=100)                                                                     |
|                                 Sort Key: "*SELECT* 1".login                                                                                              |
|                                 ->  Append  (cost=0.56..275836.74 rows=515283 width=100)                                                                  |
|                                       ->  Subquery Scan on "*SELECT* 1"  (cost=0.56..275834.70 rows=515282 width=12)                                      |
|                                             ->  Unique  (cost=0.56..270681.88 rows=515282 width=8)                                                        |
|                                                   ->  Index Only Scan using ui_nkey_test_tab on test_tab test_tab_1  (cost=0.56..268604.07 rows=831125 width=8) |
|                                                         Index Cond: ((import_num = '4520440'::numeric) AND (login IS NOT NULL))                           |
|                                       ->  Subquery Scan on "*SELECT* 2"  (cost=0.56..2.04 rows=1 width=12)                                                |
|                                             ->  Unique  (cost=0.56..2.03 rows=1 width=8)                                                                  |
|                                                   ->  Index Only Scan using ui_nkey_test_tab on test_tab test_tab_2  (cost=0.56..2.03 rows=1 width=8)           |
|                                                         Index Cond: ((import_num = '0'::numeric) AND (login IS NOT NULL))                                 |
|         ->  Index Only Scan using ui_nkey_test_tab on test_tab  (cost=0.56..2.58 rows=1 width=8)                                                              |
|               Index Cond: ((import_num = '4520440'::numeric) AND (login = ("ANY_subquery".login)::text))                                                  |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
(19 rows)
SELECT count(*) FROM test_tab WHERE import_num = '4520460' and login IN (SELECT DISTINCT login FROM test_tab WHERE import_num = '4520460' AND login IS NOT NULL EXCEPT SELECT DISTINCT login FROM test_tab WHERE import_num = '0' AND login IS NOT NULL); The SQL was never completing and had the below SQL execution plan --
+-------------------------------------------------------------------------------------------------------------------------------------------+
|                                                                QUERY PLAN                                                                 |
+-------------------------------------------------------------------------------------------------------------------------------------------+
| Aggregate  (cost=6.14..6.15 rows=1 width=8)                                                                                               |
|   ->  Nested Loop Semi Join  (cost=1.12..6.13 rows=1 width=0)                                                                             |
|         Join Filter: ((test_tab.login)::text = ("ANY_subquery".login)::text)                                                                |
|         ->  Index Only Scan using ui_nkey_test_tab on test_tab  (cost=0.56..2.02 rows=1 width=8)                                              |
|               Index Cond: (import_num = '4520460'::numeric)                                                                               |
|         ->  Materialize  (cost=0.56..4.10 rows=1 width=96)                                                                                |
|               ->  Subquery Scan on "ANY_subquery"  (cost=0.56..4.09 rows=1 width=96)                                                      |
|                     ->  HashSetOp Except  (cost=0.56..4.08 rows=1 width=100)                                                              |
|                           ->  Append  (cost=0.56..4.08 rows=2 width=100)                                                                  |
|                                 ->  Subquery Scan on "*SELECT* 1"  (cost=0.56..2.04 rows=1 width=12)                                      |
|                                       ->  Unique  (cost=0.56..2.03 rows=1 width=8)                                                        |
|                                             ->  Index Only Scan using ui_nkey_test_tab on test_tab test_tab_1  (cost=0.56..2.03 rows=1 width=8) |
|                                                   Index Cond: ((import_num = '4520460'::numeric) AND (login IS NOT NULL))                 |
|                                 ->  Subquery Scan on "*SELECT* 2"  (cost=0.56..2.04 rows=1 width=12)                                      |
|                                       ->  Unique  (cost=0.56..2.03 rows=1 width=8)                                                        |
|                                             ->  Index Only Scan using ui_nkey_test_tab on test_tab test_tab_2  (cost=0.56..2.03 rows=1 width=8) |
|                                                   Index Cond: ((import_num = '0'::numeric) AND (login IS NOT NULL))                       |
+-------------------------------------------------------------------------------------------------------------------------------------------+
(17 rows)
############################################# # After I set enable_material to off; ############################################# SELECT count(*) FROM test_tab WHERE import_num = '4520460' and login IN (SELECT DISTINCT login FROM test_tab WHERE import_num = '4520460' AND login IS NOT NULL EXCEPT SELECT DISTINCT login FROM test_tab WHERE import_num = '0' AND login IS NOT NULL); +--------+ | count | +--------+ | 762599 | +--------+ (1 row) Time: 2116.889 ms
+-------------------------------------------------------------------------------------------------------------------------------------------+
|                                                                QUERY PLAN                                                                 |
+-------------------------------------------------------------------------------------------------------------------------------------------+
| Aggregate  (cost=6.13..6.14 rows=1 width=8)                                                                                               |
|   ->  Hash Semi Join  (cost=4.67..6.13 rows=1 width=0)                                                                                    |
|         Hash Cond: ((test_tab.login)::text = ("ANY_subquery".login)::text)                                                                  |
|         ->  Index Only Scan using ui_nkey_test_tab on test_tab  (cost=0.56..2.02 rows=1 width=8)                                              |
|               Index Cond: (import_num = '4520460'::numeric)                                                                               |
|         ->  Hash  (cost=4.09..4.09 rows=1 width=96)                                                                                       |
|               ->  Subquery Scan on "ANY_subquery"  (cost=0.56..4.09 rows=1 width=96)                                                      |
|                     ->  HashSetOp Except  (cost=0.56..4.08 rows=1 width=100)                                                              |
|                           ->  Append  (cost=0.56..4.08 rows=2 width=100)                                                                  |
|                                 ->  Subquery Scan on "*SELECT* 1"  (cost=0.56..2.04 rows=1 width=12)                                      |
|                                       ->  Unique  (cost=0.56..2.03 rows=1 width=8)                                                        |
|                                             ->  Index Only Scan using ui_nkey_test_tab on test_tab test_tab_1  (cost=0.56..2.03 rows=1 width=8) |
|                                                   Index Cond: ((import_num = '4520460'::numeric) AND (login IS NOT NULL))                 |
|                                 ->  Subquery Scan on "*SELECT* 2"  (cost=0.56..2.04 rows=1 width=12)                                      |
|                                       ->  Unique  (cost=0.56..2.03 rows=1 width=8)                                                        |
|                                             ->  Index Only Scan using ui_nkey_test_tab on test_tab test_tab_2  (cost=0.56..2.03 rows=1 width=8) |
|                                                   Index Cond: ((import_num = '0'::numeric) AND (login IS NOT NULL))                       |
+-------------------------------------------------------------------------------------------------------------------------------------------+
(17 rows)
Looking at the row count for import_num select import_num, count(*) from test_tab group by import_num order by 2; +------------+--------+ | import_num | count | +------------+--------+ | 4520440 | 746982 | | 4520460 | 762599 | +------------+--------+ (37 rows) With different value of import_num we are having different execution plan. Is there a way to force the same Hash semi Join plan to sql with import_num 4520440, currently doing nested loop. I tried /*+HashJoin(a1 ANY_subquery)*/ but the sql execution plan doesn't change. SELECT /*+HashJoin(a1 ANY_subquery)*/ count(*) FROM test_tab a1 WHERE import_num = '4520440' and login IN (SELECT DISTINCT login FROM test_tab a2 WHERE import_num = '4520440' AND login IS NOT NULL EXCEPT SELECT DISTINCT login FROM test_tab a3 WHERE import_num = '0' AND login IS NOT NULL); Regards, Anand

View this message in context: Performance Issue -- "Materialize"
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.