Re: neqjoinsel versus "refresh materialized view concurrently" - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: neqjoinsel versus "refresh materialized view concurrently"
Date
Msg-id CAEepm=1kdvEq-06_w2+VyfxOxy3sbRCq5Rw_MsjEaTjaXTSZOg@mail.gmail.com
Whole thread Raw
In response to neqjoinsel versus "refresh materialized view concurrently"  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: neqjoinsel versus "refresh materialized view concurrently"  (Thomas Munro <thomas.munro@enterprisedb.com>)
Re: neqjoinsel versus "refresh materialized view concurrently"  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Wed, Mar 14, 2018 at 8:07 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
> The following commit has caused a devastating performance regression
> in concurrent refresh of MV:
>
> commit 7ca25b7de6aefa5537e0dbe56541bc41c0464f97
> Author: Tom Lane <tgl@sss.pgh.pa.us>
> Date:   Wed Nov 29 22:00:29 2017 -0500
>
>     Fix neqjoinsel's behavior for semi/anti join cases.
>
>
> The below reproduction goes from taking about 1 second to refresh, to taking
> an amount of time I don't have the patience to measure.
>
> drop table foobar2 cascade;
> create table foobar2 as select * from generate_series(1,200000);
> create materialized view foobar3 as select * from foobar2;
> create unique index on foobar3 (generate_series );
> analyze foobar3;
> refresh materialized view CONCURRENTLY foobar3 ;
>
>
> When I interrupt the refresh, I get a message including this line:
>
> CONTEXT:  SQL statement "SELECT newdata FROM pg_temp_3.pg_temp_16420 newdata
> WHERE newdata IS NOT NULL AND EXISTS (SELECT * FROM pg_temp_3.pg_temp_16420
> newdata2 WHERE newdata2 IS NOT NULL AND newdata2 OPERATOR(pg_catalog.*=)
> newdata AND newdata2.ctid OPERATOR(pg_catalog.<>) newdata.ctid) LIMIT 1"
>
> So I makes sense that the commit in question could have caused a change in
> the execution plan.  Because these are temp tables, I can't easily get my
> hands on them to investigate further.

Ouch.  A quadratic join.  This looks like an invisible correlation problem.

load 'auto_explain';
set auto_explain.log_min_duration = 0;
set auto_explain.log_analyze = true;

drop table if exists t cascade;

create table t as select generate_series(1, 5000);
create materialized view mv as select * from t;
create unique index on mv(generate_series);
analyze mv;
refresh materialized view concurrently mv;

HEAD:

LOG:  duration: 26101.612 ms  plan:
Query Text: SELECT newdata FROM pg_temp_3.pg_temp_16452 newdata WHERE
newdata IS NOT NULL AND EXISTS (SELECT * FROM pg_temp_3.pg_temp_16452
newdata2 WHERE newdata2 IS NOT NULL AND newdata2
OPERATOR(pg_catalog.*=) newdata AND newdata2.ctid
OPERATOR(pg_catalog.<>) newdata.ctid) LIMIT 1
Limit  (cost=0.00..90.52 rows=1 width=28) (actual
time=26101.608..26101.608 rows=0 loops=1)
 ->  Nested Loop Semi Join  (cost=0.00..225220.96 rows=2488 width=28)
(actual time=26101.606..26101.606 rows=0 loops=1)
       Join Filter: ((newdata2.ctid <> newdata.ctid) AND (newdata.* *=
newdata2.*))
       Rows Removed by Join Filter: 25000000
       ->  Seq Scan on pg_temp_16452 newdata  (cost=0.00..73.00
rows=4975 width=34) (actual time=0.022..15.448 rows=5000 loops=1)
             Filter: (newdata.* IS NOT NULL)
       ->  Materialize  (cost=0.00..97.88 rows=4975 width=34) (actual
time=0.000..0.500 rows=5000 loops=5000)
             ->  Seq Scan on pg_temp_16452 newdata2  (cost=0.00..73.00
rows=4975 width=34) (actual time=0.010..4.033 rows=5000 loops=1)
                   Filter: (newdata2.* IS NOT NULL)

And with commit 7ca25b7de6aefa5537e0dbe56541bc41c0464f97 reverted:

LOG:  duration: 36.358 ms  plan:
Query Text: SELECT newdata FROM pg_temp_3.pg_temp_16470 newdata WHERE
newdata IS NOT NULL AND EXISTS (SELECT * FROM pg_temp_3.pg_temp_16470
newdata2 WHERE newdata2 IS NOT NULL AND newdata2
OPERATOR(pg_catalog.*=) newdata AND newdata2.ctid
OPERATOR(pg_catalog.<>) newdata.ctid) LIMIT 1
Limit  (cost=756.95..939.50 rows=1 width=28) (actual
time=36.354..36.354 rows=0 loops=1)
 ->  Merge Semi Join  (cost=756.95..2947.51 rows=12 width=28) (actual
time=36.352..36.352 rows=0 loops=1)
       Merge Cond: (newdata.* *= newdata2.*)
       Join Filter: (newdata2.ctid <> newdata.ctid)
       Rows Removed by Join Filter: 5000
       ->  Sort  (cost=378.48..390.91 rows=4975 width=34) (actual
time=9.622..10.300 rows=5000 loops=1)
             Sort Key: newdata.* USING *<
             Sort Method: quicksort  Memory: 622kB
             ->  Seq Scan on pg_temp_16470 newdata  (cost=0.00..73.00
rows=4975 width=34) (actual time=0.021..4.986 rows=5000 loops=1)
                   Filter: (newdata.* IS NOT NULL)
       ->  Sort  (cost=378.48..390.91 rows=4975 width=34) (actual
time=7.378..8.010 rows=5000 loops=1)
             Sort Key: newdata2.* USING *<
             Sort Method: quicksort  Memory: 622kB
             ->  Seq Scan on pg_temp_16470 newdata2  (cost=0.00..73.00
rows=4975 width=34) (actual time=0.017..3.034 rows=5000 loops=1)
                   Filter: (newdata2.* IS NOT NULL)

-- 
Thomas Munro
http://www.enterprisedb.com


pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: JIT compiling with LLVM v11
Next
From: Euler Taveira
Date:
Subject: Re: [submit code] I develop a tool for pgsql, how can I submit it