neqjoinsel versus "refresh materialized view concurrently" - Mailing list pgsql-hackers

From Jeff Janes
Subject neqjoinsel versus "refresh materialized view concurrently"
Date
Msg-id CAMkU=1z-JoGymHneGHar1cru4F1XDfHqJDzxP_CtK5cL3DOfmg@mail.gmail.com
Whole thread Raw
Responses Re: neqjoinsel versus "refresh materialized view concurrently"  (Thomas Munro <thomas.munro@enterprisedb.com>)
List pgsql-hackers
The following commit has caused a devastating performance regression
in concurrent refresh of MV:

commit 7ca25b7de6aefa5537e0dbe56541bc41c0464f97
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Wed Nov 29 22:00:29 2017 -0500

    Fix neqjoinsel's behavior for semi/anti join cases.


The below reproduction goes from taking about 1 second to refresh, to taking an amount of time I don't have the patience to measure.  

drop table foobar2 cascade;
create table foobar2 as select * from generate_series(1,200000);
create materialized view foobar3 as select * from foobar2;
create unique index on foobar3 (generate_series );
analyze foobar3;
refresh materialized view CONCURRENTLY foobar3 ;


When I interrupt the refresh, I get a message including this line:

CONTEXT:  SQL statement "SELECT newdata FROM pg_temp_3.pg_temp_16420 newdata WHERE newdata IS NOT NULL AND EXISTS (SELECT * FROM pg_temp_3.pg_temp_16420 newdata2 WHERE newdata2 IS NOT NULL AND newdata2 OPERATOR(pg_catalog.*=) newdata AND newdata2.ctid OPERATOR(pg_catalog.<>) newdata.ctid) LIMIT 1"

So I makes sense that the commit in question could have caused a change in the execution plan.  Because these are temp tables, I can't easily get my hands on them to investigate further.

Cheers,

Jeff

pgsql-hackers by date:

Previous
From: Jeevan Ladhe
Date:
Subject: Re: Fix error in ECPG while connection handling
Next
From: Tom Lane
Date:
Subject: Re: JIT compiling with LLVM v11