Re: weird hash plan cost, starting with pg10 - Mailing list pgsql-hackers

From Richard Guo
Subject Re: weird hash plan cost, starting with pg10
Date
Msg-id CAMbWs4-oR+V1AtR7j8Nv85-n1eEng6Z4NmHvcJLEpsb69x5JMw@mail.gmail.com
Whole thread Raw
In response to Re: weird hash plan cost, starting with pg10  (Richard Guo <guofenglinux@gmail.com>)
Responses Re: weird hash plan cost, starting with pg10  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
List pgsql-hackers

On Tue, Mar 24, 2020 at 3:36 PM Richard Guo <guofenglinux@gmail.com> wrote:
On Tue, Mar 24, 2020 at 11:05 AM Thomas Munro <thomas.munro@gmail.com> wrote:

I think there might be a case like this:

* ExecRescanHashJoin() decides it can't reuse the hash table for a
rescan, so it calls ExecHashTableDestroy(), clears HashJoinState's
hj_HashTable and sets hj_JoinState to HJ_BUILD_HASHTABLE
* the HashState node still has a reference to the pfree'd HashJoinTable!
* HJ_BUILD_HASHTABLE case reaches the empty-outer optimisation case so
it doesn't bother to build a new hash table
* EXPLAIN examines the HashState's pointer to a freed HashJoinTable struct

Yes, debugging with gdb shows this is exactly what happens.

According to the scenario above, here is a recipe that reproduces this
issue.

-- recipe start
create table a(i int, j int);
create table b(i int, j int);
create table c(i int, j int);

insert into a select 3,3;
insert into a select 2,2;
insert into a select 1,1;

insert into b select 3,3;

insert into c select 0,0;

analyze a;
analyze b;
analyze c;

set enable_nestloop to off;
set enable_mergejoin to off;

explain analyze
select exists(select * from b join c on a.i > c.i and a.i = b.i and b.j = c.j) from a;
-- recipe end

I tried this recipe on different PostgreSQL versions, starting from
current master and going backwards. I was able to reproduce this issue
on all versions above 8.4. In 8.4 version, we do not output information
on hash buckets/batches. But manual inspection with gdb shows in 8.4 we
also have the dangling pointer for HashState->hashtable. I didn't check
versions below 8.4 though.

Thanks
Richard

pgsql-hackers by date:

Previous
From: John Naylor
Date:
Subject: Re: truncating timestamps on arbitrary intervals
Next
From: Sergei Kornilov
Date:
Subject: Re: replay pause vs. standby promotion