Re: Bug in huge simplehash - Mailing list pgsql-hackers

From David Rowley
Subject Re: Bug in huge simplehash
Date
Msg-id CAApHDvrbpd_bSHiSh_Rzvo5EEn16ij5s80sJa7uK31m=rfxWTw@mail.gmail.com
In response to Bug in huge simplehash  (Yura Sokolov <y.sokolov@postgrespro.ru>)
List pgsql-hackers
On Tue, 10 Aug 2021 at 20:53, Yura Sokolov <y.sokolov@postgrespro.ru> wrote:
> EXPLAIN shows that there are 2604186278 rows in all partitions, but planner
> thinks there will be only 200 unique rows after group by. Looks like we was
> mistaken.

This looks unrelated to the bug.  The estimate of 200 matches
DEFAULT_NUM_DISTINCT, so it looks like the planner fell back to that default.

>          /* now set size */
>          tb->size = size;
>
>          if (tb->size == SH_MAX_SIZE)
>                  tb->sizemask = 0;
>          else
>                  tb->sizemask = tb->size - 1;

Ouch.  That's not great.

>          /* now set size */
>          tb->size = size;
>          tb->sizemask = (uint32)(size - 1);

That fix seems fine.
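
To make the difference between the two quoted snippets concrete, here is a minimal standalone sketch. It assumes the maximum table size is 2^32 buckets and that the initial bucket is computed as hash & sizemask; the demo_* names and DEMO_MAX_SIZE are hypothetical and are not taken from simplehash.h.

```c
#include <stdint.h>

/* Assumption: the largest table has 2^32 buckets, which needs 64 bits to store. */
#define DEMO_MAX_SIZE (((uint64_t) UINT32_MAX) + 1)

/* Old logic, as in the first quoted snippet: the mask collapses to 0 at max size. */
static uint32_t
demo_sizemask_old(uint64_t size)
{
    if (size == DEMO_MAX_SIZE)
        return 0;
    return (uint32_t) (size - 1);
}

/* Proposed logic: truncating (size - 1) gives an all-ones mask at 2^32. */
static uint32_t
demo_sizemask_new(uint64_t size)
{
    return (uint32_t) (size - 1);
}

/* Initial bucket for a hash value in a power-of-two sized table. */
static uint32_t
demo_bucket(uint32_t hash, uint32_t sizemask)
{
    return hash & sizemask;
}
```

With the old mask, demo_bucket() returns 0 for every hash once the table is at the maximum size, so all entries start probing from the same bucket. With the new mask the full 32-bit hash is used. Below the maximum size the two functions agree.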

> I went to check SH_GROW and.... It is `SH_GROW(SH_TYPE *tb, uint32 newsize)`
> :-(((
> Therefore when `tb->size == SH_MAX_SIZE/2` and we call
> `SH_GROW(tb, tb->size * 2)`, then SH_GROW(tb, 0) is called due to truncation.
> And SH_COMPUTE_PARAMETERS is also accepts `uint32 newsize`.

Yeah. Agreed. I don't see anything wrong with your fix for that.
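
For anyone following along, the truncation can be reproduced in isolation with a small sketch. The demo_* functions below are hypothetical stand-ins that only echo back the size they receive; they are not the real SH_GROW, and DEMO_MAX_SIZE assumes a 2^32 bucket limit.

```c
#include <stdint.h>

/* Assumption: the largest table has 2^32 buckets. */
#define DEMO_MAX_SIZE (((uint64_t) UINT32_MAX) + 1)

/* Stand-in for a grow function declared with a 32-bit size parameter. */
static uint64_t
demo_grow_u32(uint32_t newsize)
{
    return newsize;
}

/* Stand-in for the same function with the parameter widened to 64 bits. */
static uint64_t
demo_grow_u64(uint64_t newsize)
{
    return newsize;
}

/* Doubles a 64-bit size; the argument is implicitly narrowed to uint32. */
static uint64_t
demo_double_narrow(uint64_t size)
{
    return demo_grow_u32(size * 2);
}

/* Doubles a 64-bit size with no narrowing. */
static uint64_t
demo_double_wide(uint64_t size)
{
    return demo_grow_u64(size * 2);
}
```

Calling demo_double_narrow(DEMO_MAX_SIZE / 2) yields 0, because 2^32 does not fit in a uint32 and the conversion wraps modulo 2^32. demo_double_wide() returns 2^32 as intended, which is why widening the parameter type is the natural fix.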

I'm surprised nobody has hit this before. I guess having that many
groups is not common.

Annoyingly this just missed the window for being fixed in the minor
releases going out soon. We'll need to wait a few days before
patching.

David


