On 8/20/21, 11:20 AM, "Robert Haas" <robertmhaas@gmail.com> wrote:
> On Fri, Aug 20, 2021 at 1:29 PM Bossart, Nathan <bossartn@amazon.com> wrote:
>> Thinking about this stuff further, I was wondering if one way to
>> handle the bounded shared hash table problem would be to replace the
>> latest boundary in the map whenever it was full. But at that point,
>> do we even need a hash table? This led me to revisit the two-element
>> approach that was discussed upthread. What if we only stored the
>> earliest and latest segment boundaries at any given time? Once the
>> earliest boundary is added, it never changes until the segment is
>> flushed and it is removed. The latest boundary, however, will be
>> updated any time we register another segment. Once the earliest
>> boundary is removed, we replace it with the latest boundary. This
>> strategy could cause us to miss intermediate boundaries, but AFAICT
>> the worst case scenario is that we hold off creating .ready files a
>> bit longer than necessary.
>
> I think this is a promising approach. We could also have a small
> fixed-size array, so that we only have to risk losing track of
> anything when we overflow the array. But I guess I'm still unconvinced
> that there's a real possibility of genuinely needing multiple
> elements. Suppose we are thinking of adding a second element to the
> array (or the hash table). I feel like it's got to be safe to just
> remove the first one. If not, then apparently the WAL record that
> caused us to make the first entry isn't totally flushed yet - which I
> still think is impossible.

I've attached a patch to demonstrate what I'm thinking.
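
As a rough illustration of the two-element idea described above, here is a
toy Python model. It is not the attached patch and not PostgreSQL code: the
names Boundary, BoundaryTracker, register and on_flush are made up for this
sketch, LSNs are plain integers, and boundaries are assumed to be registered
in increasing order.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Boundary:
    seg: int      # segment whose final record spills into the next segment
    end_lsn: int  # LSN at which that boundary-crossing record ends


class BoundaryTracker:
    """Hypothetical model: remember only the earliest and latest boundary."""

    def __init__(self) -> None:
        self.earliest: Optional[Boundary] = None
        self.latest: Optional[Boundary] = None

    def register(self, seg: int, end_lsn: int) -> None:
        b = Boundary(seg, end_lsn)
        if self.earliest is None:
            # The earliest boundary stays fixed until it is flushed.
            self.earliest = b
        else:
            # Overwrites the previous latest, so intermediate
            # boundaries are forgotten.
            self.latest = b

    def on_flush(self, flushed_lsn: int) -> Optional[int]:
        """Return the highest tracked segment that is safe to mark
        .ready after a flush up to flushed_lsn, or None if there is none."""
        if self.latest is not None and self.latest.end_lsn <= flushed_lsn:
            seg = self.latest.seg
            self.earliest = self.latest = None
            return seg
        if self.earliest is not None and self.earliest.end_lsn <= flushed_lsn:
            seg = self.earliest.seg
            # The latest boundary (if any) becomes the new earliest.
            self.earliest, self.latest = self.latest, None
            return seg
        return None


tracker = BoundaryTracker()
tracker.register(1, 100)
tracker.register(2, 200)   # becomes "latest"
tracker.register(3, 300)   # replaces segment 2 as "latest"
first = tracker.on_flush(250)   # only segment 1 is reported
second = tracker.on_flush(300)  # segment 3 is reported; 2 is covered by it
```

In this model the boundary for segment 2 is lost once segment 3 is
registered, so a flush to LSN 250 reports only segment 1 even though
segment 2's record is already on disk. Segment 2 is picked up later
together with segment 3, which matches the "hold off creating .ready files
a bit longer than necessary" worst case.
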
Nathan