Re: BufferAlloc: don't take two simultaneous locks - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: BufferAlloc: don't take two simultaneous locks
Date
Msg-id 20220314.093948.1426408606053906565.horikyota.ntt@gmail.com
Whole thread Raw
In response to Re: BufferAlloc: don't take two simultaneous locks  (Yura Sokolov <y.sokolov@postgrespro.ru>)
Responses Re: BufferAlloc: don't take two simultaneous locks  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
List pgsql-hackers
At Fri, 11 Mar 2022 11:30:27 +0300, Yura Sokolov <y.sokolov@postgrespro.ru> wrote in
> В Пт, 11/03/2022 в 15:30 +0900, Kyotaro Horiguchi пишет:
> > At Thu, 03 Mar 2022 01:35:57 +0300, Yura Sokolov <y.sokolov@postgrespro.ru> wrote in
> > > В Вт, 01/03/2022 в 10:24 +0300, Yura Sokolov пишет:
> > > > Ok, here is v4.
> > >
> > > And here is v5.
> > >
> > > First, there was compilation error in Assert in dynahash.c .
> > > Excuse me for not checking before sending previous version.
> > >
> > > Second, I add third commit that reduces HASHHDR allocation
> > > size for non-partitioned dynahash:
> > > - moved freeList to last position
> > > - alloc and memset offset(HASHHDR, freeList[1]) for
> > >   non-partitioned hash tables.
> > > I didn't benchmarked it, but I will be surprised if it
> > > matters much in performance sence.
> > >
> > > Third, I put all three commits into single file to not
> > > confuse commitfest application.
> >
> > Thanks!  I looked into dynahash part.
> >
> >  struct HASHHDR
> >  {
> > -       /*
> > -        * The freelist can become a point of contention in high-concurrency hash
> >
> > Why did you move around the freeList?
> >
> >
> > -       long            nentries;               /* number of entries in associated buckets */
> > +       long            nfree;                  /* number of free entries in the list */
> > +       long            nalloced;               /* number of entries initially allocated for
> >
> > Why do we need nfree?  HASH_ASSING should do the same thing with
> > HASH_REMOVE.  Maybe the reason is the code tries to put the detached
> > bucket to different free list, but we can just remember the
> > freelist_idx for the detached bucket as we do for hashp.  I think that
> > should largely reduce the footprint of this patch.
>
> If we keep nentries, then we need to fix nentries in both old
> "freeList" partition and new one. It is two freeList[partition]->mutex
> lock+unlock pairs.
>
> But count of free elements doesn't change, so if we change nentries
> to nfree, then no need to fix freeList[partition]->nfree counters,
> no need to lock+unlock.

Ah, okay. I missed that bucket reuse chages key in most cases.

But still I don't think its good to move entries around partition
freelists for another reason.  I'm afraid that the freelists get into
imbalanced state.  get_hash_entry prefers main shmem allocation than
other freelist so that could lead to freelist bloat, or worse
contension than the traditinal way involving more than two partitions.

I'll examine the possibility to resolve this...

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: On login trigger: take three
Next
From: Kyotaro Horiguchi
Date:
Subject: Re: BufferAlloc: don't take two simultaneous locks