Hi,
On 2021-07-08 20:53:32 -0700, Andres Freund wrote:
> On 2021-07-07 20:46:38 +0900, Masahiko Sawada wrote:
> > 1. Don't allocate more than 1GB. There was a discussion to eliminate
> > this limitation by using MemoryContextAllocHuge() but there were
> > concerns about point 2[1].
> >
> > 2. Allocate the whole memory space at once.
> >
> > 3. Slow lookup performance (O(logN)).
> >
> > I’ve done some experiments in this area and would like to share the
> > results and discuss ideas.
>
> Yea, this is a serious issue.
>
>
> 3) could possibly be addressed to a decent degree without changing the
> fundamental datastructure too much. There's some sizable and trivial
> wins by just changing vac_cmp_itemptr() to compare int64s and by using
> an open coded bsearch().
Just using itemptr_encode() makes the array case in test #1 go from 8s to
6.5s on my machine.
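
For illustration, a minimal sketch of the two ideas above: packing a TID into
a single 64-bit integer so that one integer comparison replaces the
field-by-field comparison, and an open-coded binary search in place of
bsearch() with a comparator callback. DemoTid, demo_tid_encode() and
demo_tid_lookup() are hypothetical stand-ins so the example compiles outside
the PostgreSQL tree; they are not the code that was benchmarked.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ItemPointerData: block number plus offset. */
typedef struct DemoTid
{
	uint32_t	blkno;
	uint16_t	offnum;
} DemoTid;

/*
 * Pack a TID into one integer.  The block number occupies the high bits,
 * so integer order matches (blkno, offnum) order.
 */
static inline uint64_t
demo_tid_encode(DemoTid tid)
{
	return ((uint64_t) tid.blkno << 16) | tid.offnum;
}

/*
 * Open-coded binary search over a sorted array of encoded TIDs.
 * Returns true if tid is present.
 */
static bool
demo_tid_lookup(const uint64_t *sorted, size_t n, DemoTid tid)
{
	uint64_t	key = demo_tid_encode(tid);
	size_t		lo = 0;
	size_t		hi = n;			/* search the half-open range [lo, hi) */

	while (lo < hi)
	{
		size_t		mid = lo + (hi - lo) / 2;

		if (sorted[mid] < key)
			lo = mid + 1;
		else
			hi = mid;
	}

	/* lo is now the first position whose value is >= key, if any */
	return lo < n && sorted[lo] == key;
}
```

The array has to be built from encoded values in sorted order; the comparison
then inlines into the loop instead of going through a function pointer.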
Another thing I just noticed is that you didn't include the build times for the
datastructures. They are currently lower than the lookup times, but it does seem
like a relevant thing to measure as well. E.g. for #1 I see the following build
times:
array    24.943 ms
tbm     206.456 ms
intset   93.575 ms
vtbm    134.315 ms
rtbm    145.964 ms
that's a significant range...
Randomizing the lookup order (using a random shuffle in
generate_index_tuples()) changes the benchmark results for #1 significantly:
        shuffled time   unshuffled time
array     6551.726 ms       6478.554 ms
intset   67590.879 ms      10815.810 ms
rtbm     17992.487 ms       2518.492 ms
tbm        364.917 ms        360.128 ms
vtbm     12227.884 ms       1288.123 ms
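
A random shuffle of the lookup keys can be sketched as a Fisher-Yates pass
over the array. This is only an assumption about what such a change might look
like, not the actual modification to generate_index_tuples(); demo_rand() and
demo_shuffle() are hypothetical names, and the xorshift generator is used only
to keep runs reproducible.

```c
#include <stddef.h>
#include <stdint.h>

/* Small xorshift64 generator; the state must never be zero. */
static uint64_t
demo_rand(uint64_t *state)
{
	uint64_t	x = *state;

	x ^= x << 13;
	x ^= x >> 7;
	x ^= x << 17;
	*state = x;
	return x;
}

/*
 * Fisher-Yates shuffle: walk from the end of the array and swap each
 * element with a randomly chosen element at or before it.
 */
static void
demo_shuffle(uint64_t *items, size_t n, uint64_t seed)
{
	uint64_t	state = seed ? seed : 1;	/* avoid the all-zero state */

	for (size_t i = n; i > 1; i--)
	{
		/* 0 <= j < i; the slight modulo bias is ignored in this sketch */
		size_t		j = (size_t) (demo_rand(&state) % i);
		uint64_t	tmp = items[i - 1];

		items[i - 1] = items[j];
		items[j] = tmp;
	}
}
```

Shuffling only the probe order leaves the contents of each data structure
unchanged, so a difference between the two columns reflects access locality
rather than a different workload size.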
FWIW, I get an assertion failure when using an assertion build:
#2  0x0000561800ea02e0 in ExceptionalCondition (conditionName=0x7f9115a88e91 "found", errorType=0x7f9115a88d11 "FailedAssertion", fileName=0x7f9115a88e8a "rtbm.c", lineNumber=242)
    at /home/andres/src/postgresql/src/backend/utils/error/assert.c:69
#3  0x00007f9115a87645 in rtbm_add_tuples (rtbm=0x561806293280, blkno=0, offnums=0x7fffdccabb00, nitems=10)
    at rtbm.c:242
#4  0x00007f9115a8363d in load_rtbm (rtbm=0x561806293280, itemptrs=0x7f908a203050, nitems=10000000)
    at bdbench.c:618
#5  0x00007f9115a834b9 in rtbm_attach (lvtt=0x7f9115a8c300 <LVTestSubjects+352>, nitems=10000000, minblk=2139062143, maxblk=2139062143, maxoff=32639)
    at bdbench.c:587
#6  0x00007f9115a83837 in attach (lvtt=0x7f9115a8c300 <LVTestSubjects+352>, nitems=10000000, minblk=2139062143, maxblk=2139062143, maxoff=32639)
    at bdbench.c:658
#7  0x00007f9115a84190 in attach_dead_tuples (fcinfo=0x56180322d690)
    at bdbench.c:873
I assume you just inverted the Assert(found) assertion?
Greetings,
Andres Freund