On 11/11/20 8:52 PM, Tomas Vondra wrote:
> Hi,
>
> I took a look at this today, doing a bit of stress-testing, and I can
> get it to crash because of segfaults in pagetable_create (not sure if
> the issue is there, it might be just a symptom of an issue elsewhere).
>
> Attached is a shell script I use to run the stress test - it's using
> 'test' database, generates tables of different size and then runs
> queries with various parameter combinations. It takes a while to trigger
> the crash, so it might depend on timing or something like that.
>
> I've also attached two examples of backtraces. I've also seen infinite
> loop in pagetable_create, but the crashes are much more common.
>
Hi Dilip,
Do you plan to work on this for PG14? I haven't noticed any response in this thread dealing with the crashes I reported a while ago. Also, it doesn't seem to have been added to any of the commitfests.
Hi Tomas,
Thanks for testing this. We have actually noticed a significant performance drop in many cases due to tbm_merge, so off-list we have been discussing different approaches and testing their performance.

In the current approach, all the workers first prepare their own bitmap hash and then merge it into the common bitmap hash under a lock. Based on the off-list discussion with Robert, the next approach I am trying is to insert directly into the shared bitmap hash while scanning the index itself. So instead of preparing a separate bitmap, all the workers will insert directly into the shared bitmap hash.

I agree that getting each page from the bitmap hash requires acquiring the lock, and that this might also generate a lot of lock contention, but we want to try the POC and check the performance. In fact, I have already implemented the POC and the results aren't great, but I am still experimenting to see whether the locking can be made more granular than it is now. I will share my findings soon, along with the POC patch.
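
To make the two designs concrete, here is a toy sketch in Python. It is not the patch's code and does not use any PostgreSQL API: a plain dict stands in for the shared bitmap hash, threads stand in for parallel workers, and the function names (merge_approach, direct_insert_approach) are hypothetical. It only illustrates where the lock is taken in each design.

```python
import threading


def _run_workers(worker, index_chunks):
    """Start one thread per chunk of index entries and wait for all of them."""
    threads = [threading.Thread(target=worker, args=(chunk,)) for chunk in index_chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def merge_approach(index_chunks):
    """Each worker builds a private bitmap, then merges it under one lock."""
    shared = {}  # page number -> set of tuple offsets (stand-in for the shared hash)
    lock = threading.Lock()

    def worker(chunk):
        local = {}
        for page, offset in chunk:
            local.setdefault(page, set()).add(offset)
        # One lock acquisition per worker, but the whole merge runs inside it.
        with lock:
            for page, offsets in local.items():
                shared.setdefault(page, set()).update(offsets)

    _run_workers(worker, index_chunks)
    return shared


def direct_insert_approach(index_chunks):
    """Each worker inserts straight into the shared bitmap while scanning."""
    shared = {}
    lock = threading.Lock()

    def worker(chunk):
        for page, offset in chunk:
            # One lock acquisition per index entry: short critical sections,
            # but many more of them, hence the contention concern.
            with lock:
                shared.setdefault(page, set()).add(offset)

    _run_workers(worker, index_chunks)
    return shared
```

Both functions produce the same final bitmap; the difference is only in how often the lock is taken and how long it is held. A more granular variant would use several locks, for example one per hash partition, which this sketch does not attempt.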