Re: DSA overflow in hash join - Mailing list pgsql-hackers

From: Tomas Vondra
Subject: Re: DSA overflow in hash join
Date:
Msg-id: 7e9903e7-c640-429f-a68c-612a2db4b6d1@vondra.me
In response to: Re: DSA overflow in hash join (Konstantin Knizhnik <knizhnik@garret.ru>)
List: pgsql-hackers

Hi,

I did look at this because of the thread about "nbatch overflow" [1], and
the patches I just posted in that thread resolve the issue for me, in the
sense that the reproducer [2] no longer fails.

But I think that's mostly an accident - the balancing reduces nbatch in
exchange for a larger in-memory hash table. In this case we start with
nbatch=2M, but it gets reduced to 64k, which is low enough for the batch
array to fit under the 1GB allocation limit.
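
Roughly speaking (the per-batch size below is just a ballpark figure I'm
assuming, not the exact value of EstimateParallelHashJoinBatch):

    2M batches * ~500 bytes of shared state per batch ~= 1GB   (at/over the limit)
   64k batches * ~500 bytes of shared state per batch ~= 32MB  (fits easily)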

Which is nice, but I can't guarantee it will always work out like this.
It's unlikely we'd need 2M batches, but is it impossible?

So we may still need something like the max_batches protection. I don't
think we should apply it to non-parallel hash joins, though, which is
what the last patch would do, I think.
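
To illustrate what I mean, roughly something like this (just a sketch I'm
making up here, not the actual patch - the real check would need to live
next to the nbatch calculation):

  /*
   * Sketch only: cap nbatch so the shared batch array stays below the
   * regular (non-huge) allocation limit. Assumes nbatch remains a power
   * of two, as elsewhere in the hash join code.
   */
  size_t    batch_size = EstimateParallelHashJoinBatch(hashtable);
  size_t    max_batches = MaxAllocSize / batch_size;

  while ((size_t) nbatch > max_batches)
      nbatch /= 2;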

However, why don't we simply allow huge allocations for this?

  /* Allocate space. */
  pstate->batches =
      dsa_allocate_extended(hashtable->area,
                            EstimateParallelHashJoinBatch(hashtable) * nbatch,
                            (DSA_ALLOC_ZERO | DSA_ALLOC_HUGE));

This fixes the issue for me, even with the balancing disabled. Or is
there a reason why this would be a bad idea?

It seems a bit strange to force parallel scans to use fewer batches,
when (presumably) parallelism is more useful for larger data sets.

regards


[1]
https://www.postgresql.org/message-id/244dc6c1-3b3d-4de2-b3de-b1511e6a6d10%40vondra.me

[2]
https://www.postgresql.org/message-id/52b94d5b-a135-489d-9833-2991a69ec623%40garret.ru

-- 
Tomas Vondra



