Thread: Patch for ginCombineData

Patch for ginCombineData

From
Robert Abraham
Date:
Hello,

we are using gin indexes on big tables. these tables happen to have several billion rows.
the index creation fails in ginCombineData in src/backend/access/ginbulk.c because repalloc is limited to 1 gb.
this limitation makes no sense in this context (compare comments in src/include/utils/memutils.h).
To overcome this limitation on tables with lots of rows repalloc_huge is used.
The test suite still succeeds on my machine.
Find the patch attached,

Kind regards,

Robert Abraham
Attachment

Re: Patch for ginCombineData

From
Alexander Korotkov
Date:
Hi!

On Wed, Aug 5, 2015 at 1:17 PM, Robert Abraham <robert.abraham86@googlemail.com> wrote:
we are using gin indexes on big tables. these tables happen to have several billion rows.
the index creation fails in ginCombineData in src/backend/access/ginbulk.c because repalloc is limited to 1 gb.
this limitation makes no sense in this context (compare comments in src/include/utils/memutils.h).
To overcome this limitation on tables with lots of rows repalloc_huge is used.
The test suite still succeeds on my machine.
Find the patch attached,

Thank you for notice and for the patch!
You should have maintenance_work_mem > 1gb and some very frequent entry so that it's posting list exceeds 1 gb itself.
These circumstances shouldn't be very rare in modern systems. I think it could be backpatched.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: Patch for ginCombineData

From
Robert Haas
Date:
On Wed, Aug 5, 2015 at 6:17 AM, Robert Abraham
<robert.abraham86@googlemail.com> wrote:
> we are using gin indexes on big tables. these tables happen to have several
> billion rows.
> the index creation fails in ginCombineData in src/backend/access/ginbulk.c
> because repalloc is limited to 1 gb.
> this limitation makes no sense in this context (compare comments in
> src/include/utils/memutils.h).
> To overcome this limitation on tables with lots of rows repalloc_huge is
> used.
> The test suite still succeeds on my machine.
> Find the patch attached,

Since nobody seems ready to act on this patch immediately, I suggest
adding it here so it doesn't get forgotten:

https://commitfest.postgresql.org/action/commitfest_view/open

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Patch for ginCombineData

From
Jeff Janes
Date:
On Wed, Aug 5, 2015 at 3:17 AM, Robert Abraham <robert.abraham86@googlemail.com> wrote:
Hello,

we are using gin indexes on big tables. these tables happen to have several billion rows.
the index creation fails in ginCombineData in src/backend/access/ginbulk.c because repalloc is limited to 1 gb.
this limitation makes no sense in this context (compare comments in src/include/utils/memutils.h).
To overcome this limitation on tables with lots of rows repalloc_huge is used.
The test suite still succeeds on my machine.
Find the patch attached,

This looks good to me.  

One possible issue I see is that if accum.allocatedMemory is only slightly less than maintenance_work_mem just before we decide to repalloc, then doing the repalloc_huge could cause us to exceed maintenance_work_mem. Potentially by a lot.  In the worst case (when all the data we've seen during this round of accumulation falls into the same key), we would overrun by a factor of almost 2.  Or almost 3, if you count the time during the repalloc when the new data has been allocated but the old data not yet freed.  Perhaps the code here should look at the amount of maintenance_work_mem left and grow by less than a factor of 2 if it is too close to going over.

Mitigating that, it won't actually use very much of that memory.  Once control passes back at the end of this tuple, the caller will then realize it has overshot the maintenance_work_mem and will flush it all.  I think most modern OSes will not have a problems caused by allocated but untouched memory.

This patch doesn't introduce that problem, it just allows it to operate at a higher absolute size.  So I'm marking this as ready for committer.

Cheers,

Jeff