Previously, heapBlk was defined as an unsigned 32-bit integer. On very large
tables, incrementing it by pagesPerRange could make it wrap around to a small
value, so the condition heapBlk < nblocks stayed true and the loop never
terminated. The PostgreSQL backend would then hang at 100% CPU, and the
operation on the large table would never complete.
The solution is straightforward: the data type of `heapBlk` has been changed
from `BlockNumber` (an unsigned 32-bit integer) to `int64`, so that it can be
advanced past the last block of even the largest table without overflowing.
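
To illustrate the wraparound, here is a minimal standalone sketch, not the actual PostgreSQL code. The function names and the local BlockNumber typedef are made up for this example, and pagesPerRange is assumed to be greater than zero. The first function walks the ranges with a 32-bit counter and reports whether the counter wrapped; the second does the same walk with an int64 counter and counts the ranges.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for PostgreSQL's BlockNumber (an unsigned 32-bit type). */
typedef uint32_t BlockNumber;

/*
 * Hypothetical helper: walk the page ranges with a 32-bit counter.
 * Returns true if the loop ends normally, false as soon as the counter
 * wraps around (without this check the loop would never end).
 */
static bool
scan_with_uint32(BlockNumber nblocks, BlockNumber pagesPerRange)
{
    BlockNumber heapBlk;
    BlockNumber prev = 0;

    for (heapBlk = 0; heapBlk < nblocks; heapBlk += pagesPerRange)
    {
        if (heapBlk < prev)
            return false;       /* counter wrapped around */
        prev = heapBlk;
    }
    return true;
}

/*
 * Hypothetical helper: the same walk with a 64-bit counter. The counter
 * can exceed any 32-bit nblocks value, so the loop always terminates.
 * Returns the number of ranges visited.
 */
static int64_t
count_ranges_int64(BlockNumber nblocks, BlockNumber pagesPerRange)
{
    int64_t heapBlk;
    int64_t nranges = 0;

    for (heapBlk = 0; heapBlk < nblocks; heapBlk += pagesPerRange)
        nranges++;
    return nranges;
}
```

In this sketch, a table size within pagesPerRange blocks of 2^32 makes the 32-bit counter wrap, while the int64 counter simply moves past nblocks and the loop ends.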
Tomas Vondra explained the problem in detail in [1] and suggested two possible
solutions:
i) Change heapBlk to int64
ii) Track prevHeapBlk
Of the two, I think solution (i) is the more practical one (it is similar to the
fix previously used in commit 4bc6fb57f774ea18187fd8565aad9994160bfc17 [2]),
though solution (ii) would also work.
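
For comparison, here is one way the prevHeapBlk approach could look. This is only my reading of the idea, not code from [1]. The function name and the local BlockNumber typedef are hypothetical, and pagesPerRange is assumed to be greater than zero. The counter stays 32-bit, and the loop stops as soon as the new value is smaller than the previous one.

```c
#include <stdint.h>

/* Local stand-in for PostgreSQL's BlockNumber (an unsigned 32-bit type). */
typedef uint32_t BlockNumber;

/*
 * Hypothetical helper: keep the 32-bit counter, but remember the previous
 * value and stop when the counter wraps. Returns the number of ranges
 * visited.
 */
static uint32_t
count_ranges_tracking_prev(BlockNumber nblocks, BlockNumber pagesPerRange)
{
    BlockNumber heapBlk = 0;
    BlockNumber prevHeapBlk;
    uint32_t    nranges = 0;

    while (heapBlk < nblocks)
    {
        nranges++;              /* process the range starting at heapBlk */

        prevHeapBlk = heapBlk;
        heapBlk += pagesPerRange;

        /* A smaller value after the increment means the counter wrapped. */
        if (heapBlk < prevHeapBlk)
            break;
    }
    return nranges;
}
```

This avoids widening the type but adds an extra variable and a check to each loop, which is part of why the int64 change looks simpler to me.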
I've attached a patch that implements solution (i).
Please review it and share any feedback or suggestions.
References:
[1] https://www.postgresql.org/message-id/b8a4e04c-c091-056c-a379-11d35c7b2d8d%40enterprisedb.com
[2] https://github.com/postgres/postgres/commit/4bc6fb57f774ea18187fd8565aad9994160bfc17