On 05.11.24 16:03, Bertrand Drouvot wrote:
> On Tue, Nov 05, 2024 at 05:08:41PM +1300, David Rowley wrote:
>> On Tue, 5 Nov 2024 at 06:39, Ranier Vilela <ranier.vf@gmail.com> wrote:
>>> I think we can add a small optimization to this last patch [1].
>>
>> I think if you want to make it faster, you could partially unroll the
>> inner-most loop, like:
>>
>> // size_t * 4
>> for (; p < aligned_end - (sizeof(size_t) * 3); p += sizeof(size_t) * 4)
>> {
>>     if (((size_t *) p)[0] != 0 | ((size_t *) p)[1] != 0 |
>>         ((size_t *) p)[2] != 0 | ((size_t *) p)[3] != 0)
>>         return false;
>> }
>
> Another option could be to use SIMD instructions to check that multiple bytes
> are zero in a single operation. Maybe just an idea to keep in mind and
> experiment with if we feel the need later on.

Speaking of which, couldn't you just use

    pg_popcount(ptr, len) == 0

?
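
For concreteness, here is a rough standalone sketch of the two ideas side by side. It is not the code from the patch: the function names all_zeros_unrolled and all_zeros_popcount are made up for illustration, the unrolled variant uses memcpy instead of the aligned pointer casts quoted above, and the bit-counting loop is only a portable stand-in for pg_popcount so that the example compiles without the PostgreSQL headers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: check four size_t words per iteration, then
 * finish the tail byte by byte.
 */
static bool
all_zeros_unrolled(const void *ptr, size_t len)
{
    const unsigned char *p = (const unsigned char *) ptr;
    const unsigned char *end = p + len;
    size_t      w[4];

    /* memcpy sidesteps alignment concerns for this sketch. */
    while ((size_t) (end - p) >= sizeof(w))
    {
        memcpy(w, p, sizeof(w));
        /* Bitwise | rather than || keeps the test free of extra branches. */
        if ((w[0] | w[1] | w[2] | w[3]) != 0)
            return false;
        p += sizeof(w);
    }

    /* Remaining bytes that do not fill a full group of four words. */
    for (; p < end; p++)
    {
        if (*p != 0)
            return false;
    }
    return true;
}

/*
 * Stand-in for the popcount idea: count every set bit in the buffer and
 * compare the total with zero. The real pg_popcount is not called here.
 */
static bool
all_zeros_popcount(const void *ptr, size_t len)
{
    const unsigned char *p = (const unsigned char *) ptr;
    uint64_t    bits = 0;

    for (size_t i = 0; i < len; i++)
    {
        unsigned char b = p[i];

        while (b != 0)
        {
            bits += b & 1;
            b >>= 1;
        }
    }
    return bits == 0;
}
```

Both return the same answer for any buffer. One difference visible in the sketch: the unrolled loop returns as soon as it meets a nonzero word, while the popcount form always walks the whole buffer before comparing.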