use CLZ instruction in AllocSetFreeIndex() - Mailing list pgsql-hackers

From John Naylor
Subject use CLZ instruction in AllocSetFreeIndex()
Date
Msg-id CACPNZCuNUGMxjK7WTn_=WZnRbfASDdBxmjsVf2+m9MdmeNw_sg@mail.gmail.com
Whole thread Raw
Responses Re: use CLZ instruction in AllocSetFreeIndex()
List pgsql-hackers
Hi all,

In commit ab5b4e2f9ed, we optimized AllocSetFreeIndex() using a lookup
table. At the time, using CLZ was rejected because compiler/platform
support was not widespread enough to justify it. For other reasons, we
recently added bitutils.h which uses __builtin_clz() where available,
so it makes sense to revisit this. I modified the test in [1] (C files
attached), using two separate functions to test CLZ versus the
open-coded algorithm of pg_leftmost_one_pos32().

These are typical results on a recent Intel platform:

HEAD        5.55s
clz         4.51s
open-coded  9.67s

CLZ gives a nearly 20% speed boost on this microbenchmark. I suspect
that this micro-benchmark is actually biased towards the lookup table
more than real-world workloads, because it can monopolize the L1
cache. Sparing cache is possibly the more interesting reason to use
CLZ. The open-coded version is nearly twice as slow, so it makes sense
to keep the current implementation as the default one, and not use
pg_leftmost_one_pos32() directly. However, with a small tweak, we can
reuse the lookup table in bitutils.c instead of the bespoke one used
solely by AllocSetFreeIndex(), saving a couple cache lines here also.
This is done in the attached patch.

[1] https://www.postgresql.org/message-id/407d949e0907201811i13c73e18x58295566d27aadcc%40mail.gmail.com

--
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [Patch] Invalid permission check in pg_stats for functional indexes
Next
From: Tom Lane
Date:
Subject: Re: pgsql: Superuser can permit passwordless connections on postgres_fdw