Re: Popcount optimization using AVX512 - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: Popcount optimization using AVX512
Msg-id 20240402155301.GA2750455@nathanxps13
In response to Re: Popcount optimization using AVX512  (Nathan Bossart <nathandbossart@gmail.com>)
Responses Re: Popcount optimization using AVX512
List pgsql-hackers
On Mon, Apr 01, 2024 at 05:11:17PM -0500, Nathan Bossart wrote:
> Here is a v19 of the patch set.  I moved out the refactoring of the
> function pointer selection code to 0001.  I think this is a good change
> independent of $SUBJECT, and I plan to commit this soon.  In 0002, I
> changed the syslogger.c usage of pg_popcount() to use pg_number_of_ones
> instead.  This is standard practice elsewhere where the popcount functions
> are unlikely to win.  I'll probably commit this one soon, too, as it's even
> more trivial than 0001.
>
> 0003 is the AVX512 POPCNT patch.  Besides refactoring out 0001, there are
> no changes from v18.  0004 is an early proof-of-concept for using AVX512
> for the visibility map code.  The code is missing comments, and I haven't
> performed any benchmarking yet, but I figured I'd post it because it
> demonstrates how it's possible to build upon 0003 in other areas.

I've committed the first two patches, and I've attached a rebased version
of the latter two.

> AFAICT the main open question is the function call overhead in 0003 that
> Alvaro brought up earlier.  After 0002 is committed, I believe the only
> in-tree caller of pg_popcount() with very few bytes is bit_count(), and I'm
> not sure it's worth expending too much energy to make sure there are
> absolutely no regressions there.  However, I'm happy to do so if folks feel
> that it is necessary, and I'd be grateful for thoughts on how to proceed on
> this one.

Another idea I had is to turn pg_popcount() into a macro that just uses the
pg_number_of_ones array when called for few bytes:

    static inline uint64
    pg_popcount_inline(const char *buf, int bytes)
    {
        uint64        popcnt = 0;

        while (bytes--)
            popcnt += pg_number_of_ones[(unsigned char) *buf++];

        return popcnt;
    }

    #define pg_popcount(buf, bytes) \
        (((bytes) < 64) ? \
         pg_popcount_inline((buf), (bytes)) : \
         pg_popcount_optimized((buf), (bytes)))

But again, I'm not sure this is really worth it for the current use-cases.
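
For illustration only, here is a minimal self-contained sketch of the same
dispatch idea that compiles outside the PostgreSQL tree.  Every demo_* name
is hypothetical: demo_number_of_ones stands in for pg_number_of_ones, and
demo_popcount_optimized is a portable 8-bytes-at-a-time stand-in, not the
AVX512 code from 0003.  It also assumes an inline function would be
acceptable in place of the macro, which avoids evaluating the arguments
twice.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for pg_number_of_ones[]; filled on first use. */
static uint8_t demo_number_of_ones[256];
static int	demo_table_ready = 0;

static void
demo_init_table(void)
{
	for (int i = 0; i < 256; i++)
	{
		int			v = i;
		int			n = 0;

		while (v)
		{
			n += v & 1;
			v >>= 1;
		}
		demo_number_of_ones[i] = (uint8_t) n;
	}
	demo_table_ready = 1;
}

/* Byte-at-a-time path: one table lookup per input byte. */
static inline uint64_t
demo_popcount_inline(const char *buf, int bytes)
{
	uint64_t	popcnt = 0;

	if (!demo_table_ready)
		demo_init_table();

	while (bytes--)
		popcnt += demo_number_of_ones[(unsigned char) *buf++];

	return popcnt;
}

/*
 * Stand-in for the out-of-line "optimized" path: counts bits in 8-byte
 * words with the usual bit-twiddling reduction, then handles the tail
 * with the table.
 */
static uint64_t
demo_popcount_optimized(const char *buf, int bytes)
{
	uint64_t	popcnt = 0;

	while (bytes >= 8)
	{
		uint64_t	w;

		memcpy(&w, buf, sizeof(w));		/* avoids unaligned access */
		w = w - ((w >> 1) & UINT64_C(0x5555555555555555));
		w = (w & UINT64_C(0x3333333333333333)) +
			((w >> 2) & UINT64_C(0x3333333333333333));
		w = (w + (w >> 4)) & UINT64_C(0x0F0F0F0F0F0F0F0F);
		popcnt += (w * UINT64_C(0x0101010101010101)) >> 56;

		buf += 8;
		bytes -= 8;
	}

	return popcnt + demo_popcount_inline(buf, bytes);
}

/* Dispatch: short inputs stay inline, longer ones take the function call. */
static inline uint64_t
demo_popcount(const char *buf, int bytes)
{
	return (bytes < 64) ?
		demo_popcount_inline(buf, bytes) :
		demo_popcount_optimized(buf, bytes);
}
```

Under these assumptions, a caller passing only a handful of bytes never
pays for the out-of-line call, while a caller passing 64 bytes or more
still reaches the faster path.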

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
