A population of population counts - Mailing list pgsql-hackers

From Thomas Munro
Subject A population of population counts
Date
Msg-id CAEepm=3k++Ytf2LNQCvpP6m1=gY9zZHP_cfnn47=WTsoCrLCvA@mail.gmail.com
Responses Re: A population of population counts  (David Rowley <david.rowley@2ndquadrant.com>)
List pgsql-hackers
Hi

I noticed that we have three "number_of_ones" tables under contrib and
two under src, and some new specially masked variants for visibility
maps.

Would it be an improvement if we just defined one table with external
linkage, and accessed it via macros/functions such as popcount_uint8,
wider versions like popcount_uint32, and popcount_array(data, len), which
sum the popcounts of their component bytes?
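
A minimal sketch of what such an interface might look like. Only the
names popcount_uint8, popcount_uint32 and popcount_array(data, len) come
from the proposal above; the table name, the return types and the
table-building macros are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * One shared 256-entry table: number_of_ones[b] is the number of set bits
 * in byte b.  The helper macros expand to the counts for 2, 4 and 6 bits.
 * (The table name and these macros are hypothetical.)
 */
#define B2(n) (n), (n) + 1, (n) + 1, (n) + 2
#define B4(n) B2(n), B2((n) + 1), B2((n) + 1), B2((n) + 2)
#define B6(n) B4(n), B4((n) + 1), B4((n) + 1), B4((n) + 2)
const uint8_t number_of_ones[256] = {B6(0), B6(1), B6(1), B6(2)};

/* Population count of a single byte: one table lookup. */
static inline int
popcount_uint8(uint8_t v)
{
    return number_of_ones[v];
}

/* Population count of a 32-bit word: sum of its four bytes' counts. */
static inline int
popcount_uint32(uint32_t v)
{
    return number_of_ones[v & 0xff] +
           number_of_ones[(v >> 8) & 0xff] +
           number_of_ones[(v >> 16) & 0xff] +
           number_of_ones[v >> 24];
}

/* Population count of an arbitrary buffer, one byte at a time. */
static inline uint64_t
popcount_array(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint64_t total = 0;

    while (len-- > 0)
        total += number_of_ones[*p++];
    return total;
}
```

With everything behind these three entry points, the bodies could later
be swapped for compiler built-ins or wider loads without touching callers.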

Then there would be less duplication, and future opportunities to use
fancy built-ins/assembler instructions/vectorisation in one central
place, and to work in larger sizes than bytes.

Perhaps we could also get rid of the new special masked popcount
tables by masking the value we look up instead, e.g. walk through the
page calling popcount_uint64(value & FROZEN_BITMASK_64).
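
A rough sketch of the masking idea, under stated assumptions: the
visibility map is taken to store two bits per heap block with the
"frozen" flag in the higher bit of each pair, so FROZEN_BITMASK_64 is
assumed to be 0xAAAAAAAAAAAAAAAA. The function count_frozen and the
portable bit-twiddling body of popcount_uint64 are hypothetical, not
existing code.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: two bits per heap block, frozen flag in the odd bit. */
#define FROZEN_BITMASK_64 UINT64_C(0xAAAAAAAAAAAAAAAA)

/* Portable 64-bit population count (parallel bit summation). */
static inline int
popcount_uint64(uint64_t v)
{
    v = v - ((v >> 1) & UINT64_C(0x5555555555555555));
    v = (v & UINT64_C(0x3333333333333333)) +
        ((v >> 2) & UINT64_C(0x3333333333333333));
    v = (v + (v >> 4)) & UINT64_C(0x0F0F0F0F0F0F0F0F);
    return (int) ((v * UINT64_C(0x0101010101010101)) >> 56);
}

/*
 * Count the frozen bits in a map page viewed as 64-bit words: mask each
 * word first, then use the ordinary popcount.  No special table needed.
 */
static uint64_t
count_frozen(const uint64_t *words, size_t nwords)
{
    uint64_t total = 0;

    for (size_t i = 0; i < nwords; i++)
        total += popcount_uint64(words[i] & FROZEN_BITMASK_64);
    return total;
}
```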

As for the rightmost_one_pos table in bitmapset.c, couldn't the
various bms_XXX functions just use ffs(n) - 1 and work word-at-a-time?
That generates a bsf instruction at -O2 on this machine.
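
For illustration, a small sketch of the ffs(n) - 1 approach using the
POSIX ffs() from <strings.h>. The helper names rightmost_one_pos_word
and word_members are hypothetical and not part of bitmapset.c, and the
sketch assumes a word no wider than int.

```c
#include <strings.h>            /* POSIX ffs() */

/*
 * Zero-based position of the lowest set bit in w, or -1 if w is zero.
 * ffs() returns a one-based position, and 0 when no bit is set.
 */
static inline int
rightmost_one_pos_word(unsigned int w)
{
    return ffs((int) w) - 1;
}

/*
 * Store the positions of all set bits of w into out[], lowest first,
 * working on the whole word rather than byte by byte.  Returns the
 * number of positions stored; out[] must have room for every bit of w.
 */
static int
word_members(unsigned int w, int *out)
{
    int n = 0;

    while (w != 0)
    {
        out[n++] = rightmost_one_pos_word(w);
        w &= w - 1;             /* clear the lowest set bit */
    }
    return n;
}
```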

The micro-optimisation opportunities probably don't matter, but I
wondered if it might at least be interesting to delete a bunch of
code, and re-use a standard interface for something that apparently
several modules need to do.

-- 
Thomas Munro
http://www.enterprisedb.com


