Thread: HyperLogLog.c and pg_leftmost_one_pos32()
Is there a reason that HyperLogLog doesn't use pg_leftmost_one_pos32()? I
tried the following patch, and some brief performance tests seem to show an
improvement.

This came up because my recent commit 9878b643 uses HLL for estimating the
cardinality of spill files, which solves a few annoyances with
overpartitioning[1]. I think it's overall an improvement, but
addHyperLogLog() itself seemed to show up as a cost, so it can hurt
spilled-but-still-in-memory cases.

I'd also like to backpatch this to 13 (as I already did for 9878b643), if
that's acceptable.

Regards,
	Jeff Davis

[1] https://www.postgresql.org/message-id/CAH2-Wznidojad-zbObnFOzDA5RTCS0JLsqcZkDNu+ou1NGYQYQ@mail.gmail.com
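For illustration, the change under discussion amounts to something like the
following rewrite of rho() in src/backend/lib/hyperloglog.c. This is a sketch
only, not the attached patch itself; the exact handling of the x == 0 case and
the b + 1 cap shown here is an assumption about how the existing loop-based
rho() semantics would be preserved.

    #include "port/pg_bitutils.h"   /* pg_leftmost_one_pos32() */

    /*
     * rho() reports how far the first set bit is from the top of the hash,
     * starting at 1 and saturating at b + 1.  The existing code scans one
     * bit per loop iteration; pg_leftmost_one_pos32() gives the 0-based
     * position of the highest set bit in O(1), but must not be called with
     * a zero argument.
     */
    static inline uint8
    rho(uint32 x, uint8 b)
    {
        uint8       j;

        if (x == 0)
            j = b + 1;          /* no set bit at all: saturate */
        else
            j = 32 - pg_leftmost_one_pos32(x);  /* leading zeros + 1 */

        if (j > b + 1)
            j = b + 1;          /* rho() never reports more than b + 1 */

        return j;
    }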
On Wed, Jul 29, 2020 at 10:08 AM Jeff Davis <pgsql@j-davis.com> wrote:
> Is there a reason that HyperLogLog doesn't use pg_leftmost_one_pos32()?

Yes: HyperLogLog predates pg_leftmost_one_pos32().

> I tried the following patch and some brief performance tests seem to
> show an improvement.

Makes sense. How did you test this? What kind of difference are we talking
about?

I ported this code from the upstream C++ as part of the original abbreviated
keys commit. Note that the cardinality of abbreviated keys is displayed when
you set "trace_sort = on".

> This came up because my recent commit 9878b643 uses HLL for estimating
> the cardinality of spill files, which solves a few annoyances with
> overpartitioning[1].

I think that you should change back the rho() variable names to match
HyperLogLog upstream (as well as the existing rho() comments).

> I think it's overall an improvement, but
> addHyperLogLog() itself seemed to show up as a cost, so it can hurt
> spilled-but-still-in-memory cases. I'd also like to backpatch this to
> 13 (as I already did for 9878b643), if that's acceptable.

I still wonder if it was ever necessary to add HLL to abbreviated keys. It
only served to avoid some pretty narrow worst cases, at the expense of
typical cases. Given that the existing users of HLL are pretty narrow, and
given the importance of preserving the favorable performance characteristics
of hash aggregate, I'm inclined to agree that it's worth backpatching to 13
now. Assuming it is a really measurable cost in practice.

-- 
Peter Geoghegan
On Wed, 2020-07-29 at 17:32 -0700, Peter Geoghegan wrote:
> How did you test this? What kind of difference are we talking about?

Essentially:

   initHyperLogLog(&hll, 5)
   for i in 0 .. one billion
      addHyperLogLog(&hll, hash(i))
   estimateHyperLogLog

The numbers are the same regardless of bwidth.

Before my patch, it takes about 15.6s. After my patch, it takes about 6.6s,
so it's more than a 2X speedup (including the hash calculation).

As a part of HashAgg, when I test the worst case, it goes from a 4-5% penalty
to ~1% (within noise).

> I think that you should change back the rho() variable names to match
> HyperLogLog upstream (as well as the existing rho() comments).

Done.

> > I think it's overall an improvement, but
> > addHyperLogLog() itself seemed to show up as a cost, so it can hurt
> > spilled-but-still-in-memory cases. I'd also like to backpatch this
> > to 13 (as I already did for 9878b643), if that's acceptable.
>
> I still wonder if it was ever necessary to add HLL to abbreviated
> keys. It only served to avoid some pretty narrow worst cases, at the
> expense of typical cases. Given that the existing users of HLL are
> pretty narrow, and given the importance of preserving the favorable
> performance characteristics of hash aggregate, I'm inclined to agree
> that it's worth backpatching to 13 now. Assuming it is a really
> measurable cost in practice.

Yes, the difference (at least in a tight loop, on my machine) is not subtle.
I went ahead and committed and backpatched.

Regards,
	Jeff Davis
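For reference, the tight loop above corresponds to roughly the following
backend C. This is a sketch only; the function name hll_bench, the choice of
hash_uint32() from common/hashfn.h to stand in for hash(i), and the elog()
report are illustrative assumptions, not the exact harness used.

    #include "postgres.h"

    #include "common/hashfn.h"      /* hash_uint32() */
    #include "lib/hyperloglog.h"    /* initHyperLogLog(), addHyperLogLog(), ... */

    /* Tight-loop HLL microbenchmark; must run inside the backend. */
    static void
    hll_bench(void)
    {
        hyperLogLogState hll;
        uint64      i;

        initHyperLogLog(&hll, 5);   /* bwidth = 5, as in the test above */

        for (i = 0; i < UINT64CONST(1000000000); i++)
        {
            uint32      h = DatumGetUInt32(hash_uint32((uint32) i));

            addHyperLogLog(&hll, h);
        }

        elog(LOG, "HLL estimate: %f", estimateHyperLogLog(&hll));

        freeHyperLogLog(&hll);
    }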
On Thu, Jul 30, 2020 at 09:21:23AM -0700, Jeff Davis wrote:
>On Wed, 2020-07-29 at 17:32 -0700, Peter Geoghegan wrote:
>> How did you test this? What kind of difference are we talking about?
>
>Essentially:
>   initHyperLogLog(&hll, 5)
>   for i in 0 .. one billion
>      addHyperLogLog(&hll, hash(i))
>   estimateHyperLogLog
>
>The numbers are the same regardless of bwidth.
>
>Before my patch, it takes about 15.6s. After my patch, it takes about
>6.6s, so it's more than a 2X speedup (including the hash calculation).
>

Wow. That's a huge improvement.

What does the whole test (data + query) look like? Is it a particularly
rare / special case, or something reasonable to expect in practice?

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, 2020-07-30 at 19:16 +0200, Tomas Vondra wrote:
> > Essentially:
> >    initHyperLogLog(&hll, 5)
> >    for i in 0 .. one billion
> >       addHyperLogLog(&hll, hash(i))
> >    estimateHyperLogLog
> >
> > The numbers are the same regardless of bwidth.
> >
> > Before my patch, it takes about 15.6s. After my patch, it takes
> > about 6.6s, so it's more than a 2X speedup (including the hash
> > calculation).
>
> Wow. That's a huge improvement.

To be clear: the 2X+ speedup was on the tight loop test.

> What does the whole test (data + query) look like? Is it a particularly
> rare / special case, or something reasonable to expect in practice?

The whole-query test was:

config:
  shared_buffers=8GB
  jit = off
  max_parallel_workers_per_gather=0

setup:
  create table t_1m_20(i int);
  insert into t_1m_20
    select (random()*1000000)::int4 from generate_series(1,20000000);
  vacuum (freeze, analyze) t_1m_20;

query:
  set work_mem='2048kB';
  SELECT pg_prewarm('t_1m_20', 'buffer');
  -- median of the three runs
  select distinct i from t_1m_20 offset 10000000;
  select distinct i from t_1m_20 offset 10000000;
  select distinct i from t_1m_20 offset 10000000;

results:
  f2130e77 (before using HLL):        6787ms
  f1af75c5 (before my recent commit): 7170ms
  fd734f38 (master now):              6990ms

My previous results before I committed the patch (and therefore not on the
same exact SHA1s) were 6812, 7158, and 6898. So my most recent batch of
results is slightly worse, but the most recent commit (fd734f38) still does
show an improvement of a couple percent.

Regards,
	Jeff Davis