Thread: Speed up collation cache

Speed up collation cache

From
Jeff Davis
Date:
The blog post here (thank you depesz!):

https://www.depesz.com/2024/06/11/how-much-speed-youre-leaving-at-the-table-if-you-use-default-locale/

showed an interesting result where the builtin provider is not quite as
fast as "C" for queries like:

   SELECT * FROM a WHERE t = '...';

The reason is that it's calling varstr_cmp() many times, which does a
lookup in the collation cache for each call. For sorts, it only does a
lookup in the collation cache once, so the effect is not significant.

Looking up "C" is faster because there's a special check for
C_COLLATION_OID, so it doesn't even need to do the hash lookup. If
you create an equivalent collation like:

   CREATE COLLATION libc_c(PROVIDER = libc, LOCALE = 'C');

it will perform the same as a collation with the builtin provider.
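
To make the difference concrete, here is a minimal sketch of a per-comparison check with a hardwired fast path in front of a cache lookup. It is not the PostgreSQL source: every demo_ name, the OID value 950, and the linear-scan stand-in for the hash table are assumptions made for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;

/* Placeholder OID for the special-cased "C" collation (assumed for this sketch). */
#define DEMO_C_COLLATION_OID 950u
#define DEMO_CACHE_SIZE 8

typedef struct DemoCacheEntry
{
    Oid  collid;
    bool collate_is_c;
    bool in_use;
} DemoCacheEntry;

static DemoCacheEntry demo_cache[DEMO_CACHE_SIZE];

/* Counts how often the (comparatively expensive) cache lookup runs. */
static unsigned long demo_probe_count = 0;

/* Stand-in for the real hash lookup: a linear scan over a tiny array. */
static DemoCacheEntry *
demo_cache_lookup(Oid collid)
{
    demo_probe_count++;
    for (size_t i = 0; i < DEMO_CACHE_SIZE; i++)
    {
        if (demo_cache[i].in_use && demo_cache[i].collid == collid)
            return &demo_cache[i];
    }
    return NULL;
}

/* Register a collation in the demo cache; returns false if the cache is full. */
static bool
demo_cache_insert(Oid collid, bool collate_is_c)
{
    for (size_t i = 0; i < DEMO_CACHE_SIZE; i++)
    {
        if (!demo_cache[i].in_use)
        {
            demo_cache[i].collid = collid;
            demo_cache[i].collate_is_c = collate_is_c;
            demo_cache[i].in_use = true;
            return true;
        }
    }
    return false;
}

/* Called once per comparison in this sketch. */
static bool
demo_collate_is_c(Oid collid)
{
    /* Fast path: the hardwired OID never touches the cache. */
    if (collid == DEMO_C_COLLATION_OID)
        return true;

    /* Every other collation, even an equivalent one, pays for a lookup. */
    DemoCacheEntry *entry = demo_cache_lookup(collid);

    return entry != NULL && entry->collate_is_c;
}
```

In this sketch a collation registered under any other OID (standing in for libc_c) returns the same answer as "C" but increments demo_probe_count on every call, which is the per-row cost being discussed.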

Attached is a patch to use simplehash.h instead, which speeds things up
enough to make them fairly close (from around 15% slower to around 8%).

The patch is based on the series here:

https://postgr.es/m/f1935bc481438c9d86c2e0ac537b1c110d41a00a.camel@j-davis.com

which does some refactoring in a related area, but I can make them
independent.

We can also consider what to do about those special cases:

  * add a special case for PG_C_UTF8?
  * instead of a hardwired set of special collation IDs, have a
    single-element "last collation ID" to check before doing the hash
    lookup?
  * remove the special cases entirely if we can close the performance
    gap enough that it's not important?

(Note: the special case in lc_ctype_is_c() is currently required for
correctness because hba.c uses C_COLLATION_OID for regexes before the
syscache is initialized. That can be fixed pretty easily in a couple of
different ways, though.)

--
Jeff Davis
PostgreSQL Contributor Team - AWS




Re: Speed up collation cache

From
Peter Eisentraut
Date:
On 15.06.24 01:46, Jeff Davis wrote:
>    * instead of a hardwired set of special collation IDs, have a single-
> element "last collation ID" to check before doing the hash lookup?

I'd imagine that method could be very effective.



Re: Speed up collation cache

From
John Naylor
Date:
On Sat, Jun 15, 2024 at 6:46 AM Jeff Davis <pgsql@j-davis.com> wrote:
> Attached is a patch to use simplehash.h instead, which speeds things up
> enough to make them fairly close (from around 15% slower to around 8%).

+#define SH_HASH_KEY(tb, key)   hash_uint32((uint32) key)

If the goal is a static inline hash for speed, we can use murmurhash32
here, which is also inline.
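
For context, a small sketch of what an inline 32-bit mixing hash of this kind looks like. The body below is the well-known MurmurHash3 32-bit finalizer; that PostgreSQL's murmurhash32 matches it step for step is an assumption here, and the demo_ names and the bucket helper are hypothetical, not taken from the patch.

```c
#include <stdint.h>

typedef uint32_t Oid;

/*
 * MurmurHash3 32-bit finalizer. Each step (xor-shift, multiply by an odd
 * constant) is reversible, so distinct inputs always give distinct outputs.
 */
static inline uint32_t
demo_murmurhash32(uint32_t data)
{
    uint32_t h = data;

    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/*
 * Hypothetical helper: map a collation OID to a bucket in a table whose
 * size is a power of two. Because the function is static inline, the
 * compiler can fold the hash into the caller instead of emitting a call.
 */
static inline uint32_t
demo_bucket(Oid collid, uint32_t nbuckets_pow2)
{
    return demo_murmurhash32((uint32_t) collid) & (nbuckets_pow2 - 1);
}
```

The point of the suggestion is that a key as small as an OID only needs a handful of arithmetic instructions to hash, so avoiding a function call per lookup is a measurable share of the cost.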



Re: Speed up collation cache

From
Jeff Davis
Date:
On Thu, 2024-06-20 at 17:07 +0700, John Naylor wrote:
> On Sat, Jun 15, 2024 at 6:46 AM Jeff Davis <pgsql@j-davis.com> wrote:
> > Attached is a patch to use simplehash.h instead, which speeds
> > things up
> > enough to make them fairly close (from around 15% slower to around
> > 8%).
>
> +#define SH_HASH_KEY(tb, key)   hash_uint32((uint32) key)
>
> For a static inline hash for speed reasons, we can use murmurhash32
> here, which is also inline.

Thank you, that brings it down a few more percentage points.

New patches attached, still based on the setlocale-removal patch
series.

Setup:

  create collation libc_c (provider=libc, locale='C');
  create table collation_cache_test(t text);
  insert into collation_cache_test
    select g::text||' '||g::text
      from generate_series(1,200000000) g;

Queries:

  select * from collation_cache_test where t < '0' collate "C";
  select * from collation_cache_test where t < '0' collate libc_c;

The two collations are identical except that the former benefits from
the optimization for C_COLLATION_OID, and the latter does not, so these
queries measure the overhead of the collation cache lookup.

Results (in ms):

              "C"   "libc_c"   overhead
   master:    6350     7855     24%
   v4-0001:   6091     6324      4%

(Note: I don't have an explanation for the difference in performance of
the "C" locale -- probably just some noise in the test.)
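
As a check on the arithmetic, the overhead column appears to be the relative slowdown of libc_c against "C" within the same row. A minimal sketch, assuming that definition (the function name is hypothetical):

```c
/*
 * Relative overhead, in percent, of a measured time against a baseline.
 * Both arguments use the same unit (milliseconds in the table above).
 */
static double
demo_overhead_pct(double baseline_ms, double measured_ms)
{
    return (measured_ms - baseline_ms) / baseline_ms * 100.0;
}
```

With the numbers from the table this gives about 23.7 for master (6350 vs. 7855) and about 3.8 for v4-0001 (6091 vs. 6324), consistent with the rounded 24% and 4%.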

Considering that simplehash brings the worst-case overhead under 5%, I
don't see a strong reason to also add the single-element cache.

Regards,
    Jeff Davis



Re: Speed up collation cache

From
Andreas Karlsson
Date:
On 7/26/24 11:00 PM, Jeff Davis wrote:
> Results (in ms):
> 
>                "C"   "libc_c"   overhead
>     master:    6350     7855     24%
>     v4-0001:   6091     6324      4%

I saw more overhead than that when I quickly ran the same benchmark. I
also tried your idea of caching the last lookup (PoC patch attached),
and it removed basically all of the overhead, but I guess it will not
help if you have two different non-default locales in the same query.

             "C"   "libc_c" overhead
before:     6695       8376      25%
after:      6605       7340      11%
cache last: 6618       6677       1%
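
The "cache last" idea and its limitation can be sketched as follows. This is not the attached PoC patch: all demo_ names are hypothetical, and the slow lookup is a counting stand-in for the real hash table lookup.

```c
#include <stdint.h>

typedef uint32_t Oid;

#define DEMO_INVALID_OID 0u

/* Single-element cache: the most recently looked-up collation and its result. */
static Oid demo_last_collid = DEMO_INVALID_OID;
static int demo_last_result = 0;

/* Counts how often the slow path (the stand-in hash lookup) runs. */
static unsigned long demo_slow_lookups = 0;

/* Stand-in for the hash table lookup; the returned value is arbitrary. */
static int
demo_slow_lookup(Oid collid)
{
    demo_slow_lookups++;
    return (int) (collid % 7u);
}

/* Check the single-element cache first, then fall back to the slow lookup. */
static int
demo_lookup(Oid collid)
{
    if (collid != DEMO_INVALID_OID && collid == demo_last_collid)
        return demo_last_result;

    int result = demo_slow_lookup(collid);

    demo_last_collid = collid;
    demo_last_result = result;
    return result;
}

/*
 * Run n lookups that alternate between collations a and b, starting from
 * an empty cache, and report how many of them took the slow path.
 */
static unsigned long
demo_count_slow_lookups(Oid a, Oid b, int n)
{
    demo_last_collid = DEMO_INVALID_OID;
    demo_slow_lookups = 0;
    for (int i = 0; i < n; i++)
        (void) demo_lookup((i % 2) ? b : a);
    return demo_slow_lookups;
}
```

With a single collation (a equal to b) only the first of n lookups takes the slow path. With two collations alternating, every lookup misses, which is the case where the extra cache would not help.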

But even without that extra optimization I think this patch is worth
merging: it is small, simple, clean, easy to understand, and a clear
speedup. Feels like a no-brainer. I think that it is ready for
committer.

And then we can discuss after committing if an additional cache of the 
last locale is worth it or not.

Andreas

Re: Speed up collation cache

From
Jeff Davis
Date:
On Sun, 2024-07-28 at 00:14 +0200, Andreas Karlsson wrote:
> But even without that extra optimization I think this patch is worth
> merging and the patch is small, simple and clean and easy to
> understand
> and just a clear speed up. Feels like a no brainer. I think that it
> is
> ready for committer.

Committed, thank you.

> And then we can discuss after committing if an additional cache of
> the
> last locale is worth it or not.

Yeah, I'm holding off on that until refactoring in the area settles,
and we'll see if it's still worth it.

Regards,
    Jeff Davis