Re: [POC] verifying UTF-8 using SIMD instructions - Mailing list pgsql-hackers

From John Naylor
Subject Re: [POC] verifying UTF-8 using SIMD instructions
Msg-id CAFBsxsGkjcpFmqVNmE+T8AUV8XJNMsU+LOzu_HveQLvA5zjc6w@mail.gmail.com
In response to [POC] verifying UTF-8 using SIMD instructions  (John Naylor <john.naylor@enterprisedb.com>)
Responses Re: [POC] verifying UTF-8 using SIMD instructions  (John Naylor <john.naylor@enterprisedb.com>)
List pgsql-hackers
I wrote:

> Thanks for testing! Good, the speedup is about as much as I can hope for using plain C. In the next patch I'll go ahead and squash in the ascii fast path, using 16-byte stride, unless there are objections. I claim we can live with the regression Heikki found on an old 32-bit Arm platform since it doesn't seem to be true of Arm in general.

In v8, I've squashed the 16-byte stride into 0002. I also removed the sole holdout of hard-coded intrinsics by putting _mm_setr_epi8 inside a variadic macro, and did some reordering of the one-line function definitions. (As before, 0001 is not my patch, but parts of it are a prerequisite for my regression tests.)
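For readers who have not opened the patch, here is a rough sketch of what an ASCII fast path with a 16-byte stride can look like in plain C. It is not the code from 0002: the names chunk_is_ascii and ascii_prefix_len are hypothetical, and the sketch only tests the high bit of each byte, leaving out any other per-byte checks a real validator would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STRIDE 16				/* bytes examined per fast-path step */

/*
 * Hypothetical helper: returns true if none of the 16 bytes starting at s
 * has its high bit set, i.e. the whole chunk is 7-bit ASCII.
 */
static bool
chunk_is_ascii(const unsigned char *s)
{
	uint64_t	half1;
	uint64_t	half2;

	assert(s != NULL);

	/* memcpy avoids unaligned loads; compilers turn it into plain moves */
	memcpy(&half1, s, sizeof(half1));
	memcpy(&half2, s + sizeof(half1), sizeof(half2));

	/* OR the halves together so one mask test covers all 16 bytes */
	return ((half1 | half2) & UINT64_C(0x8080808080808080)) == 0;
}

/*
 * Hypothetical driver: returns how many leading bytes (always a multiple of
 * STRIDE) were accepted by the fast path. The caller would hand the rest of
 * the input to the byte-at-a-time or SIMD validator.
 */
static size_t
ascii_prefix_len(const unsigned char *s, size_t len)
{
	size_t		done = 0;

	while (len - done >= STRIDE && chunk_is_ascii(s + done))
		done += STRIDE;

	return done;
}
```

The point of the stride is that pure-ASCII input is cleared 16 bytes at a time with a couple of loads and one mask test, while the first chunk containing a multibyte sequence drops out to the slower path.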

Over in [1], I tested in situ in a COPY FROM test and found a 10% speedup with mixed ASCII and multibyte input in the copy code, i.e. with buffer and storage taken completely out of the picture.
