Re: speed up verifying UTF-8 - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: speed up verifying UTF-8
Date
Msg-id bca46396-a517-467c-72f8-6140a05a4d1e@iki.fi
In response to Re: speed up verifying UTF-8  (John Naylor <john.naylor@enterprisedb.com>)
Responses Re: speed up verifying UTF-8  (Heikki Linnakangas <hlinnaka@iki.fi>)
List pgsql-hackers
On 03/06/2021 22:10, John Naylor wrote:
> On Thu, Jun 3, 2021 at 3:08 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>  >                 x1 = half1 + UINT64CONST(0x7f7f7f7f7f7f7f7f);
>  >                 x2 = half2 + UINT64CONST(0x7f7f7f7f7f7f7f7f);
>  >
>  >                 /* then check that the high bit is set in each byte. */
>  >                 x = (x1 | x2);
>  >                 x &= UINT64CONST(0x8080808080808080);
>  >                 if (x != UINT64CONST(0x8080808080808080))
>  >                         return 0;
> 
> That seems right, I'll try that and update the patch. (Forgot to attach 
> earlier anyway)

Ugh, actually that has the same issue as before. If a byte in one half 
is zero, but the corresponding byte in the other half is not, this fails 
to detect it. Sorry for the noise.

- Heikki
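
For illustration only, here is a minimal sketch of the failure described above. It is not taken from the patch: the function names no_zero_bytes_or and no_zero_bytes_and and the macros HIGHBITS and LOW7 are hypothetical, and it assumes the caller has already rejected any byte with the high bit set, so the per-byte additions cannot carry into a neighbouring byte. The first function mirrors the quoted check. The second shows one possible repair, combining the halves with AND so that a zero byte in either half clears the high bit in the result.

```c
#include <assert.h>
#include <stdint.h>

#define HIGHBITS UINT64_C(0x8080808080808080)
#define LOW7     UINT64_C(0x7f7f7f7f7f7f7f7f)

/*
 * Mirrors the quoted check. Adding 0x7f to a byte in 0x01..0x7f sets its
 * high bit; a zero byte becomes 0x7f and keeps the high bit clear.
 * Because the halves are OR-ed, a zero byte in one half is masked by a
 * nonzero byte at the same position in the other half.
 */
static int
no_zero_bytes_or(uint64_t half1, uint64_t half2)
{
    /* Assumed precondition: both halves are pure 7-bit data. */
    assert(((half1 | half2) & HIGHBITS) == 0);

    uint64_t x1 = half1 + LOW7;
    uint64_t x2 = half2 + LOW7;
    uint64_t x = (x1 | x2) & HIGHBITS;

    return x == HIGHBITS;
}

/*
 * One possible repair (an assumption, not necessarily what the thread
 * settled on): AND the halves, so the high bit survives at a position
 * only if the byte there is nonzero in both halves.
 */
static int
no_zero_bytes_and(uint64_t half1, uint64_t half2)
{
    /* Assumed precondition: both halves are pure 7-bit data. */
    assert(((half1 | half2) & HIGHBITS) == 0);

    uint64_t x1 = half1 + LOW7;
    uint64_t x2 = half2 + LOW7;
    uint64_t x = (x1 & x2) & HIGHBITS;

    return x == HIGHBITS;
}
```

With half1 = 0x6161616100616161 (one zero byte) and half2 = 0x6161616161616161, the OR variant reports that no zero byte is present, while the AND variant correctly reports one.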


