Re: speed up verifying UTF-8 - Mailing list pgsql-hackers

From Greg Stark
Subject Re: speed up verifying UTF-8
Date
Msg-id CAM-w4HMTHARzthg-3j1GnXUFTer0mLVXt8voPbA+iF68OSvHTg@mail.gmail.com
In response to Re: speed up verifying UTF-8  (Greg Stark <stark@mit.edu>)
Responses Re: speed up verifying UTF-8
List pgsql-hackers
I haven't looked at the surrounding code. Are we processing all the
COPY data in one long stream, or processing each field individually? If
we're processing much more than 128 bits, and we're happy to detect NUL
errors only at the end, after wasting some work, then you could hoist
that has_zero check entirely out of the loop. That removes the branch,
though it's probably a correctly predicted branch anyway.

Do something like:

zero_accumulator = zero_accumulator & next_chunk

in the loop and then only at the very end check for zeros in that.
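
To make the idea concrete, here is a minimal C sketch of the deferred check. It is not the code from the patch under discussion: the function name ascii_no_nul, the 8-byte chunk size, and the highbit_acc variable are all assumptions made for illustration. One caveat: ANDing raw chunks can clear a byte position without any single byte being zero (0x01 & 0x02 == 0), so this sketch ANDs in chunk + 0x7f..7f instead. For an ASCII byte, that sum has its high bit set exactly when the byte is nonzero.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not PostgreSQL's actual function): returns true if
 * the len bytes at s are all ASCII and none of them is NUL.  Both checks
 * are accumulated in the loop and tested only once, at the end.
 */
static bool
ascii_no_nul(const unsigned char *s, size_t len)
{
    const uint64_t high_bits = UINT64_C(0x8080808080808080);
    uint64_t    highbit_acc = 0;        /* OR of every chunk seen */
    uint64_t    zero_acc = high_bits;   /* a high bit clears on a NUL byte */

    while (len >= sizeof(uint64_t))
    {
        uint64_t    chunk;

        memcpy(&chunk, s, sizeof(chunk));   /* safe unaligned load */
        highbit_acc |= chunk;

        /*
         * For an ASCII byte b, b + 0x7f has its high bit set iff b != 0,
         * and the sum never carries into the next byte.  Carries can only
         * come from non-ASCII bytes, which highbit_acc rejects anyway.
         */
        zero_acc &= chunk + UINT64_C(0x7f7f7f7f7f7f7f7f);

        s += sizeof(chunk);
        len -= sizeof(chunk);
    }

    /* Handle the remaining tail bytes one at a time. */
    for (; len > 0; s++, len--)
    {
        highbit_acc |= *s;
        if (*s == 0)
            zero_acc = 0;
    }

    /* The only branches on the accumulated state, after the loop. */
    if (highbit_acc & high_bits)
        return false;           /* some byte was non-ASCII */
    return zero_acc == high_bits;
}
```

The trade-off is the one described above: a NUL near the start of the input is only reported after the whole input has been scanned.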


