Thread: [PATCH] avoid double scanning in function byteain
Hi, The byteain function converts a string input into a bytea type. The original implementation processes two input formats: a hex format (starting with \x) and a traditional escaped format. For the escaped format, the function scans the input string twice — once to calculate the exact size of the output and allocate memory, and again to fill the allocated memory with the parsed data. This double scanning can be inefficient, especially for large inputs. So I optimized the function to eliminate the need for two scans, while preserving correctness and efficiency. Please help review it and share your valuable comments. Thanks, Steven Niu https://www.highgo.com/
Attachment
On Wed, 26 Mar 2025 at 12:17, Steven Niu <niushiji@gmail.com> wrote: > > Hi, Hi! > This double scanning can be inefficient, especially for large inputs. > So I optimized the function to eliminate the need for two scans, > while preserving correctness and efficiency. While the argument that processing input once not twice is fast is generally true, may we have some simple bench here just to have an idea how valuable this patch is? Patch: >+ /* Handle traditional escaped style in a single pass */ >+ input_len = strlen(inputText); >+ result = palloc(input_len + VARHDRSZ); /* Allocate max possible size */ > rp = VARDATA(result); >+ tp = inputText; >+ > while (*tp != '\0') Isn't this `strlen` O(n) + `while` O(n)? Where is the speed up? [0] https://github.com/bminor/glibc/blob/master/string/strlen.c#L43-L45 -- Best regards, Kirill Reshke
在 2025/3/26 16:37, Kirill Reshke 写道: > On Wed, 26 Mar 2025 at 12:17, Steven Niu <niushiji@gmail.com> wrote: >> >> Hi, > > Hi! > >> This double scanning can be inefficient, especially for large inputs. >> So I optimized the function to eliminate the need for two scans, >> while preserving correctness and efficiency. > > While the argument that processing input once not twice is fast is > generally true, may we have some simple bench here just to have an > idea how valuable this patch is? > > > Patch: > > >> + /* Handle traditional escaped style in a single pass */ >> + input_len = strlen(inputText); >> + result = palloc(input_len + VARHDRSZ); /* Allocate max possible size */ >> rp = VARDATA(result); >> + tp = inputText; >> + >> while (*tp != '\0') > > > Isn't this `strlen` O(n) + `while` O(n)? Where is the speed up? > > > > [0] https://github.com/bminor/glibc/blob/master/string/strlen.c#L43-L45 > Hi, Kirill, Your deep insight suprised me! Yes, you are correct that strlen() actually performed a loop operation. So maybe the performance difference is not so obvious. However, there are some other reasons that drive me to make this change. 1. The author of original code left comment: "BUGS: The input is scanned twice." . You can find this line of code in my patch. This indicates a left work to be done. 2. If I were the author of this function, I would not be satisfied with myself that I used two loops to do something which actually can be done with one loop. I prefer to choose a way that would not add more burden to readers. 3. The while (*tp != '\0') loop has some unnecessary codes and I made some change. Thanks, Steven
在 2025/3/26 16:37, Kirill Reshke 写道:
> On Wed, 26 Mar 2025 at 12:17, Steven Niu <niushiji@gmail.com> wrote:
>>
>> Hi,
>
> Hi!
>
>> This double scanning can be inefficient, especially for large inputs.
>> So I optimized the function to eliminate the need for two scans,
>> while preserving correctness and efficiency.
>
> While the argument that processing input once not twice is fast is
> generally true, may we have some simple bench here just to have an
> idea how valuable this patch is?
>
>
> Patch:
>
>
>> + /* Handle traditional escaped style in a single pass */
>> + input_len = strlen(inputText);
>> + result = palloc(input_len + VARHDRSZ); /* Allocate max possible size */
>> rp = VARDATA(result);
>> + tp = inputText;
>> +
>> while (*tp != '\0')
>
>
> Isn't this `strlen` O(n) + `while` O(n)? Where is the speed up?
>
>
>
> [0] https://github.com/bminor/glibc/blob/master/string/strlen.c#L43-L45
>
Hi, Kirill,
Your deep insight suprised me!
Yes, you are correct that strlen() actually performed a loop operation.
So maybe the performance difference is not so obvious.
However, there are some other reasons that drive me to make this change.
1. The author of original code left comment: "BUGS: The input is scanned
twice." .
You can find this line of code in my patch. This indicates a left work
to be done.
2. If I were the author of this function, I would not be satisfied with
myself that I used two loops to do something which actually can be done
with one loop.
I prefer to choose a way that would not add more burden to readers.
3. The while (*tp != '\0') loop has some unnecessary codes and I made
some change.
Thanks,
Steven
This is a revised version (v2) of the patch that optimizes the `byteain()` function.
The original implementation handled escaped input by scanning the string twice — first to determine the output size and again to fill in the bytea. This patch eliminates the double scan by using a single-pass approach with `StringInfo`, simplifying the logic and improving maintainability.
Changes since v1 (originally by Steven Niu):
- Use `StringInfo` instead of manual memory allocation.
- Remove redundant code and improve readability.
- Add regression tests for both hex and escaped formats.
This version addresses performance and clarity while ensuring compatibility with existing behavior. The patch also reflects discussion on the original version, including feedback from Kirill Reshke.
Looking forward to your review and comments.
Best regards,
Stepan Neretin
Attachment
On Fri, May 9, 2025 at 5:24 PM Stepan Neretin <slpmcf@gmail.com> wrote:On Wed, Mar 26, 2025 at 9:39 PM Steven Niu <niushiji@gmail.com> wrote:
在 2025/3/26 16:37, Kirill Reshke 写道:
> On Wed, 26 Mar 2025 at 12:17, Steven Niu <niushiji@gmail.com> wrote:
>>
>> Hi,
>
> Hi!
>
>> This double scanning can be inefficient, especially for large inputs.
>> So I optimized the function to eliminate the need for two scans,
>> while preserving correctness and efficiency.
>
> While the argument that processing input once not twice is fast is
> generally true, may we have some simple bench here just to have an
> idea how valuable this patch is?
>
>
> Patch:
>
>
>> + /* Handle traditional escaped style in a single pass */
>> + input_len = strlen(inputText);
>> + result = palloc(input_len + VARHDRSZ); /* Allocate max possible size */
>> rp = VARDATA(result);
>> + tp = inputText;
>> +
>> while (*tp != '\0')
>
>
> Isn't this `strlen` O(n) + `while` O(n)? Where is the speed up?
>
>
>
> [0] https://github.com/bminor/glibc/blob/master/string/strlen.c#L43-L45
>
Hi, Kirill,
Your deep insight suprised me!
Yes, you are correct that strlen() actually performed a loop operation.
So maybe the performance difference is not so obvious.
However, there are some other reasons that drive me to make this change.
1. The author of original code left comment: "BUGS: The input is scanned
twice." .
You can find this line of code in my patch. This indicates a left work
to be done.
2. If I were the author of this function, I would not be satisfied with
myself that I used two loops to do something which actually can be done
with one loop.
I prefer to choose a way that would not add more burden to readers.
3. The while (*tp != '\0') loop has some unnecessary codes and I made
some change.
Thanks,
StevenHi hackers,
This is a revised version (v2) of the patch that optimizes the `byteain()` function.
The original implementation handled escaped input by scanning the string twice — first to determine the output size and again to fill in the bytea. This patch eliminates the double scan by using a single-pass approach with `StringInfo`, simplifying the logic and improving maintainability.
Changes since v1 (originally by Steven Niu):
- Use `StringInfo` instead of manual memory allocation.
- Remove redundant code and improve readability.
- Add regression tests for both hex and escaped formats.
This version addresses performance and clarity while ensuring compatibility with existing behavior. The patch also reflects discussion on the original version, including feedback from Kirill Reshke.
Looking forward to your review and comments.
Best regards,
Stepan NeretinHi,
I noticed that the previous version of the patch was authored with an incorrect email address due to a misconfigured
git config
.I've corrected the author information in this v2 and made sure it's consistent with my usual contributor identity. No other changes were introduced apart from that and the updates discussed earlier.
Sorry for the confusion, and thanks for your understanding.
Best regards,
Stepan Neretin
Hi,
Sorry for the noise — I'm resending the patch because I noticed a compiler warning related to mixed declarations and code, which I’ve now fixed.
Apologies for the oversight in the previous submission.
Regards,
Stepan Neretin
Attachment
Hi Stepan, > Sorry for the noise — I'm resending the patch because I noticed a compiler warning related to mixed declarations and code,which I’ve now fixed. > > Apologies for the oversight in the previous submission. Thanks for the patch. As Kirill pointed out above, it would be nice if you could prove that your implementation is actually faster. I think something like a simple micro-benchmark will do. -- Best regards, Aleksander Alekseev
Hi Stepan,
> Sorry for the noise — I'm resending the patch because I noticed a compiler warning related to mixed declarations and code, which I’ve now fixed.
>
> Apologies for the oversight in the previous submission.
Thanks for the patch.
As Kirill pointed out above, it would be nice if you could prove that
your implementation is actually faster. I think something like a
simple micro-benchmark will do.
--
Best regards,
Aleksander Alekseev
Hi,
Thanks for the feedback.
I’ve done a simple micro-benchmark using PL/pgSQL with a large escaped input string (\\123
repeated 100,000 times), converted to bytea
in a loop:
DO $$
DECLARE start_time TIMESTAMP; end_time TIMESTAMP; i INTEGER; dummy BYTEA; input TEXT := repeat(E'\\123', 100000); elapsed_ms DOUBLE PRECISION;
BEGIN start_time := clock_timestamp();
FOR i IN 1..1000 LOOP dummy := input::bytea; END LOOP;
end_time := clock_timestamp(); elapsed_ms := EXTRACT(EPOCH FROM end_time - start_time) * 1000; RAISE NOTICE 'Average time per call: % ms', elapsed_ms / 1000;
END
$$;
Here are the results from NOTICE
output:
Without patch:
NOTICE: Average time per call: 0.49176600000000004 ms
NOTICE: Average time per call: 0.47658999999999996 ms
With patch:
NOTICE: Average time per call: 0.468231 ms
NOTICE: Average time per call: 0.463909 ms
The gain is small but consistent. Let me know if a more rigorous benchmark would be useful.
Best regards,
Stepan Neretin