Thread: [PATCH] pg_convert improvement
Hi,
I propose a patch that ensures `pg_convert` doesn't allocate and copy data when no conversion is done. It is an unnecessary overhead, especially when such conversions are done frequently and for large values.
The patch builds against `master` and `make check` succeeds.
I propose a patch that ensures `pg_convert` doesn't allocate and copy data when no conversion is done. It is an unnecessary overhead, especially when such conversions are done frequently and for large values.
I've tried measuring the performance impact, and the patched version has a small but non-zero gain.
The patch builds against `master` and `make check` succeeds.
Happy to hear any feedback!
--
Y.
Attachment
Hi, On 11/24/23 3:05 PM, Yurii Rashkovskii wrote: > Hi, > > I propose a patch that ensures `pg_convert` doesn't allocate and copy data when no conversion is done. It is an unnecessary overhead,especially when such conversions are done frequently and for large values. > +1 for the patch, I think the less is done the better. > > Happy to hear any feedback! > The patch is pretty straightforward, I just have one remark: + /* if no actual conversion happened, return the original string */ + /* (we are checking pointers to strings instead of encodings because + `pg_do_encoding_conversion` above covers more cases than just + encoding equality) */ I think this could be done in one single comment and follow the preferred style for multi-line comment, see [1]. [1]: https://www.postgresql.org/docs/current/source-format.html Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Hi Bertrand,
On Fri, Nov 24, 2023 at 6:26 AM Drouvot, Bertrand <bertranddrouvot.pg@gmail.com> wrote:
The patch is pretty straightforward, I just have one remark:
+ /* if no actual conversion happened, return the original string */
+ /* (we are checking pointers to strings instead of encodings because
+ `pg_do_encoding_conversion` above covers more cases than just
+ encoding equality) */
I think this could be done in one single comment and follow the preferred style
for multi-line comment, see [1].
Thank you for your feedback. I've attached a revised patch.
Y.
Attachment
Hi, On 11/24/23 3:32 PM, Yurii Rashkovskii wrote: > Hi Bertrand, > > On Fri, Nov 24, 2023 at 6:26 AM Drouvot, Bertrand <bertranddrouvot.pg@gmail.com <mailto:bertranddrouvot.pg@gmail.com>>wrote: > > > The patch is pretty straightforward, I just have one remark: > > + /* if no actual conversion happened, return the original string */ > + /* (we are checking pointers to strings instead of encodings because > + `pg_do_encoding_conversion` above covers more cases than just > + encoding equality) */ > > I think this could be done in one single comment and follow the preferred style > for multi-line comment, see [1]. > > > Thank you for your feedback. I've attached a revised patch. Did some minor changes in the attached: - Started the multi-line comment with an upper case and finished it with a "." and re-worded a bit. - Ran pgindent What do you think about the attached? Also, might be good to create a CF entry [1] so that the patch proposal does not get lost and gets visibility. [1]: https://commitfest.postgresql.org/46/ Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Hi Bertrand,
Did some minor changes in the attached:
- Started the multi-line comment with an upper case and finished
it with a "." and re-worded a bit.
- Ran pgindent
What do you think about the attached?
It looks great!
Also, might be good to create a CF entry [1] so that the patch proposal does not get lost
and gets visibility.
Just submitted it to SF. Thank you for the review!
Y.
Hi, On 11/28/23 2:16 AM, Yurii Rashkovskii wrote: > Hi Bertrand, > > > Did some minor changes in the attached: > > - Started the multi-line comment with an upper case and finished > it with a "." and re-worded a bit. > - Ran pgindent > > What do you think about the attached? > > > It looks great! > > > Also, might be good to create a CF entry [1] so that the patch proposal does not get lost > and gets visibility. > > > Just submitted it to SF. Thank you for the review! > Thanks! Just marked it as "Ready for Committer". Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Mon, Nov 27, 2023 at 08:11:06AM +0100, Drouvot, Bertrand wrote: > + PG_RETURN_BYTEA_P(string); I looked around to see whether there was some sort of project policy about returning arguments without copying them, but the only strict rule I see is to avoid scribbling on argument data without first copying it. However, I do see functions that return unmodified arguments both with and without copying. For example, unaccent_dict() is careful to copy the argument before returning it: PG_RETURN_TEXT_P(PG_GETARG_TEXT_P_COPY(strArg)); But replace_text() is not: /* Return unmodified source string if empty source or pattern */ if (src_text_len < 1 || from_sub_text_len < 1) { PG_RETURN_TEXT_P(src_text); } I don't have any specific concerns about doing this, though. Otherwise, the patch looks pretty good to me, so I will plan on committing it shortly. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Committed. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com