Thread: [PATCH] pg_convert improvement

[PATCH] pg_convert improvement

From
Yurii Rashkovskii
Date:
Hi,

I propose a patch that ensures `pg_convert` doesn't allocate and copy data when no conversion is done. It is an unnecessary overhead, especially when such conversions are done frequently and for large values.

I've tried measuring the performance impact, and the patched version has a small but non-zero gain.

The patch builds against `master` and `make check` succeeds.

Happy to hear any feedback!

--
Y.

Attachment

Re: [PATCH] pg_convert improvement

From
"Drouvot, Bertrand"
Date:
Hi,

On 11/24/23 3:05 PM, Yurii Rashkovskii wrote:
> Hi,
> 
> I propose a patch that ensures `pg_convert` doesn't allocate and copy data when no conversion is done. It is an
unnecessary overhead,especially when such conversions are done frequently and for large values.
 
> 

+1 for the patch, I think the less is done the better.

> 
> Happy to hear any feedback!
> 

The patch is pretty straightforward, I just have one remark:

+       /* if no actual conversion happened, return the original string */
+       /* (we are checking pointers to strings instead of encodings because
+          `pg_do_encoding_conversion` above covers more cases than just
+          encoding equality) */

I think this could be done in one single comment and follow the preferred style
for multi-line comment, see [1].

[1]: https://www.postgresql.org/docs/current/source-format.html

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: [PATCH] pg_convert improvement

From
Yurii Rashkovskii
Date:
Hi Bertrand,

On Fri, Nov 24, 2023 at 6:26 AM Drouvot, Bertrand <bertranddrouvot.pg@gmail.com> wrote:

The patch is pretty straightforward, I just have one remark:

+       /* if no actual conversion happened, return the original string */
+       /* (we are checking pointers to strings instead of encodings because
+          `pg_do_encoding_conversion` above covers more cases than just
+          encoding equality) */

I think this could be done in one single comment and follow the preferred style
for multi-line comment, see [1].

Thank you for your feedback. I've attached a revised patch. 

--
Y.

Attachment

Re: [PATCH] pg_convert improvement

From
"Drouvot, Bertrand"
Date:
Hi,

On 11/24/23 3:32 PM, Yurii Rashkovskii wrote:
> Hi Bertrand,
> 
> On Fri, Nov 24, 2023 at 6:26 AM Drouvot, Bertrand <bertranddrouvot.pg@gmail.com
<mailto:bertranddrouvot.pg@gmail.com>>wrote:
 
> 
> 
>     The patch is pretty straightforward, I just have one remark:
> 
>     +       /* if no actual conversion happened, return the original string */
>     +       /* (we are checking pointers to strings instead of encodings because
>     +          `pg_do_encoding_conversion` above covers more cases than just
>     +          encoding equality) */
> 
>     I think this could be done in one single comment and follow the preferred style
>     for multi-line comment, see [1].
> 
> 
> Thank you for your feedback. I've attached a revised patch.

Did some minor changes in the attached:

- Started the multi-line comment with an upper case and finished
it with a "." and re-worded a bit.
- Ran pgindent

What do you think about the attached?

Also, might be good to create a CF entry [1] so that the patch proposal does not get lost
and gets visibility.

[1]: https://commitfest.postgresql.org/46/

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment

Re: [PATCH] pg_convert improvement

From
Yurii Rashkovskii
Date:
 Hi Bertrand,


Did some minor changes in the attached:

- Started the multi-line comment with an upper case and finished
it with a "." and re-worded a bit.
- Ran pgindent

What do you think about the attached?

It looks great!
 

Also, might be good to create a CF entry [1] so that the patch proposal does not get lost
and gets visibility.

Just submitted it to SF. Thank you for the review! 

--
Y.

Re: [PATCH] pg_convert improvement

From
"Drouvot, Bertrand"
Date:
Hi,

On 11/28/23 2:16 AM, Yurii Rashkovskii wrote:
>   Hi Bertrand,
> 
> 
>     Did some minor changes in the attached:
> 
>     - Started the multi-line comment with an upper case and finished
>     it with a "." and re-worded a bit.
>     - Ran pgindent
> 
>     What do you think about the attached?
> 
> 
> It looks great!
> 
> 
>     Also, might be good to create a CF entry [1] so that the patch proposal does not get lost
>     and gets visibility.
> 
> 
> Just submitted it to SF. Thank you for the review!
> 

Thanks! Just marked it as "Ready for Committer".

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: [PATCH] pg_convert improvement

From
Nathan Bossart
Date:
On Mon, Nov 27, 2023 at 08:11:06AM +0100, Drouvot, Bertrand wrote:
> +        PG_RETURN_BYTEA_P(string);

I looked around to see whether there was some sort of project policy about
returning arguments without copying them, but the only strict rule I see is
to avoid scribbling on argument data without first copying it.  However, I
do see functions that return unmodified arguments both with and without
copying.  For example, unaccent_dict() is careful to copy the argument
before returning it:

    PG_RETURN_TEXT_P(PG_GETARG_TEXT_P_COPY(strArg));

But replace_text() is not:

    /* Return unmodified source string if empty source or pattern */
    if (src_text_len < 1 || from_sub_text_len < 1)
    {
        PG_RETURN_TEXT_P(src_text);
    }

I don't have any specific concerns about doing this, though.  Otherwise,
the patch looks pretty good to me, so I will plan on committing it shortly.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: [PATCH] pg_convert improvement

From
Nathan Bossart
Date:
Committed.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com