Thread: [PATCH] clarify palloc comment on quote_literal_cstr

[PATCH] clarify palloc comment on quote_literal_cstr

From
Steve Chavez
Date:
Hello hackers,

I found the numbers in `quote_literal_cstr` palloc quite magical. So I've added a comment clarifying what they mean. The change is small:

        /* We make a worst-case result area; wasting a little space is OK */
-       result = palloc(len * 2 + 3 + 1);
+       result = palloc(
+               (len * 2)    /* worst-case doubling for every character if each one is a quote */
+               + 3          /* two outer quotes + possibly 'E' if needed */
+               + 1          /* null terminator */
+       );

Best regards,
Steve
Attachment

Re: [PATCH] clarify palloc comment on quote_literal_cstr

From
Michael Paquier
Date:
On Sun, Apr 06, 2025 at 12:37:24PM -0500, Steve Chavez wrote:
> I found the numbers in `quote_literal_cstr` palloc quite magical. So I've
> added a comment clarifying what they mean. The change is small:
>
>         /* We make a worst-case result area; wasting a little space is OK */
> -       result = palloc(len * 2 + 3 + 1);
> +       result = palloc(
> +               (len * 2)    /* worst-case doubling for every character if
> each one is a quote */
> +               + 3          /* two outer quotes + possibly 'E' if needed */
> +               + 1          /* null terminator */
> +       );

Agreed that this is a good idea to have a better documentation here if
someone decides to tweak this area in the future, rather than have
them guess the numbers.

Looking at what quote_literal_internal() does, it seems to me that you
are right based on ESCAPE_STRING_SYNTAX, SQL_STR_DOUBLE(), and the
extra backslashes added to the destination buffer.
--
Michael

Attachment

Re: [PATCH] clarify palloc comment on quote_literal_cstr

From
Ashutosh Bapat
Date:
On Mon, Apr 7, 2025 at 3:19 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Sun, Apr 06, 2025 at 12:37:24PM -0500, Steve Chavez wrote:
> > I found the numbers in `quote_literal_cstr` palloc quite magical. So I've
> > added a comment clarifying what they mean. The change is small:
> >
> >         /* We make a worst-case result area; wasting a little space is OK */
> > -       result = palloc(len * 2 + 3 + 1);
> > +       result = palloc(
> > +               (len * 2)    /* worst-case doubling for every character if
> > each one is a quote */
> > +               + 3          /* two outer quotes + possibly 'E' if needed */
> > +               + 1          /* null terminator */
> > +       );
>
> Agreed that this is a good idea to have a better documentation here if
> someone decides to tweak this area in the future, rather than have
> them guess the numbers.
>
> Looking at what quote_literal_internal() does, it seems to me that you
> are right based on ESCAPE_STRING_SYNTAX, SQL_STR_DOUBLE(), and the
> extra backslashes added to the destination buffer.

This is an improvement in readability. However, I have not seen this
style of commenting arguments of a function. If there's a precedence,
please let me know. Usually we explain it in a comment before the
call. For example, something like
/* Allocate memory for worst case result factoring in a. double the
length number of bytes, in case every character is a quote, b. two
bytes for two outer quotes c. extra byte for E' d. one byte for null
terminator. */

--
Best Wishes,
Ashutosh Bapat



Re: [PATCH] clarify palloc comment on quote_literal_cstr

From
Steve Chavez
Date:
I haven't found a similar style of comment on any other function call.

I've attached a new patch using the style you suggest. 

That being said, I do find the first form much more readable, but I understand this is a debatable subject.

Best regards,
Steve Chavez

On Mon, 7 Apr 2025 at 06:33, Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
On Mon, Apr 7, 2025 at 3:19 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Sun, Apr 06, 2025 at 12:37:24PM -0500, Steve Chavez wrote:
> > I found the numbers in `quote_literal_cstr` palloc quite magical. So I've
> > added a comment clarifying what they mean. The change is small:
> >
> >         /* We make a worst-case result area; wasting a little space is OK */
> > -       result = palloc(len * 2 + 3 + 1);
> > +       result = palloc(
> > +               (len * 2)    /* worst-case doubling for every character if
> > each one is a quote */
> > +               + 3          /* two outer quotes + possibly 'E' if needed */
> > +               + 1          /* null terminator */
> > +       );
>
> Agreed that this is a good idea to have a better documentation here if
> someone decides to tweak this area in the future, rather than have
> them guess the numbers.
>
> Looking at what quote_literal_internal() does, it seems to me that you
> are right based on ESCAPE_STRING_SYNTAX, SQL_STR_DOUBLE(), and the
> extra backslashes added to the destination buffer.

This is an improvement in readability. However, I have not seen this
style of commenting arguments of a function. If there's a precedence,
please let me know. Usually we explain it in a comment before the
call. For example, something like
/* Allocate memory for worst case result factoring in a. double the
length number of bytes, in case every character is a quote, b. two
bytes for two outer quotes c. extra byte for E' d. one byte for null
terminator. */

--
Best Wishes,
Ashutosh Bapat
Attachment

Re: [PATCH] clarify palloc comment on quote_literal_cstr

From
Ashutosh Bapat
Date:
On Tue, Apr 8, 2025 at 12:08 AM Steve Chavez <steve@supabase.io> wrote:
>
> I haven't found a similar style of comment on any other function call.
>
> I've attached a new patch using the style you suggest.
>
> That being said, I do find the first form much more readable, but I understand this is a debatable subject.

Thanks for addressing the comment.

In PG code, we start a multiline comment with just /* on the first
line and end with just */ on the last line. All the lines in-between
start with * . Please check other comments in the file.

I would write a., b. c. d. on separate lines.

--
Best Wishes,
Ashutosh Bapat



Re: [PATCH] clarify palloc comment on quote_literal_cstr

From
Michael Paquier
Date:
On Tue, Apr 08, 2025 at 08:47:53AM +0530, Ashutosh Bapat wrote:
> Thanks for addressing the comment.
>
> In PG code, we start a multiline comment with just /* on the first
> line and end with just */ on the last line. All the lines in-between
> start with * . Please check other comments in the file.
>
> I would write a., b. c. d. on separate lines.

Hmm.  I don't think that grouping all these details in the same
comment block is an improvement in this case compared to the first
version, where it is clear which part of the calculation applies to
what.  In short, I'm OK with how things are on HEAD.
--
Michael

Attachment

Re: [PATCH] clarify palloc comment on quote_literal_cstr

From
Tom Lane
Date:
Michael Paquier <michael@paquier.xyz> writes:
> Hmm.  I don't think that grouping all these details in the same
> comment block is an improvement in this case compared to the first
> version, where it is clear which part of the calculation applies to
> what.  In short, I'm OK with how things are on HEAD.

+1.  When I saw the patch I was mainly afraid that pgindent would
make a hash of it.  But it didn't, so I'm cool with it as-is.

            regards, tom lane



Re: [PATCH] clarify palloc comment on quote_literal_cstr

From
Michael Paquier
Date:
On Tue, Apr 08, 2025 at 01:26:59AM -0400, Tom Lane wrote:
> +1.  When I saw the patch I was mainly afraid that pgindent would
> make a hash of it.

Yes.  I was actually surprised to see that it did not mess up with the
code generated.
--
Michael

Attachment

Re: [PATCH] clarify palloc comment on quote_literal_cstr

From
Ashutosh Bapat
Date:
On Tue, Apr 8, 2025 at 10:49 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Tue, Apr 08, 2025 at 08:47:53AM +0530, Ashutosh Bapat wrote:
> > Thanks for addressing the comment.
> >
> > In PG code, we start a multiline comment with just /* on the first
> > line and end with just */ on the last line. All the lines in-between
> > start with * . Please check other comments in the file.
> >
> > I would write a., b. c. d. on separate lines.
>
> Hmm.  I don't think that grouping all these details in the same
> comment block is an improvement in this case compared to the first
> version, where it is clear which part of the calculation applies to
> what.  In short, I'm OK with how things are on HEAD.

Ah! I didn't realize it was committed already.

--
Best Wishes,
Ashutosh Bapat