Thread: [PATCH] clarify palloc comment on quote_literal_cstr
Hello hackers,
I found the numbers in `quote_literal_cstr` palloc quite magical. So I've added a comment clarifying what they mean. The change is small:
/* We make a worst-case result area; wasting a little space is OK */
- result = palloc(len * 2 + 3 + 1);
+ result = palloc(
+ (len * 2) /* worst-case doubling for every character if each one is a quote */
+ + 3 /* two outer quotes + possibly 'E' if needed */
+ + 1 /* null terminator */
+ );
- result = palloc(len * 2 + 3 + 1);
+ result = palloc(
+ (len * 2) /* worst-case doubling for every character if each one is a quote */
+ + 3 /* two outer quotes + possibly 'E' if needed */
+ + 1 /* null terminator */
+ );
Best regards,
Steve
Attachment
On Sun, Apr 06, 2025 at 12:37:24PM -0500, Steve Chavez wrote: > I found the numbers in `quote_literal_cstr` palloc quite magical. So I've > added a comment clarifying what they mean. The change is small: > > /* We make a worst-case result area; wasting a little space is OK */ > - result = palloc(len * 2 + 3 + 1); > + result = palloc( > + (len * 2) /* worst-case doubling for every character if > each one is a quote */ > + + 3 /* two outer quotes + possibly 'E' if needed */ > + + 1 /* null terminator */ > + ); Agreed that this is a good idea to have a better documentation here if someone decides to tweak this area in the future, rather than have them guess the numbers. Looking at what quote_literal_internal() does, it seems to me that you are right based on ESCAPE_STRING_SYNTAX, SQL_STR_DOUBLE(), and the extra backslashes added to the destination buffer. -- Michael
Attachment
On Mon, Apr 7, 2025 at 3:19 AM Michael Paquier <michael@paquier.xyz> wrote: > > On Sun, Apr 06, 2025 at 12:37:24PM -0500, Steve Chavez wrote: > > I found the numbers in `quote_literal_cstr` palloc quite magical. So I've > > added a comment clarifying what they mean. The change is small: > > > > /* We make a worst-case result area; wasting a little space is OK */ > > - result = palloc(len * 2 + 3 + 1); > > + result = palloc( > > + (len * 2) /* worst-case doubling for every character if > > each one is a quote */ > > + + 3 /* two outer quotes + possibly 'E' if needed */ > > + + 1 /* null terminator */ > > + ); > > Agreed that this is a good idea to have a better documentation here if > someone decides to tweak this area in the future, rather than have > them guess the numbers. > > Looking at what quote_literal_internal() does, it seems to me that you > are right based on ESCAPE_STRING_SYNTAX, SQL_STR_DOUBLE(), and the > extra backslashes added to the destination buffer. This is an improvement in readability. However, I have not seen this style of commenting arguments of a function. If there's a precedence, please let me know. Usually we explain it in a comment before the call. For example, something like /* Allocate memory for worst case result factoring in a. double the length number of bytes, in case every character is a quote, b. two bytes for two outer quotes c. extra byte for E' d. one byte for null terminator. */ -- Best Wishes, Ashutosh Bapat
I haven't found a similar style of comment on any other function call.
I've attached a new patch using the style you suggest.
That being said, I do find the first form much more readable, but I understand this is a debatable subject.
Best regards,
Steve Chavez
On Mon, 7 Apr 2025 at 06:33, Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
On Mon, Apr 7, 2025 at 3:19 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Sun, Apr 06, 2025 at 12:37:24PM -0500, Steve Chavez wrote:
> > I found the numbers in `quote_literal_cstr` palloc quite magical. So I've
> > added a comment clarifying what they mean. The change is small:
> >
> > /* We make a worst-case result area; wasting a little space is OK */
> > - result = palloc(len * 2 + 3 + 1);
> > + result = palloc(
> > + (len * 2) /* worst-case doubling for every character if
> > each one is a quote */
> > + + 3 /* two outer quotes + possibly 'E' if needed */
> > + + 1 /* null terminator */
> > + );
>
> Agreed that this is a good idea to have a better documentation here if
> someone decides to tweak this area in the future, rather than have
> them guess the numbers.
>
> Looking at what quote_literal_internal() does, it seems to me that you
> are right based on ESCAPE_STRING_SYNTAX, SQL_STR_DOUBLE(), and the
> extra backslashes added to the destination buffer.
This is an improvement in readability. However, I have not seen this
style of commenting arguments of a function. If there's a precedence,
please let me know. Usually we explain it in a comment before the
call. For example, something like
/* Allocate memory for worst case result factoring in a. double the
length number of bytes, in case every character is a quote, b. two
bytes for two outer quotes c. extra byte for E' d. one byte for null
terminator. */
--
Best Wishes,
Ashutosh Bapat
Attachment
On Tue, Apr 8, 2025 at 12:08 AM Steve Chavez <steve@supabase.io> wrote: > > I haven't found a similar style of comment on any other function call. > > I've attached a new patch using the style you suggest. > > That being said, I do find the first form much more readable, but I understand this is a debatable subject. Thanks for addressing the comment. In PG code, we start a multiline comment with just /* on the first line and end with just */ on the last line. All the lines in-between start with * . Please check other comments in the file. I would write a., b. c. d. on separate lines. -- Best Wishes, Ashutosh Bapat
On Tue, Apr 08, 2025 at 08:47:53AM +0530, Ashutosh Bapat wrote: > Thanks for addressing the comment. > > In PG code, we start a multiline comment with just /* on the first > line and end with just */ on the last line. All the lines in-between > start with * . Please check other comments in the file. > > I would write a., b. c. d. on separate lines. Hmm. I don't think that grouping all these details in the same comment block is an improvement in this case compared to the first version, where it is clear which part of the calculation applies to what. In short, I'm OK with how things are on HEAD. -- Michael
Attachment
Michael Paquier <michael@paquier.xyz> writes: > Hmm. I don't think that grouping all these details in the same > comment block is an improvement in this case compared to the first > version, where it is clear which part of the calculation applies to > what. In short, I'm OK with how things are on HEAD. +1. When I saw the patch I was mainly afraid that pgindent would make a hash of it. But it didn't, so I'm cool with it as-is. regards, tom lane
On Tue, Apr 08, 2025 at 01:26:59AM -0400, Tom Lane wrote: > +1. When I saw the patch I was mainly afraid that pgindent would > make a hash of it. Yes. I was actually surprised to see that it did not mess up with the code generated. -- Michael
Attachment
On Tue, Apr 8, 2025 at 10:49 AM Michael Paquier <michael@paquier.xyz> wrote: > > On Tue, Apr 08, 2025 at 08:47:53AM +0530, Ashutosh Bapat wrote: > > Thanks for addressing the comment. > > > > In PG code, we start a multiline comment with just /* on the first > > line and end with just */ on the last line. All the lines in-between > > start with * . Please check other comments in the file. > > > > I would write a., b. c. d. on separate lines. > > Hmm. I don't think that grouping all these details in the same > comment block is an improvement in this case compared to the first > version, where it is clear which part of the calculation applies to > what. In short, I'm OK with how things are on HEAD. Ah! I didn't realize it was committed already. -- Best Wishes, Ashutosh Bapat