Re: Why is pq_begintypsend so slow? - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Why is pq_begintypsend so slow?
Msg-id 20200731170008.dp2q5467zlskh6ra@alap3.anarazel.de
In response to Re: Why is pq_begintypsend so slow?  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Why is pq_begintypsend so slow?
List pgsql-hackers
Hi,

On 2020-07-31 12:28:04 -0400, Robert Haas wrote:
> On Fri, Jul 31, 2020 at 11:50 AM Andres Freund <andres@anarazel.de> wrote:
> > I hope the above makes this make sense now? It's about subsequent uses of
> > the StringInfo, rather than the body of resetStringInfo itself.
>
> That does make sense, except that
> https://en.cppreference.com/w/c/language/restrict says "During each
> execution of a block in which a restricted pointer P is declared
> (typically each execution of a function body in which P is a function
> parameter), if some object that is accessible through P (directly or
> indirectly) is modified, by any means, then all accesses to that
> object (both reads and writes) in that block must occur through P
> (directly or indirectly), otherwise the behavior is undefined." So my
> interpretation of this was that it couldn't really affect what
> happened outside of the function itself, even if the compiler chose to
> perform inlining. But I think I see what you're saying: the *inference*
> is only valid with respect to restrict pointers in a particular
> function, but what can be optimized as a result of that inference may
> be something further afield, if inlining is performed.

Right. There are two aspects:

1) By looking at the function, with the restrict, the compiler can infer
   more about the behaviour of the function. E.g. knowing that ->len
   has a specific value, or that ->data[n] does. That information then
   can be used together with subsequent operations, e.g. avoiding a
   re-read of ->len. That could work in some cases even if subsequent
   operations were *not* marked up with restrict.

2) The restrict signals to the compiler that we guarantee (i.e. it would
   be undefined behaviour if not) that the pointers do not
   overlap. Which means it can assume that in some of the calling code
   as well, if it can analyze that ->data isn't changed, for example.
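
A minimal sketch of the two aspects above, under stated assumptions: DemoBuf and the demo_* functions are hypothetical, simplified stand-ins (not PostgreSQL's actual StringInfo code), the buffer is assumed to be large enough already, and plain C99 restrict is used where PostgreSQL would spell it pg_restrict. Whether a given compiler actually performs the optimizations described in the comments depends on the compiler and on inlining.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a StringInfo-like buffer. */
typedef struct DemoBuf
{
    char   *data;   /* storage; assumed large enough in this sketch */
    int     len;    /* bytes used, not counting the trailing '\0' */
} DemoBuf;

/*
 * Without restrict: the stores go through a char pointer, and a char
 * store may alias any object, including *buf itself. The compiler
 * therefore has to assume buf->data and buf->len might have changed
 * after each store and re-read them.
 */
static inline void
demo_append_plain(DemoBuf *buf, char ch)
{
    buf->data[buf->len] = ch;
    buf->len++;
    buf->data[buf->len] = '\0';
}

/*
 * With restrict: we promise that the bytes written through ep do not
 * overlap with *buf. Violating that promise is undefined behaviour.
 */
static inline void
demo_append_restrict(DemoBuf *restrict buf, char ch)
{
    char *restrict ep = buf->data + buf->len;

    ep[0] = ch;
    ep[1] = '\0';
    buf->len++;
}

/*
 * A caller that benefits once the callee is inlined: the compiler may
 * be able to keep buf->data and buf->len in registers across both
 * appends instead of re-reading them after every store.
 */
static inline void
demo_append_twice(DemoBuf *restrict buf, char a, char b)
{
    demo_append_restrict(buf, a);
    demo_append_restrict(buf, b);
}
```

Both variants produce the same buffer contents; the difference is only in what the compiler is allowed to assume, which is easiest to observe by comparing the generated assembly for demo_append_twice against an equivalent built on demo_append_plain.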


> Perhaps we could add a comment about this, e.g.
> Marking these pointers with pg_restrict tells the compiler that str
> and str->data can't overlap, which may allow the compiler to optimize
> better when this code is inlined. For example, it may be possible to
> keep str->data in a register across consecutive appendStringInfoString
> operations.
>
> Since pg_restrict is not widely used, I think it's worth adding this
> kind of annotation, lest other hackers get confused. I'm probably not
> the only one who isn't on top of this.

Would it make more sense to have a bit of an explanation at
pg_restrict's definition, instead of having it at (eventually) multiple
places?
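
As a rough illustration of what an explanation at the definition might look like, here is a sketch that assumes a hypothetical macro named demo_restrict. It is not the real pg_restrict definition, and the comment wording is only one possible phrasing based on the suggestion quoted above.

```c
/*
 * demo_restrict (hypothetical stand-in for pg_restrict)
 *
 * Marking a pointer with this qualifier promises the compiler that, for
 * the lifetime of the pointer, any object modified through it is only
 * accessed through it. Breaking the promise is undefined behaviour.
 *
 * The qualifier only pays off when at least two accesses are known not
 * to overlap, e.g. a struct and the buffer it points to. When such a
 * function is inlined, the compiler may then keep fields like the
 * buffer pointer in a register across consecutive operations.
 */
#if defined(_MSC_VER)
#define demo_restrict __restrict
#else
#define demo_restrict restrict
#endif

/* Tiny usage example: dst and src are promised not to overlap. */
static inline void
demo_copy_byte(char *demo_restrict dst, const char *demo_restrict src)
{
    *dst = *src;
}
```

Keeping the explanation in one place means individual call sites only need a short pointer back to the definition.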


> > > In appendStringInfoChar, why do we need to cast to restrict twice? Can
> > > we not just do something like this:
> > >
> > > char *pg_restrict ep = str->data+str->len;
> > > ep[0] = ch;
> > > ep[1] = '\0';
> >
> > I don't think that'd tell the compiler that this couldn't overlap with
> > str itself? A single 'restrict' can never (?) help, you need *two*
> > things that are marked as not overlapping in any way.
>
> But what's the difference between:
>
> +       *(char *pg_restrict) (str->data + str->len) = ch;
> +       str->len++;
> +       *(char *pg_restrict) (str->data + str->len) = '\0';
>
> And:
>
> char *pg_restrict ep = str->data+str->len;
> ep[0] = ch;
> ep[1] = '\0';
> ++str->len;
>
> Whether or not str itself is marked restricted is another question;
> what I'm talking about is why we need to repeat (char *pg_restrict)
> (str->data + str->len).

Ah, I misunderstood. Yea, there's no reason not to do that.

Greetings,

Andres Freund


