Thread: Linux likely() unlikely() for PostgreSQL

Linux likely() unlikely() for PostgreSQL

From
wenhui qiu
Date:
Hi Hackers
   When I saw this document:https://en.wikipedia.org/wiki/Branch_predictor, I thought of Linux likely() vs unlikely() and thus thought that there are some code segments in src/backend/executor/execMain.c that can be optimized.
For example :
if (ExecutorStart_hook)
(*ExecutorStart_hook) (queryDesc, eflags);
else
standard_ExecutorStart(queryDesc, eflags);
}
void
ExecutorRun(QueryDesc *queryDesc,
ScanDirection direction, uint64 count,
bool execute_once)
{
if (ExecutorRun_hook)
(*ExecutorRun_hook) (queryDesc, direction, count, execute_once);
else
standard_ExecutorRun(queryDesc, direction, count, execute_once);
}
###
if (unlikely(ExecutorRun_hook)),

I hope to get guidance from community experts,Many thanks

Thanks

Re: Linux likely() unlikely() for PostgreSQL

From
Matthias van de Meent
Date:
On Sun, 30 Jun 2024, 15:56 wenhui qiu, <qiuwenhuifx@gmail.com> wrote:
>
> Hi Hackers
>    When I saw this document:https://en.wikipedia.org/wiki/Branch_predictor, I thought of Linux likely() vs unlikely()
andthus thought that there are some code segments in src/backend/executor/execMain.c that can be optimized. 
> For example :
> if (ExecutorStart_hook)
> (*ExecutorStart_hook) (queryDesc, eflags);
> else
> standard_ExecutorStart(queryDesc, eflags);
> }
> void
> ExecutorRun(QueryDesc *queryDesc,
> ScanDirection direction, uint64 count,
> bool execute_once)
> {
> if (ExecutorRun_hook)
> (*ExecutorRun_hook) (queryDesc, direction, count, execute_once);
> else
> standard_ExecutorRun(queryDesc, direction, count, execute_once);
> }
> ###
> if (unlikely(ExecutorRun_hook)),
>
> I hope to get guidance from community experts,Many thanks

While hooks are generally not installed by default, I would advise
against marking the hooks as unlikely, as that would unfairly penalize
the performance of extensions that do utilise this hook (or hooks in
general when applied to all hooks).

If this code is hot enough, then CPU branch prediction will likely
correctly predict this branch correctly after a small amount of time
(as hook null-ness is generally approximately constant for the
duration of a backend lifetime); while if this code is cold, the
benefit is not worth the additional binary size overhead of the
out-of-lined code section.


Kind regards,

Matthias van de Meent
Neon (https://neon.tech/)



Re: Linux likely() unlikely() for PostgreSQL

From
Tom Lane
Date:
Matthias van de Meent <boekewurm+postgres@gmail.com> writes:
> On Sun, 30 Jun 2024, 15:56 wenhui qiu, <qiuwenhuifx@gmail.com> wrote:
>> if (unlikely(ExecutorRun_hook)),

> While hooks are generally not installed by default, I would advise
> against marking the hooks as unlikely, as that would unfairly penalize
> the performance of extensions that do utilise this hook (or hooks in
> general when applied to all hooks).

In general, we have a policy of using likely/unlikely very sparingly,
and only in demonstrably hot code paths.  This hook call certainly
doesn't qualify as hot.

Having said that ... something I've been wondering about is how to
teach the compiler that error paths are unlikely.  Doing this
across-the-board wouldn't be "sparingly", but I think surely it'd
result in better code quality on average.  This'd be easy enough
to do in Assert:

 #define Assert(condition) \
     do { \
-        if (!(condition)) \
+        if (unlikely(!(condition))) \
             ExceptionalCondition(#condition, __FILE__, __LINE__); \
     } while (0)

but on the other hand we don't care that much about micro-optimization
of assert-enabled builds, so maybe that's not worth the trouble.  The
real win would be if constructs like

    if (trouble)
        ereport(ERROR, ...);

could be interpreted as

    if (unlikely(trouble))
        ereport(ERROR, ...);

But I surely don't want to make thousands of such changes manually.
And it's possible that smart compilers already realize this, using
a heuristic that any path that ends in pg_unreachable() must be
unlikely.  Is there a way to encourage compilers to believe that?

            regards, tom lane



Re: Linux likely() unlikely() for PostgreSQL

From
wenhui qiu
Date:
Hi Tom ,Matthias
    Thank you  for your explanation.Maybe future compilers will be able to do intelligent prediction?


Thanks

Tom Lane <tgl@sss.pgh.pa.us> 于2024年6月30日周日 23:23写道:
Matthias van de Meent <boekewurm+postgres@gmail.com> writes:
> On Sun, 30 Jun 2024, 15:56 wenhui qiu, <qiuwenhuifx@gmail.com> wrote:
>> if (unlikely(ExecutorRun_hook)),

> While hooks are generally not installed by default, I would advise
> against marking the hooks as unlikely, as that would unfairly penalize
> the performance of extensions that do utilise this hook (or hooks in
> general when applied to all hooks).

In general, we have a policy of using likely/unlikely very sparingly,
and only in demonstrably hot code paths.  This hook call certainly
doesn't qualify as hot.

Having said that ... something I've been wondering about is how to
teach the compiler that error paths are unlikely.  Doing this
across-the-board wouldn't be "sparingly", but I think surely it'd
result in better code quality on average.  This'd be easy enough
to do in Assert:

 #define Assert(condition) \
        do { \
-               if (!(condition)) \
+               if (unlikely(!(condition))) \
                        ExceptionalCondition(#condition, __FILE__, __LINE__); \
        } while (0)

but on the other hand we don't care that much about micro-optimization
of assert-enabled builds, so maybe that's not worth the trouble.  The
real win would be if constructs like

        if (trouble)
                ereport(ERROR, ...);

could be interpreted as

        if (unlikely(trouble))
                ereport(ERROR, ...);

But I surely don't want to make thousands of such changes manually.
And it's possible that smart compilers already realize this, using
a heuristic that any path that ends in pg_unreachable() must be
unlikely.  Is there a way to encourage compilers to believe that?

                        regards, tom lane

Re: Linux likely() unlikely() for PostgreSQL

From
Peter Geoghegan
Date:
On Sun, Jun 30, 2024 at 11:23 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The real win would be if constructs like
>
>         if (trouble)
>                 ereport(ERROR, ...);
>
> could be interpreted as
>
>         if (unlikely(trouble))
>                 ereport(ERROR, ...);
>
> But I surely don't want to make thousands of such changes manually.
> And it's possible that smart compilers already realize this, using
> a heuristic that any path that ends in pg_unreachable() must be
> unlikely.  Is there a way to encourage compilers to believe that?

Isn't that what commit 913ec71d68 did, by adding a call to
pg_attribute_cold to ereport?

--
Peter Geoghegan



Re: Linux likely() unlikely() for PostgreSQL

From
David Rowley
Date:
On Mon, 1 Jul 2024 at 14:08, Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Sun, Jun 30, 2024 at 11:23 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >         if (trouble)
> >                 ereport(ERROR, ...);
> >
> > could be interpreted as
> >
> >         if (unlikely(trouble))
> >                 ereport(ERROR, ...);
> >
> > But I surely don't want to make thousands of such changes manually.
> > And it's possible that smart compilers already realize this, using
> > a heuristic that any path that ends in pg_unreachable() must be
> > unlikely.  Is there a way to encourage compilers to believe that?
>
> Isn't that what commit 913ec71d68 did, by adding a call to
> pg_attribute_cold to report?

Yes.

David



Re: Linux likely() unlikely() for PostgreSQL

From
Ranier Vilela
Date:
Em dom., 30 de jun. de 2024 às 12:24, Tom Lane <tgl@sss.pgh.pa.us> escreveu:
Matthias van de Meent <boekewurm+postgres@gmail.com> writes:
> On Sun, 30 Jun 2024, 15:56 wenhui qiu, <qiuwenhuifx@gmail.com> wrote:
>> if (unlikely(ExecutorRun_hook)),

> While hooks are generally not installed by default, I would advise
> against marking the hooks as unlikely, as that would unfairly penalize
> the performance of extensions that do utilise this hook (or hooks in
> general when applied to all hooks).

In general, we have a policy of using likely/unlikely very sparingly,
and only in demonstrably hot code paths.  This hook call certainly
doesn't qualify as hot.

Having said that ... something I've been wondering about is how to
teach the compiler that error paths are unlikely.  Doing this
across-the-board wouldn't be "sparingly", but I think surely it'd
result in better code quality on average.  This'd be easy enough
to do in Assert:

 #define Assert(condition) \
        do { \
-               if (!(condition)) \
+               if (unlikely(!(condition))) \
                        ExceptionalCondition(#condition, __FILE__, __LINE__); \
        } while (0)

but on the other hand we don't care that much about micro-optimization
of assert-enabled builds, so maybe that's not worth the trouble.  The
real win would be if constructs like

        if (trouble)
                ereport(ERROR, ...);

could be interpreted as

        if (unlikely(trouble))
                ereport(ERROR, ...);
Hi Tom.

Let's do it?
But we will need a 1 hour window to apply the patch.
I can generate it in 30 minutes.
The current size is 1.6MB.

All expressions for ERROR, FATAL and PANIC.

if (unlikely(expr))
   ereport(ERROR, ...)

Any other expression was left out, such as:

if (expr)
{
 ereport(ERROR, ...)
}

This attached version needs manual adjustment.
for cases of:
if (expr) /* comments */

If I apply it, I can adjust it.

best regards,
Ranier Vilela
Attachment

Re: Linux likely() unlikely() for PostgreSQL

From
Andres Freund
Date:
Hi,

On 2024-06-30 16:47:02 +0200, Matthias van de Meent wrote:
> While hooks are generally not installed by default, I would advise
> against marking the hooks as unlikely, as that would unfairly penalize
> the performance of extensions that do utilise this hook (or hooks in
> general when applied to all hooks).

I don't think this could be fairly described as penalizing use of the hooks.
likely()/unlikely() isn't the same as __attribute__((cold)).

See https://godbolt.org/z/qdnz65ez8 - only the version using
__attribute__((cold)) gets some code put into a separate section.


What's the possible argument for saying that the generated code should be
optimized for the presence of hooks?


Note that I'm not saying that we ought to use likely() here (nor the
opposite), just that this argument for not doing so doesn't seem to hold
water.


> If this code is hot enough, then CPU branch prediction will likely
> correctly predict this branch correctly after a small amount of time
> (as hook null-ness is generally approximately constant for the
> duration of a backend lifetime);

IMO that's not quite right. The branch predictor has a limited
capacity. Because postgres executes a lot of code it frequently executes
"outer" code without benefit of the branch predictor, just due to the amount
of code that was executed - even if the outer code matters for performance. In
which case the static branch prediction can matter (i.e. backward branches are
assumed to be taken, forward branches not).


> while if this code is cold, the benefit is not worth the additional binary
> size overhead of the out-of-lined code section.

There wouldn't be any such overhead, afaict.

Greetings,

Andres Freund



Re: Linux likely() unlikely() for PostgreSQL

From
John Naylor
Date:
On Sun, Jun 30, 2024 at 10:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> In general, we have a policy of using likely/unlikely very sparingly,
> and only in demonstrably hot code paths.  This hook call certainly
> doesn't qualify as hot.

I just remembered an article I read a while back by a compiler
engineer who said that compilers have heuristics that treat NULL
pointers as "unlikely" since the most common reason to test that is so
we can actually dereference it and do something with it, and in that
case a NULL pointer is an exceptional condition. He also said it
wasn't documented so you can only see this by looking at the source
code. Instead of taking the time to verify that, I created some toy
examples and it seems to be true:

https://godbolt.org/z/dv6M5ecY5

This works all the way back to clang 3.1 and (at least, because
Compiler Explorer doesn't go back any further) gcc 3.4.6, so it's an
ancient heuristic. If we can rely on this, we could remove the handful
of statements of the form "if (unlikely(foo_ptr == NULL))" and
document it in c.h. It may be iffy to rely on an undocumented
heuristic, but it also seems silly to have a use-sparingly policy if
some of our current uses have zero effect on the emitted binary (I
haven't checked yet).



Re: Linux likely() unlikely() for PostgreSQL

From
David Rowley
Date:
On Mon, 12 Aug 2024 at 20:34, John Naylor <johncnaylorls@gmail.com> wrote:
> I just remembered an article I read a while back by a compiler
> engineer who said that compilers have heuristics that treat NULL
> pointers as "unlikely" since the most common reason to test that is so
> we can actually dereference it and do something with it, and in that
> case a NULL pointer is an exceptional condition. He also said it
> wasn't documented so you can only see this by looking at the source
> code. Instead of taking the time to verify that, I created some toy
> examples and it seems to be true:
>
> https://godbolt.org/z/dv6M5ecY5

Interesting.

I agree with Andres' comment about the no increase in binary size
overhead.  The use of unlikely() should just be a swap in the "call"
operands so the standard_Executor* function goes in the happy path.
(These are all sibling calls, so in reality, most compilers should do
"jmp" instead of "call")

I also agree with Tom with his comment about the tests for these
executor hooks not being hot. Each of them are executed once per
query, so an unlikely() isn't going to make any difference to the
performance of a query.

Taking into account what your analysis uncovered, I think what might
be worth spending some time on would be looking for hot var == NULL
tests where the common path is true and seeing if using likely() on
any of those makes a meaningful impact on performance. Maybe something
like [1] could be used to inject a macro or function call in each if
(<var> == NULL test to temporarily install some telemetry and look for
__FILE__ / __LINE__ combinations that are almost always true. Grouping
those results by __FILE__, __LINE__ and sorting by count(*) desc might
yield something interesting.

David

[1] https://coccinelle.gitlabpages.inria.fr/website/