Linux likely() unlikely() for PostgreSQL

Started by wenhui qiualmost 2 years ago10 messageshackers
Jump to latest
#1wenhui qiu
qiuwenhuifx@gmail.com

Hi Hackers
When I saw this document:https://en.wikipedia.org/wiki/Branch_predictor,
I thought of Linux likely() vs unlikely() and thus thought that there are
some code segments in src/backend/executor/execMain.c that can be optimized.
For example :
if (ExecutorStart_hook)
(*ExecutorStart_hook) (queryDesc, eflags);
else
standard_ExecutorStart(queryDesc, eflags);
}
void
ExecutorRun(QueryDesc *queryDesc,
ScanDirection direction, uint64 count,
bool execute_once)
{
if (ExecutorRun_hook)
(*ExecutorRun_hook) (queryDesc, direction, count, execute_once);
else
standard_ExecutorRun(queryDesc, direction, count, execute_once);
}
###
if (unlikely(ExecutorRun_hook)),

I hope to get guidance from community experts,Many thanks

Thanks

#2Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: wenhui qiu (#1)
Re: Linux likely() unlikely() for PostgreSQL

On Sun, 30 Jun 2024, 15:56 wenhui qiu, <qiuwenhuifx@gmail.com> wrote:

Hi Hackers
When I saw this document:https://en.wikipedia.org/wiki/Branch_predictor, I thought of Linux likely() vs unlikely() and thus thought that there are some code segments in src/backend/executor/execMain.c that can be optimized.
For example :
if (ExecutorStart_hook)
(*ExecutorStart_hook) (queryDesc, eflags);
else
standard_ExecutorStart(queryDesc, eflags);
}
void
ExecutorRun(QueryDesc *queryDesc,
ScanDirection direction, uint64 count,
bool execute_once)
{
if (ExecutorRun_hook)
(*ExecutorRun_hook) (queryDesc, direction, count, execute_once);
else
standard_ExecutorRun(queryDesc, direction, count, execute_once);
}
###
if (unlikely(ExecutorRun_hook)),

I hope to get guidance from community experts,Many thanks

While hooks are generally not installed by default, I would advise
against marking the hooks as unlikely, as that would unfairly penalize
the performance of extensions that do utilise this hook (or hooks in
general when applied to all hooks).

If this code is hot enough, then CPU branch prediction will likely
correctly predict this branch correctly after a small amount of time
(as hook null-ness is generally approximately constant for the
duration of a backend lifetime); while if this code is cold, the
benefit is not worth the additional binary size overhead of the
out-of-lined code section.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech/)

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Matthias van de Meent (#2)
Re: Linux likely() unlikely() for PostgreSQL

Matthias van de Meent <boekewurm+postgres@gmail.com> writes:

On Sun, 30 Jun 2024, 15:56 wenhui qiu, <qiuwenhuifx@gmail.com> wrote:

if (unlikely(ExecutorRun_hook)),

While hooks are generally not installed by default, I would advise
against marking the hooks as unlikely, as that would unfairly penalize
the performance of extensions that do utilise this hook (or hooks in
general when applied to all hooks).

In general, we have a policy of using likely/unlikely very sparingly,
and only in demonstrably hot code paths. This hook call certainly
doesn't qualify as hot.

Having said that ... something I've been wondering about is how to
teach the compiler that error paths are unlikely. Doing this
across-the-board wouldn't be "sparingly", but I think surely it'd
result in better code quality on average. This'd be easy enough
to do in Assert:

#define Assert(condition) \
do { \
- if (!(condition)) \
+ if (unlikely(!(condition))) \
ExceptionalCondition(#condition, __FILE__, __LINE__); \
} while (0)

but on the other hand we don't care that much about micro-optimization
of assert-enabled builds, so maybe that's not worth the trouble. The
real win would be if constructs like

if (trouble)
ereport(ERROR, ...);

could be interpreted as

if (unlikely(trouble))
ereport(ERROR, ...);

But I surely don't want to make thousands of such changes manually.
And it's possible that smart compilers already realize this, using
a heuristic that any path that ends in pg_unreachable() must be
unlikely. Is there a way to encourage compilers to believe that?

regards, tom lane

#4wenhui qiu
qiuwenhuifx@gmail.com
In reply to: Tom Lane (#3)
Re: Linux likely() unlikely() for PostgreSQL

Hi Tom ,Matthias
Thank you for your explanation.Maybe future compilers will be able to
do intelligent prediction?

Thanks

Tom Lane <tgl@sss.pgh.pa.us> 于2024年6月30日周日 23:23写道:

Show quoted text

Matthias van de Meent <boekewurm+postgres@gmail.com> writes:

On Sun, 30 Jun 2024, 15:56 wenhui qiu, <qiuwenhuifx@gmail.com> wrote:

if (unlikely(ExecutorRun_hook)),

While hooks are generally not installed by default, I would advise
against marking the hooks as unlikely, as that would unfairly penalize
the performance of extensions that do utilise this hook (or hooks in
general when applied to all hooks).

In general, we have a policy of using likely/unlikely very sparingly,
and only in demonstrably hot code paths. This hook call certainly
doesn't qualify as hot.

Having said that ... something I've been wondering about is how to
teach the compiler that error paths are unlikely. Doing this
across-the-board wouldn't be "sparingly", but I think surely it'd
result in better code quality on average. This'd be easy enough
to do in Assert:

#define Assert(condition) \
do { \
-               if (!(condition)) \
+               if (unlikely(!(condition))) \
ExceptionalCondition(#condition, __FILE__,
__LINE__); \
} while (0)

but on the other hand we don't care that much about micro-optimization
of assert-enabled builds, so maybe that's not worth the trouble. The
real win would be if constructs like

if (trouble)
ereport(ERROR, ...);

could be interpreted as

if (unlikely(trouble))
ereport(ERROR, ...);

But I surely don't want to make thousands of such changes manually.
And it's possible that smart compilers already realize this, using
a heuristic that any path that ends in pg_unreachable() must be
unlikely. Is there a way to encourage compilers to believe that?

regards, tom lane

In reply to: Tom Lane (#3)
Re: Linux likely() unlikely() for PostgreSQL

On Sun, Jun 30, 2024 at 11:23 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

The real win would be if constructs like

if (trouble)
ereport(ERROR, ...);

could be interpreted as

if (unlikely(trouble))
ereport(ERROR, ...);

But I surely don't want to make thousands of such changes manually.
And it's possible that smart compilers already realize this, using
a heuristic that any path that ends in pg_unreachable() must be
unlikely. Is there a way to encourage compilers to believe that?

Isn't that what commit 913ec71d68 did, by adding a call to
pg_attribute_cold to ereport?

--
Peter Geoghegan

#6David Rowley
dgrowleyml@gmail.com
In reply to: Peter Geoghegan (#5)
Re: Linux likely() unlikely() for PostgreSQL

On Mon, 1 Jul 2024 at 14:08, Peter Geoghegan <pg@bowt.ie> wrote:

On Sun, Jun 30, 2024 at 11:23 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

if (trouble)
ereport(ERROR, ...);

could be interpreted as

if (unlikely(trouble))
ereport(ERROR, ...);

But I surely don't want to make thousands of such changes manually.
And it's possible that smart compilers already realize this, using
a heuristic that any path that ends in pg_unreachable() must be
unlikely. Is there a way to encourage compilers to believe that?

Isn't that what commit 913ec71d68 did, by adding a call to
pg_attribute_cold to report?

Yes.

David

#7Ranier Vilela
ranier.vf@gmail.com
In reply to: Tom Lane (#3)
Re: Linux likely() unlikely() for PostgreSQL

Em dom., 30 de jun. de 2024 às 12:24, Tom Lane <tgl@sss.pgh.pa.us> escreveu:

Matthias van de Meent <boekewurm+postgres@gmail.com> writes:

On Sun, 30 Jun 2024, 15:56 wenhui qiu, <qiuwenhuifx@gmail.com> wrote:

if (unlikely(ExecutorRun_hook)),

While hooks are generally not installed by default, I would advise
against marking the hooks as unlikely, as that would unfairly penalize
the performance of extensions that do utilise this hook (or hooks in
general when applied to all hooks).

In general, we have a policy of using likely/unlikely very sparingly,
and only in demonstrably hot code paths. This hook call certainly
doesn't qualify as hot.

Having said that ... something I've been wondering about is how to
teach the compiler that error paths are unlikely. Doing this
across-the-board wouldn't be "sparingly", but I think surely it'd
result in better code quality on average. This'd be easy enough
to do in Assert:

#define Assert(condition) \
do { \
-               if (!(condition)) \
+               if (unlikely(!(condition))) \
ExceptionalCondition(#condition, __FILE__,
__LINE__); \
} while (0)

but on the other hand we don't care that much about micro-optimization
of assert-enabled builds, so maybe that's not worth the trouble. The
real win would be if constructs like

if (trouble)
ereport(ERROR, ...);

could be interpreted as

if (unlikely(trouble))
ereport(ERROR, ...);

Hi Tom.

Let's do it?
But we will need a 1 hour window to apply the patch.
I can generate it in 30 minutes.
The current size is 1.6MB.

All expressions for ERROR, FATAL and PANIC.

if (unlikely(expr))
ereport(ERROR, ...)

Any other expression was left out, such as:

if (expr)
{
ereport(ERROR, ...)
}

This attached version needs manual adjustment.
for cases of:
if (expr) /* comments */

If I apply it, I can adjust it.

best regards,
Ranier Vilela

Attachments:

convert_ereport_expr_to_unlikely_expr.zipapplication/x-zip-compressed; name=convert_ereport_expr_to_unlikely_expr.zipDownload+7-10
#8Andres Freund
andres@anarazel.de
In reply to: Matthias van de Meent (#2)
Re: Linux likely() unlikely() for PostgreSQL

Hi,

On 2024-06-30 16:47:02 +0200, Matthias van de Meent wrote:

While hooks are generally not installed by default, I would advise
against marking the hooks as unlikely, as that would unfairly penalize
the performance of extensions that do utilise this hook (or hooks in
general when applied to all hooks).

I don't think this could be fairly described as penalizing use of the hooks.
likely()/unlikely() isn't the same as __attribute__((cold)).

See https://godbolt.org/z/qdnz65ez8 - only the version using
__attribute__((cold)) gets some code put into a separate section.

What's the possible argument for saying that the generated code should be
optimized for the presence of hooks?

Note that I'm not saying that we ought to use likely() here (nor the
opposite), just that this argument for not doing so doesn't seem to hold
water.

If this code is hot enough, then CPU branch prediction will likely
correctly predict this branch correctly after a small amount of time
(as hook null-ness is generally approximately constant for the
duration of a backend lifetime);

IMO that's not quite right. The branch predictor has a limited
capacity. Because postgres executes a lot of code it frequently executes
"outer" code without benefit of the branch predictor, just due to the amount
of code that was executed - even if the outer code matters for performance. In
which case the static branch prediction can matter (i.e. backward branches are
assumed to be taken, forward branches not).

while if this code is cold, the benefit is not worth the additional binary
size overhead of the out-of-lined code section.

There wouldn't be any such overhead, afaict.

Greetings,

Andres Freund

#9John Naylor
john.naylor@enterprisedb.com
In reply to: Tom Lane (#3)
Re: Linux likely() unlikely() for PostgreSQL

On Sun, Jun 30, 2024 at 10:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

In general, we have a policy of using likely/unlikely very sparingly,
and only in demonstrably hot code paths. This hook call certainly
doesn't qualify as hot.

I just remembered an article I read a while back by a compiler
engineer who said that compilers have heuristics that treat NULL
pointers as "unlikely" since the most common reason to test that is so
we can actually dereference it and do something with it, and in that
case a NULL pointer is an exceptional condition. He also said it
wasn't documented so you can only see this by looking at the source
code. Instead of taking the time to verify that, I created some toy
examples and it seems to be true:

https://godbolt.org/z/dv6M5ecY5

This works all the way back to clang 3.1 and (at least, because
Compiler Explorer doesn't go back any further) gcc 3.4.6, so it's an
ancient heuristic. If we can rely on this, we could remove the handful
of statements of the form "if (unlikely(foo_ptr == NULL))" and
document it in c.h. It may be iffy to rely on an undocumented
heuristic, but it also seems silly to have a use-sparingly policy if
some of our current uses have zero effect on the emitted binary (I
haven't checked yet).

#10David Rowley
dgrowleyml@gmail.com
In reply to: John Naylor (#9)
Re: Linux likely() unlikely() for PostgreSQL

On Mon, 12 Aug 2024 at 20:34, John Naylor <johncnaylorls@gmail.com> wrote:

I just remembered an article I read a while back by a compiler
engineer who said that compilers have heuristics that treat NULL
pointers as "unlikely" since the most common reason to test that is so
we can actually dereference it and do something with it, and in that
case a NULL pointer is an exceptional condition. He also said it
wasn't documented so you can only see this by looking at the source
code. Instead of taking the time to verify that, I created some toy
examples and it seems to be true:

https://godbolt.org/z/dv6M5ecY5

Interesting.

I agree with Andres' comment about the no increase in binary size
overhead. The use of unlikely() should just be a swap in the "call"
operands so the standard_Executor* function goes in the happy path.
(These are all sibling calls, so in reality, most compilers should do
"jmp" instead of "call")

I also agree with Tom with his comment about the tests for these
executor hooks not being hot. Each of them are executed once per
query, so an unlikely() isn't going to make any difference to the
performance of a query.

Taking into account what your analysis uncovered, I think what might
be worth spending some time on would be looking for hot var == NULL
tests where the common path is true and seeing if using likely() on
any of those makes a meaningful impact on performance. Maybe something
like [1]https://coccinelle.gitlabpages.inria.fr/website/ could be used to inject a macro or function call in each if
(<var> == NULL test to temporarily install some telemetry and look for
__FILE__ / __LINE__ combinations that are almost always true. Grouping
those results by __FILE__, __LINE__ and sorting by count(*) desc might
yield something interesting.

David

[1]: https://coccinelle.gitlabpages.inria.fr/website/