JIT breaks PostGIS

Started by Darafei "Komяpa" Praliaskouskiover 7 years ago18 messages

Hi,

Today I spent some time closing PostGIS tickets in preparation to Monday's
release.

One of the blockers, https://trac.osgeo.org/postgis/ticket/4125, was filed
by Postgres APT repository maintainer Christoph Berg who noticed a test
suite failure on Debian Stretch with Postgres 11.

Upon investigation we found:
- A build for Ubuntu Bionic installed on Debian Stretch passes the suite,
requiring llvm6;
- A build for Debian Stretch fails the suite on a call to external library
GEOS, showing no traces of JIT in the stacktrace;
- Setting jit=off lets the suite pass;
- The query called in clean session by itself does not crash Postgres.
Queries above it are required to reproduce the crash;
- The crash affects not only Stretch, but customly collected Postgres 11 /
clang 3.9 on Travis CI running Ubuntu Trusty:
https://github.com/postgis/postgis/pull/262.

I suspect that a fix would require to bisect llvm/clang version which stops
showing this behavior and making it a new minimum for JIT, if this is not a
symptom of bigger (memory management?) problem.

Thank you!
--
Darafei Praliaskouski
Support me: http://patreon.com/komzpa

#2Andres Freund
andres@anarazel.de
In reply to: Darafei "Komяpa" Praliaskouski (#1)
Re: JIT breaks PostGIS

Hi,

On 2018-07-21 23:14:47 +0300, Darafei "Komяpa" Praliaskouski wrote:

Today I spent some time closing PostGIS tickets in preparation to Monday's
release.

One of the blockers, https://trac.osgeo.org/postgis/ticket/4125, was filed
by Postgres APT repository maintainer Christoph Berg who noticed a test
suite failure on Debian Stretch with Postgres 11.

Upon investigation we found:
- A build for Ubuntu Bionic installed on Debian Stretch passes the suite,
requiring llvm6;
- A build for Debian Stretch fails the suite on a call to external library
GEOS, showing no traces of JIT in the stacktrace;
- Setting jit=off lets the suite pass;
- The query called in clean session by itself does not crash Postgres.
Queries above it are required to reproduce the crash;
- The crash affects not only Stretch, but customly collected Postgres 11 /
clang 3.9 on Travis CI running Ubuntu Trusty:
https://github.com/postgis/postgis/pull/262.

I suspect that a fix would require to bisect llvm/clang version which stops
showing this behavior and making it a new minimum for JIT, if this is not a
symptom of bigger (memory management?) problem.

Could you attempt to come up with a smaller reproducer?

Greetings,

Andres Freund

#3Christoph Berg
myon@debian.org
In reply to: Andres Freund (#2)
Re: JIT breaks PostGIS

Re: Andres Freund 2018-07-21 <20180721202543.ri5jyfclj6kb6lag@alap3.anarazel.de>

Could you attempt to come up with a smaller reproducer?

The original instructions in
https://trac.osgeo.org/postgis/ticket/4125 are fairly easy to follow;
there's also a postgresql-11-postgis-2.5{,-scripts} package (not
mentioned in there) that exhibits the bug.

(Add "stretch-pgdg-testing main 11" to sources.list.)

That said, I'm just rebuilding postgresql-11 on stretch to use
llvm-4.0 instead of 3.9.

Christoph

#4Andres Freund
andres@anarazel.de
In reply to: Christoph Berg (#3)
Re: JIT breaks PostGIS

Hi,

On 2018-07-21 22:36:44 +0200, Christoph Berg wrote:

Re: Andres Freund 2018-07-21 <20180721202543.ri5jyfclj6kb6lag@alap3.anarazel.de>

Could you attempt to come up with a smaller reproducer?

The original instructions in
https://trac.osgeo.org/postgis/ticket/4125 are fairly easy to follow;
there's also a postgresql-11-postgis-2.5{,-scripts} package (not
mentioned in there) that exhibits the bug.

Sure, but a more minimal example (than a 1kloc regression script, wiht
possible inter statement dependencies) still makes the debugging
easier...

Greetings,

Andres Freund

In reply to: Andres Freund (#4)
Re: JIT breaks PostGIS

Hi,

Here's somewhat minimized example.

https://gist.github.com/Komzpa/cc3762175328ff5d11de4b972352003d

You can put this file into regress/jitbug.sql in PostGIS code tree and run
after building postgis:

perl regress/run_test.pl regress/jitbug.sql --expect
perl regress/run_test.pl regress/jitbug.sql

сб, 21 июл. 2018 г. в 23:39, Andres Freund <andres@anarazel.de>:

Hi,

On 2018-07-21 22:36:44 +0200, Christoph Berg wrote:

Re: Andres Freund 2018-07-21 <

20180721202543.ri5jyfclj6kb6lag@alap3.anarazel.de>

Could you attempt to come up with a smaller reproducer?

The original instructions in
https://trac.osgeo.org/postgis/ticket/4125 are fairly easy to follow;
there's also a postgresql-11-postgis-2.5{,-scripts} package (not
mentioned in there) that exhibits the bug.

Sure, but a more minimal example (than a 1kloc regression script, wiht
possible inter statement dependencies) still makes the debugging
easier...

Greetings,

Andres Freund

--
Darafei Praliaskouski
Support me: http://patreon.com/komzpa

#6Christoph Berg
myon@debian.org
In reply to: Christoph Berg (#3)
Re: JIT breaks PostGIS

Re: To Andres Freund 2018-07-21 <20180721203644.GB10781@msg.df7cb.de>

That said, I'm just rebuilding postgresql-11 on stretch to use
llvm-4.0 instead of 3.9.

This didn't change anything, it's still crashing on the same query
from tickets.sql.

Does that mean JIT is not ready for prime time yet and should be
disabled in default installations? Or does it just mean that llvm 4.0
is old?

Christoph

#7Andres Freund
andres@anarazel.de
In reply to: Christoph Berg (#6)
Re: JIT breaks PostGIS

On 2018-07-22 20:49:51 +0200, Christoph Berg wrote:

Re: To Andres Freund 2018-07-21 <20180721203644.GB10781@msg.df7cb.de>

That said, I'm just rebuilding postgresql-11 on stretch to use
llvm-4.0 instead of 3.9.

This didn't change anything, it's still crashing on the same query
from tickets.sql.

Does that mean JIT is not ready for prime time yet and should be
disabled in default installations? Or does it just mean that llvm 4.0
is old?

I don't know yet, it's probably just some small bug. But let me debug it
first.

Greetings,

Andres Freund

#8Christoph Berg
myon@debian.org
In reply to: Andres Freund (#7)
Re: JIT breaks PostGIS

Re: Andres Freund 2018-07-22 <20180722185106.qxc5ie745tqdazgq@alap3.anarazel.de>

Does that mean JIT is not ready for prime time yet and should be
disabled in default installations? Or does it just mean that llvm 4.0
is old?

I don't know yet, it's probably just some small bug. But let me debug it
first.

Sure.

The question will be coming up eventually, though, and I think the
options on the packaging side are:

1) Disable jit completely
2) Compile --with-llvm, but disable jit in the config by default
3) Compile --with-llvm, but disable jit for older llvm versions
4) Enable jit everywhere where llvm >= 3.9 is available

Option 4 is what the Debian packages implement now, but it might make
sense to go to 2 or 3 for PG11 (only).

Christoph

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Christoph Berg (#8)
Re: JIT breaks PostGIS

Christoph Berg <myon@debian.org> writes:

The question will be coming up eventually, though, and I think the
options on the packaging side are:

1) Disable jit completely
2) Compile --with-llvm, but disable jit in the config by default
3) Compile --with-llvm, but disable jit for older llvm versions
4) Enable jit everywhere where llvm >= 3.9 is available

Option 4 is what the Debian packages implement now, but it might make
sense to go to 2 or 3 for PG11 (only).

Well, right now JIT is certainly beta-quality code, so you ought
to expect bugs. We have an open item at
https://wiki.postgresql.org/wiki/PostgreSQL_11_Open_Items
to decide whether to ship v11 with JIT enabled by default or not,
but I don't expect that decision will be taken until much closer
to release. Until then, I think you should be doing (4) so that
we can gather data to inform the eventual decision.

regards, tom lane

#10Stephen Frost
sfrost@snowman.net
In reply to: Tom Lane (#9)
Re: JIT breaks PostGIS

Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

Christoph Berg <myon@debian.org> writes:

The question will be coming up eventually, though, and I think the
options on the packaging side are:

1) Disable jit completely
2) Compile --with-llvm, but disable jit in the config by default
3) Compile --with-llvm, but disable jit for older llvm versions
4) Enable jit everywhere where llvm >= 3.9 is available

Option 4 is what the Debian packages implement now, but it might make
sense to go to 2 or 3 for PG11 (only).

Well, right now JIT is certainly beta-quality code, so you ought
to expect bugs. We have an open item at
https://wiki.postgresql.org/wiki/PostgreSQL_11_Open_Items
to decide whether to ship v11 with JIT enabled by default or not,
but I don't expect that decision will be taken until much closer
to release. Until then, I think you should be doing (4) so that
we can gather data to inform the eventual decision.

+1.

Thanks!

Stephen

#11Andres Freund
andres@anarazel.de
In reply to: Darafei "Komяpa" Praliaskouski (#1)
Re: JIT breaks PostGIS

Hi,

On 2018-07-21 23:14:47 +0300, Darafei "Komяpa" Praliaskouski wrote:

Today I spent some time closing PostGIS tickets in preparation to Monday's
release.

One of the blockers, https://trac.osgeo.org/postgis/ticket/4125, was filed
by Postgres APT repository maintainer Christoph Berg who noticed a test
suite failure on Debian Stretch with Postgres 11.

Upon investigation we found:
- A build for Ubuntu Bionic installed on Debian Stretch passes the suite,
requiring llvm6;
- A build for Debian Stretch fails the suite on a call to external library
GEOS, showing no traces of JIT in the stacktrace;
- Setting jit=off lets the suite pass;
- The query called in clean session by itself does not crash Postgres.
Queries above it are required to reproduce the crash;
- The crash affects not only Stretch, but customly collected Postgres 11 /
clang 3.9 on Travis CI running Ubuntu Trusty:
https://github.com/postgis/postgis/pull/262.

I suspect that a fix would require to bisect llvm/clang version which stops
showing this behavior and making it a new minimum for JIT, if this is not a
symptom of bigger (memory management?) problem.

It looks to me like it's a LLVM issue, specifically https://bugs.llvm.org/show_bug.cgi?id=34424
fixed in LLVM 5+.

It'll only be an issue for extensions that throw c++ style exceptions. I
don't think that rises to the level of disallowing any LLVM version <
5.0. I suggest postgis adds an error check to its buildprocess that
refuses to run if jit is enabled and a too old version is used?

Greetings,

Andres Freund

In reply to: Andres Freund (#11)
Re: JIT breaks PostGIS

Hello,

пн, 23 июл. 2018 г. в 8:13, Andres Freund <andres@anarazel.de>:

Hi,

On 2018-07-21 23:14:47 +0300, Darafei "Komяpa" Praliaskouski wrote:

I suspect that a fix would require to bisect llvm/clang version which

stops

showing this behavior and making it a new minimum for JIT, if this is

not a

symptom of bigger (memory management?) problem.

It looks to me like it's a LLVM issue, specifically
https://bugs.llvm.org/show_bug.cgi?id=34424
fixed in LLVM 5+.

Thank you for your investigation.

It'll only be an issue for extensions that throw c++ style exceptions. I
don't think that rises to the level of disallowing any LLVM version <
5.0. I suggest postgis adds an error check to its buildprocess that
refuses to run if jit is enabled and a too old version is used?

Unfortunately that's not an option.

Debian Stretch amd64 is quite popular platform, and is supported by
Postgres 10 / PostGIS 2.4.

If Christoph decides to keep LLVM enabled for 11 on that platform, as he is
recommended upthread by Tom, that would mean that PostGIS can't be packaged
at all, and all the people who need it will have to stay on Postgres 10.

If PostGIS decides not to implement the check, and instead tweaks test
runner to execute `set jit to off;` before tickets.sql, then Postgres 11 on
that platform will have a known way to segfault it, even without superuser
rights, as jit GUC is tweakable by anyone.

I think that a good way to deal with it will be to bump minimum required
version of LLVM to 5 on Postgres build, with a notice in docs that will say
that "you can build with lower version, but that will give you this and
this bug".

It also may happen that a patch for LLVM can be applied to LLVM4 build in
Debian and brought in as an update, but such a plan should not be a default
one.
--
Darafei Praliaskouski
Support me: http://patreon.com/komzpa

#13Christoph Berg
myon@debian.org
In reply to: Darafei "Komяpa" Praliaskouski (#12)
Re: JIT breaks PostGIS

Re: Darafei "Komяpa" Praliaskouski 2018-07-23 <CAC8Q8t+=17oZ4TZGcPn-1BaTCO0_45TBxoc2AssG1Y9A9B6SKQ@mail.gmail.com>

It looks to me like it's a LLVM issue, specifically
https://bugs.llvm.org/show_bug.cgi?id=34424
fixed in LLVM 5+.

Thank you for your investigation.

Thanks!

It'll only be an issue for extensions that throw c++ style exceptions. I
don't think that rises to the level of disallowing any LLVM version <
5.0. I suggest postgis adds an error check to its buildprocess that
refuses to run if jit is enabled and a too old version is used?

Unfortunately that's not an option.

Is it possible to disable JIT per extension, say by removing all .bc
files for it, or via a PGXS variable? Or does this bug still trigger
even without the .bc files for PostGIS?

If Christoph decides to keep LLVM enabled for 11 on that platform, as he is
recommended upthread by Tom, that would mean that PostGIS can't be packaged
at all, and all the people who need it will have to stay on Postgres 10.

We'll definitely need to find a proper fix, not shipping PostGIS is
not an option.

If PostGIS decides not to implement the check, and instead tweaks test
runner to execute `set jit to off;` before tickets.sql, then Postgres 11 on
that platform will have a known way to segfault it, even without superuser
rights, as jit GUC is tweakable by anyone.

Not a good option, ack.

I think that a good way to deal with it will be to bump minimum required
version of LLVM to 5 on Postgres build, with a notice in docs that will say
that "you can build with lower version, but that will give you this and
this bug".

Is LLVM << 5 generally acceptable for use with PostgreSQL, or should
we look into getting a new version to distribute along with it on
apt.postgresql.org? I'd rather like to avoid having to ship a
compiler...

It also may happen that a patch for LLVM can be applied to LLVM4 build in
Debian and brought in as an update, but such a plan should not be a default
one.

That's actually a plan I like very much. Hopefully the patch is small
and backpatchable.

Christoph

#14Andres Freund
andres@anarazel.de
In reply to: Christoph Berg (#13)
Re: JIT breaks PostGIS

Hi,

On 2018-07-25 21:05:33 +0200, Christoph Berg wrote:

It'll only be an issue for extensions that throw c++ style exceptions. I
don't think that rises to the level of disallowing any LLVM version <
5.0. I suggest postgis adds an error check to its buildprocess that
refuses to run if jit is enabled and a too old version is used?

Unfortunately that's not an option.

Is it possible to disable JIT per extension, say by removing all .bc
files for it, or via a PGXS variable? Or does this bug still trigger
even without the .bc files for PostGIS?

I haven't investigated the details here. It certainly would be possible
to have the _PG_init() of postgis's so force JIT to be off, and emit a
warning.

There's no way to just force JIT off whenever anything involving postgis
is present in the query. Not installing the .bc files will prevent
inlining, but I don't think that's sufficient to prevent the problem in
all cases.

I think that a good way to deal with it will be to bump minimum required
version of LLVM to 5 on Postgres build, with a notice in docs that will say
that "you can build with lower version, but that will give you this and
this bug".

Is LLVM << 5 generally acceptable for use with PostgreSQL

It is. Newer version wouldn't hurt though.

, or should we look into getting a new version to distribute along
with it on apt.postgresql.org? I'd rather like to avoid having to ship
a compiler...

Well, you don't need to ship the compiler (clang), strictly
speaking. But yea.

It also may happen that a patch for LLVM can be applied to LLVM4 build in
Debian and brought in as an update, but such a plan should not be a default
one.

That's actually a plan I like very much. Hopefully the patch is small
and backpatchable.

It's not entirely trivial, and I suspect there's independent changes
making it not apply cleanly:
https://github.com/llvm-mirror/llvm/commit/ab3dba86f951a1bdfe01d3102e2850e607af791a

Greetings,

Andres Freund

#15Christoph Berg
myon@debian.org
In reply to: Andres Freund (#14)
Re: JIT breaks PostGIS

Re: Andres Freund 2018-07-25 <20180725191143.sietxlbfehv245hb@alap3.anarazel.de>

I haven't investigated the details here. It certainly would be possible
to have the _PG_init() of postgis's so force JIT to be off, and emit a
warning.

Isn't that too late, if postgis.so gets loaded by a query that is in
to process of being jit'ed?

There's no way to just force JIT off whenever anything involving postgis
is present in the query. Not installing the .bc files will prevent
inlining, but I don't think that's sufficient to prevent the problem in
all cases.

Ok.

Different question, the other way round, is there a way to know that
all files needed to inline a query/extension are there? How does the
JIT machinery determine it can (try to) compile things? (That's
something extension packages might want to test for.)

, or should we look into getting a new version to distribute along
with it on apt.postgresql.org? I'd rather like to avoid having to ship
a compiler...

Well, you don't need to ship the compiler (clang), strictly
speaking. But yea.

We need clang to compile PostgreSQL and extensions, so it needs to
come from somewhere. We could pull it from (stretch-)backports, but
having to enable backports for really every build doesn't look
appealing. (At the moment it's about two dozen out of ~100 packages
that needs backports at build time, and half of that is python
packages that are dependencies of pgadmin4 only.)

It also may happen that a patch for LLVM can be applied to LLVM4 build in
Debian and brought in as an update, but such a plan should not be a default
one.

That's actually a plan I like very much. Hopefully the patch is small
and backpatchable.

It's not entirely trivial, and I suspect there's independent changes
making it not apply cleanly:
https://github.com/llvm-mirror/llvm/commit/ab3dba86f951a1bdfe01d3102e2850e607af791a

The svn link for that is https://llvm.org/viewvc/llvm-project?view=revision&amp;revision=302589

I tried to apply the patch to llvm-toolchain-4.0_4.0.1-10~deb9u2.
There are no rejects, but two of the patched files do not even exist.

patching file include/llvm/ExecutionEngine/Orc/CompileOnDemandLayer.h
patching file include/llvm/ExecutionEngine/Orc/OrcRemoteTargetClient.h
Hunk #1 succeeded at 143 (offset -1 lines).
Hunk #2 succeeded at 319 (offset -1 lines).
Hunk #3 succeeded at 330 (offset -1 lines).
Hunk #4 succeeded at 387 (offset -1 lines).
can't find file to patch at input line 171
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|diff --git a/include/llvm/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.h b/include/llvm/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.h
|index babcc7f26aab..5b3426afe584 100644
|--- a/include/llvm/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.h
|+++ b/include/llvm/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.h
--------------------------
File to patch:
Skip this patch? [y]
Skipping patch.
1 out of 1 hunk ignored
patching file include/llvm/ExecutionEngine/RTDyldMemoryManager.h
patching file include/llvm/ExecutionEngine/RuntimeDyld.h
patching file lib/ExecutionEngine/Orc/OrcMCJITReplacement.h
patching file lib/ExecutionEngine/RuntimeDyld/RTDyldMemoryManager.cpp
patching file lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp
patching file lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp
patching file lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.h
Hunk #1 succeeded at 144 with fuzz 2 (offset -8 lines).
Hunk #2 succeeded at 167 (offset -12 lines).
patching file lib/ExecutionEngine/RuntimeDyld/RuntimeDyldImpl.h
Hunk #1 succeeded at 504 (offset -11 lines).
patching file lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFI386.h
patching file lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFThumb.h
patching file lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFX86_64.h
patching file tools/lli/RemoteJITUtils.h
patching file tools/llvm-rtdyld/llvm-rtdyld.cpp
patching file unittests/ExecutionEngine/Orc/ObjectTransformLayerTest.cpp
can't find file to patch at input line 429
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|diff --git a/unittests/ExecutionEngine/Orc/RTDyldObjectLinkingLayerTest.cpp b/unittests/ExecutionEngine/Orc/RTDyldObjectLinkingLayerTest.cpp
|index de99c022fb9d..c13a75a5cbfe 100644
|--- a/unittests/ExecutionEngine/Orc/RTDyldObjectLinkingLayerTest.cpp
|+++ b/unittests/ExecutionEngine/Orc/RTDyldObjectLinkingLayerTest.cpp
--------------------------
File to patch:
Skip this patch? [y]
Skipping patch.
5 out of 5 hunks ignored

Christoph

#16Andres Freund
andres@anarazel.de
In reply to: Christoph Berg (#15)
Re: JIT breaks PostGIS

Hi,

On 2018-07-25 21:39:26 +0200, Christoph Berg wrote:

Re: Andres Freund 2018-07-25 <20180725191143.sietxlbfehv245hb@alap3.anarazel.de>

I haven't investigated the details here. It certainly would be possible
to have the _PG_init() of postgis's so force JIT to be off, and emit a
warning.

Isn't that too late, if postgis.so gets loaded by a query that is in
to process of being jit'ed?

Hm, I'd guess it'd still be fine, but I haven't thought it sufficiently
through.

Different question, the other way round, is there a way to know that
all files needed to inline a query/extension are there? How does the
JIT machinery determine it can (try to) compile things? (That's
something extension packages might want to test for.)

I'm not 100% sure I understand your question. Let me describe how it
works, and maybe you can then rephrase afterwards?

The way inlining works is that, when referencing a function, the bitcode
summary file corresponding to it (either postgres.index.bc if builtin or
$extension.index.bc if in an extension) gets opened. That contains
metadata (name, number of instructions, ...) about the functions
included in the indexed .bc files (which reside in
$module/path/to/$file.bc). Those .bc files in turn are generated by
clang -emit-llvm. The inlining machinery looks up functions in the
corresponding .index.bc, checks whether they are smaller than the
instruction limit, and inlines them if below.

If a function is not in the summary, or it is too large, it'll just
generate an external function call. It's perfectly normal to have too
large functions and functions that aren't present (e.g. random libraries
including libc).

What precisely would you want to test?

Greetings,

Andres Freund

#17Christoph Berg
myon@debian.org
In reply to: Andres Freund (#16)
Re: JIT breaks PostGIS

Re: Andres Freund 2018-07-25 <20180725195037.jmykfzfporf6auxn@alap3.anarazel.de>

Different question, the other way round, is there a way to know that
all files needed to inline a query/extension are there? How does the
JIT machinery determine it can (try to) compile things? (That's
something extension packages might want to test for.)

I'm not 100% sure I understand your question. Let me describe how it
works, and maybe you can then rephrase afterwards?

The way inlining works is that, when referencing a function, the bitcode
summary file corresponding to it (either postgres.index.bc if builtin or
$extension.index.bc if in an extension) gets opened. That contains
metadata (name, number of instructions, ...) about the functions
included in the indexed .bc files (which reside in
$module/path/to/$file.bc). Those .bc files in turn are generated by
clang -emit-llvm. The inlining machinery looks up functions in the
corresponding .index.bc, checks whether they are smaller than the
instruction limit, and inlines them if below.

Is that top-level functions (SQL language "C" functions), or all
C-code functions?

If a function is not in the summary, or it is too large, it'll just
generate an external function call. It's perfectly normal to have too
large functions and functions that aren't present (e.g. random libraries
including libc).

What happens if some (SQL) function is in there, but calls into a
function that is not? Or is in a different index?

Christoph

#18Andres Freund
andres@anarazel.de
In reply to: Christoph Berg (#17)
Re: JIT breaks PostGIS

Hi,

On 2018-07-25 21:59:32 +0200, Christoph Berg wrote:

Re: Andres Freund 2018-07-25 <20180725195037.jmykfzfporf6auxn@alap3.anarazel.de>

The way inlining works is that, when referencing a function, the bitcode
summary file corresponding to it (either postgres.index.bc if builtin or
$extension.index.bc if in an extension) gets opened. That contains
metadata (name, number of instructions, ...) about the functions
included in the indexed .bc files (which reside in
$module/path/to/$file.bc). Those .bc files in turn are generated by
clang -emit-llvm. The inlining machinery looks up functions in the
corresponding .index.bc, checks whether they are smaller than the
instruction limit, and inlines them if below.

Is that top-level functions (SQL language "C" functions), or all
C-code functions?

The "initial" entry point has to be a C (or INTERNAL) function. But
called function can recursively get inlined (with the size limit being
halved every recursion step). Outside of hooks etc there's no other way
to call extension functions outside of SQL level functions, so that's
not really a restriction.

If a function is not in the summary, or it is too large, it'll just
generate an external function call. It's perfectly normal to have too
large functions and functions that aren't present (e.g. random libraries
including libc).

What happens if some (SQL) function is in there, but calls into a
function that is not? Or is in a different index?

I'm not precisely sure what you mean. You're thinking of a C UDF that's
calling a C function (not a UDF) that's not in the index? If so, then
we'll just not inline that inner function, but generate a normal
function call (i.e. on asm level that'll be a callq instruction or
whatever your platform wants to do).

Greetings,

Andres Freund