Is a UDF binary portable across different minor releases and PostgreSQL distributions?
Hello,
While I was thinking of application binary compatibility between PostgreSQL releases, some questions arose about C language user-defined functions (UDFs) and extensions that depend on them.
[Q1]
Can the same UDF binary be used with different PostgreSQL minor releases? If so, is that a defined policy (e.g. written somewhere in the manual, wiki, or documentation in the source code)?
For example, suppose you build a UDF X (some_extension.so/dll) with PostgreSQL 9.5.0. Can I use the binary with PostgreSQL 9.5.x without rebuilding?
Here, the UDF references the contents of server-side data structures, like pgstattuple accesses the members of HeapScanData. If some bug fix of PostgreSQL changes the member layout of those structures, the UDF binary would possibly misbehave. Basically, should all UDFs be rebuilt with the new minor release? Or, are PostgreSQL developers aware of such incompatibility and careful not to change data structure layout?
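To make the layout concern concrete, here is a minimal, self-contained sketch. It is not PostgreSQL code: ScanV1 and ScanV2 are hypothetical stand-ins for a server-side struct before and after an imaginary bug fix inserts a member, and read_flags_as_v1 mimics an extension binary that was compiled against the old layout.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout the extension binary was compiled against. */
typedef struct ScanV1
{
    int nkeys;
    int flags;
} ScanV1;

/* Hypothetical layout after a minor release inserts a member in the middle. */
typedef struct ScanV2
{
    int nkeys;
    int new_member;     /* added by the imaginary bug fix */
    int flags;
} ScanV2;

/*
 * Mimics the old extension binary: it reads "flags" at the offset it was
 * compiled with, whatever struct the server actually hands over.
 */
int
read_flags_as_v1(const void *server_struct)
{
    int value;

    memcpy(&value,
           (const char *) server_struct + offsetof(ScanV1, flags),
           sizeof(value));
    return value;
}

/* Returns 1 if "flags" sits at the same offset in both layouts, else 0. */
int
flags_offset_is_stable(void)
{
    return offsetof(ScanV1, flags) == offsetof(ScanV2, flags);
}
```

With the new layout the old binary silently reads new_member instead of flags. No error is raised, which is why this kind of breakage is hard to notice.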
[Q2]
Can the same UDF binary be used with different PostgreSQL distributions (EnterpriseDB, OpenSCG, RHEL packages, etc.)? Or should the UDF be built with the target distribution?
I guess the rebuild is necessary if the distribution modified the source code of PostgreSQL. That is, the UDF binary built with the bare PostgreSQL cannot be used with EnterpriseDB's advanced edition, which may modify various data structures.
How about other distributions which probably don't modify the source code? Should the UDF be built with the target PostgreSQL because configure options may differ, which affects data structures?
Regards
Takayuki Tsunakawa
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Jul 1, 2016 at 9:33 AM, Tsunakawa, Takayuki
<tsunakawa.takay@jp.fujitsu.com> wrote:
[Q1]
Can the same UDF binary be used with different PostgreSQL minor releases? If it is, is it a defined policy (e.g. written somewhere in the manual, wiki, documentation in the source code)?
For example, suppose you build a UDF X (some_extension.so/dll) with PostgreSQL 9.5.0. Can I use the binary with PostgreSQL 9.5.x without rebuilding?
Yes, that works properly. There could be problems with potential
changes in the backend APIs in a stable branch, but this usually does
not happen much.
Here, the UDF references the contents of server-side data structures, like pgstattuple accesses the members of HeapScanData. If some bug fix of PostgreSQL changes the member layout of those structures, the UDF binary would possibly misbehave. Basically, should all UDFs be rebuilt with the new minor release?
Not necessarily.
Or, are PostgreSQL developers aware of such incompatibility and careful not to change data structure layout?
Committers are aware and careful about that, that's why exposed APIs
and structures are normally kept stable. At least that's what I see.
[Q2]
Can the same UDF binary be used with different PostgreSQL distributions (EnterpriseDB, OpenSCG, RHEL packages, etc.)? Or should the UDF be built with the target distribution?
Each distribution usually has its own compilation options (say page
size, etc.), even if I recall that most of them use the defaults, so it
clearly depends on what each of them uses. I would recommend a
recompilation just to be safe. It may not be worth spending time
checking each one's differences.
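PostgreSQL's PG_MODULE_MAGIC mechanism catches some build mismatches of this kind when a module is loaded. The sketch below is a simplified model of that idea, not the real structure: BuildInfo, its fields, and first_mismatch are all hypothetical names, and the choice of fields is only an assumption for illustration.

```c
/* Which hypothetical build setting differs, if any. */
typedef enum Mismatch
{
    MISMATCH_NONE = 0,
    MISMATCH_MAJOR_VERSION,
    MISMATCH_BLOCK_SIZE,
    MISMATCH_NAME_DATA_LEN
} Mismatch;

/* Hypothetical record of configure-time settings baked into a binary. */
typedef struct BuildInfo
{
    int major_version;  /* e.g. 905 for the 9.5 branch */
    int block_size;     /* page size chosen at configure time */
    int name_data_len;  /* maximum identifier length */
} BuildInfo;

/*
 * Compare the settings a module was built with against the server's.
 * Returns MISMATCH_NONE when all fields agree, otherwise the first
 * field that differs.
 */
Mismatch
first_mismatch(const BuildInfo *module, const BuildInfo *server)
{
    if (module->major_version != server->major_version)
        return MISMATCH_MAJOR_VERSION;
    if (module->block_size != server->block_size)
        return MISMATCH_BLOCK_SIZE;
    if (module->name_data_len != server->name_data_len)
        return MISMATCH_NAME_DATA_LEN;
    return MISMATCH_NONE;
}
```

A check like this can only cover the settings someone thought to record, so it does not replace rebuilding against the target distribution.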
I guess the rebuild is necessary if the distribution modified the source code of PostgreSQL. That is, the UDF binary built with the bare PostgreSQL cannot be used with EnterpriseDB's advanced edition, which may modify various data structures.
That's for sure.
How about other distributions which probably don't modify the source code? Should the UDF be built with the target PostgreSQL because configure options may differ, which affects data structures?
It depends on how they build it, but recompiling is the safest bet to
avoid any surprises... I recall seeing extension code that caused a
SIGSEGV with fclose(NULL) on SLES and only reported an error on
Ubuntu. The code was faulty in this case. But recompiling is usually
a better bet for stability.
--
Michael
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Michael Paquier
On Fri, Jul 1, 2016 at 9:33 AM, Tsunakawa, Takayuki
<tsunakawa.takay@jp.fujitsu.com> wrote:
[Q1]
Can the same UDF binary be used with different PostgreSQL minor releases?
If it is, is it a defined policy (e.g. written somewhere in the manual,
wiki, documentation in the source code)?
For example, suppose you build a UDF X (some_extension.so/dll) with
PostgreSQL 9.5.0. Can I use the binary with PostgreSQL 9.5.x without
rebuilding?
Yes, that works properly. There could be problems with potential changes
in the backend APIs in a stable branch, but this usually does not happen
much.
Here, the UDF references the contents of server-side data structures,
like pgstattuple accesses the members of HeapScanData. If some bug fix
of PostgreSQL changes the member layout of those structures, the UDF binary
would possibly misbehave. Basically, should all UDFs be rebuilt with the
new minor release?
Not necessarily.
Or, are PostgreSQL developers aware of such incompatibility and careful
not to change data structure layout?
Committers are aware and careful about that, that's why exposed APIs and
structures are normally kept stable. At least that's what I see.
[Q2]
Can the same UDF binary be used with different PostgreSQL distributions
(EnterpriseDB, OpenSCG, RHEL packages, etc.)? Or should the UDF be built
with the target distribution?
Each distribution has usually its own compilation options (say page size,
etc.) even if I recall that most of them use the defaults, so it clearly
depends on what kind of things each of them uses. I would recommend a
recompilation just to be safe. It may not be worth spending time at looking
and checking each one's differences.
Thanks for sharing your experience, Michael.
I'd like to document the policy clearly in the upgrade section of the PostgreSQL manual, eliminating any ambiguity, so that users can determine what they should do without worrying that something "may or may not work". Which of the following policies should I base it on?
Option 1:
Rebuild UDFs with the target PostgreSQL distribution and minor release.
Option 2:
Rebuild UDFs with the target PostgreSQL distribution.
You do not have to rebuild UDFs when you upgrade or downgrade the minor release. (If your UDF doesn't work after changing the minor release, it's a bug in PostgreSQL. You can report it to pgsql-bugs.)
Regards
Takayuki Tsunakawa
On Fri, Jul 1, 2016 at 10:35 AM, Tsunakawa, Takayuki
<tsunakawa.takay@jp.fujitsu.com> wrote:
I'd like to document the policy clearly in the upgrade section of PostgreSQL manual, eliminating any ambiguity, so that users can determine what they should do without fear like "may or may not work". Which of the following policies should I base on?
Option 1:
Rebuild UDFs with the target PostgreSQL distribution and minor release.
Option 2:
Rebuild UDFs with the target PostgreSQL distribution.
You do not have to rebuild UDFs when you upgrade or downgrade the minor release. (If your UDF doesn't work after changing the minor release, it's the bug of PostgreSQL. You can report it to pgsql-bugs.)
That would not be a bug of PostgreSQL, the terms are incorrect. If
there is an API breakage, the extension needs to keep up in this case,
so it would be better to mention asking on the lists what may have
gone wrong.
--
Michael
"Tsunakawa, Takayuki" <tsunakawa.takay@jp.fujitsu.com> writes:
I'd like to document the policy clearly in the upgrade section of PostgreSQL manual, eliminating any ambiguity, so that users can determine what they should do without fear like "may or may not work". Which of the following policies should I base on?
Option 1:
Rebuild UDFs with the target PostgreSQL distribution and minor release.
Option 2:
Rebuild UDFs with the target PostgreSQL distribution.
You do not have to rebuild UDFs when you upgrade or downgrade the minor release. (If your UDF doesn't work after changing the minor release, it's the bug of PostgreSQL. You can report it to pgsql-bugs.)
I do not like either of those. We try hard not to break extensions in
minor releases, but I'm not willing to state it as a hard-and-fast policy
that we never will --- especially because there's no bright line as to
which internal APIs extensions can rely on or not. With sufficiently
negative assumptions about what third-party authors might have chosen to
do, it could become impossible to fix anything at all in released
branches.
In practice, extensions seldom need to be modified for new minor releases.
But there's a long way between that statement and a promise that it won't
ever happen for any conceivable extension.
To make this situation better, what we'd really need is a bunch of work
to identify and document the specific APIs that we would promise won't
change within a release branch. That idea has been batted around before,
but nobody's stepped up to do all the tedious (and, no doubt, contentious)
work that would be involved.
regards, tom lane
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Michael Paquier
On Fri, Jul 1, 2016 at 10:35 AM, Tsunakawa, Takayuki
<tsunakawa.takay@jp.fujitsu.com> wrote:
I'd like to document the policy clearly in the upgrade section of
PostgreSQL manual, eliminating any ambiguity, so that users can determine
what they should do without fear like "may or may not work". Which of the
following policies should I base on?
Option 1:
Rebuild UDFs with the target PostgreSQL distribution and minor release.
Option 2:
Rebuild UDFs with the target PostgreSQL distribution.
You do not have to rebuild UDFs when you upgrade or downgrade the
minor release. (If your UDF doesn't work after changing the minor
release, it's the bug of PostgreSQL. You can report it to
pgsql-bugs.)
That would not be a bug of PostgreSQL, the terms are incorrect. If there
is an API breakage, the extension needs to keep up in this case, so it would
be better to mention asking on the lists what may have gone wrong.
OK, I understand that your choice is option 2, and that the UDF developer should report the problem and ask for its reason on pgsql-bugs, possibly ending up having to rebuild the UDF. But if so, it sounds like option 1. That is, "For safety, rebuild your UDF with each minor release. That way, you can avoid severe problems that might take time to surface." I wonder if this is similar to Linux's loadable kernel modules.
I'd like to hear opinions from other decision makers here before proceeding, as well as Michael.
Regards
Takayuki Tsunakawa
On Fri, Jul 1, 2016 at 11:33 AM, Tsunakawa, Takayuki
<tsunakawa.takay@jp.fujitsu.com> wrote:
OK, I understand that your choice is option 2, and that the UDF developer should report the problem and ask for its reason on pgsql-bugs, possibly ending up having to rebuild the UDF. But if so, it sounds like option 1. That is, "For safety, rebuild your UDF with each minor release. That way, you can avoid severe problems that might take time to surface." I wonder if this is similar to Linux's loadable kernel modules.
I'd like to hear opinions from other decision makers here before proceeding, as well as Michael.
Speaking from past experience, I once got bitten by the change of
signature of DefineIndex() done in 820ab11 with 9.2, because at some
point, the tree I was maintaining kept a static copy of Postgres code,
and bootparse.c (!) was in the set. Guess the result. That was a lot
of fun to debug to find out why Postgres kept crashing at initdb, and
extensions could blow up similarly if they expect routines with a
different shape.
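As a rough sketch of how extension source can follow a signature change once it is recompiled: PG_VERSION_NUM is the real macro from pg_config.h, but here a stand-in value is defined so the example compiles on its own. backend_define_thing and ext_define_thing are hypothetical names, and the 90200 threshold is only illustrative.

```c
/* Stand-in so the sketch compiles alone; real code gets this from pg_config.h. */
#ifndef PG_VERSION_NUM
#define PG_VERSION_NUM 90503
#endif

/* Hypothetical backend routine whose signature gained an argument. */
#if PG_VERSION_NUM >= 90200
int
backend_define_thing(int relid, int check_rights)
{
    return relid + check_rights;
}
#else
int
backend_define_thing(int relid)
{
    return relid;
}
#endif

/* Extension-side wrapper that hides the difference from its callers. */
int
ext_define_thing(int relid)
{
#if PG_VERSION_NUM >= 90200
    return backend_define_thing(relid, 1);
#else
    return backend_define_thing(relid);
#endif
}
```

The guard is resolved at compile time, so it helps only when the extension is rebuilt against the headers of the server it will run with. An old binary keeps calling with the old argument list.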
Since then I have stayed on the safe side: all my in-house backend
extensions get recompiled for each minor release, as well as at each
point in between. So option 1 is clearly what I do for the internal
stuff I work on.
Even if there were a list of routines documented as guaranteed not to
break, in some cases it is really hard to keep that promise. Looking
at the diffs of 820ab11, for example, my guess is that there was a lot
of discussion around this change, and in the end the signature of
DefineIndex had to change, for the best.
Now, speaking from the heart, it is somewhat of a waste to have to
recompile all of that every time... But looking at any package
maintainer's history, rebuilds are triggered from time to time
because of changes in dependent libraries like OpenSSL. Take this one
for example:
https://git.archlinux.org/svntogit/packages.git/log/trunk?h=packages/postgresql
So perhaps the best answer is neither 1 nor 2: just say that the
routines are carefully maintained on a best-effort basis, though
sometimes you may need to rebuild because of unavoidable changes in
routine signatures that had to be introduced.
--
Michael
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
"Tsunakawa, Takayuki" <tsunakawa.takay@jp.fujitsu.com> writes:
Option 2:
Rebuild UDFs with the target PostgreSQL distribution.
You do not have to rebuild UDFs when you upgrade or downgrade the
minor release. (If your UDF doesn't work after changing the minor
release, it's the bug of PostgreSQL. You can report it to
pgsql-bugs.)
I do not like either of those. We try hard not to break extensions in minor
releases, but I'm not willing to state it as a hard-and-fast policy that
we never will --- especially because there's no bright line as to which
internal APIs extensions can rely on or not. With sufficiently negative
assumptions about what third-party authors might have chosen to do, it could
become impossible to fix anything at all in released branches.
I sympathize, but I think something needs to be documented so that users can upgrade and/or change distributions with peace of mind. In practice, though it may be a shame, isn't option 1 the current answer?
Again, the current situation seems similar to the Linux loadable kernel modules. So PostgreSQL is not alone. See "Binary compatibility" section in:
https://en.wikipedia.org/wiki/Loadable_kernel_module
In practice, extensions seldom need to be modified for new minor releases.
But there's a long way between that statement and a promise that it won't
ever happen for any conceivable extension.
I think so, too.
To make this situation better, what we'd really need is a bunch of work
to identify and document the specific APIs that we would promise won't change
within a release branch. That idea has been batted around before, but
nobody's stepped up to do all the tedious (and, no doubt, contentious) work
that would be involved.
I can't yet imagine whether such an API (including data structures) can really be defined so that UDF developers feel comfortable with its flexibility. I wonder how other OSes provide such an API and ABI.
Regards
Takayuki Tsunakawa
On Fri, Jul 1, 2016 at 12:19 PM, Tsunakawa, Takayuki
<tsunakawa.takay@jp.fujitsu.com> wrote:
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
To make this situation better, what we'd really need is a bunch of work
to identify and document the specific APIs that we would promise won't change
within a release branch. That idea has been batted around before, but
nobody's stepped up to do all the tedious (and, no doubt, contentious) work
that would be involved.
I can't yet imagine if such API (including data structures) can really be defined so that UDF developers feel comfortable with its flexibility. I wonder how other OSes provide such API and ABI.
That would be a lot of work for little result. And in the end, zero
risk does not exist and things may change. I still quite like the
answer being a mix of 1 and 2: we do our best to keep the backend
APIs stable, but be aware that things may break if a change proves
to be necessary.
--
Michael
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Michael Paquier
So perhaps the best answer, is not 1 nor 2. Just saying that the routines
are carefully maintained with a best effort, though sometimes you may need
to rebuild depending on unavoidable changes in routine signatures that had
to be introduced.
Good, I'd like to use that "mild" expression in the manual. Although the expression is mild, the reality for users is not, is it?
Because the UDF developers and users cannot easily or correctly determine if rebuilding is necessary, nervous (enterprise) users will rebuild their UDFs with each minor release for the maximum safety as Michael does.
Regards
Takayuki Tsunakawa
On 1 July 2016 at 08:33, Tsunakawa, Takayuki <tsunakawa.takay@jp.fujitsu.com> wrote:
Hello,
While I was thinking of application binary compatibility between
PostgreSQL releases, some questions arose about C language user-defined
functions (UDFs) and extensions that depend on them.
[Q1]
Can the same UDF binary be used with different PostgreSQL minor releases?
If it is, is it a defined policy (e.g. written somewhere in the manual,
wiki, documentation in the source code)?
For example, suppose you build a UDF X (some_extension.so/dll) with
PostgreSQL 9.5.0. Can I use the binary with PostgreSQL 9.5.x without
rebuilding?
Probably - but we don't guarantee it.
There's no formal extension API. So there's no boundary between "internal
stuff we might have to change to fix a problem" and "things extensions can
rely on not changing under them". In theory anything that changed behaviour
or changed a header file in almost any way could break an extension.
There's no deliberate breakage and some awareness of possible consequences
to extensions, but no formal process. I would prefer that the manual
explicitly recommend recompiling extensions against each minor update (or
updating them along with the packages), and advise that packagers make
their extensions depend on an = minor version in their package
specifications, not a >= .
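The difference between an "=" and a ">=" dependency can be sketched with version numbers in the PG_VERSION_NUM style used by 9.x releases, where 9.5.3 is 90503 and the value divided by 100 identifies the branch. The function names below are hypothetical, and real package managers express this in their own dependency syntax rather than in C.

```c
/* Branch identifier for 9.x-style numbers: 90503 -> 905. */
int
branch_of(int version_num)
{
    return version_num / 100;
}

/*
 * ">=" style dependency: any later minor release on the same branch is
 * accepted, so the extension may run against a server it was never
 * compiled with.
 */
int
dep_at_least(int built_against, int running)
{
    return branch_of(built_against) == branch_of(running)
        && running >= built_against;
}

/*
 * "=" style dependency: only the exact minor release the extension was
 * built against is accepted, which forces a rebuild on every update.
 */
int
dep_exact(int built_against, int running)
{
    return built_against == running;
}
```

For an extension built against 9.5.0 and a server at 9.5.3, dep_at_least accepts the pair and dep_exact rejects it. The second behaviour is the one recommended above.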
However, in practice it's fine almost all the time.
I think making this more formal would require, as Tom noted, a formal
extension API we can promise to maintain, likely incorporating:
- fmgr
- datatype functions and macros
- elog and other core infrastructure
- major shmem structures
- GUC variables
- plan nodes and command structs
- SPI
- replication origins
- bgworkers
- catalog definitions
- ... endlessly more
To actually ensure extensions conform to the API we'd probably have to
build with -fvisibility=hidden (gcc) and on Windows change our .def
generation, so we don't expose anything that's not part of the formal API.
That's a very strict boundary though; there's no practical way an extension
can say "I know what I'm doing, gimme the internals anyway" and reach
through it. I'd prefer a soft boundary that spat warnings when you touch
stuff you're not allowed to, but I don't know of any good way to do that
that works across multiple compilers and toolchains.
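A minimal sketch of the hard boundary described above, assuming GCC or a compatible compiler. EXT_API and EXT_INTERNAL are hypothetical macro names, not anything PostgreSQL defines. When the library is built with -fvisibility=hidden, only symbols explicitly marked as default-visible are exported.

```c
/* Hypothetical export markers; they expand to nothing on other compilers. */
#if defined(__GNUC__)
#define EXT_API      __attribute__((visibility("default")))
#define EXT_INTERNAL __attribute__((visibility("hidden")))
#else
#define EXT_API
#define EXT_INTERNAL
#endif

/* Not exported from a shared library: callers outside it cannot link to it. */
EXT_INTERNAL int
internal_helper(int x)
{
    return x * 2;
}

/* Part of the imaginary formal API: exported and callable by extensions. */
EXT_API int
public_entry(int x)
{
    return internal_helper(x) + 1;
}
```

Inside one program both functions are callable as usual. The attribute only matters at the shared-library boundary, where a reference to internal_helper from another module would fail to resolve, so there is no way for an extension to opt in to the internals.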
We'd almost certainly have to allow ourselves to _expand_ the API in minor
releases since otherwise the early introduction of the formal API would be
a nightmare. That's fine on pretty much every platform though.
The main thing is that it's a great deal of work for limited benefit. I
don't know about you, but I'm not keen.
Can the same UDF binary be used with different PostgreSQL distributions
(EnterpriseDB, OpenSCG, RHEL packages, etc.)? Or should the UDF be built
with the target distribution?
Not especially safely.
If you verified that all the compiler flags were the same and your
extension doesn't transitively reference bundled libraries that might be
different and incompatible versions (notably gettext, which Pg exposes in
its own headers) ... you're probably OK.
Again, in practice it generally works, but I wouldn't recommend it. Nor is
this something we can easily address with an extension API policy.
How about other distributions which probably don't modify the source
code? Should the UDF be built with the target PostgreSQL because configure
options may differ, which affects data structures?
Yeah. And exposed ABI. I don't recommend it.
It's probably safe-ish on MS Windows, which is designed to allow greater
compatibility between executables built with differing toolchains and
options. I wouldn't do it on any unix.
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Craig Ringer
There's no formal extension API. So there's no boundary between "internal stuff we might have to change to fix a problem" and "things extensions can rely on not changing under them". In theory anything that changed behaviour or changed a header file in almost any way could break an extension.
There's no deliberate breakage and some awareness of possible consequences to extensions, but no formal process. I would prefer that the manual explicitly recommend recompiling extensions against each minor update (or updating them along with the packages), and advise that packagers make their extensions depend on an = minor version in their package specifications, not a >= .
Yes, I think such recommendation in the manual is the best.
However, in practice it's fine almost all the time.
Maybe most extensions don’t use sensitive parts of the server…
I think making this more formal would require, as Tom noted, a formal extension API we can promise to maintain, likely incorporating:
- ... endlessly more
Endless (^^;)
The main thing is that it's a great deal of work for limited benefit. I don't know about you, but I'm not keen.
I’m not keen, either… I don’t think I can form an API that advanced extension developers will be satisfied with. I’ll just document the compatibility article in the upgrade section.
Regards
Takayuki Tsunakawa