Syncing sql extension versions with shared library versions
Hi All,
I am developing the TimescaleDB extension for postgres (
https://github.com/timescale/timescaledb) and have some questions about
versioning. First of all, I have to say that the versioning system on the
sql side is wonderful. It's really simple to write migrations etc.
However when thinking through the implications of having a database cluster
with databases having different extension versions installed, it was not
immediately clear to me how to synchronize the installed extension version
with a shared library version. For example, if I have timescaledb version
0.1.0 in one db and version 0.2.0 in another db, I'd like for
timescaledb-0.1.0.so and timescaledb-0.2.0.so to be used, respectively. (I
want to avoid having to keep backwards-compatibility for all functions in
future shared-libraries). In our case, this is further complicated by the
fact that we need to preload the shared library since we are accessing the
planner hooks etc. Below, I'll describe some solutions I have been thinking
about, but wanted to hear if anyone else on this list has already solved
this problem and has some insight. It is also quite possible I am missing
something.
Issue 1: Preloading the right shared library.
When preloading libraries (via local_preload_libraries,
session_preload_libraries, or shared_preload_libraries), it would be nice to
preload the shared library according to its version. But I looked through
the code and found no logic for adding version numbers to shared library
names.
Solution 1: Set session_preload_libraries on the database via ALTER
DATABASE SET. This can be done in the SQL, and the SQL version-migration
scripts can change this value as you change extension versions. I think
this would work, but it seems very hack-ish and less-than-ideal.
Solution 2: Create some kind of stub shared-library that will, in turn,
load another shared library of the correct version. This seems like the
cleaner approach. Has anybody seen/implemented something like this already?
Issue 2: module_pathname
I believe that for user-defined functions the MODULE_PATHNAME substitution
will not work since that setting is set once per-extension. Thus, for
example, the migration scripts that include function definitions for older
versions would use the latest .so file if MODULE_PATHNAME was used in the
definition. This would be a problem if upgrading to an intermediate (not
latest) version.
Solution: MODULE_PATHNAME cannot be used, and we should build our own
templating/makefile infrastructure to link files to the right-versioned
shared library in the CREATE FUNCTION definition.
Thanks a lot in advance,
Mat Arye
http://www.timescale.com/
On 07/21/2017 04:17 PM, Mat Arye wrote:
Solution: MODULE_PATHNAME cannot be used, and we should build our own
templating/makefile infrastructure to link files to the
right-versioned shared library in the CREATE FUNCTION definition.
It would be nice if we could teach the load mechanism to expand a
version escape in the MODULE_PATHNAME, e.g.
MODULE_PATHNAME = '$libdir/foo-$version'
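A minimal sketch of what such an expansion might look like, written as standalone C rather than as a patch to PostgreSQL. The helper name expand_version and its signature are hypothetical; PostgreSQL's loader currently expands only $libdir, so nothing like this exists in the server today.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a PostgreSQL API): copy `tmpl` into `out`,
 * replacing every "$version" with `version`. Other text, including
 * "$libdir", is copied through unchanged.
 * Returns 0 on success, -1 if `out` is too small.
 */
static int
expand_version(const char *tmpl, const char *version, char *out, size_t outsize)
{
    static const char token[] = "$version";
    const size_t token_len = sizeof(token) - 1;
    size_t version_len;
    size_t used = 0;

    assert(tmpl != NULL && version != NULL && out != NULL);
    if (outsize == 0)
        return -1;
    version_len = strlen(version);

    while (*tmpl != '\0')
    {
        if (strncmp(tmpl, token, token_len) == 0)
        {
            /* Need room for the version string plus the final NUL. */
            if (used + version_len >= outsize)
                return -1;
            memcpy(out + used, version, version_len);
            used += version_len;
            tmpl += token_len;
        }
        else
        {
            /* Need room for this character plus the final NUL. */
            if (used + 1 >= outsize)
                return -1;
            out[used++] = *tmpl++;
        }
    }
    out[used] = '\0';
    return 0;
}
```

With the template above and an extension version of 0.2.0, the helper would produce $libdir/foo-0.2.0, leaving $libdir for the existing mechanism to resolve.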
cheers
andrew
--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Jul 21, 2017 at 4:17 PM, Mat Arye <mat@timescaledb.com> wrote:
(I want to avoid having to keep backwards-compatibility for all functions in
future shared-libraries).
Are you sure that's a good idea? It seems like swimming upstream
against the design. I mean, instead of creating a dispatcher library
that loads either v1 or v2, maybe you could just put it all in one
library, add a "v1" or "v2" suffix to the actual function names where
appropriate, and then set up the SQL definitions to call the correct
one. I mean, it's the same thing, but with less chance of the dynamic
loader ruining your day.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
On Fri, Jul 21, 2017 at 4:17 PM, Mat Arye <mat@timescaledb.com> wrote:
(I want to avoid having to keep backwards-compatibility for all functions in
future shared-libraries).
Are you sure that's a good idea? It seems like swimming upstream
against the design. I mean, instead of creating a dispatcher library
that loads either v1 or v2, maybe you could just put it all in one
library, add a "v1" or "v2" suffix to the actual function names where
appropriate, and then set up the SQL definitions to call the correct
one. I mean, it's the same thing, but with less chance of the dynamic
loader ruining your day.
Worth noting also is that we have a fair amount of experience now with
handling API changes in contrib modules. It's worth looking through
the update histories of the contrib modules that have shipped multiple
versions to see how they dealt with such issues. As Robert suggests,
it's just not that hard; usually a few shim functions in the C code will
do the trick.
I'd also point out that while you may think you don't need to keep
backwards compatibility across versions, your users are probably
going to think differently. The amount of practical freedom you'd
gain here is probably not so much as you're hoping.
regards, tom lane
On 22 Jul. 2017 04:19, "Mat Arye" <mat@timescaledb.com> wrote:
Issue 1: Preloading the right shared library.
When preloading libraries (via local_preload_libraries,
session_preload_libraries, or shared_preload_libraries), it would be nice to
preload the shared library according to its version. But I looked through
the code and found no logic for adding version numbers to shared library
names.
You can't do that for shared_preload_libraries, because at
shared_preload_libraries time we don't have access to the DB and can't look
up the installed extension version(s). There might be different ones in
different DBs too.
Solution 1: Set session_preload_libraries on the database via ALTER
DATABASE SET. This can be done in the SQL, and the SQL version-migration
scripts can change this value as you change extension versions. I think
this would work, but it seems very hack-ish and less-than-ideal.
Won't work for some hooks, right?
I've faced this issue with pglogical and BDR. If the user tries to update
the extension before a new enough .so is loaded we ERROR due to failure to
load missing C functions.
If the .so is updated first the old extension function definitions can fail
at runtime if funcs are removed or change signature, but won't fail at
startup or load.
So we let the C extension detect when it's newer than the loaded SQL ext
during its startup and run an ALTER EXTENSION to update it.
We don't attempt to support downgrades.
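The "detect when the library is newer than the SQL extension" check can be sketched as a plain version comparison. This is only an illustration under stated assumptions: the function names are hypothetical, versions are assumed to have the form major.minor.patch, and a real extension would read the installed version from the pg_extension catalog and then issue the ALTER EXTENSION itself, which is not shown here.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical: parse "major.minor.patch" into three ints.
 * Returns 0 on success, -1 if the string has any other shape.
 */
static int
parse_version(const char *s, int v[3])
{
    char trailing;

    assert(s != NULL);
    /* The trailing %c must NOT match, so input such as "1.2.3x" is rejected. */
    return sscanf(s, "%d.%d.%d%c", &v[0], &v[1], &v[2], &trailing) == 3 ? 0 : -1;
}

/* Returns a negative, zero, or positive value, in the manner of strcmp(). */
static int
compare_versions(const int a[3], const int b[3])
{
    for (int i = 0; i < 3; i++)
    {
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    }
    return 0;
}

/*
 * Returns 1 if the loaded library is newer than the SQL-level extension
 * (so an update would be attempted), 0 if it is not, -1 on a parse error.
 * Downgrades are deliberately not treated as something to act on.
 */
static int
library_is_newer(const char *lib_version, const char *sql_version)
{
    int lib[3];
    int sql[3];

    if (parse_version(lib_version, lib) != 0 ||
        parse_version(sql_version, sql) != 0)
        return -1;
    return compare_versions(lib, sql) > 0;
}
```

Comparing numerically rather than as strings matters once a component reaches two digits: 0.10.0 is newer than 0.9.0, which a plain strcmp() would get wrong.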
On Sat, Jul 22, 2017 at 10:50 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Jul 21, 2017 at 4:17 PM, Mat Arye <mat@timescaledb.com> wrote:
(I want to avoid having to keep backwards-compatibility for all functions in
future shared-libraries).
Are you sure that's a good idea?
No :). But most of our code is not
user-called functions (planner/other hooks etc.). It seems dangerous to
update that code in the .so and have it possibly affect customers that are
using old versions of the extension. While it's possible to do that kind of
_v1 suffix code for planner functions as well, this seems like a nightmare
in terms of code maintenance (we already have 1000s of lines of C code). I
think a dynamic loader might be more work upfront but have major payoffs
for speed of development in the long term for us. It may also have
advantages in terms of update safety. It's also worth noting that our C
code has some SPI upcalls, so keeping some sync between the sql and C code
is even more of an issue for us (if we can't make the dynamic/stub loader
approach work, this might be an anti-pattern and we may have to review
doing upcalls at all).
(adding -hackers back into thread, got dropped by my email client, sorry)
On Mon, Jul 24, 2017 at 1:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Mat Arye <mat@timescaledb.com> writes:
I tried looking in the contrib modules and didn't find many with lots of
planner hook usage.
I'm not really sure why planner hooks would have anything to do with your
exposed SQL API?
Sorry, what I meant was I'd like to package different versions of my
extension -- both .sql and .c --
and have the extension act consistently for any version until I do an ALTER
EXTENSION UPDATE.
So, I'd prefer a DB with an older extension to have the logic/code in the
hook not change even if I install a newer version's .so for use in another
database
(but don't update the extension to the newer version). Does that make any
sense?
You will need to have separate builds of your extension for each PG
release branch you work with; we force that through PG_MODULE_MAGIC
whether you like it or not. But that doesn't translate to needing
different names for the library .so files. Generally people either
maintain separate source code per-branch (just as the core code does)
or put in a lot of #ifs testing CATALOG_VERSION_NO to see which
generation of PG they're compiling against.
Yeah we plan to use different branches for different PG versions.
Mat Arye <mat@timescaledb.com> writes:
On Mon, Jul 24, 2017 at 1:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I'm not really sure why planner hooks would have anything to do with your
exposed SQL API?
Sorry, what I meant was I'd like to package different versions of my
extension -- both .sql and .c --
and have the extension act consistently for any version until I do an ALTER
EXTENSION UPDATE.
So, I'd prefer a DB with an older extension to have the logic/code in the
hook not change even if I install a newer version's .so for use in another
database
(but don't update the extension to the newer version). Does that make any
sense?
The newer version's .so simply is not going to load into the older
version; we intentionally prevent that from happening. It's not necessary
anyway because versions do not share library directories. Therefore,
you can have foo.so for 9.5 in your 9.5 version's library directory,
and foo.so for 9.6 in your 9.6 version's library directory, and the
filesystem will keep them straight for you. It is not necessary to
call them foo-9.5.so and foo-9.6.so.
As for the other point, the usual idea is that if you have a
SQL-accessible C function xyz() that needs to behave differently after an
extension version update, then you make the extension update script point
the SQL function to a different library entry point. If your 1.0
extension version originally had
    CREATE FUNCTION xyz(...) RETURNS ...
    LANGUAGE C AS 'MODULE_PATHNAME', 'xyz';
(note that the second part of the AS clause might have been implicit;
no matter), then your update script for version 1.1 could do
    CREATE OR REPLACE FUNCTION xyz(...) RETURNS ...
    LANGUAGE C AS 'MODULE_PATHNAME', 'xyz_1_1';
Then in the 1.1 version of the C code, the xyz_1_1() C function provides
the new behavior, while the xyz() C function provides the old behavior,
or maybe just throws an error if you conclude it's impractical to emulate
the old behavior exactly. As I mentioned earlier, you can usually set
things up so that you can share much of the code between two such
functions.
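The arrangement described above, with two entry points sharing most of their code, can be sketched in plain C. This is an illustration only: the arithmetic is invented, and a real extension would declare each entry point with PG_FUNCTION_INFO_V1 and the Datum-based calling convention instead of plain int arguments.

```c
#include <assert.h>

/*
 * Shared implementation. The flag selects between the old and the new
 * behavior, so the bulk of the logic lives in one place.
 * (Hypothetical example: a percentage that either truncates or rounds.)
 */
static int
xyz_internal(int part, int total, int round_to_nearest)
{
    int scaled;

    assert(part >= 0 && total > 0);
    scaled = 100 * part;
    if (round_to_nearest)
        scaled += total / 2;
    return scaled / total;
}

/* Entry point named by the 1.0 SQL definition: old behavior (truncate). */
int
xyz(int part, int total)
{
    return xyz_internal(part, total, 0);
}

/* Entry point named by the 1.1 update script: new behavior (round). */
int
xyz_1_1(int part, int total)
{
    return xyz_internal(part, total, 1);
}
```

A database still on 1.0 keeps calling xyz() and sees the old result, while a database that has run the 1.1 update script calls xyz_1_1(), even though both use the same library file.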
The pgstattuple C function in contrib/pgstattuple is one example of
having changed a C function's behavior in this way over multiple versions.
regards, tom lane
On Mon, Jul 24, 2017 at 6:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Mat Arye <mat@timescaledb.com> writes:
On Mon, Jul 24, 2017 at 1:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I'm not really sure why planner hooks would have anything to do with your
exposed SQL API?
Sorry, what I meant was I'd like to package different versions of my
extension -- both .sql and .c --
and have the extension act consistently for any version until I do an ALTER
EXTENSION UPDATE.
So, I'd prefer a DB with an older extension to have the logic/code in the
hook not change even if I install a newer version's .so for use in another
database
(but don't update the extension to the newer version). Does that make any
sense?
The newer version's .so simply is not going to load into the older
version; we intentionally prevent that from happening. It's not necessary
anyway because versions do not share library directories. Therefore,
you can have foo.so for 9.5 in your 9.5 version's library directory,
and foo.so for 9.6 in your 9.6 version's library directory, and the
filesystem will keep them straight for you. It is not necessary to
call them foo-9.5.so and foo-9.6.so.
I meant the extension version, not the PG version. Let me try to explain:
If version 0.1.0 has optimization A in the planner hook, and 0.2.0 has
optimization B, I'd like the property that even if I install foo-0.2.0.so
(and also have foo-0.1.0.so) in the cluster, any database that has not done
an ALTER EXTENSION UPDATE will still do A, while any databases that have
updated the extension will do B. I'd also like to avoid doing a bunch of
if/else statements to make this happen. But that's the ideal; I'm not sure
if I can make this happen.
As for the other point, the usual idea is that if you have a
SQL-accessible C function xyz() that needs to behave differently after an
extension version update, then you make the extension update script point
the SQL function to a different library entry point. If your 1.0
extension version originally had
    CREATE FUNCTION xyz(...) RETURNS ...
    LANGUAGE C AS 'MODULE_PATHNAME', 'xyz';
(note that the second part of the AS clause might have been implicit;
no matter), then your update script for version 1.1 could do
    CREATE OR REPLACE FUNCTION xyz(...) RETURNS ...
    LANGUAGE C AS 'MODULE_PATHNAME', 'xyz_1_1';
Then in the 1.1 version of the C code, the xyz_1_1() C function provides
the new behavior, while the xyz() C function provides the old behavior,
or maybe just throws an error if you conclude it's impractical to emulate
the old behavior exactly. As I mentioned earlier, you can usually set
things up so that you can share much of the code between two such
functions.
Thanks for that explanation. It's clear now.
(Re-added hackers to Cc as this doesn't seem private, just accidentally
didn't reply-all?)
On 24 July 2017 at 23:50, Mat Arye <mat@timescaledb.com> wrote:
Issue 1: Preloading the right shared library.
When preloading libraries (via local_preload_libraries,
session_preload_libraries, or shared_preload_libraries), it would be nice to
preload the shared library according to its version. But I looked through
the code and found no logic for adding version numbers to shared library
names.
You can't do that for shared_preload_libraries, because at
shared_preload_libraries time we don't have access to the DB and can't look
up the installed extension version(s). There might be different ones in
different DBs too.
Yeah shared_preload_libraries is a special case I guess. Something like
that could work with local_preload_libraries or session_preload_libraries
right?
It could work, but since it doesn't offer a complete solution I don't think
it's especially compelling.
Solution 1: Set session_preload_libraries on the database via ALTER
DATABASE SET. This can be done in the SQL, and the SQL version-migration
scripts can change this value as you change extension versions. I think
this would work, but it seems very hack-ish and less-than-ideal.
Won't work for some hooks, right?
I've faced this issue with pglogical and BDR. If the user tries to update
the extension before a new enough .so is loaded we ERROR due to failure to
load missing C functions.
This is a good point. Thanks for bringing it to my attention. I guess if
the CREATE FUNCTION call contained the name of the new .so then it would
force a load, right? But that means you need to be safe with regard to
having both .so files loaded at once (not sure that's possible). I think
this is the biggest unknown in terms of whether a stub-loader /can/ work.
Unless both .so's have different filenames, you can't have both loaded in
the same backend. Though if you unlink and replace the .so with the same
file name while Pg is running, different backends could have different
versions loaded.
If you do give them different names and they both get linked into one
backend, whether it works will depend on details of linker options, etc. I
wouldn't want to do it personally, at least not unless I prefixed all the
.so's exported symbols. If you're not worried about being portable it's
less of a concern.
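One way to prefix every exported symbol per version is a token-pasting macro, sketched below. The macro names, the version tag, and the bucket_count function are all hypothetical; the assumption is that the build system defines EXT_VERSION_TAG once per library version.

```c
#include <assert.h>

/* Two-level expansion so EXT_VERSION_TAG is expanded before pasting. */
#define EXT_PASTE(a, b) a##b
#define EXT_PASTE2(a, b) EXT_PASTE(a, b)

/* Hypothetical per-build tag; normally supplied by the build system. */
#ifndef EXT_VERSION_TAG
#define EXT_VERSION_TAG v0_2_0_
#endif

/* EXT_SYM(foo) becomes v0_2_0_foo for this build. */
#define EXT_SYM(name) EXT_PASTE2(EXT_VERSION_TAG, name)

/*
 * Declared as EXT_SYM(bucket_count), exported as v0_2_0_bucket_count,
 * so it cannot collide with the same function from another version's .so.
 * Computes how many buckets are needed to hold `rows` rows.
 */
int
EXT_SYM(bucket_count)(int rows, int rows_per_bucket)
{
    assert(rows >= 0 && rows_per_bucket > 0);
    return (rows + rows_per_bucket - 1) / rows_per_bucket;
}
```

The SQL-level CREATE FUNCTION definitions for that version would then have to name the prefixed symbol, which is part of the maintenance cost of this approach.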
Personally I just make sure to retain stub functions in the C extension for
anything removed. It's trivial clutter, easily swept into a corner in a
backward compat file.
If the .so is updated first the old extension function definitions can
fail at runtime if funcs are removed or change signature, but won't fail at
startup or load.
So we let the C extension detect when it's newer than the loaded SQL ext
during its startup and run an ALTER EXTENSION to update it.
Yeah that's very similar to what we do now. It doesn't work for multiple
dbs having different extension versions, though (at least for us).
Makes sense. Not a case I have ever cared to support.
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services