Idempotent DDL Updates
What is the general consensus within the community on idempotent DDL
scripts, ie. consistent usage of IF EXISTS/IF NOT EXISTS/OR REPLACE for SQL
init files that get checked into source control?
My experience has been that it's much easier to manage than an ever-growing
set of migration files, but I'm just a data point of one. Obviously for
other engines that don't support transactional DDL, it's a non-starter,
which leads me toward the notion that its lack of obvious popularity is due
to its limited cross-engine viability. But PG does have transaction DDL,
preventing incomplete DDL updates. However this may just be my personal
bias talking. Yet I cannot seem to discount the advantages over
ever-increasing volumes of Flyway-style migration files & directories:
* being able to do diffs of changes in source control
* limiting the size of the total SQL over time relative to what's actually
in the DB structure
* much more easily determining the most current version of a
function/procedure (this bit me hard in the past with dozens of migration
files)
* the ability to prune old changes that no longer apply to any deployed
version of the database
* treating database structure as code
The drawbacks I've run across are those areas where the EXISTS/REPLACE
constructs aren't implemented like roles management, domains, constraints,
etc. However those cases seem to be handled with only minor increases in
complexity with judicious use of inline plpgsql.
In others' opinions, has DDL idempotency been viable for maintenance of PG
databases fo you in production?
- Miles Elam
On 8/27/21 11:19 AM, Miles Elam wrote:
What is the general consensus within the community on idempotent DDL
scripts, ie. consistent usage of IF EXISTS/IF NOT EXISTS/OR REPLACE for
SQL init files that get checked into source control?My experience has been that it's much easier to manage than an
ever-growing set of migration files, but I'm just a data point of one.
Obviously for other engines that don't support transactional DDL, it's a
non-starter, which leads me toward the notion that its lack of obvious
popularity is due to its limited cross-engine viability. But PG does
have transaction DDL, preventing incomplete DDL updates. However this
may just be my personal bias talking. Yet I cannot seem to discount the
advantages over ever-increasing volumes of Flyway-style migration files
& directories:
* being able to do diffs of changes in source control
* limiting the size of the total SQL over time relative to what's
actually in the DB structure
* much more easily determining the most current version of a
function/procedure (this bit me hard in the past with dozens of
migration files)
* the ability to prune old changes that no longer apply to any
deployed version of the database
* treating database structure as codeThe drawbacks I've run across are those areas where the EXISTS/REPLACE
constructs aren't implemented like roles management, domains,
constraints, etc. However those cases seem to be handled with only minor
increases in complexity with judicious use of inline plpgsql.In others' opinions, has DDL idempotency been viable for maintenance of
PG databases fo you in production?
For me at least you will need to show examples of what you trying to
achieve. I'm not seeing how a migration(change) can happen without a
change of some sort. More to the point how *EXISTS/OR REPLACE helps?
- Miles Elam
--
Adrian Klaver
adrian.klaver@aklaver.com
On Aug 27, 2021, at 1:32 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 8/27/21 11:19 AM, Miles Elam wrote:
What is the general consensus within the community on idempotent DDL scripts, ie. consistent usage of IF EXISTS/IF NOT EXISTS/OR REPLACE for SQL init files that get checked into source control?
My experience has been that it's much easier to manage than an ever-growing set of migration files, but I'm just a data point of one. Obviously for other engines that don't support transactional DDL, it's a non-starter, which leads me toward the notion that its lack of obvious popularity is due to its limited cross-engine viability. But PG does have transaction DDL, preventing incomplete DDL updates. However this may just be my personal bias talking. Yet I cannot seem to discount the advantages over ever-increasing volumes of Flyway-style migration files & directories:
For production or dev environments?
The latter is made easier if the baseline moves forward regularly. For the former the sticky bit possibly has as much to do with data migration as DDL evolution, especially if there are multiple instances (many customers)
Show quoted text
* being able to do diffs of changes in source control
* limiting the size of the total SQL over time relative to what's actually in the DB structure
* much more easily determining the most current version of a function/procedure (this bit me hard in the past with dozens of migration files)
* the ability to prune old changes that no longer apply to any deployed version of the database
* treating database structure as code
The drawbacks I've run across are those areas where the EXISTS/REPLACE constructs aren't implemented like roles management, domains, constraints, etc. However those cases seem to be handled with only minor increases in complexity with judicious use of inline plpgsql.
In others' opinions, has DDL idempotency been viable for maintenance of PG databases fo you in production?For me at least you will need to show examples of what you trying to achieve. I'm not seeing how a migration(change) can happen without a change of some sort. More to the point how *EXISTS/OR REPLACE helps?
- Miles Elam
--
Adrian Klaver
adrian.klaver@aklaver.com
On Sat, Aug 28, 2021 at 2:19 AM Miles Elam <miles.elam@productops.com> wrote:
What is the general consensus within the community on idempotent DDL scripts, ie. consistent usage of IF EXISTS/IF NOT EXISTS/OR REPLACE for SQL init files that get checked into source control?
[...]
The drawbacks I've run across are those areas where the EXISTS/REPLACE constructs aren't implemented like roles management, domains, constraints, etc. However those cases seem to be handled with only minor increases in complexity with judicious use of inline plpgsql.In others' opinions, has DDL idempotency been viable for maintenance of PG databases fo you in production?
Note that the IF EXISTS / IF NOT EXISTS are *not* idempotent. If you
need to write idempotent schema update scripts, you need to query the
catalogs to check if the specific change you want to apply has already
been applied or not.
On Fri, Aug 27, 2021 at 7:14 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
Note that the IF EXISTS / IF NOT EXISTS are *not* idempotent. If you
need to write idempotent schema update scripts, you need to query the
catalogs to check if the specific change you want to apply has already
been applied or not.
Poor choice of words. You're absolutely right. The goal is for the script
to be idempotent, not that individual statements like that are idempotent.
For example, adding ADD COLUMN foo IF NOT EXISTS and DROP COLUMN… to the
script in addition to CREATE TABLE IF NOT EXISTS statements so that the end
result is always the same column definitions no matter how often the script
is run.
Eventually the individual ADD and DROP COLUMN statements can be removed
once all databases are up to date.
Not sure that querying the catalogs is strictly necessary though… Could you
say more?
This removes the ability to have "down" migration scripts, but I'll be
honest, I've never actually used a "down" script in production. If the "up"
script failed for some reason, the validity of the logic in the "down"
script is immediately suspect. It's always a new "up" script to fix the
problem. That's leaving aside the issue of "down" scripts not getting
anywhere near the same level of scrutiny and testing as "up" migration
scripts get.
- Miles
On 8/30/21 8:56 AM, Miles Elam wrote:
On Fri, Aug 27, 2021 at 7:14 PM Julien Rouhaud <rjuju123@gmail.com
<mailto:rjuju123@gmail.com>> wrote:Note that the IF EXISTS / IF NOT EXISTS are *not* idempotent. If you
need to write idempotent schema update scripts, you need to query the
catalogs to check if the specific change you want to apply has already
been applied or not.Poor choice of words. You're absolutely right. The goal is for the
script to be idempotent, not that individual statements like that are
idempotent.For example, adding ADD COLUMN foo IF NOT EXISTS and DROP COLUMN… to the
script in addition to CREATE TABLE IF NOT EXISTS statements so that the
end result is always the same column definitions no matter how often the
script is run.Eventually the individual ADD and DROP COLUMN statements can be removed
once all databases are up to date.Not sure that querying the catalogs is strictly necessary though… Could
you say more?This removes the ability to have "down" migration scripts, but I'll be
honest, I've never actually used a "down" script in production. If the
"up" script failed for some reason, the validity of the logic in the
"down" script is immediately suspect. It's always a new "up" script to
fix the problem. That's leaving aside the issue of "down" scripts not
getting anywhere near the same level of scrutiny and testing as "up"
migration scripts get.
I think you need to investigate Sqitch:
When working on dev database I run the deploy(up) script and then the
revert(down) every time I do a change to make sure it does work. Not
only that I routinely revert back to some previous state. Helped by
Sqitch tags that allow you set a marker in your change history. I'm
going to say that if you spend some time with the documentation you will
find that Sqitch is the scratch that eliminates your itch:)
- Miles
--
Adrian Klaver
adrian.klaver@aklaver.com
On Mon, Aug 30, 2021 at 11:56 PM Miles Elam <miles.elam@productops.com> wrote:
Not sure that querying the catalogs is strictly necessary though… Could you say more?
I meant for anything that doesn't have an IF [NOT] EXISTS, including
cases where such a clause wouldn't be possible. For instance if you
have to provide an upgrade script that change a column datatype.
As Adrian mentioned, sqitch should provide a sensible framework for
that, but there may be similar tools more suitable for your
environment if you're already using other frameworks.