Best Approach for Swapping a Table with its Copy

Started by Marcelo Fernandes · about 1 year ago · 23 messages · general
#1 Marcelo Fernandes
marcefern7@gmail.com

Hi folks,

I have a scenario where I need to swap an original table with a copy of that
table.

The copy has an exclusion constraint that the original does not have. The main
challenge is to ensure that the swap does not break any existing foreign keys
to the original table and handles the associated TOAST tables correctly.

Both tables are fairly large and exist in a system where there are no
maintenance time windows where the application is shut down.

My key questions are:

- What is the best process for swapping the original table with the copy in
such a way that the foreign key relations are preserved?

- Are there any special considerations for managing the TOAST tables during
this swap?

- Should I perform this operation in multiple steps, or is there a
straightforward way to achieve this atomically?

- Are there any risks of potential issues I should be aware of when doing this
swap? Specifically related to foreign key integrity and TOAST data?

Thank you!
- Marcelo

#2 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Marcelo Fernandes (#1)
Re: Best Approach for Swapping a Table with its Copy

On 2/12/25 12:57 PM, Marcelo Fernandes wrote:

Hi folks,

I have a scenario where I need to swap an original table with a copy of that
table.

The copy has an exclusion constraint that the original does not have. The main
challenge is to ensure that the swap does not break any existing foreign keys
to the original table and handles the associated TOAST tables correctly.

Both tables are fairly large and exist in a system where there are no
maintenance time windows where the application is shut down.

This needs more information:

1) Postgres version.

2) The table definition.

3) The exclusion constraint definition.

4) Definition of what 'fairly large' is.

5) How is the application interfacing with the database?

My key questions are:

- What is the best process for swapping the original table with the copy in
such a way that the foreign key relations are preserved?

- Are there any special considerations for managing the TOAST tables during
this swap?

- Should I perform this operation in multiple steps, or is there a
straightforward way to achieve this atomically?

- Are there any risks of potential issues I should be aware of when doing this
swap? Specifically related to foreign key integrity and TOAST data?

Thank you!
- Marcelo

--
Adrian Klaver
adrian.klaver@aklaver.com

#3 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Marcelo Fernandes (#1)
Re: Best Approach for Swapping a Table with its Copy

On 2/12/25 12:57 PM, Marcelo Fernandes wrote:

Hi folks,

I have a scenario where I need to swap an original table with a copy of that
table.

Should have added to previous post:

Why can't you just add the exclusion constraint to the original table?

--
Adrian Klaver
adrian.klaver@aklaver.com

#4 Marcelo Fernandes
marcefern7@gmail.com
In reply to: Adrian Klaver (#2)
Re: Best Approach for Swapping a Table with its Copy

On Thu, Feb 13, 2025 at 10:02 AM Adrian Klaver
<adrian.klaver@aklaver.com> wrote:

This needs more information:
1) Postgres version.

That would be for Postgres 15.

2) The table definition.

This is a simplified version of the original table:

CREATE TABLE bookings (
    id SERIAL PRIMARY KEY,
    resource_id INT NOT NULL,
    start_date DATE NOT NULL,
    end_date DATE NOT NULL
);

3) The exclusion constraint definition.

The copy table would have an exclusion constraint such as:

ALTER TABLE bookings
    ADD CONSTRAINT no_date_overlap_for_resource_id
    EXCLUDE USING gist (
        resource_id WITH =,
        daterange(start_date, end_date, '[]') WITH &&
    );

4) Definition of what 'fairly large' is.

This table is over 400GB

5) How is the application interfacing with the database?

This is a web application that interfaces with the database using psycopg.

Also pulling in your question in the other reply:

Why can't you just add the exclusion constraint to the original table?

With unique constraints, one can use a unique index to create the constraint
concurrently.

With check constraints, one can create the constraint as invalid and then
validate it while only requiring a share update exclusive lock.

But with exclusion constraints, neither of those techniques are available. In
that sense, there is no way to create this type of constraint in a large table
without copying the original table, adding the constraint, and performing a
table swap.

This is done to avoid having to hold an exclusive lock for a long amount of
time, thus creating application outages.
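
For illustration, here is what those two online paths look like for the other
constraint types against the bookings table above (just a sketch; the
constraint and index names are made up):

    -- Unique: build the index without blocking writes, then attach it.
    CREATE UNIQUE INDEX CONCURRENTLY bookings_resource_start_key
        ON bookings (resource_id, start_date);
    ALTER TABLE bookings
        ADD CONSTRAINT bookings_resource_start_key
        UNIQUE USING INDEX bookings_resource_start_key;

    -- Check: add as NOT VALID (no full scan at ADD time), then validate
    -- under a SHARE UPDATE EXCLUSIVE lock that does not block reads/writes.
    ALTER TABLE bookings
        ADD CONSTRAINT end_after_start
        CHECK (end_date >= start_date) NOT VALID;
    ALTER TABLE bookings VALIDATE CONSTRAINT end_after_start;

Exclusion constraints have neither a CONCURRENTLY nor a NOT VALID variant.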

Hope that clarifies the situation a bit better
- Marcelo

#5 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Marcelo Fernandes (#4)
Re: Best Approach for Swapping a Table with its Copy

On 2/12/25 1:27 PM, Marcelo Fernandes wrote:

On Thu, Feb 13, 2025 at 10:02 AM Adrian Klaver

This is a simplified version of the original table:

CREATE TABLE bookings (
    id SERIAL PRIMARY KEY,
    resource_id INT NOT NULL,
    start_date DATE NOT NULL,
    end_date DATE NOT NULL
);

3) The exclusion constraint definition.

The copy table would have an exclusion constraint such as:

ALTER TABLE bookings
    ADD CONSTRAINT no_date_overlap_for_resource_id
    EXCLUDE USING gist (
        resource_id WITH =,
        daterange(start_date, end_date, '[]') WITH &&
    );

Do you know this will not fail on the existing data?

4) Definition of what 'fairly large' is.

This table is over 400GB

Do you have room for a complete copy of the table?

This is done to avoid having to hold an exclusive lock for a long amount of
time, thus creating application outages.

I am not seeing how this can be done without some outage for that table.

What sort of time frame is acceptable?

Hope that clarifies the situation a bit better
- Marcelo

--
Adrian Klaver
adrian.klaver@aklaver.com

#6 Marcelo Fernandes
marcefern7@gmail.com
In reply to: Adrian Klaver (#5)
Re: Best Approach for Swapping a Table with its Copy

On Thu, Feb 13, 2025 at 10:40 AM Adrian Klaver
<adrian.klaver@aklaver.com> wrote:

Do you know this will not fail on the existing data?

Yes, all the current data in the original table respects the constraint.

Do you have room for a complete copy of the table?

Yes, in this scenario the copy is already created, and triggers keep the copy
in sync with the original table.
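
For reference, the sync triggers are along these lines (a simplified sketch
from my tooling; the copy table and trigger names are just illustrative):

    -- Mirror every write on bookings into bookings_copy.
    CREATE FUNCTION bookings_sync() RETURNS trigger LANGUAGE plpgsql AS $$
    BEGIN
        IF TG_OP IN ('UPDATE', 'DELETE') THEN
            DELETE FROM bookings_copy WHERE id = OLD.id;
        END IF;
        IF TG_OP IN ('INSERT', 'UPDATE') THEN
            INSERT INTO bookings_copy SELECT (NEW).*;
        END IF;
        RETURN NULL;  -- AFTER row trigger, return value is ignored
    END;
    $$;

    CREATE TRIGGER bookings_sync_trg
        AFTER INSERT OR UPDATE OR DELETE ON bookings
        FOR EACH ROW EXECUTE FUNCTION bookings_sync();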

I am not seeing how this can be done without some outage for that table.

Swapping tables without an outage is possible. I believe this is part of the
process of repacking a table with pg_repack, as it relies on copying the whole
table and then swapping it for the original.

My main question is "How?". I know that pg_repack involves swapping the
relfilenode values and something about TOAST tables, but I am not well
acquainted with the pg_repack code or with the debugging tools needed to
verify precisely what it does.

What sort of time frame is acceptable?

The scan phase on this table is very slow, and on top of that the exclusion
constraint needs to build the underlying index that services it.

Anything that takes more than 10s in this system is prohibitive; in that
sense, creating the constraint without a table copy is not viable for a table
of this size.

Regards,

#7 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Marcelo Fernandes (#6)
Re: Best Approach for Swapping a Table with its Copy

On 2/12/25 14:04, Marcelo Fernandes wrote:

On Thu, Feb 13, 2025 at 10:40 AM Adrian Klaver
<adrian.klaver@aklaver.com> wrote:

Do you know this will not fail on the existing data?

Yes, all the current data in the original table respects the constraint.

Do you have room for a complete copy of the table?

Yes, in this scenario the copy is already created, and triggers keep the copy
in sync with the original table.

To confirm, this copy has the exclusion constraint defined?

I am not seeing how this can be done without some outage for that table.

Swapping tables without an outage is possible. I believe this is part of the
process of repacking a table with pg_repack, as it relies on copying the whole
table and then swapping it for the original.

Not seeing it:

https://reorg.github.io/pg_repack/

"Details
Full Table Repacks

[...]

pg_repack will only hold an ACCESS EXCLUSIVE lock for a short period
during initial setup (steps 1 and 2 above) and during the final
swap-and-drop phase (steps 6 and 7). For the rest of its time, pg_repack
only needs to hold an ACCESS SHARE lock on the original table, meaning
INSERTs, UPDATEs, and DELETEs may proceed as usual."

During the ACCESS EXCLUSIVE stages you will not have access.

Not only that with pg_repack you are not changing the table definition,
whereas in your case you are introducing a new constraint and associated
index.

My main question is "How?". I know that pg_repack involves swapping the
relfilenode values and something about TOAST tables, but I am not well
acquainted with the pg_repack code or with the debugging tools needed to
verify precisely what it does.

What sort of time frame is acceptable?

The scan phase on this table is very slow, and on top of that the exclusion
constraint needs to build the underlying index that services it.

Anything that takes more than 10s in this system is prohibitive; in that
sense, creating the constraint without a table copy is not viable for a table
of this size.

Do you have a dev setup where you can test alternatives with a test
sample of data?

Regards,

--
Adrian Klaver
adrian.klaver@aklaver.com

#8 Marcelo Fernandes
marcefern7@gmail.com
In reply to: Adrian Klaver (#7)
Re: Best Approach for Swapping a Table with its Copy

On Thu, Feb 13, 2025 at 1:33 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote:

Not seeing it:

https://reorg.github.io/pg_repack/

"Details
Full Table Repacks

[...]

pg_repack will only hold an ACCESS EXCLUSIVE lock for a short period
during initial setup (steps 1 and 2 above) and during the final
swap-and-drop phase (steps 6 and 7). For the rest of its time, pg_repack
only needs to hold an ACCESS SHARE lock on the original table, meaning
INSERTs, UPDATEs, and DELETEs may proceed as usual."

During the ACCESS EXCLUSIVE stages you will not have access.

The operations under ACCESS EXCLUSIVE are short-lived. In my benchmarks, the
"repack_swap" function finishes within milliseconds. The operations appear to
be catalogue-only. I'd love someone to confirm this, though, because I am not
a specialist in C or Postgres extensions.

Here is the code if you want to have a look:

https://github.com/reorg/pg_repack/blob/306b0d4f7f86e807498ac00baec89ecd33411398/lib/repack.c#L843

What I am after is the same, but I seek a deeper understanding of what it does,
and why it does it. For example, it swaps relfilenode. Why?

Not only that with pg_repack you are not changing the table definition,
whereas in your case you are introducing a new constraint and associated
index.

Correct, but I am not using pg_repack; I have cloned the table using my own
tool. I'm citing pg_repack because it does perform a table swap (a bloated
table is swapped for a new, non-bloated one).

Given that I know pg_repack works well on large databases, it follows that
its approach to swapping tables is robust.

Do you have a dev setup where you can test alternatives with a test
sample of data?

Do you mean alternatives to table-cloning-and-swapping?

Regards,
- Marcelo

#9 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Marcelo Fernandes (#6)
Re: Best Approach for Swapping a Table with its Copy

On Thu, 2025-02-13 at 11:04 +1300, Marcelo Fernandes wrote:

I am not seeing how this can be done without some outage for that table.

Swapping tables without an outage is possible.

Yes, but only if you are willing to write C code that runs inside the
database server. That way, you can do anything (and cause arbitrary
damage).

The big challenge here would be to do the swap in a safe way. How do
you intend to guarantee that the foreign keys are valid without a table
scan? How do you handle concurrent data modifications?

Yours,
Laurenz Albe

--

*E-Mail Disclaimer*
Der Inhalt dieser E-Mail ist ausschliesslich fuer den
bezeichneten Adressaten bestimmt. Wenn Sie nicht der vorgesehene Adressat
dieser E-Mail oder dessen Vertreter sein sollten, so beachten Sie bitte,
dass jede Form der Kenntnisnahme, Veroeffentlichung, Vervielfaeltigung oder
Weitergabe des Inhalts dieser E-Mail unzulaessig ist. Wir bitten Sie, sich
in diesem Fall mit dem Absender der E-Mail in Verbindung zu setzen.

*CONFIDENTIALITY NOTICE & DISCLAIMER
*This message and any attachment are
confidential and may be privileged or otherwise protected from disclosure
and solely for the use of the person(s) or entity to whom it is intended.
If you have received this message in error and are not the intended
recipient, please notify the sender immediately and delete this message and
any attachment from your system. If you are not the intended recipient, be
advised that any use of this message is prohibited and may be unlawful, and
you must not copy this message or attachment or disclose the contents to
any other person.

#10 Greg Sabino Mullane
greg@turnstep.com
In reply to: Marcelo Fernandes (#8)
Re: Best Approach for Swapping a Table with its Copy

On Wed, Feb 12, 2025 at 9:02 PM Marcelo Fernandes <marcefern7@gmail.com>
wrote:

What I am after is the same, but I seek a deeper understanding of what it
does, and why it does it. For example, it swaps relfilenode. Why?

It is surgically replacing all pointers to the old data with pointers to
the new data. Yes, with lots of system catalog shenanigans.

pg_repack is meant to do what vacuum full does, but in a faster way.
Imagine your table is an 18-wheeler truck, with a cab (system catalog
stuff) and a trailer (full of data). We don't want a whole new truck, we
want to change out the trailer.

With VACUUM FULL, you stop all traffic while you pull the truck to the side
of the road and turn it off. A new truck is pulled alongside it, and
everything from the old trailer is unloaded and placed in the new one. The
new trailer is hooked to the cab, and pulls away into the now-moving
traffic.

With pg_repack, you keep driving full speed. A new truck pulls up alongside
your truck, and the new trailer is filled based on the old one. At the last
moment, all the wires are pulled from the old trailer and hooked to the new
trailer. The old trailer is detached and left to crash into the mutant
bikers who have been pursuing you. It's the same cab, but the trailer (e.g.
relfilenodes) has been changed.

It's technically possible to do something similar for your use case, but
it's not trivial. All the cab to trailer wires must be precisely changed.
Everything directly related to the data must be swapped: heap, indexes,
toast.
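
To see how many "wires" there are for a single table, something like this
lists the heap, its indexes, and its TOAST table (the TOAST table has its own
index on top of that):

    -- All relations whose storage belongs to one table.
    SELECT c.oid::regclass AS rel, c.relkind, c.relfilenode
    FROM pg_class AS c
    WHERE c.oid = 'bookings'::regclass                        -- heap
       OR c.oid IN (SELECT indexrelid FROM pg_index
                    WHERE indrelid = 'bookings'::regclass)    -- indexes
       OR c.oid = (SELECT reltoastrelid FROM pg_class
                   WHERE oid = 'bookings'::regclass);         -- toast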

Cheers,
Greg

--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support

#11 Dominique Devienne
ddevienne@gmail.com
In reply to: Greg Sabino Mullane (#10)
Re: Best Approach for Swapping a Table with its Copy

On Thu, Feb 13, 2025 at 4:09 PM Greg Sabino Mullane <htamfids@gmail.com>
wrote:

On Wed, Feb 12, 2025 at 9:02 PM Marcelo Fernandes <marcefern7@gmail.com>
wrote:

What I am after is the same, but I seek a deeper understanding of what it
does, and why it does it. For example, it swaps relfilenode. Why?

It is surgically replacing all pointers to the old data with pointers to
the new data. Yes, with lots of system catalog shenanigans.
pg_repack is meant to do what vacuum full does, but in a faster way.
Imagine your table is an 18-wheeler truck, with a cab (system catalog
stuff) and a trailer (full of data). We don't want a whole new truck, we
want to change out the trailer.
With VACUUM FULL, you stop all traffic while you pull the truck to the
side of the road and turn it off. A new truck is pulled alongside it, and
everything from the old trailer is unloaded and placed in the new one. The
new trailer is hooked to the cab, and pulls away into the now-moving
traffic.
With pg_repack, you keep driving full speed. A new truck pulls up
alongside your truck, and the new trailer is filled based on the old one.
At the last moment, all the wires are pulled from the old trailer and
hooked to the new trailer. The old trailer is detached and left to crash
into the mutant bikers who have been pursuing you. It's the same cab, but
the trailer (e.g. relfilenodes) has been changed.
It's technically possible to do something similar for your use case, but
it's not trivial. All the cab to trailer wires must be precisely changed.
Everything directly related to the data must be swapped: heap, indexes,
toast.

Thanks for the colorful analogy Greg :).

Maybe the better option is to support ALTER TABLE to ADD an exclusion
constraint, no?
I get that it's not supported now. But is it more difficult than the above?
And why, then? --DD

#12 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Dominique Devienne (#11)
Re: Best Approach for Swapping a Table with its Copy

On 2/13/25 07:25, Dominique Devienne wrote:

On Thu, Feb 13, 2025 at 4:09 PM Greg Sabino Mullane <htamfids@gmail.com> wrote:

Thanks for the colorful analogy Greg :).

Maybe the better option is to support ALTER TABLE to ADD an exclusion
constraint, no?

That exists:

select version();
PostgreSQL 14.15

create table exclusion_test(id integer primary key, dt1 timestamptz,
    dt2 timestamptz);

ALTER TABLE exclusion_test ADD CONSTRAINT dt_overlap
    EXCLUDE USING gist (
        id WITH =,
        tstzrange(dt1, dt2, '[]') WITH &&
    );

\d exclusion_test
Table "public.exclusion_test"
Column | Type | Collation | Nullable | Default
--------+--------------------------+-----------+----------+---------
id | integer | | not null |
dt1 | timestamp with time zone | | |
dt2 | timestamp with time zone | | |
Indexes:
"exclusion_test_pkey" PRIMARY KEY, btree (id)
"dt_overlap" EXCLUDE USING gist (id WITH =, tstzrange(dt1, dt2,
'[]'::text) WITH &&)

I get that it's not supported now. But is it more difficult than the
above? And why, then? --DD

From here:

/messages/by-id/CAM2F1VOOn1izCrtcrDx4YUtCY-H64Vj5yvN5H1Lhk8kCY+_W6Q@mail.gmail.com

"

Why can't you just add the exclusion constraint to the original table?

With unique constraints, one can use a unique index to create the
constraint concurrently.

With check constraints, one can create the constraint as invalid and
then validate it while only requiring a share update exclusive lock.

But with exclusion constraints, neither of those techniques are
available. In that sense, there is no way to create this type of
constraint in a large table without copying the original table, adding
the constraint, and performing a table swap.

This is done to avoid having to hold an exclusive lock for a long amount
of time, thus creating application outages.

Hope that clarifies the situation a bit better
"

--
Adrian Klaver
adrian.klaver@aklaver.com

#13 Marcelo Fernandes
marcefern7@gmail.com
In reply to: Adrian Klaver (#12)
Re: Best Approach for Swapping a Table with its Copy

On Thu, Feb 13, 2025 at 7:37 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:

Yes, but only if you are willing to write C code that runs inside the
database server. That way, you can do anything (and cause arbitrary
damage).

The big challenge here would be to do the swap in a safe way. How do
you intend to guarantee that the foreign keys are valid without a table
scan? How do you handle concurrent data modifications?

Exactly! This is part of what I am trying to figure out (-:

The plan that I have in mind so far to swap a table safely is:

1. Run the following query to grab information about the toast and the toast
index of the tables involved in the swapping:

SELECT
    X.oid,
    X.reltoastrelid,
    X.relowner,
    TOAST_X.indexrelid
FROM pg_catalog.pg_class X
LEFT JOIN pg_catalog.pg_index TOAST_X
    ON X.reltoastrelid = TOAST_X.indrelid
    AND TOAST_X.indisvalid
WHERE X.oid = 'my_sweet_table'::regclass;

2. Open a transaction and acquire an access exclusive lock.
3. Ensure that the cloned table has the same owner as the original table

ALTER TABLE copy_table OWNER TO owner_of_the_original_table;

4. Now I need to swap all the data in the catalogue tables that point to the
old table and the toast to point to the new one and vice-versa (in case I
need to rollback).
5. Commit
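
In SQL terms, the skeleton of steps 2-5 is only this big (the owner name is a
placeholder, and step 4 is the part I cannot fill in yet):

    BEGIN;
    -- Step 2: block all concurrent access for the (hopefully short) swap.
    LOCK TABLE my_sweet_table, copy_table IN ACCESS EXCLUSIVE MODE;
    -- Step 3: align ownership.
    ALTER TABLE copy_table OWNER TO owner_of_the_original_table;
    -- Step 4: catalogue surgery would go here (swapping the storage links
    -- of the two pg_class rows and of their TOAST tables).
    -- Step 5:
    COMMIT;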

Step 4 is what I am stuck on. What bits of the catalogue do I need to change?
And for what reasons?

It may be a risky operation, as you say, and I might decide not to pursue it
in the end, but first I must understand (-:

Regards,
- Marcelo

#14 Marcelo Fernandes
marcefern7@gmail.com
In reply to: Marcelo Fernandes (#13)
Re: Best Approach for Swapping a Table with its Copy

On Fri, Feb 14, 2025 at 4:09 AM Greg Sabino Mullane <htamfids@gmail.com> wrote:

It is surgically replacing all pointers to the old data with pointers to the new data. Yes, with lots of system catalog shenanigans.

Love your analogy Greg, thanks for that.

It's technically possible to do something similar for your use case, but it's not trivial. All the cab to trailer wires must be precisely changed. Everything directly related to the data must be swapped: heap, indexes, toast.

I'd really appreciate knowing more about how I can do this, as I think this is
the crux of what I am trying to solve.

If you have pointers, thoughts, or resources where I can better understand
what's involved, that would be much appreciated.

In terms of where I am at currently, I summarised in my previous reply:

On Fri, Feb 14, 2025 at 11:59 AM Marcelo Fernandes <marcefern7@gmail.com> wrote:

The plan that I have in mind so far to swap a table safely is:

1. Run the following query to grab information about the toast and the toast
index of the tables involved in the swapping:

SELECT
    X.oid,
    X.reltoastrelid,
    X.relowner,
    TOAST_X.indexrelid
FROM pg_catalog.pg_class X
LEFT JOIN pg_catalog.pg_index TOAST_X
    ON X.reltoastrelid = TOAST_X.indrelid
    AND TOAST_X.indisvalid
WHERE X.oid = 'my_sweet_table'::regclass;

2. Open a transaction and acquire an access exclusive lock.
3. Ensure that the cloned table has the same owner as the original table

ALTER TABLE copy_table OWNER TO owner_of_the_original_table;

4. Now I need to swap all the data in the catalogue tables that point to the
old table and the toast to point to the new one and vice-versa (in case I
need to rollback).
5. Commit

Step 4 is what I am stuck on. What bits of the catalogue do I need to change?
And for what reasons?

It may be a risky operation, as you say, and I might decide not to pursue it
in the end, but first I must understand (-:

Regards,
- Marcelo

#15 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Marcelo Fernandes (#13)
Re: Best Approach for Swapping a Table with its Copy

On 2/13/25 14:59, Marcelo Fernandes wrote:

On Thu, Feb 13, 2025 at 7:37 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:

Yes, but only if you are willing to write C code that runs inside the
database server. That way, you can do anything (and cause arbitrary
damage).

The big challenge here would be to do the swap in a safe way. How do
you intend to guarantee that the foreign keys are valid without a table
scan? How do you handle concurrent data modifications?

Exactly! This is part of what I am trying to figure out (-:

The plan that I have in mind so far to swap a table safely is:

1. Run the following query to grab information about the toast and the toast
index of the tables involved in the swapping:

SELECT
    X.oid,
    X.reltoastrelid,
    X.relowner,
    TOAST_X.indexrelid
FROM pg_catalog.pg_class X
LEFT JOIN pg_catalog.pg_index TOAST_X
    ON X.reltoastrelid = TOAST_X.indrelid
    AND TOAST_X.indisvalid
WHERE X.oid = 'my_sweet_table'::regclass;

2. Open a transaction and acquire an access exclusive lock.
3. Ensure that the cloned table has the same owner as the original table

ALTER TABLE copy_table OWNER TO owner_of_the_original_table;

4. Now I need to swap all the data in the catalogue tables that point to the
old table and the toast to point to the new one and vice-versa (in case I
need to rollback).
5. Commit

Step 4 is what I am stuck on. What bits of the catalogue do I need to change?
And for what reasons?

1) In a previous post you said:

"Yes, in this scenario the copy is already created, and triggers keep
the copy in sync with the original table."

In that case the copy will already have TOAST tables associated with it.

2) What are the FK relationships and how many?

Also could you just not add the FK's as NOT VALID?

It may be a risky operation, as you say, and I might decide not to pursue it
in the end, but first I must understand (-:

Regards,
- Marcelo

--
Adrian Klaver
adrian.klaver@aklaver.com

#16 Marcelo Fernandes
marcefern7@gmail.com
In reply to: Adrian Klaver (#15)
Re: Best Approach for Swapping a Table with its Copy

On Fri, Feb 14, 2025 at 12:35 PM Adrian Klaver
<adrian.klaver@aklaver.com> wrote:

1) In a previous post you said:

"Yes, in this scenario the copy is already created, and triggers keep
the copy in sync with the original table."

In that case the copy will already have TOAST tables associated with it.

If we follow the idea behind repack_swap, we would have swapped the OIDs of
the two tables.

This means you have to swap the TOAST tables in the catalogue as well.

Otherwise the new table would be linked to the old TOAST table and the old
table to the new one. We want the opposite.

2) What are the FK relationships and how many?

I think that for theoretical purposes we can just say there are "N" FKs,
because no matter how many there are, they need to be updated to point at the
new table.

Also could you just not add the FK's as NOT VALID?

That's an interesting compromise I hadn't thought of. Thanks.

However, ideally I'd like to swap the catalogue entries instead, as that would
be a cleaner approach: it wouldn't require dropping old constraints, creating
NOT VALID ones, and then optionally validating them later.
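
For completeness, that compromise would look roughly like this per FK (the
referencing table "payments" and the constraint name are hypothetical):

    -- Repoint one FK at the new table without a long validation scan.
    ALTER TABLE payments DROP CONSTRAINT payments_booking_id_fkey;
    ALTER TABLE payments
        ADD CONSTRAINT payments_booking_id_fkey
        FOREIGN KEY (booking_id) REFERENCES bookings_copy (id) NOT VALID;

    -- Optionally, later: the full scan happens under lighter locks than
    -- the swap itself.
    ALTER TABLE payments VALIDATE CONSTRAINT payments_booking_id_fkey;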

Regards,
- Marcelo

#17 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Marcelo Fernandes (#13)
Re: Best Approach for Swapping a Table with its Copy

On Fri, 2025-02-14 at 11:59 +1300, Marcelo Fernandes wrote:

On Thu, Feb 13, 2025 at 7:37 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:

The big challenge here would be to do the swap in a safe way. How do
you intend to guarantee that the foreign keys are valid without a table
scan? How do you handle concurrent data modifications?

Exactly! This is part of what I am trying to figure out (-:

The plan that I have in mind so far to swap a table safely is:

1. Run the following query to grab information about the toast and the toast
index of the tables involved in the swapping:

SELECT
X.oid,
X.reltoastrelid,
X.relowner,
TOAST_X.indexrelid
FROM pg_catalog.pg_class X
LEFT JOIN
pg_catalog.pg_index TOAST_X ON X.reltoastrelid =
TOAST_X.indrelid AND TOAST_X.indisvalid
WHERE X.oid = ('my_sweet_table')::regclass;

2. Open a transaction and acquire an access exclusive lock.
3. Ensure that the cloned table has the same owner as the original table

ALTER TABLE copy_table OWNER TO owner_of_the_original_table;

4. Now I need to swap all the data in the catalogue tables that point to the
old table and the toast to point to the new one and vice-versa (in case I
need to rollback).
5. Commit

Step 4 is what I am stuck on. What bits of the catalogue do I need to change?
And for what reasons?

It may be a risky operation, as you say and I might decide not to do pursue it
in the end, but first I must understand (-:

In your steps, you carefully avoid the question of whether the foreign key is
valid or not. So you are trusting the user to have made sure that everything
is fine with the foreign key...

I may forget something, but I'd say that swapping out the files from under a
table's butt is just a question of updating the "pg_class.relfilenode" of the
tables and the TOAST tables. But if you just UPDATE the catalogs, concurrent
statements that are using the tables will be in trouble, so you need to take
an ACCESS EXCLUSIVE lock. Moreover, you have to make sure to send out
invalidation messages so that every session that caches statistics or
execution plans for the tables discards them.

Yours,
Laurenz Albe


#18 Michał Kłeczek
michal@kleczek.org
In reply to: Marcelo Fernandes (#4)
Re: Best Approach for Swapping a Table with its Copy

On 12 Feb 2025, at 22:27, Marcelo Fernandes <marcefern7@gmail.com> wrote:

On Thu, Feb 13, 2025 at 10:02 AM Adrian Klaver

Also pulling in your question in the other reply:

Why can't you just add the exclusion constraint to the original table?

With unique constraints, one can use a unique index to create the constraint
concurrently.

With check constraints, one can create the constraint as invalid and then
validate it while only requiring a share update exclusive lock.

But with exclusion constraints, neither of those techniques are available. In
that sense, there is no way to create this type of constraint in a large table
without copying the original table, adding the constraint, and performing a
table swap.

This is done to avoid having to hold an exclusive lock for a long amount of
time, thus creating application outages.

Just a wild idea (not sure if anyone suggested it in this thread and not sure if it is doable):

Create index concurrently and then fiddle with the catalog tables to define the constraint using this index?
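
Roughly (the GiST index on an integer column needs the btree_gist extension,
and the second step is exactly the unsupported, fiddle-with-the-catalog part):

    -- Build the index online; this is the expensive part, done without
    -- blocking writes.
    CREATE EXTENSION IF NOT EXISTS btree_gist;
    CREATE INDEX CONCURRENTLY no_date_overlap_idx
        ON bookings USING gist
        (resource_id, daterange(start_date, end_date, '[]'));

    -- Then, hypothetically, insert a pg_constraint row of contype 'x'
    -- pointing at that index -- which is what ALTER TABLE ... ADD
    -- CONSTRAINT ... EXCLUDE would have created, and which plain SQL
    -- does not support today.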


Michal

#19 Greg Sabino Mullane
greg@turnstep.com
In reply to: Marcelo Fernandes (#14)
Re: Best Approach for Swapping a Table with its Copy

On Thu, Feb 13, 2025 at 6:06 PM Marcelo Fernandes <marcefern7@gmail.com>
wrote:

It's technically possible to do something similar for your use case, but
it's not trivial. All the cab to trailer wires must be precisely changed.
Everything directly related to the data must be swapped: heap, indexes,
toast.

I'd really appreciate knowing more about how I can do this, as I think this
is the crux of what I am trying to solve.

The pg_repack link posted earlier has the details on how it is done. But
messing with system catalogs like this is highly discouraged, for good
reasons. Still, if you need to go that route, test heavily and post the
solutions here for feedback.

Cheers,
Greg

--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support

#20 Greg Sabino Mullane
greg@turnstep.com
In reply to: Laurenz Albe (#17)
Re: Best Approach for Swapping a Table with its Copy

On Fri, Feb 14, 2025 at 12:41 AM Laurenz Albe <laurenz.albe@cybertec.at>
wrote:

Moreover, you have to make sure to send out invalidation messages so that
every session that caches statistics or execution plans for the tables
discards them.

Hmm...is that really necessary? Because if so, there is no direct SQL-level
way to do that I am aware of, so we are back to a C extension. Or just
restarting the cluster :)

Cheers,
Greg

--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support

#21 Greg Sabino Mullane
greg@turnstep.com
In reply to: Michał Kłeczek (#18)

#22 Marcelo Fernandes
marcefern7@gmail.com
In reply to: Greg Sabino Mullane (#19)

#23 Greg Sabino Mullane
greg@turnstep.com
In reply to: Marcelo Fernandes (#22)