Thread: Best Approach for Swapping a Table with its Copy
Hi folks,

I have a scenario where I need to swap an original table with a copy of that
table.

The copy has an exclusion constraint that the original does not have. The main
challenge is to ensure that the swap does not break any existing foreign keys
to the original table and handles the associated TOAST tables correctly.

Both tables are fairly large and exist in a system where there are no
maintenance windows in which the application is shut down.

My key questions are:

- What is the best process for swapping the original table with the copy in
  such a way that the foreign key relations are preserved?
- Are there any special considerations for managing the TOAST tables during
  this swap?
- Should I perform this operation in multiple steps, or is there a
  straightforward way to achieve this atomically?
- Are there any risks or potential issues I should be aware of when doing this
  swap, specifically related to foreign key integrity and TOAST data?

Thank you!
- Marcelo
On 2/12/25 12:57 PM, Marcelo Fernandes wrote:
> I have a scenario where I need to swap an original table with a copy of that
> table.
>
> The copy has an exclusion constraint that the original does not have. The
> main challenge is to ensure that the swap does not break any existing
> foreign keys to the original table and handles the associated TOAST tables
> correctly.
>
> Both tables are fairly large and exist in a system where there are no
> maintenance windows in which the application is shut down.

This needs more information:

1) Postgres version.
2) The table definition.
3) The exclusion constraint definition.
4) Definition of what 'fairly large' is.
5) How is the application interfacing with the database?

--
Adrian Klaver
adrian.klaver@aklaver.com
On 2/12/25 12:57 PM, Marcelo Fernandes wrote:
> I have a scenario where I need to swap an original table with a copy of that
> table.

Should have added to the previous post:

Why can't you just add the exclusion constraint to the original table?

--
Adrian Klaver
adrian.klaver@aklaver.com
On Thu, Feb 13, 2025 at 10:02 AM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> This needs more information:
> 1) Postgres version.

That would be Postgres 15.

> 2) The table definition.

This is a simplified version of the original table:

    CREATE TABLE bookings (
        id SERIAL PRIMARY KEY,
        resource_id INT NOT NULL,
        start_date DATE NOT NULL,
        end_date DATE NOT NULL
    );

> 3) The exclusion constraint definition.

The copy table would have an exclusion constraint such as:

    ALTER TABLE bookings
    ADD CONSTRAINT no_date_overlap_for_resource_id
    EXCLUDE USING gist (
        resource_id WITH =,
        daterange(start_date, end_date, '[]') WITH &&
    );

> 4) Definition of what 'fairly large' is.

This table is over 400GB.

> 5) How is the application interfacing with the database?

This is a web application that interfaces with the database using psycopg.

Also pulling in your question from the other reply:

> Why can't you just add the exclusion constraint to the original table?

With unique constraints, one can use a unique index to create the constraint
concurrently.

With check constraints, one can create the constraint as invalid and then
validate it while only requiring a share update exclusive lock.

But with exclusion constraints, neither of those techniques is available. In
that sense, there is no way to create this type of constraint on a large
table without copying the original table, adding the constraint, and
performing a table swap. This is done to avoid holding an exclusive lock for
a long time, thus creating application outages.

Hope that clarifies the situation a bit better
- Marcelo
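P.S. For concreteness, the two online techniques above look roughly like this
(a sketch only; the constraint and index names here are made up):

    -- Unique: build the index without blocking writes, then attach it
    -- as a constraint (only a brief lock for the ALTER).
    CREATE UNIQUE INDEX CONCURRENTLY bookings_uq_idx
        ON bookings (resource_id, start_date);
    ALTER TABLE bookings
        ADD CONSTRAINT bookings_uq UNIQUE USING INDEX bookings_uq_idx;

    -- Check: add as NOT VALID (brief lock), then validate under a
    -- SHARE UPDATE EXCLUSIVE lock, which does not block reads or writes.
    ALTER TABLE bookings
        ADD CONSTRAINT dates_ordered CHECK (start_date <= end_date) NOT VALID;
    ALTER TABLE bookings VALIDATE CONSTRAINT dates_ordered;

Neither variant exists for EXCLUDE constraints.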
On 2/12/25 1:27 PM, Marcelo Fernandes wrote:
> This is a simplified version of the original table:
>
>     CREATE TABLE bookings (
>         id SERIAL PRIMARY KEY,
>         resource_id INT NOT NULL,
>         start_date DATE NOT NULL,
>         end_date DATE NOT NULL
>     );
>
>> 3) The exclusion constraint definition.
>
> The copy table would have an exclusion constraint such as:
>
>     ALTER TABLE bookings
>     ADD CONSTRAINT no_date_overlap_for_resource_id
>     EXCLUDE USING gist (
>         resource_id WITH =,
>         daterange(start_date, end_date, '[]') WITH &&
>     );

Do you know this will not fail on the existing data?

>> 4) Definition of what 'fairly large' is.
>
> This table is over 400GB.

Do you have room for a complete copy of the table?

> This is done to avoid holding an exclusive lock for a long time, thus
> creating application outages.

I am not seeing how this can be done without some outage for that table.
What sort of time frame is acceptable?

--
Adrian Klaver
adrian.klaver@aklaver.com
On Thu, Feb 13, 2025 at 10:40 AM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> Do you know this will not fail on the existing data?

Yes, all the current data in the original table respects the constraint.

> Do you have room for a complete copy of the table?

Yes, in this scenario the copy is already created, and triggers keep the copy
in sync with the original table.

> I am not seeing how this can be done without some outage for that table.

Swapping tables without an outage is possible. I believe this is part of what
pg_repack does when repacking a table, as it relies on copying the whole
table and then swapping it for the original.

My main question is "How?". I know that pg_repack involves swapping the
relfilenode values and something about TOAST tables, but I am not well
acquainted with the pg_repack code or the debugging tools needed to verify
precisely what it does.

> What sort of time frame is acceptable?

The scan phase on this table is very slow, and on top of that the exclusion
constraint needs to build the underlying index that services the constraint.

Anything that takes more than 10s in this system is prohibitive, so creating
the constraint without a table copy is not viable for a table of this size.

Regards,
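P.S. The sync mechanism is along these lines; a simplified sketch, with the
copy table, function, and trigger names made up:

    CREATE OR REPLACE FUNCTION bookings_sync() RETURNS trigger AS $$
    BEGIN
        IF TG_OP = 'INSERT' THEN
            INSERT INTO bookings_copy (id, resource_id, start_date, end_date)
            VALUES (NEW.id, NEW.resource_id, NEW.start_date, NEW.end_date);
        ELSIF TG_OP = 'UPDATE' THEN
            UPDATE bookings_copy
            SET resource_id = NEW.resource_id,
                start_date  = NEW.start_date,
                end_date    = NEW.end_date
            WHERE id = NEW.id;
        ELSIF TG_OP = 'DELETE' THEN
            DELETE FROM bookings_copy WHERE id = OLD.id;
        END IF;
        RETURN NULL;  -- AFTER row trigger; the return value is ignored
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER bookings_sync_trg
    AFTER INSERT OR UPDATE OR DELETE ON bookings
    FOR EACH ROW EXECUTE FUNCTION bookings_sync();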
On 2/12/25 14:04, Marcelo Fernandes wrote:
> Yes, in this scenario the copy is already created, and triggers keep the
> copy in sync with the original table.

To confirm, this copy has the exclusion constraint defined?

> Swapping tables without an outage is possible. I believe this is part of
> what pg_repack does when repacking a table, as it relies on copying the
> whole table and then swapping it for the original.

Not seeing it:

https://reorg.github.io/pg_repack/

"Details
Full Table Repacks

[...]

pg_repack will only hold an ACCESS EXCLUSIVE lock for a short period during
initial setup (steps 1 and 2 above) and during the final swap-and-drop phase
(steps 6 and 7). For the rest of its time, pg_repack only needs to hold an
ACCESS SHARE lock on the original table, meaning INSERTs, UPDATEs, and
DELETEs may proceed as usual."

During the ACCESS EXCLUSIVE stages you will not have access. Not only that,
with pg_repack you are not changing the table definition, whereas in your
case you are introducing a new constraint and an associated index.

> Anything that takes more than 10s in this system is prohibitive, so
> creating the constraint without a table copy is not viable for a table of
> this size.

Do you have a dev setup where you can test alternatives with a test sample
of data?

--
Adrian Klaver
adrian.klaver@aklaver.com
On Thu, Feb 13, 2025 at 1:33 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> During the ACCESS EXCLUSIVE stages you will not have access.

The operations under ACCESS EXCLUSIVE are short-lived. In my benchmarks, the
"repack_swap" function finishes on the order of milliseconds. The operations
seem to be catalogue-only.

I'd love someone to confirm this though, because I am not a specialist in C
and Postgres extensions. Here is the code if you want to have a look:

https://github.com/reorg/pg_repack/blob/306b0d4f7f86e807498ac00baec89ecd33411398/lib/repack.c#L843

What I am after is the same, but I seek a deeper understanding of what it
does, and why it does it. For example, it swaps relfilenode. Why?

> Not only that, with pg_repack you are not changing the table definition,
> whereas in your case you are introducing a new constraint and an associated
> index.

Correct, but I am not using pg_repack; I have cloned the table using my own
tool. I'm citing pg_repack because it does perform a table swap (a bloated
table is swapped for a new, non-bloated table). Given that pg_repack works
well on large databases, its approach to swapping tables must be robust.

> Do you have a dev setup where you can test alternatives with a test sample
> of data?

Do you mean alternatives to table-cloning-and-swapping?

Regards,
- Marcelo
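P.S. For what it's worth, the catalogue-only nature of the swap can be
observed from plain SQL by comparing file nodes before and after a repack
run; a sketch against the bookings table:

    -- Run before and after pg_repack: the oid stays the same while the
    -- filenode changes (the old filenode shows up on the dropped relation).
    SELECT c.oid, c.relname,
           pg_relation_filenode(c.oid) AS filenode,
           c.reltoastrelid
    FROM pg_class c
    WHERE c.relname = 'bookings';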
On Thu, 2025-02-13 at 11:04 +1300, Marcelo Fernandes wrote: > > I am not seeing how this can be done without some outage for that table. > > Swapping tables without an outage is possible. Yes, but only if you are willing to write C code that runs inside the database server. That way, you can do anything (and cause arbitrary damage). The big challenge here would be to do the swap in a safe way. How do you intend to guarantee that the foreign keys are valid without a table scan? How do you handle concurrent data modifications? Yours, Laurenz Albe -- *E-Mail Disclaimer* Der Inhalt dieser E-Mail ist ausschliesslich fuer den bezeichneten Adressaten bestimmt. Wenn Sie nicht der vorgesehene Adressat dieser E-Mail oder dessen Vertreter sein sollten, so beachten Sie bitte, dass jede Form der Kenntnisnahme, Veroeffentlichung, Vervielfaeltigung oder Weitergabe des Inhalts dieser E-Mail unzulaessig ist. Wir bitten Sie, sich in diesem Fall mit dem Absender der E-Mail in Verbindung zu setzen. *CONFIDENTIALITY NOTICE & DISCLAIMER *This message and any attachment are confidential and may be privileged or otherwise protected from disclosure and solely for the use of the person(s) or entity to whom it is intended. If you have received this message in error and are not the intended recipient, please notify the sender immediately and delete this message and any attachment from your system. If you are not the intended recipient, be advised that any use of this message is prohibited and may be unlawful, and you must not copy this message or attachment or disclose the contents to any other person.
On Wed, Feb 12, 2025 at 9:02 PM Marcelo Fernandes <marcefern7@gmail.com> wrote:
> What I am after is the same, but I seek a deeper understanding of what it
> does, and why it does it. For example, it swaps relfilenode. Why?
It is surgically replacing all pointers to the old data with pointers to the new data. Yes, with lots of system catalog shenanigans.
pg_repack is meant to do what vacuum full does, but in a faster way. Imagine your table is an 18-wheeler truck, with a cab (system catalog stuff) and a trailer (full of data). We don't want a whole new truck, we want to change out the trailer.
With VACUUM FULL, you stop all traffic while you pull the truck to the side of the road and turn it off. A new truck is pulled alongside it, and everything from the old trailer is unloaded and placed in the new one. The new trailer is hooked to the cab, and pulls away into the now-moving traffic.
With pg_repack, you keep driving full speed. A new truck pulls up alongside your truck, and the new trailer is filled based on the old one. At the last moment, all the wires are pulled from the old trailer and hooked to the new trailer. The old trailer is detached and left to crash into the mutant bikers who have been pursuing you. It's the same cab, but the trailer (e.g. relfilenodes) has been changed.
It's technically possible to do something similar for your use case, but it's not trivial. All the cab to trailer wires must be precisely changed. Everything directly related to the data must be swapped: heap, indexes, toast.
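To make "all the wires" concrete, a catalog query along these lines lists
every storage-bearing piece for a single table; a sketch, using the bookings
table from earlier in the thread:

    -- Heap, its indexes, the TOAST table, and the TOAST index: all of
    -- these carry relfilenodes that would need to travel together.
    SELECT c.oid::regclass AS rel, c.relkind, c.relfilenode
    FROM pg_class c
    WHERE c.oid = 'bookings'::regclass
       OR c.oid IN (SELECT indexrelid FROM pg_index
                    WHERE indrelid = 'bookings'::regclass)
       OR c.oid = (SELECT reltoastrelid FROM pg_class
                   WHERE oid = 'bookings'::regclass)
       OR c.oid IN (SELECT indexrelid FROM pg_index
                    WHERE indrelid = (SELECT reltoastrelid FROM pg_class
                                      WHERE oid = 'bookings'::regclass));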
Cheers,
Greg
On Thu, Feb 13, 2025 at 4:09 PM Greg Sabino Mullane <htamfids@gmail.com> wrote:
> It is surgically replacing all pointers to the old data with pointers to
> the new data. Yes, with lots of system catalog shenanigans.
> [...]
> It's technically possible to do something similar for your use case, but
> it's not trivial. All the cab to trailer wires must be precisely changed.
> Everything directly related to the data must be swapped: heap, indexes,
> toast.

Thanks for the colorful analogy, Greg :).

Maybe the better option is to support ALTER TABLE to ADD an exclusion
constraint, no?

I get that it's not supported now. But is it more difficult than the above?
And why then? --DD
On 2/13/25 07:25, Dominique Devienne wrote:
> Thanks for the colorful analogy, Greg :).
>
> Maybe the better option is to support ALTER TABLE to ADD an exclusion
> constraint, no?

That exists:

    select version();
    PostgreSQL 14.15

    create table exclusion_test(id integer primary key, dt1 timestamptz, dt2 timestamptz);

    ALTER TABLE exclusion_test ADD CONSTRAINT dt_overlap EXCLUDE USING gist (
        id WITH =,
        tstzrange(dt1, dt2, '[]') WITH &&
    );

    \d exclusion_test
                 Table "public.exclusion_test"
     Column |           Type           | Collation | Nullable | Default
    --------+--------------------------+-----------+----------+---------
     id     | integer                  |           | not null |
     dt1    | timestamp with time zone |           |          |
     dt2    | timestamp with time zone |           |          |
    Indexes:
        "exclusion_test_pkey" PRIMARY KEY, btree (id)
        "dt_overlap" EXCLUDE USING gist (id WITH =, tstzrange(dt1, dt2, '[]'::text) WITH &&)

> I get that it's not supported now. But is it more difficult than the
> above? And why then? --DD

From here:

https://www.postgresql.org/message-id/CAM2F1VOOn1izCrtcrDx4YUtCY-H64Vj5yvN5H1Lhk8kCY%2B_W6Q%40mail.gmail.com

"> Why can't you just add the exclusion constraint to the original table?

With unique constraints, one can use a unique index to create the constraint
concurrently. With check constraints, one can create the constraint as
invalid and then validate it while only requiring a share update exclusive
lock.

But with exclusion constraints, neither of those techniques is available. In
that sense, there is no way to create this type of constraint on a large
table without copying the original table, adding the constraint, and
performing a table swap. This is done to avoid holding an exclusive lock for
a long time, thus creating application outages."

--
Adrian Klaver
adrian.klaver@aklaver.com
On Thu, Feb 13, 2025 at 7:37 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> Yes, but only if you are willing to write C code that runs inside the
> database server. That way, you can do anything (and cause arbitrary
> damage).
>
> The big challenge here would be to do the swap in a safe way. How do
> you intend to guarantee that the foreign keys are valid without a table
> scan? How do you handle concurrent data modifications?

Exactly! This is part of what I am trying to figure out (-:

The plan I have in mind so far to swap the tables safely is:

1. Run the following query to grab information about the TOAST table and the
   TOAST index of the tables involved in the swap:

       SELECT
           X.oid,
           X.reltoastrelid,
           X.relowner,
           TOAST_X.indexrelid
       FROM pg_catalog.pg_class X
       LEFT JOIN pg_catalog.pg_index TOAST_X
           ON X.reltoastrelid = TOAST_X.indrelid AND TOAST_X.indisvalid
       WHERE X.oid = ('my_sweet_table')::regclass;

2. Open a transaction and acquire an access exclusive lock.

3. Ensure that the cloned table has the same owner as the original table:

       ALTER TABLE copy_table OWNER TO owner_of_the_original_table;

4. Swap all the data in the catalogue tables that points to the old table and
   its TOAST so that it points to the new one, and vice-versa (in case I need
   to roll back).

5. Commit.

Step 4 is what I am stuck on. What bits of the catalogue do I need to change,
and for what reasons?

It may be a risky operation, as you say, and I might decide not to pursue it
in the end, but first I must understand (-:

Regards,
- Marcelo
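P.S. In skeleton form, steps 2, 3, and 5 would be something like this (a
sketch; the lock_timeout is my own addition so the ACCESS EXCLUSIVE attempt
fails fast instead of queueing behind long-running queries):

    BEGIN;
    SET LOCAL lock_timeout = '5s';
    -- Step 2: lock both relations in one statement.
    LOCK TABLE my_sweet_table, copy_table IN ACCESS EXCLUSIVE MODE;
    -- Step 3: match ownership.
    ALTER TABLE copy_table OWNER TO owner_of_the_original_table;
    -- Step 4: catalogue swap goes here.
    COMMIT;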
On Fri, Feb 14, 2025 at 4:09 AM Greg Sabino Mullane <htamfids@gmail.com> wrote:
> It is surgically replacing all pointers to the old data with pointers to
> the new data. Yes, with lots of system catalog shenanigans.

Love your analogy Greg, thanks for that.

> It's technically possible to do something similar for your use case, but
> it's not trivial. All the cab to trailer wires must be precisely changed.
> Everything directly related to the data must be swapped: heap, indexes,
> toast.

I'd really appreciate knowing more about how to do this, as I think this is
the crux of what I am trying to solve. If you have pointers, thoughts, or
resources where I can better understand what's involved, that would be much
appreciated.

In terms of where I am at currently, I summarised it in my previous reply:

On Fri, Feb 14, 2025 at 11:59 AM Marcelo Fernandes <marcefern7@gmail.com> wrote:
> The plan I have in mind so far to swap the tables safely is:
>
> 1. Run the following query to grab information about the TOAST table and
>    the TOAST index of the tables involved in the swap:
>
>        SELECT
>            X.oid,
>            X.reltoastrelid,
>            X.relowner,
>            TOAST_X.indexrelid
>        FROM pg_catalog.pg_class X
>        LEFT JOIN pg_catalog.pg_index TOAST_X
>            ON X.reltoastrelid = TOAST_X.indrelid AND TOAST_X.indisvalid
>        WHERE X.oid = ('my_sweet_table')::regclass;
>
> 2. Open a transaction and acquire an access exclusive lock.
>
> 3. Ensure that the cloned table has the same owner as the original table:
>
>        ALTER TABLE copy_table OWNER TO owner_of_the_original_table;
>
> 4. Swap all the data in the catalogue tables that points to the old table
>    and its TOAST so that it points to the new one, and vice-versa (in case
>    I need to roll back).
>
> 5. Commit.
>
> Step 4 is what I am stuck on. What bits of the catalogue do I need to
> change, and for what reasons?
>
> It may be a risky operation, as you say, and I might decide not to pursue
> it in the end, but first I must understand (-:

Regards,
- Marcelo
On 2/13/25 14:59, Marcelo Fernandes wrote:
> The plan I have in mind so far to swap the tables safely is:
> [...]
> Step 4 is what I am stuck on. What bits of the catalogue do I need to
> change, and for what reasons?

1) In a previous post you said:

"Yes, in this scenario the copy is already created, and triggers keep the
copy in sync with the original table."

In that case the copy will already have TOAST tables associated with it.

2) What are the FK relationships and how many?

Also, could you not just add the FKs as NOT VALID?

--
Adrian Klaver
adrian.klaver@aklaver.com
On Fri, Feb 14, 2025 at 12:35 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> 1) In a previous post you said:
>
> "Yes, in this scenario the copy is already created, and triggers keep
> the copy in sync with the original table."
>
> In that case the copy will already have TOAST tables associated with it.

If we follow the idea behind repack_swap, we would have swapped the OIDs of
the two tables. This means you have to swap the TOAST tables in the catalogue
as well. Otherwise the new table will be linked to the old TOAST and the old
table will be linked to the new TOAST. We want the opposite.

> 2) What are the FK relationships and how many?

I think that for theoretical purposes we can just say there are "N" FKs,
because no matter how many there are, they need to be updated to point
towards the new table.

> Also, could you not just add the FKs as NOT VALID?

That's an interesting compromise I hadn't thought of, thanks. However,
ideally I'd like to swap the catalogue entries instead, as that would be a
cleaner approach: it wouldn't require dropping the old constraints, creating
NOT VALID ones, and then optionally validating them later.

Regards,
- Marcelo
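P.S. For reference, the compromise Adrian mentions would look roughly like
this per referencing table; a sketch, borrowing names from a hypothetical
setup:

    -- Brief ACCESS EXCLUSIVE on both tables, no scan:
    ALTER TABLE table_with_fk DROP CONSTRAINT fk_original;
    ALTER TABLE table_with_fk
        ADD CONSTRAINT fk_original FOREIGN KEY (original_id)
        REFERENCES copy_table (id) NOT VALID;

    -- Later: a full scan, but only a SHARE UPDATE EXCLUSIVE lock on
    -- table_with_fk (plus ROW SHARE on copy_table), so DML proceeds.
    ALTER TABLE table_with_fk VALIDATE CONSTRAINT fk_original;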
On Fri, 2025-02-14 at 11:59 +1300, Marcelo Fernandes wrote:
> The plan I have in mind so far to swap the tables safely is:
> [...]
> Step 4 is what I am stuck on. What bits of the catalogue do I need to
> change, and for what reasons?
>
> It may be a risky operation, as you say, and I might decide not to pursue
> it in the end, but first I must understand (-:

In your steps, you carefully avoid the question of whether the foreign key is
valid or not. So you are trusting the user to have made sure that everything
is fine with the foreign key...

I may be forgetting something, but I'd say that swapping out the files from
under a table's butt is just a question of updating the
"pg_class.relfilenode" of the tables and the TOAST tables.

But if you just UPDATE the catalogs, concurrent statements that are using the
tables will be in trouble, so you need to take an ACCESS EXCLUSIVE lock.
Moreover, you have to make sure to send out invalidation messages so that
every session that caches statistics or execution plans for the tables
discards them.

Yours,
Laurenz Albe
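A minimal sketch of that relfilenode exchange for the two heaps, assuming
tables named "original" and "copy" (superuser required, completely
unsupported, and the same treatment would be needed for the TOAST tables and
all indexes):

    -- Swap the two relfilenodes in one statement; the subquery reads the
    -- pre-update values, so this is a true exchange.
    UPDATE pg_class c
    SET relfilenode = o.relfilenode
    FROM (SELECT oid, relfilenode FROM pg_class
          WHERE oid IN ('original'::regclass, 'copy'::regclass)) o
    WHERE c.oid IN ('original'::regclass, 'copy'::regclass)
      AND c.oid <> o.oid;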
> On 12 Feb 2025, at 22:27, Marcelo Fernandes <marcefern7@gmail.com> wrote:
>
> With unique constraints, one can use a unique index to create the
> constraint concurrently.
>
> With check constraints, one can create the constraint as invalid and then
> validate it while only requiring a share update exclusive lock.
>
> But with exclusion constraints, neither of those techniques is available.
> In that sense, there is no way to create this type of constraint on a
> large table without copying the original table, adding the constraint,
> and performing a table swap.

Just a wild idea (not sure if anyone suggested it in this thread, and not
sure if it is doable): create the index concurrently and then fiddle with the
catalog tables to define the constraint using this index?

--
Michal
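The index half of that idea is ordinary SQL; a sketch, with a made-up index
name and assuming btree_gist is available for the integer column:

    CREATE EXTENSION IF NOT EXISTS btree_gist;
    CREATE INDEX CONCURRENTLY bookings_excl_idx ON bookings
        USING gist (resource_id, daterange(start_date, end_date, '[]'));

The hard (and unsupported) half is manufacturing the pg_constraint row that
marks it as an exclusion constraint.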
On Thu, Feb 13, 2025 at 6:06 PM Marcelo Fernandes <marcefern7@gmail.com> wrote:
> It's technically possible to do something similar for your use case, but it's not trivial. All the cab to trailer wires must be precisely changed. Everything directly related to the data must be swapped: heap, indexes, toast.
> I'd really appreciate knowing more about how to do this, as I think this is
> the crux of what I am trying to solve.
The pg_repack link posted earlier has the details on how it is done. But messing with system catalogs like this is highly discouraged, for good reasons. Still, if you need to go that route, test heavily and post the solutions here for feedback.
Cheers,
Greg
On Fri, Feb 14, 2025 at 12:41 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> Moreover, you have to make sure to send out invalidation messages so that
> every session that caches statistics or execution plans for the tables
> discards them.
Hmm...is that really necessary? Because if so, there is no direct SQL-level way to do that, as far as I am aware, so we are back to a C extension. Or just restarting the cluster :)
Cheers,
Greg
On Fri, Feb 14, 2025 at 1:02 AM Michał Kłeczek <michal@kleczek.org> wrote:
> Create the index concurrently and then fiddle with the catalog tables to define the constraint using this index?
You mean an ALTER TABLE ... ADD CONSTRAINT ... EXCLUDE without actually doing an ALTER TABLE. Nope, that's far worse than the pg_repack shenanigans, as we would be creating new catalog objects.
Ideally, someone would add support for USING INDEX for exclusion constraints.
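That would mirror what UNIQUE already allows; the EXCLUDE form below is
hypothetical syntax, not valid in any released version:

    -- Supported today (unique constraints only):
    ALTER TABLE bookings
        ADD CONSTRAINT bookings_uq UNIQUE USING INDEX bookings_uq_idx;

    -- Wished for (hypothetical, does not exist):
    -- ALTER TABLE bookings
    --     ADD CONSTRAINT no_date_overlap_for_resource_id
    --     EXCLUDE USING INDEX bookings_excl_idx;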
Cheers,
Greg
On Sat, Feb 15, 2025 at 4:12 AM Greg Sabino Mullane <htamfids@gmail.com> wrote:
> The pg_repack link posted earlier has the details on how it is done. But
> messing with system catalogs like this is highly discouraged, for good
> reasons. Still, if you need to go that route, test heavily and post the
> solutions here for feedback.

I'm trying to digest what pg_repack does by reproducing the same behaviour
using SQL. I have come up with the script below to test the whole scenario,
but I have two major problems:

- The foreign keys are not being updated to point to the new table.
- pg_repack seems to update relcache entries. I am not sure how to do that
  with SQL. See:
  https://github.com/marcelofern/pg_repack/blob/9f36c65bd57ca1b228025687843758556b56df8e/lib/repack.c#L1373-L1387

And for reference, here is the script I have used so far. Keen for any
suggestions on how to swap the foreign keys so that they point towards the
new table.

```sql
-- Create the original table that will later be swapped with its copy.
DROP TABLE IF EXISTS original CASCADE;
CREATE TABLE original (
    id SERIAL PRIMARY KEY,
    name VARCHAR(5000) NOT NULL, -- necessary for the TOAST table.
    value INTEGER NOT NULL
);

-- Insert 10_000 rows into it.
INSERT INTO original (name, value)
SELECT
    'item_' || generate_series(1, 10000) AS name,
    (generate_series(1, 10000) % 10000) + 1 AS value;

-- Create the copy table; this table will be swapped for the original table
-- later.
DROP TABLE IF EXISTS copy;
CREATE TABLE copy (
    id SERIAL PRIMARY KEY,
    name VARCHAR(5000) NOT NULL,
    value INTEGER NOT NULL
);

-- Pull all the data from the original table into the copy table.
INSERT INTO copy SELECT id, name, value FROM ONLY original;

-- Create a table with a foreign key to the original table to verify whether
-- the swap addresses the foreign key.
DROP TABLE IF EXISTS table_with_fk;
CREATE TABLE table_with_fk (
    id SERIAL PRIMARY KEY,
    original_id INTEGER NOT NULL,
    CONSTRAINT fk_original
        FOREIGN KEY (original_id)
        REFERENCES original(id)
        ON DELETE CASCADE
        ON UPDATE CASCADE
);

-- Insert 10_000 rows into it.
INSERT INTO table_with_fk (original_id)
SELECT generate_series(1, 10000);

-- Analogously, create a table with a NOT VALID foreign key.
DROP TABLE IF EXISTS table_with_not_valid_fk;
CREATE TABLE table_with_not_valid_fk (
    id SERIAL PRIMARY KEY,
    original_id INTEGER NOT NULL,
    CONSTRAINT not_valid_fk_original
        FOREIGN KEY (original_id)
        REFERENCES original(id)
        ON DELETE CASCADE
        ON UPDATE CASCADE
        NOT VALID
);

-- Insert 10_000 rows.
INSERT INTO table_with_not_valid_fk (original_id)
SELECT generate_series(1, 10000);

-- All tables must have 10_000 rows in them.
SELECT count(*) FROM original;
SELECT count(*) FROM copy;
SELECT count(*) FROM table_with_fk;
SELECT count(*) FROM table_with_not_valid_fk;

-- See relation info for the tables and their TOASTs.
SELECT
    X.relname,
    X.reltablespace,
    X.oid,
    X.reltoastrelid,
    X.relowner,
    X.relkind,
    X.relfrozenxid,
    X.relminmxid,
    X.relpages,
    X.reltuples,
    X.relallvisible,
    X.relfilenode,
    TOAST_X.indexrelid AS toast_indexrelid
FROM pg_catalog.pg_class X
LEFT JOIN pg_catalog.pg_index TOAST_X
    ON X.reltoastrelid = TOAST_X.indrelid AND TOAST_X.indisvalid
WHERE X.oid IN (('original')::regclass, ('copy')::regclass)
ORDER BY X.relname;
-- -[ RECORD 1 ]----+---------
-- relname          | copy
-- reltablespace    | 0
-- oid              | 22522
-- reltoastrelid    | 22526
-- relowner         | 10
-- relkind          | r
-- relfrozenxid     | 2068
-- relminmxid       | 1
-- relpages         | 64
-- reltuples        | 10000
-- relallvisible    | 64
-- relfilenode      | 22522
-- toast_indexrelid | 22527
-- -[ RECORD 2 ]----+---------
-- relname          | original
-- reltablespace    | 0
-- oid              | 22513
-- reltoastrelid    | 22517
-- relowner         | 10
-- relkind          | r
-- relfrozenxid     | 2065
-- relminmxid       | 1
-- relpages         | 64
-- reltuples        | 10000
-- relallvisible    | 64
-- relfilenode      | 22513
-- toast_indexrelid | 22518

-- Take note of the dependencies for the toast table to compare later.
SELECT d1.objid AS original_objid, d2.objid AS copy_objid
FROM pg_depend d1, pg_depend d2
WHERE d1.objid = ('original'::regclass)
  AND d2.objid = ('copy'::regclass);
-- -[ RECORD 1 ]--+------
-- original_objid | 22513
-- copy_objid     | 22522

-- Start the table swap inside a transaction.
BEGIN;

LOCK TABLE original, copy IN ACCESS EXCLUSIVE MODE;

SELECT * FROM pg_class WHERE relname IN ('original', 'copy') FOR UPDATE;

WITH swapped AS (
    SELECT
        c1.oid AS original_oid, c2.oid AS copy_oid,
        c1.relfilenode AS original_filenode, c2.relfilenode AS copy_filenode,
        c1.reltablespace AS original_tablespace, c2.reltablespace AS copy_tablespace,
        c1.reltoastrelid AS original_toast, c2.reltoastrelid AS copy_toast,
        c1.relfrozenxid AS original_frozenxid, c2.relfrozenxid AS copy_frozenxid,
        c1.relminmxid AS original_relminmxid, c2.relminmxid AS copy_relminmxid,
        c1.relpages AS original_pages, c2.relpages AS copy_pages,
        c1.reltuples AS original_tuples, c2.reltuples AS copy_tuples,
        c1.relallvisible AS original_allvisible, c2.relallvisible AS copy_allvisible
    FROM pg_class c1, pg_class c2
    WHERE c1.relname = 'original' AND c2.relname = 'copy'
)
UPDATE pg_class
SET relfilenode = CASE
        WHEN oid = (SELECT original_oid FROM swapped)
            THEN (SELECT copy_filenode FROM swapped)
        WHEN oid = (SELECT copy_oid FROM swapped)
            THEN (SELECT original_filenode FROM swapped)
    END,
    reltablespace = CASE
        WHEN oid = (SELECT original_oid FROM swapped)
            THEN (SELECT copy_tablespace FROM swapped)
        WHEN oid = (SELECT copy_oid FROM swapped)
            THEN (SELECT original_tablespace FROM swapped)
    END,
    reltoastrelid = CASE
        WHEN oid = (SELECT original_oid FROM swapped)
            THEN (SELECT copy_toast FROM swapped)
        WHEN oid = (SELECT copy_oid FROM swapped)
            THEN (SELECT original_toast FROM swapped)
    END,
    relfrozenxid = CASE
        WHEN oid = (SELECT original_oid FROM swapped)
            THEN (SELECT copy_frozenxid FROM swapped)
        WHEN oid = (SELECT copy_oid FROM swapped)
            THEN (SELECT original_frozenxid FROM swapped)
    END,
    relminmxid = CASE
        WHEN oid = (SELECT original_oid FROM swapped)
            THEN (SELECT copy_relminmxid FROM swapped)
        WHEN oid = (SELECT copy_oid FROM swapped)
            THEN (SELECT original_relminmxid FROM swapped)
    END,
    relpages = CASE
        WHEN oid = (SELECT original_oid FROM swapped)
            THEN (SELECT copy_pages FROM swapped)
        WHEN oid = (SELECT copy_oid FROM swapped)
            THEN (SELECT original_pages FROM swapped)
    END,
    reltuples = CASE
        WHEN oid = (SELECT original_oid FROM swapped)
            THEN (SELECT copy_tuples FROM swapped)
        WHEN oid = (SELECT copy_oid FROM swapped)
            THEN (SELECT original_tuples FROM swapped)
    END,
    relallvisible = CASE
        WHEN oid = (SELECT original_oid FROM swapped)
            THEN (SELECT copy_allvisible FROM swapped)
        WHEN oid = (SELECT copy_oid FROM swapped)
            THEN (SELECT original_allvisible FROM swapped)
    END
WHERE oid IN (SELECT original_oid FROM swapped
              UNION SELECT copy_oid FROM swapped);

-- See that the relevant fields have been swapped.
SELECT
    X.relname,
    X.reltablespace,
    X.oid,
    X.reltoastrelid,
    X.relowner,
    X.relkind,
    X.relfrozenxid,
    X.relminmxid,
    X.relpages,
    X.reltuples,
    X.relallvisible,
    X.relfilenode,
    TOAST_X.indexrelid AS toast_indexrelid
FROM pg_catalog.pg_class X
LEFT JOIN pg_catalog.pg_index TOAST_X
    ON X.reltoastrelid = TOAST_X.indrelid AND TOAST_X.indisvalid
WHERE X.oid IN (('original')::regclass, ('copy')::regclass)
ORDER BY X.relname;
-- -[ RECORD 1 ]----+---------
-- relname          | copy
-- reltablespace    | 0
-- oid              | 22522
-- reltoastrelid    | 22517
-- relowner         | 10
-- relkind          | r
-- relfrozenxid     | 2065
-- relminmxid       | 1
-- relpages         | 64
-- reltuples        | 10000
-- relallvisible    | 64
-- relfilenode      | 22513
-- toast_indexrelid | 22518
-- -[ RECORD 2 ]----+---------
-- relname          | original
-- reltablespace    | 0
-- oid              | 22513
-- reltoastrelid    | 22526
-- relowner         | 10
-- relkind          | r
-- relfrozenxid     | 2068
-- relminmxid       | 1
-- relpages         | 64
-- reltuples        | 10000
-- relallvisible    | 64
-- relfilenode      | 22522
-- toast_indexrelid | 22527

-- Lock the pg_depend rows that correspond to 'original' and 'copy'.
SELECT * FROM pg_depend
WHERE objid IN (('original')::regclass, ('copy')::regclass)
FOR UPDATE;

-- Swap the objid values for the two dependencies.
WITH swapped_dep AS (
    SELECT d1.objid AS original_objid, d2.objid AS copy_objid
    FROM pg_depend d1, pg_depend d2
    WHERE d1.objid = ('original'::regclass)
      AND d2.objid = ('copy'::regclass)
)
-- TODO: this update is not working; maybe the rows need to be deleted and
-- then inserted again? A delete followed by a create is what pg_repack
-- seems to do.
UPDATE pg_depend
SET objid = CASE
        WHEN objid = (SELECT original_objid FROM swapped_dep)
            THEN (SELECT copy_objid FROM swapped_dep)
        WHEN objid = (SELECT copy_objid FROM swapped_dep)
            THEN (SELECT original_objid FROM swapped_dep)
    END
WHERE objid IN (SELECT original_objid FROM swapped_dep
                UNION SELECT copy_objid FROM swapped_dep);

-- Verify the dependencies have been swapped.
SELECT d1.objid AS original_objid, d2.objid AS copy_objid
FROM pg_depend d1, pg_depend d2
WHERE d1.objid = ('original'::regclass)
  AND d2.objid = ('copy'::regclass);
-- -[ RECORD 1 ]--+------
-- original_objid | 22513
-- copy_objid     | 22522

-- Renames!
ALTER TABLE original RENAME TO temp_original;
ALTER TABLE copy RENAME TO original;
ALTER TABLE temp_original RENAME TO copy;
DROP TABLE copy CASCADE;

-- Insert a couple of rows into the new "original" to verify it works.
INSERT INTO original (id, name, value) VALUES (10001, 'my_new_row', 10);
SELECT * FROM original ORDER BY id DESC;

-- TODO (minor): Index names for the PKs and their sequences have not been
-- renamed.

-- TODO (major): The FKs on the related tables weren't updated to use the
-- new table:
--
-- \d table_with_fk
--                  Table "public.table_with_fk"
--    Column    |  Type   | Collation | Nullable |                  Default
-- -------------+---------+-----------+----------+-------------------------------------------
--  id          | integer |           | not null | nextval('table_with_fk_id_seq'::regclass)
--  original_id | integer |           | not null |
-- Indexes:
--     "table_with_fk_pkey" PRIMARY KEY, btree (id)
--
-- \d table_with_not_valid_fk
--             Table "public.table_with_not_valid_fk"
--    Column    |  Type   | Collation | Nullable |                       Default
-- -------------+---------+-----------+----------+-----------------------------------------------------
--  id          | integer |           | not null | nextval('table_with_not_valid_fk_id_seq'::regclass)
--  original_id | integer |           | not null |
-- Indexes:
--     "table_with_not_valid_fk_pkey" PRIMARY KEY, btree (id)

-- Roll this back so that your postgres db doesn't get potentially messed up.
ROLLBACK;
```
On Sun, Feb 16, 2025 at 5:58 PM Marcelo Fernandes <marcefern7@gmail.com> wrote:
> - The foreign keys are not being updated to point to the new table.
You started out okay with your test script, but the pg_depend bit needs work. I would recommend examining that table closely until you have a really good understanding of how it works - both on the objid and refobjid side. Create and modify tables and see how the pg_depend entries change. Hint: Try it with and without foreign keys.
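A minimal starting point for that inspection, reusing names from the test
script (a sketch, not a recipe): foreign keys pointing *at* a table live in
pg_constraint, and their dependency rows sit on the refobjid side of
pg_depend.

    -- FK constraints that reference "original":
    SELECT conname, conrelid::regclass AS referencing_table
    FROM pg_constraint
    WHERE contype = 'f' AND confrelid = 'original'::regclass;

    -- Dependency rows where "original" is the *referenced* object:
    SELECT classid::regclass AS catalog, objid, deptype
    FROM pg_depend
    WHERE refobjid = 'original'::regclass;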
> - pg_repack seems to update relcache entries. I am not sure how to do that with SQL.
Yes, as I mentioned upthread, the safest thing will probably be a restart.
Cheers,
Greg