Thread: Removing oids with pg_repack

Removing oids with pg_repack

From
CG
Date:
I have a very large PostgreSQL 9.5 database that still has very large tables with oids. I'm trying to get rid of the oids with as little downtime as possible so I can prep the database for upgrade past PostgreSQL 11. I had a wild idea to mod pg_repack to write a new table without oids. I think it almost works. 

To test out my idea I made a new table wipe_oid_test with oids. I filled it with a few rows of data.

Next I make this modification:

CREATE OR REPLACE FUNCTION repack.get_storage_param(oid)
  RETURNS TEXT AS
$$
SELECT string_agg(param, ', ')
FROM (
    -- table storage parameter
    SELECT unnest(reloptions) as param
    FROM pg_class
    WHERE oid = $1
    UNION ALL
    -- TOAST table storage parameter
    SELECT ('toast.' || unnest(reloptions)) as param
    FROM (
        SELECT reltoastrelid from pg_class where oid = $1
         ) as t,
        pg_class as c
    WHERE c.oid = t.reltoastrelid
    UNION ALL
    /*-- table oid
    SELECT 'oids = ' ||
        CASE WHEN relhasoids
            THEN 'true'
            ELSE 'false'
        END
    FROM pg_class
    WHERE oid = $1*/

SELECT 'oids = false' --my modification

    ) as t
$$
LANGUAGE sql STABLE STRICT;


Then I run pg_repack:

$ pg_repack -d mydata -Upostgres -t wipe_oid_test -e
LOG: (query) SET search_path TO pg_catalog, pg_temp, public
LOG: (query) SET search_path TO pg_catalog, pg_temp, public
LOG: (query) select repack.version(), repack.version_sql()
LOG: (query) SET statement_timeout = 0
LOG: (query) SET search_path = pg_catalog, pg_temp, public
LOG: (query) SET client_min_messages = warning
LOG: (query) SELECT t.*, coalesce(v.tablespace, t.tablespace_orig) as tablespace_dest FROM repack.tables t,  (VALUES (quote_ident($1::text))) as v (tablespace) WHERE (relid = $2::regclass) ORDER BY t.relname, t.schemaname
LOG: (param:0) = (null)
LOG: (param:1) = wipe_oid_test
INFO: repacking table "public.wipe_oid_test"
LOG: (query) SELECT pg_try_advisory_lock($1, CAST(-2147483648 + $2::bigint AS integer))
LOG: (param:0) = 16185446
LOG: (param:1) = 2273648077
LOG: (query) BEGIN ISOLATION LEVEL READ COMMITTED
LOG: (query) SET LOCAL statement_timeout = 100
LOG: (query) LOCK TABLE public.wipe_oid_test IN ACCESS EXCLUSIVE MODE
LOG: (query) RESET statement_timeout
LOG: (query) SELECT pg_get_indexdef(indexrelid) FROM pg_index WHERE indrelid = $1 AND NOT indisvalid
LOG: (param:0) = 2273648077
LOG: (query) SELECT indexrelid, repack.repack_indexdef(indexrelid, indrelid, $2, FALSE)  FROM pg_index WHERE indrelid = $1 AND indisvalid
LOG: (param:0) = 2273648077
LOG: (param:1) = (null)
LOG: (query) SELECT repack.conflicted_triggers($1)
LOG: (param:0) = 2273648077
LOG: (query) CREATE TYPE repack.pk_2273648077 AS (k text)
LOG: (query) CREATE TABLE repack.log_2273648077 (id bigserial PRIMARY KEY, pk repack.pk_2273648077, row public.wipe_oid_test)
LOG: (query) CREATE TRIGGER repack_trigger AFTER INSERT OR DELETE OR UPDATE ON public.wipe_oid_test FOR EACH ROW EXECUTE PROCEDURE repack.repack_trigger('INSERT INTO repack.log_2273648077(pk, row) VALUES( CASE WHEN $1 IS NULL THEN NULL ELSE (ROW($1.k)::repack.pk_2273648077) END, $2)')
LOG: (query) ALTER TABLE public.wipe_oid_test ENABLE ALWAYS TRIGGER repack_trigger
LOG: (query) SELECT repack.disable_autovacuum('repack.log_2273648077')
LOG: (query) BEGIN ISOLATION LEVEL READ COMMITTED
LOG: (query) SELECT pg_backend_pid()
LOG: (query) SELECT pid FROM pg_locks WHERE locktype = 'relation' AND granted = false AND relation = 2273648077 AND mode = 'AccessExclusiveLock' AND pid <> pg_backend_pid()
LOG: (query) COMMIT
LOG: (query) BEGIN ISOLATION LEVEL SERIALIZABLE
LOG: (query) SELECT set_config('work_mem', current_setting('maintenance_work_mem'), true)
LOG: (query) SELECT coalesce(array_agg(l.virtualtransaction), '{}')   FROM pg_locks AS l   LEFT JOIN pg_stat_activity AS a     ON l.pid = a.pid   LEFT JOIN pg_database AS d     ON a.datid = d.oid   WHERE l.locktype = 'virtualxid'   AND l.pid NOT IN (pg_backend_pid(), $1)   AND (l.virtualxid, l.virtualtransaction) <> ('1/1', '-1/0')   AND (a.application_name IS NULL OR a.application_name <> $2)  AND a.query !~* E'^\\s*vacuum\\s+'   AND a.query !~ E'^autovacuum: '   AND ((d.datname IS NULL OR d.datname = current_database()) OR l.database = 0)
LOG: (param:0) = 15246
LOG: (param:1) = pg_repack
LOG: (query) DELETE FROM repack.log_2273648077
LOG: (query) SELECT pid FROM pg_locks WHERE locktype = 'relation' AND granted = false AND relation = 2273648077 AND mode = 'AccessExclusiveLock' AND pid <> pg_backend_pid()
LOG: (query) SET LOCAL statement_timeout = 100
LOG: (query) LOCK TABLE public.wipe_oid_test IN ACCESS SHARE MODE
LOG: (query) RESET statement_timeout
LOG: (query) CREATE TABLE repack.table_2273648077 WITH (oids = false) TABLESPACE pg_default AS SELECT k,v FROM ONLY public.wipe_oid_test WITH NO DATA
LOG: (query) INSERT INTO repack.table_2273648077 SELECT k,v FROM ONLY public.wipe_oid_test
LOG: (query) SELECT repack.disable_autovacuum('repack.table_2273648077')
LOG: (query) COMMIT
LOG: (query) CREATE UNIQUE INDEX index_2273648083 ON repack.table_2273648077 USING btree (k)
LOG: (query) SELECT repack.repack_apply($1, $2, $3, $4, $5, $6)
LOG: (param:0) = SELECT * FROM repack.log_2273648077 ORDER BY id LIMIT $1
LOG: (param:1) = INSERT INTO repack.table_2273648077 VALUES ($1.*)
LOG: (param:2) = DELETE FROM repack.table_2273648077 WHERE (k) = ($1.k)
LOG: (param:3) = UPDATE repack.table_2273648077 SET (k, v) = ($2.k, $2.v) WHERE (k) = ($1.k)
LOG: (param:4) = DELETE FROM repack.log_2273648077 WHERE id IN (
LOG: (param:5) = 1000
LOG: (query) SELECT pid FROM pg_locks WHERE locktype = 'virtualxid' AND pid <> pg_backend_pid() AND virtualtransaction = ANY($1)
LOG: (param:0) = {}
LOG: (query) SAVEPOINT repack_sp1
LOG: (query) SET LOCAL statement_timeout = 100
LOG: (query) LOCK TABLE public.wipe_oid_test IN ACCESS EXCLUSIVE MODE
LOG: (query) RESET statement_timeout
LOG: (query) SELECT repack.repack_apply($1, $2, $3, $4, $5, $6)
LOG: (param:0) = SELECT * FROM repack.log_2273648077 ORDER BY id LIMIT $1
LOG: (param:1) = INSERT INTO repack.table_2273648077 VALUES ($1.*)
LOG: (param:2) = DELETE FROM repack.table_2273648077 WHERE (k) = ($1.k)
LOG: (param:3) = UPDATE repack.table_2273648077 SET (k, v) = ($2.k, $2.v) WHERE (k) = ($1.k)
LOG: (param:4) = DELETE FROM repack.log_2273648077 WHERE id IN (
LOG: (param:5) = 0
LOG: (query) SELECT repack.repack_swap($1)
LOG: (param:0) = 2273648077
LOG: (query) COMMIT
LOG: (query) BEGIN ISOLATION LEVEL READ COMMITTED
LOG: (query) SAVEPOINT repack_sp1
LOG: (query) SET LOCAL statement_timeout = 100
LOG: (query) LOCK TABLE public.wipe_oid_test IN ACCESS EXCLUSIVE MODE
LOG: (query) RESET statement_timeout
LOG: (query) SELECT repack.repack_drop($1, $2)
LOG: (param:0) = 2273648077
LOG: (param:1) = 4
LOG: (query) COMMIT
LOG: (query) BEGIN ISOLATION LEVEL READ COMMITTED
LOG: (query) ANALYZE public.wipe_oid_test
LOG: (query) COMMIT
LOG: (query) SELECT pg_advisory_unlock($1, CAST(-2147483648 + $2::bigint AS integer))
LOG: (param:0) = 16185446
LOG: (param:1) = 2273648077


But PostgreSQL still thinks that the table has oids:

mydata=# \d+ wipe_oid_test
                   Table "public.wipe_oid_test"
 Column | Type | Modifiers | Storage  | Stats target | Description 
--------+------+-----------+----------+--------------+-------------
 k      | text | not null  | extended |              | 
 v      | text |           | extended |              | 
Indexes:
    "wipe_oid_test_pkey" PRIMARY KEY, btree (k)
Has OIDs: yes

mydata=# select oid,* from wipe_oid_test;
 oid | k | v 
-----+---+---
   0 | 1 | 2
   0 | 3 | 4
   0 | a | b
   0 | c | d
(4 rows)

I can modify pg_class and set relhasoids = false, but it isn't actually eliminating the oid column. `\d+` will report not report that it has oids, but the oid column is still present and returns the same result before updating pg_class.

So I'm definitely missing something. I really need a point in the right direction.... Please help! ;)

CG




Re: Removing oids with pg_repack

From
Achilleas Mantzios
Date:
Στις 21/11/23 20:41, ο/η CG έγραψε:
I have a very large PostgreSQL 9.5 database that still has very large tables with oids. I'm trying to get rid of the oids with as little downtime as possible so I can prep the database for upgrade past PostgreSQL 11. I had a wild idea to mod pg_repack to write a new table without oids. I think it almost works. 

To test out my idea I made a new table wipe_oid_test with oids. I filled it with a few rows of data.
........
But PostgreSQL still thinks that the table has oids:

mydata=# \d+ wipe_oid_test
                   Table "public.wipe_oid_test"
 Column | Type | Modifiers | Storage  | Stats target | Description 
--------+------+-----------+----------+--------------+-------------
 k      | text | not null  | extended |              | 
 v      | text |           | extended |              | 
Indexes:
    "wipe_oid_test_pkey" PRIMARY KEY, btree (k)
Has OIDs: yes
Except where does it mention in the pg_repack docs (or source) that it is meant to be used for NO OIDS conversion ?
I can modify pg_class and set relhasoids = false, but it isn't actually eliminating the oid column. `\d+` will report not report that it has oids, but the oid column is still present and returns the same result before updating pg_class.

Just Dont!
So I'm definitely missing something. I really need a point in the right direction.... Please help! ;)

There are a few of methods to get rid of OIDs :

- ALTER TABLE .. SET WITHOUT OIDS (just mentioning, you already checked that)

- Use table copy +  use of a trigger to log changes : https://dba.stackexchange.com/questions/259359/eliminating-oids-while-upgrading-postgresql-from-9-4-to-12

- Use of Inheritance (the most neat solution I have seen, this is what I used for a 2TB table conversion) : https://www.percona.com/blog/performing-etl-using-inheritance-in-postgresql/


CG




-- 
Achilleas Mantzios IT DEV - HEAD IT DEPT Dynacom Tankers Mgmt

Re: Removing oids with pg_repack

From
CG
Date:


On Wednesday, November 22, 2023 at 01:20:18 AM EST, Achilleas Mantzios <a.mantzios@cloud.gatewaynet.com> wrote:


Στις 21/11/23 20:41, ο/η CG έγραψε:
I have a very large PostgreSQL 9.5 database that still has very large tables with oids. I'm trying to get rid of the oids with as little downtime as possible so I can prep the database for upgrade past PostgreSQL 11. I had a wild idea to mod pg_repack to write a new table without oids. I think it almost works. 

To test out my idea I made a new table wipe_oid_test with oids. I filled it with a few rows of data.
........
But PostgreSQL still thinks that the table has oids:

mydata=# \d+ wipe_oid_test
                   Table "public.wipe_oid_test"
 Column | Type | Modifiers | Storage  | Stats target | Description 
--------+------+-----------+----------+--------------+-------------
 k      | text | not null  | extended |              | 
 v      | text |           | extended |              | 
Indexes:
    "wipe_oid_test_pkey" PRIMARY KEY, btree (k)
Has OIDs: yes
Except where does it mention in the pg_repack docs (or source) that it is meant to be used for NO OIDS conversion ?

It does not-- I was trying to leverage and tweak the base functionality of pg_repack which sets up triggers and migrates data. I figured if the target table was created without OIDs that when pg_repack did the "swap" operation that the new table would take over with the added bonus of not having oids.

I can modify pg_class and set relhasoids = false, but it isn't actually eliminating the oid column. `\d+` will report not report that it has oids, but the oid column is still present and returns the same result before updating pg_class.

Just Dont!

Noted. ;)

So I'm definitely missing something. I really need a point in the right direction.... Please help! ;)

There are a few of methods to get rid of OIDs :

- ALTER TABLE .. SET WITHOUT OIDS (just mentioning, you already checked that)

This makes the database unusable for hours and hours and hours because it locks the table entirely while it performs the operation. That's just something that we can't afford.


That SO is not quite the effect I'm going for. The poster of that SO was using OIDS in their application and needed a solution to maintain those values after conversion. I simply want to eliminate them without the extraordinary downtime the database would experience during ALTER operations.

- Use of Inheritance (the most neat solution I have seen, this is what I used for a 2TB table conversion) : https://www.percona.com/blog/performing-etl-using-inheritance-in-postgresql/

This is closest to the effect I was going for. pg_repack essentially creates a second table and fills it with the data from the first table while ensuring standard db operations against that table continue to function while the data is being moved from the old table to the new table. The process outlined in the Percona ETL strategy has to be repeated per-table, which is work I was hoping to avoid by leveraging 95% of the functionality of pg_repack while supplying my own 5% as the resulting table would not have oids regardless of the source table's configuration.

For my experiment, Table A did have oids. Table B (created by pg_repack) did not (at least at creation). When the "swap" operation happened in pg_repack, the metadata for Table A was assigned to Table B. I'm just trying to figure out what metadata I need to change in the system tables to reflect the actual table structure. 

I have the fallback position for the Percona ETL strategy. But I feel like I'm REALLY close with pg_repack and I just don't understand enough about the system internals to nudge it to correctness and need some expert assistance to tap it in the hole.


CG




-- 
Achilleas Mantzios IT DEV - HEAD IT DEPT Dynacom Tankers Mgmt

Re: Removing oids with pg_repack

From
Ron Johnson
Date:
On Tue, Nov 21, 2023 at 1:43 PM CG <cgg007@yahoo.com> wrote:
I have a very large PostgreSQL 9.5 database that still has very large tables with oids. I'm trying to get rid of the oids with as little downtime as possible so I can prep the database for upgrade past PostgreSQL 11. I had a wild idea to mod pg_repack to write a new table without oids. I think it almost works. 

Too bad that pg_dump doesn't have an --exclude-oids option. 

Re: Removing oids with pg_repack

From
Tom Lane
Date:
CG <cgg007@yahoo.com> writes:
> I have the fallback position for the Percona ETL strategy. But I feel like I'm REALLY close with pg_repack and I just
don'tunderstand enough about the system internals to nudge it to correctness and need some expert assistance to tap it
inthe hole. 

The only "system metadata" that would need changing is
pg_class.relhasoids.  However, it's certainly necessary to rewrite
all the physical table rows (and hence all the table's indexes).

I don't know a lot about pg_repack, but I'm a bit skeptical that
it can really make that transparent.  I'm even more skeptical that
it could almost support it except this obvious use-case never
occurred to the developers.  I suspect that either the support
already exists but you missed it, or else there's some fundamental
stumbling block.  Maybe go discuss this with the pg_repack
developers?  (Maybe they read pgsql-general, but I wouldn't count
on it.)

            regards, tom lane



Re: Removing oids with pg_repack

From
Achilleas Mantzios
Date:
Στις 22/11/23 15:14, ο/η CG έγραψε:


On Wednesday, November 22, 2023 at 01:20:18 AM EST, Achilleas Mantzios <a.mantzios@cloud.gatewaynet.com> wrote:


Στις 21/11/23 20:41, ο/η CG έγραψε:
I have a very large PostgreSQL 9.5 database that still has very large tables with oids. I'm trying to get rid of the oids with as little downtime as possible so I can prep the database for upgrade past PostgreSQL 11. I had a wild idea to mod pg_repack to write a new table without oids. I think it almost works. 

To test out my idea I made a new table wipe_oid_test with oids. I filled it with a few rows of data.
........
But PostgreSQL still thinks that the table has oids:

mydata=# \d+ wipe_oid_test
                   Table "public.wipe_oid_test"
 Column | Type | Modifiers | Storage  | Stats target | Description 
--------+------+-----------+----------+--------------+-------------
 k      | text | not null  | extended |              | 
 v      | text |           | extended |              | 
Indexes:
    "wipe_oid_test_pkey" PRIMARY KEY, btree (k)
Has OIDs: yes
Except where does it mention in the pg_repack docs (or source) that it is meant to be used for NO OIDS conversion ?

It does not-- I was trying to leverage and tweak the base functionality of pg_repack which sets up triggers and migrates data. I figured if the target table was created without OIDs that when pg_repack did the "swap" operation that the new table would take over with the added bonus of not having oids.

I can modify pg_class and set relhasoids = false, but it isn't actually eliminating the oid column. `\d+` will report not report that it has oids, but the oid column is still present and returns the same result before updating pg_class.

Just Dont!

Noted. ;)

So I'm definitely missing something. I really need a point in the right direction.... Please help! ;)

There are a few of methods to get rid of OIDs :

- ALTER TABLE .. SET WITHOUT OIDS (just mentioning, you already checked that)

This makes the database unusable for hours and hours and hours because it locks the table entirely while it performs the operation. That's just something that we can't afford.


That SO is not quite the effect I'm going for. The poster of that SO was using OIDS in their application and needed a solution to maintain those values after conversion. I simply want to eliminate them without the extraordinary downtime the database would experience during ALTER operations.
Sorry I meant this one : https://dba.stackexchange.com/questions/286453/stripping-oids-from-tables-in-preparation-for-pg-upgrade

- Use of Inheritance (the most neat solution I have seen, this is what I used for a 2TB table conversion) : https://www.percona.com/blog/performing-etl-using-inheritance-in-postgresql/

This is closest to the effect I was going for. pg_repack essentially creates a second table and fills it with the data from the first table while ensuring standard db operations against that table continue to function while the data is being moved from the old table to the new table. The process outlined in the Percona ETL strategy has to be repeated per-table, which is work I was hoping to avoid by leveraging 95% of the functionality of pg_repack while supplying my own 5% as the resulting table would not have oids regardless of the source table's configuration.

For my experiment, Table A did have oids. Table B (created by pg_repack) did not (at least at creation). When the "swap" operation happened in pg_repack, the metadata for Table A was assigned to Table B. I'm just trying to figure out what metadata I need to change in the system tables to reflect the actual table structure. 

I have the fallback position for the Percona ETL strategy. But I feel like I'm REALLY close with pg_repack and I just don't understand enough about the system internals to nudge it to correctness and need some expert assistance to tap it in the hole.
Why don't just inspect the code pg_repack ?


CG




-- 
Achilleas Mantzios IT DEV - HEAD IT DEPT Dynacom Tankers Mgmt
-- 
Achilleas Mantzios IT DEV - HEAD IT DEPT Dynacom Tankers Mgmt

Re: Removing oids with pg_repack

From
CG
Date:


On Wednesday, November 22, 2023 at 12:38:54 PM EST, Achilleas Mantzios <a.mantzios@cloud.gatewaynet.com> wrote:


Στις 22/11/23 15:14, ο/η CG έγραψε:


On Wednesday, November 22, 2023 at 01:20:18 AM EST, Achilleas Mantzios <a.mantzios@cloud.gatewaynet.com> wrote:


Στις 21/11/23 20:41, ο/η CG έγραψε:
I have a very large PostgreSQL 9.5 database that still has very large tables with oids. I'm trying to get rid of the oids with as little downtime as possible so I can prep the database for upgrade past PostgreSQL 11. I had a wild idea to mod pg_repack to write a new table without oids. I think it almost works. 

To test out my idea I made a new table wipe_oid_test with oids. I filled it with a few rows of data.
........
But PostgreSQL still thinks that the table has oids:

mydata=# \d+ wipe_oid_test
                   Table "public.wipe_oid_test"
 Column | Type | Modifiers | Storage  | Stats target | Description 
--------+------+-----------+----------+--------------+-------------
 k      | text | not null  | extended |              | 
 v      | text |           | extended |              | 
Indexes:
    "wipe_oid_test_pkey" PRIMARY KEY, btree (k)
Has OIDs: yes
Except where does it mention in the pg_repack docs (or source) that it is meant to be used for NO OIDS conversion ?

It does not-- I was trying to leverage and tweak the base functionality of pg_repack which sets up triggers and migrates data. I figured if the target table was created without OIDs that when pg_repack did the "swap" operation that the new table would take over with the added bonus of not having oids.

I can modify pg_class and set relhasoids = false, but it isn't actually eliminating the oid column. `\d+` will report not report that it has oids, but the oid column is still present and returns the same result before updating pg_class.

Just Dont!

Noted. ;)

So I'm definitely missing something. I really need a point in the right direction.... Please help! ;)

There are a few of methods to get rid of OIDs :

- ALTER TABLE .. SET WITHOUT OIDS (just mentioning, you already checked that)

This makes the database unusable for hours and hours and hours because it locks the table entirely while it performs the operation. That's just something that we can't afford.


That SO is not quite the effect I'm going for. The poster of that SO was using OIDS in their application and needed a solution to maintain those values after conversion. I simply want to eliminate them without the extraordinary downtime the database would experience during ALTER operations.
Sorry I meant this one : Stripping OIDs from tables in preparation for pg_upgrade





This is the same idea as the percona ETL strategy, and essentially 90% of what pg_repack already does (creates new tables, sets up triggers, locks the tables, and swaps new for old at the end of the process) 



- Use of Inheritance (the most neat solution I have seen, this is what I used for a 2TB table conversion) : https://www.percona.com/blog/performing-etl-using-inheritance-in-postgresql/

This is closest to the effect I was going for. pg_repack essentially creates a second table and fills it with the data from the first table while ensuring standard db operations against that table continue to function while the data is being moved from the old table to the new table. The process outlined in the Percona ETL strategy has to be repeated per-table, which is work I was hoping to avoid by leveraging 95% of the functionality of pg_repack while supplying my own 5% as the resulting table would not have oids regardless of the source table's configuration.

For my experiment, Table A did have oids. Table B (created by pg_repack) did not (at least at creation). When the "swap" operation happened in pg_repack, the metadata for Table A was assigned to Table B. I'm just trying to figure out what metadata I need to change in the system tables to reflect the actual table structure. 

I have the fallback position for the Percona ETL strategy. But I feel like I'm REALLY close with pg_repack and I just don't understand enough about the system internals to nudge it to correctness and need some expert assistance to tap it in the hole.
Why don't just inspect the code pg_repack ?

I have, and I have modified pg_repack (modification was shown in my first post) to create and write to a new table without oids, the problem is when the "swap" operation happens the old tabledefs with all the old oid baggage gets mapped on top of the new table that doesn't have oids in it. I need to know what PostgreSQL is seeing in the tabledefs that makes it think this new table, swapped out with the old table, has oids. My thought is if I correct those values in pg_class and elsewhere, the tabledefs will match what is actually on the filesystem after my modified pg_repack has finished processing the tables.



CG




-- 
Achilleas Mantzios IT DEV - HEAD IT DEPT Dynacom Tankers Mgmt
-- 
Achilleas Mantzios IT DEV - HEAD IT DEPT Dynacom Tankers Mgmt

Re: Removing oids with pg_repack

From
Achilleas Mantzios
Date:
Στις 27/11/23 16:51, ο/η CG έγραψε:


On Wednesday, November 22, 2023 at 12:38:54 PM EST, Achilleas Mantzios <a.mantzios@cloud.gatewaynet.com> wrote:


Στις 22/11/23 15:14, ο/η CG έγραψε:


On Wednesday, November 22, 2023 at 01:20:18 AM EST, Achilleas Mantzios <a.mantzios@cloud.gatewaynet.com> wrote:


Στις 21/11/23 20:41, ο/η CG έγραψε:
I have a very large PostgreSQL 9.5 database that still has very large tables with oids. I'm trying to get rid of the oids with as little downtime as possible so I can prep the database for upgrade past PostgreSQL 11. I had a wild idea to mod pg_repack to write a new table without oids. I think it almost works. 

To test out my idea I made a new table wipe_oid_test with oids. I filled it with a few rows of data.
........
But PostgreSQL still thinks that the table has oids:

mydata=# \d+ wipe_oid_test
                   Table "public.wipe_oid_test"
 Column | Type | Modifiers | Storage  | Stats target | Description 
--------+------+-----------+----------+--------------+-------------
 k      | text | not null  | extended |              | 
 v      | text |           | extended |              | 
Indexes:
    "wipe_oid_test_pkey" PRIMARY KEY, btree (k)
Has OIDs: yes
Except where does it mention in the pg_repack docs (or source) that it is meant to be used for NO OIDS conversion ?

It does not-- I was trying to leverage and tweak the base functionality of pg_repack which sets up triggers and migrates data. I figured if the target table was created without OIDs that when pg_repack did the "swap" operation that the new table would take over with the added bonus of not having oids.

I can modify pg_class and set relhasoids = false, but it isn't actually eliminating the oid column. `\d+` will report not report that it has oids, but the oid column is still present and returns the same result before updating pg_class.

Just Dont!

Noted. ;)

So I'm definitely missing something. I really need a point in the right direction.... Please help! ;)

There are a few of methods to get rid of OIDs :

- ALTER TABLE .. SET WITHOUT OIDS (just mentioning, you already checked that)

This makes the database unusable for hours and hours and hours because it locks the table entirely while it performs the operation. That's just something that we can't afford.


That SO is not quite the effect I'm going for. The poster of that SO was using OIDS in their application and needed a solution to maintain those values after conversion. I simply want to eliminate them without the extraordinary downtime the database would experience during ALTER operations.
Sorry I meant this one : Stripping OIDs from tables in preparation for pg_upgrade





This is the same idea as the percona ETL strategy, and essentially 90% of what pg_repack already does (creates new tables, sets up triggers, locks the tables, and swaps new for old at the end of the process) 



- Use of Inheritance (the most neat solution I have seen, this is what I used for a 2TB table conversion) : https://www.percona.com/blog/performing-etl-using-inheritance-in-postgresql/

This is closest to the effect I was going for. pg_repack essentially creates a second table and fills it with the data from the first table while ensuring standard db operations against that table continue to function while the data is being moved from the old table to the new table. The process outlined in the Percona ETL strategy has to be repeated per-table, which is work I was hoping to avoid by leveraging 95% of the functionality of pg_repack while supplying my own 5% as the resulting table would not have oids regardless of the source table's configuration.

For my experiment, Table A did have oids. Table B (created by pg_repack) did not (at least at creation). When the "swap" operation happened in pg_repack, the metadata for Table A was assigned to Table B. I'm just trying to figure out what metadata I need to change in the system tables to reflect the actual table structure. 

I have the fallback position for the Percona ETL strategy. But I feel like I'm REALLY close with pg_repack and I just don't understand enough about the system internals to nudge it to correctness and need some expert assistance to tap it in the hole.
Why don't just inspect the code pg_repack ?

I have, and I have modified pg_repack (modification was shown in my first post) to create and write to a new table without oids, the problem is when the "swap" operation happens the old tabledefs with all the old oid baggage gets mapped on top of the new table that doesn't have oids in it. I need to know what PostgreSQL is seeing in the tabledefs that makes it think this new table, swapped out with the old table, has oids. My thought is if I correct those values in pg_class and elsewhere, the tabledefs will match what is actually on the filesystem after my modified pg_repack has finished processing the tables.


Hi, I think pg_repack eventually calls : src/backend/access/heap/heapam.c : heap_update , and this just makes sure all cols of both old and new pg_class  have same vals.

That's why, the new table ends up with hasoids = true.




CG




-- 
Achilleas Mantzios IT DEV - HEAD IT DEPT Dynacom Tankers Mgmt
-- 
Achilleas Mantzios IT DEV - HEAD IT DEPT Dynacom Tankers Mgmt
-- 
Achilleas Mantzios IT DEV - HEAD IT DEPT Dynacom Tankers Mgmt