Re: [BUG] pg_dump does not properly deal with BEGIN ATOMIC function - Mailing list pgsql-hackers

From Morris de Oryx
Subject Re: [BUG] pg_dump does not properly deal with BEGIN ATOMIC function
Date
Msg-id CAKqncci1GoprLGtcSLk4ytPBNeQVW9Ab=CyJUdbiUkWy28WN8g@mail.gmail.com
Whole thread Raw
In response to [BUG] pg_dump does not properly deal with BEGIN ATOMIC function  ("Imseih (AWS), Sami" <simseih@amazon.com>)
Responses Re: [BUG] pg_dump does not properly deal with BEGIN ATOMIC function
Re: [BUG] pg_dump does not properly deal with BEGIN ATOMIC function
List pgsql-hackers
Thanks to Imseih and Sami at AWS for reporting this. The original case comes from an upgrade I've been trying to complete for a couple of months now, since RDS started supporting 15 with a 15.2 release.

The process has been slow and painful because, originally, there was a bug on the RDS side that stopped any of the upgrade logs from appearing in RDS or CloudWatch. Now, minimal log errors are shown, but not much detail.

I think that BEGIN ATOMIC is the sleeper feature of Postgres 14. It is a fantastic addition to the dependency-tracking system. However, it does not seem to work. 

I just found this thread now looking for the string warning: could not resolve dependency loop among these items. I got that far by setting up a new database, and a simple test case that reproduces the problem. I called the test database ba for Begin Atomic:

------------------------------------
-- New database
------------------------------------
CREATE DATABASE ba;

------------------------------------
-- Connect
------------------------------------
-- Connect to new database...

------------------------------------
-- Setup schemas
------------------------------------
CREATE SCHEMA data; -- I don't have public running, so create a schema.

-- All of my UPSERT view and function plumbing is tucked away here:
CREATE SCHEMA types_plus;

------------------------------------
-- Define table
------------------------------------ 
DROP TABLE IF EXISTS data.test_event;

CREATE TABLE IF NOT EXISTS data.test_event (
    id        uuid          NOT NULL DEFAULT NULL PRIMARY KEY,

    ts_dts    timestamp     NOT NULL DEFAULT 'epoch',

    who       text          NOT NULL DEFAULT NULL,
    what      text          NOT NULL DEFAULT NULL
);

-- PK is created by default as test_event_pkey, used in ON CONFLICT later.

------------------------------------
-- Create view, get type for free
------------------------------------ 
CREATE VIEW types_plus.test_event_v1 AS

SELECT
       id,
       ts_dts,
       who,
       what

  FROM data.test_event;

-- Create a function to accept an array of rows formatted as test_event_v1 for UPSERT into test_event.
DROP FUNCTION IF EXISTS types_plus.insert_test_event_v1 (types_plus.test_event_v1[]);

CREATE OR REPLACE FUNCTION types_plus.insert_test_event_v1 (data_in types_plus.test_event_v1[])

RETURNS void

LANGUAGE SQL

BEGIN ATOMIC

INSERT INTO data.test_event (
  id,
  ts_dts,
  who,
  what)

SELECT
  rows_in.id,
  rows_in.ts_dts,
  rows_in.who,
  rows_in.what

FROM unnest(data_in) as rows_in

ON CONFLICT ON CONSTRAINT test_event_pkey DO UPDATE SET
  id         = EXCLUDED.id,
  ts_dts     = EXCLUDED.ts_dts,
  who        = EXCLUDED.who,
  what       = EXCLUDED.what;

END;


I've tested pg_dump with the plain, custom, directory, and tar options. All report the same problem:


Note that I'm using Postgres and pg_dump 14.8 locally, RDS is still at 14.7 of Postgres and presumably 14.7 of pg_dump

pg_dump: warning: could not resolve dependency loop among these items:
pg_dump:   FUNCTION insert_test_event_v1  (ID 224 OID 1061258)
pg_dump:   CONSTRAINT test_event_pkey  (ID 3441 OID 1061253)
pg_dump:   POST-DATA BOUNDARY  (ID 3584)
pg_dump:   TABLE DATA test_event  (ID 3582 OID 1061246)
pg_dump:   PRE-DATA BOUNDARY  (ID 3583)


Hunting around earlier, I found a thread here from 2020 that mentioned that BEGIN ATOMIC was going to make dependency resolution tougher for pg_dump. Makes sense, it can become circular or ambiguous in a hurry. However, in my case, I don't see that the dependencies are any kind of crazy spaghetti. I have hundreds of tables with the same pattern of dependencies for UPSERT work:

1. CREATE TABLE foo
2. CREATE PK foo_pk and other constraints.
3. CREATE VIEW foo_v1  (I could use CREATE TYPE here, for my purposes, but prefer CREATE VIEW.)
4. CREATE FUNCTION insert_foo_v1 (foo_v1[])


The example I listed earlier is a simplified version of this. I didn't even check that the new database works, that's not important....I am only trying to check out pg_dump/pg_restore.

Can anyone suggest a path forward for me with the upgrade to PG 15? I'm waiting on that as we need to use MERGE and I'd like other PG 15 improvements, like the sort optimizations. As far as I can see it, my best bet is to 

1. Delete all of my routines with BEGIN ATOMIC. That's roughly 250 routines.
2. Upgrade.
3. Add back in the routines in PG 15.

That likely would work for me as my dependencies are shallow and not circular. They simply require a specific order. I avoid chaining views of views and functions off functions as a deliberate practice in Postgres.

Down the track, does my sort of dependency problem seem resolvable by pg_dump? I've got my own build-the-system-from-scratch system that use for local testing out of the source files, and I had to resort to hinting files to inject some things in the correct order. So, I'm not assuming that it is possible for pg_dump to resolve all sequences. Then again, all of this could go away if DDL dependency checking were deferrable. But, I'm just a Postgres user, not a C-coder.

Thanks for looking at this bug, thanks again for the AWS staff for posting it, and thanks for any suggestions on my day-to-day problem of upgrading.

pgsql-hackers by date:

Previous
From: "蔡梦娟(玊于)"
Date:
Subject: Fix missing initialization of delayChkptEnd
Next
From: Morris de Oryx
Date:
Subject: Re: [BUG] pg_dump does not properly deal with BEGIN ATOMIC function