Support logical replication of large objects - Mailing list pgsql-hackers

From Borui Yang
Subject Support logical replication of large objects
Date
Msg-id CAOiV-_9_9yZbaaH-NXCOf-547RfThN182-A8kGRj8Tq4LCWu5Q@mail.gmail.com
Whole thread Raw
List pgsql-hackers
Hello,

I’m working on a patch to support logical replication of large objects
(LOBs). This is a useful feature when a database in logical
replication has lots of tables, functions and other objects that
change over time, such as in online cross major version upgrade.  As
an example, this lets users replicate large objects between different
PostgreSQL major versions.

The topic was previously discussed in [1]. Moreover, we need to
address the following 3 challenges. I worked on some designs and
appreciate feedback :

1. Replication of the change stream of LOBs
    My suggestion is that we can just add a check when any LOB
function or API is called and executed in the backend, and then add a
simple SQL command in WAL files to do the replication . Take lo_unlink
as example[2] : we can create a “simple” SQL like  SELECT
lo_unlink(<PID>); and log it in WAL, so that replica only needs to
replay the “simple” SQL command. We can unlink the LOBs in replica
accordingly.
    Pros :
    a. We do not need to add much additional functionality.
    b. For most of the LOBs related APIs, we don’t need to touch whole
LOBs, except for the case creation of LOBs.
    Cons:
    a. For the case creation of LOBs, we may need to split the whole
LOB content into WAL files which will increase volume of WAL and
replicated writes dramatically. This could be prevented if we can make
sure the original file is publicly shared, like a url from cloud
storage, or exists on host on replica as well.
2. Initializing replication of LOBs
    When a subscription is established, LOBs in the source should be
replicated even if they are not created in replica. Here are two
options to approach this problem:
    Option 1 : Support LOB related system catalogs in logical replication
    We can make an exception in this line[3] in the
“check_publication_add_schema” function.
           Pros: All required LOBs can be transferred to replica.
           Cons: There is currently no support for allowing logical
replication of system catalogs.
    Option 2 : Run a function or a script from source instance when it
detects logical replication is established.
    The point is that we can ask the source to replicate whole LOBs
when a new subscription is created.
    Maybe we can copy the whole LOBs related system catalogs and
replicate the copy to replica, then restore the original LOBs into
replica from the copy.
           Cons :  This will increase the volume of WAL and replicated
writes dramatically. I currently do not have a preference on either
option. I would like to see if others have thoughts on how we could
approach this.
3. OID conflicts
    A common case is that the OID we want to publish is already used
in subscriber.
    Option 1 (My recommendation) :  Create/Update existing System
catalog for mapping the OID if conflict happens
    Maybe we could add another column naming like mapping_oid in
system catalog pg_largeobject_metaqdate on the replica. When the
replica detects the OID (E.g. 16400) that  source is replicating is
already used in replica, replica could store the 16400 as mapping_oid
and create a new OID (E.g. 16500) as oid to be used in replica, so
whatever operation is done on 16400 in source, in replica we just need
to perform on 16500.
           Cons : We would need to add additional columns to the system catalog
    Option 2 :  Prompt error message in Replica and let user handle it manually
           Cons : This is not user-friendly.

Please let me know your thoughts.

Borui Yang
Amazon RDS/Aurora for PostgreSQL

[1] https://www.postgresql.org/message-id/VisenaEmail.16.35d9e854e3626e81.15cd0de93df%40tc7-visena
[2] https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/libpq/be-fsstubs.c#l312
[3] https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/catalog/pg_publication.c#l98



pgsql-hackers by date:

Previous
From: "Gurjeet"
Date:
Subject: Re: Auto explain after query timeout
Next
From: "Jonathan S. Katz"
Date:
Subject: Re: why can't a table be part of the same publication as its schema