Re: Non-text mode for pg_dumpall - Mailing list pgsql-hackers

From Mahendra Singh Thalor
Subject Re: Non-text mode for pg_dumpall
Date
Msg-id CAKYtNAoyQV-UKosfFDfm_ZqdC-oYaRT-01jMv1k+o9OSyXovTg@mail.gmail.com
Whole thread Raw
In response to Re: Non-text mode for pg_dumpall  (jian he <jian.universality@gmail.com>)
Responses Re: Non-text mode for pg_dumpall
List pgsql-hackers
On Tue, 18 Feb 2025 at 10:00, Srinath Reddy <srinath2133@gmail.com> wrote:
>
> Hi,
> i think during restore we should not force user to use -C during cases like
> ./pg_restore pdd -g -f -
> ./pg_restore pdd -a -f -
> ./pg_restore pdd -s -f -
> because its not good to use -C to create database every time when we are using these options individually.
> latest patch throws following error for all the above cases

Fixed. (./pg_restore pdd -g -f -)

Thanks Jian and Srinath for the review and testing.

On Tue, 18 Feb 2025 at 11:41, jian he <jian.universality@gmail.com> wrote:
>
> hi.
>
>  <refnamediv>
>   <refname>pg_restore</refname>
>   <refpurpose>
>    restore a <productname>PostgreSQL</productname> database from an
>    archive file created by <application>pg_dump</application>
>    or restore multiple <productname>PostgreSQL</productname> database from an
>    archive directory created by <application>pg_dumpall</application>
>   </refpurpose>
>  </refnamediv>
>
> i think it's way too verbose. we can change it to:
> <refpurpose>
>    restore <productname>PostgreSQL</productname> database from an
>    archive file created by <application>pg_dump</application> or
> <application>pg_dumpall</application>
>   </refpurpose>

Fixed.

>
>
>   <para>
>    <application>pg_restore</application> is a utility for restoring a
>    <productname>PostgreSQL</productname> database from an archive
>    created by <xref linkend="app-pgdump"/> in one of the non-plain-text
>    formats.
> we can change it to
>   <para>
>    <application>pg_restore</application> is a utility for restoring
>    <productname>PostgreSQL</productname> databases from an archive
>    created by <xref linkend="app-pgdump"/> or <xref
> linkend="app-pgdumpall"/> in one of the non-plain-text
>    formats.

Fixed.

>
>
> similarly, pg_dumpall first 3 sentences in the description section
> needs to change.
>

I think we can keep them for pg_dumpall.

>
> in pg_restore.sgml <option>--create</option section,
> maybe we can explicitly mention that restoring multiple databases,
> <option>--create</option> is required.
> like: "This option is required when restoring multiple databases."

Fixed.

>
>
> restoreAllDatabases
> + if (!conn)
> + pg_log_info("there is no database connection so consider pattern as
> simple name for --exclude-database");
> filter_dbnames_for_restore
> + if (!conn)
> + pg_log_info("considering PATTERN as NAME for --exclude-database
> option as no db connection while doing pg_restore.");
>
> these two log messages sent out the same information.
> maybe we can remove the first one, and change the second to
>     if (!conn && db_exclude_patterns.head != NULL)
>         pg_log_info("considering PATTERN as NAME for
> --exclude-database option as no db connection while doing
> pg_restore.");

Fixed.

>
>
> as mentioned in the previous thread, there is no need to change PrintTOCSummary.

Yes, I removed it.

>
>
> another minor issue about comments.
> I guess we can tolerate this minor issue.
> $BIN10/pg_restore --format=tar --create --file=1.sql
> --exclude-database=src10 --verbose tar10 > dir_format 2>&1
> 1.sql file will copy tar10/global.dat as is. but we already excluded
> src10. but 1.sql will still have comments as
> --
> -- Database "src10" dump
> --

Fixed.

>
> $BIN10/pg_dumpall --format=custom --file=x1.dump --globals-only
> $BIN10/pg_dumpall --format=custom --file=x2.dump
>
> Currently x1.dump/global.dat is differ from x2.dump/global.dat
> if we dump multiple databases using pg_dumpall we have
> "
> --
> -- Databases
> --
> --
> -- Database "template1" dump
> --
> --
> -- Database "src10" dump
> --
> --
> -- Database "x" dump
> --
> "
> maybe there are not need, since we already have map.dat file

Okay. Fixed.

>
>
> I am not sure if the following is as expected or not.
> $BIN10/pg_dumpall --format=custom --file=x1.dump --globals-only
> $BIN10/pg_restore --create --file=3.sql --globals-only x1.dump --verbose
> $BIN10/pg_restore --create --file=3.sql x1.dump --verbose
>
> the first pg_restore command  will copy x1.dump/global.dat as is to 3.sql,
> the second pg_restore will not copy anything to 3.sql.
> but the second command implies copying global dumps to 3.sql?

We should copy global.dat. I fixed this in the v18 patch.

On Tue, 18 Feb 2025 at 14:02, jian he <jian.universality@gmail.com> wrote:
>
> On Tue, Feb 18, 2025 at 2:10 PM jian he <jian.universality@gmail.com> wrote:
> >
> > hi.
>
> hi. more cosmetic minor issues.
>
> +static int
> +get_dbname_oid_list_from_mfile(const char *dumpdirpath,
> SimpleDatabaseOidList *dbname_oid_list)
> ...
> + /*
> + * XXX : before adding dbname into list, we can verify that this db
> + * needs to skipped for restore or not but as of now, we are making
> + * a list of all the databases.
> + */
> i think the above comment in get_dbname_oid_list_from_mfile is not necessary.
> we already have comments in filter_dbnames_for_restore.

As of now, I am keeping this comment as this will be helpful while
implementing parallel pg_restore.

>
> in get_dbname_oid_list_from_mfile:
> ```
>     pfile = fopen(map_file_path, PG_BINARY_R);
>     if (pfile == NULL)
>         pg_fatal("could not open map.dat file: \"%s\"", map_file_path);
> ```
> file does not exist, we use pg_fatal, so if the directory does not
> exist, we should also use pg_fatal.
> so
>     if (!IsFileExistsInDirectory(pg_strdup(dumpdirpath), "map.dat"))
>     {
>         pg_log_info("databases restoring is skipped as map.dat file is
> not present in \"%s\"", dumpdirpath);
>         return 0;
>     }
> can be
>     if (!IsFileExistsInDirectory(pg_strdup(dumpdirpath), "map.dat"))
>         pg_fatal("map.dat file: \"%s\"/map.dat does not exists", dumpdirpath);

No, we can't add FATAL here as in case  of global-only dump, we will
not have a map.dat file.

>
>
> + /* Report error if file has any corrupted data. */
> + if (!OidIsValid(db_oid) || strlen(dbname) == 0)
> + pg_fatal("invalid entry in map.dat file at line : %d", count + 1);
> i think the comments should be
> + /* Report error and exit if the file has any corrupted data. */

Fixed.

>
>
> +/*
> + * filter_dbnames_for_restore
> + *
> + * This will remove names from all dblist those can
> + * be constructed from database_exclude_pattern list.
> + *
> + * returns number of dbnames those will be restored.
> + */
> +static int
> +filter_dbnames_for_restore(PGconn *conn,
> +   SimpleDatabaseOidList *dbname_oid_list,
> there is no "database_exclude_pattern" list, so the above comments are
> slightly wrong.

Fixed.

>
>
> +/*
> + * ReadOneStatement
> + *
> + * This will start reading from passed file pointer using fgetc and read till
> + * semicolon(sql statement terminator for global.sql file)
> + *
> + * EOF is returned if end-of-file input is seen; time to shut down.
> + */
> here, "global sql" should change to "gloal.dat".

Fixed.

>
>
>   /* sync the resulting file, errors are not fatal */
> - if (dosync)
> + if (dosync && (archDumpFormat == archNull))
>   (void) fsync_fname(filename, false);
> does this mean pg_dumpall --no-sync option only works for plain format.
> if so, we need to update the pg_dumpall --no-sync section.
As of now, we are using this option with plain format as we dump
server commands in different db file. We can test this more.

On Wed, 19 Feb 2025 at 17:08, jian he <jian.universality@gmail.com> wrote:
>
> hi.
>
> Currently, pg_retore says
> --exit-on-error
> Exit if an error is encountered while sending SQL commands to the
> database. The default is to continue and to display a count of errors
> at the end of the restoration.
> Do we need to apply this to restore executing global commands (create
> role, create tablespace)?
> If not then we need to put some words in pg_restoe --exit-on-error
> option saying that while restoring global objects --exit-on-error
> option is ignored.

I think this is the same for all pg_restore commands. Still if we want
to add some docs, we can put.

>
>
>
> IMHO, in pg_restore.sgml, we need words explicitly saying that
> when restoring multiple databases, all the specified options will
> apply to each individual database.

We can skip this extra info. I will try in the next version if we can
add something in doc.

>
> I tested the following options for restoring multiple databases. The
> results look good to me.
> --index=index
> --table=table
> --schema-only
> --transaction-size
> --no-comments
> some part of (--filter=filename)
> --exclude-schema=schema

Thank you for detailed testing.

>
> attach is a minor cosmetic change.
Okay.

Here, I am attaching an updated patch for review and testing.

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

Attachment

pgsql-hackers by date:

Previous
From: Shlok Kyal
Date:
Subject: Re: Restrict copying of invalidated replication slots
Next
From: vignesh C
Date:
Subject: Re: Added schema level support for publication.