Re: pg_dump multi VALUES INSERT - Mailing list pgsql-hackers

From Fabien COELHO
Subject Re: pg_dump multi VALUES INSERT
Date
Msg-id alpine.DEB.2.21.1901231633430.16643@lancre
Whole thread Raw
In response to Re: pg_dump multi VALUES INSERT  (David Rowley <david.rowley@2ndquadrant.com>)
Responses Re: pg_dump multi VALUES INSERT  (David Rowley <david.rowley@2ndquadrant.com>)
Re: pg_dump multi VALUES INSERT  (David Rowley <david.rowley@2ndquadrant.com>)
List pgsql-hackers
Hello David,

> I thought about this and looked into it, but I decided it didn't look
> like a smart thing to do.  The reason is that if --inserts sets
> dump_inserts to 1 then --rows-per-insert sets it to something else
> that's likely fine, but if that happens in the opposite order then the
> --rows-per-insert gets overwritten with 1.

You can test before doing that!

   case X:
     if (opt.dump_inserts == 0)
       opt.dump_inserts = 1;
     // otherwise option is already set

> The bad news is the order that happens is defined by the order of the 
> command line args.

> It might be possible to make it work by having --inserts set some other 
> variable,

ISTM that it is enough to test whether the variable is zero.

> then set dump_inserts to 1 if it's set to 0 and the other variable is 
> set to >= 1... but that requires another variable, which is what you 
> want to avoid...

I still do not understand the need for another variable.

   int ninserts = 0; // default is to use copy
   while (getopt...)
   {
     switch (...) {
       case "--inserts":
         if (ninserts == 0) ninserts = 1;
         break;
       case "--rows-per-insert":
         ninserts = arg_value;
         checks...
         break;
      ...

> I think it's best to have a variable per argument.

I disagree, because it adds complexity where none is needed: here the new 
option is an extension of a previous one, thus the previous one just 
becomes a particular case, so it seems simpler to manage it as the 
particular case it is rather than a special case, creating the need for 
checking the consistency and so if two variables are used.

> I could get rid of the got_rows_per_insert variable, but it would
> require setting the default value for rows_per_insert in the main()
> function rather than in InitDumpOptions().   I thought
> InitDumpOptions() looked like just the place to do this, so went with
> that option.  To make it work without got_rows_per_insert,
> rows_per_insert would have to be 0 by default and we'd know we saw a
> --rows-per-insert command line arg by the fact that rows_per_insert
> was non-zero.  Would you rather have it that way?

Yep, esp as rows_per_insert & dump_inserts could be the same.

>> The feature is not tested anywhere. I still think that there should be a
>> test on empty/small/larger-than-rows-per-insert tables, possibly added to
>> existing TAP-tests.
>
> I was hoping to get away with not having to do that... mainly because
> I've no idea how.

Hmmm. That is another question! Maybe someone will help.

-- 
Fabien.


pgsql-hackers by date:

Previous
From: Dmitry Dolgov
Date:
Subject: Re: ArchiveEntry optional arguments refactoring
Next
From: John Naylor
Date:
Subject: Re: WIP: Avoid creation of the free space map for small tables