Re: Make COPY format extendable: Extract COPY TO format implementations - Mailing list pgsql-hackers

From Sutou Kouhei
Subject Re: Make COPY format extendable: Extract COPY TO format implementations
Date
Msg-id 20240730.115824.1120108664579849899.kou@clear-code.com
Whole thread Raw
In response to Re: Make COPY format extendable: Extract COPY TO format implementations  (Junwang Zhao <zhjwpku@gmail.com>)
List pgsql-hackers
Hi,

In <CAEG8a3+KN=uofw5ksnCwh5s3m_VcfFYd=jTzcpO5uVLBHwSQEg@mail.gmail.com>
  "Re: Make COPY format extendable: Extract COPY TO format implementations" on Sun, 28 Jul 2024 22:49:47 +0800,
  Junwang Zhao <zhjwpku@gmail.com> wrote:

> Thanks for updating the patches, I applied them and test
> in my local machine, I did not use tmpfs in my test, I guess
> if I run the tests enough rounds, the OS will cache the
> pages, below is my numbers(I run each test 30 times, I
> count for the last 10 ones):
> 
> HEAD                                PATCHED
> 
> COPY to_tab_30 TO '/dev/null' WITH (FORMAT text);
> 
> 5628.280 ms                   5679.860 ms
> 5583.144 ms                   5588.078 ms
> 5604.444 ms                   5628.029 ms
> 5617.133 ms                   5613.926 ms
> 5575.570 ms                   5601.045 ms
> 5634.828 ms                   5616.409 ms
> 5693.489 ms                   5637.434 ms
> 5585.857 ms                   5609.531 ms
> 5613.948 ms                   5643.629 ms
> 5610.394 ms                   5580.206 ms
> 
> COPY from_tab_30 FROM '/tmp/to_tab_30.txt' WITH (FORMAT text);
> 
> 3929.955 ms                   4050.895 ms
> 3909.061 ms                   3890.156 ms
> 3940.272 ms                   3927.614 ms
> 3907.535 ms                   3925.560 ms
> 3952.719 ms                   3942.141 ms
> 3933.751 ms                   3904.250 ms
> 3958.274 ms                   4025.581 ms
> 3937.065 ms                   3894.149 ms
> 3949.896 ms                   3933.878 ms
> 3925.399 ms                   3936.170 ms
> 
> I did not see obvious performance degradation, maybe it's
> because I did not use tmpfs, but I think this OTH means
> that the *function call* and *if branch* added for each row
> is not the bottleneck of the whole execution path.

Thanks for sharing your numbers. I agree with there are no
obvious performance degradation.


> In 0001,
> 
> +typedef struct CopyFromRoutine
> +{
> + /*
> + * Called when COPY FROM is started to set up the input functions
> + * associated to the relation's attributes writing to.  `finfo` can be
> + * optionally filled to provide the catalog information of the input
> + * function.  `typioparam` can be optionally filled to define the OID of
> + * the type to pass to the input function.  `atttypid` is the OID of data
> + * type used by the relation's attribute.
> 
> +typedef struct CopyToRoutine
> +{
> + /*
> + * Called when COPY TO is started to set up the output functions
> + * associated to the relation's attributes reading from.  `finfo` can be
> + * optionally filled.  `atttypid` is the OID of data type used by the
> + * relation's attribute.
> 
> The second comment has a simplified description for `finfo`, I think it
> should match the first by:
> 
> `finfo` can be optionally filled to provide the catalog information of the
> output function.

Good catch. I'll update it as suggested in the next patch set.

> After I post the patch diffs, the gmail grammer shows some hints that
> it should be *associated with* rather than *associated to*, but I'm
> not sure about this one.

Thanks. I'll use "associated with".

> I think the patches are in good shape, I can help to do some
> further tests if needed, thanks for working on this.

Thanks!

-- 
kou



pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: Remove last traces of HPPA support
Next
From: Zhang Mingli
Date:
Subject: COPY FROM crash