Re: Performance degradation on concurrent COPY into a single relation in PG16. - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Performance degradation on concurrent COPY into a single relation in PG16.
Date
Msg-id CAD21AoAEwHTLYhuQ6PaBRPXKWN-CgW9iw+4hm=2EOFXbJQ3tOg@mail.gmail.com
Whole thread Raw
In response to Re: Performance degradation on concurrent COPY into a single relation in PG16.  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Tue, Jul 11, 2023 at 1:24 AM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2023-07-03 11:53:56 +0200, Jakub Wartak wrote:
> > Out of curiosity I've tried and it is reproducible as you have stated : XFS
> > @ 4.18.0-425.10.1.el8_7.x86_64:
> >...
> > According to iostat and blktrace -d /dev/sda -o - | blkparse -i - output ,
> > the XFS issues sync writes while ext4 does not, xfs looks like constant
> > loop of sync writes (D) by kworker/2:1H-kblockd:
>
> That clearly won't go well.  It's not reproducible on newer systems,
> unfortunately :(. Or well, fortunately maybe.
>
>
> I wonder if a trick to avoid this could be to memorialize the fact that we
> bulk extended before and extend by that much going forward? That'd avoid the
> swapping back and forth.
>
>
> One thing that confuses me is that Sawada-san observed a regression at a
> single client - yet from what I can tell, there's actually not a single
> fallocate() being generated in that case, because the table is so narrow that
> we never end up extending by a sufficient number of blocks in
> RelationAddBlocks() to reach that path. Yet there's a regression at 1 client.
>
> I don't yet have a RHEL8 system at hand, could either of you send the result
> of a
>   strace -c -p $pid_of_backend_doing_copy
>

Here are the results:

* PG16: nclients = 1, execution time = 23.758
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 53.18    0.308675           0    358898           pwrite64
 33.92    0.196900           2     81202           pwritev
  7.78    0.045148           0     81378           lseek
  5.06    0.029371           2     11141           read
  0.04    0.000250           2        91           openat
  0.02    0.000094           1        89           close
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0        84           brk
  0.00    0.000000           0         1           sendto
  0.00    0.000000           0         2         1 recvfrom
  0.00    0.000000           0         2           kill
  0.00    0.000000           0         1           futex
  0.00    0.000000           0         1           epoll_wait
------ ----------- ----------- --------- --------- ----------------
100.00    0.580438           1    532891         1 total

* PG16: nclients = 2, execution time = 18.045
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 54.19    0.218721           1    187803           pwrite64
 29.24    0.118002           2     40242           pwritev
  6.23    0.025128           0     41239           lseek
  5.03    0.020285           2      9133           read
  2.04    0.008229           9       855           fallocate
  1.28    0.005151           0      5598      1000 futex
  1.12    0.004516           1      3965           kill
  0.78    0.003141           1      3058         1 epoll_wait
  0.06    0.000224           2       100           openat
  0.03    0.000136           1        98           close
  0.01    0.000050           0        84           brk
  0.00    0.000013           0        22           setitimer
  0.00    0.000006           0        15         1 rt_sigreturn
  0.00    0.000002           2         1           munmap
  0.00    0.000002           2         1           sendto
  0.00    0.000002           0         3         2 recvfrom
------ ----------- ----------- --------- --------- ----------------
100.00    0.403608           1    292217      1004 total

* PG16: nclients = 4, execution time = 18.807
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 64.61    0.240171           2     93868           pwrite64
 15.11    0.056173           4     11337           pwritev
  7.29    0.027100           7      3465           fallocate
  4.05    0.015048           2      5179           read
  3.55    0.013188           0     14941           lseek
  2.65    0.009858           1      8675      2527 futex
  1.76    0.006536           1      4190           kill
  0.88    0.003268           1      2199           epoll_wait
  0.06    0.000213           2       101           openat
  0.03    0.000130           1        99           close
  0.01    0.000031           1        18           rt_sigreturn
  0.01    0.000027           1        17           setitimer
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0        84           brk
  0.00    0.000000           0         1           sendto
  0.00    0.000000           0         1           recvfrom
------ ----------- ----------- --------- --------- ----------------
100.00    0.371743           2    144176      2527 total

* PG16: nclients = 8, execution time = 11.982
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 73.16    0.180095           3     47895           pwrite64
  8.61    0.021194           5      4199           pwritev
  5.93    0.014598           6      2199           fallocate
  3.42    0.008425           1      6723      2206 futex
  3.18    0.007824           2      3068           read
  2.44    0.005995           0      6510           lseek
  1.82    0.004475           1      2665           kill
  1.27    0.003118           1      1758         2 epoll_wait
  0.10    0.000239           2        95           openat
  0.06    0.000141           1        93           close
  0.01    0.000034           2        16           setitimer
  0.01    0.000020           2        10         2 rt_sigreturn
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0        84           brk
  0.00    0.000000           0         1           sendto
  0.00    0.000000           0         2         1 recvfrom
------ ----------- ----------- --------- --------- ----------------
100.00    0.246158           3     75319      2211 total

* PG16: nclients = 16, execution time = 7.507
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 79.45    0.078310           5     14870           pwrite64
  5.52    0.005440           5       973           pwritev
  4.51    0.004443           6       640           fallocate
  3.69    0.003640           1      2884      1065 futex
  2.23    0.002200           2       866           read
  1.80    0.001775           1      1685           lseek
  1.44    0.001421           1       782           kill
  1.08    0.001064           2       477         1 epoll_wait
  0.13    0.000129           2        57           openat
  0.08    0.000078           1        56           close
  0.06    0.000055           0        84           brk
  0.00    0.000003           3         1           munmap
  0.00    0.000003           3         1           sendto
  0.00    0.000003           1         2         1 recvfrom
  0.00    0.000002           0         5           setitimer
  0.00    0.000001           0         3         1 rt_sigreturn
------ ----------- ----------- --------- --------- ----------------
100.00    0.098567           4     23386      1068 total

* PG16: nclients = 32, execution time = 4.644
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 88.90    0.147208          12     11571           pwrite64
  3.11    0.005151           1      2643       943 futex
  2.61    0.004314           4      1039           pwritev
  1.74    0.002879           8       327           fallocate
  1.21    0.001998           3       624           read
  0.90    0.001498           1      1439           lseek
  0.66    0.001090           3       358         1 epoll_wait
  0.63    0.001049           2       426           kill
  0.12    0.000206           3        65           openat
  0.07    0.000118           1        64           close
  0.03    0.000045           0        84           brk
  0.01    0.000011          11         1           munmap
  0.00    0.000008           8         1           sendto
  0.00    0.000007           3         2         1 recvfrom
  0.00    0.000002           0         3         1 rt_sigreturn
  0.00    0.000001           0         3           setitimer
------ ----------- ----------- --------- --------- ----------------
100.00    0.165585           8     18650       946 total

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: Performance degradation on concurrent COPY into a single relation in PG16.
Next
From: Amit Kapila
Date:
Subject: Re: [Patch] Use *other* indexes on the subscriber when REPLICA IDENTITY is FULL