Re: where should I stick that backup? - Mailing list pgsql-hackers

From Andres Freund
Subject Re: where should I stick that backup?
Date
Msg-id 20200417234408.u2uvqa5n22japewk@alap3.anarazel.de
In response to Re: where should I stick that backup?  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: where should I stick that backup?  (Robert Haas <robertmhaas@gmail.com>)
Re: where should I stick that backup?  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
Hi,

On 2020-04-17 12:19:32 -0400, Robert Haas wrote:
> On Thu, Apr 16, 2020 at 10:22 PM Robert Haas <robertmhaas@gmail.com> wrote:
> > Hmm. Could we learn what we need to know about this by doing something
> > as simple as taking a basebackup of a cluster with some data in it (say, created
> > by pgbench -i -s 400 or something) and then comparing the speed of cat
> > < base.tar | gzip > base.tgz to the speed of gzip < base.tar >
> > base.tgz? It seems like there's no difference between those except
> > that the first one relays through an extra process and an extra pipe.
>
> I decided to try this. First I experimented on my laptop using a
> backup of a pristine pgbench database, scale factor 100, ~1.5GB.
>
> [rhaas pgbackup]$ for i in 1 2 3; do echo "= run number $i = "; sync;
> sync; time gzip < base.tar > base.tar.gz; rm -f base.tar.gz; sync;
> sync; time cat < base.tar | gzip > base.tar.gz; rm -f base.tar.gz;
> sync; sync; time cat < base.tar | cat | cat | gzip > base.tar.gz; rm
> -f base.tar.gz; done

Given that gzip is too slow to be practically usable anywhere compression
speed matters (e.g. for real-world database backups), I'm not sure this
measures anything useful. The overhead of gzip dominates to the point that
even the slowest possible pipe implementation would be fast enough.
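
As a back-of-the-envelope check, using the numbers further down: cat | wc -c
pushes the ~8GB tarball through one extra pipe stage in ~3.2s, i.e. roughly
2.4GiB/s, while gzip compresses at ~60MiB/s. Even if the pipe copy ran
strictly serially with the compression, it would add something on the order
of 60/2400, i.e. ~2-3%, to the runtime; in practice the stages overlap.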

andres@awork3:/tmp/pgbase$ ls -lh
total 7.7G
-rw------- 1 andres andres 137K Apr 17 13:09 backup_manifest
-rw------- 1 andres andres 7.7G Apr 17 13:09 base.tar
-rw------- 1 andres andres  17M Apr 17 13:09 pg_wal.tar

Measuring with pv base.tar | gzip > /dev/null, I can see that the throughput
varies from somewhere around 20MB/s to about 90MB/s, averaging ~60MB/s.

andres@awork3:/tmp/pgbase$ pv base.tar |gzip > /dev/null
7.62GiB 0:02:09 [60.2MiB/s]
[===============================================================================================================>]100%
 

Whereas e.g. zstd takes a much, much shorter time, even in single-threaded
mode:

andres@awork3:/tmp/pgbase$ pv base.tar |zstd -T1 |wc -c
7.62GiB 0:00:14 [ 530MiB/s]
[===============================================================================================================>]100%
 
448956321

not to speak of using parallel compression (pigz is parallel gzip):

andres@awork3:/tmp/pgbase$ pv base.tar |pigz -p 20 |wc -c
7.62GiB 0:00:07 [1.03GiB/s]
[===============================================================================================================>]100%
 
571718276

andres@awork3:/tmp/pgbase$ pv base.tar |zstd -T20 |wc -c
7.62GiB 0:00:04 [1.78GiB/s]
[===============================================================================================================>]100%
 
448956321


Looking at raw pipe speed, I think it's not too hard to see some
limitations:

andres@awork3:/tmp/pgbase$ time (cat base.tar | wc -c )
8184994304

real    0m3.217s
user    0m0.054s
sys    0m4.856s
andres@awork3:/tmp/pgbase$ time (cat base.tar | cat | wc -c )
8184994304

real    0m3.246s
user    0m0.113s
sys    0m7.086s
andres@awork3:/tmp/pgbase$ time (cat base.tar | cat | cat | cat | cat | cat | wc -c )
8184994304

real    0m4.262s
user    0m0.257s
sys    0m20.706s

but I'm not sure how deep the pipelines we're considering would commonly be.

To make sure this is still relevant in the compression context:

andres@awork3:/tmp/pgbase$ pv base.tar | zstd -T20 > /dev/null
7.62GiB 0:00:04 [1.77GiB/s]
[===============================================================================================================>]100%
 
andres@awork3:/tmp/pgbase$ pv base.tar | cat | cat | zstd -T20 > /dev/null
7.62GiB 0:00:05 [1.38GiB/s]
[===============================================================================================================>]100%
 

It's much less noticeable if the cats are after the zstd, since there's far
less data to move; pgbench's data is very compressible.


This does seem to suggest that composing features through chains of pipes
wouldn't be a good idea. But it doesn't say that we shouldn't implement
compression via pipes (nor the opposite).
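
FWIW, to make the "compression via a pipe" option a bit more concrete, here's
a minimal sketch of relaying an archive to an external compressor through a
single pipe stage, in large chunks so the per-syscall pipe cost stays small.
This is not pg_basebackup code; the command string, filenames and the 1MB
chunk size are just illustrative assumptions:

#include <stdio.h>
#include <stdlib.h>

#define CHUNK (1024 * 1024)     /* 1MB writes, cf. the dd bs=1M numbers below */

int
main(void)
{
    FILE   *src = fopen("base.tar", "rb");
    FILE   *sink = popen("zstd -T0 > base.tar.zst", "w");  /* one pipe stage */
    char   *buf = malloc(CHUNK);
    size_t  n;

    if (!src || !sink || !buf)
    {
        perror("setup");
        return 1;
    }

    /* relay the archive to the compressor in large chunks */
    while ((n = fread(buf, 1, CHUNK, src)) > 0)
    {
        if (fwrite(buf, 1, n, sink) != n)
        {
            perror("fwrite");
            return 1;
        }
    }

    free(buf);
    fclose(src);
    return pclose(sink) == 0 ? 0 : 1;
}

With one pipe stage and large writes, the extra copy should stay well below
the cost of the compression itself, per the numbers above.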


> > I don't know exactly how to do the equivalent of this on Windows, but
> > I bet somebody does.
>
> However, I still don't know what the situation is on Windows. I did do
> some searching around on the Internet to try to find out whether pipes
> being slow on Windows is a generally-known phenomenon, and I didn't
> find anything very compelling, but I don't have an environment set up
> to test it myself.

I tried to measure something. But I'm not a Windows person, and it's just a
KVM VM. I don't know how well that translates into other environments.

I downloaded gnuwin32 coreutils and zstd and performed some
measurements. The first results were *shockingly* bad:

zstd -T0 < onegbofrandom  | wc -c
linux host:    0.467s
windows guest:    0.968s

zstd -T0 < onegbofrandom  | cat | wc -c
linux host:    0.479s
windows guest:    6.058s

zstd -T0 < onegbofrandom  | cat | cat | wc -c
linux host:    0.516s
windows guest:    7.830s

I think that's because cat reads and writes in increments that are too small
for Windows (but damn, that's slow). Replacing cat with dd:

zstd -T0 < onegbofrandom  | dd bs=512 | wc -c
linux host:    3.091s
windows guest:    5.909s

zstd -T0 < onegbofrandom  | dd bs=64k | wc -c
linux host:    0.540s
windows guest:    1.128s

zstd -T0 < onegbofrandom  | dd bs=1M | wc -c
linux host:    0.516s
windows guest:    1.043s

zstd -T0 < onegbofrandom  | dd bs=1 | wc -c
linux host:    1547s
windows guest:    2607s
(yes, really, it's this slow)

zstd -T0 < onegbofrandom > NUL
zstd -T0 < onegbofrandom > /dev/null
linux host:    0.361s
windows guest:    0.602s

zstd -T0 < onegbofrandom | dd bs=1M of=NUL
zstd -T0 < onegbofrandom | dd bs=1M of=/dev/null
linux host:    0.454s
windows guest:    0.802s

zstd -T0 < onegbofrandom | dd bs=64k | dd bs=64k | dd bs=64k | wc -c
linux host:    0.521s
windows guest:    1.376s


This suggests that pipes do have considerably higher overhead on Windows, but
that it's not all that terrible if one takes care to use large buffers in
each pipe element.
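
For comparison, a toy pipe element along the lines of the dd bs=1M case above
(just a sketch; the 1MB buffer is an assumption taken from the measurements,
not a tuned value) is pretty trivial:

#include <stdlib.h>
#include <unistd.h>

int
main(void)
{
    char    *buf = malloc(1024 * 1024);
    ssize_t  n;

    if (buf == NULL)
        return 1;

    /* copy stdin to stdout in 1MB reads, so each pipe traversal is few syscalls */
    while ((n = read(STDIN_FILENO, buf, 1024 * 1024)) > 0)
    {
        ssize_t off = 0;

        /* a single read may need several writes to drain */
        while (off < n)
        {
            ssize_t w = write(STDOUT_FILENO, buf + off, n - off);

            if (w < 0)
                return 1;
            off += w;
        }
    }

    free(buf);
    return n < 0 ? 1 : 0;
}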

It's notable, though, that even the simplest use of a pipe adds considerable
overhead compared to using the files directly.

Greetings,

Andres Freund


