Home > mailing lists

Re: multithreaded zstd backup compression for client and server - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: multithreaded zstd backup compression for client and server
Date	March 24, 2022 00:14:28
Msg-id	20220323211428.frazv4typmnm2t3x@alap3.anarazel.de Whole thread Raw
In response to	multithreaded zstd backup compression for client and server (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: multithreaded zstd backup compression for client and server (Robert Haas <robertmhaas@gmail.com>)
List	pgsql-hackers

Tree view

Hi,

On 2022-03-23 16:34:04 -0400, Robert Haas wrote:
> Therefore, I think it might be safe to just ...  turn this on. One reason I
> think that is that this whole approach was recommended to me by Andres ...

I didn't do a super careful analysis of the issues... But I do think it's
pretty much the one case where it "should" be safe.

The most likely source of problem would errors thrown while zstd threads are
alive. Should make sure that that can't happen.

What is the lifetime of the threads zstd spawns? Are they tied to a single
compression call? A single ZSTD_createCCtx()? If the latter, how bulletproof
is our code ensuring that we don't leak such contexts?

If they're short-lived, are we compressing large enough batches to not waste a
lot of time starting/stopping threads?

> but that's not to say that there couldn't be problems.  I worry a bit that
> the mere presence of threads could in some way mess things up, but I don't
> know what the mechanism for that would be, and I don't want to postpone
> shipping useful features based on nebulous fears.

One thing that'd be good to tests for is cancelling in-progress server-side
compression.  And perhaps a few assertions that ensure that we don't escape
with some threads still running. That'd have to be platform dependent, but I
don't see a problem with that in this case.

> For both parallel and non-parallel zstd compression, I see differences
> between the compressed size depending on where the compression is
> done. I don't know whether this is an expected behavior of the zstd
> library or a bug. Both files uncompress OK and pass pg_verifybackup,
> but that doesn't mean we're not, for example, selecting different
> compression levels where we shouldn't be. I'll try to figure out
> what's going on here.
>
> zstd, client-side: 1.7GB, 17 seconds
> zstd, server-side: 1.3GB, 25 seconds
> parallel zstd, 4 workers, client-side: 1.7GB, 7.5 seconds
> parallel zstd, 4 workers, server-side: 1.3GB, 7.2 seconds

What causes this fairly massive client-side/server-side size difference?

> +    /*
> +     * We check for failure here because (1) older versions of the library
> +     * do not support ZSTD_c_nbWorkers and (2) the library might want to
> +     * reject unreasonable values (though in practice it does not seem to do
> +     * so).
> +     */
> +    ret = ZSTD_CCtx_setParameter(streamer->cctx, ZSTD_c_nbWorkers,
> +                                 compress->workers);
> +    if (ZSTD_isError(ret))
> +    {
> +        pg_log_error("could not set compression worker count to %d: %s",
> +                     compress->workers, ZSTD_getErrorName(ret));
> +        exit(1);
> +    }

Will this cause test failures on systems with older zstd?

Greetings,

Andres Freund

pgsql-hackers by date:

From: Pavel Borisov
Date: 24 March 2022, 00:12:27
Subject: Re: [PATCH] Remove workarounds to format [u]int64's

From: "Euler Taveira"
Date: 24 March 2022, 00:19:34
Subject: Re: logical replication restrictions

Re: multithreaded zstd backup compression for client and server - Mailing list pgsql-hackers

Previous

Next