Re: perf tuning for 28 cores and 252GB RAM - Mailing list pgsql-general

From Merlin Moncure
Subject Re: perf tuning for 28 cores and 252GB RAM
Date
Msg-id CAHyXU0xJr9iRXuh0Nj_gibwsK+NA_OQDm48vUm8jJULqnb9UUg@mail.gmail.com
In response to Re: perf tuning for 28 cores and 252GB RAM  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-general
On Mon, Jun 17, 2019 at 6:46 PM Jeff Janes <jeff.janes@gmail.com> wrote:
>
> On Mon, Jun 17, 2019 at 4:51 PM Michael Curry <curry@cs.umd.edu> wrote:
>>
>> I am using a Postgres instance in an HPC cluster, where they have generously given me an entire node. This means I
>> have 28 cores and 252GB RAM. I have to assume that the very conservative default settings for things like buffers and
>> max working memory are too small here.
>>
>> We have about 20 billion rows in a single large table.
>
>
> What is that in bytes?  Do you only have that one table?
>
>>
>> The database is not intended to run an application but rather to allow a few individuals to do data analysis, so we
>> can guarantee the number of concurrent queries will be small, and that nothing else will need to use the server.
>> Creating multiple different indices on a few subsets of the columns will be needed to support the kinds of queries we
>> want.
>>
>> What settings should be changed to maximize performance?
>
>
> With 28 cores for only a few users, parallelization will probably be important.  That feature is fairly new to
> PostgreSQL and rapidly improving from version to version, so you will want to use the last version you can (v11).  And
> then increase the values for max_worker_processes, max_parallel_maintenance_workers, max_parallel_workers_per_gather,
> and max_parallel_workers.  With the potential for so many parallel workers running at once, you wouldn't want to go
> overboard on work_mem, maybe 2GB.  If you don't think all allowed users will be running large queries at the same time
> (because they are mostly thinking what query to run, or thinking about the results of the last one they ran, rather than
> actually running queries), then maybe higher than that.
>
> If your entire database can comfortably fit in RAM, I would make shared_buffers large enough to hold the entire
> database. If not, I would set the value small (say, 8GB) and let the OS do the heavy lifting of deciding what to keep
> in cache.  If you go with the first option, you probably want to use pg_prewarm after each restart to get the data into
> cache as fast as you can, rather than let it get loaded in naturally as you run queries.  Also, you would probably want
> to set random_page_cost and seq_page_cost quite low, like maybe 0.1 and 0.05.
>
> You haven't described what kind of IO capacity and setup you have; knowing that could suggest other changes to make.
> Also, seeing the results of `explain (analyze, buffers)`, especially with track_io_timing turned on, for some actual
> queries could provide good insight for what else might need changing.

This is all fantastic advice.  If all the data fits in memory (or at
least, all the data that is typically read from) and the cache is warm,
then your database becomes an in-memory database with respect to read
operations, and all the i/o concerns and buffer management overhead go
away.
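
As a rough sketch of checking whether the data fits and then warming
the cache, something like the following could work.  The table and
index names (big_table, big_table_idx) are hypothetical placeholders,
not names from this thread; pg_prewarm is the contrib extension
mentioned above.

```sql
-- How big is the current database?  Compare against the 252GB of RAM.
SELECT pg_size_pretty(pg_database_size(current_database()));

-- pg_prewarm ships as a contrib extension; install it once per database.
CREATE EXTENSION IF NOT EXISTS pg_prewarm;

-- Load the table and one of its indexes into shared_buffers.
-- big_table and big_table_idx are hypothetical names.
SELECT pg_prewarm('big_table');
SELECT pg_prewarm('big_table_idx');
```

Each pg_prewarm call returns the number of blocks it loaded, which
gives a quick sanity check that the relation was actually read.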

If your database does not fit in memory and your storage is fast, one
influential setting to look at besides the above is
effective_io_concurrency; it gets you faster (in some cases much
faster) bitmap heap scans. Also make sure to set effective_cache_size
high, reflecting the large amount of memory you have; this will
influence query plan choice.
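
For illustration, those two settings could be changed along these
lines.  The values are assumptions picked for a node with 252GB RAM
and fast storage, not tested recommendations; ALTER SYSTEM needs
superuser rights, and neither setting requires a restart.

```sql
-- Illustrative values only; tune against your own storage and workload.
-- effective_io_concurrency: number of concurrent I/O requests PostgreSQL
-- may issue, used for prefetching during bitmap heap scans.
ALTER SYSTEM SET effective_io_concurrency = 200;

-- effective_cache_size: planner hint for how much data is likely cached
-- (shared_buffers plus OS cache).  It does not allocate any memory.
ALTER SYSTEM SET effective_cache_size = '200GB';

-- Apply the new values without restarting the server.
SELECT pg_reload_conf();

-- Confirm what the server is now using.
SHOW effective_io_concurrency;
SHOW effective_cache_size;
```

effective_io_concurrency can also be set per session with SET, which
makes it easy to compare bitmap heap scan timings at a few values
before settling on one.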

merlin


