Re: DBT-3 with SF=20 got failed

From Tomas Vondra
Subject Re: DBT-3 with SF=20 got failed
Date
Msg-id 560439B0.4070800@2ndquadrant.com
In response to Re: DBT-3 with SF=20 got failed  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: DBT-3 with SF=20 got failed  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers

On 09/24/2015 07:42 PM, Robert Haas wrote:
> On Thu, Sep 24, 2015 at 12:40 PM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> There are two machines - one with 32GB of RAM and work_mem=2GB, the other
>> one with 256GB of RAM and work_mem=16GB. The machines are hosting about the
>> same data, just scaled accordingly (~8x more data on the large machine).
>>
>> Let's assume there's a significant over-estimate - we expect to get about
>> 10x the actual number of tuples, and the hash table is expected to almost
>> exactly fill work_mem. Using the 1:3 ratio (as in the query at the beginning
>> of this thread) we'll use ~512MB and ~4GB for the buckets, and the rest is
>> for entries.
>>
>> Thanks to the 10x over-estimate, ~64MB and 512MB would be enough for the
>> buckets, so we're wasting ~448MB (13% of RAM) on the small machine and
>> ~3.5GB (~1.3%) on the large machine.
>>
>> How does it make any sense to address the 1.3% and not the 13%?
>
> One of us is confused, because from here it seems like 448MB is 1.3%
> of 32GB, not 13%.

Meh, you're right - I got the math wrong. It's 1.3% in both cases.
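
For what it's worth, here's the back-of-the-envelope arithmetic spelled 
out as a tiny stand-alone C program. This is not PostgreSQL code - the 
1:3 bucket-to-entry split, the 10x over-estimate and the power-of-two 
rounding of the bucket array are just the assumptions from the example 
quoted above:

    #include <stdio.h>

    /* Round up to the next power of two, mimicking the fact that the
     * number of hash buckets is always a power of two. */
    static double next_pow2(double x)
    {
        double p = 1.0;

        while (p < x)
            p *= 2.0;
        return p;
    }

    int main(void)
    {
        /* {total RAM, work_mem}, both in MB, for the two machines */
        double conf[2][2] = { {32 * 1024.0, 2 * 1024.0},
                              {256 * 1024.0, 16 * 1024.0} };

        for (int i = 0; i < 2; i++)
        {
            double ram_mb = conf[i][0];
            double work_mem_mb = conf[i][1];

            /* 1:3 split - a quarter of work_mem goes to the buckets */
            double buckets_mb = work_mem_mb / 4.0;

            /* with a 10x over-estimate, a tenth of that (rounded up to
             * a power of two) would have been enough */
            double needed_mb = next_pow2(buckets_mb / 10.0);
            double wasted_mb = buckets_mb - needed_mb;

            printf("work_mem %5.0f MB: buckets %4.0f MB, needed %3.0f MB, "
                   "wasted %4.0f MB = %.2f%% of RAM\n",
                   work_mem_mb, buckets_mb, needed_mb, wasted_mb,
                   100.0 * wasted_mb / ram_mb);
        }

        return 0;
    }

Both configurations come out at the same ~1.37% of RAM wasted, which is 
exactly the point below.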

However, the question still stands - why should we handle the 
over-estimate in one case and not the other? We're wasting the same 
fraction of memory in both cases.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


