RE: Big performance slowdown from 11.2 to 13.3 - Mailing list pgsql-performance

From ldh@laurent-hasson.com
Subject RE: Big performance slowdown from 11.2 to 13.3
Date
Msg-id MN2PR15MB2560C1E738F58FD71B71255D85E49@MN2PR15MB2560.namprd15.prod.outlook.com
In response to Re: Big performance slowdown from 11.2 to 13.3  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Big performance slowdown from 11.2 to 13.3
List pgsql-performance

-----Original Message-----
From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Thursday, July 22, 2021 11:57
To: ldh@laurent-hasson.com
Cc: David Rowley <dgrowleyml@gmail.com>; Peter Geoghegan <pg@bowt.ie>; Justin Pryzby <pryzby@telsasoft.com>;
pgsql-performance@postgresql.org
Subject: Re: Big performance slowdown from 11.2 to 13.3

I wrote:
> "ldh@laurent-hasson.com" <ldh@laurent-hasson.com> writes:
>> It's still spilling to disk and seems to cap at 2.5GB of memory usage in spite of configuration.

> That is ... weird.

Oh: see get_hash_mem:

    hash_mem = (double) work_mem * hash_mem_multiplier;

    /*
     * guc.c enforces a MAX_KILOBYTES limitation on work_mem in order to
     * support the assumption that raw derived byte values can be stored in
     * 'long' variables.  The returned hash_mem value must also meet this
     * assumption.
     *
     * We clamp the final value rather than throw an error because it should
     * be possible to set work_mem and hash_mem_multiplier independently.
     */
    if (hash_mem < MAX_KILOBYTES)
        return (int) hash_mem;

    return MAX_KILOBYTES;

So basically, we now have a hard restriction that hashaggs can't use more than INT_MAX kilobytes, or approximately 2.5GB, and this use case is getting eaten alive by that restriction.  Seems like we need to do something about that.

            regards, tom lane





------------------------------------------------------------------------------------------------------------------------------------------------------

Hello!


Ah... int vs long then? I tried pushing even further (hash_mem_multiplier = 16) and this definitely seems to be the case.
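
Just so I am sure I understand the clamp, here is a tiny standalone sketch of the arithmetic as I read it (my own mock-up, not the actual server code; I am assuming MAX_KILOBYTES ends up as INT_MAX / 1024 on a build where "long" is only 32 bits, like our Windows server):

    #include <limits.h>
    #include <stdio.h>

    /*
     * Assumed definition: on platforms where "long" is only 32 bits
     * (e.g. 64-bit Windows), guc.h limits memory GUCs to INT_MAX / 1024 kB,
     * i.e. roughly 2GB.
     */
    #define MAX_KILOBYTES (INT_MAX / 1024)

    int
    main(void)
    {
        int     work_mem = 1024 * 1024;     /* 1GB, in kB, as we have configured */
        double  hash_mem_multiplier = 16.0; /* the value I just tested */
        double  hash_mem = (double) work_mem * hash_mem_multiplier;

        /* Requests 16,777,216 kB, but the clamp wins at 2,097,151 kB (~2GB) */
        if (hash_mem < MAX_KILOBYTES)
            printf("hash_mem = %d kB\n", (int) hash_mem);
        else
            printf("clamped to %d kB\n", MAX_KILOBYTES);
        return 0;
    }

If that reading is right, then with work_mem = 1GB any hash_mem_multiplier of 2 or more is already at the clamp, which would explain why raising the multiplier to 16 changed nothing for us.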

Is it fair then to deduce that the total memory usage would be 2,400,305kB + 126,560kB, i.e. roughly 2.5GB? Is this, more or less, what V11 is consuming under the covers?

Is it also expected that a spill-over of just 100MB (on top of 2.4GB of memory consumption) would cause the query to collapse like this? I am still not visualizing in my head how that would happen: 100MB just seems so small, and our SSD is fast.

Generating a dataset would take me a lot of time. This is a clinical database, so I cannot reuse the current table; I would have to mock the use case entirely and create a dummy dataset from scratch.



HashAggregate  (cost=1774568.21..1774579.21 rows=200 width=1260) (actual time=94618.303..1795311.542 rows=722853 loops=1)
  Group Key: assessmenticcqa_raw.iccqar_iccassmt_fk
  Batches: 1  Memory Usage: 1277985kB
  Buffers: shared hit=14 read=169854, temp read=15777 written=27588
  ->  HashAggregate  (cost=1360804.75..1374830.63 rows=1402588 width=56) (actual time=30753.022..45384.558 rows=13852618 loops=1)
        Group Key: assessmenticcqa_raw.iccqar_iccassmt_fk, assessmenticcqa_raw.iccqar_ques_code
        Batches: 5  Memory Usage: 2400305kB  Disk Usage: 126560kB
        Buffers: shared read=169851, temp read=15777 written=27588
        ->  Seq Scan on assessmenticcqa_raw  (cost=0.00..1256856.62 rows=13859750 width=38) (actual time=0.110..14342.258 rows=13852618 loops=1)
              Filter: ((iccqar_ques_code)::text = ANY ('{"DEBRIDEMENT DATE","DEBRIDEMENT THIS VISIT","DEBRIDEMENT TYPE","DEPTH (CM)","DEPTH DESCRIPTION","DOES PATIENT HAVE PAIN ASSOCIATED WITH THIS WOUND?","DRAIN PRESENT","DRAIN TYPE","EDGE / SURROUNDING TISSUE - MACERATION",EDGES,EPITHELIALIZATION,"EXUDATE AMOUNT","EXUDATE TYPE","GRANULATION TISSUE","INDICATE OTHER TYPE OF WOUND CLOSURE","INDICATE TYPE","INDICATE WOUND CLOSURE","IS THIS A CLOSED SURGICAL WOUND OR SUSPECTED DEEP TISSUE INJURY?","LENGTH (CM)","MEASUREMENTS TAKEN","NECROTIC TISSUE AMOUNT","NECROTIC TISSUE TYPE",ODOR,"OTHER COMMENTS REGARDING DEBRIDEMENT TYPE","OTHER COMMENTS REGARDING DRAIN TYPE","OTHER COMMENTS REGARDING PAIN INTERVENTIONS","OTHER COMMENTS REGARDING PAIN QUALITY","OTHER COMMENTS REGARDING REASON MEASUREMENTS NOT TAKEN","PAIN FREQUENCY","PAIN INTERVENTIONS","PAIN QUALITY","PERIPHERAL TISSUE EDEMA","PERIPHERAL TISSUE INDURATION","REASON MEASUREMENTS NOT TAKEN","RESPONSE TO PAIN INTERVENTIONS",SHAPE,"SIGNS AND SYMPTOMS OF INFECTION","SKIN COLOR SURROUNDING WOUND",STATE,"SURFACE AREA (SQ CM)","TOTAL NECROTIC TISSUE ESCHAR","TOTAL NECROTIC TISSUE SLOUGH",TUNNELING,"TUNNELING SIZE(CM)/LOCATION - 12 - 3 O''CLOCK","TUNNELING SIZE(CM)/LOCATION - 3 - 6 O''CLOCK","TUNNELING SIZE(CM)/LOCATION - 6 - 9 O''CLOCK","TUNNELING SIZE(CM)/LOCATION - 9 - 12 O''CLOCK",UNDERMINING,"UNDERMINING SIZE(CM)/LOCATION - 12 - 3 O''CLOCK","UNDERMINING SIZE(CM)/LOCATION - 3 - 6 O''CLOCK","UNDERMINING SIZE(CM)/LOCATION - 6 - 9 O''CLOCK","UNDERMINING SIZE(CM)/LOCATION - 9 - 12 O''CLOCK","WIDTH (CM)","WOUND PAIN LEVEL, WHERE 0 = \"NO PAIN\" AND 10 = \"WORST POSSIBLE PAIN\""}'::text[]))
              Rows Removed by Filter: 171680
              Buffers: shared read=169851
Settings: effective_cache_size = '52GB', from_collapse_limit = '24', hash_mem_multiplier = '16', jit = 'off', jit_above_cost = '2e+08', jit_inline_above_cost = '5e+08', jit_optimize_above_cost = '5e+08', join_collapse_limit = '24', max_parallel_workers = '20', max_parallel_workers_per_gather = '8', random_page_cost = '1.1', temp_buffers = '4GB', work_mem = '1GB'
Planning:
  Buffers: shared hit=186 read=37
Planning Time: 55.709 ms
Execution Time: 1795921.717 ms


Thank you,
Laurent.



