Re: More speedups for tuple deformation - Mailing list pgsql-hackers

From Andres Freund
Subject Re: More speedups for tuple deformation
Msg-id uhqul2ryci4tyg5ylddjrmf4kybzwb7m5z7rmurhhjp37vrn5f@zgxil7egr62n
In response to Re: More speedups for tuple deformation  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Hi,

On 2026-02-25 13:05:14 -0500, Andres Freund wrote:
> At least gcc is doing some truly weird shit in the
> firstNonGuaranteed/firstNonCachedOffsetAttr loop "header" (i.e. just before
> the first entrance to the loop), which leads to the register pressure being
> high, which leads to spilling on the stack, making the few-tuples case slower:
>
> [ lots of stuff trimmed ]
> 
> I.e. the compiler creates an offset version of tts_values[tts_nvalid],
> tts_isnull[tts_nvalid], which then creates register allocation pressure,
> because later the original tts_values/tts_isnull etc are accessed again and
> thus the underlying registers are preserved.  And this is all for zero gain,
> from what I can tell, because the accesses are still done with indexed
> addressing (like mov %rdi,(%r12,%rcx,8)), which would work just as
> well if rcx were indexed based on attnum, not zero indexed within the loop.
> 
> I see about a 10% improvement if I dissuade the compiler from doing that by
> adding
>   __asm__ volatile ("" : "+r"(attnum) : :);
> 
> in the loop body.
> 
> 
> I'm getting to the point where I'd like to just hand write the assembler for
> this stupid function. Gah.

Huh.  At least partially, it seems to be related to using an int for
attnum et al. Because we compile with -fwrapv, the compiler can't assume that
attnum++ won't overflow, and a possible overflow makes the loop trip counts a
lot more complicated to reason about.  Even so, I don't understand how it ends
up generating such crappy code, but since using size_t fixes it...
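
For readers who want to see the shape of the change, here is a minimal
sketch. It is not the actual deforming code: DemoSlot, fill_int,
fill_size_t and fill_int_barrier are hypothetical names made up for
illustration, and the loop body is a placeholder. It only shows the three
variants discussed (int counter, size_t counter, int counter with the empty
asm from the quoted mail) and that they store the same values. Whether a
given compiler produces different code for them has to be checked in the
generated assembly, e.g. with -O2 -fwrapv -S.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a slot's tts_values / tts_isnull arrays. */
typedef struct DemoSlot
{
	uintptr_t  *values;
	bool	   *isnull;
} DemoSlot;

/*
 * int counter.  With -fwrapv, attnum++ is defined to wrap around, so the
 * compiler may not treat signed overflow as impossible when it analyses
 * the loop and the sign-extended index.
 */
void
fill_int(DemoSlot *slot, int start, int natts)
{
	for (int attnum = start; attnum < natts; attnum++)
	{
		slot->values[attnum] = (uintptr_t) attnum * 2;
		slot->isnull[attnum] = false;
	}
}

/*
 * size_t counter.  The index already has pointer width, so no sign
 * extension is needed to form the address.
 */
void
fill_size_t(DemoSlot *slot, size_t start, size_t natts)
{
	for (size_t attnum = start; attnum < natts; attnum++)
	{
		slot->values[attnum] = (uintptr_t) attnum * 2;
		slot->isnull[attnum] = false;
	}
}

#if defined(__GNUC__)
/*
 * int counter plus an empty asm that claims to read and write attnum
 * (GCC/Clang extension).  It emits no instructions, but the compiler can
 * no longer assume it knows the value of attnum after the statement.
 */
void
fill_int_barrier(DemoSlot *slot, int start, int natts)
{
	for (int attnum = start; attnum < natts; attnum++)
	{
		__asm__ volatile ("" : "+r"(attnum));
		slot->values[attnum] = (uintptr_t) attnum * 2;
		slot->isnull[attnum] = false;
	}
}
#endif
```

All three variants are functionally equivalent; any difference would only
show up in register allocation and in the instructions emitted before the
first loop iteration.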

Greetings,

Andres Freund


