Re: BUG #15923: Prepared statements take way too much memory. - Mailing list pgsql-bugs

From Thomas Munro
Subject Re: BUG #15923: Prepared statements take way too much memory.
Msg-id CA+hUKGLfa6ANa0vs7Lf0op0XBH05HE8SyX8NFhDyT7k2CHYLXw@mail.gmail.com
In response to RE: BUG #15923: Prepared statements take way too much memory.  (Daniel Migowski <dmigowski@ikoffice.de>)
Responses AW: BUG #15923: Prepared statements take way too much memory.
Re: BUG #15923: Prepared statements take way too much memory.
List pgsql-bugs
On Sun, Jul 28, 2019 at 11:14 PM Daniel Migowski <dmigowski@ikoffice.de> wrote:
> * Why are MemoryContextAllocZeroAligned and MemoryContextAllocZero the same function?

It's not really relevant to the discussion about memory usage, but since
you asked about MemoryContextAllocZeroAligned:

I noticed those functions a while back when another project I follow
was tweaking similar code.  The idea behind them is that we have our
own specialised memset()-like macros.
MemoryContextAllocZeroAligned() uses MemSetLoop(), while
MemoryContextAllocZero() uses MemSetAligned().  Those date back nearly
two decades and were apparently based on experiments done at the time.
The comments talk about avoiding function call overhead and
specialising for constant lengths, so I suppose we assumed that
compilers at the time didn't automatically know how to inline plain
old memset() calls.
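
To illustrate the kind of thing such macros do, here is a simplified
sketch of a word-at-a-time zeroing loop.  It is not the PostgreSQL
source: SKETCH_ZERO_WORDS and sketch_zero_words are hypothetical names,
and the sketch assumes the destination is aligned for long and the
length is a multiple of sizeof(long).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified word-at-a-time zeroing loop (not the real
 * PostgreSQL macro).  Assumes "start" is aligned for long and "len" is
 * a multiple of sizeof(long).  Being a macro, it expands in place, so
 * there is no function call and a constant "len" is visible to the
 * compiler.
 */
#define SKETCH_ZERO_WORDS(start, len) \
	do \
	{ \
		long *sk_cur = (long *) (start); \
		long *sk_end = (long *) ((char *) sk_cur + (size_t) (len)); \
	\
		while (sk_cur < sk_end) \
			*sk_cur++ = 0; \
	} while (0)

/* Checked wrapper: asserts the assumptions the macro relies on. */
static inline void
sketch_zero_words(void *start, size_t len)
{
	assert((uintptr_t) start % sizeof(long) == 0);
	assert(len % sizeof(long) == 0);
	SKETCH_ZERO_WORDS(start, len);
}
```

A modern compiler will typically recognise a plain memset() with a
constant length and expand it inline on its own, which is why a
hand-written loop like this may no longer buy anything.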

These days, you'd think you could add "assume aligned" attributes to
all memory-allocation-like functions and then change a few key things
to be static inline so that the constants can be seen in the right
places, and get all of the same benefits (and probably more) for free
from the compiler's alignment analysis and inlining powers.
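
A minimal sketch of that idea, assuming a GCC- or Clang-style
assume_aligned function attribute.  Everything named sketch_* or
SKETCH_* is hypothetical, the 16-byte alignment is an arbitrary choice
for illustration, and C11 aligned_alloc() stands in for a real memory
context allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical alignment guarantee, chosen only for this sketch. */
#define SKETCH_ALIGN 16

#if defined(__GNUC__)			/* GCC and Clang both define this */
#define SKETCH_ASSUME_ALIGNED __attribute__((assume_aligned(SKETCH_ALIGN)))
#else
#define SKETCH_ASSUME_ALIGNED
#endif

/* Out-of-line allocator, annotated so callers know the result's alignment. */
void *sketch_alloc(size_t size) SKETCH_ASSUME_ALIGNED;

void *
sketch_alloc(size_t size)
{
	/* aligned_alloc() wants a size that is a multiple of the alignment. */
	size_t		rounded = (size + SKETCH_ALIGN - 1) & ~(size_t) (SKETCH_ALIGN - 1);

	if (rounded == 0)
		rounded = SKETCH_ALIGN;
	return aligned_alloc(SKETCH_ALIGN, rounded);
}

/*
 * Inline zeroing wrapper: when "size" is a compile-time constant at the
 * call site, the compiler can see both the constant and the alignment
 * hint when it expands memset().
 */
static inline void *
sketch_alloc0(size_t size)
{
	void	   *p = sketch_alloc(size);

	if (p != NULL)
		memset(p, 0, size);
	return p;
}

typedef struct SketchNode
{
	int			tag;
	void	   *left;
	void	   *right;
} SketchNode;

/* Caller with a visible constant size, loosely analogous to newNode(). */
static inline SketchNode *
sketch_new_node(int tag)
{
	SketchNode *node = sketch_alloc0(sizeof(SketchNode));

	if (node != NULL)
		node->tag = tag;
	return node;
}
```

Whether the alignment hint changes the generated code at all depends on
the compiler and target.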

As for what benefits might actually be available: for constant sizes
the gain is clear, but for alignment I'm doubtful.  When Clang 8 on
amd64 inlines a memset() with a known aligned destination (for various
alignment sizes) and constant size, it generates a bunch of movups
instructions, or with -march set to a modern arch, maybe vmovups, and
some variations depending on size.  Note the 'u' (unaligned)
instructions, not 'a' (aligned): that is, it doesn't even care that I
said the destination was aligned!  I found claims from a couple of Intel
sources that this sort of thing stopped making any difference to
performance in the Nehalem microarchitecture (2008).  It still matters
whether the data actually is aligned, but not whether you use the
instructions that tolerate misalignment, so there is apparently no
point in generating different code, and therefore, as far as memset()
goes, apparently no gain from annotating allocation functions as
returning aligned pointers.  As for other architectures or compilers,
I don't know.
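
One way to repeat that kind of experiment is a pair of small probe
functions that differ only in the alignment hint.  This is a sketch, not
the test actually used above: the probe_* and PROBE_* names are
hypothetical, the 64-byte size and 32-byte alignment are arbitrary, and
__builtin_assume_aligned is assumed to be available (GCC and Clang).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PROBE_SIZE	64			/* constant length, visible to the compiler */
#define PROBE_ALIGN 32			/* alignment promised in the hinted variant */

#if defined(__GNUC__)			/* GCC and Clang both define this */
#define PROBE_ASSUME_ALIGNED(p) __builtin_assume_aligned((p), PROBE_ALIGN)
#else
#define PROBE_ASSUME_ALIGNED(p) (p)
#endif

/* Constant-size memset() with no alignment information. */
void
probe_zero_plain(void *dst)
{
	assert(dst != NULL);
	memset(dst, 0, PROBE_SIZE);
}

/* Same memset(), but the compiler is told dst is PROBE_ALIGN-aligned. */
void
probe_zero_aligned(void *dst)
{
	assert(dst != NULL);
	memset(PROBE_ASSUME_ALIGNED(dst), 0, PROBE_SIZE);
}

/* Helper for checking results: returns 1 if every byte is zero. */
int
probe_all_zero(const unsigned char *buf, size_t len)
{
	for (size_t i = 0; i < len; i++)
	{
		if (buf[i] != 0)
			return 0;
	}
	return 1;
}
```

Compiling this with optimisation and assembly output (for example
"cc -O2 -S") and comparing the two functions shows whether a given
compiler and target treat the hinted version any differently.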

Out of curiosity, here's an experimental patch that gets rid of the
MemSetXXX() stuff, adds some (useless?) annotations about alignment
and makes palloc0() and MemoryContextAllocZero() inline so that they
can benefit from inlining with a visible constant size in, e.g.,
newNode().  It didn't seem to do anything very interesting apart from
remove a few hundred lines of code, so I didn't get around to digging
further or sharing it earlier.  Or maybe I was just doing it wrong.
(In this patch it's using sizeof(long), which isn't enough to be
considered aligned for the wider move instructions, but I tried
various sizes when trying to trigger different codegen, without
success.)


--
Thomas Munro
https://enterprisedb.com
