Thread: Large writable variables

Large writable variables

From
Andres Freund
Date:
Hi,

while briefly thinking about the per-process overhead of postgres, I
looked at the section sizes of a modern postgres. In particular which
memory areas are *not* shared between processes.

If you look at the section sizes that are mapped read-write:

$ size --format=sysv src/backend/postgres
src/backend/postgres  :
section                  size      addr
.rodata               1593617   5730304 (read-only, for reference)
.data.rel.ro           134944   8039904 (read only after start)
.data                   53888   8178912 (read-write, initialized)
.bss                   510416   8232800 (read-write, uninitialized)
Total                52417197

So we have 500kb of not-initialized memory mapped into every
process. That's, uh, not nothing.

top unitialized allocations:
$ nm -t d --size-sort -r -S src/backend/postgres|grep '\b[bB]\b'|head
0000000008251872 0000000000131144 b LagTracker
0000000008585088 0000000000131104 b hist_entries
0000000008435040 0000000000085280 b DCHCache
0000000008391168 0000000000043840 b NUMCache
0000000008560224 0000000000023440 b tzdefrules_s
0000000008536704 0000000000023440 b gmtmem.7009
0000000008716192 0000000000016384 b hist_start
0000000008238336 0000000000008192 b PqRecvBuffer
0000000008734208 0000000000005136 B BackendWritebackContext
0000000008386368 0000000000003200 b held_lwlocks

So we have a two variables sized 130kb. Yikes.

struct {
        XLogRecPtr                 last_lsn;             /*     0     8 */
        WalTimeSample              buffer[8192];         /*     8 131072 */
        /* --- cacheline 2048 boundary (131072 bytes) was 8 bytes ago --- */
        int                        write_head;           /* 131080     4 */
        int                        read_heads[3];        /* 131084    12 */
        WalTimeSample              last_read[3];         /* 131096    48 */
        /* --- cacheline 2049 boundary (131136 bytes) was 8 bytes ago --- */

        /* size: 131144, cachelines: 2050, members: 5 */
        /* last cacheline: 8 bytes */
};

that's not actually used very often, nor in all processes... Thomas?

hist_entries is pg_lzcompress; DCHCache,NUMCache are formatting.c.

top itialized allocations:
$ nm -t d --size-sort -r -S src/backend/postgres|grep '\b[dD]\b'|head
0000000008086944 0000000000087904 D fmgr_builtins
0000000008201120 0000000000017280 d ConfigureNamesInt
0000000008218400 0000000000013680 d ConfigureNamesBool
0000000008189248 0000000000008512 d ConfigureNamesString
0000000008077344 0000000000007040 D ScanKeywords
0000000008184928 0000000000004320 d ConfigureNamesEnum
0000000008197760 0000000000003360 d ConfigureNamesReal
0000000008062976 0000000000002304 d DCH_keywords
0000000008069952 0000000000002016 D pg_wchar_table
0000000008075552 0000000000001776 d encoding_match_list

fmgr_builtins isn't readonly even though declared a const - I assume
because it's full of addresses that will be mapped differently from
execution to execution.

ConfigureNames* isn't marked as const, because we update them:
        /* Rather than requiring vartype to be filled in by hand, do this: */
        conf->gen.vartype = PGC_BOOL;

I'm unclear as to why ScanKeywords, DCH_keywords aren't in a readonly
section.

Greetings,

Andres Freund


Re: Large writable variables

From
Andres Freund
Date:
Hi,

On 2018-10-15 13:07:54 -0700, Andres Freund wrote:
> top itialized allocations:
> $ nm -t d --size-sort -r -S src/backend/postgres|grep '\b[dD]\b'|head
> 0000000008086944 0000000000087904 D fmgr_builtins
> 0000000008201120 0000000000017280 d ConfigureNamesInt
> 0000000008218400 0000000000013680 d ConfigureNamesBool
> 0000000008189248 0000000000008512 d ConfigureNamesString
> 0000000008077344 0000000000007040 D ScanKeywords
> 0000000008184928 0000000000004320 d ConfigureNamesEnum
> 0000000008197760 0000000000003360 d ConfigureNamesReal
> 0000000008062976 0000000000002304 d DCH_keywords
> 0000000008069952 0000000000002016 D pg_wchar_table
> 0000000008075552 0000000000001776 d encoding_match_list
> 
> fmgr_builtins isn't readonly even though declared a const - I assume
> because it's full of addresses that will be mapped differently from
> execution to execution.
> 
> ConfigureNames* isn't marked as const, because we update them:
>         /* Rather than requiring vartype to be filled in by hand, do this: */
>         conf->gen.vartype = PGC_BOOL;
> 
> I'm unclear as to why ScanKeywords, DCH_keywords aren't in a readonly
> section.

It's because they contain pointers to strings, which are affected by
relocations (and position independent executables force everything to be
relocatable). They do go into .data.rel.ro however:

$ objdump -t ~/build/postgres/dev-optimize/vpath/src/backend/postgres|grep -E
'\b(ScanKeywords|fmgr_builtins|DCH_keywords|pg_wchar_table)\b'
00000000007b0800 l     O .data.rel.ro    0000000000000900              DCH_keywords
00000000007b4020 g     O .data.rel.ro    0000000000001b80              ScanKeywords
00000000007b65a0 g     O .data.rel.ro    0000000000015760              fmgr_builtins
00000000007b2340 g     O .data.rel.ro    00000000000007e0              pg_wchar_table

as long as we're using forking rather than EXEC_BACKEND, that's
perfectly fine.  I don't quite know how windows handles any of this, so
I can't say whether it's a problem there.

Greetings,

Andres Freund


Re: Large writable variables

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> So we have 500kb of not-initialized memory mapped into every
> process. That's, uh, not nothing.

Bleah.

> 0000000008585088 0000000000131104 b hist_entries
> 0000000008716192 0000000000016384 b hist_start

I'm unsure what fraction of processes would have use for these.

> 0000000008435040 0000000000085280 b DCHCache
> 0000000008391168 0000000000043840 b NUMCache

We could surely allocate these on first use.

> 0000000008560224 0000000000023440 b tzdefrules_s
> 0000000008536704 0000000000023440 b gmtmem.7009

I think that tzdefrules_s is not used in common cases (though I could be
wrong about that), so we could win by alloc-on-first-use.  The same might
be true for gmtmem, but there's a sticking point: there is no provision
for failure there, so I'm unsure how we avoid crashing on OOM.

> 0000000008238336 0000000000008192 b PqRecvBuffer
> 0000000008734208 0000000000005136 B BackendWritebackContext
> 0000000008386368 0000000000003200 b held_lwlocks

These are below my personal threshold of pain.

> fmgr_builtins isn't readonly even though declared a const - I assume
> because it's full of addresses that will be mapped differently from
> execution to execution.

Check.

> I'm unclear as to why ScanKeywords, DCH_keywords aren't in a readonly
> section.

I think it's the same problem: pointers can't be truly const because
they have to be changed if one relocates the executable.

We could possibly fix these by changing the data structure so that
what's in a ScanKeywords entry is an offset into some giant string
constant somewhere.  No idea how that would affect performance, but
I do notice that we could reduce the sizeof(ScanKeyword), which can't
hurt.

            regards, tom lane


Re: Large writable variables

From
Andres Freund
Date:
Hi,

On 2018-10-15 16:36:26 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > So we have 500kb of not-initialized memory mapped into every
> > process. That's, uh, not nothing.
> 
> Bleah.

Yea...

> > 0000000008585088 0000000000131104 b hist_entries
> > 0000000008716192 0000000000016384 b hist_start
> 
> I'm unsure what fraction of processes would have use for these.

Yea, I'm not sure either. Although I suspect that given the cost of
compression having an "allocate on first use" check should be quite
doable.


> > 0000000008435040 0000000000085280 b DCHCache
> > 0000000008391168 0000000000043840 b NUMCache
> 
> We could surely allocate these on first use.

yep.


> > 0000000008560224 0000000000023440 b tzdefrules_s
> > 0000000008536704 0000000000023440 b gmtmem.7009
> 
> I think that tzdefrules_s is not used in common cases (though I could be
> wrong about that), so we could win by alloc-on-first-use.  The same might
> be true for gmtmem, but there's a sticking point: there is no provision
> for failure there, so I'm unsure how we avoid crashing on OOM.

I guess we could return false / NULL to the caller. Not perfect, but
there's not that many callers. We could wrap them in wrappers that throw
errors...


> > 0000000008238336 0000000000008192 b PqRecvBuffer
> > 0000000008734208 0000000000005136 B BackendWritebackContext
> > 0000000008386368 0000000000003200 b held_lwlocks
> 
> These are below my personal threshold of pain.

Yea, agreed. PqRecvBuffer and held_lwlocks are common enough that it
makes sense to pre-allocate them anyway. I guess you could argue that
BackendWritebackContext should be dynamically allocated.


> > I'm unclear as to why ScanKeywords, DCH_keywords aren't in a readonly
> > section.
> 
> I think it's the same problem: pointers can't be truly const because
> they have to be changed if one relocates the executable.

Right. It's, as I noticed when looking via objdupm, in .data.rel.ro, so
I think it's not that bad.


> We could possibly fix these by changing the data structure so that
> what's in a ScanKeywords entry is an offset into some giant string
> constant somewhere.  No idea how that would affect performance, but
> I do notice that we could reduce the sizeof(ScanKeyword), which can't
> hurt.

Yea, that might even help performancewise. Alternatively we could change
ScanKeyword to store the keyword name inline, but that'd be a measurable
size increase...

Greetings,

Andres Freund


Re: Large writable variables

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2018-10-15 16:36:26 -0400, Tom Lane wrote:
>> Andres Freund <andres@anarazel.de> writes:
>>> 0000000008585088 0000000000131104 b hist_entries
>>> 0000000008716192 0000000000016384 b hist_start

>> I'm unsure what fraction of processes would have use for these.

> Yea, I'm not sure either. Although I suspect that given the cost of
> compression having an "allocate on first use" check should be quite
> doable.

I don't think the extra check would be a problem, but if we end up
needing the space in most processes, we're not really buying anything.
It'd need some investigation.

>> We could possibly fix these by changing the data structure so that
>> what's in a ScanKeywords entry is an offset into some giant string
>> constant somewhere.  No idea how that would affect performance, but
>> I do notice that we could reduce the sizeof(ScanKeyword), which can't
>> hurt.

> Yea, that might even help performancewise. Alternatively we could change
> ScanKeyword to store the keyword name inline, but that'd be a measurable
> size increase...

Yeah.  It also seems like doing it this way would improve locality of
access: the pieces of the giant string would presumably be in the same
order as the ScanKeywords entries, whereas with the current setup,
who knows where the compiler has put 'em or in what order.

We'd need some tooling to generate the constants that way, though;
I can't see how to make it directly from kwlist.h.

            regards, tom lane


Re: Large writable variables

From
Andres Freund
Date:
Hi,

On 2018-10-15 16:54:53 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2018-10-15 16:36:26 -0400, Tom Lane wrote:
> >> Andres Freund <andres@anarazel.de> writes:
> >>> 0000000008585088 0000000000131104 b hist_entries
> >>> 0000000008716192 0000000000016384 b hist_start
> 
> >> I'm unsure what fraction of processes would have use for these.
> 
> > Yea, I'm not sure either. Although I suspect that given the cost of
> > compression having an "allocate on first use" check should be quite
> > doable.
> 
> I don't think the extra check would be a problem, but if we end up
> needing the space in most processes, we're not really buying anything.
> It'd need some investigation.

I don't think it's particularly uncommon to have processes that don't
use any toasted datums.  I'm not sure however how to put numbers on
that? After all, it'll be drastically workload dependant.


> >> We could possibly fix these by changing the data structure so that
> >> what's in a ScanKeywords entry is an offset into some giant string
> >> constant somewhere.  No idea how that would affect performance, but
> >> I do notice that we could reduce the sizeof(ScanKeyword), which can't
> >> hurt.
> 
> > Yea, that might even help performancewise. Alternatively we could change
> > ScanKeyword to store the keyword name inline, but that'd be a measurable
> > size increase...
> 
> Yeah.  It also seems like doing it this way would improve locality of
> access: the pieces of the giant string would presumably be in the same
> order as the ScanKeywords entries, whereas with the current setup,
> who knows where the compiler has put 'em or in what order.

I assume you're talking about the offset approach. Performancewise I
assume that my suggestion of inlining the names into the struct would be
faster.  Are there many realistic cases where performance matters enough
to warrant the size increase?


> We'd need some tooling to generate the constants that way, though;
> I can't see how to make it directly from kwlist.h.

I guess we could create a struct with all the strings as members. But
that seems fairly nasty.

Greetings,

Andres Freund


Re: Large writable variables

From
Thomas Munro
Date:
On Tue, Oct 16, 2018 at 9:07 AM Andres Freund <andres@anarazel.de> wrote:
> $ nm -t d --size-sort -r -S src/backend/postgres|grep '\b[bB]\b'|head
> 0000000008251872 0000000000131144 b LagTracker
...
> So we have a two variables sized 130kb. Yikes.
...
> that's not actually used very often, nor in all processes... Thomas?

Yeah, here's a patch to move it in to the heap.

-- 
Thomas Munro
http://www.enterprisedb.com

Attachment

Re: Large writable variables

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2018-10-15 16:54:53 -0400, Tom Lane wrote:
>> Yeah.  It also seems like doing it this way would improve locality of
>> access: the pieces of the giant string would presumably be in the same
>> order as the ScanKeywords entries, whereas with the current setup,
>> who knows where the compiler has put 'em or in what order.

> I assume you're talking about the offset approach. Performancewise I
> assume that my suggestion of inlining the names into the struct would be
> faster.  Are there many realistic cases where performance matters enough
> to warrant the size increase?

Doubt it, because there'd be an awful lot of wasted space due to the need
to set the struct size large enough for the longest keyword.  (Plus it
would likely not come out to be a power-of-2 size, slowing array
indexing.)  If you want this to be cache-friendly, I'd think the smaller
the better.

            regards, tom lane


Re: Large writable variables

From
Andres Freund
Date:
On 2018-10-16 10:16:34 +1300, Thomas Munro wrote:
> On Tue, Oct 16, 2018 at 9:07 AM Andres Freund <andres@anarazel.de> wrote:
> > $ nm -t d --size-sort -r -S src/backend/postgres|grep '\b[bB]\b'|head
> > 0000000008251872 0000000000131144 b LagTracker
> ...
> > So we have a two variables sized 130kb. Yikes.
> ...
> > that's not actually used very often, nor in all processes... Thomas?
> 
> Yeah, here's a patch to move it in to the heap.
> 
> -- 
> Thomas Munro
> http://www.enterprisedb.com

> From f3fb64e67c1e86e11dfafc8a712e9750f650f60b Mon Sep 17 00:00:00 2001
> From: Thomas Munro <thomas.munro@enterprisedb.com>
> Date: Tue, 16 Oct 2018 10:08:57 +1300
> Subject: [PATCH] Move the replication lag tracker into heap memory.
> 
> Andres Freund complained about the 128KB of .bss occupied by LagTracker.
> It's only needed in the walsender process, so allocate it in heap
> memory there.

Cool, let's do that.

Greetings,

Andres Freund


Re: Large writable variables

From
Thomas Munro
Date:
On Tue, Oct 16, 2018 at 10:57 AM Andres Freund <andres@anarazel.de> wrote:
> > Subject: [PATCH] Move the replication lag tracker into heap memory.
> >
> > Andres Freund complained about the 128KB of .bss occupied by LagTracker.
> > It's only needed in the walsender process, so allocate it in heap
> > memory there.
>
> Cool, let's do that.

Pushed.

-- 
Thomas Munro
http://www.enterprisedb.com


Re: Large writable variables

From
Tom Lane
Date:
I wrote:
> Andres Freund <andres@anarazel.de> writes:
>> 0000000008560224 0000000000023440 b tzdefrules_s
>> 0000000008536704 0000000000023440 b gmtmem.7009

> I think that tzdefrules_s is not used in common cases (though I could be
> wrong about that), so we could win by alloc-on-first-use.  The same might
> be true for gmtmem, but there's a sticking point: there is no provision
> for failure there, so I'm unsure how we avoid crashing on OOM.

So my intuition was exactly backwards on this.

1. tzdefrules_s is filled in the postmaster, and if it's not filled there
it'd be filled by every child process, so there's zero point in converting
it to malloc-on-the-fly style.  This is because tzparse() always wants
it filled.  That's actually a tad annoying, because it looks to me like
with many timezone settings the data will not get used, but I'm hesitant
to mess with the IANA code's logic enough to improve that.  Maybe I'll try
submitting an upstream patch and see what they think of it first.

2. gmtmem, on the other hand, is only used if somebody calls pg_gmtime,
which already has a perfectly good error-report convention, cf gmtime(3).
We have exactly one caller, which was not bothering to test for error :-(
and would dump core on failure.  And that caller isn't reached in most
processes, at least not in the regression tests.  So this side of it is
easy to improve.

Hence I propose the attached patch for point 2, which I'd want to
backpatch to keep our copies of the IANA code in sync.  Point 1
maybe can be addressed in future.

            regards, tom lane

diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index ba15a29..449164a 100644
*** a/src/backend/utils/adt/timestamp.c
--- b/src/backend/utils/adt/timestamp.c
*************** GetEpochTime(struct pg_tm *tm)
*** 2008,2013 ****
--- 2008,2016 ----

      t0 = pg_gmtime(&epoch);

+     if (t0 == NULL)
+         elog(ERROR, "could not convert epoch to timestamp: %m");
+
      tm->tm_year = t0->tm_year;
      tm->tm_mon = t0->tm_mon;
      tm->tm_mday = t0->tm_mday;
diff --git a/src/timezone/localtime.c b/src/timezone/localtime.c
index 31b06b0..04a1013 100644
*** a/src/timezone/localtime.c
--- b/src/timezone/localtime.c
*************** gmtsub(pg_time_t const *timep, int32 off
*** 1328,1340 ****
      struct pg_tm *result;

      /* GMT timezone state data is kept here */
!     static struct state gmtmem;
!     static bool gmt_is_set = false;
! #define gmtptr        (&gmtmem)

!     if (!gmt_is_set)
      {
!         gmt_is_set = true;
          gmtload(gmtptr);
      }
      result = timesub(timep, offset, gmtptr, tmp);
--- 1328,1341 ----
      struct pg_tm *result;

      /* GMT timezone state data is kept here */
!     static struct state *gmtptr = NULL;

!     if (gmtptr == NULL)
      {
!         /* Allocate on first use. */
!         gmtptr = (struct state *) malloc(sizeof(struct state));
!         if (gmtptr == NULL)
!             return NULL;        /* errno should be set by malloc */
          gmtload(gmtptr);
      }
      result = timesub(timep, offset, gmtptr, tmp);

Re: Large writable variables

From
Andres Freund
Date:
Hi,

I've a patch making .data variables constant, fwiw. Will post in a
bit. Just so we don't duplicate work too much.


On 2018-10-15 19:41:49 -0400, Tom Lane wrote:
> I wrote:
> > Andres Freund <andres@anarazel.de> writes:
> >> 0000000008560224 0000000000023440 b tzdefrules_s
> >> 0000000008536704 0000000000023440 b gmtmem.7009
> 
> > I think that tzdefrules_s is not used in common cases (though I could be
> > wrong about that), so we could win by alloc-on-first-use.  The same might
> > be true for gmtmem, but there's a sticking point: there is no provision
> > for failure there, so I'm unsure how we avoid crashing on OOM.
> 
> So my intuition was exactly backwards on this.
> 
> 1. tzdefrules_s is filled in the postmaster, and if it's not filled there
> it'd be filled by every child process, so there's zero point in converting
> it to malloc-on-the-fly style.  This is because tzparse() always wants
> it filled.  That's actually a tad annoying, because it looks to me like
> with many timezone settings the data will not get used, but I'm hesitant
> to mess with the IANA code's logic enough to improve that.  Maybe I'll try
> submitting an upstream patch and see what they think of it first.

Ok, that makes sense.


> 2. gmtmem, on the other hand, is only used if somebody calls pg_gmtime,
> which already has a perfectly good error-report convention, cf gmtime(3).
> We have exactly one caller, which was not bothering to test for error :-(
> and would dump core on failure.  And that caller isn't reached in most
> processes, at least not in the regression tests.  So this side of it is
> easy to improve.
> 
> Hence I propose the attached patch for point 2, which I'd want to
> backpatch to keep our copies of the IANA code in sync.

That makes sense, too.

Greetings,

Andres Freund


Re: Large writable variables

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> top unitialized allocations:
> 0000000008435040 0000000000085280 b DCHCache
> 0000000008391168 0000000000043840 b NUMCache

Here's a patch to improve that situation.

            regards, tom lane

diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index b3ff133..43a0307 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -93,6 +93,7 @@
 #include "utils/float.h"
 #include "utils/formatting.h"
 #include "utils/int8.h"
+#include "utils/memutils.h"
 #include "utils/numeric.h"
 #include "utils/pg_locale.h"

@@ -385,12 +386,12 @@ typedef struct
 } NUMCacheEntry;

 /* global cache for date/time format pictures */
-static DCHCacheEntry DCHCache[DCH_CACHE_ENTRIES];
+static DCHCacheEntry *DCHCache[DCH_CACHE_ENTRIES];
 static int    n_DCHCache = 0;        /* current number of entries */
 static int    DCHCounter = 0;        /* aging-event counter */

 /* global cache for number format pictures */
-static NUMCacheEntry NUMCache[NUM_CACHE_ENTRIES];
+static NUMCacheEntry *NUMCache[NUM_CACHE_ENTRIES];
 static int    n_NUMCache = 0;        /* current number of entries */
 static int    NUMCounter = 0;        /* aging-event counter */

@@ -3368,13 +3369,13 @@ DCH_cache_getnew(const char *str)
 {
     DCHCacheEntry *ent;

-    /* counter overflow check - paranoia? */
+    /* handle counter overflow by resetting all ages */
     if (DCHCounter >= (INT_MAX - DCH_CACHE_ENTRIES))
     {
         DCHCounter = 0;

-        for (ent = DCHCache; ent < (DCHCache + DCH_CACHE_ENTRIES); ent++)
-            ent->age = (++DCHCounter);
+        for (int i = 0; i < n_DCHCache; i++)
+            DCHCache[i]->age = (++DCHCounter);
     }

     /*
@@ -3382,15 +3383,16 @@ DCH_cache_getnew(const char *str)
      */
     if (n_DCHCache >= DCH_CACHE_ENTRIES)
     {
-        DCHCacheEntry *old = DCHCache + 0;
+        DCHCacheEntry *old = DCHCache[0];

 #ifdef DEBUG_TO_FROM_CHAR
         elog(DEBUG_elog_output, "cache is full (%d)", n_DCHCache);
 #endif
         if (old->valid)
         {
-            for (ent = DCHCache + 1; ent < (DCHCache + DCH_CACHE_ENTRIES); ent++)
+            for (int i = 1; i < DCH_CACHE_ENTRIES; i++)
             {
+                ent = DCHCache[i];
                 if (!ent->valid)
                 {
                     old = ent;
@@ -3414,7 +3416,9 @@ DCH_cache_getnew(const char *str)
 #ifdef DEBUG_TO_FROM_CHAR
         elog(DEBUG_elog_output, "NEW (%d)", n_DCHCache);
 #endif
-        ent = DCHCache + n_DCHCache;
+        Assert(DCHCache[n_DCHCache] == NULL);
+        DCHCache[n_DCHCache] = ent = (DCHCacheEntry *)
+            MemoryContextAllocZero(TopMemoryContext, sizeof(DCHCacheEntry));
         ent->valid = false;
         StrNCpy(ent->str, str, DCH_CACHE_SIZE + 1);
         ent->age = (++DCHCounter);
@@ -3428,20 +3432,19 @@ DCH_cache_getnew(const char *str)
 static DCHCacheEntry *
 DCH_cache_search(const char *str)
 {
-    int            i;
-    DCHCacheEntry *ent;
-
-    /* counter overflow check - paranoia? */
+    /* handle counter overflow by resetting all ages */
     if (DCHCounter >= (INT_MAX - DCH_CACHE_ENTRIES))
     {
         DCHCounter = 0;

-        for (ent = DCHCache; ent < (DCHCache + DCH_CACHE_ENTRIES); ent++)
-            ent->age = (++DCHCounter);
+        for (int i = 0; i < n_DCHCache; i++)
+            DCHCache[i]->age = (++DCHCounter);
     }

-    for (i = 0, ent = DCHCache; i < n_DCHCache; i++, ent++)
+    for (int i = 0; i < n_DCHCache; i++)
     {
+        DCHCacheEntry *ent = DCHCache[i];
+
         if (ent->valid && strcmp(ent->str, str) == 0)
         {
             ent->age = (++DCHCounter);
@@ -4047,13 +4050,13 @@ NUM_cache_getnew(const char *str)
 {
     NUMCacheEntry *ent;

-    /* counter overflow check - paranoia? */
+    /* handle counter overflow by resetting all ages */
     if (NUMCounter >= (INT_MAX - NUM_CACHE_ENTRIES))
     {
         NUMCounter = 0;

-        for (ent = NUMCache; ent < (NUMCache + NUM_CACHE_ENTRIES); ent++)
-            ent->age = (++NUMCounter);
+        for (int i = 0; i < n_NUMCache; i++)
+            NUMCache[i]->age = (++NUMCounter);
     }

     /*
@@ -4061,15 +4064,16 @@ NUM_cache_getnew(const char *str)
      */
     if (n_NUMCache >= NUM_CACHE_ENTRIES)
     {
-        NUMCacheEntry *old = NUMCache + 0;
+        NUMCacheEntry *old = NUMCache[0];

 #ifdef DEBUG_TO_FROM_CHAR
         elog(DEBUG_elog_output, "Cache is full (%d)", n_NUMCache);
 #endif
         if (old->valid)
         {
-            for (ent = NUMCache + 1; ent < (NUMCache + NUM_CACHE_ENTRIES); ent++)
+            for (int i = 1; i < NUM_CACHE_ENTRIES; i++)
             {
+                ent = NUMCache[i];
                 if (!ent->valid)
                 {
                     old = ent;
@@ -4093,7 +4097,9 @@ NUM_cache_getnew(const char *str)
 #ifdef DEBUG_TO_FROM_CHAR
         elog(DEBUG_elog_output, "NEW (%d)", n_NUMCache);
 #endif
-        ent = NUMCache + n_NUMCache;
+        Assert(NUMCache[n_NUMCache] == NULL);
+        NUMCache[n_NUMCache] = ent = (NUMCacheEntry *)
+            MemoryContextAllocZero(TopMemoryContext, sizeof(NUMCacheEntry));
         ent->valid = false;
         StrNCpy(ent->str, str, NUM_CACHE_SIZE + 1);
         ent->age = (++NUMCounter);
@@ -4107,20 +4113,19 @@ NUM_cache_getnew(const char *str)
 static NUMCacheEntry *
 NUM_cache_search(const char *str)
 {
-    int            i;
-    NUMCacheEntry *ent;
-
-    /* counter overflow check - paranoia? */
+    /* handle counter overflow by resetting all ages */
     if (NUMCounter >= (INT_MAX - NUM_CACHE_ENTRIES))
     {
         NUMCounter = 0;

-        for (ent = NUMCache; ent < (NUMCache + NUM_CACHE_ENTRIES); ent++)
-            ent->age = (++NUMCounter);
+        for (int i = 0; i < n_NUMCache; i++)
+            NUMCache[i]->age = (++NUMCounter);
     }

-    for (i = 0, ent = NUMCache; i < n_NUMCache; i++, ent++)
+    for (int i = 0; i < n_NUMCache; i++)
     {
+        NUMCacheEntry *ent = NUMCache[i];
+
         if (ent->valid && strcmp(ent->str, str) == 0)
         {
             ent->age = (++NUMCounter);

Re: Large writable variables

From
Andres Freund
Date:
Hi,

On 2018-10-15 16:45:03 -0700, Andres Freund wrote:
> I've a patch making .data variables constant, fwiw. Will post in a
> bit. Just so we don't duplicate work too much.

I pushed a few very obvious ones.

An additional fix, for the system attributes in heap.c, is attached. But
that's a bit more invasive, so I wanted to get some input.

I think all of the changes are fairly obvious and the new const
attributes just document what already had to be the case.  But to
correctly propagate the const through typedef-the-pointer-away typedefs,
I had to use the underlying type (e.g. use const NameData *n2 instad of
Name n2). To me that's the correct fix, but then I hate these typedefs
with a passion anyway.  An alternative fix is to add additional
typedefs like ConstName, but I don't find that particularly
advantageous.

Comments?

Arguably we should use consts more aggresively in signatures anyway, but
I think that's a separate effort.

Greetings,

Andres Freund

Attachment

Re: Large writable variables

From
Andres Freund
Date:
Hi,

After the last few changes (including a proposed one), we now have in
the .data segment (i.e. mutable initialized variables):

$ objdump -j .data -t ~/build/postgres/dev-optimize/vpath/src/backend/postgres|awk '{print $4, $5, $6}'|sort
-r|lessGreetings,
.data 0000000000004380 ConfigureNamesInt
.data 0000000000003570 ConfigureNamesBool
.data 0000000000002140 ConfigureNamesString
.data 00000000000010e0 ConfigureNamesEnum
.data 0000000000000d20 ConfigureNamesReal

These are by far the largest chunk of the .data segment (like 47152
bytes out of 51168 bytes).  It's not entirely trivial to fix. While we
can slap consts onto large parts of the functions taking a struct
config_generic*, there's a few places where *do* modify them.

ISTM, to actually fix, we'd have to split
struct config_generic
{
    /* constant fields, must be set correctly in initial value: */
    const char *name;            /* name of variable - MUST BE FIRST */
    GucContext    context;        /* context required to set the variable */
    enum config_group group;    /* to help organize variables by function */
    const char *short_desc;        /* short desc. of this variable's purpose */
    const char *long_desc;        /* long desc. of this variable's purpose */
    int            flags;            /* flag bits, see guc.h */
    /* variable fields, initialized at runtime: */
    enum config_type vartype;    /* type of variable (set only at startup) */
    int            status;            /* status bits, see below */
    GucSource    source;            /* source of the current actual value */
    GucSource    reset_source;    /* source of the reset_value */
    GucContext    scontext;        /* context that set the current value */
    GucContext    reset_scontext; /* context that set the reset value */
    GucStack   *stack;            /* stacked prior values */
    void       *extra;            /* "extra" pointer for current actual value */
    char       *sourcefile;        /* file current setting is from (NULL if not
                                 * set in config file) */
    int            sourceline;        /* line in source file */
};

into two different arrays. Namely the already denoted 'constant fields'
and 'variable fields.  But it's a bit more complicated than that: The
size doesn't come just from the base config_struct, but also from the
config_{bool,int,real,string,enum} arrays.  Where we again have separate
'constant fields' and 'variable fields'.

So we'd have to refactor more heavily than just splitting the above
array into two.

I kind of wonder, if we shouldn't remove the separate ConfigureNamesInt*
int arrays and change things so we have one

struct config_def
{
    const char *name;                       /* name of variable - MUST BE FIRST */
    GucContext      context;                /* context required to set the variable */
    enum config_group group;        /* to help organize variables by function */
    const char *short_desc;         /* short desc. of this variable's purpose */
    const char *long_desc;          /* long desc. of this variable's purpose */
    int                     flags;                  /* flag bits, see guc.h */
    enum config_type vartype;       /* type of variable (set only at startup) */

    union
    {
        struct
        {
            bool       *variable;
            bool            boot_val;
            GucBoolCheckHook check_hook;
            GucBoolAssignHook assign_hook;
            GucShowHook show_hook;
        } config_bool;

        struct
        {
            int           *variable;
            int            boot_val;
            int            min;
            int            max;
            GucIntCheckHook check_hook;
            GucIntAssignHook assign_hook;
            GucShowHook show_hook;
        } config_int

        ...
    } pertype;
}

and then a corresponding struct config_val with a similar union.

Besides reducing modified memory, it'd also have the benefit of making
the grouping within the various arrays much more sensible, because
related fields could be grouped together.


.data 0000000000000420 intRelOpts
.data 00000000000001c0 realRelOpts
.data 0000000000000118 boolRelOpts
.data 00000000000000a8 stringRelOpts

these should be readonly, but the code doesn't make that easy. There's
pending refactorings, I don't know how they effect this.


.data 0000000000000068 SnapshotSelfData
.data 0000000000000068 SnapshotAnyData
.data 0000000000000068 SecondarySnapshotData
.data 0000000000000068 CurrentSnapshotData
.data 0000000000000068 CatalogSnapshotData

The obviously actually are modifyable.


.data 0000000000000028 spi_printtupDR
.data 0000000000000028 printsimpleDR
.data 0000000000000028 donothingDR
.data 0000000000000028 debugtupDR

These we could actually make constant, but CreateDestReceiver() as an
API makes that inconvenient. They also are pretty darn small...  There's
a security benefit in making them constant and casting the constness
away - I think that might not be insane.


- Andres


Re: Large writable variables

From
Andres Freund
Date:
Hi,

On 2018-10-15 21:50:51 -0700, Andres Freund wrote:
> .data 0000000000000028 spi_printtupDR
> .data 0000000000000028 printsimpleDR
> .data 0000000000000028 donothingDR
> .data 0000000000000028 debugtupDR
> 
> These we could actually make constant, but CreateDestReceiver() as an
> API makes that inconvenient. They also are pretty darn small...  There's
> a security benefit in making them constant and casting the constness
> away - I think that might not be insane.

I.e. do something like the attached.

Greetings,

Andres Freund

Attachment

Re: Large writable variables

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> [ let's rearrange the GUC structs ]

I find it hard to believe that the API breaks you'll cause for extensions
are a good trade for a few kB reduction in the size of the .data segment.

            regards, tom lane


Re: Large writable variables

From
Tom Lane
Date:
I wrote:
> Andres Freund <andres@anarazel.de> writes:
>> top unitialized allocations:
>> 0000000008435040 0000000000085280 b DCHCache
>> 0000000008391168 0000000000043840 b NUMCache

> Here's a patch to improve that situation.

Hm, looking at that more closely, there's a problem with the idea of
allocating the cache slots one-at-a-time.  Currently,
sizeof(DCHCacheEntry) and sizeof(NUMCacheEntry) are each just a bit more
than a power of 2, which would cause palloc to waste nearly 50% of the
allocation it makes for them.

We could forget the one-at-a-time idea and just allocate the whole
array on first use, but I feel like that's probably not a good answer.
I'm inclined to instead reduce DCH_CACHE_SIZE and NUM_CACHE_SIZE enough
to make the structs just less than powers of 2 instead of just more ---
AFAICS, both of those numbers were picked out of the air, and so there's
no reason not to pick a different number out of the air.

Also, I noticed that the biggest part of those structs are arrays of
FormatNode, which has been designed with complete lack of thought about
size or padding issues.  We can very easily cut it in half on 64-bit
machines.

The attached patch does both of those things.  It's independent of
the other one but is important to make that one efficient.

            regards, tom lane

diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index b3ff133..9c2649e 100644
*** a/src/backend/utils/adt/formatting.c
--- b/src/backend/utils/adt/formatting.c
***************
*** 124,133 ****
   */
  typedef struct
  {
!     char       *name;            /* suffix string        */
      int            len,            /* suffix length        */
                  id,                /* used in node->suffix */
!                 type;            /* prefix / postfix            */
  } KeySuffix;

  /* ----------
--- 124,133 ----
   */
  typedef struct
  {
!     const char *name;            /* suffix string        */
      int            len,            /* suffix length        */
                  id,                /* used in node->suffix */
!                 type;            /* prefix / postfix        */
  } KeySuffix;

  /* ----------
*************** typedef struct
*** 155,164 ****

  typedef struct
  {
!     int            type;            /* NODE_TYPE_XXX, see below */
!     const KeyWord *key;            /* if type is ACTION */
      char        character[MAX_MULTIBYTE_CHAR_LEN + 1];    /* if type is CHAR */
!     int            suffix;            /* keyword prefix/suffix code, if any */
  } FormatNode;

  #define NODE_TYPE_END        1
--- 155,164 ----

  typedef struct
  {
!     uint8        type;            /* NODE_TYPE_XXX, see below */
      char        character[MAX_MULTIBYTE_CHAR_LEN + 1];    /* if type is CHAR */
!     uint8        suffix;            /* keyword prefix/suffix code, if any */
!     const KeyWord *key;            /* if type is ACTION */
  } FormatNode;

  #define NODE_TYPE_END        1
*************** typedef struct
*** 358,370 ****
   * For simplicity, the cache entries are fixed-size, so they allow for the
   * worst case of a FormatNode for each byte in the picture string.
   *
   * The max number of entries in the caches is DCH_CACHE_ENTRIES
   * resp. NUM_CACHE_ENTRIES.
   * ----------
   */
! #define NUM_CACHE_SIZE        64
  #define NUM_CACHE_ENTRIES    20
! #define DCH_CACHE_SIZE        128
  #define DCH_CACHE_ENTRIES    20

  typedef struct
--- 358,374 ----
   * For simplicity, the cache entries are fixed-size, so they allow for the
   * worst case of a FormatNode for each byte in the picture string.
   *
+  * The CACHE_SIZE constants are chosen to make sizeof(DCHCacheEntry) and
+  * sizeof(NUMCacheEntry) be powers of 2, or just less than that, so that
+  * we don't waste too much space by palloc'ing them individually.
+  *
   * The max number of entries in the caches is DCH_CACHE_ENTRIES
   * resp. NUM_CACHE_ENTRIES.
   * ----------
   */
! #define NUM_CACHE_SIZE        56
  #define NUM_CACHE_ENTRIES    20
! #define DCH_CACHE_SIZE        119
  #define DCH_CACHE_ENTRIES    20

  typedef struct
*************** do { \
*** 496,502 ****
   *****************************************************************************/

  /* ----------
!  * Suffixes:
   * ----------
   */
  #define DCH_S_FM    0x01
--- 500,506 ----
   *****************************************************************************/

  /* ----------
!  * Suffixes (FormatNode.suffix is an OR of these codes)
   * ----------
   */
  #define DCH_S_FM    0x01

Re: Large writable variables

From
Andres Freund
Date:
Hi,

On 2018-10-16 01:49:22 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > [ let's rearrange the GUC structs ]
> 
> I find it hard to believe that the API breaks you'll cause for extensions
> are a good trade for a few kB reduction in the size of the .data segment.

I'm doubtful that it's worth it too. But more because it seems like a
fair bit of work. I don't think that many extensions use guc_tables.h -
shouldn't most use guc.h, which wouldn't be affected?

Greetings,

Andres Freund


Re: Large writable variables

From
Andres Freund
Date:
On 2018-10-15 22:02:00 -0700, Andres Freund wrote:
> Hi,
> 
> On 2018-10-15 21:50:51 -0700, Andres Freund wrote:
> > .data 0000000000000028 spi_printtupDR
> > .data 0000000000000028 printsimpleDR
> > .data 0000000000000028 donothingDR
> > .data 0000000000000028 debugtupDR
> > 
> > These we could actually make constant, but CreateDestReceiver() as an
> > API makes that inconvenient. They also are pretty darn small...  There's
> > a security benefit in making them constant and casting the constness
> > away - I think that might not be insane.
> 
> I.e. do something like the attached.

This just reminded me that a couple times I wanted a cast that casts
away const, but otherwise makes sure the type stays the same. I don't
think there's a way to do that in C, but we can write one that verifies
the cast doesn't do something bad if gcc is used:

#if defined(HAVE__BUILTIN_TYPES_COMPATIBLE_P)
#define unconstify(cst, var) StaticAssertExpr(__builtin_types_compatible_p(__typeof(var), const cst), "wrong cast"),
(cst)(var)
 
#else
#define unconstify(cst, var) ((cst) (var))
#endif

Does anybody besides me see value in adding a cleaned up version of
that?

Greetings,

Andres Freund


Re: Large writable variables

From
Andres Freund
Date:
Hi,

On 2018-10-16 01:59:00 -0400, Tom Lane wrote:
> I wrote:
> > Andres Freund <andres@anarazel.de> writes:
> >> top unitialized allocations:
> >> 0000000008435040 0000000000085280 b DCHCache
> >> 0000000008391168 0000000000043840 b NUMCache
> 
> > Here's a patch to improve that situation.
> 
> Hm, looking at that more closely, there's a problem with the idea of
> allocating the cache slots one-at-a-time.  Currently,
> sizeof(DCHCacheEntry) and sizeof(NUMCacheEntry) are each just a bit more
> than a power of 2, which would cause palloc to waste nearly 50% of the
> allocation it makes for them.

Hm, that's a bit annoying...


> We could forget the one-at-a-time idea and just allocate the whole
> array on first use, but I feel like that's probably not a good answer.

I suspect it'd be fine, but obviously we can do better.


> Also, I noticed that the biggest part of those structs are arrays of
> FormatNode, which has been designed with complete lack of thought about
> size or padding issues.  We can very easily cut it in half on 64-bit
> machines.

Heh, neat. I feel like we've paid very little attention to that in a
myriad of places :(

Greetings,

Andres Freund


Re: Large writable variables

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2018-10-16 01:59:00 -0400, Tom Lane wrote:
>> Also, I noticed that the biggest part of those structs are arrays of
>> FormatNode, which has been designed with complete lack of thought about
>> size or padding issues.  We can very easily cut it in half on 64-bit
>> machines.

> Heh, neat. I feel like we've paid very little attention to that in a
> myriad of places :(

Most of the time, we probably *shouldn't* pay attention to it; logical
field ordering is worth a good deal IMO.  But in a case like this,
where there are large arrays of the things and it's not very painful
to avoid padding waste, it's worth the trouble.

            regards, tom lane


Re: Large writable variables

From
Robert Haas
Date:
On Tue, Oct 16, 2018 at 2:30 AM Andres Freund <andres@anarazel.de> wrote:
> This just reminded me that a couple times I wanted a cast that casts
> away const, but otherwise makes sure the type stays the same. I don't
> think there's a way to do that in C, but we can write one that verifies
> the cast doesn't do something bad if gcc is used:
>
> #if defined(HAVE__BUILTIN_TYPES_COMPATIBLE_P)
> #define unconstify(cst, var) StaticAssertExpr(__builtin_types_compatible_p(__typeof(var), const cst), "wrong cast"),
(cst)(var)
 
> #else
> #define unconstify(cst, var) ((cst) (var))
> #endif
>
> Does anybody besides me see value in adding a cleaned up version of
> that?

Under what circumstances would we consider this to be a legitimate thing to use?

I think if we add something this, we'd better accompany it with some
detailed and very clearly-written statements about when you're allowed
to use it.  Otherwise, I predict that people will use it in cases
where it's not actually safe, and we'll end up with low-grade bugs.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Large writable variables

From
Andres Freund
Date:
On 2018-10-16 11:22:31 -0400, Robert Haas wrote:
> On Tue, Oct 16, 2018 at 2:30 AM Andres Freund <andres@anarazel.de> wrote:
> > This just reminded me that a couple times I wanted a cast that casts
> > away const, but otherwise makes sure the type stays the same. I don't
> > think there's a way to do that in C, but we can write one that verifies
> > the cast doesn't do something bad if gcc is used:
> >
> > #if defined(HAVE__BUILTIN_TYPES_COMPATIBLE_P)
> > #define unconstify(cst, var) StaticAssertExpr(__builtin_types_compatible_p(__typeof(var), const cst), "wrong
cast"),(cst) (var)
 
> > #else
> > #define unconstify(cst, var) ((cst) (var))
> > #endif
> >
> > Does anybody besides me see value in adding a cleaned up version of
> > that?
> 
> Under what circumstances would we consider this to be a legitimate thing to use?

When the variable actually *will not* be modified, but language or API
design reasons makes it unfeasiable to express that. Look e.g.
DestReceiver * CreateDestReceiver(CommandDest dest);
some of the returned receivers (e.g. donothingDR, printsimpleDR) are
statically allocated and *any* modification would be a bug. But other
return values will be modified, e.g. CreateIntoRelDestReceiver().

It's safe to cast constness away if the variable will not actually be
modified after. Which is e.g. the case above. But making the static
allocations const will a) save memory b) trigger sigbuses if you modify
them. So the casting constness away here *increases* robustness.

The problem is that just adding a cast like
        case DestNone:
            return (DestReceiver *) &donothingDR;
also hides errors. If you e.g. changed the type of donothingDR you'd
still not get an error.

So I was wishing for a form of a cast that only casts the const away,
but errors out if there's any other type difference. That's the above, I
think.


> I think if we add something this, we'd better accompany it with some
> detailed and very clearly-written statements about when you're allowed
> to use it.  Otherwise, I predict that people will use it in cases
> where it's not actually safe, and we'll end up with low-grade bugs.

Well, right now people will (and have) just cast the const away like
above. So I don't really see it being more likely to cause problems than
we're doing now.  But yea, it definitely should have a big red warning
label.

Greetings,

Andres Freund


Re: Large writable variables

From
Robert Haas
Date:
On Tue, Oct 16, 2018 at 12:06 PM Andres Freund <andres@anarazel.de> wrote:
> But yea, it definitely should have a big red warning
> label.

That's all I'm asking.  And the warning label shouldn't just say "use
with caution!" but should rather explain how to know whether you're
doing the right thing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Large writable variables

From
Andres Freund
Date:
On 2018-10-16 12:19:55 -0400, Robert Haas wrote:
> On Tue, Oct 16, 2018 at 12:06 PM Andres Freund <andres@anarazel.de> wrote:
> > But yea, it definitely should have a big red warning
> > label.
> 
> That's all I'm asking.  And the warning label shouldn't just say "use
> with caution!" but should rather explain how to know whether you're
> doing the right thing.

I'm thinking of something like:

/*
 * Macro that allows to cast constness away from a variable, but doesn't
 * allow changing the underlying type.  Enforcement of the latter
 * currently only works for gcc like compilers.
 *
 * Please note IT IS NOT SAFE to cast constness away if the variable will ever
 * be modified (it would be undefined behaviour). Doing so anyway can cause
 * compiler misoptimizations or runtime crashes (modifying readonly memory).
 * It is only safe to use when the the variabble will not be modified, but API
 * design or language restrictions prevent you from declaring that
 * (e.g. because a function returns both const and non-const variables).
 */

Greetings,

Andres Freund


Re: Large writable variables

From
Robert Haas
Date:
On Tue, Oct 16, 2018 at 12:26 PM Andres Freund <andres@anarazel.de> wrote:
> /*
>  * Macro that allows to cast constness away from a variable, but doesn't
>  * allow changing the underlying type.  Enforcement of the latter
>  * currently only works for gcc like compilers.
>  *
>  * Please note IT IS NOT SAFE to cast constness away if the variable will ever
>  * be modified (it would be undefined behaviour). Doing so anyway can cause
>  * compiler misoptimizations or runtime crashes (modifying readonly memory).
>  * It is only safe to use when the the variabble will not be modified, but API
>  * design or language restrictions prevent you from declaring that
>  * (e.g. because a function returns both const and non-const variables).
>  */

"variabble" is a little too rich in "b"s.

In terms of a function that returns both const and non-const
variables, it seems a bit sketchy that the caller would know what the
function is doing in particular cases and make decisions based on it,
but maybe that's just how life is.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Large writable variables

From
Andres Freund
Date:
Hi,

On 2018-10-16 12:41:44 -0400, Robert Haas wrote:
> On Tue, Oct 16, 2018 at 12:26 PM Andres Freund <andres@anarazel.de> wrote:
> > /*
> >  * Macro that allows to cast constness away from a variable, but doesn't
> >  * allow changing the underlying type.  Enforcement of the latter
> >  * currently only works for gcc like compilers.
> >  *
> >  * Please note IT IS NOT SAFE to cast constness away if the variable will ever
> >  * be modified (it would be undefined behaviour). Doing so anyway can cause
> >  * compiler misoptimizations or runtime crashes (modifying readonly memory).
> >  * It is only safe to use when the the variabble will not be modified, but API
> >  * design or language restrictions prevent you from declaring that
> >  * (e.g. because a function returns both const and non-const variables).
> >  */
> 
> "variabble" is a little too rich in "b"s.

variababble.


> In terms of a function that returns both const and non-const
> variables, it seems a bit sketchy that the caller would know what the
> function is doing in particular cases and make decisions based on it,
> but maybe that's just how life is.

I don't think it's necessary the callers doing so in most cases. E.g. in
the DestReceiver case, it'll be the choice of the testreceiver (say
intorel_receive modifying things for DestIntoRel), not the caller
choosing when to modify things.  The caller / users of dest receivers
won't necessarily know.

I agree it's not pretty, but I don't quite see any really realistic
other approaches here.

Greetings,

Andres Freund


Re: Large writable variables

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2018-10-16 12:41:44 -0400, Robert Haas wrote:
>> In terms of a function that returns both const and non-const
>> variables, it seems a bit sketchy that the caller would know what the
>> function is doing in particular cases and make decisions based on it,
>> but maybe that's just how life is.

> I don't think it's necessary the callers doing so in most cases. E.g. in
> the DestReceiver case, it'll be the choice of the testreceiver (say
> intorel_receive modifying things for DestIntoRel), not the caller
> choosing when to modify things.  The caller / users of dest receivers
> won't necessarily know.

Yeah, I think the use-case is more like "this API specifies non-const
pointers, but some instances can return pointers to const objects,
while others return non-const objects".  Changing the API to const
isn't better, it just moves where you have to cast away const.

            regards, tom lane


Re: Large writable variables

From
Robert Haas
Date:
On Mon, Oct 15, 2018 at 4:08 PM Andres Freund <andres@anarazel.de> wrote:
> So we have 500kb of not-initialized memory mapped into every
> process. That's, uh, not nothing.

Thinking about this a bit more, why is this bad?  I mean, if the
memory is never touched, the OS does not really need to allocate or
zero any pages, or even make any page table entries.  If somebody
actually accesses the data, then we'll take a page fault and have to
really allocate, but otherwise I would think we could have 50MB of
unused bss floating around and it wouldn't really matter, let alone
500kB.

What am I missing?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Large writable variables

From
Andres Freund
Date:
Hi,

On 2018-10-15 13:07:54 -0700, Andres Freund wrote:
> Hi,
>
> while briefly thinking about the per-process overhead of postgres, I
> looked at the section sizes of a modern postgres. In particular which
> memory areas are *not* shared between processes.
>
> If you look at the section sizes that are mapped read-write:
>
> $ size --format=sysv src/backend/postgres
> src/backend/postgres  :
> section                  size      addr
> .rodata               1593617   5730304 (read-only, for reference)
> .data.rel.ro           134944   8039904 (read only after start)
> .data                   53888   8178912 (read-write, initialized)
> .bss                   510416   8232800 (read-write, uninitialized)
> Total                52417197

We improve a fair bit over the last two days:

   text       data        bss        dec        hex    filename
8030289     191888     227024    8449201     80ecb1    src/backend/postgres
8029097     192880     510416    8732393     853ee9    src/backend/postgres.before

breakdown:
section                  size      addr
.rodata               1594577   5730304 (read-only, for reference)
.data.rel.ro           136928   8037920 (read only after start)
.data                   50912   8178912 (read-write, initialized)
.bss                   227024   8229824 (read-write, uninitialized)

As to be expected .rodata and .data.rel.ro gained a bit in size (as they
contain constant data and we made more vars constant), whereas .data
shrunk a bit, and .bss shrunk drastically by moving things to be
dynamically allocated.

Nice progress, thanks everyone!

The top .bss entries (uninitialized modifyable vars) are now:
.bss 0000000000020020 hist_entries
.bss 0000000000005b90 tzdefrules_s
.bss 0000000000004000 hist_start
.bss 0000000000002000 PqRecvBuffer
.bss 0000000000001410 BackendWritebackContext
.bss 0000000000000c80 held_lwlocks
.bss 0000000000000b00 re_array
.bss 0000000000000800 buffer_lists
.bss 0000000000000600 syscache_callback_list
.bss 0000000000000560 reverse_dispatch_table
.bss 0000000000000400 tzdir.7228
.bss 0000000000000400 pkglib_path
.bss 0000000000000400 OutputFileName
.bss 0000000000000400 my_exec_path
.bss 0000000000000400 ExtraOptions
.bss 0000000000000398 errordata
.bss 0000000000000320 seq_scan_tables

The top .data entries (initialized modifyable vars) are now:
.data 0000000000004380 ConfigureNamesInt
.data 0000000000003570 ConfigureNamesBool
.data 0000000000002140 ConfigureNamesString
.data 00000000000010e0 ConfigureNamesEnum
.data 0000000000000d20 ConfigureNamesReal
.data 0000000000000420 intRelOpts
.data 00000000000001c0 realRelOpts
.data 0000000000000118 boolRelOpts
.data 00000000000000a8 stringRelOpts
.data 0000000000000068 SnapshotSelfData
.data 0000000000000068 SnapshotAnyData
.data 0000000000000068 SecondarySnapshotData
.data 0000000000000068 CurrentSnapshotData
.data 0000000000000068 CatalogSnapshotData


- Andres


Re: Large writable variables

From
Andres Freund
Date:
Hi,

On 2018-10-16 15:13:43 -0400, Robert Haas wrote:
> On Mon, Oct 15, 2018 at 4:08 PM Andres Freund <andres@anarazel.de> wrote:
> > So we have 500kb of not-initialized memory mapped into every
> > process. That's, uh, not nothing.
> 
> Thinking about this a bit more, why is this bad?  I mean, if the
> memory is never touched, the OS does not really need to allocate or
> zero any pages, or even make any page table entries.  If somebody
> actually accesses the data, then we'll take a page fault and have to
> really allocate, but otherwise I would think we could have 50MB of
> unused bss floating around and it wouldn't really matter, let alone
> 500kB.
> 
> What am I missing?

For one the OS will actually reserve all that memory when you use memory
overcommit = 2 (which you should). There's no need to reserve memory for
data that's guaranteed to be shared.  It also reduces the frequency of
page faults when looking at other global variables laid out nearby - we
could do something about that by giving the linker placement
instructions, but that's also work.  The size of the pagetable also has
performance effects, although it'll often be dwarfed by the shared
buffers entry if not using huge pages.  WRT .data entries, marking them
as const where appropriate both reduces memory usage, possibly allows
more compiler optimizations, and allows to catch bugs.

There's also the fact that not every OS has COW pagetables, or we use
processes in a way that don't allow that (say windows, where we don't
fork).

Greetings,

Andres Freund


Re: Large writable variables

From
Andres Freund
Date:
On 2018-10-16 10:16:33 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2018-10-16 01:59:00 -0400, Tom Lane wrote:
> >> Also, I noticed that the biggest part of those structs are arrays of
> >> FormatNode, which has been designed with complete lack of thought about
> >> size or padding issues.  We can very easily cut it in half on 64-bit
> >> machines.
> 
> > Heh, neat. I feel like we've paid very little attention to that in a
> > myriad of places :(
> 
> Most of the time, we probably *shouldn't* pay attention to it; logical
> field ordering is worth a good deal IMO.

Sure. But there's plenty structs which we allocate a bunch off, that are
frequently accessed, where a lot of space is wasted to padding.  I agree
that we don't need to contort many structs, but there's plenty where we
should.   Often enough it's possible to reorder without making things
make meaningfully less sense.

> But in a case like this,
> where there are large arrays of the things and it's not very painful
> to avoid padding waste, it's worth the trouble.

Attached is a patch that shrinks fmgr_builtins by 25%. That seems
worthwhile, it's pretty frequently accessed, making it more dense is
helpful.  Unless somebody protests soon, I'm going to apply that...

Greetings,

Andres Freund

Attachment

Re: Large writable variables

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> Attached is a patch that shrinks fmgr_builtins by 25%. That seems
> worthwhile, it's pretty frequently accessed, making it more dense is
> helpful.  Unless somebody protests soon, I'm going to apply that...

Hah.  I'm pretty sure that struct *was* set up with an eye to padding ...
on 32-bit machines.  This does make it shorter on 64-bit, but also
makes the size not a power of 2, which might add a few cycles to
array indexing calculations.  Might be worth checking whether that's
going to be an issue anywhere.

What's the point of the extra const decoration on funcName?  ISTM
the whole struct should be const, or not.

            regards, tom lane


Re: Large writable variables

From
Gavin Flower
Date:
On 17/10/2018 09:36, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
>> Attached is a patch that shrinks fmgr_builtins by 25%. That seems
>> worthwhile, it's pretty frequently accessed, making it more dense is
>> helpful.  Unless somebody protests soon, I'm going to apply that...
> Hah.  I'm pretty sure that struct *was* set up with an eye to padding ...
> on 32-bit machines.  This does make it shorter on 64-bit, but also
> makes the size not a power of 2, which might add a few cycles to
> array indexing calculations.  Might be worth checking whether that's
> going to be an issue anywhere.
>
> What's the point of the extra const decoration on funcName?  ISTM
> the whole struct should be const, or not.
>
>             regards, tom lane
>
Would it be useful to add dummy variable(s) to bring it up to a power of 2?


Cheers,
Gavin



Re: Large writable variables

From
Andres Freund
Date:
Hi,

On 2018-10-17 09:38:18 +1300, Gavin Flower wrote:
> On 17/10/2018 09:36, Tom Lane wrote:
> > Andres Freund <andres@anarazel.de> writes:
> > > Attached is a patch that shrinks fmgr_builtins by 25%. That seems
> > > worthwhile, it's pretty frequently accessed, making it more dense is
> > > helpful.  Unless somebody protests soon, I'm going to apply that...
> > Hah.  I'm pretty sure that struct *was* set up with an eye to padding ...
> > on 32-bit machines.  This does make it shorter on 64-bit, but also
> > makes the size not a power of 2, which might add a few cycles to
> > array indexing calculations.  Might be worth checking whether that's
> > going to be an issue anywhere.
> > 
> > What's the point of the extra const decoration on funcName?  ISTM
> > the whole struct should be const, or not.

> Would it be useful to add dummy variable(s) to bring it up to a power of 2?

Err. Reread what we're talking about. The point was to reduce the size,
it's a power of two right now (32).  We could of course also just do
nothing (re-add a dummy variable), which would, drumroll, do nothing.

Greetings,

Andres Freund


Re: Large writable variables

From
Andres Freund
Date:
On 2018-10-16 16:36:12 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > Attached is a patch that shrinks fmgr_builtins by 25%. That seems
> > worthwhile, it's pretty frequently accessed, making it more dense is
> > helpful.  Unless somebody protests soon, I'm going to apply that...
> 
> Hah.  I'm pretty sure that struct *was* set up with an eye to padding ...
> on 32-bit machines.

Possible, the new layout should work just as well there, luckily.


> This does make it shorter on 64-bit, but also
> makes the size not a power of 2, which might add a few cycles to
> array indexing calculations.  Might be worth checking whether that's
> going to be an issue anywhere.

I can't imagine that that outweight the cost of additional cache misses
on any platform where performance matters.  On x86 I assume indexing
into an array with 24byte stride, will normally be just two leas (lea
eax, [eax + eax * 2]; lea eax, [ebx + eax * 8]; where eax initially is
the index, and ebx the array base).  Indexing also plays less of a role
than in the past, because previously we did a binary search, but now we
normally look up the index via fmgr_builtin_oid_index.


> What's the point of the extra const decoration on funcName?  ISTM
> the whole struct should be const, or not.

The whole array is const.  There's cases where that allows the compiler
more freedom - but I guess it's really a bit redundant here.

Greetings,

Andres Freund


Re: Large writable variables

From
Tom Lane
Date:
I wrote:
> 1. tzdefrules_s is filled in the postmaster, and if it's not filled there
> it'd be filled by every child process, so there's zero point in converting
> it to malloc-on-the-fly style.  This is because tzparse() always wants
> it filled.  That's actually a tad annoying, because it looks to me like
> with many timezone settings the data will not get used, but I'm hesitant
> to mess with the IANA code's logic enough to improve that.  Maybe I'll try
> submitting an upstream patch and see what they think of it first.

I spent some time studying this, and realized that the reason that they
always try to load the TZDEFRULES zone data is that they want to absorb
whatever leap-seconds data it has.  We don't want that and would be glad
to just drop said data, but I'm sure that's intentional on their end,
so they are not going to be interested in a patch here.

However ... if you assume that any leap-seconds data in that zone can
be ignored, then the only case where the data need be loaded is when
parsing a POSIX-style zone name that has a DST component but does not
specify a POSIX-style transition date rule.  That's possible in our code
but I do not think it is a mainstream case; in particular, that seems
never to be reached when loading a real zone data file.

Hence, the attached patch postpones the TZDEFRULES load until we find
we actually need it, and ignores any leap-second data therein, and
incidentally makes the data space be malloc-on-demand rather than static.

This is actually less of a hack compared to the upstream code than what
we were doing before, so I feel pretty pleased with it.

            regards, tom lane

diff --git a/src/timezone/localtime.c b/src/timezone/localtime.c
index a2260e5..e3029d8 100644
*** a/src/timezone/localtime.c
--- b/src/timezone/localtime.c
*************** static const char gmt[] = "GMT";
*** 54,60 ****
   * PG: We cache the result of trying to load the TZDEFRULES zone here.
   * tzdefrules_loaded is 0 if not tried yet, +1 if good, -1 if failed.
   */
! static struct state tzdefrules_s;
  static int    tzdefrules_loaded = 0;

  /*
--- 54,60 ----
   * PG: We cache the result of trying to load the TZDEFRULES zone here.
   * tzdefrules_loaded is 0 if not tried yet, +1 if good, -1 if failed.
   */
! static struct state *tzdefrules_s = NULL;
  static int    tzdefrules_loaded = 0;

  /*
*************** tzparse(const char *name, struct state *
*** 908,927 ****
      stdname = name;
      if (lastditch)
      {
!         /*
!          * This is intentionally somewhat different from the IANA code.  We do
!          * not want to invoke tzload() in the lastditch case: we can't assume
!          * pg_open_tzfile() is sane yet, and we don't care about leap seconds
!          * anyway.
!          */
          stdlen = strlen(name);    /* length of standard zone name */
          name += stdlen;
-         if (stdlen >= sizeof sp->chars)
-             stdlen = (sizeof sp->chars) - 1;
-         charcnt = stdlen + 1;
          stdoffset = 0;
-         sp->goback = sp->goahead = false;    /* simulate failed tzload() */
-         load_ok = false;
      }
      else
      {
--- 908,917 ----
      stdname = name;
      if (lastditch)
      {
!         /* Unlike IANA, don't assume name is exactly "GMT" */
          stdlen = strlen(name);    /* length of standard zone name */
          name += stdlen;
          stdoffset = 0;
      }
      else
      {
*************** tzparse(const char *name, struct state *
*** 945,971 ****
          name = getoffset(name, &stdoffset);
          if (name == NULL)
              return false;
-         charcnt = stdlen + 1;
-         if (sizeof sp->chars < charcnt)
-             return false;
-
-         /*
-          * This bit also differs from the IANA code, which doesn't make any
-          * attempt to avoid repetitive loadings of the TZDEFRULES zone.
-          */
-         if (tzdefrules_loaded == 0)
-         {
-             if (tzload(TZDEFRULES, NULL, &tzdefrules_s, false) == 0)
-                 tzdefrules_loaded = 1;
-             else
-                 tzdefrules_loaded = -1;
-         }
-         load_ok = (tzdefrules_loaded > 0);
-         if (load_ok)
-             memcpy(sp, &tzdefrules_s, sizeof(struct state));
      }
!     if (!load_ok)
!         sp->leapcnt = 0;        /* so, we're off a little */
      if (*name != '\0')
      {
          if (*name == '<')
--- 935,957 ----
          name = getoffset(name, &stdoffset);
          if (name == NULL)
              return false;
      }
!     charcnt = stdlen + 1;
!     if (sizeof sp->chars < charcnt)
!         return false;
!
!     /*
!      * The IANA code always tries tzload(TZDEFRULES) here.  We do not want to
!      * do that; it would be bad news in the lastditch case, where we can't
!      * assume pg_open_tzfile() is sane yet.  Moreover, the only reason to do
!      * it unconditionally is to absorb the TZDEFRULES zone's leap second info,
!      * which we don't want to do anyway.  Without that, we only need to load
!      * TZDEFRULES if the zone name specifies DST but doesn't incorporate a
!      * POSIX-style transition date rule, which is not a common case.
!      */
!     sp->goback = sp->goahead = false;    /* simulate failed tzload() */
!     sp->leapcnt = 0;            /* intentionally assume no leap seconds */
!
      if (*name != '\0')
      {
          if (*name == '<')
*************** tzparse(const char *name, struct state *
*** 996,1003 ****
          }
          else
              dstoffset = stdoffset - SECSPERHOUR;
!         if (*name == '\0' && !load_ok)
!             name = TZDEFRULESTRING;
          if (*name == ',' || *name == ';')
          {
              struct rule start;
--- 982,1019 ----
          }
          else
              dstoffset = stdoffset - SECSPERHOUR;
!         if (*name == '\0')
!         {
!             /*
!              * The POSIX zone name does not provide a transition-date rule.
!              * Here we must load the TZDEFRULES zone, if possible, to serve as
!              * source data for the transition dates.  Unlike the IANA code, we
!              * try to cache the data so it's only loaded once.
!              */
!             if (tzdefrules_loaded == 0)
!             {
!                 /* Allocate on first use */
!                 if (tzdefrules_s == NULL)
!                     tzdefrules_s = (struct state *) malloc(sizeof(struct state));
!                 if (tzdefrules_s != NULL)
!                 {
!                     if (tzload(TZDEFRULES, NULL, tzdefrules_s, false) == 0)
!                         tzdefrules_loaded = 1;
!                     else
!                         tzdefrules_loaded = -1;
!                     /* In any case, we ignore leap-second data from the file */
!                     tzdefrules_s->leapcnt = 0;
!                 }
!             }
!             load_ok = (tzdefrules_loaded > 0);
!             if (load_ok)
!                 memcpy(sp, tzdefrules_s, sizeof(struct state));
!             else
!             {
!                 /* If we can't load TZDEFRULES, fall back to hard-wired rule */
!                 name = TZDEFRULESTRING;
!             }
!         }
          if (*name == ',' || *name == ';')
          {
              struct rule start;

Re: Large writable variables

From
Tom Lane
Date:
BTW, I looked around for .o files with large BSS numbers, and came across

$ size src/interfaces/ecpg/ecpglib/prepare.o
   text    data     bss     dec     hex filename
   4023       4 1048576 1052603  100fbb src/interfaces/ecpg/ecpglib/prepare.o

That megabyte is from a statically allocated statement cache array.
Seems a bit unfriendly to users of ecpglib, given that many apps
would never use the statement cache (AFAICT you have to explicitly
ask for auto-prepare mode to get to that code).

Doesn't look hard to fix though.

            regards, tom lane


Re: Large writable variables

From
Andres Freund
Date:
On 2018-10-16 19:48:42 -0400, Tom Lane wrote:
> BTW, I looked around for .o files with large BSS numbers, and came across
> 
> $ size src/interfaces/ecpg/ecpglib/prepare.o
>    text    data     bss     dec     hex filename
>    4023       4 1048576 1052603  100fbb src/interfaces/ecpg/ecpglib/prepare.o

Btw, I think one can fairly argue that large .data vars have a
higher cost than .bss. That's not to say we shouldn't fix .bss ones ;)

I've this awful shell script to figure out the sizes:

for file in $(find . -type f -name '*.o'); do objdump -j .data -t $file|tail -n +5|grep -v '^$'|awk -v "file=$file"
'{print$4, $5, $6, file}';done|sort -nr|less
 

.data 0000000000001180 datetktbl ./src/interfaces/ecpg/pgtypeslib/dt_common.o
.data 0000000000000c28 ibmkanji ./src/backend/utils/mb/conversion_procs/euc_jp_and_sjis/euc_jp_and_sjis.o
.data 00000000000003f0 deltatktbl ./src/interfaces/ecpg/pgtypeslib/dt_common.o
.data 0000000000000100 sqlca_init ./src/interfaces/ecpg/ecpglib/misc.o
.data 0000000000000100 sqlca_init ./src/interfaces/ecpg/compatlib/informix.o
.data 0000000000000068 day_tab ./src/interfaces/ecpg/pgtypeslib/dt_common.o
.data 0000000000000030 ecpg_query ./src/interfaces/ecpg/preproc/preproc.o
.data 0000000000000030 ecpg_no_indicator ./src/interfaces/ecpg/preproc/preproc.o
.data 0000000000000030 descriptor_type.5770 ./src/interfaces/ecpg/preproc/descriptor.o
.data 000000000000002b ssl_nomem ./src/interfaces/libpq/fe-secure-openssl.o
.data 000000000000002a PLy_subtransaction_doc ./src/pl/plpython/plpy_subxactobject.o
.data 0000000000000023 PLy_cursor_doc ./src/pl/plpython/plpy_cursorobject.o
.data 000000000000001e PLy_result_doc ./src/pl/plpython/plpy_resultobject.o
.data 0000000000000018 PLy_plan_doc ./src/pl/plpython/plpy_planobject.o
.data 0000000000000010 base_yylloc ./src/interfaces/ecpg/preproc/preproc.o

Should also work for other sections.


Hm, there's a lot of ecpg in this :(

> That megabyte is from a statically allocated statement cache array.
> Seems a bit unfriendly to users of ecpglib, given that many apps
> would never use the statement cache (AFAICT you have to explicitly
> ask for auto-prepare mode to get to that code).
> 
> Doesn't look hard to fix though.

Thanks for tackling that.

Greetings,

Andres Freund


Re: Large writable variables

From
Peter Eisentraut
Date:
On 16/10/2018 08:30, Andres Freund wrote:
> This just reminded me that a couple times I wanted a cast that casts
> away const, but otherwise makes sure the type stays the same. I don't
> think there's a way to do that in C, but we can write one that verifies
> the cast doesn't do something bad if gcc is used:
> 
> #if defined(HAVE__BUILTIN_TYPES_COMPATIBLE_P)
> #define unconstify(cst, var) StaticAssertExpr(__builtin_types_compatible_p(__typeof(var), const cst), "wrong cast"),
(cst)(var)
 
> #else
> #define unconstify(cst, var) ((cst) (var))
> #endif
> 
> Does anybody besides me see value in adding a cleaned up version of
> that?

I've had the attached patch lying around.  As you can see there, there
are a few places where it could be useful, but not too many.

The name CONST_CAST() is adapted from C++.

Your version with __builtin_types_compatible_p() doesn't work for
casting between char * and const char *, so it wouldn't be very useful.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: Large writable variables

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> .data 0000000000001180 datetktbl ./src/interfaces/ecpg/pgtypeslib/dt_common.o
> .data 0000000000000c28 ibmkanji ./src/backend/utils/mb/conversion_procs/euc_jp_and_sjis/euc_jp_and_sjis.o
> .data 00000000000003f0 deltatktbl ./src/interfaces/ecpg/pgtypeslib/dt_common.o

Hmm.  I think these can just be marked const, will investigate.

The rest of those look to be mostly below the threshold of pain ...

            regards, tom lane


Re: Large writable variables

From
Tom Lane
Date:
I wrote:
> Andres Freund <andres@anarazel.de> writes:
>> .data 0000000000001180 datetktbl ./src/interfaces/ecpg/pgtypeslib/dt_common.o
>> .data 0000000000000c28 ibmkanji ./src/backend/utils/mb/conversion_procs/euc_jp_and_sjis/euc_jp_and_sjis.o
>> .data 00000000000003f0 deltatktbl ./src/interfaces/ecpg/pgtypeslib/dt_common.o

> Hmm.  I think these can just be marked const, will investigate.

I pushed fixes for these, but curiously, the ibmkanji change did not
make any difference to section sizes --- AFAICT, my toolchain already
figured out that it could treat that table as const.  The dt_common.c
changes are a win though.

            regards, tom lane


Re: Large writable variables

From
Andres Freund
Date:
On 2018-10-17 21:06:13 +0200, Peter Eisentraut wrote:
> On 16/10/2018 08:30, Andres Freund wrote:
> > This just reminded me that a couple times I wanted a cast that casts
> > away const, but otherwise makes sure the type stays the same. I don't
> > think there's a way to do that in C, but we can write one that verifies
> > the cast doesn't do something bad if gcc is used:
> > 
> > #if defined(HAVE__BUILTIN_TYPES_COMPATIBLE_P)
> > #define unconstify(cst, var) StaticAssertExpr(__builtin_types_compatible_p(__typeof(var), const cst), "wrong
cast"),(cst) (var)
 
> > #else
> > #define unconstify(cst, var) ((cst) (var))
> > #endif
> > 
> > Does anybody besides me see value in adding a cleaned up version of
> > that?
> 
> I've had the attached patch lying around.

Heh.


> As you can see there, there are a few places where it could be useful,
> but not too many.

Yea.

> The name CONST_CAST() is adapted from C++.
> 
> Your version with __builtin_types_compatible_p() doesn't work for
> casting between char * and const char *, so it wouldn't be very useful.

Huh, why wouldn't it work for char *? Seems to work fine for me?


> +/*
> + * Safer casting -- use this for casting away const or volatile.  It ensures
> + * that the source and target types are otherwise compatible.
> + */
> +#define CONST_CAST(t, x) ((void)((t)0 == (x)), (t)(x))
> +

Has the advantage that it probably works for globals, but OTOH, it only
works correctly for pointers, and it doesn't reliably trigger warnigns /
errors.

Greetings,

Andres Freund


Re: Large writable variables

From
Peter Eisentraut
Date:
On 17/10/2018 22:02, Andres Freund wrote:
>> Your version with __builtin_types_compatible_p() doesn't work for
>> casting between char * and const char *, so it wouldn't be very useful.
> Huh, why wouldn't it work for char *? Seems to work fine for me?

__builtin_types_compatible_p(const char *, char *) returns false (0) for me.


-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Large writable variables

From
Andres Freund
Date:
Hi,

On 2018-10-17 23:43:31 +0200, Peter Eisentraut wrote:
> On 17/10/2018 22:02, Andres Freund wrote:
> >> Your version with __builtin_types_compatible_p() doesn't work for
> >> casting between char * and const char *, so it wouldn't be very useful.
> > Huh, why wouldn't it work for char *? Seems to work fine for me?
> 
> __builtin_types_compatible_p(const char *, char *) returns false (0) for me.

Right, that's why I added a const, inside the macro,  to the type
specified in the unconstify argument. So typeof() yields a const char *,
and the return type is specified as char *, and adding a const in the
argument also yields a const char *.

I'd merged my version since, I don't feel particularly attached to it
though...

Greetings,

Andres Freund


Re: Large writable variables

From
Peter Eisentraut
Date:
On 17/10/2018 23:51, Andres Freund wrote:
>> __builtin_types_compatible_p(const char *, char *) returns false (0) for me.
> 
> Right, that's why I added a const, inside the macro,  to the type
> specified in the unconstify argument. So typeof() yields a const char *,
> and the return type is specified as char *, and adding a const in the
> argument also yields a const char *.

Yeah, that works.  The C++-inspired version also allowed casting from
not-const to const, which we don't really need.

I'd perhaps change the signature

#define unconstify(underlying_type, var)

because the "var" doesn't actually have to be a variable.

Attached is my previous patch adapted to your macro.

I'm tempted to get rid of the stuff in dfmgr.c and just let the "older
platforms" get the warning.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: Large writable variables

From
Peter Eisentraut
Date:
On 18/10/2018 22:17, Peter Eisentraut wrote:
> On 17/10/2018 23:51, Andres Freund wrote:
>>> __builtin_types_compatible_p(const char *, char *) returns false (0) for me.
>>
>> Right, that's why I added a const, inside the macro,  to the type
>> specified in the unconstify argument. So typeof() yields a const char *,
>> and the return type is specified as char *, and adding a const in the
>> argument also yields a const char *.
> 
> Yeah, that works.  The C++-inspired version also allowed casting from
> not-const to const, which we don't really need.

> Attached is my previous patch adapted to your macro.

Oh, I forgot to mention, your version doesn't work for this code in
pqexpbuffer.c:

    str->data = (char *) oom_buffer;

That's probably not a big deal though.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Large writable variables

From
Andres Freund
Date:
On 2018-10-18 22:20:29 +0200, Peter Eisentraut wrote:
> On 18/10/2018 22:17, Peter Eisentraut wrote:
> > On 17/10/2018 23:51, Andres Freund wrote:
> >>> __builtin_types_compatible_p(const char *, char *) returns false (0) for me.
> >>
> >> Right, that's why I added a const, inside the macro,  to the type
> >> specified in the unconstify argument. So typeof() yields a const char *,
> >> and the return type is specified as char *, and adding a const in the
> >> argument also yields a const char *.
> >
> > Yeah, that works.  The C++-inspired version also allowed casting from
> > not-const to const, which we don't really need.
>
> > Attached is my previous patch adapted to your macro.
>
> Oh, I forgot to mention, your version doesn't work for this code in
> pqexpbuffer.c:
>
>     str->data = (char *) oom_buffer;

That kind of seems correct ;). Can easily be done via unconstify(char *,
&oom_buffer[0]), if necessary.

> That's probably not a big deal though.

Greetings,

Andres Freund


Re: Large writable variables

From
Andres Freund
Date:
On 2018-10-18 22:17:55 +0200, Peter Eisentraut wrote:
> I'd perhaps change the signature
> 
> #define unconstify(underlying_type, var)
> 
> because the "var" doesn't actually have to be a variable.

Hm, so expr, or what would you use?

> Attached is my previous patch adapted to your macro.
> 
> I'm tempted to get rid of the stuff in dfmgr.c and just let the "older
> platforms" get the warning.

Yea, that works for me.


Are you planning to apply this?

Greetings,

Andres Freund


Re: Large writable variables

From
Peter Eisentraut
Date:
On 18/10/2018 21:31, Andres Freund wrote:
> On 2018-10-18 22:17:55 +0200, Peter Eisentraut wrote:
>> I'd perhaps change the signature
>>
>> #define unconstify(underlying_type, var)
>>
>> because the "var" doesn't actually have to be a variable.
> 
> Hm, so expr, or what would you use?

done

>> Attached is my previous patch adapted to your macro.
>>
>> I'm tempted to get rid of the stuff in dfmgr.c and just let the "older
>> platforms" get the warning.

and done

> Yea, that works for me.
> 
> Are you planning to apply this?

and done

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services