Hi hackers,
I thought it would be better to start a new thread to discuss this.
While working on the sorting patch and reading other threads,
I came up with some ideas to reduce the memory consumed by the aset and generation
memory context modules.
I have run some basic benchmarks, and the patches seem to improve performance.
I think it is really worth pursuing, if it is indeed possible to reduce memory consumption.
Linux Ubuntu 64 bits
work_mem = 64MB
set max_parallel_workers_per_gather = 0;
create table t (a bigint not null, b bigint not null, c bigint not
null, d bigint not null, e bigint not null, f bigint not null);
insert into t select x,x,x,x,x,x from generate_Series(1,140247142) x; -- 10GB!
vacuum freeze t;
select * from t order by a offset 140247142;
HEAD:
postgres=# select * from t order by a offset 140247142;
a | b | c | d | e | f
---+---+---+---+---+---
(0 rows)
work_mem=64MB
Time: 99603,544 ms (01:39,604)
Time: 94000,342 ms (01:34,000)
postgres=# set work_mem="64.2MB";
SET
Time: 0,210 ms
postgres=# select * from t order by a offset 140247142;
a | b | c | d | e | f
---+---+---+---+---+---
(0 rows)
Time: 95306,254 ms (01:35,306)
PATCHED:
postgres=# explain analyze select * from t order by a offset 140247142;
a | b | c | d | e | f
---+---+---+---+---+---
(0 rows)
work_mem=64MB
Time: 90946,482 ms (01:30,946)
postgres=# set work_mem="64.2MB";
SET
Time: 0,210 ms
postgres=# select * from t order by a offset 140247142;
a | b | c | d | e | f
---+---+---+---+---+---
(0 rows)
Time: 91817,533 ms (01:31,818)
There is still room for further improvements, and at this point I need help.
Regarding the patches, we have:
1) 001-aset-reduces-memory-consumption.patch
Reduces the memory used by struct AllocBlockData by 8 bytes,
bringing the total size down to 32 bytes, so that two structs fit in a 64-byte cache line.
Reorders some stores to struct fields so that they follow the declaration order within the structures.
Removes the two elog(ERROR, "could not find block containing chunk %p") checks,
moving them under MEMORY_CONTEXT_CHECKING.
Since version 8.2, nobody has complained about these checks.
But if that is not acceptable, there is option (3), 003-aset-reduces-memory-consumption.patch.
2) 002-generation-reduces-memory-consumption.patch
Reduces the memory used by struct GenerationBlock by 8 bytes,
bringing the total size down to 32 bytes, so that two structs fit in a 64-byte cache line.
Removes all references to the field *block* in struct GenerationChunk,
which makes it possible to remove the field itself (not done yet).
That would bring the final size down to 16 bytes, so that four structs fit in a 64-byte cache line.
Unfortunately, everything works only with the size of 24 bytes; see (4).
Reorders some stores to struct fields so that they follow the declaration order within the structures.
3) 003-aset-reduces-memory-consumption.patch
Same as (1), but without removing the two
elog(ERROR, "could not find block containing chunk %p") checks,
at the cost of removing one tiny part of those checks.
Since version 8.2, nobody has complained about these checks.
4) 004-generation-reduces-memory-consumption-BUG.patch
Same as (2), but with a bug.
It only takes a few tweaks to completely remove the field block:
@@ -117,9 +116,9 @@ struct GenerationChunk
/* this is zero in a free chunk */
Size requested_size;
-#define GENERATIONCHUNK_RAWSIZE (SIZEOF_SIZE_T * 2 + SIZEOF_VOID_P * 2)
+#define GENERATIONCHUNK_RAWSIZE (SIZEOF_SIZE_T * 2 + SIZEOF_VOID_P)
#else
-#define GENERATIONCHUNK_RAWSIZE (SIZEOF_SIZE_T + SIZEOF_VOID_P * 2)
+#define GENERATIONCHUNK_RAWSIZE (SIZEOF_SIZE_T + SIZEOF_VOID_P)
#endif /* MEMORY_CONTEXT_CHECKING */
/* ensure proper alignment by adding padding if needed */
@@ -127,7 +126,6 @@ struct GenerationChunk
char padding[MAXIMUM_ALIGNOF - GENERATIONCHUNK_RAWSIZE % MAXIMUM_ALIGNOF];
#endif
- GenerationBlock *block; /* block owning this chunk */
GenerationContext *context; /* owning context, or NULL if freed chunk */
/* there must not be any padding to reach a MAXALIGN boundary here! */
};
This fails with make check.
I couldn't figure out why it doesn't work with a 16-byte struct GenerationChunk.
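
To make the size arithmetic above concrete, here is a rough sketch. It is not part of the patches. ChunkWithBlock and ChunkWithoutBlock are hypothetical stand-ins that only mimic the field counts of the chunk header before and after dropping the block pointer (without MEMORY_CONTEXT_CHECKING). CACHE_LINE_SIZE, padded_header_size() and headers_per_cache_line() are names made up for this illustration, and the 64-byte cache line is an assumption about the hardware.

```c
#include <assert.h>				/* static_assert (C11) */
#include <stddef.h>				/* size_t */

/* Assumption: 64-byte cache lines, as on most current x86-64 CPUs. */
#define CACHE_LINE_SIZE 64

/* Hypothetical stand-in: header that still carries the block pointer. */
typedef struct ChunkWithBlock
{
	size_t		size;
	void	   *block;
	void	   *context;
} ChunkWithBlock;

/* Hypothetical stand-in: same header with the block pointer removed. */
typedef struct ChunkWithoutBlock
{
	size_t		size;
	void	   *context;
} ChunkWithoutBlock;

/* Dropping one pointer field must make the header smaller. */
static_assert(sizeof(ChunkWithoutBlock) < sizeof(ChunkWithBlock),
			  "removing the block pointer should shrink the header");

/*
 * Round a raw header size up to a multiple of align, the same way the
 * padding[] member in the diff pads up to MAXIMUM_ALIGNOF.
 */
static inline size_t
padded_header_size(size_t raw_size, size_t align)
{
	size_t		rem = raw_size % align;

	return rem == 0 ? raw_size : raw_size + (align - rem);
}

/* How many whole headers of the given size fit in one cache line. */
static inline size_t
headers_per_cache_line(size_t header_size)
{
	return CACHE_LINE_SIZE / header_size;
}
```

On a typical 64-bit build, where size_t and pointers are 8 bytes each, sizeof(ChunkWithBlock) is 24 and sizeof(ChunkWithoutBlock) is 16, so headers_per_cache_line() returns 2 and 4 respectively. Neither size needs padding with an 8-byte MAXIMUM_ALIGNOF.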