Re: Using per-transaction memory contexts for storing decoded tuples - Mailing list pgsql-hackers
| From | Masahiko Sawada |
|---|---|
| Subject | Re: Using per-transaction memory contexts for storing decoded tuples |
| Date | |
| Msg-id | CAD21AoDaO1txkgic+uE6u2_SDt=BxL9a_5=7-CtADZxKh6g1pw@mail.gmail.com |
| In response to | Re: Using per-transaction memory contexts for storing decoded tuples (Amit Kapila <amit.kapila16@gmail.com>) |
| Responses | Re: Using per-transaction memory contexts for storing decoded tuples |
| List | pgsql-hackers |
On Fri, Sep 27, 2024 at 12:39 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> On Mon, 23 Sept 2024 at 09:59, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Sun, Sep 22, 2024 at 11:27 AM David Rowley <dgrowleyml@gmail.com> wrote:
> > >
> > > On Fri, 20 Sept 2024 at 17:46, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Fri, Sep 20, 2024 at 5:13 AM David Rowley <dgrowleyml@gmail.com> wrote:
> > > > > In general, it's a bit annoying to have to code around this
> > > > > GenerationContext fragmentation issue.
> > > >
> > > > Right, and I am also slightly afraid that this may cause some
> > > > regression in other cases where defrag wouldn't help.
> > >
> > > Yeah, that's certainly a possibility. I was hoping that
> > > MemoryContextMemAllocated() being much larger than logical_work_mem
> > > could only happen when there is fragmentation, but certainly, you
> > > could be wasting effort trying to defrag transactions where the
> > > changes all arrive in WAL consecutively and there is no
> > > defragmentation. It might be some other large transaction that's
> > > causing the context's allocations to be fragmented. I don't have any
> > > good ideas on how to avoid wasting effort on non-problematic
> > > transactions. Maybe there's something that could be done if we knew
> > > the LSN of the first and last change and the gap between the LSNs was
> > > much larger than the WAL space used for this transaction. That would
> > > likely require tracking way more stuff than we do now, however.
> >
> > With more information tracking, we could avoid some non-problematic
> > transactions, but it would still be difficult to predict that we
> > didn't harm many cases, because to make the memory non-contiguous we
> > only need a few interleaving small transactions. We can try to think
> > of ideas for implementing defragmentation in our code if we first can
> > prove that smaller block sizes cause problems.
> >
> > > With the smaller blocks idea, I'm a bit concerned that using smaller
> > > blocks could cause regressions on systems that are better at releasing
> > > memory back to the OS after free(), as no doubt malloc() would often be
> > > slower on those systems. There have been some complaints recently
> > > about glibc being a bit too happy to keep hold of memory after free(),
> > > and I wondered if that was the reason why the small block test does
> > > not cause much of a performance regression. I wonder how the small
> > > block test would look on Mac, FreeBSD or Windows. I think it would be
> > > risky to assume that all is well with reducing the block size after
> > > testing on a single platform.
> >
> > Good point. We need extensive testing on different platforms, as you
> > suggest, to verify whether smaller block sizes cause any regressions.
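A minimal sketch of the heuristic David floats above: only attempt defragmentation when the context looks over-allocated and the transaction's LSN range is much wider than its payload. The function name, the factor of 2, and the use of txn->total_size (the in-memory size) as a proxy for the transaction's WAL footprint are all assumptions for illustration, not code from any posted patch; also note the GUC David calls logical_work_mem is spelled logical_decoding_work_mem in the tree.

#include "postgres.h"

#include "replication/reorderbuffer.h"
#include "utils/memutils.h"

/*
 * Hypothetical helper: guess whether defragmenting this transaction's
 * changes is likely to pay off.  All thresholds here are invented.
 */
static bool
ReorderBufferTXNLikelyFragmented(ReorderBuffer *rb, ReorderBufferTXN *txn)
{
	/*
	 * If the Tuples context holds no more than about twice
	 * logical_decoding_work_mem (a kB-valued GUC), fragmentation cannot
	 * be wasting much memory, so don't bother.
	 */
	if (MemoryContextMemAllocated(rb->tup_context, true) <
		(Size) logical_decoding_work_mem * 1024 * 2)
		return false;

	/*
	 * If the LSN range spanned by this transaction's changes is much
	 * wider than the space the changes themselves occupy, other
	 * transactions were interleaved with it in WAL, so its allocations
	 * are probably scattered across generation blocks.
	 */
	return (txn->final_lsn - txn->first_lsn) > (uint64) txn->total_size * 2;
}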
>
> I did similar tests on Windows. rb_mem_block_size was changed from 8kB
> to 8MB. The table below shows the average and standard deviation of
> five runs for each block size.
>
> ===============================================
> Block size | Average time (ms) | Standard deviation (ms)
> -----------------------------------------------
> 8kB        | 12580.879         | 144.6923467
> 16kB       | 12442.7256        | 94.02799006
> 32kB       | 12370.7292        | 97.7958552
> 64kB       | 11877.4888        | 222.2419142
> 128kB      | 11828.8568        | 129.732941
> 256kB      | 11801.086         | 20.60030913
> 512kB      | 12361.4172        | 65.27390105
> 1MB        | 12343.3732        | 80.84427202
> 2MB        | 12357.675         | 79.40017604
> 4MB        | 12395.8364        | 76.78273689
> 8MB        | 11712.8862        | 50.74323039
> ===============================================
>
> From the results, I think there is a small regression for small block sizes.
>
> I ran the tests in git bash. I have also attached the test script.

Thank you for testing on Windows!

I've run the same benchmark on Mac (Sonoma 14.7, M1 Pro):

8kB: 4852.198 ms
16kB: 4822.733 ms
32kB: 4776.776 ms
64kB: 4851.433 ms
128kB: 4804.821 ms
256kB: 4781.778 ms
512kB: 4776.486 ms
1MB: 4783.456 ms
2MB: 4770.671 ms
4MB: 4785.800 ms
8MB: 4747.447 ms

I can see there is a small regression for small block sizes.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
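For reference, the rb_mem_block_size knob benchmarked above parameterizes the size of the generation blocks that ReorderBufferAllocate() requests for its "Tuples" context, which upstream is hardcoded to SLAB_LARGE_BLOCK_SIZE (8MB). A rough guess at the wiring, assuming a kB-valued GUC and a made-up helper name; the actual patch may differ:

#include "postgres.h"

#include "utils/memutils.h"

/* GUC, in kB; registration via the usual GUC machinery is not shown */
int			rb_mem_block_size = 8 * 1024;	/* default: 8MB blocks */

/*
 * Hypothetical helper: create the reorder buffer's "Tuples" context
 * with generation blocks of rb_mem_block_size kB instead of the
 * hardcoded SLAB_LARGE_BLOCK_SIZE.
 */
static MemoryContext
create_tuples_context(MemoryContext parent)
{
	Size		block_size = (Size) rb_mem_block_size * 1024;

	return GenerationContextCreate(parent,
								   "Tuples",
								   block_size,
								   block_size,
								   block_size);
}

Passing the same value for the minimum, initial, and maximum block size keeps every block the same size, matching how upstream sizes this context, so the benchmarks vary a single size rather than an init/max pair.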