Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. - Mailing list pgsql-hackers
From: Peter Geoghegan
Subject: Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
Date:
Msg-id: CAH2-WzkejfjFohkYyaqGG8N7FXj+UV3_HjjsJqKiMi1MWfAtow@mail.gmail.com
In response to: Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. (Anastasia Lubennikova <a.lubennikova@postgrespro.ru>)
Responses: Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
List: pgsql-hackers
On Fri, Jul 19, 2019 at 10:53 AM Anastasia Lubennikova
<a.lubennikova@postgrespro.ru> wrote:
> Patch 0002 (must be applied on top of 0001) implements preserving of
> correct TID order inside posting list when inserting new tuples.
> This version passes all regression tests including amcheck test.
> I also used following script to test insertion into the posting list:

Nice!

> I suppose it is not the final version of the patch yet,
> so I left some debug messages and TODO comments to ease review.

I'm fine with leaving them in. I have sometimes distributed a separate
patch with debug messages, but now that I think about it, that probably
wasn't a good use of time.

You will probably want to remove at least some of the debug messages
during performance testing. I'm thinking of code that appears in very
tight inner loops, such as the _bt_compare() code.

> Please, in your review, pay particular attention to usage of
> BTreeTupleGetHeapTID.
> For posting tuples it returns the first tid from posting list like
> BTreeTupleGetMinTID, but maybe some callers are not ready for that
> and want BTreeTupleGetMaxTID instead.
> Incorrect usage of these macros may cause some subtle bugs,
> which are probably not covered by tests. So, please double-check it.

One testing strategy that I plan to use for the patch is to deliberately
corrupt a compressed index in a subtle way using pg_hexedit, and then see
if amcheck detects the problem. For example, I may swap the order of two
TIDs in the middle of a posting list, which is something that is unlikely
to produce wrong answers to queries, and won't even be detected by the
"heapallindexed" check, but is still wrong. If we can detect very subtle,
adversarial corruption like this, then we can detect any real-world
problem.

Once we have confidence in amcheck's ability to detect problems with
posting lists in general, we can use it in many different contexts without
much thought. For example, we'll probably need to do long running
benchmarks to validate the performance of the patch. It's easy to add
amcheck testing at the end of each run. Every benchmark is now also a
correctness/stress test, for free.
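Roughly speaking, the posting list invariant that amcheck would need to
verify here is that heap TIDs appear in strictly ascending order, so that
any swapped (or duplicated) pair of TIDs is reported. A minimal sketch of
such a check follows -- only a sketch, and the BTreeTupleIsPosting() /
BTreeTupleGetNPosting() / BTreeTupleGetPostingN() accessors are modeled on
the patch's naming and may not match what it actually provides:

    #include "postgres.h"

    #include "access/nbtree.h"
    #include "storage/itemptr.h"

    /*
     * Sketch: heap TIDs within a posting list tuple must be in strictly
     * ascending order, so comparing each adjacent pair is enough to catch
     * a swapped or duplicated TID.
     */
    static void
    check_posting_list_order(IndexTuple itup)
    {
        if (!BTreeTupleIsPosting(itup))
            return;             /* plain tuples have nothing to check */

        for (int i = 1; i < BTreeTupleGetNPosting(itup); i++)
        {
            ItemPointer prev = BTreeTupleGetPostingN(itup, i - 1);
            ItemPointer curr = BTreeTupleGetPostingN(itup, i);

            if (ItemPointerCompare(prev, curr) >= 0)
                elog(ERROR, "posting list TIDs out of order or duplicated");
        }
    }

With a check along those lines in place, a pg_hexedit-swapped pair of TIDs
should be reported immediately, even though index scans would most likely
still return correct answers.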
> Next week I'm going to check performance and try to find specific
> scenarios where this feature can lead to degradation and measure it,
> to understand if we need to make this deduplication optional.

Sounds good, though I think it might be a bit too early to decide whether
or not it needs to be enabled by default. For one thing, the approach to
WAL-logging within _bt_compress_one_page() is probably fairly inefficient,
which may be a problem for certain workloads. It's okay to leave it that
way for now, because it is not relevant to the core design of the patch.
I'm sure that _bt_compress_one_page() can be carefully optimized when the
time comes.

My current focus is not on the raw performance itself. For now, I am
focussed on making sure that the compression works well, and that the
resulting indexes "look nice" in general.

FWIW, the first few versions of my v12 work on nbtree didn't actually make
*anything* go faster. It took a couple of months to fix the more important
regressions, and a few more months to fix all of them. I think that the
work on this patch may develop in a similar way. I am willing to accept
regressions in the unoptimized code during development because it seems
likely that you have the right idea about the data structure itself, which
is the one thing that I *really* care about. Once you get that right, the
remaining problems are very likely to either be fixable with further work
on optimizing specific code, or a price that users will mostly be happy to
pay to get the benefits.

--
Peter Geoghegan