Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan - Mailing list pgsql-hackers
From | Alena Rybakina |
---|---|
Subject | Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan |
Date | |
Msg-id | c8bb964c-b1ad-534e-79a8-b617577cc555@yandex.ru Whole thread Raw |
In response to | Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan (Peter Geoghegan <pg@bowt.ie>) |
Responses |
Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
|
List | pgsql-hackers |
Hi, all! > CNF -> DNF conversion > ===================== > > Like many great papers, the MDAM paper takes one core idea, and finds > ways to leverage it to the hilt. Here the core idea is to take > predicates in conjunctive normal form (an "AND of ORs"), and convert > them into disjunctive normal form (an "OR of ANDs"). DNF quals are > logically equivalent to CNF quals, but ideally suited to SAOP-array > style processing by an ordered B-Tree index scan -- they reduce > everything to a series of non-overlapping primitive index scans, that > can be processed in keyspace order. We already do this today in the > case of SAOPs, in effect. The nbtree "next array keys" state machine > already materializes values that can be seen as MDAM style DNF single > value predicates. The state machine works by outputting the cartesian > product of each array as a multi-column index is scanned, but that > could be taken a lot further in the future. We can use essentially the > same kind of state machine to do everything described in the paper -- > ultimately, it just needs to output a list of disjuncts, like the DNF > clauses that the paper shows in "Table 3". > > In theory, anything can be supported via a sufficiently complete CNF > -> DNF conversion framework. There will likely always be the potential > for unsafe/unsupported clauses and/or types in an extensible system > like Postgres, though. So we will probably need to retain some notion > of safety. It seems like more of a job for nbtree preprocessing (or > some suitably index-AM-agnostic version of the same idea) than the > optimizer, in any case. But that's not entirely true, either (that > would be far too easy). > > The optimizer still needs to optimize. It can't very well do that > without having some kind of advanced notice of what is and is not > supported by the index AM. And, the index AM cannot just unilaterally > decide that index quals actually should be treated as filter/qpquals, > after all -- it doesn't get a veto. So there is a mutual dependency > that needs to be resolved. I suspect that there needs to be a two way > conversation between the optimizer and nbtree code to break the > dependency -- a callback that does some of the preprocessing work > during planning. Tom said something along the same lines in passing, > when discussing the MDAM paper last year [2]. Much work remains here. > Honestly, I'm just reading and delving into this thread and other topics related to it, so excuse me if I ask you a few obvious questions. I noticed that you are going to add CNF->DNF transformation at the index construction stage. If I understand correctly, you will rewrite restrictinfo node, change boolean "AND" expressions to "OR" expressions, but would it be possible to apply such a procedure earlier? Otherwise I suppose you could face the problem of incorrect selectivity of the calculation and, consequently, the cardinality calculation? I can't clearly understand at what stage it is clear that the such a transformation needs to be applied? -- Regards, Alena Rybakina Postgres Professional
pgsql-hackers by date: