Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan - Mailing list pgsql-hackers

From Alena Rybakina
Subject Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
Date
Msg-id c8bb964c-b1ad-534e-79a8-b617577cc555@yandex.ru
Whole thread Raw
In response to Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
List pgsql-hackers
Hi, all!

>   CNF -> DNF conversion
> =====================
>
> Like many great papers, the MDAM paper takes one core idea, and finds
> ways to leverage it to the hilt. Here the core idea is to take
> predicates in conjunctive normal form (an "AND of ORs"), and convert
> them into disjunctive normal form (an "OR of ANDs"). DNF quals are
> logically equivalent to CNF quals, but ideally suited to SAOP-array
> style processing by an ordered B-Tree index scan -- they reduce
> everything to a series of non-overlapping primitive index scans, that
> can be processed in keyspace order. We already do this today in the
> case of SAOPs, in effect. The nbtree "next array keys" state machine
> already materializes values that can be seen as MDAM style DNF single
> value predicates. The state machine works by outputting the cartesian
> product of each array as a multi-column index is scanned, but that
> could be taken a lot further in the future. We can use essentially the
> same kind of state machine to do everything described in the paper --
> ultimately, it just needs to output a list of disjuncts, like the DNF
> clauses that the paper shows in "Table 3".
>
> In theory, anything can be supported via a sufficiently complete CNF
> -> DNF conversion framework. There will likely always be the potential
> for unsafe/unsupported clauses and/or types in an extensible system
> like Postgres, though. So we will probably need to retain some notion
> of safety. It seems like more of a job for nbtree preprocessing (or
> some suitably index-AM-agnostic version of the same idea) than the
> optimizer, in any case. But that's not entirely true, either (that
> would be far too easy).
>
> The optimizer still needs to optimize. It can't very well do that
> without having some kind of advanced notice of what is and is not
> supported by the index AM. And, the index AM cannot just unilaterally
> decide that index quals actually should be treated as filter/qpquals,
> after all -- it doesn't get a veto. So there is a mutual dependency
> that needs to be resolved. I suspect that there needs to be a two way
> conversation between the optimizer and nbtree code to break the
> dependency -- a callback that does some of the preprocessing work
> during planning. Tom said something along the same lines in passing,
> when discussing the MDAM paper last year [2]. Much work remains here.
>
Honestly, I'm just reading and delving into this thread and other topics 
related to it, so excuse me if I ask you a few obvious questions.

I noticed that you are going to add CNF->DNF transformation at the index 
construction stage. If I understand correctly, you will rewrite 
restrictinfo node,
change boolean "AND" expressions to "OR" expressions, but would it be 
possible to apply such a procedure earlier? Otherwise I suppose you 
could face the problem of
incorrect selectivity of the calculation and, consequently, the 
cardinality calculation?
I can't clearly understand at what stage it is clear that the such a 
transformation needs to be applied?

-- 
Regards,
Alena Rybakina
Postgres Professional




pgsql-hackers by date:

Previous
From: Jelte Fennema
Date:
Subject: Re: proposal: psql: show current user in prompt
Next
From: Dave Cramer
Date:
Subject: Re: Request for comment on setting binary format output per session