Re: [INTERFACES] Re: [HACKERS] changes in 6.4 - Mailing list pgsql-hackers
From | Bruce Momjian |
---|---|
Subject | Re: [INTERFACES] Re: [HACKERS] changes in 6.4 |
Date | |
Msg-id | 199909182010.QAA18416@candle.pha.pa.us Whole thread Raw |
In response to | Re: [INTERFACES] Re: [HACKERS] changes in 6.4 (David Hartwig <daybee@bellatlantic.net>) |
Responses |
PERL
|
List | pgsql-hackers |
This is an old message, but still relivant. I belive 6.6 will have much better OR memory usage. > > > Bruce Momjian wrote: > > > > > > > Hannu Krosing wrote: > > > > > > > > The days where every release fixed server crashes, or added a feature > > > > > that users were 'screaming for' may be a thing of the past. > > > > > > > > Is anyone working on fixing the exploding optimisations for many OR-s, > > > > at least the canonic case used by access? > > > > > > > > My impression is that this has fallen somewhere between > > > > insightdist and Vadim. > > > > > > This is really big for the ODBCers. (And I suspect for JDBCers too.) Many > > > desktop libraries and end-user tools depend on this "record set" strategy to > > > operate effectively. > > > > > > I have put together a workable hack that runs just before cnfify(). The > > > option is activated through the SET command. Once activated, it identifies > > > queries with this particular multi-OR pattern generated by these RECORD SET > > > strategies. Qualified query trees are rewritten as multiple UNIONs. (One > > > for each OR grouping). > > > > > > The results are profound. Queries that used to scan tables because of the > > > ORs, now make use of any indexes. Thus, the size of the table has virtually > > > no effect on performance. Furthermore, queries that used to crash the > > > backend, now run in under a second. > > > > > > Currently the down sides are: > > > 1. If there is no usable index, performance is significantly worse. The > > > patch does not check to make sure that there is a usable index. I could use > > > some pointers on this. > > > > > > 2. Small tables are actually a bit slower than without the patch. > > > > > > 3. Not very elegant. I am looking for a more generalized solution. > > > I have lots of ideas, but I would need to know the backend much better before > > > attempting any of them. My favorite idea is before cnfify(), to factor the > > > OR terms and pull out the constants into a virtual (temporary) table spaces. > > > Then rewrite the query as a join. The optimizer will (should) treat the new > > > query accordingly. This assumes that an efficient factoring algorithm exists > > > and that temporary tables can exist in the heap. > > > > > > Illustration: > > > SELECT ... FROM tab WHERE > > > (var1 = const1 AND var2 = const2) OR > > > (var1 = const3 AND var2 = const4) OR > > > (var1 = const5 AND var2 = const6) > > > > > > SELECT ... FROM tab, tmp WHERE > > > (var1 = var_x AND var2 = var_y) > > > > > > tmp > > > var_x | var_y > > > -------------- > > > const1|const2 > > > const3|const4 > > > const5|const6 > > > > David, where are we on this? I know we have OR's using indexes. Do we > > still need to look this as a fix, or are we OK. I have not gotten far > > enough in the optimizer to know how to fix the > > Bruce, > > If the question is, have I come up with a solution for the cnf'ify problem: No > > If the question is, is it still important: Very much yes. > > It is essential for many RAD tools using remote data objects which make use of key > sets. Your recent optimization of the OR list goes a long way, but inevitably > users are confronted with multi-part keys. > > When I look at the problem my head spins. I do not have the experience (yet?) > with the backend to be mucking around in the optimizer. As I see it, cnf'ify is > doing just what it is supposed to do. Boundless boolean logic. > > I think hope may lay though, in identifying each AND'ed group associated with a key > and tagging it as a special sub-root node which cnf'ify does not penetrate. This > node would be allowed to pass to the later stages of the optimizer where it will be > used to plan index scans. Easy for me to say. > > In the meantime, I still have the patch that I described in prior email. It has > worked well for us. Let me restate that. We could not survive without it! > However, I do not feel that is a sufficiently functional approach that should be > incorporated as a final solution. I will submit the patch if you, (anyone) does > not come up with a better solution. It is coded to be activated by a SET KSQO to > minimize its reach. > > -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
pgsql-hackers by date: