Thread: Re: slow join on postgresql6.5
At 11:10 AM 3/30/00 -0500, Tom Lane wrote: >Don Baccus <dhogaza@pacifier.com> writes: >> This is an example where commercial systems that have indices >> synchronized with data such that queries referencing only the >> fields in indices can win big vs. PG in SOME (not all) cases. >> In particular, when the indices are to a table that has a bunch >> of other, perhaps long, columns. PG has to read the table and >> drag all that dead weight around to do RI referential checking >> and semantic actions. > >Keep in mind, though, that once we have TOAST the long columns are >likely to get pushed out to a secondary table, so that the amount >of data you have to read is reduced (as long as you don't touch >any of the long columns, of course). Sure...and you can BLOB or CLOB longer data in Oracle, too. TOASTing isn't without costs, either...life's a tradeoff! > >The main reason that Postgres indexes can't be used without also >consulting the main table is that we do not store transaction status >information in index entries, only in real tuples. After finding >an index entry we must still consult the referenced tuple to see >if it's been deleted, or even committed yet. I believe this is a >pretty good tradeoff. I must wonder, though, given that proper syncing seems to be the norm in commercial systems. Or so I'm lead to believe when Gray's book, for instance. Or a good book on speeding up Oracle queries. Whatever ... in this particular case - referential integrity with MATCH <unspecified> and MATCH PARTIAL and multi-column foreign keys - performance will likely drop spectacularly once the leading column is NULL, while (say) with Oracle you'd expect much less of a performance hit. The point of my note is that this is probably worth documenting. Don't get me wrong, these semantics and RI and multi-column keys appear to be pretty inefficient by nature, I don't think anyone is likely to be horrified to read that it might well be even worse in PG than in certain commercial systems... >I suppose that keeping tuple status in index entries could be a win >on nearly-read-only tables, but I think that on average it'd be >a performance loser. Well...I've personally not studied the issue in detail, but just have to wonder if the folks at Oracle are really as stupid as the above analysis would make them appear to be. I presume that they have a pretty good idea of the kind of mix large database installations make, and presumably make choices designed to win on average. - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
> -----Original Message----- > From: majordomo-owner@hub.org [mailto:majordomo-owner@hub.org]On Behalf > Of Don Baccus > > Whatever ... in this particular case - referential integrity > with MATCH <unspecified> and MATCH PARTIAL and multi-column > foreign keys - performance will likely drop spectacularly once the > leading column is NULL, while (say) with Oracle you'd expect much > less of a performance hit. > As for NULL,it seems possible to look up NULL keys in a btree index because NULL == NULL for btree indexes. I've wondered why PostgreSQL's planner/executor never looks up indexes for queries using 'IS NULL'. Regards. Hiroshi Inoue Inoue@tpf.co.jp
At 07:05 PM 3/31/00 +0900, Hiroshi Inoue wrote: >> -----Original Message----- >> From: majordomo-owner@hub.org [mailto:majordomo-owner@hub.org]On Behalf >> Of Don Baccus >> >> Whatever ... in this particular case - referential integrity >> with MATCH <unspecified> and MATCH PARTIAL and multi-column >> foreign keys - performance will likely drop spectacularly once the >> leading column is NULL, while (say) with Oracle you'd expect much >> less of a performance hit. >> > >As for NULL,it seems possible to look up NULL keys in a btree index >because NULL == NULL for btree indexes. >I've wondered why PostgreSQL's planner/executor never looks up >indexes for queries using 'IS NULL'. Unfortunately for the RI MATCH PARTIAL case, NULL is a "wildcard". This doesn't affect the validity of your observation in the general case, though. - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
> -----Original Message----- > From: Don Baccus [mailto:dhogaza@pacifier.com] > Sent: Friday, March 31, 2000 11:34 PM > > At 07:05 PM 3/31/00 +0900, Hiroshi Inoue wrote: > >> -----Original Message----- > >> From: majordomo-owner@hub.org [mailto:majordomo-owner@hub.org]On Behalf > >> Of Don Baccus > >> > >> Whatever ... in this particular case - referential integrity > >> with MATCH <unspecified> and MATCH PARTIAL and multi-column > >> foreign keys - performance will likely drop spectacularly once the > >> leading column is NULL, while (say) with Oracle you'd expect much > >> less of a performance hit. > >> > > > >As for NULL,it seems possible to look up NULL keys in a btree index > >because NULL == NULL for btree indexes. > >I've wondered why PostgreSQL's planner/executor never looks up > >indexes for queries using 'IS NULL'. > > Unfortunately for the RI MATCH PARTIAL case, NULL is a "wildcard". > Oops I misunderstood NULL. Hmm,is the following TODO worth the work ? * Use index to restrict rows returned by multi-key index when used with non-consecutive keys or OR clauses, so fewer heapaccesses. Probably this is for the case likeSELECT .. FROM .. WHERE key1 = val1 and key3 = val3; ,where (key1,key2,key3) is a multi-column index. Currently index scan doesn't take 'key3=val3' into account because (key1,key3) isn't consecutive. The TODO may include the caseSELECT .. FROM .. WHERE key2 = val2; Though we have to scan the index entirely,access to the main table is needed only when key2 = val2. If (key2 = val2) is sufficiently restrictive, the scan would be faster than simple sequential scan. Comments ? Regards. Hiroshi Inoue Inoue@tpf.co.jp
> > >As for NULL,it seems possible to look up NULL keys in a btree index > > >because NULL == NULL for btree indexes. > > >I've wondered why PostgreSQL's planner/executor never looks up > > >indexes for queries using 'IS NULL'. > > > > Unfortunately for the RI MATCH PARTIAL case, NULL is a "wildcard". > > > > Oops I misunderstood NULL. > > Hmm,is the following TODO worth the work ? > * Use index to restrict rows returned by multi-key index when used with > non-consecutive keys or OR clauses, so fewer heap accesses. This is a Vadim item. -- Bruce Momjian | http://www.op.net/~candle pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026