Re: pg9.6 segfault using simple query (related to use fk for join estimates) - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: pg9.6 segfault using simple query (related to use fk for join estimates) |
Date | |
Msg-id | CA+TgmoaNS3=fhkOyxriurhhFxSO+idFt5FR0b-TxsqQiWMCXYQ@mail.gmail.com Whole thread Raw |
In response to | Re: pg9.6 segfault using simple query (related to use fk for join estimates) (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: pg9.6 segfault using simple query (related to use fk
for join estimates)
|
List | pgsql-hackers |
On Wed, May 4, 2016 at 2:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I spent some time trying to make a test case that was impossibly slow, > without any really impressive result: I saw at most about a 25% growth in > planning time, for a 27-way join with one or two foreign keys per table. > I noted however that with a simple FROM list of tables, you don't really > get the full force of the combinatorial explosion, because > join_search_one_level prefers to generate left-deep trees first, and so > the first creation of a given joinrel is always N left-side rels against 1 > right-side rel, causing the second level of looping to always iterate just > once. (GEQO behaves likewise, I think.) I spent a little time trying to > devise join order constraints that would result in a lot of high-level > joinrels being formed with many relations on both sides, which would cause > both of the second and third levels to iterate O(N) times not just once. > I didn't have much luck, but I have zero confidence that our users won't > find such cases. I' > > The reason it's so brute force is that it's rediscovering the same facts > over and over. In all simple (non-outer-join) cases, the only clauses > that are of any interest are derived from EquivalenceClasses, and all > that you really need to know is "are the vars mentioned on each side > of this FK present in the same EquivalenceClass?". ISTM that could be > determined once per FK per query and cached in or near the EC, not > expensively rediscovered at each level of joining. I'm not sure whether > it'd be worth considering non-EC-derived equalities (ie, outer join > clauses) at all, and note that the current patch fails to do so anyway; > see below. > > My other design-level complaint is that basing this on foreign keys is > fundamentally the wrong thing. What actually matters is the unique index > underlying the FK; that is, if we have "a.x = b.y" and there's a > compatible unique index on b.y, we can conclude that no A row will match > more than one B row, whether or not an explicit FK relationship has been > declared. So we should drive this off unique indexes instead of FKs, > first because we will find more cases, and second because the planner > already examines indexes and doesn't need any additional catalog lookups > to get the required data. (IOW, the relcache additions that were made in > this patch series should go away too.) > > Aside from the design being basically wrong, the code quality seems pretty > low. Aside from the missing IsA check that started this thread, I found > the following problems in a quick once-over of 137805f89: > > Bugs in quals_match_foreign_key(): > > * Test for clause->opno == fkinfo->conpfeqop[i] fails to consider > cross-type operators, ie, what's in the clause might be int2 = int4 > while the conpfeqop is int4 = int2. > > * parent_ec check isn't really the right thing, since EC-derived clauses > might not have that set. I think it may be adequate given that you only > deal with simple Vars, but at least a comment about that would be good. > > * Come to think of it, you could probably dodge the commutator operator > problem altogether for clauses with nonnull parent_ec, because they must > contain a relevant equality operator. (Although if it's redesigned as I > suggest above, the code path for a clause with parent_ec set would look > totally different from this anyway.) > > * Maintaining the fkmatches bitmapset is useless expense, just use a > counter of matched keys. Or for that matter, why not just fail > immediately if i'th key is not found? > > find_best_foreign_key_quals(): > > * Test on enable_fkey_estimates should be one call level up to dodge the > useless doubly-nested loop in clauselist_join_selectivity (which would > make the "fast path" exit here pretty much useless) > > clauselist_join_selectivity(): > > * "either could be zero, but not both" is a pretty unhelpful comment given > the if-test just above it. What *could* have used some explanation is > what the next two dozen lines are doing, because they're as opaque as can > be; and the XXX comment doesn't leave a warm feeling that the author > understands it either. I'm not prepared to opine on whether this segment > of code is correct at all without better commentary. > > calc_joinrel_size_estimate(): > > * why is clauselist_join_selectivity applied to pushed-down clauses and > not local ones in an outer join? If that's not an oversight, it at least > requires an explanatory comment. Note that this means we are not applying > FK knowledge for "a left join b on x = y", though it seems like we could. > > compute_semi_anti_join_factors isn't using clauselist_join_selectivity > either. I don't know whether it should be, but if not, a comment about > why not seems appropriate. More generally, I wonder why this logic > wasn't just folded into clauselist_selectivity. > > guc.c: > undocumented GUCs are not acceptable > > paths.h: > patch introduces an extern that's referenced noplace > > In short, I'm not left with a warm fuzzy feeling about this patch having > been ready to commit. The predecessor patch 015e88942 was also > underreviewed, cf 5306df283. > > regards, tom lane > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: