On Sat, May 9, 2009 at 11:52 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
As best I can tell, the selectivity numbers are about what they should be --- for instance, using these stats I get a selectivity of 0.0000074 for the join clause fkr.submission_id = tr.submission_id. Over the entire relations (646484 and 142698 rows) that's predicting a join size of 683551, which seems to be in the right ballpark (it looks like actually it's one join row per canvas_foreignkeyresponse row, correct?).
The design is that for each submission_id there are any number of responses of different types. This particular questionnaire has 78 questions, 2 of which are text responses and 28 are foreignkey responses. The restrictions on the question_id limit the rows returned from those tables to 1 each in this case however. So yes, it's one to one in this case.
How is it that each fkr row matching those question_ids has a join match in tr that has those other two question_ids? It seems like there must be a whole lot of hidden correlation here.
As I mentioned before, they are all linked by the submission_id which indicates they are part of a single submission against a particular questionnaire (chart_id in the ddl). It is a design that I based on Elein Mustain's Question/Answer problem [1]. This particular query includes 2 chart_id's because they contain virtually the same data (sets of questions), but have different validation requirements. Does that shed any more light?