Re: Rowcounts marked by create_foreignscan_path() - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Rowcounts marked by create_foreignscan_path()
Date
Msg-id 3166.1394145753@sss.pgh.pa.us
Whole thread Raw
In response to Re: Rowcounts marked by create_foreignscan_path()  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
List pgsql-hackers
Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> writes:
> Maybe I'm missing something.  But I don't think
> postgresGetForeignPaths() marks the parameterized path with the correct
> row estimate.  Also, that function doesn't seem to estimate the cost of
> the parameterized path correctly.  Please find attached a patch.

[ looks at that... ]  Oh, I see what you're worried about: we select
a specific "safe" join clause to drive creation of the parameterized
path, but then there might be additional join clauses that come along
with that because they're also movable to the parameterized path.
You're right, the current postgres_fdw code is not accounting for that.

I experimented with this patch using this test case in the database
created by the postgres_fdw regression test:

explain verbose SELECT * FROM ft2 a, ft2 b WHERE a.c2 = 47 AND b.c1 = a.c1 and a.c6 = 'fooo' and b.c7=a.c7;

What I saw in this example was that with the patch, the join clause got
counted *twice* in the cost estimate, because the clause returned by
generate_implied_equalities_for_column() isn't pointer-equal to the one in
the ppi_clauses list (note the comment about such cases in
postgresGetForeignPlan).  So that's not right.

However, we've got bigger problems than that: with or without the patch,

explain verbose SELECT * FROM ft2 a, ft2 b WHERE a.c2 = 47 AND b.c1 = a.c1 and a.c6 = 'fooo' and b.c7=upper(a.c7);

fails with
TRAP: FailedAssertion("!(is_foreign_expr(root, baserel, rinfo->clause))", File: "postgres_fdw.c", Line: 759)

This happens because we select the "safe" joinclause b.c1=a.c1, and then
the PPI machinery adds all the other movable join clauses for that outer
relation, including the unsafe one involving an upper() call.
postgresGetForeignPlan believes that the only joinclauses it can find in
scan_clauses are the safe one(s) selected by postgresGetForeignPaths, but
that's completely wrong.  I think this is probably relatively simple to
fix: in postgresGetForeignPlan, we need to actually test whether a join
clause is safe or not, and shove it into the appropriate list.

So far as the original issue goes, I think what this consideration shows
is that postgresGetForeignPaths is going about things incorrectly.
I thought that considering only one join clause per path would be valid,
if perhaps sometimes dumb.  But the way the movable-clause machinery works,
we don't get to consider only one join clause; it's all-or-nothing for a
given outer relation.  So what we really ought to be doing here is
generating one parameterized path per valid outer relation.

What we could possibly do to resolve this without much extra code is to
keep using the same logic as a guide to which outer relations are
interesting, but for each such relation, do get_baserel_parampathinfo()
and then use the ppi_clauses list as the quals-to-apply for purposes of
cost/selectivity estimation.  postgresGetForeignPaths would have to check
each of those clauses for safety rather than assuming anything.  One
possible benefit is that we could record the results in the path and not
have to repeat the is_foreign_expr tests in postgresGetForeignPlan.
I'm not entirely sure if that's worth the trouble though.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Daniele Varrazzo
Date:
Subject: Re: jsonb and nested hstore
Next
From: Noah Misch
Date:
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe Reply-To: