Re: Patch to support SEMI and ANTI join removal - Mailing list pgsql-hackers

From David Rowley
Subject Re: Patch to support SEMI and ANTI join removal
Date
Msg-id CAApHDvox2L2z3iZn23mE+HVpRf3U3dKC2KEpaznCcb-AS1Q9bg@mail.gmail.com
Whole thread Raw
In response to Re: Patch to support SEMI and ANTI join removal  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Patch to support SEMI and ANTI join removal
List pgsql-hackers
On Sat, Sep 13, 2014 at 1:38 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
David Rowley <dgrowleyml@gmail.com> writes:
> On Fri, Sep 12, 2014 at 3:35 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> I haven't read the patch, but I think the question is why this needs
>> to be different than what we do for left join removal.

> I discovered over here ->
> http://www.postgresql.org/message-id/CAApHDvo5wCRk7uHBuMHJaDpbW-b_oGKQOuisCZzHC25=H3__fA@mail.gmail.com
> during the early days of the semi and anti join removal code that the
> planner was trying to generate paths to the dead rel. I managed to track
> the problem down to eclass members still existing for the dead rel. I guess
> we must not have eclass members for outer rels? or we'd likely have seen
> some troubles with left join removals already.

Mere existence of an eclass entry ought not cause paths to get built.
It'd be worth looking a bit harder into what's happening there.


It took me a bit of time to create this problem again as I didn't record the actual query where I hit the assert failure the first time. Though, now I have managed to recreate the problem again by removing the code that I had added which removes eclass members for dead rels.

Using the attached patch, the failing test case is:

create table b (id int primary key);
create table a (id int primary key, b_id int references b);
create index on a (b_id); -- add index to create alternative path

explain select a.* from a inner join b on b.id=a.b_id;

What seems to be happening is that generate_implied_equalities_for_column generates a RestrictInfo for the dead rel due to the eclass member still existing. This new rinfo gets matched to the index by match_clauses_to_index()

The code then later fails in get_loop_count:

TRAP: FailedAssertion("!(outer_rel->rows > 0)", File: "src\backend\optimizer\path\indxpath.c", Line: 1861)

The call stack looks like:

> postgres.exe!get_loop_count(PlannerInfo * root, Bitmapset * outer_relids) Line 1861 C
  postgres.exe!build_index_paths(PlannerInfo * root, RelOptInfo * rel, IndexOptInfo * index, IndexClauseSet * clauses, char useful_predicate, SaOpControl saop_control, ScanTypeControl scantype) Line 938 C
  postgres.exe!get_index_paths(PlannerInfo * root, RelOptInfo * rel, IndexOptInfo * index, IndexClauseSet * clauses, List * * bitindexpaths) Line 745 C
  postgres.exe!get_join_index_paths(PlannerInfo * root, RelOptInfo * rel, IndexOptInfo * index, IndexClauseSet * rclauseset, IndexClauseSet * jclauseset, IndexClauseSet * eclauseset, List * * bitindexpaths, Bitmapset * relids, List * * considered_relids) Line 672 C
  postgres.exe!consider_index_join_outer_rels(PlannerInfo * root, RelOptInfo * rel, IndexOptInfo * index, IndexClauseSet * rclauseset, IndexClauseSet * jclauseset, IndexClauseSet * eclauseset, List * * bitindexpaths, List * indexjoinclauses, int considered_clauses, List * * considered_relids) Line 585 C
  postgres.exe!consider_index_join_clauses(PlannerInfo * root, RelOptInfo * rel, IndexOptInfo * index, IndexClauseSet * rclauseset, IndexClauseSet * jclauseset, IndexClauseSet * eclauseset, List * * bitindexpaths) Line 485 C
  postgres.exe!create_index_paths(PlannerInfo * root, RelOptInfo * rel) Line 308 C
  postgres.exe!set_plain_rel_pathlist(PlannerInfo * root, RelOptInfo * rel, RangeTblEntry * rte) Line 403 C
  postgres.exe!set_rel_pathlist(PlannerInfo * root, RelOptInfo * rel, unsigned int rti, RangeTblEntry * rte) Line 337 C
  postgres.exe!set_base_rel_pathlists(PlannerInfo * root) Line 223 C
  postgres.exe!make_one_rel(PlannerInfo * root, List * joinlist) Line 157 C
  postgres.exe!query_planner(PlannerInfo * root, List * tlist, void (PlannerInfo *, void *) * qp_callback, void * qp_extra) Line 236 C
  postgres.exe!grouping_planner(PlannerInfo * root, double tuple_fraction) Line 1289 C
  postgres.exe!subquery_planner(PlannerGlobal * glob, Query * parse, PlannerInfo * parent_root, char hasRecursion, double tuple_fraction, PlannerInfo * * subroot) Line 573 C
  postgres.exe!standard_planner(Query * parse, int cursorOptions, ParamListInfoData * boundParams) Line 211 C
  postgres.exe!planner(Query * parse, int cursorOptions, ParamListInfoData * boundParams) Line 139 C
  postgres.exe!pg_plan_query(Query * querytree, int cursorOptions, ParamListInfoData * boundParams) Line 750 C
  postgres.exe!ExplainOneQuery(Query * query, IntoClause * into, ExplainState * es, const char * queryString, ParamListInfoData * params) Line 330 C
  postgres.exe!ExplainQuery(ExplainStmt * stmt, const char * queryString, ParamListInfoData * params, _DestReceiver * dest) Line 231 C
  postgres.exe!standard_ProcessUtility(Node * parsetree, const char * queryString, ProcessUtilityContext context, ParamListInfoData * params, _DestReceiver * dest, char * completionTag) Line 647 C
  postgres.exe!ProcessUtility(Node * parsetree, const char * queryString, ProcessUtilityContext context, ParamListInfoData * params, _DestReceiver * dest, char * completionTag) Line 314 C
  postgres.exe!PortalRunUtility(PortalData * portal, Node * utilityStmt, char isTopLevel, _DestReceiver * dest, char * completionTag) Line 1195 C
  postgres.exe!FillPortalStore(PortalData * portal, char isTopLevel) Line 1063 C
  postgres.exe!PortalRun(PortalData * portal, long count, char isTopLevel, _DestReceiver * dest, _DestReceiver * altdest, char * completionTag) Line 790 C
  postgres.exe!exec_simple_query(const char * query_string) Line 1052 C
  postgres.exe!PostgresMain(int argc, char * * argv, const char * dbname, const char * username) Line 4012 C
  postgres.exe!BackendRun(Port * port) Line 4113 C
  postgres.exe!SubPostmasterMain(int argc, char * * argv) Line 4618 C
  postgres.exe!main(int argc, char * * argv) Line 207 C

So my solution to this was just to add the code that strips out the eclass members that belong to the newly dead rel in join removal.
The only other solution I can think of would be to add a bitmap set of dead rels onto the PlannerInfo struct or perhaps just generate one and passing that in prohibited_rels in generate_implied_equalities_for_column (). I don't really care for this solution very much as it seems better to make the join removal code pay for this extra processing rather than (probably) most queries.

Of course this is my problem as I'm unable to create the same situation with the existing left join removals. The point here is more to justify why I added code to strip eclass members of dead rels.

Any thoughts? Or arguments against me keeping the code that strips out the eclass members of dead rels?

Regards

David Rowley
Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: CRC algorithm (was Re: [REVIEW] Re: Compression of full-page-writes)
Next
From: Andres Freund
Date:
Subject: Re: WAL format and API changes (9.5)