Thread: pgsql: Fix an oversight in the 8.2 patch that improved mergejoin
From: tgl@postgresql.org (Tom Lane)
Date:
Log Message:
-----------
Fix an oversight in the 8.2 patch that improved mergejoin performance by
inserting a materialize node above an inner-side sort node, when the sort is
expected to spill to disk. (The materialize protects the sort from having to
support mark/restore, allowing it to do its final merge pass on-the-fly.) We
neglected to teach cost_mergejoin about that hack, so it was failing to
include the materialize's costs in the estimated cost of the mergejoin. The
materialize's costs are generally going to be pretty negligible in comparison
to the sort's, so this is only a small error and probably not worth
back-patching; but it's still wrong.

In the similar case where a materialize is inserted to protect an inner-side
node that can't do mark/restore at all, it's still true that the materialize
should not spill to disk, and so we should cost it cheaply rather than
expensively.

Noted while thinking about a question from Tom Raney.

Modified Files:
--------------
    pgsql/src/backend/optimizer/path:
        costsize.c (r1.196 -> r1.197)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/optimizer/path/costsize.c?r1=1.196&r2=1.197)
    pgsql/src/backend/optimizer/plan:
        createplan.c (r1.247 -> r1.248)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/optimizer/plan/createplan.c?r1=1.247&r2=1.248)
    pgsql/src/backend/optimizer/util:
        pathnode.c (r1.146 -> r1.147)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/optimizer/util/pathnode.c?r1=1.146&r2=1.147)

tgl@postgresql.org (Tom Lane) writes:

> (The materialize protects the sort from having to support mark/restore,
> allowing it to do its final merge pass on-the-fly.) We neglected to teach
> cost_mergejoin about that hack, so it was failing to include the
> materialize's costs in the estimated cost of the mergejoin.

Is that right? The materialize is just doing the same writing that the final
pass of the sort would have been doing. Did we discount the sort's cost for
skipping the write of that final pass when that change was made?

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com

On Sat, 2008-09-06 at 13:06 +0100, Gregory Stark wrote:
> tgl@postgresql.org (Tom Lane) writes:
>
> > (The materialize protects the sort from having to support mark/restore,
> > allowing it to do its final merge pass on-the-fly.) We neglected to teach
> > cost_mergejoin about that hack, so it was failing to include the
> > materialize's costs in the estimated cost of the mergejoin.
>
> Is that right? The materialize is just doing the same writing that the final
> pass of the sort would have been doing. Did we discount the costs for sort for
> that skipping writing that final pass when that was done?

IIRC the cost of the sort didn't include the final merge, so when we avoided
the final merge the cost model for the sort became accurate. Perhaps we should
add something when we don't do that.

It seems reasonable that an extra node should cost something anyhow, and the
per-tuple cost is the current standard way of indicating that extra cost.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
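
As a rough illustration of the costing idea discussed in this thread, the sketch below adds a flat per-tuple charge for a materialize node on top of an inner-side sort, but only when the sort is expected to spill to disk. It is a toy model in Python, not the code in costsize.c: the names SortEstimate, sort_expected_to_spill and inner_sort_cost are hypothetical, the spill test (rows times width compared with a memory budget) is a simplifying assumption, and the flat charge of one cpu_tuple_cost per row is assumed for illustration. The constant 0.01 matches PostgreSQL's default cpu_tuple_cost setting.

```python
from dataclasses import dataclass

# PostgreSQL's default cpu_tuple_cost; used here purely as an illustrative constant.
CPU_TUPLE_COST = 0.01


@dataclass
class SortEstimate:
    """Hypothetical summary of a planner estimate for an inner-side sort."""
    rows: float       # estimated number of tuples to sort
    width: int        # estimated average tuple width in bytes
    run_cost: float   # estimated cost of the sort itself


def sort_expected_to_spill(est: SortEstimate, work_mem_bytes: int) -> bool:
    # Simplifying assumption: the sort spills when its input does not fit in memory.
    return est.rows * est.width > work_mem_bytes


def inner_sort_cost(est: SortEstimate, work_mem_bytes: int) -> float:
    """Cost of the inner side of a mergejoin, including any materialize node."""
    cost = est.run_cost
    if sort_expected_to_spill(est, work_mem_bytes):
        # A materialize node would sit above the sort so the sort need not
        # support mark/restore. Assumed here: it stays in memory, so it is
        # charged cheaply, as one per-tuple CPU cost and no disk I/O.
        cost += CPU_TUPLE_COST * est.rows
    return cost
```

With a 1 MB memory budget, a 1,000-row sort of 100-byte tuples is costed unchanged, while a 1,000,000-row sort picks up the extra per-tuple charge, which stays small next to a realistic sort cost. That matches the commit message's point that the omission was a minor error rather than a large one.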