Re: Our trial to TPC-DS but optimizer made unreasonable plan - Mailing list pgsql-hackers
From | Kouhei Kaigai |
---|---|
Subject | Re: Our trial to TPC-DS but optimizer made unreasonable plan |
Date | |
Msg-id | 9A28C8860F777E439AA12E8AEA7694F801138AF0@BPXM15GP.gisp.nec.co.jp |
In response to | Re: Our trial to TPC-DS but optimizer made unreasonable plan (Peter Geoghegan <pg@heroku.com>) |
Responses | Re: Our trial to TPC-DS but optimizer made unreasonable plan |
List | pgsql-hackers |
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Peter Geoghegan
> Sent: Thursday, August 27, 2015 8:31 AM
> To: Kaigai Kouhei(海外 浩平)
> Cc: Greg Stark; PostgreSQL-development
> Subject: Re: [HACKERS] Our trial to TPC-DS but optimizer made unreasonable plan
>
> On Mon, Aug 17, 2015 at 6:40 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> > I think SortSupport logic provides a reasonable way to solve this
> > kind of problem. For example, btint4sortsupport() informs a function
> > pointer of the fast version of comparator (btint4fastcmp) which takes
> > two Datum argument without indirect memory reference.
> > This mechanism will also make sense for HashAggregate logic, to reduce
> > the cost of function invocations.
> >
> > Please comment on the idea I noticed here.
>
> Is this a 9.5-based system? If so, then you'd benefit from the
> memcmp() pre-check within varstr_cmp() by being on 9.5, since the
> pre-check is not limited to cases that use text/varchar SortSupport --
> this could make a big difference here. If not, then it might be
> somewhat helpful to add a pre-check that considers total binary
> equality only before bcTruelen() is ever called. Not so sure about the
> latter idea, though.
>
My measurements were done on a v9.5-based system. So it also seems to me
that replacing CHAR(n) with VARCHAR(n) will make sense.

> I'm not sure if it would help with hash aggregates to use something
> like SortSupport to avoid fmgr overhead. It might make enough of a
> difference to matter, but maybe the easier win would come from
> considering simple binary equality first, and only then using an
> equality operator (think HOT style checks). That would have the
> advantage of requiring no per-type/operator class support at all,
> since it's safe to assume that binary equality is a proxy for
> "equivalence" of sort order (or whatever we call the case where
> 5.00::numeric and 5.000::numeric are considered equal).
>
According to the perf result, my presumption was wrong; at least, fmgr
overhead is not the major portion of the cost. So I don't think
eliminating fmgr overhead is the first priority.

However, a shortcut path for equality checks seems to me a great
improvement, since it avoids the strict equality checks implemented per
data type, which often involve complicated logic.
It is probably smarter to apply this binary equality proxy only to
problematic data types, like bpchar(n); it would be less effective on
simple data types, like int4.

On the other hand, another big portion of HashAggregate is the
calculation of the hash value over all the grouping keys. It may be
beneficial to have an option to reference a result attribute of the
underlying plan. That would potentially allow a co-processor to compute
the hash value instead of the CPU.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>
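
For illustration, the binary-equality shortcut discussed in this message could look roughly like the following. This is only a minimal sketch in plain C, not PostgreSQL code: every `sketch_*` name is hypothetical, and `sketch_truelen()` merely imitates what bcTruelen() does (finding the length without trailing pad spaces). It assumes, as the quoted text argues, that byte-for-byte identical values are always equal under the type's own equality rule, so the slower type-specific comparison is needed only when the raw bytes differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for bcTruelen(): length without trailing spaces. */
static size_t
sketch_truelen(const char *s, size_t len)
{
    while (len > 0 && s[len - 1] == ' ')
        len--;
    return len;
}

/* Slow path: type-specific equality that ignores trailing pad spaces. */
static bool
sketch_bpchar_eq_slow(const char *a, size_t alen,
                      const char *b, size_t blen)
{
    size_t ta = sketch_truelen(a, alen);
    size_t tb = sketch_truelen(b, blen);

    return ta == tb && memcmp(a, b, ta) == 0;
}

/*
 * Equality with a binary pre-check: if both values have the same length
 * and identical bytes, they are equal and the slow path is skipped.
 * Otherwise fall back to the type-specific comparison.
 */
static bool
sketch_bpchar_eq(const char *a, size_t alen,
                 const char *b, size_t blen)
{
    assert(a != NULL && b != NULL);

    if (alen == blen && memcmp(a, b, alen) == 0)
        return true;            /* fast path: binary equality */

    return sketch_bpchar_eq_slow(a, alen, b, blen);
}
```

With values padded to the same declared width, as CHAR(n) columns are, equal grouping keys would take the fast path, while values that differ only in padding still compare equal through the fallback. Whether this helps in practice would have to be confirmed by measurement.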