Re: Parallel Seq Scan - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Parallel Seq Scan |
Date | |
Msg-id | CA+TgmobR=EuLaNjKUS3gZQSZfpi15+hyGc9Vyweu_Ab9bWALgw@mail.gmail.com Whole thread Raw |
In response to | Re: Parallel Seq Scan (Noah Misch <noah@leadboat.com>) |
Responses |
Re: Parallel Seq Scan
|
List | pgsql-hackers |
On Tue, Oct 13, 2015 at 9:08 PM, Noah Misch <noah@leadboat.com> wrote: > On Mon, Oct 12, 2015 at 11:46:08AM -0400, Robert Haas wrote: >> plpgsql_param_fetch() assumes that it can detect whether it's being >> called from copyParamList() by checking whether params != >> estate->paramLI. I don't know why this works, but I do know that this > > It works because PL/pgSQL creates an unshared list whenever copyParamList() is > forthcoming. (This in turn relies on intimate knowledge of how the rest of > the system processes param lists.) The comments at setup_param_list() and > setup_unshared_param_list() are most pertinent. Thanks for the pointer. >> test fails to detect the case where it's being called from >> SerializeParamList(), which causes failures in exec_eval_datum() as >> predicted. Calls from SerializeParamList() need the same treatment as >> calls from copyParamList() because it, too, will try to evaluate every >> parameter in the list. Here, I've taken the approach of making that >> check unconditional, which seems to work, but I'm not sure if some >> other approach would be better, such as adding an additional Boolean >> (or enum context?) argument to ParamFetchHook. I *think* that >> skipping this check is merely a performance optimization rather than >> anything that affects correctness, and bms_is_member() is pretty >> cheap, so perhaps the way I've done it is OK. > > Like you, I don't expect bms_is_member() to be expensive relative to the task > at hand. However, copyParamList() and SerializeParamList() copy non-dynamic > params without calling plpgsql_param_fetch(). Given the shared param list, > they will copy non-dynamic params the current query doesn't use. That cost is > best avoided, not being well-bounded; consider the case of an unrelated > variable containing a TOAST pointer to a 1-GiB value. One approach is to have > setup_param_list() copy the paramnos pointer to a new ParamListInfoData field: > > Bitmapset *paramMask; /* if non-NULL, ignore params lacking a 1-bit */ > > Test it directly in copyParamList() and SerializeParamList(). As a bonus, > that would allow use of the shared param list for more cases involving > cursors. Furthermore, plpgsql_param_fetch() would never need to test > paramnos. A more-general alternative is to have a distinct "paramIsUsed" > callback, but I don't know how one would exploit the extra generality. I'm anxious to minimize the number of things that must be fixed in order for a stable version of parallel query to exist in our master repository, and I fear that trying to improve ParamListInfo generally could take me fairly far afield. How about adding a paramListCopyHook to ParamListInfoData? SerializeParamList() would, if it found a parameter with !OidIsValid(prm->prmtype) && param->paramFetch != NULL, call this function, which would return a new ParamListInfo to be serialized in place of the original? This wouldn't require any modification to the current plpgsql_param_fetch() at all, but the new function would steal its bms_is_member() test. Furthermore, no user of ParamListInfo other than plpgsql needs to care at all -- which, with your proposals, they would. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: