Thread: WARNING in parallel index creation.
If i run:
pgbench -i -s30
And then create the function:
CREATE OR REPLACE FUNCTION foobar(text)
RETURNS text
LANGUAGE plperl
IMMUTABLE PARALLEL SAFE STRICT COST 10000
AS $function$
return scalar reverse($_[0]);
$function$;
Then when I create in index, I get a warning:
jjanes=# create index on pgbench_accounts (foobar(filler));
WARNING: cannot set parameters during a parallel operation
WARNING: cannot set parameters during a parallel operation
If I create the index again within the same session, there is no WARNING.
This only occurs if plperl.on_init is set in the postgresql.conf file. It doesn't seem to matter what it is set to,
even the empty string triggers the warning.
plperl.on_init=''
As far as I can tell the index is created correctly despite the warning.
Cheers,
Jeff
On Sun, Mar 11, 2018 at 10:15 PM, Jeff Janes <jeff.janes@gmail.com> wrote: > Then when I create in index, I get a warning: > > jjanes=# create index on pgbench_accounts (foobar(filler)); > WARNING: cannot set parameters during a parallel operation > WARNING: cannot set parameters during a parallel operation > > If I create the index again within the same session, there is no WARNING. > > This only occurs if plperl.on_init is set in the postgresql.conf file. It > doesn't seem to matter what it is set to, > even the empty string triggers the warning. I wonder why DefineCustomStringVariable() does not set var->reset_val. We see that within DefineCustomEnumVariable(), DefineCustomRealVariable(), DefineCustomIntVariable(), etc. -- Peter Geoghegan
On 2018-03-12 20:44:01 -0700, Peter Geoghegan wrote: > On Sun, Mar 11, 2018 at 10:15 PM, Jeff Janes <jeff.janes@gmail.com> wrote: > > Then when I create in index, I get a warning: > > > > jjanes=# create index on pgbench_accounts (foobar(filler)); > > WARNING: cannot set parameters during a parallel operation > > WARNING: cannot set parameters during a parallel operation > > > > If I create the index again within the same session, there is no WARNING. > > > > This only occurs if plperl.on_init is set in the postgresql.conf file. It > > doesn't seem to matter what it is set to, > > even the empty string triggers the warning. > > I wonder why DefineCustomStringVariable() does not set var->reset_val. > We see that within DefineCustomEnumVariable(), > DefineCustomRealVariable(), DefineCustomIntVariable(), etc. Peter, have you investigated this any further? This has been an open item for a while. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2018-03-12 20:44:01 -0700, Peter Geoghegan wrote: >> I wonder why DefineCustomStringVariable() does not set var->reset_val. >> We see that within DefineCustomEnumVariable(), >> DefineCustomRealVariable(), DefineCustomIntVariable(), etc. > Peter, have you investigated this any further? This has been an open > item for a while. It looks to me like the "oversight" in DefineCustomStringVariable is not really one; rather, the reset_val assignments in the sibling routines are dead code. define_custom_variable will call InitializeOneGUCOption which overwrites reset_val from the boot_val in all cases. We should probably clean this up by removing the dead assignments. The WARNING seems to indicate that the error check in set_config_option is too aggressive. I kind of wonder why it was placed there at all, rather than in SQL-level operations like ExecSetVariableStmt. regards, tom lane
I wrote: > The WARNING seems to indicate that the error check in set_config_option > is too aggressive. I kind of wonder why it was placed there at all, > rather than in SQL-level operations like ExecSetVariableStmt. BTW, looking back at the thread, nobody seems to have posted an analysis of why this happens, but it's pretty simple. During initial library load of plperl.so, that module does DefineCustomStringVariable. If there is a pre-existing reset_val of the variable (taken from the postgresql.conf file, in this example), then define_custom_variable uses set_config_option to apply that value to the newly created variable struct. So if plperl library load happens during a parallel operation, the IsInParallelMode check in set_config_option will bleat about it. That test is, therefore, wrong. Otherwise, no non-builtin function could ever be marked parallel safe, for fear that the shlib it lives in might try to set up a custom variable at load time. I'm of the opinion that having such a test here at all is crummy design. It implies that set_config_option is in charge of knowing about every one of its callers and passing judgment on whether they represent parallel-safe usages, which is the exact opposite of modularity, even if set_config_option had enough information to make that judgment which it does not. In any case, the parallel index build patch is not at fault, it just happens to be involved in this particular example. I'd put the blame on commit 924bcf4f1 which installed this test in the first place. regards, tom lane
On Wed, Apr 11, 2018 at 2:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > That test is, therefore, wrong. Otherwise, no non-builtin function > could ever be marked parallel safe, for fear that the shlib it lives > in might try to set up a custom variable at load time. I don't follow that logic. If the check is more stringent than it needs to be for safety, then we can relax it. If the check is less stringent than it needs to be for safety, we need to tighten it. If it's exactly right, then it has to stay the way it is, even if that prohibits some things we wish we could allow. > I'm of the opinion that having such a test here at all is crummy design. > It implies that set_config_option is in charge of knowing about every > one of its callers and passing judgment on whether they represent > parallel-safe usages, which is the exact opposite of modularity, > even if set_config_option had enough information to make that judgment > which it does not. The point of SerializeGUCState and RestoreGUCState is to enforce an invariant that all cooperating processes have the same idea about the values of every GUC. If you let the GUCs be changed, then isn't that invariant is violated regardless of why you let it happen? For example: rhaas=# set pg_trgm.similarity_threshold = '002'; SET rhaas=# select current_setting('pg_trgm.similarity_threshold'); current_setting ----------------- 002 (1 row) rhaas=# load 'pg_trgm'; WARNING: 2 is outside the valid range for parameter "pg_trgm.similarity_threshold" (0 .. 1) LOAD rhaas=# select current_setting('pg_trgm.similarity_threshold'); current_setting ----------------- 0.3 (1 row) This shows that if you load a library that defines a custom variable in process A but not process B, it's trivial to construct a query that no longer returns the same answer in both backends, which isn't too good, because the point of parallel query is to deliver the same answer with parallel query that you would have gotten without it, or at the very least, to make it look like the answer was generated by a single process with a unified view of the world rather than multiple uncoordinated processes. I tried the following test with Jeff's example function and with plperl.on_init='' and it does not produce a warning: rhaas=# set force_parallel_mode = on; SET rhaas=# select foobar('dsgsdg'); foobar -------- gdsgsd (1 row) I haven't tracked it down, but I think what must be going on here is that, when the function is called via a query, something triggers the shared library to get loaded before we enter parallel mode. Then things work OK, because the worker will load the shared library before restoring the leader's GUC state, and so the state is actually guaranteed to match. Here it will *probably* end up matching because most likely every process that loads the module will interpret the existing textual setting of the newly-defined GUC in the same way, but it's a bit less ironclad, and if they decide to emit warnings (as in my first example, above) then you'll get multiple copies of them. So I don't think just drilling a hole through the prohibition in set_config_option() is a particularly appealing solution. It would amount to suppressing a warning that is basically legitimate. I wonder if there's an easy way to force the libraries that we're going to need to be loaded before we reach CreateParallelContext(). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company