Thread: plpgsql performance - SearchCatCache issue

plpgsql performance - SearchCatCache issue

From
Pavel Stehule
Date:
Hello

I tried to optimize repeated assign in plpgsql with elimination
unnecessary  palloc/free calls.

I tested changes on simple bublesort

postgres=# \sf buble
CREATE OR REPLACE FUNCTION public.buble(integer[])RETURNS integer[]LANGUAGE plpgsql
AS $function$
declare unsorted bool := true; aux int;
begin while unsorted loop   unsorted := false;   for i in array_lower($1,1) .. array_upper($1, 1) - 1   loop     if
$1[i]> $1[i+1] then       aux := $1[i];       $1[i] := $1[i+1];       $1[i+1] := aux;       unsorted := true;     end
if;  end loop; end loop; return $1;
 
end
$function$

The performance tests shows so this optimization is useless. But when
I checked a oprofile' result I was surprised by high a SearchCatCache
calls.

3008     13.0493  SearchCatCache
1306      5.6657  ExecEvalParamExtern
1143      4.9586  GetSnapshotData
1122      4.8675  AllocSetAlloc
1058      4.5898  MemoryContextAllocZero
1002      4.3469  ExecMakeFunctionResultNoSets
986       4.2775  ExecEvalArrayRef
851       3.6918  LWLockAcquire
783       3.3968  LWLockRelease
664       2.8806  RevalidateCachedPlan
646       2.8025  AllocSetFree
568       2.4641  array_ref
551       2.3904  CopySnapshot
519       2.2515  AllocSetReset
510       2.2125  array_set
492       2.1344  PopActiveSnapshot
381       1.6529  ArrayGetOffset
369       1.6008  AcquireExecutorLocks
348       1.5097  pfree
347       1.5054  MemoryContextAlloc
313       1.3579  bms_is_member
285       1.2364  CatalogCacheComputeHashValue
267       1.1583  PushActiveSnapshot
266       1.1540  hash_uint32
253       1.0976  pgstat_init_function_usage
233       1.0108  array_seek.clone.0

when I mark function buble as immutable I got a profile:

3006     18.6384  SearchCatCache
1239      7.6823  ExecEvalParamExtern
1061      6.5786  MemoryContextAllocZero
931       5.7726  ExecMakeFunctionResultNoSets
881       5.4625  ExecEvalArrayRef
590       3.6582  RevalidateCachedPlan
580       3.5962  array_ref
518       3.2118  AllocSetAlloc
488       3.0258  array_set
447       2.7716  AllocSetReset
383       2.3748  AcquireExecutorLocks
334       2.0709  bms_is_member
311       1.9283  ArrayGetOffset
285       1.7671  CatalogCacheComputeHashValue
269       1.6679  pgstat_init_function_usage
240       1.4881  hash_uint32
237       1.4695  ResourceOwnerForgetPlanCacheRef
214       1.3269  oideq
210       1.3021  ReleaseCachedPlan
204       1.2649  array_seek.clone.0
202       1.2525  ResourceOwnerForgetCatCacheRef
196       1.2153  SearchSysCache
188       1.1657  pg_detoast_datum
185       1.1471  ArrayGetNItems
183       1.1347  ExecEvalConst
181       1.1223  DirectFunctionCall1Coll
178       1.1037  hashoid
176       1.0913  check_stack_depth
174       1.0789  heap_getsysattr
174       1.0789  pgstat_end_function_usage
173       1.0727  FunctionCall2Coll

Is this profile expected?

Regards

Pavel Stehule


Re: plpgsql performance - SearchCatCache issue

From
Robert Haas
Date:
On Sat, Jun 18, 2011 at 9:21 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> Is this profile expected?

I've certainly seen profiles before where the catcache overhead was
significant.  I don't think that I've seen SearchCatCache() quite this
high on any of the profiling I've done, but then again I don't tend to
profile the same things you do, so maybe that's not surprising.  I
think the interesting question is probably "where are all those calls
coming from?" and "can we optimize any of them away?" rather than "how
do we make SearchCatCache() run faster?".   I would be willing to bet
money that the latter is largely an exercise in futility.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: plpgsql performance - SearchCatCache issue

From
Pavel Stehule
Date:
2011/6/19 Robert Haas <robertmhaas@gmail.com>:
> On Sat, Jun 18, 2011 at 9:21 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> Is this profile expected?
>
> I've certainly seen profiles before where the catcache overhead was
> significant.  I don't think that I've seen SearchCatCache() quite this
> high on any of the profiling I've done, but then again I don't tend to
> profile the same things you do, so maybe that's not surprising.  I
> think the interesting question is probably "where are all those calls
> coming from?" and "can we optimize any of them away?" rather than "how
> do we make SearchCatCache() run faster?".   I would be willing to bet
> money that the latter is largely an exercise in futility.

I would not to attack on SearchCatCache. This is relative new area for
me, so I just asked.

The "suspect" part should be inside exec_assign_value
               case PLPGSQL_DTYPE_ARRAYELEM:                       {
        ....
                               /* Fetch current value of array datum */
exec_eval_datum(estate,target, 

&arraytypeid, &arraytypmod,

&oldarraydatum, &oldarrayisnull);
                               /* If target is domain over array,
reduce to base type */                               arraytypeid =
getBaseTypeAndTypmod(arraytypeid, &arraytypmod);
                               /* ... and identify the element type */                               arrayelemtypeid =
get_element_type(arraytypeid);                              if (!OidIsValid(arrayelemtypeid))
           ereport(ERROR, 

(errcode(ERRCODE_DATATYPE_MISMATCH),

errmsg("subscripted object is not an array")));
                               get_typlenbyvalalign(arrayelemtypeid,
 &elemtyplen,
 &elemtypbyval,
 &elemtypalign);                               arraytyplen = get_typlen(arraytypeid);


so any update of array means a access to CatCache.

These data should be cached in some referenced data type info
structure and should be accessed via new exec_eval_array_datum()
function.

Regards

Pavel Stehule


>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: plpgsql performance - SearchCatCache issue

From
Pavel Stehule
Date:
2011/6/19 Pavel Stehule <pavel.stehule@gmail.com>:
> 2011/6/19 Robert Haas <robertmhaas@gmail.com>:
>> On Sat, Jun 18, 2011 at 9:21 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>> Is this profile expected?
>>
>> I've certainly seen profiles before where the catcache overhead was
>> significant.  I don't think that I've seen SearchCatCache() quite this
>> high on any of the profiling I've done, but then again I don't tend to
>> profile the same things you do, so maybe that's not surprising.  I
>> think the interesting question is probably "where are all those calls
>> coming from?" and "can we optimize any of them away?" rather than "how
>> do we make SearchCatCache() run faster?".   I would be willing to bet
>> money that the latter is largely an exercise in futility.
>
> I would not to attack on SearchCatCache. This is relative new area for
> me, so I just asked.
>
> The "suspect" part should be inside exec_assign_value
>
>                case PLPGSQL_DTYPE_ARRAYELEM:
>                        {
>
>         ....
>
>                                /* Fetch current value of array datum */
>                                exec_eval_datum(estate, target,
>
> &arraytypeid, &arraytypmod,
>
> &oldarraydatum, &oldarrayisnull);
>
>                                /* If target is domain over array,
> reduce to base type */
>                                arraytypeid =
> getBaseTypeAndTypmod(arraytypeid, &arraytypmod);
>
>                                /* ... and identify the element type */
>                                arrayelemtypeid = get_element_type(arraytypeid);
>                                if (!OidIsValid(arrayelemtypeid))
>                                        ereport(ERROR,
>
> (errcode(ERRCODE_DATATYPE_MISMATCH),
>
> errmsg("subscripted object is not an array")));
>
>                                get_typlenbyvalalign(arrayelemtypeid,
>
>  &elemtyplen,
>
>  &elemtypbyval,
>
>  &elemtypalign);
>                                arraytyplen = get_typlen(arraytypeid);
>
>
> so any update of array means a access to CatCache.
>
> These data should be cached in some referenced data type info
> structure and should be accessed via new exec_eval_array_datum()
> function.

Using a cache for these values increased speed about 30% - I'll
prepare patch to next commitfest.

Regards

Pavel Stehule

>
> Regards
>
> Pavel Stehule
>
>
>>
>> --
>> Robert Haas
>> EnterpriseDB: http://www.enterprisedb.com
>> The Enterprise PostgreSQL Company
>>
>