Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql - Mailing list pgsql-hackers

From Noah Misch
Subject Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql
Date
Msg-id 20110115175355.GA28753@tornado.leadboat.com
Whole thread Raw
In response to patch: fix performance problems with repated decomprimation of varlena values in plpgsql  (Pavel Stehule <pavel.stehule@gmail.com>)
Responses Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-hackers
Hello Pavel,

I'm reviewing this patch for CommitFest 2011-01.

The patch seems fully desirable.  It appropriately contains no documentation
updates.  It contains no new tests, and that's probably fine, too; I can't think
of any corner cases where this would do something other than work correctly or
break things comprehensively.

Using your test case from here:
http://archives.postgresql.org/message-id/AANLkTikDCW+c-C4U4NgaOBhpFSZkb5Uy_ZuaTDZfPMSn@mail.gmail.com
I observed a 28x speedup, similar to Álvaro's report.  I also made my own test
case, attached, to evaluate this from a somewhat different angle and also to
consider the worst case.  With a 100 KiB string (good case), I see a 4.8x
speedup.  With a 1 KiB string (worst case - never toasted), I see no
statistically significant change.  This is to be expected.

A few specific questions and comments follow, all minor.  Go ahead and mark the
patch ready for committer when you've acted on them, or declined to do so, to
your satisfaction.  I don't think a re-review will be needed.

Thanks,
nm

On Mon, Nov 22, 2010 at 01:46:51PM +0100, Pavel Stehule wrote:
> *** ./pl_exec.c.orig    2010-11-16 10:28:42.000000000 +0100
> --- ./pl_exec.c    2010-11-22 13:33:01.597726809 +0100

The patch applies cleanly, but the format is slightly nonstandard (-p0 when
applied from src/pl/plpgsql/src, rather than -p1 from the root).

> ***************
> *** 3944,3949 ****
> --- 3965,3993 ----
>
>                   *typeid = var->datatype->typoid;
>                   *typetypmod = var->datatype->atttypmod;
> +
> +                 /*
> +                  * explicit deTOAST and decomprim for varlena types

"decompress", perhaps?

> +                  */
> +                 if (var->should_be_detoasted)
> +                 {
> +                     Datum dvalue;
> +
> +                     Assert(!var->isnull);
> +
> +                     oldcontext = MemoryContextSwitchTo(estate->fn_mcxt);
> +                     dvalue = PointerGetDatum(PG_DETOAST_DATUM(var->value));
> +                     if (dvalue != var->value)
> +                     {
> +                         if (var->freeval)
> +                             free_var(var);
> +                         var->value = dvalue;
> +                         var->freeval = true;
> +                     }
> +                     MemoryContextSwitchTo(oldcontext);

This line adds trailing white space.

> +                     var->should_be_detoasted = false;
> +                 }
> +
>                   *value = var->value;
>                   *isnull = var->isnull;
>                   break;

> *** ./plpgsql.h.orig    2010-11-16 10:28:42.000000000 +0100
> --- ./plpgsql.h    2010-11-22 13:12:38.897851879 +0100

> ***************
> *** 644,649 ****
> --- 645,651 ----
>       bool        fn_is_trigger;
>       PLpgSQL_func_hashkey *fn_hashkey;    /* back-link to hashtable key */
>       MemoryContext fn_cxt;
> +     MemoryContext    fn_mcxt;        /* link to function's memory context */
>
>       Oid            fn_rettype;
>       int            fn_rettyplen;
> ***************
> *** 692,697 ****
> --- 694,701 ----
>       Oid            rettype;        /* type of current retval */
>
>       Oid            fn_rettype;        /* info about declared function rettype */
> +     MemoryContext    fn_mcxt;        /* link to function's memory context */
> +
>       bool        retistuple;
>       bool        retisset;
>

I only see PLpgSQL_execstate.fn_mcxt getting populated in this patch.  Is the
PLpgSQL_function.fn_mcxt leftover from an earlier design?

I could not readily tell the difference between fn_cxt and fn_mcxt.  As I
understand it, fn_mcxt is the SPI procCxt, and fn_cxt is the long-lived context
used to cache facts across many transactions.  Perhaps name the member something
like "top_cxt", "exec_cxt" or "proc_cxt" and comment it like "/* SPI Proc memory
context */" to make this clearer.

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: textarray option for file FDW
Next
From: Andrew Dunstan
Date:
Subject: Re: textarray option for file FDW