Re: Row pattern recognition - Mailing list pgsql-hackers

From Tatsuo Ishii
Subject Re: Row pattern recognition
Date
Msg-id 20251201.212138.1470631793001194849.ishii@postgresql.org
Whole thread Raw
In response to Re: Row pattern recognition  (Chao Li <li.evan.chao@gmail.com>)
List pgsql-hackers
>> On Nov 27, 2025, at 11:10, Tatsuo Ishii <ishii@postgresql.org> wrote:
>> 
>> Hi Chao,
>> 
>> Any comment on this?
>> 
>>>> 13 - 0005 - nodeWindowAgg.c
>>>> ```
>>>> static
>>>> int
>>>> do_pattern_match(char *pattern, char *encoded_str, int len)
>>>> {
>>>> static regex_t *regcache = NULL;
>>>> static regex_t preg;
>>>> static char patbuf[1024]; /* most recent 'pattern' is cached here */
>>>> ```
>>>> 
>>>> Using static variable in executor seems something I have never seen, which may persistent data across queries. I
thinkusually per query data are stored in state structures.
 
>>> 
>>> Yeah, such a usage maybe rare. But I hesitate to store the data
>>> (regex_t) in WindowAggState because:
>>> 
>>> (1) it bloats WindowAggState which we want to avoid if possible
>>> (2) it requires execnodes.h to include regex.h. (maybe this itself is harmless, I am not sure)
> 
> With the static function-scope variables, those data persist across queries, which is error prone. I am afraid that
evenif I let this pass, other reviewers might have the same concern.
 
> 
> Maybe define a sub structure, say WindowAggCache, and put a pointer of WindowAggCache in WindowAggState, and only
allocatememory when a query needs to.
 
> 
>>> 
>>>> 14 - 0005 - nodeWindowAgg.c
>>>> ```
>>>> + MemoryContext oldContext = MemoryContextSwitchTo(TopMemoryContext);
>>>> +
>>>> + if (regcache != NULL)
>>>> + pg_regfree(regcache); /* free previous re */
>>>> +
>>>> + /* we need to convert to char to pg_wchar */
>>>> + plen = strlen(pattern);
>>>> + data = (pg_wchar *) palloc((plen + 1) * sizeof(pg_wchar));
>>>> + data_len = pg_mb2wchar_with_len(pattern, data, plen);
>>>> + /* compile re */
>>>> + sts = pg_regcomp(&preg, /* compiled re */
>>>> + data, /* target pattern */
>>>> + data_len, /* length of pattern */
>>>> + cflags, /* compile option */
>>>> + C_COLLATION_OID /* collation */
>>>> + );
>>>> + pfree(data);
>>>> +
>>>> + MemoryContextSwitchTo(oldContext);
>>>> ```
>>>> 
>>>> Here in do_pattern_match, it switches to top memory context and compiles the re and stores to “preg". Based on the
commentof pg_regcomp:
 
>>>> ```
>>>> /*
>>>> * pg_regcomp - compile regular expression
>>>> *
>>>> * Note: on failure, no resources remain allocated, so pg_regfree()
>>>> * need not be applied to re.
>>>> */
>>>> int
>>>> pg_regcomp(regex_t *re,
>>>> const chr *string,
>>>> size_t len,
>>>> int flags,
>>>> Oid collation)
>>>> ```
>>>> 
>>>> “preg” should be freed by pg_regfree(), given the memory is allocated in TopMemoryContext, this looks like a
memoryleak.
 
>>> 
>>> I admit current patch leaves the memory unfreed even after a query
>>> finishes. What about adding something like:
>>> 
>>> static void do_pattern_match_end(void)
>>> {
>>> if (regcache != NULL)
>>> pg_regfree(regcache);
>>> }
>>> 
>>> and let ExecEndWindowAgg() call it?
>> 
> 
> I’m not sure. But I think if we move the data into WindowAggState, then I guess we don’t have to use TopmemoryContext
here.

I decided to add new fields to WindowAggState:

    /* regular expression compiled result cache. Used for RPR. */
    char       *patbuf;            /* pattern to be compiled */
    regex_t        preg;            /* compiled re pattern */

Then allocate the memory for them using
winstate->ss.ps.ps_ExprContext->ecxt_per_query_memory;

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp



pgsql-hackers by date:

Previous
From: shveta malik
Date:
Subject: Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
Next
From: Pavel Stehule
Date:
Subject: Re: Migrate to autoconf 2.72?