Re: Row pattern recognition - Mailing list pgsql-hackers
| From | Tatsuo Ishii |
|---|---|
| Subject | Re: Row pattern recognition |
| Date | |
| Msg-id | 20251124.135725.1126381089646688370.ishii@postgresql.org Whole thread Raw |
| In response to | Re: Row pattern recognition (Chao Li <li.evan.chao@gmail.com>) |
| List | pgsql-hackers |
Hi Chao,
Thank you for the review!
>> On Nov 20, 2025, at 15:33, Chao Li <li.evan.chao@gmail.com> wrote:
>>
>>
>> I’d stop here, and continue 0005 tomorrow.
>>
>
> I reviewed 0005 today. I'm still not very familiar with the executor code, so going through 0005 has also been a
valuablelearning process for me.
>
> 12 - 0005 - nodeWindowAgg.c
> ```
> + if (string_set_get_size(str_set) == 0)
> + {
> + /* no match found in the first row */
> + register_reduced_frame_map(winstate, original_pos, RF_UNMATCHED);
> + destroyStringInfo(encoded_str);
> + return;
> + }
> ```
>
> encoded_str should also be destroyed if not entering the “if” clause.
Subsequent string_set_discard() will free encode_str since encoded_str
is part of str_set. So no need to call destroyStringInfo(encoded_str)
in not entering "if" clause.
string_set_discard(str_set);
So I think this is Ok.
> 13 - 0005 - nodeWindowAgg.c
> ```
> static
> int
> do_pattern_match(char *pattern, char *encoded_str, int len)
> {
> static regex_t *regcache = NULL;
> static regex_t preg;
> static char patbuf[1024]; /* most recent 'pattern' is cached here */
> ```
>
> Using static variable in executor seems something I have never seen, which may persistent data across queries. I
thinkusually per query data are stored in state structures.
Yeah, such a usage maybe rare. But I hesitate to store the data
(regex_t) in WindowAggState because:
(1) it bloats WindowAggState which we want to avoid if possible
(2) it requires execnodes.h to include regex.h. (maybe this itself is harmless, I am not sure)
> 14 - 0005 - nodeWindowAgg.c
> ```
> + MemoryContext oldContext = MemoryContextSwitchTo(TopMemoryContext);
> +
> + if (regcache != NULL)
> + pg_regfree(regcache); /* free previous re */
> +
> + /* we need to convert to char to pg_wchar */
> + plen = strlen(pattern);
> + data = (pg_wchar *) palloc((plen + 1) * sizeof(pg_wchar));
> + data_len = pg_mb2wchar_with_len(pattern, data, plen);
> + /* compile re */
> + sts = pg_regcomp(&preg, /* compiled re */
> + data, /* target pattern */
> + data_len, /* length of pattern */
> + cflags, /* compile option */
> + C_COLLATION_OID /* collation */
> + );
> + pfree(data);
> +
> + MemoryContextSwitchTo(oldContext);
> ```
>
> Here in do_pattern_match, it switches to top memory context and compiles the re and stores to “preg". Based on the
commentof pg_regcomp:
> ```
> /*
> * pg_regcomp - compile regular expression
> *
> * Note: on failure, no resources remain allocated, so pg_regfree()
> * need not be applied to re.
> */
> int
> pg_regcomp(regex_t *re,
> const chr *string,
> size_t len,
> int flags,
> Oid collation)
> ```
>
> “preg” should be freed by pg_regfree(), given the memory is allocated in TopMemoryContext, this looks like a memory
leak.
I admit current patch leaves the memory unfreed even after a query
finishes. What about adding something like:
static void do_pattern_match_end(void)
{
if (regcache != NULL)
pg_regfree(regcache);
}
and let ExecEndWindowAgg() call it?
> Okay, I’d stop here and continue to review 0006 next week.
>
> Best regards,
> --
> Chao Li (Evan)
> HighGo Software Co., Ltd.
> https://www.highgo.com/
>
>
>
>
pgsql-hackers by date: