Re: text_position worst case runtime - Mailing list pgsql-hackers

From Tom Lane
Subject Re: text_position worst case runtime
Date
Msg-id 23183.1148000052@sss.pgh.pa.us
Whole thread Raw
In response to text_position worst case runtime  (Mark Dilger <pgsql@markdilger.com>)
Responses Re: text_position worst case runtime
List pgsql-hackers
Mark Dilger <pgsql@markdilger.com> writes:
> The function
>   static int32 text_position(text *t1, text *t2, int matchnum)
> defined in src/backend/utils/adt/varlena.c uses repeated calls to
> strncmp (or pg_wchar_strncmp) to find the location of the pattern in
> the text.  The worst case runtime for such an approach is O(n*m) where
> n and m are the lengths of the pattern and text.  The best case would
> be O(n), I guess, because it only takes n comparisons to find the
> pattern at the very start of the text.  I'm not sure how to determine
> the average case runtime, because it depends what your data looks like
> on average.

I would think that the worst-case times would be fairly improbable.
I'm disinclined to push something as complicated as Boyer-Moore matching
into this function without considerable evidence that it's a performance
bottleneck for real applications.

The question that comes to mind for me is why we're not using simple
strncmp in all cases in that code.  The conversion to pg_wchar format
looks expensive and unnecessary ...
        regards, tom lane


pgsql-hackers by date:

Previous
From: Mark Dilger
Date:
Subject: Re: PL/pgSQL 'i = i + 1' Syntax
Next
From: Christopher Kings-Lynne
Date:
Subject: Re: [OT] MySQL is bad, but THIS bad?