Scoring - Mailing list pgsql-general

From Eric Jain
Subject Scoring
Msg-id NCBBJFHBEGOIAHBCBNCLAEJACIAA.jain@gmx.net
List pgsql-general
Any tips on how to efficiently score text fields against AltaVista-like
query strings?

I currently use the following PL/Perl function, which unfortunately is
rather slow, even though I have already simplified it quite a bit...

# Example: SELECT id, score(description, 'a?pha -"beta gamma"') FROM table;

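CREATE FUNCTION score(TEXT, VARCHAR) RETURNS INT2 AS
'
  my @regex = ();
  my $score = 0;

  # Extract quoted phrases first, keeping an optional leading "-".
  $_[1] =~ s{(-)?"(.+?)"}{ push(@regex, (defined $1 ? $1 : "") . $2); "" }egs;
  # Split the rest on whitespace, dropping empty strings (an empty term
  # would match at every word boundary and inflate the score).
  push(@regex, grep { length } split(/\\s+/, $_[1]));

  foreach (@regex)
  {
    # "?" in the query matches any single character.
    s/\\?/\\./g;

    if (s/^-//)
    {
      # Excluded term: any match disqualifies the row.
      return 0 if ($_[0] =~ /\\b$_/i);
    }
    else
    {
      # Required term: it must match at least once; each match adds one.
      my @matches = $_[0] =~ /\\b$_/gi;
      return 0 unless @matches;
      $score += scalar @matches;
    }
  }

  return $score;
'
LANGUAGE 'plperl';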
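
To make the query syntax concrete, here is a minimal standalone sketch of just the parsing step, runnable outside the database. The helper name parse_query is hypothetical and not part of the function above. It drops empty terms, which may differ from what the function does with repeated spaces.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the score() function): split an
# AltaVista-like query string into a list of terms. Quoted phrases are
# kept whole, and a leading "-" marks a term that must not occur.
sub parse_query {
    my ($query) = @_;
    my @terms;

    # Pull out quoted phrases first, keeping an optional leading "-".
    $query =~ s{(-)?"(.+?)"}{ push @terms, (defined $1 ? $1 : '') . $2; '' }egs;

    # Split whatever is left on whitespace and drop empty pieces.
    push @terms, grep { length } split /\s+/, $query;

    return @terms;
}

# Prints: -beta gamma|a?pha
print join('|', parse_query('a?pha -"beta gamma"')), "\n";
```

Note that phrases end up ahead of the single words, so the terms are not scored in the order they were typed.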


--
Eric Jain

