Any tips on how to efficiently score fields against AltaVista-like query
strings?
I currently use the following PL/Perl function, which unfortunately is
rather slow, even though I have already simplified it quite a bit...
-- Example:
--   SELECT id, score(description, 'a?pha -"beta gamma"') FROM table;
CREATE FUNCTION score(TEXT, VARCHAR) RETURNS INT2 AS
'
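my ($text, $query) = @_;
my @terms = ();
my $score = 0;

# Pull out quoted phrases first, keeping an optional leading minus.
$query =~ s{(-?)"(.+?)"}{ push(@terms, $1 . $2); "" }egs;
# The remaining terms are separated by whitespace; skip empty ones.
push(@terms, grep { length } split(/\\s+/, $query));

foreach my $term (@terms)
{
    # A leading minus marks a term that must not occur in the text.
    my $exclude = ($term =~ s/^-//);
    # Escape regex metacharacters, then make ? a one-character wildcard.
    my $pattern = join(".", map { quotemeta($_) } split(/\\?/, $term, -1));
    next unless length $pattern;
    my $re = qr/\\b$pattern/i;

    if ($exclude)
    {
        return 0 if $text =~ $re;
    }
    else
    {
        # Count all matches; a required term with no match scores 0.
        my $count = () = $text =~ /$re/g;
        return 0 unless $count;
        $score += $count;
    }
}
return $score;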
'
LANGUAGE 'plperl';
--
Eric Jain