Re: Performance Question Followup No.2 - Mailing list pgsql-general

From Gordan Bobic
Subject Re: Performance Question Followup No.2
Date
Msg-id 200111071810.fA7IA9M05526@sentinel.bobich.net
Whole thread Raw
In response to Re: Performance Question Followup No.2  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
On Wednesday 07 Nov 2001 17:25, Tom Lane wrote:
> Gordan Bobic <gordan@bobich.net> writes:

Thanks for the reply.

> > After just having split the action into two parts (FTI delete + Master
> > delete), it would appear that most of the delay does come from the
> > triggers executing.
>
> I imagine that the problem is that the triggers have to delete the FTI
> records retail --- one master record's worth at a time.  That's
> inherently far less efficient than getting rid of all of them in a
> single query, as your comparison case is doing.  I see no easy way
> to get around that in the context of the existing FTI design.

Would that really explain such a HUGE difference in performance? Even without
any corresponding FTI records (if they are deleted first - I tried it)?

I am not talking about a few percent, or even factor 2 difference. I am
talking about a difference between 10 seconds to completion and aborting
after 45 minutes - on a 1 GHz machine.

> There is a new "tsearch" contrib module in 7.2 that might be worth your
> time to look at instead.  I'm not sure whether it's any better on this
> measure, but at least it's a fresh implementation...

I didn't use the FTI module implementation because again, it uses triggers -
this, yet again proved to be too slow. The query performance wasn't improved,
though, even with properly set up indices. In order to get it to be of
benefit I
1) Implemented it "in software" in the application layer.
2) Made it not insert duplicates.
3) Made it not do word-stemming/subwords.
4) Made the stop-word table separate (for ease of use - application reads
this).
5) Inserted in excess of 200 stop words (finding them all wasy hard work, and
it is a rather application specific thing to do) to get the Master/FTI ratio
to under 35 unique words/record.

Now the performance is slightly improved, although with enough memory and a
fast processor, the difference isn't all that great when compared to an ILIKE
search on the text fields. It's a few times faster, but I guess I was
expecting more...

Regards.

Gordan

pgsql-general by date:

Previous
From: Gordan Bobic
Date:
Subject: Re: More Performance Questions
Next
From: Javier Dussaillant
Date:
Subject: Re: Old Template0 is annoying me!