Thread: TSearch: CLUSTER using GIST, query using GIN?

TSearch: CLUSTER using GIST, query using GIN?

From

Hannes Dorbath

Date:

19 January 2008, 10:24:50

Does it make any sense to CLUSTER by a GIST index to move tuples with
similar lexemes physically closer together on disc, then drop that index
and use GIN for the actual queries?

My queries are currently bound by HDD seek speed. Might the above help
me, or could it even be counterproductive?


--
Best regards,
Hannes Dorbath

Re: TSearch: CLUSTER using GIST, query using GIN?

From

Oleg Bartunov

Date:

19 January 2008, 11:10:33

On Sat, 19 Jan 2008, Hannes Dorbath wrote:

> Does it make any sense to CLUSTER by a GIST index to move tuples with similar
> lexemes physically closer together on disc, then drop that index and use GIN
> for the actual queries?
>
> My queries are currently bound by HDD seek speed. Might the above help me, or
> could it even be counterproductive?

What do you want to speed up? Search itself is very fast; see EXPLAIN
ANALYZE. The problem is usually in accessing the documents found in order
to calculate rank and headlines. If GIN returns N documents, you need to
read all of them to calculate rank, and that is where you get the slowdown.

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

Re: TSearch: CLUSTER using GIST, query using GIN?

From

Hannes Dorbath

Date:

19 January 2008, 11:24:50

Oleg Bartunov wrote:
> What do you want to speed up? Search itself is very fast; see EXPLAIN
> ANALYZE. The problem is usually in accessing the documents found in order
> to calculate rank and headlines. If GIN returns N documents, you need to
> read all of them to calculate rank, and that is where you get the slowdown.

I don't use headline or rank yet. It's a pure test setup. The table is
around 130GB in size. The disc subsystem is able to deliver around
430MB/sec sequential read, but it dies under random seek activity. To my
understanding this is because MVCC has to check the visibility of each
matching row. Now I thought I could improve that by moving similar rows
physically closer together. It's static data, it won't change at all.
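
A rough sketch of the reasoning above, not a measurement: it counts how many
distinct heap pages must be visited to check N matching rows when they are
scattered at random versus stored contiguously. The 8 kB page size is the
PostgreSQL default; the row size, row counts, and the function names
pages_touched and simulate are assumptions made up for this illustration.
Contiguous placement is the idealised best case, and a real CLUSTER on a GIST
index may group rows far less neatly for any given query.

```python
import random

PAGE_SIZE = 8192   # PostgreSQL default block size in bytes
ROW_SIZE = 1024    # assumed average row size in bytes (hypothetical)
ROWS_PER_PAGE = PAGE_SIZE // ROW_SIZE


def pages_touched(row_positions):
    """Count the distinct heap pages that hold the given row positions."""
    return len({pos // ROWS_PER_PAGE for pos in row_positions})


def simulate(total_rows, matches, seed=0):
    """Return (scattered_pages, clustered_pages) for one simulated query."""
    rng = random.Random(seed)
    # Unclustered table: matching rows sit at random positions in the heap.
    scattered = rng.sample(range(total_rows), matches)
    # Idealised clustering: all matching rows sit next to each other.
    start = rng.randrange(total_rows - matches)
    clustered = range(start, start + matches)
    return pages_touched(scattered), pages_touched(clustered)


if __name__ == "__main__":
    scattered, clustered = simulate(total_rows=1_000_000, matches=1_000)
    print("pages read, scattered rows:", scattered)
    print("pages read, clustered rows:", clustered)
```

Under these assumptions nearly every scattered match costs its own page read,
and so potentially its own seek, while the contiguous case needs roughly
matches / ROWS_PER_PAGE page reads that can be fetched sequentially.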

--
Best regards,
Hannes Dorbath