Re: [SQL] selecting from indexes - Mailing list pgsql-sql

From Tim Joyce
Subject Re: [SQL] selecting from indexes
Date
Msg-id 000201bf32d3$28805310$0501a8c0@noonoo
Whole thread Raw
In response to Re: [SQL] selecting from indexes  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [SQL] selecting from indexes  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-sql
> > the query above selects 294072 rows, which i obviously don't want to do,
but
> > I do want to use the clause above in a query that involves a join.  eg
>
> Ah, I begin to understand.  With an index scan you're going to get
> 294072 probes into the table (maybe even more, if there are deleted rows
> that match the category_key).  If the rows are scattered all over the
> disk then that may actually take about 300k disk reads.  After you
> cluster the table, the rows with the same category_key are all
> contiguous in the table, so many fewer blocks have to be read to visit
> them all.  That's why clustering helps here.

Indeed, and this lead me to see if there was a way of getting the ids
without hitting the books table, but it doesn't look like there is :(

>
> Since you're selecting about 1/4th of the table, this particular query
> would probably be better off *not* using the index, but just doing a
> sequential scan of the whole table :-(.  I assume most of your
> categories are more selective than this one, though, so dropping the
> category index entirely is probably not the answer.
>
> > select id from books, book_words where book_words.word='happy' and
> > book_words.id = books.id and books.category_key=1471;
>
> If this is what you're really doing, I think what you actually want is
> indexes on book_words.word and books.id.

I have indexes on both of these.

>That would allow book_words
> to be searched on the word (hopefully giving a more selective result
> than the category does), and then books would be probed using the id
index.
> id has unique values, right?

yep, but how can i use this subset to then select for category?  perhaps at
this stage, I should start to do things in the application code?

>
> > perhaps (in the above query) there is a way of directing postgres to
only
> > access the books that are selected by the 'words' part of the query?
>
> You might want to look at contrib/fulltextindex in the distribution for
> ideas about indexing words.  fulltextindex might be overkill for your
> needs, or maybe not, but you could probably adapt it for your purposes.

we have adapted this already (taking out the reg expression stuff so that it
is a bit quicker)

thanks for your help

timj

>
> regards, tom lane



pgsql-sql by date:

Previous
From: Tom Lane
Date:
Subject: Re: [SQL] selecting from indexes
Next
From: Tom Lane
Date:
Subject: Re: [SQL] selecting from indexes