Guidelines on best indexing strategy for varying searches on 20+ columns - Mailing list pgsql-performance

From Niels Kristian Schjødt
Subject Guidelines on best indexing strategy for varying searches on 20+ columns
Date
Msg-id 79622954-DDEF-4A12-97A0-C67E99D64957@autouncle.com
Whole thread Raw
Responses Re: Guidelines on best indexing strategy for varying searches on 20+ columns  (Merlin Moncure <mmoncure@gmail.com>)
Re: Guidelines on best indexing strategy for varying searches on 20+ columns  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-performance
Hi,
I’m running a search engine for cars. It’s backed by a postgresql 9.3 installation.

Now I’m unsure about the best approach/strategy on doing index optimization for the fronted search.

The problem:

The table containing the cars holds a around 1,5 million rows. People that searches for cars needs different criteria
tosearch by. Some search by brand/model, some by year, some by mileage, some by price and some by special equipment
etc.etc. - and often they combine a whole bunch of criteria together. Of cause some, like brand/mode and price, are
usedmore frequently than others. In total we offer: 9 category criteria like brand/model or body type, plus 5 numeric
criterialike price or mileage, plus 12 boolean criteria like equipment. Lastly people can order the results by
differentcolumns (year, price, mileage and a score we create about the cars). By default we order by our own generated
score.

What I’ve done so far:

I have analyzed the usage of the criteria “lightly”, and created a few indexes (10). Among those, are e.g. indexes on
price,mileage and a combined index on brand/model. Since we are only interested in showing results for cars which is
actuallyfor sale, the indexes are made as partial indexes on a sales state column. 

Questions:

1. How would you go about analyzing and determining what columns should be indexed, and how?
2. What is the best strategy when optimizing indexes for searches happening on 20 + columns, where the use and the
combinationsvaries a lot? (To just index everything, to index some of the columns, to do combined indexes, to only do
singlecolumn indexes etc. etc.) 
3. I expect that it does not make sense to index all columns?
4. I expect it does not make sense to index boolean columns?
5. Is it better to do a combined index on 5 frequently used columns rather than having individual indexes on each of
them?
6. Would it be a goof idea to have all indexes sorted by my default sorting?
7. Do you have so experiences with other approaches that could greatly improve performance (e.g. forcing indexes to
stayin memory etc.)? 






pgsql-performance by date:

Previous
From: Brio
Date:
Subject: Re: postgres files in use not staying in linux file cache
Next
From: AJ Weber
Date:
Subject: how to improve perf of 131MM row table?