Re: Could use some advice on search architecture - Mailing list pgsql-general

From Susan Cassidy
Subject Re: Could use some advice on search architecture
Date
Msg-id CAE3Q8o=WRnc8h3J_D1UBLA1yfKHf4vusTSD5WCC8jmGaearRHA@mail.gmail.com
Whole thread Raw
In response to Could use some advice on search architecture  (Ron Pasch <postgresql@ronpasch.nl>)
List pgsql-general
First, I would not restrict color to 30 colors, if anything like furniture or clothing, etc. is involved.  Colors are very important to consumers, and exact colors are important.  I would re-think my color selections.

Make sure you have indexes on all the appropriate columns, of course.

Susan


On Fri, Apr 18, 2014 at 6:59 AM, Ron Pasch <postgresql@ronpasch.nl> wrote:
Hello,

I'm contemplating what architecture I should use to make searching as fast as possible given the information available and the search requirements. Let me give some background first;

- The database contains products of can potentially have a lot of them (up to about 3 to 5 million)
- Each product has about 30 different properties defined about them. Things like what color they are etc. All these properties are enumerated choices, so for instance for color there is a list of available static never changing options of which one can be chosen for that product. This is the same for all those 30 properties. Currently they are stored as enumerated types (CREATE TYPE propertyvalue AS ENUM ('option1', 'option2', etc..)
- It should be possible to search for products and provide properties that the product SHOULD have, not must have. For instance, for color, the search could specify that it should return products that are either red, blue or green.
- The products that match with the most properties should be in the top of the search results
- If different products match with the same amount of properties, the ordering should then be on the product that is most popular. There is information in the database (and if need be also in the same table) about how many times a product is sold.
- The results will be paginated per 15 products

The requirement is that these searches should be as fast as possible, with a maximum of about 200 ms time taken for a search query.

What would be the best approach to this if I were to do this in the database only? Should/can this be done with postgresql only or should I look into other types of technology? (Lucene? Sphinx? others?)

Any advice on this would be greatly appreciated.

Thx in advance!

Ron



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

pgsql-general by date:

Previous
From: Adrian Klaver
Date:
Subject: Re: Arduino SQL Connector
Next
From: Robin
Date:
Subject: Re: Arduino SQL Connector