Thread: searchable book database
Hi,

I need to make a database of books: several books on specific subjects that need to be searchable.

Is it viable to store the complete book text in a database and search inside it? Or should I keep only the metadata (name, author, filename, etc.) in the DB, keep the book files on the hard disk, and use some sort of search algorithm on the files? If you agree with the second option, what would you suggest for text file searching? It's for a web project, so how could I go about doing this (PHP, Python...)?

Thanks.
MV
CLucene is one possibility:
http://sourceforge.net/projects/clucene/
Since you are asking in the PostgreSQL group, why not use the built-in full text search:
http://www.postgresql.org/docs/8.4/static/textsearch.html
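To make the built-in option concrete, here is a minimal sketch in Python that only builds the SQL and parameters for a full text search. The table and column names (books, title, author, body) and the helper build_search are hypothetical, and the 'english' configuration is an assumption. The functions to_tsvector, plainto_tsquery and ts_rank and the @@ operator are described in the documentation linked above.

```python
# Hypothetical schema: table and column names are placeholders, not from the thread.
SCHEMA_SQL = """
CREATE TABLE books (
    id     serial PRIMARY KEY,
    title  text NOT NULL,
    author text,
    body   text NOT NULL  -- full text of the book
);
-- GIN index so searches do not have to re-parse every book
CREATE INDEX books_body_fts ON books
    USING gin (to_tsvector('english', body));
"""

# Ranked search; %(name)s placeholders follow the DB-API "pyformat" style.
SEARCH_SQL = """
SELECT id, title, author,
       ts_rank(to_tsvector('english', body), query) AS rank
FROM books, plainto_tsquery('english', %(terms)s) AS query
WHERE to_tsvector('english', body) @@ query
ORDER BY rank DESC
LIMIT %(limit)s;
"""


def build_search(terms, limit=10):
    """Return (sql, params) ready to pass to a DB-API cursor.execute() call."""
    terms = terms.strip()
    if not terms:
        raise ValueError("search terms must not be empty")
    return SEARCH_SQL, {"terms": terms, "limit": int(limit)}
```

With a driver that accepts pyformat parameters you would call cursor.execute(*build_search("organic chemistry")). Note that a tsvector has a documented size limit, so very long books may need to be split into chapters or pages, one row each.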
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Sandeep Srinivasa
Sent: Thursday, August 19, 2010 10:11 PM
To: Miguel Vaz
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] searchable book database
If you never need to return the complete book text to a user (that is, you need the book text only for your search indexes), then keep the text in files and use Apache Solr to index it.
regards
Sandeep
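For the Solr route, a web application usually talks to the index over HTTP. The sketch below only builds a query URL and does not contact any server. The base URL, the core name "books" and the field names (text, id, title) are assumptions for illustration, and solr_select_url is a hypothetical helper. q, df, fl, rows and wt are standard Solr query parameters.

```python
from urllib.parse import urlencode


def solr_select_url(base_url, terms, rows=10):
    """Build a Solr select URL for a keyword search.

    base_url is assumed to point at a select handler, for example
    http://localhost:8983/solr/books/select (hypothetical host and core).
    """
    params = {
        "q": terms,        # the user's search terms
        "df": "text",      # default field to search (assumed field name)
        "fl": "id,title",  # fields to return (assumed field names)
        "rows": rows,      # maximum number of hits
        "wt": "json",      # ask for a JSON response
    }
    return base_url + "?" + urlencode(params)
```

The returned hits would carry only ids and titles. The book files themselves stay on disk, and the metadata can live in the database, as suggested above.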
On Thu, 19 Aug 2010 20:35:50 +0100 Miguel Vaz <pagongski@gmail.com> wrote:
> Hi,
>
> I need to make a database of books. Several specific subject books
> that are to be searchable.
>
> Is it viable to have the complete book text on a database and search
> inside it? Or should i consider keeping only its metadata (name,
> author, filename, etc) on the DB, keep the book file on the HD and
> use some sort of search algorithm on the file? If you agree on the
> second option, what would you guys suggest for text file searching?
> Its for a web project, so how could i go about doing this? (PHP,
> python...)
>
> Thanks.
>
> MV

Don't know if that's what you need, but you can set up a DocManager site. Check it at
http://wiki.docmgr.org/index.php/DocMGR_-_Document_Management and see
if it fills your needs.

HTH
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Thank you all for your replies. I had already read about Lucene in its general flavour and eventually found out about it being used with Zend Framework, but it seems there's a lot more out there.

I will plan for the second option: keep the books as files and build some search/index/hash/super-power-ninja engine to do all the hard work behind the scenes and deliver only the pretty bits to the users.

This won't be merely a search-and-find project: it will search, find, analyse/treat the results, etc., and then display the analysis.

Apache Solr... nice one, it seems very interesting. It also has an API, which may allow me to plug it into the Flex side of the interface.

Again, thank you all for the great information.

MV

On Fri, Aug 20, 2010 at 12:09 PM, Eduardo <emorras@xroff.net> wrote:
> Don't know if that's what you need but you can set up a DocManager
> site. Check it at
> http://wiki.docmgr.org/index.php/DocMGR_-_Document_Management and see
> if it fills your needs.
>
> HTH
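For anyone curious what the "index engine" behind the second option boils down to, here is a toy inverted index in Python. It is only a sketch of the general idea, not how Lucene or Solr are implemented, and the function names and the dict-of-texts input are made up for illustration. In practice the texts would be read from the book files and the ids would point at metadata rows in the database.

```python
import re
from collections import defaultdict

WORD_RE = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Lowercase the text and split it into simple alphanumeric words."""
    return WORD_RE.findall(text.lower())


def build_index(books):
    """Map each word to the set of book ids whose text contains it.

    books is a mapping of book id -> full text.
    """
    index = defaultdict(set)
    for book_id, text in books.items():
        for word in tokenize(text):
            index[word].add(book_id)
    return index


def search(index, query):
    """Return the ids of books that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

A real engine adds stemming, ranking and phrase queries on top of this, which is most of the reason to reach for Solr or the PostgreSQL text search rather than rolling your own.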
--
Filip Rembiałkowski
JID,mailto:filip.rembialkowski@gmail.com
http://filip.rembialkowski.net/