electronic-izing unicode texts - Mailing list pgsql-general

From A. Cropi
Subject electronic-izing unicode texts
Date
Msg-id 3076735805042011276a653103@mail.gmail.com
Responses Re: electronic-izing unicode texts  (Richard Huxton <dev@archonet.com>)
List pgsql-general
hi everyone,

i have several hundred books that were typed using unicode and would
like to put them into a database so that i can perform searches on
them.  how does one design a database for this?

i was planning to make a table with these columns: ID, Title, Authors,
Publishers, Content

the Content column will contain the entire book in unicode; then, to
find out which books contain the string "blah" i'd just do something
like: select * from table where content contains "blah"
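
to make the idea concrete, here is a rough sketch of that table and search. everything beyond the column names is an assumption: the table name books, the helper find_books and the sample rows are made up, and python's built-in sqlite3 stands in for postgresql only so the sketch runs on its own. SQL has no "contains" operator, so the sketch uses LIKE with % wildcards instead.

```python
import sqlite3

# sqlite3 (in-memory) is only a stand-in so the sketch runs anywhere;
# the column names follow the post, the table name and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE books (
        id         INTEGER PRIMARY KEY,
        title      TEXT NOT NULL,
        authors    TEXT,
        publishers TEXT,
        content    TEXT NOT NULL  -- the whole book as one unicode string
    )
    """
)
conn.executemany(
    "INSERT INTO books (title, authors, publishers, content) VALUES (?, ?, ?, ?)",
    [
        ("First Book", "Author A", "Publisher X", "some text with blah in it"),
        ("Second Book", "Author B", "Publisher Y", "nothing relevant here"),
        ("Third Book", "Author C", "Publisher X", "ünïcödé text, then blah again"),
    ],
)


def find_books(conn, needle):
    """Return (id, title) for every book whose content contains needle."""
    # Escape the LIKE wildcards so the search string is matched literally.
    escaped = needle.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    pattern = "%" + escaped + "%"
    # The pattern is passed as a bound parameter, never pasted into the SQL.
    rows = conn.execute(
        "SELECT id, title FROM books WHERE content LIKE ? ESCAPE '\\' ORDER BY id",
        (pattern,),
    )
    return rows.fetchall()


matches = find_books(conn, "blah")
for book_id, title in matches:
    print(book_id, title)
```

one difference to keep in mind: sqlite's LIKE ignores case for ASCII letters, while postgresql's LIKE is case-sensitive (ILIKE is the case-insensitive form there). a LIKE '%...%' search also has to read every book in full, which is part of why question (2) below matters.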

my problem is: (1) i have never done database work before (2) i do not
have any experience in anything like this

my objectives: (1) allow users to make queries through the web (i guess
i will do this via PHP interacting with postgresql)

my questions are: (1) is it reasonable to put the book content into the
CONTENT column? (2) the content of a book can be very long (some of
them have nearly 1 million words), so what kind of considerations
should i be making? (3) how should i design something like this? there
must be someone out there who has done something similar to this. if
so, please share your experiences.

note: these texts are not copyrighted, so i do not have to worry
about legal problems.

tia
