1. Convert the PDF to a text file, e.g. with pdftotext from xpdf.
2. Insert the parsed text into a table of your choice.
3. Build tsvectors from the text.
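
The three steps above can be sketched as a small script. This is only a rough illustration, not a tested recipe: the table name pdf_docs, the column name idxfti, the script name pdf2sql.py and the database name mydb are made up for the example, and it assumes that pdftotext is on your PATH and that tsearch2's 'default' configuration is the one you want. It prints SQL that you can pipe into psql.

```python
import subprocess
import sys

# Hypothetical table, column and index names, chosen for this sketch only.
# Run this once by hand before loading any documents.
SCHEMA_SQL = """\
CREATE TABLE pdf_docs (
    id       serial PRIMARY KEY,
    filename text NOT NULL,
    body     text,
    idxfti   tsvector
);
CREATE INDEX pdf_docs_fti ON pdf_docs USING gist (idxfti);
"""

# Step 3: build the vectors. Assumes tsearch2's 'default' configuration.
VECTOR_SQL = (
    "UPDATE pdf_docs SET idxfti = to_tsvector('default', body) "
    "WHERE idxfti IS NULL;"
)


def pdftotext_command(pdf_path):
    """Step 1: command line for xpdf's pdftotext; '-' sends the text to stdout."""
    return ["pdftotext", "-enc", "UTF-8", pdf_path, "-"]


def extract_text(pdf_path):
    """Run pdftotext and return the extracted text as a string."""
    result = subprocess.run(
        pdftotext_command(pdf_path), stdout=subprocess.PIPE, check=True
    )
    # NUL bytes cannot be stored in a text column, so drop them.
    return result.stdout.decode("utf-8", errors="replace").replace("\x00", "")


def dollar_quote(text):
    """Quote text as a dollar-quoted literal, picking a tag the text lacks."""
    n = 0
    while "$q%d" % n in text:
        n += 1
    return "$q%d$%s$q%d$" % (n, text, n)


def insert_sql(filename, body):
    """Step 2: INSERT statement that stores one document."""
    return "INSERT INTO pdf_docs (filename, body) VALUES (%s, %s);" % (
        dollar_quote(filename),
        dollar_quote(body),
    )


def main(argv):
    for path in argv[1:]:
        print(insert_sql(path, extract_text(path)))
    if argv[1:]:
        print(VECTOR_SQL)


if __name__ == "__main__":
    main(sys.argv)
```

Something like `python pdf2sql.py *.pdf | psql mydb` would then load the files, and a query along the lines of `SELECT filename FROM pdf_docs WHERE idxfti @@ to_tsquery('default', 'word');` should find them. Dollar quoting is used so that quotes and backslashes in the extracted text need no escaping.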
Cheers,
On 11 Dec 2006, at 18:23, Philip Johnson wrote:
> Do you know what kind of table I should use?
> Is there a shell script or a PHP script that does the work?
>
> regards
>
>> -----Original Message-----
>> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Hannes Dorbath
>> Sent: Monday, 11 December 2006 12:21
>> To: pgsql-general@postgresql.org
>> Subject: Re: [GENERAL] tsearch2 and pdf files
>>
>> You just need software that extracts the text from it. Search Google for
>> pdf2txt and others. Printer drivers that try to get text from anything
>> are available as well.
>>
>>
>> On 11.12.2006 11:41, Philip Johnson wrote:
>>> I'm using Postgresql 8.1.5
>>>
>>> Tsearch2 is installed and runs well
>>>
>>> I'd like to use tsearch2 to index PDF files.
>>>
>>> Does someone have a detailed process to implement that?
>>
>>
>> --
>> Regards,
>> Hannes Dorbath
>>
>> ---------------------------(end of
>> broadcast)---------------------------
>> TIP 5: don't forget to increase your free space map settings