Re: tsearch2 and pdf files - Mailing list pgsql-general

From	Henrik Zagerholm
Subject	Re: tsearch2 and pdf files
Date	December 11, 2006 18:06:14
Msg-id	179575E2-2F49-427F-9961-CEE966187950@mac.se
In response to	Re: tsearch2 and pdf files ("Philip Johnson" <philip.johnson@atempo.com>)
Responses	Re: tsearch2 and pdf files
List	pgsql-general

1. Convert the PDF to a text file with, e.g., xpdf.
2. Insert the parsed text into a table of your choice.
3. Build tsearch2 vectors (tsvector) from the text.
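
The three steps above could be sketched roughly as follows in Python. This is only an illustration, not a tested recipe: the table and column names (pdf_docs, filename, body, vec) are made up for the example, it assumes xpdf's pdftotext is on the PATH, that tsearch2 is installed with its 'default' configuration, and that you pass in a DB-API cursor using %s placeholders (as psycopg2 does).

```python
import subprocess

# Hypothetical table layout, invented for this sketch: the raw text plus
# a tsvector column that tsearch2 can search.
CREATE_TABLE_SQL = """
CREATE TABLE pdf_docs (
    id       serial PRIMARY KEY,
    filename text NOT NULL,
    body     text,
    vec      tsvector
)
"""

# A GiST index on the vector column keeps searches fast.
CREATE_INDEX_SQL = "CREATE INDEX pdf_docs_vec_idx ON pdf_docs USING gist (vec)"

# Steps 2 and 3 in one statement: store the text and its vector.
# 'default' is assumed to be the tsearch2 configuration in use.
INSERT_SQL = (
    "INSERT INTO pdf_docs (filename, body, vec) "
    "VALUES (%s, %s, to_tsvector('default', %s))"
)

SEARCH_SQL = (
    "SELECT filename FROM pdf_docs "
    "WHERE vec @@ to_tsquery('default', %s)"
)


def pdftotext_command(pdf_path):
    """Step 1: build the pdftotext call; '-' writes the text to stdout."""
    return ["pdftotext", "-enc", "UTF-8", pdf_path, "-"]


def extract_text(pdf_path):
    """Run pdftotext and return the extracted text as a string."""
    result = subprocess.run(
        pdftotext_command(pdf_path), check=True, stdout=subprocess.PIPE
    )
    return result.stdout.decode("utf-8", errors="replace")


def index_text(cursor, filename, text):
    """Insert one document; the database builds the tsvector."""
    cursor.execute(INSERT_SQL, (filename, text, text))


def search(cursor, query):
    """Return the filenames whose vector matches a tsquery string."""
    cursor.execute(SEARCH_SQL, (query,))
    return [row[0] for row in cursor.fetchall()]
```

With an open connection you would run CREATE_TABLE_SQL and CREATE_INDEX_SQL once, then call index_text(cur, path, extract_text(path)) for each file and commit.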

Cheers,


On 11 Dec 2006, at 18:23, Philip Johnson wrote:

> Do you know what kind of table I should use?
> Is there a shell script or a PHP script that does the work?
>
> regards
>
>> -----Original Message-----
>> From: pgsql-general-owner@postgresql.org
>> [mailto:pgsql-general-owner@postgresql.org] On behalf of Hannes Dorbath
>> Sent: Monday, 11 December 2006 12:21
>> To: pgsql-general@postgresql.org
>> Subject: Re: [GENERAL] tsearch2 and pdf files
>>
>> You just need software that extracts the text from it. Search Google
>> for pdf2txt and others. Printer drivers that try to get text from
>> anything are available as well.
>>
>>
>> On 11.12.2006 11:41, Philip Johnson wrote:
>>> I'm using PostgreSQL 8.1.5.
>>>
>>> Tsearch2 is installed and runs well.
>>>
>>> I'd like to use tsearch2 to index PDF files.
>>>
>>> Does someone have a detailed process to implement that?
>>
>>
>> --
>> Regards,
>> Hannes Dorbath
>>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 5: don't forget to increase your free space map settings
