Thread: How to read an external pdf file from postgres?
Hi;
I want to read an external PDF file from Postgres. The PDF file will exist on the disk; Postgres only knows the file's full path, stored as metadata. Is there any software or extension that can be used for this? Or do we have to develop software for it? Or what is the best approach for this? I'd appreciate it if anyone with experience could make suggestions.
Thanks.
On 12.01.22 12:16, Amine Tengilimoglu wrote:
> I want to read an external pdf file from postgres. pdf file will
> exist on the disk. postgres only know the disk full path as metadata. Is
> there any software or extension that can be used for this? Or do we have
> to develop software for it? Or what is the best approach for this? I'd
> appreciate it if anyone with experience could make suggestions.
You could write a function in PL/Perl or PL/Python to open and read the file and process the PDF data, using some third-party module that surely exists somewhere.
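As a rough illustration of the PL/Python route, here is a minimal sketch of what the body of such a function might look like, written as plain Python so it can be tried outside the database first. The names read_pdf_bytes and extract_text are made up for this example. The text-extraction part assumes the third-party pypdf module is installed; that is only one possible choice and is not something named in this thread. Inside the server, code like this would need to be wrapped in a function written in the untrusted plpython3u language, and the file would be read with the operating-system permissions of the PostgreSQL server process.

```python
import os

# A PDF file is expected to begin with this marker; this sketch treats
# anything else as "not a PDF" (a deliberately strict check).
PDF_MAGIC = b"%PDF-"


def read_pdf_bytes(path):
    """Return the raw bytes of the file at `path` after a basic sanity check.

    `path` stands in for the full path stored as metadata in the table.
    """
    if not os.path.isabs(path):
        raise ValueError("expected an absolute path, got %r" % (path,))
    with open(path, "rb") as fh:
        data = fh.read()
    if not data.startswith(PDF_MAGIC):
        raise ValueError("%r does not start with a PDF header" % (path,))
    return data


def extract_text(path):
    """Return the text of every page, joined by newlines.

    Assumption: the third-party `pypdf` package is available. It is
    imported lazily so that read_pdf_bytes() works without it.
    """
    from pypdf import PdfReader

    reader = PdfReader(path)
    # extract_text() may return an empty result for pages without a text layer.
    return "\n".join(page.extract_text() or "" for page in reader.pages)
```

If only the raw file content is needed (for example to store it in a bytea column), read_pdf_bytes alone would be enough and no PDF module is required.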
What are you going to do with the data?
If you want to analyze it in some way, I can't think of a better option than a Python function. Or do you just want to transfer the file? There are other options for that too, but even in that case I like Python better.
--
Regards, Dmitry!
On Wed, 12 Jan 2022 at 16:16, Amine Tengilimoglu <aminetengilimoglu@gmail.com> wrote:
> Hi;
> I want to read an external pdf file from postgres. pdf file will exist on the disk. postgres only know the disk full path as metadata. Is there any software or extension that can be used for this? Or do we have to develop software for it? Or what is the best approach for this? I'd appreciate it if anyone with experience could make suggestions.
> Thanks.
On Wed, 12 Jan 2022 at 20:16, Amine Tengilimoglu <aminetengilimoglu@gmail.com> wrote:
>
> Hi;
>
> I want to read an external pdf file from postgres. pdf file will exist on the disk. postgres only know the disk full path as metadata. Is there any software or extension that can be used for this? Or do we have to develop software for it? Or what is the best approach for this? I'd appreciate it if anyone with experience could make suggestions.
By "read" do you mean "open the file and meaningfully extract data from it"?
If so, speaking from prior experience, don't. And if you really have to, make sure the source PDF is guaranteed to be in a well-defined, predictable format enforceable by contract law and/or people with sharp pointy sticks.
I have successfully suppressed the memories of whatever it is I once had to do with reading data from PDFs, but though the data was eventually imported into PostgreSQL, there was a lot of mangling, probably involving a Perl module (other languages are probably available), before it got anywhere near the database.
Regards
Ian Barwick
--
EnterpriseDB: https://www.enterprisedb.com
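To illustrate the point about insisting on a well-defined, predictable format, here is a minimal sketch of a strict check that could run on the extracted text before anything gets near the database. Everything about the layout is hypothetical and invented for this example: it assumes each non-empty line of text looks like "Name: value" and that the fields Invoice, Date and Total must all be present. The idea is simply to reject anything unexpected loudly instead of importing half-parsed data.

```python
import re

# Hypothetical layout: every non-empty line is "Name: value".
LINE_RE = re.compile(r"^(?P<key>[A-Za-z][A-Za-z ]*):\s*(?P<value>\S.*)$")

# Hypothetical set of fields that a valid document must contain.
REQUIRED_FIELDS = {"Invoice", "Date", "Total"}


class UnexpectedLayout(ValueError):
    """Raised when the extracted text does not follow the agreed layout."""


def parse_extracted_text(text):
    """Parse text already extracted from a PDF into a dict of fields.

    Any line that does not match the expected layout, any duplicate
    field, and any missing required field raises UnexpectedLayout.
    """
    record = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # blank lines are tolerated
        match = LINE_RE.match(line)
        if match is None:
            raise UnexpectedLayout(
                "line %d does not look like 'Name: value': %r" % (lineno, line)
            )
        key = match.group("key").strip()
        if key in record:
            raise UnexpectedLayout("duplicate field %r on line %d" % (key, lineno))
        record[key] = match.group("value").strip()
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        raise UnexpectedLayout("missing fields: %s" % sorted(missing))
    return record
```

Only records that pass a check of this kind would then be handed to the import step; the rest can be set aside for a human to look at.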