RE: Ideas for building a system that parses medical research publications/articles [EXT] - Mailing list pgsql-general

From Daniel Perrett
Subject RE: Ideas for building a system that parses medical research publications/articles [EXT]
Date
Msg-id a79013c5c1d94d85b3498fc298218efc@sanger.ac.uk
In response to Ideas for building a system that parses medical research publications/articles  (Achilleas Mantzios <achill@matrix.gatewaynet.com>)
List pgsql-general
I think the key word that will help you here is "biocuration". It is an established field involving people with
scientific, computational, and linguistic backgrounds who are familiar with the problem space, so I would suggest talking
to people working in this area first to get an idea of what's feasible, what's already out there, etc., as they will
know this better than the Postgres community.
 

You can see an example of the sort of annotation that is fully automated at the moment here:

https://monarchinitiative.org/tools/text-annotate

Given the potential impact on human health, some level of manual involvement in annotation is frequently part of the
workflow.

Daniel

-----Original Message-----
From: Achilleas Mantzios <achill@matrix.gatewaynet.com> 
Sent: 05 June 2021 10:49
To: pgsql-general@lists.postgresql.org
Subject: Ideas for building a system that parses medical research publications/articles [EXT]

Hello

I am imagining a system that can parse papers from various sources
(web/files/etc) and in various formats (text, pdf, etc) and can store metadata for each paper: some kind of global ID
if applicable, authors, areas of research, whether the paper is "new", "highlighted", "historical", type (e.g. Case
reports, Clinical trials), symptoms (e.g. tics, GI pain, psychological changes, anxiety), and other key attributes
(I guess dynamic). It must be full text searchable, etc.
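
[Editorial sketch] To make the metadata described above concrete, here is a minimal Python sketch of one possible record shape plus a naive in-memory keyword search. Everything in it is an assumption for illustration: the class name PaperRecord, the field names, the DOI example, and the search function are hypothetical and are not part of any existing tool. In PostgreSQL the dynamic attributes would probably map to a jsonb column and the search to the built-in full text search (tsvector/tsquery); the pure-Python search here only stands in for that idea.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PaperRecord:
    """Hypothetical record for one parsed paper; all field names are illustrative."""
    title: str
    full_text: str
    global_id: Optional[str] = None      # assumed to be something like a DOI, if one exists
    authors: List[str] = field(default_factory=list)
    research_areas: List[str] = field(default_factory=list)
    status: str = "new"                  # "new", "highlighted", or "historical"
    paper_type: Optional[str] = None     # e.g. "Case report", "Clinical trial"
    symptoms: List[str] = field(default_factory=list)
    attributes: Dict[str, str] = field(default_factory=dict)  # dynamic key attributes


def tokenize(text: str) -> set:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(papers: List[PaperRecord], query: str) -> List[PaperRecord]:
    """Return the papers whose title, text, or symptoms contain every query word."""
    wanted = tokenize(query)
    hits = []
    for paper in papers:
        haystack = tokenize(" ".join([paper.title, paper.full_text, *paper.symptoms]))
        if wanted <= haystack:  # every query word appears somewhere in the paper
            hits.append(paper)
    return hits


papers = [
    PaperRecord(
        title="Tics after infection",
        full_text="A case report describing tics and anxiety.",
        paper_type="Case report",
        symptoms=["tics", "anxiety"],
    ),
    PaperRecord(
        title="GI pain cohort",
        full_text="A clinical trial on GI pain.",
        paper_type="Clinical trial",
        symptoms=["GI pain"],
        attributes={"cohort_size": "120"},  # made-up value, only to show a dynamic attribute
    ),
]

matches = search(papers, "tics anxiety")
```

This matches whole words only and does no stemming or ranking, which is exactly what a real full text search engine would add.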
 

I am at the very beginning of this, and it is being done on a fully volunteer basis.

Lots of questions: is there any scientific/scholarly analysis software already available? If there is, and it is really
good and open source, then this will influence the rest of the decisions. Otherwise, I'll have to form a team that can
write one; in that case I'll have to decide on DB, language, etc. I have worked with pgsql for 20 years, so it is the
natural choice for any kind of data; I just ask this for the sake of completeness.
 

All ideas welcome.







