Re: Ideas for building a system that parses medical research publications/articles - Mailing list pgsql-general

From Adrian Klaver
Subject Re: Ideas for building a system that parses medical research publications/articles
Msg-id f12921a5-0409-8c8e-3032-60e95b9a4a5d@aklaver.com
In response to Re: Ideas for building a system that parses medical research publications/articles  (Achilleas Mantzios <achill@matrix.gatewaynet.com>)
Responses Re: Ideas for building a system that parses medical research publications/articles  (Achilleas Mantzios <achill@matrix.gatewaynet.com>)
List pgsql-general
On 6/5/21 10:39 AM, Achilleas Mantzios wrote:
> 
> On 5/6/21 8:03 PM, Adrian Klaver wrote:
>> On 6/5/21 9:56 AM, Achilleas Mantzios wrote:
>>>
>>> On 5/6/21 6:34 PM, Adrian Klaver wrote:
>>>> On 6/5/21 2:49 AM, Achilleas Mantzios wrote:
>>>>> Hello
>>>>>
>>>>> I am imagining a system that can parse papers from various sources 
>>>>> (web/files/etc) and in various formats (text, pdf, etc) and can 
>>>>> store metadata for each paper: some kind of global ID if 
>>>>> applicable, authors, areas of research, whether the paper is "new", 
>>>>> "highlighted", "historical", type (e.g. Case reports, Clinical 
>>>>> trials), symptoms (e.g. tics, GI pain, psychological changes, 
>>>>> anxiety), and other key attributes (I guess dynamic). It must be 
>>>>> full text searchable, etc.
>>>>>
>>>>> I am at the very beginning in this and it is done on a fully 
>>>>> volunteer basis.
>>>>>
>>>>> Lots of questions: is there any scientific/scholarly analysis 
>>>>> software already available? If so, and if it is really good and 
>>>>> open source, then this will influence the rest of the decisions. 
>>>>> Otherwise, I'll have to form a team that can write one, in which 
>>>>> case I'll have to decide on DB, language, etc. I have worked with 
>>>>> pgsql for 20 years, so it is the natural choice for any kind of 
>>>>> data; I just ask this for the sake of completeness.
>>>>>
>>>>> All ideas welcome.
>>>>
>>>> A quick search found this:
>>>>
>>>> https://solutionsreview.com/data-management/the-best-open-source-data-catalog-tools-to-consider/ 
>>>>
>>>>
>>>> Might be a good starting point on what is already out there.
>>>
>>> This is interesting, so the keywords are "Data Catalog" ?
>>
>> What I searched on was 'open source article catalog'.
>>
>>>
>>>>
>>>> There is also this:
>>>>
>>>> The Directory of Open Access Journals
>>>> https://doaj.org/
>>>>
>>> This seems very very poor. Just try a search there and then repeat in 
>>> PMC (PubMed Central).
>>
>> This is down to copyright issues I'm sure. For PubMed Central see:
>>
>> https://www.ncbi.nlm.nih.gov/pmc/about/copyright/
>>
>> for the ifs/ands/buts that restrict what you can do with the 
>> information and stay legal.
> 
> Maybe, but still:
> 
> https://www.ncbi.nlm.nih.gov/pmc/?term=open+access%5Bfilter%5D+PANDAS+IVIG

Yeah, it is nice to have the resources of the NIH behind you. Still, I 
would point out under Copyright and License information:

"This article is made available via the PMC Open Access Subset for 
unrestricted research re-use and secondary analysis in any form or by 
any means with acknowledgement of the original source. These permissions 
are granted for the duration of the World Health Organization (WHO) 
declaration of COVID-19 as a global pandemic."

Further on PMC Open Access Subset:

https://www.ncbi.nlm.nih.gov/pmc/tools/openftlist/

Again more ifs/ands/buts.

The point being, dealing with articles is a descent into legalese. I am 
not saying this is a show stopper, just that it will consume considerable 
resources to sort out. I for one applaud your effort, and given what I 
have seen you do with the shipping software over the years, I don't see 
this project as out of the realm of possibility.

> 
> https://doaj.org/search/articles?ref=homepage-box&source=%7B%22query%22%3A%7B%22query_string%22%3A%7B%22query%22%3A%22IVIG%20PANDAS%22%2C%22default_operator%22%3A%22AND%22%7D%7D%7D

>>>> It seems to be a service, not downloadable software.

-- 
Adrian Klaver
adrian.klaver@aklaver.com


