Re: [GENERAL] Insertion of large xml files into PostgreSQL 10beta1 - Mailing list pgsql-general

From Alain Toussaint
Subject Re: [GENERAL] Insertion of large xml files into PostgreSQL 10beta1
Msg-id CAGo4VQ+bSGMk37zDgpvRkmTueJayuxv4x7gjKeZTssGz01hDgA@mail.gmail.com
In response to Re: [GENERAL] Insertion of large xml files into PostgreSQL 10beta1  ("David G. Johnston" <david.g.johnston@gmail.com>)
List pgsql-general
> Narrowing down the entire file to a small problem region and posting a
> self-contained example,

The URL below contains the set of XML records from a publication I
worked on many years ago:

https://www.ncbi.nlm.nih.gov/pubmed/21833294?report=xml&format=text

The particularly problematic region of the XML content is this:

        <CommentsCorrectionsList>
            <CommentsCorrections RefType="Cites">
                <RefSource>Neuroreport. 2000 Sep 11;11(13):2969-72</RefSource>
                <PMID Version="1">11006976</PMID>
            </CommentsCorrections>
            <CommentsCorrections RefType="Cites">
                <RefSource>J Neurosci. 2005 May 25;25(21):5148-58</RefSource>
                <PMID Version="1">15917455</PMID>
            </CommentsCorrections>
            <CommentsCorrections RefType="Cites">
                <RefSource>Neuroimage. 2003 Dec;20(4):1944-54</RefSource>
                <PMID Version="1">14683700</PMID>
            </CommentsCorrections>

There are more comments of this type in any given citation.

> or at least providing the error messages and
> content, might help elicit good responses.

Here it is:

ERROR: syntax error at or near "44"
LINE 1: 44(1):37-43</RefSources>

The command I used is this one:

echo "INSERT INTO samples (xmldata) VALUES $(cat /srv/pgsql/pubmed/medline17n0001.xml)" \
  | /usr/bin/psql medline 1>/dev/null 2>error.log

wc -l error.log
11145 error.log

The error message above is repeated a metric ton of times, but I didn't
check the entire log for other kinds of error messages.
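
For what it's worth, the command above splices the raw file into the
statement with no parentheses and no quoting around the value, so psql
ends up parsing the XML itself as SQL, which would explain the syntax
errors. Below is a minimal sketch of one way around that, assuming a
dollar-quoted literal is acceptable here. The helper name build_insert
and the tag xmlbody are hypothetical (they are not from this thread),
and the tag must not occur anywhere in the file. This is untested
against the real MEDLINE file, whose size or DOCTYPE may cause
separate problems.

```shell
# Hypothetical helper: emit an INSERT whose value is the file's contents
# wrapped in a dollar-quoted string literal, so quotes, semicolons and
# parentheses inside the XML are not interpreted as SQL.
# Assumption: the tag "xmlbody" never appears in the file as $xmlbody$.
build_insert() {
    # $1: path to the XML file to load
    printf 'INSERT INTO samples (xmldata) VALUES ($xmlbody$'
    cat "$1"
    printf '$xmlbody$);\n'
}

# Usage (same table, file and database as in the command above):
# build_insert /srv/pgsql/pubmed/medline17n0001.xml | /usr/bin/psql medline 2>error.log
```

The statement is streamed through a pipe rather than built with
$(cat ...) inside an echo argument, which also avoids handing the
whole file to echo as a single argument.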

> Even if you could load the data
> without incident, using it may end up proving problematic.

Agreed, the box will definitely need more RAM, and I could be better
off with a more recent graphics card (NVIDIA or AMD, whichever is
supported by TensorFlow 1.2 and up). I'll figure it out as I go.

Many thanks.

Alain