Thread: Extract values from XML content

Extract values from XML content

From
celati Laurent
Date:
Good afternoon,
I have a table 'metadata' with 2000 records. With one column 'id' and one column 'data' with XML content.
I need to extract for all records the values regarding the Organisation names. 
I success in querying without error message thanks to this following sql query :

SELECT id, xpath('/contact/CI_ResponsibleParty/organisationName/CharacterString/text()',
            CAST(data AS XML)) AS organisation_name
FROM public.metadata;

But the values don't appear into the column :

"id"    "organisation_name"
16410    "[]"
16411    "[]"
16412    "[]"
16413    "[]"
16414    "[]"
16415    "[]"
16416    "[]"
16423    "[]"
16425    "[]"
16426    "[]"
16427    "[]"
16435    "[]"
2250     "[]"
16587    "[]"
16588    "[]"


If needed, i paste below the FULL extract of the XLM content up to my section of interest :

  <mri:pointOfContact>    <cit:CI_Responsibility>      <cit:role>        <cit:CI_RoleCode codeList="http://standards.iso.org/iso/19115/resources/Codelists/cat/codelists.xml#CI_RoleCode" codeListValue="originator" />      </cit:role>      <cit:party>        <cit:CI_Organisation>          <cit:name>            <gco:CharacterString>Office français de la biodiversité</gco:CharacterString>          </cit:name>


Thanks so much.





Re: Extract values from XML content

From
"David G. Johnston"
Date:
On Wednesday, November 13, 2024, celati Laurent <laurent.celati@gmail.com> wrote:

SELECT id, xpath('/contact/CI_ResponsibleParty/organisationName/CharacterString/text()',
            CAST(data AS XML)) AS organisation_name <mri:pointOfContact>
    <cit:CI_Responsibility>      <cit:role>        <cit:CI_RoleCode codeList="http://standards.iso.org/iso/19115/resources/Codelists/cat/codelists.xml#CI_RoleCode" codeListValue="originator" />      </cit:role>      <cit:party>        <cit:CI_Organisation>          <cit:name>            <gco:CharacterString>Office français de la biodiversité</gco:CharacterString>          </cit:name>

I’d look into how to specify namespaces since your XML elements all have them but you don’t have any in your xpath.

David J.
 

Re: Extract values from XML content

From
shammat@gmx.net
Date:
Am 13.11.24 um 15:58 schrieb celati Laurent:
> I have a table 'metadata' with 2000 records. With one column 'id' and one column 'data' with XML content.
> I need to extract for all records the values regarding the Organisation names.
> I success in querying without error message thanks to this following sql query :
>
> SELECT id, xpath('/contact/CI_ResponsibleParty/organisationName/CharacterString/text()',
>              CAST(data AS XML)) AS organisation_name
> FROM public.metadata;
>

I typically find xmltable() a lot easier to work with, especially if the XML contains namespaces.

I guess the namespaces are the problem in your case, you will have to pass them to the xpath()
function and reference them in the path expression as well

So you will need something like:

xpath('//cit:CI_Responsibility/cit:party/cit:CI_Organisation/cit:name/gco:CharacterString/text()',
        cast(data as xml),
        ARRAY[ARRAY['cit', 'http://...'], array['gco', 'http://...']])

The actual value for the namespace URIs depends on the definition in your XML

Note that xpath() returns an array, so you probably want (xpath(....))[1]