Thread: Import data from XML file
Hi! How do you import data from an xml-file? For instance, if I have a file like this: <?xml version="1.0" encoding="utf-8"?> <p_update> <main_categories> <main_category> <main_category_name>Sonstiges</main_category_name> <main_category_id>5</main_category_id> </main_category> <main_category> <main_category_name>Buehne</main_category_name> <main_category_id>2</main_category_id> </main_category> <main_category> <main_category_name>Konzerte</main_category_name> <main_category_id>1</main_category_id> </main_category> </main_categories> <categories> <category> <category_name>Reggae</category_name> <main_category_id>1</main_category_id> <category_id>45</category_id> </category> <category> <category_name>sonstige</category_name> <main_category_id>5</main_category_id> <category_id>44</category_id> </category> </categories> </p_update> ... and I want a CSV file like this: main_category_name main_category_id Sonstiges 5 Buehne 2 category_name main_category_id category_id Reggae 1 45 sonstige 5 44 Or is there a way to import directly into tables in a postgres database? Your help would be appreciated! Regards Erwin
Erwin Brandstetter wrote: > Hi! > > How do you import data from an xml-file? > For instance, if I have a file like this: > > <?xml version="1.0" encoding="utf-8"?> > <p_update> > <main_categories> > <main_category> > <main_category_name>Sonstiges</main_category_name> > <main_category_id>5</main_category_id> > </main_category> > <main_category> > <main_category_name>Buehne</main_category_name> > <main_category_id>2</main_category_id> > </main_category> > <main_category> > <main_category_name>Konzerte</main_category_name> > <main_category_id>1</main_category_id> > </main_category> > </main_categories> > <categories> > <category> > <category_name>Reggae</category_name> > <main_category_id>1</main_category_id> > <category_id>45</category_id> > </category> > <category> > <category_name>sonstige</category_name> > <main_category_id>5</main_category_id> > <category_id>44</category_id> > </category> > </categories> > </p_update> > > > ... and I want a CSV file like this: > > main_category_name main_category_id > Sonstiges 5 > Buehne 2 > > category_name main_category_id category_id > Reggae 1 45 > sonstige 5 44 > > > Or is there a way to import directly into tables in a postgres > database? > isn't it amazing how redundantly wordy XML is, yet it doesn't provide sufficient information to perform this simple task without more knowlege (for instance, there's no data types, but we sure know the name of the fields as they are spelled out twice for each row!)
I think.. you'll need to parse that one...
Where do you have your data types to xml document ?
Where do you have your data types to xml document ?
2009/8/26 Erwin Brandstetter <brsaweda@gmail.com>
Hi!
How do you import data from an xml-file?
For instance, if I have a file like this:
<?xml version="1.0" encoding="utf-8"?>
<p_update>
<main_categories>
<main_category>
<main_category_name>Sonstiges</main_category_name>
<main_category_id>5</main_category_id>
</main_category>
<main_category>
<main_category_name>Buehne</main_category_name>
<main_category_id>2</main_category_id>
</main_category>
<main_category>
<main_category_name>Konzerte</main_category_name>
<main_category_id>1</main_category_id>
</main_category>
</main_categories>
<categories>
<category>
<category_name>Reggae</category_name>
<main_category_id>1</main_category_id>
<category_id>45</category_id>
</category>
<category>
<category_name>sonstige</category_name>
<main_category_id>5</main_category_id>
<category_id>44</category_id>
</category>
</categories>
</p_update>
... and I want a CSV file like this:
main_category_name main_category_id
Sonstiges 5
Buehne 2
category_name main_category_id category_id
Reggae 1 45
sonstige 5 44
Or is there a way to import directly into tables in a postgres
database?
Your help would be appreciated!
Regards
Erwin
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Wed, Aug 26, 2009 at 09:10:25AM -0700, Erwin Brandstetter wrote: > How do you import data from an xml-file? If they're all that small, put the file into the database as is and then use xpath[1] to pull it apart and turn it into something a database understand. -- Sam http://samason.me.uk/ [1] http://www.postgresql.org/docs/current/static/functions-xml.html#FUNCTIONS-XML-PROCESSING
> Hi! > > How do you import data from an xml-file? > For instance, if I have a file like this: > > <?xml version="1.0" encoding="utf-8"?> > <p_update> > <main_categories> > <main_category> > <main_category_name>Sonstiges</main_category_name> > <main_category_id>5</main_category_id> > </main_category> > <main_category> > <main_category_name>Buehne</main_category_name> > <main_category_id>2</main_category_id> > </main_category> > <main_category> > <main_category_name>Konzerte</main_category_name> > <main_category_id>1</main_category_id> > </main_category> > </main_categories> > <categories> > <category> > <category_name>Reggae</category_name> > <main_category_id>1</main_category_id> > <category_id>45</category_id> > </category> > <category> > <category_name>sonstige</category_name> > <main_category_id>5</main_category_id> > <category_id>44</category_id> > </category> > </categories> > </p_update> > > > ... and I want a CSV file like this: > > main_category_name main_category_id > Sonstiges 5 > Buehne 2 > > category_name main_category_id category_id > Reggae 1 45 > sonstige 5 44 > > > Or is there a way to import directly into tables in a postgres > database? > > > Your help would be appreciated! > Regards > Erwin > Not sure why you are mentioning a CSV export. I ASSUME you want to import into database tables and not go directly to csv. (If that's the case, use another tool, not a database.) INSERT INTO main_categories(name, id) SELECT extract_value('//main_category_name', x) AS name, extract_value('//main_category_id', x)::int AS id -- without extract_value -- CAST(CAST(xpath('//main_category_id/text()', x)[0] AS varchar) AS int) AS id FROM unnest(xpath('//main_category', xml('...your xml here...'))) x INSERT INTO categories(name, main_id, id) SELECT extract_value('//category_name', x) AS name, extract_value('//main_category_id', x)::int AS main_id, extract_value('//category_id', x)::int AS id FROM unnest(xpath('//category', xml('...your xml here...'))) x Unnest isn't included until pg 8.4. And extract_value() is a function I borrowed from Oracle to make life easier. I have a write up about it on my blog. http://scottrbailey.wordpress.com/2009/06/19/xml-parsing-postgres/
One solution is to use Perl DBI. DBD::AnyData will read xml. DBD::Pg will write to Postgres. -Will > -----Original Message----- > From: pgsql-general-owner@postgresql.org > [mailto:pgsql-general-owner@postgresql.org] On Behalf Of > Erwin Brandstetter > Sent: 26 August 2009 12:10 > To: pgsql-general@postgresql.org > Subject: [GENERAL] Import data from XML file > > Hi! > > How do you import data from an xml-file? > For instance, if I have a file like this: > > <?xml version="1.0" encoding="utf-8"?> > <p_update> > <main_categories> > <main_category> > <main_category_name>Sonstiges</main_category_name> > <main_category_id>5</main_category_id> > </main_category> > <main_category> > <main_category_name>Buehne</main_category_name> > <main_category_id>2</main_category_id> > </main_category> > <main_category> > <main_category_name>Konzerte</main_category_name> > <main_category_id>1</main_category_id> > </main_category> > </main_categories> > <categories> > <category> > <category_name>Reggae</category_name> > <main_category_id>1</main_category_id> > <category_id>45</category_id> > </category> > <category> > <category_name>sonstige</category_name> > <main_category_id>5</main_category_id> > <category_id>44</category_id> > </category> > </categories> > </p_update> > > > ... and I want a CSV file like this: > > main_category_name main_category_id > Sonstiges 5 > Buehne 2 > > category_name main_category_id category_id > Reggae 1 45 > sonstige 5 44 > > > Or is there a way to import directly into tables in a postgres > database? > > > Your help would be appreciated! > Regards > Erwin > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >
Erwin
did you try ems-data?
http://ems-data-import-2007-for-postgresql.software.informer.com/3.0/
Martin Gainty
______________________________________________
Verzicht und Vertraulichkeitanmerkung/Note de déni et de confidentialité
Diese Nachricht ist vertraulich. Sollten Sie nicht der vorgesehene Empfaenger sein, so bitten wir hoeflich um eine Mitteilung. Jede unbefugte Weiterleitung oder Fertigung einer Kopie ist unzulaessig. Diese Nachricht dient lediglich dem Austausch von Informationen und entfaltet keine rechtliche Bindungswirkung. Aufgrund der leichten Manipulierbarkeit von E-Mails koennen wir keine Haftung fuer den Inhalt uebernehmen.
> Date: Wed, 26 Aug 2009 11:54:23 -0700
> From: artacus@comcast.net
> CC: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] Import data from XML file
>
> > Hi!
> >
> > How do you import data from an xml-file?
> > For instance, if I have a file like this:
> >
> > <?xml version="1.0" encoding="utf-8"?>
> > <p_update>
> > <main_categories>
> > <main_category>
> > <main_category_name>Sonstiges</main_category_name>
> > <main_category_id>5</main_category_id>
> > </main_category>
> > <main_category>
> > <main_category_name>Buehne</main_category_name>
> > <main_category_id>2</main_category_id>
> > </main_category>
> > <main_category>
> > <main_category_name>Konzerte</main_category_name>
> > <main_category_id>1</main_category_id>
> > </main_category>
> > </main_categories>
> > <categories>
> > <category>
> > <category_name>Reggae</category_name>
> > <main_category_id>1</main_category_id>
> > <category_id>45</category_id>
> > </category>
> > <category>
> > <category_name>sonstige</category_name>
> > <main_category_id>5</main_category_id>
> > <category_id>44</category_id>
> > </category>
> > </categories>
> > </p_update>
> >
> >
> > ... and I want a CSV file like this:
> >
> > main_category_name main_category_id
> > Sonstiges 5
> > Buehne 2
> >
> > category_name main_category_id category_id
> > Reggae 1 45
> > sonstige 5 44
> >
> >
> > Or is there a way to import directly into tables in a postgres
> > database?
> >
> >
> > Your help would be appreciated!
> > Regards
> > Erwin
> >
>
> Not sure why you are mentioning a CSV export. I ASSUME you want to
> import into database tables and not go directly to csv. (If that's the
> case, use another tool, not a database.)
>
> INSERT INTO main_categories(name, id)
> SELECT extract_value('//main_category_name', x) AS name,
> extract_value('//main_category_id', x)::int AS id
> -- without extract_value
> -- CAST(CAST(xpath('//main_category_id/text()', x)[0] AS varchar) AS
> int) AS id
> FROM unnest(xpath('//main_category', xml('...your xml here...'))) x
>
> INSERT INTO categories(name, main_id, id)
> SELECT extract_value('//category_name', x) AS name,
> extract_value('//main_category_id', x)::int AS main_id,
> extract_value('//category_id', x)::int AS id
> FROM unnest(xpath('//category', xml('...your xml here...'))) x
>
> Unnest isn't included until pg 8.4. And extract_value() is a function I
> borrowed from Oracle to make life easier. I have a write up about it on
> my blog.
>
> http://scottrbailey.wordpress.com/2009/06/19/xml-parsing-postgres/
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
Windows Live: Keep your friends up to date with what you do online. Find out more.
did you try ems-data?
http://ems-data-import-2007-for-postgresql.software.informer.com/3.0/
Martin Gainty
______________________________________________
Verzicht und Vertraulichkeitanmerkung/Note de déni et de confidentialité
Diese Nachricht ist vertraulich. Sollten Sie nicht der vorgesehene Empfaenger sein, so bitten wir hoeflich um eine Mitteilung. Jede unbefugte Weiterleitung oder Fertigung einer Kopie ist unzulaessig. Diese Nachricht dient lediglich dem Austausch von Informationen und entfaltet keine rechtliche Bindungswirkung. Aufgrund der leichten Manipulierbarkeit von E-Mails koennen wir keine Haftung fuer den Inhalt uebernehmen.
Ce message est confidentiel et peut être privilégié. Si vous n'êtes pas le destinataire prévu, nous te demandons avec bonté que pour satisfaire informez l'expéditeur. N'importe quelle diffusion non autorisée ou la copie de ceci est interdite. Ce message sert à l'information seulement et n'aura pas n'importe quel effet légalement obligatoire. Étant donné que les email peuvent facilement être sujets à la manipulation, nous ne pouvons accepter aucune responsabilité pour le contenu fourni.
> Date: Wed, 26 Aug 2009 11:54:23 -0700
> From: artacus@comcast.net
> CC: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] Import data from XML file
>
> > Hi!
> >
> > How do you import data from an xml-file?
> > For instance, if I have a file like this:
> >
> > <?xml version="1.0" encoding="utf-8"?>
> > <p_update>
> > <main_categories>
> > <main_category>
> > <main_category_name>Sonstiges</main_category_name>
> > <main_category_id>5</main_category_id>
> > </main_category>
> > <main_category>
> > <main_category_name>Buehne</main_category_name>
> > <main_category_id>2</main_category_id>
> > </main_category>
> > <main_category>
> > <main_category_name>Konzerte</main_category_name>
> > <main_category_id>1</main_category_id>
> > </main_category>
> > </main_categories>
> > <categories>
> > <category>
> > <category_name>Reggae</category_name>
> > <main_category_id>1</main_category_id>
> > <category_id>45</category_id>
> > </category>
> > <category>
> > <category_name>sonstige</category_name>
> > <main_category_id>5</main_category_id>
> > <category_id>44</category_id>
> > </category>
> > </categories>
> > </p_update>
> >
> >
> > ... and I want a CSV file like this:
> >
> > main_category_name main_category_id
> > Sonstiges 5
> > Buehne 2
> >
> > category_name main_category_id category_id
> > Reggae 1 45
> > sonstige 5 44
> >
> >
> > Or is there a way to import directly into tables in a postgres
> > database?
> >
> >
> > Your help would be appreciated!
> > Regards
> > Erwin
> >
>
> Not sure why you are mentioning a CSV export. I ASSUME you want to
> import into database tables and not go directly to csv. (If that's the
> case, use another tool, not a database.)
>
> INSERT INTO main_categories(name, id)
> SELECT extract_value('//main_category_name', x) AS name,
> extract_value('//main_category_id', x)::int AS id
> -- without extract_value
> -- CAST(CAST(xpath('//main_category_id/text()', x)[0] AS varchar) AS
> int) AS id
> FROM unnest(xpath('//main_category', xml('...your xml here...'))) x
>
> INSERT INTO categories(name, main_id, id)
> SELECT extract_value('//category_name', x) AS name,
> extract_value('//main_category_id', x)::int AS main_id,
> extract_value('//category_id', x)::int AS id
> FROM unnest(xpath('//category', xml('...your xml here...'))) x
>
> Unnest isn't included until pg 8.4. And extract_value() is a function I
> borrowed from Oracle to make life easier. I have a write up about it on
> my blog.
>
> http://scottrbailey.wordpress.com/2009/06/19/xml-parsing-postgres/
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
Windows Live: Keep your friends up to date with what you do online. Find out more.
We've used Pentaho Data Integration (aka Kettle) at http://kettle.pentaho.org to do this in the past. (Kettle is the free / open source version, although there is a bigger commercial version of PDI that does more.) It reads XML nicely and talks natively to PostgreSQL databases, so you can load data directly from your XML file right into your PostgreSQL database. If you're a Microsoft shop, you can also use SQL Server Integration Services [SSIS] to do the same thing but much better and faster, although I almost hesitate to mention a Microsoft tool on this forum even though it's just going from XML to PostgreSQL and CSV with no trace of SQL Server anywhere to be seen. (If you do go the SSIS route, the "dotConnect for PostgreSQL" ADO.NET driver from Devart [ http://www.devart.com/dotconnect/postgresql/ ] works wonderfully to connect from SSIS to PostgreSQL. Unfortunately, the Npgsql driver doesn't really work very well with SSIS...) - Bill > -----Original Message----- > From: pgsql-general-owner@postgresql.org [mailto:pgsql-general- > owner@postgresql.org] On Behalf Of Erwin Brandstetter > Sent: Wednesday, August 26, 2009 12:10 PM > To: pgsql-general@postgresql.org > Subject: [GENERAL] Import data from XML file > > Hi! > > How do you import data from an xml-file? > For instance, if I have a file like this: > > <?xml version="1.0" encoding="utf-8"?> > <p_update> > <main_categories> > <main_category> > <main_category_name>Sonstiges</main_category_name> > <main_category_id>5</main_category_id> > </main_category> > <main_category> > <main_category_name>Buehne</main_category_name> > <main_category_id>2</main_category_id> > </main_category> > <main_category> > <main_category_name>Konzerte</main_category_name> > <main_category_id>1</main_category_id> > </main_category> > </main_categories> > <categories> > <category> > <category_name>Reggae</category_name> > <main_category_id>1</main_category_id> > <category_id>45</category_id> > </category> > <category> > <category_name>sonstige</category_name> > <main_category_id>5</main_category_id> > <category_id>44</category_id> > </category> > </categories> > </p_update> > > > ... and I want a CSV file like this: > > main_category_name main_category_id > Sonstiges 5 > Buehne 2 > > category_name main_category_id category_id > Reggae 1 45 > sonstige 5 44 > > > Or is there a way to import directly into tables in a postgres > database? > > > Your help would be appreciated! > Regards > Erwin > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general
Thanks Scott! (And thanks for all the other hints!) Yes, the goal is to get the data into tables in a pg database. Having a CSV file or having the data in pg-tables, both equally solve the problem. I like this approach as it does not involve additional tools. I will have to upgrade to pg 8.4 first, though. Will this method scale well with XML files of around 1 MB? Import will have to be done once a weak. Regards, Erwin