Thread: XML and Postgres

XML and Postgres

From
Greg Lindstrom
Date:
Hello-
I am fairly new to postgres, but have been coding for over 20 years and
would like to know what, if anything, is available to post data in XML
format to postgres.  I have read (seemingly) conflicting reports that
Oracle will allow users to post an xml formatted document directly to
the database and my sysadmin "seems to recall" reading about it.  I am
in the process of dissecting a fairly complex xml file (health care
claims data in EDI x12 format) and would much rather use an existing
technology to post it if it exists.

So, does such a thing exist?  Or perhaps I have misread the literature
and am off base?

Thanks for your help.

--greg

Greg Lindstrom               501 975.4859
Computer Programmer          greg.lindstrom@novasyshealth.com
NovaSys Health
Little Rock, Arkansas

"We are the music makers, and we are the dreamers of dreams."  W.W.



Re: XML and Postgres

From
brew@theMode.com
Date:
On Wed, 16 Mar 2005, Greg Lindstrom wrote:

> I am fairly new to postgres, but have been coding for over 20 years and
> would like to know what, if anything, is available to post data in XML
> format to postgres.

Hopefully someone can tell you of a direct solution, but I can tell you my
experiences using perl.

I had to synchronize an on site office database (File Maker Pro) with a
webserver database.  File Maker Pro could output in XML and that's way the
DBA wanted to do it.  He set it up so that it ftped the data in XML to the
web server and I use cron to start a perl script and import the data into
the webserver DB.

There are several XML perl modules available but all the ones I tried had
the same flaw - they grabbed the entire XML file and loaded it into a huge
variable in memory.  The XML file they wanted to upload was so big it used
up all the available memory and the perl script failed.

I had to write a parsing routine that churns through the XML file and
loads the data for one record, then dumps it into the database before
going back for the next record - it's much more memory efficient, albeit
slow - no grouped transactions, but that's all right - it runs in the
middle of the night.

I hesitate to post it since it may not work with all XML data files, I
wrote and tested it for MY particular datafile.  In particular it may not
work for empty elements ( like <data /> ). I'll be glad to share it and
explain how it works if you need it.

I confess the webserver DB in this case is mySQL (it pre-existed me on
this project), but since I used the perl DBI (Database Interface Library)
it will work with postgreSQL just as easily.

Hope this helps if nobody steps forward with a better solution.

brew

 ==========================================================================
                  Strange Brew   (brew@theMode.com)
  Check out my Stock Option Covered Call website  http://www.callpix.com
     and my Musician's Online Database Exchange http://www.TheMode.com
 ==========================================================================


Re: XML and Postgres

From
Sean Davis
Date:
On 3/17/05 2:53, "brew@theMode.com" <brew@theMode.com> wrote:

>
> On Wed, 16 Mar 2005, Greg Lindstrom wrote:
>
>> I am fairly new to postgres, but have been coding for over 20 years and
>> would like to know what, if anything, is available to post data in XML
>> format to postgres.
>
> Hopefully someone can tell you of a direct solution, but I can tell you my
> experiences using perl.
>
> I had to synchronize an on site office database (File Maker Pro) with a
> webserver database.  File Maker Pro could output in XML and that's way the
> DBA wanted to do it.  He set it up so that it ftped the data in XML to the
> web server and I use cron to start a perl script and import the data into
> the webserver DB.
>
> There are several XML perl modules available but all the ones I tried had
> the same flaw - they grabbed the entire XML file and loaded it into a huge
> variable in memory.  The XML file they wanted to upload was so big it used
> up all the available memory and the perl script failed.
>
> I had to write a parsing routine that churns through the XML file and
> loads the data for one record, then dumps it into the database before
> going back for the next record - it's much more memory efficient, albeit
> slow - no grouped transactions, but that's all right - it runs in the
> middle of the night.

In these cases, you can use XML::Twig to break up the file into smaller
parts--it is quite easy and VERY fast.

Sean


Re: XML and Postgres

From
Greg Lindstrom
Date:
brew@theMode.com wrote:
<blockquote cite="midPine.BSF.4.58.0503170223470.35543@themode.com"
 type="cite">
  Hopefully someone can tell you of a direct solution, but I can tell you my
experiences using perl.

Thanks for the reply.  Can I ask some more basic questions before I
jump into this?


  Do I understand correctly that using one of these tools I can
hand my xml-formatted data file to postgres and it will be inserted
into the database?
  How does the database "know" the layout?  I am dealing with
health case claim data using the EDI X12 standard (such as it is).  It
is well defined, but large and complex (well beyond the "mailing list"
examples I've seen).  Do I need to post a DTD or Schema to Postgres? 
Or does it simply (!) create tables and links to accommodate whatever
relationships are defined in the data file? Or is the xml dumped into 1
large field?

  When I add additional data to the database (another claim file
from another vendor or day), will it be added to the existing data, or
-- I assume -- entered into the base using the new root node?
  How do I extract data from the database?  Do I use XPath type
queries?
  Is it possible to insert, update, change, delete data from the
data file on the base?  If so, can I issue a command to extract the
entire document with the modified data?
  Can you point me to documentation on any of this?  Is it worth
the trouble?  Or should I dissect the file into traditionally
normalized tables?

Thank-you so much for your help and advice.  This is a new world to me
and I'd like to understand what I might be getting in to.

--greg
Greg Lindstrom               501 975.4859
Computer Programmer          greg.lindstrom@novasyshealth.com
NovaSys Health
Little Rock, Arkansas

"We are the music makers, and we are the dreamers of dreams."  W.W.

Re: XML and Postgres

From
brew@theMode.com
Date:
Sean....

> > I had to write a parsing routine that churns through the XML file and
> > loads the data for one record, then dumps it into the database before
> > going back for the next record - it's much more memory efficient, albeit
> > slow - no grouped transactions, but that's all right - it runs in the
> > middle of the night.
>
> In these cases, you can use XML::Twig to break up the file into smaller
> parts--it is quite easy and VERY fast.

Thanks, it was a couple of years ago that I wrote my parsing code, so I'm
not going to change it at this time, but I'll keep that in mind for
the future.

I experimented with Twig (but don't remember the exact results - it was a
year and a half ago), but somehow I missed that you could break the data
up in smaller chunks.  That's really ironic because now I see that the
name Twig is because it's a branch off the tree!!!

Darn.

Thanks.... brew

 ==========================================================================
                  Strange Brew   (brew@theMode.com)
  Check out my Stock Option Covered Call website  http://www.callpix.com
     and my Musician's Online Database Exchange http://www.TheMode.com
 ==========================================================================


Re: XML and Postgres

From
brew@theMode.com
Date:
Greg.....

>  1. Do I understand correctly that using one of these tools I can hand my
>     xml-formatted data file to postgres and it will be inserted into the
>     database?

Hopefully for you somebody else might know of something like that!

Sorry to tell you, but I'm handling it manually.  The perl program just
serves as the bridge between the two databases.  I had to manually
normalize the database that I was going to pump the data into, then I had
to write a bridging program to parse the data out of the XML file, loading
each piece of data into a variable, then inserting or updating the proper
table in the database with it.

It's not the automatic solution you sound like you are hoping for.
That's what I meant by writing 'hopefully somebody else can give you a
direct solution'.

Good luck finding what you need, maybe somebody else knows how to make
postgreSQL handle XML data directly or knows of a database that will (and
still be fast, etc.).

brew

 ==========================================================================
                  Strange Brew   (brew@theMode.com)
  Check out my Stock Option Covered Call website  http://www.callpix.com
     and my Musician's Online Database Exchange http://www.TheMode.com
 ==========================================================================


Re: XML and Postgres

From
Sean Davis
Date:
On Mar 17, 2005, at 12:36 PM, brew@theMode.com wrote:

>
> Greg.....
>
>>  1. Do I understand correctly that using one of these tools I can
>> hand my
>>     xml-formatted data file to postgres and it will be inserted into
>> the
>>     database?
>
> Hopefully for you somebody else might know of something like that!
>
> Sorry to tell you, but I'm handling it manually.  The perl program just
> serves as the bridge between the two databases.  I had to manually
> normalize the database that I was going to pump the data into, then I
> had
> to write a bridging program to parse the data out of the XML file,
> loading
> each piece of data into a variable, then inserting or updating the
> proper
> table in the database with it.
>
> It's not the automatic solution you sound like you are hoping for.
> That's what I meant by writing 'hopefully somebody else can give you a
> direct solution'.
>
> Good luck finding what you need, maybe somebody else knows how to make
> postgreSQL handle XML data directly or knows of a database that will
> (and
> still be fast, etc.).

Unfortunately, I, like Brew, do it "by hand".  There are modules for
xml/dbi connection, but I haven't used them.  In practice, if you know
the database structure from which your data is coming, you can recreate
it easily enough.  Then, it is just a question of parsing the XML (also
relatively trivial) and them making a loader using DBI (again, fairly
straightforward).  It sounds like you data is coming out fairly
normalized already, so the process shouldn't require too much data
munging.  If you need to munge, you can always load to a "loader"
schema and then use triggers to change the data into another format in
the "working" schema.  As for querying your database, you could look at
contrib/xml, but that is only if you load straight XML.  You will
probably load into regular tables for a more "standard" solution, so
you would do your queries in SQL.  In case you want to look at more
automated solution, you might check out:

http://www.xml.com/pub/a/2000/12/13/perlxmldb.html

For a total different approach, there are a number of XML databases and
drivers:

http://xml.apache.org/xindice/
http://search.cpan.org/author/GSEAMAN/XML-DB/lib/XML/DB/Database/
Xindice.pm
http://search.cpan.org/author/GSEAMAN/XML-DB/lib/XML/DB/Database.pm
http://search.cpan.org/author/GSEAMAN/XML-DB/lib/XML/DB/Database/
Exist.pm

I haven't used them at all and have no idea how well they scale, etc.

Sean


Re: XML and Postgres

From
George Weaver
Date:
Greg,

I would suggest you check out  "The xpsql Project -- XpSQL, a XML database
enviroment using PostgreSQL ":

http://gborg.postgresql.org/project/xpsql/projdisplay.php


Regards,
George

----- Original Message -----
From: <brew@theMode.com>
To: "Greg Lindstrom" <greg.lindstrom@novasyshealth.com>
Cc: <pgsql-novice@postgresql.org>
Sent: Thursday, March 17, 2005 11:36 AM
Subject: Re: [NOVICE] XML and Postgres


>
> Greg.....
>
>>  1. Do I understand correctly that using one of these tools I can hand my
>>     xml-formatted data file to postgres and it will be inserted into the
>>     database?
>
> Hopefully for you somebody else might know of something like that!
>
> Sorry to tell you, but I'm handling it manually.  The perl program just
> serves as the bridge between the two databases.  I had to manually
> normalize the database that I was going to pump the data into, then I had
> to write a bridging program to parse the data out of the XML file, loading
> each piece of data into a variable, then inserting or updating the proper
> table in the database with it.
>
> It's not the automatic solution you sound like you are hoping for.
> That's what I meant by writing 'hopefully somebody else can give you a
> direct solution'.
>
> Good luck finding what you need, maybe somebody else knows how to make
> postgreSQL handle XML data directly or knows of a database that will (and
> still be fast, etc.).
>
> brew
>
> ==========================================================================
>                  Strange Brew   (brew@theMode.com)
>  Check out my Stock Option Covered Call website  http://www.callpix.com
>     and my Musician's Online Database Exchange http://www.TheMode.com
> ==========================================================================
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
>      subscribe-nomail command to majordomo@postgresql.org so that your
>      message can get through to the mailing list cleanly
>