Thread: XML to Postgres conversion
Hello: I am trying to figure out how to import xml documents into a postgres database. I am running PostgreSql 7.1.3 on Red Hat 7.2 at the moment. I have several dynamic xml documents that I want imported into the database on a regular basis(through cron entries). From my research so far, I know that there is middleware available to perform this, but am having some difficulty in finding the actual applications. I am trying to stay with open source applications, if possible. Can anyone give me any suggestions or resources to pull from? Thanks, N. Hill __________________________________________________ Do You Yahoo!? Sign up for SBC Yahoo! Dial - First Month Free http://sbc.yahoo.com
Nathan Hill <cmsu_tech@yahoo.com> writes: > Hello: > > I am trying to figure out how to import xml documents > into a postgres database. I am running PostgreSql > 7.1.3 on Red Hat 7.2 at the moment. I have several > dynamic xml documents that I want imported into the > database on a regular basis(through cron entries). > >From my research so far, I know that there is > middleware available to perform this, but am having > some difficulty in finding the actual applications. I > am trying to stay with open source applications, if > possible. Can anyone give me any suggestions or > resources to pull from? > > Thanks, > > N. Hill What is it precisely that you want to do? Do you just want save the XML text to the database, or do you want to populate PostgreSQL tables with information provided in the documents? A more thorough explanation (including perhaps some sample XML and table schemas) would be helpful. Jason
Here is a snippit of code from one of the xml files: <?xml version="1.0" encoding="UTF-8"?> <Computer> <Created_on>2002/07/09 11:09:41</Created_on> <Client_version>1.8.9</Client_version> <General_info> <Operating_system> <Name>Microsoft Windows XP</Name> <Major_version>5</Major_version> <Minor_version>1</Minor_version> <Build_number>2600</Build_number> <Platform_ID>2</Platform_ID> <Additional_information></Additional_information> <User_name>someone</User_name> <Boot_time>07/09/2002 09:04:21</Boot_time> <Registered_to>registered</Registered_to> <Registration_code>534574-345-7467453-74538</Registration_code> </Operating_system> </Computer> Here is a snippit from the database setup script: create table computers ( computer_name varchar(50) primary key, operating_system_name varchar(50), operating_system_version varchar(50), operating_system_additional_info varchar(50), processor_name varchar(50), processor_manufacturer varchar(50), processor_count numeric, processor_speed numeric, processor_family varchar(50), processor_model varchar(50), processor_stepping varchar(50), memory_physical numeric, memory_virtual numeric, memory_page_file_size numeric, video_bios_date varchar(50), video_bios_version varchar(50), display_name varchar(50), display_manufacturer varchar(50), display_hor_res numeric, display_ver_res numeric, display_color_depth numeric, display_refresh_rate numeric, winsock_domain_name varchar(50), winsock_description varchar(50), winsock_status varchar(50), winsock_version varchar(50), winsock_name varchar(50) ); All the xml files will have the same format. A Windows program called Alchemy Network Inventory is used to generate the xml files. I want to populate the tables in Postgres with the data in the XML files. The software that generates the xml files was actually designed for use with MS Access and has built in ODBC capabilities. I actually designed the Postgres database off a template for Access. Where I'm running into issues is I'm using Postgre on Linux instead of Access on Windows. Yes, I can use the ODBC connection with the Postgres odbc drivers and export the files from a Windows machine to the database manually. There lies the other issue: I would like to automate the process. I want to try to keep as much as I possibly can on the linux box because it is much easier to script and automate than if I were try it from a Windows machine. If you (or anyone) would like to look at an entire xml or .sql file, I would be more than happy to forward them or if I'm still lacking information, please let me know. Thanks. __________________________________________________ Do You Yahoo!? Sign up for SBC Yahoo! Dial - First Month Free http://sbc.yahoo.com
If I was doing what you are I would use a XSL stylesheet to generate SQL insert scripts which could then be ran on PostgreSQL. To automate things you could use java (eg apache's xalan to transform the xml in memory and then use JDBC to run the batch insert statements), Hope this is of some help, Kris ----- Original Message ----- From: "Nathan Hill" <cmsu_tech@yahoo.com> To: "Jason Earl" <jason.earl@simplot.com> Cc: <pgsql-general@postgresql.org> Sent: Wednesday, July 10, 2002 9:53 PM Subject: Re: [GENERAL] XML to Postgres conversion > Here is a snippit of code from one of the xml files: > > <?xml version="1.0" encoding="UTF-8"?> > <Computer> > <Created_on>2002/07/09 11:09:41</Created_on> > <Client_version>1.8.9</Client_version> > <General_info> > <Operating_system> > <Name>Microsoft Windows XP</Name> > <Major_version>5</Major_version> > <Minor_version>1</Minor_version> > <Build_number>2600</Build_number> > <Platform_ID>2</Platform_ID> > > <Additional_information></Additional_information> > <User_name>someone</User_name> > <Boot_time>07/09/2002 09:04:21</Boot_time> > <Registered_to>registered</Registered_to> > > <Registration_code>534574-345-7467453-74538</Registration_code> > </Operating_system> > </Computer> > > Here is a snippit from the database setup script: > > create table computers > ( > computer_name varchar(50) primary key, > operating_system_name varchar(50), > operating_system_version varchar(50), > operating_system_additional_info varchar(50), > > processor_name varchar(50), > processor_manufacturer varchar(50), > processor_count numeric, > processor_speed numeric, > processor_family varchar(50), > processor_model varchar(50), > processor_stepping varchar(50), > > memory_physical numeric, > memory_virtual numeric, > memory_page_file_size numeric, > > video_bios_date varchar(50), > video_bios_version varchar(50), > display_name varchar(50), > display_manufacturer varchar(50), > display_hor_res numeric, > display_ver_res numeric, > display_color_depth numeric, > display_refresh_rate numeric, > > winsock_domain_name varchar(50), > winsock_description varchar(50), > winsock_status varchar(50), > winsock_version varchar(50), > winsock_name varchar(50) > ); > > All the xml files will have the same format. A > Windows program called Alchemy Network Inventory is > used to generate the xml files. I want to populate > the tables in Postgres with the data in the XML files. > The software that generates the xml files was > actually designed for use with MS Access and has built > in ODBC capabilities. I actually designed the > Postgres database off a template for Access. Where > I'm running into issues is I'm using Postgre on Linux > instead of Access on Windows. Yes, I can use the ODBC > connection with the Postgres odbc drivers and export > the files from a Windows machine to the database > manually. There lies the other issue: I would like to > automate the process. I want to try to keep as much > as I possibly can on the linux box because it is much > easier to script and automate than if I were try it > from a Windows machine. If you (or anyone) would like > to look at an entire xml or .sql file, I would be more > than happy to forward them or if I'm still lacking > information, please let me know. > > Thanks. > > > __________________________________________________ > Do You Yahoo!? > Sign up for SBC Yahoo! Dial - First Month Free > http://sbc.yahoo.com > > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly
Did you think about perl ? Nathan Hill wrote: > Here is a snippit of code from one of the xml files: > > <?xml version="1.0" encoding="UTF-8"?> > <Computer> > <Created_on>2002/07/09 11:09:41</Created_on> > <Client_version>1.8.9</Client_version> > <General_info> > <Operating_system> > <Name>Microsoft Windows XP</Name> > <Major_version>5</Major_version> > <Minor_version>1</Minor_version> > <Build_number>2600</Build_number> > <Platform_ID>2</Platform_ID> > > <Additional_information></Additional_information> > <User_name>someone</User_name> > <Boot_time>07/09/2002 09:04:21</Boot_time> > <Registered_to>registered</Registered_to> > > <Registration_code>534574-345-7467453-74538</Registration_code> > </Operating_system> > </Computer> > > Here is a snippit from the database setup script: > > create table computers > ( > computer_name varchar(50) primary key, > operating_system_name varchar(50), > operating_system_version varchar(50), > operating_system_additional_info varchar(50), > > processor_name varchar(50), > processor_manufacturer varchar(50), > processor_count numeric, > processor_speed numeric, > processor_family varchar(50), > processor_model varchar(50), > processor_stepping varchar(50), > > memory_physical numeric, > memory_virtual numeric, > memory_page_file_size numeric, > > video_bios_date varchar(50), > video_bios_version varchar(50), > display_name varchar(50), > display_manufacturer varchar(50), > display_hor_res numeric, > display_ver_res numeric, > display_color_depth numeric, > display_refresh_rate numeric, > > winsock_domain_name varchar(50), > winsock_description varchar(50), > winsock_status varchar(50), > winsock_version varchar(50), > winsock_name varchar(50) > ); > > All the xml files will have the same format. A > Windows program called Alchemy Network Inventory is > used to generate the xml files. I want to populate > the tables in Postgres with the data in the XML files. > The software that generates the xml files was > actually designed for use with MS Access and has built > in ODBC capabilities. I actually designed the > Postgres database off a template for Access. Where > I'm running into issues is I'm using Postgre on Linux > instead of Access on Windows. Yes, I can use the ODBC > connection with the Postgres odbc drivers and export > the files from a Windows machine to the database > manually. There lies the other issue: I would like to > automate the process. I want to try to keep as much > as I possibly can on the linux box because it is much > easier to script and automate than if I were try it > from a Windows machine. If you (or anyone) would like > to look at an entire xml or .sql file, I would be more > than happy to forward them or if I'm still lacking > information, please let me know. > > Thanks. > > > __________________________________________________ > Do You Yahoo!? > Sign up for SBC Yahoo! Dial - First Month Free > http://sbc.yahoo.com > > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly >
This is just to demonstrate how XSLT can come to your rescue (You will obviously have to adapt the approach). The following stylesheet will generate a series of SQL inserts. I use the sablotron XSLT transformer but really any suitable program should work. Regards, MArk <?xml version="1.0"?> <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:output method="text"/> <!-- Assumes the XML file has a series of Operating_system tags --> <xsl:template match="/"> <xsl:apply-templates select="//Operating_system"/> </xsl:template> <!-- Generate a SQL INSERT for each "Operating_system" tag --> <xsl:template match="Operating_system"> INSERT INTO computers ( Name, Major_version, Minor_version, Build_number, Platform_ID, Additional_information, User_name, Boot_time, Registered_to, Registration_code VALUES ( '<xsl:value-of select="Name"/>', '<xsl:value-of select="Major_version"/>', '<xsl:value-of select="Minor_version"/>', '<xsl:value-of select="Build_number"/>', '<xsl:value-of select="Platform_ID"/>', '<xsl:value-of select="Additional_information"/>', '<xsl:value-of select="User_name"/>', '<xsl:value-of select="Boot_time"/>', '<xsl:value-of select="Registered_to"/>', '<xsl:value-of select="Registration_code"/>'); </xsl:template> </xsl:stylesheet> Nathan Hill wrote: > Here is a snippit of code from one of the xml files: > > <?xml version="1.0" encoding="UTF-8"?> > <Computer> > <Created_on>2002/07/09 11:09:41</Created_on> > <Client_version>1.8.9</Client_version> > <General_info> Please note that this tag is not closed!!!! > <Operating_system> > <Name>Microsoft Windows XP</Name> > <Major_version>5</Major_version> > <Minor_version>1</Minor_version> > <Build_number>2600</Build_number> > <Platform_ID>2</Platform_ID> > > <Additional_information></Additional_information> > <User_name>someone</User_name> > <Boot_time>07/09/2002 09:04:21</Boot_time> > <Registered_to>registered</Registered_to> > > <Registration_code>534574-345-7467453-74538</Registration_code> > </Operating_system> > </Computer> >