Insert/Update/Select large XML files(hstore?) - Mailing list pgsql-admin

From Raju Angani
Subject Insert/Update/Select large XML files(hstore?)
Date
Msg-id CAJ-04OoZVyjjiuDtnzUKUiC65X4jEB=E4b--UdDXii=p8xksVQ@mail.gmail.com
Whole thread Raw
List pgsql-admin
Hi pg gurus,

I'm very new into postgres, and I need some help on handling large xml files(20k lines or 1MB size) within the postgres database. Today we are using xDB(from EMC) to handle this data. Overall xDB is good, but it has got some serious limitations. I would like to explore hstore or xml datatype in postgres to do a POC, and my case is to prove postgres can handle this kind of data seamlessly. 

Could someone point me in the right direction on how to achieve my goals.
1) Fast inserts/updates
2) Select data seamlessly
3) Support adhoc queries(there could be 20k documents of size 300kb~1024kb each), 
e.g.: select all the matching attributes of device scsi from all 20k documents.


Sample data from my xml files. I could have massive xml doc in this format.

      <scsiLun xsi:type="HostScsiDisk" qs:id="ScsiLun:key%2dvim%2ehost%2eScsiDisk%2ddisk4">
        <deviceName>/devices/disks/disk4</deviceName>
        <deviceType>scsi-disk</deviceType>
        <key>host.ScsiDisk-disk4</key>
        <uuid>disk4</uuid>
        <canonicalName>hba4:1:1</canonicalName>
        <lunType>disk</lunType>
        <scsiLevel>0</scsiLevel>
        <durableName>
          <namespace>Unknown</namespace>
          <namespaceId>0</namespaceId>
          <data>9</data>
        </durableName>
        <queueDepth>0</queueDepth>
        <operationalState>ok</operationalState>
        <capacity>
          <blockSize>4096</blockSize>
          <block>100000000</block>
        </capacity>
        <devicePath>/devices/disks/disk4</devicePath>
      </scsiLun>
      <scsiLun xsi:type="HostScsiDisk" qs:id="ScsiLun:key%2dvim%2ehost%2eScsiDisk%2ddisk10">
        <deviceName>/devices/disks/disk10</deviceName>
        <deviceType>scsi-disk</deviceType>
        <key>host.ScsiDisk-disk10</key>
        <uuid>disk10</uuid>
        <canonicalName>hba10:0:0</canonicalName>
        <lunType>disk</lunType>
        <scsiLevel>0</scsiLevel>
        <durableName>
          <namespace>Unknown</namespace>
          <namespaceId>0</namespaceId>
          <data>9</data>
        </durableName>
        <queueDepth>0</queueDepth>
        <operationalState>ok</operationalState>
        <capacity>
          <blockSize>4096</blockSize>
          <block>10000000</block>
        </capacity>
        <devicePath>/devices/disks/disk10</devicePath>
      </scsiLun>
      <adapter>
          <key>host.ScsiTopology.Interface-hba11</key>
          <adapter xlink:type="simple" xlink:href="#HostHostBusAdapter:key%2dvim%2ehost%2eParallelScsiHba%2dhba11"/>
          <target>
            <key>host.ScsiTopology.Target-hba11:0:0</key>
            <target>0</target>
            <lun>
              <key>host.ScsiTopology.Lun-disk100</key>
              <lun>0</lun>
              <scsiLun xlink:type="simple" xlink:href="#ScsiLun:key%2dvim%2ehost%2eScsiDisk%2ddisk100"/>
            </lun>
            <lun>
              <key>host.ScsiTopology.Lun-disk101</key>
              <lun>1</lun>
              <scsiLun xlink:type="simple" xlink:href="#ScsiLun:key%2dvim%2ehost%2eScsiDisk%2ddisk101"/>
            </lun>
            <lun>
              <key>host.ScsiTopology.Lun-disk102</key>
              <lun>2</lun>
              <scsiLun xlink:type="simple" xlink:href="#ScsiLun:key%2dvim%2ehost%2eScsiDisk%2ddisk102"/>
            </lun>
            <lun>
              <key>host.ScsiTopology.Lun-disk103</key>
              <lun>3</lun>
              <scsiLun xlink:type="simple" xlink:href="#ScsiLun:key%2dvim%2ehost%2eScsiDisk%2ddisk103"/>
            </lun>
            <lun>
              <key>host.ScsiTopology.Lun-disk104</key>
              <lun>4</lun>
              <scsiLun xlink:type="simple" xlink:href="#ScsiLun:key%2dvim%2ehost%2eScsiDisk%2ddisk104"/>
            </lun>
            <lun>
              <key>host.ScsiTopology.Lun-disk105</key>
              <lun>5</lun>
              <scsiLun xlink:type="simple" xlink:href="#ScsiLun:key%2dvim%2ehost%2eScsiDisk%2ddisk105"/>
            </lun>
             <transport xsi:type="HostParallelScsiTargetTransport"/>
          </target>
        </adapter>
      <mountInfo>
        <mountInfo>
          <path>/volumes/f121d633-405475583f86-bcacce3ac69a</path>
          <accessMode>readWrite</accessMode>
          <mounted>true</mounted>
          <accessible>true</accessible>
        </mountInfo>
        <volume xsi:type="HostVolume">
          <type>XFS</type>
          <name>sh-ds-3</name>
          <capacity>322122547200</capacity>
          <blockSizeMb>1</blockSizeMb>
          <maxBlocks>63963136</maxBlocks>
          <majorVersion>5</majorVersion>
          <version>5.00</version>
          <uuid>f121d633-405475583f86-bcacce3ac69a</uuid>
          <extent>
            <diskName>hba10:0:3</diskName>
            <partition>1</partition>
          </extent>
          <xfsUpgradable>false</xfsUpgradable>
        </volume>
      </mountInfo>

pgsql-admin by date:

Previous
From: Joe Tennant
Date:
Subject: pg_dump out of memory issue...
Next
From: "Saravanakumar Ramasamy"
Date:
Subject: When I executed type cast functions. The postgres normal concatenation operator query was breaking.