We have an awkward situation.
An affiliate organization periodically sends us a stack of CDs. On
the first one there are a couple of small scripts to handle installing
the data and/or upgrading the database schema. Most of the CDs'
contents are large data files to be used to update a PostgreSQL
database.
This would be OK if we had much need for the full database. In
practice, we spend all that time loading the data into the database
only to turn around and run our own C++ program, which generates
extracts containing the relatively few data fields that we need for
another project (where we load the extracts into a Berkeley database
and use them from there).
I would like to skip the time-consuming PostgreSQL load that
precedes extraction, and instead generate the extracts we need with a
program that reads directly from the database dump/load files on the
CDs. The affiliate organization has made it clear that they cannot
afford the resources to produce a customized, cut-down edition of
their data.
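To make the idea concrete, here is a rough sketch of what I have in mind. It is written in Python only for brevity (our real extractor is C++), and it rests on an assumption I have not confirmed: that the data files are plain-text pg_dump output, where each table's rows sit in a "COPY table (columns) FROM stdin;" block of tab-separated lines ending with "\." on a line by itself. The table and column names (public.parts, id, name, price) are made up for illustration. Compressed or custom-format archives would need something else entirely.

```python
import io
import re

# Header of a COPY block as written by a plain-text pg_dump (assumed format).
# Older dumps may omit the schema or the column list; this sketch requires both.
COPY_HEADER = re.compile(r'^COPY\s+(\S+)\s*\((.*)\)\s+FROM\s+stdin;\s*$',
                         re.IGNORECASE)

# Common backslash escapes of the COPY text format. Octal (\123) and hex (\x41)
# escapes are NOT handled here; a real extractor would need them too.
_ESCAPES = {'b': '\b', 'f': '\f', 'n': '\n', 'r': '\r',
            't': '\t', 'v': '\v', '\\': '\\'}


def unescape(field):
    """Decode one COPY text-format field; a bare \\N means NULL (None)."""
    if field == r'\N':
        return None
    out = []
    i = 0
    while i < len(field):
        ch = field[i]
        if ch == '\\' and i + 1 < len(field):
            nxt = field[i + 1]
            out.append(_ESCAPES.get(nxt, nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return ''.join(out)


def extract(lines, table, wanted):
    """Yield tuples holding only the `wanted` columns of `table`'s COPY block."""
    indexes = None
    for line in lines:
        line = line.rstrip('\n')
        if indexes is None:
            # Still scanning for the header of the table we care about.
            m = COPY_HEADER.match(line)
            if m and m.group(1) == table:
                columns = [c.strip().strip('"') for c in m.group(2).split(',')]
                indexes = [columns.index(w) for w in wanted]
            continue
        if line == r'\.':
            return  # end-of-data marker for this COPY block
        fields = line.split('\t')  # literal tabs in data are escaped as \t
        yield tuple(unescape(fields[i]) for i in indexes)


# Hypothetical sample standing in for one of the data files on the CDs.
sample = io.StringIO(
    "COPY public.parts (id, name, price) FROM stdin;\n"
    "1\tWidget\t9.99\n"
    "2\tGad\\tget\t\\N\n"
    "\\.\n"
)
rows = list(extract(sample, 'public.parts', ['id', 'price']))
```

If the files really do look like this, the column list in each COPY header (together with the CREATE TABLE statements in the schema script) would presumably be the layout information I am after, and the extractor could stream through the files without ever touching a database.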
I would appreciate some hints on the issues involved. Which of the
files in such a set would hold the data layout information? If any of
you have actually done something similar (read the archive files
directly, without loading them), I'd greatly appreciate hearing about
it.
Thanks
-bC