Thread: Can I read data load files without loading in db?
We have an awkward situation.

An affiliate organization periodically sends us a stack of CDs. On the first one there are a couple of small scripts to handle installing the data and/or upgrading the database schema. Most of the CDs' contents are large data files to be used to update a PostgreSQL database. This would be OK if we had much need for the full database. In practice, we typically spend all that time loading the data into the database only to turn around and run our own C++ program that generates extracts containing the relatively few data fields we need on another project (where we load the data into a Berkeley database and use it from there).

I would like to skip the time-consuming PostgreSQL load before extracting, and instead generate the extracts we need with a program that reads directly from the database dump/load files on the CDs. The affiliate organization has made it clear that they cannot afford the resources to make a customized, limited edition of their data.

I would appreciate some hints on the issues. Which of the files in such a situation would carry the data layout information? If any of you have actually done something similar (directly read the archive files without loading them), I'd greatly appreciate hearing about it.

Thanks
-bC
On Wed, Aug 23, 2006 at 12:27:09PM -0700, barryc.ctr@gmail.com wrote:
> We have an awkward situation.
>
> An affiliate organization periodically sends us a stack of CDs. On
> the first one there are a couple of small scripts to handle installing
> the data and/or upgrading the database schema. Most of the CD's
> contents are large data files to be used to update a postgresql
> database.

I assume from the question that the data is in pg_dump format? i.e. a
whole pile of COPY statements with data?

In that case just write a perl script to extract the data.

Or specify more precisely if that's not the problem.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
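As a concrete illustration of the perl approach Martijn suggests, here is a minimal sketch, assuming the dump is a plain-text file of "COPY ... FROM stdin;" blocks with tab-separated rows. The table name and column list are placeholders, not anything from the affiliate's actual schema.

    #!/usr/bin/perl
    # Sketch: print selected columns from one table's COPY block in a
    # plain-text pg_dump.  Assumes tab-separated rows ending with a line
    # containing only "\."; it does not handle \N (NULL) or backslash
    # escapes embedded in the data.
    use strict;
    use warnings;

    my $table  = 'some_table';        # placeholder table name
    my @wanted = ('id', 'name');      # placeholder columns to keep

    my @cols;                         # column order from the COPY header
    my $in_copy = 0;

    while (my $line = <>) {
        chomp $line;
        if (!$in_copy
            && $line =~ /^COPY\s+(?:\S+\.)?\Q$table\E\s*\(([^)]*)\)\s+FROM\s+stdin;/i) {
            @cols = map { my $c = $_; $c =~ s/^\s+|\s+$//g; $c =~ s/"//g; $c }
                    split /,/, $1;
            $in_copy = 1;
            next;
        }
        next unless $in_copy;
        if ($line eq '\\.') { $in_copy = 0; next; }   # end-of-data marker
        my %row;
        @row{@cols} = split /\t/, $line, -1;
        print join("\t", map { defined $row{$_} ? $row{$_} : '' } @wanted), "\n";
    }

Run it as "perl extract.pl db-private.pg-dump.1 > extract.tsv". A real version would also need to handle \N null markers and backslash escapes in the COPY data, and could pull several tables' extracts in one pass.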
On Fri, Aug 25, 2006 at 07:36:38AM -0400, barry conner wrote:
> Thanks for your reply Martijn.
>
> I do not know for certain that it's in pg_dump format, not having dealt
> with it specifically before. I will examine the files more closely to look
> for COPY statements. If not in that format, what else could it be that
> could be shoved right into a postgresql database? In any case, would an
> associated file (schema type?) tell me the field name arrangements that the
> data files are loaded with? Or if COPY statements are used, would they
> name the fields being targeted?

It really depends on the format they send you. It's either plain text,
in which case the COPY commands will tell you which fields. If it's the
custom format you can use tar to extract it IIRC.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
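A quick way to tell which case applies before hunting for COPY statements: pg_dump's custom-format archives begin with the magic bytes "PGDMP", while a plain-text dump begins with readable SQL. A small probe, sketched in perl (pass the dump file as an argument):

    #!/usr/bin/perl
    # Sketch: peek at the first five bytes of a dump file.  Custom-format
    # pg_dump archives begin with the magic string "PGDMP"; plain-text
    # dumps begin with readable SQL.
    use strict;
    use warnings;

    my $file = shift or die "usage: $0 dumpfile\n";
    open my $fh, '<', $file or die "cannot open $file: $!\n";
    binmode $fh;
    read($fh, my $magic, 5);
    close $fh;

    if (defined $magic && $magic eq 'PGDMP') {
        print "$file: pg_dump custom-format archive (use pg_restore)\n";
    } else {
        print "$file: not a custom-format archive; look for COPY blocks\n";
    }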
On Friday, 25 August 2006 13:51, Martijn van Oosterhout wrote:
> It really depends on the format they send you. It's either plain text,
> in which case the COPY commands will tell you which fields. If it's the
> custom format you can use tar to extract it IIRC.

You can also use pg_restore to extract "binary" dump formats to plain text.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Thanks for your reply Martijn.
I do not know for certain that it's in pg_dump format, not having dealt with it specifically before. I will examine the files more closely to look for COPY statements. If not in that format, what else could it be that could be shoved right into a postgresql database? In any case, would an associated file (schema type?) tell me the field name arrangements that the data files are loaded with? Or if COPY statements are used, would they name the fields being targeted?
Thanks again.
Peter Eisentraut wrote:
> On Friday, 25 August 2006 13:51, Martijn van Oosterhout wrote:
> > It really depends on the format they send you. It's either plain text,
> > in which case the COPY commands will tell you which fields. If it's the
> > custom format you can use tar to extract it IIRC.
>
> You can also use pg_restore to extract "binary" dump formats to plain text.

I used "less" to look at one of the files. In this case it's named 'db-private.pg-dump.1' and sits in a subdirectory called 'database' on the CD. "less" shows much familiar data in the file, so I would have assumed it was in text format. I don't notice any COPY commands in the file as I'm looking at it via "less", so I try to locate some via "grep", and it tells me that there are matches, but that it's a binary file so it won't show me the actual occurrences!

I guess that I'll need to try your suggestion of using pg_restore to extract them to plain text. Apparently I'll need to add software to this Fedora Core 3 installation since pg_restore isn't installed currently. Do you recall the switches needed?

Thanks!
-bC
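For reference, pg_restore can write the plain-text form of a custom-format archive to a file without any running database: -f names the output file and the last argument is the archive (pg_restore -l lists an archive's contents). Below is a sketch that does the conversion and then prints the COPY headers, which show each table and its column order; the file names are simply the ones mentioned above.

    #!/usr/bin/perl
    # Sketch: turn a custom-format archive into plain SQL with
    # "pg_restore -f <output> <archive>" (no database connection needed),
    # then print the COPY headers, which show each table and its column order.
    use strict;
    use warnings;

    my $archive = shift || 'db-private.pg-dump.1';   # archive name from the thread
    my $plain   = 'db-private-plain.sql';

    system('pg_restore', '-f', $plain, $archive) == 0
        or die "pg_restore failed ($?)\n";

    open my $fh, '<', $plain or die "cannot open $plain: $!\n";
    while (my $line = <$fh>) {
        print $line if $line =~ /^COPY\s/;
    }
    close $fh;

On Fedora, pg_restore normally ships with the PostgreSQL client package rather than the server package.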