Home > mailing lists

Re: pg_dump: Sorted output, referential integrity - Mailing list pgsql-hackers

From	Christof Petig
Subject	Re: pg_dump: Sorted output, referential integrity
Date	December 10, 2001 03:28:43
Msg-id	3C147094.DB58BBF4@petig-baender.de Whole thread Raw
In response to	Re: pg_dump: Sorted output, referential integrity ("Christopher Kings-Lynne" <chriskl@familyhealth.com.au>)
Responses	Re: pg_dump: Sorted output, referential integrity
List	pgsql-hackers

Tree view

Christopher Kings-Lynne wrote:

> > > > but to manage the data in a version
> > > >control system you need it consistently sorted. So a flag to sort by
> > > >either primary key or left to right would be of great value. (--sorted
> > > >?)
> > >
> > > Not really very generalizable when you consider user defined types,
> > > triggers etc.
> >
> > Hmmm. But if we have a primary key on columns (A,B,C) and request the data
> > 'order by A,B,C' this should be portable, shouldn't it?
> > If we don't have a primary key simply ordering by 1,2,3,...n
> > should also work.
> > Or am I missing something?
>
> I can see how ordering a dump by the primary key would be a neat way of
> 'clustering' your data after a restore, however I have qualms about the
> scalability of such a scheme.  What if someone has a 100GB table?  They may
> have arranged things so that they never get a sort from it or something, or
> it might take ages.  However I guess if it's an optional parameter it might
> be neat.
>
> My feeling is that it won't happen unless you actually code it into a patch
> that makes it a parameter to pg_dump.  Having an actual patch is a great way
> of getting something you want done ;)
>
> Alternatively, have you tried just writing a PERL script (or some clever sed
> script) that will just sort the COPY FROM sections...?

That's beyond my perl skills. And I believe sed to be not the right tool. (hmm,
perhaps split (at 'COPY FROM' and at '\.'), then sort, then cat ... many
(perhaps big) temporary files, let the db do the hard work)

But making a patch to pg_dump is a matter of (say) up to 4 hours.
I'll do it since you seem to like it and nobody started doing it so far.

Christof

pgsql-hackers by date:

From: Hiroshi Inoue
Date: 10 December 2001, 02:37:36
Subject: Re: [BUGS] Bug #533: BLOB (lo type) objects could not be restored

From: "Christopher Kings-Lynne"
Date: 10 December 2001, 03:58:47
Subject: Re: pg_dump: Sorted output, referential integrity

Re: pg_dump: Sorted output, referential integrity - Mailing list pgsql-hackers

Previous

Next