Re: pg_dump --split patch - Mailing list pgsql-hackers

From Marko Tiikkaja
Subject Re: pg_dump --split patch
Date
Msg-id 50A969B6.20906@joh.to
In response to Re: pg_dump --split patch  (Dimitri Fontaine <dimitri@2ndQuadrant.fr>)
Responses Re: pg_dump --split patch  (Dimitri Fontaine <dimitri@2ndQuadrant.fr>)
Re: pg_dump --split patch  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
Hi,

On 16/11/2012 15:52, Dimitri Fontaine wrote:
> Marko Tiikkaja <pgmail@joh.to> writes:
>> The general output scheme looks like this:
>>     schemaname/OBJECT_TYPES/object_name.sql,
>
> I like this feature, I actually did have to code it myself in the past
> and several other people did so, so we already have at least 3 copies of
> `getddl` variants around. I really think this feature should be shipped
> by default with PostgreSQL.
>
> I don't much care for the all uppercase formatting of object type
> directories in your patch though.

*shrug*  I have no real preference one way or the other.

>> Overloaded functions are dumped into the same file.  Object names are
>> encoded into the POSIX Portable Filename Character Set ([a-z0-9._-]) by
>> replacing any characters outside that set with an underscore.
>
> What happens if you have a table foo and another table "FoO"?

They would go to the same file.  If you think there are technical issues 
behind that decision (e.g. the dump would not restore), I would like to 
hear an example case.

On the other hand, some people might find it preferable to have them in 
different files (for example foo, foo.1, foo.2 etc).  Or some might 
prefer some other naming scheme.  One of the problems with this patch is 
exactly that people prefer different things, and providing switches for 
all of the different options people come up with would mean a lot of 
switches. :-(
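
For illustration only, here is a minimal Python sketch of the encoding and of the suffix alternative mentioned above. It is not the patch's C code. It assumes names are lowercased before out-of-set characters are replaced, which is what would put foo and "FoO" in the same file; the function names encode_object_name and assign_files are hypothetical.

```python
import re


def encode_object_name(name: str) -> str:
    """Encode an object name into the set [a-z0-9._-].

    Assumption: the name is lowercased first, then every character
    still outside the set is replaced with an underscore.
    """
    return re.sub(r"[^a-z0-9._-]", "_", name.lower())


def assign_files(names):
    """Map each object name to a file name.

    The first object with a given encoded name keeps it; later ones
    get .1, .2, ... appended (the foo, foo.1, foo.2 scheme).
    """
    seen = {}    # encoded name -> number of objects seen so far
    result = {}  # original name -> file name (without .sql)
    for name in names:
        base = encode_object_name(name)
        n = seen.get(base, 0)
        seen[base] = n + 1
        result[name] = base if n == 0 else f"{base}.{n}"
    return result


if __name__ == "__main__":
    print(encode_object_name("foo"), encode_object_name("FoO"))
    print(assign_files(["foo", "FoO", "bar"]))
```

Note that this naive suffix scheme is itself ambiguous if an object is literally named foo.1, which is one more example of how many choices a naming scheme has to make.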

>> Restoring the dump is supported through an index.sql file containing
>> statements which include (through \i) the actual object files in the dump
>> directory.
>
> I think we should be using \ir now that we have that.

Good point, will have to get that fixed.
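
As a rough sketch of what an index.sql using \ir could look like, the Python snippet below builds the file contents from a list of object files. The helper name build_index and the example paths are hypothetical, and the list is assumed to be in a restorable (dependency) order already.

```python
def build_index(paths):
    """Return index.sql contents that include each object file via \\ir.

    psql resolves \\ir paths relative to the directory of the script
    being run, so the dump can be restored from any working directory.
    Assumption: `paths` is already in an order that restores cleanly.
    """
    return "".join(f"\\ir {path}\n" for path in paths)


if __name__ == "__main__":
    # Hypothetical paths following the schemaname/OBJECT_TYPES/object_name.sql scheme.
    print(build_index([
        "public/TABLES/foo.sql",
        "public/FUNCTIONS/bar.sql",
    ]), end="")
```

With plain \i the same paths would be resolved against psql's current working directory, so running index.sql from outside the dump directory would fail to find the object files.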

>> Any thoughts?  Objections on the idea or the implementation?
>
> As far as the implementation goes, someone with more experience on the
> Archiver Handles should have a look. To me, it looks like you are trying
> to shoehorn your feature in the current API and that doesn't feel good.

It feels a bit icky to me too, but I didn't feel comfortable with 
putting in a lot of work to refactor the API because of how 
controversial this feature is.

> The holy grail here that we've been speaking about in the past would be
> to separate out tooling and formats so that we have:
>
>     pg_dump | pg_restore
>     pg_export | psql
>
> In that case we would almost certainly need libpgdump to share the code,
> and we maybe could implement a binary output option for pg_dump too
> (yeah, last time it was proposed we ended up with bytea_output = 'hex').

While I agree that this idea - when implemented - would be nicer in 
practically every way, I'm not sure I want to volunteer to do all the 
necessary work.

> That libpgdump idea basically means we won't have the --split feature in
> 9.3, and that's really bad, as we already are some releases late on
> delivering that, in my opinion.
>
> Maybe the pg_export and pg_dump tool could share code by just #include
> magic rather than a full blown lib in a first incantation?

That's one idea...


Regards,
Marko Tiikkaja


