Thread: parallel restore

parallel restore

From

Andrew Dunstan

Date:

29 December 2008, 19:42:07

Attached is the latest parallel restore patch. I think this is getting
fairly close.

Includes help text, docco and some extra error checking.

cheers

andrew

Attachment

parallel_restore_13.patch.gz

Re: parallel restore

From

Laurent Coustet

Date:

31 December 2008, 05:53:14

Andrew Dunstan wrote:
> 
> Attached is the latest parallel restore patch. I think this is getting 
> fairly close.

Just some details, you often mix tab and spaces for indentation... 
What's the standard in pgsql ?

-- 
Laurent COUSTET

Re: parallel restore

From

Magnus Hagander

Date:

31 December 2008, 06:51:03

Laurent Coustet wrote:
> Andrew Dunstan wrote:
>>
>> Attached is the latest parallel restore patch. I think this is getting
>> fairly close.
> 
> Just some details, you often mix tab and spaces for indentation...
> What's the standard in pgsql ?

It's tabs, see:
http://www.postgresql.org/docs/8.3/static/source-format.html.

There's an entry in the dev faq as well.

Anyway, we have pgindent to fix any such issues - probably Andrew will
run it through that before he applies, if not it will still be run
before release.

//Magnus

Re: parallel restore

From

Andrew Dunstan

Date:

03 January 2009, 17:34:48

I wrote:
>
> Attached is the latest parallel restore patch. I think this is getting 
> fairly close.
>
> Includes help text, docco and some extra error checking.
>
>   

I propose to commit this unless someone wants more time for reviewing.

cheers

andrew

Re: parallel restore

From

Tom Lane

Date:

04 January 2009, 13:14:49

Andrew Dunstan <andrew@dunslane.net> writes:
> I propose to commit this unless someone wants more time for reviewing.

A moment's eyeballing of the patch finds rather a lot of debugging cruft
yet (inserted printfs, "#ifdef WIN32blah", etc); also I think the
addition to include/port.h belongs in port/win32.h instead.

More generally, though, I think it hasn't been looked at much by
anyone (certainly not by me).
        regards, tom lane

Re: parallel restore

From

Andrew Dunstan

Date:

04 January 2009, 13:28:32


Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>   
>> I propose to commit this unless someone wants more time for reviewing.
>>     
>
> A moment's eyeballing of the patch finds rather a lot of debugging cruft
> yet (inserted printfs, "#ifdef WIN32blah", etc); also I think the
> addition to include/port.h belongs in port/win32.h instead.
>
> More generally, though, I think it hasn't been looked at much by
> anyone (certainly not by me).
>
>             
>   

I must have sent the wrong patch file.

I'll recheck for debugging stuff and send a new patch.

I can certainly hold off committing it.

cheers

andrew

Re: parallel restore

From

Simon Riggs

Date:

04 January 2009, 13:43:39

On Mon, 2008-12-29 at 18:42 -0500, Andrew Dunstan wrote:
> Attached is the latest parallel restore patch. I think this is getting 
> fairly close.
> 
> Includes help text, docco and some extra error checking.

Very brief review.

Hopefully the --truncate-before-load option works in both parallel mode
and data-only mode?

Few typos on docs.

No README or comments explain how the patch works. This part of the code
was probably most opaque already, so its really needed unfortunately. 

* I'm particularly interested in error handling, for example if one
thread hits a deadlock, gets accidentally killed by user, hits bug in
custom add-in code etc..

* Earlier bugs with pre/post data were related to missing objects or
putting them in wrong groups. Documenting that will help identify
errors.

Few starnge names, sorry:

work_is_being_done --> work_in_progress
_inhibit_data?? --> _skip_data??
prestored --> parallel_restored
restore_one_te --> restore_TocEntry

Would like ability to increase/decrease number of parallel threads as
restore progresses. Experience says you always pick the wrong number
and/or situation changes while in progress.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support

Re: parallel restore

From

Alvaro Herrera

Date:

04 January 2009, 14:12:02

Andrew Dunstan wrote:
>
> Attached is the latest parallel restore patch. I think this is getting  
> fairly close.

Some random comments

- please #define the return type of prestore().  Also, that it returns
in one platform and exits in another seems weird as an API.  I think it
should return in both cases, and have the caller act appropriately.

- RestoreArchiveParallel has too many #ifdef WIN32.  It should be
possible to split it so that the cruft is contained better.

- Aren't there too many checks for libz presence?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: parallel restore

From

"Jaime Casanova"

Date:

06 January 2009, 17:07:58

On Mon, Dec 29, 2008 at 6:42 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>
> Attached is the latest parallel restore patch. I think this is getting
> fairly close.
>

hi, i was making some tests in windows...

but for some reason the pg_restore simply hangs...

i'm using:
pg_restore -f mic.backup -Fc -v -m5

there is a way to know if it's really hanging or is simply too slow? i
plan to let it run all night long just in case...

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

Re: parallel restore

From

"Jaime Casanova"

Date:

06 January 2009, 17:08:28

On Tue, Jan 6, 2009 at 4:04 PM, Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:
> On Mon, Dec 29, 2008 at 6:42 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>>
>> Attached is the latest parallel restore patch. I think this is getting
>> fairly close.
>>
>
> hi, i was making some tests in windows...
>

anyway, when i try it, it prints on the screen "pgoff_t: 8, long:4"
maybe a debugging print you have to remove

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

Re: parallel restore

From

Andrew Dunstan

Date:

06 January 2009, 17:13:33


Jaime Casanova wrote:
> On Mon, Dec 29, 2008 at 6:42 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>   
>> Attached is the latest parallel restore patch. I think this is getting
>> fairly close.
>>
>>     
>
> hi, i was making some tests in windows...
>
> but for some reason the pg_restore simply hangs...
>
> i'm using:
> pg_restore -f mic.backup -Fc -v -m5
>
> there is a way to know if it's really hanging or is simply too slow? i
> plan to let it run all night long just in case...
>   


Strange. Maybe the server log will show activity?

cheers

andrew

Re: parallel restore

From

Tom Lane

Date:

06 January 2009, 17:32:50

Andrew Dunstan <andrew@dunslane.net> writes:
> Jaime Casanova wrote:
>> i'm using:
>> pg_restore -f mic.backup -Fc -v -m5

> Strange. Maybe the server log will show activity?

There's no connection info, so that should just print to stdout, and
probably there is no point in any parallelism.  I'm guessing the -m
switch invokes code that fails to deal with this case.
        regards, tom lane

Re: parallel restore

From

"Jaime Casanova"

Date:

06 January 2009, 18:43:55

On Tue, Jan 6, 2009 at 4:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Jaime Casanova wrote:
>>> i'm using:
>>> pg_restore -f mic.backup -Fc -v -m5
>
>> Strange. Maybe the server log will show activity?
>
> There's no connection info, so that should just print to stdout, and
> probably there is no point in any parallelism.  I'm guessing the -m
> switch invokes code that fails to deal with this case.
>

ah! ok, i run the command in this way instead:
pg_restore -p 54320 -Fc -v -d mic mic.backup (why i can't use -f?) and
it works fine, then to test parallel restore i did
pg_restore -p 54320 -Fc -v -m5 -d mic mic.backup
but i forgot to clean up the database... of course it throws a lot of
"$object_name already exists" messages and the last one was a little
strange, it says:

pg_restore: [archiver (db)] connection to database "public" failed:
FATAL: database "public" does not exist

but there isn't a "public" database in the backup...

besides that, maybe, unrelated issue, it seems to work fine...

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

Re: parallel restore

From

"Jaime Casanova"

Date:

06 January 2009, 18:48:20

On Mon, Dec 29, 2008 at 6:42 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>
> Attached is the latest parallel restore patch. I think this is getting
> fairly close.
>

mmm... seems this patch are two in one... you're adding --multi-thread
and --truncate-before-load options where the second one seems to be an
optimization...

maybe it's better to split in two incremental patches?

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

Re: parallel restore

From

Andrew Dunstan

Date:

06 January 2009, 18:54:45

Jaime Casanova wrote:
> On Mon, Dec 29, 2008 at 6:42 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>   
>> Attached is the latest parallel restore patch. I think this is getting
>> fairly close.
>>
>>     
>
> mmm... seems this patch are two in one... you're adding --multi-thread
> and --truncate-before-load options where the second one seems to be an
> optimization...
>
> maybe it's better to split in two incremental patches?
>
>   

Well, the only reason it was needed was because you can't run a parallel 
restore in a single transaction. If the whole restore is run in a single 
transaction then truncate before load should be unnecessary.

But if people want it made more general I can split it out.

cheers

andrew

Re: parallel restore

From

"Jaime Casanova"

Date:

06 January 2009, 23:56:56

On Tue, Jan 6, 2009 at 5:54 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>
> Well, the only reason it was needed was because you can't run a parallel
> restore in a single transaction. If the whole restore is run in a single
> transaction then truncate before load should be unnecessary.
>

doesn't understand you. Anyway i tried to run with
--truncate-before-load and got a message about that should be
necessary to run TRUNCATE CASCADE instead.

Sorry, don't have the real message at hand. Seems like the recently
applied patch about fseeko made this one to no longer apply cleanly

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

Re: parallel restore

From

Andrew Dunstan

Date:

07 January 2009, 19:38:04

Jaime Casanova wrote:
>  Anyway i tried to run with
> --truncate-before-load and got a message about that should be
> necessary to run TRUNCATE CASCADE instead.
>
>   

Actually, this raises an interesting point. It doesn't seem safe to 
truncate before loading unless we have just created the table earlier in 
the restore. If we did create the table then any FK constraints that 
depend on the table should not have been created yet, so there should be 
no danger of getting this message (and there should be on danger of our 
wiping out actual data - the whole point of this is not to clean the 
tables but to inhibit unnecessary WAL logging of data loads).

That means that it would be useless for a data one restore, which was 
apparently the context in which Jaime was trying to use it.

Now, we could decide that we always want to do a safe truncate in a 
parallel restore (i.e. if we have created the table in the same 
restore), even if archive_mode is on. Then this switch would be 
redundant, and we might avoid some confusion. I'm inclined to do that 
right now. In that case we could leave for consideration for 8.5 a 
switch providing for a TRUNCATE CASCADE on tables before loading them.

Thoughts?

cheers

andrew

Re: parallel restore

From

Tom Lane

Date:

07 January 2009, 20:25:57

Andrew Dunstan <andrew@dunslane.net> writes:
> Now, we could decide that we always want to do a safe truncate in a 
> parallel restore (i.e. if we have created the table in the same 
> restore), even if archive_mode is on. Then this switch would be 
> redundant, and we might avoid some confusion. I'm inclined to do that 
> right now. In that case we could leave for consideration for 8.5 a 
> switch providing for a TRUNCATE CASCADE on tables before loading them.

+1.  I'm not at all clear on the use-case for a user visible switch
of this sort anyway; it seems more like a foot-gun than something
really helpful.
        regards, tom lane