Thread: no XLOG during COPY?

no XLOG during COPY?

From
Andrew Dunstan
Date:
Back in February, Tom said here: 
http://archives.postgresql.org/pgsql-hackers/2008-02/msg00963.php :

> That defeats a couple of optimizations that
> Simon put in recently.  The one for no XLOG during COPY is not too
> hard to see how to re-enable, but I'm not sure what else there was.
>
>   

Could someone please point me at where this optimization was committed? 
I'm having trouble locating it.

Thanks

andrew



Re: no XLOG during COPY?

From
"Guillaume Smet"
Date:
On Thu, Sep 11, 2008 at 9:01 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
> Could someone please point me at where this optimization was committed? I'm
> having trouble locating it.

I think it's this one:
http://archives.postgresql.org/pgsql-committers/2007-01/msg00296.php

-- 
Guillaume


Re: no XLOG during COPY?

From
Heikki Linnakangas
Date:
Andrew Dunstan wrote:
> Back in February, Tom said here: 
> http://archives.postgresql.org/pgsql-hackers/2008-02/msg00963.php :
> 
>> That defeats a couple of optimizations that
>> Simon put in recently.  The one for no XLOG during COPY is not too
>> hard to see how to re-enable, but I'm not sure what else there was.
> 
> Could someone please point me at where this optimization was committed? 
> I'm having trouble locating it.

http://archives.postgresql.org/pgsql-committers/2007-01/msg00296.php

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: no XLOG during COPY?

From
Andrew Dunstan
Date:

Heikki Linnakangas wrote:
> Andrew Dunstan wrote:
>> Back in February, Tom said here: 
>> http://archives.postgresql.org/pgsql-hackers/2008-02/msg00963.php :
>>
>>> That defeats a couple of optimizations that
>>> Simon put in recently.  The one for no XLOG during COPY is not too
>>> hard to see how to re-enable, but I'm not sure what else there was.
>>
>> Could someone please point me at where this optimization was 
>> committed? I'm having trouble locating it.
>
> http://archives.postgresql.org/pgsql-committers/2007-01/msg00296.php


Great, thanks (and also to Guillaume).

It looks to me like the simple way around this issue would be to provide 
an option to have pg_restore emit:   begin; truncate foo; copy foo ... commit;

The truncate will be trivial as there won't be any data or indexes at 
that stage anyway.

cheers

andrew


Re: no XLOG during COPY?

From
Simon Riggs
Date:
On Thu, 2008-09-11 at 15:25 -0400, Andrew Dunstan wrote:

> Great, thanks (and also to Guillaume).
> 
> It looks to me like the simple way around this issue would be to provide 
> an option to have pg_restore emit:
>     begin; truncate foo; copy foo ... commit;
> 
> The truncate will be trivial as there won't be any data or indexes at 
> that stage anyway.

Not sure which stage you're talking about. If this is a parallel restore
and you are running a create in one session and a load in another, then
ISTM you have no way of knowing that for certain.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



Re: no XLOG during COPY?

From
Andrew Dunstan
Date:

Simon Riggs wrote:
> On Thu, 2008-09-11 at 15:25 -0400, Andrew Dunstan wrote:
>
>   
>> Great, thanks (and also to Guillaume).
>>
>> It looks to me like the simple way around this issue would be to provide 
>> an option to have pg_restore emit:
>>     begin; truncate foo; copy foo ... commit;
>>
>> The truncate will be trivial as there won't be any data or indexes at 
>> that stage anyway.
>>     
>
> Not sure which stage you're talking about. If this is a parallel restore
> and you are running a create in one session and a load in another, then
> ISTM you have no way of knowing that for certain.
>
>   

Er, who doesn't know what for certain, exactly? pg_restore will 
certainly know that it has created the table in another session and can 
thus safely truncate the table in the same transaction as the data load.

cheers

andrew


Re: no XLOG during COPY?

From
Russell Smith
Date:
Andrew Dunstan wrote:
> [snip]
>>  
>
> Er, who doesn't know what for certain, exactly? pg_restore will
> certainly know that it has created the table in another session and
> can thus safely truncate the table in the same transaction as the data
> load.
>
> cheers
>
> andrew
>
I'm confused about why table definition and data can't be loaded in the
same backend and transaction.  Can somebody explain that?

All items in the tree like  A -> B -> C -> D  should all be loaded in
the same transaction as they are serially dependent.  I can't think of a
way that the table data requires more than just the table to load. 
Foreign keys may produce this situation but if all tables are loaded
with the data I can't see how it can happen.  As Foreign key tables must
be loaded before the referencing table.  But then I think these
constraints are loaded at the end anyway.

The first cut of this may not have the dependency resolution smarts to
work out how best to group restore items together to send to a backend
together.  My research into how the directed graph dependency
information is stored should allow for dishing out the data to backends
in the best possible way.  But currently there is no graph as such, just
a serial list of items that are safe to load.  Producing the graph will
give a better idea of maximum concurrency based on what's dependent on
each other.  But the graph has to be built from the dependency
information that's stored.

Is it also feasible to have the -1 (single transaction) option to
complete the largest possible work unit inside a single transaction. 
This means there would be 1transaction per backend work unit, eg (A, B,
C, D in the above).  I don' t know if indexes can skip WAL if they are
in the table creation transaction but that would seem like another win
if they were added at the same time as the table.  That does play
against the ideas of running all of the index creation statements in
parallel to get the benefit of synchronized scan.  I don't know what
going to be the biggest win on big hardware as I don't have any.  Just
something to think about.

Thanks

Russell Smith


Re: no XLOG during COPY?

From
Andrew Dunstan
Date:

Russell Smith wrote:
> Andrew Dunstan wrote:
>   
>> [snip]
>>     
>>>  
>>>       
>> Er, who doesn't know what for certain, exactly? pg_restore will
>> certainly know that it has created the table in another session and
>> can thus safely truncate the table in the same transaction as the data
>> load.
>>
>> cheers
>>
>> andrew
>>
>>     
> I'm confused about why table definition and data can't be loaded in the
> same backend and transaction.  Can somebody explain that?
>   

That would delay other things that depend on the table definition (e.g. 
function definitions). If we don't need to make this link (and we don't) 
then I can't see why we should shackle ourselves with it.

cheers

andrew