RE: Parallel copy - Mailing list pgsql-hackers

From Hou, Zhijie
Subject RE: Parallel copy
Date
Msg-id 8f241649021d4cac85e884aac7166656@G08CNEXMBPEKD05.g08.fujitsu.local
In response to Re: Parallel copy  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses Re: Parallel copy  (vignesh C <vignesh21@gmail.com>)
List pgsql-hackers
Hi

> 
> my $bytes = $ARGV[0];
> for(my $i = 0; $i < $bytes; $i+=8){
>      print "longdata";
> }
> print "\n";
> --------
> 
> postgres=# copy longdata from program 'perl /tmp/longdata.pl 100000000'
> with (parallel 2);
> 
> This gets stuck forever (or at least I didn't have the patience to wait
> it finish). Both worker processes are consuming 100% of CPU.

I had a look at this problem.

ParallelCopyDataBlock has a size limit:
    uint8        skip_bytes;
    char        data[DATA_BLOCK_SIZE];    /* data read from file */

It seems the input line is so long that the leader process runs out of the shared memory blocks used by the parallel copy workers,
and the leader process then keeps waiting for a free block.

The worker process, for its part, waits until line_state becomes LINE_LEADER_POPULATED,
but the leader process won't set line_state until it has read the whole line.

So it is stuck forever.
Maybe we should reconsider this situation.
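
To make the circular wait easier to see, here is a toy model of it. This is only a sketch and not code from the patch: ToyShared, leader_read_line and worker_can_start are hypothetical names, the line states other than LINE_LEADER_POPULATED are illustrative, and the pool size is whatever the caller passes in. The leader takes one free block per chunk of the line and only publishes the line once all of it has been read, so a line that needs more blocks than the pool holds can never be published.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative line states; only LINE_LEADER_POPULATED is named in this mail. */
typedef enum
{
    LINE_INIT,
    LINE_LEADER_POPULATING,
    LINE_LEADER_POPULATED
} ToyLineState;

/* Hypothetical stand-in for the shared state between leader and workers. */
typedef struct
{
    size_t       block_size;   /* bytes per data block */
    size_t       free_blocks;  /* blocks not yet holding unprocessed data */
    ToyLineState line_state;   /* state of the single line being read */
} ToyShared;

/*
 * Leader side: copy one line into the block pool.
 * Returns false at the point where the real leader would start waiting
 * for a free block. No worker can release a block there, because the
 * line has not been marked LINE_LEADER_POPULATED yet.
 */
bool
leader_read_line(ToyShared *sh, size_t line_size)
{
    size_t remaining = line_size;

    sh->line_state = LINE_LEADER_POPULATING;
    while (remaining > 0)
    {
        size_t chunk;

        if (sh->free_blocks == 0)
            return false;       /* would wait forever for a free block */

        sh->free_blocks--;
        chunk = remaining < sh->block_size ? remaining : sh->block_size;
        remaining -= chunk;
    }
    sh->line_state = LINE_LEADER_POPULATED;
    return true;
}

/* Worker side: a worker may only pick up the line once it is populated. */
bool
worker_can_start(const ToyShared *sh)
{
    return sh->line_state == LINE_LEADER_POPULATED;
}
```

With a block size of 65536 (the copy_buf_len seen in the leader stack) and an assumed pool of 1024 blocks, a line of 100000001 bytes needs 1526 blocks, so leader_read_line returns false and worker_can_start stays false. A line that fits in the pool goes through normally.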

The stack is as follows:

Leader stack:
#3  0x000000000075f7a1 in WaitLatch (latch=<optimized out>, wakeEvents=wakeEvents@entry=41, timeout=timeout@entry=1,
    wait_event_info=wait_event_info@entry=150994945) at latch.c:411
#4  0x00000000005a9245 in WaitGetFreeCopyBlock (pcshared_info=pcshared_info@entry=0x7f26d2ed3580) at copyparallel.c:1546
#5  0x00000000005a98ce in SetRawBufForLoad (cstate=cstate@entry=0x2978a88, line_size=67108864,
    copy_buf_len=copy_buf_len@entry=65536, raw_buf_ptr=raw_buf_ptr@entry=65536,
    copy_raw_buf=copy_raw_buf@entry=0x7fff4cdc0e18) at copyparallel.c:1572
#6  0x00000000005a1963 in CopyReadLineText (cstate=cstate@entry=0x2978a88) at copy.c:4058
#7  0x00000000005a4e76 in CopyReadLine (cstate=cstate@entry=0x2978a88) at copy.c:3863

Worker stack:
#0  GetLinePosition (cstate=cstate@entry=0x29e1f28) at copyparallel.c:1474
#1  0x00000000005a8aa4 in CacheLineInfo (cstate=cstate@entry=0x29e1f28, buff_count=buff_count@entry=0) at copyparallel.c:711
#2  0x00000000005a8e46 in GetWorkerLine (cstate=cstate@entry=0x29e1f28) at copyparallel.c:885
#3  0x00000000005a4f2e in NextCopyFromRawFields (cstate=cstate@entry=0x29e1f28, fields=fields@entry=0x7fff4cdc0b48,
    nfields=nfields@entry=0x7fff4cdc0b44) at copy.c:3615
#4  0x00000000005a50af in NextCopyFrom (cstate=cstate@entry=0x29e1f28, econtext=econtext@entry=0x2a358d8,
    values=0x2a42068, nulls=0x2a42070) at copy.c:3696
#5  0x00000000005a5b90 in CopyFrom (cstate=cstate@entry=0x29e1f28) at copy.c:2985


Best regards,
houzj



