Re: COPY speedup - Mailing list pgsql-hackers

From Pierre Frédéric Caillaud
Subject Re: COPY speedup
Date
Msg-id op.uykjafsocke6l8@soyouz
Whole thread Raw
In response to Re: COPY speedup  (Alvaro Herrera <alvherre@commandprompt.com>)
Responses Re: COPY speedup  (Alvaro Herrera <alvherre@commandprompt.com>)
List pgsql-hackers
>> >> We don't touch datatype APIs
>> >> lightly, because it affects too much code.
>> >>
>> >>                        regards, tom lane
>> >
>> >        I definitely agree with that.

Actually, let me clarify:

When I modified the datatype API, I was feeling uneasy, like "I shouldn't  
really touch this".
But when I see a big red button, I just press it to see what happens.
Ugly hacks are useful to know how fast the thing can go ; then the  
interesting part is to reimplement it cleanly, trying to reach the same  
performance...

>> Is there any way to do this that is not as invasive?
>
> Maybe add new methods, fastrecv/fastsend etc.  Types that don't
> implement them would simply use the slow methods, maintaining
> backwards compatibility.

Well, this would certainly work, and it would be even faster.

I considered doing it like this, but it is a lot more work : adding  
entries to the system catalogs, creating all the new functions, deciding  
what to do with getTypeBinaryOutputInfo (since there would be 2 variants),  
etc. Types that don't support the new functions would need some form of  
indirection to call the old functions instead, etc. In a word, doable, but  
kludgy, and I would need help from a system catalog expert. Also, on  
upgrade, information about the new functions must be inserted in the  
system catalogs ? (I don't know how this process works). If you want to  
help...

The way I see COPY BINARY is that its speed should be really something  
massive.
COPY foo FROM ... BINARY should be as fast as CREATE TABLE foo AS SELECT *  FROM bar (which is extremely fast).
COPY foo TO ... BINARY should be as fast as the disk allows.

Why else would anyone use a binary format if it's slower than portable  
text ?

So, there are two other ways (besides fastsend/fastrecv) that I can see :

1- The way I implemented

I'm not saying it's the definitive solution : just a simple way to see how  
much overhead is introduced by the current API, returning BYTEAs and  
palloc()'ing every tuple of every row. I think this approach gave two  
interesting answers :

- once COPY's output buffer has been made more efficient, with things like  
removing fwrite() for every row etc (see patch), all that remains is the  
API overhead, which is very significant for binary mode, since I could get  
massive speedups (3-4x !) by bypassing it. The table scan itself is  
super-fast.

- however, for text mode, it is not so significant, as the speedups  
bypassing the API were roughly 0-20%, since most of the time is spent in  
datum to text conversions.

Now, I don't think the hack is so ugly. It does make me feel a bit uneasy,  
but :

- The context field in the fcinfo struct is there for a reason, so I used  
it.
- I checked every place in the code where SendFunctionCall() appears  
(which are quite few actually).
- The context field is never used for SendFuncs or ReceiveFuncs (it is  
always set to NULL)

2- Another way

- palloc() could be made faster for short blocks
- a generous sprinkling of inline's
- a few modifications to pq_send*
- a few modifications to StringInfo
- bits of my previous patch in copy.c (like not fwriting every row)

This would be less fast, but you'd still get a substantial speedup.

As a conclusion, I think :

- Alvaro's fastsend/fastrecv provides the cleanest solutin
- Method 2 is the easiest, but slower
- Method 1 is an intermediate, but the use of the context field is a  
touchy subject.

Also, I will work on COPY FROM ... BINARY. I should be able to make it  
also much faster. This will be useful for big imports.

Regards,
Pierre


pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: WIP: getting rid of the pg_database flat file
Next
From: Tom Lane
Date:
Subject: Re: surprising trigger/foreign key interaction