Re: COPY FROM performance improvements - Mailing list pgsql-patches

From Simon Riggs
Subject Re: COPY FROM performance improvements
Date
Msg-id 1123661700.3670.621.camel@localhost.localdomain
Whole thread Raw
In response to Re: COPY FROM performance improvements  ("Luke Lonergan" <llonergan@greenplum.com>)
Responses Re: COPY FROM performance improvements
List pgsql-patches
On Tue, 2005-08-09 at 21:48 -0700, Luke Lonergan wrote:

> The key thing that is missing is the lack of micro-parallelism in the
> character processing in this version.  By "inverting the loop", or putting
> the characters into a buffer on the outside, then doing fast character
> scanning inside with special "fix-up" cases, we exposed long runs of
> pipeline-able code to the compiler.
>
> I think there is another way to accomplish the same thing and still preserve
> the current structure, but it requires "strip mining" the character buffer
> into chunks that can be processed with an explicit loop to check for the
> different characters.  While it may seem artificial (it is), it will provide
> the compiler with the ability to pipeline the character finding logic over
> long runs.  The other necessary element will have to avoid pipeline stalls
> from the "if" conditions as much as possible.

This is a key point, IMHO.

That part of the code was specifically written to take advantage of
processing pipelines in the hardware, not because the actual theoretical
algorithm for that approach was itself faster.

Nobody's said what compiler/hardware they have been using, so since both
Alon and Tom say their character finding logic is faster, it is likely
to be down to that? Name your platforms gentlemen, please.

My feeling is that we may learn something here that applies more widely
across many parts of the code.

Best Regards, Simon Riggs



pgsql-patches by date:

Previous
From: Martijn van Oosterhout
Date:
Subject: 5 new entries for FAQ
Next
From: "Luke Lonergan"
Date:
Subject: Re: COPY FROM performance improvements