Re: [PATCHES] CopyReadLineText optimization - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: [PATCHES] CopyReadLineText optimization
Date
Msg-id 47EBE766.3070305@enterprisedb.com
List pgsql-hackers
Heikki Linnakangas wrote:
> 1. CopyReadLineText is all about finding the next end of line; 
> splitting to fields is done later. We therefore only care about quotes 
> and escapes when they affect the end of line detection. In text mode, we 
> only need to care about a backslash that precedes a LF/CR. Therefore, 
> we could search for the next LF/CR character with memchr(), and check if 
> the preceding character is a backslash (and if it is, check if there's 
> yet another backslash behind it, and so forth until we hit a 
> non-backslash character).
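
The scan described in the quoted paragraph can be sketched roughly as follows. This is only an illustration, not the code from the patch: find_unescaped_eol is a hypothetical helper name, the buffer is assumed to start at a line boundary, and the sketch ignores client encodings in which a backslash byte can appear inside a multibyte character.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not PostgreSQL code): return a pointer to the first
 * LF or CR in buf[0..len) that is not escaped by a backslash, or NULL if
 * no such character exists.
 */
static const char *
find_unescaped_eol(const char *buf, size_t len)
{
    const char *p = buf;
    const char *end = buf + len;

    while (p < end)
    {
        size_t      remaining = (size_t) (end - p);
        const char *lf = memchr(p, '\n', remaining);
        const char *cr = memchr(p, '\r', remaining);
        const char *eol;
        const char *q;
        size_t      backslashes = 0;

        /* Pick whichever of LF/CR comes first. */
        if (lf == NULL && cr == NULL)
            return NULL;
        if (lf == NULL)
            eol = cr;
        else if (cr == NULL)
            eol = lf;
        else
            eol = (lf < cr) ? lf : cr;

        /* Count the run of backslashes immediately before the candidate. */
        for (q = eol; q > buf && q[-1] == '\\'; q--)
            backslashes++;

        /* An even count means the backslashes escape each other, not the EOL. */
        if (backslashes % 2 == 0)
            return eol;

        /* Odd count: this EOL is escaped data, keep searching after it. */
        p = eol + 1;
    }
    return NULL;
}
```

Calling memchr twice per iteration keeps the sketch short but can scan the buffer more than necessary; a real implementation would presumably want a single pass.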

While looking into this, I realized that we also need to detect the 
end-of-copy marker, backslash+period+EOL. In CSV mode, we only honor the 
end-of-copy marker if it's on a line of its own, as \. can occur in 
data, but in text mode we accept it at any point.

Does anyone object to changing that so that we only accept \. on a line 
of its own in text mode as well? That way we wouldn't need to care about 
backslashes in CopyReadLineText. AFAIK our tools have always output the 
\. like that, so this would only affect custom applications that use 
COPY and the \. marker.
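
Under the proposed rule, the check reduces to comparing a whole line against the two-byte marker. A minimal sketch, assuming the caller has already split off the line and stripped its EOL terminator; is_end_of_copy_marker is a hypothetical name, not an existing function:

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not PostgreSQL code): true only if the line,
 * given without its EOL terminator, consists of exactly "\.".
 */
static bool
is_end_of_copy_marker(const char *line, size_t len)
{
    return len == 2 && line[0] == '\\' && line[1] == '.';
}
```

With this rule a line such as "a\." is ordinary data rather than end-of-copy, which is the behavior change being asked about.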

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

