Thread: pipe chunking vs Windows

pipe chunking vs Windows

From
Andrew Dunstan
Date:
I have just discovered that the recently implemented pipe chunking 
protocol is broken on Windows. This is because the pipes are operating 
in text mode and doing LF->CR-LF translation, so the number of bytes 
received is not the number transmitted and set in the protocol header.

I have not yet succeeded in turning this behaviour off (_setmode() 
didn't seem to affect it). If we can't find a way to turn it off, the 
only solution short of abandoning its use on Windows that I can think of 
is to translate LF on input to something unlikely like 0x1C and then 
translate it back when we read it from the pipe.

cheers

andrew


Re: pipe chunking vs Windows

From
Magnus Hagander
Date:
Andrew Dunstan wrote:
> 
> I have just discovered that the recently implemented pipe chunking
> protocol is broken on Windows. This is because the pipes are operating
> in text mode and doing LF->CR-LF translation, so the number of bytes
> received is not the number transmitted and set in the protocol header.
> 
> I have not yet succeeded in turning this behaviour off (_setmode()
> didn't seem to affect it). If we can't find a way to turn it off, the
> only solution short of abandoning its use on Windows that I can think of
> is to translate LF on input to something unlikely like 0x1C and then
> translate it back when we read it from the pipe.

At what point does it actually do the translation? Meaning what
system/library call has it?

Are we using the pipes from src/port/pipe.c? It does sound a bit weird
that they'd do that, since it's basically just emulating stuff over
standard tcp sockets, but perhaps something is broken in that code?

Sorry, haven't really checked up on the chunk code yet, so I don't know
offhand where to look.

//Magnus


Re: pipe chunking vs Windows

From
Andrew Dunstan
Date:

Magnus Hagander wrote:
> Andrew Dunstan wrote:
>   
>> I have just discovered that the recently implemented pipe chunking
>> protocol is broken on Windows. This is because the pipes are operating
>> in text mode and doing LF->CR-LF translation, so the number of bytes
>> received is not the number transmitted and set in the protocol header.
>>
>> I have not yet succeeded in turning this behaviour off (_setmode()
>> didn't seem to affect it). If we can't find a way to turn it off, the
>> only solution short of abandoning its use on Windows that I can think of
>> is to translate LF on input to something unlikely like 0x1C and then
>> translate it back when we read it from the pipe.
>>     
>
> At what point does it actually do the translation? Meaning what
> system/library call has it?
>
> Are we using the pipes from src/port/pipe.c? It does sound a bit weird
> that they'd do that, since it's basically just emulating stuff over
> standard tcp sockets, but perhaps something is broken in that code?
>
> Sorry, haven't really checked up on the chunk code yet, so I don't know
> offhand where to look.
>
>
>   

It looks like we aren't. In fact. it looks like the only call to 
pgpipe() in the whole source tree is in the syslogger and it's in 
specifically non-Windows code, meaning that that whole file is currently 
useless.

Maybe you should have a good look at src/backend/postmaster/syslogger.c. 
If we could get rid of the pipe-read threads and all the special Windows 
cruft there that would certainly be an advance.

cheers

andrew




Re: pipe chunking vs Windows

From
Magnus Hagander
Date:
Andrew Dunstan wrote:
>>> I have just discovered that the recently implemented pipe chunking
>>> protocol is broken on Windows. This is because the pipes are operating
>>> in text mode and doing LF->CR-LF translation, so the number of bytes
>>> received is not the number transmitted and set in the protocol header.
>>>
>>> I have not yet succeeded in turning this behaviour off (_setmode()
>>> didn't seem to affect it). If we can't find a way to turn it off, the
>>> only solution short of abandoning its use on Windows that I can think of
>>> is to translate LF on input to something unlikely like 0x1C and then
>>> translate it back when we read it from the pipe.
>>>     
>>
>> At what point does it actually do the translation? Meaning what
>> system/library call has it?
>>
>> Are we using the pipes from src/port/pipe.c? It does sound a bit weird
>> that they'd do that, since it's basically just emulating stuff over
>> standard tcp sockets, but perhaps something is broken in that code?
>>
>> Sorry, haven't really checked up on the chunk code yet, so I don't know
>> offhand where to look.
>>
>>
>>   
> 
> It looks like we aren't. In fact. it looks like the only call to
> pgpipe() in the whole source tree is in the syslogger and it's in
> specifically non-Windows code, meaning that that whole file is currently
> useless.

Uh, see port.h, lines 212-224. If you're using the pipe() command to
create it, it's used.


> Maybe you should have a good look at src/backend/postmaster/syslogger.c.
> If we could get rid of the pipe-read threads and all the special Windows
> cruft there that would certainly be an advance.

I'll try to squeeze some time in to do that - I'll have to read up on
the whole pipe/chunk thing first though, so it'll be a while.

//Magnus


Re: pipe chunking vs Windows

From
Andrew Dunstan
Date:

Magnus Hagander wrote:
> Andrew Dunstan wrote:
>   
>>>> I have just discovered that the recently implemented pipe chunking
>>>> protocol is broken on Windows. This is because the pipes are operating
>>>> in text mode and doing LF->CR-LF translation, so the number of bytes
>>>> received is not the number transmitted and set in the protocol header.
>>>>
>>>> I have not yet succeeded in turning this behaviour off (_setmode()
>>>> didn't seem to affect it). If we can't find a way to turn it off, the
>>>> only solution short of abandoning its use on Windows that I can think of
>>>> is to translate LF on input to something unlikely like 0x1C and then
>>>> translate it back when we read it from the pipe.
>>>>     
>>>>         
>>> At what point does it actually do the translation? Meaning what
>>> system/library call has it?
>>>
>>> Are we using the pipes from src/port/pipe.c? It does sound a bit weird
>>> that they'd do that, since it's basically just emulating stuff over
>>> standard tcp sockets, but perhaps something is broken in that code?
>>>
>>> Sorry, haven't really checked up on the chunk code yet, so I don't know
>>> offhand where to look.
>>>
>>>
>>>   
>>>       
>> It looks like we aren't. In fact. it looks like the only call to
>> pgpipe() in the whole source tree is in the syslogger and it's in
>> specifically non-Windows code, meaning that that whole file is currently
>> useless.
>>     
>
> Uh, see port.h, lines 212-224. If you're using the pipe() command to
> create it, it's used.
>   

No, it's the other way around :-) If you use pgpipe() on Unix you're 
calling pipe():


#ifndef WIN32
/**  The function prototypes are not supplied because every C file*  includes this file.*/
#define pgpipe(a)           pipe(a)
#define piperead(a,b,c)     read(a,b,c)
#define pipewrite(a,b,c)    write(a,b,c)
#else
extern int  pgpipe(int handles[2]);
extern int  piperead(int s, char *buf, int len);

#define pipewrite(a,b,c)    send(a,b,c,0)

#define PG_SIGNAL_COUNT 32
#define kill(pid,sig)   pgkill(pid,sig)
extern int  pgkill(int pid, int sig);
#endif


>
>   
>> Maybe you should have a good look at src/backend/postmaster/syslogger.c.
>> If we could get rid of the pipe-read threads and all the special Windows
>> cruft there that would certainly be an advance.
>>     
>
> I'll try to squeeze some time in to do that - I'll have to read up on
> the whole pipe/chunk thing first though, so it'll be a while.
>
>   

You don't need to understand the protocol (it's a very simple 
packetising protocol). The important point is that we have an anonymous 
pipe (created with CreatePipe) which has been dup'ed into stderr.

cheers

andrew


Re: pipe chunking vs Windows

From
Magnus Hagander
Date:
Andrew Dunstan wrote:
>> Uh, see port.h, lines 212-224. If you're using the pipe() command to
>> create it, it's used.
>>   
> 
> No, it's the other way around :-) If you use pgpipe() on Unix you're
> calling pipe():

D'oh. You're right, of course. I'm obviously not in a state where I
should be reading C code right now :)

>>> Maybe you should have a good look at src/backend/postmaster/syslogger.c.
>>> If we could get rid of the pipe-read threads and all the special Windows
>>> cruft there that would certainly be an advance.
>>>     
>>
>> I'll try to squeeze some time in to do that - I'll have to read up on
>> the whole pipe/chunk thing first though, so it'll be a while.
>>
> You don't need to understand the protocol (it's a very simple
> packetising protocol). The important point is that we have an anonymous
> pipe (created with CreatePipe) which has been dup'ed into stderr.

Ok.

//Magnus


Re: pipe chunking vs Windows

From
Andrew Dunstan
Date:

korry.douglas wrote:
>
>>> I have not yet succeeded in turning this behaviour off (_setmode() 
>>> didn't seem to affect it). If we can't find a way to turn it off, 
>>> the only solution short of abandoning its use on Windows that I can 
>>> think of is to translate LF on input to something unlikely like 0x1C 
>>> and then translate it back when we read it from the pipe.
>> Did you _setmode() the pipe, _setmode() stderr, or both?  (And did 
>> you try before or after calling dup2()?).
>>
>> It looks like the Win CRT implementation of dup2() copues the "mode" 
>> from the pipe that you've created into stderr.
> Sorry, I was looking at the wrong chunk of code in syslogger.c.
> Why are you calling _open_osfhandle() with O_TEXT?  That looks 
> suspicious giving the problem that you are seeing with CR -> CR/LF 
> translation.
>
>        

I have no idea why that's done - it goes back to the origins of the 
syslogger - probably because someone mistakenly thinks all WIndows text 
files have to have CRLF line endings.

I tried changing that to _O_BINARY, and calling _setmode on both the 
pipe before it's duped into stderr and stderr after the dup and both. 
Nothing seemed to work.

But that's not the only problem. I am now getting log file corruption 
even when I work around the text mode problem by not sending a '\n' at 
all, which makes me think even small packets aren't safely written 
atomically to Windows pipes. I wonder if we'd do better using the (so 
far unused) pipe simulation using sockets that we have. Not sure if we 
can dup a socket into stderr on Windows, but it might be worth trying, 
or even working around that problem. If we could safely use that the 
code would get a whole lot simpler - presumably we would no longer need 
those extra syslogger threads on Windows.

cheers

andrew


Re: pipe chunking vs Windows

From
Andreas Pflug
Date:
Andrew Dunstan wrote:
>
> I have no idea why that's done - it goes back to the origins of the
> syslogger - probably because someone mistakenly thinks all WIndows
> text files have to have CRLF line endings.
>
> I tried changing that to _O_BINARY, and calling _setmode on both the
> pipe before it's duped into stderr and stderr after the dup and both.
> Nothing seemed to work.
AFAIR the flag has to be set again in each child process.

Regards,
Andreas


Re: pipe chunking vs Windows

From
Andrew Dunstan
Date:

Andreas Pflug wrote:
> Andrew Dunstan wrote:
>   
>> I have no idea why that's done - it goes back to the origins of the
>> syslogger - probably because someone mistakenly thinks all WIndows
>> text files have to have CRLF line endings.
>>
>> I tried changing that to _O_BINARY, and calling _setmode on both the
>> pipe before it's duped into stderr and stderr after the dup and both.
>> Nothing seemed to work.
>>     
> AFAIR the flag has to be set again in each child process.
>
>
>   
Thanks. I'll try that.

cheers

andrew


Re: pipe chunking vs Windows

From
Andreas Pflug
Date:
Andrew Dunstan wrote:
>>  
>>> I have no idea why that's done - it goes back to the origins of the
>>> syslogger - probably because someone mistakenly thinks all WIndows
>>> text files have to have CRLF line endings.
Yes this was intentional, notepad still doesn't like LF line endings.
Not my preferred text viewer, but the only one that's always available.

Regards,
Andreas



Re: pipe chunking vs Windows

From
Andrew Dunstan
Date:

Andreas Pflug wrote:
> Andrew Dunstan wrote:
>   
>>>  
>>>       
>>>> I have no idea why that's done - it goes back to the origins of the
>>>> syslogger - probably because someone mistakenly thinks all WIndows
>>>> text files have to have CRLF line endings.
>>>>         
> Yes this was intentional, notepad still doesn't like LF line endings.
> Not my preferred text viewer, but the only one that's always available.
>
>
>   

Not for Wordpad though, and it's pretty universal too. And Notepad won't 
load a file of any great size anyway. Furthermore, we just can't have 
this alongside the pipe chunking protocol, so I'm inclined to blow it 
away altogether, unless there are pretty loud squawks. Especially for 
machine-readable logs, we want the log file to get *exactly* what we 
send it.

cheers

andrew


Re: pipe chunking vs Windows

From
Andreas Pflug
Date:
Andrew Dunstan wrote:
>  
> Not for Wordpad though, and it's pretty universal too. And Notepad
> won't load a file of any great size anyway. Furthermore, we just can't
> have this alongside the pipe chunking protocol, so I'm inclined to
> blow it away altogether, unless there are pretty loud squawks.
> Especially for machine-readable logs, we want the log file to get
> *exactly* what we send it.
Well I'd LOVE reliably machine-readable logs, but I think that will
collide with human-readability.

Regards
Andreas



Re: pipe chunking vs Windows

From
Andrew Dunstan
Date:

Andreas Pflug wrote:
> Andrew Dunstan wrote:
>   
>>  
>> Not for Wordpad though, and it's pretty universal too. And Notepad
>> won't load a file of any great size anyway. Furthermore, we just can't
>> have this alongside the pipe chunking protocol, so I'm inclined to
>> blow it away altogether, unless there are pretty loud squawks.
>> Especially for machine-readable logs, we want the log file to get
>> *exactly* what we send it.
>>     
> Well I'd LOVE reliably machine-readable logs, but I think that will
> collide with human-readability.
>
>
>   
No it won't. Watch this space :-)

cheers

andrew


Re: pipe chunking vs Windows

From
"korry.douglas"
Date:
> I have not yet succeeded in turning this behaviour off (_setmode() 
> didn't seem to affect it). If we can't find a way to turn it off, the 
> only solution short of abandoning its use on Windows that I can think 
> of is to translate LF on input to something unlikely like 0x1C and 
> then translate it back when we read it from the pipe.
Did you _setmode() the pipe, _setmode() stderr, or both?  (And did you 
try before or after calling dup2()?).

It looks like the Win CRT implementation of dup2() copues the "mode" 
from the pipe that you've created into stderr. 
         -- Korry


Re: pipe chunking vs Windows

From
"korry.douglas"
Date:
>> I have not yet succeeded in turning this behaviour off (_setmode() 
>> didn't seem to affect it). If we can't find a way to turn it off, the 
>> only solution short of abandoning its use on Windows that I can think 
>> of is to translate LF on input to something unlikely like 0x1C and 
>> then translate it back when we read it from the pipe.
> Did you _setmode() the pipe, _setmode() stderr, or both?  (And did you 
> try before or after calling dup2()?).
>
> It looks like the Win CRT implementation of dup2() copues the "mode" 
> from the pipe that you've created into stderr.
Sorry, I was looking at the wrong chunk of code in syslogger.c. 

Why are you calling _open_osfhandle() with O_TEXT?  That looks 
suspicious giving the problem that you are seeing with CR -> CR/LF 
translation.
         -- Korry


Re: pipe chunking vs Windows

From
Andrew Dunstan
Date:

Andreas Pflug wrote:
> Andrew Dunstan wrote:
>   
>> I have no idea why that's done - it goes back to the origins of the
>> syslogger - probably because someone mistakenly thinks all WIndows
>> text files have to have CRLF line endings.
>>
>> I tried changing that to _O_BINARY, and calling _setmode on both the
>> pipe before it's duped into stderr and stderr after the dup and both.
>> Nothing seemed to work.
>>     
> AFAIR the flag has to be set again in each child process.
>
>   

OK, this was the bit of info I was missing. I put a call in 
SubPostmasterMain() like this:
   _setmode(fileno(stderr), _O_BINARY);

and one in the syslogger code the same, just after the pipe is dup'ed 
into stderr.

After that I saw no more corruption issues.

However, the problem is that in backporting it we'd make a slight 
behaviour change - the log file just gets LF instead of CRLF line 
endings. I'm inclined to say that's a better result than living with the 
bug, though.

Thoughts?

cheers

andrew






Re: pipe chunking vs Windows

From
Magnus Hagander
Date:
On Sun, Jul 29, 2007 at 06:31:04PM -0400, Andrew Dunstan wrote:
> 
> 
> Andreas Pflug wrote:
> >Andrew Dunstan wrote:
> >  
> >>> 
> >>>      
> >>>>I have no idea why that's done - it goes back to the origins of the
> >>>>syslogger - probably because someone mistakenly thinks all WIndows
> >>>>text files have to have CRLF line endings.
> >>>>        
> >Yes this was intentional, notepad still doesn't like LF line endings.
> >Not my preferred text viewer, but the only one that's always available.
> >
> >
> >  
> 
> Not for Wordpad though, and it's pretty universal too. And Notepad won't 
> load a file of any great size anyway. Furthermore, we just can't have 

Just for the record, that hasn't been true for a long time. Notepad opens
large files just fine (well, fine is a matter of definition, since it needs
to load the whole file in RAM which is interesting when you open gigabyte
size files. But IIRC, so does wordpad).

And notepad doesn't carry the risk of people saving the file in word format
(yes, this happens all too frequently from what I've seen)

//Magnus


Re: pipe chunking vs Windows

From
Magnus Hagander
Date:
On Sun, Jul 29, 2007 at 07:43:34PM -0400, Andrew Dunstan wrote:
> 
> 
> Andreas Pflug wrote:
> >Andrew Dunstan wrote:
> >  
> >>I have no idea why that's done - it goes back to the origins of the
> >>syslogger - probably because someone mistakenly thinks all WIndows
> >>text files have to have CRLF line endings.
> >>
> >>I tried changing that to _O_BINARY, and calling _setmode on both the
> >>pipe before it's duped into stderr and stderr after the dup and both.
> >>Nothing seemed to work.
> >>    
> >AFAIR the flag has to be set again in each child process.
> >
> >  
> 
> OK, this was the bit of info I was missing. I put a call in 
> SubPostmasterMain() like this:
> 
>    _setmode(fileno(stderr), _O_BINARY);
> 
> and one in the syslogger code the same, just after the pipe is dup'ed 
> into stderr.
> 
> After that I saw no more corruption issues.

Seems right - for anything EXEC_BACKEND, if you change a state from the
default it won't be inherited, so you nede to explicitly add it in the
child.

> However, the problem is that in backporting it we'd make a slight 
> behaviour change - the log file just gets LF instead of CRLF line 
> endings. I'm inclined to say that's a better result than living with the 
> bug, though.

Can't we add back the CRLF combo when writing the file out from the
syslogger process?

//Magnus


Re: pipe chunking vs Windows

From
Andrew Dunstan
Date:

Magnus Hagander wrote:
>> However, the problem is that in backporting it we'd make a slight 
>> behaviour change - the log file just gets LF instead of CRLF line 
>> endings. I'm inclined to say that's a better result than living with the 
>> bug, though.
>>     
>
> Can't we add back the CRLF combo when writing the file out from the
> syslogger process?
>
>
>   

Yes, you're right. I can do this by setting text mode on the syslogFile. 
I'll have a patch for review shortly.

Note: this is behaviour we definitely won't want for CSVlogs, because it 
will translate embedded LF as well as LF at the end of a log line. This 
whole little mess chewed up all the time I wanted to spend yesterday 
making progress on CSVlogs :-(

cheers

andrew