Thread: V3 protocol gets out of sync on messages that cause allocation failures

From: Oliver Jowett

(Tom: this is not as severe a problem as I first thought)

If a client sends a V3 message that is sufficiently large to cause a 
memory allocation failure on the backend when allocating space to read 
the message, the backend gets out of sync with the protocol stream.

For example, sending this:

>  FE=> Parse(stmt=null,query="SELECT $1",oids={17})
>  FE=> Bind(stmt=null,portal=null,$1=<<stream of 1000000000 bytes>>)

provokes this:

> ERROR:  out of memory
> DETAIL:  Failed on request of size 1073741823.
> FATAL:  invalid frontend message type 0

What appears to be happening is that the backend goes into error 
recovery as soon as the allocation fails (just after reading the message 
length), and never does the read() of the body of the Bind message. So 
it falls out of sync, and tries to interpret the guts of the Bind as a 
new message. Bad server, no biscuit.
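
To make the failure mode concrete, here is a standalone toy reader 
(illustrative only, not the backend's actual code) that mishandles an 
allocation failure the same way and then produces the same pair of 
errors:

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    typedef struct { const uint8_t *p; size_t left; } Stream;

    static int get_byte(Stream *s, uint8_t *b)
    {
        if (s->left == 0) return -1;
        *b = *s->p++; s->left--;
        return 0;
    }

    static int get_u32(Stream *s, uint32_t *v)
    {
        uint8_t b[4];
        for (int i = 0; i < 4; i++)
            if (get_byte(s, &b[i])) return -1;
        *v = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
             ((uint32_t)b[2] << 8) | (uint32_t)b[3];
        return 0;
    }

    int main(void)
    {
        /* A Bind ('B') claiming a ~1GB body, followed by body bytes. */
        static const uint8_t wire[] =
            { 'B', 0x3f, 0xff, 0xff, 0xff, 'x', 'y', 'z' };
        Stream s = { wire, sizeof(wire) };
        uint8_t type;
        uint32_t len;

        while (get_byte(&s, &type) == 0)
        {
            if (type != 'B')        /* only Bind is known to this toy */
            {
                fprintf(stderr, "FATAL: invalid frontend message type %c\n",
                        type);
                return 1;
            }
            if (get_u32(&s, &len))
                break;
            /* Simulate palloc failing on an oversized request. */
            void *body = (len > 1024 * 1024) ? NULL : malloc(len - 4);
            if (body == NULL)
            {
                fprintf(stderr, "ERROR: out of memory, request of size %u\n",
                        len - 4);
                continue;       /* BUG: body bytes left unread -> desync */
            }
            /* ... read len - 4 body bytes and dispatch the message ... */
            free(body);
        }
        return 0;
    }

On the first iteration the allocation "fails" and the loop continues; on 
the second it reads the first unconsumed body byte ('x') as a message 
type and bails out, just like the backend does.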

I was concerned that this was exploitable in applications that pass 
hostile binary parameters as protocol-level parameters, but it doesn't 
seem possible as the bytes at the start of a Bind are not under the 
control of the attacker and don't form a valid message.

The CopyData message could probably be exploited, but it seems unlikely 
that (security-conscious) applications will pass hostile data directly 
in a CopyData message.

I haven't looked at a fix to this in detail (I'm not really familiar 
with the backend's error-recovery path), but it seems like one easy 
option is to treat all errors that occur while a message is in the
process of being read as FATAL?
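
In terms of the toy loop above, that would amount to replacing the 
continue in the out-of-memory branch with something session-fatal, e.g. 
(a sketch of the idea only, not the backend's actual error machinery):

    if (body == NULL)
    {
        /* The stream position can no longer be trusted, so drop the
         * whole connection rather than trying to resynchronize. */
        fprintf(stderr, "FATAL: out of memory while reading message\n");
        exit(EXIT_FAILURE);
    }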

-O


Oliver Jowett <oliver@opencloud.com> writes:
> What appears to be happening is that the backend goes into error 
> recovery as soon as the allocation fails (just after reading the message 
> length), and never does the read() of the body of the Bind message. So 
> it falls out of sync, and tries to interpret the guts of the Bind as a 
> new message. Bad server, no biscuit.

Yeah.  The intent of the protocol design was that the recipient could
skip over the correct number of bytes even if it didn't have room to
buffer them, but the memory allocation mechanism in the backend makes
it difficult to actually do that.  Now that we have PG_TRY, though,
it might not be out of reach to do it right.  Something like
    PG_TRY();
        buf = palloc(N);
    PG_CATCH();
        read and discard N bytes;
        re-raise the out-of-memory error;
    PG_END_TRY();
    normal read path

I'm not sure how many places would need to be touched to make this
actually happen; if memory serves, the "read a packet" code extends
over multiple logical levels.
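
For what it's worth, a fleshed-out reading of that sketch might look 
like the following (hypothetical code, not the committed patch: 
pq_getbyte() and pq_getbytes() are existing backend primitives, but the 
function boundary and naming here are assumptions):

    static char *
    read_message_body(int32 len)
    {
        char   *buf;

        PG_TRY();
        {
            buf = palloc(len);
        }
        PG_CATCH();
        {
            /* No room to buffer the body: read and discard it so the
             * protocol stream stays in sync, then re-raise the error. */
            while (len-- > 0)
                (void) pq_getbyte();
            PG_RE_THROW();
        }
        PG_END_TRY();

        /* normal read path: fill buf with the message body */
        if (pq_getbytes(buf, len) == EOF)
            ereport(ERROR,
                    (errcode(ERRCODE_PROTOCOL_VIOLATION),
                     errmsg("incomplete message from client")));
        return buf;
    }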
        regards, tom lane


I wrote:
> Yeah.  The intent of the protocol design was that the recipient could
> skip over the correct number of bytes even if it didn't have room to
> buffer them, but the memory allocation mechanism in the backend makes
> it difficult to actually do that.  Now that we have PG_TRY, though,
> it might not be out of reach to do it right.

And indeed it wasn't.  Patch committed.
        regards, tom lane


Re: V3 protocol gets out of sync on messages that cause allocation failures

From: Oliver Jowett

Tom Lane wrote:
> I wrote:
> 
>>Yeah.  The intent of the protocol design was that the recipient could
>>skip over the correct number of bytes even if it didn't have room to
>>buffer them, but the memory allocation mechanism in the backend makes
>>it difficult to actually do that.  Now that we have PG_TRY, though,
>>it might not be out of reach to do it right.
> 
> 
> And indeed it wasn't.  Patch committed.

Thanks!

Re your commit comment:

> I'm a bit dubious that this is a real problem, since the client likely
> doesn't have any more space available than the server, but it's not hard
> to make it behave according to the protocol intention.

It's quite possible that the client isn't keeping the whole parameter in 
memory. For example, JDBC's PreparedStatement.setBinaryStream() lets the 
application supply a streamable parameter with a prespecified length, 
and the stream contents could be coming from disk or computed on demand. 
That is actually where I came across the problem in the first place.

-O