Thread: Replay attack of query cancel

Replay attack of query cancel

From
"Heikki Linnakangas"
Date:
It occurred to me a while ago that our query cancel messages are sent 
unencrypted, even when SSL is otherwise used. That's not a big issue on 
its own, because the cancellation message only contains the backend PID 
and the cancellation key, but it does open us to a replay attack. After 
the first query in a connection has been cancelled, an eavesdropper can 
reuse the backend PID and cancellation key to cancel subsequent queries 
on the same connection.

We discussed this on the security list, and the consensus was that this 
isn't worth a quick fix and a security release, because
- it only affects applications that use query cancel, which is rare
- it only affects SSL encrypted connections (the point is moot 
non-encrypted connections, as you can just snatch the cancel key from 
the initial message)
- it only let's you cancel queries, IOW it's only a DOS attack.
- there's no simple fix.

However, it is something to keep in mind, and perhaps fix for the next 
release.

One idea for fixing this is to make cancellation keys disposable, and 
automatically issue a new one through the main connection when one is 
used, but that's not completely trivial, and requires a change in both 
the clients and the server. Another idea is to send the query cancel 
message only after SSL authentication, but that is impractical for libpq 
because we PQcancel needs to be callable from a signal handler.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Replay attack of query cancel

From
Alvaro Herrera
Date:
Heikki Linnakangas wrote:

> One idea for fixing this is to make cancellation keys disposable, and  
> automatically issue a new one through the main connection when one is  
> used, but that's not completely trivial, and requires a change in both  
> the clients and the server. Another idea is to send the query cancel  
> message only after SSL authentication, but that is impractical for libpq  
> because we PQcancel needs to be callable from a signal handler.

I wonder if we can do something diffie-hellman'ish, where we have a
parameter exchanged in the initial SSL'ed handshake, which is later used
to generate new cancel keys each time the previous one is used.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: Replay attack of query cancel

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> I wonder if we can do something diffie-hellman'ish, where we have a
> parameter exchanged in the initial SSL'ed handshake, which is later used
> to generate new cancel keys each time the previous one is used.

Seems like the risk of getting out of sync would outweigh any benefits.
Lose one cancel message in the network, you have no hope of getting any
more accepted.
        regards, tom lane


Re: Replay attack of query cancel

From
Magnus Hagander
Date:
Alvaro Herrera wrote:
> Heikki Linnakangas wrote:
> 
>> One idea for fixing this is to make cancellation keys disposable, and  
>> automatically issue a new one through the main connection when one is  
>> used, but that's not completely trivial, and requires a change in both  
>> the clients and the server. Another idea is to send the query cancel  
>> message only after SSL authentication, but that is impractical for libpq  
>> because we PQcancel needs to be callable from a signal handler.
> 
> I wonder if we can do something diffie-hellman'ish, where we have a
> parameter exchanged in the initial SSL'ed handshake, which is later used
> to generate new cancel keys each time the previous one is used.

Seems much more complex than needed.

IIRC, the protocol allows us (at least does not explicitly forbid) to
issue new cancel keys at any time. And libpq will, again IIRC, accept
them. So we could just change the server to issue a new cancel key
whenever it has used the previous one (since the new key will then be
sent over the encrypted channel, we're safe).

The problem was (third IIRC here :-P) in other clients, such as the JDBC
driver (I think that one was checked specifically) which currently only
accept the BackendKeyData message during startup. All drivers not based
on libpq would have to be checked and potentially updated, but it's
sitll a lot easier than DHing or so.

//Magnus


Re: Replay attack of query cancel

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> The problem was (third IIRC here :-P) in other clients, such as the JDBC
> driver (I think that one was checked specifically) which currently only
> accept the BackendKeyData message during startup. All drivers not based
> on libpq would have to be checked and potentially updated, but it's
> sitll a lot easier than DHing or so.

The other problem was getting the new cancel key from the postmaster to
the backend and thence to the client (hopefully in a timely manner),
recognizing that (a) we don't want the postmaster touching shared memory
very much, and certainly not engaging in any locking behavior; (b)
backends feel free to ignore SIGINT when they're not doing anything.

Certainly the prospect of a de facto protocol change is the bigger
problem, but there are nontrivial implementation issues in the server
too.

If we were going to make it a de jure protocol change (ie new version
number) instead of just hoping nobody notices the behavioral difference,
I'd be inclined to think about widening the cancel key, too.  32 bits
ain't that much randomness anymore.
        regards, tom lane


Re: Replay attack of query cancel

From
Andrew Gierth
Date:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
> Alvaro Herrera <alvherre@commandprompt.com> writes:>> I wonder if we can do something diffie-hellman'ish, where we
have>>a parameter exchanged in the initial SSL'ed handshake, which is>> later used to generate new cancel keys each
timethe previous one>> is used.
 
Tom> Seems like the risk of getting out of sync would outweigh anyTom> benefits.  Lose one cancel message in the
network,you have noTom> hope of getting any more accepted.
 

That's easily solved: when the client wants to do a cancel, have it
send, in place of the actual cancel key, an integer N and the value
HMAC(k,N) where k is the cancel key. Replay is prevented by requiring
the value of N to be strictly greater than any previous value
successfully used for this session. (Since we already have md5 code,
HMAC-MD5 would be the obvious choice.)

Migration to this could probably be handled without a version change
to the protocol, by defining a new SecureCancelRequest message and a
GUC to control whether the old CancelRequest message is accepted or
ignored. The key length for the cancel key can be increased with a
minor-version change to the protocol (if client asks for protocol 3.1,
send it a longer key, otherwise a shorter one).

-- 
Andrew (irc:RhodiumToad)


Re: Replay attack of query cancel

From
Zdenek Kotala
Date:
Tom Lane napsal(a):
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> I wonder if we can do something diffie-hellman'ish, where we have a
>> parameter exchanged in the initial SSL'ed handshake, which is later used
>> to generate new cancel keys each time the previous one is used.
> 
> Seems like the risk of getting out of sync would outweigh any benefits.
> Lose one cancel message in the network, you have no hope of getting any
> more accepted.

When cancellation key is used client should explicitly ask for a new regenerated 
cancel key.
    Zdenek

-- 
Zdenek Kotala              Sun Microsystems
Prague, Czech Republic     http://sun.com/postgresql



Re: Replay attack of query cancel

From
Magnus Hagander
Date:
Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> The problem was (third IIRC here :-P) in other clients, such as the JDBC
>> driver (I think that one was checked specifically) which currently only
>> accept the BackendKeyData message during startup. All drivers not based
>> on libpq would have to be checked and potentially updated, but it's
>> sitll a lot easier than DHing or so.
> 
> The other problem was getting the new cancel key from the postmaster to
> the backend and thence to the client (hopefully in a timely manner),
> recognizing that (a) we don't want the postmaster touching shared memory
> very much, and certainly not engaging in any locking behavior; (b)
> backends feel free to ignore SIGINT when they're not doing anything.

In EXEC_BACKEND, we already store this in shared memory. If we could
live with doing that for the non-exec case as well, it'd be a lot easier.

And we could then just have the backend update the array when it sends
out a "new key" message.


> Certainly the prospect of a de facto protocol change is the bigger
> problem, but there are nontrivial implementation issues in the server
> too.

Yeah.

> If we were going to make it a de jure protocol change (ie new version
> number) instead of just hoping nobody notices the behavioral difference,
> I'd be inclined to think about widening the cancel key, too.  32 bits
> ain't that much randomness anymore.

That, or rely on something that's not just a simple shared secret
(something like what Andrew Gierth suggested). And AFAICS, his
suggestion allows us to manage without having to update the cancel key
in shared memory at all - but it does require a protocol modification.

//Magnus



Re: Replay attack of query cancel

From
Magnus Hagander
Date:
Andrew Gierth wrote:
>>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
> 
>  > Alvaro Herrera <alvherre@commandprompt.com> writes:
>  >> I wonder if we can do something diffie-hellman'ish, where we have
>  >> a parameter exchanged in the initial SSL'ed handshake, which is
>  >> later used to generate new cancel keys each time the previous one
>  >> is used.
> 
>  Tom> Seems like the risk of getting out of sync would outweigh any
>  Tom> benefits.  Lose one cancel message in the network, you have no
>  Tom> hope of getting any more accepted.
> 
> That's easily solved: when the client wants to do a cancel, have it
> send, in place of the actual cancel key, an integer N and the value
> HMAC(k,N) where k is the cancel key. Replay is prevented by requiring
> the value of N to be strictly greater than any previous value
> successfully used for this session. (Since we already have md5 code,
> HMAC-MD5 would be the obvious choice.)

I like this approach. It does away with the sharead memory hackery - we
just need to store one more number in the postmaster array, which
shouldn't be a problem.


> Migration to this could probably be handled without a version change
> to the protocol, by defining a new SecureCancelRequest message and a
> GUC to control whether the old CancelRequest message is accepted or
> ignored.

We could even have a third setting - accept, but write a warning to the log.


> The key length for the cancel key can be increased with a
> minor-version change to the protocol (if client asks for protocol 3.1,
> send it a longer key, otherwise a shorter one).

Yeah, that seems easy enough. The question is how important is it to
increase the cancel key length? Should we do it now, or should we push
it off until whenever we have to bump the protocol version number anyway?


If we don't touch the protocol version, we could in theory at least
backpatch this as a fix for those who are really concerned about this
issue. But it's probably too big a change for that?

/Magnus


Re: Replay attack of query cancel

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> Andrew Gierth wrote:
>> That's easily solved: when the client wants to do a cancel, have it
>> send, in place of the actual cancel key, an integer N and the value
>> HMAC(k,N) where k is the cancel key. Replay is prevented by requiring
>> the value of N to be strictly greater than any previous value
>> successfully used for this session. (Since we already have md5 code,
>> HMAC-MD5 would be the obvious choice.)

> I like this approach.

It's not a bad idea, if we are willing to change the protocol.

> If we don't touch the protocol version, we could in theory at least
> backpatch this as a fix for those who are really concerned about this
> issue.

Huh?  How can you argue this isn't a protocol change?

[ thinks for a bit... ]  You could make it a change in the cancel
protocol, which is to some extent independent of the main FE/BE
protocol.  The problem is: how can the client know whether it's okay to
use this new protocol for cancel?
        regards, tom lane


Re: Replay attack of query cancel

From
Magnus Hagander
Date:
Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> Andrew Gierth wrote:
>>> That's easily solved: when the client wants to do a cancel, have it
>>> send, in place of the actual cancel key, an integer N and the value
>>> HMAC(k,N) where k is the cancel key. Replay is prevented by requiring
>>> the value of N to be strictly greater than any previous value
>>> successfully used for this session. (Since we already have md5 code,
>>> HMAC-MD5 would be the obvious choice.)
> 
>> I like this approach.
> 
> It's not a bad idea, if we are willing to change the protocol.
> 
>> If we don't touch the protocol version, we could in theory at least
>> backpatch this as a fix for those who are really concerned about this
>> issue.
> 
> Huh?  How can you argue this isn't a protocol change?

Um. By looking at it only from the backend perspective? *blush*


> [ thinks for a bit... ]  You could make it a change in the cancel
> protocol, which is to some extent independent of the main FE/BE
> protocol.  The problem is: how can the client know whether it's okay to
> use this new protocol for cancel?

Yeah, that's the point that will require a protocol bump, I think. Since
there is no response to the cancel packet, we can't even do things like
sending in a magic key and look at the response (which would be a rather
ugly hack, but doable if we had a success/fail response to the cancel
packet).

I guess bumping the protocol to 3.1 pretty much kills any chance for a
backpatch though :( Since a "new libpq" would no longer be able to talk
to an old server, if I remember the logic correctly?

//Magnus


Re: Replay attack of query cancel

From
"Stephen R. van den Berg"
Date:
Tom Lane wrote:
>[ thinks for a bit... ]  You could make it a change in the cancel
>protocol, which is to some extent independent of the main FE/BE
>protocol.  The problem is: how can the client know whether it's okay to
>use this new protocol for cancel?

Two options:
a. Send two cancelkeys in rapid succession at session startup, whereas  the first one is 0 or something.  The client
candetect the first  "special" cancelkey and then knows that the connection supports  cancelmethod 2.
 
b. At sessionstartup, advertise a new runtimeparameter:  cancelmethod=plainkey,hmaccoded  which the client can then
chosefrom.
 

I'd prefer b over a.
-- 
Sincerely,          Stephen R. van den Berg.

"And now for something *completely* different!"


Re: Replay attack of query cancel

From
Gregory Stark
Date:
"Magnus Hagander" <magnus@hagander.net> writes:

> Yeah, that's the point that will require a protocol bump, I think. Since
> there is no response to the cancel packet, we can't even do things like
> sending in a magic key and look at the response (which would be a rather
> ugly hack, but doable if we had a success/fail response to the cancel
> packet).

From the server point of view we could accept either kind of cancel message
for the first cancel message and set a variable saying which to expect from
there forward. If the first cancel message is an old-style message then we
always expect old-style messages. If it's a new-style message then we require
new-style messages and keep track of the counter to require a monotically
increasing counter.

From the client point-of-view we have no way to know if the server is going to
accept new-style cancel messages though. We could try sending the new-style
message and see if we get an error (do we get an error if you send an invalid
cancel message?).

We could have the server indicate it's the new protocol by sending the initial
cancel key twice. If the client sees more than one cancel key it automatically
switches to new-style cancel messages.

Or we could just bump the protocol version.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!


Re: Replay attack of query cancel

From
Magnus Hagander
Date:
Gregory Stark wrote:
> "Magnus Hagander" <magnus@hagander.net> writes:
> 
>> Yeah, that's the point that will require a protocol bump, I think. Since
>> there is no response to the cancel packet, we can't even do things like
>> sending in a magic key and look at the response (which would be a rather
>> ugly hack, but doable if we had a success/fail response to the cancel
>> packet).
> 
> From the server point of view we could accept either kind of cancel message
> for the first cancel message and set a variable saying which to expect from
> there forward. If the first cancel message is an old-style message then we
> always expect old-style messages. If it's a new-style message then we require
> new-style messages and keep track of the counter to require a monotically
> increasing counter.
> 
> From the client point-of-view we have no way to know if the server is going to
> accept new-style cancel messages though. We could try sending the new-style
> message and see if we get an error (do we get an error if you send an invalid
> cancel message?).

No, that is the point I made above - we don't respond to the cancel
message *at all*.

> We could have the server indicate it's the new protocol by sending the initial
> cancel key twice. If the client sees more than one cancel key it automatically
> switches to new-style cancel messages.

That will still break things like JDBC I think - they only expect one
cancel message, and then move on to expect other things.

> Or we could just bump the protocol version.

Yeah, but that would kill backwards compatibility in that the new libpq
could no longer talk to old servers.

What would work is using a parameter field, per Stephen's suggestion
elsewhere in the thread. Older libpq versions should just ignore the
parameter if they don't know what it is. Question is, is that too ugly a
workaround, since we'll need to keep it around forever? (We have special
handling of a few other parameters already, so maybe not?)


//Magnus



Re: Replay attack of query cancel

From
"Stephen R. van den Berg"
Date:
Magnus Hagander wrote:
>Gregory Stark wrote:
>> "Magnus Hagander" <magnus@hagander.net> writes:
>> We could have the server indicate it's the new protocol by sending the initial
>> cancel key twice. If the client sees more than one cancel key it automatically
>> switches to new-style cancel messages.

>That will still break things like JDBC I think - they only expect one
>cancel message, and then move on to expect other things.

Why didn't they follow recommended practice to process any message
anytime?  Was/is there a specific reason to avoid that in that driver?
(Just curious).

>What would work is using a parameter field, per Stephen's suggestion
>elsewhere in the thread. Older libpq versions should just ignore the
>parameter if they don't know what it is. Question is, is that too ugly a
>workaround, since we'll need to keep it around forever? (We have special
>handling of a few other parameters already, so maybe not?)

You only need to keep the runtimeparameter for as long as you don't bump
the protocol version.
Then again, runtimeparameters are cheap.
-- 
Sincerely,          Stephen R. van den Berg.

"And now for something *completely* different!"


Re: Replay attack of query cancel

From
Magnus Hagander
Date:
Stephen R. van den Berg wrote:
> Magnus Hagander wrote:
>> Gregory Stark wrote:
>>> "Magnus Hagander" <magnus@hagander.net> writes:
>>> We could have the server indicate it's the new protocol by sending the initial
>>> cancel key twice. If the client sees more than one cancel key it automatically
>>> switches to new-style cancel messages.
> 
>> That will still break things like JDBC I think - they only expect one
>> cancel message, and then move on to expect other things.
> 
> Why didn't they follow recommended practice to process any message
> anytime?  Was/is there a specific reason to avoid that in that driver?
> (Just curious).

No idea, but that's how it is IIRC. And there are other drivers to think
about as well.


>> What would work is using a parameter field, per Stephen's suggestion
>> elsewhere in the thread. Older libpq versions should just ignore the
>> parameter if they don't know what it is. Question is, is that too ugly a
>> workaround, since we'll need to keep it around forever? (We have special
>> handling of a few other parameters already, so maybe not?)
> 
> You only need to keep the runtimeparameter for as long as you don't bump
> the protocol version.
> Then again, runtimeparameters are cheap.

Yeah, I guess that's true. Once you break backwards compatibility once,
you can break a couple of things at the same time :)

//Magnus


Re: Replay attack of query cancel

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> What would work is using a parameter field, per Stephen's suggestion
> elsewhere in the thread. Older libpq versions should just ignore the
> parameter if they don't know what it is.

+1 for that one; we have done it already for several send-at-startup
parameters that didn't use to be there, eg standard_conforming_strings.

I'd go with defining it as the maximum cancel version number
supported, eg "cancel_version = 31", with the understanding that
if it's not reported then 3.0 should be assumed.

So the plan looks like this, I think:

* Main FE/BE protocol doesn't change, but there might be an additional
value reported in the initial ParameterStatus messages.  We believe that
this will not break any existing clients.

* Server accepts two different styles of cancel messages, identified
by different protocol numbers.

* Client decides which type to send based on looking for the
cancel_version parameter.

This seems back-patchable since neither old clients nor old servers
are broken: the only consequence of using an old one is your cancels
aren't sent securely, which frankly a lot of people aren't going
to care about anyway.

Note after looking at the postmaster code: the contents of new-style
cancel keys ought to be the backend PID in clear, the sequence number in
clear, and the hash of backend PID + cancel key + sequence number.
If we don't do it that way, the postmaster will need to apply the hash
function for *each* backend to see if it matches, which seems like
a lot more computation than we want.  The postmaster needs to be able to
identify which backend is the potential match before executing the hash.

The main drawback I can see to keeping this backward-compatible is that
keeping the cancel key to 32 bits could still leave us vulnerable to
brute force attacks: once you've seen a cancel message, just try all
the possible keys till you get a match, and then you can generate a
cancel that will work.  Can we refine the HMAC idea to defeat that?
Otherwise we're assuming that 2^32 HMACs take longer than the useful
life of a cancel key, which doesn't seem like a very future-proof
assumption.
        regards, tom lane


Re: Replay attack of query cancel

From
Magnus Hagander
Date:
Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> What would work is using a parameter field, per Stephen's suggestion
>> elsewhere in the thread. Older libpq versions should just ignore the
>> parameter if they don't know what it is.
> 
> +1 for that one; we have done it already for several send-at-startup
> parameters that didn't use to be there, eg standard_conforming_strings.
> 
> I'd go with defining it as the maximum cancel version number
> supported, eg "cancel_version = 31", with the understanding that
> if it's not reported then 3.0 should be assumed.
> 
> So the plan looks like this, I think:
> 
> * Main FE/BE protocol doesn't change, but there might be an additional
> value reported in the initial ParameterStatus messages.  We believe that
> this will not break any existing clients.
> 
> * Server accepts two different styles of cancel messages, identified
> by different protocol numbers.

With the additional point that there is a GUC variable to turn this off
or warn about it, right?


> * Client decides which type to send based on looking for the
> cancel_version parameter.
> 
> This seems back-patchable since neither old clients nor old servers
> are broken: the only consequence of using an old one is your cancels
> aren't sent securely, which frankly a lot of people aren't going
> to care about anyway.

Right. All those people running without SSL in the first place, for
example...


> Note after looking at the postmaster code: the contents of new-style
> cancel keys ought to be the backend PID in clear, the sequence number in
> clear, and the hash of backend PID + cancel key + sequence number.
> If we don't do it that way, the postmaster will need to apply the hash
> function for *each* backend to see if it matches, which seems like
> a lot more computation than we want.  The postmaster needs to be able to
> identify which backend is the potential match before executing the hash.

Yes, certainly. Otherwise, the cost ends up a lot higher on the server
than the client, which is a really simple DOS opportunity.


> The main drawback I can see to keeping this backward-compatible is that
> keeping the cancel key to 32 bits could still leave us vulnerable to
> brute force attacks: once you've seen a cancel message, just try all
> the possible keys till you get a match, and then you can generate a
> cancel that will work.  Can we refine the HMAC idea to defeat that?
> Otherwise we're assuming that 2^32 HMACs take longer than the useful
> life of a cancel key, which doesn't seem like a very future-proof
> assumption.

Well, you're also going to have to increment <n> every time. We could
just cap <n> at <arbitrary level>. Say you can only cancel queries on a
single connection a million times or so. It's not perfect, but it gets
you somewhere.

Another option would be to send a new, longer, cancel key as part of the
separate parameter we're sending during startup (when we're indicating
which version we support). Then we'll use the longer cancel key if we're
dealing with "new style cancel" but keep the old 32-bit one for
backwards compatibility.

//Magnus


Re: Replay attack of query cancel

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> Tom Lane wrote:
>> * Server accepts two different styles of cancel messages, identified
>> by different protocol numbers.

> With the additional point that there is a GUC variable to turn this off
> or warn about it, right?

I see pretty much no value in that.

>> The main drawback I can see to keeping this backward-compatible is that
>> keeping the cancel key to 32 bits could still leave us vulnerable to
>> brute force attacks: once you've seen a cancel message, just try all
>> the possible keys till you get a match, and then you can generate a
>> cancel that will work.  Can we refine the HMAC idea to defeat that?
>> Otherwise we're assuming that 2^32 HMACs take longer than the useful
>> life of a cancel key, which doesn't seem like a very future-proof
>> assumption.

> Well, you're also going to have to increment <n> every time. We could
> just cap <n> at <arbitrary level>. Say you can only cancel queries on a
> single connection a million times or so. It's not perfect, but it gets
> you somewhere.

Once you've brute-forced the secret key, you can just use an <n> that's
somewhat more than the last one the client used, assuming you've been
sniffing the connection the whole time.  Or use one that's just a bit
less than what you can predict the server will take.

Not only do you get to kill the current query, but you'll have prevented
the client from issuing legitimate cancels after that, since it won't
know you bumped the server's <n> value.  So the idea still needs work.

> Another option would be to send a new, longer, cancel key as part of the
> separate parameter we're sending during startup (when we're indicating
> which version we support). Then we'll use the longer cancel key if we're
> dealing with "new style cancel" but keep the old 32-bit one for
> backwards compatibility.

Yeah.  So then we just need one added parameter: secure_cancel_key =
string.

BTW, should we make all of this conditional on the use of an SSL
connection?  If the original sending of the cancel key isn't secure
against sniffing, it's hard to see what anyone is buying with all the
added computation.
        regards, tom lane


Re: Replay attack of query cancel

From
Magnus Hagander
Date:
Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> Tom Lane wrote:
>>> * Server accepts two different styles of cancel messages, identified
>>> by different protocol numbers.
> 
>> With the additional point that there is a GUC variable to turn this off
>> or warn about it, right?
> 
> I see pretty much no value in that.

The server is the point where you enforce security policies. The server
is where you say that SSL is required, for example. The same way, this
would let you at the server level say that secure cancels are required.


>>> The main drawback I can see to keeping this backward-compatible is that
>>> keeping the cancel key to 32 bits could still leave us vulnerable to
>>> brute force attacks: once you've seen a cancel message, just try all
>>> the possible keys till you get a match, and then you can generate a
>>> cancel that will work.  Can we refine the HMAC idea to defeat that?
>>> Otherwise we're assuming that 2^32 HMACs take longer than the useful
>>> life of a cancel key, which doesn't seem like a very future-proof
>>> assumption.
> 
>> Well, you're also going to have to increment <n> every time. We could
>> just cap <n> at <arbitrary level>. Say you can only cancel queries on a
>> single connection a million times or so. It's not perfect, but it gets
>> you somewhere.
> 
> Once you've brute-forced the secret key, you can just use an <n> that's
> somewhat more than the last one the client used, assuming you've been
> sniffing the connection the whole time.  Or use one that's just a bit
> less than what you can predict the server will take.
> 
> Not only do you get to kill the current query, but you'll have prevented
> the client from issuing legitimate cancels after that, since it won't
> know you bumped the server's <n> value.  So the idea still needs work.

Yeah, I like the idea below much better.


>> Another option would be to send a new, longer, cancel key as part of the
>> separate parameter we're sending during startup (when we're indicating
>> which version we support). Then we'll use the longer cancel key if we're
>> dealing with "new style cancel" but keep the old 32-bit one for
>> backwards compatibility.
> 
> Yeah.  So then we just need one added parameter: secure_cancel_key =
> string.

Yup. Seems a whole lot cleaner.


> BTW, should we make all of this conditional on the use of an SSL
> connection?  If the original sending of the cancel key isn't secure
> against sniffing, it's hard to see what anyone is buying with all the
> added computation.

Not sure. In practice it makes no difference, but our code is more
readable with less #ifdefs and branches. I don't think the added
computation makes any noticable difference at all in the normal
non-attacker scenario, so I'd just as well leave it in.

//Magnus


Re: Replay attack of query cancel

From
"Kevin Grittner"
Date:
>>> Tom Lane <tgl@sss.pgh.pa.us> wrote: 
> BTW, should we make all of this conditional on the use of an SSL
> connection?  If the original sending of the cancel key isn't secure
> against sniffing, it's hard to see what anyone is buying with all
the
> added computation.
+1
All of our important production work is done with local connections. 
If the machine has been compromised to the level that loopback traffic
is being intercepted, these protections won't help.
-Kevin


Re: Replay attack of query cancel

From
"Marko Kreen"
Date:
On 8/8/08, Heikki Linnakangas <heikki@enterprisedb.com> wrote:
> It occurred to me a while ago that our query cancel messages are sent
> unencrypted, even when SSL is otherwise used. That's not a big issue on its
> own, because the cancellation message only contains the backend PID and the
> cancellation key, but it does open us to a replay attack. After the first
> query in a connection has been cancelled, an eavesdropper can reuse the
> backend PID and cancellation key to cancel subsequent queries on the same
> connection.
>
>  We discussed this on the security list, and the consensus was that this
> isn't worth a quick fix and a security release, because
>  - it only affects applications that use query cancel, which is rare
>  - it only affects SSL encrypted connections (the point is moot
> non-encrypted connections, as you can just snatch the cancel key from the
> initial message)
>  - it only let's you cancel queries, IOW it's only a DOS attack.
>  - there's no simple fix.
>
>  However, it is something to keep in mind, and perhaps fix for the next
> release.
>
>  One idea for fixing this is to make cancellation keys disposable, and
> automatically issue a new one through the main connection when one is used,
> but that's not completely trivial, and requires a change in both the clients
> and the server. Another idea is to send the query cancel message only after
> SSL authentication, but that is impractical for libpq because we PQcancel
> needs to be callable from a signal handler.

Why not establish SSL before sending cancel key?

That way potential SSL auth is also enforced.

I'm not against improving cancel protocol generally, also for non-SSL
clients, but this seems orthogonal to SSL issue - if client uses SSL then
I'd expect cancel packet also be sent over SSL.

-- 
marko


Re: Replay attack of query cancel

From
Heikki Linnakangas
Date:
On Wed, 13 Aug 2008, Marko Kreen wrote:
> On 8/8/08, Heikki Linnakangas <heikki@enterprisedb.com> wrote:
>>  One idea for fixing this is to make cancellation keys disposable, and
>> automatically issue a new one through the main connection when one is used,
>> but that's not completely trivial, and requires a change in both the clients
>> and the server. Another idea is to send the query cancel message only after
>> SSL authentication, but that is impractical for libpq because we PQcancel
>> needs to be callable from a signal handler.
>
> Why not establish SSL before sending cancel key?
>
> That way potential SSL auth is also enforced.
>
> I'm not against improving cancel protocol generally, also for non-SSL
> clients, but this seems orthogonal to SSL issue - if client uses SSL then
> I'd expect cancel packet also be sent over SSL.

Because libpq PQcancel needs to be callable from a signal handler. There's 
limitations on what you can safely do in a signal handler, and calling an 
external SSL library probably isn't safe, at least not on all platforms.

It certainly would be possible for many other clients like JDBC, though. 
In fact, we might want to do that anyway, even if we change the protocol, 
just on the grounds that it's surprising that the cancellation isn't SSL 
protected while the rest of the protocol is. In theory, one might have a 
strict firewall rule that only let's through SSL connections, or you might 
want to hide the fact that a query is being cancelled from eavesdroppers, 
or the PID. It's a bit far-fetched and I doubt anyone cares in practice, 
but for the clients that it's easy to do, I think we should.

- Heikki


Re: Replay attack of query cancel

From
Bruce Momjian
Date:
Added to TODO:

* Prevent query cancel packets from being replayed by an attacker, especially when using SSL
 http://archives.postgresql.org/pgsql-hackers/2008-08/msg00345.php



---------------------------------------------------------------------------

Heikki Linnakangas wrote:
> It occurred to me a while ago that our query cancel messages are sent 
> unencrypted, even when SSL is otherwise used. That's not a big issue on 
> its own, because the cancellation message only contains the backend PID 
> and the cancellation key, but it does open us to a replay attack. After 
> the first query in a connection has been cancelled, an eavesdropper can 
> reuse the backend PID and cancellation key to cancel subsequent queries 
> on the same connection.
> 
> We discussed this on the security list, and the consensus was that this 
> isn't worth a quick fix and a security release, because
> - it only affects applications that use query cancel, which is rare
> - it only affects SSL encrypted connections (the point is moot 
> non-encrypted connections, as you can just snatch the cancel key from 
> the initial message)
> - it only let's you cancel queries, IOW it's only a DOS attack.
> - there's no simple fix.
> 
> However, it is something to keep in mind, and perhaps fix for the next 
> release.
> 
> One idea for fixing this is to make cancellation keys disposable, and 
> automatically issue a new one through the main connection when one is 
> used, but that's not completely trivial, and requires a change in both 
> the clients and the server. Another idea is to send the query cancel 
> message only after SSL authentication, but that is impractical for libpq 
> because we PQcancel needs to be callable from a signal handler.
> 
> -- 
>    Heikki Linnakangas
>    EnterpriseDB   http://www.enterprisedb.com
> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: Replay attack of query cancel

From
Andrew Gierth
Date:
>>>>> "Magnus" == Magnus Hagander <magnus@hagander.net> writes:
[snip]

I'm looking (at Magnus' suggestion) at implementing this.

There appears to be only one significant obstacle; since the query
cancel message is received _after_ forking a new backend, there has to
be some mechanism for recording the new value of N on success.  This
is obviously fairly easy in the EXEC_BACKEND case, but it seems quite
intrusive a change to have the non-EXEC_BACKEND case use shared memory
as well.

I can think of a couple of other ways to do it (e.g. some standard
Unix pipe tricks) but I'm not sure of what portability assumptions are
usually made. (I'm assuming that Windows always uses EXEC_BACKEND.)
Ideas?

(To sum up the previous discussion, this is the proposal as I understand
it so far:
1. Servers that support secure cancels will report secure_cancel_key in   the startup GUC settings; the value of this
keyis just randomness   (presumably in hex for convenience).
 
2. The server accepts either the old-style or the secure cancel   request from the client, but doesn't allow old-style
requests  once a valid secure request has been seen.
 
3. The client doesn't send secure cancel requests unless   secure_cancel_key was reported. The client may or may not
choose  to send secure cancels based on whether SSL is in use; we can   leave this decision up to the client in
general,even if we make   libpq use secure cancels only in the SSL case.
 

The upshot is that replay protection is automatically available if
both the client and server support it, and the client chooses to use it.
The net protocol change is one new GUC and one new message format for
the cancel message.)

-- 
Andrew.


Re: Replay attack of query cancel

From
Alvaro Herrera
Date:
Andrew Gierth wrote:

> There appears to be only one significant obstacle; since the query
> cancel message is received _after_ forking a new backend, there has to
> be some mechanism for recording the new value of N on success.  This
> is obviously fairly easy in the EXEC_BACKEND case, but it seems quite
> intrusive a change to have the non-EXEC_BACKEND case use shared memory
> as well.

I think you should look at making the memory used for this shared in
both cases, EXEC_BACKEND and not.  The only downside is that shared
memory usage will grow a bit on a minor release, but it'll be tiny.  The
portability problems caused by any other trick you use to transmit the
value is probably going to be a lot harder.

>  2. The server accepts either the old-style or the secure cancel
>     request from the client, but doesn't allow old-style requests
>     once a valid secure request has been seen.

Hmm, I think there should be a way to turn off acceptance of old-style
without necessarily requiring a new-style request.  Otherwise, how are
you protected from DoS if you have never sent a cancel request at all?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: Replay attack of query cancel

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Andrew Gierth wrote:
>> 2. The server accepts either the old-style or the secure cancel
>> request from the client, but doesn't allow old-style requests
>> once a valid secure request has been seen.

> Hmm, I think there should be a way to turn off acceptance of old-style
> without necessarily requiring a new-style request.  Otherwise, how are
> you protected from DoS if you have never sent a cancel request at all?

Assuming you were using SSL, it's hard to see how an attacker is going
to get your cancel key without having seen a cancel request.

However, I dislike Andrew's proposal above even without that issue,
because it means *still more* changeable state that has to be magically
shared between postmaster and backends.  If we want to have a way for
people to disable insecure cancels, we should just have a postmaster
configuration parameter that does it.

Also, this whole proposal has gotten far past what I'd consider a
sanely back-patchable thing.  Don't bother thinking about whether it
will go into pre-8.4 code.
        regards, tom lane


Re: Replay attack of query cancel

From
Andrew Gierth
Date:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>> Andrew Gierth wrote:>>> 2. The server accepts either the old-style or the secure cancel>>> request from the client,
butdoesn't allow old-style requests>>> once a valid secure request has been seen.
 
>> Hmm, I think there should be a way to turn off acceptance of>> old-style without necessarily requiring a new-style
request.>>Otherwise, how are you protected from DoS if you have never sent a>> cancel request at all?
 
Tom> Assuming you were using SSL, it's hard to see how an attacker isTom> going to get your cancel key without having
seena cancelTom> request.
 
Tom> However, I dislike Andrew's proposal above even without thatTom> issue, because it means *still more* changeable
statethat hasTom> to be magically shared between postmaster and backends.
 

You get it for free; initialize N on the server side to 0, and accept
old-style cancels only if it is still 0. (Require the first secure
cancel to have N > 0)

-- 
Andrew.


Re: Replay attack of query cancel

From
Magnus Hagander
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> Andrew Gierth wrote:
>>> 2. The server accepts either the old-style or the secure cancel
>>> request from the client, but doesn't allow old-style requests
>>> once a valid secure request has been seen.
> 
>> Hmm, I think there should be a way to turn off acceptance of old-style
>> without necessarily requiring a new-style request.  Otherwise, how are
>> you protected from DoS if you have never sent a cancel request at all?
> 
> Assuming you were using SSL, it's hard to see how an attacker is going
> to get your cancel key without having seen a cancel request.

Not only that, but he'll have to see an *old-style* cancel request,
since the new style doesn't contain the key.

And if you're *not* using SSL, the attacker can just sniff they key off
the initial packet instead.


> However, I dislike Andrew's proposal above even without that issue,
> because it means *still more* changeable state that has to be magically
> shared between postmaster and backends.  If we want to have a way for
> people to disable insecure cancels, we should just have a postmaster
> configuration parameter that does it.

Agreed. Your security policy also should not depend on what your client
happens to do, it should be enforceable.


//Magnus



Re: Replay attack of query cancel

From
Bruce Momjian
Date:
This bug has not been fixed, but it is on the TODO list:
o Prevent query cancel packets from being replayed by an attacker,especially when using SSL 

I am going to consider this item closed meaning I am not going to track
that it is fixed for 8.4;  it is just documented on our TODO as a known
limitation.

---------------------------------------------------------------------------

Magnus Hagander wrote:
> Tom Lane wrote:
> > Alvaro Herrera <alvherre@commandprompt.com> writes:
> >> Andrew Gierth wrote:
> >>> 2. The server accepts either the old-style or the secure cancel
> >>> request from the client, but doesn't allow old-style requests
> >>> once a valid secure request has been seen.
> > 
> >> Hmm, I think there should be a way to turn off acceptance of old-style
> >> without necessarily requiring a new-style request.  Otherwise, how are
> >> you protected from DoS if you have never sent a cancel request at all?
> > 
> > Assuming you were using SSL, it's hard to see how an attacker is going
> > to get your cancel key without having seen a cancel request.
> 
> Not only that, but he'll have to see an *old-style* cancel request,
> since the new style doesn't contain the key.
> 
> And if you're *not* using SSL, the attacker can just sniff they key off
> the initial packet instead.
> 
> 
> > However, I dislike Andrew's proposal above even without that issue,
> > because it means *still more* changeable state that has to be magically
> > shared between postmaster and backends.  If we want to have a way for
> > people to disable insecure cancels, we should just have a postmaster
> > configuration parameter that does it.
> 
> Agreed. Your security policy also should not depend on what your client
> happens to do, it should be enforceable.
> 
> 
> //Magnus
> 
> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +