Thread: Practical Timing Side Channel Attacks on Memory Compression
Hi all,
A few months ago a group of researchers published a paper about an LZ77 vulnerability [1], and it also affects PGLZ. In my view, it could be a really dangerous issue for some kinds of applications. If I understand it correctly, it is possible to leak approximately 24 bytes of secret data per hour (depending on the HW configuration).
I understand that there is no simple and easy solution, but I would like to know your opinion on this. Do you have any plans for how to deal with it?
Thanks
On Wed, Apr 6, 2022 at 7:18 AM Filip Janus <fjanus@redhat.com> wrote:
> A few months ago a group of researchers published a paper about an LZ77
> vulnerability [1], and it also affects PGLZ. In my view, it could be a
> really dangerous issue for some kinds of applications. If I understand it
> correctly, it is possible to leak approximately 24 bytes of secret data
> per hour (depending on the HW configuration).
>
> I understand that there is no simple and easy solution, but I would like
> to know your opinion on this. Do you have any plans for how to deal with
> it?

I hadn't heard of this before. It seems to be a real vulnerability in PGLZ. Fortunately, the attack relies on the presence of conditions that may not always be present, and the rate of data leakage is pretty slow.

Some threats of this kind are going to need to be addressed outside the database, perhaps. For example, you could rate-limit attempts to access your web application to make it harder to accumulate enough accesses to get any meaningful data leakage, and you could store highly secret data in a different place than you store data that the user has the ability to modify. It sounds like even just putting those things in separate jsonb columns rather than the same one would block this particular attack. A user could also choose to disable compression for a certain column entirely if they're worried about this kind of thing.

However, there are new attacks all the time, and it's going to be really hard to block them all. Variable latency is extremely difficult to avoid, because pretty much every piece of code anyone writes is going to have if statements and loops that can run for different numbers of iterations on different inputs, and then there are CPU effects like caching and branch prediction that add to the problem. There are tons of attacks like this, and even if we could somehow, by magic, secure PostgreSQL against this one completely, there will be lots more in the future. I think it's inevitable that there are going to be more and more papers demonstrating that a determined attacker can leak information out of system A by very carefully measuring the latency of operation X under different conditions, and there is no real solution to that problem in general.

One thing that we could do internally to PostgreSQL is add more possible TOAST compression algorithms. In addition to PGLZ, which the attack in the paper targets, we now have LZ4 as an option. That's probably vulnerable too, and probably zstd is as well, but if a state-of-the-art algorithm emerges that somehow isn't vulnerable, we can consider adding support for it. I don't think that as a project we really ought to be in the business of trying to design our own compression algorithms. PGLZ does a great job for something that was written by a PostgreSQL hacker, and many years ago at that, but not surprisingly, people who spend all day thinking about compression are really, really good at it. We should leave it up to them to figure out whether there's something to be done here, and if the answer is yes, then we can consider adopting whatever they come up with. Personally, I don't quite see how such a thing would be possible, but I'm not a compression expert.

One last thought: I don't think it's right to suppose that every security vulnerability is the result of some design flaw and every security vulnerability must be patched. Imagine, for example, that someone posted a paper showing that they could break into your house.

Your reaction to that paper would probably depend on how they did it. If it turns out that the lock you have on your front door will unlock if you give it a hard bump with your fist, you'd probably want to replace the lock with one that didn't have that design flaw. But if the paper showed that they could break into your house by breaking one of the windows with a crowbar, would you replace all of those windows with solid steel? Most people understand that a window is likely to be made of a more breakable substance than whatever surrounds it, because it has an additional design constraint: it has to permit light to pass through it. We accept that as a trade-off when we choose to live in a house rather than a bunker.

In the same way, without denying that there's a real vulnerability here, I don't think that anyone who understands a little bit about how compression and decompression work would expect decompression to take the same amount of time on every input. Pretty much every compression algorithm has a mode where incompressible data is copied through byte for byte, and other modes that take advantage of repeated byte sequences. It's only reasonable to suppose that those various code paths are not all going to run at the same speed, and nobody would want them to. It would mean trying to slow down the fast paths through the code to the same speed as the slow paths, and because decompression speed is so important, that sounds like a thing that most people would not want.

Do you have any suggestions on what we should do here?

-- 
Robert Haas
EDB: http://www.enterprisedb.com
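For illustration, the column-level mitigations Robert mentions map onto existing DDL: since PostgreSQL 14 the TOAST compression method can be chosen per column, and setting a column's storage to EXTERNAL disables compression for it entirely. A minimal sketch, with the table and column names invented for the example:

    -- Hypothetical table mixing attacker-controlled and secret data.
    CREATE TABLE app_state (
        id         bigint PRIMARY KEY,
        user_doc   jsonb,  -- attacker-controlled
        secret_doc jsonb   -- sensitive; kept in a separate column
    );

    -- Pick a specific compression method per column (PostgreSQL 14+).
    ALTER TABLE app_state ALTER COLUMN user_doc SET COMPRESSION lz4;

    -- Disable compression for the sensitive column entirely:
    -- EXTERNAL means "store out-of-line, but never compress".
    ALTER TABLE app_state ALTER COLUMN secret_doc SET STORAGE EXTERNAL;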
Robert Haas <robertmhaas@gmail.com> writes:
> One last thought: I don't think it's right to suppose that every
> security vulnerability is the result of some design flaw and every
> security vulnerability must be patched.

As far as Postgres is concerned, I'm kind of unimpressed by timing-based attacks. There are enough layers between a hypothetical attacker and a particular algorithm in the backend that it'd be really hard to get any reliable numbers. Length-based attacks are more realistic, since e.g. we allow you to find out the compressed size of a data value. But as you noted, those can be defeated by not storing sensitive data in the same place as attacker-controlled data. Or turning off compression, but that's largely throwing the baby out with the bathwater. In the end I think it's up to the DBA how concerned to be about this and what measures she should take to mitigate any risks.

			regards, tom lane
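To make the length side channel Tom mentions concrete: pg_column_size() reports the stored, post-compression size of a datum, so when attacker-controlled bytes share a compressed value with a secret, a correct guess compresses better and the size difference is observable. A minimal sketch, with the table name and token invented; the secret is repeated here only to push the value over the TOAST compression threshold:

    CREATE TABLE leak_demo (doc text);

    -- The row whose guess matches the co-located secret compresses
    -- slightly better, because PGLZ can back-reference the match.
    INSERT INTO leak_demo VALUES
      (repeat('secret=ABC123XYZ;', 200) || 'guess=ABC123XYZ'),
      (repeat('secret=ABC123XYZ;', 200) || 'guess=ZXCVBNMKL');

    -- The first row should report a slightly smaller stored size.
    SELECT pg_column_size(doc) AS stored_bytes FROM leak_demo;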
On Wed, Apr 6, 2022 at 10:14 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > One last thought: I don't think it's right to suppose that every
> > security vulnerability is the result of some design flaw and every
> > security vulnerability must be patched.
>
> As far as Postgres is concerned, I'm kind of unimpressed by timing-based
> attacks. There are enough layers between a hypothetical attacker and a
> particular algorithm in the backend that it'd be really hard to get any
> reliable numbers. Length-based attacks are more realistic, since e.g.
> we allow you to find out the compressed size of a data value. But as
> you noted, those can be defeated by not storing sensitive data in the
> same place as attacker-controlled data. Or turning off compression,
> but that's largely throwing the baby out with the bathwater. In the
> end I think it's up to the DBA how concerned to be about this and
> what measures she should take to mitigate any risks.

I think that the paper shows that, under the right set of circumstances, a timing-based attack is possible here. How frequently those circumstances will arise is debatable, but I don't find it hard to believe that there are production PostgreSQL clusters out there against which such an attack could be mounted. I think you're right when you say that length-based attacks might be practical, and perhaps some of those might have more to do with PostgreSQL than this does, since this is really mostly about the properties of compression algorithms in general rather than PostgreSQL specifically. I also completely agree that it's really up to the DBA to decide how worried to be and what to do about it.

I think that the fact that compression doesn't always run at the same speed or produce an output of the same size is pretty much intrinsic to the idea of a compression algorithm, and in a wide variety of circumstances that is absolutely fine and absolutely desirable. When it permits this kind of attack, it's not, but then don't use compression, or mitigate the problem some other way.

-- 
Robert Haas
EDB: http://www.enterprisedb.com
On Wed, 6 Apr 2022 at 10:29, Robert Haas <robertmhaas@gmail.com> wrote:
>
> I think that the paper shows that, under the right set of
> circumstances, a timing-based attack is possible here.

Generally any argument that an attack is infeasible is risky and usually leads to security professionals showing that surprisingly difficult attacks are entirely feasible.

However I think the opposite argument is actually much more compelling. There are *so many* timing attacks on a general-purpose computing platform like Postgres that any defense to them can't concentrate on just one code path and has to take a more comprehensive approach.

So for example a front-end can add some stochastic latency, or perhaps pad latency to a fixed amount. Perhaps Postgres could offer some protection at that level by e.g. offering a function to do it. For most users I think they're better off implementing it at the application level, but some people use database stored functions as their application level, so it might be useful for them.

Something like pg_sleep_until_multiple_of('50ms'), which looks at the transaction start time and calculates the amount of time to sleep automatically.

-- 
greg
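For illustration, a minimal PL/pgSQL sketch of the function Greg describes; the name pg_sleep_until_multiple_of and its semantics are his hypothetical suggestion, not anything that exists in core:

    CREATE OR REPLACE FUNCTION pg_sleep_until_multiple_of(quantum interval)
    RETURNS void
    LANGUAGE plpgsql AS $$
    DECLARE
        q float8 := extract(epoch FROM quantum);
        e float8 := extract(epoch FROM clock_timestamp() - transaction_timestamp());
    BEGIN
        -- Round the elapsed transaction time up to the next multiple of
        -- the quantum, so observable latency is quantized rather than
        -- varying with the work performed.
        PERFORM pg_sleep(ceil(e / q) * q - e);
    END;
    $$;

    -- Usage, at the end of a transaction that touches sensitive data:
    --   SELECT pg_sleep_until_multiple_of('50ms');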
Thanks for all of your opinions. I have much the same feeling.
The best layer for mitigation is probably the user application.
It can arrange the correct data layout in the database, set up access limits for the app, and apply many other mitigation mechanisms.
-Filip-
On Wed, Apr 6, 2022 at 5:24 PM Greg Stark <stark@mit.edu> wrote:
> On Wed, 6 Apr 2022 at 10:29, Robert Haas <robertmhaas@gmail.com> wrote:
> >
> > I think that the paper shows that, under the right set of
> > circumstances, a timing-based attack is possible here.
>
> Generally any argument that an attack is infeasible is risky and
> usually leads to security professionals showing that surprisingly
> difficult attacks are entirely feasible.
>
> However I think the opposite argument is actually much more
> compelling. There are *so many* timing attacks on a general-purpose
> computing platform like Postgres that any defense to them can't
> concentrate on just one code path and has to take a more comprehensive
> approach.
>
> So for example a front-end can add some stochastic latency, or perhaps
> pad latency to a fixed amount. Perhaps Postgres could offer some
> protection at that level by e.g. offering a function to do it. For most
> users I think they're better off implementing it at the application
> level, but some people use database stored functions as their
> application level, so it might be useful for them.
>
> Something like pg_sleep_until_multiple_of('50ms'), which looks at the
> transaction start time and calculates the amount of time to sleep
> automatically.
>
> --
> greg