Re: RFC: seccomp-bpf support - Mailing list pgsql-hackers

From Tom Lane
Subject Re: RFC: seccomp-bpf support
Msg-id 22580.1567014153@sss.pgh.pa.us
In response to Re: RFC: seccomp-bpf support  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Responses Re: RFC: seccomp-bpf support  (Joshua Brindle <joshua.brindle@crunchydata.com>)
List pgsql-hackers
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> On 2019-08-28 17:13, Joe Conway wrote:
>> Attached is a patch for discussion, adding support for seccomp-bpf
>> (nowadays generally just called seccomp) syscall filtering at
>> configure-time using libseccomp. I would like to get this in shape to be
>> committed by the end of the November CF if possible.

> To compute the initial set of allowed system calls, you need to have
> fantastic test coverage.  What you don't want is some rarely used error
> recovery path to cause a system crash.  I wouldn't trust our current
> coverage for this.

Yeah, that seems like quite a serious problem.  I think you'd want
to have some sort of static-code-analysis-based way of identifying
the syscalls in use, rather than trying to test your way to it.
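
To make the static-analysis idea concrete, here is a rough sketch that
derives a candidate syscall list from a binary's undefined dynamic symbols
instead of from test runs. It assumes input shaped like the output of
`nm -D --undefined-only`. The wrapper table is a tiny hypothetical sample,
not a complete or authoritative mapping, and the function name is made up
for illustration. Direct syscall() use and calls made inside shared
libraries would be missed, so this is a starting point at best.

```python
import re

# Hypothetical, deliberately incomplete sample mapping libc wrapper
# names to the syscall they are assumed to issue on a recent Linux/glibc.
WRAPPER_TO_SYSCALL = {
    "read": "read",
    "write": "write",
    "open": "openat",
    "fsync": "fsync",
    "fork": "clone",
}

# Matches undefined-symbol lines such as "   U read@GLIBC_2.2.5" or "   U read".
_SYMBOL_LINE = re.compile(r"^\s*U\s+([A-Za-z_][A-Za-z0-9_]*)")


def candidate_syscalls(nm_output):
    """Return the set of syscalls implied by the undefined symbols listed."""
    found = set()
    for line in nm_output.splitlines():
        match = _SYMBOL_LINE.match(line)
        if match is None:
            continue  # defined symbol, blank line, or unrecognized format
        syscall = WRAPPER_TO_SYSCALL.get(match.group(1))
        if syscall is not None:
            found.add(syscall)
    return found
```

A real tool would need a complete wrapper table and would have to follow
calls into every linked library, which is where most of the work lies.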

> Overall, I think this sounds like a maintenance headache, and the
> possible benefits are unclear.

After thinking about this for a while, I really don't follow what
threat model it's trying to protect against.  Given that we'll allow
any syscall that an unmodified PG executable might use, it seems
like the only scenarios being protected against involve someone
having already compromised the server enough to have arbitrary code
execution.  OK, fine, but then why wouldn't the attacker just
bypass libseccomp?  Or tell it to let through the syscall he wants
to use?  Having the list of allowed syscalls be determined inside
the process seems like fundamentally the wrong implementation.
I'd have expected a feature like this to be implemented by SELinux,
or some similar technology where the filtering is done by logic
that's outside the executable you wish to not trust.

(After googling for libseccomp, I see that it's supposed to not
allow syscalls to be turned back on once turned off, but that isn't
any protection against this problem.  An attacker who's found an ACE
hole in Postgres can just issue ALTER SYSTEM SET to disable the
feature, then force a postmaster restart, then profit.)
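
The restart argument can be illustrated with a toy model. This is not
libseccomp's API and not the patch's code: the Process class, the
start_server function, and the "filter_enabled" and "allowed_syscalls"
settings are all hypothetical names chosen for the sketch. It only models
the two properties at issue: a filter can be narrowed but never widened
within one process, and a freshly started process installs whatever its
configuration says.

```python
class Process:
    """Toy stand-in for a process with a one-way syscall filter."""

    def __init__(self):
        # None means no filter is installed, so everything is permitted.
        self.allowed = None

    def restrict(self, allowed):
        # A new filter is intersected with the old one: it can only narrow.
        new = frozenset(allowed)
        self.allowed = new if self.allowed is None else self.allowed & new

    def permits(self, syscall):
        return self.allowed is None or syscall in self.allowed


def start_server(config):
    """Start a fresh process, installing a filter only if the config asks."""
    proc = Process()
    if config.get("filter_enabled"):
        proc.restrict(config["allowed_syscalls"])
    return proc


def demo():
    config = {"filter_enabled": True, "allowed_syscalls": {"read", "write"}}
    server = start_server(config)

    # Trying to widen the filter from inside the running process fails.
    server.restrict({"read", "write", "execve"})
    blocked_in_process = not server.permits("execve")

    # Changing the setting and restarting yields a process with no filter.
    config["filter_enabled"] = False
    server = start_server(config)
    allowed_after_restart = server.permits("execve")

    return blocked_in_process, allowed_after_restart
```

In this model demo() reports that the in-process attempt is blocked while
the post-restart process permits the call, which is the gap described above.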

I follow the idea of limiting the attack surface for kernel bugs,
but this doesn't seem like a useful implementation of that, even
ignoring the ease-of-use problems Peter mentions.

            regards, tom lane
