Thread: Making pglister work with exim 4.96+

Making pglister work with exim 4.96+

From
Célestin Matte
Date:
Exim introduced variable tainting as a security measure starting from exim 4.93. Starting from exim 4.96, tainting is
mandatoryfor commands. This means that it is no longer possible to pass variables to pglister's inject.py (which
requirespassing $sender_address, $local_part, $domain and $header_message-id) or pgarchives' load_message.py (which
requirespassing $local_part).
 

Exim, while enforcing these strict security policies, only provides the possibility to de-taint variables in very
specificsituations (explicit matching with a list). Problem is known and discussed here [1]. This is very annoying in
ourcase, as $sender_address or $header_message-id can be pretty much anything (and no, matching against a regexp is not
sufficientto de-taint).
 
I've been scratching my head over this for a while and can't figure out a proper way to fix this issue. A possible
workaroundwould be to blindly de-taint anything using an ugly hack [2] but that defeats the purpose of having variable
taintingin the first place, and doesn't seem like a valid, long-term, production-ready solution.
 

I'm starting to wonder if the only solution would be to have pglister fetch information from exim in some way, instead
ofthe other way around.
 

Any idea?

[1] : https://lists.exim.org/lurker/message/20201109.222746.24ea3904.fi.html
[2] : https://jimbobmcgee.wordpress.com/2020/07/29/de-tainting-exim-configuration-variables/

-- 
Célestin Matte



Re: Making pglister work with exim 4.96+

From
Magnus Hagander
Date:


On Mon, Jun 17, 2024 at 11:22 AM Célestin Matte <celestin.matte@cmatte.me> wrote:
Exim introduced variable tainting as a security measure starting from exim 4.93. Starting from exim 4.96, tainting is mandatory for commands. This means that it is no longer possible to pass variables to pglister's inject.py (which requires passing $sender_address, $local_part, $domain and $header_message-id) or pgarchives' load_message.py (which requires passing $local_part).

Exim, while enforcing these strict security policies, only provides the possibility to de-taint variables in very specific situations (explicit matching with a list). Problem is known and discussed here [1]. This is very annoying in our case, as $sender_address or $header_message-id can be pretty much anything (and no, matching against a regexp is not sufficient to de-taint).
I've been scratching my head over this for a while and can't figure out a proper way to fix this issue. A possible workaround would be to blindly de-taint anything using an ugly hack [2] but that defeats the purpose of having variable tainting in the first place, and doesn't seem like a valid, long-term, production-ready solution.

I'm starting to wonder if the only solution would be to have pglister fetch information from exim in some way, instead of the other way around.

Any idea?


I'm no exim expert, so I'm explicitly copying in Stefan here in case he didn't spot this one.

Maybe we could have a switch to inject that picks these up from the environment: I *think* most of those are actually made available by default as environment variables in exim if I understand https://www.exim.org/exim-html-current/doc/html/spec_html/ch-the_pipe_transport.html point 4 correct. Or would those have the same problems with tainting?

AIUI the only thing we couldn't get that way might be the message-id? The question is, can we add that to the environment without getting into taint problems?

//Magnus

Re: Making pglister work with exim 4.96+

From
Célestin Matte
Date:
Update:
Fix for pgarchives' load_message.py is pretty straightforward: exim provides the untainted version of $local_part,
$local_part_data.Same for $domain and $domain_data.
 
Pglister's inject.py is a tougher situation. I can't seem to get an untainted version of $sender_address and
$header_message-id.

However, replacing them with fake values does get things delivered properly. I'm starting to wonder if we really need
thesevalues. Why does inject.py need them for exactly? Header-message-id seems to only be displayed in the moderation
queue,and sender address is correctly retrieved anyway (or is it just for the "envelope:" field of the moderation
queue?).

> Maybe we could have a switch to inject that picks these up from the environment: I *think* most of those are actually
madeavailable by default as environment variables in exim if I understand
https://www.exim.org/exim-html-current/doc/html/spec_html/ch-the_pipe_transport.html
<https://www.exim.org/exim-html-current/doc/html/spec_html/ch-the_pipe_transport.html>point 4 correct. Or would those
havethe same problems with tainting?
 
> 
> AIUI the only thing we couldn't get that way might be the message-id? The question is, can we add that to the
environmentwithout getting into taint problems?
 

Can't get that to work (${env{SENDER_ADDRESS}} or SENDER is replaced by an empty value). I could keep trying, but that
stillwouldn't solve the problem for $header_message-id.
 

-- 
Célestin Matte




Re: Making pglister work with exim 4.96+

From
Stefan Kaltenbrunner
Date:
On 17.06.24 12:57, Célestin Matte wrote:
> Update:
> Fix for pgarchives' load_message.py is pretty straightforward: exim 
> provides the untainted version of $local_part, $local_part_data. Same 
> for $domain and $domain_data.
> Pglister's inject.py is a tougher situation. I can't seem to get an 
> untainted version of $sender_address and $header_message-id.
> 
> However, replacing them with fake values does get things delivered 
> properly. I'm starting to wonder if we really need these values. Why 
> does inject.py need them for exactly? Header-message-id seems to only be 
> displayed in the moderation queue, and sender address is correctly 
> retrieved anyway (or is it just for the "envelope:" field of the 
> moderation queue?).
> 
>> Maybe we could have a switch to inject that picks these up from the 
>> environment: I *think* most of those are actually made available by 
>> default as environment variables in exim if I understand 
>> https://www.exim.org/exim-html-current/doc/html/spec_html/ch-the_pipe_transport.html
<https://www.exim.org/exim-html-current/doc/html/spec_html/ch-the_pipe_transport.html>point 4 correct. Or would those
havethe same problems with tainting?
 
>>
>> AIUI the only thing we couldn't get that way might be the message-id? 
>> The question is, can we add that to the environment without getting 
>> into taint problems?
> 
> Can't get that to work (${env{SENDER_ADDRESS}} or SENDER is replaced by 
> an empty value). I could keep trying, but that still wouldn't solve the 
> problem for $header_message-id.

SENDER_ADDRESS is not among the default set of environment variables - 
did you actually add them in the pipe transport using an environment = 
stanza?



Stefan



Re: Making pglister work with exim 4.96+

From
Stefan Kaltenbrunner
Date:
On 17.06.24 11:42, Magnus Hagander wrote:
> 
> 
> On Mon, Jun 17, 2024 at 11:22 AM Célestin Matte 
> <celestin.matte@cmatte.me <mailto:celestin.matte@cmatte.me>> wrote:
> 
>     Exim introduced variable tainting as a security measure starting
>     from exim 4.93. Starting from exim 4.96, tainting is mandatory for
>     commands. This means that it is no longer possible to pass variables
>     to pglister's inject.py (which requires passing $sender_address,
>     $local_part, $domain and $header_message-id) or pgarchives'
>     load_message.py (which requires passing $local_part).
> 
>     Exim, while enforcing these strict security policies, only provides
>     the possibility to de-taint variables in very specific situations
>     (explicit matching with a list). Problem is known and discussed here
>     [1]. This is very annoying in our case, as $sender_address or
>     $header_message-id can be pretty much anything (and no, matching
>     against a regexp is not sufficient to de-taint).
>     I've been scratching my head over this for a while and can't figure
>     out a proper way to fix this issue. A possible workaround would be
>     to blindly de-taint anything using an ugly hack [2] but that defeats
>     the purpose of having variable tainting in the first place, and
>     doesn't seem like a valid, long-term, production-ready solution.
> 
>     I'm starting to wonder if the only solution would be to have
>     pglister fetch information from exim in some way, instead of the
>     other way around.
> 
>     Any idea?
> 
> 
> I'm no exim expert, so I'm explicitly copying in Stefan here in case he 
> didn't spot this one.


I did not - so thanks for the heads-up

> 
> Maybe we could have a switch to inject that picks these up from the 
> environment: I *think* most of those are actually made available by 
> default as environment variables in exim if I understand 
> https://www.exim.org/exim-html-current/doc/html/spec_html/ch-the_pipe_transport.html
<https://www.exim.org/exim-html-current/doc/html/spec_html/ch-the_pipe_transport.html>point 4 correct. Or would those
havethe same problems with tainting?
 
> 
> AIUI the only thing we couldn't get that way might be the message-id? 
> The question is, can we add that to the environment without getting into 
> taint problems?

We have been briefly discussing that very issue last year and the 
consensus was basically going the environment variable route (which can 
also be implemented on older exim installs) - the above list is only the 
"default" set of environment variables available and we can add more.




Stefan



Re: Making pglister work with exim 4.96+

From
Célestin Matte
Date:
> We have been briefly discussing that very issue last year and the consensus was basically going the environment
variableroute (which can also be implemented on older exim installs) - the above list is only the "default" set of
environmentvariables available and we can add more.
 

But using environment variable is just working around the problem by evading the security mechanism. Documentation
stillwarns about being careful [1]. And given that exim keeps extending tainting to more places, it's possible this
solutionwill break in the future.
 

I think I found a good, yet hacky, workaround: using a pgsql lookup to insert the values directly into the database.
Thisway, we avoid passing dangerous data through a shell, and we can escape them using ${quote_pgsql}. Using
event_action,I can execute this at the right time (after delivery).
 
My current solution is something like this:
  command = /pglister_path/web/pglister/bin/python /pglister_path/bin/inject.py -d $local_part_data@$domain_data -m
$message_id-s ''
 
  event_action = ${if eq {msg:delivery}{$event_name} {${lookup pgsql{update incoming_mail set
sender='${quote_pgsql:$sender_address}'where messageid='${quote_pgsql:$message_id}'; notify incoming}} {}}}
 
and removing the "notify incoming" in inject.py.

This still requires tweaking and adding the bounce case, but I think it's a good start and tests are working so far.

[1] : https://www.exim.org/exim-html-current/doc/html/spec_html/ch-the_pipe_transport.html point 4
-- 
Célestin Matte




Re: Making pglister work with exim 4.96+

From
Stefan Kaltenbrunner
Date:
On 17.06.24 22:59, Célestin Matte wrote:
>> We have been briefly discussing that very issue last year and the 
>> consensus was basically going the environment variable route (which 
>> can also be implemented on older exim installs) - the above list is 
>> only the "default" set of environment variables available and we can 
>> add more.
> 
> But using environment variable is just working around the problem by 
> evading the security mechanism. Documentation still warns about being 
> careful [1]. And given that exim keeps extending tainting to more 
> places, it's possible this solution will break in the future.

personally I doubt that environment variable passing will ever be part 
of tainting - in effect that would mean that the environment feature 
would have to be dropped entirely and that would break compatibility 
with myriads of CLI tools expecting at least the default set of variables.

> 
> I think I found a good, yet hacky, workaround: using a pgsql lookup to 
> insert the values directly into the database. This way, we avoid passing 
> dangerous data through a shell, and we can escape them using 
> ${quote_pgsql}. Using event_action, I can execute this at the right time 
> (after delivery).
> My current solution is something like this:
>   command = /pglister_path/web/pglister/bin/python 
> /pglister_path/bin/inject.py -d $local_part_data@$domain_data -m 
> $message_id -s ''
>   event_action = ${if eq {msg:delivery}{$event_name} {${lookup 
> pgsql{update incoming_mail set sender='${quote_pgsql:$sender_address}' 
> where messageid='${quote_pgsql:$message_id}'; notify incoming}} {}}}
> and removing the "notify incoming" in inject.py.
> 
> This still requires tweaking and adding the bounce case, but I think 
> it's a good start and tests are working so far.
> 
> [1] : 
> https://www.exim.org/exim-html-current/doc/html/spec_html/ch-the_pipe_transport.html point 4


I don't think we should use the event infrastructure for this - I have 
limited faith into the robustness of the event infrastructure for a use 
case like that and I envision various race conditions in failure cases 
especially during concurrent operations and failure cases.

The msg:delivery event fires after delivery and is kinda independent on 
the (more or less) atomic operation of the inject.py script.
In production we will have multiple inject.pl running in parallel and 
one delivery process might send the NOTIFY before the other even reaches 
the update.
It also adds additional complexity for maintenance operations because a 
failure in running the event_action(say a database restart as part of a 
server reboot) AFTER the script ran might result in strange states.



Stefan



Re: Making pglister work with exim 4.96+

From
Magnus Hagander
Date:
On Tue, Jun 18, 2024 at 10:19 AM Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote:
On 17.06.24 22:59, Célestin Matte wrote:
>> We have been briefly discussing that very issue last year and the
>> consensus was basically going the environment variable route (which
>> can also be implemented on older exim installs) - the above list is
>> only the "default" set of environment variables available and we can
>> add more.
>
> But using environment variable is just working around the problem by
> evading the security mechanism. Documentation still warns about being
> careful [1]. And given that exim keeps extending tainting to more
> places, it's possible this solution will break in the future.

personally I doubt that environment variable passing will ever be part
of tainting - in effect that would mean that the environment feature
would have to be dropped entirely and that would break compatibility
with myriads of CLI tools expecting at least the default set of variables.

Yeah, and I don't see why they would? The reason they do the taint marking in variables used in commands and filenames is that it would be a potential venue for attackers to inject things. No such vulnerability exists with environment variables. Obviously the receiving code, whether a shellscript or a python program or a c program or whatever, can have injection vulnerabilities of it's own, but the passing values layer (which is what Exim is responsible for there) does not.
 
 
> I think I found a good, yet hacky, workaround: using a pgsql lookup to
> insert the values directly into the database. This way, we avoid passing
> dangerous data through a shell, and we can escape them using
> ${quote_pgsql}. Using event_action, I can execute this at the right time
> (after delivery).
> My current solution is something like this:
>   command = /pglister_path/web/pglister/bin/python
> /pglister_path/bin/inject.py -d $local_part_data@$domain_data -m
> $message_id -s ''
>   event_action = ${if eq {msg:delivery}{$event_name} {${lookup
> pgsql{update incoming_mail set sender='${quote_pgsql:$sender_address}'
> where messageid='${quote_pgsql:$message_id}'; notify incoming}} {}}}
> and removing the "notify incoming" in inject.py.
>
> This still requires tweaking and adding the bounce case, but I think
> it's a good start and tests are working so far.
>
> [1] :
> https://www.exim.org/exim-html-current/doc/html/spec_html/ch-the_pipe_transport.html point 4


I don't think we should use the event infrastructure for this - I have
limited faith into the robustness of the event infrastructure for a use
case like that and I envision various race conditions in failure cases
especially during concurrent operations and failure cases.

The msg:delivery event fires after delivery and is kinda independent on
the (more or less) atomic operation of the inject.py script.
In production we will have multiple inject.pl running in parallel and
one delivery process might send the NOTIFY before the other even reaches
the update.
It also adds additional complexity for maintenance operations because a
failure in running the event_action(say a database restart as part of a
server reboot) AFTER the script ran might result in strange states.


Yeah, this seems extremely fragile. Concurrent delivery is a common thing, and not the only potential problem I bet. The proper fix surely is to make invoke.py work properly.

And the above doesn't actually solve the problem does it? It still requires passing the message-id which is a tainted variable? 

//Magnus

Re: Making pglister work with exim 4.96+

From
Célestin Matte
Date:
> Yeah, and I don't see why they would? The reason they do the taint marking in variables used in commands and
filenamesis that it would be a potential venue for attackers to inject things. No such vulnerability exists with
environmentvariables. Obviously the receiving code, whether a shellscript or a python program or a c program or
whatever,can have injection vulnerabilities of it's own, but the passing values layer (which is what Exim is
responsiblefor there) does not.
 

Yet this is what we want to do here: bypass security protection by passing dangerous data through environment
variables.It would make sense for them to prevent that usage
 

> Yeah, this seems extremely fragile. Concurrent delivery is a common thing, and not the only potential problem I bet.
Theproper fix surely is to make invoke.py work properly.
 

What's invoke.py? Do you mean inject.py?

I'm aware of the potential concurrency issues. One fix could be to only process emails in mailqueuehandler.py if their
senderaddress is not empty (or we could add a boolean field for that purpose).
 

> And the above doesn't actually solve the problem does it? It still requires passing the message-id which is a tainted
variable?

$message_id is not the header, it's exim's internal message ID and is untainted.
Here's my current version, handling the header as well:
event_action = ${if eq {msg:delivery}{$event_name} {${lookup pgsql{update incoming_mail set
sender='${quote_pgsql:$sender_address}',messageid='${quote_pgsql:$header_message-id:}' where
messageid='${quote_pgsql:$message_id}';notify incoming; update bounce_mail set sender='${quote_pgsql:$sender_address}',
messageid='${quote_pgsql:$header_message-id:}'where messageid='${quote_pgsql:$message_id}'; notify bounce}} {}}}
 


Another overall solution may be to fetch header_message-id and sender_address from exim in inject.py using a subprocess
(providedit's still queued at that point?)
 

-- 
Célestin Matte




Re: Making pglister work with exim 4.96+

From
Célestin Matte
Date:
OK I understand better now how environment variable is a better working solution. I completely removed argparse, since
nomore parameters are used.
 

Patch to modify inject.py accordingly attached.

On the exim side, final solution involves:
- modifying pglister transport to add
environment = HEADER_MESSAGE_ID=$header_message-id:
- modifying pgarchives transport to use $local_part_data instead of $local_part

On 19/06/2024 11:29, Célestin Matte wrote:
>> Yeah, and I don't see why they would? The reason they do the taint marking in variables used in commands and
filenamesis that it would be a potential venue for attackers to inject things. No such vulnerability exists with
environmentvariables. Obviously the receiving code, whether a shellscript or a python program or a c program or
whatever,can have injection vulnerabilities of it's own, but the passing values layer (which is what Exim is
responsiblefor there) does not.
 
> 
> Yet this is what we want to do here: bypass security protection by passing dangerous data through environment
variables.It would make sense for them to prevent that usage
 
> 
>> Yeah, this seems extremely fragile. Concurrent delivery is a common thing, and not the only potential problem I bet.
Theproper fix surely is to make invoke.py work properly.
 
> 
> What's invoke.py? Do you mean inject.py?
> 
> I'm aware of the potential concurrency issues. One fix could be to only process emails in mailqueuehandler.py if
theirsender address is not empty (or we could add a boolean field for that purpose).
 
> 
>> And the above doesn't actually solve the problem does it? It still requires passing the message-id which is a
taintedvariable?
 
> 
> $message_id is not the header, it's exim's internal message ID and is untainted.
> Here's my current version, handling the header as well:
> event_action = ${if eq {msg:delivery}{$event_name} {${lookup pgsql{update incoming_mail set
sender='${quote_pgsql:$sender_address}',messageid='${quote_pgsql:$header_message-id:}' where
messageid='${quote_pgsql:$message_id}';notify incoming; update bounce_mail set sender='${quote_pgsql:$sender_address}',
messageid='${quote_pgsql:$header_message-id:}'where messageid='${quote_pgsql:$message_id}'; notify bounce}} {}}}
 
> 
> 
> Another overall solution may be to fetch header_message-id and sender_address from exim in inject.py using a
subprocess(provided it's still queued at that point?)
 
> 

-- 
Célestin Matte

Attachment

Re: Making pglister work with exim 4.96+

From
Magnus Hagander
Date:
On Wed, Jun 19, 2024 at 11:29 AM Célestin Matte <celestin.matte@cmatte.me> wrote:
> Yeah, and I don't see why they would? The reason they do the taint marking in variables used in commands and filenames is that it would be a potential venue for attackers to inject things. No such vulnerability exists with environment variables. Obviously the receiving code, whether a shellscript or a python program or a c program or whatever, can have injection vulnerabilities of it's own, but the passing values layer (which is what Exim is responsible for there) does not.

Yet this is what we want to do here: bypass security protection by passing dangerous data through environment variables. It would make sense for them to prevent that usage

How is the data dangerous?

I mean, we pass the body of the email to inject.py as well, and that's also under control of the sender. And if that's a problem, they'd have to remove the entire pipe transport completely, which I really doubt wil lhappen...

The data is only dangerous when potentially badly escaped for e.g. a commandline. Passing it in an environment variable does not have that problem.


> Yeah, this seems extremely fragile. Concurrent delivery is a common thing, and not the only potential problem I bet. The proper fix surely is to make invoke.py work properly.

What's invoke.py? Do you mean inject.py?

Yes.


I'm aware of the potential concurrency issues. One fix could be to only process emails in mailqueuehandler.py if their sender address is not empty (or we could add a boolean field for that purpose).

I mean, yeah, we could. But it still adds a complexity and fragitlity, for from what I can tell zero actual gain.



> And the above doesn't actually solve the problem does it? It still requires passing the message-id which is a tainted variable?

$message_id is not the header, it's exim's internal message ID and is untainted.

Oh. But we need the actual message-id. Having the exim internal one is of no value.

 
Here's my current version, handling the header as well:
event_action = ${if eq {msg:delivery}{$event_name} {${lookup pgsql{update incoming_mail set sender='${quote_pgsql:$sender_address}', messageid='${quote_pgsql:$header_message-id:}' where messageid='${quote_pgsql:$message_id}'; notify incoming; update bounce_mail set sender='${quote_pgsql:$sender_address}', messageid='${quote_pgsql:$header_message-id:}' where messageid='${quote_pgsql:$message_id}'; notify bounce}} {}}}


Another overall solution may be to fetch header_message-id and sender_address from exim in inject.py using a subprocess (provided it's still queued at that point?)

I still completely fail to see why this complexity is a good idea.

--

Re: Making pglister work with exim 4.96+

From
Célestin Matte
Date:
> I still completely fail to see why this complexity is a good idea.

Yea I sent a patch following the environment variable solution after this email. What do you think about it?

-- 
Célestin Matte