Thread: 404 on message-ID with slashes

From
Daniel Gustafsson
Date:
Trying to load the last message in https://commitfest.postgresql.org/31/2858/ I
ran into an unexpected 404 which, it seems, is due to a message-id with
slashes: X//bJ6HKQJWx1wxA@paquier.xyz

Is this a known problem?

cheers ./daniel


Re: 404 on message-ID with slashes

From
Magnus Hagander
Date:
On Fri, Jan 15, 2021 at 1:25 PM Daniel Gustafsson <daniel@yesql.se> wrote:
>
> Trying to load the last message in https://commitfest.postgresql.org/31/2858/ I
> ran into an unexpected 404 which, it seems, is due to a message-id with
> slashes: X//bJ6HKQJWx1wxA@paquier.xyz
>
> Is this a known problem?

Alvaro mentioned it yesterday, but that's the first I've heard of it
(and AFAIK with the exact same message-id). A quick look showed it
wasn't dead obvious what the problem was, but I haven't had time to
dig into it any further. Alvaro disappeared out of the discussion, so I'm
not sure if he's had time to look at anything beyond that either.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/



Re: 404 on message-ID with slashes

From
Daniel Gustafsson
Date:
> On 15 Jan 2021, at 13:30, Magnus Hagander <magnus@hagander.net> wrote:
>
> On Fri, Jan 15, 2021 at 1:25 PM Daniel Gustafsson <daniel@yesql.se> wrote:
>>
>> Trying to load the last message in https://commitfest.postgresql.org/31/2858/ I
>> ran into an unexpected 404 which, it seems, is due to a message-id with
>> slashes: X//bJ6HKQJWx1wxA@paquier.xyz
>>
>> Is this a known problem?
>
> Alvaro mentioned it yesterday, but that's the first I've heard of it
> (and AFAIK with the exact same message-id). A quick look showed it
> wasn't dead obvious what the problem was, but I haven't had time to
> dig into it any further. Alvaro disappeared out of the discussion, so I'm
> not sure if he's had time to look at anything beyond that either.

From a quick skim of RFC 1036, a slash is an allowed (but discouraged) character
in a message-id, and the obsoleting RFC (5322) supports that.  I would expect
message-id generators to avoid using slashes though, so I wouldn't be surprised
if it's quite rare.

cheers ./daniel


Re: 404 on message-ID with slashes

From
Magnus Hagander
Date:
On Fri, Jan 15, 2021 at 1:46 PM Daniel Gustafsson <daniel@yesql.se> wrote:
>
> > On 15 Jan 2021, at 13:30, Magnus Hagander <magnus@hagander.net> wrote:
> >
> > On Fri, Jan 15, 2021 at 1:25 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> >>
> >> Trying to load the last message in https://commitfest.postgresql.org/31/2858/ I
> >> ran into an unexpected 404 which, it seems, is due to a message-id with
> >> slashes: X//bJ6HKQJWx1wxA@paquier.xyz
> >>
> >> Is this a known problem?
> >
> > Alvaro mentioned it yesterday, but that's the first I've heard of it
> > (and AFAIK with the exact same message-id). A quick look showed it
> > wasn't dead obvious what the problem was, but I haven't had time to
> > dig into it any further. Alvaro disappeared out of the discussion, so I'm
> > not sure if he's had time to look at anything beyond that either.
>
> From a quick skim of RFC 1036, a slash is an allowed (but discouraged) character
> in a message-id, and the obsoleting RFC (5322) supports that.  I would expect
> message-id generators to avoid using slashes though, so I wouldn't be surprised
> if it's quite rare.
>

About 0.08% of the messages in the archives have it.

So yeah, while it's clearly not common, we should make it work. (I
have confirmed the message in question is *in* the archives, so it's
something in the app serving them up.)

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/



Re: 404 on message-ID with slashes

From
Alvaro Herrera
Date:
On 2021-Jan-15, Magnus Hagander wrote:

> On Fri, Jan 15, 2021 at 1:25 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> >
> > Trying to load the last message in https://commitfest.postgresql.org/31/2858/ I
> > ran into an unexpected 404 which, it seems, is due to a message-id with
> > slashes: X//bJ6HKQJWx1wxA@paquier.xyz
> >
> > Is this a known problem?
> 
> Alvaro mentioned it yesterday, but that's the first I've heard of it
> (and AFAIK with the exact same message-id). A quick look showed it
> wasn't dead obvious what the problem was, but I haven't had time to
> dig into it any further. Alvaro disappeared out of the discussion, so I'm
> not sure if he's had time to look at anything beyond that either.

Apologies for disappearing.

I did try to access the message by url-encoding the / character, but
that shows the same behavior.
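
For reference, the url-encoding step can be sketched with Python's standard
library. This only illustrates the encoding itself, not how the archives site
builds its links; the variable names are made up for the example. If a
front-end server decodes %2F before it looks at the path, the encoded and
unencoded requests would end up looking the same, which would fit the behavior
described above.

```python
from urllib.parse import quote, unquote

# Message-id from this thread (note the double slash).
msgid = "X//bJ6HKQJWx1wxA@paquier.xyz"

# quote() leaves "/" alone by default; safe="" forces it to be encoded too.
encoded = quote(msgid, safe="")
print(encoded)  # X%2F%2FbJ6HKQJWx1wxA%40paquier.xyz

# Decoding gives back the original message-id.
print(unquote(encoded) == msgid)  # True
```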

I agree with the conclusion that the problem appears to be in the django
application ... but the regex looks fine:

    url(r'^message-id/(.+)$', archives.mailarchives.views.message),

I don't know if anything would make a / not match ".+" -- that would be
quite odd. 
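
A quick check with Python's re module suggests the pattern itself is not the
culprit. This is a minimal sketch outside of Django; the path string is an
assumed example of what the URL resolver would be handed (no leading slash).

```python
import re

# Same pattern as in the url() entry quoted above.
pattern = re.compile(r'^message-id/(.+)$')

# Assumed request path as the URL resolver would see it.
path = "message-id/X//bJ6HKQJWx1wxA@paquier.xyz"

match = pattern.match(path)
# ".+" matches any character except a newline, so slashes are captured as-is.
print(match.group(1))  # X//bJ6HKQJWx1wxA@paquier.xyz
```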

I'm not a Django person, but it looks like the problem might be in this
bit:

def message(request, msgid):
    ...
    try:
        m = Message.objects.get(messageid=msgid)
    except Message.DoesNotExist:
        raise Http404('Message does not exist')

-- 
Álvaro Herrera       Valdivia, Chile
"Find a bug in a program, and fix it, and the program will work today.
Show the program how to find and fix a bug, and the program
will work forever" (Oliver Silfridge)



Re: 404 on message-ID with slashes

From
Magnus Hagander
Date:
On Fri, Jan 15, 2021 at 4:35 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2021-Jan-15, Magnus Hagander wrote:
>
> > On Fri, Jan 15, 2021 at 1:25 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> > >
> > > Trying to load the last message in https://commitfest.postgresql.org/31/2858/ I
> > > ran into an unexpected 404 which, it seems, is due to a message-id with
> > > slashes: X//bJ6HKQJWx1wxA@paquier.xyz
> > >
> > > Is this a known problem?
> >
> > Alvaro mentioned it yesterday, but that's the first I've heard of it
> > (and AFAIK with the exact same message-id). A quick look showed it
> > wasn't dead obvious what the problem was, but I haven't had time to
> > dig into it any further. Alvaro disappeared out of the discussion, so I'm
> > not sure if he's had time to look at anything beyond that either.
>
> Apologies for disappearing.
>
> I did try to access the message by url-encoding the / character, but
> that shows the same behavior.

Some more digging shows the problem does not appear to be the /, but the //.

The weird thing is that it works fine in my local dev. Something very
fishy is afoot here :)

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/



Re: 404 on message-ID with slashes

From
Magnus Hagander
Date:
On Fri, Jan 15, 2021 at 5:29 PM Magnus Hagander <magnus@hagander.net> wrote:
>
> On Fri, Jan 15, 2021 at 4:35 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> >
> > On 2021-Jan-15, Magnus Hagander wrote:
> >
> > > On Fri, Jan 15, 2021 at 1:25 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> > > >
> > > > Trying to load the last message in https://commitfest.postgresql.org/31/2858/ I
> > > > ran into an unexpected 404 which, it seems, is due to a message-id with
> > > > slashes: X//bJ6HKQJWx1wxA@paquier.xyz
> > > >
> > > > Is this a known problem?
> > >
> > > Alvaro mentioned it yesterday, but that's the first I've heard of it
> > > (and AFAIK with the exact same message-id). A quick look showed it
> > > wasn't dead obvious what the problem was, but I haven't had time to
> > > dig into it any further. Alvaro disappeared out of the discussion, so I'm
> > > not sure if he's had time to look at anything beyond that either.
> >
> > Apologies for disappearing.
> >
> > I did try to access the message by url-encoding the / character, but
> > that shows the same behavior.
>
> Some more digging shows the problem does not appear to be the /, but the //.
>
> The weird thing is that it works fine in my local dev. Something very
> fishy is afoot here :)

Aand, that was the hint.

merge_slashes off;

in the nginx config fixed it.

The weirdest thing is that the request arriving at the django app looks
identical in the log. But it started delivering 200 instead of 404,
which is mighty weird.

But that said, it does appear to be working now!
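
A rough sketch of why merging slashes could lead to a 404, assuming the merged
path is what ends up being used for the lookup. That is one plausible reading
rather than a confirmed diagnosis, given that the logged request looked
identical. The helper names and the stored_ids set are hypothetical stand-ins,
not code from the archives app.

```python
import re

def merge_slashes(path: str) -> str:
    """Collapse runs of "/" into a single "/", roughly what merging does."""
    return re.sub(r"/{2,}", "/", path)

# Hypothetical stand-in for the message-ids stored in the archives database.
stored_ids = {"X//bJ6HKQJWx1wxA@paquier.xyz"}

def lookup(path: str) -> int:
    """Return 200 if the id after /message-id/ is known, else 404."""
    msgid = path[len("/message-id/"):]
    return 200 if msgid in stored_ids else 404

request_path = "/message-id/X//bJ6HKQJWx1wxA@paquier.xyz"

print(lookup(merge_slashes(request_path)))  # 404: the id became X/bJ6...
print(lookup(request_path))                 # 200: the id is passed through intact
```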

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/



Re: 404 on message-ID with slashes

From
Alvaro Herrera
Date:
On 2021-Jan-15, Magnus Hagander wrote:

> Some more digging shows the problem does not appear to be the /, but the //.

Hmm ... is it possible that Python somehow interprets the // as a
regex-close marker or escape character or something like that?

> The weird thing is that it works fine in my local dev. Something very
> fishy is afoot here :)

If the Python/django versions are different, maybe there was a bugfix in
between?

-- 
Álvaro Herrera       Valdivia, Chile
"Some men are heterosexual, and some are bisexual, and some
men don't think about sex at all... they become lawyers" (Woody Allen)



Re: 404 on message-ID with slashes

From
Magnus Hagander
Date:
On Fri, Jan 15, 2021 at 3:08 PM Magnus Hagander <magnus@hagander.net> wrote:
>
> On Fri, Jan 15, 2021 at 1:46 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> >
> > > On 15 Jan 2021, at 13:30, Magnus Hagander <magnus@hagander.net> wrote:
> > >
> > > On Fri, Jan 15, 2021 at 1:25 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> > >>
> > >> Trying to load the last message in https://commitfest.postgresql.org/31/2858/ I
> > >> ran into an unexpected 404 which, it seems, is due to a message-id with
> > >> slashes: X//bJ6HKQJWx1wxA@paquier.xyz
> > >>
> > >> Is this a known problem?
> > >
> > > Alvaro mentioned it yesterday, but that's the first I've heard of it
> > > (and AFAIK with the exact same message-id). A quick look showed it
> > > wasn't dead obvious what the problem was, but I haven't had time to
> > > dig into it any further. Alvaro disappeared out of the discussion, so I'm
> > > not sure if he's had time to look at anything beyond that either.
> >
> > From a quick skim of RFC 1036, a slash is an allowed (but discouraged) character
> > in a message-id, and the obsoleting RFC (5322) supports that.  I would expect
> > message-id generators to avoid using slashes though, so I wouldn't be surprised
> > if it's quite rare.
> >
>
> About 0.08% of the messages in the archives have it.
>
> So yeah, while it's clearly not common, we should make it work. (I
> have confirmed the message in question is *in* the archives, so it's
> something in the app serving them up.)

And for those interested, there are a total of *25* messages across
all time (out of almost a million and a half messages) that have the
double slashes that cause problems. 10 of those are from Michael, 14
are from 2008 or older, and one is from 2016.

So it's safe to say that these days Michael is pretty much alone in not
following the recommendation :)

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/



Re: 404 on message-ID with slashes

From
Alvaro Herrera
Date:
On 2021-Jan-15, Magnus Hagander wrote:

> merge_slashes off;
> 
> in the nginx config fixed it.

Ooh ... thanks for fixing :-)

> The weirdest thing is that the request arriving at the django app looks
> identical in the log. But it started delivering 200 instead of 404,
> which is mighty weird.

Hmm, yeah, it is weird.

-- 
Álvaro Herrera       Valdivia, Chile