Thread: Broken link parsing in archives
Looking at past announcements I noticed that Markdown links were parsed and/or rendered incorrectly in the archives. The example email that I noticed it on was this:
https://www.postgresql.org/message-id/163724833494.26187.1931723451787420391@wrigleys.postgresql.org
..but it happens on all it seems, a more recent example:
https://www.postgresql.org/message-id/166472941958.662.2706300812023074847%40wrigleys.postgresql.org
The rendered links follow the same pattern, the last word in the markdown text block is prepended to the url block and all of it added as the href:
[call for papers](https://2022.nordicpgday.org/cfp/)
becomes:
[call for <a href="http://papers](https://2022.nordicpgday.org/cfp/)" rel="nofollow">papers](https://2022.nordicpgday.org/cfp/)</a>
Is this a known issue?
--
Daniel Gustafsson https://vmware.com/
On Wed, Nov 2, 2022 at 1:31 PM Daniel Gustafsson <daniel@yesql.se> wrote:
Looking at past announcements I noticed that Markdown links were parsed and/or
rendered incorrectly in the archives. The example email that I noticed it on
was this:
https://www.postgresql.org/message-id/163724833494.26187.1931723451787420391@wrigleys.postgresql.org
..but it happens on all it seems, a more recent example:
https://www.postgresql.org/message-id/166472941958.662.2706300812023074847%40wrigleys.postgresql.org
The rendered links follow the same pattern, the last word in the markdown text
block is prepended to the url block and all of it added as the href:
[call for papers](https://2022.nordicpgday.org/cfp/)
becomes:
[call for <a href="http://papers](https://2022.nordicpgday.org/cfp/)" rel="nofollow">papers](https://2022.nordicpgday.org/cfp/)</a>
Is this a known issue?
Well, there is no markdown support at all :) So what happens comes out as a result of trying to extract links out of plaintext. This in turn is handled by the django urlize filter: https://docs.djangoproject.com/en/3.2/ref/templates/builtins/#urlize
Thus:
>>> from django.utils.html import urlize
>>> urlize('[call for papers](https://2022.nordicpgday.org/cfp/)')
'[call for <a href="http://papers](https://2022.nordicpgday.org/cfp/)">papers](https://2022.nordicpgday.org/cfp/)</a>'
And I'm not sure they *should* be considered, since the mime type of the body isn't markdown...
//Magnus
> On 2 Nov 2022, at 13:39, Magnus Hagander <magnus@hagander.net> wrote:
> And I'm not sure they *should* be considered, since the mime type of the body isn't markdown...
For emails sent as text to -announce, sure. But. Since we support markdown formatting in news postings that go out to -announce, it seems a bit unhelpful to generate broken links for all those posts.
If I can come up with a filter that converts a broken link from urlize for the known case of markdown links, would that be an accepted solution?
--
Daniel Gustafsson https://vmware.com/
On Wed, Nov 2, 2022 at 1:52 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> On 2 Nov 2022, at 13:39, Magnus Hagander <magnus@hagander.net> wrote:
> And I'm not sure they *should* be considered, since the mime type of the body isn't markdown...
For emails sent as text to -announce, sure. But. Since we support markdown
formatting in news postings that go out to -announce, it seems a bit unhelpful
to generate broken links for all those posts.
I agree with the principle, but the question is how reliable we can make it. (One could also argue we *should* post those as text/markdown, but I fear that will break even more MUAs).
If I can come up with a filter that converts a broken link from urlize for the
known case of markdown links, would that be an accepted solution?
If it can be made reliable, I think that would be acceptable. It needs to be validated that it works in the full chain that we use on the site (we also include the silly obfuscation of email addresses in the filter chain), but as long as that's done I think we can and should do it.
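For illustration, a minimal sketch of the kind of post-urlize filter discussed above. It assumes the broken anchors always have exactly the shape quoted in this thread. The names BROKEN_MD_LINK and repair_markdown_links are hypothetical and not part of Django or the archives code. It has not been validated against the full filter chain (including the email address obfuscation), which would still need to be done.

```python
import re

# Hypothetical pattern for the anchor shape quoted in this thread:
#   [lead <a href="http://word](url)" attrs>word](url)</a>
BROKEN_MD_LINK = re.compile(
    r'\[(?P<lead>[^\[\]<>]*?)'                 # link text before its last word
    r'<a href="http://(?P<word>[^\]\s"<>]+)'   # last word, mistaken for a host
    r'\]\((?P<url>[^)\s"<>]+)\)"'              # the real URL, still inside href
    r'(?P<attrs>[^>]*)>'                       # e.g. rel="nofollow", may be empty
    r'[^<]*</a>'                               # anchor body, discarded
)


def repair_markdown_links(html):
    """Rewrite anchors that match BROKEN_MD_LINK into ordinary links.

    The input is assumed to be urlize output, so the captured pieces are
    reused as they are and not escaped a second time.
    """
    def _fix(match):
        text = match.group('lead') + match.group('word')
        return '<a href="{}"{}>{}</a>'.format(
            match.group('url'), match.group('attrs'), text)

    return BROKEN_MD_LINK.sub(_fix, html)


if __name__ == '__main__':
    broken = ('[call for <a href="http://papers](https://2022.nordicpgday.org/cfp/)" '
              'rel="nofollow">papers](https://2022.nordicpgday.org/cfp/)</a>')
    print(repair_markdown_links(broken))
```

Applied to the broken link from the first mail, this yields <a href="https://2022.nordicpgday.org/cfp/" rel="nofollow">call for papers</a>. Text that does not match the pattern passes through unchanged, so ordinary links produced by urlize should be left alone.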