Re: [HACKERS] Other formats in pset like markdown, rst, mediawiki - Mailing list pgsql-hackers

From Daniel Gustafsson
Subject Re: [HACKERS] Other formats in pset like markdown, rst, mediawiki
Date
Msg-id 525A1C6C-2C8C-421C-A112-0CCA867D194F@yesql.se
Whole thread Raw
In response to Re: [HACKERS] Other formats in pset like markdown, rst, mediawiki  (Fabien COELHO <fabien.coelho@mines-paristech.fr>)
Responses Re: [HACKERS] Other formats in pset like markdown, rst, mediawiki  (Jan Michálek <godzilalalala@gmail.com>)
List pgsql-hackers
> On 08 May 2017, at 12:02, Fabien COELHO <fabien.coelho@mines-paristech.fr> wrote:
>
> Hello Jan,
>
> Please give a number to submitted patches. I think that this was v3.
>
> The patch does NOT fix various issues I pointed out in my previous review:
>
> - tabs introduced in "doc/src/sgml/ref/psql-ref.sgml"
> - too long help line in "src/bin/psql/help.c"
> - spurious space after a comma in "src/fe_utils/print.c"
>   and possibly elsewhere.
>
> On Sun, 23 Apr 2017, Jan Michálek wrote:
>
>>>> Markdown include characters/sequences which are interpreted as markers:
>>>> _Italic_, **Bold**, *** => horizontal rules, > block quote... `inline
>>>> code`... If they are interpreted within a table cell then probably they
>>>> should be escaped somehow.
>>
>> I have treated "_*|<>"
>
> Probably not enough, see below. Note the escaping chars should also be escaped.
>
>>> I`m able to sanitize characters, but complex sequences will be problem. I
>>> will look on this, but I don`t know, if I`m able to do this.
>
> I do not know whether only those are necessary. Have you checked? Guessing is probably not the right approach.
>
>
> Concerning MARKDOWN, and according to the following source about github markdown implementation:
>
>     https://enterprise.github.com/downloads/en/markdown-cheatsheet.pdf
>
> The following characters may need to be backslash escaped, although it does not cover HTML stuff.
>
>     \   backslash
>     `   backtick
>     *   asterisk
>     _   underscore
>     {}  curly braces
>     []  square brackets
>     ()  parentheses
>     #   hash mark
>     +   plus sign
>     -   minus sign (hyphen)
>     .   dot
>     !   exclamation
>
> Another source https://genius.com/3057216 suggests (* # / ( ) [ ] < >),
> which should protect HTML.
>
> However, the escaping seems to be the backslash character, NOT using html encoding < as done in your version.
>
> Where did you find the precise escaping rules for markdown? I do not think that it should be invented...
>
>
> I have looked at RST, according to this reference:
>
>     http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html#grid-tables
>
> The good news is that you do not need to handle a special | case because you would only produce clean tables.
>
> I've tested UTF-8 with plane 1 (你好!) and plane 2 (𠀡) and the alignment seems to worked well, incredible!
>
>>> My main interest on this was in rst. I`m using markdown only in github issues and my knowldge about md is poor.
>
> Then maybe only do RST?!
>
> It looks much simpler anyway, and if you do MARKDOWN the support needs to be clean.
>
> About the code:
>
> I'm still at odds with the code which needs to test for markdown to call for different functions in multiple places.
Ifyou keep md and in order to avoid that, I would suggest to extend the pg_wcs* functions with a list of caracters
whichmay have different sizes with additionnal args, say: 
>
>  pg_wcssize(// same args, plus:
>             char * escaped_chars, // will require escaping
>             int escape_len, // how many chars added when escaping
>             int nllen // len of newline if substituted
>             );
>
> So that pg_wcssize(..., "\r", 1, 1) would behave as before (\n and \t are rather special cases), and the various
constantscould be held in the format description so the whole thing would be parametric. 
>
> Same approach with format.
>
> That would allow to simplify the code significantly and to share it between MARKDOWN and others. Also, the multiple
"elseif" list would be simplified by using strchr or the escaped_chars string. 

This patch was moved into the current Commitfest marked “Waiting for author”
with the above review.  Have you had a chance to work on it addressing the
review comments such that we can expect a new version within this CF?

cheers ./daniel

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: [HACKERS] Patches that don't apply or don't compile: 2017-09-12
Next
From: Daniel Gustafsson
Date:
Subject: Re: [HACKERS] PATCH: psql show index with type info