Re: New CF app deployment - Mailing list pgsql-hackers

From Magnus Hagander
Subject Re: New CF app deployment
Msg-id CABUevEyEq2TUgH5FbY=1oFJL7rD5x-mg=Z23wAE6pjc=2CBYjA@mail.gmail.com
In response to Re: New CF app deployment  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: New CF app deployment  (Peter Eisentraut <peter_e@gmx.net>)
List pgsql-hackers
On Mon, Feb 9, 2015 at 4:56 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Feb 9, 2015 at 5:38 AM, Magnus Hagander <magnus@hagander.net> wrote:
>> On Mon, Feb 9, 2015 at 11:09 AM, Marco Nenciarini
>> <marco.nenciarini@2ndquadrant.it> wrote:
>>>
>>> On 08/02/15 17:04, Magnus Hagander wrote:
>>> >
>>> > Filenames are now shown for attachments, including a direct link to the
>>> > attachment itself.  I've also run a job to populate all old threads.
>>> >
>>>
>>> I wonder what is the algorithm to detect when an attachment is a patch.
>>>
>>> If you look at https://commitfest.postgresql.org/4/94/ all the
>>> attachments are marked as "Patch: no", but many of them are
>>> clearly a patch.
>>
>> It uses the "magic" module, same as the "file" command. And that one claims:
>>
>> mha@mha-laptop:/tmp$ file 0003-File-based-incremental-backup-v9.patch
>> 0003-File-based-incremental-backup-v9.patch: ASCII English text, with very long lines
>>
>> I think it doesn't consider it a patch because it's not actually a patch -
>> it looks like a git-format actual email message that *contains* a patch. It
>> even includes the unix From separator line. So if anything it should have
>> detected that it's an email message, which it apparently doesn't.
>>
>> Picking from the very top patch on the cf, an actual patch looks like this:
>>
>> mha@mha-laptop:/tmp$ file psql_fix_uri_service_004.patch
>> psql_fix_uri_service_004.patch: unified diff output, ASCII text, with very long lines
>
> Can we make it smarter, so that the kinds of things people produce
> intending for them to be patches are thought by the CF app to be
> patches?


Doing it wouldn't be too hard, as the code right now is simply:

                # Attempt to identify the file using magic information
                mtype = mag.buffer(contents)
                if mtype.startswith('text/x-diff'):
                        a.ispatch = True
                else:
                        a.ispatch = False


(where mag is the API call into the magic module)

So we could easily add, for example, our own regexp parsing. The question is whether we want to, because we'll have to maintain it as well. But I guess if we have a restricted enough set of rules, we can probably live with that.
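
As a rough illustration of what such a restricted rule set might look like, here is a minimal sketch. It is not the CF app's actual code: the names looks_like_patch and is_patch and the regexp itself are assumptions made up for this example. It keeps the existing text/x-diff check and adds one fallback rule, which looks for a unified-diff file header followed by a hunk header anywhere in the attachment, so that a git-format email containing a patch would also count.

```python
import re

# Hypothetical rule (an assumption, not taken from the CF app): a
# "--- old" / "+++ new" header pair immediately followed by a hunk header
# such as "@@ -1,3 +1,3 @@", found anywhere in the attachment.
_UNIFIED_DIFF_RE = re.compile(
    rb"^--- \S.*\n\+\+\+ \S.*\n@@ -\d+(?:,\d+)? \+\d+(?:,\d+)? @@",
    re.MULTILINE,
)


def looks_like_patch(contents: bytes) -> bool:
    """Return True if a unified-diff header appears anywhere in contents."""
    # Normalise CRLF line endings so the rule only has to handle "\n".
    normalised = contents.replace(b"\r\n", b"\n")
    return _UNIFIED_DIFF_RE.search(normalised) is not None


def is_patch(mtype: str, contents: bytes) -> bool:
    """Combine the existing magic-based check with the regexp fallback.

    mtype is the string returned by the magic module for the attachment.
    """
    return mtype.startswith("text/x-diff") or looks_like_patch(contents)
```

With something like this, the assignment in the snippet above could become a.ispatch = is_patch(mtype, contents). A bare "---" line, such as the separator git puts before the diffstat, does not match, because the rule requires a file name after the dashes and a hunk header two lines later. Context diffs are not covered by this sketch.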

 
