Re: request a new feature in fuzzystrmatch - Mailing list pgsql-hackers
From | Michael Paquier |
---|---|
Subject | Re: request a new feature in fuzzystrmatch |
Date | |
Msg-id | CAB7nPqRatR2oqe6OpubvVyHnKgxLt2CbF4V5Hf3phPavgEcx7w@mail.gmail.com Whole thread Raw |
In response to | Re: request a new feature in fuzzystrmatch (Alvaro Herrera <alvherre@2ndquadrant.com>) |
Responses |
Re: request a new feature in fuzzystrmatch
|
List | pgsql-hackers |
Hi Liming, Here is a more formal review of this patch. First, when submitting a patch, please follow the following guidelines: http://wiki.postgresql.org/wiki/Submitting_a_Patch http://wiki.postgresql.org/wiki/Creating_Clean_Patches Particularly, when you develop a new feature, people will expect that you submit your patches based on the master branch in git. This extremely facilitates the review and test work that can be done based on your work. So do not create a new dedicated project on github like that => https://github.com/liminghu/fuzzystrmatch, but for example fork the postgres repository, and work directly on it using for example custom branches, then generate patches between your custom branches and postgres master branch. Also, new features need to be submitted to a commit fest (https://commitfest.postgresql.org/). The next commit fest is in September, so you have plenty of time to update your patch based on the comments of my review (and perhaps comments of others), then register a patch directly there when you are done. In order to perform my review, I took your github project and generated a diff patch. The patch is attached, and applies to master branch, so you could use it for your future work as a base. OK, now for the review, here are some comments about the structure of the patch, which is incorrect based on the structure extensions need to have. 1) This patch lacks documentation, you shouldn't add a README.md file. So remove it and updated the dedicated sgml documentation. In your case, the file to be updated with more details about Damerau–Levenshtein is doc/src/sgml/fuzzystrmatch.sgml. 2) Remove fuzzystrmatch--1.0.sql, this is not necessary anymore because this module is upgraded to 1.1 3) Remove fuzzystrmatch--unpackaged--1.1.sql, it is not needed 4) Add fuzzystrmatch--1.0--1.1.sql, which is a file that can be used to upgrade an existing fuzzystrmatch 1.0 to 1.1. This needs to contain all the modifications allowing to do a correct upgrade: alter existing objects to their new shape, add new objects, and remove unnecessary objects. In your case, this file should have this shape: /* contrib/fuzzystrmatch/fuzzystrmatch--1.0--1.1.sql */ -- complain if script is sourced in psql, rather than via ALTER EXTENSION \echo Use "ALTER EXTENSION fuzzystrmatch UPDATE" to load this file. \quit CREATE FUNCTION damerau_levenstein()... etc. 5) Your code copies a function from TOMOYO Linux, which is under GPL2 license, so I believe that this cannot be integrated to Postgres which is under PostgreSQL license (more permissive). Just based on that some portions of this code cannot be included in Postgres, I am not sure but this gives a sufficient reason to reject this patch. This is all I have about the shape of the patch... Also, I am not (yet) very familiar with Damerau–Levenshtein itself and I need to read more about that before giving a precise opinion... but here are some comments anyway based on what I can read from the code: 1) There is a lot of duplicated code among levenshtein.c, dameraulevenshtein.c and dameraulevenshtein_new.c, why is that? levenshtein.c refers even to a variable called trans_c which is even used nowehere. 2) By doing a simple diff between for example levenshtein.c and dameraulevenshtein.c, the only new thing is the function called dameraulevenshtein_internal_noncompatible copied from tomoyo linux... And this function is used nowhere... The new functions for Damerau–Levenshtein equivalent to levenshtein[_less_equal], called damaraulevenshtein[_less_equal], are exactly the same things So, in short, even if I may not be able to give a precise opinion about Damerau–Levenshtein, what this patch tries to achieve is unclear to me... The only new feature I can see is dameraulevenshtein_internal_noncompatible but this is taken from code licensed as GPL. -- Michael
Attachment
pgsql-hackers by date: