Re: PATCH: Update snowball stemmers - Mailing list pgsql-hackers

From Andrew Dunstan
Subject Re: PATCH: Update snowball stemmers
Msg-id 2d86af5d-051e-f92d-eb85-bfcd193a15e2@2ndQuadrant.com
In response to PATCH: Update snowball stemmers  (Arthur Zakirov <a.zakirov@postgrespro.ru>)
Responses Re: PATCH: Update snowball stemmers
List pgsql-hackers

On 06/26/2018 08:20 AM, Arthur Zakirov wrote:
> Hello hackers,
>
> I'd like to propose a patch that syncs the PostgreSQL snowball stemmers.
> As Tom pointed out [1], the stemmers haven't been synced for a very long time.
>
> I copied all source files without changes, except replacing '#include
> "../runtime/header.h"' with '#include "header.h"' and removing includes
> of standard headers from utilities.c.
>
> The Hungarian language uses the ISO-8859-1 and UTF-8 charsets in Postgres
> HEAD, but in Snowball HEAD it is ISO-8859-2 per commit [2]. This patch
> changes Hungarian's charset from ISO-8859-1 to ISO-8859-2 too.
>
> Additionally updated files in the patch are:
> - utilities.c
> - header.h
>
> Will add to the next commitfest.
>
> Any comments?
>
> 1 - https://www.postgresql.org/message-id/5689.1519054983%40sss.pgh.pa.us
> 2 - https://github.com/snowballstem/snowball/commit/4bcae97db044253ea2edae1dd3ca59f3cddd4b9d
>


I agree with Tom that we should sync with upstream before we do
anything else. This is a very large patch, but with fairly limited
impact. I think now, at the start of a dev cycle, is the right time to
apply it.

I don't know if we have a buildfarm animal testing Hungarian. Maybe we 
need a buildfarm animal or two testing a large number of locales.
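
To illustrate why the charset choice matters for Hungarian, here is a
minimal Python sketch. It is not part of the patch and does not touch the
stemmer itself; the helper name `encodable` and the sample string are made
up for illustration. It only shows that the Hungarian letters with a double
acute accent can be encoded in ISO-8859-2 but not in ISO-8859-1.

```python
# Illustrative only: checks whether sample Hungarian letters fit a charset.
# "encodable" and HUNGARIAN_SAMPLE are hypothetical names, not PostgreSQL
# or Snowball identifiers.
HUNGARIAN_SAMPLE = "\u0151\u0171"  # o and u with double acute accent


def encodable(text: str, charset: str) -> bool:
    """Return True if every character of text exists in the given charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    for charset in ("iso-8859-1", "iso-8859-2"):
        print(charset, encodable(HUNGARIAN_SAMPLE, charset))
```

A locale-focused test would presumably need input containing such letters
to notice a difference between the two charsets.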

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


