Thread: Re: Retiring some encodings?

Re: Retiring some encodings?

From

Bruce Momjian

Date:

22 May, 17:02:16

On Thu, May 22, 2025 at 02:44:39PM +0300, Heikki Linnakangas wrote:
> On 22/05/2025 08:54, Michael Paquier wrote:
> > With all that in mind, I have wanted to kick a discussion about
> > potentially removing one or more encodings from the core code,
> > including the backend part, the frontend part and the conversion
> > routines, coupled with checks in pg_upgrade to complain when databases
> > or collations include the said encoding (the collation part needs
> > to be checked when not using ICU).  Just being able to remove
> > GB18030 would do us a favor in the long-term, at least, but there's
> > more.
> 
> +1 at high level for deprecating and removing conversions that are not
> widely used anymore. As the first step, we can at least add a warning to the
> documentation, that they will be removed in the future.

Agreed on notification.  A radical idea would be to add a warning for
the use of such encodings in PG 18, and then mention their deprecation
in the PG 18 release notes so everyone is informed they will be removed
in PG 19.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.

Re: Retiring some encodings?

From

Heikki Linnakangas

Date:

23 May, 10:18:34

On 23/05/2025 05:11, Michael Paquier wrote:
> On Thu, May 22, 2025 at 10:02:16AM -0400, Bruce Momjian wrote:
>> Agreed on notification.  A radical idea would be to add a warning for
>> the use of such encodings in PG 18, and then mention their deprecation
>> in the PG 18 release notes so everyone is informed they will be removed
>> in PG 19.
> 
> With v18beta1 already out in the wild, I think that we are too late
> for taking any action on this version at this stage.  Putting a
> deprecation notice for a selected set of conversions and/or encodings
> and doing the actual removal work when v20 opens up around July 2026
> would sound like a better timing here, if the overall consensus goes
> in this direction, of course.

If we plan to remove something in the future, I think putting a 
deprecation notice in the docs in v18 is still a good idea. There's no 
point in hiding the plan by not documenting it sooner. The more advance 
notice people get the better.

-- 
Heikki Linnakangas
Neon (https://neon.tech)

Re: Retiring some encodings?

From

Daniel Gustafsson

Date:

23 May, 11:21:42

> On 23 May 2025, at 09:18, Heikki Linnakangas <hlinnaka@iki.fi> wrote:

> If we plan to remove something in the future, I think putting a deprecation notice in the docs in v18 is still a good idea. There's no point in hiding the plan by not documenting it sooner. The more advance notice people get the better.

+1

--
Daniel Gustafsson

Re: Retiring some encodings?

From

wenhui qiu

Date:

23 May, 12:08:35

> The obvious question is how many people would suffer because
> of that removal, as it would prevent them from using pg_upgrade.

> Can anybody who works in a region that uses these encodings make
> an educated guess?

+1, agreed. GB18030 is a national character encoding standard in China. Removing it would affect the adoption of PostgreSQL in China, where PostgreSQL is becoming increasingly popular, so this needs careful consideration!

On Fri, May 23, 2025 at 4:22 PM Daniel Gustafsson <daniel@yesql.se> wrote:

> On 23 May 2025, at 09:18, Heikki Linnakangas <hlinnaka@iki.fi> wrote:

> If we plan to remove something in the future, I think putting a deprecation notice in the docs in v18 is still a good idea. There's no point in hiding the plan by not documenting it sooner. The more advance notice people get the better.

+1

--
Daniel Gustafsson

Re: Retiring some encodings?

From

Daniel Gustafsson

Date:

26 May, 19:54:49

> On 26 May 2025, at 18:07, Andrew Dunstan <andrew@dunslane.net> wrote:
> On 2025-05-24 Sa 8:58 PM, DEVOPS_WwIT wrote:

>> The GB18030 encoding standard is a mandatory Chinese character encoding standard required by regulations. Software sold and used in China must support GB18030, with its latest version being the 2023 edition. The primary advantage of GB18030 is that most Chinese characters require only 2 bytes for storage, whereas UTF-8 necessitates 3 bytes for the same characters. This makes GB18030 significantly more storage-efficient compared to UTF-8 in terms of space utilization.
>
> Given this, removing it seems like a non-starter.

Agreed, it seems very unappealing to remove something so important to such a
large userbase.
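
For anyone who wants to check the storage claim quoted above, here is a minimal sketch using only Python's built-in codecs. The helper name `encoded_sizes` and the sample string are illustrative choices, not anything from PostgreSQL itself, and the result only speaks to common Chinese characters; rarer characters can take 4 bytes in GB18030.

```python
def encoded_sizes(text: str) -> dict[str, int]:
    """Return the byte length of `text` in GB18030 and in UTF-8."""
    return {enc: len(text.encode(enc)) for enc in ("gb18030", "utf-8")}


# Three common Chinese characters ("database"); an illustrative sample only.
sample = "数据库"
sizes = encoded_sizes(sample)

# Each of these characters takes 2 bytes in GB18030 and 3 bytes in UTF-8.
print(sizes)  # {'gb18030': 6, 'utf-8': 9}

# ASCII text is 1 byte per character in both encodings.
ascii_sizes = encoded_sizes("postgres")
print(ascii_sizes)  # {'gb18030': 8, 'utf-8': 8}
```

The saving therefore depends on how much of the data is Chinese text rather than ASCII.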

--
Daniel Gustafsson

Re: Retiring some encodings?

From

Michael Paquier

Date:

27 May, 03:07:13

On Mon, May 26, 2025 at 06:54:49PM +0200, Daniel Gustafsson wrote:
> Agreed, it seems very unappealing to remove something so important to such a
> large userbase.

Agreed that the so-said "state" level requirement would be a
non-starter.
--
Michael

Re: Retiring some encodings?

From

Christoph Berg

Date:

05 June, 16:35:19

Re: Michael Paquier
> On Mon, May 26, 2025 at 06:54:49PM +0200, Daniel Gustafsson wrote:
> > Agreed, it seems very unappealing to remove something so important to such a
> > large userbase.
> 
> Agreed that the so-said "state" level requirement would be a
> non-starter.

Or maybe support for using these as server encodings could be
removed, keeping the client_encoding part intact?

Christoph

Re: Retiring some encodings?

From

Ken Marshall

Date:

05 June, 18:14:58

On Thu, Jun 05, 2025 at 03:35:19PM +0200, Christoph Berg wrote:
> Re: Michael Paquier
> > On Mon, May 26, 2025 at 06:54:49PM +0200, Daniel Gustafsson wrote:
> > > Agreed, it seems very unappealing to remove something so important to such a
> > > large userbase.
> > 
> > Agreed that the so-said "state" level requirement would be a
> > non-starter.
> 
> Or maybe support for using these as server encodings could be
> removed, keeping the client_encoding part intact?
> 
> Christoph
> 

Hi,

Doesn't the ICU system support this encoding? They could just use it if
you still want to remove our own implementation.

Regards,
Ken

Re: Retiring some encodings?

From

Tatsuo Ishii

Date:

06 June, 02:50:56

>> Agreed that the so-said "state" level requirement would be a
>> non-starter.
> 
> Or maybe support for using these as server encodings could be
> removed, keeping the client_encoding part intact?

GB18030 is already client encoding only, and cannot be used as a
server encoding. The only way to store GB18030 data in a database is to
convert it from GB18030 to UTF-8 (which can be done automatically).
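
To illustrate the round trip described above, here is a rough sketch in Python. It only mimics the idea of the conversion with Python's own codecs; it is not the server's conversion code, and the function names `client_to_server` and `server_to_client` are hypothetical. In PostgreSQL itself the conversion happens automatically when a client sets client_encoding to GB18030 against a UTF8 database.

```python
def client_to_server(gb_bytes: bytes) -> bytes:
    """Convert GB18030 bytes from a client into UTF-8 bytes for storage."""
    return gb_bytes.decode("gb18030").encode("utf-8")


def server_to_client(utf8_bytes: bytes) -> bytes:
    """Convert stored UTF-8 bytes back into GB18030 bytes for the client."""
    return utf8_bytes.decode("utf-8").encode("gb18030")


# Illustrative sample: the client sends the word "encoding" in GB18030.
original = "编码".encode("gb18030")
stored = client_to_server(original)
returned = server_to_client(stored)

# The stored form is valid UTF-8 and the client gets its bytes back unchanged.
print(stored.decode("utf-8"))  # 编码
print(returned == original)    # True
```

The point of the sketch is that nothing is lost in either direction, so the data can live in a UTF-8 database while clients keep speaking GB18030.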

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp