Thread: Re: Retiring some encodings?
On Thu, May 22, 2025 at 02:44:39PM +0300, Heikki Linnakangas wrote:
> On 22/05/2025 08:54, Michael Paquier wrote:
> > With all that in mind, I have wanted to kick a discussion about
> > potentially removing one or more encodings from the core code,
> > including the backend part, the frontend part and the conversion
> > routines, coupled with checks in pg_upgrade to complain with database
> > or collations include the so-said encoding (the collation part needs
> > to be checked when not using ICU). Just being able to removing
> > GB18030 would do us a favor in the long-term, at least, but there's
> > more.
>
> +1 at high level for deprecating and removing conversions that are not
> widely used anymore. As the first step, we can at least add a warning to the
> documentation, that they will be removed in the future.

Agreed on notification. A radical idea would be to add a warning for
the use of such encodings in PG 18, and then mention their deprecation
in the PG 18 release notes so everyone is informed they will be removed
in PG 19.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Do not let urgent matters crowd out time for investment in the future.
On 23/05/2025 05:11, Michael Paquier wrote:
> On Thu, May 22, 2025 at 10:02:16AM -0400, Bruce Momjian wrote:
>> Agreed on notification. A radical idea would be to add a warning for
>> the use of such encodings in PG 18, and then mention their deprecation
>> in the PG 18 release notes so everyone is informed they will be removed
>> in PG 19.
>
> With v18beta1 already out in the wild, I think that we are too late
> for taking any action on this version at this stage. Putting a
> deprecation notice for a selected set of conversions and/or encodings
> and do the actual removal work when v20 opens up around July 2026
> would sound like a better timing here, if the overall consensus goes
> in this direction, of course.

If we plan to remove something in the future, I think putting a
deprecation notice in the docs in v18 is still a good idea. There's no
point in hiding the plan by not documenting it sooner. The more advance
notice people get the better.

--
Heikki Linnakangas
Neon (https://neon.tech)
> On 23 May 2025, at 09:18, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> If we plan to remove something in the future, I think putting a deprecation notice in the docs in v18 is still a good idea. There's no point in hiding the plan by not documenting it sooner. The more advance notice people get the better.

+1

--
Daniel Gustafsson
Hi,
> The obvious question is how many people would suffer because
> of that removal, as it would prevent them from using pg_upgrade.
> Can anybody who works in a region that uses these encodings make
> an educated guess?
+1, agreed. GB18030 is a character encoding standard in China. Removing it would affect the use of PostgreSQL in China, where PostgreSQL is becoming increasingly popular, so this needs careful consideration.
On Fri, May 23, 2025 at 4:22 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> On 23 May 2025, at 09:18, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> If we plan to remove something in the future, I think putting a deprecation notice in the docs in v18 is still a good idea. There's no point in hiding the plan by not documenting it sooner. The more advance notice people get the better.
+1
--
Daniel Gustafsson
> On 26 May 2025, at 18:07, Andrew Dunstan <andrew@dunslane.net> wrote:
> On 2025-05-24 Sa 8:58 PM, DEVOPS_WwIT wrote:
>> The GB18030 encoding standard is a mandatory Chinese character encoding standard required by regulations. Software sold and used in China must support GB18030, with its latest version being the 2023 edition. The primary advantage of GB18030 is that most Chinese characters require only 2 bytes for storage, whereas UTF-8 necessitates 3 bytes for the same characters. This makes GB18030 significantly more storage-efficient compared to UTF-8 in terms of space utilization.
>
> Given this, removing it seems like a non-starter.

Agreed, it seems very unappealing to remove something so important to such a
large userbase.

--
Daniel Gustafsson
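The 2-byte versus 3-byte claim quoted above can be checked with a short Python sketch. It relies only on Python's built-in gb18030 and utf-8 codecs, not on PostgreSQL's conversion routines; the helper name encoded_sizes and the sample string are illustrative assumptions. The saving applies to common Chinese characters only: ASCII takes 1 byte in both encodings, and characters outside the Basic Multilingual Plane take 4 bytes in both.

```python
def encoded_sizes(text):
    """Return (gb18030_byte_count, utf8_byte_count) for a string."""
    return len(text.encode("gb18030")), len(text.encode("utf-8"))


# Five common simplified Chinese characters ("database encoding").
sample = "数据库编码"
gb_size, utf8_size = encoded_sizes(sample)
print(f"GB18030: {gb_size} bytes, UTF-8: {utf8_size} bytes")

# ASCII and supplementary-plane characters show no difference in size.
print(encoded_sizes("a"))            # ASCII letter
print(encoded_sizes("\U00020000"))   # CJK Extension B character
```

For Chinese-heavy text the GB18030 form is roughly two thirds the size of the UTF-8 form, which is the storage argument made in the quoted message.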
On Mon, May 26, 2025 at 06:54:49PM +0200, Daniel Gustafsson wrote:
> Agreed, it seems very unappealing to remove something so important to such a
> large userbase.

Agreed that the so-said "state" level requirement would be a
non-starter.
--
Michael
Re: Michael Paquier
> On Mon, May 26, 2025 at 06:54:49PM +0200, Daniel Gustafsson wrote:
> > Agreed, it seems very unappealing to remove something so important to such a
> > large userbase.
>
> Agreed that the so-said "state" level requirement would be a
> non-starter.

Or maybe support for using these as server encodings could be
removed, keeping the client_encoding part intact?

Christoph
On Thu, Jun 05, 2025 at 03:35:19PM +0200, Christoph Berg wrote:
> Re: Michael Paquier
> > On Mon, May 26, 2025 at 06:54:49PM +0200, Daniel Gustafsson wrote:
> > > Agreed, it seems very unappealing to remove something so important to such a
> > > large userbase.
> >
> > Agreed that the so-said "state" level requirement would be a
> > non-starter.
>
> Or maybe support for using these as server encodings could be
> removed, keeping the client_encoding part intact?
>
> Christoph
>

Hi,

Doesn't the ICU system support this encoding? They could just use it if
you still want to remove our own implementation.

Regards,
Ken
>> Agreed that the so-said "state" level requirement would be a
>> non-starter.
>
> Or maybe support for using these as server encodings could be
> removed, keeping the client_encoding part intact?

GB18030 is already client encoding only, and cannot be used as a server
encoding. The only way to store GB18030 data in the database is to
convert it from GB18030 to UTF-8 (which can be done automatically).

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese: http://www.sraoss.co.jp
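In PostgreSQL the conversion described above is performed by the server when client_encoding is set to GB18030 on a UTF8 database. As a rough illustration of what that transcoding amounts to, here is a minimal Python sketch. It uses Python's own codecs rather than PostgreSQL's conversion routines, and the function names are hypothetical.

```python
def gb18030_to_utf8(raw: bytes) -> bytes:
    """Transcode GB18030 bytes (client side) to UTF-8 bytes (storage side)."""
    return raw.decode("gb18030").encode("utf-8")


def utf8_to_gb18030(raw: bytes) -> bytes:
    """Transcode UTF-8 bytes (storage side) back to GB18030 bytes (client side)."""
    return raw.decode("utf-8").encode("gb18030")


# Bytes as a GB18030 client would send them.
client_bytes = "编码".encode("gb18030")

# What ends up stored in a UTF-8 database after conversion.
stored_bytes = gb18030_to_utf8(client_bytes)
assert stored_bytes.decode("utf-8") == "编码"

# Converting back for the client returns the original bytes.
assert utf8_to_gb18030(stored_bytes) == client_bytes
```

Because GB18030 covers the full Unicode repertoire, the round trip is lossless in this sketch, which is why storing the data as UTF-8 does not lose information for GB18030 clients.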