Re: Supporting SJIS as a database encoding - Mailing list pgsql-hackers

From Tsunakawa, Takayuki
Subject Re: Supporting SJIS as a database encoding
Date
Msg-id 0A3221C70F24FB45833433255569204D1F5E6615@G01JPEXMBYT05
In response to Re: Supporting SJIS as a database encoding  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Supporting SJIS as a database encoding  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> "Tsunakawa, Takayuki" <tsunakawa.takay@jp.fujitsu.com> writes:
> > Before digging into the problem, could you share your impression on
> > whether PostgreSQL can support SJIS?  Would it be hopeless?
> 
> I think it's pretty much hopeless.  Even if we were willing to make every
> bit of code that looks for '\' and other specific at-risk characters
> multi-byte aware (with attendant speed penalties), we could expect that
> third-party extensions would still contain vulnerable code.  More, we could
> expect that new bugs of the same ilk would get introduced all the time.
> Many such bugs would amount to security problems.  So the amount of effort
> and vigilance required seems out of proportion to the benefits.

Hmm, this sounds like a death sentence.  But as I don't yet have good knowledge of character set handling, I'm not
completely convinced that PostgreSQL cannot support SJIS.  I wonder why and how other DBMSs support SJIS, and how their
implementations differ.  Would using multibyte functions like mb... to process characters solve the
problem?  Isn't the current implementation also blocking support for other character sets that have similar
characteristics?  I'll study the character set handling...
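
To make the at-risk case concrete, here is a minimal sketch in Python (not PostgreSQL code; the function names are hypothetical and chosen only for illustration). It relies on two facts about Shift_JIS: lead bytes fall in 0x81-0x9F and 0xE0-0xFC, and the second byte of a two-byte character may be 0x5C, the same value as ASCII '\'. The character 表, for example, is encoded as 0x95 0x5C. A byte-wise search for '\' therefore reports a false hit inside that character, while a scan that steps over whole characters does not.

```python
def is_sjis_lead(b):
    """True if b is the first byte of a two-byte Shift_JIS character."""
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def naive_backslash_offsets(data):
    """Byte-wise scan: reports every 0x5C byte, including trail bytes."""
    return [i for i, b in enumerate(data) if b == 0x5C]


def mb_aware_backslash_offsets(data):
    """Character-wise scan: skips both bytes of each two-byte character."""
    offsets = []
    i = 0
    while i < len(data):
        if is_sjis_lead(data[i]) and i + 1 < len(data):
            i += 2          # step over lead byte and trail byte together
            continue
        if data[i] == 0x5C:
            offsets.append(i)
        i += 1
    return offsets


# 0x95 0x5C is the Shift_JIS encoding of 表; it is followed by a real
# backslash (0x5C) and the letter n.
data = b"\x95\x5c" + b"\\n"

print(naive_backslash_offsets(data))     # offsets 1 and 2: the first is a false hit
print(mb_aware_backslash_offsets(data))  # offset 2 only: the real backslash
```

The second function is what "multi-byte aware" amounts to in this sketch: every loop that looks for a specific ASCII byte has to advance by character length instead of by one byte, which is where both the speed penalty and the risk of missed call sites come from.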
 

> Most of the recent discussion about allowed backend encodings has run more
> in the other direction, ie, "why don't we disallow everything but
> UTF8 and get rid of all the infrastructure for multiple backend encodings?".
> I'm not personally in favor of that, but there are very few hackers who
> want to add any more overhead in this area.

Personally, I totally agree.  I want non-Unicode character sets to disappear from the world.  But real-world business
doesn't seem to forgive the lack of SJIS...
 

Regards
Takayuki Tsunakawa




