Hello,
I'd like to propose adding SJIS as a database encoding. You may wonder why SJIS is still necessary in the world of
Unicode. The purpose is to achieve comparable performance when migrating legacy database systems from other DBMSs
with little modification of applications.
Recently, we failed to migrate a customer's legacy database from DBMS-X to PostgreSQL. The customer wanted to use
PostgreSQL, but PostgreSQL couldn't meet the performance requirement.
The system uses DBMS-X with the database character set being SJIS. The main applications are written in embedded SQL,
which requires SJIS in the host variables. The customer insisted they cannot use UTF8 for the host variables because
that would require large modifications to the applications' character handling. So, with DBMS-X, no character set
conversion is necessary between the clients and the server.
On the other hand, PostgreSQL doesn't support SJIS as a database encoding. Therefore, character set conversion from
UTF-8 to SJIS has to be performed. The batch application runs millions of SELECTs, each of which retrieves more than
100 columns, and many of those columns are of character type.
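To make the per-column cost concrete, here is a minimal Python sketch of the kind of work involved. It is only an illustration and not PostgreSQL's conversion code; the function names and the 100-column row are assumptions made up for this example.

```python
# Illustrative only: mimics the UTF-8 -> SJIS conversion that must be
# applied to every character column value sent to an SJIS client.
# This is NOT how PostgreSQL implements it internally.

def utf8_to_sjis(value):
    """Convert one UTF-8 encoded column value to Shift_JIS bytes."""
    return value.decode("utf-8").encode("shift_jis")

def convert_row(row):
    """Convert every character column in a row (hypothetical helper)."""
    return [utf8_to_sjis(v) for v in row]

# Hypothetical row with 100 character columns stored as UTF-8.
row = ["東京".encode("utf-8")] * 100
converted = convert_row(row)
```

With millions of SELECTs, a loop like `convert_row` runs once per fetched row, which is the overhead an SJIS database encoding would avoid.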
If PostgreSQL supported SJIS, it would match or outperform DBMS-X for these applications. We confirmed this by using
psql to run a subset of the batch processing. When the client encoding is SJIS, one FETCH of 10,000 rows took about
500ms. When the client encoding is UTF8 (the same as the database encoding), the same FETCH took 270ms.
Supporting SJIS may help PostgreSQL regain some attention here in Japan, in the context of database migration. BTW,
MySQL supports SJIS as a database encoding. PostgreSQL used to be the most popular open source database in Japan, but
MySQL is now more popular.
What I'm wondering, though, is why PostgreSQL doesn't support SJIS. Was there any technical difficulty? Is there
anything you would be worried about if SJIS were added?
I'd like to write a patch for adding SJIS if there's no strong objection. I'd appreciate it if you could point me to
good design information on adding a server encoding (e.g. the URL of the most recent patch that added a new server
encoding).
Regards
Takayuki Tsunakawa