inconsistency and inefficiency in setup_conversion() - Mailing list pgsql-hackers
From | John Naylor |
---|---|
Subject | inconsistency and inefficiency in setup_conversion() |
Date | |
Msg-id | CAJVSVGWtUqxpfAaxS88vEGvi+jKzWZb2EStu5io-UPc4p9rSJg@mail.gmail.com |
Responses |
Re: inconsistency and inefficiency in setup_conversion()
|
List | pgsql-hackers |
Looking closely at the result of setup_conversion(), I noticed that wrong, or at least confusing, comments are applied to the functions. Consider this family of conversions:

select conproc, conname from pg_conversion
where conproc = 'utf8_to_win'::regproc
order by oid;

   conproc   |       conname
-------------+----------------------
 utf8_to_win | utf8_to_windows_866
 utf8_to_win | utf8_to_windows_874
 utf8_to_win | utf8_to_windows_1250
 utf8_to_win | utf8_to_windows_1251
 utf8_to_win | utf8_to_windows_1252
 utf8_to_win | utf8_to_windows_1253
 utf8_to_win | utf8_to_windows_1254
 utf8_to_win | utf8_to_windows_1255
 utf8_to_win | utf8_to_windows_1256
 utf8_to_win | utf8_to_windows_1257
 utf8_to_win | utf8_to_windows_1258
(11 rows)

Then compare the comment on the function:

select proname, description from pg_description d
join pg_proc p on d.objoid = p.oid
where classoid = 'pg_proc'::regclass
and description ~ 'for UTF8 to WIN';

   proname   |                   description
-------------+--------------------------------------------------
 utf8_to_win | internal conversion function for UTF8 to WIN1258
(1 row)

Notice how the comment refers to the last encoding created. This is because setup_conversion.sql invokes CREATE OR REPLACE FUNCTION utf8_to_win [...] multiple times, each time with a different comment specific to the encoding, so only the last comment survives. It'd be messy at best to try to construct the right comment using the current Makefile script. It also can't be good for initdb performance to create 44 functions just to immediately drop them. Speaking of which, in this thread about initdb performance [1], setup_conversion() consumed the biggest share of time.

I propose to get rid of the ad hoc $(CONVERSIONS) format and solve the comment issue, while hopefully shaving a bit more time off of initdb. It seems our options are the following:

Solution #1 - As alluded to in [1], turn the conversions into pg_proc.dat and pg_conversion.dat entries. Teach genbki.pl to parse pg_wchar.h to map conversion names to numbers.

Pros:
- likely easy to do
- allows for the removal of an install target in the Makefile as well as ad hoc logic in MSVC
- uses a format that developers need to use anyway

Cons:
- immediately burns up 88 hard-coded OIDs and one for each time a conversion proc is created
- would require editing data in two catalogs every time a conversion proc is created

Solution #2 - Write a new script that would read all the .c files in the various directories and output two files. These would be COPY'd into temp tables during initdb, and then inserted into pg_proc, pg_conversion, and pg_description using SQL.

Pros:
- eliminates all(?) manual catalog maintenance when adding new conversion procs

Cons:
- likely complex and difficult to debug
- further complicates initdb.c
- requires MSVC development

If we do anything, I'd much rather do #1, but that way is not entirely without downsides compared to doing nothing.

Any thoughts?

[1] https://www.postgresql.org/message-id/b549c8ad-f12e-aad1-9a59-b24cb3e55a17@proxel.se
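
To make the header-parsing step of Solution #1 a bit more concrete, here is a minimal sketch of mapping enumerator names to numbers by reading a C enum. It is written in Python purely for illustration; genbki.pl is a Perl script, and nothing here is the proposed implementation. SAMPLE_HEADER is an abbreviated, made-up stand-in for the encoding enum in pg_wchar.h, so the numbers it yields are not the real encoding IDs, and parse_enum is a hypothetical helper name.

```python
import re

# Abbreviated, illustrative stand-in for the encoding enum in pg_wchar.h.
# The real header lists many more encodings, so these values are NOT the
# real encoding numbers.
SAMPLE_HEADER = """
typedef enum pg_enc
{
    PG_SQL_ASCII = 0,   /* SQL/ASCII */
    PG_EUC_JP,          /* EUC for Japanese */
    PG_UTF8,            /* Unicode UTF8 */
    PG_WIN1258 = 10,    /* explicit value made up for this example */
    PG_WIN866
} pg_enc;
"""


def parse_enum(header_text, enum_name):
    """Return {enumerator name: integer value} for the named C enum.

    Hypothetical helper: handles only plain integer initializers and
    /* ... */ comments, which is enough for a simple enum like the sample.
    """
    pattern = r"enum\s+" + re.escape(enum_name) + r"\s*\{(.*?)\}"
    match = re.search(pattern, header_text, re.S)
    if match is None:
        raise ValueError("enum %s not found" % enum_name)

    # Drop C comments so they cannot interfere with splitting on commas.
    body = re.sub(r"/\*.*?\*/", "", match.group(1), flags=re.S)

    mapping = {}
    next_value = 0
    for item in body.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, value = item.partition("=")
        if value.strip():
            # An explicit initializer resets the counter, as in C.
            next_value = int(value.strip(), 0)
        mapping[name.strip()] = next_value
        next_value += 1
    return mapping


encoding_numbers = parse_enum(SAMPLE_HEADER, "pg_enc")
```

With a mapping like this available, a catalog data file could refer to encodings by name and have the numbers substituted at build time, which is the kind of lookup Solution #1 would need.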