After digging into it, I see that you are completely correct. I had to do a bit more reading to understand the relationship between UTF-8 and wchar, but ultimately the existing locale support works for my use case.
Therefore I have updated the patch with three much smaller changes:
* Support `-` in labels, in addition to `_`
* Raise the label length limit from 256 to 512 chars; again, it's not uncommon for non-English strings to be much longer
* Expand the documentation to explain how an ltree label relates to the DB locale
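
To make the combined rule concrete, here is a rough sketch in C of the label check as I understand it after these changes. This is not the ltree code itself: `sketch_label_is_valid` and `SKETCH_MAX_LABEL_CHARS` are made-up names for illustration, and rejecting an empty label is my assumption. It only shows the idea that a character is accepted if `iswalpha()` or `iswdigit()` accepts it in the current locale, or if it is `_` or `-`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical constant mirroring the proposed 512-character limit. */
#define SKETCH_MAX_LABEL_CHARS 512

/*
 * Illustrative only, not the actual ltree implementation.
 * Returns true if `label` is non-empty (an assumption), no longer than
 * SKETCH_MAX_LABEL_CHARS wide characters, and made up solely of
 * characters the current locale classifies as alphabetic or digit,
 * plus '_' and '-'.
 */
static bool
sketch_label_is_valid(const wchar_t *label)
{
    size_t len = wcslen(label);

    if (len == 0 || len > SKETCH_MAX_LABEL_CHARS)
        return false;

    for (size_t i = 0; i < len; i++)
    {
        wint_t c = (wint_t) label[i];

        if (!(iswalpha(c) || iswdigit(c) || c == L'_' || c == L'-'))
            return false;
    }
    return true;
}
```

Which non-ASCII characters pass depends entirely on the active locale (set with `setlocale(LC_CTYPE, ...)` in a standalone program; in the server it would be the database's default locale), which is the point the documentation change tries to spell out.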
Garen Torikian <gjtorikian@gmail.com> writes:
>> Perhaps the docs are a bit unclear about that, but it's not
>> restricted to ASCII alphanumerics. AFAICS the code will accept
>> whatever iswalpha() and iswdigit() will accept in the database's
>> default locale.