both lc_type and lc_collate are en_US.UTF-8. Sorry for missing them in the original post.
I understand that collate has impact on sorting order, but the fact that char_length() is not returning the correct length in char for certain characters (non-BMP) is an indication that unicode is not fully supported. If char_length() is not working properly, I'd expect that substring() won't work either.
The PostgreSQL I am using is an AWS PostgreSQL RDS. I can check with AWS, but presumably that they are running PostgreSQL RDS on some flavor of lunix.
My client is PgAdmin 4 running on a Windows 7 machine. I understand that some client tools may not be able to display all unicode chars, but I do expect that the function ascii() return correct values of the stored chars.
For me the primary requirement is storing and retrieving all unicode characters as they are, and char_length() returns the correct values for all supported unicode chars. Correct sorting is nice-to-have.
Any help to get unicode chars, particularly the mojos (0x1F478, 0x1F479), in and out of pg correctly is much appreciated. Thank you!
James