On 2020-11-30 22:15, Pavel Stehule wrote:
> I would like some supporting documentation on this. So far we only have
> one stackoverflow question, and then this implementation, and they are
> not even the same format. My worry is that if there is not precise
> specification, then people are going to want to add things in the
> future, and there will be no way to analyze such requests in a
> principled way.
>
> I checked this and it is "prefix backslash-u hex" used by Java,
> JavaScript or RTF -
> https://billposer.org/Software/ListOfRepresentations.html
Heh. The fact that there is a table of two dozen possible representations kind of proves my point that we should be deliberate in picking one.
I do see Oracle unistr() on that list, which appears to be very similar to what you are trying to do here. Maybe look into aligning with that.
unistr is a primitive form of the proposed function, but it can be used as a base. Its format is compatible with our "4.1.2.3. String Constants with Unicode Escapes".
What do you think about the following proposal?
1. unistr(text) .. compatible with Postgres Unicode escapes. This is an enhancement over Oracle, because Oracle's unistr doesn't support 6-digit code points.
2. There can be an optional parameter "prefix" with default "\". With "\u" it can be compatible with Java or Python.
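
To make the two points concrete, here is a rough Python sketch of the decoding rules as I read them. It is only an illustration, not the proposed implementation. It assumes the forms from section 4.1.2.3: prefix plus 4 hex digits, prefix plus "+" plus 6 hex digits, and a doubled prefix for a literal prefix. How the 6-digit form and the doubled prefix should behave with a non-default prefix such as "\u" is my assumption; the proposal does not specify it. Surrogate pairs are not combined here.

```python
import re


def unistr(text: str, prefix: str = "\\") -> str:
    """Decode Unicode escapes in text (illustrative sketch only).

    Assumed forms, modelled on Postgres U&'...' string constants:
      <prefix>XXXX     4 hex digits
      <prefix>+XXXXXX  6 hex digits
      <prefix><prefix> literal prefix
    """
    p = re.escape(prefix)
    pattern = re.compile(
        p + r"\+([0-9A-Fa-f]{6})"        # 6-digit form
        + "|" + p + r"([0-9A-Fa-f]{4})"  # 4-digit form
        + "|" + p + p                    # doubled prefix
    )

    def replace(match: re.Match) -> str:
        digits = match.group(1) or match.group(2)
        if digits is not None:
            # chr() raises ValueError for values above 0x10FFFF
            return chr(int(digits, 16))
        return prefix

    return pattern.sub(replace, text)
```

With the default prefix, unistr(r"d\0061t\+000061") gives "data", the same as the U&'d\0061t\+000061' example in the documentation. With prefix="\\u", unistr(r"\u0041bc", prefix="\\u") gives "Abc".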