Thread: Defining character sets for individual fields
Hi,
By default, my PostgreSQL server is set to use the UTF8 character set. I was wondering if there is any way to make sure that certain fields, such as url, only make use of ASCII. My main aim is to save space by using only 1 byte per character for URLs (some of the URLs are over 200 characters long). Is this possible? Or are all characters eventually converted to UTF8 during storage?
Thanks,
Ram
On May 31, 2008, at 6:22 PM, Ram Ravichandran wrote:
> Hi,
>
> By default, my postgresql server is set to use UTF8 character set. I
> was wondering if there is any way to make sure that certain fields
> like url etc. only makes use of ascii. My main aim is to save space
> by using only 1 byte / character for urls (some of the urls are
> over 200 characters long). Is this possible? Or are all characters
> eventually converted to UTF8 during storage?

An ascii string and the UTF8 representation of it will take exactly the same number of bytes, so if space used is your concern it's not an issue.

Cheers,
Steve
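Steve's point can be checked outside the database. The following is a minimal Python sketch, not PostgreSQL code: the helper name `utf8_size` and the sample URLs are made up for illustration. It only shows that UTF-8 encodes every ASCII character as the same single byte, so a pure-ASCII URL costs one byte per character either way.

```python
def utf8_size(text: str) -> int:
    """Return the number of bytes `text` occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


# Hypothetical sample URL containing only ASCII characters.
ascii_url = "http://www.example.com/some/long/path?item=42"

# For pure-ASCII text the two encodings produce identical bytes.
same_bytes = ascii_url.encode("ascii") == ascii_url.encode("utf-8")

# Only characters outside ASCII take more than one byte in UTF-8.
non_ascii_url = "http://www.example.com/caf\u00e9"
extra_bytes = utf8_size(non_ascii_url) - len(non_ascii_url)

print(same_bytes, utf8_size(ascii_url), len(ascii_url), extra_bytes)
```

In other words, the multi-byte cost only appears for the characters that ASCII could not represent anyway.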
Hi,

Steve Atkins wrote:
> On May 31, 2008, at 6:22 PM, Ram Ravichandran wrote:
>
>> Hi,
>>
>> By default, my postgresql server is set to use UTF8 character set. I
>> was wondering if there is any way to make sure that certain fields
>> like url etc. only makes use of ascii. My main aim is to save space by
>> using only 1 byte / character for urls (some of the urls are over 200
>> characters long). Is this possible? Or are all characters eventually
>> converted to UTF8 during storage?
>
> An ascii string and the UTF8 representation of it will take exactly the
> same number of bytes, so if space used is your concern it's not an issue.

Even more, if you convert URLs from their URL-encoded form to clear text, you can quickly leave the ASCII character range (think Punycode for the FQDN, think UTF-8 for the path).

Cheers
Tino
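To illustrate Tino's remark, here is a small Python sketch using only the standard library (the built-in `idna` codec and `urllib.parse.unquote`). The host name and path are hypothetical examples. It shows that a URL which is pure ASCII in its encoded form can contain non-ASCII characters once decoded to clear text.

```python
from urllib.parse import unquote

# Hypothetical internationalized host name, stored in its ASCII (Punycode) form.
encoded_host = b"xn--bcher-kva.example"
# Hypothetical percent-encoded path; %C3%A9 is the UTF-8 encoding of "é".
encoded_path = "/caf%C3%A9"

# Decode both parts to clear text.
clear_host = encoded_host.decode("idna")
clear_path = unquote(encoded_path)


def is_ascii(text: str) -> bool:
    """True if every character in `text` is within the ASCII range."""
    return all(ord(ch) < 128 for ch in text)


print(clear_host, is_ascii(clear_host))
print(clear_path, is_ascii(clear_path))
```

So a column restricted to ASCII would only be safe for URLs kept in their encoded form. The decoded forms need an encoding such as UTF-8.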