9.5. Binary String Functions and Operators #
 This section describes functions and operators for examining and manipulating binary strings, that is values of type bytea. Many of these are equivalent, in purpose and syntax, to the text-string functions described in the previous section. 
SQL defines some string functions that use key words, rather than commas, to separate arguments. Details are in Table 9.11. Postgres Pro also provides versions of these functions that use the regular function invocation syntax (see Table 9.12). 
Table 9.11. SQL Binary String Functions and Operators
|  Function/Operator   Description   Example(s)  | 
|---|
|  bytea||bytea→bytea  Concatenates the two binary strings.  '\x123456'::bytea || '\x789a00bcde'::bytea→\x123456789a00bcde
 | 
|  bit_length(bytea) →integer  Returns number of bits in the binary string (8 times the octet_length). bit_length('\x123456'::bytea)→24
 | 
|  btrim(bytesbytea,bytesremovedbytea) →bytea  Removes the longest string containing only bytes appearing in bytesremovedfrom the start and end ofbytes. btrim('\x1234567890'::bytea, '\x9012'::bytea)→\x345678
 | 
|  ltrim(bytesbytea,bytesremovedbytea) →bytea  Removes the longest string containing only bytes appearing in bytesremovedfrom the start ofbytes. ltrim('\x1234567890'::bytea, '\x9012'::bytea)→\x34567890
 | 
|  octet_length(bytea) →integer  Returns number of bytes in the binary string.  octet_length('\x123456'::bytea)→3
 | 
|  overlay(bytesbyteaPLACINGnewsubstringbyteaFROMstartinteger[FORcountinteger] ) →bytea  Replaces the substring of bytesthat starts at thestart'th byte and extends forcountbytes withnewsubstring. Ifcountis omitted, it defaults to the length ofnewsubstring. overlay('\x1234567890'::bytea placing '\002\003'::bytea from 2 for 3)→\x12020390
 | 
|  position(substringbyteaINbytesbytea) →integer  Returns first starting index of the specified substringwithinbytes, or zero if it's not present. position('\x5678'::bytea in '\x1234567890'::bytea)→3
 | 
|  rtrim(bytesbytea,bytesremovedbytea) →bytea  Removes the longest string containing only bytes appearing in bytesremovedfrom the end ofbytes. rtrim('\x1234567890'::bytea, '\x9012'::bytea)→\x12345678
 | 
|  substring(bytesbytea[FROMstartinteger] [FORcountinteger] ) →bytea  Extracts the substring of bytesstarting at thestart'th byte if that is specified, and stopping aftercountbytes if that is specified. Provide at least one ofstartandcount. substring('\x1234567890'::bytea from 3 for 2)→\x5678
 | 
|  trim( [LEADING|TRAILING|BOTH]bytesremovedbyteaFROMbytesbytea) →bytea  Removes the longest string containing only bytes appearing in bytesremovedfrom the start, end, or both ends (BOTHis the default) ofbytes. trim('\x9012'::bytea from '\x1234567890'::bytea)→\x345678
 | 
| trim( [LEADING|TRAILING|BOTH] [FROM]bytesbytea,bytesremovedbytea) →bytea
  This is a non-standard syntax for trim(). trim(both from '\x1234567890'::bytea, '\x9012'::bytea)→\x345678
 | 
 Additional binary string manipulation functions are available and are listed in Table 9.12. Some of them are used internally to implement the SQL-standard string functions listed in Table 9.11. 
Table 9.12. Other Binary String Functions
|  Function   Description   Example(s)  | 
|---|
|   bit_count(bytesbytea) →bigint  Returns the number of bits set in the binary string (also known as “popcount”).  bit_count('\x1234567890'::bytea)→15
 | 
|  get_bit(bytesbytea,nbigint) →integer  Extracts n'th bit from binary string.  get_bit('\x1234567890'::bytea, 30)→1
 | 
|  get_byte(bytesbytea,ninteger) →integer  Extracts n'th byte from binary string.  get_byte('\x1234567890'::bytea, 4)→144
 | 
|    length(bytea) →integer  Returns the number of bytes in the binary string.  length('\x1234567890'::bytea)→5
 | 
| length(bytesbytea,encodingname) →integer
  Returns the number of characters in the binary string, assuming that it is text in the given encoding. length('jose'::bytea, 'UTF8')→4
 | 
|  md5(bytea) →text  Computes the MD5 hash of the binary string, with the result written in hexadecimal.  md5('Th\000omas'::bytea)→8ab2d3c9689aaf18b4958c334c82d8b1
 | 
|  set_bit(bytesbytea,nbigint,newvalueinteger) →bytea  Sets n'th bit in binary string to newvalue. set_bit('\x1234567890'::bytea, 30, 0)→\x1234563890
 | 
|  set_byte(bytesbytea,ninteger,newvalueinteger) →bytea  Sets n'th byte in binary string to newvalue. set_byte('\x1234567890'::bytea, 4, 64)→\x1234567840
 | 
|  sha224(bytea) →bytea  Computes the SHA-224 hash of the binary string.  sha224('abc'::bytea)→\x23097d223405d8228642a477bda255b32aadbce4bda0b3f7e36c9da7
 | 
|  sha256(bytea) →bytea  Computes the SHA-256 hash of the binary string.  sha256('abc'::bytea)→\xba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
 | 
|  sha384(bytea) →bytea  Computes the SHA-384 hash of the binary string.  sha384('abc'::bytea)→\xcb00753f45a35e8bb5a03d699ac65007272c32ab0eded1631a8b605a43ff5bed8086072ba1e7cc2358baeca134c825a7
 | 
|  sha512(bytea) →bytea  Computes the SHA-512 hash of the binary string.  sha512('abc'::bytea)→\xddaf35a193617abacc417349ae20413112e6fa4e89a97ea20a9eeee64b55d39a2192992a274fc1a836ba3c23a3feebbd454d4423643ce80e2a9ac94fa54ca49f
 | 
|  substr(bytesbytea,startinteger[,countinteger] ) →bytea  Extracts the substring of bytesstarting at thestart'th byte, and extending forcountbytes if that is specified. (Same assubstring(.)bytesfromstartforcount) substr('\x1234567890'::bytea, 3, 2)→\x5678
 | 
 Functions get_byte and set_byte number the first byte of a binary string as byte 0. Functions get_bit and set_bit number bits from the right within each byte; for example bit 0 is the least significant bit of the first byte, and bit 15 is the most significant bit of the second byte. 
 For historical reasons, the function md5 returns a hex-encoded value of type text whereas the SHA-2 functions return type bytea. Use the functions encode and decode to convert between the two. For example write encode(sha256('abc'), 'hex') to get a hex-encoded text representation, or decode(md5('abc'), 'hex') to get a bytea value. 
  Functions for converting strings between different character sets (encodings), and for representing arbitrary binary data in textual form, are shown in Table 9.13. For these functions, an argument or result of type text is expressed in the database's default encoding, while arguments or results of type bytea are in an encoding named by another argument. 
Table 9.13. Text/Binary String Conversion Functions
|  Function   Description   Example(s)  | 
|---|
|  convert(bytesbytea,src_encodingname,dest_encodingname) →bytea  Converts a binary string representing text in encoding src_encodingto a binary string in encodingdest_encoding(see Section 23.3.4 for available conversions). convert('text_in_utf8', 'UTF8', 'LATIN1')→\x746578745f696e5f75746638
 | 
|  convert_from(bytesbytea,src_encodingname) →text  Converts a binary string representing text in encoding src_encodingtotextin the database encoding (see Section 23.3.4 for available conversions). convert_from('text_in_utf8', 'UTF8')→text_in_utf8
 | 
|  convert_to(stringtext,dest_encodingname) →bytea  Converts a textstring (in the database encoding) to a binary string encoded in encodingdest_encoding(see Section 23.3.4 for available conversions). convert_to('some_text', 'UTF8')→\x736f6d655f74657874
 | 
|  encode(bytesbytea,formattext) →text  Encodes binary data into a textual representation; supported formatvalues are:base64,escape,hex. encode('123\000\001', 'base64')→MTIzAAE=
 | 
|  decode(stringtext,formattext) →bytea  Decodes binary data from a textual representation; supported formatvalues are the same as forencode. decode('MTIzAAE=', 'base64')→\x3132330001
 | 
 The encode and decode functions support the following textual formats: 
- base64  #
-  The - base64format is that of RFC 2045 Section 6.8. As per the RFC, encoded lines are broken at 76 characters. However instead of the MIME CRLF end-of-line marker, only a newline is used for end-of-line. The- decodefunction ignores carriage-return, newline, space, and tab characters. Otherwise, an error is raised when- decodeis supplied invalid base64 data — including when trailing padding is incorrect.
 
- escape  #
-  The - escapeformat converts zero bytes and bytes with the high bit set into octal escape sequences (- \- nnn), and it doubles backslashes. Other byte values are represented literally. The- decodefunction will raise an error if a backslash is not followed by either a second backslash or three octal digits; it accepts other byte values unchanged.
 
- hex  #
-  The - hexformat represents each 4 bits of data as one hexadecimal digit,- 0through- f, writing the higher-order digit of each byte first. The- encodefunction outputs the- a-- fhex digits in lower case. Because the smallest unit of data is 8 bits, there are always an even number of characters returned by- encode. The- decodefunction accepts the- a-- fcharacters in either upper or lower case. An error is raised when- decodeis given invalid hex data — including when given an odd number of characters.
 
 See also the aggregate function string_agg in Section 9.21 and the large object functions in Section 38.4.