Thread: psql include text file with bom
Summary:
psql's \include (or \i) command chokes on UTF-8 text files that begin with a BOM.
Steps to reproduce:
- Create a UTF-8 file that starts with the three-byte BOM 'EF BB BF'.
- Include the file from psql via the \include or \i command.
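The first step can be sketched in Python for anyone whose editor does not write a BOM. This is only an illustration: the helper name write_bom_sql and the "SELECT 1;" statement are assumptions, not part of the original report.

```python
import codecs
import tempfile
from pathlib import Path


def write_bom_sql(path, sql="SELECT 1;\n"):
    """Write `sql` as UTF-8, prefixed with the three-byte BOM EF BB BF."""
    data = codecs.BOM_UTF8 + sql.encode("utf-8")
    Path(path).write_bytes(data)
    return data


# Demonstrate in a scratch directory; the file name matches the report.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "test.sql"
    written = write_bom_sql(target)
    first_three = target.read_bytes()[:3]
    print(first_three.hex(" "))
```

Calling write_bom_sql("test.sql") in the directory where psql is started should produce a file suitable for the second step.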
Example output for file named "test.sql" below:
redacted-# \i test.sql
psql:test.sql:1: ERROR: syntax error at or near ""
LINE 1: 
Background
https://en.wikipedia.org/wiki/Byte_order_mark
Some text editors save text files with a leading byte order mark (BOM). These include Visual Studio, VSCode, and others. I think it would be reasonable for the include command to skip over any BOM found in the first two or three bytes of a file.
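Until something like that exists, one possible workaround is to strip the BOM outside psql before including the file. The sketch below assumes the file is already known to be UTF-8; the helper name strip_utf8_bom is hypothetical, and nothing here reflects how psql itself behaves.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return `data` without a leading UTF-8 BOM, if one is present.

    Assumes the caller already knows the bytes are UTF-8; in another
    encoding the same three bytes could be ordinary data.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


with_bom = codecs.BOM_UTF8 + b"SELECT 1;\n"
cleaned = strip_utf8_bom(with_bom)
unchanged = strip_utf8_bom(b"SELECT 1;\n")
```

Only a BOM at the very start is removed, and input without one passes through untouched, so running it twice is harmless.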
Rick Parrish <ai5jt@unitrunker.net> writes:
> I think it would be reasonable for the include command to skip over any
> BOM found in the first two or three bytes of a file.

This has been proposed before, and rejected before. psql has no inherent knowledge of what encoding an input file is in, and therefore no justification to assume that a bit-pattern it sees there is a BOM. In non-UTF8 encodings it could very easily be valid data.

(For that matter, it's also valid data in UTF8: it's the same bit pattern as U+FEFF ZERO WIDTH NO-BREAK SPACE. Programs that emit one into UTF8 streams, and expect it not to be taken as data, are frankly broken.)

regards, tom lane
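The ambiguity described in the reply can be checked with a short Python sketch. The choice of Latin-1 and Windows-1252 as the non-UTF8 examples is an assumption made for illustration; the thread does not name specific encodings.

```python
import unicodedata

raw = b"\xef\xbb\xbf"  # the byte sequence commonly treated as a UTF-8 BOM

# Decoded as UTF-8, the bytes form a single, valid code point: U+FEFF.
as_utf8 = raw.decode("utf-8")
utf8_name = unicodedata.name(as_utf8)

# Decoded as Latin-1 or Windows-1252, the same bytes are three
# ordinary printable characters.
as_latin1 = raw.decode("latin-1")
as_cp1252 = raw.decode("cp1252")

print(f"UTF-8:   U+{ord(as_utf8):04X} {utf8_name}")
print(f"Latin-1: {as_latin1!r}")
print(f"CP1252:  {as_cp1252!r}")
```

All three decodes succeed without error, which is the point: the bytes alone do not reveal whether they are a marker or data.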