On Mon, Mar 24, 2014 at 2:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
> >> I suspect trying to do this in the parser will be quite messy.
>> This needs to happen before the input is converted to the server
>> encoding, I think.
>
> Indeed --- what if the server isn't using utf8 internally?
>
> And a larger point is that the server has no idea where the file
> boundaries are. If we were to do this server-side, we'd essentially
> end up discarding BOM anywhere, which is more libertine than I care
> to be.

Right -- I had a feeling you'd say that. That's why the best solution
ISTM is to allow psql to be invoked in such a way that it *does* know
the file boundaries for consolidated scripts; this means better
handling of multiple file arguments. psql -1 already requires '-f' to
work (vs cat foo.sql | psql) and that's pretty reasonable. BOM
handling fixes should probably be applied only in cases where the exact
start of each file is known, which AFAICT are \i and -f.

So, in short, it seems prudent to:

1. make multiple -f options work (with -1 spanning all of them)
2. strip the BOM from -f or \i foo.sql if it's there

That will fix all non-redirection usages. Cases involving redirection
are not psql's bailiwick.
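
To make the BOM half concrete, here is a rough sketch of the idea. It is
Python purely for illustration, not psql code (psql is C), and the names
strip_utf8_bom and concat_scripts are made up for this example. The
point is that the BOM is dropped only at the known start of each file,
and any later occurrence of the same bytes is left alone:

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Drop a UTF-8 BOM only if it is the very first thing in the file."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def concat_scripts(file_contents):
    """Join several scripts, checking each one's start separately.

    This is only possible because the caller hands over one bytes object
    per file, i.e. the file boundaries are known (the -f / \\i case).
    """
    return b"".join(strip_utf8_bom(content) for content in file_contents)
```

With cat a.sql b.sql | psql the boundary between the two files is
already gone by the time psql sees the stream, so nothing like
concat_scripts can be applied there.
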
merlin