Thread: error messages during restore
So we are in the process of converting our databases from SQL_ASCII to UTF8. If a particular row won't import because of the encoding issue, we get an error like:

pg_restore: [archiver (db)] Error from TOC entry 5317; 0 1266711 TABLE DATA logs postgres
pg_restore: [archiver (db)] COPY failed: ERROR: invalid byte sequence for encoding "UTF8": 0x90
HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
CONTEXT: COPY logs, line 590744

So as far as I can tell, this identifies the table by name, logs in this case, and then identifies the actual record by line.

It would be really nice if we could also figure out the actual column name in that table. I do get the line number that produced the error, but the fact that this is a binary dump makes it difficult to view that line. Is there a way to view that data line without converting this dump to a text dump? All I'd like to do is know which column in the table caused the problem so I could apply my fix to that particular column.

--
Until later, Geoffrey

"I predict future happiness for America if they can prevent the government from wasting the labors of the people under the pretense of taking care of them."
- Thomas Jefferson

Geoffrey Myers <lists@serioustechnology.com> writes:
> So we are in the process of converting our databases from SQL_ASCII to
> UTF8. If a particular row won't import because of the encoding issue we
> get an error like:

> pg_restore: [archiver (db)] Error from TOC entry 5317; 0 1266711 TABLE
> DATA logs postgres
> pg_restore: [archiver (db)] COPY failed: ERROR: invalid byte sequence
> for encoding "UTF8": 0x90
> HINT: This error can also happen if the byte sequence does not match
> the encoding expected by the server, which is controlled by
> "client_encoding".
> CONTEXT: COPY logs, line 590744

> Question is, it would be really nice if we could figure out the actual
> column name in that table.

Sorry, no chance of that. The line is converted to server encoding before any attempt is made to split it into columns. Since the column delimiter is potentially encoding-specific, there's not really any alternative to doing it that way.

regards, tom lane

Tom Lane wrote:
> Geoffrey Myers <lists@serioustechnology.com> writes:
>> Question is, it would be really nice if we could figure out the actual
>> column name in that table.
>
> Sorry, no chance of that. The line is converted to server encoding
> before any attempt is made to split it into columns. Since the column
> delimiter is potentially encoding-specific, there's not really any
> alternative to doing it that way.
>
> regards, tom lane

Thanks for the follow up Tom.

--
Until later, Geoffrey
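
The server cannot name the column, for the reason Tom gives, but the check can be approximated on the client side. What follows is a rough sketch under several assumptions, not anything proposed in the thread: it assumes the table's data has been written out as plain text (for example with pg_restore's --data-only, --table and -f options), that the COPY header carries a column list, that the tab delimiter survives in whatever the real source encoding is, and that identifiers contain no spaces or commas. The function name find_bad_fields and the file name logs.sql are made up for illustration.

```python
import re
import sys

# Matches a header such as: COPY logs (id, msg, src) FROM stdin;
# Assumption: the header has a column list and simple identifiers.
COPY_HEADER = re.compile(
    rb"^COPY\s+(\S+)\s+\((.*)\)\s+FROM\s+stdin;\s*$", re.IGNORECASE
)


def find_bad_fields(stream):
    """Yield (table, copy_line, column, bad_bytes) for each field in a
    COPY text block that is not valid UTF-8.

    `stream` is any iterable of byte lines, e.g. a file opened in "rb" mode.
    """
    columns = None  # None while we are outside a COPY data block
    table = None
    line_no = 0
    for raw in stream:
        line = raw.rstrip(b"\r\n")
        if columns is None:
            match = COPY_HEADER.match(line)
            if match:
                table = match.group(1).decode("ascii", "replace")
                columns = [
                    c.strip().decode("ascii", "replace")
                    for c in match.group(2).split(b",")
                ]
                line_no = 0  # COPY counts data lines from 1
            continue
        if line == b"\\.":  # end-of-data marker closes the block
            columns = None
            continue
        line_no += 1
        # Literal tabs inside values are escaped in COPY text format,
        # so a raw tab byte is treated here as a column boundary.
        for idx, field in enumerate(line.split(b"\t")):
            try:
                field.decode("utf-8")
            except UnicodeDecodeError as exc:
                name = columns[idx] if idx < len(columns) else f"column #{idx + 1}"
                yield table, line_no, name, field[exc.start:exc.end]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python find_bad_fields.py logs.sql
    with open(sys.argv[1], "rb") as dump:
        for table, line_no, column, bad in find_bad_fields(dump):
            print(f"{table} line {line_no}: column {column}, bytes {bad!r}")
```

The line number reported is counted from the start of each COPY block, so it should be comparable to the "COPY logs, line 590744" figure in the error. Only the first invalid sequence in each field is reported, which is enough to tell which column needs fixing.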