trivial DoS on char recoding - Mailing list pgsql-hackers

From Alvaro Herrera
Subject trivial DoS on char recoding
Msg-id 20060620213249.GI26882@surnet.cl
Responses Re: trivial DoS on char recoding  (Alvaro Herrera <alvherre@commandprompt.com>)
Re: trivial DoS on char recoding  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Oswaldo Hernandez just reported this in the pgsql-es-ayuda list.
Basically, a conversion between UTF8 and windows_1250 can crash the
server.

I recall a bug around this general code but I don't recall it being able
to provoke a PANIC.

To reproduce, create a cluster with UTF-8 encoding and locale es_ES (I'm
actually using es_CL but it should be the same).  Note that the es_ES
locale is declared to use Latin1 encoding, not UTF-8.  In a psql
session,

template1=# copy foo from '/tmp/foo' ;
ERROR:  no existe la relación «foo»
template1=# \encoding latin1
template1=# copy foo from '/tmp/foo' ;
ERROR:  could not convert UTF8 character 0x00f3 to ISO8859-1
template1=# \encoding windows_1250
template1=# copy foo from '/tmp/foo' ;
PANIC:  ERRORDATA_STACK_SIZE exceeded

Neither the table "foo" nor the /tmp/foo file needs to exist.

In the server logs, I set "log_line_prefix" to %x (Xid) to make it
obvious that all of these reports come from processing the same message.
When the PANIC occurs, the server logs this:

574 ERROR:  no existe la relación «foo»
574 WARNING:  ignorando el carácter UTF-8 no convertible 0xf36e20ab
574 WARNING:  ignorando el carácter UTF-8 no convertible 0xe16374
574 WARNING:  ignorando el carácter UTF-8 no convertible 0xe16374
574 WARNING:  ignorando el carácter UTF-8 no convertible 0xe16374
574 PANIC:  ERRORDATA_STACK_SIZE exceeded
574 SENTENCIA:  copy foo from '/tmp/datoscopy' ;


To reproduce, you need to be using a non-C locale (es_ES works for me).
If I start the postmaster with -c lc_messages=C, the problem does not
occur.  Note that the PO file for the Spanish translation is written in
Latin1, not UTF8.  So I would venture that the server is trying to recode
to Win1250 a string which is originally in Latin1, while assuming it is
UTF-8.
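
As a rough check of that guess, here is a small Python sketch (not
PostgreSQL code).  It encodes the two Spanish messages as Latin1 and then
reads the bytes the way a naive UTF-8 reader would, trusting the lead byte
for the sequence length.  The helper names utf8_seq_len and first_misread
are made up for this illustration, and the "trust the lead byte" behaviour
is an assumption about the conversion code, not something I verified in
the source.

```python
def utf8_seq_len(lead: int) -> int:
    """Sequence length implied by a UTF-8 lead byte (continuation bytes are not checked)."""
    if lead < 0x80:
        return 1      # plain ASCII
    if lead >= 0xF0:
        return 4
    if lead >= 0xE0:
        return 3
    if lead >= 0xC0:
        return 2
    return 1          # stray continuation byte; step over it


def first_misread(message: str) -> str:
    """Encode message as Latin1, then return the first multi-byte
    'character' a reader that assumes UTF-8 would see, as hex."""
    raw = message.encode("latin-1")
    i = 0
    while i < len(raw):
        n = utf8_seq_len(raw[i])
        if n > 1:
            return "0x" + raw[i:i + n].hex()
        i += n
    return ""


# The original error message, and the text of the warning itself.
print(first_misread("no existe la relación «foo»"))
print(first_misread("ignorando el carácter UTF-8 no convertible"))
```

The two values printed are 0xf36e20ab and 0xe16374, the same ones that
appear in the WARNING lines above.  The second comes from the "á" in
"carácter", that is, from the warning text itself, which would explain why
each warning triggers another one until the error stack overflows.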

Now, it can be argued that this is really operator error, because I
can't crash the server if I correctly initdb with es_CL.UTF8.  Should we
be stricter about rejecting invalid configurations?

I'm not sure to what extent this affects other translations, collations,
or encodings.  Right now I only have "es" (Spanish) compiled, and my
system is not configured to accept anything else.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

