Re: Bug #728: Interactions between bytea and character encoding when doing analyze - Mailing list pgsql-bugs

From Tom Lane
Subject Re: Bug #728: Interactions between bytea and character encoding when doing analyze
Date
Msg-id 6274.1028427954@sss.pgh.pa.us
In response to Re: Bug #728: Interactions between bytea and character encoding  (Joe Conway <mail@joeconway.com>)
List pgsql-bugs
Joe Conway <mail@joeconway.com> writes:
> (gdb) bt
> #0  pg_verifymbstr (mbstr=0x837a698 "42", len=2) at wchar.c:541
> #1  0x08149c26 in textin (fcinfo=0xbfffeca0) at varlena.c:191
> #2  0x08160579 in DirectFunctionCall1 (func=0x8149c00 <textin>,
> arg1=137864856) at fmgr.c:657
> #3  0x080bbffa in update_attstats (relid=74723, natts=2,
> vacattrstats=0x8379f58) at analyze.c:1740

Ah.  So the issue is that ANALYZE tries to do textin(byteaout(...))
in order to produce a textual representation of the most common value
in the BYTEA column, and apparently textin feels that the string
generated by byteaout is not legal text.  While Joe says that the
problem has gone away in CVS tip, I'm not sure I believe that.
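
To make the suspected failure mode concrete, here is a minimal Python sketch of the round trip. It is not the backend code: `bytea_out_sketch` and `is_legal_text` are hypothetical stand-ins for byteaout and the multibyte check that textin performs, the `escape_high_bit` flag is invented purely to show the difference, and UTF-8 is assumed as the database encoding. The point is only that an output string survives a strict multibyte check if every non-ASCII byte is escaped, and can fail it if any such byte is passed through raw.

```python
def bytea_out_sketch(data: bytes, escape_high_bit: bool = True) -> bytes:
    """Hypothetical stand-in for an escaping output function.

    Backslash is doubled, printable ASCII is copied, and everything else is
    written as a three-digit octal escape. With escape_high_bit=False, bytes
    >= 0x80 are copied through raw (an assumption made only for illustration).
    """
    out = bytearray()
    for b in data:
        if b == 0x5C:                       # backslash
            out += b"\\\\"
        elif 0x20 <= b <= 0x7E:             # printable ASCII
            out.append(b)
        elif b >= 0x80 and not escape_high_bit:
            out.append(b)                   # raw high-bit byte leaks through
        else:
            out += b"\\%03o" % b            # e.g. 0x00 -> \000
    return bytes(out)


def is_legal_text(s: bytes, encoding: str = "utf-8") -> bool:
    """Hypothetical stand-in for a multibyte validity check on input text."""
    try:
        s.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


value = b"42\xff\x00"                       # arbitrary binary column value

# Fully escaped output is pure ASCII, so it passes the check.
print(is_legal_text(bytea_out_sketch(value)))

# If a high-bit byte is emitted unescaped, the same check can reject it.
print(is_legal_text(bytea_out_sketch(value, escape_high_bit=False)))
```

Whether the real byteaout can actually emit such bytes is exactly what is in question here; the sketch only shows why a text input function that validates multibyte sequences is a fragile place to funnel another type's output.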

A possible answer is to change the pg_statistic columns from text to
some other less picky datatype.  (bytea maybe ;-))  Or should we
conclude that text is broken and needs to be fixed?  Choice #3 would
be "bytea is broken and needs to be fixed", but I don't care for that
answer --- if bytea can produce an output string that will break
pg_statistic, then so can some other future datatype.

Comments?

            regards, tom lane
