Re: [GENERAL] trouble with pg_upgrade 9.0 -> 9.1 - Mailing list pgsql-hackers

From: Bruce Momjian
Subject: Re: [GENERAL] trouble with pg_upgrade 9.0 -> 9.1
Msg-id: 20121220121948.GI20015@momjian.us
In response to: Re: [GENERAL] trouble with pg_upgrade 9.0 -> 9.1  (Bruce Momjian <bruce@momjian.us>)
Responses: Re: [GENERAL] trouble with pg_upgrade 9.0 -> 9.1  (Bruce Momjian <bruce@momjian.us>)
List: pgsql-hackers

On Wed, Dec 19, 2012 at 10:19:30PM -0500, Bruce Momjian wrote:
> On Wed, Dec 19, 2012 at 12:56:05PM -0500, Kevin Grittner wrote:
> > Groshev Andrey wrote:
> >
> > > >>>>>   Mismatch of relation names: database "database", old rel public.lob.ВерсияВнешнегоДокумента$Документ_pkey, new rel public.plob.ВерсияВнешнегоДокумента$Документ
> >
> > There is a limit on identifiers of 63 *bytes* (not characters)
> > after which the name is truncated. In UTF8 encoding, the underscore
> > would be in the 64th position.
>
> OK, Kevin is certainly pointing out a bug in the pg_upgrade code, though
> I am unclear how it would exhibit the mismatch error reported.
>
> pg_upgrade uses NAMEDATALEN for database, schema, and relation name
> storage lengths.  While NAMEDATALEN works fine in the backend, it is
> possible that a frontend client, like pg_upgrade, could retrieve a name
> in the client encoding whose length exceeds NAMEDATALEN if the client
> encoding did not match the database encoding (or is it the cluster
> encoding for system tables).  This would cause truncation of these
> values.  The truncation would not cause crashes, but might cause
> failures by not being able to connect to overly-long database names, and
> it weakens the checking of relation/schema names --- the same check that
> is reported above.
>
> (I believe initdb.c also erroneously uses NAMEDATALEN.)

I have developed the attached patch, which uses pg_strdup() on the
string returned from libpq rather than storing the value in a fixed
NAMEDATALEN buffer.  I am only going to apply this to 9.3, because the
only problems I can see this causing are weaker comparisons for very
long identifiers whose client-encoding form is longer than their
server-encoding form, and failures for very long database names, none
of which we have gotten bug reports about.
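
To illustrate the difference, here is a minimal sketch.  It is not the
attached patch: copy_fixed(), copy_dup() and report() are hypothetical
helpers, copy_dup() only stands in for pg_strdup(), and NAMEDATALEN is
assumed to have its default value of 64.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: the default build value, i.e. 63 bytes plus the NUL. */
#define NAMEDATALEN 64

/*
 * Old style (hypothetical helper): copy into a fixed-size buffer.
 * Anything past NAMEDATALEN - 1 bytes is silently dropped.
 */
void
copy_fixed(char dst[NAMEDATALEN], const char *src)
{
	size_t		n;

	assert(dst != NULL && src != NULL);
	n = strlen(src);
	if (n > NAMEDATALEN - 1)
		n = NAMEDATALEN - 1;
	memcpy(dst, src, n);
	dst[n] = '\0';
}

/*
 * New style (hypothetical stand-in for pg_strdup): allocate exactly
 * as many bytes as the string needs, and exit on out-of-memory.
 */
char *
copy_dup(const char *src)
{
	size_t		len;
	char	   *dst;

	assert(src != NULL);
	len = strlen(src) + 1;
	dst = malloc(len);
	if (dst == NULL)
	{
		fprintf(stderr, "out of memory\n");
		exit(1);
	}
	memcpy(dst, src, len);
	return dst;
}

/* Print how many bytes of the name each approach preserves. */
void
report(const char *name)
{
	char		fixed[NAMEDATALEN];
	char	   *dup = copy_dup(name);

	copy_fixed(fixed, name);
	printf("original: %zu bytes\n", strlen(name));
	printf("fixed:    %zu bytes\n", strlen(fixed));
	printf("dup:      %zu bytes\n", strlen(dup));
	free(dup);
}
```

Calling report("ВерсияВнешнегоДокумента$Документ_pkey") from a UTF8
source file shows the effect on a name of the kind in the report: the
string is 68 bytes long (31 two-byte Cyrillic characters, "$", and
"_pkey"), the fixed buffer keeps only the first 63 bytes and so loses
"_pkey", and the duplicated copy keeps all 68.  This only demonstrates
the byte-length truncation; it is not meant as an explanation of the
mismatch error itself.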

Turns out initdb.c was fine because it expects collation names to be
only in ASCII;  I added a comment to that effect.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
