Re: BUG #18051: char(N) and varchar(N) behave opposite to the documentation - Mailing list pgsql-bugs

From David Rowley
Subject Re: BUG #18051: char(N) and varchar(N) behave opposite to the documentation
Date
Msg-id CAApHDvq5TaQ54LFSmzWVJe2oMERskXkFTi8fC5FknAMSKTq6eQ@mail.gmail.com
Whole thread Raw
In response to Re: BUG #18051: char(N) and varchar(N) behave opposite to the documentation  (Nicolas Gouteux <nicolas.gouteux@sonarsource.com>)
Responses Re: BUG #18051: char(N) and varchar(N) behave opposite to the documentation  (David Rowley <dgrowleyml@gmail.com>)
Re: BUG #18051: char(N) and varchar(N) behave opposite to the documentation  (Nicolas Gouteux <nicolas.gouteux@sonarsource.com>)
List pgsql-bugs
On Thu, 10 Aug 2023 at 02:05, Nicolas Gouteux
<nicolas.gouteux@sonarsource.com> wrote:
> Sorry, but I still do not get it:

I think the key is this sentence from the documentation:

"Trailing spaces are removed when converting a character value to one
of the other string types."

Your concatenation example from earlier required that the type be
implicitly cast to another type as we have no char-to-char
concatenation operator, per:

postgres=# select oprname,oprleft::regtype,oprright::regtype,oprcode
from pg_operator where oprname = '||';
 oprname |      oprleft       |      oprright      |     oprcode
---------+--------------------+--------------------+-----------------
 ||      | anycompatiblearray | anycompatible      | array_append
 ||      | anycompatible      | anycompatiblearray | array_prepend
 ||      | anycompatiblearray | anycompatiblearray | array_cat
 ||      | text               | text               | textcat
 ||      | bit varying        | bit varying        | bitcat
 ||      | bytea              | bytea              | byteacat
 ||      | text               | anynonarray        | textanycat
 ||      | anynonarray        | text               | anytextcat
 ||      | tsvector           | tsvector           | tsvector_concat
 ||      | tsquery            | tsquery            | tsquery_or
 ||      | jsonb              | jsonb              | jsonb_concat
(11 rows)

Concatenating two char(10)s would just use textcat(), so the above
line from the doc applies since type conversion is required.

> - char type adds padding up to its length
> - select length(charcol), char_length(charcoal)
> both yield 1

There is a length function (bpcharlen) for char, per:

postgres=# select proargtypes[0]::Regtype,prosrc from pg_proc where
proname = 'length';
 proargtypes |       prosrc
-------------+--------------------
 text        | textlen
 character   | bpcharlen
 lseg        | lseg_length
 path        | path_length
 bytea       | length_in_encoding
 bit         | bitlength
 bytea       | byteaoctetlen
 tsvector    | tsvector_length
(8 rows)

However, that function does:

/* get number of bytes, ignoring trailing spaces */
len = bcTruelen(arg);

and that wasn't by accident, per:

commit f27976c85b1fb9002727cce65b9f9567e158f754
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Sun Feb 1 06:27:48 2004 +0000

    Make length() disregard trailing spaces in char(n) values, per discussion
    some time ago and recent patch from Gavin Sherry.  Update documentation
    to point out that trailing spaces are insignificant in char(n).

unfortunately, we didn't link to discussions in commit messages back
then.  It might be worth you searching the archives shortly before
that date to see if anything interesting comes up.

As mentioned by David, there's not much call for using char(N) in
PostgreSQL.  I don't know the history, but I always imagined char(N)
existence was owed to improved internal optimizations in RDBMS
implementations that might have been possible if the tuples were
fixed-sized.  We have no such advantages in PostgreSQL as even a
char(N) will be stored as a variable length field. In theory, we could
take some advantage in that as tuple deformation becomes less
efficient for columns that come after a variable length field due to
the offset into the tuple not being fixed, however, because we store
char(N)s as variable length, we can't take advantage of that and it's
too late as changing it would change the binary format of the type.

David



pgsql-bugs by date:

Previous
From: Nicolas Gouteux
Date:
Subject: Re: BUG #18051: char(N) and varchar(N) behave opposite to the documentation
Next
From: David Rowley
Date:
Subject: Re: BUG #18051: char(N) and varchar(N) behave opposite to the documentation