Re: [GENERAL] char(xx) problem - Mailing list pgsql-general

From Gene Selkov
Subject Re: [GENERAL] char(xx) problem
Msg-id 199912221857.MAA25056@mail.xnet.com
In response to Re: [GENERAL] char(xx) problem  (Herouth Maoz <herouth@oumail.openu.ac.il>)
Responses Possible FAQs: single-quote and rename database
List pgsql-general
> > I'm just wondering: are there any alternatives to blank padding? Why
> > is it done in the first place?
>
> That's how fixed-length char type works, since the early days of SQL. You
> come to expect it, which means that if you use legacy code that has a
> fixed-width char type, or you decided to use it for its time-saving
> possibilities, it should behave according to some way which has been
> established long ago.

I think I understand why a fixed-size type should be aligned to
multiples of its size in storage -- that's what accounts for some
speed improvement. I still don't get the point of padding, though,
because it looks like it costs speed -- both when you do
the padding and when you trim the results. The question is
whether a null-terminated string would do as well.

My suspicion is that somebody simply didn't like to see the garbage in the
database files, and then it stuck.

> What I don't get is why, given two bpchar arguments, Postgres doesn't just
> pad the shorter one to the length of the other and then compare, select
> and whatnot.
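
For reference, the "pad the shorter one, then compare" idea can be sketched without actually allocating a padded copy. This is only an illustration on raw buffers, under my own assumptions: pad_compare() is a hypothetical helper, not anything in the Postgres source, and it compares plain bytes with no locale handling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not PostgreSQL code): compare two blank-padded
 * values as if the shorter one were padded with spaces to the length
 * of the longer one.  Returns <0, 0 or >0, like memcmp().
 */
static int
pad_compare(const char *a, size_t alen, const char *b, size_t blen)
{
        size_t          common = alen < blen ? alen : blen;
        size_t          i;
        int             cmp;

        assert(a != NULL && b != NULL);

        /* Bytes present in both values are compared directly. */
        cmp = memcmp(a, b, common);
        if (cmp != 0)
                return cmp;

        /* The tail of the longer value is compared against virtual spaces. */
        for (i = common; i < alen; i++)
                if (a[i] != ' ')
                        return (unsigned char) a[i] < ' ' ? -1 : 1;
        for (i = common; i < blen; i++)
                if (b[i] != ' ')
                        return ' ' < (unsigned char) b[i] ? -1 : 1;

        return 0;
}
```

With this, "abc" and "abc   " compare equal, while "abc" sorts before "abc x".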

As the original post by Nikolay Mijaylov indicated, there is (was?) a
mechanism for correct comparison between various char(*) and text
types, but whether it works or not depends on the weather outside. I
can attest that it existed in the past, as I still have some code that
relies on cross-type comparisons which do not seem to work
anymore. Unfortunately, I have not checked for a few versions, but
if I understood Nikolay Mijaylov right, he claims to have two
installations of the same version that behave differently.

Now these code snippets clearly show how it was intended to work:


/*****************************************************************************
 *      Comparison Functions used for bpchar
 *****************************************************************************/

static int
bcTruelen(char *arg)
{
        char       *s = VARDATA(arg);
        int                     i;
        int                     len;

        len = VARSIZE(arg) - VARHDRSZ;
        for (i = len - 1; i >= 0; i--)
        {
                if (s[i] != ' ')
                        break;
        }
        return i + 1;
}


 . . . .


bool
bpchareq(char *arg1, char *arg2)
{
        int                     len1,
                                len2;

        if (arg1 == NULL || arg2 == NULL)
                return (bool) 0;
        len1 = bcTruelen(arg1);
        len2 = bcTruelen(arg2);

        if (len1 != len2)
                return 0;

        return strncmp(VARDATA(arg1), VARDATA(arg2), len1) == 0;
}
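
To make the intended behavior easy to check outside the backend, here is a minimal standalone sketch of the same logic on plain (pointer, length) buffers instead of varlena values. The names true_len() and padded_eq() are my own, chosen for illustration; memcmp() stands in for strncmp() because the buffers here are not assumed to be null-terminated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for bcTruelen(): length of buf[0..len) with
 * trailing spaces ignored.
 */
static size_t
true_len(const char *buf, size_t len)
{
        assert(buf != NULL);

        while (len > 0 && buf[len - 1] == ' ')
                len--;
        return len;
}

/*
 * Hypothetical stand-in for bpchareq(): two values are equal when
 * they match after trailing blanks are ignored.
 */
static bool
padded_eq(const char *a, size_t alen, const char *b, size_t blen)
{
        size_t          ta = true_len(a, alen);
        size_t          tb = true_len(b, blen);

        if (ta != tb)
                return false;
        return memcmp(a, b, ta) == 0;
}
```

Under this sketch, padded_eq("abc  ", 5, "abc", 3) is true, while leading blanks still count, so " abc" and "abc" are not equal.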

What's up with bcTruelen() then? Where does the noise come from?


--Gene
