Thread: Open 6.4 items

Open 6.4 items

From

Bruce Momjian

Date:

06 October 1998, 23:20:44

Additions
---------
test new cidr/IP address type(Tom Helbekkmo)
complete rewrite system changes(Jan)
CREATE TABLE test (x text, s serial) fails if no database creation permission
regression test all platforms
vacuum crash

Serious Items
------------
change pg args for platforms that don't support argv changes
    (setproctitle()?, sendmail hack?)

Docs
----
man pages/sgml synchronization
generate html/postscript documentation
make sure all changes are documented properly

Minor items
-----------
cnf-ify still can exhaust memory, make SET KSQO more generic
permissions on indexes:  what do they do?  should it be prevented?
multi-verion concurrency control - work-in-progress for 6.5
improve reporting of syntax errors by showing location of error in query
use index with constants on functions
allow chaining of pages to allow >8k tuples
allow multiple generic operators in expressions without the use of parentheses
document/trigger/rule so changes to pg_shadow create pg_pwd
large objects orphanage
improve group handling
no min/max for oid type
improve PRIMARY KEY handling
generate postmaster pid file and remove flock/fcntl lock code


--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: [HACKERS] Open 6.4 items

From

Tom Ivar Helbekkmo

Date:

07 October 1998, 05:26:37

Bruce Momjian <maillist@candle.pha.pa.us> writes:

> test new cidr/IP address type(Tom Helbekkmo)

Looks good to me.  I haven't done really heavy testing yet, and I'd
also like to update the regression test -- what I included was just a
very quick sequence to see that it basically worked, but we should
have a more comprehensive test.  No great hurry there, though: for
now, I'd say it's ready for shipping, modulo the renaming of IPADDR to
INET, for which I'm sending a separate patch kit.

One problem though, seemingly lately introduced:  It's nice to be able
to input IP addresses in various formats, for compatibility with other
software.  One of the common formats is a network byte order hex
string, like 0x12345678.  Until very recently, I could check what the
heck the actual address behind such a representation was, by going
"select '0x12345678'::ipaddr;".  This no longer works, because the
system now helpfully transforms the hex into a long int or something
and then tries to treat the result as an ipaddr.  Uh-oh.

The intuitively correct thing would be for the hex string to be read
and converted as a numeric value only if the type it is being cast to
is, indeed, numeric in nature.  In the given case, it should be up to
ipaddr_in() to make sense of the character string.  Or what do you say?

-tih
--
Popularity is the hallmark of mediocrity.  --Niles Crane, "Frasier"

Re: [HACKERS] Open 6.4 items

From

Bruce Momjian

Date:

07 October 1998, 20:12:51

> Bruce Momjian <maillist@candle.pha.pa.us> writes:
>
> > test new cidr/IP address type(Tom Helbekkmo)
>
> Looks good to me.  I haven't done really heavy testing yet, and I'd
> also like to update the regression test -- what I included was just a
> very quick sequence to see that it basically worked, but we should
> have a more comprehensive test.  No great hurry there, though: for
> now, I'd say it's ready for shipping, modulo the renaming of IPADDR to
> INET, for which I'm sending a separate patch kit.

OK.  I think Thomas is adding it to the regression tests, particularly
so people can see how it works.  Your readme is now in the manual.


> One problem though, seemingly lately introduced:  It's nice to be able
> to input IP addresses in various formats, for compatibility with other
> software.  One of the common formats is a network byte order hex
> string, like 0x12345678.  Until very recently, I could check what the
> heck the actual address behind such a representation was, by going
> "select '0x12345678'::ipaddr;".  This no longer works, because the
> system now helpfully transforms the hex into a long int or something
> and then tries to treat the result as an ipaddr.  Uh-oh.
>
> The intuitively correct thing would be for the hex string to be read
> and converted as a numeric value only if the type it is being cast to
> is, indeed, numeric in nature.  In the given case, it should be up to
> ipaddr_in() to make sense of the character string.  Or what do you say?

Thomas?

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: [HACKERS] Open 6.4 items

From

"Thomas G. Lockhart"

Date:

08 October 1998, 03:13:13

> > Looks good to me.  I haven't done really heavy testing yet, and I'd
> > also like to update the regression test
> OK.  I think Thomas is adding it to the regression tests, particularly
> so people can see how it works.  Your readme is now in the manual.

It would be great if you could gin up a real regression test, since I'm
up to my eyeballs in other Postgres projects. But if you can't, or want
some help, let me know. I'd be happy to help with integration of the
test...

> > One problem though, seemingly lately introduced:  It's nice
> > to input IP addresses in various formats...
> > One of the common formats is a network byte order hex
> > string, like 0x12345678.  Until very recently, I could check what
> > the heck the actual address behind such a representation was, by
> > going "select '0x12345678'::ipaddr;".  This no longer works, because
> > the system now helpfully transforms the hex into a long int or
> > something and then tries to treat the result as an ipaddr.  Uh-oh.

There is a problem, but this is not the explanation. See below...

> > The intuitively correct thing would be for the hex string to be read
> > and converted as a numeric value only if the type it is being cast
> > to is, indeed, numeric in nature.  In the given case, it should be
> > up to ipaddr_in() to make sense of the character string.

What you describe as "no longer works" is actually a core dump on my
machine:

postgres=> select 'x12345678'::ipaddr;
ERROR:  could not parse "x12345678"
postgres=> select '0x12345678'::ipaddr;
pqReadData() -- backend closed the channel unexpectedly.

Any single-quote-delimited string is passed to the corresponding _in()
routine without change. So the system is not changing anything before
ipaddr_in() sees it for input. It appears that the _in() code tries to
handle strings starting with "X", but fails to do so (though I'm not
exactly sure what a correct value would look like; I would have guessed
that something like 'x12121212' might be legal, but it fails). It also
appears that the code crashes if there is a leading zero and an
otherwise hex-looking value following, even though it at least sometimes
is checking that each character is valid.

I haven't looked at it further, but the core dump feature should be a
"must-fix" and I would put the "hex input" in the same category since
the code claims to handle it but doesn't. Volunteers?

                      - Tom

Re: [HACKERS] Open 6.4 items

From

Tom Ivar Helbekkmo

Date:

09 October 1998, 04:51:45

"Thomas G. Lockhart" <lockhart@alumni.caltech.edu> writes:

(Oh, and "::ipaddr" is now "::inet", of course...)

> postgres=> select 'x12345678'::ipaddr;
> ERROR:  could not parse "x12345678"

This is as it should be -- it demands the leading 0 to work, as in:

> postgres=> select '0x12345678'::ipaddr;

...which should return:

?column?
---------------
18.52.86.120/32
(1 row)

but in your case gives:

> pqReadData() -- backend closed the channel unexpectedly.

...or a number of other, weird errors, with or without crashes, but
mostly with.  It also changes with time, so it looks very much like
a memory corruption.  Now, I've read and re-read my code, and I'm
pretty damn sure that I don't write outside the palloc()ed area at
all, and I do palloc() the sum of VARHDRSZ and the size of the data
structure I'm storing.  I also set VARSIZE(dst) to the right number
(the same as the number of bytes I palloc()).

Looking at debug output from the backend, it is obvious that inet_in()
does the right thing: the expected 12 byte data structure looks the
way it ought to, and since all written data ends up inside it (that
is, it's filled with what I expect to see there), it's hard to imagine
what might end up outside.  It's also very strange that this happens
only when the '0x12345678'::inet format is used.  It would be natural
from this to suspect that inet_net_pton() is at fault, but my tests of
that function don't show up any problems.

Weirdly, this patch fixes the problem:

Index: inet.c
===================================================================
RCS file: /usr/local/cvsroot/pgsql/src/backend/utils/adt/inet.c,v
retrieving revision 1.2
diff -r1.2 inet.c
52c52
<     dst = palloc(VARHDRSZ + sizeof(inet_struct));
---
>     dst = palloc(VARHDRSZ + sizeof(inet_struct) + 1);

Now I'm explicitly asking for at least one byte more than I need, and
I'm pretty damn sure that I never touch that extra byte, but something
seems to, since the problem goes away.  It's arrogant of me, I know,
but barring a complete misunderstanding on my part of how variable
size records work (or a stupid bug that I've been staring at for hours
without seeing, of course), I'd say that something outside my code is
at fault.  Any ideas as to how to try to find out?

-tih
--
Popularity is the hallmark of mediocrity.  --Niles Crane, "Frasier"

Re: [HACKERS] Open 6.4 items

From

Tom Lane

Date:

09 October 1998, 11:09:11

Tom Ivar Helbekkmo <tih@nhh.no> writes:
> Now I'm explicitly asking for at least one byte more than I need, and
> I'm pretty damn sure that I never touch that extra byte, but something
> seems to, since the problem goes away.  It's arrogant of me, I know,
> but barring a complete misunderstanding on my part of how variable
> size records work (or a stupid bug that I've been staring at for hours
> without seeing, of course), I'd say that something outside my code is
> at fault.  Any ideas as to how to try to find out?

Well, I hate to ruin your day, but coming pre-armed with the knowledge
that the code is writing one byte too many, it's pretty obvious that the
first loop in inet_net_pton_ipv4 does indeed do that.  Specifically at
            else if (size-- > 0)
                *++dst = 0, dirty = 0;
where, when size (the number of remaining destination bytes) is reduced
to 0, the code nonetheless advances dst and clears the next byte.
The loop logic is fundamentally faulty: you can't check for emsgsize
overflow until you get a digit that is supposed to go into another byte.
I'd try something like

    tmp = 0;
    ndigits = 0;           // ndigits is # of hex digits seen for cur byte
    while (ch = next hex digit)
    {
        n = numeric equivalent of ch;
        assert(n >= 0 && n <= 15);
        tmp = (tmp << 4) | n;
        if (++ndigits == 2)
        {
            if (size-- <= 0)
                goto emsgsize;
            *dst++ = (u_char) tmp;
            tmp = 0, ndigits = 0;
        }
    }
    if (ndigits)
        goto enoent;    // odd number of hex digits is bogus


BTW, shouldn't this routine clear out all "size" bytes of the
destination, even if the given data doesn't fill it all?  A memset
at the top might be worthwhile.

            regards, tom lane

Re: [HACKERS] Open 6.4 items

From

Tom Ivar Helbekkmo

Date:

09 October 1998, 16:38:44

Tom Lane <tgl@sss.pgh.pa.us> writes:

> Well, I hate to ruin your day,

You sure didn't!  You made my day!  :-)

> but coming pre-armed with the knowledge
> that the code is writing one byte too many, it's pretty obvious that the
> first loop in inet_net_pton_ipv4 does indeed do that.  Specifically at
>             else if (size-- > 0)
>                 *++dst = 0, dirty = 0;
> where, when size (the number of remaining destination bytes) is reduced
> to 0, the code nonetheless advances dst and clears the next byte.

You're quite right, and I should have caught this one myself, only I
must have goofed when I tested the function.  (It is written in a
style that's just obscure enough that I chose testing it instead of
studying it until I understood in detail what it did.)  As far as I
can tell, the only actual error in Paul Vixie's code is that the two
lines you quote above should be:

            else if (--size > 0)
                *++dst = 0, dirty = 0;

That is, size should be predecremented instead of postdecremented.
I'm cc-ing Paul on this, as I assume he wants to get the fix into
BIND, which is where inet_net_ntop() and inet_net_pton() came from.

> BTW, shouldn't this routine clear out all "size" bytes of the
> destination, even if the given data doesn't fill it all?  A memset
> at the top might be worthwhile.

It does: there is code further down that handles that.  Another
example of the obscure programming style: this function really shows
what a low-level language C is!  :-)

-tih
--
Popularity is the hallmark of mediocrity.  --Niles Crane, "Frasier"

Re: [HACKERS] Open 6.4 items

From

Tom Lane

Date:

09 October 1998, 17:21:02

Tom Ivar Helbekkmo <tih@nhh.no> writes:
> As far as I
> can tell, the only actual error in Paul Vixie's code is that the two
> lines you quote above should be:
>             else if (--size > 0)
>                 *++dst = 0, dirty = 0;

No, that's still wrong, because it will error out (jump to emsgsize)
one byte sooner than it should.  The loop is fundamentally broken
because it wants to grab and zero a byte before it knows whether there
are any digits for the byte.

I think its behavior for an odd number of digits is wrong too, or at
least pretty nonintuitive.  I like my code a lot better ;-)

            regards, tom lane

Re: [HACKERS] Open 6.4 items

From

Paul A Vixie

Date:

10 October 1998, 01:34:42

i'll take a tiny exception to the declaration that the code is obscure --
since that just means you have not looked at the rest of bind :-).  but
thanks for the bug report, i'll fix this in bind 8.1.3.

Re: [HACKERS] Open 6.4 items

From

Tom Ivar Helbekkmo

Date:

10 October 1998, 06:28:28

Tom Lane <tgl@sss.pgh.pa.us> writes:

> No, that's still wrong, [...]

I'll leave that to you and Paul.  Right now, the code works as I
expect it to, and that's enough for me.  :-)

-tih
--
Popularity is the hallmark of mediocrity.  --Niles Crane, "Frasier"

Re: [HACKERS] Open 6.4 items

From

Tom Ivar Helbekkmo

Date:

10 October 1998, 08:12:44

I wrote:

> As far as I can tell, the only actual error in Paul Vixie's code is
> that the two lines you quote above should be:
>
>             else if (--size > 0)
>                 *++dst = 0, dirty = 0;


That is, the patch to apply to fix the INET type is:

Index: inet_net_pton.c
===================================================================
RCS file: /usr/local/cvsroot/pgsql/src/backend/utils/adt/inet_net_pton.c,v
retrieving revision 1.2
diff -r1.2 inet_net_pton.c
120c120
<             else if (size-- > 0)
---
>             else if (--size > 0)

-tih
--
Popularity is the hallmark of mediocrity.  --Niles Crane, "Frasier"

Re: [HACKERS] Open 6.4 items

From

Tom Ivar Helbekkmo

Date:

10 October 1998, 08:38:36

Tom Lane <tgl@sss.pgh.pa.us> writes:

> Tom Ivar Helbekkmo <tih@nhh.no> writes:
> > As far as I
> > can tell, the only actual error in Paul Vixie's code is that the two
> > lines you quote above should be:
> >             else if (--size > 0)
> >                 *++dst = 0, dirty = 0;
>
> No, that's still wrong, because it will error out (jump to emsgsize)
> one byte sooner than it should.  The loop is fundamentally broken
> because it wants to grab and zero a byte before it knows whether there
> are any digits for the byte.

Damn!  You're right!  I swear I tested that change at home, and found
it to do what I wanted -- but now that I've put it into my production
system at work, it doesn't work here.  Heck, I even read the code very
carefully to verify that it was all that was needed!  I think I really
need the holiday I'm taking next week!

To keep this in sync with BIND, maybe Paul could take a look, and fix
the code the way he wants it there?

Whoops.  It just hit me.  When I tested my "quick fix" last night, I
just looked at what happened to the data, and ignored the return
value.  Silly of me...

-tih
--
Popularity is the hallmark of mediocrity.  --Niles Crane, "Frasier"

Re: [HACKERS] Open 6.4 items

From

Tom Ivar Helbekkmo

Date:

10 October 1998, 14:15:05

Tom Lane <tgl@sss.pgh.pa.us> writes:

> I think its behavior for an odd number of digits is wrong too, or at
> least pretty nonintuitive.  I like my code a lot better ;-)

Well, I like your code better too on closer inspection.  I changed it
a bit, though, because I don't feel that an odd number of digits in
the hex string is wrong at all -- it's just a representation of a bit
string, with the representation padded to an even multiple of four
bits.

Anyway, here's what I ended up with for the code in question:

        if (ch == '0'&& (src[0] == 'x'|| src[0] == 'X')
                && isascii(src[1]) && isxdigit(src[1]))
        {
                /* Hexadecimal: Eat nybble string. */
                if (size <= 0)
                        goto emsgsize;
                tmp = 0;
                dirty = 0;
                src++;                                  /* skip x or X. */
                while ((ch = *src++) != '\0'&&
                           isascii(ch) && isxdigit(ch))
                {
                        if (isupper(ch))
                                ch = tolower(ch);
                        n = strchr(xdigits, ch) - xdigits;
                        assert(n >= 0 && n <= 15);
                        tmp = (tmp << 4) | n;
                        if (++dirty == 2) {
                                if (size-- <= 0)
                                        goto emsgsize;
                                *dst++ = (u_char) tmp;
                                tmp = 0, dirty = 0;
                        }
                }
                if (dirty) {
                        if (size-- <= 0)
                                goto emsgsize;
                        tmp <<= 4;
                        *dst++ = (u_char) tmp;
                }
        }

-tih
--
Popularity is the hallmark of mediocrity.  --Niles Crane, "Frasier"

Re: [HACKERS] Open 6.4 items

From

Bruce Momjian

Date:

11 October 1998, 21:31:08

Applied

> I wrote:
>
> > As far as I can tell, the only actual error in Paul Vixie's code is
> > that the two lines you quote above should be:
> >
> >             else if (--size > 0)
> >                 *++dst = 0, dirty = 0;
>
>
> That is, the patch to apply to fix the INET type is:
>
> Index: inet_net_pton.c
> ===================================================================
> RCS file: /usr/local/cvsroot/pgsql/src/backend/utils/adt/inet_net_pton.c,v
> retrieving revision 1.2
> diff -r1.2 inet_net_pton.c
> 120c120
> <             else if (size-- > 0)
> ---
> >             else if (--size > 0)
>
> -tih
> --
> Popularity is the hallmark of mediocrity.  --Niles Crane, "Frasier"
>
>


--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

RE: [HACKERS] Open 6.4 items

From

"Taral"

Date:

11 October 1998, 22:16:40

> Applied
>
> > I wrote:
> >
> > > As far as I can tell, the only actual error in Paul Vixie's code is
> > > that the two lines you quote above should be:
> > >
> > >             else if (--size > 0)
> > >                 *++dst = 0, dirty = 0;

[snip]

> > <             else if (size-- > 0)
> > ---
> > >             else if (--size > 0)

Didn't we agree this WASN'T the fix for the problem? Something about return
values? I seem to remember that this piece of code needed to be rewritten...

Taral

Re: [HACKERS] Open 6.4 items

From

"Thomas G. Lockhart"

Date:

12 October 1998, 02:02:39

> Didn't we agree this WASN'T the fix for the problem? Something about
> return values? I seem to remember that this piece of code needed to be
> rewritten...

Yup. Bruce, Tom has another patch he posted which will behave better...

                  - Tom

Re: [HACKERS] Open 6.4 items

From

Bruce Momjian

Date:

12 October 1998, 10:00:51

> > Didn't we agree this WASN'T the fix for the problem? Something about
> > return values? I seem to remember that this piece of code needed to be
> > rewritten...
>
> Yup. Bruce, Tom has another patch he posted which will behave better...
>
>                   - Tom
>

I have lost it.  Could someone resend it, and the patch to reverse too.

Thanks.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: [HACKERS] Open 6.4 items

From

Tom Ivar Helbekkmo

Date:

12 October 1998, 11:17:22

Bruce Momjian <maillist@candle.pha.pa.us> writes:

> I have lost it.  Could someone resend it, and the patch to reverse too.

This is the one that shouldn't have been applied:

Index: inet_net_pton.c
===================================================================
RCS file: /usr/local/cvsroot/pgsql/src/backend/utils/adt/inet_net_pton.c,v
retrieving revision 1.2
diff -r1.2 inet_net_pton.c
120c120
<             else if (size-- > 0)
---
>             else if (--size > 0)

This is the one that works:

Index: inet_net_pton.c
===================================================================
RCS file: /usr/local/cvsroot/pgsql/src/backend/utils/adt/inet_net_pton.c,v
retrieving revision 1.2
diff -r1.2 inet_net_pton.c
108c108,109
<         *dst = 0, dirty = 0;
---
>         tmp = 0;
>         dirty = 0;
117,122c118,127
<             *dst |= n;
<             if (!dirty++)
<                 *dst <<= 4;
<             else if (size-- > 0)
<                 *++dst = 0, dirty = 0;
<             else
---
>             tmp = (tmp << 4) | n;
>             if (++dirty == 2) {
>                 if (size-- <= 0)
>                     goto emsgsize;
>                 *dst++ = (u_char) tmp;
>                 tmp = 0, dirty = 0;
>             }
>         }
>         if (dirty) {
>             if (size-- <= 0)
123a129,130
>             tmp <<= 4;
>             *dst++ = (u_char) tmp;
125,126d131
<         if (dirty)
<             size--;

-tih
--
Popularity is the hallmark of mediocrity.  --Niles Crane, "Frasier"

Re: [HACKERS] Open 6.4 items

From

Bruce Momjian

Date:

12 October 1998, 11:57:54

Both applied, with the first one reverse applied.  Thanks.


> Bruce Momjian <maillist@candle.pha.pa.us> writes:
>
> > I have lost it.  Could someone resend it, and the patch to reverse too.
>
> This is the one that shouldn't have been applied:
>
> Index: inet_net_pton.c
> ===================================================================
> RCS file: /usr/local/cvsroot/pgsql/src/backend/utils/adt/inet_net_pton.c,v
> retrieving revision 1.2
> diff -r1.2 inet_net_pton.c
> 120c120
> <             else if (size-- > 0)
> ---
> >             else if (--size > 0)
>
> This is the one that works:
>
> Index: inet_net_pton.c
> ===================================================================
> RCS file: /usr/local/cvsroot/pgsql/src/backend/utils/adt/inet_net_pton.c,v
> retrieving revision 1.2
> diff -r1.2 inet_net_pton.c
> 108c108,109
> <         *dst = 0, dirty = 0;
> ---
> >         tmp = 0;
> >         dirty = 0;
> 117,122c118,127
> <             *dst |= n;
> <             if (!dirty++)
> <                 *dst <<= 4;
> <             else if (size-- > 0)
> <                 *++dst = 0, dirty = 0;
> <             else
> ---
> >             tmp = (tmp << 4) | n;
> >             if (++dirty == 2) {
> >                 if (size-- <= 0)
> >                     goto emsgsize;
> >                 *dst++ = (u_char) tmp;
> >                 tmp = 0, dirty = 0;
> >             }
> >         }
> >         if (dirty) {
> >             if (size-- <= 0)
> 123a129,130
> >             tmp <<= 4;
> >             *dst++ = (u_char) tmp;
> 125,126d131
> <         if (dirty)
> <             size--;
>
> -tih
> --
> Popularity is the hallmark of mediocrity.  --Niles Crane, "Frasier"
>


--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: [HACKERS] Open 6.4 items

From

Paul A Vixie

Date:

13 October 1998, 03:38:06

Wow.  Tough room!  I've got to admit that I don't have any test cases
with hex strings in them, and had the hex portion of inet_net_pton_ipv4()
ever been tested here it would have shown every error you folks found:

> From: Tom Ivar Helbekkmo <tih@nhh.no>
> Date: 09 Oct 1998 22:38:37 +0200
>
> As far as I can tell, the only actual error in Paul Vixie's code is that
> the two lines you quote above should be:
>
>             else if (--size > 0)
>                 *++dst = 0, dirty = 0;
>
> That is, size should be predecremented instead of postdecremented.
> I'm cc-ing Paul on this, as I assume he wants to get the fix into
> BIND, which is where inet_net_ntop() and inet_net_pton() came from.

As later bespoken by someone else, this is not yet the right answer.

> > BTW, shouldn't this routine clear out all "size" bytes of the
> > destination, even if the given data doesn't fill it all?  A memset
> > at the top might be worthwhile.

No.  "size" is the maximum I'm allowed to fill -- but I only touch the
bits I need, and I return the number of bits I touched.

> It does: there is code further down that handles that.  Another
> example of the obscure programming style: this function really shows
> what a low-level language C is!  :-)

C is not a language, it is a very smart macro assembler.  Modula-3 is a
language.  I'm told that Eiffel is a language.  Scheme is definitely a
language.  But C is not a language.  It's just what we all use :-).

More seriously, this function is called in the inner loop and speed was
important enough to trade off some simplicity for.  The amazing thing is,
because I was blasting the same byte of *dst twice per loop iteration, it
was actually sped up considerably by the patches suggested here.  I've got
egg on my face this time fershure man.

And as I said, if you think this routine is obscure, that just shows that
you've not looked at the rest of BIND.  This routine is at least self
contained and its bugs don't lean on other bugs.

> Date: Fri, 09 Oct 1998 17:20:22 -0400
> From: Tom Lane <tgl@sss.pgh.pa.us>
>
> > As far as I can tell, the only actual error in Paul Vixie's code is
> > that the two lines you quote above should be:
> >
> >             else if (--size > 0)
> >                 *++dst = 0, dirty = 0;
>
> No, that's still wrong, because it will error out (jump to emsgsize)
> one byte sooner than it should.  The loop is fundamentally broken
> because it wants to grab and zero a byte before it knows whether there
> are any digits for the byte.

Yes.

> From: Tom Ivar Helbekkmo <tih@nhh.no>
> Date: 10 Oct 1998 20:14:51 +0200
>
> > I think its behavior for an odd number of digits is wrong too, or at
> > least pretty nonintuitive.  I like my code a lot better ;-)
>
> Well, I like your code better too on closer inspection.  I changed it
> a bit, though, because I don't feel that an odd number of digits in
> the hex string is wrong at all -- it's just a representation of a bit
> string, with the representation padded to an even multiple of four
> bits.

I agree with this interpretation.  I'm now wracking my brain trying to
remember what other source file I got this code from, in what other
package, in what year, done for what project, because it's possible that
the bug was inherited.

> Anyway, here's what I ended up with for the code in question:
>
>         if (ch == '0'&& (src[0] == 'x'|| src[0] == 'X')
>                 && isascii(src[1]) && isxdigit(src[1]))
>         {
>                 /* Hexadecimal: Eat nybble string. */
>                 if (size <= 0)
>                         goto emsgsize;
>                 tmp = 0;
>                 dirty = 0;
>                 src++;                                  /* skip x or X. */
>                 while ((ch = *src++) != '\0'&&
>                            isascii(ch) && isxdigit(ch))
>                 {
>                         if (isupper(ch))
>                                 ch = tolower(ch);
>                         n = strchr(xdigits, ch) - xdigits;
>                         assert(n >= 0 && n <= 15);
>                         tmp = (tmp << 4) | n;
>                         if (++dirty == 2) {
>                                 if (size-- <= 0)
>                                         goto emsgsize;
>                                 *dst++ = (u_char) tmp;
>                                 tmp = 0, dirty = 0;
>                         }
>                 }
>                 if (dirty) {
>                         if (size-- <= 0)
>                                 goto emsgsize;
>                         tmp <<= 4;
>                         *dst++ = (u_char) tmp;
>                 }
>         }

Since that made some stylistic changes that had nothing to do with the
algorythm, it caught my eye (that's a hint.)  In keeping with its spirit,
I came up with the following.  Comments please?

        if (ch == '0' && (src[0] == 'x' || src[0] == 'X')
            && isascii(src[1]) && isxdigit(src[1])) {
                /* Hexadecimal: Eat nybble string. */
                if (size <= 0)
                        goto emsgsize;
                dirty = 0;
                src++;  /* skip x or X. */
                while ((ch = *src++) != '\0' && isascii(ch) && isxdigit(ch)) {
                        if (isupper(ch))
                                ch = tolower(ch);
                        n = strchr(xdigits, ch) - xdigits;
                        assert(n >= 0 && n <= 15);
                        if (dirty == 0)
                                tmp = n;
                        else
                                tmp = (tmp << 4) | n;
                        if (++dirty == 2) {
                                if (size-- <= 0)
                                        goto emsgsize;
                                *dst++ = (u_char) tmp;
                                dirty = 0;
                        }
                }
                if (dirty) {  /* Odd trailing nybble? */
                        if (size-- <= 0)
                                goto emsgsize;
                        *dst++ = (u_char) (tmp << 4);
                }
        }

For the record, I found my original code completely impenetrable, and I'd
like to both thank those of you who slogged through it, and apologize for
not testing every code path before I released it.  I oughta know better,
and you find folks deserve better.