Re: [HACKERS] Re: [GENERAL] Update of bitmask type - Mailing list pgsql-hackers

From: Tom Lane
Subject: Re: [HACKERS] Re: [GENERAL] Update of bitmask type
Msg-id: 11305.939484173@sss.pgh.pa.us
In response to: Re: [HACKERS] Re: [GENERAL] Update of bitmask type (Adriaan Joubert <a.joubert@albourne.com>)
List: pgsql-hackers

Adriaan Joubert <a.joubert@albourne.com> writes:
> 1. In the varchar file there are some functions which I believe are for
> the conversion of char(n) to char(m). They take as argument a pointer to
> a char() and a len which is the length of the total data structure. I
> haven't figured out how conversions are implemented within postgres, but
> I would need to transfer the equivalent of an atttypmod value, which
> would contain the length of the bit string to do the conversions.

bpchar(), for example, is actually a user-callable SQL function; it
takes a char(n) value and an atttypmod value and coerces the string
to the right length for that atttypmod.  Although there are no *direct*
references to bpchar() anywhere except in pg_proc, the parser's
SizeTargetExpr routine nonetheless generates calls to it as part of
INSERT and UPDATE queries:

/*
 * SizeTargetExpr()
 *
 * If the target column type possesses a function named for the type
 * and having parameter signature (columntype, int4), we assume that
 * the type requires coercion to its own length and that the said
 * function should be invoked to do that.
 *
 * Currently, "bpchar" (ie, char(N)) is the only such type, but try
 * to be more general than a hard-wired test...
 */

So, if you want to implement a fixed-length BIT(N) type, the only
real difference between that and an any-width bitstring is the existence
of a coercion function matching SizeTargetExpr's criteria.

BTW, the last line of that comment is in error --- "varchar" also has a
function matching SizeTargetExpr's criteria.  Its function behaves
a little differently, since it only truncates and never pads, but
the interface to the system is the same.
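
To make the shape of such a coercion function concrete, here is a minimal standalone sketch of what a length coercion for a fixed-length bit type might do. It is not PostgreSQL code: the BitStr layout, the 128-bit cap, and the names bitstr_bytes() and bitstr_coerce() are all hypothetical, and the choice to zero-pad on the right (as bpchar pads with blanks) rather than only truncate (as varchar does) is an assumption. The target_bits argument plays the role of the atttypmod value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BITSTR_MAX_BITS 128

/* Hypothetical layout (NOT PostgreSQL's): a bit count plus packed bytes. */
typedef struct
{
    int32_t nbits;                      /* number of valid bits */
    uint8_t data[BITSTR_MAX_BITS / 8];  /* packed MSB-first; unused bits zero */
} BitStr;

/* Number of bytes needed to hold nbits bits. */
static size_t
bitstr_bytes(int32_t nbits)
{
    return ((size_t) nbits + 7) / 8;
}

/*
 * Coerce src to exactly target_bits bits: truncate if it is longer,
 * zero-pad on the right if it is shorter.  target_bits stands in for
 * the atttypmod argument of a (columntype, int4) coercion function.
 */
static BitStr
bitstr_coerce(const BitStr *src, int32_t target_bits)
{
    BitStr  out;
    int32_t keep;
    size_t  nbytes;
    int     rem;

    assert(target_bits >= 0 && target_bits <= BITSTR_MAX_BITS);
    assert(src->nbits >= 0 && src->nbits <= BITSTR_MAX_BITS);

    memset(&out, 0, sizeof out);        /* all bits start out as zeroes */
    out.nbits = target_bits;

    keep = (src->nbits < target_bits) ? src->nbits : target_bits;
    nbytes = bitstr_bytes(keep);
    memcpy(out.data, src->data, nbytes);

    /* Clear any bits past 'keep' in the last copied byte. */
    rem = keep % 8;
    if (rem != 0)
        out.data[nbytes - 1] &= (uint8_t) (0xFF << (8 - rem));

    return out;
}
```

A truncate-only variant in the style of varchar would return the input unchanged whenever src->nbits <= target_bits instead of padding.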

> 2. there is a function _bpchar, which has something to do with arrays,
> but I can't see how it fits in with everything else.

Looks like it is the equivalent of bpchar() for arrays of char(N).

> 3. I need to write a hash function for bitstrings. I know nothing about
> hash functions, except that they are hard to do well. I looked at the
> function for text hashes and that is some weird code (i.e. it took me a
> while to figure out what it did).

If you're looking at the type-specific hash functions in hashfunc.c,
I think they are mostly junk.  They could all be replaced by two
functions, one for pass-by-val types and one for pass-by-ref types, a la
the type-independent hashFunc() in nodeHash.c.

The only situation where you really need a type-specific hasher is with
datatypes that have garbage bits in them (such as padding between struct
elements that might contain uninitialized bits).  If you're careful to
make sure that all unused bits are zeroes, so that logically equivalent
values of your type will always have the same bit contents, then you
should be able to just use hashtext().
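
As an illustration of why the unused bits matter, the following standalone sketch hashes a packed bit string byte by byte. The hash here is 32-bit FNV-1a, used only as a stand-in for a generic byte-wise hasher; it is not hashtext(), and hash_bytes() and bits_canonicalize() are hypothetical names. Two buffers that agree in their first 10 bits but differ in the trailing pad bits hash differently until the pad bits are forced to zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Generic byte-wise hash (32-bit FNV-1a).  A stand-in for any
 * type-independent hasher; NOT PostgreSQL's hashtext().
 */
static uint32_t
hash_bytes(const uint8_t *p, size_t len)
{
    uint32_t h = 2166136261u;           /* FNV offset basis */
    size_t   i;

    for (i = 0; i < len; i++)
    {
        h ^= p[i];
        h *= 16777619u;                 /* FNV prime */
    }
    return h;
}

/*
 * Zero the unused low-order bits of the last byte of an MSB-first
 * packed bit string, so that logically equal values have identical
 * byte contents.
 */
static void
bits_canonicalize(uint8_t *data, int32_t nbits)
{
    int rem;

    assert(nbits >= 0);
    rem = nbits % 8;
    if (rem != 0)
        data[(nbits - 1) / 8] &= (uint8_t) (0xFF << (8 - rem));
}
```

If every function that constructs a value of the type maintains this invariant, the canonicalization step is never needed at hash time and a plain byte-wise hasher is sufficient.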

Actually, unless you feel a compelling need to support hash indexes
on your datatype, you don't need a hash routine at all.  Certainly
getting btree index support should be a higher-priority item.
        regards, tom lane