Re: Bitmask trickiness - Mailing list pgsql-general

From Howard Rogers
Subject Re: Bitmask trickiness
Date
Msg-id AANLkTilEEbIR9GourAODbCS7zCO6FqfyyPzWaAa4U_sD@mail.gmail.com
Whole thread Raw
In response to Re: Bitmask trickiness  (Scott Marlowe <scott.marlowe@gmail.com>)
List pgsql-general
On Fri, Jul 23, 2010 at 8:37 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Thu, Jul 22, 2010 at 3:09 PM, Howard Rogers <hjr@diznix.com> wrote:
>> On Fri, Jul 23, 2010 at 12:35 AM, Peter Hunsberger
>> <peter.hunsberger@gmail.com> wrote:
>>> On Wed, Jul 21, 2010 at 11:13 PM, Howard Rogers <hjr@diznix.com> wrote:
>>>>
>>>> On Thu, Jul 22, 2010 at 1:02 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>>>>
>>>>
>>>> >>
>>>> >> Why on Earth would I want to store this sort of stuff in a bit string?!
>>>> >
>>>> > Because you are manipulating bits and not integers?  I guess there are
>>>> > 10 kinds of people, those who like think in binary and those who
>>>> > don't.
>>>> >
>>>> >> I don't know about you, but I find looking at 21205 a darn'd site
>>>> >> easier than staring blankly at 101001011010101!!
>>>
>>> <snip> lots of stuff</snip>
>>>
>>>> >
>>>> > Note you can cast integer to bitstring, but there may be some odd
>>>> > behaviour for sign bits and such.  Which is again why I'd use the
>>>> > right type for the job, bit string.  But it's your project.
>>>> >
>>>>
>>>> Quoting...
>>>>
>>>> > Because you are manipulating bits and not integers?  I guess there are
>>>> > 10 kinds of people, those who like think in binary and those who
>>>> > don't.
>>>>
>>>> Er, no. 21205 is not an integer. It's an encoded bit of magic.
>>>>
>>>
>>> In that case your database design is fundamentally broken.  A database
>>> should have content fields that map to the needs of the application.
>>> As you describe your application requirements, that is a bit string
>>> and not an integer.  Use bit strings and your application logic is
>>> transparent, obvious and easy to maintain.  Use integers and you have
>>> to resort to "magic".  As you say, it's your choice, but you came here
>>> looking for advice and the advice you were given is very good....
>>>
>>> --
>>> Peter Hunsberger
>>
>> Hi Peter:
>>
>> It wasn't, as the original poster pointed out, 'advice' that was given
>> so much as personal preference. Had someone said, 'ah, but you see
>> storing your 15 meanings in decimal uses up 5 bytes, whereas a
>> bitstring only requires 15 bits, and over 10,000,000 records, the
>> saving of 3 bytes per record adds up...', then that would be technical
>> advice I could listen to, assess and make a call on.
>
> You do realize the first page I linked to told you that, right?  It's
> not a particularly big page.  I had made the erroneous assumption
> you'd read the link I posted.

If you mean, did I read the bit in the doco where it said nothing at
all in the 'these are great advantages' style I've just described, but
instead makes the fairly obvious point that a bit string takes 8 bits
to store a group of 8 bits (well, stone me!!) PLUS has extra overhead,
then yes, I did read that part of your first link... and nevertheless
concluded that, overall, there is... er, some extra overhead in
storing bitstrings.

So what precisely about that first article, which I did indeed read,
would you have expected to lead me to the conclusion that I'd SAVE
significant amounts of space or find some other technically-compelling
reason for switching?

My point is that there's nothing much in it, storage-wise, either way.
So there's no compelling technical reason to switch. And without a
technically-compelling reason, the rest of the post I was referring to
simply boiled down, as far as I could tell, to a matter of personal
preference. No less valid for that, of course. But ultimately, not
something that would hold much sway with me.

>> But simply saying "your design is broken... wooooo!" might well scare
>> the children, but doesn't really do anything for me, because I know
>> for a certainty that it's not broken at all.
>
> I asked if there was a reason you were avoiding bit strings.  Hardly a
> "your design is broken" point.

I'm getting a bit fed up of this thread now. It wasn't YOU that ever
said 'the design is broken', and I never suggested it was. That was
Peter Hunsberger, about three posts up in the thread, who wrote "In
that case your database design is fundamentally broken."

If you're going to take umbrage at something, please take umbrage at
things that were actually directed at you in the first place!

> You've now said why you are not using
> the type that was designed to handle bit strings for bit strings.
>
> I personally would store them as bit strings and change representation
> for users.

I'm a user, too. I get to see this stuff every time I do a select
statement. At the command line. Which I use a lot.

> There are some issues that come up if your bit strings are
> long enough to get close to the last bit in an integer (also mentioned
> on the links I posted that didn't get read).

Don't make false assumptions about other people, please. You don't
know what I read or didn't read. Just because you didn't make a
compelling technical argument in favour of bitstrings doesn't mean I
didn't read the article you linked to ...that also didn't make a
compelling technical argument in favour of bitstrings.

> But other than that it
> should work fine.

Yes, I know. I've only been using this technique for five years on
Oracle! I would be very surprised indeed if it wasn't transferrable to
PostgreSQL.

Still doesn't answer the precise, specific technical question I
actually asked, though, does it?!

And since there is indeed no technical content in these continued
to-and-fro posts, I'll be leaving it there, if that's OK.

Thanks for your contributions to date. I did read them (including both
of the links you included), and the second one especially was
enlightening, for which I am grateful.

Regards
HJR

pgsql-general by date:

Previous
From: "A. Kretschmer"
Date:
Subject: Re: Difference between EXPLAIN ANALYZE SELECT ... total runtime and SELECT ... runtime
Next
From: std pik
Date:
Subject: Information Extract